Jul 30 2018
Eye Movements and Personality
Have you ever seen someone with “shifty” eyes? How much can we really tell about someone from just their eye movements? This is an interesting question that researchers are exploring. A recent study uses artificial intelligence software to attempt to correlate eye movements of subjects during a specific task with standard measures of personality.
Let’s take a look at this study and then discuss what it all means. Researchers rigged 50 subjects with head gear to record their eye movements:
“Binocular gaze data were tracked using a state-of-the-art head-mounted video-based eye tracker from SensorMotoric Instruments (SMI) at 60Hz. The tracker has a reported gaze estimation accuracy of 0.5° and precision of 0.1°. The tracker recorded gaze data, along with a high-resolution scene video on a mobile phone that was carried in a cross-body bag.”
The data from eight of the subjects was not usable, so they were left with 42 subjects. They recorded an average of about 13 minutes from each subject with about 20% loss of data. They also gave each subject a standardized personality test, and then used an AI learning algorithm to correlate features of eye movement with outcomes on the personality test. They analyzed the data to see how much the eye movements “predicted” the personality traits, and found:
“…our classifier performs well above chance (that is, confidence intervals do not overlap with any of the baseline performances) for neuroticism (40.3%), extraversion (48.6%), agreeableness (45.9%), conscientiousness (43.1%), and perceptual curiosity (PCS, 37.1%). For openness (30.8%) and the Curiosity and Exploration Inventory (CEI, 27.2%) our classifier performs below chance level.”
Chance level was about 33%. While there seems to be something to this data, let’s go through the caveats. First, a sample of 42 is relatively small. Also, this is a “convenience” sample of mostly students (and some faculty) and so may not be broadly representative.
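To make the "above chance" comparison concrete, here is a minimal sketch of how a permutation baseline for a three-level trait classifier could be estimated and compared to the reported accuracies. The labels and baseline procedure here are hypothetical stand-ins, not the study's actual pipeline:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: 42 subjects, each trait binned into 3 levels
# (low / medium / high), so chance accuracy should be roughly 1/3.
n_subjects, n_classes = 42, 3

# Classifier accuracies reported in the study for each trait
reported = {
    "neuroticism": 0.403,
    "extraversion": 0.486,
    "agreeableness": 0.459,
    "conscientiousness": 0.431,
    "perceptual curiosity": 0.371,
    "openness": 0.308,
    "CEI": 0.272,
}

# Permutation baseline: shuffle the labels many times and score
# how often a shuffled labeling happens to match the true one.
labels = rng.integers(0, n_classes, n_subjects)
baseline = np.mean([
    np.mean(rng.permutation(labels) == labels) for _ in range(10_000)
])

for trait, acc in reported.items():
    verdict = "above" if acc > baseline else "below"
    print(f"{trait}: {acc:.1%} ({verdict} chance ~{baseline:.1%})")
```

Note that with only 42 subjects, the permutation distribution is wide, which is why the study's authors lean on confidence intervals rather than a single baseline number.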
Effect sizes, while statistically significant, are relatively modest. The authors concede the data are not at the point of having a practical application. Even for the strongest result, predicting extraversion, accuracy was below 50%.
Further, the reason I put "predicted" in quotes above is that this word has a specific use in this kind of research: one set of measures predicts another statistically. But here the same data were used to find the correlation, so nothing was actually predicted at the start of the experiment and later confirmed. "Prediction" in this context is a statistical term.
With this kind of research, mining data to find variables that predict other variables, whether using AI algorithms or some other method, is really a way of generating hypotheses. Any correlations or predictions would then need to be confirmed with a fresh set of data: measure someone's eye movements, predict from this study what their personality should be, then test their personality and see how well the predictions worked.
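That discovery-then-confirmation workflow can be sketched as follows. Everything here is a hypothetical stand-in, with random data and a simple nearest-centroid classifier, just to show the structure of fitting on one set of subjects and testing on fresh ones:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical data: 42 subjects x 10 eye-movement features (e.g.
# fixation duration, saccade rate), plus a 3-level trait label each.
X = rng.normal(size=(42, 10))
y = rng.integers(0, 3, size=42)

# Discovery set: find the pattern (here, per-class feature centroids)
train, test = np.arange(0, 28), np.arange(28, 42)
centroids = np.stack(
    [X[train][y[train] == k].mean(axis=0) for k in range(3)]
)

# Confirmation set: fresh subjects the model has never seen.
# Predict each one's class as the nearest centroid.
dists = np.linalg.norm(X[test][:, None, :] - centroids[None], axis=2)
pred = dists.argmin(axis=1)
accuracy = np.mean(pred == y[test])
print(f"out-of-sample accuracy: {accuracy:.1%}")
```

On random data like this, out-of-sample accuracy hovers around chance, which is exactly the point: a pattern "found" in the discovery set only counts if it survives contact with data it has never seen.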
Sometimes researchers will do this prior to publication, which I always like and think should be done more routinely. Do one experiment to find the correlations, then do an internal replication with fresh data, and publish all the results together. This, of course, takes more time. There is an argument for getting the original data published so that other experts can review it and do their own replications. But I think in practice this results in far too much noise in the scientific literature. Fewer but more reliable published studies would probably be better.
In any case, these researchers did do some internal replication, but not with entirely fresh data. They made predictions based on the first half of the recording and then applied them to the second half of the recording as if that were new data. This is a measure of internal reliability. They found that the reliability of the predictions ranged from 0.63 to 0.83. Not bad, but not great either. Since the effect sizes are pretty small to begin with, modest reliability makes the data even more fuzzy.
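Split-half reliability of this kind is essentially a correlation between the two halves' per-subject predictions. Here is a toy sketch with simulated scores; the 0.7 noise scale is an arbitrary choice made so the result lands roughly in the study's reported range:

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical per-subject trait scores predicted from the first
# and second half of each of 42 recordings.
first_half = rng.normal(size=42)
noise = rng.normal(scale=0.7, size=42)
second_half = first_half + noise  # second half only partly agrees

# Split-half reliability = correlation between the two halves
reliability = np.corrcoef(first_half, second_half)[0, 1]
print(f"split-half reliability: {reliability:.2f}")
```

The key limitation the post raises applies here too: both halves come from the same subjects doing the same session, so this overstates how well the predictions would hold up on genuinely new data.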
But even more limited was another test of reliability – comparing two different tasks while measuring eye movements.
“The reliability values were lower (0.39–0.63) when the predictions were based on the comparison between two task activities (walking and shopping). These findings suggest that trait-specific eye movements vary substantially across activities.”
When comparing eye movements while walking vs shopping, correlations were essentially a coin-flip. The authors conclude:
“While predictions are not yet accurate enough for practical applications, they are clearly above chance level and outperform several baselines.”
I like their optimism, but I am not sure the “yet” is justified. That small word glosses over a huge question – is the variability in outcome a factor of simply gathering more data and refining the algorithm, or does it represent inherent variability in people? If the latter, then the results of this study may be close to the best we can do, and there may never be practical applications.
The fact is, human behavior is extremely complex, with numerous internal and external variables. Personality probably does have an effect on the pattern of eye movements, but that effect is probably mixed with a host of situational factors. This study shows that the task someone is engaged in affects their eye movements. What about sleepiness, mood, the effect of caffeine, various mental tasks one might be engaged in at the same time? Is someone neurotic, or did they just have a fight? Are they conscientious, or especially anxious because of some recent experience? Maybe they are just in a rush.
While this was the most “real world” study of its kind, as opposed to being in the lab, it is still an artificial situation and likely does not generalize much to people going about their lives and not engaged in a psychology experiment.
I think this is a similar situation to alleged human lie detectors who use subtle behaviors, gestures, and “microexpressions” to determine if someone is lying, engaged in deception, anxious, or whatever. The results are just as fuzzy and imprecise – because people are complex and variable.
Bringing AI into the research may give us the ability to get one level more precise and accurate in terms of reading eye movements or facial expressions, but the human brain is actually quite good at that already. We already have an algorithm in our massive biological processor to determine mood and intent of others from their behavior and expressions. It works pretty well, but does not render us immune to deception because people can also become good at hiding their intentions.
We can also misinterpret other people’s intentions, or their personality, if we are not aware of confounding factors. Maybe that person isn’t mean; maybe they were just fired or are under a lot of stress. Also, many people simply have facial characteristics that give false clues about their mood and personality.
I suspect that the results of this study are not far off from where we will ultimately be able to get. Maybe we can improve the results somewhat, but confounding factors and inherent variability will probably limit the practical applications of reading people from their eye movements, or any similar technology.