Post-NAS Report Developments in Research



The NAS report reviewed research on visual memory tests conducted in a lab and applied cognitive research, as well as neuroscience research on mechanisms of vision and memory that underlie object and facial recognition. This report also encouraged the application of much existing research in statistical design and analysis of experiments, including more advanced and robust methods than had been used in prior eyewitness-related research. Since the release of the NAS report, more than 200 studies on eyewitness identification have been published. These studies assessed factors that influence eyewitness memory, and questions involving legal applications and police practices, that had not been examined in the past.


Evidence Synthesis

Evidence synthesis refers to the process of assembling information from multiple sources and disciplines to inform debates and decisions on specific issues. Decision-making and public debate are thought to be best served if policymakers have access to the best current evidence on an issue. Similarly, researchers can develop the most efficient research agendas if they know what research has already been conducted on a given topic, the quality or risk of bias in that body of research, and the conclusions of that body of research. Systematic reviews are considered to be the most reliable strategy for synthesizing a body of research. Systematic reviews consist of transparent, reproducible methods to systematically gather all research meeting specific inclusion criteria, appraise that research for appropriateness and rigor, extract relevant information from these studies, and synthesize the research, sometimes in the form of meta-analytic statistics.

As a part of its work, the NAS Committee identified and analyzed 22 evidence syntheses which were largely non-systematic meta-analyses. In general, these research syntheses were often neither well-conducted by current standards nor well-reported, making it difficult to assess the credibility of the findings. Rigorously conducted systematic reviews of available research are critical to establishing the state of the science, identifying gaps in the literature, and suggesting other research that would further the understanding of eyewitness identification and improve law enforcement and courtroom practice. Accordingly, the NAS report recommended “more probing analyses of research findings (such as analyses of consequences of data uncertainties), and more sophisticated systematic reviews and meta-analyses (that adopt current guidelines, including transparency and reproducibility of methods).”1CITATION NEEDED – ed

One continuing project aims to conduct a systematic overview to identify additional research syntheses of quantitative research on eyewitness identification, to appraise their quality/rigor, and to recommend any changes needed to improve synthesis methods. Additionally, a scoping review identifies original quantitative studies of eyewitness identification accuracy and confidence to document and catalogue all available studies. This comprehensive identification of studies will enable more transparent, reproducible systematic reviews, including meta-analyses of the eyewitness research evidence where appropriate. Ideally, this work will lead to one or more systematic reviews and meta-analyses.

The work is more complex than anticipated, because many more research syntheses were uncovered than were originally identified in the NAS report. In total, we have identified 45 evidence syntheses which meet inclusion criteria and an additional protocol for an ongoing review. The next steps involve appraising the methods used by these syntheses and extracting data on the variables examined. Of these 45 syntheses, 15 have been published since 2014 (when the NAS literature search was completed; some 2014 studies were identified during the work of the NAS Committee). These syntheses cover a wide range of topics, including the relationship between confidence and accuracy, the impact of post-identification feedback, age of the witness, the validity of show-up identifications, the relative validity of photo, video, and live lineups, weapon focus effects, and the impact of eyewitness identification reforms, among other topics. In general, the quality and reporting of research syntheses have improved in recent years, with more comprehensive search methods, reproducible data extraction, and improved meta-analytical methods.

As part of the scoping review, 1,246 empirical studies of eyewitness identification or facial memory which meet our inclusion criteria have now been identified. Since the NAS report in 2014, 265 studies have been reported. In part because language used by researchers in this field is often inconsistent, a variety of methods were needed to identify these studies, including traditional article database searches, identification by experts in the field, and harvesting citations from the evidence syntheses discussed above. While the scoping review does not appraise the risk of bias in these individual studies, the project has entered the final stages of extracting bibliographic data and information about the independent, moderating, and dependent variables analyzed in these studies. A final step in the research will involve indicating which studies have already been harvested in evidence syntheses and those which have not been so used but which are available. An online, accessible evidence-and-gap map database could be developed to array all of the identified evidence syntheses and primary empirical studies by the independent variables on which they focus, provide links to these studies where available, and indicate which studies have been synthesized and which have not. Finally, a new collaboration with faculty and a PhD student in the Department of Statistics at University of Virginia has been initiated to use text mining analysis to identify relationships between and among these studies.

References for all studies identified in the overview of research syntheses and scoping review are presented on the Open Science Framework website. References to studies considered for inclusion in the overview and scoping review, but ultimately excluded, will be published to the Open Science Framework in the near future. A critical appraisal of the methods used in the identified research syntheses, with suggestions for improving the conduct and reporting of future synthesis studies, is underway. The collaborative work in text mining analysis will lead to innovative strategies for analyzing and synthesizing large bodies of research evidence. As a follow-up to this part of the project, the possible development of an evidence-and-gap map would illustrate where systematic research synthesis and additional empirical research are likely to produce the most useful findings for practice and policy in eyewitness identification.


Statistical Methods

Experiments are conducted, primarily in laboratory settings, to assess the effects of eyewitness identification (ID) procedures on accuracy of the ID; e.g., whether sequential or simultaneous lineups lead to higher proportions of accurate decisions, the effects of presence or absence of weapons or lighting during the incident, etc. Both the design of such experiments and the statistical analysis of the resulting data are essential: proper design is critical to avoid systematic biases and confounding, and improper analyses of the data lead to unjustified conclusions. Conversely, the use of correct and powerful statistical procedures will lend greater credibility to the conclusions.

Many experiments have been, and continue to be, conducted, to evaluate the effect of “system” and “estimator” variables on the accuracy of eyewitness identification, both in terms of correct hit rate (identify the true perpetrator) and a low false alarm rate (exclude a true innocent).  “System” variables are factors that are under the control of law enforcement; e.g., the number of persons in a lineup and the lineup instructions given. “Estimator” variables are not under the control of law enforcement personnel, such as levels of light, distance between the eyewitness and perpetrator, or presence of a weapon at the time of the incident.

The NAS report emphasized the importance of conducting multi-factor studies, which enable researchers to assess the relative importance of concurrent factors, as well as possible interactions among them.  Multi-factor experiments prevail in many fields of science, particularly when interactions between factors are anticipated.2Cf. Terry Speed. Statistics for Experimenters: Design, Innovation, and Discovery by George E. P. Box; J. Stuart Hunter; William G. Hunter, J. Am. Stat. Ass’n (2006). For an eyewitness ID (EWI) example, Lineup Type A may outperform Lineup Type B (in terms of accuracy) in the presence of a weapon, but B may outperform A if a weapon is absent. Such interactions are critical to detect, and impossible to detect without multi-factor studies. Since the time of the NAS report, more multi-factor studies have been conducted.3Chad S. Dodson, Brandon L. Garrett, Karen Kafadar, & Joanne Yaffe, Examining the Relative Influence of Five Factors on Eyewitness Accuracy: Face Recognition Ability, The Weapon Focus Effect, Same Versus Cross-Race Identifications, Simultaneous Versus Sequential Lineup Presentation And Fair Versus Biased Lineups (draft manuscript, 2020); Brandon L. Garrett, Alice J. Liu, Karen Kafadar, Joanne Yaffe & Chad S Dodson, Factoring the Role of Eyewitness Evidence in the Courtroom. These studies have demonstrated both the feasibility of more advanced designs and the value of studying multiple factors simultaneously. Garrett et al. (2020) used a fractional-factorial design to study the main effects of seven factors, each at two levels, and their two-factor interactions, by assigning participants at random to one of only 32 (versus 2^7 = 128) conditions.4Id. While no interactions were identified, the study demonstrated the advantages of evaluating seven factors at once.
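To make the idea of a fractional-factorial design concrete, the following minimal sketch builds one possible 32-run, seven-factor, two-level design. The generators used here (F = ABCD, G = ABDE) are a common textbook choice and are an assumption for illustration; they are not taken from Garrett et al. (2020), and the factor labels A through G are hypothetical placeholders.

```python
from itertools import product

def fractional_factorial_2_7_2():
    """Return 32 runs of a two-level, seven-factor design coded as -1/+1.

    Five base factors (A..E) take every combination of levels; the two
    remaining factors are defined from them by the assumed generators
    F = ABCD and G = ABDE (illustrative, not the published design).
    """
    runs = []
    for a, b, c, d, e in product((-1, 1), repeat=5):
        f = a * b * c * d   # assumed generator F = ABCD
        g = a * b * d * e   # assumed generator G = ABDE
        runs.append((a, b, c, d, e, f, g))
    return runs

design = fractional_factorial_2_7_2()

# Each participant would be assigned at random to one of these conditions.
for run in design[:4]:
    print(run)
print(len(design), "conditions instead of", 2 ** 7)
```

With these generators every factor appears at each level in exactly half of the runs, and the seven factor columns are mutually orthogonal, so main effects can be estimated separately. Some pairs of two-factor interactions share a column (for example, CE and FG), which is the price paid for running 32 rather than 128 conditions.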

Prior to the NAS report, statistical analyses of data from single-factor studies (e.g., “Sequential” versus “Simultaneous”) were conducted using one of two measures:

  1. A single “Diagnosticity Ratio” (DR), equal to the hit rate (HR) divided by the false alarm rate (FAR), collapsed over all study participants; or,
  2. A receiver operating characteristic curve, by plotting hit rate versus false alarm rate, for those participants having expressed a level of confidence at a given threshold.

For example, for a study in which participants express confidence level (ECL) in their choice on a scale of 0 (not at all confident) to 6 (fully confident), the ECL-based ROC curve is derived by plotting HR versus FAR for those participants who expressed “at least 0” (i.e., all participants), “at least 1”, “at least 2”, … “at least 6.”

In other words, the analysis calculates HR(c) and FAR(c) at each confidence threshold c, and then plots HR(c) versus FAR(c) for each c. The slope of the line from the origin to each point (FAR(c), HR(c)) on the ECL-ROC curve corresponds precisely to the DR for those who expressed “at least confidence level c.” Mickes et al. (2012) showed that often this curve is not a straight line through the origin (as it would be, if the DR remained unchanged by expressed confidence), and often the DR is higher with higher confidence levels.5Laura Mickes, Heather Flowe, & John T. Wixted, Receiver Operating Characteristic Analysis of Eyewitness Memory: Comparing the Diagnostic Accuracy of Simultaneous Versus Sequential Lineups, 18 J. Exp. Psychol.: App. 361-376. (2012) The ECL-ROC curves suggest that inferences concerning accuracy of EWI in two situations (e.g., simultaneous versus sequential) can differ from those obtained using a single, collapsed DR.  However, Appendix C in the NAS report demonstrated the lack of clear superiority in either lineup type based on the ECL-ROC when variability in experimental results is taken into account. The effect of lineup type on EWI accuracy may be moderated by other factors, such as presence of weapon or delay (between incident and presentation of suspects to eyewitness).
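The threshold-by-threshold calculation can be sketched as follows. The counts are made up purely for illustration (they come from no study), and the function name `ecl_roc` and the variable names are hypothetical. For each threshold c on the 0 to 6 scale, the sketch pools all participants who expressed at least confidence c and reports FAR(c), HR(c), and DR(c) = HR(c) / FAR(c).

```python
# Hypothetical counts of identifications at each expressed confidence level 0..6.
hits_by_conf = [5, 8, 10, 12, 15, 20, 30]         # correct IDs (culprit present)
false_alarms_by_conf = [10, 12, 10, 8, 5, 3, 2]   # false IDs (culprit absent)
n_present = 200   # hypothetical number of culprit-present lineups
n_absent = 200    # hypothetical number of culprit-absent lineups

def ecl_roc(hits, false_alarms, n_present, n_absent):
    """Return one (c, FAR(c), HR(c), DR(c)) tuple per confidence threshold c."""
    points = []
    for c in range(len(hits)):
        hr = sum(hits[c:]) / n_present            # "at least c" hit rate
        far = sum(false_alarms[c:]) / n_absent    # "at least c" false alarm rate
        dr = hr / far if far > 0 else float("inf")
        points.append((c, far, hr, dr))
    return points

roc_points = ecl_roc(hits_by_conf, false_alarms_by_conf, n_present, n_absent)
for c, far, hr, dr in roc_points:
    print(f"c >= {c}: FAR = {far:.3f}, HR = {hr:.3f}, DR = {dr:.2f}")
```

In this toy data set the DR rises as the threshold rises, which is the pattern described above. A single collapsed DR corresponds to the c = 0 row only.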

The NAS report emphasized that the EWI accuracy paradigm can be viewed more constructively as a binary classification problem. Each eyewitness is, essentially, a binary classifier:


                    “That’s the culprit”    “That’s not the culprit”
TRUE PERPETRATOR    Correct ID              False Exclusion
INNOCENT PERSON     False ID                Correct Exclusion




Much literature in both statistics and computer science has been devoted to binary classification algorithms (e.g., “Like” versus “Dislike”; “Clicked” versus “No Click”). The NAS report noted one very common statistical method, logistic regression, where the probability of an ID may be a function of not only ECL but also of many other variables, both demographic (e.g., age) and environmental (e.g., presence or absence of a weapon).  Garrett et al. (2020) analyzed the data from their study using logistic regression, which was an especially constructive approach for identifying the relative importance of the seven factors under investigation.6Garrett, supra.
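As a rough illustration of the logistic regression approach, the sketch below simulates binary outcomes from a model with three two-level factors and then recovers the coefficients by plain gradient ascent on the log-likelihood. Everything here is an assumption made for illustration: the factor names, the "true" coefficients, the sample size, and the fitting routine. It does not reproduce the analysis in Garrett et al. (2020); in practice a statistics package would be used instead of a hand-written fit.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def simulate(n, beta, rng):
    """Simulate n participants; each factor is coded -1/+1 at random."""
    rows = []
    for _ in range(n):
        x = [1.0] + [rng.choice((-1.0, 1.0)) for _ in range(len(beta) - 1)]
        p = sigmoid(sum(b * xi for b, xi in zip(beta, x)))
        y = 1 if rng.random() < p else 0   # 1 = accurate ID, 0 = inaccurate
        rows.append((x, y))
    return rows

def fit_logistic(rows, n_iter=200, lr=1.0):
    """Maximum-likelihood fit by gradient ascent on the mean log-likelihood."""
    k = len(rows[0][0])
    n = len(rows)
    beta = [0.0] * k
    for _ in range(n_iter):
        grad = [0.0] * k
        for x, y in rows:
            err = y - sigmoid(sum(b * xi for b, xi in zip(beta, x)))
            for j in range(k):
                grad[j] += err * x[j]
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta

# Hypothetical coefficients: intercept, weapon present, lineup type, high confidence.
TRUE_BETA = [0.3, -0.6, 0.0, 0.9]
NAMES = ["intercept", "weapon", "lineup_type", "high_confidence"]

rng = random.Random(0)
data = simulate(2000, TRUE_BETA, rng)
beta_hat = fit_logistic(data)
for name, est in zip(NAMES, beta_hat):
    print(f"{name:>16}: {est:+.2f}")
```

The sizes of the fitted coefficients give a direct comparison of the relative importance of the factors on the log-odds scale, which is what makes the method convenient for multi-factor studies.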

Analyses using more advanced binary classifiers were offered in Liu et al.7Alice J. Liu, Karen Kafadar, Brandon L. Garrett & Joanne Yaffe, Bringing New Statistical Approaches To Eyewitness Evidence, in Handbook of Statistics in Forensic Science (eds. Banks, David L.; Kafadar, Karen; Kaye, David L.; Tackett, Maria L. 2020) These authors noted most analyses focus on correct and false IDs among those that make a definitive choice (“choosers”), to avoid the complications involved with incorporating “non-choosers.”  Liu et al. (2020) developed a model for characterizing the probability of an accurate response by estimating first the probability of making a choice (which may depend on several factors) and then estimating the conditional probabilities of accuracy given “choice” or “no choice.” Several binary classifiers were considered, including Support Vector Machines (SVMs), Neural Networks (NNs), and Random Forests (RFs).
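The two-stage structure described here rests on the law of total probability: overall accuracy is the accuracy among choosers weighted by the probability of choosing, plus the accuracy among non-choosers weighted by the probability of not choosing. The sketch below shows only that combination step. The function name and the example probabilities are hypothetical, and in the approach of Liu et al. (2020) each input would itself be estimated from data and may depend on several factors.

```python
def overall_accuracy(p_choose, acc_given_choice, acc_given_no_choice):
    """Combine the two stages into one probability of an accurate response.

    p_choose            -- probability the witness picks someone
    acc_given_choice    -- probability the pick is correct, given a choice
    acc_given_no_choice -- probability that rejecting the lineup is correct
    """
    for p in (p_choose, acc_given_choice, acc_given_no_choice):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie between 0 and 1")
    return p_choose * acc_given_choice + (1.0 - p_choose) * acc_given_no_choice

# Hypothetical inputs, for illustration only.
example = overall_accuracy(p_choose=0.7, acc_given_choice=0.6, acc_given_no_choice=0.8)
print(round(example, 2))
```

Ignoring non-choosers amounts to reporting only `acc_given_choice`, which can differ substantially from the overall figure when many witnesses decline to choose.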

Among them, RFs seemed to provide the most accurate classifications, as well as more interpretable results (i.e., which factors were most influential in the classification), for a specific data set. Liu et al. (2020) emphasized the value of considering multiple approaches.8Id. The NAS report also mentioned other useful displays of the results of an EWI study beyond ROC curves.

An example of a more informative display is the PROC curve, described in Shiu and Gatsonis (2008).9Shang-Ying Shiu & Constantine A Gatsonis, The Predictive Receiver Operating Characteristic Curve for the Joint Assessment of the Positive and Negative Predictive Values, 366 Philosophical Transactions of the Royal Society 2313 (2008). This display is based on the fact that DR(c), the slope of the line from the origin to the ROC point (FAR(c), HR(c)) at ECL threshold c, is related to the Positive Predictive Value (PPV) for confidence threshold c; i.e., the probability that the claim “That’s the one” really was an accurate identification of the true perpetrator.  Law enforcement officials emphasize that Negative Predictive Value (NPV), or the probability that the claim “That’s not the one” really was a correct exclusion, is also important.  Shiu and Gatsonis (2008) proposed a display that shows both PPV and NPV, by plotting PPV(c) versus 1 – NPV(c), so that, like the traditional ROC curve, the point in the upper left corner, (0,1), is “ideal”; i.e., the best threshold is the value of c for which (1 – NPV(c), PPV(c)) is closest to (0,1). They defined a statistic10Id. that measures how far a given procedure (e.g., “Sequential lineup” or “Simultaneous lineup”) is from perfect prediction (0,1) for a given threshold c: r(c) = [1 – PPV(c)] + [1 – NPV(c)].  The statistic r(c) in Shiu and Gatsonis (2008) is called “Deviation from Perfect Performance” in Smith et al. (2018).11Shiu, supra.; Andrew M. Smith, James Michael Lampinen, Gary L. Wells, Laura Smalarz & Simona Mackovichova, Deviation from Perfect Performance Measures the Diagnostic Utility of Eyewitness Lineups but Partial Area Under the ROC Curve Does Not, 8 J. App. Res. in Mem. Cog. 50 (2018). As Liu et al. (2020) note, many of the statistical methods used for binary classification (accuracy of classification as a function of several variables) or to compare diagnostic methods in medicine (e.g., mammogram versus low-dose CT scan) are directly applicable to the comparison of levels of system variables (e.g., sequential vs simultaneous) in EWI.12Liu, supra.
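The quantities plotted on a PROC curve can be computed directly from the four cells of the classification table given earlier. The sketch below does this for a single threshold, using made-up counts; the function name `predictive_values` is hypothetical, and r is computed exactly as defined in the text above, r(c) = [1 – PPV(c)] + [1 – NPV(c)].

```python
def predictive_values(correct_id, false_exclusion, false_id, correct_exclusion):
    """Return (PPV, NPV, r) for one 2x2 table of eyewitness decisions.

    PPV: share of "That's the one" claims that named the true perpetrator.
    NPV: share of "That's not the one" claims that excluded an innocent person.
    r:   [1 - PPV] + [1 - NPV], as defined in the text.
    """
    ppv = correct_id / (correct_id + false_id)
    npv = correct_exclusion / (correct_exclusion + false_exclusion)
    r = (1.0 - ppv) + (1.0 - npv)
    return ppv, npv, r

# Hypothetical counts at one confidence threshold, for illustration only.
ppv, npv, r = predictive_values(
    correct_id=60, false_exclusion=40, false_id=20, correct_exclusion=80
)
print(f"PPV = {ppv:.3f}, NPV = {npv:.3f}, PROC point = ({1 - npv:.3f}, {ppv:.3f}), r = {r:.3f}")
```

Repeating the calculation at each confidence threshold c gives one PROC point per threshold; a procedure whose points lie closer to (0, 1), and whose r is therefore smaller, predicts better in both directions. Note that PPV and NPV depend on how often the true perpetrator is actually in the lineup, so counts from experiments with a fixed design proportion would need to be re-weighted before being applied to field settings.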

More researchers are utilizing these developments and applying them to better characterize the performance of eyewitness identification procedures, in terms of accurate IDs, accurate exclusions, and the proper consideration of “choosers” and “non-choosers” in the analysis. A key to the success of using them in this field is the clear communication, to both researchers and judicial personnel who rely on them, of their advantages, their proper interpretation, and the limits of uncertainty in the results.

A very promising direction is the development of experimental procedures that allow the separate “measurement” of the two aspects in an eyewitness identification: memory retrieval and classification.13Sergei Gepshtein, Yurong Wang, Fangchao He, Dinh Diep & Thomas D. Albright, Perceptual Scaling Approach to Eyewitness Identification, 11 Nature Comm. 3380 (2020).  Such a procedure would permit greater understanding of the two aspects and hence greater accuracy.

These recent studies reveal that more complex, possibly highly fractionated, factorial designs have been far more useful in characterizing the effects of multiple factors (both system and estimator variables) simultaneously, and that more sophisticated analyses of experimental results, such as logistic regression, random forests, and other statistical learning methods, can offer far more informative summaries of the experiments than “diagnosticity ratios” and ROC curves alone. Experiments that allow deconvolution of the “memory retrieval” process from the statistical classification process should be investigated for their feasibility.


Interventions in Police Procedures

The NAS Committee made five major recommendations:

  1. training all law enforcement officers on variables that can affect eyewitness identifications;
  2. adopting “blind” lineup and photo array procedures;
  3. providing officers who do administer the procedures with standardized witness instructions;
  4. documenting the witness’s stated level of confidence at the time of an identification; and,
  5. videotaping the witness identification process.

None of those recommendations has been called into question; most have been strengthened. In 2020, the American Psychology-Law Society (AP-LS) updated its White Paper regarding eyewitness identification procedures.14Gary L. Wells et al, Policy and Procedure Recommendations for the Collection and Preservation of Eyewitness Identification Evidence, American-Psychology and Law Society, 44 Law & Hum. Behav. 3 (2020).

New recommendations included:

  1. the need for reasonable suspicion before conducting an identification procedure;
  2. the need to conduct a pre-lineup interview of the witness;
  3. video-recording the entire procedure, including the pre-lineup interview, lineup instructions and the witness’s confidence level, in addition to the identification process;
  4. avoiding repeated identification attempts with the same witness and same suspect; and,
  5. avoiding the use of showups when possible and improving how showups are conducted when they are necessary.

Finally, the American Law Institute, as part of the Principles of Policing project, issued recommendations to police agencies regarding eyewitness evidence in 2020, including that agencies should not conduct identification procedures without a suspect (trawling) nor conduct an identification without a “substantial basis” to place a witness in a lineup procedure.15American Law Institute, Principles of Policing, supra. We describe below new recommendations that have arisen from this recent work.




1. Decision time

Police officials have speculated that an eyewitness’s time to identify a suspect from a lineup may be related to the accuracy of the identification (faster is more accurate). Decision time was not addressed in either the NAS report or the AP-LS White Paper. However, research suggests that faster identifications are more likely to be correct than are slower identifications. For example, in Sporer (1992), participants were over twice as fast (10.7 seconds) when they correctly identified the confederate in the lineup than when participants falsely identified a filler face (23.8 seconds).16Siegfried L. Sporer, Post-Dicting Eyewitness Accuracy: Confidence, Decision-Times and Person Descriptions of Choosers and Non-Choosers, 22 European J. Soc. Psychol. 157 (1992). Over the last 30 years, many studies have demonstrated a relationship between decision-time and identification accuracy.

This research has produced three consistent findings. First, the relationship between decision-time and identification accuracy has been observed in various conditions. When participants view a live, in-person event, such as watching a staged crime17See id. or giving directions to a target person,18Melanie Sauerland & Siegfried L. Sporer. Fast and Confident: Postdicting Eyewitness Identification Accuracy in a Field Study. Journal of Experimental Psychology. (2009). they are faster when they correctly identify the target from a lineup than when they make a false identification. The same pattern of faster responses for correct versus false identifications occurs when participants have seen either a video of a mock crime,19Siegfried L. Sporer, Eyewitness identification accuracy, confidence, and decision times in simultaneous and sequential lineups, 78 J. App. Psychol. 22 (1993); Neil Brewer and Gary Wells, The Confidence–Accuracy Relationship in Eyewitness Identification: Effects of Lineup Instructions, Foil Similarity, and Target-Absent Base Rates, 12 J. Exp. Psychol. 11 (2006). or a series of static photos of faces.20See e.g. David G. Dobolyi & Chad S. Dodson, Actual vs. Perceived Eyewitness Accuracy and Confidence and the Featural Justification Effect, 24 J. Exp. Psych.: App. 543 (2018); Nathan Weber, Neil Brewer, Gary L. Wells, Carolyn Semmler, & Amber Keast, Eyewitness Identification Accuracy and Response Latency: The Unruly 10-12-Second Rule, 10 J. Exp. Psychol. App. 139 (2004). The cross-race effect (i.e., same race or different race) does not appear to affect the relationship. Same-race and cross-race lineup identifications show a comparable pattern: faster identifications are more likely to be correct.21Id.; Chad S. Dodson & David G. Dobolyi. Confidence and Eyewitness Identifications: The Cross-Race Effect, Decision Time and Accuracy, 30 App. Cog. Psychol. 113 (2016); Jesse H. Grabman, David G. Dobolyi, N.L. Berelovich, & Chad S. Dodson, Predicting High Confidence Errors in Eyewitness Memory: The Role of Face Recognition Ability, Decision-Time, And Justifications, 8 J. App. Res. Memory and Cog. 233 (2019). The same pattern holds regardless of how participants are asked to identify the previously seen person. Whether participants are shown a simultaneous lineup22Steven M. Smith, R.C.L. Lindsay & Sean Pryke (2000). Postdictors of eyewitness errors: Can false identifications be diagnosed? Journal of Applied Psychology, 85, 542-550 or a sequential lineup23Sporer (1993), supra. or even a show-up,24Melanie Sauerland, Anna Sagana, Siegfried L. Sporer & John T. Wixted (2018) Decision time and confidence predict choosers’ identification performance in photographic showups. PLoS ONE 13(1): e0190416. correct identifications are generally faster than false identifications. Finally, the relationship between decision-time and identification accuracy applies to individuals of differing face recognition ability.25Jesse H. Grabman, David G. Dobolyi, N.L. Berelovich, & Chad S. Dodson, Predicting High Confidence Errors in Eyewitness Memory: The Role of Face Recognition Ability, Decision-Time, And Justifications, 8 J. App. Res. Memory and Cog. 233 (2019). Even though stronger face recognizers are more often correct than weaker face recognizers, individuals across the range of face recognition ability tend to be faster when they make a correct rather than a false identification from a lineup.26Jessica N. Gettleman, Jesse H. Grabman, David G. Dobolyi & Chad S. Dodson (2021). A decision processes account of the differences in the eyewitness confidence-accuracy relationship between strong and weak face recognizers under suboptimal exposure and delay conditions. Journal of Experimental Psychology: Learning, Memory, and Cognition, 47(3), 402–421. https://doi.org/10.1037/xlm0000922 Following good scientific practice, all of these experiments will benefit from replication studies.

The second important finding from this literature is that the combination of how long it takes an eyewitness to make a lineup decision and the level of confidence in the decision is a useful predictor of the eyewitness’s identification accuracy. In fact, the combination of decision-time and confidence can be more useful in predicting identification accuracy than either variable alone. For example, in Sauerland and Sporer (2009), a target person asked participants for directions and then, after a brief delay, participants were asked to identify the target from a simultaneous lineup.27Sauerland & Sporer, supra. Participants identified the correct person 72% of the time when their decision took six seconds or less and 36% of the time when they took longer than six seconds. But when they responded within six seconds and were highly confident (i.e., 90 or 100% confident) in their lineup identification, participants’ identification accuracy was over 96%. A similar pattern of extremely high accuracy (greater than 90% correct) when individuals are both highly confident in their lineup identification and respond quickly has been observed in other studies.28David G. Dobolyi & Chad S. Dodson, Actual vs. Perceived Eyewitness Accuracy and Confidence and The Featural Justification Effect, 24 J. Exp. Psychol.: App. 543 (2018); Chad S. Dodson & David G. Dobolyi. Confidence and Eyewitness Identifications: The Cross-Race Effect, Decision Time and Accuracy, 30 App. Cog. Psychol. 113 (2016). This level of accuracy for fast and highly confident identifications appears to occur for both same-race and cross-race identifications.29Dodson & Dobolyi, supra.

Decision-time may be a useful measure for assessing the accuracy of a high confidence identification. Dodson and Dobolyi (2016) showed participants same-race and cross-race photos of individuals and then, after a delay, participants attempted to identify these individuals from a series of simultaneous lineups.30Id. When participants decided within five seconds and they were more than 90% confident in their identification, they were 90% (SE = 2.1%) accurate at making same-race and cross-race identifications. But, the accuracy of these highest confidence responses dropped to 59% (SE = 5.2%) when participants took longer than 21 seconds to make a lineup decision.31Grabman et al. 2019, supra shows a similar pattern
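The kind of breakdown reported in these studies, accuracy within each combination of decision speed and confidence, can be sketched as a simple cross-tabulation. The records below are invented solely to show the mechanics and carry no empirical meaning; the function name and the two cutoffs (six seconds, 90% confidence) are parameters chosen for illustration, and, as discussed in this section, no single time cutoff has proved consistent across studies.

```python
from collections import defaultdict

def accuracy_by_stratum(records, time_cutoff_s, conf_cutoff):
    """Proportion of correct IDs within each (speed, confidence) stratum.

    records: iterable of (decision_time_seconds, confidence_percent, correct)
    """
    tallies = defaultdict(lambda: [0, 0])   # stratum -> [n_correct, n_total]
    for seconds, confidence, correct in records:
        key = (
            "fast" if seconds <= time_cutoff_s else "slow",
            "high" if confidence >= conf_cutoff else "low",
        )
        tallies[key][0] += int(correct)
        tallies[key][1] += 1
    return {key: n_correct / n_total for key, (n_correct, n_total) in tallies.items()}

# Invented records: (decision time in seconds, confidence in percent, ID correct?)
toy_records = [
    (3.2, 95, True), (4.1, 100, True), (5.0, 90, True), (4.8, 90, False),
    (12.0, 95, True), (25.0, 90, False),
    (4.0, 60, True), (5.5, 50, False),
    (15.0, 40, False), (30.0, 70, False), (9.0, 60, True),
]

table = accuracy_by_stratum(toy_records, time_cutoff_s=6.0, conf_cutoff=90)
for (speed, conf), acc in sorted(table.items()):
    print(f"{speed:>4} / {conf:<4} confidence: {acc:.2f}")
```

With real data, each cell should be reported with its sample size and a standard error, as in the results quoted above, because the fast and highly confident stratum is often small.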

One further consistent finding in this literature is that decision-time is not a reliable predictor of accuracy when an eyewitness responds that the suspect is “not present” in the lineup. Nearly all of the foregoing studies that have observed a consistent relationship between decision-time and identification accuracy have also observed that decision-time is unrelated to the accuracy of a “not present” response. However, this conclusion of little or no relationship between decision-time and rejection accuracy applies only to when eyewitnesses confront a lineup, as showups appear to show a different pattern of results.32Different results appear to arise when eyewitnesses are given show-ups.  Sauerland et al. (2012) observed that when participants are presented with a showup – a single photo that either matches or not a previously seen person – then decision-time is related to the accuracy of a reject response.  They observed that non-choosers were faster (M = 8.3 seconds) at correctly rejecting a photo that did not match a previously seen thief than when they wrongly rejected the photo (M = 11.1 seconds).

From a practical perspective, a question remains: How fast is fast? Is there a reliable time boundary that distinguishes accurate from inaccurate identifications? Dunning and Perretta (2002) suggested a 10-12 second rule whereby identification decisions faster than this cutoff were likely to be correct.33David Dunning & Scott Perretta. Automaticity and eyewitness accuracy: A 10- to 12-second rule for distinguishing accurate from inaccurate positive identifications. Journal of Applied Psychology, 87(5), 951–962. (2002) Subsequent research shows that this 10-12 second rule is not a consistent cutoff. Both Weber et al. (2004) and Brewer et al. (2006) show that other factors, such as the retention interval between seeing the culprit and taking the lineup test, influence the temporal cutoff between correct and false identifications so that under some conditions the optimal time boundary is close to 5 seconds, whereas under other conditions this boundary is longer than 25 seconds.34Nathan Weber, Neil Brewer, Gary L. Wells, Carolyn Semmler & Amber Keast (2004). Eyewitness Identification Accuracy and Response Latency: The Unruly 10-12-Second Rule. Journal of Experimental Psychology: Applied, 10(3), 139–147; Neil Brewer & Gary L. Wells (2006). The confidence-accuracy relationship in eyewitness identification: Effects of lineup instructions, foil similarity, and target-absent base rates. Journal of Experimental Psychology: Applied, 12(1), 11–30. In short, there does not appear to be a consistent, optimal time boundary that distinguishes between most correct and incorrect identifications. One critical unanswered question is whether there exists a consistent time boundary for high confidence identifications.

This research is promising and may provide an additional and readily-measured tool to assess eyewitness accuracy. This research also supports video, or at least audio, recording of eyewitness identification procedures, which allows police to readily document identification speed.




2. Lineup Construction & Administration

New research has focused on implementation of lineup construction and presentation to eyewitnesses, some of which has the potential to inform the revision of eyewitness identification procedures. This topic includes: (a) lineup administration; and, (b) lineup fairness, and the similarity of the fillers to the target.

Regarding lineup administration, recent research has explored several new ways of administering lineups. First, lineups have been conducted in which confidence scores are collected for each image in the lineup.35Neil Brewer, Nathan Weber & Nicola Guerin, Police line-ups of the future? 75 Am. Psych. 76 (2020). Second, recent studies have examined pairwise presentation of images in lineups, in which the eyewitness is scored on comparative choices between pairs of images, but does not make a single “identification.”36Sergei Gepshtein & Thomas D. Albright, Perceptual Scaling Improves Eyewitness Identification, Psychonomic Society Annual Meeting Abstracts (2019); Sergei Gepshtein, Y. Wang, F. He, D. Diep, and Thomas D. Albright, A Perceptual Scaling Approach to Eyewitness Identification, 11 Nature Comm. 3380 (2020). Third, researchers have explored interactive lineups, in which the eyewitness can manipulate images in a lineup.37Melissa Colloff, Travis Seale-Carlisle, Nilda Karoğlu, James Rockey, Harriet Smith, Lisa Smith, John Maltby & Heather Flowe, Enabling witnesses to reinstate perpetrator pose during a lineup test increases accuracy (2020).

Each of these possible innovations may be promising as a future direction. The first two approaches replace a categorical “identification” decision with a set of scores, in which all images in a lineup are “graded.” Ultimately, however, the legal system currently demands a single choice or identification by an eyewitness, and there may be resistance to supplementing (or replacing) such an identification with a scoring system. Interactive lineups, which require video or images that can be manipulated by a viewer, may pose logistical and practical challenges in law enforcement adoption. However, in-person lineups were common for many years; today photo identifications are used, lineups with short videos are used in the U.K., and in the future, automated identification procedures might lend themselves to more sophisticated presentation methods.

Much literature discusses whether lineups should be administered using photos presented simultaneously or sequentially. The NAS report did not make a recommendation in support of either simultaneous or sequential lineups. In the years since the NAS report, several studies indicate that, on average, witnesses may exhibit better discriminability with simultaneous lineups.38See Figure 7 of Travis Seale-Carlisle, S.A. Wetmore, Heather D. Flowe & Laura Mickes, Designing Police Lineups to Maximize Memory Performance, 25 J. Exp. Psychol.: App. 410-430 (2019). Any differences between the procedures, however, may be reduced if they are evaluated in settings more similar to those used by law enforcement agencies. The differences may also depend on other variables, such as the presence of a weapon or the nature of the instructions, and research conducted since 2014 has been investigating the impact of these variables on the performance of the two types of lineups.

Finally, lineup fairness is an important problem on which much progress can be made in the future, but on which agencies currently lack any clear guidance. Traditionally, police officers have selected photos for lineups based on their own sense of fairness and similarity. Typical guidance suggests that the suspect not stand out, and that any distinguishing marks like a tattoo be masked in both suspect and filler photos. Also, law enforcement generally selects filler faces in the lineup because of either their resemblance to the suspect or their resemblance to the description of the suspect, or by using a combination of these methods. However, Colloff et al. (2021) have demonstrated a superior method of constructing lineups.39Melissa F. Colloff, Brent M. Wilson, Travis M. Seale-Carlisle & John T. Wixted. Optimizing the Selection of Fillers in Police Lineups, 8 Proc. Nat. Ac. Sci. 118 (2021). They show that eyewitness identification performance is better when filler faces are dissimilar rather than similar to the suspect, as long as all of the filler faces match a description of the suspect. This finding is consistent with past research showing that eyewitness identification performance suffers when the filler faces are highly similar to the target.40Amanda Bergold & Paul Heaton. Does Filler Database Size Influence Identification Accuracy? 42 L. & Hum. Behav. 227 (2018); Ryan J. Fitzgerald, Heather L. Price, Chris Oriet, & Steve D. Charman. The effect of suspect-filler similarity on eyewitness identification decisions: A meta-analysis, 19 Psychol., Pub. Pol’y & L. 151 (2013). Overall, further research is needed to determine whether eyewitness identification performance can be optimized by specifying both the degree of facial similarity between the suspect and the fillers and the variance in facial similarity among the fillers.




3. Testing Face Memory

People differ in their ability to remember faces.41David White & Richard Kemp, Identifying People from Images, in Psychological science and the law, 239 (2019). The topic of face memory ability was not discussed in the NAS report or the subsequent AP-LS White Paper, but it is a promising area in which some new research has been done. While a court might order an eye exam when there are questions about whether an eyewitness’s eyesight allowed the eyewitness to perceive a face from a distance, courts have not asked for face memory tests. Such tests have been developed, and they show that some people are better at remembering new faces than others.42Dodson, supra.

One test widely used in experimental settings is the Cambridge Face Memory Test.43Brad Duchaine & Ken Nakayama, The Cambridge Face Memory Test: Results for Neurologically Intact Individuals and an Investigation of its Validity Using Inverted Face Stimuli and Prosopagnosic Participants, 44 Neuropsychologia 576 (2006). Recent research suggests that an eyewitness’s facial recognition ability affects identification accuracy. While individuals with excellent facial recognition ability display a strong relationship between confidence and accuracy, this is less true for average or weak face-recognizers. Weak face-recognizers who were 100% confident were roughly 60% accurate in their identifications.44Duchaine, supra; see also J.H. Grabman, David G. Dobolyi, N.L. Berelovich, & Chad S. Dodson, Predicting High Confidence Errors in Eyewitness Memory: The Role of Face Recognition Ability, Decision-Time, And Justifications, 8 J. App. Res. Memory and Cog. 233 (2019). If these results are replicated and also generalize to more ecologically valid situations, then police agencies may wish to “screen” the potential eyewitness first by conducting a face memory test. More work is needed to examine the utility of face memory testing, particularly with different races.




4. What Experts Know about Eyewitness Factors

Since the release of the NAS report in 2014, hundreds of papers have been published on eyewitness memory, including the AP-LS White Paper on this topic.45Gary L. Wells et al., Policy and Procedure Recommendations for the Collection and Preservation of Eyewitness Identification Evidence, American Psychology-Law Society, 44 Law & Hum. Behav. 3 (2020). What has been the impact of this research on what experts think about eyewitness memory? The most recent survey of experts and laypeople was conducted over 10 years ago.46J. Don Read & Sarah L. Desmarais, Lay Knowledge of Eyewitness Issues: A Canadian Evaluation, 23 Appl. Cognit. Psychol. 301 (2009). The time seemed ripe to survey experts about their opinions on various issues related to eyewitness memory research.

We surveyed scientists, researchers, and other academics who focus predominantly on eyewitness memory research. The survey consisted of 10 general field-related statements as well as 24 more specific eyewitness memory statements. Experts indicated their agreement or disagreement with each statement by selecting a point on a 7-point Likert scale ranging from 1 (strongly disagree) to 7 (strongly agree). If unsure, experts could instead select a “don’t know” option. In this report, we discuss experts’ opinions on only the 10 general field-related statements. Because data collection is ongoing, these results are preliminary and may change. We list these statements below (in Figure 2).

To date, 74 respondents have completed the survey. Figure 2 shows the distribution of responses using the 7-point Likert scale as well as the number of respondents who selected the “don’t know” option. The figure highlights the wide range of opinions for nearly every statement, apart from #4: “Researchers and practitioners communicating and collaborating with each other has been important for improving eyewitness evidence” (on which there was great consistency). When excluding those who selected “don’t know,” 92% of respondents agreed with that statement by selecting five, six, or seven on the Likert scale. Similarly, a large majority (85%) of respondents also agreed on the statement: “Eyewitness memory and identification research can be directly applied in practice.”

Figure 2: 2021 Eyewitness Survey Results

Several statements received mixed responses. For example, 39% disagreed, 49% agreed, and 12% neither agreed nor disagreed with the statement: “Practitioners can understand results from eyewitness memory and identification studies.” Similarly, 35% disagreed, 41% agreed, and 24% neither agreed nor disagreed with the statement: “Policies currently adopted by police agencies align with research-based recommendations regarding the collection of eyewitness evidence.” Lastly, 28% disagreed, 45% agreed, 19% neither agreed nor disagreed, and 8% selected “don’t know” for the statement: “The current law sets out clear rules for how eyewitness identification procedures should be conducted.”
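The percentages reported above can be reproduced from raw responses with a simple tally. The following minimal Python sketch shows one way to do so. The report states that selecting five, six, or seven counts as agreement; treating one through three as disagreement and four as "neither" is an assumption made here for illustration, as are the function name `summarize_likert`, the use of `None` to encode "don't know", and the sample responses, which are invented and are not survey data.

```python
from collections import Counter
from typing import Dict, Optional, Sequence


def summarize_likert(
    responses: Sequence[Optional[int]], exclude_dont_know: bool = True
) -> Dict[str, float]:
    """Collapse 7-point Likert responses into percentage categories.

    Each response is an integer from 1 to 7, or None for "don't know"
    (a hypothetical encoding). Agreement is 5-7, as stated in the report;
    1-3 as disagreement and 4 as neither are assumptions.
    """
    counts: Counter = Counter()
    for r in responses:
        if r is None:
            counts["dont_know"] += 1
        elif not 1 <= r <= 7:
            raise ValueError(f"response out of range: {r}")
        elif r <= 3:
            counts["disagree"] += 1
        elif r == 4:
            counts["neither"] += 1
        else:
            counts["agree"] += 1

    categories = ["disagree", "neither", "agree"]
    denominator = len(responses)
    if exclude_dont_know:
        # Percentages are taken over respondents who gave a rating.
        denominator -= counts["dont_know"]
    else:
        categories.append("dont_know")

    if denominator == 0:
        raise ValueError("no responses to summarize")
    return {c: 100 * counts[c] / denominator for c in categories}


# Invented responses for one statement; not data from the survey.
example = [1, 3, 4, 5, 6, 7, 5, 6, None, None]
print(summarize_likert(example))                           # rated responses only
print(summarize_likert(example, exclude_dont_know=False))  # "don't know" as its own share
```

The `exclude_dont_know` flag reflects the two ways results are reported in this section: some percentages exclude respondents who selected "don't know," while others report that option as its own share.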

Respondents generally disagreed with the remaining statements. As one might expect, experts disagreed with this statement the most: “Eyewitness memory and identification research is generally clear and free from controversy.” Disagreement was also high for this statement: “Where relevant, judges provide jurors with instructions that adequately inform jurors of the strengths and limitations of eyewitness evidence.”

Although data collection is ongoing, some preliminary observations can be made. Experts generally believe that (1) eyewitness identification research can be directly applied to practice and (2) it is very important to work closely with practitioners to implement evidence-based policies. These views echo the recommendations made by the NAS Committee in 2014. However, respondents have highlighted concerns about (a) practitioners’ understanding of eyewitness research and (b) whether current policies are evidence-based. These preliminary results therefore highlight areas of further research and focus.
