MIT News | Massachusetts Institute of Technology
An optimized solution for face recognition

[Image: a woman's face with reference points connected by lines]

The human brain seems to care a lot about faces. It’s dedicated a specific area to identifying them, and the neurons there are so good at their job that most of us can readily recognize thousands of individuals. With artificial intelligence, computers can now recognize faces with a similar efficiency — and neuroscientists at MIT’s McGovern Institute for Brain Research have found that a computational network trained to identify faces and other objects discovers a surprisingly brain-like strategy to sort them all out.

The finding, reported March 16 in Science Advances, suggests that the millions of years of evolution that have shaped circuits in the human brain have optimized our system for facial recognition.

“The human brain’s solution is to segregate the processing of faces from the processing of objects,” explains Katharina Dobs, who led the study as a postdoc in the lab of McGovern investigator Nancy Kanwisher, the Walter A. Rosenblith Professor of Cognitive Neuroscience at MIT. The artificial network that she trained did the same. “And that’s the same solution that we hypothesize any system that’s trained to recognize faces and to categorize objects would find,” she adds.

“These two completely different systems have figured out what a — if not the — good solution is. And that feels very profound,” says Kanwisher.

Functionally specific brain regions

More than 20 years ago, Kanwisher and her colleagues discovered a small spot in the brain’s temporal lobe that responds specifically to faces. This region, which they named the fusiform face area, is one of many brain regions Kanwisher and others have found that are dedicated to specific tasks, such as the detection of written words, the perception of vocal songs, and understanding language.

Kanwisher says that as she has explored how the human brain is organized, she has always been curious about the reasons for that organization. Does the brain really need special machinery for facial recognition and other functions? “‘Why questions’ are very difficult in science,” she says. But with a sophisticated type of machine learning called a deep neural network, her team could at least find out how a different system would handle a similar task.

Dobs, who is now a research group leader at Justus Liebig University Giessen in Germany, assembled hundreds of thousands of images with which to train a deep neural network in face and object recognition. The collection included the faces of more than 1,700 different people and hundreds of different kinds of objects, from chairs to cheeseburgers. All of these were presented to the network, with no clues about which was which. “We never told the system that some of those are faces, and some of those are objects. So it’s basically just one big task,” Dobs says. “It needs to recognize a face identity, as well as a bike or a pen.”
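For a concrete picture of that kind of setup, here is a minimal PyTorch sketch: one classifier over a flat label space that mixes face identities and object categories, with no face/object distinction ever supplied. This is not the study's code; the architecture choice and the object-class count are illustrative.

```python
# Minimal sketch (not the study's actual code): one network, one flat label
# space mixing ~1,700 face identities with object categories. The model is
# never told which labels are faces and which are objects.
import torch
import torch.nn as nn
from torchvision import models

NUM_FACE_IDS = 1700        # face identities, per the article
NUM_OBJECT_CLASSES = 334   # object categories (illustrative count)

model = models.vgg16(num_classes=NUM_FACE_IDS + NUM_OBJECT_CLASSES)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)

def train_step(images: torch.Tensor, labels: torch.Tensor) -> float:
    """One gradient step; labels index the combined face+object label space."""
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
    return loss.item()
```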

As the program learned to identify the objects and faces, it organized itself into an information-processing network that included units specifically dedicated to face recognition. As in the brain, this specialization emerged during the later stages of image processing: in both systems, the early steps of facial recognition rely on more general vision-processing machinery, while the final stages rely on face-dedicated components.
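One common way to quantify this kind of specialization is a per-unit selectivity index that contrasts mean responses to faces versus objects. The sketch below shows the idea; it is an illustrative analysis, not the paper's code.

```python
# Sketch: score each unit in a layer by how much more strongly it responds
# to faces than to objects. Indices near +1 flag candidate face-selective units.
import numpy as np

def face_selectivity(face_acts: np.ndarray, object_acts: np.ndarray) -> np.ndarray:
    """face_acts, object_acts: (n_images, n_units) activations to held-out
    face and object images. Returns a per-unit index in [-1, 1]."""
    mu_f = face_acts.mean(axis=0)
    mu_o = object_acts.mean(axis=0)
    return (mu_f - mu_o) / (mu_f + mu_o + 1e-8)
```

Applying such an index layer by layer is one way to watch the specialization concentrate in the later processing stages described above.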

It’s not known how face-processing machinery arises in a developing brain, but based on their findings, Kanwisher and Dobs say networks don’t necessarily require an innate face-processing mechanism to acquire that specialization. “We didn’t build anything face-ish into our network,” Kanwisher says. “The networks managed to segregate themselves without being given a face-specific nudge.”

Kanwisher says it was thrilling seeing the deep neural network segregate itself into separate parts for face and object recognition. “That’s what we’ve been looking at in the brain for 20-some years,” she says. “Why do we have a separate system for face recognition in the brain? This tells me it is because that is what an optimized solution looks like.”

Now, she is eager to use deep neural nets to ask similar questions about why other brain functions are organized the way they are. “We have a new way to ask why the brain is organized the way it is,” she says. “How much of the structure we see in human brains will arise spontaneously by training networks to do comparable tasks?”



Annual Review of Vision Science

Face Recognition by Humans and Machines: Three Fundamental Advances from Deep Learning (Volume 7, 2021, review article)

  • Alice J. O'Toole 1 and Carlos D. Castillo 2
  • Affiliations: 1 School of Behavioral and Brain Sciences, The University of Texas at Dallas, Richardson, Texas 75080, USA; email: [email protected]; 2 Department of Electrical and Computer Engineering, Johns Hopkins University, Baltimore, Maryland 21218, USA; email: [email protected]
  • Vol. 7:543–570 (volume publication date September 2021); https://doi.org/10.1146/annurev-vision-093019-111701
  • First published as a Review in Advance on August 4, 2021
  • Copyright © 2021 by Annual Reviews. All rights reserved

Deep learning models currently achieve human levels of performance on real-world face recognition tasks. We review scientific progress in understanding human face processing using computational approaches based on deep learning. This review is organized around three fundamental advances. First, deep networks trained for face identification generate a representation that retains structured information about the face (e.g., identity, demographics, appearance, social traits, expression) and the input image (e.g., viewpoint, illumination). This forces us to rethink the universe of possible solutions to the problem of inverse optics in vision. Second, deep learning models indicate that high-level visual representations of faces cannot be understood in terms of interpretable features. This has implications for understanding neural tuning and population coding in the high-level visual cortex. Third, learning in deep networks is a multistep process that forces theoretical consideration of diverse categories of learning that can overlap, accumulate over time, and interact. Diverse learning types are needed to model the development of human face processing skills, cross-race effects, and familiarity with individual faces.
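The first advance, that identity embeddings retain image-level information such as viewpoint, is the kind of claim typically tested with a linear probe. A hedged sketch follows; the file names and data are placeholders, not materials from the review.

```python
# Sketch: fit a linear probe to test whether viewpoint (yaw) can be read out
# of face-identity embeddings. Input files are hypothetical placeholders.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

embeddings = np.load("face_embeddings.npy")  # (n_images, d)
yaw = np.load("yaw_degrees.npy")             # (n_images,)

r2 = cross_val_score(Ridge(alpha=1.0), embeddings, yaw,
                     cv=5, scoring="r2").mean()
print(f"Viewpoint decodable from identity embeddings: R^2 = {r2:.2f}")
```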




Papers with Code

Face Recognition

604 papers with code • 23 benchmarks • 64 datasets

Facial Recognition is the task of making a positive identification of a face in a photo or video image against a pre-existing database of faces. It begins with detection - distinguishing human faces from other objects in the image - and then works on identification of those detected faces.

The state-of-the-art tables for this task are organized around its two main sub-tasks: face verification and face identification.
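As a rough illustration of the detection-then-identification flow described above, the sketch below uses OpenCV's bundled Haar cascade for detection and leaves the embedding model and identity gallery as placeholders; it is a minimal example, not a reference implementation.

```python
# Detection, then identification: find faces, embed each crop, and match it
# against a gallery of known identities. `embed` and `gallery` are placeholders.
import cv2
import numpy as np

detector = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def recognize(image_bgr, embed, gallery):
    """embed: face crop -> feature vector; gallery: {name: feature vector}."""
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    matches = []
    for (x, y, w, h) in detector.detectMultiScale(gray, scaleFactor=1.1,
                                                  minNeighbors=5):
        vec = embed(image_bgr[y:y + h, x:x + w])
        name = min(gallery, key=lambda k: np.linalg.norm(gallery[k] - vec))
        matches.append(((x, y, w, h), name))
    return matches
```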


Most implemented papers

FaceNet: A Unified Embedding for Face Recognition and Clustering

On the widely used Labeled Faces in the Wild (LFW) dataset, our system achieves a new record accuracy of 99.63%.
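In the FaceNet framing, verification reduces to thresholding the squared L2 distance between unit-normalized embeddings. A minimal sketch follows; the threshold value is illustrative and would be tuned on a validation set.

```python
# Sketch of FaceNet-style verification: two images show the same person iff
# the squared L2 distance between normalized embeddings is below a threshold.
import numpy as np

def same_person(emb_a: np.ndarray, emb_b: np.ndarray,
                threshold: float = 1.1) -> bool:  # threshold is illustrative
    a = emb_a / np.linalg.norm(emb_a)
    b = emb_b / np.linalg.norm(emb_b)
    return float(np.sum((a - b) ** 2)) < threshold
```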

ArcFace: Additive Angular Margin Loss for Deep Face Recognition


Recently, a popular line of research in face recognition is adopting margins in the well-established softmax loss function to maximize class separability.
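The additive angular margin itself is compact enough to sketch: normalize features and class weights, add a margin m to the angle of the target class, and rescale by s before the usual cross-entropy. The sketch below follows the paper's published formulation and its reported defaults (s = 64, m = 0.5), but it is a simplified illustration rather than the reference code.

```python
# Sketch of ArcFace's additive angular margin: penalize the target-class
# angle by m, then rescale by s and apply ordinary softmax cross-entropy.
import torch
import torch.nn.functional as F

def arcface_logits(embeddings, weights, labels, s=64.0, m=0.5):
    """embeddings: (B, d); weights: (C, d) class centers; labels: (B,)."""
    cos = F.normalize(embeddings) @ F.normalize(weights).t()      # (B, C)
    theta = torch.acos(cos.clamp(-1 + 1e-7, 1 - 1e-7))
    target = F.one_hot(labels, num_classes=weights.size(0)).bool()
    logits = s * torch.cos(torch.where(target, theta + m, theta))
    return logits  # use with F.cross_entropy(logits, labels)
```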

VGGFace2: A dataset for recognising faces across pose and age

The dataset was collected with three goals in mind: (i) to have both a large number of identities and also a large number of images for each identity; (ii) to cover a large range of pose, age and ethnicity; and (iii) to minimize the label noise.

SphereFace: Deep Hypersphere Embedding for Face Recognition

This paper addresses deep face recognition (FR) problem under open-set protocol, where ideal face features are expected to have smaller maximal intra-class distance than minimal inter-class distance under a suitably chosen metric space.

A Light CNN for Deep Face Representation with Noisy Labels

This paper presents a Light CNN framework to learn a compact embedding on the large-scale face data with massive noisy labels.

Learning Face Representation from Scratch

The current situation in the field of face recognition is that data is more important than algorithm.

Circle Loss: A Unified Perspective of Pair Similarity Optimization

This paper provides a pair similarity optimization viewpoint on deep feature learning, aiming to maximize the within-class similarity $s_p$ and minimize the between-class similarity $s_n$.

MS-Celeb-1M: A Dataset and Benchmark for Large-Scale Face Recognition

In this paper, we design a benchmark task and provide the associated datasets for recognizing face images and link them to corresponding entity keys in a knowledge base.

CosFace: Large Margin Cosine Loss for Deep Face Recognition


The central task of face recognition, including face verification and identification, involves face feature discrimination.

RepMLP: Re-parameterizing Convolutions into Fully-connected Layers for Image Recognition

We propose RepMLP, a multi-layer-perceptron-style neural network building block for image recognition, which is composed of a series of fully-connected (FC) layers.

Sensors (Basel)

Face Recognition Systems: A Survey

Yassin Kortli 1,2, Maher Jridi 1, Ayman Al Falou 1, and Mohamed Atri 3

1 AI-ED Department, Yncrea Ouest, 20 rue du Cuirassé de Bretagne, 29200 Brest, France; [email protected] (M.J.); [email protected] (A.A.F.)

2 Electronic and Micro-electronic Laboratory, Faculty of Sciences of Monastir, University of Monastir, Monastir 5000, Tunisia

3 College of Computer Science, King Khalid University, Abha 61421, Saudi Arabia; as.ude.ukk@irtam

Over the past few decades, interest in theories and algorithms for face recognition has been growing rapidly. Video surveillance, criminal identification, building access control, and unmanned and autonomous vehicles are just a few examples of concrete applications that are gaining traction among industries. Various techniques are being developed, including local, holistic, and hybrid approaches, which describe a face image using either a few face image features or the whole face. The main contribution of this survey is to review some well-known techniques for each approach and to give the taxonomy of their categories. In addition, a detailed comparison between these techniques is presented, listing the advantages and disadvantages of their schemes in terms of robustness, accuracy, complexity, and discrimination. The paper also addresses the databases used for face recognition: an overview of the most commonly used databases, including those for supervised and unsupervised learning, is given. Numerical results of the most interesting techniques are reported along with the context of the experiments and the challenges handled by these techniques. Finally, the paper closes with a solid discussion of future directions in terms of techniques to be used for face recognition.

1. Introduction

The objective of developing biometric applications, such as facial recognition, has recently become important in smart cities. In addition, many scientists and engineers around the world have focused on establishing increasingly robust and accurate algorithms and methods for these types of systems and their application in everyday life. All security systems must protect personal data. The most commonly used credential for recognition is the password. However, with the development of information technologies and security algorithms, many systems have begun to use biometric factors for recognition tasks [ 1 , 2 , 3 , 4 ]. These biometric factors make it possible to identify people by their physiological or behavioral characteristics. They also provide several advantages: for example, the presence of a person in front of the sensor is sufficient, and there is no need to remember several passwords or confidential codes anymore. In this context, many recognition systems based on different biometric factors, such as the iris, fingerprints [ 5 ], voice [ 6 ], and face, have been deployed in recent years.

Systems that identify people based on their biological characteristics are very attractive because they are easy to use. The human face is composed of distinctive structures and characteristics. For this reason, in recent years it has become one of the most widely used biometric traits for authentication, given its potential in many applications and fields (surveillance, home security, border control, and so on) [ 7 , 8 , 9 ]. Facial recognition as a form of ID (identity) is already being offered to consumers beyond phones, including at airport check-ins, sports stadiums, and concerts. Moreover, such a system does not require human intervention to operate, which makes it possible to identify people directly from images obtained from the camera. Many biometric systems developed through different lines of research already provide good identification accuracy. However, it remains worthwhile to develop new biometric systems for face recognition that also meet real-time constraints.

Owing to the huge volume of data generated and rapid advancement in artificial intelligence techniques, traditional computing models have become inadequate for processing data, especially for complex applications like those involving feature extraction. Graphics processing units (GPUs) [ 4 ], central processing units (CPUs) [ 3 ], and field-programmable gate arrays (FPGAs) [ 10 ] are required to perform complex computing tasks efficiently. GPUs have many more computing cores than traditional CPUs and offer greater capacity for parallel computing. Unlike GPUs, FPGAs have a flexible hardware configuration and offer better performance than GPUs in terms of energy efficiency. However, FPGAs present a major drawback: their development time is higher than that of CPUs and GPUs.

Many computer vision approaches have been proposed to address face detection and recognition with high robustness and discrimination, including local, subspace, and hybrid approaches [ 10 , 11 , 12 , 13 , 14 , 15 , 16 ]. However, several issues still need to be addressed owing to various challenges, such as head orientation, lighting conditions, and facial expression. The most interesting techniques are developed to handle all these challenges and thus yield reliable face recognition systems. Nevertheless, they require high processing time, consume a lot of memory, and are relatively complex.

Rapid advances in technologies such as digital cameras, portable devices, and increased demand for security make the face recognition system one of the primary biometric technologies.

To sum up, the contributions of this paper review are as follows:

  • We first introduced face recognition as a biometric technique.
  • We presented the state of the art of the existing face recognition techniques classified into three approaches: local, holistic, and hybrid.
  • The surveyed approaches were summarized and compared under different conditions.
  • We presented the most popular face databases used to test these approaches.
  • We highlighted some new promising research directions.

2. Face Recognition Systems Survey

2.1. Essential Steps of Face Recognition Systems

Before detailing the techniques used, it is necessary to briefly describe the problems that must be faced and solved in order to perform the face recognition task correctly. For several security applications, as detailed in the works of [ 17 , 18 , 19 , 20 , 21 , 22 ], the characteristics that make a face recognition system useful are the following: its ability to work with both videos and images, to process in real time, to be robust in different lighting conditions, to be independent of the person (regardless of hair, ethnicity, or gender), and to be able to work with faces from different angles. Different types of sensors, including RGB, depth, EEG, thermal, and wearable inertial sensors, are used to obtain data. These sensors may provide extra information and help face recognition systems identify face images in both static images and video sequences.

Moreover, three categories of sensors may improve the reliability and the accuracy of a face recognition system by tackling challenges that arise in pure image/video processing, including illumination variation, head pose, and facial expression. The first group is non-visual sensors, such as audio, depth, and EEG sensors, which provide extra information in addition to the visual dimension and improve recognition reliability, for example, under illumination variation and position shifts. The second is detailed-face sensors, which detect small dynamic changes of a face component; eye-trackers, for example, may help differentiate background noise from the face images. The last is target-focused sensors, such as infrared thermal sensors, which can help face recognition systems filter out useless visual content and resist illumination variation.

Three basic steps are used to develop a robust face recognition system: (1) face detection, (2) feature extraction, and (3) face recognition (shown in Figure 1) [ 3 , 23 ]. The face detection step detects and locates the human face in the image obtained by the system. The feature extraction step extracts the feature vectors for any human face located in the first step. Finally, the face recognition step compares the features extracted from the human face with all template face databases to decide the identity of the face.

  • Face Detection : The face recognition system begins with the localization of human faces in a given image. The purpose of this step is to determine whether the input image contains human faces. Variations in illumination and facial expression can prevent proper face detection. To facilitate the design of the subsequent recognition stages and make them more robust, pre-processing steps are performed. Many techniques are used to detect and locate the human face image, for example, the Viola–Jones detector [ 24 , 25 ], the histogram of oriented gradients (HOG) [ 13 , 26 ], and principal component analysis (PCA) [ 27 , 28 ]. The face detection step can also be used for video and image classification, object detection [ 29 ], region-of-interest detection [ 30 ], and so on.
  • Feature Extraction : The main function of this step is to extract the features of the face images detected in the detection step. This step represents the face with a set of feature vectors, called a “signature,” that describes the prominent features of the face image, such as the mouth, nose, and eyes, along with their geometric distribution [ 31 , 32 ]. Each face is characterized by its structure, size, and shape, which allow it to be identified. Several techniques extract the shape of the mouth, eyes, or nose to identify the face using sizes and distances [ 3 ]. HOG [ 33 ], Eigenfaces [ 34 ], independent component analysis (ICA), linear discriminant analysis (LDA) [ 27 , 35 ], the scale-invariant feature transform (SIFT) [ 23 ], Gabor filters, local phase quantization (LPQ) [ 36 ], Haar wavelets, Fourier transforms [ 31 ], and the local binary pattern (LBP) [ 3 , 10 ] are widely used to extract face features.
  • Face Recognition : This step takes the features extracted from the face during the feature extraction step and compares them with known faces stored in a specific database. There are two general applications of face recognition: one is called identification and the other is called verification. During the identification step, a test face is compared with a set of faces, aiming to find the most likely match. During the verification step, a test face is compared with a known face in the database in order to make the acceptance or rejection decision [ 7 , 19 ]. Correlation filters (CFs) [ 18 , 37 , 38 ], convolutional neural networks (CNNs) [ 39 ], and k-nearest neighbors (K-NN) [ 40 ] are known to effectively address this task. A minimal sketch of the two recognition modes appears after this list.
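The sketch below contrasts the two modes named in the last item, assuming each enrolled identity is stored as a single template vector; the distance metric and threshold are illustrative choices.

```python
# Identification (1:N): search the gallery for the closest template.
# Verification (1:1): check a claimed identity against its own template.
import numpy as np

def identify(probe: np.ndarray, gallery: dict) -> str:
    """gallery: {name: template vector}. Returns the best-matching identity."""
    return min(gallery, key=lambda name: np.linalg.norm(gallery[name] - probe))

def verify(probe: np.ndarray, claimed: str, gallery: dict,
           threshold: float = 0.6) -> bool:  # threshold is system-tuned
    return float(np.linalg.norm(gallery[claimed] - probe)) < threshold
```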

Figure 1. Face recognition structure [ 3 , 23 ].

2.2. Classification of Face Recognition Systems

Compared with other biometric systems, such as iris or fingerprint recognition, the face recognition system is not the most efficient and reliable [ 5 ]. Moreover, despite all the advantages above, this biometric system faces many constraints resulting from numerous challenges. Recognition under controlled environments has become saturated; in uncontrolled environments, however, the problem remains open owing to large variations in lighting conditions, facial expressions, age, dynamic backgrounds, and so on. In this survey, we review the most advanced face recognition techniques proposed for controlled/uncontrolled environments using different databases.

Several systems have been implemented to identify a human face in 2D or 3D images. In this review, we classify these systems into three approaches based on their detection and recognition methods ( Figure 2 ): (1) local, (2) holistic (subspace), and (3) hybrid approaches. The first approach relies on certain facial features rather than the whole face. The second approach employs the entire face as input and then projects it into a small subspace or onto a correlation plane. The third approach uses both local and global features in order to improve face recognition accuracy.

Figure 2. Face recognition methods. SIFT, scale-invariant feature transform; SURF, speeded-up robust features; BRIEF, binary robust independent elementary features; LBP, local binary pattern; HOG, histogram of oriented gradients; LPQ, local phase quantization; PCA, principal component analysis; LDA, linear discriminant analysis; KPCA, kernel PCA; CNN, convolutional neural network; SVM, support vector machine.

3. Local Approaches

In the context of face recognition, local approaches treat only some facial features. They are more sensitive to facial expressions, occlusions, and pose [ 1 ]. The main objective of these approaches is to discover distinctive features. Generally, these approaches can be divided into two categories: (1) local appearance-based techniques are used to extract local features, while the face image is divided into small regions (patches) [ 3 , 32 ]. (2) Key-points-based techniques are used to detect the points of interest in the face image, after which the features localized on these points are extracted.

3.1. Local Appearance-Based Techniques

These are geometrical techniques, also called feature-based or analytic techniques. In this case, the face image is represented by a set of low-dimensional distinctive vectors or small regions (patches). Local appearance-based techniques focus on critical points of the face, such as the nose, mouth, and eyes, to generate more details. They also take into account the particularity of the face as a natural form to identify and use a reduced number of parameters. In addition, these techniques describe local features through pixel orientations, histograms [ 13 , 26 ], geometric properties, and correlation planes [ 3 , 33 , 41 ].

Figure 3. The local binary pattern (LBP) descriptor [ 19 ].

Khoi et al. [ 20 ] proposed a fast face recognition system based on LBP, the pyramid of local binary patterns (PLBP), and the rotation-invariant local binary pattern (RI-LBP). Xi et al. [ 15 ] introduced a new unsupervised deep-learning-based technique, called the local binary pattern network (LBPNet), to extract hierarchical representations of data. The LBPNet maintains the same topology as a convolutional neural network (CNN). Experimental results obtained on public benchmarks (i.e., LFW and FERET) showed that LBPNet is comparable to other unsupervised techniques. Laure et al. [ 40 ] implemented a method that helps solve face recognition issues under large variations of parameters such as expression, illumination, and pose. This method is based on two techniques: LBP and K-NN. Owing to its invariance to rotation of the target image, LBP has become one of the most important techniques for face recognition. Bonnen et al. [ 42 ] proposed a variant of the LBP technique named the “multiscale local binary pattern (MLBP)” for feature extraction. Another LBP extension is the local ternary pattern (LTP) technique [ 43 ], which is less sensitive to noise than the original LBP; it uses a three-valued code for the differences between the neighboring pixels and the central pixel. Hussain et al. [ 36 ] developed a local quantized pattern (LQP) technique for face representation. LQP is a generalization of local pattern features and is intrinsically robust to illumination conditions. The LQP features use a disk layout to sample pixels from the local neighborhood and obtain a pair of binary codes using ternary split coding. These codes are quantized, each using a separately learned codebook.
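To make the basic LBP operator concrete, here is a small NumPy sketch of the original 3×3 formulation: each pixel is replaced by an 8-bit code obtained by thresholding its neighbors at the center value, and a histogram of codes serves as the descriptor. The extensions above (PLBP, RI-LBP, MLBP, LTP, LQP) all build on this core idea.

```python
# Sketch of the basic 3x3 LBP operator: threshold the 8 neighbors at the
# center pixel's value, pack the results into an 8-bit code, then histogram.
import numpy as np

def lbp_codes(gray: np.ndarray) -> np.ndarray:
    """gray: 2D grayscale image. Returns LBP codes for the interior pixels."""
    c = gray[1:-1, 1:-1]
    neighbors = [gray[:-2, :-2], gray[:-2, 1:-1], gray[:-2, 2:],
                 gray[1:-1, 2:], gray[2:, 2:], gray[2:, 1:-1],
                 gray[2:, :-2], gray[1:-1, :-2]]
    codes = np.zeros(c.shape, dtype=np.int32)
    for bit, n in enumerate(neighbors):
        codes |= (n >= c).astype(np.int32) << bit
    return codes

def lbp_histogram(gray: np.ndarray) -> np.ndarray:
    """256-bin histogram of codes; in practice computed per patch and stacked."""
    return np.bincount(lbp_codes(gray).ravel(), minlength=256)
```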

In the histogram of oriented gradients (HOG) descriptor, the gradient magnitude and orientation of each pixel in a cell are voted into nine orientation bins with tri-linear interpolation. A histogram is generated for each cell from these direction gradients and, finally, the histograms of all cells are combined to form the feature vector of the face image. Karaaba et al. [ 44 ] proposed a combination of different HOG descriptors, named “multi-HOG”, to build a robust face recognition system.

The authors create a vector of distances between the target and reference face images for identification. Arigbabu et al. [ 46 ] proposed a novel face recognition system based on the Laplacian filter and the pyramid histogram of gradients (PHOG) descriptor; to investigate the face recognition problem, a support vector machine (SVM) with different kernel functions is used.
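As a hedged illustration of this family of methods, the sketch below extracts HOG features with nine orientation bins and trains an SVM classifier; the cell and block sizes and the dataset variables are assumptions, not the exact settings of [ 44 ] or [ 46 ].

```python
# Sketch: HOG features fed to an SVM classifier (dataset loading assumed).
from skimage.feature import hog
from sklearn.svm import SVC

def hog_features(gray_images):
    """9-bin HOG over 8x8-pixel cells with 2x2-cell block normalization."""
    return [hog(img, orientations=9, pixels_per_cell=(8, 8),
                cells_per_block=(2, 2), block_norm='L2-Hys')
            for img in gray_images]

# train_imgs/train_ids and test_imgs are placeholders for aligned,
# equally sized grayscale face crops and their identity labels.
# clf = SVC(kernel='rbf').fit(hog_features(train_imgs), train_ids)
# predictions = clf.predict(hog_features(test_imgs))
```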

Figure 4. All “4f” optical configuration [ 37 ].

For example, the VLC technique is implemented by two cascaded Fourier-transform structures realized with two lenses [ 4 ], as presented in Figure 5 . The VLC technique proceeds as follows: first, a 2D-FFT is applied to the target image to obtain a target spectrum S. Next, in the Fourier plane, this target spectrum is multiplied by the filter obtained from the 2D-FFT of a reference image. Finally, an inverse FFT of this product gives the correlation result, which is recorded in the correlation plane.

Figure 5. Flowchart of the VanderLugt correlator (VLC) technique [ 4 ]. FFT, fast Fourier transform; POF, phase-only filter.

The correlation result, described by the peak intensity, is used to determine the degree of similarity between the target and reference images:

$$C = \mathrm{FFT}^{-1}\left( S(u,v) \circ H^{*}(u,v) \right), \quad (6)$$

where $\mathrm{FFT}^{-1}$ stands for the inverse fast Fourier transform (FFT) operation, $*$ represents the conjugate operation, and $\circ$ denotes element-wise array multiplication. To enhance the matching process, Horner and Gianino [ 49 ] proposed a phase-only filter (POF), which produces correlation peaks with enhanced discrimination capability. The POF is an optimized filter defined as follows:

$$H_{\mathrm{POF}}(u,v) = \frac{S^{*}(u,v)}{\left| S(u,v) \right|}, \quad (7)$$

where $S^{*}(u,v)$ is the complex conjugate of the 2D-FFT of the reference image. To evaluate the decision, the peak-to-correlation energy (PCE) is defined as the energy of the correlation peak intensity normalized by the overall energy of the correlation plane:

$$\mathrm{PCE} = \frac{\sum_{i,j}^{N} E_{\mathrm{peak}}(i,j)}{\sum_{i,j}^{M} E_{\mathrm{correlation\ plane}}(i,j)}, \quad (8)$$

where $i, j$ are the coefficient coordinates; $M$ and $N$ are the size of the correlation plane and of the correlation peak spot, respectively; $E_{\mathrm{peak}}$ is the energy of the correlation peak; and $E_{\mathrm{correlation\ plane}}$ is the overall energy of the correlation plane. Correlation techniques are widely applied in recognition and identification applications [ 4 , 37 , 50 , 51 , 52 , 53 ]. For example, in the work of [ 4 ], the authors presented the efficiency of the VLC technique based on the “4f” configuration for identification, implemented on an Nvidia GeForce 8400 GS GPU; the POF filter is used for the decision. Another important work in this area is presented by Leonard et al. [ 50 ], who demonstrated the good performance and simplicity of correlation filters for face recognition. In addition, many specific filters, such as POF, BPOF, Ad, and IF, are compared to select the best filter based on its sensitivity to rotation, scale, and noise. Napoléon et al. [ 3 ] introduced a novel system for identification and verification based on optimized 3D modeling under different illumination conditions, which allows reconstructing faces in different poses. In particular, to deform the synthetic model, an active shape model detecting a set of key points on the face is proposed (Figure 6). The VanderLugt correlator performs the identification, and the LBP descriptor is used to optimize the performance of the correlation technique under different illumination conditions. The experiments are performed on the Pointing Head Pose Image Database (PHPID) with an elevation ranging from −30° to +30°.
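The following minimal NumPy sketch illustrates the frequency-domain pipeline of Equations (6)–(8): a POF filter built from a reference face, correlation with a target face, and a simplified PCE score. The peak-window size is an assumed parameter, not a value prescribed by the cited works.

```python
# Illustrative POF correlation and a simplified PCE decision score.
import numpy as np

def pof_filter(reference: np.ndarray) -> np.ndarray:
    """Phase-only filter: conjugate reference spectrum, unit magnitude."""
    s = np.fft.fft2(reference)
    return np.conj(s) / np.maximum(np.abs(s), 1e-12)

def correlate(target: np.ndarray, h: np.ndarray) -> np.ndarray:
    """Correlation plane: inverse FFT of (target spectrum x filter)."""
    return np.abs(np.fft.ifft2(np.fft.fft2(target) * h))

def pce(plane: np.ndarray, spot: int = 3) -> float:
    """Energy in a small window around the peak over total plane energy."""
    e = plane ** 2
    i, j = np.unravel_index(np.argmax(e), e.shape)
    peak = e[max(i - spot, 0): i + spot + 1,
             max(j - spot, 0): j + spot + 1]
    return float(peak.sum() / e.sum())

# score = pce(correlate(target_img, pof_filter(reference_img)))
```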

Figure 6. ( a ) Creation of the 3D face of a person; ( b ) detection of 29 landmarks of a face using the active shape model; ( c ) detection of 26 landmarks of a face [ 3 ].

3.2. Key-Points-Based Techniques

The key-points-based techniques detect specific geometric features according to geometric information on the face surface (e.g., the distance between the eyes, the width of the head). These techniques comprise two significant steps: key-point detection and feature extraction [ 3 , 30 , 54 , 55 ]. The first step concerns the performance of the detectors of key-point features in the face image; the second concerns the representation of the information carried by those key points. These techniques can also handle missing parts and occlusions. The scale-invariant feature transform (SIFT), binary robust independent elementary features (BRIEF), and speeded-up robust features (SURF) techniques are widely used to describe the features of the face image.

  • Scale-invariant feature transform (SIFT) [ 56 , 57 ]: SIFT is an algorithm used to detect and describe the local features of an image. It is widely used to relate two images through local descriptors that contain the information needed to match them. The main idea of the SIFT descriptor is to convert the image into a representation composed of points of interest that contain the characteristic information of the face image. SIFT is invariant to scale and rotation and fast enough for near-real-time applications; one of its disadvantages is the matching time of the key points. The algorithm has four steps: (1) detection of the extrema in scale-space, (2) localization of the characteristic points, (3) orientation assignment, and (4) computation of the key-point descriptor. A framework to detect key points based on the SIFT descriptor was proposed by L. Lenc et al. [ 56 ], who combined the SIFT technique with a Kepenekci approach for face recognition.
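A minimal OpenCV sketch of SIFT key-point matching between two face images is given below; the ratio-test threshold is a conventional choice, not a value prescribed by [ 56 ] (requires opencv-python ≥ 4.4, where SIFT is in the main module).

```python
# Sketch: SIFT key points matched with a brute-force matcher + ratio test.
import cv2

def sift_match_score(img_a, img_b, ratio=0.75):
    sift = cv2.SIFT_create()
    _, des_a = sift.detectAndCompute(img_a, None)
    _, des_b = sift.detectAndCompute(img_b, None)
    if des_a is None or des_b is None:
        return 0
    matcher = cv2.BFMatcher(cv2.NORM_L2)
    matches = matcher.knnMatch(des_a, des_b, k=2)
    # Keep matches whose best distance is clearly below the second best.
    good = [p[0] for p in matches
            if len(p) == 2 and p[0].distance < ratio * p[1].distance]
    return len(good)
```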

Figure 7. Face recognition based on the speeded-up robust features (SURF) descriptor [ 58 ]: recognition using fast library for approximate nearest neighbors (FLANN) distance.

  • Binary robust independent elementary features (BRIEF) [ 30 , 57 ]: BRIEF is a binary descriptor that is simple and fast to compute. It is based on differences in pixel intensity, similar in evaluation to other binary descriptors such as binary robust invariant scalable keypoints (BRISK) and fast retina keypoint (FREAK). To reduce noise, the BRIEF descriptor first smooths the image patches; the differences between pixel intensities are then used to build the descriptor. BRIEF achieves good performance and accuracy in pattern recognition at a very low computational cost.

Figure 8. Fast retina keypoint (FREAK) descriptor using 43 sampling patterns [ 19 ].

3.3. Summary of Local Approaches

Table 1 summarizes the local approaches presented in this section. Various techniques are introduced to locate and identify human faces based on regions of the face, geometric features, and facial expressions. These techniques provide robust recognition under different illumination conditions and facial expressions, and are invariant to translations and rotations, although they remain sensitive to noise.

Summary of local approaches. SIFT, scale-invariant feature transform; SURF, speeded-up robust features; BRIEF, binary robust independent elementary features; LBP, local binary pattern; HOG, histogram of oriented gradients; LPQ, local phase quantization; PCA, principal component analysis; LDA, linear discriminant analysis; KPCA, kernel PCA; CNN, convolutional neural network; SVM, support vector machine; PLBP, pyramid of LBP; KNN, k-nearest neighbor; MLBP, multiscale LBP; LTP, local ternary pattern; PHOG, pyramid HOG; VLC, VanderLugt correlator; LFW, Labeled Faces in the Wild; FERET, Face Recognition Technology; PHPID, Pointing Head Pose Image Database; PCE, peak-to-correlation energy; POF, phase-only filter; PSR, peak-to-sidelobe ratio.

| Author | Technique Used | Database | Matching | Limitation | Advantage | Result |
| --- | --- | --- | --- | --- | --- | --- |
| **Local appearance-based techniques** | | | | | | |
| Khoi et al. [ 20 ] | LBP | TDF; CF1999; LFW | MAP | Skewness in face image | Robust feature in frontal face | 5%; 13.03%; 90.95% |
| Xi et al. [ 15 ] | LBPNet | FERET; LFW | Cosine similarity | Complexities of CNN | High recognition accuracy | 97.80%; 94.04% |
| Khoi et al. [ 20 ] | PLBP | TDF; CF; LFW | MAP | Skewness in face image | Robust feature in frontal face | 5.50%; 9.70%; 91.97% |
| Laure et al. [ 40 ] | LBP and KNN | LFW; CMU-PIE | KNN | Illumination conditions | Robust | 85.71%; 99.26% |
| Bonnen et al. [ 42 ] | MRF and MLBP | AR (scream); FERET (wearing sunglasses) | Cosine similarity | Landmark extraction fails or is not ideal | Robust to changes in facial expression | 86.10%; 95% |
| Ren et al. [ ] | Relaxed LTP | CMU-PIE; Yale B | Chi-square distance | Noise level | Superior performance compared with LBP, LTP | 95.75%; 98.71% |
| Hussain et al. [ 36 ] | LQP | FERET; LFW | Cosine similarity | Lot of discriminative information | Robust to illumination variations | 99.20%; 75.30% |
| Karaaba et al. [ 44 ] | HOG and MMD | FERET; LFW | MMD/MLPD | Low recognition accuracy | Aligning difficulties | 68.59%; 23.49% |
| Arigbabu et al. [ 46 ] | PHOG and SVM | LFW | SVM | Complexity and time of computation | Head pose variation | 88.50% |
| Leonard et al. [ 50 ] | VLC correlator | PHPID | ASPOF | Low number of reference images used | Robustness to noise | 92% |
| Napoléon et al. [ 3 ] | LBP and VLC | YaleB; YaleB Extended | POF | Illumination | Rotation + translation | 98.40%; 95.80% |
| Heflin et al. [ ] | Correlation filter | LFW/PHPID | PSR | Some pre-processing steps | More effort on the eye localization stage | 39.48% |
| Zhu et al. [ ] | PCA–FCF | CMU-PIE; FRGC2.0 | Correlation filter | Uses only linear methods | Occlusion-insensitive | 96.60%; 91.92% |
| Seo et al. [ ] | LARK + PCA | LFW | Cosine similarity | Face detection | Reducing computational complexity | 78.90% |
| Ghorbel et al. [ ] | VLC + DoG | FERET | PCE | Low recognition rate | Robustness | 81.51% |
| Ghorbel et al. [ ] | uLBP + DoG | FERET | Chi-square distance | Robustness | Processing time | 93.39% |
| Ouerhani et al. [ ] | VLC | PHPID | PCE | Power | Processing time | 77% |
| Lenc et al. [ 56 ] | SIFT | FERET; AR; LFW | A posteriori probability | Still far from perfect | Sufficiently robust on lower-quality real data | 97.30%; 95.80%; 98.04% |
| Du et al. [ ] | SURF | LFW | FLANN distance | Processing time | Robustness and distinctiveness | 95.60% |
| Vinay et al. [ ] | SURF + SIFT | LFW; Face94 | FLANN distance | Processing time | Robust in unconstrained scenarios | 78.86%; 96.67% |
| Calonder et al. [ ] | BRIEF | — | KNN | Low recognition rate | Low processing time | 48% |

4. Holistic Approach

Holistic or subspace approaches process the whole face; that is, they do not require extracting face regions or feature points (eyes, mouth, nose, and so on). The main function of these approaches is to represent the face image as a matrix of pixels, which is often converted into feature vectors to facilitate processing; these feature vectors are then embedded in a low-dimensional space. Holistic or subspace techniques are sensitive to variations (facial expressions, illumination, and poses), but their simplicity has made them widely used. They can be divided into two categories, linear and non-linear, according to the method used to represent the subspace.

4.1. Linear Techniques

The most popular linear techniques used in face recognition systems are the Eigenfaces (principal component analysis; PCA) technique, the Fisherfaces (linear discriminant analysis; LDA) technique, and independent component analysis (ICA).

Figure 9. Example of dimensionality reduction when applying principal component analysis (PCA) [ 62 ].

Figure 10. The first five Eigenfaces built from the ORL database [ 63 ].

An image may also be considered as a vector of dimension $M \times N$, so that a typical image of size $4 \times 4$ becomes a vector of dimension 16. Let the training set of images be $\{X_1, X_2, X_3, \ldots, X_N\}$. The average face of the set is defined by the following:

$$\Psi = \frac{1}{N} \sum_{i=1}^{N} X_i.$$

The estimated covariance matrix represents the degree of scatter of all feature vectors around the average vector. The covariance matrix $Q$ is defined by the following:

$$Q = \frac{1}{N} \sum_{i=1}^{N} \left( X_i - \Psi \right)\left( X_i - \Psi \right)^{T}.$$

The eigenvectors and corresponding eigenvalues are computed using

$$Q V = \lambda V,$$

where $V$ is the set of eigenvectors of the matrix $Q$ associated with the eigenvalues $\lambda$. All the training images of the $i$th person are then projected onto the corresponding eigen-subspace:

$$y_k^{i} = V^{T}\left( x_k^{i} - \Psi \right),$$

where the $y_k^{i}$ are the projections of $x$, called the principal components, also known as Eigenfaces. The face images are represented as linear combinations of these “principal component” vectors. In related work, PCA and LDA are used as two different feature-extraction algorithms, wavelet fusion and neural networks are applied to classify the facial features, and the ORL database is used for evaluation. Figure 10 shows the first five Eigenfaces constructed from the ORL database [ 63 ].
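For concreteness, a minimal NumPy sketch of the Eigenfaces computation above follows; it uses an SVD instead of forming $Q$ explicitly (a standard, equivalent shortcut), and the number of retained components is an assumption.

```python
# Minimal Eigenfaces sketch: mean face, leading eigenvectors, projection.
import numpy as np

def fit_eigenfaces(X: np.ndarray, k: int = 50):
    """X: (n_images, n_pixels) matrix of flattened, aligned face images."""
    mean = X.mean(axis=0)
    centered = X - mean
    # SVD of the centered data avoids building the huge covariance matrix Q.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:k]                   # the k leading eigenfaces

def project(faces: np.ndarray, mean, eigenfaces):
    """Principal components y = V^T (x - mean) for each face."""
    return (faces - mean) @ eigenfaces.T

# mean, V = fit_eigenfaces(train_matrix)
# signatures = project(train_matrix, mean, V)  # compared by, e.g., L2
```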

Figure 11. The first five Fisherfaces obtained from the ORL database [ 63 ].

  • Independent component analysis (ICA) [ 35 ]: The ICA technique computes the basis vectors of a given space by performing a linear transformation that reduces the statistical dependence between them, allowing the analysis of independent components. Unlike in PCA, the resulting basis vectors are not required to be orthogonal. ICA seeks to represent images from different sources as statistically independent, rather than merely uncorrelated, variables, which can yield greater efficiency.
  • Improvements of the PCA, LDA, and ICA techniques: Much research has been devoted to improving the linear subspace techniques. Z. Cui et al. [ 67 ] proposed a spatial face region descriptor (SFRD) method to extract face regions and deal with noise variation, described as follows: divide each face image into spatial regions; extract token-frequency (TF) features from each region by sum-pooling the reconstruction coefficients over the patches within the region; and finally, extract the SFRD by applying a variant of PCA called “whitened principal component analysis (WPCA)” to reduce the feature dimension and remove the noise in the leading eigenvectors. In addition, the authors of [ 68 ] proposed a variant of LDA called probabilistic linear discriminant analysis (PLDA), which seeks directions in space with maximum discriminability and is hence well suited for face recognition, including frontal face recognition under varying pose.
  • Gabor filters: Gabor filters are spatial sinusoids localized by a Gaussian window that extract features from images at selected frequencies, orientations, and scales. To enhance face recognition performance in unconstrained environments, in the work of [ 69 ] Gabor filters are transformed according to shape and pose, and the extracted feature vectors are combined with PCA, which removes redundancies and yields the best description of the face images. Finally, the cosine metric is used to evaluate the similarity.
  • Frequency domain analysis [ 70 , 71 ]: Analysis techniques in the frequency domain represent the human face through low-frequency components that carry high energy. The discrete Fourier transform (DFT), discrete cosine transform (DCT), and discrete wavelet transform (DWT) are independent of the data and thus require no training.
  • Discrete wavelet transform (DWT): DWT is another linear technique used for face recognition. In the work of [ 70 ], the authors used a two-dimensional discrete wavelet transform (2D-DWT) for face recognition with a new patch strategy: a non-uniform patch strategy for the top-level low-frequency sub-band, built with an integral projection technique from the two top-level high-frequency sub-bands of the 2D-DWT of the average training image. This patch strategy better retains the integrity of local information and better reflects the structure of the face image. Once the patches of the testing and training samples are constructed, the decision is made with a nearest-neighbor classifier. Many databases are used to evaluate this method, including Labeled Faces in the Wild (LFW), Extended Yale B, Face Recognition Technology (FERET), and AR.
  • Discrete cosine transform (DCT) [ 71 ]: the DCT can be used for both global and local face recognition systems. DCT is a transformation that represents a finite sequence of data as the sum of a series of cosine functions oscillating at different frequencies. It is widely used in face recognition systems [ 71 ], as well as in audio and image compression and in spectral methods for the numerical solution of differential equations.

Owing to their limitations in handling non-linearity, subspace or holistic techniques cannot represent the exact details of the geometric varieties of face images. Linear techniques offer a faithful description of face images when the data structures are linear. However, when the face image data structures are non-linear, many works use a “kernel” function to construct a high-dimensional space in which the problem becomes linear. The required steps to implement the DCT technique are presented as Algorithm 1.

Algorithm 1. DCT Algorithm.
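As an illustration of the DCT feature-extraction idea, a short sketch follows; keeping an 8 × 8 block of low-frequency coefficients is an assumed, typical choice rather than the exact listing of Algorithm 1.

```python
# Sketch: 2D-DCT of a face image, keeping low-frequency coefficients.
import numpy as np
from scipy.fftpack import dct

def dct_features(gray: np.ndarray, keep: int = 8) -> np.ndarray:
    """Orthonormal 2D-DCT; the low-frequency block is the feature vector."""
    coeffs = dct(dct(gray.astype(float), axis=0, norm='ortho'),
                 axis=1, norm='ortho')
    return coeffs[:keep, :keep].ravel()
```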

4.2. Nonlinear Techniques

Algorithm 2. Kernel PCA Algorithm.

The performance of the KPCA technique depends on the choice of the kernel matrix K; the Gaussian (RBF) and polynomial kernels are typically used. KPCA has been successfully used for novelty detection [ 72 ] and for speech recognition [ 62 ].
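A hedged scikit-learn sketch of KPCA-based feature extraction with a Gaussian kernel, followed by a nearest-neighbour decision, is shown below; the number of components and the gamma value are illustrative assumptions.

```python
# Sketch: KPCA features with an RBF kernel, then a 1-NN decision.
from sklearn.decomposition import KernelPCA
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline

# train_matrix: (n_images, n_pixels) flattened faces; train_ids: labels.
model = make_pipeline(
    KernelPCA(n_components=100, kernel='rbf', gamma=1e-4),
    KNeighborsClassifier(n_neighbors=1),
)
# model.fit(train_matrix, train_ids)
# predicted = model.predict(test_matrix)
```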

  • Kernel linear discriminant analysis (KLDA) [ 73 ]: the KLDA technique is a kernel extension of the linear LDA technique, in the same way that KPCA is the kernel extension of PCA. Arashloo et al. [ 73 ] proposed a nonlinear binary class-specific kernel discriminant analysis classifier (CS-KDA) based on spectral regression kernel discriminant analysis. Other nonlinear techniques have also been used in the context of facial recognition:
  • Gabor-KLDA [ 74 ].
  • Evolutionary weighted principal component analysis (EWPCA) [ 75 ].
  • Kernelized maximum average margin criterion (KMAMC), SVM, and kernel Fisher discriminant analysis (KFD) [ 76 ].
  • Wavelet transform (WT), radon transform (RT), and cellular neural networks (CNN) [ 77 ].
  • Joint transform correlator-based two-layer neural network [ 78 ].
  • Kernel Fisher discriminant analysis (KFD) and KPCA [ 79 ].
  • Locally linear embedding (LLE) and LDA [ 80 ].
  • Nonlinear locality preserving with deep networks [ 81 ].
  • Nonlinear DCT and kernel discriminative common vector (KDCV) [ 82 ].

4.3. Summary of Holistic Approaches

Table 2 summarizes the subspace techniques discussed in this section, which are introduced to reduce the dimensionality and the complexity of the detection and recognition steps. Linear and non-linear techniques offer robust recognition under different lighting conditions and facial expressions. Although these techniques achieve a better reduction in dimensionality and improve the recognition rate, they are not invariant to translations and rotations, unlike local techniques.

Subspace approaches. ICA, independent component analysis; DWT, discrete wavelet transform; FFT, fast Fourier transform; DCT, discrete cosine transform.

| Author | Techniques Used | Databases | Matching | Limitation | Advantage | Result |
| --- | --- | --- | --- | --- | --- | --- |
| **Linear techniques** | | | | | | |
| Seo et al. [ ] | LARK and PCA | LFW | L2 distance | Detection accuracy | Reducing computational complexity | 85.10% |
| Annalakshmi et al. [ ] | ICA and LDA | LFW | Bayesian classifier | Sensitivity | Good accuracy | 88% |
| Annalakshmi et al. [ ] | PCA and LDA | LFW | Bayesian classifier | Sensitivity | Specificity | 59% |
| Hussain et al. [ 36 ] | LQP and Gabor | FERET; LFW | Cosine similarity | Lot of discriminative information | Robust to illumination variations | 99.2%; 75.3% |
| Gowda et al. [ ] | LPQ and LDA | MEPCO | SVM | Computation time | Good accuracy | 99.13% |
| Z. Cui et al. [ 67 ] | BoW | AR; ORL; FERET | ASM | Occlusions | Robust | 99.43%; 99.50%; 82.30% |
| Khan et al. [ ] | PSO and DWT | CK; MMI; JAFFE | Euclidean distance | Noise | Robust to illumination | 98.60%; 95.50%; 98.80% |
| Huang et al. [ ] | 2D-DWT | FERET; LFW | KNN | Pose | Frontal or near-frontal facial images | 90.63%; 97.10% |
| Perlibakas and Vytautas [ 69 ] | PCA and Gabor filter | FERET | Cosine metric | Precision | Pose | 87.77% |
| Hafez et al. [ ] | Gabor filter and LDA | ORL; C. YaleB | 2DNCC | Pose | Good recognition performance | 98.33%; 99.33% |
| Sufyanu et al. [ ] | DCT | ORL; Yale | NCC | High memory | Controlled and uncontrolled databases | 93.40% |
| Shanbhag et al. [ ] | DWT and BPSO | — | — | Rotation | Significant reduction in the number of features | 88.44% |
| Ghorbel et al. [ ] | Eigenfaces and DoG filter | FERET | Chi-square distance | Processing time | Reduce the representation | 84.26% |
| Zhang et al. [ ] | PCA and FFT | YALE | SVM | Complexity | Discrimination | 93.42% |
| Zhang et al. [ ] | PCA | YALE | SVM | Recognition rate | Reduce the dimensionality | 84.21% |
| Fan et al. [ ] | RKPCA | MNIST; ORL | RBF kernel | Complexity | Robust to sparse noises | — |
| Vinay et al. [ ] | ORB and KPCA | ORL | FLANN matching | Processing time | Robust | 87.30% |
| Vinay et al. [ ] | SURF and KPCA | ORL | FLANN matching | Processing time | Reduce the dimensionality | 80.34% |
| Vinay et al. [ ] | SIFT and KPCA | ORL | FLANN matching | Low recognition rate | Complexity | 69.20% |
| Lu et al. [ ] | KPCA and GDA | UMIST face | SVM | High error rate | Excellent performance | 48% |
| Yang et al. [ ] | PCA and MSR | HELEN face | ESR | Complexity | Utilizes color, gradient, and regional information | 98.00% |
| Yang et al. [ ] | LDA and MSR | FRGC | ESR | Low performances | Utilizes color, gradient, and regional information | 90.75% |
| Ouanan et al. [ ] | FDDL | AR | CNN | Occlusion | Orientations, expressions | 98.00% |
| Vankayalapati and Kyamakya [ ] | CNN | ORL | — | Poses | High recognition rate | 95% |
| Devi et al. [ ] | 2FNN | ORL | — | Complexity | Low error rate | 98.5% |

5. Hybrid Approach

5.1. Technique Presentation

The hybrid approaches combine local and subspace (holistic) features in order to exploit the benefits of both families of techniques, which has the potential to offer better performance for face recognition systems.

  • Gabor wavelet and linear discriminant analysis (GW-LDA) [ 91 ]: Fathima et al. [ 91 ] proposed a hybrid approach combining Gabor wavelets and linear discriminant analysis (HGWLDA) for face recognition. The grayscale face image is approximated and reduced in dimension, then convolved with a bank of Gabor filters of varying orientations and scales. A subspace technique, 2D-LDA, is then used to maximize the inter-class separation and reduce the intra-class spread. To classify and recognize the test face image, the k-nearest-neighbour (k-NN) classifier is used, comparing the test face features with each of the training set features. The experimental results show the robustness of this approach under different lighting conditions (a minimal sketch of this kind of pipeline follows this group of examples).
  • Over-complete LBP (OCLBP), LDA, and within class covariance normalization (WCCN): Barkan et al. [ 92 ] proposed a new representation of face image based over-complete LBP (OCLBP). This representation is a multi-scale modified version of the LBP technique. The LDA technique is performed to reduce the high dimensionality representations. Finally, the within class covariance normalization (WCCN) is the metric learning technique used for face recognition.
  • Advanced correlation filters and Walsh LBP (WLBP): Juefei et al. [ 93 ] implemented a single-sample periocular-based alignment-robust face recognition technique based on high-dimensional Walsh LBP (WLBP). This technique utilizes only one sample per subject class and generates new face images under a wide range of 3D rotations using the 3D generic elastic model, which is both accurate and computationally inexpensive. The LFW database is used for evaluation, and the proposed method outperformed the state-of-the-art algorithms under four evaluation protocols with a high accuracy of 89.69%.
  • Multi-sub-region-based correlation filter bank (MS-CFB): Yan et al. [ 94 ] propose an effective feature extraction technique for robust face recognition, named multi-sub-region-based correlation filter bank (MS-CFB). MS-CFB extracts the local features independently for each face sub-region. After that, the different face sub-regions are concatenated to give optimal overall correlation outputs. This technique reduces the complexity, achieves higher recognition rates, and provides a better feature representation for recognition compared with several state-of-the-art techniques on various public face databases.
  • SIFT features, Fisher vectors, and PCA: Simonyan et al. [ 64 ] have developed a novel method for face recognition based on the SIFT descriptor and Fisher vectors. The authors propose a discriminative dimensionality reduction owing to the high dimensionality of the Fisher vectors. After that, these vectors are projected into a low dimensional subspace with a linear projection. The objective of this methodology is to describe the image based on dense SIFT features and Fisher vectors encoding to achieve high performance on the challenging LFW dataset in both restricted and unrestricted settings.
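To make the hybrid idea concrete, the following Python sketch is in the spirit of the Gabor + LDA + k-NN pipeline of [ 91 ]; the filter-bank parameters and the response pooling are our assumptions, not those of the original work.

```python
# Sketch: Gabor filter-bank features -> LDA subspace -> 1-NN decision.
import cv2
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.neighbors import KNeighborsClassifier

def gabor_features(gray: np.ndarray) -> np.ndarray:
    feats = []
    for theta in np.arange(0, np.pi, np.pi / 4):   # 4 orientations
        for lambd in (4.0, 8.0):                   # 2 scales (assumed)
            kernel = cv2.getGaborKernel((21, 21), sigma=4.0, theta=theta,
                                        lambd=lambd, gamma=0.5)
            response = cv2.filter2D(gray, cv2.CV_32F, kernel)
            # Downsample each response map to keep the vector small.
            feats.append(cv2.resize(response, (16, 16)).ravel())
    return np.concatenate(feats)

# X = np.stack([gabor_features(img) for img in train_imgs])
# lda = LinearDiscriminantAnalysis().fit(X, train_ids)
# knn = KNeighborsClassifier(1).fit(lda.transform(X), train_ids)
```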

Figure 12. Flowchart of the proposed multimodal deep face representation (MM-DFR) technique [ 95 ]. CNN, convolutional neural network.

  • PCA and ANFIS: Sharma et al. [ 96 ] propose an efficient pose-invariant face recognition system based on PCA technique and ANFIS classifier. The PCA technique is employed to extract the features of an image, and the ANFIS classifier is developed for identification under a variety of pose conditions. The performance of the proposed system based on PCA–ANFIS is better than ICA–ANFIS and LDA–ANFIS for the face recognition task. The ORL database is used for evaluation.
  • DCT and PCA: Ojala et al. [ 97 ] developed a fast face recognition system based on DCT and PCA techniques. A genetic algorithm (GA) is used to select facial features, removing irrelevant ones and reducing their number. In addition, the DCT–PCA technique is used to extract features and reduce the dimensionality. The minimum Euclidean distance (ED) is used as the decision measure. Various face databases are used to demonstrate the effectiveness of this system.
  • PCA, SIFT, and iterative closest point (ICP): Mian et al. [ 98 ] present a multimodal (2D and 3D) face recognition system based on hybrid matching to achieve efficiency and robustness to facial expressions. The Hotelling transform is performed to automatically correct the pose of a 3D face using its texture. A novel 3D spherical face representation (SFR), in conjunction with the SIFT descriptor, is then used to form a rejection classifier, which provides efficient recognition in the case of large galleries by eliminating a large number of candidate faces. A modified iterative closest point (ICP) algorithm is used for the decision. This system is insensitive and robust to facial expressions, achieving a 98.6% verification rate and a 96.1% identification rate on the complete FRGC v2 database.
  • PCA, local Gabor binary pattern histogram sequence (LGBPHS), and Gabor wavelets: Cho et al. [ 99 ] proposed a computationally efficient hybrid face recognition system that employs both holistic and local features. The PCA technique is used to reduce the dimensionality. After that, the local Gabor binary pattern histogram sequence (LGBPHS) technique is employed for the recognition stage, reducing the complexity caused by the Gabor filters. The experimental results show a better recognition rate under illumination variations compared with the PCA and Gabor wavelet techniques. The Extended Yale Face Database B is used to demonstrate the effectiveness of this system.
  • PCA and Fisher linear discriminant (FLD) [ 100 , 101 ]: Sing et al. [ 101 ] propose a novel hybrid technique for face representation and recognition that exploits both local and subspace features. To extract the local features, the whole image is divided into sub-regions, while the global features are extracted directly from the whole image. After that, PCA and Fisher linear discriminant (FLD) techniques are applied to the fused feature vector to reduce the dimensionality. The CMU-PIE, FERET, and AR face databases are used for the evaluation.
  • SPCA–KNN [ 102 ]: Kamencay et al. [ 102 ] develop a new face recognition method based on SIFT features together with PCA and KNN techniques. The Hessian–Laplace detector, along with the SPCA descriptor, is used to extract the local features; SPCA is introduced to identify the human face, and the KNN classifier identifies the closest human faces from the trained features. The experimental results show a recognition rate of 92% on the unsegmented ESSEX database and 96% on the segmented database (700 training images).

Figure 13. The proposed CNN–LSTM–ELM architecture [ 103 ].

5.2. Summary of Hybrid Approaches

Table 3 summarizes the hybrid approaches presented in this section. Various techniques are introduced to improve the performance and accuracy of recognition systems. The combination of local and subspace approaches provides robust recognition and dimensionality reduction under different illumination conditions and facial expressions, and these techniques are invariant to translations and rotations, although they remain sensitive to noise.

Hybrid approaches. GW, Gabor wavelet; OCLBP, over-complete LBP; WCCN, within class covariance normalization; WLBP, Walsh LBP; ICP, iterative closest point; LGBPHS, local Gabor binary pattern histogram sequence; FLD, Fisher linear discriminant; SAE, stacked auto-encoder.

| Author | Technique Used | Database | Matching | Limitation | Advantage | Result |
| --- | --- | --- | --- | --- | --- | --- |
| Fathima et al. [ 91 ] | GW-LDA | AT&T; FACES94; MITINDIA | k-NN | High processing time | Illumination-invariant and reduces the dimensionality | 88%; 94.02%; 88.12% |
| Barkan et al. [ 92 ] | OCLBP, LDA, and WCCN | LFW | WCCN | — | Reduce the dimensionality | 87.85% |
| Juefei et al. [ 93 ] | ACF and WLBP | LFW | — | Complexity | Pose conditions | 89.69% |
| Simonyan et al. [ 64 ] | Fisher + SIFT | LFW | Mahalanobis matrix | Single feature type | Robust | 87.47% |
| Sharma et al. [ 96 ] | PCA–ANFIS; ICA–ANFIS; LDA–ANFIS | ORL | ANFIS | Sensitivity–specificity | Pose conditions | 96.66%; 71.30%; 68% |
| Ojala et al. [ 97 ] | DCT–PCA | ORL; UMIST; YALE | Euclidean distance | Complexity | Reduce the dimensionality | 92.62%; 99.40%; 95.50% |
| Mian et al. [ 98 ] | Hotelling transform, SIFT, and ICP | FRGC | ICP | Processing time | Facial expressions | 99.74% |
| Cho et al. [ 99 ] | PCA–LGBPHS; PCA–Gabor wavelets | Extended Yale Face | Bhattacharyya distance | Illumination condition | Complexity | 95% |
| Sing et al. [ 101 ] | PCA–FLD | CMU; FERET; AR | SVM | Robustness | Pose, illumination, and expression | 71.98%; 94.73%; 68.65% |
| Kamencay et al. [ 102 ] | SPCA–KNN | ESSEX | KNN | Processing time | Expression variation | 96.80% |
| Sun et al. [ 103 ] | CNN–LSTM–ELM | OPPORTUNITY | LSTM/ELM | High processing time | Automatically learns feature representations | 90.60% |
| Ding et al. [ ] | CNNs and SAE | LFW | — | Complexity | High recognition rate | 99% |

6. Assessment of Face Recognition Approaches

In the last step of recognition, the face extracted from the background during the face detection step is compared with known faces stored in a specific database. Several comparison techniques are used to make the decision; this section describes the most common ones.

6.1. Measures of Similarity or Distances

  • Peak-to-correlation energy (PCE) or peak-to-sidelobe ratio (PSR) [ 18 ]: The PCE was introduced in Equation (8); the PSR similarly quantifies the sharpness of the correlation peak relative to the statistics of the surrounding sidelobe region.

  • Euclidean distance: In general, the Euclidean distance between two points $P = (p_1, p_2, \ldots, p_n)$ and $Q = (q_1, q_2, \ldots, q_n)$ in n-dimensional space is defined by the following:

$$d(P, Q) = \sqrt{\sum_{i=1}^{n} \left( p_i - q_i \right)^2}.$$

  • Bhattacharyya distance [ 104 , 105 ]: The Bhattacharyya distance is a statistical measure that quantifies the similarity between two discrete or continuous probability distributions. This distance is particularly known for its low processing time and its low sensitivity to noise. For probability distributions $p$ and $q$ defined on the same domain, the Bhattacharyya distance is defined as follows:

$$D_B(p, q) = -\ln\left( BC(p, q) \right), \quad (17)$$

$$BC(p, q) = \sum_{x \in X} \sqrt{p(x)\, q(x)} \ \ \text{(a)}; \qquad BC(p, q) = \int \sqrt{p(x)\, q(x)}\, dx \ \ \text{(b)}, \quad (18)$$

where $BC$ is the Bhattacharyya coefficient, defined as Equation (18a) for discrete probability distributions and as Equation (18b) for continuous probability distributions. In both cases, $0 \le BC \le 1$ and $0 \le D_B \le \infty$. In its simplest formulation, the Bhattacharyya distance between two classes that follow normal distributions can be calculated from the means ($\mu$) and variances ($\sigma^2$):

$$D_B(p, q) = \frac{1}{4} \ln\left( \frac{1}{4} \left( \frac{\sigma_p^2}{\sigma_q^2} + \frac{\sigma_q^2}{\sigma_p^2} + 2 \right) \right) + \frac{1}{4} \frac{\left( \mu_p - \mu_q \right)^2}{\sigma_p^2 + \sigma_q^2}. \quad (19)$$
  • Chi-squared distance [ 106 ]: The chi-squared ($\chi^2$) distance weights differences by the magnitude of the samples, giving the same relevance to sample differences with few occurrences as to those with many occurrences. To compare two histograms $S_1 = (u_1, \ldots, u_m)$ and $S_2 = (w_1, \ldots, w_m)$, the chi-squared distance is defined as follows:

$$\chi^2 = D(S_1, S_2) = \frac{1}{2} \sum_{i=1}^{m} \frac{\left( u_i - w_i \right)^2}{u_i + w_i}. \quad (20)$$
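Minimal NumPy versions of the measures defined above might look as follows; the epsilon guards are numerical-stability assumptions, and the Bhattacharyya form assumes discrete, normalized histograms (Equation (18a)).

```python
# Sketch: similarity measures used to compare face signatures/histograms.
import numpy as np

def euclidean(p, q):
    return float(np.sqrt(np.sum((np.asarray(p) - np.asarray(q)) ** 2)))

def bhattacharyya(p, q, eps=1e-12):
    bc = np.sum(np.sqrt(np.asarray(p) * np.asarray(q)))  # Equation (18a)
    return float(-np.log(max(bc, eps)))                  # Equation (17)

def chi_square(s1, s2, eps=1e-12):
    s1, s2 = np.asarray(s1, float), np.asarray(s2, float)
    return float(0.5 * np.sum((s1 - s2) ** 2 / (s1 + s2 + eps)))  # Eq. (20)
```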

6.2. Classifiers

There are many face classification techniques in the literature that allow selecting, from a few examples, the group or class to which an object belongs. Some are based on statistics, such as the Bayesian classifier and correlation [ 18 ], and others on the regions that the different classes generate in the decision space, such as K-means [ 9 ], CNN [ 103 ], artificial neural networks (ANNs) [ 37 ], support vector machines (SVMs) [ 26 , 107 ], k-nearest neighbors (K-NN), and decision trees (DTs).

Figure 14. Optimal hyperplane, support vectors, and maximum margin.

There is an infinite number of hyperplanes capable of perfectly separating two classes, which implies selecting the hyperplane that maximizes the minimal distance between the learning examples and the hyperplane (i.e., the distance between the support vectors and the hyperplane). This distance is called the “margin”. The SVM classifier computes the optimal hyperplane that assigns a set of labeled training data to the correct class. Given training feature vectors $x_i$ and the corresponding labels $y_i \in \{+1, -1\}$, an SVM finds the hyperplane

$$w \cdot x + b = 0$$

that separates the samples with the smallest error. The classification function is obtained from the signed distance between the input vector and the hyperplane:

$$f(x) = \operatorname{sign}\left( w \cdot x + b \right),$$

where $w$ and $b$ are the parameters of the model. Shen et al. [ 108 ] used Gabor filters to extract face features and applied an SVM for classification. The FaceNet method achieves record accuracies of 99.63% and 95.12% on the LFW and YouTube Faces DB datasets, respectively.
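A hedged scikit-learn sketch of the maximum-margin classifier described above, applied to flattened face features, is given below; the data-loading variables are assumed placeholders.

```python
# Sketch: linear SVM over face feature vectors (multi-class handled
# internally by scikit-learn via one-vs-one decomposition).
from sklearn.svm import SVC

clf = SVC(kernel='linear')   # finds the maximum-margin hyperplane
# clf.fit(train_features, train_ids)
# Signed distances w.x + b, whose sign gives the predicted side:
# scores = clf.decision_function(test_features)
```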

  • k-nearest neighbor (k-NN) [ 17 , 91 ]: k-NN is a lazy algorithm: during training it simply stores the examples and builds no explicit model, unlike, for example, decision trees.
  • K-means [ 9 , 109 ]: It is called K-means because it represents each of the k groups by the average (or weighted average) of its points, called the centroid. In the K-means algorithm, the number of clusters k to be formed must be specified a priori to start the process.

An external file that holds a picture, illustration, etc.
Object name is sensors-20-00342-g015.jpg

Artificial neural network.

Various variants of neural networks have been developed in recent years, such as convolutional neural networks (CNNs) [ 14 , 110 ] and recurrent neural networks (RNNs) [ 111 ], which are very effective for image detection and recognition tasks. CNNs are a very successful deep model and are used today in many applications [ 112 ]. Structurally, CNNs are made up of three different types of layers: convolution layers, pooling layers, and fully-connected layers.

  • Convolutional layer : sometimes called the feature-extractor layer, because the image features are extracted within it. Convolution preserves the spatial relationship between pixels by learning image features over small squares of the input image. The input image is convolved with a set of learnable filters, producing a feature map (activation map) that is fed as input to the next convolutional layer. The convolutional layer also applies a rectified linear unit (ReLU) activation to convert all negative values to zero; this is computationally efficient, as few neurons are activated at a time.
  • Pooling layer : reduces the spatial size of the feature maps, in one of two common ways:
  • - Average-pooling takes all the elements of the sub-matrix, calculates their average, and stores the value in the output matrix.
  • - Max-pooling searches for the highest value found in the sub-matrix and saves it in the output matrix.
  • Fully-connected layer : in this layer, the neurons are fully connected to all the activations of the previous layer. It is used to classify images between different categories by training.
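A minimal PyTorch sketch of the three layer types just described follows; the input size, channel counts, and number of identities are illustrative assumptions.

```python
# Sketch: convolution + ReLU, max-pooling, and a fully-connected
# classifier, for grayscale 64x64 face crops.
import torch.nn as nn

class SmallFaceNet(nn.Module):
    def __init__(self, num_identities: int):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                  # 64x64 -> 32x32
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                  # 32x32 -> 16x16
        )
        self.classifier = nn.Linear(32 * 16 * 16, num_identities)

    def forward(self, x):
        x = self.features(x)
        return self.classifier(x.flatten(start_dim=1))
```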

Wen et al. [ 113 ] introduce a new supervision signal, called center loss, for the face recognition task, in order to improve the discriminative power of the deeply learned features; the proposed center loss function is trainable and easy to optimize in CNNs. Several important face recognition benchmarks are used for evaluation, including LFW, YTF, and the MegaFace Challenge. Passalis and Tefas [ 114 ] propose a supervised codebook learning method for the bag-of-features representation that learns face retrieval-oriented codebooks; this allows significantly smaller codebooks, improving both retrieval time and storage requirements. Liu et al. [ 115 ] and Amato et al. [ 116 ] propose deep face recognition techniques under the open-set protocol based on CNNs; a face dataset composed of 39,037 face images belonging to 42 different identities is used to perform the experiments. Taigman et al. [ 117 ] present a system (DeepFace) able to outperform existing systems with only very minimal adaptation; it is trained on a large dataset of faces acquired from a population vastly different from the one used to construct the evaluation benchmarks, and achieves an accuracy of 97.35% on LFW. Ma et al. [ 118 ] introduce a robust local binary pattern (LBP) guiding pooling (G-RLBP) mechanism that improves the recognition rates of CNN models by successfully lowering the impact of noise. Koo et al. [ 119 ] propose a multimodal human recognition method that uses both the face and the body and is based on a deep CNN. Cho et al. [ 120 ] propose a nighttime face detection method based on a CNN for visible-light images. Koshy and Mahmood [ 121 ] develop deep architectures for face liveness detection that combine texture analysis and a CNN to classify a captured image as real or fake. Elmahmudi and Ugail [ 122 ] study the performance of machine learning for face recognition using partial faces and other manipulations of the face, such as rotation and zooming, as training and recognition cues; the experimental results on face verification and identification show that the model obtained by the proposed DNN training framework achieves 97.3% accuracy on the LFW database with low training complexity. Seibold et al. [ 123 ] proposed a morphing attack detection method based on DNNs; a fully automatic face image morphing pipeline with exchangeable components was used to generate morphing attacks, train neural networks on these data, and analyze their accuracy. Yim et al. [ 124 ] propose a new deep architecture based on a novel type of multitask learning, which can rotate a face image of arbitrary pose and illumination to a target pose while preserving identity. Nguyen et al. [ 111 ] propose a new approach for detecting presentation-attack face images to enhance the security level of a face recognition system, using a very deep stacked CNN–RNN network to learn discriminative features from a sequence of face images. Finally, Bajrami et al. [ 125 ] present experimental results with LDA and DNN for face recognition, testing their efficiency and performance on the LFW dataset; the DNN method achieves better recognition accuracy, and its recognition time is much faster than that of the LDA method on large-scale datasets.

6.3. Databases Used

The most commonly used databases for face recognition systems under different conditions are the Pointing Head Pose Image Database (PHPID) [ 126 ], Labeled Faces in the Wild (LFW) [ 127 ], FERET [ 15 , 16 ], ORL, and Yale. These databases provide data for both supervised and unsupervised learning. Supervised learning is based on two training modules: the image-restricted and the image-unrestricted training settings. In the restricted setting, only “same” or “not same” binary labels are available in the training splits; in the unrestricted setting, the identities of the persons in each pair are also provided.

  • LFW (Labeled Faces in the Wild) database was created in October 2007. It contains 13,233 images of 5749 subjects, 1680 of whom have at least two images and the rest a single image. The face images were collected from the web, pre-processed, and localized by the Viola–Jones detector, with a resolution of 250 × 250 pixels. Most are in color, although some are in grayscale; they are stored in JPG format and organized in folders (a loading sketch is given after this list).
  • FERET (Face Recognition Technology) database was created in 15 sessions in a semi-controlled environment between August 1993 and July 1996. It contains 1564 sets of images, with a total of 14,126 images. The duplicate series belong to subjects already present in the individual-image series, generally captured one day apart. Some images of the same subject were taken years apart and can be used to study facial changes that appear over time. The images are 24-bit RGB color, with a resolution of 512 × 768 pixels.
  • AR face database was created by Aleix Martínez and Robert Benavente in the computer vision center (CVC) of the Autonomous University of Barcelona in June 1998. It contains more than 4000 images of 126 subjects, including 70 men and 56 women. They were taken at the CVC under a controlled environment. The images were taken frontally to the subjects, with different facial expressions and three different lighting conditions, as well as several accessories: scarves, glasses, or sunglasses. Two imaging sessions were performed with the same subjects, 14 days apart. These images are a resolution of 576 × 768 pixels and a depth of 24 bits, under the RGB RAW format.
  • ORL Database of Faces was collected between April 1992 and April 1994 at the AT&T laboratory in Cambridge. It consists of 10 images per subject for 40 subjects, 400 images in total. For some subjects, the images were taken at different times, with varying illumination and facial expressions (eyes open/closed, smiling/not smiling) and with or without glasses. The images were taken against a dark homogeneous background, frontally and upright, with some small rotation. They have a resolution of 92 × 112 pixels in grayscale.
  • Extended Yale Face B database contains 16,128 grayscale images of 640 × 480 pixels of 28 individuals under 9 poses and 64 different lighting conditions. It also includes a set of images cropped to the individuals' faces only.
  • Pointing Head Pose Image Database (PHPID) is one of the most widely used databases for face recognition under pose variation. It contains 2790 monocular face images of 15 persons with tilt angles from −90° to +90° and variations of pan. Every person has two series of 93 different poses (93 images each). The subjects vary in skin color and wear glasses or not.
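As a hedged example of working with one of these databases, LFW can be loaded through scikit-learn's built-in fetcher; the min_faces_per_person and resize values below are illustrative choices, not part of any benchmark protocol.

```python
# Sketch: fetching LFW with scikit-learn (downloads on first use).
from sklearn.datasets import fetch_lfw_people

lfw = fetch_lfw_people(min_faces_per_person=2, resize=0.5)
print(lfw.images.shape)        # (n_images, height, width), grayscale
print(len(lfw.target_names))   # number of distinct identities kept
```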

6.4. Comparison between Holistic, Local, and Hybrid Techniques

In this section, we present some advantages and disadvantages of the holistic, local, and hybrid approaches used to identify faces over the last 20 years. DL approaches can be considered statistical (holistic) methods, because their training procedure usually searches for statistical structure in the input patterns. Table 4 presents a brief summary of the three approaches.

General performance of face recognition approaches.

| Approaches | Databases Used | Challenges Handled |
| --- | --- | --- |
| Local approaches | TDF, CF1999, LFW, FERET, CMU-PIE, AR, Yale B, PHPID, YaleB Extended, FRGC2.0, Face94 | Various lighting conditions, facial expressions, and low resolution |
| Holistic approaches | LFW, FERET, MEPCO, AR, ORL, CK, MMI, JAFFE, C. Yale B, Yale, MNIST, UMIST face, HELEN face, FRGC | Poses, lighting conditions, scaling, facial expressions |
| Hybrid approaches | AT&T, FACES94, MITINDIA, LFW, ORL, UMIST, YALE, FRGC, Extended Yale, CMU, FERET, AR, ESSEX | — |

7. Discussion about Future Directions and Conclusions

7.1. Discussion

In the past decade, face recognition has become one of the most important biometric authentication methods. Many techniques have been used to develop face recognition systems based on facial information. Generally, the existing techniques can be classified into three approaches, depending on the type of desired features.

  • Local approaches: use features that describe the face only partially. For example, a system could extract local features such as the eyes, mouth, and nose; the feature values are then calculated from lines or points marked on the face image for the recognition step.
  • Holistic approaches: use features that globally describe the complete face as a model, including the background (although it is desirable that the background occupy the smallest possible area).
  • Hybrid approaches: combine local and holistic approaches.

In particular, recognition methods applied to static images produce good results under different lighting and expression conditions. However, in most cases, the face images can only be processed at a fixed size and scale. Many methods also require numerous training images, which limits their use in real-time systems, where response time is an important aspect.

The main purpose of techniques such as HOG, LBP, Gabor filters, BRIEF, SURF, and SIFT is to discover distinctive features; they can be divided into two families: (1) local appearance-based techniques, which extract local features from small regions of the face image (including HOG, LBP, Gabor filters, and correlation filters); and (2) key-points-based techniques, which first detect points of interest in the face image and then extract features localized at these points (including BRIEF, SURF, and SIFT). In the context of face recognition, local techniques treat only certain facial features, which makes them very sensitive to facial expressions and occlusions [ 4 , 14 , 37 , 50 , 51 , 52 , 53 ]. Their relative robustness is the main advantage of these feature-based local techniques. Additionally, they take into account the peculiarity of the face as a natural form, recognizing it with a reduced number of parameters, and they offer a high compaction capacity and a high comparison speed. The main disadvantages of these methods are the difficulty of automating the detection of facial features, and the fact that the implementer of such systems must make arbitrary decisions about which points are really important.

Unlike the local approaches, holistic approaches treat the whole face image and do not require extracting face regions or feature points (eyes, mouth, nose, and so on). Their main principle is to represent the face image as a matrix of pixels, often converted into feature vectors to facilitate processing, which are then embedded in a low-dimensional space. Subspace techniques are easy to implement, but sensitive to variations (facial expressions, illumination, and pose). Many subspace techniques have been proposed to represent faces, such as Eigenfaces, Fisherfaces, PCA, and LDA, and they can be divided into two categories: linear and non-linear. The main advantage of holistic approaches is that they do not destroy image information by focusing exclusively on regions or points of interest. However, this property is also a disadvantage, because it assumes that all pixels of the image have the same importance. As a result, these techniques are not only computationally expensive, but also require a high degree of correlation between the test and training images. In addition, they generally ignore local details, which is why they are rarely used on their own to identify faces.

Hybrid approaches rely on both local and global features to exploit the benefits of each, combining the two approaches described above into a single system to improve recognition performance and accuracy. The choice of method must take into account the target application. For example, in face recognition systems that use very small images, methods based on local features are a bad choice. Another consideration in the algorithm selection process is the number of training examples needed. Finally, the trend is toward hybrid methods that combine the advantages of local and holistic approaches, but these methods are very complex and require more processing time.

A notable limitation that we found across the publications reviewed is methodological: although 2D facial recognition has reached a significant level of maturity and a high success rate, it continues to be one of the most active research areas in computer vision. Considering the results published to date, in the opinion of these authors, three particularly promising directions for the further development of this area stand out: (i) 3D face recognition methods; (ii) multimodal fusion of complementary data types, in particular those based on visible and infrared images; and (iii) DL methods.

  • Three-dimensional face recognition: In 2D image-based techniques, some information is lost owing to the 3D structure of the face. Lighting and pose variations are two major unresolved problems of 2D face recognition. Recently, 3D face recognition has been widely studied by the scientific community to overcome these unresolved problems and to achieve significantly higher accuracy by measuring the geometry of rigid features on the face. For this reason, several recent systems based on 3D data have been developed [ 3 , 93 , 95 , 128 , 129 ].
  • Multimodal facial recognition: sensors developed in recent years can acquire not only two-dimensional texture information, but also facial shape, that is, three-dimensional information. For this reason, some recent studies have merged the 2D and 3D modalities to take advantage of each and obtain hybrid systems that recognize better than either modality alone [ 98 ].
  • Deep learning (DL): a very broad concept with no exact definition, but studies [ 14 , 110 , 111 , 112 , 113 , 121 , 130 , 131 ] agree that DL includes a set of algorithms that attempt to model high-level abstractions through multiple processing layers. This field of research began in the 1980s and is a branch of machine learning in which deep neural networks (DNNs) are trained to achieve greater accuracy than other classical techniques. Recent progress has reached the point where DL performs better than humans in some tasks, for example, recognizing objects in images.

Finally, researchers have gone further by using multimodal and DL facial recognition systems.

7.2. Conclusions

Face recognition is a popular research topic in the field of image processing and computer vision, owing to its potentially enormous range of applications as well as its theoretical value. It is widely deployed in many real-world applications such as security, surveillance, homeland security, access control, image search, human–machine interaction, and entertainment. However, these applications pose different challenges, such as lighting conditions and facial expressions. This paper surveyed recent research on 2D and 3D face recognition systems, focusing mainly on approaches based on local, holistic (subspace), and hybrid features. A comparative study between these approaches in terms of processing time, complexity, discrimination, and robustness was carried out. We conclude that local feature techniques are the best choice concerning discrimination, rotation, translation, complexity, and accuracy. We hope that this survey will further encourage researchers in this field to pay more attention to the use of local techniques for face recognition systems.

Author Contributions

Y.K. highlights the recent research on the 2D or 3D face recognition system, focusing mainly on approaches based on local, holistic, and hybrid features. M.J., A.A.F. and M.A. supervised the research and helped in the revision processes. All authors have read and agreed to the published version of the manuscript.

The paper is co-financed by L@bISEN of ISEN Yncrea Ouest Brest, France, Dept Ai-DE, Team Vision-AD and by FSM University of Monastir, Tunisia with collaboration of the Ministry of Higher Education and Scientific Research of Tunisia. The context of the paper is the PhD project of Yassin Kortli.

Conflicts of Interest

The authors declare no conflict of interest.


Face Recognition Systems: A Survey


1. Introduction

The main contributions of this survey are summarized as follows:

  • We first introduced face recognition as a biometric technique.
  • We presented the state of the art of the existing face recognition techniques classified into three approaches: local, holistic, and hybrid.
  • The surveyed approaches were summarized and compared under different conditions.
  • We presented the most popular face databases used to test these approaches.
  • We highlighted some new promising research directions.

2. Face Recognition Systems Survey

2.1. Essential Steps of Face Recognition Systems

  • Face Detection : The face recognition system begins with the localization of human faces in a given image. The purpose of this step is to determine whether the input image contains human faces. Variations in illumination and facial expression can prevent proper face detection. To facilitate the design of the subsequent stages and make the system more robust, pre-processing steps are performed. Many techniques are used to detect and locate a human face in an image, for example, the Viola–Jones detector [ 24 , 25 ], the histogram of oriented gradients (HOG) [ 13 , 26 ], and principal component analysis (PCA) [ 27 , 28 ]. The face detection step can also be used for video and image classification, object detection [ 29 ], region-of-interest detection [ 30 ], and so on. A minimal detection sketch is given after this list.
  • Feature Extraction : The main function of this step is to extract the features of the face images detected in the detection step. This step represents a face with a feature vector, called a “signature”, that describes prominent features of the face image such as the mouth, nose, and eyes, together with their geometric distribution [ 31 , 32 ]. Each face is characterized by its structure, size, and shape, which allow it to be identified. Several techniques extract the shape of the mouth, eyes, or nose and identify the face using sizes and distances [ 3 ]. HOG [ 33 ], Eigenfaces [ 34 ], independent component analysis (ICA), linear discriminant analysis (LDA) [ 27 , 35 ], the scale-invariant feature transform (SIFT) [ 23 ], Gabor filters, local phase quantization (LPQ) [ 36 ], Haar wavelets, Fourier transforms [ 31 ], and the local binary pattern (LBP) [ 3 , 10 ] are widely used to extract face features.
  • Face Recognition : This step compares the features extracted in the feature extraction step with known faces stored in a specific database. There are two general applications of face recognition: identification and verification. During identification, a test face is compared with a set of faces, aiming to find the most likely match. During verification, a test face is compared with a known face in the database in order to make an acceptance or rejection decision [ 7 , 19 ]. Correlation filters (CFs) [ 18 , 37 , 38 ], convolutional neural networks (CNN) [ 39 ], and k-nearest neighbors (k-NN) [ 40 ] are known to effectively address this task.
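As a concrete illustration of the detection step, the following minimal sketch runs OpenCV's pre-trained Viola–Jones cascade on an input image; the file name and detector parameters are illustrative assumptions, not values from the surveyed papers.

```python
import cv2

# Pre-trained Viola-Jones frontal-face cascade shipped with OpenCV.
cascade_path = cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
detector = cv2.CascadeClassifier(cascade_path)

image = cv2.imread("input.jpg")                  # hypothetical input image
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)   # the cascade works on grayscale

# scaleFactor and minNeighbors trade off recall against false positives.
faces = detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)

for (x, y, w, h) in faces:                       # draw one box per detected face
    cv2.rectangle(image, (x, y), (x + w, y + h), (0, 255, 0), 2)
cv2.imwrite("detected.jpg", image)
```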

2.2. Classification of Face Recognition Systems

3. Local Approaches

3.1. Local Appearance-Based Techniques

  • Local binary pattern (LBP) and its variants: LBP is a general texture descriptor used to extract features from any object [ 16 ]. It has been widely used in many applications such as face recognition [ 3 ], facial expression recognition, texture segmentation, and texture classification. The LBP technique first divides the facial image into spatial arrays. Next, within each array square, a 3 × 3 pixel matrix ( p 1 … p 8 around a center p 0 ) is mapped across the square. Each pixel of this matrix is thresholded with the value of the center pixel (i.e., the intensity i ( p 0 ) of the center pixel serves as the reference for thresholding) to produce a binary code: if a neighbor pixel’s value is lower than the center pixel value, it is given a zero; otherwise, it is given a one. The binary code contains information about the local texture. Finally, for each array square, a histogram of these codes is built, and the histograms are concatenated to form the feature vector. For a 3 × 3 neighborhood, the LBP code is defined as LBP = Σ_{p=1..8} 2^(p−1) s(i_p − i_0), with s(x) = 1 if x ≥ 0 and s(x) = 0 if x < 0, (1) where i_0 and i_p are the intensity values of the center pixel and the neighborhood pixels, respectively. Figure 3 illustrates the procedure of the LBP technique (a minimal implementation is sketched after this list). Khoi et al. [ 20 ] proposed a fast face recognition system based on LBP, the pyramid of local binary pattern (PLBP), and the rotation-invariant local binary pattern (RI-LBP). Xi et al. [ 15 ] introduced a new unsupervised deep-learning-based technique, called the local binary pattern network (LBPNet), to extract hierarchical representations of data. LBPNet maintains the same topology as a convolutional neural network (CNN). Experimental results on public benchmarks (LFW and FERET) have shown that LBPNet is comparable to other unsupervised techniques. Laure et al. [ 40 ] implemented a method that helps to solve face recognition issues under large variations of parameters such as expression, illumination, and pose; this method is based on two techniques, LBP and k-NN. Owing to its invariance to rotation of the target image, LBP has become one of the most important techniques for face recognition. Bonnen et al. [ 42 ] proposed a variant of LBP named the “multiscale local binary pattern (MLBP)” for feature extraction. Another LBP extension is the local ternary pattern (LTP) [ 43 ], which is less sensitive to noise than the original LBP; it quantizes the differences between neighboring pixels and the central pixel into three levels instead of two. Hussain et al. [ 36 ] developed the local quantized pattern (LQP) technique for face representation. LQP is a generalization of local pattern features and is intrinsically robust to illumination conditions. LQP features use a disk layout to sample pixels from the local neighborhood and obtain a pair of binary codes using ternary split coding; these codes are quantized, each one using a separately learned codebook.
  • Histogram of oriented gradients (HOG) [ 44 ]: HOG is one of the best descriptors for shape and edge description. It describes the face shape using the distribution of edge directions or light intensity gradients. The technique divides the whole face image into cells (small regions), generates a histogram of edge directions or gradient orientations for each cell, and finally combines the histograms of all cells to extract the feature vector of the face image. The feature vector computation by the HOG descriptor proceeds as follows [ 10 , 13 , 26 , 45 ]: firstly, divide the image into cells and calculate the first-order gradients of each cell in both the horizontal and vertical directions; the most common method is to apply the 1D mask [−1 0 1]: G_x(x, y) = I(x + 1, y) − I(x − 1, y), (2) G_y(x, y) = I(x, y + 1) − I(x, y − 1), (3) where I(x, y) is the pixel value at point (x, y), and G_x(x, y) and G_y(x, y) denote the horizontal and vertical gradient amplitudes, respectively. The magnitude and orientation of the gradient at each pixel (x, y) are then computed as G(x, y) = √(G_x(x, y)² + G_y(x, y)²), (4) θ(x, y) = tan⁻¹(G_y(x, y) / G_x(x, y)). (5) The gradient magnitude and orientation of each pixel in a cell are voted into nine bins with tri-linear interpolation, and the histograms of all cells are concatenated to form the feature vector of the face image (see the sketch after this list). Karaaba et al. [ 44 ] proposed a combination of different histograms of oriented gradients, named “multi-HOG”, to build a robust face recognition system; the authors create a vector of distances between the target and reference face images for identification. Arigbabu et al. [ 46 ] proposed a novel face recognition system based on the Laplacian filter and the pyramid histogram of gradients (PHOG) descriptor; in addition, a support vector machine (SVM) with different kernel functions is used to investigate the face recognition problem.
  • Correlation filters: Face recognition systems based on correlation filters (CFs) have given good results in terms of robustness, location accuracy, efficiency, and discrimination. In the field of facial recognition, correlation techniques have attracted great interest since the first use of an optical correlator [ 47 ]. These techniques offer high discrimination ability, robustness to noise, shift-invariance, and inherent parallelism. On the basis of these advantages, many optoelectronic hybrid CF solutions have been introduced, such as the joint transform correlator (JTC) [ 48 ] and the VanderLugt correlator (VLC) [ 47 ]. The purpose of these techniques is to calculate the degree of similarity between a target image and a reference image; the decision is taken by detecting a correlation peak. Both VLC and JTC are based on the “4f” optical configuration [ 37 ], built from two convergent lenses ( Figure 4 ). The face image F is transformed by a fast Fourier transform (FFT), implemented by the first lens, into the Fourier plane S_F. In this Fourier plane, a specific filter P is applied (for example, the phase-only filter (POF) [ 2 ]) using optoelectronic interfaces. Finally, the inverse FFT (IFFT), implemented by the second lens, produces the filtered face image F′ (the correlation plane) in the output plane. The VLC technique, for example, uses two cascaded Fourier-transform stages realized by two lenses [ 4 ], as presented in Figure 5 . It proceeds as follows: firstly, a 2D FFT is applied to the target image to obtain the target spectrum S; this spectrum is then multiplied in the Fourier plane by a filter obtained from the 2D FFT of a reference image; finally, an inverse FFT of this product yields the correlation result, recorded in the correlation plane. The correlation result, described by the peak intensity, determines the degree of similarity between the target and reference images: C = FFT⁻¹{S ∘ H_POF}, (6) where FFT⁻¹ stands for the inverse FFT operation and ∘ denotes element-wise array multiplication. To enhance the matching process, Horner and Gianino [ 49 ] proposed the phase-only filter (POF), which produces sharp correlation peaks with enhanced discrimination capability. The POF is defined as H_POF(u, v) = S*(u, v) / |S(u, v)|, (7) where S*(u, v) is the complex conjugate of the 2D FFT of the reference image. To evaluate the decision, the peak-to-correlation energy (PCE) is defined as the energy of the correlation peak normalized by the overall energy of the correlation plane: PCE = Σ_{i,j∈N} E_peak(i, j) / Σ_{i,j∈M} E_correlation-plane(i, j), (8) where i, j are the coefficient coordinates, M and N are the sizes of the correlation plane and of the peak correlation spot, respectively, E_peak is the energy of the correlation peak, and E_correlation-plane is the overall energy of the correlation plane. Correlation techniques are widely applied in recognition and identification applications [ 4 , 37 , 50 , 51 , 52 , 53 ].
For example, in the work of [ 4 ], the authors presented the efficiency of the VLC technique based on the “4f” configuration for identification using an Nvidia GeForce 8400 GS GPU, with the POF filter used for the decision. Another important work in this area is presented by Leonard et al. [ 50 ], who demonstrated the good performance and simplicity of correlation filters for face recognition; many specific filters, such as POF, BPOF, Ad, and IF, are compared in order to select the best filter based on its sensitivity to rotation, scale, and noise. Napoléon et al. [ 3 ] introduced a novel identification and verification system based on optimized 3D modeling under different illumination conditions, which allows faces to be reconstructed in different poses. In particular, an active shape model detecting a set of key points on the face is proposed to deform the synthetic model ( Figure 6 ). The VanderLugt correlator performs the identification, and the LBP descriptor is used to optimize the performance of the correlation technique under different illumination conditions. The experiments were performed on the Pointing Head Pose Image Database (PHPID) with an elevation ranging from −30° to +30°.
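To make the appearance-based descriptors above concrete, the following minimal sketch extracts a block-wise LBP histogram signature and a HOG vector with scikit-image; the 8 × 8 grid, uniform-LBP setting, and file name are illustrative assumptions rather than the exact configurations of the surveyed papers.

```python
import numpy as np
from skimage import io, color, img_as_ubyte
from skimage.feature import local_binary_pattern, hog

def lbp_signature(gray, grid=(8, 8), P=8, R=1):
    """Concatenate per-cell LBP histograms into one feature vector (Eq. (1))."""
    codes = local_binary_pattern(gray, P, R, method="uniform")
    n_bins = P + 2                       # uniform patterns + one non-uniform bin
    gh, gw = grid
    ch, cw = gray.shape[0] // gh, gray.shape[1] // gw
    hists = []
    for i in range(gh):
        for j in range(gw):
            cell = codes[i * ch:(i + 1) * ch, j * cw:(j + 1) * cw]
            h, _ = np.histogram(cell, bins=n_bins, range=(0, n_bins))
            hists.append(h / (h.sum() + 1e-7))           # normalized histogram
    return np.concatenate(hists)

gray = img_as_ubyte(color.rgb2gray(io.imread("face.jpg")))  # hypothetical crop
lbp_vec = lbp_signature(gray)
hog_vec = hog(gray, orientations=9, pixels_per_cell=(8, 8),
              cells_per_block=(2, 2))    # gradient voting of Eqs. (2)-(5)
print(lbp_vec.shape, hog_vec.shape)
```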

3.2. Key-Points-Based Techniques

  • Scale-invariant feature transform (SIFT) [ 56 , 57 ]: SIFT is an algorithm used to detect and describe the local features of an image. It is widely used to relate two images through their local descriptors, which contain the information needed to match them. The main idea of SIFT is to convert the image into a representation composed of points of interest that carry the characteristic information of the face image. SIFT is invariant to scale and rotation and fast enough for near-real-time applications, but one of its disadvantages is the matching time of the key points (a matching sketch is given after this list). The algorithm has four steps: (1) detection of extrema in scale-space, (2) localization of the key points, (3) orientation assignment, and (4) computation of the key-point descriptor. A framework for detecting key points based on the SIFT descriptor was proposed by Lenc et al. [ 56 ], who combine SIFT with a Kepenekci approach for face recognition.
  • Speeded-up robust features (SURF) [ 29 , 57 ]: the SURF technique is inspired by SIFT but uses wavelets and an approximation of the Hessian determinant to achieve better performance [ 29 ]. SURF is a detector and descriptor that achieves the same, or even better, results than SIFT in terms of repeatability, distinctiveness, and robustness. Its main advantage is an execution time lower than that of SIFT; on the other hand, the SIFT descriptor remains better adapted to describing faces affected by illumination changes, scaling, translation, and rotation [ 57 ]. To detect feature points, SURF seeks the maxima of an approximation of the Hessian matrix, using integral images to dramatically reduce the processing time. Figure 7 shows an example of the SURF descriptor for face recognition on the AR face dataset [ 58 ].
  • Binary robust independent elementary features (BRIEF) [ 30 , 57 ]: BRIEF is a binary descriptor that is simple and fast to compute. It is based on differences between pixel intensities, similar in evaluation to other binary descriptors such as binary robust invariant scalable keypoints (BRISK) and the fast retina keypoint (FREAK). To reduce noise, the BRIEF descriptor first smooths the image patches; the differences between pixel intensities are then used to build the descriptor. BRIEF has achieved strong performance and accuracy in pattern recognition.
  • Fast retina keypoint (FREAK) [ 57 , 59 ]: the FREAK descriptor proposed by Alahi et al. [ 59 ] uses a circular retinal sampling grid with 43 sampling points based on retinal receptive fields, shown in Figure 8 . The receptive fields are sampled with sizes that decrease toward the patch center, and each sampling point is smoothed with a Gaussian function. From the roughly one thousand potential sampling-point pairs, the binary descriptor is formed by thresholding the sign of the intensity difference of each selected pair.
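A minimal key-point matching sketch with OpenCV follows; the image names and the 0.75 ratio-test threshold are illustrative assumptions, and the final score is just one simple way to turn match counts into a similarity value.

```python
import cv2

img1 = cv2.imread("face_a.jpg", cv2.IMREAD_GRAYSCALE)  # hypothetical face pair
img2 = cv2.imread("face_b.jpg", cv2.IMREAD_GRAYSCALE)

sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(img1, None)          # key points + descriptors
kp2, des2 = sift.detectAndCompute(img2, None)

# Brute-force matching with Lowe's ratio test to drop ambiguous matches.
matcher = cv2.BFMatcher(cv2.NORM_L2)
good = []
for pair in matcher.knnMatch(des1, des2, k=2):
    if len(pair) == 2 and pair[0].distance < 0.75 * pair[1].distance:
        good.append(pair[0])

# Illustrative similarity score: fraction of key points that matched.
score = len(good) / max(1, min(len(kp1), len(kp2)))
print(f"{len(good)} good matches, similarity = {score:.2f}")
```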

3.3. Summary of Local Approaches

4. Holistic Approach

4.1. Linear Techniques

  • Eigenfaces [ 34 ] and principal component analysis (PCA) [ 27 , 62 ]: Eigenfaces is one of the most popular holistic methods used to extract the feature points of a face image. It is based on principal component analysis (PCA): the principal components produced by PCA are used as Eigenfaces, or face templates. PCA transforms a number of possibly correlated variables into a smaller number of uncorrelated variables called “principal components”. Its purpose is to reduce the large dimensionality of the data space (observed variables) to the smaller intrinsic dimensionality of the feature space (independent variables) needed to describe the data economically. Figure 9 shows how a face can be represented by a small number of features. PCA computes the eigenvectors of the covariance matrix and projects the original data onto the lower-dimensional feature space defined by the eigenvectors with large eigenvalues. When applied to face representation and recognition, the computed eigenvectors are referred to as Eigenfaces (as shown in Figure 10 ). An image may be considered as a vector of dimension M × N, so that a typical 4 × 4 image becomes a vector of dimension 16. Let the training set of images be {X_1, X_2, X_3, …, X_N}. The average face of the set is defined by X̄ = (1/N) Σ_{i=1..N} X_i. (9) The covariance matrix, representing the scatter of all feature vectors around the average vector, is Q = (1/N) Σ_{i=1..N} (X_i − X̄)(X_i − X̄)ᵀ. (10) The eigenvectors and corresponding eigenvalues are computed from Q V = λ V, (V ∈ Rⁿ, V ≠ 0), (11) where V is an eigenvector of the matrix Q associated with the eigenvalue λ. Each training image is then projected onto the corresponding eigen-subspace: y_i = Wᵀ (X_i − X̄), (i = 1, 2, 3, …, N), (12) where the columns of W are the leading eigenvectors and the y_i are the projections, called the principal components (a NumPy sketch of Eigenfaces is given at the end of this subsection). The face images are represented as linear combinations of these principal components. PCA and LDA are two different algorithms used to extract facial features; wavelet fusion and neural networks have been applied to classify the resulting features, with the ORL database used for evaluation. Figure 10 shows the first five Eigenfaces constructed from the ORL database [ 63 ].
  • Fisherfaces and linear discriminant analysis (LDA) [ 64 , 65 ]: The Fisherface method is based on the same principle as the Eigenfaces method, but reduces the high-dimensional image space using the linear discriminant analysis (LDA) technique instead of PCA. LDA is commonly used for dimensionality reduction and face recognition [ 66 ]. PCA is an unsupervised technique, while LDA is a supervised learning technique that exploits the class information of the data. For all samples of all classes, the between-class scatter matrix S_B and the within-class scatter matrix S_W are defined as S_B = Σ_{i=1..c} M_i (μ_i − μ)(μ_i − μ)ᵀ, (13) S_W = Σ_{i=1..c} Σ_{x_k ∈ X_i} (x_k − μ_i)(x_k − μ_i)ᵀ, (14) where μ is the overall mean vector, μ_i is the mean vector of the samples belonging to class i, X_i represents the set of samples of class i, c is the number of distinct classes, and M_i is the number of training samples in class i. S_B describes the scatter of class means around the overall mean, and S_W describes the scatter of features around the mean of each face class. The goal is to maximize the ratio det|S_B| / det|S_W|, in other words, to minimize S_W while maximizing S_B. Figure 11 shows the first five Eigenfaces and Fisherfaces obtained from the ORL database [ 63 ].
  • Independent component analysis (ICA) [ 35 ]: The ICA technique computes basis vectors for a given space by seeking a linear transformation that minimizes the statistical dependence between them; unlike PCA, these basis vectors need not be orthogonal. Because ICA decomposes the acquired images into statistically independent variables rather than merely uncorrelated ones, it can achieve greater efficiency in representing images coming from different sources.
  • Improvements of the PCA, LDA, and ICA techniques: Many studies have been developed to improve the linear subspace techniques. Z. Cui et al. [ 67 ] proposed a new spatial face region descriptor (SFRD) method to extract face regions while dealing with noise variation. Each face image is divided into several spatial regions, and token-frequency (TF) features are extracted from each region by sum-pooling the reconstruction coefficients over the patches within it; finally, the SFRD is obtained by applying a variant of PCA called “whitened principal component analysis (WPCA)” to reduce the feature dimension and remove the noise in the leading eigenvectors. In addition, the authors in [ 68 ] proposed a variant of LDA called probabilistic linear discriminant analysis (PLDA), which seeks the directions in space with maximum discriminability and is hence well suited to both frontal face recognition and face recognition under varying pose.
  • Gabor filters: Gabor filters are spatial sinusoids localized by a Gaussian window, which extract features from images at selected frequencies, orientations, and scales. To enhance face recognition performance in unconstrained environments, in the work of [ 69 ] the Gabor filters are transformed according to the shape and pose to extract the feature vectors of the face image, combined with PCA. PCA is applied to the Gabor features to remove redundancies and obtain the best description of the face images; finally, the cosine metric is used to evaluate the similarity.
  • Frequency domain analysis [ 70 , 71 ]: Analysis techniques in the frequency domain offer a representation of the human face in terms of low-frequency components that carry high energy. The discrete Fourier transform (DFT), discrete cosine transform (DCT), and discrete wavelet transform (DWT) techniques are independent of the data and thus do not require training.
  • Discrete wavelet transform (DWT): The DWT is another linear technique used for face recognition. In the work of [ 70 ], the authors used a two-dimensional discrete wavelet transform (2D-DWT) for face recognition with a new patch strategy: a non-uniform patch strategy for the low-frequency sub-band of the top level, built with an integral projection technique from the two top-level high-frequency sub-bands of the 2D-DWT of the average image of all training samples. This patch strategy better preserves the integrity of local information and is more suitable for reflecting the structural features of the face image. Once the patches are constructed from the testing and training samples, the decision is made with a nearest-neighbor classifier. Several databases were used to evaluate this method, including Labeled Faces in the Wild (LFW), Extended Yale B, Face Recognition Technology (FERET), and AR.
  • Discrete cosine transform (DCT) [ 71 ]: the DCT can be used for both global and local face recognition systems. It represents a finite sequence of data as a sum of cosine functions oscillating at different frequencies, and is widely used in face recognition systems [ 71 ] as well as in audio and image compression and in spectral methods for the numerical solution of differential equations. The required steps of the DCT technique are presented below.
DCT Algorithm: for an M × N image f(x, y), the 2D DCT coefficients are C(u, v) = α(u) α(v) Σ_{x=0..M−1} Σ_{y=0..N−1} f(x, y) cos[(2x + 1)uπ / 2M] cos[(2y + 1)vπ / 2N], where α(u) = √(1/M) for u = 0 and α(u) = √(2/M) for u > 0, and α(v) is defined analogously with N. The low-frequency coefficients (top-left block) are retained as the feature vector.
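The following minimal NumPy sketch implements Eigenfaces via an SVD of the centered data, which yields the eigenvectors of the covariance matrix of Eq. (10) without forming it explicitly; the random arrays, image size, and number of components are illustrative stand-ins for real training faces.

```python
import numpy as np

def fit_eigenfaces(X, k):
    """X: (N, D) matrix of flattened training faces; keep k leading components."""
    mean = X.mean(axis=0)                    # average face, Eq. (9)
    Xc = X - mean                            # centered data
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    W = Vt[:k].T                             # (D, k) leading eigenfaces, Eq. (11)
    return mean, W

def project(x, mean, W):
    return W.T @ (x - mean)                  # principal components, Eq. (12)

rng = np.random.default_rng(0)
X_train = rng.random((40, 92 * 112))         # stand-ins for ORL-sized images
mean, W = fit_eigenfaces(X_train, k=20)

gallery = np.array([project(x, mean, W) for x in X_train])
query = project(X_train[0], mean, W)
nearest = int(np.argmin(np.linalg.norm(gallery - query, axis=1)))
print("closest training face:", nearest)     # trivially 0 for this query
```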

4.2. Nonlinear Techniques

Kernel PCA Algorithm: (1) compute the kernel matrix K, with K_ij = k(x_i, x_j), using a chosen kernel function k (e.g., Gaussian or polynomial); (2) center K in feature space; (3) solve the eigenproblem K α = N λ α and normalize each eigenvector α so that λ (α · α) = 1; (4) project a test point x onto the l-th nonlinear component as y_l = Σ_i α_i^(l) k(x_i, x). A scikit-learn sketch is given below.
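A minimal sketch of kernel PCA with scikit-learn, assuming an RBF kernel and an illustrative gamma; the random matrix stands in for flattened face images.

```python
import numpy as np
from sklearn.decomposition import KernelPCA

rng = np.random.default_rng(0)
X = rng.random((40, 92 * 112))       # stand-ins for flattened face images

# The RBF kernel implicitly maps faces into a feature space where nonlinear
# structure (pose, illumination) may become linearly separable.
kpca = KernelPCA(n_components=20, kernel="rbf", gamma=1e-4)
Z = kpca.fit_transform(X)            # nonlinear principal components
print(Z.shape)                       # (40, 20)
```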
  • Kernel linear discriminant analysis (KLDA) [ 73 ]: the KLDA technique is a kernel extension of the linear LDA technique, in the same way that kernel PCA extends PCA. Arashloo et al. [ 73 ] proposed a nonlinear binary class-specific kernel discriminant analysis classifier (CS-KDA) based on spectral regression kernel discriminant analysis. Other nonlinear techniques have also been used in the context of facial recognition:
  • Gabor-KLDA [ 74 ].
  • Evolutionary weighted principal component analysis (EWPCA) [ 75 ].
  • Kernelized maximum average margin criterion (KMAMC), SVM, and kernel Fisher discriminant analysis (KFD) [ 76 ].
  • Wavelet transform (WT), radon transform (RT), and cellular neural networks (CNN) [ 77 ].
  • Joint transform correlator-based two-layer neural network [ 78 ].
  • Kernel Fisher discriminant analysis (KFD) and KPCA [ 79 ].
  • Locally linear embedding (LLE) and LDA [ 80 ].
  • Nonlinear locality preserving with deep networks [ 81 ].
  • Nonlinear DCT and kernel discriminative common vector (KDCV) [ 82 ].

4.3. Summary of Holistic Approaches

5. Hybrid Approach

5.1. Technique Presentation

  • Gabor wavelets and linear discriminant analysis (GW-LDA) [ 91 ]: Fathima et al. [ 91 ] proposed a hybrid approach combining Gabor wavelets and linear discriminant analysis (HGWLDA) for face recognition. The grayscale face image is approximated and reduced in dimension, and convolved with a bank of Gabor filters of varying orientations and scales. A subspace technique, 2D-LDA, is then used to maximize the inter-class separation and reduce the intra-class spread. To classify and recognize a test face image, the k-nearest neighbour (k-NN) classifier is used, comparing the test image features with those of each training image. The experimental results show the robustness of this approach under different lighting conditions (a simplified Gabor + LDA + k-NN pipeline is sketched after this list).
  • Over-complete LBP (OCLBP), LDA, and within-class covariance normalization (WCCN): Barkan et al. [ 92 ] proposed a new face image representation based on over-complete LBP (OCLBP), a multi-scale modified version of the LBP technique. LDA is performed to reduce the high-dimensional representations, and within-class covariance normalization (WCCN), a metric learning technique, is then used for face recognition.
  • Advanced correlation filters and Walsh LBP (WLBP): Juefei et al. [ 93 ] implemented a single-sample periocular-based alignment-robust face recognition technique based on high-dimensional Walsh LBP (WLBP). This technique utilizes only one sample per subject class and generates new face images under a wide range of 3D rotations using the 3D generic elastic model, which is both accurate and computationally inexpensive. The LFW database is used for evaluation, and the proposed method outperformed the state-of-the-art algorithms under four evaluation protocols with a high accuracy of 89.69%.
  • Multi-sub-region-based correlation filter bank (MS-CFB): Yan et al. [ 94 ] propose an effective feature extraction technique for robust face recognition, named multi-sub-region-based correlation filter bank (MS-CFB). MS-CFB extracts the local features independently for each face sub-region. After that, the different face sub-regions are concatenated to give optimal overall correlation outputs. This technique reduces the complexity, achieves higher recognition rates, and provides a better feature representation for recognition compared with several state-of-the-art techniques on various public face databases.
  • SIFT features, Fisher vectors, and PCA: Simonyan et al. [ 64 ] have developed a novel method for face recognition based on the SIFT descriptor and Fisher vectors. The authors propose a discriminative dimensionality reduction owing to the high dimensionality of the Fisher vectors. After that, these vectors are projected into a low dimensional subspace with a linear projection. The objective of this methodology is to describe the image based on dense SIFT features and Fisher vectors encoding to achieve high performance on the challenging LFW dataset in both restricted and unrestricted settings.
  • CNNs and stacked auto-encoder (SAE) techniques: Ding et al. [ 95 ] proposed a multimodal deep face representation (MM-DFR) framework based on convolutional neural networks (CNNs), whose inputs are the original holistic face image, a frontal face rendered by a 3D face model (standing for holistic facial features), and uniformly sampled image patches (standing for local facial features). The MM-DFR framework has two steps: CNNs extract the features, and a three-layer stacked auto-encoder (SAE) compresses the high-dimensional deep features into a compact face signature. The LFW database is used to evaluate the identification performance of MM-DFR. The flowchart of the proposed MM-DFR framework is shown in Figure 12 .
  • PCA and ANFIS: Sharma et al. [ 96 ] proposed an efficient pose-invariant face recognition system based on the PCA technique and an ANFIS classifier. PCA is employed to extract the features of an image, and the ANFIS classifier is developed for identification under a variety of pose conditions. For the face recognition task, the proposed PCA–ANFIS system performs better than ICA–ANFIS and LDA–ANFIS. The ORL database is used for evaluation.
  • DCT and PCA: Ojala et al. [ 97 ] developed a fast face recognition system based on DCT and PCA techniques. A genetic algorithm (GA) is used to extract facial features, removing irrelevant features and reducing the feature count; the DCT–PCA combination then extracts the features and reduces the dimensionality. The minimum Euclidean distance (ED) is used as the decision measurement. Various face databases are used to demonstrate the effectiveness of this system.
  • PCA, SIFT, and iterative closest point (ICP): Mian et al. [ 98 ] presented a multimodal (2D and 3D) face recognition system based on hybrid matching to achieve efficiency and robustness to facial expressions. The Hotelling transform is performed to automatically correct the pose of a 3D face using its texture. A novel 3D spherical face representation (SFR), used in conjunction with the SIFT descriptor, forms a rejection classifier that provides efficient recognition in the case of large galleries by eliminating a large number of candidate faces. A modified iterative closest point (ICP) algorithm is used for the decision. This system is robust to facial expressions and achieved a 98.6% verification rate and a 96.1% identification rate on the complete FRGC v2 database.
  • PCA, local Gabor binary pattern histogram sequence (LGBPHS), and Gabor wavelets: Cho et al. [ 99 ] proposed a computationally efficient hybrid face recognition system that employs both holistic and local features. The PCA technique is used to reduce the dimensionality, and the local Gabor binary pattern histogram sequence (LGBPHS) technique then realizes the recognition stage, reducing the complexity caused by the Gabor filters. The experimental results show a better recognition rate than the PCA and Gabor wavelet techniques under illumination variations. The Extended Yale Face Database B is used to demonstrate the effectiveness of this system.
  • PCA and Fisher linear discriminant (FLD) [ 100 , 101 ]: Sing et al. [ 101 ] proposed a novel hybrid technique for face representation and recognition that exploits both local and subspace features. To extract the local features, the whole image is divided into sub-regions, while the global features are extracted directly from the whole image. PCA and Fisher linear discriminant (FLD) techniques are then applied to the fused feature vector to reduce the dimensionality. The CMU-PIE, FERET, and AR face databases are used for the evaluation.
  • SPCA–KNN [ 102 ]: Kamencay et al. [ 102 ] developed a new face recognition method based on SIFT features together with PCA and KNN techniques. The Hessian–Laplace detector along with the SPCA descriptor is used to extract the local features, SPCA is introduced to identify the human face, and a KNN classifier identifies the closest human faces from the trained features. The experiments achieved a recognition rate of 92% on the unsegmented ESSEX database and 96% on the segmented database (700 training images).
  • Convolution operations, LSTM recurrent units, and ELM classifier [ 103 ]: Sun et al. [ 103 ] proposed a hybrid deep structure called CNN–LSTM–ELM to achieve sequential human activity recognition (HAR). The structure is evaluated on the OPPORTUNITY dataset, which contains 46,495 training samples and 9894 testing samples, each sample being a sequence. Model training and testing run on a GPU with 1536 cores, a 1050 MHz clock speed, and 8 GB RAM. The flowchart of the proposed CNN–LSTM–ELM structure is shown in Figure 13 [ 103 ].
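As an illustration of the hybrid idea, the sketch below chains a Gabor-magnitude feature extractor with LDA dimensionality reduction and a 1-NN classifier, loosely in the spirit of HGWLDA [ 91 ]; the filter frequencies, image sizes, labels, and random data are all illustrative assumptions rather than the configuration of any surveyed system.

```python
import numpy as np
from skimage.filters import gabor
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.neighbors import KNeighborsClassifier

def gabor_features(img, freqs=(0.1, 0.2, 0.3)):
    """Concatenate subsampled Gabor magnitude responses at several frequencies."""
    feats = []
    for f in freqs:
        real, imag = gabor(img, frequency=f)     # one filter per frequency
        mag = np.hypot(real, imag)               # local magnitude response
        feats.append(mag[::4, ::4].ravel())      # coarse subsampling keeps D small
    return np.concatenate(feats)

rng = np.random.default_rng(0)
X_imgs = rng.random((60, 32, 32))                # stand-ins for aligned face crops
y = np.repeat(np.arange(6), 10)                  # 6 subjects, 10 images each

X = np.array([gabor_features(im) for im in X_imgs])
lda = LinearDiscriminantAnalysis(n_components=5).fit(X, y)
knn = KNeighborsClassifier(n_neighbors=1).fit(lda.transform(X), y)
print(knn.predict(lda.transform(X[:3])))         # predicted subject labels
```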

5.2. Summary of Hybrid Approaches

6. Assessment of Face Recognition Approaches

6.1. Measures of Similarity or Distances

  • Peak-to-correlation energy (PCE) or peak-to-sidelobe ratio (PSR) [ 18 ]: The PCE was introduced in (8).
  • Euclidean distance [ 54 ]: The Euclidean distance is one of the most basic measures, computing the straight-line distance between two points. For two points P_1 and P_2 with coordinates (x_1, y_1) and (x_2, y_2), it is d_E(P_1, P_2) = √((x_2 − x_1)² + (y_2 − y_1)²). (15) In general, the Euclidean distance between two points P = (p_1, p_2, …, p_n) and Q = (q_1, q_2, …, q_n) in n-dimensional space is d_E(P, Q) = √(Σ_{i=1..n} (p_i − q_i)²). (16)
  • Bhattacharyya distance [ 104 , 105 ]: The Bhattacharyya distance is a statistical measure that quantifies the similarity between two discrete or continuous probability distributions. It is particularly known for its low processing time and its low sensitivity to noise. For probability distributions p and q defined on the same domain, it is defined as D_B(p, q) = −ln(BC(p, q)), (17) with BC(p, q) = Σ_{x∈X} √(p(x) q(x)) (18a) for discrete distributions and BC(p, q) = ∫ √(p(x) q(x)) dx (18b) for continuous distributions, where BC is the Bhattacharyya coefficient. In both cases, 0 ≤ BC ≤ 1 and 0 ≤ D_B ≤ ∞. In its simplest formulation, the Bhattacharyya distance between two classes that follow normal distributions can be calculated from the means (μ) and variances (σ²): D_B(p, q) = (1/4) ln((1/4)(σ_p²/σ_q² + σ_q²/σ_p² + 2)) + (1/4) ((μ_p − μ_q)² / (σ_p² + σ_q²)). (19)
  • Chi-squared distance [ 106 ]: The Chi-squared (χ²) distance weights each bin difference by the bin values, giving differences in sparsely populated bins the same relevance as those in densely populated ones. To compare two histograms S_1 = (u_1, …, u_m) and S_2 = (w_1, …, w_m), the χ² distance is defined as χ²(S_1, S_2) = (1/2) Σ_{i=1..m} (u_i − w_i)² / (u_i + w_i). (20) Implementations of these measures are sketched after this list.
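A minimal NumPy sketch of these measures follows; the two random normalized histograms stand in for face signatures (e.g., concatenated LBP histograms), and the small epsilon terms are an illustrative guard against division by zero and log of zero.

```python
import numpy as np

def euclidean(p, q):                        # Eq. (16)
    return np.sqrt(np.sum((p - q) ** 2))

def bhattacharyya(p, q, eps=1e-12):         # Eqs. (17)-(18a), discrete case
    bc = np.sum(np.sqrt(p * q))             # Bhattacharyya coefficient
    return -np.log(bc + eps)

def chi_squared(u, w, eps=1e-12):           # Eq. (20)
    return 0.5 * np.sum((u - w) ** 2 / (u + w + eps))

rng = np.random.default_rng(0)
h1 = rng.random(59); h1 /= h1.sum()         # normalized histogram "signatures"
h2 = rng.random(59); h2 /= h2.sum()
print(euclidean(h1, h2), bhattacharyya(h1, h2), chi_squared(h1, h2))
```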

6.2. Classifiers

  • Support vector machines (SVMs) [ 13 , 26 ]: The feature vectors extracted by any descriptor can be classified by a linear or nonlinear SVM. The SVM classifier separates the classes with an optimal hyperplane determined only by the closest points of the training set, called support vectors ( Figure 14 ). Among the infinite number of hyperplanes capable of perfectly separating two classes, the SVM selects the one that maximizes the minimal distance between the training examples and the hyperplane (i.e., the distance between the support vectors and the hyperplane); this distance is called the “margin”. The training data are D = {(x_i, y_i) | x_i ∈ Rⁿ, y_i ∈ {−1, 1}, i = 1, …, l}, (21) where the x_i are the training feature vectors and the y_i are the corresponding labels (1 or −1). The SVM finds the hyperplane that distinguishes the samples with the smallest error, and the classification function is obtained from the signed distance between an input vector and the hyperplane: f(x) = w · x − b, (22) where w and b are the parameters of the model. Shen et al. [ 108 ] used Gabor filters to extract the face features and applied an SVM for classification. The FaceNet method [ 39 ] achieves record accuracies of 99.63% and 95.12% on the LFW and YouTube Faces DB datasets, respectively.
  • k-nearest neighbor (k-NN) [ 17 , 91 ]: k-NN is a lazy algorithm: during training it simply stores the samples and builds no explicit model, unlike, for example, decision trees.
  • K-means [ 9 , 109 ]: It is called K-means because it represents each of the k groups by the mean (or weighted mean) of its points, called the centroid. In the K-means algorithm, the number of clusters k must be specified a priori before starting the process.
  • Deep learning (DL): An automatic learning technique that uses neural network architectures. The term “deep” refers to the number of hidden layers in the network: while conventional neural networks contain one or two hidden layers, deep neural networks (DNN) contain many, as presented in Figure 15 . A convolutional neural network is typically built from the following layers (a minimal sketch follows this list):
  • Convolutional layer : sometimes called the feature extractor layer, because the features of the image are extracted within it. Convolution preserves the spatial relationship between pixels by learning image features over small squares of the input image. The input image is convolved with a set of learnable kernels, producing a feature map (or activation map) that is fed as input to the next convolutional layer. The convolutional layer is typically followed by a rectified linear unit (ReLU) activation that converts all negative values to zero; this is computationally efficient, as few neurons are activated at a time.
  • Pooling layer : used to reduce dimensions, with the aim of reducing processing time by retaining the most important information after convolution. This layer reduces the number of parameters and the computation in the network, controlling overfitting by progressively reducing the spatial size of the representation. There are two common operations: average pooling, which takes all the elements of a sub-matrix, computes their average, and stores the value in the output matrix; and max pooling, which takes the highest value found in the sub-matrix and saves it in the output matrix.
  • Fully-connected layer : in this layer, the neurons have a complete connection to all the activations of the previous layer. It is used to classify images between different categories after training.
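To tie the layer descriptions together, here is a minimal PyTorch sketch of a convolution → ReLU → max-pooling → fully-connected stack; the layer sizes, 64 × 64 input, and 10-class output are illustrative choices, not a surveyed architecture.

```python
import torch
import torch.nn as nn

class TinyFaceNet(nn.Module):
    def __init__(self, n_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1),   # feature extractor
            nn.ReLU(),                                    # zero out negatives
            nn.MaxPool2d(2),                              # keep strongest responses
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(32 * 16 * 16, n_classes)  # fully connected

    def forward(self, x):                 # x: (batch, 1, 64, 64) grayscale faces
        x = self.features(x)              # -> (batch, 32, 16, 16)
        return self.classifier(x.flatten(1))

logits = TinyFaceNet()(torch.randn(4, 1, 64, 64))
print(logits.shape)                       # torch.Size([4, 10])
```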

6.3. Databases Used

  • LFW (Labeled Faces in the Wild) database was created in October 2007. It contains 13,233 images of 5749 subjects, 1680 of whom have at least two images, with the rest having a single image. The face images were collected on the Internet, pre-processed, and localized by the Viola–Jones detector at a resolution of 250 × 250 pixels. Most are in color, although some are in grayscale; they are provided in JPG format and organized in folders (a loading sketch with scikit-learn is given after this list).
  • FERET (Face Recognition Technology) database was created in 15 sessions in a semi-controlled environment between August 1993 and July 1996. It contains 1564 sets of images, 14,126 images in total. The duplicate series belong to subjects already present in the individual series and were generally captured one day apart, although some images of the same subject were taken several years apart and can be used to study facial changes that appear over time. The images are 24-bit RGB color with a resolution of 512 × 768 pixels.
  • AR face database was created by Aleix Martínez and Robert Benavente in the Computer Vision Center (CVC) of the Autonomous University of Barcelona in June 1998. It contains more than 4000 images of 126 subjects (70 men and 56 women), taken frontally at the CVC under a controlled environment, with different facial expressions, three different lighting conditions, and several accessories: scarves, glasses, or sunglasses. Two imaging sessions were performed with the same subjects, 14 days apart. The images have a resolution of 576 × 768 pixels and a depth of 24 bits, in RGB RAW format.
  • ORL Database of Faces was collected between April 1992 and April 1994 at the AT&T laboratory in Cambridge. It consists of 10 images per subject for 40 subjects, 400 images in total. For some subjects, the images were taken at different times, with varying illumination and facial expressions (eyes open/closed, smiling/not smiling) and with or without glasses. The images were taken against a dark homogeneous background, in an upright, frontal position with some small rotation, at a resolution of 92 × 112 pixels in grayscale.
  • Extended Yale Face B database contains 16,128 grayscale images of 640 × 480 pixels, covering 28 individuals under 9 poses and 64 different lighting conditions. It also includes a set of cropped images containing only the face of each individual.
  • Pointing Head Pose Image Database (PHPID) is one of the most widely used databases for face recognition. It contains 2790 monocular face images of 15 persons, with tilt angles from −90° to +90° and variations in pan. Every person has two series of 93 different poses (93 images each). The subjects vary in skin color, and images were taken with and without glasses.
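For experimentation, LFW can be loaded directly with scikit-learn; the thresholds below follow the library's classic Eigenfaces example, and the printed shapes are indicative.

```python
from sklearn.datasets import fetch_lfw_people

# Downloads LFW on first use; keep only subjects with enough images so that
# identification (one class per person) is meaningful.
lfw = fetch_lfw_people(min_faces_per_person=70, resize=0.4)
X, y = lfw.data, lfw.target              # flattened grayscale crops and labels
print(X.shape, len(lfw.target_names))    # e.g. (1288, 1850) and 7 subjects
```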

6.4. Comparison between Holistic, Local, and Hybrid Techniques

7. Discussion about Future Directions and Conclusions

7.1. Discussion

  • Local approaches: use features that describe the face only partially. For example, a system could extract local features such as the eyes, mouth, and nose; the feature values are then computed from lines or points marked on the face image and used in the recognition step.
  • Holistic approaches: use features that globally describe the complete face as a model, including the background (although it is desirable that the background occupy the smallest possible area).
  • Hybrid approaches: combine local and holistic approaches.
  • Three-dimensional face recognition: In 2D image-based techniques, some features are lost owing to the 3D structure of the face. Lighting and pose variations are two major unresolved problems of 2D face recognition. Recently, 3D face recognition has been widely studied by the scientific community to overcome these unresolved problems and to achieve significantly higher accuracy by measuring the geometry of rigid features on the face. For this reason, several recent systems based on 3D data have been developed [ 3 , 93 , 95 , 128 , 129 ].
  • Multimodal facial recognition: sensors developed in recent years can acquire not only two-dimensional texture information but also facial shape, that is, three-dimensional information. For this reason, some recent studies have merged the two types of 2D and 3D information to take advantage of each and obtain a hybrid system that improves recognition over either single modality [ 98 ].

References

  • Liao, S.; Jain, A.K.; Li, S.Z. Partial face recognition: Alignment-free approach. IEEE Trans. Pattern Anal. Mach. Intell. 2012 , 35 , 1193–1205. [ Google Scholar ] [ CrossRef ] [ PubMed ] [ Green Version ]
  • Jridi, M.; Napoléon, T.; Alfalou, A. One lens optical correlation: Application to face recognition. Appl. Opt. 2018 , 57 , 2087–2095. [ Google Scholar ] [ CrossRef ] [ PubMed ]
  • Napoléon, T.; Alfalou, A. Pose invariant face recognition: 3D model from single photo. Opt. Lasers Eng. 2017 , 89 , 150–161. [ Google Scholar ] [ CrossRef ]
  • Ouerhani, Y.; Jridi, M.; Alfalou, A. Fast face recognition approach using a graphical processing unit “GPU”. In Proceedings of the 2010 IEEE International Conference on Imaging Systems and Techniques, Thessaloniki, Greece, 1–2 July 2010; IEEE: Piscataway, NJ, USA, 2010; pp. 80–84. [ Google Scholar ]
  • Yang, W.; Wang, S.; Hu, J.; Zheng, G.; Valli, C. A fingerprint and finger-vein based cancelable multi-biometric system. Pattern Recognit. 2018 , 78 , 242–251. [ Google Scholar ] [ CrossRef ]
  • Patel, N.P.; Kale, A. Optimize Approach to Voice Recognition Using IoT. In Proceedings of the 2018 International Conference on Advances in Communication and Computing Technology (ICACCT), Sangamner, India, 8–9 February 2018; IEEE: Piscataway, NJ, USA, 2018; pp. 251–256. [ Google Scholar ]
  • Wang, Q.; Alfalou, A.; Brosseau, C. New perspectives in face correlation research: A tutorial. Adv. Opt. Photonics 2017 , 9 , 1–78. [ Google Scholar ] [ CrossRef ]
  • Alfalou, A.; Brosseau, C.; Kaddah, W. Optimization of decision making for face recognition based on nonlinear correlation plane. Opt. Commun. 2015 , 343 , 22–27. [ Google Scholar ] [ CrossRef ]
  • Zhao, C.; Li, X.; Cang, Y. Bisecting k-means clustering based face recognition using block-based bag of words model. Opt. Int. J. Light Electron Opt. 2015 , 126 , 1761–1766. [ Google Scholar ] [ CrossRef ]
  • HajiRassouliha, A.; Gamage, T.P.B.; Parker, M.D.; Nash, M.P.; Taberner, A.J.; Nielsen, P.M. FPGA implementation of 2D cross-correlation for real-time 3D tracking of deformable surfaces. In Proceedings of the 2013 28th International Conference on Image and Vision Computing New Zealand (IVCNZ 2013), Wellington, New Zealand, 27–29 November 2013; IEEE: Piscataway, NJ, USA, 2013; pp. 352–357. [ Google Scholar ]
  • Kortli, Y.; Jridi, M.; Al Falou, A.; Atri, M. A comparative study of CFs, LBP, HOG, SIFT, SURF, and BRIEF techniques for face recognition. In Pattern Recognition and Tracking XXIX ; International Society for Optics and Photonics; SPIE: Bellingham, WA, USA, 2018; Volume 10649, p. 106490M. [ Google Scholar ]
  • Dehai, Z.; Da, D.; Jin, L.; Qing, L. A pca-based face recognition method by applying fast fourier transform in pre-processing. In 3rd International Conference on Multimedia Technology (ICMT-13) ; Atlantis Press: Paris, France, 2013. [ Google Scholar ]
  • Ouerhani, Y.; Alfalou, A.; Brosseau, C. Road mark recognition using HOG-SVM and correlation. In Optics and Photonics for Information Processing XI ; International Society for Optics and Photonics; SPIE: Bellingham, WA, USA, 2017; Volume 10395, p. 103950Q. [ Google Scholar ]
  • Liu, W.; Wang, Z.; Liu, X.; Zeng, N.; Liu, Y.; Alsaadi, F.E. A survey of deep neural network architectures and their applications. Neurocomputing 2017 , 234 , 11–26. [ Google Scholar ] [ CrossRef ]
  • Xi, M.; Chen, L.; Polajnar, D.; Tong, W. Local binary pattern network: A deep learning approach for face recognition. In Proceedings of the 2016 IEEE International Conference on Image Processing (ICIP), Phoenix, AZ, USA, 25–28 September 2016; IEEE: Piscataway, NJ, USA, 2016; pp. 3224–3228. [ Google Scholar ]
  • Ojala, T.; Pietikäinen, M.; Harwood, D. A comparative study of texture measures with classification based on featured distributions. Pattern Recognit. 1996 , 29 , 51–59. [ Google Scholar ] [ CrossRef ]
  • Gowda, H.D.S.; Kumar, G.H.; Imran, M. Multimodal Biometric Recognition System Based on Nonparametric Classifiers. Data Anal. Learn. 2018 , 43 , 269–278. [ Google Scholar ]
  • Ouerhani, Y.; Jridi, M.; Alfalou, A.; Brosseau, C. Optimized pre-processing input plane GPU implementation of an optical face recognition technique using a segmented phase only composite filter. Opt. Commun. 2013 , 289 , 33–44. [ Google Scholar ] [ CrossRef ] [ Green Version ]
  • Mousa Pasandi, M.E. Face, Age and Gender Recognition Using Local Descriptors. Ph.D. Thesis, Université d’Ottawa/University of Ottawa, Ottawa, ON, Canada, 2014. [ Google Scholar ]
  • Khoi, P.; Thien, L.H.; Viet, V.H. Face Retrieval Based on Local Binary Pattern and Its Variants: A Comprehensive Study. Int. J. Adv. Comput. Sci. Appl. 2016 , 7 , 249–258. [ Google Scholar ] [ CrossRef ] [ Green Version ]
  • Zeppelzauer, M. Automated detection of elephants in wildlife video. EURASIP J. Image Video Process. 2013 , 46 , 2013. [ Google Scholar ] [ CrossRef ] [ PubMed ] [ Green Version ]
  • Parmar, D.N.; Mehta, B.B. Face recognition methods & applications. arXiv 2014 , arXiv:1403.0485. [ Google Scholar ]
  • Vinay, A.; Hebbar, D.; Shekhar, V.S.; Murthy, K.B.; Natarajan, S. Two novel detector-descriptor based approaches for face recognition using sift and surf. Procedia Comput. Sci. 2015 , 70 , 185–197. [ Google Scholar ]
  • Yang, H.; Wang, X.A. Cascade classifier for face detection. J. Algorithms Comput. Technol. 2016 , 10 , 187–197. [ Google Scholar ] [ CrossRef ] [ Green Version ]
  • Viola, P.; Jones, M. Rapid object detection using a boosted cascade of simple features. In Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Kauai, HI, USA, 8–14 December 2001. [ Google Scholar ]
  • Rettkowski, J.; Boutros, A.; Göhringer, D. HW/SW Co-Design of the HOG algorithm on a Xilinx Zynq SoC. J. Parallel Distrib. Comput. 2017 , 109 , 50–62. [ Google Scholar ] [ CrossRef ]
  • Seo, H.J.; Milanfar, P. Face verification using the lark representation. IEEE Trans. Inf. Forensics Secur. 2011 , 6 , 1275–1286. [ Google Scholar ] [ CrossRef ] [ Green Version ]
  • Shah, J.H.; Sharif, M.; Raza, M.; Azeem, A. A Survey: Linear and Nonlinear PCA Based Face Recognition Techniques. Int. Arab J. Inf. Technol. 2013 , 10 , 536–545. [ Google Scholar ]
  • Du, G.; Su, F.; Cai, A. Face recognition using SURF features. In MIPPR 2009: Pattern Recognition and Computer Vision ; International Society for Optics and Photonics; SPIE: Bellingham, WA, USA, 2009; Volume 7496, p. 749628. [ Google Scholar ]
  • Calonder, M.; Lepetit, V.; Ozuysal, M.; Trzcinski, T.; Strecha, C.; Fua, P. BRIEF: Computing a local binary descriptor very fast. IEEE Trans. Pattern Anal. Mach. Intell. 2011 , 34 , 1281–1298. [ Google Scholar ] [ CrossRef ] [ Green Version ]
  • Smach, F.; Miteran, J.; Atri, M.; Dubois, J.; Abid, M.; Gauthier, J.P. An FPGA-based accelerator for Fourier Descriptors computing for color object recognition using SVM. J. Real-Time Image Process. 2007 , 2 , 249–258. [ Google Scholar ] [ CrossRef ]
  • Kortli, Y.; Jridi, M.; Al Falou, A.; Atri, M. A novel face detection approach using local binary pattern histogram and support vector machine. In Proceedings of the 2018 International Conference on Advanced Systems and Electric Technologies (IC_ASET), Hammamet, Tunisia, 22–25 March 2018; IEEE: Piscataway, NJ, USA, 2018; pp. 28–33. [ Google Scholar ]
  • Wang, Q.; Xiong, D.; Alfalou, A.; Brosseau, C. Optical image authentication scheme using dual polarization decoding configuration. Opt. Lasers Eng. 2019 , 112 , 151–161. [ Google Scholar ] [ CrossRef ]
  • Turk, M.; Pentland, A. Eigenfaces for recognition. J. Cogn. Neurosci. 1991 , 3 , 71–86. [ Google Scholar ] [ CrossRef ] [ PubMed ]
  • Annalakshmi, M.; Roomi, S.M.M.; Naveedh, A.S. A hybrid technique for gender classification with SLBP and HOG features. Clust. Comput. 2019 , 22 , 11–20. [ Google Scholar ] [ CrossRef ]
  • Hussain, S.U.; Napoléon, T.; Jurie, F. Face Recognition Using Local Quantized Patterns ; HAL: Bengaluru, India, 2012. [ Google Scholar ]
  • Alfalou, A.; Brosseau, C. Understanding Correlation Techniques for Face Recognition: From Basics to Applications. In Face Recognition ; Oravec, M., Ed.; IntechOpen: Rijeka, Croatia, 2010. [ Google Scholar ]
  • Napoléon, T.; Alfalou, A. Local binary patterns preprocessing for face identification/verification using the VanderLugt correlator. In Optical Pattern Recognition XXV ; International Society for Optics and Photonics; SPIE: Bellingham, WA, USA, 2014; Volume 9094, p. 909408. [ Google Scholar ]
  • Schroff, F.; Kalenichenko, D.; Philbin, J. Facenet: A unified embedding for face recognition and clustering. In Proceedings of the IEEE conference on computer vision and pattern recognition, Boston, MA, USA, 7–12 June 2015; pp. 815–823. [ Google Scholar ]
  • Kambi Beli, I.; Guo, C. Enhancing face identification using local binary patterns and k-nearest neighbors. J. Imaging 2017 , 3 , 37. [ Google Scholar ] [ CrossRef ] [ Green Version ]
  • Benarab, D.; Napoléon, T.; Alfalou, A.; Verney, A.; Hellard, P. Optimized swimmer tracking system by a dynamic fusion of correlation and color histogram techniques. Opt. Commun. 2015 , 356 , 256–268. [ Google Scholar ] [ CrossRef ]
  • Bonnen, K.; Klare, B.F.; Jain, A.K. Component-based representation in automated face recognition. IEEE Trans. Inf. Forensics Secur. 2012 , 8 , 239–253. [ Google Scholar ] [ CrossRef ] [ Green Version ]
  • Ren, J.; Jiang, X.; Yuan, J. Relaxed local ternary pattern for face recognition. In Proceedings of the 2013 IEEE International Conference on Image Processing, Melbourne, Australia, 15–18 September 2013; IEEE: Piscataway, NJ, USA, 2013; pp. 3680–3684. [ Google Scholar ]
  • Karaaba, M.; Surinta, O.; Schomaker, L.; Wiering, M.A. Robust face recognition by computing distances from multiple histograms of oriented gradients. In Proceedings of the 2015 IEEE Symposium Series on Computational Intelligence, Cape Town, South Africa, 7–10 December 2015; IEEE: Piscataway, NJ, USA, 2015; pp. 203–209. [ Google Scholar ]
  • Huang, C.; Huang, J. A fast HOG descriptor using lookup table and integral image. arXiv 2017 , arXiv:1703.06256. [ Google Scholar ]
  • Arigbabu, O.A.; Ahmad, S.M.S.; Adnan, W.A.W.; Yussof, S.; Mahmood, S. Soft biometrics: Gender recognition from unconstrained face images using local feature descriptor. arXiv 2017 , arXiv:1702.02537. [ Google Scholar ]
  • Vander Lugt, A. Signal detection by complex spatial filtering. IEEE Trans. Inf. Theory 1964 , 10 , 139. [ Google Scholar ]
  • Weaver, C.S.; Goodman, J.W. A technique for optically convolving two functions. Appl. Opt. 1966 , 5 , 1248–1249. [ Google Scholar ] [ CrossRef ] [ PubMed ]
  • Horner, J.L.; Gianino, P.D. Phase-only matched filtering. Appl. Opt. 1984 , 23 , 812–816. [ Google Scholar ] [ CrossRef ] [ PubMed ]
  • Leonard, I.; Alfalou, A.; Brosseau, C. Face recognition based on composite correlation filters: Analysis of their performances. In Face Recognition: Methods, Applications and Technology ; Nova Science Pub Inc.: London, UK, 2012. [ Google Scholar ]
  • Katz, P.; Aron, M.; Alfalou, A. A Face-Tracking System to Detect Falls in the Elderly ; SPIE Newsroom; SPIE: Bellingham, WA, USA, 2013. [ Google Scholar ]
  • Alfalou, A.; Brosseau, C.; Katz, P.; Alam, M.S. Decision optimization for face recognition based on an alternate correlation plane quantification metric. Opt. Lett. 2012 , 37 , 1562–1564. [ Google Scholar ] [ CrossRef ] [ PubMed ]
  • Elbouz, M.; Bouzidi, F.; Alfalou, A.; Brosseau, C.; Leonard, I.; Benkelfat, B.E. Adapted all-numerical correlator for face recognition applications. In Optical Pattern Recognition XXIV ; International Society for Optics and Photonics; SPIE: Bellingham, WA, USA, 2013; Volume 8748, p. 874807. [ Google Scholar ]
  • Heflin, B.; Scheirer, W.; Boult, T.E. For your eyes only. In Proceedings of the 2012 IEEE Workshop on the Applications of Computer Vision (WACV), Breckenridge, CO, USA, 9–11 January 2012; pp. 193–200. [ Google Scholar ]
  • Zhu, X.; Liao, S.; Lei, Z.; Liu, R.; Li, S.Z. Feature correlation filter for face recognition. In Advances in Biometrics, Proceedings of the International Conference on Biometrics, Seoul, Korea, 27–29 August 2007 ; Springer: Berlin/Heidelberg, Germany, 2007; Volume 4642, pp. 77–86. [ Google Scholar ]
  • Lenc, L.; Král, P. Automatic face recognition system based on the SIFT features. Comput. Electr. Eng. 2015 , 46 , 256–272. [ Google Scholar ] [ CrossRef ]
  • Işık, Ş. A comparative evaluation of well-known feature detectors and descriptors. Int. J. Appl. Math. Electron. Comput. 2014 , 3 , 1–6. [ Google Scholar ] [ CrossRef ] [ Green Version ]
  • Mahier, J.; Hemery, B.; El-Abed, M.; El-Allam, M.; Bouhaddaoui, M.; Rosenberger, C. Computation evabio: A tool for performance evaluation in biometrics. Int. J. Autom. Identif. Technol. 2011 , 24 , hal-00984026. [ Google Scholar ]
  • Alahi, A.; Ortiz, R.; Vandergheynst, P. Freak: Fast retina keypoint. In Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, Providence, RI, USA, 16–21 June 2012; pp. 510–517. [ Google Scholar ]
  • Arashloo, S.R.; Kittler, J. Efficient processing of MRFs for unconstrained-pose face recognition. In Proceedings of the 2013 IEEE Sixth International Conference on Biometrics: Theory, Applications and Systems (BTAS), Arlington, VA, USA, 29 September–2 October 2013; IEEE: Piscataway, NJ, USA, 2013; pp. 1–8. [ Google Scholar ]
  • Ghorbel, A.; Tajouri, I.; Aydi, W.; Masmoudi, N. A comparative study of GOM, uLBP, VLC and fractional Eigenfaces for face recognition. In Proceedings of the 2016 International Image Processing, Applications and Systems (IPAS), Hammamet, Tunisia, 5–7 November 2016; IEEE: Piscataway, NJ, USA, 2016; pp. 1–5. [ Google Scholar ]
  • Lima, A.; Zen, H.; Nankaku, Y.; Miyajima, C.; Tokuda, K.; Kitamura, T. On the use of kernel PCA for feature extraction in speech recognition. IEICE Trans. Inf. Syst. 2004 , 87 , 2802–2811. [ Google Scholar ]
  • Devi, B.J.; Veeranjaneyulu, N.; Kishore, K.V.K. A novel face recognition system based on combining eigenfaces with fisher faces using wavelets. Procedia Comput. Sci. 2010 , 2 , 44–51. [ Google Scholar ] [ CrossRef ] [ Green Version ]
  • Simonyan, K.; Parkhi, O.M.; Vedaldi, A.; Zisserman, A. Fisher vector faces in the wild. In Proceedings of the BMVC 2013—British Machine Vision Conference, Bristol, UK, 9–13 September 2013. [ Google Scholar ]
  • Li, B.; Ma, K.K. Fisherface vs. eigenface in the dual-tree complex wavelet domain. In Proceedings of the 2009 Fifth International Conference on Intelligent Information Hiding and Multimedia Signal Processing, Kyoto, Japan, 12–14 September 2009; IEEE: Piscataway, NJ, USA, 2009; pp. 30–33. [ Google Scholar ]
  • Agarwal, R.; Jain, R.; Regunathan, R.; Kumar, C.P. Automatic Attendance System Using Face Recognition Technique. In Proceedings of the 2nd International Conference on Data Engineering and Communication Technology ; Springer: Singapore, 2019; pp. 525–533. [ Google Scholar ]
  • Cui, Z.; Li, W.; Xu, D.; Shan, S.; Chen, X. Fusing robust face region descriptors via multiple metric learning for face recognition in the wild. In Proceedings of the IEEE conference on computer vision and pattern recognition, Portland, OR, USA, 23–28 June 2013; pp. 3554–3561. [ Google Scholar ]
  • Prince, S.; Li, P.; Fu, Y.; Mohammed, U.; Elder, J. Probabilistic models for inference about identity. IEEE Trans. Pattern Anal. Mach. Intell. 2011 , 34 , 144–157. [ Google Scholar ]
  • Perlibakas, V. Face recognition using principal component analysis and log-gabor filters. arXiv 2006 , arXiv:cs/0605025. [ Google Scholar ]
  • Huang, Z.H.; Li, W.J.; Shang, J.; Wang, J.; Zhang, T. Non-uniform patch based face recognition via 2D-DWT. Image Vision Comput. 2015 , 37 , 12–19. [ Google Scholar ] [ CrossRef ]
  • Sufyanu, Z.; Mohamad, F.S.; Yusuf, A.A.; Mamat, M.B. Enhanced Face Recognition Using Discrete Cosine Transform. Eng. Lett. 2016 , 24 , 52–61. [ Google Scholar ]
  • Hoffmann, H. Kernel PCA for novelty detection. Pattern Recognit. 2007 , 40 , 863–874. [ Google Scholar ] [ CrossRef ]
  • Arashloo, S.R.; Kittler, J. Class-specific kernel fusion of multiple descriptors for face verification using multiscale binarised statistical image features. IEEE Trans. Inf. Forensics Secur. 2014 , 9 , 2100–2109. [ Google Scholar ] [ CrossRef ]
  • Vinay, A.; Shekhar, V.S.; Murthy, K.B.; Natarajan, S. Performance study of LDA and KFA for gabor based face recognition system. Procedia Comput. Sci. 2015 , 57 , 960–969. [ Google Scholar ] [ CrossRef ] [ Green Version ]
  • Sivasathya, M.; Joans, S.M. Image Feature Extraction using Non Linear Principle Component Analysis. Procedia Eng. 2012 , 38 , 911–917. [ Google Scholar ] [ CrossRef ] [ Green Version ]
  • Zhang, B.; Chen, X.; Shan, S.; Gao, W. Nonlinear face recognition based on maximum average margin criterion. In Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’05), San Diego, CA, USA, 20–25 June 2005; IEEE: Piscataway, NJ, USA, 2005; Volume 1, pp. 554–559. [ Google Scholar ]
  • Vankayalapati, H.D.; Kyamakya, K. Nonlinear feature extraction approaches with application to face recognition over large databases. In Proceedings of the 2009 2nd International Workshop on Nonlinear Dynamics and Synchronization, Klagenfurt, Austria, 20–21 July 2009; IEEE: Piscataway, NJ, USA, 2009; pp. 44–48. [ Google Scholar ]
  • Javidi, B.; Li, J.; Tang, Q. Optical implementation of neural networks for face recognition by the use of nonlinear joint transform correlators. Appl. Opt. 1995 , 34 , 3950–3962. [ Google Scholar ] [ CrossRef ]
  • Yang, J.; Frangi, A.F.; Yang, J.Y. A new kernel Fisher discriminant algorithm with application to face recognition. Neurocomputing 2004 , 56 , 415–421. [ Google Scholar ] [ CrossRef ]
  • Pang, Y.; Liu, Z.; Yu, N. A new nonlinear feature extraction method for face recognition. Neurocomputing 2006 , 69 , 949–953. [ Google Scholar ] [ CrossRef ]
  • Wang, Y.; Fei, P.; Fan, X.; Li, H. Face recognition using nonlinear locality preserving with deep networks. In Proceedings of the 7th International Conference on Internet Multimedia Computing and Service, Hunan, China, 19–21 August 2015; ACM: New York, NY, USA, 2015; p. 66. [ Google Scholar ]
  • Li, S.; Yao, Y.F.; Jing, X.Y.; Chang, H.; Gao, S.Q.; Zhang, D.; Yang, J.Y. Face recognition based on nonlinear DCT discriminant feature extraction using improved kernel DCV. IEICE Trans. Inf. Syst. 2009 , 92 , 2527–2530. [ Google Scholar ] [ CrossRef ] [ Green Version ]
  • Khan, S.A.; Ishtiaq, M.; Nazir, M.; Shaheen, M. Face recognition under varying expressions and illumination using particle swarm optimization. J. Comput. Sci. 2018 , 28 , 94–100. [ Google Scholar ] [ CrossRef ]
  • Hafez, S.F.; Selim, M.M.; Zayed, H.H. 2d face recognition system based on selected gabor filters and linear discriminant analysis lda. arXiv 2015 , arXiv:1503.03741. [ Google Scholar ]
  • Shanbhag, S.S.; Bargi, S.; Manikantan, K.; Ramachandran, S. Face recognition using wavelet transforms-based feature extraction and spatial differentiation-based pre-processing. In Proceedings of the 2014 International Conference on Science Engineering and Management Research (ICSEMR), Chennai, India, 27–29 November 2014; IEEE: Piscataway, NJ, USA, 2014; pp. 1–8. [ Google Scholar ]
  • Fan, J.; Chow, T.W. Exactly Robust Kernel Principal Component Analysis. IEEE Trans. Neural Netw. Learn. Syst. 2019 . [ Google Scholar ] [ CrossRef ] [ PubMed ] [ Green Version ]
  • Vinay, A.; Cholin, A.S.; Bhat, A.D.; Murthy, K.B.; Natarajan, S. An Efficient ORB based Face Recognition framework for Human-Robot Interaction. Procedia Comput. Sci. 2018 , 133 , 913–923. [ Google Scholar ]
  • Lu, J.; Plataniotis, K.N.; Venetsanopoulos, A.N. Face recognition using kernel direct discriminant analysis algorithms. IEEE Trans. Neural Netw. 2003 , 14 , 117–126. [ Google Scholar ] [ PubMed ] [ Green Version ]
  • Yang, W.J.; Chen, Y.C.; Chung, P.C.; Yang, J.F. Multi-feature shape regression for face alignment. EURASIP J. Adv. Signal Process. 2018 , 2018 , 51. [ Google Scholar ] [ CrossRef ] [ Green Version ]
  • Ouanan, H.; Ouanan, M.; Aksasse, B. Non-linear dictionary representation of deep features for face recognition from a single sample per person. Procedia Comput. Sci. 2018 , 127 , 114–122. [ Google Scholar ] [ CrossRef ]
  • Fathima, A.A.; Ajitha, S.; Vaidehi, V.; Hemalatha, M.; Karthigaiveni, R.; Kumar, R. Hybrid approach for face recognition combining Gabor Wavelet and Linear Discriminant Analysis. In Proceedings of the 2015 IEEE International Conference on Computer Graphics, Vision and Information Security (CGVIS), Bhubaneswar, India, 2–3 November 2015; IEEE: Piscataway, NJ, USA, 2015; pp. 220–225. [ Google Scholar ]
  • Barkan, O.; Weill, J.; Wolf, L.; Aronowitz, H. Fast high dimensional vector multiplication face recognition. In Proceedings of the IEEE International Conference on Computer Vision, Sydney, Australia, 1–8 December 2013; pp. 1960–1967. [ Google Scholar ]
  • Juefei-Xu, F.; Luu, K.; Savvides, M. Spartans: Single-sample periocular-based alignment-robust recognition technique applied to non-frontal scenarios. IEEE Trans. Image Process. 2015 , 24 , 4780–4795. [ Google Scholar ] [ CrossRef ]
  • Yan, Y.; Wang, H.; Suter, D. Multi-subregion based correlation filter bank for robust face recognition. Pattern Recognit. 2014 , 47 , 3487–3501. [ Google Scholar ] [ CrossRef ] [ Green Version ]
  • Ding, C.; Tao, D. Robust face recognition via multimodal deep face representation. IEEE Trans. Multimed. 2015 , 17 , 2049–2058. [ Google Scholar ] [ CrossRef ]
  • Sharma, R.; Patterh, M.S. A new pose invariant face recognition system using PCA and ANFIS. Optik 2015 , 126 , 3483–3487. [ Google Scholar ] [ CrossRef ]
  • Moussa, M.; Hmila, M.; Douik, A. A Novel Face Recognition Approach Based on Genetic Algorithm Optimization. Stud. Inform. Control 2018 , 27 , 127–134. [ Google Scholar ] [ CrossRef ]
  • Mian, A.; Bennamoun, M.; Owens, R. An efficient multimodal 2D-3D hybrid approach to automatic face recognition. IEEE Trans. Pattern Anal. Mach. Intell. 2007 , 29 , 1927–1943. [ Google Scholar ] [ CrossRef ] [ PubMed ]
  • Cho, H.; Roberts, R.; Jung, B.; Choi, O.; Moon, S. An efficient hybrid face recognition algorithm using PCA and GABOR wavelets. Int. J. Adv. Robot. Syst. 2014 , 11 , 59. [ Google Scholar ] [ CrossRef ]
  • Guru, D.S.; Suraj, M.G.; Manjunath, S. Fusion of covariance matrices of PCA and FLD. Pattern Recognit. Lett. 2011 , 32 , 432–440. [ Google Scholar ] [ CrossRef ]
  • Sing, J.K.; Chowdhury, S.; Basu, D.K.; Nasipuri, M. An improved hybrid approach to face recognition by fusing local and global discriminant features. Int. J. Biom. 2012 , 4 , 144–164. [ Google Scholar ] [ CrossRef ]
  • Kamencay, P.; Zachariasova, M.; Hudec, R.; Jarina, R.; Benco, M.; Hlubik, J. A novel approach to face recognition using image segmentation based on spca-knn method. Radioengineering 2013 , 22 , 92–99. [ Google Scholar ]
  • Sun, J.; Fu, Y.; Li, S.; He, J.; Xu, C.; Tan, L. Sequential Human Activity Recognition Based on Deep Convolutional Network and Extreme Learning Machine Using Wearable Sensors. J. Sens. 2018 , 2018 , 10. [ Google Scholar ] [ CrossRef ]
  • Soltanpour, S.; Boufama, B.; Wu, Q.J. A survey of local feature methods for 3D face recognition. Pattern Recognit. 2017 , 72 , 391–406. [ Google Scholar ] [ CrossRef ]
  • Sharma, G.; ul Hussain, S.; Jurie, F. Local higher-order statistics (LHS) for texture categorization and facial analysis. In European Conference on Computer Vision ; Springer: Berlin/Heidelberg, Germany, 2012; pp. 1–12. [ Google Scholar ]
  • Zhang, J.; Marszałek, M.; Lazebnik, S.; Schmid, C. Local features and kernels for classification of texture and object categories: A comprehensive study. Int. J. Comput. Vis. 2007 , 73 , 213–238. [ Google Scholar ] [ CrossRef ] [ Green Version ]
  • Leonard, I.; Alfalou, A.; Brosseau, C. Spectral optimized asymmetric segmented phase-only correlation filter. Appl. Opt. 2012 , 51 , 2638–2650. [ Google Scholar ] [ CrossRef ] [ Green Version ]
  • Shen, L.; Bai, L.; Ji, Z. A svm face recognition method based on optimized gabor features. In International Conference on Advances in Visual Information Systems ; Springer: Berlin/Heidelberg, Germany, 2007; pp. 165–174. [ Google Scholar ]
  • Pratima, D.; Nimmakanti, N. Pattern Recognition Algorithms for Cluster Identification Problem. Int. J. Comput. Sci. Inform. 2012 , 1 , 2231–5292. [ Google Scholar ]
  • Zhang, C.; Prasanna, V. Frequency domain acceleration of convolutional neural networks on CPU-FPGA shared memory system. In Proceedings of the 2017 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, Monterey, CA, USA, 22–24 February 2017; ACM: New York, NY, USA, 2017; pp. 35–44. [ Google Scholar ]
  • Nguyen, D.T.; Pham, T.D.; Lee, M.B.; Park, K.R. Visible-Light Camera Sensor-Based Presentation Attack Detection for Face Recognition by Combining Spatial and Temporal Information. Sensors 2019 , 19 , 410. [ Google Scholar ] [ CrossRef ] [ PubMed ] [ Green Version ]
  • Parkhi, O.M.; Vedaldi, A.; Zisserman, A. Deep face recognition. In Proceedings of the BMVC 2015—British Machine Vision Conference, Swansea, UK, 7–10 September 2015. [ Google Scholar ]
  • Wen, Y.; Zhang, K.; Li, Z.; Qiao, Y. A discriminative feature learning approach for deep face recognition. In European Conference on Computer Vision ; Springer: Berlin/Heidelberg, Germany, 2016; pp. 499–515. [ Google Scholar ]
  • Passalis, N.; Tefas, A. Spatial bag of features learning for large scale face image retrieval. In INNS Conference on Big Data ; Springer: Berlin/Heidelberg, Germany, 2016; pp. 8–17. [ Google Scholar ]
  • Liu, W.; Wen, Y.; Yu, Z.; Li, M.; Raj, B.; Song, L. Sphereface: Deep hypersphere embedding for face recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 212–220. [ Google Scholar ]
  • Amato, G.; Falchi, F.; Gennaro, C.; Massoli, F.V.; Passalis, N.; Tefas, A.; Vairo, C. Face Verification and Recognition for Digital Forensics and Information Security. In Proceedings of the 2019 7th International Symposium on Digital Forensics and Security (ISDFS), Barcelos, Portugal, 10–12 June 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 1–6. [ Google Scholar ]
  • Taigman, Y.; Yang, M.; Ranzato, M.A.; Wolf, L. Deepface: Closing the gap to human-level performance in face verification. In Proceedings of the IEEE conference on computer vision and pattern recognition, Washington, DC, USA, 23–28 June 2014; pp. 1701–1708. [ Google Scholar ]
  • Ma, Z.; Ding, Y.; Li, B.; Yuan, X. Deep CNNs with Robust LBP Guiding Pooling for Face Recognition. Sensors 2018 , 18 , 3876. [ Google Scholar ] [ CrossRef ] [ PubMed ] [ Green Version ]
  • Koo, J.; Cho, S.; Baek, N.; Kim, M.; Park, K. CNN-Based Multimodal Human Recognition in Surveillance Environments. Sensors 2018 , 18 , 3040. [ Google Scholar ] [ CrossRef ] [ Green Version ]
  • Cho, S.; Baek, N.; Kim, M.; Koo, J.; Kim, J.; Park, K. Detection in Nighttime Images Using Visible-Light Camera Sensors with Two-Step Faster Region-Based Convolutional Neural Network. Sensors 2018 , 18 , 2995. [ Google Scholar ] [ CrossRef ] [ Green Version ]
  • Koshy, R.; Mahmood, A. Optimizing Deep CNN Architectures for Face Liveness Detection. Entropy 2019 , 21 , 423. [ Google Scholar ] [ CrossRef ] [ Green Version ]
  • Elmahmudi, A.; Ugail, H. Deep face recognition using imperfect facial data. Future Gener. Comput. Syst. 2019 , 99 , 213–225. [ Google Scholar ] [ CrossRef ]
  • Seibold, C.; Samek, W.; Hilsmann, A.; Eisert, P. Accurate and robust neural networks for security related applications exampled by face morphing attacks. arXiv 2018 , arXiv:1806.04265. [ Google Scholar ]
  • Yim, J.; Jung, H.; Yoo, B.; Choi, C.; Park, D.; Kim, J. Rotating your face using multi-task deep neural network. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015; pp. 676–684. [ Google Scholar ]
  • Bajrami, X.; Gashi, B.; Murturi, I. Face recognition performance using linear discriminant analysis and deep neural networks. Int. J. Appl. Pattern Recognit. 2018 , 5 , 240–250. [ Google Scholar ] [ CrossRef ]
  • Gourier, N.; Hall, D.; Crowley, J.L. Estimating Face Orientation from Robust Detection of Salient Facial Structures. Available online: venus.inrialpes.fr/jlc/papers/Pointing04-Gourier.pdf (accessed on 15 December 2019).
  • Gonzalez-Sosa, E.; Fierrez, J.; Vera-Rodriguez, R.; Alonso-Fernandez, F. Facial soft biometrics for recognition in the wild: Recent works, annotation, and COTS evaluation. IEEE Trans. Inf. Forensics Secur. 2018 , 13 , 2001–2014. [ Google Scholar ] [ CrossRef ]
  • Boukamcha, H.; Hallek, M.; Smach, F.; Atri, M. Automatic landmark detection and 3D Face data extraction. J. Comput. Sci. 2017 , 21 , 340–348. [ Google Scholar ] [ CrossRef ]
  • Ouerhani, Y.; Jridi, M.; Alfalou, A.; Brosseau, C. Graphics processor unit implementation of correlation technique using a segmented phase only composite filter. Opt. Commun. 2013 , 289 , 33–44. [ Google Scholar ] [ CrossRef ] [ Green Version ]
  • Su, C.; Yan, Y.; Chen, S.; Wang, H. An efficient deep neural networks training framework for robust face recognition. In Proceedings of the 2017 IEEE International Conference on Image Processing (ICIP), Beijing, China, 17–20 September 2017; pp. 3800–3804. [ Google Scholar ]
  • Coşkun, M.; Uçar, A.; Yildirim, Ö.; Demir, Y. Face recognition based on convolutional neural network. In Proceedings of the 2017 International Conference on Modern Electrical and Energy Systems (MEES), Kremenchuk, Ukraine, 15–17 November 2017; IEEE: Piscataway, NJ, USA, 2017; pp. 376–379. [ Google Scholar ]


Author | Technique Used | Database(s) | Matching | Limitation | Advantage | Result(s)

Local Appearance-Based Techniques

Khoi et al. [ ] | LBP | TDF / CF1999 / LFW | MAP | Skewness in face image | Robust feature in frontal face | 5% / 13.03% / 90.95%
Xi et al. [ ] | LBPNet | FERET / LFW | Cosine similarity | Complexities of CNN | High recognition accuracy | 97.80% / 94.04%
Khoi et al. [ ] | PLBP | TDF / CF / LFW | MAP | Skewness in face image | Robust feature in frontal face | 5.50% / 9.70% / 91.97%
Laure et al. [ ] | LBP and KNN | LFW / CMU-PIE | KNN | Illumination conditions | Robust | 85.71% / 99.26%
Bonnen et al. [ ] | MRF and MLBP | AR (Scream) / FERET (wearing sunglasses) | Cosine similarity | Landmark extraction fails or is not ideal | Robust to changes in facial expression | 86.10% / 95%
Ren et al. [ ] | Relaxed LTP | CMU-PIE / Yale B | Chi-square distance | Noise level | Superior performance compared with LBP, LTP | 95.75% / 98.71%
Hussain et al. [ ] | LPQ | FERET / LFW | Cosine similarity | Lot of discriminative information | Robust to illumination variations | 99.20% / 75.30%
Karaaba et al. [ ] | HOG and MMD | FERET / LFW | MMD/MLPD | Low recognition accuracy | Aligning difficulties | 68.59% / 23.49%
Arigbabu et al. [ ] | PHOG and SVM | LFW | SVM | Complexity and time of computation | Head pose variation | 88.50%
Leonard et al. [ ] | VLC correlator | PHPID | ASPOF | The low number of reference images used | Robustness to noise | 92%
Napoléon et al. [ ] | LBP and VLC | YaleB / YaleB Extended | POF | Illumination | Rotation + translation | 98.40% / 95.80%
Heflin et al. [ ] | Correlation filter | LFW / PHPID | PSR | Some pre-processing steps | More effort on the eye localization stage | 39.48%
Zhu et al. [ ] | PCA–FCF | CMU-PIE / FRGC2.0 | Correlation filter | Uses only linear methods | Occlusion-insensitive | 96.60% / 91.92%
Seo et al. [ ] | LARK + PCA | LFW | Cosine similarity | Face detection | Reducing computational complexity | 78.90%
Ghorbel et al. [ ] | VLC + DoG | FERET | PCE | Low recognition rate | Robustness | 81.51%
Ghorbel et al. [ ] | uLBP + DoG | FERET | Chi-square distance | Robustness | Processing time | 93.39%
Ouerhani et al. [ ] | VLC | PHPID | PCE | Power | Processing time | 77%
Lenc et al. [ ] | SIFT | FERET / AR / LFW | A posteriori probability | Still far from perfect | Sufficiently robust on lower-quality real data | 97.30% / 95.80% / 98.04%
Du et al. [ ] | SURF | LFW | FLANN distance | Processing time | Robustness and distinctiveness | 95.60%
Vinay et al. [ ] | SURF + SIFT | LFW / Face94 | FLANN distance | Processing time | Robust in unconstrained scenarios | 78.86% / 96.67%
Calonder et al. [ ] | BRIEF | _ | KNN | Low recognition rate | Low processing time | 48%
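Many of the local-appearance rows above share one matching recipe: compute a local texture descriptor (LBP, uLBP, LPQ) per image cell, concatenate the cell histograms, and compare probe and gallery histograms with a chi-square or cosine distance. The sketch below is a minimal illustration of that pipeline, assuming scikit-image is available; the grid size and LBP parameters are illustrative choices, not those of any one surveyed paper.

```python
# Minimal sketch of the LBP-histogram + chi-square matching pipeline shared by
# several table entries (e.g., uLBP with chi-square distance). Parameter
# choices (P=8, R=1, 'uniform' codes, an 8x8 cell grid) are illustrative.
import numpy as np
from skimage.feature import local_binary_pattern  # requires scikit-image

def lbp_histogram(face, grid=(8, 8), P=8, R=1):
    """Concatenated uniform-LBP histograms over a grid of cells."""
    codes = local_binary_pattern(face, P, R, method="uniform")
    n_bins = P + 2                      # P+1 uniform codes plus one catch-all bin
    gh, gw = grid
    ch, cw = face.shape[0] // gh, face.shape[1] // gw
    hists = []
    for i in range(gh):
        for j in range(gw):
            cell = codes[i * ch:(i + 1) * ch, j * cw:(j + 1) * cw]
            h, _ = np.histogram(cell, bins=n_bins, range=(0, n_bins), density=True)
            hists.append(h)
    return np.concatenate(hists)

def chi_square(h1, h2, eps=1e-10):
    """Chi-square distance between normalized histograms (smaller = closer)."""
    return 0.5 * np.sum((h1 - h2) ** 2 / (h1 + h2 + eps))

def identify(probe, gallery):
    """Assign the probe face to the gallery identity with the smallest distance."""
    hp = lbp_histogram(probe)
    return min(gallery, key=lambda name: chi_square(hp, lbp_histogram(gallery[name])))
```

A cosine-similarity variant, as used in several rows, would simply replace chi_square with a dot product over L2-normalized histograms.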
Author | Technique Used | Database(s) | Matching | Limitation | Advantage | Result(s)

Linear Techniques

Seo et al. [ ] | LARK and PCA | LFW | L2 distance | Detection accuracy | Reducing computational complexity | 85.10%
Annalakshmi et al. [ ] | ICA and LDA | LFW | Bayesian classifier | Sensitivity | Good accuracy | 88%
Annalakshmi et al. [ ] | PCA and LDA | LFW | Bayesian classifier | Sensitivity | Specificity | 59%
Hussain et al. [ ] | LQP and Gabor | FERET / LFW | Cosine similarity | Lot of discriminative information | Robust to illumination variations | 99.2% / 75.3%
Gowda et al. [ ] | LPQ and LDA | MEPCO | SVM | Computation time | Good accuracy | 99.13%
Z. Cui et al. [ ] | BoW | AR / ORL / FERET | ASM | Occlusions | Robust | 99.43% / 99.50% / 82.30%
Khan et al. [ ] | PSO and DWT | CK / MMI / JAFFE | Euclidean distance | Noise | Robust to illumination | 98.60% / 95.50% / 98.80%
Huang et al. [ ] | 2D-DWT | FERET / LFW | KNN | Pose | Frontal or near-frontal facial images | 90.63% / 97.10%
Perlibakas [ ] | PCA and Gabor filter | FERET | Cosine metric | Precision | Pose | 87.77%
Hafez et al. [ ] | Gabor filter and LDA | ORL / C. YaleB | 2DNCC | Pose | Good recognition performance | 98.33% / 99.33%
Sufyanu et al. [ ] | DCT | ORL / Yale | NCC | High memory | Controlled and uncontrolled databases | 93.40%
Shanbhag et al. [ ] | DWT and BPSO | _ | _ | Rotation | Significant reduction in the number of features | 88.44%
Ghorbel et al. [ ] | Eigenfaces and DoG filter | FERET | Chi-square distance | Processing time | Reduce the representation | 84.26%
Zhang et al. [ ] | PCA and FFT | YALE | SVM | Complexity | Discrimination | 93.42%
Zhang et al. [ ] | PCA | YALE | SVM | Recognition rate | Reduce the dimensionality | 84.21%
Fan et al. [ ] | RKPCA | MNIST / ORL | RBF kernel | Complexity | Robust to sparse noises | _
Vinay et al. [ ] | ORB and KPCA | ORL | FLANN matching | Processing time | Robust | 87.30%
Vinay et al. [ ] | SURF and KPCA | ORL | FLANN matching | Processing time | Reduce the dimensionality | 80.34%
Vinay et al. [ ] | SIFT and KPCA | ORL | FLANN matching | Low recognition rate | Complexity | 69.20%
Lu et al. [ ] | KPCA and GDA | UMIST face | SVM | High error rate | Excellent performance | 48%
Yang et al. [ ] | PCA and MSR | HELEN face | ESR | Complexity | Utilizes color, gradient, and regional information | 98.00%
Yang et al. [ ] | LDA and MSR | FRGC | ESR | Low performances | Utilizes color, gradient, and regional information | 90.75%
Ouanan et al. [ ] | FDDL | AR | CNN | Occlusion | Orientations, expressions | 98.00%
Vankayalapati and Kyamakya [ ] | CNN | ORL | _ | Poses | High recognition rate | 95%
Devi et al. [ ] | 2FNN | ORL | _ | Complexity | Low error rate | 98.5%
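The PCA and Eigenfaces entries above all reduce to the same linear recipe: project mean-centered face vectors onto the leading principal components and match in the reduced space. Below is a minimal sketch of that recipe; the number of components and the nearest-neighbor matching rule are illustrative choices.

```python
# Minimal eigenfaces sketch corresponding to the PCA rows above. X holds
# vectorized training faces, one per row; k = 50 is an illustrative choice.
import numpy as np

def fit_eigenfaces(X, k=50):
    mean = X.mean(axis=0)
    # Rows of Vt are the principal axes ("eigenfaces"); keep the top k.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:k]

def project(vec, mean, components):
    return components @ (vec - mean)

def match(probe, gallery, labels, mean, components):
    """Nearest neighbor in eigenface space (Euclidean distance)."""
    p = project(probe, mean, components)
    g = np.stack([project(v, mean, components) for v in gallery])
    return labels[int(np.argmin(np.linalg.norm(g - p, axis=1)))]
```

The LDA/Fisherface variants listed in the table differ mainly in the projection step, which additionally maximizes between-class relative to within-class scatter rather than raw variance.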
Author | Technique Used | Database(s) | Matching | Limitation | Advantage | Result(s)

Hybrid Techniques

Fathima et al. [ ] | GW-LDA | AT&T / FACES94 / MITINDIA | k-NN | High processing time | Illumination invariant and reduces the dimensionality | 88% / 94.02% / 88.12%
Barkan et al. [ ] | OCLBP, LDA, and WCCN | LFW | WCCN | _ | Reduce the dimensionality | 87.85%
Juefei-Xu et al. [ ] | ACF and WLBP | LFW | _ | Complexity | Pose conditions | 89.69%
Simonyan et al. [ ] | Fisher vectors + SIFT | LFW | Mahalanobis matrix | Single feature type | Robust | 87.47%
Sharma et al. [ ] | PCA–ANFIS | ORL | ANFIS | Sensitivity-specificity | _ | 96.66%
Sharma et al. [ ] | ICA–ANFIS | ORL | ANFIS | _ | Pose conditions | 71.30%
Sharma et al. [ ] | LDA–ANFIS | ORL | ANFIS | _ | _ | 68%
Ojala et al. [ ] | DCT–PCA | ORL / UMIST / YALE | Euclidean distance | Complexity | Reduce the dimensionality | 92.62% / 99.40% / 95.50%
Mian et al. [ ] | Hotelling transform, SIFT, and ICP | FRGC | ICP | Processing time | Facial expressions | 99.74%
Cho et al. [ ] | PCA–LGBPHS and PCA–Gabor wavelets | Extended Yale Face | Bhattacharyya distance | Illumination condition | Complexity | 95%
Sing et al. [ ] | PCA–FLD | CMU / FERET / AR | SVM | Robustness | Pose, illumination, and expression | 71.98% / 94.73% / 68.65%
Kamencay et al. [ ] | SPCA-KNN | ESSEX | KNN | Processing time | Expression variation | 96.80%
Sun et al. [ ] | CNN–LSTM–ELM | OPPORTUNITY | LSTM/ELM | High processing time | Automatically learn feature representations | 90.60%
Ding et al. [ ] | CNNs and SAE | LFW | _ | Complexity | High recognition rate | 99%
Approaches | Databases Used | Challenges Handled

Local appearance-based techniques | TDF, CF1999, LFW, FERET, CMU-PIE, AR, Yale B, PHPID, YaleB Extended, FRGC2.0, Face94 | Various lighting conditions [ ], facial expressions [ ], low resolution
Linear techniques | LFW, FERET, MEPCO, AR, ORL, CK, MMI, JAFFE, C. Yale B, Yale, MNIST, UMIST face, HELEN face, FRGC | Poses [ ], scaling, facial expressions
Hybrid techniques | AT&T, FACES94, MITINDIA, LFW, ORL, UMIST, YALE, FRGC, Extended Yale, CMU, FERET, AR, ESSEX | _

Kortli, Y.; Jridi, M.; Al Falou, A.; Atri, M. Face Recognition Systems: A Survey. Sensors 2020 , 20 , 342. https://doi.org/10.3390/s20020342

Deep learning based single sample face recognition: a survey

  • Published: 05 August 2022
  • Volume 56 , pages 2723–2748, ( 2023 )


  • Fan Liu (ORCID: orcid.org/0000-0001-8746-9845),
  • Delong Chen,
  • Fei Wang,
  • Zewen Li &
  • Feng Xu


Face recognition has long been an active research area in the field of artificial intelligence, particularly since the rise of deep learning in recent years. In some practical situations, each identity has only a single sample available for training. Face recognition under this situation is referred to as single sample face recognition and poses significant challenges to the effective training of deep models. Therefore, in recent years, researchers have attempted to unleash more of the potential of deep learning and improve model recognition performance in the single sample situation. While several comprehensive surveys have been conducted on traditional single sample face recognition approaches, emerging deep learning based methods are rarely involved in these reviews. Accordingly, we focus on the deep learning based methods in this paper, classifying them into virtual sample methods and generic learning methods. In the former category, virtual images or virtual features are generated to benefit the training of the deep model. In the latter, additional multi-sample generic sets are used. There are three types of generic learning methods: combining traditional methods and deep features, improving the loss function, and improving the network structure, all of which are covered in our analysis. Moreover, we review face datasets that have been commonly used for evaluating single sample face recognition models and go on to compare the results of different types of models. Additionally, we discuss problems with existing single sample face recognition methods, including identity information preservation in virtual sample methods and domain adaptation in generic learning methods. Furthermore, we regard developing unsupervised methods as a promising future direction, and point out the semantic gap as an important issue that needs further consideration.
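To make the abstract's distinction concrete, the sketch below illustrates the virtual sample idea: a single gallery image per identity is expanded into several synthetic training images. The surveyed methods use learned generative models for this step; the simple flip/shift/brightness transforms here are only a hypothetical stand-in to show the data flow.

```python
# Illustrative sketch of virtual sample generation for single sample face
# recognition: one real image per identity -> several virtual training images.
# Real methods in the survey use generative models; these geometric and
# photometric transforms are placeholders.
import numpy as np

def virtual_samples(face, n_shifts=2, max_shift=3, brightness=(0.8, 1.2), seed=0):
    rng = np.random.default_rng(seed)
    samples = [face, np.fliplr(face)]          # mirrored face as one virtual view
    for _ in range(n_shifts):                  # small translations
        dy, dx = rng.integers(-max_shift, max_shift + 1, size=2)
        samples.append(np.roll(np.roll(face, dy, axis=0), dx, axis=1))
    samples.append(np.clip(face * rng.uniform(*brightness), 0, 255))  # relighting
    return samples

# Usage: expand a one-sample-per-identity gallery into a multi-sample one.
# gallery = {name: virtual_samples(img) for name, img in single_sample_gallery.items()}
```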




Acknowledgements

This work was partially funded by the Natural Science Foundation of Jiangsu Province under Grant No. BK20191298, the Research Fund from the Science and Technology on Underwater Vehicle Technology Laboratory (2021JCJQ-SYSJJ-LB06905), and the Water Science and Technology Project of Jiangsu Province under Grant Nos. 2021072 and 2021063.

Author information

Fan Liu and Delong Chen have contributed equally to this work.

Authors and Affiliations

College of Computer and Information, Hohai University, Nanjing, China

Fan Liu, Delong Chen, Fei Wang, Zewen Li & Feng Xu

Science and Technology on Underwater Vehicle Technology Laboratory, Harbin Engineering University, Harbin, 150001, China


Corresponding author

Correspondence to Fan Liu .

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article

Liu, F., Chen, D., Wang, F. et al. Deep learning based single sample face recognition: a survey. Artif Intell Rev 56 , 2723–2748 (2023). https://doi.org/10.1007/s10462-022-10240-2


Published : 05 August 2022

Issue Date : March 2023

DOI : https://doi.org/10.1007/s10462-022-10240-2


  • Face recognition
  • Deep learning
  • Single Sample Per Person (SSPP) problem

  • Open access
  • Published: 01 August 2024

Autistic adults have insight into their relative face recognition ability

  • Bayparvah Kaur Gehdu 1 ,
  • Clare Press 2 , 3 ,
  • Katie L. H. Gray 4 &
  • Richard Cook 5  

Scientific Reports volume 14 , Article number: 17802 ( 2024 )



The PI20 is a self-report questionnaire that assesses the presence of lifelong face recognition difficulties. The items on this scale ask respondents to assess their face recognition ability relative to the rest of the population, either explicitly or implicitly. Recent reports suggest that the PI20 scores of autistic participants exhibit little or no correlation with their performance on the Cambridge Face Memory Test—a key measure of face recognition ability. These reports are suggestive of a meta-cognitive deficit whereby autistic individuals are unable to infer whether their face recognition is impaired relative to the wider population. In the present study, however, we observed significant correlations between the PI20 scores of 77 autistic adults and their performance on two variants of the Cambridge Face Memory Test. These findings indicate that autistic individuals can infer whether their face recognition ability is impaired. Consistent with previous research, we observed a wide spread of face recognition abilities within our autistic sample. While some individuals approached ceiling levels of performance, others met the prevailing diagnostic criteria for developmental prosopagnosia. This variability showed little or no association with non-verbal intelligence, autism severity, or the presence of co-occurring alexithymia or ADHD.


Introduction

Historically, lifelong face recognition difficulties were thought to be extremely rare 1 . Over the last twenty years, however, there has been growing appreciation that ‘congenital’ or ‘developmental’ prosopagnosia is far more common than was once believed 2 , 3 , 4 , 5 . Indeed, around 2% of the general population describe lifelong face recognition problems severe enough to disrupt their daily lives 6 , 7 . The incidence of lifelong face recognition difficulties is particularly high amongst autistic individuals, many of whom experience problems when asked to identify or match faces 8 , 9 , 10 , 11 .

Increasing awareness of these difficulties has fuelled the development of tools for the identification and assessment of face recognition impairments. One well-known measure is the Cambridge Face Memory Test (CFMT) 12 , 13 , a standardized objective test of face recognition ability that was developed to identify cases of developmental prosopagnosia. On each trial (72 in total), participants are asked to identify recently learned target faces from a line-up of three options (a target and two foils). The addition of view-point changes and high-spatial frequency visual noise increases task difficulty in the later stages. The CFMT has good internal reliability 12 , 13 and correlates well with other measures of face identification 14 .

A second measure developed to aid the identification of developmental prosopagnosia is the Twenty Item Prosopagnosia Index (PI20) 15 , 16 , 17 . This self-report questionnaire was designed to provide standardized self-report evidence of face recognition difficulties, to complement diagnostic evidence obtained from objective computer-based assessments such as the CFMT. The PI20 comprises 20 statements describing face recognition experiences drawn from qualitative and quantitative descriptions of individuals with lifelong face recognition difficulties. Respondents (typically adults) rate how well each statement describes their own experiences on a 5-point scale. Scores can range from 20 to 100. A score of 65 or higher is thought to indicate the likely presence of face recognition impairment. The PI20, originally written in English, has been translated into multiple languages (e.g., Italian, Portuguese, Danish, Japanese & Mandarin) and applied in various cultural contexts 18 , 19 , 20 , 21 , 22 . The twenty items comprising the PI20 can be viewed in the supplementary materials (Table S1 ).
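To make the scoring procedure concrete, here is a minimal sketch in Python of how a PI20-style total might be computed from raw item ratings. The set of reverse-scored item numbers below is a placeholder assumption (the discussion later notes only that five of the twenty items are reverse scored, not which ones); the published scale should be consulted for the actual key.

```python
# Minimal sketch of PI20-style scoring (not the official scoring key).
# REVERSE_SCORED is a hypothetical placeholder: the paper states only
# that five of the twenty items are reverse scored, not which ones.

REVERSE_SCORED = {2, 8, 12, 15, 19}  # hypothetical item numbers

def score_pi20(responses: dict[int, int]) -> int:
    """Sum 20 items rated 1-5; reverse-scored items count as 6 - rating.

    `responses` maps item number (1-20) to a rating on the 5-point scale.
    Totals range from 20 to 100; a total of 65+ is taken to indicate the
    likely presence of face recognition impairment.
    """
    if set(responses) != set(range(1, 21)):
        raise ValueError("expected ratings for items 1-20")
    total = 0
    for item, rating in responses.items():
        if not 1 <= rating <= 5:
            raise ValueError(f"item {item}: rating {rating} outside 1-5")
        total += (6 - rating) if item in REVERSE_SCORED else rating
    return total

# Example: a respondent endorsing most difficulty statements at 4/5.
example = {item: 4 for item in range(1, 21)}
print(score_pi20(example))  # 70 = 15 items at 4 + 5 reversed items at 2
```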

The items on the PI20 ask respondents to assess their face recognition ability relative to the rest of the population, either explicitly (e.g., My face recognition ability is worse than most people; I am better than most people at putting a ‘name to a face’; I have to try harder than other people to memorize faces) or implicitly (e.g., When people change their hairstyle or wear hats, I have problems recognizing them; when I was at school, I struggled to recognize my classmates). There has been considerable debate about whether participants have the necessary insight into their relative face recognition ability to provide meaningful responses to such items 23 , 24 , 25 , 26 . However, there is now strong evidence that the PI20 scores of non-autistic participants correlate with their performance on objective measures of face recognition accuracy 15 , 16 , 17 , 27 . While participants may lack fine-grained insight into their face recognition ability (e.g., whether they fall within the 45th or 55th percentile), these findings suggest that respondents have enough insight to provide meaningful responses on the PI20; i.e., they appear to have some idea whether their face recognition is impaired or unimpaired.

This may not be true of autistic individuals, however. Minio-Paluello and colleagues 28 reported that the PI20 scores of autistic adults (N = 63) exhibited little or no correlation with their performance on the CFMT—a key objective test of face recognition ability. A similar result was described by Stantić and colleagues 10. In this study, the authors observed a non-significant correlation of r = −0.17 between the PI20 scores of 31 autistic adults and their performance on the CFMT. If found to be robust, these results have important theoretical implications: they raise the possibility that face recognition in autism may be subject to a metacognitive deficit, whereby autistic individuals are unable to infer whether (or not) their face recognition ability is impaired relative to the wider population. There is also an important substantive implication. These results suggest that the PI20 may not be suitable for screening autistic participants for face recognition difficulties. This would be a non-trivial limitation, not least because face recognition difficulties appear to be far more prevalent in the autistic population than in the non-autistic population 8, 9, 10, 11.

There are several reasons to be cautious when interpreting these findings, however. First, previous research suggests that metacognitive differences in autistic adults tend to be small and subtle, if observed at all 29. Second, the study described by Stantić et al. 10 was not designed to examine individual differences. Any conclusions about face recognition variability, and correlations therewith, are limited by the relatively small size of the study's autistic sample (N = 31). Correlation estimates obtained with small samples are notoriously unstable 30. Third, both results were obtained using the original version of the CFMT (the CFMT-O) 13. This version of the CFMT is now easily accessible online; it is hosted by several websites, and various prosopagnosia forums and pop-science resources link to this test. Consequently, many individuals with face recognition difficulties have attempted the CFMT-O on multiple occasions 31. Where practice benefits arise, participants may achieve higher scores than might be expected based on their PI20 score.

In light of the foregoing observations, we were keen to re-examine the relationship between the PI20 scores and CFMT performance of autistic individuals. To this end, a group of 77 autistic participants completed the PI20 questionnaire and two variants of the CFMT: the original (CFMT-O) 13 and the Australian (CFMT-A) 12 versions. The CFMT-O and CFMT-A share an identical format and differ only in terms of the (White male) facial identities used. Unlike the CFMT-O, however, the CFMT-A is not widely available to members of the general public.

It has been noted previously that the face recognition abilities of autistic participants vary widely 8 , 9 , 10 , 11 , 28 . At present, however, little is known about the nature and origin of this variability. Some of this variance might be explained by differences in autism severity 32 . However, performance on face processing tasks may also be affected by differences in non-verbal intelligence 33 and the presence of co-occurring conditions, notably alexithymia 34 , 35 and attention-deficit-hyperactivity disorder (ADHD) 36 , 37 . We therefore took this opportunity to explore which of these factors—if any—predicted face recognition performance in our autistic sample.

Participants

Seventy-seven participants with a clinical diagnosis of autism (M_age = 35.99 years; SD_age = 11.60 years) were recruited via www.ukautismresearch.org. All autistic participants had received an autism diagnosis (e.g., Autism Spectrum Disorder, Asperger's Syndrome) from a clinical professional (General Practitioner, Neurologist, Psychiatrist or Clinical Psychologist) based in the U.K. All participants in the autistic group also reached cut-off (a score of 32) on the Autism Spectrum Quotient (AQ) 38. The mean AQ score of the participants was 42.45 (SD = 4.17). To be eligible, participants also had to be aged between 18 and 60, speak English as a first language and have normal or corrected-to-normal visual acuity. All participants were required to be current U.K. residents.

Of the 16 individuals who described their sex as male, 13 described their gender identity as male, 2 identified as non-binary and 1 identified as female. Of the 61 individuals who described their sex as female, 48 described their gender identity as female, 9 identified as non-binary, 1 as male and 3 preferred not to say. Seventy-six of the 77 participants identified as White (73: White-British, 1: White Irish, 2: White-Other). One participant did not specify their ethnicity. Sixty-eight of the participants were right-handed, while 9 were left-handed.

Data collection for the study took place between June and August 2023. At the outset, our aim was (i) to recruit as many participants as possible during this period, and (ii) to stop data collection at the end of August provided a minimum sample size of N = 62 had been reached. A sample of N = 62 yields a 90% chance of detecting a correlation of r = 0.40 between PI20 scores and CFMT performance. Our final sample (N = 77) comfortably exceeded this minimum.
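The stated power figure can be reproduced with the standard Fisher z approximation for a test of a single correlation. The sketch below is one way to do this in Python; it recovers the ~90% power quoted here for N = 62, and also the ~38% miss rate at N = 31 that the discussion later attributes to Stantić et al.'s sample.

```python
from math import atanh, sqrt
from statistics import NormalDist

def correlation_power(r: float, n: int, alpha: float = 0.05) -> float:
    """Approximate power of a two-tailed test of H0: rho = 0.

    Under the Fisher z transform, atanh(r_obs) * sqrt(n - 3) is roughly
    standard normal when rho = 0, and shifted by atanh(r) otherwise.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)        # 1.96 for alpha = 0.05
    noncentrality = atanh(r) * sqrt(n - 3)   # expected test statistic
    return z.cdf(noncentrality - z_crit)

print(round(correlation_power(0.40, 62), 2))  # ~0.90, as stated
print(round(correlation_power(0.40, 31), 2))  # ~0.61, i.e. a ~38% miss rate
```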

Ethical clearance was granted by the Departmental Ethics Committee for Psychological Sciences, Birkbeck, University of London and the experiment was conducted in line with the ethical guidelines laid down in the 6th (2008) Declaration of Helsinki. All participants gave informed consent before taking part.

Measures

The principal aim of the study was to elucidate the relationship between participants' PI20 scores and their performance on the CFMT. To this end, all participants completed the PI20 questionnaire 16 and two versions of the CFMT: the CFMT-O 13 and the CFMT-A 12. All participants completed the PI20 before attempting the CFMTs. Participants also completed the AQ to confirm their eligibility for the study. In addition to these measures, all participants completed a self-report measure of alexithymia severity: the Twenty-item Toronto Alexithymia Scale (TAS20) 39, 40; a self-report measure of ADHD traits: the Adult ADHD Self-Report Scale (ASRS) 41; and a matrix reasoning task (MRT) to assess their non-verbal intelligence.

The TAS20 comprises 20 statements that relate to one’s ability to describe and identify emotions and interoceptive sensations. Respondents indicate to what extent each statement applies to them on a 5-point scale. Scores can range from 20 to 100, with higher scores indicative of more alexithymic traits. A score of 61 or higher is thought to index clinically significant levels of alexithymia. The TAS20 has good psychometric properties 39 and is widely used to assess the presence of alexithymia in autistic and non-autistic individuals 34 .

The ASRS is a self-report questionnaire that assesses the presence of traits associated with inattention, hyperactivity, and impulsivity. The ASRS consists of two parts: Part A is a 6-item screener that has been shown to effectively discriminate clinical cases of adult ADHD from non-cases 42 . Each response is scored as either 0 or 1, thus screener scores can range from 0 to 6. A score of 4 or above is thought to be associated with clinically significant levels of ADHD traits. Part B consists of 12 follow-up items that can be used to probe symptomology. Part B was not employed here.

The MRT employed consists of forty items selected from the Matrix Reasoning Item Bank 43. Participants were given 30 s to complete each puzzle by selecting the correct answer from 4 options. Participants responded using keyboard number keys (1–4), were given a 5-s warning before the end of each trial, and received no feedback. Each participant attempted all forty items. Participants had to complete 3 practice trials correctly before beginning the test. We have employed this measure in previous studies of social perception in autism 8, 35. Based on a sample of 100 non-autistic adults (M_age = 34.90; SD_age = 10.16), we estimate the test–retest reliability of this measure to be r_p = 0.727 (see Supplementary Material). All data reported here were collected online. Both versions of the CFMT and the matrix reasoning test were administered via Gorilla Experiment Builder 44.

Statistical procedures

The correlational analyses described below (all α = 0.05, two-tailed) were conducted using Pearson's r (r_p) and Spearman's rho (r_s). The comparison of autistic subgroups was assessed using independent samples t-tests (α = 0.05, two-tailed). For each t-test we also provide the associated Bayes factor (BF), calculated in JASP 45 with the default prior width. We interpret BFs (reported as BF_01) of less than 3.0 as anecdotal evidence for the null hypothesis, and BFs of greater than 3.0 as substantial evidence for the null hypothesis 46. The data supporting all the analyses described are available via the Open Science Framework (https://osf.io/tesk5/).
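As an illustration of this correlational pipeline, the following sketch computes both coefficients with SciPy. The data here are simulated stand-ins; the real scores are available from the OSF repository linked above.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Simulated stand-ins for the real scores (available at https://osf.io/tesk5/):
# 77 PI20 totals and CFMT scores constructed to be negatively related.
pi20 = rng.uniform(20, 100, size=77)
cfmt = 90 - 0.3 * pi20 + rng.normal(0, 10, size=77)

r_p, p_p = stats.pearsonr(pi20, cfmt)    # Pearson's r, two-tailed p
r_s, p_s = stats.spearmanr(pi20, cfmt)   # Spearman's rho on ranks

print(f"r_p = {r_p:.3f} (p = {p_p:.4g})")
print(f"r_s = {r_s:.3f} (p = {p_s:.4g})")
```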

Results

The mean scores obtained for each measure are shown in Table 1. As expected, we saw a strong correlation between performance on the CFMT-O and CFMT-A [N = 77, r_p = 0.744, p < 0.001], underscoring the good psychometric properties of our two dependent measures. We also observed a number of significant correlations between our predictor variables (Table 1). Reassuringly, several of these relationships are predicted by the existing literature 34, 47, 48, including the AQ-TAS20 correlation [N = 77, r_p = 0.526, p < 0.001] and the AQ-ASRS correlation [N = 77, r_p = 0.322, p = 0.004]. The mean PI20 score (M = 62.43) accords well with the mean PI20 score described by Stantić and colleagues 10 (M = 63.30) but is a little higher than that reported by Minio-Paluello and colleagues 28 (M = 55.5).

Does the autistic sample exhibit poor face recognition?

The present study had two principal aims: first, to establish whether or not the PI20 scores of autistic adults correlate with their CFMT performance; and second, to explore whether differences in non-verbal intelligence and the presence of co-occurring conditions (alexithymia and ADHD) account for the enormous variability in face recognition ability seen in the autistic population. Thus, the focus of our investigation is the variability in face recognition performance observed within the autistic sample.

At the outset of our analyses, however, we first sought to evaluate the overall performance of the autistic sample on the CFMT-O (M = 67.95, SD = 15.80) and CFMT-A (M = 70.67, SD = 15.22). For this purpose, we employed comparison data reported by Tsantani et al. 17 (Fig. 1a). These data were obtained from 238 non-autistic individuals (131 females, 104 males, 3 non-binary; M_age = 36.56, SD_age = 11.72), who completed online versions of the CFMT-O (M = 73.96, SD = 13.77) and CFMT-A (M = 75.37, SD = 12.48) under similar conditions. The participants in this sample were recruited via Prolific (www.prolific.com). Thirteen of the 238 participants (5.46%) reached the PI20 cut-off score of 65 (M = 44.85, SD = 10.70).

Figure 1. (a) Mean scores on the CFMT-O and CFMT-A for the autistic sample. (b) Mean scores on the CFMT-O and CFMT-A for those autistic participants who reached the PI20 cut-off score (high-scorers) and those who did not (low-scorers). The non-autistic comparison data illustrated in both panels are taken from Tsantani et al. 17. **p ≤ 0.01, ***p ≤ 0.001. Error bars denote ±1 SD.

As expected, the scores of the autistic participants in our sample were significantly below those seen in this comparison sample, both for the CFMT-O [t(313) = 3.207, d = 0.420, p = 0.001, BF_01 = 0.057] and the CFMT-A [t(110.97) = 2.453, p = 0.016, d = 0.356, BF_01 = 0.221]. Note, for this latter comparison it was necessary to correct the degrees of freedom because the variance in our autistic sample was greater than that seen in the non-autistic comparison data [F(1, 313) = 5.387, p = 0.021]. The fact that the CFMT scores of our autistic sample tended to be lower than those of the non-autistic comparison sample accords well with the existing literature 8, 9, 10, 11. This finding suggests that, in terms of face recognition ability, our autistic sample is broadly comparable with autistic samples described elsewhere.
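These comparisons can be checked directly from the summary statistics reported above. A sketch using SciPy's ttest_ind_from_stats, with a pooled-variance test for the CFMT-O comparison and Welch's unequal-variance correction for the CFMT-A comparison (which reproduces the fractional degrees of freedom quoted above):

```python
from scipy.stats import ttest_ind_from_stats

# CFMT-O: Tsantani et al.'s comparison sample (n = 238) vs. the autistic
# sample (n = 77), pooled-variance Student t-test -> t(313) ~ 3.21.
student = ttest_ind_from_stats(mean1=73.96, std1=13.77, nobs1=238,
                               mean2=67.95, std2=15.80, nobs2=77,
                               equal_var=True)

# CFMT-A: the autistic sample's variance is larger, so Welch's correction
# is applied (equal_var=False) -> t ~ 2.45 on ~111 degrees of freedom.
welch = ttest_ind_from_stats(mean1=75.37, std1=12.48, nobs1=238,
                             mean2=70.67, std2=15.22, nobs2=77,
                             equal_var=False)

print(student)  # statistic ~ 3.21, p ~ 0.001
print(welch)    # statistic ~ 2.45, p ~ 0.016
```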

Do PI20 scores predict CFMT scores?

Next, we sought to determine whether the PI20 scores of our autistic participants were predictive of their CFMT performance. To begin, we examined the simple correlations between participants' PI20 and CFMT scores. Contrary to the findings of Minio-Paluello et al. 28 and Stantić et al. 10, we observed significant correlations between PI20 scores and performance on both the CFMT-O [N = 77, r_p = −0.486, p < 0.001] and the CFMT-A [N = 77, r_p = −0.464, p < 0.001] (Fig. 2). Similar correlations were seen when the raw scores were transformed into ranks, for both the CFMT-O [N = 77, r_s = −0.435, p < 0.001] and the CFMT-A [N = 77, r_s = −0.469, p < 0.001].

Figure 2. Scatterplots of the relationship between PI20 scores and performance on the CFMT-O (left) and the CFMT-A (right). The solid lines depict the linear trends. The dashed lines depict the mean performance of the non-autistic sample described by Tsantani et al. 17.

We also conducted a complementary subgroup analysis based on the established PI20 cut-off of 65. We split our sample of 77 autistic participants into those who met the cut-off (N = 42, M_age = 37.40, SD_age = 11.61) and those who did not (N = 35, M_age = 34.29, SD_age = 11.51). The autistic participants who met the PI20 cut-off achieved significantly lower scores than those who did not on both the CFMT-O [low-scorers: M = 76.23, SD = 13.59; high-scorers: M = 61.04, SD = 14.21; t(75) = 4.762, p < 0.001, d = 1.090, BF_01 < 0.001] and the CFMT-A [low-scorers: M = 77.54, SD = 12.60; high-scorers: M = 64.95, SD = 14.97; t(75) = 3.946, p < 0.001, d = 0.903, BF_01 = 0.007] (Fig. 1b). Moreover, the autistic participants who met the PI20 cut-off performed worse on the CFMT-O [t(278) = 5.576, p < 0.001, d = 0.933, BF_01 < 0.001] and the CFMT-A [t(278) = 4.835, p < 0.001, d = 0.809, BF_01 < 0.001] than the non-autistic participants tested by Tsantani and colleagues 17 (Fig. 1b). In contrast, the CFMT-O scores [t(271) = −0.914, p = 0.362, d = −0.165, BF_01 = 3.556] and CFMT-A scores [t(271) = −0.960, p = 0.338, d = −0.174, BF_01 = 3.419] of the autistic participants who did not meet the PI20 cut-off did not differ significantly from the comparison distributions described by Tsantani et al. 17.

Do co-occurring alexithymia and ADHD predict face recognition in autism?

There was some correlation between scores on the TAS20—a measure of alexithymia—and performance on the CFMT-A [N = 77, r_p = −0.252, p = 0.027]. We also observed a significant correlation between TAS20 scores and average performance on the CFMT-O and CFMT-A [N = 77, r_p = −0.229, p = 0.045]. However, we failed to observe a significant relationship with CFMT-O scores independently [N = 77, r_p = −0.177, p = 0.125]. We also note that the significant TAS20-CFMT correlations described above do not survive correction for multiple comparisons. No significant correlation was observed between scores on the ASRS—a measure of ADHD traits—and either CFMT-O scores [N = 77, r_p = 0.077, p = 0.504] or CFMT-A scores [N = 77, r_p = 0.064, p = 0.580]. Interestingly, we observed a noteworthy correlation between TAS20 and ASRS scores [N = 77, r_p = 0.453, p < 0.001]; i.e., those autistic participants who reported high levels of alexithymic traits also reported higher levels of ADHD traits (Fig. 3).

Figure 3. Simple correlations observed between autism severity (inferred from scores on the AQ questionnaire), levels of alexithymia (inferred from TAS20 scores), and the presence of ADHD traits (inferred from the ASRS screener). All correlations are significant at p < 0.001 (N = 77).

Like the PI20, both the TAS20 and the ASRS have established cut-offs, associated with clinically significant levels of alexithymia and ADHD traits, respectively. We therefore examined whether subgroup analyses of TAS20 and ASRS scores would reveal evidence of a predictive relationship with CFMT performance. Of the 77 autistic participants, 59 met the TAS20 cut-off for clinically significant levels of alexithymia, while 18 did not. Those who met the cut-off did not differ from those who did not in their scores on the CFMT-O [low-scorers: M = 70.22, SD = 14.78; high-scorers: M = 67.26, SD = 16.15; t(75) = 0.694, p = 0.490, d = 0.187, BF_01 = 3.016] or on the CFMT-A [low-scorers: M = 74.15, SD = 11.32; high-scorers: M = 69.61, SD = 16.60; t(75) = 1.110, p = 0.271, d = 0.299, BF_01 = 2.212]. Similarly, 51 autistic participants met the ASRS cut-off for clinically significant ADHD traits, while 26 did not. Once again, there was little sign that CFMT-O scores [low-scorers: M = 66.61, SD = 18.64; high-scorers: M = 68.63, SD = 14.29; t(75) = 0.527, p = 0.600, d = 0.127, BF_01 = 3.586] or CFMT-A scores [low-scorers: M = 71.53, SD = 16.29; high-scorers: M = 70.23, SD = 14.80; t(75) = 0.351, p = 0.727, d = 0.084, BF_01 = 3.831] differed across these subgroups.

Is face recognition in autism affected by non-verbal intelligence or autism severity?

No significant correlation was observed between AQ scores and CFMT-O scores [N = 77, r_p = −0.190, p = 0.098] or between AQ scores and CFMT-A scores [N = 77, r_p = −0.173, p = 0.131]. Note, however, that meeting the AQ cut-off score was part of the study inclusion criteria; hence, all 77 autistic participants had an AQ score of 32 or higher. Similarly, no significant correlation was observed between MRT scores and CFMT-O scores [N = 77, r_p = −0.090, p = 0.436] or between MRT scores and CFMT-A scores [N = 77, r_p = 0.001, p = 0.992]. In sum, we find no evidence in our data that non-verbal intelligence or autism severity influence the face recognition abilities of autistic participants.

General discussion

There is now considerable evidence that the PI20 scores of non-autistic participants correlate with their performance on objective measures of face recognition accuracy 15, 16, 17, 27. These findings suggest that respondents have enough insight into their relative face recognition ability to provide meaningful responses on the PI20. Recently, however, Minio-Paluello et al. 28 reported that the PI20 scores of autistic participants (N = 63) exhibited little or no correlation with their performance on the CFMT. A similar finding was described by Stantić et al. 10, albeit with a smaller sample (N = 31). These reports are potentially important because they raise the possibility that autistic individuals may experience a metacognitive deficit, whereby they are unable to infer whether (or not) their face recognition ability is impaired. Moreover, these results raise the possibility that the PI20 may be unsuitable for screening autistic participants for face recognition difficulties.

Contrary to these reports, however, we find clear evidence of an association between the PI20 scores of autistic participants (N = 77) and their performance on the CFMT-O and the CFMT-A. This association was evident both in simple correlation analyses and in subgroup analyses where the autistic sample was split into those who met the established cut-off for developmental prosopagnosia and those who did not. The mean score of those autistic participants who met the cut-off was ~15% and ~12.5% lower than that of those who did not, on the CFMT-O and CFMT-A, respectively. Indeed, those autistic participants who did not meet the PI20 cut-off exhibited similar levels of performance to a non-autistic comparison sample described previously 17. Together, these analyses provide clear evidence that the PI20 scores of autistic participants predict their CFMT performance.

The most likely explanation for the failure of Stantić et al. 10 to detect a correlation between scores on the PI20 and the CFMT is the relatively small size of their autistic group ( N  = 31). As we allude to in the introduction, (1) this study was not designed to examine the individual differences seen within the autistic population, and (2) correlation estimates obtained with small samples are notoriously unstable 30 . Post-hoc power analysis indicates there is a 38% chance of failing to detect a significant correlation of r  = 0.40 with a sample of this size (α = 0.05, two-tailed).

Assuming the authors scored the PI20 correctly, the null correlation described by Minio-Paluello et al. 28 is harder to explain. One relevant factor may be the wide range of general cognitive abilities present in their autistic sample (N = 63). As a self-report scale, the PI20 has relatively high verbal demands, potentially making it unsuitable for individuals with intellectual disability. Moreover, five of the twenty items are reverse scored. Respondents must therefore read individual items carefully to respond appropriately. If some of the participants tested by Minio-Paluello et al. 28 struggled to interpret scale items, and were unable to respond appropriately, this might also explain why the mean PI20 score was lower than that reported here and elsewhere 10.

It is now beyond doubt that the face recognition abilities of autistic participants vary enormously 8 , 9 , 10 , 11 , 28 . Once again, we saw evidence of this variability in our sample. On the one hand, 13 of our 77 autistic participants (16.9%) scored 65 or higher on the PI20 and scored less than 60% on both versions of the CFMT. These individuals would meet the diagnostic criteria for developmental prosopagnosia employed by the vast majority of research groups 13 , 49 . On the other hand, 10 of the 77 autistic participants (13.0%) scored 85% or higher on both tests, suggestive of excellent face recognition 12 , 17 .

There was little sign in our data that variability in face recognition ability is attributable to differences in non-verbal intelligence (as measured by MRT score), autism severity (as measured by AQ score), or the presence of co-occurring ADHD traits (as measured by ASRS score). There was some hint of a potential relationship between the presence of co-occurring alexithymia and face recognition ability: TAS20 scores were negatively correlated with performance on the CFMT-A and with average CFMT performance. However, TAS20 scores did not exhibit significant correlation with CFMT-O scores independently, and the foregoing correlations do not survive correction for multiple comparisons.

What should we make of this variability? We favour the view that, like alexithymia and ADHD, developmental prosopagnosia is a neurodevelopmental condition that can occur independently of autism, but that also frequently co-occurs with autism 4 , 8 , 51 , 52 . This view not only accounts for the severe lifelong face recognition problems seen in some autistic individuals, but also explains why many other autistic individuals exhibit excellent face recognition. Moreover, this account accords with the prevailing view that the co-occurrence of neurodevelopmental conditions is the ‘norm’ rather than the exception 34 , 47 , 48 , 53 , 54 , 55 . Given what we know about neurodevelopmental conditions more broadly, it would be hugely surprising if the incidence of developmental prosopagnosia was not elevated in the autistic population.

Recently, some authors have rejected this account citing evidence that autistic samples still exhibit below-average face recognition when those who meet the diagnostic criteria for prosopagnosia are removed 11 . However, this critique overlooks two issues. First, diagnostic assessments for developmental prosopagnosia are imperfect 26 . Many autistic individuals with severe co-occurring prosopagnosia may fail to meet diagnostic thresholds simply because of measurement error. Second, the severity of developmental prosopagnosia is thought to vary 56 . While some autistic individuals may experience severe developmental prosopagnosia, others may experience relatively mild forms. These latter individuals may fail to meet conservative diagnostic criteria for developmental prosopagnosia, but still exhibit below average face recognition.

While it was not the focus of our study, we observed a striking correlation between the presence of alexithymia and ADHD traits in our autistic participants. The fact that those autistic participants who report high levels of alexithymia also tend to report high levels of ADHD traits is potentially significant for understanding socio-cognitive differences in autism. In recent years, there has been increasing suggestion that many social perception difficulties traditionally attributed to autism—such as atypical interpretation of facial expressions 35 , 57 and reduced eye-region fixations 50 , 58 —may actually be products of co-occurring alexithymia. Likewise, there is some suggestion that other socio-cognitive differences attributed to autism—for example, atypical attentional cueing by gaze direction 37 —may be partly attributable to co-occurring ADHD. To date, however, authors have tended to assess the presence of either co-occurring alexithymia or co-occurring ADHD. In future, it may prove valuable to establish the extent to which these constructs exert independent or interactive effects in these domains.

Limitations and future directions

We note that 61 of our 77 participants described their sex as female. Conversely, the majority of the autistic population are thought to identify as male 59 . This is not the first time that a high proportion of female participants has been seen where studies have sought to recruit autistic participants online 60 . Unlike participant age 61 , the sex/gender of observers is not thought to exert a strong influence on face recognition ability 62 . However, we acknowledge the need to replicate the present findings in a sample more representative of the wider autistic community.

It is important that future research ascertain if/how other measures of meta-cognitive performance—such as estimates of meta-c and meta-d inferred from type-II signal detection tasks 63, 64—relate to participants' responses on the PI20. For example, one might hypothesize that the PI20 scores of those with a higher meta-c ought to correspond more closely to objective face recognition performance. It might also be interesting to examine how autistic and non-autistic individuals acquire insight into their relative face recognition abilities (e.g., What kinds of face recognition errors are salient? What have individuals been told about face recognition in autism?).

Conclusions

Contrary to recent reports, we observed significant correlations between PI20 scores and performance on both the CFMT-O and CFMT-A in autistic adults. This finding indicates that autistic individuals are able to infer whether (or not) their face recognition ability is impaired and confirms that the PI20 can be used to screen autistic participants for face recognition difficulties. Consistent with previous research, the face recognition performance within our autistic sample varied considerably. While some individuals approached ceiling levels of recognition accuracy, others met the prevailing diagnostic criteria for developmental prosopagnosia. This variability showed little or no association with non-verbal intelligence, autism severity, or the presence of co-occurring alexithymia or ADHD.

Data availability

The data supporting all the analyses are available here: https://osf.io/tesk5/.

References

1. McConachie, H. R. Developmental prosopagnosia. A single case report. Cortex 12, 76–82 (1976).
2. Wilmer, J. B. Individual differences in face recognition: A decade of discovery. Curr. Dir. Psychol. Sci. 26, 225–230 (2017).
3. Behrmann, M. & Avidan, G. Congenital prosopagnosia: Face-blind from birth. Trends Cogn. Sci. 9, 180–187 (2005).
4. Cook, R. & Biotti, F. Developmental prosopagnosia. Curr. Biol. 26, R312–R313 (2016).
5. Duchaine, B. & Nakayama, K. Developmental prosopagnosia: A window to content-specific face processing. Curr. Opin. Neurobiol. 16, 166–173 (2006).
6. Kennerknecht, I. et al. First report of prevalence of non-syndromic hereditary prosopagnosia (HPA). Am. J. Med. Genet. 140A, 1617–1622 (2006).
7. Kennerknecht, I., Ho, N. Y. & Wong, V. C. N. Prevalence of hereditary prosopagnosia (HPA) in Hong Kong Chinese population. Am. J. Med. Genet. 146A, 2863–2870 (2008).
8. Gehdu, B. K., Gray, K. L. & Cook, R. Impaired grouping of ambient facial images in autism. Sci. Rep. 12, e6665 (2022).
9. Hedley, D., Brewer, N. & Young, R. Face recognition performance of individuals with Asperger syndrome on the Cambridge Face Memory Test. Autism Res. 4, 449–455 (2011).
10. Stantić, M., Ichijo, E., Catmur, C. & Bird, G. Face memory and face perception in autism. Autism, 13623613211027685 (2021).
11. Kamensek, T., Susilo, T., Iarocci, G. & Oruc, I. Are people with autism prosopagnosic? Autism Res. (2023).
12. McKone, E. et al. Face ethnicity and measurement reliability affect face recognition performance in developmental prosopagnosia: Evidence from the Cambridge Face Memory Test-Australian. Cogn. Neuropsychol. 28, 109–146 (2011).
13. Duchaine, B. & Nakayama, K. The Cambridge Face Memory Test: Results for neurologically intact individuals and an investigation of its validity using inverted face stimuli and prosopagnosic participants. Neuropsychologia 44, 576–585 (2006).
14. Biotti, F., Gray, K. L. H. & Cook, R. Is developmental prosopagnosia best characterised as an apperceptive or mnemonic condition? Neuropsychologia 124, 285–298 (2019).
15. Gray, K. L. H., Bird, G. & Cook, R. Robust associations between the 20-item prosopagnosia index and the Cambridge Face Memory Test in the general population. Royal Society Open Sci. 4, 160923 (2017).
16. Shah, P., Gaule, A., Sowden, S., Bird, G. & Cook, R. The 20-item prosopagnosia index (PI20): A self-report instrument for identifying developmental prosopagnosia. Royal Society Open Sci. 2, 140343 (2015).
17. Tsantani, M., Vestner, T. & Cook, R. The Twenty Item Prosopagnosia Index (PI20) provides meaningful evidence of face recognition impairment. Royal Society Open Sci. 8, e202062 (2021).
18. Estudillo, A. J. & Wong, H. K. Associations between self-reported and objective face recognition abilities are only evident in above- and below-average recognisers. PeerJ 9, e10629 (2021).
19. Tagliente, S. et al. Self-reported face recognition abilities moderately predict face-learning skills: Evidence from Italian samples. Heliyon 9, e14125 (2023).
20. Nørkær, E. et al. The Danish version of the 20-item prosopagnosia index (PI20): Translation, validation and a link to face perception. Brain Sciences 13, e337 (2023).
21. Oishi, Y., Aruga, K. & Kurita, K. Relationship between face recognition ability and anxiety tendencies in healthy young individuals: A prosopagnosia index and state-trait anxiety inventory study. Acta Psychol. 245, e104237 (2024).
22. Ventura, P., Livingston, L. A. & Shah, P. Adults have moderate-to-good insight into their face recognition ability: Further validation of the 20-item Prosopagnosia Index in a Portuguese sample. Quart. J. Exp. Psychol. 71, 2677–2679 (2018).
23. Bobak, A. K., Mileva, V. R. & Hancock, P. J. Facing the facts: Naive participants have only moderate insight into their face recognition and face perception abilities. Quart. J. Exp. Psychol. 72, 872–881 (2019).
24. Matsuyoshi, D. & Watanabe, K. People have modest, not good, insight into their face recognition ability: A comparison between self-report questionnaires. Psychol. Res. 85, 1713–1723 (2021).
25. Arizpe, J. M. et al. Self-reported face recognition is highly valid, but alone is not highly discriminative of prosopagnosia-level performance on objective assessments. Behav. Res. Methods 51, 1102–1116 (2019).
26. Burns, E. J., Gaunt, E., Kidane, B., Hunter, L. & Pulford, J. A new approach to diagnosing and researching developmental prosopagnosia: Excluded cases are impaired too. Behav. Res. Methods https://doi.org/10.3758/s13428-022-02017-w (2022).
27. Shah, P., Sowden, S., Gaule, A., Catmur, C. & Bird, G. The 20 item prosopagnosia index (PI20): Relationship with the Glasgow face-matching test. Royal Society Open Sci. 2, e150305 (2015).
28. Minio-Paluello, I., Porciello, G., Pascual-Leone, A. & Baron-Cohen, S. Face individual identity recognition: A potential endophenotype in autism. Mol. Autism 11, 1–16 (2020).
29. Carpenter, K. L. & Williams, D. M. A meta-analysis and critical review of metacognitive accuracy in autism. Autism 27, 512–525 (2023).
30. Schönbrodt, F. D. & Perugini, M. At what sample size do correlations stabilize? J. Res. Personality 47, 609–612 (2013).
31. Murray, E. & Bate, S. Diagnosing developmental prosopagnosia: Repeat assessment using the Cambridge Face Memory Test. Royal Society Open Sci. 7, e200884 (2020).
32. Keating, C. T., Fraser, D. S., Sowden, S. & Cook, J. L. Differences between autistic and non-autistic adults in the recognition of anger from facial motion remain after controlling for alexithymia. J. Autism Develop. Disord. 52, 1855–1871 (2022).
33. Walker, D. L., Palermo, R., Callis, Z. & Gignac, G. E. The association between intelligence and face processing abilities: A conceptual and meta-analytic review. Intelligence 96, e101718 (2023).
34. Bird, G. & Cook, R. Mixed emotions: The contribution of alexithymia to the emotional symptoms of autism. Transl. Psychiatry 3, e285 (2013).
35. Gehdu, B. K., Tsantani, M., Press, C., Gray, K. L. & Cook, R. Recognition of facial expressions in autism: Effects of face masks and alexithymia. Quart. J. Exp. Psychol. e17470218231163007 (2023).
36. Thoma, P., Soria Bauser, D., Edel, M. A., Juckel, G. & Suchan, B. Configural processing of emotional bodies and faces in patients with attention deficit hyperactivity disorder. J. Clin. Exp. Neuropsychol. 42, 1028–1048 (2020).
37. Seernani, D. et al. Social and non-social gaze cueing in autism spectrum disorder, attention-deficit/hyperactivity disorder and a comorbid group. Biol. Psychol. 162, e108096 (2021).
38. Baron-Cohen, S., Wheelwright, S., Skinner, R., Martin, J. & Clubley, E. The autism-spectrum quotient (AQ): Evidence from Asperger syndrome/high-functioning autism, males and females, scientists and mathematicians. J. Autism Develop. Disorders 31, 5–17 (2001).
39. Bagby, R. M., Parker, J. D. & Taylor, G. J. The twenty-item Toronto Alexithymia Scale-I. Item selection and cross-validation of the factor structure. J. Psychosomatic Res. 38, 23–32 (1994).
40. Taylor, G. J., Bagby, R. M. & Parker, J. D. The 20-item Toronto Alexithymia Scale: IV. Reliability and factorial validity in different languages and cultures. J. Psychosomatic Res. 55, 277–283 (2003).
41. Kessler, R. C. et al. The World Health Organization Adult ADHD Self-Report Scale (ASRS): A short screening scale for use in the general population. Psychol. Med. 35, 245–256 (2005).
42. Kessler, R. C. et al. Validity of the World Health Organization Adult ADHD Self-Report Scale (ASRS) Screener in a representative sample of health plan members. Int. J. Methods Psychiatric Res. 16, 52–65 (2007).
43. Chierchia, G. et al. The matrix reasoning item bank (MaRs-IB): Novel, open-access abstract reasoning items for adolescents and adults. Royal Society Open Sci. 6, 190232 (2019).
44. Anwyl-Irvine, A. L., Massonnié, J., Flitton, A., Kirkham, N. & Evershed, J. K. Gorilla in our midst: An online behavioral experiment builder. Behav. Res. Methods 52, 388–407 (2020).
45. JASP Team. JASP (Version 0.16.3) [Computer software]. Amsterdam, The Netherlands (2022).
46. Jeffreys, H. Theory of Probability 3rd edn (Oxford University Press, 1961).
47. Hours, C., Recasens, C. & Baleyte, J. M. ASD and ADHD comorbidity: What are we talking about? Front. Psychiatry 13, e154 (2022).
48. Leitner, Y. The co-occurrence of autism and attention deficit hyperactivity disorder in children – what do we know? Front. Human Neurosci. 8 (2014).
49. Tsantani, M., Gray, K. L. H. & Cook, R. New evidence of impaired expression recognition in developmental prosopagnosia. Cortex (2022).
50. Cuve, H. C. et al. Alexithymia explains atypical spatiotemporal dynamics of eye gaze in autism. Cognition 212 (2021).
51. Gray, K. L. H. & Cook, R. Should developmental prosopagnosia, developmental body agnosia, and developmental object agnosia be considered independent neurodevelopmental conditions? Cognit. Neuropsychol. 35, 59–62 (2018).
52. Kracke, I. Developmental prosopagnosia in Asperger syndrome: Presentation and discussion of an individual case. Develop. Med. Child Neurol. 36, 873–886 (1994).
53. Conti-Ramsden, G., Simkin, Z. & Botting, N. The prevalence of autistic spectrum disorders in adolescents with a history of specific language impairment (SLI). J. Child Psychol. Psychiatry 47, 621–628 (2006).
54. Dziuk, M. A. et al. Dyspraxia in autism: Association with motor, social, and communicative deficits. Develop. Med. Child Neurol. 49, 734–739 (2007).
55. Gilger, J. W. & Kaplan, B. J. Atypical brain development: A conceptual framework for understanding developmental learning disabilities. Develop. Neuropsychol. 20, 465–481 (2001).
56. DeGutis, J. et al. What is the prevalence of developmental prosopagnosia? An empirical assessment of different diagnostic cutoffs. Cortex 161, 51–64 (2023).
57. Cook, R., Brewer, R., Shah, P. & Bird, G. Alexithymia, not autism, predicts poor recognition of emotional facial expressions. Psychol. Sci. 24, 723–732 (2013).
58. Bird, G., Press, C. & Richardson, D. C. The role of alexithymia in reduced eye-fixation in autism spectrum conditions. J. Autism Develop. Disorders 41, 1556–1564 (2011).
59. Ferri, S. L., Abel, T. & Brodkin, E. S. Sex differences in autism spectrum disorder: A review. Curr. Psychiatry Rep. 20, 1–17 (2018).
60. Rødgaard, E. M., Jensen, K., Miskowiak, K. W. & Mottron, L. Representativeness of autistic samples in studies recruiting through social media. Autism Res. 15, 1447–1456 (2022).
61. Germine, L., Duchaine, B. & Nakayama, K. Where cognitive development and aging meet: Face learning ability peaks after age 30. Cognition 118, 201–210 (2011).
62. Gray, K. L. H., Biotti, F. & Cook, R. Evaluating object recognition ability in developmental prosopagnosia using the Cambridge Car Memory Test. Cognitive Neuropsychol. 36, 89–96 (2019).
63. Maniscalco, B. & Lau, H. A signal detection theoretic approach for estimating metacognitive sensitivity from confidence ratings. Consciousness Cognition 21, 422–430 (2012).
64. Fleming, S. M. & Lau, H. C. How to measure metacognition. Front. Human Neurosci. 8, e443 (2014).


Author information

Authors and affiliations

Department of Psychological Sciences, Birkbeck, University of London, London, UK

Bayparvah Kaur Gehdu

Department of Experimental Psychology, University College London, London, UK

Clare Press

Wellcome Centre for Human Neuroimaging, University College London, London, UK

School of Psychology and Clinical Language Sciences, University of Reading, Reading, UK

Katie L. H. Gray

School of Psychology, University of Leeds, Leeds, LS2 9JT, UK

Richard Cook


Contributions

B.K.G., C.P., K.L.H.G., and R.C. designed the study. B.K.G. collected the data. B.K.G., K.L.H.G. and R.C. analysed the data. B.K.G., C.P., K.L.H.G., and R.C. wrote the manuscript.

Corresponding author

Correspondence to Richard Cook .

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .


About this article

Cite this article

Gehdu, B.K., Press, C., Gray, K.L.H. et al. Autistic adults have insight into their relative face recognition ability. Sci. Rep. 14, 17802 (2024). https://doi.org/10.1038/s41598-024-67649-8

Received : 20 December 2023

Accepted : 15 July 2024

Published : 01 August 2024

DOI : https://doi.org/10.1038/s41598-024-67649-8


  • Face recognition
  • Twenty-item prosopagnosia index
  • Developmental prosopagnosia

Yahoo Finance

Face Recognition Device Market to Reach $16.5 Billion, Globally, by 2032 at 15.7% CAGR: Allied Market Research

The global face recognition device market is experiencing growth driven by several factors, including the growing adoption in retail and e-commerce, expansion of smart city initiatives, and rising integration with IoT devices and other emerging technologies within the industry.

Wilmington, Delaware, Aug. 07, 2024 (GLOBE NEWSWIRE) -- Allied Market Research published a report, titled, "Face Recognition Device Market by Type (Standalone Devices and Integrated Devices), and Application (Surveillance, Access Control, Healthcare, Banking and Finance, Retail and E-commerce and Others): Global Opportunity Analysis and Industry Forecast, 2024-2032". According to the report, the face recognition device market was valued at $4.5 billion in 2023, and is estimated to reach $16.5 billion by 2032, growing at a CAGR of 15.7% from 2024 to 2032.
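As a quick arithmetic check on these headline figures, a sketch (the small discrepancy against the quoted 15.7% reflects rounding in the published market sizes):

```python
# Sanity-check the reported CAGR: growing from $4.5B (2023) to $16.5B (2032)
# spans 9 years of compounding across the 2024-2032 forecast window.
start, end, years = 4.5, 16.5, 2032 - 2023

implied_cagr = (end / start) ** (1 / years) - 1
print(f"implied CAGR: {implied_cagr:.1%}")  # ~15.5% vs. the reported 15.7%

# Forward check at the reported rate:
print(f"$4.5B at 15.7% for 9 years: ${start * 1.157 ** years:.1f}B")  # ~$16.7B
```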

Download Research Report Sample & TOC:   https://www.alliedmarketresearch.com/request-sample/A68885

(We provide the report as per your research requirements, including the latest industry insights on evolution, potential, and COVID-19 impact analysis.)

105 – Tables

57 – Charts

310 – Pages

Prime determinants of growth  

The global face recognition device market is experiencing growth due to several factors, including the increasing adoption in retail and e-commerce and the expansion of smart city initiatives. However, accuracy issues somewhat hinder market growth. Moreover, the integration with IoT and other emerging technologies offers lucrative opportunities for the expansion of the global face recognition device market.

Report coverage & details:

  • Forecast Period: 2024–2032
  • Base Year: 2023
  • Market Size in 2023: $4.5 billion
  • Market Size in 2032: $16.5 billion
  • CAGR: 15.7%
  • Segments Covered: Device Type, Application, and Region
  • Drivers: Growing adoption in retail and e-commerce; expansion of smart city initiatives
  • Opportunities: Integration with IoT and other emerging technologies
  • Restraint: Accuracy issues

Segment Highlights:     

By device type, the global face recognition device market is bifurcated into standalone devices and integrated devices. Integrated devices lead the global face recognition device market. This dominance is driven by their seamless incorporation into a wide range of consumer electronics and security systems, benefiting from ongoing technological advancements in AI and machine learning.

By application, the global face recognition device market is divided into security and surveillance, access control, healthcare, banking and finance, retail and e-commerce, and others. The security and surveillance segment leads the global face recognition device market. This dominance is attributed to the increasing need for enhanced security measures across public and private sectors, including government buildings, airports, and critical infrastructure.

Get Customized Reports with your Requirements:   https://www.alliedmarketresearch.com/request-for-customization/A68885  

Region/Country Outlook: 

The Asia-Pacific region leads the face recognition device market, with China at the forefront. This leadership is driven by substantial government investments in surveillance infrastructure, widespread adoption of facial recognition technology in public security and law enforcement, and a rapidly growing consumer electronics market.

Leading Market Players:   

NEC Corporation

IDEMIA Group

Zhejiang Dahua Technology Co., Ltd.

Hangzhou Hikvision Digital Technology Co., Ltd.

Panasonic Corporation

Honeywell International Inc.

Fujitsu Limited

Anviz Global Inc.

SenseTime Group Limited

VIVOTEK Inc.

The report provides a detailed analysis of these key players in the global face recognition device market. These players have adopted different strategies such as new product launches, collaborations, and others to increase their market share and maintain dominant shares in different regions. The report is valuable in highlighting business performance, operating segments, product portfolio, and strategic moves of market players to showcase the competitive scenario.

Key Industry Developments:   

March 2024: Hikvision, a leading provider of security products, launched a new range of face recognition cameras designed for enhanced security in urban surveillance and critical infrastructure protection, boasting higher resolution and faster processing speeds.

November 2022: NEC Corporation developed a gateless access control system using biometric recognition that combines NEC's face recognition technology with person re-identification technology, which matches people even if they are facing away or their bodies are occluded, to provide fast and reliable entry control that is free from gates.

Inquiry before Buying:   https://www.alliedmarketresearch.com/purchase-enquiry/A68885  

Key Benefits For Stakeholders:

This report provides a quantitative analysis of the face recognition devices market segments, current trends, estimations, and dynamics of the market analysis to identify the prevailing market opportunities.

The market research is offered along with information related to key drivers, restraints, and opportunities.

Porter's five forces analysis highlights the potency of buyers and suppliers to enable stakeholders to make profit-oriented business decisions and strengthen their supplier-buyer network.

In-depth analysis of the face recognition devices market segmentation helps determine the prevailing market opportunities.

Major countries in each region are mapped according to their revenue contribution to the global face recognition devices market.

Market player positioning facilitates benchmarking and provides a clear understanding of the present position of the market players.

The report includes analysis of regional as well as global face recognition devices market trends, key players, market segments, application areas, and market growth strategies.

Procure Complete Report (250 Pages PDF with Insights, Charts, Tables, and Figures) @   https://www.alliedmarketresearch.com/checkout-final/face-recognition-device-market

Face Recognition Device Market Key Segments:

By Device Type

Standalone Devices

Integrated Devices

By Application

Security and Surveillance

Access Control

Healthcare

Banking and Finance

Retail and E-commerce

Others

By Region

North America (U.S., Canada, Mexico)

Europe (UK, Germany, France, Italy, Rest of Europe)

Asia-Pacific (China, Japan, India, South Korea, Rest of Asia-Pacific)

Latin America (Brazil, Argentina, Rest of Latin America)

Middle East and Africa (UAE, Saudi Arabia, Rest of Middle East and Africa)

Access AVENUE - A Subscription-Based Library (Premium On-Demand, Subscription-Based Pricing Model) @ https://www.alliedmarketresearch.com/library-access

Avenue is a user-based library of global market reports that provides comprehensive coverage of the world's largest emerging markets, with instant e-access to all available industry reports. By offering core business insights on varied industries, economies, and end users worldwide, Avenue gives registered members a single, convenient gateway to all of their research requirements.

Avenue Library Subscription | Request a 14-Day Free Trial Before Buying:   https://www.alliedmarketresearch.com/avenue/trial/starter

Trending Reports in Semiconductor and Electronics Industry:

Industrial High Voltage Motor Market size was valued at $1.8 billion in 2022, and is projected to reach $2.6 billion by 2032, growing at a CAGR of 3.9% from 2023 to 2032.

High Voltage Capacitor Market was valued at $11.8 billion in 2020, and is projected to reach $30.3 billion by 2030, growing at a CAGR of 9.9% from 2021 to 2030.

Hearables Market size was valued at $21.20 billion in 2018, and is projected to reach $93.90 billion by 2026, growing at a CAGR of 17.2% from 2019 to 2026.

Professional Portable Audio System Market was valued at $2.3 billion in 2021, and is projected to reach $5.1 billion by 2031, growing at a CAGR of 8.5% from 2022 to 2031.
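The compound annual growth rates quoted above follow directly from a start value, an end value, and the number of compounding years. As a quick illustrative check in Python (the helper name is ours, and reported CAGRs can vary slightly with the base-year convention a publisher uses):

    # Sanity-check a reported CAGR: the rate r such that start * (1 + r)**years == end.
    def implied_cagr(start_value, end_value, years):
        return (end_value / start_value) ** (1 / years) - 1

    # High Voltage Capacitor market: $11.8B (2020) -> $30.3B (2030), 10 years.
    print(f"{implied_cagr(11.8, 30.3, 10):.1%}")  # ~9.9%, matching the report

    # At the face recognition device market's 15.7% CAGR, value multiplies by:
    print(f"x{1.157 ** 9:.2f} over 9 years")      # roughly 3.7x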

Allied Market Research (AMR) is a full-service market research and business-consulting wing of Allied Analytics LLP, based in Wilmington, Delaware. Allied Market Research provides global enterprises as well as medium and small businesses with unmatched quality of "Market Research Reports Insights" and "Business Intelligence Solutions." AMR focuses on providing business insights and consulting that help its clients make strategic business decisions and achieve sustainable growth in their respective market domains.

We maintain professional corporate relations with various companies, which helps us surface market data that keeps our research data tables accurate and our market forecasting reliable. Allied Market Research CEO Pawan Kumar is instrumental in inspiring and encouraging everyone associated with the company to maintain high data quality and to help clients in every way possible to achieve success. All data presented in our published reports are extracted through primary interviews with top officials from leading companies in the domain concerned. Our secondary data procurement methodology includes deep online and offline research and discussions with knowledgeable professionals and analysts in the industry.

David Correa

1209 Orange Street, Corporation Trust Center, Wilmington, New Castle, Delaware 19801 USA.
Int'l: +1-503-894-6022
Toll Free: +1-800-792-5285
UK: +44-845-528-1300
India (Pune): +91-20-66346060
Fax: +1-800-792-5285
[email protected]

Ave Maria School of Law

A.I. And Its Impact on Facial Recognition Software

By Paolo Vilbon, Contributor, The Gavel; J.D. Candidate, Class of 2024

Artificial intelligence is the future; there is no denying that. But with great advancements come potential dangers. One of law enforcement's biggest technological advancements in the last two decades has been the use of facial recognition technology. Coupled with modern artificial intelligence, such a system might be expected to revolutionize law enforcement investigations and standard operating procedures. Unfortunately, this is not the case. According to researchers, facial recognition technologies falsely identified Black and Asian faces 10 to 100 times more often than they did White faces. The technologies also falsely identified women more often than men, making Black women particularly vulnerable to algorithmic bias.1

These algorithms currently help national agencies identify potential flight risks and protect borders.2 National agencies have an advantage over local law enforcement because they possess the resources to cross-check any information they receive; local agencies do not have that kind of bandwidth. Further, it is no secret that recruitment of law enforcement officers has been in decline in recent years.3 This will lead police departments to rely more heavily on these technologies to fight crime, and as the use of these systems increases, so will the errors associated with them. Therefore, if these technologies are not accurate, or contain identifiable biases, they may do more harm than good.

One of the issues identified with artificial intelligence and facial detection is that AI face recognition tools "rely on machine learning algorithms that are trained with labeled data."4 Further, "[i]t has recently been shown that algorithms trained with biased data have resulted in algorithmic discrimination."5 The potential dangers associated with erroneous identification range from "missed flights, lengthy interrogations, watch list placements, tense police encounters, false arrests, or worse."6 None of this accounts for the financial impact that a false identification has on the individual. Society must hold companies that put face recognition tools into the marketplace accountable, in the hope that newly developed technologies will be far more accurate and that future algorithms will not harm the individuals against whom today's systems are biased.

Artificial intelligence is far too embedded in daily life to slow its progress, but claims that the datasets used for its baselines are biased should not be ignored. These biases should be brought to the forefront so that the necessary changes can be made now, before artificial intelligence needlessly overburdens the criminal justice system. A yearlong research investigation across 100 police departments revealed that African American individuals are more likely to be stopped by law enforcement and subjected to face recognition searches than individuals of other ethnicities.7 Part of the problem is measurement: without a dataset labeled for skin characteristics such as color, thickness, and the amount of hair, one cannot even measure the accuracy of such automated detection systems across groups. We are at a turning point with this technology. If it continues to be used with its current biases, it may remain useful, but it will also lead to the incarceration of wrongly identified suspects, harming the government and the affected individuals economically while also carrying a negative social impact. It is imperative that we recognize that these biases exist so they can be corrected now.
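The measurement point above can be made concrete: when demographic labels do exist, a matcher's errors can be disaggregated by group, which is how audits such as the Gender Shades study (cited in the references) surfaced these disparities. The following Python sketch is illustrative only; the records are synthetic placeholders, and a real audit would run over thousands of labeled comparisons.

    # Illustrative disaggregated audit: false match rate per demographic group.
    # The records below are synthetic; real audits use large labeled benchmarks.
    from collections import defaultdict

    # Each record: (group, ground_truth_same_person, model_predicted_match)
    results = [
        ("group_a", False, True),    # impostor pair, wrongly matched
        ("group_a", False, False),
        ("group_b", False, False),
        ("group_b", False, False),
        ("group_b", True, True),
        # ... thousands more comparisons in a real audit
    ]

    counts = defaultdict(lambda: {"false_matches": 0, "impostor_trials": 0})
    for group, same_person, predicted_match in results:
        if not same_person:                      # impostor comparisons only
            counts[group]["impostor_trials"] += 1
            if predicted_match:                  # a misidentification
                counts[group]["false_matches"] += 1

    for group, c in sorted(counts.items()):
        fmr = c["false_matches"] / c["impostor_trials"]
        print(f"{group}: false match rate = {fmr:.0%}")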

References:

1. The Regulatory Review, "Saturday Seminar: Facing Bias in Facial Recognition Technology."

3. "U.S. Experiencing Police Hiring Crisis."

4. Buolamwini, J., and Gebru, T., "Gender Shades: Intersectional Accuracy Disparities in Commercial Gender Classification."




RECENT RESEARCH ON FACE RECOGNITION

  1. Past, Present, and Future of Face Recognition: A Review

    Face recognition is one of the most active research fields of computer vision and pattern recognition, with many practical and commercial applications including identification, access control, forensics, and human-computer interaction. However, identifying a face in a crowd raises serious questions about individual freedoms and poses ethical issues. Significant methods, algorithms, approaches ...

  2. Face recognition: Past, present and future (a review)

    A novel taxonomy of image- and video-based methods, which also covers recent approaches such as sparsity- and deep-learning-based methods. An up-to-date review of the image- and video-based datasets used for face recognition. Review of the recent deep-learning-based methods, which have shown remarkable results on large-scale and unconstrained ...

  3. Face Recognition by Humans and Machines: Three Fundamental Advances

    Deep learning models currently achieve human levels of performance on real-world face recognition tasks. We review scientific progress in understanding human face processing using computational approaches based on deep learning. This review is organized around three fundamental advances. First, deep networks trained for face identification generate a representation that retains structured ...

  4. A review on face recognition systems: recent approaches and challenges

    Face recognition is an efficient technique and one of the most preferred biometric modalities for the identification and verification of individuals, as compared to voice, fingerprint, iris, retina scan, gait, ear, and hand geometry. This has over the years led researchers in both academia and industry to develop numerous face recognition techniques, making it one of the most ...

  5. An optimized solution for face recognition

    MIT neuroscientists have found that when artificial intelligence is tasked with visually identifying objects and faces, it assigns specific components of its network to face recognition, just like the human brain.

  6. Classical and modern face recognition approaches: a complete ...

    In this review, we have highlighted major applications, challenges, and trends of face recognition systems in social and scientific domains. The prime objective of this research is to sum up recent face recognition techniques and develop a broad understanding of how these techniques behave on different datasets.

  7. [2212.13038] A Survey of Face Recognition

    Recent years witnessed the breakthrough of face recognition with deep convolutional neural networks. Dozens of papers in the field of FR are published every year; some have been applied in industry and play an important role in daily life, such as device unlocking and mobile payment. This paper provides an introduction to face recognition, including its history ...

  8. Face Recognition: Recent Advancements and Research Challenges

    In the previous few decades, face recognition has become a popular field in computer-based application development, because it is employed in so many different sectors. Face identification via database photographs, real data, captured images, and sensor images remains a difficult task due to the huge variety of faces. ...

  9. A face recognition algorithm based on the combine of image ...

    The recognition rate and speed of face recognition systems have always been the two key technical factors that researchers focus on.

  10. Face Recognition

    Facial recognition is the task of making a positive identification of a face in a photo or video image against a pre-existing database of faces. It begins with detection, distinguishing human faces from other objects in the image, and then works on identification of those detected faces. (A minimal sketch of this pipeline follows this list.)

  11. Individual Differences in Face Recognition: A Decade of Discovery

    The aim of this focused review is to recount a string of key discoveries about individual differences in face recognition made during the last decade. ...

  12. Deep Learning for Face Recognition: a Critical Analysis

    Face recognition is a rapidly developing and widely applied aspect of biometric technologies. Its applications are broad, ranging from law enforcement to consumer applications, and industry efficiency and monitoring solutions. The recent advent of affordable, powerful GPUs and the creation of huge face databases has drawn research focus primarily to the development of increasingly ...

  13. A Review of Face Recognition Technology

    Face recognition technology is a biometric technology based on the identification of a person's facial features. People collect face images, and recognition equipment automatically processes the images. The paper introduces related research on face recognition from different perspectives.

  14. Human face recognition based on convolutional neural network and ...

    To deal with the issue of human face recognition on small original datasets, a new approach combining a convolutional neural network (CNN) with an augmented dataset is developed in this paper. The origin...

  15. Face perception: A brief journey through recent ...

    These varied accomplishments across many distinct subdomains of face-recognition research are extensive, and conveying the current state of knowledge while looking ahead to future challenges is difficult when there is this much material to draw upon.

  16. Face Recognition Systems: A Survey

    This paper highlights recent research on 2D and 3D face recognition systems, focusing mainly on approaches based on local, holistic (subspace), and hybrid features.

  17. Deep learning based single sample face recognition: a survey

    Face recognition has long been an active research area in the field of artificial intelligence, particularly since the rise of deep learning in recent years. In some practical situations, each identity has only a single sample available for training. Face recognition under this situation is referred to as single sample face recognition and poses significant challenges to effective training ...

  18. Face Recognition: A Literature Review

    The task of face recognition has been actively researched in recent years. This paper provides an up-to-date review of major human face recognition research.

  19. Recent development in face recognition

    This paper investigates various feature-based automatic face recognition approaches in detail. A high degree of freedom in head movement and human emotion leads a face recognition system to face critical challenges in terms of pose, illumination, and expression. The human face also undergoes irreversible changes due to aging.

  20. Autistic adults have insight into their relative face recognition ...

    Recent reports suggest that the PI20 scores of autistic participants exhibit little or no correlation with their performance on the Cambridge Face Memory Test, a key measure of face recognition ...

  21. Shared Neural Dynamics of Facial Expression Processing

    In this study, we investigated the shared neural dynamics of emotional face processing using an explicit facial emotion recognition task, where participants made two-alternative forced choice (2AFC) decisions on the displayed emotion.

  22. A gradual self distillation network with adaptive channel attention for ...

    In this paper, we present a method of expression-invariant face recognition that transforms an input face image with an arbitrary expression into its corresponding neutral facial expression image. When a new face image with an arbitrary expression is ...

  23. Exploring privacy-personalization paradox: Facial recognition systems

    C. Morosan, "Hotel facial recognition systems: Insight into guests' system perceptions, congruity with self-image, and anticipated emotions," Journal of Electronic Commerce Research 21 (1) (2020) 21-38.

  24. Facial Emotion Recognition Through Quantum Machine Learning

    Facial emotion recognition plays a pivotal role in human-computer interaction, with far-reaching implications spanning psychology and healthcare. Conventional approaches, primarily utilizing classical machine learning methods, frequently face challenges in effectively interpreting intricate facial expressions under diverse environmental circumstances. However ...

  25. Face Recognition Device Market to Reach $16.5 Billion, Globally, by 2032

    The global face recognition device market is experiencing growth driven by several factors, including the growing adoption in retail and e-commerce, expansion of smart city initiatives, and rising ...

  26. A.I. And Its Impact on Facial Recognition Software

    A yearlong research investigation across 100 police departments revealed that African American individuals are more likely to be stopped by law enforcement and be subjected to face recognition searches than individuals of other ethnicities.7 This happens because, without a dataset that has labels for various skin characteristics such as color ...
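The detect-then-identify pipeline described in the "Face Recognition" entry above can be sketched in a few lines. The Python code below is a minimal illustration under stated assumptions: detection and the embedding network are stubbed out (a real system plugs in a trained face detector and embedding model), the gallery vectors are toy values, and the threshold is a placeholder.

    # Minimal sketch: identify a detected face by matching its embedding
    # against a gallery of known identities with cosine similarity.
    import numpy as np

    def identify(probe_embedding, gallery, threshold=0.5):
        """Return the best-matching identity, or None if no gallery entry
        clears the similarity threshold."""
        best_name, best_sim = None, -1.0
        p = probe_embedding / np.linalg.norm(probe_embedding)
        for name, emb in gallery.items():
            g = emb / np.linalg.norm(emb)
            sim = float(p @ g)                  # cosine similarity
            if sim > best_sim:
                best_name, best_sim = name, sim
        return best_name if best_sim >= threshold else None

    # Toy 4-D embeddings; real face embeddings are typically 128-512 dimensions
    # produced by a trained network from a detected, aligned face crop.
    gallery = {"alice": np.array([1.0, 0.1, 0.0, 0.2]),
               "bob":   np.array([0.0, 1.0, 0.3, 0.0])}
    print(identify(np.array([0.9, 0.2, 0.1, 0.2]), gallery))  # -> alice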