Just how accurate are facial recognition algorithms—which may have been trained and tested on fewer than 15,000 photos—when put to the test on a larger scale? Researchers in UW CSE’s Graphics and Imaging Lab (GRAIL) aimed to find out by launching the MegaFace Challenge, a new competition in which teams from all over the world were invited to put their algorithms through their paces using the MegaFace dataset of one million images.
The results showed that, when it comes to the size of datasets used for training and testing these algorithms, bigger tends to be better.
From the UW News release:
“‘We need to test facial recognition on a planetary scale to enable practical applications—testing on a larger scale lets you discover the flaws and successes of recognition algorithms,’ said Ira Kemelmacher-Shlizerman, a UW assistant professor of computer science and the project’s principal investigator. ‘We can’t just test it on a very small scale and say it works perfectly.’
“The UW team first developed a dataset with one million Flickr images from around the world that are publicly available under a Creative Commons license, representing 690,572 unique individuals. Then they challenged facial recognition teams to download the database and see how their algorithms performed when they had to distinguish between a million possible matches.
“Google’s FaceNet showed the strongest performance on one test, dropping from near-perfect accuracy when confronted with a smaller number of images to 75 percent on the million person test. A team from Russia’s N-Tech.Lab came out on top on another test set, dropping to 73 percent.”
Testing against the larger dataset revealed performance differences among facial recognition algorithms that had been masked at smaller scales: some algorithms that had previously performed well on small-scale tests, with accuracy rates surpassing 95 percent, fell to as low as 33 percent on the million-image dataset. Algorithms that were trained on larger datasets to begin with tended to outperform those trained on smaller datasets when tested on the MegaFace collection.
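To make the evaluation concrete: identification tests of this kind typically embed each face as a vector and ask whether a probe photo's nearest neighbor in a large gallery (padded with "distractor" faces) belongs to the same person. The sketch below is an illustrative rank-1 accuracy computation under that general setup, not the exact MegaFace protocol; the embeddings, identity labels, and dimensions are invented for the example.

```python
import numpy as np

def rank1_accuracy(probes, probe_ids, gallery, gallery_ids):
    """Fraction of probe embeddings whose most similar gallery
    embedding (by cosine similarity) shares the probe's identity."""
    # L2-normalize rows so a dot product equals cosine similarity.
    p = probes / np.linalg.norm(probes, axis=1, keepdims=True)
    g = gallery / np.linalg.norm(gallery, axis=1, keepdims=True)
    sims = p @ g.T                 # shape: (num_probes, num_gallery)
    nearest = sims.argmax(axis=1)  # best-matching gallery index per probe
    return float(np.mean(gallery_ids[nearest] == probe_ids))

# Toy gallery: three known identities plus two distractor faces.
rng = np.random.default_rng(0)
gallery = rng.normal(size=(5, 128))          # 128-dim embeddings (arbitrary)
gallery_ids = np.array([0, 1, 2, 98, 99])    # 98 and 99 are distractors
# Probes: slightly noisy copies of the first three gallery embeddings.
probes = gallery[:3] + 0.05 * rng.normal(size=(3, 128))
probe_ids = np.array([0, 1, 2])
print(rank1_accuracy(probes, probe_ids, gallery, gallery_ids))
```

As the gallery grows from a handful of distractors to a million, the chance that some unrelated face lands closer than the true match rises, which is why accuracy that looks near-perfect on small galleries can drop sharply at scale.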
The team, which also includes UW CSE professor Steve Seitz, Master’s student Aaron Nech, undergraduate student Evan Brossard, and former student Daniel Miller, will present its results at the IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2016) in Las Vegas next week.
In addition to the challenge—which is ongoing—the researchers are building a large-scale training dataset that will incorporate multiple photos of half a million identities. This new dataset will benefit researchers who previously had to rely on smaller sample sizes to train their algorithms. The project could help enable new applications for facial recognition technology, such as face-based security features for mobile devices and the ability for law enforcement to quickly and accurately identify individuals captured on video surveillance footage for public safety purposes.
Read the full news release here and the research paper here. Visit the MegaFace website here, and check out articles in TechCrunch, The Atlantic, IEEE Spectrum, and Silicon Republic.