When it comes to artificial intelligence, the debate can heat up rather quickly, usually between a faction arguing defensively that machines will not reach human capabilities any time soon, and a faction arguing that the age of AI is almost here, if it hasn’t arrived already.
This post is not meant to be an introduction to the above arguments (I may write a more detailed post later), but to set out some considerations on how misleading a crude comparison between human and machine perception can be if the full context is not taken into account.
Deep Neural Networks (DNNs) are today considered state-of-the-art in many areas of artificial intelligence, especially computer vision, so we might as well treat them as an important benchmark for this debate. So how do they relate to human vision? Are they on a par with our own capabilities? It turns out the answer is not exactly simple.
An interesting article by Christian Szegedy and colleagues (1) showed that DNNs have counterintuitive properties: they seem to be very good at generalization, even better than humans, yet they can be easily fooled with adversarial examples. The authors hypothesized that one possible explanation is the extremely low probability of such adversarial sets being observed in a test set, while (like the rational numbers) being dense enough to be found near virtually every test case.
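The mechanics of such an attack can be sketched with a toy model. The snippet below is a minimal illustration of the gradient-sign idea behind many adversarial perturbations, using a hand-wired two-input logistic "classifier" instead of a real DNN; the weights, input, and step size are all made up for illustration, but the principle is the same: a small, targeted nudge along the gradient flips the prediction.

```python
import numpy as np

# Hypothetical stand-in for a classifier: logistic regression on a 2-D input.
# A real DNN has millions of weights, but the adversarial trick is identical.
w = np.array([1.0, -1.5])
b = 0.2

def predict(x):
    """Return P(class = 1) for input x."""
    return 1.0 / (1.0 + np.exp(-(w @ x + b)))

x = np.array([0.5, 0.1])           # original input, classified as class 1
p_clean = predict(x)

# For this linear score, the gradient w.r.t. the input is just w.
# Gradient-sign step: move the input against the score, bounded by epsilon,
# so every coordinate changes by at most a small, hard-to-notice amount.
epsilon = 0.3
x_adv = x - epsilon * np.sign(w)

p_adv = predict(x_adv)
print(p_clean > 0.5, p_adv > 0.5)  # the predicted class flips
```

Bounding the perturbation per coordinate is what makes adversarial images look unchanged to us: each "pixel" moves only a little, yet the classifier's decision changes completely.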


Many years have passed since the first pioneering works on adversarial classification (2,3), and today many adversarial examples are generated with Evolutionary Algorithms (EAs) that evolve a population of images. With these algorithms, it is interesting to note that state-of-the-art neural networks can be tricked into “recognizing”, with almost 100% confidence, images that have been evolved to be totally unrecognizable to humans as natural objects (4).

Using evolutionary algorithms to produce images that match DNN classes can yield a wide variety of different images, and looking at these, the authors interestingly note:
“For many of the images produced, one can begin to identify why the DNN believes the image is of that class once the class label is given. This is because evolution only needs to produce features that are unique or discriminative for a class, rather than producing an image that contains all of a class’s typical features.”
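The evolutionary loop itself is simple. Below is a toy sketch of the mutate–select cycle: the "genome" is just eight numbers standing in for pixels, and the scoring function is a hypothetical stand-in for a DNN's class confidence that, echoing the quote above, only cares about one discriminative feature (here, the mean value being near 0.7) rather than anything a human would recognize.

```python
import random

random.seed(0)

def confidence(genome):
    # Hypothetical "classifier": rewards a single discriminative feature
    # (mean near 0.7), not anything resembling a typical image of a class.
    mean = sum(genome) / len(genome)
    return 1.0 - abs(mean - 0.7)

def mutate(genome, rate=0.2):
    # Randomly perturb each "pixel", clamped to [0, 1].
    return [min(1.0, max(0.0, g + random.uniform(-rate, rate))) for g in genome]

# Start from random noise and evolve: keep the fittest, breed mutants.
population = [[random.random() for _ in range(8)] for _ in range(20)]
for generation in range(100):
    population.sort(key=confidence, reverse=True)
    parents = population[:5]
    population = parents + [mutate(random.choice(parents)) for _ in range(15)]

best = max(population, key=confidence)
print(round(confidence(best), 3))  # close to 1.0, yet "best" is still noise
```

The evolved genome scores near-perfect "confidence" while remaining meaningless noise to a human observer, which is exactly the failure mode the paper describes.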
These examples demonstrate how AI recognition can be intentionally tricked into not recognizing images that are obvious to us (false negatives), and into recognizing with high confidence something that to us is obviously not there (false positives). There is a lot of literature on this topic (5–7), which can also be quite important from a cybersecurity perspective (8).
However, we must stress that human recognition has its own shortcomings: there are plenty of optical illusions to prove it, including the famous white-and-gold versus blue-and-black dress, which generated much debate.


There are cases where artificial recognition can consistently outperform humans (9,10), such as fine-grained intra-class recognition (for example, dog breeds, snakes, etc.). It also seems that humans may be even more susceptible than AI when training data is insufficient, that is, when the human has not had enough exposure to that kind of class.
Human perception is a tricky beast. It seems extremely good to us because it can be quite robust and adaptive, but as we have just seen, it depends heavily on prior knowledge: we too need training (sometimes lifelong training) to perform with some degree of success. Sure enough, we also have some innate categories that we are very adept at recognizing from birth (for example, human faces of our own ethnicity), but guess what? We are susceptible to being fooled there too, if we just change the lighting (11,12).

Furthermore, we depend on aspects of reality that are not objective at all, such as colors. Everyone knows that colors depend on the wavelengths of light reflected from objects, but we often forget that what really makes colors what they are to us is our brain’s interpretation. In short, colors do not exist in nature; they are just a small portion of light that our brain encodes into specific sensations. We don’t see infrared, ultraviolet, or gamma rays as colors, even though they are definitely there, and we also see colors that don’t really “exist” in the spectrum, like brown.
Our perception is strongly linked not only to our neurophysiology but also to our cultural context. A now famous Namibian tribe, the Himba, have dozens of terms for green but no word for blue, and apparently their members cannot tell blue from green at all, while they are much better than we are at detecting very slight differences between greens (13,14). Furthermore, very recent studies have shown that humans may be just as prone as machines to being fooled by certain kinds of adversarial imagery (9,15,16).
The differing deficiencies of human and artificial image recognition suggest that the two processes are very different. Asking whether human recognition is better or worse than machine recognition is a very poorly posed problem, as we constantly neglect to take into account the knowledge and training we ourselves need to perform any recognition.
References
- (1) C. Szegedy, W. Zaremba, I. Sutskever, J. Bruna, D. Erhan, I.J. Goodfellow, R. Fergus, Intriguing Properties of Neural Networks, in: 2nd International Conference on Learning Representations, ICLR 2014, Banff, AB, Canada, April 14-16, 2014, Conference Track Proceedings, 2014.
- (2) N. Dalvi, P. Domingos, Mausam, S. Sanghai, D. Verma, Adversarial Classification, in: Proceedings of the 2004 ACM SIGKDD International Conference on Knowledge Discovery and Data Mining – KDD ’04, ACM Press, 2004. doi:10.1145/1014052.1014066.
- (3) D. Lowd, C. Meek, Adversarial Learning, in: Proceedings of the Eleventh ACM SIGKDD International Conference on Knowledge Discovery in Data Mining – KDD ’05, ACM Press, 2005. doi:10.1145/1081870.1081950.
- (4) A. Nguyen, J. Yosinski, J. Clune, Deep Neural Networks Are Easily Fooled: High Confidence Predictions for Unrecognizable Images, arXiv e-prints. (2014) arXiv:1412.1897.
- (5) B. Biggio, F. Roli, Wild Patterns: Ten Years After the Rise of Adversarial Machine Learning, (2017).
- (7) A. Krizhevsky, I. Sutskever, G.E. Hinton, ImageNet Classification with Deep Convolutional Neural Networks, Commun. ACM. (2017) 84–90. doi:10.1145/3065386.
- (11) C. Hong Liu, C.A. Collin, A.M. Burton, A. Chaudhuri, Lighting Direction Affects Recognition of Untextured Faces in Photographic Positives and Negatives, Vision Research. (1999) 4003–4009. doi:10.1016/s0042-6989(99)00109-1.
- (12) A. Missinato, Face Recognition with Photographic Negatives: The Role of Spatial Frequencies and Facial Specificity, University of Aberdeen, 1999.
- (15) G.F. Elsayed, S. Shankar, B. Cheung, N. Papernot, A. Kurakin, I. Goodfellow, J. Sohl-Dickstein, Adversarial Examples that Fool Both Computer Vision and Time-Limited Humans, (2018).
- (16) E. Watanabe, A. Kitaoka, K. Sakamoto, M. Yasugi, K. Tanaka, Illusory Motion Reproduced by Deep Neural Networks Trained for Prediction, Front. Psychol. (2018). doi:10.3389/fpsyg.2018.00345.
Andrea has been working in IT for almost 20 years, covering just about everything from development to business analysis to project management.
Today we can say that he is a carefree gnome, passionate about Neurosciences, Artificial Intelligence and photography.