24 Sep 2019 | Research article | Intelligent and Autonomous Systems
On the Brittleness of Deep Learning Models
Jérôme Rony is a master’s student at ÉTS as part of the agreement between ÉTS and École nationale supérieure des arts et métiers.





Deep learning models are widely used to solve computer vision problems, from facial recognition to object classification and scene understanding. However, research on adversarial examples has shown that these models are brittle: small, often imperceptible changes to an image can induce misclassifications, which has security implications for a wide range of image-processing systems. We propose an efficient algorithm to generate small perturbations that cause misclassifications in deep learning models. Our method achieves state-of-the-art performance while requiring much less computation than previous algorithms (a speedup of roughly 100×), which opens the possibility of using it at larger scales and, in particular, to design defense mechanisms.
Keywords: Machine Learning, Security, Deep Learning, Computer Vision, Adversarial Attacks
Adversarial Attacks
Deep Neural Networks (DNNs) have achieved state-of-the-art performance for a wide range of computer vision applications. However, these DNNs are susceptible to active adversaries. Most notably, they are susceptible to adversarial attacks, in which small changes, often imperceptible to a human observer, cause a misclassification (i.e. an error) [1, 2]. This has become a major concern as more and more DNNs are deployed in the real world and, soon, in safety-critical environments such as autonomous cars, drones and medical systems.
In a typical image classification scenario, we use a model, that is, a function that maps an image (a vector of pixel values between 0 and 1) to a set of scores. Each of these scores represents the likelihood that the object in the image belongs to a certain class (e.g. cat, dog, car, plane, person, table, etc.). The highest-scoring class is the one predicted to be present in the image. In the image on the left of the figure, a fairly good model (Inception V3 trained on ImageNet [3]) predicts with high confidence that the image contains a dog, and even determines the breed: Curly-Coated Retriever. We expect the model to be robust to subtle changes in the image; that is, if we slightly modify the values of some pixels, the prediction should not change significantly. However, if we add a perturbation (centre) to the original image, we obtain a new image (right) that is classified, with high confidence, as a Microwave. The security implications of this behaviour are obvious: the model is not robust to carefully crafted perturbations that would not affect the decision of a human observer. See https://adversarial-ml-tutorial.org/ for more technical details.

Left: Original Image: Classified as Curly-Coated Retriever
Centre: Perturbation (amplified ~40 times to be visible)
Right: Perturbed Image: Classified as Microwave
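To make this setup concrete, the following short PyTorch sketch shows how a pretrained Inception V3 model from torchvision scores an image and picks the highest-scoring class. It is an illustration of the classification pipeline described above, not code from the paper; the file name dog.jpg is a placeholder, and the pretrained-weights argument may differ slightly depending on the torchvision version.

import torch
from torchvision import models, transforms
from PIL import Image

# Placeholder file name: any RGB image will do
image = Image.open("dog.jpg").convert("RGB")

# Inception V3 expects 299x299 inputs; ToTensor scales pixel values to [0, 1]
# and Normalize applies the standard ImageNet statistics used during training
preprocess = transforms.Compose([
    transforms.Resize(299),
    transforms.CenterCrop(299),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])
x = preprocess(image).unsqueeze(0)  # add a batch dimension

model = models.inception_v3(pretrained=True).eval()

with torch.no_grad():
    scores = model(x)                         # one score per ImageNet class
    probs = scores.softmax(dim=1)             # turn scores into probabilities
    confidence, predicted = probs.max(dim=1)  # the highest-scoring class is the prediction

print(f"Predicted class index {predicted.item()} with confidence {confidence.item():.2f}")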
Perturbation Generation
Being able to generate these perturbations efficiently, to force an error from a model, is important for two main reasons. First, most current work in deep learning is evaluated on the average case: a set of images is held out during training (i.e. development) of the model to evaluate its final performance. However, this set might not evaluate the model in the worst-case scenario, which is a problem for safety-critical applications. Second, having an efficient way to generate these perturbations means that we can use them during the training phase, a procedure known as adversarial training, to obtain a more robust model (see the sketch below).
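As a sketch of that second point, adversarial training simply replaces the clean images in each training batch with perturbed versions before the usual gradient update. The generate_perturbation argument below is a placeholder for any attack, such as the one described next; the loop is generic PyTorch training code, not the exact procedure used in the paper.

import torch.nn.functional as F

def adversarial_training_step(model, optimizer, x, labels, generate_perturbation):
    # Generate perturbed copies of the batch with the model in evaluation mode
    model.eval()
    x_adv = generate_perturbation(model, x, labels)

    # Standard supervised update, but computed on the perturbed images
    model.train()
    optimizer.zero_grad()
    loss = F.cross_entropy(model(x_adv), labels)
    loss.backward()
    optimizer.step()
    return loss.item()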
An algorithm that generates such perturbations is called an adversarial attack. In this work, the objective of the adversarial attack is to produce a minimal perturbation that causes a misclassification for a given image and model. The size of the perturbation is measured as the Euclidean (L2) norm of the change in pixel values. As a reference, the perturbation added to the image in the figure has a norm of 0.7 and is imperceptible to a human observer. The performance of an adversarial attack is measured by the average size of the perturbation and the total run-time, on a given hardware configuration, over a set of 1,000 images. For both measures, lower is better. Our algorithm (called DDN, for Decoupling Direction and Norm) outperforms the previous state-of-the-art approach by a large margin on both counts.
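The exact algorithm is described in the conference article cited below; the following PyTorch sketch only illustrates the idea behind DDN under simplifying assumptions. At each iteration it takes a gradient step to increase the classification loss (the direction), then rescales the perturbation to a norm budget that shrinks when the current image already fools the model and grows otherwise (the norm). The function name and the step-size and budget parameters are illustrative defaults, not the settings used in the paper.

import torch
import torch.nn.functional as F

def ddn_attack(model, x, labels, steps=100, alpha=0.05, gamma=0.05):
    # x is a batch of images with pixel values in [0, 1]; the attack is untargeted
    delta = torch.zeros_like(x, requires_grad=True)         # perturbation to optimize
    eps = torch.full((x.shape[0],), 1.0, device=x.device)   # per-image norm budget

    for _ in range(steps):
        logits = model(x + delta)
        loss = F.cross_entropy(logits, labels)
        grad, = torch.autograd.grad(loss, delta)

        with torch.no_grad():
            # Direction: step along the normalized gradient to increase the loss
            g_norm = grad.flatten(1).norm(dim=1).clamp_min(1e-12).view(-1, 1, 1, 1)
            delta += alpha * grad / g_norm

            # Norm: shrink the budget if the image is already misclassified, grow it otherwise
            already_adv = logits.argmax(dim=1) != labels
            eps = torch.where(already_adv, eps * (1 - gamma), eps * (1 + gamma))

            # Rescale the perturbation so its Euclidean norm equals the budget,
            # then keep the perturbed image inside the valid [0, 1] pixel range
            d_norm = delta.flatten(1).norm(dim=1).clamp_min(1e-12).view(-1, 1, 1, 1)
            delta *= eps.view(-1, 1, 1, 1) / d_norm
            delta.copy_(torch.clamp(x + delta, 0.0, 1.0) - x)

    perturbation_size = delta.detach().flatten(1).norm(dim=1)  # the Euclidean norm discussed above
    return (x + delta).detach(), perturbation_size

The Euclidean norm returned at the end is the "size of the perturbation" used to compare attacks: a better attack reaches a misclassification with a smaller norm and in less time.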
Conclusion
We created an adversarial attack that generates small perturbations causing misclassifications in DNNs much more efficiently than the previous state-of-the-art approach. This algorithm is especially useful for evaluating the robustness of current and future machine learning models, which are increasingly used in real-world applications. Its efficiency also makes it possible to use it in the design of more effective defense mechanisms.
Additional Information
For more information on this research, please refer to the following conference article:
Rony, Jérôme; Hafemann, Luiz G.; Oliveira, Luiz S.; Ben Ayed, Ismail; Sabourin, Robert; Granger, Éric. 2019. “Decoupling Direction and Norm for Efficient Gradient-Based L2 Adversarial Attacks and Defenses”, presented at the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 16-20, Long Beach, pp. 4322-4330.

Jérôme Rony
Jérôme Rony is a master’s student at the ÉTS LIVIA laboratory. His research focuses on weakly supervised deep machine learning models for image classification and object localization applied to medical imaging.
Program : Automated Manufacturing Engineering; Healthcare Technology
Research laboratories : LIVIA – Imaging, Vision and Artificial Intelligence Laboratory

Luiz Gustavo Hafemann
Luiz Gustavo Hafemann earned his PhD from ÉTS in 2019, where he worked with deep learning models for handwritten signature verification. He is currently a researcher at Sportlogiq, applying computer vision models for sports analytics.
Program : Automated Manufacturing Engineering
Research laboratories : LIVIA – Imaging, Vision and Artificial Intelligence Laboratory

Ismail Ben Ayed
Ismail Ben Ayed is a professor in the Systems Engineering Department at ÉTS. His research is at the crossroads of optimization, machine learning and medical image analysis.
Program : Automated Manufacturing Engineering
Research laboratories : LIVIA – Imaging, Vision and Artificial Intelligence Laboratory

Éric Granger
Éric Granger is a professor in the Systems Engineering Department at ÉTS. His research focuses on machine learning, pattern recognition, computer vision, information fusion, and adaptive and intelligent systems.
Program : Automated Manufacturing Engineering
Research chair : Research Chair in Artificial Intelligence and Digital Health for Health Behaviour Change
Research laboratories : LiNCS – Cognitive and Semantic Interpretation Engineering Laboratory; LIVIA – Imaging, Vision and Artificial Intelligence Laboratory

