MIT-IBM researchers have found that there is a trade-off between accuracy and robustness, and that a single-minded pursuit of accuracy can get developers and users into trouble.

Junko Yoshida explains the trade-off in this article from EETimes:

As a neural network is taught more images, it memorizes what it needs to classify. “But we don’t necessarily expect it to be robust,” said (researcher Pin-Yu) Chen. “The higher the accuracy is, the more fragile it could get.”

For autonomous vehicles, in which safety is paramount, verifying classification robustness is critical. Techniques available to date have generally been limited to certifying small-scale, simple neural-network models. In contrast, the joint IBM-MIT team found a way to certify robustness on the general CNN architectures in wide use today.
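
The article does not spell out what such a certificate states, but it is commonly formalized as a guarantee that no perturbation within a fixed budget can change the model's decision. Using standard notation that is assumed here rather than taken from the article (with $f$ the classifier's output scores, $x$ an input image, and $\epsilon$ the maximum allowed change per pixel):

$$\arg\max_k f_k(x+\delta) \;=\; \arg\max_k f_k(x) \qquad \text{for all } \delta \text{ with } \|\delta\|_\infty \le \epsilon.$$

Certification methods prove that this condition holds for some radius $\epsilon$, rather than merely failing to find an attack within it.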

The team’s proposed framework can “handle various architectures including convolutional layers, max-pooling layers, batch normalization layers, residual blocks, as well as general activation functions,” according to Chen. By allowing a perturbation of bounded magnitude in each pixel, said Chen, “We have created verification tools optimized for CNNs.” The team’s goal is “to assure you that adversarial attacks can’t alter AI’s prediction.”
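
To make the idea concrete, here is a minimal sketch of certification under a bounded per-pixel perturbation, using simple interval bound propagation on a toy fully connected network. This is not the MIT-IBM team's algorithm (their framework derives much tighter layer-wise bounds and handles convolutional, pooling, batch-normalization, and residual layers); the network, random weights, and epsilon below are assumptions made purely for illustration.

```python
# Minimal sketch: certified robustness via interval bound propagation (IBP).
# This is NOT the MIT-IBM framework (which computes much tighter layer-wise
# bounds for full CNNs); it only illustrates what a certificate means.
# The toy network, random weights, and epsilon are assumptions for illustration.

import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def ibp_dense(lower, upper, W, b):
    """Propagate elementwise bounds [lower, upper] through the layer W @ x + b."""
    center = (upper + lower) / 2.0
    radius = (upper - lower) / 2.0
    out_center = W @ center + b
    out_radius = np.abs(W) @ radius          # worst-case spread of a linear map
    return out_center - out_radius, out_center + out_radius

def certify(x, epsilon, layers):
    """Return (predicted_class, certified): certified is True when every input
    within an L-infinity ball of radius epsilon keeps the same prediction."""
    # Exact forward pass for the clean prediction.
    act = x
    for W, b in layers[:-1]:
        act = relu(W @ act + b)
    W_out, b_out = layers[-1]
    pred = int(np.argmax(W_out @ act + b_out))

    # Propagate interval bounds with every pixel perturbed by at most epsilon.
    lower, upper = x - epsilon, x + epsilon
    for W, b in layers[:-1]:
        lower, upper = ibp_dense(lower, upper, W, b)
        lower, upper = relu(lower), relu(upper)   # ReLU is monotonic
    lower, upper = ibp_dense(lower, upper, W_out, b_out)

    # Certified if the worst-case score of the predicted class still beats
    # the best-case score of every other class.
    rival = max(upper[j] for j in range(len(upper)) if j != pred)
    return pred, bool(lower[pred] > rival)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy network: 8 "pixels" -> 16 hidden units -> 3 classes, random weights.
    layers = [(rng.normal(size=(16, 8)), rng.normal(size=16)),
              (rng.normal(size=(3, 16)), rng.normal(size=3))]
    x = rng.normal(size=8)
    pred, certified = certify(x, epsilon=0.01, layers=layers)
    print(f"predicted class {pred}, certified at eps=0.01: {certified}")
```

If `certified` comes back `True`, no perturbation within the epsilon ball, however it is constructed, can alter the prediction, which is the kind of attack-independent guarantee Chen describes.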

Chen also pointed out that adversarial examples can come from anywhere. They exist in the physical world, in digital space, and across domains ranging from images and video to speech and data analysis. The newly developed certification framework can be applied in a variety of situations. In essence, it is designed to provide “attack-independent and model-agnostic” metrics, he explained.