Interpretability with GradCAM

Deep learning interpretability model with GradCAM

What is GradCAM ?

GradCAM is an acronym for Gradient-weighted Class Activation Mapping. It’s a rather ingenious way to use the inner workings of a convolutional neural network (CNN) to better understand what features it managed to learn.

Why it’s awesome

GradCAM helps verify that a model is robust to adverse conditions. This is especially important for safety reasons, such as in the context of self-driving cars, where misreading a sign might lead to life-threatening accidents. It’s also paramount for making sure the training data was unbiased. A model could, for instance, classify correctly based solely on whether the sun is out: if all the training pictures for one class were taken outside on the same sunny day, then the lighting alone might discriminate them from another class whose pictures were taken on a cloudy day.

It can also help deep learning practitioners and their models be taken more seriously by experts from other industries. Imagine a model whose task is to diagnose lung cancer from X-ray radiographs. It takes a doctor years of specialised training to be able to make a relevant interpretation, but only a few hours for a deep neural network to outperform them. Anybody would get rather skeptical in this context, especially considering the cost of education and medical care in some countries. Fortunately, interpretability techniques can help open the AI black box and unearth why the model predicts what it does, based on which parts of the input image.

Llama through deep learning interpretability with GradCAM
This fictional creature is a Llamasticot: half llama, half asticot (French for maggot, a type of worm). As you can see, the model generalised well enough not to be too troubled by it.

A way for experts to criticise the model

When the model simply outputs the probability of there being lung cancer, it’s hard to argue about what it might be doing wrong. However, when it highlights what caused said prediction, it’s far easier to get an idea of the criteria it learnt to take into account. An expert might then be able to interpret these better than the DL practitioner, and provide good insight as to why the model might be biased and unable to generalise.

A cheap yet effective way to do it

Creating the kind of heatmap displayed above is something one could achieve through painful annotation and then training a model such as U-Net. I once helped a grad student in biology to train such a network to infer the surface of a cell under a special type of microscope for her PhD thesis.

Deep learning applied to biology

It was a tedious process because she essentially had to create the whole dataset by hand. Then she had to train a custom version of U-Net from scratch.

In contrast, GradCAM offers a cheap way to get feature-activation heatmaps at an arbitrary level of the network. Indeed, it just leverages an already trained CNN: it simply computes the gradient of the final class prediction with respect to the activations of an arbitrarily deep layer. Choosing that layer gives us granularity as to which type of feature we want to pay attention to.
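The aggregation itself fits in a few lines. Here is a minimal NumPy sketch, assuming we have already extracted the chosen layer’s activations and the gradient of the class score with respect to them (the function name and array shapes are mine, for illustration, not from my original implementation):

```python
import numpy as np

def gradcam_heatmap(feature_maps, gradients):
    """Aggregate conv feature maps into a GradCAM heatmap.

    feature_maps: (H, W, K) activations of the chosen conv layer
    gradients:    (H, W, K) gradients of the class score w.r.t. those maps
    """
    # Channel weights: global-average-pool the gradients (one weight per filter)
    weights = gradients.mean(axis=(0, 1))                       # shape (K,)
    # Weighted sum of the feature maps over the channel axis
    cam = np.tensordot(feature_maps, weights, axes=([2], [0]))  # shape (H, W)
    # ReLU: keep only features with a positive influence on the class
    cam = np.maximum(cam, 0)
    # Normalise to [0, 1] for display
    if cam.max() > 0:
        cam /= cam.max()
    return cam
```

The resulting heatmap has the (coarse) spatial resolution of the chosen layer, so it is typically upsampled to the input image size before being overlaid.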

For the llama example above, I didn’t have to train a llama segmentation model, so no need for thousands of painstakingly annotated llama images. Instead, I simply leveraged a (now widely outdated) network called VGG19, which was pretrained on ImageNet, a collection of millions of images spanning a thousand different classes.
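To give an idea of the wiring, here is a hedged sketch of how one might do this in modern Keras with `tf.GradientTape` (not my original 2019 code). I use `weights=None` and a random stand-in image so the snippet runs without downloading anything; in practice you would pass `weights="imagenet"` and a real photo. `block5_conv4` is VGG19’s deepest convolutional layer:

```python
import numpy as np
import tensorflow as tf

# weights=None avoids downloading the ~80 MB ImageNet weights for this sketch;
# use weights="imagenet" for meaningful heatmaps.
model = tf.keras.applications.VGG19(weights=None)
last_conv = model.get_layer("block5_conv4")  # deepest conv layer of VGG19

# A second model exposing both the conv activations and the class scores
grad_model = tf.keras.Model(model.inputs, [last_conv.output, model.output])

img = np.random.rand(1, 224, 224, 3).astype("float32")  # stand-in image

with tf.GradientTape() as tape:
    conv_out, preds = grad_model(img)
    top_class = int(tf.argmax(preds[0]))
    class_score = preds[:, top_class]            # score of the top class

grads = tape.gradient(class_score, conv_out)     # d(score)/d(activations)
weights = tf.reduce_mean(grads, axis=(0, 1, 2))  # one weight per filter
cam = tf.nn.relu(tf.reduce_sum(conv_out[0] * weights, axis=-1))

heatmap = cam.numpy()                            # (14, 14), ready to upsample
if heatmap.max() > 0:
    heatmap /= heatmap.max()
```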

Custom Keras implementation

In 2019, I wrote a Keras implementation as well as an article on my blog, where I went into the details of how to compute the gradient necessary to aggregate the filter activations and finally display the above heatmap. The source code, with lots and lots of comments, is still available on my GitHub repo; make sure to check it out!