Hacker News
Neural Network Follies (2003) (fraser.name)
29 points by alisey on May 4, 2022 | 4 comments


The "last modified" date is 2003, but the story supposedly dates back to 1998. Either way, its criticism of neural networks has not aged well:

> Any automatically trained net with more than a few dozen neurons is virtually impossible to analyze and understand.

While getting neural networks to explain themselves is still an open area of research, the solution to this specific case is quite simple (and, I think, has been used in practice):

You flip random bits/pixels of the input image and have the neural network tell you whether the image looks more or less like a tank. Keep the changes that increase the score and repeat; eventually you reach what the network thinks is a "maximally tank depicting" image. In this case you would see a picture of grey skies and no tanks, which would tell you everything you needed to know.
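In code it might look roughly like this (a minimal sketch in Python/NumPy; tank_score is a hypothetical stand-in for the trained network's "tank" output, not anything from the article):

    import numpy as np

    def tank_score(image):
        """Stand-in for the trained network's 'tank' probability (hypothetical)."""
        raise NotImplementedError

    def maximize_tank_score(image, steps=10000, seed=0):
        rng = np.random.default_rng(seed)
        current = image.copy()
        best = tank_score(current)
        for _ in range(steps):
            candidate = current.copy()
            # Flip one randomly chosen pixel (binary image assumed for simplicity).
            y = rng.integers(candidate.shape[0])
            x = rng.integers(candidate.shape[1])
            candidate[y, x] = 1 - candidate[y, x]
            score = tank_score(candidate)
            if score > best:  # keep the flip only if the image looks "more tank-like"
                current, best = candidate, score
        return current  # approximates the "maximally tank depicting" image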

Anyway, it's a great anecdote, even if it is apocryphal, and I've used it in conversation before because it is a really accessible way to get people to think about the problems of AI training.


You don't need to randomly flip pixels in the input image. You can simply backpropagate the loss to the input and modify the pixels along the gradient to minimize the loss.
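Roughly like this (a sketch assuming a trained PyTorch model `net` with a single "tank" logit; the names are hypothetical):

    import torch

    def optimize_input(net, image, steps=200, lr=0.1):
        x = image.clone().requires_grad_(True)     # treat the pixels as parameters
        optimizer = torch.optim.SGD([x], lr=lr)
        for _ in range(steps):
            optimizer.zero_grad()
            loss = -net(x.unsqueeze(0)).squeeze()  # minimize the negative tank logit
            loss.backward()                        # backpropagate to the input
            optimizer.step()
            with torch.no_grad():
                x.clamp_(0.0, 1.0)                 # keep pixels in a valid range
        return x.detach()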


Both approaches are correct. The parent is talking about ablation-based importance attribution, whereas you're using gradients to assign importance.

Your approach will compute pixel importance in a single step, whereas ablation-based approaches generally need many passes.
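For comparison, a rough sketch of both (same hypothetical PyTorch `net` as above): the gradient map falls out of one backward pass, while the ablation map needs one forward pass per perturbed pixel.

    import torch

    def gradient_saliency(net, image):
        x = image.clone().requires_grad_(True)
        net(x.unsqueeze(0)).squeeze().backward()   # one backward pass
        return x.grad.abs()                        # per-pixel importance

    def ablation_saliency(net, image):
        with torch.no_grad():
            base = net(image.unsqueeze(0)).squeeze()
            importance = torch.zeros_like(image).view(-1)
            flat = image.view(-1)
            for idx in range(flat.numel()):        # one forward pass per pixel
                perturbed = flat.clone()
                perturbed[idx] = 0.0               # "ablate" the pixel
                out = net(perturbed.view_as(image).unsqueeze(0)).squeeze()
                importance[idx] = base - out       # drop in score = importance
        return importance.view_as(image)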


> Your approach will compute pixel importance in a single step

How do you address the vanishing gradient problem?



