A Beginner’s Guide to Out-of-Distribution Detection

Let’s say you had a classifier that could distinguish pictures of dogs and cats. Sounds simple enough, right? It’s the kind of introductory computer vision project you might do right after training your first MLP to identify digits in MNIST. With the right dataset and the right model (I’ll leave the details up to you for now), you could probably get a decent separation: high confidence scores, a high AUROC, the whole nine yards. Here’s the basic question at the heart of the matter:

What do you think would happen if you fed your classifier something other than a dog or a cat?

Perhaps unsurprisingly, most models don’t handle this well. The typical behavior is to mistake that new thing for either a dog or a cat with high confidence [1]. This kind of mistake can have expensive consequences. Recently, a Tesla Model Y using its “SmartSummon” feature crashed into a multi-million-dollar private jet. While the exact cause of the crash is unknown, I think we can be reasonably sure what happened: either the cameras on the car simply didn’t see the huge jet in broad daylight, or the self-driving model was never trained to identify private jets, misidentified it as something else, and plowed into it.

An unscientific, illustrative example of an OOD error: at training time, the model learns to separate dogs and cats; at test time, it correctly predicts both dogs and cats, but also confidently (and wrongly) predicts a hairdryer to be a cat.

So let’s say you don’t want your Tesla to crash into your private jet. How can we prevent something like this? This is the central question of OOD research: how do we train models that can recognize when they are being fed an unknown class?
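Before we get to fancier answers, it helps to know the most common baseline: score each input by the model’s maximum softmax probability (MSP), and flag anything below a threshold as OOD. Here’s a minimal PyTorch sketch of that idea, assuming you already have a trained classifier that outputs logits; the 0.9 threshold is a made-up placeholder you would tune on held-out data.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def msp_score(model, x):
    """Maximum softmax probability for each input in the batch."""
    logits = model(x)                                  # (batch, num_classes)
    return F.softmax(logits, dim=-1).max(dim=-1).values

def flag_ood(model, x, threshold=0.9):
    """Flag inputs whose top confidence falls below the (placeholder) threshold."""
    return msp_score(model, x) < threshold             # True = treat as OOD
```

The catch, as the hairdryer example suggests, is that plain classifiers are often just as confident on OOD inputs as on in-distribution ones, which is why the field doesn’t stop here.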

Some Basic Terms

To immerse yourself in a field of research, it’s important to speak the language of the locals. To help you out, here’s a handy guide:

  • In-distribution: The set of classes known at training time, which the model is trained to distinguish, identify, or separate. Alternatively, an instance of one of these classes.

  • Out-of-distribution (OOD): Everything else in the world that is not in-distribution, either classes or instances of those classes. These come in two flavors (a la Donald Rumsfeld):

    • Known Unknowns: Out-of-distribution examples or classes given at training time (see outlier exposure below).

    • Unknown Unknowns: Out-of-distribution examples or classes seen only at evaluation time.

  • Outlier Exposure: The process of adapting a model by including out-of-distribution examples at training time, with the goal of making it robust to OOD errors. A minimal sketch of the training objective appears just after this list.

  • Fine-grained OOD/Coarse-Grained OOD: For a detailed breakdown, see “Introducing TernaryMixOE”
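As promised above, here is a minimal sketch of what outlier exposure looks like in practice, following the objective described in [1]: keep the usual cross-entropy loss on the in-distribution batch, and add a term that pushes the model’s predictions on the known-unknown outliers toward the uniform distribution. The model, the batches, and the weighting factor `lam` are placeholders; treat `lam` as a hyperparameter (0.5 is a common choice).

```python
import torch.nn.functional as F

def outlier_exposure_loss(model, in_x, in_y, out_x, lam=0.5):
    # Standard supervised cross-entropy on the in-distribution batch.
    ce_loss = F.cross_entropy(model(in_x), in_y)

    # Cross-entropy between the outlier predictions and the uniform
    # distribution over classes (the mean negative log-probability),
    # which discourages confident predictions on known unknowns.
    uniform_loss = -F.log_softmax(model(out_x), dim=-1).mean()

    # lam weights how strongly the outlier term pulls on training.
    return ce_loss + lam * uniform_loss
```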

Sources:

[1] D. Hendrycks, M. Mazeika, and T. Dietterich, “Deep anomaly detection with outlier exposure,” ICLR 2019, arXiv:1812.04606 [cs, stat]. [Online]. Available: http://arxiv.org/abs/1812.04606
