BayesOD
BayesOD: A Bayesian Approach for Uncertainty Estimation in Deep Object Detectors
Version1 - https://arxiv.org/abs/1903.03838v1
Version2 - https://arxiv.org/abs/1903.03838v2
KeyPoints
Uncertainty measures, such as covariance matrices, are output for both bounding box regression and category classification.
MC dropout for Epistemic Uncertainty. Variance Regression for Aleatoric Uncertainty.
Bayesian inference is used to obtain the posterior of the bounding box, with the default anchor boxes as the prior and the bounding box regressor output as the likelihood.
NMS is replaced by Bayesian inference to get the final output.
Uncertainty
Types:
Epistemic: Epistemic uncertainty captures our ignorance about which models are most suitable to explain our data. It is the uncertainty in the model's parameters, usually resulting from confusion about which model generated the training data, and it can be explained away given enough representative training data points. Intuitively: when we train our network, somewhat different parameters may produce the same output or error, so there is a set of models that explain the data equally well. The uncertainty about which of these models is best for our task is the epistemic uncertainty.
Aleatoric: Uncertainty about the observation y caused by a noisy data set {x, y}. It results from the stochastic nature of the observed data and persists in the network output despite training on additional data. Intuitively, it is the inherent noise that is always present when performing an experiment and collecting data.
Methods to measure uncertainty in neural networks:
Monte Carlo Dropout: Parameters are stochastically sampled through Monte-Carlo (MC) Dropout. The output detections of multiple stochastic runs are then clustered, and the sufficient statistics of the state distributions for every object instance are estimated directly from the cluster members. The main advantage of this formulation lies in treating the underlying structure of the deep object detector as a black box, allowing it to be applied to various architectures with little effort (see the first sketch below this list).
Covariance Estimation: Another way to estimate the uncertainty in object detection results is to directly provide estimates for the covariance matrix of the bounding box state B. These sampling-free methods are usually faster than black-box methods, since a single run of the deep object detector suffices to estimate uncertainty (see the second sketch below this list).
Redundancy in the output of the deep object detector before NMS (only for object detection): This method is particular to object detection with deep learning. It exploits the redundancy in the output of the deep object detector before NMS to form spatially affiliated clusters of detections, from which sufficient statistics for both object state distributions can be estimated.
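A minimal sketch of the black-box MC Dropout procedure from the first method above, assuming a PyTorch detector whose dropout layers can be re-enabled at test time; the function names, the detector interface, and the (num_detections, 4) output format are illustrative assumptions, not the paper's code:

import torch

def mc_dropout_boxes(detector, image, num_runs=10):
    # `detector` is assumed to be any torch.nn.Module with dropout
    # layers that maps an image to a (num_detections, 4) box tensor.
    detector.eval()
    for m in detector.modules():
        # Re-enable only the dropout layers at test time, keeping
        # batch norm and the rest of the network in eval mode.
        if isinstance(m, torch.nn.Dropout):
            m.train()
    with torch.no_grad():
        return [detector(image) for _ in range(num_runs)]

def cluster_statistics(cluster_boxes):
    # cluster_boxes: (M, 4) tensor of spatially affiliated detections
    # gathered across the stochastic runs; returns the sample mean and
    # sample covariance as sufficient statistics of the box state.
    mean = cluster_boxes.mean(dim=0)
    centered = cluster_boxes - mean
    cov = centered.T @ centered / (cluster_boxes.shape[0] - 1)
    return mean, cov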
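For the covariance-estimation route, a common training choice (not necessarily the paper's exact loss) is the heteroscedastic Gaussian negative log-likelihood, with the network regressing a per-coordinate log-variance; a minimal sketch under those assumptions:

import torch

def nll_box_loss(pred_box, pred_log_var, gt_box):
    # Heteroscedastic Gaussian NLL for box regression; predicting the
    # log-variance s = log(sigma^2) keeps the loss numerically stable.
    # All tensors have shape (num_anchors, 4).
    squared_err = (pred_box - gt_box) ** 2
    loss = 0.5 * torch.exp(-pred_log_var) * squared_err + 0.5 * pred_log_var
    return loss.mean()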
Formulation of BayesOD
BayesOD uses RetinaNet as its object detection network. BayesOD makes significant changes to parts of the formulation, but the network architecture remains the same.
In the formulation below, the following naming convention is used:
Outputs from the neural network are denoted with a ˆ (hat) operator, and per-anchor variables are indexed with i. Variables not indexed with i represent an accumulation over several anchors.
Predicting uncertainty
Epistemic Uncertainty
To calculate this marginal distribution over the network parameters, MC Dropout sampling (described in the Monte Carlo Dropout point above) is used.
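In standard MC Dropout notation (T stochastic runs with sampled parameters θ̂_t drawn from the dropout approximating distribution q(θ); these symbols are assumed, not the paper's exact notation):

p(y_i \mid x, \mathcal{D}) = \int p(y_i \mid x, \theta)\, p(\theta \mid \mathcal{D})\, d\theta \approx \frac{1}{T} \sum_{t=1}^{T} p(y_i \mid x, \hat{\theta}_t), \qquad \hat{\theta}_t \sim q(\theta)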
Aleatoric Uncertainty
The aleatoric covariance matrix can then be constructed from the regressed output variances.
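A plausible reconstruction of this construction, assuming one regressed variance per box coordinate placed on a diagonal matrix (the indexing is assumed):

\hat{\Sigma}_i = \operatorname{diag}\left( \hat{\sigma}_{i,1}^{2},\; \hat{\sigma}_{i,2}^{2},\; \hat{\sigma}_{i,3}^{2},\; \hat{\sigma}_{i,4}^{2} \right)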
Note: No aleatoric uncertainty is estimated for the classification task. For the reasoning, see: https://arxiv.org/abs/1809.05590
Incorporating State Prior Distribution
Probabilistically combine the per-anchor prior information with the per-anchor network output to get the final estimate of the state.
Statistics for the posterior are calculated using the multivariate Gaussian conjugate update.
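Reconstructed from the standard conjugate result for a Gaussian likelihood with known covariance (symbols follow the naming convention above; the paper's notation may differ):

\Sigma_i = \left( \Sigma_0^{-1} + \hat{\Sigma}_i^{-1} \right)^{-1}, \qquad \mu_i = \Sigma_i \left( \Sigma_0^{-1} \mu_0 + \hat{\Sigma}_i^{-1} \hat{\mu}_i \right)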
A similar technique is applied to the categorical distribution, where a Dirichlet distribution is used as the prior; since the Dirichlet is conjugate to the categorical likelihood, the posterior is also Dirichlet.
Note: In the above, the prior mean µ0 is set to the initial anchor position, and Σ0 to a matrix with large diagonal entries. This is not a very informative prior; it corresponds to the standard anchor priors used in RetinaNet or Faster R-CNN. If you can find a more informative prior for the anchors, use it.
Bayesian Inference as a Replacement to NMS
First, clusters are formed, over which Bayesian inference is performed instead of NMS to get the final output. Per-anchor outputs from the neural network are clustered using spatial affinity.
where X = [xi | i = 1 . . . M] is the set of inputs of the M cluster members.
A major result of this subsection is that the two states of any object can easily be updated, given an additional measurement from a different component of the robotic system, using Bayes' theorem.
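A minimal sketch of the Gaussian fusion step that replaces greedy NMS, assuming each cluster member carries a mean and covariance from the preceding stages; the function and variable names are illustrative, not the paper's code:

import torch

def fuse_cluster(member_means, member_covs, prior_mean, prior_cov):
    # member_means: (M, 4) and member_covs: (M, 4, 4) from the M
    # cluster members; prior_mean/prior_cov encode the anchor prior.
    # Returns the fused Gaussian posterior over the box state via the
    # conjugate update applied across all cluster members.
    prior_prec = torch.linalg.inv(prior_cov)
    member_precs = torch.linalg.inv(member_covs)  # (M, 4, 4)
    post_cov = torch.linalg.inv(prior_prec + member_precs.sum(dim=0))
    weighted_sum = prior_prec @ prior_mean + torch.einsum(
        "mij,mj->i", member_precs, member_means)
    post_mean = post_cov @ weighted_sum
    return post_mean, post_cov

The categorical state of the cluster is fused analogously through the Dirichlet conjugate update over the members' classification outputs.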
Experiments and Results
Only the two classes pedestrian and car were considered.
Metrics
Average Precision (AP): a standard metric used to evaluate the performance of object detectors. Throughout this section, AP is evaluated separately for the two categories at an IOU of 0.5.
Minimum Uncertainty Error (MUE): used to determine the ability of an uncertainty measure to discriminate true positives from false positives, where a detection is a true positive if it has an IOU ≥ 0.5 with a same-category ground-truth bounding box. False positives in this case can include poorly localized detections, or false detections resulting from unknown unknowns. The uncertainty error UE(δ) at an uncertainty measure threshold δ is defined below; MUE is the best uncertainty error achievable by a detector at the best possible value of the threshold δ. The lowest MUE achievable by a detector is 0%.
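As commonly defined in this line of work (a reconstruction, not verbatim from the paper; u denotes the uncertainty measure of a detection), the uncertainty error averages the misclassified fractions of true and false positives:

\mathrm{UE}(\delta) = \frac{1}{2} \left( \frac{\lvert \{ \mathrm{TP} : u > \delta \} \rvert}{\lvert \mathrm{TP} \rvert} + \frac{\lvert \{ \mathrm{FP} : u \le \delta \} \rvert}{\lvert \mathrm{FP} \rvert} \right), \qquad \mathrm{MUE} = \min_{\delta} \mathrm{UE}(\delta)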
Gaussian MUE (GMUE) uses the entropy of the Gaussian distribution describing the state B as its uncertainty measure to be used to discriminate true positives from false positives. Similarly, Categorical MUE (CMUE) uses the entropy of the Categorical distribution describing the state S as its uncertainty measure.
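For reference, the standard closed forms of these entropies (d is the dimension of the box state B, Σ its covariance, and p_k the category probabilities of S):

H(B) = \frac{1}{2} \ln\left( (2\pi e)^{d} \, \lvert \Sigma \rvert \right), \qquad H(S) = -\sum_{k} p_k \ln p_k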
Results
BayesOD is seen to outperform all three comparison methods (Black Box, Redundancy, and Sampling Free) on all performance metrics.
Analysis compared to the other methods:
For a meaningful uncertainty measure, the entropy, and hence the uncertainty in both states of a true positive should be lower than those of a false positive.
For the Categorical entropy, all methods are shown to follow this intuitive trend to a certain extent.
For the Gaussian entropy, however, two of the three state-of-the-art methods, Redundancy and Black Box, show exactly the opposite behaviour: the mean Gaussian entropy of true positives is higher than that of false positives.
To hypothesise why such behaviour occurs, one should observe the mechanism employed by these two methods to estimate the final covariance matrix of the state B. Both use the clustered output of M stochastic runs to estimate a sample covariance matrix, the only difference being that Black Box clusters the output of NMS, whereas Redundancy clusters the per-anchor output before NMS. Both methods lack adequate cluster merging and explicit variance estimation, which reduces the discriminative power of their estimated uncertainty measure for the bounding box state B.
The first support for this hypothesis is that Sampling Free, a method that explicitly uses the per-anchor regressed covariance matrix, provides a 10.76% and 2.37% decrease in GMUE for the car and pedestrian categories, respectively, over the second-best result from Black Box and Redundancy.
Ablation Study
Pushing the variance of negative anchors to increase during training provides a slightly more discriminative uncertainty in the bounding box state B.
Explicit aleatoric covariance matrix estimation provides a slightly more discriminative uncertainty estimate of the bounding box state B.
The gains in performance on CMUE can be explained through the per-anchor marginalization over neural network parameters.
Greedy Non-Maximum Suppression is detrimental to the discriminative power of the uncertainty in the bounding box state B.