Anomaly Detection with Nonparametric Sequential Probability Ratio Tests (SPRT)
CALCE Team: V. Sotiris, S. Cheng, and M. Pecht
Objective:
Develop a general approach for anomaly detection in complex multivariate systems
Abstract:
The motivation for pursuing a nonparametric approach to detection is the insufficient detection accuracy of traditional linear pattern recognition algorithms. These traditional approaches assume that the underlying distribution of the data is Gaussian, or at least approximately Gaussian. The similarity measures that they compute are generally not useful for data that are highly non-Gaussian and covariate dependent. A general detection approach that is free of parametric constraints, whether Gaussian or another imposed distribution (lognormal, gamma, etc.), is anticipated to overcome these problems. Additionally, to analyze high-dimensional data efficiently, a feature extraction approach based on centroid clustering is used to pre-process the data. In the clustering approach, data similarity is computed from the Euclidean (or Mahalanobis) distances of observations to representative centroids of the training populations. A distribution of the training distances is obtained through Monte Carlo sampling of the training data. Using a nonparametric sequential probability ratio test (NPSPRT), test distances from new observations are compared to the derived training distribution to infer their classification and, in turn, the system health.
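The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the CALCE implementation: the abstract does not give the form of the NPSPRT, so the sketch substitutes a simple distribution-free stand-in, namely a Wald SPRT applied to indicators of whether each test distance exceeds an empirical quantile of the resampled training distances. The function names, the single-centroid simplification, and the parameters `q`, `p1`, `alpha`, `beta`, and `n_draws` are all assumptions introduced for illustration.

```python
import numpy as np


def mahalanobis_to_centroid(X, centroid, cov_inv):
    """Mahalanobis distance of each row of X to a single centroid."""
    diff = X - centroid
    sq = np.einsum("ij,jk,ik->i", diff, cov_inv, diff)
    return np.sqrt(np.maximum(sq, 0.0))  # guard against tiny negative rounding


def fit_training_distances(X_train, n_draws=2000, seed=0):
    """Assumed simplification: one centroid for one healthy training population.

    Returns the centroid, the (pseudo-)inverse covariance, and a Monte Carlo
    (bootstrap) sample of the training distances.
    """
    rng = np.random.default_rng(seed)
    centroid = X_train.mean(axis=0)
    cov_inv = np.linalg.pinv(np.cov(X_train, rowvar=False))
    d_train = mahalanobis_to_centroid(X_train, centroid, cov_inv)
    sampled = rng.choice(d_train, size=n_draws, replace=True)
    return centroid, cov_inv, sampled


def sequential_test(d_test, sampled, q=0.95, p1=0.5, alpha=0.01, beta=0.01):
    """Stand-in for the NPSPRT (assumption, not the authors' test).

    Each test distance is reduced to a Bernoulli indicator: does it exceed the
    q-quantile of the training distances? Under the healthy hypothesis the
    exceedance probability is p0 = 1 - q; under the anomalous hypothesis it is
    assumed to be p1. No distributional form is imposed on the distances.
    """
    threshold = np.quantile(sampled, q)
    p0 = 1.0 - q
    upper = np.log((1.0 - beta) / alpha)   # cross upward: declare anomalous
    lower = np.log(beta / (1.0 - alpha))   # cross downward: declare healthy
    llr = 0.0
    for n, d in enumerate(d_test, start=1):
        if d > threshold:
            llr += np.log(p1 / p0)
        else:
            llr += np.log((1.0 - p1) / (1.0 - p0))
        if llr >= upper:
            return "anomalous", n
        if llr <= lower:
            return "healthy", n
    return "undecided", len(d_test)
```

In use, `fit_training_distances` would be called once on healthy training data, and `sequential_test` would then be fed the distances of incoming observations computed with `mahalanobis_to_centroid`. With the default values assumed here, two exceedances in a row are enough to cross the upper boundary, since 2 ln(10) ≈ 4.61 is greater than ln(99) ≈ 4.60; a different choice of `q` or `p1` changes how quickly a decision is reached.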