Uncertainty Quantification

Quantifying Uncertainty to Keep Astronauts and Warfighters Safe

Both NASA and DOT&E increasingly rely on computer models to supplement data collection, and utilize statistical distributions to quantify the uncertainty in models, so that decision-makers are equipped with the most accurate information about system performance and model fitness. This article provides a high-level overview of uncertainty quantification (UQ) through an example assessment for the reliability of a new space-suit system. The goal is to reach a more general audience in Significance Magazine, and convey the importance and relevance of statistics to the defense and aerospace communities....

Uncertainty Quantification for Ground Vehicle Vulnerability Simulation

A vulnerability assessment of a combat vehicle uses modeling and simulation (M&S) to predict the vehicle’s vulnerability to a given enemy attack. The system-level output of the M&S is the probability that the vehicle’s mobility is degraded as a result of the attack. The M&S models this system-level phenomenon by decoupling the attack scenario into a hierarchy of sub-systems. Each sub-system addresses a specific scientific problem, such as the fracture dynamics of an exploded munition, or the ballistic resistance provided by the vehicle’s armor....

Topological Modeling of Human-Machine Teams

A Human-Machine Team (HMT) is a group ofagents consisting of at least one human and at least one machine, all functioning collaboratively towards one or more common objectives. As industry and defense find more helpful, creative, and difficult applications of AI-driven technology, the need to effectively and accurately model, simulate, test, and evaluate HMTs will continue to grow and become even more essential. Going along with that growing need, new methods are required to evaluate whether a human-machine team is performing effectively as a team in testing and evaluation scenarios....

Introduction to Bayesian Analysis

As operational testing becomes increasingly integrated and research questions become more difficult to answer, IDA’s Test Science team has found Bayesian models to be powerful data analysis methods. Analysts and decision-makers should understand the differences between this approach and the conventional way of analyzing data. It is also important to recognize when an analysis could benefit from the inclusion of prior information—what we already know about a system’s performance—and to understand the proper way to incorporate that information....

Warhead Arena Analysis Advancements

Fragmentation analysis is a critical piece of the live fire test and evaluation (LFT&E) of the lethality and vulnerability aspects of warheads. But the traditional methods for data collection are expensive and laborious. New optical tracking technology is promising to increase the fidelity of fragmentation data, and decrease the time and costs associated with data collection. However, the new data will be complex, three-dimensional “fragmentation clouds,” possibly with a time component as well, and there will be a larger number of individual data points....

Circular Prediction Regions for Miss Distance Models under Heteroskedasticity

Circular prediction regions are used in ballistic testing to express the uncertainty in shot accuracy. We compare two modeling approaches for estimating circular prediction regions for the miss distance of a ballistic projectile. The miss distance response variable is bivariate normal and has a mean and variance that can change with one or more experimental factors. The first approach fits a heteroskedastic linear model using restricted maximum likelihood, and uses the Kenward-Roger statistic to estimate circular prediction regions....

Designing Experiments for Model Validation- The Foundations for Uncertainty Quantification

Advances in computational power have allowed both greater fidelity and more extensive use of such models. Numerous complex military systems have a corresponding model that simulates its performance in the field. In response, the DoD needs defensible practices for validating these models. Design of Experiments and statistical analysis techniques are the foundational building blocks for validating the use of computer models and quantifying uncertainty in that validation. Recent developments in uncertainty quantification have the potential to benefit the DoD in using modeling and simulation to inform operational evaluations....

Handbook on Statistical Design & Analysis Techniques for Modeling & Simulation Validation

This handbook focuses on methods for data-driven validation to supplement the vast existing literature for Verification, Validation, and Accreditation (VV&A) and the emerging references on uncertainty quantification (UQ). The goal of this handbook is to aid the test and evaluation (T&E) community in developing test strategies that support model validation (both external validation and parametric analysis) and statistical UQ. Suggested Citation Wojton, Heather, Kelly M Avery, Laura J Freeman, Samuel H Parry, Gregory S Whittier, Thomas H Johnson, and Andrew C Flack....

Statistics Boot Camp

In the test community, we frequently use statistics to extract meaning from data. These inferences may be drawn with respect to topics ranging from system performance to human factors. In this mini-tutorial, we will begin by discussing the use of descriptive and inferential statistics. We will continue by discussing commonly used parametric and nonparametric statistics within the defense community, ranging from comparisons of distributions to comparisons of means. We will conclude with a brief discussion of how to present your statistical findings graphically for maximum impact....

The Purpose of Mixed-Effects Models in Test and Evaluation

Mixed-effects models are the standard technique for analyzing data with grouping structure. In defense testing, these models are useful because they allow us to account for correlations between observations, a feature common in many operational tests. In this article, we describe the advantages of modeling data from a mixed-effects perspective and discuss an R package—ciTools—that equips the user with easy methods for presenting results from this type of model. Suggested Citation Haman, John, Matthew Avery, and Heather Wojton....

Improved Surface Gunnery Analysis with Continuous Data

Recasting gunfire data from binomial (hit/miss) to continuous (time-to-kill) allows us to draw statistical conclusions with tactical implications from free-play,live-fire surface gunnery events. Our analysis provided the Navy with suggestions forimprovements to its tactics and the employment of its weapons. A censored analysisenabled us to do so, where other methods fell short. Suggested Citation Ashwell, Benjamin A, V Bram Lillard, and George M Khoury. Improved Surface Gunnery Analysis with Continuous Data....

Scientific Test and Analysis Techniques

Abstract This document contains the technical content for the Scientific Test and Analysis Techniques (STAT) in Test and Evaluation (T&E) continuous learning module. The module provides a basic understanding of STAT in T&E. Topics coverec include design of experiments, observational studies, survey design and analysis, and statistical analysis. It is designed as a four hour online course, suitable for inclusion in the DAU T&E certification curriculum. Slides

Scientific Test and Analysis Techniques- Continuous Learning Module

This document contains the technical content for the Scientific Test and Analysis Techniques (STAT) in Test and Evaluation (T&E) continuous learning module. The module provides a basic understanding of STAT in T&E. Topics covered include design of experiments, observational studies, survey design and analysis, and statistical analysis. It is designed as a four hour online course, suitable for inclusion in the DAU T&E certification curriculum. Suggested Citation Pinelis, Yevgeniya, Laura J Freeman, Heather M Wojton, Denise J Edwards, Stephanie T Lane, and James R Simpson....

Prediction Uncertainty for Autocorrelated Lognormal Data with Random Effects

Accurately presenting model estimates with appropriate uncertainties is critical to the credibility and defensibility of anypiece of statistical analysis. When dealing with complex data that require hierarchical covariance structures, many of the standardapproaches for visualizing uncertainty are insufficient. One such case is data fit with log-linear autoregressive mixed effectsmodels. Data requiring such an approach have three exceptional characteristics.1. The data are sampled in “groups” that exhibit variation unexplained by other model factors....

Thinking About Data for Operational Test and Evaluation

While the human brain is powerful tool for quickly recognizing patterns in data, it will frequently make errors in interpreting random data. Luckily, these mistakes occur in systematic and predictable ways. Statistical models provide an analytical framework that helps us avoid these error-prone heuristics and draw accurate conclusions from random data. This non-technical presentation highlights some tricks of the trade learned by studying data and the way the human brain processes....

A First Step into the Bootstrap World

Bootstrapping is a powerful nonparametric tool for conducting statistical inference with many applications to data from operational testing. Bootstrapping is most useful when the population sampled from is unknown or complex or the sampling distribution of the desired statistic is difficult to derive. Careful use of bootstrapping can help address many challenges in analyzing operational test data. Suggested Citation Avery, Matthew R. A First Step into the Bootstrap World. IDA Document NS D-5816....

Bayesian Analysis in R/STAN

In an era of reduced budgets and limited testing, verifying that requirements have been met in a single test period can be challenging, particularly using traditional analysis methods that ignore all available information. The Bayesian paradigm is tailor made for these situations, allowing for the combination of multiple sources of data and resulting in more robust inference and uncertainty quantification. Consequently, Bayesian analyses are becoming increasingly popular in T&E. This tutorial briefly introduces the basic concepts of Bayesian Statistics, with implementation details illustrated in R through two case studies: reliability for the Core Mission functional area of the Littoral Combat Ship (LCS) and performance curves for a chemical detector in the Bio-chemical Detection System (BDS) with different agents and matrices....

Censored Data Analysis Methods for Performance Data- A Tutorial

Binomial metrics like probability-to-detect or probability-to-hit typically do not provide the maximum information from testing. Using continuous metrics such as time to detect provide more information, but do not account for non-detects. Censored data analysis allows us to account for both pieces of information simultaneously. Suggested Citation Lillard, V Bram. Censored Data Analysis Methods for Performance Data: A Tutorial. IDA Document NS D-5811. Alexandria, VA: Institute for Defense Analyses, 2016....

Estimating System Reliability from Heterogeneous Data

This briefing provides an example of some of the nuanced issues in reliability estimation in operational testing. The statistical models are motivated by an example of the Paladin Integrated Management (PIM). We demonstrate how to use a Bayesian approach to reliability estimation that uses data from all phases of testing. Suggested Citation Browning, Caleb, Laura Freeman, Alyson Wilson, Kassandra Fronczyk, and Rebecca Dickinson. “Estimating System Reliability from Heterogeneous Data.” Presented at the Conference on Applied Statistics in Defense, George Mason University, October 2015....

Censored Data Analysis- A Statistical Tool for Efficient and Information-Rich Testing

Binomial metrics like probability-to-detect or probability-to-hit typically provide operationally meaningful and easy to interpret test outcomes. However, they are information-poor metrics and extremely expensive to test. The standard power calculations to size a test employ hypothesis tests, which typically result in many tens to hundreds of runs. In addition to being expensive, the test is most likely inadequate for characterizing performance over a variety of conditions due to the inherently large statistical uncertainties associated with binomial metrics....

Hybrid Designs- Space Filling and Optimal Experimental Designs for Use in Studying Computer Simulation Models

This tutorial provides an overview of experimental design for modeling and simulation. Pros and cons of each design methodology are discussed. Suggested Citation Silvestrini, Rachel Johnson. “Hybrid Designs: Space Filling and Optimal Experimental Designs for Use in Studying Computer Simulation Models.” Monterey, California, May 2011. Slides: