2016 | Test Science Research Document Library

A First Step into the Bootstrap World

Bootstrapping is a powerful nonparametric tool for conducting statistical inference with many applications to data from operational testing. Bootstrapping is most useful when the population sampled from is unknown or complex or the sampling distribution of the desired statistic is difficult to derive. Careful use of bootstrapping can help address many challenges in analyzing operational test data. Suggested Citation Avery, Matthew R. A First Step into the Bootstrap World. IDA Document NS D-5816....

Bayesian Analysis in R/STAN

In an era of reduced budgets and limited testing, verifying that requirements have been met in a single test period can be challenging, particularly using traditional analysis methods that ignore all available information. The Bayesian paradigm is tailor made for these situations, allowing for the combination of multiple sources of data and resulting in more robust inference and uncertainty quantification. Consequently, Bayesian analyses are becoming increasingly popular in T&E. This tutorial briefly introduces the basic concepts of Bayesian Statistics, with implementation details illustrated in R through two case studies: reliability for the Core Mission functional area of the Littoral Combat Ship (LCS) and performance curves for a chemical detector in the Bio-chemical Detection System (BDS) with different agents and matrices....

Bayesian Reliability- Combining Information

One of the most powerful features of Bayesian analyses is the ability to combine multiple sources of information in a principled way to perform inference. This feature can be particularly valuable in assessing the reliability of systems where testing is limited. At their most basic, Bayesian methods for reliability develop informative prior distributions using expert judgment or similar systems. Appropriate models allow the incorporation of many other sources of information, including historical data, information from similar systems, and computer models....

Censored Data Analysis Methods for Performance Data- A Tutorial

Binomial metrics like probability-to-detect or probability-to-hit typically do not provide the maximum information from testing. Using continuous metrics such as time to detect provide more information, but do not account for non-detects. Censored data analysis allows us to account for both pieces of information simultaneously. Suggested Citation Lillard, V Bram. Censored Data Analysis Methods for Performance Data: A Tutorial. IDA Document NS D-5811. Alexandria, VA: Institute for Defense Analyses, 2016....

DOT&E Reliability Course

This reliability course provides information to assist DOT&E action officers in their review and assessment of system reliability. Course briefings cover reliability planning and analysis activities that span the acquisition life cycle. Each briefing discusses review criteria relevant to DOT&E action officers based on DoD policies and lessons learned from previous oversight efforts. Suggested Citation Avery, Matthew, Jonathan Bell, Rebecca Medlin, and Freeman Laura. DOT&E Reliability Course. IDA Document NS D-5836....

Introduction to Survey Design

An important goal of test and evaluation is to understand not only how a system performs in its intended environment, but also users’ experiences operating the system. This briefing aimed to provide the audience with a set of tools – most notably, surveys – that are appropriate for measuring the user experience. DOT&E guidance regarding these tools is highlighted where appropriate. The briefing was broken into three major sections: conceptualizing surveys, writing survey items, and formatting surveys....

Regularization for Continuously Observed Ordinal Response Variables with Piecewise-Constant Functional Predictors

This paper investigates regularization for continuously observed covariates that resemble step functions. The motivating examples come from operational test data from a recent United States Department of Defense (DoD) test of the Shadow Unmanned Air Vehicle system. The response variable, quality of video provided by the Shadow to friendly ground units, was measured on an ordinal scale continuously over time. Functional covariates, altitude and distance, can be well approximated by step functions....

Rigorous Test and Evaluation for Defense, Aerospace, and National Security

In April 2016, NASA, DOT&E, and IDA collaborated on a workshopdesigned to strengthen the community around statistical approaches to test andevaluation in defense and aerospace. The workshop brought practitioners, analysts,technical leadership, and statistical academics together for a three day exchange ofinformation with opportunities to attend world renowned short courses, share commodchallenges, and learn new skill sets from a variety of tutorials. A highlight of theworkshop was the Tuesday afternoon technical leadership panel chaired by Dr....

Science of Test Workshop Proceedings, April 11-13, 2016

To mark IDA’s 60th anniversary, we are conducting a series of workshops and symposia that bring together IDA sponsors, researchers, experts inside and outside government, and other stakeholders to discuss issues of the day. These events focus on future national security challenges, reflecting on how past lessons and accomplishments help prepare us to deal with complex issues and environments we face going forward. This publication represents the proceedings of the Science of Test Workshop....

Tutorial on Sensitivity Testing in Live Fire Test and Evaluation

A sensitivity experiment is a special type of experimental design that is used when the response variable is binary and the covariate is continuous. Armor protection and projectile lethality tests often use sensitivity experiments to characterize a projectile’s probability of penetrating the armor. In this mini-tutorial we illustrate the challenge of modeling a binary response with a limited sample size, and show how sensitivity experiments can mitigate this problem. We review eight different single covariate sensitivity experiments and present a comparison of these designs using simulation....