A Groundswell for Test and Evaluation

The fundamental purpose of test and evaluation (T&E) in the Department of Defense (DOD) is to provide knowledge to answer critical questions that help decision makers manage the risk involved in developing, producing, operating, and sustaining systems and capabilities. At its core, T&E takes data and translates it into information for decision makers. Subject matter expertise of the platform and operational mission have always been critical components of developing defensible test and evaluation strategies....

2018 · Laura Freeman

Analysis of Split-Plot Reliability Experiments with Subsampling

Reliability experiments are important for determining which factors drive product reliability. The data collected in these experiments can be challenging to analyze. Often, the reliability or lifetime data collected follow distinctly nonnormal distributions and include censored observations. Additional challenges in the analysis arise when the experiment is executed with restrictions on randomization. The focus of this paper is on the proper analysis of reliability data collected from a nonrandomized reliability experiment....

2018 · Rebecca Medlin, Laura Freeman, Jennifer Kensler, Geoffrey Vining

Comparing M&S Output to Live Test Data - A Missile System Case Study

In the operational testing of DoD weapons systems, modeling and simulation (M&S) is often used to supplement live test data in order to support a more complete and rigorous evaluation. Before the output of the M&S is included in reports to decision makers, it must first be thoroughly verified and validated to show that it adequately represents the real world for the purposes of the intended use. Part of the validation process should include a statistical comparison of live data to M&S output....

2018 · Kelly Avery

Improved Surface Gunnery Analysis with Continuous Data

Recasting gunfire data from binomial (hit/miss) to continuous (time-to-kill) allows us to draw statistical conclusions with tactical implications from free-play, live-fire surface gunnery events. Our analysis provided the Navy with suggestions for improvements to its tactics and the employment of its weapons. A censored analysis enabled us to do so, where other methods fell short. Suggested Citation: Ashwell, Benjamin A, V Bram Lillard, and George M Khoury. Improved Surface Gunnery Analysis with Continuous Data....

2018 · Benjamin Ashwell, V. Bram Lillard

Informing the Warfighter—Why Statistical Methods Matter in Defense Testing

Suggested Citation: Freeman, Laura J., and Catherine Warner. “Informing the Warfighter—Why Statistical Methods Matter in Defense Testing.” CHANCE 31, no. 2 (April 3, 2018): 4–11. https://doi.org/10.1080/09332480.2018.1467627.

2018 · Laura Freeman, Catherine Warner

Introduction to Observational Studies

A presentation on the theory and practice of observational studies. Specific average treatment effect methods include matching, difference-in-difference estimators, and instrumental variables. Suggested Citation: Thomas, Dean, and Yevgeniya K Pinelis. Introduction to Observational Studies. IDA Document NS D-9020. Alexandria, VA: Institute for Defense Analyses, 2018.

2018 · Yevgeniya Pinelis

JEDIS Briefing and Tutorial

Are you sick of having to manually iterate your way through sizing your design of experiments? Come learn about JEDIS, the new IDA-developed JMP Add-In for automating design of experiments power calculations. JEDIS builds multiple test designs in JMP over user-specified ranges of sample sizes, Signal-to-Noise Ratios (SNR), and alpha (1 - confidence) levels. It then automatically calculates the statistical power to detect an effect due to each factor and any specified interactions for each design....

2018 · Jason Sheldon

Parametric Reliability Models Tutorial

This tutorial demonstrates how to plot reliability functions parametrically in R using the output from any reliability modeling software. It provides code and sample plots of reliability and failure rate functions with confidence intervals for three different skewed probability distributions: the exponential, the two-parameter Weibull, and the lognormal. These three distributions are the most common parametric models for reliability or survival analysis. This paper also provides mathematical background for the models and recommendations for when to use them....

2018 · William Whitledge

Power Approximations for Reliability Test Designs

Reliability tests determine which factors drive system reliability. Often, the reliability or failure time data collected in these tests tend to follow distinctly non-normal distributions and include censored observations. The experimental design should accommodate the skewed nature of the response and allow for censored observations, which occur when systems under test do not fail within the allotted test time. To account for these design and analysis considerations, Monte Carlo simulations are frequently used to evaluate experimental design properties....

2018 · Rebecca Medlin, Laura Freeman, Thomas Johnson

Reliability Best Practices and Lessons Learned in the Department of Defense

Despite the importance of acquiring reliable systems to support the warfighter, many military programs fail to meet reliability requirements, which affects the overall suitability and cost of the system. To determine ways to improve reliability outcomes in the future, research staff from the Institute for Defense Analyses Operational Evaluation Division compiled case studies identifying reliability lessons learned and best practices for several DOT&E oversight programs. The case studies provide program-specific information on strategies that worked well or did not work well to produce reliable systems....

2018 · Jon Bell, Jane Pinelis, Laura Freeman

Scientific Test and Analysis Techniques

This document contains the technical content for the Scientific Test and Analysis Techniques (STAT) in Test and Evaluation (T&E) continuous learning module. The module provides a basic understanding of STAT in T&E. Topics covered include design of experiments, observational studies, survey design and analysis, and statistical analysis. It is designed as a four-hour online course, suitable for inclusion in the DAU T&E certification curriculum.

2018 · Laura Freeman, Denise Edwards, Stephanie Lane, James Simpson, Heather Wojton

Scientific Test and Analysis Techniques - Continuous Learning Module

This document contains the technical content for the Scientific Test and Analysis Techniques (STAT) in Test and Evaluation (T&E) continuous learning module. The module provides a basic understanding of STAT in T&E. Topics covered include design of experiments, observational studies, survey design and analysis, and statistical analysis. It is designed as a four-hour online course, suitable for inclusion in the DAU T&E certification curriculum. Suggested Citation: Pinelis, Yevgeniya, Laura J Freeman, Heather M Wojton, Denise J Edwards, Stephanie T Lane, and James R Simpson....

2018 · Laura Freeman, Denise Edwards, Stephanie Lane, James Simpson, Heather Wojton

Testing Defense Systems

The complex, multifunctional nature of defense systems, along with the wide variety of system types, demands a structured but flexible analytical process for testing systems. This chapter summarizes commonly used techniques in defense system testing and specific challenges imposed by the nature of defense system testing. It highlights the core statistical methodologies that have proven useful in testing defense systems. Case studies illustrate the value of using statistical techniques in the design of tests and analysis of the resulting data....

2018 · Justace Clutter, Thomas Johnson, Matthew Avery, V. Bram Lillard, Laura Freeman

Vetting Custom Scales - Understanding Reliability, Validity, and Dimensionality

For situations in which an empirically vetted scale does not exist or is not suitable, a custom scale may be created. This document presents a comprehensive process for establishing the defensible use of a custom scale. At the highest level, this process encompasses (1) establishing validity of the scale, (2) establishing reliability of the scale, and (3) assessing dimensionality, whether intended or unintended, of the scale. First, the concept of validity is described, including how validity may be established using operators and subject matter experts....

2018 · Stephanie Lane