A Practitioner’s Framework for Federated Model Validation Resource Allocation

Recent advances in computation and statistics have led to increasing use of federated models for end-to-end system test and evaluation. A federated model is a collection of interconnected models in which the outputs of one model act as inputs to subsequent models. However, the process of verifying and validating federated models is poorly understood, especially when testers face limited resources, knowledge-based uncertainties, and concerns over operational realism. Testers often struggle to determine how best to allocate limited test resources for model validation....

2024 · Dhruv Patel, Jo Anna Capp, John Haman

A Preview of Functional Data Analysis for Modeling and Simulation Validation

Modeling and simulation (M&S) validation for operational testing often involves comparing live data with simulation outputs. The family of statistical methods known as functional data analysis (FDA) provides techniques for analyzing large data sets (“large” meaning that a single trial has a lot of information associated with it), such as radar tracks. We preview how FDA methods could assist M&S validation by providing statistical tools for handling these large data sets. This may facilitate analyses that make use of more of the available data and thus allow for better detection of differences between M&S predictions and live test results....
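As a minimal illustration of the functional-data viewpoint (our own sketch, not a method taken from the preview): treat each trial as a curve sampled on a common grid, then compare the mean function of the live curves with the mean function of the simulated curves.

```python
def pointwise_mean(curves):
    """Mean function of curves sampled on a common grid (one list per trial)."""
    n = len(curves)
    return [sum(c[t] for c in curves) / n for t in range(len(curves[0]))]

def max_mean_gap(live_curves, sim_curves):
    """Sup-norm distance between the live and simulated mean functions --
    a crude functional discrepancy measure (illustrative only)."""
    ml, ms = pointwise_mean(live_curves), pointwise_mean(sim_curves)
    return max(abs(a - b) for a, b in zip(ml, ms))

# Hypothetical tracks: two live trials vs. two simulated trials,
# with the simulation shifted upward by 0.5 at every time step.
live = [[0.0, 1.0, 2.0], [0.0, 1.0, 2.0]]
sim = [[0.5, 1.5, 2.5], [0.5, 1.5, 2.5]]
gap = max_mean_gap(live, sim)  # -> 0.5
```

Real FDA methods would smooth the curves into functional objects and apply functional ANOVA or depth-based tests; this sketch only conveys the idea of comparing whole curves rather than scalar summaries.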

2024 · Curtis Miller

Determining the Necessary Number of Runs in Computer Simulations with Binary Outcomes

How many success-or-failure observations should we collect from a computer simulation? Researchers often use space-filling experimental designs when planning modeling and simulation (M&S) studies. We are not satisfied with existing guidance on choosing the number of runs for these designs: it is either insufficiently justified, does not provide an unambiguous answer, or is not based on optimizing a statistical measure of merit. Analysts should use confidence interval margin of error as the statistical measure of merit for M&S studies intended to characterize overall M&S behavioral trends....
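The measure of merit the abstract names can be made concrete with the standard normal-approximation margin of error for a binomial proportion (a sketch of the general idea; the paper's exact procedure may differ):

```python
import math
from statistics import NormalDist

def moe(n, p=0.5, conf=0.95):
    """Normal-approximation confidence-interval margin of error for a
    binomial proportion estimated from n success-or-failure runs."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    return z * math.sqrt(p * (1 - p) / n)

def runs_for_moe(target, p=0.5, conf=0.95):
    """Smallest n whose margin of error meets the target; p=0.5 is the
    worst case, so this n suffices for any true proportion."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    return math.ceil(z ** 2 * p * (1 - p) / target ** 2)

n = runs_for_moe(0.05)  # -> 385 runs for +/- 5 points at 95% confidence
```

Running the calculation backwards, `moe(385)` is just under 0.05 while `moe(384)` is just over it, which is the sense in which 385 is the minimal run count for this target.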

2024 · Curtis Miller, Kelly Duffy

CDV Method for Validating AJEM using FUSL Test Data

M&S validation is critical for ensuring credible weapon system evaluations. System-level evaluations of Armored Fighting Vehicles (AFV) rely on the Advanced Joint Effectiveness Model (AJEM) and Full-Up System Level (FUSL) testing to assess AFV vulnerability. This report reviews and improves upon one of the primary methods that analysts use to validate AJEM, called the Component Damage Vector (CDV) Method. The CDV Method compares vehicle components that were damaged in FUSL testing to simulated representations of that damage from AJEM....

2023 · Thomas Johnson, Lindsey Butler, David Grimm, John Haman, Kerry Walzl

Implementing Fast Flexible Space-Filling Designs in R

Modeling and simulation (M&S) can be a useful tool when testers and evaluators need to augment the data collected during a test event. When planning M&S, testers use experimental design techniques to determine how much and which types of data to collect, and they can use space-filling designs to spread out test points across the operational space. Fast flexible space-filling designs (FFSFDs) are a type of space-filling design useful for M&S because they work well in design spaces with disallowed combinations and permit the inclusion of categorical factors....
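To show the basic idea of spreading points across an operational space, here is a plain random Latin hypercube sampler in Python. This is a simpler construction than JMP's FFSFD algorithm (which clusters candidate points and handles disallowed combinations and categorical factors); it is only a sketch of the space-filling concept.

```python
import random

def latin_hypercube(n, k, seed=0):
    """n-point Latin hypercube on [0, 1)^k: each factor's range is split
    into n equal strata and each stratum is sampled exactly once."""
    rng = random.Random(seed)
    cols = []
    for _ in range(k):
        perm = list(range(n))           # one stratum index per run
        rng.shuffle(perm)               # decouple the factors
        cols.append([(s + rng.random()) / n for s in perm])
    # Transpose columns into a list of k-dimensional design points.
    return [tuple(col[i] for col in cols) for i in range(n)]

pts = latin_hypercube(8, 2)  # 8 runs over 2 continuous factors
```

Each one-dimensional projection of the design hits every stratum once, which is what keeps the points from clumping in any single factor's range.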

2023 · Christopher Dimapasok

Statistical Methods Development Work for M&S Validation

We discuss four areas in which statistically rigorous methods contribute to modeling and simulation validation studies. These areas are statistical risk analysis, space-filling experimental designs, metamodel construction, and statistical validation. Taken together, these areas implement DOT&E guidance on model validation. In each area, IDA has contributed either research methods, user-friendly tools, or both. We point to our tools on testscience.org, and survey the research methods that we’ve contributed to the M&S validation literature...

2023 · Curtis Miller

Statistical Methods for M&S V&V- An Intro for Non-Statisticians

This is a briefing intended to motivate and explain the basic concepts of applying statistics to verification and validation. The briefing will be presented at the Navy M&S VV&A WG (Sub-WG on Validation Statistical Method Selection). Suggested Citation Pagan-Rivera, Keyla, John T Haman, Kelly M Avery, and Curtis G Miller. Statistical Methods for M&S V&V: An Intro for Non-Statisticians. IDA Product ID-3000770. Alexandria, VA: Institute for Defense Analyses, 2024....

2023 · John Haman, Kelly Avery, Curtis Miller

Metamodeling Techniques for Verification and Validation of Modeling and Simulation Data

Modeling and simulation (M&S) outputs help the Director, Operational Test and Evaluation (DOT&E) assess the effectiveness, survivability, lethality, and suitability of systems. To use M&S outputs, DOT&E needs models and simulators to be sufficiently verified and validated. The purpose of this paper is to improve the state of verification and validation by recommending and demonstrating a set of statistical techniques—metamodels, also called statistical emulators—to the M&S community. The paper expands on DOT&E’s existing guidance about metamodel usage by creating methodological recommendations the M&S community could apply to its activities....
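A metamodel (statistical emulator) is a cheap statistical stand-in for an expensive simulation, fitted to a set of simulation runs. A common choice is a Gaussian process; the pure-Python sketch below uses a zero-mean GP with a squared-exponential kernel and fixed, hand-picked hyperparameters. It illustrates the technique only and is not the paper's recommended implementation.

```python
import math

def rbf(a, b, ls=0.3):
    """Squared-exponential kernel with length scale ls (assumed, not tuned)."""
    return math.exp(-((a - b) ** 2) / (2 * ls ** 2))

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for j in range(c, n + 1):
                M[r][j] -= f * M[c][j]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][j] * x[j] for j in range(r + 1, n))) / M[r][r]
    return x

def gp_mean(x_new, X, y, noise=1e-8):
    """Posterior mean of a zero-mean GP emulator at x_new, conditioned on
    training runs (X, y); noise is a small jitter for numerical stability."""
    K = [[rbf(a, b) + (noise if i == j else 0.0)
          for j, b in enumerate(X)] for i, a in enumerate(X)]
    alpha = solve(K, y)
    return sum(rbf(x_new, xi) * ai for xi, ai in zip(X, alpha))

# Pretend these five runs came from an expensive simulation.
X = [0.0, 0.25, 0.5, 0.75, 1.0]
y = [math.sin(2 * math.pi * xi) for xi in X]
pred = gp_mean(0.25, X, y)  # ~ 1.0, interpolating the training run
```

With negligible noise the emulator interpolates the training runs, and between them it returns a smooth kernel-weighted prediction that can be queried thousands of times at essentially no cost.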

2022 · John Haman, Curtis Miller

What Statisticians Should Do to Improve M&S Validation Studies

It is often said that many research findings – from the social sciences, medicine, economics, and other disciplines – are false. This fact is trumpeted in the media and by many statisticians. There are several reasons that false research findings are published, but to what extent should we be worried about these problems in defense testing and modeling and simulation? In this talk I will present several recommendations for actions that statisticians and data scientists can take to improve the quality of our validations and evaluations....

2022 · John Haman

Space-Filling Designs for Modeling & Simulation

This document presents arguments and methods for using space-filling designs (SFDs) to plan modeling and simulation (M&S) data collection. Suggested Citation Avery, Kelly, John T Haman, Thomas Johnson, Curtis Miller, Dhruv Patel, and Han Yi. Test Design Challenges in Defense Testing. IDA Product ID 3002855. Alexandria, VA: Institute for Defense Analyses, 2024.

2021 · Han Yi, Curtis Miller, Kelly Avery

A Validation Case Study- The Environment Centric Weapons Analysis Facility (ECWAF)

Reliable modeling and simulation (M&S) allows the undersea warfare community to understand torpedo performance in scenarios that could never be created in live testing, and do so for a fraction of the cost of an in-water test. The Navy hopes to use the Environment Centric Weapons Analysis Facility (ECWAF), a hardware-in-the-loop simulation, to predict torpedo effectiveness and supplement live operational testing. In order to trust the model’s results, the T&E community has applied rigorous statistical design of experiments techniques to both live and simulation testing....

2020 · Elliot Bartis, Steven Rabinowitz

Designing Experiments for Model Validation- The Foundations for Uncertainty Quantification

Advances in computational power have allowed both greater fidelity and more extensive use of computational models. Numerous complex military systems have a corresponding model that simulates the system's performance in the field. In response, the DoD needs defensible practices for validating these models. Design of Experiments and statistical analysis techniques are the foundational building blocks for validating the use of computer models and quantifying uncertainty in that validation. Recent developments in uncertainty quantification have the potential to benefit the DoD in using modeling and simulation to inform operational evaluations....

2019 · Heather Wojton, Kelly Avery, Laura Freeman, Thomas Johnson

Handbook on Statistical Design & Analysis Techniques for Modeling & Simulation Validation

This handbook focuses on methods for data-driven validation to supplement the vast existing literature for Verification, Validation, and Accreditation (VV&A) and the emerging references on uncertainty quantification (UQ). The goal of this handbook is to aid the test and evaluation (T&E) community in developing test strategies that support model validation (both external validation and parametric analysis) and statistical UQ. Suggested Citation Wojton, Heather, Kelly M Avery, Laura J Freeman, Samuel H Parry, Gregory S Whittier, Thomas H Johnson, and Andrew C Flack....

2019 · Heather Wojton, Kelly Avery, Laura Freeman, Samuel Parry, Gregory Whittier, Thomas Johnson, Andrew Flack

M&S Validation for the Joint Air-to-Ground Missile

An operational test is resource-limited and must therefore rely on both live test data and modeling and simulation (M&S) data to inform a full evaluation. For the Joint Air-to-Ground Missile (JAGM) system, we needed a test design that accomplished dual goals: characterizing missile performance across the operational space and supporting rigorous validation of the M&S. A key question is which statistical techniques should be used to compare the M&S output to the live data....

2019 · Brent Crabtree, Andrew Cseko, Thomas Johnson, Joel Williamson, Kelly Avery

Comparing M&S Output to Live Test Data- A Missile System Case Study

In the operational testing of DoD weapons systems, modeling and simulation (M&S) is often used to supplement live test data in order to support a more complete and rigorous evaluation. Before the output of the M&S is included in reports to decision makers, it must first be thoroughly verified and validated to show that it adequately represents the real world for the purposes of the intended use. Part of the validation process should include a statistical comparison of live data to M&S output....
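One common choice for such a statistical comparison is the two-sample Kolmogorov-Smirnov test, sketched below in pure Python with made-up miss-distance samples. This illustrates the idea of comparing the two empirical distributions; it is not necessarily the test used in the case study.

```python
import bisect

def ks_statistic(live, sim):
    """Two-sample Kolmogorov-Smirnov statistic: the largest vertical gap
    between the empirical CDFs of the live data and the M&S output."""
    sl, ss = sorted(live), sorted(sim)
    n, m = len(sl), len(ss)
    return max(
        abs(bisect.bisect_right(sl, x) / n - bisect.bisect_right(ss, x) / m)
        for x in sl + ss
    )

# Illustrative (fabricated) miss-distance samples, not real test data.
live = [1.2, 0.8, 1.5, 1.1, 0.9, 1.3]
sim = [1.0, 1.1, 0.9, 1.4, 1.2, 0.8, 1.25, 1.05]
d = ks_statistic(live, sim)
# Asymptotic ~5% critical value for two unequal samples.
crit = 1.358 * ((len(live) + len(sim)) / (len(live) * len(sim))) ** 0.5
```

If `d` exceeds `crit`, the live and simulated distributions differ more than sampling noise alone would explain, which would be evidence against the M&S at this point in the operational space.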

2018 · Kelly Avery

Comparing Live Missile Fire and Simulation

Modeling and simulation is frequently used in Test and Evaluation (T&E) of air-to-air weapon systems to evaluate the effectiveness of a weapon. The Air Intercept Missile-9X (AIM-9X) program uses modeling and simulation extensively to evaluate missile miss distances. Since flight testing is expensive, the test program uses relatively few flight tests and supplements those data with large numbers of miss distances from simulated tests across the weapon's operational space. However, before modeling and simulation can be used to predict performance, it must first be validated....

2017 · Rebecca Medlin, Pamela Rambow, Douglas Peek

Best Practices for Statistically Validating Modeling and Simulation (M&S) Tools Used in Operational Testing

In many situations, collecting sufficient data to evaluate system performance against operationally realistic threats is not possible due to cost and resource restrictions, safety concerns, or a lack of adequate or representative threats. Modeling and simulation tools that have been verified, validated, and accredited can be used to supplement live testing in order to facilitate a more complete evaluation of performance. Two key questions frequently arise when planning an operational test: (1) which (and how many) points within the operational space should be chosen in the simulation space and the live space for optimal ability to verify and validate the M&S, and (2) once those data are collected, what is the best way to compare the live trials to the simulated trials for the purpose of validating the M&S?...

2015 · Kelly Avery, Laura Freeman, Rebecca Medlin

Validating the PRA Testbed Using a Statistically Rigorous Approach

For many systems, testing is expensive and only a few live test events are conducted. When this occurs, testers frequently use a model to extend the test results. However, testers must validate the model to show that it accurately represents the real world from the perspective of the model's intended uses. This poses a problem: when only a small number of live test events are conducted, only limited data are available to validate the model, and testers may struggle with model validation....

2015 · Rebecca Medlin, Dean Thomas

Hybrid Designs- Space Filling and Optimal Experimental Designs for Use in Studying Computer Simulation Models

This tutorial provides an overview of experimental design for modeling and simulation. Pros and cons of each design methodology are discussed. Suggested Citation Silvestrini, Rachel Johnson. “Hybrid Designs: Space Filling and Optimal Experimental Designs for Use in Studying Computer Simulation Models.” Monterey, California, May 2011.

2011 · Rachel Johnson Silvestrini