How many success-or-failure observations should we collect from a computer simulation? Researchers often use space-filling designs of experiments when planning modeling and simulation (M&S) studies. We are not satisfied with existing guidance on choosing the number of runs for these designs because it is insufficiently justified, does not provide an unambiguous answer, or is not based on optimizing a statistical measure of merit. Analysts should use the confidence interval margin of error as the statistical measure of merit for M&S studies intended to characterize overall M&S behavioral trends. Unfortunately, for studies involving factors and success-or-failure (binary) outcomes analyzed with logistic regression, computing the margin of error requires knowing the model parameters. We explore how an upper bound on the margin of error, which requires less information about the statistical model to be estimated, can assist in sample size planning. While the upper bound needs further theoretical refinement, simulation studies suggest it may provide a means of justifying M&S study sample sizes with a statistical measure of merit.
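The idea of planning with an upper bound on the margin of error can be illustrated in the simplest setting: a single success probability with no factors. The sketch below is not the bound developed in this work. It is the textbook Wald interval for a binomial proportion, where the margin depends on the unknown probability p but is largest at p = 0.5, so the worst case can be used for planning without knowing p. The function names and the 95% confidence default are assumptions chosen for illustration.

```python
import math
from statistics import NormalDist


def wald_margin(p, n, confidence=0.95):
    """Wald margin of error for a binomial proportion (no factors).

    Depends on the true success probability p, which is unknown
    before the study is run.
    """
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return z * math.sqrt(p * (1 - p) / n)


def worst_case_margin(n, confidence=0.95):
    """Upper bound on the Wald margin: p * (1 - p) peaks at p = 0.5."""
    return wald_margin(0.5, n, confidence)


def runs_for_margin(target, confidence=0.95):
    """Smallest n whose worst-case Wald margin is at most `target`."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return math.ceil(z**2 * 0.25 / target**2)


if __name__ == "__main__":
    n = runs_for_margin(0.05)
    print(n, worst_case_margin(n))
```

With factors and a logistic regression model, the margin of error depends on the regression coefficients and the design rather than on a single p, so this one-parameter shortcut does not carry over directly.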

Suggested Citation

Duffy, Kelly, Curtis G Miller, and Rebecca Medlin. Sample Size Determination for Computer Simulations with Binary Outcomes. IDA Product 3002814. Alexandria, VA: Institute for Defense Analyses, 2024.

Slides: