# Monte Carlo Error Simulation

## Contents

Section 2 outlines some notation, defines MCE, and presents a simple example illustrating that MCE generally may be more substantial than traditionally thought.

Suppose that interest lies in the association between a binary exposure X and a binary outcome Y, and assume that the two are related via the logistic regression model

$$\operatorname{logit} P(Y = 1 \mid X) = \beta_0 + \beta_X X. \tag{2}$$

We conducted a

## Monte Carlo Standard Error

Although not shown, the central 95% mass of the Monte Carlo sampling distribution is between −3.3% and 5.1%. Practically, this result suggests that ensuring that the central 95% mass of the Monte Carlo sampling distribution for percent bias is within one unit of the overall underlying value of 0.9%

While the naive Monte Carlo works for simple examples, this is not the case in most problems. The most common choice was R = 1000 (74 articles); only 5 articles used a value of R > 10,000.

Table 2: Number of replications associated with simulation studies reported in regular articles

The ordinary 'dividing by two' strategy does not work in multiple dimensions, because the number of sub-volumes grows far too quickly to keep track of. The proposed BGP plot also provides a simple approach for determining the number of simulated data sets or replications needed to achieve a desired level of accuracy, and would be particularly

Monte Carlo integration, on the other hand, employs a non-deterministic approach: each realization provides a different outcome. Hierarchical Spatio-Temporal Mapping of Disease Rates.

## Monte Carlo Error Analysis

Although we provide more details later, here we note that of 223 regular articles that reported a simulation study, only 8 provided either a formal justification for the number of replications. Third, viewed as statistical or mathematical experiments, it could be argued that to aid in the interpretation of results, simulation studies always should be accompanied by some assessment of

Consequently, for a reader to fully understand and place into context results obtained via a simulation study, the results should be accompanied by some measure of associated uncertainty. To gauge the extent

Here we present a series of simple and practical methods for estimating Monte Carlo error as well as determining the number of replications required to achieve a desired level of accuracy. Clearly, the stratified sampling algorithm concentrates the points in the regions where the variation of the function is largest.

A more detailed description of the data was provided by Waller et al. (1997). Let $A_1$ be a binary indicator of whether or not an individual's age is between 65 and 74. Given the estimation of $I$ from $Q_N$, the error bars of $Q_N$ can be estimated by the sample variance using the unbiased estimate of the variance. Here we call this between-simulation variability Monte Carlo error (MCE).
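The error-bar statement above can be written out as a short worked derivation. The symbols $V$ (the volume of the integration region), $f$ (the integrand), $x_i$ (the sampled points), and $N$ (the number of points) do not appear in the text and are assumed here for illustration. With points drawn uniformly over the region, the estimate is

$$Q_N = V \, \frac{1}{N} \sum_{i=1}^{N} f(x_i) = V \, \langle f \rangle,$$

the unbiased estimate of the variance of $f$ is

$$\sigma_N^2 = \frac{1}{N-1} \sum_{i=1}^{N} \bigl( f(x_i) - \langle f \rangle \bigr)^2,$$

and, because the $N$ draws are independent,

$$\operatorname{Var}(Q_N) = \frac{V^2}{N^2} \sum_{i=1}^{N} \operatorname{Var}(f) \approx \frac{V^2 \sigma_N^2}{N}, \qquad \delta Q_N \approx V \, \frac{\sigma_N}{\sqrt{N}}.$$

Under these assumptions the error bar shrinks at the rate $1/\sqrt{N}$, so quadrupling the number of points roughly halves it.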

Based on these plots, Table 4 also provides the projected number of replications, $R^+$, required to reduce the percent bias MCE to 0.05 or 0.005 for each of the four 2.5th
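As a rough illustration of how such a projection can be made, the sketch below assumes only that MCE shrinks in proportion to $1/\sqrt{R}$, so that the projected number of replications is about $R \times (\mathrm{MCE}_R / \mathrm{MCE}_{\text{target}})^2$. This is not the plot-based procedure referred to above. The function name `projected_replications` is hypothetical, and the inputs (R = 10,000 with an MCE of 0.7) are borrowed from the percent bias figures quoted elsewhere in this text purely as an example.

```python
import math


def projected_replications(r_current, mce_current, mce_target):
    """Project the replications needed to reach mce_target.

    Assumes MCE is proportional to 1 / sqrt(R), which is only an
    approximation for a given simulation study.
    """
    if mce_target <= 0 or mce_current <= 0:
        raise ValueError("MCE values must be positive")
    return math.ceil(r_current * (mce_current / mce_target) ** 2)


# Illustrative inputs: R = 10,000 replications with an observed MCE of 0.7.
for target in (0.05, 0.005):
    r_plus = projected_replications(10_000, 0.7, target)
    print(f"target MCE {target}: about {r_plus:,} replications")
```

Under this scaling, each tenfold reduction in the target MCE costs roughly a hundredfold increase in replications.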

Each article was downloaded electronically, and a search was performed for any of the following terms: "bootstrap," "dataset," "Monte Carlo," "repetition," "replication," "sample," and "simulation."

## Quantification of Monte Carlo Error

For the example given in Section 2.2, Figure 1 illustrates a simple and effective diagnostic tool for monitoring the simulation as R increases. To obtain these, we sampled R = 1000 data sets with replacement from the case-control data and evaluated the MLEs using each data set.

Consider the following example, in which one would like to numerically integrate a Gaussian function, centered at 0 with σ = 1, from −1000 to 1000.
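A minimal sketch of this example, assuming uniform sampling over the integration interval and a normalized Gaussian density (so that the exact integral is essentially 1). The function names, the seed, the sample size of 20,000, and the narrower comparison interval [−10, 10] are illustrative choices, not part of the original example.

```python
import math
import random


def gaussian_pdf(x, sigma=1.0):
    """Density of a zero-mean Gaussian with standard deviation sigma."""
    return math.exp(-0.5 * (x / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def naive_mc_integral(f, a, b, n, rng):
    """Estimate the integral of f over [a, b] from n uniform draws.

    Returns (estimate, standard_error), where the standard error uses
    the unbiased sample variance of the function values.
    """
    width = b - a
    values = [f(rng.uniform(a, b)) for _ in range(n)]
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1)
    return width * mean, width * math.sqrt(var / n)


rng = random.Random(12345)  # fixed seed so the run is reproducible
wide_est, wide_se = naive_mc_integral(gaussian_pdf, -1000.0, 1000.0, 20_000, rng)
narrow_est, narrow_se = naive_mc_integral(gaussian_pdf, -10.0, 10.0, 20_000, rng)

print(f"[-1000, 1000]: {wide_est:.3f} +/- {wide_se:.3f}")
print(f"[-10, 10]:     {narrow_est:.3f} +/- {narrow_se:.3f}")
```

On the wide interval almost every draw lands where the integrand is effectively zero, so the error bar is much larger than on the narrow interval for the same number of points. This is the situation that importance and stratified sampling are meant to address.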

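An estimate of the MCE is then the standard deviation across the bootstrap statistics

$$\widehat{\mathrm{MCE}}_{\mathrm{boot}}(\hat{\varphi}_R, B) = \sqrt{\frac{1}{B} \sum_{b=1}^{B} \left( \hat{\varphi}_R(X_b^*) - \overline{\hat{\varphi}_R(X^*)} \right)^2}, \tag{9}$$

where

$$\overline{\hat{\varphi}_R(X^*)} = \frac{1}{B} \sum_{b=1}^{B} \hat{\varphi}_R(X_b^*).$$

Efron (1992) originally proposed the jackknife specifically to avoid a second level of replication, noting that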
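The bootstrap estimator in equation (9) can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the replicate estimates are stand-ins drawn from a normal distribution, the "true" value `log(2)` and spread 0.35 are hypothetical, and the names `percent_bias` and `bootstrap_mce` are invented for the example. The divisor is B, matching equation (9).

```python
import math
import random


def percent_bias(estimates, truth):
    """Percent bias of the mean of the replicate estimates relative to truth."""
    mean = sum(estimates) / len(estimates)
    return 100.0 * (mean - truth) / truth


def bootstrap_mce(replicates, statistic, n_boot, rng):
    """Bootstrap MCE: standard deviation of the statistic over B resamples.

    Each resample draws R values with replacement from the R replicates.
    """
    r = len(replicates)
    stats = []
    for _ in range(n_boot):
        resample = [replicates[rng.randrange(r)] for _ in range(r)]
        stats.append(statistic(resample))
    mean_stat = sum(stats) / n_boot
    return math.sqrt(sum((s - mean_stat) ** 2 for s in stats) / n_boot)


rng = random.Random(2024)
truth = math.log(2.0)  # hypothetical true parameter value
# Stand-in for R = 1000 estimates produced by a simulation study.
replicates = [rng.gauss(truth, 0.35) for _ in range(1000)]

mce = bootstrap_mce(replicates, lambda x: percent_bias(x, truth), n_boot=500, rng=rng)
# Reference value for this simple statistic: 100 * sd / (truth * sqrt(R)).
analytic = 100.0 * 0.35 / (truth * math.sqrt(len(replicates)))
print(f"bootstrap MCE of percent bias: {mce:.2f} (analytic reference {analytic:.2f})")
```

For a statistic as simple as percent bias the analytic value is available, which makes it a convenient check; the bootstrap becomes useful for statistics, such as percentiles or coverage, where no simple formula exists. The cost is B further evaluations of the statistic on R values each.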

The VEGAS algorithm approximates the exact distribution by making a number of passes over the integration region, creating a histogram of the function f. At R = 10,000, the minimum and maximum across the M simulations are −2.3% and 4.7%, with MCE decreasing to 0.7%. The direction is chosen by examining all d possible bisections and selecting the one which will minimize the combined variance of the two sub-regions.
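The bisection rule in the last sentence can be sketched as below. This is a simplified illustration that scores each candidate direction by the plain sum of the sample variances in the two halves, as the text describes; production implementations of recursive stratified sampling may combine the two variances differently. The function name `best_bisection_axis`, the probe count, and the test integrand are hypothetical.

```python
import random
import statistics


def best_bisection_axis(f, lower, upper, n_probe, rng):
    """Pick the axis whose midpoint split minimizes the combined variance.

    Probes the box [lower, upper] with n_probe uniform points, then for
    each of the d axes sums the sample variances of f in the two halves.
    Returns the index of the best axis, or None if no split is usable.
    """
    d = len(lower)
    points = [[rng.uniform(lower[k], upper[k]) for k in range(d)]
              for _ in range(n_probe)]
    values = [f(p) for p in points]

    best_axis, best_score = None, float("inf")
    for k in range(d):
        mid = 0.5 * (lower[k] + upper[k])
        left = [v for p, v in zip(points, values) if p[k] < mid]
        right = [v for p, v in zip(points, values) if p[k] >= mid]
        if len(left) < 2 or len(right) < 2:
            continue  # too few points to estimate a variance
        score = statistics.variance(left) + statistics.variance(right)
        if score < best_score:
            best_axis, best_score = k, score
    return best_axis


def step(p):
    """Integrand that varies only along axis 0 (a step at 0.5)."""
    return 1.0 if p[0] > 0.5 else 0.0


rng = random.Random(7)
axis = best_bisection_axis(step, [0.0, 0.0], [1.0, 1.0], 200, rng)
print(f"bisect along axis {axis}")
```

For this integrand, splitting along axis 0 leaves each half constant, so that direction has zero combined variance and is selected.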

These individual values and their error estimates are then combined upwards to give an overall result and an estimate of its error. For example, although the bootstrap-based estimator is applicable in a broad range of settings, the required second level of replication (denoted here by B) may quickly become computationally burdensome; thus guidance

