Simulation Programming in R
Introduction to Simulation
Simulation involves creating a model of a real-world process or system and then running experiments with this model to understand its behavior under various conditions. In R, simulations are commonly used for statistical analysis, probability estimation, and system modeling.
Generating Random Data
Generating random data is fundamental to simulation. R provides a variety of functions to generate random numbers from different distributions.
Random Numbers from Uniform and Normal Distributions
Uniform Distribution:
# Generate random numbers from a uniform distribution uniform_random <- runif(n = 10, min = 0, max = 1) print(uniform_random) # [1] 0.6981704 0.4550072 0.4378290 0.5438281 0.2452994 0.3897755 0.2596513 0.5323843 0.1111341 0.4547830
Normal Distribution:
# Generate random numbers from a normal distribution normal_random <- rnorm(n = 10, mean = 0, sd = 1) print(normal_random) # [1] -1.2611759 -0.5261253 0.6746311 1.3793731 0.0356756 -0.2564200 -0.8215043 1.0598914 -0.2377863 -0.6647714
Random Sampling
You can also generate random samples from a specified dataset.
# Generate a random sample data <- c("A", "B", "C", "D", "E") sampled_data <- sample(data, size = 3, replace = TRUE) print(sampled_data) # [1] "C" "E" "C"
Simulating Queuing Systems
Queuing theory is used to model and analyze systems with queues, such as customer service lines.
Example: Simulating an M/M/1 Queue
# M/M/1 queue simulation lambda <- 2 # arrival rate mu <- 3 # service rate n_customers <- 100 arrival_times <- cumsum(rexp(n_customers, rate = lambda)) service_times <- rexp(n_customers, rate = mu) completion_times <- pmax(arrival_times, lag(arrival_times, default = 0)) + service_times waiting_times <- completion_times - arrival_times print(mean(waiting_times)) # Example output: 0.764
Reproducibility and Random Seed
For simulations to be reproducible, you need to set a random seed.
# Set a random seed for reproducibility set.seed(123) random_numbers <- rnorm(5) print(random_numbers) # [1] 1.707251 -0.501796 0.336766 -0.070542 0.682848
Visualization of Simulation Results
Visualizing results helps to understand and interpret the outcomes of simulations.
Example: Plotting Results
# Simulate a simple process data <- rnorm(1000) hist(data, breaks = 30, main = "Histogram of Simulated Data", xlab = "Value", ylab = "Frequency")
Advanced Topics
Sensitivity Analysis
Sensitivity analysis examines how different values of an input variable impact the output of a model.
Example: Sensitivity Analysis in a Simulation
# Define a function to simulate simulate <- function(param) { result <- rnorm(100, mean = param) return(mean(result)) } # Perform sensitivity analysis params <- seq(0, 10, by = 1) results <- sapply(params, simulate) plot(params, results, type = "b", main = "Sensitivity Analysis", xlab = "Parameter", ylab = "Result")
Advanced Statistical Simulations
For more complex simulations involving statistical models, you may use packages like simulate, MASS, and spatstat for specialized simulations.
Summary
- Random Data Generation: Use functions like runif(), rnorm(), and sample() for generating random data.
- Monte Carlo Simulations: Estimate values and model complex systems using random sampling.
- Complex Systems: Simulate Markov chains and queuing systems to model real-world processes.
- Reproducibility: Ensure your simulations are reproducible by setting a random seed with set.seed().
- Visualization: Use plots and histograms to visualize and interpret simulation results.
- Advanced Topics: Explore sensitivity analysis and advanced statistical simulations for deeper insights.