Question 1(iii): Plot the Probability Density Function (PDF) of the sampling distribution of the sample mean, under the Central Limit Theorem (CLT), corresponding to the samples generated in part (i). The graph of the PDF should be superimposed on the histogram produced in the previous question.
To plot the Probability Density Function (PDF) of the sampling distribution of the sample mean, which is approximately normal under the Central Limit Theorem (CLT), and superimpose it on the histogram produced in the previous question, you can use the following code:
# Load the previously generated samples
load("exp_samples.RData")
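# Calculate the mean of each group of samples
# (assumes num_samples and sample_size are restored by load() together with
# samples; if they are not, define them first, e.g. sample_size <- 10)
sample_means <- tapply(samples, rep(1:num_samples, each = sample_size), mean)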
# Plot the histogram on the probability density scale
hist(sample_means,
     main = "Histogram and PDF of Sample Means (Exp(3) Distribution)",
     xlab = "Sample Means",
     ylab = "Probability Density",
     prob = TRUE,       # Plot densities rather than counts
     col = "skyblue",   # Color of the bars
     border = "black",  # Color of the bar borders
     breaks = 30)       # Suggested number of histogram bins
# Add the PDF curve (normal distribution) using CLT properties
mu <- 1 / 3 # Mean of the Exp(3) distribution, and hence of the sample mean
sigma <- 1 / (3 * sqrt(10)) # Standard deviation of the sample mean: (1/3) / sqrt(10)
x <- seq(min(sample_means), max(sample_means), length.out = 100)
pdf_curve <- dnorm(x, mean = mu, sd = sigma)
lines(x, pdf_curve, col = "red", lwd = 2)
# Add a legend
legend("topright",
       legend = c("Empirical", "CLT PDF"),
       col = c("skyblue", "red"),
       lwd = c(1, 2))
Explanation of the code and solution:
- Load the Previously Generated Samples: We load the previously generated samples from the “exp_samples.RData” file, as we did in the previous question.
- Calculate Sample Means: We calculate the mean of each group of samples, just as we did before.
- Plot the Histogram on Probability Density Scale: We use the hist function to plot the histogram of the sample means on the probability density scale, as in the previous question. The prob = TRUE option rescales the bar heights to densities, so the total area of the bars equals 1 and the histogram can be compared directly with a PDF.
- Add the CLT PDF Curve:
  - We calculate the theoretical mean (mu) and standard deviation (sigma) of the sampling distribution of the sample mean under the CLT. An Exponential distribution with rate parameter 3 has mean 1/3 and standard deviation 1/3, so for a sample size of 10 the mean of the sampling distribution is 1/3 and its standard deviation is (1/3)/sqrt(10) = 1/(3*sqrt(10)), approximately 0.105.
  - We create a sequence of x-values (x) that spans the range of sample means.
  - We calculate the PDF of the sampling distribution of the sample mean using dnorm with the mean and standard deviation parameters.
  - We use the lines function to superimpose the PDF curve on the histogram. The col parameter sets the color of the curve, and lwd sets the line width.
- Add a Legend: We add a legend to the plot to label the empirical histogram and the CLT PDF curve.
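The values of mu and sigma used in the code follow from the standard moments of the exponential distribution. As a short derivation, write λ = 3 for the rate and n = 10 for the sample size (these symbols are introduced here only for illustration):

```latex
\begin{align*}
E[X_i] &= \frac{1}{\lambda} = \frac{1}{3},
  & \operatorname{Var}(X_i) &= \frac{1}{\lambda^{2}} = \frac{1}{9}, \\
E[\bar{X}] &= \frac{1}{\lambda} = \frac{1}{3},
  & \operatorname{Var}(\bar{X}) &= \frac{1}{n\lambda^{2}} = \frac{1}{90}, \\
\operatorname{sd}(\bar{X}) &= \frac{1}{\lambda\sqrt{n}} = \frac{1}{3\sqrt{10}} \approx 0.105,
  & \bar{X} &\overset{\text{approx.}}{\sim} N\!\left(\frac{1}{3},\ \frac{1}{90}\right).
\end{align*}
```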
In summary, this code generates a histogram of the sample means on the probability density scale and superimposes the PDF curve of the sampling distribution of the sample mean based on the Central Limit Theorem (CLT). This allows you to visually compare the empirical distribution of sample means with the theoretical normal distribution predicted by the CLT.
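As a rough numerical check of this comparison, the sketch below simulates the samples directly instead of loading exp_samples.RData. The number of samples (1000) and the seed are assumptions made purely for illustration; the rate of 3 and the sample size of 10 come from the question.

```r
# Hypothetical stand-in for exp_samples.RData: simulate the samples directly.
set.seed(123)            # assumed seed, for reproducibility only
num_samples <- 1000      # assumed value; not taken from the original data file
sample_size <- 10        # sample size stated in the question
rate <- 3                # rate parameter of the Exp(3) distribution

# One row per sample, one column per observation
samples <- matrix(rexp(num_samples * sample_size, rate = rate),
                  nrow = num_samples, ncol = sample_size)
sample_means <- rowMeans(samples)

# Theoretical CLT parameters for the sample mean
mu <- 1 / rate
sigma <- 1 / (rate * sqrt(sample_size))

# Empirical versus theoretical mean and standard deviation
comparison <- data.frame(
  quantity    = c("mean", "sd"),
  empirical   = c(mean(sample_means), sd(sample_means)),
  theoretical = c(mu, sigma)
)
print(comparison)

# On the density scale, bar height times bar width sums to 1 over all bars
h <- hist(sample_means, breaks = 30, plot = FALSE)
total_area <- sum(h$density * diff(h$breaks))
print(total_area)
```

The empirical mean and standard deviation should land close to mu and sigma. With only 10 observations per sample, the histogram will usually still look somewhat right-skewed next to the normal curve, because the CLT approximation improves as the sample size grows.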
Stay tuned for the rest of the answers from the IFOA Actuary CS1B exam.