Actuary CS1 B R Programming Solutions Part 3

Question 1(iii): Plot the Probability Density Function (PDF) of the sampling distribution of the sample mean, under the Central Limit Theorem (CLT), corresponding to the samples generated in part (i). The graph of the PDF should be superimposed on the histogram produced in the previous question.

To plot the Probability Density Function (PDF) of the sampling distribution of the sample mean, which is approximately normal under the Central Limit Theorem (CLT), and superimpose it on the histogram produced in the previous question, you can use the following code:

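# Load the previously generated samples.
# The code below expects `samples`, `num_samples` and `sample_size` to be available,
# either from this file or from the workspace of part (i).
load("exp_samples.RData")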

# Calculate the means of each group of samples
sample_means <- tapply(samples, rep(1:num_samples, each = sample_size), mean)

# Plot the histogram on the probability density scale
hist(sample_means, 
     main = "Histogram and PDF of Sample Means (Exp(3) Distribution)",
     xlab = "Sample Means",
     ylab = "Probability Density",
     prob = TRUE,  # Set prob = TRUE for probability density scale
     col = "skyblue",  # Color of the bars
     border = "black", # Color of the bar borders
     breaks = 30)  # Number of histogram bins

# Add the PDF curve (normal distribution) using CLT properties
mu <- 1 / 3  # Mean of the underlying distribution
sigma <- 1 / (3 * sqrt(10))  # Standard deviation under CLT
x <- seq(min(sample_means), max(sample_means), length.out = 100)  # Grid of x-values for the curve
pdf_curve <- dnorm(x, mean = mu, sd = sigma)
lines(x, pdf_curve, col = "red", lwd = 2)

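# Add a legend (filled box for the histogram, line for the CLT curve)
legend("topright", 
       legend = c("Empirical", "CLT PDF"), 
       fill = c("skyblue", NA), 
       border = c("black", NA), 
       col = c(NA, "red"), 
       lty = c(NA, 1), 
       lwd = c(NA, 2))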

Explanation of the code and solution:

  1. Load the Previously Generated Samples: We load the previously generated samples from the “exp_samples.RData” file, as we did in the previous question.
  2. Calculate Sample Means: We calculate the means of each group of samples, just as we did before.
  3. Plot the Histogram on Probability Density Scale: We use the hist function to plot the histogram of the sample means on the probability density scale, similar to the previous question. The prob = TRUE option (equivalent to freq = FALSE) makes the bar heights densities rather than counts, so the bar areas sum to 1 and the histogram is directly comparable with a PDF curve.
  4. Add the CLT PDF Curve:
    • We set the theoretical mean (mu) and standard deviation (sigma) of the sampling distribution of the sample mean. For an Exponential distribution with rate parameter 3, the population mean and standard deviation are both 1/3, so for a sample size of 10 the sample mean has mean 1/3 and standard deviation 1/(3*sqrt(10)), approximately 0.105. The CLT then says that the sample mean is approximately normally distributed with these parameters.
    • We create a sequence of x-values (x) that spans the range of sample means.
    • We calculate the PDF of the sampling distribution of the sample mean using dnorm with the mean and standard deviation parameters.
    • We use the lines function to superimpose the PDF curve on the histogram. The col parameter sets the color of the curve, and lwd sets the line width.
  5. Add a Legend: We add a legend to the plot to label the empirical histogram and the CLT PDF curve.
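
The values of mu and sigma used in step 4 follow from the moments of the exponential distribution. As a short worked derivation, writing \lambda = 3 for the rate and n = 10 for the sample size (symbols introduced here for illustration only):

```latex
\[
E[X_i] = \frac{1}{\lambda} = \frac{1}{3},
\qquad
\operatorname{Var}(X_i) = \frac{1}{\lambda^2} = \frac{1}{9}
\]
\[
E[\bar{X}] = E[X_i] = \frac{1}{3},
\qquad
\operatorname{Var}(\bar{X}) = \frac{\operatorname{Var}(X_i)}{n} = \frac{1}{9 \times 10} = \frac{1}{90}
\]
\[
\operatorname{sd}(\bar{X}) = \sqrt{\frac{1}{90}} = \frac{1}{3\sqrt{10}} \approx 0.1054
\]
```

The mean and standard deviation of the sample mean are exact for any sample size; the CLT only supplies the approximate normal shape, so with n = 10 some right skew in the histogram relative to the red curve is to be expected.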

In summary, this code generates a histogram of the sample means on the probability density scale and superimposes the PDF curve of the sampling distribution of the sample mean based on the Central Limit Theorem (CLT). This allows you to visually compare the empirical distribution of sample means with the theoretical normal distribution predicted by the CLT.
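
As a numerical complement to the visual comparison, the empirical mean and standard deviation of the sample means can be set against the CLT values. The following is a minimal, self-contained sketch rather than part of the exam solution: because exp_samples.RData is not available here, it simulates stand-in data, and the seed and the value num_samples = 1000 are assumptions chosen purely for illustration.

```r
# Self-contained check: simulated data stand in for exp_samples.RData.
set.seed(123)         # hypothetical seed, for reproducibility only
num_samples <- 1000   # assumed number of samples (not stated in this post)
sample_size <- 10     # sample size used in the CLT formula
rate <- 3             # rate parameter of the Exp(3) distribution

# One row per sample; the row means are the sample means
samples <- matrix(rexp(num_samples * sample_size, rate = rate),
                  nrow = num_samples, ncol = sample_size)
sample_means <- rowMeans(samples)

# Theoretical parameters of the sampling distribution under the CLT
mu <- 1 / rate
sigma <- 1 / (rate * sqrt(sample_size))

# Put empirical and theoretical values side by side
comparison <- data.frame(
  quantity    = c("mean", "sd"),
  empirical   = c(mean(sample_means), sd(sample_means)),
  theoretical = c(mu, sigma)
)
print(comparison)
```

If the simulation and the CLT parameters are consistent, the two columns should agree closely, with the gap shrinking as num_samples grows.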

Check out the answer to the previous question here.

Stay tuned for the rest of the answers to the IFOA Actuary CS1B exam.
