Jason Chen YCHEN148@e.ntu.edu.sg
Copyright Notice
This document is a compilation of original notes by Jason Chen (YCHEN148@e.ntu.edu.sg). Unauthorized reproduction, downloading, or use for commercial purposes is strictly prohibited and may result in legal action. If you have any questions or find any content-related errors, please contact the author at the provided email address. The author will promptly address any necessary corrections.
This is a discrete uniform distribution where each outcome in {1, 2, ..., k} occurs with the same probability 1/k.
1.1 Probability Mass Function (PMF)
The probability mass function (PMF) for the uniform distribution is given by:
P(X = x) = 1/k, for x = 1, 2, ..., k.
Explanation:
The PMF states that the probability of any specific outcome x is 1/k.
If x is not one of the k admissible values, then P(X = x) = 0.
Proof:
The uniform distribution assigns equal probability to each of the k outcomes; since the probabilities must sum to 1, each outcome receives probability 1/k: sum_{x=1}^{k} P(X = x) = k · (1/k) = 1.
1.2 Moment Generating Function (MGF)
The moment generating function (MGF) for the uniform distribution is given by:
M_X(t) = (1/k) sum_{x=1}^{k} e^{tx} = e^t (1 − e^{tk}) / (k(1 − e^t)), for t ≠ 0.
Explanation:
The MGF is used to find the moments of the distribution. The first moment (mean) and the second central moment (variance) can be derived from it.
Proof:
The MGF is defined as M_X(t) = E[e^{tX}] = sum_{x=1}^{k} e^{tx} P(X = x) = (1/k) sum_{x=1}^{k} e^{tx}, a finite geometric series that sums to the expression above.
1.3 Mean (Expected Value)
The mean (or expected value) of the discrete uniform distribution is E[X] = (k + 1)/2.
Explanation:
The mean of a uniform distribution is simply the average of all possible outcomes.
Proof:
The expected value is calculated as E[X] = sum_{x=1}^{k} x · (1/k) = (1/k) · k(k + 1)/2 = (k + 1)/2.
1.4 Variance
The variance of the discrete uniform distribution is Var(X) = (k^2 − 1)/12.
Explanation:
Variance measures the spread of the outcomes around the mean. It is the average of the squared differences from the mean.
Proof:
Variance is calculated as Var(X) = E[X^2] − (E[X])^2,
where E[X^2] = (1/k) sum_{x=1}^{k} x^2 = (k + 1)(2k + 1)/6, so Var(X) = (k + 1)(2k + 1)/6 − ((k + 1)/2)^2 = (k^2 − 1)/12.
1.5 Parameter
This indicates that the parameter k is a positive integer: k ∈ {1, 2, 3, ...}.
1.6 Example
Throw a fair die once: A classic example of a uniform distribution is rolling a fair six-sided die, where each face (1 through 6) has an equal probability of 1/6.
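As a quick sanity check, the following minimal sketch (my own illustration, using numpy with an arbitrary sample size) simulates fair die rolls and compares the empirical mean and variance with the theoretical values (k + 1)/2 and (k^2 − 1)/12.

import numpy as np

rng = np.random.default_rng(0)
k, N = 6, 100_000
rolls = rng.integers(1, k + 1, size=N)      # uniform on {1, ..., 6}

print("empirical mean    :", rolls.mean())   # should be close to 3.5
print("theoretical mean  :", (k + 1) / 2)
print("empirical variance:", rolls.var())    # should be close to 35/12 ≈ 2.9167
print("theoretical var   :", (k**2 - 1) / 12)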
2.1 Probability Mass Function (PMF)
The probability mass function (PMF) for the Bernoulli distribution is given by:
P(X = x) = p^x (1 − p)^(1 − x), for x ∈ {0, 1}.
Explanation:
When x = 1 (success), P(X = 1) = p.
When x = 0 (failure), P(X = 0) = 1 − p.
Proof:
The PMF for a Bernoulli random variable can be written as P(X = x) = p^x (1 − p)^(1 − x).
This expression works for both cases:
If x = 1, the expression reduces to p^1 (1 − p)^0 = p.
If x = 0, it reduces to p^0 (1 − p)^1 = 1 − p.
The sum of the probabilities for all possible outcomes must equal 1: p + (1 − p) = 1.
2.2 Moment Generating Function (MGF)
The moment generating function (MGF) for the Bernoulli distribution is given by:
M_X(t) = (1 − p) + p e^t.
Explanation:
The MGF helps us find the moments of the distribution, like the mean and variance.
Proof:
The MGF is defined as M_X(t) = E[e^{tX}] = e^{t·0}(1 − p) + e^{t·1} p = (1 − p) + p e^t.
2.3 Mean (Expected Value)
The mean (or expected value) of a Bernoulli random variable is E[X] = p.
Explanation:
The mean represents the probability of success in a Bernoulli trial.
Proof:
The expected value is calculated as E[X] = 0 · (1 − p) + 1 · p = p.
2.4 Variance
The variance of a Bernoulli random variable is Var(X) = p(1 − p).
Explanation:
The variance measures the spread of the outcomes around the mean. It reaches its maximum value of 1/4 when p = 1/2.
Proof:
Variance is calculated using the formula Var(X) = E[X^2] − (E[X])^2.
Since X takes only the values 0 and 1, X^2 = X, so E[X^2] = E[X] = p.
Thus, the variance becomes Var(X) = p − p^2 = p(1 − p).
2.5 Parameter
The parameter p (the probability of success) satisfies 0 ≤ p ≤ 1.
2.6 Example
Toss a coin once, and let X = 1 if it shows heads and X = 0 otherwise; then X is Bernoulli with parameter p = P(heads).
If the coin is fair, p = 1/2.
An indicator variable 1_A of an event A is Bernoulli distributed with parameter p = P(A).
The binomial distribution describes the number of successes in n independent Bernoulli trials, each with success probability p.
3.1 Probability Mass Function (PMF)
The probability mass function (PMF) for the binomial distribution is given by:
P(X = x) = C(n, x) p^x (1 − p)^(n − x), for x = 0, 1, ..., n.
Explanation:
C(n, x) = n!/(x!(n − x)!) counts the number of ways the x successes can be placed among the n trials.
Proof:
The PMF can be derived as the product of the probability of a specific sequence of x successes and n − x failures, p^x (1 − p)^(n − x), and the number of such sequences, C(n, x).
3.2 Moment Generating Function (MGF)
The moment generating function (MGF) for the binomial distribution is:
M_X(t) = (1 − p + p e^t)^n.
Explanation:
The MGF is a powerful tool for finding the moments of the distribution, such as the mean and variance.
Proof:
The MGF for a binomial distribution can be derived by recognizing that the binomial distribution is the sum of n independent Bernoulli(p) random variables, so M_X(t) = [(1 − p) + p e^t]^n.
3.3 Mean (Expected Value)
The mean (or expected value) is E[X] = np.
Explanation:
The mean represents the expected number of successes in n trials.
Proof:
The mean can be derived from the sum of the expected values of the n Bernoulli trials: E[X] = sum_{i=1}^{n} E[X_i] = np.
3.4 Variance
The variance is Var(X) = np(1 − p).
Explanation:
The variance measures the spread of the number of successes around the mean.
Proof: The variance of a sum of independent Bernoulli trials is the sum of the variances of the individual trials:
Var(X) = sum_{i=1}^{n} Var(X_i) = np(1 − p), where each X_i is Bernoulli(p) with Var(X_i) = p(1 − p).
3.5 Parameter
Parameters: n ∈ {1, 2, 3, ...} (number of trials) and 0 ≤ p ≤ 1 (probability of success in each trial).
3.6 Example
Number of heads when tossing a coin n times, where each toss shows heads with probability p.
Binomial Theorem: The formula (a + b)^n = sum_{x=0}^{n} C(n, x) a^x b^(n−x) shows that the binomial probabilities sum to 1, since sum_{x} C(n, x) p^x (1 − p)^(n − x) = (p + (1 − p))^n = 1.
Application in Finance: The binomial distribution is commonly used in security price modeling, assuming that the price can either rise or fall by a fixed amount during small intervals of time.
Let's break down the one-period binomial model for stock pricing. We'll work through the problem step by step to ensure we have the correct solution.
Problem Statement
We are dealing with a one-period stock market model with trading dates t = 0 and t = 1.
Purchase and (short-) selling of stocks:
Current price:
Future price
Fixed deposit or loan:
Interest rate:
Continuous return over time frame
This represents the capital development of a monetary unit.
Step-by-Step Solution
Analyzing the Trading Strategy
The future value of the trading strategy depends on whether the stock price moves up or down over the period.
If the stock price moves up (with probability p), the strategy earns the up-state payoff.
If the stock price moves down (with probability 1 − p), it earns the down-state payoff.
Risk-Neutral Valuation
In a one-period binomial model, the price of a derivative or a contingent claim can be computed using risk-neutral valuation. The risk-neutral probability q is chosen so that the discounted expected stock price equals the current price; with up and down factors u and d and risk-free rate r over the period, q = (e^{r} − d)/(u − d).
The expected value of the future price (or payoff) is taken under q: E_q[payoff] = q · (up-state payoff) + (1 − q) · (down-state payoff).
The present value is obtained by discounting this risk-neutral expectation at the risk-free rate.
Given the model inputs (current price, up and down factors, and the interest rate), simplifying this expression gives the initial capital required.
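A minimal numerical sketch of the risk-neutral valuation just described is given below. The inputs S0, u, d, r and the call payoff are illustrative placeholders, not the figures from the original slide.

import numpy as np

# Assumed illustrative inputs (not the slide's numbers)
S0, u, d = 100.0, 1.2, 0.8       # current price and up/down factors
r = 0.05                          # continuously compounded rate over one period

# Risk-neutral probability: discounted expected stock price equals S0
q = (np.exp(r) - d) / (u - d)

# Example contingent claim: a call with strike K (placeholder payoff)
K = 100.0
payoff_up, payoff_down = max(u * S0 - K, 0.0), max(d * S0 - K, 0.0)

# Present value = discounted risk-neutral expectation of the payoff
price = np.exp(-r) * (q * payoff_up + (1 - q) * payoff_down)
print("risk-neutral q     :", q)
print("claim value at t=0 :", price)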
Given that the number of upward movements in the stock price follows a binomial distribution, each period behaves like a Bernoulli trial:
With probability p the price moves up in a given period.
With probability 1 − p the price moves down.
Let's go through the transition from the binomial to the Poisson distribution step by step, providing detailed explanations and proofs for each component.
Understanding the Transition from Binomial to Poisson Distribution
Step 1: Start with the Binomial PMF
The binomial distribution for a random variable X with parameters n and p has PMF P(X = x) = C(n, x) p^x (1 − p)^(n − x).
Step 2: Transition to Poisson
Consider the case where n → ∞ and p → 0 while the product np = λ remains fixed.
We start by rewriting the binomial pmf using this condition:
The binomial coefficient C(n, x) = n!/(x!(n − x)!) can be written as n(n − 1)···(n − x + 1)/x!.
As n → ∞ with p = λ/n, the ratio n(n − 1)···(n − x + 1)/n^x → 1.
The probability of success term p^x becomes (λ/n)^x = λ^x / n^x.
The term (1 − p)^(n − x) = (1 − λ/n)^(n − x) → e^{−λ}, since (1 − λ/n)^n → e^{−λ} and (1 − λ/n)^{−x} → 1.
Thus, the binomial PMF transitions to the Poisson PMF as: P(X = x) → λ^x e^{−λ} / x!.
This is the Poisson distribution with parameter λ = np.
The step above relies on a well-known limit in calculus: lim_{n→∞} (1 + x/n)^n = e^x.
Explanation:
Intuition: This limit expresses the idea that as n grows, compounding a growth rate x over n ever-smaller steps approaches continuous exponential growth e^x.
Connection to Exponentials: This result shows how exponential growth can be approximated by a sequence of compounding steps. The expression (1 + x/n)^n is exactly the n-step compounding factor.
Applications: This limit is often used in proofs related to the convergence of binomial distributions to Poisson distributions, the derivation of compound interest formulas, and the calculation of certain types of sums and integrals.
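The binomial-to-Poisson convergence can be checked numerically. A minimal sketch (my own illustration; λ and the values of n are arbitrary) using scipy.stats:

import numpy as np
from scipy.stats import binom, poisson

lam = 3.0
x = np.arange(0, 15)
for n in (10, 100, 10_000):
    p = lam / n                                   # keep np = λ fixed
    max_diff = np.max(np.abs(binom.pmf(x, n, p) - poisson.pmf(x, lam)))
    print(f"n = {n:>6}: max |binomial pmf - Poisson pmf| = {max_diff:.5f}")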
4.1 Probability Mass Function (PMF):
P(X = x) = λ^x e^{−λ} / x!, for x = 0, 1, 2, ...
4.2 Moment Generating Function (MGF):
M_X(t) = exp(λ(e^t − 1)).
The MGF is defined as M_X(t) = E[e^{tX}] = sum_{x=0}^{∞} e^{tx} λ^x e^{−λ} / x!.
Simplifying using the series expansion of the exponential function: sum_{x} (λ e^t)^x / x! = e^{λ e^t}, so M_X(t) = e^{−λ} e^{λ e^t} = exp(λ(e^t − 1)).
4.3 Mean (Expected Value):
The expected value (mean) is E[X] = λ.
By differentiating the MGF and evaluating at t = 0: E[X] = M_X'(0) = λ.
4.4 Variance:
We can rewrite E[X^2] as E[X(X − 1)] + E[X].
The first term E[X(X − 1)] equals λ^2.
The second term E[X] equals λ.
Thus: Var(X) = E[X^2] − (E[X])^2 = (λ^2 + λ) − λ^2 = λ.
4.5 Parameter: λ > 0.
4.6 Example:
Number of phone calls coming into an exchange during a unit of time.
If calls arrive independently at an average rate of λ per unit time, the number of calls received in one unit of time follows a Poisson(λ) distribution.
Summation of Independent Poisson Random Variables
If X_1, X_2, ..., X_n are independent with X_i ~ Poisson(λ_i), then X_1 + X_2 + ... + X_n ~ Poisson(λ_1 + λ_2 + ... + λ_n).
This means the sum of independent Poisson-distributed variables is also Poisson-distributed, with the rate parameter being the sum of individual rate parameters.
Proof of the Summation of Independent Poisson Random Variables
Statement: If X_1, ..., X_n are independent with X_i ~ Poisson(λ_i), then their sum is Poisson(λ_1 + ... + λ_n).
Definition of Poisson Distribution:
A random variable X follows a Poisson distribution with parameter λ if P(X = x) = λ^x e^{−λ}/x!, x = 0, 1, 2, ...
2. Moment Generating Function (MGF):
The moment generating function (MGF) of a Poisson random variable X_i ~ Poisson(λ_i) is M_{X_i}(t) = exp(λ_i(e^t − 1)).
3. MGF of the Sum:
Let S = X_1 + ... + X_n. Because the X_i are independent, M_S(t) = prod_{i} M_{X_i}(t).
Substituting the MGF of each X_i: M_S(t) = prod_{i} exp(λ_i(e^t − 1)).
Simplifying the expression: M_S(t) = exp((λ_1 + ... + λ_n)(e^t − 1)), which is the MGF of a Poisson(λ_1 + ... + λ_n) random variable, proving the claim.
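A minimal simulation sketch of this result (the rates and sample size are arbitrary choices): the sum of independent Poisson draws should again have mean equal to variance equal to the sum of the rates.

import numpy as np

rng = np.random.default_rng(1)
lams = [2.0, 3.5, 1.5]                  # arbitrary rates
N = 200_000
S = sum(rng.poisson(lam, size=N) for lam in lams)

# Poisson(sum of rates) has mean = variance = 7.0 here
print("empirical mean    :", S.mean())
print("empirical variance:", S.var())
print("sum of rates      :", sum(lams))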
The negative binomial distribution describes the number of trials X needed to obtain the r-th success in a sequence of independent Bernoulli(p) trials.
5.1 Probability Mass Function (PMF)
The PMF for the negative binomial distribution is given by:
P(X = x) = C(x − 1, r − 1) p^r (1 − p)^(x − r), for x = r, r + 1, r + 2, ...
Derivation of the PMF:
Number of Failures: Before the r-th success occurs on trial x, there must be exactly r − 1 successes and x − r failures in the first x − 1 trials.
Success-Failure Pattern: The probability of having r successes and x − r failures in a specific order is p^r (1 − p)^(x − r).
The binomial coefficient C(x − 1, r − 1) counts the ways to arrange the r − 1 earlier successes among the first x − 1 trials (the last trial must be the r-th success).
5.2 Moment Generating Function (MGF)
The MGF is M_X(t) = [p e^t / (1 − (1 − p) e^t)]^r, valid for (1 − p) e^t < 1.
Derivation of the MGF:
The MGF is defined as M_X(t) = E[e^{tX}].
By summing over all possible values of x: M_X(t) = sum_{x=r}^{∞} e^{tx} C(x − 1, r − 1) p^r (1 − p)^(x − r).
This series is simplified using the sum of a geometric series, leading to the compact form M_X(t) = [p e^t / (1 − (1 − p) e^t)]^r.
5.3 Mean
The mean is E[X] = r/p.
Derivation of the Mean:
From the definition of expectation, or more simply by noting that X is the sum of r independent geometric(p) waiting times, each with mean 1/p, so E[X] = r/p.
5.4 Variance
The variance is Var(X) = r(1 − p)/p^2.
Derivation of the Variance: The variance is given by Var(X) = E[X^2] − (E[X])^2.
The second moment can be obtained from the MGF, or by summing the variances of the r independent geometric waiting times, each with variance (1 − p)/p^2, giving Var(X) = r(1 − p)/p^2.
5.5 Parameter Range: r ∈ {1, 2, 3, ...} and 0 < p ≤ 1.
5.6 Example:
Lottery: If a person must purchase tickets until they achieve the r-th winning ticket, and each ticket wins independently with probability p, then the total number of tickets purchased follows a negative binomial distribution.
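A minimal simulation sketch (r, p and the replication count are arbitrary): the number of trials to the r-th success is the sum of r independent geometric(p) waiting times, so its mean and variance should match r/p and r(1 − p)/p^2.

import numpy as np

rng = np.random.default_rng(2)
r, p, N = 3, 0.25, 100_000

# numpy's geometric() returns the trial number of the first success (support 1, 2, ...)
trials = rng.geometric(p, size=(N, r)).sum(axis=1)

print("empirical mean:", trials.mean(), " theory:", r / p)
print("empirical var :", trials.var(),  " theory:", r * (1 - p) / p**2)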
6.1 Probability Mass Function (PMF)
The PMF of the hypergeometric distribution is given by:
P(X = x) = C(M, x) C(N − M, n − x) / C(N, n), for max(0, n − (N − M)) ≤ x ≤ min(n, M),
where N is the population size, M is the number of black balls (successes), and n is the number of balls drawn without replacement.
Derivation of the PMF:
Combination Calculation: The number of ways to draw x black balls from the M black balls is C(M, x), and the number of ways to draw the remaining n − x balls from the N − M non-black balls is C(N − M, n − x).
Total Combinations: The total number of ways to draw n balls from the N balls is C(N, n).
PMF Expression: The probability of drawing exactly x black balls is the ratio of favourable to total combinations, giving the PMF above.
6.2 Moment Generating Function (MGF)
The hypergeometric distribution does not have an explicit closed-form moment generating function (MGF). This is because the lack of replacement complicates the analysis of the moments in a way that precludes a simple closed-form expression.
6.3 Mean
The mean is E[X] = n M / N.
Derivation of the Mean:
The mean is derived from the linearity of expectation. Since the hypergeometric distribution models the number of successes (black balls) drawn, the expected value is proportional to the fraction of the total balls that are black.
6.4 Variance
The variance is Var(X) = n (M/N)(1 − M/N)(N − n)/(N − 1).
Derivation of the Variance:
Variance accounts for both the proportion of successes and the finite size of the population.
The formula takes into consideration the fact that the draws are without replacement, which introduces negative dependence between the draws (i.e., drawing one black ball decreases the probability of drawing another black ball).
The variance formula is derived from the second moment and adjusted for the non-independence of the draws: Var(X) = n (M/N)(1 − M/N)(N − n)/(N − 1), where (N − n)/(N − 1) is the finite population correction.
6.5 Parameters: N ∈ {1, 2, ...} (population size), M ∈ {0, 1, ..., N} (number of black balls), and n ∈ {1, 2, ..., N} (number of draws).
6.6 Example:
Sampling Industrial Products: If you want to determine the number of defective items (black balls) in a sample drawn from a batch without replacement, the hypergeometric distribution models the probability of finding a specific number of defective items.
Relationship to Binomial Distribution
The hypergeometric distribution can be related to the binomial distribution as the population size becomes large:
Where p = M/N is the proportion of black balls in the population, the hypergeometric PMF approaches the Binomial(n, p) PMF.
This approximation holds under the condition that N is much larger than n, so that sampling without replacement behaves almost like sampling with replacement.
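A minimal numerical check of this approximation (my own illustration; the population sizes and the 20% success proportion are arbitrary) using scipy.stats:

import numpy as np
from scipy.stats import hypergeom, binom

n = 10                                    # number of draws
for N_pop in (50, 500, 50_000):           # growing population size
    M = N_pop // 5                        # 20% black balls, so p = M/N = 0.2
    x = np.arange(0, n + 1)
    hg = hypergeom.pmf(x, N_pop, M, n)    # args: total, successes, draws
    bn = binom.pmf(x, n, M / N_pop)
    print(f"N = {N_pop:>6}: max |hypergeom - binom| = {np.max(np.abs(hg - bn)):.5f}")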
1.1 Probability Density Function (PDF)
The PDF of the uniform distribution on the interval [a, b] is f(x) = 1/(b − a) for a ≤ x ≤ b, and 0 otherwise.
Derivation of the PDF:
Uniformity Condition: Since the distribution is uniform, every point within the interval [a, b] must have the same density, say f(x) = c.
Total Probability: The total area under the PDF curve must be 1, so c(b − a) = 1.
Solving for c gives c = 1/(b − a).
Thus, within the interval [a, b], f(x) = 1/(b − a).
1.2 Cumulative Distribution Function (CDF)
The CDF is F(x) = 0 for x < a, F(x) = (x − a)/(b − a) for a ≤ x ≤ b, and F(x) = 1 for x > b.
Derivation of the CDF:
Definition: The CDF is the probability that a random variable X is at most x: F(x) = P(X ≤ x) = ∫_{−∞}^{x} f(u) du.
For x < a, the integral is 0.
For a ≤ x ≤ b, F(x) = ∫_{a}^{x} 1/(b − a) du = (x − a)/(b − a).
For x > b, F(x) = 1.
1.3 Moment Generating Function (MGF)
The MGF is M_X(t) = (e^{tb} − e^{ta}) / (t(b − a)) for t ≠ 0, with M_X(0) = 1.
Derivation of the MGF:
Definition: The MGF is defined as M_X(t) = E[e^{tX}].
Applying the PDF: M_X(t) = ∫_{a}^{b} e^{tx} · 1/(b − a) dx.
Integration: ∫_{a}^{b} e^{tx} dx = (e^{tb} − e^{ta})/t.
Simplifying: M_X(t) = (e^{tb} − e^{ta}) / (t(b − a)).
1.4 Mean
The mean is E[X] = (a + b)/2.
Derivation of the Mean:
Definition: The mean is the expected value of X: E[X] = ∫_{a}^{b} x f(x) dx.
Applying the PDF: E[X] = ∫_{a}^{b} x/(b − a) dx.
Integration: ∫_{a}^{b} x dx = (b^2 − a^2)/2.
Simplifying: E[X] = (b^2 − a^2)/(2(b − a)) = (a + b)/2.
1.5 Variance
The variance is Var(X) = (b − a)^2 / 12.
Derivation of the Variance:
Definition: The variance is the second central moment: Var(X) = E[X^2] − (E[X])^2.
First, calculate E[X^2] = ∫_{a}^{b} x^2/(b − a) dx = (b^3 − a^3)/(3(b − a)).
Simplifying E[X^2] = (a^2 + ab + b^2)/3.
Subtracting the square of the mean: Var(X) = (a^2 + ab + b^2)/3 − ((a + b)/2)^2.
Final Simplification: Var(X) = (b − a)^2 / 12.
The exponential distribution is commonly used to model the time between events in a Poisson process, such as the time between arrivals in a queue or the time until a component fails.
2.1 Probability Density Function (PDF)
The PDF of the exponential distribution with rate λ > 0 is f(x) = λ e^{−λx} for x ≥ 0, and 0 for x < 0.
Derivation of the PDF:
Memorylessness Property: The exponential distribution is characterized by the memoryless property, meaning that the probability of an event occurring in the future is independent of the past.
Poisson Process: If events occur according to a Poisson process with rate λ, the waiting time T until the first event satisfies P(T > t) = P(no events in [0, t]) = e^{−λt}, so the density of T is λ e^{−λt}.
Normalization: The PDF must integrate to 1 over the range [0, ∞): ∫_{0}^{∞} λ e^{−λx} dx = 1.
2.2 Cumulative Distribution Function (CDF)
The CDF is F(x) = 1 − e^{−λx} for x ≥ 0, and 0 for x < 0.
Derivation of the CDF:
Definition: The CDF is the probability that the random variable X is at most x: F(x) = P(X ≤ x).
For x < 0, F(x) = 0.
For x ≥ 0, F(x) = ∫_{0}^{x} λ e^{−λu} du = 1 − e^{−λx}.
2.3 Moment Generating Function (MGF)
The MGF is M_X(t) = λ/(λ − t), for t < λ.
Derivation of the MGF:
Definition: The MGF is defined as M_X(t) = E[e^{tX}].
Applying the PDF: M_X(t) = ∫_{0}^{∞} e^{tx} λ e^{−λx} dx = λ ∫_{0}^{∞} e^{−(λ − t)x} dx.
Integration: for t < λ the integral converges to 1/(λ − t), so M_X(t) = λ/(λ − t).
2.4 Mean
The mean is E[X] = 1/λ.
Derivation of the Mean:
Definition: The mean is the expected value of X.
Applying the PDF: E[X] = ∫_{0}^{∞} x λ e^{−λx} dx.
Integration by Parts: use integration by parts with u = x and dv = λ e^{−λx} dx, which gives E[X] = 1/λ.
2.5 Variance
The variance is Var(X) = 1/λ^2.
Derivation of the Variance:
Variance Formula: The variance is given by Var(X) = E[X^2] − (E[X])^2.
Calculate E[X^2] = ∫_{0}^{∞} x^2 λ e^{−λx} dx.
Integration by Parts: apply integration by parts twice, or use the fact that E[X^2] = M_X''(0) = 2/λ^2.
Variance: Var(X) = 2/λ^2 − (1/λ)^2 = 1/λ^2.
2.6 Memorylessness Property
Mathematically: P(X > s + t | X > s) = P(X > t), for all s, t ≥ 0.
Explanation:
This property implies that the exponential distribution "forgets" how much time has already passed. For example, if a light bulb's lifetime follows an exponential distribution, the probability that it lasts another hour is the same regardless of how long it has already been on.
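A minimal numerical check of memorylessness (my own illustration; the rate λ and the time points s, t are arbitrary):

import numpy as np

rng = np.random.default_rng(3)
lam, s, t = 1.5, 0.7, 1.2
X = rng.exponential(scale=1 / lam, size=1_000_000)

cond = (X > s + t).sum() / (X > s).sum()     # estimate of P(X > s + t | X > s)
print("P(X > s + t | X > s) ~", cond)
print("P(X > t)             ~", (X > t).mean())
print("exact e^{-lambda t}   ", np.exp(-lam * t))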
2.7 Relationship between Exponential and Poisson Distributions
The exponential distribution is closely related to the Poisson distribution.
Setup:
Let events occur over time according to a Poisson process with rate λ.
Define N(t) as the number of events occurring in the interval [0, t].
Let the inter-arrival times (the times between consecutive events) be independent Exponential(λ) random variables.
Key Result:
This result indicates that the number of events occurring in a given interval of length t follows a Poisson distribution with mean λt: N(t) ~ Poisson(λt).
Reverse Relationship:
If the number of events in any interval of length t is Poisson(λt), then the waiting time between consecutive events is Exponential(λ).
Interpretation:
The Poisson distribution models the number of events in a fixed time interval, while the exponential distribution models the time between those events.
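This relationship can be verified by simulation. A minimal sketch (my own illustration; λ, the window length t, and the number of replications are arbitrary): generate exponential inter-arrival times, count arrivals in [0, t], and check that the counts have mean and variance close to λt.

import numpy as np

rng = np.random.default_rng(4)
lam, t, N = 2.0, 5.0, 20_000

counts = np.empty(N, dtype=int)
for i in range(N):
    # enough Exponential(lam) gaps to almost surely cover [0, t]
    gaps = rng.exponential(scale=1 / lam, size=int(5 * lam * t) + 50)
    counts[i] = np.searchsorted(np.cumsum(gaps), t)   # arrivals at or before t

print("empirical mean of N(t):", counts.mean(), " theory lam*t:", lam * t)
print("empirical var  of N(t):", counts.var(),  " theory lam*t:", lam * t)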
2.8 Interpreting λ
The parameter λ is the rate at which events occur per unit of time.
Exponential Distribution
Mean Time Between Events: The mean or expected time between events is given by 1/λ.
Poisson Distribution
In the Poisson distribution, λ (or λt for an interval of length t) is the expected number of events.
Mean Number of Events: For a time interval of length t, the mean number of events is λt.
Intuitive Example:
If events occur at rate λ per hour, then on average λ events occur each hour.
The time between events, on average, is 1/λ hours.
3.1 Probability Density Function (PDF)
The PDF of the gamma distribution with shape α > 0 and rate λ > 0 is f(x) = (λ^α / Γ(α)) x^{α − 1} e^{−λx}, for x > 0.
Gamma Function
The gamma function is defined by Γ(α) = ∫_{0}^{∞} u^{α − 1} e^{−u} du.
For a positive integer n, Γ(n) = (n − 1)!.
For α > 0, Γ(α + 1) = α Γ(α).
Derivation of the PDF:
Generalization of Exponential Distribution: The gamma distribution generalizes the exponential distribution, which is a special case when α = 1.
Shape and Scale: The parameter α controls the shape of the distribution, while λ is the rate (equivalently, 1/λ is the scale).
3.2 Cumulative Distribution Function (CDF)
The CDF of the gamma distribution does not have a simple closed form like the exponential distribution. However, it can be expressed in terms of the incomplete gamma function:
Where γ(α, λx) = ∫_{0}^{λx} u^{α − 1} e^{−u} du is the lower incomplete gamma function, F(x) = γ(α, λx)/Γ(α).
3.3 Moment Generating Function (MGF)
The MGF is M_X(t) = (λ/(λ − t))^α = (1 − t/λ)^{−α}, for t < λ.
Derivation of the MGF:
Definition: The MGF is defined as M_X(t) = E[e^{tX}].
Substitute the PDF: M_X(t) = (λ^α / Γ(α)) ∫_{0}^{∞} x^{α − 1} e^{−(λ − t)x} dx.
Recognize the Gamma Function Form: for t < λ, the integral equals Γ(α)/(λ − t)^α, so M_X(t) = (λ/(λ − t))^α.
3.4 Mean
The mean is E[X] = α/λ.
Derivation of the Mean:
Definition: The mean is the expected value of X: E[X] = ∫_{0}^{∞} x f(x) dx.
Using the PDF: E[X] = (λ^α / Γ(α)) ∫_{0}^{∞} x^{α} e^{−λx} dx.
The integral now represents Γ(α + 1)/λ^{α + 1}, so E[X] = Γ(α + 1)/(Γ(α) λ) = α/λ.
3.5 Variance
The variance is Var(X) = α/λ^2.
Derivation of the Variance:
Variance Formula: The variance is calculated as Var(X) = E[X^2] − (E[X])^2.
Calculate E[X^2] = (λ^α / Γ(α)) ∫_{0}^{∞} x^{α + 1} e^{−λx} dx = Γ(α + 2)/(Γ(α) λ^2) = α(α + 1)/λ^2.
Variance: Var(X) = α(α + 1)/λ^2 − (α/λ)^2 = α/λ^2.
3.6 Parameters: α > 0 (shape) and λ > 0 (rate).
3.7 Example:
Used to model the default rate of credit portfolios in risk management. The gamma distribution is highly flexible and can model a variety of shapes by adjusting the parameters α and λ.
3.8 Property
We are dealing with the sum of multiple independent and identically distributed (i.i.d.) normal random variables. We want to confirm that the sum of squares of these normal variables follows a Gamma distribution. Specifically, if we have n i.i.d. standard normal variables Z_1, ..., Z_n, then Z_1^2 + ... + Z_n^2 ~ χ^2_n, which is a Gamma(α = n/2, λ = 1/2) distribution.
Basic Properties of Normal Distribution
Let Z ~ N(0, 1). Then Z^2 follows a chi-square distribution with 1 degree of freedom: Z^2 ~ χ^2_1 = Gamma(1/2, 1/2).
For a normal random variable with a non-zero mean, we first standardize: if X ~ N(μ, σ^2), then (X − μ)/σ ~ N(0, 1), so ((X − μ)/σ)^2 ~ χ^2_1.
Step 3: Sum of Squared Normals as Gamma Distribution
We know that the sum of independent chi-square random variables each with 1 degree of freedom follows a chi-square distribution with degrees of freedom equal to the number of variables: sum_{i=1}^{n} Z_i^2 ~ χ^2_n.
If each Z_i^2 ~ χ^2_1 and the Z_i are independent, the sum has n degrees of freedom.
The chi-square distribution with n degrees of freedom is the Gamma distribution with shape α = n/2 and rate λ = 1/2.
Hence, if Z_1, ..., Z_n are i.i.d. N(0, 1), then sum_{i=1}^{n} Z_i^2 ~ Gamma(n/2, 1/2).
Step 4: Moment Generating Function (mgf) Approach
The moment generating function (MGF) of a chi-square distribution with k degrees of freedom is M(t) = (1 − 2t)^{−k/2}, for t < 1/2.
This matches the MGF of a Gamma distribution with shape α = k/2 and rate λ = 1/2, since (1 − t/λ)^{−α} = (1 − 2t)^{−k/2}.
Step 5: Summing Up
When we sum up the squared i.i.d. standard normal variables, the MGF of the sum is the product of the individual MGFs, (1 − 2t)^{−n/2}, which is exactly the MGF of Gamma(n/2, 1/2).
This shows that the sum of squares of multiple i.i.d. normal variables indeed follows a Gamma distribution.
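A minimal simulation sketch of this property (my own illustration; n and the number of replications are arbitrary), comparing the empirical CDF of the sum of squares with the Gamma(n/2, rate 1/2) and χ²_n CDFs from scipy.stats:

import numpy as np
from scipy.stats import gamma, chi2

rng = np.random.default_rng(5)
n, N = 5, 200_000
Z = rng.normal(size=(N, n))
S = (Z**2).sum(axis=1)                    # sum of n squared standard normals

for x in (1.0, 4.0, 9.0):
    emp = (S <= x).mean()
    print(f"x = {x}: empirical {emp:.4f}, "
          f"Gamma cdf {gamma.cdf(x, a=n/2, scale=2):.4f}, "
          f"chi2 cdf {chi2.cdf(x, df=n):.4f}")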
3.9 Property Chi-Square Distribution of Sample Variance
To prove that (n − 1)S^2/σ^2 ~ χ^2_{n−1}, proceed as follows.
Step 1: Start with the Definition of Sample Variance
Given a random sample X_1, ..., X_n from N(μ, σ^2), the sample variance is S^2 = (1/(n − 1)) sum_{i=1}^{n} (X_i − X̄)^2.
Step 2: Express the Sum of Squared Deviations
We can express the sum of squared deviations from the mean as: sum_{i} (X_i − μ)^2 = sum_{i} (X_i − X̄)^2 + n(X̄ − μ)^2.
This identity holds because the cross term 2(X̄ − μ) sum_{i} (X_i − X̄) vanishes, since the deviations from the sample mean sum to zero.
Step 3: Use the Distribution of the Sample Mean
Since X̄ is the mean of n i.i.d. N(μ, σ^2) variables, X̄ ~ N(μ, σ^2/n).
We know that n(X̄ − μ)^2/σ^2 ~ χ^2_1, because √n(X̄ − μ)/σ ~ N(0, 1).
Step 4: Simplify the Expression
Using the fact that sum_{i} (X_i − μ)^2/σ^2 ~ χ^2_n and that X̄ is independent of S^2, dividing the identity above by σ^2 gives a χ^2_n variable written as the sum of (n − 1)S^2/σ^2 and an independent χ^2_1 variable.
This simplifies (by comparing MGFs of the two independent terms) to (n − 1)S^2/σ^2 ~ χ^2_{n−1}.
The normal distribution is one of the most important distributions in probability and statistics, often used to model real-world phenomena due to the Central Limit Theorem.
4.1 Probability Density Function (PDF)
The PDF of the normal distribution N(μ, σ^2) is f(x) = (1/(σ√(2π))) exp(−(x − μ)^2/(2σ^2)), for −∞ < x < ∞.
Derivation of the PDF:
Normalization: The normal distribution's PDF must integrate to 1 over the entire real line: ∫_{−∞}^{∞} f(x) dx = 1, which follows from the Gaussian integral ∫ e^{−u^2/2} du = √(2π).
Bell Shape: The function is symmetric about μ and decays rapidly in the squared distance from μ, giving the familiar bell shape.
4.2 Cumulative Distribution Function (CDF)
The CDF is F(x) = P(X ≤ x) = ∫_{−∞}^{x} f(u) du; it has no closed form and is expressed through the standard normal CDF Φ.
Properties of the CDF:
Standard Normal CDF: For Z ~ N(0, 1), the CDF is denoted Φ(z) = P(Z ≤ z).
Transformation: For X ~ N(μ, σ^2), F(x) = Φ((x − μ)/σ).
4.3 Moment Generating Function (MGF)
The MGF is M_X(t) = exp(μt + σ^2 t^2 / 2).
Derivation of the MGF:
Definition: The MGF is defined as M_X(t) = E[e^{tX}].
Substituting the PDF: M_X(t) = ∫ e^{tx} (1/(σ√(2π))) exp(−(x − μ)^2/(2σ^2)) dx.
Completing the Square: combine the exponentials and complete the square in x; the remaining integral is a normal density and integrates to 1, leaving M_X(t) = exp(μt + σ^2 t^2 / 2).
4.4 Mean
The mean is E[X] = μ.
Derivation of the Mean:
Symmetry: The normal distribution is symmetric around μ, so the expected value equals μ.
Integration: Direct integration of x f(x), or differentiating the MGF at t = 0, confirms E[X] = μ.
4.5 Variance
The variance is Var(X) = σ^2.
Derivation of the Variance:
Variance Formula: The variance is calculated as Var(X) = E[X^2] − (E[X])^2.
Calculate E[X^2] = M_X''(0) = μ^2 + σ^2.
Variance: Subtracting the square of the mean gives Var(X) = μ^2 + σ^2 − μ^2 = σ^2.
4.6 Important Notes
Bell-Shaped PDF: The normal distribution is bell-shaped and symmetric around its mean μ.
Linear Transformation: If X ~ N(μ, σ^2) and Y = aX + b, then Y ~ N(aμ + b, a^2σ^2).
Central Limit Theorem (CLT): The normal distribution plays a central role in probability and statistics due to the CLT, which states that the sum (or average) of a large number of independent, identically distributed random variables tends to follow a normal distribution, regardless of the original distribution.
Standard Normal Distribution: The standard normal distribution is N(0, 1); any normal variable can be standardized via Z = (X − μ)/σ.
The chi-square distribution is widely used in statistics, particularly in hypothesis testing and confidence interval estimation for variance. It is a special case of the gamma distribution and arises as the distribution of the sum of squared standard normal variables.
5.1 Probability Density Function (PDF)
The PDF of the chi-square distribution with k degrees of freedom is f(x) = (1/(2^{k/2} Γ(k/2))) x^{k/2 − 1} e^{−x/2}, for x > 0.
Derivation of the PDF:
Special Case of Gamma Distribution: The chi-square distribution is a special case of the gamma distribution where the shape is α = k/2 and the rate is λ = 1/2.
Substituting α = k/2 and λ = 1/2 into the gamma PDF gives the chi-square PDF above.
5.2 Cumulative Distribution Function (CDF)
The CDF of the chi-square distribution is not expressed in a simple closed form. However, it can be represented using the lower incomplete gamma function: F(x) = γ(k/2, x/2)/Γ(k/2),
where γ(s, x) = ∫_{0}^{x} u^{s − 1} e^{−u} du is the lower incomplete gamma function.
5.3 Moment Generating Function (MGF)
The MGF is M_X(t) = (1 − 2t)^{−k/2}, for t < 1/2.
Derivation of the MGF:
Definition: The MGF is defined as M_X(t) = E[e^{tX}].
Substitute the PDF: M_X(t) = ∫_{0}^{∞} e^{tx} (1/(2^{k/2} Γ(k/2))) x^{k/2 − 1} e^{−x/2} dx.
Combining the Exponentials: the integrand becomes x^{k/2 − 1} e^{−(1/2 − t)x}.
Recognizing the integral as the gamma function gives M_X(t) = (1 − 2t)^{−k/2} for t < 1/2.
5.4 Mean
The mean is E[X] = k.
Derivation of the Mean:
Sum of Squared Normals: The chi-square distribution with k degrees of freedom is the distribution of Z_1^2 + ... + Z_k^2 with Z_i i.i.d. N(0, 1); since E[Z_i^2] = 1, the mean is k.
5.5 Variance
The variance is Var(X) = 2k.
Derivation of the Variance:
Variance of Sum of Independent Variables: The variance of the sum of independent random variables is the sum of their variances. Since each Z_i^2 has variance E[Z_i^4] − (E[Z_i^2])^2 = 3 − 1 = 2, the total variance is 2k.
5.6 Important Notes
Special Case of Gamma Distribution: The chi-square distribution is a special case of the gamma distribution with parameters α = k/2 and λ = 1/2.
Sum of Independent Chi-square Variables: If X_1 ~ χ^2_{k_1} and X_2 ~ χ^2_{k_2} are independent, then X_1 + X_2 ~ χ^2_{k_1 + k_2}.
Relationship with Standard Normal: If Z ~ N(0, 1), then Z^2 ~ χ^2_1.
5.7 Detailed Proof of the Two Properties
1. Sum of Independent Chi-Square Variables
Statement: If X_1 ~ χ^2_{k_1} and X_2 ~ χ^2_{k_2} are independent, then X_1 + X_2 ~ χ^2_{k_1 + k_2}.
Proof:
Moment Generating Function (MGF) of a Chi-Square Distribution:
The MGF of a chi-square random variable with k degrees of freedom is M(t) = (1 − 2t)^{−k/2}, for t < 1/2.
MGF of the Sum of Independent Variables:
Since the variables are independent, the MGF of the sum is the product of the individual MGFs: M_{X_1 + X_2}(t) = M_{X_1}(t) M_{X_2}(t).
Substituting the MGFs of the chi-square distributions: (1 − 2t)^{−k_1/2} (1 − 2t)^{−k_2/2}.
Simplifying the Expression: Combine the exponents: (1 − 2t)^{−(k_1 + k_2)/2}.
Conclusion: The MGF of X_1 + X_2 is that of a chi-square distribution with k_1 + k_2 degrees of freedom, so X_1 + X_2 ~ χ^2_{k_1 + k_2}.
2. Relationship with Standard Normal
Statement: If Z ~ N(0, 1), then Z^2 ~ χ^2_1.
Proof:
PDF of Standard Normal Distribution: The probability density function (PDF) of the standard normal distribution is φ(z) = (1/√(2π)) e^{−z^2/2}.
Transforming to a Chi-Square Distribution: Consider the transformation Y = Z^2.
Change of Variables: Let y = z^2, so z = ±√y and |dz/dy| = 1/(2√y) on each branch.
The PDF of Y collects the contributions of both branches: f_Y(y) = φ(√y)/(2√y) + φ(−√y)/(2√y) = (1/√(2πy)) e^{−y/2}, for y > 0.
Simplifying: f_Y(y) = (1/(2^{1/2} Γ(1/2))) y^{−1/2} e^{−y/2}, using Γ(1/2) = √π.
Identifying the Chi-Square Distribution: The PDF derived above matches the PDF of the chi-square distribution with 1 degree of freedom, so Z^2 ~ χ^2_1.
The F-distribution arises frequently in the context of analysis of variance (ANOVA) and is used to compare variances. It is the distribution of the ratio of two scaled chi-square distributions.
6.1 Probability Density Function (PDF)
The PDF of the F-distribution with d_1 and d_2 degrees of freedom is f(x) = [Γ((d_1 + d_2)/2) / (Γ(d_1/2) Γ(d_2/2))] (d_1/d_2)^{d_1/2} x^{d_1/2 − 1} (1 + d_1 x/d_2)^{−(d_1 + d_2)/2}, for x > 0.
Derivation of the PDF:
Ratio of Chi-square Distributions: The F-distribution is defined as the ratio of two independent chi-square distributed variables, scaled by their respective degrees of freedom. Specifically, if U ~ χ^2_{d_1} and V ~ χ^2_{d_2} are independent, then F = (U/d_1)/(V/d_2) follows an F-distribution with (d_1, d_2) degrees of freedom.
Using the Relationship with the Beta Distribution: The F-distribution can be derived from the Beta distribution since if B ~ Beta(d_1/2, d_2/2), then (d_2/d_1) · B/(1 − B) follows an F(d_1, d_2) distribution.
6.2 Cumulative Distribution Function (CDF)
The CDF of the F-distribution does not have a simple closed-form expression but can be computed using the regularized incomplete beta function: F(x) = I_{d_1 x/(d_1 x + d_2)}(d_1/2, d_2/2).
6.3 Moment Generating Function (MGF)
The MGF of the F-distribution does not have a closed-form expression, largely due to the complex nature of the distribution. Instead, its characteristic function or the first few moments (mean, variance) are often used to understand its properties.
6.4 Mean
The mean is E[X] = d_2/(d_2 − 2), provided d_2 > 2.
Derivation of the Mean:
Expectation Formula: The mean of the F-distribution is derived using properties of the chi-square and beta distributions, but it requires that the second degrees-of-freedom parameter satisfy d_2 > 2; otherwise the mean does not exist.
6.5 Variance
The variance is Var(X) = 2 d_2^2 (d_1 + d_2 − 2) / (d_1 (d_2 − 2)^2 (d_2 − 4)), provided d_2 > 4.
Derivation of the Variance:
Variance Formula: The variance of the F-distribution involves higher moments and can be quite complex. The existence of the variance requires that d_2 > 4.
6.6 Important Notes
Relationship with Chi-square Distribution:
If U ~ χ^2_{d_1} and V ~ χ^2_{d_2} are independent, then (U/d_1)/(V/d_2) ~ F(d_1, d_2).
Relationship with t-Distribution:
If T ~ t_n, then T^2 ~ F(1, n).
Inverse F-Distribution:
If X ~ F(d_1, d_2), then 1/X ~ F(d_2, d_1).
6.7 Detailed Proof of the Three Notes
Note 1: Relationship between Chi-Square Distributions and the F-Distribution
Statement: Let U ~ χ^2_{d_1} and V ~ χ^2_{d_2} be independent. Then F = (U/d_1)/(V/d_2) ~ F(d_1, d_2).
Proof:
Chi-Square Distribution: U ~ χ^2_{d_1} and V ~ χ^2_{d_2} are independent.
Define the Ratio:
The random variable F is defined as F = (U/d_1)/(V/d_2).
Simplify this to F = (d_2 U)/(d_1 V).
Distribution of the Ratio:
The distribution of F is obtained by a change of variables using the joint density of (U, V) (or via the Beta-distribution representation above), which yields the F-density with (d_1, d_2) degrees of freedom.
The result is that F ~ F(d_1, d_2).
Note 2: Relationship between the t-Distribution and the F-Distribution
Statement: Let T ~ t_n. Then T^2 ~ F(1, n).
Proof:
t-Distribution: by definition, T = Z/√(W/n), where Z ~ N(0, 1) and W ~ χ^2_n are independent.
Square the t-Variable:
Squaring gives T^2 = Z^2/(W/n).
Recognize the Distribution: Z^2 ~ χ^2_1, so T^2 = (Z^2/1)/(W/n) is the ratio of two independent chi-square variables, each divided by its degrees of freedom.
Therefore, T^2 ~ F(1, n).
Note 3: Reciprocal of F-distribution
Statement: If X ~ F(d_1, d_2), then 1/X ~ F(d_2, d_1).
Proof:
Consider X = (U/d_1)/(V/d_2) with U ~ χ^2_{d_1} and V ~ χ^2_{d_2} independent.
Reciprocal of X:
Taking the reciprocal gives 1/X = (V/d_2)/(U/d_1).
Recognize the Distribution:
The new random variable is again a ratio of independent chi-square variables divided by their degrees of freedom, now with (d_2, d_1) in place of (d_1, d_2), so 1/X ~ F(d_2, d_1).
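A minimal simulation check of Note 2 (my own illustration; the degrees of freedom n and the sample size are arbitrary): squared draws from a t_n distribution should match the F(1, n) CDF.

import numpy as np
from scipy.stats import t, f

rng = np.random.default_rng(6)
n, N = 8, 500_000
T2 = t.rvs(df=n, size=N, random_state=rng) ** 2

for x in (0.5, 1.0, 3.0):
    print(f"x = {x}: empirical P(T^2 <= x) = {np.mean(T2 <= x):.4f}, "
          f"F(1,{n}) cdf = {f.cdf(x, 1, n):.4f}")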
Let's break down the problem step by step to thoroughly understand the concepts and the reasoning behind the degrees of freedom n − 1.
7.1 Sample Mean and Sample Variance
Given a random sample X_1, X_2, ..., X_n from a normal population N(μ, σ^2):
Sample Mean: X̄ = (1/n) sum_{i=1}^{n} X_i.
Sample Variance: S^2 = (1/(n − 1)) sum_{i=1}^{n} (X_i − X̄)^2.
7.2 Distribution of the Sample Mean
The sample mean is normally distributed: X̄ ~ N(μ, σ^2/n).
This result comes from the fact that the sum of independent normal random variables is also normally distributed. The mean of the sample mean is μ and its variance is σ^2/n.
7.3 Independence of X̄ and S^2
The sample mean X̄ and the sample variance S^2 are independent when sampling from a normal population (a special property of the normal distribution).
7.4 Distribution of the Sample Variance
The sample variance satisfies (n − 1)S^2/σ^2 ~ χ^2_{n − 1}.
This result indicates that the scaled sample variance follows a chi-square distribution with n − 1 degrees of freedom.
7.5 Distribution of the Standardized Sample Mean
The term √n(X̄ − μ)/σ follows a standard normal distribution, N(0, 1).
7.6 t-Distribution and Its Relation
The ratio T = √n(X̄ − μ)/S follows a t-distribution with n − 1 degrees of freedom.
This result is due to the relationship between the normal distribution, chi-square distribution, and the t-distribution.
7.7 Why the Degrees of Freedom is n − 1
The degrees of freedom in the sample variance calculation are reduced by 1 because the sample mean X̄ is estimated from the same data, which uses up one degree of freedom.
In other words, there is a constraint on the data, as the sum of the deviations from the mean must be zero: sum_{i=1}^{n} (X_i − X̄) = 0.
7.8 Are the Deviations X_i − X̄ Independent?
No, the terms X_i − X̄ are not independent, because they satisfy the linear constraint sum_{i} (X_i − X̄) = 0; only n − 1 of them can vary freely.
Definition: If Z ~ N(0, 1) and W ~ χ^2_n are independent, then T = Z/√(W/n) follows a t-distribution with n degrees of freedom.
1. Understanding the Setup:
The chi-squared distribution χ^2_n arises as the sum of the squares of n independent standard normal random variables.
2. Definition of the t-Distribution:
The t-distribution with n degrees of freedom is defined as the distribution of the ratio Z/√(W/n), where Z ~ N(0, 1) and W ~ χ^2_n are independent.
To prove this, we need to show that this ratio has the properties of a t-distribution.
3. Deriving the Ratio:
Given Z and W as above, form the ratio T = Z/√(W/n).
4. Transforming the Chi-Squared Variable:
Let's express the denominator √(W/n) in terms of the chi-squared variable W.
Since W ~ χ^2_n has mean n, the quantity W/n has mean 1, and √(W/n) concentrates around 1 as n grows.
5. Independence and Distribution:
Given that Z and W are independent, the joint density of (Z, W) factorizes, which allows the density of the ratio to be computed by a change of variables.
6. Connection to the t-Distribution:
By the definition of the t-distribution, the ratio Z/√(W/n) has exactly the t_n density, which has:
Symmetry around 0.
Heavier tails than the normal distribution.
The degrees of freedom parameter n controls how heavy the tails are; as n → ∞, the t-distribution approaches the standard normal.
Proving that the statistic T = √n(X̄ − μ)/S follows a t-distribution with n − 1 degrees of freedom.
Step 1: Understanding the Setup
We want to show that the statistic T = √n(X̄ − μ)/S has a t-distribution with n − 1 degrees of freedom, where X_1, ..., X_n are i.i.d. N(μ, σ^2).
Step 2: Decomposing the Statistic
First, let's express T as a ratio: T = [√n(X̄ − μ)/σ] / √(S^2/σ^2), dividing numerator and denominator by σ.
Step 3: Standardizing the Sample Mean
The sample mean satisfies X̄ ~ N(μ, σ^2/n).
If we standardize X̄, we obtain Z = √n(X̄ − μ)/σ ~ N(0, 1).
This means the numerator of T (after dividing by σ) is a standard normal variable.
Step 4: Chi-Square Distribution of the Sample Variance
The sample variance satisfies W = (n − 1)S^2/σ^2 ~ χ^2_{n − 1}.
Step 5: Independence of X̄ and S^2
One key property of normal distributions is that the sample mean X̄ and the sample variance S^2 are independent.
Step 6: Deriving the t-Distribution
The expression we have for T is T = Z/√(W/(n − 1)), the ratio of a standard normal variable to the square root of an independent chi-square variable divided by its degrees of freedom n − 1.
Thus, the statistic T follows a t-distribution with n − 1 degrees of freedom.
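A minimal simulation sketch of this result (my own illustration; the population parameters μ, σ, the sample size n, and the replication count are arbitrary): tail probabilities of the simulated T statistic should match the t_{n−1} distribution.

import numpy as np
from scipy.stats import t

rng = np.random.default_rng(7)
mu, sigma, n, N = 10.0, 2.0, 6, 200_000

X = rng.normal(mu, sigma, size=(N, n))
xbar = X.mean(axis=1)
s = X.std(axis=1, ddof=1)                 # sample standard deviation (divisor n - 1)
T = np.sqrt(n) * (xbar - mu) / s

for c in (1.0, 2.0):
    print(f"P(T > {c}): empirical {np.mean(T > c):.4f}, "
          f"t_{n-1} tail {1 - t.cdf(c, df=n - 1):.4f}")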
Question 1: Toss a fair coin 2 or 3 times. Can you accurately predict the average appearance of heads?
When you toss a fair coin only a few times (like 2 or 3 times), the outcome is highly variable. For example:
You might get heads twice, giving you an average of 1.0 (100% heads).
Or you might get heads once and tails once, giving you an average of 0.5.
Since there are so few trials, your estimate of the average appearance of heads is not likely to be accurate.
Question 2: Toss a fair coin many times. What will you predict the average appearance of heads?
As you toss the coin more and more times, the Law of Large Numbers tells us that the average number of heads will converge to the true probability of getting a head in a single toss, which is 0.5 for a fair coin.
If you toss the coin 1000 times, and it's fair, the average number of heads should be close to 0.5.
If you toss it 10,000 times, this average will be even closer to 0.5.
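A minimal sketch of the Law of Large Numbers in action (my own illustration; the number of tosses is arbitrary): the running proportion of heads settles near 0.5.

import numpy as np

rng = np.random.default_rng(8)
tosses = rng.integers(0, 2, size=10_000)             # 0 = tails, 1 = heads, fair coin
running_avg = np.cumsum(tosses) / np.arange(1, tosses.size + 1)

for n in (10, 100, 1_000, 10_000):
    print(f"after {n:>6} tosses: proportion of heads = {running_avg[n - 1]:.4f}")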
2.1 Almost Sure Convergence
Definition: A sequence of random variables X_1, X_2, ... converges almost surely to a random variable X if P(lim_{n→∞} X_n = X) = 1.
This means that as n → ∞, the realized sequence X_n(ω) converges to X(ω) for almost every sample path ω (all except a set of probability zero).
Example: Consider our coin toss example. Let X̄_n be the proportion of heads in the first n tosses; the strong law of large numbers says X̄_n → 0.5 almost surely for a fair coin.
2.2 Convergence in Probability
Definition: A sequence of random variables X_1, X_2, ... converges in probability to X if, for every ε > 0, P(|X_n − X| > ε) → 0 as n → ∞.
This means that as n grows, the probability that X_n deviates from X by more than any fixed ε shrinks to zero.
Example: Again, using the coin toss, the proportion of heads X̄_n converges in probability to 0.5 (the weak law of large numbers).
Relationship Between Convergence Concepts
Almost sure convergence implies convergence in probability. If a sequence of random variables converges almost surely to a limit, then it also converges in probability to that limit.
Convergence in probability does not imply almost sure convergence. A sequence might converge in probability without almost surely converging, as almost sure convergence is a stronger condition.
An example of almost sure convergence is the coin-tossing experiment:
Suppose you have a coin for which each toss shows heads (1) or tails (0) with probability 1/2.
Now consider the sequence X̄_n of running proportions of heads; the strong law of large numbers guarantees that X̄_n → 1/2 along almost every sample path.
Almost sure convergence means the sequence tends to a fixed value on almost every sample path.
Convergence in probability means the sequence gets close to a fixed value with probability approaching one, but it does not require convergence on every individual sample path.
Almost sure convergence does not require that every sample path follow the rule, only that "almost all" of them do; convergence in probability allows more sample paths to deviate, but the probability of such paths shrinks as the sample size increases.
Understanding Convergence in Distribution
In statistical contexts, convergence in distribution describes how the distribution (CDF) of a statistic settles down to a limiting distribution as the sample size grows.
Convergence in distribution (also known as weak convergence) is concerned with the convergence of the cumulative distribution functions (CDFs) of a sequence of random variables.
Definition: A sequence of random variables X_1, X_2, ... converges in distribution to X if F_{X_n}(x) → F_X(x) as n → ∞ at every point x where F_X is continuous.
This means that as n grows, the CDF of X_n approaches the CDF of X.
The following properties explain the relationships between almost sure convergence, convergence in probability, and convergence in distribution:
Almost sure convergence implies convergence in probability, which in turn implies convergence in distribution:
If X_n → X almost surely, then X_n → X in probability.
If X_n → X in probability, then X_n → X in distribution.
Reasoning: Almost sure convergence is the strongest form, ensuring that the sequence converges almost everywhere. This naturally implies convergence in probability, which in turn implies convergence in distribution, as distributional convergence is a weaker condition.
Convergence in probability implies convergence in distribution:
If X_n → X in probability, then X_n → X in distribution.
Reasoning: Since convergence in probability ensures that for any small positive ε the probability of a deviation larger than ε vanishes, the CDFs of X_n must approach the CDF of X at its continuity points.
Convergence to a constant:
If X_n → c in distribution, where c is a constant, then X_n → c in probability.
Reasoning: Convergence in distribution to a constant implies that the random variables concentrate all their probability mass near c, which is exactly convergence in probability to c.
Convergence preserved by continuous transformations:
If X_n → X in distribution and g is a continuous function, then g(X_n) → g(X) in distribution.
Reasoning: A continuous transformation of a convergent sequence of random variables preserves the convergence in distribution. The continuous mapping theorem formalizes this concept.
Slutsky's Theorem is a powerful result that relates products, sums, and ratios of converging sequences of random variables.
Theorem: If X_n → X in distribution and Y_n → c in probability, where c is a constant, then:
(a) X_n + Y_n → X + c in distribution;
(b) Y_n X_n → cX in distribution;
(c) X_n / Y_n → X/c in distribution, provided c ≠ 0.
Reasoning: Slutsky's Theorem combines converging sequences in different manners, demonstrating that convergence in distribution is preserved under certain algebraic operations, provided one of the sequences converges in probability.
The Delta Method is used to approximate the distribution of a function of a random variable that is asymptotically normal.
Theorem: Suppose √n(X_n − θ) → N(0, σ^2) in distribution and g is differentiable at θ with g'(θ) ≠ 0. Then √n(g(X_n) − g(θ)) → N(0, σ^2 [g'(θ)]^2) in distribution.
Reasoning: The Delta Method leverages the fact that if X_n is close to θ, a first-order Taylor expansion gives g(X_n) ≈ g(θ) + g'(θ)(X_n − θ), so the asymptotic normality of X_n carries over to g(X_n) with variance scaled by [g'(θ)]^2.
The Law of Large Numbers states that as the number of independent, identically distributed (i.i.d.) random variables n increases, their sample mean X̄_n converges to the population mean μ.
Proof Outline:
We know that each X_i has mean μ and finite variance σ^2.
By the definition of the sample mean, X̄_n = (1/n) sum_{i=1}^{n} X_i, so E[X̄_n] = μ and Var(X̄_n) = σ^2/n.
By Chebyshev's inequality, the probability that X̄_n deviates from μ by more than any fixed ε > 0 tends to zero: P(|X̄_n − μ| > ε) ≤ σ^2/(n ε^2) → 0 as n → ∞.
Monte Carlo integration is a method used to estimate the value of an integral using random sampling.
Goal:
To approximate the integral I = ∫_a^b g(x) dx using random sampling.
Steps:
Generate Sample Points: Generate n independent points X_1, ..., X_n uniformly distributed on [a, b].
Compute the Sample Mean: Calculate the sample mean of the function values at these points: Î = (b − a) · (1/n) sum_{i=1}^{n} g(X_i).
Apply the Law of Large Numbers: According to the LLN, as n → ∞, Î converges to (b − a) E[g(X_1)] = ∫_a^b g(x) dx (see the numerical sketch after this list).
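A minimal Monte Carlo integration sketch (my own illustration; the integrand g(x) = x^2 on [0, 1] is an arbitrary choice whose exact integral, 1/3, makes the error easy to see):

import numpy as np

rng = np.random.default_rng(9)
g = lambda x: x**2                         # arbitrary integrand; exact integral on [0, 1] is 1/3

for n in (100, 10_000, 1_000_000):
    u = rng.random(n)                      # X_i ~ Uniform(0, 1)
    estimate = g(u).mean()                 # (1/n) sum g(X_i) -> integral of g over [0, 1]
    print(f"n = {n:>8}: Monte Carlo estimate = {estimate:.5f}  (exact = {1/3:.5f})")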
The Central Limit Theorem (CLT) is a fundamental theorem in probability theory. It states that, given a sufficiently large number of independent and identically distributed (i.i.d.) random variables with a finite mean and variance, the distribution of the sum (or average) of these variables approaches a normal distribution as the number of variables increases.
Let X_1, X_2, ..., X_n be i.i.d. random variables with mean μ and finite variance σ^2.
According to the CLT: √n(X̄_n − μ)/σ → N(0, 1) in distribution.
This means that as n grows, the distribution of the standardized sample mean approaches the standard normal distribution, regardless of the original distribution of the X_i.
Given this setup, we can describe the distribution of the sample mean X̄_n itself.
Assumptions: the X_i are independent and identically distributed with finite mean μ and finite variance σ^2.
Distribution of the Sample Mean
Since each X_i has mean μ and variance σ^2:
Mean of X̄_n: E[X̄_n] = μ.
Variance of X̄_n: Var(X̄_n) = σ^2/n.
Distribution of X̄_n: for large n, X̄_n is approximately N(μ, σ^2/n) (exactly normal if the X_i are themselves normal).
Let's consider the binomial distribution as an example of applying the CLT.
Suppose X ~ Binomial(n, p), so X = X_1 + ... + X_n is a sum of n i.i.d. Bernoulli(p) variables.
The mean and variance of each Bernoulli variable are p and p(1 − p).
According to the CLT, when n is large, (X − np)/√(np(1 − p)) is approximately N(0, 1).
This means that the binomial distribution Binomial(n, p) can be approximated by the normal distribution N(np, np(1 − p)) for large n.
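A minimal numerical sketch of this normal approximation (my own illustration; n, p, and the evaluation points are arbitrary, and the half-unit continuity correction is a standard refinement not discussed above):

import numpy as np
from scipy.stats import binom, norm

n, p = 100, 0.3
mu, sd = n * p, np.sqrt(n * p * (1 - p))

for k in (20, 30, 40):
    exact = binom.cdf(k, n, p)
    approx = norm.cdf((k + 0.5 - mu) / sd)   # normal approximation with continuity correction
    print(f"P(X <= {k}): exact {exact:.4f}, normal approx {approx:.4f}")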
Context:
Suppose you're interested in knowing the average income μ of all families in Singapore.
If you could ask every family in Singapore for their income, you would get the true average income μ.
Sampling:
Instead of asking every family, you take a random sample of 1000 families and calculate the average income of these 1000 families. Let's denote this sample average by X̄.
Sampling Error:
The difference between the sample average X̄ and the true population average μ, namely X̄ − μ, is the sampling error.
This error arises because the sample might not perfectly represent the entire population.
Assessment of Sampling Error:
The sampling error can be assessed by understanding its distribution. Under certain conditions (like the Central Limit Theorem), the sampling distribution of X̄ is approximately normal with mean μ and standard deviation σ/√n.
Therefore, the sampling error can be assessed using confidence intervals or hypothesis testing to determine how close X̄ is likely to be to μ.
Context
Consider an experimental error ε that is the combined effect of many small, independent component errors.
These errors could arise from various sources, such as measurement inaccuracies, variations in raw materials, or differences in experimental conditions.
Why Normal Distribution?
According to the Central Limit Theorem (CLT), when these component errors are independent and identically distributed, the sum (or a linear combination) of these errors will tend to follow a normal distribution as the number of components becomes large.
Implication
Because of this property, the overall experimental error ε is commonly assumed to be normally distributed.
This assumption of normality simplifies statistical analysis and is foundational for many statistical methods, such as regression analysis and hypothesis testing.