Population and Sample:
(a) Finite Population: A population is said to be finite, if it consists of finite or fixed number of elements (i.e., items, objects, measurements or observations). For example, all the university students in Pakistan, the heights of all the students enrolled in Karachi University, etc.
(b) Infinite Population: A population is said to be infinite if there is no limit to the number of elements it can contain. For example, the roll of two dice, all the heights between 2 and 3 meters, etc.
(a) Sampling with Replacement: If the sample is taken with replacement from a population, finite or infinite, the element drawn is returned to the population before drawing the next element.
(b) Sampling without Replacement: If the sample is taken without replacement from a finite population, the element selected is not returned to the population.
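The two sampling schemes can be illustrated with a short Python sketch. It assumes a small hypothetical population of five numbered balls and uses the standard library's `random.choices` (with replacement) and `random.sample` (without replacement); the seed is fixed only to make the run repeatable.

```python
import random

population = [1, 2, 3, 4, 5]  # hypothetical population of five numbered balls
rng = random.Random(0)        # fixed seed so the run is repeatable

# With replacement: each drawn element is returned before the next draw,
# so elements may repeat and the sample may be larger than the population.
with_replacement = rng.choices(population, k=8)

# Without replacement: a drawn element is not returned,
# so each element appears at most once and n cannot exceed N.
without_replacement = rng.sample(population, k=3)

print(with_replacement)
print(without_replacement)
```

Asking `rng.sample` for more than five elements would raise an error, which mirrors the fact that sampling without replacement only makes sense for n ≤ N.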
Probability Samples and Non-Probability Samples:
(a) Simple Random Sampling: refers to a method of selecting a sample of a given size from a given population in such a way that all possible samples of this size which could be formed from this population have equal probabilities of selection. It is a method in which a sample of n units is selected from the population of N units such that each one of the ^{N}C_{n} distinct samples has an equal chance of being drawn. This method is sometimes also referred to as the ‘lottery method’.
(b) Stratified Random Sampling: consists of the following two steps:
(i) The material or area to be sampled is divided into groups or classes called ‘strata’. Items within each stratum are homogeneous.
(ii) From each stratum, a simple random sample is taken and the overall sample is obtained by combining the samples for all strata.
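The two steps above can be sketched in Python as follows. The strata, their labels, and the choice of drawing the same fraction from every stratum (proportional allocation) are assumptions made for illustration; the text itself does not prescribe how many units to take from each stratum.

```python
import random

def stratified_sample(strata, fraction, seed=0):
    """Draw a simple random sample of the given fraction from each stratum."""
    rng = random.Random(seed)
    sample = []
    for units in strata.values():
        k = max(1, round(len(units) * fraction))  # at least one unit per stratum
        sample.extend(rng.sample(units, k))       # SRS without replacement within the stratum
    return sample

# Hypothetical strata: students grouped by faculty (labels are made up).
strata = {
    "science": [f"S{i}" for i in range(1, 41)],   # 40 units
    "arts": [f"A{i}" for i in range(1, 21)],      # 20 units
    "commerce": [f"C{i}" for i in range(1, 11)],  # 10 units
}
sample = stratified_sample(strata, fraction=0.1)
print(sample)
```

With a fraction of 0.1 the combined sample holds 4, 2 and 1 units from the three strata respectively.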
(c) Systematic Sampling: is another form of sample design in which the samples are equally spaced throughout the area or population to be sampled. For example, in house-to-house sampling every 10^{th} or 20^{th} house may be taken. More specifically, a systematic sample is obtained by taking every k^{th} unit in the population after the units in the population have been numbered or arranged in some way.
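A minimal Python sketch of taking every k^{th} unit is given below. The street of 100 numbered houses is hypothetical, and the random starting point among the first k units is a common convention assumed here rather than something stated in the text.

```python
import random

def systematic_sample(units, k, seed=0):
    """Pick a random start among the first k units, then take every k-th unit."""
    rng = random.Random(seed)
    start = rng.randrange(k)   # random start position in 0 .. k-1
    return units[start::k]

houses = list(range(1, 101))   # hypothetical street of 100 numbered houses
sample = systematic_sample(houses, k=10)
print(sample)
```

Because the list is numbered in order, consecutive houses in the sample are always exactly k apart.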
(d) Cluster Sampling: One of the main difficulties in large-scale surveys is the extensive area that may have to be covered in getting a random or stratified random sample. It may be a very expensive and lengthy task to cover the whole population in order to obtain a representative sample. It is not possible to take a simple random or systematic sample of persons from the entire country or from within strata, since there is no list in which all the individuals are numbered from 1 to N. Even if such a list existed, it would be too expensive to base the enquiry on a simple random sample of persons. Under these circumstances, it is economical to select groups of elements, called ‘clusters’, from the population. This is called ‘cluster sampling’. The difference between a cluster and a stratum is that a stratum is expected to be homogeneous, whereas a cluster should be as heterogeneous as possible. Clusters are also known as the primary sampling units. Cluster sampling may consist of:
(i) Single-stage Cluster Sampling,
(ii) Sub-sampling or Two-stage Sampling, and
(iii) Multi-stage Sampling.
(a) Judgement or Purposive Sampling: There are many situations where investigators use judgement samples to gain needed information. For example, it may be inconvenient to select a random sample from a cartload of melons. The melons selected may be very large or very small. The observer may use his own judgement. This method is very useful when the sample to be drawn is small.
(b) Quota Sampling: is widely used in opinion polls, market surveys, etc. In such surveys, the interviewers are simply given quotas to be filled from different strata, with practically no restrictions on how they are to be filled.
Parameters and Statistic:
Sampling and Non-Sampling Errors:
(a) Sampling Errors:
E = t – θ
(b) Non-Sampling Errors:
Bias:
B = m – μ
Where μ is the true population value and m is the mean of the sample statistics of an infinity of samples.
Precision and Accuracy:
Sampling Distribution:
Sampling Distribution of Mean:
From a finite population of N units with mean μ and SD σ, draw all possible random samples of size n. Find the mean x̄ of every sample. The statistic x̄ is now a random variable. Form a probability distribution of x̄, known as the ‘sampling distribution of mean’.
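The sampling distribution of mean is one of the most fundamental concepts of statistical inference and it has the following properties:
(i) μ_{x̄} = μ
(ii) σ_{x̄} = σ/√n, when the sampling is with replacement or the population is infinite
(iii) σ_{x̄} = (σ/√n)·√[(N – n)/(N – 1)], when the sampling is without replacement from a finite population
Where √[(N – n)/(N – 1)] is the Finite Population Correction (f.p.c.) and n/N is the sampling fraction.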
The f.p.c. approaches one in each of the following cases:
(i) when the population is infinite,
(ii) when sampling fraction is less than 0.05, and
(iii) when the sampling is with replacement.
Whenever the sampling is with replacement, the population is considered infinite. For example, if a box contains 5 balls and a sample is drawn with replacement, the sample size can be extended from n = 1 to n = 100 or whatever size is desired. Hence, the population is considered to be infinite.
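How quickly the f.p.c. approaches one can be checked numerically. The sketch below assumes the usual definition f.p.c. = √[(N – n)/(N – 1)] and a fixed, made-up sample size of n = 50; the population sizes are illustrative only.

```python
import math

def fpc(N, n):
    """Finite population correction: sqrt((N - n) / (N - 1))."""
    return math.sqrt((N - n) / (N - 1))

n = 50  # hypothetical sample size
for N in (100, 1_000, 10_000, 1_000_000):
    print(f"N={N:>9}  n/N={n / N:.5f}  fpc={fpc(N, n):.4f}")
```

Once the sampling fraction n/N drops to about 0.05 the correction is already above 0.97, which is why it is commonly ignored below that threshold.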
Mean and Standard Deviation of Sampling Distribution:
Like other distributions, the sampling distribution of x̄ has a mean and a standard deviation:
Mean of sampling distribution: μ_{x̄} = Σ x̄·f(x̄)
The standard deviation of the sampling distribution of x̄ is known as the ‘standard error’ (σ_{x̄}). For n > 1, the standard error of mean is always less than the SD of population, i.e., σ. It depends on the size of the sample drawn. If the sample size increases, the standard error of mean decreases and consequently the value of sample mean will be closer to the value of population mean.
SD of sampling distribution: σ_{x̄} = √[Σ (x̄ – μ_{x̄})^{2}·f(x̄)]
or alternatively
SD of sampling distribution: σ_{x̄} = √[Σ x̄^{2}·f(x̄) – (Σ x̄·f(x̄))^{2}]
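To see how the standard error shrinks as the sample grows, the sketch below evaluates σ/√n (the form that applies with replacement or for an infinite population) for a few sample sizes. The value σ = 10 is a made-up population SD used only for illustration.

```python
from math import sqrt

sigma = 10.0  # hypothetical population SD, for illustration only

def standard_error(sigma, n):
    """Standard error of the mean, sigma / sqrt(n), without finite population correction."""
    return sigma / sqrt(n)

sizes = [1, 4, 16, 64, 256]
errors = [standard_error(sigma, n) for n in sizes]
for n, se in zip(sizes, errors):
    print(f"n={n:>3}  standard error={se:.3f}")
```

Each fourfold increase in the sample size halves the standard error.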
No. of Possible Samples:
The number of possible samples can be calculated as below:
(i) When sampling is done without replacement, all possible samples = ^{N}C_{n}
(ii) When sampling is done with replacement, all possible samples = N^{n}
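Both counts can be verified by brute-force enumeration. The sketch below assumes a small illustrative population of four values and uses `itertools.combinations` for unordered samples without replacement and `itertools.product` for ordered samples with replacement.

```python
import math
from itertools import combinations, product

population = [1, 2, 3, 4]  # small illustrative population
N, n = len(population), 2

# Without replacement (order ignored): N-choose-n samples.
without = list(combinations(population, n))
# With replacement (order kept): N ** n samples.
with_repl = list(product(population, repeat=n))

print(len(without), math.comb(N, n))   # the two counts should agree
print(len(with_repl), N ** n)          # the two counts should agree
```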
Example:
A population consists of following data:
1, 2, 3, 4
Suppose that a sample of size 2 is drawn ‘with replacement’. You are required to calculate the following:
(a) Population mean,
(b) Population standard deviation,
(c) Mean of each sample,
(d) Sampling distribution table of sample mean with replacement, and
(e) Mean and standard deviation of sampling distribution.
Solution:
N = 4
n = 2
No. of samples (when sampling is with replacement) = N^{n} = 4^{2} = 16
(a) Population Mean (μ):
μ = ΣX / N = (1 + 2 + 3 + 4) / 4 = 2.5
(b) Population Standard Deviation (σ):
σ = √[Σ(X – μ)^{2} / N] = √(5 / 4) = √1.25 ≈ 1.118
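(c) Mean (x̄) of Each Sample:
Samples (with replacement) and their means:

| Sample | x̄ | Sample | x̄ | Sample | x̄ | Sample | x̄ |
|---|---|---|---|---|---|---|---|
| (1,1) | 1.0 | (1,2) | 1.5 | (1,3) | 2.0 | (1,4) | 2.5 |
| (2,1) | 1.5 | (2,2) | 2.0 | (2,3) | 2.5 | (2,4) | 3.0 |
| (3,1) | 2.0 | (3,2) | 2.5 | (3,3) | 3.0 | (3,4) | 3.5 |
| (4,1) | 2.5 | (4,2) | 3.0 | (4,3) | 3.5 | (4,4) | 4.0 |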
(d) Sampling Distribution:
Sampling Distribution of Sample Mean (x̄) with Replacement

| x̄ | Frequency f | Probability f(x̄) = f/Σf |
|---|---|---|
| 1.0 | 1 | 0.0625 |
| 1.5 | 2 | 0.125 |
| 2.0 | 3 | 0.1875 |
| 2.5 | 4 | 0.25 |
| 3.0 | 3 | 0.1875 |
| 3.5 | 2 | 0.125 |
| 4.0 | 1 | 0.0625 |
| Total | 16 | 1 |
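(e) Mean and standard deviation of sampling distribution:

| x̄ | f(x̄) | x̄·f(x̄) | x̄ – μ_{x̄} | (x̄ – μ_{x̄})^{2} | (x̄ – μ_{x̄})^{2}·f(x̄) | x̄^{2}·f(x̄) |
|---|---|---|---|---|---|---|
| 1.0 | 0.0625 | 0.0625 | –1.5 | 2.25 | 0.1406 | 0.0625 |
| 1.5 | 0.125 | 0.1875 | –1.0 | 1 | 0.125 | 0.2812 |
| 2.0 | 0.1875 | 0.375 | –0.5 | 0.25 | 0.0469 | 0.75 |
| 2.5 | 0.25 | 0.625 | 0 | 0 | 0 | 1.5625 |
| 3.0 | 0.1875 | 0.5625 | 0.5 | 0.25 | 0.0469 | 1.6875 |
| 3.5 | 0.125 | 0.4375 | 1.0 | 1 | 0.125 | 1.5312 |
| 4.0 | 0.0625 | 0.25 | 1.5 | 2.25 | 0.1406 | 1 |
| Total | 1 | 2.5 | | | 0.625 | 6.8749 |

μ_{x̄} = Σ x̄·f(x̄) = 2.5 = μ
σ_{x̄} = √[Σ (x̄ – μ_{x̄})^{2}·f(x̄)] = √0.625 ≈ 0.7906
or alternatively
σ_{x̄} = √[Σ x̄^{2}·f(x̄) – (μ_{x̄})^{2}] = √(6.8749 – 6.25) ≈ 0.7905
This agrees with σ/√n = √1.25/√2 ≈ 0.7906.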
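The whole with-replacement example can be reproduced by enumeration. The sketch below is one way to do it in Python (variable names are chosen here for illustration): it lists all N^{n} ordered samples, tabulates the distribution of the sample mean, and compares its mean and SD with μ and σ/√n.

```python
from collections import Counter
from itertools import product
from math import sqrt

population = [1, 2, 3, 4]
N, n = len(population), 2

mu = sum(population) / N
sigma = sqrt(sum((x - mu) ** 2 for x in population) / N)

# All N**n ordered samples drawn with replacement, and their means.
means = [sum(s) / n for s in product(population, repeat=n)]

# Probability distribution of the sample mean: relative frequency of each value.
dist = {m: f / len(means) for m, f in sorted(Counter(means).items())}

mean_xbar = sum(m * p for m, p in dist.items())
sd_xbar = sqrt(sum((m - mean_xbar) ** 2 * p for m, p in dist.items()))

for m, p in dist.items():
    print(m, p)
print("mean of x-bar:", mean_xbar, " population mean:", mu)
print("SD of x-bar:", round(sd_xbar, 4), " sigma/sqrt(n):", round(sigma / sqrt(n), 4))
```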
Example:
Take the data of previous example and assume sampling ‘without replacement’, and compute:
(a) Population mean,
(b) Population standard deviation,
(c) Mean of each sample,
(d) Sampling distribution table of sample mean w/o replacement, and
(e) Mean and standard deviation of sampling distribution.
Solution:
(a) and (b) Population mean and SD:
As calculated above.
(c) Mean of each sample:
No. of possible samples = ^{N}C_{n} = ^{4}C_{2} = 6 samples
Samples (without replacement) and their means:

| Sample | (1,2) | (1,3) | (1,4) | (2,3) | (2,4) | (3,4) |
|---|---|---|---|---|---|---|
| Mean (x̄) | 1.5 | 2 | 2.5 | 2.5 | 3 | 3.5 |
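(d) Sampling Distribution:
Sampling Distribution of Sample Mean (x̄) without replacement

| x̄ | f(x̄) | x̄·f(x̄) | x̄ – μ_{x̄} | (x̄ – μ_{x̄})^{2} | (x̄ – μ_{x̄})^{2}·f(x̄) | x̄^{2} | x̄^{2}·f(x̄) |
|---|---|---|---|---|---|---|---|
| 1.5 | 1/6 | 0.25 | –1 | 1 | 0.17 | 2.25 | 0.375 |
| 2 | 1/6 | 0.33 | –0.5 | 0.25 | 0.04 | 4 | 0.666 |
| 2.5 | 2/6 | 0.84 | 0 | 0 | 0 | 6.25 | 2.082 |
| 3 | 1/6 | 0.5 | 0.5 | 0.25 | 0.04 | 9 | 1.5 |
| 3.5 | 1/6 | 0.58 | 1 | 1 | 0.17 | 12.25 | 2.042 |
| Total | 1 | 2.5 | | | 0.42 | | 6.665 |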
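(e) Mean and SD of Sampling Distribution:
μ_{x̄} = Σ x̄·f(x̄) = 2.5 = μ
σ_{x̄} = √[Σ (x̄ – μ_{x̄})^{2}·f(x̄)] = √0.42 ≈ 0.65
or alternatively
σ_{x̄} = √[Σ x̄^{2}·f(x̄) – (μ_{x̄})^{2}] = √(6.665 – 6.25) ≈ 0.64
Without intermediate rounding, σ_{x̄} = √(5/12) ≈ 0.6455, which equals (σ/√n)·√[(N – n)/(N – 1)] = 0.7906 × √(2/3).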
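The without-replacement case can be checked the same way. The following sketch (names are illustrative) enumerates the ^{4}C_{2} = 6 unordered samples and compares the SD of the sample means with σ/√n multiplied by the finite population correction, taken here as √[(N – n)/(N – 1)].

```python
from itertools import combinations
from math import sqrt

population = [1, 2, 3, 4]
N, n = len(population), 2

mu = sum(population) / N
sigma = sqrt(sum((x - mu) ** 2 for x in population) / N)

# Every unordered sample without replacement is equally likely.
means = [sum(s) / n for s in combinations(population, n)]
mean_xbar = sum(means) / len(means)
sd_xbar = sqrt(sum((m - mean_xbar) ** 2 for m in means) / len(means))

fpc = sqrt((N - n) / (N - 1))            # finite population correction
corrected = sigma / sqrt(n) * fpc        # standard error with the correction applied

print("mean of x-bar:", mean_xbar)
print("SD of x-bar:", round(sd_xbar, 4), " corrected sigma/sqrt(n):", round(corrected, 4))
```

Working with unrounded values avoids the small discrepancies that two-decimal rounding introduces in the hand-computed table.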
Sampling Distribution of the Differences of Means:
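If independent random samples of sizes n_{1} and n_{2} are drawn from two populations with means μ_{1}, μ_{2} and standard deviations σ_{1}, σ_{2}, the sampling distribution of x̄_{1} – x̄_{2} has:
μ_{x̄_{1} – x̄_{2}} = μ_{1} – μ_{2}
σ_{x̄_{1} – x̄_{2}} = √(σ_{1}^{2}/n_{1} + σ_{2}^{2}/n_{2})
Provided that the sampling fractions n_{1}/N_{1} and n_{2}/N_{2} are at most 0.05, or that the sampling is with replacement.
The distribution of x̄_{1} – x̄_{2} is normal if:
(i) the samples are drawn from Normal (or Symmetrical) populations, or
(ii) n_{1} and n_{2} both are at least 30.
The distribution of ‘z’ will be standard normal:
z = [(x̄_{1} – x̄_{2}) – (μ_{1} – μ_{2})] / √(σ_{1}^{2}/n_{1} + σ_{2}^{2}/n_{2})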
Example:
Population I = {1, 2, 3, 4}
Population II = {3,4,5}
Samples drawn from each population with replacement:
n_{1} = 2
n_{2} = 2
Compute the mean of each sample, the possible differences between x̄_{1} and x̄_{2}, the sampling distribution of x̄_{1} – x̄_{2}, and the mean and SD of the sampling distribution of x̄_{1} – x̄_{2}.
Solution:
No. of possible samples from Population I = N^{n} = 4^{2} = 16 samples
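Samples I and their means (x̄_{1}):

| Sample | x̄_{1} | Sample | x̄_{1} | Sample | x̄_{1} | Sample | x̄_{1} |
|---|---|---|---|---|---|---|---|
| 1,1 | 1.0 | 1,2 | 1.5 | 1,3 | 2.0 | 1,4 | 2.5 |
| 2,1 | 1.5 | 2,2 | 2.0 | 2,3 | 2.5 | 2,4 | 3.0 |
| 3,1 | 2.0 | 3,2 | 2.5 | 3,3 | 3.0 | 3,4 | 3.5 |
| 4,1 | 2.5 | 4,2 | 3.0 | 4,3 | 3.5 | 4,4 | 4.0 |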
No. of possible samples from Population II = N^{n} = 3^{2} = 9 samples
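Samples II and their means (x̄_{2}):

| Sample | x̄_{2} | Sample | x̄_{2} | Sample | x̄_{2} |
|---|---|---|---|---|---|
| 3,3 | 3.0 | 3,4 | 3.5 | 3,5 | 4.0 |
| 4,3 | 3.5 | 4,4 | 4.0 | 4,5 | 4.5 |
| 5,3 | 4.0 | 5,4 | 4.5 | 5,5 | 5.0 |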
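Differences of Independent Sample Means (x̄_{1} – x̄_{2})
Each column is headed by a value of x̄_{1} and each row begins with a value of x̄_{2}; the entries are the differences x̄_{1} – x̄_{2}.

| x̄_{2} \ x̄_{1} | 1 | 1.5 | 2 | 2.5 | 1.5 | 2 | 2.5 | 3 | 2 | 2.5 | 3 | 3.5 | 2.5 | 3 | 3.5 | 4 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 3 | –2 | –1.5 | –1 | –0.5 | –1.5 | –1 | –0.5 | 0 | –1 | –0.5 | 0 | 0.5 | –0.5 | 0 | 0.5 | 1 |
| 3.5 | –2.5 | –2 | –1.5 | –1 | –2 | –1.5 | –1 | –0.5 | –1.5 | –1 | –0.5 | 0 | –1 | –0.5 | 0 | 0.5 |
| 4 | –3 | –2.5 | –2 | –1.5 | –2.5 | –2 | –1.5 | –1 | –2 | –1.5 | –1 | –0.5 | –1.5 | –1 | –0.5 | 0 |
| 3.5 | –2.5 | –2 | –1.5 | –1 | –2 | –1.5 | –1 | –0.5 | –1.5 | –1 | –0.5 | 0 | –1 | –0.5 | 0 | 0.5 |
| 4 | –3 | –2.5 | –2 | –1.5 | –2.5 | –2 | –1.5 | –1 | –2 | –1.5 | –1 | –0.5 | –1.5 | –1 | –0.5 | 0 |
| 4.5 | –3.5 | –3 | –2.5 | –2 | –3 | –2.5 | –2 | –1.5 | –2.5 | –2 | –1.5 | –1 | –2 | –1.5 | –1 | –0.5 |
| 4 | –3 | –2.5 | –2 | –1.5 | –2.5 | –2 | –1.5 | –1 | –2 | –1.5 | –1 | –0.5 | –1.5 | –1 | –0.5 | 0 |
| 4.5 | –3.5 | –3 | –2.5 | –2 | –3 | –2.5 | –2 | –1.5 | –2.5 | –2 | –1.5 | –1 | –2 | –1.5 | –1 | –0.5 |
| 5 | –4 | –3.5 | –3 | –2.5 | –3.5 | –3 | –2.5 | –2 | –3 | –2.5 | –2 | –1.5 | –2.5 | –2 | –1.5 | –1 |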
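Sampling Distribution of x̄_{1} – x̄_{2} with Replacement
(writing d = x̄_{1} – x̄_{2})

| d | f | f(d) | d·f(d) | d^{2}·f(d) |
|---|---|---|---|---|
| –4 | 1 | 0.00694 | –0.02776 | 0.11104 |
| –3.5 | 4 | 0.02778 | –0.09723 | 0.340305 |
| –3 | 10 | 0.06945 | –0.20835 | 0.62505 |
| –2.5 | 18 | 0.125 | –0.3125 | 0.78125 |
| –2 | 25 | 0.17361 | –0.34722 | 0.69444 |
| –1.5 | 28 | 0.19444 | –0.29166 | 0.43749 |
| –1 | 25 | 0.17361 | –0.17361 | 0.17361 |
| –0.5 | 18 | 0.125 | –0.0625 | 0.03125 |
| 0 | 10 | 0.06945 | 0 | 0 |
| 0.5 | 4 | 0.02778 | 0.01389 | 0.006945 |
| 1 | 1 | 0.00694 | 0.00694 | 0.00694 |
| Total | 144 | 1 | –1.5 | 3.20832 |

Mean of sampling distribution: Σ d·f(d) = –1.5, which equals μ_{1} – μ_{2} = 2.5 – 4.
SD of sampling distribution: √[Σ d^{2}·f(d) – (Σ d·f(d))^{2}] = √(3.20832 – 2.25) ≈ 0.979, which agrees with √(σ_{1}^{2}/n_{1} + σ_{2}^{2}/n_{2}) = √(1.25/2 + 0.6667/2) ≈ 0.979.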
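The 144 differences and their summary measures can be reproduced by enumeration. The sketch below (helper names are hypothetical) forms every pair of sample means from the two populations, then compares the mean and SD of the differences with μ_{1} – μ_{2} and √(σ_{1}^{2}/n_{1} + σ_{2}^{2}/n_{2}).

```python
from itertools import product
from math import sqrt

def pop_mean_var(pop):
    """Population mean and variance (divisor N)."""
    mu = sum(pop) / len(pop)
    var = sum((x - mu) ** 2 for x in pop) / len(pop)
    return mu, var

def sample_means(pop, n):
    """Means of all ordered samples of size n drawn with replacement."""
    return [sum(s) / n for s in product(pop, repeat=n)]

pop1, pop2 = [1, 2, 3, 4], [3, 4, 5]
n1 = n2 = 2

# Every sample mean from Population I paired with every sample mean from Population II.
diffs = [a - b for a, b in product(sample_means(pop1, n1), sample_means(pop2, n2))]
mean_d = sum(diffs) / len(diffs)
sd_d = sqrt(sum((d - mean_d) ** 2 for d in diffs) / len(diffs))

(mu1, var1), (mu2, var2) = pop_mean_var(pop1), pop_mean_var(pop2)
print(len(diffs), mean_d, round(sd_d, 4))
print(mu1 - mu2, round(sqrt(var1 / n1 + var2 / n2), 4))
```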
Shape of the Sampling Distribution of x̄:
The Central Limit Theorem describes the shape of the sampling distribution of mean. The theorem states that the sampling distribution of mean is normal if the population is normal, and approximately normal for any population if the sample size is large (as a rule of thumb, 30 or more).
The central limit theorem also specifies the relationship between μ and μ_{x̄} and the relationship between σ and σ_{x̄}.
If the sampling distribution of mean is normal, we would expect 68.27%, 95.45% and 99.73% of the sample means to lie within the intervals μ_{x̄} ± σ_{x̄}, μ_{x̄} ± 2σ_{x̄} and μ_{x̄} ± 3σ_{x̄} respectively.
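The three percentages quoted for one, two and three standard errors around the mean follow directly from the standard normal distribution. A quick check, using `statistics.NormalDist` from the Python standard library, might look like this:

```python
from statistics import NormalDist

z = NormalDist()  # standard normal distribution (mean 0, SD 1)

shares = {}
for k in (1, 2, 3):
    # Probability that a normal variable lies within k standard errors of its mean.
    shares[k] = z.cdf(k) - z.cdf(-k)
    print(f"within {k} standard error(s): {shares[k]:.2%}")
```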