History of Statistics:
Famous Statisticians and Their Contributions:
Statisticians | Contributions
John Graunt (1661) | Vital Statistics
Gerolamo Cardano (1501–1576) | Theory of Probability
Jacob Bernoulli (1654–1705) | Theory of Probability
Thomas Bayes (1763) | Theory of Probability
De Moivre (1733) | Normal Curve Equation
Adolphe Quetelet (1796–1874) | Applied Statistical Tools in Education and Sociology
Francis Galton | Applied Statistical Tools in Heredity, Eugenics and Psychology
Karl Pearson | Chi-Square Distribution
William S. Gosset (1876–1937) | Probable Error of Mean
R.A. Fisher (1890–1962) | Developed Small Sample Theory
J. Neyman (1894–1981) and E.S. Pearson (1895–1980) | Theory of Hypothesis Testing
A. Wald (1902–1950) | Statistical Decision Theory
Descriptive and Inferential Statistics:
Characteristics of Statistics:
Functions or Uses of Statistics:
Limitations of Statistics:
There are two sources of data:
(a) Primary Sources: The data published or used by an organisation which originally collects them are called ‘primary data’. The data in the Population Census reports are primary because they are collected, compiled and published by the Population Census Commission.
(b) Secondary Sources: The data published or used by an organisation other than the one which originally collected them are known as ‘secondary data’. An example is the data in the Economic Survey of Pakistan.
Methods of Collection of Primary Data:
(a) Direct Personal Observation, i.e., through individual interviews.
(b) Indirect Oral Investigation, i.e., on the evidence of persons or parties supposed to know the facts directly or indirectly.
(c) Registration is the most popular method of collecting data.
(d) Estimates Through Local Correspondents is not a formal collection of data. This method is generally used in crop or land estimates.
(e) Investigation Through Enumerators, i.e., enumerators get the forms of inquiry filled in by the informants.
(f) Mailed Questionnaire Method.
Methods of Collecting Secondary Data:
(a) Official Sources, i.e., publications of the Federal Bureau of Statistics; Ministries of Finance, Trade and Industry, Telecommunication, Education, etc.
(b) Semi-Official Sources, i.e., publications of the State Bank of Pakistan, SECP, District Councils, Municipal Committees, etc.
(c) Private Sources, i.e., publications of trade associations, Chambers of Commerce and Industry, etc.
(d) Technical and Trade Journals.
(e) Research Organisations, i.e., universities, Institute of Education and Research, Institute of Development Economics, etc.
Continuous or Discrete Variables:
Quantitative and Qualitative Data:
Errors of Measurement:
The difference between the measured value and the true value is called the error of measurement. These errors are of two types:
(a) Compensating Errors: Errors which tend to balance or cancel out in the long run are called ‘compensating errors’, ‘chance errors’ or ‘random errors’.
(b) Biased Errors: Errors which tend to occur in the same direction and are cumulative in effect are called ‘biased errors’ or ‘cumulative errors’. Such errors arise from faulty instruments or personal bias.
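The two types of error can be illustrated with a small Python sketch. The rod, its true length and all the readings are hypothetical values chosen only for illustration; the point is that errors on both sides of the true value cancel in the total, while errors in one direction accumulate.

```python
# Hypothetical example: a rod whose true length is 100 mm, measured four times.
true_value = 100

# Readings that scatter on both sides of the true value (compensating errors).
compensating_readings = [101, 99, 102, 98]

# Readings from a hypothetical faulty scale that always reads 2 mm high (biased errors).
biased_readings = [102, 102, 102, 102]


def errors(readings):
    """Error of measurement = measured value - true value."""
    return [r - true_value for r in readings]


compensating_total = sum(errors(compensating_readings))
biased_total = sum(errors(biased_readings))

print("Total compensating error:", compensating_total)
print("Total biased error:", biased_total)
```

With these made-up readings the compensating errors sum to zero, whereas the biased errors grow with every additional measurement.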
Classification of Data:
The process of arranging data into classes or categories according to some common characteristics present in the data is called ‘classification’.
Data can be classified by many characteristics, but there are four main bases of data classification. These are:
(a) Qualitative, e.g., sex, religion, marital status, race, etc.
(b) Quantitative, e.g., height, weight, income, etc.
(c) Geographical, e.g., continents, states, cities, etc.
(d) Chronological, i.e., arrangement of data by their time of occurrence, e.g., date of birth, date of joining, etc.
Types of Data Classifications:
Data can be classified by one, two or more characteristics at a time:
(a) Quantitative:
(i) One-way: when data are classified by one characteristic.
(ii) Two-way: when data are classified by two characteristics.
(iii) Three-way: when data are classified by three characteristics.
(iv) Many-way: when data are classified by many characteristics.
(b) Qualitative:
(i) Twofold or dichotomy: we may divide a characteristic into two subclasses, one possessing the characteristic and the other not possessing it. For example, the literacy and illiteracy of a country.
(ii) Threefold or trichotomy: when data are classified into three subclasses.
(iii) Manifold: when data are classified into many subdivisions.
Frequency Distribution:
(a) Frequency Distribution of Discrete Data: There are no class boundaries because discrete data do not take fractional values. If the class interval size is one, we usually take single values.
No. of children in a family | Number of families
0     | 7
1     | 3
2     | 25
3     | 16
4     | 9
5     | 4
6     | 1
Total | 65
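A table of this kind can be built from raw observations by counting how often each value occurs. The following is a minimal sketch using Python's standard `collections.Counter`; the list `children_per_family` is hypothetical raw data for ten families, not the data behind the table above.

```python
from collections import Counter

# Hypothetical raw data: number of children reported by 10 families.
children_per_family = [2, 0, 3, 2, 1, 2, 4, 3, 2, 0]

# Count how many families report each value of the discrete variable.
frequency = Counter(children_per_family)

# Print the distribution in ascending order of the variable.
print("No. of children | Number of families")
for value in sorted(frequency):
    print(f"{value} | {frequency[value]}")

total = sum(frequency.values())
print(f"Total | {total}")
```

Because the variable is discrete and the class size is one, each distinct value serves as its own class and no boundaries are needed.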
(b) Frequency Distribution of Continuous Data: Class boundaries are formed for continuous data because continuous data can take fractional values:
Heights of students in a class (inches) | Number of students
55.5–58.0  | 1
58.0–60.5  | 6
60.5–63.0  | 17
63.0–65.5  | 18
65.5–68.0  | 18
68.0–70.5  | 4
70.5–73.0  | 1
Total (Σf) | 65
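Grouping continuous observations into classes can be sketched in Python as follows. The class boundaries are those of the table above, but the `heights` list is hypothetical raw data, and the rule that a value lying exactly on a boundary goes to the higher class is an assumption of this sketch, since the notes do not state one.

```python
from bisect import bisect_right

# Class boundaries from the table above (inches).
boundaries = [55.5, 58.0, 60.5, 63.0, 65.5, 68.0, 70.5, 73.0]

# Hypothetical raw heights (inches), not the data behind the table.
heights = [57.2, 59.1, 61.4, 62.0, 64.3, 64.9, 66.7, 69.5, 61.8, 64.0]

# One counter per class; class i covers boundaries[i] <= x < boundaries[i + 1]
# (assumed convention: lower boundary included, upper boundary excluded).
frequencies = [0] * (len(boundaries) - 1)
for h in heights:
    i = bisect_right(boundaries, h) - 1
    if 0 <= i < len(frequencies):  # values outside all classes are ignored
        frequencies[i] += 1

for i, f in enumerate(frequencies):
    print(f"{boundaries[i]}-{boundaries[i + 1]} | {f}")
print(f"Total | {sum(frequencies)}")
```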
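(c) Cumulative Frequency Distribution: This is a table showing cumulative frequencies, i.e., the number of observations below each class boundary (‘less than’) or at and above it (‘and more’):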
Heights (inches) | No. of students (less-than cumulative frequency) | Heights (inches) | No. of students (greater-than cumulative frequency)
Less than 55.5 | 0  | 55.5 and more | 65
Less than 58.0 | 1  | 58.0 and more | 64
Less than 60.5 | 7  | 60.5 and more | 58
Less than 63.0 | 24 | 63.0 and more | 41
Less than 65.5 | 42 | 65.5 and more | 23
Less than 68.0 | 60 | 68.0 and more | 5
Less than 70.5 | 64 | 70.5 and more | 1
Less than 73.0 | 65 | 73.0 and more | 0
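Both cumulative columns follow from the class frequencies by running sums. A minimal Python sketch, using the class boundaries and frequencies of the heights distribution and the standard `itertools.accumulate` (the variable names are illustrative):

```python
from itertools import accumulate

# Class boundaries (inches) and class frequencies of the heights distribution.
boundaries = [55.5, 58.0, 60.5, 63.0, 65.5, 68.0, 70.5, 73.0]
frequencies = [1, 6, 17, 18, 18, 4, 1]

total = sum(frequencies)

# Less-than cumulative frequency at each boundary: running sum of the
# frequencies of all classes lying below that boundary.
less_than = [0] + list(accumulate(frequencies))

# Greater-than ("and more") cumulative frequency: whatever is not below the boundary.
more_than = [total - c for c in less_than]

for b, lt, mt in zip(boundaries, less_than, more_than):
    print(f"Less than {b} | {lt} | {b} and more | {mt}")
```

At every boundary the two cumulative frequencies add up to the total frequency, which gives a quick check on the table.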
(d) Relative Frequency Distribution: Each class frequency is expressed as a percentage of the total frequency:
Heights (inches) | Frequency (No. of students) | Relative frequency (%)
55.5–58.0 | 1  | 1 / 65 × 100 = 1.54
58.0–60.5 | 6  | 6 / 65 × 100 = 9.23
60.5–63.0 | 17 | 26.15
63.0–65.5 | 18 | 27.69
65.5–68.0 | 18 | 27.69
68.0–70.5 | 4  | 6.15
70.5–73.0 | 1  | 1.54
Total     | 65 | 100
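The calculation, relative frequency = f / Σf × 100, can be sketched in Python using the class frequencies of the heights distribution. Rounding to two decimal places is a choice made for this sketch.

```python
# Class boundaries (inches) and class frequencies of the heights distribution.
boundaries = [55.5, 58.0, 60.5, 63.0, 65.5, 68.0, 70.5, 73.0]
frequencies = [1, 6, 17, 18, 18, 4, 1]

total = sum(frequencies)

# Relative frequency (%) = class frequency / total frequency * 100.
exact = [f / total * 100 for f in frequencies]
relative = [round(x, 2) for x in exact]

for i, (f, r) in enumerate(zip(frequencies, relative)):
    print(f"{boundaries[i]}-{boundaries[i + 1]} | {f} | {r:.2f}")
print(f"Total | {total} | {sum(exact):.0f}")
```

The unrounded percentages always sum to 100; the rounded ones may differ from 100 in the last decimal place because of rounding.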
(e) Relative Cumulative Frequency Distribution: Each cumulative frequency is expressed as a percentage of the total frequency:
Heights (inches) | Less-than cumulative frequency | Relative frequency (%) | Heights (inches) | Greater-than cumulative frequency | Relative frequency (%)
Less than 55.5 | 0  | 0                     | 55.5 and more | 65 | 100
Less than 58.0 | 1  | 1 / 65 × 100 = 1.54   | 58.0 and more | 64 | 98.46
Less than 60.5 | 7  | 7 / 65 × 100 = 10.77  | 60.5 and more | 58 | 89.23
Less than 63.0 | 24 | 36.92                 | 63.0 and more | 41 | 63.08
Less than 65.5 | 42 | 64.62                 | 65.5 and more | 23 | 35.38
Less than 68.0 | 60 | 92.31                 | 68.0 and more | 5  | 7.69
Less than 70.5 | 64 | 98.46                 | 70.5 and more | 1  | 1.54
Less than 73.0 | 65 | 100                   | 73.0 and more | 0  | 0
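As a rough sketch, the relative cumulative columns can be obtained by dividing each cumulative frequency by the total frequency. The helper name `as_percent` is hypothetical, and two-decimal rounding is again a choice of this sketch.

```python
from itertools import accumulate

# Class frequencies of the heights distribution.
frequencies = [1, 6, 17, 18, 18, 4, 1]
total = sum(frequencies)


def as_percent(count):
    """Express a (cumulative) frequency as a percentage of the total frequency."""
    return round(count / total * 100, 2)


# Cumulative frequencies at each of the eight class boundaries.
less_than = [0] + list(accumulate(frequencies))
more_than = [total - c for c in less_than]

less_than_pct = [as_percent(c) for c in less_than]
more_than_pct = [as_percent(c) for c in more_than]

print(less_than_pct)
print(more_than_pct)
```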
(f) Bivariate Frequency Distribution: This involves constructing a frequency distribution of two variables at the same time:
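Weights (pounds) \ Heights (inches) | 57–59 | 60–62 | 63–65 | 66–68 | 69–71 | 72–74 | Total
100–104 | 3 | 7  | -  | - | - | - | 10
105–109 | - | 5  | 10 | 2 | 1 | - | 18
110–114 | - | -  | 4  | 6 | 4 | 0 | 14
115–119 | - | -  | 1  | 1 | 4 | 2 | 8
Total   | 3 | 12 | 15 | 9 | 9 | 2 | 50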
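A two-way table like this can be sketched in Python by assigning each observation to a weight class and a height class and counting the pairs. The `pairs` list is hypothetical data, and the helpers `weight_class` and `height_class` are illustrative names that assume whole-number measurements and the class limits used in the table above.

```python
from collections import Counter

# Hypothetical (weight in pounds, height in inches) observations.
pairs = [(101, 58), (103, 61), (107, 64), (108, 61), (112, 66),
         (113, 70), (117, 72), (106, 63), (111, 64), (118, 69)]


def weight_class(w):
    """Return the 5-pound class (e.g. '100-104') containing weight w."""
    low = 100 + 5 * ((w - 100) // 5)
    return f"{low}-{low + 4}"


def height_class(h):
    """Return the 3-inch class (e.g. '57-59') containing height h."""
    low = 57 + 3 * ((h - 57) // 3)
    return f"{low}-{low + 2}"


# Cell frequencies: number of observations in each (weight class, height class) cell.
table = Counter((weight_class(w), height_class(h)) for w, h in pairs)

# Marginal totals for the rows (weights) and the columns (heights).
row_totals = Counter()
col_totals = Counter()
for (wc, hc), n in table.items():
    row_totals[wc] += n
    col_totals[hc] += n

for wc in sorted(row_totals):
    print(wc, {hc: n for (w, hc), n in table.items() if w == wc}, "Total:", row_totals[wc])
```

The row totals and the column totals must each add up to the grand total, which is a useful check on any bivariate table.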