Measures of Central Tendency II

(d)   Median:

  1. Median is defined as the middle value of the data when the values are arranged in ascending or descending order.

  2. If there is an even number of values in the data, the average of the two middle values in the array is taken as the median:

$$\text{Median} = \frac{1}{2}\left(x_{(n/2)} + x_{(n/2+1)}\right)$$
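The two rules above can be sketched in a few lines of Python. This is only an illustration: the function name `median` and the sample values are hypothetical and not part of the notes.

```python
def median(values):
    """Return the middle value of the data after arranging it in order."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of empty data is undefined")
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                       # odd count: the middle value
    return (ordered[mid - 1] + ordered[mid]) / 2  # even count: average of the two middle values

print(median([7, 3, 9, 5, 1]))      # 5
print(median([7, 3, 9, 5, 1, 11]))  # 6.0
```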

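Median for Grouped Data:

$$\text{Median} = l + \frac{h}{f}\left(\frac{n}{2} - cf\right)$$

Where  l = lower class boundary of the median class

            h = width of the median class

            f = frequency of the median class

            cf = cumulative frequency of the class preceding the median class

            n = total number of observations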

Example:

Class Boundaries    Frequency
9.5-19.5            5
19.5-29.5           8
29.5-39.5           13
39.5-49.5           19
49.5-59.5           23
59.5-69.5           15
69.5-79.5           7
79.5-89.5           5
89.5-99.5           3
99.5-109.5          2
Total               100

Solution:

Class Boundaries    f      x        cf
9.5-19.5            5      14.5     5
19.5-29.5           8      24.5     13
29.5-39.5           13     34.5     26
39.5-49.5           19     44.5     45
49.5-59.5           23     54.5     68
59.5-69.5           15     64.5     83
69.5-79.5           7      74.5     90
79.5-89.5           5      84.5     95
89.5-99.5           3      94.5     98
99.5-109.5          2      104.5    100
Total               100

$$\frac{n}{2} = \frac{100}{2} = 50$$

The 50th value lies in the 5th class, viz., 50-59 or 49.5-59.5.  Therefore, it is the median class.  Here, l = 49.5, h = 10, f = 23, n = 100, and cf = 45.

$$\text{Median} = 49.5 + \frac{10}{23}(50 - 45) = 49.5 + 2.17 = 51.67$$
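The calculation can be checked with a short Python sketch. It assumes the usual grouped-data formula, median = l + (h / f)(n/2 - cf); the function name `grouped_median` and the list layout are illustrative choices, not part of the notes.

```python
def grouped_median(boundaries, freqs):
    """Median of grouped data: l + (h / f) * (n / 2 - cf).

    boundaries: (lower, upper) class boundaries in ascending order
    freqs: class frequencies in the same order
    """
    n = sum(freqs)
    cf = 0  # cumulative frequency of the classes before the current one
    for (lower, upper), f in zip(boundaries, freqs):
        if f > 0 and cf + f >= n / 2:   # the (n/2)th value lies in this class
            return lower + (upper - lower) / f * (n / 2 - cf)
        cf += f
    raise ValueError("no observations")

boundaries = [(9.5 + 10 * i, 19.5 + 10 * i) for i in range(10)]
freqs = [5, 8, 13, 19, 23, 15, 7, 5, 3, 2]
print(round(grouped_median(boundaries, freqs), 2))  # 51.67
```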

Median for Discrete Data:

To find the median from discrete data, we form a cumulative frequency distribution.  The median is the value corresponding to the cumulative frequency in which the $(n/2)$th value lies:

Example:

No. of children    No. of families    Cumulative frequency
0                  4                  4
1                  25                 29
2                  53                 82
3                  18                 100
4                  14                 114
5                  6                  120
Total              120

Solution:

Since the $(n/2)$th value (i.e., the 60th value) lies in the cumulative frequency (82) corresponding to 2, the median is 2.
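A minimal Python sketch of the same lookup follows, assuming the n/2 position rule. The function name `discrete_median` is hypothetical. When the cumulative frequency equals n/2 exactly, a fuller treatment would average this value with the next one; the sketch ignores that case.

```python
from itertools import accumulate

def discrete_median(values, freqs):
    """Return the value whose cumulative frequency first reaches n / 2."""
    n = sum(freqs)
    for value, cum in zip(values, accumulate(freqs)):
        if cum >= n / 2:
            return value
    raise ValueError("no observations")

children = [0, 1, 2, 3, 4, 5]
families = [4, 25, 53, 18, 14, 6]
print(list(accumulate(families)))           # [4, 29, 82, 100, 114, 120]
print(discrete_median(children, families))  # 2
```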

Graphical Location of Median:

The approximate value of the median can be located from an ogive, i.e., a cumulative frequency polygon:

Example:

Class Boundaries    f      CF
9.5-19.5            5      5
19.5-29.5           8      13
29.5-39.5           13     26
39.5-49.5           19     45
49.5-59.5           23     68
59.5-69.5           15     83
69.5-79.5           7      90
79.5-89.5           5      95
89.5-99.5           3      98
99.5-109.5          2      100
Total               100

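Solution:

[Figure: ogive of the above distribution, with cumulative frequency plotted against the upper class boundaries; the median is read from the horizontal axis at the point where the cumulative frequency equals n/2 = 50.]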

Quartiles, Deciles and Percentiles:

  1. The values which divide an arrayed set of data into four equal parts are called ‘quartiles’.

  2. The first and third quartiles are also known as lower and upper quartiles respectively.

  3. The quartiles are denoted by $Q_1$, $Q_2$ and $Q_3$.*

  4. The values which divide an arrayed set of data into ten equal parts are called ‘deciles’.

  5. The values which divide an arrayed set of data into one hundred equal parts are called ‘percentiles’.

  6. The quartiles, deciles and percentiles may be determined from grouped data in the same way as the median, except that in place of $n/2$ we use $in/4$, $jn/10$ and $kn/100$ respectively (i = 1, 2, 3; j = 1, ..., 9; k = 1, ..., 99).  For example, for the first quartile:

$$Q_1 = l + \frac{h}{f}\left(\frac{n}{4} - CF\right)$$

                                                Where              l = lower class boundary of Q1 class

                                                                        h = width of Q1 class

                                                                        f = frequency of Q1 class

                                                                        CF = cumulative frequency of the class preceding the Q1 class

  7. For discrete data, the quartiles, deciles and percentiles are determined in the same way as the median.

  8. The quartiles, deciles and percentiles may be located from an ogive in a similar way as the median:

Example:

(See the former example)

Solution:
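As an illustration, the first and third quartiles of the former example can be computed with the grouped-data formula. This is a sketch under the assumption that the position of the i-th quartile is taken as i × n/4; the function name `grouped_quantile` is hypothetical.

```python
def grouped_quantile(boundaries, freqs, position):
    """Value at a cumulative position (e.g. n/4 for Q1): l + (h / f) * (position - CF)."""
    cf = 0  # cumulative frequency of the classes before the current one
    for (lower, upper), f in zip(boundaries, freqs):
        if f > 0 and cf + f >= position:
            return lower + (upper - lower) / f * (position - cf)
        cf += f
    raise ValueError("position exceeds total frequency")

boundaries = [(9.5 + 10 * i, 19.5 + 10 * i) for i in range(10)]
freqs = [5, 8, 13, 19, 23, 15, 7, 5, 3, 2]
n = sum(freqs)

q1 = grouped_quantile(boundaries, freqs, n / 4)      # 25th value, class 29.5-39.5
q3 = grouped_quantile(boundaries, freqs, 3 * n / 4)  # 75th value, class 59.5-69.5
print(round(q1, 2), round(q3, 2))                    # 38.73 64.17
```

Passing `j * n / 10` or `k * n / 100` as the position gives the deciles and percentiles in the same way.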

(e)   Mode:

  1. The mode is defined as that value in the data which occurs the greatest number of times provided that such a value exists.

  2. If each value occurs the same number of times, then there is no mode.  If two or more values occur the same number of times but more frequently than any of the other values, then there is more than one mode.

  3. The distribution having only one mode is called ‘uni-modal distribution’, two modes ‘bi-modal distribution’, and more than two modes ‘multi-modal distribution’.
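The three cases above (one mode, several modes, no mode) can be sketched in Python as follows. The function name `modes` and the sample data are hypothetical.

```python
from collections import Counter

def modes(values):
    """Return the list of modes; empty if every value occurs equally often."""
    counts = Counter(values)
    if not counts:
        return []
    highest = max(counts.values())
    if len(counts) > 1 and all(c == highest for c in counts.values()):
        return []  # each value occurs the same number of times: no mode
    return sorted(v for v, c in counts.items() if c == highest)

print(modes([2, 3, 3, 5, 7]))     # [3]     uni-modal
print(modes([2, 2, 3, 5, 5, 7]))  # [2, 5]  bi-modal
print(modes([1, 2, 3, 4]))        # []      no mode
```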

Mode for Grouped Data:

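In the case of grouped data, the mode is defined as that value of x which corresponds to the highest point on the frequency curve. The mode is denoted by $\hat{x}$ (read as “x caret”):

$$\hat{x} = l + \frac{f_m - f_1}{(f_m - f_1) + (f_m - f_2)} \times h$$

Where l is the lower class boundary of the modal class, h is the width of the modal class, fm is the frequency of the modal class, f1 is the frequency of the preceding class, and f2 is the frequency of the following class.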

Example:

Class Boundaries    f
9.5-19.5            5
19.5-29.5           8
29.5-39.5           13
39.5-49.5           19
49.5-59.5           23
59.5-69.5           15
69.5-79.5           7
79.5-89.5           5
89.5-99.5           3
99.5-109.5          2
Total               100

Solution:

In the above table, the frequency of the 5th class (49.5-59.5) is the maximum; it is, therefore, the modal class.  Here l = 49.5; h = 10; fm = 23; f1 = 19; and f2 = 15.

$$\hat{x} = 49.5 + \frac{23 - 19}{(23 - 19) + (23 - 15)} \times 10 = 49.5 + \frac{4}{12} \times 10 = 49.5 + 3.33 = 52.83$$
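The same calculation can be sketched in Python, assuming the formula mode = l + (fm - f1) / ((fm - f1) + (fm - f2)) × h. The function name `grouped_mode` is hypothetical, and treating a missing neighbouring class as having frequency 0 is an assumption of this sketch.

```python
def grouped_mode(boundaries, freqs):
    """Mode of grouped data: l + (fm - f1) / ((fm - f1) + (fm - f2)) * h."""
    i = max(range(len(freqs)), key=lambda k: freqs[k])  # index of the modal class
    lower, upper = boundaries[i]
    fm = freqs[i]
    f1 = freqs[i - 1] if i > 0 else 0               # preceding class (assumed 0 if absent)
    f2 = freqs[i + 1] if i + 1 < len(freqs) else 0  # following class (assumed 0 if absent)
    denom = (fm - f1) + (fm - f2)
    if denom == 0:
        raise ValueError("modal class is not distinct from its neighbours")
    return lower + (fm - f1) / denom * (upper - lower)

boundaries = [(9.5 + 10 * i, 19.5 + 10 * i) for i in range(10)]
freqs = [5, 8, 13, 19, 23, 15, 7, 5, 3, 2]
print(round(grouped_mode(boundaries, freqs), 2))  # 52.83
```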

Mode for Discrete Data:

In the case of discrete data, the mode may be picked out by inspection.  It is the most common value, i.e., the value with the greatest frequency.

Relation between Mean, Median and Mode:

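  1. In a symmetrical distribution, the mean, median and mode coincide, i.e., they are equal in value.

  2. In moderately skewed distributions, the following approximate relation holds:

$$\text{Mean} - \text{Mode} = 3(\text{Mean} - \text{Median})$$

 or

$$\text{Mode} = 3\,\text{Median} - 2\,\text{Mean}$$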
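The empirical relation for moderately skewed distributions, Mean - Mode = 3(Mean - Median), can be rearranged to estimate the mode from the other two averages. The sketch below uses made-up values, and the function name `empirical_mode` is hypothetical; the result is only an approximation.

```python
def empirical_mode(mean, median):
    """Approximate mode from the relation: mean - mode = 3 * (mean - median)."""
    return 3 * median - 2 * mean

# Hypothetical averages of a moderately skewed distribution.
print(empirical_mode(mean=50.0, median=48.0))  # 44.0
```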

Advantages and Disadvantages of Averages:

Arithmetic Mean:

Advantages:

  1. The arithmetic mean is rigidly defined by a mathematical formula.

  2. It is the most widely used and most commonly understood of all the averages.

  3. It is easy to calculate and is determinate in almost every case.

  4. It depends on all the values of the data and a change in any value changes the value of the mean.

  5. It is capable of further algebraic manipulation.

  6. It is a relatively stable measure.

Disadvantages:

  1. The mean is greatly influenced by extreme values, especially by extremely large ones.

  2. It is not an appropriate average for highly skewed distributions, e.g., distributions of wages or incomes, etc., and U-shaped distributions.

  3. It cannot be accurately computed in case of an open-end frequency table.

  4. It may locate the value at a point at which few or none of the actual observations lie.
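The effect of an extreme value on the arithmetic mean can be seen in a small Python sketch. The wage figures are hypothetical; the median is printed alongside for comparison.

```python
from statistics import mean, median

wages = [300, 320, 340, 360, 380]  # hypothetical wages
with_outlier = wages + [5000]      # one extremely large value added

print(mean(wages), median(wages))                            # 340 340
print(round(mean(with_outlier), 2), median(with_outlier))    # 1116.67 350.0
```

The single large value pulls the mean well above every ordinary wage, so the mean falls at a point where none of the observations lie.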

Geometric Mean:

Advantages:

  1. The GM is rigidly defined by a mathematical formula.

  2. It is based on all the values.

  3. It is less affected by extremely large values than the AM is.

  4. It is capable of further algebraic manipulation.

  5. It gives equal weight to all the values.

  6. It is not much affected by fluctuations of sampling.

  7. It is used in finding the average of values which are in geometric progression.

  8. It is the appropriate average for averaging the rates of change (e.g., the rate of change in income, population, etc.) and ratios (e.g., price indices).

Disadvantages:

  1. It is neither easy to calculate nor to understand.

  2. It vanishes if any item in the data is zero.

  3. It cannot be computed if any value is negative.

  4. It may locate the value at a point at which few or none of the actual values lie.
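A short Python sketch illustrates the use of the GM for rates of change, together with the zero and negative-value cases listed above. The function name `geometric_mean` and the growth factors are hypothetical; the logarithmic form is one common way to compute the n-th root of the product.

```python
import math

def geometric_mean(values):
    """n-th root of the product of n values, computed via logarithms."""
    if any(v < 0 for v in values):
        raise ValueError("GM cannot be computed if any value is negative")
    if any(v == 0 for v in values):
        return 0.0  # the product, and hence the GM, vanishes
    return math.exp(sum(math.log(v) for v in values) / len(values))

growth = [1.10, 1.20, 0.90]  # hypothetical yearly growth factors
print(round(geometric_mean(growth), 4))  # 1.0591
print(geometric_mean([3, 0, 5]))         # 0.0
```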

Harmonic Mean:

Advantages:

  1. HM is rigidly defined by a mathematical formula.

  2. It is based on all the values.

  3. It is capable of further algebraic manipulation.

  4. It is not much affected by fluctuations of sampling.

  5. It is an appropriate average for averaging time rates (e.g., speeds per hour) and ratios (e.g., units purchased per rupee, etc.).

Disadvantages:

  1. It is neither simple to understand nor easy to calculate.

  2. The HM is greatly influenced by extremely small values.

  3. It cannot be determined if any value in the data is zero.
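The use of the HM for averaging speeds can be sketched as follows. The example assumes equal distances travelled at each speed; the function name `harmonic_mean` and the speeds are hypothetical.

```python
def harmonic_mean(values):
    """Reciprocal of the arithmetic mean of the reciprocals."""
    if any(v == 0 for v in values):
        raise ValueError("HM cannot be determined if any value is zero")
    return len(values) / sum(1 / v for v in values)

# Equal distances covered at 40 and 60 km per hour.
speeds = [40, 60]
print(round(harmonic_mean(speeds), 2))  # 48.0
```

The average speed over the whole journey is 48 km per hour, below the arithmetic mean of 50, because more time is spent at the lower speed.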

Median:

Advantages:

  1. The median is simple to understand and easy to calculate.

  2. It is not affected by extremely large or extremely small values.

  3. It can be computed from an open-end frequency table.

  4. It is the most appropriate average in a highly skewed distribution, e.g., the distribution of wages, incomes, etc.

  5. It can be located even if the items are not capable of quantitative measurement.  For instance, we may arrange a number of pieces of blue cloth in order of intensity of their colour and find the piece with the median colour.

  6. It is not affected by changes in the values of the extreme items, so long as their order relative to the middle value is unchanged.

  7. The sum of absolute deviations (i.e., deviations taken ignoring negative signs) is smaller when measured from the median than from any other average.

Disadvantages:

  1. It is not rigidly defined.

  2. It is not based on all the values.

  3. It is not capable of further algebraic manipulation.

  4. It is necessary to arrange the values in an array before finding the median, which is tedious work.
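The property that the sum of absolute deviations is smallest about the median can be checked numerically. The data below are hypothetical, and `sum_abs_dev` is an illustrative helper.

```python
from statistics import mean, median

def sum_abs_dev(values, centre):
    """Sum of deviations from `centre`, ignoring negative signs."""
    return sum(abs(v - centre) for v in values)

data = [2, 3, 5, 8, 20]  # hypothetical data
print(sum_abs_dev(data, median(data)))         # 23
print(round(sum_abs_dev(data, mean(data)), 1)) # 25.6
```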

Mode:

Advantages:

  1. It is simple to understand and easy to calculate.  It can be located simply by inspection in discrete distributions.

  2. It is not influenced by extremely large or extremely small values.

  3. It can be determined even in an open-end frequency table.

Disadvantages:

  1. It is not well-defined.  Sometimes, a distribution may have no mode at all or it may have more than one mode.

  2. It is not based on all the values.

  3. It is not capable of further algebraic manipulation.

  4. There will be no well-defined mode if the distribution consists of a small number of values.


* It should be noted here that $Q_2 = \text{Median}$.

The relationship between the quartiles, deciles and percentiles is expressed as $Q_2 = D_5 = P_{50} = \text{Median}$.
