Mathematics: Measure of Dispersion

A central value such as the mean is generally used to convey the general behavior of a data set. For example, the average score of a class in a math test hints at the general comfort level of the class with the topic tested. But if two classes have the same average score, can the teacher conclude that the level of understanding is the same for both classes? The arithmetic mean does not convey the variation in the individual marks of the students. The teacher needs some idea of the spread of the marks in both classes, that is, a numerical measure of dispersion that conveys how the marks are spread about the central value, the mean in this case.

Dispersion is useful for comparing sets of data. Two of the most widely used measures of dispersion are the standard deviation and the variance, so both are explained in detail here.

Standard Deviation: The standard deviation can be defined as a measure of the variability of a given data set. It indicates how far, on average, the values in the data set lie from the mean. Measures of dispersion include the average deviation, the variance and the standard deviation. Of these, the standard deviation and the variance are the most widely used.

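Formula for Variance and Standard Deviation:

The formulas for the variance and the standard deviation are as follows:

Variance = \sigma^{2} = \frac{\sum (x_{r} - \mu)^{2}}{N}

Standard deviation = \sigma = \sqrt{\sigma^{2}}

Here x_{r} are the data values, \mu is their mean and N is the number of values.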
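The formulas above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names and the small data set are assumptions made for the example and are not part of the original text.

```python
import math

def population_variance(values):
    """Average of the squared deviations from the mean (divides by N)."""
    n = len(values)
    mu = sum(values) / n
    return sum((x - mu) ** 2 for x in values) / n

def population_std(values):
    """Square root of the population variance."""
    return math.sqrt(population_variance(values))

# Hypothetical data, chosen only so that the arithmetic is easy to follow.
data = [2, 4, 4, 4, 5, 5, 7, 9]
variance = population_variance(data)
std_dev = population_std(data)
print(variance, std_dev)  # 4.0 2.0
```

The mean of this data is 5, the squared deviations add up to 32, and 32 / 8 = 4, so the standard deviation is 2.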

Dispersion Definition

Dispersion is the tendency of data to be scattered over a range. It is an important feature of a frequency distribution and is also called spread or variation. Common measures of dispersion are the range, the inter quartile range and quartile deviation, the average deviation or mean deviation, the variance and the standard deviation.

Data Dispersion

Measures of dispersion describe how spread out a set of data is. It is important to know the amount of dispersion, variation or spread, because when data is more dispersed, a central value such as the mean is less representative of the individual values. Which measure of dispersion is appropriate depends on the type of scale used to measure the data, that is, whether it is quantitative or categorical.

Degree of Dispersion

Dispersion can provide additional information about the composition of the data. In many settings a high degree of uniformity is a desirable quality. Measures of dispersion are useful for comparing two or more distributions with respect to their disparities. A greater degree of dispersion means a lack of uniformity or homogeneity in the data, while a low degree of dispersion indicates uniformity or homogeneity.
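As a rough illustration of comparing two distributions, the sketch below uses two made-up sets of marks that share the same mean but differ greatly in spread. The class names and marks are hypothetical, and the standard library's statistics.pstdev (population standard deviation) is used as the measure of dispersion.

```python
from statistics import mean, pstdev

# Hypothetical marks for two classes with the same average.
class_a = [58, 59, 60, 61, 62]    # marks clustered tightly around 60
class_b = [20, 40, 60, 80, 100]   # marks scattered widely around 60

mean_a, mean_b = mean(class_a), mean(class_b)
spread_a, spread_b = pstdev(class_a), pstdev(class_b)

print(mean_a, mean_b)                          # both means are 60
print(round(spread_a, 2), round(spread_b, 2))  # class_b has the far larger spread
```

The means alone cannot tell the two classes apart; the standard deviations show that class_a is much more uniform than class_b.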

Measures of Dispersion

These measures of dispersion are used to convey the spread of a probability distribution, and to find the variability when the data is given as a frequency distribution. The following are all measures of dispersion:

1. Range

2. Inter Quartile Range

3. Mean absolute Deviation

4. Variance

5. Standard deviation

6. Coefficient of variation

Range

Range is the difference between the largest and the smallest values.

Range = Largest Value - Smallest Value

For example, if the lowest and highest marks scored in a test are 15 and 95, the range = 95 - 15 = 80.

Even though the range is the easiest measure of dispersion to calculate, it is not considered a good measure of dispersion, as it uses only the two extreme values and ignores all other information about the spread. Outliers, whether extremely low or extremely high values, can affect the range considerably. In the case of the test marks, if the next lowest mark happens to be 25, then the mark 15 is an extreme low value which pushes the range up by 10 marks.
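The effect of a single extreme value on the range can be sketched as follows. Only the marks 15, 25 and 95 come from the example above; the other marks and the helper name data_range are assumptions made for illustration.

```python
def data_range(values):
    """Difference between the largest and the smallest value."""
    return max(values) - min(values)

# Hypothetical marks; only 15, 25 and 95 are taken from the example.
marks = [15, 25, 40, 60, 75, 95]
marks_without_lowest = [m for m in marks if m != 15]

full_range = data_range(marks)                    # 95 - 15
trimmed_range = data_range(marks_without_lowest)  # 95 - 25
print(full_range, trimmed_range)  # 80 70
```

Removing the single mark of 15 changes the range by 10 marks, even though every other value is untouched.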

Inter-Quartile Range

The Inter-Quartile range is a measure of dispersion which is not affected by the outliers of the data, but nevertheless conveys the idea of range. It measures the spread of the middle 50% of an ordered data set. In other words, it is the numerical difference between the third and the first quartiles.

Inter quartile Range = Q₃ – Q₁

Mean absolute Deviation

This measure of dispersion is the average of the absolute deviations of the data values from the mean. It is computed using the formula,

\frac{\sum |x_{i} - \bar{x}|}{n}

where \bar{x} is the arithmetic mean of the data set and n is the number of values.

So, what does this measure of dispersion tell us about the spread? A data set with a larger mean absolute deviation is more spread out than a data set with a smaller mean absolute deviation. The mean absolute deviation is affected by outliers, though less strongly than the variance, because the deviations are not squared.
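A minimal sketch of the mean absolute deviation, assuming a small hypothetical data set; the function name mean_absolute_deviation is illustrative and not taken from any library.

```python
def mean_absolute_deviation(values):
    """Average of the absolute deviations from the arithmetic mean."""
    n = len(values)
    mean = sum(values) / n
    return sum(abs(x - mean) for x in values) / n

# Hypothetical data with mean 5.
data = [2, 4, 4, 4, 5, 5, 7, 9]
mad = mean_absolute_deviation(data)
print(mad)  # 1.5
```

The absolute deviations here are 3, 1, 1, 1, 0, 0, 2 and 4, which add up to 12, and 12 / 8 = 1.5.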

Variance and Standard deviation

The variance and the standard deviation are two measures of dispersion which are widely applied because they make use of every value in the data. The deviation about the mean for each value in the data set is taken into account. The variance is the average of the squared deviations, and the standard deviation is its square root.

The illustrations below explain the behavior of data sets with larger and smaller variances.

Solved Example:

Question 1: The Math test marks of a class are as follows

52, 45, 25, 75, 63, 86, 72, 85, 55, 65, 70, 82, 90, 48, 68, 86, 65, 64, 78, 75, 32, 42. Find the inter quartile range.

Solution:

Given data:
52, 45, 25, 75, 63, 86, 72, 85, 55, 65, 70, 82, 90, 48, 68, 86, 65, 64, 78, 75, 32, 42

Step 1:

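If the marks are arranged in ascending order:

25, 32, 42, 45, 48, 52, 55, 63, 64, 65, 65, 68, 70, 72, 75, 75, 78, 82, 85, 86, 86, 90

There are 22 values, so the median Q₂ lies midway between the 11th and 12th values: Q₂ = (65 + 68) / 2 = 66.5. The first quartile Q₁ is the median of the lower 11 values, which is the 6th value, 52. The third quartile Q₃ is the median of the upper 11 values, which is the 17th value, 78.

Q₁ = 52, Q₂ = 66.5, Q₃ = 78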

Step 2:

Inter Quartile range = Q₃ - Q₁ = 78 - 52 = 26

This tells us that the middle 50% of the test scores are dispersed over a range of 26 marks.
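The calculation in Question 1 can be checked with a short Python sketch. It assumes the "median of each half" convention for quartiles, which matches the values used in the solution; other quartile conventions (for example, those based on interpolation) can give slightly different results. The helper name quartiles_by_halves is hypothetical.

```python
from statistics import median

marks = [52, 45, 25, 75, 63, 86, 72, 85, 55, 65, 70,
         82, 90, 48, 68, 86, 65, 64, 78, 75, 32, 42]

def quartiles_by_halves(values):
    """Return (Q1, Q3) as the medians of the lower and upper halves."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    # For an odd count, the middle value is left out of both halves.
    upper = ordered[half:] if n % 2 == 0 else ordered[half + 1:]
    return median(lower), median(upper)

q1, q3 = quartiles_by_halves(marks)
iqr = q3 - q1
print(q1, q3, iqr)  # 52 78 26
```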

Question 2: Find the standard deviation of the data set 5, 10, 15, 20, 25, 30, 35, 40, 45, 50.

Solution: Given data, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50.

Step 1:

The first step is to compute the mean:

\bar{x} = \frac{5 + 10 + 15 + 20 + 25 + 30 + 35 + 40 + 45 + 50}{10} = \frac{275}{10} = 27.5

Step 2:

Now we build a table of the deviations and squared deviations, and then total the squared deviations column.

x	Deviations (x - x̄)	Squared Deviations (x - x̄)²
5	-22.5	506.25
10	-17.5	306.25
15	-12.5	156.25
20	-7.5	56.25
25	-2.5	6.25
30	2.5	6.25
35	7.5	56.25
40	12.5	156.25
45	17.5	306.25
50	22.5	506.25
Total		2062.5

Step 3:

We substitute the total found in the last column of the table and n = 10 in the formula for variance,

\sigma^{2} = \frac{\sum (x_{i} - \mu)^{2}}{n} = \frac{2062.5}{10} = 206.25

And the standard deviation is

\sigma = \sqrt{206.25} \approx 14.36
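The result of Question 2 can be cross-checked with Python's standard statistics module. This sketch assumes the population forms pvariance and pstdev, which divide by n as in the solution, rather than the sample forms variance and stdev, which divide by n - 1.

```python
from statistics import mean, pstdev, pvariance

data = [5, 10, 15, 20, 25, 30, 35, 40, 45, 50]

mu = mean(data)              # arithmetic mean
variance = pvariance(data)   # population variance (divides by n)
std_dev = pstdev(data)       # population standard deviation

print(mu, variance, round(std_dev, 2))  # 27.5 206.25 14.36
```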

Dispersion Graph

Dispersion is a measure of data variability. It influences the confidence that an analyst can have in the representativeness and reliability of measures of central location. The term dispersion graph is used for two kinds of plot. In the first, the graph describes the relationship between two variables and gives a simple illustration of how one variable can influence the other. When constructing such a graph, first clearly define the variables that are to be evaluated, then plot the data pairs using the horizontal axis for the probable cause and the vertical axis for the probable effect. In the second, the graph places individual data values along a number line, thereby showing the position of each data value in relation to all the other data values.

Coefficient of Dispersion

A measure of dispersion gives an idea of how the values in the data are scattered about the average. The range, mean deviation and standard deviation are three important measures of dispersion. The coefficient of dispersion is a pure number, independent of the units of measurement. When consistency is the goal, a small measure of dispersion is desirable: a small value means that the values in the data are more or less consistent, centering on their average.

Types of Dispersion

Measures of dispersion indicate how spread out the values are around the central value. To study dispersion, we need measures which show whether the dispersion is small or large.

There are two types of measures of dispersion:
(a) Absolute Measure of Dispersion
(b) Relative Measure of Dispersion

Absolute Measure of Dispersion

Absolute dispersion is the actual amount of variation, as determined from the standard deviation or another measure of dispersion. Absolute measures of dispersion are expressed in the same units as the data (the variance, being an average of squared deviations, is in squared units). They can be used to compare the dispersion of two or more sets of data that are measured in the same units. The absolute measures which are commonly used are:

The Range
The Quartile Deviation
The Mean Deviation
The Standard deviation and Variance

Relative Measure of Dispersion

Relative measures of dispersion are useful in comparing two sets of data which have different units of measurement. Relative dispersion is the ratio of absolute dispersion to the average. Because the numerator and denominator of the ratio have the same units, the resulting measure of relative dispersion has no units. If the absolute dispersion is the standard deviation and the average is the mean, then the relative dispersion is called the coefficient of variation or coefficient of dispersion.

It is denoted by V and is given by:

Coefficient of variation (V) = \frac{\text{Standard Deviation}}{\text{Mean}}
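The unit-free nature of the coefficient of variation can be illustrated with a short sketch. The height data and the function name coefficient_of_variation are hypothetical, and the population standard deviation (statistics.pstdev) is assumed as the absolute measure.

```python
from statistics import mean, pstdev

def coefficient_of_variation(values):
    """Ratio of the population standard deviation to the mean."""
    return pstdev(values) / mean(values)

# Hypothetical heights, first in centimetres and then in metres.
heights_cm = [150, 160, 170, 180, 190]
heights_m = [h / 100 for h in heights_cm]

cv_cm = coefficient_of_variation(heights_cm)
cv_m = coefficient_of_variation(heights_m)
print(round(cv_cm, 4), round(cv_m, 4))  # the two values agree
```

Changing the unit rescales the standard deviation and the mean by the same factor, so their ratio stays the same. This is why the coefficient of variation can be compared across data sets measured in different units.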

The relative measures of dispersion are:

Coefficient of Dispersion
Coefficient of Quartile Deviation
Coefficient of Mean Deviation
Coefficient of Standard Deviation
Coefficient of Variation

Composite Dispersion

The composite dispersion number measures the range of returns from high to low. This number is meaningful because many, if not all, performance-related statistics are calculated from the reported composite average, and expectations are then based on the average weighted return in the composite. The composite internal dispersion is a measure of the variability, around the composite return, of portfolio-level returns for only those portfolios that are included in the composite for the full year.