Wednesday, 13 July 2016

Measure of Dispersion

A central value such as the mean is generally used to convey the typical behavior of a data set. For example, the average score of a class in a math test hints at the class's general comfort level with the topic tested. But if two classes have the same average score, can the teacher conclude that their level of understanding is the same? The arithmetic mean does not convey the variation in the individual marks of the students. The teacher needs some idea of the spread of the marks in both classes: a numerical measure of dispersion that conveys how the marks are spread about the central value, the mean in this case.

Dispersion is useful for comparing sets of data. There are several measures of dispersion; the two most widely used are the standard deviation and the variance, so we will explain both in detail.


Standard Deviation: The standard deviation is a measure of the variability of a given data set: it describes how far the values typically lie from the mean. Measures of dispersion include the average deviation, the variance and the standard deviation. Of these, the standard deviation and the variance are the most widely used.

Formula for Standard Deviation Dispersion:
The formulas for the variance and the standard deviation are as follows:
Variance = $\sigma^2 = \frac{\sum (x_r - \mu)^2}{N}$
Standard deviation = $\sigma = \sqrt{\sigma^2}$
where $\mu$ is the mean of the data and $N$ is the number of values.
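The two formulas above can be sketched in a few lines of Python. This is only an illustration: the function names and the sample marks are assumptions made for the example, not part of the original text.

```python
import math

def population_variance(values):
    """Average of the squared deviations from the mean (divisor N)."""
    n = len(values)
    mu = sum(values) / n
    return sum((x - mu) ** 2 for x in values) / n

def population_std(values):
    """Square root of the population variance."""
    return math.sqrt(population_variance(values))

# Hypothetical data set chosen so the arithmetic is easy to follow.
marks = [2, 4, 4, 4, 5, 5, 7, 9]
print(population_variance(marks))  # 4.0
print(population_std(marks))       # 2.0
```

The mean of the sample marks is 5, the squared deviations add up to 32, and 32 / 8 = 4, so the standard deviation is 2.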

Dispersion Definition

Dispersion is the tendency of data to be scattered over a range. It is an important feature of a frequency distribution and is also called spread or variation. Common measures of dispersion are the range, the inter-quartile range and quartile deviation, the average deviation (or mean deviation), the variance and the standard deviation.


Data Dispersion 

Measures of dispersion describe how spread out a set of data is. It is important to know the amount of dispersion, variation or spread, because data that is more dispersed is less reliable for analytical purposes. The appropriate measure of dispersion depends on the type of scale used to measure the data, that is, whether it is quantitative or categorical.



Degree of Dispersion

Dispersion gives additional information about the composition of data. In many settings a high degree of uniformity is a desirable quality. Measures of dispersion are useful for comparing two or more distributions with respect to their disparities. A greater degree of dispersion means a lack of uniformity or homogeneity in the data, while a low degree of dispersion indicates uniformity or homogeneity.

Measures of Dispersion


These measures of dispersion convey the spread of a probability distribution, and they are also used to find the variability when the data is given as a frequency distribution. The measures discussed here are:

1. Range
2. Inter Quartile Range
3. Mean absolute Deviation
4. Variance
5. Standard deviation
6. Coefficient of variation

Range
Range is the difference between the largest and the smallest values.
Range = Largest Value - Smallest Value
For example, if the lowest and highest marks scored in a test are 15 and 95, the range = 95 - 15 = 80.
Even though the range is the easiest measure of dispersion to calculate, it is not considered a good measure because it uses only the two extreme values and ignores the rest of the data. Outliers, whether extremely low or extremely high values, can affect the range considerably. In the case of the test marks, if the next lowest mark happens to be 25, then the mark 15 is an extreme low value that pushes the range up by 10 marks.
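The effect of a single outlier on the range can be sketched as follows. Only the marks 15, 25 and 95 come from the text; the other values and the function name are made up for illustration.

```python
def value_range(values):
    """Difference between the largest and the smallest value."""
    return max(values) - min(values)

# 15, 25 and 95 are from the example above; the middle marks are hypothetical.
marks = [15, 25, 40, 60, 72, 95]

print(value_range(marks))      # 80, with the extreme low mark of 15
print(value_range(marks[1:]))  # 70, once the mark of 15 is left out
```

Dropping one value changes the range by 10 marks, while the marks in the middle have no influence on it at all.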

Inter-Quartile Range
Inter-quartile range is a measure of dispersion that is not affected by the outliers of the data but nevertheless conveys the idea of range. It measures the spread of the middle 50% of an ordered data set. In other words, it is the numerical difference between the third and the first quartiles.
Inter-Quartile Range = Q3 - Q1
Mean absolute Deviation
This measure of dispersion is the average of the absolute deviations of the data values from the mean. It is computed using the formula
$\text{MAD} = \frac{\sum |x_i - \bar{x}|}{n}$, where $\bar{x}$ is the arithmetic mean of the data set.
So, what does this measure of dispersion tell about the spread? A data set with a larger mean absolute deviation is more spread out than a data set with a smaller one. The mean absolute deviation is sensitive to outliers.
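As a rough illustration of this measure, the sketch below compares two data sets that share the same mean. Both data sets and the function name are assumptions introduced for the example.

```python
def mean_absolute_deviation(values):
    """Average absolute distance of the values from their mean."""
    n = len(values)
    mean = sum(values) / n
    return sum(abs(x - mean) for x in values) / n

# Two hypothetical data sets, both with mean 50.
tight = [48, 49, 50, 51, 52]
wide = [10, 30, 50, 70, 90]

print(mean_absolute_deviation(tight))  # 1.2
print(mean_absolute_deviation(wide))   # 24.0
```

The second data set has the larger value, which matches the intuition that it is more spread out around the common mean.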
Variance and Standard deviation
The variance and the standard deviation are two dispersion measures which are widely applied because they use every value in the data. The deviation about the mean is taken into account for each value in the data set. The variance is the average of the squared deviations, and the standard deviation is the square root of the variance.

The illustrations below explain the behavior of data sets with larger and smaller variances. 


Solved Example:

Question 1: The Math test marks of a class are as follows
52, 45, 25, 75, 63, 86, 72, 85, 55, 65, 70, 82, 90, 48, 68, 86, 65, 64, 78, 75, 32, 42. Find the inter quartile range.

Solution:
Given data:
52, 45, 25, 75, 63, 86, 72, 85, 55, 65, 70, 82, 90, 48, 68, 86, 65, 64, 78, 75, 32, 42

Step 1:
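Arrange the marks in ascending order:
25, 32, 42, 45, 48, 52, 55, 63, 64, 65, 65, 68, 70, 72, 75, 75, 78, 82, 85, 86, 86, 90
There are 22 marks, so the median Q2 is the mean of the 11th and 12th values: (65 + 68)/2 = 66.5. Q1 is the median of the lower 11 values, Q1 = 52, and Q3 is the median of the upper 11 values, Q3 = 78.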

Step 2:
Inter-Quartile Range = Q3 - Q1 = 78 - 52 = 26
This tells us that the middle 50% of the test scores are dispersed over a range of 26 marks.
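The quartiles in this solution can be reproduced with a short Python sketch that takes Q1 and Q3 as the medians of the lower and upper halves of the sorted data. That convention is an assumption inferred from the values 52 and 78; other quartile conventions, including the ones offered by `statistics.quantiles`, can give slightly different numbers. The function name is hypothetical.

```python
from statistics import median

def quartiles_by_halves(values):
    """Return (Q1, Q2, Q3), taking Q1 and Q3 as medians of the two halves.

    Assumes at least two values; the middle value is left out of both
    halves when the number of values is odd.
    """
    ordered = sorted(values)
    half = len(ordered) // 2
    lower = ordered[:half]
    upper = ordered[-half:]
    return median(lower), median(ordered), median(upper)

marks = [52, 45, 25, 75, 63, 86, 72, 85, 55, 65, 70,
         82, 90, 48, 68, 86, 65, 64, 78, 75, 32, 42]

q1, q2, q3 = quartiles_by_halves(marks)
print(q1, q2, q3)  # 52 66.5 78
print(q3 - q1)     # 26
```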
Question 2: Find the standard deviation of the data set 5,10,15, 20, 25, 30, 35, 40, 45, 50.


Solution: Given data, 5,10,15, 20, 25, 30, 35, 40, 45, 50.

Step 1:
The first step is to compute the mean: $\bar{x} = \frac{5+10+15+20+25+30+35+40+45+50}{10} = \frac{275}{10} = 27.5$


Step 2:
Now we build a table calculating the deviations and squared deviations. Finally we total up the squared deviations column.
x     | Deviation (x - x̄) | Squared deviation (x - x̄)²
5     | -22.5             | 506.25
10    | -17.5             | 306.25
15    | -12.5             | 156.25
20    | -7.5              | 56.25
25    | -2.5              | 6.25
30    | 2.5               | 6.25
35    | 7.5               | 56.25
40    | 12.5              | 156.25
45    | 17.5              | 306.25
50    | 22.5              | 506.25
Total |                   | 2062.5

Step 3:
We plug the total of the last column of the table and n = 10 into the formula for variance:
$\sigma^2 = \frac{\sum (x_i - \mu)^2}{n} = \frac{2062.5}{10} = 206.25$
and the standard deviation is $\sigma = \sqrt{206.25} \approx 14.36$.
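As a cross-check, the same figures can be obtained from Python's standard `statistics` module, whose `pvariance` and `pstdev` functions use the population formulas (divisor n). This is a verification sketch, not part of the original solution.

```python
import statistics

data = [5, 10, 15, 20, 25, 30, 35, 40, 45, 50]

mean = statistics.mean(data)           # arithmetic mean
variance = statistics.pvariance(data)  # population variance, divisor n
sd = statistics.pstdev(data)           # population standard deviation

print(mean, variance, round(sd, 2))  # 27.5 206.25 14.36
```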

Dispersion Graph


Dispersion is a measure of data variability. It influences the confidence that an analyst can have in the representativeness and reliability of measures of central location. A dispersion graph, in the sense of a scatter plot, describes the relationship between two variables and gives a simple illustration of how one variable can influence the other. When constructing such a graph, first clearly define the variables to be evaluated, then plot the data pairs using the horizontal axis for the probable cause and the vertical axis for the probable effect. For a single variable, a dispersion graph places the individual data values along a number line, thereby showing the position of each value in relation to all the others.



Coefficient of Dispersion


A measure of dispersion gives an idea of the scattering of the values in the data about the average. The range, mean deviation and standard deviation are three important measures of dispersion. The coefficient of dispersion is a pure number, independent of the units of measurement. For any data, a smaller measure of dispersion is generally desirable: a small value means that the values in the data are more or less consistent, centering on their average.

Types of Dispersion


Measures of dispersion indicate how spread out the values are around the central value. For the study of dispersion, we need some measures which show whether the dispersion is small or large.

There are two types of measure of dispersion:
(a) Absolute Measure of Dispersion
(b) Relative Measure of Dispersion


Absolute Measure of Dispersion

Absolute dispersion is the actual variation, as determined from the standard deviation or another measure of dispersion. Absolute measures of dispersion are expressed in the same unit as the data. They can be used to compare the dispersion in two or more sets of data measured in the same unit. The absolute measures which are commonly used are:


  • The Range
  • The Quartile Deviation
  • The Mean Deviation
  • The Standard deviation and Variance


Relative Measure of Dispersion 

Relative measures of dispersion are useful for comparing two sets of data which have different units of measurement. Relative dispersion is the ratio of the absolute dispersion to the average. Because the numerator and denominator of the ratio have the same units, the resulting measure of relative dispersion has no units. If the absolute dispersion is the standard deviation and the average is the mean, then the relative dispersion is called the coefficient of variation or coefficient of dispersion.

It is denoted by V and is given by:

Coefficient of variation (V) = $\frac{\sigma}{\bar{x}}$, often multiplied by 100 to express it as a percentage.


The relative measures of dispersion are:
  • Coefficient of dispersion.
  • Coefficient of Quartile Deviation
  • Coefficient of Mean Deviation
  • Coefficient of Standard Deviation
  • Coefficient of Variation
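The coefficient of variation described above, the standard deviation divided by the mean, can be sketched as follows. The height and weight figures and the function name are hypothetical; they only illustrate comparing data measured in different units.

```python
import statistics

def coefficient_of_variation(values):
    """Population standard deviation divided by the mean (unit-free)."""
    return statistics.pstdev(values) / statistics.mean(values)

# Hypothetical measurements in different units.
heights_cm = [150, 160, 170, 180, 190]
weights_kg = [50, 60, 70, 80, 90]

print(round(coefficient_of_variation(heights_cm), 3))  # 0.083
print(round(coefficient_of_variation(weights_kg), 3))  # 0.202
```

Both data sets have the same standard deviation in their own units, yet the weights vary more relative to their mean, which is the kind of comparison a relative measure makes possible.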

Composite Dispersion


The composite dispersion number describes the range of returns from high to low. This number is meaningful because many, if not all, performance-related statistics are calculated from the reported composite average, and expectations are then based on the average weighted return in the composite. The composite internal dispersion is a measure of the variability of portfolio-level returns around the composite return, computed only for those portfolios that are included in the composite for the full year.








Examples of Dispersion


Given below are some of the examples on Standard Deviation Dispersion:

Solved Examples

Question 1: Calculate the variance and also standard deviation for the following values: 1, 3, 5, 6, 6, 8, 9, and 10.
Solution:

The mean = $\bar{X} = \frac{1+3+5+6+6+8+9+10}{8} = \frac{48}{8} = 6$
Therefore, $\bar{X} = 6$.

Step 1: Subtract the mean from each value and square the result.
X     | X - X̄       | (X - X̄)²
1     | 1 - 6 = -5  | 25
3     | 3 - 6 = -3  | 9
5     | 5 - 6 = -1  | 1
6     | 6 - 6 = 0   | 0
6     | 6 - 6 = 0   | 0
8     | 8 - 6 = 2   | 4
9     | 9 - 6 = 3   | 9
10    | 10 - 6 = 4  | 16
Total |             | 64

The sum of the squared deviations is 25 + 9 + 1 + 0 + 0 + 4 + 9 + 16 = 64.
Step 2: Here, N = 8
=> $\sigma^2 = \frac{\sum (X - \bar{X})^2}{N}$
=> Variance = $\frac{64}{8}$ = 8
Step 3: Therefore the standard deviation is
S.D. = $\sqrt{8}$ = 2.8284.
Answer:
Variance = 8
and standard deviation = 2.8284

 
Question 2: Calculate the variance and also standard deviation for the following values: 1, 2, 3, 4, and 5.
Solution:

The mean = $\bar{X} = \frac{1+2+3+4+5}{5} = \frac{15}{5} = 3$
Therefore, $\bar{X} = 3$.

Step 1: Subtract the mean from each value and square the result.
X     | X - X̄       | (X - X̄)²
1     | 1 - 3 = -2  | 4
2     | 2 - 3 = -1  | 1
3     | 3 - 3 = 0   | 0
4     | 4 - 3 = 1   | 1
5     | 5 - 3 = 2   | 4
Total |             | 10

Step 2: Here, N = 5
=> $\sigma^2 = \frac{\sum (X - \bar{X})^2}{N}$
= $\frac{10}{5}$ = 2
Step 3: Therefore the standard deviation is
S.D. = $\sqrt{2}$ = 1.41
Answer:
Variance = 2
and standard deviation = 1.41

 
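Question 3: Find the square of the standard deviation for the following data set: 556, 563, 565, 568.
Solution:

Step 1:
Mean: $\bar{X} = \frac{556 + 563 + 565 + 568}{4} = \frac{2252}{4} = 563$

Step 2:
Standard deviation: The value below corresponds to the sample standard deviation, which divides by n - 1 rather than N:
$s = \sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}}$
$= \sqrt{\frac{(-7)^2 + 0^2 + 2^2 + 5^2}{4 - 1}}$
$= \sqrt{\frac{78}{3}}$
$= \sqrt{26}$
= 5.0990195135928
Step 3:
Square of standard deviation: Squaring the standard deviation gives the sample variance, $s^2 = 26$.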
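The value 5.0990195135928 quoted in Step 2 matches the sample standard deviation (divisor n - 1), not the population formula (divisor N) used in the earlier questions. The sketch below, which relies only on Python's standard `statistics` module, shows both versions side by side so the difference is visible.

```python
import statistics

data = [556, 563, 565, 568]

# Sample versions divide the sum of squared deviations by n - 1.
sample_sd = statistics.stdev(data)
sample_variance = statistics.variance(data)

# Population versions divide by n.
population_variance = statistics.pvariance(data)

print(round(sample_sd, 7))   # 5.0990195
print(sample_variance)       # equals 26, the square of the sample SD
print(population_variance)   # equals 19.5
```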

Reference:
 http://math.tutorvista.com/statistics/dispersion.html#measures-of-dispersion
