Standard Deviation
What is Standard Deviation?
Standard deviation measures how far the values in a data set are dispersed from their mean. It is always measured from the arithmetic mean. Standard deviation is never negative and is denoted by σ (sigma). Because it takes every observation into account, standard deviation (SD) is often preferred over other measures of dispersion.
The standard deviation (SD) is calculated as the square root of the variance, which is obtained from each data point's deviation from the arithmetic mean. When the data points lie far from the mean, the deviation within the data set is higher. Hence, the more spread out the data, the higher the standard deviation.
The formula to calculate Standard Deviation is:
σ = √[∑(X - mean)²/N]
Here, X = each value in the data set and N = number of observations.
Properties of Standard Deviation
• Standard deviation measures only the dispersion, or spread, around the mean value of the data set.
• Standard deviation is never negative.
• It quantifies the variation that exists around the average value.
• Standard deviation is very sensitive to outliers. A single outlier can distort the picture of dispersion.
• For data sets with approximately the same mean value, the greater the dispersion or spread, the greater the standard deviation.
• Standard deviation is zero when all the values of a data set are the same.
• When analyzing normally distributed data, SD is used together with the mean to calculate data intervals.
If S = standard deviation and x = a value in the data set, then
around 68% of the data lies in the interval mean - S < x < mean + S,
around 95% of the data lies in the interval mean - 2S < x < mean + 2S,
around 99.7% of the data lies in the interval mean - 3S < x < mean + 3S.
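The percentages in these intervals can be checked numerically. The following is a minimal sketch using NormalDist from Python's standard statistics module. The helper names coverage_within and interval_around_mean, and the example mean of 70 with S = 10, are illustrative assumptions and do not come from the text.

```python
from statistics import NormalDist


def coverage_within(k: float) -> float:
    """Share of a normal distribution lying within k standard deviations of the mean."""
    standard_normal = NormalDist(mu=0.0, sigma=1.0)
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


def interval_around_mean(mean: float, sd: float, k: float) -> tuple[float, float]:
    """Return the interval (mean - k*S, mean + k*S)."""
    return (mean - k * sd, mean + k * sd)


# Hypothetical example values: mean = 70, S = 10
for k in (1, 2, 3):
    low, high = interval_around_mean(70.0, 10.0, k)
    print(f"{low} < x < {high}: about {coverage_within(k):.1%} of the data")
```

These shares hold only for data that is approximately normally distributed; skewed data can deviate from them noticeably.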
Standard Deviation Calculation
Before calculating the Standard Deviation, it is essential to distinguish the three types of data series. These are:
1. Individual series
A single column denoting the observation is available here.
Score | 28 | 34 | 48 | 69 | 73 | 78 | 84 | 89 | 93 |
2. Discrete series
There are two columns that represent different data. One column shows the observation, while the other column is for frequency corresponding to the observation column.
Marks | 30 | 40 | 50 | 60 | 70 | 80 |
Number of students | 5 | 6 | 4 | 9 | 10 | 8 |
3. Frequency distribution
It has two columns, one representing the observations and the other the corresponding frequencies.
Here, the observations are further classified into intervals or classes.
Age | 20-30 | 30-40 | 40-50 | 50-60 | 60-70 | 70-80 |
Number of people | 34 | 45 | 30 | 20 | 15 | 10 |
Sigma for individual series
The Standard deviation for individual series can be calculated by three methods; these are:
Direct method
Use the formula ∑X/N to calculate the arithmetic mean. Then calculate the deviation of each observation from the mean using the formula D = X - mean.
Next, the deviations, D, are squared and summed. The resulting value is divided by the total number of observations. The square root of this value is the standard deviation.
The formula is: σ = √[∑D²/N]
Here, D = deviation of an item relative to the mean, calculated as D = X - mean.
N= Number of observations
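The direct method can be sketched in a few lines of Python. This is an illustrative sketch, not a reference implementation: the function name sd_direct is an assumption, the data is the Score series from the individual-series table, and the standard-library statistics.pstdev is printed alongside as a cross-check.

```python
import math
import statistics


def sd_direct(values):
    """Population standard deviation by the direct method: sqrt(sum(D^2) / N)."""
    n = len(values)
    mean = sum(values) / n                     # arithmetic mean, sum(X) / N
    deviations = [x - mean for x in values]    # D = X - mean
    return math.sqrt(sum(d * d for d in deviations) / n)


scores = [28, 34, 48, 69, 73, 78, 84, 89, 93]  # Score series from the table above
print(sd_direct(scores))
print(statistics.pstdev(scores))  # standard-library population SD, for comparison
```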
Short-cut method
In this method, an arbitrary value A is taken as the assumed mean, and the deviations are measured from it as D = X - A. The assumed value is usually chosen near the middle of the range of values. The short-cut method uses the formula:
σ = √[(∑D²/N) – (∑D/N)²]
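A short sketch of the short-cut formula follows, assuming the deviations are taken from an assumed mean A, that is, D = X - A. The function name sd_shortcut and the assumed means 70 and 60 are illustrative choices. The point of the example is that the result does not depend on which assumed mean is picked.

```python
import math


def sd_shortcut(values, assumed_mean):
    """Population SD by the short-cut method: sqrt(sum(D^2)/N - (sum(D)/N)^2)."""
    n = len(values)
    deviations = [x - assumed_mean for x in values]   # D = X - A
    mean_of_squares = sum(d * d for d in deviations) / n
    mean_deviation = sum(deviations) / n
    # max() guards against a tiny negative value caused by floating-point rounding
    return math.sqrt(max(0.0, mean_of_squares - mean_deviation ** 2))


scores = [28, 34, 48, 69, 73, 78, 84, 89, 93]  # Score series from the table above
print(sd_shortcut(scores, assumed_mean=70))
print(sd_shortcut(scores, assumed_mean=60))    # same result with a different A
```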
Step-deviation method
It is a simplified form of the short-cut method. Here, we select a common factor C among the deviations. Dividing all the deviations by C makes them smaller, which simplifies the calculations. The formula is:
SD (σ) = √[(∑D'²/N) – (∑D'/N)²] × C
D' = step-deviation of an observation relative to the assumed mean A. It is calculated as D' = (X-A)/C
C = common factor chosen.
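As a rough illustration of the step-deviation formula, the sketch below divides each deviation by C before squaring and multiplies the result by C at the end. The function name sd_step_deviation and the data values are hypothetical, chosen so that the deviations share the common factor 10.

```python
import math


def sd_step_deviation(values, assumed_mean, common_factor):
    """Population SD by step deviation: sqrt(sum(D'^2)/N - (sum(D')/N)^2) * C."""
    n = len(values)
    steps = [(x - assumed_mean) / common_factor for x in values]  # D' = (X - A) / C
    mean_of_squares = sum(s * s for s in steps) / n
    mean_step = sum(steps) / n
    # max() guards against a tiny negative value caused by floating-point rounding
    return math.sqrt(max(0.0, mean_of_squares - mean_step ** 2)) * common_factor


values = [10, 20, 30, 40, 50]  # hypothetical data with common factor 10
print(sd_step_deviation(values, assumed_mean=30, common_factor=10))
```

With A = 30 and C = 10 the step-deviations become -2, -1, 0, 1, 2, which are much easier to square by hand than the raw deviations.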
Sigma for discrete series
There are two ways to calculate Standard Deviation in discrete series; these are:
Direct method
In a discrete series, a frequency column is added, so each squared deviation is weighted by its frequency. The direct-method formula for Standard Deviation is:
Standard deviation (σ) = √[∑fD²/N]
Here, D = X - mean, f = frequency of the observation, and N = ∑f.
Short-cut method
Standard deviation (σ) = √[(∑fD²/N) – (∑fD/N)²]
Here, D = X - A, the deviation from an assumed mean A, and N = ∑f.
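Both discrete-series formulas can be sketched as follows, using the Marks and Number of students table from earlier. The function names and the assumed mean of 60 are illustrative assumptions. In the short-cut version, D is taken as X - A.

```python
import math


def sd_discrete_direct(values, freqs):
    """Direct method: sqrt(sum(f * D^2) / N), with D = X - mean and N = sum(f)."""
    n = sum(freqs)
    mean = sum(f * x for x, f in zip(values, freqs)) / n
    return math.sqrt(sum(f * (x - mean) ** 2 for x, f in zip(values, freqs)) / n)


def sd_discrete_shortcut(values, freqs, assumed_mean):
    """Short-cut method: sqrt(sum(f*D^2)/N - (sum(f*D)/N)^2), with D = X - A."""
    n = sum(freqs)
    sum_fd = sum(f * (x - assumed_mean) for x, f in zip(values, freqs))
    sum_fd2 = sum(f * (x - assumed_mean) ** 2 for x, f in zip(values, freqs))
    # max() guards against a tiny negative value caused by floating-point rounding
    return math.sqrt(max(0.0, sum_fd2 / n - (sum_fd / n) ** 2))


marks = [30, 40, 50, 60, 70, 80]
students = [5, 6, 4, 9, 10, 8]   # frequencies from the table above

print(sd_discrete_direct(marks, students))
print(sd_discrete_shortcut(marks, students, assumed_mean=60))
```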
Sigma for frequency distribution
Standard deviation in a frequency distribution series can be calculated by the following methods:
Direct method
The direct method for a frequency distribution is very similar to the one used for the discrete series above. The only difference between the two series is the value used for each observation. Here, the mid-value of each class is determined by dividing the sum of the upper and lower limits of the class by 2. The mid-value thus derived is used as the observation in the calculation. The formula is:
Standard Deviation (σ) = √[∑fD²/N]
In the calculation, D = deviation of an item relative to the mean value, calculated as
D = Xi – Mean
f = frequency corresponding to the observation
N = the sum of the frequencies.
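A minimal sketch of the direct method for grouped data is shown below, using the Age and Number of people table from earlier. The function names class_midpoints and sd_grouped_direct are assumptions for illustration. Each class is represented by its mid-value, so the result is an approximation of the SD of the underlying ungrouped data.

```python
import math


def class_midpoints(classes):
    """Mid-value of each class: (lower limit + upper limit) / 2."""
    return [(lower + upper) / 2 for lower, upper in classes]


def sd_grouped_direct(classes, freqs):
    """Direct method on class mid-values: sqrt(sum(f * D^2) / N), D = Xi - mean."""
    mids = class_midpoints(classes)
    n = sum(freqs)                                        # N = sum of frequencies
    mean = sum(f * m for m, f in zip(mids, freqs)) / n
    return math.sqrt(sum(f * (m - mean) ** 2 for m, f in zip(mids, freqs)) / n)


age_classes = [(20, 30), (30, 40), (40, 50), (50, 60), (60, 70), (70, 80)]
people = [34, 45, 30, 20, 15, 10]   # frequencies from the table above

print(class_midpoints(age_classes))
print(sd_grouped_direct(age_classes, people))
```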
Step-deviation Method
The step-deviation method is a short-cut way to determine Standard Deviation. The formula is:
Standard Deviation (σ) = √[(∑fD'²/N) – (∑fD'/N)²] × C
In the above calculation, D' = step-deviation of an observation relative to the assumed value A. It is calculated as D' = (Xi-A)/C
N = the sum of the frequencies.
C = common factor chosen
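The step-deviation formula for grouped data can be sketched as below, again on the Age table. The function name sd_grouped_step, the assumed value A = 45, and the common factor C = 10 (the class width) are illustrative choices, not values given in the text.

```python
import math


def sd_grouped_step(classes, freqs, assumed_value, common_factor):
    """Step deviation on class mid-values:
    sqrt(sum(f*D'^2)/N - (sum(f*D')/N)^2) * C, with D' = (Xi - A) / C."""
    mids = [(lower + upper) / 2 for lower, upper in classes]
    n = sum(freqs)
    steps = [(m - assumed_value) / common_factor for m in mids]
    sum_fd = sum(f * s for s, f in zip(steps, freqs))
    sum_fd2 = sum(f * s * s for s, f in zip(steps, freqs))
    # max() guards against a tiny negative value caused by floating-point rounding
    return math.sqrt(max(0.0, sum_fd2 / n - (sum_fd / n) ** 2)) * common_factor


age_classes = [(20, 30), (30, 40), (40, 50), (50, 60), (60, 70), (70, 80)]
people = [34, 45, 30, 20, 15, 10]   # frequencies from the table above

print(sd_grouped_step(age_classes, people, assumed_value=45, common_factor=10))
```

With these choices the step-deviations are -2, -1, 0, 1, 2, 3, so ∑fD' = -33 and ∑fD'² = 351 can be worked out by hand before taking the square root.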
Did you know?
Without Standard Deviation, one can't compare two sets of data effectively. Suppose two data sets have the same average. Does that imply that the data sets are exactly the same? No. For example, the data sets 199, 200, 201 and 0, 200, 400 both have an average of 200, yet they have different standard deviations. The first data set has a small standard deviation (s = 1) in comparison to the second (s = 200). These two figures are sample standard deviations, which divide by N - 1 rather than N.
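The comparison can be reproduced with Python's standard statistics module. In this sketch, stdev is the sample standard deviation (divisor N - 1), which yields the values 1 and 200, while pstdev is the population standard deviation (divisor N) used by the formulas elsewhere in this article. The variable names narrow and wide are illustrative.

```python
import statistics

narrow = [199, 200, 201]
wide = [0, 200, 400]

for data in (narrow, wide):
    print(
        data,
        "mean:", statistics.mean(data),
        "sample SD:", statistics.stdev(data),       # divides by N - 1
        "population SD:", statistics.pstdev(data),  # divides by N
    )
```

Either way, the second data set is 200 times more spread out than the first, even though the means are identical.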
Fun facts about Standard Deviation
Compare the average daily high temperature of two cities, one inland and the other near the ocean. The range of daily high temperatures in the coastal city is typically smaller than in the inland city. Both cities can have the same average daily high temperature, but the standard deviation of the daily high temperature is smaller for the city near the ocean than for the inland city.
FAQs
Q What does SD or standard deviation indicate?
A The standard deviation (SD) indicates the average amount of variability in your data set. It tells us, on average, how far each score lies from the mean value. Because it uses every value in the data set, the SD is often preferred over other measures of dispersion, and it can never be negative. Standard deviation is denoted by the symbol sigma, σ.
In normal distributions, a higher standard deviation implies that the values are further away from the mean. Similarly, a lower standard deviation means the values are clustered very close to the arithmetic mean value.
Q What is the difference between the variance and the standard deviation?
A The difference between the standard deviation and the variance is as follows:
Variance is the average of the squared deviations from the mean, whereas standard deviation is the square root of that number. Both measurements indicate variability in the distribution, but their units differ:
• The standard deviation (SD) is expressed in the same units as the original values (for example, meters, grams, or minutes).
• The variance is expressed in squared units (for example, square meters).
Although the units of variance are harder to interpret at first, the variance is important in statistical tests.