Introduction
Standard deviation is
a statistical term that measures the amount of variability or dispersion around
an average. Standard deviation is also a measure of volatility. Generally
speaking, dispersion is the difference between the actual value and the average
value. The larger this dispersion or variability is, the higher the standard
deviation. The smaller this dispersion or variability is, the lower the
standard deviation. Chartists can use the standard deviation to measure
expected risk and determine the significance of certain price movements.
CONCEPT OF STANDARD DEVIATION (SD) AND STANDARD ERROR OF MEAN (SEM)
Studying an entire population is time- and resource-intensive and not always
feasible; studies are therefore often done on a sample, and the data are
summarized using descriptive statistics. These findings are then generalized to
the larger, unobserved population using inferential statistics.
For example, in order to understand cholesterol levels of the population,
cholesterol levels of a study sample drawn from the same population are
measured. The findings of this sample are best described by two parameters: mean
and SD. The sample mean is the average of these observations and is denoted by
X̄. It is the center of the distribution of observations (central tendency).
The other parameter, SD, tells us the dispersion of individual observations
about the mean. In other words, it characterizes the typical distance of an
observation from the distribution center or middle value. If observations are
more dispersed, then there will be more variability. Thus, a low SD signifies
less variability, while a high SD indicates that the data are more spread out.
Mathematically, the SD is

$$ s = \sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}} $$

where s = sample SD; X = individual value; X̄ = sample mean; n = sample size.
Figure 1a shows cholesterol levels of a population of 200 healthy individuals.
Cholesterol of most individuals is between 190 and 210 mg/dl, with a mean (μ)
of 200 mg/dl and SD (σ) of 10 mg/dl. A study of 10 individuals drawn from the
same population, with cholesterol levels of 180, 200, 190, 180, 220, 190, 230,
190, 190 and 180 mg/dl, gives X̄ = 195 mg/dl and SD (s) = 17.1 mg/dl. If one
draws three different groups of 10 individuals each, one will obtain three
different means and SDs. (Adapted from Glantz, 2002)
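The sample mean and SD quoted for the 10 individuals can be checked with a few
lines of Python. This is a minimal sketch that assumes the sample (n - 1) form
of the SD; the variable names are illustrative and not part of the original
study.

```python
import math
import statistics

# Cholesterol levels (mg/dl) of the 10 sampled individuals, as quoted in the text.
sample = [180, 200, 190, 180, 220, 190, 230, 190, 190, 180]

n = len(sample)
mean = sum(sample) / n

# Sample SD: sum of squared deviations from the mean, divided by (n - 1),
# followed by a square root.
squared_deviations = [(x - mean) ** 2 for x in sample]
sample_sd = math.sqrt(sum(squared_deviations) / (n - 1))

# Cross-check against the standard library's sample SD.
assert math.isclose(sample_sd, statistics.stdev(sample))

print(f"mean      = {mean:.1f} mg/dl")
print(f"sample SD = {sample_sd:.2f} mg/dl")
```

The computed SD is about 17.16 mg/dl, which agrees with the quoted 17.1 mg/dl
up to rounding; dividing by n instead of n - 1 would give a noticeably smaller
value.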
These sample results are used to make inferences based on the premise that what
is true for a randomly selected sample will be true, more or less, for the
population from which the sample is chosen. This means that the sample mean
(X̄) estimates the true but unknown population mean (μ) and the sample SD (s)
estimates the population SD (σ). However, the precision with which sample
results determine population parameters needs to be addressed. Thus, in the
above case X̄ = 195 mg/dl estimates the population mean μ = 200 mg/dl. If other
samples of 10 individuals are selected, because of intrinsic variability it is
unlikely that exactly the same mean and SD [Figures 1b, c and d] would be
observed; we may therefore expect a different estimate of the population mean
every time.
Figure 2 shows the means of 25 groups of 10 individuals each, drawn from the
population shown in Figure 1. If these 25 group means are treated as 25
observations, then as per the statistical “Central Limit Theorem” these
observations will be approximately normally distributed regardless of the
nature of the original population. The mean of all these sample means will
equal the mean of the original population, and the standard deviation of all
these sample means is called the standard error of the mean (SEM), as explained
below.
Figure 2: The means of 25 groups of 10 individuals each, drawn from the
population of 200 individuals shown in Figure 1. The means of the three groups
shown in Figure 1 are marked with circles filled with the corresponding
patterns.
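The behavior of the group means can be illustrated with a small simulation.
The sketch below is not the data behind the figures: it assumes a hypothetical
population of 200 values generated from a normal distribution with mean 200
mg/dl and SD 10 mg/dl, and a fixed random seed so that the run is repeatable.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical population standing in for Figure 1: 200 values drawn from a
# normal distribution with mean 200 mg/dl and SD 10 mg/dl (an assumption).
population = [random.gauss(200, 10) for _ in range(200)]

# Draw 25 random groups of 10 individuals each and record each group's mean.
group_means = [
    statistics.mean(random.sample(population, 10)) for _ in range(25)
]

population_sd = statistics.pstdev(population)
sd_of_means = statistics.stdev(group_means)

print(f"population mean          = {statistics.mean(population):.2f}")
print(f"mean of the 25 means     = {statistics.mean(group_means):.2f}")
print(f"population SD            = {population_sd:.2f}")
print(f"SD of the 25 means       = {sd_of_means:.2f}")
print(f"population SD / sqrt(10) = {population_sd / 10 ** 0.5:.2f}")
```

In a run like this the group means cluster much more tightly than the
individual values, and their SD should land near the population SD divided by
the square root of the group size.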
SEM is the standard deviation of the means of random samples drawn from the
original population. Just as the sample SD (s) is an estimate of the variability
of observations, SEM is an estimate of the variability of possible values of
sample means. Because mean values are used to calculate SEM, there is expected
to be less variability in the values of the sample mean than in the original
population. This shows that SEM is a measure of the precision with which the
sample mean X̄ estimates the population mean μ. The precision increases as the
sample size increases [Figure 3].
Figure 3: The SEM is a function of the sample size.
Thus, SEM quantifies uncertainty in the estimate of the mean.[13,14]
Mathematically, the best estimate of SEM from a single sample is

$$ \sigma_M = \frac{s}{\sqrt{n}} $$

where σM = SEM; s = SD of sample; n = sample size.
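As a rough illustration, the SEM of the 10-person cholesterol sample can be
estimated as s divided by the square root of n. The helper name
`standard_error` is an assumption made for this sketch.

```python
import math
import statistics

# Cholesterol levels (mg/dl) of the 10 sampled individuals from the text.
sample = [180, 200, 190, 180, 220, 190, 230, 190, 190, 180]


def standard_error(values):
    """Estimate the SEM from a single sample as s / sqrt(n)."""
    return statistics.stdev(values) / math.sqrt(len(values))


sem = standard_error(sample)
print(f"SEM = {sem:.2f} mg/dl")

# Holding s fixed, the SEM shrinks with the square root of the sample size:
# quadrupling n halves the SEM.
s = statistics.stdev(sample)
for n in (10, 40, 160):
    print(f"n = {n:3d}  ->  SEM = {s / math.sqrt(n):.2f} mg/dl")
```

The loop shows why the SEM alone says little about the spread of the data: it
can be made arbitrarily small by increasing n, while s stays about the same.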
However, SEM by itself doesn’t convey much useful information. Its main
function is to help construct confidence intervals (CI).[16] A CI is the range
of values that is believed to encompass the actual (“true”) population value.
This true population value usually is not known, but can be estimated from an
appropriately selected sample. If samples are drawn repeatedly from the
population and a CI is constructed for every sample, then a certain percentage
of CIs will include the true population value while a certain percentage will
not. Wider CIs indicate lesser precision, while narrower ones indicate greater
precision.[17]
A CI can be calculated for any desired degree of confidence by using the sample
size and variability (SD) of the sample, although 95% CIs are by far the most
commonly used, indicating that the level of certainty of including the true
parameter value is 95%. The CI for the true population mean μ is given by[12]

$$ \bar{X} \pm z \cdot \frac{s}{\sqrt{n}} $$

where X̄ = sample mean; s = SD of sample; n = sample size; z (standardized
score) is the value of the standard normal distribution with the specific level
of confidence. For a 95% CI, z = 1.96.
A 95% CI for the population mean, based on the first sample with mean 195 mg/dl
and SD 17.1 mg/dl, is 184.4 to 205.6 mg/dl; this interval includes the true
population mean μ = 200 mg/dl. In essence, a confidence interval is a range
that we expect, with some level of confidence, to include the actual value of
the population mean.[17]
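The interval for the first sample can be reproduced as the sample mean plus or
minus z times the SEM. The sketch below assumes the normal-based (z) interval
described in the text and computes s from the raw data rather than from the
rounded 17.1 mg/dl, so the last digit may differ slightly from hand-rounded
figures. The function name `confidence_interval` is illustrative.

```python
import math
import statistics
from statistics import NormalDist

# Cholesterol levels (mg/dl) of the 10 sampled individuals from the text.
sample = [180, 200, 190, 180, 220, 190, 230, 190, 190, 180]


def confidence_interval(values, confidence=0.95):
    """Return (lower, upper) for the mean using mean +/- z * s / sqrt(n)."""
    mean = statistics.mean(values)
    sem = statistics.stdev(values) / math.sqrt(len(values))
    # Two-sided interval: z is the quantile at 0.5 + confidence / 2,
    # which is about 1.96 for a 95% CI.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return mean - z * sem, mean + z * sem


lower, upper = confidence_interval(sample)
print(f"95% CI: {lower:.1f} to {upper:.1f} mg/dl")
```

Calling `confidence_interval(sample, confidence=0.99)` gives a wider interval,
which reflects the trade-off between confidence level and precision.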
APPLICATION
As explained above, SD and SEM estimate quite different things. But in many
articles, SEM and SD are used interchangeably, and authors summarize their data
with SEM because it makes the data seem less variable and more representative.
However, unlike SD, which quantifies the variability, SEM quantifies uncertainty
in the estimate of the mean.[13] As readers are generally interested in knowing
the variability within the sample and not the proximity of the mean to the
population mean, data should be summarized with SD and not with SEM.[18,19]
The importance of SD in clinical settings is discussed below. In an
atherosclerotic disease study, an investigator reports mean peak systolic
velocity (PSV) in the carotid artery, a measure of stenosis, as 220 cm/sec with
an SD of 10 cm/sec.[20] In this case it would be unusual to observe a PSV less
than 200 cm/sec or greater than 240 cm/sec, as 95% of the population falls
within 2 SD of the mean, assuming that the population follows a normal
distribution. Thus, there is a quick summary of the population and the range
against which to compare the specific findings. Unfortunately, investigators are
quite likely to report the PSV as 220 ± 1.6 cm/sec (SEM). If one confused the
SEM with the SD, one would believe that the range of the population is narrow
(216.8 to 223.2 cm/sec), which is not the case.
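The two ranges in this example follow from the same arithmetic applied to
different spreads, which a short sketch makes explicit. The helper name
`two_spread_range` is an assumption for illustration.

```python
# Values quoted in the PSV example (cm/sec).
mean_psv = 220.0
sd = 10.0
sem = 1.6


def two_spread_range(center, spread):
    """Return the interval center +/- 2 * spread."""
    return center - 2 * spread, center + 2 * spread


# Mean +/- 2 SD: where about 95% of individual values are expected to fall,
# assuming a normal distribution.
sd_range = two_spread_range(mean_psv, sd)

# Mean +/- 2 SEM: describes the uncertainty of the mean, not the spread of
# individual values.
sem_range = two_spread_range(mean_psv, sem)

print(f"mean +/- 2 SD : {sd_range[0]:.1f} to {sd_range[1]:.1f} cm/sec")
print(f"mean +/- 2 SEM: {sem_range[0]:.1f} to {sem_range[1]:.1f} cm/sec")
```

Reading the second interval as if it described individual patients would
understate the population's spread by roughly a factor of six in this example.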
Additionally, when two groups are compared (e.g. treatment and control groups),
SD helps in visualizing the effect size, which is an index of how much
difference there is between two groups.[12] Effect size gives an idea of the
magnitude of the difference and helps differentiate between statistical
significance and practical importance. Effect size is determined by dividing the
difference between the means by the pooled or average standard deviation of the
two groups. Generally, an effect size of 0.8 or more is considered a large
effect and indicates that the means of the two groups are separated by 0.8 SD;
effect sizes of 0.5 and 0.2 are considered moderate and small respectively, and
indicate that the means of the two groups are separated by 0.5 and 0.2 SD.[12]
However, the same cannot be interpreted with SEM. More importantly, SEMs do not
provide a direct visual impression of the effect size if the number of subjects
differs between groups.
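The effect size calculation described above can be sketched as follows. The two
groups are hypothetical values invented for illustration, and the pooled SD here
uses the common (n - 1)-weighted form, which is one of several conventions for
the "pooled or average" SD.

```python
import math
import statistics


def effect_size(group_a, group_b):
    """Difference between group means divided by the pooled SD."""
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)  # sample variance, n - 1 denominator
    var_b = statistics.variance(group_b)
    # Pooled SD: weight each group's variance by its degrees of freedom.
    pooled_sd = math.sqrt(
        ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)
    )
    return (statistics.mean(group_a) - statistics.mean(group_b)) / pooled_sd


# Hypothetical measurements for two groups (illustrative values only).
treatment = [10, 12, 14, 16, 18]
control = [8, 10, 12, 14, 16]

d = effect_size(treatment, control)
print(f"effect size = {d:.2f}")
```

For these made-up groups the means differ by about 0.63 pooled SDs, which falls
between the moderate (0.5) and large (0.8) benchmarks mentioned in the text.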
In exceptional cases, the SD may be a deceptive index of variability, namely in
the many experimental situations where the biological variable differs grossly
from a normal distribution (e.g. distribution of plasma creatinine, growth rate
of tumor and plasma concentration of immune or inflammatory mediators). In
these cases, because of the skewed distribution, SD will be an inflated measure
of variability. The data can then be presented using other measures of
variability (e.g. mean absolute deviation and the interquartile range), or can
be transformed (common transformations include the logarithmic, inverse, square
root, and arc sine transformations).[17]
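The alternatives mentioned for skewed data can be compared side by side. The
values below are hypothetical right-skewed measurements invented for this
sketch, and the quartiles use the default method of Python's
`statistics.quantiles`, which is one of several quartile conventions.

```python
import math
import statistics

# Hypothetical right-skewed measurements (illustrative values only).
values = [0.7, 0.8, 0.9, 1.0, 1.1, 1.2, 1.4, 2.5, 4.0, 8.0]

mean = statistics.mean(values)
sd = statistics.stdev(values)

# Mean absolute deviation: average distance of each value from the mean.
mean_abs_dev = sum(abs(x - mean) for x in values) / len(values)

# Interquartile range: distance between the first and third quartiles.
q1, _median, q3 = statistics.quantiles(values, n=4)
iqr = q3 - q1

# A logarithmic transformation pulls in the long right tail before the SD
# is computed (the result is on the log scale).
log_sd = statistics.stdev([math.log(x) for x in values])

print(f"SD                      = {sd:.2f}")
print(f"mean absolute deviation = {mean_abs_dev:.2f}")
print(f"interquartile range     = {iqr:.2f}")
print(f"SD of log values        = {log_sd:.2f}")
```

In this example the two largest values dominate the SD, while the mean absolute
deviation and interquartile range are less affected by them.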
Some journal editors require their authors to use the SD and not the SEM. There
are two reasons for this trend. First, the SEM is a function of the sample size,
so it can be made smaller simply by increasing the sample size (n) [Figure 3].
Second, the interval (mean ± 2 SEM) will contain approximately 95% of the means
of samples, but will never contain 95% of the observations on individuals; in
the latter situation, mean ± 2 SD is needed.[21]
In general, the use of the SEM should be limited to inferential statistics where
the author explicitly wants to inform the reader about the precision of the
study, and how well the sample truly represents the entire population.[22] In
graphs and figures too, use of SD is preferable to the SEM. Further, in every
case, standard deviations should preferably be reported in parentheses [i.e.,
mean (SD)] rather than using mean ± SD expressions, as the latter specification
can be confused with a 95% CI.[17]
Calculation
StockCharts.com
calculates the standard deviation for a population, which assumes that the
periods involved represent the whole data set, not a sample from a bigger data
set. The calculation steps are as follows:
1. Calculate the average (mean) price for the number of periods or observations.
2. Determine each period's deviation (close less average price).
3. Square each period's deviation.
4. Sum the squared deviations.
5. Divide this sum by the number of observations.
6. The standard deviation is then equal to the square root of that number.
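The six steps above can be sketched directly in Python. The closing prices are
hypothetical values chosen for illustration, not the QQQQ data from the
spreadsheet, and the result is cross-checked against the standard library's
population SD.

```python
import math
import statistics

# Hypothetical closing prices for 10 periods (illustrative values only).
closes = [50.0, 51.0, 52.0, 51.5, 50.5, 49.5, 50.0, 51.0, 52.5, 53.0]

# Step 1: average (mean) price over the periods.
average = sum(closes) / len(closes)

# Step 2: each period's deviation (close less average price).
deviations = [close - average for close in closes]

# Step 3: square each period's deviation.
squared = [d ** 2 for d in deviations]

# Step 4: sum the squared deviations.
total = sum(squared)

# Step 5: divide by the number of observations (population form, not n - 1).
variance = total / len(closes)

# Step 6: the standard deviation is the square root of that number.
population_sd = math.sqrt(variance)

# Cross-check against the standard library's population SD.
assert math.isclose(population_sd, statistics.pstdev(closes))
print(f"10-period population SD = {population_sd:.4f}")
```

Step 5 is what distinguishes this population calculation from the sample SD,
which would divide by the number of observations minus one.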
The spreadsheet above
shows an example for a 10-period standard deviation using QQQQ data. Notice
that the 10-period average is
calculated after the 10th period and this average is applied to all 10 periods.
Building a running standard deviation with this formula would be quite
intensive. Excel has an easier way with the STDEVP formula. The table below
shows the 10-period standard deviation using this formula. Here's an Excel
Spreadsheet that shows the standard deviation calculations.
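Outside a spreadsheet, a running 10-period standard deviation can be built by
applying a population SD to each sliding window, in the spirit of Excel's
STDEVP. This is a minimal sketch with hypothetical prices; the function name
`rolling_pstdev` is an assumption, and a production version would likely use an
incremental update instead of recomputing each window.

```python
import statistics


def rolling_pstdev(values, window=10):
    """Population SD of each consecutive window of `window` values."""
    return [
        statistics.pstdev(values[end - window:end])
        for end in range(window, len(values) + 1)
    ]


# Hypothetical closing prices for 12 periods (illustrative values only).
closes = [50.0, 51.0, 52.0, 51.5, 50.5, 49.5,
          50.0, 51.0, 52.5, 53.0, 52.0, 51.5]

running_sd = rolling_pstdev(closes, window=10)

# The first value becomes available only after the 10th period.
for period, value in enumerate(running_sd, start=10):
    print(f"period {period}: 10-period SD = {value:.4f}")
```

With 12 prices and a 10-period window, the list holds three values, one for
each of periods 10, 11 and 12.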
CONCLUSION
Proper understanding and use of fundamental statistics, such as SD and SEM, and
their application will allow more reliable analysis, interpretation, and
communication of data to readers. Although SEM and SD are often used
interchangeably to express variability, they measure different things. SEM, an
inferential parameter, quantifies uncertainty in the estimate of the mean,
whereas SD is a descriptive parameter and quantifies the variability. As readers
are generally interested in knowing the variability within the sample,
descriptive data should be summarized with SD. Use of SEM should be limited to
computing the CI, which measures the precision of the population estimate.