# Statistics, Probability & Noise Electronics and Communication Engineering (ECE) Notes | EduRev

``` Page 1

The Scientist and Engineer's Guide to Digital Signal Processing
CHAPTER 2: Statistics, Probability and Noise
Statistics and probability are used in Digital Signal Processing to characterize signals and the
processes that generate them.  For example, a primary use of DSP is to reduce interference, noise,
and other undesirable components in acquired data.  These may be an inherent part of the signal
being measured, arise from imperfections in the data acquisition system, or be introduced as an
unavoidable byproduct of some DSP operation.  Statistics and probability allow these disruptive
features to be measured and classified, the first step in developing strategies to remove the
offending components.  This chapter introduces the most important concepts in statistics and
probability, with emphasis on how they apply to acquired signals.
Signal and Graph Terminology
A signal is a description of how one parameter is related to another parameter.
For example, the most common type of signal in analog electronics is a voltage
that varies with time. Since both parameters can assume a continuous range
of values, we will call this a continuous signal. In comparison, passing this
signal through an analog-to-digital converter forces each of the two parameters
to be quantized. For instance, imagine the conversion being done with 12 bits
at a sampling rate of 1000 samples per second. The voltage is curtailed to 4096
(2^12) possible binary levels, and the time is only defined at one millisecond
increments. Signals formed from parameters that are quantized in this manner
are said to be discrete signals or digitized signals. For the most part,
continuous signals exist in nature, while discrete signals exist inside computers
(although you can find exceptions to both cases).  It is also possible to have
signals where one parameter is continuous and the other is discrete.  Since
these mixed signals are quite uncommon, they do not have special names given
to them, and the nature of the two parameters must be explicitly stated.
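As a quick check of the numbers in this example (the 4.096 volt input span used below is an assumption chosen only to make the arithmetic round; the text does not specify one):
2^{12} = 4096 \text{ levels}
\frac{1}{1000 \text{ samples/second}} = 0.001 \text{ second} = 1 \text{ millisecond between samples}
\frac{4.096 \text{ volts}}{4096 \text{ levels}} = 0.001 \text{ volt per level}
Under that assumption, the digitized signal resolves voltage only in steps of one millivolt and time only in steps of one millisecond.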
Figure 2-1 shows two discrete signals, such as might be acquired with a
digital data acquisition system.  The vertical axis may represent voltage, light
intensity, sound pressure, or an infinite number of other parameters.  Since we
don't know what it represents in this particular case, we will give it the generic
label: amplitude. This parameter is also called several other names: the
y-axis, the dependent variable, the range, and the ordinate.
The horizontal axis represents the other parameter of the signal, going by
such names as: the x-axis, the independent variable, the domain, and the
abscissa. Time is the most common parameter to appear on the horizontal axis
of acquired signals; however, other parameters are used in specific applications.
For example, a geophysicist might acquire measurements of rock density at
equally spaced distances along the surface of the earth. To keep things
general, we will simply label the horizontal axis: sample number. If this
were a continuous signal, another label would have to be used, such as: time,
distance, x, etc.
The two parameters that form a signal are generally not interchangeable.  The
parameter on the y-axis (the dependent variable) is said to be a function of the
parameter on the x-axis (the independent variable).  In other words, the
independent variable describes how or when each sample is taken, while the
dependent variable is the actual measurement.  Given a specific value on the
x-axis, we can always find the corresponding value on the y-axis, but usually
not the other way around.
Pay particular attention to the word: domain, a very widely used term in DSP.
For instance, a signal that uses time as the independent variable (i.e., the
parameter on the horizontal axis), is said to be in the time domain. Another
common signal in DSP uses frequency as the independent variable, resulting in
the term, frequency domain. Likewise, signals that use distance as the
independent parameter are said to be in the spatial domain (distance is a
measure of space). The type of parameter on the horizontal axis is the domain
of the signal; it's that simple. What if the x-axis is labeled with something
very generic, such as sample number? Authors commonly refer to these signals
as being in the time domain.  This is because sampling at equal intervals of
time is the most common way of obtaining signals, and they don't have anything
more specific to call it.
Although the signals in Fig. 2-1 are discrete, they are displayed in this figure
as continuous lines.  This is because there are too many samples to be
distinguishable if they were displayed as individual markers.  In graphs that
portray shorter signals, say less than 100 samples, the individual markers are
usually shown.  Continuous lines may or may not be drawn to connect the
markers, depending on how the author wants you to view the data.  For
instance, a continuous line could imply what is happening between samples, or
simply be an aid to help the reader's eye follow a trend in noisy data.   The
point is, examine the labeling of the horizontal axis to find if you are working
with a discrete or continuous signal.  Don't rely on an illustrator's ability to
draw dots.
The variable, N, is widely used in DSP to represent the total number of
samples in a signal. For example, N = 512 for the signals in Fig. 2-1. To
keep the data organized, each sample is assigned a sample number or
index. These are the numbers that appear along the horizontal axis. Two
notations for assigning sample numbers are commonly used. In the first
notation, the sample indexes run from 1 to N (e.g., 1 to 512). In the second
notation, the sample indexes run from 0 to N-1 (e.g., 0 to 511).
Mathematicians often use the first method (1 to N), while those in DSP
commonly use the second (0 to N-1). In this book, we will use the second
notation. Don't dismiss this as a trivial problem. It will confuse you
sometime during your career. Look out for it!
FIGURE 2-1
Examples of two digitized signals with different means and standard deviations.
Both panels plot amplitude (vertical axis, -4 to 8) against sample number (horizontal axis, 0 to 511).
a. Mean = 0.5, σ = 1
b. Mean = 3.0, σ = 0.2
Mean and Standard Deviation
The mean, indicated by µ (a lower case Greek mu), is the statistician's jargon
for the average value of a signal. It is found just as you would expect: add all
of the samples together, and divide by N. It looks like this in mathematical
form:
EQUATION 2-1
Calculation of a signal's mean. The signal is contained in x_0 through x_{N-1},
i is an index that runs through these values, and µ is the mean.
\mu = \frac{1}{N} \sum_{i=0}^{N-1} x_i
In words, sum the values in the signal, x_i, by letting the index, i, run from 0
to N-1. Then finish the calculation by dividing the sum by N. This is
identical to the equation: µ = (x_0 + x_1 + x_2 + ... + x_{N-1}) / N. If you are
not already familiar with Σ (upper case Greek sigma) being used to indicate
summation, study these equations carefully, and compare them with the computer program
in Table 2-1. Summations of this type are abundant in DSP, and you need to
understand this notation fully.
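A short worked example may help connect the summation notation to ordinary arithmetic. The four sample values below are invented for illustration and do not come from Fig. 2-1. Take N = 4 with x_0 = 1, x_1 = 3, x_2 = 2, x_3 = 6, so the index i runs from 0 to N-1 = 3:
\sum_{i=0}^{3} x_i = 1 + 3 + 2 + 6 = 12
\mu = \frac{1}{4} \sum_{i=0}^{3} x_i = \frac{12}{4} = 3
Had the same samples been numbered 1 to 4 instead, the sum and the mean would be unchanged; only the limits on the summation would differ.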
In electronics, the mean is commonly called the DC (direct current) value.
Likewise, AC (alternating current) refers to how the signal fluctuates around
the mean value. If the signal is a simple repetitive waveform, such as a sine
or square wave, its excursions can be described by its peak-to-peak amplitude.
Unfortunately, most acquired signals do not show a well defined peak-to-peak
value, but have a random nature, such as the signals in Fig. 2-1. A more
generalized method must be used in these cases, called the standard
deviation, denoted by σ (a lower case Greek sigma).
As a starting point, the expression, |x_i - µ|, describes how far the i-th sample
deviates (differs) from the mean. The average deviation of a signal is found
by summing the deviations of all the individual samples, and then dividing by
the number of samples, N. Notice that we take the absolute value of each
deviation before the summation; otherwise the positive and negative terms
would average to zero. The average deviation provides a single number
representing the typical distance that the samples are from the mean. While
convenient and straightforward, the average deviation is almost never used in
statistics. This is because it doesn't fit well with the physics of how signals
operate. In most cases, the important parameter is not the deviation from the
mean, but the power represented by the deviation from the mean. For example,
when random noise signals combine in an electronic circuit, the resultant noise
is equal to the combined power of the individual signals, not their combined
amplitude.
The standard deviation is similar to the average deviation, except the
averaging is done with power instead of amplitude. This is achieved by
squaring each of the deviations before taking the average (remember, power ∝
voltage^2). To finish, the square root is taken to compensate for the initial
squaring. In equation form, the standard deviation is calculated:
EQUATION 2-2
Calculation of the standard deviation of a signal. The signal is stored in x_i,
µ is the mean found from Eq. 2-1, N is the number of samples, and σ is the
standard deviation.
\sigma^2 = \frac{1}{N-1} \sum_{i=0}^{N-1} (x_i - \mu)^2
In the alternative notation: σ = sqrt( ((x_0 - µ)^2 + (x_1 - µ)^2 + ... + (x_{N-1} - µ)^2) / (N-1) ).
Notice that the average is carried out by dividing by N-1 instead of N. This
is a subtle feature of the equation that will be discussed in the next section.
The term, σ^2, occurs frequently in statistics and is given the name variance.
The standard deviation is a measure of how far the signal fluctuates from the
mean.  The variance represents the power of this fluctuation.   Another term
you should become familiar with is the rms (root-mean-square) value,
frequently used in electronics.  By definition, the standard deviation only
measures the AC portion of a signal, while the rms value measures both the AC
and DC components.  If a signal has no DC component, its rms value is
identical to its standard deviation.  Figure 2-2 shows the relationship between
the standard deviation and the peak-to-peak value of several common
waveforms.
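To make these quantities concrete, here is a small worked example. The four sample values are invented for illustration and are not taken from Fig. 2-1. Take N = 4 samples with values 1, 3, 2, 6, whose mean is µ = 12 / 4 = 3. The deviations from the mean are -2, 0, -1 and 3.
\text{average deviation} = \frac{|-2| + |0| + |-1| + |3|}{4} = \frac{6}{4} = 1.5
\sigma^2 = \frac{(-2)^2 + 0^2 + (-1)^2 + 3^2}{4 - 1} = \frac{14}{3} \approx 4.667
\sigma = \sqrt{14/3} \approx 2.160
\text{rms} = \sqrt{\frac{1^2 + 3^2 + 2^2 + 6^2}{4}} = \sqrt{12.5} \approx 3.536
The rms value comes out larger than the standard deviation here because it also includes the DC component (the mean of 3), while the standard deviation measures only the fluctuation around that mean.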
Page 5

11
CHAPTER
2 Statistics, Probability and Noise
Statistics and probability are used in Digital Signal Processing to characterize signals and the
processes that generate them.  For example, a primary use of DSP is to reduce interference, noise,
and other undesirable components in acquired data.  These may be an inherent part of the signal
being measured, arise from imperfections in the data acquisition system, or be introduced as an
unavoidable byproduct of some DSP operation.  Statistics and probability allow these disruptive
features to be measured and classified, the first step in developing strategies to remove the
offending components.  This chapter introduces the most important concepts in statistics and
probability, with emphasis on how they apply to acquired signals.
Signal and Graph Terminology
A signal is a description of how one parameter is related to another parameter.
For example, the most common type of signal in analog electronics is a voltage
that varies with time .   Since both parameters can assume a continuous range
of values, we will call this a continuous signal .  In comparison, passing this
signal through an analog-to-digital converter forces each of the two parameters
to be quantized .  For instance, imagine the conversion being done with 12 bits
at a sampling rate of 1000 samples per second. The voltage is curtailed to 4096
(2
12
) possible binary levels, and the time is only defined at one millisecond
increments.  Signals formed from parameters that are quantized in this manner
are said to be discrete signals or digitized signals .  For the most part,
continuous signals exist in nature, while discrete signals exist inside computers
(although you can find exceptions to both cases).  It is also possible to have
signals where one parameter is continuous and the other is discrete.  Since
these mixed signals are quite uncommon, they do not have special names given
to them, and the nature of the two parameters must be explicitly stated.
Figure 2-1 shows two discrete signals, such as might be acquired with a
digital data acquisition system.  The vertical axis may represent voltage, light
The Scientist and Engineer's Guide to Digital Signal Processing 12
intensity, sound pressure, or an infinite number of other parameters.  Since we
don't know what it represents in this particular case, we will give it the generic
label: amplitude .   This parameter is also called several other names: the y-
axis , the dependent variable , the range , and the ordinate .
The horizontal axis represents the other parameter of the signal, going by
such names as: the x-axis , the independent variable , the domain , and the
abscissa .  Time is the most common parameter to appear on the horizontal axis
of acquired signals; however, other parameters are used in specific applications.
For example, a geophysicist might acquire measurements of rock density at
equally spaced distances along the surface of the earth.  To keep things
general, we will simply label the horizontal axis: sample number .  If this
were a continuous signal, another label would have to be used, such as: time , distance , x , etc.
The two parameters that form a signal are generally not interchangeable.  The
parameter on the y-axis (the dependent variable) is said to be a function of the
parameter on the x-axis (the independent variable).  In other words, the
independent variable describes how or when each sample is taken, while the
dependent variable is the actual measurement.  Given a specific value on the
x-axis, we can always find the corresponding value on the y-axis, but usually
not the other way around.
Pay particular attention to the word: domain , a very widely used term in DSP.
For instance, a signal that uses time as the independent variable (i.e., the
parameter on the horizontal axis), is said to be in the time domain .  Another
common signal in DSP uses frequency as the independent variable, resulting in
the term, frequency domain .  Likewise, signals that use distance as the
independent parameter are said to be in the spatial domain (distance is a
measure of space).  The type of parameter on the horizontal axis is the domain
of the signal; it's that simple.  What if the x-axis is labeled with something
very generic, such as sample number ?  Authors commonly refer to these signals
as being in the time domain.  This is because sampling at equal intervals of
time is the most common way of obtaining signals, and they don't have anything
more specific to call it.
Although the signals in Fig. 2-1 are discrete, they are displayed in this figure
as continuous lines.  This is because there are too many samples to be
distinguishable if they were displayed as individual markers.  In graphs that
portray shorter signals, say less than 100 samples, the individual markers are
usually shown.  Continuous lines may or may not be drawn to connect the
markers, depending on how the author wants you to view the data.  For
instance, a continuous line could imply what is happening between samples, or
simply be an aid to help the reader's eye follow a trend in noisy data.   The
point is, examine the labeling of the horizontal axis to find if you are working
with a discrete or continuous signal.  Don't rely on an illustrator's ability to
draw dots.
The variable, N, is widely used in DSP to represent the total number of
samples in a signal.  For example, N = 512 for the signals in Fig. 2-1.  To
keep the data organized, each sample is assigned a sample number or
index.  These are the numbers that appear along the horizontal axis.  Two
notations for assigning sample numbers are commonly used.  In the first
notation, the sample indexes run from 1 to N (e.g., 1 to 512).  In the second
notation, the sample indexes run from 0 to N-1 (e.g., 0 to 511).
Mathematicians often use the first method (1 to N), while those in DSP
commonly use the second (0 to N-1).  In this book, we will use the second
notation.  Don't dismiss this as a trivial problem.  It will confuse you
sometime during your career.  Look out for it!
FIGURE 2-1
Examples of two digitized signals with different means and standard deviations.
Both panels plot Amplitude against Sample number (0 to 511).
a.  Mean = 0.5, σ = 1
b.  Mean = 3.0, σ = 0.2
Mean and Standard Deviation
The mean, indicated by µ (a lower case Greek mu), is the statistician's jargon
for the average value of a signal.  It is found just as you would expect: add all
of the samples together, and divide by N.  It looks like this in mathematical
form:
EQUATION 2-1
Calculation of a signal's mean.  The signal is contained in x_0 through x_{N-1},
i is an index that runs through these values, and µ is the mean.
\mu = \frac{1}{N} \sum_{i=0}^{N-1} x_i
In words, sum the values in the signal, x_i, by letting the index, i, run from 0
to N-1.  Then finish the calculation by dividing the sum by N.  This is
identical to the equation: µ = (x_0 + x_1 + x_2 + ... + x_{N-1}) / N.  If you are
not already familiar with Σ (upper case Greek sigma) being used to indicate
summation, study these equations carefully, and compare them with the computer
program in Table 2-1.  Summations of this type are abundant in DSP, and you need
to understand this notation fully.
In electronics, the mean is commonly called the DC (direct current) value.
Likewise, AC (alternating current) refers to how the signal fluctuates around
the mean value.  If the signal is a simple repetitive waveform, such as a sine
or square wave, its excursions can be described by its peak-to-peak amplitude.
Unfortunately, most acquired signals do not show a well defined peak-to-peak
value, but have a random nature, such as the signals in Fig. 2-1.  A more
generalized method must be used in these cases, called the standard
deviation, denoted by σ (a lower case Greek sigma).
As a starting point, the expression, |x_i - µ|, describes how far the i-th sample
deviates (differs) from the mean.  The average deviation of a signal is found
by summing the deviations of all the individual samples, and then dividing by
the number of samples, N.  Notice that we take the absolute value of each
deviation before the summation; otherwise the positive and negative terms
would average to zero.  The average deviation provides a single number
representing the typical distance that the samples are from the mean.  While
convenient and straightforward, the average deviation is almost never used in
statistics.  This is because it doesn't fit well with the physics of how signals
operate.  In most cases, the important parameter is not the deviation from the
mean, but the power represented by the deviation from the mean.  For example,
when random noise signals combine in an electronic circuit, the resultant noise
is equal to the combined power of the individual signals, not their combined
amplitude.
The standard deviation is similar to the average deviation, except the
averaging is done with power instead of amplitude.  This is achieved by
squaring each of the deviations before taking the average (remember, power ∝
voltage^2).  To finish, the square root is taken to compensate for the initial
squaring.  In equation form, the standard deviation is calculated:
EQUATION 2-2
Calculation of the standard deviation of a signal.  The signal is stored in x_i,
µ is the mean found from Eq. 2-1, N is the number of samples, and σ is the
standard deviation.
\sigma^2 = \frac{1}{N-1} \sum_{i=0}^{N-1} (x_i - \mu)^2
In the alternative notation:
σ = sqrt[ ((x_0 - µ)^2 + (x_1 - µ)^2 + ... + (x_{N-1} - µ)^2) / (N-1) ].
Notice that the average is carried out by dividing by N-1 instead of N.  This
is a subtle feature of the equation that will be discussed in the next section.
The term, σ^2, occurs frequently in statistics and is given the name variance.
The standard deviation is a measure of how far the signal fluctuates from the
mean.  The variance represents the power of this fluctuation.   Another term
you should become familiar with is the rms (root-mean-square) value,
frequently used in electronics.  By definition, the standard deviation only
measures the AC portion of a signal, while the rms value measures both the AC
and DC components.  If a signal has no DC component, its rms value is
identical to its standard deviation.  Figure 2-2 shows the relationship between
the standard deviation and the peak-to-peak value of several common
waveforms.
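As a small worked illustration (the four sample values are made up for this
example and do not come from Fig. 2-1), take the signal x = 1, 3, 5, 7 with N = 4:
  mean:                µ = (1 + 3 + 5 + 7) / 4 = 4
  deviations:          x_i - µ = -3, -1, 1, 3
  average deviation:   (3 + 1 + 1 + 3) / 4 = 2
  variance:            σ^2 = (9 + 1 + 1 + 9) / (4 - 1) = 20/3 ≈ 6.67
  standard deviation:  σ = sqrt(20/3) ≈ 2.58
  rms value:           sqrt((1 + 9 + 25 + 49) / 4) = sqrt(21) ≈ 4.58
The rms value exceeds the standard deviation here because the signal has a
nonzero mean (a DC component of 4).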
FIGURE 2-2
Ratio of the peak-to-peak amplitude to the standard deviation for several common
waveforms.  For the square wave, this ratio is 2; for the triangle wave it is
sqrt(12) = 3.46; for the sine wave it is 2 sqrt(2) = 2.83.  While random noise
has no exact peak-to-peak value, it is approximately 6 to 8 times the standard
deviation.
a.  Square wave, Vpp = 2 σ
b.  Triangle wave, Vpp = sqrt(12) σ
c.  Sine wave, Vpp = 2 sqrt(2) σ
d.  Random noise, Vpp ≈ 6-8 σ
100 'CALCULATION OF THE MEAN AND STANDARD DEVIATION
110 '
120 DIM X[511] 'The signal is held in X[0] to X[511]
130 N% = 512 'N% is the number of points in the signal
140 '
150 GOSUB XXXX 'Mythical subroutine that loads the signal into X[ ]
160 '
170 MEAN = 0 'Find the mean via Eq. 2-1
180 FOR I% = 0 TO N%-1
190   MEAN = MEAN + X[I%]
200 NEXT I%
210 MEAN = MEAN/N%
220 '
230 VARIANCE = 0 'Find the standard deviation via Eq. 2-2
240 FOR I% = 0 TO N%-1
250   VARIANCE = VARIANCE + ( X[I%] - MEAN )^2
260 NEXT I%
270 VARIANCE = VARIANCE/(N%-1)
280 SD = SQR(VARIANCE)
290 '
300 PRINT  MEAN  SD 'Print the calculated mean and standard deviation
310 '
320 END
TABLE 2-1
Table 2-1 lists a computer routine for calculating the mean and standard
deviation using Eqs. 2-1 and 2-2.  The programs in this book are intended to
convey algorithms in the most straightforward way; all other factors are
treated as secondary.  Good programming techniques are disregarded if doing so
makes the program logic clearer.  For instance: a simplified version of
BASIC is used, line numbers are included, the only control structure allowed
is the FOR-NEXT loop, there are no I/O statements, etc.  Think of these
programs as an alternative way of understanding the equations used
```
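For readers who would rather experiment in a modern language, the routine in Table 2-1 can be sketched in Python. This is only an illustrative translation, not the book's code: the function names `mean`, `standard_deviation` and `rms`, and the ±1 square-wave test signal, are assumptions introduced here. The standard deviation divides by N-1, as in Eq. 2-2.

```python
import math


def mean(x):
    """Eq. 2-1: add all of the samples together and divide by N."""
    return sum(x) / len(x)


def standard_deviation(x):
    """Eq. 2-2: average the squared deviations over N-1, then take the root."""
    n = len(x)
    if n < 2:
        raise ValueError("at least two samples are needed to divide by N-1")
    mu = mean(x)
    variance = sum((xi - mu) ** 2 for xi in x) / (n - 1)
    return math.sqrt(variance)


def rms(x):
    """Root-mean-square value: includes both the AC and DC parts of the signal."""
    return math.sqrt(sum(xi * xi for xi in x) / len(x))


# Hypothetical test signal: a square wave alternating between +1 and -1,
# with N = 512 samples and a peak-to-peak amplitude of 2.
signal = [1.0, -1.0] * 256

# The same square wave with a DC offset of 3 added to every sample.
offset = [xi + 3.0 for xi in signal]

if __name__ == "__main__":
    for name, data in (("no offset", signal), ("offset by 3", offset)):
        print(name, mean(data), standard_deviation(data), rms(data))
```

For the square wave without offset, the standard deviation comes out very close to 1, consistent with the ratio Vpp = 2σ quoted in Fig. 2-2. Adding the offset changes the mean and the rms value but leaves the standard deviation unchanged, which matches the statement that the standard deviation measures only the AC portion of a signal.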