# Numerical Algorithms & Errors: Computer Science Engineering (CSE) Notes | EduRev

CHAPTER 1 (4 LECTURES)
NUMERICAL ALGORITHMS AND ERRORS

1. Numerical analysis

Numerical analysis is the branch of mathematics that studies and develops algorithms which use numerical approximation for the problems of mathematical analysis (continuous mathematics). Numerical techniques are widely used by scientists and engineers to solve their problems. A major advantage of numerical techniques is that a numerical answer can be obtained even when a problem has no analytical solution. The result of a numerical computation is, in general, an approximation, but one that can be made as accurate as desired, for example when finding approximate values of √2, π, etc.

In this chapter, we introduce and discuss some basic concepts of scientific computing. We begin with a discussion of floating-point representation, and then discuss the most fundamental source of imperfection in numerical computing, namely roundoff errors. We also discuss sources of error and the stability of numerical algorithms.
2. Numerical analysis and the art of scientific computing

Scientific computing is a discipline concerned with the development and study of numerical algorithms for solving mathematical problems that arise in various disciplines in science and engineering. Typically, the starting point is a given mathematical model, formulated in an attempt to explain and understand an observed phenomenon in biology, chemistry, physics, economics, or any engineering or scientific discipline. We will concentrate on those mathematical models which are continuous (or piecewise continuous) and are difficult or impossible to solve analytically; this is usually the case in practice. Relevant application areas within computer science include graphics, vision and motion analysis, image and signal processing, search engines and data mining, machine learning, hybrid and embedded systems, and many more. In order to solve such a model approximately on a computer, the (continuous, or piecewise continuous) problem is approximated by a discrete one. Continuous functions are approximated by finite arrays of values. Algorithms are then sought which approximately solve the mathematical problem efficiently, accurately, and reliably.
3. Floating-point representation of numbers

Any real number is represented by an infinite sequence of digits. For example,

8/3 = 2.66666... = (2 × 10^{-1} + 6 × 10^{-2} + 6 × 10^{-3} + ...) × 10^1.

This is an infinite series, but a computer uses a finite amount of memory to represent numbers. Thus only a finite number of digits may be used to represent any number, no matter what representation method is used.

For example, we can chop the infinite decimal representation of 8/3 after 4 digits:

8/3 ≈ (2 × 10^{-1} + 6 × 10^{-2} + 6 × 10^{-3} + 6 × 10^{-4}) × 10^1 = 0.2666 × 10^1.
Generalizing this, we say that such a number has n decimal digits, and we call n the precision.

With each real number x we associate a floating-point representation, denoted fl(x), given by

fl(x) = ±(0.a_1 a_2 ... a_n)_β × β^e,

where the base-β fraction is called the mantissa, the a_i are integers with 0 ≤ a_i ≤ β − 1, and e is known as the exponent. This is called the base-β floating-point representation of x.
For example,

42.965 = 4 × 10^1 + 2 × 10^0 + 9 × 10^{-1} + 6 × 10^{-2} + 5 × 10^{-3} = 0.42965 × 10^2.
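As a quick sanity check of this digit expansion, we can rebuild the number from its (digit, power) pairs with exact rational arithmetic; the list layout below is our own illustration:

```python
from fractions import Fraction

# Reconstruct 42.965 from its base-10 digit expansion, exactly.
digits = [(4, 1), (2, 0), (9, -1), (6, -2), (5, -3)]   # (digit, power of 10)
value = sum(Fraction(d) * Fraction(10) ** p for d, p in digits)

print(value == Fraction(42965, 1000))   # True: the expansion equals 0.42965 * 10**2
print(float(value))                     # 42.965
```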
0:00234 =0:234 10
2
:
0 is written as 0:00::: 0
e
. Likewise, we can use for binary number system and any real x can be
written
x =q 2
m
with
1
2
q 1 and some integer m. Both q and m will be expressed in terms of binary numbers.
For example,
1001:1101 = 1 2
3
+ 2 2
0
+ 1 2
1
+ 1 2
2
+ 1 2
4
= (9:8125)
10
:
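Python's standard library exposes exactly this (q, m) decomposition via `math.frexp`, which returns a mantissa in [1/2, 1) and an integer exponent, so the binary example can be checked directly:

```python
import math

x = 0b10011101 / 2**4    # the bit pattern 1001.1101 read as a binary fraction
print(x)                 # 9.8125

q, m = math.frexp(x)     # x == q * 2**m with 0.5 <= q < 1
print(q, m)              # 0.61328125 4
assert x == q * 2**m
```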
Remark: The above representation is not unique. For example, 0.2666 × 10^1 = 0.02666 × 10^2.

Definition 3.1 (Normal form). A non-zero floating-point number is in normal form if the value of its mantissa lies in (−1, −1/β] or [1/β, 1).

Therefore, we normalize the representation by requiring a_1 ≠ 0. Not only is the precision limited to a finite number of digits; the range of the exponent is also restricted: there are integers m and M such that m ≤ e ≤ M.
3.1. Rounding and chopping. Let x be any real number and let fl(x) be its machine approximation. There are two ways to do the "cutting" needed to store the real number

x = ±(0.a_1 a_2 ... a_n a_{n+1} ...)_β × β^e,   a_1 ≠ 0.

(1) Chopping: we ignore the digits after a_n and write

fl(x) = (0.a_1 a_2 ... a_n)_β × β^e.

(2) Rounding: rounding is defined as follows:

fl(x) = (0.a_1 a_2 ... a_n)_β × β^e                        if 0 ≤ a_{n+1} < β/2 (rounding down),
fl(x) = [(0.a_1 a_2 ... a_n)_β + (0.00...01)_β] × β^e      if β/2 ≤ a_{n+1} < β (rounding up).

Example 1. With a two-digit mantissa (n = 2),

fl(6/7) = 0.86 × 10^0 (rounding),   fl(6/7) = 0.85 × 10^0 (chopping).
Rules for rounding off numbers:
(1) If the digit to be dropped is greater than 5, the last retained digit is increased by one. For example, 12.6 is rounded to 13.
(2) If the digit to be dropped is less than 5, the last retained digit is left as it is. For example, 12.4 is rounded to 12.
(3) If the digit to be dropped is 5, and any digit following it is nonzero, the last retained digit is increased by one. For example, 12.51 is rounded to 13.
(4) If the digit to be dropped is 5 and is followed only by zeros, the last retained digit is increased by one if it is odd, but left as it is if even. For example, 11.5 is rounded to 12, and 12.5 is rounded to 12.
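Rule (4) is the round-half-to-even ("banker's rounding") convention, which happens to be what Python's built-in `round` implements, so the examples above can be replayed directly:

```python
# Python's round() uses round-half-to-even, matching rule (4) above.
print(round(12.6))   # 13  (rule 1: dropped digit > 5)
print(round(12.4))   # 12  (rule 2: dropped digit < 5)
print(round(12.51))  # 13  (rule 3: 5 followed by a nonzero digit)
print(round(11.5))   # 12  (rule 4: half, last retained digit odd -> round up)
print(round(12.5))   # 12  (rule 4: half, last retained digit even -> keep)
```

The half-way cases 11.5 and 12.5 are exactly representable in binary floating point, so the tie-breaking rule genuinely kicks in here rather than being masked by representation error.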
Denition 3.2 (Absolute and relative error). If fl(x) is the approximation to the exact value x, then
the absolute error isjxfl(x)j; and relative error is
jxfl(x)j
jxj
:
Remark: As a measure of accuracy, the absolute error may be misleading and the relative error is more
meaningful.
Denition 3.3 (Over
ow and under
ow). An over
ow is obtained when a number is too large to t
into the
oating point system in use, i.e e>M. An under
ow is obtained when a number is too small,
i.ee<m . When over
ow occurs in the course of a calculation, this is generally fatal. But under
ow
is non-fatal: the system usually sets the number to 0 and continues. (Matlab does this, quietly.)
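Python floats are IEEE double precision, so both phenomena are easy to provoke; this small demo (our own illustration, not from the notes) shows overflow producing infinity and underflow quietly flushing to zero:

```python
import sys

# Overflow: exceed the largest representable double (about 1.8e308) -> infinity.
big = sys.float_info.max
print(big * 2)        # inf

# Underflow: go below the smallest positive subnormal (about 5e-324) -> 0.0.
tiny = 5e-324         # smallest positive subnormal double
print(tiny / 2)       # 0.0  -- quietly set to zero, as described above
```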
4. Errors in numerical approximations

Let x be any real number we want to represent in a computer, and let fl(x) be the representation of x in the computer. What, then, is the largest possible value of |x − fl(x)| / |x|? In the worst case, how much are we losing due to roundoff or chopping errors?
Chopping errors: Let

x = (0.a_1 a_2 ... a_n a_{n+1} ...)_β × β^e = (Σ_{i=1}^{∞} a_i β^{-i}) β^e,   a_1 ≠ 0,

fl(x) = (0.a_1 a_2 ... a_n)_β × β^e = (Σ_{i=1}^{n} a_i β^{-i}) β^e.

Therefore

|x − fl(x)| = (Σ_{i=n+1}^{∞} a_i β^{-i}) β^e,

that is,

β^{-e} |x − fl(x)| = Σ_{i=n+1}^{∞} a_i β^{-i}.

Now since each a_i ≤ β − 1,

β^{-e} |x − fl(x)| ≤ Σ_{i=n+1}^{∞} (β − 1) β^{-i} = (β − 1) (β^{-(n+1)} + β^{-(n+2)} + ...) = (β − 1) β^{-(n+1)} / (1 − β^{-1}) = β^{-n}.

Therefore the absolute error bound is

E_a = |x − fl(x)| ≤ β^{e−n}.

Now

|x| = (0.a_1 a_2 ...)_β × β^e ≥ (1/β) β^e,

so the relative error bound is

E_r = |x − fl(x)| / |x| ≤ β^{-n} β^e / (β^{-1} β^e) = β^{1−n}.
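The chopping bound E_r ≤ β^{1−n} can be checked empirically. Below, a small experiment with β = 10, n = 4 over random positive rationals; `fl_chop` is our own helper implementing the chopping defined in section 3.1, not code from the notes:

```python
import random
from fractions import Fraction

def fl_chop(x: Fraction, n: int) -> Fraction:
    """n-digit base-10 chopping of a positive number (mantissa normalized to [1/10, 1))."""
    e = 0
    while x >= 1:
        x /= 10
        e += 1
    while x < Fraction(1, 10):
        x *= 10
        e -= 1
    return Fraction(int(x * 10**n), 10**n) * Fraction(10) ** e

beta, n = 10, 4
bound = Fraction(beta) ** (1 - n)        # relative error bound beta**(1-n) = 1/1000
random.seed(0)
for _ in range(1000):
    x = Fraction(random.randint(1, 10**9), random.randint(1, 10**9))
    rel_err = abs(x - fl_chop(x, n)) / x
    assert rel_err <= bound              # the bound is never violated
print("chopping bound", float(bound))    # 0.001
```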
Rounding errors: For rounding,

fl(x) = (0.a_1 a_2 ... a_n)_β × β^e = (Σ_{i=1}^{n} a_i β^{-i}) β^e                          if 0 ≤ a_{n+1} < β/2,
fl(x) = (0.a_1 a_2 ... a_{n−1} [a_n + 1])_β × β^e = (β^{-n} + Σ_{i=1}^{n} a_i β^{-i}) β^e   if β/2 ≤ a_{n+1} < β.

For 0 ≤ a_{n+1} < β/2,

β^{-e} |x − fl(x)| = Σ_{i=n+1}^{∞} a_i β^{-i} = a_{n+1} β^{-(n+1)} + Σ_{i=n+2}^{∞} a_i β^{-i}
                   ≤ (β/2 − 1) β^{-(n+1)} + Σ_{i=n+2}^{∞} (β − 1) β^{-i}
                   = (β/2 − 1) β^{-(n+1)} + β^{-(n+1)}
                   = (1/2) β^{-n}.
For =2a
n+1
<;

e
jxfl(x)j =

1
X
i=n+1
a
i

i

1

n

=

1

n

1
X
i=n+1
a
i

i

=

1

n

a
n+1

n+1

1
X
i=n+2
a
i

i



1

n

a
n+1

n+1

Sincea
n+1
=2; therefore

e
jxfl(x)j 

1

n

=2

n+1

=
1
2

n
:
Therefore, in both cases the absolute error bound is

E_a = |x − fl(x)| ≤ (1/2) β^{e−n},

and the relative error bound is

E_r = |x − fl(x)| / |x| ≤ (1/2) β^{-n} β^e / (β^{-1} β^e) = (1/2) β^{1−n}.
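The same experiment as before, now with rounding, exhibits the halved bound (1/2)β^{1−n} (again β = 10, n = 4; `fl_round` is our own helper implementing the rounding of section 3.1 for positive numbers):

```python
import random
from fractions import Fraction

def fl_round(x: Fraction, n: int) -> Fraction:
    """n-digit base-10 rounding of a positive number (mantissa normalized to [1/10, 1))."""
    e = 0
    while x >= 1:
        x /= 10
        e += 1
    while x < Fraction(1, 10):
        x *= 10
        e -= 1
    scaled = x * 10**n
    kept = int(scaled) + (1 if scaled - int(scaled) >= Fraction(1, 2) else 0)
    return Fraction(kept, 10**n) * Fraction(10) ** e

beta, n = 10, 4
bound = Fraction(1, 2) * Fraction(beta) ** (1 - n)   # (1/2) * beta**(1-n) = 1/2000
random.seed(1)
for _ in range(1000):
    x = Fraction(random.randint(1, 10**9), random.randint(1, 10**9))
    rel_err = abs(x - fl_round(x, n)) / x
    assert rel_err <= bound                          # the halved bound holds
print("rounding bound", float(bound))                # 0.0005
```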
5. Significant Figures

All measurements are approximations; no measuring device can give perfect measurements without experimental uncertainty. By convention, a mass measured as 13.2 g is said to have an absolute uncertainty of plus or minus 0.1 g, and to have been measured to the nearest 0.1 g. In other words, we are somewhat uncertain about that last digit: it could be a "2", but it could also be a "1" or a "3". A mass of 13.20 g indicates an absolute uncertainty of plus or minus 0.01 g.

The number of significant figures in a result is simply the number of figures that are known with some degree of reliability. The number 25.4 is said to have 3 significant figures; the number 25.40 is said to have 4.

Rules for deciding the number of significant figures in a measured quantity:
(1) All nonzero digits are significant: 1.234 has 4 significant figures, 1.2 has 2 significant figures.
(2) Zeros between nonzero digits are significant: 1002 has 4 significant figures.
(3) Leading zeros to the left of the first nonzero digit are not significant; such zeros merely indicate the position of the decimal point: 0.001 has only 1 significant figure.
(4) Trailing zeros that are to the right of a decimal point are significant: 0.0230 has 3 significant figures.
(5) When a number ends in zeros that are not to the right of a decimal point, the zeros are not necessarily significant: 190 may have 2 or 3 significant figures; 50600 may have 3, 4, or 5 significant figures.
The potential ambiguity in the last rule can be avoided by the use of standard exponential, or "scientific," notation. For example, depending on whether the number of significant figures is 3, 4, or 5, we would write 50600 calories as:
0.506 × 10^5 (3 significant figures),
0.5060 × 10^5 (4 significant figures), or
0.50600 × 10^5 (5 significant figures).
What is an "exact number"? Some numbers are exact because they are known with complete certainty. Most exact numbers are integers: exactly 12 inches are in a foot, there might be exactly 23 students in
Page 5

CHAPTER 1 (4 LECTURES)
NUMERICAL ALGORITHMS AND ERRORS
1. Numerical analysis
Numerical analysis is the branch of mathematics which study and develop the algorithms that
use numerical approximation for the problems of mathematical analysis (continuous mathematics).
Numerical technique is widely used by scientists and engineers to solve their problems. A major
advantage for numerical technique is that a numerical answer can be obtained even when a problem
has no analytical solution. However, result from numerical analysis is an approximation, in general,
which can be made as accurate as desired. For example to nd the approximate values of
p
2,  etc.
In this chapter, we introduce and discuss some basic concepts of scientic computing. We begin
with discussion of
oating-point representation and then we discuss the most fundamental source of
imperfection in numerical computing namely roundo errors. We also discuss source of errors and then
stability of numerical algorithms.
2. Numerical analysis and the art of scientific computing
Scientic computing is a discipline concerned with the development and study of numerical algo-
rithms for solving mathematical problems that arise in various disciplines in science and engineering.
Typically, the starting point is a given mathematical model which has been formulated in an attempt
to explain and understand an observed phenomenon in biology, chemistry, physics, economics, or any
engineering or scientic discipline. We will concentrate on those mathematical models which are con-
tinuous (or piecewise continuous) and are dicult or impossible to solve analytically: this is usually
the case in practice. Relevant application areas within computer science include graphics, vision and
motion analysis, image and signal processing, search engines and data mining, machine learning, hybrid
and embedded systems, and many more. In order to solve such a model approximately on a computer,
the (continuous, or piecewise continuous) problem is approximated by a discrete one. Continuous func-
tions are approximated by nite arrays of values. Algorithms are then sought which approximately
solve the mathematical problem eciently, accurately and reliably.
3. Floating-point representation of numbers
Any real number is represented by an innite sequence of digits. For example
8
3
= 2:66666 =

2
10
1
+
6
10
2
+
6
10
3
+:::

 10
1
:
This is an innite series, but computer use an nite amount of memory to represent numbers. Thus
only a nite number of digits may be used to represent any number, no matter by what representation
method.
For example, we can chop the innite decimal representation of
8
3
after 4 digits,
8
3
= (
2
10
1
+
6
10
2
+
6
10
3
+
6
10
4
) 10
1
= 0:2666 10
1
:
Generalizing this, we say that number has n decimal digits and call this n as precision.
For each real number x, we associate a
oating point representation denoted by fl(x); given by
fl(x) =(0:a
1
a
2
:::a
n
)


e
;
here  based fraction is called mantissa with all a
i
integers and e is known as exponent. This repre-
sentation is called based
oating point representation of x.
For example,
42:965 = 4 10
1
+ 2 10
0
+ 9 10
1
+ 6 10
2
+ 5 10
3
= 42965 10
2
:
1
2 NUMERICAL ALGORITHMS AND ERRORS
0:00234 =0:234 10
2
:
0 is written as 0:00::: 0
e
. Likewise, we can use for binary number system and any real x can be
written
x =q 2
m
with
1
2
q 1 and some integer m. Both q and m will be expressed in terms of binary numbers.
For example,
1001:1101 = 1 2
3
+ 2 2
0
+ 1 2
1
+ 1 2
2
+ 1 2
4
= (9:8125)
10
:
Remark: The above representation is not unique.
For example, 0:2666 10
1
= 0:02666 10
2
.
Denition 3.1 (Normal form). A non-zero
oating-point number is in normal form if the values of
mantissa lies in (1;
1

] or [
1

; 1).
Therefore, we normalize the representation by a
1
6= 0: Not only the precision is limited to a nite
number of digits, but also the range of exponent is also restricted. Thus there are integers m and M
such thatmeM.
3.1. Rounding and chopping. Let x be any real number and fl(x) be its machine approximation.
There are two ways to do the \cutting" to store a real number
x =(0:a
1
a
2
:::a
n
a
n+1
::: )


e
; a
1
6= 0:
(1) Chopping: We ignore digits after a
n
and write the number as following in chopping
fl(x) = (:a
1
a
2
:::a
n
)

e
:
(2) Rounding: Rounding is dened as following
fl(x) =

(0:a
1
a
2
:::a
n
)


e
; 0a
n+1
<=2 (rounding down)
(0:a
1
a
2
:::a
n
)

+ (0:00::: 01)


e
; =2a
n+1
< (rounding up):
Example 1.
fl

6
7

=

0:86 10
0
(rounding)
0:85 10
0
(chopping):
Rules for rounding o numbers:
(1) If the digit to be dropped is greater than 5, the last retained digit is increased by one. For example,
12.6 is rounded to 13.
(2) If the digit to be dropped is less than 5, the last remaining digit is left as it is. For example,
12.4 is rounded to 12.
(3) If the digit to be dropped is 5, and if any digit following it is not zero, the last remaining digit is
increased by one. For example,
12.51 is rounded to 13.
(4) If the digit to be dropped is 5 and is followed only by zeros, the last remaining digit is increased
by one if it is odd, but left as it is if even. For example,
11.5 is rounded to 12, and 12.5 is rounded to 12.
Denition 3.2 (Absolute and relative error). If fl(x) is the approximation to the exact value x, then
the absolute error isjxfl(x)j; and relative error is
jxfl(x)j
jxj
:
Remark: As a measure of accuracy, the absolute error may be misleading and the relative error is more
meaningful.
Denition 3.3 (Over
ow and under
ow). An over
ow is obtained when a number is too large to t
into the
oating point system in use, i.e e>M. An under
ow is obtained when a number is too small,
i.ee<m . When over
ow occurs in the course of a calculation, this is generally fatal. But under
ow
is non-fatal: the system usually sets the number to 0 and continues. (Matlab does this, quietly.)
4. Errors in numerical approximations
Let x be any real number we want to represent in a computer, and let fl(x) be the representation of x
in the computer. What is the largest possible value of |x − fl(x)| / |x|? In the worst case, how much
data are we losing due to round-off or chopping errors?
Chopping errors: Let

x = (0.a_1 a_2 ... a_n a_{n+1} ...)_β × β^e = (Σ_{i=1}^∞ a_i β^{−i}) β^e,   a_1 ≠ 0,

fl(x) = (0.a_1 a_2 ... a_n)_β × β^e = (Σ_{i=1}^n a_i β^{−i}) β^e.
Therefore

|x − fl(x)| = (Σ_{i=n+1}^∞ a_i β^{−i}) β^e,

β^{−e} |x − fl(x)| = Σ_{i=n+1}^∞ a_i β^{−i}.

Now since each a_i ≤ β − 1, therefore

β^{−e} |x − fl(x)| ≤ Σ_{i=n+1}^∞ (β − 1) β^{−i} = (β − 1) [β^{−(n+1)} + β^{−(n+2)} + ...]
                  = (β − 1) [β^{−(n+1)} / (1 − β^{−1})] = β^{−n}.

Therefore the absolute error bound is

E_a = |x − fl(x)| ≤ β^{e−n}.
Now

|x| = (0.a_1 a_2 ... a_n ...)_β × β^e ≥ (1/β) β^e = β^{e−1}.

Therefore the relative error bound is

E_r = |x − fl(x)| / |x| ≤ β^{−n} β^e / (β^{−1} β^e) = β^{1−n}.
Rounding errors: For rounding,

fl(x) = (0.a_1 a_2 ... a_n)_β × β^e = (Σ_{i=1}^n a_i β^{−i}) β^e,                          if 0 ≤ a_{n+1} < β/2,

fl(x) = (0.a_1 a_2 ... a_{n−1} [a_n + 1])_β × β^e = (β^{−n} + Σ_{i=1}^n a_i β^{−i}) β^e,   if β/2 ≤ a_{n+1} < β.
For 0 ≤ a_{n+1} < β/2,

β^{−e} |x − fl(x)| = Σ_{i=n+1}^∞ a_i β^{−i} = a_{n+1} β^{−(n+1)} + Σ_{i=n+2}^∞ a_i β^{−i}
                  ≤ (β/2 − 1) β^{−(n+1)} + Σ_{i=n+2}^∞ (β − 1) β^{−i}
                  = (β/2 − 1) β^{−(n+1)} + β^{−(n+1)} = (1/2) β^{−n}.
For =2a
n+1
<;

e
jxfl(x)j =

1
X
i=n+1
a
i

i

1

n

=

1

n

1
X
i=n+1
a
i

i

=

1

n

a
n+1

n+1

1
X
i=n+2
a
i

i



1

n

a
n+1

n+1

Sincea
n+1
=2; therefore

e
jxfl(x)j 

1

n

=2

n+1

=
1
2

n
:
Therefore, for both cases the absolute error bound is

E_a = |x − fl(x)| ≤ (1/2) β^{e−n}.

Also, the relative error bound is

E_r = |x − fl(x)| / |x| ≤ (1/2) β^{−n} β^e / (β^{−1} β^e) = (1/2) β^{1−n}.
5. Significant Figures
All measurements are approximations. No measuring device can give perfect measurements without
experimental uncertainty. By convention, a mass measured to 13.2 g is said to have an absolute
uncertainty of plus or minus 0.1 g and is said to have been measured to the nearest 0.1 g. In other
words, we are somewhat uncertain about that last digit: it could be a "2"; then again, it could be a
"1" or a "3". A mass of 13.20 g indicates an absolute uncertainty of plus or minus 0.01 g.
The number of significant figures in a result is simply the number of figures that are known with some
degree of reliability.
The number 25.4 is said to have 3 significant figures. The number 25.40 is said to have 4 significant
figures.
Rules for deciding the number of significant figures in a measured quantity:
(1) All nonzero digits are significant:
1.234 has 4 significant figures, 1.2 has 2 significant figures.
(2) Zeros between nonzero digits are significant: 1002 has 4 significant figures.
(3) Leading zeros to the left of the first nonzero digits are not significant; such zeros merely indicate
the position of the decimal point: 0.001 has only 1 significant figure.
(4) Trailing zeros that are also to the right of a decimal point in a number are significant: 0.0230 has
3 significant figures.
(5) When a number ends in zeros that are not to the right of a decimal point, the zeros are not neces-
sarily significant: 190 may be 2 or 3 significant figures, 50600 may be 3, 4, or 5 significant figures.
The potential ambiguity in the last rule can be avoided by the use of standard exponential, or "scien-
tific," notation. For example, depending on whether the number of significant figures is 3, 4, or 5, we
would write 50600 calories as:
0.506 × 10^5 (3 significant figures),
0.5060 × 10^5 (4 significant figures), or
0.50600 × 10^5 (5 significant figures).
What is an "exact number"? Some numbers are exact because they are known with complete certainty.
Most exact numbers are integers: exactly 12 inches are in a foot, there might be exactly 23 students in
a class. Exact numbers are often found as conversion factors or as counts of objects. Exact numbers
can be considered to have an infinite number of significant figures. Thus, the number of apparent
significant figures in any exact number can be ignored as a limiting factor in determining the number
of significant figures in the result of a calculation.
6. Rules for mathematical operations
In carrying out calculations, the general rule is that the accuracy of a calculated result is limited
by the least accurate measurement involved in the calculation. In addition and subtraction, the result
is rounded off so that it has the same number of digits as the measurement having the fewest decimal
places (counting from left to right). For example,
100 (assume 3 significant figures) + 23.643 (5 significant figures) = 123.643,
which should be rounded to 124 (3 significant figures). Note, however, that it is possible for two numbers
to have no common digits (significant figures in the same digit column).
In multiplication and division, the result should be rounded off so as to have the same number of
significant figures as the component with the least number of significant figures. For example,
3.0 (2 significant figures) × 12.60 (4 significant figures) = 37.8000,
which should be rounded to 38 (2 significant figures).
Let X = f(x_1, x_2, ..., x_n) be a function of n variables. We want to determine the error ΔX in X due to
the errors Δx_1, Δx_2, ..., Δx_n in x_1, x_2, ..., x_n, respectively:

X + ΔX = f(x_1 + Δx_1, x_2 + Δx_2, ..., x_n + Δx_n).
Error in addition of numbers. Let X = x_1 + x_2 + ... + x_n.
Therefore

X + ΔX = (x_1 + Δx_1) + (x_2 + Δx_2) + ... + (x_n + Δx_n)
       = (x_1 + x_2 + ... + x_n) + (Δx_1 + Δx_2 + ... + Δx_n).

Therefore the absolute error satisfies

|ΔX| ≤ |Δx_1| + |Δx_2| + ... + |Δx_n|.

Dividing by X we get

|ΔX / X| ≤ |Δx_1 / X| + |Δx_2 / X| + ... + |Δx_n / X|,

which is the maximum relative error. Therefore it shows that when the given numbers are added,
the magnitude of the absolute error in the result is bounded by the sum of the magnitudes of the
absolute errors of the components.
Error in subtraction of numbers. As in the case of addition, we can obtain the maximum absolute
error for subtraction of two numbers. Let X = x_1 − x_2. Then

|ΔX| ≤ |Δx_1| + |Δx_2|.

Also

|ΔX / X| ≤ |Δx_1 / X| + |Δx_2 / X|,

which is the maximum relative error in subtraction of numbers.
Error in product of numbers. Let X = x_1 x_2 ... x_n. Then, using the general formula for the error,

ΔX = Δx_1 ∂X/∂x_1 + Δx_2 ∂X/∂x_2 + ... + Δx_n ∂X/∂x_n.

We have

ΔX / X = (Δx_1 / X) ∂X/∂x_1 + (Δx_2 / X) ∂X/∂x_2 + ... + (Δx_n / X) ∂X/∂x_n.

Now

(1/X) ∂X/∂x_1 = x_2 x_3 ... x_n / (x_1 x_2 x_3 ... x_n) = 1/x_1,

(1/X) ∂X/∂x_2 = x_1 x_3 ... x_n / (x_1 x_2 x_3 ... x_n) = 1/x_2.
```
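The chopping and rounding schemes of Section 3.1 can be sketched in a few lines of Python. This is a minimal illustration, not a library routine: the helper names `chop` and `round_fl` are made up here, base β = 10 is assumed, and the exponent is recovered from the logarithm, which is adequate for the sample values shown.

```python
import math

BETA = 10  # base assumed throughout the examples of this section

def chop(x, n):
    """Chop x to n base-10 digits: keep a_1..a_n of x = (0.a_1 a_2 ...) * 10^e."""
    if x == 0:
        return 0.0
    e = math.floor(math.log10(abs(x))) + 1   # exponent e, mantissa in [1/10, 1)
    m = x / BETA**e                          # normalized mantissa
    return math.trunc(m * BETA**n) / BETA**n * BETA**e

def round_fl(x, n):
    """Round x to n base-10 digits: bump the last kept digit when a_{n+1} >= 5."""
    if x == 0:
        return 0.0
    e = math.floor(math.log10(abs(x))) + 1
    m = x / BETA**e
    return math.floor(m * BETA**n + 0.5) / BETA**n * BETA**e

print(chop(6/7, 2))      # 0.85, matching Example 1 (chopping)
print(round_fl(6/7, 2))  # 0.86, matching Example 1 (rounding)
```

Note that both helpers themselves compute in double precision, so they only demonstrate the definitions on values whose two-digit forms are cleanly representable.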
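The overflow and underflow behaviour of Definition 3.3 can be observed directly in IEEE double precision, where `sys.float_info.max` and `sys.float_info.min` play the role of the exponent limits M and m. A small sketch:

```python
import math
import sys

big = sys.float_info.max    # largest finite double, about 1.797e308 (exponent at M)
tiny = sys.float_info.min   # smallest positive normalized double, about 2.225e-308

overflowed = big * 2        # exponent would exceed M
underflowed = tiny * tiny   # exponent far below m, past even the subnormal range

print(overflowed)           # inf -- overflow, generally fatal in a computation
print(underflowed)          # 0.0 -- underflow, quietly set to zero (as in Matlab)
print(math.isinf(overflowed), underflowed == 0.0)
```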
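The rounding bound E_r ≤ (1/2) β^{1−n} derived in Section 4 is exactly the unit roundoff of a machine. The sketch below checks it for IEEE doubles (β = 2, n = 53), using `fractions.Fraction` for exact arithmetic; the choice of x = 1/10 as the test value is just an example of a number with an infinite base-2 expansion.

```python
import sys
from fractions import Fraction

# IEEE double precision is a binary floating point system: beta = 2, n = 53.
beta, n = 2, 53
u = 0.5 * beta ** (1 - n)                # rounding bound (1/2) * beta^(1-n) = 2^-53

# Python exposes beta^(1-n) as the machine epsilon:
print(u == sys.float_info.epsilon / 2)   # True

# Check |x - fl(x)| / |x| <= u for x = 1/10, not representable in base 2.
x = Fraction(1, 10)                      # exact value
fl_x = Fraction(0.1)                     # exact value of the stored double 0.1
rel_err = abs(x - fl_x) / x
print(float(rel_err) <= u)               # True
```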
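The addition and product error formulas of Section 6 can be checked numerically. In this sketch the perturbations Δx_1 = 0.05 and Δx_2 = 0.005 are assumed values, chosen only for illustration; the product check verifies that the first-order rule drops exactly the cross term r_1 r_2.

```python
# Assumed absolute errors dx1, dx2 in two measured values x1, x2.
x1, x2 = 3.0, 12.60
dx1, dx2 = 0.05, 0.005

# Addition: |dX| <= |dx1| + |dx2|
sum_err = abs((x1 + dx1) + (x2 + dx2) - (x1 + x2))
print(sum_err <= abs(dx1) + abs(dx2) + 1e-12)          # True

# Product: dX/X = dx1/x1 + dx2/x2 to first order. The exact relative change
# is (1 + r1)(1 + r2) - 1 = r1 + r2 + r1*r2, so the dropped term is r1*r2.
r1, r2 = dx1 / x1, dx2 / x2
prod_rel = abs((x1 + dx1) * (x2 + dx2) - x1 * x2) / abs(x1 * x2)
print(abs(prod_rel - (r1 + r2)) <= r1 * r2 + 1e-12)    # True
```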