Linear Algebra and Applications (Álgebra Linear e Aplicações)
LINEAR EQUATIONS
Start with a few examples
- One way of designing a reconstruction kernel
- Cubic B-Spline
- Leads to linear equations
- Two ways of computing the sums of squares
- Both lead to linear equations
- One simpler than the other
Desirable properties
- A smooth, piecewise polynomial function
[Plot: the polynomial pieces P3(x), Q3(x) and their reflections P3(−x), Q3(−x) on the interval [−2, 2]]
- Ansatz: P3(x) = ax³ + bx² + cx + d on [0, 1], Q3(x) = ex³ + fx² + gx + h on [1, 2]
- Smoothness and support constraints:
  P3′(0) = 0
  Q3(2) = 0, Q3′(2) = 0, Q3″(2) = 0
  P3(1) = Q3(1), P3′(1) = Q3′(1), P3″(1) = Q3″(1)
- Unit area (by symmetry): ∫₀¹ P3(x) dx + ∫₁² Q3(x) dx = 1/2
Resulting linear equations
- Substituting, we get the linear system
c = 0
8e + 4f + 2g + h = 0
12e + 4f + g = 0
12e + 2f = 0
a + b + c + d − e − f − g − h = 0
3a + 2b + c − 3e − 2f − g = 0
6a + 2b − 6e − 2f = 0
(1/4)a + (1/3)b + (1/2)c + d + (15/4)e + (7/3)f + (3/2)g + h = 1/2
Result: the Cubic B-Spline function
- Solving, we get
P3(x) = x³/2 − x² + 2/3
Q3(x) = −x³/6 + x² − 2x + 4/3
[Plot: the resulting cubic B-Spline, pieced together from P3(x), P3(−x), Q3(−x), Q3(x) on [−2, 2]]
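As a quick check of this result, here is a small Python sketch of the kernel, extended to negative arguments by symmetry (the function name cubic_bspline and the sample checks are illustrative, not part of the slides):

```python
def cubic_bspline(x):
    """Cubic B-Spline: P3 on |x| <= 1, Q3 on 1 <= |x| <= 2, zero elsewhere."""
    ax = abs(x)
    if ax <= 1:
        return ax**3 / 2 - ax**2 + 2/3          # P3(|x|)
    if ax <= 2:
        return -ax**3 / 6 + ax**2 - 2*ax + 4/3  # Q3(|x|)
    return 0.0

# The pieces meet at |x| = 1 with value 1/6 and vanish at |x| = 2.
assert abs(cubic_bspline(1.0) - 1/6) < 1e-12
assert abs(cubic_bspline(2.0)) < 1e-12
```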
Sum of squares
- Find a formula for the sum of squares
- Seems to be a cubic polynomial
- Use first 4 results to solve for a, b, c, and d
S2(n) = Σ_{i=1}^{n} i²
P2(n) = a n³ + b n² + c n + d
P2(0) = d = 0 = S2(0)
P2(1) = a + b + c + d = 1 = S2(1)
P2(2) = 8a + 4b + 2c + d = 5 = S2(2)
P2(3) = 27a + 9b + 3c + d = 14 = S2(3)
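A small NumPy sketch that solves this 4 × 4 system (the variable names are ours):

```python
import numpy as np

# P2(n) = a n^3 + b n^2 + c n + d must match S2(n) for n = 0, 1, 2, 3.
A = np.array([[k**3, k**2, k, 1] for k in range(4)], dtype=float)
rhs = np.array([0.0, 1.0, 5.0, 14.0])        # S2(0), S2(1), S2(2), S2(3)
a, b, c, d = np.linalg.solve(A, rhs)
print(a, b, c, d)   # approximately 1/3, 1/2, 1/6, 0
```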
Even simpler!
- On one hand
- On the other hand
- Equating coefficient by coefficient
P2(n + 1) − P2(n) = 3a n² + (3a + 2b) n + (a + b + c)
S2(n + 1) − S2(n) = (n + 1)² = n² + 2n + 1
Equating: 3a = 1, 3a + 2b = 2, a + b + c = 1, d = 0
P2(n) = n(n + 1)(2n + 1) / 6
Generalizing
- Formulate general problem
- Propose a solution strategy
- Elimination
- Back-substitution
- Propose a convenient notation
- Investigate the cost of solution
Systems of linear equations
- General problem formulation
- aij are the coefficients
- bi is the right-hand side
- xi are the variables or unknowns
- m equations in n unknowns
a11 x1 + a12 x2 + · · · + a1n xn = b1
a21 x1 + a22 x2 + · · · + a2n xn = b2
⋮
am1 x1 + am2 x2 + · · · + amn xn = bm
Solutions to linear equations
- There are only three possibilities
- System has unique solution
- System has no solution
- System has infinitely many solutions
- Easy to understand geometrically
The row-view of linear systems
- Imagine m = 2 and n = 2
- Each equation represents a line in R2
- Planes in 3D, hyper-planes in higher dimensions
a11 x + a12 y = b1
a21 x + a22 y = b2
How to solve a linear system?
- Transform to an equivalent system
- Same solution set
- Easier to solve
- Method known as Gaussian Elimination
Original system:
2x + y + z = 1
6x + 2y + z = −1
−2x + 2y + z = 7

Equivalent triangular system:
2x + y + z = 1
   −y − 2z = −4
      −4z = −4
Elementary operations
- Consider the following row operations:
- Exchange two rows
- Multiply a row by a non-zero constant
- Add a multiple of a row into another
- Can these operations change the solution set?
- Are these operations reversible?
Elimination step-by-step
- Let each row i be denoted by Ei
- Use Ei to eliminate variable i from Ei+1 to En
2x + y + z = 1
6x + 2y + z = −1
−2x + 2y + z = 7

E2 ← E2 − 3E1:
2x + y + z = 1
   −y − 2z = −4
−2x + 2y + z = 7

E3 ← E3 + E1:
2x + y + z = 1
   −y − 2z = −4
   3y + 2z = 8

E3 ← E3 + 3E2:
2x + y + z = 1
   −y − 2z = −4
      −4z = −4
Back substitution
- Start from triangularized system
- Solve for z in E3
- Solve for y in E2
- Solve for x in E1
2x + y + z = 1
   −y − 2z = −4
      −4z = −4

z = 1
y = 4 − 2z = 4 − 2(1) = 2
x = (1 − y − z)/2 = (1 − 2 − 1)/2 = −1
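The whole procedure on this example, as a plain Python sketch without pivoting (adequate here because the pivots never vanish):

```python
# Forward elimination followed by back substitution on the 3x3 example.
A = [[2.0, 1.0, 1.0], [6.0, 2.0, 1.0], [-2.0, 2.0, 1.0]]
b = [1.0, -1.0, 7.0]
n = len(A)

for k in range(n - 1):                    # eliminate column k below the pivot
    for i in range(k + 1, n):
        m = A[i][k] / A[k][k]
        for j in range(k, n):
            A[i][j] -= m * A[k][j]
        b[i] -= m * b[k]

x = [0.0] * n
for i in range(n - 1, -1, -1):            # back substitution
    s = sum(A[i][j] * x[j] for j in range(i + 1, n))
    x[i] = (b[i] - s) / A[i][i]

print(x)   # [-1.0, 2.0, 1.0]
```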
Simplifying notation
- Variable names and operators are redundant
- Coefficient matrix A and right-hand side b
- Augmented matrix [A|b]
- A matrix is a rectangular array of scalars
- A scalar is a real or complex number
System:
2x + y + z = 1
6x + 2y + z = −1
−2x + 2y + z = 7

Augmented matrix [A|b]:
[ 2  1  1 |  1 ]
[ 6  2  1 | −1 ]
[−2  2  1 |  7 ]
Elimination on augmented matrix
[ 2  1  1 |  1 ]
[ 6  2  1 | −1 ]
[−2  2  1 |  7 ]

R2 ← R2 − 3R1, R3 ← R3 + R1:
[ 2  1  1 |  1 ]
[ 0 −1 −2 | −4 ]
[ 0  3  2 |  8 ]

R3 ← R3 + 3R2:
[ 2  1  1 |  1 ]
[ 0 −1 −2 | −4 ]
[ 0  0 −4 | −4 ]

Corresponding system:
2x + y + z = 1
   −y − 2z = −4
      −4z = −4
Back-substitution on augmented matrix
[ 2  1  1 |  1 ]
[ 0 −1 −2 | −4 ]
[ 0  0 −4 | −4 ]

R3 ← R3/(−4):
[ 2  1  1 |  1 ]
[ 0 −1 −2 | −4 ]
[ 0  0  1 |  1 ]

R1 ← R1 − R3, R2 ← R2 + 2R3:
[ 2  1  0 |  0 ]
[ 0 −1  0 | −2 ]
[ 0  0  1 |  1 ]

R2 ← R2/(−1):
[ 2  1  0 |  0 ]
[ 0  1  0 |  2 ]
[ 0  0  1 |  1 ]

R1 ← R1 − R2:
[ 2  0  0 | −2 ]
[ 0  1  0 |  2 ]
[ 0  0  1 |  1 ]

R1 ← R1/2:
[ 1  0  0 | −1 ]
[ 0  1  0 |  2 ]
[ 0  0  1 |  1 ]

x = −1, y = 2, z = 1
Origins of method
- Chinese book Chiu-chang Suan-shu
- The Nine Chapters on the Mathematical Art
- From around 200 B.C.
- Using a “counting board”
- Row manipulations
- Made its way to Japan then Europe
- Gauss was a user of method, not inventor
How much computation?
- Assume an n × n system of equations
- Count number of additions/subtractions
- Count number of multiplications/divisions
- The elimination step is
Elimination takes the augmented matrix
[ a11 a12 · · · a1n | b1 ]
[ a21 a22 · · · a2n | b2 ]
[  ⋮   ⋮        ⋮  |  ⋮ ]
[ an1 an2 · · · ann | bn ]
to upper-triangular form
[ t11 t12 · · · t1n | c1 ]
[     t22 · · · t2n | c2 ]
[              ⋮   |  ⋮ ]
[             tnn  | cn ]
Operation count of elimination step
Pivot | Multiplications & Divisions | Additions & Subtractions
1     | (n−1)(1 + (n−1) + 1)        | (n−1)((n−1) + 1)
2     | (n−2)(1 + (n−2) + 1)        | (n−2)((n−2) + 1)
…     | …                           | …
n−1   | 1(1 + 1 + 1)                | 1(1 + 1)
Total | Σ_{i=1}^{n−1} i(i + 2)      | Σ_{i=1}^{n−1} i(i + 1)
How much computation?
- The back-substitution step is
- And the solution is given by
Back substitution takes the triangular system
[ t11 t12 · · · t1n | c1 ]
[     t22 · · · t2n | c2 ]
[              ⋮   |  ⋮ ]
[             tnn  | cn ]
to the identity form
[ 1            | s1 ]
[    1         | s2 ]
[       ⋱      |  ⋮ ]
[            1 | sn ]
and the solution is x1 = s1, x2 = s2, …, xn = sn.
Operation count of back-substitution
(M&D = multiplications & divisions, A&S = additions & subtractions)

Pivot | Elimination M&D        | Elimination A&S        | Back-subst. M&D | Back-subst. A&S
1     | (n−1)(1 + (n−1) + 1)   | (n−1)((n−1) + 1)       | 1 + 0           | 0
2     | (n−2)(1 + (n−2) + 1)   | (n−2)((n−2) + 1)       | 1 + 1           | 1
…     | …                      | …                      | …               | …
n−1   | 1(1 + 1 + 1)           | 1(1 + 1)               | 1 + (n−1)       | n−1
Total | Σ_{i=1}^{n−1} i(i + 2) | Σ_{i=1}^{n−1} i(i + 1) | Σ_{i=1}^{n} i   | Σ_{i=1}^{n−1} i
Total operation count of Gaussian Elimination
- Number of multiplications and divisions: n³/3 + n² − n/3
- Number of additions and subtractions: n³/3 + n²/2 − 5n/6
- So about n³/3 of each operation (n is large!)
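A short Python check that the per-pivot counts in the tables above do sum to these closed forms:

```python
def gauss_counts(n):
    """Mult/div and add/sub counts: elimination plus back substitution."""
    md  = sum(i * (i + 2) for i in range(1, n)) + n * (n + 1) // 2
    asb = sum(i * (i + 1) for i in range(1, n)) + n * (n - 1) // 2
    return md, asb

for n in (5, 10, 100, 1000):
    md, asb = gauss_counts(n)
    assert md  == (n**3 - n) // 3 + n**2              # n^3/3 + n^2 - n/3
    assert asb == n * (n - 1) * (2 * n + 5) // 6      # n^3/3 + n^2/2 - 5n/6
    print(n, md, asb, round(n**3 / 3))                # both grow like n^3/3
```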
Gauss-Jordan method
- Try to avoid back-substitution
- Modify Gaussian Elimination
- Scale Ei so pivot is 1
- Annihilate all terms above pivot as well
- Intermediate state would look like
[ 1  0  c13 · · · c1n | d1 ]
[ 0  1  c23 · · · c2n | d2 ]
[ 0  0  c33 · · · c3n | d3 ]
[ ⋮  ⋮   ⋮        ⋮  |  ⋮ ]
[ 0  0  cn3 · · · cnn | dn ]
Total operation count of Gauss-Jordan method
- Number of multiplications and divisions: n³/2 + n²/2
- Number of additions and subtractions: n³/2 − n²/2
- So about n³/2 of each operation
- Worse than Gaussian Elimination! Why?
Why do we care about cost?
- Are there large linear systems?
- Yes, very large systems
- Look at two examples
- Signal reconstruction
- With our Cubic B-Spline!
- Two-point boundary problem
- Solve ODE numerically through discretization
Signal reconstruction
- Assume you have sampled a function f at a range of integer positions: f(0), f(1), f(2), …
- Now you want to reconstruct the value of the function at all reals in a finite range
- Use a compactly supported generating function φ(x) and define
  f̃(x) = Σ_i f(i) φ(x − i)
[Three plots over [−2, 2], with values in [0, 1], illustrating the generating function and the reconstruction]
Generating functions
- Typical generating functions
- Typical reconstructions
[Plots: nearest/box, linear/tent, and cubic B-Spline generating functions, each with the corresponding reconstruction of the same samples]
B-Spline interpolation
- Some generating functions do not interpolate
- What are the interpolation constraints?
[Plots comparing an interpolating reconstruction with the non-interpolating cubic B-Spline reconstruction]

- Reconstruct from coefficients instead of samples:
  f̃(x) = Σ_j c_j φ(x − j)   (instead of f̃(x) = Σ_i f(i) φ(x − i))
- Interpolation constraints:
  f̃(i) = Σ_j c_j φ(i − j) = f(i)
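Since the cubic B-Spline has φ(0) = 2/3, φ(±1) = 1/6 and vanishes for |x| ≥ 2, these constraints form a tridiagonal system for the coefficients c_j. A NumPy sketch, with boundary samples simply truncated (an assumption; real code would choose an extension rule) and made-up sample values:

```python
import numpy as np

f = np.array([0.0, 1.0, 4.0, 9.0, 16.0, 25.0])    # hypothetical samples f(i)
n = len(f)

# sum_j c_j * phi(i - j) = f(i), with phi(0) = 2/3 and phi(+-1) = 1/6
A = (np.diag(np.full(n, 2/3))
     + np.diag(np.full(n - 1, 1/6), 1)
     + np.diag(np.full(n - 1, 1/6), -1))
c = np.linalg.solve(A, f)

# Verify one interpolation condition away from the boundary.
i = 3
assert abs(c[i - 1] / 6 + 2 * c[i] / 3 + c[i + 1] / 6 - f[i]) < 1e-12
```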
Results in large linear system
- Digital music is sampled air pressure over time
- Typical CD track runs for 5 minutes @ 44 kHz
- Leads to 44,000 × 60 × 5 = 13,200,000 samples
- Each sample leads to one equation
- How much memory would we need?
- Does sparsity help?
- Is there any hope in solving this linear system?
- Does matrix structure help?
Two-point boundary value
- Consider the ordinary differential equation below,
  where y(a) = α and y(b) = β
- We want to find y(t), for t in [a, b]
- u, v, w, and f are known in [a, b]
- It may not be possible to express y in terms of
elementary functions
- Use numerical methods
u(t)y″(t) + v(t)y′(t) + w(t)y(t) = f(t),    y(a) = α,    y(b) = β
Discretize domain and derivatives
- Approximate y at many discrete positions in [a, b]
- Use Taylor series to approximate derivatives
- Smaller step size h improves approximation
t0 = a,  t1 = a + h,  t2 = a + 2h,  …,  tn = a + nh,  tn+1 = b   (uniform spacing h)

y′(ti) = [y(ti + h) − y(ti − h)] / (2h) + O(h²)
y″(ti) = [y(ti + h) − 2y(ti) + y(ti − h)] / h² + O(h²)
Results in large linear system
- Discretizing and substituting approximations
- We reach
u(t)y″(t) + v(t)y′(t) + w(t)y(t) = f(t)

y′(ti) ≈ (yi+1 − yi−1) / (2h)
y″(ti) ≈ (yi+1 − 2yi + yi−1) / h²

ui (yi+1 − 2yi + yi−1) / h² + vi (yi+1 − yi−1) / (2h) + wi yi = fi

(2ui + h vi) yi+1 + (2h²wi − 4ui) yi + (2ui − h vi) yi−1 = 2h² fi
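A dense NumPy sketch of this discretization; the test problem y″ + y = 0 with y(0) = 0, y(π/2) = 1 is ours, chosen because the exact solution is sin t:

```python
import numpy as np

a, b, alpha, beta = 0.0, np.pi / 2, 0.0, 1.0
n = 50                                    # interior grid points
h = (b - a) / (n + 1)
t = a + h * np.arange(1, n + 1)

u, v, w, f = np.ones(n), np.zeros(n), np.ones(n), np.zeros(n)

# (2u + hv) y_{i+1} + (2h^2 w - 4u) y_i + (2u - hv) y_{i-1} = 2 h^2 f
lower = 2 * u - h * v
diag  = 2 * h**2 * w - 4 * u
upper = 2 * u + h * v

A = np.diag(diag) + np.diag(upper[:-1], 1) + np.diag(lower[1:], -1)
rhs = 2 * h**2 * f
rhs[0]  -= lower[0] * alpha               # known boundary values move to the RHS
rhs[-1] -= upper[-1] * beta

y = np.linalg.solve(A, rhs)
print(np.max(np.abs(y - np.sin(t))))      # small O(h^2) error
```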
Machine problem #1
- Use Matlab to solve boundary problem
- See the effect of the step size on precision
- See the effect of dense/sparse matrices
- On system solution
- On memory
- Compare with built-in solver
Lecture 2
- Number representation by computer
- and limitations
- Consequences on Gaussian Elimination
- and solutions
FLOATING-POINT REPRESENTATION
Floating-point numbers
- All real numbers represented in a computer?
- Impossible, of course. Finite memory!
- Can only represent a finite set of values
- Widely accepted IEEE floating-point standard
- Formats, rounding, arithmetic operations
- Represented by familiar scientific notation
- Except, in binary…
6.02214179 × 10²³        −1.602176487 × 10⁻¹⁹
Binary floating-point
- One sign bit
- w exponent bits
- t mantissa or fraction or significand bits
[Bit layout: sign s (1 bit) | exponent (w bits) | fraction (t bits)]

value = (−1)^s × (1.b₋₁b₋₂b₋₃ ⋯ b₋ₜ)₂ × 2^(e−z)
m = (1.b₋₁b₋₂b₋₃ ⋯ b₋ₜ)₂
z = 2^(w−1) − 1
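A small Python sketch that extracts the three fields from a binary32 value with the standard struct module (the helper name float32_fields is ours):

```python
import struct

def float32_fields(x):
    """Return (sign, exponent, fraction) bit fields of x stored as binary32."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    sign     = bits >> 31
    exponent = (bits >> 23) & 0xFF     # w = 8 exponent bits, bias z = 127
    fraction = bits & 0x7FFFFF         # t = 23 fraction bits
    return sign, exponent, fraction

s, e, m = float32_fields(-6.25)        # -6.25 = -1.1001_2 x 2^2
print(s, e - 127, bin(m))              # 1 2 0b10010000000000000000000
```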
Representation details
- Normalized representation for mantissa m
- Ensures unique representation for mantissa
- First bit is implicitly set to 1
- Get one extra significant bit!
- Excess encoding for exponent e
- Allows for positive and negative exponents
- Therefore large and small magnitudes
- Subtract z = 2^(w−1) − 1 from the encoded exponent

1₂ ≤ m < 10₂        (−1)^s × (1.b₋₁b₋₂b₋₃ ⋯ b₋ₜ)₂ × 2^(e−z)        z = 2^(w−1) − 1
Special values
- Largest representable exponent is reserved
- m = 0 represents ±Inf
- m ≠ 0 represents NaN (Not-a-Number)
- NaN propagates with any operation
- ex: NaN + x = NaN
- Other special operations
n ÷ (±Inf) = ±0        (±0) ÷ (±0) = NaN
(±Inf) × (±Inf) = ±Inf        Inf − Inf = NaN
n ÷ (±0) = ±Inf        (±Inf) ÷ (±Inf) = NaN
Inf + Inf = Inf        (±Inf) × 0 = NaN
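Python floats are IEEE binary64 on typical platforms, so most of these rules can be tried directly (dividing by the float literal 0.0 is the exception: Python raises an exception instead of returning Inf):

```python
import math

inf, nan = float("inf"), float("nan")

print(1.0 / inf)          # 0.0
print(inf * inf)          # inf
print(inf - inf)          # nan
print(inf / inf)          # nan
print(inf * 0.0)          # nan
print(nan + 1.0)          # nan: NaN propagates through any operation
print(nan == nan)         # False, so use math.isnan to test for NaN
print(math.isnan(nan))    # True
```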
Denormalization
- What happens when exponent is the smallest?
- Normalization causes abrupt underflow
- From the smallest normalized magnitude directly to 0
- Introduce denormalized numbers
- When exponent is smallest, no implicit leading 1
- PS: Same number of values in each interval [2^i, 2^(i+1)]
[Number line: the gap between 0 and the smallest normalized magnitude, filled by the denormalized numbers]
Summary of representation
sign | exponent       | fraction | meaning
±    | not 0…0 or 1…1 | any      | normalized
±    | 0…0            | not 0…0  | denormalized
±    | 0…0            | 0…0      | zero
±    | 1…1            | 0…0      | Inf
±    | 1…1            | not 0…0  | NaN
Typical floating-point formats
                    Single precision (binary32)
Total bits          32
Exponent bits       8
Mantissa bits       23
Exponent range      −126 … 127
Smallest magnitude  ≈ 10⁻⁴⁵
Decimal range       ≈ −10³⁸ to 10³⁸
Decimal precision   ≈ 7 digits
Typical floating-point formats
                    Single precision (binary32)   Double precision (binary64)
Total bits          32                            64
Exponent bits       8                             11
Mantissa bits       23                            52
Exponent range      −126 … 127                    −1022 … 1023
Smallest magnitude  ≈ 10⁻⁴⁵                       ≈ 10⁻³²⁴
Decimal range       ≈ −10³⁸ to 10³⁸               ≈ −10³⁰⁸ to 10³⁰⁸
Decimal precision   ≈ 7 digits                    ≈ 16 digits
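These limits can be queried with NumPy's finfo (a sketch; NumPy reports the guaranteed decimal digits as 6 and 15, one less than the table's rounded ≈ 7 and ≈ 16):

```python
import numpy as np

for dtype in (np.float32, np.float64):
    fi = np.finfo(dtype)
    print(dtype.__name__,
          "eps =", fi.eps,           # spacing at 1.0
          "max =", fi.max,           # ~ 3.4e38 / 1.8e308
          "tiny =", fi.tiny,         # smallest normalized magnitude
          "digits =", fi.precision)  # guaranteed decimal digits
```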
Conversion between bases
- Integer part
- Repeatedly divide by base
- Keep remainders
- Stop when dividend is zero
- Read remainders in reverse order
- Fractional part
- Repeatedly multiply by base
- Keep integer parts
- Stop when maximum precision reached (or zero)
- Read integer parts in direct order
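Both steps as a small Python sketch (the helper name to_binary and the 24-bit cap on the fraction are our choices):

```python
def to_binary(x, max_bits=24):
    """Binary expansion of a non-negative number, following the recipe above."""
    ipart, fpart = int(x), x - int(x)

    digits = []                       # integer part: divide by 2, keep remainders
    while ipart:
        ipart, r = divmod(ipart, 2)
        digits.append(str(r))
    int_str = "".join(reversed(digits)) or "0"

    frac_digits = []                  # fractional part: multiply by 2, keep integer parts
    while fpart and len(frac_digits) < max_bits:
        fpart *= 2
        frac_digits.append(str(int(fpart)))
        fpart -= int(fpart)

    return int_str + ("." + "".join(frac_digits) if frac_digits else "")

print(to_binary(1313.3125))   # 10100100001.0101
```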
Conversion to floating-point
- Convert -1313.3125 to single precision
- Integer part is 1313 = 10100100001₂
- Fractional part is .3125 = .0101₂
- Therefore 1313.3125 = 10100100001.0101₂
- Normalization gives 1.01001000010101 × 2¹⁰
- Mantissa is 01001000010101000000000
- Exponent is 10 + 127 = 137 = 10001001₂
- Sign is 1
- Result: 1 10001001 01001000010101000000000 = C4A42A00₁₆
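The example can be checked by packing the value with Python's struct module:

```python
import struct

packed = struct.pack(">f", -1313.3125)   # big-endian binary32
print(packed.hex().upper())              # C4A42A00
assert packed == bytes.fromhex("C4A42A00")
```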
Rounding, overflow, and underflow
- Let’s try representing 0.1 in floating point
- Its binary expansion is 0.000110011001100…, with 0011 repeating forever
- No exact representation possible
- Can happen after each arithmetic operation!
- Errors can grow and dominate results
- This problem often happens in practice
[Examples: results that underflow (≈ 0), round to a nearby value (≈ 2.5), or overflow (≈ Inf)]
Effect on addition accuracy
- May not be exact even when exponents equal
- Ex: 1.1010 + 1.0101 = 1.01111 × 2¹ → 1.1000 × 2¹
- Problem is worse when exponents differ
- Must pre-shift before adding
- Biggest source of rounding error
- Ex: 1.0000 + 1.0000 × 2⁻⁵ = 1.00001 → 1.0000
- When adding large list of numbers
- May need to sort before adding!
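A small Python illustration of the last point; the values (one million terms of 10⁻¹⁶ added to 1.0) are ours:

```python
small = [1e-16] * 1_000_000
big = 1.0

naive = big
for s in small:
    naive += s               # each 1e-16 is below half an ulp of 1.0 and is lost

careful = sum(small) + big   # accumulate the small terms first, then add the big one

print(naive)                 # 1.0
print(careful)               # ~ 1.0000000001, much closer to the exact 1 + 1e-10
```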
Multiplication error example
- Closest representable number to 0.1₁₀ is
- 0.100000001490116119384765625
- Squaring gives (exactly)
- 0.010000000298023226097399174250313080847263336181640625
- Squaring in single precision gives (exactly)
- 0.010000000707805156707763671875
- Closest representable number to 0.01₁₀ is
- 0.009999999776482582092285156250
Integers in floating-point
- Find largest range of integers that can be
represented exactly in single precision
- 23 mantissa bits
- Add implicit normalization bit
- Add sign bit
- From −2²⁴ + 1 = −16,777,215 to 2²⁴ − 1 = 16,777,215
- Beyond this range, we could have x + 1 = x + 2!
- A 32-bit integer has a much larger range!
- From −2³¹ = −2,147,483,648 to 2³¹ − 1 = 2,147,483,647
Other weirdness
- Associative property does not hold!
- (a + b) + c ≠ a + (b + c)
- Beware compiler optimizations
- Equality operator not meaningful!
- Returns true only if exactly equal
- Must use special function
- Absolute error vs. relative error
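A short Python demonstration of the first two points, using well-known double-precision examples:

```python
import math

print((0.1 + 0.2) + 0.3)                    # 0.6000000000000001
print(0.1 + (0.2 + 0.3))                    # 0.6: grouping changes the result

x = 0.1 + 0.2
print(x == 0.3)                             # False: exact equality is fragile
print(math.isclose(x, 0.3, rel_tol=1e-9))   # True: compare with a relative tolerance
```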
MAKING GAUSSIAN ELIMINATION WORK IN A COMPUTER
Errors during elimination
- 3-digit elimination
[ 47  28 | 19 ]
[ 89  53 | 36 ]

Multiplier: fl(89/47) = 1.89
fl(1.89 × 47) = 88.8,  fl(1.89 × 28) = 52.9,  fl(1.89 × 19) = 35.9

R2 ← R2 − 1.89 R1:
[ 47             28            | 19            ]       [ 47  28 | 19 ]
[ fl(89 − 88.8)  fl(53 − 52.9) | fl(36 − 35.9) ]   =   [ .2  .1 | .1 ]

Treating the eliminated entry as zero:
[ 47  28 | 19 ]
[  0  .1 | .1 ]
Errors during substitution
- 3-digit back-substitution
- Exact solution is (x, y) = (1, −1)

3-digit triangular system (from the previous slide):
[ 47  28 | 19 ]
[  0  .1 | .1 ]

y = fl(.1/.1) = 1
x = fl((19 − 28·1)/47) = fl(−9/47) = −.191

Computed solution: (x, y) = (−.191, 1), far from the exact (1, −1)

Exact elimination, for comparison:
[ 47    28  | 19   ]
[  0  −1/47 | 1/47 ]
Common sources of error
- Some error is unavoidable in general
- How to protect against unnecessary errors?
- Try to avoid rounding after addition
- Most extreme when exponents are very different
- Two strategies against that
- Partial pivoting
- Scaling
Motivation for partial pivoting
- Exact arithmetic
- 3-digit arithmetic
Exact solution:
x = 1/1.0001,  y = 1.0002/1.0001

[ −10⁻⁴  1 | 1 ]
[   1    1 | 2 ]

3-digit elimination with multiplier 1/(−10⁻⁴) = −10⁴:
fl(1 + 10⁴) = fl(.10001 × 10⁵) = .100 × 10⁵ = 10⁴
fl(2 + 10⁴) = 10⁴

[ −10⁻⁴   1  |  1  ]
[   0    10⁴ | 10⁴ ]
giving y = 1, x = 0

Exact elimination, for comparison:
[ −10⁻⁴    1    |   1    ]
[   0    10,001 | 10,002 ]
Partial pivoting
- Search column for coefficient of largest magnitude
- Exchange rows to bring it to the pivotal position
[ −10⁻⁴  1 | 1 ]
[   1    1 | 2 ]

Exchange rows:
[   1    1 | 2 ]
[ −10⁻⁴  1 | 1 ]

3-digit elimination: fl(1 + 10⁻⁴) = 1,  fl(1 + 2 × 10⁻⁴) = 1
[ 1  1 | 2 ]
[ 0  1 | 1 ]

y = 1,  x = 1
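A NumPy sketch of the same effect in double precision; the 10⁻²⁰ pivot is our stand-in for the slides' 3-digit example, small enough that the rounding shows up even with 16 digits:

```python
import numpy as np

def solve2x2(A, b, pivot=True):
    """2x2 Gaussian elimination, optionally exchanging rows first."""
    A = A.astype(float)
    b = b.astype(float)
    if pivot and abs(A[1, 0]) > abs(A[0, 0]):
        A[[0, 1]] = A[[1, 0]]            # bring the largest coefficient to the pivot
        b[[0, 1]] = b[[1, 0]]
    m = A[1, 0] / A[0, 0]
    A[1] -= m * A[0]
    b[1] -= m * b[0]
    y = b[1] / A[1, 1]
    x = (b[0] - A[0, 1] * y) / A[0, 0]
    return float(x), float(y)

A = np.array([[-1e-20, 1.0], [1.0, 1.0]])
b = np.array([1.0, 2.0])
print(solve2x2(A, b, pivot=False))       # (-0.0, 1.0): x is lost entirely
print(solve2x2(A, b, pivot=True))        # (1.0, 1.0): close to the exact solution
```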
Motivation for row scaling
- 3-digit arithmetic
- Choose proper units for problem
- Scale rows so max magnitude is 1
- Column scaling is typically unnecessary
[ −10  10⁵ | 10⁵ ]
[   1   1  |  2  ]

3-digit elimination (partial pivoting makes no exchange, since |−10| > |1|):
fl(2 + 10⁴) = 10⁴,  fl(1 + 10⁴) = 10⁴

[ −10  10⁵ | 10⁵ ]
[   0  10⁴ | 10⁴ ]
giving y = 1, x = 0
Complete pivoting
- At least as good as partial pivoting
- However, not used very frequently. Why?
- What is the computational cost?
- What the number of memory accesses?
[Diagram: entries searched for the pivot, the current column below the pivot for partial pivoting vs. the whole remaining submatrix for complete pivoting]
ILL-CONDITIONED SYSTEMS
Not all systems are born equal
- Gaussian elimination is great
- with partial pivoting
- on a properly scaled system
- But it is not magic!
- Some systems are extraordinarily difficult
- No numerical technique can deal with them
Ill-conditioned systems
- Original system
- Perturbed
Original system:
.835x + .667y = .168
.333x + .266y = .067
Solution: x = 1, y = −1

Perturbed system (right-hand side changed by .001):
.835x̂ + .667ŷ = .168
.333x̂ + .266ŷ = .067 − .001 = .066
Solution: x̂ = −666, ŷ = 834
- Small perturbation in the system can cause a
relatively large change in the solution!
- What if there are measurement errors?
- What about floating-point errors?
- “Checking the answer” doesn’t help!
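This example is easy to reproduce numerically; the condition number reported by NumPy quantifies how badly perturbations are amplified:

```python
import numpy as np

A  = np.array([[0.835, 0.667],
               [0.333, 0.266]])
b  = np.array([0.168, 0.067])
bp = np.array([0.168, 0.066])            # right-hand side perturbed by 0.001

print(np.linalg.solve(A, b))             # ~ [ 1, -1]
print(np.linalg.solve(A, bp))            # ~ [-666, 834]
print(np.linalg.cond(A))                 # ~ 1e6: tiny data errors are amplified enormously
```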
What is the impact to engineering?
- Both matrix and right-hand side are usually
results of measurements
- Even if the exact solution can be obtained,
does it really mean anything?
- The problem is with the system, not with the
solution to the system
- You are solving the wrong problem!
- Find a different experiment
- Hope for better conditioning
The geometric interpretation
- Can we detect this numerically?
- Yes, indeed
- Need more linear algebra
[Plots: two lines crossing at a clear angle (well-conditioned) vs. two nearly parallel lines (ill-conditioned)]