
SLIDE 1

Session 4: Norms and inner-products

Optimization and Computational Linear Algebra for Data Science
Léo Miolane
SLIDE 2

Contents

  • 1. Norms & inner-products
  • 2. Orthogonality
  • 3. Orthogonal projection
  • 4. Proof of the Cauchy-Schwarz inequality


SLIDE 3

Norms and inner-products

SLIDE 4

Questions

Norms in machine learning:
  • The Euclidean norm measures distances: ‖x − y‖ = √((x1 − y1)² + ⋯ + (xn − yn)²).
  • The Euclidean dot product: ⟨x, y⟩ = x1 y1 + ⋯ + xn yn.
  • Norms are used for "regularization": minimize Loss(data, x) + λ‖x‖ over x ∈ R^n.
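A minimal NumPy sketch of these ideas (the squared-error loss, the matrix A, the vector b, and the value of λ below are illustrative placeholders, not from the slides):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])
y = np.array([3.0, -1.0, 2.0])

# Euclidean norm as a distance: sqrt((x1-y1)^2 + ... + (xn-yn)^2)
dist = np.linalg.norm(x - y)

# Euclidean dot product: x1*y1 + ... + xn*yn
dot = np.dot(x, y)

# Regularized objective Loss(data, x) + lam * ||x||
A = np.eye(3)
b = np.array([1.0, 0.0, -1.0])
lam = 0.1
objective = np.linalg.norm(A @ x - b) ** 2 + lam * np.linalg.norm(x)
print(dist, dot, objective)
```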
SLIDE 7

Orthogonality

SLIDE 8

Definition

We say that vectors x and y are orthogonal if ⟨x, y⟩ = 0. We then write x ⊥ y. We say that a vector x is orthogonal to a set of vectors A if x is orthogonal to all the vectors in A. We then write x ⊥ A.

Exercise: If x is orthogonal to v1, …, vk, then x is orthogonal to any linear combination of these vectors, i.e. x ⊥ Span(v1, …, vk).

Solution: write y = a1 v1 + ⋯ + ak vk. Then ⟨x, y⟩ = a1⟨x, v1⟩ + ⋯ + ak⟨x, vk⟩ = 0, since each ⟨x, vi⟩ = 0.
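A quick numerical check of the exercise (NumPy; the vectors are illustrative):

```python
import numpy as np

x = np.array([0.0, 0.0, 1.0])   # orthogonal to the xy-plane
v1 = np.array([1.0, 0.0, 0.0])
v2 = np.array([0.0, 1.0, 0.0])

# x is orthogonal to v1 and v2 ...
assert np.dot(x, v1) == 0 and np.dot(x, v2) == 0

# ... hence to any linear combination a1*v1 + a2*v2
a1, a2 = 2.5, -7.0
assert np.dot(x, a1 * v1 + a2 * v2) == 0
```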
SLIDE 9

Pythagorean Theorem

Theorem (Pythagorean theorem). Let ‖·‖ be the norm induced by ⟨·, ·⟩. For all x, y ∈ V we have x ⊥ y ⟺ ‖x + y‖² = ‖x‖² + ‖y‖².

Proof: ‖x + y‖² = ⟨x + y, x + y⟩ = ‖x‖² + 2⟨x, y⟩ + ‖y‖², which equals ‖x‖² + ‖y‖² if and only if ⟨x, y⟩ = 0. □
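A quick numerical illustration (the orthogonal pair below is an arbitrary choice):

```python
import numpy as np

x = np.array([3.0, 0.0])
y = np.array([0.0, 4.0])               # <x, y> = 0

lhs = np.linalg.norm(x + y) ** 2       # ||x + y||^2 = 25
rhs = np.linalg.norm(x) ** 2 + np.linalg.norm(y) ** 2
assert np.isclose(lhs, rhs)
```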

SLIDE 10

Application to random variables

V = {random variables with finite second moment}. For X, Y ∈ V we define ⟨X, Y⟩ = E[XY]; the induced norm is ‖X‖ = √E[X²].

Assume that X and Y have zero mean. Then

X ⊥ Y ⟺ E[XY] = 0 ⟺ Cov(X, Y) = 0.

By the Pythagorean theorem, this is equivalent to ‖X + Y‖² = ‖X‖² + ‖Y‖², i.e. E[(X + Y)²] = E[X²] + E[Y²], i.e.

Var(X + Y) = Var(X) + Var(Y).
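A simulation sketch of this identity (NumPy; the distributions are illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000

# Two independent (hence uncorrelated) zero-mean random variables
X = rng.normal(0.0, 2.0, size=n)       # Var(X) = 4
Y = rng.uniform(-1.0, 1.0, size=n)     # Var(Y) = 1/3

# Var(X + Y) ≈ Var(X) + Var(Y) when Cov(X, Y) = 0
print(np.var(X + Y), np.var(X) + np.var(Y))   # both ≈ 4.333
```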
SLIDE 11

Orthogonal & orthonormal families

Definition. We say that a family of vectors (v1, …, vk) is:
  • orthogonal if the vectors v1, …, vk are pairwise orthogonal, i.e. ⟨vi, vj⟩ = 0 for all i ≠ j;
  • orthonormal if it is orthogonal and if all the vi have unit norm: ‖v1‖ = ⋯ = ‖vk‖ = 1.

Example:
  • the canonical basis of R^n is orthonormal;
  • the family ((1, 1), (1, −1)) is orthogonal but not orthonormal (each vector has norm √2).
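A small NumPy check of this example (assuming the vectors (1, 1) and (1, −1)):

```python
import numpy as np

v1 = np.array([1.0, 1.0])
v2 = np.array([1.0, -1.0])

assert np.dot(v1, v2) == 0                      # orthogonal
assert not np.isclose(np.linalg.norm(v1), 1.0)  # not orthonormal: norm is sqrt(2)

# Normalizing each vector yields an orthonormal family
u1, u2 = v1 / np.linalg.norm(v1), v2 / np.linalg.norm(v2)
assert np.isclose(np.linalg.norm(u1), 1.0) and np.isclose(np.dot(u1, u2), 0.0)
```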
SLIDE 12

Coordinates in an orthonormal basis

Proposition. A vector space of finite dimension admits an orthonormal basis.

Proposition. Assume that dim(V) = n and let (v1, …, vn) be an orthonormal basis of V. Then the coordinates of a vector x ∈ V in the basis (v1, …, vn) are (⟨v1, x⟩, …, ⟨vn, x⟩):

x = ⟨v1, x⟩v1 + ⋯ + ⟨vn, x⟩vn.
Proof: we have x = a1 v1 + ⋯ + an vn for some a1, …, an ∈ R. Taking the inner product with vi gives

⟨x, vi⟩ = ⟨a1 v1 + ⋯ + an vn, vi⟩ = ai,

since ⟨vj, vi⟩ = 0 for j ≠ i and ⟨vi, vi⟩ = ‖vi‖² = 1. □
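A sketch in NumPy (the orthonormal basis below, the canonical basis of R² rotated by 45°, is an illustrative choice):

```python
import numpy as np

# An orthonormal basis of R^2
s = 1.0 / np.sqrt(2.0)
v1 = np.array([s, s])
v2 = np.array([-s, s])

x = np.array([2.0, 1.0])

# The coordinates of x are the inner products <v1, x>, <v2, x>
a1, a2 = np.dot(v1, x), np.dot(v2, x)
assert np.allclose(x, a1 * v1 + a2 * v2)   # x = <v1,x> v1 + <v2,x> v2
```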
SLIDE 13

Coordinates in an orthonormal basis

Let x, y ∈ V and write

x = ⟨v1, x⟩v1 + ⋯ + ⟨vn, x⟩vn = α1 v1 + ⋯ + αn vn,
y = ⟨v1, y⟩v1 + ⋯ + ⟨vn, y⟩vn = β1 v1 + ⋯ + βn vn.

Then

⟨x, y⟩ = ⟨α1 v1 + ⋯ + αn vn, β1 v1 + ⋯ + βn vn⟩ = Σi Σj αi βj ⟨vi, vj⟩ = α1 β1 + ⋯ + αn βn,

since ⟨vi, vj⟩ = 1 if i = j and 0 otherwise. In particular,

‖x‖ = √(α1² + ⋯ + αn²).
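A numerical check of these identities (same illustrative basis as above; y is another arbitrary vector):

```python
import numpy as np

s = 1.0 / np.sqrt(2.0)
v1, v2 = np.array([s, s]), np.array([-s, s])   # orthonormal basis of R^2

x, y = np.array([2.0, 1.0]), np.array([-1.0, 3.0])
a1, a2 = np.dot(v1, x), np.dot(v2, x)          # coordinates of x
b1, b2 = np.dot(v1, y), np.dot(v2, y)          # coordinates of y

assert np.isclose(np.dot(x, y), a1 * b1 + a2 * b2)      # <x, y> = a1 b1 + a2 b2
assert np.isclose(np.linalg.norm(x), np.hypot(a1, a2))  # ||x|| = sqrt(a1^2 + a2^2)
```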
SLIDE 14

Proof
SLIDE 15

Orthogonal projection

SLIDE 16

Picture

From now on, ⟨·, ·⟩ denotes the Euclidean dot product and ‖·‖ the Euclidean norm.

What is the vector y of S which is the closest to x?

[Figure: a vector x, a subspace S, the orthogonal projection of x onto S, and the distance d(x, S).]
SLIDE 17

Orthogonal projection and distance to a subspace

Definition. Let S be a subspace of R^n. The orthogonal projection of a vector x onto S is defined as the vector P_S(x) in S that minimizes the distance to x:

P_S(x) := argmin_{y ∈ S} ‖x − y‖.

The distance of x to the subspace S is then defined as

d(x, S) := min_{y ∈ S} ‖x − y‖ = ‖x − P_S(x)‖.

Remarks:
  • if x ∉ S then x ≠ P_S(x);
  • if x ∈ S then x = P_S(x).
SLIDE 18

Computing orthogonal projections

Proposition. Let S be a subspace of R^n and let (v1, …, vk) be an orthonormal basis of S. Then for all x ∈ R^n,

P_S(x) = ⟨v1, x⟩v1 + ⋯ + ⟨vk, x⟩vk.

Proof: Let y ∈ S and write y = α1 v1 + ⋯ + αk vk for some α1, …, αk ∈ R. By the previous slides,

‖y‖² = Σi αi²  and  ⟨x, y⟩ = ⟨x, α1 v1 + ⋯ + αk vk⟩ = Σi αi ⟨x, vi⟩.
SLIDE 19

Proof

Hence

‖x − y‖² = ‖x‖² + ‖y‖² − 2⟨x, y⟩ = ‖x‖² + Σi αi² − 2 Σi αi ⟨x, vi⟩.

Each term f(αi) = αi² − 2 αi ⟨x, vi⟩ can be minimized separately: setting f′(αi) = 2 αi − 2⟨x, vi⟩ = 0 gives αi = ⟨x, vi⟩. The minimizing y is therefore

P_S(x) = ⟨x, v1⟩v1 + ⋯ + ⟨x, vk⟩vk. □
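A sketch of this formula in NumPy (the subspace, its orthonormal basis, and x are illustrative choices; in general an orthonormal basis could be obtained with np.linalg.qr):

```python
import numpy as np

def project(x, basis):
    """Orthogonal projection of x onto Span(basis).

    `basis` is a list of orthonormal vectors v1, ..., vk:
    P_S(x) = <v1, x> v1 + ... + <vk, x> vk.
    """
    return sum(np.dot(v, x) * v for v in basis)

# Orthonormal basis of the xy-plane in R^3
v1 = np.array([1.0, 0.0, 0.0])
v2 = np.array([0.0, 1.0, 0.0])

x = np.array([2.0, -3.0, 5.0])
p = project(x, [v1, v2])
print(p)                      # [ 2. -3.  0.]
print(np.linalg.norm(x - p))  # d(x, S) = 5.0
```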
SLIDE 20

Consequence

P_S(x) = ⟨v1, x⟩v1 + ⋯ + ⟨vk, x⟩vk.

Define the matrix V ∈ R^{n×k} whose columns are v1, …, vk. Then Vᵀx = (⟨v1, x⟩, …, ⟨vk, x⟩), so

V Vᵀ x = ⟨v1, x⟩v1 + ⋯ + ⟨vk, x⟩vk = P_S(x).

The orthogonal projection onto S is thus given by the matrix V Vᵀ.
SLIDE 21

Consequence

Corollary. For all x ∈ R^n, x − P_S(x) is orthogonal to S, and ‖P_S(x)‖ ≤ ‖x‖.

Indeed, since x − P_S(x) ⊥ P_S(x), the Pythagorean theorem gives ‖x‖² = ‖P_S(x)‖² + ‖x − P_S(x)‖² ≥ ‖P_S(x)‖².
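A NumPy check of the matrix form P_S(x) = V Vᵀ x and of this corollary (same illustrative subspace as before):

```python
import numpy as np

v1 = np.array([1.0, 0.0, 0.0])
v2 = np.array([0.0, 1.0, 0.0])
V = np.column_stack([v1, v2])   # columns form an orthonormal basis of S

x = np.array([2.0, -3.0, 5.0])
p = V @ V.T @ x                 # P_S(x) = V V^T x

assert np.allclose(V.T @ (x - p), 0.0)         # x - P_S(x) is orthogonal to S
assert np.linalg.norm(p) <= np.linalg.norm(x)  # ||P_S(x)|| <= ||x||
```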

SLIDE 22

Proof of the Cauchy-Schwarz inequality

SLIDE 23

Cauchy-Schwarz inequality

Theorem. Let ‖·‖ be the norm induced by the inner product ⟨·, ·⟩ on the vector space V. Then for all x, y ∈ V:

|⟨x, y⟩| ≤ ‖x‖ ‖y‖.  (1)

Moreover, there is equality in (1) if and only if x and y are linearly dependent, i.e. x = αy or y = αx for some α ∈ R.

Proof: Let x, y ∈ V. If x = 0 or y = 0 the result is obvious, so from now on we assume that x, y ≠ 0.
SLIDE 24

Proof

We define

f : R → R, t ↦ ‖y − t x‖².

For t ∈ R,

f(t) = ‖y‖² − 2t⟨x, y⟩ + t²‖x‖².

Remark #1: f is a polynomial of degree 2 in t.
Remark #2: f(t) ≥ 0 for all t. Hence the discriminant Δ of f satisfies Δ ≤ 0:

Δ = 4⟨x, y⟩² − 4‖x‖²‖y‖² ≤ 0,

which gives ⟨x, y⟩² ≤ ‖x‖²‖y‖², i.e. |⟨x, y⟩| ≤ ‖x‖ ‖y‖.
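A numerical illustration of the discriminant argument (arbitrary example vectors):

```python
import numpy as np

x = np.array([1.0, 2.0, -1.0])
y = np.array([3.0, 0.5, 4.0])

# f(t) = ||y - t x||^2 is a nonnegative quadratic in t,
# so its discriminant is <= 0, which is exactly Cauchy-Schwarz.
disc = 4 * np.dot(x, y) ** 2 - 4 * np.dot(x, x) * np.dot(y, y)
assert disc <= 0
assert abs(np.dot(x, y)) <= np.linalg.norm(x) * np.linalg.norm(y)
```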
SLIDE 25

Proof

There is equality |⟨x, y⟩| = ‖x‖ ‖y‖ if and only if Δ = 0, i.e. if and only if there exists some t such that f(t) = ‖y − t x‖² = 0. This means that y − t x = 0, i.e. y = t x. □
SLIDE 26

Questions?
SLIDE 27

Questions?

cos θ = ⟨x, y⟩ / (‖x‖ ‖y‖)

(by Cauchy-Schwarz this ratio lies in [−1, 1], so it defines the angle θ between nonzero vectors x and y).
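A small sketch computing this angle in NumPy (np.clip guards against floating-point round-off):

```python
import numpy as np

x = np.array([1.0, 0.0])
y = np.array([1.0, 1.0])

cos_theta = np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y))
theta = np.arccos(np.clip(cos_theta, -1.0, 1.0))
print(np.degrees(theta))   # 45.0
```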

SLIDE 28

Orthogonal matrices

Definition. A matrix A ∈ R^{n×n} is called an orthogonal matrix if its columns form an orthonormal family.

Example: rotation matrices

R_θ = ( cos θ   −sin θ
        sin θ    cos θ )

are orthogonal: their columns are unit vectors that are orthogonal to each other.
SLIDE 29

A proposition

Proposition. Let A ∈ R^{n×n}. The following points are equivalent:
  • 1. A is orthogonal.
  • 2. AᵀA = Id_n.
  • 3. AAᵀ = Id_n.

In particular, an orthogonal matrix A is invertible with A⁻¹ = Aᵀ.
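A quick NumPy check of these equivalences on a rotation matrix (θ = 0.7 is an arbitrary choice):

```python
import numpy as np

theta = 0.7
A = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

I = np.eye(2)
assert np.allclose(A.T @ A, I)             # A^T A = Id
assert np.allclose(A @ A.T, I)             # A A^T = Id
assert np.allclose(np.linalg.inv(A), A.T)  # A^{-1} = A^T
```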

SLIDE 30

Orthogonal matrices & norm

Proposition. Let A ∈ R^{n×n} be an orthogonal matrix. Then A preserves the dot product, in the sense that for all x, y ∈ R^n,

⟨Ax, Ay⟩ = ⟨x, y⟩.

In particular, taking x = y, we see that A preserves the Euclidean norm: ‖Ax‖ = ‖x‖.
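A closing numerical check (same rotation matrix as above; the test vectors are arbitrary):

```python
import numpy as np

theta = 0.7
A = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

x, y = np.array([2.0, -1.0]), np.array([0.5, 3.0])

assert np.isclose(np.dot(A @ x, A @ y), np.dot(x, y))        # <Ax, Ay> = <x, y>
assert np.isclose(np.linalg.norm(A @ x), np.linalg.norm(x))  # ||Ax|| = ||x||
```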