SLIDE 1

Results on the PASCAL PROMO challenge

Ivan Markovsky

University of Southampton

SLIDE 2

The challenge

Data: two (simulated) time series

  u_d(1), ..., u_d(1095) ∈ {0,1}^1000 (promotions)
  y_d(1), ..., y_d(1095) ∈ R^100 (product sales)

Challenge: find the ≤ 50 promotions that affect each product's sales.

[Plot: 7th promotion vs. time]

[Plot: 3rd product sales vs. time]
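To fix the dimensions, the data can be mocked up as two arrays of the stated shapes (synthetic stand-ins; the actual challenge data is not reproduced here):

```python
import numpy as np

rng = np.random.default_rng(0)

T, m, p = 1095, 1000, 100                    # days, promotions, products
Ud = rng.integers(0, 2, size=(m, T))         # binary promotion indicators
Yd = rng.normal(1000.0, 200.0, size=(p, T))  # placeholder sales values

print(Ud.shape, Yd.shape)  # (1000, 1095) (100, 1095)
```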

SLIDE 3

Comments

  • time-series nature of the data ⇒ dynamic phenomenon (the current output may depend on past inputs and outputs)
  • it is natural to think of the promotions as inputs (causes) and the sales as outputs (effects)
  • multivariable data: m = 1000 inputs, p = 100 outputs
  • T = 1095 data points, very few relative to m and p
  • even a static linear model y = Au is unidentifiable (A cannot be recovered uniquely from (u_d, y_d)) for T < T_min := 10^5
  • the prior knowledge that only a few (≤ 50) inputs affect each output helps (T_min = 5000) but does not restore identifiability
  • this prior knowledge makes the problem combinatorial
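The unidentifiability can be illustrated on a toy problem: whenever U_d is not of full row rank (here simply because T < m), any vector in the left null space of U_d can be added to a row of A without changing the fit. A minimal numpy sketch with invented toy sizes, not the challenge's:

```python
import numpy as np

rng = np.random.default_rng(1)

m, p, T = 8, 2, 5                      # toy sizes with T < m
U = rng.integers(0, 2, size=(m, T)).astype(float)
A_true = rng.normal(size=(p, m))
Y = A_true @ U                         # data generated by the static model

# Vectors v with v @ U = 0 span the left null space of U.
_, s, Vt = np.linalg.svd(U.T)          # U.T is T x m
rank = int(np.sum(s > 1e-10))
v = Vt[rank]                           # rank < m, so such a v exists

A_alt = A_true + np.outer(np.ones(p), v)
print(np.allclose(A_alt @ U, Y))       # True: a different A fits exactly
```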
SLIDE 4

Proposed model

Main assumptions:

  1. static input-output relation y_j(t) = a_j u(t)
     (this implies that one output cannot affect the other outputs)
  2. there is an offset and a seasonal component, which is a sine, i.e.,
     baseline: y_bl,j(t) := b_j + c_j sin(ω_j t + φ_j)

The model is y_j(t) = y_bl,j(t) + a_j u(t), or, with Y := [y(1) ··· y(T)], U := [u(1) ··· u(T)], etc.,

  Y = Y_bl(b, c, ω, φ) + A U
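A sketch of how the assumed model generates data, with invented toy dimensions and parameter values:

```python
import numpy as np

rng = np.random.default_rng(2)
T, m, p = 200, 30, 4
t = np.arange(1, T + 1)

b = rng.normal(1000.0, 100.0, size=p)        # offsets b_j
c = rng.uniform(50.0, 150.0, size=p)         # amplitudes c_j
w = np.full(p, 6 * np.pi / T)                # frequencies omega_j
phi = rng.uniform(-np.pi, np.pi, size=p)     # phases phi_j

# Baseline Y_bl(b, c, omega, phi): one offset-plus-sinusoid row per output
Ybl = b[:, None] + c[:, None] * np.sin(w[:, None] * t + phi[:, None])

# Sparse feedthrough matrix A: a few nonzero entries per row
A = np.zeros((p, m))
for j in range(p):
    idx = rng.choice(m, size=3, replace=False)
    A[j, idx] = rng.normal(size=3)

U = rng.integers(0, 2, size=(m, T)).astype(float)
Y = Ybl + A @ U                              # the model Y = Y_bl + A U
print(Y.shape)                               # (4, 200)
```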

SLIDE 5

Identification problem

Parameters:

  A ∈ R^{p×m}: input/output (feedthrough) matrix
  b := (b_1, ..., b_p) ∈ R^p: vector of offsets
  c := (c_1, ..., c_p) ∈ R^p: vector of amplitudes
  ω := (ω_1, ..., ω_p) ∈ R^p: vector of frequencies
  φ := (φ_1, ..., φ_p) ∈ [−π, π]^p: vector of phases

Identification problem: minimize over the parameters

  ‖Y_d − Y_bl(b, c, ω, φ) − A U_d‖

subject to each row of A having at most 50 nonzero elements.

A combinatorial, constrained, nonlinear least-squares problem.

SLIDE 6

Solution approach

Model:

  y_j(t) = b_j + c_j sin(ω_j t + φ_j) + a_j u(t)

Linear in A, b, c. Nonlinear in ω, φ. Combinatorial in A.

Our approach: split the problem into two stages:

  1. Baseline estimation: minimize over b, c, ω, φ, assuming A = 0.
     A nonlinear LS problem; we use local optimization.
  2. I/O function estimation: minimize over A, b, c, with ω, φ fixed.
     A combinatorial problem; we use the ℓ1 heuristic.

This approach simplifies the solution but leads to suboptimality.

SLIDE 7

Identification of the autonomous term

The problem decouples into p independent problems: minimize over b_j, c_j, ω_j ∈ R, φ_j ∈ [−π, π]

  ‖y_d,j − y_bl,j(b_j, c_j, ω_j, φ_j)‖²    (1)

(y_d,j is the jth row of Y_d; y_bl,j is the jth row of Y_bl.)

This is a special case of the line spectral estimation problem, for which subspace and maximum likelihood (ML) solution methods exist. We use the ML approach, i.e., local optimization, assuming ω_j = 6π/T (one-year period) or 12π/T (half-year period). Furthermore, we eliminate the "linear" parameters b_j, c_j by projection (the VARPRO method).
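A minimal sketch of the projection idea: for each candidate ω_j, the baseline b_j + c_j sin(ω_j t + φ_j) is rewritten as b_j + α sin(ω_j t) + β cos(ω_j t), which is linear in (b_j, α, β), so those parameters are eliminated by least squares and only the frequency choice remains. (This linearizes slightly more than the VARPRO step on the slides, which eliminates only b_j, c_j; the function and parameter names are mine.)

```python
import numpy as np

def fit_baseline(y, omegas):
    """For each candidate frequency, fit y(t) ~ b + c*sin(w*t + phi) by
    linear least squares, using sin(w*t + phi) = cos(phi)*sin(w*t)
    + sin(phi)*cos(w*t); return the parameters of the best frequency."""
    T = len(y)
    t = np.arange(1, T + 1)
    best = None
    for w in omegas:
        X = np.column_stack([np.ones(T), np.sin(w * t), np.cos(w * t)])
        coef, *_ = np.linalg.lstsq(X, y, rcond=None)
        res = np.sum((y - X @ coef) ** 2)
        if best is None or res < best[0]:
            b, al, be = coef
            best = (res, b, np.hypot(al, be), w, np.arctan2(be, al))
    _, b, c, w, phi = best
    return b, c, w, phi

T = 1095
t = np.arange(1, T + 1)
y = 1200 + 80 * np.sin(6 * np.pi / T * t + 0.5)   # noiseless test signal
b, c, w, phi = fit_baseline(y, [6 * np.pi / T, 12 * np.pi / T])
print(f"{b:.1f} {c:.1f} {phi:.2f}")               # 1200.0 80.0 0.50
```

On noiseless data the correct candidate frequency gives an essentially zero residual, so the known parameters are recovered exactly.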
SLIDE 8

[Plot: y_d,3 and y*_bl,3 vs. t (baseline)]
SLIDE 9

[Plot: y_d,4 and y*_bl,4 vs. t (baseline)]
SLIDE 10

Identification of the term involving the inputs

Problem: minimize over b_j, c_j, a_j

  ‖y_d,j − y_bl,j(b_j, c_j, φ*_j, ω*_j) − a_j^T U_d‖²

subject to a_j has at most 50 nonzero elements.    (2)

Proposed heuristic: minimize over b_j, c_j, a_j

  ‖y_d,j − y_bl,j(b_j, c_j, φ*_j, ω*_j) − a_j^T U_d‖²

subject to ‖a_j‖_1 ≤ γ_j.    (3)

γ_j > 0 is a parameter controlling the sparsity vs. accuracy trade-off.
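The slides use the constrained form (3); a common stand-in is the penalized (Lagrangian) form, which can be solved by iterative soft-thresholding. A self-contained sketch (ISTA, not the author's code; toy sizes and the value of λ are invented):

```python
import numpy as np

def ista_l1(X, y, lam, iters=5000):
    """Minimize 0.5*||y - X a||^2 + lam*||a||_1 by iterative
    soft-thresholding (ISTA), the penalized counterpart of (3)."""
    L = np.linalg.norm(X, 2) ** 2          # Lipschitz constant of the gradient
    a = np.zeros(X.shape[1])
    for _ in range(iters):
        z = a - X.T @ (X @ a - y) / L      # gradient step on the quadratic part
        a = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # prox of lam*||.||_1
    return a

rng = np.random.default_rng(3)
T, m = 300, 60
Ud = rng.integers(0, 2, size=(m, T)).astype(float)
a_true = np.zeros(m)
a_true[[3, 17, 42]] = [2.0, -1.5, 1.0]     # sparse true row of A
y = a_true @ Ud                            # noiseless residual after the baseline

a_hat = ista_l1(Ud.T, y, lam=1.0)
print(np.flatnonzero(np.abs(a_hat) > 0.5)) # support of the sparse estimate
```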

SLIDE 11

Choice of the regularization parameter γj

If we fix the nonzero elements to be the first 10 elements, the optimal solution (with this choice of the nonzero elements) is

  a_j := [ (y_d,j − y_bl,j) U_d(1:10, :)^+   0_{1×(m−10)} ]

Let a*_j be the optimal solution over all choices of the nonzero elements. Since ‖a*_j‖_1 = γ_j at the solution of (3), a heuristic choice for γ_j is γ_j := ‖a_j‖_1.
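The γ_j recipe in numpy form (the residual row here is a random stand-in for y_d,j − y_bl,j):

```python
import numpy as np

rng = np.random.default_rng(4)
T, m = 300, 60
Ud = rng.integers(0, 2, size=(m, T)).astype(float)
r = rng.normal(size=T)                 # stand-in for the row y_d,j - y_bl,j

# Least-squares coefficients over the first 10 inputs only:
a10 = r @ np.linalg.pinv(Ud[:10, :])   # (y_d,j - y_bl,j) Ud(1:10,:)^+
gamma_j = np.abs(a10).sum()            # heuristic gamma_j := ||a10||_1
print(a10.shape)                       # (10,)
```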

SLIDE 12

[Plot: y_d,3 and y*_3 vs. t]
Complete model (baseline and 24 inputs)

SLIDE 13

[Plot: y_d,4 and y*_4 vs. t]
Complete model (baseline and 25 inputs)

SLIDE 14

Nonuniqueness of the solution

For uniqueness of A, we need U_d to be of full row rank. Special cases that lead to rank deficiency of U_d:

  • Zero inputs cannot affect the output. Removing them leads to an equivalent reduced model. For maximum sparsity, assign zero weights in A to those inputs.
  • Inputs that are multiples of other inputs lead to essential nonuniqueness that cannot be resolved by the sparsity prior.

Preprocessing step: remove redundant inputs.
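The preprocessing step can be sketched as follows (my helper, not the challenge code); parallel rows are detected via the Cauchy-Schwarz equality condition:

```python
import numpy as np

def remove_redundant_inputs(Ud, tol=1e-10):
    """Drop zero rows and rows that are scalar multiples of an earlier
    row; return the reduced matrix and the indices of the kept rows."""
    keep = []
    for i, row in enumerate(Ud):
        if np.linalg.norm(row) < tol:
            continue                   # zero input: cannot affect any output
        dup = False
        for k in keep:
            r0 = Ud[k]
            # rows are parallel iff the Cauchy-Schwarz bound is attained
            if abs(np.dot(r0, row) ** 2 - np.dot(r0, r0) * np.dot(row, row)) < tol:
                dup = True
                break
        if not dup:
            keep.append(i)
    return Ud[keep], keep

U = np.array([[1., 0., 1., 0.],
              [0., 0., 0., 0.],       # zero input
              [2., 0., 2., 0.],       # multiple of row 0
              [0., 1., 1., 0.]])
U_red, kept = remove_redundant_inputs(U)
print(kept)  # [0, 3]
```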

SLIDE 15

Algorithm

  1. Input: U_d ∈ R^{m×T} and Y_d ∈ R^{p×T}.
  2. Preprocessing: detect and remove redundant inputs.
  3. For j = 1 to p:
     3.1 Identify the baseline → (ω*_j, φ*_j).
     3.2 Identify the I/O relation → sparsity pattern of a*_j.
     3.3 Solve (2) with fixed sparsity pattern, φ_j = φ*_j and ω_j = ω*_j → (b*_j, c*_j, a*_j).
  4. Postprocessing: add zero rows in A* corresponding to the removed inputs.
  5. Output: Y_bl(b*, c*, ω*, φ*) and A*.
SLIDE 16

Identification of the baseline:

  1. Let f′, φ′_j be the minimum value/minimizer of (1) with ω_j = 6π/T.
  2. Let f″, φ″_j be the minimum value/minimizer of (1) with ω_j = 12π/T.
  3. If f′ < f″, set ω*_j := 6π/T, φ*_j := φ′_j; else ω*_j := 12π/T, φ*_j := φ″_j.

Identification of the I/O relation:

  1. Let γ_j := ‖(y_d,j − y_bl,j) U_d(1:10, :)^+‖_1.
  2. Let a′_j be the solution to (3) with φ_j = φ*_j, ω_j = ω*_j.
  3. Determine the sparsity pattern of a′_j.
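Once the sparsity pattern is fixed, step 3.3 of the algorithm reduces to a plain linear least-squares refit in (b_j, c_j, a_j restricted to the support). A sketch with a hand-picked support and invented toy data:

```python
import numpy as np

rng = np.random.default_rng(5)
T, m = 300, 60
Ud = rng.integers(0, 2, size=(m, T)).astype(float)
t = np.arange(1, T + 1)
w_star, phi_star = 6 * np.pi / T, 0.3     # omega*_j, phi*_j from the baseline step

a_true = np.zeros(m)
a_true[[5, 20]] = [1.5, -2.0]
y = 900 + 70 * np.sin(w_star * t + phi_star) + a_true @ Ud

support = [5, 20]                          # sparsity pattern from the l1 step
# With omega, phi and the pattern fixed, (2) is linear LS in (b_j, c_j, a_j|support):
X = np.column_stack([np.ones(T), np.sin(w_star * t + phi_star), Ud[support].T])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
b_j, c_j, a_sup = coef[0], coef[1], coef[2:]
print(f"{b_j:.1f} {c_j:.1f}", a_sup.round(2))
```

On noiseless data the refit recovers b_j = 900, c_j = 70, and the support coefficients exactly.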

SLIDE 17

Results on the PROMO challenge

[Plot: % correctly identified inputs per output]

Total: 2321 true inputs, 1796 identified inputs, of which 507 correct.

Code: http://www.ecs.soton.ac.uk/~im/challenge.tar
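Restated as precision and recall (my terms, computed from the totals above):

```python
true_inputs, identified, correct = 2321, 1796, 507

precision = correct / identified   # fraction of identified inputs that are true
recall = correct / true_inputs     # fraction of true inputs that were found
print(f"precision {precision:.1%}, recall {recall:.1%}")
# precision 28.2%, recall 21.8%
```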