Lecture 17: Scheduling Sanjay Rajopadhye Computer Science, Colorado - - PowerPoint PPT Presentation



slide-1
SLIDE 1

High-Performance Embedded Systems-on-a-Chip

Lecture 17: Scheduling

Sanjay Rajopadhye Computer Science, Colorado State University

High-Performance Embedded Systems-on-a-Chip – p.1/18

slide-2
SLIDE 2

Limitations of Systolic Arrays

Only a (very small) proper subset of SAREs: Those that

  • are Serializable,
  • are Localizable,
  • correspond to a Single Equation, and
  • admit a One-dimensional schedule.

Question: What is beyond systolic arrays?


slide-3
SLIDE 3

(Silicon) Compilation

  • For each point in the domain of each variable, determine:
      • A time instant: the schedule
      • A place: the processor and memory allocation
  • Transform the P-SARE so that indices denote either
      • time,
      • processor, or
      • memory address
  • Generate code (or HDL, we hope)


slide-4
SLIDE 4

Two Orthogonal Issues

  • Static Analysis: what transformation to apply
      • scheduling
      • processor (& memory) allocation
  • Program Transformation: manipulating the SARE
      • Rules to modify the SARE (Change of Basis)
      • Code Generation (how to interpret the transformed SARE)


slide-5
SLIDE 5

Golden Rule of Static Analysis

The dependence graph cannot be explicitly constructed

  • Too large
  • Not (fully) known at compile time – parameters
  • Explicitly constructed results are not useful

Implication: use compact information


slide-6
SLIDE 6

Compact information

Reduced Dependence (Multi) Graph (RDG)

  • Nodes: variables in the SARE
  • Edges: for each occurrence of $Y[f(z)]$ on the rhs of the equation for $X$, an edge from $X$ to $Y$. Labeled with
      • the dependence function, $f$
      • the (sub) domain of $X$ where it occurs
  • Miscellaneous info (e.g. duration, etc.)
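One way to picture the RDG as a data structure is sketched below. The `Edge` class, its field names, and the two-variable SURE used to populate it are all hypothetical choices made for illustration. The point is only that the graph has one node per variable and one labeled edge per occurrence, whatever the size of the domain.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Edge:
    consumer: str      # variable whose equation contains the occurrence
    producer: str      # variable that is read
    dependence: tuple  # offset d of a uniform dependence z -> z + d
    subdomain: str     # part of the consumer's domain where the occurrence applies

# Hypothetical SURE used only as an illustration:
#   X[i,j] = f(X[i-1,j+1])
#   Y[i,j] = g(Y[i+1,j-1], X[i,j])
rdg_nodes = {"X", "Y"}
rdg_edges = [                      # a list, so parallel edges (multigraph) are allowed
    Edge("X", "X", (-1, 1), "i >= 1"),
    Edge("Y", "Y", (1, -1), "j >= 1"),
    Edge("Y", "X", (0, 0), "whole domain"),
]

def self_loops(edges):
    """Edges whose consumer and producer are the same variable."""
    return [e for e in edges if e.consumer == e.producer]

print(len(rdg_nodes), len(rdg_edges), len(self_loops(rdg_edges)))
```

The graph stays this small even if the domain has millions of points, which is what makes it usable for static analysis.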


slide-7
SLIDE 7

Key Problem: Scheduling

  • Definition: A function $t$ such that whenever $X[z]$ depends on $Y[z']$, then $t(X, z) > t(Y, z')$.
  • Affine schedules: $t(X, z) = \tau_X z + \alpha_X = \tau_{X,1} z_1 + \cdots + \tau_{X,n} z_n + \alpha_X$

Geometric interpretation: all points executed at time $t_0 = \tau_X z + \alpha_X$ belong to an isotemporal hyperplane with normal vector $\tau_X$.
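The definition can be checked directly on a small, fully enumerated domain. The sketch below assumes a single variable with uniform dependences. The function name `is_valid_schedule`, the 5x5 box domain, and the two dependence vectors are hypothetical. Enumerating points like this is exactly what a compiler cannot do in general; it only makes the definition concrete.

```python
from itertools import product

def is_valid_schedule(schedule, dependences, domain):
    """True if every point is scheduled strictly after each point it depends on."""
    points = set(domain)
    for z in points:
        for d in dependences:
            src = tuple(zi + di for zi, di in zip(z, d))  # the point that z reads
            if src in points and not schedule(z) > schedule(src):
                return False
    return True

# Hypothetical example: X[i,j] reads X[i-1,j] and X[i,j-1] on a 5x5 box.
domain = list(product(range(5), repeat=2))
deps = [(-1, 0), (0, -1)]

ok = is_valid_schedule(lambda z: z[0] + z[1], deps, domain)   # tau = (1, 1), alpha = 0
bad = is_valid_schedule(lambda z: z[0] - z[1], deps, domain)  # tau = (1, -1), alpha = 0
print(ok, bad)
```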


slide-8
SLIDE 8

Scheduling a (single) URE

$X[z] = g(X[z + d_1], \ldots, X[z + d_m]), \quad z \in D$

Its RDG is just

  • one node,
  • with $m$ self loops,
  • each labeled with a vector $d_i$.


slide-9
SLIDE 9

Scheduling a single URE

$(\tau, \alpha)$ is valid iff for $i = 1 \ldots m$, and $\forall z \in D$,

$\tau z + \alpha > \tau (z + d_i) + \alpha = \tau z + \tau d_i + \alpha$

i.e., $\tau d_i < 0$

  • Finite number of constraints, independent of domain size. Scheduling reduces to Linear Programming.
  • Geometric view: Choose the hyperplanes so that dependences point backwards
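A minimal sketch of the finite constraint system follows. It checks $\tau d_i < 0$ for each dependence vector and, instead of calling an LP solver, uses a small exhaustive search over integer $\tau$ as a stand-in. The helper names, the search bound, the box-shaped domain, and the latency measure are all assumptions made for this illustration.

```python
from itertools import product

def tau_is_valid(tau, dep_vectors):
    """tau is valid iff tau . d < 0 for every dependence vector d (no domain needed)."""
    return all(sum(t * c for t, c in zip(tau, d)) < 0 for d in dep_vectors)

def latency(tau, extent):
    """Spread of tau . z over the box [0, extent-1]^n, a proxy for total time."""
    return sum(abs(t) * (extent - 1) for t in tau)

def best_integer_tau(dep_vectors, extent, bound=3):
    """Exhaustive stand-in for the LP: try every integer tau in [-bound, bound]^n."""
    n = len(dep_vectors[0])
    candidates = [tau for tau in product(range(-bound, bound + 1), repeat=n)
                  if tau_is_valid(tau, dep_vectors)]
    return min(candidates, key=lambda tau: latency(tau, extent), default=None)

deps = [(-1, 0), (0, -1)]  # hypothetical dependence vectors d_1, d_2
print(best_integer_tau(deps, extent=8))
```

The validity test touches only the $m$ dependence vectors, never the points of the domain, so its cost does not grow with the problem size.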


slide-10
SLIDE 10

Example

[Figure: dependence graph]

$X[i, j] = f(X[i-1, j],\ X[i, j-1])$, with schedule $t(i, j) = a i + b j + c$

Schedule validity conditions

$[a, b]\,[0, -1]^T < 0$, $[a, b]\,[-1, 0]^T < 0$, $c \ge 0$

i.e.,

$\{ a, b, c \mid a, b > 0,\ c \ge 0 \}$

Optimal schedule: $t(i, j) = i + j$
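To see why $i + j$ cannot be improved, one can compute the longest dependence chain ending at each point, since no schedule can be faster than that. The sketch below assumes the recurrence $X[i,j] = f(X[i-1,j], X[i,j-1])$ on a hypothetical domain $0 \le i, j < N$, with reads outside the domain treated as inputs.

```python
from functools import lru_cache

N = 6  # hypothetical domain: 0 <= i, j < N

@lru_cache(maxsize=None)
def earliest_time(i, j):
    """Length of the longest dependence chain ending at X[i, j]."""
    preds = [(i - 1, j), (i, j - 1)]
    inside = [p for p in preds if p[0] >= 0 and p[1] >= 0]
    return 1 + max(earliest_time(*p) for p in inside) if inside else 0

# The affine schedule t(i, j) = i + j meets this lower bound at every point.
matches = all(earliest_time(i, j) == i + j for i in range(N) for j in range(N))
print(matches)
```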


slide-11
SLIDE 11

Scheduling an SURE

  • Single schedule for all variables: not general enough. Some well-defined SUREs don't admit such a schedule (e.g. the convolution example)
  • Shifted linear schedules: allow the $\alpha_X$ to be different for each variable $X$, but with the same $\tau$. Also not general enough
  • Variable dependent schedules (different slopes for different variables): not general enough either
  • Multidimensional schedules: most general, but still not enough


slide-12
SLIDE 12

Limits of shifted linear schedules

$X[i, j] = f(X[i-1, j+1])$
$Y[i, j] = g(Y[i+1, j-1])$

  • This SURE cannot be scheduled with same-slope lines for both $X$ and $Y$.
  • But do a simple CoB (transpose one of the vars) and now it can.
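A short derivation of why no same-slope schedule exists, written as a sketch. It assumes the dependences are $X[i,j]$ on $X[i-1,j+1]$ and $Y[i,j]$ on $Y[i+1,j-1]$. The names $a$, $b$, $c_X$, $c_Y$ and $Y'$ are introduced here for illustration only.

```latex
% Shifted linear schedule: same slope (a, b), variable-dependent shifts.
t_X(i,j) = a i + b j + c_X, \qquad t_Y(i,j) = a i + b j + c_Y

% X[i,j] depends on X[i-1,j+1]:
t_X(i,j) > t_X(i-1,j+1) \iff a - b > 0

% Y[i,j] depends on Y[i+1,j-1]:
t_Y(i,j) > t_Y(i+1,j-1) \iff b - a > 0

% The two conditions contradict each other, so no common (a, b) exists.
% Change of basis: let Y'[i,j] = Y[j,i]. Then
Y'[i,j] = g(Y'[i-1,j+1])
% which has the same dependence vector as X, so any a > b is valid,
% for example t(i,j) = i for both variables.
```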


slide-13
SLIDE 13

A less contrived example

[Figure: dependence graph]

$X[i, j] = f(X[i-1, j+1])$
$Y[i, j] = g(Y[i+1, j-1],\ X[i, j])$

$t_X(i, j) = a_X i + b_X j + c_X$
$t_Y(i, j) = a_Y i + b_Y j + c_Y$

Optimal solution

$t_X(i, j) = i$
$t_Y(i, j) = i + 2j + 1$
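The variable-dependent schedule can be checked mechanically. The sketch below assumes the equations $X[i,j] = f(X[i-1,j+1])$ and $Y[i,j] = g(Y[i+1,j-1], X[i,j])$ on a hypothetical box domain $0 \le i, j < N$. It confirms that $t_X = i$, $t_Y = i + 2j + 1$ violates no dependence, while a same-slope attempt does. It tests validity only, not optimality.

```python
from itertools import product

N = 6  # hypothetical domain: 0 <= i, j < N
domain = set(product(range(N), repeat=2))

# (consumer, producer, offset): consumer[i, j] reads producer[i + di, j + dj]
deps = [("X", "X", (-1, 1)), ("Y", "Y", (1, -1)), ("Y", "X", (0, 0))]

def violations(t, deps, domain):
    """List every dependence that is not strictly respected by the schedule t."""
    bad = []
    for (i, j) in domain:
        for consumer, producer, (di, dj) in deps:
            src = (i + di, j + dj)
            if src in domain and t[consumer](i, j) <= t[producer](*src):
                bad.append((consumer, (i, j), producer, src))
    return bad

per_variable = {"X": lambda i, j: i, "Y": lambda i, j: i + 2 * j + 1}
same_slope = {"X": lambda i, j: i - j, "Y": lambda i, j: i - j + 1}

print(len(violations(per_variable, deps, domain)))    # no violations expected
print(len(violations(same_slope, deps, domain)) > 0)  # the Y self-dependence fails
```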


slide-14
SLIDE 14

Variable dependent schedules


slide-15
SLIDE 15

Exercise

[Figure: dependence graph]

$X[i, j] = g(X[i-1, j+1],\ Y[i, j-1])$
$Y[i, j] = f(X[i, j],\ Y[i+1, j-1])$

Find the length of the longest path reaching the green (cf. red) node at $[i, j]$


slide-16
SLIDE 16

Solution

  • The SURE is inherently sequential (exercise: check it)

$t_X(i, j) = (i + j)(i + j + 1) + i = i^2 + j^2 + 2ij + 2i + j$
$t_Y(i, j) = (i + j)(i + j + 1) + i + 2j + 1 = i^2 + j^2 + 2ij + 2i + 3j + 1$

  • Schedule is quadratic, NOT AFFINE
  • Generalization: polynomial schedules
  • Breakdown of the geometric interpretation as hyperplanes
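The quadratic formulas can be compared against a direct longest-path computation. The sketch below assumes the exercise's equations are $X[i,j] = g(X[i-1,j+1], Y[i,j-1])$ and $Y[i,j] = f(X[i,j], Y[i+1,j-1])$, and assumes a triangular domain $i, j \ge 0$, $i + j \le N$ with reads outside the domain treated as inputs. Both the domain and the helper names are hypothetical.

```python
from functools import lru_cache

N = 8  # hypothetical domain: i >= 0, j >= 0, i + j <= N

def in_domain(i, j):
    return i >= 0 and j >= 0 and i + j <= N

def predecessors(var, i, j):
    """Points read by var[i, j], restricted to the domain."""
    if var == "X":   # X[i,j] = g(X[i-1,j+1], Y[i,j-1])
        preds = [("X", i - 1, j + 1), ("Y", i, j - 1)]
    else:            # Y[i,j] = f(X[i,j], Y[i+1,j-1])
        preds = [("X", i, j), ("Y", i + 1, j - 1)]
    return [p for p in preds if in_domain(p[1], p[2])]

@lru_cache(maxsize=None)
def longest_path(var, i, j):
    """Number of edges on the longest dependence chain ending at var[i, j]."""
    preds = predecessors(var, i, j)
    return 1 + max(longest_path(*p) for p in preds) if preds else 0

def t_X(i, j):
    return (i + j) * (i + j + 1) + i

def t_Y(i, j):
    return (i + j) * (i + j + 1) + i + 2 * j + 1

points = [(i, j) for i in range(N + 1) for j in range(N + 1 - i)]
agrees = all(longest_path("X", i, j) == t_X(i, j) and
             longest_path("Y", i, j) == t_Y(i, j) for i, j in points)
print(agrees)
```

Under these assumptions every point gets a distinct time step, which is another way of saying the SURE is inherently sequential.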


slide-17
SLIDE 17

Polynomial/Multidimensional Schedules

$t_X[i, j] = (i + j,\ i)$
$t_Y[i, j] = (i + j,\ i + 2j + 1)$
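A multidimensional schedule orders time vectors lexicographically, which is how Python compares tuples. The sketch below assumes the dependences $X[i,j]$ on $X[i-1,j+1]$ and $Y[i,j-1]$, and $Y[i,j]$ on $X[i,j]$ and $Y[i+1,j-1]$, over a hypothetical triangular domain $i, j \ge 0$, $i + j \le N$, and checks that each consumer's time vector is strictly later than its producer's.

```python
N = 8  # hypothetical domain: i >= 0, j >= 0, i + j <= N

def t_X(i, j):
    return (i + j, i)

def t_Y(i, j):
    return (i + j, i + 2 * j + 1)

def in_domain(i, j):
    return i >= 0 and j >= 0 and i + j <= N

checks = []
for i in range(N + 1):
    for j in range(N + 1 - i):
        # (time of consumer, time of producer, index of producer)
        pairs = [
            (t_X(i, j), t_X(i - 1, j + 1), (i - 1, j + 1)),
            (t_X(i, j), t_Y(i, j - 1), (i, j - 1)),
            (t_Y(i, j), t_X(i, j), (i, j)),
            (t_Y(i, j), t_Y(i + 1, j - 1), (i + 1, j - 1)),
        ]
        # Tuple comparison in Python is lexicographic.
        checks += [now > before for now, before, src in pairs if in_domain(*src)]

all_ok = all(checks)
print(all_ok)
```

The first component orders the anti-diagonals $i + j$, and the second orders the points within one anti-diagonal.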


slide-18
SLIDE 18

Scheduling Algorithm

$(\tau, \alpha)$ is valid iff for $i = 1 \ldots m$, and $\forall z \in D$,

$\tau z + \alpha > \tau (z + d_i) + \alpha = \tau z + \tau d_i + \alpha$

i.e., $\tau d_i < 0$

  • Finite number of constraints, independent of domain size. Scheduling reduces to Linear Programming.
  • Geometric view: Choose the hyperplanes so that dependences point backwards
