Understanding the Characteristics of Android Wear OS



SLIDE 1

Understanding the Characteristics of Android Wear OS

Renju Liu and Felix Xiaozhu Lin, Purdue ECE

SLIDE 2

The Wearable stack

SLIDE 3

Top questions

  • Wearables should enjoy
    – Baremetal performance
    – Baremetal efficiency
  • In this talk: Android Wear
    – Are we close to baremetal?
    – What is going on inside?
    – How should the OS evolve?

SLIDE 4

Observation -- Symptoms

  • The current performance & efficiency are far from baremetal
  • Pacing – inefficient
    – Face update: 400 ms, 88% busy

[Figure: clock face update]

SLIDE 5

Observation -- Symptoms

  • The current performance & efficiency are far from baremetal
  • Pacing – inefficient
    – Face update: 400 ms, 88% busy
  • Racing – slow
    – Launch an in-mem app: 1 sec

[Figure: launch “settings”]

SLIDE 6

What happens underneath?

[Figure: app launch timeline. Events: User touch, Launch action starts, App UI shown. Intervals: 177 ms, 810 ms]

SLIDE 7

What happens underneath?

[Figure: app launch timeline with power trace. Events: User touch, Launch action starts, App UI shown. Intervals: 177 ms, 810 ms. Y-axis: Power / mW (ticks at 500 and 1000)]

SLIDE 8

What happens underneath?

[Figure: app launch timeline with power trace and CPU execution. Events: User touch, Launch action starts, App UI shown. Phase 1: 177 ms; Phase 2: 810 ms. Annotations: Idle; Busy with various tasks. Y-axis: Power / mW (ticks at 500 and 1000)]

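SLIDE 9

What happens underneath?

[Figure: app launch timeline with power trace and CPU execution. Events: User touch, Launch action starts, App UI shown. Phase 1: 177 ms; Phase 2: 810 ms. Sub-intervals: 28 ms, 130 ms, 19 ms. Annotations: Idle; Busy with various tasks. Y-axis: Power / mW (ticks at 500 and 1000)]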
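
The timeline numbers can be turned into a quick breakdown. The sketch below assumes that Phase 1 (177 ms) and Phase 2 (810 ms) run back to back and together span user touch to app UI shown. The slide does not say how the 28/130/19 ms sub-intervals map onto the phases, so the code only sums them. All names are hypothetical.

```python
# Numbers are read off the launch timeline slide. How the phases map onto the
# events is an assumption made for illustration, not stated on the slide.
PHASE_1_MS = 177                  # labeled "Phase 1"
PHASE_2_MS = 810                  # labeled "Phase 2"
SUB_INTERVALS_MS = [28, 130, 19]  # unlabeled sub-intervals on the slide


def launch_breakdown(phase_1_ms: float, phase_2_ms: float) -> dict:
    """Return the total launch latency and each phase's share of it."""
    total_ms = phase_1_ms + phase_2_ms
    return {
        "total_ms": total_ms,
        "phase_1_share": phase_1_ms / total_ms,
        "phase_2_share": phase_2_ms / total_ms,
    }


breakdown = launch_breakdown(PHASE_1_MS, PHASE_2_MS)
print(f"total launch latency: {breakdown['total_ms']} ms")
print(f"phase 1 share: {breakdown['phase_1_share']:.1%}")
print(f"phase 2 share: {breakdown['phase_2_share']:.1%}")
print(f"sum of sub-intervals: {sum(SUB_INTERVALS_MS)} ms")
```

Under these assumptions the total is 987 ms, in line with the roughly 1 sec launch quoted earlier, and the three sub-intervals add up to 177 ms.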

SLIDE 10

Four Aspects

  • CPU busy?
  • CPU idle?
  • Thread-level parallelism (TLP)
  • Microarchitectural behaviors

Won’t talk about our methodologies

SLIDE 11

Profiling – Core Use Scenarios

  • Wakeup: update, notification, wrist…
  • Single Input: launch apps, palming, voice…
  • Interaction: game, notes, navigation
  • Sensing: accel, heart, baro

SLIDE 12

CPU busy | CPU idle | TLP | uArch

OS execution dominates CPU usage.

[Figure: CPU usage (0% to 100%) per scenario. Scenarios: update, notif, wrist, touch, lch.set, lch.calc, lch.game, palming, voice, game, notes, navi, accel, heart, baro. Groups: Wakeup, Single In., Interact., Sensing]

SLIDE 13

OS execution dominates CPU usage.

[Figure: CPU usage (0% to 100%) per scenario. Series shown: OS:Clockwork, OS:daemons]

SLIDE 14

OS execution dominates CPU usage.

[Figure: CPU usage (0% to 100%) per scenario. Series shown: Apps, OS:Clockwork, OS:daemons]

SLIDE 15

OS execution dominates CPU usage.

[Figure: CPU usage (0% to 100%) per scenario. Series shown: Idle, Apps, OS:Clockwork, OS:daemons]


SLIDE 17

OS execution dominates CPU usage.


SLIDE 19

Costly OS services are ...

SLIDE 20

Costly OS services are likely cruft.

SLIDE 21

Hot functions: highly skewed distribution

  • Top 5 → >20% CPU cycles
  • Top 50 → >50% CPU cycles

SLIDE 22

Hot functions: highly skewed distribution

  • Top 5 → >20% CPU cycles
  • Top 50 → >50% CPU cycles
  • Manipulating basic data structures
  • Legacy/improper OS designs

SLIDE 23

Hot functions: highly skewed distribution

  • Top 5 → >20% CPU cycles
  • Top 50 → >50% CPU cycles
  • Manipulating basic data structures
  • Legacy/improper OS designs

Anecdotes: Backlight, UI layout, low-mem killer
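
A "Top 5 / Top 50" figure of this kind is the cumulative share of cycles held by the k hottest functions. Here is a minimal sketch of that calculation. The profile is made up for illustration: the function names and cycle counts are hypothetical and are not the talk's data.

```python
from collections import Counter


def top_k_share(cycles_by_function: dict, k: int) -> float:
    """Fraction of all sampled cycles spent in the k hottest functions."""
    total = sum(cycles_by_function.values())
    if total == 0:
        return 0.0
    hottest = Counter(cycles_by_function).most_common(k)
    return sum(cycles for _, cycles in hottest) / total


# Hypothetical profile: function name -> sampled CPU cycles (made-up numbers).
profile = {
    "memcpy": 400,
    "hash_lookup": 250,
    "layout_pass": 150,
    "binder_call": 120,
    "gc_mark": 80,
}
# A long tail of cold functions, each contributing very little.
profile.update({f"cold_{i}": 10 for i in range(100)})

print(f"top-5 share:  {top_k_share(profile, 5):.1%}")
print(f"top-50 share: {top_k_share(profile, 50):.1%}")
```

A skewed profile shows up as a top-k share that rises steeply for small k and flattens out over the long tail.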

SLIDE 24

Idle episodes: plentiful and of various lengths

Scenario  Time (ms)  Pct. Overall  Episodes  Pct. Explained
notes         614.1         17.1%       376          100.0%
voice         843.3         50.5%       352          100.0%
lch.game      722.6         50.9%       205           99.9%
lch.calc      185.2         25.6%       110           92.9%
lch.set       153.6         15.6%       120           91.4%
touch          16.8         10.6%         6          100.0%
update        223.0         61.2%        44          100.0%
navi         2173.0         52.8%       912          100.0%
notif        4035.6         86.8%       277          100.0%
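
One way to see the "various lengths" point is to divide each scenario's idle time by its episode count. The sketch below does that under two assumptions: "Time (ms)" is taken to be total idle time across all episodes, and each row label is taken to follow its values in the extracted text. The helper name is hypothetical.

```python
# (time_ms, episodes) per scenario, copied from the table on the slide.
# Assumption: "Time (ms)" is the total idle time summed over all episodes,
# and each scenario label follows its row of values in the extracted text.
IDLE_TABLE = {
    "notes": (614.1, 376),
    "voice": (843.3, 352),
    "lch.game": (722.6, 205),
    "lch.calc": (185.2, 110),
    "lch.set": (153.6, 120),
    "touch": (16.8, 6),
    "update": (223.0, 44),
    "navi": (2173.0, 912),
    "notif": (4035.6, 277),
}


def mean_episode_ms(table: dict) -> dict:
    """Average idle episode length per scenario, in milliseconds."""
    return {name: time_ms / episodes for name, (time_ms, episodes) in table.items()}


means = mean_episode_ms(IDLE_TABLE)
for name, ms in sorted(means.items(), key=lambda item: item[1]):
    print(f"{name:>8}: {ms:6.2f} ms per idle episode")
```

Under these assumptions the averages range from about 1.3 ms (lch.set) to about 14.6 ms (notif).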

SLIDE 25

Idle anomalies are caused by …

[Figure: idle time per scenario, axis Time / ms. Panel with ticks at 250, 500, 750: update, lch.set, lch.game, notes. Panel with ticks at 2000, 4000: notif, navi. Labels: Device suspend, Voice UI, Cont. interaction, Cont. interact.+NetI/O, Storage I/O, User think, Bluetooth tail time, OS shell policy, App policy]

SLIDE 26

Idle anomalies are caused by …

  • Legacy/improper OS designs
  • Performance overprovisioning

Anecdote: Voice UI

[Figure: idle time per scenario, axis Time / ms. Panel with ticks at 250, 500, 750: update, lch.set, lch.game, notes. Panel with ticks at 2000, 4000: notif, navi. Labels: Device suspend, Voice UI, Cont. interaction, Cont. interact.+NetI/O, Storage I/O, User think, Bluetooth tail time, OS shell policy, App policy]

SLIDE 27

Substantial TLP on a par with desktop

[Figure: # of concurrent threads]


SLIDE 29

Substantial TLP on a par with desktop

[Figure: # of concurrent threads]

TLP: avg. busy CPU cores (over non-idle time)
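
The TLP definition above can be computed from a histogram of how long exactly n cores were busy. This is a minimal sketch of that calculation. The histogram representation and the sample numbers are assumptions for illustration, not the talk's measurement method or data.

```python
def thread_level_parallelism(time_with_n_busy: list) -> float:
    """Average number of busy cores over non-idle time.

    time_with_n_busy[n] is the time (any consistent unit) during which
    exactly n cores were busy; index 0 is fully idle time and is excluded
    from the denominator.
    """
    non_idle_time = sum(time_with_n_busy[1:])
    if non_idle_time == 0:
        return 0.0
    busy_core_time = sum(n * t for n, t in enumerate(time_with_n_busy))
    return busy_core_time / non_idle_time


# Hypothetical 4-core trace in ms: idle, then 1, 2, 3 and 4 cores busy.
sample_trace = [600, 250, 100, 40, 10]
tlp = thread_level_parallelism(sample_trace)
print(f"TLP = {tlp:.3f}")
```

A workload that only ever uses one core at a time yields a TLP of exactly 1, however long the device stays idle.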

SLIDE 30

…due to short interactions.

[Figure: # of concurrent threads]

TLP: avg. busy CPU cores (over non-idle time)

SLIDE 31

Apps are mostly single-threaded; OS contributes to TLP significantly.

SLIDE 32

Wearable suffers from uArch inefficiency

Cycles-per-instruction (lower is better): 2–5 (high!)

SLIDE 33

Wearable suffers from uArch inefficiency

Cycles-per-instruction (lower is better): 2–5 (high!)

Smartphone as a comparison:
  • 1.3–2.5: web rendering
  • <2: SPEC INT
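
Cycles-per-instruction is the ratio of elapsed CPU cycles to retired instructions. The sketch below computes it and checks the result against the ranges quoted on the slide. The counter readings are hypothetical, and how the counters were collected is not covered here.

```python
def cycles_per_instruction(cycles: int, instructions: int) -> float:
    """CPI = CPU cycles / retired instructions (lower is better)."""
    if instructions <= 0:
        raise ValueError("instructions must be positive")
    return cycles / instructions


# CPI ranges quoted on the slide.
WEARABLE_CPI_RANGE = (2.0, 5.0)
SMARTPHONE_WEB_CPI_RANGE = (1.3, 2.5)

# Hypothetical counter readings, for illustration only.
cpi = cycles_per_instruction(cycles=3_600_000, instructions=1_000_000)
in_wearable_range = WEARABLE_CPI_RANGE[0] <= cpi <= WEARABLE_CPI_RANGE[1]
in_smartphone_range = SMARTPHONE_WEB_CPI_RANGE[0] <= cpi <= SMARTPHONE_WEB_CPI_RANGE[1]

print(f"CPI = {cpi:.2f}")
print(f"within wearable range (2 to 5): {in_wearable_range}")
print(f"within smartphone web rendering range (1.3 to 2.5): {in_smartphone_range}")
```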


SLIDE 36

The major cause: complex OS code

(L1 icache, iTLB, and branch predictor)

SLIDE 37

The major cause: complex OS code

(L1 icache, iTLB, and branch predictor)

uArch problem will NOT be gone with future wearable CPUs

SLIDE 38

Four Aspects

CPU busy
  • OS dominates
  • Lots of cruft
  • Skewed hot functions
  • Legacy bottlenecks

CPU idle
  • Anomalous
  • OS flaws
  • Too much performance

Thread-level parallelism
  • Desktop-like
  • OS-contributed

Microarchitectural behaviors
  • Mismatch
  • OS code complexity

SLIDE 39

Repair, don’t overhaul (yet)

CPU busy
  • OS dominates
  • Lots of cruft
  • Skewed hot functions
  • Legacy bottlenecks

CPU idle
  • Anomalous
  • OS flaws
  • Too much performance

Thread-level parallelism
  • Desktop-like
  • OS-contributed

Microarchitectural behaviors
  • Mismatch
  • OS code complexity

SLIDE 40

How about after that? (i.e. “next-gen wearable OS”)

We probably will reach a point when OS overhaul/redesign is justified.

Specializing OS for common, single-app scenarios

SLIDE 41

Restructuring OS for Wearable

[Figure: OS stack diagram. Labels: Full, Simple, Activity Manager, Window Manager, OS Daemons, Kernel]

Specializing OS for common, single-app scenarios

SLIDE 42

Restructuring OS for Wearable

[Figure: OS stack diagram. Labels: Apps, Full, Simple, Activity Manager, Window Manager, OS Daemons, Kernel]

SLIDE 43

Restructuring OS for Wearable

[Figure: OS stack diagram. Labels: Apps, Full, Simple, Activity Manager, Window Manager, OS Daemons, Kernel]

SLIDE 44

Final takeaway

  • Wearables: unique usage and hardware
  • Many mobile OS tradeoffs are invalid
    – efficiency vs. flexibility & programming ease
  • Immediate actions: fixing individual OS components
  • Future: OS specialization may be needed

Tools, data, and benchmark videos: xsel.rocks/p/wear

SLIDE 45

FAQ

  • You forgot Apple Watch or Samsung Tizen.
  • Isn’t your discovery just some oversight of Google engineers?
  • Aren’t these things easy to fix?
  • Doesn’t multicore wearable sound crazy?
  • Power! I want to learn about power.
  • I bet the Android Wear team already fixed these!

xsel.rocks/p/wear

SLIDE 46

Has Android Wear improved?
