EECS 753: Embedded Real-Time Systems (Heechul Yun)

SLIDE 1

EECS 753: Embedded Real-Time Systems

Heechul Yun

SLIDE 2

Welcome to EECS753

  • About the course

SLIDE 3

About Instructor

  • Heechul Yun

– Assistant Prof., Dept. of EECS, University of Kansas (Aug. ’13 ~ )
– Office: 3040 Eaton, 236 Nichols
– Email: heechul.yun@ku.edu

  • Education

– Ph.D. (CS), University of Illinois at Urbana-Champaign
– M.S. (CS) and B.S. (CS), KAIST

  • Professional Experience

– Senior software engineer @ Samsung Electronics

  • Research Areas

– Operating systems, embedded/real-time systems

  • More Information

– http://ittc.ku.edu/~heechul

SLIDE 4

About This Class

  • Topics

– Embedded real-time systems, cyber-physical systems

  • Prerequisites

– EECS 645 Computer Architecture
– EECS 678 Introduction to Operating Systems

  • No textbook!
  • Course website

– http://ittc.ku.edu/~heechul/courses/eecs753

  • Audience

– Graduate students (and senior undergraduates) who are interested in research

  • Times

– Lecture: M/W/F 10:00 – 11:00, LEA 1131
– Office hours: M/F 11:00 – 11:50 @ 3040 Eaton

SLIDE 5

About This Class

  • Seminar style

– Both you and I will present (more on this later)

  • Goals

– Learn and discuss advanced topics in real-time embedded systems
– Improve your research skills

  • Research skills

– Learning skills

  • Quickly learn from papers, books, and the internet!

– Communication skills

  • Written form = paper, oral form = presentation

– Programming skills

  • Need to build some “interesting” things

SLIDE 6

Topics

  • Introduction to Real-Time Systems, CPS
  • CPS Applications: Intelligent Vehicles
  • Real-time Hardware Architecture
  • Real-time OS and Middleware
  • Fault tolerance and Security

Image: Amazon Prime Air

SLIDE 7

Methodology

  • Learning by reading research papers
  • Learning by building an actual system

SLIDE 8

Reading Papers

  • You are expected to

– Read assigned papers (~two/week)
– Summarize them

  • Reading papers well is an important skill

– A good reference: “How to Read a Paper”

SLIDE 9

Written Summary

  • Each summary should include:

– Summary of main ideas
– What you liked
– What you disliked

  • Submit via Blackboard

SLIDE 10

Written Summary: Example

  • [Summary] This paper presents a kernel-level page allocator which is DRAM bank-aware. This allocator is able to allocate pages across cores in a way that causes banks to be shared or partitioned depending on user configuration. This can be used to provide more predictable memory access to multicore software. The authors implemented their memory allocator in a recent version of the Linux kernel and compared its performance with the existing buddy allocator.

  • [The good] This paper is well written. The issue of DRAM banks was not familiar to me at the time of reading but was well explained, which motivated the rest of the paper well. The algorithm used is quite straightforward and the explanation is easy to follow.

  • [The bad] While the authors acknowledge that the approach they take bears similarity to multi-core page coloring [1,2,3,4], the novelty of their work is not well established. This work appears to be a relatively straightforward application of rudimentary page coloring techniques. The related work section touches on these similarities but does not establish any particular novelty aside from the fact that this paper is addressing the problem of shared DRAM banks for the sake of isolation and not shared caches.

SLIDE 11

Lecture Organization

  • Typical week

– Mon: Lecture on the week’s topic
– Wed/Fri: Paper presentations

  • Paper presentation

– I will introduce the paper
– I (or you) will present the paper
– We will discuss the paper

  • You are required to present

– One (or two) papers per semester
– May change depending on the class size

SLIDE 12

Reading List

  • Posted on the class website

– Subject to change
– Mostly recent papers and some classic ones

  • Sign-up process

– Email me the paper from the list that you want to present
– I will update the schedule on a first-come, first-served (FCFS) basis

SLIDE 13

Paper Presentation & Discussion

  • Suggested structure (30min)

– Motivation & Background

  • Ask: why did the authors write this paper?

– Explain the main ideas

  • From your perspective; be careful about their assumptions

– Discussion topics

  • Questions: “I don’t understand XXX.”
  • Critiques: “This approach seems bad because …”
  • Submission

– Draft: by 5:00 p.m. the day before your presentation
– Final version: before the class begins

SLIDE 14

Final Exam

  • No midterm exam
  • Early final exam on April 15

SLIDE 15

Homework & Mini Projects

  • Plan

– Two homework assignments on a Linux PC
– Two mini projects using a Raspberry Pi 3

  • About

– Basic real-time scheduling
– Basic AI and self-driving car

SLIDE 16

NVIDIA DAVE-2 Self-Driving Car

  • Use a Convolutional Neural Network (CNN) to drive a car.
  • Trained with human driving data
  • Could successfully drive a car on public roads w/o human

DAVE-2 CNN: 9 layers, ~250K parameters, ~27M connections

Source: https://devblogs.nvidia.com/deep-learning-self-driving-cars/

Video: https://www.youtube.com/watch?v=NJU9ULQUwng

SLIDE 17

Term Project

  • 2nd half of the semester
  • DeepPicar Competition

– Build a self-driving car
– Based on DeepPicar
– Competition format

SLIDE 18

DeepPicar Competition

  • Goal

– Safely drive autonomously on a given track
– Using camera and Deep Neural Network (DNN)

  • Metrics

– Distance and time

  • Your tasks

– Build a car (instructions and materials will be given)
– Develop/tune the AI (basic code will be given)

SLIDE 19

(Tentative) Project Schedule

  • 3/18: Materials ready (build start)
  • 4/22: Manual driving check
  • 4/29: Autonomous driving check
  • 5/06: Competition
  • 5/15: Final report due

– 5 pages
– Must be written using LaTeX

SLIDE 20

LaTeX

  • Everybody in CS uses it to write papers

– Final report must be prepared using LaTeX

  • Overleaf

– https://www.overleaf.com

  • Ubuntu

– Install texlive-full

  • Windows

– Install MiKTeX

SLIDE 21

Grading

  • Paper summaries (20%)
  • Student presentations (10%)
  • Final exam (30%)
  • Mini projects (5%)
  • Homework (5%)
  • Project (30%)

– Competition: 20%
– Final report: 10%

SLIDE 22

Grading

  • 90+ : A
  • 80-89: B
  • 70-79: C
  • 50-69: D
  • 0-49: F

SLIDE 23

Office Hours

  • M/F 11:00 – 11:50 at 3040 Eaton
  • By appointment at 236 Nichols

– heechul.yun@ku.edu

SLIDE 24

Introduce Yourselves

  • Name
  • Status: grad/undergrad, year
  • Relevant background
  • Interests

– What do you want to learn in this class?

SLIDE 25

Today

  • Course overview

SLIDE 26

Embedded Systems

  • Computing systems designed for a specific purpose.
  • Embedded systems are everywhere

SLIDE 27

Today’s Car

  • Quiz. How many embedded processors are in a car?

– A: ~100s

Simon Fürst, BMW, EMCC2015 Munich, adapted from OSPERT2015 keynote

SLIDE 28

Future Automotive Systems

A. Hamann. “Industrial challenges: Moving from classical to high performance real-time systems.” In International Workshop on Analysis Tools and Methodologies for Embedded and Real-time Systems (WATERS), July 2018

SLIDE 29

Trends

  • More powerful and cheaper computing
  • More connected

SLIDE 30

Internet of Things (IoT)

  • IoT ~= Internet-connected embedded systems

SLIDE 31

Cyber-Physical Systems (CPS)

  • Cyber system (Computer) + Physical system (Plant)
  • Still embedded systems, but integration of physical systems is emphasized.

SLIDE 32

Real-Time Systems

  • The correctness of the system depends not only on the logical result of the computation but also on the time at which the results are produced
  • A correct value at a wrong time is a fault.
  • CPS are often real-time systems

– Because physical processes depend on time
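
The definition above can be made concrete with a small sketch. It is only an illustration, not a real-time implementation: `run_job`, `is_correct`, `slow_job`, and the 1 ms deadline are hypothetical names and values chosen here, and a general-purpose Python runtime gives no timing guarantees.

```python
import time


def run_job(job, relative_deadline_s):
    """Run job() and report whether it finished within its relative deadline."""
    start = time.perf_counter()
    value = job()
    elapsed = time.perf_counter() - start
    return value, elapsed <= relative_deadline_s


def is_correct(value, expected, on_time):
    """Real-time correctness: the right value AND delivered on time."""
    return value == expected and on_time


def slow_job():
    time.sleep(0.05)  # stands in for a computation that overruns its budget
    return 42


value, on_time = run_job(slow_job, relative_deadline_s=0.001)
# The value is logically right, yet the result counts as a fault because it is late.
print(value, on_time, is_correct(value, 42, on_time))
```

The same `slow_job` would count as correct under a looser deadline, which is the point of the slide: correctness is a property of the value and its timing together.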

SLIDE 33

CPS Requirements

  • Real-time performance

– Meet deadlines in processing large amounts of real-time data from various sensors (e.g., autonomous cars)
– Many constraints: size, weight, and power (SWaP); cost

  • Safety

– Interact with the environment and humans in real time
– Can hurt humans, destroy things, blow up (e.g., nuclear plants)
– Need both logical and temporal (time) correctness

  • Security

– Communicate over the internet (cloud servers etc.)
– Remote software update (fix bugs, …)
– Run untrusted 3rd party software (e.g., Apple CarPlay)

SLIDE 34

Performance

  • Many cyber-physical systems (CPS) need:

– More performance
– Less cost, size, weight, and power

  • CMU’s “Boss” self-driving car, circa 2007: 10 dual-processor blade servers in the trunk
  • Audi’s zFAS platform (Audi A8), 2016-2018: a single-board computer with multiple CPUs, GPU, FPGA

SLIDE 35

Compute Performance Demand

Intel, “Technology and Computing Requirements for Self-Driving Cars”

SLIDE 36

Real-Time Data

  • Real-time data from many sensors needs powerful computers

Source: http://on-demand.gputechconf.com/gtc/2015/presentation/S5870-Daniel-Lipinski.pdf

SLIDE 37

Size, Weight, and Power (SWaP) Constraints

  • Maximum performance with minimal resources

– Cannot afford too many or too power-hungry ECUs

Figure source: OSPERT 2015 Keynote by Leibinger

SLIDE 38

Mobileye EyeQ4

  • Real-time vision processor w/ DNN
  • 2.5 teraflops @ 3W
  • 8 cameras @ 36 fps
  • Tesla uses EyeQ3
  • 14 cores

– 4 MIPS cores
– 10 vector cores

SLIDE 39

Nvidia’s Drive PX2 Platform

  • 12 CPU + 2 GPU

– 8 teraflops @ 250W

  • Real-time processing of

– Up to 12 cameras, radar, ..
– Deep Neural Network (DNN) for detection, classification

http://www.nvidia.com/object/drive-px.html
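
To relate these platform figures to the SWaP constraint, the sketch below computes performance per watt from the numbers quoted on the Mobileye and NVIDIA slides (2.5 teraflops at 3 W, 8 teraflops at 250 W). It takes the slide figures at face value; the two vendors may count operations differently, so the ratio is a rough comparison only, and `tflops_per_watt` is a helper name introduced here.

```python
# Figures as quoted on the two platform slides: (teraflops, watts).
platforms = {
    "Mobileye EyeQ4": (2.5, 3.0),
    "NVIDIA Drive PX2": (8.0, 250.0),
}


def tflops_per_watt(tflops, watts):
    """Energy efficiency: peak throughput divided by power draw."""
    return tflops / watts


efficiency = {name: tflops_per_watt(*spec) for name, spec in platforms.items()}

for name, value in efficiency.items():
    print(f"{name}: {value:.3f} TFLOPS/W")
```

A platform with lower peak throughput can still be the better fit for a vehicle if its throughput per watt is much higher.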

SLIDE 40

Safety Failures

Therac-25

  • Computer-controlled medical X-ray treatments
  • Six people died/injured due to massive overdoses (1985-1987)
  • Caused by synchronization mistakes

Ariane 5

  • 7 billion dollar rocket was destroyed after 40 secs (6/4/1996)
  • “caused by the complete loss of guidance and attitude information”
  • Caused by 64-bit floating point to 16-bit integer conversion
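
To illustrate why a 64-bit floating point to 16-bit integer conversion is hazardous, the sketch below narrows a value with and without a range check. This is not the Ariane 5 flight code; the function names and the sample reading 40000.0 are made up for illustration.

```python
import ctypes

INT16_MIN, INT16_MAX = -32768, 32767


def unchecked_to_int16(value):
    """Keep only the low 16 bits: out-of-range values silently wrap around."""
    return ctypes.c_int16(int(value)).value


def checked_to_int16(value):
    """Refuse to narrow a value that does not fit in a signed 16-bit integer."""
    if not INT16_MIN <= value <= INT16_MAX:
        raise OverflowError(f"{value} does not fit in a 16-bit integer")
    return int(value)


reading = 40000.0                      # hypothetical sensor-derived value
wrapped = unchecked_to_int16(reading)  # wraps to a negative number
print(reading, "->", wrapped)

try:
    checked_to_int16(reading)
except OverflowError as err:
    print("rejected:", err)
```

Neither outcome is safe by itself: a silent wrap corrupts the data, and an exception that nobody handles can take the whole component down, so the range of every narrowed value has to be analyzed.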

SLIDE 41

Air France 447 (2009)

  • Airbus A330 crashed into the Atlantic Ocean in 2009
  • Caused in part by computer’s misguidance

– Pitot tube (speed sensor) failure → Flight Director (FD) malfunction (shows “head up”) → pilots follow the faulty FD → enter stall

http://www.spiegel.de/international/world/experts-say-focus-on-manual-flying-skills-needed-after-air-france-crash-a-843421.html
http://www.slate.com/blogs/the_eye/2015/06/25/air_france_flight_447_and_the_safety_paradox_of_airline_automation_on_99.html

[Figure panels: Stall, Normal]

SLIDE 42

Lion Air Flight 610 (2018)

  • Boeing 737 crashed into the Java Sea in 2018
  • Caused by stall prevention system (MCAS)

– Sensor error (indicates the plane is in a “stall”) → nose down (toward the ocean)

SLIDE 43

Tesla Autopilot (2016)

  • Tesla Autopilot failed to recognize a trailer, resulting in the death of the driver

http://www.nytimes.com/interactive/2016/07/01/business/inside-tesla-accident.html

SLIDE 44

NHTSA Report

  • Both the radar and camera sub-systems are designed for front-to-rear collision prediction mitigation or avoidance.
  • The system requires agreement from both sensor systems to initiate automatic braking.
  • The camera system uses Mobileye’s EyeQ3 processing chip which uses a large dataset of the rear images of vehicles to make its target classification decisions.
  • Complex or unusual vehicle shapes may delay or prevent the system from classifying certain vehicles as targets/threats

https://static.nhtsa.gov/odi/inv/2016/INCLA-PE16007-7876.PDF

SLIDE 45

NHTSA Report

  • Object classification algorithms in the Tesla and peer vehicles with AEB technologies are designed to avoid false positive brake activations.
  • The Florida crash involved a target image (side of a tractor trailer) that would not be a “true” target in the EyeQ3 vision system dataset, and
  • The tractor trailer was not moving in the same longitudinal direction as the Tesla, which is the vehicle kinematic scenario the radar system is designed to detect

https://static.nhtsa.gov/odi/inv/2016/INCLA-PE16007-7876.PDF

SLIDE 46

Uber Self-Driving Car (2018)

  • Killed a pedestrian crossing a road in Arizona

https://www.nytimes.com/2018/03/19/technology/uber-driverless-fatality.html

SLIDE 47

NTSB Report

  • The system first registered radar and LIDAR observations of the pedestrian about 6 seconds before impact
  • Software classified the pedestrian as an unknown object, as a vehicle, and then as a bicycle with varying expectations of future travel path.
  • At 1.3 seconds before impact, the system determined that an emergency braking maneuver was needed
  • Emergency braking maneuvers are not enabled while the vehicle is under computer control, to reduce the potential for erratic vehicle behavior

https://www.ntsb.gov/investigations/AccidentReports/Reports/HWY18MH010-prelim.pdf

Failures in CPS have consequences

SLIDE 48

Security

  • Interconnected CPS are open to attacks
  • Examples

– Stuxnet: Iranian nuclear power plant hacking
– Vermont power grid hack by Russia
– Remote hack into cars (Jeep)
– Police drone hacking
– Sensor hacking: GPS spoofing, IMU spoofing

SLIDE 49

Challenges

  • Time Predictability
  • Complexity
  • Reliability
  • Security

SLIDE 50

Time Predictability

  • At the low level, hardware has deterministic timing
  • At higher levels, not so much: abstractions ignore timing

– Pipeline, caches, out-of-order execution, speculation, ISA
– Process, thread, lock, interrupt

  • Focus on average case, not worst case. No guarantees

– Fine in cyber world
– Real world doesn’t work that way

SLIDE 51

Timing Predictability

  • Q. Can you tell exactly how long a piece of code will take to execute on a computer?

– Used to be (relatively) easy to do so.

  • Measure timing. Use the timing for analysis.

– Very difficult to answer on today’s computers

  • Pipeline, cache, out-of-order and speculative execution, multicore, shared cache/DRAM → very high variance.
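
The "measure timing" approach can be tried directly. The sketch below times the same fixed workload many times and reports the spread; `measure`, `workload`, and the run count are names and values assumed here for illustration. On a typical desktop the maximum is noticeably larger than the minimum even though the code and input never change.

```python
import statistics
import time


def measure(func, runs=200):
    """Time func() repeatedly and summarize the samples in nanoseconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter_ns()
        func()
        samples.append(time.perf_counter_ns() - start)
    return {
        "min": min(samples),
        "mean": statistics.mean(samples),
        "max": max(samples),
    }


def workload():
    # Fixed amount of work: the same code and input on every run.
    return sum(i * i for i in range(10_000))


stats = measure(workload)
print(f"min={stats['min']} ns  mean={stats['mean']:.0f} ns  max={stats['max']} ns")
print(f"max/min ratio: {stats['max'] / max(stats['min'], 1):.1f}")
```

Note that the largest observed sample is only a lower bound on the true worst case: a longer run, or a different set of co-running programs, may produce a larger value.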

SLIDE 52

Denial-of-Service Attack

  • Delay execution time of time-sensitive code

– E.g., real-time control software of a car
– Observed >21X execution time increase on Odroid XU4 (*)

  • Even after cache partitioning is applied

– Observed >10X increase on RPi 3 (**)

  • Of a realistic DNN-based real-time control program

[Figure: four cores (Core1-Core4) sharing the LLC, running a benchmark and co-runner(s)]

(*) Prathap Kumar Valsan, Heechul Yun, Farzad Farshchi. “Taming Non-blocking Caches to Improve Isolation in Multicore Real-Time Systems.” In RTAS, IEEE, 2016. Best Paper Award
(**) Michael Garrett Bechtel, Elise McEllhiney, Minje Kim, Heechul Yun. “DeepPicar: A Low-cost Deep Neural Network-based Autonomous Car.” In RTCSA, IEEE, 2018

SLIDE 53

Denial-of-Service Attack

> 300X slowdown !!!

[C] Michael Garrett Bechtel and Heechul Yun. Denial-of-Service Attacks on Shared Cache in Multicore: Analysis and Prevention. IEEE Intl. Conference on Real-Time and Embedded Technology and Applications Symposium (RTAS), IEEE, 2019. (to appear)

SLIDE 54

Complexity

  • Software complexity increases

[Figure: Lines of Code in Typical GM Car; KLOC (log scale) vs. model year, 1970-2010]

[Figure: Growth in Software Size; K SLOC by flight vehicle: Apollo 1968, Space Shuttle, Orion (est.)]

Figures are from NASA JPL. “Flight Software Complexity,” 2008

SLIDE 55

Linux Kernel Code Size

  • Linux: > 15M SLOC, multithreaded

→ Software bugs are hard to weed out

https://www.quora.com/How-many-lines-of-code-are-in-the-Linux-kernel

SLIDE 56

Reliability

  • Transient hardware faults (soft errors)

– Single event upset (SEU) in SRAM, logic

  • Due to alpha particle, cosmic radiation

– Manifested as software failures

  • Crashes, wrong output: silent data corruption

– Bigger problem in advanced CPUs

  • Increased density, freq → higher soft error rate
  • Hardware bugs

– Pentium floating point bug (FDIV bug) – Intel CPU bugs in 2015: http://danluu.com/cpu-bugs/

  • “Certain Combinations of AVX Instructions May Cause Unpredictable System Behavior”
  • “Processor May Experience a Spurious LLC-Related Machine Check During Periods of High Activity”
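
To show how a single event upset can surface as silent data corruption, the sketch below flips one bit in the 64-bit representation of a floating point value. `flip_bit` is a helper written for this illustration; real soft errors strike hardware state at random, whereas here the bit position is chosen by hand.

```python
import math
import struct


def flip_bit(value, bit):
    """Return value with one bit (0 = least significant) of its IEEE 754 double encoding flipped."""
    (raw,) = struct.unpack("<Q", struct.pack("<d", value))
    raw ^= 1 << bit
    (flipped,) = struct.unpack("<d", struct.pack("<Q", raw))
    return flipped


original = 1.0
print(flip_bit(original, 0))    # lowest mantissa bit: a tiny, easily missed error
print(flip_bit(original, 63))   # sign bit: the value becomes -1.0
print(flip_bit(original, 62))   # top exponent bit: the value becomes inf
```

None of these flips raises an error by itself; the program simply continues with a wrong number, which is why such faults are called silent.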

http://www.cotsjournalonline.com/articles/view/102279

SLIDE 57

Security

https://meltdownattack.com/

SLIDE 58

Micro-Architectural Side-Channels

  • Many micro-architectural components contain hidden state that leaks secrets

– often via observable timing variations

  • Known to exist in cache, DRAM bank, OoO speculation, branch predictor, etc.
  • Logically correct, proven software is also vulnerable

SLIDE 59

Example: Spectre Attack

  • Wrong branch is speculatively taken.
  • x is maliciously chosen by the attacker.
  • The attacker probes array2 to recover the secret: array1[x]
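
The three steps above can be walked through with a toy model. This is not an exploit and involves no real speculation: Python cannot mis-speculate, so a `mispredicted` flag stands in for the branch predictor, and a set stands in for the cache. The names `ARRAY1`, `SECRET`, `victim`, and `recover_byte`, and the secret string itself, are assumptions made for this sketch.

```python
# Toy model only: the `mispredicted` flag stands in for a branch predictor
# that guessed "in bounds", and a set stands in for the CPU cache.
ARRAY1 = [10, 20, 30, 40]            # data the victim may legitimately read
SECRET = [ord(c) for c in "KU"]      # hypothetical secret stored right after array1
MEMORY = ARRAY1 + SECRET             # flat toy address space
ARRAY1_SIZE = len(ARRAY1)

cached_array2_lines = set()          # which array2 lines are currently cached


def victim(x, mispredicted):
    """Models: if (x < array1_size) y = array2[array1[x] * stride];"""
    if x < ARRAY1_SIZE or mispredicted:
        value = MEMORY[x]                # array1[x], possibly out of bounds
        cached_array2_lines.add(value)   # the array2 load leaves a cache footprint


def recover_byte(x):
    cached_array2_lines.clear()          # attacker evicts array2 from the cache
    victim(x, mispredicted=True)         # wrong branch is speculatively taken
    (leaked,) = cached_array2_lines      # probing array2 finds the one cached line
    return leaked


recovered = "".join(chr(recover_byte(ARRAY1_SIZE + i)) for i in range(len(SECRET)))
print(recovered)
```

On real hardware the transient results are discarded architecturally, but the cache footprint remains, and the attacker finds the cached line by timing accesses to array2 rather than by inspecting a set.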

SLIDE 60

(Cache) Timing Channel Attack

  • By measuring access timing differences of a memory location, an attacker can determine whether the memory is cached or not.

  • This can be used to leak secret information
  • Methods: Flush + Reload, Prime + Probe, etc.
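
A Flush + Reload round can be sketched with a simulated cache. The latencies, the threshold, and the `ToyCache` class are hypothetical values and names chosen for illustration; a real attack uses a cache-flush instruction and a cycle counter on memory shared with the victim.

```python
HIT_NS, MISS_NS = 40, 200      # hypothetical access latencies
THRESHOLD_NS = 100             # below this, the attacker assumes a cache hit


class ToyCache:
    """Simulated cache that reports a latency for each access."""

    def __init__(self):
        self.lines = set()

    def flush(self, addr):
        self.lines.discard(addr)

    def access(self, addr):
        latency = HIT_NS if addr in self.lines else MISS_NS
        self.lines.add(addr)   # any access brings the line into the cache
        return latency


def flush_reload(cache, shared_addr, victim):
    """Return True if the victim touched shared_addr between flush and reload."""
    cache.flush(shared_addr)                          # 1. Flush the shared line
    victim(cache)                                     # 2. Let the victim run
    return cache.access(shared_addr) < THRESHOLD_NS   # 3. Reload and time it


SHARED = 0x1000                # hypothetical address shared with the victim
print(flush_reload(ToyCache(), SHARED, lambda c: c.access(SHARED)))  # victim used the line
print(flush_reload(ToyCache(), SHARED, lambda c: None))              # victim did not
```

If the victim's decision to touch the line depends on a secret bit, each round leaks that bit to the attacker.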

Image source: M. Lipp et al., “Meltdown,” arXiv Prepr., 2018.

SLIDE 61

CPS: Related Areas

  • CPS requires an interdisciplinary approach

– EECS

  • Computer architecture
  • Real-time systems
  • Formal methods
  • Software engineering
  • Control

– Aerospace, and other engineering

  • Physical systems (plant/actuator) modeling/control

SLIDE 62

Topics

  • Introduction to Real-Time Systems, CPS
  • CPS Applications
  • Real-time multicore architecture
  • Real-time OS and middleware
  • Fault tolerance, safety, security

Image: Amazon Prime Air

SLIDE 63

Topics

  • Introduction to Real-Time Systems, CPS

– Background on Real-time scheduling theory, timing analysis, server, priority inversion

  • CPS Applications
  • Real-time architecture
  • Real-time OS and middleware
  • Fault tolerance, safety, security

SLIDE 64

Topics

  • Introduction to Real-Time Systems, CPS
  • CPS Applications

– More detailed look at individual CPS applications
– Intelligent vehicle development techniques

  • Real-time architecture
  • Real-time OS and middleware
  • Fault tolerance, safety, security

SLIDE 65

Topics

  • Introduction to Real-Time Systems, CPS
  • CPS Applications
  • Real-time architecture

– Real-time cache, DRAM controller designs
– Predictable microarchitecture designs
– Real-time support for GPU/FPGA

  • Real-time OS and middleware
  • Fault tolerance, safety, security

SLIDE 66

Topics

  • Introduction to Real-Time Systems, CPS
  • CPS Applications
  • Real-time architecture
  • Real-time OS and middleware

– RTOS, ARINC 653, AUTOSAR, ROS, DDS

  • Fault tolerance, safety, security

SLIDE 67

Topics

  • Introduction to Real-Time Systems, CPS
  • CPS Applications
  • Real-time architecture
  • Real-time OS and middleware
  • Fault tolerance, safety, security

– CPS-specific security issues, case studies
– Simplex architecture
– CPS modeling and verification

SLIDE 68

APPENDIX

SLIDE 69

SLIDE 70

Links

  • Linux kernel related

– PALLOC
– MemGuard

  • Self-driving car related

– DeepTraffic https://selfdrivingcars.mit.edu/deeptraffic/
– DeepTesla https://selfdrivingcars.mit.edu/deeptesla/

SLIDE 71

Embedded Systems

  • More embedded systems than PC/servers

– 10 billion ARM-based chips shipped in 2013

http://jbpress.ismedia.jp/articles/-/36814