slide-1
SLIDE 1

IoT Operating Systems

Chenyang Lu

slide-2
SLIDE 2

Critiques

  • 1/2 page critiques of research papers
  • Due at 10am on the class day (hard deadline)
  • Email Corey <dairuixuan@wustl.edu> in plain text
  • Back-of-the-envelope notes - NOT whole essays
  • Guidelines: http://www.cs.wustl.edu/%7Elu/cse521s/critique.html

slide-3
SLIDE 3

Critique #1 (IoT OS)

  • Due on 9/24
  • D. Gay, P. Levis, R. von Behren, M. Welsh, E. Brewer, and D. Culler, The nesC Language: A Holistic Approach to Networked Embedded Systems, ACM Conference on Programming Language Design and Implementation (PLDI), June 2003.

slide-4
SLIDE 4

Critique #2 (Wireless)

  • Due on 10/15
  • M. Buettner, G.V. Yee, E. Anderson, and R. Han, X-MAC: A Short Preamble MAC Protocol for Duty-Cycled Wireless Sensor Networks, ACM Conference on Embedded Networked Sensor Systems (SenSys), November 2006.

slide-5
SLIDE 5

Proposal Presentation

  • In class on 10/1
  • 7 min per group
      • 6-min talk + 1-min Q&A
      • 4 slides
      • Rehearse over Zoom
      • Turn on your video during your presentation
  • Your elevator pitch!
  • Email Corey your slides before class

slide-6
SLIDE 6

Written Proposal

  • One proposal per team, one page
      • Team members
      • Concise description of project
      • Responsibilities of each member
      • Equipment needed
  • Written proposal due: 10/1, 11:59pm
      • Email to Corey

slide-7
SLIDE 7

Demo I

  • In class on 10/27 and 10/29.
  • 15 min per team.
  • Must show something real.
  • Send Corey a video before class as backup.

slide-8
SLIDE 8

Demo II

  • In class on 11/17 and 11/19.
  • 15 min per team.
  • Substantial progress → final demo.
  • Send Corey a video before class as backup.

slide-9
SLIDE 9

IoT OS

  • Linux
  • Windows 10 IoT Core: Windows 10 optimized for ARM and x86/x64.
  • Amazon FreeRTOS: open-source FreeRTOS kernel + libraries to securely connect devices to AWS cloud services.
  • Arm Mbed: open-source OS for Arm Cortex-M microcontrollers.
  • Contiki: open-source, multi-threaded OS.

slide-10
SLIDE 10

Amazon FreeRTOS

https://aws.amazon.com/freertos

slide-11
SLIDE 11

Diverse Platforms

TelosB
  • TI MSP430 microcontroller, 4/8 MHz, 16 bit
  • Memory: 10 KB data, 48 KB program
  • IEEE 802.15.4 radio: max 250 Kbps

Raspberry Pi 4
  • Quad-core Cortex-A72 64-bit SoC, 1.5 GHz
  • 2 GB to 8 GB SDRAM
  • IEEE 802.11ac wireless, Bluetooth 5.0, BLE
  • Gigabit Ethernet

slide-12
SLIDE 12

Epic Core

  • Microcontroller: TI MSP430, 3 V, clock 4/8 MHz
  • Memory: 10 KB RAM, 48 KB flash, plus 16 MB flash memory
  • Radio: CC2420, 802.15.4, 6LoWPAN/IPv6
  • Power: typical sleep current 9 μA at 3 V, radio active ~20 mA
  • Size: 2.5 x 2.5 cm
  • Unique hardware ID
  • I/O (some shared): 8 ADC (12 bit), 2 DAC (12 bit), 1 I2C, 1 JTAG, 1 1-Wire, 2 SPI, 2 UART; 8 general, 8 interrupt, and 5 special pin connectors

slide-13
SLIDE 13

TelosB

  • Six major I/O devices
  • Possible concurrency
      • I2C, SPI, ADC
  • Energy management
      • Turn peripherals on only when needed
      • Turn them off otherwise

slide-14
SLIDE 14

Hardware Constraints

Severe constraints on power, size, and cost →

  • slow microprocessor
  • low-bandwidth radio
  • limited memory
  • limited hardware parallelism → CPU hit by many interrupts!
  • manage sleep modes in hardware components

slide-15
SLIDE 15

Software Challenges

  • Small memory footprint
  • Efficiency - power and processing
  • Concurrency-intensive operations
  • Diversity in applications & platforms → efficient modularity
      • Support reconfigurable hardware and software

slide-16
SLIDE 16

OS: Basic Functions

  • OS controls resources:
      • who gets the CPU;
      • when I/O takes place;
      • how much memory is allocated;
      • power management.
  • Application programs run on top of OS services
  • Challenge: manage multiple, concurrent tasks.

slide-17
SLIDE 17

Example: Engine Control

Concurrent tasks
  • spark control
  • crankshaft sensing
  • fuel/air mixture
  • oxygen sensor

[Figure: engine controller]

slide-18
SLIDE 18

Process

  • A process is a unique execution of a program.
      • Several copies of a program may run simultaneously.
  • A process has its own context.
      • Data in registers, Program Counter (PC), status.
      • Stored in the Process Control Block (PCB).
  • Thread: lightweight process
      • Threads within the same process share its memory space.
  • The OS manages processes and threads.

slide-19
SLIDE 19

Traditional OS

  • Multi-threaded
  • Preemptive scheduling
  • Threads:
      • ready to run;
      • executing on the CPU;
      • waiting for data.

[Figure: thread state diagram with states executing, ready, and waiting; transition labels: gets CPU, preempted, needs data, gets data]
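
The three thread states can be sketched as a small state machine in C. This is only an illustration: the enum and function names are hypothetical, and the mapping of transition labels to arrows (ready to executing on "gets CPU", executing to ready on "preempted", executing to waiting on "needs data", waiting to ready on "gets data") is the conventional reading of such a diagram, assumed here rather than taken from the slide.

```c
/* Thread states named on the slide. */
typedef enum { READY, EXECUTING, WAITING } thread_state_t;

/* Transition labels from the diagram (identifier names are illustrative). */
typedef enum { GETS_CPU, PREEMPTED, NEEDS_DATA, GETS_DATA } thread_event_t;

/* Return the next state; an event that does not apply leaves the state unchanged.
 * The arrow assignment below is an assumption (the conventional one). */
thread_state_t next_state(thread_state_t s, thread_event_t e) {
    switch (s) {
    case READY:
        if (e == GETS_CPU) return EXECUTING;
        break;
    case EXECUTING:
        if (e == PREEMPTED) return READY;
        if (e == NEEDS_DATA) return WAITING;
        break;
    case WAITING:
        if (e == GETS_DATA) return READY;
        break;
    }
    return s;
}
```

For example, `next_state(EXECUTING, NEEDS_DATA)` yields `WAITING`, and a blocked thread only becomes runnable again through `GETS_DATA`.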

slide-20
SLIDE 20

Preemptive Priority Scheduling

  • Each process has a fixed priority (1 highest);
  • P1: priority 1; P2: priority 2; P3: priority 3.

[Figure: timeline (time axis 10 to 60) marking when P2, P1, and P3 are released and the execution segments of P1, P2 (two segments), and P3]
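
The selection rule behind preemptive priority scheduling can be sketched in a few lines of C. This is a minimal sketch, not a full scheduler: `process_t` and `pick_next` are hypothetical names, and the code assumes the scheduler is re-run whenever a process is released or finishes, so a newly released higher-priority process takes the CPU immediately.

```c
#include <stddef.h>

typedef struct {
    int priority;  /* 1 is the highest priority, as on the slide */
    int ready;     /* nonzero once released and not yet finished */
} process_t;

/* Return the index of the ready process with the highest priority
 * (smallest number), or -1 if no process is ready. */
int pick_next(const process_t *procs, size_t n) {
    int best = -1;
    for (size_t i = 0; i < n; i++) {
        if (!procs[i].ready)
            continue;
        if (best < 0 || procs[i].priority < procs[best].priority)
            best = (int)i;
    }
    return best;
}
```

With only P2 ready, `pick_next` selects P2. Once P1 is released, the next call selects P1, which is the preemption shown on the timeline.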

slide-21
SLIDE 21

Context Switch

[Figure: CPU with its registers and PC, and memory holding the saved contexts of process 1, process 2, ...]

slide-22
SLIDE 22

Limitations of Traditional OS

  • Multi-threaded + preemptive scheduling
      • Preempted threads waste memory
      • Context switch overhead
  • I/O
      • Blocking I/O: wastes memory on blocked threads
      • Polling (busy-wait): wastes CPU cycles and power

slide-23
SLIDE 23

Existing Embedded OS

  • QNX context switch = 2400 cycles on x86
  • pOSEK context switch > 40 µs
  • Creem → no preemption

Name            Code Size    Target CPU
pOSEK           2K           Microcontrollers
pSOSystem       -            PII -> ARM Thumb
VxWorks         286K         Pentium -> Strong ARM
QNX Neutrino    >100K        Pentium II -> NEC
QNX RealTime    100K         Pentium II -> SH4
OS-9            -            Pentium -> SH4
Chorus OS       10K          Pentium -> Strong ARM
ARIEL           19K          SH2, ARM Thumb
Creem           560 bytes    ATMEL 8051

Source: J. Hill, R. Szewczyk, A. Woo, S. Hollar, D. Culler, and K. Pister, System Architecture Directions for Network Sensors, ASPLOS 2000.

slide-24
SLIDE 24

TinyOS Solutions

  • Efficient modularity
      • Application = scheduler + graph of components
      • Compiled into one executable
      • Only needed components are compiled/loaded
  • Concurrency: event-driven architecture

[Figure: component graph. Application (user components) above Communication, Actuating, and Sensing components, with Main (includes scheduler) and Hardware Abstractions]

Modified from D. Culler et al., TinyOS boot camp presentation, Feb 2001

slide-25
SLIDE 25

Example: Surge

[Figure: Surge component graph. Components: SurgeC, BootC, SurgeP, PhotoC, TimerC, MultihopC, LedsC. Interfaces: StdControl, ADC, Timer, Leds, SendMsg]

slide-26
SLIDE 26

Example: Mica2 Mote

  • Microcontroller: 7.4 MHz, 8 bit
  • Memory: 4 KB data, 128 KB program
  • Radio: max 38.4 Kbps
  • Sensors: light, temperature, acceleration, acoustic, magnetic…
  • Power
      • <1 week on two AA batteries in active mode
      • >1 year of battery life in sleep modes!

slide-27
SLIDE 27

Example: Application

[Figure: layered application stack. Hardware (HW): RFM, i2c, clocks, ADC, temperature and photo sensors. Software (SW): Radio Byte (MAC), Radio Packet, Messaging Layer, Routing Layer, and the sensing application. Data granularity rises from bit to byte to packet.]

  • D. Culler et al., TinyOS boot camp presentation, Feb 2001
slide-28
SLIDE 28

Two-level Scheduling

  • Events handle interrupts
      • Interrupts trigger lowest-level events
      • Events can signal events, call commands, or post tasks
  • Tasks perform deferred computations
  • Interrupts preempt tasks and interrupts

[Figure: two-level scheduling over time. Hardware interrupts trigger events; events call commands and POST tasks to a FIFO queue; events preempt tasks]

slide-29
SLIDE 29

Multiple Data Flows

  • Respond quickly: sequence of events/commands through the component graph.
      • Immediate execution of function calls
      • e.g., get a bit out of the radio before it gets lost.
  • Post tasks for deferred computations.
      • e.g., encoding.
  • Events preempt tasks to handle new interrupts.

slide-30
SLIDE 30

Sending a Message

Timing diagram of event propagation (step 0-6 takes about 95 microseconds total)

slide-31
SLIDE 31

Scheduling

  • Interrupts preempt tasks
      • Respond quickly
      • Events/commands are implemented as function calls
  • Tasks cannot preempt tasks
      • Fewer context switches → efficiency
      • Single stack → low memory footprint
      • TinyOS 2 supports a pluggable task scheduler (default: FIFO).
  • Scheduler puts the processor to sleep when
      • no event/command is running
      • the task queue is empty
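
The run-to-completion FIFO task model can be sketched in plain C. This is a minimal sketch under stated assumptions, not TinyOS code: `post_task`, `run_next_task`, the queue size, and the two sample tasks are hypothetical. A real implementation would also make `post_task` atomic, because interrupt handlers may post tasks, and nesC wires tasks statically instead of using function pointers.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_TASKS 8

typedef void (*task_fn)(void);

static task_fn task_queue[MAX_TASKS];  /* circular FIFO of posted tasks */
static size_t queue_head = 0;
static size_t queue_count = 0;

/* Post a task for deferred execution; fails if the queue is full. */
bool post_task(task_fn t) {
    if (queue_count == MAX_TASKS)
        return false;
    task_queue[(queue_head + queue_count) % MAX_TASKS] = t;
    queue_count++;
    return true;
}

/* Run the oldest task to completion. Returns false when the queue is
 * empty, which is the point where a scheduler would sleep the CPU. */
bool run_next_task(void) {
    if (queue_count == 0)
        return false;
    task_fn t = task_queue[queue_head];
    queue_head = (queue_head + 1) % MAX_TASKS;
    queue_count--;
    t();  /* tasks never preempt each other, so one stack is enough */
    return true;
}

/* Two sample tasks that record the order in which they ran. */
static int trace[MAX_TASKS];
static size_t trace_len = 0;
static void task_a(void) { if (trace_len < MAX_TASKS) trace[trace_len++] = 1; }
static void task_b(void) { if (trace_len < MAX_TASKS) trace[trace_len++] = 2; }
```

A main loop would call `run_next_task()` repeatedly and enter a sleep mode whenever it returns `false`; an interrupt then wakes the processor and may post new tasks.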

slide-32
SLIDE 32

Space Breakdown…

[Figure: bar chart "Code size for ad hoc networking application", in bytes (axis 500 to 3500), broken down by component: Interrupts, Message Dispatch, Initialization, C-Runtime, Light Sensor, Clock, Scheduler, Led Control, Messaging Layer, Packet Layer, Radio Interface, Routing Application, Radio Byte Encoder]

Scheduler: 144 bytes code
Totals: 3430 bytes code, 226 bytes data

  • D. Culler et al., TinyOS boot camp presentation, Feb 2001
slide-33
SLIDE 33

Power Breakdown…

  • Lithium battery runs for 35 hours at peak load and years at minimum load!
      • That's three orders of magnitude difference!
  • A one-byte transmission uses the same energy as approximately 11000 cycles of computation.

Component      Active       Idle          Sleep
CPU            5 mA         2 mA          5 μA
Radio          7 mA (TX)    4.5 mA (RX)   5 μA
EE-Prom        3 mA         -             -
LEDs           4 mA         -             -
Photo diode    200 μA       -             -
Temperature    200 μA       -             -

Battery: Panasonic CR2354, 560 mAh
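
The lifetime figures follow from dividing battery capacity by average current draw. The sketch below assumes a peak load of 16 mA (CPU active 5 mA + radio TX 7 mA + LEDs 4 mA) and a minimum load of 10 μA (CPU and radio both asleep). Which components count toward each load is an assumption, chosen because 560 mAh / 16 mA reproduces the 35-hour figure; the function names are hypothetical.

```c
/* Battery lifetime in hours: capacity (mAh) divided by average current (mA). */
double lifetime_hours(double capacity_mah, double current_ma) {
    return capacity_mah / current_ma;
}

/* Assumed peak load: CPU active (5 mA) + radio TX (7 mA) + LEDs (4 mA). */
double peak_load_ma(void) { return 5.0 + 7.0 + 4.0; }

/* Assumed minimum load: CPU sleep (5 uA) + radio sleep (5 uA), in mA. */
double min_load_ma(void) { return 0.005 + 0.005; }
```

Under these assumptions the 560 mAh cell lasts 560 / 16 = 35 hours at peak load and 560 / 0.010 = 56,000 hours (roughly 6.4 years) at minimum load, a ratio of 1600, which is about three orders of magnitude.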

slide-34
SLIDE 34

Time Breakdown…

  • 50 cycle task overhead (6 byte copies)
  • 10 cycle event overhead (1.25 byte copies)

Components             Packet reception    CPU            Energy
                       work breakdown      utilization    (nJ/bit)
AM                     0.05%               0.20%          0.33
Packet                 1.12%               0.51%          7.58
Radio handler          26.87%              12.16%         182.38
Radio decode thread    5.48%               2.48%          37.2
RFM                    66.48%              30.08%         451.17
Radio Reception        -                   -              1350
Idle                   -                   54.75%         -
Total                  100.00%             100.00%        2028.66

slide-35
SLIDE 35

Advantages

  • Small memory footprint
      • Only needed components are compiled/loaded
      • Single stack for tasks
  • Power efficiency
      • Put the CPU to sleep whenever the task queue is empty
      • TinyOS 2 (ICEM) provides power management for peripherals.
  • Efficient modularity
      • Event/command interfaces between components
      • Events/commands implemented as function calls
  • Concurrency-intensive operations
      • Event/command + tasks

slide-36
SLIDE 36

Critiques

  • No protection barrier between kernel and applications
  • No preemptive scheduling → a real-time task may wait for non-urgent ones
  • Static linking → cannot change parts of the code dynamically
  • Virtual memory?

slide-37
SLIDE 37

nesC

  • Programming language for TinyOS and applications
  • Supports TinyOS components
  • Whole-program analysis at compile time
      • Improve robustness: detect race conditions
      • Optimization: function inlining
  • Static language
      • No function pointers
      • No malloc
      • Call graph and variable accesses are known at compile time

slide-38
SLIDE 38

Application

  • Interfaces
      • provides interface
      • uses interface
  • Implementation
      • module: C behavior
      • configuration: select & wire

[Figure: Surge component graph. Components: SurgeC, BootC, SurgeP, PhotoC, TimerC, MultihopC, LedsC. Interfaces: StdControl, ADC, Timer, Leds, SendMsg]

slide-39
SLIDE 39

Module

  • Interfaces
      • provides interface
      • uses interface
  • Implementation
      • module: C behavior
      • configuration: select & wire

module TimerP {
  provides {
    interface StdControl;
    interface Timer;
  }
  uses interface Clock;
  ...
}

[Figure: TimerP module providing StdControl and Timer and using Clock]

slide-40
SLIDE 40

Interface

interface Clock {
  command error_t setRate(char interval, char scale);
  event error_t fire();
}

interface Send {
  command error_t send(message_t *msg, uint16_t length);
  event error_t sendDone(message_t *msg, error_t success);
}

interface ADC {
  command error_t getData();
  event error_t dataReady(uint16_t data);
}

Bidirectional interface supports split-phase operation
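
Split-phase operation can be sketched in plain C: the command starts the operation and returns immediately, and the result arrives later through an event. This is an illustrative sketch, not TinyOS code. The function names, the `store_reading` handler, and the use of a function pointer are assumptions; nesC binds the event handler statically through wiring instead.

```c
#include <stdbool.h>
#include <stdint.h>

typedef void (*data_ready_fn)(uint16_t data);

static data_ready_fn adc_client;   /* who to notify when the sample arrives */
static bool adc_pending = false;   /* true while a conversion is in flight */

/* Phase 1 (command): start a conversion and return without waiting.
 * Returns false if a conversion is already in progress. */
bool adc_get_data(data_ready_fn on_ready) {
    if (adc_pending)
        return false;
    adc_client = on_ready;
    adc_pending = true;
    return true;
}

/* Phase 2 (event): called from the simulated ADC interrupt when the
 * sample is available; it signals the client that requested the data. */
void adc_interrupt(uint16_t sample) {
    if (!adc_pending)
        return;
    adc_pending = false;
    adc_client(sample);
}

/* Example client: remembers the most recent reading. */
static uint16_t last_reading;
static void store_reading(uint16_t data) { last_reading = data; }
```

Between the two phases the caller is free to do other work or sleep, which is why no thread has to block on the ADC.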

slide-41
SLIDE 41

Module

module SurgeP {
  provides interface StdControl;
  uses interface ADC;
  uses interface Timer;
  uses interface Send;
}
implementation {
  bool busy;
  norace uint16_t sensorReading;

  async event result_t Timer.fired() {
    bool localBusy;
    atomic {
      localBusy = busy;
      busy = TRUE;
    }
    if (!localBusy)
      call ADC.getData();
    return SUCCESS;
  }

  async event result_t ADC.dataReady(uint16_t data) {
    sensorReading = data;
    post sendData();
    return SUCCESS;
  }
  ...
}

slide-42
SLIDE 42

Configuration

[Figure: TimerC configuration. TimerC exports the StdControl and Timer interfaces of TimerP, and TimerP's Clock interface is wired to HWClock]

configuration TimerC {
  provides {
    interface StdControl;
    interface Timer;
  }
}
implementation {
  components TimerP, HWClock;
  StdControl = TimerP.StdControl;
  Timer = TimerP.Timer;
  TimerP.Clock -> HWClock.Clock;
}

slide-43
SLIDE 43

Example: Surge

[Figure: Surge component graph. Components: SurgeC, BootC, SurgeP, PhotoC, TimerC, MultihopC, LedsC. Interfaces: StdControl, ADC, Timer, Leds, SendMsg]

slide-44
SLIDE 44

Concurrency

  • Race condition: concurrent interrupts/tasks update shared variables.
  • Only interrupts cause preemption → concurrency
      • Asynchronous code (AC): reachable from at least one interrupt.
      • Synchronous code (SC): reachable from tasks only.
  • Any update of a shared variable from AC is a potential race condition!

slide-45
SLIDE 45

A Race Condition

module SurgeP { ... }
implementation {
  bool busy;
  norace uint16_t sensorReading;

  async event result_t Timer.fired() {
    // Race: another interrupt can run between the test and the set of busy.
    if (!busy) {
      busy = TRUE;
      call ADC.getData();
    }
    return SUCCESS;
  }

  task void sendData() {
    // send sensorReading
    adcPacket.data = sensorReading;
    call Send.send(&adcPacket, sizeof adcPacket.data);
  }

  async event result_t ADC.dataReady(uint16_t data) {
    sensorReading = data;
    post sendData();
    return SUCCESS;
  }
}

slide-46
SLIDE 46

Atomic Sections

atomic { <Statement list> }

  • Interrupts are disabled while atomic code executes
  • But interrupts cannot stay disabled for long!
      • No loops
      • No commands/events
      • Function calls are OK, but the callee must meet the restrictions too

slide-47
SLIDE 47

Prevent Race

module SurgeP { ... }
implementation {
  bool busy;
  norace uint16_t sensorReading;

  async event result_t Timer.fired() {
    bool localBusy;
    atomic {               // disable interrupts
      localBusy = busy;    // test-and-set
      busy = TRUE;
    }                      // enable interrupts
    if (!localBusy)
      call ADC.getData();
    return SUCCESS;
  }
  ...
}
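
The same test-and-set pattern can be sketched in plain C. This is a simulation under stated assumptions: `disable_interrupts` and `enable_interrupts` are hypothetical stand-ins that only track a counter, where real code would use the platform's interrupt-control instructions, and `try_acquire` and `release_flag` are illustrative names.

```c
#include <stdbool.h>

static bool busy = false;
static int interrupts_disabled = 0;  /* stand-in for the hardware interrupt flag */

/* Hypothetical stand-ins for the platform's interrupt control. */
static void disable_interrupts(void) { interrupts_disabled++; }
static void enable_interrupts(void)  { interrupts_disabled--; }

/* Atomically read the old value of busy and set it to true.
 * Returns true if the caller acquired the flag (it was not busy). */
bool try_acquire(void) {
    bool was_busy;
    disable_interrupts();   /* start of the atomic section */
    was_busy = busy;        /* test */
    busy = true;            /* set */
    enable_interrupts();    /* end of the atomic section */
    return !was_busy;
}

/* Clear the flag once the operation it guards has completed. */
void release_flag(void) {
    disable_interrupts();
    busy = false;
    enable_interrupts();
}
```

Only the read and the write of `busy` sit inside the atomic section; the long-running work (here, the ADC request) happens after interrupts are re-enabled, which keeps the section short.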

slide-48
SLIDE 48

nesC Compiler

  • Race-free invariant: any update of a shared variable
      • is from SC only, or
      • occurs within an atomic section.
  • Compiler returns an error if the invariant is violated.
  • Fix
      • Make access to shared variables atomic.
      • Move access to shared variables to tasks.

slide-49
SLIDE 49

Results

  • Tested on full TinyOS code, plus applications
      • 186 components (121 modules, 65 configurations)
      • 20-69 modules/app, 35 average
      • 17 tasks, 75 events on average (per application) - lots of concurrency!
  • Found 156 races: 103 real
      • About 6 per 1000 lines of code!
  • Fixed races:
      • Add atomic sections
      • Post tasks (move code to task context)

slide-50
SLIDE 50

Function Inlining

int foo(int a, int b, int c) { return a + b - c; }
z = foo(w, x, y);

⇒

z = w + x - y;

  • Improves performance by eliminating function call overhead.
  • May increase code size, but not always…
  • Affects instruction cache behavior.
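
A compilable version of the transformation, as a minimal sketch: `compute_call` and `compute_inlined` are hypothetical wrappers added only so both forms can be compared. Whether a compiler actually inlines `foo` depends on its optimization settings.

```c
/* Original function, as on the slide but with explicit parameter types. */
int foo(int a, int b, int c) { return a + b - c; }

/* Before inlining: the caller pays for a call and a return. */
int compute_call(int w, int x, int y) {
    return foo(w, x, y);
}

/* After inlining: the body of foo is substituted at the call site. */
int compute_inlined(int w, int x, int y) {
    return w + x - y;
}
```

Both functions return the same value for any input; only the call overhead differs. Code size can shrink when the inlined body is smaller than the call sequence it replaces.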

slide-51
SLIDE 51

Optimization: Inlining

  • Inlining improves performance and reduces code size.
  • Why?

App       Code size    Code size       Code         Data size    CPU
          (inlined)    (noninlined)    reduction                 reduction
Surge     14794        16984           12%          1188         15%
Maté      25040        27458           9%           1710         34%
TinyDB    64910        71724           10%          2894         30%

slide-52
SLIDE 52

Overhead for Function Calls

  • Caller: call a function
      • Push return address to stack
      • Push parameters to stack
      • Jump to function
  • Callee: receive a call
      • Pop parameters from stack
  • Callee: return
      • Pop return address from stack
      • Push return value to stack
      • Jump back to caller
  • Caller: return
      • Pop return value

Overhead instructions for function calls!

slide-53
SLIDE 53

Principles Revisited

  • Support TinyOS components
      • Interfaces, modules, configurations
  • Whole-program analysis and optimization
      • Improve robustness: detect race conditions
      • Optimization: function inlining
      • More: memory footprint.
  • Static language
      • No malloc, no function pointers

slide-54
SLIDE 54

Critiques

  • No dynamic memory allocation
      • Bounds memory footprint
      • Allows offline footprint analysis
      • How to size a buffer when data size varies dynamically?
  • Restriction: no "long-running" code in
      • command/event handlers
      • atomic sections

slide-55
SLIDE 55

Reading

  • D. Gay, P. Levis, R. von Behren, M. Welsh, E. Brewer, and D. Culler, The nesC Language: A Holistic Approach to Networked Embedded Systems. [Critique #1]
  • J. Hill, R. Szewczyk, A. Woo, S. Hollar, D. Culler, and K. Pister, System Architecture Directions for Network Sensors.
  • P. Levis and D. Gay, TinyOS Programming, 2009.
      • Purchase the book online
      • Download the first half of the published version for free
  • http://www.tinyos.net/