Qduino: A Multithreaded Arduino System for Embedded Computing



slide-1
SLIDE 1

Qduino: A Multithreaded Arduino System for Embedded Computing

Zhuoqun Cheng, Ye Li, Richard West

Computer Science

slide-2
SLIDE 2

Background

  • Many Robotics, Internet of Things, and Home Automation applications have been developed recently
      • Perform complicated computing tasks
      • Interact with the physical world
  • Need an easy-to-use platform to develop applications
      • High processing capabilities
      • Straightforward hardware and software interface

slide-3
SLIDE 3

Background

Arduino
  • Digital and analog GPIOs
  • Simple API
  • Low processing capabilities

Arduino Uno: 16 MHz 8-bit ATmega328P

slide-4
SLIDE 4

Background

  • More powerful Arduino-compatible boards have emerged to meet the demand
      • Intel Galileo: 400 MHz Intel Quark X1000
      • Intel Edison: 500 MHz dual-core Atom
  • Arduino-compatible: the same GPIO layout as the standard Arduino boards

slide-5
SLIDE 5

Background

  • The standard Arduino runs sketches (Arduino programs) on bare metal
  • New boards are shipped with Linux
      • Able to afford the overhead of an operating system
      • To cope with the complexity of the hardware
      • Run sketches as Linux processes

slide-6
SLIDE 6

Motivation

  • Linux lacks predictability
      • Many embedded applications have real-time requirements
      • An RTOS is needed
  • The standard Arduino API is designed for a single thread of execution
      • No multithreading or concurrency
      • Fails to utilize computing resources and hardware parallelism

slide-7
SLIDE 7

Contributions

Qduino: a programming environment that provides support for a preemptive multithreading Arduino API that guarantees the timing predictability of different control flows in a sketch

  • Multithreaded sketches, and synchronization and communication between control flows
  • Temporal isolation between different control flows and asynchronous system events, e.g., interrupts
  • Predictable event delivery for I/O handling in sketches

slide-8
SLIDE 8

Qduino Architecture

[Figure: Qduino architecture diagram. Labels: User, Kernel, Sketch (loop1 ... loopN), Qduino Libs, Quest Native App, GPIO Driver, SPI Driver, I2C Driver, x86 SoC, Galileo, Edison, Minnowboard]

slide-9
SLIDE 9

Arduino vs Qduino APIs

Category | Standard APIs | New APIs (backward compatible)
Structure | setup(), loop() | loop(id, C, T)
Digital and Analog I/Os | pinMode(), digitalWrite(), digitalRead(), analogWrite(), analogRead() | -
Interrupts | interrupts(), noInterrupts(), attachInterrupt(pin, ISR, mode), detachInterrupt(pin) | interruptsVcpu(C, T), attachInterruptVcpu(pin, ISR, mode, C, T)
Synchronization & Communication | - | spinlock, four-slot channel, ring buffer
Other Utility Functions | micros(), delay(), min(), sqrt(), sin(), isLowerCase(), random(), bitSet(), ... | -

slide-10
SLIDE 10

Contributions

Qduino:
  • Multithreaded sketches, and synchronization and communication between control flows
  • Temporal isolation between different control flows and asynchronous system events, e.g., interrupts
  • Predictable event delivery for I/O handling in sketches

slide-11
SLIDE 11

Multithreaded Sketch

[Figure: Qduino architecture diagram, repeated from slide 8]

Standard API:
  • Only one loop() is allowed
  • Blocking I/Os block the sketch

Qduino:
  • Up to 32 loop() functions in one sketch
  • Each loop() function is assigned to a Quest thread

Category | Standard APIs | New APIs
Structure | loop(), setup() | loop(id, C, T)

slide-12
SLIDE 12

Multithreaded Sketch

Benefits
  • Loop interleaving
      • Blocking I/Os won't block the entire sketch
      • Increases CPU utilization
  • Easy to write sketches with parallel tasks
      • Example: toggle pin 9 every 2s, pin 10 every 3s

slide-13
SLIDE 13

Multithreaded Sketch

    //Sketch 1: toggle pin 9 every 2s
    int val9 = 0;
    void setup() {
      pinMode(9, OUTPUT);
    }
    void loop() {
      val9 = !val9; //flip the output value
      digitalWrite(9, val9);
      delay(2000); //delay 2s
    }

    //Sketch 2: toggle pin 10 every 3s
    int val10 = 0;
    void setup() {
      pinMode(10, OUTPUT);
    }
    void loop() {
      val10 = !val10; //flip the output value
      digitalWrite(10, val10);
      delay(3000); //delay 3s
    }

Which value would a merged loop() pass to delay(), 2000 or 3000? There is no way to merge the two sketches directly.

slide-14
SLIDE 14

Multithreaded Sketch

  • Inefficient
  • Do scheduling by hand
  • Hard to scale

    int val9 = 0, val10 = 0;
    unsigned long next_flip9 = 0, next_flip10 = 0; //millis() returns unsigned long
    void setup() {
      pinMode(9, OUTPUT);
      pinMode(10, OUTPUT);
    }
    void loop() {
      if (millis() >= next_flip9) {
        val9 = !val9; //flip the output value
        digitalWrite(9, val9);
        next_flip9 += 2000;
      }
      if (millis() >= next_flip10) {
        val10 = !val10; //flip the output value
        digitalWrite(10, val10);
        next_flip10 += 3000;
      }
    }
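The hand-scheduled loop compares millis() against an absolute deadline, which can misbehave once the millisecond counter wraps around. A minimal sketch of the common elapsed-time idiom follows, written as host-side C++ so it compiles without a board. The helper name `intervalElapsed` is hypothetical and is not part of the Arduino or Qduino API; `now` stands in for the value returned by millis().

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of the Arduino or Qduino API).
// Returns true once at least `interval` ms have passed since `last`.
// Unsigned subtraction is performed modulo 2^32, so the comparison
// stays correct across a single wraparound of the counter.
bool intervalElapsed(std::uint32_t now, std::uint32_t last, std::uint32_t interval) {
    assert(interval > 0);
    return static_cast<std::uint32_t>(now - last) >= interval;
}
```

In a sketch this would replace the deadline test with something like `if (intervalElapsed(millis(), last_flip9, 2000)) { ...; last_flip9 += 2000; }`. It removes the wraparound hazard, but the scheduling is still done by hand.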

slide-15
SLIDE 15

Multithreaded Sketch

Multithreaded Sketch in Qduino

    int val9 = 0, val10 = 0;
    int C = 500, T = 1000;
    void setup() {
      pinMode(9, OUTPUT);
      pinMode(10, OUTPUT);
    }
    void loop(1, 5, 10) { //loop(id, C, T)
      val9 = !val9; //flip the output value
      digitalWrite(9, val9);
      delay(2000);
    }
    void loop(2, 5, 10) { //loop(id, C, T)
      val10 = !val10; //flip the output value
      digitalWrite(10, val10);
      delay(3000);
    }

slide-16
SLIDE 16

Communication & Synchronization

Loops – threads
  • Communication via global variables
  • Serialized global variable access
      • Explicit: spinlock
      • Implicit: channel, ring buffer

Function signatures by category

Spinlock
  • spinlockInit(lock)
  • spinlockLock(lock)
  • spinlockUnlock(lock)

Four-slot
  • channelWrite(channel, item)
  • item channelRead(channel)

Ring buffer
  • ringbufInit(buffer, size)
  • ringbufWrite(buffer, item)
  • ringbufRead(buffer, item)
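To illustrate how a four-slot channel can pass the freshest value from one writer loop to one reader loop without blocking either, here is a minimal host-side sketch. It assumes the channel follows Simpson's four-slot asynchronous communication mechanism; the slides do not show the Qduino implementation. The type `FourSlotChannel` and its members are hypothetical names and do not reproduce channelWrite()/channelRead().

```cpp
#include <atomic>
#include <cassert>

// Sketch of Simpson's four-slot mechanism for ONE writer and ONE reader.
// Names are hypothetical; only the algorithm itself is standard.
struct FourSlotChannel {
    int data[2][2] = {{0, 0}, {0, 0}};  // four slots: two pairs of two
    std::atomic<int> slot[2];           // freshest slot index within each pair
    std::atomic<int> latest{0};         // pair that was written most recently
    std::atomic<int> reading{0};        // pair the reader last announced

    FourSlotChannel() { slot[0] = 0; slot[1] = 0; }

    // Writer side: never touches the slot the reader may be using.
    void write(int item) {
        int pair = 1 - reading.load();      // pick the pair the reader is not on
        int index = 1 - slot[pair].load();  // pick the slot not last written there
        data[pair][index] = item;
        slot[pair].store(index);            // publish the slot within the pair
        latest.store(pair);                 // publish the pair
    }

    // Reader side: returns the most recently published item.
    int read() {
        int pair = latest.load();
        reading.store(pair);                // announce the pair being read
        int index = slot[pair].load();
        assert(index == 0 || index == 1);
        return data[pair][index];
    }
};
```

The reader always gets the latest complete value and may see the same value twice; intermediate values can be skipped. That suits state-like data such as a sensor reading, whereas a ring buffer fits when every item must be delivered.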

slide-17
SLIDE 17

Contributions

Qduino:
  • Multithreaded sketches, and synchronization and communication between control flows
  • Temporal isolation between different control flows and asynchronous system events, e.g., interrupts
  • Predictable event delivery for I/O handling in sketches

slide-18
SLIDE 18

Temporal Isolation

Real-time Virtual CPU (VCPU) Scheduling

  • VCPU: kernel objects for time accounting and scheduling
  • Two classes:
      • Main VCPU – conventional thread
      • I/O VCPU – threaded interrupt handler

[Figure: VCPU scheduling hierarchy. Labels: Address Space, Threads, Main VCPUs, I/O VCPUs, PCPUs (Cores)]

slide-19
SLIDE 19

Temporal Isolation

Real-time Virtual CPU (VCPU) Scheduling

  • Each VCPU has a maximum budget C, a period T, and a utilization U = C / T
  • Integrates the scheduling of tasks and I/O interrupts
      • Extension to rate-monotonic scheduling
      • Ensures temporal isolation if the Liu-Layland utilization bound is satisfied

[Figure: VCPU scheduling hierarchy. Labels: Address Space, Threads, Main VCPUs, I/O VCPUs, PCPUs (Cores)]
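The utilization test mentioned on this slide can be sketched as follows. This is a minimal host-side example that applies the classic Liu-Layland bound for rate-monotonic scheduling, n(2^(1/n) - 1), uniformly to all VCPUs. The struct and function names are hypothetical, and the admission test Quest actually uses (in particular for I/O VCPUs) may differ.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical description of one VCPU: budget C and period T, same time unit.
struct Vcpu {
    double budget;  // C
    double period;  // T
};

// Total utilization: the sum of C / T over all VCPUs.
double totalUtilization(const std::vector<Vcpu>& vcpus) {
    double u = 0.0;
    for (const Vcpu& v : vcpus) {
        assert(v.period > 0.0 && v.budget >= 0.0 && v.budget <= v.period);
        u += v.budget / v.period;
    }
    return u;
}

// Classic Liu-Layland bound for n periodic tasks: n * (2^(1/n) - 1).
double liuLaylandBound(std::size_t n) {
    assert(n > 0);
    double count = static_cast<double>(n);
    return count * (std::pow(2.0, 1.0 / count) - 1.0);
}

// True if the VCPU set passes the (sufficient, not necessary) utilization test.
bool fitsLiuLaylandBound(const std::vector<Vcpu>& vcpus) {
    return !vcpus.empty() && totalUtilization(vcpus) <= liuLaylandBound(vcpus.size());
}
```

For two VCPUs the bound is about 0.828, so budgets of 5 and 3 with period 10 (U = 0.8) pass, while 5 and 4 (U = 0.9) do not.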

slide-20
SLIDE 20

Temporal Isolation

[Figure: Qduino architecture diagram, repeated from slide 8]

  • Loop – thread – Main VCPU
      • Specify loop timing requirements
  • GPIO interrupt handler – I/O VCPU
      • Control the number of interrupts to handle
  • Balance CPU time between tasks, as well as between tasks and interrupts

Category | Standard APIs | New APIs
Structure | loop(), setup() | loop(id, C, T)
Interrupts | interrupts() | interruptsVcpu(C, T)

slide-21
SLIDE 21

Contributions

Qduino:
  • Multithreaded sketches, and synchronization and communication between control flows
  • Temporal isolation between different control flows and asynchronous system events, e.g., interrupts
  • Predictable event delivery for I/O handling in sketches

slide-22
SLIDE 22

Predictable Events

  • Event delivery time: the time interval between the invocation of the ISR and the invocation of the user-level interrupt handler
  • Predictable end-to-end event delivery
      • attachInterruptVcpu(..., C, T), interruptsVcpu(C, T)

Category | Standard APIs | Newly added APIs
Interrupts | interrupts(), noInterrupts(), attachInterrupt(pin, ISR, mode), detachInterrupt(pin) | interruptsVcpu(C, T), attachInterruptVcpu(pin, ISR, mode, C, T)

[Figure: event delivery path. Labels: Hardware Interrupt, GPIO Expander, CPU Core(s), GPIO Driver, Interrupt Bottom Half, I/O VCPU, Scheduler, Wakeup, User Interrupt Handler, Main VCPU, Sketch Thread, Main VCPU, attachInterruptVcpu, interrupt return, Kernel, User]

slide-23
SLIDE 23

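Predictable Events

  • I/O VCPU ($C_{io}$, $T_{io}$) – threaded interrupt bottom half
  • Main VCPU ($C_h$, $T_h$) – threaded user interrupt handler

Worst Case Event Delivery Time:

$$\Delta_{WCD} = \Delta_{bh} + (T_h - C_h) = (T_{io} - C_{io}) + \left\lceil \frac{\delta_{bh}}{C_{io}} - 1 \right\rceil \cdot T_{io} + \delta_{bh} \bmod C_{io} + (T_h - C_h)$$

Term annotations:
  • $(T_{io} - C_{io})$: I/O VCPU used up budget
  • $\left\lceil \delta_{bh} / C_{io} - 1 \right\rceil \cdot T_{io} + \delta_{bh} \bmod C_{io}$: interrupt bottom half execution time
  • $(T_h - C_h)$: Main VCPU used up budget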
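The worst-case formula on this slide can be evaluated directly. The following is a minimal host-side sketch that transcribes it term by term in integer arithmetic. The function and parameter names are hypothetical, and all times are assumed to share one unit (for example microseconds).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper that evaluates the slide's worst-case delivery formula.
// c_io, t_io: budget and period of the I/O VCPU (interrupt bottom half).
// c_h,  t_h:  budget and period of the Main VCPU (user interrupt handler).
// delta_bh:   execution time of the interrupt bottom half.
std::uint64_t worstCaseDeliveryTime(std::uint64_t c_io, std::uint64_t t_io,
                                    std::uint64_t c_h, std::uint64_t t_h,
                                    std::uint64_t delta_bh) {
    assert(c_io > 0 && c_io <= t_io);
    assert(c_h > 0 && c_h <= t_h);
    assert(delta_bh > 0);
    // ceil(delta_bh / c_io - 1) equals ceil(delta_bh / c_io) - 1.
    std::uint64_t extra_periods = (delta_bh + c_io - 1) / c_io - 1;
    // Delta_bh = (T_io - C_io) + ceil(...) * T_io + delta_bh mod C_io
    std::uint64_t bottom_half = (t_io - c_io) + extra_periods * t_io + delta_bh % c_io;
    // Delta_WCD = Delta_bh + (T_h - C_h)
    return bottom_half + (t_h - c_h);
}
```

For example, with an I/O VCPU of (2, 10), a Main VCPU of (5, 10) and a bottom half of 3 time units, the terms are 8 + 1·10 + 1 + 5, giving 24 time units.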

slide-24
SLIDE 24

Evaluation

Experiment Setup
  • Intel Galileo board Gen 1
  • Qduino vs. Clanton
      • Clanton Linux 3.8.7 is shipped with the Galileo board

slide-25
SLIDE 25

Evaluation

Multithreaded Sketch

  • Computation-intensive: find all prime numbers smaller than 80000
  • I/O-intensive: 2000 digital writes
  • Reduces CPU cycles by 30%

Case # | Description
Case 1 | Single-loop digitalWrite()
Case 2 | Single-loop findPrime
Case 3 | Single-loop digitalWrite() + findPrime
Case 4 | Multi-loop digitalWrite() + findPrime

[Figure: bar chart of CPU cycles (x10^9) for Cases 1 to 4, series Clanton and Qduino. Bar labels as extracted: 3.8, 7.6, 11.2, 8, 3.7, 7.6, 10.8, 7.7]

slide-26
SLIDE 26

Evaluation

Predictable loop execution

  • 1 foreground loop increments a counter during its loop period
  • 2 or 4 background loops act as potential interference
  • Result interpretation
      • Overlapped – temporal isolation
      • Straight line – timing guarantee

[Figure: counter (x10^4) vs. time (periods, 100T to 500T). Series: (50,100),2; (50,100),4; (70,100),2; (70,100),4; (90,100),2; (90,100),4; Linux,2; Linux,4]

slide-27
SLIDE 27

Evaluation

Temporal isolation between loops and interrupts

  • Use an external device to toggle pin 2 of the Galileo
  • Run findPrime at the same time
  • Measure the execution time of findPrime and the number of interrupts handled

Case # | I/O VCPU | External Interrupts
Case 1 | 10/100 | OFF
Case 2 | 0/100 | ON
Case 3 | 5/100 | ON
Case 4 | 10/100 | ON
Case 5 | Disabled | ON

[Figure: bar chart of CPU cycles (x10^9) and interrupts handled (counts x1000) for Cases 1 to 5. Data labels as extracted: 12, 12, 12.2, 12.4, 19.5, 2.1, 4.2, 17]

slide-28
SLIDE 28

Evaluation

Autonomous Vehicle

  • Collision avoidance using an ultrasonic sensor
  • Two tasks:
      • A sensing task detects the distance to an obstacle - delay(200)
      • An actuation task controls the motors - delay(100)

slide-29
SLIDE 29

Evaluation

Autonomous Vehicle

  • Measure the time interval between two consecutive calls to the motor actuation code
  • Clanton single loop
      • Delay from both the sensing and actuation tasks
  • Qduino multi-loop
      • No delay from the sensing loop
      • No delay from sensor timeout
  • The shorter the worst-case time interval, the faster the vehicle can drive

[Figure: time interval (milliseconds, 100 to 800) vs. sample # (10 to 100). Series: Clanton Single-loop, Qduino Multi-loop, Qduino Single-loop, Clanton Interrupt]

slide-30
SLIDE 30

Conclusions

  • Supported the Quest RTOS on Intel Arduino-compatible boards
  • Designed and implemented an extension to the Arduino API for Quest on new, powerful Arduino-compatible boards
      • Multi-loop sketches
      • Real-time guarantees

slide-31
SLIDE 31

Questions? More information can be found at: https://www.cs.bu.edu/~richwest/Qduino.php

Thank you!

slide-32
SLIDE 32

Future Work

  • Conditional loops
  • Communication between loops with loop IDs
  • Multi-sketches

slide-33
SLIDE 33

Memory Footprint

Component | Text (Bytes) | Data (Bytes)
Qduino kernel | 953358 | 321516
Clanton kernel | 4390436 | 336104
Qduino autonomous vehicle sketch | 4832 | 2360
Clanton autonomous vehicle sketch | 26249 | 27652

slide-34
SLIDE 34

GPIOs

[Figure: Qduino architecture diagram, repeated from slide 8]

Complicated I/O architecture on new boards

Category | Standard APIs | Newly added APIs
Digital and Analog I/Os | pinMode(), digitalWrite(), digitalRead(), analogWrite(), analogRead() | -

[Figure: GPIO I/O paths. Labels: GPIOs, On-chip GPIO controller, GPIO Expander Chip, I2C Bus, ADC Chip, SPI Bus, Analog, Digital, Write, Read]