SLIDE 1

More robust I2C designs with a new fault-injection driver

Wolfram Sang, Consultant / Renesas ELCE17

Wolfram Sang, Consultant / Renesas Robust I2C with fault-injection ELCE17 1 / 24

SLIDE 2

Motivation

It really got personal…

I2C maintainer since 2012

encountered similar types of problems

handling rare error cases in I2C master drivers

again and again

was unsure myself how drivers for Renesas I2C IP cores behaved

… so as a first step

a reproducible way to generate test cases was desired!

SLIDE 3

Introduction: sigrok

Figure 1: https://www.sigrok.org

The sigrok project aims at creating a portable, cross-platform, Free/Libre/Open-Source signal analysis software suite that supports various device types (e.g. logic analyzers, oscilloscopes, and many more).1

1 from their website
SLIDE 4

Introduction: sigrok II

Features & Design goals2

Broad hardware support

logic analyzers, oscilloscopes, multimeters, data loggers etc.

Cross-platform

Scriptable protocol decoding

stackable, Python3

File format support

binary, ASCII, hex, CSV, gnuplot, VCD, WAV, …

Reusable libraries

libsigrok, libsigrokdecode

Various frontends

PulseView (LA GUI), sigrok-meter (DMM GUI), sigrok-cli

2 from their website, slightly shortened
SLIDE 5

Setup for sigrok

SLIDE 6

Live demo setup

Click here and there until everything works :)

SLIDE 7

Some basics: about START and STOP

SLIDE 8

Definitions of ‘message’ and ‘transfer’

transfer: everything between START and STOP

message: everything between START or REP_START and STOP or REP_START
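
The two definitions can be made concrete with a small counting sketch. It is only an illustration: the event names and the function `count_transfers_and_messages` are made up for this example, and data bytes are left out because only the bus conditions matter here.

```python
def count_transfers_and_messages(events):
    """Count transfers and messages in a list of bus conditions."""
    transfers = 0
    messages = 0
    in_transfer = False
    for event in events:
        if event == "START":
            in_transfer = True      # a transfer and its first message begin
        elif event == "REP_START" and in_transfer:
            messages += 1           # closes one message, opens the next
        elif event == "STOP" and in_transfer:
            messages += 1           # closes the last message ...
            transfers += 1          # ... and the transfer itself
            in_transfer = False
    return transfers, messages


# Two messages sent as STOP+START: two transfers with one message each.
print(count_transfers_and_messages(["START", "STOP", "START", "STOP"]))  # (2, 2)
# The same two messages joined by a repeated START: one transfer.
print(count_transfers_and_messages(["START", "REP_START", "STOP"]))      # (1, 2)
```

Both sequences carry two messages; the difference is whether the bus is released in between.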

SLIDE 9

Live demo 1

Difference between STOP+START and REP_START on the wire

SLIDE 10

It really happens!

From: Giuseppe Cantavenera <...>
Subject: Re: [PATCH] i2c-cadence: fix repeated start in message sequence
...
Sadly, it would have saved our team weeks of investigation
on a major issue if we had noticed before, but that's our
problem :(
...

SLIDE 11

How to debug error cases?

Cases of interest

stalled bus!

SDA stuck low

SCL stuck low

arbitration lost

faulty bits

Those usually happen rarely. Even when they do, they are often hard to reproduce.

SLIDE 12

Solution: fault-injector

GPIOs driven by extended i2c-gpio driver

SLIDE 13

GPIO based I2C fault injector

Implementation details

currently compiled-in extension to i2c-gpio driver

might be refactored to an additional module if it grows too large

controlled by files in debugfs

if you don’t know it already: super-convenient for such cases. Much better than sysfs!
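
Controlling a driver through debugfs amounts to reading and writing small text files. The sketch below shows the idea with two hypothetical helpers, `write_control` and `read_control`. The directory path and the file name `scl` in the commented usage are assumptions that the slides do not state; check the kernel documentation of your version for the real names.

```python
from pathlib import Path


def write_control(injector_dir, name, value):
    """Write a value to one control file of the fault injector."""
    (Path(injector_dir) / name).write_text(value + "\n")


def read_control(injector_dir, name):
    """Read the current content of one control file."""
    return (Path(injector_dir) / name).read_text().strip()


# Hypothetical usage on a target board (path and file name are assumptions):
# injector = "/sys/kernel/debug/i2c-fault-injector/i2c-8"
# write_control(injector, "scl", "0")   # ask the injector to pin SCL low
# print(read_control(injector, "scl"))
```

Because the helpers only take a directory, they can be tried against an ordinary temporary directory before touching real hardware.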

SLIDE 14

Error case: SDA held low by a device

How it can happen

Handover between bootloader and Kernel during a transfer

Watchdog resets system during a transfer

Device got stuck

What it means

SCL high, SDA low (held by the client device) → bus not free

How it is simulated

address phase to a known client is started

when the client acks its presence, we stop clocking SCL
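
Why stopping at that moment leaves SDA low can be seen in a toy simulation. This is a sketch of the idea, not the driver's code: the `Client` class and `incomplete_address_phase` are invented for illustration, the START condition is not modelled, and the address 0x1a is an arbitrary example value.

```python
class Client:
    """Toy model of an I2C client that only handles the address phase."""

    def __init__(self, address):
        self.address = address      # 7-bit address this client answers to
        self.received = 0           # bits sampled so far
        self.count = 0              # progress: 8 = byte done, 9 = ACK slot
        self.holds_sda = False      # True while the client pulls SDA low

    def scl_rising(self, sda):
        # Data bits are sampled on rising SCL edges.
        if self.count < 8:
            self.received = (self.received << 1) | sda
            self.count += 1

    def scl_falling(self):
        # The client changes its SDA output on falling SCL edges.
        if self.count == 8:
            # Address byte complete: ACK (pull SDA low) if it is ours.
            self.holds_sda = (self.received >> 1) == self.address
            self.count = 9
        elif self.count == 9:
            self.holds_sda = False  # ACK slot is over, release SDA


def incomplete_address_phase(client, address):
    """Clock out address + write bit, then stop with SCL high in the ACK slot."""
    byte = (address << 1) | 0       # R/W bit 0 means write
    client.scl_falling()            # SCL goes low after START
    for i in range(7, -1, -1):
        client.scl_rising((byte >> i) & 1)
        client.scl_falling()
    client.scl_rising(1)            # ninth clock: SCL high, master releases SDA
    scl = 1                         # ... and here the injector simply stops
    sda = 0 if client.holds_sda else 1   # open drain: whoever pulls low wins
    return scl, sda


print(incomplete_address_phase(Client(0x1a), 0x1a))  # (1, 0): SDA held low
print(incomplete_address_phase(Client(0x1a), 0x50))  # (1, 1): nobody acked
```

The client would release SDA on the next falling SCL edge. Since that edge never comes, the bus stays in the "SCL high, SDA low" state described above.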

SLIDE 15

Live demo 2

Incomplete transfer to the audio codec (rather than the PMIC)

SLIDE 16

I2C bus recovery

The I2C specification has a solution for this (Revision 6, Chapter 3.1.16):

“If the data line (SDA) is stuck LOW, the master should send nine clock pulses. The device that held the bus LOW should release it sometime within those nine clocks. If not, then use the HW reset or cycle power to clear the bus.”

The Linux Kernel has support for that

populate a bus_recovery_info structure

generic helpers if SCL/SDA are controllable

generic helpers if you want to use GPIOs
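
The nine-clock procedure quoted above can be sketched as a small simulation. This is a simplified model and not the kernel's implementation: `StuckClient` and `recover_bus` are hypothetical names, and the client is reduced to a list of bits it still wants to shift out.

```python
class StuckClient:
    """Client that still has bits to shift out and drives SDA accordingly."""

    def __init__(self, pending_bits):
        self.pending = list(pending_bits)

    def sda(self):
        # Open drain: the line reads low only while the client outputs a 0.
        return self.pending[0] if self.pending else 1

    def clock_pulse(self):
        # One full SCL pulse lets the client move on to its next bit.
        if self.pending:
            self.pending.pop(0)


def recover_bus(client, max_pulses=9):
    """Pulse SCL up to max_pulses times until SDA is released.

    Returns the number of pulses used, or None if SDA is still low.
    """
    for pulses in range(max_pulses + 1):
        if client.sda() == 1:
            return pulses           # bus is free; a STOP would follow here
        if pulses < max_pulses:
            client.clock_pulse()
    return None                     # needs a HW reset or power cycle


print(recover_bus(StuckClient([0])))        # stuck in the ACK slot
print(recover_bus(StuckClient([0] * 8)))    # stuck while sending a 0x00 byte
print(recover_bus(StuckClient([0] * 20)))   # misbehaving device
```

The third case is the one the specification hands over to a hardware reset or power cycle.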

SLIDE 17

Live demo 3

Incomplete transfer to the audio codec using another I2C IP core

SLIDE 18

When not to use bus recovery

Not suitable when

SDA is not low

you should try emitting a STOP

the transfer timed out

could happen because device is busy

Problem! I2C has no timeouts defined. SMBus has.

SCL is stuck low

we’ll talk about that very soon

so only when SDA is stuck low at the beginning of a transfer

sometimes doing $RANDOM things will recover a device for you. But $RANDOM might break things for other users randomly.

SLIDE 19

Error case: SCL held low by a device

How it can happen

Device got stuck

What it means

SCL low (held by the client device), SDA doesn’t really matter → bus not free and we cannot clock SCL

How it is simulated

SCL is pinned low by the GPIO

SLIDE 20

Live demo 4

pinning SCL low

SLIDE 21

Solution is to reset

The I2C specification also has a solution for this (Revision 6, Chapter 3.1.16):

“In the unlikely event where the clock (SCL) is stuck LOW, the preferential procedure is to reset the bus using the HW reset signal if your I2C devices have HW reset inputs. If the I2C devices do not have HW reset inputs, cycle power to the devices to activate the mandatory internal Power-On Reset (POR) circuit.”

not much we can do

return -EBUSY and let the client driver handle the necessary steps
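
Taken together with the bus-recovery rules from the earlier slides, the check a host driver could make before starting a transfer can be sketched as follows. The function name `check_bus_before_transfer` and the returned action strings are made up for this illustration. The sketch only covers the line state before a transfer, not timeouts in the middle of one.

```python
import errno


def check_bus_before_transfer(scl, sda):
    """Decide what to do from the line levels sampled before a transfer.

    Returns a (status, action) pair; status is 0 or a negative errno value.
    """
    if scl == 0:
        # SCL stuck low: we cannot clock at all, only a reset helps.
        return -errno.EBUSY, "report busy; device reset or power cycle needed"
    if sda == 0:
        # SCL high but SDA low: the one case bus recovery is meant for.
        return 0, "run bus recovery (up to nine SCL pulses)"
    # Both lines high: the bus is free.
    return 0, "start the transfer"


for scl, sda in [(1, 1), (1, 0), (0, 1), (0, 0)]:
    print(scl, sda, check_bus_before_transfer(scl, sda))
```

SCL is tested first because a low SCL makes the SDA level irrelevant: without a working clock no recovery pulses can be sent.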

SLIDE 22

Outlook

add some more failure cases

arbitration lost

hold SDA low for a while once we detect START

SDA stuck low without external device

hold SDA low until we counted some SCL pulses

insert some faulty bits

could be used to check PEC bytes

decide whether to use add-on module

all this extra code might bloat the core driver source
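
To show why injected faulty bits would be a useful test for PEC handling, here is a minimal sketch of the SMBus Packet Error Code, a CRC-8 with polynomial x^8 + x^2 + x + 1 and initial value 0. The function name `smbus_pec` and the example message bytes are assumptions chosen for illustration.

```python
def smbus_pec(data):
    """CRC-8 over the given bytes, polynomial 0x07, initial value 0."""
    crc = 0
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 0x80:
                crc = ((crc << 1) ^ 0x07) & 0xFF
            else:
                crc = (crc << 1) & 0xFF
    return crc


# Hypothetical write: address byte, command byte, one data byte.
message = bytes([0x34, 0x01, 0xAB])
# The same message with a single injected bit error in the data byte.
corrupted = bytes([0x34, 0x01, 0xAB ^ 0x04])

print(hex(smbus_pec(message)), hex(smbus_pec(corrupted)))
print(smbus_pec(message) != smbus_pec(corrupted))  # True: fault is detected
```

A receiver that verifies the PEC has to reject the corrupted message. An injector that flips bits on the wire would make it possible to check whether a driver really does so.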

SLIDE 23

Summary

What has been shown:

I2C can be measured without much effort and cost

really easy to detect incorrect sequences

faults can be injected via an extended i2c-gpio driver

I2C host drivers can then be checked against that

when to use bus recovery and when not

SLIDE 24

Let’s do good engineering :)

Thank you!

Questions?

Right here, right now…

Later at the conference

wsa@the-dreams.de

And thanks again to Renesas for funding this work!
