
Università degli studi di Udine

Device Drivers

Device Driver

Receive requests
  From user-level code (through syscalls)
  From kernel-level code

Interact with device(s)
  Send commands
  Read responses
  Handle IRQs

Send data/result to requester

Device Driver

[Figure: the I/O subsystem (SW) sends I/O requests to the device drivers and receives I/O results; each device driver sends device commands to its HW controller (HW) and receives device responses and device IRQs.]

Driver request

Handled in the same context as the requester
  Chain of function calls
  Some calls can be syscalls
  Simple

Handled in a different context
  One or more tasks devoted to handling requests
  Interprocess communication
  Flexible


Implementation - objectives

Abstraction

Simplified interface to devices

Flexibility

Same interface for different devices

Modularity

Drivers added at:
  Compile-time (without changes to the I/O subsystem)
  Run-time


Device Drivers

Driver interface

Interface

Drivers expose a set of functions
Drivers must register their exposed interface with the OS
  The I/O subsystem provides mechanisms to register the interface

[Figure: requests req1..req4 reach the I/O subsystem, which keeps a table of registered drivers; each entry holds the function pointers that make up a device driver's interface.]
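
The registration idea above can be sketched in plain user-space C. This is only an illustration under stated assumptions, not the API of any real kernel: struct drv_ops, register_driver, find_driver and the "null" driver are hypothetical names introduced here.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical driver interface: a group of function pointers. */
struct drv_ops {
    int (*read)(char *buf, size_t len);
    int (*write)(const char *buf, size_t len);
};

#define MAX_DRIVERS 8

/* Table of registered drivers kept by the (toy) I/O subsystem. */
static struct {
    const char *name;
    const struct drv_ops *ops;
} drv_table[MAX_DRIVERS];

/* Register an interface; returns 0 on success, -1 if the table is full. */
int register_driver(const char *name, const struct drv_ops *ops)
{
    for (int i = 0; i < MAX_DRIVERS; i++) {
        if (drv_table[i].ops == NULL) {
            drv_table[i].name = name;
            drv_table[i].ops = ops;
            return 0;
        }
    }
    return -1;
}

/* Look up a registered interface by name; NULL if not found. */
const struct drv_ops *find_driver(const char *name)
{
    for (int i = 0; i < MAX_DRIVERS; i++)
        if (drv_table[i].ops && strcmp(drv_table[i].name, name) == 0)
            return drv_table[i].ops;
    return NULL;
}

/* A trivial "null" driver: reads return 0 bytes, writes accept everything. */
static int null_read(char *buf, size_t len)
{
    (void)buf;
    (void)len;
    return 0;
}

static int null_write(const char *buf, size_t len)
{
    (void)buf;
    return (int)len;
}

const struct drv_ops null_ops = { .read = null_read, .write = null_write };
```

A requester never calls null_write directly: it asks the table for the interface and goes through the function pointers, which is what lets the same interface serve different devices.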

Interface

Drivers must inform the kernel that a new function group exists

Through global variables
  The driver exposes its function pointers in initialized data

e.g., eCos
  an exported struct contains:
    device name
    pointer to a function table
      read, write, select, get_config, set_config
    init function
    lookup function
    private_info: argument for lookup, init, and the functions in the table

Interface

Drivers must inform the kernel that a new function group exists

Through an initialization function
  The kernel invokes a module initialization function

e.g., Windows
  the initialization function is exported as the driver's entry point
    default name: “DriverEntry”

    NTSTATUS DriverEntry(
        struct _DRIVER_OBJECT *DriverObject,
        PUNICODE_STRING RegistryPath);

  the initialization function fills a struct (received from the kernel) with the driver's function pointers

    DriverObject->DriverUnload = My_Drv_Unload;
    DriverObject->MajorFunction[IRP_MJ_READ] = My_Drv_Read;
    DriverObject->MajorFunction[IRP_MJ_WRITE] = My_Drv_Write;

  it receives a registry path to check for persistent parameters

Implicit registration

Interface

Drivers must inform the kernel that a new function group exists

Through an initialization function
  The kernel invokes a module initialization function

e.g., Linux
  the initialization function is indicated as the “entry point” in the binary header

    int my_driver_init(void);
    module_init(my_driver_init);

  the initialization function
    prepares a structure with function pointers
    invokes the kernel registration function to add the new function group
  the kernel can set the value of some “exported” variables as parameters

Explicit registration

Explicit registration

[Figure: at initialization the device driver registers its interface (function pointers) in the I/O subsystem's table of registered drivers; requests req1..req4 are then served through the registered interface.]

Explicit registration

The I/O subsystem provides functions to
  Register/unregister interfaces
  Register an IRQ handler
    Assign a function to an IRQ
  Allocate resources
    I/O addresses, GPIOs, memory, ...


Device Drivers

Linux drivers interface

Driver interface

Linux drivers interfaces

File-based driver interface
  Character devices
  Block devices

Network driver interface

Linux drivers interfaces

File-based driver interface
  Character devices

Interface:
  open, release, read, write, llseek, unlocked_ioctl, mmap, ...

Structures:
  struct cdev
  struct file_operations

Registration:
  alloc_chrdev_region
  cdev_alloc
  cdev_add

Linux character-devices interface implementation

[Figure: struct cdev (fields kobj, owner, ops, list, dev, count), whose ops field points to struct file_operations (llseek, read, write, ...). Labels: owner is a pointer to the module that implements the driver (NULL if the driver is built into the kernel); dev holds the device major and minor numbers; usage count; driver functions (file_operations); driver structure, with other driver data fields around the cdev.]

cdev structure: allocated by the driver, managed by the kernel (list of cdev structures)

Linux character-devices: open function

[Figure: the arguments (from the kernel) are struct inode, whose i_cdev field points to the cdev (kobj, owner, ops, list, dev, count) inside the driver structure, and struct file, whose f_op field points to file_operations (llseek, read, write, ...) and whose private_data pointer is set by the driver (if needed).]

The struct file argument will be used by the following function calls (read, write, etc.)
The driver can change the f_op pointer during device opening, if needed

Prototype:

int open(struct inode *, struct file *);

Linux drivers interfaces

File-based driver interface
  Block devices

Interface:
  open, release, ioctl, ...
  request_fn, make_request_fn, ...
  No read and write functions

Structures:
  struct gendisk
  struct block_device_operations
  struct request_queue

Registration:
  register_blkdev
  alloc_disk
  blk_init_queue
  add_disk

Linux drivers interfaces

Network driver interface

Interface:
  ndo_init, ndo_uninit, ndo_open, ndo_stop, ndo_start_xmit, ndo_tx_timeout, ndo_do_ioctl, ndo_change_mtu, ndo_get_stats, ...
  No receive function

Structures:
  struct net_device
  struct net_device_ops

Registration:
  alloc_netdev
  register_netdev


Device Drivers

Linux kernel coding

Linux kernel-level coding

Concurrency: use reentrant code

Current process (if applicable): current
  current->comm (command name)
  current->pid

Limited stack size
  No large automatic variables

No floating point
  Floating point context is not saved

No standard headers and no standard libraries

N.B.: registered facilities become available to the kernel immediately after registration

Other info

Syscalls in kernel: names start with sys_
Low-level kernel interface: names start with __
  Use with caution

Symbols
  /proc/kallsyms

Loaded modules
  /proc/modules

Devices and major numbers
  /proc/devices

Resources
  /proc/interrupts
  /proc/iomem
  /proc/ioports

Memory access
  /dev/kmem (kernel virtual memory access)
  /dev/mem (physical memory access)

Kernel core
  /proc/kcore


Device Drivers Linux kernel coding

Init / Cleanup


Initialization code

Request resources
  Memory, device numbers, address ranges, synchronization, ...

Register features
  Interfaces, IRQ handlers, ...

Return a result
  Typically:
    0: success
    negative: error code

Handle failure
  Check returned codes
  Release resources on failure
  De-register features on failure

Cleanup code

Release resources
De-register features
Typically no result value is needed

Init code example

int init_func(void)
{
    int res;

    res = require_resource_A(...);
    if (res < 0)
        goto fail_A;
    res = require_resource_B(...);
    if (res < 0)
        goto fail_B;
    res = require_resource_C(...);
    if (res < 0)
        goto fail_C;
    return 0; /* success */

    /* release_resource_C(...); */
fail_C:
    release_resource_B(...);
fail_B:
    release_resource_A(...);
fail_A:
    return res;
}

Regular code, no duplicated calls, easy to maintain. Uses goto.

Cleanup code example

void cleanup_func(void)
{
    release_resource_C(...);
    release_resource_B(...);
    release_resource_A(...);
}

If called, all resources have been allocated
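
To see the goto-based unwinding at work outside the kernel, here is a minimal user-space sketch. The three "resources" are just flags, and fail_step is a hypothetical knob added only to force a failure at a chosen step; none of these names come from the kernel.

```c
#include <errno.h>

/* Hypothetical resources: each flag records whether it is currently held. */
int held_a, held_b, held_c;

/* Set to 'A', 'B' or 'C' to make that step fail (demonstration only). */
char fail_step;

static int require_resource(int *held, char id)
{
    if (fail_step == id)
        return -ENOMEM;      /* negative error code, kernel style */
    *held = 1;
    return 0;
}

static void release_resource(int *held)
{
    *held = 0;
}

int init_func(void)
{
    int res;

    res = require_resource(&held_a, 'A');
    if (res < 0)
        goto fail_A;
    res = require_resource(&held_b, 'B');
    if (res < 0)
        goto fail_B;
    res = require_resource(&held_c, 'C');
    if (res < 0)
        goto fail_C;
    return 0;                /* success: A, B and C are held */

fail_C:
    release_resource(&held_b);
fail_B:
    release_resource(&held_a);
fail_A:
    return res;
}

void cleanup_func(void)
{
    /* Release in reverse order of acquisition. */
    release_resource(&held_c);
    release_resource(&held_b);
    release_resource(&held_a);
}
```

With fail_step set to 'C', init_func returns -ENOMEM and leaves nothing held, because each label releases exactly what was acquired before the failing step.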


Device Drivers Linux kernel coding Data types

Integer types

Exact-width integer types:

  size     kernel space   user space   C99
  8 bit    s8             __s8         int8_t
  8 bit    u8             __u8         uint8_t
  16 bit   s16            __s16        int16_t
  16 bit   u16            __u16        uint16_t
  32 bit   s32            __s32        int32_t
  32 bit   u32            __u32        uint32_t
  64 bit   s64            __s64        int64_t
  64 bit   u64            __u64        uint64_t

Integer types for pointers:

  kernel space     C99
  unsigned long    intptr_t, uintptr_t

Doubly linked list

[Figure: a circular doubly linked list. A standalone struct list_head (next, prev) acts as the list head and is linked to the struct list_head (next, prev) embedded in each custom struct.]

Doubly linked list: reference

LIST_HEAD(list_name)
  compile-time initialization: defines list_name as an empty list (of type struct list_head)

void INIT_LIST_HEAD(struct list_head *list)
  run-time initialization, e.g.,

    struct list_head mylist;
    INIT_LIST_HEAD(&mylist);

void list_add(struct list_head *new, struct list_head *head)
  add a new entry: new is inserted as the first element of head (used for implementing stacks)

void list_add_tail(struct list_head *new, struct list_head *head)
  add a new entry: new is inserted as the last element of head (used for implementing queues)

Doubly linked list: reference

void list_del(struct list_head *entry)
  delete entry from the list it belongs to

void list_replace(struct list_head *old, struct list_head *new)
  replace the old entry with the new one

int list_empty(const struct list_head *head)
  test whether a list is empty

list_for_each(pos, list_head)
  iterate over a list, e.g.,

    struct list_head *pos;
    list_for_each(pos, list_head) {
        /* work on element */
    }

Doubly linked list: reference

list_entry(ptr, type, member)
  get the struct that encloses this entry, e.g.,

    /* custom struct organized in a list */
    struct my_struct {
        int integer_field;
        struct list_head list; /* enclosed list field */
    };

    struct my_struct *myvar;
    struct list_head *pos;

    /* list_head: head pointer to a list of struct my_struct */
    list_for_each(pos, list_head) {
        /* pos is a pointer to the internal struct list_head field */
        myvar = list_entry(pos, struct my_struct, list);
        myvar->integer_field++;
    }
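
The following is a simplified user-space re-implementation of a few list primitives, meant only to show how list_entry recovers the enclosing struct from the embedded list field. It is a sketch and not the kernel's <linux/list.h>: the macros are reduced to their essentials, and sum_fields is a hypothetical helper.

```c
#include <stddef.h>

/* Simplified stand-ins for the kernel list primitives (illustration only). */
struct list_head {
    struct list_head *next, *prev;
};

#define LIST_HEAD_INIT(name) { &(name), &(name) }

/* Subtract the member offset from the member address to get the container. */
#define list_entry(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

#define list_for_each(pos, head) \
    for (pos = (head)->next; pos != (head); pos = pos->next)

/* Insert entry just before head, i.e. as the last element of the list. */
void list_add_tail(struct list_head *entry, struct list_head *head)
{
    entry->prev = head->prev;
    entry->next = head;
    head->prev->next = entry;
    head->prev = entry;
}

/* Custom struct organized in a list. */
struct my_struct {
    int integer_field;
    struct list_head list;   /* enclosed list field */
};

/* Sum integer_field over every element linked to head. */
int sum_fields(struct list_head *head)
{
    struct list_head *pos;
    int sum = 0;

    list_for_each(pos, head)
        sum += list_entry(pos, struct my_struct, list)->integer_field;
    return sum;
}
```

Because the links live inside the custom struct, one set of list functions serves every struct type; only list_entry needs to know the container type and the member name.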

Doubly linked list: reference

void list_replace_init(struct list_head *old, struct list_head *new)
  replace the old entry with the new one, and reinitialize old

void list_del_init(struct list_head *entry)
  delete entry from its list and reinitialize it

void list_move(struct list_head *entry, struct list_head *head)
  delete entry from one list and add it as another's head

void list_move_tail(struct list_head *entry, struct list_head *head)
  delete entry from one list and add it as another's tail

int list_is_last(const struct list_head *entry, const struct list_head *head)
  test whether entry is the last entry in list head

int list_empty_careful(const struct list_head *head)
  test whether a list is empty and not being modified

void list_rotate_left(struct list_head *head)
  rotate the list to the left

Doubly linked list: reference

int list_is_singular(const struct list_head *head)
  test whether a list has just one entry

void list_cut_position(struct list_head *list, struct list_head *head, struct list_head *entry)
  cut a list into two

void list_splice(const struct list_head *list, struct list_head *head)
  join two lists; this is designed for stacks

void list_splice_tail(struct list_head *list, struct list_head *head)
  join two lists, each list being a queue

void list_splice_init(struct list_head *list, struct list_head *head)
  join two lists and reinitialize the emptied list

void list_splice_tail_init(struct list_head *list, struct list_head *head)
  join two lists, each list being a queue, and reinitialize the emptied list

list_first_entry(ptr, type, member)
  get the first element from a list

Doubly linked list: reference

list_last_entry(ptr, type, member)
  get the last element from a list

list_first_entry_or_null(ptr, type, member)
  get the first element from a list, or NULL if the list is empty

list_next_entry(pos, member)
  get the next element in the list

list_prev_entry(pos, member)
  get the previous element in the list

list_for_each_prev(pos, head)
  iterate over a list backwards

list_for_each_safe(pos, n, head)
  iterate over a list, safe against removal of the list entry

list_for_each_prev_safe(pos, n, head)
  iterate over a list backwards, safe against removal of the list entry

Doubly linked list: reference

list_for_each_entry(pos, head, member)
  iterate over a list of given type

list_for_each_entry_reverse(pos, head, member)
  iterate backwards over a list of given type

list_prepare_entry(pos, head, member)
  prepare a pos entry for use in list_for_each_entry_continue()

list_for_each_entry_continue(pos, head, member)
  continue iteration over a list of given type

list_for_each_entry_continue_reverse(pos, head, member)
  iterate backwards from the given point

list_for_each_entry_from(pos, head, member)
  iterate over a list of given type from the current point

list_for_each_entry_safe(pos, n, head, member)
  iterate over a list of given type, safe against removal of the list entry

Doubly linked list: reference

list_for_each_entry_safe_continue(pos, n, head, member)
  continue list iteration, safe against removal

list_for_each_entry_safe_from(pos, n, head, member)
  iterate over a list from the current point, safe against removal

list_for_each_entry_safe_reverse(pos, n, head, member)
  iterate backwards over a list, safe against removal

list_safe_reset_next(pos, n, member)
  reset a stale list_for_each_entry_safe loop


Device Drivers Linux kernel coding

Functions / Macros

(some)

Module

module_init(init_function);
  define the module initialization function

module_exit(exit_function);
  define the module cleanup function

module_param(variable, type, perm);
  mark a variable as a module parameter

EXPORT_SYMBOL(symbol);
EXPORT_SYMBOL_GPL(symbol);
  export a symbol

Memory allocation

void *kmalloc(size_t size, gfp_t flags);
void kfree(void *ptr);
  memory allocation
  common flags: GFP_KERNEL, GFP_ATOMIC, __GFP_DMA

unsigned long __get_free_pages(gfp_t gfp_mask, unsigned int order);
__get_free_page(gfp_mask)
unsigned long get_zeroed_page(gfp_t gfp_mask);
void free_pages(unsigned long addr, unsigned int order);
free_page(addr)
  page allocation: __get_free_pages allocates 2^order pages

void *vmalloc(unsigned long size);
void vfree(const void *addr);
  allocate virtual memory
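
Since page allocations are sized by order (a request of order n covers 2^n pages), a byte count has to be rounded up to a power-of-two number of pages. The sketch below does that computation in user space; order_for_size is a hypothetical helper written for this example, and the 4 KiB page size is an assumption.

```c
#include <stddef.h>

#define PAGE_SIZE_ASSUMED 4096UL   /* assumption: 4 KiB pages */

/* Smallest order such that (1 << order) pages can hold `size` bytes. */
unsigned int order_for_size(size_t size)
{
    unsigned int order = 0;
    size_t pages = (size + PAGE_SIZE_ASSUMED - 1) / PAGE_SIZE_ASSUMED;

    while ((1UL << order) < pages)
        order++;
    return order;
}
```

A request for three pages therefore needs order 2 (four pages), which shows why large page-based allocations can waste memory.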

Memory copying

kernel address space <--> user address space
  (the variants without the __ prefix are the safe versions)

long copy_to_user(void __user *to, const void *from, unsigned long n);
long __copy_to_user(void __user *to, const void *from, unsigned long n);
put_user(x, ptr)
__put_user(x, ptr)
long copy_from_user(void *to, const void __user *from, unsigned long n);
long __copy_from_user(void *to, const void __user *from, unsigned long n);
get_user(x, ptr)
__get_user(x, ptr)

kernel address space <--> I/O (memory mapped) areas

void memcpy_fromio(void *dst, const volatile void __iomem *src, int count);
void memcpy_toio(volatile void __iomem *dst, const void *src, int count);
void memset_io(volatile void __iomem *dst, int value, int count);

Char devices

int alloc_chrdev_region(dev_t *devno, unsigned firstminor, unsigned count, const char *name);
  request a set of device numbers

void unregister_chrdev_region(dev_t devno, unsigned nr_devs);
  release device numbers

void cdev_init(struct cdev *cdev, struct file_operations *fops);
  initialize the structure

int cdev_add(struct cdev *cdev, dev_t devno, unsigned count);
  add a device to the system

void cdev_del(struct cdev *cdev);
  remove a device from the system

MAJOR(dev_t devno);
MINOR(dev_t devno);

Misc

printk
  like the standard printf, plus a logging level

memset
strncpy
strlen
sprintf
  same as their libc namesakes

container_of(ptr, container_type, member);
  [Figure: ptr points to member inside container; container_of(ptr, ...) yields the address of container.]

int capable(int cap);
access_ok(type, addr, size)

Proc filesystem

struct proc_dir_entry *create_proc_read_entry(
    const char *name,
    mode_t mode,
    struct proc_dir_entry *base,
    read_proc_t *read_proc,
    void *data);

void remove_proc_entry(const char *name, struct proc_dir_entry *parent);


Device Drivers Linux kernel coding

Synchronization


Device Drivers Linux kernel coding Synchronization

HW primitives (atomic accesses)

GCC builtins for atomic accesses

Read-Modify-Write operations

__sync_fetch_and_add (type *ptr, type value);
__sync_fetch_and_sub (type *ptr, type value);
__sync_fetch_and_or (type *ptr, type value);
__sync_fetch_and_and (type *ptr, type value);
__sync_fetch_and_xor (type *ptr, type value);
__sync_fetch_and_nand (type *ptr, type value);

  perform the operation suggested by the name and return the old value; imply a full memory barrier

GCC builtins for atomic accesses

Read-Modify-Write operations

__sync_add_and_fetch (type *ptr, type value);
__sync_sub_and_fetch (type *ptr, type value);
__sync_or_and_fetch (type *ptr, type value);
__sync_and_and_fetch (type *ptr, type value);
__sync_xor_and_fetch (type *ptr, type value);
__sync_nand_and_fetch (type *ptr, type value);

  perform the operation suggested by the name and return the new value; imply a full memory barrier
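
The only difference between the two families is which value comes back. A small sketch, assuming a compiler that provides the __sync builtins (GCC or Clang); the wrapper names post_increment and pre_increment are made up for this example.

```c
/* Returns the value seen before the increment (old value). */
int post_increment(int *counter)
{
    return __sync_fetch_and_add(counter, 1);
}

/* Returns the value after the increment (new value). */
int pre_increment(int *counter)
{
    return __sync_add_and_fetch(counter, 1);
}
```

The fetch-first form behaves like counter++ and the fetch-last form like ++counter, except that both update the variable atomically.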

GCC builtins for atomic accesses

Read-Modify-Write operations

__sync_lock_test_and_set (type *ptr, type value);

  performs an atomic exchange: writes value into *ptr and returns the previous contents of *ptr; implies an acquire barrier

Read-Test-Modify-Write operations

__sync_val_compare_and_swap (type *ptr, type oldval, type newval);
__sync_bool_compare_and_swap (type *ptr, type oldval, type newval);

  perform an atomic compare-and-swap: if the current value of *ptr is oldval, then write newval into *ptr
  __sync_val_compare_and_swap returns the old value of *ptr
  __sync_bool_compare_and_swap returns true if the comparison is successful
  imply a full memory barrier
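
Compare-and-swap is typically used in a retry loop: read the current value, compute the desired one, and swap only if nobody changed the variable in between. The sketch below uses that pattern to keep a running maximum; atomic_max is a hypothetical helper, not a GCC or kernel function.

```c
/* Atomically raise *ptr to value if value is larger.
 * Returns the contents of *ptr observed before any update. */
int atomic_max(int *ptr, int value)
{
    /* Adding 0 is used here as an atomic read of the current contents. */
    int old = __sync_fetch_and_add(ptr, 0);

    while (old < value) {
        int seen = __sync_val_compare_and_swap(ptr, old, value);

        if (seen == old)
            break;       /* swap performed */
        old = seen;      /* another writer got in first: retry */
    }
    return old;
}
```

The loop ends either when the swap succeeds or when another writer has already stored a value at least as large.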

GCC builtins for atomic accesses

Others:

__sync_lock_release (type *ptr);

  writes 0 to *ptr; implies a release barrier

__sync_synchronize ();

  issues a full memory barrier

Linux atomic operations

types:
  atomic_t (32 bit)
  atomic64_t (64 bit)

operations:
  atomic_t ATOMIC_INIT(int i); (nb)
  int atomic_read(atomic_t *v); (nb)
  void atomic_set(atomic_t *v, int i); (nb)
  void atomic_add(int i, atomic_t *v); (nb)
  int atomic_add_return(int i, atomic_t *v); (fb)
    return the operation result
  void atomic_inc(atomic_t *v); (nb)
  int atomic_inc_return(v); (fb)
    return the operation result

nb: does not imply memory barriers
fb: implies a full memory barrier

Linux atomic operations

operations:
  void atomic_sub(int i, atomic_t *v); (nb)
  int atomic_sub_return(int i, atomic_t *v); (fb)
    return the operation result
  void atomic_dec(atomic_t *v); (nb)
  int atomic_dec_return(v); (fb)
    return the operation result
  int atomic_add_negative(int i, atomic_t *v); (fb)
    return true if result is negative
  int atomic_sub_and_test(int i, atomic_t *v); (fb)
    return true if result is 0
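
The "and_test" operations are the usual building block for reference counting: the caller that brings the counter to zero is the one that releases the object. The sketch below imitates this in user space with a GCC builtin instead of atomic_t; struct object, object_get and object_put are hypothetical names.

```c
struct object {
    int refcount;   /* stands in for an atomic_t */
    int released;   /* set when the last reference is dropped */
};

/* Take one more reference. */
void object_get(struct object *obj)
{
    __sync_add_and_fetch(&obj->refcount, 1);
}

/* Drop a reference; returns 1 if this call dropped the last one. */
int object_put(struct object *obj)
{
    if (__sync_sub_and_fetch(&obj->refcount, 1) == 0) {
        obj->released = 1;   /* a real driver would free the object here */
        return 1;
    }
    return 0;
}
```

Decrementing and testing in one atomic step matters: with a separate decrement and read, two callers could both see zero, or neither could.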

Linux atomic operations

operations:
  int atomic_inc_and_test(atomic_t *v); (fb)
    return true if result is 0
  int atomic_dec_and_test(atomic_t *v); (fb)
    return true if result is 0
  void atomic_clear_mask(int i, atomic_t *v); (nb)
  void atomic_set_mask(int i, atomic_t *v); (nb)
  int atomic_cmpxchg(atomic_t *v, int testval, int newval); (fb)
    perform the swap if v contains testval; return the previous value of v
    also: cmpxchg
  int atomic_xchg(atomic_t *v, int newval); (fb)
    return the previous value stored in v
    also: xchg

not in all architectures

Linux atomic operations

types:
  atomic_t (32 bit)
  atomic64_t (64 bit)

operations:
  ATOMIC64_INIT(i)
  long long atomic64_read(const atomic64_t *v);
  void atomic64_set(atomic64_t *v, long long i);
  void atomic64_add(long long a, atomic64_t *v);
  long long atomic64_add_return(long long a, atomic64_t *v);
  void atomic64_sub(long long a, atomic64_t *v);
  long long atomic64_sub_return(long long a, atomic64_t *v);

Linux atomic operations

operations:
  void atomic64_inc(atomic64_t *v);
  long long atomic64_inc_return(atomic64_t *v);
  int atomic64_inc_and_test(atomic64_t *v);
    return true if result is 0
  void atomic64_dec(atomic64_t *v);
  long long atomic64_dec_return(atomic64_t *v);
  int atomic64_dec_and_test(atomic64_t *v);
    return true if result is 0
  long long atomic64_dec_if_positive(atomic64_t *v);
  int atomic64_inc_not_zero(atomic64_t *v);
    increment v if it is not zero; return true if the increment is performed
  int atomic64_add_negative(long long a, atomic64_t *v);
    return true if result is negative

Linux atomic operations

operations:
  int atomic64_add_unless(atomic64_t *v, long long a, long long u);
    add a unless v contains u; return true if the sum is performed
  int atomic64_sub_and_test(long long a, atomic64_t *v);
    return true if result is 0
  long long atomic64_cmpxchg(atomic64_t *v, long long testval, long long newval);
  long long atomic64_xchg(atomic64_t *v, long long newval);


Device Drivers Linux kernel coding Synchronization

Locks (low-level primitives)

Exclusive lock

Spinlocks
  repeatedly check the lock variable
    loop while it is locked

Linux implementation:
  ticket locks
  whenever possible, the processor is put into a low-power state while waiting
    e.g., with a WFE on ARM
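
A ticket lock serves waiters in arrival order: each one takes a ticket and spins until that ticket is being served. Below is a minimal user-space sketch built on the GCC builtins; it is not the kernel's implementation, the struct and function names are hypothetical, and it busy-waits instead of entering a low-power state.

```c
/* Minimal user-space ticket lock (illustrative only). */
struct ticket_lock {
    unsigned int next;    /* next ticket to hand out */
    unsigned int owner;   /* ticket currently being served */
};

void ticket_lock_acquire(struct ticket_lock *l)
{
    /* Take a ticket; the builtin returns the value before the increment. */
    unsigned int me = __sync_fetch_and_add(&l->next, 1);

    while (*(volatile unsigned int *)&l->owner != me)
        ;                      /* spin until our ticket is served */
    __sync_synchronize();      /* critical section starts after this point */
}

void ticket_lock_release(struct ticket_lock *l)
{
    /* Serve the next ticket; the builtin implies a full barrier. */
    __sync_fetch_and_add(&l->owner, 1);
}

/* Try once without spinning; returns 1 if the lock was acquired. */
int ticket_lock_tryacquire(struct ticket_lock *l)
{
    unsigned int cur = *(volatile unsigned int *)&l->owner;

    /* Succeeds only if nobody holds or waits for the lock (next == owner). */
    return __sync_bool_compare_and_swap(&l->next, cur, cur + 1);
}
```

Compared with a plain test-and-set loop, the ticket counter guarantees fairness: a CPU that started waiting earlier always gets the lock first.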

Linux locks

type:
  spinlock_t

operations:
  initialization:
    DEFINE_SPINLOCK(lockname);
      define and initialize a spinlock_t variable
    spin_lock_init(spinlock_t *lock);

Linux locks

type:
  spinlock_t

operations:
  locking:
    void spin_lock(spinlock_t *lock);
    void spin_lock_irq(spinlock_t *lock);
      disable irqs
    void spin_lock_bh(spinlock_t *lock);
      disable only softirqs
    void spin_lock_irqsave(spinlock_t *lock, unsigned long flags);
      save irq state (in flags) and disable irqs

Linux locks

type:
  spinlock_t

operations:
  unlocking:
    void spin_unlock(spinlock_t *lock);
    void spin_unlock_irq(spinlock_t *lock);
    void spin_unlock_bh(spinlock_t *lock);
    void spin_unlock_irqrestore(spinlock_t *lock, unsigned long flags);
    void spin_unlock_wait(spinlock_t *lock);

Linux locks

type:
  spinlock_t

operations:
  tentative locking:
    int spin_trylock(spinlock_t *lock);
    int spin_trylock_irq(spinlock_t *lock);
    int spin_trylock_bh(spinlock_t *lock);
    int spin_trylock_irqsave(spinlock_t *lock, unsigned long flags);
      return true if the lock has been acquired

Linux locks

type:
  rwlock_t

operations:
  initialization:
    DEFINE_RWLOCK(lockname);
      define and initialize a rwlock_t variable
    rwlock_init(rwlock_t *lock);

Linux locks

type:
  rwlock_t

operations:
  read locking:
    void read_lock(rwlock_t *lock);
    void read_lock_irq(rwlock_t *lock);
    void read_lock_bh(rwlock_t *lock);
    void read_lock_irqsave(rwlock_t *lock, unsigned long flags);

  write locking:
    void write_lock(rwlock_t *lock);
    void write_lock_irq(rwlock_t *lock);
    void write_lock_bh(rwlock_t *lock);
    void write_lock_irqsave(rwlock_t *lock, unsigned long flags);

Linux locks

type:
  rwlock_t

operations:
  read unlocking:
    void read_unlock(rwlock_t *lock);
    void read_unlock_irq(rwlock_t *lock);
    void read_unlock_bh(rwlock_t *lock);
    void read_unlock_irqrestore(rwlock_t *lock, unsigned long flags);

  write unlocking:
    void write_unlock(rwlock_t *lock);
    void write_unlock_irq(rwlock_t *lock);
    void write_unlock_bh(rwlock_t *lock);
    void write_unlock_irqrestore(rwlock_t *lock, unsigned long flags);

Linux locks

type:
  rwlock_t

operations:
  tentative locking:
    int read_trylock(rwlock_t *lock);
    int write_trylock(rwlock_t *lock);
    int write_trylock_irqsave(rwlock_t *lock, unsigned long flags);


Device Drivers Linux kernel coding Synchronization

High-level primitives

High-level synchronization primitives

Semaphores
Mutexes
Barriers
Futexes
Completion
Deferred processing
Reference counting
Read-Copy-Update (RCU)

Synchronization - semaphores

void sema_init(struct semaphore *sem, int initial_val);
  initialization

void down(struct semaphore *sem);
int down_interruptible(struct semaphore *sem);
  can be interrupted by signals to the user process

int down_trylock(struct semaphore *sem);
void up(struct semaphore *sem);

Synchronization - spinlocks

spinlock_t my_lock = SPIN_LOCK_UNLOCKED;
void spin_lock_init(spinlock_t *lock);
  initialization

void spin_lock(spinlock_t *lock);
void spin_unlock(spinlock_t *lock);
  only for spinlocks not used in ISRs

void spin_lock_irqsave(spinlock_t *lock, unsigned long flags);
void spin_unlock_irqrestore(spinlock_t *lock, unsigned long flags);
  disable/enable IRQs on the current CPU

void spin_lock_irq(spinlock_t *lock);
void spin_unlock_irq(spinlock_t *lock);
  only if IRQs are surely enabled when the spinlock is acquired

To avoid deadlocks when a spinlock is held:
  No preemption (guaranteed by the kernel)
  Avoid sleeping!

Synchronization

Danger 1
  func1 acquires the lock and calls func2
  func2 tries to acquire the same lock
  deadlock

  Provide rules
    Example:
      only externally called functions (the interface) acquire the lock
      internal functions do not call the interface

Danger 2
  func1 acquires lock1 and then lock2
  func2 acquires lock2 and then lock1
  possible deadlock

  Always use the same order
    hint: first acquire a local lock, then try to acquire a global lock

Test code against preemptive kernels and SMP architectures
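
One way to enforce "always the same order" when two locks of the same kind must be taken together is to order them by address. This is a different rule from the local-then-global hint above, shown here only as an illustration; toy_lock_t and the functions below are hypothetical user-space stand-ins built on GCC builtins, not kernel spinlocks.

```c
#include <stdint.h>

typedef volatile int toy_lock_t;   /* 0 = free, 1 = held (toy lock) */

static void toy_lock(toy_lock_t *l)
{
    while (__sync_lock_test_and_set(l, 1))
        ;                          /* spin until the previous value was 0 */
}

static void toy_unlock(toy_lock_t *l)
{
    __sync_lock_release(l);        /* writes 0 with a release barrier */
}

/* Acquire two distinct locks in a globally consistent order (by address). */
void lock_pair(toy_lock_t *a, toy_lock_t *b)
{
    if ((uintptr_t)a > (uintptr_t)b) {
        toy_lock_t *tmp = a;

        a = b;
        b = tmp;
    }
    toy_lock(a);
    toy_lock(b);
}

/* Release both locks; the release order does not affect deadlock freedom. */
void unlock_pair(toy_lock_t *a, toy_lock_t *b)
{
    toy_unlock(a);
    toy_unlock(b);
}
```

Whatever order the callers pass the two locks in, they are always taken lowest address first, so the func1/func2 scenario of Danger 2 cannot arise.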

Synchronization

Semaphores and mutexes (and “completion”)

Spinlocks
  do not allow a process to sleep while it holds a spinlock
  Don't
    try to acquire a semaphore or a mutex
    copy_from_user / copy_to_user
    kmalloc(..., GFP_KERNEL)

Atomic variable access
  atomic_t v = ATOMIC_INIT(0); / void atomic_add(int i, atomic_t *v);
  void set_bit(nr, void *addr); / void clear_bit(nr, void *addr);

seqlocks

Read-Copy-Update


Device Drivers

IRQ handling

Interrupts

Asynchronous events
  identified by an integer
    IRQ number

HW device requiring attention
  Timer events
    Periodic ticks
    Timer alarms
  I/O events
    Operation completed
    Buffer empty/full
    New data incoming
      e.g., new network packet, key pressed
  I/O errors

Virtual interrupts (softirqs)

Interrupts handling

ISR must be fast
  IRQ disabled
  System response time

Interrupt handling requires work
  Data transfer, error checking, process wakeup, ...

Split work in two halves
  Top half (within ISR: IRQs disabled)
  Bottom half (IRQs enabled)

[Figure: timeline. Process 1 runs at user level, then at kernel level after a syscall (process context). An IRQ starts the top half of interrupt handling (interrupt context, hard irq), which triggers a softirq; after the return from the ISR, the bottom half runs (interrupt context, soft irq).]

Interrupts handling

ISR must be fast
  IRQ disabled
  System response time

Interrupt handling requires work
  Data transfer, error checking, process wakeup, ...

Split work in two halves
  Top half (within ISR: IRQs disabled)
  Bottom half (a kernel thread)

[Figure: timeline. Process 1 runs at user level, then at kernel level after a syscall (process context). An IRQ starts the top half of interrupt handling (interrupt context, hard irq), which wakes up a thread; after the return from the ISR, the bottom half runs in process context (dedicated thread).]


Device Drivers IRQ handling

Linux IRQ handling

Interrupts handling in Linux

Interruption of software execution
  CPU jumps to a specific address (architecture dependent)
    it is OS code

1) low-level dispatch (architecture dependent code)
2) generic_handle_irq (architecture independent routines)
   calls a registered ISR for this IRQ number (if any)
3) ISR, "Interrupt Service Routine" (device dependent code)
   provided by the driver
4) invoke_softirq (architecture independent)
   if softirqs have been triggered, their handlers are invoked
5) normal work is resumed (architecture dependent code)
   maybe scheduling other tasks

Linux execution contexts

Process context (process time)
  User level
  Kernel level
  IRQs enabled
  May sleep

Interrupt context (interrupt time)
  Kernel level
  hard irq
    IRQs disabled
  soft irq
    IRQs enabled
  Cannot sleep

On SMP architectures an ISR can run concurrently on several CPUs
  ISR must be re-entrant


Linux softirqs

HI_SOFTIRQ

Highest priority (used by high priority tasklets)

TIMER_SOFTIRQ

For kernel timers

NET_TX_SOFTIRQ, NET_RX_SOFTIRQ

For networking subsystem

BLOCK_SOFTIRQ, BLOCK_IOPOLL_SOFTIRQ

For block-devices subsystem

TASKLET_SOFTIRQ

For normal priority tasklets

SCHED_SOFTIRQ

Used by scheduler to perform periodic load balancing

HRTIMER_SOFTIRQ

Used by high resolution timers (when reprogrammed)

RCU_SOFTIRQ

For Read-Copy-Update (a synchronization mechanism)

Top / bottom halves

Interrupt service routine and tasklet
  Install ISR with request_irq
  Call tasklet_schedule in ISR
  Tasklet will run in soft irq context
    HW IRQs enabled
    no sleeping allowed

Top / bottom halves

Interrupt service routine and workqueue
  Install ISR with request_irq
  Call schedule_work or queue_work in ISR
  The “work” is managed by a kernel thread
    the bottom half is managed in process context
      IRQs enabled
      sleeping allowed

Top / bottom halves

Threaded interrupt management
  Install ISR with request_threaded_irq
    request_threaded_irq creates a kernel thread
  Return IRQ_WAKE_THREAD from ISR
  A function is called from the dedicated kernel thread
    the bottom half is managed in process context
      IRQs enabled
      sleeping allowed

Interrupt handler installation – 1

Install an interrupt handler:

error = request_irq(irq_number, interrupt_service_routine, flags, module_name, (void*)dataptr);

error:
  == 0 OK
  != 0 FAILED

interrupt_service_routine:

irqreturn_t interrupt_service_routine(int irq_number, void *dataptr);

flags
  one or more (or none):
    IRQF_SHARED
    IRQF_TRIGGER_RISING
    IRQF_TRIGGER_FALLING

slide-23
SLIDE 23

Università degli studi di Udine

Interrupt handler – 1

The interrupt service routine

prototype:

irqreturn_t interrupt_service_routine(int irq_number, void *dataptr);

No process context (“interrupt time” [“hard irq”])

No current

No copy_from_user or copy_to_user

No sleeping

No semaphores or mutexes

No kmalloc(..., GFP_KERNEL)

Use GFP_ATOMIC

No msleep or ssleep

Shared interrupt: check if it is your interrupt (use dataptr and probe hw)

Be fast (defer work if needed)

Return IRQ_NONE or IRQ_HANDLED


Interrupt handler installation – 2

Install an interrupt handler (threaded irq handling):

error = request_threaded_irq(irq_number, interrupt_service_routine, thread_function, flags, module_name, (void*)dataptr);

error:

== 0 OK
!= 0 FAILED

interrupt_service_routine (for top half):

irqreturn_t interrupt_service_routine(int irq_number, void *dataptr);

thread_function (for bottom half):

irqreturn_t thread_function(int irq_number, void *dataptr);

flags

one or more (or none):

IRQF_SHARED
IRQF_TRIGGER_RISING
IRQF_TRIGGER_FALLING


Interrupt handler – 2

The interrupt service routine

prototype:

irqreturn_t interrupt_service_routine(int irq_number, void *dataptr);

No process context (“interrupt time” [“hard irq”])

No sleeping

Shared interrupt: check if it is your interrupt (use dataptr and probe hw); if so, disable the interrupt on the device and return IRQ_WAKE_THREAD

Be fast (long work must be performed by thread_function)

Return IRQ_NONE or IRQ_HANDLED or IRQ_WAKE_THREAD


Thread function – 2

The thread function

prototype:

irqreturn_t thread_function(int irq_number, void *dataptr);

In process context (“process time”)

IRQs enabled

IRQ of the device disabled by top half

May sleep

Return IRQ_HANDLED
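The split between top half and thread function can be tried out in user space. The following is only a sketch: the return codes are redefined with the same names and values the kernel uses, while struct fake_dev and deliver_irq are hypothetical stand-ins for a real device and for the kernel IRQ core.

```c
#include <assert.h>
#include <stdbool.h>

/* Same names and values as the kernel's irqreturn_t, redefined for user space. */
typedef enum { IRQ_NONE = 0, IRQ_HANDLED = 1, IRQ_WAKE_THREAD = 2 } irqreturn_t;

/* Hypothetical device: a pending flag and a device-level interrupt mask. */
struct fake_dev {
    bool irq_pending;   /* set by the (simulated) hardware */
    bool irq_enabled;   /* interrupt generation enabled on the device */
    int  handled;       /* events processed by the bottom half */
};

/* Top half: hard-irq context, so it only checks, masks and defers. */
static irqreturn_t interrupt_service_routine(int irq_number, void *dataptr)
{
    struct fake_dev *dev = dataptr;

    (void)irq_number;
    if (!dev->irq_enabled || !dev->irq_pending)
        return IRQ_NONE;          /* shared line: not our interrupt */
    dev->irq_enabled = false;     /* silence the device until the thread runs */
    return IRQ_WAKE_THREAD;
}

/* Bottom half: process context, a real driver may sleep here. */
static irqreturn_t thread_function(int irq_number, void *dataptr)
{
    struct fake_dev *dev = dataptr;

    (void)irq_number;
    dev->irq_pending = false;     /* acknowledge the event */
    dev->handled++;
    dev->irq_enabled = true;      /* re-enable the device interrupt */
    return IRQ_HANDLED;
}

/* Stand-in for the IRQ core: call the top half, then the thread if requested. */
static irqreturn_t deliver_irq(int irq_number, struct fake_dev *dev)
{
    irqreturn_t ret = interrupt_service_routine(irq_number, dev);

    if (ret == IRQ_WAKE_THREAD)
        ret = thread_function(irq_number, dev);
    return ret;
}
```

In a real driver the thread function runs later, in its own kernel thread; here it is called immediately only to keep the model short.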


Interrupt resource requesting

Non-managed resources

request_irq

request_threaded_irq

request_any_context_irq

calls request_threaded_irq if irq is nested, request_irq otherwise

Managed resources

automatically released on driver detach

devm_request_irq

devm_request_threaded_irq

devm_request_any_context_irq


Interrupt handler removal

Remove an interrupt handler

free_irq(irq_number, dataptr);


Device Drivers

Time handling


Time handling

Deferred work

Workqueues: prepare work to be scheduled afterwards

“Works” are arranged in queues

The scheduler manages their advancement

Run in a kernel process context

Tasklets: prepare work to be scheduled soon afterwards

soft irq handlers (but a given tasklet runs on only one CPU at a time)

Delay

Wait for some amount of time

Scheduled

Can sleep

Busy wait

Consumes CPU time

Timers

(almost) Precise future scheduling


Device Drivers Time handling

Deferred work


Workqueues

[Figure: three workqueues (Workqueue1, Workqueue2, Workqueue3), each a queue of “Works” (Work 1, Work 2, Work 3). A Work is a structure that contains a pointer to a function (function1, function2, function3) and other data. Kernel threads loop over the queues.]


Workqueues

Queues of “Works”

A work is a structure that contains a pointer to a function

Usually a work is enclosed in a larger structure (for driver private data)

Kernel threads (“kworkers”) handle workqueues

Queues are handled concurrently

Extract a work from a queue and invoke the pointed function

the function receives a pointer to the work structure

A work in a queue may delay (or block) other works in the same queue

Kernel provides a shared workqueue (“events”)

Drivers can allocate their own workqueues
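Because the function receives only a pointer to the work, a driver usually recovers its private data from the enclosing structure. The sketch below models this in user space: struct work_struct, struct workqueue, enqueue and run_worker are simplified stand-ins written for this example, not the kernel definitions, and container_of is re-implemented with offsetof.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified user-space stand-ins for the kernel types (assumptions). */
struct work_struct {
    void (*func)(struct work_struct *work);
    struct work_struct *next;      /* link in the queue */
};

struct workqueue {
    struct work_struct *head, *tail;
};

/* Same idea as the kernel's container_of(): member pointer -> enclosing struct. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Driver private data with the work embedded in it. */
struct mystruct {
    int processed;
    struct work_struct work;
};

/* Deferred function: receives the work pointer, recovers the private data. */
static void workfunction(struct work_struct *work)
{
    struct mystruct *priv = container_of(work, struct mystruct, work);

    priv->processed++;
}

/* Append a work to the tail of the queue. */
static void enqueue(struct workqueue *wq, struct work_struct *work)
{
    work->next = NULL;
    if (wq->tail)
        wq->tail->next = work;
    else
        wq->head = work;
    wq->tail = work;
}

/* One worker pass: extract each work in FIFO order and invoke its function. */
static int run_worker(struct workqueue *wq)
{
    int executed = 0;

    while (wq->head) {
        struct work_struct *work = wq->head;

        wq->head = work->next;
        if (!wq->head)
            wq->tail = NULL;
        work->func(work);
        executed++;
    }
    return executed;
}
```

The FIFO loop also shows why a slow work delays the works queued behind it in the same queue.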


Workqueues - reference

DECLARE_WORK(name, func)
  macro to declare and initialize a variable of type struct work_struct

DECLARE_DELAYED_WORK(name, func)
DECLARE_DEFERRED_WORK(name, func)
  macros to declare and initialize a variable of type struct delayed_work

PREPARE_WORK(name, func)
PREPARE_DELAYED_WORK(name, func)
  macros to initialize a work item's function pointer

INIT_WORK(name, func)
INIT_DELAYED_WORK(name, func)
INIT_DELAYED_WORK_DEFERRABLE(name, func)
  macros to initialize all of a work item in one go


Workqueues - reference

alloc_workqueue(fmt, flags, max_active, args...)
  allocate a workqueue
  fmt: printf format for the name of the workqueue
  flags: WQ_* flags
  max_active: max in-flight work items, 0 for default
  args: arguments for fmt
  returns a pointer to the workqueue or NULL

Examples:

alloc_workqueue("myworkq", 0, 0);
  no flags, use the default value for max_active

alloc_workqueue("myworkq", WQ_HIGHPRI, 2);
  for high priority works, no more than 2 works concurrently executed

alloc_workqueue("myworkq-%d", 0, 0, my_id);
  build the name in a printf-like style


Workqueues - reference

int queue_work(struct workqueue_struct *wq, struct work_struct *work)
  queue work on a workqueue
  returns 0 if work was already on a queue, non-zero otherwise

int queue_work_on(int cpu, struct workqueue_struct *wq, struct work_struct *work)
  queue work on specific cpu
  cpu: CPU number to execute work on

int queue_delayed_work(struct workqueue_struct *wq, struct delayed_work *dwork, unsigned long delay)
  queue work on a workqueue after delay
  delay: number of jiffies to wait before queueing

int queue_delayed_work_on(int cpu, struct workqueue_struct *wq, struct delayed_work *dwork, unsigned long delay)
  queue work on specific CPU after delay
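The return value of queue_work can be understood through a "pending" flag on each work. The following user-space sketch is a simplified model written for this example (struct work_model, struct wq_model and the model_* functions are hypothetical names, and the fixed-size array is an arbitrary choice); it is not how the kernel stores its queues.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_WORKS 8   /* arbitrary capacity for the sketch */

struct work_model {
    bool pending;     /* already on a queue, not yet executed */
    int  runs;        /* how many times the work has been executed */
};

struct wq_model {
    struct work_model *slots[MAX_WORKS];
    size_t count;
};

/* Returns 0 if the work was already queued, 1 if it has been queued now. */
static int model_queue_work(struct wq_model *wq, struct work_model *work)
{
    if (work->pending)
        return 0;
    assert(wq->count < MAX_WORKS);
    work->pending = true;
    wq->slots[wq->count++] = work;
    return 1;
}

/* Worker pass: the pending flag is cleared when the work is taken for
 * execution, so the same work can be queued again afterwards. */
static void model_run_all(struct wq_model *wq)
{
    size_t n = wq->count;

    wq->count = 0;
    for (size_t i = 0; i < n; i++) {
        wq->slots[i]->pending = false;
        wq->slots[i]->runs++;
    }
}
```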


Workqueues - reference

void flush_workqueue(struct workqueue_struct *wq)
  ensure that any scheduled work has run to completion
  typically used in driver shutdown handlers

void drain_workqueue(struct workqueue_struct *wq)
  wait until the workqueue becomes empty


Workqueues - reference

bool flush_work(struct work_struct *work)
  wait for a work to finish executing the last queueing instance
  work: the work to flush
  returns true if waited for the work to finish execution, false if it was already idle
  on return, work might still be executing if it has been enqueued across different CPUs or on multiple workqueues

bool flush_work_sync(struct work_struct *work)
  wait until a work has finished execution
  on return, it's guaranteed that all queueing instances of work which happened before this function is called are finished

bool flush_delayed_work(struct delayed_work *dwork)
  wait for a dwork to finish executing the last queueing (the delay timer is cancelled and the pending work is queued for immediate execution)

bool flush_delayed_work_sync(struct delayed_work *dwork)
  wait for a dwork to finish


Workqueues - reference

bool cancel_work_sync(struct work_struct *work)
  cancel a work and wait for it to finish

bool cancel_delayed_work_sync(struct delayed_work *dwork)
  cancel a delayed work and wait for it to finish

bool cancel_delayed_work(struct delayed_work *work)
  kill off a pending schedule_delayed_work()
  the work callback function may still be running on return

work_pending(work)
  find out whether a work item is currently pending

delayed_work_pending(dwork)
  find out whether a delayable work item is currently pending


Workqueues - reference

int schedule_work(struct work_struct *work)
  put work task in global workqueue
  work: job to be done

int schedule_work_on(int cpu, struct work_struct *work)
  put work task on a specific cpu
  cpu: cpu to put the work task on

int schedule_delayed_work(struct delayed_work *dwork, unsigned long delay)
  put work task in global workqueue after delay
  delay: number of jiffies to wait

int schedule_delayed_work_on(int cpu, struct delayed_work *dwork, unsigned long delay)
  queue work in global workqueue on CPU after delay


Workqueues - reference

int execute_in_process_context(work_func_t fn, struct execute_work *ew)
  reliably execute the routine with user context
  fn: the function to execute
  ew: guaranteed storage for the execute work structure (must be available when the work executes)
  returns 0 if function was executed, 1 if function was scheduled for execution

void flush_scheduled_work(void)
  ensure that any scheduled work has run to completion


Workqueues - reference

void destroy_workqueue(struct workqueue_struct *wq)
  safely terminate a workqueue; all work currently pending will be done first

void workqueue_set_max_active(struct workqueue_struct *wq, int max_active)
  adjust max_active of a workqueue
  max_active: new max_active value
  don't call from IRQ context

bool workqueue_congested(unsigned int cpu, struct workqueue_struct *wq)
  test whether a workqueue is congested


Workqueues - reference

unsigned int work_cpu(struct work_struct *work)
  return the last known associated cpu for work

unsigned int work_busy(struct work_struct *work)
  test whether a work is currently pending or running

long work_on_cpu(unsigned int cpu, long (*fn)(void *), void *arg)
  run a function in user context on a particular cpu
  cpu: the cpu to run on
  fn: the function to run
  arg: the function arg


Workqueues: how to use

  • 1. Link a “struct work_struct” to a function

INIT_WORK(&userdata->work, workfunction);

Private workqueue:

  • 2a. Create a queue of works (struct workqueue_struct)

qptr = alloc_workqueue("workqueue name", 0, 0);

  • 3a. Add the work to the queue

queue_work(qptr, &userdata->work);

queue_delayed_work(qptr, &userdata->dwork, delay); /* dwork must be a struct delayed_work initialized with INIT_DELAYED_WORK */

Shared workqueue:

  • 2b. Add the work to the shared queue

schedule_work(&userdata->work);


Workqueues: how to use

Safely remove a private workqueue

destroy_workqueue(device.wq);

Safely remove a work from the shared queue

cancel_work_sync(&device.work);


Workqueues: how to use

#include <linux/workqueue.h>

/* struct to keep user data and work info */
struct mystruct {
        ...
        struct work_struct work;
        ...
} userdata;

/* deferred work function */
static void workfunction(struct work_struct *work)
{
        ...
}

/* userdata is a struct object (not a pointer): use userdata.work */
INIT_WORK(&userdata.work, workfunction);
...
wq = alloc_workqueue("workqueue_name", 0, 0);
...
queue_work(wq, &userdata.work); /* use a private workqueue */


Workqueues: how to use

#include <linux/workqueue.h>

/* struct to keep user data and work info */
struct mystruct {
        ...
        struct work_struct work;
        ...
} userdata;

/* deferred work function */
static void workfunction(struct work_struct *work)
{
        ...
}

/* userdata is a struct object (not a pointer): use userdata.work */
INIT_WORK(&userdata.work, workfunction);
...
schedule_work(&userdata.work); /* use the default (shared) workqueue */


Tasklets

Tasklets are arranged in two lists

High-priority tasklets

Normal-priority tasklets

The softirq processing function runs the tasklet handlers

Each softirq handler scans its own list and runs the handlers of active tasklets

[Figure: tasklet_vec holds the head and tail pointers of a linked list of tasklet descriptors; each descriptor contains func (the handler) and next.]


Tasklets

Tasklet running

at interrupt time (“soft irq”)

on the same cpu

strictly serialized with respect to itself

can be interrupted by irqs

one shot (can auto-“rearm”)

max latency: the next timer tick


Tasklets - reference

void tasklet_init(struct tasklet_struct *t, void (*func)(unsigned long), unsigned long data)
  initialize a tasklet
  func: function to call
  data: arg for func

DECLARE_TASKLET(name, func, data)
DECLARE_TASKLET_DISABLED(name, func, data)
  macros to allocate and initialize a tasklet (the second one creates it in the disabled state)


Tasklets - reference

void tasklet_kill(struct tasklet_struct *t)
  ensure a tasklet is not running (if a tasklet is scheduled, wait for completion)
  beware of race conditions with auto-rescheduling tasklets (first disable rescheduling, then invoke tasklet_kill)


Tasklets - reference

void tasklet_enable(struct tasklet_struct *t)
  enable a previously disabled tasklet
  a scheduled tasklet is not executed until it is enabled again (kernel keeps count of disable* calls)

void tasklet_hi_enable(struct tasklet_struct *t)
  enable a high-priority tasklet

void tasklet_disable(struct tasklet_struct *t)
  disable a tasklet (wait for a running tasklet to complete)

void tasklet_disable_nosync(struct tasklet_struct *t)
  disable a tasklet (do not check if tasklet is running)


Tasklets - reference

void tasklet_schedule(struct tasklet_struct *t)
  signal a tasklet to be executed
  a tasklet scheduled while it is running will be re-executed
  multiple schedule requests for a tasklet that is not yet running cause a single execution

void tasklet_hi_schedule(struct tasklet_struct *t)
  schedule a high-priority tasklet

void tasklet_hi_schedule_first(struct tasklet_struct *t)
  schedule a high-priority tasklet
  it will be the first one in the queue of high-priority tasklets
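The interplay between scheduling requests and the disable count can be modeled in a few lines of user-space C. This is only a sketch: struct tasklet_model and the model_* functions are hypothetical names, and the real kernel keeps this state in atomic fields of struct tasklet_struct.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified user-space model of a tasklet descriptor (not the kernel structure). */
struct tasklet_model {
    bool scheduled;                 /* models the "scheduled" state bit */
    int  disable_count;             /* > 0 means disabled */
    void (*func)(unsigned long);
    unsigned long data;
};

static int runs;                    /* total added by handler executions */

static void handler(unsigned long data)
{
    runs += (int)data;
}

/* Schedule: requests made while already scheduled are merged into one. */
static bool model_schedule(struct tasklet_model *t)
{
    if (t->scheduled)
        return false;               /* already pending: nothing to do */
    t->scheduled = true;
    return true;
}

/* Each disable call must be balanced by one enable call. */
static void model_disable(struct tasklet_model *t)
{
    t->disable_count++;
}

static void model_enable(struct tasklet_model *t)
{
    assert(t->disable_count > 0);
    t->disable_count--;
}

/* One softirq pass: run if scheduled and enabled, otherwise leave it pending. */
static bool model_softirq_pass(struct tasklet_model *t)
{
    if (!t->scheduled || t->disable_count > 0)
        return false;
    t->scheduled = false;           /* cleared before the call: the handler may reschedule */
    t->func(t->data);
    return true;
}
```

Two schedule requests followed by two disable calls leave the tasklet pending; it runs exactly once, and only after the second enable call.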


Device Drivers Time handling

Delay


Delays

Scheduled delays

not suitable for interrupt context (the caller must be able to sleep)

Busy-wait delays

waste CPU time


Scheduled delays

Long delays

Time unit: jiffy

Kernel macro: HZ

1 second = HZ jiffies

Waitqueues

wait for external events with a maximum timeout

Notify the scheduler that the task must sleep

Interruptible sleep

Short delays

Time units: ms, s


Delay units

unsigned long round_jiffies(unsigned long j)
  round jiffies to a full second
  j: the time in (absolute) jiffies that should be rounded

unsigned long round_jiffies_relative(unsigned long j)
  round jiffies to a full second
  j: the time in (relative) jiffies that should be rounded

unsigned long round_jiffies_up(unsigned long j)
  the same as round_jiffies() except that it will never round down

unsigned long volatile jiffies
  global variable that tracks time
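The rounding rule can be illustrated with a simplified user-space model. It is a sketch under assumptions: HZ is fixed at 100 for the example, the function names are hypothetical, and the model leaves out the per-CPU skew and the check that the rounded value is still in the future, both of which the kernel applies.

```c
#include <assert.h>
#include <stdbool.h>

#define HZ 100UL   /* assumed tick rate for the example */

/* Simplified rounding rule: values just past a second boundary are rounded
 * down, all the others (or all values, if force_up) go to the next second. */
static unsigned long round_to_second(unsigned long j, bool force_up)
{
    unsigned long rem = j % HZ;

    if (rem < HZ / 4 && !force_up)
        return j - rem;            /* less than a quarter second past: round down */
    return j - rem + HZ;           /* otherwise move to the next full second */
}

/* Model of round_jiffies(): may round down or up. */
static unsigned long model_round_jiffies(unsigned long j)
{
    return round_to_second(j, false);
}

/* Model of round_jiffies_up(): never rounds down. */
static unsigned long model_round_jiffies_up(unsigned long j)
{
    return round_to_second(j, true);
}
```

Rounding lets many timers expire on the same tick, so the CPU wakes up less often; it is meant for timeouts that do not need to be precise.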


Waitqueues

List of tasks waiting for events

Event:

condition and a wakeup* call

A timeout can be used

Signal sensitivity can be specified

Tasks can be marked as “exclusive”

An event wakes up all the non-exclusive tasks in the list (and only one exclusive task)


Waitqueues

Create a waitqueue

wait_queue_head_t waitq;
init_waitqueue_head(&waitq);

[Figure: waitq is the head of a doubly linked list]


Waitqueues

Add a task to a waitqueue

wait_event_interruptible_timeout(waitq, condition, delay);

If condition is false, the current task is inserted in the queue and the scheduler runs another task.

current stays suspended until:

someone else changes condition and calls a wakeup function

or until timeout expires

[Figure: waitq links the wait descriptors of task1, task2, task3 and current; current state: TASK_INTERRUPTIBLE]


Waitqueues - reference

DECLARE_WAIT_QUEUE_HEAD(name)
  macro to allocate and initialize a waitqueue

init_waitqueue_head(wq)
  initialize a waitqueue

wake_up(wq)
  wake up tasks in interruptible or uninterruptible state (only one exclusive task)


Waitqueues - reference

wait_event(wq, condition)
  sleep until a condition gets true

wait_event_timeout(wq, condition, timeout)
  sleep until a condition gets true or a timeout expires
  returns:
    0: timeout elapsed
    the remaining jiffies if the condition evaluated to true before timeout

wait_event_interruptible(wq, condition)
  sleep until a condition gets true (can be interrupted by signals)
  returns:
    -ERESTARTSYS: interrupted by a signal
    0: condition evaluated to true

wait_event_interruptible_timeout(wq, condition, timeout)
  sleep until a condition gets true or a timeout expires (can be interrupted by signals)
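wait_event_interruptible_timeout combines the return conventions of the two previous macros, so the caller has to tell three cases apart. The helper below is a user-space sketch: classify_wait and enum wait_outcome are hypothetical names, and ERESTARTSYS is redefined locally with the value used inside the kernel.

```c
#include <assert.h>

#define ERESTARTSYS 512   /* kernel-internal error code, redefined here for user space */

enum wait_outcome { WAIT_TIMED_OUT, WAIT_CONDITION_TRUE, WAIT_SIGNAL };

/* Classify the value returned by wait_event_interruptible_timeout().
 * remaining_jiffies is meaningful only when the condition became true. */
static enum wait_outcome classify_wait(long ret, long *remaining_jiffies)
{
    *remaining_jiffies = 0;
    if (ret < 0)
        return WAIT_SIGNAL;          /* -ERESTARTSYS: interrupted by a signal */
    if (ret == 0)
        return WAIT_TIMED_OUT;       /* condition still false after the timeout */
    *remaining_jiffies = ret;        /* ret > 0: condition true, jiffies left */
    return WAIT_CONDITION_TRUE;
}
```

A driver method interrupted by a signal typically returns -ERESTARTSYS to its caller, so that the syscall can be restarted or reported as interrupted.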


Waitqueues

Example: using a waitqueue to wait for a given delay

delay = delay_seconds*HZ;
wait_queue_head_t waitq;
init_waitqueue_head(&waitq);
wait_event_interruptible_timeout(waitq, 0, delay);
/* condition is always false: the task is resumed only when the timeout expires (or if a signal occurs)
   return value:
   0 => timeout expired
   -ERESTARTSYS => resumed by a signal */


Notify the scheduler

1 - Change the task state

set_current_state(TASK_INTERRUPTIBLE);
  Task will be resumed if a signal is detected

set_current_state(TASK_UNINTERRUPTIBLE);
  Signals are ignored

2 - Yield

schedule_timeout(timeout);
  Task is not considered for scheduling until timeout expires
  Returns 0 (timeout expired) or the remaining delay


Notify the scheduler - reference

signed long schedule_timeout(signed long timeout)
  sleep until timeout or awakened by some event
  timeout: timeout value in jiffies
  returns 0 (timeout expired) or the remaining jiffies

signed long schedule_timeout_interruptible(signed long timeout)
  same as: set_current_state(TASK_INTERRUPTIBLE); schedule_timeout(timeout)

signed long schedule_timeout_killable(signed long timeout)
  same as: set_current_state(TASK_KILLABLE); schedule_timeout(timeout)

signed long schedule_timeout_uninterruptible(signed long timeout)
  same as: set_current_state(TASK_UNINTERRUPTIBLE); schedule_timeout(timeout)


Notify the scheduler - reference

int wake_up_process(struct task_struct *p)
  wake up a specific process
  returns
    1: the process was woken up
    0: it was already running

void schedule(void)
  the main scheduler interface: release control
  change the task state before calling schedule()
  e.g., set_current_state(TASK_INTERRUPTIBLE); schedule();


Notify the scheduler

Examples

delay = delay_seconds*HZ;
schedule_timeout_interruptible(delay); /* release control */
/* return value:
   0 => timeout expired
   > 0 => remaining delay, because resumed by a signal (or someone called wake_up_process) */

delay = delay_seconds*HZ;
schedule_timeout_uninterruptible(delay); /* release control */
/* return value:
   0 => timeout expired
   > 0 => remaining delay */


Short delays

Time unit: ms

msleep(delay_in_ms);

Implemented through schedule_timeout_uninterruptible

msleep_interruptible(delay_in_ms);

Implemented through schedule_timeout_interruptible

Not accurate

resolution is one jiffy

minimum resulting sleep is 2 jiffies

Time unit: s

ssleep(delay_in_s);

Implemented through msleep
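The two-jiffy minimum follows from how the requested time is converted into a timeout. The arithmetic below is a user-space sketch that assumes HZ = 100 and assumes that msleep rounds the delay up to whole jiffies and then adds one jiffy, as kernels of this generation do; the function names are hypothetical.

```c
#include <assert.h>

#define HZ 100UL   /* assumed tick rate; the real value depends on the kernel configuration */

/* Convert milliseconds to jiffies, rounding up (simplified model). */
static unsigned long ms_to_jiffies(unsigned long ms)
{
    return (ms * HZ + 999UL) / 1000UL;
}

/* Timeout handed to the scheduler by the msleep() model: the extra jiffy
 * covers the part of the current tick that has already elapsed. */
static unsigned long msleep_timeout(unsigned long ms)
{
    return ms_to_jiffies(ms) + 1UL;
}

/* Duration, in milliseconds, of a number of jiffies. */
static unsigned long jiffies_to_ms(unsigned long j)
{
    return j * 1000UL / HZ;
}
```

With HZ = 100 a request for 1 ms becomes a 2-jiffy timeout, that is up to 20 ms, which is why msleep is a poor choice for very short delays.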


Busy-wait delays

Task remains running

suitable for ISR

CPU time is wasted

ndelay(delay_in_ns);

a real nanosecond-resolution implementation may not exist

in that case it is implemented as a udelay call

udelay(delay_in_us);

implemented as a loop

mdelay(delay_in_ms);

implemented through udelay


Device Drivers Time handling

Timers


Timers

When a timeout expires

a function is called

from interrupt context (“soft irq”)

no access to user space

sleeping is not allowed

asynchronous with the timer instantiation

one-shot timers

can auto-“rearm”

on the same cpu

can be interrupted by irq

Time unit: jiffy

Used by schedule_timeout


Timers

[Figure: four timers (timer1 ... timer4), each with its own function, expires and data fields]

When a timer expires its function is called with its data as argument

data is unsigned long

Timers are kept in doubly linked lists
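The expiry mechanism can be sketched in user space. This is a simplified model under assumptions: struct timer_model and run_timers are hypothetical names, timers sit in a plain array instead of the kernel's linked lists, and the current time is passed in as a parameter instead of being read from jiffies. The comparison uses the same wrap-safe signed difference as the kernel's time_after_eq macro.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for struct timer_list (array based, not linked lists). */
struct timer_model {
    unsigned long expires;                 /* absolute expiry time in jiffies */
    void (*function)(unsigned long);
    unsigned long data;
    bool pending;
};

/* Wrap-safe "a is at or after b": stays correct when the counter overflows. */
static bool time_after_eq(unsigned long a, unsigned long b)
{
    return (long)(a - b) >= 0;
}

static unsigned long fired_sum;            /* accumulates the data of fired timers */

static void timer_function(unsigned long data)
{
    fired_sum += data;
}

/* Run every pending timer whose expiry time has been reached. */
static int run_timers(struct timer_model *timers, size_t n, unsigned long now)
{
    int fired = 0;

    for (size_t i = 0; i < n; i++) {
        if (timers[i].pending && time_after_eq(now, timers[i].expires)) {
            timers[i].pending = false;     /* one-shot: must be re-added to fire again */
            timers[i].function(timers[i].data);
            fired++;
        }
    }
    return fired;
}
```

Comparing jiffies values with plain < or > gives wrong results around the overflow of the counter, which is why the signed difference is used.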


Timers - reference

DEFINE_TIMER(name, function, expires, data)
  macro to allocate and initialize a timer

TIMER_INITIALIZER(function, expires, data)
  macro to initialize a timer at compile time

init_timer(timer)
  initialize a timer

init_timer_deferrable(timer)
  as init_timer, but do not cause a CPU to come out of idle just to service it

setup_timer(timer, function, data)
  (re-)initialize a timer


Timers - reference

void add_timer(struct timer_list *timer)
  start a timer

void add_timer_on(struct timer_list *timer, int cpu)
  start a timer on a particular CPU


Timers - reference

int del_timer(struct timer_list *timer)
  deactivate a timer
  returns whether it has deactivated a pending timer or not

int del_timer_sync(struct timer_list *timer)
  deactivate a timer and wait for the handler to finish

int try_to_del_timer_sync(struct timer_list *timer)
  try to deactivate a timer
  upon successful (ret >= 0) exit the timer is not queued and the handler is not running on any CPU


Timers - reference

int mod_timer(struct timer_list *timer, unsigned long expires)
  modify a timer's timeout
  timer: the pending timer to be modified
  expires: new timeout in jiffies
  returns whether it has modified a pending timer or not

int mod_timer_pending(struct timer_list *timer, unsigned long expires)
  modify a pending timer's timeout; do not re-activate and modify already deleted timers

int mod_timer_pinned(struct timer_list *timer, unsigned long expires)
  modify a timer's timeout and do not allow the timer to be migrated to a different CPU
  equivalent to: del_timer(timer); timer->expires = expires; add_timer(timer);


Timers - reference

int timer_pending(const struct timer_list *timer)
  tell whether a given timer is currently pending

void set_timer_slack(struct timer_list *timer, int slack_hz)
  set the allowed slack for a timer
  timer: the timer to be modified
  slack_hz: the amount of time (in jiffies) allowed for rounding


Timers

void timer_function(unsigned long timerdata)
{
        ...
        /* will run at interrupt time (on the same CPU which registered it)
           no access to user space
           current is meaningless
           no sleeping
           no kmalloc(... , GFP_KERNEL)
           no semaphores
           no schedule()
           no wait_event */
        ...
        add_timer(...); // if needed
}

struct timer_list T = TIMER_INITIALIZER(timer_function, jiffies+delay, timerdata);
add_timer(&T);

/* others:
   int timer_pending(const struct timer_list *timer);
   int mod_timer(struct timer_list *timer, unsigned long new_expires);
   int del_timer(struct timer_list *timer);
   int del_timer_sync(struct timer_list *timer); (may sleep) */


High resolution timers

When a (soft) timeout expires

a function is called

from interrupt context (“hard irq”)

no access to user space

sleeping is not allowed

asynchronous with the timer instantiation

one-shot timers

can auto-“rearm”

on the same cpu

cannot be interrupted by irq (irqs are disabled)

Time unit: ns

Used by sys_nanosleep (syscall)


High resolution timers

Fine resolution and accuracy

depending on system configuration and capabilities

[Figure: structures behind high resolution timers. Clockevents: next_event, set_next_event, min_delta_ns. timekeeper (private kernel variable): xtime (struct timespec, current time), clock (struct clocksource *, current clocksource), lock (seqlock_t). Clocksource: read. struct hrtimer (HR kernel timers): function, _softexpires (ktime_t, expiry time), state.]


Timers vs HRtimers

Timers

low overhead

low precision

granularity: jiffy

suitable for timeouts that guard against unlikely errors

very often a timer is canceled before expiration

low insert/cancel overhead is needed

High resolution timers

high overhead

high precision

granularity: ~ ns (depends on the clock sources)

suitable for precise time handling

User space fine-timing applications

Multimedia drivers


GPIO functions

int gpio_is_valid(int pin_idx);
  != 0 : pin is valid
  == 0 : pin is not valid

int gpio_request(int pin_idx, char *pin_label /* optional */);
  < 0 : error

void gpio_free(int pin_idx);

int gpio_cansleep(int pin_idx);
  != 0 : access to pin can sleep

int gpio_direction_input(int pin_idx);
  < 0 : error

int gpio_direction_output(int pin_idx, int pin_value);
  < 0 : error

void gpio_set_value(int pin_idx, int pin_value);

int gpio_get_value(int pin_idx);

int gpio_to_irq(int pin_idx);
  < 0 : error
  >= 0 : irq related to pin