A System-on-a-Chip Lock Cache with Task Preemption Support
By Bilge S. Akgul, Jaehwan Lee and Vincent J. Mooney
Code section where shared data between multiple
E.g., multiple readers and multiple writers
A lock is necessary to guarantee the consistency of
Lock delay: time between release and acquisition of a lock
Lock latency: time to acquire a lock in the absence of contention
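As a rough illustration of the two timing definitions above (commonly called lock delay and lock latency), the sketch below computes both from a small event trace. The trace format, the event names, the task names and the cycle values are all assumptions made up for this example; they are not measurements from the presentation.

```python
# Hypothetical trace of (clock cycle, task, event). All values are illustrative.
TRACE = [
    (0, "T1", "request"),
    (30, "T1", "acquire"),   # nobody else holds the lock, so no contention
    (50, "T2", "request"),   # T2 must wait while T1 holds the lock
    (200, "T1", "release"),
    (290, "T2", "acquire"),
]


def first_cycle(trace, task, event, not_before=0):
    """Cycle of the first matching event at or after `not_before`."""
    return next(c for c, t, e in trace
                if t == task and e == event and c >= not_before)


def lock_latency(trace, task):
    """Cycles from request to acquisition (meaningful only without contention)."""
    return first_cycle(trace, task, "acquire") - first_cycle(trace, task, "request")


def lock_delay(trace, releaser, acquirer):
    """Cycles from one task's release to the next task's acquisition."""
    released = first_cycle(trace, releaser, "release")
    return first_cycle(trace, acquirer, "acquire", released) - released


print(lock_latency(TRACE, "T1"))      # uncontended acquire by T1
print(lock_delay(TRACE, "T1", "T2"))  # hand-over from T1 to T2
```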
Spin (Anderson ’90)
Array based queuing (Anderson ’90)
MCS locks (Mellor-Crummey, Scott ’91)
LH and M locks (Ladin, Hagerston, Magnusson ’94)
QOLBY (Kagi ’99)
(Ramachandran ’96)
Memory consistency model
New cache design, extra cache states for locks
Figure: hardware/software architecture. Hardware: 4 MPC750s, SoC Lock Cache, shared memory, interface logic. Software: Atalanta-RTOS extension. Simulated with Seamless CVE.
Figure: task execution timelines on Processor 1 and Processor 2. Labels: Task 1: CS access; Task 2: try to access CS, busy-wait; interrupt, Task 3 preempts; Task 3: CS access; context switch and ISR; tasks execution time improvement.
Figure: Lock-wait table 1 and Lock-wait table 2. Rows: Lock 1 through Lock n. Cells are numbered up to 63, eight per row.
Figure: Lock_longCS flow on PE1 and PE2. Steps shown: Lock_longCS, Read_lock, remove task from ready table, return from Lock_longCS, execute long CS, UnLock, context switch, new task, execute ISR (interrupt handler).
Legend: execution without holding lock; holding lock; fail to acquire lock; release lock.
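The step names above (Lock_longCS, Read_lock, remove task from ready table, UnLock, ISR) suggest a flow that can be sketched in software. The toy model below is one possible reading, not the authors' implementation: the class names, the FIFO wait queue and the choice to hand the lock directly to the woken task are assumptions made for illustration only.

```python
from collections import deque


class LockCacheSketch:
    """Toy model of a single long-CS lock (hypothetical, not the real hardware)."""

    def __init__(self):
        self.owner = None
        self.waiters = deque()  # stands in for one row of a lock-wait table

    def read_lock(self, task):
        """Grant the lock if free; otherwise record the task as a waiter."""
        if self.owner is None:
            self.owner = task
            return True
        self.waiters.append(task)
        return False

    def unlock(self):
        """Release the lock and return the next waiter, if any (assumed FIFO)."""
        self.owner = self.waiters.popleft() if self.waiters else None
        return self.owner


class SchedulerSketch:
    """Toy RTOS side: a ready table plus a log of what happened."""

    def __init__(self, tasks):
        self.ready = list(tasks)
        self.log = []

    def lock_long_cs(self, lock, task):
        if lock.read_lock(task):
            self.log.append(f"{task}: execute long CS")
            return True
        # Failed to acquire: no busy-wait, the task leaves the ready table
        # so that a context switch can run a new task.
        self.ready.remove(task)
        self.log.append(f"{task}: removed from ready table, context switch")
        return False

    def unlock(self, lock, task):
        self.log.append(f"{task}: unlock")
        woken = lock.unlock()
        if woken is not None:
            # Models the interrupt and ISR that make the waiter runnable again.
            self.ready.append(woken)
            self.log.append(f"ISR: {woken} returned to ready table")


lock = LockCacheSketch()
sched = SchedulerSketch(["T1", "T2", "T3"])
sched.lock_long_cs(lock, "T1")  # granted
sched.lock_long_cs(lock, "T2")  # fails, T2 is descheduled instead of spinning
sched.unlock(lock, "T1")        # T2 becomes ready again
print("\n".join(sched.log))
```

The point of the sketch is the contrast with busy-waiting: a task that fails to get a long-CS lock gives up the processor, and it is made runnable again only when the lock is released.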
Client Server Shared Memory Client address space Server address space client local memory server local memory shared data
                 Lock Latency    Lock Delay      (clk cycles)
                 (clk cycles)    (clk cycles)
Without SoCLC    1200            47,264          36.9M
With SoCLC       908             23,590          29M
Speedup          1.32x           2.00x           1.27x
                 Lock Latency    Lock Delay
                 (clk cycles)    (clk cycles)
Without SoCLC    884             8936
With SoCLC       32              102
Speedup          27x             87.6x
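The speedup figures above are consistent with the ratio of the cycle count without SoCLC to the cycle count with SoCLC. The short check below recomputes them. The pairing of each value with a column is an assumption, chosen because it reproduces the listed speedups; the dictionary and function names are hypothetical.

```python
def speedup(without_soclc, with_soclc):
    """Ratio of cycles without the lock cache to cycles with it."""
    return without_soclc / with_soclc


# (without SoCLC, with SoCLC) in clock cycles, as listed above.
FIRST_TABLE = {
    "lock latency": (1200, 908),
    "lock delay": (47_264, 23_590),
    "third column": (36.9e6, 29e6),  # column label is missing in the source
}
SECOND_TABLE = {
    "lock latency": (884, 32),
    "lock delay": (8936, 102),
}

for title, table in (("first", FIRST_TABLE), ("second", SECOND_TABLE)):
    for name, (before, after) in table.items():
        print(f"{title} table, {name}: {speedup(before, after):.2f}x")
```

Note that 884 / 32 is 27.625, so the 27x in the second table appears to be truncated rather than rounded.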
Short CS locks (S)    Long CS locks (L)    Total # locks (T)    Total area (gates)
16                    16                   32                   2,734
16                    32                   48                   3,586
16                    64                   80                   5,288
16                    128                  144                  9,027
32                    16                   48                   3,454
32                    32                   64                   4,306
32                    64                   96                   6,008
32                    128                  160                  9,747
64                    16                   80                   4,881
64                    32                   96                   5,733
64                    64                   128                  7,435
64                    128                  192                  11,174
128                   16                   144                  8,163
128                   32                   160                  9,015
128                   64                   192                  10,717
128                   128                  256                  14,456
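The area figures above can be explored with a small lookup. The sketch below encodes the listed gate counts keyed by (S, L), checks that T = S + L for each listed T, and computes the extra gates needed when the number of long CS locks grows. The names `AREA`, `TOTAL_LOCKS` and `extra_gates` are hypothetical helpers for this example.

```python
# Total area in gates, keyed by (short CS locks S, long CS locks L).
AREA = {
    (16, 16): 2734, (16, 32): 3586, (16, 64): 5288, (16, 128): 9027,
    (32, 16): 3454, (32, 32): 4306, (32, 64): 6008, (32, 128): 9747,
    (64, 16): 4881, (64, 32): 5733, (64, 64): 7435, (64, 128): 11174,
    (128, 16): 8163, (128, 32): 9015, (128, 64): 10717, (128, 128): 14456,
}

# Total number of locks T as listed, keyed the same way.
TOTAL_LOCKS = {
    (16, 16): 32, (16, 32): 48, (16, 64): 80, (16, 128): 144,
    (32, 16): 48, (32, 32): 64, (32, 64): 96, (32, 128): 160,
    (64, 16): 80, (64, 32): 96, (64, 64): 128, (64, 128): 192,
    (128, 16): 144, (128, 32): 160, (128, 64): 192, (128, 128): 256,
}


def extra_gates(s, l_from, l_to):
    """Additional gates when long CS locks grow from l_from to l_to at fixed S."""
    return AREA[(s, l_to)] - AREA[(s, l_from)]


# Every listed T equals S + L.
print(all(t == s + l for (s, l), t in TOTAL_LOCKS.items()))

# Cost of going from 16 to 32 long CS locks, for each S in the table.
print({s: extra_gates(s, 16, 32) for s in (16, 32, 64, 128)})
```

In these figures the increment for adding long CS locks is the same for every value of S, which suggests the two lock types contribute to area independently.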