Using TCP/IP Traffic shaping to achieve iSCSI service predictability
Paper presentation Jarle Bjørgeengen
University of Oslo / USIT
November 11, 2010
Outline
About resource sharing in storage devices
Lab setup / job setup
Experiment illustrating the problem
One half of the solution: the throttle
Live demo
The throttle
Part two of the solution: the controller
How the controller works
Conclusion and future work
Figure (diagram labels): consumers, QoS bridges, SAN, virtual disks, centralized storage pool, shared physical resources.
Free competition causes unpredictable I/O performance for any given consumer.
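Figure: Lab setup (diagram labels). HP SC10 enclosure with 10 x 36GB 10k disks; volume group vg_perc; logical volumes lv_b2, lv_b3, lv_b4, lv_b5 striped across the 10 disks; iSCSI target (iet) exporting iqn.iscsilab:perc_b2, iqn.iscsilab:perc_b3, iqn.iscsilab:perc_b4, iqn.iscsilab:perc_b5; TCP connections through the TCP/IP layer; block layer on both sides; initiators b2, b3, b4, b5 with block device /dev/iscsi_0; other hosts shown: b, bm, b1, Argus.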
Figure: Job setup (diagram labels). Hosts b, bm, b2, b3, b4, b5; logical volumes lv_b2, lv_b3, lv_b4, lv_b5. One job: random read, rate = 256 kB/s. Three jobs: sequential write at full speed.
Long response times adversely affect application service availability.
Figure: Wait time (ms) versus time (s). Series: no interference; 1 thread (1 machine); 3 threads (3 machines); 12 threads (3 machines).
Figure: Timeline without delay. Initiator and target exchange SYN, SYN+ACK and ACK, followed by repeated Write/ACK pairs over time.
Figure: Timeline with delay. The same exchange between initiator and target, with a throttling delay added to the Write/ACK sequence.
Figure: Time to read 200 MB of data (s) and time to write 200 MB of data (s) versus introduced delay (ms), for delays from 0.6 ms to 9.6 ms.
Write rate: 15 MB/s down to 2.5 MB/s. Read rate: 22 MB/s down to 5 MB/s.
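The rates quoted here and the transfer times on the plot axes are related by simple division. The sketch below shows that arithmetic, assuming the 200 MB transfer size from the axis labels and the same meaning of MB in both places; the function name is illustrative and not from the presentation.

```python
def seconds_for(size_mb: float, rate_mb_per_s: float) -> float:
    """Time needed to move size_mb megabytes at a constant rate."""
    if rate_mb_per_s <= 0:
        raise ValueError("rate must be positive")
    return size_mb / rate_mb_per_s


# Rates taken from the slide; 200 MB is the size named on the plot axes.
for label, rate in [("write, fastest", 15.0), ("write, slowest", 2.5),
                    ("read, fastest", 22.0), ("read, slowest", 5.0)]:
    print(f"{label}: {seconds_for(200.0, rate):.1f} s")
```

The slowest rates give 80 s for writes and 40 s for reads, which matches the upper ends of the two time axes.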
Need to operate on sets of consumers (throttlable={10.0.0.243,10.0.0.244})
Ipset: One rule to match them all
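# Create an IP set covering the lab network and add the throttlable consumers
ipset -N $throttlable ipmap --network 10.0.0.0/24
ipset -A $throttlable 10.0.0.243
ipset -A $throttlable 10.0.0.244
# Mark packets destined for any member of the set.
# Shown as on the slide: the table, chain and "-m set" module options are omitted there.
iptables --match-set $throttlable dst -j MARK --set-mark $mark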
The mark is a step in the range of available packet delays
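One way to picture the relation between a mark and a delay is a simple lookup from step number to milliseconds. The sketch below is only an illustration: the base of 0.6 ms, the step of 1.0 ms and the top mark of 9 are assumptions chosen to reproduce the tick values on the delay axis of the plot, and the function name is hypothetical. The actual step layout used in the experiments is not stated on the slides.

```python
def mark_to_delay_ms(mark: int, base_ms: float = 0.6,
                     step_ms: float = 1.0, max_mark: int = 9) -> float:
    """Map a packet mark (a step index) to a delay in milliseconds.

    base_ms, step_ms and max_mark are assumed values, not taken from
    the presentation.
    """
    if not 0 <= mark <= max_mark:
        raise ValueError(f"mark must be in 0..{max_mark}")
    # Round to avoid floating point noise in the printed value.
    return round(base_ms + mark * step_ms, 3)


for m in (0, 4, 9):
    print(m, mark_to_delay_ms(m), "ms")
```

With such a mapping, raising or lowering the mark by one moves every consumer in the set to the next delay class.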
Manual throttling and QoS specification
An automatic QoS policy and automated throttling
Figure: Block diagram of a PID controller. Created by SilverStar(at)en.wikipedia. Licensed under the terms of Creative Commons Attribution 2.5 Generic.
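Figure: Flowchart of one controller iteration (Start to Stop).
Calculate U_p, U_i, U_d:
$$U_p = K_p \, e_k$$
$$U_i = U_{i,k-1} + \frac{T_s K_p}{T_i} \, e_k$$
$$U_d = K_p T_d \, \frac{e_k - e_{k-1}}{T_s}$$
If 0 < U_i < U_kmax, keep U_i. If U_i < 0, set U_i = 0. If U_i > U_kmax, set U_i = U_{i,k-1}.
$$U_k = U_p + U_i + U_d$$
If 0 < U_k < U_kmax, keep U_k. If U_k < 0, set U_k = 0. If U_k > U_kmax, set U_k = U_kmax.
mark = int(ceil(Uk))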
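The flowchart can be read as one function call per sample. The following is a minimal sketch of that reading, not the author's pid_reg.pl: the names PidState and pid_step are hypothetical, and the sign convention for the error (positive when the measured wait time exceeds the threshold) is an assumption.

```python
import math
from dataclasses import dataclass


@dataclass
class PidState:
    ui_prev: float = 0.0  # integral term from the previous sample
    e_prev: float = 0.0   # error from the previous sample


def pid_step(e: float, state: PidState, kp: float, ti: float,
             td: float, ts: float, uk_max: float) -> int:
    """One controller iteration; returns the packet mark."""
    up = kp * e
    ui = state.ui_prev + (ts * kp / ti) * e
    ud = kp * td * (e - state.e_prev) / ts

    # Limit the integral term as in the flowchart: floor at zero,
    # and hold the previous value when it would exceed the maximum.
    if ui < 0:
        ui = 0.0
    elif ui > uk_max:
        ui = state.ui_prev

    # Clamp the total output to the range of available throttle steps.
    uk = min(max(up + ui + ud, 0.0), uk_max)

    state.ui_prev = ui
    state.e_prev = e
    return int(math.ceil(uk))


state = PidState()
print(pid_step(5.0, state, kp=0.5, ti=10.0, td=0.0, ts=1.0, uk_max=20.0))
```

Holding the integral term at its previous value when it would exceed the maximum keeps it from growing while the output is already saturated, so the controller can back off quickly once the error changes sign.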
Figure: Architecture of the prototype (diagram labels; legend: files, shared memory, processes, dependency).
set_maintaner.pl: reads /proc/net/iet/sessions and /proc/net/iet/volumes, runs the lvs command and reads its output, creates ISCSIMAP, creates IP-sets and maintains their members.
perf_maintainer.pl: reads /proc/diskstats; PDATA (saturation indicators).
pid_reg.pl: Spawn($resource) of pid_threads, which read their inputs and set the throttles.
perf_server.pl: CMEM (throttle values), gnuplot.
The packet delay throttle is very efficient.
It fully covers the throttling need for iSCSI, and likely for other TCP-based storage networks too.
The modified PID controller consistently keeps response time low despite rapidly changing load interference.
The concept is widely applicable.
Figure (diagram labels): iSCSI disk array, Ethernet switch, QoS bridges, consumers, resource/consumer maps, virtual disk latencies, array-specific plugin, SNMPGET.
Packet delay throttle with other algorithms
PID controller with other throttles
Negligible overhead introduced by TC filters.
Differences were measured 20 times; a t-test at 99% confidence shows a worst-case overhead of 0.4% for reads and 1.7% for writes.
Figure: Small job average wait time (ms, left axis) and aggregated interference throughput (kB/s, right axis) versus time (s). Marked intervals: throttling period with 4.6 ms delay; throttling period with 9.6 ms delay.
Figure: Average wait time (ms) versus time (s). Series: no regulation; 20 ms threshold; 15 ms threshold; 10 ms threshold.
Figure: Aggregate write rate (kB/s) versus time (s). Series: no regulation; 20 ms threshold (smoothed); 15 ms threshold (smoothed); 10 ms threshold (smoothed).
Figure: vg_aic read wait time with automatic regulation, thresh=15ms. Series versus time (s): read wait time (ms); packet delay introduced to writers (ms); aggregated write rate (kB/s).
Figure: Read rate (kB/s) versus time (s) for b2, b3, b4, b5.
Figure: Write rate (kB/s) versus time (s) for b2, b3, b4, b5.
Figure: Wait time (ms) versus time (s). Red: Exponential Weighted Moving Average (EWMA). Green: moving median.
EWMA, also called a low pass filter:
$$L(t) = \alpha \, l(t) + (1 - \alpha) \, L(t-1)$$
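Continuous PID controller (proportional, integral and derivative terms):
$$u(t) = K_p \, e(t) + \frac{K_p}{T_i} \int_0^t e(\tau) \, d\tau + K_p T_d \, e'(t)$$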
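Discrete PID controller, incremental form (T is the sampling interval):
$$u_k = u_{k-1} + K_p \left(1 + \frac{T}{T_i}\right) e_k - K_p \, e_{k-1} + \frac{K_p T_d}{T} \left(e_k - 2 e_{k-1} + e_{k-2}\right)$$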
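Discrete PID controller, with the proportional, integral and derivative terms kept separate (T is the sampling interval):
$$u_k = K_p \, e_k + u_{i(k-1)} + \frac{K_p T}{T_i} \, e_k + \frac{K_p T_d}{T} \left(e_k - e_{k-1}\right)$$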
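The two discrete forms describe the same controller as long as no limits are applied to the integral term or the output. The sketch below checks this numerically under assumed zero initial conditions (previous errors, previous output and integral term all zero); the function names and the sample values are illustrative only.

```python
import math


def separate_terms(errors, kp, ti, td, t):
    """u_k = Kp*e_k + [u_i(k-1) + Kp*T/Ti*e_k] + Kp*Td/T*(e_k - e_(k-1))."""
    out, ui, e_prev = [], 0.0, 0.0
    for e in errors:
        ui += kp * t / ti * e          # running integral term
        out.append(kp * e + ui + kp * td / t * (e - e_prev))
        e_prev = e
    return out


def incremental(errors, kp, ti, td, t):
    """u_k = u_(k-1) + Kp*(1+T/Ti)*e_k - Kp*e_(k-1) + Kp*Td/T*(e_k - 2e_(k-1) + e_(k-2))."""
    out, u_prev, e1, e2 = [], 0.0, 0.0, 0.0
    for e in errors:
        u = (u_prev + kp * (1 + t / ti) * e - kp * e1
             + kp * td / t * (e - 2 * e1 + e2))
        out.append(u)
        u_prev, e2, e1 = u, e1, e
    return out


errs = [3.0, -1.0, 4.0, 0.5, -2.0]  # made-up error samples
a = separate_terms(errs, kp=0.8, ti=5.0, td=0.2, t=1.0)
b = incremental(errs, kp=0.8, ti=5.0, td=0.2, t=1.0)
print(all(math.isclose(x, y, abs_tol=1e-9) for x, y in zip(a, b)))
```

Once the integral term or the output is clamped, the forms no longer coincide, which is one reason to keep the terms separate when limits are needed.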