iSCSI SANs Don’t Have To Suck
Derek J. Balling Data Center Manager derekb@answers.com
1 Thursday, November 11, 2010
What is iSCSI?
iSCSI is a network-based block-level disk protocol
Essentially SCSI commands stuffed into the payload
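To make "SCSI commands stuffed into the payload" a little more concrete, here is a minimal sketch that packs a SCSI READ(10) command into the 48-byte Basic Header Segment of an iSCSI SCSI Command PDU. The field layout follows the iSCSI specification (RFC 7143) as understood here; the function names, task tag, and sequence numbers are hypothetical, and a real initiator also needs login, digests, and data segments, none of which are shown.

```python
import struct

def read10_cdb(lba: int, blocks: int) -> bytes:
    """Build a 10-byte SCSI READ(10) CDB: opcode 0x28, 32-bit LBA, 16-bit block count."""
    return struct.pack(">BBIBHB", 0x28, 0, lba, 0, blocks, 0)

def scsi_command_pdu(cdb: bytes, lun: bytes, task_tag: int,
                     expected_len: int, cmd_sn: int, exp_stat_sn: int) -> bytes:
    """Pack the 48-byte header of an iSCSI SCSI Command PDU for a read."""
    opcode = 0x01          # SCSI Command
    flags = 0x80 | 0x40    # F (final) and R (read) bits; task attribute left at 0
    return struct.pack(
        ">BB2xB3s8sIIII16s",
        opcode, flags,
        0,                 # TotalAHSLength: no additional header segments
        bytes(3),          # DataSegmentLength: a read carries no outbound data
        lun,               # 8-byte LUN field, passed in already encoded
        task_tag,          # Initiator Task Tag (hypothetical value)
        expected_len,      # Expected Data Transfer Length in bytes
        cmd_sn, exp_stat_sn,
        cdb,               # the SCSI command itself, zero-padded to 16 bytes
    )

# Read 8 blocks starting at LBA 2048, assuming 512-byte blocks.
pdu = scsi_command_pdu(read10_cdb(2048, 8), bytes(8), 1, 8 * 512, 1, 1)
print(len(pdu), pdu.hex())
```

The header is then written to an ordinary TCP connection, which is why anything that disturbs the Ethernet path underneath looks to the host like a flaky disk cable.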
super-short (millisecond) interruptions, just as conventional SCSI disks might be problematic if the cable between the controller and disks didn’t have 100% reliability
performance (latency) and interruptions
iSCSI pain and suffering
zero outage or packet-loss.
apply for any normal data LAN.
Really Robust Ethernet Network”, but it just doesn’t capture the level of effect this has on iSCSI
actually put it all together
and, if it needs access to the SAN, two for the “SAN”
redundancy
active/passive failover
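As a rough illustration of active/passive failover on a server with one NIC on each side, here is a minimal sketch. The class and interface names are hypothetical, and the policy (prefer the primary whenever its link is up) is an assumption, not a description of any particular bonding driver.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Nic:
    name: str
    side: str            # "A" or "B"
    link_up: bool = True

@dataclass
class ActivePassivePair:
    primary: Nic
    backup: Nic

    def active(self) -> Optional[Nic]:
        """Return the NIC that should carry traffic, or None if both links are down."""
        if self.primary.link_up:
            return self.primary
        if self.backup.link_up:
            return self.backup
        return None

san = ActivePassivePair(Nic("san0", "A"), Nic("san1", "B"))
print(san.active().name)        # primary carries traffic while its link is up
san.primary.link_up = False     # server sees its "A" side link drop
print(san.active().name)        # traffic moves to the "B" side NIC
```

Note that this kind of pair only fails over when the server sees its own link go down; a failure further upstream that leaves the local link up is invisible to it.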
[Diagram: 2U Server, Blade Svr, Cab Switches “A” and “B”, Blade Switches 1 (A), 2 (B), 3 (A), 4 (B), Core A, Core B, SAN]
network and SAN
VLANs, mostly on the data-network side, but one VLAN for the SAN traffic
the enclosure, so there are two “A” side switches and two “B” side switches. The only difference is which port VLANs are mapped to (data-network
Core switches
BladeSW2, and BladeSW3/BladeSW4 are “inactive” via Spanning Tree
to communicate with each other to ensure that there are no “loops” in the switching fabric
disable to make the loop go away
which “cross the A/B divide” since that’s what causes the actual loop.
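The loop-removal idea can be sketched as a graph problem: pick a root switch, keep a loop-free tree of links, and treat every leftover link as one to disable. This is only the graph intuition, not the real protocol (which elects a root and exchanges its own packets between switches); the four-switch topology below is a hypothetical example.

```python
from collections import deque

def blocked_links(links, root):
    """Return the links left out of a breadth-first spanning tree rooted at `root`."""
    neighbors = {}
    for a, b in links:
        neighbors.setdefault(a, []).append(b)
        neighbors.setdefault(b, []).append(a)
    seen = {root}
    tree = set()
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for peer in neighbors.get(node, []):
            if peer not in seen:
                seen.add(peer)
                tree.add(frozenset((node, peer)))
                queue.append(peer)
    # Anything not in the tree would close a loop, so it gets blocked.
    return [link for link in links if frozenset(link) not in tree]

# Hypothetical fabric: two cores, two blade switches, and one cross-link.
links = [
    ("CoreA", "CoreB"),
    ("CoreA", "BladeSW1"),
    ("CoreB", "BladeSW2"),
    ("BladeSW1", "BladeSW2"),   # crosses the A/B divide and closes the loop
]
print(blocked_links(links, "CoreA"))
```

With the cross-link removed from the list, nothing is blocked, because there is no loop left to break.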
everything it needs
priority” (failover) links stay down until they are needed
switch is removed, every switch on the fabric does a quick re-evaluation of what the network looks like
they’re doing this, other than their own STP packets
while the switches refuse to send its packets
being installed during roll-out
Protocol and enabling instead “Uplink Failure Detection”
let the servers just immediately notice the outage and direct traffic directly to the “B” side network equipment
Juniper and Cisco appear to also support it on some of their product line.
Monitor” (LTM) and “Link To Disable” (LTD)
goes dark, it immediately disables all ports in the LTD group
group, the blades in the LTD group
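A toy model of the LTM/LTD behaviour, to show the mechanism rather than any vendor's implementation. Two things here are assumptions: the group trips only when every monitored uplink is dark, and the downstream ports come back as soon as any uplink returns. Port names are hypothetical.

```python
class UplinkFailureDetection:
    """One UFD group: LTM (monitored uplink) ports and LTD (to-disable) ports."""

    def __init__(self, ltm_ports, ltd_ports):
        self.link_up = {port: True for port in ltm_ports}   # uplink link state
        self.enabled = {port: True for port in ltd_ports}   # server-facing port state

    def set_uplink(self, port, up):
        """Record an uplink state change and update the LTD ports to match."""
        self.link_up[port] = up
        uplinks_alive = any(self.link_up.values())  # assumed policy, see lead-in
        for ltd_port in self.enabled:
            self.enabled[ltd_port] = uplinks_alive

ufd = UplinkFailureDetection(ltm_ports=["ext1", "ext2"], ltd_ports=["int1", "int2"])
ufd.set_uplink("ext1", False)
print(ufd.enabled)   # one uplink still alive, server ports stay up
ufd.set_uplink("ext2", False)
print(ufd.enabled)   # all uplinks dark, server ports are disabled
```

Disabling the server-facing ports is what lets the servers see a link-down event on their own NIC and fail over, instead of sending traffic into a switch that has nowhere to forward it.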
[Diagrams, two slides: 2U Server, Blade Svr, Cab Switches “A” and “B”, Blade Switches 1 (A), 2 (B), 3 (A), 4 (B), Core A, Core B, SAN]
Switches no longer “participated” in the STP negotiation
didn’t trigger an STP “event”, meaning iSCSI didn’t see as many problems
cabinet switches from time to time, and they don’t have Uplink Failure Detection, and any network maintenance is extremely problematic
that our servers in standard “pizza-box” cabinets can have redundant upstream links, without each of them consuming expensive core switch ports
consumers
few 2Us that need to connect to it, we’ll jack them into the new “SAN Cores”
just fine
wrong, etc., but it seems to have been the right solution for us
connected to the “A” and “B” side “SAN Core” switches
interconnected
respective core
hardware in our network environment safely
can swap it out for new hardware
cables to some other similarly isolated new piece
Step By Step Walk-Through
[Diagram: 2U Server, Blade Svr, Cab Switches “A” and “B”, Blade Switches 1 (A), 2 (B), 3 (A), 4 (B), Core A, Core B, SAN]
Disable All SAN “B” Sides and Disconnect
[Diagram: 2U Server, Blade Svr, Cab Switches “A” and “B”, Blade Switches 1 (A), 2 (B), 3 (A), 4 (B), Core A, Core B, SAN]
Install New “B” Side “SANCore” Switch
[Diagram: 2U Server, Blade Svr, Cab Switches “A” and “B”, Blade Switches 1 (A), 2 (B), 3 (A), 4 (B), Core A, Core B, SAN, SanCore B]
Connect Temp Cable From “A” Core to “B” SanCore
[Diagram: 2U Server, Blade Svr, Cab Switches “A” and “B”, Blade Switches 1 (A), 2 (B), 3 (A), 4 (B), Core A, Core B, SAN, SanCore B]
Connect “B” Side SAN Equipment to SanCore B
[Diagram: 2U Server, Blade Svr, Cab Switches “A” and “B”, Blade Switches 1 (A), 2 (B), 3 (A), 4 (B), Core A, Core B, SAN, SanCore B]
blade consumers, as well as the SAN modules themselves
the existing “B” side Core switch
side “Core” to the “B” side “SANCore”
“Core” to the “B” side “SANCore”.
environment
addresses that are presenting themselves on the “B” side hardware
have its “A” and “B” side NICs on networks that can’t see each other, especially when your systems all expect that they can do so.
with “B” side NICs connected to their own ‘independent’ network
little while on this hybrid network and let things settle down
now, because you don’t want to separate live “A” and “B” networks ever. Badness and pain will ensue
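One way to sanity-check a planned cabling state before touching anything is to model it as a list of links and confirm that a host's “A” and “B” side NICs can still reach each other. This is a minimal sketch; the node names and the wiring are a hypothetical simplification of the hybrid state, not the exact cabling from the diagrams.

```python
def reachable(links, start):
    """Return every node reachable from `start` over undirected `links`."""
    neighbors = {}
    for a, b in links:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    seen, stack = {start}, [start]
    while stack:
        for peer in neighbors.get(stack.pop(), ()):
            if peer not in seen:
                seen.add(peer)
                stack.append(peer)
    return seen

def sides_can_see_each_other(links, nic_a, nic_b):
    """True if the two NICs sit on the same connected network."""
    return nic_b in reachable(links, nic_a)

# Hypothetical hybrid state: "A" side still on the old core, "B" side on the new SanCore.
hybrid = [
    ("blade:nicA", "BladeSW3"), ("BladeSW3", "CoreA"),
    ("blade:nicB", "BladeSW4"), ("BladeSW4", "SanCoreB"),
    ("CoreA", "SanCoreB"),      # the temporary cable
]
print(sides_can_see_each_other(hybrid, "blade:nicA", "blade:nicB"))
print(sides_can_see_each_other(hybrid[:-1], "blade:nicA", "blade:nicB"))  # temp cable pulled
```

Running the same check against every intermediate step of a plan is a cheap way to catch a step that would leave the two sides on networks that can't see each other.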
Lather, Rinse, Repeat
[Diagram: 2U Server, Blade Svr, Cab Switches “A” and “B”, Blade Switches 1 (A), 2 (B), 3 (A), 4 (B), Core A, Core B, SAN, SanCore B]
Disable and Disconnect “A” Side SAN Ports
[Diagram: 2U Server, Blade Svr, Cab Switches “A” and “B”, Blade Switches 1 (A), 2 (B), 3 (A), 4 (B), Core A, Core B, SAN, SanCore B]
Install New “SanCore A” Switch
[Diagram: 2U Server, Blade Svr, Cab Switches “A” and “B”, Blade Switches 1 (A), 2 (B), 3 (A), 4 (B), Core A, Core B, SAN, SanCore B, SanCore A]
Connect All “A” Side Cables To The New SanCore A
[Diagram: 2U Server, Blade Svr, Cab Switches “A” and “B”, Blade Switches 1 (A), 2 (B), 3 (A), 4 (B), Core A, Core B, SAN, SanCore B, SanCore A]
Remove The Temporary Cable
[Diagram: 2U Server, Blade Svr, Cab Switches “A” and “B”, Blade Switches 1 (A), 2 (B), 3 (A), 4 (B), Core A, Core B, SAN, SanCore B, SanCore A]
and SAN modules
“B” side infrastructure
from the Core switch, install the new “A” side “SAN Core”, and move all their cables to the new switch
network, it’s time to cut the cord)
consumers
didn’t even notice
Your Entire Network After The Change
[Diagram: 2U Server, Blade Svr, Cab Switches “A” and “B”, Blade Switches 1 (A), 2 (B), 3 (A), 4 (B), Core A, Core B, SAN, SanCore B, SanCore A]
Just The SAN-Related Components
[Diagram: 2U Server, Blade Svr, Blade Switches 3 (A) and 4 (B), SAN, SanCore B, SanCore A]
Just The LAN-Related Components
[Diagram: 2U Server, Blade Svr, Cab Switches “A” and “B”, Blade Switches 1 (A) and 2 (B), Core A, Core B]
switches without missing a beat, you’ll be tempted to do it from time to time
then - replaced the core switches twice, replaced the SAN Core switches twice
including every link to every switch (representative samples are fine, obviously)
represent your changes
“what path does this device now use to get from A to B”?
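The "what path does this device now use" question can be asked of a written-down link list before and after a planned change. The sketch below finds one shortest hop-by-hop path with a breadth-first search. The topologies are hypothetical, and fewest hops is only an approximation of what the live network will actually choose.

```python
from collections import deque

def path(links, src, dst):
    """Return one shortest hop-by-hop path from src to dst, or None if unreachable."""
    neighbors = {}
    for a, b in links:
        neighbors.setdefault(a, []).append(b)
        neighbors.setdefault(b, []).append(a)
    previous = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            hops = []
            while node is not None:      # walk back to the source
                hops.append(node)
                node = previous[node]
            return hops[::-1]
        for peer in neighbors.get(node, []):
            if peer not in previous:
                previous[peer] = node
                queue.append(peer)
    return None

# Hypothetical "A" side path to the SAN, before and after the change.
before = [("Blade", "BladeSW3"), ("BladeSW3", "CoreA"), ("CoreA", "SAN")]
after = [("Blade", "BladeSW3"), ("BladeSW3", "SanCoreA"), ("SanCoreA", "SAN")]
print(path(before, "Blade", "SAN"))
print(path(after, "Blade", "SAN"))
```

A `None` result for any device in a proposed diagram is a sign that the drawing, or the plan, is missing a link.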
you’ll start to have dreams (nightmares) about it
exactly as you have written it down already!
you probably had ever read about redundant networking
how to do this sort of thing forever, but as sysadmins, we don’t mess about with it that often
largish flat network, used only for iSCSI, probably isn’t a big problem for a lot of sites
like you might a religious text. If you say to yourself “oh, I can merge steps 17 and 19, and do 18 after”, it’s likely that you’re wrong.
whiteboard, not on the fly.