Transport layer
Transport services and protocols
Provide communication between application processes running on different hosts
Protocols run in the end system OS
Sender transport
  Breaks application messages into segments, passes them to the network layer
Receiver transport
  Reassembles segments into messages, passes them to the destination application
Two main transport protocols
  TCP and UDP
[Figure: two end hosts, each with an application / transport / network / data link / physical protocol stack]
Transport Layer Functions
Demux to upper layer
  Delivering data to the correct application process
Connection setup
  Providing a connection abstraction over a connectionless substrate
Delivery semantics
  Reliable or unreliable
  Ordered or unordered
  Unicast, multicast, anycast
Flow control
  Prevent overflow of receiver buffers
Congestion control
  Prevent overflow of network buffers
  Avoid packet loss and packet delay
Security
1. Demux to upper layer (both TCP & UDP)
Which process gets this request?
Done via a 16-bit source port and a 16-bit destination port in both UDP and TCP
[Figure: demultiplexing at each layer. The network's type field selects IP, IP's protocol field selects TCP or UDP, and the TCP/UDP port number selects the application (FTP, HTTP, DNS, NTP)]
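Port-based demultiplexing can be sketched as a lookup table keyed by protocol and destination port. This is a minimal illustration, not how any particular OS implements it; the class name PortDemux and its methods are hypothetical.

```python
from typing import Callable, Dict, Optional, Tuple

Handler = Callable[[bytes], str]


class PortDemux:
    """Toy demultiplexer (hypothetical): maps (protocol, port) to a handler."""

    def __init__(self) -> None:
        self._table: Dict[Tuple[str, int], Handler] = {}

    def bind(self, proto: str, port: int, handler: Handler) -> None:
        # Ports are 16-bit values, so only 0..65535 are valid.
        if not 0 <= port <= 0xFFFF:
            raise ValueError("port must fit in 16 bits")
        key = (proto, port)
        if key in self._table:
            raise OSError("address already in use")
        self._table[key] = handler

    def deliver(self, proto: str, dst_port: int, payload: bytes) -> Optional[str]:
        # Look up the process bound to this protocol and destination port.
        handler = self._table.get((proto, dst_port))
        if handler is None:
            return None  # nobody is listening on this port
        return handler(payload)
```

Because the key includes the protocol, TCP port 53 and UDP port 53 are separate entries, which matches how the same service name can appear twice in a services file.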
Internet services layered on top of TCP/UDP
What uses TCP?
  HTTP (Web pre-2016)
  SMTP (E-mail transmission)
  IMAP, POP (E-mail access)
What uses (mainly) UDP?
  DNS
  NTP (network time protocol)
  Highly interactive on-line games (First-Person Shooters)
Many protocols can use both
Internet services layered on top of TCP/UDP
Protocols and their ports are listed in /etc/services on *nix or C:\WIN*\system32\drivers\etc\services on Windows
IANA
  http://www.iana.org/assignments/port-numbers
echo            7/tcp
echo            7/udp
ssh             22/tcp     # SSH Remote Login Protocol
ssh             22/udp
telnet          23/tcp
smtp            25/tcp
domain          53/tcp     # Domain Name Server
domain          53/udp
http            80/tcp     # WorldWideWeb HTTP
http            80/udp     # HyperText Transfer Protocol
netbios-ssn     139/tcp    # NETBIOS session service
netbios-ssn     139/udp
bgp             179/tcp    # Border Gateway Protocol
bgp             179/udp
https           443/tcp    # http protocol over TLS/SSL
https           443/udp
microsoft-ds    445/tcp    # Microsoft Naked CIFS
microsoft-ds    445/udp
UDP: User Datagram Protocol
Barebones transport protocol
UDP and transport layer functions
  Demux
  Connection setup (none)
    Connectionless
    No handshaking between sender and receiver
    Minimal state
  Delivery semantics
    Unreliable, unordered, mostly unicast (multicast no longer supported)
  No flow control support
  No congestion control support
  No security support
UDP segment format (32 bits wide)
  source port #, dest port #
  length, checksum
    Length is in bytes of the UDP segment, including the header
  Application data (message)
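The four 16-bit header fields can be packed and unpacked with Python's struct module. This is a sketch of the header layout only; the function names are made up for illustration, and the checksum is left at 0 rather than computed.

```python
import struct
from typing import Dict, Union

# Four unsigned 16-bit fields in network (big-endian) byte order:
# source port, destination port, length, checksum.
UDP_HEADER = struct.Struct("!HHHH")


def build_udp_segment(src_port: int, dst_port: int, payload: bytes,
                      checksum: int = 0) -> bytes:
    # The length field covers the 8-byte header plus the payload.
    length = UDP_HEADER.size + len(payload)
    return UDP_HEADER.pack(src_port, dst_port, length, checksum) + payload


def parse_udp_segment(segment: bytes) -> Dict[str, Union[int, bytes]]:
    src_port, dst_port, length, checksum = UDP_HEADER.unpack_from(segment)
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "length": length,
        "checksum": checksum,
        # Use the length field to find where the payload ends.
        "payload": segment[UDP_HEADER.size:length],
    }
```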
UDP: more
Often used for streaming multimedia apps
  Loss tolerant
  Rate sensitive
TCP: Overview
Flow and congestion controlled
  Pipelined operation to control the size of the "pipe" (i.e. bandwidth)
  Pipeline of packets sized by MSS (maximum segment size)
  Control algorithms to keep the sender from overwhelming the receiver or the network
Connection-oriented
  3-way handshake to initialize sender/receiver and provide connection integrity
Delivery semantics
  Reliable, in-order byte stream
  Error detection, correction
    Retransmission
    Duplicate detection
  Unicast (point-to-point)
    One sender, one receiver
  Full duplex (bi-directional flow)
[Figure: the application writes data through the socket door into the TCP send buffer; segments carry it to the TCP receive buffer, where the receiving application reads it through its socket door]
TCP segment structure
32 bits wide
  source port #, dest port #
  sequence number
  acknowledgement number
  head len, not used, flags (U A P R S F), Receive window
  checksum, Urg data pnter
  Options (variable length)
  application data (variable length)
Sequence and acknowledgement numbers count by bytes of data (not segments!)
URG: urgent data (generally not used)
ACK: ACK # valid
PSH: push data now (generally not used)
RST, SYN, FIN: connection estab (setup, teardown commands)
Receive window: # bytes rcvr willing to accept
Checksum: Internet checksum (as in UDP)
TCP
TCP creates a reliable data transfer service on top of IP's unreliable service via
  Checksum
  Sequence numbers
  Acknowledgments
  Retransmissions
  Rate limits on sender
What if the Data is Corrupted?
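Problem: Data Corruption
Solution: Add a checksum
[Figure: "GET index.html" crosses the Internet and arrives as "GET windex.html". Each group of data values travels with its sum (1,2,3 with 6; 6,7,8 with 21; 0,9 with 9); the group 4,5 arrives with 7, so it is marked X as corrupt]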
Segment integrity via checksum
Checksum included in header by sender
  Generated by treating the data in the packet as numbers and adding them all up
Receiver checks checksum
  Performs the same operation as the sender and compares against the checksum field
  Corruption detected when there is no match
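The "add the numbers up" idea can be made concrete with the 16-bit one's-complement Internet checksum. The sketch below runs over a bare byte string; real TCP and UDP also cover a pseudo-header, which is omitted here. The function name is illustrative.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement checksum over data (no pseudo-header)."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        # Treat each pair of bytes as one big-endian 16-bit number.
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:
        # Fold any carry out of the top 16 bits back into the sum.
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF
```

A receiver can run the same function over the data followed by the transmitted checksum; a result of 0 means the sums agree, and any other value signals corruption.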
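What if the Data is Out of Order?
Problem: Out of Order
Solution: Add Sequence Numbers
[Figure: "GET index.html" is split into the pieces "GET ", "inde", "x.th", "ml". They arrive out of order and read "GET x.thindeml" until the sequence numbers 1 to 4 attached to the pieces let the receiver restore the original order]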
Sequence numbers
Data in each packet is labeled with a "unique" number
  Establishes ordering amongst packets
  Allows the receiver to identify which packets have been received and which have not
Initialized during connection setup (i.e. 3-way handshake)
  A to B: SYN + Seq A
  B to A: SYN + ACK-A + Seq B
  A to B: ACK-B
What if the Data is Lost?
Problem: Lost Data
Solution: Timeout and Retransmit
[Figure: "GET index.html" is lost in the Internet; after a timeout the sender transmits "GET index.html" again and it arrives]
Acknowledgements and retransmissions
TCP receiver sends an acknowledgement back to the sender for the data it receives
  Lets sender know to "move on"
  Lets sender know that the network has the capacity to deliver its packets
Retransmissions
  Via timeout events
    TCP uses a single retransmission timer
    Sender sends segment and sets a timer
    Timer is based on measured round-trip times and round-trip time variations (e.g. timeout after ave. rtt + 2*std. deviation)
    Exponential backoff if persistent loss
  Via missing acknowledgements
    If receiver reports it has received packets 1, 3, 4, and 5, sender automatically resends 2 before timeout
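The example timeout rule (average RTT plus two standard deviations, doubled on each consecutive timeout) can be sketched as below. This follows the slide's example rather than the exact estimator in deployed TCP stacks, which use smoothed running averages (RFC 6298). The function name and parameters are illustrative.

```python
import statistics
from typing import Sequence


def retransmission_timeout(rtt_samples: Sequence[float],
                           consecutive_timeouts: int = 0) -> float:
    """Timeout in seconds: mean RTT + 2 * std. deviation, with backoff."""
    mean_rtt = statistics.fmean(rtt_samples)
    # Population standard deviation of the measured round-trip times.
    deviation = statistics.pstdev(rtt_samples)
    base = mean_rtt + 2 * deviation
    # Exponential backoff: double the timeout for each timeout in a row.
    return base * (2 ** consecutive_timeouts)
```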
What if receiver has no resources (flow control)?
Problem: Overflowing receiver buffers
Solution: Receiver advertised window
[Figure: a sender uploads "PUT remix.mp3" across the Internet; the receiver advertises how much buffer it has free (16KB)]
TCP Flow control
Receiver has a finite buffer
  App process may be slow reading it
  Flow control to make sure sender won't overflow it
  Match the send rate to the receiving app's drain rate
Rcvr advertises spare room in buffer by including value of RcvWindow in each segment/ACK
  Also known as the "advertised" window
Sender limits unACKed data to RcvWindow to avoid overflow
TCP Flow control
Problem: 16-bit advertised window field (in bytes)
  Maximum of 64KB !!
Consider a network with 1500 byte segments, 100ms RTT, want 10 Gbps throughput
  BW*Delay = 10Gbps * 0.1s = 1Gbit
  In bytes, 1Gbit/8 = 125MB
  In packets, W=83,333
  Amount of data potentially in flight from sender to receiver
  Need at least a 125MB receiver buffer to support!
Solution: TCP window scaling option
  Scaling factor on advertised window specifies # of bits to shift to the left
  Scaling factor exchanged during connection setup
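The arithmetic above, plus the smallest shift that lets a 16-bit window field cover the bandwidth-delay product, can be checked with a short sketch. The function names are illustrative; the upper bound of 14 on the shift comes from the window scale option specification (RFC 7323).

```python
def bandwidth_delay_product_bytes(bandwidth_bps: int, rtt_ms: int) -> int:
    # bits in flight = bandwidth * RTT; divide by 8 for bytes, 1000 for ms.
    return bandwidth_bps * rtt_ms // 8000


def min_window_shift(window_bytes: int, max_shift: int = 14) -> int:
    """Smallest left shift so a 16-bit field can advertise window_bytes."""
    for shift in range(max_shift + 1):
        if (0xFFFF << shift) >= window_bytes:
            return shift
    raise ValueError("window too large even with maximum scaling")


bdp = bandwidth_delay_product_bytes(10_000_000_000, 100)  # 10 Gbps, 100 ms
packets_in_flight = bdp // 1500                           # 1500-byte segments
shift = min_window_shift(bdp)
```

For the 10 Gbps, 100 ms example this gives 125,000,000 bytes, 83,333 packets, and a shift of 11 bits.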
What if Network is Overloaded?
Short bursts: buffer
What if buffer overflows?
  Packets dropped and retransmitted
  Sender adjusts rate until load = resources
Called "Congestion control"
TCP congestion control
End-host, window-based
Only place to really prevent collapse is at end-host
Added in late 80s due to congestion collapse on the Internet
  Increase in network load results in decrease of useful work
  A result of
    Spurious retransmissions of packets still in flight
    Undelivered packets which consume network resources and are dropped elsewhere in the network
TCP congestion control basics
Keep a congestion window (cwnd)
  Reduce when congestion is perceived
  Increase otherwise (probe for bandwidth)
Size of window denotes how much the network is able to absorb
  "Size of the pipe"
  Make cwnd as large as possible without loss
TCP "probes" for usable bandwidth continuously
  Increase cwnd until loss (congestion)
  Decrease cwnd upon loss, then begin probing (increasing) again
Recall receiver's advertised window (rcv_wnd)
  Sender's maximum window = min(rcv_wnd, cwnd)
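The way flow control and congestion control combine at the sender can be sketched in one function: the sender may have at most min(rcv_wnd, cwnd) bytes unacknowledged. The function name and the bytes_in_flight parameter are illustrative.

```python
def usable_window(cwnd: int, rcv_wnd: int, bytes_in_flight: int) -> int:
    """Bytes the sender may still transmit right now."""
    # The tighter of the two limits applies: the network's (cwnd)
    # or the receiver's (rcv_wnd).
    limit = min(cwnd, rcv_wnd)
    # Data already sent but not yet ACKed counts against the limit.
    return max(0, limit - bytes_in_flight)
```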
TCP slow start (circa 1990s)
When connection begins, increase rate exponentially fast until first loss event
  cwnd = 1 for 1st RTT
  cwnd = 2 for 2nd RTT
  cwnd = 4 for 3rd RTT
When connection begins, cwnd = 1 MSS
  Example: MSS = 500 bytes & RTT = 200 msec
  Initial rate = 20 kbps!
Available bandwidth may be much larger than MSS/RTT
  Desirable to quickly ramp up to a respectable rate
TCP congestion avoidance
Q: When should the exponential increase stop?
If loss occurs when cwnd = W
  Network can handle 0.5W ~ W segments
  Cut cwnd in half, grow window more slowly
  Grow cwnd by 1 every round-trip time
  Results in additive increase
[Figure: cwnd over time. Slow start doubles cwnd each RTT (1, 2, 4, ...); after a loss, fast retransmit/recovery halves the window and congestion avoidance grows it linearly (W, W+1, ..., 2W)]
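The interplay of slow start, halving on loss, and additive increase can be traced with a toy per-RTT simulation. It assumes a loss occurs in any round where cwnd exceeds a fixed path capacity (in segments), which is a simplification; the function name and parameters are illustrative.

```python
from typing import List


def simulate_cwnd(capacity: int, rounds: int) -> List[int]:
    """Return the cwnd (in segments) used in each round-trip time."""
    cwnd = 1
    in_slow_start = True
    trace: List[int] = []
    for _ in range(rounds):
        trace.append(cwnd)
        if cwnd > capacity:
            # Loss detected this round: cut the window in half and
            # leave slow start for congestion avoidance.
            cwnd = max(1, cwnd // 2)
            in_slow_start = False
        elif in_slow_start:
            cwnd *= 2   # exponential growth: double every RTT
        else:
            cwnd += 1   # additive increase: one segment per RTT
    return trace
```

With capacity=10 and rounds=12 the trace is [1, 2, 4, 8, 16, 8, 9, 10, 11, 5, 6, 7]: the doubling phase overshoots, and each later loss starts another linear climb.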
TCP throughput
TCP flow and congestion control issues
Short transfers perform poorly
  Flows timeout on loss if cwnd < 3
    Change dupack threshold for small cwnd
  Parameters tuned for 1980s modems!
    For a 1KB segment size and a 100ms RTT, initial rate = 81kbps
    Cable modems at 312x capacity
  3-4 packet flows (most HTTP transfers) need 2-3 round-trips to complete
Short transfers
Use larger initial cwnd
  IETF approved initial cwnd = 10
  /usr/src/linux/include/net/tcp.h
    /* TCP initial congestion window as per draft-hkchu-tcpm-initcwnd-01 */
    #define TCP_INIT_CWND 10
TCP flow and congestion control issues
Problem: TCP Sawtooth for large W (long-fat pipes)
For sawtooth W to 2W
  Packets xferred in sawtooth: W + (W+1) + (W+2) + ... + 2W = (3W/2)*(W+1) = 1.5W(W+1)
  For W=83,333
    Packets xferred in sawtooth between losses = 10.4 billion
    Loss rate = 1 packet loss per sawtooth, so L = 10^-10 (Wow)
  Sawtooth length = W*RTT
    For W=83,333 and RTT=100ms, sawtooth length is over 2 hours
  Average connection throughput is 3/4 of capacity
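The sawtooth figures follow directly from the formulas and can be reproduced with a few lines. This is just the slide's arithmetic written out; the function name and the returned dictionary keys are illustrative.

```python
from typing import Dict


def sawtooth_stats(w: int, rtt_s: float) -> Dict[str, float]:
    """Stats for one sawtooth in which cwnd climbs from w to 2w."""
    # W + (W+1) + ... + 2W has (W+1) terms averaging 1.5W.
    packets = 3 * w * (w + 1) // 2
    return {
        "packets": packets,
        "loss_rate": 1 / packets,          # one loss per sawtooth
        "duration_s": w * rtt_s,           # W round trips to add W segments
        "avg_utilization": (1.5 * w) / (2 * w),  # mean window / peak window
    }


stats = sawtooth_stats(83_333, 0.1)
```

For W=83,333 and a 100ms RTT, stats["packets"] is 10,416,708,333, the loss rate is just under 10^-10, and the duration is about 8,333 seconds (roughly 2.3 hours).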
Standard TCP
Low window size resilience to packet loss in High-Speed Network
  Slow increase: cwnd = cwnd + 1
  Fast decrease: cwnd = cwnd * 0.5
[Figure: cwnd versus time (RTT) for standard TCP. After slow start, each packet loss halves cwnd from 100,000 (10Gbps) to 50,000 (5Gbps), and congestion avoidance takes 1.4 hours to climb back]
Standard TCP over different capacities
Cannot fully utilize the huge capacity of high-speed networks!
NS-2 Simulation (100 sec)
  Link Capacity = 155Mbps, 622Mbps, 2.5Gbps, 5Gbps, 10Gbps
  Drop-Tail Routers, 0.1BDP Buffer
  5 TCP Connections, 100ms RTT, 1000-Byte Packet Size
[Figure: link utilization versus link capacity (Mbps) for the five link capacities]
TCP Cubic for long-fat pipes
The congestion window is a cubic function of time since the last congestion event, with the inflection point set to the window prior to the event.
Used in QUIC and in Linux versions > 2.6.19
Details in a paper if interested
[Figure: CUBIC window curves with competing flows (NS simulation in a network with 500Mbps and 100ms RTT), C = 0.4, β = 0.8]
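The cubic growth curve can be sketched as W(t) = C*(t - K)^3 + W_max, where K is the time at which the window returns to W_max. The sketch assumes β is the factor the window is multiplied by at a loss (so the window restarts at β*W_max), which is the convention of RFC 8312; the slides do not state which convention their β = 0.8 uses. The function names are illustrative.

```python
def cubic_k(w_max: float, c: float = 0.4, beta: float = 0.8) -> float:
    """Seconds after a loss at which the window climbs back to w_max."""
    # Chosen so that cubic_window(0) equals beta * w_max.
    return (w_max * (1.0 - beta) / c) ** (1.0 / 3.0)


def cubic_window(t: float, w_max: float,
                 c: float = 0.4, beta: float = 0.8) -> float:
    """Congestion window t seconds after the last congestion event."""
    k = cubic_k(w_max, c, beta)
    # Concave below w_max (t < k), flat at the inflection point (t = k),
    # convex probing beyond it (t > k).
    return c * (t - k) ** 3 + w_max
```

The window grows quickly right after a loss, flattens as it approaches the previous maximum, and then accelerates again to probe for more bandwidth.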
TCP BBR
Google's replacement for TCP Cubic
Bottleneck Bandwidth and Round-trip propagation time
  Use recent measurements of network delivery rate and round-trip time to model how quickly to send
  Maximum recent bandwidth and minimum recent round-trip delay
Experimental, but Google now owns many clients and servers
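The core of the model (maximum recent bandwidth, minimum recent round-trip delay) can be sketched with two sliding windows of samples. This is only an illustration of the idea, not BBR's actual state machine or filter lengths; the class name, method names, and the window of 10 samples are all assumptions.

```python
from collections import deque
from typing import Deque


class PathModel:
    """Toy path model: windowed max delivery rate and min RTT."""

    def __init__(self, window: int = 10) -> None:
        # deque(maxlen=...) silently discards the oldest sample.
        self._rates: Deque[float] = deque(maxlen=window)
        self._rtts: Deque[float] = deque(maxlen=window)

    def on_ack(self, delivered_bytes: int, interval_s: float,
               rtt_s: float) -> None:
        # Delivery rate = bytes acknowledged over the sampling interval.
        self._rates.append(delivered_bytes / interval_s)
        self._rtts.append(rtt_s)

    @property
    def bottleneck_bw(self) -> float:
        return max(self._rates)  # bytes per second

    @property
    def min_rtt(self) -> float:
        return min(self._rtts)   # seconds

    @property
    def bdp_bytes(self) -> float:
        # Data the path can hold: send at bottleneck_bw, keep about
        # this much in flight.
        return self.bottleneck_bw * self.min_rtt
```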
SSL/TLS
TCP/IP security
No initial support for
  Secrecy (eavesdropping)
  Server authentication (impersonation)
  Message integrity (man-in-the-middle tampering)
Netscape Secure Sockets Layer (SSL) (1994)
  Bolted on between application and transport layer to add security
Now: Transport Layer Security (TLS)
  HTTP over TLS => https://
  www.openssl.org for more information
Adoption initially slow
  Server overhead
  Lack of incentive
  Difficulty in obtaining certificates
Now ubiquitous
  Hardware support (x86 instructions for AES, dedicated ASICs)
  Snowden surveillance revelations
  Let's Encrypt (2014) free certificates
SSL/TLS overview
Public-key encryption
  PK = public key, SK = secret (private) key
  m = plaintext, c = ciphertext
Bob generates (SKBob, PKBob), publishes PKBob
Alice: using PKBob encrypts messages and only Bob can decrypt
  Alice: c = Enc(PKBob, m)
  Bob: m = Dec(SKBob, c)
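The Enc/Dec relationship can be illustrated with textbook RSA on tiny numbers. This is a toy for intuition only: the primes are far too small and there is no padding, so it must never be used for real security. The function names and parameter choices are illustrative.

```python
from typing import Tuple

Key = Tuple[int, int]


def generate_toy_keypair(p: int = 61, q: int = 53,
                         e: int = 17) -> Tuple[Key, Key]:
    """Return (PK, SK) for textbook RSA from two small primes."""
    n = p * q
    phi = (p - 1) * (q - 1)
    d = pow(e, -1, phi)  # modular inverse (Python 3.8+)
    return (e, n), (d, n)


def enc(pk: Key, m: int) -> int:
    e, n = pk
    return pow(m, e, n)  # c = m^e mod n


def dec(sk: Key, c: int) -> int:
    d, n = sk
    return pow(c, d, n)  # m = c^d mod n


pk_bob, sk_bob = generate_toy_keypair()  # Bob publishes pk_bob only
c = enc(pk_bob, 65)                      # Alice encrypts m = 65
m = dec(sk_bob, c)                       # only Bob can recover m
```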
Certificates
How does Alice (browser) obtain PKBob?
  Server Bob chooses (SK, PK)
  Bob sends PK + proof "I am Bob" to the CA
  CA checks the proof and issues a Cert signed with SKCA: "Bob's key is PK"
  Bob sends the Cert ("Bob's key is PK") to Browser Alice
  Alice verifies the cert with PKCA
Note: Implicit trust in Browser of CA and its Public Key (PKCA)
Bob uses Cert for an extended period (e.g. one year)
For Let's Encrypt, the proof is that you own DNS and IP
Certificate Authorities
Browsers accept certificates from a large number of CAs
  Top level CAs ≈ 60
  Intermediate CAs ≈ 1200
SSL/TLS (server auth only)
Initial SSL Handshake
  Client Hello (Client to Server): client random + cipher suites
  Server Hello (Server to Client): server random + cipher suites + certificate signed with the secret key of the CA (SK of CA)
  1. Client verifies the cert with the public key of the certificate authority (PK of CA), extracts the server public key, encrypts the master secret w/ that key
  Client Key Exchange (Client to Server): master secret encrypted w/ server public key
  2. Server decrypts the master secret with its private key, generates keys from master secret + randoms (slow)
  3. Client generates keys from master secret + randoms
  Finished
  Encrypted application data (e.g. https)
TLS 1.3
Fast-open
  Establish cryptographic key between client and server upon first connection (used as a connection cookie similar to ssh key)
  Allow server to immediately send to client upon initial SYN packet to remove an RTT delay
  Basis for 0-RTT handshake in QUIC
Result
Issue #1
But, what if someone reserves 0regonctf.org? Can they generate a certificate for it?
  Yes. Conflation in UI of security and identity
  Let's Encrypt being used heavily for phishing attacks
Issue #2
Two browser windows or one?
Issue #3
2011: Comodo and DigiNotar CAs hacked, issue certs for Gmail, Yahoo! Mail, …
2013: TurkTrust issued cert. for gmail.com
2014: Indian NIC (intermediate CA trusted by the root CA IndiaCCA) issues certs for Google and Yahoo! domains
2015: MCS (intermediate CA cert issued by CNNIC) issues certs for Google domains
Rogue nation-states owning certificate authorities
Man in the middle attack via rogue cert
Attacker proxies data between user and bank. Sees all traffic and can modify data at will.
NSA's Flying Pig program
[Figure: the user requests GET https://bank.com. The attacker sits between user and bank, forwards the ClientHello, receives the bank's ServerCert (BankCert), and returns a rogue ServerCert (BadguyCert, a cert for Bank issued by a valid CA). One SSL key exchange yields k1 between user and attacker, another yields k2 between attacker and bank; HTTP data is encrypted with k1 on one side and k2 on the other]
Potential solutions
See Web Security course
Dynamic HTTP public-key pinning
  Let a site declare CAs that can sign its cert
  On subsequent HTTPS, browser rejects certs issued by other CAs
  TOFU: Trust on First Use
Certificate Transparency: [LL'12]
  CAs must advertise a log of all certs they issued
  Browser will only use a cert if it is published on log server
    Efficient implementation using Merkle hash trees
  Companies can scan logs to look for invalid issuance
Certs on a block-chain?
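How a Merkle hash tree lets a log prove that a cert is included, using only a handful of hashes, can be sketched as follows. This is a simplified construction (an odd node is paired with itself) and not the exact tree layout used by Certificate Transparency logs; all function names are illustrative.

```python
import hashlib
from typing import List, Tuple

Proof = List[Tuple[bytes, bool]]  # (sibling hash, sibling is on the left)


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_levels(leaves: List[bytes]) -> List[List[bytes]]:
    if not leaves:
        raise ValueError("need at least one leaf")
    # Distinct prefixes keep leaf hashes and interior hashes apart.
    level = [_h(b"\x00" + leaf) for leaf in leaves]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2:
            level = level + [level[-1]]  # pair an odd node with itself
        level = [_h(b"\x01" + level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def merkle_root(leaves: List[bytes]) -> bytes:
    return merkle_levels(leaves)[-1][0]


def inclusion_proof(leaves: List[bytes], index: int) -> Proof:
    proof: Proof = []
    for level in merkle_levels(leaves)[:-1]:
        if len(level) % 2:
            level = level + [level[-1]]
        sibling = index ^ 1  # neighbour within the same pair
        proof.append((level[sibling], sibling < index))
        index //= 2          # move up to the parent
    return proof


def verify_inclusion(leaf: bytes, proof: Proof, root: bytes) -> bool:
    node = _h(b"\x00" + leaf)
    for sibling, sibling_is_left in proof:
        if sibling_is_left:
            node = _h(b"\x01" + sibling + node)
        else:
            node = _h(b"\x01" + node + sibling)
    return node == root
```

The proof grows with the logarithm of the number of logged certs, so a browser or auditor can check inclusion without downloading the whole log.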
Labs
Transport Lab #1: netstat, nc
Login to a linuxlab machine
  Perform a netstat -l -t -4 to find all listening sockets on the machine (those accepting incoming connections)
    The command flags specify sockets listening on TCP ports on IPv4 interfaces
    If the -4 flag does not work, omit it (you are on an IPv4-only machine)
    Note that as a superuser, you can add a -p flag to determine the program that owns each socket
  Examine the "Local Address" field
    Servers such as ssh and nginx typically listen on "*" to accept connections on any interface (INADDR_ANY when specifying socket)
    Servers intended for local access listen only on the loopback interface (INADDR_LOOPBACK) (described via man 7 ip)
  For the named services
    Identify whether they are listening on all interfaces or are local
    Then, look up their descriptions in /etc/services to find out what they are
    Note: netstat *should* provide the same information as an external nmap scan from the previous lab unless malware has been installed to hide itself locally on the machine
Repeat the exercise on your Ubuntu VM to identify what default services are installed
Transport Lab #1: netstat, nc
On the linuxlab machine
  Use ifconfig to find the IP address of the machine
  netcat (nc) is a program that can connect to arbitrary ports on a server
    For example, the following command connects to the web server (port 80) of 131.252.220.66
      nc 131.252.220.66 80
  Using the IP address of the machine, use nc to connect to the ssh port in order to identify the version of ssh that is being used on linuxlab machines (Control-c to exit)
Recall the "Local Address" settings previously listed
  Using the IP address of the machine, try to use nc to connect to the mail transfer port. What happens?
  Replace the IP address with localhost and try again. What software is being used to transfer mail on this machine?
  Explain why CAT has configured the mail transfer port this way.
Transport Lab #2: iperf and TCP cwnd
In this lab, we'll look at TCP throughput (as determined by its window size) based on round-trip times
On GCP, go to Compute Engine and create 3 VMs
  Zone: one in us-west1-b, one in Australia, and one in Europe
  Machine type: micro
  Boot disk: Ubuntu 16.04
  Allow HTTP on all VMs
ssh into each and install iperf
  sudo apt-get update
  sudo apt-get install iperf
Start the iperf server on the HTTP port (80) of all VMs and note the external IP address of each instance
  sudo iperf -s -p 80
On your Ubuntu 16.04 VM on linuxlab
  Perform the following to install xmgrace and iperf
    sudo apt-get update
    sudo apt-get install iperf grace
  Install the TCP probe module into the Linux kernel and set it to trace connections to/from port 80. Make the probe output readable
    sudo modprobe -r tcp_probe
    sudo modprobe tcp_probe port=80 full=1
    sudo chmod 444 /proc/net/tcpprobe
In this setup, your local VM will attempt to send as much data as possible to each of the 3 GCP VMs. The tcp_probe kernel module will periodically sample the state of TCP in order for it to be dumped out.
Trace the connection to the us-west server
  cat /proc/net/tcpprobe > uswest.raw &
  TCPCAP=$!
  iperf -c <uswest server IP address> -p 80
  (Wait for 10 seconds)
  kill $TCPCAP
  cat uswest.raw | awk '{print $1 " " $7}' >| uswest.out
/proc/net/tcpprobe is the file system interface for pulling data out of the kernel module. Performing a cat on it will stream the measurements into the file uswest.raw
$! in the shell gives the PID of the backgrounded cat process. Its value is saved into a shell variable before the process is killed
The awk statement prints the first and seventh fields of each line (timestamp and cwnd value)
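For anyone who prefers to post-process the trace in Python, the awk step can be sketched as below. It assumes whitespace-separated fields with a decimal timestamp first and a decimal cwnd seventh, matching what the awk command selects; the function name is illustrative and the sample line in the comment is made up, not real probe output.

```python
from typing import Iterable, List, Tuple


def extract_time_and_cwnd(lines: Iterable[str]) -> List[Tuple[float, int]]:
    """Pull (timestamp, cwnd) from each trace line, like awk '$1 " " $7'."""
    samples: List[Tuple[float, int]] = []
    for line in lines:
        fields = line.split()
        if len(fields) < 7:
            continue  # skip blank or truncated lines
        # awk fields are 1-based, so $1 is fields[0] and $7 is fields[6].
        samples.append((float(fields[0]), int(fields[6])))
    return samples


# Hypothetical sample line, for illustration only:
# "0.000123 10.0.0.2:51000 10.0.0.9:80 32 0x1a2b 0x1a2b 10 2147483647 29200 150 29312"
```

Reading the raw file would then look like extract_time_and_cwnd(open("uswest.raw")).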
Repeat for the European server
  cat /proc/net/tcpprobe > eu.raw &
  TCPCAP=$!
  iperf -c <europe server IP address> -p 80
  (Wait for 10 seconds)
  kill $TCPCAP
  cat eu.raw | awk '{print $1 " " $7}' >| eu.out
Repeat for the Australian server (10 seconds)
  cat /proc/net/tcpprobe > aust.raw &
  TCPCAP=$!
  iperf -c <australian server IP address> -p 80
  (Wait for 10 seconds)
  kill $TCPCAP
  cat aust.raw | awk '{print $1 " " $7}' >| aust.out
Plot the output using xmgrace and take screenshots of each (or plot them all together)
  xmgrace uswest.out
  xmgrace eu.out
  xmgrace aust.out
or
  xmgrace *.out
What do the results indicate when compared?
Take down all instances when complete