DNS Performance and the Effectiveness of Caching Jaeyeon Jung, Emil - - PowerPoint PPT Presentation



slide-1
SLIDE 1

DNS Performance and the Effectiveness of Caching

Jaeyeon Jung, Emil Sit, Hari Balakrishnan, Robert Morris Presenter: Gigis Petros

slide-2
SLIDE 2

Introduction

  • Two factors contribute to the scalability of DNS:
  • a hierarchical design around delegated name spaces
  • aggressive use of caching
  • Both reduce the load on the root servers
  • Successful caching also limits client-perceived delays and wide-area bandwidth usage

slide-3
SLIDE 3

Motivation

  • What performance, in terms of latency and failures, do DNS clients perceive?
  • How does varying the TTL and degree of cache sharing impact caching effectiveness?

slide-4
SLIDE 4

DNS overview

  • Maps human-readable host names to IP addresses
  • Also provides reverse mapping and mail-routing information
  • Mappings in the DNS name space are called resource records
  • A record: a name's IP address
  • NS record: name of a DNS server
  • Caching in DNS
  • Time To Live (TTL): expiration time set by the originator of a name
  • Negative caching
  • Caches work well because DNS changes slowly
slide-5
SLIDE 5

Terminology

  • A lookup refers to the entire process of translating a domain name
  • A query refers to a DNS request packet sent to a DNS server
  • A response refers to a packet sent by a DNS server in reply to a query packet
  • An answer is a response from a DNS server that terminates the lookup (successfully or unsuccessfully)

slide-6
SLIDE 6

DNS Lookup Sequence

slide-7
SLIDE 7

Questions

  • What is the ratio of TCP connections to DNS A-record lookups?
  • What is the number of DNS queries per lookup?
  • DNS errors
  • What percentage of lookups never get an answer?
  • Performance of the retransmission protocol
  • What is the effect of varying TTLs and degrees of cache sharing on cache hit rate?

slide-8
SLIDE 8

Key Findings

  • The TCP connection / DNS lookup ratio suggests that the hit rate of DNS caches inside MIT is between 70% and 80%
  • DNS queries per lookup
  • Unanswered lookups: 10 query packets
  • Answered lookups: 1.3 query packets
  • 23% of all client lookups in the most recent MIT trace fail to elicit any answer
  • 13% of lookups result in an answer that indicates a failure. Most of these failures are NXDOMAIN
  • The percentage of TCP connections made to names with low TTL values increased from 12% to 25% in 2000
  • Setting all A-record TTLs to a value as small as 10 minutes is not likely to degrade the scalability of DNS
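The first finding infers a cache hit rate from the ratio of TCP connections to wide-area A-record lookups. A rough sketch of that arithmetic follows. It assumes every TCP connection is preceded by exactly one name resolution, which is a simplification, and the function name and counts are made up for illustration rather than taken from the traces.

```python
def estimated_hit_rate(tcp_connections: int, wide_area_lookups: int) -> float:
    """Estimate the fraction of name resolutions served from a local cache.

    Assumption: each TCP connection needs one resolution, and every
    resolution that misses the cache shows up as one wide-area lookup.
    """
    if tcp_connections <= 0:
        raise ValueError("need at least one TCP connection")
    return 1.0 - wide_area_lookups / tcp_connections


# Hypothetical counts: 1000 connections, 250 lookups seen on the wide area.
print(estimated_hit_rate(tcp_connections=1000, wide_area_lookups=250))  # 0.75
```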

slide-9
SLIDE 9

MIT Dataset

  • Traffic between MIT's Laboratory for Computer Science (LCS) and Artificial Intelligence Laboratory (AI) and the rest of the Internet
  • 24 internal subnetworks sharing the border router
  • Data collected in January and December 2000
  • 500 users, 1200 hosts
slide-10
SLIDE 10

KAIST Dataset

  • Traffic between the Korea Advanced Institute of Science and Technology (KAIST) and the rest of the Internet
  • Collected May 2001
  • 1000 users, 5000 hosts
  • Only international TCP traffic
slide-11
SLIDE 11

Methodology

  • Collection methodology
  • Wide-area DNS queries and responses
  • Outgoing TCP connections: SYN/FIN/RST
  • Anonymized internal addresses
  • Analysis methodology
  • Sliding window of 60 seconds
  • A referral occurs when a server does not know the answer to a query, but does know where the answer can be found

slide-12
SLIDE 12

Data Summary

  • Categorize lookups
  • 1. Negative answer: the lookup gets a response with a non-zero response code
  • 2. Zero answer: the response is authoritative and indicates no error, but has no ANSWER, AUTHORITY or ADDITIONAL records
  • 3. Answered with success: the lookup terminates with a response that has a NOERROR code and one or more ANSWER records
  • All other lookups are considered unanswered
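The categories above can be read as a small decision procedure. The following is a minimal sketch under stated assumptions: the `Response` record and its field names are hypothetical stand-ins for a parsed DNS response, and the function inspects only the last response seen for a lookup.

```python
from dataclasses import dataclass
from typing import Optional

NOERROR = 0  # DNS response code meaning "no error"


@dataclass
class Response:
    """Hypothetical summary of the last response received for a lookup."""
    rcode: int
    authoritative: bool
    answer_count: int
    authority_count: int
    additional_count: int


def classify_lookup(final: Optional[Response]) -> str:
    """Map a lookup's final response (or None) to one of the four categories."""
    if final is None:
        return "unanswered"  # nothing was ever received
    if final.rcode != NOERROR:
        return "negative answer"
    if final.answer_count >= 1:
        return "answered with success"
    if (final.authoritative
            and final.authority_count == 0
            and final.additional_count == 0):
        return "zero answer"
    # For example, a referral that never led to an answer.
    return "unanswered"
```

For example, `classify_lookup(Response(3, False, 0, 0, 0))` returns `"negative answer"`, since response code 3 is NXDOMAIN.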
slide-13
SLIDE 13

Data Summary

slide-14
SLIDE 14

Latency

Latency distribution vs. number of referrals for the mit-dec00 trace

  • Latency is affected by the number of referrals
slide-15
SLIDE 15

Latency

  • Cached NS records reduce the load on root servers

Distribution of latencies for lookups that do and do not involve querying root servers.

slide-16
SLIDE 16

Retransmissions

  • A querying name server retransmits a query if it does not get a response from the destination within a timeout period
  • Categories of unanswered lookups
  • Zero referrals (nothing received)
  • Non-zero referrals (referrals that did not lead to an answer)
  • Loops (misconfigured information)
slide-17
SLIDE 17

Retransmissions

Cumulative distribution of the number of retransmissions for answered (topmost curves) and unanswered lookups

  • 99.9% of answered lookups have <= 2 retransmissions
slide-18
SLIDE 18

Retransmissions

  • Each lookup that elicited zero referrals generated about five times as many wide-area query packets
  • We conclude that many DNS name servers are too persistent in their retry strategies
  • Results show that it is better for them to give up after 2 or 3 retransmissions and let the client program decide what to do
  • Loops generated on average about 10 query packets
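The recommendation to give up after 2 or 3 retransmissions can be illustrated with a bounded retry loop. This is only a sketch: `send_query` is a hypothetical callable standing in for the actual network send, and the default timeout is an arbitrary placeholder, not a value from the study.

```python
from typing import Callable, Optional


def lookup_with_bounded_retries(
    send_query: Callable[[float], Optional[bytes]],
    max_retransmissions: int = 2,
    timeout_s: float = 2.0,
) -> Optional[bytes]:
    """Send one query plus at most `max_retransmissions` retransmissions.

    `send_query(timeout_s)` is assumed to send the query and return the
    response bytes, or None if no response arrived within the timeout.
    """
    for _attempt in range(1 + max_retransmissions):
        response = send_query(timeout_s)
        if response is not None:
            return response
    return None  # give up and let the calling program decide what to do
```

With the default of 2 retransmissions, an unreachable server costs at most 3 query packets per lookup.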

slide-19
SLIDE 19

Retransmissions

  • Share of query packets generated by lookups that obtained no answer:
  • 59% (mit-jan00)
  • 63% (mit-dec00)
  • Name servers are using inappropriate settings for the number of retries or an excessive timeout value
  • No retransmissions within 60 seconds:
  • 12% (mit-jan00), 19% (mit-dec00)
slide-20
SLIDE 20

Failures

  • Negative answers are mostly NXDOMAIN or SERVFAIL errors
  • NXDOMAIN: the requested name does not exist
  • SERVFAIL: the server is supposed to be authoritative but does not have a valid copy of the domain's data, or is out of memory
  • The largest cause of these error responses is inverse lookups for IP addresses
  • Non-existent top-level domains: loopback, index.htm
  • The large number of distinct names makes negative caching ineffective
slide-21
SLIDE 21

Interactions with Root Servers

  • 15-27% of the lookups sent to root name servers resulted in negative responses

  • Most of these appear to be mistyped names

The percentages are of the total number of lookups in the trace

slide-22
SLIDE 22

Effectiveness of DNS Caching

  • How useful is it to share DNS caches among many client machines?

  • Locality of references among clients
  • What is the likely impact of choice of TTL on caching effectiveness?
  • Locality of references in time
  • Quantify two important statistics
  • 1. Distribution of name popularity
  • 2. The distribution of TTL values
slide-23
SLIDE 23

Name Popularity

  • The top 10% of names account for more than 68% of total lookups
  • A long tail: 9.0% are unique names
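A statistic of this kind can be computed directly from a list of looked-up names. The sketch below shows one way to do it. The function name, the integer-percent parameter, and the toy input are illustrative assumptions and do not come from the traces.

```python
from collections import Counter
from typing import Iterable


def top_share(names: Iterable[str], top_percent: int = 10) -> float:
    """Fraction of all lookups that go to the most popular `top_percent`% of names."""
    counts = Counter(names)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no lookups given")
    # Number of distinct names in the top slice, at least one.
    k = max(1, len(counts) * top_percent // 100)
    top_total = sum(count for _name, count in counts.most_common(k))
    return top_total / total


# Toy input: one popular name and nine names looked up once each.
lookups = ["a.example"] * 8 + [f"host{i}.example" for i in range(9)]
print(top_share(lookups))  # 8 of 17 lookups go to the single top name
```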
slide-24
SLIDE 24

TTL Distribution

  • The fraction of accesses to short TTLs has doubled
  • Increased deployment of DNS-based server selection

slide-25
SLIDE 25

Trace-driven Simulation

  • Two databases:
  • “name database” maps every IP address in an A answer to its domain name
  • “TTL database” maps each domain name to the highest TTL of any A record for that domain
  • Algorithm
  • Randomly divide TCP clients into groups of size s
  • For each new TCP connection, determine the group g and look for a name n in the cache of group g
  • If n exists and the cached TTL has not expired, record a hit. Otherwise record a miss
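The algorithm above can be sketched in a few lines. This is an illustrative reconstruction under assumptions, not the authors' simulator. The tuple layout of a connection, the function name, the choice to skip connections whose destination IP is missing from the name database, and re-inserting a name with its TTL-database value on a miss are all assumptions.

```python
import random
from typing import Dict, Sequence, Tuple

# (timestamp in seconds, client id, destination IP): an assumed trace format.
Connection = Tuple[float, str, str]


def simulate_hit_rate(
    connections: Sequence[Connection],
    name_db: Dict[str, str],    # IP address -> domain name
    ttl_db: Dict[str, float],   # domain name -> TTL in seconds
    group_size: int,
    seed: int = 0,
) -> float:
    """Replay TCP connections against one shared cache per client group."""
    rng = random.Random(seed)
    clients = sorted({client for _ts, client, _ip in connections})
    rng.shuffle(clients)  # random division into groups of size `group_size`
    group_of = {c: i // group_size for i, c in enumerate(clients)}

    expiry: Dict[Tuple[int, str], float] = {}  # (group, name) -> expiry time
    hits = misses = 0
    for ts, client, ip in sorted(connections):
        name = name_db.get(ip)
        if name is None:
            continue  # assumption: ignore IPs never seen in an A answer
        key = (group_of[client], name)
        if key in expiry and ts < expiry[key]:
            hits += 1
        else:
            misses += 1
            expiry[key] = ts + ttl_db[name]  # assumption: refill on a miss
    total = hits + misses
    return hits / total if total else 0.0
```

Running the function for increasing values of `group_size`, or with scaled TTLs in `ttl_db`, gives hit-rate curves of the kind discussed on the following slides.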

slide-26
SLIDE 26

Effect of Sharing on Hit rates

  • Most of the benefit of sharing is obtained with as few as 10 or 20 clients per cache

slide-27
SLIDE 27

Impact of TTL on Hit rates

  • Most of the benefit of caching is achieved with TTLs less than about 1000 seconds.
  • 5-minute TTLs would increase DNS traffic by a factor of 1.5
  • NS record caching is critical
slide-28
SLIDE 28

Conclusions

  • 1. About a quarter of all DNS lookups never get an answer, which corresponds to over 50% of DNS packets in the wide-area Internet
  • 2. The DNS retransmission protocol appears to be overly persistent, but in 10%-12% of cases, no retransmissions occur
  • 3. Setting all A-record TTLs to a value as small as 10 minutes is not likely to degrade the scalability of DNS in any noticeable way
  • 4. The cacheability of NS records enhances scalability by reducing load on the root and top-level name servers
  • 5. Little benefit is obtained from sharing a forwarding DNS cache among more than 10-20 clients

slide-29
SLIDE 29