Passive DNS Collection and Analysis The 'dnstap' (& fstrm) - - PowerPoint PPT Presentation
Passive DNS Collection and Analysis The 'dnstap' (& fstrm) - - PowerPoint PPT Presentation
Passive DNS Collection and Analysis The 'dnstap' (& fstrm) Approach Farsight Security, Inc. December 2014 Importance of Measuring DNS High volume low latency datagram protocol Channel 202; 40 sources; 1,657,398,226,932 bytes/day;
SLIDE 1
SLIDE 2
Importance of Measuring DNS
- High volume low latency datagram protocol
– Channel 202; 40 sources; 1,657,398,226,932 bytes/day; 153.463 Mbit/sec average rate. #1: 1,003,803,989,532 bytes, 97 sources (60%) #2: 272,389,753,714 bytes, 152 sources (16%) #3: 227,311,932,135 bytes, 54 sources (13%)
- Enables almost all other network flows
– A, AAAA, MX, NS, SRV records
- Traffic analysis: NetFlow vs. DNS
– NetFlow tells you “what” – DNS tells you “why” (and “how”)
SLIDE 3
Challenges of Measuring DNS
- Historically, turning on logging in a DNS server
slows it down to the speed of the file system
– Operationally, measurement loss is always better
- So, success in DNS measurement has come
from an asynchronous approach – BPF/pcap
– NCAP (2006) – looked for authoritative responses, reassembling UDP datagrams as necessary (EDNS) – NMSG (2010) – like NCAP but has to see requests also, and then logs complete DNS transactions
SLIDE 4
Passive DNS Data Flow
Authority Servers Recursive Servers Stub Resolvers Farsight SIE PII Farsight DNSDB DNS Cache
- ther
analysts and
- ther
applications
SLIDE 5
Problems with NCAP/NMSG
- Blind to off-wire events like cache expiry due
to DNS TTL, cache purge due to LRU
- Meaning is not tagged – NMSG receiver has to
impute (“guess”) stub vs. cache miss query type, as well as transaction bailiwick
- Currently blind to TCP/53 – noting that there
can be many transactions per TCP/53 session
SLIDE 6
Overload Handling Still Matters
Diagram courtesy of Van Jacobson, 1995 loss region
SLIDE 7
Enter ‘dnstap’ (DNS Tap) & ‘fstrm’ (Frame Streams)
- ‘dnstap’ is server-embedded
- ‘fstrm’ has reliable front-loss
- Implementation has begun
- Deployment is commencing
SLIDE 8
‘dnstap’ – Server-Embedded
- ‘dnstap’ messages are generated from within
DNS implementations, via instrumentation
– No UDP fragment or TCP stream reassembly – No guessing the transaction bailiwick – No matching of on-wire queries with responses – No imputing stub vs. cache-miss query
- Encoded using Google Protocol Buffers
– Fast, lean, open, high quality, de-facto standard
SLIDE 9
'dnstap' – Perspectives
- Messages can be annotated with off-wire
information, e.g.:
– Identity of the server, similar to NSID (for anycast)
- Messages are tightly bound to the role of the
protocol agent who generates them
– RESOLVER_QUERY and AUTH_QUERY are distinct in ‘dnstap’ but identical in BPF/pcap
- ‘dnstap’ is for observation not eavesdropping
– Its use proves that an endpoint is cooperating
SLIDE 10
‘fstrm’ – Reliable Front-Loss
- TCP protocol vs. “BSD Sockets API”
– Nonblocking UDP socket rejects full datagrams – Nonblocking TCP socket rejects overflow octets
- Which breaks framing unless sender keeps state
- Solution: ‘fstrm’ writer thread
– Lockless SP/SC ring buffer – ‘fstrm’ socket is blocking, so, thread can block – Reliable front-loss occurs when a ring buffer fills
SLIDE 11
‘dnstap’ / ‘fstrm’ Architecture
SLIDE 12
‘dnstap’ – Message Types
- Present:
– Stub {Query, Response} – Authoritative {Q, R} – Resolver {Q, R} – Client {Q, R} – Forwarder {Q, R}
- Prospective:
– RRL bucket {Start, End} – Zone transfer in {S, E} – Zone transfer out {S, E} – Cache purge (LRU) – Cache expiry (TTL)
SLIDE 13
SLIDE 14
SLIDE 15
Licensing/Packaging
- Using Apache Open Source License V2.0
– We loved BSD/ISC license, but AOSL2 is “better”
- Protocol, reference API, reference toolset
– Working now in Unbound, Knot; BIND is next
- Our commercial interest is: wide adoption
– So, it’s all on GitHub (see http://dnstap.info/)
- We intend to patch all F/L/OSS DNS servers
– Eventually this should pressure Nominum and Microsoft to join the ‘dnstap’ ecosystem
SLIDE 16
Context of DNS Measurements
- Farsight SIE – Security Information Exchange
– Commoditize security-relevant Internet telemetry – Channels for Passive DNS (raw, dedup’d, chaff, etc)
- Filtered output goes into DNSDB
– Hierarchical MTBL (Google Sorted String Tables) – Contains all of SIE’s DNS since June 2010 – RESTful API with JSON output
- SIE and DNSDB are cash-free for nonprofit
research/academia (pay us in data of like kind)
SLIDE 17
Passive DNS, SIE, DNSDB – Context
SLIDE 18
Demonstration
- DNSDB API
– online dnsdb_query tool
- SRA
– SIE Remote Access
- NOD
– Newly Observed Domains
SLIDE 19
Summary
- Passive DNS collection (NCAP, NMSG, ‘dnstap’)
- Worked example: DNSDB, SRA, NOD
- More Information: