Query Log Analysis
Detecting Anomalies in DNS Traffic at a TLD Resolver Thesis Defence Jun 30, 2017 Pieter Robberechts Promotor: Prof. Hendrik Blockeel Co-promoter: Ronald Geens
Outline: Goal and Context, The QLAD System, Results
DNS Belgium
The .be ccTLD resolver
Domain name registry for .be/.vlaanderen/.brussels
domains
Highlights uit 2016. DNS Belgium. URL: https://www.dnsbelgium.be/sites/default/files/generated/files/documents/cijfers%20deel%201%20-%20980px_v04_NL.pdf
nameservers
queries / day
DNS Belgium
Current Situation
pcap files
We believe that proactive and real-time analysis of this data could contribute to the resilience and security of DNS Belgium’s service.
Design and build a working query log analysis platform using available components and custom development, able to predict, detect and report on common attack and abuse patterns in an
growth and improvement.
Anomaly Detection
Challenges
Anomaly Detection
Goal
Design and implement a query log analysis platform that:
QLAD
Query Log Anomaly Detection
QLAD
System Overview
Data transformation: ENTRADA, DSC
Anomaly detection: QLAD-flow, QLAD-global
Presentation: QLAD-UI
Data Transformation
ENTRADA vs DSC
ENTRADA: convert, archive, SQL
DSC: aggregate, archive, MongoDB API
"ClientAddr": [
  { "val": "195.238.24.111", "count": 1014 },
  { "val": "195.238.25.53", "count": 70 },
  { "val": "195.238.25.99", "count": 63 },
  { "val": "195.238.24.117", "count": 61 },
  { "val": "194.78.30.189", "count": 59 },
  { "val": "42.236.23.92", "count": 55 },
  { "val": "195.238.25.108", "count": 55 },
  { "val": "42.236.23.91", "count": 54 },
  { "val": "193.58.1.131", "count": 52 },
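An aggregated record of this kind can be inspected with a few lines of Python. The sketch below is only an illustration: it assumes the record is available as a complete JSON document with the `ClientAddr` / `val` / `count` layout shown above, and the helper name `top_share` is hypothetical. The share is relative to the listed entries only, not to all traffic.

```python
import json

# Assumption: a complete JSON document with the same layout as the excerpt above.
record = json.loads("""
{"ClientAddr": [
  {"val": "195.238.24.111", "count": 1014},
  {"val": "195.238.25.53", "count": 70},
  {"val": "195.238.25.99", "count": 63}
]}
""")

def top_share(entries, n=1):
    """Fraction of the listed queries sent by the n busiest clients (hypothetical helper)."""
    total = sum(e["count"] for e in entries)
    busiest = sorted(entries, key=lambda e: e["count"], reverse=True)[:n]
    return sum(e["count"] for e in busiest) / total

share = top_share(record["ClientAddr"])
print(f"busiest client accounts for {share:.1%} of the listed queries")
```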
QLAD-flow
Algorithm
h₁
Dewaele, G., Fukuda, K., Borgnat, P., Abry, P., & Cho, K. (2007). Extracting Hidden Anomalies using Sketch and Non Gaussian Multiresolution Statistical Detection Procedures. Proc. ACM SIGCOMM Workshop on Large-Scale Attack Defense (LSAD’07), 1–8.
[Figure: per-bin counts of one sketch aggregated at three time scales (Level 1, Level 2, Level 3), with Gamma parameters (α₁, β₁), (α₂, β₂), (α₃, β₃) estimated at each level.]
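As a rough illustration of the multiresolution step, the sketch below aggregates one series of per-bin counts at dyadic levels and estimates Gamma parameters (α, β) at each level. The function names, the zero-padding of an incomplete last bin, and the method-of-moments fit are assumptions made for this example; the slides do not say how QLAD-flow estimates the parameters.

```python
import numpy as np

def aggregate(counts, level):
    """Sum consecutive bins in groups of 2**(level - 1); pad the tail with zeros."""
    width = 2 ** (level - 1)
    values = np.asarray(counts, dtype=float)
    padding = (-len(values)) % width
    values = np.pad(values, (0, padding))
    return values.reshape(-1, width).sum(axis=1)

def fit_gamma_moments(samples):
    """Method-of-moments estimate: alpha = mean^2 / var, beta = var / mean."""
    samples = np.asarray(samples, dtype=float)
    mean, var = samples.mean(), samples.var()
    if mean <= 0 or var <= 0:
        return float("nan"), float("nan")
    return mean ** 2 / var, var / mean

def multiresolution_parameters(counts, levels=(1, 2, 3)):
    """One (alpha, beta) pair per aggregation level."""
    return [fit_gamma_moments(aggregate(counts, level)) for level in levels]
```

For example, `aggregate([3, 3, 3, 4, 2], 2)` sums neighbouring bins and yields `[6, 7, 2]`.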
[Figure: Gamma parameters (α, β) estimated at Levels 1 to 3, their average, and the distance of a sketch to that average.]
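One way to turn per-level parameters into a single score per sketch is to compare each sketch with the average over all sketches. The sketch below uses a variance-normalised squared distance averaged over levels. This is an assumed simplification for illustration: the slides do not state the exact distance or threshold, and the function names are hypothetical.

```python
import numpy as np

def sketch_distances(params):
    """params[i][j] is one Gamma parameter (e.g. alpha) of sketch i at level j.

    Returns, per sketch, the squared deviation from the across-sketch mean,
    normalised by the across-sketch variance and averaged over levels.
    """
    params = np.asarray(params, dtype=float)
    reference = params.mean(axis=0)
    spread = params.var(axis=0)
    spread[spread == 0] = 1.0  # avoid dividing by zero when all sketches agree
    return (((params - reference) ** 2) / spread).mean(axis=1)

def anomalous_sketches(params, threshold):
    """Indices of sketches whose distance exceeds the threshold."""
    return np.flatnonzero(sketch_distances(params) > threshold)
```

With `params = [[1, 1], [1, 1], [1, 1], [5, 5]]` and a threshold of 1, only the last sketch is flagged.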
[Figure: an anomalous sketch is identified; the procedure is repeated with hash functions h₁, h₂, h₃.]
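Using several hash functions allows the anomalous keys (for example source IPs) to be narrowed down: a key is suspicious only if it lands in an anomalous sketch under every hash function. The sketch below illustrates that intersection. The salted SHA-256 hash, the function names, and the `is_anomalous` callback are assumptions for this example, not the hash family used by QLAD-flow.

```python
import hashlib

def bucket(key, seed, n_buckets):
    """Map a key to a sketch index; each seed acts as a different hash function."""
    digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % n_buckets

def suspicious_keys(keys, is_anomalous, seeds, n_buckets):
    """Keys that fall into an anomalous sketch under every hash function.

    is_anomalous(seed, index) is a caller-supplied predicate telling whether
    sketch `index` was flagged when hashing with `seed`.
    """
    candidates = set(keys)
    for seed in seeds:
        candidates &= {k for k in keys if is_anomalous(seed, bucket(k, seed, n_buckets))}
    return candidates
```

Keys that merely share a sketch with the anomalous key under one hash function are unlikely to share it under all of them, so the candidate set shrinks with each additional hash function.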
[1] Dewaele, G., Fukuda, K., Borgnat, P., Abry, P., & Cho, K. (2007). Extracting Hidden Anomalies using Sketch and Non Gaussian Multiresolution Statistical Detection Procedures. Proc. ACM SIGCOMM Workshop on Large-Scale Attack Defense (LSAD’07), 1–8.
[2] Mikle, O., Slany, K., Vesely, J., Janousek, T., & Survy, O. (2011). Detecting Hidden Anomalies in DNS Communication.
QLAD-flow
Shortcomings
Some attacks span many flows, e.g. a DoS attack with spoofed source IP addresses.
QLAD-flow is unable to detect these.
QLAD-global
Algorithm
Observation: each traffic anomaly causes changes in the distribution of one or more traffic features
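A change in the distribution of a traffic feature can be summarised by a single number, its entropy. The sketch below computes the Shannon entropy of one feature (here the query type) over a time window. The function name and the sample values are illustrative assumptions.

```python
import math
from collections import Counter

def shannon_entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of `values`."""
    counts = Counter(values)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# Hypothetical windows of observed query types.
normal_window = ["A", "AAAA", "MX", "A", "NS", "AAAA", "A", "MX"]
flood_window = ["ANY"] * 7 + ["A"]

print(shannon_entropy(normal_window))  # mixed traffic: higher entropy
print(shannon_entropy(flood_window))   # one dominant value: lower entropy
```

A flood of identical queries concentrates the distribution and lowers the entropy, while a scan over many distinct names spreads it out and raises the entropy.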
Input: ENTRADA, DSC
1. Get new entropies
2. Update models
3. Run detector
4. Report anomalies
Traffic features: TLD, SLD, qtype, rcode, client, ASN, country, response size
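The update-and-detect loop can be sketched as follows. The slides do not say which model QLAD-global fits to the entropy series, so this example swaps in a simple exponentially weighted mean and variance with a k-sigma rule. The class name, parameters, and defaults are all assumptions for illustration.

```python
class EntropyModel:
    """Exponentially weighted mean/variance of one feature's entropy (assumed model)."""

    def __init__(self, alpha=0.1, k=3.0, warmup=5):
        self.alpha, self.k, self.warmup = alpha, k, warmup
        self.mean, self.var, self.n = None, 0.0, 0

    def update_and_check(self, value):
        """Return True if `value` deviates more than k standard deviations, then update."""
        if self.mean is None:
            self.mean, self.n = value, 1
            return False
        deviation = value - self.mean
        std = self.var ** 0.5
        anomalous = self.n >= self.warmup and std > 0 and abs(deviation) > self.k * std
        self.mean += self.alpha * deviation
        self.var = (1 - self.alpha) * (self.var + self.alpha * deviation ** 2)
        self.n += 1
        return anomalous

def run_detector(models, entropies):
    """Update one model per feature and return the features flagged in this window."""
    return [
        feature
        for feature, value in entropies.items()
        if models.setdefault(feature, EntropyModel()).update_and_check(value)
    ]
```

Calling `run_detector(models, {"qtype": 2.0, "rcode": 1.1})` once per time window keeps one model per feature in the `models` dictionary and returns the names of the features whose entropy looks unusual.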
QLAD-UI
Rationale
challenging
traffic
QLAD-UI
Implementation
Database: HDFS (staging, warehouse), MongoDB
Data API: Node.js API (Thrift API, Mongoose)
User interface: React + Flux + Grommet + D3.js
Data
Description of the evaluation dataset
Sunday 12 to Monday 13 February 2017
server
Results
Detected anomalies
Columns, in order: QLAD-flow (source IP), QLAD-flow (query name), QLAD-global, Total (unique). Rows with fewer than four values had empty cells on the slide.
Caching resolver: 12, 2, 12
Benign anomaly: 1, 2, 3
Email marketing: 8, 2, 8
Spam sender: 3, 3
Domain enumeration: 5, 2, 5
Reflection attack: 1, 1, 1
Broken resolver or script: 1, 1
DoS attack: 3, 2, 1, 3
Unknown: 1, 1, 1
False positive: 11
TOTAL: 35, 15, 9, 36
Conclusion
Achieved results
QLAD
is a winning combination!
Conclusion
Future (ongoing) work
Can this be automated?
Any questions?
Appendix
Gamma Distribution
The shape parameter α controls the evolution of Γ_{α,β} from a highly asymmetric stretched exponential shape (α → 0) to a Gaussian shape N(αβ, αβ²) (α → +∞).
The scale parameter β mostly acts as a multiplicative factor (if X is Γ_{α,β}, then γX is simply Γ_{α,γβ}).
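For reference, the shape-scale parameterisation consistent with the mean αβ and variance αβ² quoted above can be written out as follows. The moment-based estimators in the last line are one common choice and are stated here as an assumption; the slides do not say which estimator is used.

```latex
\[
  \Gamma_{\alpha,\beta}(x) = \frac{x^{\alpha-1}\, e^{-x/\beta}}{\beta^{\alpha}\,\Gamma(\alpha)},
  \qquad x \ge 0,\ \alpha > 0,\ \beta > 0
\]
\[
  \mathbb{E}[X] = \alpha\beta, \qquad \operatorname{Var}[X] = \alpha\beta^{2}
\]
\[
  X \sim \Gamma_{\alpha,\beta} \;\Longrightarrow\; \gamma X \sim \Gamma_{\alpha,\gamma\beta}
  \quad (\gamma > 0)
\]
\[
  \hat{\alpha} = \frac{\hat{\mu}^{2}}{\hat{\sigma}^{2}}, \qquad
  \hat{\beta} = \frac{\hat{\sigma}^{2}}{\hat{\mu}}
\]
```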