Detecting Botnets with Temporal Persistence
Jaideep Chandrashekar Frederic Giroire Nina Taft Eve Schooler
Intel Labs Mascotte project I3S (CNRS, Univ. of Nice) INRIA Sophia Antipolis
Botnets: Why care?
[Botnet lifecycle diagram]
Infection vector: dirty webpage, drive-by download, trojans
→ exploit → call home (IRC, HTTP, P2P, hybrid) → actuation: spam, DoS, clickfraud, proxies, theft, espionage
Existing defenses (see RAID'09 proceedings):
At the exploit stage: signature matching, software patching → hard to adapt, a-priori knowledge required
At the actuation stage: traffic anomaly detectors (NBAD), traffic correlation, port inspection, payload analysis → noisy, prone to false positives
Our approach: watch outgoing traffic and use a frequency-based metric, without a-priori assumptions about traffic types, destinations, or protocols.
Training and Detection

Timeline: Day1 Day2 Day3 Day4 Day5 .. .. DayN ..

Training: watch destinations; whitelist frequent destinations (output: whitelist + params)
Detection: ignore whitelisted destinations; track frequency for non-whitelisted destinations; raise an alarm for new high-frequency destinations

Intuition: botnet C&Cs are likely to be frequently visited, while adding to the whitelist is a very rare event.
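The training/detection split above can be sketched in a few lines. This is a minimal, hedged illustration: the function names, the day-count thresholds, and the per-day destination sets are all illustrative, not the parameters used in the talk.

```python
from collections import defaultdict

def train_whitelist(training_days, min_days=3):
    """Whitelist destinations seen on at least `min_days` training days.
    `training_days` is a list of per-day sets of destinations (illustrative)."""
    seen_days = defaultdict(int)
    for day in training_days:
        for dest in set(day):
            seen_days[dest] += 1
    return {d for d, n in seen_days.items() if n >= min_days}

def detect(detection_days, whitelist, alarm_days=3):
    """Alarm on non-whitelisted destinations that become frequent."""
    counts = defaultdict(int)
    alarms = set()
    for day in detection_days:
        for dest in set(day):
            if dest in whitelist:
                continue            # ignore whitelisted destinations
            counts[dest] += 1       # track frequency of the rest
            if counts[dest] >= alarm_days:
                alarms.add(dest)    # new high-frequency destination
    return alarms

training = [{"a.com", "b.com"}, {"a.com"}, {"a.com", "b.com"}, {"a.com"}]
wl = train_whitelist(training)       # only "a.com" recurs often enough
days = [{"a.com", "cc.evil"}, {"cc.evil"}, {"b.com", "cc.evil"}]
print(detect(days, wl))              # {'cc.evil'} raises an alarm
```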
Destination granularity: tracking IP address = large whitelists!
➡ destination atoms
(Frequency) metric needs to capture: loosely periodic behavior at unknown timescales
➡ persistence
We track persistence of destination atoms & build whitelists of destination atoms
mail1.sc.intel.com mail3.sc.intel.com mail3.jf.intel.com xyz.google.com abs.google.com circuit.intel.com cps.circuit.intel.com
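The hostnames above get grouped into a handful of destination atoms rather than tracked as individual addresses. One simple way to approximate that grouping is to collapse each hostname to a domain suffix; this is a hedged sketch of the idea, not the paper's exact atom definition (the `levels=2` rule is illustrative).

```python
def atom(hostname, levels=2):
    """Collapse a hostname to its last `levels` DNS labels (illustrative rule)."""
    return ".".join(hostname.split(".")[-levels:])

hosts = ["mail1.sc.intel.com", "mail3.sc.intel.com", "mail3.jf.intel.com",
         "xyz.google.com", "abs.google.com", "cps.circuit.intel.com"]
print(sorted({atom(h) for h in hosts}))   # ['google.com', 'intel.com']
```

Six hostnames collapse into two atoms, which keeps whitelists small.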
sliding window of width w
connect to C&C hourly → p-value: 24/24 = 1
connect to C&C every 5–6 hours → p-value: 4/24 ≈ 0.17
connect to C&C once a day → p-value: 1/24 ≈ 0.042
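The p-value examples above follow directly from the definition: the fraction of measurement windows, within one observation window, in which the destination was contacted at least once. A minimal sketch (the function name and the hour-based units are illustrative):

```python
def persistence(contact_hours, window_hours=1, obs_hours=24):
    """Fraction of measurement windows (width `window_hours`) inside one
    observation window (`obs_hours`) in which the destination was contacted."""
    n_windows = obs_hours // window_hours
    hit = {h // window_hours for h in contact_hours if h < obs_hours}
    return len(hit) / n_windows

hourly = list(range(24))             # contact every hour
sparse = [0, 6, 12, 18]              # contact every ~6 hours
once   = [3]                         # contact once a day
print(persistence(hourly))           # 1.0   (24/24)
print(round(persistence(sparse), 2)) # 0.17  (4/24)
print(round(persistence(once), 3))   # 0.042 (1/24)
```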
Timescales: TS1 = (w1, W1), TS2 = (w2, W2), TS3 = (w3, W3), ..., TSn = (wn, Wn)
Tracking every timescale separately can become very expensive!
Trick: select Wi = k·wi
Then a single bitmap of size k·wmax suffices.
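One way to read the bitmap trick: with Wi = k·wi, the observation windows nest, so one bitmap kept at the finest granularity can answer the persistence query at every coarser timescale by folding groups of bits. A hedged sketch — `K`, `W_MAX`, and the folding rule are illustrative choices, not the paper's exact parameters:

```python
K = 24            # windows per observation period (illustrative)
W_MAX = 4         # coarsest measurement window, in base time units

def record(bitmap, t):
    """Set the bit for base time unit t (bitmap covers K * W_MAX units)."""
    bitmap[t % (K * W_MAX)] = 1

def p_value(bitmap, w):
    """Persistence at measurement-window width w: the fraction of the K
    windows (w base units each) containing at least one contact."""
    windows = [bitmap[i * w:(i + 1) * w] for i in range(K)]
    return sum(any(win) for win in windows) / K

bitmap = [0] * (K * W_MAX)
for t in range(0, K * W_MAX, 4):   # one contact every 4 base units
    record(bitmap, t)
print(p_value(bitmap, 1))          # 0.25 at the finest timescale
print(p_value(bitmap, 4))          # 1.0  at the coarsest
```

A destination that looks sporadic at the finest window (0.25) is fully persistent at the coarser one (1.0), which is exactly why multiple timescales are tracked.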
Evaluated on data collected for 4 weeks (ground truth not available) → results seem reasonable.
80% of destinations have a p-value < 0.2; 20% of destinations have a p-value > 0.2.
[Histogram: # of atoms vs. persistence, 0.1–1.0]
If p(atom) > 0.6, add it to the whitelist.
[Plot: whitelist size (12.5–145) vs. # of users; annotations: windows auto update, malware]
ClamAV Signature        | C&C type                         | # of C&C atoms | C&C Volume (min–max)
Trojan.Aimbot-25        | port 22                          | 1              | 0–5.7
Trojan.Wootbot-247      | IRC port 12347                   | 4              | 0–6.8
Trojan.Gobot.T          | IRC port 66659                   | 1              | 0.2–2.1
Trojan.Codbot-14        | IRC port 6667                    | 2              | 0–9.2
Trojan.Aimbot-5         | IRC via HTTP proxy               | 3              | 0–10
Trojan.IRCBot-776*      | HTTP                             | 16             | 0–1
Trojan.VB-666*          | IRC port 6667                    | 1              | 0–1.3
Trojan.IRC-Script-50    | IRC ports 6662–6669, 9999, 7000  | 8              | 0–2.1
Trojan.Spybot-248       | port 9305                        | 4              | 3.8–4.6
Trojan.MyBot-8926       | IRC port 7007                    | 1              | 0–0.1
Trojan.IRC.Zapchast-11  | IRC ports 6666, 6667             | 9              | 0–1
Trojan.Peed-69 [Storm]  | P2P/Overnet                      | 19672          | 0–30
Goals: to identify/isolate C&C traffic; to identify/isolate attack traffic.
[Bar chart: C&C vs. attack traffic volume (up to ~150) for SDBot, ZapChast, Storm]
[Plot: persistence (0.0–1.0) vs. timescale (1 hr – 24 hr) for SDBot, ZapChast, Storm]
Botnet            | Persistence | Timescale  | # dest. atoms
IRCBot-776        | 1.0         | (10, 1)    | 1
IRCBot-776        | 0.8         | (200, 20)  | 2
Aimbot-5          | 1.0         | (10, 1)    | 1
Aimbot-5          | 1.0         | (40, 4)    | 1
Aimbot-5          | 1.0         | (160, 16)  | 1
MyBot-8926        | 0.6         | (160, 16)  | 1
IRC.Zapchast-11   | 1.0         | (40, 4)    | 3
Spybot-248        | 1.0         | (10, 1)    | 2
IRC-Script-50     | 1.0         | (10, 1)    | 7
VB-666            | 0.7         | (10, 1)    | 1
Codbot-14         | 1.0         | (10, 1)    | 1
Gobot.T           | 1.0         | (10, 1)    | 1
Wootbot-247       | 1.0         | (10, 1)    | 3
IRC.Zapchast-11   | 1.0         | (10, 1)    | 6
Aimbot-25         | 1.0         | (10, 1)    | 1
Peed-69 [Storm]   | 1.0         | (10, 1)    | > 1
All samples detected with a persistence threshold of 0.6; the associated false-positive rate is ~0.5/day.
Filtering traffic via whitelists reduces suspicious traffic (and permits lowered thresholds).
[Plot: detection rate (0.75–1) vs. false positives/day (1–7) as the threshold varies from 0.1 to 1.0; annotation: filtered ~ 30]