Effective features for detecting IRC botnets
Claudio Mazzariello, Carlo Sansone
Dipartimento di Informatica e Sistemistica
University of Napoli Federico II
via Claudio 21, 80125 Napoli (Italy)
{claudio.mazzariello, carlo.sansone}@unina.it
Terzo workshop italiano su PRIvacy e SEcurity (“PRISE”), Rome, 20 October 2008

Problem Statement
• Botnet
  • A network of infected hosts, named bots, under the control of an operator named botmaster
  • Control performed by using a Command & Control (C&C) channel
    • Centralized (e.g. IRC, HTTP, ...)
    • Distributed (e.g. P2P, ...)
  • The botmaster can issue commands from a large and flexible set to each bot

Motivation of this work
• Botnets keep spreading
• Botnets are able to perform many malicious actions
  • Spam
  • ID theft
  • Click fraud (e.g. Google AdSense abuse)
  • Cracking
  • Malware spreading
  • DDoS
  • Traffic sniffing
  • Keylogging
  • Polls/statistics manipulation
  • ...
• Botnets involve economic interests
• More dangerous than older attack types

Contribution
• Definition of a model of normal and botnet-related IRC channel usage
• Definition of an architecture exploiting such a model for botnet detection
• IRC user behavior classification aimed at botnet detection by means of pattern recognition techniques

Presentation outline
• An introduction to botnets
• Details on IRC botnets
• The proposed detection approach
• IRC user behavior model
• Detection system reference architecture
• Experimental evaluation

Centralized botnet's lifecycle
• Bot-herder configures initial bot parameters and C&C details
• Register IP at DNS for rendezvous
• Bot-herder launches or seeds new bot(s)
• Bots spreading, botnet growing
  • Vulnerability discovery and exploitation
  • Malicious code download
  • DNS lookup for rendezvous
  • Join the C&C
  • Receive commands from the botmaster
• Losing bots (stasis), botnet not growing
• Abandon botnet and sever traces
• Unregister DDNS

Botnet Statistics
• 60% are IRC bots
• 70% of all the bots connect to a single IRC server
• 57,000 active bots per day for the first 6 months of 2006 (Symantec)
• 4.7 million distinct computers being actively used in botnets
• Most botnets are managed by a single server (up to 15,000 bots)
• Mocbot seized control of more than 7,700 machines within 24 hours

Why IRC?
• Oldest and most popular IM
• Bots were commonly used by channel operators for management and monitoring purposes
• Not owned by anyone: public
• Defined in RFC 1459
• Text based
• Designed for both point-to-point and point-to-multipoint communication
  • One-to-one, or one-to-group chat
• Flexible, open-source protocol
• Potentially able to manage a high number of clients
• Grants anonymity for the botmaster

Centralized C&C
• Easier to manage and use
• Easier to disrupt
• How do the bots know where the C&C is?
  • Hardcoded IP based rendezvous
    • Easily uncovered
    • C&C needs replacement after disruption
    • All bots need replacement
  • Domain names used for rendezvous
    • DNS RR can be updated to current C&C IP
    • Bots can dynamically point to the correct C&C IP

Reference framework
• Port based application protocol detection
• RFC based IRC decoder
• Model = representative features
• Each IRC channel is represented by a feature vector, representing its status
• Feature vectors are updated at each event occurring in the corresponding IRC channel
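The decoder and the per-event update can be sketched as follows. This is a minimal illustration, not the authors' implementation: `parse_irc_line`, `ChannelState` and `feed` are hypothetical names, the message layout follows the general `[:prefix] COMMAND params [:trailing]` form of RFC 1459, and only three illustrative features are tracked.

```python
from dataclasses import dataclass, field


def parse_irc_line(line):
    """Split one raw IRC line into (prefix, command, params), RFC 1459 style."""
    line = line.rstrip("\r\n")
    prefix = None
    if line.startswith(":"):
        # An optional prefix (the message origin) is introduced by a leading colon.
        prefix, _, line = line[1:].partition(" ")
    # Everything after " :" is one trailing parameter that may contain spaces.
    head, sep, trailing = line.partition(" :")
    parts = head.split()
    if not parts:
        raise ValueError("empty IRC message")
    params = parts[1:]
    if sep:
        params.append(trailing)
    return prefix, parts[0].upper(), params


@dataclass
class ChannelState:
    """Hypothetical per-channel status; only three illustrative features."""
    users: set = field(default_factory=set)
    joins: int = 0
    messages: int = 0

    def as_vector(self):
        return (len(self.users), self.joins, self.messages)


def feed(channels, raw_line):
    """Update the state of the channel named in one decoded event, if any."""
    prefix, command, params = parse_irc_line(raw_line)
    if command not in ("JOIN", "PART", "PRIVMSG") or not params or prefix is None:
        return
    name = params[0]
    if not name.startswith("#"):
        return  # private message or other non-channel target
    nick = prefix.split("!", 1)[0]
    state = channels.setdefault(name, ChannelState())
    if command == "JOIN":
        state.joins += 1
        state.users.add(nick)
    elif command == "PART":
        state.users.discard(nick)
    else:
        state.messages += 1
```

Calling `feed(channels, line)` for every decoded line keeps one `ChannelState` per channel in the `channels` dictionary; `as_vector()` returns the current feature vector for a classifier.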

Intuitions about IRC based botnets
• Bursty channel activity
  • After a command is issued, bots may respond at once, then be quiet
• Limited vocabulary
• Sentence structure
  • May resemble a shell command
  • The same recurring structure may be found in many sentences
• Disproportion between user and control activity in a channel
• “Strange” words used for communication
  • Disproportion of consonants and vowels in words used for chatting
    • Language dependent
• Changes and structure of chat room topic
• Unusual nicknames
  • Completely random, OR
  • Unexpectedly regular
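Two of these intuitions can be made concrete with a short sketch. The function names, the English vowel set, and the idea of collapsing digit runs into a template are illustrative assumptions, not the nickname evaluation of the cited Rishi work and not necessarily how the authors measured these properties.

```python
import re
from collections import Counter

# Assumption: English-like text. The slide notes that this ratio is language dependent.
VOWELS = set("aeiou")


def vowel_ratio(word):
    """Fraction of alphabetic characters in a word that are vowels."""
    letters = [c for c in word.lower() if c.isalpha()]
    if not letters:
        return 0.0
    return sum(c in VOWELS for c in letters) / len(letters)


def nickname_template(nick):
    """Collapse each run of digits to '#', so 'bot-0412' and 'bot-9981' match."""
    return re.sub(r"\d+", "#", nick)


def most_common_template_share(nicks):
    """Share of nicknames that follow the single most frequent template."""
    if not nicks:
        return 0.0
    counts = Counter(nickname_template(n) for n in nicks)
    return counts.most_common(1)[0][1] / len(nicks)
```

A very low vowel ratio hints at random-looking words, while a template share close to 1 hints at unexpectedly regular nicknames.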

IRC channel features
• Users Number: total number of users in the channel
• Average Words Number: average number of unique words in a sentence
• Average/Variance of Channel Dictionary Cardinality: mean and variance of the vocabulary’s cardinality
• Unusual Nicknames*
• Equal Answers: number of sentences with a common ordered subset of words
• Control Commands Number: count of channel control commands issued
• Join Number: JOIN rate in the channel
• SetMode Number: SetMode rate in the channel
• Nickname Changes: count of nickname changes in a channel
• Ping Number: PING rate in the channel
• IRC Commands Number: overall IRC command rate
• Active Users Number: number of users active in the channel

* J. Goebel and T. Holz. Rishi: Identify bot contaminated hosts by IRC nickname evaluation. In HotBots’07: Proceedings of the First Workshop on Hot Topics in Understanding Botnets, page 8, Berkeley, CA, USA, 2007. USENIX Association.
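The two vocabulary features can be sketched as below. The slide does not say how sentences are tokenized or over which window the cardinality is sampled, so this sketch assumes whitespace tokenization, lower-casing, and one sample of the running vocabulary size per sentence; both function names are hypothetical.

```python
from statistics import mean, pvariance


def average_unique_words(sentences):
    """Average number of distinct words per sentence (needs at least one sentence)."""
    return mean(len(set(s.lower().split())) for s in sentences)


def dictionary_cardinality_stats(sentences):
    """Mean and population variance of the channel vocabulary size over time.

    Assumption: the vocabulary size is sampled once after every sentence.
    """
    vocabulary = set()
    sizes = []
    for sentence in sentences:
        vocabulary.update(sentence.lower().split())
        sizes.append(len(vocabulary))
    return mean(sizes), pvariance(sizes)
```

On command-like traffic such as `.scan 10.0.0.1 445` repeated with different targets, the vocabulary grows by one word per sentence, whereas free-form chat typically adds several new words each time.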

Experimental Setup
• Data collection
  • Botnet related traffic from the Georgia Institute of Technology network
  • Normal IRC chats logged from the University of Napoli network
• Three datasets
  • 50,000 samples (25,000 normal + 25,000 botnet-related)
    • Small, evenly split
  • 149,999 samples (75,010 normal + 74,989 botnet-related)
    • Large, evenly split
  • 165,000 samples (150,000 normal + 15,000 botnet-related)
    • Large, more realistic distribution of tuples
• Selected algorithms
  • SVM (Support Vector Machine): very “popular”
  • J48 (Decision Tree): very “quick”
• Performance evaluation
  • 10-fold cross validation
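As a reminder of what 10-fold cross validation does, here is a minimal pure-Python sketch of the split. Stratification (keeping the class ratio roughly equal in every fold) and the fixed seed are assumptions of this sketch; the slides do not state which variant was used, and the function names are hypothetical.

```python
import random


def stratified_folds(labels, k=10, seed=0):
    """Assign sample indices to k folds, spreading each class evenly."""
    rng = random.Random(seed)
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal the shuffled indices of this class round-robin over the folds.
        for position, index in enumerate(indices):
            folds[position % k].append(index)
    return folds


def cross_validation_splits(labels, k=10, seed=0):
    """Yield (train_indices, test_indices); each fold is held out exactly once."""
    folds = stratified_folds(labels, k, seed)
    for held_out in range(k):
        train = [i for j, fold in enumerate(folds) if j != held_out for i in fold]
        yield train, folds[held_out]
```

Every sample is used for testing exactly once and for training in the other nine rounds; the reported rates are then aggregated over the ten held-out folds.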

Classification algorithms
• SVM: kernel based method
  • Search for hyperplanes effectively separating data points
  • Support vectors for providing better prediction performance
  • Non-linearly separable data can be transformed by means of a kernel function into a space more suitable for linear separability
  • Separation hyperplane search is performed in the transformed space

[Figure: mapping φ(·) of data points from the input space to the feature space]
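A small worked example may help with the kernel idea. The slides do not say which kernel was used, so the homogeneous degree-2 polynomial kernel below is purely an illustrative choice. It shows that the kernel value equals an inner product in an explicit feature space, and that XOR-like points, which no line separates in the input space, are split by a hyperplane after the mapping.

```python
import math


def phi(x):
    """Explicit feature map of the degree-2 polynomial kernel for 2-D inputs."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def poly_kernel(x, y):
    """k(x, y) = (x . y)^2, computed without building phi(x) or phi(y)."""
    return dot(x, y) ** 2


# XOR-like toy data: the label is the sign of x1 * x2.
POINTS = {(1, 1): +1, (-1, -1): +1, (1, -1): -1, (-1, 1): -1}


def separated_in_feature_space():
    """Check that the hyperplane z2 = 0 separates the mapped points."""
    return all((phi(x)[1] > 0) == (label > 0) for x, label in POINTS.items())
```

Here `poly_kernel(x, y)` and `dot(phi(x), phi(y))` agree up to rounding, which is why an SVM can search for the hyperplane in the feature space while only ever evaluating the kernel on input-space points.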

Classification algorithms
• J48: decision tree
  • Each attribute of the data can be used to make a decision which splits the data set into smaller subsets
  • The normalized information gain is measured
  • The attribute generating the highest normalized information gain is chosen
  • The algorithm is recursively applied to the subsets
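The normalized information gain (the gain ratio of C4.5, which J48 implements) can be sketched for a single categorical attribute as follows. This is a simplification: numeric attributes, which C4.5 handles by searching for a threshold, are not covered, and the example values such as "low" and "high" are hypothetical discretizations.

```python
import math
from collections import Counter


def entropy(items):
    """Shannon entropy, in bits, of the value distribution in a list."""
    total = len(items)
    return -sum((n / total) * math.log2(n / total) for n in Counter(items).values())


def gain_ratio(values, labels):
    """Information gain of splitting on `values`, divided by the split information."""
    total = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    # Expected entropy of the class labels after the split.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    gain = entropy(labels) - remainder
    split_info = entropy(values)
    return gain / split_info if split_info > 0 else 0.0


def best_attribute(columns, labels):
    """Name of the attribute with the highest gain ratio; columns maps name -> values."""
    return max(columns, key=lambda name: gain_ratio(columns[name], labels))
```

A tree builder would call `best_attribute` on the current subset, split on the winner, and recurse on each resulting subset until the subsets are pure or no attribute yields a positive gain.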

Experimental results

Algorithm   Samples   False alarm rate   Missed detection rate
SVM         50000     0                  0
SVM         149999    0                  0
SVM         165000    0                  0
J48         50000     < 0.001            0
J48         149999    0                  0
J48         165000    0                  0

Most representative features
• Limited vocabulary cardinality
• Limited sentence variability
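The slides do not spell out how the two rates are defined. Under the usual definitions, assumed here, the false alarm rate is the fraction of normal samples flagged as botnet-related, and the missed detection rate is the fraction of botnet-related samples classified as normal. The function name and label strings below are hypothetical.

```python
def detection_rates(true_labels, predicted_labels, positive="botnet"):
    """Return (false_alarm_rate, missed_detection_rate) for a binary detector."""
    false_alarms = misses = negatives = positives = 0
    for truth, predicted in zip(true_labels, predicted_labels):
        if truth == positive:
            positives += 1
            if predicted != positive:
                misses += 1  # botnet sample classified as normal
        else:
            negatives += 1
            if predicted == positive:
                false_alarms += 1  # normal sample flagged as botnet
    false_alarm_rate = false_alarms / negatives if negatives else 0.0
    missed_detection_rate = misses / positives if positives else 0.0
    return false_alarm_rate, missed_detection_rate
```

With these definitions, a false alarm rate below 0.001 on the 25,000 normal samples of the smallest dataset would correspond to fewer than 25 misclassified normal samples.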

Conclusions
• Promising model for botnet activity detection
• Tested on “real” data
• Results hopefully valid in a general scenario
• Model works with both a very reliable and a very quick classifier
• Effective classification performed on a per-tuple basis
• Botnet detection accuracy within strict performance boundaries
