less is more with intelligent packet capture

Less is More with Intelligent Packet Capture RANDY CALDEJON FLOCON


  1. Less is More with Intelligent Packet Capture RANDY CALDEJON FLOCON 2020

  2. Objectives • Consider the merits of streaming analytics • Introduce advanced open source tools • Encourage experimentation with OpenArgus

  3. Streaming Analytics • Increase speed at the Edge • Reduce bandwidth • Local Resources

  4. DragonFly Design Goals
      • Machine Learning: Analyzes data as it arrives
      • Incremental Updates: Receive updates before the flow is complete
      • Sustained Performance: Maintains 20Gbps+
      • Single Node Architecture: High-performance without a cluster
      • Bolt-On Mindset: Integrate seamlessly with other security tools

  5. A Practical Application of DragonFly: PCAP or it didn’t happen.

  6. Full Packet Capture is Ground Truth; but…
      • High Cost: 10Gbps Network Link, 30 days, ~$1.2M annually
      • Low Signal to Noise: Forensically relevant network data is a small fraction of total network data
      [Chart: share of Packet Capture, 0% to 100%, by category: No Forensic Value, Forensically Relevant Data, Indicators of Compromise]
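The scale behind the cost claim on slide 6 can be sanity-checked with simple arithmetic. The sketch below is an illustration only: the deck gives the link speed, retention period and annual cost, but not the link utilization or storage price, so `utilization` is a hypothetical parameter and no dollar figure is derived. At full utilization a 10 Gbps link produces 3.24 PB (decimal) over 30 days.

```python
def capture_volume_bytes(link_gbps: float, days: float, utilization: float = 1.0) -> float:
    """Bytes written by full packet capture over the retention window.

    utilization is an assumed average link load (1.0 = link always full).
    """
    bits_per_second = link_gbps * 1e9 * utilization
    seconds = days * 24 * 60 * 60
    return bits_per_second * seconds / 8


# Retention volume for the 10 Gbps / 30 day case at a few assumed utilizations.
for utilization in (1.0, 0.5, 0.1):
    volume = capture_volume_bytes(10, 30, utilization)
    print(f"utilization {utilization:.0%}: {volume / 1e15:.2f} PB")
```

Even at modest utilization the retained volume stays in the hundreds of terabytes, which is why recording only the forensically relevant fraction changes the economics.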

  7. Typical Packet Capture Workflow: Retrospective • Capture → Record → Filter → Analyze

  8. Intelligent Packet Capture • Capture → Record → Filter → Analyze

  9. Intelligent Packet Capture: Real-Time • Capture → Analyze → Filter → Record
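The difference between the retrospective and real-time orderings can be sketched in a few lines. This is a toy model, not DragonFly code: packets are plain dictionaries and `is_relevant` is a hypothetical stand-in for whatever analysis decides that traffic has forensic value.

```python
from typing import Callable, Dict, Iterable, List, Tuple

Packet = Dict[str, object]  # hypothetical packet record


def retrospective(packets: Iterable[Packet],
                  is_relevant: Callable[[Packet], bool]) -> Tuple[List[Packet], List[Packet]]:
    """Capture -> Record -> Filter -> Analyze: store everything, look later."""
    recorded = list(packets)                              # record every packet
    findings = [p for p in recorded if is_relevant(p)]    # filter/analyze afterwards
    return recorded, findings


def real_time(packets: Iterable[Packet],
              is_relevant: Callable[[Packet], bool]) -> List[Packet]:
    """Capture -> Analyze -> Filter -> Record: decide first, store only matches."""
    recorded = []
    for packet in packets:
        if is_relevant(packet):       # analyze and filter while the packet is in flight
            recorded.append(packet)   # record only what passed the filter
    return recorded


traffic = [{"id": i, "suspicious": i % 10 == 0} for i in range(100)]
stored_all, findings = retrospective(traffic, lambda p: bool(p["suspicious"]))
stored_rt = real_time(traffic, lambda p: bool(p["suspicious"]))
print(len(stored_all), len(findings), len(stored_rt))
```

Both orderings surface the same findings in this toy case; the real-time ordering simply stores far less to get there.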

  10. Intelligent PCAP: Using Machine Learning to Capture Packets with Forensic Value
      • Ground truth: Full packet capture has long been viewed as the “ground truth” for activity on the network, allowing analysts to identify the source of security incidents.
      • Expensive: Despite its value, full packet capture is not used to its fullest extent because lengthy retention periods are cost prohibitive, and retention only shrinks as bandwidth utilization increases.
      • Alternatives Lack Payloads: Though valuable for portions of the security workflow, alternatives to PCAP such as Flow and Application Metadata cannot provide the “ground truth” payload for irregular traffic.
      • Combine forces: Intelligent packet capture combined with augmented flow provides a powerful combination that supports a data friendly log format plus the full packets for anomalous traffic.
      Sidebar: Intelligent Packet Capture uses threat intelligence, advanced analytics, and Machine Learning to decide in near real-time what to record.

  11. Intelligent PCAP Performance Requirements • Packets/s • Events/s • Low Latency Feedback Loop

  12. Intelligent PCAP Open Source Framework • Argus (extraction) • mlpack (training) • eBPF (filtering) • tcpdump (recording)

  13. tcpdump -i eth0 -w /cache/pcap-%m-%d-%H-%M-%S \
          -W 100 -G 300 -C 1000

  14. eBPF for Filtering
      • User Space: eBPF program → LLVM/Clang → eBPF bytecode → load
      • Kernel: Verifier (may reject) → JIT compiler → native code
      • Register event; config and packet data shared through eBPF maps

  15. eBPF Map

          struct bpf_map_def SEC("maps") watchlist = {
              .type = BPF_MAP_TYPE_PERCPU_HASH,
              .key_size = sizeof(u32),    /* ipv4 address */
              .value_size = sizeof(u64),  /* counter/timeout */
              .max_entries = 100000,
              .map_flags = BPF_F_NO_PREALLOC,
          };
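How a watchlist map like this might drive the record decision can be modelled in user space. The sketch below is not eBPF and not taken from the deck: it assumes the 64-bit value holds an expiry timestamp (the slide's comment says only "counter/timeout"), and the class and method names are hypothetical. It keeps the same key shape (an IPv4 address as a 32-bit integer) and the same bounded size.

```python
import ipaddress
from typing import Dict


class Watchlist:
    """User-space model of a bounded hash map keyed by IPv4 address."""

    def __init__(self, max_entries: int = 100_000) -> None:
        self.max_entries = max_entries
        self._expiry: Dict[int, float] = {}  # key: u32 address, value: expiry time

    @staticmethod
    def key(addr: str) -> int:
        """Convert dotted-quad text to the 32-bit integer used as the map key."""
        return int(ipaddress.IPv4Address(addr))

    def add(self, addr: str, now: float, ttl: float) -> bool:
        """Watch an address for ttl seconds; fails when the map is full."""
        k = self.key(addr)
        if k not in self._expiry and len(self._expiry) >= self.max_entries:
            return False
        self._expiry[k] = now + ttl
        return True

    def should_record(self, addr: str, now: float) -> bool:
        """True while the address is on the watchlist and has not timed out."""
        k = self.key(addr)
        expiry = self._expiry.get(k)
        if expiry is None:
            return False
        if now >= expiry:
            del self._expiry[k]  # lazily evict expired entries
            return False
        return True


watch = Watchlist(max_entries=2)
watch.add("10.0.0.5", now=0.0, ttl=60.0)
print(watch.should_record("10.0.0.5", now=30.0), watch.should_record("10.0.0.5", now=61.0))
```

In this model the analyzer side calls `add` when a flow scores as suspicious, and the filter side calls `should_record` per packet.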

  16. mlpack for training • Training • Model • Scoring • mlpack lib

  17. mlpack splitting data (step 1)

          /usr/local/bin/mlpack_preprocess_split \
              --input_file data/$filename.data.csv \
              --input_labels_file data/$filename.labels.csv \
              --training_file data/$filename.train.csv \
              --training_labels_file data/$filename.train.labels.csv \
              --test_file data/$filename.test.csv \
              --test_labels_file data/$filename.test.labels.csv \
              --test_ratio 0.3 \
              --verbose

  18. mlpack generating model (step 2)

          /usr/local/bin/mlpack_random_forest \
              --training_file data/$filename.data.csv \
              --labels_file data/$filename.labels.csv \
              --num_trees 10 \
              --minimum_leaf_size 3 \
              --print_training_accuracy \
              --output_model_file model/$filename.eval-model.bin \
              --verbose

  19. mlpack testing model (step 3)

          /usr/local/bin/mlpack_random_forest \
              --input_model_file model/$filename.eval-model.bin \
              --test_file data/$filename.test.csv \
              --test_labels_file data/$filename.test.labels.csv \
              --probabilities_file probs.csv \
              --verbose
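One way to use the probabilities written in the testing step is to turn them into record or skip decisions. The sketch below rests on assumptions that should be checked against the actual output: it treats `probs.csv` as one row per test point with one column per class, treats class 1 as the "malicious" class, and uses a hypothetical threshold. The inline sample data is invented for illustration.

```python
import csv
import io
from typing import List


def load_probabilities(text: str) -> List[List[float]]:
    """Parse CSV text into rows of per-class probabilities (assumed layout)."""
    return [[float(x) for x in row] for row in csv.reader(io.StringIO(text)) if row]


def record_decisions(probs: List[List[float]], positive_class: int = 1,
                     threshold: float = 0.5) -> List[bool]:
    """True where the assumed malicious-class probability meets the threshold."""
    return [row[positive_class] >= threshold for row in probs]


def accuracy(probs: List[List[float]], labels: List[int]) -> float:
    """Fraction of points whose most probable class matches the label."""
    predicted = [max(range(len(row)), key=row.__getitem__) for row in probs]
    correct = sum(p == label for p, label in zip(predicted, labels))
    return correct / len(labels)


sample = "0.9,0.1\n0.2,0.8\n0.6,0.4\n"   # invented example rows
probs = load_probabilities(sample)
print(record_decisions(probs, threshold=0.3), accuracy(probs, [0, 1, 1]))
```

Lowering the threshold records more traffic at the cost of more false positives, which is the trade-off a capture deployment would tune.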

  20. • Scalable • Lightweight • Flexible • Extensible Version 2.0

  21. DragonFly MLE • Analyzers • Engine (embedded LuaJIT) • Plugins

  22. DragonFly Engine • Fast - C/C++ • Lightweight - Small Library • Scriptable - Embedded LuaJIT • Easy - Arduino Programming Model

  23. DragonFly Scriptable Analyzers

          -- setup() runs once: load the trained model named in the config
          function M:setup()
              model = config['module.model']
              rf = RandomForest.load(model)
          end

          -- loop() runs for each event
          function M:loop(event)
              -- ….
              rf:classify(event)
          end

  24. DragonFly Scriptable Analyzers

          function M:dns(event)
              -- ….
              rf:classify(event)
          end

          function M:tls(event)
              -- ….
              rf:classify(event)
          end

  25. DragonFly Plug-ins • mlpack • eBPF • iptree • Redis • cuckoo filter

  26. Argus • Argus (flow meter) • Radium (multiplexer) • ra (client) • ratop (client) • raml (client) • Real-time Per Flow Updates

  27. Argus Real-Time Flow Meter Field Overview (100+ Features)
      Flow Features: • IP Addresses • Ports • Protocol • Total Bytes • Total Packets • Start time • Duration
      Extended Flow Features: • MAC, VLAN, MPLS, ICMP, TCP flags and options • Flow details by direction • Payload
      Packet Dynamics: • Interpacket Arrival time and Jitter • Load and Rates (bytes and packets per second) • Connection Setup Times • Dropped/retransmitted packet statistics • Connection statistics (FIN, RST, SYN, Window advertisements, Zero windows)
      Computed Statistics and Packet Dynamic Statistics: • Producer/Consumer Ratio • App/Byte Ratio • Flow Active Runtime • Key Stroke Identification
      Derived Fields: • Country Code • MAC Manufacturer (OUI)
      Record Management: • Record Cause (Start, Status, Stop, Close, Error) • Unique Identifier (seq) • Sensor ID • Record Type (“flow” or “management”)

  28. Intelligent PCAP with raml • Based on Argus client (library) • Integrated with DragonFly (library) • Able to run an instance per core

  29. Intelligent PCAP with raml • Argus • raml • mlpack

  30. raml: DGA Analyzer

          function M:loop(event)
              local v = features(event.domain, event.ttl)
              score = rf:classify(v)
              return score
          end
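The deck does not show what `features(event.domain, event.ttl)` computes. As a purely hypothetical illustration, lexical features commonly used for DGA detection include label length, character entropy and digit ratio; the sketch below computes those for the leftmost label and appends the TTL. The feature choice and the function bodies are assumptions, not the raml implementation.

```python
import math
from collections import Counter
from typing import List


def shannon_entropy(text: str) -> float:
    """Shannon entropy in bits per character; 0.0 for empty or uniform text."""
    if not text:
        return 0.0
    total = len(text)
    return sum(-(n / total) * math.log2(n / total) for n in Counter(text).values())


def features(domain: str, ttl: int) -> List[float]:
    """Hypothetical feature vector: [length, entropy, digit ratio, ttl]."""
    label = domain.lower().rstrip(".").split(".")[0]   # leftmost label only
    digits = sum(ch.isdigit() for ch in label)
    digit_ratio = digits / len(label) if label else 0.0
    return [float(len(label)), shannon_entropy(label), digit_ratio, float(ttl)]


print(features("google.com", 300))
print(features("x1z9q7.example", 60))
```

Algorithmically generated names tend to score higher on entropy and digit ratio than dictionary-based names, which is what gives a classifier something to separate.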

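  31. raml: Threat Feed Analyzer

          -- setup() runs once: load the indicator list named in the config
          function M:setup()
              file = config['ioc.filename']
              iplist = iptree(file)
          end

          -- loop() runs for each event: match the destination address
          function M:loop(event)
              local daddr = event['daddr']
              match = iplist.lookup(daddr)
              return match
          end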
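The iptree plug-in's internals are not described in the deck. A common structure for this kind of lookup is a binary prefix trie with longest-prefix matching, sketched below under that assumption; the `IpTree` class, its methods and the example prefixes (documentation address ranges) are all hypothetical.

```python
import ipaddress
from typing import Iterable, Optional


class IpTree:
    """Binary trie over IPv4 prefixes with longest-prefix-match lookup."""

    def __init__(self, prefixes: Iterable[str] = ()) -> None:
        self._root: dict = {}
        for prefix in prefixes:
            self.insert(prefix)

    def insert(self, prefix: str) -> None:
        """Add a CIDR prefix or a single address (stored as a /32)."""
        net = ipaddress.IPv4Network(prefix, strict=False)
        bits = format(int(net.network_address), "032b")[: net.prefixlen]
        node = self._root
        for bit in bits:
            node = node.setdefault(bit, {})
        node["net"] = str(net)  # mark the end of a stored prefix

    def lookup(self, addr: str) -> Optional[str]:
        """Return the most specific stored prefix containing addr, or None."""
        bits = format(int(ipaddress.IPv4Address(addr)), "032b")
        node = self._root
        match = node.get("net")
        for bit in bits:
            node = node.get(bit)
            if node is None:
                break
            match = node.get("net", match)
        return match


iplist = IpTree(["203.0.113.0/24", "198.51.100.7"])
print(iplist.lookup("203.0.113.42"), iplist.lookup("192.0.2.1"))
```

A lookup walks at most 32 nodes regardless of how many indicators are loaded, which suits a per-event check.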

  32. Intelligent PCAP Solutions • Argus • raml • mlpack • br0 • eth0 • pcap0 • tcpdump

  33. Lessons Learned: Performance • >14Mpps • >750Keps • <50 msec
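These figures translate into tight per-item time budgets. The sketch below assumes Mpps means million packets per second and Keps means thousand events per second, and treats everything as a single processing path, which is a simplification since the deck notes that raml can run an instance per core. At 14 Mpps the budget is about 71 ns per packet, and roughly 700,000 packets arrive during a 50 msec feedback delay.

```python
def budget_ns(rate_per_second: float) -> float:
    """Average time available per item, in nanoseconds, at the given rate."""
    return 1e9 / rate_per_second


def items_during(rate_per_second: float, latency_seconds: float) -> float:
    """Items that arrive while waiting out a feedback-loop latency."""
    return rate_per_second * latency_seconds


per_packet = budget_ns(14e6)            # time per packet at 14 Mpps
per_event = budget_ns(750e3)            # time per event at 750 Keps
in_flight = items_during(14e6, 0.050)   # packets seen during a 50 msec delay

print(f"{per_packet:.1f} ns/packet, {per_event:.0f} ns/event, {in_flight:,.0f} packets in 50 ms")
```

The in-flight count shows why the feedback loop must be low latency: every packet that passes before the filter is updated is one that cannot be recorded.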

  34. Next Steps… • Complete POCs • Publish to GitHub https://github.com/counterflow-ai/dragonfly2 • Merge raml with Argus https://openargus.org/ • Explore additional use cases…

  35. Streaming Analytics Use Cases • Threat Intelligence Triage • Encrypted Traffic Analysis • Predictive Fault Detection

  36. Questions? RANDY CALDEJON rc@counterflowai.com https://github.com/counterflow-ai/dragonfly2
