Metadata format for benchmarking anomaly detection algorithms

  1. Metadata format for benchmarking anomaly detection algorithms
     Youki Kadobayashi, NICT / NAIST
     youki-k <at> is.naist.jp
     10th CAIDA-WIDE workshop / 1st CAIDA-WIDE-CASFI workshop, August 2008

  2. Anomaly detection algorithms: The problem
     ● We are still in the dark ages
     ● Incompatible datasets
     ● Incomparable results
     ● No technical method to accurately communicate the result of anomaly detection, even if we share the common dataset
     ● Inability to benchmark their performance

  3. Metadata format for anomaly detection algorithms
     ● Separate file for each algorithm
     ● XML-based
     ● header, {record1, record2, …}
     ● Envelope information: rely on datcat tools

  4. Header
     ● Algorithm name
     ● Algorithm version
     ● Algorithm URL
     ● Parameters given to the algorithm
     ● Date of analysis
     ● Analyst name
     ● Analyst organization
     ● Target dataset
     ● DATCAT dataset name

  5. Record
     ● Each record consists of: src, dst, start_time, end_time, anomaly_type, anomaly_value
     ● Arbitrary number of records
     ● Either src or dst can be a wildcard
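The header and record fields listed above could be serialized to XML roughly as follows. This is only a minimal sketch: the slides name the fields but give no schema, so the element names (`anomaly_metadata`, `header`, `record`, and the field tags), the `"*"` wildcard marker, and all sample values are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical tag names: the slides list these fields but define no schema.
HEADER_FIELDS = [
    "algorithm_name", "algorithm_version", "algorithm_url", "parameters",
    "analysis_date", "analyst_name", "analyst_organization",
    "target_dataset", "datcat_dataset_name",
]
RECORD_FIELDS = ["src", "dst", "start_time", "end_time",
                 "anomaly_type", "anomaly_value"]


def build_metadata(header, records):
    """Return an XML element holding one header and any number of records."""
    root = ET.Element("anomaly_metadata")
    header_el = ET.SubElement(root, "header")
    for name in HEADER_FIELDS:
        ET.SubElement(header_el, name).text = str(header.get(name, ""))
    for record in records:
        record_el = ET.SubElement(root, "record")
        for name in RECORD_FIELDS:
            # A missing src or dst becomes "*", an assumed wildcard marker.
            default = "*" if name in ("src", "dst") else ""
            ET.SubElement(record_el, name).text = str(record.get(name, default))
    return root


# Sample values are invented for illustration only.
example = build_metadata(
    {"algorithm_name": "example-detector", "algorithm_version": "0.1"},
    [{"src": "192.0.2.1", "start_time": 1217548800, "end_time": 1217548860,
      "anomaly_type": "scan", "anomaly_value": 0.9}],
)
xml_text = ET.tostring(example, encoding="unicode")
```

Writing `xml_text` to one file per algorithm would match the "separate file for each algorithm" layout described on slide 3.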

  6. API
     ● label_data(int handle, in_addr_t src, in_addr_t dst, time_t start, time_t end, string anomaly_type, float anomaly_value)
     ● label_data_ex(int handle, in_addr_t[] src, in_addr_t[] dst, time_t start, time_t end, string anomaly_type, float anomaly_value)

  7. Slicing
     ● Slice anomalous segments of pcap data
     ● Based on anomaly_type, anomaly_value
     ● Slice pcap data according to start_time, end_time
     ● Useful for generating synthetic dataset
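The slicing step might look like the sketch below. To stay self-contained it works on an in-memory list of packets rather than a real pcap file, so the `Packet` type is a stand-in. Treating anomaly_value as a minimum threshold and `None` as a wildcard are assumptions, and `slice_anomalies` is a hypothetical name.

```python
from typing import Dict, List, NamedTuple


class Packet(NamedTuple):
    """Stand-in for a parsed pcap packet (hypothetical structure)."""
    timestamp: float
    src: str
    dst: str
    payload: bytes


def slice_anomalies(packets: List[Packet], records: List[Dict],
                    anomaly_type: str, min_value: float) -> List[List[Packet]]:
    """Return one packet slice per record that matches the requested type
    and whose anomaly_value is at least min_value (assumed threshold rule)."""
    slices = []
    for rec in records:
        if rec["anomaly_type"] != anomaly_type:
            continue
        if rec["anomaly_value"] < min_value:
            continue
        slices.append([
            p for p in packets
            # Keep packets inside the record's time window ...
            if rec["start_time"] <= p.timestamp <= rec["end_time"]
            # ... whose addresses match; None acts as a wildcard.
            and (rec["src"] is None or p.src == rec["src"])
            and (rec["dst"] is None or p.dst == rec["dst"])
        ])
    return slices
```

With real traces, the packet list would come from a pcap reader, and each slice would be written back out as its own pcap file.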

  8. Merging
     ● Insert pcap slice B into pcap slice A
     ● At particular time offset
     ● Useful for benchmarking anomaly detection algorithms with synthetic dataset
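A rough sketch of the merge step follows, again on in-memory packets rather than pcap files. It assumes both slices are already sorted by timestamp and that the offset is measured from the first packet of slice A. The slides do not specify either point, and `Packet` and `merge_slices` are hypothetical names.

```python
import heapq
from typing import List, NamedTuple


class Packet(NamedTuple):
    """Stand-in for a parsed pcap packet (hypothetical structure)."""
    timestamp: float
    src: str
    dst: str
    payload: bytes


def merge_slices(slice_a: List[Packet], slice_b: List[Packet],
                 offset: float) -> List[Packet]:
    """Insert slice_b into slice_a so that slice_b's first packet lands
    `offset` seconds after slice_a's first packet.

    Both inputs must already be sorted by timestamp.
    """
    if not slice_a or not slice_b:
        return list(slice_a) + list(slice_b)
    # Shift every packet of B by the same amount to preserve its spacing.
    shift = slice_a[0].timestamp + offset - slice_b[0].timestamp
    shifted = [p._replace(timestamp=p.timestamp + shift) for p in slice_b]
    # heapq.merge interleaves two sorted inputs; on equal timestamps,
    # packets from slice_a come first.
    return list(heapq.merge(slice_a, shifted, key=lambda p: p.timestamp))
```

Shifting B as a block keeps its inter-packet gaps intact, which matters when the inserted anomaly is defined by its timing.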

  9. Comparison
     ● Visualize the spotted anomalies along timeline
     ● Compute coverage and support, generate HTML table
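The slides do not define coverage or support, so the sketch below picks one plausible reading purely for illustration. Here, coverage of an algorithm is the fraction of all reported intervals (from every algorithm) that it overlaps, and support of an interval is the number of algorithms reporting something that overlaps it. Both definitions, and all function names, are assumptions.

```python
from html import escape
from typing import Dict, List, Tuple

Interval = Tuple[float, float]  # (start_time, end_time)


def overlaps(a: Interval, b: Interval) -> bool:
    """True when the two closed intervals share at least one instant."""
    return a[0] <= b[1] and b[0] <= a[1]


def coverage(results: Dict[str, List[Interval]]) -> Dict[str, float]:
    """Assumed definition: share of all reported intervals that each
    algorithm overlaps with at least one of its own detections."""
    everything = [iv for ivs in results.values() for iv in ivs]
    if not everything:
        return {name: 0.0 for name in results}
    return {
        name: sum(any(overlaps(iv, own) for own in ivs)
                  for iv in everything) / len(everything)
        for name, ivs in results.items()
    }


def support(results: Dict[str, List[Interval]], interval: Interval) -> int:
    """Assumed definition: number of algorithms with an overlapping detection."""
    return sum(any(overlaps(interval, iv) for iv in ivs)
               for ivs in results.values())


def html_table(results: Dict[str, List[Interval]]) -> str:
    """Render per-algorithm detection counts and coverage as an HTML table."""
    cov = coverage(results)
    rows = ["<tr><th>algorithm</th><th>detections</th><th>coverage</th></tr>"]
    for name in sorted(results):
        rows.append(f"<tr><td>{escape(name)}</td>"
                    f"<td>{len(results[name])}</td>"
                    f"<td>{cov[name]:.2f}</td></tr>")
    return "<table>\n" + "\n".join(rows) + "\n</table>"
```

Other definitions, for example ones weighted by interval duration or restricted to matching anomaly_type, would fit the same structure.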

  10. Current status
     ● Implementation in progress
     ● Your comments are welcome
     ● youki-k <at> is.naist.jp
