Antivirus Engine, Giorgos Vasiliadis and Sotiris Ioannidis


GrAVity: A Massively Parallel Antivirus Engine Giorgos Vasiliadis and Sotiris Ioannidis FORTH-ICS, Greece RAID10, 15 September 2010 Overview Increase the processing throughput of virus scanning applications, using the Graphics


SLIDE 1

GrAVity: A Massively Parallel Antivirus Engine

Giorgos Vasiliadis and Sotiris Ioannidis FORTH-ICS, Greece RAID’10, 15 September 2010

SLIDE 2

Overview

  • Increase the processing throughput of virus scanning applications, using the Graphics Processing Unit (GPU)

SLIDE 3

Outline

  • Introduction
  • Architecture
  • Performance evaluation
  • Conclusions
SLIDE 4

Motivation

  • Antivirus software runs on e-mail servers, gateway proxies, and user desktops
    – Requires significant computational resources
  • Graphics cards
    – Easy to program
    – Powerful and ubiquitous
  • Why not use GPUs to speed up virus scanning operations?
SLIDE 5

CPU vs GPU

  • The GPU is specialized for compute-intensive, highly parallel computation
    – More transistors are devoted to data processing rather than to data caching and flow control

SLIDE 6

Anti-Virus Databases

  • Contain thousands of signatures
  • ClamAV contains more than 60K signatures, with lengths varying from 4 to 392 bytes
    – Significantly longer than NIDS signatures

[Figure annotations: > 80%, > 90%]

SLIDE 7

Virus Scanning in ClamAV

  • ClamAV uses a small part of each signature for first-pass filtering
  • Every potential match is processed by the verification module

[Diagram: Files → Filtering Module → Verification Module]

SLIDE 8

Virus Scanning in ClamAV

  • Usually, the majority of the data does not contain any virus
    – Only a small number of file segments pass to the verification module

[Diagram: Files → Filtering Module → Verification Module]
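
The two-stage flow described on these slides can be sketched roughly as follows. This is a minimal illustration, not ClamAV's implementation: the signature names and byte patterns are made up, and it assumes the "small part" used for filtering is simply the first `PREFIX_LEN` bytes of each signature.

```python
# Hypothetical signature set: name -> full byte pattern (not real ClamAV data).
SIGNATURES = {
    "Demo.A": b"MALWARE-PAYLOAD-A",
    "Demo.B": b"EVIL-BYTES-B",
}
PREFIX_LEN = 4  # assumed length of the part used for first-pass filtering


def first_pass(data):
    """Return sorted (offset, name) pairs wherever a signature prefix occurs."""
    candidates = []
    for name, sig in SIGNATURES.items():
        prefix = sig[:PREFIX_LEN]
        pos = data.find(prefix)
        while pos != -1:
            candidates.append((pos, name))
            pos = data.find(prefix, pos + 1)
    return sorted(candidates)


def verify(data, candidates):
    """Keep only candidates where the full signature is present at the offset."""
    return [(off, name) for off, name in candidates
            if data.startswith(SIGNATURES[name], off)]


data = b"clean text MALW but harmless ... EVIL-BYTES-B trailing"
candidates = first_pass(data)          # cheap filter: two potential matches
confirmed = verify(data, candidates)   # expensive check: only one survives
print(candidates)
print(confirmed)
```

The filter reports every offset that might start a signature, so it can produce false positives (the truncated `MALW` here) but never misses a real match; the verification step then discards the false positives.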

SLIDE 9

Our Approach: GPU Offloading

[Diagram: Files → Filtering Module → Verification Module]

SLIDE 10

GRAVITY DESIGN

SLIDE 11

Basic Design

  • Three-stage pipeline

[Diagram: Files → GPU → Verification Module]

SLIDE 12

Files Journey (1/5)

  • File scanning example
    – File contents are buffered back-to-back

[Diagram: Files → GPU → Verification Module]
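
As a rough illustration of back-to-back buffering, the sketch below concatenates several files into one buffer and keeps a table of start offsets, so that an offset reported against the packed buffer can be mapped back to a file. The function names `pack_files` and `locate` are hypothetical, and the slides do not say how GrAVity tracks file boundaries.

```python
from bisect import bisect_right


def pack_files(files):
    """Concatenate file contents into one buffer and record where each file starts."""
    buffer = bytearray()
    starts = []  # starts[i] is the offset of files[i] inside the buffer
    for content in files:
        starts.append(len(buffer))
        buffer += content
    return bytes(buffer), starts


def locate(starts, buffer_offset):
    """Map an offset in the packed buffer back to (file index, offset within file)."""
    index = bisect_right(starts, buffer_offset) - 1
    return index, buffer_offset - starts[index]


files = [b"first file", b"second", b"third one"]
buffer, starts = pack_files(files)
print(starts)              # start offset of each file in the packed buffer
print(locate(starts, 12))  # which file, and where in it, buffer offset 12 falls
```

Packing many files into one large buffer means a single transfer and a single scan can cover all of them, at the cost of this small lookup when results come back.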

SLIDE 13

Files Journey (2/5)

  • File scanning example

[Diagram: Files, Verification Module. Step 1: File contents]
SLIDE 14

Files Journey (3/5)

  • File scanning example
    – GPU filters out clean segments

[Diagram: Files, Verification Module. Step 1: File contents]
SLIDE 15

Files Journey (4/5)

  • File scanning example

[Diagram: Files, GPU, Verification Module. Step 1: File contents. Step 2: Matched offsets]
SLIDE 16

Files Journey (5/5)

  • File scanning example
    – Verify matches and report

[Diagram: Files, GPU, Verification Module, Full Virus Signatures. Step 1: File contents. Step 2: Matched offsets]
SLIDE 17

GPU IMPLEMENTATION

SLIDE 18

Prefix Filtering

  • Take the first n bytes from each signature
    – e.g. Worm.SQL.Slammer.A:0:*: 4e65742d576f726d2e57696e33322e536c616d6d65725554
  • Compile all n-byte sub-signatures into a single Scanning Trie
  • The Scanning Trie can quickly filter out clean data segments in linear time
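
A minimal sketch of prefix filtering with a trie is shown below. It uses the hex body of the Slammer example from the slide plus one made-up signature. The nested-dict trie, the value `PREFIX_LEN = 8`, and the function names are illustrative assumptions, not GrAVity's actual data structure.

```python
PREFIX_LEN = 8  # assumed value of n, chosen for illustration only

# Hex body of the example signature from the slide, plus one made-up entry.
SIGNATURES = {
    "Worm.SQL.Slammer.A": bytes.fromhex(
        "4e65742d576f726d2e57696e33322e536c616d6d65725554"),
    "Demo.Hypothetical": b"Net-Wolf-sample-pattern",
}


def build_trie(signatures, n):
    """Insert the first n bytes of every signature into a nested-dict trie."""
    root = {}
    for name, sig in signatures.items():
        node = root
        for byte in sig[:n]:
            node = node.setdefault(byte, {})
        node.setdefault("names", []).append(name)  # marks the end of a prefix
    return root


def scan(data, root, n):
    """Return (offset, names) for each offset where some n-byte prefix matches."""
    hits = []
    for start in range(len(data)):
        node = root
        for byte in data[start:start + n]:
            node = node.get(byte)
            if node is None:
                break  # no prefix continues this way: offset is clean
            if "names" in node:
                hits.append((start, list(node["names"])))
                break
    return hits


trie = build_trie(SIGNATURES, PREFIX_LEN)
data = b"xxNet-Worm.Win32 and Net-Wolf here"
hits = scan(data, trie, PREFIX_LEN)
print(hits)
```

Because every walk is cut off after at most n bytes, the work per input offset is bounded and the scan stays linear in the input size for a fixed n. An automaton with failure links (Aho-Corasick style) would avoid restarting the walk at every offset; the simple version is used here only to keep the sketch short.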

SLIDE 19

Scanning Trie

  • GrAVity: Variable trie height

4 patterns (avg) per 14-char prefix

SLIDE 20

Virus Scanning on the GPU

  • Each thread operates on different data
    – May overlap for spanning patterns, but …
    – … no communication/synchronization costs
    – Highly scalable (millions of threads can run in parallel)
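
The overlap idea can be emulated on the CPU as in the sketch below, where each loop iteration stands in for one GPU thread. The chunk size, the single-pattern search, and all function names are assumptions made for illustration; the real engine matches many prefixes at once.

```python
def chunk_ranges(total_len, chunk_size, max_pattern_len):
    """Yield (start, end) ranges; each reads max_pattern_len - 1 bytes past its chunk."""
    overlap = max_pattern_len - 1
    for start in range(0, total_len, chunk_size):
        yield start, min(start + chunk_size + overlap, total_len)


def find_all(data, pattern, base=0):
    """Offsets (shifted by base) at which pattern starts inside data."""
    hits = []
    pos = data.find(pattern)
    while pos != -1:
        hits.append(base + pos)
        pos = data.find(pattern, pos + 1)
    return hits


def scan_chunked(data, pattern, chunk_size):
    """Scan each chunk independently; no state is shared between chunks."""
    hits = []
    for start, end in chunk_ranges(len(data), chunk_size, len(pattern)):
        # With an overlap of len(pattern) - 1 bytes, a match that starts at or
        # beyond the chunk boundary cannot fit in the slice, so every match is
        # reported by exactly one chunk.
        hits.extend(find_all(data[start:end], pattern, base=start))
    return hits


data = b"abcXYZabcXYZabc"
pattern = b"cXYZa"
print(scan_chunked(data, pattern, chunk_size=4))
print(find_all(data, pattern))
```

Both calls report the same offsets even though the pattern straddles chunk boundaries, which is why the threads need no communication: the small amount of redundant reading replaces any synchronization.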

SLIDE 21

Memory Management Optimizations

  • Exploit the texture cache to achieve better read throughput
  • Cache misses are hidden by running a large number of threads in parallel

[Diagram: cache misses trigger thread switches. Labels: DRAM, Cache; 4 cycles, 700 cycles]

SLIDE 22

PERFORMANCE EVALUATION

SLIDE 23

GrAVity vs ClamAV

  • Up to 20 Gbps end-to-end performance

[Figure annotations: 100x, 12x]

SLIDE 24

Execution Time Breakdown

  • CPU time accounts for 20% of the total execution time with a prefix length of 14
  • Increasing the prefix length results in fewer matches
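
The effect of prefix length on the number of candidate matches can be seen in a toy experiment like the one below. The random data, the four-letter alphabet, and the signature set are all synthetic assumptions chosen so that short prefixes collide often; the numbers say nothing about real ClamAV signatures.

```python
import random


def count_candidates(data, signatures, prefix_len):
    """Count (offset, signature) pairs whose first prefix_len bytes match."""
    total = 0
    for sig in signatures:
        prefix = sig[:prefix_len]
        pos = data.find(prefix)
        while pos != -1:
            total += 1
            pos = data.find(prefix, pos + 1)
    return total


rng = random.Random(0)  # fixed seed so the run is repeatable
alphabet = b"abcd"      # tiny alphabet so short prefixes collide often
data = bytes(rng.choice(alphabet) for _ in range(20000))
signatures = [bytes(rng.choice(alphabet) for _ in range(16)) for _ in range(8)]

counts = {n: count_candidates(data, signatures, n) for n in (2, 4, 8, 14)}
print(counts)
```

Any offset that matches a longer prefix also matches every shorter prefix of the same signature, so the candidate count can only shrink as the prefix grows. Fewer candidates means less work for the CPU-side verification stage.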

SLIDE 25

Raw Computational Throughput

  • With 8M threads, the GPU achieves 42 Gbit/s throughput

SLIDE 26

Scaling factor

  • Fast evolution
SLIDE 27

Conclusions

  • Virus scanning on the GPU is practical and fast!
  • Over 20 Gbit/s throughput
    – Suitable for network-based virus scanning
  • Future work includes
    – Adapting memory-efficient algorithms (XFA, D2FA)
    – Multiple GPUs

SLIDE 28

GrAVity: A Massively Parallel Antivirus Engine

Thank you!

Giorgos Vasiliadis, gvasil@ics.forth.gr
Sotiris Ioannidis, sotiris@ics.forth.gr