Untangling the code: An overview of techniques to reverse engineer malicious software

SLIDE 1

McAfee Confidential—Internal Use Only

Untangling the code

An overview of techniques to reverse engineer malicious software

June 3, 2013

Prashant Gupta Security Architect, McAfee Inc.

SLIDE 2

Abstract

Reverse engineering and analysis of binary code has long been seen as a black art that requires immense commitment and a large degree of experience and intuition to be fruitful. This may have been true a couple of decades ago, but with the recent focus on reverse code engineering by security vendors and prolific malicious actors alike, the field has progressed significantly. In this talk I will present some of the techniques employed when reversing unknown binaries to vet them for malicious traits. The talk will also cover how program analysis can help identify possibly suspicious traits in code, and how the reverse engineering and program analysis techniques used for code are also relevant for identifying potentially suspicious data when analysing document formats.

SLIDE 3

Reverse Code Engineering: What?

Better Understanding

Interactions, Environment, Code

  • Scanners/Fingerprinting
  • Decoders/Decryptors
  • Unpackers
  • Behaviour Analysers
  • Code reversers
  • …

SLIDE 4

Reverse Code Engineering: Why?

  • Improve understanding where source code is not available
  • Audit, review or forensic investigation of software systems
  • Identifying breaches of intellectual property licensing
  • Malware research & defence
SLIDE 5

Malware prevalence

[Chart: malware prevalence by year on a logarithmic scale (1 to 100,000,000). Data labels, in order: 1990: 357; 1995: 8,069; 2000: 56,342; 2005: 164,000; 2010: 54M+]

Historically…

SLIDE 6

Malware prevalence: Why?

Malware Code

  • Virtualization
  • Encryption
  • Compression
  • Anti-Emulation
  • Junk-Code
  • Packer Chaining
  • Dynamic Functionality Extension
  • Destroy structures
  • Anti-Disassembly
  • New attack vectors

SLIDE 7

Reversing Code: Static Analysis

  • Automated Signature Search
    – Text search
    – Binary signatures
    – Known blobs, tables and data structure search
  • Decoding/decrypting code and payload
  • Identifying and decoding encoded payloads
  • From XORs to purpose-built custom algorithms (e.g. use of pseudo-random generators)
  • Code Block identification
  • Compiler and Library recognition
  • Fingerprinting known obfuscation techniques
  • Dealing with compiler optimizations
  • Semantic code similarity identification
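As a rough illustration of combining a binary signature search with the simplest encoding mentioned above, the sketch below looks for a known byte string hidden behind a single-byte XOR key. The choice of signature (the DOS stub message found in most Windows executables) and the function names are assumptions made for this example; real payloads often use multi-byte or rolling keys that this approach would not catch.

```python
from typing import List, Tuple

# Assumed marker for this sketch: the DOS stub text present in most PE files.
SIGNATURE = b"This program cannot be run in DOS mode"


def xor_bytes(data: bytes, key: int) -> bytes:
    """XOR every byte of data with a single-byte key."""
    return bytes(b ^ key for b in data)


def find_xor_encoded(data: bytes, signature: bytes = SIGNATURE) -> List[Tuple[int, int]]:
    """Return (key, offset) pairs where the signature appears XOR-encoded.

    A key of 0 means the signature is present in plain form.
    """
    hits = []
    for key in range(256):
        # Encode the short signature instead of decoding the whole buffer.
        needle = xor_bytes(signature, key)
        offset = data.find(needle)
        while offset != -1:
            hits.append((key, offset))
            offset = data.find(needle, offset + 1)
    return hits


if __name__ == "__main__":
    # Synthetic buffer: padding, then a fake header and the signature XORed with 0x5A.
    blob = b"\x00" * 16 + xor_bytes(b"MZ\x90\x00" + SIGNATURE, 0x5A) + b"\xff" * 8
    print(find_xor_encoded(blob))  # prints [(90, 20)]
```

Encoding the signature 256 times is cheaper than decoding the whole file 256 times, which matters when scanning large documents or memory dumps.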
SLIDE 8

Reversing Code: Static Analysis

  • Decompiling
  • Identifying boiler-plate code
  • Make complex code easier to understand (e.g. algorithms)
  • Many issues; after so many years, still in its infancy.
  • Packer identification and unpacking
  • Identify whether the binary is a setup program or was built using a binder/packer/cryptor
  • Unpack using custom, generic or standard algorithms.
  • Artefact extraction and logging
  • Structural anomalies
  • Known code patterns (e.g. peculiar function chaining)
  • Junk-code or no-op code search
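The no-op search in the last bullet can be sketched in its most basic form as a scan for long runs of the single-byte x86 NOP opcode (0x90). This is a deliberately naive example: the function name and the default run length are assumptions, the scan ignores instruction boundaries, and real junk code frequently uses multi-byte NOPs or instruction sequences with no net effect, which need a disassembler to recognise.

```python
import re
from typing import List, Tuple


def find_nop_runs(code: bytes, min_len: int = 8) -> List[Tuple[int, int]]:
    """Return (offset, length) for each run of at least min_len 0x90 bytes."""
    # Build a bytes regex such as rb"\x90{8,}" for the requested minimum length.
    pattern = re.compile(rb"\x90{%d,}" % min_len)
    return [(m.start(), len(m.group())) for m in pattern.finditer(code)]


if __name__ == "__main__":
    # Synthetic buffer: a function prologue, ten NOPs, a RET, then three NOPs.
    sample = b"\x55\x8b\xec" + b"\x90" * 10 + b"\xc3" + b"\x90" * 3
    print(find_nop_runs(sample))  # prints [(3, 10)]
```

Short runs are common compiler padding, so the threshold is there to keep ordinary alignment bytes out of the artefact log.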
SLIDE 9

Reversing Code: Dynamic Analysis

  • Virtualization/Sandboxing and Debugging
  • Control malware exposure and its ability to make irreversible changes.
  • Debugging to guide execution flows (bypass exceptions)
  • Providing stimulus to force malware to exhibit behaviour
  • Active and passive analysis platforms.
  • Behaviour logging
  • Automated behaviour trace logging
  • Identify communication mechanism
  • Identify persistence mechanism
  • Post-/part-execution memory dumps
  • Generate memory dumps for detailed static analysis
  • In cases where unpacking is not possible/feasible
  • In cases where in-memory data structures need analysis
  • Runtime unpacking
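To make the behaviour logging bullets more concrete, here is a minimal sketch of tagging events from an automated trace as communication or persistence. The event format (dictionaries with "api" and "target" keys) is hypothetical and not the output of any particular sandbox; the rules are a tiny illustrative subset based on well-known Windows API names and the registry Run key.

```python
from typing import Dict, Iterable, List

# Illustrative subset of network-related API names.
NETWORK_APIS = {"connect", "send", "recv", "InternetOpenUrlA", "HttpSendRequestA"}
# Lower-cased fragment of the registry autostart (Run/RunOnce) key path.
RUN_KEY_FRAGMENT = r"\currentversion\run"


def tag_event(event: Dict[str, str]) -> str:
    """Classify one trace event as 'communication', 'persistence' or 'other'."""
    api = event.get("api", "")
    target = event.get("target", "").lower()
    if api in NETWORK_APIS:
        return "communication"
    if api.startswith("RegSetValue") and RUN_KEY_FRAGMENT in target:
        return "persistence"
    if api.startswith("CreateService"):
        return "persistence"
    return "other"


def summarise(trace: Iterable[Dict[str, str]]) -> Dict[str, List[Dict[str, str]]]:
    """Group trace events by the tag assigned to each of them."""
    summary: Dict[str, List[Dict[str, str]]] = {
        "communication": [], "persistence": [], "other": []
    }
    for event in trace:
        summary[tag_event(event)].append(event)
    return summary


if __name__ == "__main__":
    # Hypothetical trace in the assumed format.
    trace = [
        {"api": "RegSetValueExA",
         "target": r"HKCU\Software\Microsoft\Windows\CurrentVersion\Run"},
        {"api": "connect", "target": "203.0.113.7:443"},
        {"api": "CreateFileA", "target": r"C:\Temp\a.tmp"},
    ]
    for tag, events in summarise(trace).items():
        print(tag, len(events))
```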
SLIDE 10

Reversing Code: Machine Correlation

  • Correlation for high level behaviour inference
  • Correlation of artefacts extracted from automated analysis
  • Automated Classification
  • Behavioural trait association
  • Static code relationships
  • Malware family
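One simple way to picture trait-based classification is to compare the set of traits extracted from a sample with trait sets recorded for known families. The sketch below uses Jaccard similarity for this; the similarity measure, the threshold, and the family and trait names are all assumptions chosen for illustration, not the correlation method described in the talk.

```python
from typing import Dict, Optional, Set, Tuple


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Size of the intersection divided by size of the union (0.0 if both empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def classify(sample: Set[str],
             families: Dict[str, Set[str]],
             threshold: float = 0.5) -> Tuple[Optional[str], float]:
    """Return the best-matching family and its score, or (None, score) below threshold."""
    best_name: Optional[str] = None
    best_score = 0.0
    for name, traits in families.items():
        score = jaccard(sample, traits)
        if score > best_score:
            best_name, best_score = name, score
    if best_score < threshold:
        return None, best_score
    return best_name, best_score


if __name__ == "__main__":
    # Hypothetical family profiles built from earlier analyses.
    families = {
        "family_a": {"run_key", "http_beacon", "upx_packed"},
        "family_b": {"service_install", "irc_channel"},
    }
    sample = {"run_key", "http_beacon", "keylogger"}
    print(classify(sample, families))  # prints ('family_a', 0.5)
```

A sample that falls below the threshold is returned unclassified, which is the point where manual investigation would take over.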
SLIDE 11

Reversing Code: Manual Investigation

  • Separating the wheat from the chaff
    – Identifying suspicious code
    – Evaluating known patterns and guiding the analysis process
    – Making decisions where automation could only provide approximations

  • Optimizing program code analysis
  • Correlating artefacts from static and behavioural analysis
  • Building new algorithms
  • Fixing problems faced by automation (exceptions, resource constraints, etc.)
  • Manual unpacking/decoding
  • Heuristic development
  • Adding new feature extraction methods
  • New correlation policies
  • New subsystem development for analysis and detection improvements
SLIDE 12

Reverse Engineering documents

Aspect | Binary Code | Binary Documents | Analysis
Not human readable | Yes | Yes | RE tools and environment
Multitude of environments | Many execution environments including VMs | Documents are generally platform agnostic. | Heuristic analysis systems can be shared when analysing artefact correlation.
Can exploit vulnerabilities | Yes, but not always needed. | Yes, generally in document editor/reader but sometimes in OS | Dynamic analysis techniques can be used
Internal formats can be obfuscated | Yes | Yes | Detecting encoded payloads. Identifying presence of obfuscation.
Executable Code | Yes | No | Signature searches and payload analysis techniques.

SLIDE 13

An example technique…

[Chart: vertical axis 10 to 100; horizontal axis 100 to 995,900]

explorer.exe

[Chart: vertical axis 10 to 100; horizontal axis 100 to 374,500]

UPX-compressed explorer.exe

SLIDE 14

An example technique…

[Chart: vertical axis 10 to 100; horizontal axis 100 to 4,734,100]

document

[Chart: vertical axis 10 to 100; horizontal axis 100 to 4,734,100]

document with hidden executable
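The slides do not say which quantity these charts plot against file offset. As an assumed stand-in, the sketch below computes a windowed Shannon entropy profile scaled to 0 to 100, a common way to make compressed or embedded regions stand out from ordinary file content. It should not be read as the measure used in the referenced patent; the window size and function names are choices made for this example.

```python
import math
from collections import Counter
from typing import List, Tuple


def entropy_percent(block: bytes) -> float:
    """Shannon entropy of a block, scaled so that 8 bits per byte maps to 100."""
    if not block:
        return 0.0
    total = len(block)
    counts = Counter(block)
    bits = -sum((n / total) * math.log2(n / total) for n in counts.values())
    return bits / 8.0 * 100.0


def profile(data: bytes, window: int = 1024, step: int = 1024) -> List[Tuple[int, float]]:
    """Return (offset, entropy_percent) for consecutive windows over the data."""
    return [
        (offset, entropy_percent(data[offset:offset + window]))
        for offset in range(0, len(data), step)
    ]


if __name__ == "__main__":
    # Synthetic file: two low-entropy windows followed by one uniform window.
    data = b"\x00" * 2048 + bytes(range(256)) * 4
    for offset, value in profile(data):
        print(offset, round(value, 1))
```

Plotting such a profile for a clean document and for the same document carrying an embedded executable is one way to obtain a before-and-after comparison of the kind shown on these two slides.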

SLIDE 15

Open source toolsets

SysAnalyzer

  • Automated malcode analysis system (not a sandbox!)

Malcode Analyst Pack

  • suite of tools useful for malcode analysts

VirtualBox

  • x86 and AMD64/Intel64 virtualization product

BeaEngine

  • Disassembler library for x86 and x86-64 (IA-32 and Intel 64)

Libemu

  • x86 shellcode emulation
SLIDE 16

Databases/Tools

RE-Google IDA plugin

Queries Google Code for information about the functions contained in a disassembled binary

ClamAV

Open source (GPL) antivirus engine

Malware lookup services

VirusTotal, The Malware Hash Registry, ThreatExpert, McAfee SiteAdvisor

Utilities and Assessment Tools

McAfee Free Tools, Collaborative RCE Tool Library

SLIDE 17

Extensible analysis frameworks

Cuckoo Sandbox

  • Modular malware analysis system

Zero Wine Malware Analysis Tool

  • Research project to dynamically analyse the behaviour of malware

Malheur

  • Automatic analysis of malware behaviour

Radare

  • Open source tools to disasm, debug, analyse, manipulate binary files
SLIDE 18

Prashant Gupta Security Architect, McAfee Inc.

@PrashantGupta

SLIDE 19

References

IMPORTANT: These are 3rd party websites so please review what you download for malware/suspicious content before use.

1. Cuckoo Sandbox: http://www.cuckoosandbox.org/
2. Zero Wine Malware Analysis Tool: http://zerowine.sourceforge.net/
3. Malheur: http://www.mlsec.org/malheur/
4. Radare: http://www.radare.org/
5. Interactive Disassembler Pro: http://www.hex-rays.com/
6. REGoogle: http://regoogle.carnivore.it/
7. ClamAV: http://www.clamav.net/
8. SysAnalyzer: https://github.com/dzzie/SysAnalyzer
9. Example technique from "Detecting exploits in electronic objects", Alexander Shipp: http://www.google.com/patents/US20080134333

SLIDE 20

References

10. Malcode Analyst Pack: https://github.com/dzzie/MAP
11. VirusTotal: http://www.virustotal.com/
12. The Malware Hash Registry: http://www.team-cymru.org/Services/MHR/
13. ThreatExpert: http://www.threatexpert.com
14. McAfee SiteAdvisor: http://www.siteadvisor.com/
15. McAfee Free Tools: http://www.mcafee.com/us/downloads/free-tools/
16. Collaborative RCE Tool Library: http://www.woodmann.com/collaborative/tools/index.php/Category:RCE_Tools
17. McAfee Threats Report Q4 2012: http://www.mcafee.com/uk/resources/reports/rp-quarterly-threat-q4-2012.pdf