slide-1
SLIDE 1

An Analysis of Cybercrime Activity within an Underground Gaming Forum

Jack Hughes

Cambridge Cybercrime Conference 11th July 2019

joh32@cam.ac.uk

slide-2
SLIDE 2

Background

  • Research into the role of gaming as an entry point into cybercrime is growing
  • Example: DDoS attacks-as-a-service can be used by gamers with little technical knowledge to gain an advantage over opponents
  • Exposure to, and use of, these services is believed to be a pathway into more serious cybercrime

slide-3
SLIDE 3


Figure from: National Crime Agency. (2015). Identify, Intervene, Inspire: Helping young people to pursue careers in cyber security, not cyber crime, 6.

slide-4
SLIDE 4

Related Work

  • Previous work by Pastrana et al. [1]:
      • Analysed Hack Forums to predict future key actors
      • Produced open-source research tools for analysis
  • Hack Forums is a general-purpose underground hacking forum
  • MPGH is specifically for multiplayer games
  • Both forums are available on the open web
  • Both are also included in the CrimeBB dataset, available for research use from the Cambridge Cybercrime Centre

[1] Pastrana S., Hutchings A., Caines A., Buttery P. (2018) Characterizing Eve: Analysing Cybercrime Actors in a Large Underground Forum. In: Bailey M., Holz T., Stamatogiannakis M., Ioannidis S. (eds) Research in Attacks, Intrusions, and Defenses. RAID 2018. Lecture Notes in Computer Science, vol 11050. Springer, Cham

slide-5
SLIDE 5

Ethics

  • This work has received approval from the Department of Computer Science & Technology’s ethics committee
  • Only carrying out analysis of collective behaviour, rather than identifying individuals

slide-6
SLIDE 6

Studying MPGH

  • The aim is not to carry out “predictive policing”, but to identify possible intervention points
  • This work combines prediction techniques to identify characteristics of key actors

slide-7
SLIDE 7

Key Actors

Individuals who have released tools and tutorials on the forum, or who have advertised cybercrime-related services such as DDoS-for-hire.

slide-8
SLIDE 8

MPGH Dataset

  • Snapshot of forum activity
  • 764k threads, 9.36m posts, 132k members with >5 posts


slide-9
SLIDE 9

Method used by Pastrana et al.

Pipeline diagram, by stage:

  • Input data: Key Actor Selection; Feature Collection & Selection
  • Prediction techniques: K-means Clustering, Social Network Analysis, Logistic Regression
  • Validation: Topic Analysis
  • Output predictions: Key Actor Predictions

slide-10
SLIDE 10

Adapted method for MPGH

Pipeline diagram, by stage:

  • Input data: Key Actor Selection; Feature Collection & Selection, with additional NLP-derived variables [2]
  • Prediction techniques: K-means Clustering, Social Network Analysis, Logistic Regression, Group-based Trajectory Modelling, Decision Trees, Neural Networks
  • Validation: Topic Analysis
  • Output predictions: Key Actor Predictions

[2] Caines, A., Pastrana, S., Hutchings, A., & Buttery, P. J. (2018). Automatically identifying the function and intent of posts in underground forums. Crime Science, 7(1), 19. https://doi.org/10.1186/s40163-018-0094-4

slide-11
SLIDE 11

Key Actor Selection

  • Manually selected 87 key actors, including:
      • Those who have released tools and tutorials on cracking, gaming and hacking forums
      • Those who have advertised DDoS-for-hire (booter/stresser) services
      • Those who are strongly connected to other key actors, and are involved in similar activities to key actors
  • No information relating to arrests or offending is available for this forum
      • Therefore a manual selection process was used

slide-12
SLIDE 12

Feature Collection

  • Initial features include:
      • Social network analysis (eigenvector centrality, …)
      • Activity counts (thread count on marketplace, …)
      • Activity metrics (days spent on forum, …)
      • Interaction metrics (number of citations, …)
      • Impact metrics (h-index, i10-index, …)
  • Additional features from NLP tools include (averaged over each user’s posts):
      • Sentiment (quantitative measure of emotion)
      • Post types (information request, social, tutorial, …)
      • Post intents (positive, negative, aggressive, …)
      • Addressee types
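The impact metrics listed above can be sketched as follows. This is a minimal illustration, assuming that the number of replies each of a member's threads receives plays the role of "citations"; the slides do not define the inputs, and the function names and sample counts are hypothetical.

```python
def h_index(counts):
    """Largest h such that at least h items each have a count of at least h."""
    ranked = sorted(counts, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


def i10_index(counts, threshold=10):
    """Number of items whose count reaches the threshold (10 for the i10-index)."""
    return sum(1 for count in counts if count >= threshold)


# Hypothetical replies-per-thread for one member (not taken from the dataset).
replies_per_thread = [25, 12, 10, 4, 3, 1, 0]

member_h = h_index(replies_per_thread)
member_i10 = i10_index(replies_per_thread)
print(member_h, member_i10)
```

For the sample counts, four threads have at least four replies each, and three threads have at least ten replies.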

slide-13
SLIDE 13

Feature Selection

  • Only members with more than 5 posts (‘active members’) are considered for analysis (~17% of all members)
  • Features are iteratively removed until all pairwise correlations are below 80%
      • Some techniques and analyses rely on low multicollinearity of features
  • Features are scaled
      • Some techniques rely on normalised distances between features
  • The dataset is split into train, test and validation sets
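The three preparation steps on this slide can be sketched in plain Python. Everything specific here is an assumption made for illustration: the greedy rule for choosing which feature to drop, min-max scaling as the scaling method, the 60/20/20 split, and the feature names and values. The slides state only the 80% correlation threshold.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        return 0.0
    return cov / (spread_x * spread_y)


def drop_correlated(features, threshold=0.8):
    """Remove one feature at a time until every pairwise |r| is below threshold."""
    kept = dict(features)
    while True:
        names = list(kept)
        worst = None  # (|r|, first name, second name) of the most correlated pair
        for i, first in enumerate(names):
            for second in names[i + 1:]:
                r = abs(pearson(kept[first], kept[second]))
                if r >= threshold and (worst is None or r > worst[0]):
                    worst = (r, first, second)
        if worst is None:
            return kept
        del kept[worst[2]]  # assumption: drop the later-listed feature of the pair


def min_max_scale(values):
    """Rescale values to the range [0, 1] (assumed scaling method)."""
    low, high = min(values), max(values)
    if high == low:
        return [0.0 for _ in values]
    return [(v - low) / (high - low) for v in values]


def split_members(member_ids, train=0.6, test=0.2, seed=0):
    """Shuffle and split ids into train, test and validation lists (assumed ratios)."""
    ids = list(member_ids)
    random.Random(seed).shuffle(ids)
    n_train = round(len(ids) * train)
    n_test = round(len(ids) * test)
    return ids[:n_train], ids[n_train:n_train + n_test], ids[n_train + n_test:]


# Hypothetical feature values for five members (not taken from the dataset).
features = {
    "post_count": [6, 10, 20, 40, 80],
    "thread_count": [3, 5, 10, 20, 41],
    "days_active": [300, 20, 150, 5, 90],
}
selected = drop_correlated(features)
scaled = {name: min_max_scale(values) for name, values in selected.items()}
train_ids, test_ids, validation_ids = split_members(range(10))
```

In the sample data, `thread_count` tracks `post_count` almost exactly, so it is the feature that gets removed before scaling.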

slide-14
SLIDE 14

Key Actor Insights


slide-15
SLIDE 15

Changing Interests Over Time

Figure axis: lifetime of key actor on the forum (start, middle, end)

slide-16
SLIDE 16

Logistic Regression


slide-17
SLIDE 17

Potential Key Actor Predictions


slide-18
SLIDE 18

K-means Clustering (All Members)

  • Placing members into k = 5 groups
  • Proportion of key actors per group:
      • 12 key actors of 47,437 members (0.03%)
      • 3 key actors of 3,966 members (0.08%)
      • 9 key actors of 10,545 members (0.09%)
      • 14 key actors of 21,406 members (0.07%)
      • 46 key actors of 588 members (7.82%)
  • Figure label: “Members used for prediction”
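The per-group percentages on this slide can be reproduced from the counts. The sketch below also compares the densest group with the overall share of key actors across the five groups. The totals (84 key actors, 83,942 members) are sums of the slide's figures rather than numbers stated on it, and the variable names are hypothetical.

```python
# (key actors, members) for each of the five groups on the slide
groups = [(12, 47437), (3, 3966), (9, 10545), (14, 21406), (46, 588)]


def key_actor_share(key_actors, members):
    """Key actors as a percentage of the members in a group."""
    return 100.0 * key_actors / members


shares = [round(key_actor_share(k, m), 2) for k, m in groups]

total_key_actors = sum(k for k, _ in groups)
total_members = sum(m for _, m in groups)
overall_share = key_actor_share(total_key_actors, total_members)

# How much denser in key actors the 588-member group is than the groups overall
enrichment = key_actor_share(46, 588) / overall_share

print(shares)
print(total_key_actors, total_members)
```

On these counts the 588-member group holds key actors at roughly 78 times the overall rate.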

slide-19
SLIDE 19

Social Network Analysis

Figure legend:

  • Red: General key actor
  • Blue: Distributing tools and tutorials
  • Yellow: Key actors found after interaction with other key actors
  • Green: Other forum members
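Eigenvector centrality, the social network feature named in the feature list, scores a member highly when they are connected to other highly scored members. A minimal power-iteration sketch follows, assuming an undirected interaction graph; the graph, the member names and the meaning of an edge (two members have interacted) are hypothetical, and this is not the tooling used in the talk.

```python
import math


def eigenvector_centrality(adjacency, iterations=200, tol=1e-9):
    """Power iteration on an undirected graph given as node -> set of neighbours."""
    nodes = list(adjacency)
    score = {node: 1.0 for node in nodes}
    for _ in range(iterations):
        # Iterate with (A + I): same leading eigenvector as A, but it avoids
        # the oscillation that plain power iteration shows on bipartite graphs.
        updated = {
            node: score[node] + sum(score[other] for other in adjacency[node])
            for node in nodes
        }
        norm = math.sqrt(sum(value * value for value in updated.values()))
        updated = {node: value / norm for node, value in updated.items()}
        if sum(abs(updated[node] - score[node]) for node in nodes) < tol:
            return updated
        score = updated
    return score


# Hypothetical interaction graph: an edge means two members have interacted.
graph = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a", "e"},
    "e": {"d"},
}
scores = eigenvector_centrality(graph)
most_central = max(scores, key=scores.get)
```

Member "a" comes out as most central because it is linked to the well-connected pair "b" and "c" as well as to "d".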

slide-20
SLIDE 20

Social Network Analysis

Figure legend:

  • Red: General key actor
  • Blue: Distributing tools and tutorials
  • Yellow: Key actors found after interaction with other key actors
  • Green: Other forum members
  • Pink: Predicted key actors

slide-21
SLIDE 21

Group-based trajectory modelling


This sustainer trajectory contains 28% of all key actors, and is used for prediction

slide-22
SLIDE 22

Random Forest

Decision tree diagram (each node: split rule, gini, samples, value; leaves have no split rule):

post_hack <= 1.5, gini = 0.5, samples = 33533, value = [16803, 16730]
  True: indegree_centrality <= 0.002, gini = 0.188, samples = 16928, value = [15150, 1778]
    thread_hack <= 0.5, gini = 0.074, samples = 15732, value = [15129, 603]
      post_games_hackforums_sandbox <= 33.5, gini = 0.027, samples = 15008, value = [14800, 208]
        gini = 0.0, samples = 14606, value = [14606, 0]
        gini = 0.499, samples = 402, value = [194, 208]
      gini = 0.496, samples = 724, value = [329, 395]
    gini = 0.035, samples = 1196, value = [21, 1175]
  False: h <= 2.5, gini = 0.179, samples = 16605, value = [1653, 14952]
    gini = 0.36, samples = 900, value = [688, 212]
    post_coding <= 0.5, gini = 0.115, samples = 15705, value = [965, 14740]
      thread_hack <= 0.5, gini = 0.333, samples = 2552, value = [539, 2013]
        gini = 0.498, samples = 413, value = [219, 194]
        thread_market <= 1.5, gini = 0.254, samples = 2139, value = [320, 1819]
          gini = 0.162, samples = 1540, value = [137, 1403]
          gini = 0.424, samples = 599, value = [183, 416]
      gini = 0.063, samples = 13153, value = [426, 12727]
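The gini figure at each tree node follows from that node's class counts: one minus the sum of the squared class proportions. The sketch below recomputes it for the root and its two children, using the `value` pairs from the slide. The function name is hypothetical.

```python
def gini_impurity(class_counts):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    total = sum(class_counts)
    if total == 0:
        return 0.0
    return 1.0 - sum((count / total) ** 2 for count in class_counts)


# Class counts ("value") of the root and its two children, as shown on the slide
nodes = {
    "post_hack <= 1.5": [16803, 16730],
    "indegree_centrality <= 0.002": [15150, 1778],
    "h <= 2.5": [1653, 14952],
}

for rule, value in nodes.items():
    print(rule, round(gini_impurity(value), 3))
```

A node containing a single class, such as the leaf with value [14606, 0], has an impurity of 0, which is why the tree stops splitting there.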

slide-23
SLIDE 23

Inspecting Random Forest and Neural Network Models

SHAP diagram explaining the prediction of one member


slide-24
SLIDE 24

Topic Analysis

Terms related directly to cybercrime, or to the creation of tools used for cybercrime

  • Topic analysis is computationally expensive to run for all members, but is used to verify prediction results

slide-25
SLIDE 25

Key Actor Predictions


49 members are predicted as key actors

slide-26
SLIDE 26

Summary: Key Actor Behaviour

  • Different techniques begin to explain the behaviour of key actors, showing they:
      • Have a higher h-index
      • Have been active on the forum for longer
      • Are mostly well-connected with other key actors, and have high eigenvector centrality
      • Sustain low-frequency post activity on the marketplace, and high-frequency post activity in the gaming category

slide-27
SLIDE 27

Summary: Techniques

  • Techniques should be combined to produce better predictions of, and insights into, potential key actors
  • Individual features used for prediction, including reputation, are not good indicators of key actors

slide-28
SLIDE 28

Wider Context

  • Finding common characteristics of key actor activities is useful in understanding behaviours
  • These can later be used to identify points of intervention, to deter and prevent individuals from progressing further into cybercrime
      • This could include law enforcement having a presence on the forum
      • It could also include disrupting low-level sustaining activity on the marketplace

slide-29
SLIDE 29

Jack Hughes joh32@cam.ac.uk

References

[1] Pastrana S., Hutchings A., Caines A., Buttery P. (2018) Characterizing Eve: Analysing Cybercrime Actors in a Large Underground Forum. In: Bailey M., Holz T., Stamatogiannakis M., Ioannidis S. (eds) Research in Attacks, Intrusions, and Defenses. RAID 2018. Lecture Notes in Computer Science, vol 11050. Springer, Cham

[2] Caines, A., Pastrana, S., Hutchings, A., & Buttery, P. J. (2018). Automatically identifying the function and intent of posts in underground forums. Crime Science, 7(1), 19. https://doi.org/10.1186/s40163-018-0094-4

Data used is available from the Cambridge Cybercrime Centre: https://www.cambridgecybercrime.uk/process.html