Gephi for Analysis Part I: Basics NEH Digital Culture 2020 Workshop - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Gephi for Analysis Part I: Basics

NEH Digital Culture 2020 Workshop

slide-2
SLIDE 2

What is Gephi?

  • An open source visualization tool for graphs and networks that

○ Works with large datasets
○ Enables interactive network exploration
○ Supports the visualization of evolving networks in real time

(see full list of features here)

slide-3
SLIDE 3

Social Network Analysis with Gephi

Networks have two essential components:

  • Actors who compose the network (nodes in Gephi)

  • Interactions between actors (edges in Gephi)

(see more here)

slide-4
SLIDE 4

Gephi and Social Media Research

  • Can create dynamic networks around specific hashtags, keywords, etc.

  • Shows relationships between users in the network

(Grandjean, 2013)

slide-5
SLIDE 5

Importing Data into Gephi

  • For the purposes of this session, we’ll use a YouTube network that you learned how to collect in the “Twitter and YouTube Data Gathering” session on Monday

○ If you haven’t done that part yet, do it first!

  • My seed video is from a project I am working on about a harassment controversy on YouTube

slide-6
SLIDE 6

Importing Data into Gephi

  • Once you launch Gephi, click on the “File” menu and select “Open…”

  • Choose the dataset you want

slide-7
SLIDE 7

Importing Data into Gephi

  • Click “OK” to accept the default settings

○ This window also gives you basic information about the network, including the number of nodes (actors) and edges (relationships)
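
To make the node and edge counts in that window concrete, here is a minimal sketch that derives the same two numbers from an edge list. It assumes a CSV with Source and Target columns and uses made-up video names; the file you collected in the earlier session may be in a different format, so treat this as an illustration rather than a description of what Gephi does internally.

```python
import csv
import io

# Hypothetical edge list; the names and the Source/Target layout are assumptions.
EDGE_CSV = """Source,Target
seed,video_a
seed,video_b
video_a,video_b
video_b,video_c
"""


def summarize(csv_text):
    """Return (node_count, edge_count) for an edge-list CSV."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    nodes = set()
    for row in rows:
        # Every name that appears at either end of an edge is a node.
        nodes.add(row["Source"])
        nodes.add(row["Target"])
    return len(nodes), len(rows)


node_count, edge_count = summarize(EDGE_CSV)
print(f"Nodes: {node_count}, Edges: {edge_count}")
```

If the counts you see on import are far from what you expected to collect, that is worth checking before doing any analysis.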

slide-8
SLIDE 8

Gephi for Analysis

Once you’ve imported your data, it should look something like this:

slide-9
SLIDE 9

Gephi for Analysis

It’s easy to think of Gephi as a visualization tool to present findings, but we’re going to talk about an earlier stage of the process: analysis. This starts with understanding what your data can and can’t tell you.

slide-10
SLIDE 10

Gephi for Analysis

In this case, I used a seed video and asked for all the videos that are recommended from it. We can think about this in several ways:

  • What YouTube’s algorithm thinks is similar

○ This could be used for media industry or technology-focused research

  • What people tend to watch together

○ This provides insight into people’s behavior

  • What a person might have promoted to them after watching one video

○ This can be used to think about filter bubbles, radicalization, misinformation, etc.

slide-11
SLIDE 11

Gephi for Analysis

But the YouTube tool has another option. If I had used Video Info and Comments, I would have (among other things) a network of comments on a certain video. I could use this to ask:

  • Are there a few main voices in the conversation?
  • What are the norms of social interaction? (and then compare to other interactions on different platforms, in person, etc.)

  • Are the same people interacting with each other a lot? (and then compare to other videos from the same source or that are recommended from that source)

slide-12
SLIDE 12

Gephi for Analysis

  • I’m interested in information ecosystems, so I used the video network and am thinking of it as what a person might have promoted to them after watching one video

  • I often say: your question determines your method. You’ll use different tools based on what you’re trying to find out about your data.

slide-13
SLIDE 13

Node size

  • Once the data is imported, you can start using the tools to ask questions about it.

  • Let’s start by using node size to see important participants in the network.

  • In the “Appearance” pane, select the Size tool (the circles in the upper right), then Nodes, then Ranking

  • Select “Degree” from the dropdown menu and set the Minimum to 10 and the Maximum to 50, then hit “Apply”

  • Now the graph looks like this:
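
As a rough sketch of what the Ranking settings above do to node sizes: the lowest-degree node gets the Minimum size, the highest-degree node gets the Maximum, and everything else falls in between. The function name, the sample degrees, and the straight-line mapping are assumptions for illustration, not Gephi's own code.

```python
def rank_size(degree, min_degree, max_degree, min_size=10.0, max_size=50.0):
    """Map a degree onto a node size, assuming a straight-line mapping."""
    if max_degree == min_degree:
        # Every node has the same degree, so every node gets the minimum size.
        return min_size
    fraction = (degree - min_degree) / (max_degree - min_degree)
    return min_size + fraction * (max_size - min_size)


# Hypothetical degrees for three videos.
degrees = {"seed": 12, "video_a": 2, "video_b": 7}
lowest, highest = min(degrees.values()), max(degrees.values())

sizes = {name: rank_size(d, lowest, highest) for name, d in degrees.items()}
print(sizes)  # {'seed': 50.0, 'video_a': 10.0, 'video_b': 30.0}
```

Note that the sizes are relative to this particular network: a node drawn at size 50 is the best connected here, not large in any absolute sense.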
slide-14
SLIDE 14

Node size

  • Degree shows which nodes have more or fewer connections to other nodes

  • Conceptually, for a question about an information ecosystem, the higher the degree, the more likely that video is to be shown to people watching the other videos

○ We could then say that it is likely to have a greater influence
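
Degree itself is just a count, which a short sketch can show. This assumes the network is directed, with each edge read as "the first video recommends the second"; the video names are made up. In a directed graph, the total degree is the sum of incoming and outgoing connections, and it is the incoming count that most directly reflects how often a video is recommended from others.

```python
from collections import Counter

# Hypothetical directed edges: (recommending video, recommended video).
edges = [
    ("seed", "video_a"),
    ("seed", "video_b"),
    ("video_a", "video_b"),
    ("video_c", "video_b"),
]

out_degree = Counter(source for source, _ in edges)  # recommendations given
in_degree = Counter(target for _, target in edges)   # recommendations received

nodes = set(out_degree) | set(in_degree)
# Total degree counts connections in both directions.
degree = {name: in_degree[name] + out_degree[name] for name in nodes}

for name in sorted(nodes):
    print(name, "in:", in_degree[name], "out:", out_degree[name], "total:", degree[name])
```

Here video_b has the highest degree because three other videos point to it, even though it recommends nothing itself.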

slide-15
SLIDE 15

Node labels

But what are those nodes? Click the T at the bottom left to show the labels

slide-16
SLIDE 16

Node labels

The labels are often messy, but you can:

1) Set them proportional to node size
2) Use the Label Adjust layout

slide-17
SLIDE 17

Node labels

Still kind of messy, but you can start to see the ones that are much bigger

slide-18
SLIDE 18

Node labels

  • If your labels overlap and are hard to read, use the Label Adjust layout

slide-19
SLIDE 19

Node color

  • But with everything the same color, it can be hard to see what’s going on.

  • Let’s change the color of the nodes by degree too

  • Configure the colors from the “Appearance” menu (click on the color palette icon)

○ By default, it’s shades of green, but you can double-click the arrows to select other colors

  • Select “Apply”
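
Coloring by degree works on the same principle as sizing by degree: each node's position between the lowest and highest degree picks a point on a gradient. The sketch below assumes a simple two-color blend with hypothetical light and dark greens; the actual default palette and any extra gradient stops in Gephi may differ.

```python
def blend(light, dark, fraction):
    """Blend two RGB colors; fraction 0 gives light, fraction 1 gives dark."""
    return tuple(round(a + fraction * (b - a)) for a, b in zip(light, dark))


def to_hex(rgb):
    """Format an (r, g, b) tuple as a hex color string."""
    return "#{:02x}{:02x}{:02x}".format(*rgb)


# Hypothetical gradient endpoints, chosen only for illustration.
LIGHT_GREEN = (200, 240, 200)
DARK_GREEN = (0, 100, 0)

# Hypothetical degrees for three videos.
degrees = {"seed": 12, "video_a": 2, "video_b": 7}
lowest, highest = min(degrees.values()), max(degrees.values())

colors = {
    name: to_hex(blend(LIGHT_GREEN, DARK_GREEN, (d - lowest) / (highest - lowest)))
    for name, d in degrees.items()
}
print(colors)  # {'seed': '#006400', 'video_a': '#c8f0c8', 'video_b': '#64aa64'}
```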
slide-20
SLIDE 20

Node color

  • After you apply your changes, your visualization looks more like this:
  • By making the more connected nodes both bigger and darker than the others, some of the “noise” in the graph goes away.

slide-21
SLIDE 21

Layout

  • But there’s still a lot of noisy overlap.
  • In the “Layout” pane, you can select “Expansion” or “Contraction” to manipulate the spread of your network

○ Click once for a step more expanded or contracted
○ Gone too far by accident? Switch Expand to Contract, or vice versa, and go a step or two back
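
One way to picture expansion and contraction is as scaling every node's position away from, or toward, the center of the graph. The sketch below assumes scaling around the average node position, with made-up coordinates and illustrative factors (2.0 and 0.5); it is not taken from Gephi's source, and the step size Gephi uses per click may be different.

```python
def scale_layout(positions, factor):
    """Scale node positions about their average point.

    factor > 1 spreads nodes apart; factor < 1 pulls them together.
    """
    center_x = sum(x for x, _ in positions.values()) / len(positions)
    center_y = sum(y for _, y in positions.values()) / len(positions)
    return {
        name: (center_x + (x - center_x) * factor, center_y + (y - center_y) * factor)
        for name, (x, y) in positions.items()
    }


# Hypothetical coordinates for three nodes.
positions = {"a": (0.0, 0.0), "b": (10.0, 0.0), "c": (5.0, 9.0)}

expanded = scale_layout(positions, 2.0)
print(expanded)  # {'a': (-5.0, -3.0), 'b': (15.0, -3.0), 'c': (5.0, 15.0)}

# Contracting by the inverse factor undoes the expansion.
restored = scale_layout(expanded, 0.5)
print(restored == positions)  # True
```

The relative arrangement of nodes never changes under this kind of scaling, which is why stepping back in the other direction recovers the earlier view.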

slide-22
SLIDE 22

Layout

  • Now you can see a lot of the key videos in this space, what the subject matter is, etc.
slide-23
SLIDE 23

Node labels

You can hover over a node to see which other nodes it connects to

slide-24
SLIDE 24

Heatmap

A heatmap is a quick way to see what’s close to a selected node. Select the tool, and then select your node of interest.
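
The idea of "close to a selected node" can be sketched as counting hops along edges from the selected node. This is an assumption about how closeness is measured, using made-up video names and following edges in their stated direction; the tool's exact coloring rules may differ.

```python
from collections import deque


def hop_distances(edges, start):
    """Return the number of hops from start to each reachable node."""
    neighbors = {}
    for source, target in edges:
        neighbors.setdefault(source, []).append(target)
        neighbors.setdefault(target, [])

    distances = {start: 0}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for nxt in neighbors.get(current, []):
            if nxt not in distances:
                # First visit in a breadth-first search is the shortest route.
                distances[nxt] = distances[current] + 1
                queue.append(nxt)
    return distances


# Hypothetical directed edges: (recommending video, recommended video).
edges = [
    ("seed", "video_a"),
    ("seed", "video_b"),
    ("video_a", "video_c"),
    ("video_c", "video_d"),
]

print(hop_distances(edges, "seed"))
# {'seed': 0, 'video_a': 1, 'video_b': 1, 'video_c': 2, 'video_d': 3}
```

Nodes with smaller hop counts are the ones a heatmap would show as nearest to the selection; nodes that cannot be reached at all do not appear in the result.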

slide-25
SLIDE 25

More display options

At the bottom right, you can access more settings for the appearance of edges, labels, and the graph overall
slide-26
SLIDE 26

Saving the visualizations for later

Save using the Take screenshot tool at the bottom left.

slide-27
SLIDE 27

What does it mean?

These networks, like any visualization, don’t tell you anything by themselves. You need to know at least something about the reality on the ground for these data to have meaning. In this case, knowing that this was a controversy in which a gay Latinx man was harassed by a conservative YouTuber contextualizes why there are significant nodes about President Trump, male privilege, racism, pride march, socialism, etc.

slide-28
SLIDE 28

What does it mean?

This visualization helps give me a big picture of the information ecosystem around the video I started with, to understand how someone might arrive to watch it, and where they might go next. My next steps might be going and watching some of those videos to do textual analysis, looking at the comments on the seed video with textual analysis and/or word frequency, or looking at other sources of information about this incident, like tweets or news articles