Gephi for Analysis Part I: Basics
NEH Digital Culture 2020 Workshop
What is Gephi?
An open source visualization tool for graphs and networks that:
○ Works with large datasets
○ Enables interactive network exploration
○ Supports the visualization of evolving networks in real time
(see full list of features here)
(see more here)
networks around specific hashtags, keywords, etc.
between users in the network
(Grandjean, 2013)
we’ll use a YouTube network that you learned how to collect in the “Twitter and YouTube Data Gathering” session on Monday
○ If you haven’t done that part yet, do it first!
am working on about a harassment controversy on YouTube
click on the “File” menu and select “Open…”
want
default settings
○ This window also gives you basic information about the network, including number of nodes (actors), and edges (relationships)
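The node and edge counts reported in the import window can also be checked by hand from an edge list. This is a minimal sketch, assuming a small hypothetical list of (source, target) video pairs rather than the workshop's actual data:

```python
# Hypothetical edge list: each pair means "source video recommends target video".
# These names are made up for illustration; they are not from the workshop data.
edges = [
    ("seed", "video_a"),
    ("seed", "video_b"),
    ("video_a", "video_b"),
    ("video_b", "video_c"),
]


def count_nodes_and_edges(edge_list):
    """Return (number of distinct nodes, number of edges) for an edge list."""
    nodes = set()
    for source, target in edge_list:
        nodes.add(source)
        nodes.add(target)
    return len(nodes), len(edge_list)


node_count, edge_count = count_nodes_and_edges(edges)
print(f"Nodes: {node_count}, Edges: {edge_count}")
```

If the numbers in the import window are far from what you expected when you collected the data, that is a sign to recheck the file before going further.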
Once you’ve imported your data, it should look something like this:
In this case, I used a seed video and asked for all the videos that are recommended from it. We can think about this in several ways:
○ This could be used for media industry or technology-focused research
○ This provides insight into people’s behavior
○ This can be used to think about filter bubbles, radicalization, misinformation, etc.
But the YouTube tool has another option. If I had used Video Info and Comments, I would have (among other things) a network of comments on a certain video. I could use this for:
different platforms, in person, etc.)
videos from the same source or that are recommended from that source)
and am thinking of it as what a person might have promoted to them after watching one video
different tools based on what you’re trying to find out about your data.
ask questions about it.
in the network.
the upper right), then Nodes, then Ranking
Minimum to 10 and the Maximum to 50 - then, hit “Apply”
connections to other nodes
ecosystem, the higher the degree, the more likely that video is to be shown to people watching the
○ We could then say that it is likely to have a greater influence
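To make the idea of sizing by degree concrete, here is a rough sketch of what the ranking step does conceptually: count each node's connections, then map the smallest degree to size 10 and the largest to size 50. The edge list is hypothetical, and the straight-line (linear) scaling is an assumption made for illustration; Gephi also offers other interpolation options.

```python
from collections import Counter

# Hypothetical edge list, invented for illustration only.
edges = [
    ("seed", "video_a"),
    ("seed", "video_b"),
    ("video_a", "video_b"),
    ("video_b", "video_c"),
]


def degrees(edge_list):
    """Count connections per node (incoming and outgoing edges together)."""
    counts = Counter()
    for source, target in edge_list:
        counts[source] += 1
        counts[target] += 1
    return counts


def scale_sizes(degree_by_node, min_size=10.0, max_size=50.0):
    """Map degrees onto [min_size, max_size] with assumed linear scaling."""
    lowest = min(degree_by_node.values())
    highest = max(degree_by_node.values())
    if highest == lowest:
        # Every node has the same degree, so give them all the minimum size.
        return {node: min_size for node in degree_by_node}
    span = highest - lowest
    return {
        node: min_size + (degree - lowest) / span * (max_size - min_size)
        for node, degree in degree_by_node.items()
    }


node_degrees = degrees(edges)
node_sizes = scale_sizes(node_degrees)
for node in sorted(node_sizes, key=node_sizes.get, reverse=True):
    print(node, node_degrees[node], node_sizes[node])
```

In this toy network, the node with the most connections gets the maximum size and the least connected node gets the minimum, which is why the biggest circles in the graph are the ones worth looking at first.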
But what are those nodes? Click the T at the bottom left to show the labels
The labels are often messy, but you can 1) set them proportional to node size and 2) use the Label Adjust layout
Still kind of messy, but you can start to see the ones that are much bigger
adjust layout
can be hard to see what’s going on.
by degree too
“Appearance” menu (click on the color palette icon)
○ By default, it’s shades of green, but you can double-click the arrows to select other colors
the “noise” in the graph goes away.
can select “Expansion” or “Contraction” to manipulate the spread of your network
○ Click once for a step more expanded or contracted
○ Gone too far by accident? Switch Expand to Contract, or vice versa, and go a step or two back
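One way to picture what an expansion or contraction step does is as a rescaling of node positions around the center of the layout. The sketch below assumes each step multiplies every node's offset from the center by a fixed factor; the coordinates and the factors 2.0 and 0.5 are illustrative assumptions, not Gephi's actual settings.

```python
def scale_layout(positions, factor):
    """Move every node toward or away from the layout's center.

    positions maps node name -> (x, y). A factor above 1 expands the
    layout; a factor between 0 and 1 contracts it.
    """
    count = len(positions)
    center_x = sum(x for x, _ in positions.values()) / count
    center_y = sum(y for _, y in positions.values()) / count
    return {
        node: (
            center_x + (x - center_x) * factor,
            center_y + (y - center_y) * factor,
        )
        for node, (x, y) in positions.items()
    }


# Hypothetical coordinates for two nodes.
layout = {"video_a": (0.0, 0.0), "video_b": (2.0, 0.0)}

expanded = scale_layout(layout, 2.0)      # one "expansion" step
restored = scale_layout(expanded, 0.5)    # the matching "contraction" step
print(expanded)
print(restored)
```

Because contraction by the inverse factor undoes an expansion, stepping back after going too far returns the nodes to where they were, while their relative arrangement never changes.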
You can hover over a node to see which other nodes it connects to
A heatmap is a quick way to see what’s close to a selected node. Select the tool, and then select your node of interest.
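The idea of "closeness" behind the heatmap can be sketched as counting how many steps it takes to get from the selected node to every other node. This is a minimal illustration using a breadth-first search over a hypothetical edge list; treating the edges as undirected is an assumption made for simplicity, not a description of how Gephi computes its colors.

```python
from collections import deque

# Hypothetical edge list, invented for illustration only.
edges = [
    ("seed", "video_a"),
    ("seed", "video_b"),
    ("video_a", "video_b"),
    ("video_b", "video_c"),
]


def hop_distances(edge_list, start):
    """Return the number of steps from start to each reachable node."""
    # Build a neighbor lookup, ignoring edge direction (an assumption).
    neighbors = {}
    for source, target in edge_list:
        neighbors.setdefault(source, set()).add(target)
        neighbors.setdefault(target, set()).add(source)

    distances = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for other in neighbors.get(node, ()):
            if other not in distances:
                distances[other] = distances[node] + 1
                queue.append(other)
    return distances


distances_from_seed = hop_distances(edges, "seed")
for node, steps in sorted(distances_from_seed.items(), key=lambda item: item[1]):
    print(node, steps)
```

Nodes with a small step count correspond to the "hot" area around the selected node, and nodes that never appear in the result are not reachable from it at all.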
At the bottom right, you can access more settings for appearance on edges, labels, and
Save using the Take screenshot tool at the bottom left.
These networks, like any visualization, don’t tell you anything by themselves. You need to know at least something about the reality on the ground for these data to have meaning. In this case, knowing that this controversy involved a gay Latinx man being harassed by a conservative YouTuber contextualizes why there are significant nodes about President Trump, male privilege, racism, pride march, socialism, etc.
This visualization gives me a big picture of the information ecosystem around the video I started with, helping me understand how someone might arrive at it and where they might go next. My next steps might be watching some of those videos to do textual analysis, examining the comments on the seed video with textual analysis and/or word frequency, or looking at other sources of information about this incident, like tweets or news articles.