
Who? Networks of social entities

Max Kemman, University of Luxembourg. December 13, 2016. Doing Digital History: Introduction to Tools and Technology.


  1. Who? Networks of social entities. Max Kemman, University of Luxembourg, December 13, 2016. Doing Digital History: Introduction to Tools and Technology.

  2. Today • Final assignment • Preparing the data with Palladio • (Cleaning the date column with Google Spreadsheets) • Visualising with Palladio • Next time

  3. Final assignment: Some additional info about the final assignment. The computers in the TIC-Lab are powerful enough to work with all mails in Google Spreadsheets. (You may also use Excel if you prefer, but it is more difficult for me to help when you're stuck.) Create a selection and argue why you chose this selection. Deadline: 20 January 2017, 23:59. You will receive your grades on Friday 27 January 2017.

  4. Final assignment data: All data is in Moodle in the folder Final Assignment: • allmails-metadata.csv & allmails-metadata.ods • allmails-ner.csv & allmails-metadata.ods (including mentioned people, organisations, locations) • allmails-geocoded.csv (about 108k locations) • Folder with text files per 1k

  5. Preparing the data with Palladio: To visualize the coded data, we will use Palladio: http://hdlab.stanford.edu/palladio/. First we need to prepare the data for Palladio.

  6. Loading the data: Click Start. We will use the 1000mails-cleandate.csv file (from Moodle, in the Who folder). Drag the CSV file onto the text input field. Click Load.

  7. Preparing the data: You will get a list of the columns from the spreadsheet. You can already give your project and your data table a title. Do not close or refresh this tab, or you will have to start over! Let's look at several columns.

  8. From: Sort the values by Frequency. Check the data type. Click Close.
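
As a rough illustration of what sorting the From values by frequency shows, here is a minimal Python sketch. The sample rows and addresses are hypothetical, and the code is not how Palladio itself computes the list.

```python
import csv
import io
from collections import Counter

# Hypothetical sample rows; the real CSV has more columns and many more mails.
SAMPLE_CSV = """From,To
alice@example.org,bob@example.org
alice@example.org,carol@example.org
bob@example.org,alice@example.org
"""


def sender_frequencies(csv_text, column="From"):
    """Count how often each value occurs in the given column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row[column] for row in reader)


if __name__ == "__main__":
    # most_common() lists the values from most to least frequent.
    for sender, count in sender_frequencies(SAMPLE_CSV).most_common():
        print(sender, count)
```

The most frequent senders appear first, which is a quick way to spot the central people in the mailbox.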

  9. Date: To set the data type to date, Palladio needs the format YYYY-MM-DD. In our original CSV the format also included the time of day, but here the data is already in the right format, so it is recognised automatically. See the next section for how to clean the date. Click Close.

  10. People: This column contains the named entities per email. To separate multiple people in an email, enter the delimiter | in the Multiple values box. Click Close.
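
The effect of the | delimiter can be sketched in Python as follows. The names in the example cell are hypothetical, and the trimming of spaces is an assumption of this sketch, not documented Palladio behaviour.

```python
def split_multi(value, delimiter="|"):
    """Split a multi-value cell into a list of trimmed, non-empty names."""
    return [part.strip() for part in value.split(delimiter) if part.strip()]


if __name__ == "__main__":
    # Hypothetical content of one cell in the People column.
    cell = "Jane Doe|John Smith| Jane Roe"
    print(split_multi(cell))
```

Without the delimiter, the whole cell would count as a single "person"; with it, each name becomes its own value.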

  12. Cleaning the date column with Google Spreadsheets: Here we use Google Spreadsheets, but the same is possible in Excel and LibreOffice. You can skip this for now, but it is important for the final assignment.

  13. Cleaning the Date field: Select the Date column, and go to Format > Number > More Formats > More date and time formats.

  14. Cleaning the Date field: Select the appropriate option, YYYY-MM-DD, and click Apply.

  15. Cleaning the Date field: The Date column will now have the appropriate format.
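
The same clean-up can also be sketched in Python. The slides only say that the original dates included the time of day, so the input format used here ("YYYY-MM-DD HH:MM:SS") is an assumption; adjust `input_format` to whatever your CSV actually contains.

```python
from datetime import datetime


def to_palladio_date(timestamp, input_format="%Y-%m-%d %H:%M:%S"):
    """Parse a timestamp and return only the date part as YYYY-MM-DD.

    The default input_format is a guess, not taken from the course data.
    """
    return datetime.strptime(timestamp, input_format).strftime("%Y-%m-%d")


if __name__ == "__main__":
    # Hypothetical timestamp with a time-of-day component.
    print(to_palladio_date("2001-05-14 09:30:00"))
```

`strptime` raises a `ValueError` when a value does not match the expected format, which helps to find rows that need manual attention.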

  16. Exporting the CSV: Click File > Download as > Comma-separated values (.csv, current sheet).

  17. Visualising with Palladio: Now let's look at the network by selecting Graph in the top bar. As the source, choose From and close the popup. As the target, choose To and close the popup. Wait and watch the result!
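
Conceptually, choosing From as source and To as target turns every email into a link between two nodes. A minimal Python sketch of that idea follows, with hypothetical sample rows. It keeps the direction of each link and counts repeated pairs, which is a choice made for this sketch and not necessarily how Palladio draws the graph.

```python
import csv
import io
from collections import Counter

# Hypothetical sample rows with one sender and one recipient per mail.
SAMPLE_MAILS = """From,To
alice@example.org,bob@example.org
alice@example.org,bob@example.org
bob@example.org,carol@example.org
"""


def build_edges(csv_text, source="From", target="To"):
    """Return a Counter mapping (source, target) pairs to the number of mails."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter((row[source], row[target]) for row in reader)


if __name__ == "__main__":
    for (sender, recipient), weight in build_edges(SAMPLE_MAILS).items():
        print(sender, "->", recipient, weight)
```

Each distinct pair is one link in the network, and the count indicates how many mails it represents.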

  18. Palladio Graph Settings: Try the two Highlighting check-boxes. Try Size nodes. What can we learn from this graph?

  19. Facet: To filter for certain attributes, select Facet in the lower-left corner. As a Dimension, select From and close the popup. Now you can filter for emails from only one person. Alternatively, you could filter for emails mentioning a specific person, location, or organisation. To refine even further, add more facets by selecting Dimension and choosing more options. To remove a facet, click the red trashcan in the lower-right corner.

  20. Facet selection from From column

  21. Facet selection from People column
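
What a facet does to the data can be sketched in Python as a simple row filter. The rows, column values, and the `facet` helper are hypothetical illustrations, not part of Palladio.

```python
# Hypothetical rows; the real table has more columns.
SAMPLE_ROWS = [
    {"From": "alice@example.org", "People": "Jane Doe|John Smith"},
    {"From": "bob@example.org", "People": "Jane Doe"},
    {"From": "alice@example.org", "People": ""},
]


def facet(rows, dimension, value, delimiter=None):
    """Keep rows whose cell in `dimension` equals `value`.

    With a delimiter, the cell is treated as a multi-value cell and the row
    is kept when `value` is one of its values.
    """
    kept = []
    for row in rows:
        cell = row.get(dimension, "")
        values = cell.split(delimiter) if delimiter else [cell]
        if value in values:
            kept.append(row)
    return kept


if __name__ == "__main__":
    print(len(facet(SAMPLE_ROWS, "From", "alice@example.org")))
    print(len(facet(SAMPLE_ROWS, "People", "John Smith", delimiter="|")))
```

Stacking several facets corresponds to applying the filter repeatedly, each time on the rows that survived the previous one.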

  22. Timeline: We can also create a timeline of the emails by clicking Timeline. Drag the mouse in the timeline to create a bar that acts as a filter. Then drag the bar around to see how the network develops: you could compare months or years. To remove the timeline filter, click the red trashcan in the lower-right corner.

  23. Timeline

  24. Filtering one part of the timeline

  25. Filtering another part of the timeline
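
The timeline bar acts like a date window over the emails. A minimal Python sketch, assuming a Date column that is already in YYYY-MM-DD format; the rows and the `in_window` helper are hypothetical.

```python
from datetime import date

# Hypothetical rows with cleaned dates.
SAMPLE_ROWS = [
    {"Date": "2001-01-15"},
    {"Date": "2001-02-03"},
    {"Date": "2001-03-20"},
]


def in_window(rows, start, end, column="Date"):
    """Keep rows whose date lies between start and end, both inclusive."""
    return [
        row for row in rows
        if start <= date.fromisoformat(row[column]) <= end
    ]


if __name__ == "__main__":
    february = in_window(SAMPLE_ROWS, date(2001, 2, 1), date(2001, 2, 28))
    print(february)
```

Moving the bar corresponds to shifting `start` and `end`, so comparing two months means running the filter twice with different windows.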

  26. Why filtering? The network can become quite large when you have more emails, or when you select one of the people, locations, or organisations columns in the graph. Filtering helps to make the spaghetti graph readable. See the next slide for an example of a spaghetti ball (trying to reproduce this might make your computer quite slow).

  27. Sharing: To export a graph, click the Download button in the settings (the lower one). This will export an SVG file that you can embed in your HTML report with <img src="Palladio Graph.svg" alt="graph">.

  28. To export the entire workspace, click the upper Download button. This will export a JSON file that you can load next time (see next slide).

  29. If you previously exported your workspace, you can load it by selecting "Load an existing project" and choosing the JSON file. This is also useful for sharing with project partners.

  30. For next time: 20 December, wrap-up.
