Collaborations and Partnerships: addressing the big digital challenges together



SLIDE 1

Create new possibilities for Digital Humanities Researchers with the Gale Digital Scholar Lab

Chris Houghton, Head of Digital Scholarship, International

chris.houghton@cengage.com @DHandDSatGale

Create new possibilities for ALL Humanities Researchers with the Gale Digital Scholar Lab

SLIDE 2
SLIDE 3

Collaborations and Partnerships: addressing the big digital challenges together

SLIDE 4

Good Data since 2002

SLIDE 5

Think about what existed before

  • Limited number of users
  • Difficult to use and easily damaged
  • Slow to find things
SLIDE 6

Digital Archives solved the 3 challenges of microfilm

  • Multiple simultaneous users
  • Easy to use, and a single user cannot destroy it!
  • Find relevant material quickly

SLIDE 7

250 Million pages of content later…

SLIDE 8

Jump Ahead to 2010…

SLIDE 9

Requests to Access and Improve our Data

Text Creation Partnership (TCP)

  • Manually keyed and text-encoded 2,231 of ECCO’s 150k texts
  • Allows them to be used for purposes beyond the scope of the ECCO platform, including text mining
  • TCP also worked with ProQuest’s EEBO

SLIDE 10

Requests for access beyond archives

Dr Kat Gupta, University of Nottingham

SLIDE 11

Unexpected Uses of Data

Prof Dallas Liddle, Augsburg College

SLIDE 12

A shift in workflows

Search text and retrieve images → Gather data and analyse content sets

SLIDE 13

In 2013 and 2014, we responded in 3 ways

SLIDE 14

What we learned in 5 years…

SLIDE 15

What is Digital Humanities?

Digital Humanities is the critical study of how digital technologies and methods intersect with humanities scholarship and scholarly communication. It investigates the use of digital tools and software for interpretation and analysis of humanities research questions and how digital methodologies can be used to enhance disciplines such as Art History, Classical Studies, History, Literature, Music and many others. Digital Humanities allows scholars to approach old problems with new means, or to ask new questions that could not have been asked with the traditional means of humanistic enquiry. Whatever the approach chosen, Digital Humanities remains grounded in humanities research and interests.

http://www.open.ac.uk/arts/research/digital-humanities/


SLIDE 16

‘1-9-90’ rule

Resource and technical support limits in DH have resulted in a manifestation of the ‘1-9-90’ rule:

The 90-9-1 rule for participation in an online community: http://www.nngroup.com/articles/participation-inequality/

SLIDE 17

Challenge #1: Access to relevant data in an optimised format

Slide courtesy of the COMHIS Collective: https://comhis.github.io/ http://j.mp/comhis-bsecs

SLIDE 18

Challenge #2: Hosting all of that data

SLIDE 19

Challenge #3: Tools are difficult to use

SLIDE 20


Our Solution…

SLIDE 21

Gale Digital Scholar Lab

  • Access to a broad range of texts from Gale Primary Source collections
  • Access to powerful text mining tools
  • Construct custom content sets across Gale’s collections
  • Organise and manage research
  • Integrated help and instructional materials
  • Export OCR, statistical data and visualisations in standard formats

TDM Research Environment

SLIDE 22

Developed with DH Scholars and experts

SLIDE 23

Digital Scholar Lab solves the 3 Challenges

  • Access to relevant data in an optimised format
  • Somewhere to host that data
  • Familiar, powerful tools

SLIDE 24

A Story of Exploration…

SLIDE 25

Pat Houghton

SLIDE 26

Shortening the “80%” of research time

SLIDE 27

OCR Confidence ≠ OCR Accuracy
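
The distinction on this slide can be illustrated with a small sketch. Confidence is a score the OCR engine reports about its own output; accuracy can only be measured by comparing that output with a hand-keyed transcription. The example below assumes accuracy is measured as one minus the character error rate (edit distance divided by ground-truth length). The sample strings and the confidence value are made up for illustration and do not come from any Gale archive.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete from a
                            curr[j - 1] + 1,      # insert into a
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def character_accuracy(ocr_text: str, ground_truth: str) -> float:
    """1 - character error rate, measured against a hand-keyed transcription."""
    if not ground_truth:
        raise ValueError("ground_truth must not be empty")
    errors = edit_distance(ocr_text, ground_truth)
    return max(0.0, 1.0 - errors / len(ground_truth))


# Hypothetical page: the engine is confident, but reads every long s as "f".
ground_truth = "the house of the rising sun"
ocr_text = "the houfe of the rifing fun"
reported_confidence = 0.95  # illustrative value only

accuracy = character_accuracy(ocr_text, ground_truth)
print(f"reported confidence: {reported_confidence:.2f}")
print(f"measured accuracy:   {accuracy:.2f}")
```

The two numbers answer different questions, which is why a high confidence score alone does not guarantee text that is reliable enough for analysis.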

SLIDE 28

Exposing the OCR process, flaws and all

SLIDE 29

Clean OCR data at Scale
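
As a rough idea of what cleaning OCR text at scale can involve, here is a minimal sketch that applies the same rules to every document in a collection. The rules shown (rejoining words hyphenated across line breaks, modernising the long s, dropping stray symbols, collapsing whitespace) are common examples chosen for illustration. They are assumptions, not a description of the cleaning options the Lab actually provides, and `clean_ocr` and `clean_corpus` are hypothetical names.

```python
import re


def clean_ocr(text: str) -> str:
    """Apply a few illustrative cleaning rules to one OCR document."""
    # Rejoin words split by a hyphen at a line break: "digi-\ntal" -> "digital".
    # Note: this also joins genuine compounds such as "well-\nknown".
    text = re.sub(r"(\w)-\s*\n\s*(\w)", r"\1\2", text)
    # Replace the archaic long s (U+017F) with a modern "s".
    text = text.replace("\u017f", "s")
    # Drop anything that is not a word character, whitespace or basic punctuation.
    text = re.sub(r"[^\w\s.,;:!?'\"()-]", "", text)
    # Collapse runs of whitespace (including line breaks) into single spaces.
    text = re.sub(r"\s+", " ", text)
    return text.strip()


def clean_corpus(documents):
    """Apply the same rules to every document, so cleaning is repeatable."""
    return [clean_ocr(doc) for doc in documents]


print(clean_ocr("The digi-\ntal  archive~ was \u017flow."))
```

Keeping the rules in one function means the same configuration can be re-run over a whole content set and documented alongside the results.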

SLIDE 30

Create Bespoke Content Sets

SLIDE 31

Analysis Tools available in the Gale Lab

  • Topic Modelling (MALLET)*
  • Clustering (scikit-learn)*
  • Parts-of-Speech Tagger (spaCy)*
  • Sentiment Analysis (OpenNLP)*
  • Named Entity Recognition (spaCy)*
  • Ngrams (Lucene)

*Denotes open source

Parts of Speech Tagger, Named Entity Recognition, Frequencies & Ngrams
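
To give a sense of what the simplest of these tools computes, the sketch below counts ngram frequencies in plain Python. The slide lists Lucene for ngrams in the Lab; this example does not use Lucene and only illustrates the underlying idea. The tokenizer, the function names and the sample sentence are assumptions made for illustration.

```python
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and keep runs of letters and apostrophes."""
    return re.findall(r"[a-z']+", text.lower())


def ngram_counts(text: str, n: int = 2) -> Counter:
    """Count every run of n consecutive tokens in the text."""
    if n < 1:
        raise ValueError("n must be at least 1")
    tokens = tokenize(text)
    # zip over n staggered copies of the token list to form the ngrams.
    grams = zip(*(tokens[i:] for i in range(n)))
    return Counter(" ".join(gram) for gram in grams)


sample = "the daily mail and the daily telegraph"
counts = ngram_counts(sample, n=2)
print(counts.most_common(3))
```

On real OCR text the counts are only as good as the tokens, which is one reason cleaning usually comes before frequency analysis.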

SLIDE 32

What does this mean for Uncle Pat?

Daily Telegraph, Daily Mail

SLIDE 33

Topic Modelling for real Exploration

SLIDE 34

Gale’s DS Lab: for the ‘1%’, ‘9%’ and ‘90%’

  • Facilitates the creation and use of data sets
  • Can be used in teaching to analyse data collectively
  • Allows everyone to build up data analysis and digital humanities skills

SLIDE 35

Thank You!

Chris Houghton, Head of Digital Scholarship, International

chris.houghton@cengage.com @DHandDSatGale