Pathways for Discovery of Free Software
Katherine Thornton, Morane Gruenpeter


SLIDE 1

Pathways for Discovery of Free Software

Katherine Thornton, Morane Gruenpeter

Wikidata for Digital Preservation
katherine.thornton@yale.edu, morane@softwareheritage.org

25 March 2018

Katherine Thornton, Morane Gruenpeter Pathways for Discovery of Free Software 25 March 2018 1 / 39

SLIDE 2

Outline

1. Introduce software preservation as the key to software discovery
2. Describe and document software through metadata
3. Explore the landscape of software ontologies and vocabularies
4. Discover free software in Wikidata


SLIDE 4

Wikidata for Digital Preservation working group

We are...
  • cultural heritage technologists with a mission
  • metadata enthusiasts
  • free software advocates

Goals
  • document digital artifacts, software and software source code
  • promote open standards and libre community vision
  • contribute metadata for software preservation



SLIDE 7

Software Preservation

Many cultural heritage organizations have software in their collections

What do we want/need to preserve?
  • software binaries
  • software source code
  • hardware
  • is this enough?

What are the risks of preserving software without its context?
  • Sometimes different software resources have the same name
  • Software description practices are variable
  • Without metadata about the compatible environment, we lack information necessary for reproducibility
  • If we preserve information about environments, we can emulate or virtualize


SLIDE 9

Why now

Looking at the past
  • a lot of old software is misplaced, lost, or behind barriers, but...
  • most legacy founders and the current maintainers are still here, and willing to share
  • it is urgent to collect their knowledge
Only a few years left.

Looking at the future
  • software development skyrockets
  • it is essential to preserve the software in its context for the future
Every year that goes by makes the problem worse.

SLIDE 10

What are the limitations today with software preservation and discovery?

SLIDE 11

Software Heritage in a nutshell

Collect, preserve and share the source code of all the software
Preserving our heritage, enabling better software for all


SLIDE 13

Outline

1. Introduce software preservation as the key to software discovery
2. Describe and document software through metadata
3. Explore the landscape of software ontologies and vocabularies
4. Discover free software in Wikidata

SLIDE 14

Describe and document software

Why is it important?
  • without description and documentation these resources can't be located, reused, extended, etc.

Use cases
  • unique identification
  • software reproducibility
  • browse source code with context information
  • software citation: cite and be cited
  • semantic search: find software by author, version, keywords



SLIDE 17

Describe and document software

What is software?

Software as a concept
  • project or entity
  • the community around the project
  • the software idea/algorithms/solutions

Software artifact
  • each revision in source code form
  • binaries produced for different environments




SLIDE 21

Describe and document software

Where can we locate software metadata?

With the source code
  • part of the software repository
  • software deposits

In the source code
  • package management
  • CodeMeta.json file (for citation)

Registries/Catalogs
  • Wikidata
  • FSF-directory
  • libraries.io
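As an illustration of metadata kept in the source code, here is a minimal Python sketch that writes a CodeMeta-style record. The project name, author, repository URL and license shown are hypothetical placeholders, and the field selection is only an assumption about what a small project might record; the CodeMeta vocabulary defines many more terms.

```python
import json

# Hypothetical project details, invented purely for illustration.
codemeta = {
    "@context": "https://doi.org/10.5063/schema/codemeta-2.0",
    "@type": "SoftwareSourceCode",
    "name": "example-tool",
    "version": "1.0.0",
    "license": "https://spdx.org/licenses/GPL-3.0-or-later",
    "codeRepository": "https://example.org/example-tool.git",
    "author": [
        {"@type": "Person", "givenName": "Ada", "familyName": "Example"}
    ],
}


def to_codemeta_json(record):
    """Serialize a metadata record as pretty-printed JSON text."""
    return json.dumps(record, indent=2, sort_keys=True)


if __name__ == "__main__":
    # The text printed here is what would be saved as codemeta.json
    # at the root of the source tree.
    print(to_codemeta_json(codemeta))
```

Keeping the record next to the code means the description travels with every copy of the repository, including archived ones.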

SLIDE 22

With what terms should we describe a software artifact?

SLIDE 23

Outline

1. Introduce software preservation as the key to software discovery
2. Describe and document software through metadata
3. Explore the landscape of software ontologies and vocabularies
4. Discover free software in Wikidata


SLIDE 25

Software ontologies and vocabularies

“Ontologies are agreements, made in a social context, to accomplish some objectives. It’s important to understand those objectives, and be guided by them.”

  • T. Gruber, The Pragmatics of Ontology, 2003

LOV: Linked Open Vocabularies
“Vocabularies provide the semantic glue enabling data to become meaningful data.”

SLIDES 26 to 31

The landscape of software ontologies

(six figure slides; the images are not included in this transcript)

SLIDE 32

Outline

1. Introduce software preservation as the key to software discovery
2. Describe and document software through metadata
3. Explore the landscape of software ontologies and vocabularies
4. Discover free software in Wikidata

SLIDE 33

Wikidata

This knowledge base of structured data is:

  • Machine-readable linked open data
  • Editable by anyone with Internet access
  • Designed to support both human and algorithmic curation
  • Fully-versioned wiki

Wikidata is built from free software: MediaWiki and Wikibase

SLIDE 34

SPARQL query for the software licenses of the software that powers Wikidata

Figure: Try this query!
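The query from the figure is not reproduced in this transcript. The Python sketch below shows one way such a lookup might be written against the public Wikidata Query Service. It assumes that licenses are recorded with property P275 and that Q83 is the item for MediaWiki; the function names and the User-Agent string are made up for this example, and the QID for Wikibase would need to be looked up and added.

```python
import json
import urllib.parse
import urllib.request

WDQS_ENDPOINT = "https://query.wikidata.org/sparql"


def license_query(software_qids):
    """Build a SPARQL query listing the license (assumed P275) of each item."""
    values = " ".join(f"wd:{qid}" for qid in software_qids)
    return (
        "SELECT ?software ?softwareLabel ?license ?licenseLabel WHERE {\n"
        f"  VALUES ?software {{ {values} }}\n"
        "  ?software wdt:P275 ?license .\n"
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }\n'
        "}"
    )


def run_query(query):
    """Send the query to the public endpoint and return the result rows."""
    params = urllib.parse.urlencode({"query": query, "format": "json"})
    request = urllib.request.Request(
        WDQS_ENDPOINT + "?" + params,
        # Illustrative User-Agent; replace with one describing your tool.
        headers={"User-Agent": "slides-example/0.1 (illustrative sketch)"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)["results"]["bindings"]


if __name__ == "__main__":
    # Q83 is assumed to be MediaWiki; add the Wikibase QID after checking it.
    for row in run_query(license_query(["Q83"])):
        print(row["softwareLabel"]["value"], "-", row["licenseLabel"]["value"])
```

Passing the items as a list keeps the query reusable for any other software stack whose licenses one wants to inspect.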

SLIDE 35

Status of software data in Wikidata

  • 66,000 instances of software in Wikidata today
  • OpenHub external ids for 208 software items
  • FSF external ids for 1,428 software items (15,000+ resources total)
  • Framalibre external ids for 336 software items
  • Lots more work for us to do

SLIDE 36

Free software in Wikidata

A bubble chart of licenses by number of software titles

Figure: Try this query!

SLIDE 37

What software available under a free software license can I use to open .obj files?

Figure: Try this query!
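The original query is not included in this transcript, so the following Python sketch is only one plausible formulation. It assumes that file extensions are stored with P1195 without the leading dot, that P1072 links software to the formats it can read, and that free software items are instances of Q341 or one of its subclasses; the slide's own query may have tested the license instead. The helper names are invented for the example.

```python
import urllib.parse


def readers_query(extension):
    """Build a SPARQL query for free software that reads a file extension."""
    ext = extension.lstrip(".").lower()
    if not ext.isalnum():
        raise ValueError(f"unexpected characters in extension: {extension!r}")
    return (
        "SELECT DISTINCT ?software ?softwareLabel ?format ?formatLabel WHERE {\n"
        f'  ?format wdt:P1195 "{ext}" .\n'          # assumed: file extension
        "  ?software wdt:P1072 ?format ;\n"         # assumed: readable file format
        "            wdt:P31/wdt:P279* wd:Q341 .\n"  # assumed: free software
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }\n'
        "}"
    )


def try_it_url(query):
    """Return a link that opens the query in the Wikidata Query Service editor."""
    return "https://query.wikidata.org/#" + urllib.parse.quote(query, safe="")


if __name__ == "__main__":
    print(try_it_url(readers_query(".obj")))
```

The printed link can be pasted into a browser to run or edit the query interactively.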

SLIDE 38

Create an image grid of GNU/Linux distributions

Figure: Try this query!

SLIDE 39

Wikidata is a linking hub for external IDs

  • External IDs have their own data type
  • 58 percent of WD properties are external ids (2570/4439)

Figure: All external ids for NumPy
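A listing like the one in the figure can be requested without naming any specific property, because the Wikibase RDF model marks each property with its data type. The Python sketch below builds such a query and also reproduces the percentage quoted above. The function names are invented for the example, and the QID used for NumPy (Q197520) is an assumption that should be checked on wikidata.org before use.

```python
import re


def external_ids_query(qid):
    """Build a SPARQL query returning every external identifier of one item."""
    if not re.fullmatch(r"Q[1-9]\d*", qid):
        raise ValueError(f"not a Wikidata item ID: {qid!r}")
    return (
        "SELECT ?property ?propertyLabel ?id WHERE {\n"
        f"  wd:{qid} ?claim ?id .\n"
        "  ?property wikibase:directClaim ?claim ;\n"
        "            wikibase:propertyType wikibase:ExternalId .\n"
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }\n'
        "}\n"
        "ORDER BY ?propertyLabel"
    )


def external_id_share(external, total):
    """Return the share of external-ID properties as a whole percentage."""
    return round(100 * external / total)


if __name__ == "__main__":
    print(external_id_share(2570, 4439))   # figures quoted on the slide
    print(external_ids_query("Q197520"))   # assumed QID for NumPy
```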

SLIDE 40

FSF Resource Directory external ID in Wikidata

If a person or software agent visits the Wikidata item for a piece of software that is also in the FSF Resource Directory, they will find a URL to the page on the FSF wiki.

SLIDE 41

Scroll down to the bottom of the page to see the identifiers

Click on the link next to Free Software Directory ID

SLIDE 42

Here is the wiki page for Avogadro in the FSF Resource Directory
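The link followed in this walkthrough is derived from the identifier stored on the Wikidata item. A minimal Python sketch of that step follows, assuming the directory serves entries at https://directory.fsf.org/wiki/ followed by the identifier; the URL pattern and the helper name fsf_directory_url are assumptions made for illustration, not details given on the slides.

```python
import urllib.parse

# Assumed URL pattern for Free Software Directory entries.
FSF_DIRECTORY_BASE = "https://directory.fsf.org/wiki/"


def fsf_directory_url(entry_id):
    """Turn a Free Software Directory identifier into a page URL."""
    # Wiki page titles conventionally use underscores in place of spaces.
    return FSF_DIRECTORY_BASE + urllib.parse.quote(entry_id.replace(" ", "_"))


if __name__ == "__main__":
    print(fsf_directory_url("Avogadro"))
```

A software agent can apply the same rule to any item that carries the identifier, which is what makes the external ID useful beyond manual browsing.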

SLIDE 43

Wikidata crosswalks multiple external IDs

We can write queries to return lots of different identifiers for software.

Figure: Try this query!
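The query behind the figure is not part of this transcript. As a rough Python sketch of the idea, the function below builds a query that returns, side by side, the identifiers of every item that carries all of the listed external-ID properties. The default property IDs (P2537 for the Free Software Directory and P1972 for Open Hub) are assumptions that should be verified on Wikidata, and the variable names are arbitrary.

```python
import re

# Assumed property IDs; verify them on wikidata.org before relying on them.
DEFAULT_ID_PROPERTIES = {"fsf": "P2537", "openhub": "P1972"}


def crosswalk_query(id_properties=None, limit=100):
    """Build a SPARQL query listing items that have every given identifier."""
    props = dict(DEFAULT_ID_PROPERTIES if id_properties is None else id_properties)
    for name, pid in props.items():
        if not re.fullmatch(r"[A-Za-z_]\w*", name):
            raise ValueError(f"invalid variable name: {name!r}")
        if not re.fullmatch(r"P[1-9]\d*", pid):
            raise ValueError(f"not a Wikidata property ID: {pid!r}")
    variables = " ".join(f"?{name}" for name in props)
    patterns = "\n".join(
        f"  ?software wdt:{pid} ?{name} ." for name, pid in props.items()
    )
    return (
        f"SELECT ?software ?softwareLabel {variables} WHERE {{\n"
        f"{patterns}\n"
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }\n'
        "}\n"
        f"LIMIT {int(limit)}"
    )


if __name__ == "__main__":
    print(crosswalk_query())
```

Requiring every identifier keeps the result to true crosswalk rows; wrapping a pattern in OPTIONAL would instead keep items that lack that identifier.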

SLIDE 44

Status of file format data in Wikidata

  • 2,852 instances of file format in Wikidata today
  • PRONOM has 1,553 entries: of these we have 1,023 file formats with PUID external ids
  • 2,629 items connected to Just Solve the File Format Problem ids

SLIDE 45

Links between descriptive and technical metadata

Bubble chart of software titles by number of readable file formats

Figure: Try this query!

SLIDE 46

Machine-Readable "alternative to" website powered by Wikidata

  • multiple serialization formats (.ttl, .rdf, .json)
  • reverse look-up tool for file formats to software
  • unique Wikidata URIs so we can discuss software without confusion

SLIDE 47

Thank You!

Questions?

SLIDE 48

Want to contribute?

  • Wikidata WikiProject Informatics
  • Software Heritage forge

SLIDE 49

Acknowledgements

Wikidata for Digital Preservation working group
  • Kat
  • Morane
  • Carl Wilson, Open Preservation Foundation
  • Thomas Ledoux, National Library of France
  • Bertrand Caron, National Library of France
  • Ross Spencer, Artefactual Systems
  • John Samuel, École Supérieure de Chimie Physique Électronique de Lyon
  • David Russo, British Library

SLIDE 50

Acknowledgements

Communities
  • Wikidata community
  • Software Heritage
  • Wikicite
  • Yale University Library
  • Council on Library and Information Resources
  • Crossminer