SLIDE 1

Course in Data Information Literacy

a Progress Report

YOUR NAME: GARY SEITZ
CONTACT: SEITZ@GEO.UZH.CH

SLIDE 2

Lecture 1

INNOPOOL WORKSHOP REPRODUCIBLE RESEARCH: SESSION 1

SLIDE 3

Outline

SLIDE 4

Data Lifecycle

SLIDE 5

Summary Lecture 1

SLIDE 6

Lecture 2

SLIDE 7

Outline

SLIDE 8

Components of a Data Management Plan

SLIDE 9

Summary Lecture 2

SLIDE 10

Lecture 3

SLIDE 11

Outline

SLIDE 12

Data Repositories

SLIDE 13

Summary Lecture 3

  • re3data.org is a global registry of research data repositories from different academic disciplines.
  • Depending on the research discipline, data can often be accessed through one or more data centers (or repositories).
  • These repositories may have specific requirements regarding:
      • subject/research domain
      • data re-use and access
      • file format and data structure
      • metadata

SLIDE 14

Lecture 4

SLIDE 15

Outline

SLIDE 16

Informal Workflows

SLIDE 17

Summary Lecture 4

  • Use of informal or formal workflows for documenting process metadata ensures reproducibility, repeatability, and validation.
  • Be aware of best practices when designing data file structures.
  • Choose a data entry method that allows some validation of data as it is entered.
  • Consider investing time in learning how to use a database if datasets are large or complex.
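
Validating data as it is entered can be sketched in a few lines of Python. This is only an illustration: the field names (`site`, `date`, `temp_c`) and the plausible temperature range are assumptions made for the example, not part of the course material.

```python
from datetime import date


def validate_record(record):
    """Return a list of problems found in one data-entry record.

    The fields and limits checked here are hypothetical examples.
    """
    problems = []

    # A site code is assumed to be a non-empty string.
    if not str(record.get("site", "")).strip():
        problems.append("site is missing")

    # The date must be a valid ISO 8601 calendar date.
    try:
        date.fromisoformat(str(record.get("date", "")))
    except ValueError:
        problems.append("date is not a valid ISO 8601 date")

    # The temperature must be numeric and inside an assumed plausible range.
    try:
        temp = float(record.get("temp_c", ""))
        if not -60.0 <= temp <= 60.0:
            problems.append("temp_c is outside -60..60")
    except (TypeError, ValueError):
        problems.append("temp_c is not a number")

    return problems
```

A record is accepted only when the returned list is empty; otherwise the problems can be shown to the person entering the data before the row is saved.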

SLIDE 18

Lecture 5

SLIDE 19

Outline

SLIDE 20

File naming strategies

SLIDE 21

Summary Lecture 5

When naming & organizing your files and folders…

  • be thoughtful
  • be consistent
  • document your approach

Write down All The Things
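
One way to stay consistent is to generate file names from a single documented rule instead of typing them by hand. The sketch below assumes a hypothetical convention of `project_description_YYYYMMDD_vNN.ext`; the function name and the pattern are illustrative choices, not a convention prescribed by the course.

```python
import re
from datetime import date


def build_filename(project, description, when, version, extension):
    """Build a name like 'project_description_YYYYMMDD_v01.ext'."""

    def clean(part):
        # Lowercase the text and replace every run of characters other
        # than a-z and 0-9 with a single hyphen.
        return re.sub(r"[^a-z0-9]+", "-", part.lower()).strip("-")

    return "{}_{}_{}_v{:02d}.{}".format(
        clean(project),
        clean(description),
        when.strftime("%Y%m%d"),  # fixed-width date, so names sort by time
        version,
        extension.lower().lstrip("."),
    )


print(build_filename("Innopool", "Soil Samples", date(2015, 3, 2), 1, ".CSV"))
# prints innopool_soil-samples_20150302_v01.csv
```

Keeping the rule in one function also documents the approach: the docstring states the pattern, and every file name produced follows it.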

SLIDE 22

Lecture 6

SLIDE 23

Outline

SLIDE 24

Preferred Formats

SLIDE 25

Summary Lecture 6

  • Programs and file formats change over time such that old files may become difficult to read.
  • Files in rare formats should be converted into common formats whenever possible.
  • Files should not be password protected, encrypted or compressed.
  • File formats should be very common and, if possible, follow standards that are open and not proprietary.
  • For storage over more than ten years, we recommend the file formats PDF/A, ASCII text, TIFF, PNG, SVG and JPEG2000.
  • For large data collections you can get an overview of your file formats using the free Java application DROID.
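
For a quick first look at which formats a collection contains, file extensions can be tallied with a few lines of Python. This is a rough sketch and not a substitute for DROID: an extension is only a hint at the real format, and the function name `count_extensions` is made up for this example.

```python
from collections import Counter
from pathlib import Path


def count_extensions(root):
    """Count the files under root by lowercase extension ('' if none)."""
    counts = Counter()
    for path in Path(root).rglob("*"):
        if path.is_file():
            counts[path.suffix.lower()] += 1
    return counts


if __name__ == "__main__":
    # List the extensions found in the current directory, most common first.
    for extension, number in count_extensions(".").most_common():
        print(extension or "(none)", number)
```

Extensions that show up only a handful of times are a reasonable starting point when looking for rare formats to convert.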

SLIDE 26

Lecture 7

SLIDE 27

Outline

SLIDE 28

Distribution: data discovery

SLIDE 29

Summary Lecture 7

  • Metadata is documentation of data.
  • A metadata record captures critical information about the content of a dataset.
  • Metadata allows data to be discovered, accessed, and re-used.
  • A metadata standard provides structure and consistency to data documentation.
  • Standards and tools vary: select according to defined criteria such as data type, organizational guidance, and available resources.
  • Metadata is of critical importance to data developers, data users, and organizations.
  • Metadata can be effectively used for:
      • data distribution
      • data management
      • project management

Metadata completes a dataset.

Creating robust metadata is in your OWN best interest!
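
A metadata record can be as simple as a small structured file stored next to the data. The sketch below is an assumption-laden illustration: the field list in `REQUIRED_FIELDS` and all values in `record` are placeholders, and a real project would take its fields from the metadata standard it has selected.

```python
import json

# Assumed minimal field set, for illustration only.
REQUIRED_FIELDS = ("title", "creator", "date_created", "description", "license")


def missing_fields(record):
    """Return the required fields that are absent or empty in record."""
    return [name for name in REQUIRED_FIELDS
            if not str(record.get(name, "")).strip()]


# Placeholder values; replace them with the details of the actual dataset.
record = {
    "title": "Example dataset",
    "creator": "Example Researcher",
    "date_created": "2015-03-02",
    "description": "Placeholder description of what the dataset contains.",
    "license": "CC0-1.0",
}

if not missing_fields(record):
    # Serialize the record so it can be saved alongside the data files.
    print(json.dumps(record, indent=2))
```

Checking for missing fields before the record is written gives the consistency that a metadata standard is meant to provide, even in this reduced form.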

SLIDE 30

Lecture 8

SLIDE 31

Outline

SLIDE 32

Backups vs. Archiving

SLIDE 33

Summary Lecture 8

  • Backups are copies of original files, while archives are for the preservation of files.
  • There are many reasons to perform backups, but the primary one is to prevent data loss.
  • Consider how often to perform backups, where to back up, how accessible the backups are when you need them, and how long to keep the files.
  • Check for backups on outdated media and test backups often!
  • Data preservation is more than just backing up and archiving your files:
      • Evaluate and refresh storage regularly.
      • Protect the integrity of your data at the file level.
      • Protect the hardware and software systems you use.
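
File-level integrity is commonly checked with checksums: record a hash for each file once, then recompute it later and compare. The sketch below uses SHA-256 from Python's standard library; the function names and the manifest layout (relative path mapped to hex digest) are assumptions made for this example.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 digest."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def changed_files(root, manifest):
    """List manifest entries whose file is now missing or has changed."""
    current = build_manifest(root)
    return sorted(name for name in manifest
                  if current.get(name) != manifest[name])
```

Running `changed_files` against a backup copy with the manifest built from the originals is one way to test a backup: an empty list means every recorded file is present and unchanged.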

SLIDE 34

Lecture 9

SLIDE 35

Outline

SLIDE 36

Select archive location

SLIDE 37

Summary Lecture 9

Data preservation has many potential benefits:

  • Enable longitudinal and synthesis studies
  • Leverage investments in data collection

Additional considerations

  • Preservation of data in multiple forms (raw, processed, derived, etc.) may be warranted in many circumstances.
      • Which version(s) to keep?
      • How to make relationships among versions clear?
  • Considerations of cost and reproducibility are key in considering policies for preservation of experimental data.
      • How to assess the long-term value of data?
      • What documentation is necessary to enable data replication?

SLIDE 38

Lecture 10

SLIDE 39

Outline

SLIDE 40

Value of Data Sharing to the Public

SLIDE 41

Summary Lecture 10

  • Data sharing adds value to the data.
  • It is the responsibility of the researcher to share their data.
  • Metadata supports data accountability, liability, and usability.
  • Sponsors expect, and some require, data to be shared.
  • Data sharing is essential to the advancement of science.
  • Data citation makes it easy for others to attribute your data directly to you.

SLIDE 42

Lecture 11

SLIDE 43

Outline

SLIDE 44

Deidentification of Research Data

SLIDE 45

Summary Lecture 11

  • Know who can claim ownership over products.
  • Assign licenses or waivers appropriately.
  • Behave ethically and in accordance with established community norms.
  • Respect the licenses or waivers assigned.
  • Protect privacy and confidentiality.
  • Know what restrictions and liabilities apply to products and processes.

SLIDE 46

Thank you for all your comments!
