Flint: a format and file validation tool
Alecs Geuder
SCAPE Information Day, British Library, UK, 14th July 2014
Introducing Flint: Presentation Structure
- Introduction
- What does Flint do?
- Flint-the-API
- Policy-focused Validation
- Flint-the-toolbox
- Format-specific Implementations
- How we are using it
- Mini-demo
Introduction
- Flint facilitates file/format validation against a policy
- The code centres on individual file format modules (pdf, epub, ..)
- Comes with a command line interface, GUIs and a Hadoop MapReduce program
Flint – core features
Schematron Policy
- categoryA – three tests
- categoryB – two tests
- categoryC – two tests
Input file
- of a specific format
PolicyAware
- uses schematron-utils
Format-specific Implementation
- canCheck
- validationResult
- ..
<checkresult file="input file" result="passed">
  <categoryA result="passed"/>
  <categoryB result="failed"/>
  <testB.1 result="failed"/>
  <testB.2 result="failed"/>
  <categoryC result="passed"/>
</checkresult>
Configuration, code, and a set of internal & third-party tools
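The category-by-category result in the <checkresult> example can be illustrated with a small Python sketch. It is not Flint's code: Flint expresses its policies in Schematron, whereas this sketch swaps in plain Python predicates to show the idea. The names POLICY and check, the test names, and the keys of the facts dictionary are all hypothetical. The slides do not say how the overall result attribute is derived, so the sketch leaves it out.

```python
import xml.etree.ElementTree as ET

# Hypothetical policy: category name -> {test name -> predicate over file facts}.
# Plain functions stand in for the Schematron rules that Flint actually uses.
POLICY = {
    "categoryA": {
        "testA.1": lambda facts: facts["well_formed"],
    },
    "categoryB": {
        "testB.1": lambda facts: not facts["encrypted"],
        "testB.2": lambda facts: facts["fonts_embedded"],
    },
}


def check(file_name, facts, policy=POLICY):
    """Build a <checkresult> element from per-category test outcomes."""
    root = ET.Element("checkresult", {"file": file_name})
    for category, tests in policy.items():
        outcomes = {name: bool(test(facts)) for name, test in tests.items()}
        passed = all(outcomes.values())
        ET.SubElement(root, category, {"result": "passed" if passed else "failed"})
        # As on the slide, individual tests are listed only when they fail,
        # as siblings that follow their category element.
        for name, ok in outcomes.items():
            if not ok:
                ET.SubElement(root, name, {"result": "failed"})
    return root


# Assumed facts about one input file (illustrative values only).
facts = {"well_formed": True, "encrypted": True, "fonts_embedded": False}
result = check("input file", facts)
print(ET.tostring(result, encoding="unicode"))
```

A category passes only if every one of its tests passes, which is why a single failing test is enough to mark categoryB as failed here.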
The Flint ecosystem
Entry points
- CLI
- GUIs
- Hadoop
CORE
- config
- code
Format/Feature-specific Implementations
- PDF
- EPUB
- Geospatial data
- …
- DRM-detection PDF/EPUB
Input file → <checkResult>
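One way to picture how a core could hand an input file to the right format module is the minimal Python sketch below. Only the canCheck idea comes from the slides. The class names, the MODULES registry, select_module and the file-extension test are assumptions made for illustration; a real module would more plausibly inspect the file's content.

```python
from pathlib import Path


class PdfModule:
    """Hypothetical format module for PDF files."""
    name = "pdf"

    def can_check(self, path):
        # Stand-in test: a real implementation would look at the content.
        return Path(path).suffix.lower() == ".pdf"


class EpubModule:
    """Hypothetical format module for EPUB files."""
    name = "epub"

    def can_check(self, path):
        return Path(path).suffix.lower() == ".epub"


# Hypothetical registry of the modules known to the core.
MODULES = [PdfModule(), EpubModule()]


def select_module(path, modules=MODULES):
    """Return the first module that claims the file, or None if none does."""
    for module in modules:
        if module.can_check(path):
            return module
    return None


chosen = select_module("report.PDF")
print(chosen.name if chosen else "no module available")
```

Under this arrangement, supporting a new format means adding one more module to the registry, while the entry points (CLI, GUIs, Hadoop) stay unchanged.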
How we are using it
- To deal with non-print legal deposit
What’s next
- Add further format/feature modules (geospatial, etc.)