Who needs Pandoc when you have Sphinx? An exploration of the parsers and builders of the Sphinx documentation tool



slide-1
SLIDE 1

Who needs Pandoc when you have Sphinx?

An exploration of the parsers and builders of the Sphinx documentation tool FOSDEM 2019 @stephenfin

slide-2
SLIDE 2

reStructuredText, Docutils & Sphinx

1

slide-3
SLIDE 3

A little reStructuredText
=========================

This document demonstrates some basic features of |rst|. You can use
**bold** and *italics*, along with ``literals``. It’s quite similar to
`Markdown`_ but much more extensible. CommonMark may one day approach
this [1]_, but today is not that day. `Docutils`__ does all this for us.

.. |rst| replace:: **reStructuredText**
.. _Markdown: https://daringfireball.net/projects/markdown/
.. [1] https://talk.commonmark.org/t/444
__ http://docutils.sourceforge.net/

💿 intro.rst

slide-5
SLIDE 5

A little reStructuredText

This document demonstrates some basic features of reStructuredText. You can use bold and italics, along with literals. It’s quite similar to Markdown but much more extensible. CommonMark may one day approach this [1], but today is not that day.

Docutils does all this for us.

[1] https://talk.commonmark.org/t/444/

💿 intro.html

slide-6
SLIDE 6

A little more reStructuredText
==============================

The extensibility really comes into play with directives and roles. We
can do things like link to RFCs (:RFC:`2324`, anyone?) or generate some
more advanced formatting (I do love me some H\ :sub:`2`\ O).

.. warning::

   The power can be intoxicating.

Of course, all the stuff we showed previously *still works!* The only
limit is your imagination/interest.

💿 more.rst

slide-8
SLIDE 8

A little more reStructuredText

The extensibility really comes into play with directives and roles. We can do things like link to RFCs (RFC 2324, anyone?) or generate some more advanced formatting (I do love me some H₂O).

Warning: The power can be intoxicating.

Of course, all the stuff we showed previously still works! The only limit is your imagination/interest.

💿 more.html

slide-9
SLIDE 9

reStructuredText provides the syntax
Docutils provides the parsing and file generation

slide-10
SLIDE 10

reStructuredText provides the syntax
Docutils provides the parsing and file generation
Sphinx provides the cross-referencing

slide-11
SLIDE 11

Docutils uses readers, parsers, transforms, and writers
Docutils works with individual files

slide-12
SLIDE 12

Docutils uses readers, parsers, transforms, and writers
Docutils works with individual files

Sphinx uses readers, parsers, transforms, writers, and builders
Sphinx works with multiple, cross-referenced files

slide-13
SLIDE 13

How Does Docutils Work?

2

slide-14
SLIDE 14

About me
========

Hello, world. I am **bold** and *maybe* I am brave.

💿 index.rst

slide-15
SLIDE 15

$ rst2html index.rst

slide-16
SLIDE 16

About me

Hello, world. I am bold and maybe I am brave.

💿 index.html
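The same conversion can be driven from Python rather than the `rst2html` command. The following is a minimal sketch, assuming Docutils is installed and that `publish_parts` with the `html` writer is an acceptable stand-in for the command-line tool; the `SOURCE` string simply inlines the `index.rst` shown earlier.

```python
from docutils.core import publish_parts

# Inlined copy of index.rst, so the sketch needs no files on disk.
SOURCE = """\
About me
========

Hello, world. I am **bold** and *maybe* I am brave.
"""

# publish_parts returns a dict of string fragments instead of a whole page.
parts = publish_parts(SOURCE, writer_name='html')

print(parts['title'])  # the promoted document title
print(parts['body'])   # the HTML for the document body
```

`publish_parts` is convenient when the output is going to be embedded in another page; `publish_file` or `publish_string` produce a complete document instead.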

slide-17
SLIDE 17

index.rst index.html

slide-18
SLIDE 18

$ rst2pseudoxml index.rst

slide-19
SLIDE 19

<document ids="about-me" names="about\ me" source="index.rst" title="About me">
    <title>
        About me
    <paragraph>
        Hello, world. I am
        <strong>
            bold
         and
        <emphasis>
            maybe
         I am brave.

💿 index.xml

slide-20
SLIDE 20

$ ./docutils/tools/quicktest.py index.rst

slide-21
SLIDE 21

<document source="index.rst">
    <section ids="about-me" names="about\ me">
        <title>
            About me
        <paragraph>
            Hello, world. I am
            <strong>
                bold
             and
            <emphasis>
                maybe
             I am brave.

💿 index.xml
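The two pseudo-XML trees differ in whether the lone section title has been promoted to the document title. This is a minimal sketch of that difference, assuming Docutils is installed. Switching off the `doctitle_xform` setting is used here only as an approximation of the `quicktest.py` output; it is not a claim about how that script works internally.

```python
from docutils.core import publish_doctree

SOURCE = """\
About me
========

Hello, world. I am **bold** and *maybe* I am brave.
"""

# Default settings: the DocTitle transform promotes the only section title.
promoted = publish_doctree(SOURCE).pformat()

# Same source with title promotion disabled: the <section> node survives.
unpromoted = publish_doctree(
    SOURCE, settings_overrides={'doctitle_xform': False}
).pformat()

print(promoted)
print(unpromoted)
```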

slide-22
SLIDE 22

Readers (reads from source and passes to the parser)
Parsers (creates a doctree model from the read file)
Transforms (add to, prune, or otherwise change the doctree model)
Writers (converts the doctree model to a file)
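To make the four stages concrete, here is a toy pipeline written with the standard library only. It is a sketch of the idea, not Docutils code: the `Node` class and the `read`, `parse`, `transform`, and `write` functions are hypothetical names invented for this illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A tiny stand-in for a doctree node (not the Docutils class)."""
    tag: str
    text: str = ''
    children: list = field(default_factory=list)


def read(source):
    """Reader: fetch the raw text (here the source is already a string)."""
    return source.strip()


def parse(text):
    """Parser: build a tree from 'title, underline, paragraph' input."""
    title, _underline, *body = text.splitlines()
    paragraph = ' '.join(line for line in body if line.strip())
    return Node('document', children=[Node('title', title),
                                      Node('paragraph', paragraph)])


def transform(tree):
    """Transform: change the tree in place (upper-case the title)."""
    for child in tree.children:
        if child.tag == 'title':
            child.text = child.text.upper()
    return tree


def write(tree):
    """Writer: serialise the tree to an output format (HTML here)."""
    html = {'title': 'h1', 'paragraph': 'p'}
    return '\n'.join(f'<{html[c.tag]}>{c.text}</{html[c.tag]}>'
                     for c in tree.children)


SOURCE = """
About me
========

Hello, world.
"""

output = write(transform(parse(read(SOURCE))))
print(output)
```

Swapping `parse` changes the input language and swapping `write` changes the output format, while the tree in the middle stays the same.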

slide-24
SLIDE 24

What About Sphinx?

3

slide-25
SLIDE 25

About me
========

Hello, world. I am **bold** and *maybe* I am brave.

💿 index.rst

slide-26
SLIDE 26

master_doc = 'index'

💿 conf.py

slide-27
SLIDE 27

$ sphinx-build -b html . _build

slide-28
SLIDE 28

About me

Hello, world. I am bold and maybe I am brave.

💿 index.html

slide-29
SLIDE 29

Readers (reads from source and passes to the parser)
Parsers (creates a doctree model from the read file)
Transforms (add to, prune, or otherwise change the doctree model)
Writers (converts the doctree model to a file)

slide-30
SLIDE 30

Builders (call the readers, parsers, transforms, writers)
Application (calls the builder(s))
Environment (stores information for future builds)

slide-32
SLIDE 32

...
updating environment: 1 added, 0 changed, 0 removed
reading sources... [100%] index
looking for now-outdated files... none found
pickling environment... done
checking consistency... done
preparing documents... done
generating indices... done
writing additional pages... done
copying static files... done
copying extra files... done
dumping search index in English (code: en) ... done
dumping object inventory... done
build succeeded.

slide-33
SLIDE 33

Docutils provides almost 100 node types

document (the root element of the document tree)
section (the main unit of hierarchy for documents)
title (stores the title of a document, section, ...)
subtitle (stores the subtitle of a document)
paragraph (contains the text and inline elements of a single paragraph)
block_quote (used for quotations set off from the main text)
bullet_list (contains list_item elements marked with bullets)
note (an admonition, a distinctive and self-contained notice)
...
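These node types are ordinary Python classes in `docutils.nodes`, so a doctree can also be assembled by hand. A minimal sketch, assuming Docutils is installed, that rebuilds the "About me" section without running any parser:

```python
from docutils import nodes

# A section with the same ids/names the reStructuredText parser would assign.
section = nodes.section(ids=['about-me'], names=['about me'])
section += nodes.title(text='About me')

# A paragraph mixing plain text with inline elements.
paragraph = nodes.paragraph()
paragraph += nodes.Text('Hello, world. I am ')
paragraph += nodes.strong(text='bold')
paragraph += nodes.Text(' and ')
paragraph += nodes.emphasis(text='maybe')
paragraph += nodes.Text(' I am brave.')
section += paragraph

# pformat() gives the same pseudo-XML view that rst2pseudoxml prints.
print(section.pformat())
```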

slide-34
SLIDE 34

Sphinx provides its own custom node types

translatable (indicates content which supports translation)
not_smartquotable (indicates content which does not support smart-quotes)
toctree (node for inserting a "TOC tree")
versionmodified (version change entry)
seealso (custom "see also" admonition)
productionlist (grammar production lists)
manpage (reference to a man page)
pending_xref (cross-reference that cannot be resolved yet)
...

slide-35
SLIDE 35

Docutils provides dozens of transforms

DocTitle (promote title elements to the document level)
DocInfo (transform initial field lists to docinfo elements)
SectNum (assign numbers to the titles of document sections)
Contents (generate a table of contents from a document or sub-node)
Footnotes (resolve links to footnotes, citations and their references)
Messages (place system messages into the document)
SmartQuotes (replace ASCII quotation marks with typographic form)
Admonitions (transform specific admonitions to generic ones)
...

slide-36
SLIDE 36

Sphinx also provides additional transforms

MoveModuleTargets (promote initial module targets to the section title)
AutoNumbering (register IDs of tables, figures and literal blocks to assign numbers)
CitationReferences (replace citation references with pending_xref nodes)
SphinxSmartQuotes (custom SmartQuotes to avoid transform for some extra node types)
DoctreeReadEvent (emit doctree-read event)
ManpageLink (find manpage section numbers and names)
SphinxDomains (collect objects to Sphinx domains for cross referencing)
Locale (replace translatable nodes with their translated doctree)
...

slide-37
SLIDE 37

Using Additional Parsers

4

slide-38
SLIDE 38

There are a number of parsers available

reStructuredText (part of docutils)
Markdown (part of recommonmark)
Jupyter Notebooks (part of nbsphinx)

slide-39
SLIDE 39

# About me

Hello, world. I am **bold** and *maybe* I am brave.

💿 index.md

slide-40
SLIDE 40

$ cm2html index.md

slide-41
SLIDE 41

About me

Hello, world. I am bold and maybe I am brave.

💿 index.html

slide-42
SLIDE 42

$ cm2pseudoxml index.md

slide-43
SLIDE 43

<document ids="about-me" names="about\ me" source="index.md" title="About me">
    <title>
        About me
    <paragraph>
        Hello, world. I am
        <strong>
            bold
         and
        <emphasis>
            maybe
         I am brave.

💿 index.xml

slide-44
SLIDE 44

# About me

Hello, world. I am **bold** and *maybe* I am brave.

💿 index.md

slide-45
SLIDE 45

from recommonmark.parser import CommonMarkParser

master_doc = 'index'
source_parsers = {'.md': CommonMarkParser}
source_suffix = '.md'

💿 conf.py
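The `source_parsers` option has been deprecated since Sphinx 1.8, and later releases may no longer accept it. A hedged sketch of an equivalent `conf.py` for newer versions, assuming a recommonmark release that can be loaded as a Sphinx extension; this is not the configuration shown in the talk:

```python
# conf.py: sketch for newer Sphinx releases (an assumption, not from the talk)

# Loading recommonmark as an extension registers its Markdown parser.
extensions = ['recommonmark']

master_doc = 'index'

# Map each file suffix to the name of the parser that should handle it.
source_suffix = {
    '.rst': 'restructuredtext',
    '.md': 'markdown',
}
```

With this mapping, `.rst` and `.md` files can live side by side in the same project.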

slide-47
SLIDE 47

$ sphinx-build -b html . _build

slide-48
SLIDE 48

About me

Hello, world. I am bold and maybe I am brave.

💿 index.html

slide-49
SLIDE 49

Using Additional Writers, Builders

5

slide-50
SLIDE 50

Docutils provides a number of in-tree writers

docutils_xml (simple XML document tree Writer)
html4css1 (simple HTML document tree Writer)
latex2e (LaTeX2e document tree Writer)
manpage (simple man page Writer)
null (a do-nothing Writer)
odf_odt (ODF Writer)
pep_html (PEP HTML Writer)
pseudoxml (simple internal document tree Writer)
...

slide-51
SLIDE 51

$ rst2html5 index.rst

slide-52
SLIDE 52

from docutils.core import publish_file
from docutils.writers import html5_polyglot

with open('README.rst', 'r') as source:
    publish_file(source=source, writer=html5_polyglot.Writer())

slide-53
SLIDE 53

$ pip install rst2txt

slide-54
SLIDE 54

$ rst2txt index.rst

slide-55
SLIDE 55

from docutils.core import publish_file
import rst2txt

with open('README.rst', 'r') as source:
    publish_file(source=source, writer=rst2txt.Writer())

slide-56
SLIDE 56

Sphinx provides its own in-tree builders

html (generates output in HTML format)
qthelp (like html but also generates Qt help collection support files)
epub (like html but also generates an epub file for eBook readers)
latex (generates output in LaTeX format)
text (generates text files with most rST markup removed)
man (generates manual pages in the groff format)
texinfo (generates Texinfo files for use with makeinfo)
xml (generates Docutils-native XML files)
...

slide-57
SLIDE 57

$ sphinx-build -b html . _build

slide-58
SLIDE 58

$ pip install sphinx-asciidoc

slide-59
SLIDE 59

$ sphinx-build -b asciidoc . _build

slide-60
SLIDE 60

Writing Your Own Parsers, Writers

6

slide-61
SLIDE 61

Reading (reads from source and passes to the parser)
Parsing (creates a doctree model from the read file)
Transforming (applies transforms to the doctree model)
Writing (converts the doctree model to a file)

slide-62
SLIDE 62

from docutils import parsers


class Parser(parsers.Parser):

    supported = ('null',)

    config_section = 'null parser'
    config_section_dependencies = ('parsers',)

    def parse(self, inputstring, document):
        pass

💿 docutils/parsers/null.py

slide-63
SLIDE 63

We’re not covering Compilers 101

slide-64
SLIDE 64

We’re not covering Compilers 101 We’re going to cheat 😅

slide-65
SLIDE 65

<?xml version="1.0" encoding="utf-8"?>
<document source="index.rst">
    <section ids="about-me" names="about\ me">
        <title>About me</title>
        <paragraph>Hello, world. I am <strong>bold</strong> and <emphasis>maybe</emphasis> I am brave.</paragraph>
    </section>
</document>

💿 index.xml

slide-66
SLIDE 66

import xml.etree.ElementTree as ET

from docutils import nodes, parsers  # nodes is used by _parse()


class Parser(parsers.Parser):

    supported = ('xml',)

    config_section = 'XML parser'
    config_section_dependencies = ('parsers',)

    def parse(self, inputstring, document):
        xml = ET.fromstring(inputstring)
        self._parse(document, xml)

    ...

💿 xml_parser.py

slide-67
SLIDE 67

    ...

    def _parse(self, node, xml):
        for attrib, value in xml.attrib.items():
            # NOTE(stephenfin): this isn't complete!
            setattr(node, attrib, value)

        for child in xml:
            child_node = getattr(nodes, child.tag)(text=child.text)
            node += self._parse(child_node, child)

        if xml.tail:
            return node, nodes.Text(xml.tail)

        return node

💿 xml_parser.py
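The `xml.tail` check matters because ElementTree attaches the text that follows an inline element to that element rather than to its parent. The sketch below uses only the standard library to show where each piece of text lives; `walk` is a hypothetical helper written for this illustration and is not part of the parser above.

```python
import xml.etree.ElementTree as ET

SOURCE = """\
<document source="index.rst">
  <section ids="about-me">
    <title>About me</title>
    <paragraph>Hello, world. I am <strong>bold</strong> and <emphasis>maybe</emphasis> I am brave.</paragraph>
  </section>
</document>
"""


def walk(element, depth=0):
    """Yield (depth, kind, value) tuples in document order."""
    yield depth, 'element', element.tag
    # .text is the text before the first child element.
    if element.text and element.text.strip():
        yield depth + 1, 'text', element.text
    for child in element:
        yield from walk(child, depth + 1)
        # .tail is the text after the child, which belongs to this element.
        if child.tail and child.tail.strip():
            yield depth + 1, 'text', child.tail


events = list(walk(ET.fromstring(SOURCE)))
for depth, kind, value in events:
    label = f'<{value}>' if kind == 'element' else value.strip()
    print('    ' * depth + label)
```

The printed outline has the same shape as the pseudo-XML output from earlier, which is why the doctree can be rebuilt from this XML form.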

slide-68
SLIDE 68

Reading (reads from source and passes to the parser)
Parsing (creates a doctree model from the read file)
Transforming (applies transforms to the doctree model)
Writing (converts the doctree model to a file)

slide-69
SLIDE 69

from docutils import writers


class Writer(writers.Writer):

    supported = ('pprint', 'pformat', 'pseudoxml')

    config_section = 'pseudoxml writer'
    config_section_dependencies = ('writers',)

    output = None

    def translate(self):
        self.output = self.document.pformat()

💿 docutils/writers/pseudoxml.py

slide-71
SLIDE 71

from docutils import nodes, writers


class TextWriter(writers.Writer):

    supported = ('text',)

    config_section = 'text writer'
    config_section_dependencies = ('writers',)

    output = None

    def translate(self):
        visitor = TextTranslator(self.document)
        self.document.walkabout(visitor)
        self.output = visitor.body

💿 rst2txt/writer.py

slide-73
SLIDE 73

...


class TextTranslator(nodes.NodeVisitor):

    ...

    def visit_document(self, node):
        pass

    def depart_document(self, node):
        pass

    def visit_section(self, node):
        pass

💿 rst2txt/writer.py
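To show what such a translator does once the method bodies are filled in, here is a small working sketch, assuming Docutils is installed. `SketchTextTranslator` is a hypothetical class written for this illustration and is not the rst2txt implementation. It subclasses `nodes.SparseNodeVisitor` so that node types it does not handle are silently ignored.

```python
from docutils import nodes
from docutils.core import publish_doctree

SOURCE = """\
About me
========

Hello, world. I am **bold** and *maybe* I am brave.
"""


class SketchTextTranslator(nodes.SparseNodeVisitor):
    """Hypothetical translator: titles get an underline, markup is dropped."""

    def __init__(self, document):
        super().__init__(document)
        self.lines = []    # finished output lines
        self.current = []  # text fragments of the block being visited

    def visit_Text(self, node):
        self.current.append(node.astext())

    def _flush(self):
        self.lines.append(''.join(self.current))
        self.current = []

    def depart_title(self, node):
        self._flush()
        self.lines.append('=' * len(self.lines[-1]))
        self.lines.append('')

    def depart_paragraph(self, node):
        self._flush()
        self.lines.append('')


doctree = publish_doctree(SOURCE)
visitor = SketchTextTranslator(doctree)
doctree.walkabout(visitor)  # calls visit_* on entry and depart_* on exit
print('\n'.join(visitor.lines))
```

A writer's `translate` method would run the same `walkabout` call and store the joined lines as its output.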

slide-74
SLIDE 74

from sphinx.builders import Builder


class TextBuilder(Builder):

    name = 'text'

    def __init__(self):
        pass

    def get_outdated_docs(self):
        pass

    def get_target_uri(self):
        pass

💿 sphinx/builders/text.py

slide-75
SLIDE 75

    ...

    def prepare_writing(self, docnames):
        pass

    def write_doc(self, docnames, doctree):
        pass

    def finish(self):
        pass

💿 sphinx/builders/text.py

slide-76
SLIDE 76

Wrap Up

7

slide-77
SLIDE 77

Sphinx and Docutils share most of the same architecture…

Readers
Parsers
Transforms
Writers

slide-78
SLIDE 78

…but Sphinx builds upon and extends Docutils’ core functionality

Builders
Application
Environment

slide-79
SLIDE 79

There are multiple writers/builders provided by both…

HTML
Manpage
LaTeX
XML
texinfo (Sphinx only)
ODF (Docutils only)
...

slide-80
SLIDE 80

...and many more writers/builders available along with readers

Markdown (reader and builder)
Text (writer)
ODF (builder)
AsciiDoc (builder)
EPUB2 (builder)
reStructuredText (builder)
...

slide-81
SLIDE 81

It’s possible to write your own

slide-83
SLIDE 83

Fin

🎊

slide-85
SLIDE 85

Useful Packages and Tools

  • recommonmark (provides a Markdown reader)
  • sphinx-markdown-builder (provides a Markdown builder)
  • sphinx-asciidoc (provides an AsciiDoc builder)
  • rst2txt (provides a plain text writer)
  • asciidoclive.com (online AsciiDoc Editor)
  • rst.ninjs.org (online rST Editor)
slide-86
SLIDE 86

References

  • Quick reStructuredText
  • Docutils Reference Guide

    ○ reStructuredText Markup Specification
    ○ reStructuredText Directives
    ○ reStructuredText Interpreted Text Roles

  • Docutils Hacker’s Guide
  • PEP-258: Docutils Design Specification
slide-87
SLIDE 87

References

  • A brief tutorial on parsing reStructuredText (reST) -- Eli Bendersky
  • A lion, a head, and a dash of YAML -- Stephen Finucane (🌠)
  • OpenStack + Sphinx In A Tree -- Stephen Finucane (🌠)
  • Read the Docs & Sphinx now support Commonmark -- Read the Docs Blog