Relationship between R Markdown, Knitr, Pandoc, and Bookdown

Pandoc

Pandoc is a document converter. It can convert from a number of different markup formats to many other formats, such as .docx and .pdf.

Pandoc is a command line tool with no GUI. It is an independent piece of software, separate from R. However, it comes bundled with RStudio because rmarkdown relies on it for document conversion.

Pandoc not only converts documents, but it also adds functionality on top of the base markdown language to enable it to support more complex outputs.
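
As a rough illustration of Pandoc being a standalone command line tool, the sketch below calls it from R with system2(), much as you could from any shell. It assumes a pandoc executable is on the PATH (the copy bundled with RStudio is often not, so the call is skipped if none is found). The file names are arbitrary temporary files.

```r
# Write a small Markdown file to convert.
md_file   <- tempfile(fileext = ".md")
html_file <- tempfile(fileext = ".html")
writeLines(c("# Hello", "", "Some *emphasised* text."), md_file)

# Look for a pandoc executable on the PATH; Sys.which() returns "" if none is found.
pandoc <- unname(Sys.which("pandoc"))

if (nzchar(pandoc)) {
  # Equivalent to running: pandoc -f markdown -t html -o <html_file> <md_file>
  system2(pandoc, c("-f", "markdown", "-t", "html",
                    "-o", shQuote(html_file), shQuote(md_file)))
  cat(readLines(html_file), sep = "\n")
} else {
  message("No pandoc executable found on the PATH; skipping the conversion.")
}
```

No R package is involved in the conversion itself; R is only used here to write the input file and launch the command.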

R Markdown

R Markdown is based on markdown:

Markdown (markup language)

Markdown is a lightweight markup language with plain text formatting syntax designed so that it can be converted to HTML and many other formats. A markdown file is a plain text file that is typically given the extension .md.

Like other markup languages such as HTML and LaTeX, it is completely independent of R.

There is no clearly defined Markdown standard. This has led to fragmentation as different vendors write their own variants of the language to correct flaws or add missing features.

Markdown (R package)

markdown is an R package which converts Markdown (.md) files, such as those produced by knitting an .Rmd file, into HTML. It is the predecessor of rmarkdown, which offers much more functionality. It is no longer recommended for use.

R Markdown (markup language)

R Markdown is an extension of the Markdown syntax. R Markdown files are plain text files that typically have the file extension .Rmd. The extension allows R code to be embedded in them so that it can be executed later, when the file is processed.

Because they are expected to be processed by the rmarkdown package, it is possible to use Pandoc's Markdown syntax in an R Markdown file. This is an extension of the original Markdown syntax that provides additional functionality like raw HTML/LaTeX and tables.
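
To make the syntax concrete, here is a small sketch that writes a toy .Rmd file from R and prints it. The content is made up for illustration (it uses the built-in cars data set); it mixes plain Markdown, inline R code, an R code chunk, and a Pandoc pipe table. Nothing is executed yet, since the file is just plain text at this stage.

```r
# Build the chunk fence programmatically so this listing nests cleanly.
fence <- strrep("`", 3)

rmd_lines <- c(
  "# Cars",                                  # plain Markdown heading
  "",
  "The data set has `r nrow(cars)` rows.",   # inline R code
  "",
  paste0(fence, "{r speed-summary}"),        # start of an R code chunk
  "summary(cars$speed)",
  fence,                                     # end of the chunk
  "",
  "| Column | Meaning           |",          # Pandoc pipe table
  "|--------|-------------------|",
  "| speed  | speed             |",
  "| dist   | stopping distance |"
)

rmd_file <- tempfile(fileext = ".Rmd")
writeLines(rmd_lines, rmd_file)

# Show the file exactly as it would appear in a text editor.
cat(readLines(rmd_file), sep = "\n")
```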

R Markdown (package)

The R package rmarkdown is a library which processes and converts .Rmd files into a number of different formats.

The core function is rmarkdown::render, which stands on the shoulders of Pandoc. This function 'renders the input file to the specified output format using pandoc. If the input requires knitting then knitr::knit is called prior to pandoc.'
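
The 'knit first, then Pandoc' sequence can be sketched by doing the two steps by hand and comparing with a single render() call. This is only an approximation of what render() does: as far as I understand it, render() also applies the output format's template and options, so the two HTML files will not be identical. The sketch assumes the rmarkdown and knitr packages are installed and that rmarkdown can find Pandoc; the file names are made up.

```r
library(rmarkdown)  # knitr must also be installed

# A toy .Rmd file with one code chunk.
fence <- strrep("`", 3)
rmd_file <- file.path(tempdir(), "render-demo.Rmd")
writeLines(c("# Demo", "", paste0(fence, "{r}"), "1 + 1", fence), rmd_file)

if (pandoc_available()) {
  # Route 1: the two steps done by hand.
  # Step 1: knitr runs the code and writes plain Markdown.
  md_file <- knitr::knit(rmd_file,
                         output = file.path(tempdir(), "render-demo.md"),
                         quiet = TRUE)
  # Step 2: Pandoc converts the Markdown to HTML.
  html_manual <- file.path(tempdir(), "render-demo-manual.html")
  pandoc_convert(md_file, to = "html", output = html_manual)

  # Route 2: render() knits and then calls Pandoc itself.
  html_rendered <- render(rmd_file, output_format = "html_document", quiet = TRUE)

  print(c(manual = html_manual, rendered = html_rendered))
} else {
  message("rmarkdown could not find Pandoc; skipping.")
}
```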

The rmarkdown package's aim is simply to provide reasonably good defaults and an R-friendly interface to customize Pandoc options.

The YAML metadata seen at the top of R Markdown files is specifically there to pass options to rmarkdown::render, to guide the build process.
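
To show how the YAML header maps onto render options, the sketch below writes a file with a small header and reads it back with rmarkdown::yaml_front_matter(), which returns the metadata as an R list without rendering anything. The title and the toc setting are made-up examples.

```r
library(rmarkdown)

rmd_file <- tempfile(fileext = ".Rmd")
writeLines(c(
  "---",
  'title: "Stopping distances"',
  "output:",
  "  html_document:",
  "    toc: true",
  "---",
  "",
  "# Introduction"
), rmd_file)

# Read the YAML header back as a nested R list.
meta <- yaml_front_matter(rmd_file)
str(meta$output)

# With this header, render(rmd_file) builds an html_document with a table of
# contents. Roughly the same request can be made from R code instead:
#   render(rmd_file, output_format = html_document(toc = TRUE))
```

Seen this way, the nested keys under output are simply arguments to the output format function, in this case html_document().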

Note that rmarkdown only deals with Markdown syntax. If you want to convert an .Rhtml or .Rnw file, you should use the convenience functions built into knitr, such as knitr::knit2html and knitr::knit2pdf.

Knitr

Knitr takes a plain text document with embedded code, executes the code and 'knits' the results back into the document.

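For example, it converts an R Markdown (.Rmd) file into a standard Markdown (.md) file, an .Rnw file into a .tex file, and an .Rhtml file into an .html file.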

The core function is knitr::knit. By default it looks at the input document and tries to guess what type it is (.Rnw, .Rmd, etc.).

This core function performs three roles:

- A source parser, which looks at the input document and detects which parts are code that the user wants to be evaluated.
- A code evaluator, which evaluates this code.
- An output renderer, which writes the results of evaluation back to the document in a format which is interpretable by the raw output type. For instance, if the input file is an .Rmd, the output renderer marks up the output of code evaluation in .md format.
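
The three roles above can be seen by knitting a toy .Rmd file and looking at the .md file that comes out. This is a minimal sketch assuming the knitr package is installed; the file names are made up. No Pandoc is needed, because knitr stops at Markdown.

```r
library(knitr)

fence <- strrep("`", 3)
rmd_file <- file.path(tempdir(), "knit-demo.Rmd")
md_file  <- file.path(tempdir(), "knit-demo.md")

# Prose plus one R chunk; the parser has to tell the two apart.
writeLines(c(
  "The answer is computed below.",
  "",
  paste0(fence, "{r}"),
  "6 * 7",
  fence
), rmd_file)

# Parse, evaluate the chunk, and write the results back as Markdown.
knit(rmd_file, output = md_file, quiet = TRUE)

# The prose is unchanged; the chunk is now a Markdown code block
# followed by a second block holding the printed result.
cat(readLines(md_file), sep = "\n")
```

The output is still Markdown, not HTML or PDF; turning it into a final document is a separate step.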

Converting between document formats

Knitr does not convert between document formats, such as converting an .md file into .html. It does, however, provide some convenience functions to help you use other libraries to do this. If you are using the rmarkdown package, you should ignore this functionality because it has been superseded by rmarkdown::render.

An example is knitr::knit2pdf, which will: 'Knit the input Rnw or Rrst document, and compile to PDF using texi2pdf or rst2pdf'.

A potential source of confusion is knitr::knit2html, which "is a convenience function to knit the input markdown source and call markdown::markdownToHTML to convert the result to HTML." This is now legacy functionality because the markdown package has been superseded by the rmarkdown package. See this note.

Bookdown

The bookdown package is built on top of R Markdown, and inherits the simplicity of the Markdown syntax, as well as the possibility of multiple types of output formats (PDF/HTML/Word/…).

It offers features like multi-page HTML output, numbering and cross-referencing figures/tables/sections/equations, and inserting parts/appendices, and it imports the GitBook style (https://www.gitbook.com) to create elegant and appealing HTML book pages.
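
As a rough sketch of what this looks like in practice, the code below builds a two-chapter toy book with a cross-reference between the chapters. It assumes the bookdown package is installed and that rmarkdown can find Pandoc; the directory name and chapter content are made up, and a real book would normally keep each chapter in its own .Rmd file.

```r
# Start from a fresh directory so no files from an earlier build get in the way.
book_dir <- file.path(tempdir(), "toy-book")
unlink(book_dir, recursive = TRUE)
dir.create(book_dir)

# index.Rmd holds the book metadata and, here, both chapters.
# "\\@ref(methods)" writes \@ref(methods), bookdown's cross-reference syntax.
writeLines(c(
  "---",
  'title: "A Toy Book"',
  "site: bookdown::bookdown_site",
  "---",
  "",
  "# Introduction {#intro}",
  "",
  "See Chapter \\@ref(methods) for details.",
  "",
  "# Methods {#methods}",
  "",
  "This chapter is referenced from Chapter \\@ref(intro)."
), file.path(book_dir, "index.Rmd"))

can_build <- requireNamespace("bookdown", quietly = TRUE) &&
  rmarkdown::pandoc_available()

if (can_build) {
  old_wd <- setwd(book_dir)
  # Build the multi-page GitBook-style HTML version of the book.
  bookdown::render_book("index.Rmd", output_format = "bookdown::gitbook")
  setwd(old_wd)
  print(list.files(file.path(book_dir, "_book"), pattern = "\\.html$"))
} else {
  message("bookdown or Pandoc is not available; skipping the build.")
}
```

By default the HTML pages are written to a _book subdirectory, with the chapters split across separate pages.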

Author by

RobinL

Data scientist/engineer for UK government

Updated on July 08, 2022

Comments

  • RobinL
    RobinL almost 2 years

    What is the relationship between the functionality of R Markdown, Knitr, Pandoc, and Bookdown?

    Specifically what is the 'division of labour' between these packages in converting markup documents with embedded R code (e.g. .Rnw or .Rmd) into final outputs (e.g. .pdf or .html)? And if Knitr is used to process RMarkdown, what does the rmarkdown package do and how is it different to the markdown package?

  • RobinL
    RobinL over 7 years
    I found this very confusing so I have done my best here. Please do edit or add a different answer if I've got something wrong...
  • baptiste
    baptiste over 7 years
    one aspect that I find confusing is the documentation of parameters being passed to each step of the toolchain. There's almost no interactive help (such as autocompletion) and one has to guess what parameters should be called in yaml headers, or via knitr_opts (I always forget what it's called), or via custom pandoc arguments, or via additional YAML files, or a custom pandoc template... It feels a bit of a jungle sometimes, especially when you add LaTeX to the chain.
  • CL.
    CL. over 7 years
    @baptiste I completely agree. And this is exactly the reason why I prefer RNW documents with bare LaTeX. No intermediate pandoc step, less magic, less confusion. Just the admittedly steep LaTeX learning curve. In my opinion, Rmarkdown is great when you are satisfied with the simple default stuff. But as soon as you have to tweak it, complexity rises rapidly.
  • RobinL
    RobinL over 7 years
    I also agree! The parameters were the most confusing part for me.
  • RobinL
    RobinL over 7 years
    @baptiste @CL I have created, and attempted to respond to another question about this here - specifically on the topic of what goes in _bookdown.yml.
  • skan
    skan almost 7 years
    So how do markdown and bookdown compare to each other? Which one is easier, offers more options, or gives better results?
  • StatsStudent
    StatsStudent over 5 years
    This is the best explanation I have found of all this. It's very confusing for beginners or even those with years of experience in R and latex separately like myself. Excellent post.
  • Elliot
    Elliot about 5 years
    Is it possible to convert .mmd to .Rmd?
  • Mark Neal
    Mark Neal about 4 years
    @StatsStudent I think a previous version of the rstudio rmarkdown cheat sheet had a diagram that was pretty helpful to understanding the different steps in creating output from rmarkdown. Perhaps an answer here could do with a diagram?
  • Fred Guth
    Fred Guth about 2 years
    Great explanation. I am looking for a simple way to convert md (no specific flavour) to pdf using latex templates. I didn't want to have to use R. Is Bookdown an overkill? Is there an easier/simpler way?