How to add a page break in a Word document generated by RStudio & markdown


Solution 1

Added: To insert a page break, please use \newpage for formats including LaTeX, HTML, Word, and ODT.

https://bookdown.org/yihui/rmarkdown-cookbook/pagebreaks.html

Paragraph before page break.

\newpage

First paragraph on a new page.
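To try this out end to end, the following R sketch writes the example above to a temporary Rmd file and renders it to Word. It assumes the rmarkdown package and pandoc are installed and reasonably recent; the file name and title are placeholders, and the render step is simply skipped when either dependency is missing.

```r
# Write the small example to a temporary Rmd file.
rmd_path <- tempfile(fileext = ".Rmd")
writeLines(c(
  "---",
  "title: 'Page break demo'",   # placeholder title
  "output: word_document",
  "---",
  "",
  "Paragraph before page break.",
  "",
  "\\newpage",                   # becomes a literal \newpage line in the Rmd
  "",
  "First paragraph on a new page."
), rmd_path)

# Render only if rmarkdown and pandoc are both available.
if (requireNamespace("rmarkdown", quietly = TRUE) &&
    rmarkdown::pandoc_available()) {
  docx_path <- rmarkdown::render(rmd_path, quiet = TRUE)
  message("Rendered: ", docx_path)
}
```

Opening the resulting DOCX in Word should show whether the second paragraph really starts on a new page with your installed versions.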

Previously: You can also use a fifth-level header (#####) together with a docx template referenced in the YAML header.

Create headingfive.docx in Microsoft Word, open Modify Style for Heading 5, tick Page break before in the Line and Page Breaks tab, and save the file.

Page break before

---
title: 'Making page break using fifth-level header block'
output: 
  word_document:
    reference_docx: headingfive.docx
---

In your Rmd document, set reference_docx in the YAML header as above; every fifth-level header (#####) then starts a new page.

For details, see the original blog post:

https://www.r-bloggers.com/r-markdown-how-to-insert-page-breaks-in-a-ms-word-document/
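As a rough sketch, a complete Rmd file using this approach might look like the following. It assumes headingfive.docx was saved as described and sits in the same directory as the Rmd file; the heading and paragraph text are made up for illustration.

```markdown
---
title: 'Making page break using fifth-level header block'
output:
  word_document:
    reference_docx: headingfive.docx
---

Paragraph on the first page.

##### Heading that starts the next page

Because Heading 5 carries "Page break before" in the template,
this paragraph lands on a new page.
```

The heading line itself still occupies a line at the top of the new page, so you may want to restyle Heading 5 in the template to make it as unobtrusive as you need.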

Solution 2

With the help of John MacFarlane and others on the pandoc-discuss Google group, I put together a filter that does this. Please see: https://groups.google.com/forum/#!topic/pandoc-discuss/FzLrhk0vVbU

In short, the filter looks for a marker and replaces it with the OpenXML for a page break. In this case \newpage is replaced with <w:p><w:r><w:br w:type="page"/></w:r></w:p>. This allows a single LaTeX command to be interpreted for both PDF and Word output.

Joel
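If a filter feels like too much machinery, the same OpenXML can, as far as I know, be embedded directly in the Rmd file with pandoc's raw-attribute syntax. This is only a sketch, assuming your pandoc is new enough to support raw attribute blocks (the raw_attribute extension). Unlike \newpage, it only affects Word output, because other output formats ignore openxml blocks.

````markdown
Paragraph before page break.

```{=openxml}
<w:p><w:r><w:br w:type="page"/></w:r></w:p>
```

First paragraph on a new page.
````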

Solution 3

What you are trying to do is force a "page break" or "new page" in a Word document generated with pandoc. I have found a way to do this in my environment, but I'm not sure it will work in every environment.

My environment: RStudio / pandoc / MS Word, starting with an .Rmd file and generating a DOCX file.

The key idea is that I've created what acts like a template document (MyFormattingDocument.docx), and in that Word document I tweak the styles for things like "Heading 1", "Heading 2", "footnote", or whatever other predefined styles I want to change.

See http://rmarkdown.rstudio.com/word_document_format.html#style-reference for an explanation of the style reference and how to set the header information in your Rmd file to specify a reference document.

In my case, I tweak the "Heading 1" style in Word to include a forced "Page Break Before" in its paragraph formatting. Exactly how you do this differs between versions of Microsoft Word, but if you follow the Word documentation and modify the "Heading 1" style, every "Heading 1" will have a page break before it.

Then save this template file in the same directory as the Rmd file, where it is used as a template. The contents of the file are ignored, so you can put sample text in it and test that the formatting works. Only the styles are carried over into the new Word document built from the Rmd file, so every "Heading 1" there will have a break before it.

NOTE: You could do the same with any style that has a one-to-one mapping from pandoc markup, so you could use "Heading 3" or whatever instead. Look at which style is applied in your Rmd-generated DOCX and tweak that style, even if you need to insert some "fake" lines with essentially blank content just to force the style to appear in the DOCX.
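The same template can also be passed from R code rather than from the YAML header. The helper below is a hypothetical sketch (render_with_template is my own name, not part of rmarkdown). It assumes rmarkdown is installed and that MyFormattingDocument.docx already has "Page Break Before" set on the style you care about.

```r
# Hypothetical helper: render an Rmd file to Word using a styled template.
render_with_template <- function(rmd, template = "MyFormattingDocument.docx") {
  # Fail early if either file is missing; mustWork = TRUE raises an error.
  rmd <- normalizePath(rmd, mustWork = TRUE)
  template <- normalizePath(template, mustWork = TRUE)

  # word_document(reference_docx = ...) is the R-side counterpart of the
  # reference_docx field in the YAML header.
  rmarkdown::render(
    rmd,
    output_format = rmarkdown::word_document(reference_docx = template),
    quiet = TRUE
  )
}

# Example call (file name is a placeholder):
# render_with_template("report.Rmd")
```

Resolving both paths up front means a missing template produces a clear error instead of a document that silently falls back to default styles.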

Solution 4

Here is an R script that can be used as a pandoc filter to replace LaTeX page breaks (\newpage) with Word page breaks, per @JAllen's answer above. With this you don't need to compile a separate pandoc filter. Since you are working in R Markdown, I assume you have R available on your system.

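#!/usr/bin/env Rscript

# Pandoc filter: read the JSON AST from stdin, swap every raw LaTeX \newpage
# block for a raw OpenXML page break, and write the result to stdout.
json_in <- file('stdin', 'r')
# JSON for a raw LaTeX \newpage block (the backslash is escaped in JSON)
lat_newp <- '{"t":"RawBlock","c":["latex","\\\\newpage"]}'
# JSON for the Word (OpenXML) page break that replaces it
doc_newp <- '{"t":"RawBlock","c":["openxml","<w:p><w:r><w:br w:type=\\"page\\"/></w:r></w:p>"]}'
ast <- paste(readLines(json_in, warn=FALSE), collapse="\n")
close(json_in)
ast <- gsub(lat_newp, doc_newp, ast, fixed=TRUE)
write(ast, "")  # an empty file name makes write() print to stdout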

Save this as page-break-filter.R or something like that and make it executable by running chmod +x page-break-filter.R in the terminal.

Then include this filter in the R Markdown YAML like so:

---
title: "Title"
author: "Author"
output:  
  word_document:
    pandoc_args: [
      "--filter", "/path/to/page-break-filter.R"
    ]
---
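To check what the substitution does without running pandoc, the sketch below applies the same gsub() call to a hand-written JSON fragment. The sample AST is an assumption about what pandoc emits, not captured output. In particular, I believe newer pandoc versions may label raw TeX blocks "tex" rather than "latex", so the sketch tries both labels; running pandoc -t json on your own file will show which form you actually get.

```r
# Hand-written stand-in for a fragment of pandoc's JSON AST (an assumption).
sample_ast <- paste0(
  '[{"t":"Para","c":[{"t":"Str","c":"Before"}]},',
  '{"t":"RawBlock","c":["latex","\\\\newpage"]},',
  '{"t":"Para","c":[{"t":"Str","c":"After"}]}]'
)

# Replacement block: a raw OpenXML page break, as in the filter script.
doc_newp <- '{"t":"RawBlock","c":["openxml","<w:p><w:r><w:br w:type=\\"page\\"/></w:r></w:p>"]}'

# Try both format labels that a raw \newpage block might carry.
converted <- sample_ast
for (fmt in c("latex", "tex")) {
  lat_newp <- sprintf('{"t":"RawBlock","c":["%s","\\\\newpage"]}', fmt)
  # fixed = TRUE treats both pattern and replacement as literal text.
  converted <- gsub(lat_newp, doc_newp, converted, fixed = TRUE)
}

cat(converted, "\n")
```

If the printed result still contains \newpage for your real document, the pattern did not match what pandoc produced, and the page break will not appear in Word.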

Solution 5

You can use the R package worded (according to a later comment, it seems to have since been renamed officedown). This avoids the need for a template Word file. See https://github.com/davidgohel/worded.

The output parameter needs to be set to worded::rdocx_document and you need to call library(worded).

---
date: "2018-03-27"
author: "David Gohel"
title: "Document title"
output: 
  worded::rdocx_document
---

```{r setup, include=FALSE}
library(worded)
```

You can then add <!---CHUNK_PAGEBREAK---> to your document whenever you want a page break.

The package allows various word formatting options using a similar mechanism.

Author: Giorgio Spedicato

Updated on July 09, 2022

Comments

  • Giorgio Spedicato
    Giorgio Spedicato almost 2 years

    I am writing a Word document with R Markdown in RStudio. I can get many things to work, but at the moment I cannot figure out how to get a page break. I have found solutions, but only for rendered LaTeX / PDF documents, which is not my case.

  • Keith Hughitt
    Keith Hughitt over 7 years
    it might be helpful to post a snippet from/based on the blog link; this way if the site goes away in the future the answer will still be useful.
  • D. Woods
    D. Woods over 6 years
    Despite the fact that the R Markdown site says that this would produce a page break, my testing results in only a horizontal rule in MS Word.
  • Alex Knorre
    Alex Knorre almost 6 years
    The crucial thing to do here so that this will work in an Rmd-generated Word document: tick "New documents based on this template" in the Style -- Modify... section.
  • r2evans
    r2evans over 5 years
    The only "other" to this technique is that the next page starts with a blank line; it cannot be avoided, I believe, because it is the line of text with the "Heading 5" style attached, not something you can hide or get rid of. The best I did was further the formatting to reduce font size, set to white, reduce line spacing, etc. Still just a single blank line.
  • Sungpil Han
    Sungpil Han over 5 years
    This package is pretty good. It also supports landscape orientation.
  • giordano
    giordano over 5 years
    Is it possible to combine worded with a template word file?
  • anotherfred
    anotherfred over 5 years
    @giordano not sure, but behind the scenes the package uses the same xml injection technique suggested by Noam Ross, so you can always combine the techniques manually.
  • anotherfred
    anotherfred about 5 years
    @Whitebeard13 according to the link, it seems to have been renamed to Officedown. I don't think it was ever on CRAN - you can download it from GitHub with devtools::install_github("davidgohel/officedown")
  • Whitebeard13
    Whitebeard13 about 5 years
    @anotherfred Yes, I found it; that's why I removed my comment. Thanks a lot.
  • Whitebeard13
    Whitebeard13 about 5 years
    @anotherfred We can use <!---CHUNK_PAGEBREAK---> between the text lines in the .Rmd script correct? While the installation and loading were successful so far, no page break appears. See below part of my .Rmd script: ....se the information contained in this document. Please ensure that you read the last available version of this document.<!---CHUNK_PAGEBREAK---> # 1.Introduction The main purpose of the expert forum is to form a qualitative and quant...
  • sullij
    sullij about 5 years
    getting the following error: devtools::install_github("davidgohel/officedown") Installation failed: handle is dead
  • abu
    abu about 4 years
    That discussion looks promising, but I get confused with so many messages and versions of the filter script. Could you explain here how to use it? Is it something one can do using just R (.Rmd) code, or is it some kind of pandoc code (which I don't know how to open and configure from R)? Also, is it platform independent? (I am on Windows 7, but you used RHEL 6.) Thanks a lot @JAllen
  • Oliver
    Oliver almost 4 years
    I did this verbatim, but it doesn't work for me. I get this pandoc error: Error running filter page-break-filter.R: Error in $: Failed reading: not a valid json value. Also, incredibly bizarrely, every time I try to render the Rmd, it deletes page-break-filter.R and a bunch of other source files. That doesn't happen when I don't include the pandoc_args in my YAML.
  • Sungpil Han
    Sungpil Han almost 4 years
    I believe that this is the nicest solution.
  • Whalen
    Whalen almost 4 years
    I used this hack a couple years ago. Updates have enabled using \newpage to work across the core document output types. bookdown.org/yihui/rmarkdown-cookbook/pagebreaks.html