In the getting started vignette we introduced a variety of predicate
functions for interrogating the contents of a parsed R Markdown
document. The family of rmd_has_* functions gives a convenient way of
checking explicitly for individual elements of an Rmd. Often, however,
we want to check for multiple elements at the same time and provide
consistent output for any detected discrepancies.
To support this type of workflow, parsermd implements the concept of an
Rmd template. These are tibble-based representations of Rmd elements
which can easily be compared with a parsed document.
hw01
Imagine that a homework assignment has been distributed to students
in the form of an Rmd document named hw01.Rmd. This document describes
all of the necessary tasks for the student to complete and also includes
scaffolding in the form of Rmd chunks and markdown that indicates where
the students are expected to include their solutions.
(rmd = parse_rmd(system.file("examples/hw01.Rmd", package = "parsermd")))
#> ├── YAML [2 fields]
#> ├── Heading [h3] - Load packages
#> │   └── Chunk [r, 1 option, 2 lines] - load-packages
#> ├── Heading [h3] - Exercise 1
#> │   ├── Markdown [2 lines]
#> │   └── Heading [h4] - Solution
#> │       └── Markdown [2 lines]
#> ├── Heading [h3] - Exercise 2
#> │   ├── Markdown [2 lines]
#> │   └── Heading [h4] - Solution
#> │       ├── Markdown [4 lines]
#> │       ├── Chunk [r, 2 options, 5 lines] - plot-dino
#> │       ├── Markdown [2 lines]
#> │       └── Chunk [r, 0 options, 2 lines] - cor-dino
#> └── Heading [h3] - Exercise 3
#>     ├── Markdown [2 lines]
#>     └── Heading [h4] - Solution
#>         ├── Markdown [4 lines]
#>         ├── Chunk [r, 0 options, 1 line] - plot-star
#>         ├── Markdown [2 lines]
#>         └── Chunk [r, 0 options, 1 line] - cor-star
We can see an example of this scaffolding by extracting the contents of the markdown in the Exercise 1 > Solution section.
rmd_select(
  rmd,
  by_section(c("Exercise 1", "Solution")) &
    has_type("rmd_markdown")
) |>
  as_document()
#> [1] "(Type your answer to Exercise 1 here. This exercise does not require any R code.)"
#> [2] ""
#> [3] ""
When a student completes this assignment we want to be able to check that they have included solutions in the appropriate sections. At a minimum this means that we need to check that these sections still exist, and secondarily we might also want to check that the provided content in the solution differs from the provided scaffolding.
We will begin by subsetting the original parsed document to select
only the elements that will contain the student’s answers. This assumes
the other sections and elements are extraneous and contain things like
background, instructions, and question text. Below we use rmd_select to
select all of the elements of the original document contained within a
section matching “Exercise *” and “Solution”, which should cover the
answers for all three exercises.
(rmd_sols = rmd_select(rmd, by_section(c("Exercise *", "Solution"))))
#> ├── Heading [h3] - Exercise 1
#> │   └── Heading [h4] - Solution
#> │       └── Markdown [2 lines]
#> ├── Heading [h3] - Exercise 2
#> │   └── Heading [h4] - Solution
#> │       ├── Markdown [4 lines]
#> │       ├── Chunk [r, 2 options, 5 lines] - plot-dino
#> │       ├── Markdown [2 lines]
#> │       └── Chunk [r, 0 options, 2 lines] - cor-dino
#> └── Heading [h3] - Exercise 3
#>     └── Heading [h4] - Solution
#>         ├── Markdown [4 lines]
#>         ├── Chunk [r, 0 options, 1 line] - plot-star
#>         ├── Markdown [2 lines]
#>         └── Chunk [r, 0 options, 1 line] - cor-star
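The wildcard in “Exercise *” is what lets one selector cover all three exercises. As a rough illustration only, the base R sketch below shows how a glob-style pattern picks out matching headings. It assumes that by_section matches section names in a glob-like way; the headings vector is a hypothetical stand-in for the h3 headings of hw01.Rmd, and none of this is parsermd's own implementation.

```r
# Hypothetical stand-in for the h3 headings found in hw01.Rmd
headings <- c("Load packages", "Exercise 1", "Exercise 2", "Exercise 3")

# utils::glob2rx() translates a wildcard (glob) pattern into a regular expression
pattern <- utils::glob2rx("Exercise *")

# Keep only the headings that match the wildcard pattern
matched <- headings[grepl(pattern, headings)]
print(matched)
```

Only the three exercise headings survive the match, which mirrors why the “Load packages” section is absent from the subset above.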
Once we have this more limited set of elements we use the rmd_template
function to generate our template. Here we have included
keep_content = TRUE in order to keep the scaffolded content for each
answer, which will then be compared to the student’s answers.
(rmd_tmpl = rmd_template(rmd_sols, keep_content = TRUE))
#> # A tibble: 9 × 5
#> sec_h3 sec_h4 type label content
#> <chr> <chr> <chr> <chr> <chr>
#> 1 Exercise 1 Solution rmd_markdown <NA> "list(\"(Type your answer to Exerc…
#> 2 Exercise 2 Solution rmd_markdown <NA> "list(\"(The answers for this Exer…
#> 3 Exercise 2 Solution rmd_chunk plot-dino "dino_data <- datasaurus_dozen %>%…
#> 4 Exercise 2 Solution rmd_markdown <NA> "list(\"And next calculate the cor…
#> 5 Exercise 2 Solution rmd_chunk cor-dino "dino_data %>%\n summarize(r = co…
#> 6 Exercise 3 Solution rmd_markdown <NA> "list(\"(Add code and narrative as…
#> 7 Exercise 3 Solution rmd_chunk plot-star ""
#> 8 Exercise 3 Solution rmd_markdown <NA> "list(\"I'm some text, you should …
#> 9 Exercise 3 Solution rmd_chunk cor-star ""
Once the template is constructed we can compare it with a new Rmd
document via the rmd_check_template function. Note that we can pass in
an rmd_ast or rmd_tibble object directly, or the path to an Rmd file,
which will then be parsed and compared.
file = system.file("examples/hw01-student.Rmd", package = "parsermd")
rmd_check_template(file, rmd_tmpl)
#> ✖ The following required elements were missing in the document:
#> • Section "Exercise 3" > "Solution" is missing required "markdown text".
#> • Section "Exercise 3" > "Solution" is missing required "markdown text".
#> ✖ The following document elements were unmodified from the template:
#> • Section "Exercise 2" > "Solution" has a "code chunk" named "plot-dino"
#> which has not been modified.
#> • Section "Exercise 2" > "Solution" has "markdown text" which has not been
#> modified.
#> • Section "Exercise 2" > "Solution" has a "code chunk" named "cor-dino"
#> which has not been modified.
From the output we can see that there are several issues with the document submitted by the student: they are missing the two expected markdown text entries for Exercise 3, and it appears that they have not entered anything new for the chunks or markdown in Exercise 2.
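The two kinds of feedback, missing elements and unmodified elements, can be pictured as two comparisons between tables. The base R sketch below is a simplified illustration and not how rmd_check_template is implemented; the template and student data frames, their column names, and the key helper are all hypothetical, and the sketch assumes each section, type, and label combination is unique.

```r
# Hypothetical, simplified template: one row per required element
template <- data.frame(
  section = c("Exercise 2", "Exercise 2", "Exercise 3"),
  type    = c("rmd_markdown", "rmd_chunk", "rmd_markdown"),
  label   = c(NA, "plot-dino", NA),
  content = c("(Type your answer here.)", "# plot code", "(Add narrative here.)"),
  stringsAsFactors = FALSE
)

# Hypothetical student submission in the same shape
student <- data.frame(
  section = c("Exercise 2", "Exercise 2"),
  type    = c("rmd_markdown", "rmd_chunk"),
  label   = c(NA, "plot-dino"),
  content = c("The plot looks like a dinosaur.", "# plot code"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: identify an element by section, type, and label
# (assumes these combinations are unique within each table)
key <- function(d) paste(d$section, d$type, d$label, sep = " | ")

# Position of each template element in the submission (NA if absent)
idx <- match(key(template), key(student))

# Required elements that do not appear in the submission at all
missing_rows <- template[is.na(idx), ]

# Elements that are present but whose content still equals the scaffolding
unmodified <- template[!is.na(idx) & template$content == student$content[idx], ]

print(missing_rows[, c("section", "type")])
print(unmodified[, c("section", "type", "label")])
```

In this toy data the Exercise 3 markdown is reported as missing and the plot-dino chunk as unmodified, the same two categories that appear in the output above.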
Let us assume that our original template was a bit too strict, and we would like to revise the feedback it is giving to students.
If we were to decide that for Exercise 3 the markdown text was not
actually necessary, we can remove this requirement by filtering those
elements from rmd_sols or from rmd_tmpl. (Generally, the former is the
suggested workflow and will always work, while the latter approach is
likely to be somewhat fragile to any changes made to the template format
in future releases.) Here we use rmd_select with the ! operator to
remove these specific markdown elements.
rmd_sols |>
  rmd_select(
    !(by_section(c("Exercise 3", "Solution")) &
      has_type("rmd_markdown"))
  )
#> ├── Heading [h3] - Exercise 1
#> │   └── Heading [h4] - Solution
#> │       └── Markdown [2 lines]
#> ├── Heading [h3] - Exercise 2
#> │   └── Heading [h4] - Solution
#> │       ├── Markdown [4 lines]
#> │       ├── Chunk [r, 2 options, 5 lines] - plot-dino
#> │       ├── Markdown [2 lines]
#> │       └── Chunk [r, 0 options, 2 lines] - cor-dino
#> └── Heading [h3] - Exercise 3
#>     └── Heading [h4] - Solution
#>         ├── Chunk [r, 0 options, 1 line] - plot-star
#>         └── Chunk [r, 0 options, 1 line] - cor-star
This new AST can then be passed to rmd_template and rmd_check_template
to provide the revised feedback:
file = system.file("examples/hw01-student.Rmd", package = "parsermd")
rmd_sols |>
  rmd_select(
    !(by_section(c("Exercise 3", "Solution")) &
      has_type("rmd_markdown"))
  ) |>
  rmd_template(keep_content = TRUE) |>
  rmd_check_template(file, template = _)
#> ✖ The following document elements were unmodified from the template:
#> • Section "Exercise 2" > "Solution" has a "code chunk" named "plot-dino"
#> which has not been modified.
#> • Section "Exercise 2" > "Solution" has "markdown text" which has not been
#> modified.
#> • Section "Exercise 2" > "Solution" has a "code chunk" named "cor-dino"
#> which has not been modified.
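For comparison, the alternative mentioned earlier, filtering the template itself rather than rmd_sols, amounts to dropping rows from a table. The base R sketch below uses a hypothetical data frame named tmpl that only mimics the sec_h3, sec_h4, type, and label columns shown in the printed template. Whether a real template filtered this way is still accepted by rmd_check_template has not been verified here, which is consistent with the caution that this route may be fragile.

```r
# Hypothetical stand-in for rmd_tmpl, using the column names printed earlier
tmpl <- data.frame(
  sec_h3 = c("Exercise 2", "Exercise 3", "Exercise 3", "Exercise 3"),
  sec_h4 = "Solution",
  type   = c("rmd_chunk", "rmd_markdown", "rmd_chunk", "rmd_markdown"),
  label  = c("plot-dino", NA, "plot-star", NA),
  stringsAsFactors = FALSE
)

# Flag the markdown rows that belong to Exercise 3
drop <- tmpl$sec_h3 == "Exercise 3" & tmpl$type == "rmd_markdown"

# Keep every row except the flagged ones
tmpl_relaxed <- tmpl[!drop, ]
print(tmpl_relaxed)
```

Selecting on the parsed document, as done above with rmd_select, avoids depending on these column names at all.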