A new LSP Book

Firstname Lastname

Prepublication draft, do not cite!

April 20, 2023

Series To Be Determined

1 In the beginning

1.1 Markdown philosophy

[1.1] This directory is a skeleton for writing a monograph for Language Science Press (LSP) in Markdown. Markdown was introduced by John Gruber as an easy-to-use method for writing content for webpages. The principles have spread widely and are formally codified as CommonMark. The basic idea is to make it easy to write content, while the details of the formatting are added by an automatic conversion.

[1.2] The possibilities of this automatic conversion have been greatly extended by Pandoc, introduced and maintained by John MacFarlane. Pandoc can convert between dozens of different formats, allowing great freedom in the visual display of your text. It also offers many extensions to the rather basic possibilities of the original Markdown/CommonMark proposals, and it has a robust system for adding functionality through additional modules (called ‘filters’) when needed.

[1.3] Although Pandoc offers a wide range of bidirectional conversions between all kinds of formats, the current system for LSP suggests that you write in Pandoc-enhanced Markdown and convert from that basis to any desired output format. The problem is that every format (e.g. LaTeX, docx, odf) has its own design quirks, and Pandoc will never be able to convert every detail between all formats. The Pandoc-Markdown structure offers a convenient set of markup options that are guaranteed to be converted well and are sufficient for scientific texts.

[1.4] Any feature that a writer currently misses in Pandoc-Markdown can be added through filters. In essence, Pandoc filters are very similar to packages in LaTeX. Because Pandoc is still relatively new, there are not as many filters available as there are LaTeX packages, and there is not yet a central repository for filters like CTAN for LaTeX. However, filters (especially Lua-based filters) are easy to write and adapt, especially when compared to the rather arcane format of LaTeX packages. The current LSPmarkdown skeleton includes various special filters to prepare a scientific book that is published on the open web. Some more details about the rationale for these filters are explained here.

[1.5] There are some major benefits when using Pandoc-Markdown:

[1.6] There are various limitations to the current markdown setup:

[1.7] Note that there are various other approaches that are similar to the current LSPmarkdown skeleton. Most prominently there is Rmarkdown and the related Bookdown, which are both based on knitr by Yihui Xie. Those approaches embed Pandoc inside R packages and thereby offer many nice options for the visualisation of quantitative data. However, these approaches are strongly geared towards usage within RStudio.

1.2 Installation

[1.8] For writing a book with the LSPmarkdown skeleton you can use any text editor of your choice. However, preferably stay clear of full-fledged word processors like Microsoft Word or OpenOffice, because they tend to automatically add all kinds of markup in the background, often unbeknownst to the user.

[1.9] Currently, the free editor Visual Studio Code is in very active development and probably the best choice if you are not yet committed to another text editor. However, any text editor is fine, e.g. TextMate, BBEdit, Sublime Text, Atom, or even Emacs/Vim if you are so inclined.

[1.10] In all modern editors (like Visual Studio Code) it is possible to open a complete directory/folder, so it becomes easy to switch between editing the different files in the LSPmarkdown directory. The current LSPmarkdown directory should be the starting point of your book. Simply rename the directory to your liking and change the current content (which is only included as an example).

[1.11] If you want to convert your text to any of the LSP-based outputs you will need to install some additional software. This can be done through a package manager like Homebrew for macOS. However, it might be easiest for new users to simply install the following pieces of software separately:

1.3 Setup

[1.12] To prepare your book there are three basic files with settings that you will have to adapt to your needs:

[1.13] The current LSPmarkdown directory is prepared for direct upload as a Git repository, e.g. at GitHub or GitLab. That is probably the easiest way to share your work with others, also in any pre-publication status. There are a few files included here for a more transparent sharing of your work online:

2 Writing in Pandoc-markdown

2.1 Basic editing

[2.1] The basic principles of writing in Markdown are explained at various places online. However, the most authoritative summary is provided by the CommonMark project. The most basic rules are:

2.2 Pandoc advanced editing

[2.2] The possibilities of Markdown are enhanced by Pandoc with various crucial formatting options as specified in the Pandoc User’s Guide. These formatting options will all be retained in the various conversions from Markdown into other formats (e.g. HTML or PDF). For example, footnotes are included by using the following format inside your text at the position where the footnote mark should occur.

^[Footnote text between square brackets and a circumflex symbol before the brackets.]

[2.3] The current LSPmarkdown framework adds a filter that changes strikethrough formatting (text enclosed in double tildes) into small caps formatting. This is a convenience option because linguistic texts often use small caps, but only rarely strikethrough. To remove this automatic conversion, simply remove the filter strikeout-to-smallcaps from the conversion files (e.g. from the file tohtml.yaml).
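To give an impression of how small such a filter can be, the following sketch writes a short Lua filter to a hypothetical file demo-smallcaps.lua and, if Pandoc is installed, applies it to one line of Markdown. It is an illustration only, not the strikeout-to-smallcaps filter that ships with the skeleton; the file name and the test sentence are invented for this example.

```shell
# Write a hypothetical Lua filter (not the skeleton's own file).
cat > demo-smallcaps.lua <<'EOF'
-- Replace every strikeout span with small caps, keeping its content.
function Strikeout(el)
  return pandoc.SmallCaps(el.content)
end
EOF

# Try it on one line of Markdown, if pandoc is installed.
if command -v pandoc >/dev/null 2>&1; then
  printf '%s\n' 'The suffix marks ~~nom~~ case.' |
    pandoc -f markdown -t html --lua-filter demo-smallcaps.lua
fi
```

Any text written between double tildes should then appear as small caps in the output instead of being struck through.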

2.3 Cross-referencing

[2.4] Pandoc itself contains some basic cross-referencing options. However, it is strongly recommended to install the extra software pandoc-crossref. This allows for various flexible options to automatically insert internal cross-references in your text. Basically, you add a label after a heading like this (pandoc-crossref expects section labels to start with the prefix sec:):

## Some Heading {#sec:mylabel}

[2.5] In your text you then refer to this heading by typing [@sec:mylabel], which will result in a cross-reference like this: Section 2. pandoc-crossref has many more possibilities and options as explained in detail in the user’s guide.

2.4 Linguistic examples

[2.6] To add linguistic examples, the LSPmarkdown skeleton includes a Pandoc-filter pandoc-ling. A full user’s guide describing all options and limitations is available. Basically, you can add linguistic examples using the following format:

::: ex
a. This is an example sentence.
b. And another one.
:::

[2.7] This will result in a numbered example as shown below. You can refer to this example using abbreviations like [@next] before the example or [@last] after the example, e.g. see example (2.1).

(2.1) a. This is an example sentence.
b. And another one.

2.5 Figures

[2.8] To insert figures into your text, prepare figures separately and store them in the directory figures. In your Markdown, then add a line like the following at the position where the figure should appear. The figure will be automatically numbered, and you can use the code [@fig:crossreferencelabel] in your text to refer to it, for example Figure 2.1.

![Caption text here](figures/myfigure){#fig:crossreferencelabel}

[2.9] To prepare HTML output it is preferable to convert any figure into SVG format. There are many online converters that can do this. You can include different file formats with the same filename in the directory figures, e.g. myfigure.pdf and myfigure.svg.

Figure 2.1: This is an example figure.

2.6 Tables

[2.10] Including tables is not yet very user-friendly in Pandoc-Markdown. The editing and conversion of tables is currently a field of active development within Pandoc, so this will probably improve substantially in the near future. For now, check out the possibilities for formatting tables in the Pandoc User’s Guide. If you use Visual Studio Code, the extension called table formatter might be helpful for formatting tables.

Table 2.1: This is an example table
Right Left Default Center
12 12 12 12
123 123 123 123
1 1 1 1

[2.11] To add a caption to the table, add a line like the following below the table. This will be automatically numbered, and you can use the code [@tbl:crossreferencelabel] in your text to refer to the table, for example see Table 2.1.

Table: Caption text here {#tbl:crossreferencelabel}
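The table and caption syntax can be combined as in the following sketch, which assumes that Table 2.1 was written as a Pandoc pipe table (colons in the separator row set the column alignment). The file name demo-table.md and the sentence referring to the table are hypothetical; the conversion step only runs if both pandoc and pandoc-crossref are installed.

```shell
# Write a hypothetical test file containing a pipe table and its caption.
cat > demo-table.md <<'EOF'
| Right | Left | Default | Center |
|------:|:-----|---------|:------:|
|    12 | 12   | 12      |   12   |
|   123 | 123  | 123     |  123   |
|     1 | 1    | 1       |   1    |

Table: This is an example table {#tbl:crossreferencelabel}

See [@tbl:crossreferencelabel] for the four alignments.
EOF

# Convert it, if both pandoc and pandoc-crossref are installed.
if command -v pandoc >/dev/null 2>&1 && command -v pandoc-crossref >/dev/null 2>&1; then
  pandoc demo-table.md --filter pandoc-crossref -t html
fi
```

The cell contents do not have to be padded to equal width; only the colons in the separator row determine the alignment.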

2.7 References

[2.12] Pandoc includes a citeproc extension to process references and the bibliography using the Citation Style Language (CSL) framework. The current framework uses the ‘unified style sheet for linguistics’ by default. There are various ways to include references, but two options seem to be most useful:

[2.13] For the Zotero/BetterBibTeX workflow, you will have to additionally do the following:

[2.14] In both options your references will have a ‘citation key’ which typically uses a format like Chomsky1957 (but this format can be changed to your liking). To include a reference in your text use the syntax shown below. The first line will result in a reference in your text (Chomsky 1957: 23). You can add anything you like after the citation key, typically page numbers. To suppress the name inside the brackets, add a hyphen-minus symbol before the ‘@’ symbol, as in the second line. This will result in a reference to Bloomfield (1925).

[@Chomsky1957: 23]
[-@Bloomfield1925]
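To try the citation syntax outside the skeleton, the following sketch writes a one-entry bibliography and a short Markdown file and, if Pandoc is installed, renders them as plain text. The file names demo-refs.bib and demo-cite.md and the two test sentences are hypothetical; the bibliography entry is taken from the reference list of this book. Without the skeleton's settings Pandoc falls back to its own default citation style, so the formatting will differ from the LSP output.

```shell
# A hypothetical one-entry bibliography (entry copied from this book's reference list).
cat > demo-refs.bib <<'EOF'
@book{Chomsky1957,
  author    = {Chomsky, Noam},
  title     = {Syntactic structures},
  year      = {1957},
  address   = {The Hague},
  publisher = {Mouton}
}
EOF

# Two hypothetical test sentences using both citation forms.
cat > demo-cite.md <<'EOF'
An in-text reference looks like this [@Chomsky1957: 23].
Chomsky [-@Chomsky1957] is cited here with the name outside the brackets.
EOF

# Render as plain text with Pandoc's default style, if pandoc is installed.
if command -v pandoc >/dev/null 2>&1; then
  pandoc demo-cite.md --citeproc --bibliography demo-refs.bib -t plain
fi
```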

3 Making the book

[3.1] A few different settings files are included in this LSPmarkdown skeleton to prepare the final book from your Markdown source. These are YAML files with various settings. Such files are called defaults files in Pandoc parlance (see the Pandoc documentation).
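As a rough idea of what a defaults file contains, the following sketch writes a minimal one to a hypothetical file demo-defaults.yaml. The file names and the choice of settings are assumptions for illustration; the files shipped with the skeleton (such as tohtml.yaml) contain more settings and will differ.

```shell
# Write a hypothetical defaults file; the skeleton's own tohtml.yaml will differ.
cat > demo-defaults.yaml <<'EOF'
# Each key corresponds to a pandoc command-line option.
from: markdown
to: html
standalone: true
input-files:
  - demo-chapter.md
output-file: demo.html
filters:
  - pandoc-crossref
EOF

# These two invocations would request the same conversion (shown, not run):
#   pandoc -d demo-defaults.yaml
#   pandoc -f markdown -t html -s demo-chapter.md -o demo.html --filter pandoc-crossref
cat demo-defaults.yaml
```

Collecting the options in a file keeps the command that you type short and makes the conversion settings part of the book's directory.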

[3.2] To use these defaults files you will have to open a terminal/shell in the current directory. If you use Visual Studio Code and you have opened the whole LSPmarkdown directory with File >> Open Folder… then this is really easy, even if you have never used a terminal/shell. Simply open a terminal/shell through the menu Terminal >> New Terminal and then type the command as specified below.
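Before running any of the conversion commands, it may help to confirm that the required programs can be found from your terminal. The following sketch checks for the tools mentioned in this guide; the name xelatex is an assumption, as your LaTeX installation may provide a different engine.

```shell
# Report which of the tools mentioned in this guide are on the PATH.
# The name `xelatex` is an assumption; your LaTeX engine may differ.
missing=0
for tool in pandoc pandoc-crossref make xelatex; do
  if command -v "$tool" >/dev/null 2>&1; then
    echo "found:   $tool"
  else
    echo "missing: $tool"
    missing=$((missing + 1))
  fi
done
echo "$missing tool(s) missing"
```

A tool reported as missing is either not installed or not on the search path of the terminal you are using.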

3.1 LSP-style HTML

[3.3] To convert your markdown to HTML type the following command in your terminal and hit return:

pandoc -d tohtml.yaml

[3.4] As a result there will be a new file called index.html in the directory docs with the final HTML output. This file is completely self-contained and does not need any other files to work properly. You can immediately open this file locally from your computer by double-clicking it. It will open in your default web browser.

[3.5] Because this file is completely self-contained you can easily share it with other people (just send them the file). When you sync your whole directory with GitHub you can immediately publish this file using GitHub Pages. Note that the fonts used in the LSP style are Libertinus Serif (for the main text) and Arimo (for the title page). When these fonts are not installed on your computer the browser will attempt to fetch them from the internet.

[3.6] For final LSP publication there are a few additional minor tweaks to be made:

3.2 Draft PDF

[3.7] To convert your markdown to PDF you can use the following command in the terminal:

pandoc -d topdf.yaml

[3.8] This will produce a PDF file called book.pdf in the directory docs. This PDF does not use the LSP styling. Instead, the PDF will use Pandoc's default LaTeX style. The advantage of this option is that this conversion to PDF is much quicker and easier than the complete LSP pathway as described below. This is particularly useful if you want to produce a quick PDF version for printing or reviewing your text. Note that you will need LaTeX to be installed on your computer, see Section 1.2.

3.3 LSP-style PDF

[3.9] Finally, there is a conversion option to prepare your book for LaTeX-based LSP publication. The preparation of a final book for LSP is somewhat more involved because various checks and additional fine-tuning are needed for a polished real-life publication. Basically the procedure is as follows: first, convert your Markdown into raw LaTeX, and second, use this LaTeX to proceed through the regular LSP pipeline. To convert your Markdown to raw LaTeX you can use the following command in your terminal:

pandoc -d totex.yaml

[3.10] This will produce a tex-file called all.tex in the directory latex/chapters. The directory latex is a slightly adapted version of the default LSP skeleton to produce a book. You will have to make a few more changes in this directory to produce a complete LSP book:

[3.11] Then you can produce a draft version of the final LSP styled PDF by typing the following command in your terminal:

make -C latex

[3.12] This will take some time, and might very well spit out many TeX errors. For now simply ignore these messages. If this process gets stuck, type “x” to break and ask somebody for help. If it finishes, then there will be a PDF file called main.pdf in the directory latex with your LSP-styled book. The final tweaks to this process will be performed together with the people from Language Science Press.

Bibliography

Bloomfield, Leonard. 1925. On the sound-system of central Algonquian. Language 1(4). 130–156.
Chomsky, Noam. 1957. Syntactic structures. The Hague: Mouton.