Scientific documents with \(\LaTeX\)

Introduction

In your research, you will produce papers, reports and, very importantly, your thesis. These documents can be written using a WYSIWYG (What You See Is What You Get) editor (e.g., Word). However, an alternative especially suited for scientific publications is LaTeX. In LaTeX, the document is written in a plain text file (.tex) using TeX typesetting syntax. Text formatting is done using markup (like HTML). The file is then “compiled” (like the source code of a programming language) into an output file, typically a PDF.

Why \(\LaTeX\)?

A number of reasons:

  • The input is a small, portable text file

  • LaTeX compilers are freely available for all major operating systems

  • Exactly the same result on any computer (not true for Word, for example)

  • LaTeX produces beautiful, professional looking docs

  • Images are easy to embed and annotate

  • Mathematical formulas (esp complex ones) are easy to write

  • LaTeX is very stable – current version basically same since 1994! (9 major versions of MS Word since 1994 – with compatibility issues)

  • LaTeX is free!

  • You can focus on content, and not worry so much about formatting while writing

  • An increasing number of Biology journals provide \(\LaTeX\) templates, making formatting quicker.

  • Referencing (bibliography) is easy (and can also be version controlled) and works with tools like Mendeley and Zotero

  • Plenty of online support available – your question has probably already been answered

  • You can integrate LaTeX into a workflow to auto-generate lengthy and complex documents (like your thesis).


Word vs Latex

Fig. 11 LaTeX documents scale up better than WYSIWYG editors. LaTeX files for really large and complex documents (such as PhD theses) are much easier to edit, manage and publish in specific formats (like pdf) than Word or Open Office documents.


Limitations of \(\LaTeX\)

  • It has a steeper learning curve.

  • Can be difficult to manage revisions with multiple authors – especially if they don’t use LaTeX! (Cue: Windows on a virtual machine!)

  • Change tracking is not available out of the box (but can be enabled using a suitable package)

  • Typesetting tables can be a bit complex.

  • Images and floats are easy to embed, and won’t jump around like Word, but if you don’t use the right package, they can be difficult to place where you want!

Installing LaTeX

Type this in terminal:

sudo apt-get install texlive-full texlive-fonts-recommended texlive-pictures texlive-latex-extra imagemagick

It’s a large installation, and will take some time.

We will use a text editor in this lecture, but you can use one of a number of dedicated editors (e.g., texmaker, Gummi, TeXShop, etc.). There are also WYSIWYG frontends (e.g., LyX, TeXmacs).

Overleaf is also very good (and works with git), especially for collaborating with non LaTeX-ers (your university may have a blanket license for the pro version).

Key \(\LaTeX\) features

Environments

Environments are used to format blocks of text or graphics in a LaTeX document. They are delimited by an opening \begin tag and a closing \end tag (except for certain math environments). Everything inside will be formatted in a specific manner depending on the type of environment. For example, the code

\begin{center}
    Here is some text
\end{center}

will produce “Here is some text” centered in the middle of the page.

The most commonly used Latex environments are:

Environment                                                         Purpose
\begin{center} ... \end{center}                                     Center the elements (works for text as well as graphics)
\begin{itemize} ... \end{itemize}                                   An itemized list (default is bullet points)
\begin{enumerate} ... \end{enumerate}                               An enumerated list (default is Arabic numerals)
\begin{figure} ... \end{figure}                                     For displaying a figure
\begin{table} ... \end{table}                                       For displaying a table
\( ... \), $ ... $, or \begin{math}...\end{math}                    For displaying an equation inline (as part of a sentence)
\[ ... \], $$ ... $$, or \begin{displaymath}...\end{displaymath}    For producing an equation as a separate, display item (separate from the text)
\begin{equation} ... \end{equation}                                 For displaying a centered, numbered equation as a separate, display item

In all of these environments, you can use modifier directives to the environment to tailor them. For example, in the itemize environment, the default is to create a list with bullets, but you can pick any symbol.
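As a minimal sketch of such tailoring (the list entries are made up for illustration), the optional argument of \item replaces the default bullet for a single entry, while redefining \labelitemi changes the symbol for every top-level itemize that follows:

```latex
\documentclass{article}
\begin{document}

% Default itemize: every entry gets a bullet
\begin{itemize}
  \item First point
  \item Second point
\end{itemize}

% Override the symbol for individual entries with \item[...]
\begin{itemize}
  \item[--] An entry marked with a dash
  \item[$\star$] An entry marked with a star
\end{itemize}

% Change the symbol for all following top-level itemize lists
\renewcommand{\labelitemi}{$\diamond$}
\begin{itemize}
  \item Every entry now gets a diamond
  \item Including this one
\end{itemize}

\end{document}
```

Packages such as enumitem offer finer control, but the plain commands above need nothing beyond a standard installation.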

You will see some examples of environments in your first LaTeX document example below.

Special characters

Some characters are “special” in Latex. These characters have a specific purpose, either inside a particular environment (e.g., table or equation), or both outside and inside an environment.

Character   What it does
#           Used to reference arguments for a LaTeX command; similar to the way $ is an argument reference in shell scripts
$           Used for opening or closing a mathematical equation or symbol; e.g., $y = mx + c$ gives \(y = mx + c\)
%           Comment character; everything from this symbol up to the end of line is ignored and will not appear in the final document
&           Alignment character; used to align columns in tables, and also equations in math environments
_           Subscript in math environments
^           Superscript in math environments
{ and }     Used to group characters in math environments, and to enclose arguments in LaTeX commands
~           An “unbreakable” space (equivalent to the command \nobreakspace{}), which can be used to add one or more “hard” spaces inside as well as outside math environments; for example, $x  y$ gives \(x y\), but $x~~y$ gives \(x~~y\)
\           Indicates a LaTeX command, as in \LaTeX or \maketitle (or can be used to escape a special character; see below)

Rendering special characters

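If you want to actually reproduce these special characters in your document, you have to “escape” them by adding a backslash (\) in front of them. For example, writing \% produces the actual percentage symbol, \(\%\). The exceptions are \, ~ and ^, which need dedicated commands (\textbackslash, \textasciitilde and \textasciicircum) because \\, \~ and \^ already have other meanings.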
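A short sketch of escaping in practice (the sentence itself is invented for illustration). It covers the characters that take a simple backslash, plus the three that need a named command instead:

```latex
\documentclass{article}
\begin{document}

% These special characters are printed by putting a backslash in front
Profit rose by 5\% \& costs fell by \$10 for item \#3 in file\_name \{draft\}.

% These three cannot be escaped that way: \\ means "line break",
% and \~ and \^ place an accent over the next letter
A backslash: \textbackslash{}, a tilde: \textasciitilde{},
and a caret: \textasciicircum{}.

\end{document}
```

The empty braces after each named command stop LaTeX from swallowing the space or punctuation that follows it.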

Latex commands

Every LaTeX command starts with a \ . There are two types of commands:

  • Commands without arguments: These commands are standalone, and do not take any additional arguments.

    • For example, in your first LaTeX document below, the \maketitle command tells LaTeX to render the title in the typeset document.

    • Another example: to render \(\LaTeX\), you need the command \LaTeX

  • Commands with arguments: These commands can (and often must) take arguments in curly brackets, which can be modified by including additional directives in square brackets before the main argument.

    • For example, in your first LaTeX example below, \documentclass[12pt]{article} takes the document class (article) as its main argument and the font size (12pt) as an optional directive.

    • Another example: \date{\today} inserts the current date, while an empty \date{} (as in the first example below) suppresses the date.

Spaces and new lines

Note that:

  • Several spaces in your text editor are treated as one space in the typeset document

  • Several empty lines are treated as one empty line

  • One empty line defines a new paragraph
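These whitespace rules can be checked with a small test file. This is only a sketch with made-up sentences; compile it and compare the source layout with the output:

```latex
\documentclass{article}
\begin{document}

These     words   are separated    by single spaces in the output.
A single line break in the source, like this one,
does not start a new paragraph.

One empty line above started this new paragraph.



Three empty lines above still start just one new paragraph.

To force a line break without starting a new paragraph,\\
end the line with two backslashes.

\end{document}
```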

Typesetting math

There are two ways to display math

  1. Inline (i.e., within the text).

  2. Stand-alone, numbered equations and formulae.

For inline math, the “dollar” sign flanks the math to be typeset. For example, the code:

$\int_0^1 p^x (1-p)^y dp$

becomes \(\int_0^1 p^x (1-p)^y dp\)

For numbered equations, LaTeX provides the equation environment. For example,

\begin{equation}
    \int_0^1 \left(\ln \left( \frac{1}{x} \right) 
    \right)^y dx = y!
\end{equation}

becomes

\[\int_0^1 \left(\ln \left( \frac{1}{x} \right) \right)^y dx = y! \tag{1}\]
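Numbered equations become more useful once they are labelled and cited. The sketch below assumes the amsmath package (for \eqref and the align environment). The label name eq:logint and the two growth equations are illustrative choices, not part of the example above:

```latex
\documentclass{article}
\usepackage{amsmath} % provides \eqref and the align environment

\begin{document}

\begin{equation}
  \int_0^1 \left( \ln \left( \frac{1}{x} \right) \right)^y dx = y!
  \label{eq:logint} % hypothetical label name
\end{equation}

Equation~\eqref{eq:logint} can now be cited by number anywhere in the text.

% align numbers each line and lines the equations up at the & character
\begin{align}
  \frac{dN}{dt} &= r N \\
  N(t) &= N_0 e^{r t}
\end{align}

\end{document}
```

The file has to be compiled twice before the cross-reference shows the right number, because the label is first written to the .aux file and only read back on the next run.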

Document structure

Latex documents have a very specific structure in terms of the sequence in which certain elements must appear.

The start of the document

The first command is always \documentclass[]{} defining the type of document (e.g., article, book, report, letter).

Here, you can set several options. For example, to set the text size to 10 points and the paper size to letter:

\documentclass[10pt,letterpaper]{article}

Defining packages

After having declared the type of document, you can specify special packages you want to use. Some particularly useful ones are:

\usepackage{color}              Use colors for text in your document
\usepackage{amsmath,amssymb}    Formats and commands for typesetting mathematical symbols and equations
\usepackage{fancyhdr}           Fine tune the formatting of headers and footers
\usepackage{graphicx}           Include figures in different formats (with pdflatex: pdf, png and jpeg; eps needs conversion, and gif is not supported)
\usepackage{listings}           Typeset source code for different programming languages
\usepackage{rotating}           Allow rotation of tables and figures
\usepackage{hyperref}           Allow formatting of hyperlinks
\usepackage{lineno}             Allow line numbers

The main body

  • Once you select the packages, you must start the main body of your document with \begin{document} and end it with \end{document}.
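Putting the three parts together, a skeleton might look like the sketch below. The choice of packages is arbitrary. The image name example-image is an assumption: it is a test picture that ships with the mwe package in full TeX Live installations, so swap in your own file if it is not found.

```latex
\documentclass[12pt]{article}

% Preamble: packages are loaded between \documentclass and \begin{document}
\usepackage{graphicx}  % \includegraphics
\usepackage{lineno}    % line numbers
\usepackage{hyperref}  % clickable cross-references and links

\begin{document}
\linenumbers % switch on line numbering (from lineno)

Figure~\ref{fig:demo} shows a placeholder image.

\begin{figure}[ht]
  \centering
  % example-image is a test picture from the mwe package (assumed installed);
  % replace it with the path to your own file
  \includegraphics[width=0.5\textwidth]{example-image}
  \caption{A placeholder figure.}
  \label{fig:demo}
\end{figure}

\end{document}
```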

A first LaTeX example

Let’s try writing an example Latex document.

★ In your code editor, type the following and save it as a file called FirstExample.tex in a suitable location (e.g., in a code directory):

\documentclass[12pt]{article}

\title{A Simple Document}

\author{Your Name}

\date{}

\begin{document}
  \maketitle
  
  \begin{abstract}
    This paper analyzes a seminal equation in population biology.
  \end{abstract}
  
  \section{Introduction}
    Blah Blah
  
  \section{Materials \& Methods}
  
  A foundational equation of population biology is:
  
  \begin{equation}
    \frac{dN}{dt} = r N (1 - \frac{N}{K})
  \end{equation}
  
  It was first proposed by Verhulst in 1838 \cite{verhulst1838notice}.
  
  \bibliographystyle{plain}
  
  \bibliography{FirstBiblio}

\end{document}

Note

Look carefully at the way some of the elements such as special characters and environments are used in this first example document.

Referencing and bibliography

Now, let’s get a citation for the paper.

★ In the search box in Google Scholar type “verhulst population 1838”

The paper should be the only one (or the top one) to appear.

Click on the “Cite” icon (looks like two hollow commas) below the paper’s title etc., and a small Cite window will appear. Click on “BibTeX” in the list of format options at the bottom, which should lead to a page with just the following text:

@article{verhulst1838notice,
  title={Notice sur la loi que la population suit dans son accroissement},
  author={Verhulst, Pierre-Fran{\c{c}}ois},
  journal={Corresp. Math. Phys.},
  volume={10},
  pages={113--126},
  year={1838}
}

Copy and paste this into a file called FirstBiblio.bib, saved in the same directory as FirstExample.tex.

Compiling the Latex document

Now we can create a .pdf file of the document.

★ In a terminal type (making sure you are in the same directory as FirstExample.tex and FirstBiblio.bib):

 pdflatex FirstExample.tex
 bibtex FirstExample
 pdflatex FirstExample.tex
 pdflatex FirstExample.tex

This should produce the file FirstExample.pdf:

[Figure: Latex Example]

In the above sequence of commands, we ran pdflatex three times. Here’s why:

  • The first pdflatex run generates two files: FirstExample.log and FirstExample.aux (and an incomplete .pdf).

    • At this step, the keys of all \cite{...} commands, which bibtex needs, are written into the .aux file.

  • Then, the second command, bibtex (followed by the filename without the .tex extension), reads the .aux file that was generated. It then produces two more files: FirstExample.bbl and FirstExample.blg

    • At this step, bibtex takes the citation info in the .aux file and puts the relevant bibliographic entries into the .bbl file (you can take a peek at all these files), formatted according to the instructions provided by the bibliography style that you have specified using \bibliographystyle{plain}.

  • The second pdflatex run updates FirstExample.log and FirstExample.aux (and produces a still-incomplete .pdf; the citations are not correctly formatted yet).

    • At this step, the reference list in the .bbl generated in the above step is included in the document, and the correct labels for the in-text \cite{...} commands are written to the .aux file (but not yet into the actual pdf).

  • The third and final pdflatex run then updates FirstExample.log and FirstExample.aux one last time, and now produces the complete .pdf file, with citations correctly formatted.

    • At this step, latex knows what the correct in-text citation labels are, and includes them in the pdf document.

Throughout all this, the .log file plays no role except to record info about how the commands are running.
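It may help to know that the .bbl file is itself ordinary LaTeX built around the thebibliography environment. The sketch below types such a reference list by hand instead of generating it with bibtex. The exact text that bibtex writes depends on the bibliography style and will differ from this approximation:

```latex
\documentclass[12pt]{article}
\begin{document}

It was first proposed by Verhulst in 1838 \cite{verhulst1838notice}.

% Hand-written reference list; the argument {1} is the widest label expected
\begin{thebibliography}{1}
  \bibitem{verhulst1838notice}
  Pierre-Fran\c{c}ois Verhulst.
  \newblock Notice sur la loi que la population suit dans son accroissement.
  \newblock \emph{Corresp. Math. Phys.}, 10:113--126, 1838.
\end{thebibliography}

\end{document}
```

Even without bibtex, this file needs two pdflatex runs: the first writes the citation label to the .aux file, and the second reads it back to fill in the \cite.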

PHEW! Why go through this repetitive sequence of commands? Well, “it is what it is”: \(\LaTeX\), with all its advantages, does have its quirks. The reason why it is this way is probably that back when TeX and LaTeX were designed (the late 1970s to mid 1980s), computers had tiny memories (RAM), and writing files to disk and then reading them back in for the next step of the algorithm/program was the best (and only) way to go. Why has this not been fixed? I am not sure. Keep an eye out, and it might well be (and then, raise an issue on TheMulQuaBio’s Github!)

Anyway, as such, you don’t have to run these commands literally step by step, because you can create a bash script that does it for you, as we will now learn.

A bash script to compile \(\LaTeX\)

Let’s write a useful little bash script to compile latex with bibtex.

★ Type the following script and call it CompileLaTeX.sh (you know where to put it!):

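#!/bin/bash
# Usage: bash CompileLaTeX.sh FirstExample   (file name without the .tex extension)
pdflatex "$1.tex"
bibtex "$1"
pdflatex "$1.tex"
pdflatex "$1.tex"
evince "$1.pdf" &

## Cleanup: remove the intermediate files (-f avoids errors if one is missing)
rm -f *.aux *.log *.bbl *.blg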

How do you run this script? The same as your previous bash scripts, so:

bash CompileLaTeX.sh FirstExample

Exercise

Note that I have not written the .tex extension of FirstExample when feeding it to the LaTeX compilation bash script above. Make this bash script more convenient to use by allowing users to compile the document by using

bash CompileLaTeX.sh FirstExample.tex

Some more \(\LaTeX\) features and tips

Here are some more Latex features, and tips that might prove handy:

  • LaTeX can render pretty much every mathematical symbol and operator that you can think of (plenty of lists and cheat-sheets online)

  • Long documents can be split into separate .tex documents and combined using the \input{} command

  • You can use bibliography managers such as Mendeley or Zotero to export and maintain/update .bib files that are then ready to be used in a Latex document

  • You can define new environments and commands in the preamble (which can also be kept as a separate document and inserted using the \input{} command)
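As a sketch of the last two tips, the file below defines two new commands and one new environment in its preamble. All names (\species, \Kmax, sidenote, Introduction.tex) are hypothetical and chosen only for illustration:

```latex
\documentclass{article}

% A new command with one argument (#1): typeset a species name in italics
\newcommand{\species}[1]{\textit{#1}}

% A new command without arguments: a shorthand for a symbol used often
\newcommand{\Kmax}{K_{\max}}

% A new environment: an indented, small-print block for side notes
\newenvironment{sidenote}
  {\begin{quote}\small\textbf{Note:} }
  {\end{quote}}

\begin{document}

The carrying capacity of \species{Daphnia magna} is $\Kmax$.

\begin{sidenote}
  This text is typeset by the new environment.
\end{sidenote}

% A long document can pull in sections kept in separate files, e.g.
% \input{Introduction}  % hypothetical file Introduction.tex

\end{document}
```

Defining such shorthands once in the preamble means a later change of notation only has to be made in one place.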

Practicals

First \(\LaTeX\) example

Test CompileLaTeX.sh with FirstExample.tex and bring it under version control under week1 in your repository. Make sure that CompileLaTeX.sh will work if somebody else runs it from their computer using FirstExample.tex as an input.

Practicals wrap-up

Make sure you have your Week 1 directory organized with Data, Sandbox and Code with the necessary files and this week’s (functional!) scripts in there. Every script should run without errors on my computer. This includes the five solutions (single-line commands you came up with) in UnixPrac1.txt.

Commit and push every time you do some significant amount of coding work (after testing it), and then again before the given deadline (this will be announced in class).

Readings & Resources

\(\LaTeX\) Templates

  • There are lots of LaTeX templates online, such as those for typesetting theses from particular institutions, or papers for a specific journal. There are some examples in the TheMulQuaBio repo (under code).

  • The Overleaf templates are extensive.

\(\LaTeX\) Tables