{
"cells": [
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"tags": [
"remove-cell"
]
},
"outputs": [],
"source": [
"library(repr) ; options(repr.plot.width=6, repr.plot.height= 6) # Change plot sizes (in cm)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Biological Computing in R"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Introduction \n",
"\n",
"R is a freely available statistical software with strong programming capabilities widely used by professional scientists around the world. It was based on the commercial statistical software `S` by Robert Gentleman and Ross Ihaka. The first stable version appeared in 2000. It was essentially designed for *programming* statistical analysis and data-mining. It became the standard tool for data analysis and visualization in biology in a matter of just 10 years or so. It is also frequently used for mathematical modelling in biology.\n",
"\n",
"This chapter aims to lay down the foundations for you to use R for scientific computing in biology by exploiting it full potential as a fully featured object-oriented programming language. Specifically, this chapter aims at teaching you:\n",
"\n",
"* Basic R syntax and programming conventions, assuming you have never set your eyes on R\n",
"* Principles of data processing and exploration (including visualization) using R\n",
"* Principles of clean and efficient programming using R\n",
"* To generate publication quality graphics in R \n",
"* To develop reproducible data analysis \"work flows\" so you (or anybody else) can run and re-run your analyses, graphics outputs and all, in R\n",
"* To make R simulations more efficient using vectorization\n",
"* To find and fix errors in R code using debugging \n",
"* To make data wrangling and analyses more efficient and convenient using special packages\n",
"* Some additional, advanced topics (accessing databases, building your own packages, etc.).\n",
"\n",
"### Why R?\n",
"\n",
"There are many commercial statistical (minitab, SPSS, etc) software packages in the world that are mouse-driven, warm, and friendly, and have lots of statistical tests and plotting/graphing capabilities. Why not just use them? Here are some very good reasons:\n",
"\n",
"* R has numerous tried and tested packages for data-handling and processing \n",
"* R provides basically every statistical test you'll ever need and is constantly being improved. You can tailor your analyses rather than trying to use the more limited options each statistical software package can offer.\n",
"* R has excellent graphing and visualization capabilities and produce publication-quality graphics that can be re-produced with scripts – you won't get [RSI](https://en.wikipedia.org/wiki/Repetitive_strain_injury) mouse-clicking your way though graphing and re-graphing your data every time you change your analysis!\n",
"* It also has good capabilities for mathematical calculations, including matrix algebra \n",
"* R is scriptable, so you can build a perfectly repeatable record of your analysis. This in itself has several advantages:\n",
" * You can never replicate *exactly* the same analysis with all the same steps using a point-and-click approach/software. With R you can reproduce your full analysis for yourself (in the future!), your colleagues, your supervisor/employer, and any journal you might want to submit your work to.\n",
" * You may need to rerun your analysis every time you get new data. Once you have it all in a R script, you can just rerun your analysis and go home!\n",
" * You may need to tweak your analysis many times (new data, supervisor changes mind, you change mind, paper reviewers want you do something differently). Having the analysis recorded as script then allows you to do so by revising the relevant parts of your analysis with relatively little pain. \n",
"* R is freely available for all common computer operating systems – if you want a copy on your laptop, help yourself at the [CRAN website](https://cran.r-project.org).\n",
"* Being able to program in R means you can develop and automate your own data handling, statistical analysis, and graphing/plotting, a set of skills you are likely to need in many, if not most careers paths.\n",
"\n",
"#### Would you ever need anything other than R?\n",
"\n",
"Being able to program R means you can develop and automate your statistical analyses and the generation of figures into a reproducible work flow. For many of you, using R as your only programming language will do the job. However, if your work also includes extensive numerical simulations, manipulation of very large matrices, bioinformatics, relational database access and manipulation, or web development, you will be better-off *also* knowing another programming language that is more versatile and computationally efficient (like Python or Julia).\n",
"\n",
"### Installing R\n",
"\n",
"If you are using a college computer, R will likely already be available.\n",
"\n",
"Otherwise you can follow [these instructions](https://imperial-fons-computing.github.io/rstudio.html) to install R on your own computer.\n",
"\n",
"In particular, on Ubuntu Linux, it is as simple as typing the following in terminal:\n",
"\n",
"```bash\n",
"sudo apt install r-base r-base-dev\n",
"```\n",
"\n",
"## Getting started\n",
"\n",
"You should be using an IDE for R. Please re-visit the \"To IDE or not to IDE\" section of the [Introduction](../intro.md) if you are not familiar with IDEs.\n",
"\n",
"Let's briefly look at the bare-bones R interface and command line interface (CLI), and then switch to an IDE like Visual Studio Code or RStudio.\n",
"\n",
"Launch R (From Applications menu on Window or Mac, from terminal in Linux/Ubuntu) — it should look something like this (on Linux/Ubuntu or Mac terminal):\n",
"\n",
"---\n",
"\n",
":::{figure-md} R-Linux-console\n",
"\n",
"\n",
"\n",
"**The R console in Linux/Unix.** \n",
"\n",
":::\n",
"\n",
"Or like this (Windows \"console\", similar in Mac):\n",
" \n",
"---\n",
"\n",
":::{figure-md} R-Windows-console\n",
"\n",
"\n",
"\n",
"**The R console in Windows/Mac OS.** \n",
"\n",
":::\n",
"\n",
"---\n",
"\n",
"\n",
"## R basics\n",
"\n",
"Lets get started with some R basics. You will be working by entering R commands interactively at the R user prompt (`>`). Up and down arrow keys scroll through your command history. \n",
"\n",
"### Useful R commands\n",
"\n",
"|Command| What it does|\n",
"|:-|:-|\n",
"| `ls()`| list all the variables in the work space |\n",
"| `rm('a', 'b')`| remove variable(s) `a` and `b`|\n",
"| `rm(list=ls())`| remove all variable(s)|\n",
"| `getwd()`| get current working directory |\n",
"| `setwd('Path')`| set working directory to `Path`|\n",
"| `q()`| quit R |\n",
"| `?Command`| show the documentation of `Command`|\n",
"| `??Keyword`| search the all packages/functions with 'Keyword', \"fuzzy search\"|\n",
"\n",
"### Baby steps\n",
"\n",
"Like in any programming language, you will need to use \"variables\" to store information in a R session's workspace. Each variable has a reserved location in your [RAM](https://en.wikipedia.org/wiki/Random-access_memory), and takes up \"real estate\" in it — that is when you create a variable you reserve some space in your computer's memory.\n",
"\n",
"★ Now, let's try assigning a few variables in R and doing things to them:"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"a <- 4 # store 4 as variable a"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"4"
],
"text/latex": [
"4"
],
"text/markdown": [
"4"
],
"text/plain": [
"[1] 4"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"a # display it"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"16"
],
"text/latex": [
"16"
],
"text/markdown": [
"16"
],
"text/plain": [
"[1] 16"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"a * a # product"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Store a variable:"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [],
"source": [
"a_squared <- a * a "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"```{note}\n",
"Unlike Python or most other programming languages, R uses the `<-` [operator to assign variables](https://stat.ethz.ch/R-manual/R-devel/library/base/html/assignOps.html). You can use `=` as well, but it does not work everywhere, so better to stick with `<-`.\n",
"```"
]
},
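{
"cell_type": "markdown",
"metadata": {},
"source": [
"One place where the two operators behave differently is inside a function call. The sketch below (with made-up variable names `y1`, `m1` and `m2`) suggests why sticking with `<-` for assignment avoids surprises:\n",
"\n",
"```r\n",
"m1 <- mean(y1 <- c(1, 2, 3))  # `<-` creates y1 in the workspace AND passes it to mean()\n",
"y1                            # 1 2 3\n",
"m2 <- mean(x = c(4, 5, 6))    # here `=` only names the argument x of mean(); it creates no variable\n",
"c(m1, m2)                     # 2 5\n",
"```"
]
},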
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"4"
],
"text/latex": [
"4"
],
"text/markdown": [
"4"
],
"text/plain": [
"[1] 4"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"sqrt(a_squared) # square root"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Build a vector with `c` (stands for \"`c`oncatenate\"): "
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [],
"source": [
"v <- c(0, 1, 2, 3, 4)"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"
- 0
- 1
- 2
- 3
- 4
\n"
],
"text/latex": [
"\\begin{enumerate*}\n",
"\\item 0\n",
"\\item 1\n",
"\\item 2\n",
"\\item 3\n",
"\\item 4\n",
"\\end{enumerate*}\n"
],
"text/markdown": [
"1. 0\n",
"2. 1\n",
"3. 2\n",
"4. 3\n",
"5. 4\n",
"\n",
"\n"
],
"text/plain": [
"[1] 0 1 2 3 4"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"v # Display the vector-valued variable you created"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Note that any text after a \"#\" is ignored by R, like in many other languages — handy for commenting. \n",
"\n",
"*In general, please comment your code and scripts, for *everybody's* sake. You will be amazed by how difficult it is to read and understand what a certain R script does (or any other script, for that matter) without judicious comments — even scripts you yourself wrote not so long ago!\n",
"\n",
"```{tip}\n",
"\n",
" **The Concatenate function:** `c()`(concatenate) is one of the most commonly used R functions because it is the default method for combining multiple arguments into a vector. To learn more about it, type `?c` at the R prompt and hit enter. \n",
"```"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"TRUE"
],
"text/latex": [
"TRUE"
],
"text/markdown": [
"TRUE"
],
"text/plain": [
"[1] TRUE"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"is.vector(v) # check if v's a vector"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"2"
],
"text/latex": [
"2"
],
"text/markdown": [
"2"
],
"text/plain": [
"[1] 2"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"mean(v) # mean"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Thus, a *vector* is like a single column or row in a *spreadsheet* (just like it is in [Python](05-Python_I.ipynb)). Multiple vectors can be combined to make a matrix (the full spreadsheet). \n",
"\n",
"This is one of many ways R stores and processes data. More on R data types and objects below. \n",
"\n",
"A single value (any kind) is a vector object of length 1 by default. That's why in the R console you see `[1]` before any single-value output (e.g., type `8`, and you will see `[1] 8`)."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Some examples of operations on vectors: "
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"2.5"
],
"text/latex": [
"2.5"
],
"text/markdown": [
"2.5"
],
"text/plain": [
"[1] 2.5"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"var(v) # variance"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"2"
],
"text/latex": [
"2"
],
"text/markdown": [
"2"
],
"text/plain": [
"[1] 2"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"median(v) # median"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"10"
],
"text/latex": [
"10"
],
"text/markdown": [
"10"
],
"text/plain": [
"[1] 10"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"sum(v) # sum all elements"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"120"
],
"text/latex": [
"120"
],
"text/markdown": [
"120"
],
"text/plain": [
"[1] 120"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"prod(v + 1) # multiply"
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"5"
],
"text/latex": [
"5"
],
"text/markdown": [
"5"
],
"text/plain": [
"[1] 5"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"length(v) # how many elements in the vector"
]
},
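{
"cell_type": "markdown",
"metadata": {},
"source": [
"The functions above each collapse a vector to a single value. Arithmetic and comparison operators, in contrast, work element by element. A small sketch, re-creating the same `v` as above:\n",
"\n",
"```r\n",
"v <- c(0, 1, 2, 3, 4)\n",
"v * 2    # 0 2 4 6 8: each element is doubled\n",
"v + v    # 0 2 4 6 8: elements are added pairwise\n",
"v^2      # 0 1 4 9 16\n",
"v > 2    # FALSE FALSE FALSE TRUE TRUE: one logical value per element\n",
"```"
]
},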
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Variable names and Tabbing\n",
"\n",
"In R, you can name variables in the following way to keep track of related variables:"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {},
"outputs": [],
"source": [
"wing.width.cm <- 1.2 #Using dot notation"
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {},
"outputs": [],
"source": [
"wing.length.cm <- c(4.7, 5.2, 4.8)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This can be handy; type:\n",
"\n",
"```r\n",
"wing.\n",
"```\n",
"\n",
"And then hit the `tab`key to reveal all variables in that category. This is nice — variable names should be as obvious as possible. However, they should not be over-long either! Good style and readability is more important than just convenient variable names. \n",
"\n",
"In fact, R allows dots to be used in variable names of all objects (not just objects that are variables). For example, functions names can have dots in them as well, as you will see below with the `is.*` family (e.g., `is.infinite()`, `is.nan()`, etc.). \n",
"\n",
"\n",
"```{tip}\n",
"Following a consistent style for different elements of your R code (synttax, conditionals, functions, etc) is important. The tidyverse [style guide](https://style.tidyverse.org/index.html) is recommended. While should keep going back to difefrent sections of the syule gude as you learn new topics. Here, for starters, have a look at [naming conventions](https://style.tidyverse.org/syntax.html). \n",
"```"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Operators\n",
"The usual operators are available in R:\n",
"\n",
"|Operator||\n",
"|:-|:-|\n",
"| `+`| Addition |\n",
"| `-`| Subtraction|\n",
"| `*`| Multiplication|\n",
"| `/`| Division|\n",
"| `^`| Power|\n",
"| `%%`| Modulo|\n",
"| `%/%`| Integer division|\n",
"| `==`| Equals|\n",
"| `!=`| Differs|\n",
"| `>`| Greater|\n",
"| `>=`| Greater or equal|\n",
"| `&`| Logical and|\n",
"| `|` | Logical or|\n",
"| `!`| Logical not|"
]
},
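{
"cell_type": "markdown",
"metadata": {},
"source": [
"A few of the less familiar operators from the table are easiest to understand by trying them. The values below are arbitrary and chosen only for illustration:\n",
"\n",
"```r\n",
"7 %% 3              # 1: the remainder after dividing 7 by 3\n",
"7 %/% 3             # 2: how many whole times 3 goes into 7\n",
"2^3                 # 8\n",
"5 != 4              # TRUE\n",
"(5 > 4) & (2 > 3)   # FALSE: both sides must be TRUE\n",
"(5 > 4) | (2 > 3)   # TRUE: at least one side is TRUE\n",
"!(5 > 4)            # FALSE\n",
"```"
]
},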
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### When things go wrong\n",
"\n",
"Here are some common syntax errors that you might run into in R, especially when you just beginning to learn this language: \n",
"\n",
"\n",
"* Missing a closing bracket leads to continuation line, which looks something like this, with a `+` at the end:\n",
"\n",
"```r\n",
"x <- (1 + (2 * 3)\n",
"+ \n",
"```\n",
"\n",
"Hit `Ctrl-C`(UNIX terminal or base R command line) or ESC (in in RStudio) or keep typing!\n",
"\n",
"* Too many parentheses; for example, `2 + (2*3))`\n",
"\n",
"* Wrong or mismatched brackets (see next subsection)\n",
"\n",
"* Mixing double and single quotes will also give you an error\n",
"\n",
"\n",
"When things are taking too long and the R console seems frozen, try `Ctrl + C` (UNIX terminal or base R command line) or ESC (in RStudio) to force an exit from whatever is going on."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Types of parentheses\n",
"\n",
"R has specific uses for different types of Parentheses type that you need to get used to:\n",
"\n",
"| Parentheses type| What it does|\n",
"|:-|:-|\n",
"|`f(3,4)`| call the function (or command) f, with the arguments 3 & 4. |\n",
"|`a + (b*c)`| to enforce order over which statements or calculations are executed. Here `(b*c)`is executed before adding to `a`. Here is an alternative order: `(a + b)*c`|\n",
"|`{expr1; expr2;...exprn}` | group a set of expressions or commands into one compound expression. Value returned is value of last expression; used in building function, loops, and conditionals (more on these soon!).|\n",
"|`x[4]`| get the 4th element of the vector `x`.|\n",
"| `li[[3]]`| get the 3rd element of some list `li`, and return it.(compare with `li[3]`, which returns a list with just the 3rd element inside). More on lists in next section."
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {},
"outputs": [],
"source": [
"li = list(c(1,2,3))"
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"'list'"
],
"text/latex": [
"'list'"
],
"text/markdown": [
"'list'"
],
"text/plain": [
"[1] \"list\""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"class(li)"
]
},
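{
"cell_type": "markdown",
"metadata": {},
"source": [
"To see the difference between single brackets, double brackets and braces in practice, here is a minimal sketch. It re-creates the same one-element list `li`; the names `tmp` and `y` are made up for illustration:\n",
"\n",
"```r\n",
"li <- list(c(1, 2, 3))\n",
"class(li[1])      # \"list\": single brackets return a (smaller) list\n",
"class(li[[1]])    # \"numeric\": double brackets return the element itself\n",
"li[[1]][2]        # 2: the second element of the vector stored in the list\n",
"\n",
"y <- { tmp <- 2; tmp * 10 }  # braces group expressions; the last value is returned\n",
"y                 # 20\n",
"```"
]
},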
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Variable types\n",
"\n",
"There are different kinds of data variable types in R, but you will basically need to know four for most of your work: integer, float (or \"numeric\", including real numbers), string (or \"character\", e.g., \n",
"text), and Boolean (\"logical\"; `True`or `FALSE`). \n",
"\n",
"Try this:"
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {},
"outputs": [],
"source": [
"v <- TRUE"
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
"text/html": [
"TRUE"
],
"text/latex": [
"TRUE"
],
"text/markdown": [
"TRUE"
],
"text/plain": [
"[1] TRUE"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"v"
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
"text/html": [
"'logical'"
],
"text/latex": [
"'logical'"
],
"text/markdown": [
"'logical'"
],
"text/plain": [
"[1] \"logical\""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"class(v)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The `class()` function tells you what type of variable any object in the workspace is."
]
},
{
"cell_type": "code",
"execution_count": 24,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"'numeric'"
],
"text/latex": [
"'numeric'"
],
"text/markdown": [
"'numeric'"
],
"text/plain": [
"[1] \"numeric\""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"v <- 3.2\n",
"class(v)"
]
},
{
"cell_type": "code",
"execution_count": 25,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"'integer'"
],
"text/latex": [
"'integer'"
],
"text/markdown": [
"'integer'"
],
"text/plain": [
"[1] \"integer\""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"v <- 2L\n",
"class(v)"
]
},
{
"cell_type": "code",
"execution_count": 26,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"'character'"
],
"text/latex": [
"'character'"
],
"text/markdown": [
"'character'"
],
"text/plain": [
"[1] \"character\""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"v <- \"A string\"\n",
"class(v)"
]
},
{
"cell_type": "code",
"execution_count": 27,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"'logical'"
],
"text/latex": [
"'logical'"
],
"text/markdown": [
"'logical'"
],
"text/plain": [
"[1] \"logical\""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"b <- NA\n",
"class(b)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Also, the `is.*` family of functions allow you to check if a variable is a specific in R: "
]
},
{
"cell_type": "code",
"execution_count": 28,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"TRUE"
],
"text/latex": [
"TRUE"
],
"text/markdown": [
"TRUE"
],
"text/plain": [
"[1] TRUE"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"is.na(b)"
]
},
{
"cell_type": "code",
"execution_count": 29,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"NaN"
],
"text/latex": [
"NaN"
],
"text/markdown": [
"NaN"
],
"text/plain": [
"[1] NaN"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"b <- 0/0\n",
"b"
]
},
{
"cell_type": "code",
"execution_count": 30,
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
"text/html": [
"TRUE"
],
"text/latex": [
"TRUE"
],
"text/markdown": [
"TRUE"
],
"text/plain": [
"[1] TRUE"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"is.nan(b)"
]
},
{
"cell_type": "code",
"execution_count": 31,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"Inf"
],
"text/latex": [
"Inf"
],
"text/markdown": [
"Inf"
],
"text/plain": [
"[1] Inf"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"b <- 5/0\n",
"b"
]
},
{
"cell_type": "code",
"execution_count": 32,
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
"text/html": [
"FALSE"
],
"text/latex": [
"FALSE"
],
"text/markdown": [
"FALSE"
],
"text/plain": [
"[1] FALSE"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"is.nan(b)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"```{tip}\n",
"Beware of the difference between `NA` (`N`ot` A`vailable) and `NaN` (`N`ot` a N`umber). R will use `NA`to represent/identify missing values in data or outputs, while `NaN`represent nonsense values (e.g., 0/0) that cannot be represented as a number or some other data type. \n",
"\n",
"See what R has to say about this: try `?is.nan`, `?is.na`, `?NA`, `?NaN`in the R commandline (one at a time!).\n",
"\n",
"There are also `Inf` (Infinity, e.g., 1/0), and `NULL` (variable not set) value types. Look these up as well using `?`.\n",
"```"
]
},
{
"cell_type": "code",
"execution_count": 181,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"TRUE"
],
"text/latex": [
"TRUE"
],
"text/markdown": [
"TRUE"
],
"text/plain": [
"[1] TRUE"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"is.infinite(b)"
]
},
{
"cell_type": "code",
"execution_count": 182,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"FALSE"
],
"text/latex": [
"FALSE"
],
"text/markdown": [
"FALSE"
],
"text/plain": [
"[1] FALSE"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"is.finite(b)"
]
},
{
"cell_type": "code",
"execution_count": 183,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"FALSE"
],
"text/latex": [
"FALSE"
],
"text/markdown": [
"FALSE"
],
"text/plain": [
"[1] FALSE"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"is.finite(0/0)"
]
},
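{
"cell_type": "markdown",
"metadata": {},
"source": [
"The special values behave differently from each other in ways that matter when you analyse real data. The following sketch uses a made-up vector `obs` with one missing measurement:\n",
"\n",
"```r\n",
"is.na(NaN)               # TRUE: NaN also counts as \"missing\"\n",
"is.nan(NA)               # FALSE: but NA is not NaN\n",
"\n",
"obs <- c(1.5, NA, 3.5)   # hypothetical measurements, one missing\n",
"mean(obs)                # NA: a missing value propagates through calculations\n",
"mean(obs, na.rm = TRUE)  # 2.5: drop missing values before averaging\n",
"\n",
"length(NULL)             # 0: NULL is an empty object, not a missing value\n",
"is.null(NULL)            # TRUE\n",
"-1/0                     # -Inf: infinity has a sign\n",
"```"
]
},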
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Type Conversions and Special Values\n",
"\n",
"The `as.*` commands all convert a variable from one type to another. Try out the following examples:"
]
},
{
"cell_type": "code",
"execution_count": 184,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"3"
],
"text/latex": [
"3"
],
"text/markdown": [
"3"
],
"text/plain": [
"[1] 3"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"as.integer(3.1)"
]
},
{
"cell_type": "code",
"execution_count": 185,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"4"
],
"text/latex": [
"4"
],
"text/markdown": [
"4"
],
"text/plain": [
"[1] 4"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"as.numeric(4)"
]
},
{
"cell_type": "code",
"execution_count": 186,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[1] CLV"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"as.roman(155)"
]
},
{
"cell_type": "code",
"execution_count": 187,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"'155'"
],
"text/latex": [
"'155'"
],
"text/markdown": [
"'155'"
],
"text/plain": [
"[1] \"155\""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"as.character(155) # same as converting to string"
]
},
{
"cell_type": "code",
"execution_count": 188,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"TRUE"
],
"text/latex": [
"TRUE"
],
"text/markdown": [
"TRUE"
],
"text/plain": [
"[1] TRUE"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"as.logical(5)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"*What just happened?!* R maps all values other than `0` to logical `True`, and `0` to `False`. This can be useful in some cases, for example, when you want to convert all your data to Presence-Absence only."
]
},
{
"cell_type": "code",
"execution_count": 189,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"FALSE"
],
"text/latex": [
"FALSE"
],
"text/markdown": [
"FALSE"
],
"text/plain": [
"[1] FALSE"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"as.logical(0)"
]
},
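{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a sketch of the presence-absence idea, and of two conversion behaviours that often surprise newcomers, consider the following. The vector `counts` holds made-up abundance data:\n",
"\n",
"```r\n",
"counts <- c(0, 3, 0, 12)             # hypothetical abundance counts\n",
"present <- as.logical(counts)        # FALSE TRUE FALSE TRUE\n",
"as.integer(present)                  # 0 1 0 1: presence-absence coded as 0/1\n",
"\n",
"as.integer(3.9)                      # 3: the decimal part is dropped, not rounded\n",
"as.integer(-3.9)                     # -3\n",
"as.numeric(\"1.5\") + 1                # 2.5: text that looks like a number converts cleanly\n",
"suppressWarnings(as.numeric(\"abc\"))  # NA: text that does not becomes NA (with a warning)\n",
"```"
]
},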
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Also, *keep an eye out for [E notation](https://en.wikipedia.org/wiki/Scientific_notation) in outputs of R functions for statistical analyses, and learn to interpret numbers formatted this way.* R uses E notation in outputs of statistical tests to display very large or small numbers. If you are not used to different representations of long numbers, the E notation might be confusing. \n",
"\n",
"Try this:"
]
},
{
"cell_type": "code",
"execution_count": 190,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"10000"
],
"text/latex": [
"10000"
],
"text/markdown": [
"10000"
],
"text/plain": [
"[1] 10000"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"1E4"
]
},
{
"cell_type": "code",
"execution_count": 191,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"10000"
],
"text/latex": [
"10000"
],
"text/markdown": [
"10000"
],
"text/plain": [
"[1] 10000"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"1e4"
]
},
{
"cell_type": "code",
"execution_count": 192,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"0.05"
],
"text/latex": [
"0.05"
],
"text/markdown": [
"0.05"
],
"text/plain": [
"[1] 0.05"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"5e-2"
]
},
{
"cell_type": "code",
"execution_count": 193,
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
"text/html": [
"1e+08"
],
"text/latex": [
"1e+08"
],
"text/markdown": [
"1e+08"
],
"text/plain": [
"[1] 1e+08"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"1E4^2"
]
},
{
"cell_type": "code",
"execution_count": 194,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"3.33333333333333e-09"
],
"text/latex": [
"3.33333333333333e-09"
],
"text/markdown": [
"3.33333333333333e-09"
],
"text/plain": [
"[1] 3.333333e-09"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"1 / 3 / 1e8"
]
},
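{
"cell_type": "markdown",
"metadata": {},
"source": [
"If you want to check your reading of a number in E notation, or control how R prints it, a sketch along these lines may help. The numbers are arbitrary examples:\n",
"\n",
"```r\n",
"1.5e3 == 1500                         # TRUE: e3 means \"times 10^3\"\n",
"2.2e-16 < 0.05                        # TRUE: a tiny number such as a very small p-value\n",
"format(123456789, scientific = FALSE) # \"123456789\": force ordinary notation\n",
"format(123456, scientific = TRUE)     # \"1.23456e+05\": force E notation\n",
"```\n",
"\n",
"Note that `format()` returns text (a character string), so use it for display only, not for further calculations."
]
},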
{
"cell_type": "markdown",
"metadata": {},
"source": [
"```{tip} \n",
"**Boolean arguments in R**: In `R`, you can use `F` and `T` for boolean `FALSE` and `TRUE` respectively. To see this, type \n",
"\n",
"`a <- T`\n",
"\n",
"in the R commandline, and then see what R returns when you type `a`. Using `F` and `T` for boolean `FALSE` and `TRUE` respectively is not necessarily good practice, but be aware that this option exists. \n",
"```"
]
},
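{
"cell_type": "markdown",
"metadata": {},
"source": [
"The risk with the `T` and `F` shorthand can be demonstrated in a few lines. This is only an illustration of the pitfall, not something to do in real code:\n",
"\n",
"```r\n",
"T <- 0        # allowed: T is just a variable name, so it can be overwritten\n",
"isTRUE(T)     # FALSE: code relying on T now misbehaves\n",
"rm(T)         # remove our variable; T refers to R's built-in TRUE again\n",
"isTRUE(T)     # TRUE\n",
"# TRUE <- 0   # this, in contrast, is an error because TRUE is a reserved word\n",
"```"
]
},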
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## R Data Structures\n",
"\n",
"R comes with different built-in structures (objects) for data storage and manipulation. Mastering these, and knowing which one to use when will help you write better, more efficient programs and also handle diverse datasets (numbers, counts, names, dates, etc). \n",
"\n",
"```{note}\n",
"**Data \"structures\" vs. \"objects\" in R**: You will often see the terms \"object\" and \"data structure\" used in this and other chapters. These two have a very distinct meaning in object-oriented programming (OOP) languages like R and Python. A data structure is just a \"dumb\" container for data (e.g., a vector). An object, on the other hand can be a data structure, but also any other variable or a function in your R environment. R, being an OOP language, converts everything in the current environment to an object so that it knows what to do with each such entity — each object type has its own set of rules for operations and manipulations that R uses when interpreting your commands. \n",
"```\n",
"\n",
"### Vectors\n",
"The Vector, which you have already seen above, is a fundamental data object / structure in R. Scalars (single data values) are treated as vector of length 1. *A vector is like a single column or row in a spreadsheet.*\n",
"\n",
"★ Now get back into R (if you somehow quit R using `q()`or something else), and try this:"
]
},
{
"cell_type": "code",
"execution_count": 195,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"TRUE"
],
"text/latex": [
"TRUE"
],
"text/markdown": [
"TRUE"
],
"text/plain": [
"[1] TRUE"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"a <- 5\n",
"is.vector(a)"
]
},
{
"cell_type": "code",
"execution_count": 196,
"metadata": {},
"outputs": [],
"source": [
"v1 <- c(0.02, 0.5, 1)\n",
"v2 <- c(\"a\", \"bc\", \"def\", \"ghij\")\n",
"v3 <- c(TRUE, TRUE, FALSE)"
]
},
{
"cell_type": "code",
"execution_count": 197,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"- 0.02
- 0.5
- 1
\n"
],
"text/latex": [
"\\begin{enumerate*}\n",
"\\item 0.02\n",
"\\item 0.5\n",
"\\item 1\n",
"\\end{enumerate*}\n"
],
"text/markdown": [
"1. 0.02\n",
"2. 0.5\n",
"3. 1\n",
"\n",
"\n"
],
"text/plain": [
"[1] 0.02 0.50 1.00"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"\n",
"- 'a'
- 'bc'
- 'def'
- 'ghij'
\n"
],
"text/latex": [
"\\begin{enumerate*}\n",
"\\item 'a'\n",
"\\item 'bc'\n",
"\\item 'def'\n",
"\\item 'ghij'\n",
"\\end{enumerate*}\n"
],
"text/markdown": [
"1. 'a'\n",
"2. 'bc'\n",
"3. 'def'\n",
"4. 'ghij'\n",
"\n",
"\n"
],
"text/plain": [
"[1] \"a\" \"bc\" \"def\" \"ghij\""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"\n",
"- TRUE
- TRUE
- FALSE
\n"
],
"text/latex": [
"\\begin{enumerate*}\n",
"\\item TRUE\n",
"\\item TRUE\n",
"\\item FALSE\n",
"\\end{enumerate*}\n"
],
"text/markdown": [
"1. TRUE\n",
"2. TRUE\n",
"3. FALSE\n",
"\n",
"\n"
],
"text/plain": [
"[1] TRUE TRUE FALSE"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"v1;v2;v3"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"R vectors can only store data of a single type (e.g., all numeric or all character). If you try to combine different types, R will homogenize everything to the same data type. To see this, try the following:"
]
},
{
"cell_type": "code",
"execution_count": 198,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"- 0.02
- 1
- 1
\n"
],
"text/latex": [
"\\begin{enumerate*}\n",
"\\item 0.02\n",
"\\item 1\n",
"\\item 1\n",
"\\end{enumerate*}\n"
],
"text/markdown": [
"1. 0.02\n",
"2. 1\n",
"3. 1\n",
"\n",
"\n"
],
"text/plain": [
"[1] 0.02 1.00 1.00"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"v1 <- c(0.02, TRUE, 1)\n",
"v1"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"TRUE gets converted to 1.00!"
]
},
{
"cell_type": "code",
"execution_count": 199,
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
"text/html": [
"\n",
"- '0.02'
- 'Mary'
- '1'
\n"
],
"text/latex": [
"\\begin{enumerate*}\n",
"\\item '0.02'\n",
"\\item 'Mary'\n",
"\\item '1'\n",
"\\end{enumerate*}\n"
],
"text/markdown": [
"1. '0.02'\n",
"2. 'Mary'\n",
"3. '1'\n",
"\n",
"\n"
],
"text/plain": [
"[1] \"0.02\" \"Mary\" \"1\" "
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"v1 <- c(0.02, \"Mary\", 1)\n",
"v1"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Everything gets converted to text!\n",
"\n",
"Basically, the function `c` \"coerces\" arguments that are of mixed types (strings/text, real numbers, logical arguments, etc) to a common type.\n",
"\n",
"(R-matrices)=\n",
"### Matrices and arrays\n",
"\n",
"An R \"matrix\" is a two-dimensional vector (it has both rows and columns), and an \"array\" can store data in more than two dimensions (e.g., a stack of 2-D matrices). \n",
"\n",
"R has many functions to build and manipulate matrices and arrays. \n",
"\n",
"Try this:"
]
},
{
"cell_type": "code",
"execution_count": 200,
"metadata": {},
"outputs": [
{
"data": {
"text/latex": [
"A matrix: 5 × 5 of type int\n",
"\\begin{tabular}{lllll}\n",
"\t 1 & 6 & 11 & 16 & 21\\\\\n",
"\t 2 & 7 & 12 & 17 & 22\\\\\n",
"\t 3 & 8 & 13 & 18 & 23\\\\\n",
"\t 4 & 9 & 14 & 19 & 24\\\\\n",
"\t 5 & 10 & 15 & 20 & 25\\\\\n",
"\\end{tabular}\n"
],
"text/markdown": [
"\n",
"A matrix: 5 × 5 of type int\n",
"\n",
"| 1 | 6 | 11 | 16 | 21 |\n",
"| 2 | 7 | 12 | 17 | 22 |\n",
"| 3 | 8 | 13 | 18 | 23 |\n",
"| 4 | 9 | 14 | 19 | 24 |\n",
"| 5 | 10 | 15 | 20 | 25 |\n",
"\n"
],
"text/plain": [
" [,1] [,2] [,3] [,4] [,5]\n",
"[1,] 1 6 11 16 21 \n",
"[2,] 2 7 12 17 22 \n",
"[3,] 3 8 13 18 23 \n",
"[4,] 4 9 14 19 24 \n",
"[5,] 5 10 15 20 25 "
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"mat1 <- matrix(1:25, 5, 5)\n",
"mat1"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Note that the output in your terminal or console will be more like:"
]
},
{
"cell_type": "code",
"execution_count": 201,
"metadata": {
"scrolled": true
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" [,1] [,2] [,3] [,4] [,5]\n",
"[1,] 1 6 11 16 21\n",
"[2,] 2 7 12 17 22\n",
"[3,] 3 8 13 18 23\n",
"[4,] 4 9 14 19 24\n",
"[5,] 5 10 15 20 25\n"
]
}
],
"source": [
"print(mat1)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"(even if you don't use the `print()` command)\n",
"\n",
"The output looks different above because the R code is running in a Jupyter notebook. \n",
"\n",
"Now try this:"
]
},
{
"cell_type": "code",
"execution_count": 202,
"metadata": {},
"outputs": [
{
"data": {
"text/latex": [
"A matrix: 5 × 5 of type int\n",
"\\begin{tabular}{lllll}\n",
"\t 1 & 2 & 3 & 4 & 5\\\\\n",
"\t 6 & 7 & 8 & 9 & 10\\\\\n",
"\t 11 & 12 & 13 & 14 & 15\\\\\n",
"\t 16 & 17 & 18 & 19 & 20\\\\\n",
"\t 21 & 22 & 23 & 24 & 25\\\\\n",
"\\end{tabular}\n"
],
"text/markdown": [
"\n",
"A matrix: 5 × 5 of type int\n",
"\n",
"| 1 | 2 | 3 | 4 | 5 |\n",
"| 6 | 7 | 8 | 9 | 10 |\n",
"| 11 | 12 | 13 | 14 | 15 |\n",
"| 16 | 17 | 18 | 19 | 20 |\n",
"| 21 | 22 | 23 | 24 | 25 |\n",
"\n"
],
"text/plain": [
" [,1] [,2] [,3] [,4] [,5]\n",
"[1,] 1 2 3 4 5 \n",
"[2,] 6 7 8 9 10 \n",
"[3,] 11 12 13 14 15 \n",
"[4,] 16 17 18 19 20 \n",
"[5,] 21 22 23 24 25 "
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"mat1 <- matrix(1:25, 5, 5, byrow=TRUE)\n",
"mat1"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"That is, you can fill the elements of a matrix by row instead of by column (the default)."
]
},
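{
"cell_type": "markdown",
"metadata": {},
"source": [
"A short sketch of why the fill order matters, using the hypothetical names `by_col` and `by_row`: for a square matrix built from the same values, filling by row gives the transpose of filling by column, and elements are addressed as `[row, column]`.\n",
"\n",
"```r\n",
"by_col <- matrix(1:25, 5, 5)                # default: filled column by column\n",
"by_row <- matrix(1:25, 5, 5, byrow = TRUE)  # filled row by row\n",
"\n",
"identical(by_row, t(by_col))  # TRUE: each is the transpose of the other\n",
"by_col[2, 3]                  # row 2, column 3 of the column-filled matrix: 12\n",
"by_row[2, 3]                  # same position in the row-filled matrix: 8\n",
"by_row[2, ]                   # the whole second row: 6 7 8 9 10\n",
"```"
]
},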
{
"cell_type": "code",
"execution_count": 203,
"metadata": {},
"outputs": [
{
"data": {
"text/latex": [
"\\begin{enumerate*}\n",
"\\item 5\n",
"\\item 5\n",
"\\end{enumerate*}\n"
],
"text/markdown": [
"1. 5\n",
"2. 5\n",
"\n",
"\n"
],
"text/plain": [
"[1] 5 5"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"dim(mat1) #get the size of the matrix"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"To make an array consisting of two 5$\\times$5 matrices containing the integers 1-50:"
]
},
{
"cell_type": "code",
"execution_count": 204,
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
"text/latex": [
"A matrix: 5 × 5 of type int\n",
"\\begin{tabular}{lllll}\n",
"\t 1 & 6 & 11 & 16 & 21\\\\\n",
"\t 2 & 7 & 12 & 17 & 22\\\\\n",
"\t 3 & 8 & 13 & 18 & 23\\\\\n",
"\t 4 & 9 & 14 & 19 & 24\\\\\n",
"\t 5 & 10 & 15 & 20 & 25\\\\\n",
"\\end{tabular}\n"
],
"text/markdown": [
"\n",
"A matrix: 5 × 5 of type int\n",
"\n",
"| 1 | 6 | 11 | 16 | 21 |\n",
"| 2 | 7 | 12 | 17 | 22 |\n",
"| 3 | 8 | 13 | 18 | 23 |\n",
"| 4 | 9 | 14 | 19 | 24 |\n",
"| 5 | 10 | 15 | 20 | 25 |\n",
"\n"
],
"text/plain": [
" [,1] [,2] [,3] [,4] [,5]\n",
"[1,] 1 6 11 16 21 \n",
"[2,] 2 7 12 17 22 \n",
"[3,] 3 8 13 18 23 \n",
"[4,] 4 9 14 19 24 \n",
"[5,] 5 10 15 20 25 "
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"arr1 <- array(1:50, c(5, 5, 2))\n",
"arr1[,,1]"
]
},
{
"cell_type": "code",
"execution_count": 205,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
", , 1\n",
"\n",
" [,1] [,2] [,3] [,4] [,5]\n",
"[1,] 1 6 11 16 21\n",
"[2,] 2 7 12 17 22\n",
"[3,] 3 8 13 18 23\n",
"[4,] 4 9 14 19 24\n",
"[5,] 5 10 15 20 25\n",
"\n",
", , 2\n",
"\n",
" [,1] [,2] [,3] [,4] [,5]\n",
"[1,] 26 31 36 41 46\n",
"[2,] 27 32 37 42 47\n",
"[3,] 28 33 38 43 48\n",
"[4,] 29 34 39 44 49\n",
"[5,] 30 35 40 45 50\n",
"\n"
]
}
],
"source": [
"print(arr1)"
]
},
{
"cell_type": "code",
"execution_count": 206,
"metadata": {},
"outputs": [
{
"data": {
],
"text/latex": [
"A matrix: 5 × 5 of type int\n",
"\\begin{tabular}{lllll}\n",
"\t 26 & 31 & 36 & 41 & 46\\\\\n",
"\t 27 & 32 & 37 & 42 & 47\\\\\n",
"\t 28 & 33 & 38 & 43 & 48\\\\\n",
"\t 29 & 34 & 39 & 44 & 49\\\\\n",
"\t 30 & 35 & 40 & 45 & 50\\\\\n",
"\\end{tabular}\n"
],
"text/markdown": [
"\n",
"A matrix: 5 × 5 of type int\n",
"\n",
"| 26 | 31 | 36 | 41 | 46 |\n",
"| 27 | 32 | 37 | 42 | 47 |\n",
"| 28 | 33 | 38 | 43 | 48 |\n",
"| 29 | 34 | 39 | 44 | 49 |\n",
"| 30 | 35 | 40 | 45 | 50 |\n",
"\n"
],
"text/plain": [
" [,1] [,2] [,3] [,4] [,5]\n",
"[1,] 26 31 36 41 46 \n",
"[2,] 27 32 37 42 47 \n",
"[3,] 28 33 38 43 48 \n",
"[4,] 29 34 39 44 49 \n",
"[5,] 30 35 40 45 50 "
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"arr1[,,2]"
]
},
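{
"cell_type": "markdown",
"metadata": {},
"source": [
"Individual elements and slices of an array are picked out with one index per dimension. A minimal sketch, rebuilding the same array under the hypothetical name `arr`:\n",
"\n",
"```r\n",
"arr <- array(1:50, c(5, 5, 2))  # 5 rows, 5 columns, 2 layers\n",
"\n",
"dim(arr)      # 5 5 2\n",
"arr[2, 3, 2]  # row 2, column 3 of the second layer: 37\n",
"arr[1, , 2]   # first row of the second layer: 26 31 36 41 46\n",
"```\n",
"\n",
"Leaving an index blank, as in `arr[1, , 2]`, means \"everything along that dimension\"."
]
},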
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Just like R vectors, R matrices and arrays have to be of a homogeneous type, and R will do the same sort of type homogenization you saw for R vectors above. \n",
"\n",
"Try inserting a text value in `mat1` and see what happens: "
]
},
{
"cell_type": "code",
"execution_count": 207,
"metadata": {},
"outputs": [
{
"data": {
"text/latex": [
"A matrix: 5 × 5 of type chr\n",
"\\begin{tabular}{lllll}\n",
"\t one & 2 & 3 & 4 & 5 \\\\\n",
"\t 6 & 7 & 8 & 9 & 10\\\\\n",
"\t 11 & 12 & 13 & 14 & 15\\\\\n",
"\t 16 & 17 & 18 & 19 & 20\\\\\n",
"\t 21 & 22 & 23 & 24 & 25\\\\\n",
"\\end{tabular}\n"
],
"text/markdown": [
"\n",
"A matrix: 5 × 5 of type chr\n",
"\n",
"| one | 2 | 3 | 4 | 5 |\n",
"| 6 | 7 | 8 | 9 | 10 |\n",
"| 11 | 12 | 13 | 14 | 15 |\n",
"| 16 | 17 | 18 | 19 | 20 |\n",
"| 21 | 22 | 23 | 24 | 25 |\n",
"\n"
],
"text/plain": [
" [,1] [,2] [,3] [,4] [,5]\n",
"[1,] one 2 3 4 5 \n",
"[2,] 6 7 8 9 10 \n",
"[3,] 11 12 13 14 15 \n",
"[4,] 16 17 18 19 20 \n",
"[5,] 21 22 23 24 25 "
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"mat1[1,1] <- \"one\"\n",
"mat1"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"`mat1` went from being `A matrix: 5 × 5 of type int` to `A matrix: 5 × 5 of type chr`. That is, inserting a string in one location converted all the elements of the matrix to the `chr` (string) data type."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Thus R's matrix and array are similar to Python's `numpy` array and matrix data structures."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Data frames\n",
"\n",
"This is a very important data structure in R. Unlike matrices and vectors, R data frames can store data in which each column contains a different data type (e.g., numbers, strings, boolean) or even a combination of data types, just like a standard spreadsheet. Indeed, the dataframe data type was built to emulate some of the convenient properties of spreadsheets. Many statistical and plotting functions and packages in R naturally use data frames. \n",
"\n",
"Let's build and manipulate a dataframe. First create three vectors:"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"```{note}\n",
"Under the hood, an R data frame is in fact a `list` of equal-length `vector`s. \n",
"```"
]
},
{
"cell_type": "code",
"execution_count": 208,
"metadata": {},
"outputs": [
{
"data": {
"text/latex": [
"\\begin{enumerate*}\n",
"\\item 1\n",
"\\item 2\n",
"\\item 3\n",
"\\item 4\n",
"\\item 5\n",
"\\item 6\n",
"\\item 7\n",
"\\item 8\n",
"\\item 9\n",
"\\item 10\n",
"\\end{enumerate*}\n"
],
"text/markdown": [
"1. 1\n",
"2. 2\n",
"3. 3\n",
"4. 4\n",
"5. 5\n",
"6. 6\n",
"7. 7\n",
"8. 8\n",
"9. 9\n",
"10. 10\n",
"\n",
"\n"
],
"text/plain": [
" [1] 1 2 3 4 5 6 7 8 9 10"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"Col1 <- 1:10\n",
"Col1"
]
},
{
"cell_type": "code",
"execution_count": 209,
"metadata": {},
"outputs": [
{
"data": {
"text/latex": [
"\\begin{enumerate*}\n",
"\\item 'A'\n",
"\\item 'B'\n",
"\\item 'C'\n",
"\\item 'D'\n",
"\\item 'E'\n",
"\\item 'F'\n",
"\\item 'G'\n",
"\\item 'H'\n",
"\\item 'I'\n",
"\\item 'J'\n",
"\\end{enumerate*}\n"
],
"text/markdown": [
"1. 'A'\n",
"2. 'B'\n",
"3. 'C'\n",
"4. 'D'\n",
"5. 'E'\n",
"6. 'F'\n",
"7. 'G'\n",
"8. 'H'\n",
"9. 'I'\n",
"10. 'J'\n",
"\n",
"\n"
],
"text/plain": [
" [1] \"A\" \"B\" \"C\" \"D\" \"E\" \"F\" \"G\" \"H\" \"I\" \"J\""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"Col2 <- LETTERS[1:10]\n",
"Col2"
]
},
{
"cell_type": "code",
"execution_count": 210,
"metadata": {},
"outputs": [
{
"data": {
"text/latex": [
"\\begin{enumerate*}\n",
"\\item 0.938413555268198\n",
"\\item 0.426268232986331\n",
"\\item 0.352635570103303\n",
"\\item 0.420346662169322\n",
"\\item 0.309041227446869\n",
"\\item 0.0709800566546619\n",
"\\item 0.294151729205623\n",
"\\item 0.802815589588135\n",
"\\item 0.706063920864835\n",
"\\item 0.974233828950673\n",
"\\end{enumerate*}\n"
],
"text/markdown": [
"1. 0.938413555268198\n",
"2. 0.426268232986331\n",
"3. 0.352635570103303\n",
"4. 0.420346662169322\n",
"5. 0.309041227446869\n",
"6. 0.0709800566546619\n",
"7. 0.294151729205623\n",
"8. 0.802815589588135\n",
"9. 0.706063920864835\n",
"10. 0.974233828950673\n",
"\n",
"\n"
],
"text/plain": [
" [1] 0.93841356 0.42626823 0.35263557 0.42034666 0.30904123 0.07098006\n",
" [7] 0.29415173 0.80281559 0.70606392 0.97423383"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"Col3 <- runif(10) # 10 random numbers from a uniform distribution\n",
"Col3"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now combine them into a dataframe:"
]
},
{
"cell_type": "code",
"execution_count": 211,
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
"text/latex": [
"A data.frame: 10 × 3\n",
"\\begin{tabular}{lll}\n",
" Col1 & Col2 & Col3\\\\\n",
" & & \\\\\n",
"\\hline\n",
"\t 1 & A & 0.93841356\\\\\n",
"\t 2 & B & 0.42626823\\\\\n",
"\t 3 & C & 0.35263557\\\\\n",
"\t 4 & D & 0.42034666\\\\\n",
"\t 5 & E & 0.30904123\\\\\n",
"\t 6 & F & 0.07098006\\\\\n",
"\t 7 & G & 0.29415173\\\\\n",
"\t 8 & H & 0.80281559\\\\\n",
"\t 9 & I & 0.70606392\\\\\n",
"\t 10 & J & 0.97423383\\\\\n",
"\\end{tabular}\n"
],
"text/markdown": [
"\n",
"A data.frame: 10 × 3\n",
"\n",
"| Col1 <int> | Col2 <chr> | Col3 <dbl> |\n",
"|---|---|---|\n",
"| 1 | A | 0.93841356 |\n",
"| 2 | B | 0.42626823 |\n",
"| 3 | C | 0.35263557 |\n",
"| 4 | D | 0.42034666 |\n",
"| 5 | E | 0.30904123 |\n",
"| 6 | F | 0.07098006 |\n",
"| 7 | G | 0.29415173 |\n",
"| 8 | H | 0.80281559 |\n",
"| 9 | I | 0.70606392 |\n",
"| 10 | J | 0.97423383 |\n",
"\n"
],
"text/plain": [
" Col1 Col2 Col3 \n",
"1 1 A 0.93841356\n",
"2 2 B 0.42626823\n",
"3 3 C 0.35263557\n",
"4 4 D 0.42034666\n",
"5 5 E 0.30904123\n",
"6 6 F 0.07098006\n",
"7 7 G 0.29415173\n",
"8 8 H 0.80281559\n",
"9 9 I 0.70606392\n",
"10 10 J 0.97423383"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"MyDF <- data.frame(Col1, Col2, Col3)\n",
"MyDF"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Again, this output looks different from your terminal/console output because these commands are running in a Jupyter notebook.\n",
"\n",
"Your output will look like this:"
]
},
{
"cell_type": "code",
"execution_count": 212,
"metadata": {
"scrolled": true
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" Col1 Col2 Col3\n",
"1 1 A 0.93841356\n",
"2 2 B 0.42626823\n",
"3 3 C 0.35263557\n",
"4 4 D 0.42034666\n",
"5 5 E 0.30904123\n",
"6 6 F 0.07098006\n",
"7 7 G 0.29415173\n",
"8 8 H 0.80281559\n",
"9 9 I 0.70606392\n",
"10 10 J 0.97423383\n"
]
}
],
"source": [
"print(MyDF)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You can easily assign names to the columns of dataframes:"
]
},
{
"cell_type": "code",
"execution_count": 213,
"metadata": {},
"outputs": [
{
"data": {
"text/latex": [
"A data.frame: 10 × 3\n",
"\\begin{tabular}{lll}\n",
" MyFirstColumn & My Second Column & My.Third.Column\\\\\n",
" & & \\\\\n",
"\\hline\n",
"\t 1 & A & 0.93841356\\\\\n",
"\t 2 & B & 0.42626823\\\\\n",
"\t 3 & C & 0.35263557\\\\\n",
"\t 4 & D & 0.42034666\\\\\n",
"\t 5 & E & 0.30904123\\\\\n",
"\t 6 & F & 0.07098006\\\\\n",
"\t 7 & G & 0.29415173\\\\\n",
"\t 8 & H & 0.80281559\\\\\n",
"\t 9 & I & 0.70606392\\\\\n",
"\t 10 & J & 0.97423383\\\\\n",
"\\end{tabular}\n"
],
"text/markdown": [
"\n",
"A data.frame: 10 × 3\n",
"\n",
"| MyFirstColumn <int> | My Second Column <chr> | My.Third.Column <dbl> |\n",
"|---|---|---|\n",
"| 1 | A | 0.93841356 |\n",
"| 2 | B | 0.42626823 |\n",
"| 3 | C | 0.35263557 |\n",
"| 4 | D | 0.42034666 |\n",
"| 5 | E | 0.30904123 |\n",
"| 6 | F | 0.07098006 |\n",
"| 7 | G | 0.29415173 |\n",
"| 8 | H | 0.80281559 |\n",
"| 9 | I | 0.70606392 |\n",
"| 10 | J | 0.97423383 |\n",
"\n"
],
"text/plain": [
" MyFirstColumn My Second Column My.Third.Column\n",
"1 1 A 0.93841356 \n",
"2 2 B 0.42626823 \n",
"3 3 C 0.35263557 \n",
"4 4 D 0.42034666 \n",
"5 5 E 0.30904123 \n",
"6 6 F 0.07098006 \n",
"7 7 G 0.29415173 \n",
"8 8 H 0.80281559 \n",
"9 9 I 0.70606392 \n",
"10 10 J 0.97423383 "
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"names(MyDF) <- c(\"MyFirstColumn\", \"My Second Column\", \"My.Third.Column\")\n",
"MyDF"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"And unlike matrices, you can access the contents of data frames by naming the columns directly using the `$` operator:"
]
},
{
"cell_type": "code",
"execution_count": 214,
"metadata": {},
"outputs": [
{
"data": {
"text/latex": [
"\\begin{enumerate*}\n",
"\\item 1\n",
"\\item 2\n",
"\\item 3\n",
"\\item 4\n",
"\\item 5\n",
"\\item 6\n",
"\\item 7\n",
"\\item 8\n",
"\\item 9\n",
"\\item 10\n",
"\\end{enumerate*}\n"
],
"text/markdown": [
"1. 1\n",
"2. 2\n",
"3. 3\n",
"4. 4\n",
"5. 5\n",
"6. 6\n",
"7. 7\n",
"8. 8\n",
"9. 9\n",
"10. 10\n",
"\n",
"\n"
],
"text/plain": [
" [1] 1 2 3 4 5 6 7 8 9 10"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"MyDF$MyFirstColumn"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"But don't use spaces in column names! Try this: "
]
},
{
"cell_type": "code",
"execution_count": 215,
"metadata": {
"scrolled": true
},
"outputs": [
{
"ename": "ERROR",
"evalue": "Error in parse(text = x, srcfile = src): :1:9: unexpected symbol\n1: MyDF$My Second\n ^\n",
"output_type": "error",
"traceback": [
"Error in parse(text = x, srcfile = src): :1:9: unexpected symbol\n1: MyDF$My Second\n ^\nTraceback:\n"
]
}
],
"source": [
"MyDF$My Second Column"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"That gives an error, because a name containing spaces is not valid R syntax after `$` (unless the name is wrapped in backticks). \n",
"\n",
"So replace that column name using the `colnames` function:"
]
},
{
"cell_type": "code",
"execution_count": 216,
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
"text/latex": [
"\\begin{enumerate*}\n",
"\\item 'MyFirstColumn'\n",
"\\item 'My Second Column'\n",
"\\item 'My.Third.Column'\n",
"\\end{enumerate*}\n"
],
"text/markdown": [
"1. 'MyFirstColumn'\n",
"2. 'My Second Column'\n",
"3. 'My.Third.Column'\n",
"\n",
"\n"
],
"text/plain": [
"[1] \"MyFirstColumn\" \"My Second Column\" \"My.Third.Column\" "
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"colnames(MyDF)"
]
},
{
"cell_type": "code",
"execution_count": 217,
"metadata": {},
"outputs": [
{
"data": {
"text/latex": [
"A data.frame: 10 × 3\n",
"\\begin{tabular}{lll}\n",
" MyFirstColumn & MySecondColumn & My.Third.Column\\\\\n",
" & & \\\\\n",
"\\hline\n",
"\t 1 & A & 0.93841356\\\\\n",
"\t 2 & B & 0.42626823\\\\\n",
"\t 3 & C & 0.35263557\\\\\n",
"\t 4 & D & 0.42034666\\\\\n",
"\t 5 & E & 0.30904123\\\\\n",
"\t 6 & F & 0.07098006\\\\\n",
"\t 7 & G & 0.29415173\\\\\n",
"\t 8 & H & 0.80281559\\\\\n",
"\t 9 & I & 0.70606392\\\\\n",
"\t 10 & J & 0.97423383\\\\\n",
"\\end{tabular}\n"
],
"text/markdown": [
"\n",
"A data.frame: 10 × 3\n",
"\n",
"| MyFirstColumn <int> | MySecondColumn <chr> | My.Third.Column <dbl> |\n",
"|---|---|---|\n",
"| 1 | A | 0.93841356 |\n",
"| 2 | B | 0.42626823 |\n",
"| 3 | C | 0.35263557 |\n",
"| 4 | D | 0.42034666 |\n",
"| 5 | E | 0.30904123 |\n",
"| 6 | F | 0.07098006 |\n",
"| 7 | G | 0.29415173 |\n",
"| 8 | H | 0.80281559 |\n",
"| 9 | I | 0.70606392 |\n",
"| 10 | J | 0.97423383 |\n",
"\n"
],
"text/plain": [
" MyFirstColumn MySecondColumn My.Third.Column\n",
"1 1 A 0.93841356 \n",
"2 2 B 0.42626823 \n",
"3 3 C 0.35263557 \n",
"4 4 D 0.42034666 \n",
"5 5 E 0.30904123 \n",
"6 6 F 0.07098006 \n",
"7 7 G 0.29415173 \n",
"8 8 H 0.80281559 \n",
"9 9 I 0.70606392 \n",
"10 10 J 0.97423383 "
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"colnames(MyDF)[2] <- \"MySecondColumn\"\n",
"MyDF"
]
},
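{
"cell_type": "markdown",
"metadata": {},
"source": [
"If renaming is not an option (say, the names come from someone else's file), a column whose name contains spaces can still be reached by quoting the name. A minimal sketch with a hypothetical data frame `df`:\n",
"\n",
"```r\n",
"# check.names = FALSE stops data.frame() from replacing the spaces with dots\n",
"df <- data.frame(`My Second Column` = c(\"A\", \"B\", \"C\"), check.names = FALSE)\n",
"\n",
"names(df)                 # \"My Second Column\"\n",
"df$`My Second Column`     # backticks make the name usable after $\n",
"df[[\"My Second Column\"]]  # a quoted string inside [[ ]] also works\n",
"```\n",
"\n",
"Avoiding spaces in the first place remains the simpler habit."
]
},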
{
"cell_type": "markdown",
"metadata": {},
"source": [
"But using dots in column names is OK (just as it is for variable names): "
]
},
{
"cell_type": "code",
"execution_count": 218,
"metadata": {},
"outputs": [
{
"data": {
"text/latex": [
"\\begin{enumerate*}\n",
"\\item 0.938413555268198\n",
"\\item 0.426268232986331\n",
"\\item 0.352635570103303\n",
"\\item 0.420346662169322\n",
"\\item 0.309041227446869\n",
"\\item 0.0709800566546619\n",
"\\item 0.294151729205623\n",
"\\item 0.802815589588135\n",
"\\item 0.706063920864835\n",
"\\item 0.974233828950673\n",
"\\end{enumerate*}\n"
],
"text/markdown": [
"1. 0.938413555268198\n",
"2. 0.426268232986331\n",
"3. 0.352635570103303\n",
"4. 0.420346662169322\n",
"5. 0.309041227446869\n",
"6. 0.0709800566546619\n",
"7. 0.294151729205623\n",
"8. 0.802815589588135\n",
"9. 0.706063920864835\n",
"10. 0.974233828950673\n",
"\n",
"\n"
],
"text/plain": [
" [1] 0.93841356 0.42626823 0.35263557 0.42034666 0.30904123 0.07098006\n",
" [7] 0.29415173 0.80281559 0.70606392 0.97423383"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"MyDF$My.Third.Column"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You can also access elements by using numerical indexing: "
]
},
{
"cell_type": "code",
"execution_count": 219,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"- 1
- 2
- 3
- 4
- 5
- 6
- 7
- 8
- 9
- 10
\n"
],
"text/latex": [
"\\begin{enumerate*}\n",
"\\item 1\n",
"\\item 2\n",
"\\item 3\n",
"\\item 4\n",
"\\item 5\n",
"\\item 6\n",
"\\item 7\n",
"\\item 8\n",
"\\item 9\n",
"\\item 10\n",
"\\end{enumerate*}\n"
],
"text/markdown": [
"1. 1\n",
"2. 2\n",
"3. 3\n",
"4. 4\n",
"5. 5\n",
"6. 6\n",
"7. 7\n",
"8. 8\n",
"9. 9\n",
"10. 10\n",
"\n",
"\n"
],
"text/plain": [
" [1] 1 2 3 4 5 6 7 8 9 10"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"MyDF[,1]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"That is, you asked R to return the values of `MyDF` in all rows (therefore, nothing before the comma) and the first column (`1` after the comma)."
]
},
{
"cell_type": "code",
"execution_count": 220,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"1"
],
"text/latex": [
"1"
],
"text/markdown": [
"1"
],
"text/plain": [
"[1] 1"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"MyDF[1,1]"
]
},
{
"cell_type": "code",
"execution_count": 221,
"metadata": {},
"outputs": [
{
"data": {
"text/latex": [
"A data.frame: 10 × 2\n",
"\\begin{tabular}{ll}\n",
" MyFirstColumn & My.Third.Column\\\\\n",
" & \\\\\n",
"\\hline\n",
"\t 1 & 0.93841356\\\\\n",
"\t 2 & 0.42626823\\\\\n",
"\t 3 & 0.35263557\\\\\n",
"\t 4 & 0.42034666\\\\\n",
"\t 5 & 0.30904123\\\\\n",
"\t 6 & 0.07098006\\\\\n",
"\t 7 & 0.29415173\\\\\n",
"\t 8 & 0.80281559\\\\\n",
"\t 9 & 0.70606392\\\\\n",
"\t 10 & 0.97423383\\\\\n",
"\\end{tabular}\n"
],
"text/markdown": [
"\n",
"A data.frame: 10 × 2\n",
"\n",
"| MyFirstColumn <int> | My.Third.Column <dbl> |\n",
"|---|---|\n",
"| 1 | 0.93841356 |\n",
"| 2 | 0.42626823 |\n",
"| 3 | 0.35263557 |\n",
"| 4 | 0.42034666 |\n",
"| 5 | 0.30904123 |\n",
"| 6 | 0.07098006 |\n",
"| 7 | 0.29415173 |\n",
"| 8 | 0.80281559 |\n",
"| 9 | 0.70606392 |\n",
"| 10 | 0.97423383 |\n",
"\n"
],
"text/plain": [
" MyFirstColumn My.Third.Column\n",
"1 1 0.93841356 \n",
"2 2 0.42626823 \n",
"3 3 0.35263557 \n",
"4 4 0.42034666 \n",
"5 5 0.30904123 \n",
"6 6 0.07098006 \n",
"7 7 0.29415173 \n",
"8 8 0.80281559 \n",
"9 9 0.70606392 \n",
"10 10 0.97423383 "
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"MyDF[c(\"MyFirstColumn\",\"My.Third.Column\")] # show two specific columns only"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You can check whether a particular object is a dataframe data structure with:"
]
},
{
"cell_type": "code",
"execution_count": 222,
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
"text/html": [
"'data.frame'"
],
"text/latex": [
"'data.frame'"
],
"text/markdown": [
"'data.frame'"
],
"text/plain": [
"[1] \"data.frame\""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"class(MyDF)"
]
},
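{
"cell_type": "markdown",
"metadata": {},
"source": [
"Because a data frame is, under the hood, a list of equal-length vectors, list-style tools also work on it. A minimal sketch with a hypothetical data frame `df` (the column classes shown assume R 4.0 or later, where strings are not converted to factors by default):\n",
"\n",
"```r\n",
"df <- data.frame(id = 1:3, label = c(\"A\", \"B\", \"C\"))\n",
"\n",
"is.data.frame(df)  # TRUE\n",
"is.list(df)        # TRUE: a data frame is also a list of columns\n",
"length(df)         # 2: the number of columns, not rows\n",
"sapply(df, class)  # the class of each column\n",
"df[[\"label\"]]      # list-style extraction returns the column as a vector\n",
"```"
]
},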
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You can check the structure of a dataframe with `str()`:"
]
},
{
"cell_type": "code",
"execution_count": 223,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"'data.frame':\t10 obs. of 3 variables:\n",
" $ MyFirstColumn : int 1 2 3 4 5 6 7 8 9 10\n",
" $ MySecondColumn : chr \"A\" \"B\" \"C\" \"D\" ...\n",
" $ My.Third.Column: num 0.938 0.426 0.353 0.42 0.309 ...\n"
]
}
],
"source": [
"str(MyDF)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You can print the column names and top few rows with `head()`:"
]
},
{
"cell_type": "code",
"execution_count": 224,
"metadata": {},
"outputs": [
{
"data": {
"text/latex": [
"A data.frame: 6 × 3\n",
"\\begin{tabular}{r|lll}\n",
" & MyFirstColumn & MySecondColumn & My.Third.Column\\\\\n",
" & & & \\\\\n",
"\\hline\n",
"\t1 & 1 & A & 0.93841356\\\\\n",
"\t2 & 2 & B & 0.42626823\\\\\n",
"\t3 & 3 & C & 0.35263557\\\\\n",
"\t4 & 4 & D & 0.42034666\\\\\n",
"\t5 & 5 & E & 0.30904123\\\\\n",
"\t6 & 6 & F & 0.07098006\\\\\n",
"\\end{tabular}\n"
],
"text/markdown": [
"\n",
"A data.frame: 6 × 3\n",
"\n",
"| | MyFirstColumn <int> | MySecondColumn <chr> | My.Third.Column <dbl> |\n",
"|---|---|---|---|\n",
"| 1 | 1 | A | 0.93841356 |\n",
"| 2 | 2 | B | 0.42626823 |\n",
"| 3 | 3 | C | 0.35263557 |\n",
"| 4 | 4 | D | 0.42034666 |\n",
"| 5 | 5 | E | 0.30904123 |\n",
"| 6 | 6 | F | 0.07098006 |\n",
"\n"
],
"text/plain": [
" MyFirstColumn MySecondColumn My.Third.Column\n",
"1 1 A 0.93841356 \n",
"2 2 B 0.42626823 \n",
"3 3 C 0.35263557 \n",
"4 4 D 0.42034666 \n",
"5 5 E 0.30904123 \n",
"6 6 F 0.07098006 "
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"head(MyDF)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"And the bottom few rows with `tail()`:"
]
},
{
"cell_type": "code",
"execution_count": 225,
"metadata": {},
"outputs": [
{
"data": {
"text/latex": [
"A data.frame: 6 × 3\n",
"\\begin{tabular}{r|lll}\n",
" & MyFirstColumn & MySecondColumn & My.Third.Column\\\\\n",
" & & & \\\\\n",
"\\hline\n",
"\t5 & 5 & E & 0.30904123\\\\\n",
"\t6 & 6 & F & 0.07098006\\\\\n",
"\t7 & 7 & G & 0.29415173\\\\\n",
"\t8 & 8 & H & 0.80281559\\\\\n",
"\t9 & 9 & I & 0.70606392\\\\\n",
"\t10 & 10 & J & 0.97423383\\\\\n",
"\\end{tabular}\n"
],
"text/markdown": [
"\n",
"A data.frame: 6 × 3\n",
"\n",
"| | MyFirstColumn <int> | MySecondColumn <chr> | My.Third.Column <dbl> |\n",
"|---|---|---|---|\n",
"| 5 | 5 | E | 0.30904123 |\n",
"| 6 | 6 | F | 0.07098006 |\n",
"| 7 | 7 | G | 0.29415173 |\n",
"| 8 | 8 | H | 0.80281559 |\n",
"| 9 | 9 | I | 0.70606392 |\n",
"| 10 | 10 | J | 0.97423383 |\n",
"\n"
],
"text/plain": [
" MyFirstColumn MySecondColumn My.Third.Column\n",
"5 5 E 0.30904123 \n",
"6 6 F 0.07098006 \n",
"7 7 G 0.29415173 \n",
"8 8 H 0.80281559 \n",
"9 9 I 0.70606392 \n",
"10 10 J 0.97423383 "
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"tail(MyDF)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"```{note}\n",
"**The Factor data type**: R has a special data type called \"factor\". Different values in this data type are called \"levels\". This data type is used to identify \"grouping variables\" such as columns in a dataframe that contain experimental treatments. This is convenient for statistical analyses using one of the many plotting and statistical commands or routines available in R, which have been written to interpret the `factor` data type as such and use it to automatically compare subgroups in the data. More on this later, when we delve into analyses (including visualization) in R. \n",
"```"
]
},
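{
"cell_type": "markdown",
"metadata": {},
"source": [
"A minimal sketch of a factor, using made-up treatment labels stored under the hypothetical name `treatment`:\n",
"\n",
"```r\n",
"treatment <- factor(c(\"control\", \"warm\", \"control\", \"warm\", \"warm\", \"control\"))\n",
"\n",
"levels(treatment)      # the distinct groups: \"control\" \"warm\"\n",
"nlevels(treatment)     # 2\n",
"table(treatment)       # number of samples in each level\n",
"as.integer(treatment)  # the underlying integer codes: 1 2 1 2 2 1\n",
"```\n",
"\n",
"By default the levels are sorted alphabetically, which is why `control` gets code 1 here."
]
},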
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Lists\n",
"\n",
"A list is used to collect a group of data objects of different sizes and types (e.g., one whole data frame and one vector can both be in a single list). It is simply an ordered collection of objects (that can be variables). \n",
"\n",
"```{note}\n",
"The outputs of many statistical functions in R are lists (e.g. linear model fitting using `lm()`), to return all relevant information in one output object. So you need to know how to unpack and manipulate lists. \n",
"```\n",
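"\n",
"For instance, here is a minimal sketch using `lm()` on R's built-in `cars` dataset (chosen purely for illustration; `fit` is a hypothetical name):\n",
"\n",
"```r\n",
"fit <- lm(dist ~ speed, data = cars)  # fit a simple linear model\n",
"\n",
"is.list(fit)      # TRUE: the fitted model object is a list\n",
"names(fit)        # the named items it contains\n",
"fit$coefficients  # pull out one item by name\n",
"```\n",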
"\n",
"As a budding multilingual quantitative biologist, you should not be perturbed by the fact that a `list` is a very different data structure in Python vs R. It will take some practice; sometimes the same word means different things in different human languages too!\n",
"\n",
"Try this:"
]
},
{
"cell_type": "code",
"execution_count": 226,
"metadata": {},
"outputs": [
{
"data": {
"text/latex": [
"\\begin{description}\n",
"\\item[\\$species] \\begin{enumerate*}\n",
"\\item 'Quercus robur'\n",
"\\item 'Fraxinus excelsior'\n",
"\\end{enumerate*}\n",
"\n",
"\\item[\\$age] \\begin{enumerate*}\n",
"\\item 123\n",
"\\item 84\n",
"\\end{enumerate*}\n",
"\n",
"\\end{description}\n"
],
"text/markdown": [
"$species\n",
": 1. 'Quercus robur'\n",
"2. 'Fraxinus excelsior'\n",
"\n",
"\n",
"\n",
"$age\n",
": 1. 123\n",
"2. 84\n",
"\n",
"\n",
"\n",
"\n",
"\n"
],
"text/plain": [
"$species\n",
"[1] \"Quercus robur\" \"Fraxinus excelsior\"\n",
"\n",
"$age\n",
"[1] 123 84\n"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"MyList <- list(species=c(\"Quercus robur\",\"Fraxinus excelsior\"), age=c(123, 84))\n",
"MyList"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You can access the contents of a list item by its position rather than its name, using double square brackets:"
]
},
{
"cell_type": "code",
"execution_count": 227,
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
"text/html": [
"\n",
"- 'Quercus robur'
- 'Fraxinus excelsior'
\n"
],
"text/latex": [
"\\begin{enumerate*}\n",
"\\item 'Quercus robur'\n",
"\\item 'Fraxinus excelsior'\n",
"\\end{enumerate*}\n"
],
"text/markdown": [
"1. 'Quercus robur'\n",
"2. 'Fraxinus excelsior'\n",
"\n",
"\n"
],
"text/plain": [
"[1] \"Quercus robur\" \"Fraxinus excelsior\""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"MyList[[1]]"
]
},
{
"cell_type": "code",
"execution_count": 228,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"'Quercus robur'"
],
"text/latex": [
"'Quercus robur'"
],
"text/markdown": [
"'Quercus robur'"
],
"text/plain": [
"[1] \"Quercus robur\""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"MyList[[1]][1]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"or using the name in either this way:"
]
},
{
"cell_type": "code",
"execution_count": 229,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"- 'Quercus robur'
- 'Fraxinus excelsior'
\n"
],
"text/latex": [
"\\begin{enumerate*}\n",
"\\item 'Quercus robur'\n",
"\\item 'Fraxinus excelsior'\n",
"\\end{enumerate*}\n"
],
"text/markdown": [
"1. 'Quercus robur'\n",
"2. 'Fraxinus excelsior'\n",
"\n",
"\n"
],
"text/plain": [
"[1] \"Quercus robur\" \"Fraxinus excelsior\""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"MyList$species"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"or this way: "
]
},
{
"cell_type": "code",
"execution_count": 230,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"- 'Quercus robur'
- 'Fraxinus excelsior'
\n"
],
"text/latex": [
"\\begin{enumerate*}\n",
"\\item 'Quercus robur'\n",
"\\item 'Fraxinus excelsior'\n",
"\\end{enumerate*}\n"
],
"text/markdown": [
"1. 'Quercus robur'\n",
"2. 'Fraxinus excelsior'\n",
"\n",
"\n"
],
"text/plain": [
"[1] \"Quercus robur\" \"Fraxinus excelsior\""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"MyList[[\"species\"]]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"And to access a specific element inside a list's item "
]
},
{
"cell_type": "code",
"execution_count": 231,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"'Quercus robur'"
],
"text/latex": [
"'Quercus robur'"
],
"text/markdown": [
"'Quercus robur'"
],
"text/plain": [
"[1] \"Quercus robur\""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"MyList$species[1]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"A more complex list:"
]
},
{
"cell_type": "code",
"execution_count": 232,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\t- $species
\n",
"\t\t- 'Cancer magister'
\n",
"\t- $latitude
\n",
"\t\t- 48.3
\n",
"\t- $longitude
\n",
"\t\t- -123.1
\n",
"\t- $startyr
\n",
"\t\t- 1980
\n",
"\t- $endyr
\n",
"\t\t- 1985
\n",
"\t- $pop
\n",
"\t\t- \n",
"
- 303
- 402
- 101
- 607
- 802
- 35
\n",
" \n",
"
\n"
],
"text/latex": [
"\\begin{description}\n",
"\\item[\\$species] 'Cancer magister'\n",
"\\item[\\$latitude] 48.3\n",
"\\item[\\$longitude] -123.1\n",
"\\item[\\$startyr] 1980\n",
"\\item[\\$endyr] 1985\n",
"\\item[\\$pop] \\begin{enumerate*}\n",
"\\item 303\n",
"\\item 402\n",
"\\item 101\n",
"\\item 607\n",
"\\item 802\n",
"\\item 35\n",
"\\end{enumerate*}\n",
"\n",
"\\end{description}\n"
],
"text/markdown": [
"$species\n",
": 'Cancer magister'\n",
"$latitude\n",
": 48.3\n",
"$longitude\n",
": -123.1\n",
"$startyr\n",
": 1980\n",
"$endyr\n",
": 1985\n",
"$pop\n",
": 1. 303\n",
"2. 402\n",
"3. 101\n",
"4. 607\n",
"5. 802\n",
"6. 35\n",
"\n",
"\n",
"\n",
"\n",
"\n"
],
"text/plain": [
"$species\n",
"[1] \"Cancer magister\"\n",
"\n",
"$latitude\n",
"[1] 48.3\n",
"\n",
"$longitude\n",
"[1] -123.1\n",
"\n",
"$startyr\n",
"[1] 1980\n",
"\n",
"$endyr\n",
"[1] 1985\n",
"\n",
"$pop\n",
"[1] 303 402 101 607 802 35\n"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"pop1<-list(species='Cancer magister',\n",
" latitude=48.3,longitude=-123.1,\n",
" startyr=1980,endyr=1985,\n",
" pop=c(303,402,101,607,802,35))\n",
"pop1"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You can build lists of lists too:"
]
},
{
"cell_type": "code",
"execution_count": 233,
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\t- $sp1
\n",
"\t\t\n",
"\t- $lat
\n",
"\t\t- 19
\n",
"\t- $long
\n",
"\t\t- 57
\n",
"\t- $pop
\n",
"\t\t- \n",
"
- 100
- 101
- 99
\n",
" \n",
"
\n",
" \n",
"\t- $sp2
\n",
"\t\t\n",
"\t- $lat
\n",
"\t\t- 56
\n",
"\t- $long
\n",
"\t\t- -120
\n",
"\t- $pop
\n",
"\t\t- \n",
"
- 1
- 4
- 7
- 7
- 2
- 1
- 2
\n",
" \n",
"
\n",
" \n",
"\t- $sp3
\n",
"\t\t\n",
"\t- $lat
\n",
"\t\t- 32
\n",
"\t- $long
\n",
"\t\t- -10
\n",
"\t- $pop
\n",
"\t\t- \n",
"
- 12
- 11
- 2
- 1
- 14
\n",
" \n",
"
\n",
" \n",
"
\n"
],
"text/latex": [
"\\begin{description}\n",
"\\item[\\$sp1] \\begin{description}\n",
"\\item[\\$lat] 19\n",
"\\item[\\$long] 57\n",
"\\item[\\$pop] \\begin{enumerate*}\n",
"\\item 100\n",
"\\item 101\n",
"\\item 99\n",
"\\end{enumerate*}\n",
"\n",
"\\end{description}\n",
"\n",
"\\item[\\$sp2] \\begin{description}\n",
"\\item[\\$lat] 56\n",
"\\item[\\$long] -120\n",
"\\item[\\$pop] \\begin{enumerate*}\n",
"\\item 1\n",
"\\item 4\n",
"\\item 7\n",
"\\item 7\n",
"\\item 2\n",
"\\item 1\n",
"\\item 2\n",
"\\end{enumerate*}\n",
"\n",
"\\end{description}\n",
"\n",
"\\item[\\$sp3] \\begin{description}\n",
"\\item[\\$lat] 32\n",
"\\item[\\$long] -10\n",
"\\item[\\$pop] \\begin{enumerate*}\n",
"\\item 12\n",
"\\item 11\n",
"\\item 2\n",
"\\item 1\n",
"\\item 14\n",
"\\end{enumerate*}\n",
"\n",
"\\end{description}\n",
"\n",
"\\end{description}\n"
],
"text/markdown": [
"$sp1\n",
": $lat\n",
": 19\n",
"$long\n",
": 57\n",
"$pop\n",
": 1. 100\n",
"2. 101\n",
"3. 99\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"$sp2\n",
": $lat\n",
": 56\n",
"$long\n",
": -120\n",
"$pop\n",
": 1. 1\n",
"2. 4\n",
"3. 7\n",
"4. 7\n",
"5. 2\n",
"6. 1\n",
"7. 2\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"$sp3\n",
": $lat\n",
": 32\n",
"$long\n",
": -10\n",
"$pop\n",
": 1. 12\n",
"2. 11\n",
"3. 2\n",
"4. 1\n",
"5. 14\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n"
],
"text/plain": [
"$sp1\n",
"$sp1$lat\n",
"[1] 19\n",
"\n",
"$sp1$long\n",
"[1] 57\n",
"\n",
"$sp1$pop\n",
"[1] 100 101 99\n",
"\n",
"\n",
"$sp2\n",
"$sp2$lat\n",
"[1] 56\n",
"\n",
"$sp2$long\n",
"[1] -120\n",
"\n",
"$sp2$pop\n",
"[1] 1 4 7 7 2 1 2\n",
"\n",
"\n",
"$sp3\n",
"$sp3$lat\n",
"[1] 32\n",
"\n",
"$sp3$long\n",
"[1] -10\n",
"\n",
"$sp3$pop\n",
"[1] 12 11 2 1 14\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"pop1<-list(lat=19,long=57,\n",
" pop=c(100,101,99))\n",
"pop2<-list(lat=56,long=-120,\n",
" pop=c(1,4,7,7,2,1,2))\n",
"pop3<-list(lat=32,long=-10,\n",
" pop=c(12,11,2,1,14))\n",
"pops<-list(sp1=pop1,sp2=pop2,sp3=pop3)\n",
"pops"
]
},
{
"cell_type": "code",
"execution_count": 234,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\t- $lat
\n",
"\t\t- 19
\n",
"\t- $long
\n",
"\t\t- 57
\n",
"\t- $pop
\n",
"\t\t- \n",
"
- 100
- 101
- 99
\n",
" \n",
"
\n"
],
"text/latex": [
"\\begin{description}\n",
"\\item[\\$lat] 19\n",
"\\item[\\$long] 57\n",
"\\item[\\$pop] \\begin{enumerate*}\n",
"\\item 100\n",
"\\item 101\n",
"\\item 99\n",
"\\end{enumerate*}\n",
"\n",
"\\end{description}\n"
],
"text/markdown": [
"$lat\n",
": 19\n",
"$long\n",
": 57\n",
"$pop\n",
": 1. 100\n",
"2. 101\n",
"3. 99\n",
"\n",
"\n",
"\n",
"\n",
"\n"
],
"text/plain": [
"$lat\n",
"[1] 19\n",
"\n",
"$long\n",
"[1] 57\n",
"\n",
"$pop\n",
"[1] 100 101 99\n"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"pops$sp1 # check out species 1"
]
},
{
"cell_type": "code",
"execution_count": 235,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"$pop = \n",
"- 100
- 101
- 99
\n"
],
"text/latex": [
"\\textbf{\\$pop} = \\begin{enumerate*}\n",
"\\item 100\n",
"\\item 101\n",
"\\item 99\n",
"\\end{enumerate*}\n"
],
"text/markdown": [
"**$pop** = 1. 100\n",
"2. 101\n",
"3. 99\n",
"\n",
"\n"
],
"text/plain": [
"$pop\n",
"[1] 100 101 99\n"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"pops$sp1[\"pop\"] # sp1's population sizes"
]
},
{
"cell_type": "code",
"execution_count": 236,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"56"
],
"text/latex": [
"56"
],
"text/markdown": [
"56"
],
"text/plain": [
"[1] 56"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"pops[[2]]$lat #latitude of second species"
]
},
{
"cell_type": "code",
"execution_count": 237,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\t- $sp1
\n",
"\t\t\n",
"\t- $lat
\n",
"\t\t- 19
\n",
"\t- $long
\n",
"\t\t- 57
\n",
"\t- $pop
\n",
"\t\t- \n",
"
- 100
- 101
- 99
\n",
" \n",
"
\n",
" \n",
"\t- $sp2
\n",
"\t\t\n",
"\t- $lat
\n",
"\t\t- 56
\n",
"\t- $long
\n",
"\t\t- -120
\n",
"\t- $pop
\n",
"\t\t- \n",
"
- 1
- 4
- 7
- 7
- 2
- 1
- 2
\n",
" \n",
"
\n",
" \n",
"\t- $sp3
\n",
"\t\t\n",
"\t- $lat
\n",
"\t\t- 32
\n",
"\t- $long
\n",
"\t\t- -10
\n",
"\t- $pop
\n",
"\t\t- \n",
"
- 12
- 11
- 102
- 1
- 14
\n",
" \n",
"
\n",
" \n",
"
\n"
],
"text/latex": [
"\\begin{description}\n",
"\\item[\\$sp1] \\begin{description}\n",
"\\item[\\$lat] 19\n",
"\\item[\\$long] 57\n",
"\\item[\\$pop] \\begin{enumerate*}\n",
"\\item 100\n",
"\\item 101\n",
"\\item 99\n",
"\\end{enumerate*}\n",
"\n",
"\\end{description}\n",
"\n",
"\\item[\\$sp2] \\begin{description}\n",
"\\item[\\$lat] 56\n",
"\\item[\\$long] -120\n",
"\\item[\\$pop] \\begin{enumerate*}\n",
"\\item 1\n",
"\\item 4\n",
"\\item 7\n",
"\\item 7\n",
"\\item 2\n",
"\\item 1\n",
"\\item 2\n",
"\\end{enumerate*}\n",
"\n",
"\\end{description}\n",
"\n",
"\\item[\\$sp3] \\begin{description}\n",
"\\item[\\$lat] 32\n",
"\\item[\\$long] -10\n",
"\\item[\\$pop] \\begin{enumerate*}\n",
"\\item 12\n",
"\\item 11\n",
"\\item 102\n",
"\\item 1\n",
"\\item 14\n",
"\\end{enumerate*}\n",
"\n",
"\\end{description}\n",
"\n",
"\\end{description}\n"
],
"text/markdown": [
"$sp1\n",
": $lat\n",
": 19\n",
"$long\n",
": 57\n",
"$pop\n",
": 1. 100\n",
"2. 101\n",
"3. 99\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"$sp2\n",
": $lat\n",
": 56\n",
"$long\n",
": -120\n",
"$pop\n",
": 1. 1\n",
"2. 4\n",
"3. 7\n",
"4. 7\n",
"5. 2\n",
"6. 1\n",
"7. 2\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"$sp3\n",
": $lat\n",
": 32\n",
"$long\n",
": -10\n",
"$pop\n",
": 1. 12\n",
"2. 11\n",
"3. 102\n",
"4. 1\n",
"5. 14\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n"
],
"text/plain": [
"$sp1\n",
"$sp1$lat\n",
"[1] 19\n",
"\n",
"$sp1$long\n",
"[1] 57\n",
"\n",
"$sp1$pop\n",
"[1] 100 101 99\n",
"\n",
"\n",
"$sp2\n",
"$sp2$lat\n",
"[1] 56\n",
"\n",
"$sp2$long\n",
"[1] -120\n",
"\n",
"$sp2$pop\n",
"[1] 1 4 7 7 2 1 2\n",
"\n",
"\n",
"$sp3\n",
"$sp3$lat\n",
"[1] 32\n",
"\n",
"$sp3$long\n",
"[1] -10\n",
"\n",
"$sp3$pop\n",
"[1] 12 11 102 1 14\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"pops[[3]]$pop[3]<-102 #change population of third species at third time step\n",
"pops"
]
},
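{
"cell_type": "markdown",
"metadata": {},
"source": [
"The earlier note mentioned that functions such as `lm()` return lists. The sketch below uses a small made-up data set (the `Toy` values and the names `MyFit` and `MyPops` are invented purely for illustration) to show that the same list tools apply to a fitted model object. It also contrasts single and double square brackets: `[` returns a (shorter) list, while `[[` and `$` return the item itself.\n",
"\n",
"```r\n",
"# Hypothetical toy data, invented for illustration only\n",
"Toy <- data.frame(x = c(1, 2, 3, 4, 5), y = c(2.1, 3.9, 6.2, 7.8, 10.1))\n",
"MyFit <- lm(y ~ x, data = Toy) # fit a straight line\n",
"\n",
"is.list(MyFit)        # TRUE: the fitted model is stored as a list\n",
"names(MyFit)[1:3]     # names of the first few items in that list\n",
"MyFit$coefficients    # extract one item by name (intercept and slope)\n",
"\n",
"MyPops <- list(sp1 = list(lat = 19, long = 57, pop = c(100, 101, 99)))\n",
"class(MyPops$sp1['pop'])    # single brackets: still a list\n",
"class(MyPops$sp1[['pop']])  # double brackets: the numeric vector itself\n",
"```\n",
"\n",
"In practice, `str()` is a handy way to get a compact overview of what a list such as `MyFit` contains."
]
},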
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Maybe you have guessed by now that R dataframes are actually a kind of list.\n",
"\n",
"### Matrix vs Dataframe\n",
"\n",
"If dataframes are so nice, why use R matrices at all? The problem is that dataframes can be too slow when large numbers of mathematical calculations or operations (e.g., matrix - vector multiplications or other linear algebra operations) need to be performed. In such cases, you will need to convert a dataframe to a matrix. But for statistical analyses, plotting, and writing output of standard R analyses to a file, data frames are more convenient. Dataframes also allow you to refer to columns by name (using `$`), which is often convenient.\n",
"\n",
"To see the difference in memory usage of matrices vs dataframes, try this:"
]
},
{
"cell_type": "code",
"execution_count": 238,
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
"text/html": [
"\n",
"A matrix: 4 × 4 of type int\n",
"\n",
"\t1 | 5 | 1 | 5 |
\n",
"\t2 | 6 | 2 | 6 |
\n",
"\t3 | 7 | 3 | 7 |
\n",
"\t4 | 8 | 4 | 8 |
\n",
"\n",
"
\n"
],
"text/latex": [
"A matrix: 4 × 4 of type int\n",
"\\begin{tabular}{llll}\n",
"\t 1 & 5 & 1 & 5\\\\\n",
"\t 2 & 6 & 2 & 6\\\\\n",
"\t 3 & 7 & 3 & 7\\\\\n",
"\t 4 & 8 & 4 & 8\\\\\n",
"\\end{tabular}\n"
],
"text/markdown": [
"\n",
"A matrix: 4 × 4 of type int\n",
"\n",
"| 1 | 5 | 1 | 5 |\n",
"| 2 | 6 | 2 | 6 |\n",
"| 3 | 7 | 3 | 7 |\n",
"| 4 | 8 | 4 | 8 |\n",
"\n"
],
"text/plain": [
" [,1] [,2] [,3] [,4]\n",
"[1,] 1 5 1 5 \n",
"[2,] 2 6 2 6 \n",
"[3,] 3 7 3 7 \n",
"[4,] 4 8 4 8 "
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"MyMat = matrix(1:8, 4, 4)\n",
"MyMat"
]
},
{
"cell_type": "code",
"execution_count": 239,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"A data.frame: 4 × 4\n",
"\n",
"\tV1 | V2 | V3 | V4 |
\n",
"\t<int> | <int> | <int> | <int> |
\n",
"\n",
"\n",
"\t1 | 5 | 1 | 5 |
\n",
"\t2 | 6 | 2 | 6 |
\n",
"\t3 | 7 | 3 | 7 |
\n",
"\t4 | 8 | 4 | 8 |
\n",
"\n",
"
\n"
],
"text/latex": [
"A data.frame: 4 × 4\n",
"\\begin{tabular}{llll}\n",
" V1 & V2 & V3 & V4\\\\\n",
" & & & \\\\\n",
"\\hline\n",
"\t 1 & 5 & 1 & 5\\\\\n",
"\t 2 & 6 & 2 & 6\\\\\n",
"\t 3 & 7 & 3 & 7\\\\\n",
"\t 4 & 8 & 4 & 8\\\\\n",
"\\end{tabular}\n"
],
"text/markdown": [
"\n",
"A data.frame: 4 × 4\n",
"\n",
"| V1 <int> | V2 <int> | V3 <int> | V4 <int> |\n",
"|---|---|---|---|\n",
"| 1 | 5 | 1 | 5 |\n",
"| 2 | 6 | 2 | 6 |\n",
"| 3 | 7 | 3 | 7 |\n",
"| 4 | 8 | 4 | 8 |\n",
"\n"
],
"text/plain": [
" V1 V2 V3 V4\n",
"1 1 5 1 5 \n",
"2 2 6 2 6 \n",
"3 3 7 3 7 \n",
"4 4 8 4 8 "
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"MyDF = as.data.frame(MyMat)\n",
"MyDF"
]
},
{
"cell_type": "code",
"execution_count": 240,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"280 bytes"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"object.size(MyMat) # returns size of an R object (variable) in bytes"
]
},
{
"cell_type": "code",
"execution_count": 241,
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
"text/plain": [
"1152 bytes"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"object.size(MyDF)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Quite a big difference!"
]
},
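{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a minimal sketch of two points made above (that a dataframe is a kind of list, and that you may need to convert it to a matrix for linear algebra), the following uses only base R functions. The object names `DemoDF` and `DemoMat` are made up for this example.\n",
"\n",
"```r\n",
"DemoDF <- data.frame(V1 = 1:4, V2 = 5:8) # a small dataframe\n",
"is.list(DemoDF)      # TRUE: each column is one item of a list\n",
"DemoDF[[2]]          # so list-style access returns the second column\n",
"\n",
"DemoMat <- as.matrix(DemoDF) # convert to a matrix for linear algebra\n",
"is.matrix(DemoMat)           # TRUE\n",
"DemoMat %*% c(1, 1)          # matrix-vector product: row sums, as a 4 x 1 matrix\n",
"```\n",
"\n",
"Trying `DemoDF %*% c(1, 1)` directly should fail, because `%*%` expects matrices or vectors rather than a dataframe."
]
},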
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Creating and manipulating data\n",
"\n",
"(R-creating-sequences)=\n",
"### Creating sequences\n",
"\n",
"The `:` operator creates vectors of sequential integers:"
]
},
{
"cell_type": "code",
"execution_count": 242,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"- 1990
- 1991
- 1992
- 1993
- 1994
- 1995
- 1996
- 1997
- 1998
- 1999
- 2000
- 2001
- 2002
- 2003
- 2004
- 2005
- 2006
- 2007
- 2008
- 2009
\n"
],
"text/latex": [
"\\begin{enumerate*}\n",
"\\item 1990\n",
"\\item 1991\n",
"\\item 1992\n",
"\\item 1993\n",
"\\item 1994\n",
"\\item 1995\n",
"\\item 1996\n",
"\\item 1997\n",
"\\item 1998\n",
"\\item 1999\n",
"\\item 2000\n",
"\\item 2001\n",
"\\item 2002\n",
"\\item 2003\n",
"\\item 2004\n",
"\\item 2005\n",
"\\item 2006\n",
"\\item 2007\n",
"\\item 2008\n",
"\\item 2009\n",
"\\end{enumerate*}\n"
],
"text/markdown": [
"1. 1990\n",
"2. 1991\n",
"3. 1992\n",
"4. 1993\n",
"5. 1994\n",
"6. 1995\n",
"7. 1996\n",
"8. 1997\n",
"9. 1998\n",
"10. 1999\n",
"11. 2000\n",
"12. 2001\n",
"13. 2002\n",
"14. 2003\n",
"15. 2004\n",
"16. 2005\n",
"17. 2006\n",
"18. 2007\n",
"19. 2008\n",
"20. 2009\n",
"\n",
"\n"
],
"text/plain": [
" [1] 1990 1991 1992 1993 1994 1995 1996 1997 1998 1999 2000 2001 2002 2003 2004\n",
"[16] 2005 2006 2007 2008 2009"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"years <- 1990:2009\n",
"years"
]
},
{
"cell_type": "code",
"execution_count": 243,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"- 2009
- 2008
- 2007
- 2006
- 2005
- 2004
- 2003
- 2002
- 2001
- 2000
- 1999
- 1998
- 1997
- 1996
- 1995
- 1994
- 1993
- 1992
- 1991
- 1990
\n"
],
"text/latex": [
"\\begin{enumerate*}\n",
"\\item 2009\n",
"\\item 2008\n",
"\\item 2007\n",
"\\item 2006\n",
"\\item 2005\n",
"\\item 2004\n",
"\\item 2003\n",
"\\item 2002\n",
"\\item 2001\n",
"\\item 2000\n",
"\\item 1999\n",
"\\item 1998\n",
"\\item 1997\n",
"\\item 1996\n",
"\\item 1995\n",
"\\item 1994\n",
"\\item 1993\n",
"\\item 1992\n",
"\\item 1991\n",
"\\item 1990\n",
"\\end{enumerate*}\n"
],
"text/markdown": [
"1. 2009\n",
"2. 2008\n",
"3. 2007\n",
"4. 2006\n",
"5. 2005\n",
"6. 2004\n",
"7. 2003\n",
"8. 2002\n",
"9. 2001\n",
"10. 2000\n",
"11. 1999\n",
"12. 1998\n",
"13. 1997\n",
"14. 1996\n",
"15. 1995\n",
"16. 1994\n",
"17. 1993\n",
"18. 1992\n",
"19. 1991\n",
"20. 1990\n",
"\n",
"\n"
],
"text/plain": [
" [1] 2009 2008 2007 2006 2005 2004 2003 2002 2001 2000 1999 1998 1997 1996 1995\n",
"[16] 1994 1993 1992 1991 1990"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"years <- 2009:1990 # or in reverse order \n",
"years"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"For sequences of float numbers, you have to use `seq()`:"
]
},
{
"cell_type": "code",
"execution_count": 244,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"- 1
- 1.5
- 2
- 2.5
- 3
- 3.5
- 4
- 4.5
- 5
- 5.5
- 6
- 6.5
- 7
- 7.5
- 8
- 8.5
- 9
- 9.5
- 10
\n"
],
"text/latex": [
"\\begin{enumerate*}\n",
"\\item 1\n",
"\\item 1.5\n",
"\\item 2\n",
"\\item 2.5\n",
"\\item 3\n",
"\\item 3.5\n",
"\\item 4\n",
"\\item 4.5\n",
"\\item 5\n",
"\\item 5.5\n",
"\\item 6\n",
"\\item 6.5\n",
"\\item 7\n",
"\\item 7.5\n",
"\\item 8\n",
"\\item 8.5\n",
"\\item 9\n",
"\\item 9.5\n",
"\\item 10\n",
"\\end{enumerate*}\n"
],
"text/markdown": [
"1. 1\n",
"2. 1.5\n",
"3. 2\n",
"4. 2.5\n",
"5. 3\n",
"6. 3.5\n",
"7. 4\n",
"8. 4.5\n",
"9. 5\n",
"10. 5.5\n",
"11. 6\n",
"12. 6.5\n",
"13. 7\n",
"14. 7.5\n",
"15. 8\n",
"16. 8.5\n",
"17. 9\n",
"18. 9.5\n",
"19. 10\n",
"\n",
"\n"
],
"text/plain": [
" [1] 1.0 1.5 2.0 2.5 3.0 3.5 4.0 4.5 5.0 5.5 6.0 6.5 7.0 7.5 8.0\n",
"[16] 8.5 9.0 9.5 10.0"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"seq(1, 10, 0.5)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"```{tip}\n",
"Don't forget, you can get help on a particular R command by prefixing it with `?`. For example, try:\n",
"\n",
"`?seq`\n",
"```"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You can also use `seq(from=1,to=10, by=0.5) `OR` seq(from=1, by=0.5, to=10)` with the same effect (try it). This explicit, \"argument matching\" approach is partly what makes R so popular and accessible to a wider range of users."
]
},
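{
"cell_type": "markdown",
"metadata": {},
"source": [
"A few further ways of building sequences are sketched here as a supplement to the examples above. All of these are base R functions; the particular numbers are arbitrary.\n",
"\n",
"```r\n",
"seq(0, 1, length.out = 5)   # 5 evenly spaced values from 0 to 1\n",
"seq(10, 1, by = -3)         # a negative step counts downwards: 10 7 4 1\n",
"seq_len(4)                  # same as 1:4\n",
"rep(c(1, 2), times = 3)     # repeat a whole vector: 1 2 1 2 1 2\n",
"rep(c(1, 2), each = 3)      # repeat each element: 1 1 1 2 2 2\n",
"```\n",
"\n",
"Naming the arguments (`length.out`, `by`, `times`, `each`) makes it clear which behaviour you are asking for."
]
},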
{
"cell_type": "markdown",
"metadata": {},
"source": [
"(R-indices)=\n",
"\n",
"### Accessing parts of data stuctures: Indices and Indexing\n",
"\n",
"Every element (entry) of a vector in R has an order (an \"index\" value): the first value, second, third, etc. To illustrate this, let's create a simple vector:"
]
},
{
"cell_type": "code",
"execution_count": 245,
"metadata": {},
"outputs": [],
"source": [
"MyVar <- c( 'a' , 'b' , 'c' , 'd' , 'e' )"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Then, square brackets extract values based on their numerical order in the vector:"
]
},
{
"cell_type": "code",
"execution_count": 246,
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
"text/html": [
"'a'"
],
"text/latex": [
"'a'"
],
"text/markdown": [
"'a'"
],
"text/plain": [
"[1] \"a\""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"MyVar[1] # Show element in first position "
]
},
{
"cell_type": "code",
"execution_count": 247,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"'d'"
],
"text/latex": [
"'d'"
],
"text/markdown": [
"'d'"
],
"text/plain": [
"[1] \"d\""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"MyVar[4] # Show element in fourth position "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The values in square brackets are called \"indices\" — they give the index (position) of the required value. We can also select sets of values in different orders, or repeat values:"
]
},
{
"cell_type": "code",
"execution_count": 248,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"- 'c'
- 'b'
- 'a'
\n"
],
"text/latex": [
"\\begin{enumerate*}\n",
"\\item 'c'\n",
"\\item 'b'\n",
"\\item 'a'\n",
"\\end{enumerate*}\n"
],
"text/markdown": [
"1. 'c'\n",
"2. 'b'\n",
"3. 'a'\n",
"\n",
"\n"
],
"text/plain": [
"[1] \"c\" \"b\" \"a\""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"MyVar[c(3,2,1)] # reverse order"
]
},
{
"cell_type": "code",
"execution_count": 249,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"- 'a'
- 'a'
- 'e'
- 'e'
\n"
],
"text/latex": [
"\\begin{enumerate*}\n",
"\\item 'a'\n",
"\\item 'a'\n",
"\\item 'e'\n",
"\\item 'e'\n",
"\\end{enumerate*}\n"
],
"text/markdown": [
"1. 'a'\n",
"2. 'a'\n",
"3. 'e'\n",
"4. 'e'\n",
"\n",
"\n"
],
"text/plain": [
"[1] \"a\" \"a\" \"e\" \"e\""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"MyVar[c(1,1,5,5)] # repeat indices"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You can also manipulate data structures/objects by indexing:"
]
},
{
"cell_type": "code",
"execution_count": 250,
"metadata": {},
"outputs": [],
"source": [
"v <- c(0, 1, 2, 3, 4) # Create a vector named v"
]
},
{
"cell_type": "code",
"execution_count": 251,
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
"text/html": [
"2"
],
"text/latex": [
"2"
],
"text/markdown": [
"2"
],
"text/plain": [
"[1] 2"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"v[3] # access one element"
]
},
{
"cell_type": "code",
"execution_count": 252,
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
"text/html": [
"\n",
"- 0
- 1
- 2
\n"
],
"text/latex": [
"\\begin{enumerate*}\n",
"\\item 0\n",
"\\item 1\n",
"\\item 2\n",
"\\end{enumerate*}\n"
],
"text/markdown": [
"1. 0\n",
"2. 1\n",
"3. 2\n",
"\n",
"\n"
],
"text/plain": [
"[1] 0 1 2"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"v[1:3] # access sequential elements"
]
},
{
"cell_type": "code",
"execution_count": 253,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"- 0
- 1
- 3
- 4
\n"
],
"text/latex": [
"\\begin{enumerate*}\n",
"\\item 0\n",
"\\item 1\n",
"\\item 3\n",
"\\item 4\n",
"\\end{enumerate*}\n"
],
"text/markdown": [
"1. 0\n",
"2. 1\n",
"3. 3\n",
"4. 4\n",
"\n",
"\n"
],
"text/plain": [
"[1] 0 1 3 4"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"v[-3] # remove elements"
]
},
{
"cell_type": "code",
"execution_count": 254,
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
"text/html": [
"\n",
"- 0
- 3
\n"
],
"text/latex": [
"\\begin{enumerate*}\n",
"\\item 0\n",
"\\item 3\n",
"\\end{enumerate*}\n"
],
"text/markdown": [
"1. 0\n",
"2. 3\n",
"\n",
"\n"
],
"text/plain": [
"[1] 0 3"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"v[c(1, 4)] # access non-sequential indices"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"For matrices, you need to use both row and column indices:"
]
},
{
"cell_type": "code",
"execution_count": 255,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"A matrix: 5 × 5 of type int\n",
"\n",
"\t 1 | 2 | 3 | 4 | 5 |
\n",
"\t 6 | 7 | 8 | 9 | 10 |
\n",
"\t11 | 12 | 13 | 14 | 15 |
\n",
"\t16 | 17 | 18 | 19 | 20 |
\n",
"\t21 | 22 | 23 | 24 | 25 |
\n",
"\n",
"
\n"
],
"text/latex": [
"A matrix: 5 × 5 of type int\n",
"\\begin{tabular}{lllll}\n",
"\t 1 & 2 & 3 & 4 & 5\\\\\n",
"\t 6 & 7 & 8 & 9 & 10\\\\\n",
"\t 11 & 12 & 13 & 14 & 15\\\\\n",
"\t 16 & 17 & 18 & 19 & 20\\\\\n",
"\t 21 & 22 & 23 & 24 & 25\\\\\n",
"\\end{tabular}\n"
],
"text/markdown": [
"\n",
"A matrix: 5 × 5 of type int\n",
"\n",
"| 1 | 2 | 3 | 4 | 5 |\n",
"| 6 | 7 | 8 | 9 | 10 |\n",
"| 11 | 12 | 13 | 14 | 15 |\n",
"| 16 | 17 | 18 | 19 | 20 |\n",
"| 21 | 22 | 23 | 24 | 25 |\n",
"\n"
],
"text/plain": [
" [,1] [,2] [,3] [,4] [,5]\n",
"[1,] 1 2 3 4 5 \n",
"[2,] 6 7 8 9 10 \n",
"[3,] 11 12 13 14 15 \n",
"[4,] 16 17 18 19 20 \n",
"[5,] 21 22 23 24 25 "
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"mat1 <- matrix(1:25, 5, 5, byrow=TRUE) #create a matrix\n",
"mat1"
]
},
{
"cell_type": "code",
"execution_count": 256,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"2"
],
"text/latex": [
"2"
],
"text/markdown": [
"2"
],
"text/plain": [
"[1] 2"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"mat1[1,2]"
]
},
{
"cell_type": "code",
"execution_count": 257,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"- 2
- 3
- 4
\n"
],
"text/latex": [
"\\begin{enumerate*}\n",
"\\item 2\n",
"\\item 3\n",
"\\item 4\n",
"\\end{enumerate*}\n"
],
"text/markdown": [
"1. 2\n",
"2. 3\n",
"3. 4\n",
"\n",
"\n"
],
"text/plain": [
"[1] 2 3 4"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"mat1[1,2:4]"
]
},
{
"cell_type": "code",
"execution_count": 258,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"A matrix: 2 × 3 of type int\n",
"\n",
"\t2 | 3 | 4 |
\n",
"\t7 | 8 | 9 |
\n",
"\n",
"
\n"
],
"text/latex": [
"A matrix: 2 × 3 of type int\n",
"\\begin{tabular}{lll}\n",
"\t 2 & 3 & 4\\\\\n",
"\t 7 & 8 & 9\\\\\n",
"\\end{tabular}\n"
],
"text/markdown": [
"\n",
"A matrix: 2 × 3 of type int\n",
"\n",
"| 2 | 3 | 4 |\n",
"| 7 | 8 | 9 |\n",
"\n"
],
"text/plain": [
" [,1] [,2] [,3]\n",
"[1,] 2 3 4 \n",
"[2,] 7 8 9 "
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"mat1[1:2,2:4]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"And to get all elements in a particular row or column, you need to leave the value blank:"
]
},
{
"cell_type": "code",
"execution_count": 259,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"- 1
- 2
- 3
- 4
- 5
\n"
],
"text/latex": [
"\\begin{enumerate*}\n",
"\\item 1\n",
"\\item 2\n",
"\\item 3\n",
"\\item 4\n",
"\\item 5\n",
"\\end{enumerate*}\n"
],
"text/markdown": [
"1. 1\n",
"2. 2\n",
"3. 3\n",
"4. 4\n",
"5. 5\n",
"\n",
"\n"
],
"text/plain": [
"[1] 1 2 3 4 5"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"mat1[1,] # First row, all columns"
]
},
{
"cell_type": "code",
"execution_count": 260,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"- 1
- 6
- 11
- 16
- 21
\n"
],
"text/latex": [
"\\begin{enumerate*}\n",
"\\item 1\n",
"\\item 6\n",
"\\item 11\n",
"\\item 16\n",
"\\item 21\n",
"\\end{enumerate*}\n"
],
"text/markdown": [
"1. 1\n",
"2. 6\n",
"3. 11\n",
"4. 16\n",
"5. 21\n",
"\n",
"\n"
],
"text/plain": [
"[1] 1 6 11 16 21"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"mat1[,1] # First column, all rows"
]
},
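{
"cell_type": "markdown",
"metadata": {},
"source": [
"Besides positions, square brackets also accept logical (TRUE/FALSE) vectors, which is a common way to select values that meet a condition. The sketch below recreates `v` and `mat1` from above so that it is self-contained, and also shows the `drop = FALSE` argument, which keeps a single row as a matrix rather than simplifying it to a vector.\n",
"\n",
"```r\n",
"v <- c(0, 1, 2, 3, 4)\n",
"v > 1            # a logical vector: FALSE FALSE TRUE TRUE TRUE\n",
"v[v > 1]         # keep only elements where the condition is TRUE: 2 3 4\n",
"which(v > 1)     # positions (indices) of those elements: 3 4 5\n",
"\n",
"mat1 <- matrix(1:25, 5, 5, byrow = TRUE)\n",
"dim(mat1[1, ])                # NULL: a single row is simplified to a vector\n",
"dim(mat1[1, , drop = FALSE])  # 1 5: still a matrix with one row\n",
"mat1[mat1[, 1] > 10, 1:2]     # rows whose first column exceeds 10, first two columns\n",
"```\n",
"\n",
"Logical indexing like this is often more readable than working out the positions by hand."
]
},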
{
"cell_type": "markdown",
"metadata": {},
"source": [
"(R-recycling)=\n",
"### Recycling\n",
"\n",
"When vectors are of different lengths, R will recycle the shorter one to make a vector of the same length:"
]
},
{
"cell_type": "code",
"execution_count": 261,
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
"text/html": [
"\n",
"- 3
- 7
\n"
],
"text/latex": [
"\\begin{enumerate*}\n",
"\\item 3\n",
"\\item 7\n",
"\\end{enumerate*}\n"
],
"text/markdown": [
"1. 3\n",
"2. 7\n",
"\n",
"\n"
],
"text/plain": [
"[1] 3 7"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"a <- c(1,5) + 2\n",
"a"
]
},
{
"cell_type": "code",
"execution_count": 262,
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
"text/html": [
"\n",
"- 1
- 2
\n"
],
"text/latex": [
"\\begin{enumerate*}\n",
"\\item 1\n",
"\\item 2\n",
"\\end{enumerate*}\n"
],
"text/markdown": [
"1. 1\n",
"2. 2\n",
"\n",
"\n"
],
"text/plain": [
"[1] 1 2"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"\n",
"- 5
- 3
- 9
- 2
\n"
],
"text/latex": [
"\\begin{enumerate*}\n",
"\\item 5\n",
"\\item 3\n",
"\\item 9\n",
"\\item 2\n",
"\\end{enumerate*}\n"
],
"text/markdown": [
"1. 5\n",
"2. 3\n",
"3. 9\n",
"4. 2\n",
"\n",
"\n"
],
"text/plain": [
"[1] 5 3 9 2"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"x <- c(1,2); y <- c(5,3,9,2)\n",
"x;y"
]
},
{
"cell_type": "code",
"execution_count": 263,
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
"text/html": [
"\n",
"- 6
- 5
- 10
- 4
\n"
],
"text/latex": [
"\\begin{enumerate*}\n",
"\\item 6\n",
"\\item 5\n",
"\\item 10\n",
"\\item 4\n",
"\\end{enumerate*}\n"
],
"text/markdown": [
"1. 6\n",
"2. 5\n",
"3. 10\n",
"4. 4\n",
"\n",
"\n"
],
"text/plain": [
"[1] 6 5 10 4"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"x + y"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Strange! R just recycled `x` (repeated `1,2` twice) so that the two vectors could be summed! here's another example:"
]
},
{
"cell_type": "code",
"execution_count": 264,
"metadata": {
"scrolled": true
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"Warning message in x + c(y, 1):\n",
"“longer object length is not a multiple of shorter object length”\n"
]
},
{
"data": {
"text/html": [
"\n",
"- 6
- 5
- 10
- 4
- 2
\n"
],
"text/latex": [
"\\begin{enumerate*}\n",
"\\item 6\n",
"\\item 5\n",
"\\item 10\n",
"\\item 4\n",
"\\item 2\n",
"\\end{enumerate*}\n"
],
"text/markdown": [
"1. 6\n",
"2. 5\n",
"3. 10\n",
"4. 4\n",
"5. 2\n",
"\n",
"\n"
],
"text/plain": [
"[1] 6 5 10 4 2"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"x + c(y,1)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"*Think about what happened here*. R is clearly not comfortable doing this, so it warns you! Recycling could be convenient at times, but is dangerous!"
]
},
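{
"cell_type": "markdown",
"metadata": {},
"source": [
"One way to guard against accidental recycling is to check vector lengths explicitly before combining them. The helper below, `SafeAdd`, is a hypothetical function written for this example (it is not part of R); it simply refuses to add vectors of unequal length unless one of them is a single value.\n",
"\n",
"```r\n",
"# Hypothetical helper, written for illustration only\n",
"SafeAdd <- function(a, b) {\n",
"    # allow equal lengths, or a single value combined with a vector\n",
"    if (length(a) != length(b) && length(a) != 1 && length(b) != 1) {\n",
"        stop('vectors have different lengths: ', length(a), ' vs ', length(b))\n",
"    }\n",
"    a + b\n",
"}\n",
"\n",
"x <- c(1, 2); y <- c(5, 3, 9, 2)\n",
"SafeAdd(y, 2)                 # fine: 7 5 11 4\n",
"SafeAdd(x, rep(1, 2))         # fine: 2 3\n",
"result <- try(SafeAdd(x, y), silent = TRUE) # refuses to recycle x silently\n",
"class(result)                 # 'try-error'\n",
"```\n",
"\n",
"Whether such a strict check is appropriate depends on your analysis; the point is only that the lengths are compared deliberately rather than left to R's recycling rule."
]
},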
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Basic vector-matrix operations\n",
"\n",
"You can perform the usual vector matrix operations on R `vectors` "
]
},
{
"cell_type": "code",
"execution_count": 265,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"- 0
- 2
- 4
- 6
- 8
\n"
],
"text/latex": [
"\\begin{enumerate*}\n",
"\\item 0\n",
"\\item 2\n",
"\\item 4\n",
"\\item 6\n",
"\\item 8\n",
"\\end{enumerate*}\n"
],
"text/markdown": [
"1. 0\n",
"2. 2\n",
"3. 4\n",
"4. 6\n",
"5. 8\n",
"\n",
"\n"
],
"text/plain": [
"[1] 0 2 4 6 8"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"v <- c(0, 1, 2, 3, 4)\n",
"v2 <- v*2 # multiply whole vector by 2\n",
"v2"
]
},
{
"cell_type": "code",
"execution_count": 266,
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
"text/html": [
"\n",
"- 0
- 2
- 8
- 18
- 32
\n"
],
"text/latex": [
"\\begin{enumerate*}\n",
"\\item 0\n",
"\\item 2\n",
"\\item 8\n",
"\\item 18\n",
"\\item 32\n",
"\\end{enumerate*}\n"
],
"text/markdown": [
"1. 0\n",
"2. 2\n",
"3. 8\n",
"4. 18\n",
"5. 32\n",
"\n",
"\n"
],
"text/plain": [
"[1] 0 2 8 18 32"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"v * v2 # element-wise product"
]
},
{
"cell_type": "code",
"execution_count": 267,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"A matrix: 1 × 5 of type dbl\n",
"\n",
"\t0 | 1 | 2 | 3 | 4 |
\n",
"\n",
"
\n"
],
"text/latex": [
"A matrix: 1 × 5 of type dbl\n",
"\\begin{tabular}{lllll}\n",
"\t 0 & 1 & 2 & 3 & 4\\\\\n",
"\\end{tabular}\n"
],
"text/markdown": [
"\n",
"A matrix: 1 × 5 of type dbl\n",
"\n",
"| 0 | 1 | 2 | 3 | 4 |\n",
"\n"
],
"text/plain": [
" [,1] [,2] [,3] [,4] [,5]\n",
"[1,] 0 1 2 3 4 "
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"t(v) # transpose the vector"
]
},
{
"cell_type": "code",
"execution_count": 268,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"A matrix: 5 × 5 of type dbl\n",
"\n",
"\t0 | 0 | 0 | 0 | 0 |
\n",
"\t0 | 1 | 2 | 3 | 4 |
\n",
"\t0 | 2 | 4 | 6 | 8 |
\n",
"\t0 | 3 | 6 | 9 | 12 |
\n",
"\t0 | 4 | 8 | 12 | 16 |
\n",
"\n",
"
\n"
],
"text/latex": [
"A matrix: 5 × 5 of type dbl\n",
"\\begin{tabular}{lllll}\n",
"\t 0 & 0 & 0 & 0 & 0\\\\\n",
"\t 0 & 1 & 2 & 3 & 4\\\\\n",
"\t 0 & 2 & 4 & 6 & 8\\\\\n",
"\t 0 & 3 & 6 & 9 & 12\\\\\n",
"\t 0 & 4 & 8 & 12 & 16\\\\\n",
"\\end{tabular}\n"
],
"text/markdown": [
"\n",
"A matrix: 5 × 5 of type dbl\n",
"\n",
"| 0 | 0 | 0 | 0 | 0 |\n",
"| 0 | 1 | 2 | 3 | 4 |\n",
"| 0 | 2 | 4 | 6 | 8 |\n",
"| 0 | 3 | 6 | 9 | 12 |\n",
"| 0 | 4 | 8 | 12 | 16 |\n",
"\n"
],
"text/plain": [
" [,1] [,2] [,3] [,4] [,5]\n",
"[1,] 0 0 0 0 0 \n",
"[2,] 0 1 2 3 4 \n",
"[3,] 0 2 4 6 8 \n",
"[4,] 0 3 6 9 12 \n",
"[5,] 0 4 8 12 16 "
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"v %*% t(v) # matrix/vector product"
]
},
{
"cell_type": "code",
"execution_count": 269,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"- 1
- 2
- 3
- 4
- 5
- 6
- 7
\n"
],
"text/latex": [
"\\begin{enumerate*}\n",
"\\item 1\n",
"\\item 2\n",
"\\item 3\n",
"\\item 4\n",
"\\item 5\n",
"\\item 6\n",
"\\item 7\n",
"\\end{enumerate*}\n"
],
"text/markdown": [
"1. 1\n",
"2. 2\n",
"3. 3\n",
"4. 4\n",
"5. 5\n",
"6. 6\n",
"7. 7\n",
"\n",
"\n"
],
"text/plain": [
"[1] 1 2 3 4 5 6 7"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"v3 <- 1:7 # assign using sequence\n",
"v3"
]
},
{
"cell_type": "code",
"execution_count": 270,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"- 0
- 2
- 4
- 6
- 8
- 1
- 2
- 3
- 4
- 5
- 6
- 7
\n"
],
"text/latex": [
"\\begin{enumerate*}\n",
"\\item 0\n",
"\\item 2\n",
"\\item 4\n",
"\\item 6\n",
"\\item 8\n",
"\\item 1\n",
"\\item 2\n",
"\\item 3\n",
"\\item 4\n",
"\\item 5\n",
"\\item 6\n",
"\\item 7\n",
"\\end{enumerate*}\n"
],
"text/markdown": [
"1. 0\n",
"2. 2\n",
"3. 4\n",
"4. 6\n",
"5. 8\n",
"6. 1\n",
"7. 2\n",
"8. 3\n",
"9. 4\n",
"10. 5\n",
"11. 6\n",
"12. 7\n",
"\n",
"\n"
],
"text/plain": [
" [1] 0 2 4 6 8 1 2 3 4 5 6 7"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"v4 <- c(v2, v3) # concatenate vectors\n",
"v4"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Strings and Pasting\n",
"\n",
"It is important to know how to handle strings in R for two main reasons:\n",
"\n",
"* To deal with text data, such as names of experimental treatments \n",
"* To generate appropriate text labels and titles for figures\n",
"\n",
"Let's try creating and manipulating strings:"
]
},
{
"cell_type": "code",
"execution_count": 271,
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
"text/html": [
"'Quercus robur'"
],
"text/latex": [
"'Quercus robur'"
],
"text/markdown": [
"'Quercus robur'"
],
"text/plain": [
"[1] \"Quercus robur\""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"species.name <- \"Quercus robur\" #You can alo use single quotes\n",
"species.name"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"To combine to strings: "
]
},
{
"cell_type": "code",
"execution_count": 272,
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
"text/html": [
"'Quercus robur'"
],
"text/latex": [
"'Quercus robur'"
],
"text/markdown": [
"'Quercus robur'"
],
"text/plain": [
"[1] \"Quercus robur\""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"paste(\"Quercus\", \"robur\")"
]
},
{
"cell_type": "code",
"execution_count": 273,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"'Quercusrobur'"
],
"text/latex": [
"'Quercusrobur'"
],
"text/markdown": [
"'Quercusrobur'"
],
"text/plain": [
"[1] \"Quercusrobur\""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"paste(\"Quercus\", \"robur\",sep = \"\") #Get rid of space"
]
},
{
"cell_type": "code",
"execution_count": 274,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"'Quercus, robur'"
],
"text/latex": [
"'Quercus, robur'"
],
"text/markdown": [
"'Quercus, robur'"
],
"text/plain": [
"[1] \"Quercus, robur\""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"paste(\"Quercus\", \"robur\",sep = \", \") #insert comma to separate"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"As you can see above, both double and single quotes work, but using double quotes is better because it will allow you to define strings that contain a single quotes, which is often necessary.\n",
"\n",
"And as is the case with so many R functions, pasting works on vectors:"
]
},
{
"cell_type": "code",
"execution_count": 275,
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
"text/html": [
"\n",
"- 'Year is: 1990'
- 'Year is: 1991'
- 'Year is: 1992'
- 'Year is: 1993'
- 'Year is: 1994'
- 'Year is: 1995'
- 'Year is: 1996'
- 'Year is: 1997'
- 'Year is: 1998'
- 'Year is: 1999'
- 'Year is: 2000'
\n"
],
"text/latex": [
"\\begin{enumerate*}\n",
"\\item 'Year is: 1990'\n",
"\\item 'Year is: 1991'\n",
"\\item 'Year is: 1992'\n",
"\\item 'Year is: 1993'\n",
"\\item 'Year is: 1994'\n",
"\\item 'Year is: 1995'\n",
"\\item 'Year is: 1996'\n",
"\\item 'Year is: 1997'\n",
"\\item 'Year is: 1998'\n",
"\\item 'Year is: 1999'\n",
"\\item 'Year is: 2000'\n",
"\\end{enumerate*}\n"
],
"text/markdown": [
"1. 'Year is: 1990'\n",
"2. 'Year is: 1991'\n",
"3. 'Year is: 1992'\n",
"4. 'Year is: 1993'\n",
"5. 'Year is: 1994'\n",
"6. 'Year is: 1995'\n",
"7. 'Year is: 1996'\n",
"8. 'Year is: 1997'\n",
"9. 'Year is: 1998'\n",
"10. 'Year is: 1999'\n",
"11. 'Year is: 2000'\n",
"\n",
"\n"
],
"text/plain": [
" [1] \"Year is: 1990\" \"Year is: 1991\" \"Year is: 1992\" \"Year is: 1993\"\n",
" [5] \"Year is: 1994\" \"Year is: 1995\" \"Year is: 1996\" \"Year is: 1997\"\n",
" [9] \"Year is: 1998\" \"Year is: 1999\" \"Year is: 2000\""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"paste('Year is:', 1990:2000)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Note that this last example creates a vector of 11 strings as it is 1990:2000 *inclusive*. "
]
},
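{
"cell_type": "markdown",
"metadata": {},
"source": [
"The pasting behavior above can be taken a little further. The following is a minimal sketch rather than part of the original examples: `paste0()` is base R shorthand for `paste(..., sep = \"\")`, the `collapse` argument joins the elements of a vector into a single string, and the variable names (`genus`, `labels`, `one.string`, `quoted`) are made up for illustration.\n",
"\n",
"```R\n",
"genus <- \"Quercus\"\n",
"labels <- paste0(genus, \"_\", 1:3)  # same as paste(genus, \"_\", 1:3, sep = \"\")\n",
"labels\n",
"\n",
"one.string <- paste(labels, collapse = \"; \")  # join the vector into one string\n",
"one.string\n",
"\n",
"quoted <- \"The oak's scientific name\"  # double quotes allow a single quote inside\n",
"nchar(quoted)\n",
"```\n",
"\n",
"Using `collapse` is handy when a whole vector of labels needs to go into a single figure title or file name."
]
},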
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Useful R functions\n",
"\n",
"There are a number of very useful functions available by default (in the \"base packages\"). Some particularly useful ones are listed below.\n",
"\n",
"### For manipulating strings\n",
"\n",
"|Function||\n",
"|:-|:-|\n",
"|`strsplit(x,';')`| Split the string `x` at ';' |\n",
"|`nchar(x)`| Number of characters in string `x`|\n",
"|`toupper(x)`| Set string `x` to upper case|\n",
"|`tolower(x)`| Set string `x` to lower case|\n",
"|`paste(x1,x2,sep=';')`| Join the two strings using ';'|\n",
"\n",
"### Mathematical\n",
"\n",
"|Function||\n",
"|:-|:-|\n",
"|`log(x)`| Natural logarithm of the number (or every number in the vector or matrix) `x`|\n",
"|`log10(x)`| Logarithm in base 10 of the number (or every number in the vector or matrix) `x`|\n",
"|`exp(x)`| Exponential of the number (or every number in the vector or matrix) `x` ($e^x$)|\n",
"|`abs(x)`| Absolute value of the number (or every number in the vector or matrix) `x`|\n",
"|`floor(x)`| Largest integer smaller than the number (or every number in the vector or matrix) `x`|\n",
"|`ceiling(x)`| Smallest integer greater than the number (or every number in the vector or matrix) `x`|\n",
"|`sqrt(x)`| Square root of the number (or every number in the vector or matrix) `x` ($\\sqrt{x}$)|\n",
"|`sin(x)`| Sine function of the number (or every number in a vector or matrix) `x`|\n",
"|`pi`| Value of the constant $\\pi$|\n",
"\n",
"### Statistical\n",
"\n",
"|Function||\n",
"|:-|:-|\n",
"|`mean(x)`| Compute mean of (a vector or matrix) `x`| \n",
"|`sd(x)`| Standard deviation of (a vector or matrix) `x`|\n",
"|`var(x)`| Variance of (a vector or matrix) `x`|\n",
"|`median(x)`| Median of (a vector or matrix) `x`|\n",
"|`quantile(x,0.05)`| Compute the 0.05 quantile of (a vector or matrix) `x`|\n",
"|`range(x)`| Range of the data in (a vector or matrix) `x`|\n",
"|`min(x)`| Minimum of (a vector or matrix) `x`|\n",
"|`max(x)`| Maximum of (a vector or matrix) `x`|\n",
"|`sum(x)`| Sum all elements of (a vector or matrix) `x`|\n",
"|`summary(x)`| Summary statistics for (a vector or matrix) `x`|\n",
"\n",
"### Sets\n",
"\n",
"|Function||\n",
"|:-|:-|\n",
"|`union(x, y)` | Union of all elements of two vectors x & y|\n",
"|`intersect(x, y)`|Elements common to two vectors x & y|\n",
"|`setdiff(x, y)`|Elements unique to two vectors x & y|\n",
"|`setequal(x, y)`|Check if two vectors x & y are the same set (have same unique elements)|\n",
"|`is.element(x, y)`|Check if an element x is in vector y (same as `x %in% y`)|\n",
"\n",
"★ *Try out the above commands in your R console by generating the appropriate data*. For example, "
]
},
{
"cell_type": "code",
"execution_count": 276,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\t- \n",
"
- 'String'
- ' to'
- ' Split'
\n",
" \n",
"
\n"
],
"text/latex": [
"\\begin{enumerate}\n",
"\\item \\begin{enumerate*}\n",
"\\item 'String'\n",
"\\item ' to'\n",
"\\item ' Split'\n",
"\\end{enumerate*}\n",
"\n",
"\\end{enumerate}\n"
],
"text/markdown": [
"1. 1. 'String'\n",
"2. ' to'\n",
"3. ' Split'\n",
"\n",
"\n",
"\n",
"\n",
"\n"
],
"text/plain": [
"[[1]]\n",
"[1] \"String\" \" to\" \" Split\"\n"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"strsplit(\"String; to; Split\",';')# Split the string at ';'"
]
},
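{
"cell_type": "markdown",
"metadata": {},
"source": [
"A few more of the functions listed in the tables above can be tried in the same way. This is only a small sketch; the vectors `x`, `a`, and `b` are made-up data chosen so that the results are easy to check by eye.\n",
"\n",
"```R\n",
"x <- c(2.5, -1.2, 7)\n",
"floor(x)      # round each element down to an integer\n",
"ceiling(x)    # round each element up to an integer\n",
"\n",
"a <- c(1, 2, 3, 4)\n",
"b <- c(3, 4, 5)\n",
"union(a, b)       # all distinct elements of a and b\n",
"intersect(a, b)   # elements present in both a and b\n",
"setdiff(a, b)     # elements of a that are not in b\n",
"setdiff(b, a)     # elements of b that are not in a\n",
"3 %in% a          # same as is.element(3, a)\n",
"```\n",
"\n",
"Note that `setdiff()` is not symmetric: swapping its arguments generally changes the result."
]
},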
{
"cell_type": "markdown",
"metadata": {},
"source": [
"(R-random-numbers)=\n",
"### Generating Random Numbers\n",
"\n",
"You will probably need to generate random numbers at some point as a quantitative biologist. \n",
"\n",
"R has many routines for generating random samples from various probability distributions. There are a number of random number distributions that you can sample or generate random numbers from: \n",
"\n",
"|Function||\n",
"|:-|:-|\n",
"|`rnorm(10, m=0, sd=1)`| Draw 10 normal random numbers with mean=0 and standard deviation = 1|\n",
"|`dnorm(x, m=0, sd=1)` | Density function|\n",
"|`qnorm(x, m=0, sd=1)` | Cumulative density function|\n",
"|`runif(20, min=0, max=2)` | Twenty random numbers from uniform `[0,2]`|\n",
"|` rpois(20, lambda=10)` | Twenty random numbers from Poisson (with mean $\\lambda$)|\n",
"\n",
"#### \"Seeding\" random number generators \n",
"\n",
"Computers *can't* generate *true* mathematically random numbers. This may seem surprising, but basically a computer cannot be programmed to do things purely by chance; it can only follow given instructions blindly and is therefore completely predictable. Instead, computers have algorithms called \"pseudo-random number generators\" that generate *practically random* sequences of numbers. These are typically based on a iterative formula that generates a sequence of random numbers, starting with a first number called the \"**seed**\". This sequence is completely \"deterministic\", that is, starting with a particular seed yields exactly the same sequence of pseudo-random numbers every time you re-run the generator. \n",
"\n",
"Try this:"
]
},
{
"cell_type": "code",
"execution_count": 277,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"0.156703769128359"
],
"text/latex": [
"0.156703769128359"
],
"text/markdown": [
"0.156703769128359"
],
"text/plain": [
"[1] 0.1567038"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"set.seed(1234567)\n",
"rnorm(1)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Everybody in the class will get the same answer! \n",
"\n",
"Now try and compare the results with your neighbor:"
]
},
{
"cell_type": "code",
"execution_count": 278,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"- 1.37381119149164
- 0.73067024376365
- -1.35080092669852
- -0.00851496085595985
- 0.320981862836429
- -1.77814840855737
- 0.909503835073888
- -0.919404336160487
- -0.157714830888067
- 1.10199738945752
\n"
],
"text/latex": [
"\\begin{enumerate*}\n",
"\\item 1.37381119149164\n",
"\\item 0.73067024376365\n",
"\\item -1.35080092669852\n",
"\\item -0.00851496085595985\n",
"\\item 0.320981862836429\n",
"\\item -1.77814840855737\n",
"\\item 0.909503835073888\n",
"\\item -0.919404336160487\n",
"\\item -0.157714830888067\n",
"\\item 1.10199738945752\n",
"\\end{enumerate*}\n"
],
"text/markdown": [
"1. 1.37381119149164\n",
"2. 0.73067024376365\n",
"3. -1.35080092669852\n",
"4. -0.00851496085595985\n",
"5. 0.320981862836429\n",
"6. -1.77814840855737\n",
"7. 0.909503835073888\n",
"8. -0.919404336160487\n",
"9. -0.157714830888067\n",
"10. 1.10199738945752\n",
"\n",
"\n"
],
"text/plain": [
" [1] 1.373811191 0.730670244 -1.350800927 -0.008514961 0.320981863\n",
" [6] -1.778148409 0.909503835 -0.919404336 -0.157714831 1.101997389"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"rnorm(10)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"And then the whole sequence of 11 numbers you generated:"
]
},
{
"cell_type": "code",
"execution_count": 279,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"- 0.156703769128359
- 1.37381119149164
- 0.73067024376365
- -1.35080092669852
- -0.00851496085595985
- 0.320981862836429
- -1.77814840855737
- 0.909503835073888
- -0.919404336160487
- -0.157714830888067
- 1.10199738945752
\n"
],
"text/latex": [
"\\begin{enumerate*}\n",
"\\item 0.156703769128359\n",
"\\item 1.37381119149164\n",
"\\item 0.73067024376365\n",
"\\item -1.35080092669852\n",
"\\item -0.00851496085595985\n",
"\\item 0.320981862836429\n",
"\\item -1.77814840855737\n",
"\\item 0.909503835073888\n",
"\\item -0.919404336160487\n",
"\\item -0.157714830888067\n",
"\\item 1.10199738945752\n",
"\\end{enumerate*}\n"
],
"text/markdown": [
"1. 0.156703769128359\n",
"2. 1.37381119149164\n",
"3. 0.73067024376365\n",
"4. -1.35080092669852\n",
"5. -0.00851496085595985\n",
"6. 0.320981862836429\n",
"7. -1.77814840855737\n",
"8. 0.909503835073888\n",
"9. -0.919404336160487\n",
"10. -0.157714830888067\n",
"11. 1.10199738945752\n",
"\n",
"\n"
],
"text/plain": [
" [1] 0.156703769 1.373811191 0.730670244 -1.350800927 -0.008514961\n",
" [6] 0.320981863 -1.778148409 0.909503835 -0.919404336 -0.157714831\n",
"[11] 1.101997389"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"set.seed(1234567)\n",
"rnorm(11)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Thus, setting the seed allows you to reliably generate the identical sequence of \"random\" numbers. These numbers are not truly random, but have the properties of random numbers. Note also that pseudo-random number generators are periodic, which means that the sequence will eventually repeat itself. However, this period is so long that it can be ignored for most practical purposes. So effectively, `rnorm` has an enormous list that it cycles through. The random seed starts the process, i.e., indicates where in the list to start. This is usually taken from the clock when you start R.\n",
"\n",
"But why bother with random number seeds? Setting a particular seed can be useful when debugging programs (coming up below). Bugs in code can be hard to find — harder still if you are generating random numbers, so repeat runs of your code may or may not all trigger the same behavior. You can set the seed once at the beginning of the code — ensuring repeatability, retaining (pseudo) randomness. Once debugged, if you want, you can remove the set seed line."
]
},
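{
"cell_type": "markdown",
"metadata": {},
"source": [
"The reproducibility described above can be checked directly in code. The sketch below uses an arbitrary seed (`42`) and made-up variable names (`first.run`, `second.run`, `p`); it also uses `pnorm()`, the cumulative distribution function of the normal distribution in base R, to show how it relates to `qnorm()`.\n",
"\n",
"```R\n",
"set.seed(42)          # 42 is an arbitrary seed chosen for this sketch\n",
"first.run <- rnorm(5)\n",
"\n",
"set.seed(42)          # reset to the same seed\n",
"second.run <- rnorm(5)\n",
"\n",
"identical(first.run, second.run)  # the two runs match exactly\n",
"\n",
"# pnorm() is the cumulative distribution function; qnorm() is its inverse\n",
"p <- pnorm(1.96)      # probability of a standard normal value below 1.96\n",
"qnorm(p)              # recovers 1.96 (up to floating-point error)\n",
"```\n",
"\n",
"If the second `set.seed()` call is left out, the two runs will differ, because the generator simply carries on from where it stopped."
]
},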
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Your analysis workflow\n",
"\n",
"In using R for an analysis, you will likely use and create several files. As in the case of bash and Python based projects, in R projects as well, you should keep your workflow well organized. For example, it is sensible to create a folder (directory) to keep all code files together. You can then set R to work from this directory, so that files are easy to find and run — this will be your \"working directory\" (more on this below). Also, you don't want to mix code files with data and results files. So you should create separate directories for these as well. \n",
"\n",
"Thus, your typical R analysis workflow will be:\n",
"\n",
"---\n",
"\n",
":::{figure-md} R-project-org\n",
"\n",
"\n",
"\n",
"**Your R project.** Keeping it neat and organized if the key to becoming a good R programmer.\n",
"\n",
":::\n",
"\n",
"---\n",
"\n",
"\n",
"Some details on each kind of file:\n",
"\n",
"\n",
"* *R script files*: These are plain text files containing all the R code needed for an analysis. These should be created with a text editor, typically part of some smart code editor like vscode, or RStudio, and saved with the extension `*.R`. You should *never* use Word to save or edit these files as R can only read code from plain text files.\n",
" \n",
"* *Text data files* These are files of data in plain text format containing one or more columns of data (numbers, strings, or both). Although there are several format options, we will typically be using `csv` files, where the data entries are separated by commas. These are easy to create and export from Excel (if that's what you use...).\n",
"\n",
"* *Results output files* These are a plain text files containing your results, such the the summary of output of a regression or ANOVA analysis. Typically, you will output your results in a table format where the columns are separated by commas (csv) or tabs (tab-delimited). \n",
"\n",
"* *Graphics files* R can export graphics in a wide range of formats. This can be done automatically from R code and we will look at this later but you can also select a graphics window (e.g., in RStudio) and click `File` $\\triangleright$ `Save as...`.\n",
"\n",
"* *Rdata files* You can save any data loaded or created in R, including outputs of statistical analyses and other things, into a single`Rdata` file. These are not plain text and can only be read by R, but can hold all the data from an analysis in a single handy location. We will not use these much in this course. \n",
"\n",
"So let's build your R analysis project structure. \n",
"\n",
"Do the following:\n",
"\n",
"★ Create a sensibly-named directory (e.g., `MyRCoursework`, `week3`, etc, depnding on which course you are on) in an appropriate location on your computer. If you are using a college Windows computer, you may need to create it in your `H:` drive. Avoid including spaces in your file or directory names, as this will often create problems when you share your file or directory with somebody else. Many software programs do not handle spaces in file/directory names well. Use underscores instead of spaces. For example, instead of `My R Coursework`, use `My_R_Coursework` or `MyRCoursework`. \n",
"\n",
"★ Create subdirectories *within this directory* called `code`, `data`, and `results`. Remember, commands in all programming languages are case-sensitive when it comes to reading directory path names, so `code` is not the same as `Code`!\n",
"\n",
"You can create directories using `dir.create()`within R (or if on Mac/Linux, with the usual `mkdir` from the bash terminal):\n",
"\n",
"```R\n",
"dir.create(\"MyRCoursework\")\n",
"dir.create(\"MyRCoursework/code\")\n",
"dir.create(\"MyRCoursework/data\") \n",
"dir.create(\"MyRCoursework/results\") \n",
"```"
]
},
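{
"cell_type": "markdown",
"metadata": {},
"source": [
"The same project structure can also be built in a loop. The sketch below is one possible way of doing it, not the only one: it builds the project inside R's temporary directory (via `tempdir()`) so that it can be run safely anywhere, and the variable names `project` and `sub` are made up for this example.\n",
"\n",
"```R\n",
"project <- file.path(tempdir(), \"MyRCoursework\")  # temporary location for the demo\n",
"\n",
"for (sub in c(\"code\", \"data\", \"results\")) {\n",
"    # recursive = TRUE also creates the parent directory if it is missing;\n",
"    # showWarnings = FALSE keeps the call quiet if the directory already exists\n",
"    dir.create(file.path(project, sub), recursive = TRUE, showWarnings = FALSE)\n",
"}\n",
"\n",
"list.files(project)  # should list code, data and results\n",
"```\n",
"\n",
"To create the directories in a real location, replace `tempdir()` with the directory where you want your coursework to live."
]
},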
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### The R Workspace and Working Directory\n",
"\n",
"R has a \"workspace\" – a current working environment that includes any user-defined data structures objects (vectors, matrices, data frames, lists) as well as other objects (e.g., functions). At the end of an R session, the user can save an image of the current workspace that is automatically reloaded the next time R is started. Your workspace is saved in your \"Working Directory\", which has to be set manually.\n",
"\n",
"So before we go any further, let's get sort out where your R \"Working Directory\" should be and how you should set it. R has a default location where it assumes your working directory is. \n",
"\n",
"* In UNIX/Linux, it is whichever directory you are in when you launch R.\n",
"\n",
"* In Mac, it is `/User/YourUserName` or similar.\n",
"\n",
"* In Windows, it is `C:/Windows/system32`or similar.\n",
"\n",
"To see where your current working directory is, at the R command prompt, type:\n",
"\n",
"`getwd()` \n",
"\n",
"This tells you what the current `w`orking `d`irectory is. \n",
"\n",
"Now, set the working directory to be `MrRCourseworkcode`. For example, if you created `MrRCoursework`directly in your `H:\\`, the you would use:\n",
"\n",
"`setwd(\"H:/MrRCourseworkcode\")` \n",
"\n",
"\n",
"`dir()` #check what's in the current working directory\n",
"\n",
"On your own computer, you can also change R's default to a particular working directory where you would like to start (easily done in RStudio): \n",
"\n",
"* In Linux, you can do this by editing the `Rprofile.site`site with `sudo gedit /etc/R/Rprofile.site`. In that file, you would add your start-up parameters between the lines \n",
"\n",
" `.First <- function() cat(\"\\n Welcome to R!\\n\\n\")`\n",
"\n",
"and \n",
"\n",
" `.Last <- function() cat(\"\\n Goodbye! \\n\\n\")`.\n",
"\n",
"Between these two lines, insert: \n",
" `setwd(\"/home/YourName/YourDirectoryPath\")`\n",
"\n",
"* In Windows and Macs, you can find the `Rprofile.site`file by searching for it. On Windows, it should be at `C:\\Program Files\\R\\R-x.x.x\\etc` directory, where `x.x.x` is your R version.\n",
"\n",
"* If you are using RStudio, you can change the default working directory through the RStudio \"Options\" dialog."
]
},
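{
"cell_type": "markdown",
"metadata": {},
"source": [
"The `getwd()`, `setwd()`, and `dir()` commands described above can be combined into a short round trip. This is a minimal sketch that moves to R's temporary directory (`tempdir()`) purely for demonstration and then returns; the variable name `old.wd` is made up for this example.\n",
"\n",
"```R\n",
"old.wd <- getwd()   # remember where we started\n",
"setwd(tempdir())    # move to a temporary directory for the demo\n",
"getwd()             # confirm the change\n",
"dir()               # list what is in the new working directory\n",
"setwd(old.wd)       # go back to the original working directory\n",
"```\n",
"\n",
"Storing the original location first means you can always get back to it, whatever directory you move to in between."
]
},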
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Importing and Exporting Data\n",
"\n",
"We are now ready to see how to import and export data in R, typically the first step of your analysis. The best option is to have your data in a `c`omma `s`eparated `v`alue (` csv`) text file or in a tab separated text file. Then, you can use the function `read.csv`(or `read.table`) to import your data. Now, lets get some data into your `Data`directory.\n",
"\n",
"★ Go to the [TheMulQuaBio git repository](https://github.com/mhasoba/TheMulQuaBio) and navigate to the [`data` directory](https://github.com/mhasoba/TheMulQuaBio/tree/master/content/data).\n",
"\n",
"★ Download and copy the file [`trees.csv`](https://raw.githubusercontent.com/mhasoba/TheMulQuaBio/master/content/data/trees.csv) into your own `data` directory. \n",
"\n",
"Alternatively, of you may download the whole repository to your computer and then grab the file from the place where you downloaded it. \n",
"\n",
"Now, import the data:"
]
},
{
"cell_type": "code",
"execution_count": 280,
"metadata": {
"tags": [
"remove-cell"
]
},
"outputs": [],
"source": [
"setwd(\"../code/\")"
]
},
{
"cell_type": "code",
"execution_count": 281,
"metadata": {},
"outputs": [],
"source": [
"MyData <- read.csv(\"../data/trees.csv\")"
]
},
{
"cell_type": "code",
"execution_count": 282,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"- 'MyData'
- 'MyDF'
- 'MyList'
- 'MyMat'
- 'MyVar'
\n"
],
"text/latex": [
"\\begin{enumerate*}\n",
"\\item 'MyData'\n",
"\\item 'MyDF'\n",
"\\item 'MyList'\n",
"\\item 'MyMat'\n",
"\\item 'MyVar'\n",
"\\end{enumerate*}\n"
],
"text/markdown": [
"1. 'MyData'\n",
"2. 'MyDF'\n",
"3. 'MyList'\n",
"4. 'MyMat'\n",
"5. 'MyVar'\n",
"\n",
"\n"
],
"text/plain": [
"[1] \"MyData\" \"MyDF\" \"MyList\" \"MyMat\" \"MyVar\" "
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"ls(pattern = \"My*\") # Check that MyData has appeared"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Your output may be somewhat different, depending on what what variables and other objects have been created in \n",
"your R Workspace during the current R session. But the main thing is, you should be able to see a `MyData` in the list of objects printed. \n",
"\n",
"```{tip}\n",
"You can list only objects with a particular name pattern by using the `pattern` option of `ls()`. It works by using regular expressions, which you were introduced through the `grep` command in the [UNIX chapter](Using-grep). We will delve deeper into regular expressions in the [Python II Chapter](Python_II:python-regex). \n",
"```"
]
},
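{
"cell_type": "markdown",
"metadata": {},
"source": [
"Because `pattern` is a regular expression rather than a shell-style wildcard, `\"My*\"` and `\"^My\"` do not mean the same thing. The sketch below illustrates the difference in a separate, throwaway environment (`demo.env`, a name made up for this example, as are the objects stored in it) so that your own workspace is left untouched.\n",
"\n",
"```R\n",
"demo.env <- new.env()   # hypothetical environment used only for this demo\n",
"assign(\"MyData\", 1:3, envir = demo.env)\n",
"assign(\"MyMat\", matrix(0, 2, 2), envir = demo.env)\n",
"assign(\"Misc\", \"other\", envir = demo.env)\n",
"\n",
"# \"My*\" means \"M followed by zero or more y\", so Misc matches too\n",
"loose <- ls(envir = demo.env, pattern = \"My*\")\n",
"loose\n",
"\n",
"# \"^My\" means \"name starts with My\"\n",
"strict <- ls(envir = demo.env, pattern = \"^My\")\n",
"strict\n",
"```\n",
"\n",
"Anchoring the pattern with `^` is usually what you want when filtering objects by a name prefix."
]
},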
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Note that the resulting `MyData` object in your workspace is a R dataframe:"
]
},
{
"cell_type": "code",
"execution_count": 283,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"'data.frame'"
],
"text/latex": [
"'data.frame'"
],
"text/markdown": [
"'data.frame'"
],
"text/plain": [
"[1] \"data.frame\""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"class(MyData)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Relative paths\n",
"\n",
"Note the UNIX-like path to the file we used in the `read.csv()` command above (using forward slashes; Windows uses back slashes). \n",
"\n",
"The `../` in `read.csv(\"../data/trees.csv\")` above signifies a \"relative\" path. That is, you are asking R to load data that lies in a different directory (folder) *relative* your current location (in this case, you are in your `Code`directory). In other, more technical words, `../data/trees.txt`points to a file named `trees.txt`located in the \"parent\" of the current directory.\n",
"\n",
"*What is an absolute path?* — one that specifies the whole path on your computer, say from `C:\\`\"upwards\" on Windows, `/Users/` upwards on Mac, and `/home/` upwards on Linux. Absolute paths are specific to each computer, so should be avoided. So to import data and export results, your script should *not* use absolute paths. Also, *AVOID putting a `setwd()`command at the start of your R script*, because setting the working directory requires an absolute directory path, which will differ across computers, platforms, and users. Let the end users set the working directory on their machine themselves. \n",
"\n",
"Using relative paths in in your R scripts and code will make your code computer-independent and easier for others to use your code. The relative path way should always be the way you load data in your analyses scripts — it will guarantee that your analysis works on every computer, not just your own or college computer. \n",
"\n",
"\n",
"```{tip}\n",
"If you are using a computer from elsewhere in the EU, Excel may use a comma (e.g., $\\pi=3,1416$) instead of a decimal point ($\\pi=3.1416$). In this case, `csv`files may use a semi-colon to separate columns and you can use the alternative function `read.csv2()` to read them into the R workspace.\n",
"\n",
"```"
]
},
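{
"cell_type": "markdown",
"metadata": {},
"source": [
"The two points above (portable relative paths and semi-colon separated files) can be sketched in a few lines. This is only an illustration: `file.path()` is a base R helper for assembling paths, the temporary file and its two rows of values are made up for the example, and the names `rel.path`, `tmp`, and `eu.data` are hypothetical.\n",
"\n",
"```R\n",
"# Build a relative path from its parts; file.path() joins them with \"/\"\n",
"rel.path <- file.path(\"..\", \"data\", \"trees.csv\")\n",
"rel.path\n",
"\n",
"# Write a small semi-colon separated file with decimal commas (made-up values)\n",
"tmp <- tempfile(fileext = \".csv\")\n",
"writeLines(c(\"Species;Distance.m\",\n",
"             \"Quercus robur;45,98\",\n",
"             \"Ginkgo biloba;31,24\"), tmp)\n",
"\n",
"eu.data <- read.csv2(tmp)  # read.csv2() assumes sep = \";\" and dec = \",\"\n",
"str(eu.data)               # Distance.m should be read as numeric\n",
"```\n",
"\n",
"If such a file were read with `read.csv()` instead, each line would end up in a single column, because no comma-separated fields would be found where R expects them."
]
},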
{
"cell_type": "code",
"execution_count": 284,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"A data.frame: 6 × 3\n",
"\n",
"\t | Species | Distance.m | Angle.degrees |
\n",
"\t | <chr> | <dbl> | <dbl> |
\n",
"\n",
"\n",
"\t1 | Populus tremula | 31.66583 | 41.28264 |
\n",
"\t2 | Quercus robur | 45.98499 | 44.53592 |
\n",
"\t3 | Ginkgo biloba | 31.24177 | 25.14626 |
\n",
"\t4 | Fraxinus excelsior | 34.61667 | 23.33613 |
\n",
"\t5 | Betula pendula | 45.46617 | 38.34913 |
\n",
"\t6 | Betula pendula | 48.79550 | 33.59231 |
\n",
"\n",
"
\n"
],
"text/latex": [
"A data.frame: 6 × 3\n",
"\\begin{tabular}{r|lll}\n",
" & Species & Distance.m & Angle.degrees\\\\\n",
" & & & \\\\\n",
"\\hline\n",
"\t1 & Populus tremula & 31.66583 & 41.28264\\\\\n",
"\t2 & Quercus robur & 45.98499 & 44.53592\\\\\n",
"\t3 & Ginkgo biloba & 31.24177 & 25.14626\\\\\n",
"\t4 & Fraxinus excelsior & 34.61667 & 23.33613\\\\\n",
"\t5 & Betula pendula & 45.46617 & 38.34913\\\\\n",
"\t6 & Betula pendula & 48.79550 & 33.59231\\\\\n",
"\\end{tabular}\n"
],
"text/markdown": [
"\n",
"A data.frame: 6 × 3\n",
"\n",
"| | Species <chr> | Distance.m <dbl> | Angle.degrees <dbl> |\n",
"|---|---|---|---|\n",
"| 1 | Populus tremula | 31.66583 | 41.28264 |\n",
"| 2 | Quercus robur | 45.98499 | 44.53592 |\n",
"| 3 | Ginkgo biloba | 31.24177 | 25.14626 |\n",
"| 4 | Fraxinus excelsior | 34.61667 | 23.33613 |\n",
"| 5 | Betula pendula | 45.46617 | 38.34913 |\n",
"| 6 | Betula pendula | 48.79550 | 33.59231 |\n",
"\n"
],
"text/plain": [
" Species Distance.m Angle.degrees\n",
"1 Populus tremula 31.66583 41.28264 \n",
"2 Quercus robur 45.98499 44.53592 \n",
"3 Ginkgo biloba 31.24177 25.14626 \n",
"4 Fraxinus excelsior 34.61667 23.33613 \n",
"5 Betula pendula 45.46617 38.34913 \n",
"6 Betula pendula 48.79550 33.59231 "
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"head(MyData) # Have a quick look at the data frame"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You can also have a more detailed look at the data you imported:"
]
},
{
"cell_type": "code",
"execution_count": 285,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"'data.frame':\t120 obs. of 3 variables:\n",
" $ Species : chr \"Populus tremula\" \"Quercus robur\" \"Ginkgo biloba\" \"Fraxinus excelsior\" ...\n",
" $ Distance.m : num 31.7 46 31.2 34.6 45.5 ...\n",
" $ Angle.degrees: num 41.3 44.5 25.1 23.3 38.3 ...\n"
]
}
],
"source": [
"str(MyData) # Note the data types of the three columns"
]
},
{
"cell_type": "code",
"execution_count": 286,
"metadata": {},
"outputs": [],
"source": [
"MyData <- read.csv(\"../data/trees.csv\", header = F) # Import ignoring headers"
]
},
{
"cell_type": "code",
"execution_count": 287,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"A data.frame: 6 × 3\n",
"\n",
"\t | V1 | V2 | V3 |
\n",
"\t | <chr> | <chr> | <chr> |
\n",
"\n",
"\n",
"\t1 | Species | Distance.m | Angle.degrees |
\n",
"\t2 | Populus tremula | 31.6658337740228 | 41.2826361937914 |
\n",
"\t3 | Quercus robur | 45.984992608428 | 44.5359166583512 |
\n",
"\t4 | Ginkgo biloba | 31.2417666241527 | 25.1462585572153 |
\n",
"\t5 | Fraxinus excelsior | 34.6166691975668 | 23.336126555223 |
\n",
"\t6 | Betula pendula | 45.4661654261872 | 38.3491299510933 |
\n",
"\n",
"
\n"
],
"text/latex": [
"A data.frame: 6 × 3\n",
"\\begin{tabular}{r|lll}\n",
" & V1 & V2 & V3\\\\\n",
" & & & \\\\\n",
"\\hline\n",
"\t1 & Species & Distance.m & Angle.degrees \\\\\n",
"\t2 & Populus tremula & 31.6658337740228 & 41.2826361937914\\\\\n",
"\t3 & Quercus robur & 45.984992608428 & 44.5359166583512\\\\\n",
"\t4 & Ginkgo biloba & 31.2417666241527 & 25.1462585572153\\\\\n",
"\t5 & Fraxinus excelsior & 34.6166691975668 & 23.336126555223 \\\\\n",
"\t6 & Betula pendula & 45.4661654261872 & 38.3491299510933\\\\\n",
"\\end{tabular}\n"
],
"text/markdown": [
"\n",
"A data.frame: 6 × 3\n",
"\n",
"| | V1 <chr> | V2 <chr> | V3 <chr> |\n",
"|---|---|---|---|\n",
"| 1 | Species | Distance.m | Angle.degrees |\n",
"| 2 | Populus tremula | 31.6658337740228 | 41.2826361937914 |\n",
"| 3 | Quercus robur | 45.984992608428 | 44.5359166583512 |\n",
"| 4 | Ginkgo biloba | 31.2417666241527 | 25.1462585572153 |\n",
"| 5 | Fraxinus excelsior | 34.6166691975668 | 23.336126555223 |\n",
"| 6 | Betula pendula | 45.4661654261872 | 38.3491299510933 |\n",
"\n"
],
"text/plain": [
" V1 V2 V3 \n",
"1 Species Distance.m Angle.degrees \n",
"2 Populus tremula 31.6658337740228 41.2826361937914\n",
"3 Quercus robur 45.984992608428 44.5359166583512\n",
"4 Ginkgo biloba 31.2417666241527 25.1462585572153\n",
"5 Fraxinus excelsior 34.6166691975668 23.336126555223 \n",
"6 Betula pendula 45.4661654261872 38.3491299510933"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"head(MyData)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Or you can load data using the more general `read.table` function:"
]
},
{
"cell_type": "code",
"execution_count": 288,
"metadata": {},
"outputs": [],
"source": [
"MyData <- read.table(\"../data/trees.csv\", sep = ',', header = TRUE) #another way"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"With `read.table` you need to specify whether there is a header row that needs to be imported as such. "
]
},
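{
"cell_type": "markdown",
"metadata": {},
"source": [
"The effect of the `header` argument can be seen on a tiny in-memory example. The sketch below uses the `text` argument of `read.table()` to read from a character vector instead of a file; the two data rows are made up, and the names `csv.lines`, `with.header`, and `no.header` are hypothetical.\n",
"\n",
"```R\n",
"csv.lines <- c(\"Species,Distance.m\",\n",
"               \"Quercus robur,45.98\",\n",
"               \"Ginkgo biloba,31.24\")\n",
"\n",
"with.header <- read.table(text = csv.lines, sep = \",\", header = TRUE)\n",
"no.header   <- read.table(text = csv.lines, sep = \",\", header = FALSE)\n",
"\n",
"str(with.header)  # two rows; Distance.m is numeric\n",
"str(no.header)    # three rows; the header line is treated as data\n",
"```\n",
"\n",
"When the header line is read as data, every column contains at least one piece of text, so none of the columns can be stored as numbers."
]
},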
{
"cell_type": "code",
"execution_count": 289,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"A data.frame: 6 × 3\n",
"\n",
"\t | Species | Distance.m | Angle.degrees |
\n",
"\t | <chr> | <dbl> | <dbl> |
\n",
"\n",
"\n",
"\t1 | Populus tremula | 31.66583 | 41.28264 |
\n",
"\t2 | Quercus robur | 45.98499 | 44.53592 |
\n",
"\t3 | Ginkgo biloba | 31.24177 | 25.14626 |
\n",
"\t4 | Fraxinus excelsior | 34.61667 | 23.33613 |
\n",
"\t5 | Betula pendula | 45.46617 | 38.34913 |
\n",
"\t6 | Betula pendula | 48.79550 | 33.59231 |
\n",
"\n",
"
\n"
],
"text/latex": [
"A data.frame: 6 × 3\n",
"\\begin{tabular}{r|lll}\n",
" & Species & Distance.m & Angle.degrees\\\\\n",
" & & & \\\\\n",
"\\hline\n",
"\t1 & Populus tremula & 31.66583 & 41.28264\\\\\n",
"\t2 & Quercus robur & 45.98499 & 44.53592\\\\\n",
"\t3 & Ginkgo biloba & 31.24177 & 25.14626\\\\\n",
"\t4 & Fraxinus excelsior & 34.61667 & 23.33613\\\\\n",
"\t5 & Betula pendula & 45.46617 & 38.34913\\\\\n",
"\t6 & Betula pendula & 48.79550 & 33.59231\\\\\n",
"\\end{tabular}\n"
],
"text/markdown": [
"\n",
"A data.frame: 6 × 3\n",
"\n",
"| | Species <chr> | Distance.m <dbl> | Angle.degrees <dbl> |\n",
"|---|---|---|---|\n",
"| 1 | Populus tremula | 31.66583 | 41.28264 |\n",
"| 2 | Quercus robur | 45.98499 | 44.53592 |\n",
"| 3 | Ginkgo biloba | 31.24177 | 25.14626 |\n",
"| 4 | Fraxinus excelsior | 34.61667 | 23.33613 |\n",
"| 5 | Betula pendula | 45.46617 | 38.34913 |\n",
"| 6 | Betula pendula | 48.79550 | 33.59231 |\n",
"\n"
],
"text/plain": [
" Species Distance.m Angle.degrees\n",
"1 Populus tremula 31.66583 41.28264 \n",
"2 Quercus robur 45.98499 44.53592 \n",
"3 Ginkgo biloba 31.24177 25.14626 \n",
"4 Fraxinus excelsior 34.61667 23.33613 \n",
"5 Betula pendula 45.46617 38.34913 \n",
"6 Betula pendula 48.79550 33.59231 "
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"head(MyData)"
]
},
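{
"cell_type": "markdown",
"metadata": {},
"source": [
"To see what the `header` argument does without touching your own data, here is a minimal sketch that writes a tiny, made-up csv file to a temporary location (using `tempfile()`) and reads it back both ways. The file contents and the object names (`TmpFile`, `WithHeader`, `NoHeader`) are hypothetical, chosen only for illustration:\n",
"\n",
"```R\n",
"# Write a small, made-up csv file to a temporary location\n",
"TmpFile <- tempfile(fileext = '.csv')\n",
"writeLines(c('Species,Distance.m',\n",
"             'Quercus robur,45.98',\n",
"             'Betula pendula,45.47'), TmpFile)\n",
"\n",
"# header = TRUE: the first line becomes the column names\n",
"WithHeader <- read.table(TmpFile, sep = ',', header = TRUE)\n",
"\n",
"# header = FALSE: the first line is treated as data; columns are named V1, V2, ...\n",
"NoHeader <- read.table(TmpFile, sep = ',', header = FALSE)\n",
"\n",
"print(names(WithHeader)) # Species, Distance.m\n",
"print(names(NoHeader))   # V1, V2\n",
"print(nrow(WithHeader))  # 2 rows of data\n",
"print(nrow(NoHeader))    # 3 rows, because the header line is counted as data\n",
"```\n",
"\n",
"When the header is not imported as such, every column is read in as text, because the column names end up mixed in with the numbers."
]
},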
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"MyData <- read.csv(\"../data/trees.csv\", skip = 5) # skip first 5 lines"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Writing to and saving files\n",
"\n",
"You can also save your data frames using `write.table` or `write.csv`:"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"write.csv(MyData, \"../results/MyData.csv\")"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
"text/html": [
"\n",
"- 'LV_model.pdf'
- 'MyData.csv'
- 'MyFirst-ggplot2-Figure.pdf'
- 'MyResults.Rout'
- 'Pred_Prey_Overlay.pdf'
- 'QMEENet.svg'
- 'TreeHeight.csv'
\n"
],
"text/latex": [
"\\begin{enumerate*}\n",
"\\item 'LV\\_model.pdf'\n",
"\\item 'MyData.csv'\n",
"\\item 'MyFirst-ggplot2-Figure.pdf'\n",
"\\item 'MyResults.Rout'\n",
"\\item 'Pred\\_Prey\\_Overlay.pdf'\n",
"\\item 'QMEENet.svg'\n",
"\\item 'TreeHeight.csv'\n",
"\\end{enumerate*}\n"
],
"text/markdown": [
"1. 'LV_model.pdf'\n",
"2. 'MyData.csv'\n",
"3. 'MyFirst-ggplot2-Figure.pdf'\n",
"4. 'MyResults.Rout'\n",
"5. 'Pred_Prey_Overlay.pdf'\n",
"6. 'QMEENet.svg'\n",
"7. 'TreeHeight.csv'\n",
"\n",
"\n"
],
"text/plain": [
"[1] \"LV_model.pdf\" \"MyData.csv\" \n",
"[3] \"MyFirst-ggplot2-Figure.pdf\" \"MyResults.Rout\" \n",
"[5] \"Pred_Prey_Overlay.pdf\" \"QMEENet.svg\" \n",
"[7] \"TreeHeight.csv\" "
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"dir(\"../results/\") # Check if it worked"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"scrolled": true
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"Warning message in write.table(MyData[1, ], file = \"../results/MyData.csv\", append = TRUE):\n",
"“appending column names to file”\n"
]
}
],
"source": [
"write.table(MyData[1,], file = \"../results/MyData.csv\",append=TRUE) # append"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You get a warning here because R thinks it is strange that you are appending headers to a file that already has headers!"
]
},
{
"cell_type": "code",
"execution_count": 294,
"metadata": {},
"outputs": [],
"source": [
"write.csv(MyData, \"../results/MyData.csv\", row.names=TRUE) # write row names"
]
},
{
"cell_type": "code",
"execution_count": 295,
"metadata": {},
"outputs": [],
"source": [
"write.table(MyData, \"../results/MyData.csv\", col.names=FALSE) # ignore col names"
]
},
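{
"cell_type": "markdown",
"metadata": {},
"source": [
"The warning seen above when appending can be avoided by switching off the column names for the appended rows. Here is a minimal sketch under stated assumptions: the data frame `Demo` is made up, and the output goes to a temporary file (via `tempfile()`) rather than to your `results` directory:\n",
"\n",
"```R\n",
"# A small, made-up data frame\n",
"Demo <- data.frame(Species = c('Quercus robur', 'Betula pendula'),\n",
"                   Distance.m = c(45.98, 45.47))\n",
"OutFile <- tempfile(fileext = '.csv')\n",
"\n",
"# Write the header and the two rows, without row names\n",
"write.csv(Demo, OutFile, row.names = FALSE)\n",
"\n",
"# Append one more row: comma-separated, no header, no row names\n",
"write.table(Demo[1, ], OutFile, sep = ',', append = TRUE,\n",
"            col.names = FALSE, row.names = FALSE)\n",
"\n",
"# Read the file back in to check what was written\n",
"ReadBack <- read.csv(OutFile)\n",
"print(ReadBack)\n",
"```\n",
"\n",
"Note that `write.table` separates values with a space by default, so `sep = ','` is needed for the appended row to match the rest of the csv file."
]
},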
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Writing R code\n",
" \n",
"Typing in commands interactively in the R console is good for starters, but you will want to switch to putting your sequence of commands into a script file, and then ask R to run those commands. \n",
"\n",
"\n",
"* Open a new text file, call it `basic_io.R`, and save it to your `code` directory. \n",
"* Write the above input-output commands in it: \n",
"\n",
"```R\n",
"# A simple script to illustrate R input-output. \n",
"# Run line by line and check inputs and outputs to understand what is happening \n",
"\n",
"MyData <- read.csv(\"../data/trees.csv\", header = TRUE) # import with headers\n",
"\n",
"write.csv(MyData, \"../results/MyData.csv\") #write it out as a new file\n",
"\n",
"write.table(MyData[1,], file = \"../results/MyData.csv\",append=TRUE) # Append to it\n",
"\n",
"write.csv(MyData, \"../results/MyData.csv\", row.names=TRUE) # write row names\n",
"\n",
"write.table(MyData, \"../results/MyData.csv\", col.names=FALSE) # ignore column names\n",
"\n",
"```\n",
"\n",
"* Now place the cursor on the first line of code in the script file and run it by pressing the appropriate keyboard shortcut (e.g., PC: ctrl+R, Mac: command+enter, Linux: ctrl+enter are the usual shortcuts for doing this).\n",
"* Check after every line that you are getting the expected result."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"```{note}\n",
"**Why no shebang in your first R script?** Because we are learning R here mainly for writing scripts for data analysis & visualization, and relatively simple numerical calculations & modelling/simulation tasks. Read [this](https://www.r-bloggers.com/2019/11/r-scripts-as-command-line-tools/) and [this](https://blog.sellorm.com/2017/12/18/learn-to-write-command-line-utilities-in-r/) for some more technical advice about R scripts.\n",
"```"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Running R code\n",
"\n",
"But even writing to a script file and running the code line-by-line or block-by-block is not your ultimate goal. What you would really like to do is to just run your full analysis and output all the results. There are two main approaches for running R script/code.\n",
"\n",
"### Using `source`\n",
"\n",
"You can run all the contents of a `*.R` script file from the R command line by using `source()`.\n",
"\n",
"★ Try sourcing `basic_io.R` (you will need to make sure you have `setwd` to your code directory):"
]
},
{
"cell_type": "code",
"execution_count": 296,
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"Warning message in write.table(MyData[1, ], file = \"../results/MyData.csv\", append = TRUE):\n",
"“appending column names to file”\n"
]
}
],
"source": [
"source(\"basic_io.R\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"That has run OK, warning and all. \n",
"\n",
"* If you get errors, read and then try to fix them. The most common problem is likely to be that you have not `setwd()` to the `code` directory. \n",
"\n",
"Alternatively, if the script file is not in your working directory and you don't want to change your working directory, you can run the script from anywhere (e.g., from the `data` directory) by adding the directory path to the script file name. For example, you will need `source(\"../code/control.R\")` if your working directory is `data` and not `code` (using a relative path).\n",
"\n",
"*Do not* put a `source()` command inside the script file you are sourcing, as it is then trying to run itself again and again and that's just cruel!\n",
"\n",
"```{tip}\n",
"The command `source()` has a `chdir` argument whose default value is FALSE. When set to TRUE, it will change the working directory to the directory of the file being sourced. \n",
"\n",
"```\n",
"\n",
"Also, if you have `source`d a script successfully, you will see no output in the R console/terminal unless there was an error or warning, or if you explicitly asked for something to be printed. So it can be useful to add a line at the end of the script saying something like `print(\"Script complete!\")`.\n",
"\n",
"\n",
"### Using `Rscript`\n",
"\n",
"You can also run an R script from the UNIX/Linux terminal by calling `Rscript`. That is, while you have to be inside an R session to use the `source` command, `Rscript` lets you run the script directly from the terminal without opening R first.\n",
"\n",
"This allows you to easily automate execution of your R scripts (e.g., by writing a bash script) and integrate R into a bigger computing pipeline/workflow by calling it through other tools or languages (e.g., see the [Python Chapter II](06-Python_II.ipynb)). \n",
"\n",
"If you are on Linux, try using `Rscript` to run `basic_io.R`: \n",
"\n",
"* Exit from the R console using `ctrl+D`, or open a new bash terminal\n",
" \n",
"* `cd` to the location of `basic_io.R` (e.g., `week3/code`)\n",
" \n",
"* Then run the script using `Rscript basic_io.R`\n",
"\n",
"Also, please have a look at `man Rscript` in a bash terminal.\n",
"\n",
"### Running R in batch mode\n",
"\n",
"In addition to `Rscript`, there is another way to run your R script without opening the R console. On Mac or Linux, you can do so by typing:\n",
"\n",
"`R CMD BATCH MyCode.R MyResults.Rout`\n",
"\n",
"This will create a `MyResults.Rout` file containing all the output. On Microsoft Windows, it's more complicated; change the path to `R.exe` and the output file as needed: \n",
"\n",
"`\"C:\\Program Files\\R\\R-4.x.x\\bin\\R.exe\" CMD BATCH --vanilla --slave \"C:\\PathToMyResults\\Results\\MyCode.R\"`\n",
"\n",
"Here, replace 4.x.x with the R version you have."
]
},
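{
"cell_type": "markdown",
"metadata": {},
"source": [
"To make the behaviour of `source()` described above concrete, here is a minimal sketch that writes a tiny, hypothetical script to a temporary file and then sources it. The script contents and the name `ScriptFile` are made up for illustration:\n",
"\n",
"```R\n",
"# Create a small throwaway script in a temporary location\n",
"ScriptFile <- tempfile(fileext = '.R')\n",
"writeLines(c('x <- 1:5',\n",
"             'Total <- sum(x)',\n",
"             'print(Total)',\n",
"             'print(\"Script complete!\")'), ScriptFile)\n",
"\n",
"# Run every line of the script, in order, in the current R session\n",
"source(ScriptFile)\n",
"\n",
"# Objects created by the script are now in the workspace\n",
"print(exists('Total'))\n",
"```\n",
"\n",
"Only the two explicit `print()` calls inside the script produce output while it is being sourced; the assignments run silently."
]
},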
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Control flow tools\n",
"\n",
"In R, you can write \"if-then\" and \"else\" statements, and \"for\" and \"while\" loops, as in any programming language, to give you finer control over your program's \"control flow\". Such statements are useful to include in functions and scripts because you may want to do certain calculations or other tasks only under certain conditions (e.g., `if` the dataset is from a particular year, do something different). \n",
"\n",
"Let's look at some examples of these control flow tools in R. \n",
"\n",
"★ Type each of the following blocks of code in a script file called `control_flow.R` and save it in your `code` directory. Then run each block *separately* by sending or pasting it into the R console. "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### `if` statements"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[1] \"a is TRUE\"\n"
]
}
],
"source": [
"a <- TRUE\n",
"if (a == TRUE) {\n",
" print (\"a is TRUE\")\n",
"} else {\n",
" print (\"a is FALSE\")\n",
"}"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Note that using `if (a)` instead of `if (a == TRUE)` will also achieve the same result, because `a` is a Boolean variable. Compare with [Python conditionals](python-conditionals). \n",
"\n",
"In general, though, it is arguably a good idea to make the `if` condition explicit. \n",
"\n",
"\n",
"You can also write an `if` statement on a single line:"
]
},
{
"cell_type": "code",
"execution_count": 298,
"metadata": {},
"outputs": [],
"source": [
"z <- runif(1) ## Generate a uniformly distributed random number\n",
"if (z <= 0.5) {print (\"Less than a half\")}"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"But code readability is important, so avoid squeezing control flow blocks like this into a single line.\n",
"\n",
"````{tip}\n",
"Please indent your code for readability, even if it's not strictly necessary (unlike in [Python](./05-Python_I.ipynb)). Indentation helps you see the flow of the logic, rather than a flattened version, which is hard for you and anybody else to read. For example, the following code block is much more readable than the one above:\n",
"\n",
"```r\n",
"z <- runif(1)\n",
"if (z <= 0.5) {\n",
"    print (\"Less than a half\")\n",
"}\n",
"```\n",
"````"
]
},
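{
"cell_type": "markdown",
"metadata": {},
"source": [
"When there are more than two possible outcomes, `if` and `else` can be chained using `else if`. A minimal sketch, in which the thresholds and the variable name `Verdict` are made up for illustration:\n",
"\n",
"```R\n",
"z <- runif(1) # a uniformly distributed random number between 0 and 1\n",
"\n",
"if (z < 0.25) {\n",
"    Verdict <- 'Less than a quarter'\n",
"} else if (z <= 0.5) {\n",
"    Verdict <- 'Between a quarter and a half'\n",
"} else {\n",
"    Verdict <- 'More than a half'\n",
"}\n",
"\n",
"print(Verdict) # which message you get depends on the random draw\n",
"```\n",
"\n",
"Only the first branch whose condition is `TRUE` is run, so the order of the conditions matters."
]
},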
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### `for` loops"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"[Loops](https://en.wikipedia.org/wiki/Control_flow#Loops) are really useful to repeat a task over some range of input values. \n",
"\n",
"The following code \"loops\" over a range of numbers, squaring each, and then printing the result:"
]
},
{
"cell_type": "code",
"execution_count": 299,
"metadata": {
"scrolled": true
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[1] \"1 squared is 1\"\n",
"[1] \"2 squared is 4\"\n",
"[1] \"3 squared is 9\"\n",
"[1] \"4 squared is 16\"\n",
"[1] \"5 squared is 25\"\n",
"[1] \"6 squared is 36\"\n",
"[1] \"7 squared is 49\"\n",
"[1] \"8 squared is 64\"\n",
"[1] \"9 squared is 81\"\n",
"[1] \"10 squared is 100\"\n"
]
}
],
"source": [
"for (i in 1:10) {\n",
" j <- i * i\n",
" print(paste(i, \" squared is\", j ))\n",
"}"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"What exactly is going on in the piece of code above? What are `i` and `j`? Let's break it down:\n",
"\n",
"Firstly, the `1:10` part simply generates a sequence, as you learned [previously](R-creating-sequences): "
]
},
{
"cell_type": "code",
"execution_count": 300,
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
"text/html": [
"\n",
"- 1
- 2
- 3
- 4
- 5
- 6
- 7
- 8
- 9
- 10
\n"
],
"text/latex": [
"\\begin{enumerate*}\n",
"\\item 1\n",
"\\item 2\n",
"\\item 3\n",
"\\item 4\n",
"\\item 5\n",
"\\item 6\n",
"\\item 7\n",
"\\item 8\n",
"\\item 9\n",
"\\item 10\n",
"\\end{enumerate*}\n"
],
"text/markdown": [
"1. 1\n",
"2. 2\n",
"3. 3\n",
"4. 4\n",
"5. 5\n",
"6. 6\n",
"7. 7\n",
"8. 8\n",
"9. 9\n",
"10. 10\n",
"\n",
"\n"
],
"text/plain": [
" [1] 1 2 3 4 5 6 7 8 9 10"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"1:10"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This is the same as using `seq(10)` (try substituting this instead of `1:10` in the above block of code). \n",
"\n",
"Then, `j` is a temporary variable that stores the square of the number held in the other temporary variable, `i`, in that iteration of the loop. \n",
"\n",
"```{note}\n",
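"Using the `:` operator or the `seq()` function has traditionally pre-generated the whole sequence and stored it in memory, which is less efficient than Python's [`range` function](Python-loops), which generates numbers in a sequence on a \"need-to\" basis. Recent versions of R (3.5.0 onwards) can store simple integer sequences such as `1:10` in a compact form, which reduces this overhead. \n",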
"```\n",
"\n",
"You can also loop over a vector of strings:"
]
},
{
"cell_type": "code",
"execution_count": 301,
"metadata": {
"scrolled": true
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[1] \"The species is Heliodoxa rubinoides\"\n",
"[1] \"The species is Boissonneaua jardini\"\n",
"[1] \"The species is Sula nebouxii\"\n"
]
}
],
"source": [
"for(species in c('Heliodoxa rubinoides', \n",
" 'Boissonneaua jardini', \n",
" 'Sula nebouxii')) {\n",
" print(paste('The species is', species))\n",
"}"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This is a random assortment of birds!\n",
"\n",
"And here's a `for` loop using a pre-existing vector:"
]
},
{
"cell_type": "code",
"execution_count": 302,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[1] \"a\"\n",
"[1] \"bc\"\n",
"[1] \"def\"\n"
]
}
],
"source": [
"v1 <- c(\"a\",\"bc\",\"def\")\n",
"for (i in v1) {\n",
" print(i)\n",
"}"
]
},
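{
"cell_type": "markdown",
"metadata": {},
"source": [
"The loops above step through the *values* of a vector. If you also want to store a result for each element, it is often more convenient to loop over the *positions* instead. Here is a minimal sketch using `seq_along()`; the vector name `Lengths` is made up for illustration:\n",
"\n",
"```R\n",
"v1 <- c('a', 'bc', 'def')\n",
"Lengths <- rep(NA, length(v1)) # one slot per element of v1\n",
"\n",
"for (k in seq_along(v1)) {     # k takes the values 1, 2, 3\n",
"    Lengths[k] <- nchar(v1[k]) # number of characters in the k-th string\n",
"}\n",
"\n",
"print(Lengths) # 1 2 3\n",
"```\n",
"\n",
"Using `seq_along(v1)` rather than `1:length(v1)` is a common precaution: if `v1` is empty, `seq_along()` gives an empty sequence and the loop is skipped, whereas `1:0` would count down from 1 to 0."
]
},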
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### `while` loops\n",
"\n",
"If you want to keep performing an operation until some condition is met (that is, for as long as the opposite condition holds), use a `while` loop:"
]
},
{
"cell_type": "code",
"execution_count": 303,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[1] 1\n",
"[1] 4\n",
"[1] 9\n",
"[1] 16\n",
"[1] 25\n",
"[1] 36\n",
"[1] 49\n",
"[1] 64\n",
"[1] 81\n",
"[1] 100\n"
]
}
],
"source": [
"i <- 0\n",
"while (i < 10) {\n",
" i <- i+1\n",
" print(i^2)\n",
"}"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"★ Now test `control_flow.R`. That is, run the script file using `source` (and `Rscript` if on Linux).\n",
"\n",
"If you get errors, read them carefully and fix them (this is going to be your mantra henceforth!)."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Some more control flow tools\n",
"\n",
"Let's look at some more control tools that are less commonly used, but can be useful in certain scenarios. \n",
"\n",
"#### `break`ing out of loops\n",
"Often it is useful (or necessary) to `break` out of a loop when some condition is met. Use `break` (as in pretty much any other programming language, such as Python) in situations when you cannot set a target number of iterations and would like to stop the loop execution once some condition is met (as you would with a `while` loop). \n",
"\n",
"Try this (type into `break.R` and save in `code`):"
]
},
{
"cell_type": "code",
"execution_count": 304,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"i equals 0 \n",
"i equals 1 \n",
"i equals 2 \n",
"i equals 3 \n",
"i equals 4 \n",
"i equals 5 \n",
"i equals 6 \n",
"i equals 7 \n",
"i equals 8 \n",
"i equals 9 \n"
]
}
],
"source": [
"i <- 0 # Initialize i\n",
"while (i < Inf) {\n",
"    if (i == 10) {\n",
"        break # Break out of the while loop!\n",
"    } else {\n",
"        cat(\"i equals \" , i , \" \\n\")\n",
"        i <- i + 1 # Update i\n",
"    }\n",
"}"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Using `next`\n",
"\n",
"You can also skip to the next iteration of a loop using `next`. Both `next` and `break` can be used within either kind of loop (`while` or `for`). Try this (type into `next.R` and save in `code`):"
]
},
{
"cell_type": "code",
"execution_count": 305,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[1] 1\n",
"[1] 3\n",
"[1] 5\n",
"[1] 7\n",
"[1] 9\n"
]
}
],
"source": [
"for (i in 1:10) {\n",
"    if ((i %% 2) == 0) # check if the number is even\n",
" next # pass to next iteration of loop \n",
" print(i)\n",
"}"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This code uses the \"modulo\" operation to check whether a number is even. If it is, `next` skips the rest of that iteration, so only the odd numbers are printed."
]
},
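{
"cell_type": "markdown",
"metadata": {},
"source": [
"`next` and `break` can also be combined in the same loop. The following minimal sketch (the variable name `Total` and the cut-off of 9 are arbitrary choices for illustration) adds up the odd numbers from 1 to 9:\n",
"\n",
"```R\n",
"Total <- 0\n",
"for (i in 1:100) {\n",
"    if (i %% 2 == 0) {\n",
"        next  # even number: skip to the next iteration\n",
"    }\n",
"    if (i > 9) {\n",
"        break # past the cut-off: leave the loop altogether\n",
"    }\n",
"    Total <- Total + i\n",
"}\n",
"\n",
"print(Total) # 1 + 3 + 5 + 7 + 9 = 25\n",
"```\n",
"\n",
"Although the loop is set up to run over `1:100`, it stops at `i = 11`, the first odd number beyond the cut-off."
]
},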
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Writing R Functions\n",
"\n",
"A function is a block of reusable code that takes an input, does something with it (or to it!), and returns the result. Like any other programming language, R lets you write your own functions. All the \"commands\" that you have been using, such as `ls()`, `mean()`, `c()`, etc., are basically functions. You will want to write your own function for every scenario where a particular task or set of analysis steps needs to be performed again and again. \n",
"The syntax for R functions is quite simple, with each function accepting \"arguments\" and \"returning\" a value. \n",
"\n",
"### Your first R function\n",
"\n",
"★ Type the following into a script file called `boilerplate.R` and save it in your `code` directory:\n",
" \n",
"```R\n",
"\n",
"# A boilerplate R script\n",
"\n",
"MyFunction <- function(Arg1, Arg2) {\n",
" \n",
" # Statements involving Arg1, Arg2:\n",
" print(paste(\"Argument\", as.character(Arg1), \"is a\", class(Arg1))) # print Arg1's type\n",
" print(paste(\"Argument\", as.character(Arg2), \"is a\", class(Arg2))) # print Arg2's type\n",
" \n",
" return (c(Arg1, Arg2)) #this is optional, but very useful\n",
"}\n",
"\n",
"MyFunction(1,2) #test the function\n",
"MyFunction(\"Riki\",\"Tiki\") #A different test\n",
"```\n",
"Note the curly brackets – these are necessary for R to know where the specification of the function starts and ends. Also, note the indentation. Not necessary (unlike Python), but recommended to make the code more readable.\n",
"\n",
"Now enter the R console and source the script:"
]
},
{
"cell_type": "code",
"execution_count": 306,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[1] \"Argument 1 is a numeric\"\n",
"[1] \"Argument 2 is a numeric\"\n",
"[1] \"Argument Riki is a character\"\n",
"[1] \"Argument Tiki is a character\"\n"
]
}
],
"source": [
"source(\"boilerplate.R\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This will run the script, and also save your function `MyFunction` as an object into your workspace (try `ls()`, and you will see `MyFunction` appear in the list of objects):"
]
},
{
"cell_type": "code",
"execution_count": 311,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"'MyFunction'"
],
"text/latex": [
"'MyFunction'"
],
"text/markdown": [
"'MyFunction'"
],
"text/plain": [
"[1] \"MyFunction\""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"ls(pattern = \"MyFun*\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Again, your output may be a bit different, as you have been doing different things in your R workspace/session than I have. What matters is that you see `MyFunction` in the list of objects above. \n",
"\n",
"Now try: "
]
},
{
"cell_type": "code",
"execution_count": 319,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"'function'"
],
"text/latex": [
"'function'"
],
"text/markdown": [
"'function'"
],
"text/plain": [
"[1] \"function\""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"class(MyFunction)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"So, yes, `MyFunction` is a `function` object, just as it would be in Python."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Functions with conditionals\n",
"\n",
"\n",
"Here are some examples of functions with conditionals:"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
"text/html": [
"'6 is even!'"
],
"text/latex": [
"'6 is even!'"
],
"text/markdown": [
"'6 is even!'"
],
"text/plain": [
"[1] \"6 is even!\""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# Checks if an integer is even\n",
"is.even <- function(n = 2) {\n",
" if (n %% 2 == 0) {\n",
" return(paste(n,'is even!'))\n",
" } else {\n",
" return(paste(n,'is odd!'))\n",
" }\n",
"}\n",
"\n",
"is.even(6)"
]
},
{
"cell_type": "code",
"execution_count": 321,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"'4 is a power of 2!'"
],
"text/latex": [
"'4 is a power of 2!'"
],
"text/markdown": [
"'4 is a power of 2!'"
],
"text/plain": [
"[1] \"4 is a power of 2!\""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# Checks if a number is a power of 2\n",
"is.power2 <- function(n = 2) {\n",
" if (log2(n) %% 1==0) {\n",
" return(paste(n, 'is a power of 2!'))\n",
" } else {\n",
" return(paste(n,'is not a power of 2!'))\n",
" }\n",
"}\n",
"\n",
"is.power2(4)"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"'3 is a prime!'"
],
"text/latex": [
"'3 is a prime!'"
],
"text/markdown": [
"'3 is a prime!'"
],
"text/plain": [
"[1] \"3 is a prime!\""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# Checks if a number is prime\n",
"is.prime <- function(n) {\n",
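"    if (n==0) {\n",
"        return(paste(n,'is a zero!'))\n",
"    } else if (n==1) {\n",
"        return(paste(n,'is just a unit!'))\n",
"    } else if (n==2) {\n",
"        return(paste(n,'is a prime!')) # 2:(n-1) would count down (2, 1) here, so handle 2 separately\n",
"    }\n",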
" \n",
" ints <- 2:(n-1)\n",
" \n",
" if (all(n%%ints!=0)) {\n",
" return(paste(n,'is a prime!'))\n",
" } else {\n",
" return(paste(n,'is a composite!'))\n",
" }\n",
"}\n",
"\n",
"is.prime(3)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"★ Run the three blocks of code in R one at a time and make sure you understand what each function is doing and how. Save all three blocks to a single script file called `R_conditionals.R` in your `code` directory, and make sure it runs using `source` and `Rscript` (in the Linux/Ubuntu terminal). "
]
},
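{
"cell_type": "markdown",
"metadata": {},
"source": [
"Two of the functions above give their argument a default value (`n = 2`), which is used whenever the function is called without that argument. The following minimal sketch shows how defaults and named arguments interact; the function `PowerOf` is a made-up example, not part of the exercises:\n",
"\n",
"```R\n",
"# Raises a base to an exponent; both arguments have default values\n",
"PowerOf <- function(base = 2, exponent = 1) {\n",
"    return(base ^ exponent)\n",
"}\n",
"\n",
"print(PowerOf())             # both defaults used: 2^1 = 2\n",
"print(PowerOf(3, 2))         # arguments matched by position: 3^2 = 9\n",
"print(PowerOf(exponent = 3)) # argument matched by name, default base: 2^3 = 8\n",
"```\n",
"\n",
"Naming the arguments in the call makes the code easier to read, and lets you override just one default without having to supply the others."
]
},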
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### An example utility function\n",
"\n",
"\n",
"Now let's write a script containing a more useful function:\n",
"\n",
"★ In your text editor, type the following in a file called `TreeHeight.R`, and save it in your `code` directory:\n",
"\n",
"```R\n",
"# This function calculates heights of trees given distance of each tree \n",
"# from its base and angle to its top, using the trigonometric formula \n",
"#\n",
"# height = distance * tan(radians)\n",
"#\n",
"# ARGUMENTS\n",
"# degrees: The angle of elevation of tree\n",
"# distance: The distance from base of tree (e.g., meters)\n",
"#\n",
"# OUTPUT\n",
"# The heights of the tree, same units as \"distance\"\n",
"\n",
"TreeHeight <- function(degrees, distance) {\n",
" radians <- degrees * pi / 180\n",
" height <- distance * tan(radians)\n",
" print(paste(\"Tree height is:\", height))\n",
" \n",
" return (height)\n",
"}\n",
"\n",
"TreeHeight(37, 40)\n",
"```\n",
"\n",
"* Run `TreeHeight.R`'s two blocks (the `TreeHeight` function, and the call to the function, `TreeHeight(37, 40)`) by pasting them sequentially into the R console. Try and understand what each line is doing.\n",
"* Now test the whole `TreeHeight.R` script at one go using `source` and/or `Rscript` (in Linux/Ubuntu)."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Practicals\n",
" \n",
"### Tree heights\n",
"\n",
"Modify the script `TreeHeight.R` so that it does the following:\n",
" * Loads `trees.csv` and calculates tree heights for all trees in the data. Note that the distances have been measured in meters. (Hint: use relative paths.)\n",
" * Creates a csv output file called `TreeHts.csv` in `results` that contains the calculated tree heights along with the original data in the following format (only first two rows and headers shown): \n",
" ```bash\n",
" \"Species\",\"Distance.m\",\"Angle.degrees\",\"Tree.Height.m\"\n",
" \"Populus tremula\",31.6658337740228,41.2826361937914,27.8021161438536\n",
" \"Quercus robur\",45.984992608428,44.5359166583512,45.2460250644405\n",
" ```\n",
"This script should work using either `source` or `Rscript` in Linux / UNIX.\n",
"\n",
"\n",
"### Groupwork Practical on Tree Heights\n",
"\n",
"The goal of this practical is to make the `TreeHeight.R` script more general, so that it could be used for other datasets, not just `trees.csv`. \n",
"\n",
"Guidelines:\n",
"\n",
" * Write another R script called `get_TreeHeight.R` that takes a csv file name from the command line (e.g., `get_TreeHeight.R Trees.csv`) and outputs the result to a file just like `TreeHeight.R` above, but this time includes the input file name in the output file name as `InputFileName_treeheights.csv`. Note that you will have to strip the `.csv` (or whatever the extension is) from the filename, and also `../` etc., if you are using relative paths. (Hint: Command-line parameters are accessible within the R running environment via `commandArgs()`, so `help(commandArgs)` might be your starting point.)\n",
" * Write a Unix shell script called `run_get_TreeHeight.sh` that tests `get_TreeHeight.R`. Include `trees.csv` as your example file. Note that `source` will not work in this case as it does not allow scripts with arguments to be run; you will have to use `Rscript` instead. \n",
"\n",
"### Groupwork Practical on Tree Heights 2\n",
"\n",
"Assuming you have already worked through [Python Chapter I](./05-Python_I.ipynb), write a Python version of `get_TreeHeight.R` (call it `get_TreeHeight.py`). Include a test of this script into `run_get_TreeHeight.sh`.\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"(R-Vectorization)=\n",
"\n",
"## Vectorization\n",
"\n",
"R is relatively slow at cycling through a data structure such as a dataframe or matrix (e.g., by using a `for` loop) because it is a *high-level, interpreted computer language*. \n",
"\n",
"That is, when you execute a command in R, it needs to \"read\" and interpret the necessary code from scratch every single time the command is called. On the other hand, compiled languages like C know exactly what the flow of the program is because the code is pre-interpreted and ready to go before execution (i.e., the code is \"compiled\"). \n",
"\n",
"For example, when you assign a new variable in R:"
]
},
{
"cell_type": "code",
"execution_count": 323,
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
"text/html": [
"'numeric'"
],
"text/latex": [
"'numeric'"
],
"text/markdown": [
"'numeric'"
],
"text/plain": [
"[1] \"numeric\""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"a <- 1.0\n",
"class(a)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"R automatically figures out that `1.0` is a floating point number, and finds a place in the system memory for it with `a` registered as a \"pointer\" to it.\n",
"\n",
"In contrast, in C, for example, you would have to do so manually by declaring the type of `a`, so that the right amount of space in memory is set aside for it:\n",
"\n",
"```C\n",
"float a;\n",
"a = 1.0;\n",
"```\n",
"\n",
"*Vectorization is an approach where you directly apply compiled, optimized code to run an operation on a vector, matrix, or a higher-dimensional data structure (like an R array), instead of performing the operation element-wise (each row or column element one at a time) on the data structure*. \n",
"\n",
"Apart from computational efficiency, vectorization makes code more concise, easier to read, and less error prone.\n",
"\n",
"Let's try an example that illustrates this point.\n",
"\n",
"★ Type the following script, save it in `code` as `Vectorize1.R`, and run it (it sums all elements of a matrix):"
]
},
{
"cell_type": "code",
"execution_count": 324,
"metadata": {
"scrolled": true
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[1] \"Using loops, the time taken is:\"\n",
" user system elapsed \n",
" 0.08 0.00 0.08 \n",
"[1] \"Using the in-built vectorized function, the time taken is:\"\n",
" user system elapsed \n",
" 0.001 0.000 0.001 \n"
]
}
],
"source": [
"M <- matrix(runif(1000000),1000,1000)\n",
"\n",
"SumAllElements <- function(M) {\n",
" Dimensions <- dim(M)\n",
" Tot <- 0\n",
" for (i in 1:Dimensions[1]) {\n",
" for (j in 1:Dimensions[2]) {\n",
" Tot <- Tot + M[i,j]\n",
" }\n",
" }\n",
" return (Tot)\n",
"}\n",
" \n",
"print(\"Using loops, the time taken is:\")\n",
"print(system.time(SumAllElements(M)))\n",
"\n",
"print(\"Using the in-built vectorized function, the time taken is:\")\n",
"print(system.time(sum(M)))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Note the `system.time` R function: it calculates how much time your code takes. This time will vary with every run, and every computer that this code is run on (so your times will be a bit different than what I got).\n",
"\n",
"Both `SumAllElements()` and `sum()` approaches are correct, and will give you the right answer. However, the inbuilt function `sum()` is much faster than the other (roughly 80 times faster in the run shown above), because it uses vectorization, avoiding the explicit looping that `SumAllElements()` uses.\n",
"\n",
"In effect, of course, the computer still has to run loops. However, this running of the loops is encapsulated in a pre-compiled program that R calls. These programs are written in lower-level (and therefore, faster) languages like Fortran and C. For example, `sum` is actually written in C if you look under the hood.\n",
"\n",
"In R, even though you should try to avoid loops, in practice it is often much easier to throw in a `for` loop, and *then* \"optimize\" the code to avoid the loop if the running time is not satisfactory. Therefore, it is still essential that you become familiar with loops and looping as you learned in the sections above. \n",
"\n",
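"As a minimal sketch of what replacing a loop with a vectorized operation looks like, the following squares every element of a vector both ways, and then uses the inbuilt `rowSums()` and `colSums()` functions on a small matrix. The function names `SquareLoop` and `SquareVec` are made up for illustration:\n",
"\n",
"```R\n",
"x <- runif(10000)\n",
"\n",
"# Loop version: square one element at a time\n",
"SquareLoop <- function(x) {\n",
"    out <- rep(NA, length(x))\n",
"    for (i in seq_along(x)) {\n",
"        out[i] <- x[i]^2\n",
"    }\n",
"    return(out)\n",
"}\n",
"\n",
"# Vectorized version: the ^ operator acts on the whole vector at once\n",
"SquareVec <- function(x) {\n",
"    return(x^2)\n",
"}\n",
"\n",
"print(all.equal(SquareLoop(x), SquareVec(x))) # TRUE: same answer either way\n",
"\n",
"M <- matrix(1:6, nrow = 2) # columns are (1,2), (3,4), (5,6)\n",
"print(rowSums(M))          # 9 12\n",
"print(colSums(M))          # 3 7 11\n",
"```\n",
"\n",
"You can wrap either function call in `system.time()` to compare their running times on your own computer.\n",
"\n",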
"### Pre-allocation\n",
"\n",
"And if you are using loops, one operation that is slow in R (and somewhat slow in all languages) is memory allocation for a particular variable that will change during looping (e.g., a variable that is a dataframe). So writing a for loop that resizes a vector repeatedly makes R re-allocate memory repeatedly, which makes it slow. Try this:"
]
},
{
"cell_type": "code",
"execution_count": 441,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[1] 1\n",
"56 bytes\n",
"[1] 1 2\n",
"56 bytes\n",
"[1] 1 2 3\n",
"64 bytes\n",
"[1] 1 2 3 4\n",
"64 bytes\n",
"[1] 1 2 3 4 5\n",
"80 bytes\n",
"[1] 1 2 3 4 5 6\n",
"80 bytes\n",
"[1] 1 2 3 4 5 6 7\n",
"80 bytes\n",
"[1] 1 2 3 4 5 6 7 8\n",
"80 bytes\n",
"[1] 1 2 3 4 5 6 7 8 9\n",
"96 bytes\n",
" [1] 1 2 3 4 5 6 7 8 9 10\n",
"96 bytes\n"
]
},
{
"data": {
"text/plain": [
" user system elapsed \n",
" 0.011 0.000 0.011 "
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"NoPreallocFun <- function(x) {\n",
" a <- vector() # empty vector\n",
" for (i in 1:x) {\n",
" a <- c(a, i) # concatenate\n",
" print(a)\n",
" print(object.size(a))\n",
" }\n",
"}\n",
"\n",
"system.time(NoPreallocFun(10))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Here, in each repetition of the for loop, you can see that R has to re-size the vector and re-allocate (more) memory. It has to find the vector in memory, create a new vector that will fit more data, copy the old data over, insert the new data, and erase the old vector. This can get very slow as vectors get big.\n",
"\n",
"On the other hand, if you \"pre-allocate\" a vector that fits all the values, R doesn't have to re-allocate memory each iteration. Here's how you'd do that for the above case:"
]
},
{
"cell_type": "code",
"execution_count": 442,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" [1] 1 NA NA NA NA NA NA NA NA NA\n",
"96 bytes\n",
" [1] 1 2 NA NA NA NA NA NA NA NA\n",
"96 bytes\n",
" [1] 1 2 3 NA NA NA NA NA NA NA\n",
"96 bytes\n",
" [1] 1 2 3 4 NA NA NA NA NA NA\n",
"96 bytes\n",
" [1] 1 2 3 4 5 NA NA NA NA NA\n",
"96 bytes\n",
" [1] 1 2 3 4 5 6 NA NA NA NA\n",
"96 bytes\n",
" [1] 1 2 3 4 5 6 7 NA NA NA\n",
"96 bytes\n",
" [1] 1 2 3 4 5 6 7 8 NA NA\n",
"96 bytes\n",
" [1] 1 2 3 4 5 6 7 8 9 NA\n",
"96 bytes\n",
" [1] 1 2 3 4 5 6 7 8 9 10\n",
"96 bytes\n"
]
},
{
"data": {
"text/plain": [
" user system elapsed \n",
" 0.006 0.000 0.005 "
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"PreallocFun <- function(x) {\n",
" a <- rep(NA, x) # pre-allocated vector\n",
" for (i in 1:x) {\n",
" a[i] <- i # assign\n",
" print(a)\n",
" print(object.size(a))\n",
" }\n",
"}\n",
"\n",
"system.time(PreallocFun(10))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"★ Write the above two blocks of code into a script called `preallocate.R`. You can't really see the difference in timing here using `system.time()` because the vector is really small (just 10 elements), and the `print()` commands take up most of the time. To really see the difference in efficiency between the two functions, increase the iterations in your script from 10 to 1000, and suppress the print commands. Make this modification to your script. The modified script should print just the outputs of the two `system.time()` calls. \n",
"\n",
"Fortunately, R has several functions that can operate on entire vectors and matrices without requiring explicit looping (vectorization). That is, vectorizing a computer program means writing it such that as many operations as possible are applied to whole data structures (vectors, matrices, dataframes, lists, etc.) in one go, instead of to their individual elements. \n",
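"\n",
"As a minimal sketch of this idea (the vector `v` below is made up purely for illustration), compare an element-by-element loop with the equivalent vectorized expression:\n",
"\n",
"```r\n",
"# Hypothetical example data (an assumption, not from the exercises above)\n",
"v <- c(1.5, 2.0, 3.5, 4.0)\n",
"\n",
"# Loop version: square each element one at a time\n",
"squares_loop <- numeric(length(v))\n",
"for (i in seq_along(v)) {\n",
"    squares_loop[i] <- v[i]^2\n",
"}\n",
"\n",
"# Vectorized version: the operator is applied to the whole vector in one go\n",
"squares_vec <- v^2\n",
"\n",
"# Both approaches should give the same values\n",
"print(all.equal(squares_loop, squares_vec))\n",
"```\n",
"\n",
"The two versions compute the same thing; the vectorized one is shorter and leaves the looping to R's compiled code.\n",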
"\n",
"You will learn about some important R functions that allow vectorization in the following sections. \n",
"\n",
"### The `*apply` family of functions\n",
"\n",
"There is a family of functions called `*apply` in R that vectorize your code for you. These functions are described in the help files (e.g. `?apply`). \n",
"\n",
"For example, `apply` can be used when you want to apply a function to the rows or columns of a matrix (and higher-dimensional analogues – remember arrays!). This is not generally advisable for data frames, as R will first need to coerce the data frame to a matrix.\n",
"\n",
"Let us try applying the same function to the rows/columns of a matrix using `apply`. \n",
"\n",
"\n",
"★ Type the following in a script file called `apply1.R`, save it to your `Code` directory, and run it:"
]
},
{
"cell_type": "code",
"execution_count": 327,
"metadata": {
"scrolled": true
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" [1] 0.36031151 0.04436538 0.22147423 0.20493801 0.16210725 0.04860134\n",
" [7] 0.41767343 -0.30198204 0.17624371 -0.12782576\n"
]
}
],
"source": [
"## Build a random matrix\n",
"M <- matrix(rnorm(100), 10, 10)\n",
"\n",
"## Take the mean of each row\n",
"RowMeans <- apply(M, 1, mean)\n",
"print (RowMeans)"
]
},
{
"cell_type": "code",
"execution_count": 328,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" [1] 0.8054302 0.5757168 0.3312622 0.4285847 0.9819718 0.9878042 0.6637336\n",
" [8] 0.8063847 0.4404945 1.3378551\n"
]
}
],
"source": [
"## Now the variance\n",
"RowVars <- apply(M, 1, var)\n",
"print (RowVars)"
]
},
{
"cell_type": "code",
"execution_count": 329,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" [1] 0.225188540 -0.005593872 0.070236505 0.540658047 0.207704657\n",
" [6] -0.039456952 -0.308016915 -0.052139549 -0.017257915 0.584584511\n"
]
}
],
"source": [
"## By column\n",
"ColMeans <- apply(M, 2, mean)\n",
"print (ColMeans)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"That was using `apply` with some of R's inbuilt functions. You can also use `apply` with functions that you define yourself. Let's try it.\n",
"\n",
"★ Type the following in a script file called `apply2.R`, save it to your `Code` directory, and run it:"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" [,1] [,2] [,3] [,4] [,5] [,6]\n",
" [1,] -234.39455 31.662431 0.7233882 -0.94436410 -1.4572646 -20.07141\n",
" [2,] 180.15974 83.582155 0.3644622 1.19740008 -0.3927685 -98.37526\n",
" [3,] 16.28549 143.665254 0.3018886 -1.06216206 -2.3448302 224.24758\n",
" [4,] -28.24259 -6.671888 0.4672368 0.63864658 0.4280011 -86.77100\n",
" [5,] 237.42028 148.139235 1.2226815 0.04541425 -0.3694419 -17.01924\n",
" [6,] 67.62704 14.332068 -0.3192625 -1.51307536 -1.8384410 -15.53040\n",
" [7,] 107.61127 53.433940 -1.8648914 -0.10926190 -1.3273339 87.13306\n",
" [8,] 39.42281 -101.533491 0.1049022 -0.58231320 -0.1164919 80.88842\n",
" [9,] 85.01566 177.863389 -2.0382452 -0.06725732 0.6842362 -39.28277\n",
"[10,] 79.61237 67.051809 0.3626656 0.40823357 -1.1903160 27.77324\n",
" [,7] [,8] [,9] [,10]\n",
" [1,] -0.90120256 18.385375 0.04279667 3.696104\n",
" [2,] -0.48180877 74.636647 -0.97886275 147.825300\n",
" [3,] -0.65661220 9.031228 0.69408658 57.520424\n",
" [4,] -0.49917243 -94.532163 -0.41680049 128.188540\n",
" [5,] 0.07449752 163.795799 -0.57998084 -22.925171\n",
" [6,] -0.31465171 -82.500457 0.09081916 -106.962944\n",
" [7,] 0.76765184 9.721098 -1.81725821 42.298713\n",
" [8,] 1.99041007 73.372002 -0.29073832 133.662999\n",
" [9,] 0.93981016 -28.824632 1.31045316 129.069991\n",
"[10,] -1.45123295 126.623928 -1.11812957 69.367728\n"
]
}
],
"source": [
"SomeOperation <- function(v) { # (What does this function do?)\n",
" if (sum(v) > 0) { #note that sum(v) is a single (scalar) value\n",
" return (v * 100)\n",
" } else { \n",
" return (v)\n",
" }\n",
"}\n",
"\n",
"M <- matrix(rnorm(100), 10, 10)\n",
"print (apply(M, 1, SomeOperation))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Thus, the function `SomeOperation` takes as input a vector `v`. If the sum of the elements of `v` is greater than zero, it multiplies every element of `v` by 100 and returns the result; otherwise, it returns `v` unchanged. So even if `v` contains both positive and negative numbers, its values are multiplied by 100 only when their sum comes out positive.\n",
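"\n",
"Notice that in the output above the scaled values run down columns, not along rows: when the function passed to `apply` returns a whole vector, the result for each row is stored as a column. A minimal sketch with a small made-up matrix `A` (an assumption for illustration, not part of `apply2.R`) shows this, and how `t()` restores the original orientation:\n",
"\n",
"```r\n",
"# Small hypothetical matrix with 2 rows and 3 columns\n",
"A <- matrix(1:6, nrow = 2, ncol = 3)\n",
"\n",
"# Apply, row by row, a function that returns a vector as long as the row\n",
"doubled <- apply(A, 1, function(v) v * 2)\n",
"\n",
"print(dim(A))        # dimensions of the input: 2 rows, 3 columns\n",
"print(dim(doubled))  # dimensions of the result: 3 rows, 2 columns\n",
"print(t(doubled))    # transpose to get back the row-wise layout of A\n",
"```\n",
"\n",
"So if you need the result in the same orientation as the input matrix, wrap the call in `t()`.\n",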
"\n",
"There are many other functions in this family: `lapply`, `sapply`, `eapply`, etc. Each is best suited to a given data type. For example, `lapply` and `sapply` are designed for R lists. Have a look at [this Stack Overflow thread](https://stackoverflow.com/questions/3505701/grouping-functions-tapply-by-aggregate-and-the-apply-family) for some guidelines.\n",
"\n",
"#### A vectorization example\n",
"\n",
"Let's try an example of vectorization involving `lapply` and `sapply`. Both of these *apply* a function to *each element of a list* (or vector), but the former always returns a list, while the latter tries to simplify the result to a vector (or matrix) where possible. \n",
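"\n",
"Before the main example, here is a minimal sketch of the difference in what the two functions return (the list `mylist` is hypothetical and used only for illustration):\n",
"\n",
"```r\n",
"# Hypothetical list of three short integer vectors\n",
"mylist <- list(a = 1:3, b = 4:6, c = 7:9)\n",
"\n",
"as_list <- lapply(mylist, mean)    # always returns a list\n",
"as_vector <- sapply(mylist, mean)  # simplified to a named numeric vector here\n",
"\n",
"print(class(as_list))\n",
"print(class(as_vector))\n",
"print(as_vector)\n",
"```\n",
"\n",
"Which one is more convenient depends on what you want to do with the result next: a list is more flexible, while a vector is easier to plot or summarize.\n",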
"\n",
"We will also see how [sampling random numbers](R-random-numbers) works in the process of trying out this example.\n",
"\n",
"★ Type the following blocks of code into a single script called `sample.R` and save it in your `Code` directory.\n",
"\n",
"First some functions:"
]
},
{
"cell_type": "code",
"execution_count": 331,
"metadata": {},
"outputs": [],
"source": [
"######### Functions ##########\n",
"\n",
"## A function to take a sample of size n from a population \"popn\" and return its mean\n",
"myexperiment <- function(popn,n) {\n",
" pop_sample <- sample(popn, n, replace = FALSE)\n",
" return(mean(pop_sample))\n",
"}\n",
"\n",
"## Calculate means using a FOR loop on a vector without preallocation:\n",
"loopy_sample1 <- function(popn, n, num) {\n",
" result1 <- vector() #Initialize empty vector (of length 0)\n",
" for(i in 1:num) {\n",
" result1 <- c(result1, myexperiment(popn, n))\n",
" }\n",
" return(result1)\n",
"}\n",
"\n",
"## To run \"num\" iterations of the experiment using a FOR loop on a vector with preallocation:\n",
"loopy_sample2 <- function(popn, n, num) {\n",
" result2 <- vector(\"numeric\", num) #Preallocate expected size\n",
" for(i in 1:num) {\n",
" result2[i] <- myexperiment(popn, n)\n",
" }\n",
" return(result2)\n",
"}\n",
"\n",
"## To run \"num\" iterations of the experiment using a FOR loop on a list with preallocation:\n",
"loopy_sample3 <- function(popn, n, num) {\n",
" result3 <- vector(\"list\", num) #Preallocate expected size\n",
" for(i in 1:num) {\n",
" result3[[i]] <- myexperiment(popn, n)\n",
" }\n",
" return(result3)\n",
"}\n",
"\n",
"\n",
"## To run \"num\" iterations of the experiment using vectorization with lapply:\n",
"lapply_sample <- function(popn, n, num) {\n",
" result4 <- lapply(1:num, function(i) myexperiment(popn, n))\n",
" return(result4)\n",
"}\n",
"\n",
"## To run \"num\" iterations of the experiment using vectorization with sapply:\n",
"sapply_sample <- function(popn, n, num) {\n",
" result5 <- sapply(1:num, function(i) myexperiment(popn, n))\n",
" return(result5)\n",
"}"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"*Think carefully about what each of these functions does.* \n",
"\n",
"Now let's generate a population. To get the same result every time, let's set the seed (you might want to review [this section](R-random-numbers))."
]
},
{
"cell_type": "code",
"execution_count": 332,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAtAAAALQCAMAAACOibeuAAAC91BMVEUAAAABAQECAgIDAwME\nBAQFBQUGBgYHBwcICAgJCQkKCgoLCwsMDAwNDQ0ODg4PDw8QEBARERESEhITExMUFBQVFRUW\nFhYXFxcYGBgZGRkaGhobGxscHBwdHR0eHh4fHx8gICAhISEiIiIjIyMkJCQlJSUmJiYnJyco\nKCgpKSkqKiorKyssLCwtLS0uLi4vLy8wMDAxMTEyMjIzMzM0NDQ1NTU2NjY3Nzc4ODg5OTk6\nOjo7Ozs8PDw9PT0+Pj4/Pz9AQEBBQUFCQkJDQ0NERERGRkZHR0dISEhJSUlKSkpLS0tMTExN\nTU1OTk5PT09QUFBRUVFSUlJTU1NUVFRVVVVWVlZXV1dYWFhZWVlaWlpbW1tcXFxdXV1eXl5f\nX19gYGBhYWFiYmJjY2NkZGRlZWVmZmZnZ2doaGhpaWlqampra2tsbGxtbW1ubm5vb29wcHBx\ncXFycnJzc3N0dHR1dXV2dnZ3d3d4eHh5eXl6enp7e3t8fHx9fX1+fn5/f3+AgICBgYGCgoKD\ng4OEhISFhYWGhoaHh4eIiIiJiYmKioqLi4uMjIyNjY2Pj4+QkJCRkZGSkpKTk5OUlJSVlZWW\nlpaXl5eYmJiZmZmampqbm5ucnJydnZ2enp6fn5+goKChoaGioqKjo6OkpKSlpaWmpqanp6eo\nqKipqamqqqqrq6usrKytra2urq6vr6+wsLCxsbGzs7O0tLS1tbW2tra3t7e4uLi5ubm6urq7\nu7u8vLy9vb2+vr6/v7/AwMDBwcHCwsLDw8PExMTFxcXGxsbHx8fIyMjJycnKysrLy8vMzMzN\nzc3Ozs7Pz8/Q0NDR0dHS0tLT09PU1NTV1dXW1tbX19fY2NjZ2dna2trb29vc3Nzd3d3e3t7f\n39/g4ODh4eHi4uLj4+Pk5OTl5eXm5ubn5+fo6Ojp6enq6urr6+vs7Ozt7e3u7u7v7+/w8PDx\n8fHy8vLz8/P09PT19fX29vb39/f4+Pj5+fn6+vr7+/v8/Pz9/f3+/v7////CQYAgAAAACXBI\nWXMAABJ0AAASdAHeZh94AAAgAElEQVR4nO3dbZwU1Z3o8X+DPC3PMirDg/IUYBE0ShTCDAgu\nEpBA9JLIGAWWEEVBY5LNjXLRmOi9SVTWPJDkZoPLJpvdJLrJbmLMRtclCWuy2SgK4ypqEpfl\ngokkCAiCA1MvbnX1zFR310xVTXWdU6eOv++L6ZruU13n0/xsq/vM9IgDWESyngCQJoKGVQga\nViFoWIWgYRWChlUIGlYhaFiFoGEVgoZVCBpWIWhYhaBhFYKGVQgaViFoWIWgYRWChlUIGlYh\naFiFoGEVgoZVCBpWIWhYhaBhFYKGVQgaViFoWIWgYRWChlUIGlYhaFiFoGEVgoZVCBpWIWhY\nhaBhFYKGVQgaViHoTm0VkV+UNieINDrOV0QGZDWZlk+O69X/G1kdPWcIulOxgj61ePHin2qY\nzOfcycjXNBzIBgTdqVhBn3QHfUvDZOaLnL5uu4YD2YCgOxUIuvXkyZPVg3QF/aciN2s4jB0I\nulOBoDvxwnZ30F2/OFLDYU7FGjVZZGMNB3lrIehOdXnK0frdBeP6jpv/9RbHWSae4rA3Ny0Z\nO+idH9zZtvfRj08bMP8/3V1mut/cKTLB+e7bJ7qjHph7Tp/Rsz5X/E/gFvden5lV6D31Aef4\n3ZcMGv/n+8oPX35/bYe5s/22NSJzX79lZN9z7zoeHFx9a/Hgp+6b0vecK55T9VAZhqA71VXQ\nrUtLecmFR8qC3nleabPXJ1qLe+ydVPzmzHV+0H9bkDHO8Zlt+0474gU9eoj33X3zvIuRr/lH\nr7i/ToJubPSuOm93YHD1re7Bxzd53/Z5RtNDlzGC7tRWKdfYEXTxDYeJV84siK
zyz6GPjXM3\nzr6oj/v1geLOS4o59y/u2Bb0WcPEDfpW94rJl5zlfr3DC1qk/5+U7r+u+OVTHQevvL/ntp8t\nsmL7nvZb3WQLUhhzmnvr/ODBq269s3jPhfritwu0PXiZIuhOdRX0QpFrHe+701s7gt4o0mOL\n4+x7h5vmHx3nQffq+1vfvMkPWnrf8Fd/V3xtd3vpLt5VCvr2U0fXF2t83tk5QOSKjoNX3V/V\nOfSa4rPvy87By9zLnwcGV91aPPgVrzoHLhIZpO/RyxJBd6qroKe75wZf3uMc27ZtW0tH0G5w\nVxd32uU+cX/Xca4SaXC/OzXZD9q91mn91re+9arjHJ4j8g4v6DPcl4S/cW/8nHvj1eWvPKvu\nr5Ogn3QvX+nnviQNDK661T147+Ip+zfcqw8of9RMQNCdKjuHnlwe9O1e4BNveOio03HKccL9\nH/pD3tCJIv/bcaaI/J/idxs7gh5YuqeW7Z967/nFc4NS0NPdqw643z3sXq4tC7r6/oJBj/Y2\n/qz4f4vqwZW3Fg8+rvjdo+5xXkn7QTISQXeqq6BPbDi99KQ96K86gv5N6f/urktFPuC09hLZ\nWvzuqx1Bj/du3eGWXphw1cL2oN2vXtA/ciqDrrq/ToKe4W2sELksMLjy1tJbLK7HCPqtraug\n3afZn9x6vvdKa2f5M/Q/eLdN8t6MGC5yX/G7u8retnMdd1+9LXejui0q6Or7CwZ9jrfhnibf\nGBhceStBo6SLoI80Nze7V+1zW5XNHefQbkrXFEc+20PkQce5xHtudJyZlUH/uzv6BfdycVTQ\n1fcXDLrwtHv5qvtC8vOBwVW3EjQ8XQS92736x+51v3PPhH/oBb3F8Z5ze/yN4+y/SGTI7x3n\n01I85zj1CakM+lHvauf7hcigq+6vkxeFF+x1DrmnLv32BgZX3UrQ8HR1yjFBpOecqy8fJHLm\nYccZ4v7/ff1vnKNj3NHjZ/Vzv7pn1s5R95xDRg2WqqBfcZ9C5fxpbs8yNTzoqvvr5Blaeryt\nt3vrx4ODq24laHi6Cvq5YW3v5PX9ifvd+6Q0bMe5pSt7bfRWCv/Ve+HY+8rKoJ0Pe2PGNYmc\ntis06Or7CwQ9a4Z383uPBgdX3UrQ8HT5ovC1++eM61d34S3/r/jNqyvr+01+1t04cc/iMQNn\nfKB9dfnXfz5p2OXbv1wV9KnPn9f/wo8c+if3rm8JD7rq/gJBzz28vr7PlL9s7WRw1a0EjdTc\n7j5LKrjbYrJJb30LIOjUrZswYcYxx2mZUloWSRtBhyLo1H3e/d/7lf/84/kig3+r4O4JOhRB\np+7Ue9teOPb/noq7J+hQBK3AY0v+9E+Gv/Ojv1Ny519etuwTSW99CyBoWIWgYRWChlUIGlYh\naFiFoGEVgoZVCBpWIWhYhaBhFYKGVQgaViFoWIWgYRWChlUIGlYhaFiFoGGV2oI+tHdfvL97\nA+hRQ9C7VhQ/86rnyCb+hB6MkTzo9QWpn7Fo0cxRImtSnBBQi8RBb5YFT5W2mq+STWlNB6hN\n4qBnTWpp32yd3ZDOZIBaJQ560Ep/e8PgNKYC1C75M/Rk/29fz+MZGoao4Rx6YdtfAt59tdyT\n1nSA2iR/l2OtyOjGJUvnjBVZ1ZrijIAa1PA+9I6m4p/07VnftC296QC1qW2l8OCe/awUwiQs\nfcMqLH3DKix9wyosfcMqLH3DKix9wyosfcMqLH3DKix9wyosfcMqapa+X+opvh4nOxsCqKBo\n6fvpX3V4QE7UdAygG9Qvff8bQUMf9UvfBA2N1C99EzQ0Ur/0TdDQSP3SN0FDI/VL3wQNjdQv\nfRM0NFK/9E3Q0Ej90jdBQyP1v/VN0NColqB/93zbO3ev7g0ZRdDQKHnQO84TGb7V23xX2L0Q\nNDRKHPRLfXvMX9RXNhe3CTpF186P9kjWkzRX4qCXF37oOL+f0Pd5h6BTVVi0OsqI
27OepLkS\nBz12QfHr7n7vdgg6VYUtu6LMuD3rSZorcdADSz9id7v8lKBTRdA1SRx04xTv4vXR554g6DQR\ndE0SB32brD9evHxYlr9B0DEdvuczkQi6JomDfmO2DFxc3LhdRp5B0PH8q8yMJARdi+TvQx+8\ndXLprGPrJCHoeB4vRMa6i6Brksbf+m797WMhtxK0j6CVU//H6wnaR9DKEbROBK0cQetE0MoR\ntE4ErRxB60TQyhG0TgStHEHrRNDKEbROBK0cQetE0MoRtE4ErRxB60TQyhG0TgStHEHrRNDK\nEbROBK0cQetE0MoRtE4ErRxB60TQyhG0TgStHEHrRNDKEbROBK0cQetE0MoRtE4ErRxB60TQ\nyhG0TgStHEHrRNDKEbROBK0cQetE0MoRtE4ErRxB60TQyhG0TgStHEHrRNDKEbROBK0cQetE\n0MoRtE4ErRxB60TQyhG0TgStHEHrRNDKEbROBK0cQetE0MoRtE4ErRxB60TQyhG0TgStHEHr\nRNDKEbROBK0cQetE0MoRtE4ErRxB60TQyhG0TgStHEHrRNDKEbROBK0cQetE0MoRtE4ErRxB\n60TQyhG0TgStHEHrRNDKEbROBK0cQetE0MoRtE4ErRxB60TQyhG0TgStHEHrRNDKEbROBK0c\nQetE0MoRtE4ErRxB60TQyhG0TgStHEHrRNDKEbROBK0cQetE0MoRtE4ErRxB60TQyhG0TgSt\nHEHrRNDKEbROBK0cQetE0MrVFvShvftORY0haB9BK1dD0LtWDBeRniObtocOI2gfQSuXPOj1\nBamfsWjRzFEia8LGEbSPoJVLHPRmWfBUaav5KtkUMpCgfQStXOKgZ01qad9snd0QMpCgfQSt\nXOKgB630tzcMDhlI0D6CVi75M/Tkkx3b83iGjoeglavhHHrhztLW7qvlnpCBBO0jaOWSv8ux\nVmR045Klc8aKrGoNGUfQPoJWrob3oXc01RXfh65v2hY6jKB9BK1cbSuFB/fsZ6WwGwhaOZa+\ndSJo5Vj61omglWPpWyeCVo6lb50IWjmWvnUiaOVY+taJoJVj6VsnglaOpe/UfH9+pOkErRpL\n36nZePaHo1xC0Kqx9J2ajTMjS/wLglZNzdL3yyOGdhhI0AStT60fY3DqhWdbgte2fO87He4i\naILWJ3HQG7e4X1o+O0Ckz3WvhQ3klIOgNUoctMx1v9wkQ5ddP1OmHA8ZSNAErVFNQTcXLj7g\nbm6RO0IGEjRBa1RT0F+VJ7zthotCBhI0QWtUU9B3yBFve+3AkIEETdAa1RT0N6TZ237PtJCB\nBJ160GePjV6UvCvrhyMjyYMecdeDvzxjeXHzl71Whwwk6NSDHjwjck2yYVbWD0dGEgc9uiBF\njzvOrf2G7QkZSNDpB3195JCbCbq7ju186NOrG3/qOJNHh659EzRBa5TCB54/G/57sgRN0Brx\nCf6pIWgTEHRqCNoEBJ0agjYBQaeGoE2QNOgvDKkQMpKgCVqjpEG/eHMfGTi1Q8hIgiZojZKf\ncvxIFscaR9AErVEN59ATCboCQZughqDff0WsYQRN0BrxLkdqCNoEBJ0agjYBQaeGoE1A0Kkh\naBMQdGoI2gQEnRqCNgFBp4agTUDQqSFoExB0agjaBASdGoI2AUGnhqBNQNCpIWgTEHRqCNoE\nBJ0agjYBQaeGoE1A0KkhaBMQdGoI2gQEnRqCNgFBp4agTUDQqSFoExB0agjaBASdGoI2AUGn\nhqBNQNCpIWgTEHRqCNoEBJ0agjYBQaeGoE1A0KkhaBMQdGoI2gQEnRqCNgFBp4agTUDQqSFo\nExB0agjaBASdGoI2AUGnhqBNQNCpIWgTEHRqCNoEBJ0agjYBQaeGoE1A0KkhaBMQdGoI2gQE\nnRqCNgFBp4agTUDQqSFoExB0agjaBASdGoI2AUGnhqBNQNCpIWgTEHRqCNoEBJ0agj
YBQaeG\noE1A0KkhaBMQdGoI2gQEnRqCNgFBp4agTUDQqSFoExB0agjaBASdGoI2AUGnhqBNQNCpIWgT\nlAe99ZCKIxA0QWtUHrT0vfI7x1I/AkETtEblQW++pIcMuOYHb6Z7BIImaI0qz6H3f9Ft+vQP\nPn4qxSMQNEFrFHhRuP+Lc3pI/Yd+kdoRCJqgNQq+y/H0nWPFNfGhlI5A0AStUWXQLY9/6ByR\n+rU/fvIjAwr/kc4RCJqgNSoP+qFrh4qM/4snWovfPCW3pnMEgiZojSretpPz73ym/ZtDdfem\ncwSCJmiNyoO+79cqjkDQBK1R5Tn0C4+6X77yfKpHIGiC1qgi6A8VGt2vpxU+0priEQiaoDUq\nD/oBmfWwe/HIPNmS4hEImqA1Kg963ttKq94tU96R4hEImqA1Kg96yPVtGzcOTPEIBE3QGpUH\nPXlh28blE1M8AkETtEblQV/X8x+9y0d6rkrxCARN0BqVB/2HMTL/7q995t2FM/fH3PvQ3n2R\nP5hH0AStUcXbdv91bY/izyVd/lysXXetGO4O7jmyaXvoMIImaI2qftru99u/+dh/x9tzfUHq\nZyxaNHOUyJqwcQRN0Bol/iXZzbLgqdJW81WyKWQgQRO0RhVBP7h8fpvoHWdNamnfbJ3dEDKQ\noAlao/KgvyYyoK4kesdBK/3tDYNDBhI0QWtUHvS5g8Jf3lWYNflkx/Y8nqEdgjZDWdCtvW/q\nxo6bZeHO0tbuq+WekIEETdAalQV9vPDh7uy5VmR045Klc8aKrAr76TyCJmiNyk85LhnzWnd2\n3dFUV3wfur5pW+gwgiZojcqD/q9p07790gFPzL0P7tnPSmE7gjZBxU/b9Zd2Mfdm6bsMQZug\nPN01vji7svRdiaBNkPzjdFn6rkLQJqgK+ujOn8fckaXvagRtgoqgX76yl3v6fMf798bYkaXv\nagRtgvKg942WWfPEuVdG7ovekaXvagRtgvKg18nXnb91r9ja88boHVn6rkbQJigP+px5jhe0\ns+Rt0Tuy9F2NoE1QHnT/69uCvqF/jD1Z+q5C0CYoD3rGxW1BXzg9zq4sfVciaBOUB3233HWq\nGPTdclvMvbta+t7XOL3DJDle+zTzgKBNUB70yTky4Z1y43SZ9kbMvbta+j666TMdbuAZmqD1\nqXgf+sT9Z7snEcM2Ho61K0vflQjaBNVL30ee/UPMPVn6rkLQJuC3vlND0CYoD/oaX/SOLH1X\nI2gTVP6NlTYDJ0TvyNJ3NYI2QXnQxz0HHmvo93D0jix9VyNoE3R2Dn100rDov/fN0nc1gjZB\npy8KPyZ7ovdk6bsKQZug06A/1CfOX69n6bsSQZugk6BbfzL4vJh781vfZQjaBOVBDyjpI7I1\nxSMQNEFrVB704jYr/jHNIxA0QWuU/Le+4yJogtaIoFND0CYoD3pUhcbQ/b4wpELISIImaI3K\ng147Ugojpo8qyJhG1xWh+714cx8ZOLVDyEiCJmiNyoP+WY/L/tO9eH7ByJdj7PkjWRzrCARN\n0BqVB/3usce8y2PjlsXZdSJBVyBoE5QHfVb7D9CtHhVn1/eHn5S0I2iC1qj6czk88+tTPAJB\nZxH0dZO+E+lfsn7EVCgPennhe97lP/VYkuIRCDqLoBt6DorSv2DjP0x50C8P6/G+LY888L4e\n/Z5J8QgEnUXQ73x75JBvStxf7s+TioWVpy/1fmFl6mNpHoGgfQStXNVKYfODm77+8zg/Oxof\nQfsIWrnEH3gemx1BN0e/xvofBG2AxB94HpsdQc/uE/kiqydBGyDxB57HZkfQs26ODORCgjZA\n4g88j42gfQStXOIPPI+NoH0ErVzyDzyPi6B9BK1cDR94HhNB+whaudo+8DwOgvYRtHK1feB5\nHATtI2jlavjA85gI2kfQypUF/fpXnujOB57HRdA+glau4l2O96s4AkH7CFq58qBvPOOA
giMQ\ntI+glSsPuuX6ad9+8fDrRSkegaB9BK1cedDDh/ds/wz/FI9A0D6CVq483VW+FI9A0D6CVq49\n6PV/o+oIBO0jaOXagxbvD189EPoXB5MhaB9BK1cZ9CoFn91I0D6CVo6g4yHonCDoeAg6Jwg6\nHoLOCYKOh6BzgqDjIeic6Aj6nOWusbK8JMUjELSPoJXrCLpSikcgaB9BK9ee7q8qpXgEgvYR\ntHL8Fax4CDonCDoegs4Jgo6HoHOCoOMh6Jwg6HgIOicIOh6CzgmCjoegc4Kg4yHonCDoeAg6\nJwg6HoLOCYKOh6BzgqDjIeicIOh4CDonCDoegs4Jgo6HoHOCoOMh6Jwg6HgIOicIOh6CzgmC\njoegc4Kg4yHonCDoeAg6Jwg6HoLOCYKOh6BzgqDjIeicIOh4CDonCDoegs4Jgo6HoHOCoOMh\n6Jwg6HgIOicIOh6CzgmCjoegc4Kg4yHonCDoeAg6Jwg6HoLOCYKOh6BzgqDjIeicIOh4CDon\nCDoegs4Jgo6HoHOCoOMh6Jwg6HgIOicIOh6CzgmCjoegc4Kg4yHonCDoeAg6Jwg6HoLOCYKO\nh6BzgqDjIeicIOh4CDonCDoegs6J2oI+tHffqagxBO0jaOVqCHrXiuEi0nNk0/bQYQTtI2jl\nkge9viD1MxYtmjlKZE3YOIL2EbRyiYPeLAueKm01XyWbQgYStI+glUsc9KxJLe2brbMbQgYS\ntI+glUsc9KCV/vaGwSEDCdpH0Molf4aefLJjex7P0LsI2gw1nEMv3Fna2n213BMykKB9BK1c\n8nc51oqMblyydM5YkVWtIeMI2kfQytXwPvSOprri+9D1TdtChxG0j6CVq22l8OCe/awUtiNo\nE7D0HQ9B5wRL3/EQdE6w9B0PQecES9/xEHROsPQdD0HnhJql7wPXvrfDpQTdgaCVU7P0fXDd\ndR2WEnQHglaOpe94CDonWPqOh6BzgqXveAg6J1j6joegc6LWjzE49cKzLeEjCNpH0MolDnrj\nFvdLy2cHiPS57rWwgQTtI2jlEgctc90vN8nQZdfPlCnHQwYStI+glasp6ObCxQfczS1yR8hA\ngvYRtHI1Bf1VecLbbrgoZCBB+whauZqCvkOOeNtrB4YMJGifUUHfL0OGRjnz6awf+O6qKehv\nSLO3/Z5pIQMJ2mdU0J+Uz26K0vvhrB/47koe9Ii7HvzlGcuLm7/stTpkIEH7DAv6V5Fj+r51\ngh5dkKLHHefWfsP2hAwkaB9BK5d8YeXYzoc+vbrxp44zeXTo2jdB+whauRQ+8PzZ8NVv84M+\nOSnyxdHQ0wg6H/gEf8d5Q26OfHXUh6DzgaCLQX8z8l+2P0HnA0ETdAiCDiJoH0ErR9AEHYKg\ngwjaR9DKETRBhyDoIIL2EbRyBE3QIQg6iKB9BK0cQRN0CIIOImgfQStH0AQdgqCDCNpH0MoR\nNEGHIOgggvYRtHIETdAhCDqIoH0ErRxBE3QIgg4iaB9BK0fQBB2CoIMI2kfQyhE0QYcg6CCC\n9hG0cgRN0CEIOoigfQStHEETdAiCDiJoH0ErR9AEHYKggwjaR9DKETRBhyDoIIL2EbRyBE3Q\nIQg6iKB9BK0cQRN0CIIOImgfQStH0AQdgqCDCNpH0MoRNEGHIOgggvYRtHIETdAhCDqIoH0E\nrRxBE3QIgg4iaB9BK0fQBB2CoIMI2kfQyhE0QYcg6CCC9hG0cgRN0CEIOoigfQStHEETdAiC\nDiJoH0ErR9AEHYKggwjaR9DKETRBhyDoIIL2EbRyBE3QIQg6iKB9BK0cQRN0CIIOImgfQStn\nfdA7zhgaZQhBd6XP534V6UiW/7wB1gf9g96botxN0F0pSLTbsvznDbA/6H6R/2pPEHSXk7n3\n36I0fjTLf94AgibosMlsjhwyl6C1IuiuEHQyBO0jaOUImqDDJkPQAQTtI2jlCJqgwyZD0A
EE\n7SNo5QiaoMMmQ9ABBO0jaOUImqDDJkPQAQTtI2jlCJqgwyZD0AEE7SNo5QiaoMMmQ9ABBO0j\naOUImqDDJkPQAQTtI2jlCJqgwyZD0AEE7SNo5QiaoMMmQ9ABBO0jaOUImqDDJkPQAQTtI2jl\nCJqgwyZD0AEE7SNo5QiaoMMmQ9ABBO0jaOXyHfTR6M/GvJ+gu0DQyagMekOMD8ck6C6kFPSs\nZY9GellZAQH5DvqjjZEfjrmaoLuQUtBnxHhSuVhZAQE5D3pu5ON9A0F3IaWgh62MHPKxC5QV\nEEDQBB02GYIOIGgfQStH0AQdNhmCDiBoH0ErV1vQh/buOxU1JnHQx/8YaR1Bd4Ggu2/XiuEi\n0nNk0/bQYYmDviTG+0EE3QWC7rb1BamfsWjRzFEia8LGJQ76guseifI2gu6CUUHfOPr/Rvq7\nZJEEJA56syx4qrTVfJVsChmYPOiPRT5S5xJ0F4wK+tK+U6KMl4PJKqmWOOhZk1raN1tnN1Td\n+PonPt7hms6CPjY0xvnEjNVR6s6OHHLBaZFDrpF3R47pNT1yyFkjIodcLJFDVsuiyCF93h45\nZOSZkUMaZWX0ZC6LHNJvauSQc4ZFDnmP/DFpiZUSBz1opb+9YXDVja8smt9hzthOXje2LJsf\nafxFkUPOnxw5pGFU9JFGzIkcMmF65JALJkYOmT0yejIjZ0cOmXhB5JDpEyKHzBkRPZlRDZFD\nJp8fOeQd46OPtKwlWEkSyZ+hJ5/s2J5X/QwNZKSGc+iFO0tbu6+We9KaDlCb5O9yrBUZ3bhk\n6ZyxIqtaU5wRUIMa3ofe0VRXfB+6vmlbetMBalPbSuHBPfsjVwoBjdT/LAegEUHDKgQNqxA0\nrELQsApBwyoEDasQNKxC0LAKQcMqBA2rEDSsklHQrxVi/AoWsrc1mz6SyyjoP8jfR3+0sy5P\nyF9nPYUyha9kPYMy/b6fTR/JZRb0M9kcuDNvyM+znkKZwuNZz6BM/x9kPYPuImiCDkHQMRF0\nlwi6JgRN0CEIOiaC7hJB14SgCToEQcdE0F0i6JoQNEGHIOiYCLpLBF0TgiboEAQd0+HCc9kc\nuDNv9nwy6ymU6RP+BxH0GvrjrGfQXVn9tN2vMzpup4yazG9M+qDA3+bug7H48VFYhaBhFYKG\nVQgaViFoWIWgYRWChlUIGlYhaFiFoGEVgoZVCBpWIWhYhaBhFYKGVTIN+sjW/87y8LBQpkGv\nEkN+w+f4htmDxjW9lPU0XF9qGNzwpawn0cacR6Ubsgz6QTEk6Ndmy5Q1lxX67ch6Is5ambRi\noqzPehoecx6V7sgw6L2nDzAk6Ntknfv14R7nZz2RHfKuFqflssKurCdSZMyj0i3ZBd166dgN\nhgQ9eeDx4sV8+V3GE2nyfhn+SVmR8Tw8xjwq3ZJd0Pf2+NlnDAl6ymLvYpE8n/FE6kZ5F/XD\ns51GiTGPSrdkFvSO3rc5pgRd8vu+Z7VkO4OD0uBdzpDD2U6kTPaPSvdkFfSxKW8/YVbQuyfI\nX2c8hT2yxLtcJHsznkkHAx6V7tEd9NH7XW7G6/o2O5kH3T4Z1+t39Ov7xUwn49ovS73LRbIv\n45m0MeJR6R7dQb9S/NNKy5zH5C+d7INum4zrh2fL4uxPFU/1nONdzuxpxue7mPGodE9Gpxz3\ndfzdsK9lM4FKd8i5P8l6DkX147yL0SMznkeJKY9Kt2QU9KNri2bIwrUmfJLbVll+Ius5eJpk\nt/u1WZqynkiRMY9Kt2S69J31KUeb1kkj38h6DiXb5Bp3OlfJz7KeiGPSo9ItBO04v5Uz3lXy\natZTWSWXbpgjH8h6GkUGPSrdQdCO8y8dJ/SZv1vW+tlZg2bdm/UsPAY9Kt3Bz0PDKgQNqxA0\nrELQsApBwyoEDasQNKxC0LAKQcMqBA2rEDSsQtCwCkHDKgQNqxA0rELQsApBwyoEDasQNKxC
\n0LAKQcMqBA2rEDSsQtCwCkHDKgQNqxA0rELQsApBwyoEDasQNKxC0LAKQcMqBA2rEDSsQtCw\nCkFr1fpm1LU5/NOARiFoferWPHCmnP2+Xxe3D9103oDp//NY5bVrhuyZW+g9dUu208w3gtan\nblxh7Mo5haFPOs4r46XhgxfK1CMV167pP23MLesHyUNZzzTHCFqfOln4huP8vfyZ49wo97tX\nfFw+VXHtGpl20HG2y/KsZ5pjBK1PXY8XiheXy+43e09tdbeODx9Rfq0b9LeLmwPmZzjJvCNo\nferGeBdfkO+/KOu9zSvl9bJr3aBf8sYRdHIErU/dTO/iu7J5m9ztba5zn5b9a92gD3jjCDo5\ngtanbqx38V23pLIAAAD2SURBVCX5hxfkJm9zmRwuu5agU0DQ+tT18M4o3iPNb/Y6r7h1YuTw\n8msJOgUErU+dXH7ccR4qNDjO9fJF94r/JZ+suJaga0fQ+tSNOn3CB+YXBv+74+wbI3NvnCHn\nvV5xLUHXjqD1qZv74tLho5a9WNx+bd20/hfe+kbltQRdO4LWp25u/GuREEHrQ9AaELQ+BK0B\nQesz5cr41yIhgoZVCBpWIWhYhaBhFYKGVQgaViFoWIWgYRWChlUIGlYhaFiFoGEVgoZVCBpW\nIWhYhaBhFYKGVQgaViFoWIWgYRWChlUIGlYhaFiFoGEVgoZVCBpWIWhY5f8DL8ZdmLWnTOAA\nAAAASUVORK5CYII=",
"text/plain": [
"Plot with title “Histogram of popn”"
]
},
"metadata": {
"image/png": {
"height": 360,
"width": 360
}
},
"output_type": "display_data"
}
],
"source": [
"set.seed(12345)\n",
"popn <- rnorm(10000) # Generate the population\n",
"hist(popn)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"And run and time the different functions:"
]
},
{
"cell_type": "code",
"execution_count": 333,
"metadata": {
"scrolled": true
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[1] \"Using loops without preallocation on a vector took:\"\n",
" user system elapsed \n",
" 0.596 0.000 0.651 \n",
"[1] \"Using loops with preallocation on a vector took:\"\n",
" user system elapsed \n",
" 0.378 0.000 0.384 \n",
"[1] \"Using loops with preallocation on a list took:\"\n",
" user system elapsed \n",
" 0.347 0.000 0.347 \n",
"[1] \"Using the vectorized sapply function (on a list) took:\"\n",
" user system elapsed \n",
" 0.299 0.000 0.300 \n",
"[1] \"Using the vectorized lapply function (on a list) took:\"\n",
" user system elapsed \n",
" 0.340 0.000 0.342 \n"
]
}
],
"source": [
"n <- 100 # sample size for each experiment\n",
"num <- 10000 # Number of times to rerun the experiment\n",
"\n",
"print(\"Using loops without preallocation on a vector took:\" )\n",
"print(system.time(loopy_sample1(popn, n, num)))\n",
"\n",
"print(\"Using loops with preallocation on a vector took:\" )\n",
"print(system.time(loopy_sample2(popn, n, num)))\n",
"\n",
"print(\"Using loops with preallocation on a list took:\" )\n",
"print(system.time(loopy_sample3(popn, n, num)))\n",
"\n",
"print(\"Using the vectorized sapply function (on a list) took:\" )\n",
"print(system.time(sapply_sample(popn, n, num)))\n",
"\n",
"print(\"Using the vectorized lapply function (on a list) took:\" )\n",
"print(system.time(lapply_sample(popn, n, num)))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"*Note that these times will be different on your computer because they depend on the computer's hardware and software (especially, the operating system), as well as what other processes are running.*\n",
"\n",
"★ Run `sample.R` using `source` and/or `Rscript` and make sure it works.\n",
"\n",
"Rerun the script a few times, first keeping the `n` and `num` parameters fixed, and then also by varying them. Compare the times you get from these repeated runs of the script and think about which approach is the most efficient and why. Clearly, the loopy approach without pre-allocation is usually (but not always) going to be the slowest, while the others are pretty comparable."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### The `tapply` function\n",
"\n",
"Let's look at `tapply`, which is particularly useful because it allows you to apply a function to subsets of a vector in a dataframe, with the subsets defined by some other vector in the same dataframe, usually a factor (this could be useful for the Pound Hill data analysis that's coming up, for example). \n",
"\n",
"This makes it a somewhat different member of the `*apply` family. Try this:"
]
},
{
"cell_type": "code",
"execution_count": 334,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
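"<ol class=list-inline><li>1</li><li>2</li><li>3</li><li>4</li><li>5</li><li>6</li><li>7</li><li>8</li><li>9</li><li>10</li><li>11</li><li>12</li><li>13</li><li>14</li><li>15</li><li>16</li><li>17</li><li>18</li><li>19</li><li>20</li></ol>\n"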
],
"text/latex": [
"\\begin{enumerate*}\n",
"\\item 1\n",
"\\item 2\n",
"\\item 3\n",
"\\item 4\n",
"\\item 5\n",
"\\item 6\n",
"\\item 7\n",
"\\item 8\n",
"\\item 9\n",
"\\item 10\n",
"\\item 11\n",
"\\item 12\n",
"\\item 13\n",
"\\item 14\n",
"\\item 15\n",
"\\item 16\n",
"\\item 17\n",
"\\item 18\n",
"\\item 19\n",
"\\item 20\n",
"\\end{enumerate*}\n"
],
"text/markdown": [
"1. 1\n",
"2. 2\n",
"3. 3\n",
"4. 4\n",
"5. 5\n",
"6. 6\n",
"7. 7\n",
"8. 8\n",
"9. 9\n",
"10. 10\n",
"11. 11\n",
"12. 12\n",
"13. 13\n",
"14. 14\n",
"15. 15\n",
"16. 16\n",
"17. 17\n",
"18. 18\n",
"19. 19\n",
"20. 20\n",
"\n",
"\n"
],
"text/plain": [
" [1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"x <- 1:20 # a vector\n",
"x"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now create a `factor` type variable (of the same length) defining groups:"
]
},
{
"cell_type": "code",
"execution_count": 335,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
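"<ol class=list-inline><li>a</li><li>a</li><li>a</li><li>a</li><li>b</li><li>b</li><li>b</li><li>b</li><li>c</li><li>c</li><li>c</li><li>c</li><li>d</li><li>d</li><li>d</li><li>d</li><li>e</li><li>e</li><li>e</li><li>e</li></ol>\n",
"\n",
"<details>\n",
"\t<summary style=display:list-item;cursor:pointer>\n",
"\t\t<strong>Levels</strong>:\n",
"\t</summary>\n",
"\t<ol class=list-inline><li>'a'</li><li>'b'</li><li>'c'</li><li>'d'</li><li>'e'</li></ol>\n",
"</details>"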
],
"text/latex": [
"\\begin{enumerate*}\n",
"\\item a\n",
"\\item a\n",
"\\item a\n",
"\\item a\n",
"\\item b\n",
"\\item b\n",
"\\item b\n",
"\\item b\n",
"\\item c\n",
"\\item c\n",
"\\item c\n",
"\\item c\n",
"\\item d\n",
"\\item d\n",
"\\item d\n",
"\\item d\n",
"\\item e\n",
"\\item e\n",
"\\item e\n",
"\\item e\n",
"\\end{enumerate*}\n",
"\n",
"\\emph{Levels}: \\begin{enumerate*}\n",
"\\item 'a'\n",
"\\item 'b'\n",
"\\item 'c'\n",
"\\item 'd'\n",
"\\item 'e'\n",
"\\end{enumerate*}\n"
],
"text/markdown": [
"1. a\n",
"2. a\n",
"3. a\n",
"4. a\n",
"5. b\n",
"6. b\n",
"7. b\n",
"8. b\n",
"9. c\n",
"10. c\n",
"11. c\n",
"12. c\n",
"13. d\n",
"14. d\n",
"15. d\n",
"16. d\n",
"17. e\n",
"18. e\n",
"19. e\n",
"20. e\n",
"\n",
"\n",
"\n",
"**Levels**: 1. 'a'\n",
"2. 'b'\n",
"3. 'c'\n",
"4. 'd'\n",
"5. 'e'\n",
"\n",
"\n"
],
"text/plain": [
" [1] a a a a b b b b c c c c d d d d e e e e\n",
"Levels: a b c d e"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"y <- factor(rep(letters[1:5], each = 4)) \n",
"y"
]
},
{
"cell_type": "markdown",
"metadata": {
"tags": []
},
"source": [
"Now add up the values in x within each subgroup defined by y:"
]
},
{
"cell_type": "code",
"execution_count": 336,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<dl class=dl-inline><dt>a</dt><dd>10</dd><dt>b</dt><dd>26</dd><dt>c</dt><dd>42</dd><dt>d</dt><dd>58</dd><dt>e</dt><dd>74</dd></dl>\n"
],
"text/latex": [
"\\begin{description*}\n",
"\\item[a] 10\n",
"\\item[b] 26\n",
"\\item[c] 42\n",
"\\item[d] 58\n",
"\\item[e] 74\n",
"\\end{description*}\n"
],
"text/markdown": [
"a\n",
": 10b\n",
": 26c\n",
": 42d\n",
": 58e\n",
": 74\n",
"\n"
],
"text/plain": [
" a b c d e \n",
"10 26 42 58 74 "
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"tapply(x, y, sum)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Using `by`\n",
"\n",
"You can also do something similar to `tapply` with the `by` function, i.e., apply a function to a dataframe using some factor to define the subsets. Try this:\n",
"\n",
"First import some data:"
]
},
{
"cell_type": "code",
"execution_count": 337,
"metadata": {
"scrolled": true
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"The following objects are masked from iris (pos = 3):\n",
"\n",
" Petal.Length, Petal.Width, Sepal.Length, Sepal.Width, Species\n",
"\n",
"\n"
]
},
{
"data": {
"text/html": [
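"<table class=\"dataframe\">\n",
"<caption>A data.frame: 150 × 5</caption>\n",
"<thead>\n",
"\t<tr><th scope=col>Sepal.Length</th><th scope=col>Sepal.Width</th><th scope=col>Petal.Length</th><th scope=col>Petal.Width</th><th scope=col>Species</th></tr>\n",
"\t<tr><th scope=col>&lt;dbl&gt;</th><th scope=col>&lt;dbl&gt;</th><th scope=col>&lt;dbl&gt;</th><th scope=col>&lt;dbl&gt;</th><th scope=col>&lt;fct&gt;</th></tr>\n",
"</thead>\n",
"<tbody>\n",
"\t<tr><td>5.1</td><td>3.5</td><td>1.4</td><td>0.2</td><td>setosa</td></tr>\n",
"\t<tr><td>4.9</td><td>3.0</td><td>1.4</td><td>0.2</td><td>setosa</td></tr>\n",
"\t<tr><td>4.7</td><td>3.2</td><td>1.3</td><td>0.2</td><td>setosa</td></tr>\n",
"\t<tr><td>4.6</td><td>3.1</td><td>1.5</td><td>0.2</td><td>setosa</td></tr>\n",
"\t<tr><td>5.0</td><td>3.6</td><td>1.4</td><td>0.2</td><td>setosa</td></tr>\n",
"\t<tr><td>5.4</td><td>3.9</td><td>1.7</td><td>0.4</td><td>setosa</td></tr>\n",
"\t<tr><td>4.6</td><td>3.4</td><td>1.4</td><td>0.3</td><td>setosa</td></tr>\n",
"\t<tr><td>5.0</td><td>3.4</td><td>1.5</td><td>0.2</td><td>setosa</td></tr>\n",
"\t<tr><td>4.4</td><td>2.9</td><td>1.4</td><td>0.2</td><td>setosa</td></tr>\n",
"\t<tr><td>4.9</td><td>3.1</td><td>1.5</td><td>0.1</td><td>setosa</td></tr>\n",
"\t<tr><td>5.4</td><td>3.7</td><td>1.5</td><td>0.2</td><td>setosa</td></tr>\n",
"\t<tr><td>4.8</td><td>3.4</td><td>1.6</td><td>0.2</td><td>setosa</td></tr>\n",
"\t<tr><td>4.8</td><td>3.0</td><td>1.4</td><td>0.1</td><td>setosa</td></tr>\n",
"\t<tr><td>4.3</td><td>3.0</td><td>1.1</td><td>0.1</td><td>setosa</td></tr>\n",
"\t<tr><td>5.8</td><td>4.0</td><td>1.2</td><td>0.2</td><td>setosa</td></tr>\n",
"\t<tr><td>5.7</td><td>4.4</td><td>1.5</td><td>0.4</td><td>setosa</td></tr>\n",
"\t<tr><td>5.4</td><td>3.9</td><td>1.3</td><td>0.4</td><td>setosa</td></tr>\n",
"\t<tr><td>5.1</td><td>3.5</td><td>1.4</td><td>0.3</td><td>setosa</td></tr>\n",
"\t<tr><td>5.7</td><td>3.8</td><td>1.7</td><td>0.3</td><td>setosa</td></tr>\n",
"\t<tr><td>5.1</td><td>3.8</td><td>1.5</td><td>0.3</td><td>setosa</td></tr>\n",
"\t<tr><td>5.4</td><td>3.4</td><td>1.7</td><td>0.2</td><td>setosa</td></tr>\n",
"\t<tr><td>5.1</td><td>3.7</td><td>1.5</td><td>0.4</td><td>setosa</td></tr>\n",
"\t<tr><td>4.6</td><td>3.6</td><td>1.0</td><td>0.2</td><td>setosa</td></tr>\n",
"\t<tr><td>5.1</td><td>3.3</td><td>1.7</td><td>0.5</td><td>setosa</td></tr>\n",
"\t<tr><td>4.8</td><td>3.4</td><td>1.9</td><td>0.2</td><td>setosa</td></tr>\n",
"\t<tr><td>5.0</td><td>3.0</td><td>1.6</td><td>0.2</td><td>setosa</td></tr>\n",
"\t<tr><td>5.0</td><td>3.4</td><td>1.6</td><td>0.4</td><td>setosa</td></tr>\n",
"\t<tr><td>5.2</td><td>3.5</td><td>1.5</td><td>0.2</td><td>setosa</td></tr>\n",
"\t<tr><td>5.2</td><td>3.4</td><td>1.4</td><td>0.2</td><td>setosa</td></tr>\n",
"\t<tr><td>4.7</td><td>3.2</td><td>1.6</td><td>0.2</td><td>setosa</td></tr>\n",
"\t<tr><td>⋮</td><td>⋮</td><td>⋮</td><td>⋮</td><td>⋮</td></tr>\n",
"\t<tr><td>6.9</td><td>3.2</td><td>5.7</td><td>2.3</td><td>virginica</td></tr>\n",
"\t<tr><td>5.6</td><td>2.8</td><td>4.9</td><td>2.0</td><td>virginica</td></tr>\n",
"\t<tr><td>7.7</td><td>2.8</td><td>6.7</td><td>2.0</td><td>virginica</td></tr>\n",
"\t<tr><td>6.3</td><td>2.7</td><td>4.9</td><td>1.8</td><td>virginica</td></tr>\n",
"\t<tr><td>6.7</td><td>3.3</td><td>5.7</td><td>2.1</td><td>virginica</td></tr>\n",
"\t<tr><td>7.2</td><td>3.2</td><td>6.0</td><td>1.8</td><td>virginica</td></tr>\n",
"\t<tr><td>6.2</td><td>2.8</td><td>4.8</td><td>1.8</td><td>virginica</td></tr>\n",
"\t<tr><td>6.1</td><td>3.0</td><td>4.9</td><td>1.8</td><td>virginica</td></tr>\n",
"\t<tr><td>6.4</td><td>2.8</td><td>5.6</td><td>2.1</td><td>virginica</td></tr>\n",
"\t<tr><td>7.2</td><td>3.0</td><td>5.8</td><td>1.6</td><td>virginica</td></tr>\n",
"\t<tr><td>7.4</td><td>2.8</td><td>6.1</td><td>1.9</td><td>virginica</td></tr>\n",
"\t<tr><td>7.9</td><td>3.8</td><td>6.4</td><td>2.0</td><td>virginica</td></tr>\n",
"\t<tr><td>6.4</td><td>2.8</td><td>5.6</td><td>2.2</td><td>virginica</td></tr>\n",
"\t<tr><td>6.3</td><td>2.8</td><td>5.1</td><td>1.5</td><td>virginica</td></tr>\n",
"\t<tr><td>6.1</td><td>2.6</td><td>5.6</td><td>1.4</td><td>virginica</td></tr>\n",
"\t<tr><td>7.7</td><td>3.0</td><td>6.1</td><td>2.3</td><td>virginica</td></tr>\n",
"\t<tr><td>6.3</td><td>3.4</td><td>5.6</td><td>2.4</td><td>virginica</td></tr>\n",
"\t<tr><td>6.4</td><td>3.1</td><td>5.5</td><td>1.8</td><td>virginica</td></tr>\n",
"\t<tr><td>6.0</td><td>3.0</td><td>4.8</td><td>1.8</td><td>virginica</td></tr>\n",
"\t<tr><td>6.9</td><td>3.1</td><td>5.4</td><td>2.1</td><td>virginica</td></tr>\n",
"\t<tr><td>6.7</td><td>3.1</td><td>5.6</td><td>2.4</td><td>virginica</td></tr>\n",
"\t<tr><td>6.9</td><td>3.1</td><td>5.1</td><td>2.3</td><td>virginica</td></tr>\n",
"\t<tr><td>5.8</td><td>2.7</td><td>5.1</td><td>1.9</td><td>virginica</td></tr>\n",
"\t<tr><td>6.8</td><td>3.2</td><td>5.9</td><td>2.3</td><td>virginica</td></tr>\n",
"\t<tr><td>6.7</td><td>3.3</td><td>5.7</td><td>2.5</td><td>virginica</td></tr>\n",
"\t<tr><td>6.7</td><td>3.0</td><td>5.2</td><td>2.3</td><td>virginica</td></tr>\n",
"\t<tr><td>6.3</td><td>2.5</td><td>5.0</td><td>1.9</td><td>virginica</td></tr>\n",
"\t<tr><td>6.5</td><td>3.0</td><td>5.2</td><td>2.0</td><td>virginica</td></tr>\n",
"\t<tr><td>6.2</td><td>3.4</td><td>5.4</td><td>2.3</td><td>virginica</td></tr>\n",
"\t<tr><td>5.9</td><td>3.0</td><td>5.1</td><td>1.8</td><td>virginica</td></tr>\n",
"</tbody>\n",
"</table>\n"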
],
"text/latex": [
"A data.frame: 150 × 5\n",
"\\begin{tabular}{lllll}\n",
" Sepal.Length & Sepal.Width & Petal.Length & Petal.Width & Species\\\\\n",
" & & & & \\\\\n",
"\\hline\n",
"\t 5.1 & 3.5 & 1.4 & 0.2 & setosa\\\\\n",
"\t 4.9 & 3.0 & 1.4 & 0.2 & setosa\\\\\n",
"\t 4.7 & 3.2 & 1.3 & 0.2 & setosa\\\\\n",
"\t 4.6 & 3.1 & 1.5 & 0.2 & setosa\\\\\n",
"\t 5.0 & 3.6 & 1.4 & 0.2 & setosa\\\\\n",
"\t 5.4 & 3.9 & 1.7 & 0.4 & setosa\\\\\n",
"\t 4.6 & 3.4 & 1.4 & 0.3 & setosa\\\\\n",
"\t 5.0 & 3.4 & 1.5 & 0.2 & setosa\\\\\n",
"\t 4.4 & 2.9 & 1.4 & 0.2 & setosa\\\\\n",
"\t 4.9 & 3.1 & 1.5 & 0.1 & setosa\\\\\n",
"\t 5.4 & 3.7 & 1.5 & 0.2 & setosa\\\\\n",
"\t 4.8 & 3.4 & 1.6 & 0.2 & setosa\\\\\n",
"\t 4.8 & 3.0 & 1.4 & 0.1 & setosa\\\\\n",
"\t 4.3 & 3.0 & 1.1 & 0.1 & setosa\\\\\n",
"\t 5.8 & 4.0 & 1.2 & 0.2 & setosa\\\\\n",
"\t 5.7 & 4.4 & 1.5 & 0.4 & setosa\\\\\n",
"\t 5.4 & 3.9 & 1.3 & 0.4 & setosa\\\\\n",
"\t 5.1 & 3.5 & 1.4 & 0.3 & setosa\\\\\n",
"\t 5.7 & 3.8 & 1.7 & 0.3 & setosa\\\\\n",
"\t 5.1 & 3.8 & 1.5 & 0.3 & setosa\\\\\n",
"\t 5.4 & 3.4 & 1.7 & 0.2 & setosa\\\\\n",
"\t 5.1 & 3.7 & 1.5 & 0.4 & setosa\\\\\n",
"\t 4.6 & 3.6 & 1.0 & 0.2 & setosa\\\\\n",
"\t 5.1 & 3.3 & 1.7 & 0.5 & setosa\\\\\n",
"\t 4.8 & 3.4 & 1.9 & 0.2 & setosa\\\\\n",
"\t 5.0 & 3.0 & 1.6 & 0.2 & setosa\\\\\n",
"\t 5.0 & 3.4 & 1.6 & 0.4 & setosa\\\\\n",
"\t 5.2 & 3.5 & 1.5 & 0.2 & setosa\\\\\n",
"\t 5.2 & 3.4 & 1.4 & 0.2 & setosa\\\\\n",
"\t 4.7 & 3.2 & 1.6 & 0.2 & setosa\\\\\n",
"\t ⋮ & ⋮ & ⋮ & ⋮ & ⋮\\\\\n",
"\t 6.9 & 3.2 & 5.7 & 2.3 & virginica\\\\\n",
"\t 5.6 & 2.8 & 4.9 & 2.0 & virginica\\\\\n",
"\t 7.7 & 2.8 & 6.7 & 2.0 & virginica\\\\\n",
"\t 6.3 & 2.7 & 4.9 & 1.8 & virginica\\\\\n",
"\t 6.7 & 3.3 & 5.7 & 2.1 & virginica\\\\\n",
"\t 7.2 & 3.2 & 6.0 & 1.8 & virginica\\\\\n",
"\t 6.2 & 2.8 & 4.8 & 1.8 & virginica\\\\\n",
"\t 6.1 & 3.0 & 4.9 & 1.8 & virginica\\\\\n",
"\t 6.4 & 2.8 & 5.6 & 2.1 & virginica\\\\\n",
"\t 7.2 & 3.0 & 5.8 & 1.6 & virginica\\\\\n",
"\t 7.4 & 2.8 & 6.1 & 1.9 & virginica\\\\\n",
"\t 7.9 & 3.8 & 6.4 & 2.0 & virginica\\\\\n",
"\t 6.4 & 2.8 & 5.6 & 2.2 & virginica\\\\\n",
"\t 6.3 & 2.8 & 5.1 & 1.5 & virginica\\\\\n",
"\t 6.1 & 2.6 & 5.6 & 1.4 & virginica\\\\\n",
"\t 7.7 & 3.0 & 6.1 & 2.3 & virginica\\\\\n",
"\t 6.3 & 3.4 & 5.6 & 2.4 & virginica\\\\\n",
"\t 6.4 & 3.1 & 5.5 & 1.8 & virginica\\\\\n",
"\t 6.0 & 3.0 & 4.8 & 1.8 & virginica\\\\\n",
"\t 6.9 & 3.1 & 5.4 & 2.1 & virginica\\\\\n",
"\t 6.7 & 3.1 & 5.6 & 2.4 & virginica\\\\\n",
"\t 6.9 & 3.1 & 5.1 & 2.3 & virginica\\\\\n",
"\t 5.8 & 2.7 & 5.1 & 1.9 & virginica\\\\\n",
"\t 6.8 & 3.2 & 5.9 & 2.3 & virginica\\\\\n",
"\t 6.7 & 3.3 & 5.7 & 2.5 & virginica\\\\\n",
"\t 6.7 & 3.0 & 5.2 & 2.3 & virginica\\\\\n",
"\t 6.3 & 2.5 & 5.0 & 1.9 & virginica\\\\\n",
"\t 6.5 & 3.0 & 5.2 & 2.0 & virginica\\\\\n",
"\t 6.2 & 3.4 & 5.4 & 2.3 & virginica\\\\\n",
"\t 5.9 & 3.0 & 5.1 & 1.8 & virginica\\\\\n",
"\\end{tabular}\n"
],
"text/markdown": [
"\n",
"A data.frame: 150 × 5\n",
"\n",
"| Sepal.Length <dbl> | Sepal.Width <dbl> | Petal.Length <dbl> | Petal.Width <dbl> | Species <fct> |\n",
"|---|---|---|---|---|\n",
"| 5.1 | 3.5 | 1.4 | 0.2 | setosa |\n",
"| 4.9 | 3.0 | 1.4 | 0.2 | setosa |\n",
"| 4.7 | 3.2 | 1.3 | 0.2 | setosa |\n",
"| 4.6 | 3.1 | 1.5 | 0.2 | setosa |\n",
"| 5.0 | 3.6 | 1.4 | 0.2 | setosa |\n",
"| 5.4 | 3.9 | 1.7 | 0.4 | setosa |\n",
"| 4.6 | 3.4 | 1.4 | 0.3 | setosa |\n",
"| 5.0 | 3.4 | 1.5 | 0.2 | setosa |\n",
"| 4.4 | 2.9 | 1.4 | 0.2 | setosa |\n",
"| 4.9 | 3.1 | 1.5 | 0.1 | setosa |\n",
"| 5.4 | 3.7 | 1.5 | 0.2 | setosa |\n",
"| 4.8 | 3.4 | 1.6 | 0.2 | setosa |\n",
"| 4.8 | 3.0 | 1.4 | 0.1 | setosa |\n",
"| 4.3 | 3.0 | 1.1 | 0.1 | setosa |\n",
"| 5.8 | 4.0 | 1.2 | 0.2 | setosa |\n",
"| 5.7 | 4.4 | 1.5 | 0.4 | setosa |\n",
"| 5.4 | 3.9 | 1.3 | 0.4 | setosa |\n",
"| 5.1 | 3.5 | 1.4 | 0.3 | setosa |\n",
"| 5.7 | 3.8 | 1.7 | 0.3 | setosa |\n",
"| 5.1 | 3.8 | 1.5 | 0.3 | setosa |\n",
"| 5.4 | 3.4 | 1.7 | 0.2 | setosa |\n",
"| 5.1 | 3.7 | 1.5 | 0.4 | setosa |\n",
"| 4.6 | 3.6 | 1.0 | 0.2 | setosa |\n",
"| 5.1 | 3.3 | 1.7 | 0.5 | setosa |\n",
"| 4.8 | 3.4 | 1.9 | 0.2 | setosa |\n",
"| 5.0 | 3.0 | 1.6 | 0.2 | setosa |\n",
"| 5.0 | 3.4 | 1.6 | 0.4 | setosa |\n",
"| 5.2 | 3.5 | 1.5 | 0.2 | setosa |\n",
"| 5.2 | 3.4 | 1.4 | 0.2 | setosa |\n",
"| 4.7 | 3.2 | 1.6 | 0.2 | setosa |\n",
"| ⋮ | ⋮ | ⋮ | ⋮ | ⋮ |\n",
"| 6.9 | 3.2 | 5.7 | 2.3 | virginica |\n",
"| 5.6 | 2.8 | 4.9 | 2.0 | virginica |\n",
"| 7.7 | 2.8 | 6.7 | 2.0 | virginica |\n",
"| 6.3 | 2.7 | 4.9 | 1.8 | virginica |\n",
"| 6.7 | 3.3 | 5.7 | 2.1 | virginica |\n",
"| 7.2 | 3.2 | 6.0 | 1.8 | virginica |\n",
"| 6.2 | 2.8 | 4.8 | 1.8 | virginica |\n",
"| 6.1 | 3.0 | 4.9 | 1.8 | virginica |\n",
"| 6.4 | 2.8 | 5.6 | 2.1 | virginica |\n",
"| 7.2 | 3.0 | 5.8 | 1.6 | virginica |\n",
"| 7.4 | 2.8 | 6.1 | 1.9 | virginica |\n",
"| 7.9 | 3.8 | 6.4 | 2.0 | virginica |\n",
"| 6.4 | 2.8 | 5.6 | 2.2 | virginica |\n",
"| 6.3 | 2.8 | 5.1 | 1.5 | virginica |\n",
"| 6.1 | 2.6 | 5.6 | 1.4 | virginica |\n",
"| 7.7 | 3.0 | 6.1 | 2.3 | virginica |\n",
"| 6.3 | 3.4 | 5.6 | 2.4 | virginica |\n",
"| 6.4 | 3.1 | 5.5 | 1.8 | virginica |\n",
"| 6.0 | 3.0 | 4.8 | 1.8 | virginica |\n",
"| 6.9 | 3.1 | 5.4 | 2.1 | virginica |\n",
"| 6.7 | 3.1 | 5.6 | 2.4 | virginica |\n",
"| 6.9 | 3.1 | 5.1 | 2.3 | virginica |\n",
"| 5.8 | 2.7 | 5.1 | 1.9 | virginica |\n",
"| 6.8 | 3.2 | 5.9 | 2.3 | virginica |\n",
"| 6.7 | 3.3 | 5.7 | 2.5 | virginica |\n",
"| 6.7 | 3.0 | 5.2 | 2.3 | virginica |\n",
"| 6.3 | 2.5 | 5.0 | 1.9 | virginica |\n",
"| 6.5 | 3.0 | 5.2 | 2.0 | virginica |\n",
"| 6.2 | 3.4 | 5.4 | 2.3 | virginica |\n",
"| 5.9 | 3.0 | 5.1 | 1.8 | virginica |\n",
"\n"
],
"text/plain": [
" Sepal.Length Sepal.Width Petal.Length Petal.Width Species \n",
"1 5.1 3.5 1.4 0.2 setosa \n",
"2 4.9 3.0 1.4 0.2 setosa \n",
"3 4.7 3.2 1.3 0.2 setosa \n",
"4 4.6 3.1 1.5 0.2 setosa \n",
"5 5.0 3.6 1.4 0.2 setosa \n",
"6 5.4 3.9 1.7 0.4 setosa \n",
"7 4.6 3.4 1.4 0.3 setosa \n",
"8 5.0 3.4 1.5 0.2 setosa \n",
"9 4.4 2.9 1.4 0.2 setosa \n",
"10 4.9 3.1 1.5 0.1 setosa \n",
"11 5.4 3.7 1.5 0.2 setosa \n",
"12 4.8 3.4 1.6 0.2 setosa \n",
"13 4.8 3.0 1.4 0.1 setosa \n",
"14 4.3 3.0 1.1 0.1 setosa \n",
"15 5.8 4.0 1.2 0.2 setosa \n",
"16 5.7 4.4 1.5 0.4 setosa \n",
"17 5.4 3.9 1.3 0.4 setosa \n",
"18 5.1 3.5 1.4 0.3 setosa \n",
"19 5.7 3.8 1.7 0.3 setosa \n",
"20 5.1 3.8 1.5 0.3 setosa \n",
"21 5.4 3.4 1.7 0.2 setosa \n",
"22 5.1 3.7 1.5 0.4 setosa \n",
"23 4.6 3.6 1.0 0.2 setosa \n",
"24 5.1 3.3 1.7 0.5 setosa \n",
"25 4.8 3.4 1.9 0.2 setosa \n",
"26 5.0 3.0 1.6 0.2 setosa \n",
"27 5.0 3.4 1.6 0.4 setosa \n",
"28 5.2 3.5 1.5 0.2 setosa \n",
"29 5.2 3.4 1.4 0.2 setosa \n",
"30 4.7 3.2 1.6 0.2 setosa \n",
"⋮ ⋮ ⋮ ⋮ ⋮ ⋮ \n",
"121 6.9 3.2 5.7 2.3 virginica\n",
"122 5.6 2.8 4.9 2.0 virginica\n",
"123 7.7 2.8 6.7 2.0 virginica\n",
"124 6.3 2.7 4.9 1.8 virginica\n",
"125 6.7 3.3 5.7 2.1 virginica\n",
"126 7.2 3.2 6.0 1.8 virginica\n",
"127 6.2 2.8 4.8 1.8 virginica\n",
"128 6.1 3.0 4.9 1.8 virginica\n",
"129 6.4 2.8 5.6 2.1 virginica\n",
"130 7.2 3.0 5.8 1.6 virginica\n",
"131 7.4 2.8 6.1 1.9 virginica\n",
"132 7.9 3.8 6.4 2.0 virginica\n",
"133 6.4 2.8 5.6 2.2 virginica\n",
"134 6.3 2.8 5.1 1.5 virginica\n",
"135 6.1 2.6 5.6 1.4 virginica\n",
"136 7.7 3.0 6.1 2.3 virginica\n",
"137 6.3 3.4 5.6 2.4 virginica\n",
"138 6.4 3.1 5.5 1.8 virginica\n",
"139 6.0 3.0 4.8 1.8 virginica\n",
"140 6.9 3.1 5.4 2.1 virginica\n",
"141 6.7 3.1 5.6 2.4 virginica\n",
"142 6.9 3.1 5.1 2.3 virginica\n",
"143 5.8 2.7 5.1 1.9 virginica\n",
"144 6.8 3.2 5.9 2.3 virginica\n",
"145 6.7 3.3 5.7 2.5 virginica\n",
"146 6.7 3.0 5.2 2.3 virginica\n",
"147 6.3 2.5 5.0 1.9 virginica\n",
"148 6.5 3.0 5.2 2.0 virginica\n",
"149 6.2 3.4 5.4 2.3 virginica\n",
"150 5.9 3.0 5.1 1.8 virginica"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"attach(iris)\n",
"iris"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now run the `colMeans` function (it is better for dataframes than just mean) on multiple columns:"
]
},
{
"cell_type": "code",
"execution_count": 338,
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
"text/plain": [
"iris$Species: setosa\n",
"Sepal.Length Sepal.Width \n",
" 5.006 3.428 \n",
"------------------------------------------------------------ \n",
"iris$Species: versicolor\n",
"Sepal.Length Sepal.Width \n",
" 5.936 2.770 \n",
"------------------------------------------------------------ \n",
"iris$Species: virginica\n",
"Sepal.Length Sepal.Width \n",
" 6.588 2.974 "
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"by(iris[,1:2], iris$Species, colMeans)"
]
},
{
"cell_type": "code",
"execution_count": 339,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"iris$Petal.Width: 0.1\n",
"Sepal.Length Sepal.Width \n",
" 4.82 3.36 \n",
"------------------------------------------------------------ \n",
"iris$Petal.Width: 0.2\n",
"Sepal.Length Sepal.Width \n",
" 4.972414 3.379310 \n",
"------------------------------------------------------------ \n",
"iris$Petal.Width: 0.3\n",
"Sepal.Length Sepal.Width \n",
" 4.971429 3.328571 \n",
"------------------------------------------------------------ \n",
"iris$Petal.Width: 0.4\n",
"Sepal.Length Sepal.Width \n",
" 5.300000 3.785714 \n",
"------------------------------------------------------------ \n",
"iris$Petal.Width: 0.5\n",
"Sepal.Length Sepal.Width \n",
" 5.1 3.3 \n",
"------------------------------------------------------------ \n",
"iris$Petal.Width: 0.6\n",
"Sepal.Length Sepal.Width \n",
" 5.0 3.5 \n",
"------------------------------------------------------------ \n",
"iris$Petal.Width: 1\n",
"Sepal.Length Sepal.Width \n",
" 5.414286 2.371429 \n",
"------------------------------------------------------------ \n",
"iris$Petal.Width: 1.1\n",
"Sepal.Length Sepal.Width \n",
" 5.400000 2.466667 \n",
"------------------------------------------------------------ \n",
"iris$Petal.Width: 1.2\n",
"Sepal.Length Sepal.Width \n",
" 5.78 2.74 \n",
"------------------------------------------------------------ \n",
"iris$Petal.Width: 1.3\n",
"Sepal.Length Sepal.Width \n",
" 5.884615 2.746154 \n",
"------------------------------------------------------------ \n",
"iris$Petal.Width: 1.4\n",
"Sepal.Length Sepal.Width \n",
" 6.3250 2.9125 \n",
"------------------------------------------------------------ \n",
"iris$Petal.Width: 1.5\n",
"Sepal.Length Sepal.Width \n",
" 6.183333 2.816667 \n",
"------------------------------------------------------------ \n",
"iris$Petal.Width: 1.6\n",
"Sepal.Length Sepal.Width \n",
" 6.375 3.100 \n",
"------------------------------------------------------------ \n",
"iris$Petal.Width: 1.7\n",
"Sepal.Length Sepal.Width \n",
" 5.80 2.75 \n",
"------------------------------------------------------------ \n",
"iris$Petal.Width: 1.8\n",
"Sepal.Length Sepal.Width \n",
" 6.400000 2.941667 \n",
"------------------------------------------------------------ \n",
"iris$Petal.Width: 1.9\n",
"Sepal.Length Sepal.Width \n",
" 6.34 2.68 \n",
"------------------------------------------------------------ \n",
"iris$Petal.Width: 2\n",
"Sepal.Length Sepal.Width \n",
" 6.650000 3.016667 \n",
"------------------------------------------------------------ \n",
"iris$Petal.Width: 2.1\n",
"Sepal.Length Sepal.Width \n",
" 6.916667 3.033333 \n",
"------------------------------------------------------------ \n",
"iris$Petal.Width: 2.2\n",
"Sepal.Length Sepal.Width \n",
" 6.866667 3.200000 \n",
"------------------------------------------------------------ \n",
"iris$Petal.Width: 2.3\n",
"Sepal.Length Sepal.Width \n",
" 6.9125 3.0875 \n",
"------------------------------------------------------------ \n",
"iris$Petal.Width: 2.4\n",
"Sepal.Length Sepal.Width \n",
" 6.266667 3.100000 \n",
"------------------------------------------------------------ \n",
"iris$Petal.Width: 2.5\n",
"Sepal.Length Sepal.Width \n",
" 6.733333 3.400000 "
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"by(iris[,1:2], iris$Petal.Width, colMeans)"
]
},
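{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a rough sketch of what `by` is doing in the first call above, the per-species column means can also be obtained by splitting the data frame explicitly and applying `colMeans` to each piece. This assumes only the built-in `iris` dataset; the names `groups` and `species_means` are made up for illustration:\n",
"\n",
"```r\n",
"# Split the first two columns of iris into one data frame per species\n",
"groups <- split(iris[, 1:2], iris$Species)\n",
"\n",
"# Apply colMeans to each piece; sapply simplifies the result to a matrix\n",
"# with one row per measured column and one column per species\n",
"species_means <- sapply(groups, colMeans)\n",
"print(species_means)\n",
"```\n",
"\n",
"The numbers should match the `by` output for `iris$Species`; `by` mainly saves you the explicit `split` step and labels the output by group."
]
},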
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Using `replicate`\n",
"\n",
"The `replicate` function is useful to avoid a loop for function that typically involves random number generation (more on this below). For example:"
]
},
{
"cell_type": "code",
"execution_count": 340,
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
"text/html": [
"\n",
"A matrix: 5 × 10 of type dbl\n",
"\n",
"\t0.05779543 | 0.79875145 | 0.48499984 | 0.86907132 | 0.1283642 | 0.0009810759 | 0.6536405 | 0.8683811 | 0.8454915 | 0.8783501 |
\n",
"\t0.45615105 | 0.57098740 | 0.23039036 | 0.30154024 | 0.2747319 | 0.2428520240 | 0.9098087 | 0.6079378 | 0.6468676 | 0.7984627 |
\n",
"\t0.56282049 | 0.04613426 | 0.62504519 | 0.04747557 | 0.7813629 | 0.8418969549 | 0.4024599 | 0.6946761 | 0.6771371 | 0.2471511 |
\n",
"\t0.47753585 | 0.69846828 | 0.46303515 | 0.25758230 | 0.2651486 | 0.1114152884 | 0.1513297 | 0.3396491 | 0.8450003 | 0.3601890 |
\n",
"\t0.63433648 | 0.40640785 | 0.07475354 | 0.04436429 | 0.0915491 | 0.2450728263 | 0.4717760 | 0.4990844 | 0.7130802 | 0.9336190 |
\n",
"\n",
"
\n"
],
"text/latex": [
"A matrix: 5 × 10 of type dbl\n",
"\\begin{tabular}{llllllllll}\n",
"\t 0.05779543 & 0.79875145 & 0.48499984 & 0.86907132 & 0.1283642 & 0.0009810759 & 0.6536405 & 0.8683811 & 0.8454915 & 0.8783501\\\\\n",
"\t 0.45615105 & 0.57098740 & 0.23039036 & 0.30154024 & 0.2747319 & 0.2428520240 & 0.9098087 & 0.6079378 & 0.6468676 & 0.7984627\\\\\n",
"\t 0.56282049 & 0.04613426 & 0.62504519 & 0.04747557 & 0.7813629 & 0.8418969549 & 0.4024599 & 0.6946761 & 0.6771371 & 0.2471511\\\\\n",
"\t 0.47753585 & 0.69846828 & 0.46303515 & 0.25758230 & 0.2651486 & 0.1114152884 & 0.1513297 & 0.3396491 & 0.8450003 & 0.3601890\\\\\n",
"\t 0.63433648 & 0.40640785 & 0.07475354 & 0.04436429 & 0.0915491 & 0.2450728263 & 0.4717760 & 0.4990844 & 0.7130802 & 0.9336190\\\\\n",
"\\end{tabular}\n"
],
"text/markdown": [
"\n",
"A matrix: 5 × 10 of type dbl\n",
"\n",
"| 0.05779543 | 0.79875145 | 0.48499984 | 0.86907132 | 0.1283642 | 0.0009810759 | 0.6536405 | 0.8683811 | 0.8454915 | 0.8783501 |\n",
"| 0.45615105 | 0.57098740 | 0.23039036 | 0.30154024 | 0.2747319 | 0.2428520240 | 0.9098087 | 0.6079378 | 0.6468676 | 0.7984627 |\n",
"| 0.56282049 | 0.04613426 | 0.62504519 | 0.04747557 | 0.7813629 | 0.8418969549 | 0.4024599 | 0.6946761 | 0.6771371 | 0.2471511 |\n",
"| 0.47753585 | 0.69846828 | 0.46303515 | 0.25758230 | 0.2651486 | 0.1114152884 | 0.1513297 | 0.3396491 | 0.8450003 | 0.3601890 |\n",
"| 0.63433648 | 0.40640785 | 0.07475354 | 0.04436429 | 0.0915491 | 0.2450728263 | 0.4717760 | 0.4990844 | 0.7130802 | 0.9336190 |\n",
"\n"
],
"text/plain": [
" [,1] [,2] [,3] [,4] [,5] [,6] \n",
"[1,] 0.05779543 0.79875145 0.48499984 0.86907132 0.1283642 0.0009810759\n",
"[2,] 0.45615105 0.57098740 0.23039036 0.30154024 0.2747319 0.2428520240\n",
"[3,] 0.56282049 0.04613426 0.62504519 0.04747557 0.7813629 0.8418969549\n",
"[4,] 0.47753585 0.69846828 0.46303515 0.25758230 0.2651486 0.1114152884\n",
"[5,] 0.63433648 0.40640785 0.07475354 0.04436429 0.0915491 0.2450728263\n",
" [,7] [,8] [,9] [,10] \n",
"[1,] 0.6536405 0.8683811 0.8454915 0.8783501\n",
"[2,] 0.9098087 0.6079378 0.6468676 0.7984627\n",
"[3,] 0.4024599 0.6946761 0.6771371 0.2471511\n",
"[4,] 0.1513297 0.3396491 0.8450003 0.3601890\n",
"[5,] 0.4717760 0.4990844 0.7130802 0.9336190"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"replicate(10, runif(5))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"That is, you just generated 10 sets (columns) of 5 uniformly-distributed random numbers (a 10 $\\times$ 5 matrix). \n",
"\n",
"```{note}\n",
"The actual numbers you get will be different from the ones you see here unless you set the random number seed [as we learned above](R-random-numbers).\n",
"```"
]
},
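{
"cell_type": "markdown",
"metadata": {},
"source": [
"To make the link with loops and seeds concrete, here is a minimal sketch that fixes the seed, builds the matrix with `replicate`, and then rebuilds it with an explicit `for` loop. The seed value `1234` and the names `m1` and `m2` are arbitrary choices for illustration:\n",
"\n",
"```r\n",
"set.seed(1234)                   # fix the seed so the draws are repeatable\n",
"m1 <- replicate(10, runif(5))    # one column of 5 draws per replicate\n",
"\n",
"set.seed(1234)                   # reset the seed to get the same draws again\n",
"m2 <- matrix(NA, nrow = 5, ncol = 10)   # pre-allocate the result\n",
"for (i in 1:10) {\n",
"    m2[, i] <- runif(5)          # fill column i with 5 new draws\n",
"}\n",
"\n",
"dim(m1)                          # rows, then columns\n",
"all.equal(m1, m2)                # compare the two approaches\n",
"```\n",
"\n",
"Both versions draw the same numbers in the same order, so `all.equal` should report that the two matrices match; `replicate` simply hides the bookkeeping of pre-allocating and filling the matrix."
]
},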
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Using `plyr` and `ddply`\n",
"\n",
"The `plyr` package combines the functionality of the `*apply` family, into a few handy functions. Look up the [web page](http://plyr.had.co.nz/).\n",
"\n",
"In particular, `ddply` is very useful, because for each subset of a data frame, it applies a function and then combines results into another data frame. In other words, \"ddply\" means: take a data frame, split it up, do something to it, and return a data frame. Look up [this](http://seananderson.ca/2013/12/01/plyr.html) and \n",
"[this](https://www.r-bloggers.com/transforming-subsets-of-data-in-r-with-by-ddply-and-data-table/) \n",
"for examples. There you will also see a comparison of speed of `ddply` vs `by` at the latter web page; `ddply` is actually slower than other vectorized methods, as it trades-off compactness of use for some of the speed of vectorization! Indeed, overall functions in `plyr` can be slow if you are working with very large datasets that involve a lot of subsetting (analyses by many groups or grouping variables). \n",
"\n",
"The base `*apply` functions remain useful and worth knowing even if you do get into `plyr` or better still, `dplyr`, which we will see in the [Data chapter](08-Data_R.ipynb).\n",
"\n",
"\n",
"## Practicals\n",
" \n",
"### A vectorization challenge\n",
"\n",
"The Ricker model is a classic discrete population model which was introduced in 1954 by Ricker to model recruitment of stock in fisheries. It gives the expected number (or density) $N_{t+1}$ of individuals in generation $t + 1$ as a function of the number of individuals in the previous generation $t$:\n",
"\n",
"$$ N_{t+1}= N_t e^{r\\left(1-\\frac{N_t}{k}\\right)} $$\n",
"\n",
"Here $r$ is intrinsic growth rate and $k$ as the carrying capacity of the environment. Try this script (call it `Ricker.R` and save it to `code`) that runs it:"
]
},
{
"cell_type": "code",
"execution_count": 341,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAtAAAALQCAMAAACOibeuAAADAFBMVEUAAAABAQECAgIDAwME\nBAQFBQUGBgYHBwcICAgJCQkKCgoLCwsMDAwNDQ0ODg4PDw8QEBARERESEhITExMUFBQVFRUW\nFhYXFxcYGBgZGRkaGhobGxscHBwdHR0eHh4fHx8gICAhISEiIiIjIyMkJCQlJSUmJiYnJyco\nKCgpKSkqKiorKyssLCwtLS0uLi4vLy8wMDAxMTEyMjIzMzM0NDQ1NTU2NjY3Nzc4ODg5OTk6\nOjo7Ozs8PDw9PT0+Pj4/Pz9AQEBBQUFCQkJDQ0NERERFRUVGRkZHR0dISEhJSUlKSkpLS0tM\nTExNTU1OTk5PT09QUFBRUVFSUlJTU1NUVFRVVVVWVlZXV1dYWFhZWVlaWlpbW1tcXFxdXV1e\nXl5fX19gYGBhYWFiYmJjY2NkZGRlZWVmZmZnZ2doaGhpaWlqampra2tsbGxtbW1ubm5vb29w\ncHBxcXFycnJzc3N0dHR1dXV2dnZ3d3d4eHh5eXl6enp7e3t8fHx9fX1+fn5/f3+AgICBgYGC\ngoKDg4OEhISFhYWGhoaHh4eIiIiJiYmKioqLi4uMjIyNjY2Ojo6Pj4+QkJCRkZGSkpKTk5OU\nlJSVlZWWlpaXl5eYmJiZmZmampqbm5ucnJydnZ2enp6fn5+goKChoaGioqKjo6OkpKSlpaWm\npqanp6eoqKipqamqqqqrq6usrKytra2urq6vr6+wsLCxsbGysrKzs7O0tLS1tbW2tra3t7e4\nuLi5ubm6urq7u7u8vLy9vb2+vr6/v7/AwMDBwcHCwsLDw8PExMTFxcXGxsbHx8fIyMjJycnK\nysrLy8vMzMzNzc3Ozs7Pz8/Q0NDR0dHS0tLT09PU1NTV1dXW1tbX19fY2NjZ2dna2trb29vc\n3Nzd3d3e3t7f39/g4ODh4eHi4uLj4+Pk5OTl5eXm5ubn5+fo6Ojp6enq6urr6+vs7Ozt7e3u\n7u7v7+/w8PDx8fHy8vLz8/P09PT19fX29vb39/f4+Pj5+fn6+vr7+/v8/Pz9/f3+/v7////i\nsF19AAAACXBIWXMAABJ0AAASdAHeZh94AAAgAElEQVR4nO3dd3xUVfrH8TPpIYUOgQBSBQMB\nFAQEQg29SpGmyCIg1VVWRVBgsYKAoi62RX6AyIruuthAARWk6ApSBASkSO81BELq/c0kYQIh\nmUzmnHuee879vv9IhjBzz+Pu5xXuzNy5lxkAGmHUAwCIhKBBKwgatIKgQSsIGrSCoEErCBq0\ngqBBKwgatIKgQSsIGrSCoEErCBq0gqBBKwgatIKgQSsIGrSCoEErCBq0gqBBKwgatIKgQSsI\nGrSCoEErCBq0gqBBKwgatIKgQSsIGrSCoEErCBq0gqBBKwgatIKgQSsIGrSCoEErCBq0gqBB\nKwgatIKgQSsIGrSCoEErCBq0gqBBKwgatIKgQSsIGrSCoEErCBq0gqBBKwgatIKgQSsIGrSC\noEErCBq0gqBBKwgatIKgQSsIGrSCoEErCBq0gqBBKwgatIKgQSsIGrSCoEErCBq0gqBBKwga\ntIKgQSsIGrSCoEErCBq0gqBBKwgatIKgQSsIGrSCoEErCBq0gqBBKwgatIKgQSsIGrSCoEEr\nCBq0gqBBKwgatIKgQSsIGrSCoEErCBq0gqBBKwgatIKgQSsIGrSCoEErEoLethnAJ9sKX5v5\nQW9iAD7aVOjczA96A0s2fQ3QUjLbUOjHIGiwLAQNWkHQoBUEDVpB0KAVBA1akR/05WMn0gu6\nD4IGH0kOesfgKMaYf/SA9R7vhqDBR3KDHutg5Rp37tykAmPDPN0PQYOPpAY9l3XYknVrZz82\n28MdETT4SGrQTWum3riZEdfMwx0RNPhIatCRD+fcnlTUwx0RNPhI7m/oWmnu26
3xGxpMIHkf\nutNvWbf2DmSvergjggYfyX2VYyRjFZt379GiCmNDMjzcD0GDjyS/Dr11QCnX69DlBqzxeDcE\nDT6S/07hxSMn83yn8FTneLf6LIlrDXtKP7Z+yfTRfXNpF6+tPqm3/49gnWM5EqdOcOvArpiy\nhp6S9qycN3lwyyqBjJW9t/fjE2xjRh67rdYJ+mbvImgvXNi56r0JfZtV9WOB5Rr0nfDeF5sT\nqEeih6DVc2HzJ9Mf69ugKGMhVeNHTH1v1YG0gh9kFwhaGdcPrFo4fUR81QDGijfoOmL6J5uP\nU49kQTKDfqvYLTzcE0Hf5MaehYMFVW2WuWeRSD2ShckMet9jwSyijpuHeyJow0g57tyzGNG1\nQUTWnsX0hdiz8IbcXY5vWFev7mf3oK+PbBbtx/yimw2YMPfLHXiqVwiS96HvRNDeGFb++YVr\nDqZQj6EiyUEPut+ru9k86HcC11GPoCy8ymE9PwW/Tz2CuhC05ZyMfpR6BIUhaKtJiWtynXoG\nhSFoqxlZ9hj1CCpD0BazMHAt9QhKQ9DW8mvoXOoR1IagLeVUhYeoR1AcgraS1JZ3X6OeQXEI\n2krGlTlCPYLqELSFfBj4A/UIykPQ1rG1yBvUI6gPQVvGuSqDqEfQAIK2irT29a5Sz6ABBG0V\n40scoB5BBwjaIpb4f0M9ghYQtDVsD5tFPYIeELQlnK/Wy9O5/sBrCNoK0jvVxSe5xUDQVvB0\n8f3UI+gCQVvAZ/7LqUfQBoKmtzvyFeoR9IGgyV2o3hNPCIVB0NTSu9S6TD2DRhA0tWcjf6ce\nQScImtgy/6+oR9AKgqa1p+jz1CPoBUGTSojpnucFZ8BXCJpSRu+al6hn0AyCpjQ1Yhf1CLpB\n0IS+8P839QjaQdB09habQj2CfhA0mSu12+MaE8IhaCoZfWtcpJ5BQwiayovhO6hH0BGCJrIy\nYCn1CFpC0DT+LDmJegQ9IWgSibHt8ITQFAiaQka/ymepZ9AUgqYwI/RX6hF0haAJrA74F/UI\n2kLQ8h0q9RT1CPpC0NJda9A2lXoGfSFo6YbccYZ6BI0haNlmh2ymHkFnCFqydYHzqUfQGoKW\n60jpx6lH0BuClirp3mbJ1DPoDUFLNbTiaeoRNIegZXoz5BfqEXSHoCXaEDSPegTtIWh5TpQf\nQz2C/hC0NCnN7sMTQtMhaGmGRx2nHsEGELQs7wSuox7BDhC0JD8Fv089gi0gaDlORj9KPYI9\nIGgpUuKaXKeewR4QtBQjyx6lHsEmELQMiwLXUo9gFwhagi2hc6lHsA0Ebb5TFR6iHsE+ELTp\nUlvefY16BvtA0KYbV/JP6hFsBEGb7cPAH6hHsBMEbbKtReZQj2ArCNpc56sMoh7BXhC0qdI6\n1LtKPYO9IGhTjS9xgHoEm0HQZlri/w31CHaDoE20PWwW9Qi2g6DNc75arwzqGWwHQZunT91E\n6hHsB0Gb5lTA99Qj2BCCNs3r0bgukHwI2jQNJ1BPYEcI2ix/sN+oR7AjBG2WybWpJ7AlBG2W\nGjOoJ7AlBG2SDX5HqEewJQRtktGtqSewJwRtjtQyOHMuCQRtji9CLlKPYE8I2hz9elNPYFMI\n2hQJRT6jHsGmELQpFhTHqexoIGhTtBtBPYFdIWgznPDHueyIIGgzzK6UTj2CXZEEfWnTWc93\nUD3oeyZST2BbcoO+NKXnywnGrCKMNdrt6X6KB72b7aAewbakBn2+GmMsfhGrMbyzXzFP1whW\nPOhn61NPYF9Sg36CzTm+MCCwS7JhrHCM9HBHtYPOqDKTegT7khp0LdcBO/3Ydtfttp4OF1Y7\n6HV+uP4EGalBF3FdCWoKyzxb8qiwXH95IIDdROWgR7alnsDG5P+G7p/10aT2t/2G3rrZbZLK\nQSeXnE89go1J3od++8ySgIAeKYax0uHpun1K73IsC7lEPYKNyX2Vo6pzZ6L1PHbX
qB7+kSc9\n3FHpoPs+QD2Bncl9HfrCxC7TLhkvhzBWz+MrtSoHfTn0c+oR7IzkncLzG497voPKQX9QIpl6\nBDvDsRyitRlFPYGtIWjBjvuvpx7B1hC0YK/egVPoUkLQgtV7jnoCe0PQYu1iO6lHsDcELdYz\nDagnsDkELVRG5deoR7A5BC3UGr9j1CPYHIIWanh76gnsDkGLlFxiIfUIdoegRfpP6GXqEewO\nQYvUawD1BLaHoAW6FPoV9Qi2h6AFer90CvUItoegBWo5lnoCEBF08plUMcO4qRn0Eb+fqEcA\n3qB/eb5lCcYcZTrN2iNuKEWDfqUaDrQjxxV02qIGLKDBwHGTRj9Q24+1Wy5sKjWDjp1KPQFw\nBb2lQeTQ1dey/3B5We/grqKuZKZk0NuYyH+kwDc8QZeZcfWWn58d30TARC5KBv1UI+oJgC/o\n208/IeqEFCoGnV7xDeoRQMSrHJePnRB+dm8Vg/4uwNOZRkASzqB3DI5ijPlHDxD7yVAVgx7a\niXoCMHiDHutg5Rp37tykAmPDRE6lYNBJxRZTjwAGZ9BzWYctWbd29mOzhc2kZNCfhCk3spa4\ngm5a0/0WYUZcM0ETuSgYdI8HqScAF66gIx/O+eGkoiLGyaZe0BeCV1CPAC58v6Frpbl/2Nre\nv6HfLSP6eBbwCec+dKffsm7tHcheFTaTikHHPUY9AWTie5VjJGMVm3fv0aIKY0NEHpijXNCH\nHP+jHgEycb4OvXVAKdfr0OUGrBE4k4JBv1QdB9pZA/87hRePnMQ7hXWmUU8AWfCJFRF+ZXup\nR4AsCFoEYYcZAi8ELUB69FvUI0A2BC3AqoBT1CNANp6g3yp2C4FTKRb0kC7UE8ANPEHveyyY\nRdRxEziVWkEnFV1CPQLcwLfL8Q3rKnIYN7WC/jgskXoEuIFzH/pOBG0Y3QZTTwBunEEPul/g\nLDmUCvp80LfUI4AbXuXgNhcH2lkIgubW9AnqCSAHguZ1yLGZegTIgaB5PV+LegK4CYLmFfMC\n9QRwEwTNaZPjAPUIcBMEzelxkZ+lBG4Imk9a1NvUI8DNEDSfbwLPUo8ANxMRdJ+Vxuo+YsbJ\npk7QD3WnngBuISJo9q4xT+xvbWWCvhq+lHoEuAWC5vJR5LWC7wQSIWgunf9CPQHcCkHzOBO4\nmnoEuBWC5vFm+bSC7wQyIWgejZ+kngByQdAc9ju2Uo8AuSBoDlPvop4AckPQHO58mXoCyA1v\nffvuZ8ef1CNAbgjad+NaUE8At0HQPkst+x71CHAbBO2zr4POUY8At0HQPhtozjlJgAuC9lVi\n+L+pR4DbIWhfLcKBdlYkJuiMw8kihnFTIegOQq9uDoJwB71m6J/GsToseLLNLut2OuAH6hEg\nD7xBr/BjO40BrF1D9i9xQ6kQ9JxoHGhnRbxBxxX5LONqaCsjuWycuKFUCPrep6kngLzwBl2i\ns2GsZh8YRr8y4oZSIOg/2HbqESAvvEEXHWgYU9gfhjEsTNxQCgQ9uTb1BJAn3qAblU9OqVXN\nMFKqx4gbSoGga8ygngDyxBv0QlazMnvB+KEFE3lxYMsHvcHvCPUIkCful+1eKhXQ65rxHOsh\nMkHLBz26NfUEkDcBb6y4LsiwT+wpOK0edGqZedQjQN7w1rcvvgy5SD0C5I076E/7x2cTNpP1\ng+7fm3oCyAdv0PMYCy+VRdxQVg86ochn1CNAPniDrh25XtwwbhYPekHx69QjQD44g84IGidw\nGDeLB91uBPUEkB/OoK87TLlIn7WDPuG/lnoEyA/vLkfLypfEDeNm7aBnV0qnHgHywxv04djY\npfvPZRI3lMWDvmci9QSQL96gi4WxG8QNZe2gd7Md1CNAvniDHpZD3FDWDvrZ+tQTQP7wTmFh\nZVSZST0C5E9E0Il7E8QM42bloNf5HaUeAfLHHXTCtHLO/eeoaYnCRjKsHfTIttQTgA
e8QV+r\nw6J6jekTzeqJfPPMwkEnl5xPPQJ4wBv00+wZV8nJzzKRr2VZOOhlIWa88A6i8AZ9T4PsG40a\n5HXXvJ3+tYAdFAsH3fcB6gnAE96gw24c1jAq3ItHHnr4HcPYVJ8xv24en1lZN+jLoZ9TjwCe\n8AYd2zz7Rsu6BT9wX0n2uvFHqKP9yJas3AUPd7Ru0B+UEHvSMxCMN+gxbE7mOcD+wbw47K6P\n458ZRm+/Vc6bH7OxHu5o3aDbjKKeADziDfpyVVZn7Ivj6rIqlwt+YNnGzi8VOmXejs99Yov0\nLz9xG27VoI/7m3H4N4jD/Tr0yVGBjLHAR0948cCIgc4vZR7JvD08MtdfHixT3K0IE/1WjSCv\n3iHynJQgnoB3ClP2rfkjxasHti7v/DXeta6rifQ6nq64Y9ldjnrPUU8Ankk9lmNtUJOfjK3h\nE9OMpDHsNQ93tGrQu9hO6hHAM56gGTtmMFaYw0f/FcAqxlVlpRpGsiGe7mfVoJ8pxIvtQIIn\n6J49zxl9cnjz0MPjy7vaD+n4rce7WTTojMqe/lkBK+Dd5Th34xCOq55eV77ZlaMHTxX0ESaL\nBr3G7xj1CFAA3qDZguwbz5cUMk8WiwY9vD31BFAQrqD/u3gxG7E40/yG+gedXGIh9QhQEK6g\nK9/0lJANFjiVNYP+T6gXbx4BLa6gVy5bxh5blmVlksCprBl0rwHUE0CBePeh4z2/XOEjSwZ9\nKfQr6hGgQMLeWPlqOPcsOSwZ9PulvXs/FChxB3180RyX1+oWFTaTRYNu6en4QLAI3qC3Fb/x\npHCMuKEsGfQRv5+oR4CC8QbdK2Du8hrdfl7ZQuT5zi0Z9CvVcKCdAniDju5mGNNrGsb5kovE\nDWXJoGOnUk8AXuANOmScYSwLTDOMES2FzWTJoLexPdQjgBd4g67VxzC2u85eOEnzJ4VPNaKe\nALzBG/SDwV+nXw+ZZBhNKokbyoJBp1d8g3oE8AZv0IfC2WJjmKN3Wyby06PWC/q7gJPUI4A3\nuF+H3jVurXG1QwDr6O3ho96wXtBDO1FPAF4R9E7hpfMCZslhuaCTii2mHgG8whl04rsbBQ7j\nZrmgPwmz2ECQD+5TgQ0SN0sOywXdw5T/TBCPN+jRpUVeLOgGqwV9IXg59QjgHd6gUx+NXbov\nIdFF3FCWC/rdMqnUI4B3eIOOivK3wVWw4h6jngC8xBv0kBzihrJa0Icc/6MeAbyEq2B54aXq\nONBOFQKCvvqb8AOFLRZ0nWnUE4C3uIM+1CvQufs8ZZDQU7BYK+gtbC/1COAt3qBPVGRNWzNj\nJov25ny63rJW0H9rQj0BeI3/DP6LjMXOHyzwHy1uKGsFnR79FvUI4DXeoO9obWQGbXSvIWwm\niwW9KuAU9QjgNe63vh/NDnpUmLCZLBb0kC7UE4D3eINu3Cg7aPcFC0WwUtBJRZdQjwDe4w36\nRfZCuivoF7W9kuzHYUIvYw7m4g06rQWrfh8b3YDF6npuu24iz0IJZuN+HTp5TiXGWMnnhF62\nykJBnw8y5ex9YBIRb31f2SX28yqWCnpuVBr1CFAIOJajAPWfop4ACoM76E/7x2cTNpOVgl7v\n+IN6BCgM3qDnMRZeKou4oSwU9KDO1BNAofAGXTvSjItfWybosyFfUo8AhcIZdEbQOIHDuFkm\n6Fcq4SmhWjiDvu54QuAwblYJOr3KK9QjQOHw7nK0rHxJ3DBuVgn6yyAcl6QY3qAPx8Yu3X8u\nk7ihLBN0Z5yOQzW8QRcLK8zF671lkaAP+ZvxjBfMxBv0sBzihrJK0BPqUk8AhYV3CvN3vcy7\n1CNAYeFT3/lbFIFLISsHn/rO3324MKF68KnvfG1z7KAeAQoNn/rO14hW1BNA4eFT3/lJiFhK\nPQIUHj71nZ83opKpR4DCw6e+81N7MvUE4AN86jsf3/sfoh4BfIBPfeej7/3UE4Av8Knv
vJ0I\nXEk8AfgEn/rO29+rpRNPAD7BsRx5Sq0wm3YA8JGwo+3GTHj/rLCpyIP+T6jof3RADt6gS4W6\nj4dmoU+Lmoo86PihtOuDr3iDPlulypu/HNo8t2qnbct7MlEXxKYOep9jE+n64DPuy7pFncz8\nfqrcZCOjTRtBU1EH/fjdpMuD73iDrvRw9o2hMYYxvbSIkQzyoK+VmE+5PHDgDrpr9o2eZQ3j\nqZJCZiIPel6xq5TLAwfeoB8OWJb5/avAgcaRGqIOuCQOuuHfKFcHHtxPCquyZk/OeKoFq3B6\nR5BD1LtrtEH/7MB1CZXF/cbKibHBjDG/oaeMTU2FnQeONuiH2xMuDnwEvFN4ffe32wRfhoQ0\n6AtF/ku3OHDCW9+3mVkxlW5x4MQT9JO5z/514HnuebJQBp1x5wtkawM3nqBHFH16Z85PM9YN\nDZ8lZijSoFcEHidbG7hx7XL82JDFjv3wf/tO7ln//tDKjk57RE1FGXT3fmRLAz/OfegNg0pl\nH5lUcczv4qYiDPqw/xqqpUEA7ieFGVsWzpg4+yOxl9YhDHrSXRlUS4MAeJXjVsll/0G0MgjB\nGXTiuxsFDuNGF/SScJygUWncJ5ox5Rz3dEHHjSRaGMTgDXp0aZGXoriBLOhdji00C4MgvEGn\nPhq7dF9Coou4oeiCHtWcZl0QhTfoqCh/ja6xciXyI5J1QRjuj2DlEDcUWdBzS18nWReEwct2\nN4sVeYI+oIBrrNxkrd9BimVBIFxj5Sb9u1GsCiLhGis5zgQvJ1gVhMI1VnK8WBUnaFQerrHi\nllb5VfmLgmC4xorbsuDT8hcFwXCNFbcOg+WvCaLhGis37PcT/uojyIdrrNzwZH3pS4J4uMZK\ntmsl/yl7STABrrGS7f+KCj5ZDpDgPrfdjRu74wuxhXnrPf89QdCN/ip7RTADb9C1s853nvBU\nYGEOWWIFfC5EftBb2c6C7wTWxxt0SI0jzq8flWOVPi3wcUe/uoF1cn7xcE/5QT/SVvKCYA7e\noL8Lu2P/9jgW/JwXpwhfwG7h4Z7Sg74Y9m+5C4JJuJ8UbixWwp913e/N4xKGsPBJ011YY+eX\nXH97adwItzjZQb9WLkXugmAS/lc5tpZmb3v7yE9LVFmXuYU89qHPDerr1oAJfRWwQBk1p0pd\nD0wj4GW736MrefUL2uVoG7+JKdZ7UrgqQOjh3ECHJ+j62aJZcdc3rx6bMTOo/k7LBd2rj9Tl\nwDw8Qde6lZeP3npXyOsWC/p44HcylwMTEXxI9toYZrGgp9TCCRp1wR/0H6ucX94t1KmhV89a\n5fkOcoNOjZ4jcTUwFXfQf3W4TjYU4Bgv8pec3KA/KXJB4mpgKt6g57OmXzu/rWjNPhA2k+yg\nWw+XuBiYizfo1jWy3pFIjWkoaCIXqUH/7tgsbzEwGW/QxR7NvjE6Qsg8WaQGPa6JvLXAbLxB\n1+qUfaPLnULmySIz6KvFF0pbC0zHG/QI/6yL16/wV/Vkje+VFPnhMSDGG/T5yiz+xXnTuznK\nnBQ3lNSg73la2lJgPu6X7Q4/5Oc6FrTLbmEjGVKD3uAQewEvoCXgncIz6z9afVTQONkkBv1g\np4LvA+qw+/mhz4Z8IWklkIInaMaOGd59AqWw5AU9vVKapJVACp6ge/Y8Z/TJIXAqaUGnV3lZ\nzkIgic13Ob4KOiVnIZBEWNApuT8jyENa0F0GylkHZOEK+pe2JUMarzIS5gy5v20FFfehD/mv\nk7IOSMMT9FZ/xkJZwJp7M58TxgicSlbQE2pLWQbk4XpSyMZfMvY2Cmdjd5w6cU3kVJKCTi7z\njoxlQCKeoO/I/OTSz6yu0IlcJAX9YcRlGcuARDxB+/V2fb3K+okcKJOkoJuOlbEKyMT1xsqD\nN38TSU7Q29gOCauAVHYOekRLCYuAXDYOOiHiY/MXAclsHPSbUcnmLwKScQVdrqNL9reOAq
eS\nEnSdyeavAbLxHW3n7fmeC0tG0N/7HzJ9DZCOJ+j9txI4lYyg+/Y0fQmQz7ZH250M/NbsJYCA\nbYOeVi3d7CWAgF2DTq0w2+QVgIRdg/5PqOiLhYIl2DXo+L+YvADQsGnQ+xy/mLsAELFp0E/c\nbe72gYo9g75WYr6p2wcy9gx6XjEvLnwLKrJn0A3Hm7p5oGPLoP/n2Gvm5oGQLYMe0t7MrQMl\nOwZ9och/Tdw6kLJj0DMrpJq4dSBlw6Az7nzevI0DMRsG/U3gcfM2DsRsGHQP8ecRAcuwX9BH\n/NeYtm0gZ7+gn71L5FXJwWJsF3RK+bfM2jRYgO2CXhJ+yaxNgwXYLui4kWZtGazAbkHvcmwx\nactgCXYLenRzkzYM1mCzoK9EfmTOhsEibBb03FJJ5mwYLMJmQcdONGe7YBX2CvpHv4OmbBcs\nw15BD+hmymbBOmwV9Jng5WZsFizEVkG/WBUnaNSdnYJOrzzDhK2Cpdgp6M+DT5uwVbAUOwXd\ncbAJGwVrsVHQ+/1+Er9RsBgbBf1kPfHbBKuxT9DXS78vfJtgOfYJekHRROHbBMuxT9CN/yp8\nk2A9tgl6K9spepNgQbYJ+pE2orcIVmSXoC+GfSp4i2BJdgn69XIpgrcIlmSToDNqThW7QbAo\nmwS9KuCY2A2CRdkk6F69xW4PrMoeQR8NWCV0e2BZtgg6re29OEGjTdgi6KeLHxC5ObAwOwS9\nzB8fJbQNGwS9t+gL4jYGFqd/0FdiuuKjsfahfdAZD1S/KGpbYH3aB/1qKM6faye6B/19wAJB\nWwIlaB70iXI4rN9e9A46pdl9yUI2BKrQO+iRZXHRWJvROugPA9eK2AwoROegtxV5U8BWQCka\nB32h6gD+jYBi9A06vVMsTsRhP/oG/VzEbgGTgGK0DfpLv89ETAKK0TXofcUmC5kEFCM76NN7\nUrNunPX0qVXuoK/dHZ/GtwVQk9ygt9ZlLCrr4IqOnrbCHfTDd5zl2wAoSmrQ+0P84juHsLmu\n26YGPSdkM9fjQVlSg+7vWG4YZ6qH7DHMDXpj0Ac8DweFSQ26SgfX172hrstfmhj0qehRHI8G\npUkNOmJY5rfJ7Mc8gj5cs6pbaZ6gU1s0vu77o0FtUoNuHpP5LbFi7eTbg05Z+J7bk4zjqM/H\nyhz1/cGgOKlBT2RjM393fs36J3nc5djAEfQS/5U+PxaUJzXopDgW0dV1YzKLLm1S0L+FzfT1\noaABua9DX3ymVtZex4KazJygE2r1xFm/7Izqre+MP1d7+Fufg87oVfOyb48EPVjzWA6fg34+\nfJfYSUAxegW9KuATwZOAYrQK+lCpCaInAcXoFHRSg9apwkcBtegU9NCKZ4RPAorRKOh3AteL\nnwQUo0/QPwe/Z8IkoBhtgj5d4SEzJgHF6BJ0Wvv610wZBdSiS9BPljhoyiSgGE2CXua/wpxJ\nQDF6BL0n8iWTJgHFaBH0lZhuOMQOMukQdEbfGpdMGwXUokPQ08N2mDYJKEaDoL8PWGjeJKAY\n9YM+UvoJEycBxSgfdEqzprguELgpH/SIKFwXCHKoHvSiwB9NnQQUo3jQ20LfMncSUIzaQZ+v\nOtDkSUAxSged3qnuVbNHAbUoHfSzxfabPQkoRuWgv8B1gSA3hYP+o9hU0wcB1agbdGKddrgu\nEOSmbtCDcV0guJ2yQb+O6wJBHlQNekPQfNOnAAUpGvTJ8qNNHwJUpGbQKXFNcF0gyIuaQY/F\ndYEgb0oGvSTgB9NHADWpGPRvYbNNnwAUpWDQF6vfj5MWQD7UCzrjflwXCPKlXtDTInBdIMiX\nckGvDPjU9OVBXaoFfajURNNXB4UpFnRSgzY4xA48UCzov+C6QOCRWkHPDf7F9LVBaUoF/XPw\nP01fGtSmUtC4LhAUSKGgM1o1TDJ9ZVCcQkGnPnjY9IVBdQoFDVAwBA1aQdCgFQQNWkHQoBUE
\nDVpB0KAVBA1aQdCgFQQNWkHQoBUEDVpB0KAVBA1aQdCgFQQNWkHQoBVrBr2JAfhoU6FzMz9o\nY9tmoWaHfkigRl+CRZ9liwhWrdeFYNGX2fd5/H+9rfC1SQhasM8jKVZt+hLBomtZOsGqnZ8m\nWHQ7Oy9mQwjaOwjaXAhaMgRtLgQtGYI2F4KWDEGbC0FLhqDNhaAlQ9DmQtCSIWhzIWjJELS5\nELRkCNpcNg56RSmKVVvNJFj0p0CKq5L2fI5g0d2OBDEbUi/o9D8pVj1xjWDRjIMEixpnBKVV\nOAcEbUe9oAE8QNCgFQQNWkHQoBUEDVpB0KAVBA1aQdCgFQQNWkHQoBUEDVpB0KAVBA1aQdCg\nFQQNWkHQXrqy4Cj1CJra94fhLz0AAAWKSURBVJbIrakW9PVJcZFVB+yXv/AQ9pXsJX9sG1nu\nAcn/qefHxxSJGX9B5pKPFcu+8Xazos3e5t2aYkFfimMxw9o7QrfKXvhTJj3oj4PKD+zhX/Kw\nzDUvVGWtRrRk1S/JW3JlcHbQI1nNwXeysZybUyzoiWyM8+vXfvUkr3usRLjsoA8HNHZm9U/2\nsMxFJ7G5zq9z2FRZCw6qyVhW0FtZx1Qjtb1jB98GFQu6VsR117d4dlrqshltqkySHfR49pNr\n5dffkbloF3bG+fU46ylrwfu7do3ICnoA2+78+isbzLdBxYKO6Zr5rTPbI3XZmX7rpssOunxF\nuetlmsaWOL8uYi9LXLNOVtClKmR+KxfFtzXFgs5yJqRsqsz1tgZNNGQHfYXFbetWpmKffVJX\nvdQqcMDUAQHxMj/4nRX0RdYs80+NGd/aKga9tzr7P5nrXYupnyw96KOsWnjs0I5+RQp/mREe\nHwQwxgI/lLlkVtBHWPfMP3Vmx7i2pl7QiVNCQ/4hdcUxITsN6UEfZOyZDMNY5bhb5qqvsO7b\nr27rwmZLXDMr6JOsR+afOrMTXFtTLujllVhXuTvQq9nrhvygT7GSaa7v7WU+/z0fcleK81ty\njSKX5S2aFXS6f4vMPzXx5zv9mWpBT2G110pecpb7ImPzJK6aHtIw8/tI9qu8RTeyUZnfh/lw\nPTWfZT8pLFc181vFaL6tKRb0AtZf+mU8V410acw6jVwvc9mOkUmuby39EuWteTz73/2sV+8k\nqXPjZbu9zq872QC+rakVdEbN6CSipaW/bPctG+P8x3cp6ypz0Xr+K51fV/jdK3HN7KDXsAed\n/wf3Y+v4tqZW0H+y0h2znJW9tPSgjSEsdkQ7Vk7qMVG/RTg6jIp3FN0tcc3soJ3/vW0mtWCP\ncG5NraC/c+/O8r224wP5QRuzmkfEjJV6nJBhnBgeUyTm0VMyl7wRdMaMppFNuc9arFbQAAVA\n0KAVBA1aQdCgFQQNWkHQoBUEDVpB0KAVBA1aQdCgFQQNWkHQoBUEDVpB0KAVBA1aQdCgFQQN\nWkHQoBUEDVpB0KAVBA1aQdCgFQQNWkHQoBUEDVpB0KAVBA1aQdCgFQQNWkHQoBUEDVpB0KAV\nBA1aQdDSsMp5/rh5Bclz6A1BS4OgZUDQ0iBoGRC0NO6gb73SIoIWCkFLkxn0sGJHWjmC6nzg\n+sHe3hWi+/6ZGXTaS03CK489YRg7g1o5/5hSp8RJ0lnVhaClyQo6LLby42Mj2b8NY2Oko9VD\nFaPucAad3II1HNGKVTpsGH9n8w3jJbaYelpVIWhpsoJmsRcNYz3rbxiN/P5rGAnNmTPoOewF\n598tZL2dbdcpcWZfSDfqYZWFoKXJDnqp63Z4vLGZ9XXd+sUVdKXq6a7b9wVdNYyf/Qa2KX6C\ncE61IWhpsoPe77pdKt74iGXuSBulKxiJ7L7FLq3Zb84fPMHYIsIxFYegpckO+pzrtjPoWWxF\n5o/vrmDscl/xeaPzB/tY2GXCMRWHoKXJFfRS17M/p8oVjHNszE336x7ERhOMpwkELU2uoLey\n
fq5bB/2c+9AlG2beY+YU55fF7LU+jg1kU6oOQUuTK2ijid/nhpHUxfWk8Fn2kuF6lWOgYZwq\n2SDteGRMsudtQX4QtDS5g94Y6dfukWrhrjdWEuqwBqN7+EcfM4xe/r8axj/YNOJhlYWgpckd\ntLG3T6WoXltGut4pTJpwT5Hqo04Yxsfsb84/pjcK/p1yVIUhaNAKggatIGjQCoIGrSBo0AqC\nBq0gaNAKggatIGjQCoIGrSBo0AqCBq0gaNAKggatIGjQCoIGrSBo0AqCBq0gaNAKggatIGjQ\nCoIGrSBo0AqCBq0gaNAKggatIGjQCoIGrSBo0AqCBq0gaNDK/wOVtHJnmSTPKAAAAABJRU5E\nrkJggg==",
"text/plain": [
"plot without title"
]
},
"metadata": {
"image/png": {
"height": 360,
"width": 360
}
},
"output_type": "display_data"
}
],
"source": [
"Ricker <- function(N0=1, r=1, K=10, generations=50)\n",
"{\n",
" # Runs a simulation of the Ricker model\n",
" # Returns a vector of length generations\n",
" \n",
" N <- rep(NA, generations) # Creates a vector of NA\n",
" \n",
" N[1] <- N0\n",
" for (t in 2:generations)\n",
" {\n",
" N[t] <- N[t-1] * exp(r*(1.0-(N[t-1]/K)))\n",
" }\n",
" return (N)\n",
"}\n",
"\n",
"plot(Ricker(generations=10), type=\"l\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now open and run the script `Vectorize2.R` (available on the TheMulQuaBio repository). This is the stochastic Ricker model (compare with the above script to see where the stochasticity (random error) enters). Now modify the script to complete the exercise given. \n",
"*Your goal is to come up with a solution better than mine!* "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Errors and Debugging\n",
"\n",
"As we learned in [Python Chapter I](Python:errors), dealing with errors is a fundamental programming task that never really goes away!\n",
"\n",
"### Unit testing \n",
"\n",
"We will not cover unit testing here, which we [did in Python Chapter I](Python:errors)). Unit testing in R is equally recommended if you are going to develop complex code and packages.\n",
"\n",
"```{tip}\n",
"A very convenient tool for R unit testing is [`testthat`](https://testthat.r-lib.org).\n",
"```\n",
"\n",
"In addition, you can use other \"[defensive programming](https://en.wikipedia.org/wiki/Defensive_programming)\" methods in R to keep an eye on errors that might arise in special circumstances in an complex program. A good option for this is to convert warnings to errors using the [`stopifnot()`](https://stat.ethz.ch/R-manual/R-devel/library/base/html/stopifnot.html). This is a bit like `try` (which we will cover below, and we also covered in [Python Chapter I](Python:errors)). "
]
},
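{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a minimal sketch of such a check (the function `growth_ratio()` and its arguments are hypothetical, made up only for illustration), `stopifnot()` can guard the inputs of a function so that it halts as soon as an assumption is violated:\n",
"\n",
"```R\n",
"# Hypothetical helper, for illustration only: ratio of successive population counts\n",
"growth_ratio <- function(N_now, N_next) {\n",
"    # Halt immediately if the inputs violate our assumptions\n",
"    stopifnot(is.numeric(N_now), is.numeric(N_next),\n",
"              length(N_now) == length(N_next),\n",
"              all(N_now > 0))\n",
"    return(N_next / N_now)\n",
"}\n",
"\n",
"growth_ratio(c(10, 20), c(15, 30))    # valid inputs: returns the ratios\n",
"# growth_ratio(c(0, 20), c(15, 30))   # would stop with an error, because N_now contains a zero\n",
"```\n",
"\n",
"The conditions are checked in order, and the error message reports the first one that fails, which makes the source of the problem easier to locate."
]
},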
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Debugging\n",
"\n",
"Once you have found an error, you need to debug it. Assuming you have an idea about where (or the general area) in your code the error is, a simple way to debug is to use the `browser()` function. \n",
"\n",
"The `browser()` function allows you to insert a breakpoint in your script (where the running of the script is stopped) and then step through your code. Place it within your script at the point you want to examine local variables (e.g. inside a `for` loop). \n",
"\n",
"Let's look at an example usage of `browser()`. \n",
"\n",
"★ Type the following in a script file called `browse.R` and save in `code`:\n",
"\n",
"```R\n",
"Exponential <- function(N0 = 1, r = 1, generations = 10) {\n",
" # Runs a simulation of exponential growth\n",
" # Returns a vector of length generations\n",
" \n",
" N <- rep(NA, generations) # Creates a vector of NA\n",
" \n",
" N[1] <- N0\n",
" for (t in 2:generations) {\n",
" N[t] <- N[t-1] * exp(r)\n",
" browser()\n",
" }\n",
" return (N)\n",
"}\n",
"\n",
"plot(Exponential(), type=\"l\", main=\"Exponential growth\")\n",
"```\n",
"The script will be run till the first iteration of the `for` loop and the console will enter the browser mode, which looks like this:\n",
"\n",
"```R\n",
"Browse[1]>\n",
"```\n",
"\n",
"Now, you can examine the variables present at that point. Also, at the browser console, you can enter expressions as normal, or use a few particularly useful debug commands (similar to the Python debugger):\n",
"\n",
"* `n`: single-step \n",
"* `c`: exit browser and continue\n",
"* `Q`: exit browser and abort, return to top-level.\n",
"\n",
"\n",
"```{tip}\n",
"We will not cover advanced debugging here as we did [in Python Chapter I](Python:errors), but look up `traceback()` (to find where the errors(s) are when a program crashes), and `debug()` (to debug a whole function).\n",
"```"
]
},
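{
"cell_type": "markdown",
"metadata": {},
"source": [
"To give a rough idea of how `traceback()` and `debug()` are used, here is a small sketch. The functions `safe_root()` and `shifted_root()` are hypothetical, written only for this illustration, and the debugging calls are commented out because they are meant to be typed at an interactive R console:\n",
"\n",
"```R\n",
"# Hypothetical functions, for illustration only\n",
"safe_root <- function(x) {\n",
"    if (x < 0) stop(\"x must be non-negative\")\n",
"    sqrt(x)\n",
"}\n",
"shifted_root <- function(x) safe_root(x) + 1\n",
"\n",
"shifted_root(4)           # works: sqrt(4) + 1\n",
"\n",
"# shifted_root(-1)        # fails inside safe_root()\n",
"# traceback()             # typed straight after the error, lists the chain of calls\n",
"\n",
"# debug(shifted_root)     # flag the function so that its next call is stepped through\n",
"# shifted_root(4)         # enters browser mode at the start of shifted_root()\n",
"# undebug(shifted_root)   # remove the flag when you are done\n",
"```\n",
"\n",
"Inside the session started by `debug()`, the same `n`, `c` and `Q` commands listed above apply."
]
},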
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### \"Catching\" errors\n",
"\n",
"Often, you don't know if a simulation or a R function will work on a particular data or variable, or a value of a variable (can happen in many statistical functions). \n",
"\n",
"Indeed, as most of you must have already experienced by now, there can be frustrating, puzzling bugs in programs that lead to mysterious errors. Often, the error and warning messages you get are un-understandable, especially in R!\n",
"\n",
"Rather than having R throw you out of the code, you would rather catch the error and keep going. This can be done by using the `try` keyword. \n",
"\n",
"Lets try `try`! \n",
"\n",
"First, let's write a function:"
]
},
{
"cell_type": "code",
"execution_count": 342,
"metadata": {},
"outputs": [],
"source": [
"doit <- function(x) {\n",
" temp_x <- sample(x, replace = TRUE)\n",
" if(length(unique(temp_x)) > 30) {#only take mean if sample was sufficient\n",
" print(paste(\"Mean of this sample was:\", as.character(mean(temp_x))))\n",
" } \n",
" else {\n",
" stop(\"Couldn't calculate mean: too few unique values!\")\n",
" }\n",
" }"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This function runs a simulation that involves sampling from a synthetic population with replacement and takes its mean, but *only* if at least 30 unique samples are obtained. Note that we have used another keyword here: `stop()`. Pay close attention to `sample()`, and `stop()` in the above script (check out what they are by using R help or searching online). \n",
"\n",
"\n",
"Now, generate your population:"
]
},
{
"cell_type": "code",
"execution_count": 358,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAtAAAALQCAMAAACOibeuAAAC+lBMVEUAAAABAQECAgIDAwME\nBAQFBQUGBgYHBwcICAgJCQkKCgoLCwsMDAwNDQ0ODg4PDw8QEBARERESEhITExMUFBQVFRUW\nFhYXFxcYGBgZGRkaGhobGxscHBwdHR0eHh4fHx8gICAhISEiIiIjIyMkJCQlJSUmJiYnJyco\nKCgpKSkqKiorKyssLCwtLS0uLi4vLy8wMDAxMTEyMjIzMzM0NDQ1NTU2NjY3Nzc4ODg5OTk6\nOjo7Ozs8PDw9PT0+Pj4/Pz9AQEBBQUFCQkJDQ0NERERFRUVGRkZHR0dISEhJSUlKSkpLS0tM\nTExNTU1OTk5PT09QUFBRUVFSUlJUVFRVVVVWVlZXV1dYWFhZWVlaWlpbW1tcXFxdXV1eXl5f\nX19gYGBhYWFiYmJjY2NkZGRlZWVmZmZnZ2doaGhpaWlqampra2tsbGxtbW1ubm5vb29wcHBx\ncXFycnJzc3N0dHR1dXV2dnZ3d3d4eHh5eXl6enp7e3t8fHx9fX1+fn5/f3+AgICBgYGCgoKD\ng4OEhISFhYWGhoaHh4eIiIiJiYmKioqLi4uMjIyNjY2Ojo6Pj4+QkJCRkZGSkpKTk5OUlJSV\nlZWWlpaXl5eYmJiZmZmampqbm5ucnJydnZ2enp6fn5+goKChoaGioqKjo6OkpKSlpaWmpqan\np6eoqKipqamqqqqrq6usrKytra2urq6vr6+wsLCxsbGzs7O0tLS1tbW2tra3t7e4uLi5ubm6\nurq7u7u8vLy9vb2+vr6/v7/AwMDBwcHCwsLDw8PExMTFxcXGxsbHx8fIyMjJycnKysrLy8vM\nzMzNzc3Ozs7Pz8/Q0NDR0dHS0tLT09PU1NTV1dXW1tbX19fY2NjZ2dna2trb29vc3Nzd3d3e\n3t7f39/g4ODh4eHi4uLj4+Pk5OTl5eXm5ubn5+fo6Ojp6enq6urr6+vs7Ozt7e3u7u7v7+/w\n8PDx8fHy8vLz8/P09PT19fX29vb39/f4+Pj5+fn6+vr7+/v8/Pz9/f3+/v7////rAHqZAAAA\nCXBIWXMAABJ0AAASdAHeZh94AAAdDElEQVR4nO3dfbxUdZ3A8e9c5CnggnJRHhWQgJCHVlLw\nggh200RCgRTEkBCVAB9qd31aREvb1tJd86EHC6JsN13RssxWzbBIy82IpxRIM5YFU1x5Frjc\ne16vPedcuMzMvZz59Zvz+47z4/P5Y+bMnDPznXvP2/HcmctcCYg8Skr9AIjSDNDkVYAmrwI0\neRWgyasATV4FaPIqQJNXAZq8CtDkVYAmrwI0eRWgyasATV4FaPIqQJNXAZq8CtDkVYAmrwI0\neRWgyasATV4FaPIqQJNXAZq8CtDkVYAmrwI0eRWgyasATV4FaPIqQJNXAZq8CtDkVYAmrwI0\neRWgyasATV4FaPIqQDfbEhH5TcNiP5HRQfANkfalejC1n+/bst1DpZpeZgG62YxA102YMOGX\nCg/mq+GDkW8rDPIhQDebEegD4UYPKzyYGpHj5i1XGORDgG62JqDrDxw4kL+RFugPiVyjMMaP\nAN1sTUA30/rl4Ua3/2ZnEWPqjLYaKLKgiCFHV4ButiMectQ/fm7fNn1rvlcbBFMkLtps/90T\n+1SeccWqg7fefcOQ9jV/DG8yMrxwm0i/4PEP9w+3Wjz2pNa9qr8a/SdwXXivK6szrQYvDvbe\ncVblyZ/enD0++/4Ojrnt0LrZImN3XdejzSm37226cf7aaHjdXYPanDTpFWffq/dXgG62I4Gu\nv6CBl5y6Mwv0qqENiy1vrY9usWlAdOH4eYdBfz8jvYO9Iw/edsjOGHSvTvGlu8bFZz22HZ6e\nc3/NgB49Or5q6LomG+evDYefPC2+2Hql1veutAG62ZZIdodBRy849J88MiMy8/
Ax9J6+4cKJ\np7UOTxdHN54YcW4X3fAg6BM6Swj6xvCKgWedEJ4ujEGLtPtAw/1XRSdfaByee3+vLD9RZMby\njYfWhmQzkul9TLi2punwvLW3Rfec6RZdPFftm1fSAN1sRwJ9nsingvjScfWNoBeIVCwKgs0f\nCWn+XxA8Gl59T/3+qw+Dllaf+dZ/RD/b3dJwFx9vAH1L3e75kcZXg1XtRSY1Ds+7v7xj6NnR\ns+8bwbvnhOcvNtk4b200fNLbwdbTRCpVv4ElC9DNdiTQw8Njg69vDPYsW7asthF0CO6S6Ear\nwyfux4PgYpFR4aW6gYdBh9cG9Q8//PDbQbBjjMhHYtBdwh8JXw9XfjVceUn2T55599cM6JfD\n8zfbhj+SNtk4b204vFV0yP5QePVWhe9b6QN0s2UdQw/MBn1LDLz/Z5buDhoPOfaF/0NfGm/a\nX+SLQTBI5J+jSwsaQXdouKfa5V/45LDo2KAB9PDwqq3hpSfD8zlZoPPvrynoXvHCR6P/W+Rv\nnLs2Gt43uvRMOOfN1L9L78cA3WxHAr3v5uManrQrv9UI+vWG/7uHnS1yeVDfUmRJdOnBRtAn\nx2tXhNIz/S4+7xDo8DQG/bMgF3Te/TUDekS8MEPknCYb565teIkl7FlAH90dCXT4NPv8jcPi\nn7RWZT9DPxavGxC/GNFV5K7o0u1ZL9uF7Q1/epsaorqpEOj8+2sK+qR4ITxMnttk49y1gKaG\njgB655o1a8KrNodW5YHGY+iQ0qXRlmsrRB4NgrPi58YgGJkL+rfh1uvD8wmFQOffX1PQmT+E\n52+HP0je22TjvLWAprgjgF4XXv10eN1fwyPhn8agFwXxc27Fd4Ngy2kind4Kgi9JdMxRd6vk\ngn4mvjr4caYg6Lz7a+aHwr/bFGwPD13abmqycd5aQFPckQ45+om0GHPJ+ZUix+8Igk7h/9/n\nvx7s7h1ufXJ12/A0PLIOdofHHNKzo+SBfjN8CpVhQ0LPMjgZdN79NfMMLRUfbBWuvaHpxnlr\nAU1xRwL9SueDr+S1eT68dJE0bLbilIYrWy6I3yn8RfyDY6vJuaCDz8bb9J0mcszqRND599cE\ndPWIePUndzfdOG8toCnuiD8UbrtnTN+2Vade97/Rhbcv69Z24NpwYd+XJ/TuMOLyQ+8uv/bp\nAZ3PX/71PNB19w5td+rntj8R3vV1yaDz7q8J6LE75ndrPejf6pvZOG8toCm1bgmfJR3cbUTW\ndu1REKBTb16/fiP2BEHtoIa3RdIO0IkBOvXuDf/3Pvm/nq4R6fhnB3cP6MQAnXp1nzz4g2O7\nH7q4e0AnBmgHPTvxQx/oesbf/9XJnX99ypRbbdceBQGavArQ5FWAJq8CNHkVoMmrAE1eBWjy\nKkCTVwGavArQ5FWAJq8CNHkVoMmrAE1eBWjyKkCTVwGavArQ5FWAJq8CNHkVoMmrAE1eBWjy\nKkCTVwGavArQ5FWAJq8CNHkVoMmrAE1eBWjyKkCTVwGavArQ5FWAJq8CNHkVoMmrAE1eBWjy\nKkCTVwGavArQ5FWAJq8CNHkVoMmrAE1eBWjyKkCTVwGavArQ5FWAJq8CNHkVoMmrAO286TXW\nTS/1Yy+/AO26Ohk/y7LxUlfqR192Adp1dbJktWVLAP03B2jXAVo1QLsO0KoB2nWAVg3QrgO0\naoB2HaBVA7TrAK0aoF0HaNUA7TpAqwZo1wFaNUC7DtCqAdp1gFYN0K4DtGqAdh2gVQO06wCt\nGqBdB2jVAO06QKtWLOi69WtrU3kg3gZo1axBL1gUntTe2V6k9ZXbUnxA3gVo1axBy9jw5Go5\ndspVI2XQ3hQfkW8BWrWiQK/JnL41XFwkC9N7QN4FaNWKAv2gvBAvjzottcfjX4BWrSjQC2Vn\nvDynQ2qPx78ArVpRoB+SNfHyhUNSezz+BWjV7EF3v/3Rl7pMjRZfajkrvQfkXYBW
zRp0r4xE\nPRcEN7btvDHNh+RZgFbN/o2VPauWfmnW6F8GwcBey1J8QN4FaNVSeOt7bdPv+sZBfRs7seOB\n4meUcYBWzc3vcuz7zjcbu172OZlRLgFaNfe/nPRrQANaL0C7DtCqAdp1gFbNFvR9nXJK2BLQ\ngFbMFvSGa1pLh8GNJWwJaEArZn/I8TOZYLQdoAGtWBHH0P0BbRKgVSsC9PRJRpsBGtCK8SqH\n6wCtGqBdB2jVAO06QKsGaNcBWjVAuw7QqgHadYBWDdCuA7RqgHYdoFUDtOsArRqgXQdo1QDt\nOkCrBmjXAVo1QLsO0KoB2nWAVg3QrgO0aoB2HaBVA7TrAK0aoF0HaNUA7TpAqwZo1wFaNUC7\nDtCqAdp1gFYN0K4DtGqAdh2gVQO06wCtGqBdB2jVAO06QKsGaNcBWjVAuw7QqgHadYBWDdCu\nA7RqgHYdoFUDtOsArRqgXQdo1QDtOkCrBmjXAVo1QLsO0KoB2nWAVg3QrgO0aoB2HaBVA7Tr\nAK0aoF0HaNUA7TpAqwZo1wFaNUC7DtCqAdp1gFYN0K4DtGqAdh2gVQO06wCtGqBdB2jVAO06\nQKsGaNcBWjVAuw7QqgHadYBWDdCuA7RqgHYdoFUDtOsArRqgXQdo1QDtOkCrBmjXAVo1QLsO\n0KoB2nWAVg3QrgO0aoB2HaBVA7TrAK0aoF0HaNUA7TpAqwZo1wFaNUC7DtCqAdp1gFYN0K4D\ntGqAdh2gVQO06wCtGqBdB2jVAO06QKsGaNcBWjVAuw7QqgHadYBWDdCuA7RqgHYdoFUDtOsA\nrRqgXQdo1QDtOkCrBmjXAVo1QLsO0KoB2nWAVg3QrgO0aoB2HaBVA7TrAK0aoF0HaNUA7TpA\nqwZo1wFaNUC7DtCqAdp1gFYN0K4DtGqAdh2gVQO06wCtGqBdB2jVAO06QKsGaNcBWjVAuw7Q\nqgHadYBWDdCuA7RqgHYdoFUDtOsArRqgXQdo1QDtOkCrBmjXAVo1QLsO0KoB2nWAVg3QrgO0\naoB2HaBVA7TrAK0aoF0HaNUA7TpAqwZo1wFaNUC7DtCqAdp1gFYN0K4DtGqAdh2gVQO06wCt\nGqBdB2jVAO06QKsGaNcBWrXiQG/ftLngtxzQgFasCNCrZ3QVkRY9pi1P3AzQgFbMHvT8jHQb\nMX78yJ4is5O2AzSgFbMG/YCc+/uGpTUXy90JGwIa0IpZg64eUHtosf7MUQkbAhrQilmDrrzs\n8PLNHRM2BDSgFbN/hh54oHF5HM/QRw7QqhVxDH3eqoaldZfIlxM2BDSgFbN/lWOOSK/REy8Y\n00dkZn3CdoAGtGJFvA69YlpV9Dp0t2nLEjcDNKAVK+6dwnc3bmn2W77vO99s7HpAA1ovN7/L\nsXFQ38a6y14nM8olQKvGLye5DtCqAdp1gFYN0K4DtGq2oO/rlFPCloAGtGK2oDdc01o6DG4s\nYUtAA1ox+0OOn8kEo+0ADWjFijiG7g9okwCtWhGgp08y2gzQgFaMVzlcB2jVAO06QKsGaNcB\nWrVs0Eu2u5gAaEArlg1a2kz+zz2pTwA0oBXLBv3AWRXS/tKf7E93AqABrVjuMfSW+0PTx13x\nXJrfR0ADWrEmPxRuuX9MhXS79jepTQA0oBVr+irHH27rI2H9l6Y0AdCAViwXdO1z154k0m3O\n0y9/rn3mv9OZAGhAK5YNeumnjhU5+R9eiP8N9+/lxnQmABrQiuW8bCfDblt56ML2qq+kMwHQ\ngFYsG/Rdr7mYAGhAK5Z7DL3+mfDkG6+mOgHQgFYsB/S1mdHh6TGZzyV9EtLfGqABrVg26MVS\n/WR49tQ4WZTiBEADWrFs0OM+2PCud+2gj6Q4AdCAViwbdKerDi7M7ZDiBEADWrFs0APPO7hw\nfv8UJwAa0Iplg76yxY/i86dazExxAqABrVg26Hd6S80d3/6XT2SO35LiBEADWrGcl+
3+8qmK\n6PeSzn8lzQmABrRieb9t99byf3/2f9KdAGhAK8Y/knUdoFXLAf3o1JqDpTgB0IBWLBv0t0Xa\nVzWU4gRAA1qxbNCnVCb/GXq7AA1oxbJA17e62sUEQANasSzQezOfdTEB0IBWLPuQ46ze2xxM\nADSgFcsG/ZchQx7509a4FCcAGtCK5fy2XTs5VIoTAA1oxbLpzj5cihMADWjFeKfQdYBWLQ/0\n7lUvpj0B0IBWLAf0G5NbhofPC6dvSnMCoAGtWDbozb2kepwEX5Eem1OcAGhAK5YNep58L/h+\neMWSFnNTnABoQCuWDfqkcUEMOpj4wRQnABrQimWDbnfVQdCfaZfiBEADWrFs0CNOPwj61OEp\nTgA0oBXLBn2H3F4Xgb5DbkpxAqABrVg26ANjpN8ZMne4DHkvxQmABrRiOa9D77vnRBHpvGBH\nmhMADWjF8t/63rn2nZQnABrQivG7HK4DtGrZoC89XIoTAA1oxXL/xsrBOvRLcQKgAa1YNui9\ncVufHdX2yRQnANoa9P1yxZW2LS71112imjuG3j2gc4p/7xvQ1qBvkim2Da4u9dddopr9ofAf\nZWN6EwBdBOiVtje9BtBZXds6xWM3QANasWZA1z/fcWiKEwANaMWyQbdvqLXIkhQnABrQimWD\nnnCwGT9KcwKgAa0Y7xS6DtCqAdp1gFYtG3TPnEanNAHQgFYsG/ScHpLpPrxnRnqPDpuU0gRA\nA1qxbNC/qjjnj+HZq+f2eCPFCYAGtGLZoD/RZ098vqfvlBQnABrQimWDPuGygwuzeqY4AdCA\nViz/czniarqlOAHQgFYsG/TUzA/j8ycqJqY4AdCAViwb9BudKy5a9NTiiyrarkxxAqABrVjO\nGyt/ODv+ByuDn01zAqABrVjeO4VrHr37ey+m++9+AA1oxfjAc9cBWjU+8Nx1gFaNDzx3HaBV\n4wPPXQdo1fjAc9cBWjU+8Nx1gFaNDzx3HaBV4wPPXQdo1fjAc9cBWjU+8Nx1gFYtC/Sub7zA\nB56nH6BVy3mVY7qLCYAGtGLZoOd22epgAqABrVg26NqrhjyyYceuqBQnABrQimWD7tq1xaHP\n8E9xAqABrVg23ZmHS3ECoAGt2CHQ87/ragKgAa3YIdAS/+GrxbPTnwBoQCuWC3qmg89uBDSg\nFQO06wCtGqBdB2jVAO06QKsGaNcBWjVAuw7QqjWCPmlqWB+Z2lCKEwANaMUaQeeW4gRAA1qx\nQ3R/l1uKEwANaMX4K1iuA7RqgHYdoFUDtOsArRqgXQdo1QDtOkCrBmjXAVo1QLsO0KoB2nWA\nVg3QrgO0aoB2HaBVA7TrAK0aoF0HaNUAbdZ1fa0DtGaANqv67FstuwXQmgHarOprbGmtALRm\ngDYL0GUSoM0CdJkEaLMAXSYB2ixAl0mANgvQZRKgzQJ0mQRoswBdJgHaLECXSYA2C9BlEqDN\nAnSZBGizAF0mAdosQJdJgDYL0GUSoM0CdJkEaLMAXSYB2ixAl0mANgvQZRKgzQJ0mQRoswBd\nJgHaLECXSYA2C9BlUrGg69avrU3eAtCAVswa9IJF4Untne1FWl+5LWlDQANaMWvQMjY8uVqO\nnXLVSBm0N2FDQANasaJAr8mcvjVcXCQLEzYENKAVKwr0g/JCvDzqtIQNAQ1oxYoCvVB2xstz\nOuStfL3LsY11kKQDEtXuO9a6Y44e0HPtv0udX0pxb9lVFOiHZE28fOGQvJV1v3imsXveP8/Q\n1w/5lm1tjh7Q48+1/y49keLesssedPfbH32py9Ro8aWWsxI2fB8dclx/pq2P1e2OItCzbIeu\nbl/GoHtlJOq5ILixbeeNCRsCGtCK2b+xsmfV0i/NGv3LIBjYa1nSdoAGtGIpvPW9ti5xNaAB\nrdhR9bscgDYJ0MkBGtCKAdosQJsEaN0AbRKgkwM0oBUDtFmANgnQugHaJEAnB2hAKwZo
swBt\nEqB1A7RJgE4O0IBWDNBmAdokQOsGaJMAnRygAa0YoM0CtEmA1g3QJgE6OUADWjFAmwVokwCt\nG6BNAnRygAa0YoA2C9AmAVo3QJsE6OQAXRLQp/+fbR8DdGKALgXosWIfoBMDdClAn9H/Edsq\nAZ0YoEsC+sO2t1x9HKATAzSgFQO0WYA2CdC6AdokQCcHaEArBmizAG0SoHUDtEmATg7QgFYM\n0GYB2iRA6wZokwCdHKABrRigzQK0SYDWDdAmATo5QANaMUCbBWiTAK0boE0CdHKABrRigDYL\n0CYBWjdAmwTo5AANaMUAbRagTQK0boA2CdDJARrQigHaLECbBGjdAG0SoJMDNKAVA7RZgDap\n3b/+zrYNKe1jQBvuKkAblLH/hMiKnensY0CbBWiTMnf+2rLvyjvp7GNAmwVokzL32t7yMUBb\nBGiTAJ0coAFdOEDbBGiTAJ0coAFdOEDbBGiTAJ0coAFdOEDbBGiTAJ0coAFdOEDbBGiTAJ0c\noAFdOEDbBGiTAJ0coAFdOEDbBGiTAJ0coAFdOEDbBGiTAJ0coAFdOEDbBGiTAJ0coAFdOEDb\nBGiTAJ0coAFdOEDbBGiTAJ0coAFdOEDbBGiTAJ0coAFdOEDbBGiTAJ1c2qD3//wZ2y4GtEGA\nTi5t0E9mKm1rAWiDAJ1c2qCfaG/9/R4CaIMAnRygAV04QNsEaJMAnRygAV04QNsEaJMAnRyg\nAV04QNsEaJMAnRygAV04QNsEaJMAnRygAV04QNsEaJMAnRygAV04QNsEaJMAnRygAV04QNsE\naJMAnRygAV04QNsEaJMAnRygAV04QNsEaJMAnRygAV04QNsEaJMAnRygAV04QNsEaJMAnRyg\nAV04QNsEaJMAnRygAV04QNsEaJMAnVzzoB/+pm1zAW0QoJ3VLOh35YSellUC2iBAO6tZ0O/I\nY7Zf+hWANgjQzgI0oAsHaJsAbRKgkwM0oAsHaJsAbRKgkwM0oAsHaJsAbRKgkwM0oAsHaJsA\nbRKgkwM0oAsHaJsAbRKgkwM0oAsHaJsAbRKgkwM0oAsHaJsAbRKgkwM0oAsHaJsAbRKgkwM0\noAsHaJsAbRKgkwM0oAsHaJsAbRKgkwM0oAsHaJsAbdLRDHr7ps11hbYBNKAL934AvXpGVxFp\n0WPa8sTNAA3owr0PQM/PSLcR48eP7CkyO2k7QAO6cKUH/YCc+/uGpTUXy90JGwIa0IUrPejq\nAbWHFuvPHJW3ctetNzR2afOgJ82y7MOtbG85q0tP65u2HG57y0/LeNubjhTbW87qcbz1TdsM\ntb6pfMz2lpNKDrryssPLN3fMW/nm+JrGxvRp5ufG2ik1tlX3sr7psAHWN+033Pqm3c+0veWZ\n3a2HDu9nfdMBw6xv2qva+qZTapsqscn+GXrggcblcfnP0EQlqohj6PNWNSytu0S+nNbDISou\n+1c55oj0Gj3xgjF9RGbWp/iIiIqoiNehV0yril6H7jZtWXoPh6i4inun8N2NWwq+U0ikmPvf\n5SBSDNDkVYAmrwI0eRWgyasATV4FaPIqQJNXAZq8CtDkVYAmrwI0eVX5gV4k5GGZbenwKD/Q\nP2r3uxI0dG4Jhj4oL5Vg6qgZJRj6g5L/m8KS9URlKaZWf7EEQ5+XUvxy7vjrSzB0JaB1A7Tb\nAK0coN0GaOUA7TZAKwdotwFaOUC7DdDKAdptgFYO0G4DtHKAdhuglQO0245i0E9VlWLq2K+U\nYOiLLUvxGWsXLijB0FcyO9K5o/IDXffnUkzdvKcEQ+tfL8HQ4K2UaP1tvZbS/ZQfaKKEAE1e\nBWjyKkCTVwGavArQ5FWAJq8CNHkVoMmrAE1eBWjyKkCTVwGavArQ5FWAJq8CtHkb7iv1I1Cp\nvL/MMgS99+YzK/tO+5P+4Gs66c772qiOo76mOzJK+8sMUt2l5Qd625kyaPY5mbYrtAc/3V
p3\nT8+RATP6y3zVmYH+lxmku0vLD/RNMi88fbJimO7Y6QNEVPf0Cvl4bVB7Tma15lD9LzMqzV1a\nfqAHdtgbndXIX1XHTpowoYPqnp4mK8PTl2WG5lD9LzMqzV1afqAHTYjPxsur2pMHq+7pqp7x\nWbeumkOjdL/MqDR3afmBbuitNifUas9U3dPvyqj4fIRo/yNsfdANpbNLyxT0un7yHfWhqnt6\no0yMz8fLJsWpUSUCndIuLR/Qu+8J+0m8uGth2zb3q0/V3dNb5IL4fLxsVpwaVRLQqe3S8gH9\nZvS3kqZESz89USZoHUAfnqq8p+tajInPR7bQ/jiwUoBOb5eWD+jGFsopz5dksO6e7tY3PuvV\nQ3NoVAlAp7hLyw/0Epm6rzSTdff0NFkXnq6RaZpDo/RBp7lLyw50/YAe75VotO6eXiaXhl/t\nxfIrzaFR6qBT3aVlB/rP0uXjDb2tPVp5T8+Us28eI5erzoxSB53qLi070D9v/GO62i9nae/p\n+jurK6tL8Cm+6qBT3aVlB5ooKUCTVwGavArQ5FWAJq8CNHkVoMmrAE1eBWjyKkCTVwGavArQ\n5FWAJq8CNHkVoMmrAE1eBWjyKkCTVwGavArQ5FWAJq8CNHkVoMmrAE1eBWjyKkCTVwGavArQ\n5FWAJq8CNHkVoMmrAE1eBWjyKkCTVwGavArQ5FWAVq1+f6FrS/Q3GL0J0HpVzV58vJx40WvR\n8varh7Yffv2e3Gtnd9o4NtNq8KLSPszyDtB6VfXN9LlsTObYl4PgzZNl1BWnyuCdOdfObjek\n93XzK2VpqR9pGQdovarkvPeC4Afy0SCYK/eEV9wgX8i5drYMeTcIlsvUUj/SMg7QelVVrI/O\nzpd1+1sNrg+X9nbtnn1tCPqRaLF9TQkfZLkHaL2qesdn98mPN8j8eHGy7Mq6NgT9p3g7QNsH\naL2qRsZnj8sDy+SOeHFe+LR8+NoQ9NZ4O0DbB2i9qvrEZ1+Tx9bL1fHiFNmRdS2gUwjQelVV\nxEcUF8qa/S2HRkv7enTNvhbQKQRovark/L1BsDQzKgiukvvDK/5JPp9zLaCLD9B6VfU8rt/l\nNZmOvw2Czb1l7NwRMnRXzrWALj5A61U1dsMFXXtO2RAtb5s3pN2pN76Xey2giw/QelWNNb+W\nLAO0XoBWCNB6AVohQOs1aLL5tWQZoMmrAE1eBWjyKkCTVwGavArQ5FWAJq8CNHkVoMmrAE1e\nBWjyKkCTVwGavArQ5FWAJq8CNHkVoMmrAE1eBWjyKkCTVwGavArQ5FWAJq8CNHkVoMmrAE1e\nBWjyqv8HLW6FNZxvNmYAAAAASUVORK5CYII=",
"text/plain": [
"Plot with title “Histogram of popn”"
]
},
"metadata": {
"image/png": {
"height": 360,
"width": 360
}
},
"output_type": "display_data"
}
],
"source": [
"set.seed(1345) # again, to get the same result for illustration\n",
"\n",
"popn <- rnorm(50)\n",
"\n",
"hist(popn)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now try running it using `lapply` as you did above, repeating the sampling exercise 15 times:"
]
},
{
"cell_type": "code",
"execution_count": 357,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[1] \"Mean of this sample was: -0.11620822588674\"\n",
"[1] \"Mean of this sample was: -0.0468516755995931\"\n",
"[1] \"Mean of this sample was: -0.0890228211466614\"\n",
"[1] \"Mean of this sample was: -0.124229742255296\"\n"
]
},
{
"ename": "ERROR",
"evalue": "Error in doit(popn): Couldn't calculate mean: too few unique values!\n",
"output_type": "error",
"traceback": [
"Error in doit(popn): Couldn't calculate mean: too few unique values!\nTraceback:\n",
"1. lapply(1:15, function(i) doit(popn))",
"2. FUN(X[[i]], ...)",
"3. doit(popn) # at line 1 of file ",
"4. stop(\"Couldn't calculate mean: too few unique values!\") # at line 7 of file "
]
}
],
"source": [
"lapply(1:15, function(i) doit(popn))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Your result will be different than this (you would get between 0 - 15 samples; we got 4 here). But in most cases, the script will fail because of the `stop` command (on line 7 of the above function) at some iteration, returning less than the requested 15 mean values, followed by an error stating that the mean could not be calculated because too few unique values were sampled."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now try doing the same using `try`:"
]
},
{
"cell_type": "code",
"execution_count": 359,
"metadata": {
"scrolled": true
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[1] \"Mean of this sample was: -0.11620822588674\"\n",
"[1] \"Mean of this sample was: -0.0468516755995931\"\n",
"[1] \"Mean of this sample was: -0.0890228211466614\"\n",
"[1] \"Mean of this sample was: -0.124229742255296\"\n",
"Error in doit(popn) : Couldn't calculate mean: too few unique values!\n",
"[1] \"Mean of this sample was: 0.0314144452816157\"\n",
"[1] \"Mean of this sample was: -0.233476945796405\"\n",
"[1] \"Mean of this sample was: -0.196681538928001\"\n",
"[1] \"Mean of this sample was: 0.0146969612111605\"\n",
"[1] \"Mean of this sample was: -0.234913159471725\"\n",
"[1] \"Mean of this sample was: -0.0497464588165691\"\n",
"Error in doit(popn) : Couldn't calculate mean: too few unique values!\n",
"[1] \"Mean of this sample was: 0.14834078393269\"\n",
"[1] \"Mean of this sample was: -0.271626543849156\"\n",
"[1] \"Mean of this sample was: -0.0661949014450445\"\n"
]
}
],
"source": [
"result <- lapply(1:15, function(i) try(doit(popn), FALSE))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In this run, you again asked for the means of 15 samples, and again you (most likely) got less than that (see the above example output), but without any error! The `FALSE` modifier for the `try` command suppresses any error messages, but `result` will still contain them so that you can inspect them later (see below). *Please don't forget to check the help on inbuilt R commands like `try`.*\n",
"\n",
"The errors are stored in the object `result`:"
]
},
{
"cell_type": "code",
"execution_count": 346,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"'list'"
],
"text/latex": [
"'list'"
],
"text/markdown": [
"'list'"
],
"text/plain": [
"[1] \"list\""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"class(result)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This is a list that stores the result of each of the 15 runs, including the ones that ran into an error. Have a look at it: "
]
},
{
"cell_type": "code",
"execution_count": 347,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[[1]]\n",
"[1] \"Error in doit(popn) : Couldn't calculate mean: too few unique values!\\n\"\n",
"attr(,\"class\")\n",
"[1] \"try-error\"\n",
"attr(,\"condition\")\n",
"\n",
"\n",
"[[2]]\n",
"[1] \"Mean of this sample was: -0.0639710889231533\"\n",
"\n",
"[[3]]\n",
"[1] \"Mean of this sample was: 0.0954898884134274\"\n",
"\n",
"[[4]]\n",
"[1] \"Error in doit(popn) : Couldn't calculate mean: too few unique values!\\n\"\n",
"attr(,\"class\")\n",
"[1] \"try-error\"\n",
"attr(,\"condition\")\n",
"\n",
"\n",
"[[5]]\n",
"[1] \"Mean of this sample was: 0.179497884968553\"\n",
"\n",
"[[6]]\n",
"[1] \"Mean of this sample was: -0.0269560049619619\"\n",
"\n",
"[[7]]\n",
"[1] \"Mean of this sample was: -0.0596177312603713\"\n",
"\n",
"[[8]]\n",
"[1] \"Error in doit(popn) : Couldn't calculate mean: too few unique values!\\n\"\n",
"attr(,\"class\")\n",
"[1] \"try-error\"\n",
"attr(,\"condition\")\n",
"\n",
"\n",
"[[9]]\n",
"[1] \"Mean of this sample was: -0.169305162448064\"\n",
"\n",
"[[10]]\n",
"[1] \"Mean of this sample was: -0.0570172067888256\"\n",
"\n",
"[[11]]\n",
"[1] \"Mean of this sample was: 0.130501940679632\"\n",
"\n",
"[[12]]\n",
"[1] \"Mean of this sample was: -0.0476960671278198\"\n",
"\n",
"[[13]]\n",
"[1] \"Error in doit(popn) : Couldn't calculate mean: too few unique values!\\n\"\n",
"attr(,\"class\")\n",
"[1] \"try-error\"\n",
"attr(,\"condition\")\n",
"\n",
"\n",
"[[14]]\n",
"[1] \"Mean of this sample was: -0.0103821907230653\"\n",
"\n",
"[[15]]\n",
"[1] \"Error in doit(popn) : Couldn't calculate mean: too few unique values!\\n\"\n",
"attr(,\"class\")\n",
"[1] \"try-error\"\n",
"attr(,\"condition\")\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"result"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"That's a lot output! But basically it tells you which runs ran into and error and why (no surprises here, of course!). You can also store the results \"manually\" by using a loop to do the same: "
]
},
{
"cell_type": "code",
"execution_count": 348,
"metadata": {
"scrolled": true
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[1] \"Mean of this sample was: 0.0690929258795289\"\n",
"[1] \"Mean of this sample was: -0.117793922564727\"\n",
"Error in doit(popn) : Couldn't calculate mean: too few unique values!\n",
"[1] \"Mean of this sample was: 0.179353333942007\"\n",
"[1] \"Mean of this sample was: 0.00392019401996458\"\n",
"[1] \"Mean of this sample was: -0.0432275201290453\"\n",
"Error in doit(popn) : Couldn't calculate mean: too few unique values!\n",
"[1] \"Mean of this sample was: -0.0869881356850989\"\n",
"Error in doit(popn) : Couldn't calculate mean: too few unique values!\n",
"[1] \"Mean of this sample was: 0.0984080957232758\"\n",
"Error in doit(popn) : Couldn't calculate mean: too few unique values!\n",
"[1] \"Mean of this sample was: 0.0856480925151331\"\n",
"[1] \"Mean of this sample was: 0.0462039547120371\"\n",
"Error in doit(popn) : Couldn't calculate mean: too few unique values!\n",
"[1] \"Mean of this sample was: -0.0147578826249971\"\n"
]
}
],
"source": [
"result <- vector(\"list\", 15) #Preallocate/Initialize\n",
"for(i in 1:15) {\n",
" result[[i]] <- try(doit(popn), FALSE)\n",
" }"
]
},
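{
"cell_type": "markdown",
"metadata": {},
"source": [
"Since each failed run is stored as an object of class `try-error`, failures can be separated from successes programmatically. The following is a minimal sketch; `risky()`, `runs`, `failed` and `ok_values` are hypothetical names used only for illustration, and the function fails deterministically so that the outcome does not depend on random sampling:\n",
"\n",
"```R\n",
"# Hypothetical function, for illustration only: fails for even inputs\n",
"risky <- function(i) {\n",
"    if (i %% 2 == 0) stop(\"even input!\")\n",
"    return(i^2)\n",
"}\n",
"\n",
"runs <- lapply(1:6, function(i) try(risky(i), silent = TRUE))\n",
"\n",
"# TRUE for the runs that raised an error\n",
"failed <- sapply(runs, inherits, what = \"try-error\")\n",
"\n",
"which(failed)                        # indices of the failed runs: 2 4 6\n",
"ok_values <- unlist(runs[!failed])   # keep only the successful results\n",
"ok_values                            # 1 9 25\n",
"```\n",
"\n",
"The same test with `inherits()` should work on a list filled by either the `lapply` call or the loop above."
]
},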
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now have a look at the new `result`; it will have similar content as the one you got from using `lapply`.\n",
"\n",
"```{tip}\n",
"Also check out `tryCatch()` as an alternative to `try()`.\n",
"```"
]
},
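{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a rough sketch of how `tryCatch()` differs from `try()`, the example below supplies a handler function that decides what to return when an error occurs. The wrapper `safe_mean()` is hypothetical, written only for this illustration:\n",
"\n",
"```R\n",
"# Hypothetical wrapper, for illustration only: return the mean, or NA if an error occurs\n",
"safe_mean <- function(x) {\n",
"    tryCatch(\n",
"        {\n",
"            if (!is.numeric(x)) stop(\"x must be numeric\")\n",
"            mean(x)\n",
"        },\n",
"        error = function(e) {\n",
"            # e is the condition object; report its message and return a placeholder\n",
"            message(\"Caught an error: \", conditionMessage(e))\n",
"            NA_real_\n",
"        },\n",
"        finally = {\n",
"            # Runs whether or not an error occurred\n",
"            message(\"Finished attempt\")\n",
"        }\n",
"    )\n",
"}\n",
"\n",
"safe_mean(c(1, 2, 3))   # returns 2\n",
"safe_mean(\"a\")          # reports the caught error and returns NA\n",
"```\n",
"\n",
"Whereas `try()` hands back a `try-error` object for you to inspect afterwards, `tryCatch()` lets you choose the replacement value at the point where the error is caught."
]
},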
{
"cell_type": "markdown",
"metadata": {},
"source": [
"★ Type the above blocks of code illustrating `try` into a script file called `try.R` and save in `code`. The script, when `source`d, should run without errors (that is, don't run the sampling function without `try`!). "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Packages\n",
"\n",
"The big strength of R is that users can easily build and share packages through [cran.r-project.org](https://cran.r-project.org). \n",
"\n",
"There are R packages for practically all statistical and mathematical analysis you might conceive, so check them out before reinventing the wheel. Visit [cran.r-project.org](https://cran.r-project.org) and go to packages to see a list and a brief description. \n",
"\n",
"In Windows and Macs and Linux, you can install a package within R by using the `install.packages()`command. \n",
"\n",
"For example:\n",
"\n",
"```r\n",
"install.packages(c(\"tidyverse\"))\n",
"```\n",
"You can also install multiple packages by concatenating their names: `install.packages(c(\"pack1\",\"pack2\",\"pack3\"))`\n",
"(three packages in this hypothetical example)\n",
"\n",
"In Ubuntu, you will have to launch a `sudo R`session to get `install.packages()`to work properly. Otherwise, you be forced to install the package in a non-standard location on your drive that does not require `sudo` privileges (nor recommended).\n",
"\n",
"You can also use the RStudio GUI to install packages using your mouse and menu. \n",
"\n",
"In Ubuntu, you can also use the bash terminal: \n",
"\n",
"```bash \n",
"sudo apt install r-cran-tidyverse\n",
"```\n",
"\n",
"★ Go ahead and install `tidyverse` if you don't have it yet. We will be using it [soon](./08-Data_R.ipynb)!\n",
"\n",
"\n",
"### Building your own \n",
"\n",
"You can combine your code, data sets and documentation to make a *bona fide* R package. You may wish to do this for particularly large projects that you think will be useful for others. Read the [*Writing R Extensions*](https://cran.r-project.org/doc/manuals/r-release/R-exts.html) manual and see *package.skeleton* to get started. \n",
"\n",
"The R tool set [EcoDataTools](https://github.com/DomBennett/EcoDataTools) and the package [cheddar](https://github.com/quicklizard99/cheddar) were written by Silwood Grad Students!"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Practicals \n",
"\n",
"### Is Florida getting warmer?\n",
"\n",
"*This Practical assumes you have at least a basic understanding of [correlation coefficients](regress:correlations) and [p-values](13-t_F_tests.ipynb).*\n",
"\n",
"Your goal is to write an R script that will help answer the question: *Is Florida getting warmer*? Call this script `Florida.R`.\n",
"\n",
"To answer the question, you need to calculate the [correlation coefficients](regress:correlations) between temperature and time. However, you can't use the standard p-value calculated for a correlation coefficient, because measurements of climatic variables in successive time-points in a time series (successive seconds, minutes, hours, months, years, etc.) are *not independent*. Therefore you will use a permutation analysis instead, by generating a distribution of random correlation coefficients and compare your observed coefficient with this random distribution. \n",
"\n",
"Some guidelines:\n",
"\n",
" * Load and examine the annual temperature dataset from Key West in Florida, USA for the 20th century:"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"'ats'"
],
"text/latex": [
"'ats'"
],
"text/markdown": [
"'ats'"
],
"text/plain": [
"[1] \"ats\""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"rm(list=ls())\n",
"\n",
"load(\"../data/KeyWestAnnualMeanTemperature.RData\")\n",
"\n",
"ls()"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"'data.frame'"
],
"text/latex": [
"'data.frame'"
],
"text/markdown": [
"'data.frame'"
],
"text/plain": [
"[1] \"data.frame\""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"class(ats)"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
"text/html": [
"\n",
"A data.frame: 6 × 2\n",
"\n",
"\t | Year | Temp |
\n",
"\t | <int> | <dbl> |
\n",
"\n",
"\n",
"\t1 | 1901 | 23.75000 |
\n",
"\t2 | 1902 | 24.66667 |
\n",
"\t3 | 1903 | 24.71667 |
\n",
"\t4 | 1904 | 24.51667 |
\n",
"\t5 | 1905 | 24.88333 |
\n",
"\t6 | 1906 | 24.63333 |
\n",
"\n",
"
\n"
],
"text/latex": [
"A data.frame: 6 × 2\n",
"\\begin{tabular}{r|ll}\n",
" & Year & Temp\\\\\n",
" & & \\\\\n",
"\\hline\n",
"\t1 & 1901 & 23.75000\\\\\n",
"\t2 & 1902 & 24.66667\\\\\n",
"\t3 & 1903 & 24.71667\\\\\n",
"\t4 & 1904 & 24.51667\\\\\n",
"\t5 & 1905 & 24.88333\\\\\n",
"\t6 & 1906 & 24.63333\\\\\n",
"\\end{tabular}\n"
],
"text/markdown": [
"\n",
"A data.frame: 6 × 2\n",
"\n",
"| | Year <int> | Temp <dbl> |\n",
"|---|---|---|\n",
"| 1 | 1901 | 23.75000 |\n",
"| 2 | 1902 | 24.66667 |\n",
"| 3 | 1903 | 24.71667 |\n",
"| 4 | 1904 | 24.51667 |\n",
"| 5 | 1905 | 24.88333 |\n",
"| 6 | 1906 | 24.63333 |\n",
"\n"
],
"text/plain": [
" Year Temp \n",
"1 1901 23.75000\n",
"2 1902 24.66667\n",
"3 1903 24.71667\n",
"4 1904 24.51667\n",
"5 1905 24.88333\n",
"6 1906 24.63333"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"head(ats)"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAA0gAAANICAMAAADKOT/pAAADAFBMVEUAAAABAQECAgIDAwME\nBAQFBQUGBgYHBwcICAgJCQkKCgoLCwsMDAwNDQ0ODg4PDw8QEBARERESEhITExMUFBQVFRUW\nFhYXFxcYGBgZGRkaGhobGxscHBwdHR0eHh4fHx8gICAhISEiIiIjIyMkJCQlJSUmJiYnJyco\nKCgpKSkqKiorKyssLCwtLS0uLi4vLy8wMDAxMTEyMjIzMzM0NDQ1NTU2NjY3Nzc4ODg5OTk6\nOjo7Ozs8PDw9PT0+Pj4/Pz9AQEBBQUFCQkJDQ0NERERFRUVGRkZHR0dISEhJSUlKSkpLS0tM\nTExNTU1OTk5PT09QUFBRUVFSUlJTU1NUVFRVVVVWVlZXV1dYWFhZWVlaWlpbW1tcXFxdXV1e\nXl5fX19gYGBhYWFiYmJjY2NkZGRlZWVmZmZnZ2doaGhpaWlqampra2tsbGxtbW1ubm5vb29w\ncHBxcXFycnJzc3N0dHR1dXV2dnZ3d3d4eHh5eXl6enp7e3t8fHx9fX1+fn5/f3+AgICBgYGC\ngoKDg4OEhISFhYWGhoaHh4eIiIiJiYmKioqLi4uMjIyNjY2Ojo6Pj4+QkJCRkZGSkpKTk5OU\nlJSVlZWWlpaXl5eYmJiZmZmampqbm5ucnJydnZ2enp6fn5+goKChoaGioqKjo6OkpKSlpaWm\npqanp6eoqKipqamqqqqrq6usrKytra2urq6vr6+wsLCxsbGysrKzs7O0tLS1tbW2tra3t7e4\nuLi5ubm6urq7u7u8vLy9vb2+vr6/v7/AwMDBwcHCwsLDw8PExMTFxcXGxsbHx8fIyMjJycnK\nysrLy8vMzMzNzc3Ozs7Pz8/Q0NDR0dHS0tLT09PU1NTV1dXW1tbX19fY2NjZ2dna2trb29vc\n3Nzd3d3e3t7f39/g4ODh4eHi4uLj4+Pk5OTl5eXm5ubn5+fo6Ojp6enq6urr6+vs7Ozt7e3u\n7u7v7+/w8PDx8fHy8vLz8/P09PT19fX29vb39/f4+Pj5+fn6+vr7+/v8/Pz9/f3+/v7////i\nsF19AAAACXBIWXMAABJ0AAASdAHeZh94AAAgAElEQVR4nO3dd2AUxd8G8LmS5BJSSEioCRAI\nEELvLYTeQkCkB+kgBEFRinQUEZCOKOArXVBBmkhREOWHoPTepUnvHRLSbt+73Q012Ss7u3u7\n93z+yIy5vdmvJE/ubssMYQBANKJ0AQBagCABUIAgAVCAIAFQgCABUIAgAVCAIAFQgCABUIAg\nAVCAIAFQgCABUIAgAVCAIAFQgCABUIAgAVCAIAFQgCABUIAgAVCAIAFQgCABUIAgAVCAIAFQ\ngCABUIAgAVCAIAFQgCABUIAgAVCAIAFQgCABUIAgAVCAIAFQgCABUIAgAVCAIAFQgCABUIAg\nAVCAIAFQgCABUIAgAVCAIAFQgCABUIAgAVCAIAFQgCABUIAgAVCAIAFQgCABUIAgAVCAIAFQ\ngCABUIAgAVCAIAFQgCABUIAgAVCAIAFQgCABUIAgAVCAIAFQgCABUIAgAVCAIAFQgCABUIAg\nAVCAIAFQgCABUIAgAVCAIAFQgCABUIAgAVCAIAFQgCABUIAgAVCAIAFQgCABUIAgAVCAIAFQ\ngCABUIAgAVCAIAFQgCABUIAgAVCAIAFQgCABUIAgAVCAIAFQgCABUIAgAVCAIAFQgCABUIAg\nAVCAIAFQgCABUIAgAVCAIAFQgCABUIAgAVAgQ5AO7QNQlUOO/5ZLH6S9BEBl9jr8ay59kP4m\nyZLvA4CiZPK3w89BkABegyABUIAgAVCAIAFQgCABUIAgAVCAIAFQgCABUIAgAVCAIAFQgCAB\nUIAgAVCAIAFQgCABUIAgAVCAIAFQgCCBu9nyQb1Wn12iPCiCBO4lrbMxbkS/0r
6r6A6LIIF7\nGRVywPLVPM7zONVhESRwK4m+S7hOg65Ux0WQwK38rXvKdeZEUB0XQQK38quJ7yzPSXVcBAnc\nylFymeuMK091XAQJ3Iq5yEC2fRz+GdVxESRwLxuNox4xzIkaRR5RHRZBAjezNo+haC7S4DLd\nUREkcDfPdvzfspO0B0WQAChAkAAoQJDAbZzpVzms/vjHkoyNIGlbUprSFbiOdT41v/hueP4i\ntC/8ZiFIGvZgYGG9qeJCs9J1uIarvqOtzePa0VL8gyBI2nWjaLFZ/2weka27apJ0Zc896Qb/\ntEQ6257TOb62nm0Ikna1rcB+HNjnvUzpSuxinpOPEFLuD6nGbzqA70TOlmB0BEmz7hj/5Dr9\n6ilbiJ0GZZv879MDfYyU77h7rt5IvlN+qgSjI0iatV2XwnUoX+cskb36LWw7JkSaw2rMuy24\n9pnfGglGR5A0a5s+leusCJZnh1cWD5uyNd3ZZ3/Iv24+819Jq6BX/WHcz7bjc0iRVARJs27q\n+R/twBhZ9ve5Z76GFTwrnHPy6U0H8Z2qE2hV9JouORbeMZ/72CjJR0YESbua1WT/FU/6LZRj\nb9N8lpsZ5lqDcCf/3r/9Ad8pJ8VHGKvUz/yJFym8TpLBESTtuhha4cdTe6cGtXT67ZYDnvrN\n5doCE50bYGxJ7ij9DY+tlEp6U/KRTeclOheAIGnYjW6BhBSeKsvFDZtMSVxnaG3nBrjsw76l\nS2lRWpUXYyBI2nad7u1rWfsujO/MLu7kCCu8Yr9ZN7VU7hO0SpIVggRUbPDhD7aPjnZ2iCOd\nivmWH3CTVkXyQpCAiocm7mBYSrFPlC1EIQgS0DEqaJvl6+P2ue4oXYkiECSgI72frnyX2KDw\ng9x/Jh/4cauEV6C6HAQJaDk8sevgH/hjdz/kJrk9PPslKluRjBAkkMAi47h7TMr6/HGquYND\nLAQJ6HscOI1t//WW4vpQl4QgAX0/+z7jOh06KluIfBAkoG9mab4z1umTSmqDIAF98wvyncGN\nFa1DRggS0HeCHGbbtKgxClciGwQJJBBX7pbla/pA/+tKVyIXBAkkcKdijt4zhpQJ2KJ0IbJB\nkEAKyXPblm008orSZcgHQQKgAEECoABBAqAAQQKgAEECoABBAqAAQQKgAEECoABBAmB+i83n\nU/HTJyJGQJAAPjV2/379+AIlbjk/BIIEbm+rYYO1eVC+tfNjIEjg9tq059rteucvVkeQwO1F\nfMu1ZtNvTo+BIIHbK7CI7/g6v+QLggRuryG/NNNpctrpMRAkcHvf+bIBMreq4vwYCBK4vfTm\nOeeevftnbMBh58dAkECVUhZ1rtpqAqUJ+1PG5CDE2PiUiCEQJFCjO5Wzdx3/fpGcO2kNeOnI\nM1HPR5BAjWLL3bB8TekZ4iorXiBIoEKHyTG2TQ6frHAlGRAkUKHZRflO37cVreMFBAlUaGJl\nvjOivqJ1vCB/kB5euZZuaxsECQT9mCON67Tprmwhz8kcpKOdcxNCDPnidwhuhiCBoLvZ5rHt\naZPzF/XQJW+Q+ulIniqxsVVDCekptB2CBMK+9JqZyKRvKtBM6UIyyBqkWaTRAa53rB2ZKrAh\nggQ2zAk0hHsbE54qXUcGWYNUvVhqRtdcs4bAhggS2PJkx4LfbihdxAuyBsm/y4v+8ACBDREk\nzTswqFHjj0Vc2+Zq5H1Fikx73q+DVyR39pmhzpCPaxkmKF0HNTJ/RmpyhOud7kAmCWyIIClt\nRVx4gdgfzFIN/6PXL9ZmlcdqqfYgN3mP2iUQEhbd/K2YcEK6Cv2QECRlpXf17r1g0Xs+7dNs\nb+uUUkO5dkBFiXYgO5nPIx2MD7aeR8oT/z/BzRAkZc32329tjgYJHVoV4QHZw3V26BKl2YPs\n5L+y4f6l67iywcVFjuXaKQWkeXN3mZzhOsfJTUl2ID9cawdveEJ2c50jRMSUiQKe
ef3KdX72\nkerNo9wQJI1J27Pgx6Mix7hL+GNCZ8hl0QVl6u3G7Etdet120owvP6WCdL9sWYFHESRnbYvQ\nhech1f4VNYg5aAnXWeWbQqGmTJwMeOcaw1xpG3hWmvHlp1SQ7hChURAkJ+3x7nObYc41yev8\nlKFWfUuy88knVehGpapM7C9J8oeRMoekGl92SgUpZcsWgUcRJCdFd2Cb5HJ9RQ1zu3Cl3x8+\n/rN6fnF5FGI+8N2Sg5Kdp5IfPiNpyS0df1h5Xj5xA91obyA6faur4ktyF4oE6eZ+GyvRIEjO\nOUjuc51tOrFHw57u2yNmuSC3I2+Q/usyh2H2liVE3+zNw0FXzz23AkFyyjlynuv87KtsIe5H\n1iCdyUGmM/966xom1CJ5Xp9H6Sx5mcvcZ6Iq5nxTuE7nxsoW4n5kDVJr3Vwz00r/u6W7jPR7\n/dHLL16RPiePnd2He/vKb5u1mW8UvgQLqJM1SLmsk5SHNmH79UsIbPgNguQc84eGxsMHVvX8\nRulC3I6sQfKzHpzN2YPtv+svsCGC5LS/P6wfN9z51UnASbIGqU7ehwwTV9p69iC9ZIzAhggS\nqIysQdrmWXUnc9B3WBqT1JdME9gQQQKVkffw949GElazEAmu6E+6Cm2HIIHKyHxC9uKAvNaD\n26bGmwQ3Q5BAZeS/suHx5fM3bN3ZhyCBysgdpJun+Kntbl8R2ApBApWRN0gHSxOSm1uLvbHQ\nKAgSqIysQTpr0tePNZFZ1j6CBFoia5Da6zYyzK0Ik3XRWwQJtETWIIU3sn497W1dQgBBAi2R\n9xIhbi2XUeQvBAm0RdYgRUexzZOwEskIEmiKrEEaRvo9s7YbSPskBAm0RNYgJdUkfnHWziiS\nLwRBggyPVo2duFHKm6IfrPhs0m8STS3Gkfc80v2hkdy7u0XFBKfjQpDcysqg7DGVfcJ3SraD\nH7MH1qrkXXivZDtQbhYh8wWh6bgQJHeyxTjO8mr0sIf/GYl2sNE4yfJqdL9z4H8S7YBx1em4\nECR3Uv49tjHX6SjRDkp8xDbpNXpItAMGQQLFXSf8fKvf55BmBxfJSa6zIK80O7BCkEBhhwg/\nodQOIs3hgN0Zc1L9YZBualcECRR2OeMFY6XQNB4inM2Y7e/7EGl2YIUggdKKjuTali2kGd9c\nkF83ramEi8ggSKC0ZR5LLV/TPvPcL9EOFnv9ZPmaOtJ0RKIdMAgSuIDpHlEd2+YPWCPZDiYa\nS3ZqExa4XrIdIEjgCi5M7ZEw+7aEOzg7pXufb+5KuAMECYAGBAmAAgQJgAIECYACBAmAAgQJ\ngAIECYACBMkdPDondm1md5RyxoEFWBEk7ZtfjBBT0+NKl6Eye+p4En3pFfZujiBpXn+fsXsv\n/dY02y6lC1GVjR7xWy7vGuIx3s7tESSt22Zg12dmukbaWgMEXniaZwjbrjQcs+8JCJLWdX2b\na28aHP9Ju69Vfolcp9pQ+56AIGld1Ql8p/A8RetQl3HV+U5/O2+SQpC0rsbnfKfgQiXLUJkv\nqvCdfi3tewKCpHV9GnLtBZ1U981p0UbTfbY1l/nUvicgSFp3QM8ewk1pWlm6mT+0J7lwV/bY\nzNee3erFJGy14wkIktZNNvRa9fe35XOdUroQVdkdUHPR38s76A31Ro5tbuxn848QgqR9mxsE\n6QsnXFe6DJU536UgCalssM72wOzw+8rW5giSW3imdAGq9IyplcD1JofZeklCkACyYvb8lev8\nSy7a2BRBAsjKM/IP17lFbF3ggCABZCnXIq7dbnhgY0sECVTg2NIvNz9SYL99yrHLn5nfamhr\nSwQJXN61hiS0lClwkfx7vp6vwQmGudTJz+YcrQgSuLrEqKonLZ9Xphq/l3/f52qRwJyklO21\n/hAkcHVT83IfUMbnknQV2CycXbXsqB3XhCBI4Opq8XcyPDBsV7YQIQgSuLqIuXwn1z
JF6xCE\nIIGrq/AF16aYflW2ECEIEri6AZW4zyhrPO8rXIkABAlc3SXfAdbZxA7nGah0JQIQJHB5W3IU\n6jm4iTFeiYN29kKQwPXdntoxduAWpasQhCABUIAgAVCAIAFQgCABUIAgAVCAIAFQgCCBxlwd\nVC1PtUHX7Ns4aUr9fGW6HBC/VwQJtGVPjrLjvx9fJtj2LUQWd8rmGbJkRpxxru1NbUCQQFOe\nhnWzXk+U2qVAoh1btyp719rMNRwWu18ECTRlSdATtn0caMf9tJd0O7lOg15ce/ef805O7Iwg\ngab0f4vvxH1ke+PV2fnOlPLWr3+VJ4QET3JqvV0ESZuu/zp/m1v+G/Zuz3fa9rG98Y95+M7s\nKMa62mXPA88uzAns7sx+ESQtSkww+hQ2BticsFqDJpXgO5FTbG+8R3+D6yQ0s0Qh9GPum8bf\nndgvgiSFpNVjhv1wT7n9NyuwOZ1Jmu091YnnHpk94Mvd1CuSzTkPdtp75kfPC7Y3Ti/yHtue\nzWb5QLXJi585r1U3J/aLIElgWz7/Wg1y+iswexRng9dptl3kc9vRpyZ10kU1K6NvquBfAZHG\nmyZeZC5OMH1hz8ZbPbsfS737Y77YdIaZFcV/c2y0E7tFkOg76dv3KcOkTjEqNcVAj1Zcmx6y\n9I3Hbk/rFDd4a5ZP7Zjf+mp0omRd9a5KNj8vMZK8C+zb+O9ylo19Pk6ydOcV5r83op4Te0WQ\n6ItvyP0W9i+jUAGNhvGdauNff2hLjvDuAxsaOqVm/szDOu485gXTRqmKk176uT/Opdu99Y3/\nHWanJWYO6M5y36ky2ImdIkj0BfKzRh0mCi3u1TrjgFXk16898h83/cGBnEMyf+bEjOzHfiBN\naZaXxFF1C9QedkOq4UWIrvvU2sz0POPEkxEk6lLIDq5zjxxSpoLp+bm/sSd0B197pD+/kOwK\n00PuG+lJrzw+MI7vPD+KTNuRPJGjF31aMsSuK3jk9V94xLjVc+I8ljjzZASJvoAVXHuMXFGm\ngAe5ulqTdKNck9cfKTuZa5M92UO8iyv7GCIGP3zx+IQKfKd5P2lKS45oay0ttUvYU2l2IMb9\nUVWCIju+/sfHPggSfa2ac+3Q4kpVsCd3ofc+7xxQ+c7rD4RnfAYPtoTd3MNn6G/bvy5S7Obz\nx/fpj7Lt1Ww/S1PZSj9uHu+nwYul2YFSECT6DnkNT7H8ms4zrlashLuT2lTvuujN6auiR3Lt\nPf0/DLPcxL6/elz+pbdxbxc9Yfl6qVJV+z+tO+TjRnynZV9pdqAUBEkCG4LyvNWmsGl21lvc\n+/O7XXK/t0k98P1vI8O4k47jcqcyTD3+7dsfxrvPN3rczFC9Uy2vaKkOBnzQku90cupCHNeF\nIEnhwYKP+szO+gNS8iAvj1B90CwZK2KY9flJXh+9f5V/GebZNOMPlu/kXM6Xo9vx0nbbx3cf\ns0mys0jTi/Kdcp9LtQtlIEgKiM+zJoV58pW3HVeDUbPeOPQOk76jmB8pUMabW/wueCX3UKr+\nL3Fjm7dNH7nEvgMr/3ly5wbWGU6K26erQZDkt9XIHRb/zlu+synpBbkzR7dChi2esYl7f1eT\nP/H4j/5mVk+zy4XKHuXq5/W07yVmgmnKDebWzGwjRe3S9SBI8usXy7XmPHZex0LBbv0trjO4\nzvPvLfA7ZW2SY+IyfYq9nkTUtbwamZf72HeJ7NxcxIcEf6XeS5AyhyDJr/kAvlPrU9n2uSKY\n78wv9Px76S1yTN9/5qeK+f4TNfSUUO6e1Hm+T+zaPvXkhuOuPB2+cxAk+XXowXfKyvch6VcT\nf9/ntNIvvpk2OZyQ7F1EXshUj3+HmOTtwuuASQ5Bkt/Xoc/Y9rzhH9n2eddjA9eplfDK9x9c\nFT10yYz7BwsuEj2WeikRpPR/j2dx8XEGbQfpYe6u1rc296rHyPhBIaHgOWszwfMU7ZFj
+OMG\nqb7raA+tIrIGaeR8y5fUib6EePV6ILShywcp+biYA257chX5cGqvkFJ2TmJIRWKTbJ0mDa3k\ns4L6yJ8V5T7xrPRy4ZUpJSdrkEhty5f3SWDr3lVJ1DOBDV08SCcbexCSx7nZZlh3xr1Vod03\nSbY3pMi8vEvlRsPO0x/4bq721p/WjhzDbG6qYbIH6ZiOvZRyPhktsKFrB2m/X9MtN05+FdRe\n7Dsz8+lVm+R8VZLIofCguK6VdAnO/2HRANmD9C3hPmHXqCSwoWsHqRyXoCOmVeLG2VWKBHnr\nWih09x9FSUsHdZ5AYf5sNZM9SKP5jCT4CWzo0kE6orvAdXq+JbidLXt9ul1g0nZXjnxoe1uw\nR/Jvkyeut/sN8645Y5bTu7RE9iAtIcfYfotSAhu6dJCen9ucEylqnGod2OZRoVEiCwLOXwW8\nK1TOlmeTXRtfrmmIjAk2TaK1c3mDlHfsij0h7N0vezyELqN36SD9HMB3ZpYUM8xlwk/cPlWx\n+//U5u9P2r+/KMu7T45l62N5bX8y2MueWfkSi0efZ5j0Jd4z7Npz+vohbQevEvoQKGuQwnTE\n6k+GGeqd45LAhi4dpAtkP9d5u6OYYf4m/HUyG33EVuQeUjrpYxJaBRfI6lbwt5tybYe6dgz2\nZR7uDfW3fvZc13Q3xtSob2y2SgJnr+U9IZt4ZOWE7tF/MUxk2P+EtnPpIDFNqrP/+Gv1jv/T\nveQQ4W8EXxYiviR30D+39Q/Y0/hcmU9emW76hetsM9jxy9OQv94xydueecfql71o+XqjRqWs\n7xtW6BKh48J3Mrt2kK5GFJm6aVkP4zhRoyQHLOQ67zQTX5IbuGnkLnJKicj8fo27GW+VL5Jz\ntkcr/SXfCbfjCvxtRm7Gu+sCM1ngWjsnPBhaxjN3k80iRxmdkz3sstSwjUJJ2rcyO//Hd0jm\nE6GmZsx9v4fYcYVFrRFcm+a/xvbGn1bnO3H9s9wGQVJMaltTxymfNTDOVLoQdZhbhO9MK5f5\nBnX4w1cDytsx2phi3OWeaz3emGnpTf3f5jvdO2W5jVJBul+2rMCjbhEkhlnTqXxMX4UmkVSd\ndb7crJfMB7GZb7DVONN6onyh8Rc7RrsT0tG6NObeXAPt2HhCRnTrfpzlNkoF6Q4RGsVNgqRO\nFzb+acdfceoe+c5n2/u5Xp+HOcNi76KdukV52jenzP78Ia0Sauq72rgPgXVYv4ttT3hkfYhM\nqSClbNki8CiC5LIOVCQ+Hvq24mZ5cMpUn+8trzjnqpfI8tqFq9O6d5l8wc7hni7s1/6TnfZt\n2y2vNUG7CwlcyoLPSK7v5soJC48pXQTvkF/7k+aUv8pHPZJ/3xO8QmIiDbXF34rosOTe+vy1\nwnXvCJxzwo19rs481iuoWkFdcyXeTr2pVkv2gt2HilzXdH35p3MUWkzw7OJPFghOIIYb+1zd\n574/pjPM0TJVbdylkL5ueKdPhN4vv2LP+E6DfxS6Jexll2a822c2OwvRdd0+7lvTxF1nqD2u\nc2PfpeKFngsmCrxzcE23TT+w7fWAN1ffe9m1Kt71u9XyaGzXpeSp3XVVusZmL2LfG8Y5XkXj\n2xTws95bu4skct/7zcuup7oP17mxL3nh/z33Dl6RMvyYg38l6iq4YlFaxWrWWwT/Ld7cnkE/\nymV9h/SwZag9sVtrnGd5O5c+3mMXwxwh/PR4KwPt2Y8bwY19Lm5axunFMTFCm63y5W6tOa63\n40PETeN6tk0qaM9dBGX4Uy3xTSy/LtkXcv/RrbEdz3QnuLGPknVNQgOqTbL3U4f9FoTxnX6C\ntxG+14LvVJhge8yfgvirbQY2Et7Q6hbh731db7K8MI3MzX7kXmUUe4GU1uDGPjoGe/Ra+vOn\nuStTv9v1HL9URFLYdKHN2mdMV9d0kO0xn9+ROLmi7Y1PEX5aiX3WD64pLX26fT2xmWGi7S
e6\nF9zYR8Uvnn9am1vF3qU+dOdC1pnonrQJFTz+0j/juplSk20PucafP/3wvh2zft/T8W8WV2ez\nHvk2/9S2VKWeCh2EdmG4sY+KRr24dp0X9cONT5t7NOzfNmf4UcGtfvPi7h3YqbPjSNx9b+4I\n4INc9lxOU70n18a2tmNjt4Ub+6jIWLPrKbHzohNHbB7Sos8CG+v7mRsUP2JptufrZs+In/tZ\nV+W8WLO4PTOFbPP41PLR73HfbMftGdpd4cY+KrLzd7WIXrPLaQ/e1kc2KqzrkWzPxubRHqEN\nyhhrXLZr6J9zZK9VzTcUt00JkTtIN0/xb89vC63wprogVebPiu3VKTfh48E5w+baPbH3pcUj\nZmy3d4LLx6s+G/8L/QOSmiJvkA6WJiQ3t2ZBY03dRjEzmP3jnta4gdKVgDJkDdJZk75+rImw\nn3C1FaTk2vmXXrz/R/0cp5WuBJQha5Da6zYyzK0Ik/UNiLaCxCR+7E+IIc6OaTcccGFks5je\nG2xutqd//QYfqXPG4Edftqve8f/kXU1AGrIGKZw9kX7a2zptjsaCxDDp5w5R/hix3Kf8wE9b\nerazsU7kSEODEcPqGsTNaaSMEwVDe3/ePaSEfQc9XJqsQfLjzkiMIn9pMEjUHfVgr4Q7mnuI\n4GZLTOyKk2s8RE7pr4BnhVtYj+rfr1VZ+CCuGsgapOgotnkSViIZQbKpK38h3DJvwclAo/gF\n8wbYcbmPi1kSxF1RddXzd4UrEU/WIA0j/dh3PxtI+yQEyZYic7g2Ub9dYKt7GRMo/6VT3UeN\n3hnXSkTLt7y7VGQNUlJN4sde3TWK5AtRZ5BSjm6V60xR7mV8x0foeMPzeUWPktsid/hk747M\nZwOWyjsZVyY2s2dSLNcm73mk+0MjuXd3i4qpcjqu5JG+xEhKCV7eRE2lz7j2MhG6zC7J6zeu\ns8pX3AeNO52NOiNpeEbUII4ZWovvFLNvTQhXptQsQuYLKpyOy9wy15LbaScSPH6TY2/j83Of\nID4qIngFQotG7MNpMR1E7e1hydKbHj3b2Sj4rKhhHLLbsIdt1xvpnjZQAqbjcsAa0wm2HZTf\nnmkFxXpSvPIh64u4UTi2pwLiLzPMhZbB4tZZHlGInY4mrZ6cU/p3y/VzGpOy2F8DyzgjSA5o\n25lr7wvMuEnR9TgSWFAXtt7GZgfKkLy5SUWRM98V4qcg/59RcH4nulIGe5kiPLONE7uqtQtA\nkBxQYQrfKTRfnh2eX71oj+3Luc2Hf1gmfLeSbWl6/m/D89VR5HFn8/w/ZEyudBAkB9QYy3dy\nf69oHVLgTusyzBWCywWdgSA54MNorj2swV+2aH7pn/lBcnz+0x4EyQGnPNl13h5UbqJ0JfT9\n5MXOOvFvbqyxnrUVzQoVarYi04cQJEcs8Ww8fenw0BI3lC5EAoONnb5Z0M+vmV232Lql9M7e\nvefP7+3dObMTdgiSQ450L5e33hc25k9QqU1tihVsulgDB9CkMjOAXVD9YMBXmTyIIAHYJ4Kf\ne3NCRCYPIkgAdrmfMeXsfpLJAXsESXLmw98vPYh3TKp3g/ALJJ0gmXxGRpCkdrgcyRdKSqvz\nVnB4IS2An7xwWUAmS1UhSBI7G9TmEsNcic+uvVNP7ubdcuziUIllM5uXGkGSWLs67MHS9IZv\nK10JiHSjYLWtT578Wa1gZmc/ECRppfn8zHU2eOEEjdpdbaXX6fStMl0NGkGS1k3Cz5h9hmhg\nqhy393jPnix+MxEkaT3V8f++ewj1pZPAhSBIEqvAL/w1TGhhNchaojqm6kKQJLbSk/2QtN7r\nB6UrUaPbfcN1PlWX2d5QcQiS1MYZ6gwdVs8wRuk61Ohi/lLf7vx1kNdHShdiG4IkuYODGjYc\nsF/pKlSpUS12HsRtNqatcAUIkvJubl68O1HpIlzRhYyr2zq2UrYQOyBISnvSy2gK04csUroO
\nF7TOl+/8X1FF67CHmwXp3LyhXzr+Pywlc6Pw39OYR5M85ildietZ68935mZ244JrcasgpQ0w\nFGxcxlDnuhSDO2mlDzc54pfZqa+HrnpnMs5m92iubCF2cKsgfRxk/dB6rnJ5F5rfo0Mnrk32\n+1nZQlzGvhEtO0/hLmerGcteZ73fa42iFdnDnYJ01eMXtr0duFiC0Z0U/TnfKT1T0TpchXmA\nvub73SP811r/41RIjZWnd4/z66p0Vba5U5AW5+Zvr+vaXoLRndSYv/KByb9A0TpcxVS/rZav\n6Z96sTPHXooPILpis1VwW6Q7BWlSJb4zuo4EoztpbCR3l9gBcuqV71NeRlMtUnPM4jqN+Le8\nzFXBZdYEx8rk/jvJuFOQ5gj6VM4AACAASURBVIfxnV6tBbeT1c3sH1h/3tdKtXjpm7f6RegD\n625UqiYFHchY5Wl+fnEDJU8s4+ld4SvZsuROQbrAz2/9KNccCUZ31tagqAETuwVUe2mNr7P5\nSn+zY3Uf4+dZP0urtur5t3G/+Ika52lM7glbNn0a2MzGStbUuFOQmHdDrWfK7zQq4lLXEdwY\n07xKx8UvH0is2ZC9CXCtfqdCJSnnNLnAdWZEihpnaH729rszOaaKrchOmgxS2tGfNmd2O/Cz\nDvqqXRv6lbRzLS3z6ZUblbgb71jGx6Vm3RTYu8KKcQdfkksOsrGhoLRgfr2QiXJdE6HFIG0u\nTHKa9B3uZvLQrvFdhq6y8yzSntIkKBuJlT9KP+bmO5PVt1C5aBuMnycxzJXYfLfEjHKJ8Et4\n7iQyvfvQYJA2e3x0nUnbXrKCuANfB307XWDM+6MLZRZISS3Nx3emlZN71y5gRbB3hSKGiuIm\nXfov4x3iHuL0QT/HaC9I5iLcAiW3c4lb4bd2a/Zj79Piot5jOGOfnn8VbB8v965dwZNN077d\nJfLMUYo/Pwfd1yIP/tlNe0E6SPhZXobVFFPCLR23UDAzK1zMMM4wl45nf492GjfJvWvN6BPF\nTpFxM0yuVWq0F6S1AXznuzDB7WzYR/iLSP80yH5efb9/o40X9433TZB7x9pxN6r40jMnFxSs\nLNfKIdoL0hYP/mDCV8XFlHCKXOE6a7I8oXHyg1ql2i6W4pTf6ebehBT5xlaC749vGtXkc9k/\nw1ldGlI/quXXSUrs2j4P+gYSEvKxbCvwaC9I9z25S1OZht0FtrL5KpMWMpvrdG+YxRbfedUa\nM7Onfz3aPyu2tLTztmfvOhkWPujrjyPyHqFcgB02+5cf8XW/nGVuyr9r+12Rcz047QWJ6VuA\nvcNnhkeWv2BLYwK9y0+wcVBvQhB7n/Ny4+bMHz9iZC/Xvlg4s5mgnba5SS6PkoPu2d7Q8oG6\nWAvrK8KztoVkf2G47j/YGve7lbP6I+N+NBikxCbZukwdWcO0NIvHzd19Bq7eNC5vFeE76dK6\neLafNKahMatT4z34X6JfjXecLvUNEww9lm+ZEVnwkh3brvLl8vYwu+wzfY0pzk02d4Io8Gro\noEcnZPk7o8EgMeZlHcvXG/hvVg9/l22ftbkZ0dfGOBu6VYrpm+VyLCX4+4fSvOgdW9ulZ+9g\nS4ypb8fGgxvxneYfUCvATrED+U7hb+XetYNWlNQRQ7W/pN+RFoNkQzX+xNDKbLtX73X+A07B\nhXwncOaafyhNR9y1GdceJVn+GUg5vHoPV3TfjGvYO/ags3f71fqU75SdLveuHTPJY+jeGzt6\nGFdJvic3DJLPBq6dTkiQzm+ss0fdan/MtQsIyW7w+pDK+4dy0/hOcOZr0Fv2lstStO8Y64HJ\nySX475WX/SrxLu249pmfa98DftrInZf9LEcmi1XS5X5BMvNvxaZ5ka3Mo8VBfZwcZ1Ywe1Bo\nuT7oKpO4JjSOxtmmUhnrZef5MfMNvvSadId5vCRHL0v/rJG9HZv5zXBC5G6vLv98wTFHnrDe\n
i9vl1EDXnjZtNH+xYkqOrD4wU+N+QWJKjbd+ve49xMP6Z2qHfrdzwyRXjdySwpzxNmyx/te/\n3jTePbTuwrWXdJl/NLvpw92PvtNgvcFilO/sh8yjuf4fi9tp+gjPkOhw3dt2HSrktcy7Oom5\nNcboQpNfZKZdxintupJf4OCGQZqc03qq9dt81bnPGHUGC2+e8tPgd8Zsz+SBB10NHoFEx9/H\n2onGPBC/eLIBMr9TOvPXt/l5+e/XH2DdbHogCdEFTBL5Wjgi+2rLCIdL1HRg2YdnA02GHCSf\nq89u36En34mRfOp1NwzSs5iwhWcutQkI5S4N7St83/mZKP/YHjUNbTL7DHR365pPM+4/+6Iq\njdo6B0w/dv33ON99mT88qi7f+YBbSDNp38q9Ym8TuOrBzQN22Terz2WZevj3qiMuNKtZ5iYV\n5f44PMom+VxnbhgkJmloECEewfw0kZ27CG2bGNHE+pbnSP4s7rFbHMp3RtTNfAPHpE8PtZTW\nOKsPPRMq851mRWOqvfsnjT0yCzJe5dp3oTKeK7nm+4W1MfcoKPm5JHcMksXlM5s9uSAl5Z0t\ntOGcXFwl23WZ31Z7mhxkW3O54ZRKu30i62kGtnpwV7Z/SqI+G9/C+D6NAxzjq/OdYRq8TOEn\nj7eXbJsb7b9L8j25aZAsf/vLN7Te8pXaM7fgOaA2GRcAFZib+QbNylunvTGPzmbPxQhipVes\nb/2HWaXPzh4n8RP8G2Cn2RnzavdoR2E0V7O/dZi+ULfz0u/IbYPEnC+cf+CcYSWChScYqZdx\nuKfKxMw3uF0+5P1Zn1TJto5udVm4EBE2cM5wHxP3Q/uiIIUhT+r2su3jXN9QGE0pKeeyvHRS\nnhm53DdIzKNJTSMbjLRxhXCHzlxrzpPVkd5ns96OrDPgAs3KBDyeHBdZT8cfFjiRcQ+jKO2L\nWC+jeNiskEvNreSQf2p7EkMFef6WZcGNg2SXpQHcnQK/eND4naXjMeFeQ5irWV9J5IAnsZ4N\nP2gVVPSU7U1d1M/GTn9c2v6hUcnZ0xEkYamVKlh/WTcEDlW6kpdknKf/w4PKzB7mTUNa9vtO\nvXMkPwz+hG0Xe55TrggKQTq1btYqhy4wsc11gsTcamCIqh9mHOxKi9T3qsBOIJneOE7pSlzC\n0hzJXKfMWOWKEB2kfXWJVfQeaiUxLhUkhtkze+TiC0oX8YpreRtY/nKda5v9pNKVuIQRGTed\nvNtBuSLEBulsIGn61erZLXQBdk5faheXCpILOlebBASRCoeUroM5Pa3noCWyzYuQheeLi/To\nqFwRYoPURsddb7VS14ZSRVYIki3n164UN4UiFaMMJd9pGpzvH2WrWOHPRTm92CTlihAbpLDa\nfKeuqMmvXuOGQTKvfTe66YgLSpfhkC/Zs2eJPbMrMUP6C4mhfdlLPCZnU/DIqsggJZMufK97\nHir1cNwvSMktvdqMHVTe5yelC3FAchC3KFh6JdnvdX/Vtmz1l/7zU1tjFjdxyUJkkNJD+Cls\nnkW8Taskxh2DNCAve6HqRM/jSldiv506/r7T6SWEN5Tc6fhQEtLMyRvL6BD71m4+aWI9z3Im\nNggHG0R4ZFrJderLPv+C89Zn4zvLcilaB0vpuSrFBqlnIaIvWDVcT0JrW7SkVJXbBWmrkT8V\n8rW49bVktZfwk7xOLq1sIa5AbJCCXxFFqSq3C9LzhR6fL+miAmm5uaNkKSVF3uyuBbhEyCUc\nIte4zuhqyhbikIWeC9MZ5m6rXKIWBdMGBMklmAtzUy7ez6fgqRDHfWnK26iKT3HXn25VeqKD\nlHpiB49aTW4YJOZX49B7DLOvQil13ctw47uhX/zq8lM3yEFskA4VJBnoFeWGQWLWh+nyB+ha\nuPTyDpA1sUGqRZp8MYNDryh3DBKTsve7tf8pXQQ4S2yQ/GLp1fKCOwYJVE1skAo7PHnOwyvX\nbN
7bgyCByogNUvcyWU8elYmjnXNbPk0Z8sULH5pAkEBlxAbpZmStdSdOsex4Zj8dyVMlNrZq\nKCE9hbZTc5BuyTNrDbgWsUG6UdaBo3azSCN+cvhj7UhWK+FZqTZIJ94OJN7RvyldBshObJBa\nkPwJQzm2n1i92PNTDuaaNQQ2VGuQ/vJpsurYpvcMXypdCMhN9LV2jkx57d/lRX94gMCGKg3S\ns4LcMiJLPNQ7tRU4R2SQnpIRDjyxeuSLzw91NPiKtN6bn/64yjBlC9G20z/N+ydZ6SJeJ/YV\nKaKaA/NUzSJN+MuyTncgQheVqTRIEzMWi/iouaJ1aNrlBiQkQh+6Xuk6XiM2SDsD2x64fYdl\nxzMTCAmLbv5WTDghXYVWUkCQIAsPI6JPMMyDocbNSlfyKrFByu7l0LV2B+ODreeR8sT/T3Az\nlQZpA97aSe6TQtzssv1d7A5I0XfIvmDns+9fuq7VKxtwsEF6Zb7g2gvEtWbHxP1INOHwt+Ry\nZCx67eFa7+0oBOnpEeElhpyg1iAxJ1rihKy0Cs7j2kdE+lX4HCE6SP+19LB8PBr9zhXHxrhf\ntqzAo6oNEoNLhCTWib/dYLGfhPMGpTp+e6XYIF0LI9XrEGYyyXfNoTHuCB6cUHOQQFKHPSZb\nm/38Ui4SMH9bzlNfaJDgiqhvEhukvuQ7ZqnlG4sM7zk0RsqWLQKPIkgu6cpOe85xSGy5T7n+\no5p5dJbqhT/9Hb9PtuyaXTTSsRldxAapQB2GDRLTvIjD42QNQXJB34YSQkr8qnQZzH+fvF23\nn9CfYXG+y3bY2jwq+45DTxMbpGy9+SD1yZbV5m9I//e4jfkyECTXM9xn4pmkIx8avle6EInV\nHMC1mz3uO/I0sUGqUpkPUvkKtp84cr7lS+pEX0K8ej14/cH7/Xo9VxNBcjWH9RvZdnJ2h36/\n1Cf7Gq5NdOywoNggfU7GpluD9Dmx42Q+sa4B8z4JbN27Kol6fc3SOx3bPFcBQXI1Q6K5NjV4\nqbKFSC3gZ65NIg6d1REbpLQYElGNvFeBlLLjaKQ1SMd0la2fWOeT0QIb4q2dy2n1Pt+p/YmS\nZUivGn9j3VbjXUeeJiZIEdbZQZNn5Ld8Bs0x8pE9T7QE6VvCre9Wo5LAhgiSy4l/l+9UG+fw\nc9NX92/ae75Kpr6c688uhZhUtbVDTxMTJMIv2fn4uJ3ZtQZpNJ+RBD+BDREklzM5gjvefNfk\n8HG7+7W83xrYPqSIOi5ATGuRY+re40vLFHRs+T8aQbL/iZYgLSHH2H6LUgIbIkgu56b/KGuT\n0qa4Q9NGWcWVumT5+qhZYaXXMLJP2pTCOpKj123HniVvkPKOXbEnpL21u8eju8CGCJLrWedd\nb9YvU8rkdHjG/P06bhXCRyHzqBclkceOzxwtKkit77zM9hPDdOyNS38yzFDvHJcENkSQXNDJ\n7iX8y390w+HnTSvJdzp3oluQSxEVpFfZ8czEIysndI/+i2EiwwTv7EOQNOSTOnxnQDNF65CW\nqCAFV32ZI0McF763D0HSkP8rwHdaJChZhsRk/YxkcfMUf3XQbaH7LhAkrbi0YfkfHr+w3bOm\nDQoXIyV5g3SwNCG5F7HdxriNQvuuxhHfEBLsu8LMMDsjGgvNd6N2sgbprElfP9ZEZln7CJL2\nPShSbb+ZudzD6Jm9Ym5dR03/TGUNUnvdRoa5FWGynplDkLRvRAT3Y+xYYc2UH88qXIzExAQp\nYYGDTwxvZP162tt69AZB0r7IaVx7hFx2+LlPFvRpNUx40jaXIussQn7clF2jyF8Ikjvw5u68\ncOaX7GCBnG3er2ds+/o9Ai5L1iBFR7HNk7ASyQiSGwhZxrW3ySEHn3k/d/xTS3MkVDVHzGUN\n0jDSj/0Ls4G0T5IlSJjQR1HNO3Dt/OyOXp83oRA3S/7vBscuHVWOrEFKqkn84qydUSRfiORB\n+rVR
sEfUwHviBwIn/WVgT3UcDhnj6DMbDeJac8iPdEuSjLwzrd4fGsm9u1tUTPLpuMYae/30\n58ziBYWu6ZPVo2NPlC5BbrONdUZPbOfVweF3BpUn8p3isymXJBWlpiw2X5B4Oq6/9ez59MSY\nBmJHomNtaR3Rld+kdBkyO/ph3crd1zn+vLf6cG2K3xq6BUlGK3N/m8+u3fLKte+dWnDtYXKG\nUlGizDIO3HtzVz/uvQ7YMjcHdzPBwmxvTJLjojQSpN2lSYCnvvVLUSozg+8Ercr0GfK6aJrP\ntl/6On6niztKLlvxFMOkL/WZrHQl9tJGkPb6dDnLpP5dvsSLp5X4mu/kXE6xMGdN4lfzSQ+b\no2whanGjob5wjRymCUrXYTdtBKkqd5z1QcFPn3+rVVeu/U93mFpZzuuZcTXVWx8qWoeKHJo7\nbqWKXr41EaRLhL//eVLJ599b63nA2pjbCa16IZuEdnyn6SBF6wCpaCJI23X8TU7rfF98s3PA\n9CPXNjf2P0CzMGfNCeVOSSblWKJwJSANTQRpP+Gn0f0h54tvplsn3PNs6hpzQN0LYieiNb+f\nB9c+aZMmgvTMj/873+HV1cTvnLIxWb98NpoaL/jfvDrZtipdCEhDE0FihudmV+b93rBNknJo\nON4h3FCo879KlwES0UaQUlr6dJk+rqHRtRdB1vKd1m5PG0FizCvjy1RPOChNNQA2aSRI1Fz7\n/Q96Jy9Sjq09qJo700AUBOllZ+oSL09dU0oXjH+bi/iR7JOFp/ADbUCQXnIhZ+MDaSm7Y/I7\nPi9vJiaZptxi7v2f/0Aag4GLQ5Be0q4me7g8qVxvCoNd9uJWW92id4VrlEBiCNILSab1XGdp\nEIXRvg7nO1VHURgNXByC9MIFcoHrHCQUFhweGMd3ejoxszOoDYL0wg1+DTRmu47CsbZRtflO\nu3cFt3MfB/vXi0lQ0VR1DkGQXhLOzxQwrByFwdZ7czd5JuZSzfpa0ppkrDdqbDPDB9o8L40g\nvWRmwF5r8z/v7ygMlhrV3LrUY2q3fLhO1Wq9x0prs913ltKVSAJBekl6D89OM6e3Mw6gMtqp\nsMJD540sGbKbymiqV/M9rp1YQJMvSQjSK9bHly7XUWh+I0fc/7xhkXojqJyTUr90D34GpVNO\nTASuAgiSva5v2et289JRlEh2cZ0b5ATNcR/t+tPB9celgSDZ50AV4qXz6PVQ6TrUK4T/4LnN\nQPEf8c47Br0HqX2S3ojOQpDssi9b+yOpjzYUrYJrUJ3VuwI7nbe5WWN6Yz4qUeb3pyn74oKU\nvw8aQbJL5Xi2uZFrmsKFqNe1vA0tLxyX3vE7Sm/MkYXY+SPTGzehN6aTtB+k49MTRq5xdDWE\n15zLeF8/uor4gtzV2RgSmJOU3kdxyAh+FtAdhrsUR3WK1oOU3l9Xqn093+LiPuD+7sEfsl0R\nTKEmt3Vm1fJjNI99m41/cJ2HRPG5orQepFGB1n/re2+FivqEuyPjmqFFYRRqAlqy8fPzX6N7\nJNAZGg/SfRM3YXFSuKjJbx95/cx12rcUXRLQU7sv1y4IFPneXTyNB2mtLz8h15C6osZ5rxB7\n1+xPLjxNkTv62fM3a3M690ilK9F6kOYX4jtflhY1ztM6gf0XzHjbMEV8SUDRSGP7r+f28W2e\nrHQhWg/Sumz8a/7g+uIGSpvbNLxs553iKwKqtsaXLPzWEhe4ek/jQXrovZRtn+ZXzUI7oEoa\nDxLzub/1wM7NRuEudoYXNEbrQTIPMxRqXtW7rEusfwnapfUgMcy5//towm+YWw6kpeIg3V4w\naNgPeMcGLkG9QVqcLW/TBjlybpa8GADbVBukDcYvLe/Xng30pngxMbwmaV6Puj0XKn+SRgVU\nG6Qy/KrGca2lLsZ9XSge3GnUO4GlNHlvOGVqDdINcojrrPAXua9fWhYp1GyZC5zSczmppetb\nb/e5W6
sSjtXYpNYgHSX8nfo7SZKYPZn7eHb7dl6CT/s0MaNo02pf7t/4mmmjwpWogFqDdIPw\nU9Ov9BO1p/nZ2Mt+jgZNEjWMJn3YlO/UHaZoHaqg1iAxZT7i2matRO2pNH/d8IxQvLl7XfdO\nfKfNe4rWoQqqDdJ641fWo3aDTUfE7OiZbjvXOUmuiRlHk0ZX5zvlxitahyqoNkjMQp/Q5o2C\nQzaJ2tFDsp/r/EfOixpIi/bpuanotumPKVyJCqg3SMyt+QOGLH0kck85F3HtOm9MtPWG7rnW\nM4x5TXBfpQtRAVUGac/MAbMO0dnTB1Hs9KnJ1TrQGU9TUj7y8C/p5zkkVelCVECFQXrQXF+6\nWQldh0Qae7oTUemPJ4nba+XFScfMXP9l5npMXm4PFQapYXHrW/a9BdtT2dWNdnqdQRf3H5XB\nwG2pL0i/e55l24P6/XR29nj3Pw/ojATuS31BGtiQ71QYJ3kdAHZSX5A69eA7LfpLWMFTCccG\nDVJfkD6M5TtVxkq1+11Nc5A87ZRf4QDUQ31B2uDNHWA7Ychscqwr+8WeWWKY743xK3Z938BH\nqwtwgwTUFyRzdAXrIbZTkc3eeCh9Rh5CdNVEzj53JRu7dou5Xyje34G91Bck5lYtz5hO1Y2x\nb06L38t/xplHuzt7/i5q518U465ffeq/QtQ47urKxu8Put9NKSoMEmP+/bPu4zKZhPtP/s3e\nR/lF3RzdMeNoRp3RYoZxU7db67LlIeHiLoFUITUGKSvd+KUiHniKWpe8Qy++U0/5qdlVJ7FM\n2Z3pzK2BHu42KY2WghSdcRiv+GwxOx9bhmuTg74XM4x7mpLnHtv2L+pm93dpKUh1R/GdwvPE\n7PycJ3dF+KgQrGHusGr8z+BKxh3M7kJLQRrI34h2Xifu4qGvDP23Xfj9HeMvokZxT/mW8h2/\ndYrWITstBemUxzfWJqlBNZFvKzZUMhKvurvEDeKeivHvqpMNWxWtQ3ZaChIz39hq3obJkfnF\n3+ya/B/uwXFKt0Zcu9rLzd4XaypIzJ62hbOV/1jxpeLd2BEP9mz26dAPla5EZtoKEijuB1PV\noRM7eDd3tzv3ESSg6+zHDat0XeVmB78RJAAqECQAChAk3ulFnyw8LfdOHZH2x/TxP4u/RwSk\noUSQ0v89buPgsuxBSuqmK1CrgK4rlZmJJHEo0rNsNf/A5UrXAZmTNUgj51u+pE70JcSrl+B8\nI7IHKT7/DsvXv/O3k3pHZyd27jnjuuPPuxzc7o7lpzXB+Cv9moACWYNEalu+vE8CW/euSqKE\njo/KHaQ9+oNse8gg8eUM0zxKdu8Y4bvM4ScmVOZu8fmoBO2SgArZg3RMV9nyp5WZT4Ru9pE7\nSJ9U4zvVpb0F6SePHyxf0ycbHb6JN2wu154iF+iWBHTIHqRvyT9sv0YlgQ3lDlLveL7T4V1J\n91N8OL+bJo4+M+P+nmf8Px+4GNmDNJrPSILQAmFyB2loPb5Tf4iUu7lG+JWjf/F29IRlLv6q\n6osEkxu5JNmDtIRwa4S0KCWwodxB2ux1iW0vm36TcjfHyU2us5s4Oq9Ke371vC+wIJprkjdI\neceu2BPCztm9x6O7wIZyB8kcXemqpblaqYakv6Z3dXu4zvIAR596xGu09WjDWpOoexZBMrIG\nKUxHrP60vJfyznFJYEPZD3/fqu7duG8Tn2o3pd1Nde6vh7me44vI/BJQML5nOcNntEsCOuQ9\nIZt4ZOWE7tF/MUxkmODsi/Jf2ZC+9uO2H/+cLvFe/vIYmcgwD7r7/+v4c29/3avjhJP0awIq\nFLpE6Ljwr6wjQTq5ePxPV8XWI5d1OX2rlDcVEjmFJbgetV9r96ANyV892HO41K8ltDxd98WU\nzSlKVwHUqTxI5rqR+63LnAYOlbYgAGFKBel+2bICj9odpJ+9uaX21hmFjl0ASE2pIN0hQqPY\nHaSerflO
xiU0AIpQKkgpW4SmFbY7SE0H8Z2YMWIrAhBB5Z+R3unKd4rPlKwYANuUCdK8HcKP\n2x2keSHcPaOHdUdElgQghjJBIglvfi91zU/PvWtvkJIimlonIjwf2VJ0TQAiyBmky+szkCaW\nL689+l/ewOd8iL2TE5wuFtS6f6xXAzeb1xNcjZxBWkReIbClA1c2PPuu71uDN+CSaFCWnEF6\n1JX4Dv/CilSxfBHYEtNxgcrI+xlpRVD4dnaETD4jvQxBApWR+WDD5br6YSkIEmiO3EftzJM9\nyx5DkEBr5D/8fbC4aTqCBBqjwHmkxL4EQQKNUeSE7JYpvwtvgCCBysgdpJun+Gm/b18R2ApB\nApWRN0gHSxOSexHbbUzphCyAK5A1SGdN+vqxJjLL2keQQEtkDVJ73UaGuRVhsk4WiiCBlsga\npHB27fjT3s0YBAm0RdYg+fVkm1HkLwQJtEXWIEVHsc2TsBLJCBJoiqxBGkb6scuLbSDtkxAk\n0BJZg5RUk/jFWTujSL4QBAk0RN7zSPeHRnLv7hYVo3VjH4ArUGoWIfMFKtNxAbgGlU/HBeAa\nECQAChAkAAoQJAAKECQAChAkAAoQJAAKECQAChAkAAoQJAAKECQAChAkAAoQJAAKECQAChAk\nAAoQJAAKECQAChAkAAoQJAAKECQAChAkAAoQJAAKECQAChAkAAoQJAAKECQAChAkAAoQJAAK\nECQAChAkAAoQJAAKECQAChAkAAoQJAAKECQAChAkAAoQJAAKECQAChAkAAoQJAAKECQAChAk\nAAoQJAAKECQAChAkAAoQJAAKECQAChAkAAoQJAAKECQAChAkAAoQJAAKECQAChAkAAoQJAAK\nECQAChAkAAoQJAAKECQAChAkAAoQJAAKECQAChAkAAoQJAAKECQAChAkAAoQJAAKECQAChAk\nAAoQJAAKECQACuQP0sMr19JtbYMggcrIHKSjnXMTQgz54ncIboYggcrIG6R+OpKnSmxs1VBC\negpthyCBysgapFmk0QGud6wdmSqwIYIEKiNrkKoXS83ommvWENgQQQKVkTVI/l1e9IcHCGyI\nIIHKyPuKFJn2vF8Hr0igITJ/RmpyhOud7kAmCWyIIIHKyHvULoGQsOjmb8WEE9LVLLCdFEFK\ne0p9SIAMMp9HOhgfbD2PlCf+f4Kb0Q/SoopeuvAP7lEeFYAn/5UN9y9dl//Khnd9Rmze/X9R\nha7SHRaA5x7X2q3y2m1tEqu2oDosQAb3CFLj3lz7t/4G1XEBeEoF6X7ZsgKP0g5SvqVcm278\ng+q4ADylgnSHCI1CO0h5v+fadI8tVMcF4CkVpJQtQr/StINU/32u3aPD0QaQhHt8RvrBhz0R\nnFy7CdVhATIoE6R5wrcjUQ+SOT7gi90nfqyY9wLVYQEyKBMkkpDJNx/ee24a7fNI6TOL6klQ\n1+t0RwXIIGeQLq/PQJpYvrz26FkdeUmik/vI2hMc+QbpyBmkReQVrz/837nnVpBkJ/cBoAg5\ng/SoK/Ed/oUVqWL5IrDl3wgSqIu8n5FWBIVvZ0fI7DPSSxAkUBmZDzZcrqsfloIggebIfdTO\nPNmz7DEECbRG/sPfo+DTYAAACg9JREFUB4ubpiNIoDEKnEdK7EsQJNAYRU7Ibpnyu/AGCBKo\njNxBunmKn9ru9hWBrRAkUBl5g3SwNCG5F7HdxkKjIEigMrIG6axJXz/WRGZZ+wgSaImsQWqv\n28gwtyJMpxgECbRF1iCFN7J+Pe3djEGQQFtkDZIft5bLKPIXggTaImuQoqPY5klYiWQECTRF\n1iANI/2eWdsNpH0SggRaImuQkmoSvzhrZxTJF4IggYbIex7p/tBI7t3domKC03EhSKAySs0i\nZL4gNB0XggQq45rTcSFIoDIIEgAFCBIABQgSAAUIEgAFCBIABQgSAAUIEgAFCBIABQgSAAUI\nEgAFCBIABQ
gSAAUIEgAFCBIABQgSAAUIEgAFCBIABQgSAAUIEgAFCBIABQgSAAUIEgAFCBIA\nBQgSAAUIEgAFCBIABQgSAAUIEgAFCBIABeoKUvq+RYv2pUu+ewBHqSpI+0uSggVJyf2S7x/A\nQWoK0qmADtcZ5nqHgNOSFwDgGDUFqWVDs7VJb9hK8gIAHKOiIKWY1nOddaYUySsAcIiKgnSN\n8G/pTpFrklcA4BAVBekx2cl1/tE9kbwCAIeoKEhM+cFcO6i85AUAOEZNQVrhudrarPZcKXkB\nAI5RU5CYCYaYwYNjDBMk3z+Ag1QVJObwkNjYIYcl3z2Ao9QVJAAXhSABUIAgAVCAIAFQgCAB\nUIAgAVCAIAFQgCABUIAgAVCAIAFQgCABUIAgAVCAIAFQgCABUIAgAVCAIAFQgCABUOCaQdpL\nAFRmr8O/5tIHiTm073WbycQlyioUr3ABLaMULmAMma9wBf7vK1xATOM3fjN5hxz/LZchSG+6\nQU4osduXVJyscAGf1FG4gN3kqcIVhPykcAFdu1IcDEFSBoKEIImHICFICBIFCBKChCBRgCAh\nSAgSBQgSgoQgUYAgIUgIEgUIEoKEIFGAICFICBIFCBKChCBRcFd3VondvqT6lwoX8HkjhQs4\naHimcAX5fla4gF69KA6mSJCYc4rs9SVXkxQu4MkNhQtQ/mfwX5rCBdy7R3EwZYIEoDEIEgAF\nCBIABQgSAAUIEgAFCBIABQgSAAUIEgAFCBIABQgSAAUIEgAFCBIABQgSAAUIEgAFCBIABfIE\n6cxXsuxGfQU8XnRZ2QJkpHwFUpInSB9k59r775f0rzWd68+uEVBj9htd+Qp4Nrymf6H4s8oV\nYNWVrFewgL/q+edpK9e/QGYV3B0Q5RM14J4cFWTx46b2WyhLkDZ7cf+Gl/OS+r1KkW7WfgIp\n1rko6fdaV74CHtQkUT0b6rwPKlWA1QrCBUmZApZ55u3wliHHRVkKyKyCe4VI7V61SMQD6SvI\n4sdN77dQhiC9U4wQ7t8wjvzEMOnvkd8Y5iBpnMqkNtQdfaUrYwHDSF/LNzboyyhVgMWVIF82\nSMoUcNFYxfIbPJd0kaGAzCsYTmZZvjGDfCJ9BZn/uCn+FsoQpLfj4vzYf8Mn+trWJtGvEcPE\nk8OW7n7S+ZWujAVE+rGTf9QnNxUqgGHMdcOHs0FSpoABZKe1iOlzZCgg8wqakluW7lXSQvoK\nMv9xU/wtlOczUkn233Af6cP+VwXPNCY4lO3myc283JWxgKg4thtLTilUAMNM1m//gg2SMgXk\nDXv+oBwFZFLBGPKDpfcdGS99BZn/uCn+FsoZpBuksbVJCyaX75Ma7ANVyKOXujIWwH//lilX\nqlIFHPQcxrBBUqaAx6TmoWY5w1qfkamATP4JHtT2iP8k3lj/kUwVvP7jpvlbKGeQmNL6Py1f\nRxJy8hJpzn4nllx5qStjAdy3T0eQhYxCBSRGlU3mgqRMAZdJYd9S3RvrffbKU0BmP4P5RkKI\nxxKZ/gne+HHT/C2UNUi7vQ3NepfzLUTOXSdvsd+JJdde6spYgPW/noz2Nn3NMAoV0Nd0jOGC\npEwB5wkZamaY33Xl5Ckgk3+CCaT54aeHmpKpslTw5o+b5m+hrEFiTrcKDYk9UovcSTfEsN+o\nakh/qStjAZb+xvwk7pSlVaaALcR6KoUNkjIF3CA52BkaG5KbshTwZgV3TcVTLP+dXMTnoQwV\nZPLjpvlbKG+QOAVyWD7UFWK7Yfle6cpYADOalNjG/aciBUx5vhT9PGUKSDdVZLsJZL8sBbxZ\nwT/8cYeeZK/0FWT646b4WyhrkObPsbyXYHZbz3rFk9OW7jES/0pXxgIWkfbJ/IOKFPB7glUV\n0iRhh0L/Ao392Xmba+mfyFLAmxVc5d9OWY+CS11B5j9uir+FsgapI1nMMI+jDZZPKP8jHRnG\n3I5sf6UrXwHm
YvmeT/+tSAEc7vC3MgVsIn0tb2OWkzh5CsikgjKGzZZv/KqvJHkFWfy4Kf4W\nyhqk84H66C75PRZb+11J3eExpMdrXdkKuEBCGnNuK1MAhwuSQgV0JaV6NSB5LstTQCYVHPHT\nNepTXxdwUvIKsvpx0/stlPcz0r+tc/vG/MF2zROr+1ef/HpXtgL+eP4R5YoyBXD4IClUwJRo\nv6h+92QqILMKrr0b5RPV+4b0FWT146b3W4j7kQAoQJAAKECQAChAkAAoQJAAKECQAChAkAAo\nQJAAKECQAChAkAAoQJAAKECQAChAkAAoQJAAKECQAChAkAAoQJAAKECQAChAkAAoQJAAKECQ\nAChAkAAoQJAAKECQAChAkAAoQJAAKECQAChAkAAoQJAAKECQAChAkAAoQJAAKECQAChAkFRi\nGimVxvU2kgJPla0F3oQgqURaWfI120ktTtYoXAu8CUFSi136wDvWdiZprHQp8CYESTUSSILl\n670grzNCWz2RqRp4FYKkGvdz6g8yzAdklKWfNq6qb8F+16zfvtipuCms1SFLr2vu1H6+3yhb\npNtCkNRjCYlmThoLJjJMcgyp2Ks2yX+RYY77erV6P84YdNUapN4h8X8rXaWbQpBUpA75IZas\ntXRmkLGWr4tJK4Z5n2ywdGeR7yxBMpS6o3CF7gtBUpFTnv4k1trJH5Fubap5PmW2LbV2N5IZ\nliCR5YqW59YQJDUZRbzOWponpNpSqzrkiOW/nh35ZWJRLkiChyFASgiSmpwnda3NcZLhH+Zp\nT29iLBrHBemR0gW6LwRJTf4j9a3NHdL3+bca6YYdSWN2cUHCsW/FIEhqwgeJyVGRbSaPZh4Y\nW1l7mxEkhSFIapIRpBFkHGM9ateBuUvqWXp3Y8g0BElRCJKaZATpUUlS4b23DPmuWN7akWrD\newXXI6XXI0hKQpDUJCNITNKQ8j4RfaxXNtxNCPWvuZh5L6AngqQkBAmAAgQJgAIECYACBAmA\nAgQJgAIECYACBAmAAgQJgAIECYACBAmAAgQJgAIECYACBAmAAgQJgAIECYACBAmAAgQJgAIE\nCYACBAmAAgQJgAIECYACBAmAAgQJgAIECYACBAmAAgQJgAIECYACBAmAAgQJgAIECYACBAmA\ngv8Hz8kMGQgWTSQAAAAASUVORK5CYII=",
"text/plain": [
"plot without title"
]
},
"metadata": {
"image/png": {
"height": 420,
"width": 420
}
},
"output_type": "display_data"
}
],
"source": [
"plot(ats)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Then, \n",
"* Compute the appropriate correlation coefficient between years and Temperature and store it (look at the help file for `cor()`).\n",
"* Repeat this calculation a *sufficient* number of times, each time randomly reshuffling the temperatures (i.e., randomly re-assigning temperatures to years), and recalculating the correlation coefficient (and storing it) \n",
"\n",
"```{tip}\n",
"You can use the `sample` function that we learned about above to do the shuffling. Read the help file for this function and experiment with it. \n",
"``` \n",
"* Calculate what fraction of the random correlation coefficients were greater than the observed one (this is your approximate, asymptotic p-value).\n",
"\n",
"* *Interpret and present the results*: Present your results and their interpretation in a pdf document written in $\\LaTeX$, and include the document's source code in the submission. (*Keep the writeup, including any figures, to one A4 page*)."
]
},
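{
"cell_type": "markdown",
"metadata": {},
"source": [
"The permutation steps above can be sketched as follows. This is only a minimal illustration on simulated data, not a model solution: the vectors `years` and `temps`, the seed, and the choice of 10000 permutations are all assumptions made for the example, so adapt them to the actual dataset.\n",
"\n",
"```r\n",
"set.seed(42)  # fix the random seed so the sketch is reproducible\n",
"\n",
"# Simulated stand-in data (hypothetical): 100 years with a weak warming trend\n",
"years <- 1901:2000\n",
"temps <- 24 + 0.01 * (years - 1901) + rnorm(length(years), sd = 0.5)\n",
"\n",
"# Step 1: observed correlation coefficient between years and temperature\n",
"obs_cor <- cor(years, temps)\n",
"\n",
"# Step 2: reshuffle the temperatures many times, recomputing the correlation\n",
"n_perm <- 10000  # an assumed number of permutations; increase it and check stability\n",
"perm_cors <- replicate(n_perm, cor(years, sample(temps)))\n",
"\n",
"# Step 3: fraction of shuffled correlations greater than the observed one\n",
"p_approx <- mean(perm_cors > obs_cor)\n",
"\n",
"print(obs_cor)\n",
"print(p_approx)\n",
"```\n",
"\n",
"Plotting the shuffled coefficients (for example with `hist(perm_cors)`) and marking the observed value with `abline(v = obs_cor)` may help when judging whether the number of permutations is sufficient."
]
},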
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Groupwork Practical: Autocorrelation in Florida weather\n",
"\n",
"\n",
"*This Practical assumes you have at least a basic understanding of [correlation coefficients](regress:correlations) and [p-values](13-t_F_tests.ipynb).*\n",
"\n",
"Your goal is to write an R script (name it `TAutoCorr.R`) that will help answer the question: *Are temperatures of one year significantly correlated with the next year (successive years), across years in a given location*? \n",
"\n",
"To answer this question, you need to calculate the correlation between $n-1$ pairs of years, where $n$ is the total number of years. However, here again, you can't use the standard p-value calculated for a correlation coefficient, because measurements of climatic variables in successive time-points in a time series (successive seconds, minutes, hours, months, years, etc.) are *not independent*. \n",
"\n",
"The general guidelines are:\n",
"\n",
"* Compute the appropriate correlation coefficient between successive years and store it (look at the help file for `cor()`).\n",
"* Repeat this calculation *a sufficient number of* times by randomly permuting the time series, recalculating the correlation coefficient for each randomly permuted sequence of annual temperatures, and storing it. \n",
"* Then calculate what fraction of the correlation coefficients from the previous step were greater than the observed coefficient from the first step (this is your approximate p-value).\n",
"* *Interpret and present the results*: Present your results and their interpretation in a pdf document written in $\\LaTeX$ (submit the document's source code as well).\n",
"\n",
"\n",
""
]
},
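{
"cell_type": "markdown",
"metadata": {},
"source": [
"The guidelines above can be sketched as follows, again on simulated data and under stated assumptions: the series `temps`, the helper `lag1_cor()`, the seed, and the number of permutations are hypothetical names and values chosen for illustration, not part of the practical's dataset or a definitive solution.\n",
"\n",
"```r\n",
"set.seed(1)  # fix the random seed so the sketch is reproducible\n",
"\n",
"# Simulated stand-in series (hypothetical): each year depends partly on the previous one\n",
"n <- 100\n",
"temps <- numeric(n)\n",
"temps[1] <- 24\n",
"for (i in 2:n) {\n",
"  temps[i] <- 24 + 0.5 * (temps[i - 1] - 24) + rnorm(1, sd = 0.5)\n",
"}\n",
"\n",
"# Correlation between the n - 1 pairs (year t, year t + 1)\n",
"lag1_cor <- function(x) {\n",
"  cor(x[-length(x)], x[-1])\n",
"}\n",
"\n",
"# Observed correlation between successive years\n",
"obs_cor <- lag1_cor(temps)\n",
"\n",
"# Permute the series many times and recompute the lag-1 correlation each time\n",
"n_perm <- 10000  # an assumed number of permutations\n",
"perm_cors <- replicate(n_perm, lag1_cor(sample(temps)))\n",
"\n",
"# Fraction of permuted correlations greater than the observed one\n",
"p_approx <- mean(perm_cors > obs_cor)\n",
"\n",
"print(obs_cor)\n",
"print(p_approx)\n",
"```\n",
"\n",
"Dropping the last element (`x[-length(x)]`) and the first element (`x[-1]`) lines up each year with the one that follows it, which gives the $n-1$ pairs mentioned above."
]
},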
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Readings and Resources\n",
"\n",
"Check the readings under the R directory in the TheMulQuaBio repository. Also, search online for \"R tutorial\", and plenty will pop up. Choose the ones that seem most intuitive to you. *Remember, all R packages come with pdf guides/documentation!*\n",
"\n",
"### R as a Programming language\n",
"\n",
"* There are excellent websites besides CRAN. In particular, check out \n",
"[statmethods.net](https://www.statmethods.net/) and the [R Wiki](https://en.wikibooks.org/wiki/R_Programming). \n",
"* https://blog.sellorm.com/2017/12/18/learn-to-write-command-line-utilities-in-r/\n",
"* https://www.r-bloggers.com/2019/11/r-scripts-as-command-line-tools/\n",
"\n",
"### Mathematical modelling and Stats in R \n",
"\n",
"* The Use R! series (all yellow books) by Springer is really good. In particular, consider: \"A Beginner's Guide to R\", \"R by Example\", \"Numerical Ecology With R\", \"ggplot2\" (coming up in the [Visualization Chapter](08-Data_R.ipynb)), \"A Primer of Ecology with R\", \"Nonlinear Regression with R\", and \"Analysis of Phylogenetics and Evolution with R\".\n",
"* For more focus on dynamical models: Soetaert & Herman. 2009 \"A practical guide to ecological modelling: using R as a simulation platform\".\n",
"\n",
"### Debugging\n",
"\n",
"* [`testthat`](https://testthat.r-lib.org/): Unit Testing for R by Hadley Wickham (also developer of the `tidyverse` collection of packages, including `ggplot2`).\n",
"* [Notes on debugging in R by Wickham](https://adv-r.hadley.nz/debugging.html)\n",
"* A good [overview of debugging methods in R with examples](https://data-flair.training/blogs/debugging-in-r-programming) \n",
"* [An introduction to the Interactive Debugging Tools in R](http://www.biostat.jhsph.edu/~rpeng/docs/R-debug-tools.pdf) by Roger D Peng. \n",
"* If you are using RStudio, [see this](https://support.rstudio.com/hc/en-us/articles/205612627-Debugging-with-RStudio).\n",
"\n",
"### Sweave and knitr, and beyond\n",
"\n",
"There are tools built around R that allow you to write your Dissertation Report or some other document such that it can be updated automatically if embedded data analyses or R simulations / calculations change. Instead of inserting a prefabricated graph or table into the report, the master document contains the R code necessary to obtain it. When run through R, all data analysis output (tables, graphs, etc.) is created on the fly and inserted into a final document like a pdf or html. The report will then be automatically updated if data or analysis change, which allows for truly reproducible research. To learn more about these tools, check out:\n",
"\n",
"* [Sweave and knitr](https://support.rstudio.com/hc/en-us/articles/200552056-Using-Sweave-and-knitr) and the [knitr website](http://yihui.name/knitr/)\n",
"* [Quarto](https://quarto.org/)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "R",
"language": "R",
"name": "ir"
},
"language_info": {
"codemirror_mode": "r",
"file_extension": ".r",
"mimetype": "text/x-r-source",
"name": "R",
"pygments_lexer": "r",
"version": "4.2.1"
},
"latex_envs": {
"LaTeX_envs_menu_present": true,
"autoclose": false,
"autocomplete": false,
"bibliofile": "biblio.bib",
"cite_by": "apalike",
"current_citInitial": 1,
"eqLabelWithNumbers": true,
"eqNumInitial": 1,
"hotkeys": {
"equation": "Ctrl-E",
"itemize": "Ctrl-I"
},
"labels_anchors": false,
"latex_user_defs": false,
"report_style_numbering": false,
"user_envs_cfg": false
},
"toc": {
"base_numbering": 1,
"nav_menu": {},
"number_sections": true,
"sideBar": true,
"skip_h1_title": false,
"title_cell": "Contents",
"title_sidebar": "Contents",
"toc_cell": false,
"toc_position": {
"height": "447.67px",
"left": "33.6667px",
"top": "422px",
"width": "310.288px"
},
"toc_section_display": true,
"toc_window_display": true
},
"toc-autonumbering": true
},
"nbformat": 4,
"nbformat_minor": 4
}