…because you should do it just one time.
1. Meet R all on its own
- Open R (outside of RStudio) once just to see what it looks like. It’s very likely that you will never open R on its own again…but there are people who work entirely in the basic R interpreter and minimal environment.
- Now you’ve seen it. Close it - from now on, you’ll open RStudio (which opens R along with it). RStudio is an incredibly awesome and user-friendly integrated development environment (IDE). An IDE provides a place for data scientists to see and work with a bunch of different aspects of their work in a nice, organized interface.
Always remember: R is the programming language. RStudio is the IDE.
2. Meet RStudio
RStudio provides a nice user interface for data science, development, reporting, and collaboration (all of which you’ll learn about throughout the MEDS program) in one place. Note that while it’s called RStudio, it is a useful IDE for development in a number of languages and file types (check out some by clicking on File > New File, and seeing the multitude of options that RStudio suggests).
Let’s take a quick tour of the RStudio IDE, then customize it for fun and functionality:
- Primary panes include the Source pane, Console pane, Environment pane (which contains Environment, History, Connections, Build, Tutorial tabs), and the Output pane (which includes Files, Plots, Packages, Help, Viewer, and Presentation tabs)
- Update or arrange panes (and even add additional Source panes) by navigating to Tools > Global Options > Pane Layout
- Update your theme by navigating to Tools > Global Options > Appearance (pick a theme & editor font – the above screenshot uses the Material theme and Monaco editor font)
- Update some important settings (also under Tools > Global Options), including:
  - Turn on rainbow parentheses: Code > Display > Use rainbow parentheses (CHECK)
  - Turn on code wrapping: Code > Editing > Soft wrap R source files (CHECK)
  - Prevent RStudio from saving your workspace: General > Basic > Save workspace to .RData on exit: (Choose NEVER)
You can check out the RStudio User Guide for additional information and helpful tips!
3. Basic calculations in the Console
Next, let’s familiarize ourselves with some basic operations in the Console:
- Pressing Enter / Return will immediately execute a line of code typed into the Console. Try using some of the common operators (*, /, +, -, ^), as well as functions to perform calculations in the Console:
RStudio Console
# perform calculations using operators:
7 * 11
21 / 3
1 + 1
4 - 3
4 ^ 2
# perform calculations using functions:
sqrt(4) # compute the square root
- Cool! But what exactly is a function? Functions are self-contained pieces of code, which are built to accomplish a specific task. They accept inputs (we call these arguments), and return outputs.
- Check out the R documentation for a function using the syntax, ?function_name. For example, try running, ?sqrt, in the Console.
  - The documentation tells us that sqrt() computes the (principal) square root of x, \(\sqrt{x}\), and takes one argument, x. We must supply a value for the x argument in order for the function to execute. The Arguments section of the documentation tells us that x can be a numeric or complex vector or array. We can supply just our numeric value (e.g. 4), or explicitly include the argument name as well:
RStudio Console
sqrt(4)
sqrt(x = 4)
- Store objects using the assignment operator, <-. Any objects (also called variables) will appear in the Environment tab. Use snake_case and always start object names off with a letter (and not a number). For example:
RStudio Console
current_year <- 2024
class_num <- "EDS 212"
temp_c <- 17.4
- Now let’s Restart our R session by clicking Session > Restart R. Are your objects still in your Environment? NOPE!
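The Console steps above can be pulled together in one short sketch. It uses round() as a second example of supplying arguments by position or by name, then class() to check what kind of value each stored object holds. The object names rounded_positional and rounded_named are made up for illustration.

```r
# Positional vs. named arguments: both calls do the same thing
rounded_positional <- round(3.14159, 2)
rounded_named <- round(x = 3.14159, digits = 2)
identical(rounded_positional, rounded_named)

# Store objects with the assignment operator, then inspect them
current_year <- 2024
class_num <- "EDS 212"
class(current_year) # "numeric"
class(class_num)    # "character"
```

Naming arguments takes a little more typing, but it makes a call easier to read once a function has more than one argument.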
4. Write an R script
If we want to save our code so that we can access / re-run it later (which, in most cases, you do!), we can instead write our code in an R script (rather than directly in the console):
Create a new R script by clicking on the new file button (top left corner of RStudio), then choosing R Script.
Add some variables to your script, then save it (to your Desktop is fine for now). Run your code and see your variables appear in your Environment:
.R
# my variables ---
current_year <- 2024 # numeric (a whole number)
class_num <- "EDS 212" # character string
temp_c <- 17.4 # numeric
We’ll talk a lot about the importance of annotating your code (i.e. leaving notes so that collaborators, including future YOU, can better understand your code and decisions) throughout the MEDS program. Any text prefixed with a # is an “annotation” and will not be executed when you run your code.
- Restart R, then reopen your script – your Environment will be cleared after restarting your R session, but now we can easily re-run our script, which preserves our variables.
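Re-running a saved script can also be done from code with source(). The sketch below is only an illustration: it writes the script to a temporary file (standing in for the file saved to your Desktop) and runs it inside a fresh, empty environment, which mimics starting from a cleared Environment. The names script_path and script_env are hypothetical.

```r
# Write the example script to a temporary file (a stand-in for your saved .R file)
script_path <- tempfile(fileext = ".R")
writeLines(c(
  "# my variables ---",
  "current_year <- 2024",
  "class_num <- \"EDS 212\"",
  "temp_c <- 17.4"
), script_path)

# Run the script in a new, empty environment to mimic a fresh R session
script_env <- new.env()
source(script_path, local = script_env)

# The variables are recreated just by re-running the script
ls(script_env)
```

The objects themselves are disposable because the script can rebuild them at any time.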
5. Introduction to Quarto
Quarto is a publishing framework that lets you make all kinds of things (dashboards, websites, notebooks, slides, books, etc.) that combine markdown (plain text with added formatting), code, and outputs in one place - which makes it an incredible tool for reproducibility. Let’s make a Quarto document (New File (top left corner) > Quarto Document… > provide it with a title (e.g. My first Quarto doc!)) and learn by doing.
Render your Quarto doc – when you first create a new Quarto doc, it’ll have some pre-populated content. Click the Render button at the top of your file (it’ll prompt you to save your file first – you can save this to your Desktop for now). This converts all the content first to plain markdown, then to the file output type that you select (the default, and the one we’ll use most often, is HTML)
Delete all the pre-populated content starting on line 6 (beneath the YAML gates, ---; more on YAML later).
Let’s add some of our own formatted content to the body of our Quarto doc, including:
- Some text formatting
  - Headers (with different numbers of pound signs, #, ##, … to start the line)
  - Bullet points (asterisks, *, or dashes, -, to start lines)
  - Links ([text here!](link here))
  - Bold (double asterisk) & italics (single asterisk)
- Code chunks – any executable code must be written inside a code chunk. Add a new code chunk by clicking on the “new code chunk” button (green box with a ‘C’ and ‘+’ at the top right of your file), or use the keyboard shortcut: command + option + I (Mac) or control + alt + I (Windows/Linux). Try adding the same variables as before to your new code chunk, along with annotations (again, using the # symbol).
- Outputs – run code within code chunks using the keyboard shortcut, command/control + enter/return, or using buttons (Run button at the top right of your file, or the green play button on each individual code chunk)
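As a rough sketch, the contents of a code chunk might look like the following: the same variables as before with annotations, followed by an expression on its own line so that its value shows up as output beneath the chunk. The temp_f object and the Celsius-to-Fahrenheit conversion are added here purely for illustration.

```r
# my variables ---
current_year <- 2024   # numeric
class_num <- "EDS 212" # character string
temp_c <- 17.4         # numeric

# convert the temperature from Celsius to Fahrenheit
temp_f <- temp_c * 9 / 5 + 32

# an object name on its own line prints its value beneath the chunk
temp_f
```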
6. Our first function
We’ll get into the weeds of functions in EDS 221. For now, we’ll create functions to quickly do repeated calculations, and to familiarize ourselves with function notation.
General function notation looks like this:
function_name <- function(argument_1, argument_2) {
  function_body
}
For example, let’s make a function to help us convert units of \(\frac{g}{cm^3}\) to \(\frac{kg}{ft^3}\), given that \(1 cm^3 = 3.531\times10^{-5}ft^3\)
- First, we should write down the dimensional analysis to keep our conversion straight (this is worth writing out by hand, every time):
\[\frac{g}{cm^3}*\frac{1 kg}{1000 g}*\frac{1cm^3}{3.531\times10^{-5}ft^3}\]
- Let’s add a new code chunk to our Quarto doc, then make a function that will convert any value input in \(\frac{g}{cm^3}\) to \(\frac{kg}{ft^3}\):
.qmd
convert_units <- function(value_g_cm3) {
  value_kg_ft3 = value_g_cm3 * (1 / 1000) * (1 / 3.531e-5)
  print(value_kg_ft3)
}
- Try it out! Convert \(50\frac{g}{cm^3}\) to \(\frac{kg}{ft^3}\) using the function you’ve created.
.qmd
convert_units(50)
[1] 1416.029
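As one possible variation (a sketch, not the version used in this session), the conversion factor can become a second argument with a default value, which matches the two-argument shape of the general notation. The names convert_density and ft3_per_cm3 are hypothetical. This version returns the value rather than printing it, so the result is easy to store in an object.

```r
# Hypothetical variant: the cm^3-to-ft^3 factor is an argument with a default value
convert_density <- function(value_g_cm3, ft3_per_cm3 = 3.531e-5) {
  # g/cm^3 * (1 kg / 1000 g) * (1 cm^3 / ft3_per_cm3 ft^3)
  value_kg_ft3 <- value_g_cm3 * (1 / 1000) * (1 / ft3_per_cm3)
  return(value_kg_ft3)
}

# Store the result instead of only printing it
density_kg_ft3 <- convert_density(50)
density_kg_ft3

# Arithmetic in R is vectorized, so several values can be converted at once
convert_density(c(1, 2.7, 50))
```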
7. Close & reopen - no precious outputs or objects
If you are writing code reproducibly, you should be able to close the file you’re working in without stress. That’s because all of your stored objects - functions, variables, etc. - should be recreated by opening and re-running the code in your file. If you cannot do that, then your code is not reproducible.
That means that your scripted code is what is precious - and we want to build bomb-proof strategies for making sure it’s safe.
Which brings us to a critical lesson: Create things like you expect your computer to explode at any minute. Your computer is NOT a safe place. Where is? Somewhere away from your local computer…somewhere cloudlike and wonderful. Somewhere like GitHub (coming up soon!).
End interactive session 1A