R notes
Preface
1
Starting RStudio
1.1
Download, install, and run
1.2
Make a new R Notebook file
1.3
Save the file
1.4
Run the example R command
1.5
Clear it and try a sum
1.6
Did that help?
2
Starting R
2.1
Arithmetic
2.1.1
Activities
2.1.2
Answers
2.2
Variables
2.2.1
Activity
2.2.2
Answer
2.3
A note on variable names
2.4
Vectors
2.4.1
Activity
2.4.2
Answer
2.5
Functions
2.5.1
Activity
2.5.2
Answer
2.6
More functions
2.6.1
Activity
2.6.2
Answer
2.7
Data frames
2.7.1
Activity
2.7.2
Answer
2.8
Loading data frames from a file
2.8.1
Activity
2.8.2
Answer
2.9
Packages
2.9.1
Activity
2.9.2
Answer
2.10
The end!
3
Visualising data in the tidyverse
3.1
Getting set up
3.2
An interlude on functions
3.3
A scatterplot in ggplot
3.3.1
Warm-up activity
3.3.2
Answer
3.4
Another aesthetic: colour
3.5
Another geom: jitter
3.5.1
Activity to develop your help-searching skill!
3.5.2
Answer
3.6
Aggregating/summarising data by group
3.6.1
Activity
3.6.2
Answer
3.7
Pipes
3.8
Plot the mean life expectancy by continent
3.8.1
Activity
3.8.2
Answer
3.9
Yet another geom: line
3.9.1
Activity
3.9.2
Answer
3.10
Filtering data along the pipeline
3.10.1
Activity
3.10.2
Answer
3.11
Other handy tools: select, slice, bind, and arrange
3.12
Filtering for members of a vector
3.13
Final challenge
3.13.1
Activity
3.13.2
Answer
3.14
More ideas for visualisations
4
P-values and confidence intervals
4.1
Correlation recap
4.2
Testing null-hypotheses
4.2.1
What can samples look like when the true correlation is 0?
4.2.2
Understanding actual data in relation to these simulations
4.2.3
So, what is a p-value?
4.3
Confidence intervals
4.3.1
Simulating confidence
4.3.2
What is a confidence interval, then?
4.4
Further reading
5
Linear regression
5.1
Before we begin
5.2
The dataset
5.3
Interlude on methodology
5.4
Descriptives
5.4.1
Activity
5.4.2
Answer
5.5
Prep to understand the simplest regression model
5.5.1
Activity
5.5.2
Answer
5.6
The simplest regression model: intercept-only model
5.6.1
Activity
5.6.2
Answer
5.7
Adding a slope to the regression model
5.7.1
Activity
5.7.2
Answer
5.8
Residuals
5.8.1
Activity
5.8.2
Answer
5.9
Comparing models
5.10
Regression with two or more predictors
5.10.1
Activity
5.10.2
Answer
5.11
Interpreting regression models with two or more predictors
5.12
Optional: that pesky negative intercept
5.12.1
Activity
5.12.2
Answer
5.13
Finally: confidence intervals
5.14
Very optional extras
5.14.1
Making functions
5.14.2
Another way to make scatterplots: GGally
6
Linear regression diagnostics
6.1
Before we begin
6.2
The dataset
6.3
Fit a regression model
6.3.1
Activity
6.3.2
Answer
6.4
Checking for normally distributed residuals
6.4.1
Base R histogram
6.4.2
Quantile-comparison plot
6.4.3
Statistical test of normality
6.5
Checking constant residual variance
6.5.1
Activity
6.5.2
Answer
6.6
Checking for relationships between residuals and predicted outcome or predictors
6.7
Checking linearity
6.7.1
What should be linear in a linear model?
6.7.2
Checking for linearity
6.8
Checking influence: leave-one-out analyses
6.8.1
Residual outliers
6.8.2
Cook’s distance
6.8.3
DFBETA and (close sibling) DFBETAS
6.8.4
View them all
6.8.5
So, er, what should we do with “potentially influential” observations…?
6.9
Checking the variance inflation factors (VIFs)
6.9.1
Activity
6.9.2
Answer
6.10
The challenge
6.10.1
Activity
6.10.2
Answer
7
Categorical predictors and interactions
7.1
Before we begin
7.2
The dataset
7.3
Factors
7.4
Visualising the data
7.5
The punchline: occupation type does predict prestige
7.6
Understanding factors in regression models
7.6.1
How are categorical variables encoded?
7.6.2
How are binary (two-level) categorical predictors encoded?
7.6.3
Categorical predictors with 3 or more levels
7.7
Interpreting the coefficients
7.7.1
Activity
7.7.2
Answer
7.8
Checking all combinations
7.9
The intercept is not always the mean of the comparison group
7.10
Recap
7.11
Challenge
7.11.1
Activity
7.11.2
Answers
7.12
Brief introduction to interactions
7.12.1
What is an interaction?
7.12.2
How to test for interactions in R
7.12.3
Understanding interactions
7.12.4
Further reading
8
Logistic regression
8.1
Setup
8.2
The dataset
8.3
Warmup activity
8.3.1
Activity
8.3.2
Answer
8.4
The punchline
8.5
Intermezzo: parametric versus nonparametric
8.6
What is a generalised linear model?
8.7
What is the log function again…?
8.7.1
The arithmetic
8.7.2
Why log?
8.8
Intercept-only models again
8.9
Odds and log odds
8.10
Back to that intercept
8.11
Interpreting model slopes
8.11.1
Interpret on the log-odds scale
8.11.2
Interpret using the “divide-by-4” approximation
8.11.3
Interpret using odds
8.11.4
Interpret using predicted probabilities
8.12
Diagnostics
8.12.1
Check the residual distribution
8.12.2
Check that the residual mean is constant
8.12.3
Linearity of predictors
8.12.4
Influence
8.12.5
Multicollinearity
8.13
A challenge
8.13.1
Activity
8.13.2
(An) Answer
9
Complex surveys
9.1
Readings
9.2
The dataset
9.3
The components of a survey design
9.4
Describing the data
9.4.1
Activity
9.4.2
Answer
9.5
Fitting a GLM
9.5.1
Activity
9.5.2
Answer
9.6
Slopes
9.6.1
Activity
9.6.2
Answer
9.7
Diagnostics
9.8
Another worked example: the European Social Survey
9.8.1
Set up the survey object
9.8.2
Try the analysis
10
Multilevel models
11
Mediation analysis
11.1
Simulated Example
11.1.1
Make up some data
11.1.2
Analyse it
12
References
Using R for social research
Chapter 10
Multilevel models
(Work in progress!)