Causation -- The idea that a change in an independent variable changes the value of a dependent variable.
Code -- A set of rules or instructions to tell your computer what you want it to do.
Computer lab -- A room offering access to computing resources, where students will learn how to write code in R to analyse data to answer social scientific questions.
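Constant -- A value that does not vary across units. In a linear regression model, the constant (also called the intercept) is an unknown parameter that has to be estimated.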
Correlation -- A measure of linear association between two variables. Correlation is not the same as causation.
Data -- Facts from which conclusions can be drawn.
Dependent variable -- The variable whose values are supposed to be explained by changes in the independent variables.
Error -- The difference between an observed and a predicted value of the dependent variable.
Estimator -- A rule for calculating an estimate of a parameter. For instance, OLS is an estimator for the regression coefficients.
Independent variable -- The variable that is supposed to explain changes in the dependent variable.
Linear regression model -- A statistical model that represents a continuous dependent variable as a linear function of the parameters (regression coefficients) and the independent variables, plus an error term that follows a normal distribution. The regression coefficients are usually estimated via OLS. For instance, one might say that income depends on education plus chance events. Expressed as a linear regression model, this would mean that income = a + b*education + e, where a is a constant, b is the regression coefficient of education, and e is the error term.
Model (statistical) -- An idealised representation of the process that generated the values of the dependent variable.
Normal distribution -- A symmetric, bell-shaped distribution that is fully described by its mean and variance.
OLS (ordinary least squares) -- An estimator that minimises the sum of squared errors.
Parameter -- An unknown quantity such as the population mean.
R -- A free software environment for statistical computing and graphics.
Regression coefficient -- An estimate of how much the dependent variable changes, on average, when the independent variable increases by one unit.
Statistical distribution -- A function showing all possible values of a variable and their relative frequencies.
Unit -- A member of a population. For instance, a voter (unit) is a member of the electorate (population).
Variable -- A numerical value that can differ across units. For instance, a voter's age (variable) can be 18 years, 19 years, and so on.
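
Worked example -- The entries for linear regression model, OLS, regression coefficient, and error can be tied together with a small sketch. It is written in plain Python rather than R so that every step of the calculation is visible; in R the same model would normally be fitted with the built-in lm() function. The data values and the helper names (mean, ols_simple) are made up for illustration and do not come from any real survey.

```python
# Hypothetical data for five units (not real survey results):
# years of education and annual income in thousands.
education = [10, 12, 14, 16, 18]
income = [25.0, 31.0, 34.0, 41.0, 44.0]


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def ols_simple(x, y):
    """OLS estimates (a, b) for the model y = a + b*x + e.

    Illustrative helper for one independent variable only.
    """
    x_bar, y_bar = mean(x), mean(y)
    # Sum of cross-products and sum of squares around the means.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b = sxy / sxx          # regression coefficient of x
    a = y_bar - b * x_bar  # constant (intercept)
    return a, b


a, b = ols_simple(education, income)

# Predicted values and errors (observed minus predicted).
predicted = [a + b * x for x in education]
errors = [y - p for y, p in zip(income, predicted)]

# The quantity that OLS minimises: the sum of squared errors.
sse = sum(e ** 2 for e in errors)

print(f"a = {a:.2f}, b = {b:.2f}")  # prints "a = 1.40, b = 2.40"
print(f"sum of squared errors = {sse:.2f}")
```

In this made-up example, b = 2.40 means that one additional year of education is associated with an income that is 2.4 thousand higher on average. Any other choice of a and b would give a larger sum of squared errors for these five units.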