- Questions, problems, or concerns?
- Please attempt to have R and R Markdown running by section next week (you still do not need to use R or Markdown for this week’s problem set).
- Feel free to stop by office hours, set up an individual appointment, or drop by the Stat Lab if you are having trouble.

- Definition of a random variable: a variable that takes on a real value that is determined by a random generative process.
- A random variable is neither random nor a variable.
- Placeholder for a quantity that has yet to be determined by a random generative process (function).
- A random variable is a function.
- A random variable’s possible values might represent the possible outcomes of a yet-to-be-performed experiment, the results of a random process such as rolling a die, or the “subjective” randomness that results from incomplete knowledge of a quantity.
- How can we summarize random variables?
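Before turning to summaries, the claim that a random variable is a function can be sketched in a few lines of Python (the same idea carries over to R). The two-coin-flip sample space and the names `sample_space`, `X`, and `draw_outcome` are illustrative assumptions, not part of the course material.

```python
import random

# Hypothetical sample space: two flips of a fair coin.
sample_space = ["HH", "HT", "TH", "TT"]


def X(outcome: str) -> int:
    """Random variable: maps an outcome to a real number (the count of heads)."""
    return outcome.count("H")


def draw_outcome(rng: random.Random) -> str:
    """The random generative process: picks one outcome uniformly at random."""
    return rng.choice(sample_space)


rng = random.Random(0)  # seeded only so that reruns are reproducible
outcome = draw_outcome(rng)
value = X(outcome)  # X is deterministic; the randomness lives in the outcome

# The set of values X can take, regardless of which outcome is drawn.
possible_values = sorted({X(w) for w in sample_space})
print(outcome, value, possible_values)
```

Note that `X` contains no randomness at all: it returns the same number every time it is given the same outcome.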

- The probability mass function (PMF) summarizes the probability of each outcome \(x\) occurring.
- A *function* that describes the probability that a *discrete* random variable is exactly equal to some value. The PMF maps each possible outcome of a random variable to the probability of that outcome occurring.
- How much “stuff” is there associated with a given event?
- All probabilities add up to 1.

*Example formalization*

Below is the probability mass function of an unfair die, where we observe each of 1, 2, and 3 with probability \(\frac{1}{12}\), and each of 4, 5, and 6 with probability \(\frac{1}{4}\). The PMF can therefore be defined as:

\[ f(x) = \begin{cases} \frac{1}{12} & : x = 1 \\ \frac{1}{12} & : x = 2 \\ \frac{1}{12} & : x = 3 \\ \frac{1}{4} & : x = 4 \\ \frac{1}{4} & : x = 5 \\ \frac{1}{4} & : x = 6 \\ 0 & : \text{otherwise} \end{cases} \]

What is \(Pr[X \geq 2]\)?

\[ Pr[X \geq 2] = \sum_{x = 2}^6 f(x) = \frac{1}{12} + \frac{1}{12} + \frac{1}{4} + \frac{1}{4} + \frac{1}{4} = \frac{11}{12} \]
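The calculation above can be checked with a short Python sketch that uses exact fractions. The names `pmf`, `f`, and `prob_at_least_2` are illustrative choices; only the probabilities come from the example.

```python
from fractions import Fraction

# PMF of the unfair die: 1, 2, 3 each have probability 1/12; 4, 5, 6 each have 1/4.
pmf = {
    1: Fraction(1, 12),
    2: Fraction(1, 12),
    3: Fraction(1, 12),
    4: Fraction(1, 4),
    5: Fraction(1, 4),
    6: Fraction(1, 4),
}


def f(x):
    """Return Pr[X = x]; any value outside 1..6 has probability 0."""
    return pmf.get(x, Fraction(0))


# All probabilities add up to 1.
total = sum(pmf.values())

# Pr[X >= 2]: sum the PMF over x = 2, ..., 6.
prob_at_least_2 = sum(f(x) for x in range(2, 7))

# Same answer through the complement: 1 - Pr[X = 1].
complement = 1 - f(1)

print(total, prob_at_least_2, complement)
```

Using `Fraction` rather than floating-point numbers keeps the arithmetic exact, so the result can be compared directly with the hand calculation.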

*Example visualization*

- The cumulative distribution function (CDF) describes the distribution of any random variable, not just a discrete one (i.e. the variable can be discrete or continuous).
- It gives the probability that a random variable \(X\) will take a value less than or equal to \(x\): \(F(x) = Pr[X \leq x]\).
- How much “stuff” is there to the left of a point?

*Example visualizations*

- Discrete CDF will have “jumps” or “steps.”
- Unfair die again:
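As a rough sketch of where the jumps come from, the CDF of the unfair die can be built by accumulating its PMF. The function name `cdf` and the dictionary `pmf` are illustrative; the probabilities are those of the unfair die introduced earlier.

```python
from fractions import Fraction

# PMF of the unfair die from the earlier example.
pmf = {
    1: Fraction(1, 12),
    2: Fraction(1, 12),
    3: Fraction(1, 12),
    4: Fraction(1, 4),
    5: Fraction(1, 4),
    6: Fraction(1, 4),
}


def cdf(x):
    """Pr[X <= x]: add up the PMF over every outcome at or below x."""
    return sum((p for k, p in pmf.items() if k <= x), Fraction(0))


# The CDF only changes at the outcomes 1..6 and is flat in between.
for x in [0, 1, 2, 3, 3.5, 4, 5, 6, 7]:
    print(x, cdf(x))
```

The value at 3.5 equals the value at 3 because no probability mass sits between the two points, and the size of each jump equals the PMF at that outcome.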

- Continuous CDF is smooth. Example: standard normal distribution.

\[ \Phi(x) = \int_{-\infty}^{x} \frac{1}{\sqrt{2\pi}} e^{-\frac{u^2}{2}} \, du \]
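The integral above has no elementary closed form, but it can be evaluated numerically. The sketch below compares two approaches using only Python's standard library: the identity \(\Phi(x) = \frac{1}{2}\left(1 + \operatorname{erf}(x / \sqrt{2})\right)\), and a midpoint Riemann sum of the integrand. The names `phi`, `Phi`, and `Phi_numeric`, the lower cutoff of -10, and the number of slices are all assumptions made for illustration.

```python
import math


def phi(u):
    """Standard normal density: the integrand in the formula above."""
    return math.exp(-u * u / 2) / math.sqrt(2 * math.pi)


def Phi(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))


def Phi_numeric(x, lower=-10.0, n=20_000):
    """Midpoint Riemann sum of phi from `lower` to x.

    `lower` stands in for negative infinity; the density below -10 is negligible.
    """
    h = (x - lower) / n
    return h * sum(phi(lower + (i + 0.5) * h) for i in range(n))


for x in [-1.0, 0.0, 1.0, 1.96]:
    print(x, Phi(x), Phi_numeric(x))
```

The two columns should agree to several decimal places, which is one way to see that the CDF is literally the area under the density to the left of \(x\).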

- The probability density function (PDF) describes how likely a continuous random variable is to fall within a particular range of values (sand), as opposed to taking on any one value (sticks). The PDF therefore allows us to answer the question: how much of the distribution of a random variable is found in the filled area? In other words, how much probability mass is there between observations in a given range?
- The probability of a random variable falling within a particular range of values is therefore given by the integral of this variable’s PDF over that range.
- What is the probability that a continuous random variable takes on any one particular value? Why?
- The probability density function is nonnegative everywhere, and its integral over the entire space is equal to one.
- The PDF is the derivative of the CDF. This is why the slope of the CDF is greatest at the highest point of the PDF.
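The relationships in the list above can be checked numerically for the standard normal distribution. This is a minimal Python sketch under stated assumptions: the helper names `pdf`, `cdf`, `integrate_pdf`, and `cdf_slope`, the interval \([-1, 1]\), and the step sizes are all illustrative choices.

```python
import math


def pdf(x):
    """Standard normal density."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)


def cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))


def integrate_pdf(a, b, n=10_000):
    """Midpoint Riemann sum of the PDF over [a, b]."""
    h = (b - a) / n
    return h * sum(pdf(a + (i + 0.5) * h) for i in range(n))


def cdf_slope(x, eps=1e-5):
    """Central-difference approximation to the derivative of the CDF at x."""
    return (cdf(x + eps) - cdf(x - eps)) / (2 * eps)


# Probability of a range: the integral of the PDF matches a difference of CDF values.
p_range = integrate_pdf(-1.0, 1.0)
p_range_from_cdf = cdf(1.0) - cdf(-1.0)

# Probability of a single point: integrating over a zero-width interval gives 0.
p_point = integrate_pdf(0.5, 0.5)

# The slope of the CDF tracks the height of the PDF, and is steepest at the PDF's peak.
slopes = [cdf_slope(x) for x in (0.0, 1.0, 2.0)]
heights = [pdf(x) for x in (0.0, 1.0, 2.0)]

print(p_range, p_range_from_cdf, p_point)
print(slopes, heights)
```

The zero-width integral illustrates why a single value carries no probability on its own: the density can be positive there, but there is no area above a single point.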

*Example visualization*