# Lecture 20: Random Variables, Sampling Variability

1. The model should fit our training data well
2. The new athletes

Random Variables {21, 21} -> 21

• X is a function

• argument is a sample: element of the domain
• returns number: element of the range
• Random Variables: X_1, X_2, X_3

• P(X=x)

• X: random variable: function
• x: what the function may return: number
• chance X returns x
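
To make the "random variable is a function" idea concrete, here is a minimal sketch. The sample space (two tosses of a fair coin) and the choice of X (number of heads) are assumptions made for illustration, not taken from the lecture. P(X = x) is built by adding up the chances of all samples that X maps to x.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

# Assumed sample space: all sequences of two tosses of a fair coin, equally likely.
samples = list(product("HT", repeat=2))
P = {s: Fraction(1, len(samples)) for s in samples}

def X(s):
    """Random variable: takes a sample s and returns its number of heads."""
    return s.count("H")

# Distribution of X: P(X = x) is the total chance of the samples with X(s) == x.
dist = defaultdict(Fraction)
for s in samples:
    dist[X(s)] += P[s]

print(dict(dist))
```

Exact fractions are used so the probabilities add up to exactly 1 with no rounding error.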

• added up all of the probabilities
• in a discrete probability we think of it as area
• P(a <= X <= b)

• in continuous

• Bernoulli(p)
• indicator variable I has value 1 if event happens and 0 if not
• P(I = 1) = p
• P(I = 0) = 1-p
• Binomial(n, p) $$P(X = k) ~ = ~ \binom{n}{k} p^k(1-p)^{n-k}, ~~~~ 0 \le k \le n$$
```python
# with scipy
from scipy import stats

# chance of 50 heads in 100 tosses of a fair coin
stats.binom.pmf(50, 100, 0.5)
```
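
As a cross-check on the binomial formula, here is a minimal sketch that evaluates it directly with the standard library for the same case (50 heads in 100 tosses of a fair coin). The variable names are illustrative only. The result should agree with the `stats.binom.pmf(50, 100, 0.5)` call up to floating-point rounding.

```python
from math import comb

n, k, p = 100, 50, 0.5

# Binomial formula: (n choose k) * p^k * (1 - p)^(n - k)
by_formula = comb(n, k) * p**k * (1 - p) ** (n - k)

print(by_formula)  # roughly 0.08
```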

• Uniform
```python
# x: point or array of points at which to evaluate the density
unif_density = stats.uniform.pdf(x)    # uniform (0, 1) density
```

• Normal $$f(x) ~ = ~ \frac{1}{\sqrt{2\pi}\sigma} e^{-\frac{1}{2}\big{(}\frac{x-\mu}{\sigma}\big{)}^2}, ~~~ -\infty < x < \infty$$
```python
norm_density = stats.norm.pdf(x, 50, 5)      # normal density with mean 50, SD 5
```


• there is no elementary closed form for the normal CDF (the area under this density), so it is computed numerically
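
Since the area under the normal curve has to be computed numerically, here is a minimal sketch of getting P(a <= X <= b) as a difference of two CDF values. It uses the standard library's `statistics.NormalDist` rather than scipy (the scipy counterpart would be `stats.norm.cdf`). The interval endpoints are hypothetical, chosen to lie one SD either side of the mean.

```python
from statistics import NormalDist

# Normal with mean 50 and SD 5, the same parameters as the density call above.
dist = NormalDist(mu=50, sigma=5)

a, b = 45, 55  # hypothetical interval: within one SD of the mean

# P(a <= X <= b) is the area under the density between a and b.
prob = dist.cdf(b) - dist.cdf(a)

print(prob)  # about 0.68
```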

## Expectation

• weighted average of possible values
• weights: probabilities

one sample at a time:

• E[X] = sum over all samples s of X(s) * P(s)

one possible value at a time:

• E[X] = sum over all x of x * P(X=x)

• samples, P(s) and X(s)
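
The two ways of computing E[X] can be compared in a minimal sketch. The setup is assumed for illustration: two tosses of a fair coin, with X the number of heads. Both sums should give the same number, because the second just groups the samples by the value X returns.

```python
from fractions import Fraction
from itertools import product

# Assumed setup: two tosses of a fair coin; X counts heads.
samples = list(product("HT", repeat=2))
P = {s: Fraction(1, 4) for s in samples}

def X(s):
    return s.count("H")

# One sample at a time: sum of X(s) * P(s).
e_by_sample = sum(X(s) * P[s] for s in samples)

# One value at a time: sum of x * P(X = x).
values = {X(s) for s in samples}
e_by_value = sum(
    x * sum(P[s] for s in samples if X(s) == x) for x in values
)

print(e_by_sample, e_by_value)  # 1 1
```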

## Properties

• E[X+Y] = E[X] + E[Y]
• S = X + Y
• S(s) = X(s) + Y(s)
• S(s) * P(s) = X(s)P(s) + Y(s)P(s)
• sum over all samples s: sum of S(s)P(s) = sum of X(s)P(s) + sum of Y(s)P(s), so E[S] = E[X] + E[Y]
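
The sample-by-sample argument never uses independence, so additivity should hold even for dependent variables. Here is a minimal sketch under assumed definitions (two tosses of a fair coin, X the number of heads, Y an indicator that the first toss is heads). X and Y are clearly dependent, yet E[X + Y] still equals E[X] + E[Y].

```python
from fractions import Fraction
from itertools import product

# Assumed setup: two tosses of a fair coin.
samples = list(product("HT", repeat=2))
P = {s: Fraction(1, 4) for s in samples}

def X(s):
    return s.count("H")             # number of heads

def Y(s):
    return 1 if s[0] == "H" else 0  # indicator: first toss is heads

def S(s):
    return X(s) + Y(s)              # S(s) = X(s) + Y(s), sample by sample

def expectation(f):
    """E[f] computed one sample at a time."""
    return sum(f(s) * P[s] for s in samples)

e_x, e_y, e_s = expectation(X), expectation(Y), expectation(S)
print(e_x, e_y, e_s)  # 1 1/2 3/2
```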

• E[X-5] = E[X]-E[5] = 2-5 = -3
• E[(X-5)(X-5)] = E[X^2]-E[10X]+E[25] = 13-20+25 = 18
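
The arithmetic above appears to use E[X] = 2 and E[X^2] = 13. As a sanity check, here is a minimal sketch with a hypothetical distribution (X is -1 or 5 with chance 1/2 each), picked only because it has those two moments; it is not from the lecture. It compares the direct computation of E[(X-5)^2] with the one obtained by linearity.

```python
from fractions import Fraction

# Hypothetical distribution with E[X] = 2 and E[X^2] = 13.
dist = {-1: Fraction(1, 2), 5: Fraction(1, 2)}

def expect(g):
    """E[g(X)]: weighted average of g(x) with weights P(X = x)."""
    return sum(g(x) * p for x, p in dist.items())

e_x = expect(lambda x: x)                # E[X]
e_x2 = expect(lambda x: x**2)            # E[X^2]
direct = expect(lambda x: (x - 5) ** 2)  # E[(X-5)^2] computed directly
by_linearity = e_x2 - 10 * e_x + 25      # E[X^2] - 10 E[X] + 25

print(e_x, e_x2, direct, by_linearity)  # 2 13 18 18
```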

## Variance and SD

• Var[X] = E[(X-E[X])^2]

• pull out the term

• D_s = S - mu_S

• D_s = D_x + D_y

• Var[S] = E[D_s^2] = E[(D_x + D_y)^2] = Var[X] + Var[Y] + 2E[D_x D_y]

• $$E(D_x D_y) = E((X - \mu_X)(Y - \mu_Y))$$

• covariance

• Var[S] = Var[X+Y] = Var[X] + Var[Y] only if the covariance is zero, which holds when X and Y are independent
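
To see the covariance term at work, here is a minimal sketch with an assumed pair of dependent variables (two tosses of a fair coin, X the number of heads, Y an indicator that the first toss is heads). The helper names are illustrative. Because the covariance is nonzero here, Var[X+Y] differs from Var[X] + Var[Y] by exactly twice the covariance.

```python
from fractions import Fraction
from itertools import product

# Assumed setup: two tosses of a fair coin.
samples = list(product("HT", repeat=2))
P = {s: Fraction(1, 4) for s in samples}

def X(s):
    return s.count("H")             # number of heads

def Y(s):
    return 1 if s[0] == "H" else 0  # indicator: first toss is heads

def E(f):
    return sum(f(s) * P[s] for s in samples)

def var(f):
    mu = E(f)
    return E(lambda s: (f(s) - mu) ** 2)  # expected squared deviation

mu_x, mu_y = E(X), E(Y)
cov = E(lambda s: (X(s) - mu_x) * (Y(s) - mu_y))  # E[D_x D_y]
var_sum = var(lambda s: X(s) + Y(s))              # Var[X + Y]

print(var(X), var(Y), cov, var_sum)  # 1/2 1/4 1/4 5/4
```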

## Random Variable

• A random variable is a function mapping outcomes (samples) to real numbers
• $$X: \Omega \to \mathbb{R}$$