Axioms
- \(P(A) \geq 0\)
- \(P(A \cup B) = P(A) + P(B)\), whenever \(A\) and \(B\) are disjoint
- \(P(\Omega) = 1\)
Properties
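Presumably the usual consequences of the axioms, e.g.:
- \(P(A^c) = 1 - P(A)\)
- If \(A \subseteq B\), then \(P(A) \leq P(B)\)
- \(P(A \cup B) = P(A) + P(B) - P(A \cap B)\)
- \(P(A \cup B) \leq P(A) + P(B)\) (the union bound)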
Conditional Probability
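- \(P(A \mid B) = \frac{P(A \cap B)}{P(B)}\), defined whenever \(P(B) > 0\)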
Total Probability
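- \(P(B) = \sum_i P(A_i)\, P(B \mid A_i)\)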
Here, the \(A_i\)'s form a partition of the sample space: they are disjoint and their union is \(\Omega\).
Independence
If \(A\) and \(B\) are independent, we have the following.
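- \(P(A \cap B) = P(A)\, P(B)\)
- \(P(A \mid B) = P(A)\), whenever \(P(B) > 0\)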
Conditional Independence
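- \(P(A \cap B \mid C) = P(A \mid C)\, P(B \mid C)\)
Note that independence does not imply conditional independence, nor the other way around.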
Expectation and Variance
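- \(E[X] = \sum_x x\, p_X(x)\)
- \(Var[X] = E\big[(X - E[X])^2\big] = E[X^2] - (E[X])^2\)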
Linearity
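- \(E[aX + b] = a\, E[X] + b\)
- \(E[X + Y] = E[X] + E[Y]\), even when \(X\) and \(Y\) are dependent
- \(Var[aX + b] = a^2\, Var[X]\)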
Probability Mass Functions
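- \(p_X(x) = P(X = x)\), with \(\sum_x p_X(x) = 1\)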
Marginal PMFs
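- \(p_X(x) = \sum_y p_{X, Y}(x, y)\)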
Conditionals
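- \(p_{X \mid Y}(x \mid y) = \frac{p_{X, Y}(x, y)}{p_Y(y)}\), whenever \(p_Y(y) > 0\)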
Independence strikes back
- \(p_{X|A}(x) = p_X(x)\)
- \(E[XY]=E[X]E[Y]\)
- \(Var[X + Y] = Var[X] + Var[Y]\)
- \(p_{X, Y}(x, y) = p_X(x)\, p_Y(y)\), for all \(x\) and \(y\)
Continuity and Lovely Curves
Cumulative Distributions
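- \(F_X(x) = P(X \leq x) = \int_{-\infty}^{x} f_X(t)\, dt\), so \(f_X(x) = \frac{dF_X}{dx}(x)\)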
Conditionals
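- \(f_{X \mid Y}(x \mid y) = \frac{f_{X, Y}(x, y)}{f_Y(y)}\)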
Conditional Expectation
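- \(E[X \mid Y = y] = \int_{-\infty}^{\infty} x\, f_{X \mid Y}(x \mid y)\, dx\)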
Bayes' Theorem
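- \(p_{X \mid Y}(x \mid y) = \frac{p_{Y \mid X}(y \mid x)\, p_X(x)}{p_Y(y)} = \frac{p_{Y \mid X}(y \mid x)\, p_X(x)}{\sum_{x'} p_{Y \mid X}(y \mid x')\, p_X(x')}\)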
You can of course interchange \(p\) with \(f\) to account for continuous random variables.
Gimme More
\(F_Y(y) = P(g(X) \leq y)\), and then \(f_Y(y) = \frac{dF_Y}{dy}(y)\), where \(Y = g(X)\): to get the distribution of a function of \(X\), compute its CDF first, then differentiate.
Also, if \(Y = aX+b\), then we have \(f_Y(y) = \frac{1}{|a|}f_X\left(\frac{y - b}{a}\right)\).
Correlations
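- \(\mathrm{cov}(X, Y) = E\big[(X - E[X])(Y - E[Y])\big] = E[XY] - E[X]\, E[Y]\)
- \(\rho(X, Y) = \frac{\mathrm{cov}(X, Y)}{\sigma_X \sigma_Y}\), always in \([-1, 1]\)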
Law of iterated expectations
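- \(E\big[E[X \mid Y]\big] = E[X]\)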
Law of total variance
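- \(Var[X] = E\big[Var[X \mid Y]\big] + Var\big[E[X \mid Y]\big]\)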
Limits of the land
- Markov inequality: if \(X\) takes only non-negative values, then \(P(X \geq a) \leq E[X]/a\) for all \(a > 0\).
- Chebyshev inequality: \(P(|X - \mu| \geq c) \leq \sigma^2/c^2\) for all \(c > 0\).
- Convergence in probability: \(X_n\) converges to \(a\) in probability if \(\lim_{n\to \infty} P(|X_n - a| \geq \epsilon) = 0\) for every \(\epsilon > 0\).
Central Limit Theorem
Take \(n\) i.i.d. samples from (almost) any distribution with mean \(\mu\) and finite variance \(\sigma^2\): as \(n\) grows, the distribution of the standardized sample mean \(\frac{\bar{X}_n - \mu}{\sigma/\sqrt{n}}\) approaches the standard normal, no matter what the original distribution looks like.
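A quick numerical sketch of this (plain numpy; the choice of the exponential distribution and the sample sizes here are arbitrary): average \(n\) deliberately non-normal samples many times and check that the standardized means behave like \(N(0, 1)\).

```python
import numpy as np

rng = np.random.default_rng(0)
n, trials = 1_000, 10_000

# Exponential(1) has mean 1 and variance 1 -- and is deliberately non-normal.
samples = rng.exponential(scale=1.0, size=(trials, n))
means = samples.mean(axis=1)

# Standardized sample mean: (mean - mu) / (sigma / sqrt(n)) should be ~ N(0, 1).
z = (means - 1.0) / (1.0 / np.sqrt(n))
print(f"mean of z:      {z.mean():+.3f}  (expect ~0)")
print(f"std of z:       {z.std():.3f}  (expect ~1)")
print(f"P(|z| <= 1.96): {(np.abs(z) <= 1.96).mean():.3f}  (expect ~0.95)")
```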
Law of large numbers
If you repeat an experiment independently a large number of times and average the result, what you obtain should be close to the expected value.
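In symbols: for i.i.d. \(X_i\) with mean \(\mu\), \(\frac{1}{n}\sum_{i=1}^{n} X_i \to \mu\) in probability as \(n \to \infty\) (the weak law).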
Distributions
The Gaussian distribution shows up in nature a lot because there are many situations in which lots of small, independent effects sum up to the thing you actually measure.
Poisson statistics describe situations where events occur randomly at a constant average rate, with each occurrence independent of the others.
Discrete Random Distributions
Continuous Random Distributions
Poisson Distribution
- To predict the number of events occurring in the future!
- More formally, to predict the probability of a given number of events occurring in a fixed interval of time.
- Unlike the binomial distribution, the Poisson distribution doesn't require you to know \(n\) or \(p\): we effectively assume \(n\) is infinitely large and \(p\) is infinitesimally small. The only parameter of the Poisson distribution is the rate \(\lambda\) (the expected value of \(X\)). In real life, knowing only the rate (e.g., between 2pm and 4pm, I received 3 phone calls) is much more common than knowing both \(n\) and \(p\).
The PMF \(P(X = k) = \frac{\lambda^k e^{-\lambda}}{k!}\) gives the probability of observing \(k\) events in an interval, where \(\lambda\) is the average number of events per interval.
- Even though the Poisson distribution models rare events, the rate \(\lambda\) can be any number. It doesn’t always have to be small.
- Assumptions:
- The average rate of events per unit time is constant.
- Events are independent.
- If the number of events per unit time follows a Poisson distribution, then the amount of time between events follows the exponential distribution. The Poisson distribution is discrete and the exponential distribution is continuous, yet the two distributions are closely related.
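A small simulation of that last point (plain numpy; the rate \(\lambda = 3\) is arbitrary): generate exponential inter-arrival times and check that the counts per unit interval come out Poisson.

```python
import numpy as np

rng = np.random.default_rng(0)
lam, horizon = 3.0, 100_000  # 3 events per unit time, simulated over `horizon` time units

# Exponential gaps with mean 1/lambda between events give a Poisson process of rate lambda.
gaps = rng.exponential(scale=1.0 / lam, size=int(2 * lam * horizon))
arrivals = np.cumsum(gaps)
arrivals = arrivals[arrivals < horizon]

# The number of events landing in each unit interval should be ~ Poisson(lambda).
counts = np.bincount(arrivals.astype(int), minlength=horizon)
print(f"mean count per interval: {counts.mean():.3f}  (expect ~{lam})")
print(f"variance of counts:      {counts.var():.3f}  (expect ~{lam}; mean = variance for Poisson)")
```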
Random Shit
- How do you test whether a data sample is normal or not? See the https://en.wikipedia.org/wiki/Jarque–Bera_test.
- The optimal theoretical size for a bet is given by the https://en.wikipedia.org/wiki/Kelly_criterion: for a bet paying out \(b\):1 with win probability \(p\), stake the fraction \(f^* = p - \frac{1-p}{b}\) of your bankroll.
- Random variable \(X\) is distributed as \(N(a, b)\), and random variable \(Y\) is distributed as \(N(c, d)\), where \(b, d\) are standard deviations and \(\rho\) is the correlation between \(X\) and \(Y\). What is the distribution of (1) \(X+Y\), (2) \(X-Y\), (3) \(X\times Y\), (4) \(X/Y\)?
(1) \(X+Y \sim N(a+c, \sqrt{b^2 + d^2 + 2\rho bd})\), assuming \(X\) and \(Y\) are jointly normal
(2) \(X-Y \sim N(a-c, \sqrt{b^2 + d^2 - 2\rho bd})\)
(3) \(X\times Y\) is not normally distributed in general; for independent \(X\) and \(Y\), \(E[XY] = ac\) and \(Var[XY] = a^2 d^2 + c^2 b^2 + b^2 d^2\)
(4) \(X/Y\) is not normal either; the ratio of two independent standard normals follows the Cauchy distribution, which has no mean or variance
- The Monty Hall Problem: switching doors doubles your chance of winning the car, from 1/3 to 2/3; see the simulation below.
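A simulation sketch of the setup (standard rules assumed: you pick one of three doors, the host, who knows where the car is, opens a different door hiding a goat, and you may switch):

```python
import random

def monty_hall(switch: bool, trials: int = 100_000) -> float:
    """Fraction of games won with the stay or switch strategy."""
    wins = 0
    for _ in range(trials):
        car = random.randrange(3)
        pick = random.randrange(3)
        # Host opens a door that is neither the player's pick nor the car.
        opened = next(d for d in range(3) if d != pick and d != car)
        if switch:
            # Switch to the one remaining unopened door.
            pick = next(d for d in range(3) if d != pick and d != opened)
        wins += (pick == car)
    return wins / trials

print(f"stay:   {monty_hall(switch=False):.3f}  (expect ~1/3)")
print(f"switch: {monty_hall(switch=True):.3f}  (expect ~2/3)")
```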