probability
numerical assignment to uncertain event
ex-ante
before the uncertainty is resolved
ex-post
after the uncertainty is resolved
experiment
any situation in which the possible outcomes aren't known with certainty
sample space
collection of ALL possible outcomes of an experiment
event
any subset of a sample space (includes elements, space itself, and empty set)
4 basic set operations
1) union
2) intersection
3) complementation
4) set difference
union
for 2 sets A and B, the union of A and B is the set of all points in A and/or B; A ∪ B = {x ∈ S: x ∈ A and/or x ∈ B}
intersection
for 2 sets A and B, the set of all elements that are in both A and B; A ∩ B = {x ∈ S: x ∈ A and x ∈ B}
complementation
the complement of a set A is the set of all elements that are not in A; Aᶜ = {x ∈ S: x ∉ A}
set difference
the difference between A and B is the set of all elements that are in A but not in B; A \ B = A ∩ Bᶜ = {x ∈ S: x ∈ A, x ∉ B}
disjoint
2 sets A and B are disjoint if A ∩ B = ∅
synonym for disjoint
mutually exclusive
subset
A is a subset of B if all the elements in A are also in B; A ⊆ B
subset theorem
If A ⊆ B and B ⊆ A, then A = B
4 set properties
1) associative laws
2) commutative laws
3) distributive laws
4) DeMorgan's laws
associative laws
A ∪ (B ∪ C) = (A ∪ B) ∪ C; same for intersection
commutative laws
A ∪ B = B ∪ A; same for intersection
distributive laws
A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C); similarly with ∪ and ∩ interchanged
DeMorgan's laws
(A ∪ B)ᶜ = Aᶜ ∩ Bᶜ and (A ∩ B)ᶜ = Aᶜ ∪ Bᶜ
probability
a probability measure is a function that assigns a number to each event; for an event E, the probability of E is written P(E)
A ∩ B = ∅
A and B are disjoint
S
universal set; the sample space
Eᶜ =
S \ E
3 axioms of probability of E
1) P(E) ≥ 0
2) P(S) = 1
3) For a sequence E1, E2, ... of mutually exclusive events (Ej ∩ Ek = ∅ for j ≠ k), P(∪_{j=1}^∞ Ej) = ∑_{j=1}^∞ P(Ej)
4 Properties of Probability
1) P(∅) = 0
2) P(Aᶜ) = 1 - P(A)
3) If A ⊆ B, then P(A) ≤ P(B)
4) P(A ∪ B) = P(A) + P(B) - P(A ∩ B)
If 2 sets aren't necessarily disjoint, then
must subtract the probability of the intersection of the sets
Let A be an event, then (3)
1) (Aᶜ)ᶜ = A
2) ∅ᶜ = S
3) Sᶜ = ∅
For all sets A and B, (5 union)
1) A ∪ B = B ∪ A
2) A ∪ ∅ = A
3) A ∪ A = A
4) A ∪ S = S
5) A ∪ Aᶜ = S
If A ⊆ B, then
A ∪ B = B and A ∩ B = A
If each set of outcomes in some countable collection is an event,
we must call their union an event
For all sets A and B, (5 intersection)
1) A ∩ B = B ∩ A
2) A ∩ ∅ = ∅
3) A ∩ A = A
4) A ∩ S = A
5) A ∩ Aᶜ = ∅
simple sample space
sample space where every outcome is EQUALLY LIKELY and number of outcomes is finite
simple sample space probability
P(E) = n(E)/n, where n(E) is the number of outcomes in E and n is the total number of outcomes
finite sample space
S = {s1, s2,...,sn}; countable
If we let pj = P({sj}), j = 1,...,n, then by the 3 axioms:
1) pj ≥ 0 for j = 1,...,n
2) ∑_{j=1}^n pj = 1
3) P(A) = ∑_{j: sj∈A} pj
when all elements of S are equally likely,
pj = 1/n
multiplication principle
Suppose an experiment has the following properties:
1) the experiment has k parts
2) the jth part has nj possible outcomes, regardless of the outcomes of parts 1,...,j-1
Then the total number of outcomes = n1 × n2 × ... × nk
permutation
given a set A=(a1,a2,...,an), in how many ways can we get samples of size k out of the n elements; ordered; w/o replacement requires k ≤ n
permutation w/ replacement
n^k; allow k>n
permutation w/o replacement
Pn,k = n × (n-1) × (n-2) × ... × (n-k+1) = n!/(n-k)!
permutation corollary
the number of possible orders of n elements is Pn,n = n!
combination
given a set of n elements A=(a1,...,an), how many samples of size k are possible; unordered and w/o replacement
combination formula
Cn,k = (Pn,k) / k! = n! / ((n-k)! k!)
binomial coefficient
(n k) = Cn,k = n! / ((n-k)! k!); "n choose k"
counting summary
- ordered w/ replacement = n^k
- ordered w/o replacement = Pn,k
- unordered w/o replacement = Cn,k
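A quick sanity check of this summary using only Python's standard library (math.perm, math.comb, itertools); n and k are arbitrary illustrative choices, not from the cards.

```python
import math
from itertools import product, permutations, combinations

n, k = 5, 3
items = range(n)

# ordered w/ replacement: n^k
assert len(list(product(items, repeat=k))) == n**k

# ordered w/o replacement: Pn,k = n!/(n-k)!
assert len(list(permutations(items, k))) == math.perm(n, k)

# unordered w/o replacement: Cn,k = n!/((n-k)! k!)
assert len(list(combinations(items, k))) == math.comb(n, k)
```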
binomial coefficient properties
1) for all n, (n 0) = (n n) = 1
2) for all n and all k = 0,1,...,n, (n k) = (n n-k)
0! =
1
multinomial coefficients
want to split n elements into k groups such that n1+n2+...+nk = n; (n n1)(n-n1 n2)··· = n! / (n1! n2! ··· nk!) = (n n1,n2,...,nk)
Probability of Union of 3 Events
P(A1 ∪ A2 ∪ A3) = P(A1) + P(A2) + P(A3) - P(A1 ∩ A2) - P(A1 ∩ A3) - P(A2 ∩ A3) + P(A1 ∩ A2 ∩ A3)
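A minimal check of this identity on a simple sample space (a fair die); the three events are made up for illustration.

```python
# inclusion-exclusion for three events on an equally likely sample space
S = set(range(1, 7))                  # sample space of a fair die
A1, A2, A3 = {1, 2, 3}, {2, 4, 6}, {3, 4, 5}

P = lambda E: len(E) / len(S)         # simple-sample-space probability

lhs = P(A1 | A2 | A3)
rhs = (P(A1) + P(A2) + P(A3)
       - P(A1 & A2) - P(A1 & A3) - P(A2 & A3)
       + P(A1 & A2 & A3))
assert abs(lhs - rhs) < 1e-12
```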
Simpson's Paradox
when a trend present within each of several groups reverses when the groups are combined (ex. basketball players' scores)
Continuity from below
Given A1 ⊆ A2 ⊆ A3 ⊆ ..., P(∪_{j=1}^∞ Aj) = lim_{n→∞} P(An); use complements for continuity from above
Given probability measures P and Q; Z = aP + bQ with a, b > 0 and a + b = 1; Z(E) = aP(E) + bQ(E) ∈ [0,1], then...
1) Z(S) = aP(S) + bQ(S) = a + b = 1
2) if {Aj}_{j=1}^∞ is disjoint, Z(∪_{j=1}^∞ Aj) = ∑_{j=1}^∞ Z(Aj)
Probability of Union of Finite Events
1) Disjoint: P(∪_{j=1}^n Aj) = ∑_{j=1}^n P(Aj)
2) Not disjoint (inclusion-exclusion): P(∪_{j=1}^n Aj) = [∑_{j=1}^n P(Aj)] - [∑_{j<k} P(Aj ∩ Ak)] + ... + (-1)^(n+1) P(A1 ∩ ... ∩ An)
P(A ∩ Bᶜ) =
P(A) - P(A ∩ B)
conditional probability
P(A|B) = P(A ∩ B)/P(B), provided P(B) > 0
multiplication rule for conditional probability
P(A ∩ B) = P(A|B)P(B)
P(Aᶜ|B) =
1 - P(A|B)
independence
If 2 events A and B are such that A gives no information about B and B gives no information about A, then A and B are independent.
test for independence
P(A|B) = P(A), P(B|A) = P(B) --> P(A ∩ B) = P(A) × P(B)
If A and B are independent, then
A and Bᶜ are independent; also Aᶜ and Bᶜ are independent
Independence Special Cases:
1) If P(A) or P(B) is 0 or 1, then A and B are independent
2) No event with 0 < P(A) < 1 is independent of itself
3) No event is independent of its complement, unless P(A) or P(Aᶜ) is 0 or 1
4) If A ∩ B = ∅, then P(A ∩ B) = P(∅) = 0, so disjoint A and B are independent iff P(A) = 0 or P(B) = 0
mutually independent
a sequence of events A1,...,An is mutually independent if for any subcollection of events Aj1,...,Ajk, P(Aj1 ∩ ... ∩ Ajk) = P(Aj1) × ... × P(Ajk)
conditionally independent
a collection of events A1,...,An is conditionally independent given B if for any subcollection of events Aj1,...,Ajk, P(Aj1 ∩ ... ∩ Ajk | B) = P(Aj1|B) × ... × P(Ajk|B)
partition
a collection of events B1,...,Bk forms a partition of S iff: 1) the union of all the sets equals S; 2) the sets are pairwise disjoint/mutually exclusive
law of total probability
Let A be an event and {B1,...,Bk} be a partition of S such that P(Bj) > 0 for all j. Then P(A) = P(A|B1)P(B1) + ... + P(A|Bk)P(Bk)
Bayes' Rule
Let A be an event and {B1,...,Bk} be a partition of S such that P(Bj) > 0 for all j. Then P(Bj|A) = P(A|Bj)P(Bj) / [P(A|B1)P(B1) + ... + P(A|Bk)P(Bk)]
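The arithmetic here is easy to get wrong, so below is a minimal numerical sketch of the law of total probability and Bayes' rule together; the two-event partition (B1 = condition present, B2 = its complement) and all the probabilities are invented for illustration.

```python
p_B = [0.01, 0.99]            # prior P(B1), P(B2): a partition of S
p_A_given_B = [0.95, 0.05]    # likelihoods P(A|B1), P(A|B2)

# law of total probability: P(A) = sum_j P(A|Bj) P(Bj)
p_A = sum(l * p for l, p in zip(p_A_given_B, p_B))

# Bayes' rule: P(B1|A) = P(A|B1) P(B1) / P(A)
posterior = p_A_given_B[0] * p_B[0] / p_A
print(f"P(B1|A) = {posterior:.4f}")   # ~0.1610 despite a 95% likelihood
```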
letters in relation to timing
ex-ante: uppercase letters, ex-post: lowercase letters
random variable
a real-valued function over the sample space; X: S → ℝ
assign probability to random variables by:
P(X ∈ A) = P({s: X(s) ∈ A})
discrete distributions
where X takes finitely many or countably many values x1,...,xk,...
probability function
pf or pmf; f(x) = P(X = x) ∀x ∈ ℝ
support
the set of all values of x that have positive probability; 𝒳 = {x ∈ ℝ: f(x) > 0}
3 properties of f(x)
1) f(x) ≥ 0 ∀x ∈ ℝ
2) ∑_{x∈𝒳} f(x) = 1
3) P(X ∈ A) = ∑_{x∈A} f(x)
probability function property
2 random variables X and Y can have the same pf and yet not be the same random variable
Bernoulli RV
RV that only takes 2 values: 1 with probability p, 0 with probability 1-p
pf of Bernoulli RV
f(x) = p^x (1-p)^(1-x) for x ∈ {0,1}, and 0 otherwise
indicator function
of an event E is I_E(s) = 1 if s ∈ E, and 0 if s ∉ E
binomial distributions
consider a situation in which we repeat an experiment n times (each repetition independent) and record each success (prob p) or failure (prob 1-p); X = number of successes ~ B(n,p)
pf of binomial distributions
f(x) = (n x) p^x (1-p)^(n-x) for x ∈ {0,1,2,...,n}, and 0 otherwise
binomial distribution property
If X ~ B(n,p), then X has the same distribution as Y = X1 + X2 + ... + Xn, where the Xj are independent Bernoulli(p)
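A simulation sketch of this property using only the standard library: summing n independent Bernoulli(p) draws and comparing the empirical frequencies to the binomial pf above; n, p, and the trial count are arbitrary choices.

```python
import math
import random

random.seed(0)
n, p, trials = 10, 0.3, 200_000

def bernoulli_sum():
    # sum of n independent Bernoulli(p) draws
    return sum(1 if random.random() < p else 0 for _ in range(n))

counts = [0] * (n + 1)
for _ in range(trials):
    counts[bernoulli_sum()] += 1

for x in range(n + 1):
    pmf = math.comb(n, x) * p**x * (1 - p)**(n - x)   # binomial pf
    print(f"x={x:2d}  simulated={counts[x]/trials:.4f}  exact={pmf:.4f}")
```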
joint pf
of 2 RVs X and Y is the function f(x,y) = P(X = x, Y = y)
3 properties of joint pf:
1) f(x,y) ≥ 0 ∀(x,y) ∈ ℝ²
2) ∑_x ∑_y f(x,y) = 1
3) P((X,Y) ∈ A) = ∑_{(x,y)∈A} f(x,y)
marginal pf of X
obtained by summing over all possible values of y in the joint pf; same as the regular pf of X; f1(x) = ∑_{j=1}^∞ f(x, yj)
For 2 RV, saying X and Y independent means:
for ALL A and B, P(X ∈ A, Y ∈ B) = P(X ∈ A) × P(Y ∈ B)
2 discrete RV are independent iff:
f(x,y) = f1(x)f2(y) ∀(x,y) ∈ ℝ²; trivial to test outside the support b/c the probability will always be 0
joint pf independence property
If X and Y are independent, then for any functions h and g, h(X) and g(Y) are also independent
joint pf conditional property
If X and Y are jointly distributed discrete RVs, the conditional probability that X = x given that Y = y, with P(Y = y) > 0, is g(x|y) = f(x,y)/f2(y); note g(x|y) ≥ 0
∑_{x∈𝒳} g(x|y) =
∑_{x∈𝒳} f(x,y)/f2(y) = (1/f2(y)) ∑_{x∈𝒳} f(x,y) = f2(y)/f2(y) = 1
f(x,y) =
g(x|y)f2(y)
continuous random variable
X is a continuous random variable if there exists a non-negative function f such that the probability that X lies in a set A is given by integrating f over A; focus: A = [a,b]; informal: X is continuous if X can take all possible numbers in its range
probability density function
pdf; a non-negative function f defined on the real line is a pdf of RV X iff: P(a < X < b) = ∫_a^b f(x)dx
support (for pdf)
𝒳 = {x ∈ ℝ: f(x) > 0}
If X is a continuous RV, P(X = a) =
0 ∀a ∈ ℝ (probability must be taken over a range)
If X is a continuous RV, P(a < X < b) =
P(a ≤ X ≤ b) = P(a ≤ X < b) = P(a < X ≤ b)
0 probability
doesn't mean the event is impossible; we just can't assign it positive probability ex-ante
2 Properties of pdf
1) f(x) ≥ 0 ∀x ∈ ℝ
2) ∫_{-∞}^∞ f(x)dx = 1
f(x), a pdf, is...
NOT a probability
uniform distribution
the uniform distribution on [0,1] is the model saying the probability of X being in any subinterval of [0,1] is proportional to the length of that subinterval
uniform distribution equation
f(x) = I(0 ≤ x ≤ 1)
pdf of uniform distribution on [a,b] is...
f(x) = 1/(b-a) for a ≤ x ≤ b, and 0 otherwise; equivalently, f(x) = (1/(b-a)) · I(a ≤ x ≤ b)
joint pdf
a non-negative function f(x,y) on ℝ² is the joint pdf of RVs (X,Y) iff P((X,Y) ∈ A) = ∬_A f(x,y)dxdy
2 Joint pdf Properties
1) f(x,y) ≥ 0 ∀(x,y) ∈ ℝ²
2) ∫_{-∞}^∞ ∫_{-∞}^∞ f(x,y)dxdy = 1
marginal pdf
compute by "adding"/integrating out the other variable; f?(x) = ??,-? f(x,y)dy and vice-versa
P(x=a, y=b) =
0 ∀(x,y) ∈ ℝ²
P(Y = h(X)) =
0
independence for pdf
X and Y are independent iff: f(x,y) = f1(x)f2(y) ∀(x,y) ∈ ℝ²
conditional for pdf
Let X and Y be 2 continuous RVs with joint pdf f(x,y) and marginal pdfs f1(x) and f2(y). Then, for 2 points x0 and y0 such that f1(x0) > 0 and f2(y0) > 0, the conditional pdfs are: g1(x|y0) = f(x,y0)/f2(y0) and g2(y|x0) = f(x0,y)/f1(x0)
g1(x|y0) is...
NOT a probability
law of total probability (for pdf)
f1(x) = ∫_{-∞}^∞ g1(x|y)f2(y)dy
Bayes' Rule (for pdf)
g2(y|x0) = g1(x0|y)f2(y) / f1(x0)
cumulative distribution functions
cdf; the cdf of a RV X is the function defined for each real number x as follows: F(x) = P(X ≤ x), -∞ < x < ∞
4 Properties of cdf
1) F(x) ∈ [0,1] ∀x ∈ ℝ
2) F(·) is non-decreasing: if x1 < x2, then F(x1) ≤ F(x2)
3) lim_{x→-∞} F(x) = 0 and lim_{x→∞} F(x) = 1
4) F(x) is continuous from the right
2 Facts about cdf
1) cdfs of discrete RVs have jumps
2) cdfs of continuous RVs have no jumps
Fundamental Theorem of Calculus with cdf
if the pdf f is continuous, then F'(x) = f(x)
bivariate cdf
F(x,y) = P(X ≤ x, Y ≤ y) ∀(x,y) ∈ ℝ²
2 Properties of Bivariate cdf
1) F1(x) = lim_{y→∞} F(x,y) and F2(y) = lim_{x→∞} F(x,y)
2) When it exists, ∂²F(x,y)/∂x∂y = f(x,y)
functions of RV
Let Y = r(X), where r(·) is some function:
if X is discrete, P(Y = y) = P(r(X) = y) = ∑_{x: r(x)=y} f(x)
if X is continuous, G(y) = P(Y ≤ y) = P(r(X) ≤ y) = ∫_{x: r(x)≤y} f(x)dx
If Y is also continuous, the pdf of Y is:
g(y) = dG(y)/dy
Special Case of functions of RVs
Suppose X and Y are both continuous; then we can simplify under 2 conditions: 1) X is continuous with pdf f(x); 2) r(·) is DIFFERENTIABLE and STRICTLY MONOTONE; then the pdf of Y is g(y) = f(s(y)) · |ds(y)/dy|, where s(y) = r⁻¹(y) is the inverse function
2 Comments on functions of RVs:
1) r(x) has to be strictly monotone over the support of X
2) let [a,b] be the range of r(x); then the density g(y) equals the formula above for a ≤ y ≤ b, and 0 otherwise
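A small simulation check of the special-case formula above, under assumed choices X ~ Uniform(0,1) and r(x) = x², which is strictly increasing on the support; then s(y) = √y and g(y) = f(√y)·|s'(y)| = 1/(2√y) on (0,1).

```python
import math
import random

random.seed(1)
samples = [random.random() ** 2 for _ in range(500_000)]  # draws of Y = r(X)

# compare an empirical density estimate to g(y) at a few interior points
width = 0.02
for y in (0.1, 0.3, 0.5, 0.9):
    hits = sum(y - width / 2 < s < y + width / 2 for s in samples)
    emp_density = hits / (len(samples) * width)
    print(f"y={y:.1f}  empirical={emp_density:.3f}  g(y)={1 / (2 * math.sqrt(y)):.3f}")
```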
expectations
summary measure of central location
expected value of X for discrete
Let X be a discrete RV with pf f(x); then the expected value of X is E[X] = ∑_j xj f(xj), provided ∑_j |xj| f(xj) < ∞
expected value of X for continuous
Let X be a continuous RV with pdf f(x); then the expected value of X is E[X] = ∫_{-∞}^∞ x f(x)dx, provided ∫_{-∞}^∞ |x| f(x)dx < ∞
expectation exists when...
𝒳, the support of X, is bounded
expectations of functions
Let Y = r(X); then discrete: E[Y] = ∑_j r(xj) f(xj); continuous: E[Y] = ∫_{-∞}^∞ r(x) f(x)dx
expectation rule
in general, E[g(X)] ≠ g(E[X])
Properties of Expectations
1) Let Y = a + bX; then E[Y] = a + bE[X]
2) Let X1,...,Xk be RVs and let Y = ∑_{j=1}^k aj Xj; then E[Y] = ∑_{j=1}^k aj E[Xj]
3) If a ≤ X ≤ b, then a ≤ E[X] ≤ b
4) If X and Y are independent, g(·) is a function of X, and h(·) is a function of Y, then E[h(Y)g(X)] = E[h(Y)]E[g(X)]
5) If X1,...,Xn are independent, then E[X1 ··· Xn] = E[X1] ··· E[Xn]
mean
μ; a common way to characterize a distribution; not very useful without a measure of dispersion; enough when the RV is an indicator, since p pins down the dispersion too (e.g., a p = 52% chance of scoring the next shot gives variance p(1-p) = 0.52 × 0.48)
variance
Let X be a RV with mean μ = E(X). The variance of X is V(X) = σ² = E[(X-μ)²]; V(X) ≥ 0; a small value indicates the probability distribution is tightly concentrated around μ
standard deviation of X
the square root of the variance; σ = √σ²
Variance Properties
1) V(X) = 0 iff there exists a constant c such that P(X = c) = 1 (variance is positive only if there is some uncertainty)
2) V(aX + b) = a²V(X) [adding a constant to a RV doesn't affect its variance]
3) For every RV X, V(X) = E(X²) - (E(X))² = E(X²) - μ²
4) If X1,...,Xn are independent RVs, V(a1X1 + ... + anXn) = a1²V(X1) + ... + an²V(Xn)
variance of a sum is
the sum of the variances ONLY when the variables are independent
covariance
Let X and Y be RVs with E(X) = μx, E(Y) = μy, V(X) = σx² < ∞ and V(Y) = σy² < ∞; then Cov(X,Y) = σxy = E[(X-μx)(Y-μy)]
covariance is...
a measure of the LINEAR RELATIONSHIP between X and Y; can be positive, negative, or zero; depends on the units used to measure X and Y
correlation
Let X and Y be RVs with E(X) = μx, E(Y) = μy, V(X) = σx² < ∞ and V(Y) = σy² < ∞; then ρ(X,Y) = Cov(X,Y)/(σx σy); can be positive, negative, or zero
Correlation/Covariance/Variance Properties
1) Cov(aX, bY) = ab·Cov(X,Y)
2) Cov(X, Y+Z) = Cov(X,Y) + Cov(X,Z)
3) Cov(X,Y) = E(XY) - E(X)E(Y)
4) If X and Y are independent RVs with 0 < σx², σy² < ∞, then ρ(X,Y) = 0, but the converse is NOT true
5) If X and Y are RVs such that σx², σy² < ∞, then V(X+Y) = V(X) + V(Y) + 2Cov(X,Y) and V(X-Y) = V(X) + V(Y) - 2Cov(X,Y)
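A simulation sketch of property 5 with a deliberately correlated pair (Y is built from X plus noise; all parameters are invented): the direct sample variance of X+Y should match the identity.

```python
import random

random.seed(2)
N = 200_000
xs = [random.gauss(0, 1) for _ in range(N)]
ys = [0.5 * x + random.gauss(0, 1) for x in xs]       # correlated with X

def mean(v): return sum(v) / len(v)

def var(v):
    m = mean(v)
    return sum((x - m) ** 2 for x in v) / len(v)

def cov(u, v):
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / len(u)

sums = [x + y for x, y in zip(xs, ys)]
print(var(sums))                                      # direct V(X+Y)
print(var(xs) + var(ys) + 2 * cov(xs, ys))            # via the identity
```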
conditional expectation
the conditional expectation of Y given X = x is denoted E(Y|X = x), with E(Y|X = x) = ∑_y y·g2(y|x) for discrete and E(Y|X = x) = ∫_{-∞}^∞ y·g2(y|x)dy for continuous; these are functions of x; treat x as a fixed value
Properties of Conditional Expectation
1) Law of Total Expectation: E(Y) = E[E(Y|X)]
2) If X and Y are independent, E(Y|X)=E(Y)
3) If E(Y|X)=E(Y), then Cov(X,Y)=0
Law of Total Expectation
E(Y) = E[E(Y|X)]
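A simulation sketch of this law under an invented two-stage setup: X is uniform on {1, 2, 3} and, given X = x, Y ~ Uniform(0, x), so E(Y|X) = X/2 and E(Y) = E[E(Y|X)] = E(X)/2 = 1.

```python
import random

random.seed(3)
N = 500_000
total = 0.0
for _ in range(N):
    x = random.choice([1, 2, 3])        # outer draw of X
    y = random.uniform(0, x)            # inner draw of Y given X = x
    total += y
print(total / N)                        # should be close to 1.0
```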
conditional variance
the conditional variance of Y given X = x is V(Y|X = x) = E[(Y - E(Y|X = x))² | X = x]
quantile function
Suppose F is the cdf of a CONTINUOUS RV X and that F⁻¹ exists on the support 𝒳, so x = F⁻¹(y) iff y = F(x). The pth quantile of the distribution F is defined to be the value q_p such that F(q_p) = p, or q_p = F⁻¹(p). p = 1/2 corresponds to the median
median
the median is defined to be a number m such that P(X ≤ m) ≥ 1/2 and P(X ≥ m) ≥ 1/2 for discrete RVs, and P(X ≤ m) = P(X ≥ m) = ∫_{-∞}^m f(x)dx = 1/2 for continuous RVs
kth moment
Let X be a RV and k a positive integer. E(X^k) is the kth moment of X, provided the expectation exists
centered moment
let X be a RV and k a positive integer. E((X-μ)^k) is the kth centered moment of X; the first centered moment of X is 0
existence of moments
if E(|X|^k) < ∞ for some positive integer k, then E(|X|^j) < ∞ for every positive j < k. Also, (E(|X|^j))^(1/j) ≤ (E(|X|^k))^(1/k)
MGF
the moment-generating function of a RV X is M(t) = E(e^(tX))
2 properties of MGF
1) if the MGF exists for t in an open interval containing zero, it uniquely determines the probability distribution (notice it always exists at t = 0, where M(0) = 1)
2) if the MGF exists for t in an open interval containing zero, then E(X^k) = M^(k)(0) = [d^k E(e^(tX)) / dt^k] evaluated at t = 0
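A sketch of property 2 by symbolic differentiation (assumes the third-party sympy package is installed), using the Bernoulli MGF from the cards below; since X^k = X for a 0/1 variable, every moment should come out to p.

```python
import sympy as sp

t, p = sp.symbols("t p")
M = p * sp.exp(t) + 1 - p               # Bernoulli MGF (see the cards below)

for k in (1, 2, 3):
    moment = sp.diff(M, t, k).subs(t, 0)    # kth derivative at t = 0
    print(f"E(X^{k}) =", sp.simplify(moment))   # p every time
```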
Bernoulli distribution
X is a Bernoulli random variable if its pf is f(x|p) = p^x (1-p)^(1-x) for x = 0,1, and 0 otherwise
Properties of Bernoulli Distribution
1) E(X) = p
2) V(X) = E(X²) - (E(X))² = p(1-p)
3) M(t) = E(e^(tX)) = pe^t + 1 - p
Binomial distribution
the binomial is a sum of independent Bernoulli RVs; f(x|n,p) = (n x) p^x (1-p)^(n-x) for x = 0,1,...,n, and 0 otherwise
Properties of Binomial Distribution
1) E(X) = np
2) V(X) = E(X²) - (E(X))² = np(1-p)
3) M(t) = E(e^(tX)) = (pe^t + 1 - p)^n
Poisson Distribution
Let X be a discrete, nonnegative RV; then f(x|λ) = e^(-λ) λ^x / x! for x = 0,1,..., and 0 otherwise; be explicit about the time period (λ is a rate per period)
Properties of Poisson Distribution
1) E(X) = λ
2) V(X) = λ
3) M(t) = e^(λ(e^t - 1))
4) If X1,...,Xk are independent and Xi has a Poisson distribution with mean λi (i = 1,...,k), then the sum X1+...+Xk has a Poisson distribution with mean λ1+...+λk
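A simulation sketch of property 4 for two Poisson RVs; the rates l1, l2 are invented, and the sampler is Knuth's classic method, adequate for small rates.

```python
import math
import random

random.seed(4)

def poisson(lam):
    """Knuth's simple Poisson sampler (fine for small lam)."""
    L, k, prod = math.exp(-lam), 0, 1.0
    while True:
        prod *= random.random()
        if prod < L:
            return k
        k += 1

l1, l2, N = 1.5, 2.0, 200_000
counts = {}
for _ in range(N):
    s = poisson(l1) + poisson(l2)        # sum of two independent Poissons
    counts[s] = counts.get(s, 0) + 1

lam = l1 + l2                            # predicted mean of the sum
for x in range(8):
    exact = math.exp(-lam) * lam**x / math.factorial(x)
    print(f"x={x}  simulated={counts.get(x, 0)/N:.4f}  exact={exact:.4f}")
```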
Poisson process
with rate λ per unit time is a process that satisfies:
1) the number of arrivals in every FIXED INTERVAL of time of length t has a Poisson distribution with mean λt
2) the numbers of arrivals in every TWO DISJOINT time intervals are independent
Uniform Distribution
X has a uniform distribution on [a,b] if its pdf is f(x|a,b) = 1/(b-a) for a ≤ x ≤ b, and 0 otherwise
Properties of Uniform Distribution
1) E(X) = (a+b)/2
2) V(X) = (b-a)²/12
3) M(t) = (e^(tb) - e^(ta)) / (t(b-a)) for t ≠ 0, with M(0) = 1
Gamma Distribution
if X has the continuous distribution f(x|α,β) = (β^α/Γ(α)) x^(α-1) e^(-βx) for x > 0, and 0 otherwise
gamma function
Γ(α) = ∫_0^∞ x^(α-1) e^(-x) dx; Γ(1) = 1
Properties of Gamma Distributions
1) E(X^k) = [α(α+1)···(α+k-1)]/β^k, so E(X) = α/β
2) Var(X) = α(α+1)/β² - (α/β)² = α/β²
3) For t < β, M(t) = (β/(β-t))^α
4) If X1,...,Xn are independent gamma RVs with parameters (αi, β), then the sum X1+...+Xn has a gamma distribution with parameters α = α1+...+αn and β
Exponential Distribution
the gamma distribution with α = 1; if X is a continuous RV, f(x|β) = βe^(-βx) for x > 0, and 0 otherwise
Properties of Exponential Distributions
1) E(X) = 1/β
2) V(X) = 1/β²
3) for t < β, M(t) = β/(β-t) = (1 - t/β)^(-1)
Exponential Distribution: P(X ≥ t) =
e^(-βt)
memoryless property of exponential distribution
P(X ≥ t+h | X ≥ t) = P(X ≥ h)
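A simulation sketch of the memoryless property (the rate beta and the times t, h are arbitrary choices); both the conditional and unconditional tail frequencies should be close to e^(-beta·h).

```python
import random

random.seed(5)
beta, t, h, N = 2.0, 0.5, 0.3, 1_000_000
draws = [random.expovariate(beta) for _ in range(N)]   # exponential(beta) draws

survived_t = [x for x in draws if x >= t]              # condition on X >= t
cond = sum(x >= t + h for x in survived_t) / len(survived_t)
uncond = sum(x >= h for x in draws) / N
print(f"P(X>=t+h | X>=t) = {cond:.4f}   P(X>=h) = {uncond:.4f}")
```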
Normal Distribution
f(x|μ,σ²) = (1/(σ√(2π))) · e^(-(x-μ)²/(2σ²)) for -∞ < x < ∞