lecture 1
Subsets of the sample space are called events.
For example, if the sample space is $\Omega$ and $A$ is a subset of it, then $A$ is also called event $A$. When the outcome of the experiment is a point inside $A$, we say that event $A$ occurs.
The axioms are the rules that every probability model must obey:
- $P(A)\ge 0$
- $P(\Omega)=1$
- if $A$ and $B$ are disjoint, then $P(A\cup B)=P(A)+P(B)$
Axiom 3 tells us that probability behaves like mass or area: the probability of the union of two disjoint events is the sum of their probabilities, just as the total area of two disjoint regions $A$ and $B$ is the sum of their areas.
From these three axioms we can derive $P(A)\le 1$:
$$
\begin{flalign}
P(\Omega)&=^{(2)}1 \\
&=P(A\cup A^c) \\
&=^{(3)}P(A)+P(A^c) \\
&\ge^{(1)} P(A)&
\end{flalign}
$$
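More interesting is that $P(A\cup B\cup C)=P(A)+P(B)+P(C)$ if $A$, $B$, $C$ are pairwise disjoint. Apply axiom 3 twice, first to the disjoint events $A\cup B$ and $C$, then to $A$ and $B$: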
$$
\begin{flalign}
P(A\cup B\cup C)&=P(A\cup B)+P(C) \\
&=P(A)+P(B)+P(C)&
\end{flalign}
$$
The same holds for a finite event, where each $s_i$ is a single outcome of the experiment. For simplicity, we drop the braces $\{\}$:
$$
\begin{flalign}
P(\{s_1,s_2,s_3\})&=P(\{s_1\})+P(\{s_2\})+P(\{s_3\}) \\
&=P(s_1)+P(s_2)+P(s_3)&
\end{flalign}
$$
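As a small check of the axioms and of finite additivity, here is a minimal sketch. It assumes a fair six-sided die as the probability model (an illustrative choice, not something from the lecture), and the names `LAW` and `prob` are made up for this example.

```python
from fractions import Fraction

# Assumed model: a fair six-sided die, each outcome s_i has probability 1/6.
LAW = {s: Fraction(1, 6) for s in range(1, 7)}

def prob(event):
    """P(event) as the sum of the probabilities of its outcomes."""
    return sum((LAW[s] for s in event), Fraction(0))

A = {1, 2}
B = {5}

# Axiom 1 (non-negativity) and axiom 2 (normalization).
assert all(p >= 0 for p in LAW.values())
assert prob(LAW.keys()) == 1

# Axiom 3: A and B are disjoint, so their probabilities add.
assert A.isdisjoint(B)
assert prob(A | B) == prob(A) + prob(B)

# P({s_1, s_2, s_3}) = P(s_1) + P(s_2) + P(s_3)
print(prob({1, 2, 3}))  # 1/2
```

Using `Fraction` keeps the arithmetic exact, so the equalities can be checked with `==` instead of a tolerance.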
A classic problem: a point is chosen in a square. What is the probability that it is exactly one given point,
such as $P((x,y)=(0.5, 0.3))=?$
First we need a probability model with a reasonable probability law.
In this sample space we use area to represent probability. A single point has zero area, so in that case $P((x,y)=(0.5,0.3))$ is zero.
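The area law can be sketched in a few lines. This assumes the sample space is the unit square $[0,1]\times[0,1]$ with a uniform law, which the notes do not state explicitly; `prob_rectangle` is a hypothetical helper.

```python
# Assumed model: the sample space is the unit square [0, 1] x [0, 1]
# and the probability of a region is its area (uniform law).

def prob_rectangle(x0, x1, y0, y1):
    """P of the rectangle [x0, x1] x [y0, y1], clipped to the unit square."""
    width = max(0.0, min(x1, 1.0) - max(x0, 0.0))
    height = max(0.0, min(y1, 1.0) - max(y0, 0.0))
    return width * height

# The whole square has probability 1 (axiom 2).
print(prob_rectangle(0.0, 1.0, 0.0, 1.0))

# A single point is a rectangle of zero width and height, so its area is 0.
print(prob_rectangle(0.5, 0.5, 0.3, 0.3))

# Shrinking a box around (0.5, 0.3) drives the probability toward 0.
for half in (0.1, 0.01, 0.001):
    print(half, prob_rectangle(0.5 - half, 0.5 + half, 0.3 - half, 0.3 + half))
```

The loop shows why the single point gets probability zero: it sits inside boxes of arbitrarily small area.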
lecture 2 conditional probability and Bayes rule
A probability of 0 does not mean the event cannot happen: $P((x, y)=(0, 0))=0$, yet the point $(0,0)$ can still be the outcome. Probability zero only means the event is extremely unlikely.
On the other hand, a probability of 1 only means the event is extremely likely, not that it is guaranteed, e.g. $P((x,y)\ne (0,0))=1$.
Another thing is conditional probability $P(A\vert B)$
$$
P(A\vert B)=\frac{P(A\cap B)}{P(B)}
$$
This answers the question: given that event $B$ has occurred, what is the probability that event $A$ occurs?
The original sample space is $\Omega$; now we revise the sample space to $B$ and no longer need to care about anything outside it.
If $P(B)=0$, we say that $P(A\vert B)$ is undefined.
It can also be written as follows:
$$
\begin{flalign}
P(A\cap B)&=P(B)\times P(A\vert B) \\
&=P(A)\times P(B\vert A)&
\end{flalign}
$$
Here is the explanation: the probability that both $A$ and $B$ occur is proportional to the probability that $B$ occurs. If we repeat the experiment many times, we find that among the trials where $B$ occurs, $A$ also occurs in a certain fraction of them, and that fraction is $P(A\vert B)$.
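The definition and the product form can be checked by counting on a small sample space. This sketch assumes two rolls of a fair four-sided die (an illustrative experiment, not taken from the notes); `prob` and `cond_prob` are hypothetical helpers.

```python
from fractions import Fraction
from itertools import product

# Assumed experiment: two rolls of a fair four-sided die,
# all 16 ordered pairs equally likely.
OMEGA = set(product(range(1, 5), repeat=2))

def prob(event):
    return Fraction(len(event & OMEGA), len(OMEGA))

def cond_prob(a, b):
    """P(a | b); returns None when P(b) = 0, where it is undefined."""
    if prob(b) == 0:
        return None
    return prob(a & b) / prob(b)

A = {s for s in OMEGA if sum(s) >= 6}   # the two rolls sum to at least 6
B = {s for s in OMEGA if s[0] == 4}     # the first roll is 4

# Fraction of B's outcomes that also lie in A.
print(cond_prob(A, B))

# Both factorizations of P(A ∩ B) agree.
assert prob(A & B) == prob(B) * cond_prob(A, B) == prob(A) * cond_prob(B, A)
```

Conditioning on $B$ amounts to throwing away every outcome outside $B$ and renormalizing, which is what the division by `prob(b)` does.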
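Conditional probability is sometimes counter-intuitive. Here is an example:
By calculation, we get that $P(A\vert B)$ is about 34%, where $A$ is the event that an airplane is present and $B$ is the event that the radar registers something. This means that given the radar has registered something, there is only a 34% probability that an airplane is present.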
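The numbers behind this example are not recorded in these notes. As an assumption, take $P(A)=0.05$ (an airplane is present), $P(B\vert A)=0.99$ (the radar registers it) and $P(B\vert A^c)=0.10$ (false alarm); these values are chosen only because they reproduce the 34% figure. Splitting $B$ into the disjoint parts $A\cap B$ and $A^c\cap B$ gives:

$$
\begin{flalign}
P(B)&=P(A)P(B\vert A)+P(A^c)P(B\vert A^c) \\
&=0.05\times 0.99+0.95\times 0.10=0.1445 \\
P(A\vert B)&=\frac{P(A)P(B\vert A)}{P(B)}=\frac{0.0495}{0.1445}\approx 0.34&
\end{flalign}
$$

Under these assumed values the result is low because airplanes are rare: the false alarms from the 95% of cases with no airplane outnumber the true detections.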
Another important thing is the Bayes’ rule:
$$
\begin{flalign}
P(A_i\vert B)&=\frac{P(A_i\cap B)}{P(B)} \\
&=\frac{P(A_i)P(B\vert A_i)}{P(B)} \\
&=\frac{P(A_i)P(B\vert A_i)}{\sum_jP(A_j)P(B\vert A_j)}
\end{flalign}
$$
Going from $A_i$ to $B$ ($A_i\to B$) is a cause-and-effect model, described by $P(B\vert A_i)$.
Going from $B$ back to $A_i$ ($B\to A_i$) is an inference model, described by $P(A_i\vert B)$.
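Bayes' rule turns the cause-and-effect numbers into inference numbers mechanically. The sketch below is one way to write it; the function name `posterior` and the three-cause numbers are hypothetical, chosen only for illustration.

```python
def posterior(priors, likelihoods):
    """Bayes' rule: return the list of P(A_i | B) for every i.

    priors[i]      = P(A_i), where the A_i partition the sample space
    likelihoods[i] = P(B | A_i)
    """
    joint = [p * l for p, l in zip(priors, likelihoods)]  # P(A_i) P(B | A_i)
    total = sum(joint)                                    # P(B), the denominator
    if total == 0:
        raise ValueError("P(B) = 0, so P(A_i | B) is undefined")
    return [j / total for j in joint]

# Hypothetical numbers: three causes A_1, A_2, A_3 and one observed effect B.
priors = [0.5, 0.3, 0.2]
likelihoods = [0.1, 0.4, 0.8]
post = posterior(priors, likelihoods)
print(post)
```

With these numbers the least likely cause a priori, $A_3$, becomes the most likely one after observing $B$, because $B$ is much more probable under $A_3$ than under the others.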
lecture 3 independence
independence
Independence means that whether or not the first event occurs does not change your belief about the second event:
$$
P(B\vert A)=P(B)
$$
But as we know, the conditional probability $P(B\vert A)$ requires $P(A)\ne 0$, so we change the definition to one that always makes sense:
$$
P(A\cap B)=P(A)P(B)
$$
An example: take two events $A$ and $B$ that are disjoint. Are they independent?
The answer is no! Being disjoint does not make events independent.
Once event $A$ occurs, we are sure that $B$ will not happen, so $A$ and $B$ are strongly dependent. This is the intuitive result. We can also use the definition $P(A\cap B)=P(A)P(B)$ to get a numerical result: given $P(A)=1/3$ and $P(B)=1/4$, we have $P(A\cap B)=0\ne P(A)P(B)=1/12$, so they are dependent. In conclusion, most of the time intuition is not reliable (the intuitive rule being that whether the first event $A$ occurs does not change our belief about the second event $B$), so we need a more exact formula and a numerical check.
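The numerical check can be reproduced on a concrete model. This sketch assumes 12 equally likely outcomes, chosen only so that $P(A)=1/3$ and $P(B)=1/4$ as in the text; the sets and the helper `independent` are hypothetical.

```python
from fractions import Fraction

# Assumed model: 12 equally likely outcomes 0..11, chosen so that
# P(A) = 1/3 and P(B) = 1/4 as in the text.
OMEGA = set(range(12))
A = {0, 1, 2, 3}
B = {4, 5, 6}

def prob(event):
    return Fraction(len(event), len(OMEGA))

def independent(a, b):
    """Product definition: P(a ∩ b) == P(a) P(b)."""
    return prob(a & b) == prob(a) * prob(b)

print(A.isdisjoint(B))                  # True
print(prob(A & B), prob(A) * prob(B))   # 0 versus 1/12
print(independent(A, B))                # False
```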
conditional independence
We can write conditional independence as follows (the first line is ordinary independence, the second is independence given $C$):
$$
\begin{flalign}
P(A\cap B)&= P(A)P(B) \\
P(A\cap B\vert C)&=P(A\vert C)P(B\vert C)&
\end{flalign}
$$
here is an example:
Suppose $P(A)=1/3$, $P(B)=1/4$, $P(A\cap B)=1/12$. These values satisfy the assumption that $A$ and $B$ overlap but are independent, since $P(A)P(B)=1/12$. The original probability model is $\Omega$; given that $C$ occurs, the model is revised to $C$. In this new model the parts of $A$ and $B$ that lie inside $C$ are disjoint, so given $C$ they are completely dependent.
The conclusion is that even though $A$ and $B$ are independent in the original model, in the sub-model $C$ they are not independent anymore.
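A concrete model makes this loss of independence easy to verify. The sketch assumes 12 equally likely outcomes and particular sets $A$, $B$, $C$ picked to match the probabilities above; these sets are an illustration, not the ones from the lecture.

```python
from fractions import Fraction

# Assumed model: 12 equally likely outcomes 0..11.
OMEGA = set(range(12))
A = {0, 1, 2, 3}      # P(A) = 1/3
B = {0, 4, 5}         # P(B) = 1/4, overlaps A only in outcome 0
C = {1, 4, 6}         # contains part of A and part of B, but not their overlap

def prob(event, given=OMEGA):
    """P(event | given), assuming P(given) > 0; the default is the unconditional P(event)."""
    return Fraction(len(event & given), len(given))

def independent(a, b, given=OMEGA):
    """Check P(a ∩ b | given) == P(a | given) P(b | given)."""
    return prob(a & b, given) == prob(a, given) * prob(b, given)

print(independent(A, B))      # True:  1/12 == 1/3 * 1/4
print(independent(A, B, C))   # False: 0 != 1/3 * 1/3
```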
Another example is coins: we have two biased coins.