Python_Data_Science_Lesson 5 - hmchzb19 - ChinaUnix Blog

Python_Data_Science_Lesson 5

Category: Python/Ruby

2018-05-31 15:12:35

Conditional probability and Bayes' Theorem
These two topics made my head spin.

1. Conditional probability
If two events depend on each other, what is the probability
that both will occur?
Notation:
P(A,B) is the joint probability that A and B both occur.
P(B|A) is the probability of B given that A has already occurred.

we know:
P(B|A) = P(A,B) / P(A)

Example:
I give my students two tests. 60% of my students passed both tests, but the first
test was easier: 80% passed that one. What percentage of the students who passed
the first test also passed the second?
A= passing the first
B= passing the second
so we are asking for P(B|A) - probability of B given A

P(B|A) = P(A,B) / P(A) = 0.6 / 0.8 = 0.75
75% of (students who passed the first test) also passed the second.
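
The arithmetic above can be checked with a few lines of Python. This is only a minimal sketch: the function name `conditional_probability` and the variable names are made up for illustration and do not come from the course.

```python
def conditional_probability(p_joint: float, p_condition: float) -> float:
    """Return P(B|A) = P(A,B) / P(A)."""
    if not 0.0 < p_condition <= 1.0:
        raise ValueError("P(A) must be in (0, 1]")
    if not 0.0 <= p_joint <= p_condition:
        raise ValueError("P(A,B) must be between 0 and P(A)")
    return p_joint / p_condition


p_both = 0.6    # P(A,B): passed both tests
p_first = 0.8   # P(A): passed the first test

p_second_given_first = conditional_probability(p_both, p_first)
print(f"P(B|A) = {p_second_given_first:.2f}")
```

The input checks reflect that the joint probability can never exceed P(A); the result is compared with a tolerance in practice because floating-point division is not exact.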

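Question: what about P(A|B), the probability of A given that B has already occurred?
That is the share of students who passed the second test who also passed the first.
It equals P(A,B) / P(B), so it needs P(B), which this example does not give.
In general, P(A|B) is NOT equal to P(B|A).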

2. Bayes' Theorem
Now that you understand conditional probability, you can understand Bayes' Theorem:

P(A|B) = P(A)P(B|A) / P(B)
In English: the probability of A given B is the probability of A times the probability of B given A, divided by the probability of B.

The key insight is that the probability of A given B depends heavily on the base probabilities of A and B, not only on P(B|A).

Drug testing is a common example. Even a "highly accurate" drug test can produce more false positives than true positives.
Let's say we have a drug test that correctly identifies users of a drug 99% of the time,
and correctly returns a negative result for 99% of non-users. But only 0.3% of the overall population actually uses this drug.

A = is a user of the drug
B = tests positive for the drug

P(A) = 0.3% = 0.003
From that information we can work out that P(B) is about 1.3% (0.99 * 0.003 + 0.01 * 0.997 = 0.01294):
the probability of testing positive if you do use, plus the probability of testing positive if you don't, each weighted by that group's share of the population.

P(A|B) = P(A) * P(B|A) / P(B) = 0.003 * 0.99 / 0.013 ≈ 22.8%

So the probability that someone is an actual user of the drug, given that they test positive, is only about 22.8%.
Even though P(B|A) is 99%, P(A|B) is only 22.8%.
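
The drug-test calculation can be reproduced in Python as a rough check. The helper `posterior_given_positive` and its parameter names are hypothetical, chosen here for illustration. Note that with the unrounded P(B) of 0.01294 the result comes out at roughly 23%, slightly above the 22.8% obtained when P(B) is rounded to 0.013.

```python
def posterior_given_positive(prior, true_positive_rate, false_positive_rate):
    """Apply Bayes' Theorem for a positive test result.

    Returns (P(B), P(A|B)), where A = "is a user" and B = "tests positive".
    """
    # Law of total probability: positives from users plus positives from non-users.
    p_positive = true_positive_rate * prior + false_positive_rate * (1.0 - prior)
    # Bayes' Theorem: P(A|B) = P(A) * P(B|A) / P(B)
    p_user_given_positive = prior * true_positive_rate / p_positive
    return p_positive, p_user_given_positive


p_user = 0.003               # P(A): 0.3% of the population uses the drug
p_pos_given_user = 0.99      # P(B|A): users correctly flagged
p_pos_given_non_user = 0.01  # 1 - 0.99: non-users wrongly flagged

p_positive, posterior = posterior_given_positive(
    p_user, p_pos_given_user, p_pos_given_non_user
)
print(f"P(B)   = {p_positive:.5f}")
print(f"P(A|B) = {posterior:.4f}")
```

Raising `p_user` in this sketch shows how strongly the answer depends on the base rate: the same test becomes far more trustworthy when the drug is common.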

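After working through Bayes' Theorem, I see a lot of so-called statistics in a new light.
See the referenced web page.

The example there is a cancer test.
1% of people have cancer; 99% do not.
80% of the people who have cancer get a correct (positive) result; for the other 20% the test may miss the cancer.
People without cancer have a 9.6% chance of being wrongly diagnosed with cancer, and a 90.4% chance of correctly being told they do not have it.

If someone's test result says cancer, what is the probability that they really have cancer?
P(cancer and positive) = 0.01 * 0.8 = 0.008, and P(positive) = 0.008 + 0.99 * 0.096 = 0.10304.
So, our chance of cancer is .008/.10304 = 0.0776, or about 7.8%.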
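
Another way to see the same result is to count people instead of multiplying probabilities. The sketch below assumes a hypothetical population of 10,000 (the size is arbitrary and is not part of the original example) and uses the rates quoted above.

```python
population = 10_000          # hypothetical population size, chosen for readability

cancer_rate = 0.01           # 1% have cancer
true_positive_rate = 0.80    # 80% of cancer cases test positive
false_positive_rate = 0.096  # 9.6% of healthy people test positive

with_cancer = population * cancer_rate
without_cancer = population - with_cancer

true_positives = with_cancer * true_positive_rate       # sick and flagged
false_positives = without_cancer * false_positive_rate  # healthy but flagged

# Among everyone who tests positive, what fraction actually has cancer?
p_cancer_given_positive = true_positives / (true_positives + false_positives)

print(f"True positives:  {true_positives:.1f}")
print(f"False positives: {false_positives:.1f}")
print(f"P(cancer | positive) = {p_cancer_given_positive:.4f}")
```

Because healthy people vastly outnumber sick ones, the false positives outnumber the true positives by more than ten to one, which is why a positive result still leaves the chance of cancer below 8%.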
