
I appreciate the lack of math notation. For many people with a weak mathematics background, notation feels like a huge wall blocking the way into interesting and useful theories.


I agree. I'm working on a Google Translate idea for math. I think notation could be more readable!


Bayes theorem is very well suited to this. Frankly, it's one of those rare cases where those without much math might find it easier to read the original paper than many of the introductions...


When I finally got Bayes' theorem, I thought it said something obvious in unfamiliar terms.

What made it click for me was realizing that Bayesian networks are a mini-language and that Bayes' theorem is much more easily explained visually than with formulas. I think teachers should start by explaining the correspondence between these diagrams and probability terminology.

Here's how I would explain it.

----

In a Bayesian network, nodes are events and arrows are probabilities.

When you traverse a path made of successive arrows you multiply the probabilities of the arrows you encounter along the path.

When there is more than one path to get from A to B and you want to know the probability of getting from the former to the latter, you sum the probabilities obtained from the various paths.

When you say "probability of A" it's like saying: sum of the paths that get to A.

When you say "probability of A and B" it's like saying: sum of the paths that include both A and B.

When you say "conditional probability of B given A" it's like saying: starting from A, sum of the paths that lead to B.

----
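The rules above can be sketched in a few lines of Python. This is only an illustration: the nested-dict encoding of the tree, the helper names `paths` and `probability`, and all the numbers are assumptions made up for the example.

```python
# Each node maps an event name to (arrow probability, subtree).
# The encoding and every number here are illustrative assumptions.
tree = {
    "A": (0.3, {"B": (0.5, {}), "not B": (0.5, {})}),
    "not A": (0.7, {"B": (0.2, {}), "not B": (0.8, {})}),
}

def paths(node, prefix=(), prob=1.0):
    """Yield (events on the path, product of its arrows) for each root-to-leaf path."""
    for event, (p, subtree) in node.items():
        events = prefix + (event,)
        if subtree:
            # Traversing successive arrows: multiply the probabilities.
            yield from paths(subtree, events, prob * p)
        else:
            yield events, prob * p

def probability(*events):
    """Sum the probabilities of all paths that include every given event."""
    return sum(p for evs, p in paths(tree) if all(e in evs for e in events))

p_a = probability("A")              # sum of the paths that get to A
p_b = probability("B")              # two paths reach B, so two terms are summed
p_a_and_b = probability("A", "B")   # sum of the paths that include both A and B
p_b_given_a = p_a_and_b / p_a       # equals the single arrow from A to B

print(p_a, p_b, p_a_and_b, p_b_given_a)
```

Dividing the "A and B" paths by the "A" paths gives back the arrow from A to B, which is the "start from A" rule expressed as arithmetic.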

Let's do a simple application. This is a tree that doctors should find familiar, and it is the one from which I understood it.

     /T+
  D+/
   /\
  /  \T-
 /
 \    
  \  /T+
   \/
  D-\
     \T-
    
Starting from the root, the first bifurcation gives the probability of having the disease or not. The second bifurcation gives the probability that a diagnostic test says either "positive" or "negative".

Usually doctors can estimate the values of the single arrows of this tree.

Suppose I asked you: what's the conditional probability of a positive test given that the patient has the disease? Given what we said, you just put your pencil on D+ and follow the path to T+: just 1 arrow, no need to multiply (it's called the "sensitivity" of the test).

What's the probability of a positive test for a person drawn at random from the population? Since we don't start with a patient known to have or not have the disease, we put our pencil on the root. There are 2 ways of getting to a T+: root-->D+-->T+ and root-->D- -->T+. As we said above, while following each path we multiply the arrows we encounter, and then we sum the results of the 2 paths.

And finally: what's the probability of our patient having the disease given that the test says "positive"? We said we have 2 ways to get a positive test, but in only one of them does our patient really have the disease, so we just divide the probability of the only path that contains both D+ and T+ by the probability of all paths that lead to T+. We are just saying that true positives are a fraction of all positives (seems obvious to me?). The numerator is the only "test is positive and it's true" path. The denominator is the sum of all "test is positive" paths.

Well, guess what we just did:

P(D+|T+) = ( P(T+|D+) P(D+) ) / P(T+)
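To see the formula with numbers plugged in, here is a minimal sketch. The prevalence, sensitivity, and false-positive rate are hypothetical values chosen for illustration, not figures for any real test.

```python
# Hypothetical arrow values (assumptions for illustration only).
p_disease = 0.01             # root --> D+  (prevalence)
sensitivity = 0.90           # D+ --> T+
false_positive_rate = 0.05   # D- --> T+

# Multiply along each path that ends in T+.
true_positive_path = p_disease * sensitivity                  # root --> D+ --> T+
false_positive_path = (1 - p_disease) * false_positive_rate   # root --> D- --> T+

# Sum the paths to get P(T+), then divide: true positives over all positives.
p_positive = true_positive_path + false_positive_path
p_disease_given_positive = true_positive_path / p_positive

print(f"P(T+)    = {p_positive:.4f}")
print(f"P(D+|T+) = {p_disease_given_positive:.4f}")
```

With these made-up numbers, fewer than one in six positive tests belongs to a patient who has the disease, because the false-positive path starts from the much larger D- branch.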

(Additional intuition: another way to see it is that what we did corresponds to mapping the tree we started from to a flipped one in which the first bifurcation is T+/T- and the second one is D+/D-)
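The flipped tree can be built mechanically from the original one. The sketch below assumes hypothetical arrow values and checks that every path in the flipped tree carries the same probability as the matching path in the original tree.

```python
# Original tree: first split on disease, then on test result.
# All numbers are hypothetical, chosen only for illustration.
p_d = {"D+": 0.01, "D-": 0.99}
p_t_given_d = {
    "D+": {"T+": 0.90, "T-": 0.10},
    "D-": {"T+": 0.05, "T-": 0.95},
}
tests = ("T+", "T-")

# Probability of each full path root --> D --> T.
joint = {(d, t): p_d[d] * p_t_given_d[d][t] for d in p_d for t in tests}

# Flipped tree: first split on test result, then on disease.
p_t = {t: sum(joint[(d, t)] for d in p_d) for t in tests}
p_d_given_t = {t: {d: joint[(d, t)] / p_t[t] for d in p_d} for t in tests}

# Each flipped path root --> T --> D has the same probability as root --> D --> T.
for (d, t), p in joint.items():
    assert abs(p_t[t] * p_d_given_t[t][d] - p) < 1e-12

print(p_t)
print(p_d_given_t)
```

The second-level arrows of the flipped tree are exactly the quantities Bayes' theorem produces, such as `p_d_given_t["T+"]["D+"]`.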


I think a better way to describe it is with Venn diagrams:

Conditional probability is just: what proportion of B does A take up, given that B has already happened?

A might be small in the Venn diagram box, but take up a larger share of the area when constrained to only the part that B is in.
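The area picture can be made concrete by counting outcomes in sets. This is a toy sketch: the "box" of 100 equally likely outcomes and the choice of A and B are assumptions made up for the example.

```python
# The whole Venn diagram box: 100 equally likely outcomes (an assumption).
outcomes = set(range(1, 101))

A = {n for n in outcomes if n % 10 == 0}   # multiples of 10
B = {n for n in outcomes if n % 5 == 0}    # multiples of 5

# Share of the whole box that A occupies.
p_a = len(A) / len(outcomes)

# Share of B that A occupies: restrict attention to B, then measure the overlap.
p_a_given_b = len(A & B) / len(B)

print(p_a, p_a_given_b)
```

Here A covers a tenth of the box but half of B, which is the "small in the box, larger inside B" effect.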


Bayes Theorem hardly requires any math notation at all. It would literally take you less than a minute to understand conditional probability.

Yikes.


Possibly true, but just looking at the Wikipedia page for Bayes Theorem, more than half the text on the page is math notation: https://en.wikipedia.org/wiki/Bayes%27_theorem

It doesn't matter how simple the math actually is, if someone is unfamiliar with mathematical notation it's going to be overwhelming to read.


Then learn the notation or find another source.

This is a bit like complaining about the existence of books because you never learned how to read.


I guess some people interested in machine learning don't know about multiplication and division, but I wouldn't want to depend on their models...


Oh yeah, and the first actually usable form of Bayes' theorem would be probabilistic graphical models with the max-sum algorithm. Good luck mastering that quickly or at all!


That is far from the first usable form of Bayes. I have no idea what point you are making.

Bayes Theorem is easily derived algebraically using conditional probability and the chain rule. You can also derive it easily with a Venn diagram. There is barely any notation needed at all here to understand it.
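For reference, the algebraic derivation is short. This sketch assumes "chain rule" here means the product rule for probabilities, and that P(B) > 0:

```latex
\begin{align*}
P(A \cap B) &= P(A \mid B)\, P(B) && \text{(product rule, conditioning on } B\text{)} \\
P(A \cap B) &= P(B \mid A)\, P(A) && \text{(product rule, conditioning on } A\text{)} \\
P(A \mid B)\, P(B) &= P(B \mid A)\, P(A) && \text{(both equal } P(A \cap B)\text{)} \\
P(A \mid B) &= \frac{P(B \mid A)\, P(A)}{P(B)} && \text{(divide by } P(B)\text{)}
\end{align*}
```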

If you're struggling with things at that level, it is more likely due to your own laziness, not because the math is hard. Because it is very easy to reason about.


Chain rule???



