Bayes Theorem is a very common and fundamental theorem used in Data mining and Machine learning. Its formula is pretty simple:

P(X|Y) = ( P(Y|X) * P(X) ) / P(Y), which is Posterior = ( Likelihood * Prior ) / Evidence

So I was wondering why they are called correspondingly like that.

Let’s use an example to find out their meanings.