We refer to P(E) as the probability of the event E.
The distribution of a parameter before observing any data is called the prior distribution of the parameter. The conditional distribution of the parameter given the observed data is called the posterior distribution. If we plug the observed values of the data into the conditional p.f. or p.d.f. of the data given the parameter, the result is a function of the parameter alone, which is called the likelihood function.
Probability and Statistics, Fourth Edition, M. H. DeGroot and M. J. Schervish
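To make the quoted definition concrete, here is a minimal Python sketch. It assumes a single observation x = 5 from an exponential distribution and treats the unknown parameter as a rate, so the density is rate * exp(-rate * x). That parametrization, the function names, and the crude midpoint integrator are illustrative assumptions and are not taken from the textbook. The sketch plugs the observed value into the density to get a function of the parameter alone, then compares the area under the density (over x) with the area under the likelihood (over the parameter).

```python
import math


def exponential_pdf(x, rate):
    """Density of an exponential distribution with the given rate (assumed parametrization)."""
    return rate * math.exp(-rate * x) if x >= 0 else 0.0


def likelihood(rate, observed_x=5.0):
    """Likelihood of `rate`: the density with the observed value plugged in."""
    return exponential_pdf(observed_x, rate)


def midpoint_integral(f, lower, upper, steps=200_000):
    """Crude midpoint-rule approximation of the integral of f over [lower, upper]."""
    width = (upper - lower) / steps
    return width * sum(f(lower + (i + 0.5) * width) for i in range(steps))


# Density as a function of the data x, with the rate held fixed at 0.2.
# The upper limit 200 stands in for infinity; the remaining tail is negligible.
area_over_x = midpoint_integral(lambda x: exponential_pdf(x, 0.2), 0.0, 200.0)

# Likelihood as a function of the rate, with the data held fixed at x = 5.
# The upper limit 20 stands in for infinity; the remaining tail is negligible.
area_over_rate = midpoint_integral(likelihood, 0.0, 20.0)

print(f"area under the density over x:       {area_over_x:.6f}")
print(f"area under the likelihood over rate: {area_over_rate:.6f}")
```

Under these assumptions the first area is approximately 1, as it must be for a density, while the second is approximately 1/25, the exact value of the integral of p * exp(-5p) over p > 0.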
Likelihood is NOT probability, because the likelihood function, viewed as a function of the parameter, does not in general satisfy the three axioms of probability. For example, suppose we observe x = 5 from an exponential distribution with an unknown parameter p. Then the likelihood function is defined for p > 0. If the likelihood function were a probability density for p, then the following expression would have to hold: