Posted on April 17, 2013 @ 01:30:00 PM by Paul Meagher
In my last blog I showed how to compute the likelihood term P(E|H) in Bayes' formula, which is shown below:
P(H|E) = P(E|H) * P(H) / P(E)
In today's blog we will use the likelihood values we previously computed to predict startup success given the evidence of two diagnostic tests, i.e., to compute P(H|E). Here is the data table we created in the last blog, with likelihoods appearing in parentheses.
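| Outcome | # Startups | Tests ++ | Tests +- | Tests -+ | Tests -- |
|---------|------------|-----------|-----------|-----------|------------|
| S       | 1200       | 650 (.54) | 250 (.21) | 250 (.21) | 50 (.04)   |
| U       | 8800       | 100 (.01) | 450 (.05) | 450 (.05) | 7800 (.89) |
| Total   | 10,000     |           |           |           |            |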
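As a quick check on the values in parentheses, here is a minimal Python sketch (not the wizard's PHP code) that recomputes each likelihood P(E|H) by dividing a cell count by its row total. The function name `likelihoods_from_counts` and the dictionary layout are my own choices for illustration.

```python
# Counts from the table above: startups per outcome (S/U) and test pattern.
counts = {
    "S": {"++": 650, "+-": 250, "-+": 250, "--": 50},
    "U": {"++": 100, "+-": 450, "-+": 450, "--": 7800},
}

def likelihoods_from_counts(counts):
    """Return P(E|H) for every hypothesis H and evidence pattern E.

    Each likelihood is the cell count divided by the row total,
    e.g. P(++|S) = 650 / 1200.
    """
    result = {}
    for hypothesis, row in counts.items():
        row_total = sum(row.values())
        result[hypothesis] = {pattern: n / row_total for pattern, n in row.items()}
    return result

for hypothesis, row in likelihoods_from_counts(counts).items():
    # Round to two decimals to compare with the table.
    print(hypothesis, {pattern: round(p, 2) for pattern, p in row.items()})
```

Within each row the likelihoods sum to 1, because every startup with a given outcome shows exactly one of the four test patterns.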
This data table provides us with all the information we need in order to use Bayes' theorem to predict the probability of startup success given evidence from two diagnostic tests. To compute the posterior probabilities for each hypothesis given different evidence patterns, we will use a simple bayes_wizard.php script. Let me show you how it works.
When we point our browser at the bayes_wizard.php script (in a PHP-enabled web folder), the first screen asks us to input the number of hypotheses and the number of test labels.
The next screen asks us to input the labels for the hypotheses and tests. We use S to mean successful startup and U to mean unsuccessful startup. We use ++ to indicate a positive outcome on both diagnostic tests, -- to indicate a negative outcome on both diagnostic tests, and so on.
Next we are asked to enter the prior probability of each hypothesis (i.e., P(H=S) and P(H=U)). These are just the fractions of the 10,000 startups classified as successful or unsuccessful: 1200/10,000 = .12 for S and 8800/10,000 = .88 for U.
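If you would rather compute the priors than work them out by hand, a small Python sketch like the following would do it. The variable names are mine, and this is not how the wizard itself collects the values (it asks you to type them in).

```python
# Number of startups per outcome, taken from the "# Startups" column.
startups = {"S": 1200, "U": 8800}

# Prior P(H) is each outcome's share of all startups.
total = sum(startups.values())
priors = {hypothesis: n / total for hypothesis, n in startups.items()}

print(priors)  # {'S': 0.12, 'U': 0.88}
```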
The next screen asks us to input the likelihood for each combination of test and hypothesis. Here we enter the likelihoods we computed in our last blog (see the values in parentheses in the table above).
The final screen displays the posterior probabilities for each hypothesis given each evidence pattern.
The way to interpret this table is to examine each row separately. In the first row, where both diagnostic tests have positive outcomes, we see that the posterior probability that the startup is successful (.88) is much higher than the probability that it is unsuccessful (.12). So, a startup exhibiting this pattern of diagnostic evidence is quite likely to be successful. Our posterior probability calculation allows us to move from an initial estimate of 12 percent probability of startup success (the prior) to an 88 percent probability of startup success.
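To make the calculation behind that first row concrete, here is a minimal Python sketch of the Bayes computation. It is my own illustration under the assumption that P(E) is obtained by summing P(E|H) * P(H) over the hypotheses; it is not the wizard's PHP code, and the function name `posteriors` is hypothetical. For the ++ pattern this gives (.54 * .12) / (.54 * .12 + .01 * .88) = .0648 / .0736, or about .88.

```python
def posteriors(priors, likelihoods):
    """Return P(H|E) for every evidence pattern E and hypothesis H.

    priors:      {hypothesis: P(H)}
    likelihoods: {hypothesis: {pattern: P(E|H)}}
    """
    patterns = next(iter(likelihoods.values())).keys()
    table = {}
    for pattern in patterns:
        # Numerator of Bayes' formula for each hypothesis: P(E|H) * P(H).
        joint = {h: likelihoods[h][pattern] * priors[h] for h in priors}
        # Denominator P(E): sum of the numerators over all hypotheses.
        p_evidence = sum(joint.values())
        table[pattern] = {h: joint[h] / p_evidence for h in priors}
    return table

# Priors and rounded likelihoods as given in the post.
priors = {"S": 0.12, "U": 0.88}
likelihoods = {
    "S": {"++": 0.54, "+-": 0.21, "-+": 0.21, "--": 0.04},
    "U": {"++": 0.01, "+-": 0.05, "-+": 0.05, "--": 0.89},
}

for pattern, row in posteriors(priors, likelihoods).items():
    print(pattern, {h: round(p, 2) for h, p in row.items()})
```

The first printed row should reproduce the .88 and .12 discussed above. Because the function loops over whatever keys it is given, the same sketch works for more than two hypotheses or more than four evidence patterns.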
The diagnostic tests could be anything that might be predictive of startup success. We could, for example, assess a startup's business plan against a checklist of desirable attributes and score it as pass (+) or fail (-). The Bayes Wizard allows you to specify as many tests and hypotheses as you want. It is up to you to come up with the hypotheses you want to examine and the number and kind of tests you want to use. You should look for empirical information about the covariation between your tests and outcomes so that you can compute the required likelihood terms.
If you have been following my last few blogs, you should now have a good sense of how you can begin to use Bayesian inference to arrive at better angel investment decisions. If you want to see how the wizard works under the hood and how the Bayes' theorem calculation is implemented, you can download the code from my GitHub account.
https://github.com/mrdealflow/BAYES