The base rate fallacy occurs when a probabilistic inference is based only on information specific to the case at hand while background (base rate) information about the wider population is ignored, which can lead to wrong conclusions. The base rate fallacy is a “paradigmatic Bayesian inference problem” (Bar-Hillel, 1979).
Consider a hit and run that occurred at night in a city with two cab companies, where a cab is suspected of having been involved. One company operates blue cabs and the other operates green cabs. Blue cabs make up 85% of the cabs in the city, and the remaining 15% belong to the green cab company.
During the trial, a witness identified the cab involved as a green cab. The court tested the reliability of the witness under night-time visibility conditions and found that the witness identified a cab's colour correctly 80% of the time and incorrectly 20% of the time.
Given this information, a person who is asked for the probability that the cab involved in the accident was indeed green, and who infers that there is an 80% chance, has committed the base rate fallacy. Only one piece of information, the one specific to the incident (the witness statement), has been used to reach that conclusion, while the general information (the distribution of cabs in the city) has been completely ignored.
To reach the correct probabilistic conclusion, the specific information and the general information have to be combined. Bayes' rule gives a normative way to do this: in odds form, the posterior odds equal the likelihood ratio multiplied by the prior odds. Let G denote that the cab was green, B that the cab was blue, and g that the witness reported a green cab. Applying the rule gives
P(G|g)/P(B|g) = [P(g|G)/P(g|B)] × [P(G)/P(B)] = (0.8/0.2) × (0.15/0.85) = 12/17
which gives a posterior probability of 12/(12+17) = 12/29 ≈ 0.41.
There is therefore only a 41% chance that the cab in the accident was green.
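The arithmetic of the cab example can be checked with a short Python sketch. The function name `posterior_probability` is only illustrative and does not come from the original text; exact fractions are used so that the 12/29 result is visible before rounding.

```python
from fractions import Fraction


def posterior_probability(prior_odds, likelihood_ratio):
    """Odds form of Bayes' rule: posterior odds = likelihood ratio x prior odds.

    Returns the posterior probability odds / (1 + odds).
    """
    posterior_odds = likelihood_ratio * prior_odds
    return posterior_odds / (1 + posterior_odds)


# Figures from the cab example.
prior_odds_green = Fraction(15, 85)   # P(G) / P(B)
likelihood_ratio = Fraction(80, 20)   # P(g|G) / P(g|B)

posterior_green = posterior_probability(prior_odds_green, likelihood_ratio)
print(posterior_green)                    # 12/29
print(round(float(posterior_green), 2))   # 0.41
```

Despite the witness being right 80% of the time, the low share of green cabs pulls the posterior below one half.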
Ignoring the base rate does not always lead to an incorrect conclusion; the base rate can be disregarded when the specific information presented alongside it is itself actuarial in nature. Suppose the cab example is modified so that the witness's identification is replaced with actuarial information: investigators discovered that in the neighbourhood where the accident occurred, which is nearer to the green cab company's headquarters than to the blue cab company's, 80% of all taxis are green and 20% are blue. In this case, it is no fallacy to disregard the city-wide base rate.
In the context of information security, consider an intrusion detection system (IDS). An IDS produces false positives (warnings when there is no intrusion) and false negatives (no warning when there is an intrusion) (Cavusoglu et al., 2005). Consequently, alerts have to be investigated manually to confirm or rule out intrusions. Because the proportion of hackers in the user population is low, an IDS with even a moderate false alarm rate will generate more alarms for normal users than for hackers. Acting on every IDS signal as though it indicated an intrusion ignores this low prior probability, and that is the base rate fallacy (Ogut et al., 2008). It also increases the number of manual investigations, which increases cost. The quality profile of an IDS is measured by its false positive and false negative rates; ideally both would be low, but the technology is such that a reduction in one often means an increase in the other. However, if the organisation's security requirements are very high and security considerations far outweigh operational ones, there is significant value in investigating every alert. In that instance, the base rate can be ignored.
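A rough Python sketch shows how a low base rate affects the share of IDS alarms that are genuine. The base rate, detection rate and false alarm rate below are hypothetical values chosen only for illustration; they are not taken from Cavusoglu et al. or Ogut et al.

```python
def alarm_precision(base_rate, detection_rate, false_alarm_rate):
    """P(intrusion | alarm) from Bayes' rule.

    base_rate        -- P(intrusion), the prior probability of an intrusion
    detection_rate   -- P(alarm | intrusion)
    false_alarm_rate -- P(alarm | no intrusion)
    """
    true_alarms = detection_rate * base_rate
    false_alarms = false_alarm_rate * (1 - base_rate)
    return true_alarms / (true_alarms + false_alarms)


# Hypothetical figures: 1 in 1,000 events is an intrusion, the IDS catches
# 99% of intrusions and raises a false alarm on 5% of normal events.
precision = alarm_precision(base_rate=0.001,
                            detection_rate=0.99,
                            false_alarm_rate=0.05)
print(f"Share of alarms that are real intrusions: {precision:.1%}")
```

Under these assumed figures fewer than 2 in 100 alarms correspond to a real intrusion, which is why treating every alarm as an intrusion drives up investigation cost.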
Prosecutor's Fallacy
This fallacy arises when a prosecutor uses statistical reasoning to argue the guilt of a defendant during a trial.
It normally arises when assumptions about the independence of events are made, leading to a misunderstanding of conditional probability, and when the prior odds of guilt, before the introduction of the evidence used to infer guilt, are neglected. In other words, the fallacy consists of showing that the explanation in which the defendant is innocent is highly improbable and then deducing that the explanation in which the defendant is guilty must therefore be the correct one (Buchanan, 2007).
In the controversial case of Sally Clark, the jury had to decide whether she had murdered her children or whether they had died of very rare and unexplained natural causes. The prosecution argued that the chance of them dying of natural causes was 1 in 73 million and that she was therefore guilty of murdering them. However, the figure of 1 in 73 million was reached by assuming that the two deaths were independent events, which misapplies conditional probability. Moreover, the slim chance of an event happening naturally is not, on its own, relevant once the event has occurred; it has to be compared with the chance of the alternative explanation. That is, the probability that Sally killed her children cannot be assessed in isolation while ignoring the probability that she did not do it.
This fallacy also materialises when multiple tests are carried out, for example when a particular piece of evidence is matched against a large database.
In this instance, the likelihood of finding a match increases with the size of the database. Suppose the city in which a suspect lives has 500,000 adult inhabitants and the likelihood of a random DNA match is 1 in 10,000 (this figure is not the probability that a person whose DNA matches the crime-scene sample is innocent). It works out that about 50 people in the city would have DNA that matches the sample, so the suspect is only 1 of 50 people who could have been at the crime scene. Based on the DNA evidence alone, the suspect is almost certainly innocent, not almost certainly guilty, contrary to the fallacious argument that the suspect has only a 1 in 10,000 chance of being innocent and so is guilty.
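The DNA figures can be reproduced with a small Python sketch. It follows the simplification used in the text: the culprit is assumed to be one of the city's adults, and every person whose DNA matches is assumed to be equally likely to be the culprit before any other evidence is considered. The variable names are illustrative only.

```python
from fractions import Fraction

population = 500_000                              # adult inhabitants of the city
random_match_probability = Fraction(1, 10_000)    # chance of a random DNA match

# Expected number of people in the city whose DNA matches the sample.
expected_matches = population * random_match_probability

# Assumption: each matching person is equally likely to be the culprit.
p_guilty_given_match = 1 / expected_matches
p_innocent_given_match = 1 - p_guilty_given_match

print(expected_matches)               # 50
print(p_guilty_given_match)           # 1/50
print(float(p_innocent_given_match))  # 0.98
```

The 1 in 10,000 figure describes the chance that an innocent person matches, whereas the quantity that matters at trial is the chance that a matching person is innocent, which here is 98%.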
One can think of it in this manner: assume a plane crashes and the manufacturer, on trial, argues that the cause was not mechanical failure, citing that "Only 1 in every 100,000 planes will ever have this particular mechanical failure leading to a crash." This is probably true, but the real question of interest is: of planes that do crash, what percentage had that mechanical failure? It might not be a large number, as there could be many possible causes of a crash, but it could be much larger than 1 in 100,000.
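A small Python sketch makes the difference between the two questions concrete. The 1 in 100,000 figure is the manufacturer's claim from the example; the overall crash rate of 1 in 20,000 is a purely hypothetical number introduced here for illustration.

```python
from fractions import Fraction

# Manufacturer's figure: probability that a plane crashes because of this failure.
p_crash_from_failure = Fraction(1, 100_000)

# Hypothetical assumption: probability that a plane crashes from any cause.
p_crash_any_cause = Fraction(1, 20_000)

# Of the planes that do crash, the share that had this mechanical failure.
p_failure_given_crash = p_crash_from_failure / p_crash_any_cause

print(p_failure_given_crash)          # 1/5
print(float(p_failure_given_crash))   # 0.2
```

Under that assumed crash rate, the failure would account for 20% of crashes, far more than the 1 in 100,000 quoted by the manufacturer.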
The same principle can be applied to system protection. A supplier can present their product with a claim that their IDS will miss only 1 in every 1000 attempts to penetrate a network, but the real question is how many attempts are made over a given period. If the number of attempts is very large, the probability that at least one of them succeeds increases.
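How quickly that probability grows can be sketched in Python. The calculation assumes that attempts are independent and that each is missed with the same probability, which is a simplification; the attempt counts are hypothetical.

```python
def prob_at_least_one_miss(miss_rate, attempts):
    """Probability that at least one of `attempts` independent attempts is missed."""
    return 1 - (1 - miss_rate) ** attempts


miss_rate = 1 / 1000  # supplier's claim: 1 in every 1000 attempts is missed

# Hypothetical numbers of penetration attempts over some period.
for attempts in (100, 1_000, 10_000):
    p = prob_at_least_one_miss(miss_rate, attempts)
    print(f"{attempts:>6} attempts: {p:.3f}")
```

With 1,000 attempts the chance of at least one undetected attempt is already above 60%, so the per-attempt figure alone says little about overall exposure.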