When Probability Misleads

Although it may not be apparent, probability plays a crucial role in people’s day-to-day lives across many different fields. From designing casino games so that the house still turns a daily profit to determining whether a drug proved effective in a medical trial, the mathematics and logic behind probability have proven to be an invaluable resource. However, the concepts and laws that allow us to perform these calculations can sometimes produce unintuitive results. This confusion can be exploited in news and research to mislead people into drawing faulty conclusions. In this data-driven world, a basic level of probabilistic literacy goes a long way towards understanding how mathematical concepts influence our lives.

The impact and importance of statistics in medicine are often overlooked in the media, and even when statistics are mentioned, they are usually presented in a deceptive way. The example of COVID-19 tests demonstrates this point well. Suppose you took a COVID-19 test that boasts a sensitivity of 90% and a specificity of 95%. While these numbers seem reassuring, it is not entirely apparent what they actually mean.

What this means is that the test correctly returns a positive result about 90% of the time when a person has COVID-19 (its sensitivity), and correctly returns a negative result around 95% of the time when a person does not have COVID-19 (its specificity). The graph below shows a representative sample population for this data. Now, suppose you tested positive for COVID-19. What is the probability that you are indeed infected? For simplicity, let us estimate the prevalence of the disease in the population to be around 3%.

Data from a sample of the example population mentioned above.

The obvious answer may seem to be 90%, but the actual chance of having the disease works out to be around 36%. While this may seem counterintuitive at first, it makes more sense when considering how unlikely it is to be infected in the first place, given the very low prevalence of the disease. Out of every 1,000 people, about 30 are infected and 27 of them test positive, while roughly 49 of the 970 uninfected people also test positive, so only 27 of the 76 or so positive results are genuine. This is an example of conditional probability: the chance of an event A happening given another event B is not the same as the chance of event B happening given event A.

To illustrate this point, the probability of an NBA player being tall is significantly higher than the probability of a tall person being an NBA player, because there are so few professional basketball players. This way of reasoning is known as Bayesian probability, where instead of drawing conclusions solely from the evidence at hand, the new evidence is used to update what is already known. In our case, we combined the sensitivity and specificity of the test with the already known prevalence of the disease in the population.
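The COVID-19 test calculation can be checked with a few lines of Python. This is only a minimal sketch of Bayes' theorem using the figures from the example (3% prevalence, 90% sensitivity, 95% specificity); the function name `positive_predictive_value` is a hypothetical one chosen for illustration.

```python
def positive_predictive_value(prevalence, sensitivity, specificity):
    """Return P(infected | positive test) using Bayes' theorem."""
    # Fraction of the population that is infected AND tests positive.
    true_positive = prevalence * sensitivity
    # Fraction of the population that is healthy AND still tests positive.
    false_positive = (1 - prevalence) * (1 - specificity)
    # Share of all positive results that come from infected people.
    return true_positive / (true_positive + false_positive)


# Figures taken from the example in the text.
ppv = positive_predictive_value(prevalence=0.03, sensitivity=0.90, specificity=0.95)
print(f"Chance of infection given a positive test: {ppv:.1%}")
```

Raising the `prevalence` argument shows how strongly the answer depends on how common the disease is: at 50% prevalence, the same test gives a result of roughly 95%.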

While nearly all fields take advantage of the useful tools probabilistic thinking provides, some rely on it completely. One such example is the casino industry, a multi-billion dollar sector built upon the mathematics of probability. Behind the casino’s curtain of flashy slot machines and lined-up poker tables lies a game of numbers designed by mathematicians to ensure the house makes a profit.

Take, for instance, the simple game of roulette. Known for having one of the highest chances of winning in a casino, roulette makes it seem like the odds of winning a bet on a color are a clean 50/50, when, in reality, this is hardly the case. A standard French roulette wheel consists of eighteen red pockets, eighteen black pockets, and a single green pocket. Although it may seem unlikely to happen, the ball falling into that one green pocket is what gives the house its edge: no matter which color you choose, you lose 19 times out of 37, so it is statistically likely that you will lose money over time. This small added chance of losing, about 1.35 percentage points above an even 50%, is what allows casinos to make tens of millions of dollars on this simple game.
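The arithmetic behind a bet on red or black can be sketched as follows. This assumes the bet pays 1:1 and that the whole stake is lost when the ball lands on green; some French tables return half the stake in that case (the "la partage" rule), which this sketch deliberately ignores. The variable names are illustrative only.

```python
from fractions import Fraction

# Pocket counts for the single-zero wheel described in the text.
RED, BLACK, GREEN = 18, 18, 1
TOTAL = RED + BLACK + GREEN

# Betting one unit on red: win on red, lose on black or green.
p_win = Fraction(RED, TOTAL)
p_lose = 1 - p_win

# Average result per unit staked: +1 on a win, -1 on a loss.
expected_value = p_win * 1 + p_lose * (-1)

print(f"P(win)  = {p_win} = {float(p_win):.4f}")
print(f"P(lose) = {p_lose} = {float(p_lose):.4f}")
print(f"Expected return per unit bet = {expected_value} = {float(expected_value):.4f}")
```

Under these assumptions the chance of losing is 19/37, and the player gives up an average of 1/37 of every unit wagered, or about 2.7%.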

Furthermore, psychological biases can influence one’s perception of probability in gambling. The most famous example of this is the gambler’s fallacy, in which an individual believes that an independent event is more or less likely to happen based on previous events. For instance, while the probability of getting heads five times in a row when flipping a fair coin is 1/32, this is only true before the first coin flip. If the first four flips come up heads, then the chance of the next flip also being heads is 1/2, not 1/32.
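The coin-flip example can be verified by listing every possible outcome. The sketch below assumes a fair coin with independent flips and simply counts sequences; the variable names are made up for illustration.

```python
from fractions import Fraction
from itertools import product

# Every equally likely sequence of five fair coin flips (2**5 = 32 of them).
sequences = list(product("HT", repeat=5))

# Sequences that are heads all the way through.
all_heads = [s for s in sequences if s == ("H",) * 5]

# Sequences whose first four flips are heads, whatever the fifth is.
first_four_heads = [s for s in sequences if s[:4] == ("H",) * 4]

# Probability of five heads, judged before any coin is flipped.
p_five_heads = Fraction(len(all_heads), len(sequences))

# Probability that the fifth flip is heads, given four heads already seen.
p_fifth_given_four = Fraction(len(all_heads), len(first_four_heads))

print(f"P(five heads in a row)          = {p_five_heads}")
print(f"P(fifth is heads | four heads)  = {p_fifth_given_four}")
```

Only two sequences begin with four heads, and exactly one of them ends in a fifth head, which is where the 1/2 comes from.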

While the previous cases showed how some quirks in the laws of probability can lead to interesting conclusions, there are fields where misusing this mathematical tool can become quite dangerous. One of them is scientific research, where statistics plays an integral role in a variety of ways, most importantly in hypothesis testing.

If you’ve ever seen articles claiming something along the lines of “drinking coffee causes cancer” or any other extreme statement, there is a good chance it stems from a practice known as p-hacking. However, before talking about what p-hacking is and how it can be harmful, it is important to understand how scientists determine whether two sets of data are genuinely related to each other.

To judge objectively whether a difference between an experimental group and a control group could be due to chance alone, researchers use the p-value: the probability of seeing a difference at least as large as the one observed if there were in fact no real effect. Moreover, it has become fairly standard across most scientific journals to treat a p-value less than 0.05 as statistically significant, that is, as sufficient evidence of a real effect.

However, this opens the door to a new issue: there is a 5% chance that the p-value will be less than 0.05 for two unrelated sets of data. In other words, when testing a hypothesis for which no real effect exists, there is a 1/20 chance of a false positive result. P-hacking exploits this by repeatedly testing many variables against each other until a significant result shows up purely by chance. This short comic by xkcd is a perfect example of how p-hacking can lead to misinformation in the media.
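To see how quickly repeated testing erodes the 0.05 threshold, consider the chance of getting at least one false positive across several tests when no real effect exists in any of them. The sketch below assumes the tests are independent of each other, which is a simplification; the function name `chance_of_false_positive` is hypothetical.

```python
def chance_of_false_positive(n_tests, alpha=0.05):
    """P(at least one p-value below alpha) across n independent tests
    when there is no real effect in any of them."""
    # Each test avoids a false positive with probability (1 - alpha),
    # so all n tests avoid one with probability (1 - alpha) ** n.
    return 1 - (1 - alpha) ** n_tests


for n in (1, 5, 20):
    print(f"{n:>2} tests: {chance_of_false_positive(n):.1%} chance of a false positive")
```

With twenty unrelated comparisons, the chance that at least one of them crosses the threshold by luck alone comes out to roughly 64%, which is why a single "significant" result fished out of many attempts deserves skepticism.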

Probability and statistics are some of the most powerful mathematical tools a person can learn. They are used in nearly every field of study, from business to medicine. For this reason, it is crucial to grasp how the laws of probability can affect the decisions you make, as well as how they can be exploited to spread false science in this content-craving landscape.
