Dichotomistic Bias

Statistician Stephen Senn popularized the term "dichotomania" for the practice of falsely dichotomizing continuous variables. In his 2017 article "Statistical Errors in the Medical Literature", statistician Frank Harrell affirms that the phenomenon exists and outlines several prominent examples in which chopping up continuous variables, in an effort to "simplify" them by making them discrete, has had disastrous effects on statistical analyses. These are not just any papers; they are articles where lives are potentially contingent on the results. It isn't that dichotomies are necessarily bad in themselves; they often help to simplify phenomena and can be useful for decision making. After all, you can't partially operate on a patient.


But imposing a dichotomy is a dangerous business, because when you dichotomize, you necessarily throw away information: you no longer know how far each point is from the threshold, only which side of it each point is on. This type of data loss is called coarsening. In the case of the threshold for surgery, the information one discards is ideally unimportant, but in many contexts this is not the case. One compelling example is the well-known discrepancy in power between the t-test, which one performs on continuous data, and the sign test, which one performs on categorized (coarsened) data. We compare the efficiency of these tests in terms of power, that is, in terms of how many subjects each test needs in order to distinguish two target groups successfully. Relative to the t-test, the efficiency of the sign test is a meagre 64% (asymptotically 2/π, for normally distributed data); coarsening the data results in a loss of 36% of the information, which in some contexts could mean the difference between life and death. Recently, an important result was overlooked in the investigation of potential cures for COVID-19: what turned out to be one of the most effective drugs (remdesivir) was initially dismissed because of a large p-value in an underpowered study (Wang et al., 2020). The researchers didn't consider any options beyond "yes" and "no", which resulted in a sobering number of potentially savable lives lost.


I hold the papers by Senn (2011) and Harrell (2017) in high regard. However, "dichotomania" carries the connotation of a mental illness (mania), which is not only clinically inaccurate ("dichotomania" is not a disease) but also insensitive towards individuals who do experience mania, many of whom may excel at mathematics and statistics (Mildred Boveka & Madeleine Jennings 2021, personal communication). Cognitive scientist and statistician Sander Greenland has identified the phenomenon as a cognitive bias (Greenland 2017); hence, I refer to it as "dichotomistic bias".
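

To make the t-test versus sign test comparison above concrete, here is a minimal simulation sketch in Python. It is only an illustration, not an analysis from any of the cited papers. It assumes paired differences drawn from a normal distribution with unit variance, 30 subjects per study, a true shift of half a standard deviation, and a two-sided 5% significance level. The helper names (simulate_power, t_test_rejects, sign_test_rejects) are hypothetical, and the t critical value is hard-coded for 29 degrees of freedom so that only the standard library is needed.

```python
import math
import random
import statistics

N = 30                 # assumed number of subjects per simulated study
T_CRIT_DF29 = 2.045    # two-sided 5% critical value of Student's t, 29 df

# Asymptotic relative efficiency of the sign test versus the t-test
# for normally distributed data: 2/pi, i.e. roughly 64%.
ARE_SIGN_VS_T = 2 / math.pi


def t_test_rejects(diffs):
    """One-sample t-test of 'mean = 0' on the continuous values.

    Uses the hard-coded critical value, so it is valid only for N = 30.
    """
    n = len(diffs)
    t = statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))
    return abs(t) > T_CRIT_DF29


def sign_test_rejects(diffs, alpha=0.05):
    """Exact two-sided sign test: each value is coarsened to its sign."""
    signs = [d > 0 for d in diffs if d != 0]   # drop exact zeros (ties)
    n, k = len(signs), sum(signs)
    tail = min(k, n - k)
    # Binomial(n, 1/2) probability of a result at least this lopsided.
    p_one_sided = sum(math.comb(n, i) for i in range(tail + 1)) / 2 ** n
    return min(1.0, 2 * p_one_sided) < alpha


def simulate_power(effect=0.5, n_sims=2000, seed=0):
    """Estimate the power of both tests on the same simulated studies."""
    rng = random.Random(seed)
    t_hits = sign_hits = 0
    for _ in range(n_sims):
        diffs = [rng.gauss(effect, 1.0) for _ in range(N)]
        t_hits += t_test_rejects(diffs)
        sign_hits += sign_test_rejects(diffs)
    return t_hits / n_sims, sign_hits / n_sims


if __name__ == "__main__":
    power_t, power_sign = simulate_power()
    print(f"asymptotic efficiency of sign test: {ARE_SIGN_VS_T:.3f}")
    print(f"estimated power, t-test:    {power_t:.2f}")
    print(f"estimated power, sign test: {power_sign:.2f}")
```

Under these assumptions the t-test should reject the null hypothesis noticeably more often than the sign test, even though both tests see exactly the same simulated subjects. The only difference is that the sign test is handed the coarsened data. The 64% figure itself is an asymptotic result for normal data, so a finite simulation like this one illustrates the direction of the gap rather than reproducing that number.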


Dichotomizing necessarily discards information. Whether a piece of information is important, however, depends on the task one is using it to accomplish. For example, data about a patient's gender and sexuality is probably unimportant for the task of removing an ingrown nail, but could be essential for procedures involving sexual and reproductive health. This is why dichotomies themselves aren't necessarily bad: if only 64% of the information is truly important to the decision one is trying to make, then the "loss" of power from using the sign test isn't a true loss at all. We have simply eliminated meaningless noise and simplified our data. In fact, this partitioning of information into "signal" and "noise" for the purposes of real-life decision making is the backbone of statistical practice. The important thing to realize, though, is that information could be "noise" for the purposes of one decision but highly important for the purposes of another; a statistical model built for one purpose may not generalize to other contexts.


I first started to realize how big a problem dichotomistic bias was when I switched careers from cello to statistics: the vast dichotomy commonly imposed between art and math is another result of dichotomistic bias. Mathematics is really just a formalized version of philosophy, a method of communicating ideas about the underlying phenomena of the universe. So, too, is music a language, and using that language, art conveys profound truths about the world. Thus, art and math share a common goal: they are both languages, and they complement each other. All languages are inherently flawed, but when mathematics breaks down, we can turn to art for the answers, and vice versa, piecing together a profile of the beauty of existence bit by bit by independently analyzing its instantiations through different lenses. While absolute, objective truth can never be fully expressed or realized, its description can be approached using different forms of language.