Jan 28 2014

Occam’s Razor vs Hickam’s Dictum

Every year since 1998, Edge magazine has asked a large group of public intellectuals a provocative question and then published their answers. This year the question is: What Scientific Idea is Ready for Retirement?

Gerald Smallberg (Practicing Neurologist, New York City; Playwright, Off-Off Broadway Productions, Charter Members; The Gold Ring) gave as his answer, “The clinician’s law of parsimony.” He writes:

“As an absolute, the Law of Parsimony is floundering. Not because it is aging poorly, but rather because it is being challenged more and more by the complexity of the real world and its need for a valid counterweight. From my vantage point as a physician in the practice of clinical neurology, its usefulness, which has always been a guiding principle for me, can easily lead to blind spots and errors in judgment when rigidly followed.”

He defines Occam’s razor as:

“This law states that the most simple of two competing theories should be the preferred one, and that entities should not be multiplied needlessly.”

The second part of this definition is the more precise, but Dr. Smallberg mainly discusses the first part. I have written about Occam’s razor in the past because it is frequently misrepresented, in a subtle but important manner. I believe that Dr. Smallberg is committing this same error.

Occam’s razor is often presented as the first part of the definition above – when there are multiple possible answers, the simplest should be preferred. However, the direct quote from William of Ockham is this:

“Numquam ponenda est pluralitas sine necessitate [Plurality must never be posited without necessity].”

Dr. Smallberg correctly identifies the flaw in applying the incorrect version of this principle – often, especially in medicine, the simplest answer is not the correct one. In medicine “simplest” is often translated into “fewest number of diagnoses.” He quotes Hickam’s Dictum which states that “a patient can have as many diagnoses as [she] damn well pleases.”

The fix, however, is not in concluding that Occam’s razor has limited utility, but rather in understanding what the principle actually is.

Stated another way – Occam’s razor is the principle that the introduction of new assumptions should be minimized. “New assumptions” should not be conflated with “additional diagnoses.” That is the error.

Dr. Smallberg gives an example case of a woman who has many simultaneous problems, several of which can relate to her chief complaint, difficulty walking. He chose a problem that is often multifactorial, but it serves as a useful example of his point.

Here’s the precise flaw in his argument – as we get older we tend to accumulate diseases and disorders. Further, one disease or condition can predispose to or lead directly to another – diabetes can cause neuropathy which can cause gait difficulty. We can add to this that some conditions are extremely common.

A patient who presents with known diseases, or who is at high risk for certain conditions or for complications of those diseases, is going to have multiple diagnoses. We do not have to introduce any entirely new assumptions. We can extrapolate directly, and with high probability, from what is known. In such a case, therefore, using multiple diagnoses to explain a patient’s presentation does not violate Occam’s razor.

In fact, Occam’s razor may prefer that we explain a patient’s presentation with three or four common or related disorders, rather than one extremely rare disease. The rare disease is introducing a giant new assumption, while the four common conditions to which the patient is at high risk are not really introducing anything new.

What a clinician should not do (and what does violate Occam’s razor) is introduce an entirely new disease or condition just to explain each individual sign or symptom of a patient.

In the end Occam’s razor is all about probability, and experienced clinicians understand and use probability in what is called their “differential diagnosis.” What clinicians do is try to think of every plausible way to explain the patient’s total presentation, and then they rank the possibilities from most likely to least likely. This helps prioritize what tests to order and treatments to give.

In ranking how likely each possibility is, you can imagine adding up all the new assumptions that each diagnosis would represent, both number and magnitude, resulting in a total assumption burden. Therefore, one entirely new and extremely rare diagnosis might render one explanation far less likely than another that uses three new but very probable diagnoses.

The more common a diagnosis, the less of a new assumption burden it represents, and the less it violates the principle of parsimony. Also, diagnoses may exist in a chain of causality: obesity leads to diabetes which can cause neuropathy which may result in gait difficulty. So, even though I may be giving the patient four diagnoses, they all flow from the first, and the total burden of new assumptions is low.
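One way to picture the "total assumption burden" is as a sum of penalties, where each new diagnosis costs more the less probable it is. The sketch below uses the negative log of each probability as that penalty. This scoring choice, the function names, and every number in it are illustrative assumptions, not clinical data or a method described in the Edge essay.

```python
import math

# Hypothetical prior probabilities for one patient, given age and known
# risk factors. These numbers are illustrative assumptions, not clinical data.
explanations = {
    "one rare disease": [1e-5],
    "three common, related conditions": [0.3, 0.4, 0.5],
}

def assumption_burden(probabilities):
    """Sum of -log2(p): an assumption costs more the less probable it is."""
    return sum(-math.log2(p) for p in probabilities)

def joint_probability(probabilities):
    """Probability that all assumptions hold, treating them as independent."""
    return math.prod(probabilities)

# Rank explanations from lowest burden (most likely) to highest.
ranked = sorted(explanations, key=lambda name: assumption_burden(explanations[name]))
for name in ranked:
    probs = explanations[name]
    print(f"{name}: burden = {assumption_burden(probs):.1f} bits, "
          f"joint probability = {joint_probability(probs):.2g}")
```

With these made-up numbers the three common conditions together carry a far smaller burden than the single rare disease. Treating the diagnoses as independent is a simplification: in a causal chain such as obesity, diabetes, neuropathy, each later step is conditional on the earlier one and is more probable still, which lowers the burden further.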

When viewed in this way, as minimizing the total burden of new assumptions (accounting for both number and magnitude), Occam’s razor is a very useful logical tool for the clinician (or any investigator) and does not suffer the limitations imagined by Dr. Smallberg.

56 Responses to “Occam’s Razor vs Hickam’s Dictum”

  1. jasontimothyjones on 28 Jan 2014 at 8:58 am

    misusing Occam’s razor is worse than using an apostrophe out of place

  2. Skeptico on 28 Jan 2014 at 11:09 am

    … or starting a sentence with a lower case letter. :-)

    Great post Steven. I’m often having to explain that Occam’s Razor does not say ‘choose the simplest.’ As I wrote in http://skeptico.blogs.com/skeptico/2007/09/occams-razor.html, if it did, “Goddidit” would be the answer to everything.

  3. 123FakeName on 28 Jan 2014 at 11:34 am

    Good to see this post. People need to stop using the word “simple” in descriptions of Occam’s Razor. It’s always misleading. “Fewest assumptions” should be the phrase that everyone associates with Occam’s Razor, not “simplest”.

  4. Steven Novella on 28 Jan 2014 at 11:38 am

    One formulation is called ontological parsimony: “rule of thumb, which obliges us to favor theories or hypotheses that make the fewest unwarranted, or ad hoc, assumptions about the data from which they are derived.”

    I think that’s a good description – qualifying assumptions as unwarranted or ad hoc. I like my phrase of minimizing the assumption burden – not all assumptions are created equal, larger ones are more of a negative predictor than minor ones.

  5. The Other John Mc on 28 Jan 2014 at 12:16 pm

    There’s another scientist (this time a psychologist) at The Edge railing against the “principle of parsimony”, interesting that this keeps popping up:

    http://www.edge.org/response-detail/25346

    Though this response seems more like a confusion of extreme reductionism with the idea of parsimony.

  6. Bronze Dog on 28 Jan 2014 at 12:30 pm

    I remember a scene from Scrubs which has probably been played out numerous times with medical students. They have a patient with an unusual collection of symptoms. Dr. Cox asks the med students for their opinion. One of them brings up a disease that produces those exact symptoms. Dr. Cox shoots down that suggestion by pointing out that it’s a rare tropical disease that was only seen in a handful of patients. His speech went something like this:

    “What you’ve suggested here is what we call a zebra. As in, ‘when you hear hoofbeats, think horses, not zebras.’ Which is more likely? The patient has an extremely rare tropical disease from a country he’s never visited or that he has a common condition that has manifested with atypical symptoms?”

    I think of Occam’s Razor as something akin to thinking horses first, zebras second, and only considering unicorns if the situation is pretty darn anomalous. In the bleeding edge of science, it’s generally easier to test hypotheses that invoke unicorns or pegasi before you consider the ones that invoke both.

    Just in case someone comes along and misinterprets this metaphor as an easy way to shoehorn in “supernatural” entities, no, that’s not the point. The fantasy equines are a symbolic stand-in for newly posited, unproven entities. I’m a monist, anyway, so I don’t really acknowledge “supernatural” as anything more than a junk drawer category. I can accept it in fictional worlds only because there typically is a meaningful, practical reason to create the category within that world’s laws. Better to know ahead of time how your +3 sword is going to react to an anti-magic zone, for example.

    Silly question: Unicorns have cloven hooves in many depictions, so would that mean they’re a very horse-like member of Bovidae, rather than genuine equines?

  7. Steven Novella on 28 Jan 2014 at 12:37 pm

    Occasionally goats are born with their two horns merged as one. This is probably the source of the legend, and why unicorns are depicted with twisted horns and goatees.

  8. Bruce on 28 Jan 2014 at 1:20 pm

    UNICORNS AREN’T REAL!?!?!?!?!

  9. Bruce on 28 Jan 2014 at 1:21 pm

    I will have you know they are the Scottish national animal!

  10. Bronze Dog on 28 Jan 2014 at 2:50 pm

    Occasionally goats are born with their two horns merged as one. This is probably the source of the legend, and why unicorns are depicted with twisted horns and goatees.

    Neat. I love it when biology tidbits like that explain something so well.

  11. chrisj on 28 Jan 2014 at 3:04 pm

    I like this; it seems like a good way to understand parsimony. However, I am a bit confused by it.

    Steven, you wrote: “In ranking how likely each possibility is, you can imagine adding up all the new assumptions that each diagnosis would represent, both number and magnitude, resulting in a total assumption burden. Therefore, one entirely new and extremely rare diagnosis might render one explanation far less likely than another that uses three new but very probable diagnoses.”

    What do you mean by “assumption” here? If some assumptions are low probability and some are high, then Occam’s Razor ought to be something like “avoid the set of assumptions that have the lowest probability of being true.” As you seem to point out, it might be better to make more assumptions rather than fewer because one big assumption (i.e. having low probability of being true) might have a lower probability of being true. It seems your view is that Occam’s razor shouldn’t be about the number of assumptions, but about their probabilities. Is this right?

    It is NOT what Occam says. Also, William of Occam lived during the 13th and 14th centuries (http://en.wikipedia.org/wiki/William_of_Ockham); we don’t see anything like modern probability theory until the 16th or 17th century (http://en.wikipedia.org/wiki/Probability_theory). This means that Occam couldn’t have understood what he was saying in terms of probabilities.

    Perhaps this is just nitpicking, but it seems we are talking about a theory of parsimony that is different from Occam’s.

  12. Steven Novella on 28 Jan 2014 at 3:55 pm

    Chris – there have been many formulations of Occam’s theory over the years. I agree that more recent formulations tend to have a more probabilistic slant to them.

    As I quoted in the comments above, I like “favor theories or hypotheses that make the fewest unwarranted, or ad hoc, assumptions.”

    “Unwarranted” and “ad hoc” are really ways of saying low probability. A warranted assumption is so by virtue of being higher probability, which in turn can be due to being a known complication, or for which the patient has risk factors, etc. “Ad hoc” means that you have no particular a priori reason to think the assumption might be true; you are just making it up as an explanation.

    While not stated in strictly statistical terms, Occam’s notion has always been about which alternative is more likely (preferred) rather than which has to be strictly true. Why are fewer assumptions better? Because each new assumption is a gamble, and some gambles are bigger than others.

  13. Will Nitschke on 28 Jan 2014 at 4:25 pm

    @Steven Novella

    “When viewed in this way, minimizing the total burden of new assumptions (accounting for both number and magnitude) Occam’s razor is a very useful logical tool for the clinician (or any investigator) and does not suffer the limitations imagined by Dr. Smallberg.”

    As far as I can determine you are either misrepresenting or misunderstanding the point Dr Smallberg is making, which it seems to me is this: it’s all too easy to fall into the trap of trying to fit a complex situation into a simple solution. Although I have scientific and philosophical academic training I work in engineering. (Sorry the money is better.) And over the years I’ve come to assume that where there is a complex problem, there is often more than one problem that requires addressing. This is not something I assumed when I was younger. I would typically try to find “the” problem and “fix” that. Dr Smallberg is making the same point. This has nothing much to do with you providing a mini lecture on why Occam’s Razor meets your approval. It’s to do with the way in which it can be misapplied, probably by those with less clinical, engineering and other forms of professional experience.

    But anyway, thanks for the link. It is an interesting article to read.

  14. Skeptico on 28 Jan 2014 at 4:45 pm

    In my opinion it has nothing to do with probability. If you think it is I’d say, ‘please show me the math.’ Occam’s Razor is used where we usually have no math to go on.

    O.R. just means don’t make stuff up. If you hear hooves, that isn’t evidence for unicorns. We know horses exist and that they make the noise of hooves, so we don’t need to say ‘unicorns!’ Of course it could be unicorns. We don’t know it isn’t. (And we certainly can’t calculate the probability that it might be unicorns.) But the hooves can be explained by something that we know does exist (horses) and so we don’t need to make up something else (unicorns) to explain the noise.

  15. tkitzler on 28 Jan 2014 at 6:05 pm

    Hello Dr Novella,

    I read with interest your article on Occam’s razor and had some thoughts that I would like to share with you. First of all, I listen to your podcasts and I read many of your blogs and really value your logical and sceptical approach. I agree with you that the problem here is the lack of a clear understanding of what Occam’s razor really stands for. I would not go as far as to abandon Occam’s razor, but I believe that we are completely misusing it. I do not believe that it is of any meaningful use in clinical practice, but rather in clinical science.

    I am a medical resident with a science background myself and I realized that I often felt uneasy when colleagues used the Occam’s razor analogy in order to justify their clinical choices. Commonly, they are condensing it to phrases like ‘less is more’ or ‘the simplest explanation is the right one’, etc., turning the principle of parsimony into nothing more than a proverb of the likes of ‘nice guys finish last’. Every educated mind would shy away from using a proverb to guide any professional decision making.

    One should know that the principle of parsimony has been stated in many different ways throughout time. And there is not one formulation of the concept that is more accepted than the others. One very frequently used version is the one linked to Ockham, which states, as mentioned by you, “entities must not be multiplied beyond necessity”. Ockham did not come up with this; he just frequently used it, also in many different varieties. This is the most popular one of his versions. Other common forms are: “It is futile to do with more things that which can be done with fewer.” “We consider it a good principle to explain the phenomena by the simplest hypothesis possible.” and so forth. You mentioned correctly that this is just one aspect of the principle.

    A more comprehensive version of the principle is described in the following: “the explanation of any phenomenon should make as few assumptions as possible, eliminating those that make no difference in the observable predictions of the explanatory hypothesis or theory”

    This is my personal favourite. As one can easily grasp, this is a principle used to improve an explanatory theory by simplifying it, but it says nothing about improving the predictability of a theory. In other words, if you have two predictive models predicting the same outcome, use the one relying on fewer variables (i.e., assumptions). However, I can’t see how someone can use this model as a diagnostic tool. In medicine, two completely different diseases can give the exact same clinical picture. Neither is more valid than the other, independent of complexity or difficulty of diagnosis. Of course we can use probability to choose one over the other; however, the simplicity of our diagnostic approach has nothing to do with this.

    In my opinion, Occam’s razor is misused when it is invoked to justify looking for the simplest explanation of something. E.g., a patient has a cough and runny nose, so he has the common cold. And more often than not we will be right. But this has nothing to do with the fact that we used the simplest explanation. This has to do with probability. The principle of parsimony does not comment on the predictive power of the theory in use. It only comments on how to make your theory less complex, and in the case of two competing theories that predict the same outcome, it states you should go for the simpler one. So this would mean that if you can diagnose a disease as successfully using two tests as using three tests, then go ahead and use only two tests. This is something we learn through clinical trials and it illustrates perfectly where Occam’s razor should be used: as an approach to simplify scientific concepts so we can more easily apply them in practice. But in my understanding it has nothing to do with “which is the more likely diagnosis in the patient that is sitting in front of me?” “The one that comes easier to mind than the other” or “the one that explains all his symptoms the easiest.”

    The principle of parsimony is a logical concept used in problem solving, often for problems of a philosophical or scientific nature. I doubt that it can be used in a meaningful way to assist practical clinical judgements.

    I am looking forward to hearing from you and your thoughts on this.

  16. rezistnzisfutl on 28 Jan 2014 at 10:56 pm

    Will,

    Sorry, but engineering and science are two very different disciplines. Engineers utilize the science that is discovered by scientists. Clinical research, study, and observation utilizing the scientific process is far different from problem solving occupations filled by engineers – the paradigm is completely different. Basic science isn’t so much about problem solving as it is about discerning factual reality based on empirical evidence. If anything, engineering is at most what could be considered “applied science”, not unlike straight medicine, that may, at best, be involved in development, but that is not basic science.

    In that vein, I don’t believe Dr. Novella is suggesting that anyone is trying to fit complex situations into simple solutions, which is again the mindset of an engineer. It seems to me that he’s actually arguing the opposite as far as how it applies to medical research.

    I find it interesting that you seem intent on contradicting everything that’s brought up here. It’s like you’re making it a point specifically to do that, a la trolling, despite what the message or argument is. This is why I suggest people stop feeding you, and which is what I’ll do at this point. I just wanted to clear up the obvious misconception you mentioned regarding engineering as it relates to the sciences.

  17. Dreaded Anomaly on 29 Jan 2014 at 12:23 am

    The math behind Occam’s Razor is both simple and strong, combining information theory and conditional probability. http://en.wikipedia.org/wiki/Minimum_message_length

    “MML naturally and precisely trades model complexity for goodness of fit. A more complicated model takes longer to state (longer first part) but probably fits the data better (shorter second part). So, an MML metric won’t choose a complicated model unless that model pays for itself.”

  18. jt512 on 29 Jan 2014 at 3:17 am

    In my opinion it has nothing to do with probability. If you think it is I’d say, ‘please show me the math.’

    If two models equally explain the data, the simpler one will have the higher posterior probability. As requested, here’s the math: Berger & Jefferys (1992)

  19. Steven Novella on 29 Jan 2014 at 7:12 am

    Thanks for the math links. I would also add that it’s not just about not making stuff up, because often a new element is required to explain the data (i.e. a new diagnosis). It is about minimizing the improbability of the new elements required by minimizing their number and maximizing their prior probability.

    I have found engineering reasoning and clinical reasoning to be very different. I have had a number of engineers as patients over the years and I have seen them struggle to apply their engineering logic to medicine – always with horrible results.

    Dr. Smallberg, like myself, is a practicing neurologist. I am intimately familiar with the point he is making, and why it is, in my opinion, flawed.

  20. tkitzler on 29 Jan 2014 at 8:02 am

    Following my previous post, here is a link illustrating the example I gave on two competing models, one simple, one more complex: http://planetmath.org/occamsrazor

    When we have two explanatory models with the same predictability, choose the simpler one, for the sole sake of simplicity. I am not sure this implies that the simpler one is more likely or more probable. We have a human tendency to be attracted to simple explanations. Karl Popper suggests there is a practical reason for this. Simpler concepts are easier to falsify, which goes well along with his theory of falsification.

    As found on wikipedia on Occam’s razor:
    “…We prefer simpler theories to more complex ones “because their empirical content is greater; and because they are better testable” (Popper 1992). The idea here is that a simple theory applies to more cases than a more complex one, and is thus more easily falsifiable. This is again comparing a simple theory to a more complex theory where both explain the data equally well…”

  21. BillyJoe7 on 29 Jan 2014 at 8:54 am

    The number of people who misunderstand Ockham’s Razor pales into insignificance beside those who misspell it. |:

  22. BillyJoe7 on 29 Jan 2014 at 8:58 am

    Oh and…

    How many “nitschkes” are there between Steven Novella’s understanding of Dr. Smallberg’s point and the actual point he is making?

  23. ccbowers on 29 Jan 2014 at 10:14 am

    “The number of people who misunderstand Ockham’s Razor pales into insignificance beside those who misspell it. |:”

    Steven spelled “William of Ockham” correctly and refers to “Occam’s razor,” which is also correct. You can call it “Ockham’s razor” if you want, but the spelling “Occam’s razor” has been around as long as the term has existed, which is a result of the re-Latinization of the English language after William of Ockham’s life. Many spellings were changed as a result, and this is an example.

    “How many “nitschkes” are there between Steven Novella’s understanding of Dr. Smallberg’s point and the actual point he is making?”

    I believe it is an infinite number of nitschkes between any 2 understandings, even identical ones.

  24. Will Nitschke on 29 Jan 2014 at 4:21 pm

    @Steven Novella

    “I have found engineering reasoning and clinical reasoning to be very different. I have had a number of engineers as patients over the years and I have seen them struggle to apply their engineering logic to medicine – always with horrible results.”

    Sorry, Steve what exactly is “engineering reasoning” ? Didn’t you just make that up?

    I wouldn’t apply my problem solving strategy in my field of engineering to another field of engineering, because one has to be intimately familiar with the technical details of each field. (Although there is a vast element of snobbery in such dismissals. I have read lots of interesting articles where academics with statistical, engineering and other expertise have aided medical practitioners in their diagnostic procedures.)

    Of course all of this is ducking the point that Dr. Smallberg was not under a misapprehension as you asserted. Dr. Smallberg’s point was that medical science has now advanced to the point where injudiciously applying Ockham’s razor is more likely to do more harm than good. That argument may or may not be correct. However, your claim that Dr. Smallberg misunderstands Ockham’s Razor most certainly is.

    I suspect this defensiveness fits into the narrative that amateur skeptics have “discovered” guiding principles that allow them to separate truth from nonsense. Of course these principles, or more correctly, rules of thumb, are just as likely to be misused as used correctly. I suspect this is the real point of your objection. If this were so, amateur skeptics would therefore not have the “special power” to separate truth from nonsense as they claim to possess.

  25. ccbowers on 29 Jan 2014 at 4:58 pm

    “Dr. Smallberg’s point was that medical science has now advanced to the point where injudiciously applying Ockham’s razor is more likely to do more harm than good. That argument may or may not be correct. However, your claim that Dr. Smallberg misunderstands Ockham’s Razor most certainly is.”

    If you are “injudiciously” applying parsimony, you are not applying Occam’s razor, so Smallberg’s point criticizes a strawman of the concept. Steve’s point is spot on, but yet again you object. Smallberg’s discussion reads like someone who misunderstood a concept his whole life, now realizes that there is a problem, yet erroneously believes he has uncovered a flaw in the concept. In actuality this flaw was just his (and others’) misunderstanding the entire time.

  26. Davdoodles on 29 Jan 2014 at 7:57 pm

    “If you are “injudiciously” applying parsimony, you are not applying Occam’s razor, so Smallberg’s point criticizes a strawman of the concept.”

    Exactly. This is similar to the arguments being put by Klein and Ornish in support of their view that evidence based medicine should be abandoned (in favour of course of some dodgy woo), because if done poorly, EBM may not achieve reliable, replicable outcomes.

    If you do a bad RCT, it is not good. Therefore we should not do any RCTs, and just buy woo-water on faith.

    If you misapprehend Occam’s Razor, something something, ergo buy my woo berries.

    http://www.sciencebasedmedicine.org/fighting-against-evidence/

  27. BillyJoe7 on 30 Jan 2014 at 7:14 am

    Well, there is absolutely no doubt that Gerald Smallberg misunderstands Ockham’s Razor. I am willing to accept that what he says about medicine is correct, but what he says about medicine has no point of contact with Ockham’s Razor.

    But I was amused by this…

    “…the most simple of two competing theories…”

    He would have been correcter if he had said “more simple” and correctest if he had said “simpler”.
    How embarrassing for him.

  28. BillyJoe7 on 30 Jan 2014 at 7:25 am

    Mary, Mary, quite contrary…

    Will Nitschke accepts that problem solving strategy in his field of engineering is different from problem solving strategy in other fields of engineering, but he is unwilling to accept Steven Novella’s observation that problem solving strategy in medicine is different from problem solving strategy in engineering.

    How many Nitschkes between Ockham’s Razor and Will’s throat.

  29. Steven Novella on 30 Jan 2014 at 7:25 am

    Will wrote: “Sorry, Steve what exactly is “engineering reasoning” ? Didn’t you just make that up?
    I wouldn’t apply my problem solving strategy in my field of engineering…”

    Do you even read what you write? I am talking about problem solving strategies used in engineering. I had one family member of a patient who explicitly was trying to apply engineering problem solving to the medical problem of his family member. He was trying to use the intellectual tools he had at hand. However, they simply did not apply. I had to explain to him clinical problem solving and how it was different – he wanted to “test the system” without considering risk vs benefit, for example.

    Simply restating your premise is also not an argument. You have been thoroughly refuted by multiple commenters.

    Dr. Smallberg was using “simplest” as an operational definition: applied to medicine, “simplest” means “fewest diagnoses.”

    But it is clear that Occam’s razor refers to fewest new assumptions, not simplest. Later formulations clarified this with “ad hoc” or “unwarranted” assumptions, which I restated as total assumption burden.

    You have done nothing to address this, simply restated your position, and then launched into speculations about the motivation of “skeptics” in order to make yourself feel superior. In fact, you do that quite a bit – make up self-serving assumptions in order to put forward ad hominem arguments, while blithely ignoring the utter destruction of your position.

  30. BillyJoe7 on 30 Jan 2014 at 7:46 am

    “How many Nitschkes between Ockham’s Razor and Will’s throat”

    About the same as the distance between the last two posts (:

  31. brive1987 on 30 Jan 2014 at 7:48 am

    Thanks for the article. After watching Rebecca describe the Edge exercise as “dumb” “boring” and “pretentious” in her pre SGU patreon video I was going to skip it. But there are actually some interesting points raised here.

  32. MrWindUpBird on 30 Jan 2014 at 9:27 am

    Nice post. I must say that when I was a resident presenting cases on ward rounds, Occam’s razor would come up occasionally and it would irritate me a lot. For example, when we were discussing a patient who had presented to us with paralysis in both his legs, abdominal pain, and vomiting of blood, my attendings insisted that my first diagnosis (using Occam’s razor) must be a gastric cancer with the paralysis as a paraneoplastic syndrome (that is, something caused indirectly by the cancer), while I thought he had two different causes for his different symptoms: his paralysis caused by hereditary spastic paraplegia and blood vomiting caused by gastritis due to excessive analgesic use for leg pain (a pretty common complication of that treatment). It seemed that for them using Occam’s razor meant fitting all the symptoms and signs in a patient into a single diagnosis (or the least number of diagnoses), but many times it is much more probable that a patient will have 2 or 3 or more common diseases occurring together rather than a single rare one. This seems to be the most common misuse of Occam’s razor I have encountered. It occurs because people forget the caveat “without necessity”: in medicine, many times the probability of two common diseases occurring together is much higher than that of a single rare one, and so it becomes necessary to consider two diagnoses as more likely than one.

  33. Steven Novella on 30 Jan 2014 at 10:56 am

    tkitzler – I disagree with your assessment in a couple ways.

    First, Occam’s razor does apply to probability in that – the probability of A+B has to be lower than the probability of A (assuming both are otherwise equivalent). That is a logical necessity. If B is unnecessary adding it in just lowers the probability.

    Yes, there are multiple formulations of the principle of parsimony. However, that does not mean they are all equal. Using the “simplest” formulation, first of all is not “the” formulation of Occam’s razor and certainly is not the one that philosophers generally prefer or that is applicable to clinical decision-making, where its limitation become glaring.

    Using the formulation that focuses on avoiding unnecessary or ad hoc assumptions is both logically valid and is perfectly applicable to medical decision-making. WindUpBird gives a perfect example of the difference between the two. Paraneoplastic processes are very rare, so adding that as a new diagnosis would be a huge assumption burden.

    You are correct, however, in that the rule of parsimony is only one of many rules that are simultaneously applied. We must consider the base rate of each potential diagnosis, the specific risk of the patient, how predictive their symptoms are of each entity, and the risk/benefit of working up and treating each potential entity. This all gets filed under analytical reasoning, but experienced clinicians also use experiential reasoning, which is essentially pattern recognition from prior familiarity.

    The most valid formulations of Occam’s razor are a useful addition to this complex reasoning process. Trying to apply an invalid formulation (“simplest”) is what leads to problems.

  34. Skeptico on 30 Jan 2014 at 11:04 am

    Dreaded Anomaly, jt512:

    I hear hooves. Please show me the math that says unicorns have a lower probability than horses.

  35. jt512 on 30 Jan 2014 at 2:44 pm

    Skeptico,

    The math you are seeking falls under the heading of Bayesian inference. There is a wide literature on the subject. And if you have to ask whether the sound of hooves is less probable to be from a unicorn than a horse, then you urgently need to read it.
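
As a rough sketch of the Bayesian reasoning referred to here: posterior odds equal prior odds times the likelihood ratio. All numbers below are hypothetical placeholders, not measured values. The sound of hooves is assumed to be equally likely from either animal, so the evidence does not shift the odds and the prior does all the work.

```python
def posterior_odds(prior_odds, likelihood_ratio):
    """Bayes' rule in odds form: posterior odds = prior odds * likelihood ratio."""
    return prior_odds * likelihood_ratio


# Hypothetical priors, for illustration only.
prior_horse = 0.999999
prior_unicorn = 0.000001

# Assumed: hoofbeats are equally probable from a horse or a unicorn.
p_hooves_given_horse = 0.9
p_hooves_given_unicorn = 0.9

prior_odds = prior_horse / prior_unicorn
likelihood_ratio = p_hooves_given_horse / p_hooves_given_unicorn
odds_horse_vs_unicorn = posterior_odds(prior_odds, likelihood_ratio)

print(f"posterior odds, horse : unicorn = {odds_horse_vs_unicorn:,.0f} : 1")
```

Under these assumptions the hoofbeats leave the odds exactly where the priors put them, heavily in favour of the horse.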

  36. Will Nitschke on 31 Jan 2014 at 5:56 am

    @Steven Novella

    “Do you even read what you write? I am talking about problem solving strategies used in engineering. I had one family member of a patient who explicitly was trying to apply engineering problem solving to the medical problem of his family member. He was trying to use the intellectual tools he had at hand. However, they simply did not apply. I had to explain to him clinical problem solving and how it was different – he wanted to “test the system” without considering risk vs benefit, for example.”

    Since you are writing to an engineer and want to discuss engineering, I still don’t have the faintest idea what you think you are talking about. Logical reasoning skills don’t change regardless of whether you are a scientist, engineer or medical practitioner. Risk versus benefit is fundamental to engineering problem solving as well, of course. I already explained that the fundamental differences are based on experience with and knowledge of a particular problem domain. Logical reasoning itself does not change. But since I already mentioned that to you, I wonder if you are reading what I wrote?

    “Simply restating your premise is also not an argument. You have been thoroughly refuted by multiple commenters.”

    I’m not interested in reading the comments of your acolytes. I’ve read some of the comments in the past and most of them are hopelessly muddled, confused, state conclusions instead of present arguments, misdirect, misunderstand, etc. Your audience seems to me to act largely as gate keepers of criticism of you. (Although oddly, for someone who criticises something, someone or some group, in nearly every one of your articles, it seems strange to me on how highly sensitive you and your followers are to criticism.) Anyway, I apologise to those who contribute here who are intelligent and thoughtful. But I can’t wade through volumes of nonsense in order to find them. If you’re not interested in moderating your readership for such basic things as not engaging in personal attacks on others, then don’t expect me to read them. I don’t mind if you don’t have the time or interest to hold such individuals to basic standards. That’s your call and at the end of the day, it only reflects badly on yourself. You may of course, not be worried about that sort of thing.

    “Dr. Smallberg was using “simplest” as an operational definition, applied to medicine “simplest” means “fewest diagnoses.”
    But it is clear that Occam’s razor refers to fewest new assumptions, not simplest. Later formulations clarified this with “ad hoc” or “unwarranted” assumptions, which I restated as total assumption burden. You have done nothing to address this, simply restated your position….”

    I explained clearly where you misunderstood what Dr Smallberg was saying. There is a second point, which is whether what you are saying actually makes any sort of sense, which is not what I have discussed at all. Although this is what you have now attacked me on. Clearly the Copernican system was preferred to the Ptolemaic system because it was simpler, *not* because it made “fewer” assumptions. The core assumption in both systems was that planets moved in circular motions. (Which turned out to be wrong, thanks to discoveries made by Kepler but that is another story entirely.) One can invoke Occam to prefer Copernicus over Ptolemy so your argument (at least as stated above) is trivial to refute.

    “and then launched into speculations about the motivation of “skeptics” in order to make yourself feel superior. In fact, you do that quite a bit – make up self-serving assumptions in order to put forward ad hominem arguments, while blithely ignoring the utter destruction of your position.”

    Let me put my cards on the table. I’m of the view that activist groups cannot be self critical because they protect their core belief systems at all cost. You can’t be a skeptic and be part of a skeptical activist group. It’s an oxymoron. It’s an observation that is obviously going to upset you. But it’s not my job to please you. That’s the purpose your cheer squad serves.

  37. Will Nitschke on 31 Jan 2014 at 6:30 am

    @Steven Novella

    I’m noting the following in a separate response because I didn’t want to get it side stepped.

    Whether your distinction between ‘simplest’ and ‘fewest assumptions’ has any practical utility in terms of applying a heuristic such as Occam’s Razor is a separate issue. I’m skeptical that what you are offering is of any sort of practical use or even if the distinctions you are making can be sensibly divided. However, that is not a point I have discussed in my criticism of you.

    My criticism of you was limited to your ‘interpretation’ of Dr Smallberg’s argument. If your argument is that the problem with Occam’s Razor is that it is being misunderstood, and Dr Smallberg’s argument is that it should be tossed aside because it more often than not leads to a wrong diagnosis, then at the end of the day it seems to me that in terms of practical outcomes, you are splitting hairs.

  38. Bruce on 31 Jan 2014 at 6:38 am

    “I’m of the view that activist groups cannot be self critical because they protect their core belief systems at all cost.”

    This itself is proof enough that you don’t understand the core basics of skepticism and science. Self regulation and evaluation of our own core beliefs are what we are constantly doing. You obviously have not done any back reading or looked at articles here and on other skeptical blogs that specifically look at not only human failings, from perception to memory, but also the skeptic movement’s own internal controversies and even how scientific debates happen.

    I am a student of formal logic, and through understanding its strengths, I also understand where it has its failings and where it is not relevant or helpful to cast an issue in black or white. Life is nuanced, because we as humans are not perfect.

    If I were you I would take the time to read the comments some here have made about you. How about you turn your critical lens on yourself instead of trying to find fault in everyone else? You have yet to make one positive comment in this blog (that I have seen), and while I am open to criticism as much as I am sure everyone else on this blog is, when you wade in here and don’t give any credit where it is due we become suspicious of motives and intellectual credibility. Even if it were the case that every commenter here and Steven Novella himself were always wrong on every single topic, I would ask how coming in here guns blazing, shooting down every single person, making snide comments about our intelligence and being a pompous superior ass is going to help in any way. Not to mention why you are wasting your time here with us who are so obviously deluded and misguided.

    It has been said before, start your own blog, put the effort in yourself to make a difference and perhaps you will get some credibility beyond a very obvious self important troll.

  39. BillyJoe7 on 31 Jan 2014 at 7:37 am

    “If you’re not interested in moderating your readership…”

    Be careful what you ask for.
    You just might find yourself at the boot end of that moderation.
    As a test, I suggest you try posting at Jerry Coyne’s blog “Why Evolution is True”. That is heavily and openly moderated. You are likely to find your first post being your last, with no come back at all to the criticism he levels at it.
    At least that’s what’s been on my mind as I’ve read your succession of blathering posts.
    What would Jerry do?

  40. Steven Novella on 31 Jan 2014 at 7:41 am

    Will – I would like to see you support your interpretation of Smallberg’s position from the actual text (that is the accepted standard).

    That is what I did – I quoted his key paragraph, where he talks about “the Law of Parsimony is floundering” because “it is being challenged more and more by the complexity of the real world…” He never talks about it being misunderstood or misused, just overused. He does not want to eliminate it, but he wants to pull it back because it is flawed and incomplete.

    I then quoted (and added further quotes in the comments) sources describing a different formulation of Occam’s razor which does not suffer from the flaws and limitations that Smallberg claims.

    You, on the other hand, are just making stuff up. You also admit why – your narrative is that, “activist groups cannot be self critical because they protect their core belief systems at all cost.” It is evident to everyone here that you are shoehorning into your preferred narrative whatever I write. You will find fault with it, even if you have to make it up, in order to make your a priori case.

    Regarding the commenters here – as a reader of many blogs, in my opinion the quality of discourse in the comments here is far above average. There is the occasional snark, but it’s not practical to ban all snark from the comments section. I do give the occasional warning and ban the occasional troll. I am actually quite proud of my regular commenters.

    Of course, you will immediately jump to the conclusion that this is because they are sycophants – because that is your preferred narrative. If you bothered to read the comments, which you admit you do not, you will find they frequently challenge me and each other. It’s clear that you just don’t like the criticism they have aimed at you. Instead of you looking inward to see if there is any flaw in your process, or genuinely engaging, you blithely blame the environment and make up whatever assumptions you need to feed your comforting narrative.

    The problem with your narrative, as Bruce has already pointed out, is that self-criticism is a core belief. We keep each other honest in this regard. That is actually what you are experiencing here.

  41. ccbowers on 31 Jan 2014 at 10:31 am

    “I’m of the view that activist groups cannot be self critical because they protect their core belief systems at all cost. You can’t be a skeptic and be part of a skeptical activist group. It’s an oxymoron.”

    It is a challenge for many activist groups to be self critical, but self criticism is an important aspect of skepticism itself. It is not about an advocacy to a certain unchanging end result, but it is about following a process (if done properly) in which self criticism is very important.

    “it seems strange to me on how highly sensitive you and your followers are to criticism”

    It is not the criticism that is the problem, it is the use of bad arguments for unjustified criticism. If you actually had some constructive justified criticism, you wouldn’t get the type of pushback you get (this does happen quite often). The snark here is pretty tame, and is usually in context. Have you seen the rest of the internet? This blog is very friendly by blog standards. Once again you have an unjustified criticism.

  42. Bronze Dog on 31 Jan 2014 at 2:51 pm

    This notion of activist groups being incapable of self-reflection reeks of what I call apathism. It’s certainly true that a lot of activist groups reach irrational extremes, but that’s not unexpected, given human nature and its irrationalities. Individuals are just as capable of the same irrational behaviors as groups, though some tactics work a little differently at different levels. Just like there are many activist groups with irrational positions, there are many individuals with irrational positions. You can’t avoid irrationality by scapegoating the notion of people working together.

    It seems to me that apathism serves to let an “independent” person stroke their ego by pretending that they’ve risen above an issue. They often think that rationality means being without emotion, so they think they’re inherently superior to anyone who gives a care, or (gasp!) acts to bring about change. I think a lot of people also do it because of some twisted hipster appeal, choosing something other than the mainstream for the sake of being different, rather than being right.

    Groups that consciously avoid such extremes usually have a hard time getting attention because being reasonable is considered boring. It involves explanations of a finely nuanced position and the science and logic supporting it, rather than shouting convenient soundbites. When they do get attention, they’re typically subjected to ratings-friendly spin from media that tries to cast them as exactly as extremist as their opposition for the false idol of “balance”, or they’re straw manned by opposing groups to drown out their actual position and maintain a narrative.

    Being skeptics means that we try to counteract all our psychological failings as humans, both on the individual and group level. We argue on all scales, from individuals to entire cultures. We have arguments online because we know there might be a counterargument we haven’t heard yet, or a fallacy in our chain of logic that we didn’t notice. Accordingly, most skeptical sites I’ve been to are loosely moderated and far from ban-happy.

    Another aspect I think I see in apathism is an aversion to any sort of conflict. It’s particularly prevalent in newage spiritualists who try to claim a middle ground between organized religion and the scientific mindset. A lot of newagers I’ve met seem to think life should be an extended brainstorming session where ideas are generated and never discarded. They seem to think we should never move on to filter out good ideas from bad ones because it might hurt their feelings if their favorite ideas get thrown out.

    “The snark here is pretty tame, and is usually in context. Have you seen the rest of the internet? This blog is very friendly by blog standards. Once again you have an unjustified criticism.”

    For one extreme example of contrast, my brother hates the Call of Duty game series in part because of the community that plays it. He got sick of being called a f****t n****r by teenagers and frat boys whenever he got a kill. In Borderlands 2, he got to play with some very polite kids from the UK who are now on his friend list.

    Usually, when someone complains about tone in a blog as gentle as this one, they’re being oblivious to their own viciousness and prejudice. Whether it’s out of deceit or genuine naivete is a toss-up.

  43. Will Nitschke on 31 Jan 2014 at 7:46 pm

    @Steven Novella

    “That is what I did – I quoted his key paragraph, where he talks about “the Law of Parsimony is floundering” because “it is being challenged more and more by the complexity of the real world…” He never talks about it being misunderstood or misused, just overused. He does not want to eliminate it, but he wants to pull it back because it is flawed and incomplete.”

    Let me give you another example, using software engineering. A network database is being corrupted intermittently. The engineer identifies the particular protocol that has been incorrectly configured. The change to the protocol is entirely consistent with all ‘symptoms’ of the issue. Using any version of Occam’s Razor you wish, the conventional one or your version, the matter is resolved. However, an experienced engineer would not leave the matter to rest. He would continue to check for issues with network interface drivers, local redirector caching and so on. The experienced engineer may actually identify several potential overlapping issues and will address them all immediately. In this case, Occam’s Razor is utterly useless. That is the gist of Dr Smallberg’s point. Your attempt to ‘save’ Occam’s Razor just doesn’t work. That’s not to say it is a useless principle, but that if used injudiciously, it may do more harm than good. This is why Dr Smallberg wrote:

    ” From my vantage point as a physician in the practice of clinical neurology, its usefulness, which has always been a guiding principle for me, can easily lead to blind spots and errors in judgment when rigidly followed.”

    The basic problem with this sort of indulgent amateur epistemological philosophising is that it always looks impressively convincing post hoc. Whether such rules of thumb (or whatever you want to call them) are actually of much practical use in difficult problem solving, is an entirely different matter.

    Regarding your readership. I can hardly be bothered getting into a debate on that. I read 10 comments or so to one of my earlier posts and they were all low grade muddled rubbish. That doesn’t rule out that you have intelligent people posting comments to your articles. I already said as much. In fact it would be improbable if you didn’t. But at the end of the day I have to do a cost benefit analysis on how to best spend my time. If you’re proud of some of your readership that’s great. But perhaps it would be healthier to be embarrassed by the rest of them.

  44. Steven Novella on 31 Jan 2014 at 10:52 pm

    You still have said nothing to counter my position, which I clearly stated. No one is saying that Occam’s razor should be rigidly applied, or that it is the only guiding principle. As such Dr. Smallberg’s point is a straw man. It is one rule of thumb among many. Further, it is used in medicine (as I indicated explicitly) not to derive the one correct answer but to prioritize the differential diagnosis. Your example above is consistent with this process, and so is not even a counter example.

    My point remains – confusing “fewest unwarranted assumptions” with “simplest” is the problem, not Occam’s razor itself. This does not mean that there aren’t specific cases where the simplest is also the fewest assumptions, but that is irrelevant. I gave a specific example when the two formulations lead to very different outcomes, one useful and one misleading.

    You should be careful before throwing around accusations of arguments being “muddled rubbish.”

  45. BillyJoe7 on 01 Feb 2014 at 1:37 am

    Will Nitschke,

    “at the end of the day I have to do a cost benefit analysis on how to best spend my time”

    Not everything is a cost/benefit analysis.
    Otherwise who would ever read more than one of your posts before putting you on ignore.
    Sometimes you do things purely for amusement.
    Like reading a few more of your posts.

    I don’t mind the occasional laugh.
    And what better than someone with an overinflated self opinion making a fool of himself.
    I can’t wait for the next instalment.

    In the mean time I’m counting lines.
    Why am I counting lines?

    Because I want to get down to zero.

    .

  46. Bruce on 01 Feb 2014 at 6:31 am

    “I read 10 comments or so to one of my earlier posts and they were all low grade muddled rubbish.”

    And a good day to you.

    Not going to waste one more second reading anything you write as you have absolutely no credibility in my eyes now. Not often that happens, even the most deluded commenter here interacts with us minions, if you can’t be bothered, then I don’t see how you can think we will take you seriously at all.

  47. sonic on 01 Feb 2014 at 10:20 am

    Will Nitschke-
    Dr. N. is correct about the Razor.
    Dr. S. (and it appears you too) are making a common error.

    Take another look at what Dr. N. is saying about the proper use and interpretation.
    It is clear and correct.

    my two cents if they are worth anything to you.

  48. tkitzler on 01 Feb 2014 at 5:18 pm

    Hello Dr Novella,

    Thank you very much for your response. I do believe I can follow your logic and I am not in disagreement with your argument. The point I was trying to make was that I believe Occam’s razor is often used for the wrong aspect of the problem solving process. I am not saying it is impossible to apply it the way you describe it, I am just wondering whether it is the ideal way to make use of its logic. In the setting of comparing the probability of different diagnoses, in my opinion, its logic cannot be applied consistently, since too many other issues factor in, as also mentioned by you. And many people have the tendency to use the ‘shortsighted’ and confusing version of this principle, making its clinical utility questionable at least. I came across this treatise pointing out some of the logical problems when using Occam’s razor and found it a very interesting and comprehensive read (http://logictutorial.com/occam.html).

    It seems to me that a part of the disagreement stems from whether Occam’s razor, as a logical rule, offers any predictive value. When scanning the literature, this argument seems far from settled and has occupied some of the greatest minds out there, without having arrived at a final all satisfying answer. So it is probably worth continuing this debate.

    Whether the law of parsimony deals with probability or not has, in my opinion, to do with how it is applied and how someone sets up the problem at hand. I will try to illustrate this by using your example and how I understand you make use of it (Example 1) and then will compare it with how I would use it (Example 2). Please let me know if I misunderstood the way you see it, and I apologize in advance in case I misrepresented your thoughts. Also, I think both examples are valid; to me, they illustrate that Occam’s razor can be used for different aspects of a problem. Importantly, my argument is that the practicability of the law of parsimony is better illustrated in example 2 since I believe that this setup corresponds more to its original intention, as it was mainly a tool for scientific and philosophical reasoning and not for decision making in clinical practice.

    Example 1:

    The problem: patient presents with a set of signs and symptoms (S)

    To solve the problem the clinician makes assumptions (each assumption is represented by a letter A, B, C, D, etc.). Here I am not entirely sure what you mean by assumptions. I believe you mean diagnoses or disease. Part of the general misunderstanding here may stem from the fact that it is not entirely clear from our discussion which variables constitute our equation. Assumption is a very vague term. However, in this setting I will use it so that each letter in the equation represents the assumption that the patient has a given disease explaining part of his signs and symptoms on the other side of the equation.

    So in this setup A + B + C + D = S is less likely than A + B + C = S, which is again less likely than A + B = S, since each additional assumption increases improbability.
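
A minimal numerical sketch of this setup, with invented probabilities for each assumption and the simplifying premise that the assumptions are independent (neither of which comes from the discussion above):

```python
from itertools import accumulate
from operator import mul

# Hypothetical probability that each individual assumption is true.
assumption_probs = {"A": 0.9, "B": 0.8, "C": 0.7, "D": 0.6}

# Running labels: "A", "A + B", "A + B + C", ...
labels = list(accumulate(assumption_probs.keys(), lambda left, right: f"{left} + {right}"))

# Running joint probabilities under the independence premise.
joint_probs = list(accumulate(assumption_probs.values(), mul))

for label, probability in zip(labels, joint_probs):
    print(f"P({label}) = {probability:.4f}")
```

Each added assumption multiplies in a factor no greater than 1, so the joint probability can only stay the same or shrink as the list grows.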

    This setup works well on paper, however, in reality the physician’s need to account for all the patient’s signs and symptoms is complicated by the fact that they may be unrelated to the primary disorder. Also, two diseases can have the same set of signs and symptoms; both diseases can be equally likely, but how do I choose which of them to eliminate from my equation?

    Again, this is how I understood your approach. Maybe I got it wrong, and if so, I apologize in advance; it was not intentional.

    Now to how I see it. I will assign the variables differently. You will see that I use Occam’s razor for a completely different aspect.

    Example 2:

    The problem the physician faces is to successfully diagnose the disease the patient has. To do so the approach is not only to think of a disease that can explain all his symptoms, since this can be very tricky. The ideal approach is a diagnostic algorithm that gets the physician as quickly as possible to the point where he can make his diagnosis confidently. The signs and symptoms make him think of different algorithms that are available to him.

    The problem: Patient has a disease (X) that the physician would like to diagnose.

    As mentioned, the patient’s signs and symptoms give the physician an idea of what type of disease he may think of and which tests he would ask for in order to make the diagnosis. Also, patient’s age group, origin, social behaviour, family history, etc. make some diseases less and others more likely. No need to use Occam’s razor at this point.

    Now to set up my equation. In order to arrive at a diagnosis, I will make the patient undergo a number of tests. Each letter in my equation represents one clinical test (A, B, C, D, etc). Notably, we are not dealing with a probability determined by the number of variables on one side of my equation. The probability here is dependent on whether I chose the right approach to diagnose his disease, which is dependent on the factors named above.

    So in this setting, it may be that in order to diagnose a specific disease (X1) it may take three tests to arrive at the diagnosis. A1 + B1 + C1 = X1. For a different disease X2 it takes only two tests A2 + B2 = X2. However, which disease is the more likely has nothing to do with how many tests I need to diagnose/explain his condition. It has to do with how probable/frequent each disease is in this specific patient cohort. Often, doctors will order the tests for both diseases if they are equally likely. So to arrive at a diagnosis a doctor may have done the following: (A1 + B1 + C1) + (A2 + B2). Also, depending on their results they will adjust their decision making process. This is a very dynamic process that requires constant adaptation. And I agree, that some doctors may use reasoning that may sound like they are applying Occam’s razor to guide themselves. However, I don’t see it strictly applied in reality, because often we will also test for the improbable disease just to make sure we didn’t miss the rare occasion that we do not want to miss in real life. That’s why I am arguing the law of parsimony has little practical value in a diagnostic approach in real life settings.

    So where to use it. As I mentioned in an earlier post, I believe its value lies in clinical science, and how to translate scientific results to clinical applicability. Let’s say it took four tests (A1, B1, C1, and D1) in the past to make a diagnosis of disease X1. Because of medical advances and clinical studies researchers could demonstrate that we can correctly diagnose as many patients with a clinical algorithm making use of only 2 tests, one of the old tests in combination with a newly developed one (A1 and E1). So, A1 + B1 + C1 + D1 = X1 and A1 + E1 = X1 are both equally probable and have the same predictive power; however, for practical and economic reasons it makes sense to apply the law of parsimony and use the second algorithm in place of the first one. One needs fewer variables to correctly diagnose/explain the same condition. This will require fewer resources, less time, etc. The advantages drawn from the law of parsimony in this scenario seem obvious to me.

    I hope I was able to express myself well. Please let me know if you think my logic is flawed. I find this a very interesting topic. I enjoy reading your blog. Thank you very much for your time.

    Best wishes,
    Thomas

  49. Will Nitschke on 01 Feb 2014 at 8:17 pm

    @Steven Novella

    “You still have said nothing to counter my position, which I clearly stated. No one is saying that Occam’s razor should be rigidly applied, or that it is the only guiding principle. As such Dr. Smallberg’s point is a straw man. It is one rule of thumb among many. Further, it is used in medicine (as I indicated explicitly) not to derive the one correct answer but to prioritize the differential diagnosis. Your example above is consistent with this process, and so is not even a counter example.”

    You actually have no position left, that I can find. Let me summarise my arguments as briefly as I can.

    The Ptolemaic versus Copernican example is a textbook application of Occam’s Razor in the academic literature. Your definition of Occam’s Razor cannot be applied to that example, which implies that Occam’s Razor cannot be how you have narrowly defined it.

    My second engineering example demonstrates how the application of Occam’s Razor fails, whether we use your narrower definition or whether it’s subsumed under the more general description of ‘simpler is better’.

    “My point remains – confusing “fewest unwarranted assumptions” with “simplest” is the problem, not Occam’s razor itself.”

    To repeat, in my first example I showed that there is no confusion over “simplest” versus “fewest assumptions” as you claimed. (For the sake of the argument I am assuming there is actually a valid distinction between the two.) BOTH clearly apply to all accepted definitions of Occam’s Razor. In my second example, I showed how your definition does not ‘save’ Occam’s Razor.

    “This does not mean that there aren’t specific cases where the simplest is also the fewest assumptions, but that is irrelevant. I gave a specific example when the two formulations lead to very different outcomes, one useful and one misleading.”

    For every example you offer of where you think Occam’s Razor is useful post hoc, I can offer a counter example demonstrating where it is obstructive. Your problem is, I only need to offer one counter example to your claim in order to disprove your claim. Even if you shift your position and admit that Occam’s Razor will sometimes fail, then there is almost nothing to distinguish your position from Dr Smallberg’s. He asserts its failure is harmful, and you are asserting its intermittent failure is not. (Which my second example demonstrates is not correct.)

    Dr Smallberg makes the general point that Occam’s Razor does more harm than good. This conclusion is based on his anecdotal experience. But you cannot claim he is in error by using a more narrow definition of Occam’s Razor, because (a) this does not disprove he is correct in terms of the more generally accepted definition, and (b) your narrower definition does not encompass the accepted definition of Occam’s Razor anyway, and (c), even if we apply your narrow definition, it is trivial to demonstrate it failing in any number of problem solving cases. This is why Dr Smallberg may or may not be correct, but you most definitely are not correct.

  50. jt512 on 01 Feb 2014 at 9:49 pm

    @tkitzler:

    “Following my previous post, here is a link illustrating the example I gave on two competing models, one simple, one more complex: http://planetmath.org/occamsrazor

    When we have two explanatory models with the same predictability, choose the simpler one. For the sole sake of simplicity. I am not sure this implies that the simpler one is more likely or more probable.”

    If two models fit the data equally well, the simpler model will, indeed, tend to have the higher posterior probability. Unfortunately, the planetmath.org article you linked to fails to demonstrate the point, though it could have, as follows. The article shows a data set of 20 observations of an independent variable, X, and a dependent variable, Y, and the results of two models: a simple, linear model and a more complex, quadratic model. Not shown is the key statistic, R^2, which shows that the quadratic model fits slightly better than the linear model (R^2=.97424 vs .97239). We might, then, naively think that we should slightly prefer the quadratic model, but Bayesian analysis shows this thinking to be wrong. The posterior odds of the linear model are about 20 times those of the quadratic model (Bayes factor of 18.7, using Jeff Rouder’s calculator).

    The simple model is more likely to be correct than the more complex model, even though the more complex model fits the data slightly better. This is a mathematical example of Occam’s Razor.

  51. jt512 on 01 Feb 2014 at 9:55 pm

    Above, I should have stated the assumption that the models are equally likely a priori.

  52. tkitzler on 01 Feb 2014 at 10:39 pm

    @jt512:

    “If two models equally fit the data, the simpler model will, indeed, tend to have the higher posterior probability. Unfortunately, the planetmath.org article you linked to fails to demonstrate the point, though it could have, as follows. The article shows a data set of 20 observations of an independent variable, X, and a dependent variable, Y; and the results of two models: a simple, linear model and a more complex, quadratic model. Not shown, is the key statistic, R^2, which shows that the quadratic model fits slightly better than the linear model (R^2=.97424 vs .97239). We might, then, naively think that we should slightly prefer the quadratic model, but Bayesian analysis shows this thinking to be wrong. The posterior odds of the linear model are about 20 times that of the quadratic model (Bayes factor of 18.7, using Jeff Rouder’s calculator).
    The simple model is more likely to be correct than the more complex model, even though the more complex model fits the data slightly better. This is a mathematical example of Occam’s Razor.”

    Thank you very much for pointing this out. I only recently began to learn some basic Bayesian analysis. I will definitely read more on that. Thanks again.

  53. tkitzler on 01 Feb 2014 at 11:54 pm

    @jt512:

    Would the same be true for two multiple linear regression models where one has three independent variables and the other one four (including the three variables of the first one)? It seems counterintuitive that the second model, which includes all the predictors of the first one, would be less likely to be correct.

  54. jt512 on 02 Feb 2014 at 2:40 pm

    “Would the same be true for two multiple linear regression models where one has three independent variables and the other one four (including the three variables of the first one)?”

    Adding a term to a linear regression model has two effects: the fit of the model to the data is improved, and the model’s complexity is increased. The first effect increases the model’s posterior probability; the second decreases it. In other words, there is a tradeoff between model fit and complexity. Whether the net effect is positive or negative depends on how much better the more complex model fits the data and how much the additional term increases its complexity relative to the size of the dataset. Large datasets tolerate more complex models better than do small datasets.

    “It seems counterintuitive that the second model, which includes all the predictors of the first one, would be less likely to be correct.”

    The reason for the paradox is that a real dataset contains random error. When we do linear regression, we want to find the model that best fits the underlying data-generating process, not the random error: we want to pick out the signal from the noise. However, if we make our model too complex, we will be fitting our model to the noise as well as the signal. To see this, consider the set of (probably fake) 20 observations in the planetmath.org article you referenced. We could fit that data set perfectly by using a 19th-degree polynomial regression model. In that data set, the regression curve would pass through every point. However, if the Y’s in that dataset were generated by a process that was linear in X, except for random error, then the 19th-degree polynomial model, which fit our dataset exactly, would fit terribly to another dataset generated by the same process. In comparison, a linear model would not fit the original dataset as well, but would fit a subsequent dataset generated by the same process much better.
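The overfitting argument above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the data are synthetic (the planetmath.org observations are not reproduced here), the true process is assumed to be linear with Gaussian noise, and the intercept, slope, noise level, and the helper names `fit_and_score` and `simulate` are all made up for illustration.

```python
import numpy as np
from numpy.polynomial import Chebyshev


def fit_and_score(degree, x_train, y_train, x_test, y_test):
    """Least-squares polynomial fit; returns (training MSE, test MSE)."""
    # Chebyshev.fit rescales x internally, which keeps a degree-19 fit
    # numerically better behaved than a raw power-basis fit.
    model = Chebyshev.fit(x_train, y_train, degree)
    train_mse = float(np.mean((model(x_train) - y_train) ** 2))
    test_mse = float(np.mean((model(x_test) - y_test) ** 2))
    return train_mse, test_mse


rng = np.random.default_rng(0)


def simulate(x):
    """Hypothetical data-generating process: linear in x plus random error."""
    return 2.0 + 0.5 * x + rng.normal(0.0, 1.0, size=x.shape)


# Original dataset: 20 observations (synthetic stand-in for the article's data).
x_train = np.linspace(0.0, 10.0, 20)
y_train = simulate(x_train)

# A subsequent dataset generated by the same process.
x_test = rng.uniform(0.0, 10.0, size=200)
y_test = simulate(x_test)

linear = fit_and_score(1, x_train, y_train, x_test, y_test)
poly19 = fit_and_score(19, x_train, y_train, x_test, y_test)

print("linear    train MSE, test MSE:", linear)
print("19th-deg  train MSE, test MSE:", poly19)
```

Under these assumptions the 19th-degree model should show a far smaller training error than the linear model but a much larger error on the new data, which is the signal-versus-noise point made above.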

    Bayesian methods of model assessment, like the Bayes factor, reward models for goodness-of-fit but exact a penalty for complexity, essentially requiring added complexity to pay for itself with sufficient improvement in fit.
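One rough way to see the fit-versus-complexity tradeoff in numbers is the BIC approximation to the Bayes factor. This is a sketch, not the method behind the 18.7 figure quoted earlier: that came from Jeff Rouder’s calculator, which uses a different prior, so the two numbers should not be expected to agree. The function name `bic_bayes_factor` is hypothetical, and the formula assumes Gaussian errors, nested regression models fitted to the same data, and equal prior odds.

```python
import math


def bic_bayes_factor(n, r2_simple, r2_complex, extra_params):
    """Approximate Bayes factor in favor of the simpler of two nested models.

    Uses BIC = n*ln(RSS/n) + k*ln(n). Because both models share the same
    total sum of squares, RSS_complex / RSS_simple = (1 - R2_c) / (1 - R2_s).
    """
    # Reward for fit: negative when the complex model fits better.
    fit_term = n * math.log((1.0 - r2_complex) / (1.0 - r2_simple))
    # Penalty for complexity: grows with each extra parameter.
    complexity_term = extra_params * math.log(n)
    delta_bic = fit_term + complexity_term  # BIC_complex - BIC_simple
    return math.exp(delta_bic / 2.0)


# Numbers quoted in the thread: 20 observations, linear vs quadratic model.
bf = bic_bayes_factor(n=20, r2_simple=0.97239, r2_complex=0.97424, extra_params=1)
print(f"approximate Bayes factor for the linear model: {bf:.2f}")
```

With the R^2 values quoted above, this cruder approximation still comes out above 1, that is, in favor of the linear model, although much more weakly than the calculator result. The extra quadratic term does not improve the fit enough to pay for itself.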

    How much this applies to making a diagnosis in the clinical setting is unclear to me.

  55. tkitzler on 02 Feb 2014 at 5:11 pm

    @jt512:

    Thank you very much for your very clear and detailed explanation. You have a great way of explaining a rather complex issue.

    I am not formally trained in statistics, but I am fascinated by its usefulness. I have some basic understanding of it and so far I have used regular t-tests, multiple linear regression analyses, ANOVA, ANCOVA, and Kaplan-Meier functions. But I understand that there is much more to know. Fortunately, when needed, I always have access to people with a more detailed understanding of how to apply them correctly. My understanding is that a lot of statistics is applied incorrectly in medicine.

    I can clearly see how Occam’s razor works well, and is logically stringent, when applied in your mathematical example. It is clear to me now how the simpler model can be more likely. But even there one needs to make tradeoffs, as you mentioned (regarding predictive power etc.). Nevertheless, I can see its usefulness and its strength. I am suggesting, however, that in the messy reality of clinical decision making its logic is too often violated in order to make it useful. Most physicians are not aware of the statistical and logical intricacies that you were referring to, but to apply Occam’s razor correctly one should know all of this. Since most people do not, they use a shortcut version of Occam’s razor.

    Part of it may have to do with how we set up our explanatory model in our diagnostic approach. My understanding is that many clinicians set the signs and symptoms as their dependent variable, which they try to explain with independent variables, the possible diagnoses or diseases. (Please note: I don’t say they do this consciously, but rather intuitively, since this is the traditionally taught approach in medicine.) In reality, you may have two diseases that explain your signs and symptoms equally well. So how would I choose which of the two independent variables to eliminate from my equation? I choose the disease (independent variable) that is more likely in this scenario. Does this now mean I used Occam’s razor? I am not sure. In a scenario where two models could explain one occurrence equally well, I chose the one that is more likely in the current situation. I don’t think that this follows the same logic that is applied in your mathematical example, since it’s not a matter of complexity.

    In contrast, in a different scenario (the Example 2 I mentioned above), where we make the tests that are used to arrive at a diagnosis the independent variables, and the disease we would like to diagnose the dependent variable, I can see how Occam’s razor helps us to choose one of two competing algorithms (where there is otherwise similar fit and predictive value). We can choose the one that uses fewer independent variables (i.e., tests). This approach is commonly used in experimental designs.

    But maybe I am just overthinking it :)
    Because I completely get where Dr Novella is coming from. I am not trying to make it more complicated than it already is, maybe I just got carried away…I enjoyed it anyways and learned a bit about Bayesian analysis. I will continue to read more about it. Thanks again!

  56. PharmD28 on 03 Feb 2014 at 12:35 pm

    I read down about half way so far….

    I am an off and on reader of this blog when I get the time…

    For whatever it is worth…since I have been observing many topics and comment sections within this blog, it often exhibits “healthy skepticism”….that is, I have very often seen the “acolytes” correcting Dr. Novella on various points, down even to the regular grammar, spelling, and in this case math….

    And among the “acolytes” there are often significant disagreements that are discussed.

    I simply find that the notion that the folks I have seen in discussion on this forum largely accept all of the points made uncritically shows either that you have not observed the discussions on this forum long enough to know, or that you are simply too cynical to be a proper skeptic.

    If your goal is to make arguments that are valid and will be heard/considered, and that are in opposition to, say, Dr. Novella….I see that as welcome….but once you start with ad hominem drivel, I write you off personally….
