Flawed Research

Four Reasons Why It’s Difficult to Conduct a Proper Study

By Cal Cates
[Massage Therapy as Health Care]

Takeaway:

The traditional and accepted processes of conducting and publishing research are deeply flawed. Researchers are dissuaded from being truly curious and they are manipulated into publishing their findings in incomplete and misleading ways that don’t serve honest inquiry and discovery.

The massage therapy profession loves to talk about research. We love to cite it. We love to call it a priority. We love to rail against the lack of it and the ways that “even the good studies” don’t give us the respect we think we deserve.

Let’s begin by agreeing to stop saying that research “proves” anything. Research suggests, sometimes strongly, that a theory has real merit. Research can show us that we may be on to something, and that one thing may be safer, wiser, or more effective than another thing. But research, and the data we gather in conducting it, is a moving target. Remember how Newton “proved” gravity with that tidy picture of stuff falling straight back to earth? That was a fact, until we discovered that space and time are actually curved and gravity is a lot more complicated. Remember when lymphatic vessels were (accidentally) discovered within the meninges back in 2015, opening brand new and previously unconsidered possibilities for our understanding of the pathogenesis of serious neurological conditions?

Research brings us face to face with the immutable truth we hate: Change is the only constant. When it comes to research, we have to supervise ourselves. We are humans. We are wired to develop biases and to see the world as we’ve been taught to see it by our background, experiences, and goals for how we want things to be. It’s so deeply ingrained that we don’t even notice our constant attempts to shape the world to fit our image of it. This dynamic has plagued the validity and reproducibility of research for centuries.

As health-care providers, it’s essential that we understand the flaws in this system so we can conduct research ethically, modulate our understanding of what is “discovered,” and keep each other honest about what we are and are not learning. In April 2019, Dorothy Bishop published an article in the journal Nature called “Rein in the Four Horsemen of Irreproducibility.”1 The article explores four common behaviors that take place in the process of conducting and publishing research that undermine our ability to ethically and accurately communicate our findings. Bishop argues that “Many researchers persist in working in a way almost guaranteed not to deliver meaningful results. They ride with what I refer to as the four horsemen of the reproducibility apocalypse: publication bias, low statistical power, P-value hacking, and HARKing (hypothesizing after results are known).”

As a person who has been actively involved in designing, conducting, and publishing research in massage therapy since 2010, I have seen this in action. It’s really slippery. These behaviors take place in plain sight. They go unchecked and, on the rare occasions they are noticed, they are easily shepherded along with little or no resistance as “how research is done.”

So, Who Are These Horsemen?

Publication bias is basically what happens when research is done correctly and with appropriate rigor, but the data produced by that research don’t support our hypothesis, so we essentially pretend the study didn’t happen. We even call these “failed experiments.” An experiment is designed to see what happens, not to prove us right. The problem is that we design experiments that we expect to demonstrate what we already think we know. When they demonstrate something else, we don’t want to “admit” it and, worse, journals don’t want to publish this kind of data. That discourages the research community from truly experimenting, and it wastes time and resources: a researcher down the road may spend scarce funding on a question that has already been studied, never knowing that the hypothesis was tested and didn’t hold up. But knowing we were “wrong” is sometimes even more valuable than discovering we were “right.”
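To see how much this distorts the record, here is a minimal simulation sketch in Python (the effect size, sample sizes, and significance cutoff are illustrative assumptions, not numbers from any real massage study). It mimics a literature where journals accept only “significant” results:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Simulate 1,000 small studies of an intervention with a modest true effect.
true_effect = 0.2   # standardized mean difference (illustrative assumption)
n_per_group = 20    # small samples, typical of underfunded fields

all_effects = []
published_effects = []
for _ in range(1000):
    control = rng.normal(0.0, 1.0, n_per_group)
    treated = rng.normal(true_effect, 1.0, n_per_group)
    t_stat, p_value = stats.ttest_ind(treated, control)
    observed_effect = treated.mean() - control.mean()
    all_effects.append(observed_effect)
    if p_value < 0.05:  # journals "accept" only the significant results
        published_effects.append(observed_effect)

print(f"True effect:                 {true_effect:.2f}")
print(f"Mean effect, all studies:    {np.mean(all_effects):.2f}")
print(f"Mean effect, published only: {np.mean(published_effects):.2f}")
```

In a typical run, the “published” studies report an average effect roughly three times larger than the true one, simply because the honest null results were filed away.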

Low statistical power is one that plagues massage therapy research, largely because funding is so scarce. Low statistical power, simply put, is what happens when we don’t (or can’t afford to) include the number of human subjects necessary to detect the effect of whatever intervention we’re measuring. Funders need to demand that studies be adequately “powered” to prevent the data universe from becoming (even more) cluttered with what are little more than guesses. It’s not difficult to calculate, while designing a study, how many participants adequate power requires. But this is a bit like when a client asks, “How often should I see you?” and we know that “daily” or “weekly” is the answer, but we also know they will not be able to afford that, so we say “every other week” or “once a month,” knowing this will be less effective and our “results” will be skewed. It doesn’t work for our clients, and it doesn’t work for research.
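For the curious, here is what that calculation can look like in practice: a minimal sketch using Python’s statsmodels library. The effect size and targets below are conventional illustrative choices, not values from any particular massage therapy study.

```python
from statsmodels.stats.power import TTestIndPower

# How many participants per group do we need to reliably detect
# a moderate effect? (These inputs are conventional illustrative
# choices, not field-specific values.)
analysis = TTestIndPower()
n_needed = analysis.solve_power(
    effect_size=0.5,  # Cohen's d: a "moderate" standardized effect
    alpha=0.05,       # significance threshold
    power=0.8,        # 80% chance of detecting the effect if it's real
)
print(f"Participants needed per group: {n_needed:.0f}")  # about 64

# The flip side: if funding only covers 15 per group, what are the
# odds the study detects a real moderate effect at all?
achieved = analysis.solve_power(effect_size=0.5, alpha=0.05, nobs1=15)
print(f"Power with 15 per group: {achieved:.0%}")  # roughly 26%
```

The flip side of the same calculation shows the damage: with only 15 participants per group, a real moderate effect would be detected only about a quarter of the time.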

P-value hacking is a little more intentional, though distressingly common and regularly rewarded by publications and funders. The “p” in p-value stands for probability. Roughly, it tells us how likely it is that results at least as striking as ours would turn up by chance alone if the intervention actually did nothing. When the p-value is large, we can’t rule out what’s known as the null hypothesis, the boring default position that our intervention made no real difference; in plain terms, “what you thought would happen, didn’t happen.” So, to keep getting published and keep getting funded, we try out a variety of analyses of the data until we find one that produces a small, pretty p-value that allows us to reject the null hypothesis. It’s a half-truth at best, and it leaves out some of the really important information that would help us all have a more complete understanding of what really did (and didn’t) happen.
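Here is a minimal sketch of how this plays out, in Python, with an intervention that truly does nothing (the group sizes and number of outcome measures are made-up illustrative values):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)

# One study, 20 different outcome measures, and an intervention that
# truly does nothing: both groups come from the same population.
n_per_group = 30
n_outcomes = 20

p_values = []
for _ in range(n_outcomes):
    control = rng.normal(0.0, 1.0, n_per_group)
    treated = rng.normal(0.0, 1.0, n_per_group)  # same distribution: no effect
    _, p = stats.ttest_ind(treated, control)
    p_values.append(p)

print(f"Smallest p-value across outcomes:   {min(p_values):.3f}")
print(f"Outcomes 'significant' at p < 0.05: {sum(p < 0.05 for p in p_values)}")
```

With 20 looks at pure noise, there is roughly a 64 percent chance that at least one outcome slips under 0.05. Report only that outcome, stay quiet about the other 19, and the p-value has been hacked.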

HARKing is the result of being backed into a corner by the research community’s wrong-headedness about all of this. “Hypothesizing after results are known” is when we rewrite our hypothesis to make it look like we “knew” this (unexpected) result would happen. Researchers massage (yep, I sure did) the data into something that appears to support their hypothesis, even if it’s in a roundabout way. The idea of mining the data for new and surprising discoveries is not the issue here. The issue is that, as researchers, we are left doing it in a backhanded and sketchy way that doesn’t allow these unexpected data to stand on their own as a possible new direction for the research we set out to conduct in the first place.
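For contrast, here is a minimal sketch of the honest alternative (all numbers are illustrative assumptions, not from any real study): scan the data for surprises, then treat the surprise as a new hypothesis and test it on data it hasn’t already seen.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(11)

# Split one dataset into an exploratory half and a confirmatory half.
n, n_outcomes = 200, 15
outcomes = rng.normal(0.0, 1.0, (n, n_outcomes))  # 15 measures of pure noise
group = rng.integers(0, 2, n)                     # random group assignment

def p_value(rows, j):
    """Two-sample t-test on outcome column j, restricted to `rows`."""
    treated = outcomes[rows][group[rows] == 1, j]
    control = outcomes[rows][group[rows] == 0, j]
    return stats.ttest_ind(treated, control).pvalue

explore, confirm = np.arange(n // 2), np.arange(n // 2, n)

# Exploratory phase: scan every outcome, keep the most "promising" one.
p_explore = [p_value(explore, j) for j in range(n_outcomes)]
best = int(np.argmin(p_explore))
print(f"Best-looking outcome #{best}: exploratory p = {p_explore[best]:.3f}")

# Confirmatory phase: test ONLY that outcome on the held-out half.
print(f"Same outcome, fresh data:  confirmatory p = {p_value(confirm, best):.3f}")
```

On pure noise, the “discovery” usually evaporates in the confirmatory half, which is exactly the check that HARKing skips.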

So, my friends, you can see we have a lot of work to do, and a really big part of it is about learning how to be “wrong.” As a culture, and as people so deeply addicted to knowing what can be known, we have to understand the value of being wrong, of being uncertain, and of collaborating in our curiosity as we make our way along this constantly changing landscape.

Note

1. Dorothy Bishop, “Rein in the Four Horsemen of Irreproducibility,” Nature 568 (2019): 435, https://doi.org/10.1038/d41586-019-01307-2.

Cal Cates is an educator, writer, and speaker on topics ranging from massage therapy in the hospital setting to end-of-life care and massage therapy policy and regulation. A founding director of the Society for Oncology Massage from 2007 to 2014 and current executive director and founder of Healwell, Cates works within and beyond the massage therapy community to elevate the level of practice and integration of massage overall and in health care specifically. Cates is also the co-creator of the podcasts Massage Therapy Without Borders and Interdisciplinary.