by Elizabeth Hoffheinz, M.P.H., M.Ed.
As full-time editor-in-chief of Arthroscopy: The Journal of Arthroscopic and Related Surgery; Arthroscopy Techniques; and Arthroscopy, Sports Medicine, and Rehabilitation, James H. Lubowitz, M.D., is particularly well positioned to opine on the state of the orthopedic literature. And, he says, all is not well.
As I learned from Don Johnson, M.D., Arthroscopy Associate Editor, emeritus, and Past President of the Arthroscopy Association of North America (AANA), “Half of what we read in the literature is wrong…we just don’t know which half.”
An overarching issue, says Dr. Lubowitz, is clinical versus statistical significance.
Known for many insightful commentaries, Dr. Lubowitz stands out for his common-sense approach to patient care. His 2019 podcast, “Our Measure of Medical Research Should Be Appreciable Benefit to the Patient,”* is particularly notable.
“The clinical relevance of research is much more important than any statistical significance,” Dr. Lubowitz told OSN. “Yes,” he says, “we need to consider levels of evidence…and journals, including ours, publish all 5 levels of studies. However, level of evidence is only one measure of the quality of a medical research publication. Randomized controlled trials (RCTs) are very valuable, but we must balance the results of RCTs with practical and expert opinion.”
Statistics for you ≠ relevance for me
And in his expert opinion, an ongoing issue in orthopedics is the generalizability of results. “Take an article that discusses a large group of patients treated by different surgeons at different times in different cities with variations in technique. In order for that information to be clinically relevant, we must report on outcomes of importance to the patient…that is how we should measure clinical research.”
“The goal is to determine if a medical treatment results in an appreciable benefit to the patient. While we do it every day, surgery is truly a big deal for patients, with risk, cost, logistics, and time to recover all playing a part in their experience. If Mr. Jones is sitting across from you at 2:30 on a given Tuesday and you are about to recommend surgery, is the paper you just read in XYZ generalizable to him? If you perform 10 hip surgeries in a week, are you sure that a given publication will be generalizable to patient number three?”
“Joshua Harris, M.D., has written extensively on clinical versus statistical significance, indicating that a statistically significant difference may not be clinically relevant. He famously said, ‘Patients don’t care about their p-values!’ And my former fellow and partner Dr. Jeb Reed used to joke that, ‘If you beat the statistics hard enough, they will confess to anything.’”
Don’t worship at the altar of statistics
Dr. Lubowitz says that if you dig around long enough in the sandbox of statistics, then you can find something that is ‘statistically significant.’ “The reality is that patients may not even be able to perceive any postoperative differences that researchers find to be of statistical significance. Fundamentally, the point is to be able to generalize to patients to the point where you can say to Mrs. Jones, ‘This surgery is going to help you.’”
“Often we only have outcome measures that are based on the outcomes recorded by the surgeon. In the past, patient reported outcome measures (PROMs) have been called ‘subjective,’ indicating bias or lower consequence—today, we understand that is not the case.”
“There have been times when, following an ACL surgery, for example, the ligament felt a little loose…and yet the patient was thrilled, had no symptoms, and returned to sports. Patient reported outcome measures should not be called subjective by any means.”
MCID, PASS, SCB…vital acronyms
“Outcomes should be measured in terms of the minimal clinically important difference (MCID) detectable by a patient,” advises Dr. Lubowitz. “In addition, reporting of patient-acceptable symptomatic state (PASS) and substantial clinical benefit (SCB) needs to become more routine. We know that PASS and SCB ultimately correlate with whether patients are satisfied and/or would be willing to undergo the intervention again.”
“‘Can you notice a difference? Are you much better than before your surgery?’ Those are the types of questions we need to be asking patients. Change is coming, I believe. We at Arthroscopy are paying increasing attention to such details. When we invite a paper to be revised, we are asking the authors to report those metrics. If someone treats a certain number of patients, say 200, and they have a spreadsheet and an outcomes score, and they determine the score indicating a substantial clinical benefit, then that clinician can quickly go through 200 patient outcome scores, by software or by hand, and show, for example, that 150 met the criteria…so that percentage of patients, in this example 75%, achieved substantial clinical benefit.”
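The spreadsheet tally Dr. Lubowitz describes can be sketched in a few lines of Python. This is only an illustration: the scores and the threshold value below are invented, and the function name is hypothetical. In practice the SCB threshold depends on the outcome instrument and the condition being treated, and the article does not give one.

```python
def percent_meeting_threshold(improvements, threshold):
    """Return (count, percent) of patients whose improvement meets the threshold."""
    if not improvements:
        raise ValueError("need at least one patient score")
    count = sum(1 for change in improvements if change >= threshold)
    return count, 100.0 * count / len(improvements)


# Assumed value for illustration only; real thresholds are instrument-specific.
HYPOTHETICAL_SCB_THRESHOLD = 20

# Invented data: 200 patients, 150 improved by 25 points and 50 improved by 5.
improvements = [25] * 150 + [5] * 50

count, percent = percent_meeting_threshold(improvements, HYPOTHETICAL_SCB_THRESHOLD)
print(f"{count} of {len(improvements)} patients ({percent:.0f}%) achieved SCB")
# prints: 150 of 200 patients (75%) achieved SCB
```

The result is reported per patient, as a share of the group, which is the form of the metric the journal is asking authors to provide.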
“Gray” is not a measure of patient satisfaction
Dr. Lubowitz told OSN, “There is no gray area for an individual patient—they are either satisfied or they’re not. Thus, studies showing ‘mean improvement in pain exceeded the MCID for a group of patients’ are reporting this metric incorrectly and could be misleading, because the patient outcomes are not distributed on a perfect bell curve. If there are outliers (i.e., some patients got worse, some had persistent pain, and many had absolutely no pain), then the average score could be misleading. The percentage of patients meeting the threshold is what matters. Ultimately, if surgeons are speaking to new patients in their clinic, and they can provide an evidence-based opinion that, ‘75% of patients with problems similar to yours who received this treatment were satisfied afterwards,’ then that is valuable information.”
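A small worked example, using invented numbers, shows how a group mean can clear the MCID even when most patients do not. The MCID value and the individual score changes are assumptions chosen only to make the arithmetic easy to follow; they do not come from any study.

```python
from statistics import mean

# Assumed threshold for illustration only.
HYPOTHETICAL_MCID = 10

# Invented improvement in pain score for 10 patients (positive = better):
# three improved a great deal, five improved slightly, two got worse.
changes = [60, 55, 50, 5, 5, 4, 3, 2, -5, -9]

group_mean = mean(changes)
responders = [c for c in changes if c >= HYPOTHETICAL_MCID]
share = 100 * len(responders) / len(changes)

print(f"Mean improvement: {group_mean:.1f} (MCID = {HYPOTHETICAL_MCID})")
print(f"Patients meeting MCID: {len(responders)} of {len(changes)} ({share:.0f}%)")
# prints:
# Mean improvement: 17.0 (MCID = 10)
# Patients meeting MCID: 3 of 10 (30%)
```

Here the mean improvement exceeds the threshold, yet only 3 of 10 patients individually reached it, which is why the percentage of patients meeting the threshold is the more informative figure to report.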
Asked if he finds this daunting, Dr. Lubowitz admitted, “Yes. Statistics are incredibly challenging to understand, and the vast majority of authors and editors, including me, are not first and foremost statisticians. And while mathematics may be a hard science, when it comes to biostatistics, there may be many ways to interpret data.”
So what is the solution?
“Peer review,” says Dr. Lubowitz. “The peer-review process isn’t meant to simply referee, ‘Accept or Reject,’ but to improve the papers we do publish through the process of manuscript revision. In addition, a goal of peer review is to educate authors whose papers we don’t accept, so they might be successful the next time they submit a manuscript.”
Asked how he would proceed if approached by the American Academy of Orthopaedic Surgeons (AAOS) about improving peer review, Dr. Lubowitz told OSN, “I would start by educating authors and reviewers. We present a journal course at the AANA annual meeting. It used to be called a ‘Reviewers’ Course,’ but is now a ‘Course for Writers and Reviewers.’ This change is telling and demonstrates an acknowledgement that the peer review process can be improved.”
“There has been a great effort by AANA and other AAOS subspecialty societies to address the appropriate reporting of outcomes. Frankly, there are so many outcome measures that inappropriate measures for a specific condition or group of patients may be selected and reported. And even if you correctly select a validated outcome measure for a specific condition, for example shoulder instability, it may not be appropriate for each unique population, such as contact athletes versus throwing athletes. Another concern is reporting bias: if different authors study the same condition in similar populations but use different outcome measures, then you can’t compare the results. We need to get everyone on the same page.”
Dynamic situation
“Even if we can come to an agreement on the appropriate outcome measures in 2020, next year we may inevitably learn something new. Some measures are quite general, such as the Short Form 36 (SF-36), and this may not be helpful for assessing surgical outcomes. If you ask, ‘How are you feeling today?’ you could be capturing information that emanated from a disagreement at work or something else that is unrelated to their orthopedic issue.”
“The Patient-Reported Outcomes Measurement Information System (PROMIS) is general, but it contains Physical Function domains, which are relevant to orthopedic surgery. With PROMIS, we may finally be able to compare apples to oranges. We may look at how much a knee surgery helped a group of patients versus how much a shoulder surgery helped a different group of patients, because in both cases we might use PROMIS to measure and report Physical Function. Of interest, PROMIS uses computer adaptive testing (CAT), where the answer to one question leads the computer to select the next question, meaning that it’s very efficient and can generate an outcome score with few questions, mitigating ‘survey fatigue.’”
So should all the other outcome measures be left in the dustbin of orthopedic history?
Not so fast, says Dr. Lubowitz. “PROMIS is still new, and because it necessitates that patients have access to a computer, that may be a challenge for some patients or providers. Mark Cote, Director of Outcomes, Research, and Quality for the UConn Musculoskeletal Institute at UConn Health said, ‘Sometimes we need legacy measures to determine what are the apples and what are the oranges in the first place.’”
Indicating that authors and editors are now attempting to correlate PROMIS scores with legacy scores, Dr. Lubowitz notes, “If a PROMIS score correlates precisely with a different rotator cuff outcome measure, then we could just use PROMIS. We recently published a study demonstrating that PROMIS correlated well with cartilage scores postoperatively but less so preoperatively. There is nuance, and once again, we continue to learn.”
“Then there is the Single Assessment Numeric Evaluation (SANE) score, which allows us to ask, ‘How would you rate your knee as a percentage of normal where zero is the worst knee you can imagine and 100% is a normal knee?’ Surprisingly, or maybe not so surprisingly, in evidence-based studies, SANE has been shown to correlate very well with many legacy measures and obviously is efficient.”
“In our practice, we measured and recorded a SANE score for every patient at every visit. Over the years, this proved valuable because sometimes despite our best efforts we have patients who are not satisfied. Let’s say someone tells me, ‘I still have pain when I jog.’ Then I look at their SANE score from the previous visit and it was an 85 versus on the day I first met them, when their SANE was a 5 or a 0. So, is this person highly demanding or does she or he have low pain tolerance? A patient who improves from 0 to 85 might derive more benefit from surgery than one who improves from 90 to 100; they may not feel they have achieved a ‘normal’ or ‘perfect’ outcome. But, it could be valuable to be able to show and remind them of where they started.”
In the end, says Dr. Lubowitz, putting patients first sometimes means putting statistics aside…or at the very least making more room for patient-reported outcome measures.
To listen to the podcast, “Our Measure of Medical Research Should Be Appreciable Benefit to the Patient,” please visit:
https://podcasts.apple.com/us/podcast/arthroscopy-podcast/id1441179280?i=1000457131307