Virtual Didactic- Shoulder Pain as a Metaphor for Evidence-Based Medicine Led by Ameet Nagpal, MD
Video Transcription
All right. Let's go ahead and get started. I want to welcome everybody to the AAP virtual didactics for today. My name is Sterling Herring. I'm a PGY3 at Vanderbilt and I'm moderating these sessions. We're excited today to have Dr. Ameet Nagpal. We'll get to him in just a second. First, we always want to recognize that for some folks this epidemic has been more of an inconvenience, but for others it has been more personal. We want to recognize and express our support for those who have been more affected, either professionally or personally. We appreciate what you're going through and we support you. If there's anything we can do to help, please reach out. As always, first, a little bit of housekeeping. The goals here are to augment the didactic curricula of your home institutions, to offload faculty in logistical situations that are more difficult than they probably have been in the past, and to provide learning opportunities for off-schedule residents. We know that a lot of folks are being pulled off of clinical responsibilities or just being shifted into different schedules. And then, obviously, to develop additional opportunities for digital learning and support physiatrists in general during this time. Further, we are trying to keep everybody video and audio muted, just for purposes of bandwidth and to keep people from getting too distracted by what's going on in the video screens. If you have any questions, please open a chat with me. Again, my name is Sterling Herring. I should be at or near the top of the participant list. Send them to me during the session, and I will pose them to Dr. Nagpal as appropriate. And again, if you have any bigger questions or concerns, please reach out to us via email or on Twitter. And with that, we will move on to today's presenter, Dr. Ameet Nagpal.
I will refrain from saying too much about him so that I don't set expectations too high, but I'm sure we'll all be both informed and entertained today. So Dr. Nagpal, thank you for joining us today.
Thank you, Sterling, and thank you, everybody, for taking the time to listen. I want to echo what Dr. Herring said, that this is a tough time for everybody. So I hope that everyone's doing well and everybody's able to suspend reality a little bit to learn a little bit about shoulder pain and evidence-based medicine today, because frankly, that's going to be what we need to do when we get to the other side of this problem. Some of you may be residents at the University of Utah, in which case you may have heard a version of this lecture when I went out there and gave a grand rounds during a snowstorm, actually. And some of you may have been at AAPM this year and heard bits and pieces of this, but I bet the majority of you haven't. And I bet also the majority of you have not heard an evidence-based medicine lecture that left you feeling riveted or feeling like you were going to go out and conquer the world. So the idea today is that we're able to do that, but the only way to teach evidence-based medicine is to make it applicable to what you do. And so that's what we're going to do today. And there we go. So we're going to talk about the glenohumeral joint, the associated conditions, and the shoulder itself. Depending on your level of training right now, whether you're a PGY1 or a PGY4, you're going to have different knowledge bases on how the shoulder operates. And it is the most difficult joint to understand of all the joints in the human body because of the various degrees of movement that it has. But then we're also going to talk about principles of evidence-based medicine and how that applies to determining whether somebody has a particular reason for their shoulder pain and how to treat their shoulder pain.
So with that said, I don't have a PhD in statistics. I am not an epidemiologist. I am a mediocre physiatrist at best, but I do know a lot about evidence-based medicine. So I'm going to do my best to give you guys the best information I can, but certainly I'm not perfect. So if there are questions or disagreements, I welcome that. There are three tenets of evidence-based medicine: reliability, validity, and effectiveness. If you have a perfect method of treating someone's shoulder pain, you will have fully reliable and fully valid diagnostic capabilities, and then a perfectly effective treatment option. And unfortunately in physiatry, and in most everything in medicine, we don't have these three things perfectly lined up. So reliability is the one of these three I will spend the least amount of time on because it's pretty straightforward. As you guys have all heard many, many times from college to medical school to residency, it's the extent to which a measure or instrument is reproducible. You've heard about intra- versus inter-rater reliability. Intra-rater means one single person doing the measuring at different times, and inter-rater means between different measuring physicians or participants. We measure it simply with correlation coefficients and with kappa scores. And I'm not going to get into the statistics of this, but we can just simply look at the kappa score of a given diagnostic test, whether it's an x-ray, whether it's an MRI, whether it's a physical exam maneuver, and you can tell based on the kappa score whether that reliability is poor, fair, moderate, good, or very good. And very few things we do in physical exams specifically fall anywhere into even good or very good. So if you have a good physical exam maneuver, it's usually here in moderate. That's a really well-designed physical exam maneuver.
Because most of the time when we do physical exam maneuvers from musculoskeletal medicine, they're done very differently from person to person on patients with different prevalence of disease, which we'll get into, and so therefore you wind up in a situation where you may not be able to reproduce the same findings. It's certainly inter-rater reliability tends to be at best moderate for our physical exam. So we'll spend most of our, the bulk of our time talking about validity and effectiveness, and then we'll get into, then of course we'll talk about the shoulder. So validity is, I'm going to assume because everybody's on mute, I'm just going to assume that everybody is laughing because of this quote from Abraham Lincoln, but I can't be sure, but actually the best thing about everybody on mute is I don't hear your silence from my jokes. So I'm going to just assume that this is going really well. We need to know if we can measure what we intend to measure, and that's the whole purpose of validity. But if you get down into the details of validity, you will find quickly that there are so many different types of validity. Face validity, construct validity, criterion predictive, there's so many. And if you are going to get knee deep into evidence-based medicine and write papers on these topics, you need to know the differences between these. But what you need to know as a resident and what you need to know to be able to clinically treat a patient is predictive validity in diagnostic medicine. When you perform a physical exam maneuver, you are not doing anything other than trying to increase the likelihood that you're making the right diagnosis or that you're ruling out a different diagnosis. And those physical exam maneuvers, just like an x-ray, just like an MRI, just like an EMG, just like any other diagnostic test, requires a validation study to prove what its worth is in exam. 
And the only way to prove that is to first have a criterion validity study to prove what is the gold standard for that diagnosis. The best example I usually give is that currently the most commonly used test for diagnosing a pulmonary embolism is either a VQ scan or a spiral (helical) CT scan. And the gold standard for diagnosing pulmonary embolism is pulmonary angiography, because you can actually visualize the emboli or the embolism. And so the way that it was determined that CT scan or VQ scan was just as good as pulmonary angiography is that first, a long time ago, decades ago, pulmonary angiography was found to be the gold standard for diagnosis because it showed reliably whether somebody had a PE. Thereafter those other tests such as CT scan were compared to the gold standard, to the criterion standard, which is pulmonary angiography, and determined to be at least as good. And that's why CT scans are utilized predominantly, and of course they have less risk of morbidity as opposed to an angiography. So the way that was done is by creating a two-by-two table. If you again use that same example, we say, well, we know the disease is present if they had a positive angiography and we know it's absent if they didn't have a positive angiography. And this goes on and on for things such as, in the shoulder, an open rotator cuff repair, where you could be 100 percent positive whether the rotator cuff was torn or not. That would either fall under disease present or disease absent. Your new test, whether it's a physical exam maneuver or imaging, would fall under test positive or test negative, and we have true positives, false positives, false negatives, and true negatives. This is not at all news to any of you, I'm sure.
And going back to these formulas one more time that all of you have heard many times, the sensitivity is the true positives divided by the true positives plus the false negatives, and the specificity is the true negatives divided by the true negatives plus the false positives. So to determine sensitivities we go down this way, and for specificities we go down this way. Sensitivity is the true positives divided by the total number of people who had the disease present, and specificity is the true negatives divided by the total number of people who did not have the disease. Now this is important because the denominators here don't change based on the population you're studying. They will always be the same, so therefore sensitivities and specificities are properties of the test itself, as opposed to predictive values. The equations for those are right here and I won't belabor them, but the point of the predictive values is that the denominators change when the prevalence changes, and so therefore positive and negative predictive values are properties of the population being studied, not the test itself. So if you measure predictive values in one study in a different population than you're treating, they don't represent your population and are therefore no longer helpful in treating or diagnosing your patients. So predictive values are relatively useless in what we do. So as an example here, and this is a little different than what I would do in person. I know my screen's not sharing right now. Let's see, Sterling or somebody or Candice, can you let me know, is the screen back up? It's back, yes.
Okay great so yeah so ordinarily I would ask for a show of hands here or somebody to do these calculations for me but in this virtual environment I'm just going to go through this myself and hope that you guys are doing it at home too but you should be able to calculate here from this randomly made set of data that the sensitivity would be 90 divided by 110 and the specificity would be 80 divided by 90 which is the total number of people without the disease and so you would get this sensitivity of 82 percent or 0.82 and a specificity of 89 percent or 0.89. So I ask and I want you to contemplate this at home what's a good sensitivity and what's a good specificity if we were in person I would be pointing at people and asking them but here I just want you to think about well geez how would you know if a sensitivity is good or a specificity is good and 82 percent 89 percent sound pretty good but then you have to use your own intuition about whether that's helpful and why this all matters is something called a likelihood ratio and so the way to determine whether sensitivity and specificity are valuable is to understand how they interpret into a likelihood ratio and it's used to determine the utility of a given diagnostic test. Positive likelihood ratios are used to determine the likelihood that a positive result would predict the presence of a condition such as and it's described as sensitivity over one minus specificity whereas a negative likelihood ratio is used to determine the likelihood ratio the likelihood that a negative result would predict the absence of the condition one minus sensitivity divided by specificity. Now the best way to analyze and interpret data using likelihood ratios is with this thing here which is called a likelihood ratio nomogram. 
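For anyone working through the arithmetic at home, the calculation just described can be sketched in a few lines of Python. The cell counts are inferred from the fractions quoted in the lecture (90 of 110 people with the disease test positive, 80 of 90 without it test negative), and the function name `diagnostic_stats` is only an illustrative choice, not something from the slides.

```python
def diagnostic_stats(tp, fp, fn, tn):
    """Return sensitivity, specificity, LR+ and LR- for a two-by-two table."""
    sensitivity = tp / (tp + fn)   # true positives / everyone with the disease
    specificity = tn / (tn + fp)   # true negatives / everyone without the disease
    # LR+ = sensitivity / (1 - specificity); infinite when specificity is exactly 1
    lr_pos = sensitivity / (1 - specificity) if specificity < 1 else float("inf")
    # LR- = (1 - sensitivity) / specificity; infinite when specificity is exactly 0
    lr_neg = (1 - sensitivity) / specificity if specificity > 0 else float("inf")
    return sensitivity, specificity, lr_pos, lr_neg


# Counts inferred from the lecture's example: 90/110 and 80/90.
sens, spec, lr_pos, lr_neg = diagnostic_stats(tp=90, fp=10, fn=20, tn=80)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
print(f"LR+={lr_pos:.2f} LR-={lr_neg:.2f}")
```

With these counts the sketch reproduces the 0.82 and 0.89 quoted above, and gives a positive likelihood ratio of roughly 7.4 and a negative likelihood ratio of roughly 0.2.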
You start with your pre-test probability, which is usually related primarily to prevalence of disease. So let's say you had a prevalence of a disease that's around five percent. You would start here. You find the likelihood ratio, whether it's positive or negative. The higher the positive likelihood ratio the better. Say you had something like 50. You would draw a line through here and that would get you somewhere between 80 and 90, and that would give you a post-test probability that you have the correct diagnosis. With a negative likelihood ratio you actually want lower numbers, because you want to go through some of these lower numbers down here and decrease the chances that the diagnosis in question is the accurate diagnosis, because you're trying to rule it out. So let's look at some examples. Oh, before we look at examples, let's talk about 95 percent confidence intervals. Just like everything else, when you calculate something there is a range around that number in which the actual truth may live. So when you calculate a likelihood ratio, that might not actually be the real likelihood ratio, because there's going to be a range, an upper and a lower border, of where the actual number may fall, just like everything else in statistics. Which is kind of sad, because we thought, hey, maybe we can identify the truth here. Almost, but not quite. So let's take a disease that has a prevalence of 0.2 percent. That would be the pre-test probability. And let's say you have an incredible test with a likelihood ratio of 50. Our best likelihood ratios in physical exam are in the 4, 5, 6 range. So let's say you have this great test, 50. That's incredible. Most genetic tests have a likelihood ratio around a thousand, something like that, so those are the most expensive, highly exaggerated tests that we can think of.
So even if you had a positive test, you're only increasing the chance that you're making the correct diagnosis to about 10 percent, which shows us that low prevalence diseases are incredibly difficult to diagnose, even with a positive test. So let's say we have the same test but the prevalence of the disease was now 10 percent. So now all of a sudden you take the exact same test, exact same disease, and the only difference is your population is now much more likely to have that disease. All of a sudden you have something like an 85 percent chance of getting the correct diagnosis. Well, that's a big difference, and we need to understand that that's a big difference because all that we changed is the prevalence of disease. So the more prevalent something becomes, the easier it is to diagnose. What if the prevalence was 50 percent? All of a sudden now a positive test gives you such a high post-test probability that you can almost be certain that you have the correct diagnosis. Well, what if we took the likelihood ratio and changed it to two? Now, in our original example the pre-test probability was 0.2 percent, and so with a positive test all we've done is change the post-test probability that we made the correct diagnosis to 0.5 percent, and that's probably not a big enough increase to justify doing that test. But I want you to think about this. What if that test costs twenty thousand dollars? What if it costs five dollars? So cost and value are wrapped up in everything we do.
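The nomogram is a graphical shortcut for a small piece of arithmetic: convert the pre-test probability to odds, multiply by the likelihood ratio, and convert back to a probability. A minimal sketch of that conversion follows, using the prevalences and likelihood ratios from the examples above. The function name `post_test_probability` is an illustrative choice, and the figures read off the nomogram in the lecture are approximations, so the computed values land close to them rather than matching exactly.

```python
def post_test_probability(pre_test_probability, likelihood_ratio):
    """Apply a likelihood ratio to a pre-test probability via odds."""
    pre_test_odds = pre_test_probability / (1 - pre_test_probability)
    post_test_odds = pre_test_odds * likelihood_ratio
    return post_test_odds / (1 + post_test_odds)


# (pre-test probability, likelihood ratio) pairs from the lecture's examples
examples = [(0.002, 50), (0.10, 50), (0.50, 50), (0.002, 2)]
for prevalence, lr in examples:
    post = post_test_probability(prevalence, lr)
    print(f"pre-test {prevalence:.1%}, LR {lr}: post-test {post:.1%}")
```

Run as written, this gives roughly 9 percent, 85 percent, 98 percent, and 0.4 percent for the four scenarios, which is the same pattern described above: the same test is far more informative when the condition is common in the population being examined.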
A simple physical exam maneuver may seem like it's not costing you much but I'll tell you in reality when I'm in my clinic and I have about seven minutes with each of my patients and then I have to add in the time I need for documentation every minute costs money and so I should only be performing diagnostic tests that are valuable meaning the physical exams that I do and frankly everything that I order I should be aware of whether it's going to change my post-test probability strongly that I'm making the correct diagnosis. So we're going to apply this a little bit to shoulder pain. These are the common shoulder pain conditions that we're used to talking about and we're going to go through these one by one and decide whether or not we can accurately diagnose these things. So let's talk about subacromial impingement. Subacromial impingement occurs when the subacromial and subdeltoid bursa is formed and irritated and inflamed underneath usually a hooked acromion though it can certainly occur under a normal acromion as well. It's the most common cause of shoulder pain. It's caused by narrowing of that space that we spoke about and you also subsequently develop supraspinatus tendon trauma and you can develop long head of the biceps tendon trauma too because the long head of the biceps goes through the glenohumeral joint and is in juxtaposition to the subacromial bursa. Abduction and internal rotation of the shoulder reproduce those symptoms and it could of course progress to a rotator cuff tear. The symptoms that patients have with subacromial impingement are classically night pain though I can't tell you with any sort of statistical basis this is true. I can tell you that the majority of my patients hate laying down on that shoulder. That's a big problem when they're sleeping. Laying on that side is very painful. They can also reproduce with overhead activity and they have anterolateral shoulder pain. There are stages to subacromial impingement. 
The first stage is edema with some microhemorrhage. The second stage is fibrosis and tendinitis, and then in the third stage you get reactive acromial spurring, which can lead to, as we spoke about, rotator cuff tearing and finally tendinosis, which is diseased tendon that is not inflamed. On x-ray, on this side, on the right or the anatomic left, we have this normal looking shoulder x-ray where the glenoid sits in the, I'm sorry, the humerus sits in the glenoid fossa really nicely, and we have the overlying acromion where we expect the supraspinatus tendon to be living. Here on this side, on the left, you see that the humerus has ridden up into the acromion. This is called a high riding humerus and that is a classical x-ray finding for subacromial bursitis. You guys are aware of these rotator cuff muscles, and as we said, this can progress into disease of the rotator cuff. And without belaboring it, we have of course our four rotator cuff muscles: supraspinatus, infraspinatus, subscapularis, and teres minor. And any one of them can be affected in subacromial impingement, to be honest. Supraspinatus is the most commonly affected because the tendon is right within that area. And if you've looked at this with ultrasound, you can see clearly that the subacromial bursa lives directly on top of that supraspinatus tendon. But because of changes in biomechanics, any of these rotator cuff muscles can be affected by subacromial bursitis and subacromial impingement. But as I said, the most commonly affected is supraspinatus. It's usually an extension of that impingement syndrome. You can also get it, though, from overuse or scapular instability or pre-existing genetic issues with the ligaments. And then there's a separate type of rotator cuff tendinopathy that occurs to the supraspinatus, which is really a vascular problem. It's a hypovascularity that's genetic.
And due to angiofibroblastic hyperplasia, the tendon itself just does not get enough vascular supply, and that can lead to tendinopathy. So that's a rare cause, but I wanted to be complete and add that in. So I go back, and I asked you to think about this. I asked everybody to think about what's a good sensitivity and specificity. Geez, if I want to diagnose subacromial impingement or subacromial bursitis or rotator cuff tendinosis or rotator cuff tendinopathy, whatever part of that pathway you want to diagnose, how am I supposed to figure out if I'm able to diagnose it with physical exam? How do I know if I have a good sensitivity and specificity? Well, let's go through some of these physical exam maneuvers that everybody's used to going through. I would encourage all of you to find some videos of these. Gerard Malanga's book is excellent for this. And there's a CD in his book that really has some great videos. So I really encourage you to go through those videos. I would love to show you in person. And one day I'll be happy to do that when I meet each and every one of you. But for now, let's talk about these one by one. And we've all heard about many of these. The Hawkins test is passively done by the practitioner, and it forces the greater tuberosity under the coracoacromial ligament. As you can see, the shoulder is flexed to 90 degrees, as is the elbow. And then it's put in a neutral abduction-adduction position. And then the shoulder is forcibly, though not aggressively, internally rotated passively by the examiner. Theoretically, if it leads to pain, that would be considered a positive test. For subacromial bursitis, our studies show that the sensitivity is 70.6 percent, and the specificity is about 47 percent. And for rotator cuff pathology, that sensitivity goes down just a little bit. Specificity is about the same.
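As a quick check on those Hawkins figures for subacromial bursitis (sensitivity 70.6 percent, specificity about 47 percent), the positive likelihood ratio can be worked out directly. The sketch below also shows how little a positive result moves the needle. The 40 percent pre-test probability is a purely hypothetical number chosen for illustration and does not come from the lecture or from any study.

```python
def positive_likelihood_ratio(sensitivity, specificity):
    """LR+ = sensitivity / (1 - specificity)."""
    return sensitivity / (1 - specificity)


def post_test_probability(pre_test_probability, likelihood_ratio):
    """Convert probability to odds, apply the likelihood ratio, convert back."""
    odds = pre_test_probability / (1 - pre_test_probability) * likelihood_ratio
    return odds / (1 + odds)


# Hawkins test figures quoted in the lecture for subacromial bursitis
hawkins_lr_pos = positive_likelihood_ratio(sensitivity=0.706, specificity=0.47)

# Hypothetical pre-test probability, for illustration only
pre_test = 0.40
post_test = post_test_probability(pre_test, hawkins_lr_pos)

print(f"Hawkins LR+ = {hawkins_lr_pos:.2f}")
print(f"pre-test {pre_test:.0%} becomes post-test {post_test:.0%}")
```

The ratio comes out to about 1.3, and under the assumed 40 percent starting point a positive Hawkins test only raises the probability to about 47 percent.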
So if we look at the positive likelihood ratios for those, we're around one. Let's go back here. I should have brought this up. Let me go back here to my nomogram. Let's look at this nomogram. I want you to know that any time a positive or a negative likelihood ratio travels through one, you get to the exact same number that you started with. So it's as if you didn't do the test. So a likelihood ratio of one, whether it's a positive or a negative likelihood ratio, does not help you diagnose or rule out the diagnosis of any disease. And so when we have anything near one, it's not helpful. Let's get back to where we were. Okay, so we said these likelihood ratios are slightly above one. So Hawkins tests, at least based upon the studies that exist, do not help us rule it in. I did not calculate the negative likelihood ratios here. I would venture to say that they're slightly below one though, but I'd welcome anybody to look at those and decide whether they think this is a helpful test. Neer's test is done with, again, the shoulder in neutral abduction-adduction, and with the forearm fully pronated and the elbow fully extended, the practitioner passively flexes the shoulder forward and brings the shoulder all the way up. This causes impingement of the supraspinatus tendon under the acromion. For subacromial bursitis and for rotator cuff pathology, the sensitivities and specificities are listed here. Once again, our likelihood ratio for subacromial bursitis is slightly higher, 1.7, but it's still not very robust, and for rotator cuff pathology it's about the same as Hawkins. We have Jobe's test here, which is also classically called the empty can test. That's probably what you guys are used to calling it. It tests strength of the supraspinatus in isolation. The arm is put in the angle of scaption, which essentially means that the scapula and the shoulder and the humerus are in a single line.
The forearm is fully pronated so that the thumbs are pointing down, and then the patient resists upward flexion of the shoulder. If they have pain, it's considered to be positive for a supraspinatus pain condition of some sort, and if they have weakness where the arm drops, it's consistent with a tear of the supraspinatus. If we look at the likelihood ratio for lesions and tendinitis of the supraspinatus, we're looking at this one range again. Now, for tears, in the one study that exists, the specificity was 100 percent. Anytime the specificity is 100 percent, your likelihood ratio suddenly becomes infinity, which is pretty impressive, but this is a low sensitivity, 18 percent. So if we are to believe this study, which I'm not saying that we have to, but if we are to believe this single study on diagnosing supraspinatus tears, we would say that if you have positive weakness with the empty can test, you have a 100 percent chance of having a supraspinatus tear, but if you have a negative test, we haven't ruled that out, most likely because our negative likelihood ratio is actually incalculable. So another rotator cuff test here is the Hornblower's test, also called Patte's test, and for completeness, I put the infraspinatus test over here, which is essentially the same thing but with the arms at the side, where there is resisted external rotation. Best case scenario, because this has been studied a couple times, the likelihood ratio is 14.1, and so I ask you to think about, well, geez, have any of my faculty taught me the Hornblower's test?
Have any of them taught me Hawkins or Neer or the empty can? I venture to say that most of them have talked to you about Hawkins and Neer and empty can, and I venture to say that most of them have not talked to you about the Hornblower's test, but in reality, this is probably the best test we have for diagnosing a form of a rotator cuff tear. But infraspinatus is not nearly as commonly torn as supraspinatus, so what is the value associated with knowing how to diagnose that? And I think you need to think about that on your own and decide. That's the art of medicine. The art of medicine is determining whether this data is something that I could apply in my population. How about the liftoff test? The liftoff test is to diagnose a subscapularis tear. The patient internally rotates their shoulder, puts their arm behind their back with the elbow flexed to 90 degrees, and they actively resist pushing off from their back against the examiner. Best case scenario, the likelihood ratio is 1.9. This is one of the few physical exam maneuvers where we can actually look at reliability, and our reliability is pretty good. It's actually that fair range, that 66 percent range, and actually a little higher than that, right, which puts us in the good range. That's really good reliability for this test, but just because it's reliable doesn't mean it's valid. So now we'll move on to bicipital tendinosis. We've talked about the rotator cuff enough, I think. This is a primary degenerative pathology of the long head of the biceps tendon that occurs under the coracoacromial arch as it enters the glenohumeral joint. Most of the time this is a secondary problem due to a rotator cuff injury. Patients have pain in the anterior shoulder that's worse at night, and if they're athletes, they might have a pop or a snap with a throwing motion.
So ordinarily I would ask if anybody knows what this is. This is the Popeye sign. The long head of the biceps actually zippers down towards the elbow after an acute biceps tear. We have no idea what the sensitivity and specificity of this physical exam finding is, but I had a patient in my clinic about a week ago that had a subacromial bursa injection, and a couple days later started having ecchymosis and hematoma formation in the middle of their arm. I brought them into clinic, this is one of the maybe five patients I've actually seen in clinic in the last two weeks, and sure enough, she had this Popeye sign. We got an MRI, and she had ruptured her biceps tendon. I think this is pretty valid and reliable, but hey, I can't be sure of it because I don't have any studies to demonstrate that. So how about Speed's test? In Speed's test, the patient has upward forward flexion of the shoulder with the elbow in full extension and the arm in neutral abduction and adduction, and there is resisted forward flexion by the patient against the practitioner. Sensitivity is 90 percent, specificity is almost 14 percent, which is very low. That gives us a positive likelihood ratio of about one and a negative likelihood ratio, we haven't talked about those yet, of 0.95. So this is around one for both of these, right? Yergason's test, which I'm sure many of you have heard about, is essentially resisted supination of the forearm. It has never been tested for bicipital tendonitis. It is commonly taught to residents, and I would encourage all of you to think about this for one second and think to yourself, what does the long head of the biceps do? The long head of the biceps forward flexes the shoulder and flexes the elbow. The short head of the biceps supinates the forearm. Now they both do a little bit of both, but that's primarily the case.
So if we're trying to identify if the long head of the biceps has disease, why on earth would we be testing supination, which is primarily performed by the short head of the biceps? I ask you to think about that, and I ask you to think about that every time someone tells you that a physical exam maneuver is helpful, because there's so many people out there who do Yergason's test to assess for biceps tendonitis, and it's never been studied and it's not even primarily testing the right tendon of the biceps. So think about that every time you think about evidence-based medicine for the shoulder. So let's go back to Speed's test. The positive likelihood ratio is 1.04, so that means it's useless. It doesn't change our post-test probability. The negative likelihood ratio is 0.95. It also doesn't change our post-test probability, in the sense that a negative likelihood ratio like that means a negative test doesn't help us rule out the presence of the disease. But the sensitivity is so high. Let's go back and look at that. The sensitivity is 90 percent, and you've all been taught, and I was taught for a long time, that a high sensitivity rules out the disease, right? SpPin and SnNout. So if you have a high sensitivity and the test is negative, that means they don't have that, right? That is a lie that's been taught to us for generations. If you utilize the statistics correctly, the specificity is also necessary to assess whether a negative result means something, and when you have such a low specificity as Speed's test does, in the 14 percent range, that high sensitivity is no longer meaningful. Okay, again, I want you to think about this. So we're going to move on to glenohumeral osteoarthritis. So we've talked about rotator cuff pathology in conjunction with subacromial subdeltoid bursitis and impingement. We've talked about biceps tendonitis.
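Returning briefly to the SnNout point before moving on: the negative likelihood ratio is (1 minus sensitivity) divided by specificity, so specificity sits in the denominator and a high sensitivity alone does not guarantee a useful negative result. The sketch below holds sensitivity at 90 percent and compares two specificities. Both specificity values are hypothetical and chosen only to show the effect. They are not figures from any study mentioned in the lecture.

```python
def negative_likelihood_ratio(sensitivity, specificity):
    """LR- = (1 - sensitivity) / specificity."""
    return (1 - sensitivity) / specificity


# Two hypothetical tests with the same 90 percent sensitivity
lr_neg_high_spec = negative_likelihood_ratio(sensitivity=0.90, specificity=0.95)
lr_neg_low_spec = negative_likelihood_ratio(sensitivity=0.90, specificity=0.10)

print(f"specificity 95%: LR- = {lr_neg_high_spec:.2f}")
print(f"specificity 10%: LR- = {lr_neg_low_spec:.2f}")
```

With 95 percent specificity the negative likelihood ratio is about 0.11, which meaningfully lowers the post-test probability. With 10 percent specificity it is about 1.0, so a negative result leaves the probability where it started despite the identical sensitivity.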
Now we're going to talk about osteoarthritis of the glenohumeral joint. There are four radiographic signs of osteoarthritis. Those four are joint space narrowing, which we see here on this x-ray; subchondral sclerosis, which is this thickening, this radiopacity on either side of the joint line; subchondral cysts, and we see one right here, right? And osteophyte formation. You can actually see an osteophyte coming off of the acromion here, but that's not the glenohumeral joint, and you probably see a little bit of osteophyte formation here on the head of the humerus, on the inferior margin there. So again, the four radiographic signs of osteoarthritis in any joint are joint space narrowing, subchondral sclerosis, subchondral cysts, and osteophyte formation. Specifically with the shoulder, when you have osteoarthritis, it causes pain that's worse with activity. It tends to improve with rest and actually with heat. This is different than the other diagnoses we talked about because it's deep in nature and aching. It can interfere with sleep, which is normal for most of these pathologies, but it actually is not worst at night as those other things that we talked about are, because the other ones are tendinous and overuse related and so they get worse over the course of the day. Glenohumeral arthritis actually might be worse at any point of the day, and patients have a gradual loss of their function. By the way, there are no physical exam maneuvers that anybody's studied for this as far as I'm aware, so I'm not even going to get into that, and we'll move on to the next disease state, because it just hasn't been studied how to diagnose glenohumeral osteoarthritis with physical exam. By the way, when we go back here, I would like every one of you to think about looking up where those four diagnostic criteria came from, because they should also have their own likelihood ratios, right?
An x-ray is no different than a physical exam maneuver; it's just helping you make a diagnosis. So here, this is an MRI showing a labral injury. There's a discontinuity of the posterior portion of the labrum, and the most common type of labral tear is a superior labral anterior to posterior lesion, also called a SLAP lesion. The most likely cause is a fall on an outstretched hand or, in an athlete, poor biomechanics during overhead throwing. You can also have trauma from a biceps tendon pulling on it; that's less common. This is a disease in which patients complain of clicking and catching and grinding, sort of like a meniscus injury in the knee. Now, the difference between this and other disorders of the shoulder is that you can't really diagnose it with a traditional MRI, and you would have to actually do a gadolinium-enhanced MR arthrogram, in which the contrast is injected directly into the shoulder joint prior to the MRI. The gold standard for this is arthroscopy to visually identify it, but a gadolinium-enhanced MR arthrogram has been shown to be just as good as an arthroscopy in terms of diagnosing a labral tear, whereas a traditional MRI without contrast inside of the shoulder has not been demonstrated to be valuable in diagnosing labral tears. That's important because if you have a patient with shoulder pain and you'd like to identify whether the cause of their pain is a labral tear, you would like to know ahead of time whether you're going to order a traditional MRI or an MR arthrogram with injected contrast. The test that's most commonly thought of in diagnosing labral tears is the O'Brien's test.
In the O'Brien's test, just like the empty can test, the forearm is fully pronated, the shoulder is forward flexed to 90 degrees, the elbow is fully extended, and the arm is brought to 30 degrees of adduction. The patient then performs upward forward flexion of the shoulder against resistance. If it is painful in this position, but then when the forearm is converted to supination and the same thing is done it's not painful, so painful with pronation of the forearm but not with supination of the forearm, then that's considered a positive test. And the sensitivities range from 85 to 100 percent and the specificities from 41 to 98.5 percent. Most studies, and this test has been studied a few times, have their sensitivity and specificity down in the 85 and 41 percent range. The only study with these sky-high numbers, it's one study that has 100 percent sensitivity and 98 percent specificity, and it was done by Dr. O'Brien himself. And so the test that he named after himself gave you virtually 100 percent chance of a positive and negative predictive value of near infinity. So I'll let you make your decisions as to whether that is valid or not. I think though, oh, we have a couple more causes of shoulder pain to discuss, but I'm going to talk about the AC joint. Next, the AC joint. The most commonly performed test that everybody thinks about is the Apley Scarf Test, which is shown here, where the patient actively cross-arm adducts with the elbow flexed, and when they do that, theoretically, if they have pain at the AC joint, that would mean that it's a positive test. Well, that's never been studied. O'Brien actually studied his own test again to determine whether, if it was positive, it could be considered AC joint pain. And in his study, again, he had 100 and 96.6 percent as a sensitivity and specificity respectively, and there are only one or two other studies on this, with really low sensitivity and a slightly higher specificity.
So incredibly, if you have a positive O'Brien's test and you go with O'Brien's actual studies, you have 100 percent chance of getting the right diagnosis of AC joint and labral tear pain, which is impossible, but I leave that up to you to assess. We'll move on to myofascial pain, myofascial pain being a great mimicker of all pain conditions in every part of the body. And so I asked you what a trigger point is, because we talk about trigger points all the time, so let's get into that. How do we diagnose what a trigger point is? These pictures are from Simons and Travell's book, and Simons and Travell are the people who kind of originated the idea of discussing myofascial pain. You can see that the referral patterns are all around the shoulder for many of these muscles. I mean, all the muscles in the chest wall and in the shoulder itself. And so if we are to believe that trigger points and myofascial pain are a real thing, we certainly have to acknowledge that they can cause pain in areas where we would think that the condition might also be a labral tear or osteoarthritis of the shoulder or a rotator cuff pathology. So Simons and Travell defined a trigger point as a hyperirritable spot in skeletal muscle that's associated with a hypersensitive palpable nodule in a taut band. That spot is painful on compression and can give rise to characteristic referred pain, referred tenderness, motor dysfunction, and autonomic phenomena. The American College of Rheumatology defined a trigger point as a hyperirritable spot in skeletal muscle associated with a taut band upon palpation with a characteristic referral pattern. So not can, but it must have a characteristic referral pattern. The ACR defined a tender point, which we think of as a fibromyalgia symptom, as a hyperirritable spot in skeletal muscle associated with tenderness to palpation, but did not classify it as needing a taut band or a characteristic referral pattern.
Two years ago, a Delphi panel was put together in Spain, actually, to identify what might be a good way to diagnose a trigger point. In this group of experts, 70% decided that it requires a taut band, a hypersensitive spot, and referred pain. In a second round, those experts considered that the minimum diagnosis should include the taut band, hypersensitive spot, and referred pain. And then during the third round, they determined that there's such a thing as a latent trigger point, and the referred pain may not be present when you examine a patient because they may have a latent as opposed to an active trigger point. So these diagnostic criteria are all based upon expert consensus. And because there is no criterion standard, aka a gold standard, we have no way to identify whether myofascial pain or a trigger point is a real condition or not. So now we'll move into effectiveness with the time I have left. We're going to think about interventional treatment. So if you do a blind injection for the subacromial bursa, accuracy, which remember we consider a form of reliability, you have a 29 to 83% chance of getting it in the subacromial bursa every time. And we know why that is. I think many of you at this point have probably done an ultrasound-guided subacromial bursa injection, and if we look at it, geez, well, okay, here's the greater tuberosity, here's the supraspinatus tendon, here's the acromion, here's the subacromial bursa, this thin line, and maybe there's some fluid in there sometimes, but if you expect to get into that thin line every single time you inject into this area blindly, then you are incorrect that you'll be able to do that. Eighty-three percent seems very high to me. I think that most likely the more accurate assessment would be that you're in the 29% range for reliably getting into this area.
So I think ultrasound has been shown to be probably a standard of care for performing such a procedure. Now, for glenohumeral joint injections, a blind anterior approach has been shown to have an accuracy as low as 27% in one study. But in most studies, when you do an anterior approach to a glenohumeral joint, the accuracy is shown to be about 90%. The joint is right there. But if you do a posterior approach, you have a 25 to 30% accuracy rate. And we can see why if you compare an anterior versus a posterior approach: with the posterior approach, that joint is pretty deep here. I can't tell in this picture, unfortunately, what that depth is; it looks like it's about three centimeters deep. But here, with the anterior approach, this rotator cuff interval approach, you know, the joint is right there, but so is the biceps tendon, so is the subscapularis tendon, and so is the supraspinatus tendon. So you could accidentally inject steroid into a tendon, which could predispose to tendinous rupture. So is reliability, all those studies, those couple of studies I just showed about blind injections that are talking about reliability, is it a substitute for efficacy? Because even if I told you, hey, you can reliably get somewhere, you might be reliably getting way up here when your target is way down here. And so let's talk about efficacy. And effectiveness, which is the third part of the tenet of evidence-based medicine, is the degree to which a treatment improves a condition. And so the key with pain treatment and shoulder pain treatment is that effectiveness has to be measured in a certain way.
So if we took a drug to try to treat somebody's shoulder pain, and we had a drug group, pre-intervention and post-intervention, and then a placebo group, pre-intervention and post-intervention, and we looked at the difference in pain for each of these six patients in each of these groups, pre-intervention, post-intervention, and we took the means, we could say the mean of the drug group started at eight, the mean of the placebo group started at eight, the drug group post-intervention, the mean is five, and the placebo group, the mean is five. So therefore, the mean change in both groups is three. Well, does that mean to us that there's no difference between this drug and placebo? And the answer is no, because in the drug group, three patients actually improved quite substantially. They got at least five points better. In the placebo group, only one patient got five points better. And so this mean data is useless. What we want to know is, with a standard, set, identified amount of improvement, and in this case we're saying five points of improvement, did the patient get better by that much? And in our drug group, or our intervention group, however you want to think about it, it's three patients versus one in the placebo group. And so if we take that data, which is categorical data, which is different than the mean data we looked at before, where mean data is considered continuous data, we can take this five points improved, yes or no data now, and say, when treated with the experimental drug, three patients with this disease will experience 50% improvement, which is five points of improvement, for every one who experiences 50% improvement with placebo. And so therefore, the number needed to treat, which is the most important number in effectiveness studies, is the ratio of the experimental event rate to the control event rate, in this case, three to one, and so the number needed to treat is three.
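The contrast between mean data and responder data can be sketched in a few lines of Python. The individual improvement scores below are hypothetical, chosen only to reproduce what the talk describes: six patients per group (an assumption read from the slide description), a mean improvement of three points in both groups, and three responders on drug versus one on placebo at a five-point threshold. The number needed to treat is computed with the conventional formula, the reciprocal of the absolute difference in event rates, which for these numbers also gives three.

```python
def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def responder_rate(improvements, threshold):
    """Fraction of patients whose improvement meets the threshold."""
    responders = sum(1 for change in improvements if change >= threshold)
    return responders / len(improvements)


def number_needed_to_treat(experimental_rate, control_rate):
    """NNT = 1 / (experimental event rate - control event rate)."""
    absolute_difference = experimental_rate - control_rate
    if absolute_difference <= 0:
        raise ValueError("no benefit over control, NNT is undefined")
    return 1.0 / absolute_difference


# Hypothetical per-patient improvements in pain score (points).
drug_improvement = [5, 5, 5, 1, 1, 1]
placebo_improvement = [5, 3, 3, 3, 2, 2]

# Continuous (mean) view: the two groups look identical.
print(mean(drug_improvement), mean(placebo_improvement))

# Categorical (responder) view: at least five points better, yes or no.
drug_rate = responder_rate(drug_improvement, threshold=5)
placebo_rate = responder_rate(placebo_improvement, threshold=5)
nnt = number_needed_to_treat(drug_rate, placebo_rate)
print(f"responders: {drug_rate:.2f} vs {placebo_rate:.2f}, NNT = {nnt:.1f}")
```

The same two groups produce "no difference" when summarized by means and a number needed to treat of about three when summarized by responders, which is the distinction the example is meant to show.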
So for every third patient you treated with the drug, they should have at least 50% improvement, and that's a very good number needed to treat. And so let's go back. By the way, if this study was set up as a mean data study, the title of the study would be, hey, this drug does not help treat this condition. But if it was a categorical data study, which is the correct way to do it, the title would be, this drug does help this condition. So the way that you consume that data is going to change the way that you even interpret the study. But let's say, what's the number needed to treat for glenohumeral intraarticular corticosteroid injections? Because we talked about reliability, but we didn't talk about effectiveness. So what's the number needed to treat? I want you all to think about it and guess in your head, what do you think the number needed to treat is? Okay, so admittedly, I did this search about a year ago. I think it was last February. So this number may have changed. But I put glenohumeral osteoarthritis steroid injection into PubMed, February 2019. And I want you to guess how many hits I had. This was not even filtered for clinical trials. It was just glenohumeral osteoarthritis steroid injection. There were six studies that popped up, and none of them were helpful. Now, maybe something's changed in the last 14 months, but I'll tell you that nobody really knows what the number needed to treat is, and nobody even knows if steroid injections for osteoarthritis of the shoulder are even an effective treatment, even though we've been doing it for decades, because it has not been studied. And I'll tell you right now that the causes of peripheral joint pain are varied. You can have osteoarthritis, or now, you know, there's a movement to change the name of that to osteoarthrosis, because it's not primarily inflammatory, or even osteoarthropathy; there are rheumatologic diseases; there's traumatic arthritis.
And I ask you, are these all the same? And more importantly, are they responsive to the same treatments? Every time somebody has arthritis in their shoulder, should we be putting a steroid in there? And so a several-decade paradigm for treating chronic joint pain exists. And it comes from literature in the neck and the back for cervical and lumbar facet pain. And that paradigm starts with can it hurt, and then the next piece of the paradigm is does it hurt? And the next piece is, well, can we block it, and can we make it feel better? And so my group here at UT Health San Antonio did a little bit of the work on this. We were the first to publish a cadaveric study of the articular branches of the shoulder joint to show, hey, there are actually nerves that innervate this joint, and they're terminal branches that aren't just large branches. And we showed posteriorly that there's a sensory branch from the suprascapular nerve and from the axillary nerve that innervates that glenohumeral joint. And anteriorly, there's a branch from the lateral pectoral nerve. We also demonstrated that almost all of the innervation of the AC joint comes from that lateral pectoral nerve as well. There also is innervation from the nerve to subscapularis that we found in this study. We then identified under fluoroscopy that there are targets where you can perhaps inject these nerves, and we actually just showed their path. And this is the illustration where we showed there might be safe zones, which are these yellow areas, and unsafe zones to perform the blocks. And this is probably a lecture better for my pain fellows in our pain fellowship, but just suffice it to say that this process of identifying, hey, is there a way to target treatment, is probably a more evidence-based manner than blindly assuming that putting a particular drug into a particular cavity is going to help.
So we identified, first of all, that there's innervation to this area. Nobody had done that before. Philip Peng's group in Toronto, for completion's sake, I'll just say, and this is from my presentation at AAPM a couple months ago, did show that, hey, look, he overlapped a bunch of other studies and showed that these four nerves, the axillary nerve, suprascapular nerve, lateral pectoral nerve, and the nerve to subscapularis, provide almost all the innervation to the shoulder joint by computer rendering. And then we have to find out, well, okay, does it hurt? Before we identify if we can effectively treat something, how do we know that's actually what's causing it? And the shoulder is not the back or the neck. So in the back and the neck, this paradigm came about because there are so many different potential causes of back pain or neck pain, and it's so hard to identify what's causing them. But the shoulder or the knee, you know, if you have pain in that area, and they describe it as deep and aching, odds are pretty high that there's not much else there that could be causing the pain. So face validity is the idea that on its face, when you just look at something, is that good enough for diagnosis? And what I'm saying is, is face validity good enough to say, hey, if you have deep aching pain in and around your shoulder, it's probably coming from your shoulder joint? And I would argue yes, but that means that we're not doing any validation studies like we talked about earlier. D.J. Kennedy's group, actually before he was at Vanderbilt, I think he was at Stanford at the time, identified that patients who have 100% response to shoulder injections have this kind of distribution of their pain. And the heat maps indicate where their pain was, anteriorly and posteriorly. And those that were non-responders, i.e. those that did not have 100% response to injection, this is their distribution.
So they tended to have much more distribution down distal to the shoulder. And this table kind of identifies that data and shows that, okay, well, you're more likely to have anterior shoulder and posterior shoulder pain than not if you're a responder to an injection into the shoulder, which would imply that the glenohumeral joint is in fact the pain generator in question. And so this is a form of sort of, I would say, appreciating and validating that face validity is probably acceptable for the shoulder joint. And lastly, I want to get into that we, our group here also then proceeded to publish last year, the first case series where we actually did radiofrequency ablation upon these terminal branches of the axillary suprascapular and lateral pectoral nerve. And thereafter, were able to achieve at least six months duration benefit in approximately 47% of our patients. And 60% of those patients who had the procedure done using a technology called cooled radiofrequency ablation as opposed to traditional radiofrequency ablation, where the number was closer to 30%, 33% actually. And that's beyond the scope of this lecture. But just suffice it to say that we're working on a way to try to show that we can actually block this targeted joint. And whether that's good or not is up for debate. Lastly, in our study, we didn't look at a functional scale. And we're all physiatrists. We only looked at pain. But man, we should have been looking at function. And so there is a standardized systematic comparison study that was done in 2014 that shows that the simple shoulder test is the best test to use in longitudinal studies or clinical trials for testing for function in patients and improvement in function when you treat shoulder pain. And that the Dutch shoulder disability questionnaire is the best one to use in clinical practice to see if you're really truly helping someone's shoulder pain. And to discriminate among patients or groups, i.e. 
from a prevalence from one population to another, the American Shoulder and Elbow Surgeons Shoulder Assessment and the Oxford Shoulder Score are the best tests to use. Okay. So in summary, with five minutes left for questions, the best predictor of reliability is the kappa score, and I spent the least time on that. The best method of determining the value of a test is understanding its utility coupled with its likelihood ratio. The optimal method of determining efficacy and effectiveness of a given therapy, and I didn't talk about the difference between efficacy and effectiveness, but I can if anybody wants, is the number needed to treat. And what I definitely didn't talk about is number needed to harm. So for all of those things, hey, what's the number needed to harm, to identify that you're causing undue risk to a patient, that you might actually make them worse? And that's something you have to think about with each and every one of these things. The typical diagnostic provocative maneuvers we perform for shoulder pathology are probably not as good as we think they are. The typical interventional treatments we offer for shoulder pain might not be as good as we think, and we might not even know if they work. And hey, what about non-interventional too? What do you guys know about number needed to treat for physical therapy? I'll tell you what, there's not much out there. We don't know that physical therapy works, and it's also expensive. Patients have a copay every time they go to physical therapy, maybe three times a week. And lastly, we're working on a way to reliably denervate the shoulder, but I don't know if that's a good thing. I know that it's my group doing this research, but that doesn't necessarily mean it's good, and we're hoping it does help in the long term for patients with shoulder pain, but we don't know until we have good, strong numbers needed to treat. And lastly, I want you to know that a lack of evidence isn't evidence of lack.
Just because there isn't a study on intraarticular corticosteroid injections for glenohumeral osteoarthritis does not mean that it does not work. It just means there isn't data yet. And I want you to remember, whenever you go back and talk to your faculty, and they say, well, I've been doing this for this way for this long, and I know it works because I have this many patients that it worked on, that anecdote is not the plural of evidence. Until there is evidence, those anecdotes should not drive your clinical decision-making. And lastly, I hope this lecture left you feeling the need to ABS, always be skeptical. These are my references. Thank you very much. And so I welcome any questions in the time we have left. I can stay on for extra time if there's questions from people that want to stay a little extra. Thank you. Thank you. I appreciate it. We're getting questions if people will be able to access this later. There's a, as you know, you know, getting into this for the first time was probably a little daunting. I know certainly evidence-based medicine, if you don't have a statistical background, can be a little bit, it can be a lot. So I appreciate what you, hold on, I'm kind of looking through some of the questions. I appreciate what you have taught us. I think it goes everywhere from critical thinking through anatomy and clinical applications. And I think this served the purpose to make us think about the tests that we're performing and the so what's about them. One question asked about going back to likelihood ratios and positive predictive value and asked specifically about the difference between prevalence and pretest probability. Can you expound on that a little bit? Yes. Thank you. So pretest probability can change independent of the prevalence for sure. 
I think prevalence is the first piece of pretest probability because, you know, if the prevalence in your population of a certain disease is, let's say, 25%, then your pretest probability starts at 25%. And I actually like to think of likelihood ratios as serially changing your pretest probability. So from there, you ask them a historical question. And so let's say you're working up a patient for glenohumeral osteoarthritis, and the pretest probability that they have that disease, in a patient with shoulder pain, is, let's say, 10% in your population. So then you ask them, hey, is it achy? And they say yes. Well, the historical fact of having achy pain probably has a likelihood ratio. And so let's say that likelihood ratio is about four. And so that brings your posttest probability that the patient has glenohumeral osteoarthritis up from 10% to something like maybe 20%. So now you're starting with a new pretest probability of about 20%. And then you ask another historical question, and you increase or decrease the chances. And that's the whole reason we do a history and physical. Nobody ever teaches anybody that. But your history and physical is designed to serially change your posttest and pretest probability until you complete it and wind up upon a final posttest probability. That makes a lot of sense. That helps answer the question. I think it does. So if I understand correctly, pretest probability approaches prevalence in a situation that is sterile from data. That's correct. Once you start adding data, you start changing your pretest probability. That's a good way to put it. That's very helpful. One person asked if there is a paper that you're aware of that kind of brings in sensitivity, specificity, positive and negative predictive values, or likelihood ratios for the physical exam, that puts it all in one place. I know of Nitin Jain's paper on this.
Can you think of any others off the top of your head? Nitin Jain's paper on shoulder pain specifically is really good for that. I think the best place to get a lot of this data is actually Gerard Malanga's book that I referenced. His book has all of this data for everything in the musculoskeletal system, SI joint, low back, neck. Now, of course, it's a book, and books become obsolete quickly these days as new studies come about, so it's up to you as the consumer to keep up with the data, but his book really has the most information on this for every joint, not just shoulder. But Nitin's is really good as well. All right. Well, thank you very much. We're actually out of time, but again, thank you for joining us. I appreciate you being here and giving this lecture. Very informative and very helpful. If anybody has any questions about this, anything that didn't get answered, please reach out to us on Twitter or via email, and again, the daily schedule of these lectures is there on that website, physiatry.org slash webinars. Thank you again, Dr. Nagpal, and we hope to see everybody tomorrow. Thank you, guys. Take care. Be safe.
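The serial updating of pretest probability described in the question-and-answer exchange can be sketched with the odds form of Bayes' rule. This is a minimal illustration, not a clinical tool: the function names are hypothetical, the 10% starting prevalence and the likelihood ratio of four are the speaker's illustrative numbers, and the second likelihood ratio of two is invented purely to show a second step. Multiplying likelihood ratios in sequence also assumes the findings are independent of one another, which real history and exam findings often are not. Under these numbers, the exact first update comes out at roughly 31%, somewhat above the rough estimate given aloud.

```python
def update_probability(pretest, likelihood_ratio):
    """Apply one likelihood ratio to a pretest probability.

    Converts probability to odds, multiplies by the likelihood ratio,
    and converts the resulting odds back to a probability.
    """
    pretest_odds = pretest / (1.0 - pretest)
    posttest_odds = pretest_odds * likelihood_ratio
    return posttest_odds / (1.0 + posttest_odds)


def serial_update(prevalence, likelihood_ratios):
    """Return the probability after each finding, starting at prevalence.

    Each posttest probability becomes the pretest probability for the
    next finding (assumes the findings are independent).
    """
    probability = prevalence
    trail = [probability]
    for ratio in likelihood_ratios:
        probability = update_probability(probability, ratio)
        trail.append(probability)
    return trail


# Speaker's illustrative numbers: 10% prevalence, achy pain with LR of 4.
# The second LR of 2 is a made-up follow-up finding.
trail = serial_update(0.10, [4.0, 2.0])
for step, probability in enumerate(trail):
    print(f"after {step} finding(s): {probability:.1%}")
```

With no findings at all the list contains only the prevalence, which matches the moderator's summary that pretest probability equals prevalence in a situation that is sterile from data.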
Video Summary
The video transcript is a presentation on evidence-based medicine in the context of shoulder pain. The presenter discusses the importance of reliability, validity, and effectiveness in diagnostic and treatment methods. They discuss whether face validity alone is sufficient for diagnosis and the role of validation studies. Various physical exam maneuvers commonly used for diagnosing shoulder pain conditions are discussed, with the presenter highlighting the lack of strong evidence supporting their accuracy. They stress the importance of number needed to treat as a measure of effectiveness and emphasize the need for more research in this area. The presenter also introduces the concept of denervating the shoulder to treat pain and describes their own research on targeted nerve blocks. The talk concludes with a recommendation for a standardized shoulder functional scale and an invitation to always be skeptical and seek strong evidence when making clinical decisions.
Keywords
evidence-based medicine
shoulder pain
reliability
validity
effectiveness
diagnostic methods
treatment methods
physical exam maneuvers
accuracy
research