Statistics and epidemiology in medicine are important for clinicians to understand. Clinical outcomes seen in patients in clinical trials often differ from those seen in real-world, community settings. Clinical trials usually exclude patients with multiple diagnoses or comorbidities, whereas in the real world, patients commonly have multiple diagnoses and conditions. This disconnect means it can be challenging for clinicians to interpret the efficacy of treatments from clinical trials, systematic reviews, and meta-analyses.[1] Having a solid understanding of statistical principles allows you to critically appraise the literature and understand whether you should trust a study or not!
There are 4 types of validity:
Finally, you should also think about the relationship between a diagnostic test and the actual presence of disease, as determined by: sensitivity, specificity, positive predictive value, and negative predictive value.
In every research study, there is something called the null hypothesis. This is the default assumption that there is no relationship between the two variables you are looking at in your study (e.g. - no relationship between smoking and risk of lung cancer). In most studies, a researcher or experimenter will try to disprove or discredit the null hypothesis (e.g. - the researcher will try to show there is in fact an association between smoking and lung cancer).
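When one rejects or fails to reject the null hypothesis incorrectly, two types of errors can occur: a Type I error (a false positive, where a true null hypothesis is rejected) and a Type II error (a false negative, where a false null hypothesis is not rejected).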
If you were appraising a study of the association between smoking and lung cancer, you would apply the table below to see where a Type I or Type II error could occur.
| | Positive Finding in Real World | Negative Finding in Real World |
---|---|---|
Positive Finding in Study | (a) True Inference | (b) 👻 False Positive (Type I Error) |
Negative Finding in Study | (c) 👻 False Negative (Type II Error) | (d) True Inference |
If you were appraising a study of how good a test was at detecting lung cancer, you would apply the table below to see where a Type I or Type II error could occur.
| | Diseased in Real World | Non-Diseased in Real World |
---|---|---|
Positive Test Result | (a) True Positive | (b) 👻 False Positive (Type I Error) |
Negative Test Result | (c) 👻 False Negative (Type II Error) | (d) True Negative |
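The four cells of the table above can be tallied directly from paired observations. The following is a minimal sketch, assuming each patient is recorded as a test result and a true disease status; the function name `two_by_two` and the six patients are hypothetical and used only for illustration.

```python
from collections import Counter


def two_by_two(test_positive, diseased):
    """Count TP, FP, FN and TN from paired test results and true disease status."""
    if len(test_positive) != len(diseased):
        raise ValueError("both lists must describe the same patients")
    # Map (test result, true status) onto the cell names used in the table.
    labels = {
        (True, True): "TP",    # (a) test positive, disease present
        (True, False): "FP",   # (b) test positive, disease absent
        (False, True): "FN",   # (c) test negative, disease present
        (False, False): "TN",  # (d) test negative, disease absent
    }
    counts = Counter(labels[(bool(t), bool(d))] for t, d in zip(test_positive, diseased))
    return {cell: counts[cell] for cell in ("TP", "FP", "FN", "TN")}


# Six hypothetical patients, listed in the same order in both lists.
test_positive = [True, True, False, False, True, False]
diseased = [True, False, True, False, True, False]
cells = two_by_two(test_positive, diseased)
print(cells)  # {'TP': 2, 'FP': 1, 'FN': 1, 'TN': 2}
```

Every measure discussed afterwards (sensitivity, specificity, predictive values, accuracy) is a ratio of these four counts.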
Sensitivity and specificity are generally treated as fixed properties of a test. Unlike the predictive values, they do not change with the prevalence of the disease.
- **S**ensitive test when **N**egative rules **OUT** the disease (**SNOut**)
- **S**pecific test when **P**ositive rules **IN** the disease (**SPIn**)

| | Diseased in Real World | Non-Diseased in Real World |
---|---|---|
Positive Test Result | True Positive (TP) | False Positive (FP) |
Negative Test Result | False Negative (FN) | True Negative (TN) |
Measure | Sensitivity = (TP) ÷ (TP+FN) | Specificity = (TN) ÷ (FP+TN) |
Sensitivity | Specificity | Interpretation |
---|---|---|
High | Low | If our test has high sensitivity but low specificity, it will be very good at finding the disease when it is present, because it is sensitive. BUT, it will also be easily tricked into thinking the disease is there when it is not, because it is not very specific. Our results will show a lot of true positives (which is good), but also a lot of false positives (which is bad). |
Low | High | If the test has low sensitivity but high specificity, a positive result means we can be pretty sure that the disease is present. However, with a negative result, we can’t be too sure that we have a healthy patient, because the test may not have been sensitive enough to pick up the disease in that particular patient. This type of test is thus most helpful to us when it gives a positive result, because we can be pretty confident that the patient has the disease. |
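As a rough illustration of the formulas above, the sketch below computes sensitivity and specificity from the four cell counts. The counts are made up (they do not come from any study) and are chosen to resemble the "high sensitivity, low specificity" row.

```python
def sensitivity(tp, fn):
    """Proportion of truly diseased patients who test positive: TP / (TP + FN)."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Proportion of truly non-diseased patients who test negative: TN / (FP + TN)."""
    return tn / (fp + tn)


# Illustrative counts only: 100 diseased and 100 non-diseased patients.
tp, fp, fn, tn = 90, 40, 10, 60

sens = sensitivity(tp, fn)
spec = specificity(tn, fp)
print(f"Sensitivity: {sens:.0%}")  # Sensitivity: 90%
print(f"Specificity: {spec:.0%}")  # Specificity: 60%
```

With these numbers the test misses few diseased patients (10 false negatives) but wrongly flags 40 healthy ones, which is the pattern described for a sensitive but non-specific test.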
Positive Predictive Value (PPV) and Negative Predictive Value (NPV) are two further concepts that relate to Type I and Type II errors:
| | Diseased in Real World | Non-Diseased in Real World | Measure |
---|---|---|---|
Positive Test Result | TP | FP | Positive Predictive Value = (TP) ÷ (TP+FP) |
Negative Test Result | FN | TN | Negative Predictive Value = (TN) ÷ (FN+TN) |
Measure | Sensitivity = (TP) ÷ (TP+FN) | Specificity = (TN) ÷ (FP+TN) |
Accuracy is the final concept related to Type I and Type II errors, and is calculated as shown in the table below.
| | Diseased in Real World | Non-Diseased in Real World | Measure |
---|---|---|---|
Positive Test Result | TP | FP | Positive Predictive Value = (TP) ÷ (TP+FP) |
Negative Test Result | FN | TN | Negative Predictive Value = (TN) ÷ (FN+TN) |
Measure | Sensitivity = (TP) ÷ (TP+FN) | Specificity = (TN) ÷ (FP+TN) | Accuracy = (TP + TN) ÷ (TP + FN + FP + TN) |
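The row-wise and overall measures in the table can be computed in the same way as the column-wise ones. This is a small sketch under the assumption that the four counts are already known; the function name `predictive_values` and the counts are illustrative, not taken from a real study.

```python
def predictive_values(tp, fp, fn, tn):
    """Return PPV, NPV and accuracy for a 2x2 table of test results."""
    total = tp + fp + fn + tn
    return {
        "ppv": tp / (tp + fp),            # of all positive tests, how many are truly diseased
        "npv": tn / (fn + tn),            # of all negative tests, how many are truly healthy
        "accuracy": (tp + tn) / total,    # of all tests, how many are correct
    }


# Illustrative counts only.
measures = predictive_values(tp=90, fp=40, fn=10, tn=60)
for name, value in measures.items():
    print(f"{name}: {value:.1%}")
# ppv: 69.2%
# npv: 85.7%
# accuracy: 75.0%
```

Note that PPV and NPV are read across the rows of the table (by test result), whereas sensitivity and specificity are read down the columns (by true disease status).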
The terms prevalence and incidence are often used incorrectly in describing how common a disease or disorder is. Here are the correct definitions:[2]
| | Diseased in Real World | Non-Diseased in Real World | Measure |
---|---|---|---|
Positive Test Result | TP | FP | PPV |
Negative Test Result | FN | TN | NPV |
Measure | Sensitivity | Specificity | Accuracy = (TP + TN) ÷ (TP + FN + FP + TN); Prevalence = (TP + FN) ÷ (TP + FN + FP + TN) |
The prevalence of a disease or disorder can change depending on where you are looking for the disease. A disease will have a much higher prevalence in a specialist referral hospital than in a primary care setting. Take a look at the example below, which starts with a specialist clinic.
| | Diseased in Real World | Non-Diseased in Real World | Measure |
---|---|---|---|
Test (+) | 50 | 10 | PPV = 50/60 = 83% |
Test (-) | 5 | 100 | NPV = 100/105 = 95% |
Measure | Sensitivity = 50/55 = 91% | Specificity = 100/110 = 91% |
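The prevalence of the disease in this specialist clinic is the number of real cases of disease, divided by the total number of individuals: (50 + 5) ÷ (50 + 5 + 10 + 100) = 55/165 = 33%. Now compare this with the same test used in a primary care setting: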
| | Diseased in Real World | Non-Diseased in Real World | Measure |
---|---|---|---|
Test (+) | 50 | 100 | PPV = 50/150 = 33% |
Test (-) | 5 | 1000 | NPV = 1000/1005 = 99.5% |
Measure | Sensitivity = 50/55 = 91% | Specificity = 1000/1100 = 91% |
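The prevalence of the disease in the primary care setting is the number of real cases of disease, divided by the total number of individuals: (50 + 5) ÷ (50 + 5 + 100 + 1000) = 55/1155 ≈ 4.8%. Sensitivity and specificity are unchanged at 91%, but because the prevalence is so much lower, the PPV falls from 83% to 33% while the NPV rises from 95% to 99.5%.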
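The effect of prevalence on PPV in the two tables above can be checked numerically. The sketch below uses the counts from the specialist clinic and primary care examples; the function names are hypothetical, and the PPV formula is the standard Bayes' theorem rearrangement in terms of sensitivity, specificity, and prevalence, which is not spelled out in the text above.

```python
def prevalence(tp, fp, fn, tn):
    """Real cases of disease divided by the total number of individuals."""
    return (tp + fn) / (tp + fp + fn + tn)


def ppv_from_prevalence(sens, spec, prev):
    """PPV via Bayes' theorem from sensitivity, specificity and prevalence."""
    true_pos = sens * prev               # fraction of everyone who is a true positive
    false_pos = (1 - spec) * (1 - prev)  # fraction of everyone who is a false positive
    return true_pos / (true_pos + false_pos)


# Counts taken from the two example tables above.
settings = {
    "specialist clinic": dict(tp=50, fp=10, fn=5, tn=100),
    "primary care": dict(tp=50, fp=100, fn=5, tn=1000),
}

results = {}
for name, c in settings.items():
    sens = c["tp"] / (c["tp"] + c["fn"])
    spec = c["tn"] / (c["fp"] + c["tn"])
    prev = prevalence(**c)
    results[name] = (prev, ppv_from_prevalence(sens, spec, prev))
    print(f"{name}: prevalence {prev:.1%}, PPV {results[name][1]:.1%}")
# specialist clinic: prevalence 33.3%, PPV 83.3%
# primary care: prevalence 4.8%, PPV 33.3%
```

The same test, with the same sensitivity and specificity, gives a much less trustworthy positive result when the disease is rare in the population being tested.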
There are a number of different kinds of data:
For each type of data, there are different univariate and multivariable analytic methods.
“DELICACY” can be used to remember that efficacy is measured under strict, fancy, expensively run, randomized clinical trial conditions: a “delicacy”!
Once you know the basics of statistics, it is important to understand how statistics can be misinterpreted and to be able to critically appraise the literature!