1 Journal Club Curriculum How to Evaluate the Literature
2 "Data, data everywhere, but not a thought to think." -Jesse Shera "Statistical thinking will one day be as necessary for efficient citizenship as the ability to read and write." -H.G. Wells
3 Objectives How to read the literature and decide if you will adopt a practice Review important aspects of trial design Review study designs and strengths and weaknesses of both
4 To compare efficacy of…in the treatment of heavy menstrual bleeding Levonorgestrel-releasing intrauterine system or medroxyprogesterone for heavy menstrual bleeding: a randomized controlled trial. Kaunitz AM, Bissonnette F, Monteiro I, Lukkari-Lax E, Muysers C, Jensen JT. Inclusion criteria. Exclusion criteria. Randomization: centralized interactive voice response system, balanced using random permuted blocks. Variables: change in menstrual blood loss from baseline and proportion of patients in whom treatment was successful (defined). Sample size calculation. Intent to treat. Statistical analysis: absolute change evaluated with the Wilcoxon rank-sum test; proportion of patients with successful treatment analyzed with the Pearson chi-squared test. Obstet Gynecol Sep;116(3):
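As an illustration of what "balanced using random permuted blocks" means, here is a minimal sketch in Python. The block size of 4, the arm labels "LNG-IUS" and "MPA", and the function name are assumptions made for the example; the slide does not state the block size the trial used.

```python
import random

def permuted_block_allocation(n_subjects, arms=("A", "B"), block_size=4, seed=None):
    """Build a treatment sequence from randomly permuted, balanced blocks."""
    if block_size % len(arms) != 0:
        raise ValueError("block_size must be a multiple of the number of arms")
    rng = random.Random(seed)
    per_arm = block_size // len(arms)
    sequence = []
    while len(sequence) < n_subjects:
        block = list(arms) * per_arm  # each arm appears equally often in a block
        rng.shuffle(block)            # random order within the block
        sequence.extend(block)
    return sequence[:n_subjects]

# Hypothetical labels for the two arms of the trial on this slide
allocation = permuted_block_allocation(12, arms=("LNG-IUS", "MPA"), block_size=4, seed=1)
print(allocation)
```

Because every completed block contains each arm equally often, the group sizes can never drift far apart during enrollment.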
5 Abstract The purpose of the abstract is to provide a concise overview of the study. A good abstract will highlight the primary results, and make a brief statement about the significance of the findings. For original research most abstracts will contain Objective, Materials and Methods, Results, and Conclusion sections.
6 Questions to ask about the abstract In the absence of being able to read the entire article, would the abstract adequately summarize the article’s content? Are there major discrepancies between the abstract and the body of the article? Pitkin et al found that discrepancies occurred in 18–68% of the articles that they reviewed (JAMA 1999; 281:1110–1111). Does the abstract’s conclusion address the specific aim of the investigation?
7 Introduction The Introduction section of the article should provide a rationale and significance for the study and explain the study’s specific goals. Although it does not have to be in the last paragraph of the introduction, the authors’ hypothesis and study aim should be easy to identify. The incidence and prevalence of the issue being investigated should be presented. If the authors’ question is not clearly discernible, it should raise concerns about the validity of the research.
8 Questions to address while reading the introduction section What is the question that the authors are trying to address? Does the introduction provide a conceptual framework for the research question? What are the general issues surrounding the authors’ question? How does the authors’ specific question fit into what is already known about the subject?
9 Questions to address while reading the introduction section Do the authors build a logical case and context for their hypothesis? Has the authors’ specific research question previously been answered? If so, does this article add to the fund of medical knowledge? Does this article cover an important topic?
10 Materials and Methods The Materials and Methods section is, in many ways, the most important section of an article. A well-written Materials and Methods section explains the authors’ study methodology. Power analysis
11 Materials and Methods Study methodology should include information regarding: subject recruitment, including inclusion and exclusion criteria, which increase the homogeneity of the groups and improve the likelihood of reproducibility and applicability of the study in another setting; subject allocation; the intervention or test performed (including a sufficient description of the technical parameters); the methods of data analysis; and the outcome variable.
12 Screening or Diagnostic Test Was the diagnostic test or intervention evaluated in an appropriate selection of patients? For example, was the test or intervention evaluated in patients in whom it would be routinely used in clinical practice? Was the diagnostic test or intervention compared with an independent reference standard? If so, was the comparison performed in a blinded fashion? Was the reference standard applied regardless of the test result? Was the test or intervention validated in a second, independent group of subjects?
13 Is the method that the authors used a reasonable approach to answer the question? A common flaw in experimental design is that the research methodology fails to adequately test the hypothesis. The internal validity of a study refers to the study’s quality and is based on the adequacy of the research methodology. A well-designed study attempts to minimize bias and confounding factors. Did the authors conduct an intention-to-treat analysis?
14 Level of Precision The level of precision, sometimes called sampling error, is the range in which the true value of the population is estimated to be. We base our calculation on the standard deviation of our sample. The greater the sample standard deviation, the greater the standard error (and the sampling error). The standard error is also related to the sample size. The greater your sample size, the smaller the standard error.
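The relationship described on this slide can be sketched with the usual formula for the standard error of a mean, SE = SD / sqrt(n). The function name and the numbers below are illustrative assumptions, not values from any study.

```python
import math

def standard_error(sample_sd, n):
    """Standard error of the mean: sample SD divided by the square root of n."""
    return sample_sd / math.sqrt(n)

# Same SD, larger sample: the standard error shrinks
print(standard_error(10, 25))   # 2.0
print(standard_error(10, 100))  # 1.0

# Same sample size, larger SD: the standard error grows
print(standard_error(20, 100))  # 2.0
```

Note that quadrupling the sample size only halves the standard error, which is why gains in precision become expensive as studies grow.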
15 Error Random error occurs due to chance variation, causing a sample to be different from the underlying population. This affects precision but is reduced by increasing the sample size. Systematic error, or bias, is an incorrect study result due to nonrandom distortion of the data. Systematic error is due to flaws in the study design, data collection, and analysis. This affects accuracy and is not improved by increasing the sample size.
16 Error The larger the sample size, the less random error. Systematic error is not corrected by increasing sample size.
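A small simulation can make the distinction concrete. This is a sketch under assumed values: a true mean of 100, an SD of 15, and a hypothetical measurement bias of +5 units. None of these come from the slides.

```python
import random
import statistics

def simulated_mean(n, true_mean=100.0, sd=15.0, bias=0.0, seed=0):
    """Mean of n noisy measurements, each shifted by a constant systematic bias."""
    rng = random.Random(seed)
    values = [rng.gauss(true_mean, sd) + bias for _ in range(n)]
    return statistics.fmean(values)

for n in (10, 1_000, 100_000):
    unbiased = simulated_mean(n)
    biased = simulated_mean(n, bias=5.0)
    print(f"n={n:>6}  unbiased mean={unbiased:7.2f}  biased mean={biased:7.2f}")
```

As n grows, the unbiased estimate settles near 100 while the biased one settles near 105: more subjects give a more precise answer, but not a more accurate one.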
17 Bias Bias is any process or effect at any stage of a study, from its design to its execution to the application of information from the study, that produces results or conclusions that differ systematically from the truth. There are multiple types of bias, which are partially determined by the study design.
18 Types of Bias Confounding Bias: Systematic error due to the failure to account for the effect of one or more variables that are related to both the causal factor being studied and the outcome and are not distributed the same between the groups being studied. Sampling (Selection) Biases: Systematic error that occurs when, because of design and execution errors in sampling, selection, or allocation methods, the study comparisons are between groups that differ with respect to an outcome of interest for reasons other than those under study.
19 Bias Measurement Bias: Systematic error that occurs when, because of the lack of blinding or related reasons such as diagnostic suspicion, the measurement methods (instrument, or observer of instrument) are consistently different between groups in the study. Screening Bias: The bias that occurs when the presence of a disease is detected earlier during its latent period by screening tests but the course of the disease is not changed by earlier intervention.
20 Bias Reader Bias: Systematic errors of interpretation made during inference by the user or reader of clinical information (papers, test results, ...). Such biases are due to clinical experience, tradition, credentials, prejudice and human nature. The human tendency is to accept information that supports pre-conceived opinions and to reject or trivialize that which does not support preconceived opinions or that which one does not understand. (JAMA 247:2533)
21 Confounder A confounding variable (also confounding factor, lurking variable, a confound, or confounder) is an extraneous variable in a statistical model that correlates (positively or negatively) with both the dependent variable and the independent variable. The methodologies of scientific studies need to control for these factors to avoid a false positive (Type I) error, that is, an erroneous conclusion that the dependent variable is in a causal relationship with the independent variable.
22 Confounder There is no direct association between X and Y, but failing to observe Z will give a spurious association between X and Y. Z is the confounder. [Diagram showing Z, X, and Y]
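The situation on this slide can be sketched with simulated data in which Z drives both X and Y, while X has no effect on Y. The data-generating model and the use of a partial correlation to adjust for Z are assumptions chosen for illustration.

```python
import math
import random

def pearson_r(a, b):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))
    var_a = sum((ai - mean_a) ** 2 for ai in a)
    var_b = sum((bi - mean_b) ** 2 for bi in b)
    return cov / math.sqrt(var_a * var_b)

rng = random.Random(42)
n = 5000
z = [rng.gauss(0, 1) for _ in range(n)]   # the confounder
x = [zi + rng.gauss(0, 1) for zi in z]    # X depends on Z only
y = [zi + rng.gauss(0, 1) for zi in z]    # Y depends on Z only, not on X

r_xy, r_xz, r_yz = pearson_r(x, y), pearson_r(x, z), pearson_r(y, z)

# Partial correlation of X and Y after adjusting for Z
r_xy_given_z = (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz**2) * (1 - r_yz**2))

print(f"crude correlation of X and Y:      {r_xy:.2f}")
print(f"correlation after adjusting for Z: {r_xy_given_z:.2f}")
```

The crude correlation is clearly positive even though X never influences Y; once Z is accounted for, the association largely disappears.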
23 Simple linear regression
24 Variables Dependent variable = outcome variable, i.e., response to treatment. Independent variables = variables that have an impact on the dependent variable. Example: risk of cervical dysplasia (dependent variable); HPV status, high risk vs low risk, and number of sexual partners (independent variables). Interaction terms: there is interaction between HPV status and number of sexual partners.
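One way to write the example on this slide as a regression model with an interaction term is sketched below. The coefficient symbols and variable names are assumptions introduced here; the slide does not give a model equation, and for a binary outcome the same right-hand side would typically sit inside a logistic model instead.

```latex
\text{Dysplasia risk} = \beta_0 + \beta_1\,\text{HPV} + \beta_2\,\text{Partners}
  + \beta_3\,(\text{HPV} \times \text{Partners}) + \varepsilon
```

If \beta_3 is not zero, the effect of the number of partners on the outcome differs between high-risk and low-risk HPV status, which is what "interaction" means here.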
25 Results What data are presented? Do the data follow from the investigators’ methods? Is it clear where the data came from? Is it clear how the data were obtained? Are all the data presented, and are all groups accounted for? If all the subjects or groups are not accounted for, how did the authors address this issue? Did the investigators perform an intent-to-treat analysis? What do the results show? Could these results have arisen by chance?
26 Discussion The authors should state whether or not their hypothesis was verified. They should summarize the main research findings and the conclusions that can be drawn, and emphasize unique aspects of the study. This section should explain how and why these results were obtained, along with their significance. The authors should review and comment on other studies relating to their investigation and explain what differences, if any, exist between their findings and those reported in the literature.
27 Questions about the discussion What conclusions did the authors draw from the data? Would I draw the same conclusions? Are the authors’ conclusions supported and based on the methods and data? Do the conclusions drawn from the data disagree with the authors’ conclusions? If so, going back to the Results section to see where the discrepancy in interpretation occurred may be helpful. Do the results and conclusion apply to the patients in my practice?
28 Questions on methodology Is the study design suited to fulfill the aims of the study? Is it stated whether the study is confirmatory, exploratory or descriptive in nature? What type of study was chosen, and does it permit the aims of the study to be addressed? Is the study's endpoint precisely defined? What statistical measure is employed to characterize the endpoint? Dtsch Arztebl Int Feb;106(7):100-5.
29 Questions on Methodology Do epidemiological studies, for instance, give the incidence (rate of new cases), prevalence (current number of cases), mortality (proportion of the population that dies of the disease concerned), lethality (proportion of those with the disease who die of it) or the hospital admission rate (proportion of the population admitted to hospital because of the disease)? Are the geographical area, the population, the study period (including duration of follow-up), and the intervals between investigations described in detail?
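The measures named on this slide differ mainly in their numerators and denominators. The sketch below uses entirely hypothetical counts for one region over one year; the function name and the per-100,000 scaling are assumptions for illustration.

```python
def per_100k(events, population):
    """Express a count as a rate per 100,000 people."""
    return events * 100_000 / population

# Hypothetical figures for one region over one year
population = 500_000
new_cases = 250          # diagnosed during the year
existing_cases = 1_500   # living with the disease at a given time
deaths = 50              # deaths due to the disease during the year

incidence = per_100k(new_cases, population)        # new cases per 100,000 per year
prevalence = per_100k(existing_cases, population)  # current cases per 100,000
mortality = per_100k(deaths, population)           # deaths per 100,000 population
lethality = deaths / existing_cases                # proportion of those with the disease who die

print(incidence, prevalence, mortality, round(lethality, 3))
```

Mortality and lethality share a numerator but not a denominator, so a rare disease can have low mortality and high lethality at the same time.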
30 Additional Questions What is the question that the authors are trying to address? What are the general issues surrounding the research question or hypothesis? Where do the authors’ specific aims fit into what is already known about the subject? Is the topic timely and relevant?
31 Baxt et al sent a fictitious scientific manuscript with intentional errors to over 200 reviewers of a leading emergency medicine journal. Reviewer dispositions were categorized into acceptance, rejection, or revision. Planted errors considered by the researchers as “major” were missed by two thirds of reviewers. Their results received much attention from the academic community and called into question the reliability of the peer review process (Baxt et al, 2005).
32 Other Important Aspects of Study Design
33 Power The probability that a study will detect the phenomenon studied when it exists is called “power”. Power depends on group variability, size of the sample, the true size of the phenomenon being observed, and the level of significance. A good clinical study should report the calculated power of the sample, so the reader can evaluate “non-statistically significant” results.
34 Factors Affecting Power 1. Size of the effect 2. Standard deviation of the characteristic 3. Sample size (bigger samples give more power) 4. Significance level desired
It turns out that if you were to go out and sample many, many times, most sample statistics that you could calculate would follow a normal distribution. What are the 2 parameters (from last time) that define any normal distribution? Remember that a normal curve is characterized by two parameters, a mean and a variability (SD). What do you think the mean value of a sample statistic would be? The standard deviation? Remember that the standard deviation is the natural variability of the population, whereas the standard error is the variability of a sample statistic. It can be the standard error of the mean, the standard error of the odds ratio, the standard error of the difference of 2 means, etc.
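The four factors listed above can be tied together in a rough power calculation. The sketch below assumes a two-sided z-test comparing the means of two equal-sized groups with a common, known SD; that test, the function name, and the example numbers are assumptions for illustration rather than a method taken from the slides.

```python
import math
from statistics import NormalDist

def power_two_sample(effect, sd, n_per_group, alpha=0.05):
    """Approximate power of a two-sided z-test for a difference in two means."""
    z = NormalDist()
    se = sd * math.sqrt(2 / n_per_group)   # standard error of the difference
    z_crit = z.inv_cdf(1 - alpha / 2)      # critical value for the chosen alpha
    shift = abs(effect) / se               # true effect in standard-error units
    # Probability that the test statistic lands beyond either critical value
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)

base = power_two_sample(effect=5, sd=10, n_per_group=64)
print(f"baseline power:     {base:.2f}")
print(f"larger effect:      {power_two_sample(10, 10, 64):.2f}")
print(f"larger SD:          {power_two_sample(5, 20, 64):.2f}")
print(f"larger sample:      {power_two_sample(5, 10, 128):.2f}")
print(f"stricter alpha:     {power_two_sample(5, 10, 64, alpha=0.01):.2f}")
```

Each line changes one factor at a time, so the direction of its influence on power can be read off directly.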
35 Sample Size Three criteria are specified to determine the appropriate sample size: the level of precision (standard deviation); the level of confidence or risk (confidence interval); and the degree of variability in the attributes being measured (how much each measurement varies from the mean).
36 Standard Deviation: how much your population deviates from the mean
37 Type I error Type I error, also known as an "error of the first kind", an α error, or a "false positive": the error of rejecting a null hypothesis when it is actually true. It occurs when we observe a difference when in truth there is none, thus indicating a test of poor specificity. An example of this would be if a test shows that a woman is pregnant when in reality she is not, or telling a patient he is sick when in fact he is not.
38 Type II error Type II error, also known as an "error of the second kind", a β error, or a "false negative": the error of failing to reject a null hypothesis when in fact we should have rejected it. In other words, this is the error of failing to observe a difference when in truth there is one, thus indicating a test of poor sensitivity. An example of this would be if a test shows that a woman is not pregnant, when in reality, she is.
39 [Figure labels: Control, Disease, False negative, False positive]
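The link the preceding slides draw between false positives and specificity, and between false negatives and sensitivity, can be sketched from a 2x2 table. The counts and the function name below are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def diagnostic_rates(tp, fp, fn, tn):
    """Sensitivity, specificity, and error rates from the cells of a 2x2 table."""
    return {
        "sensitivity": tp / (tp + fn),          # diseased correctly detected
        "specificity": tn / (tn + fp),          # controls correctly cleared
        "false_positive_rate": fp / (fp + tn),  # 1 - specificity
        "false_negative_rate": fn / (fn + tp),  # 1 - sensitivity
    }

# Hypothetical test results: 100 people with the disease, 200 controls
rates = diagnostic_rates(tp=90, fp=20, fn=10, tn=180)
for name, value in rates.items():
    print(f"{name}: {value:.2f}")
```

A test with many false positives therefore has low specificity, and a test with many false negatives has low sensitivity.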
40 Confidence Interval The Confidence Level The confidence or risk level is based on ideas encompassed under the Central Limit Theorem. The key idea is that when a population is repeatedly sampled, the average of the values obtained by those samples approaches the true population value, and the sample averages are distributed approximately normally around that value.
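Repeated sampling is easy to simulate. The sketch below draws many samples from a deliberately skewed population (an exponential distribution with mean 1 and SD 1, an assumption made for the example) and looks at how the sample means behave.

```python
import random
import statistics

rng = random.Random(7)
n = 50            # size of each sample
n_samples = 2000  # number of repeated samples

# Mean of each repeated sample drawn from the skewed population
sample_means = [
    statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
    for _ in range(n_samples)
]

mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.stdev(sample_means)
expected_se = 1.0 / n ** 0.5  # population SD divided by sqrt(n)

print(f"mean of sample means: {mean_of_means:.3f} (population mean is 1.0)")
print(f"SD of sample means:   {sd_of_means:.3f} (SD / sqrt(n) is {expected_se:.3f})")
```

Even though individual values are strongly skewed, the sample means cluster around the population mean with a spread close to the standard error.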
41 Degree of Variability The degree of variability in the attributes being measured refers to the distribution of attributes in the population. The more heterogeneous a population, the larger the sample size required to obtain a given level of precision. The less variable (more homogeneous) a population, the smaller the sample size required.
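The three criteria (precision, confidence, variability) can be combined in a common textbook formula for the sample size needed to estimate a mean, n = (z × SD / E)², where E is the acceptable margin of error. This formula, the function name, and the example numbers are assumptions for illustration; the slides do not specify which calculation they have in mind.

```python
import math
from statistics import NormalDist

def sample_size_for_mean(sd, margin, confidence=0.95):
    """Smallest n so that the confidence interval half-width is at most `margin`."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # e.g. about 1.96 for 95%
    return math.ceil((z * sd / margin) ** 2)

print(sample_size_for_mean(sd=15, margin=3))                   # 97
print(sample_size_for_mean(sd=30, margin=3))                   # 385
print(sample_size_for_mean(sd=15, margin=3, confidence=0.99))
```

Doubling the variability roughly quadruples the required sample size, which matches the point that more heterogeneous populations need larger samples.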
42 How Do Articles Get Published?
43 Reviewers' responsibilities to authors Provide written, honest, and unbiased feedback in a timely manner. Express a critical opinion about the manuscript, as experts in the field, in a collegial and constructive manner. Comment on the style of writing, especially its clarity. Rate the work's detail, methodology, relevance, accuracy, and originality. Avoid comments or criticism of a personal nature. Maintain professionalism and confidentiality, especially given the competitive nature of research, funding availability, and publication. Perm J (1):32-40
44 Ten qualities of a good reviewer 1. Competence (and/or expertise) in the field 2. Consistency (within and between reviews) 3. Confidentiality 4. Responsibility in feedback (constructive, educational, unbiased) 5. Knowledge of the scientific process (research and writing) 6. Integrity 7. Impartiality 8. Timeliness (punctuality) 9. Detail orientation 10. Outstanding language skills. J Gen Intern Med 1993 Aug;8(8):422–8
45 Reviewers' responsibilities to the readers Ensure that published articles adhere to journal standards, as well as to standards of scientific practice. Protect readers from incorrect or flawed research. Identify missed references or erroneous citations (including misquoting or misinterpreting an author's findings).
46 Reviewers may be more likely to reject manuscripts with negative results (referred to as publication bias) that neither describe nor discuss novel ideas or novel results. Statistical reviewers may identify more appropriate statistical tests or methods, or even study designs, that better suit the data collected and the research question being investigated.
47 How to read the literature and decide if you will adopt a practice Read the abstract and decide if you are interested. Does the introduction state a hypothesis? Read the Materials and Methods: is the study design appropriate for the question asked? Is there a control group that is comparable to the study group? Is the statistical approach reasonable? What biases and confounders are inherent in the study design, and do they invalidate the findings? Do the data support the conclusions reached? Do the authors state conclusions that were not tested?