Division of Pharmacoepidemiology and Pharmacoeconomics Monthly Methods Literature Scan

Each month members of the Division scan and review the literature to identify new papers describing epidemiological and statistical methods that may be relevant to ongoing or future work in the Division.

Current and previous scans may be found below:

April 2016

March 2016

February 2016

January 2016

December 2015

November 2015

October 2015

September 2015

August 2015

July 2015

June 2015

May 2015

April 2015

March 2015

February 2015

January 2015

April 2016

1. BMC Med Res Methodol. 2016 Apr 27;16(1):47. doi: 10.1186/s12874-016-0146-y.

A scoping review of indirect comparison methods and applications using individual patient data.

Veroniki AA(1), Straus SE(1,)(2), Soobiah C(1,)(3), Elliott MJ(1), Tricco AC(4,)(5).
(1)Li Ka Shing Knowledge Institute, St. Michael’s Hospital, 209 Victoria Street, East Building, Toronto, ON, M5B 1T8, Canada. (2)Department of Geriatric Medicine, Faculty of Medicine, University of Toronto, 27 King’s College Circle, Toronto, ON, M5S 1A1, Canada. (3)Institute of Health Policy, Management and Evaluation, University of Toronto, Health Sciences Building, 155 College Street, 4th floor, Toronto, ON, M5T 3M6, Canada. (4)Li Ka Shing Knowledge Institute, St. Michael’s Hospital, 209 Victoria Street, East Building, Toronto, ON, M5B 1T8, Canada. triccoa@smh.ca. (5)Epidemiology Division, Dalla Lana School of Public Health, University of Toronto, 155 College Street, 6th floor, Toronto, ON, M5T 3M7, Canada. triccoa@smh.ca.

BACKGROUND: Several indirect comparison methods, including network meta-analyses (NMAs), using individual patient data (IPD) have been developed to synthesize evidence from a network of trials. Although IPD indirect comparisons are published with increasing frequency in health care literature, there is no guidance on selecting the appropriate methodology and on reporting the methods and results.
METHODS: In this paper we examine the methods and reporting of indirect comparison methods using IPD. We searched MEDLINE, Embase, the Cochrane Library, and CINAHL from inception until October 2014. We included published and unpublished studies reporting a method, application, or review of indirect comparisons using IPD and at least three interventions.
RESULTS: We identified 37 papers, including a total of 33 empirical networks. Of these, only 9 (27 %) IPD-NMAs reported the existence of a study protocol, whereas 3 (9 %) studies mentioned that protocols existed without providing a reference. The 33 empirical networks included 24 (73 %) IPD-NMAs and 9 (27 %) matching adjusted indirect comparisons (MAICs). Of the 21 (64 %) networks with at least one closed loop, 19 (90 %) were IPD-NMAs, 13 (68 %) of which evaluated the prerequisite consistency assumption, and only 5 (38 %) of the 13 IPD-NMAs used statistical approaches. The median number of trials included per network was 10 (IQR 4-19) (IPD-NMA: 15 [IQR 8-20]; MAIC: 2 [IQR 3-5]), and the median number of IPD trials included in a network was 3 (IQR 1-9) (IPD-NMA: 6 [IQR 2-11]; MAIC: 2 [IQR 1-2]). Half of the networks (17; 52 %) applied Bayesian hierarchical models (14 one-stage, 1 two-stage, 1 used IPD as an informative prior, 1 unclear-stage), including either IPD alone or with aggregated data (AD). Models for dichotomous and continuous outcomes were available (IPD alone or combined with AD), as were models for time-to-event data (IPD combined with AD).
CONCLUSIONS: One in three indirect comparison methods modeling IPD adjusted results from different trials to estimate effects as if they had come from the same randomized population. Key methodological and reporting elements (e.g., evaluation of consistency, existence of a study protocol) were often missing from indirect comparison papers. PMCID: PMC4847203 PMID: 27116943 [PubMed – in process]

2. Stat Methods Med Res. 2016 Apr 25. pii: 0962280216642264. [Epub ahead of print]

Assessing methods for dealing with treatment switching in clinical trials: A follow-up simulation study.

Latimer NR(1), Abrams KR(2), Lambert PC(3), Morden JP(4), Crowther MJ(3).
(1)School of Health and Related Research, University of Sheffield, UK n.latimer@shef.ac.uk. (2)Department of Health Sciences, University of Leicester, UK. (3)Department of Health Sciences, University of Leicester, UK Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden. (4)Clinical Trials and Statistics Unit (ICR-CTSU), Division of Clinical Studies, The Institute of Cancer Research, London, UK.

When patients randomised to the control group of a randomised controlled trial are allowed to switch onto the experimental treatment, intention-to-treat analyses of the treatment effect are confounded because the separation of randomised groups is lost. Previous research has investigated statistical methods that aim to estimate the treatment effect that would have been observed had this treatment switching not occurred and has demonstrated their performance in a limited set of scenarios. Here, we investigate these methods in a new range of realistic scenarios, allowing conclusions to be made based upon a broader evidence base. We simulated randomised controlled trials incorporating prognosis-related treatment switching and investigated the impact of sample size, reduced switching proportions, disease severity, and alternative data-generating models on the performance of adjustment methods, assessed through a comparison of bias, mean squared error, and coverage, related to the estimation of true restricted mean survival in the absence of switching in the control group. Rank preserving structural failure time models, inverse probability of censoring weights, and two-stage methods consistently produced less bias than the intention-to-treat analysis. The switching proportion was confirmed to be a key determinant of bias: sample size and censoring proportion were relatively less important. It is critical to determine the size of the treatment effect in terms of an acceleration factor (rather than a hazard ratio) to provide information on the likely bias associated with rank-preserving structural failure time model adjustments. In general, inverse probability of censoring weight methods are more volatile than other adjustment methods. © The Author(s) 2016. PMID: 27114326 [PubMed – as supplied by publisher]
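 
As a rough orientation to the rank-preserving structural failure time idea mentioned above, the Python sketch below grid-searches for a log acceleration factor psi that balances counterfactual untreated survival times across randomized arms. It is a simplified illustration on invented data: censoring and recensoring, which matter greatly in practice, are ignored, the balance criterion is a crude mean comparison rather than a rank test, and sign conventions for psi differ across papers.

import numpy as np

rng = np.random.default_rng(0)
n = 500

# Hypothetical trial data: arm (1 = experimental), time on experimental treatment,
# and time off treatment (control patients may switch part-way through follow-up).
arm = rng.integers(0, 2, n)
time_on = np.where(arm == 1, rng.exponential(12, n),
                   rng.exponential(4, n) * rng.integers(0, 2, n))  # some controls switch
time_off = np.where(arm == 1, 0.0, rng.exponential(8, n))

def counterfactual_time(psi):
    # RPSFT-style counterfactual untreated time: U = T_off + exp(psi) * T_on
    return time_off + np.exp(psi) * time_on

def imbalance(psi):
    # Crude balance criterion: difference in mean counterfactual time by randomized arm.
    u = counterfactual_time(psi)
    return abs(u[arm == 1].mean() - u[arm == 0].mean())

grid = np.linspace(-2, 2, 401)
psi_hat = grid[np.argmin([imbalance(p) for p in grid])]
print("estimated log acceleration factor:", round(psi_hat, 3))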

3. Clin Trials. 2016 Apr 19. pii: 1740774516643297. [Epub ahead of print]

Subpopulation Treatment Effect Pattern Plot (STEPP) analysis for continuous, binary, and count outcomes.

Yip WK(1), Bonetti M(2), Cole BF(3), Barcella W(4), Wang XV(5), Lazar A(6), Gelber RD(5).
(1)Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, Boston, MA, USA Department of Biostatistics, Harvard TH Chan School of Public Health, Boston, MA, USA wyip@jimmy.harvard.edu. (2)Carlo F. Dondena Centre for Research on Social Dynamics and Public Policy, Bocconi University, Milan, Italy. (3)Department of Mathematics and Statistics, University of Vermont, Burlington, VT, USA. (4)Department of Statistical Science, University College London, London, UK. (5)Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, Boston, MA, USA Department of Biostatistics, Harvard TH Chan School of Public Health, Boston, MA, USA. (6)Division of Oral Epidemiology, Department of Preventive and Restorative Dental Sciences and Division of Biostatistics, Department of Epidemiology and Biostatistics, University of California-San Francisco, San Francisco, CA, USA.

BACKGROUND: For the past few decades, randomized clinical trials have provided evidence for effective treatments by comparing several competing therapies. Their successes have led to numerous new therapies to combat many diseases. However, since their conclusions are based on the entire cohort in the trial, the treatment recommendation is for everyone, and may not be the best option for an individual. Medical research is now focusing more on providing personalized care for patients, which requires investigating how patient characteristics, including novel biomarkers, modify the effect of current treatment modalities. This is known as heterogeneity of treatment effects. A better understanding of the interaction between treatment and patient-specific prognostic factors will enable practitioners to expand the availability of tailored therapies, with the ultimate goal of improving patient outcomes. The Subpopulation Treatment Effect Pattern Plot (STEPP) approach was developed to allow researchers to investigate the heterogeneity of treatment effects on survival outcomes across values of a (continuously measured) covariate, such as a biomarker measurement.
METHODS: Here, we extend the Subpopulation Treatment Effect Pattern Plot approach to continuous, binary, and count outcomes, which can be easily modeled using generalized linear models. With this extension of Subpopulation Treatment Effect Pattern Plot, these additional types of treatment effects within subpopulations defined with respect to a covariate of interest can be estimated, and the statistical significance of any observed heterogeneity of treatment effect can be assessed using permutation tests. The desirable feature that commonly used models are applied to well-defined patient subgroups to estimate treatment effects is retained in this extension.
RESULTS: We describe a simulation study to confirm that the proper Type I error rate is maintained when there is no treatment heterogeneity, and a power study to show that the statistics have power to detect treatment heterogeneity under alternative scenarios. As an illustration, we apply the methods to data from the Aspirin/Folate Polyp Prevention Study, a clinical trial evaluating the effect of oral aspirin, folic acid, or both as a chemoprevention agent against colorectal adenomas. The pre-existing R software package stepp has been extended to handle continuous, binary, and count data using Gaussian, Bernoulli, and Poisson models, and it is available on the Comprehensive R Archive Network.
CONCLUSION: The extension of the method and the availability of new software now permit STEPP to be applied to the full range of clinical trial end points. © The Author(s) 2016. PMID: 27094489 [PubMed – as supplied by publisher]
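 
To illustrate the sliding-window idea behind STEPP for one of the newly supported endpoints (a binary outcome), the sketch below forms overlapping covariate-defined subpopulations and fits a logistic GLM in each window. The data, window sizes, and omission of the permutation test are our own simplifications; this is not the stepp R package's implementation.

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 2000
biomarker = rng.uniform(0, 10, n)                      # continuous covariate of interest
treat = rng.integers(0, 2, n)
# Hypothetical outcome: treatment benefit grows with the biomarker value.
p = 1 / (1 + np.exp(-(-1.0 + 0.3 * treat * (biomarker - 5) / 5)))
y = rng.binomial(1, p)

# Overlapping subpopulations: windows of n2 patients, advancing by n2 - n1 patients.
order = np.argsort(biomarker)
n1, n2 = 400, 600                                      # overlap and window size (tuning choices)
effects, centers = [], []
start = 0
while start + n2 <= n:
    idx = order[start:start + n2]
    X = sm.add_constant(treat[idx].astype(float))
    fit = sm.GLM(y[idx], X, family=sm.families.Binomial()).fit()
    effects.append(fit.params[1])                      # log odds ratio in this subpopulation
    centers.append(np.median(biomarker[idx]))
    start += n2 - n1
print(list(zip(np.round(centers, 2), np.round(effects, 2))))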

4. Stat Med. 2016 Apr 12. doi: 10.1002/sim.6958. [Epub ahead of print]

Meta-STEPP: subpopulation treatment effect pattern plot for individual patient data meta-analysis.

Wang XV(1,)(2), Cole B(3), Bonetti M(4), Gelber RD(1,)(2).
(1)Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, 450 Brookline Avenue, Boston, MA, 02215, U.S.A. (2)Department of Biostatistics, Harvard T. H. Chan School of Public Health, 655 Huntington Avenue, Boston, MA 02215, U.S.A. (3)Department of Mathematics and Statistics, University of Vermont, 16 Colchester Avenue, Burlington, VT 05401, U.S.A. (4)Bocconi University and Carlo F. Dondena Centre for Research on Social Dynamics and Public Policy, Via Röntgen 1, 20136 Milan, Italy.

We have developed a method, called Meta-STEPP (subpopulation treatment effect pattern plot for meta-analysis), to explore treatment effect heterogeneity across covariate values in the meta-analysis setting for time-to-event data when the covariate of interest is continuous. Meta-STEPP forms overlapping subpopulations from individual patient data containing similar numbers of events with increasing covariate values, estimates subpopulation treatment effects using standard fixed-effects meta-analysis methodology, displays the estimated subpopulation treatment effect as a function of the covariate values, and provides a statistical test to detect possibly complex treatment-covariate interactions. Simulation studies show that this test has adequate type-I error rate recovery as well as power when reasonable window sizes are chosen. When applied to eight breast cancer trials, Meta-STEPP suggests that chemotherapy is less effective for tumors with high estrogen receptor expression compared with those with low expression. Copyright © 2016 John Wiley & Sons, Ltd. PMID: 27073066 [PubMed – as supplied by publisher]

5. PLoS One. 2016 Apr 5;11(4):e0153010. doi: 10.1371/journal.pone.0153010. eCollection 2016.

Quantification of Treatment Effect Modification on Both an Additive and Multiplicative Scale.

Girerd N(1), Rabilloud M(2), Pibarot P(3), Mathieu P(4), Roy P(2).
(1)INSERM, Centre d’Investigations Cliniques 1433, Université de Lorraine, CHU de Nancy, Institut Lorrain du cœur et des vaisseaux, Nancy, France. (2)Hospices Civils de Lyon, Service de Biostatistiques, Lyon, F-69003, France, Université de Lyon, Lyon, F-69000, France, Université Lyon I, Villeurbanne, F-69100, France, CNRS, UMR5558, Laboratoire de Biométrie et Biologie Evolutive, Equipe Biostatistiques Santé, Villeurbanne, F-69100, France. (3)Department of Medicine, Laval University, Québec, Canada. (4)Department of Surgery, Laval University, Quebec, Canada.

BACKGROUND: In both observational and randomized studies, associations with overall survival are by and large assessed on a multiplicative scale using the Cox model. However, clinicians and clinical researchers have an ardent interest in assessing absolute benefit associated with treatments. In older patients, some studies have reported lower relative treatment effect, which might translate into similar or even greater absolute treatment effect given their high baseline hazard for clinical events.
METHODS: The effect of treatment and the effect modification of treatment were respectively assessed using a multiplicative and an additive hazard model in an analysis adjusted for propensity score in the context of coronary surgery.
RESULTS: The multiplicative model yielded a lower relative hazard reduction with bilateral internal thoracic artery grafting in older patients (Hazard ratio for interaction/year = 1.03, 95%CI: 1.00 to 1.06, p = 0.05) whereas the additive model reported a similar absolute hazard reduction with increasing age (Delta for interaction/year = 0.10, 95%CI: -0.27 to 0.46, p = 0.61). The number needed to treat derived from the propensity score-adjusted multiplicative model was remarkably similar at the end of the follow-up in patients aged ≤60 and in patients aged >70.
CONCLUSIONS: The present example demonstrates that a lower treatment effect in older patients on a relative scale can conversely translate into a similar treatment effect on an additive scale due to large baseline hazard differences. Importantly, absolute risk reduction, either crude or adjusted, can be calculated from multiplicative survival models. We advocate for a wider use of the absolute scale, especially using additive hazard models, to assess treatment effect and treatment effect modification. PMCID: PMC4821587 PMID: 27045168 [PubMed – in process]
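 
To make the additive-versus-multiplicative point concrete, the short calculation below (illustrative numbers only, not the study's data) converts a fixed hazard ratio into absolute risk reductions and numbers needed to treat at two different baseline risks, using the proportional-hazards relation S1(t) = S0(t)**HR.

# Same relative effect, different absolute effect when baseline risk differs.
hr = 0.80                      # hypothetical hazard ratio, assumed constant across age groups

for label, s0 in [("younger, low baseline risk", 0.95),
                  ("older, high baseline risk", 0.75)]:
    s1 = s0 ** hr              # treated survival at time t under proportional hazards
    arr = s1 - s0              # absolute risk reduction at time t
    print(f"{label}: ARR = {arr:.3f}, NNT = {1 / arr:.0f}")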

6. Am J Epidemiol. 2016 Apr 1. pii: kwv302. [Epub ahead of print]

Comparison of Calipers for Matching on the Disease Risk Score.

Connolly JG, Gagne JJ.

Previous studies have compared calipers for propensity score (PS) matching, but none have considered calipers for matching on the disease risk score (DRS). We used Medicare claims data to perform 3 cohort studies of medication initiators: a study of raloxifene versus alendronate in 1-year nonvertebral fracture risk, a study of cyclooxygenase 2 inhibitors versus nonselective nonsteroidal antiinflammatory medications in 6-month gastrointestinal bleeding, and a study of simvastatin + ezetimibe versus simvastatin alone in 6-month cardiovascular outcomes. The study periods for each cohort were 1998 through 2005, 1999 through 2002, and 2004 through 2005, respectively. In each cohort, we calculated 1) a DRS, 2) a prognostic PS which included the DRS as the independent variable in a PS model, and 3) the PS for each patient. We then nearest-neighbor matched on each score in a variable ratio and a fixed ratio within 8 calipers based on the standard deviation of the logit and the natural score scale. When variable ratio matching on the DRS, a caliper of 0.05 on the natural scale performed poorly when the outcome was rare. The prognostic PS did not appear to offer any consistent practical benefits over matching on the DRS directly. In general, logit-based calipers or calipers smaller than 0.05 on the natural scale performed well when DRS matching in all examples. © The Author 2016. Published by Oxford University Press on behalf of the Johns Hopkins Bloomberg School of Public Health. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com. PMID: 27037270 [PubMed – as supplied by publisher]
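 
A simplified sketch of 1:1 greedy nearest-neighbor matching on a disease risk score within a caliper, on either the natural or the logit scale, is given below. The scores, caliper widths, and matching order are hypothetical choices for illustration and are not the authors' matching code.

import numpy as np
from scipy.special import logit

def caliper_match(score_t, score_c, caliper, scale="logit"):
    """Greedy 1:1 nearest-neighbor matching of treated to comparison patients."""
    st = logit(score_t) if scale == "logit" else np.asarray(score_t, float)
    sc = logit(score_c) if scale == "logit" else np.asarray(score_c, float)
    if scale == "logit":
        # Caliper expressed as a multiple of the SD of the logit of the score.
        caliper = caliper * np.std(np.concatenate([st, sc]))
    available = np.ones(len(sc), dtype=bool)
    pairs = []
    for i in np.argsort(st):                 # match in order of the treated scores
        d = np.abs(sc - st[i])
        d[~available] = np.inf
        j = int(np.argmin(d))
        if d[j] <= caliper:
            pairs.append((i, j))
            available[j] = False
    return pairs

rng = np.random.default_rng(2)
drs_treated = rng.beta(2, 8, 300)            # hypothetical disease risk scores
drs_comparison = rng.beta(2, 10, 3000)
print(len(caliper_match(drs_treated, drs_comparison, caliper=0.2, scale="logit")))
print(len(caliper_match(drs_treated, drs_comparison, caliper=0.05, scale="natural")))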

7. Int J Epidemiol. 2016 Apr 20. pii: dyw040. [Epub ahead of print]

Outcome modelling strategies in epidemiology: traditional methods and basic alternatives.

Greenland S(1), Daniel R(2), Pearce N(3).
(1)Department of Epidemiology and Department of Statistics, University of California, Los Angeles, CA, USA. (2)Department of Medical Statistics, London School of Hygiene and Tropical Medicine, London, UK. (3)Department of Medical Statistics, London School of Hygiene and Tropical Medicine, London, UK Centre for Public Health Research, Massey University, Wellington, New Zealand neil.pearce@lshtm.ac.uk.

Controlling for too many potential confounders can lead to or aggravate problems of data sparsity or multicollinearity, particularly when the number of covariates is large in relation to the study size. As a result, methods to reduce the number of modelled covariates are often deployed. We review several traditional modelling strategies, including stepwise regression and the ‘change-in-estimate’ (CIE) approach to deciding which potential confounders to include in an outcome-regression model for estimating effects of a targeted exposure. We discuss their shortcomings, and then provide some basic alternatives and refinements that do not require special macros or programming. Throughout, we assume the main goal is to derive the most accurate effect estimates obtainable from the data and commercial software. Allowing that most users must stay within standard software packages, this goal can be roughly approximated using basic methods to assess, and thereby minimize, mean squared error (MSE). © The Author 2016. Published by Oxford University Press on behalf of the International Epidemiological Association. PMID: 27097747 [PubMed – as supplied by publisher]
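 
For readers unfamiliar with the change-in-estimate idea discussed here, the sketch below shows one common backward-deletion variant: drop each candidate covariate in turn and retain those whose removal shifts the exposure coefficient by more than 10%. The simulated data, variable names, and the 10% cutoff are illustrative assumptions, not the authors' recommended procedure.

import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(3)
n = 3000
df = pd.DataFrame({f"c{k}": rng.normal(size=n) for k in range(5)})
df["exposure"] = rng.binomial(1, 1 / (1 + np.exp(-(0.6 * df["c0"] + 0.4 * df["c1"]))))
df["y"] = rng.binomial(1, 1 / (1 + np.exp(-(-1 + 0.5 * df["exposure"] + 0.7 * df["c0"]))))

candidates = [f"c{k}" for k in range(5)]

def exposure_coef(covs):
    X = sm.add_constant(df[["exposure"] + covs])
    return sm.Logit(df["y"], X).fit(disp=0).params["exposure"]

full = exposure_coef(candidates)
keep = []
for c in candidates:
    reduced = exposure_coef([v for v in candidates if v != c])
    if abs(reduced - full) / abs(full) > 0.10:   # >10% change-in-estimate: keep as confounder
        keep.append(c)
print("full-model exposure log-odds ratio:", round(full, 3))
print("covariates retained by change-in-estimate:", keep)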

8. Emerg Themes Epidemiol. 2016 Apr 5;13:5. doi: 10.1186/s12982-016-0047-x. eCollection 2016.

Dimension reduction and shrinkage methods for high dimensional disease risk scores in historical data.

Kumamaru H(1), Schneeweiss S(2), Glynn RJ(2), Setoguchi S(3), Gagne JJ(2).
(1)Division of Pharmacoepidemiology and Pharmacoeconomics, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, 1620 Tremont Street (Suite 3030), Boston, MA 02120 USA ; Department of Healthcare Quality Assessment, Graduate School of Medicine, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo, 113-8654 Japan. (2)Division of Pharmacoepidemiology and Pharmacoeconomics, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, 1620 Tremont Street (Suite 3030), Boston, MA 02120 USA. (3)Duke Clinical Research Institute, Duke University, 2400 Pratt Street, Durham, NC 27705 USA.

BACKGROUND: Multivariable confounder adjustment in comparative studies of newly marketed drugs can be limited by small numbers of exposed patients and even fewer outcomes. Disease risk scores (DRSs) developed in historical comparator drug users before the new drug entered the market may improve adjustment. However, in a high dimensional data setting, empirical selection of hundreds of potential confounders and modeling of DRS even in the historical cohort can lead to over-fitting and reduced predictive performance in the study cohort. We propose the use of combinations of dimension reduction and shrinkage methods to overcome this problem, and compared the performances of these modeling strategies for implementing high dimensional (hd) DRSs from historical data in two empirical study examples of newly marketed drugs versus comparator drugs after the new drugs’ market entry: dabigatran versus warfarin for the outcome of major hemorrhagic events, and cyclooxygenase-2 inhibitors (coxibs) versus nonselective non-steroidal anti-inflammatory drugs (nsNSAIDs) for gastrointestinal bleeds.
RESULTS: Historical hdDRSs that included predefined and empirical outcome predictors with dimension reduction (principal component analysis; PCA) and shrinkage (lasso and ridge regression) approaches had higher c-statistics (0.66 for the PCA model, 0.64 for the PCA + ridge and 0.65 for the PCA + lasso models in the warfarin users) than an unreduced model (c-statistic, 0.54) in the dabigatran example. The odds ratio (OR) from PCA + lasso hdDRS-stratification [OR, 0.64; 95 % confidence interval (CI) 0.46-0.90] was closer to the benchmark estimate (0.93) from a randomized trial than the model without empirical predictors (OR, 0.58; 95 % CI 0.41-0.81). In the coxibs example, c-statistics of the hdDRSs in the nsNSAID initiators were 0.66 for the PCA model, 0.67 for the PCA + ridge model, and 0.67 for the PCA + lasso model; these were higher than for the unreduced model (c-statistic, 0.45), and comparable to the demographics + risk score model (c-statistic, 0.67).
CONCLUSIONS: Historical hdDRSs with dimension reduction and shrinkage were feasible and improved confounding adjustment in two studies of newly marketed medications. PMCID: PMC4822311 PMID: 27053942 [PubMed]
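 
A schematic of the historical hdDRS workflow described above: fit an outcome model among historical comparator-drug users using PCA for dimension reduction and an L1-penalized (lasso-type) logistic regression, then score patients in the concurrent study cohort and stratify on the predicted risk. The simulated data, number of components, and penalty strength are placeholders, not the authors' pipeline.

import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(4)
p = 500                                              # many empirically identified covariates
X_hist = rng.binomial(1, 0.1, size=(20000, p))       # historical comparator-drug users
y_hist = rng.binomial(1, 0.02, size=20000)           # outcome in the historical cohort
X_study = rng.binomial(1, 0.1, size=(5000, p))       # concurrent new-drug vs old-drug cohort

# Historical hdDRS: PCA for dimension reduction, then a penalized logistic outcome model.
hddrs_model = make_pipeline(
    StandardScaler(),
    PCA(n_components=50),
    LogisticRegression(penalty="l1", solver="liblinear", C=0.1),
)
hddrs_model.fit(X_hist, y_hist)

# Apply to the study cohort: the predicted outcome probability is the disease risk score,
# which can then be stratified (e.g., deciles) for confounding adjustment.
drs = hddrs_model.predict_proba(X_study)[:, 1]
deciles = np.quantile(drs, np.linspace(0.1, 0.9, 9))
strata = np.digitize(drs, deciles)
print(np.bincount(strata))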

March 2016

1. Biometrics. 2016 Mar 17. doi: 10.1111/biom.12505. [Epub ahead of print]

Propensity score matching and subclassification in observational studies with multi-level treatments.

Yang S(1), Imbens GW(2), Cui Z(3), Faries DE(3), Kadziola Z(4).
(1)Department of Biostatistics, Harvard T. H. Chan School of Public Health, Boston, Massachusetts 02115, U.S.A. (2)Graduate School of Business, Stanford University and NBER, Stanford, California 94305, U.S.A. (3)Real World Analytics, Eli Lilly and Company, Indianapolis, Indiana 46285, U.S.A. (4)Real World Analytics, Eli Lilly and Company, Vienna, Austria.

In this article, we develop new methods for estimating average treatment effects in observational studies, in settings with more than two treatment levels, assuming unconfoundedness given pretreatment variables. We emphasize propensity score subclassification and matching methods which have been among the most popular methods in the binary treatment literature. Whereas the literature has suggested that these particular propensity-based methods do not naturally extend to the multi-level treatment case, we show, using the concept of weak unconfoundedness and the notion of the generalized propensity score, that adjusting for a scalar function of the pretreatment variables removes all biases associated with observed pretreatment variables. We apply the proposed methods to an analysis of the effect of treatments for fibromyalgia. We also carry out a simulation study to assess the finite sample performance of the methods relative to previously proposed methods. © 2016, The International Biometric Society. PMID: 26991040 [PubMed – as supplied by publisher]
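 
A bare-bones illustration of the generalized propensity score idea for a three-level treatment follows: fit a multinomial model for treatment, then subclassify on the scalar score r(t, X) = P(T = t | X) for each pairwise comparison. Everything below is simulated and greatly simplified relative to the estimators in the paper.

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(5)
n, p = 5000, 6
X = rng.normal(size=(n, p))
treat = rng.choice([0, 1, 2], size=n, p=[0.5, 0.3, 0.2])   # three treatment levels
y = rng.normal(0.3 * treat + X[:, 0], 1.0)

# Generalized propensity scores r(t, X) from a multinomial logistic model.
gps = LogisticRegression(multi_class="multinomial", max_iter=1000).fit(X, treat)
r = gps.predict_proba(X)                                    # n x 3 matrix of scores

def pairwise_effect(t1, t0, n_strata=5):
    """Subclassify on r(t1, X) among patients receiving t1 or t0, then average stratum effects."""
    keep = np.isin(treat, [t1, t0])
    score = r[keep, t1]
    edges = np.quantile(score, np.linspace(0, 1, n_strata + 1))
    strata = np.clip(np.digitize(score, edges[1:-1]), 0, n_strata - 1)
    diffs, sizes = [], []
    for s in range(n_strata):
        m = strata == s
        y_s, t_s = y[keep][m], treat[keep][m]
        if (t_s == t1).any() and (t_s == t0).any():
            diffs.append(y_s[t_s == t1].mean() - y_s[t_s == t0].mean())
            sizes.append(m.sum())
    return np.average(diffs, weights=sizes)

print("estimated effect of treatment 2 vs 0:", round(pairwise_effect(2, 0), 3))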

2. Stat Methods Med Res. 2016 Mar 17. pii: 0962280216628900. [Epub ahead of print]

Correcting for dependent censoring in routine outcome monitoring data by applying the inverse probability censoring weighted estimator.

Willems S(1), Schat A(2), van Noorden MS(2), Fiocco M(3).
(1)Mathematical Institute, Leiden University, Leiden, The Netherlands s.j.w.willems@math.leidenuniv.nl. (2)Department of Psychiatry, Leiden University Medical Centre, Leiden, The Netherlands. (3)Mathematical Institute, Leiden University, Leiden, The Netherlands Department of Medical Statistics and Bioinformatics, Leiden University Medical Center, Leiden, The Netherlands.

Censored data make survival analysis more complicated because exact event times are not observed. Statistical methodology developed to account for censored observations assumes that patients’ withdrawal from a study is independent of the event of interest. However, in practice, some covariates might be associated to both lifetime and censoring mechanism, inducing dependent censoring. In this case, standard survival techniques, like Kaplan-Meier estimator, give biased results. The inverse probability censoring weighted estimator was developed to correct for bias due to dependent censoring. In this article, we explore the use of inverse probability censoring weighting methodology and describe why it is effective in removing the bias. Since implementing this method is highly time consuming and requires programming and mathematical skills, we propose a user friendly algorithm in R. Applications to a toy example and to a medical data set illustrate how the algorithm works. A simulation study was carried out to investigate the performance of the inverse probability censoring weighted estimators in situations where dependent censoring is present in the data. In the simulation process, different sample sizes, strengths of the censoring model, and percentages of censored individuals were chosen. Results show that in each scenario inverse probability censoring weighting reduces the bias induced in the traditional Kaplan-Meier approach where dependent censoring is ignored. © The Author(s) 2016. PMID: 26988930 [PubMed – as supplied by publisher]
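 
The sketch below gives a discrete-time version of the inverse probability of censoring weighted idea: model the per-interval censoring probability from covariates, build cumulative "remained uncensored" probabilities, and compute a weighted product-limit estimate. It is a simplified Python illustration on simulated data, not the R algorithm the authors provide, and it glosses over details such as weight timing and administrative censoring.

import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(6)
n, horizon = 2000, 10
x = rng.normal(size=n)                                   # covariate driving both event and censoring
event_time = np.ceil(rng.exponential(np.exp(1.5 - 0.5 * x)))
cens_time = np.ceil(rng.exponential(np.exp(2.0 - 0.8 * x)))
time = np.minimum(np.minimum(event_time, cens_time), horizon)
event = (event_time <= cens_time) & (event_time <= horizon)

# Person-period data for a pooled logistic censoring model.
rows = []
for i in range(n):
    for t in range(1, int(time[i]) + 1):
        censored_now = (t == time[i]) and (not event[i]) and (time[i] < horizon)
        rows.append((i, t, x[i], int(censored_now)))
pp = pd.DataFrame(rows, columns=["id", "t", "x", "censored"])
cens_fit = sm.Logit(pp["censored"], sm.add_constant(pp[["t", "x"]])).fit(disp=0)
pp["p_uncens"] = 1 - cens_fit.predict(sm.add_constant(pp[["t", "x"]]))
pp["K"] = pp.groupby("id")["p_uncens"].cumprod()          # P(uncensored through t | x)
pp["w"] = 1 / pp["K"]                                     # inverse probability of censoring weights

# Weighted product-limit (Kaplan-Meier-type) survival estimate.
surv = 1.0
for t in range(1, horizon + 1):
    at_risk = pp[pp["t"] == t]
    died = at_risk.merge(
        pd.DataFrame({"id": np.where((time == t) & event)[0]}), on="id")
    if at_risk["w"].sum() > 0:
        surv *= 1 - died["w"].sum() / at_risk["w"].sum()
print("IPCW-adjusted survival at end of follow-up:", round(surv, 3))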

3. Stat Med. 2016 Mar 30;35(7):1001-16. doi: 10.1002/sim.6818. Epub 2016 Jan 14.

The missing cause approach to unmeasured confounding in pharmacoepidemiology.

Abrahamowicz M(1,)(2), Bjerre LM(3,)(4,)(5), Beauchamp ME(2), LeLorier J(6,)(7), Burne R(1).
(1)Department of Epidemiology, Biostatistics, and Occupational Health, McGill University, Montreal, QC, Canada. (2)Division of Clinical Epidemiology, McGill University Health Centre, Montreal, QC, Canada. (3)Department of Family Medicine, University of Ottawa, Ottawa, ON, Canada. (4)School of Epidemiology, Public Health, and Preventive Medicine, University of Ottawa, Ottawa, ON, Canada. (5)Bruyère Research Institute, Ottawa, ON, Canada. (6)Departments of Medicine and Pharmacology, University of Montreal, Montreal, QC, Canada. (7)Pharmacoepidemiology and Pharmacoeconomics, University of Montreal Hospital Research Center, Montreal, QC, Canada.

Unmeasured confounding is a major threat to the validity of pharmacoepidemiological studies of medication safety and effectiveness. We propose a new method for detecting and reducing the impact of unobserved confounding in large observational database studies. The method uses assumptions similar to the prescribing preference-based instrumental variable (IV) approach. Our method relies on the new ‘missing cause’ principle, according to which the impact of unmeasured confounding by (contra-)indication may be detected by assessing discrepancies between the following: (i) treatment actually received by individual patients and (ii) treatment that they would be expected to receive based on the observed data. Specifically, we use the treatment-by-discrepancy interaction to test for the presence of unmeasured confounding and correct the treatment effect estimate for the resulting bias. Under standard IV assumptions, we first proved that unmeasured confounding induces a spurious treatment-by-discrepancy interaction in risk difference models for binary outcomes and then simulated large pharmacoepidemiological studies with unmeasured confounding. In simulations, our estimates had four to six times smaller bias than conventional treatment effect estimates, adjusted only for measured confounders, and much smaller variance inflation than unbiased but very unstable IV estimates, resulting in uniformly lowest root mean square errors. The much lower variance of our estimates, relative to IV estimates, was also observed in an application comparing gastrointestinal safety of two classes of anti-inflammatory drugs. In conclusion, our missing cause-based method may complement other methods and enhance accuracy of analyses of large pharmacoepidemiological studies. Copyright © 2016 John Wiley & Sons, Ltd. PMID: 26932124 [PubMed – in process]
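 
To show the mechanics of the 'missing cause' idea in code: predict the treatment each patient would be expected to receive from measured covariates, flag discrepancies between expected and actual treatment, and test the treatment-by-discrepancy interaction in a risk-difference (linear probability) model. The simulated data and variable names are hypothetical, and the paper's bias-correction step is not reproduced here; this only illustrates the detection component.

import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(7)
n = 10000
x = rng.normal(size=n)                         # measured covariate
u = rng.binomial(1, 0.3, n)                    # unmeasured (contra-)indication
treat = rng.binomial(1, 1 / (1 + np.exp(-(0.8 * x - 1.5 * u))))
y = rng.binomial(1, np.clip(0.2 + 0.05 * treat + 0.15 * u, 0, 1))

# Step 1: expected treatment from the observed (measured) data only.
ps_fit = sm.Logit(treat, sm.add_constant(x)).fit(disp=0)
expected = (ps_fit.predict(sm.add_constant(x)) > 0.5).astype(int)
discrepancy = (treat != expected).astype(int)  # treatment received differs from expectation

# Step 2: risk-difference model with a treatment-by-discrepancy interaction.
X = pd.DataFrame({"treat": treat, "x": x, "disc": discrepancy,
                  "treat_x_disc": treat * discrepancy})
lpm = sm.OLS(y, sm.add_constant(X)).fit(cov_type="HC1")
print(lpm.params[["treat", "treat_x_disc"]])
print("interaction p-value:", round(lpm.pvalues["treat_x_disc"], 4))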

4. Ann Epidemiol. 2016 Mar;26(3):212-7. doi: 10.1016/j.annepidem.2015.12.004. Epub 2016 Jan 12.

Implications of immortal person-time when outcomes are nonfatal.

Liang C(1), Seeger JD(2), Dore DD(3).
(1)Optum Epidemiology, Waltham, MA. Electronic address: caihua.liang@optum.com. (2)Optum Epidemiology, Waltham, MA; Division of Pharmacoepidemiology and Pharmacoeconomics, Department of Medicine, Harvard Medical School/Brigham and Women’s Hospital, Boston, MA. (3)Optum Epidemiology, Waltham, MA; Department of Health Services, Policy, and Practice, School of Public Health, Brown University, Providence, RI.

PURPOSE: The amount of immortal time bias in studies with nonfatal outcomes is unclear. We aimed to quantify the magnitude of bias from mishandling of immortal person-time in studies of nonfatal outcomes.
METHODS: We derived formulas for quantifying bias from misclassified or excluded immortal person-time in settings with nonfatal outcomes, assuming a constant rate of outcome. In the situation of misclassified or excluded immortal person-time, the quantification includes the immortal time and corresponding events mistakenly attributed to the exposed group (misclassified) or excluded from study (excluded) that must be attributed to the comparison group.
RESULTS: With misclassified immortal person-time, the magnitude of bias varies according to the incidence rate ratio of immortal time and comparison group as well as the rate ratio of immortal time and exposed group: toward null for both ratios less than 1, no bias for both ratios equal to 1, away from null for both ratios greater than 1. For one ratio less than 1 and the other greater than 1, the direction and magnitude of bias can be obtained from the formula provided. With excluded immortal person-time, the magnitude of bias is associated with the incidence rate ratio of immortal time and comparison group: toward null for the ratio less than 1, no bias for the ratio equal to 1, and away from null for the ratio greater than 1.
CONCLUSIONS: Bias due to immortal person-time in studies with nonfatal outcomes can vary widely and can be quantified under assumptions that apply to many studies. Copyright © 2016 Elsevier Inc. All rights reserved. PMID: 26847051 [PubMed – in process]
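 
A small numeric illustration (with made-up numbers) of how misclassified immortal person-time distorts an incidence rate ratio for a nonfatal outcome: person-time accrued before exposure starts, and any events in it, belong to the comparison experience but are mistakenly credited to the exposed group.

# Hypothetical cohort; rates are events per person-year.
exposed_pt, exposed_events = 4000.0, 120         # correctly classified exposed person-time
comparison_pt, comparison_events = 6000.0, 90    # correctly classified comparison person-time
immortal_pt, immortal_events = 1000.0, 10        # pre-exposure ("immortal") person-time and its events

# Correct analysis: immortal time (and its events) counted as comparison experience.
rate_exp = exposed_events / exposed_pt
rate_cmp = (comparison_events + immortal_events) / (comparison_pt + immortal_pt)
print("correct IRR:", round(rate_exp / rate_cmp, 2))

# Misclassified analysis: immortal time credited to the exposed group.
rate_exp_bad = (exposed_events + immortal_events) / (exposed_pt + immortal_pt)
rate_cmp_bad = comparison_events / comparison_pt
print("biased IRR :", round(rate_exp_bad / rate_cmp_bad, 2))
# The immortal-time rate is lower than the rate in either group, so the misclassified
# IRR is pulled toward the null, consistent with the case described in the abstract.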

5. Epidemiology. 2016 Mar 29. [Epub ahead of print]

Targeted Maximum Likelihood Estimation for Pharmacoepidemiological Research.

Pang M(1), Schuster T, Filion KB, Eberg M, Platt RW.
(1)Centre for Clinical Epidemiology, Lady Davis Research Institute, Jewish General Hospital, Montreal, Quebec, Canada. (2)Department of Epidemiology, Biostatistics and Occupational Health, McGill University, Montreal, Quebec, Canada. (3)Department of Pediatrics, McGill University, Montreal, Quebec, Canada. (4)Division of Clinical Epidemiology, Department of Medicine, McGill University, Montreal, Quebec, Canada. (5)The Research Institute of the McGill University Health Centre, Montreal, Quebec, Canada.

BACKGROUND: Targeted Maximum Likelihood Estimation has been proposed for estimating marginal causal effects, and is robust to misspecification of either the treatment or outcome model. However, due perhaps to its novelty, targeted maximum likelihood estimation has not been widely used in pharmacoepidemiology.
OBJECTIVES: To demonstrate targeted maximum likelihood estimation in a pharmacoepidemiological study with a high-dimensional covariate space, to incorporate the use of high-dimensional propensity scores into this method, and to compare the results to those of inverse probability weighting.
METHODS: We implemented the targeted maximum likelihood estimation procedure in a single-point exposure study of the use of statins and the one-year risk of all-cause mortality post-myocardial infarction using data from the UK Clinical Practice Research Datalink. A range of known potential confounders were considered, and empirical covariates were selected using the high-dimensional propensity scores algorithm. We estimated odds ratios using targeted maximum likelihood estimation and inverse probability weighting with a variety of covariate selection strategies.
RESULTS: Through a real example we demonstrated the double-robustness of targeted maximum likelihood estimation. We showed that results with this method and inverse probability weighting differed when a large number of covariates were included in the treatment model.
CONCLUSIONS: Targeted maximum likelihood can be used in high-dimensional covariate settings. In high-dimensional covariate settings, differences in results between targeted maximum likelihood and inverse probability weighted estimation are likely due to sensitivity to (near) positivity violations. Further investigations are needed to gain better understanding of the advantages and limitations of this method in pharmacoepidemiological studies. PMID: 27031037 [PubMed – as supplied by publisher]
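 
For orientation, the sketch below walks through the basic targeted maximum likelihood steps for a point exposure and binary outcome (initial outcome model, propensity model, clever-covariate fluctuation, plug-in estimate). It is a generic illustration on simulated data, not the authors' CPRD analysis, and it omits their high-dimensional propensity score covariate selection.

import numpy as np
import statsmodels.api as sm
from scipy.special import expit, logit

rng = np.random.default_rng(8)
n = 5000
w = rng.normal(size=(n, 3))                                   # measured confounders
a = rng.binomial(1, expit(0.5 * w[:, 0] - 0.3 * w[:, 1]))     # treatment (e.g., statin use)
y = rng.binomial(1, expit(-1 + 0.4 * a + 0.6 * w[:, 0]))      # outcome (e.g., death)

W = sm.add_constant(w)

# Step 1: initial outcome regression Q0(A, W) and predictions under A=1 and A=0.
q_fit = sm.Logit(y, np.column_stack([W, a])).fit(disp=0)
q1 = q_fit.predict(np.column_stack([W, np.ones(n)]))
q0 = q_fit.predict(np.column_stack([W, np.zeros(n)]))
qa = np.where(a == 1, q1, q0)

# Step 2: propensity score g(W) and the "clever covariate".
g = sm.Logit(a, W).fit(disp=0).predict(W)
h = a / g - (1 - a) / (1 - g)

# Step 3: fluctuation (targeting) step: logistic regression of Y on H with offset logit(Q0).
eps = sm.GLM(y, h.reshape(-1, 1), family=sm.families.Binomial(),
             offset=logit(qa)).fit().params[0]

# Step 4: updated predictions and plug-in estimates of the marginal risks.
q1_star = expit(logit(q1) + eps / g)
q0_star = expit(logit(q0) - eps / (1 - g))
r1, r0 = q1_star.mean(), q0_star.mean()
print("risk difference:", round(r1 - r0, 4))
print("marginal odds ratio:", round((r1 / (1 - r1)) / (r0 / (1 - r0)), 3))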

6. Pharmacoepidemiol Drug Saf. 2016 Mar;25(3):287-96. doi: 10.1002/pds.3924. Epub 2015 Dec 16.

Conditions for confounding of interactions.

Liu A(1,)(2,)(3), Abrahamowicz M(1,)(2), Siemiatycki J(3,)(4).
(1)Department of Epidemiology, Biostatistics and Occupational Health, McGill University, Montreal, Quebec, Canada. (2)Division of Clinical Epidemiology, Research Institute of the McGill University Health Centre, Montreal, Quebec, Canada. (3)Division of Population Health, CRCHUM Research Center, Montreal, Quebec, Canada. (4)Department of Social and Preventive Medicine, University of Montreal, Montreal, Quebec, Canada.

PURPOSE: Pharmaco-epidemiology increasingly investigates drug-drug or drug-covariate interactions. Yet, conditions for confounding of interactions have not been elucidated. We explored the conditions under which the estimates of interactions in logistic regression are affected by confounding bias.
METHODS: We rely on analytical derivations to investigate the conditions and then use simulations to confirm our analytical results and to quantify the impact of selected parameters on the bias of the interaction estimates.
RESULTS: Failure to adjust for a risk factor U results in a biased estimate of the interaction between exposures E1 and E2 on a binary outcome Y if the association between U and E1 varies depending on the value of E2. The resulting confounding bias increases with increase in the following: (i) prevalence of confounder U; (ii) strength of U-Y association; and (iii) heterogeneity in the association of E1 with U across the strata of E2. A variable that is not a confounder for the main effects of E1 and E2 may still act as an important confounder for their interaction.
CONCLUSIONS: Studies of interactions should attempt to identify, as potential confounders, those risk factors whose associations with one of the exposures in the interaction term may be modified by the other exposure. Copyright © 2015 John Wiley & Sons, Ltd. PMID: 26676843 [PubMed – in process]
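 
A compact simulation in the spirit of the setup described above: the association between U and E1 differs across strata of E2, so omitting U biases the estimated E1 x E2 interaction even though the true interaction is null. All coefficients are invented for illustration and do not reproduce the paper's simulation design.

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from scipy.special import expit

rng = np.random.default_rng(9)
n = 50000
e2 = rng.binomial(1, 0.5, n)
u = rng.binomial(1, 0.4, n)
# U-E1 association depends on E2: the key condition for confounding of the interaction.
e1 = rng.binomial(1, expit(-0.5 + 1.5 * u * e2 - 0.5 * u * (1 - e2)))
# True model: no E1 x E2 interaction on the log-odds scale, but U affects Y.
y = rng.binomial(1, expit(-2 + 0.4 * e1 + 0.3 * e2 + 1.0 * u))
df = pd.DataFrame({"y": y, "e1": e1, "e2": e2, "u": u})

adj = smf.logit("y ~ e1 * e2 + u", data=df).fit(disp=0)
crude = smf.logit("y ~ e1 * e2", data=df).fit(disp=0)
print("interaction, U adjusted :", round(adj.params["e1:e2"], 3))    # close to 0, the truth
print("interaction, U omitted  :", round(crude.params["e1:e2"], 3))  # biased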

February 2016

1. J Clin Epidemiol. 2016 Feb 27. pii: S0895-4356(16)00142-6. doi: 10.1016/j.jclinepi.2016.02.011. [Epub ahead of print]

Comparison of high dimensional confounder summary scores in comparative studies of newly marketed medications.

Kumamaru H(1), Gagne JJ(2), Glynn RJ(2), Setoguchi S(3), Schneeweiss S(2).
(1)Division of Pharmacoepidemiology and Pharmacoeconomics, Brigham and Women’s Hospital and Harvard Medical School, Boston, Massachusetts, USA; Department of Epidemiology, Harvard School of Public Health, Boston, Massachusetts, USA. Electronic address: hik205@mail.harvard.edu. (2)Division of Pharmacoepidemiology and Pharmacoeconomics, Brigham and Women’s Hospital and Harvard Medical School, Boston, Massachusetts, USA. (3)Duke Clinical Research Institute, Duke University, Durham, North Carolina, USA.

OBJECTIVE: To compare confounding adjustment by high-dimensional propensity scores (hdPS) and historically-developed high-dimensional disease risk scores (hdDRS) in three comparative study examples of newly marketed medications: 1) dabigatran vs. warfarin on major hemorrhage; 2) dabigatran vs. warfarin on death; and 3) coxibs vs. non-selective non-steroidal anti-inflammatory drugs on gastrointestinal bleeds.
STUDY DESIGN AND SETTING: In each example, we constructed a concurrent cohort of new and old drug initiators using US claims databases. In historical cohorts of old drug initiators, we developed hdDRS models including investigator-specified plus empirically-identified variables and using principal component analysis and lasso regression for dimension reduction. We applied the models to the concurrent cohorts to obtain predicted outcome probabilities, which we used for confounding adjustment. We compared the resulting estimates to those from hdPS.
RESULTS: The crude odds ratio (OR) comparing dabigatran to warfarin was 0.52 (95% confidence interval: 0.37-0.72) for hemorrhage and 0.38 (0.26-0.55) for death. Decile-stratification yielded an OR of 0.64 (0.46-0.90) for hemorrhage using hdDRS vs. 0.70 (0.49-1.02) for hdPS. ORs for death were 0.69 (0.45-1.06) and 0.73 (0.48-1.10), respectively. The relative performance of hdDRS in the coxibs example was similar.
CONCLUSION: hdDRS achieved similar or better confounding adjustment compared with a conventional regression approach, but worked slightly less well than hdPS. Copyright © 2016 Elsevier Inc. All rights reserved. PMID: 26931292 [PubMed – as supplied by publisher]

2. Clin Trials. 2016 Feb 29. pii: 1740774516628825. [Epub ahead of print]

Selection of the effect size for sample size determination for a continuous response in a superiority clinical trial using a hybrid classical and Bayesian procedure.

Ciarleglio MM(1), Arendt CD(2), Peduzzi PN(3).
(1)Department of Biostatistics, Yale University School of Public Health, New Haven, CT, USA maria.ciarleglio@yale.edu. (2)Air Force Office of Scientific Research, Arlington, VA, USA. (3)Department of Biostatistics, Yale University School of Public Health, New Haven, CT, USA.

BACKGROUND: When designing studies that have a continuous outcome as the primary endpoint, the hypothesized effect size (ES = δ/σ), that is, the hypothesized difference in means (δ) relative to the assumed variability of the endpoint (σ), plays an important role in sample size and power calculations. Point estimates for δ and σ are often calculated using historical data. However, the uncertainty in these estimates is rarely addressed.
METHODS: This article presents a hybrid classical and Bayesian procedure that formally integrates prior information on the distributions of δ and σ into the study’s power calculation. Conditional expected power, which averages the traditional power curve using the prior distributions of δ and σ as the averaging weight, is used, and the value of ES is found that equates the prespecified frequentist power (1 - β) and the conditional expected power of the trial. This hypothesized effect size is then used in traditional sample size calculations when determining sample size for the study.
RESULTS: The value of ES found using this method may be expressed as a function of the prior means of δ and σ and their prior standard deviations. We show that the "naïve" estimate of the effect size, that is, the ratio of prior means, should be down-weighted to account for the variability in the parameters. An example is presented for designing a placebo-controlled clinical trial testing the antidepressant effect of alprazolam as monotherapy for major depression.
CONCLUSION: Through this method, we are able to formally integrate prior information on the uncertainty and variability of both the treatment effect and the common standard deviation into the design of the study while maintaining a frequentist framework for the final analysis. Solving for the effect size which the study has a high probability of correctly detecting based on the available prior information on the difference δ and the standard deviation σ provides a valuable, substantiated estimate that can form the basis for discussion about the study’s feasibility during the design phase. © The Author(s) 2016. PMID: 26928986 [PubMed – as supplied by publisher]
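 
The sketch below illustrates the hybrid idea numerically: average the classical two-sample power curve over prior draws of the mean difference and standard deviation (conditional expected power), then find the single effect size whose classical power matches that average. The priors, sample size, and alpha are arbitrary placeholders rather than the paper's worked example.

import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(10)
n_per_arm, alpha = 100, 0.05
z_a = norm.ppf(1 - alpha / 2)

def power(effect_size):
    """Approximate power of a two-sample test for a standardized effect size delta/sigma."""
    ncp = effect_size * np.sqrt(n_per_arm / 2)
    return norm.sf(z_a - ncp) + norm.cdf(-z_a - ncp)

# Priors on the mean difference (delta) and common SD (sigma), e.g. from historical data.
delta_draws = rng.normal(loc=0.5, scale=0.2, size=100000)
sigma_draws = np.abs(rng.normal(loc=1.0, scale=0.2, size=100000))

cep = power(delta_draws / sigma_draws).mean()      # conditional expected power
print("conditional expected power:", round(cep, 3))

# Effect size whose classical power equals the conditional expected power:
grid = np.linspace(0.01, 1.5, 5000)
es_star = grid[np.argmin(np.abs(power(grid) - cep))]
print("hypothesized effect size for sample size calculations:", round(es_star, 3))
print("naive ratio of prior means:", 0.5 / 1.0)    # es_star is down-weighted relative to this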

3. BMC Med Res Methodol. 2016 Feb 19;16(1):22. doi: 10.1186/s12874-016-0119-1.

Head to head comparison of the propensity score and the high-dimensional propensity score matching methods.

Guertin JR(1,)(2), Rahme E(3,)(4), Dormuth CR(5), LeLorier J(6).
(1)Department of Clinical Epidemiology and Biostatistics, McMaster University, Hamilton, ON, Canada. guertinj@mcmaster.ca. (2)Programs for Assessment of Technology in Health, St. Joseph’s Healthcare Hamilton, Hamilton, QC, Canada. guertinj@mcmaster.ca. (3)Research Institute of the McGill University Health Centre, Montreal, QC, Canada. elham.rahme@mcgill.ca. (4)Department of Medicine, McGill University, Montreal, QC, Canada. elham.rahme@mcgill.ca. (5)Department of Anesthesiology, Pharmacology & Therapeutics, University of British Columbia, Vancouver, BC, Canada. colin.dormuth@ti.ubc.ca. (6)Pharmacoeconomic and Pharmacoepidemiology unit, Research Center of the Centre hospitalier de l’Université de Montréal, Pavillon S, 850 St-Denis, 3e étage, Montreal, QC, Canada. jacques.le.lorier@sympatico.ca.

BACKGROUND: Comparative performance of the traditional propensity score (PS) and high-dimensional propensity score (hdPS) methods in the adjustment for confounding by indication remains unclear. We aimed to identify which method provided the best adjustment for confounding by indication within the context of the risk of diabetes among patients exposed to moderate versus high potency statins.
METHOD: A cohort of diabetes-free incident statin users was identified from Quebec’s publicly funded medico-administrative database (Full Cohort). We created two matched sub-cohorts by matching one patient initiated on a lower potency statin to one patient initiated on a high potency statin, on either the patients’ PS or hdPS. Both methods’ performance was compared by means of the absolute standardized differences (ASDD) regarding relevant characteristics and by means of the obtained measures of association.
RESULTS: Eight out of the 18 examined characteristics were shown to be unbalanced within the Full Cohort. Although matching on either method achieved balance within all examined characteristics, matching on patients’ hdPS created the most balanced sub-cohort. Measures of association and confidence intervals obtained within the two matched sub-cohorts overlapped.
CONCLUSION: Although ASDD suggest better matching with hdPS than with PS, measures of association were almost identical when adjusted for either method. Use of the hdPS method in adjusting for confounding by indication within future studies should be recommended due to its ability to identify confounding variables which may be unknown to the investigators. PMCID: PMC4759710 PMID: 26891796 [PubMed – in process]
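 
For reference, the balance metric used in this comparison, the absolute standardized difference, can be computed as below; the small helper and the example values are ours, not the authors' code.

import numpy as np

def abs_std_diff(x_treated, x_control, binary=False):
    """Absolute standardized difference between groups for one covariate."""
    x_t, x_c = np.asarray(x_treated, float), np.asarray(x_control, float)
    if binary:
        p_t, p_c = x_t.mean(), x_c.mean()
        pooled_sd = np.sqrt((p_t * (1 - p_t) + p_c * (1 - p_c)) / 2)
    else:
        pooled_sd = np.sqrt((x_t.var(ddof=1) + x_c.var(ddof=1)) / 2)
    return abs(x_t.mean() - x_c.mean()) / pooled_sd

rng = np.random.default_rng(11)
age_t, age_c = rng.normal(67, 10, 800), rng.normal(64, 10, 800)
diabetes_t, diabetes_c = rng.binomial(1, 0.30, 800), rng.binomial(1, 0.22, 800)
print("age ASDD     :", round(abs_std_diff(age_t, age_c), 3))
print("diabetes ASDD:", round(abs_std_diff(diabetes_t, diabetes_c, binary=True), 3))
# A common rule of thumb flags values above 0.10 as meaningful imbalance.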

4. Pharmacoepidemiol Drug Saf. 2016 Feb 15. doi: 10.1002/pds.3965. [Epub ahead of print]

Tailoring treatments using treatment effect modification.

Schmidt AF(1,)(2,)(3,)(4), Klungel OH(1,)(2), Nielen M(3), de Boer A(2), Groenwold RH(1,)(2), Hoes AW(1).
(1)Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht, the Netherlands. (2)Division of Pharmacoepidemiology and Clinical Pharmacology, Utrecht Institute for Pharmaceutical Sciences, Utrecht, the Netherlands. (3)Department of Farm Animal Health, Faculty of Veterinary Medicine, Utrecht University, Utrecht, the Netherlands. (4)Institute of Cardiovascular Science, Faculty of Population Health, University College London, London, UK.

BACKGROUND AND OBJECTIVE: Applying results from clinical studies to individual patients can be a difficult process. Using the concept of treatment effect modification (also referred to as interaction), defined as a difference in treatment response between patient groups, we discuss whether and how treatment effects can be tailored to better meet patients’ needs.
RESULTS: First, we argue that, contrary to how most studies are designed, treatment effect modification should be expected. Second, given this expected heterogeneity, a small number of clinically relevant subgroups should be selected a priori, depending on the expected magnitude of effect modification and the prevalence of the patient type. Third, by defining generalizability as the absence of treatment effect modification, we show that generalizability can be evaluated within the usual statistical framework of equivalence testing. Fourth, when equivalence cannot be confirmed, we address the need for further analyses and studies tailoring treatment towards groups of patients with similar response to treatment. Fifth, we argue that, to frame this properly, the entire body of evidence on effect modification should be quantified in a prior probability. Copyright © 2016 John Wiley & Sons, Ltd. PMID: 26877168 [PubMed – as supplied by publisher]

5. Stat Med. 2016 Feb 7. doi: 10.1002/sim.6893. [Epub ahead of print]

Nonparametric survival analysis using Bayesian Additive Regression Trees (BART).

Sparapani RA(1), Logan BR(1), McCulloch RE(2), Laud PW(1).
(1)Division of Biostatistics, Medical College of Wisconsin, Milwaukee, U.S.A. (2)Booth School of Business, University of Chicago, Chicago, U.S.A.

Bayesian additive regression trees (BART) provide a framework for flexible nonparametric modeling of relationships of covariates to outcomes. Recently, BART models have been shown to provide excellent predictive performance, for both continuous and binary outcomes, exceeding that of their competitors. Software is also readily available for such outcomes. In this article, we introduce modeling that extends the usefulness of BART in medical applications by addressing needs arising in survival analysis. Simulation studies of one-sample and two-sample scenarios, in comparison with long-standing traditional methods, establish face validity of the new approach. We then demonstrate the model’s ability to accommodate data from complex regression models with a simulation study of a nonproportional hazards scenario with crossing survival functions and survival function estimation in a scenario where hazards are multiplicatively modified by a highly nonlinear function of the covariates. Using data from a recently published study of patients undergoing hematopoietic stem cell transplantation, we illustrate the use and some advantages of the proposed method in medical investigations. Copyright © 2016 John Wiley & Sons, Ltd. PMID: 26854022 [PubMed – as supplied by publisher]

January 2016

1. Epidemiology. 2016 Jan 25. [Epub ahead of print]

Evaluating additive interaction using survival percentiles.

Bellavia A(1), Bottai M, Orsini N.
(1)Unit of Biostatistics, Institute of Environmental Medicine, Karolinska Institutet, Stockholm, Sweden. (2)Unit of Nutritional Epidemiology, Institute of Environmental Medicine, Karolinska Institutet, Stockholm, Sweden.

Evaluation of statistical interaction in time-to-event analysis is usually limited to the study of multiplicative interaction, via inclusion of a product term in a Cox proportional-hazard model. Measures of additive interaction are available but seldom used. All measures of interaction in survival analysis, whether additive or multiplicative, are in the metric of hazard, usually assuming that the interaction between two predictors of interest is constant during the follow-up period. We introduce a measure to evaluate additive interaction in survival analysis in the metric of time. This measure can be calculated by evaluating survival percentiles, defined as the time points by which different subpopulations reach the same incidence proportion. Using this approach the probability of the outcome is fixed and the time variable is estimated. We also show that by using a regression model for the evaluation of conditional survival percentiles, including a product term between the two exposures in the model, interaction is evaluated as a deviation from additivity of the effects. In the simple case of two binary exposures the product term is interpreted as excess/decrease in survival time (i.e. years, months, days) due to the presence of both exposures. This measure of interaction is dependent on the fraction of events being considered, thus allowing evaluation of how interaction changes during the observed follow-up. Evaluation of interaction in the context of survival percentiles allows deriving a measure of additive interaction without assuming a constant effect over time, overcoming two main limitations of commonly used approaches. PMID: 26829157 [PubMed – as supplied by publisher]

2. Int J Epidemiol. 2016 Jan 22. pii: dyv341. [Epub ahead of print]

Causality and causal inference in epidemiology: the need for a pluralistic approach.

Vandenbroucke JP(1), Broadbent A(2), Pearce N(3).
(1)Department of Clinical Epidemiology, Leiden University Medical Center, Leiden, The Netherlands and Department of Clinical Epidemiology, Aarhus University Hospital, Aarhus, Denmark, J.P.Vandenbroucke@lumc.nl. (2)Department of Philosophy, University of Johannesburg, Auckland Park, South Africa. (3)Department of Medical Statistics and Centre for Global NCDs, London School of Hygiene and Tropical Medicine, London, UK and Centre for Public Health Research, Massey University, Wellington, New Zealand.

Causal inference based on a restricted version of the potential outcomes approach reasoning is assuming an increasingly prominent place in the teaching and practice of epidemiology. The proposed concepts and methods are useful for particular problems, but it would be of concern if the theory and practice of the complete field of epidemiology were to become restricted to this single approach to causal inference. Our concerns are that this theory restricts the questions that epidemiologists may ask and the study designs that they may consider. It also restricts the evidence that may be considered acceptable to assess causality, and thereby the evidence that may be considered acceptable for scientific and public health decision making. These restrictions are based on a particular conceptual framework for thinking about causality. In Section 1, we describe the characteristics of the restricted potential outcomes approach (RPOA) and show that there is a methodological movement which advocates these principles, not just for solving particular problems, but as ideals for which epidemiology as a whole should strive. In Section 2, we seek to show that the limitation of epidemiology to one particular view of the nature of causality is problematic. In Section 3, we argue that the RPOA is also problematic with regard to the assessment of causality. We argue that it threatens to restrict study design choice, to wrongly discredit the results of types of observational studies that have been very useful in the past and to damage the teaching of epidemiological reasoning. Finally, in Section 4 we set out what we regard as a more reasonable ‘working hypothesis’ as to the nature of causality and its assessment: pragmatic pluralism. © The Author 2016. Published by Oxford University Press on behalf of the International Epidemiological Association. PMID: 26800751 [PubMed – as supplied by publisher]

3. Biometrics. 2016 Jan 12. doi: 10.1111/biom.12470. [Epub ahead of print]

Addressing issues associated with evaluating prediction models for survival endpoints based on the concordance statistic.

Wang M(1), Long Q(2).
(1)Department of Public Health Sciences, College of Medicine, Pennsylvania State University, Hershey, Pennsylvania 17033, U.S.A. (2)Department of Biostatistics and Bioinformatics, Rollins School of Public Health, Emory University, Atlanta, Georgia 30322, U.S.A.

Prediction models for disease risk and prognosis play an important role in biomedical research, and evaluating their predictive accuracy in the presence of censored data is of substantial interest. The standard concordance (c) statistic has been extended to provide a summary measure of predictive accuracy for survival models. Motivated by a prostate cancer study, we address several issues associated with evaluating survival prediction models based on the c-statistic with a focus on estimators using the technique of inverse probability of censoring weighting (IPCW). Compared to the existing work, we provide complete results on the asymptotic properties of the IPCW estimators under the assumption of coarsening at random (CAR), and propose a sensitivity analysis under the mechanism of noncoarsening at random (NCAR). In addition, we extend the IPCW approach as well as the sensitivity analysis to high-dimensional settings. The predictive accuracy of prediction models for cancer recurrence after prostatectomy is assessed by applying the proposed approaches. We find that the estimated predictive accuracy for the models in consideration is sensitive to the NCAR assumption, and thus identify the best predictive model. Finally, we further evaluate the performance of the proposed methods in both settings of low-dimensional and high-dimensional data under CAR and NCAR through simulations. © 2016, The International Biometric Society. PMID: 26756274 [PubMed – as supplied by publisher]

4. Biometrics. 2016 Jan 11. doi: 10.1111/biom.12471. [Epub ahead of print]

Instrumental variable additive hazards models with exposure-dependent censoring.

Chan KC(1).
(1)Department of Biostatistics and Department of Health Services, University of Washington, Seattle, Washington 98195, U.S.A.

Li, Fine, and Brookhart (2015) presented an extension of the two-stage least squares (2SLS) method for additive hazards models which requires an assumption that the censoring distribution is unrelated to the endogenous exposure variable. We present another extension of 2SLS that can address this limitation. © 2016, The International Biometric Society. PMID: 26754156 [PubMed – as supplied by publisher]

5. Contemp Clin Trials. 2016 Jan 30. pii: S1551-7144(16)30008-8. doi: 10.1016/j.cct.2016.01.008. [Epub ahead of print]

Likelihood ratio meta-analysis: New motivation and approach for an old method.

Dormuth CR(1), Filion KB(2), Platt RW(3).
(1)Department of Anesthesiology, Pharmacology & Therapeutics, University of British Columbia, Vancouver, British Columbia. Electronic address: colin.dormuth@ti.ubc.ca. (2)Center for Clinical Epidemiology, Lady Davis Research Institute, Jewish General Hospital, Montreal, Quebec; Department of Medicine, McGill University, Montreal, Quebec. (3)Departments of Pediatrics and of Epidemiology, Biostatistics, and Occupational Health, McGill University, and the Research Institute of the McGill University Health Centre, Montreal, Quebec.

A 95% confidence interval (CI) in an updated meta-analysis may not have the expected 95% coverage. If a meta-analysis is simply updated with additional data, then the resulting 95% CI will be wrong because it will not have accounted for the fact that the earlier meta-analysis failed or succeeded to exclude the null. This situation can be avoided by using the likelihood ratio (LR) as a measure of evidence that does not depend on type-1 error. We show how an LR-based approach, first advanced by Goodman, can be used in a meta-analysis to pool data from separate studies to quantitatively assess where the total evidence points. The method works by estimating the log-likelihood ratio (LogLR) function from each study. Those functions are then summed to obtain a combined function, which is then used to retrieve the total effect estimate, and a corresponding ‘intrinsic’ confidence interval. Using as illustrations the CAPRIE trial of clopidogrel versus aspirin in the prevention of ischaemic events, and our own meta-analyses of higher potency statins and the risk of acute kidney injury, we show that the LR-based method yields the same point estimate as a traditional analysis, but with an intrinsic confidence interval that is appropriately wider than the traditional 95% CI. The LR-based method can be used to conduct both fixed effect and random effects meta-analyses, it can be applied to old and new meta-analyses alike, and results can be presented in a format that is familiar to a meta-analytic audience. Copyright © 2015. Published by Elsevier Inc. PMID: 26837056 [PubMed – as supplied by publisher]
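 
A rough numeric sketch of the log-likelihood-ratio pooling described above, using a normal approximation for each study's log hazard ratio: sum the per-study log-likelihood functions, take the maximizing value as the pooled estimate, and report an 'intrinsic' interval where the combined likelihood ratio stays above a chosen threshold. The studies, the 1/32 cutoff, and the fixed-effect framing are illustrative assumptions, not the authors' implementation.

import numpy as np

# Hypothetical per-study estimates: log hazard ratios and standard errors.
log_hr = np.array([-0.22, -0.05, -0.30, -0.12])
se = np.array([0.10, 0.15, 0.20, 0.08])

theta = np.linspace(-1, 1, 20001)
# Per-study log-likelihood under a normal approximation, summed across studies.
loglik = -((theta[:, None] - log_hr[None, :]) ** 2 / (2 * se[None, :] ** 2)).sum(axis=1)
loglik -= loglik.max()                       # now log LR relative to the maximum

pooled = theta[np.argmax(loglik)]            # matches the fixed-effect (inverse-variance) estimate
inside = theta[loglik >= np.log(1 / 32)]     # "intrinsic" interval at a 1/32 likelihood-ratio cutoff
print("pooled log HR:", round(pooled, 3))
print("intrinsic interval:", round(inside.min(), 3), "to", round(inside.max(), 3))

# Check against the usual inverse-variance fixed-effect estimate.
w = 1 / se ** 2
print("inverse-variance estimate:", round((w * log_hr).sum() / w.sum(), 3))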

6. Epidemiology. 2016 Jan;27(1):91-7. doi: 10.1097/EDE.0000000000000409.

Selection Bias Due to Loss to Follow Up in Cohort Studies.

Howe CJ(1), Cole SR, Lau B, Napravnik S, Eron JJ Jr.
(1)From the aDepartment of Epidemiology, Center for Population Health and Clinical Epidemiology, Brown University School of Public Health, Providence, RI;  bDepartment of Epidemiology, University of North Carolina Gillings School of Global Public Health, Chapel Hill, NC; cDepartment of Epidemiology, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD; and dDivision of Infectious Diseases, Department of Medicine, University of North Carolina School of Medicine, Chapel Hill, NC.

Selection bias due to loss to follow up represents a threat to the internal validity of estimates derived from cohort studies. Over the past 15 years, stratification-based techniques as well as methods such as inverse probability-of-censoring weighted estimation have been more prominently discussed and offered as a means to correct for selection bias. However, unlike correcting for confounding bias using inverse weighting, uptake of inverse probability-of-censoring weighted estimation as well as competing methods has been limited in the applied epidemiologic literature. To motivate greater use of inverse probability-of-censoring weighted estimation and competing methods, we use causal diagrams to describe the sources of selection bias in cohort studies employing a time-to-event framework when the quantity of interest is an absolute measure (e.g., absolute risk, survival function) or relative effect measure (e.g., risk difference, risk ratio). We highlight that whether a given estimate obtained from standard methods is potentially subject to selection bias depends on the causal diagram and the measure. We first broadly describe inverse probability-of-censoring weighted estimation and then give a simple example to demonstrate in detail how inverse probability-of-censoring weighted estimation mitigates selection bias and describe challenges to estimation. We then modify complex, real-world data from the University of North Carolina Center for AIDS Research HIV clinical cohort study and estimate the absolute and relative change in the occurrence of death with and without inverse probability-of-censoring weighted correction using the modified University of North Carolina data. We provide SAS code to aid with implementation of inverse probability-of-censoring weighted techniques. PMID: 26484424 [PubMed – in process]  
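
The paper supplies SAS code; as a language-neutral illustration, the sketch below applies inverse probability-of-censoring weighting to a single follow-up interval with simulated data. The covariates, dropout model, and risk values are hypothetical assumptions, not the University of North Carolina cohort.

    import numpy as np
    import pandas as pd
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)
    n = 5000
    df = pd.DataFrame({
        "cd4_low": rng.integers(0, 2, n),   # hypothetical baseline covariates
        "idu": rng.integers(0, 2, n),
    })
    # Dropout depends on the covariates; death risk also depends on them.
    p_drop = 1 / (1 + np.exp(-(-2 + 1.0 * df.cd4_low + 0.5 * df.idu)))
    df["dropout"] = rng.binomial(1, p_drop)
    df["death"] = rng.binomial(1, 0.05 + 0.10 * df.cd4_low)

    # Model the probability of remaining under observation and weight by its inverse.
    cens_model = LogisticRegression().fit(df[["cd4_low", "idu"]], 1 - df["dropout"])
    p_uncensored = cens_model.predict_proba(df[["cd4_low", "idu"]])[:, 1]
    retained = df[df.dropout == 0].copy()
    retained["w"] = 1 / p_uncensored[df.dropout == 0]

    crude_risk = retained["death"].mean()                             # subject to selection bias
    ipcw_risk = np.average(retained["death"], weights=retained["w"])  # weighted correction
    print(crude_risk, ipcw_risk)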

December 2015

1. Contemp Clin Trials. 2015 Dec 16. pii: S1551-7144(15)30143-9. doi: 10.1016/j.cct.2015.12.012. [Epub ahead of print]

Propensity Score and Proximity Matching Using Random Forest.

Zhao P(1), Su X(2), Ge T(3), Fan J(4).
(1)Computational Science Research Center, San Diego State University, San Diego, CA, USA. (2)Department of Mathematical Sciences, University of Texas, El Paso, TX, USA. (3)Janssen Research and Development, San Diego, CA, USA. (4)Department of Mathematics and Statistics, San Diego State University, San Diego, CA, USA. Electronic address: jjfan@mail.sdsu.edu.

In order to derive unbiased inference from observational data, matching methods are often applied to produce balanced treatment and control groups in terms of all background variables. Propensity score has been a key component in this research area. However, propensity score based matching methods in the literature have several limitations, such as model mis-specifications, categorical variables with more than two levels, difficulties in handling missing data, and nonlinear relationships. Random forest, averaging outcomes from many decision trees, is nonparametric in nature, straightforward to use, and capable of solving these issues. More importantly, the precision afforded by random forest (Caruana et al., 2008) may provide us with a more accurate and less model dependent estimate of the propensity score. In addition, the proximity matrix, a by-product of the random forest, may naturally serve as a distance measure between observations that can be used in matching. The proposed random forest based matching methods are applied to data from the National Health and Nutrition Examination Survey (NHANES). Our results show that the proposed methods can produce well balanced treatment and control groups. An illustration is also provided that the methods can effectively deal with missing data in covariates. Copyright © 2015. Published by Elsevier Inc. PMID: 26706666 [PubMed – as supplied by publisher]
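
A minimal sketch of the two ingredients described above, a random-forest propensity score and the proximity matrix, using scikit-learn on simulated covariates; the data, forest settings, and the greedy with-replacement matching step are illustrative assumptions rather than the NHANES analysis in the paper.

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier

    rng = np.random.default_rng(1)
    n = 300
    X = rng.normal(size=(n, 4))
    treat = rng.binomial(1, 1 / (1 + np.exp(-X[:, 0] + 0.5 * X[:, 1])))

    rf = RandomForestClassifier(n_estimators=200, min_samples_leaf=10, random_state=1)
    rf.fit(X, treat)
    ps = rf.predict_proba(X)[:, 1]            # random-forest propensity score

    # Proximity: share of trees in which two observations fall in the same terminal node.
    leaves = rf.apply(X)                      # n x n_trees matrix of leaf indices
    prox = (leaves[:, None, :] == leaves[None, :, :]).mean(axis=2)

    # Greedy 1:1 matching (with replacement) of each treated unit to its closest control.
    treated = np.where(treat == 1)[0]
    controls = np.where(treat == 0)[0]
    matches = {i: controls[np.argmax(prox[i, controls])] for i in treated}
    print(ps[:5], len(matches))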

2. Stat Med. 2015 Dec 17. doi: 10.1002/sim.6842. [Epub ahead of print]

Propensity score and doubly robust methods for estimating the effect of treatment on censored cost.

Li J(1), Handorf E(2), Bekelman J(3), Mitra N(1).
(1)Department of Biostatistics and Epidemiology, University of Pennsylvania, Philadelphia, PA, 19104, U.S.A. (2)Biostatistics and Bioinformatics Facility, Temple University Health System Fox Chase Cancer Center, Philadelphia, PA, 19111, U.S.A. (3)Department of Radiation Oncology, University of Pennsylvania, Philadelphia, PA, 19104, U.S.A.

The estimation of treatment effects on medical costs is complicated by the need to account for informative censoring, skewness, and the effects of confounders. Because medical costs are often collected from observational claims data, we investigate propensity score (PS) methods such as covariate adjustment, stratification, and inverse probability weighting taking into account informative censoring of the cost outcome. We compare these more commonly used methods with doubly robust (DR) estimation. We then use a machine learning approach called super learner (SL) to choose among conventional cost models to estimate regression parameters in the DR approach and to choose among various model specifications for PS estimation. Our simulation studies show that when the PS model is correctly specified, weighting and DR perform well. When the PS model is misspecified, the combined approach of DR with SL can still provide unbiased estimates. SL is especially useful when the underlying cost distribution comes from a mixture of different distributions or when the true PS model is unknown. We apply these approaches to a cost analysis of two bladder cancer treatments, cystectomy versus bladder preservation therapy, using SEER-Medicare data. Copyright © 2015 John Wiley & Sons, Ltd.  PMID: 26678242 [PubMed – as supplied by publisher]
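
As a simplified illustration of the doubly robust idea compared above, the sketch below computes an augmented inverse probability weighted estimate of the mean cost difference on simulated data with fully observed costs; the paper's handling of informative censoring and the super learner model-selection step are omitted, and all variables are assumptions for illustration.

    import numpy as np
    from sklearn.linear_model import LinearRegression, LogisticRegression

    rng = np.random.default_rng(2)
    n = 4000
    X = rng.normal(size=(n, 3))
    A = rng.binomial(1, 1 / (1 + np.exp(-X[:, 0])))
    cost = np.exp(8 + 0.3 * A + 0.5 * X[:, 0] + rng.normal(scale=0.5, size=n))  # skewed costs

    ps = LogisticRegression().fit(X, A).predict_proba(X)[:, 1]        # propensity score
    m1 = LinearRegression().fit(X[A == 1], cost[A == 1]).predict(X)   # E[cost | A=1, X]
    m0 = LinearRegression().fit(X[A == 0], cost[A == 0]).predict(X)   # E[cost | A=0, X]

    # Augmented IPW (doubly robust) estimate of the mean cost difference, treated minus untreated.
    dr1 = m1 + A * (cost - m1) / ps
    dr0 = m0 + (1 - A) * (cost - m0) / (1 - ps)
    print((dr1 - dr0).mean())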

3. Pharmacoepidemiol Drug Saf. 2015 Dec 16. doi: 10.1002/pds.3924. [Epub ahead of print]

Conditions for confounding of interactions.

Liu A(1,)(2,)(3), Abrahamowicz M(1,)(2), Siemiatycki J(3,)(4).
(1)Department of Epidemiology, Biostatistics and Occupational Health, McGill University, Montreal, Quebec, Canada. (2)Division of Clinical Epidemiology, Research Institute of the McGill University Health Centre, Montreal, Quebec, Canada. (3)Division of Population Health, CRCHUM Research Center, Montreal, Quebec, Canada. (4)Department of Social and Preventive Medicine, University of Montreal, Montreal, Quebec, Canada.

PURPOSE: Pharmaco-epidemiology increasingly investigates drug-drug or drug-covariate interactions. Yet, conditions for confounding of interactions have not been elucidated. We explored the conditions under which the estimates of interactions in logistic regression are affected by confounding bias.
METHODS: We rely on analytical derivations to investigate the conditions and then use simulations to confirm our analytical results and to quantify the impact of selected parameters on the bias of the interaction estimates.
RESULTS: Failure to adjust for a risk factor U results in a biased estimate of the interaction between exposures E1 and E2 on a binary outcome Y if the association between U and E1 varies depending on the value of E2. The resulting confounding bias increases with: (i) the prevalence of the confounder U; (ii) the strength of the U-Y association; and (iii) the heterogeneity in the association of E1 with U across the strata of E2. A variable that is not a confounder for the main effects of E1 and E2 may still act as an important confounder for their interaction.
CONCLUSIONS: Studies of interactions should attempt to identify-as potential confounders-those risk factors whose associations with one of the exposures in the interaction term may be modified by the other exposure. Copyright © 2015 John Wiley & Sons, Ltd. PMID: 26676843 [PubMed – as supplied by publisher]
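
The condition in the RESULTS above lends itself to a small simulation check: when U is associated with E1 only within one stratum of E2 and is itself a risk factor for Y, omitting U biases the estimated E1*E2 interaction even though the true model contains none. The parameter values below are illustrative assumptions, not the simulation design used by the authors.

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(3)
    n = 100000
    E2 = rng.binomial(1, 0.5, n)
    U = rng.binomial(1, 0.3, n)
    # The U-E1 association is present only when E2 = 1 (heterogeneous across strata of E2).
    E1 = rng.binomial(1, 1 / (1 + np.exp(-(-1 + 1.5 * U * E2))))
    # True outcome model: no E1*E2 interaction, but U raises the risk of Y.
    Y = rng.binomial(1, 1 / (1 + np.exp(-(-3 + 0.4 * E1 + 0.4 * E2 + 1.2 * U))))

    def interaction_log_or(covariates):
        design = sm.add_constant(np.column_stack(covariates + [E1 * E2]))
        return sm.Logit(Y, design).fit(disp=0).params[-1]

    print("adjusted for U:", interaction_log_or([E1, E2, U]))   # close to 0
    print("U omitted:     ", interaction_log_or([E1, E2]))      # biased away from 0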

4. Biostatistics. 2015 Dec 14. pii: kxv050. [Epub ahead of print]

Semiparametric regression for the weighted composite endpoint of recurrent and terminal events.

Mao L(1), Lin DY(2).
(1)Department of Biostatistics, University of North Carolina, Chapel Hill, NC 27599-7420, USA. (2)Department of Biostatistics, University of North Carolina, Chapel Hill, NC 27599-7420, USA lin@bios.unc.edu.

Recurrent event data are commonly encountered in clinical and epidemiological studies. A major complication arises when recurrent events are terminated by death. To assess the overall effects of covariates on the two types of events, we define a weighted composite endpoint as the cumulative number of recurrent and terminal events properly weighted by the relative severity of each event. We propose a semiparametric proportional rates model which specifies that the (possibly time-varying) covariates have multiplicative effects on the rate function of the weighted composite endpoint while leaving the form of the rate function and the dependence among recurrent and terminal events completely unspecified. We construct appropriate estimators for the regression parameters and the cumulative frequency function. We show that the estimators are consistent and asymptotically normal with variances that can be consistently estimated. We also develop graphical and numerical procedures for checking the adequacy of the model. We then demonstrate the usefulness of the proposed methods in simulation studies. Finally, we provide an application to a major cardiovascular clinical trial. © The Author 2015. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com. PMID: 26668069 [PubMed – as supplied by publisher]

5. J Comp Eff Res. 2015 Dec 4. [Epub ahead of print]

Comparing high-dimensional confounder control methods for rapid cohort studies from electronic health records.

Low YS(1), Gallego B(2), Shah NH(1).
(1)Stanford Center for Biomedical Informatics Research, Stanford University, Stanford, CA 94305, USA. (2)Center for Health Informatics, Australian Institute of Health Innovation, Macquarie University, Sydney, Australia.

AIMS: Electronic health records (EHR), containing rich clinical histories of large patient populations, can provide evidence for clinical decisions when evidence from trials and literature is absent. To enable such observational studies from EHR in real time, particularly in emergencies, rapid confounder control methods that can handle numerous variables and adjust for biases are imperative. This study compares the performance of 18 automatic confounder control methods.
METHODS: Methods include propensity scores, direct adjustment by machine learning, similarity matching and resampling in two simulated and one real-world EHR datasets.
RESULTS & CONCLUSIONS: Direct adjustment by lasso regression and ensemble models involving multiple resamples have performance comparable to expert-based propensity scores and thus, may help provide real-time EHR-based evidence for timely clinical decisions. PMID: 26634383 [PubMed – as supplied by publisher]
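
One of the better-performing strategies above, direct adjustment by lasso regression, can be sketched as a simple post-lasso variant: use an L1-penalized outcome model to select covariates, then refit an unpenalized model that includes treatment. This is a rough sketch on simulated data and may differ from the exact implementation in the paper.

    import numpy as np
    import statsmodels.api as sm
    from sklearn.linear_model import LogisticRegressionCV

    rng = np.random.default_rng(4)
    n, p = 2000, 200
    X = rng.normal(size=(n, p))                                  # many candidate confounders
    A = rng.binomial(1, 1 / (1 + np.exp(-X[:, 0] - X[:, 1])))
    Y = rng.binomial(1, 1 / (1 + np.exp(-(-2 + 0.5 * A + X[:, 0] + X[:, 1]))))

    # L1-penalized outcome model used only to select covariates.
    lasso = LogisticRegressionCV(penalty="l1", solver="liblinear", Cs=10).fit(X, Y)
    keep = np.flatnonzero(lasso.coef_[0])

    # Unpenalized refit including treatment and the selected covariates.
    design = sm.add_constant(np.column_stack([A, X[:, keep]]))
    fit = sm.Logit(Y, design).fit(disp=0)
    print("adjusted log odds ratio for treatment:", fit.params[1])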

6. Value Health. 2015 Dec;18(8):1057-62. doi: 10.1016/j.jval.2015.08.015. Epub 2015 Nov 2.

Comparison of Benefit-Risk Assessment Methods for Prospective Monitoring of Newly Marketed Drugs: A Simulation Study.

Gagne JJ(1), Najafzadeh M(2), Choudhry NK(2), Bykov K(2), Kahler KH(3), Martin D(3), Rogers JR(2), Schneeweiss S(2).
(1)Division of Pharmacoepidemiology and Pharmacoeconomics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA. Electronic address: jgagne1@partners.org. (2)Division of Pharmacoepidemiology and Pharmacoeconomics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA. (3)Novartis Pharmaceuticals Corporation, East Hanover, NJ, USA.

OBJECTIVES: To compare benefit-risk assessment (BRA) methods for determining whether and when sufficient evidence exists to indicate that one drug is favorable over another in prospective monitoring.
METHODS: We simulated prospective monitoring of a new drug (A) versus an alternative drug (B) with respect to two beneficial and three harmful outcomes. We generated data for 1000 iterations of six scenarios and applied four BRA metrics: number needed to treat and number needed to harm (NNT|NNH), incremental net benefit (INB) with maximum acceptable risk, INB with relative-value-adjusted life-years, and INB with quality-adjusted life-years. We determined the proportion of iterations in which the 99% confidence interval for each metric included and excluded the null and we calculated mean time to alerting.
RESULTS: With no true difference in any outcome between drugs A and B, the proportion of iterations including the null was lowest for INB with relative-value-adjusted life-years (64%) and highest for INB with quality-adjusted life-years (76%). When drug A was more effective and the drugs were equally safe, all metrics indicated net favorability of A in more than 70% of the iterations. When drug A was safer than drug B, NNT|NNH had the highest proportion of iterations indicating net favorability of drug A (65%). Mean time to alerting was similar among methods across the six scenarios.
CONCLUSIONS: BRA metrics can be useful for identifying net favorability when applied to prospective monitoring of a new drug versus an alternative drug. INB-based approaches similarly outperform unweighted NNT|NNH approaches. Time to alerting was similar across approaches. Copyright © 2015 International Society for Pharmacoeconomics and Outcomes Research (ISPOR). Published by Elsevier Inc. All rights reserved. PMID: 26686791 [PubMed – in process]
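
Two of the metrics compared above can be illustrated with simple arithmetic on hypothetical cumulative risks for drug A versus drug B; the risks and outcome weights below are assumptions for illustration, not values from the simulation study.

    # Hypothetical cumulative risks for one beneficial and one harmful outcome.
    benefit_risk_a, benefit_risk_b = 0.12, 0.15   # outcome the drugs are meant to prevent
    harm_risk_a, harm_risk_b = 0.030, 0.020       # adverse outcome

    # Number needed to treat and number needed to harm (drug A relative to drug B).
    nnt = 1 / (benefit_risk_b - benefit_risk_a)   # ~33: treat 33 patients to prevent one event
    nnh = 1 / (harm_risk_a - harm_risk_b)         # ~100: one extra harm per 100 treated
    print(f"NNT = {nnt:.0f}, NNH = {nnh:.0f}")    # favorable for A when NNH exceeds NNT

    # A generic incremental net benefit: weight each outcome by its relative value.
    w_benefit, w_harm = 1.0, 0.5                  # harm judged half as important (assumption)
    inb = w_benefit * (benefit_risk_b - benefit_risk_a) - w_harm * (harm_risk_a - harm_risk_b)
    print(f"INB = {inb:.3f}")                     # > 0 favors drug A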

November 2015

1. Pharmacoepidemiol Drug Saf. 2015 Nov 26. doi: 10.1002/pds.3922. [Epub ahead of print]

Controlling confounding of treatment effects in administrative data in the presence of time-varying baseline confounders.

Gilbertson DT(1), Bradbury BD(2), Wetmore JB(1), Weinhandl ED(1), Monda KL(2), Liu J(1), Brookhart MA(3), Gustafson SK(1), Roberts T(1), Collins AJ(1,)(4), Rothman KJ(5,)(6).
(1)Chronic Disease Research Group, Minneapolis Medical Research Foundation, Minneapolis, MN, USA. (2)Center for Observational Research, Amgen, Inc., Thousand Oaks, CA, USA. (3)Department of Epidemiology, Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA. (4)Department of Medicine, University of Minnesota, Minneapolis, MN, USA. (5)RTI Health Solutions, Research Triangle Park, NC, USA. (6)Department of Epidemiology and Medicine, Boston University Medical Center, Boston, MA, USA.

PURPOSE: Confounding, a concern in nonexperimental research using administrative claims, is nearly ubiquitous in claims-based pharmacoepidemiology studies. A fixed-length look-back window for assessing comorbidity from claims is common, but it may be advantageous to use all historical claims. We assessed how the strength of association between a baseline-identified condition and subsequent mortality varied by when the condition was measured and investigated methods to control for confounding.
METHODS: For Medicare beneficiaries undergoing maintenance hemodialysis on 1 January 2008 (n = 222 343), we searched all Medicare claims, 1 January 2001 to 31 December 2007, for four conditions representing chronic and acute diseases, and classified claims by number of months preceding the index date. We used proportional hazard models to estimate the association between time of condition and subsequent mortality. We simulated a confounded comorbidity-exposure relationship and investigated an alternative method of adjustment when the association between the condition and mortality varied by proximity to follow-up start.
RESULTS: The magnitude of the mortality hazard ratio estimates for each condition investigated decreased toward unity as time increased between index date and most recent manifestation of the condition. Simulation showed more biased estimates of exposure-outcome associations if proximity to follow-up start was not considered.
CONCLUSIONS: Using all-available claims information during a baseline period, we found that for all conditions investigated, the association between a comorbid condition and subsequent mortality varied considerably depending on when the condition was measured. Improved confounding control may be achieved by considering the timing of claims relative to follow-up start. Copyright © 2015 John Wiley & Sons, Ltd. PMID: 26608680 [PubMed – as supplied by publisher]
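
The key data step described above, classifying each historical claim by how far before the index date it occurred and keeping the most recent manifestation, can be sketched with a small hypothetical claims table; the column names and dates are illustrative, not the Medicare data used in the study.

    import pandas as pd

    claims = pd.DataFrame({
        "id": [1, 1, 2, 3],
        "condition": ["MI", "MI", "pneumonia", "MI"],
        "claim_date": pd.to_datetime(["2007-11-02", "2005-03-10", "2002-07-01", "2007-12-20"]),
    })
    index_date = pd.Timestamp("2008-01-01")

    # Number of months between each claim and the start of follow-up.
    claims["months_before_index"] = (
        (index_date - claims["claim_date"]).dt.days / 30.44
    ).astype(int)

    # Most recent manifestation per person and condition; recency (rather than a simple
    # ever/never indicator) can then enter the adjustment model.
    recency = claims.groupby(["id", "condition"])["months_before_index"].min().reset_index()
    print(recency)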

2. Stat Methods Med Res. 2015 Nov 23. pii: 0962280215615899. [Epub ahead of print]

Causal mediation analysis with multiple causally non-ordered mediators.

Taguri M(1), Featherstone J(2), Cheng J(2).
(1)Department of Biostatistics, School of Medicine, Yokohama City University, Yokohama, Japan School of Dentistry, University of California, San Francisco, San Francisco, CA, USA taguri@yokohama-cu.ac.jp. (2)School of Dentistry, University of California, San Francisco, San Francisco, CA, USA.

In many health studies, researchers are interested in estimating the treatment effects on the outcome around and through an intermediate variable. Such causal mediation analyses aim to understand the mechanisms that explain the treatment effect. Although multiple mediators are often involved in real studies, most of the literature considered mediation analyses with one mediator at a time. In this article, we consider mediation analyses when there are causally non-ordered multiple mediators. Even if the mediators do not affect each other, the sum of two indirect effects through the two mediators considered separately may diverge from the joint natural indirect effect when there are additive interactions between the effects of the two mediators on the outcome. Therefore, we derive an equation for the joint natural indirect effect based on the individual mediation effects and their interactive effect, which helps us understand how the mediation effect works through the two mediators and relative contributions of the mediators and their interaction. We also discuss an extension for three mediators. The proposed method is illustrated using data from a randomized trial on the prevention of dental caries. © The Author(s) 2015. PMID: 26596350 [PubMed – as supplied by publisher]

3. Pharmacoepidemiol Drug Saf. 2015 Nov 24. doi: 10.1002/pds.3905. [Epub ahead of print]

Developing alerting thresholds for prospective drug safety monitoring.

Wangge G(1,)(2), Schneeweiss S(1), Glynn RJ(1,)(3), Gagne JJ(1).
(1)Division of Pharmacoepidemiology and Pharmacoeconomics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA. (2)Division of Epidemiology and Statistics, Department of Community Medicine, Faculty of Medicine, Universitas Indonesia, Jakarta, Indonesia. (3)Division of Preventive Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, United States.

BACKGROUND: Current methods for prospective drug safety monitoring focus on determining whether and when to generate safety alerts indicating that a new drug may be less safe than a comparator. Approaches are needed to develop safety thresholds that can be used to define whether a new drug is no less than or equally safe as the comparator.
OBJECTIVES: Our aim is to develop a framework for determining which safety statements can be made about a new drug and when they can be made during prospective monitoring.
METHODS: We developed a two-pronged approach to establish safety thresholds for active monitoring. First, we adapted concepts from setting margins in non-inferiority (NI) trials ("NI approach"). Second, we summarized NI margins used in published randomized trials and reviewed publicly available data from the US FDA’s website to identify the type and magnitude of evidence used in regulatory decisions involving withdrawals and black box warnings between 2009 and 2013 ("benchmark approach"). We applied the framework to a case study of dabigatran versus warfarin and major bleed.
RESULTS: We provide formulas on both risk ratio and risk difference scales for the NI approach that are analogous to threshold setting in NI trials but based on point estimates and using a maximum tolerable increase rather than a preservation factor. Using this approach, we established a safety threshold for the dabigatran case study that was within range of the findings from the benchmark approach (1.18 to 7.30). Comparing the safety threshold with post-approval studies of dabigatran versus warfarin indicated that no safety statement can be made.
CONCLUSIONS: The proposed framework expands the safety statements that can be made in current prospective drug safety monitoring systems. Copyright © 2015 John Wiley & Sons, Ltd. PMID: 26596260 [PubMed – as supplied by publisher]

4. Stat Med. 2015 Nov 22. doi: 10.1002/sim.6809. [Epub ahead of print]

Bayesian evidence synthesis for exploring generalizability of treatment effects: a case study of combining randomized and non-randomized results in diabetes.

Verde PE(1), Ohmann C(1), Morbach S(2), Icks A(3).
(1)Coordination Center for Clinical Trials, University of Duesseldorf, Duesseldorf, Germany. (2)Department of Diabetes and Angiology, Marienkrankenhaus, Hamburg, Germany. (3)Department of Public Health, University of Duesseldorf, Duesseldorf, Germany.

In this paper, we present a unified modeling framework to combine aggregated data from randomized controlled trials (RCTs) with individual participant data (IPD) from observational studies. Rather than simply pooling the available evidence into an overall treatment effect, adjusted for potential confounding, the intention of this work is to explore treatment effects in specific patient populations reflected by the IPD. In this way, by collecting IPD, we can potentially gain new insights from RCTs’ results, which cannot be seen using only a meta-analysis of RCTs. We present a new Bayesian hierarchical meta-regression model, which combines submodels, representing different types of data into a coherent analysis. Predictors of baseline risk are estimated from the individual data. Simultaneously, a bivariate random effects distribution of baseline risk and treatment effects is estimated from the combined individual and aggregate data. Therefore, given a subgroup of interest, the estimated treatment effect can be calculated through its correlation with baseline risk. We highlight different types of model parameters: those that are the focus of inference (e.g., treatment effect in a subgroup of patients) and those that are used to adjust for biases introduced by data collection processes (e.g., internal or external validity). The model is applied to a case study where RCTs’ results, investigating efficacy in the treatment of diabetic foot problems, are extrapolated to groups of patients treated in medical routine and who were enrolled in a prospective cohort study. Copyright © 2015 John Wiley & Sons, Ltd. PMID: 26593632 [PubMed – as supplied by publisher]

5. Am J Epidemiol. 2015 Nov 20. pii: kwv152. [Epub ahead of print]

The Impact of Sparse Follow-up on Marginal Structural Models for Time-to-Event Data.

Mojaverian N, Moodie EE, Bliu A, Klein MB.

The impact of risk factors on the amount of time taken to reach an endpoint is a common parameter of interest. Hazard ratios are often estimated using a discrete-time approximation, which works well when the by-interval event rate is low. However, if the intervals are made more frequent than the observation times, missing values will arise. We investigated common analytical approaches, including available-case (AC) analysis, last observation carried forward (LOCF), and multiple imputation (MI), in a setting where time-dependent covariates also act as mediators. We generated complete data to obtain monthly information for all individuals, and from the complete data, we selected "observed" data by assuming that follow-up visits occurred every 6 months. MI proved superior to LOCF and AC analyses when only data on confounding variables were missing; AC analysis also performed well when data for additional variables were missing completely at random. We applied the 3 approaches to data from the Canadian HIV-Hepatitis C Co-infection Cohort Study (2003-2014) to estimate the association of alcohol abuse with liver fibrosis. The AC and LOCF estimates were larger but less precise than those obtained from the analysis that employed MI. © The Author 2015. Published by Oxford University Press on behalf of the Johns Hopkins Bloomberg School of Public Health. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com. PMID: 26589708 [PubMed – as supplied by publisher]
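
Of the three approaches compared above, last observation carried forward is the simplest to show: expand the semi-annual visit data to a monthly grid and fill each gap with the last observed value (multiple imputation would instead model the missing months). The tiny data frame below is hypothetical.

    import numpy as np
    import pandas as pd

    visits = pd.DataFrame({
        "id": [1, 1, 1],
        "month": [0, 6, 12],
        "alcohol_abuse": [0, 1, 0],
    })
    # Monthly grid for the same person, with visits observed only every 6 months.
    grid = pd.DataFrame({"id": 1, "month": np.arange(0, 13)})
    monthly = grid.merge(visits, on=["id", "month"], how="left")
    monthly["alcohol_abuse_locf"] = monthly.groupby("id")["alcohol_abuse"].ffill()
    print(monthly)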

6. Biometrics. 2015 Nov 17. doi: 10.1111/biom.12451. [Epub ahead of print]

Instrumental variable method for time-to-event data using a pseudo-observation approach.

Kjaersgaard MI(1), Parner ET(1).
(1)Section for Biostatistics, Department of Public Health, Aarhus University, Aarhus, Denmark.

Observational studies are often in peril of unmeasured confounding. Instrumental variable analysis is a method for controlling for unmeasured confounding. As yet, theory on instrumental variable analysis of censored time-to-event data is scarce. We propose a pseudo-observation approach to instrumental variable analysis of the survival function, the restricted mean, and the cumulative incidence function in competing risks with right-censored data using generalized method of moments estimation. For the purpose of illustrating our proposed method, we study antidepressant exposure in pregnancy and risk of autism spectrum disorder in offspring, and the performance of the method is assessed through simulation studies. © 2015, The International Biometric Society. PMID: 26574740 [PubMed – as supplied by publisher]
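
The pseudo-observation step named above can be sketched as a leave-one-out (jackknife) calculation on the Kaplan-Meier estimate of survival at a chosen time point; the simulated data, horizon, and the downstream instrumental variable regression are assumptions for illustration.

    import numpy as np

    rng = np.random.default_rng(5)
    n, t0 = 300, 2.0
    event_time = rng.exponential(3.0, size=n)
    cens_time = rng.exponential(5.0, size=n)
    time = np.minimum(event_time, cens_time)
    event = (event_time <= cens_time).astype(int)

    def km_survival(t, times, events):
        # Kaplan-Meier estimate of S(t) from right-censored data.
        s = 1.0
        for u in np.sort(np.unique(times[(events == 1) & (times <= t)])):
            s *= 1 - ((times == u) & (events == 1)).sum() / (times >= u).sum()
        return s

    full = km_survival(t0, time, event)
    # Jackknife pseudo-observation for subject i: n*S(t0) - (n-1)*S_(-i)(t0).
    pseudo = np.array([
        n * full - (n - 1) * km_survival(t0, np.delete(time, i), np.delete(event, i))
        for i in range(n)
    ])
    # The pseudo-observations then replace the censored outcome in an ordinary
    # (e.g., GMM / two-stage least squares) instrumental variable regression.
    print(full, pseudo.mean())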

7. Clin Trials. 2015 Nov 15. pii: 1740774515614542. [Epub ahead of print]

Sample size under the additive hazards model.

McDaniel LS(1), Yu M(2), Chappell R(3).
(1)Biostatistics Program, School of Public Health, Louisiana State University Health Sciences Center, New Orleans, LA, USA lmcda4@lsuhsc.edu. (2)Department of Biostatistics & Medical Informatics, University of Wisconsin-Madison, Madison, WI, USA. (3)Department of Biostatistics & Medical Informatics, University of Wisconsin-Madison, Madison, WI, USA Department of Statistics, University of Wisconsin-Madison, Madison, WI, USA.

BACKGROUND: The additive hazards model can be easier to interpret and in some cases fits better than the proportional hazards model. However, sample size formulas for clinical trials with time to event outcomes are currently based on either the proportional hazards assumption or an assumption of constant hazards.
AIMS: The goal is to provide sample size formulas for superiority and non-inferiority trials assuming an additive hazards model but no specific distribution, along with evaluations of the performance of the formulas.
METHODS: Formulas are presented that determine the required sample size for a given scenario under the additive hazards model. Simulations are conducted to ensure that the formulas attain the desired power. For illustration, the non-inferiority sample size formula is applied to the calculations in the SPORTIF III trial of stroke prevention in atrial fibrillation.
CONCLUSION: Simulation results show that the sample size calculations lead to the correct power. Sample size is easily calculated using a tool that is available on the web at http://leemcdaniel.github.io/samplesize.html. © The Author(s) 2015. PMID: 26572562 [PubMed – as supplied by publisher]

8. Int J Biostat. 2015 Nov 1;11(2):203-22. doi: 10.1515/ijb-2014-0055.

Structural Nested Mean Models to Estimate the Effects of Time-Varying Treatments on Clustered Outcomes.

He J, Stephens-Shields A, Joffe M.

In assessing the efficacy of a time-varying treatment, structural nested mean models (SNMMs) are useful in dealing with confounding by variables affected by earlier treatments. These models often consider treatment allocation and repeated measures at the individual level. We extend SNMMs to clustered observations with time-varying confounding and treatments. We demonstrate how to formulate models with both cluster- and unit-level treatments and show how to derive semiparametric estimators of parameters in such models. For unit-level treatments, we consider interference, namely the effect of treatment on outcomes in other units of the same cluster. The properties of estimators are evaluated through simulations and compared with the conventional GEE regression method for clustered outcomes. To illustrate our method, we use data from the treatment arm of a glaucoma clinical trial to compare the effectiveness of two commonly used ocular hypertension medications. PMID: 26115504 [PubMed – in process]

9. Syst Rev. 2015 Nov 5;4(1):147. doi: 10.1186/s13643-015-0133-0.

Network meta-analysis incorporating randomized controlled trials and non-randomized comparative cohort studies for assessing the safety and effectiveness of medical treatments: challenges and opportunities.

Cameron C(1,)(2,)(3), Fireman B(4), Hutton B(5,)(6), Clifford T(7,)(8), Coyle D(9), Wells G(10), Dormuth CR(11), Platt R(12), Toh S(13).
(1)School of Epidemiology, Public Health and Preventive Medicine, University of Ottawa, 451 Smyth Road, Suite RGN 3105, Ottawa, ON, K1H 8 M5, Canada. ccame056@uottawa.ca. (2)Department of Population Medicine, Harvard Medical School and Harvard Pilgrim Health Care Institute, 133 Brookline Avenue, 6th Floor, Boston, MA, 02215, USA. ccame056@uottawa.ca. (3)Evidence Synthesis Group, Cornerstone Research Group Inc., 3228 South Service Road, Burlington, ON, L7N 3H8, Canada. ccame056@uottawa.ca. (4)Division of Research, Kaiser Permanente Northern California, 2000 Broadway, Oakland, CA, 94612, USA. bruce.fireman@kp.org. (5)School of Epidemiology, Public Health and Preventive Medicine, University of Ottawa, 451 Smyth Road, Suite RGN 3105, Ottawa, ON, K1H 8 M5, Canada. bhutton@ohri.ca. (6)Ottawa Hospital Research Institute, Center for  Practice Changing Research Building, Ottawa Hospital-General Campus, PO Box 201B, Ottawa, ON, K1H 8 L6, Canada. bhutton@ohri.ca. (7)School of Epidemiology, Public  Health and Preventive Medicine, University of Ottawa, 451 Smyth Road, Suite RGN 3105, Ottawa, ON, K1H 8 M5, Canada. Tammyc@cadth.ca. (8)Canadian Agency for Drugs and Technologies in Health, 865 Carling Ave., Suite 600, Ottawa, ON, K1S 5S8, Canada. Tammyc@cadth.ca. (9)School of Epidemiology, Public Health and Preventive  Medicine, University of Ottawa, 451 Smyth Road, Suite RGN 3105, Ottawa, ON, K1H 8 M5, Canada. dcoyle@uottawa.ca. (10)School of Epidemiology, Public Health and Preventive Medicine, University of Ottawa, 451 Smyth Road, Suite RGN 3105, Ottawa, ON, K1H 8 M5, Canada. gawells@ottawaheart.ca. (11)Department of Anesthesiology, Pharmacology and Therapeutics, University of British Columbia, Vancouver, BC, V6T 1Z3, Canada. colin.dormuth@ti.ubc.ca. (12)Department of Epidemiology and Biostatistics, McGill University, 4060 Ste Catherine W #300, Montréal, Québec, H3Z 2Z3, Canada. robert.platt@mcgill.ca. (13)Department of Population Medicine, Harvard Medical School and Harvard Pilgrim Health Care Institute, 133 Brookline Avenue, 6th Floor, Boston, MA, 02215, USA. darren_toh@harvardpilgrim.org.

Network meta-analysis is increasingly used to allow comparison of multiple treatment alternatives simultaneously, some of which may not have been compared directly in primary research studies. The majority of network meta-analyses published to date have incorporated data from randomized controlled trials (RCTs) only; however, inclusion of non-randomized studies may sometimes be considered. Non-randomized studies can complement RCTs or address some of their limitations, such as short follow-up time, small sample size, highly selected population, high cost, and ethical restrictions. In this paper, we discuss the challenges and opportunities of incorporating both RCTs and non-randomized comparative cohort studies into network meta-analysis for assessing the safety and effectiveness of medical treatments. Non-randomized studies with inadequate control of biases such as confounding may threaten the validity of the entire network meta-analysis. Therefore, identification and inclusion of non-randomized studies must balance their strengths with their limitations. Inclusion of both RCTs and non-randomized studies in network meta-analysis will likely increase in the future due to the growing need to assess multiple treatments simultaneously, the availability of higher quality non-randomized data and more valid methods, and the increased use  of progressive licensing and product listing agreements requiring collection of data over the life cycle of medical products. Inappropriate inclusion of non-randomized studies could perpetuate the biases that are unknown, unmeasured,  or uncontrolled. However, thoughtful integration of randomized and non-randomized studies may offer opportunities to provide more timely, comprehensive, and generalizable evidence about the comparative safety and effectiveness of medical treatments. PMCID: PMC4634799 PMID: 26537988  [PubMed – in process]

10. Stat Med. 2015 Nov 20;34(26):3381-98. doi: 10.1002/sim.6532. Epub 2015 May 26.

Estimation of causal effects of binary treatments in unconfounded studies.

Gutman R(1), Rubin DB(1).
(1)Department of Statistics, Harvard University, 1 Oxford St, Cambridge, MA 02138, U.S.A.

Estimation of causal effects in non-randomized studies comprises two distinct phases: design, without outcome data, and analysis of the outcome data according  to a specified protocol. Recently, Gutman and Rubin (2013) proposed a new analysis-phase method for estimating treatment effects when the outcome is binary and there is only one covariate, which viewed causal effect estimation explicitly as a missing data problem. Here, we extend this method to situations with continuous outcomes and multiple covariates and compare it with other commonly used methods (such as matching, subclassification, weighting, and covariance adjustment). We show, using an extensive simulation, that of all methods considered, and in many of the experimental conditions examined, our new ‘multiple-imputation using two subclassification splines’ method appears to be the most efficient and has coverage levels that are closest to nominal. In addition, it can estimate finite population average causal effects as well as non-linear causal estimands. This type of analysis also allows the identification of subgroups of units for which the effect appears to be especially beneficial or harmful. Copyright © 2015 John Wiley & Sons, Ltd. PMID: 26013308  [PubMed – in process]

October 2015

1. Am J Epidemiol. 2015 Oct 29. pii: kwv130. [Epub ahead of print]

Bounding Formulas for Selection Bias.

Huang TH, Lee WC.

Researchers conducting observational studies need to consider 3 types of biases: selection bias, information bias, and confounding bias. A whole arsenal of statistical tools can be used to deal with information and confounding biases. However, methods for addressing selection bias and unmeasured confounding are less developed. In this paper, we propose general bounding formulas for bias, including selection bias and unmeasured confounding. This should help researchers make more prudent interpretations of their (potentially biased) results. © The Author 2015. Published by Oxford University Press on behalf of the Johns Hopkins Bloomberg School of Public Health. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com. PMID: 26519426 [PubMed – as supplied by publisher]

2. Epidemiology. 2015 Oct 19. [Epub ahead of print]

Selection bias due to loss to follow up in cohort studies.

Howe CJ(1), Cole SR, Lau B, Napravnik S, Eron JJ Jr.
(1)a Department of Epidemiology, Center for Population Health and Clinical Epidemiology, Brown University School of Public Health, Providence, Rhode Island b Department of Epidemiology, University of North Carolina Gillings School of Global Public Health, Chapel Hill, North Carolina c Department of Epidemiology, Johns Hopkins Bloomberg School of Public Health, Baltimore, Maryland d Division of Infectious Diseases, Department of Medicine, University of North Carolina School of Medicine, Chapel Hill, North Carolina.

Selection bias due to loss to follow up represents a threat to the internal validity of estimates derived from cohort studies. Over the last fifteen years, stratification-based techniques as well as methods such as inverse probability-of-censoring weighted estimation have been more prominently discussed and offered as a means to correct for selection bias. However, unlike correcting for confounding bias using inverse weighting, uptake of inverse probability-of-censoring weighted estimation as well as competing methods has been limited in the applied epidemiologic literature. To motivate greater use of inverse probability-of-censoring weighted estimation and competing methods, we use causal diagrams to describe the sources of selection bias in cohort studies employing a time-to-event framework when the quantity of interest is an absolute measure (e.g. absolute risk, survival function) or relative effect measure (e.g., risk difference, risk ratio). We highlight that whether a given estimate obtained from standard methods is potentially subject to selection bias depends on the causal diagram and the measure. We first broadly describe inverse probability-of-censoring weighted estimation and then give a simple example to demonstrate in detail how inverse probability-of-censoring weighted estimation mitigates selection bias and describe challenges to estimation. We then modify complex, real-world data from the University of North Carolina Center for AIDS Research HIV clinical cohort study and estimate the absolute and relative change in the occurrence of death with and without inverse probability-of-censoring weighted correction using the modified University of North Carolina data. We provide SAS code to aid with implementation of inverse probability-of-censoring weighted techniques. PMID: 26484424 [PubMed – as supplied by publisher]

3. BMC Med Res Methodol. 2015 Oct 13;15:83. doi: 10.1186/s12874-015-0074-2.

Evaluation of a weighting approach for performing sensitivity analysis after multiple imputation.

Hayati Rezvan P(1), White IR(2), Lee KJ(3,)(4), Carlin JB(5,)(6,)(7), Simpson JA(8).
(1)Centre for Epidemiology and Biostatistics, Melbourne School of Population and Global Health, The University of Melbourne, Parkville, Melbourne, VIC, Australia. phayati@student.unimelb.edu.au. (2)MRC Biostatistics Unit, Cambridge Institute of Public Health, Cambridge, CB2 0SR, UK. ian.white@mrc-bsu.cam.ac.uk. (3)Clinical Epidemiology and Biostatistics Unit, Murdoch Childrens Research Institute, Parkville, Melbourne, VIC, Australia. katherine.lee@mcri.edu.au. (4)Department of Paediatrics, The University of Melbourne, Parkville, Melbourne, VIC, Australia. katherine.lee@mcri.edu.au. (5)Centre for Epidemiology and Biostatistics, Melbourne School of Population and Global Health, The University of Melbourne, Parkville, Melbourne, VIC, Australia. john.carlin@mcri.edu.au. (6)Clinical Epidemiology and Biostatistics Unit, Murdoch Childrens Research Institute, Parkville, Melbourne, VIC, Australia. john.carlin@mcri.edu.au. (7)Department of Paediatrics, The University of Melbourne, Parkville, Melbourne, VIC, Australia. john.carlin@mcri.edu.au. (8)Centre for Epidemiology and Biostatistics, Melbourne School of Population and Global Health, The University of Melbourne, Parkville, Melbourne, VIC, Australia. julieas@unimelb.edu.au.

BACKGROUND: Multiple imputation (MI) is a well-recognised statistical technique for handling missing data. As usually implemented in standard statistical software, MI assumes that data are ‘Missing at random’ (MAR); an assumption that in many settings is implausible. It is not possible to distinguish whether data are MAR or ‘Missing not at random’ (MNAR) using the observed data, so it is desirable to discover the impact of departures from the MAR assumption on the MI results by conducting sensitivity analyses. A weighting approach based on a selection model has been proposed for performing MNAR analyses to assess the robustness of results obtained under standard MI to departures from MAR.
METHODS: In this article, we use simulation to evaluate the weighting approach as a method for exploring possible departures from MAR, with missingness in a single variable, where the parameters of interest are the marginal mean (and probability) of a partially observed outcome variable and a measure of association between the outcome and a fully observed exposure. The simulation studies compare the weighting-based MNAR estimates for various numbers of imputations in small and large samples, for moderate to large magnitudes of departure from MAR, where the degree of departure from MAR was assumed known. Further, we evaluated a proposed graphical method, which uses the dataset with missing data, for obtaining a plausible range of values for the parameter that quantifies the magnitude of departure from MAR.
RESULTS: Our simulation studies confirm that the weighting approach outperformed the MAR approach, but it still suffered from bias. In particular, our findings demonstrate that the weighting approach provides biased parameter estimates, even when a large number of imputations is performed. In the examples presented, the graphical approach for selecting a range of values for the possible departures from MAR did not capture the true parameter value of departure used in generating the data.
CONCLUSIONS: Overall, the weighting approach is not recommended for sensitivity analyses following MI, and further research is required to develop more appropriate methods to perform such sensitivity analyses. PMCID: PMC4604630 PMID: 26464305 [PubMed – in process]

4. Pharmacoepidemiol Drug Saf. 2015 Oct 13. doi: 10.1002/pds.3885. [Epub ahead of print]

Quantifying the impact of time-varying baseline risk adjustment in the self-controlled risk interval design.

Li L(1), Kulldorff M(1), Russek-Cohen E(2), Kawai AT(1), Hua W(2).
(1)Department of Population Medicine, Harvard Pilgrim Health Care Institute and Harvard Medical School, Boston, MA, USA. (2)Office of Biostatistics and Epidemiology, Center for Biologics Evaluation and Research, Food and Drug Administration, Silver Spring, MD, USA.

PURPOSE: The self-controlled risk interval design is commonly used to assess the association between an acute exposure and an adverse event of interest, implicitly adjusting for fixed, non-time-varying covariates. Explicit adjustment needs to be made for time-varying covariates, for example, age in young children. It can be performed via either a fixed or random adjustment. The random-adjustment approach can provide valid point and interval estimates but requires access to individual-level data for an unexposed baseline sample. The fixed-adjustment approach does not have this requirement and will provide a valid point estimate but may underestimate the variance. We conducted a comprehensive simulation study to evaluate their performance.
METHODS: We designed the simulation study using empirical data from the Food and Drug Administration-sponsored Mini-Sentinel Post-licensure Rapid Immunization Safety Monitoring Rotavirus Vaccines and Intussusception study in children 5-36.9 weeks of age. The time-varying confounder is age. We considered a variety of design parameters including sample size, relative risk, time-varying baseline risks, and risk interval length.
RESULTS: The random-adjustment approach has very good performance in almost all considered settings. The fixed-adjustment approach can be used as a good alternative when the number of events used to estimate the time-varying baseline risks is at least the number of events used to estimate the relative risk, which is almost always the case.
CONCLUSIONS: We successfully identified settings in which the fixed-adjustment approach can be used as a good alternative and provided guidelines on the selection and implementation of appropriate analyses for the self-controlled risk interval design. Copyright © 2015 John Wiley & Sons, Ltd. PMID: 26464236 [PubMed – as supplied by publisher]

5. Epidemiology. 2015 Oct 1. [Epub ahead of print]

Adjusting for Confounding in Early Postlaunch Settings: Going Beyond Logistic Regression Models.

Schmidt AF(1), Klungel OH, Groenwold RH; GetReal Consortium.
(1)From the aJulius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht, The Netherlands; bDivision of Pharmacoepidemiology and Clinical Pharmacology, Utrecht Institute for Pharmaceutical Sciences, Utrecht, The Netherlands; cDepartment of Farm Animal Health, Faculty of Veterinary Medicine, Utrecht University, Utrecht, The Netherlands; and dInstitute of Cardiovascular Science, Faculty of Population Health, University College London, London, United Kingdom.

BACKGROUND: Postlaunch data on medical treatments can be analyzed to explore adverse events or relative effectiveness in real-life settings. These analyses are often complicated by the number of potential confounders and the possibility of model misspecification.
METHODS: We conducted a simulation study to compare the performance of logistic regression, propensity score, disease risk score, and stabilized inverse probability weighting methods to adjust for confounding. Model misspecification was induced in the independent derivation dataset. We evaluated performance using relative bias and confidence interval coverage of the true effect, among other metrics.
RESULTS: At low events per coefficient (1.0 and 0.5), the logistic regression estimates had a large relative bias (greater than -100%). Bias of the disease risk score estimates was at most 13.48% and 18.83%. For the propensity score model, this was 8.74% and >100%, respectively. At events per coefficient of 1.0 and 0.5, inverse probability weighting frequently failed or reduced to a crude regression, resulting in biases of -8.49% and 24.55%. Coverage of logistic regression estimates became less than the nominal level at events per coefficient ≤5. For the disease risk score, inverse probability weighting, and propensity score, coverage became less than nominal at events per coefficient ≤2.5, ≤1.0, and ≤1.0, respectively. Bias of misspecified disease risk score models was 16.55%.
CONCLUSION: In settings with low events/exposed subjects per coefficient, disease risk score methods can be useful alternatives to logistic regression models, especially when propensity score models cannot be used. Despite disease risk score methods performing better than logistic regression and propensity score models in small events per coefficient settings, bias and coverage still deviated from nominal levels. PMID: 26436519 [PubMed – as supplied by publisher]
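
One common way to construct the disease risk score evaluated above is to fit the outcome model among the unexposed, predict each subject's baseline risk, and then adjust for that single score; the simulated data below are an illustration, not the paper's simulation design.

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(6)
    n = 5000
    X = rng.normal(size=(n, 5))
    A = rng.binomial(1, 1 / (1 + np.exp(-X[:, 0])))
    Y = rng.binomial(1, 1 / (1 + np.exp(-(-2.5 + 0.7 * A + X @ np.array([0.8, 0.4, 0.2, 0.0, 0.0])))))

    # Fit the disease risk score model among the unexposed only.
    unexposed = A == 0
    drs_fit = sm.Logit(Y[unexposed], sm.add_constant(X[unexposed])).fit(disp=0)
    drs = drs_fit.predict(sm.add_constant(X))        # predicted baseline risk for everyone

    # Estimate the treatment effect adjusting for the score rather than all covariates.
    outcome_fit = sm.Logit(Y, sm.add_constant(np.column_stack([A, drs]))).fit(disp=0)
    print("log odds ratio for treatment adjusted for DRS:", outcome_fit.params[1])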

6. Biom J. 2015 Oct 20. doi: 10.1002/bimj.201400102. [Epub ahead of print]

Treatment effect heterogeneity for univariate subgroups in clinical trials: Shrinkage, standardization, or else.

Varadhan R(1,)(2), Wang SJ(3,)(4).
(1)Division of Biostatistics and Bioinformatics, Department of Oncology, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University, Baltimore, MD, USA. (2)Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA. (3)Office of Biostatistics, OTS/CDER, FDA, Silver Spring, MD, 20993, USA. (4)Engineering and Applied Science Programs for Professionals, Johns Hopkins University, Baltimore, MD, USA.

Treatment effect heterogeneity is a well-recognized phenomenon in randomized controlled clinical trials. In this paper, we discuss subgroup analyses with prespecified subgroups of clinical or biological importance. We explore various alternatives to the naive (the traditional univariate) subgroup analyses to address the issues of multiplicity and confounding. Specifically, we consider a model-based Bayesian shrinkage (Bayes-DS) and a nonparametric, empirical Bayes shrinkage approach (Emp-Bayes) to temper the optimism of traditional univariate subgroup analyses; a standardization approach (standardization) that accounts for correlation between baseline covariates; and a model-based maximum likelihood estimation (MLE) approach. The Bayes-DS and Emp-Bayes methods model the variation in subgroup-specific treatment effect rather than testing the null hypothesis of no difference between subgroups. The standardization approach addresses the issue of confounding in subgroup analyses. The MLE approach is considered only for comparison in simulation studies as the "truth" since the data were generated from the same model. Using the characteristics of a hypothetical large outcome trial, we perform simulation studies and articulate the utilities and potential limitations of these estimators. Simulation results indicate that Bayes-DS and Emp-Bayes can protect against optimism present in the naïve approach. Due to its simplicity, the naïve approach should be the reference for reporting univariate subgroup-specific treatment effect estimates from exploratory subgroup analyses. Standardization, although it tends to have a larger variance, is suggested when it is important to address the confounding of univariate subgroup effects due to correlation between baseline covariates. The Bayes-DS approach is available as an R package (DSBayes). © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim. PMID: 26485117 [PubMed – as supplied by publisher]
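
A generic normal-normal shrinkage of subgroup estimates toward the overall effect conveys the flavor of the empirical Bayes approach discussed above; the subgroup estimates, standard errors, and the method-of-moments variance estimate below are illustrative assumptions, not the Emp-Bayes or DSBayes implementations evaluated in the paper.

    import numpy as np

    est = np.array([0.35, 0.05, 0.60, -0.10])   # subgroup-specific treatment effect estimates
    se = np.array([0.15, 0.20, 0.25, 0.18])     # their standard errors (illustrative)

    overall = np.average(est, weights=1 / se**2)           # precision-weighted overall effect
    tau2 = max(0.0, np.var(est, ddof=1) - np.mean(se**2))  # between-subgroup variance (>= 0)
    shrinkage = tau2 / (tau2 + se**2)                      # weight on each subgroup's own estimate
    shrunk = overall + shrinkage * (est - overall)         # estimates pulled toward the overall effect
    print(shrunk)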

7. J Comp Eff Res. 2015 Sep;4(5):455-63. doi: 10.2217/cer.15.23. Epub 2015 Oct 5.

Can statistical linkage of missing variables reduce bias in treatment effect estimates in comparative effectiveness research studies?

Crown W(1), Chang J(1), Olson M(2), Kahler K(3), Swindle J(4), Buzinec P(5), Shah N(6), Borah B(7).
(1)Optum Labs, One Main Street, 10th Floor, Cambridge, MA 02142, USA. (2)Global Head HEOR Excellence, Novartis Pharma AG, 4056, Basel, Switzerland. (3)Novartis Pharmaceuticals Corporation, One Health Plaza, East Hanover, NJ 07936-1080, USA. (4)Health Economics & Outcomes Research Optum, Inc., 200 E Randolph, Suite 5300, IL, 60601, USA. (5)Health Economics & Outcomes Research MN002-0258, 12125 Technology Drive, Eden Prairie, MN 55344, USA. (6)Division of Health Care Policy & Research, Mayo Clinic, 200 First St SW, Rochester, MN 55905, USA. (7)Mayo College of Medicine, Division of Healthcare & Medicine, 200 First St SW, Rochester, MN 55905, USA.

AIM: Missing data, particularly missing variables, can create serious analytic challenges in observational comparative effectiveness research studies. Statistical linkage of datasets is a potential method for incorporating missing variables. Prior studies have focused upon the bias introduced by imperfect linkage.
METHODS: This analysis uses a case study of hepatitis C patients to estimate the net effect of statistical linkage on bias, also accounting for the potential reduction in missing variable bias.
RESULTS: The results show that statistical linkage can reduce bias while also enabling parameter estimates to be obtained for the formerly missing variables.
CONCLUSION: The usefulness of statistical linkage will vary depending upon the strength of the correlations of the missing variables with the treatment variable, as well as the outcome variable of interest. PMID: 26436848 [PubMed – in process]

September 2015

1. Biometrics. 2015 Sep 13. doi: 10.1111/biom.12388. [Epub ahead of print]

New methods for treatment effect calibration, with applications to non-inferiority trials.

Zhang Z(1), Nie L(2), Soon G(3), Hu Z(4).
(1)Division of Biostatistics, Office of Surveillance and Biometrics, Center for Devices and Radiological Health, Food and Drug Administration, Silver Spring, Maryland, U.S.A. (2)Division of Biometrics V, Office of Biostatistics, Center for Drug Evaluation and Research, Food and Drug Administration, Silver Spring, Maryland, U.S.A. (3)Division of Biometrics IV, Office of Biostatistics, Center for Drug Evaluation and Research, Food and Drug Administration, Silver Spring, Maryland, U.S.A. (4)Biostatistics Research Branch, Division of Clinical Research, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Rockville, Maryland, U.S.A.

In comparative effectiveness research, it is often of interest to calibrate treatment effect estimates from a clinical trial to a target population that differs from the study population. One important application is an indirect comparison of a new treatment with a placebo control on the basis of two separate randomized clinical trials: a non-inferiority trial comparing the new treatment with an active control and a historical trial comparing the active control with placebo. The available methods for treatment effect calibration include an outcome regression (OR) method based on a regression model for the outcome and a weighting method based on a propensity score (PS) model. This article proposes new methods for treatment effect calibration: one based on a conditional effect (CE) model and two doubly robust (DR) methods. The first DR method involves a PS model and an OR model, is asymptotically valid if either model is correct, and attains the semiparametric information bound if both models are correct. The second DR method involves a PS model, a CE model, and possibly an OR model, is asymptotically valid under the union of the PS and CE models, and attains the semiparametric information bound if all three models are correct. The various methods are compared in a simulation study and applied to recent clinical trials for treating human immunodeficiency virus infection. © 2015, The International Biometric Society. PMID: 26363775 [PubMed – as supplied by publisher]

2. Epidemiology. 2015 Sep;26(5):645-52. doi: 10.1097/EDE.0000000000000330.

Multiple Imputation to Account for Measurement Error in Marginal Structural Models.

Edwards JK(1), Cole SR, Westreich D, Crane H, Eron JJ, Mathews WC, Moore R, Boswell SL, Lesko CR, Mugavero MJ; CNICS.
(1)From the aDepartment of Epidemiology, University of North Carolina at Chapel Hill, Chapel Hill, NC; bDivision of Allergy and Infectious Diseases, Department of Medicine, University of Washington, Seattle, WA; cDivision of Infectious Diseases, Department of Medicine, University of North Carolina at Chapel Hill, Chapel Hill, NC; dSchool of Medicine, University of California, San Diego, San Diego, CA; eSchool of Medicine, Johns Hopkins University, Baltimore, MD; fFenway Health, Boston, MA; and gDivision of Infectious Diseases, Department of Medicine, University of Alabama at Birmingham, Birmingham, AL.

BACKGROUND: Marginal structural models are an important tool for observational studies. These models typically assume that variables are measured without error. We describe a method to account for differential and nondifferential measurement error in a marginal structural model.
METHODS: We illustrate the method estimating the joint effects of antiretroviral therapy initiation and current smoking on all-cause mortality in a United States cohort of 12,290 patients with HIV followed for up to 5 years between 1998 and 2011. Smoking status was likely measured with error, but a subset of 3,686 patients who reported smoking status on separate questionnaires composed an internal validation subgroup. We compared a standard joint marginal structural model fit using inverse probability weights to a model that also accounted for misclassification of smoking status using multiple imputation.
RESULTS: In the standard analysis, current smoking was not associated with increased risk of mortality. After accounting for misclassification, current smoking without therapy was associated with increased mortality (hazard ratio [HR]: 1.2 [95% confidence interval [CI] = 0.6, 2.3]). The HR for current smoking and therapy [0.4 (95% CI = 0.2, 0.7)] was similar to the HR for no smoking and therapy (0.4; 95% CI = 0.2, 0.6).
CONCLUSIONS: Multiple imputation can be used to account for measurement error in concert with methods for causal inference to strengthen results from observational studies. PMID: 26214338 [PubMed – in process]
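
The imputation step described above can be sketched with an internal validation subsample: fit a model for the true (gold standard) status given the error-prone report where both are observed, then repeatedly draw the true status for everyone else. The variable names, misclassification rate, and the stand-in analysis below are assumptions; in the paper this step feeds a weighted marginal structural model.

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(7)
    n = 3000
    reported = rng.binomial(1, 0.4, n)                             # error-prone smoking report
    true = np.where(rng.random(n) < 0.85, reported, 1 - reported)  # gold standard, mostly agreeing
    in_validation = rng.random(n) < 0.3                            # internal validation subgroup

    # Imputation model fit where the gold standard is observed.
    imp_model = sm.Logit(true[in_validation], sm.add_constant(reported[in_validation])).fit(disp=0)
    p_true = imp_model.predict(sm.add_constant(reported))

    estimates = []
    for m in range(20):                                            # 20 imputed datasets
        smoking = np.where(in_validation, true, rng.binomial(1, p_true))
        estimates.append(smoking.mean())         # stand-in for refitting the full weighted model
    print(np.mean(estimates))                     # in practice, combine results with Rubin's rules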

3. Med Care. 2015 Sep;53(9):e65-72. doi: 10.1097/MLR.0b013e318297429c.

Why Summary Comorbidity Measures Such As the Charlson Comorbidity Index and Elixhauser Score Work.

Austin SR(1), Wong YN, Uzzo RG, Beck JR, Egleston BL.
(1)*Whiting School of Engineering Undergraduate Student, Johns Hopkins University, Baltimore, MD Departments of †Medical Oncology ‡Surgery, Fox Chase Cancer Center §Academic Affairs, Fox Chase Cancer Center, Temple University Health System ∥Biostatistics and Bioinformatics Facility, Fox Chase Cancer Center, Temple University Health System, Philadelphia, PA.

BACKGROUND: Comorbidity adjustment is an important component of health services research and clinical prognosis. When adjusting for comorbidities in statistical models, researchers can include comorbidities individually or through the use of summary measures such as the Charlson Comorbidity Index or Elixhauser score. We examined the conditions under which individual versus summary measures are most appropriate.
METHODS: We provide an analytic proof of the utility of comorbidity summary measures when used in place of individual comorbidities. We compared the use of the Charlson and Elixhauser scores versus individual comorbidities in prognostic models using a SEER-Medicare data example. We examined the ability of summary comorbidity measures to adjust for confounding using simulations.
RESULTS: We devised a mathematical proof that found that the comorbidity summary measures are appropriate prognostic or adjustment mechanisms in survival analyses. Once one knows the comorbidity score, no other information about the comorbidity variables used to create the score is generally needed. Our data example and simulations largely confirmed this finding.
CONCLUSIONS: Summary comorbidity measures, such as the Charlson Comorbidity Index and Elixhauser scores, are commonly used for clinical prognosis and comorbidity adjustment. We have provided a theoretical justification that validates the use of such scores under many conditions. Our simulations generally confirm the utility of the summary comorbidity measures as substitutes for use of the individual comorbidity variables in health services research. One caveat is that a summary measure may only be as good as the variables used to create it. PMCID: PMC3818341 [Available on 2016-09-01] PMID: 23703645 [PubMed – in process]
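
The paper's central claim, that conditional on the summary score the individual comorbidity indicators carry little additional information, can be illustrated with a small simulation. The Python sketch below uses invented comorbidity weights (not the actual Charlson weights) and a data-generating model in which the outcome depends on comorbidities only through their weighted sum, which is exactly the condition under which the proof applies.

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(0)
    n = 20000
    wts = np.array([1, 1, 2, 3, 6], dtype=float)           # hypothetical comorbidity weights
    comorb = rng.binomial(1, 0.2, size=(n, 5))             # five comorbidity indicators
    score = comorb @ wts                                   # summary comorbidity score
    exposure = rng.binomial(1, 1 / (1 + np.exp(-(-1 + 0.15 * score))))
    y = rng.binomial(1, 1 / (1 + np.exp(-(-2 + 0.5 * exposure + 0.25 * score))))

    fit_full = sm.Logit(y, sm.add_constant(np.column_stack([exposure, comorb]))).fit(disp=0)
    fit_score = sm.Logit(y, sm.add_constant(np.column_stack([exposure, score]))).fit(disp=0)
    print("exposure log-OR, adjusting for individual comorbidities:", round(fit_full.params[1], 3))
    print("exposure log-OR, adjusting for the summary score only:  ", round(fit_score.params[1], 3))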

4. Stat Methods Med Res. 2015 Sep 1. pii: 0962280215601134. [Epub ahead of print]

Estimating the effect of treatment on binary outcomes using full matching on the propensity score.

Austin PC(1), Stuart EA(2).
(1)Institute for Clinical Evaluative Sciences, Toronto, Ontario, Canada Institute of Health Management, Policy and Evaluation, University of Toronto, Ontario, Canada Schulich Heart Research Program, Sunnybrook Research Institute, Toronto, Canada peter.austin@ices.on.ca. (2)Department of Mental Health, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA Department of Health Policy and Management, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA.

Many non-experimental studies use propensity-score methods to estimate causal effects by balancing treatment and control groups on a set of observed baseline covariates. Full matching on the propensity score has emerged as a particularly effective and flexible method for utilizing all available data, and creating well-balanced treatment and comparison groups. However, full matching has been used infrequently with binary outcomes, and relatively little work has investigated the performance of full matching when estimating effects on binary outcomes. This paper describes methods that can be used for estimating the effect of treatment on binary outcomes when using full matching. It then used Monte Carlo simulations to evaluate the performance of these methods based on full matching (with and without a caliper), and compared their performance with that of nearest neighbour matching (with and without a caliper) and inverse probability of treatment weighting. The simulations varied the prevalence of the treatment and the strength of association between the covariates and treatment assignment. Results indicated that all of the approaches work well when the strength of confounding is relatively weak. With stronger confounding, the relative performance of the methods varies, with nearest neighbour matching with a caliper showing consistently good performance across a wide range of settings. We illustrate the approaches using a study estimating the effect of inpatient smoking cessation counselling on survival following hospitalization for a heart attack. © The Author(s) 2015. PMID: 26329750 [PubMed – as supplied by publisher]

August 2015

1. Biometrics. 2015 Aug 21. doi: 10.1111/biom.12361. [Epub ahead of print]

Time-dependent prognostic score matching for recurrent event analysis to evaluate a treatment assigned during follow-up.

Smith AR(1), Schaubel DE(1).
(1)Department of Biostatistics, University of Michigan, Ann Arbor, Michigan, USA.

Recurrent events often serve as the outcome in epidemiologic studies. In some observational studies, the goal is to estimate the effect of a new or "experimental" (i.e., less established) treatment of interest on the recurrent event rate. The incentive for accepting the new treatment may be that it is more available than the standard treatment. Given that the patient can choose between the experimental treatment and conventional therapy, it is of clinical importance to compare the treatment of interest versus the setting where the experimental treatment did not exist, in which case patients could only receive no treatment or the standard treatment. Many methods exist for the analysis of recurrent events and for the evaluation of treatment effects. However, methodology for the intersection of these two areas is sparse. Moreover, care must be taken in setting up the comparison groups in our setting; use of existing methods featuring time-dependent treatment indicators will generally lead to a biased treatment effect since the comparison group construction will not properly account for the timing of treatment initiation. We propose a sequential stratification method featuring time-dependent prognostic score matching to estimate the effect of a time-dependent treatment on the recurrent event rate. The performance of the method in moderate-sized samples is assessed through simulation. The proposed methods are applied to a prospective clinical study in order to evaluate the effect of living donor liver transplantation on hospitalization rates; in this setting, conventional therapy involves remaining on the wait list or receiving a deceased donor transplant. © 2015, The International Biometric Society. PMID: 26295563  [PubMed – as supplied by publisher]

2. Epidemiology. 2015 Aug 13. [Epub ahead of print]

Doubly Robust Estimation of Standardized Risk Difference and Ratio in the Exposed Population.

Shinozaki T(1), Matsuyama Y.
(1)From the Department of Biostatistics, School of Public Health, The University  of Tokyo, Tokyo, Japan.

Standardization, a method used to adjust for confounding, estimates counterfactual risks in a target population. To adjust for confounding variables that contain too many combinations to be fully stratified, two model-based standardization methods exist: regression standardization and use of inverse probability of exposure weighted-reweighted estimators. Whereas the former requires an outcome regression model conditional on exposure and confounders, the latter requires a propensity score model. Reconciling their modeling assumptions, doubly robust estimators, which only require correct specification of either the outcome regression or the propensity score model but do not necessitate both, have been well studied for total populations. Here, we provide doubly robust estimators of the standardized risk difference and ratio in the exposed population. Theoretical details, a simple model extension for independently censored outcomes, and a SAS program are provided in the eAppendix (http://links.lww.com/EDE/A955). PMID: 26275176 [PubMed – as supplied by publisher]
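
The eAppendix provides a SAS program; the Python sketch below is only a generic illustration of a doubly robust estimate of the risk difference standardized to the exposed (simulated data), combining an outcome regression fit in the unexposed with odds-of-exposure weights from a propensity score model. It is not necessarily the exact estimator derived in the paper.

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(1)
    n = 50000
    x = rng.normal(size=(n, 3))
    a = rng.binomial(1, 1 / (1 + np.exp(-(0.3 * x[:, 0] - 0.4 * x[:, 1]))))
    y = rng.binomial(1, 1 / (1 + np.exp(-(-1.0 + 0.6 * a + 0.5 * x[:, 0] + 0.3 * x[:, 2]))))

    X = sm.add_constant(x)
    e = sm.Logit(a, X).fit(disp=0).predict(X)                    # propensity score model
    m0 = sm.Logit(y[a == 0], X[a == 0]).fit(disp=0).predict(X)   # outcome model fit in the unexposed

    # Doubly robust estimate of E[Y^0 | A = 1]: outcome-model predictions for the exposed,
    # plus an odds-of-exposure-weighted residual correction from the unexposed.
    n1 = a.sum()
    odds = e / (1 - e)
    mu0 = (a * m0 + (1 - a) * odds * (y - m0)).sum() / n1
    rd_exposed = y[a == 1].mean() - mu0
    print("doubly robust standardized risk difference in the exposed:", round(rd_exposed, 3))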

3. Stat Med. 2015 Aug 6. doi: 10.1002/sim.6602. [Epub ahead of print]

Optimal full matching for survival outcomes: a method that merits more widespread use.

Austin PC(1,)(2,)(3), Stuart EA(4,)(5,)(6).
(1)Institute for Clinical Evaluative Sciences, Toronto, Ontario, Canada. (2)Institute of Health Management, Policy and Evaluation, University of Toronto,  Toronto, Ontario, Canada. (3)Schulich Heart Research Program, Sunnybrook Research Institute, Toronto, Ontario, Canada. (4)Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, U.S.A. (5)Department of Mental Health, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, U.S.A. (6)Department of Health Policy and Management, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, U.S.A.

Matching on the propensity score is a commonly used analytic method for estimating the effects of treatments on outcomes. Commonly used propensity score matching methods include nearest neighbor matching and nearest neighbor caliper matching. Rosenbaum (1991) proposed an optimal full matching approach, in which matched strata are formed consisting of either one treated subject and at least one control subject or one control subject and at least one treated subject. Full matching has been used rarely in the applied literature. Furthermore, its performance for use with survival outcomes has not been rigorously evaluated. We propose a method to use full matching to estimate the effect of treatment on the hazard of the occurrence of the outcome. An extensive set of Monte Carlo simulations were conducted to examine the performance of optimal full matching with survival analysis. Its performance was compared with that of nearest neighbor matching, nearest neighbor caliper matching, and inverse probability of treatment weighting using the propensity score. Full matching has superior performance compared with that of the two other matching algorithms and had comparable performance with that of inverse probability of treatment weighting using the propensity score. We illustrate the application of full matching with survival outcomes to estimate the effect of statin prescribing at hospital discharge on the hazard of post-discharge mortality in a large cohort of patients who were discharged from hospital with a diagnosis of acute myocardial infarction. Optimal full matching merits more widespread adoption in medical and epidemiological research. © 2015 The Authors. Statistics in Medicine Published by John Wiley & Sons Ltd. PMID: 26250611 [PubMed – as supplied by publisher]

4. Stat Med. 2015 Aug 3. doi: 10.1002/sim.6607. [Epub ahead of print]

Moving towards best practice when using inverse probability of treatment weighting (IPTW) using the propensity score to estimate causal treatment effects in observational studies.

Austin PC(1,)(2,)(3), Stuart EA(4,)(5,)(6).
(1)Institute for Clinical Evaluative Sciences, Toronto, Ontario, Canada. (2)Institute of Health Policy, Management and Evaluation, University of Toronto, Toronto, Canada. (3)Schulich Heart Research Program, Sunnybrook Research Institute, Toronto, Canada. (4)Department of Mental Health, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, U.S.A. (5)Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, U.S.A. (6)Department of Health Policy and Management, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, U.S.A.

The propensity score is defined as a subject’s probability of treatment selection, conditional on observed baseline covariates. Weighting subjects by the inverse probability of treatment received creates a synthetic sample in which treatment assignment is independent of measured baseline covariates. Inverse probability of treatment weighting (IPTW) using the propensity score allows one to obtain unbiased estimates of average treatment effects. However, these estimates are only valid if there are no residual systematic differences in observed baseline characteristics between treated and control subjects in the sample weighted by the estimated inverse probability of treatment. We report on a systematic literature review, in which we found that the use of IPTW has increased rapidly in recent years, but that in the most recent year, a majority of studies did not formally examine whether weighting balanced measured covariates between treatment groups. We then proceed to describe a suite of quantitative and qualitative methods that allow one to assess whether measured baseline covariates are balanced between treatment groups in the weighted sample. The quantitative methods use the weighted standardized difference to compare means, prevalences, higher-order moments, and interactions. The qualitative methods employ graphical methods to compare the distribution of continuous baseline covariates between treated and control subjects in the weighted sample.  Finally, we illustrate the application of these methods in an empirical case study. We propose a formal set of balance diagnostics that contribute towards an evolving concept of ‘best practice’ when using IPTW to estimate causal treatment  effects using observational data. © 2015 The Authors. Statistics in Medicine Published by John Wiley & Sons Ltd. PMID: 26238958 [PubMed – as supplied by publisher]
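
As a minimal example of the quantitative diagnostics described, the Python sketch below computes crude and IPTW-weighted standardized differences of covariate means on simulated data, using ATE-type weights; the covariate names, weight choice, and data-generating model are illustrative assumptions, not the authors' case study.

    import numpy as np
    import pandas as pd
    import statsmodels.api as sm

    rng = np.random.default_rng(2)
    n = 20000
    x = pd.DataFrame(rng.normal(size=(n, 4)), columns=["x1", "x2", "x3", "x4"])
    a = rng.binomial(1, 1 / (1 + np.exp(-(0.8 * x["x1"] - 0.5 * x["x3"]))))

    ps = sm.Logit(a, sm.add_constant(x)).fit(disp=0).predict(sm.add_constant(x))
    w = np.where(a == 1, 1 / ps, 1 / (1 - ps))                   # ATE weights

    def wmean(v, wt):
        return np.average(v, weights=wt)

    def wvar(v, wt):
        return np.average((v - wmean(v, wt)) ** 2, weights=wt)

    for col in x.columns:
        x1, x0 = x.loc[a == 1, col], x.loc[a == 0, col]
        w1, w0 = w[a == 1], w[a == 0]
        crude = (x1.mean() - x0.mean()) / np.sqrt((x1.var() + x0.var()) / 2)
        weighted = (wmean(x1, w1) - wmean(x0, w0)) / np.sqrt((wvar(x1, w1) + wvar(x0, w0)) / 2)
        print(f"{col}: crude std. diff = {crude:+.3f}, IPTW-weighted std. diff = {weighted:+.3f}")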

5. Clin Trials. 2015 Aug;12(4):309-16. doi: 10.1177/1740774515583500. Epub 2015 May 6.

Surrogate markers for time-varying treatments and outcomes.

Hsu JY(1), Kennedy EH(2), Roy JA(2), Stephens-Shields AJ(2), Small DS(3), Joffe MM(2).
(1)Department of Biostatistics and Epidemiology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA hsu9@mail.med.upenn.edu. (2)Department of Biostatistics and Epidemiology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA. (3)Department of Statistics, The Wharton School, University of Pennsylvania, Philadelphia, PA, USA.

BACKGROUND: A surrogate marker is a variable commonly used in clinical trials to guide treatment decisions when the outcome of ultimate interest is not available. A good surrogate marker is one where the treatment effect on the surrogate is a strong predictor of the effect of treatment on the outcome. We review the situation when there is one treatment delivered at baseline, one surrogate measured at one later time point, and one ultimate outcome of interest and discuss new issues arising when variables are time-varying.
METHODS: Most of the literature on surrogate markers has only considered simple settings with one treatment, one surrogate, and one outcome of interest at a fixed time point. However, more complicated time-varying settings are common in practice. In this article, we describe the unique challenges in two settings, time-varying treatments and time-varying surrogates, while relating the ideas back to the causal-effects and causal-association paradigms.
CONCLUSION: In addition to discussing and extending popular notions of surrogacy to time-varying settings, we give examples illustrating that one can be misled by not taking into account time-varying information about the surrogate or treatment. We hope this article has provided some motivation for future work on estimation and inference in such settings. © The Author(s) 2015. PMCID: PMC4506229 [Available on 2016-08-01] PMID: 25948621 [PubMed – in process]

6. Drug Saf. 2015 Jul 8. [Epub ahead of print]

A Method to Combine Signals from Spontaneous Reporting Systems and Observational Healthcare Data to Detect Adverse Drug Reactions.

Li Y(1), Ryan PB, Wei Y, Friedman C.
(1)Department of Biomedical Informatics, Columbia University Medical Center, 622  W. 168th Street, Presbyterian Building 20th Floor, New York, NY, 10032, USA, yl2565@columbia.edu.

INTRODUCTION: Observational healthcare data contain information useful for hastening detection of adverse drug reactions (ADRs) that may be missed by using data in spontaneous reporting systems (SRSs) alone. Only a few papers describe methods that integrate evidence from healthcare databases and SRSs. We propose a methodology that combines ADR signals from these two sources.
OBJECTIVES: The aim of this study was to investigate whether the proposed method would result in more accurate ADR detection than methods using SRSs or healthcare data alone.
RESEARCH DESIGN: We applied the method to four clinically serious ADRs, and evaluated it using three experiments that involve combining an SRS with a single  facility small-scale electronic health record (EHR), a larger scale network-based EHR, and a much larger scale healthcare claims database. The evaluation used a reference standard comprising 165 positive and 234 negative drug-ADR pairs.
MEASURES: Area under the receiver operator characteristics curve (AUC) was computed to measure performance.
RESULTS: There was no improvement in the AUC when the SRS and small-scale EHR were combined. The AUC of the combined SRS and large-scale EHR was 0.82, whereas it was 0.76 for each of the individual systems. Similarly, the AUC of the combined SRS and claims system was 0.82, whereas it was 0.76 and 0.78, respectively, for the individual systems.
CONCLUSIONS: The proposed method resulted in a significant improvement in the accuracy of ADR detection when the resources used for combining had sufficient amounts of data, demonstrating that the method could integrate evidence from multiple sources and serve as a tool in actual pharmacovigilance practice. PMID: 26153397  [PubMed – as supplied by publisher]

7. Drug Saf. 2015 Sep 1. [Epub ahead of print]

Future Proofing Adverse Event Monitoring.

Seeger JD(1).

Author information:
(1)Division of Pharmacoepidemiology and Pharmacoeconomics, Harvard Medical School/Brigham and Women’s Hospital, 1620 Tremont, Suite 3030, Boston, MA, 02120, USA, jdseeger@partners.org.

PMID: 26323240  [PubMed – as supplied by publisher]

July 2015

1. BMC Med Res Methodol. 2015 Jul 28;15(1):53. doi: 10.1186/s12874-015-0049-3.

Propensity score interval matching: using bootstrap confidence intervals for accommodating estimation errors of propensity scores.

Pan W(1), Bai H(2).
(1)School of Nursing, Duke University, DUMC 3322, 307 Trent Drive, Durham, NC, 27710, USA. wei.pan@duke.edu. (2)Department of Educational and Human Sciences, University of Central Florida, PO Box 161250, Orlando, FL, 32816, USA. haiyan.bai@ucf.edu.

BACKGROUND: Propensity score methods have become a popular tool for reducing selection bias in making causal inference from observational studies in medical research. Propensity score matching, a key component of propensity score methods, normally matches units based on the distance between point estimates of the propensity scores. The problem with this technique is that it is difficult to establish a sensible criterion to evaluate the closeness of matched units without knowing estimation errors of the propensity scores.
METHODS: The present study introduces interval matching using bootstrap confidence intervals for accommodating estimation errors of propensity scores. In interval matching, if the confidence interval of a unit in the treatment group overlaps with that of one or more units in the comparison group, they are considered as matched units.
RESULTS: The procedure of interval matching is illustrated in an empirical example using a real-life dataset from Nursing Home Compare, a national survey conducted by the Centers for Medicare and Medicaid Services. The empirical example provided promising evidence that interval matching reduced more selection bias than commonly used matching methods, including the rival method, caliper matching. The interval matching approach is methodologically more meaningful than competing matching methods because it develops a more “scientific” criterion for matching units using confidence intervals.
CONCLUSIONS: Interval matching is a promisingly better alternative tool for reducing selection bias in making causal inference from observational studies, especially useful in secondary data analysis on national databases such as the Centers for Medicare and Medicaid Services data. PMCID: PMC4517543 PMID: 26215035  [PubMed – in process]
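
A rough Python sketch of the interval-matching idea is given below: the propensity score model is refit on bootstrap resamples to obtain a percentile interval for each unit's score, and a treated and a comparison unit are flagged as matched when their intervals overlap. The data are simulated, and the pairing rule (all overlapping pairs) is a simplification; the paper's algorithm may select pairs differently.

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(3)
    n, n_boot = 2000, 200
    x = rng.normal(size=(n, 3))
    a = rng.binomial(1, 1 / (1 + np.exp(-(0.7 * x[:, 0] - 0.4 * x[:, 1]))))
    X = sm.add_constant(x)

    boot_ps = np.empty((n_boot, n))
    for b in range(n_boot):
        idx = rng.integers(0, n, n)                                    # bootstrap resample of units
        boot_ps[b] = sm.Logit(a[idx], X[idx]).fit(disp=0).predict(X)   # score the original units
    lo, hi = np.percentile(boot_ps, [2.5, 97.5], axis=0)               # 95% interval of each unit's PS

    treated, control = np.flatnonzero(a == 1), np.flatnonzero(a == 0)
    matched = [(i, j) for i in treated for j in control
               if lo[i] <= hi[j] and lo[j] <= hi[i]]                   # intervals overlap
    print("treated units with at least one interval match:",
          len({i for i, _ in matched}), "of", len(treated))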

2. Int J Epidemiol. 2015 Jul 25. pii: dyv135. [Epub ahead of print]

Imputation approaches for potential outcomes in causal inference.

Westreich D(1), Edwards JK(2), Cole SR(2), Platt RW(3), Mumford SL(4), Schisterman EF(4).
(1)Department of Epidemiology, Gillings School of Global Public Health, UNC-Chapel Hill, NC, USA. djw@unc.edu. (2)Department of Epidemiology, Gillings School of Global Public Health, UNC-Chapel Hill, NC, USA. (3)Department of Epidemiology, Biostatistics, and Occupational Health, McGill University, QC, Canada and. (4)Epidemiology Branch, Eunice Kennedy Shriver National Institute of  Child Health and Human Development, Bethesda, MD, USA.

BACKGROUND: The fundamental problem of causal inference is one of missing data, and specifically of missing potential outcomes: if potential outcomes were fully observed, then causal inference could be made trivially. Though often not discussed explicitly in the epidemiological literature, the connections between causal inference and missing data can provide additional intuition.
METHODS: We demonstrate how we can approach causal inference in ways similar to how we address all problems of missing data, using multiple imputation and the parametric g-formula.
RESULTS: We explain and demonstrate the use of these methods in example data, and discuss implications for more traditional approaches to causal inference.
CONCLUSIONS: Though there are advantages and disadvantages to both multiple imputation and g-formula approaches, epidemiologists can benefit from thinking about their causal inference problems as problems of missing data, as such perspectives may lend new and clarifying insights to their analyses. © The Author 2015; all rights reserved. Published by Oxford University Press on behalf of the International Epidemiological Association. PMID: 26210611  [PubMed – as supplied by publisher]
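
The parametric g-formula component can be sketched in a few lines: fit an outcome regression, "impute" each person's outcome under exposure and under no exposure, and average the predictions. The Python example below does this on simulated data; the model form and effect sizes are assumptions for illustration only.

    import numpy as np
    import pandas as pd
    import statsmodels.api as sm

    rng = np.random.default_rng(4)
    n = 30000
    x = rng.normal(size=n)
    a = rng.binomial(1, 1 / (1 + np.exp(-0.8 * x)))
    y = rng.binomial(1, 1 / (1 + np.exp(-(-1 + 0.7 * a + 0.9 * x))))
    df = pd.DataFrame({"y": y, "a": a, "x": x})

    fit = sm.Logit(df["y"], sm.add_constant(df[["a", "x"]])).fit(disp=0)

    # "Impute" the missing potential outcomes: predict each person's risk with
    # exposure set to 1 and to 0, then average over the whole sample.
    exog1 = sm.add_constant(df[["a", "x"]].assign(a=1), has_constant="add")
    exog0 = sm.add_constant(df[["a", "x"]].assign(a=0), has_constant="add")
    r1, r0 = fit.predict(exog1).mean(), fit.predict(exog0).mean()
    crude = df.loc[df.a == 1, "y"].mean() - df.loc[df.a == 0, "y"].mean()
    print("g-formula risk difference:", round(r1 - r0, 3), "vs crude difference:", round(crude, 3))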

3. Pharmacoepidemiol Drug Saf. 2015 Jul 14. doi: 10.1002/pds.3832. [Epub ahead of print]

Comparative validity of methods to select appropriate cutoff weight for probabilistic linkage without unique personal identifiers.

Zhu Y(1), Chen CY(2), Matsuyama Y(1), Ohashi Y(1)(3), Franklin JM(2), Setoguchi S(4)(5). 
(1)Department of Biostatistics, Graduate School of Medicine, The University of Tokyo, Tokyo, Japan. (2)Division of Pharmacoepidemiology and Pharmacoeconomics, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA. (3)Department of Integrated Science and Engineering for Sustainable Society, Chuo University, Tokyo, Japan. (4)Duke Clinical Research Institute, Duke University School of Medicine, Durham, NC, USA. (5)Department of Pharmacoepidemiology, Graduate School of Medicine, The University of Tokyo, Tokyo, Japan.

PURPOSE: Record linkage can enhance data quality of observational database studies. Probabilistic linkage, a method that allows partial match of linkage variables, overcomes disagreements arising from errors and omissions in data entry but also results in false-positive links. The study aimed to assess the validity of probabilistic linkage in the absence of unique personal identifiers (UPI) and the methods of cutoff weight selection.
METHODS: We linked an implantable cardioverter defibrillator placement registry to Medicare inpatient files of 1 year with anonymous nonunique variables and assessed the validity of three methods of cutoff selection against an internally derived gold standard with UPI.
RESULTS: Of the 64 890 registry records with an expected linkage rate of 55-65%, 55% were linked at cutoffs associated with positive predictive value (PPV) of ≥90%. Histogram inspection suggested an approximate range of optimal cutoffs. The duplicate method made accurate estimates of cutoff and PPV if the method’s assumption was met. With adjusted estimates of the sizes of true matches and searched files, the odds formula method made relatively accurate estimates of cutoff and PPV.
CONCLUSIONS: Probabilistic linkage without UPI generated valid linkages when an optimal cutoff was chosen. Cutoff selection remains challenging; however, histogram inspection, the duplicate method, and the odds formula method can be used in conjunction when a gold standard is not available. Copyright © 2015 John Wiley & Sons, Ltd. PMID: 26179362  [PubMed – as supplied by publisher]
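
For readers unfamiliar with probabilistic linkage weights, the toy Python sketch below computes Fellegi-Sunter-style agreement weights for candidate pairs from assumed m- and u-probabilities and tabulates their distribution, the kind of histogram one would inspect to choose a cutoff. The linkage fields and probabilities are hypothetical and unrelated to the registry-Medicare linkage studied in the paper.

    import numpy as np
    import pandas as pd

    m = {"birth_year": 0.98, "sex": 0.99, "zip3": 0.95, "admit_month": 0.90}  # P(agree | true match)
    u = {"birth_year": 0.02, "sex": 0.50, "zip3": 0.05, "admit_month": 0.08}  # P(agree | non-match)

    def pair_weight(agreement):
        """Total Fellegi-Sunter weight for one candidate pair, given field agreement flags."""
        return sum(np.log2(m[k] / u[k]) if agree else np.log2((1 - m[k]) / (1 - u[k]))
                   for k, agree in agreement.items())

    rng = np.random.default_rng(5)
    def simulate(agree_probs, n_pairs):
        return [pair_weight({k: rng.random() < p for k, p in agree_probs.items()})
                for _ in range(n_pairs)]

    # Weights for simulated true matches and non-matches; a cutoff is chosen between
    # the two modes of this distribution (here shown as a crude text histogram).
    weights = pd.Series(simulate(m, 1000) + simulate(u, 20000))
    print(pd.cut(weights, bins=10).value_counts().sort_index())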

4. Pharmacoepidemiol Drug Saf. 2015 Jul;24(7):738-47. doi: 10.1002/pds.3789. Epub 2015 May 26.

Propensity score matching and persistence correction to reduce bias in comparative effectiveness: the effect of cinacalcet use on all-cause mortality.

Gillespie IA(1), Floege J(2), Gioni I(3), Drüeke TB(4), de Francisco AL(5), Anker SD(6), Kubo Y(7), Wheeler DC(8), Froissart M(9); on behalf of the ARO Steering Committee collaborators.
(1)Amgen Ltd, Uxbridge, UK. (2)Division of Nephrology, Medizinische Klinik II, RWTH University Hospital Aachen, Aachen, Germany. (3)on behalf of Amgen Ltd, UK.  (4)Inserm Unit 1088, UFR de Médecine et de Pharmacie, Université de Picardie, Amiens, France. (5)Servicio de Nefrología, Hospital Universitario Valdecilla, Universidad de Cantabria, Santander, Spain. (6)Department of Innovative Clinical  Trials, University Medical Centre Göttingen, Göttingen, Germany. (7)Amgen Inc, Thousand Oaks, CA, USA. (8)Center for Nephrology, University College London, UK.  (9)Amgen Europe GmbH, Zug, Switzerland.

PURPOSE: The generalisability of randomised controlled trials (RCTs) may be limited by restrictive entry criteria or by their experimental nature. Observational research can provide complementary findings but is prone to bias. Employing propensity score matching, to reduce such bias, we compared the real-life effect of cinacalcet use on all-cause mortality (ACM) with findings from the Evaluation of Cinacalcet Therapy to Lower Cardiovascular Events (EVOLVE) RCT in chronic haemodialysis patients.
METHODS: Incident adult haemodialysis patients receiving cinacalcet, recruited in a prospective observational cohort from 2007-2009 (AROii; n = 10,488), were matched to non-exposed patients regardless of future exposure status. The effect of treatment crossover was investigated with inverse probability of censoring weighted and lag-censored analyses. EVOLVE ACM data were analysed largely as described for the primary composite endpoint.
RESULTS: AROii patients receiving cinacalcet (n = 532) were matched to 1790 non-exposed patients. The treatment effect of cinacalcet on ACM in the main AROii analysis (hazard ratio 1.03 [95% confidence interval (CI) 0.78-1.35]) was closer  to the null than for the Intention to Treat (ITT) analysis of EVOLVE (0.94 [95%CI 0.85-1.04]). Adjusting for non-persistence by 0- and 6-month lag-censoring and by inverse probability of censoring weight, the hazard ratios in AROii (0.76 [95%CI  0.51-1.15], 0.84 [95%CI 0.60-1.18] and 0.79 [95%CI 0.56-1.11], respectively) were comparable with those of EVOLVE (0.82 [95%CI 0.67-1.01], 0.83 [95%CI 0.73-0.96] and 0.87 [95%CI 0.71-1.06], respectively).
CONCLUSIONS: Correcting for treatment crossover, we observed results in the ‘real-life’ setting of the AROii observational cohort that closely mirrored the results of the EVOLVE RCT. Persistence-corrected analyses revealed a trend towards reduced ACM in haemodialysis patients receiving cinacalcet therapy. Copyright © 2015 John Wiley & Sons, Ltd. PMID: 26011775  [PubMed – in process]

5. Am J Epidemiol. 2015 Jul 1;182(1):17-25. doi: 10.1093/aje/kwu485. Epub 2015 Apr 12.

Controlling Time-Dependent Confounding by Health Status and Frailty: Restriction Versus Statistical Adjustment.

McGrath LJ, Ellis AR, Brookhart MA.

Nonexperimental studies of preventive interventions are often biased because of the healthy-user effect and, in frail populations, because of confounding by functional status. Bias is evident when estimating influenza vaccine effectiveness, even after adjustment for claims-based indicators of illness. We explored bias reduction methods while estimating vaccine effectiveness in a cohort of adult hemodialysis patients. Using the United States Renal Data System  and linked data from a commercial dialysis provider, we estimated vaccine effectiveness using a Cox proportional hazards marginal structural model of all-cause mortality before and during 3 influenza seasons in 2005/2006 through 2007/2008. To improve confounding control, we added frailty indicators to the model, measured time-varying confounders at different time intervals, and restricted the sample in multiple ways. Crude and baseline-adjusted marginal structural models remained strongly biased. Restricting to a healthier population removed some unmeasured confounding; however, this reduced the sample size, resulting in wide confidence intervals. We estimated an influenza vaccine effectiveness of 9% (hazard ratio = 0.91, 95% confidence interval: 0.72, 1.15) when bias was minimized through cohort restriction. In this study, the healthy-user bias could not be controlled through statistical adjustment; however, sample restriction reduced much of the bias. © The Author 2015. Published by Oxford University Press on behalf of the Johns Hopkins Bloomberg School of Public Health. All rights reserved. For permissions,  please e-mail: journals.permissions@oup.com. PMCID: PMC4479111 [Available on 2016-07-01] PMID: 25868551  [PubMed – in process]

6. Am J Epidemiol. 2015 Aug 1. pii: kwv108. [Epub ahead of print]

Regularized Regression Versus the High-Dimensional Propensity Score for Confounding Adjustment in Secondary Database Analyses.

Franklin JM, Eddings W, Glynn RJ, Schneeweiss S.

Selection and measurement of confounders is critical for successful adjustment in nonrandomized studies. Although the principles behind confounder selection are now well established, variable selection for confounder adjustment remains a difficult problem in practice, particularly in secondary analyses of databases. We present a simulation study that compares the high-dimensional propensity score algorithm for variable selection with approaches that utilize direct adjustment for all potential confounders via regularized regression, including ridge regression and lasso regression. Simulations were based on 2 previously published pharmacoepidemiologic cohorts and used the plasmode simulation framework to create realistic simulated data sets with thousands of potential confounders. Performance of methods was evaluated with respect to bias and mean squared error  of the estimated effects of a binary treatment. Simulation scenarios varied the true underlying outcome model, treatment effect, prevalence of exposure and outcome, and presence of unmeasured confounding. Across scenarios, high-dimensional propensity score approaches generally performed better than regularized regression approaches. However, including the variables selected by lasso regression in a regular propensity score model also performed well and may  provide a promising alternative variable selection method.  © The Author 2015. Published by Oxford University Press on behalf of the Johns Hopkins Bloomberg School of Public Health. All rights reserved. For permissions,  please e-mail: journals.permissions@oup.com.  PMID: 26233956  [PubMed – as supplied by publisher]
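
The hybrid strategy the authors flag as promising, using lasso on the outcome to select covariates and then building an ordinary propensity score model from the selected variables, can be sketched as follows (Python, simulated data). The penalty strength, number of candidate covariates, and final IPTW estimator are arbitrary choices for illustration and do not reproduce the plasmode simulations.

    import numpy as np
    import statsmodels.api as sm
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(6)
    n, p = 5000, 200                                      # many candidate covariates
    X = rng.normal(size=(n, p))
    a = rng.binomial(1, 1 / (1 + np.exp(-(0.5 * X[:, 0] + 0.5 * X[:, 1]))))
    y = rng.binomial(1, 1 / (1 + np.exp(-(-1 + 0.4 * a + 0.6 * X[:, 0] + 0.6 * X[:, 2]))))

    lasso = LogisticRegression(penalty="l1", solver="liblinear", C=0.05)
    lasso.fit(X, y)                                       # outcome model used only for variable selection
    selected = np.flatnonzero(lasso.coef_[0] != 0)
    print("covariates selected by lasso:", selected)

    ps = sm.Logit(a, sm.add_constant(X[:, selected])).fit(disp=0).predict()
    w = np.where(a == 1, 1 / ps, 1 / (1 - ps))            # ATE weights from the PS
    est = np.average(y[a == 1], weights=w[a == 1]) - np.average(y[a == 0], weights=w[a == 0])
    print("IPTW risk difference after lasso-based selection:", round(est, 3))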

June 2015

1. Pharmacoepidemiol Drug Saf. 2015 Jun 26. doi: 10.1002/pds.3822. [Epub ahead of print]

Do case-only designs yield consistent results across design and different databases? A case study of hip fractures and benzodiazepines.

Requena G(1), Logie J(2), Martin E(3), Boudiaf N(2), González González R(3), Huerta C(3), Alvarez A(3), Webb D(2), Bate A(4), García Rodríguez LA(5), Reynolds R(6), Schlienger R(7), Gardarsdottir H(8), de Groot M(8), Klungel OH(8), de Abajo F(1,)(9), Douglas IJ(10).
(1)Pharmacology Unit, Department of Biomedical Sciences, School of Medicine, University of Alcalá, Madrid, Spain. (2)Worldwide Epidemiology, GlaxoSmithKline, Research and Development, Uxbridge, Middlesex, UK. (3)BIFAP Research Unit, Spanish Agency of Medicines and Medical Devices, Madrid, Spain. (4)Epidemiology, Pfizer Ltd, Tadworth, UK. (5)Spanish Center for Pharmacoepidemiological Research (CEIFE), Madrid, Spain. (6)Epidemiology, Pfizer Research and Development, New York, USA. (7)Global Clinical Epidemiology, Novartis Pharma AG, Basel, Switzerland. (8)Division of Pharmacoepidemiology and Clinical Pharmacology, Utrecht University, Utrecht, the Netherlands. (9)Clinical Pharmacology Unit, University Hospital Príncipe de Asturias, Madrid, Spain. (10)London School of Hygiene and Tropical Medicine (LSHTM), London, UK.

BACKGROUND: The case-crossover (CXO) and self-controlled case series (SCCS) designs are increasingly used in pharmacoepidemiology. In both, relative risk estimates are obtained within persons, implicitly controlling for time-fixed confounding variables.
OBJECTIVES: To examine the consistency of relative risk estimates of hip/femur fractures (HFF) associated with the use of benzodiazepines (BZD) across case-only designs in two databases (DBs), when a common protocol was applied.
METHODS: CXO and SCCS studies were conducted in BIFAP (Spain) and CPRD (UK). Exposure to BZD was divided into non-use, current, recent and past use. For CXO, odds ratios (OR; 95%CI) of current use versus non-use/past were estimated using conditional logistic regression adjusted for co-medications (AOR). For the SCCS,  conditional Poisson regression was used to estimate incidence rate ratios (IRR; 95%CI) of current use versus non/past-use, adjusted for age. To investigate possible event-exposure dependence the relative risk in the 30 days prior to first BZD exposure was also evaluated.
RESULTS: In the CXO, current use of BZD was associated with an increased risk of HFF in both DBs: AOR(BIFAP) = 1.47 (1.29-1.67) and AOR(CPRD) = 1.55 (1.41-1.70). In the SCCS, the IRRs for current exposure were 0.79 (0.72-0.86) in BIFAP and 1.21 (1.13-1.30) in CPRD. However, when the 30-day pre-exposure period was considered separately, the IRR for the current period was 1.43 (1.31-1.57) in BIFAP and 1.37 (1.27-1.47) in CPRD.
CONCLUSIONS: CXO designs yielded consistent results across DBs, while initial SCCS analyses did not. Accounting for event-exposure dependence, estimates derived from SCCS were more consistent across DBs and designs. Copyright © 2015 John Wiley & Sons, Ltd. PMID: 26112821  [PubMed – as supplied by publisher]
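
For a case-crossover design with one hazard window and one referent window per case, the conditional odds ratio reduces to the ratio of discordant pairs, which the toy Python sketch below computes on simulated exposure histories. The exposure prevalences are invented, and the example ignores the co-medication adjustment and the SCCS analyses reported in the paper.

    import numpy as np

    rng = np.random.default_rng(7)
    n_cases = 4000
    exp_hazard = rng.binomial(1, 0.18, n_cases)      # exposure in the window just before the fracture
    exp_referent = rng.binomial(1, 0.12, n_cases)    # exposure in an earlier control window

    n10 = np.sum((exp_hazard == 1) & (exp_referent == 0))   # exposed in hazard window only
    n01 = np.sum((exp_hazard == 0) & (exp_referent == 1))   # exposed in referent window only
    or_cxo = n10 / n01                                      # conditional (matched-pair) odds ratio
    se_log_or = np.sqrt(1 / n10 + 1 / n01)
    ci = np.exp(np.log(or_cxo) + np.array([-1.96, 1.96]) * se_log_or)
    print(f"case-crossover OR = {or_cxo:.2f} (95% CI {ci[0]:.2f}-{ci[1]:.2f})")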

2. Pharmacoepidemiol Drug Saf. 2015 Jun 25. doi: 10.1002/pds.3810. [Epub ahead of print]

Matching on the disease risk score in comparative effectiveness research of new treatments.

Wyss R(1,)(2), Ellis AR(3), Brookhart MA(1), Jonsson Funk M(1), Girman CJ(1,)(4), Simpson RJ Jr(5), Stürmer T(1).
(1)Department of Epidemiology, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA. (2)Division of Pharmacoepidemiology and Pharmacoeconomics, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA. (3)The Cecil G. Sheps Center for Health Services Research, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA. (4)CERobs Consulting, LLC, Chapel Hill, NC, USA.(5)Department of Medicine, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA.

PURPOSE: We use simulations and an empirical example to evaluate the performance of disease risk score (DRS) matching compared with propensity score (PS) matching when controlling large numbers of covariates in settings involving newly introduced treatments.
METHODS: We simulated a dichotomous treatment, a dichotomous outcome, and 100 baseline covariates that included both continuous and dichotomous random variables. For the empirical example, we evaluated the comparative effectiveness of dabigatran versus warfarin in preventing combined ischemic stroke and all-cause mortality. We matched treatment groups on a historically estimated DRS and again on the PS. We controlled for a high-dimensional set of covariates using 20% and 1% samples of Medicare claims data from October 2010 through December 2012.
RESULTS: In simulations, matching on the DRS versus the PS generally yielded matches for more treated individuals and improved precision of the effect estimate. For the empirical example, PS and DRS matching in the 20% sample resulted in similar hazard ratios (0.88 and 0.87) and standard errors (0.04 for both methods). In the 1% sample, PS matching resulted in matches for only 92.0% of the treated population and a hazard ratio and standard error of 0.89 and 0.19, respectively, while DRS matching resulted in matches for 98.5% and a hazard ratio and standard error of 0.85 and 0.16, respectively.
CONCLUSIONS: When PS distributions are separated, DRS matching can improve the precision of effect estimates and allow researchers to evaluate the treatment effect in a larger proportion of the treated population. However, accurately modeling the DRS can be challenging compared with the PS. Copyright © 2015 John Wiley & Sons, Ltd. PMID: 26112690  [PubMed – as supplied by publisher]
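
A bare-bones Python sketch of disease risk score matching is shown below: the outcome model is fit in historical unexposed patients, the score is predicted for the study population, and exposed patients are greedily matched to unexposed patients within a caliper on the logit of the score. The data are simulated, and the greedy 1:1 algorithm and caliper width are simplifying assumptions, not the authors' implementation.

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(8)
    n_hist, n_study = 20000, 10000

    # Fit the disease risk score (DRS) model in historical data collected before the
    # new drug was available (everyone unexposed), then predict in the study period.
    x_hist = rng.normal(size=(n_hist, 5))
    y_hist = rng.binomial(1, 1 / (1 + np.exp(-(-2 + x_hist[:, 0] + 0.5 * x_hist[:, 1]))))
    drs_model = sm.Logit(y_hist, sm.add_constant(x_hist)).fit(disp=0)

    x = rng.normal(size=(n_study, 5))
    a = rng.binomial(1, 1 / (1 + np.exp(-0.6 * x[:, 0])))
    drs = drs_model.predict(sm.add_constant(x))
    logit_drs = np.log(drs / (1 - drs))
    caliper = 0.2 * np.std(logit_drs)                      # caliper on the logit of the DRS

    unmatched = list(np.flatnonzero(a == 0))
    pairs = []
    for i in np.flatnonzero(a == 1):                       # greedy 1:1 nearest-neighbour matching
        if not unmatched:
            break
        dist = np.abs(logit_drs[unmatched] - logit_drs[i])
        j = int(np.argmin(dist))
        if dist[j] <= caliper:
            pairs.append((i, unmatched.pop(j)))
    print(f"matched {len(pairs)} of {int(a.sum())} exposed patients on the DRS")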

3. Drug Saf. 2015 Jun 9. [Epub ahead of print]

A Comparative Assessment of Observational Medical Outcomes Partnership and Mini-Sentinel Common Data Models and Analytics: Implications for Active Drug Safety Surveillance.

Xu Y, Zhou X, Suehs BT, Hartzema AG, Kahn MG, Moride Y, Sauer BC, Liu Q, Moll K, Pasquale MK, Nair VP, Bate A.
Comprehensive Health Insights, Humana Inc., 515 W. Market St., Louisville, KY, 40202, USA.

INTRODUCTION: An often key component to coordinating surveillance activities across distributed networks is the design and implementation of a common data model (CDM). The purpose of this study was to evaluate two drug safety surveillance CDMs from an ecosystem perspective to better understand how differences in CDMs and analytic tools affect usability and interpretation of results.
METHODS: Humana claims data from 2007 to 2012 were mapped to Observational Medical Outcomes Partnership (OMOP) and Mini-Sentinel CDMs. Data were described and compared at the patient level by source code and mapped concepts. Study cohort construction and effect estimates were also compared using two different analytical methods-one based on a new user design implementing a high-dimensional propensity score (HDPS) algorithm and the other based on univariate self-controlled case series (SCCS) design-across six established positive drug-outcome pairs to learn how differences in CDMs and analytics influence steps in the database analytic process and results.
RESULTS: Claims data for approximately 7.7 million Humana health plan members were transformed into the two CDMs. Three health outcome cohorts and two drug cohorts showed differences in cohort size and constituency between Mini-Sentinel  and OMOP CDMs, which was a result of multiple factors. Overall, the implementation of the HDPS procedure on Mini-Sentinel CDM detected more known positive associations than that on OMOP CDM. The SCCS method results were comparable on both CDMs. Differences in the implementation of the HDPS procedure  between the two CDMs were identified; analytic model and risk period specification had a significant impact on the performance of the HDPS procedure on OMOP CDM.
CONCLUSIONS: Differences were observed between OMOP and Mini-Sentinel CDMs. The analysis of both CDMs at the data model level indicated that such conceptual differences had only a slight but not significant impact on identifying known safety associations. Our results show that differences at the ecosystem level of analyses across the CDMs can lead to strikingly different risk estimations, but this can be primarily attributed to the choices of analytic approach and their implementation in the community-developed analytic tools. The opportunities of using CDMs are clear, but our study shows the need for judicious comparison of analyses across the CDMs. Our work emphasizes the need for ongoing efforts to ensure sustainable transparent platforms to maintain and develop CDMs and associated tools for effective safety surveillance. PMID: 26055920 [PubMed – as supplied by publisher]

4. Pharmacoepidemiol Drug Saf. 2015 Jun 4. doi: 10.1002/pds.3798. [Epub ahead of print]

Methodological considerations in assessing the effectiveness of antidepressant medication continuation during pregnancy using administrative data(†).

Swanson SA(1), Hernandez-Diaz S(1), Palmsten K(2), Mogun H(3), Olfson M(4), Huybrechts KF(3).
(1)Department of Epidemiology, Harvard T. H. Chan School of Public Health, Boston, MA, USA. (2)Department of Pediatrics, University of California, San Diego. (3)Division of Pharmacoepidemiology and Pharmacoeconomics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School. (4)New York State Psychiatric Institute, New York; the Department of Psychiatry, College of Physicians and Surgeons of Columbia University, New York.

PURPOSE: The decision whether to continue antidepressant use for depression during pregnancy requires weighing maternal and child risks and benefits. Little is known about the effectiveness of antidepressant therapy during pregnancy. The goal of this study is to evaluate whether standard administrative claims data can be used to evaluate the effectiveness of antidepressants.
METHODS: Using prescription and healthcare visit Medicaid claims (2000-2007), we identified 28 493 women with a depression diagnosis and antidepressant fill in the 90 days before their last menstrual period. Antidepressant continuation was defined based on prescription fills during the first trimester. Depression hospitalizations and deliberate self-harm served as measures of the effectiveness of treatment continuation during pregnancy. Propensity score and instrumental variable analyses were used to attempt to account for confounding.
RESULTS: Relative to women who discontinued antidepressant therapy, women who continued were more likely to have a depression inpatient stay (odds ratio [OR] = 2.2, 95% confidence interval [95%CI]: 2.0-2.4) and deliberate self-harm code (OR = 1.4, 95%CI: 0.7-2.7). Accounting for measured covariates in the propensity score analysis, including age, race, comorbidities, comedications, features of the depression diagnosis, and antidepressant class, led to slightly attenuated estimates (OR = 2.0, 95%CI: 1.8-2.2; OR = 1.1, 95%CI: 0.5-2.4). Similar associations were estimated in subgroups with different levels of baseline depression severity. Proposed preference-time, calendar-time-based, and  geography-based instruments were unlikely to meet the required conditions for a valid analysis.
CONCLUSIONS: Our findings suggest that either antidepressant medications do not reduce the risk of depression relapse in pregnant women, or that administrative data alone could not be used to validly estimate the effectiveness of psychotropic medications during pregnancy. Copyright © 2015 John Wiley & Sons, Ltd. PMID: 26045370  [PubMed – as supplied by publisher]

5. Stat Methods Med Res. 2015 Jun 2. pii: 0962280215588771. [Epub ahead of print]

An imputation-based solution to using mismeasured covariates in propensity score analysis.

Webb-Vargas Y(1), Rudolph KE(2), Lenis D(3), Murakami P(3), Stuart EA(4).
(1)Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, USA. (2)Department of Mental Health, Johns Hopkins Bloomberg School of Public Health, Baltimore, USA School of Public Health, University of California, Berkeley, USA Center for Health and Community, University of California, San Francisco, USA. (3)Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, USA. (4)Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, USA Department of Mental Health, Johns Hopkins Bloomberg School of Public Health, Baltimore, USA Department of Health Policy and Management, Johns Hopkins Bloomberg School of Public Health, Baltimore, USA.

Although covariate measurement error is likely the norm rather than the exception, methods for handling covariate measurement error in propensity score methods have not been widely investigated. We consider a multiple imputation-based approach that uses an external calibration sample with information on the true and mismeasured covariates, multiple imputation for external calibration, to correct for the measurement error, and investigate its performance using simulation studies. As expected, using the covariate measured with error leads to bias in the treatment effect estimate. In contrast, the multiple imputation for external calibration method can eliminate almost all the  bias. We confirm that the outcome must be used in the imputation process to obtain good results, a finding related to the idea of congenial imputation and analysis in the broader multiple imputation literature. We illustrate the multiple imputation for external calibration approach using a motivating example estimating the effects of living in a disadvantaged neighborhood on mental health and substance use outcomes among adolescents. These results show that estimating the propensity score using covariates measured with error leads to biased estimates of treatment effects, but when a calibration data set is available, multiple imputation for external calibration can be used to help correct for such bias. © The Author(s) 2015 Reprints and permissions: sagepub.co.uk/journalsPermissions.nav. PMID: 26037527  [PubMed – as supplied by publisher]

6. Pharm Stat. 2015 Jun 30. doi: 10.1002/pst.1697. [Epub ahead of print]

Risk patterns in drug safety study using relative times by accelerated failure time models when proportional hazards assumption is questionable: an illustrative case study of cancer risk of patients on glucose-lowering therapies.

Ng ES(1,)(2), Klungel OH(3), Groenwold RH(4), van Staa TP(3,)(5,)(6).
(1)Director’s Office, London School of Hygiene and Tropical Medicine, UK. (2)Clinical Practice Research Datalink (CPRD), Medicines and Healthcare Products Regulatory Agency, UK. (3)Department of Pharmacoepidemiology and Clinical Pharmacology, Utrecht Institute for Pharmaceutical Sciences, University of Utrecht, Utrecht, P.O. Box 80082, 3508, TB, The Netherlands. (4)Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht, the Netherlands. (5)Faculty of Epidemiology and Public Health, London School of Hygiene and Tropical Medicine, UK. (6)Health eResearch Centre, Farr Institute for Health Informatics Research, University of Manchester, UK.

Observational drug safety studies may be susceptible to confounding or protopathic bias. This bias may cause a spurious relationship between drug exposure and adverse side effect when none exists and may lead to unwarranted safety alerts. The spurious relationship may manifest itself through substantially different risk levels between exposure groups at the start of follow-up when exposure is deemed too short to have any plausible biological effect of the drug. The restrictive proportional hazards assumption with its arbitrary choice of baseline hazard function renders the commonly used Cox proportional hazards model of limited use for revealing such potential bias. We demonstrate a fully parametric approach using accelerated failure time models with an illustrative safety study of glucose-lowering therapies and show that its results are comparable against other methods that allow time-varying exposure effects. Our approach includes a wide variety of models that are based on the flexible generalized gamma distribution and allows direct comparisons of estimated hazard functions following different exposure-specific distributions of survival times. This approach lends itself to two alternative metrics, namely relative times and difference in times to event, allowing physicians more ways to communicate patient’s prognosis without invoking the concept of risks, which some may find hard to grasp. In our illustrative case study, substantial differences in cancer risks at drug initiation followed by a gradual reduction towards null were found. This evidence is compatible with the presence of protopathic bias, in which undiagnosed symptoms of cancer lead to switches in diabetes medication. Copyright © 2015 John Wiley & Sons, Ltd.PMID: 26123413  [PubMed – as supplied by publisher]
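
The relative-time metric can be illustrated with any accelerated failure time model. The Python sketch below fits a Weibull AFT model with lifelines to simulated data; the Weibull is used only because it is a simple member of the generalized gamma family featured in the paper, and exp(coef) for the exposure is read as a time ratio rather than a hazard ratio.

    import numpy as np
    import pandas as pd
    from lifelines import WeibullAFTFitter

    rng = np.random.default_rng(9)
    n = 5000
    exposed = rng.binomial(1, 0.5, n)
    age = rng.normal(60, 10, n)
    # True model: exposure multiplies survival times by 0.8 (events occur about 20% sooner).
    t = rng.weibull(1.5, n) * 10 * np.exp(np.log(0.8) * exposed + 0.01 * (age - 60))
    censor = rng.uniform(5, 15, n)
    df = pd.DataFrame({"time": np.minimum(t, censor),
                       "event": (t <= censor).astype(int),
                       "exposed": exposed,
                       "age": age})

    aft = WeibullAFTFitter().fit(df, duration_col="time", event_col="event")
    aft.print_summary()   # exp(coef) for "exposed" is the relative time (time ratio), not a hazard ratio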

May 2015

1. Stat Med. 2015 May 26. doi: 10.1002/sim.6532. [Epub ahead of print]
Estimation of causal effects of binary treatments in unconfounded studies.

Gutman R, Rubin DB.
Department of Statistics, Harvard University, 1 Oxford St, 02138, Cambridge MA, U.S.A.

Estimation of causal effects in non-randomized studies comprises two distinct phases: design, without outcome data, and analysis of the outcome data according to a specified protocol. Recently, Gutman and Rubin (2013) proposed a new analysis-phase method for estimating treatment effects when the outcome is binary and there is only one covariate, which viewed causal effect estimation explicitly as a missing data problem. Here, we extend this method to situations with continuous outcomes and multiple covariates and compare it with other commonly used methods (such as matching, subclassification, weighting, and covariance adjustment). We show, using an extensive simulation, that of all methods considered, and in many of the experimental conditions examined, our new ‘multiple-imputation using two subclassification splines’ method appears to be the most efficient and has coverage levels that are closest to nominal. In addition, it can estimate finite population average causal effects as well as non-linear causal estimands. This type of analysis also allows the identification of subgroups of units for which the effect appears to be especially beneficial or harmful. Copyright © 2015 John Wiley & Sons, Ltd. PMID: 26013308 [PubMed – as supplied by publisher]

2. Am J Epidemiol. 2015 May 20. pii: kwu469. [Epub ahead of print]
Propensity Score Methods for Analyzing Observational Data Like Randomized Experiments: Challenges and Solutions for Rare Outcomes and Exposures.

Ross ME, Kreider AR, Huang YS, Matone M, Rubin DM, Localio AR.

Randomized controlled trials are the “gold standard” for estimating the causal effects of treatments. However, it is often not feasible to conduct such a trial because of ethical concerns or budgetary constraints. We expand upon an approach to the analysis of observational data sets that mimics a sequence of randomized studies by implementing propensity score models within each trial to achieve covariate balance, using weighting and matching. The methods are illustrated using data from a safety study of the relationship between second-generation antipsychotics and type 2 diabetes (outcome) in Medicaid-insured children aged 10-18 years across the United States from 2003 to 2007. Challenges in this data set include a rare outcome, a rare exposure, substantial and important differences between exposure groups, and a very large sample size. © The Author 2015. Published by Oxford University Press on behalf of the Johns Hopkins Bloomberg School of Public Health. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com. PMID: 25995287 [PubMed – as supplied by publisher]

3. J Biomed Inform. 2015 May 22. pii: S1532-0464(15)00092-1. doi:10.1016/j.jbi.2015.05.012. [Epub ahead of print]
When to conduct probabilistic linkage vs. deterministic linkage? A simulation study.

Zhu Y(1), Matsuyama Y(2), Ohashi Y(3), Setoguchi S(4).
(1) Department of Biostatistics, Graduate School of Medicine, The University of Tokyo, Tokyo, Japan. (2) Department of Biostatistics, Graduate School of Medicine, The University of Tokyo, Tokyo, Japan. (3) Department of Biostatistics, Graduate School of Medicine, The University of Tokyo, Tokyo, Japan; Department of Integrated Science and Engineering for Sustainable Society, Chuo University, Tokyo, Japan. (4)Duke Clinical Research Institute, Duke University School of Medicine, Durham, NC, United States; Division of Pharmacoepidemiology and Pharmacoeconomics, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, United States; Department of Pharmacoepidemiology, Graduate School of Medicine, The University of Tokyo, Tokyo, Japan.

INTRODUCTION: When unique identifiers are unavailable, successful record linkage depends greatly on data quality and types of variables available. While probabilistic linkage theoretically captures more true matches than deterministic linkage by allowing imperfection in identifiers, studies have shown inconclusive results likely due to variations in data quality, implementation of linkage methodology and validation method. The simulation study aimed to understand data characteristics that affect the performance of probabilistic vs. deterministic linkage.
METHODS: We created ninety-six scenarios that represent real-life situations using non-unique identifiers. We systematically introduced a range of discriminative power, rate of missing and error, and file size to increase linkage patterns and difficulties. We assessed the performance difference of linkage methods using standard validity measures and computation time.
RESULTS: Across scenarios, deterministic linkage showed an advantage in PPV, while probabilistic linkage showed an advantage in sensitivity. Probabilistic linkage uniformly outperformed deterministic linkage, as the former generated linkages with a better trade-off between sensitivity and PPV regardless of data quality. However, with low rates of missing data and error, deterministic linkage did not perform significantly worse. The implementation of deterministic linkage in SAS took less than 1 min, and probabilistic linkage took 2 min to 2 h depending on file size.
DISCUSSION: Our simulation study demonstrated that the intrinsic rate of missing and error of linkage variables was key to choosing between linkage methods. In general, probabilistic linkage was a better choice, but for exceptionally good quality data (<5% error), deterministic linkage was a more resource-efficient choice. Copyright © 2015. Published by Elsevier Inc. PMID: 26004791 [PubMed – as supplied by publisher]

4. EGEMS (Wash DC). 2015 Mar 23;3(1):1052. doi: 10.13063/2327-9214.1052. eCollection 2015.
Transparent reporting of data quality in distributed data networks.

Kahn MG(1), Brown JS(2), Chun AT(3), Davidson BN(4), Meeker D(5), Ryan PB(6), Schilling LM(1), Weiskopf NG(7), Williams AE(8), Zozus MN(9).
(1)University of Colorado. (2)Harvard Pilgrim Health Care Institute ; Harvard Medical School. (3)Cedars-Sinai Health System. (4)Hoag Memorial Hospital Presbyterian. (5)University of Southern California. (6)Observational Health Data Sciences and Informatics. (7)Oregon Health Sciences University. (8)Maine Medical Center Research Institute. (9)Duke University.

INTRODUCTION: Poor data quality can be a serious threat to the validity and generalizability of clinical research findings. The growing availability of electronic administrative and clinical data is accompanied by a growing concern about the quality of these data for observational research and other analytic purposes. Currently, there are no widely accepted guidelines for reporting quality results that would enable investigators and consumers to independently determine if a data source is fit for use to support analytic inferences and reliable evidence generation.
MODEL AND METHODS: We developed a conceptual model that captures the flow of data from data originator across successive data stewards and finally to the data consumer. This “data lifecycle” model illustrates how data quality issues can result in data being returned to previous data custodians. We highlight the potential risks of poor data quality on clinical practice and research results. Because of the need to ensure transparent reporting of data quality issues, we created a unifying data-quality reporting framework and a complementary set of 20 data-quality reporting recommendations for studies that use observational clinical and administrative data for secondary data analysis. We obtained stakeholder input on the perceived value of each recommendation by soliciting public comments via two face-to-face meetings of informatics and comparative-effectiveness investigators, through multiple public webinars targeted to the health services research community, and with an open access online wiki.
RECOMMENDATIONS: Our recommendations propose reporting on both general and analysis-specific data quality features. The goals of these recommendations are to improve the reporting of data quality measures for studies that use observational clinical and administrative data, to ensure transparency and consistency in computing data quality measures, and to facilitate best practices and trust in the new clinical discoveries based on secondary use of observational data. PMCID: PMC4434997 PMID: 25992385 [PubMed]

5. Am J Epidemiol. 2015 May 5. pii: kwv059. [Epub ahead of print]
When Is the Difference Method Conservative for Assessing Mediation?

Jiang Z, VanderWeele TJ.

Assessment of indirect effects is useful for epidemiologists interested in understanding the mechanisms of exposure-outcome relationships. A traditional way of estimating indirect effects is to use the “difference method,” which is based on regression analysis in which one adds a possible mediator to the regression model and examines whether the coefficient for the exposure changes. The difference method has been criticized for lacking a causal interpretation when it is used with logistic regression. In this article, we use the counterfactual framework to define the natural indirect effect (NIE) and assess the relationship between the NIE and the difference method. We show that under appropriate assumptions, the difference method consistently estimates the NIE for continuous outcomes and is always conservative for binary outcomes. Thus, the difference method can be used to provide evidence for the presence of mediation but not for the absence of mediation. © The Author 2015. Published by Oxford University Press on behalf of the Johns Hopkins Bloomberg School of Public Health. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com. PMID: 25944885 [PubMed – as supplied by publisher]
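
The difference method itself is simple to compute. The sketch below applies it to simulated continuous data: the exposure coefficient from a model without the mediator, minus the exposure coefficient after the mediator is added, estimates the indirect effect. Variable names and effect sizes are illustrative, not taken from the paper.

```python
# Sketch of the difference method on simulated continuous data; variable names
# and effect sizes are illustrative, not taken from the paper.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 5000
a = rng.binomial(1, 0.5, n)                  # exposure
m = 0.6 * a + rng.normal(size=n)             # mediator affected by exposure
y = 0.3 * a + 0.5 * m + rng.normal(size=n)   # outcome

total = sm.OLS(y, sm.add_constant(a)).fit().params[1]                          # exposure coefficient, mediator omitted
direct = sm.OLS(y, sm.add_constant(np.column_stack([a, m]))).fit().params[1]   # exposure coefficient, mediator added

print(f"total effect  ~ {total:.3f}")
print(f"direct effect ~ {direct:.3f}")
print(f"difference-method indirect effect ~ {total - direct:.3f}")  # expected ~ 0.6 * 0.5 = 0.3
```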

April 2015

1. Stat Methods Med Res. 2015 Apr 30. pii: 0962280215584401. [Epub ahead of print]

The performance of inverse probability of treatment weighting and full matching on the propensity score in the presence of model misspecification when estimating the effect of treatment on survival outcomes.

Austin PC(1), Stuart EA(2).
(1)Institute for Clinical Evaluative Sciences, Toronto, Ontario, Canada Institute of Health Management, Policy and Evaluation, University of Toronto Schulich Heart Research Program, Sunnybrook Research Institute, Toronto, Canada (2)Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, Maryland Department of Mental Health, Johns Hopkins Bloomberg School of Public Health, Baltimore, Maryland Department of Health, Policy, and Management

There is increasing interest in estimating the causal effects of treatments using observational data. Propensity-score matching methods are frequently used to adjust for differences in observed characteristics between treated and control individuals in observational studies. Survival or time-to-event outcomes occur frequently in the medical literature, but the use of propensity score methods in survival analysis has not been thoroughly investigated. This paper compares two approaches for estimating the Average Treatment Effect (ATE) on survival outcomes: Inverse Probability of Treatment Weighting (IPTW) and full matching. The performance of these methods was compared in an extensive set of simulations that varied the extent of confounding and the amount of misspecification of the propensity score model. We found that both IPTW and full matching resulted in estimation of marginal hazard ratios with negligible bias when the ATE was the target estimand and the treatment-selection process was weak to moderate. However, when the treatment-selection process was strong, both methods resulted in biased estimation of the true marginal hazard ratio, even when the propensity  score model was correctly specified. When the propensity score model was correctly specified, bias tended to be lower for full matching than for IPTW. The reasons for these biases and for the differences between the two methods appeared to be due to some extreme weights generated for each method. Both methods tended to produce more extreme weights as the magnitude of the effects of covariates on treatment selection increased. Furthermore, more extreme weights were observed for IPTW than for full matching. However, the poorer performance of both methods  in the presence of a strong treatment-selection process was mitigated by the use of IPTW with restriction and full matching with a caliper restriction when the propensity score model was correctly specified.© The Author(s) 2015 Reprints and permissions: sagepub.co.uk/journalsPermissions.nav. PMID: 25934643  [PubMed – as supplied by publisher]
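
For readers wanting a concrete starting point, the sketch below estimates ATE-type IPTW weights from a logistic propensity score model and fits a weighted Cox model with a robust variance, using simulated data and the lifelines package. It is a minimal illustration of the weighting approach evaluated in the paper, not the authors' simulation code; all names and effect sizes are invented.

```python
# Minimal IPTW sketch on simulated survival data: ATE-type weights from a logistic
# propensity model, then a weighted Cox model (lifelines) with a robust variance.
# Data, names, and effect sizes are illustrative, not the authors' simulation code.
import numpy as np
import pandas as pd
import statsmodels.api as sm
from lifelines import CoxPHFitter

rng = np.random.default_rng(1)
n = 2000
x = rng.normal(size=n)                                   # confounder
z = rng.binomial(1, 1 / (1 + np.exp(-0.5 * x)))          # treatment depends on x
t = rng.exponential(1 / np.exp(-0.5 * z + 0.3 * x))      # event time
c = rng.exponential(3, size=n)                           # censoring time
df = pd.DataFrame({"time": np.minimum(t, c), "event": (t <= c).astype(int), "z": z, "x": x})

# propensity score and ATE weights: 1/ps for treated, 1/(1-ps) for controls
ps = sm.Logit(df["z"], sm.add_constant(df[["x"]])).fit(disp=False).predict()
df["iptw"] = np.where(df["z"] == 1, 1 / ps, 1 / (1 - ps))

cph = CoxPHFitter()
cph.fit(df[["time", "event", "z", "iptw"]], duration_col="time", event_col="event",
        weights_col="iptw", robust=True)
print(cph.summary[["coef", "exp(coef)"]])   # weighted (marginal) log-HR and HR for z
```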

2. Am J Epidemiol. 2015 Apr 12. pii: kwu485. [Epub ahead of print]

Controlling Time-Dependent Confounding by Health Status and Frailty: Restriction Versus Statistical Adjustment.

McGrath LJ, Ellis AR, Brookhart MA.

Nonexperimental studies of preventive interventions are often biased because of the healthy-user effect and, in frail populations, because of confounding by functional status. Bias is evident when estimating influenza vaccine effectiveness, even after adjustment for claims-based indicators of illness. We explored bias reduction methods while estimating vaccine effectiveness in a cohort of adult hemodialysis patients. Using the United States Renal Data System and linked data from a commercial dialysis provider, we estimated vaccine effectiveness using a Cox proportional hazards marginal structural model of all-cause mortality before and during 3 influenza seasons in 2005/2006 through 2007/2008. To improve confounding control, we added frailty indicators to the model, measured time-varying confounders at different time intervals, and restricted the sample in multiple ways. Crude and baseline-adjusted marginal structural models remained strongly biased. Restricting to a healthier population removed some unmeasured confounding; however, this reduced the sample size, resulting in wide confidence intervals. We estimated an influenza vaccine effectiveness of 9% (hazard ratio = 0.91, 95% confidence interval: 0.72, 1.15) when bias was minimized through cohort restriction. In this study, the healthy-user bias could not be controlled through statistical adjustment; however, sample restriction reduced much of the bias. © The Author 2015. Published by Oxford University Press on behalf of the Johns Hopkins Bloomberg School of Public Health. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com. PMID: 25868551 [PubMed – as supplied by publisher]

3. Ann Epidemiol. 2015 May;25(5):342-9. doi: 10.1016/j.annepidem.2015.02.008. Epub 2015 Feb 20.

Does selection bias explain the obesity paradox among individuals with cardiovascular disease?

Banack HR, Kaufman JS.
Department of Epidemiology, Biostatistics, and Occupational Health, McGill University, Montreal, QC, Canada.

OBJECTIVES: The objectives of this article are to demonstrate that the obesity paradox may be explained by collider stratification bias and to estimate the biasing effects of unmeasured common causes of cardiovascular disease (CVD) and mortality on the observed obesity-mortality relationship.

METHODS: We use directed acyclic graphs, regression modeling, and sensitivity analyses to explore whether the observed protective effect of obesity among individuals with CVD can be plausibly attributed to selection bias. Data from the Third National Health and Nutrition Examination Survey were used for the analyses.

RESULTS: The adjusted total effect of obesity on mortality was a risk difference (RD) of 0.03 (95% confidence interval [CI]: 0.02, 0.05). However, the controlled direct effect of obesity on mortality among individuals without CVD was RD = 0.03 (95% CI: 0.01, 0.05) and RD = -0.12 (95% CI: -0.20, -0.04) among individuals with CVD. The adjusted total effect estimate demonstrates an increased number of deaths among obese individuals relative to nonobese counterparts, whereas the controlled direct effect shows a paradoxical decrease in mortality among obese individuals with CVD.

CONCLUSIONS: Sensitivity analysis demonstrates that unmeasured confounding of the mediator-outcome relationship provides a sufficient explanation for the observed protective effect of obesity on mortality among individuals with CVD. Copyright © 2015 Elsevier Inc. All rights reserved. PMID: 25867852 [PubMed – in process]
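
The collider-stratification mechanism is easy to reproduce in a toy simulation. In the sketch below (simulated data, not the NHANES analysis), obesity mildly increases mortality overall, yet conditioning on CVD, a common effect of obesity and an unmeasured cause U, makes obesity appear protective among CVD patients. All effect sizes are invented for illustration.

```python
# Toy simulation of collider stratification bias (not the NHANES analysis):
# obesity mildly increases mortality overall, but conditioning on CVD, a common
# effect of obesity and an unmeasured cause U, makes obesity look protective
# among CVD patients. All effect sizes are invented for illustration.
import numpy as np
import pandas as pd

rng = np.random.default_rng(2)
n = 200_000
obese = rng.binomial(1, 0.3, n)
u = rng.binomial(1, 0.3, n)                                              # unmeasured common cause
cvd = rng.binomial(1, 1 / (1 + np.exp(-(-2 + 2.0 * obese + 3.0 * u))))
death = rng.binomial(1, 1 / (1 + np.exp(-(-2 + 0.1 * obese + 2.5 * u))))
df = pd.DataFrame({"obese": obese, "cvd": cvd, "death": death})

def risk_diff(d):
    return d.loc[d.obese == 1, "death"].mean() - d.loc[d.obese == 0, "death"].mean()

print(f"overall risk difference:            {risk_diff(df):+.3f}")              # small positive (harmful)
print(f"risk difference among CVD patients: {risk_diff(df[df.cvd == 1]):+.3f}")  # negative (the paradox)
```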

4. Pharmacoepidemiol Drug Saf. 2015 Apr 10. doi: 10.1002/pds.3773. [Epub ahead of print]

On the role of marginal confounder prevalence – implications for the high-dimensional propensity score algorithm.

Schuster T(1), Pang M, Platt RW.
(1)Department of Epidemiology, Biostatistics and Occupational Health, McGill University, Montreal, Quebec, Canada; Centre for Clinical Epidemiology, Lady Davis Institute, Jewish General Hospital, Montreal, Quebec, Canada.

PURPOSE: The high-dimensional propensity score algorithm attempts to improve control of confounding in typical treatment effect studies in pharmacoepidemiology and is increasingly being used for the analysis of large administrative databases. Within this multi-step variable selection algorithm, the marginal prevalence of non-zero covariate values is considered to be an indicator for a count variable’s potential confounding impact. We investigate how the marginal prevalence of confounder variables relates to the magnitude of bias they can cause when estimating risk ratios in point-exposure studies with binary outcomes.

METHODS: We apply the law of total probability in conjunction with an established bias formula to derive and illustrate relative bias boundaries with respect to marginal confounder prevalence.

RESULTS: We show that maximum possible bias magnitudes can occur at any marginal prevalence level of a binary confounder variable. In particular, we demonstrate that, in the case of rare or very common exposures, both low- and high-prevalence confounder variables can still have a large confounding impact on estimated risk ratios.

CONCLUSIONS: Covariate pre-selection by prevalence may lead to sub-optimal confounder sampling within the high-dimensional propensity score algorithm. While we believe that the high-dimensional propensity score has important benefits in large-scale pharmacoepidemiologic studies, we recommend omitting the prevalence-based empirical identification of candidate covariates. Copyright © 2015 John Wiley & Sons, Ltd. PMID: 25866189  [PubMed – as supplied by publisher]
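
The kind of bias formula referred to here has a simple closed form for a single binary confounder of a risk ratio. The sketch below implements the standard Bross-type expression as a generic illustration (it is not the authors' derivation); p1 and p0 denote the confounder prevalences among the exposed and unexposed, and rr_cd its effect on the outcome.

```python
# Generic Bross-type bias formula for an unmeasured binary confounder of a risk
# ratio (an illustration of the kind of formula used, not the authors' derivation).
# p1, p0: confounder prevalence among exposed and unexposed; rr_cd: confounder-outcome risk ratio.
def rr_bias(p1, p0, rr_cd):
    """Multiplicative bias of the crude risk ratio relative to the adjusted risk ratio."""
    return (p1 * (rr_cd - 1) + 1) / (p0 * (rr_cd - 1) + 1)

# A low-prevalence confounder can bias the risk ratio as much as a common one
# when its distribution differs sharply between exposure groups.
print(round(rr_bias(p1=0.10, p0=0.01, rr_cd=5.0), 2))   # ~1.35
print(round(rr_bias(p1=0.60, p0=0.40, rr_cd=5.0), 2))   # ~1.31
```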

5. Ann Epidemiol. 2015 Apr 1. pii: S1047-2797(15)00129-5. doi: 10.1016/j.annepidem.2015.03.019. [Epub ahead of print]

Left truncation results in substantial bias of the relation between time-dependent exposures and adverse events.

Hazelbag CM(1), Klungel OH(2), van Staa TP(3), de Boer A(4), Groenwold RH(2).
(1)Department of Epidemiology, Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht, The Netherlands. (2)Department of Epidemiology, Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht, The Netherlands; Division of Pharmacoepidemiology and Clinical Pharmacology, Utrecht Institute for Pharmaceutical Sciences, University of Utrecht, Utrecht,The Netherlands. (3)Division of Pharmacoepidemiology and Clinical Pharmacology, Utrecht Institute for Pharmaceutical Sciences, University of Utrecht, Utrecht, The Netherlands; Health eResearch Centre, The Farr Institute of Health Informatics Research, University of Manchester, Manchester, UK. (4)Division of Pharmacoepidemiology and Clinical Pharmacology, Utrecht Institute for Pharmaceutical Sciences, University of Utrecht, Utrecht, The Netherlands.

PURPOSE: To assess the impact of random left truncation of data on the estimation of time-dependent exposure effects.

METHODS: A simulation study was conducted in which the relation between exposure and outcome was based on an immediate exposure effect, a first-time exposure effect, or a cumulative exposure effect. The individual probability of truncation, the moment of truncation, the exposure rate, and the incidence rate of the outcome were varied in different simulations. All observations before the moment of left truncation were omitted from the analysis.

RESULTS: Random left truncation did not bias estimates of immediate exposure effects, but resulted in an overestimation of a cumulative exposure effect and underestimation of a first-time exposure effect. The magnitude of bias in estimation of cumulative exposure effects depends on a combination of exposure rate, probability of truncation, and proportion of follow-up time left truncated.

CONCLUSIONS: In case of a cumulative or first-time exposure, left truncation can result in substantial bias in pharmacoepidemiologic studies. The potential for this bias likely differs between databases, which may lead to heterogeneity in estimated exposure effects between studies. Copyright © 2015 Elsevier Inc. All rights reserved. PMID: 25935711  [PubMed – as supplied by publisher]

6. Drug Saf. 2015 May 3. [Epub ahead of print]

Incorporating Linked Healthcare Claims to Improve Confounding Control in a Study of In-Hospital Medication Use.

Franklin JM, Eddings W, Schneeweiss S, Rassen JA.
Division of Pharmacoepidemiology and Pharmacoeconomics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, 1620 Tremont St., Suite 3030, Boston, MA, 02120, USA.

INTRODUCTION: The Premier Perspective hospital billing database provides a promising data source for studies of inpatient medication use. However, in-hospital recording of confounders is limited, and incorporating linked healthcare claims data available for a subset of the cohort may improve confounding control. We investigated methods capable of adjusting for confounders measured in a subset, including complete case analysis, multiple imputation of missing data, and propensity score (PS) calibration.

METHODS: Methods were implemented in an example study of adults in Premier undergoing percutaneous coronary intervention (PCI) in 2004-2008 and exposed to either bivalirudin or heparin. In a subset of patients enrolled in UnitedHealth for at least 90 days before hospitalization, additional confounders were assessed from healthcare claims, including comorbidities, prior medication use, and service use intensity. Diagnostics for each method were evaluated, and methods were compared with respect to the estimates and confidence intervals of treatment effects on repeat PCI, bleeding, and in-hospital death.

RESULTS: Of 210,268 patients in the hospital-based cohort, 3240 (1.5%) had linked healthcare claims. This subset was younger and healthier than the overall study population. The linked subset was too small for complete case evaluation of two of the three outcomes of interest. Multiple imputation and PS calibration did not meaningfully impact treatment effect estimates and associated confidence intervals.

CONCLUSIONS: Despite more than 98% missingness on 24 variables, PS calibration and multiple imputation incorporated confounders from healthcare claims without major increases in estimate uncertainty. Additional research is needed to determine the relative bias of these methods. PMID: 25935198 [PubMed – as supplied by publisher]
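
Propensity score calibration, one of the methods evaluated here, can be sketched as a regression-calibration step between an error-prone and a gold-standard propensity score. The code below is a rough illustration on simulated data with a simple linear calibration model and a roughly 5% linked subset; it is not the study's implementation, and all names and effect sizes are made up.

```python
# Rough sketch of propensity score (PS) calibration with a linked validation subset;
# simulated data and a simple linear calibration model, not the study's implementation.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
n = 20_000
x1 = rng.normal(size=n)                 # covariate recorded for the whole cohort
x2 = rng.normal(size=n)                 # confounder available only in the linked subset
z = rng.binomial(1, 1 / (1 + np.exp(-(0.6 * x1 + 0.6 * x2))))   # treatment
y = 0.5 * z + 0.7 * x1 + 0.7 * x2 + rng.normal(size=n)          # outcome; true effect 0.5
linked = rng.random(n) < 0.05           # ~5% of the cohort has linked claims
X_full = np.column_stack([x1, x2])

# error-prone PS (limited covariates, whole cohort) and gold-standard PS (linked subset only)
ps_ep = sm.Logit(z, sm.add_constant(x1)).fit(disp=False).predict()
ps_gs = sm.Logit(z[linked], sm.add_constant(X_full[linked])).fit(disp=False).predict()

# regression calibration: model the gold-standard PS given the error-prone PS and treatment
calib = sm.OLS(ps_gs, sm.add_constant(np.column_stack([ps_ep[linked], z[linked]]))).fit()
ps_cal = calib.predict(sm.add_constant(np.column_stack([ps_ep, z])))

for label, ps in [("error-prone PS", ps_ep), ("calibrated PS", ps_cal)]:
    fit = sm.OLS(y, sm.add_constant(np.column_stack([z, ps]))).fit()
    print(f"effect of z adjusted for {label}: {fit.params[1]:.3f}")
```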

7. Ann Epidemiol. 2015 Mar 21. pii: S1047-2797(15)00097-6. doi: 10.1016/j.annepidem.2015.03.008. [Epub ahead of print]

Enrollment factors and bias of disease prevalence estimates in administrative claims data.

Jensen ET(1), Cook SF(2), Allen JK(2), Logie J(2), Brookhart MA(3), Kappelman MD(4), Dellon ES(5).

(1)Center for Esophageal Diseases and Swallowing, Division of Gastroenterology and Hepatology, Department of Medicine, University of North Carolina School of Medicine, Chapel Hill; Center for Gastrointestinal Biology and Disease, Division of Gastroenterology and Hepatology, Department of Medicine, University of North Carolina School of Medicine, Chapel Hill. (2)World Wide Epidemiology, GlaxoSmithKline, Research Triangle Park. (3)Department of Epidemiology, Gillings School of Global Public Health, University of North Carolina, Chapel Hill. (4)Center for Gastrointestinal Biology and Disease, Division of Gastroenterology and Hepatology, Department of Medicine, University of North Carolina School of Medicine, Chapel Hill; Division of Pediatric Gastroenterology and Hepatology, Department of Pediatrics, University of North Carolina School of Medicine, Chapel Hill. (5)Center for Esophageal Diseases and Swallowing, Division of Gastroenterology and Hepatology, Department of Medicine, University of North Carolina School of Medicine, Chapel Hill; Center for Gastrointestinal Biology and Disease, Division of Gastroenterology and Hepatology, Department of Medicine, University of North Carolina School of Medicine, Chapel Hill.

PURPOSE: Considerations for using administrative claims data in research have not been well-described. To increase awareness of how enrollment factors and insurance benefit use may contribute to prevalence estimates, we evaluated how differences in operational definitions of the cohort impact observed estimates.

METHODS: We conducted a cross-sectional study estimating the prevalence of five gastrointestinal conditions using MarketScan claims data for 73.1 million enrollees. We extracted data obtained from 2009 to 2012 to identify cohorts meeting various enrollment, prescription drug benefit, or health care utilization characteristics. Next, we identified patients meeting the case definition for each of the diseases of interest. We compared the estimates obtained to evaluate the influence of enrollment period, drug benefit, and insurance usage.

RESULTS: As the criteria for inclusion in the cohort became increasingly restrictive, the estimated prevalence increased by as much as 45% to 77%, depending on the disease condition and the definition for inclusion. Requiring use of the insurance benefit and a longer period of enrollment had the greatest influence on the estimates observed.

CONCLUSIONS: Individuals meeting the case definition were more likely to meet the more stringent definition for inclusion in the study cohort. This may be considered a form of selection bias, in which overly restrictive inclusion criteria select a source population that no longer represents the population from which the cases arose. Copyright © 2015 Elsevier Inc. All rights reserved. PMID: 25890796 [PubMed – as supplied by publisher]

March 2015

1. Am J Epidemiol. 2015 Mar 27. pii: kwu486. [Epub ahead of print]

Invited Commentary: Estimating Population Impact in the Presence of Competing Events.

Naimi AI, Tchetgen Tchetgen EJ.

The formal approach in the field of causal inference has enabled epidemiologists to clarify several complications that arise when estimating the effect of an intervention on a health outcome of interest. When the outcome is a failure time or longitudinal process, researchers must often deal with competing events. In this issue of the Journal, Picciotto et al. (Am J Epidemiol. 2015;000(00):0000-0000) use structural nested failure time models to assess potential population effects of hypothetical interventions and censor competing events. In the present commentary, we discuss 2 interpretations that result from  treating competing events as censored observations and how they relate to measures of public health impact. We also comment on 2 alternative approaches for handling competing events: an inverse probability weighting estimator of the survivor average causal effect and the parametric g-formula, which can be used to estimate a functional of the subdistribution of the event of interest. We argue that careful consideration of the tradeoff between the interpretation of the parameters from each approach and the assumptions required to estimate these parameters should guide researchers on the various ways to handle competing events in epidemiologic research. © The Author 2015. Published by Oxford University Press on behalf of the Johns Hopkins Bloomberg School of Public Health. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com. PMID: 25816819

2. Stat Med. 2015 Mar 20. doi: 10.1002/sim.6470. [Epub ahead of print]

Bias in estimating the causal hazard ratio when using two-stage instrumental variable methods.

Wan F, Small D, Bekelman JE, Mitra N.

Department of Biostatistics and Epidemiology, University of Pennsylvania, Philadelphia, PA, U.S.A.

Two-stage instrumental variable methods are commonly used to estimate the causal effects of treatments on survival in the presence of measured and unmeasured confounding. Two-stage residual inclusion (2SRI) has been the method of choice over two-stage predictor substitution (2SPS) in clinical studies. We directly compare the bias in the causal hazard ratio estimated by these two methods. Under a principal stratification framework, we derive a closed-form solution for asymptotic bias of the causal hazard ratio among compliers for both the 2SPS and 2SRI methods when survival time follows the Weibull distribution with random censoring. When there is no unmeasured confounding and no always takers, our analytic results show that 2SRI is generally asymptotically unbiased, but 2SPS is not. However, when there is substantial unmeasured confounding, 2SPS performs better than 2SRI with respect to bias under certain scenarios. We use extensive simulation studies to confirm the analytic results from our closed-form solutions. We apply these two methods to prostate cancer treatment data from Surveillance, Epidemiology and End Results-Medicare and compare these 2SRI and 2SPS estimates with results from two published randomized trials. Copyright © 2015 John Wiley & Sons, Ltd. PMID: 25800789  [PubMed – as supplied by publisher]
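
The two estimators compared in this paper differ only in what is carried from the first-stage regression into the hazard model. The sketch below illustrates this with a Cox model on simulated data (the paper's analytic results are derived under Weibull survival): 2SPS replaces treatment with its first-stage prediction, while 2SRI keeps treatment and adds the first-stage residual. Effect sizes and names are illustrative.

```python
# Simplified sketch of 2SPS and 2SRI with a Cox model (lifelines) on simulated data;
# the paper's analytic results are derived for Weibull survival, this is illustrative only.
import numpy as np
import pandas as pd
import statsmodels.api as sm
from lifelines import CoxPHFitter

rng = np.random.default_rng(4)
n = 5000
iv = rng.binomial(1, 0.5, n)                                            # instrument
u = rng.normal(size=n)                                                  # unmeasured confounder
d = rng.binomial(1, 1 / (1 + np.exp(-(-0.5 + 1.5 * iv + 0.8 * u))))     # treatment
t = rng.exponential(1 / np.exp(-0.7 * d + 0.8 * u))                     # event time
c = rng.exponential(2, n)                                               # censoring time
df = pd.DataFrame({"time": np.minimum(t, c), "event": (t <= c).astype(int), "d": d, "iv": iv})

# Stage 1: regress treatment on the instrument
stage1 = sm.Logit(df["d"], sm.add_constant(df[["iv"]])).fit(disp=False)
df["d_hat"] = stage1.predict()
df["resid"] = df["d"] - df["d_hat"]

# 2SPS: replace treatment with its stage-1 prediction
cph_2sps = CoxPHFitter().fit(df[["time", "event", "d_hat"]], "time", "event")
# 2SRI: keep treatment and add the stage-1 residual
cph_2sri = CoxPHFitter().fit(df[["time", "event", "d", "resid"]], "time", "event")

print("2SPS log-HR:", round(cph_2sps.params_["d_hat"], 3))
print("2SRI log-HR:", round(cph_2sri.params_["d"], 3))
```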

3. Drug Saf. 2015 Mar;38(3):295-310. doi: 10.1007/s40264-015-0280-1.

Addressing limitations in observational studies of the association between glucose-lowering medications and all-cause mortality: a review.

Patorno E, Garry EM, Patrick AR, Schneeweiss S, Gillet VG, Zorina O, Bartels DB, Seeger JD.

Division of Pharmacoepidemiology and Pharmacoeconomics, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, USA.

A growing body of observational literature on the association between glucose-lowering treatments and all-cause mortality has been accumulating in recent years. However, many investigations present designs or analyses that inadequately address the methodological challenges involved. We conducted a systematic search with a non-systematic extension to identify observational studies published between 2000 and 2012 that evaluated the effects of glucose-lowering medications on all-cause mortality. We reviewed these studies and assessed the design and analysis methods used, with a focus on their ability  to address specific methodological challenges. We described these methodological issues and their potential impact on observed associations, providing examples from the reviewed literature, and suggested possible approaches to manage these methodological challenges. We evaluated 67 publications of observational studies evaluating the association between glucose-lowering treatments and all-cause mortality. The identified methodological challenges included trade-offs associated with the outcome of all-cause mortality, incorrect temporal sequencing in administrative databases, inadequate treatment of time-varying hazards and treatment duration effects, unclear definition of the exposure risk window, improper handling of time-varying exposures, and incomplete accounting for confounding by indication. Most of these methodological challenges may be adequately addressed through the application of appropriate methods. Observational research plays an increasingly important role in assessing the clinical effects of diabetes therapy. The implementation of suitable research methods can reduce the potential for spurious findings, and thus the risk of misleading the medical community about benefits and harms of diabetes therapy.

PMID: 25761856  [PubMed – in process]

4. Annu Rev Public Health. 2015 Mar 18;36:89-108. doi: 10.1146/annurev-publhealth-031914-122559.

Statistical foundations for model-based adjustments.

Greenland S, Pearce N.

Department of Epidemiology and Department of Statistics, University of California, Los Angeles, California 90095-1772.

Most epidemiology textbooks that discuss models are vague on details of model selection. This lack of detail may be understandable since selection should be strongly influenced by features of the particular study, including contextual (prior) information about covariates that may confound, modify, or mediate the effect under study. It is thus important that authors document their modeling goals and strategies and understand the contextual interpretation of model parameters and model selection criteria. To illustrate this point, we review several established strategies for selecting model covariates, describe their shortcomings, and point to refinements, assuming that the main goal is to derive  the most accurate effect estimates obtainable from the data and available resources. This goal shifts the focus to prediction of exposure or potential outcomes (or both) to adjust for confounding; it thus differs from the goal of ordinary statistical modeling, which is to passively predict outcomes. Nonetheless, methods and software for passive prediction can be used for causal inference as well, provided that the target parameters are shifted appropriately. PMID: 25785886  [PubMed – in process]

5. Value Health. 2015 Mar;18(2):250-9. doi: 10.1016/j.jval.2014.11.001. Epub 2015 Feb 2.

A unified framework for classification of methods for benefit-risk assessment.

Najafzadeh M, Schneeweiss S, Choudhry N, Bykov K, Kahler KH, Martin DP, Gagne JJ.

Department of Medicine, Division of Pharmacoepidemiology and Pharmacoeconomics, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA. Novartis Pharmaceutical Corporation, East Hanover, NJ, USA.

BACKGROUND: Patients, physicians, and other decision makers make implicit but inevitable trade-offs among risks and benefits of treatments. Many methods have been proposed to promote transparent and rigorous benefit-risk analysis (BRA).

OBJECTIVE: To propose a framework for classifying BRA methods on the basis of key factors that matter most for patients by using a common mathematical notation and compare their results using a hypothetical example.

METHODS: We classified the available BRA methods into three categories: 1) unweighted metrics, which use only probabilities of benefits and risks; 2) metrics that incorporate preference weights and that account for the impact and duration of benefits and risks; and 3) metrics that incorporate weights based on  decision makers’ opinions. We used two hypothetical antiplatelet drugs (a and b)  to compare the BRA methods within our proposed framework.

RESULTS: Unweighted metrics include the number needed to treat and the number needed to harm. Metrics that incorporate preference weights include those that use maximum acceptable risk, those that use relative-value-adjusted life-years, and those that use quality-adjusted life-years. Metrics that use decision makers’ weights include the multicriteria decision analysis, the benefit-less-risk analysis, Boers’ 3 by 3 table, the Gail/NCI method, and the transparent uniform risk benefit overview. Most BRA methods can be derived as special cases of a generalized formula, and some are mathematically identical. Numerical comparison of methods highlights potential differences in BRA results and their interpretation.

CONCLUSIONS: The proposed framework provides a unified, patient-centered approach to BRA methods classification based on the types of weights that are used across  existing methods, a key differentiating feature. Copyright © 2015 International Society for Pharmacoeconomics and Outcomes Research (ISPOR). Published by Elsevier Inc. All rights reserved. PMID: 25773560  [PubMed – in process]
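
The unweighted metrics in the first category reduce to reciprocals of risk differences. The sketch below computes the number needed to treat and number needed to harm for two hypothetical antiplatelet drugs; all probabilities are made up for illustration.

```python
# Tiny illustration of the "unweighted" benefit-risk metrics (NNT and NNH)
# with hypothetical event probabilities for two drugs; all numbers are made up.
def nnt(p_benefit_treated, p_benefit_comparator):
    """Number needed to treat: 1 / absolute increase in benefit probability."""
    return 1 / (p_benefit_treated - p_benefit_comparator)

def nnh(p_harm_treated, p_harm_comparator):
    """Number needed to harm: 1 / absolute increase in harm probability."""
    return 1 / (p_harm_treated - p_harm_comparator)

# drug A vs drug B: A prevents more ischemic events but causes more bleeds
print(f"NNT = {nnt(0.12, 0.08):.0f}")   # 25 patients treated per extra event prevented
print(f"NNH = {nnh(0.05, 0.03):.0f}")   # 50 patients treated per extra bleed caused
```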

February 2015

1. Med Care. 2015 Mar 3. [Epub ahead of print]

Estimating Subgroup Effects Using the Propensity Score Method: A Practical Application in Outcomes Research.

Eeren HV, Spreeuwenberg MD, Bartak A, Rooij M, Busschbach JJ.
Viersprong Institute for Studies on Personality Disorders (VISPD), Halsteren Department of Psychiatry, Section Medical Psychology and Psychotherapy, Erasmus MC, Rotterdam Department of Health Services Research, Maastricht University, Maastricht Department of Clinical Psychology, University of Amsterdam (UvA) Bosen Lommer Private Practice, Amsterdam Department of Methodology and Statistics, Institute of Psychology, Leiden University, Leiden, The Netherlands.

OBJECTIVE: Our aim was to demonstrate the feasibility of the univariate and generalized propensity score (PS) method in subgroup analysis of outcomes research.
METHODS: First, to estimate subgroup effects, we tested the performance of 2 different PS methods, using Monte Carlo simulations: (1) the univariate PS with additional adjustment on the subgroup; and (2) the generalized PS, estimated by crossing the treatment options with a subgroup variable. The subgroup effects were estimated in a linear regression model using the 2 PS adjustments. We further explored whether the subgroup variable should be included in the univariate PS. Second, the 2 methods were compared using data from a large effectiveness study on psychotherapy in personality disorders. Using these data we tested the differences between short-term and long-term treatment, with the severity of patients’ problems defining the subgroups of interest.
RESULTS: The Monte Carlo simulations showed minor differences between both PS methods, with the bias and mean squared error overall marginally lower for the generalized PS. When considering the univariate PS, the subgroup variable can be excluded from the PS estimation and only adjusted for in the outcome equation. When applied to the psychotherapy data, the univariate and generalized PS estimations gave similar results.
CONCLUSION: The results support the use of the generalized PS as a feasible method, compared with the univariate PS, to find certain subgroup effects in nonrandomized outcomes research. PMID: 25738381 [PubMed – as supplied by publisher]

2. Stat Med. 2015 Mar 2. doi: 10.1002/sim.6457. [Epub ahead of print]

Detecting treatment-covariate interactions using permutation methods.

Wang R, Schoenfeld DA, Hoeppner B, Evins AE.
Division of Sleep and Circadian Disorders, Departments of Medicine and Neurology, Brigham and Women’s Hospital and Harvard Medical School, Department of Biostatistics, Harvard T. H. Chan School of Public Health

The primary objective of a randomized clinical trial is usually to investigate whether one treatment is better than its alternatives on average. However, treatment effects may vary across different patient subpopulations. Rather than demonstrating that one treatment is superior to another on average, one is often more concerned with which treatment strategy is most appropriate for a particular patient, or for a group of patients with similar characteristics, to achieve a desired outcome. Various interaction tests have been proposed to detect treatment effect heterogeneity; however, they typically examine covariates one at a time, do not offer an integrated approach that incorporates all available information, and can greatly increase the chance of a false positive finding when the number of covariates is large. We propose a new permutation test for the null hypothesis of no interaction effects for any covariate. The proposed test allows us to consider the interaction effects of many covariates simultaneously without having to group subjects into subsets based on pre-specified criteria and applies generally to randomized clinical trials of multiple treatments. The test provides an attractive alternative to the standard likelihood ratio test, especially when the number of covariates is large. We illustrate the proposed methods using a dataset from the Treatment of Adolescents with Depression Study. Copyright © 2015 John Wiley & Sons, Ltd. PMID: 25736915 [PubMed – as supplied by publisher]
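
A generic version of such a global interaction test can be built from a likelihood-ratio statistic combined with permutation of residuals from the main-effects model (a Freedman-Lane-type scheme). The sketch below illustrates that construction on simulated trial data; it is a standard permutation recipe, not necessarily the exact test proposed by the authors.

```python
# Generic permutation test for any treatment-covariate interaction, using a
# likelihood-ratio statistic and Freedman-Lane-style permutation of residuals
# from the main-effects model; a standard construction on simulated data, not
# necessarily the exact test proposed by the authors.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(5)
n, p = 400, 5
X = rng.normal(size=(n, p))                       # baseline covariates
z = rng.binomial(1, 0.5, n)                       # randomized treatment
y = 1.0 * z + 0.5 * X[:, 0] + 0.8 * z * X[:, 0] + rng.normal(size=n)  # effect modified by X[:, 0]

def lr_stat(y_, X_, z_):
    """LR statistic: all treatment-by-covariate interactions versus main effects only."""
    main = sm.add_constant(np.column_stack([z_, X_]))
    full = sm.add_constant(np.column_stack([z_, X_, z_[:, None] * X_]))
    return 2 * (sm.OLS(y_, full).fit().llf - sm.OLS(y_, main).fit().llf)

obs = lr_stat(y, X, z)

# permute residuals of the null (main-effects) model and recompute the statistic
null_fit = sm.OLS(y, sm.add_constant(np.column_stack([z, X]))).fit()
perm = [lr_stat(null_fit.fittedvalues + rng.permutation(null_fit.resid), X, z) for _ in range(500)]

p_value = (np.sum(np.array(perm) >= obs) + 1) / (len(perm) + 1)
print(f"observed LR = {obs:.1f}, permutation p = {p_value:.3f}")
```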

3. Stat Methods Med Res. 2015 Feb 24. pii: 0962280215570722. [Epub ahead of print]

Estimation of causal effects of binary treatments in unconfounded studies with one continuous covariate.

Gutman R, Rubin D.
Department of Biostatistics, Brown University, Providence, RI, Department of Statistics, Harvard University, Cambridge, MA.

The estimation of causal effects in nonrandomized studies should comprise two distinct phases: design, with no outcome data available; and analysis of the outcome data according to a specified protocol. Here, we review and compare point and interval estimates of common statistical procedures for estimating causal effects (i.e. matching, subclassification, weighting, and model-based adjustment) with a scalar continuous covariate and a scalar continuous outcome. We show, using an extensive simulation, that some highly advocated methods have poor operating characteristics. In many conditions, matching for the point estimate combined with within-group matching for sampling variance estimation, with or without covariance adjustment, appears to be the most efficient valid method of those evaluated. These results provide new conclusions and advice regarding the merits of currently used procedures. © The Author(s) 2015 Reprints and permissions: sagepub.co.uk/journalsPermissions.nav. PMID: 25715391 [PubMed – as supplied by publisher]

4. Stat Med. 2013 May 20;32(11):1795-814. doi: 10.1002/sim.5627. Epub 2012 Sep 28.

Robust estimation of causal effects of binary treatments in unconfounded studies with dichotomous outcomes.

Gutman R, Rubin DB.
Department of Biostatistics, Brown University, 121 S. Main St., Providence, RI 02912, USA

The estimation of causal effects has been the subject of extensive research. In unconfounded studies with a dichotomous outcome, Y, Cangul, Chretien, Gutman and Rubin (2009) demonstrated that logistic regression for a scalar continuous covariate X is generally statistically invalid for testing null treatment effects when the distributions of X in the treated and control populations differ and the logistic model for Y given X is misspecified. In addition, they showed that an approximately valid statistical test can be generally obtained by discretizing X followed by regression adjustment within each interval defined by the discretized X. This paper extends the work of Cangul et al. 2009 in three major directions. First, we consider additional estimation procedures, including a new one that is based on two independent splines and multiple imputation; second, we consider additional distributional factors; and third, we examine the performance of the procedures when the treatment effect is non-null. Of all the methods considered and in most of the experimental conditions that were examined, our proposed new methodology appears to work best in terms of point and interval estimation. Copyright © 2012 John Wiley & Sons, Ltd. PMID: 23019093 [PubMed – indexed for MEDLINE]

5. Epidemiology. 2015 Feb 16. [Epub ahead of print]

Instrumental Variable Estimation in a Survival Context.

Tchetgen Tchetgen EJ, Walter S, Vansteelandt S, Martinussen T, Glymour M.
From the Departments of Biostatistics and Epidemiology, Harvard University, Boston, MA; Department of Epidemiology and Biostatistics, University of California, San Francisco, CA; Department Applied Mathematics, Computer Science and Statistics, Ghent University, Ghent, Belgium; and Department of Biostatistics, University of Copenhagen, Copenhagen, Denmark.

Bias due to unobserved confounding can seldom be ruled out with certainty when estimating the causal effect of a nonrandomized treatment. The instrumental variable (IV) design offers, under certain assumptions, the opportunity to tame confounding bias, without directly observing all confounders. The IV approach is very well developed in the context of linear regression and also for certain generalized linear models with a nonlinear link function. However, IV methods are not as well developed for regression analysis with a censored survival outcome. In this article, we develop the IV approach for regression analysis in a survival context, primarily under an additive hazards model, for which we describe 2 simple methods for estimating causal effects. The first method is a straightforward 2-stage regression approach analogous to 2-stage least squares commonly used for IV analysis in linear regression. In this approach, the fitted value from a first-stage regression of the exposure on the IV is entered in place of the exposure in the second-stage hazard model to recover a valid estimate of the treatment effect of interest. The second method is a so-called control function approach, which entails adding to the additive hazards outcome model, the residual from a first-stage regression of the exposure on the IV. Formal conditions are given justifying each strategy, and the methods are illustrated in a novel application to a Mendelian randomization study to evaluate the effect of diabetes on mortality using data from the Health and Retirement Study. We also establish that analogous strategies can also be used under a proportional hazards model specification, provided the outcome is rare over the entire follow-up. PMID: 25692223 [PubMed – as supplied by publisher]

6. Ann Epidemiol. 2015 Mar;25(3):147-54. doi: 10.1016/j.annepidem.2014.11.015. Epub
2014 Dec 11.

A history of the population attributable fraction and related measures.

Poole C.
Department of Epidemiology, University of North Carolina, Chapel Hill.

PURPOSE: Since Doll published the first population attributable fraction (PAF) in 1951, the measure has been a mainstay. Confusion in terminology abounds with regard to these measures. The ability to estimate all of them in case-control studies as well as in cohort studies is not widely appreciated.
METHODS: This article reviews and comments on the historical development of the population attributable fraction (PAF), the exposed attributable fraction (EAF), the rate difference (ID), the population rate (or incidence) difference (PID), and the caseload difference (CD).
RESULTS: The desire for PAFs to sum to no more than 100% and the interpretation of the complement of a PAF as the proportion of a rate that can be attributed to other causes are shown to stem from the same problem: a failure to recognize the pervasiveness of shared etiologic responsibility among causes. A lack of appreciation that “expected” numbers of cases and deaths are not actually the numbers to be expected when an exposure or intervention appreciably affects person-time denominators for rates, as in the case of smoking and nonnormal body mass, makes many CD estimates inflated. A movement may be gaining momentum to shift away from assuming, often unrealistically, the complete elimination of harmful exposures and toward estimating the effects of realistic interventions. This movement could culminate in a merger of the academic concept of transportability with the applied discipline of risk assessment.
CONCLUSIONS: A suggestion is offered to pay more attention to absolute measures such as the rate difference, the population rate difference, and the CD, when the latter can be validly estimated and less attention to proportional measures such as the EAF and PAF. Copyright © 2015 Elsevier Inc. All rights reserved. PMID: 25721747 [PubMed – in process]
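
For reference, the two most familiar PAF formulas discussed in this literature are easy to state. The sketch below implements Levin's formula (using exposure prevalence in the population) and Miettinen's formula (using exposure prevalence among cases) with hypothetical inputs; it is illustrative only and not drawn from the paper.

```python
# Two standard formulas for the population attributable fraction, illustrated
# with hypothetical numbers (the paper reviews these measures' history).
def paf_levin(p_exposed, rr):
    """Levin's formula: uses exposure prevalence in the whole population."""
    return p_exposed * (rr - 1) / (1 + p_exposed * (rr - 1))

def paf_miettinen(p_exposed_cases, rr):
    """Miettinen's formula: uses exposure prevalence among cases; also applicable in case-control studies."""
    return p_exposed_cases * (rr - 1) / rr

print(f"Levin:     {paf_levin(0.30, 4.0):.2f}")       # ~0.47
print(f"Miettinen: {paf_miettinen(0.63, 4.0):.2f}")   # ~0.47 with consistent inputs
```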

January 2015

1. Stat Med. 2015 Feb 17. doi: 10.1002/sim.6453. [Epub ahead of print]

Assessing potentially time-dependent treatment effect from clinical trials and observational studies for survival data, with applications to the Women’s Health Initiative combined hormone therapy trial.

Yang S, Prentice RL.
Office of Biostatistics Research, National Heart, Lung, and Blood Institute, Bethesda, 20892, MD, U. S. A.

For risk and benefit assessment in clinical trials and observational studies with time-to-event data, the Cox model has usually been the model of choice. When the hazards are possibly non-proportional, a piece-wise Cox model over a partition of the time axis may be considered. Here, we propose to analyze clinical trials or observational studies with time-to-event data using a certain semiparametric model. The model allows for a time-dependent treatment effect. It includes the important proportional hazards model as a sub-model and can accommodate various patterns of time-dependence of the hazard ratio. After estimation of the model parameters using a pseudo-likelihood approach, simultaneous confidence intervals for the hazard ratio function are established using a Monte Carlo method to assess the time-varying pattern of the treatment effect. To assess the overall treatment effect, estimated average hazard ratio and its confidence intervals are also obtained. The proposed methods are applied to data from the Women’s Health Initiative. To compare the Women’s Health Initiative clinical trial and observational study, we use the propensity score in building the regression model. Compared with the piece-wise Cox model, the proposed model yields a better model fit and does not require partitioning of the time axis. Copyright © 2015 John Wiley & Sons, Ltd. PMID: 25689356 [PubMed – as supplied by publisher]

2. Biometrics. 2015 Feb 10. doi: 10.1111/biom.12269. [Epub ahead of print]

On Bayesian estimation of marginal structural models.

Saarela O, Stephens DA, Moodie EE, Klein MB.
Dalla Lana School of Public Health, University of Toronto, 155 College Street, 6th floor, Toronto, Ontario, Canada M5T 3M7.

The purpose of inverse probability of treatment (IPT) weighting in estimation of marginal treatment effects is to construct a pseudo-population without imbalances in measured covariates, thus removing the effects of confounding and informative censoring when performing inference. In this article, we formalize the notion of such a pseudo-population as a data generating mechanism with particular characteristics, and show that this leads to a natural Bayesian interpretation of IPT weighted estimation. Using this interpretation, we are able to propose the first fully Bayesian procedure for estimating parameters of marginal structural models using an IPT weighting. Our approach suggests that the weights should be derived from the posterior predictive treatment assignment and censoring probabilities, answering the question of whether and how the uncertainty in the estimation of the weights should be incorporated in Bayesian inference of marginal treatment effects. The proposed approach is compared to existing methods in simulated data, and applied to an analysis of the Canadian Co-infection Cohort. © 2015, The International Biometric Society. PMID: 25677103 [PubMed – as supplied by publisher]

3. Epidemiology. 2015 Mar;26(2):238-41. doi: 10.1097/EDE.0000000000000241.

Evaluating possible confounding by prescriber in comparative effectiveness research.

Franklin JM, Schneeweiss S, Huybrechts KF, Glynn RJ.
From the Division of Pharmacoepidemiology and Pharmacoeconomics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA.

In nonrandomized studies of comparative effectiveness of medications, the prescriber may be the most important determinant of treatment assignment, yet the majority of analyses ignore the prescriber. Via Monte Carlo simulation, we evaluated the bias of 3 approaches that utilize the prescriber in analysis compared against the default approach that ignores the prescriber. Prescriber preference instrumental variable (IV) analyses were unbiased when IV criteria were met, which required no clustering of unmeasured patient characteristics within prescriber. In all other scenarios, IV analyses were highly biased, and stratification on the prescriber reduced confounding bias at the patient or prescriber levels. Including a prescriber random intercept in the propensity score model reversed the direction of confounding from measured patient factors and resulted in unpredictable changes in bias. Therefore, we recommend caution when using the IV approach, particularly when the instrument is weak. Stratification on the prescriber may be more robust; this approach warrants additional research. PMID: 25643103 [PubMed – in process]

4. Stat Med. 2015 Jan 28. doi: 10.1002/sim.6433. [Epub ahead of print]

Penalized regression procedures for variable selection in the potential outcomes framework.

Ghosh D, Zhu Y, Coffman DL.
Department of Biostatistics and Informatics, Colorado School of Public Health, Aurora, CO, 80045, U.S.A.

A recent topic of much interest in causal inference is model selection. In this article, we describe a framework in which to consider penalized regression approaches to variable selection for causal effects. The framework leads to a simple ‘impute, then select’ class of procedures that is agnostic to the type of imputation algorithm as well as penalized regression used. It also clarifies how model selection involves a multivariate regression model for causal inference problems and that these methods can be applied for identifying subgroups in which treatment effects are homogeneous. Analogies and links with the literature on machine learning methods, missing data, and imputation are drawn. A difference least absolute shrinkage and selection operator algorithm is defined, along with its multiple imputation analogs. The procedures are illustrated using a well-known right-heart catheterization dataset. Copyright © 2015 John Wiley & Sons, Ltd. PMID: 25628185 [PubMed – as supplied by publisher]
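
One way to read the 'impute, then select' idea is: impute each subject's two potential outcomes with arm-specific models, then apply a penalized regression to the imputed individual-level effects to select effect modifiers. The sketch below follows that reading on simulated data with ordinary least squares imputation and a cross-validated lasso; the specific models and penalty are illustrative assumptions, not the authors' exact algorithm.

```python
# Rough "impute, then select" sketch: impute both potential outcomes with
# arm-specific regressions, then lasso the imputed individual-level effects on
# covariates to select effect modifiers. The imputation model and penalty are
# illustrative assumptions, not the authors' exact algorithm.
import numpy as np
from sklearn.linear_model import LassoCV, LinearRegression

rng = np.random.default_rng(6)
n, p = 2000, 10
X = rng.normal(size=(n, p))
z = rng.binomial(1, 0.5, n)
beta = rng.normal(size=p) * 0.3
y = X @ beta + z * (1.0 + 0.8 * X[:, 0]) + rng.normal(size=n)   # effect modified by column 0 only

# impute potential outcomes under treatment and control with arm-specific models
m1 = LinearRegression().fit(X[z == 1], y[z == 1])
m0 = LinearRegression().fit(X[z == 0], y[z == 0])
tau_hat = m1.predict(X) - m0.predict(X)                          # imputed individual-level effects

# penalized selection of covariates associated with the imputed effects
sel = LassoCV(cv=5).fit(X, tau_hat)
print("selected effect modifiers (columns):", np.nonzero(sel.coef_ != 0)[0])
print("lasso coefficients:", np.round(sel.coef_, 2))
```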

5. Stat Med. 2015 Jan 11. doi: 10.1002/sim.6421. [Epub ahead of print]

Design and inference for the intent-to-treat principle using adaptive treatment.

Dawson R, Lavori PW.
Frontier Science Technology and Research Foundation, Boston, MA, U.S.A.

Nonadherence to assigned treatment jeopardizes the power and interpretability of intent-to-treat comparisons from clinical trial data and continues to be an issue for effectiveness studies, despite their pragmatic emphasis. We posit that new approaches to design need to complement developments in methods for causal inference to address nonadherence, in both experimental and practice settings. This paper considers the conventional study design for psychiatric research and other medical contexts, in which subjects are randomized to treatments that are
fixed throughout the trial and presents an alternative that converts the fixed treatments into an adaptive intervention that reflects best practice. The key element is the introduction of an adaptive decision point midway into the study to address a patient’s reluctance to remain on treatment before completing a full-length trial of medication. The clinical uncertainty about the appropriate adaptation prompts a second randomization at the new decision point to evaluate relevant options. Additionally, the standard ‘all-or-none’ principal stratification (PS) framework is applied to the first stage of the design to address treatment discontinuation that occurs too early for a midtrial adaptation. Drawing upon the adaptive intervention features, we develop
assumptions to identify the PS causal estimand and to introduce restrictions on outcome distributions to simplify expectation-maximization calculations. We evaluate the performance of the PS setup, with particular attention to the role played by a binary covariate. The results emphasize the importance of collecting covariate data for use in design and analysis. We consider the generality of our approach beyond the setting of psychiatric research. Copyright © 2015 John Wiley & Sons, Ltd. PMID: 25581413 [PubMed – as supplied by publisher]

6. Pharmacoepidemiol Drug Saf. 2015 Jan 28. doi: 10.1002/pds.3743. [Epub ahead of print]

Beginning and duration of pregnancy in automated health care databases: review of estimation methods and validation results.

Margulis AV, Palmsten K, Andrade SE, Charlton RA, Hardy JR, Cooper WO, Hernández-Díaz S.
RTI Health Solutions, Barcelona, Spain.

PURPOSE: To describe methods reported in the literature to estimate the beginning or duration of pregnancy in automated health care data, and to present results of validation exercises where available.
METHODS: Papers reporting methods for determining the beginning or duration of pregnancy were identified based on Pubmed searches, by consulting investigators with expertise in the field and by reviewing conference abstracts and reference lists of relevant papers. From each paper or abstract, we extracted information to characterize the study population, data sources, and estimation algorithm. We then grouped these studies into categories reflecting their general methodological approach.
RESULTS: Methods were classified into 5 categories: (i) methods that assign a uniform duration for all pregnancies, (ii) methods that assign pregnancy duration based on preterm delivery or health care related codes, or codes for other pregnancy outcomes, (iii) methods based on the timing of prenatal care, (iv) methods based on birth weight, and (v) methods that combine elements from categories (ii) and (iii). Validation studies evaluating these methods used varied approaches, with results generally reporting on the mistiming of the start of pregnancy, incorrect estimation of the duration of pregnancy, or misclassification of drug exposure during pregnancy or early pregnancy.
CONCLUSIONS: In the absence of accurate information on the beginning or duration of pregnancy, several methods of varying complexity are available to estimate them. Validation studies have been performed for many of them and can serve as a guide for method selection for a particular study. Copyright © 2015 John Wiley & Sons, Ltd. PMID: 25627986

7. Pharmacoepidemiol Drug Saf. 2015 Jan 20. doi: 10.1002/pds.3738. [Epub ahead of print]

Short look-back periods in pharmacoepidemiologic studies of new users of antibiotics and asthma medications introduce severe misclassification.

Riis AH, Johansen MB, Jacobsen JB, Brookhart MA, Stürmer T, Støvring H.
Department of Clinical Epidemiology, Aarhus University Hospital, Aarhus N, Denmark.

PURPOSE: The aim of this study was to quantify the effect of the look back period on the misclassification of new users of antibiotics and asthma medications.
METHODS: We included all children born in Denmark from 1995 through 2006 and all prescriptions of antibiotics and asthma medication from 1995 through 2011. The study period was 2007 through 2011. True new users redeemed their first prescription in the study period whereas prior users redeemed their first prescription before the study period. Look-back periods ranged from 30 days up to 12 years prior to the study period, and we defined new users as those without a prescription in the look-back period. The relative misclassification (RM) was estimated as the number of defined new users divided by the number of true new users.
RESULTS: For antibiotics, the RM decreased from 4.75 for look-back periods of 30 days to 2.36 for 2 years and 1.33 for 5 years. For asthma medication, the RM decreased from 2.53 for look-back periods of 30 days to 1.48 for 2 years and 1.20 for 5 years. Older age, male gender, and absence of treatment-related diagnoses were associated with higher RM.
CONCLUSIONS: Studies applying the new user design are strongly dependent on the available information on prescriptions. For drug classes with intermittent use such as asthma medications, even a 2-year look-back period produced severe misclassification. Excluding children with a prior treatment-related diagnosis can reduce the level of misclassification. Copyright © 2015 John Wiley & Sons, Ltd. PMID: 25601142
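
The relative-misclassification calculation itself is straightforward once prescriptions and a study window are defined. The toy sketch below counts defined versus true new users for several look-back windows; the prescription table, dates, and column names are invented for illustration.

```python
# Toy sketch of the relative-misclassification (RM) calculation for several
# look-back windows; the prescription table, dates, and column names are invented.
import pandas as pd

rx = pd.DataFrame({  # one row per redeemed prescription
    "pid":  [1, 1, 2, 3, 3, 4, 4],
    "date": pd.to_datetime(["2004-06-01", "2008-02-01",    # person 1: distant prior use
                            "2008-05-01",                   # person 2: true new user
                            "2006-11-01", "2009-01-01",     # person 3: prior use ~2 months before study start
                            "2006-12-15", "2007-06-15"]),   # person 4: prior use 2 weeks before study start
})
study_start = pd.Timestamp("2007-01-01")

in_study = rx.loc[rx["date"] >= study_start, "pid"].unique()       # users observed in the study period
first_rx = rx.groupby("pid")["date"].min()
true_new = sum(first_rx.loc[pid] >= study_start for pid in in_study)

for lookback_days in (30, 365, 2 * 365, 5 * 365):
    window_start = study_start - pd.Timedelta(days=lookback_days)
    prior = rx.loc[(rx["date"] >= window_start) & (rx["date"] < study_start), "pid"].unique()
    defined_new = sum(pid not in prior for pid in in_study)        # no prescription in the look-back window
    print(f"look-back {lookback_days:>4} days: RM = {defined_new}/{true_new} = {defined_new / true_new:.2f}")
```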

8. Am J Epidemiol. 2015 Feb 1;181(3):198-203. doi: 10.1093/aje/kwu276. Epub 2015 Jan 14.

A cautionary note about estimating effects of secondary exposures in cohort studies.

Ahrens KA, Cole SR, Westreich D, Platt RW, Schisterman EF.

Cohort studies are often enriched for a primary exposure of interest to improve cost-effectiveness, which presents analytical challenges not commonly discussed in epidemiology. In this paper, we use causal diagrams to represent exposure-enriched cohort studies, illustrate a scenario wherein the risk ratio for the effect of a secondary exposure on an outcome is biased, and propose an analytical method for correcting for such bias. In our motivating example, maternal smoking (Z) is a cause of fetal growth restriction (X), which subsequently affects preterm birth (Y) (i.e., Z → X → Y); strong positive associations exist between both Z, X and X, Y; and enrichment for X increases its prevalence from 10% to 50%. In the X-enriched cohort, unadjusted and X-adjusted analyses lead to bias in the risk ratio for the total effect of Z on Y. After application of inverse probability weights, the bias is corrected, with a small loss of efficiency in comparison with a same-sized study without X-enrichment. With increasing interest in conducting secondary analyses to reduce research costs, caution should be employed when analyzing studies that have already been enriched, intentionally or unintentionally, for a primary exposure of interest. Causal diagrams can help identify scenarios in which secondary analyses may be biased. Inverse probability weights can be used to remove the bias. © The Author 2014. Published by Oxford University Press on behalf of the Johns Hopkins Bloomberg School of Public Health. PMID: 25589243 [PubMed – in process]
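
The bias and its correction can be reproduced in a small simulation. In the sketch below, a cohort is enriched on the primary exposure X, the naive risk ratio for the secondary exposure Z is distorted, and inverse-probability-of-selection weights recover the estimate from the full source population. All effect sizes and sampling fractions are invented; this is not the authors' analysis.

```python
# Toy simulation of the exposure-enriched design (Z -> X -> Y): the cohort is
# enriched on the primary exposure X, the naive risk ratio for the secondary
# exposure Z is distorted, and inverse-probability-of-selection weights recover
# the full-population estimate. Effect sizes and sampling fractions are invented.
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(7)
n = 200_000
z = rng.binomial(1, 0.2, n)                             # secondary exposure (e.g., maternal smoking)
x = rng.binomial(1, np.where(z == 1, 0.25, 0.08))       # primary exposure (e.g., growth restriction)
y = rng.binomial(1, np.where(x == 1, 0.30, 0.05))       # outcome (e.g., preterm birth), affected by Z only through X

# enrich the cohort: keep all X = 1 and a fraction of X = 0 (X prevalence rises from ~11% to ~50%)
p_sel = np.where(x == 1, 1.0, 0.12)
sel = rng.random(n) < p_sel
cohort = pd.DataFrame({"z": z[sel], "y": y[sel], "w": 1 / p_sel[sel]})

def rr_for_z(data, weights=None):
    """Risk ratio for z from a log-link Poisson GLM (weights give the weighted point estimate)."""
    fit = sm.GLM(data["y"], sm.add_constant(data[["z"]]),
                 family=sm.families.Poisson(), freq_weights=weights).fit()
    return np.exp(fit.params["z"])

source = pd.DataFrame({"z": z, "y": y})
print(f"RR of Z in full source population: {rr_for_z(source):.2f}")
print(f"naive RR in enriched cohort:       {rr_for_z(cohort):.2f}")               # distorted by enrichment on X
print(f"selection-weighted RR in cohort:   {rr_for_z(cohort, cohort['w']):.2f}")  # close to the source estimate
```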