DEcIDE Methods Center Monthly Literature Scan

AHRQ has funded the Brigham DEcIDE Center for Comparative Effectiveness Research to lead the DEcIDE Methods Center (DMC). One of the primary aims of the DMC is to develop a multifaceted Methods Learning Network for comparative effectiveness research. As part of the Learning Network, monthly literature reviews are conducted to alert the DEcIDE network to articles on methodological innovations or reviews of analytic approaches that may help improve the validity of original comparative effectiveness research. Highly specific topics are excluded in favor of approaches relevant to a broad range of applied investigators in CER.

The DEcIDE Center for Comparative Effectiveness Research has finished its monthly literature scans and webinars. The division will continue to provide monthly comparative effectiveness research methods literature scans here.

For some months, a cluster of methods references on a specific theme thought to be of interest is also provided.

Current and previous scans may be found below:

April 2013
March 2013
February 2013
January 2013
December 2012
November 2012
October 2012
September 2012
August 2012
July 2012
June 2012
May 2012
April 2012
March 2012
February 2012
January 2012
December 2011
November 2011
October 2011
September 2011
August 2011
July 2011
June 2011
May 2011
April 2011
March 2011

April 2013

CER Scan [Epub ahead of print]

  1. Pharmacoepidemiol Drug Saf. 2013 Apr 29. [Epub ahead of print]
     A simulation study to compare three self-controlled case series approaches: correction for violation of assumption and evaluation of bias. Hua W, Sun G, Dodd CN, Romio SA, Whitaker HJ, Izurieta HS, Black S, Sturkenboom MC, Davis RL, Deceuninck G, Andrews NJ. Office of Biostatistics and Epidemiology, Center for Biologics Evaluation and Research, Food and Drug Administration, Rockville, Maryland, USA.

    PURPOSE: The assumption that the occurrence of outcome event must not alter subsequent exposure probability is critical for preserving the validity of the self-controlled case series (SCCS) method. This assumption is violated in scenarios in which the event constitutes a contraindication for exposure. In this simulation study, we compared the performance of the standard SCCS approach and two alternative approaches when the event-independent exposure assumption was violated. METHODS: Using the 2009 H1N1 and seasonal influenza vaccines and Guillain-Barré syndrome as a model, we simulated a scenario in which an individual may encounter multiple unordered exposures and each exposure may be contraindicated by the occurrence of outcome event. The degree of contraindication was varied at 0%, 50%, and 100%. The first alternative approach  used only cases occurring after exposure with follow-up time starting from exposure. The second used a pseudo-likelihood method. RESULTS:  When the event-independent exposure assumption was satisfied, the standard SCCS approach produced nearly unbiased relative incidence estimates. When this assumption was partially or completely violated, two alternative SCCS approaches could be used. While the post-exposure cases only approach could handle only one exposure, the
    pseudo-likelihood approach was able to correct bias for both exposures. CONCLUSIONS: Violation of the event-independent exposure assumption leads to an overestimation of relative incidence which could be corrected by alternative SCCS approaches. In multiple exposure situations, the pseudo-likelihood approach is optimal; the post-exposure cases only approach is limited in handling a second exposure and may introduce additional bias, thus should be used with caution. Copyright © 2013 John Wiley & Sons, Ltd.
    PMID: 23625875  [PubMed - as supplied by publisher]

  2. Clin Trials. 2013 Apr 22. [Epub ahead of print]
     Kaplan-Meier curves for survivor causal effects with time-to-event outcomes. Chiba Y. Division of Biostatistics, Clinical Research Center, Kinki University School of Medicine, Osaka, Japan.

    BACKGROUND: In clinical trials, an outcome of interest may be undefined for individuals who die before the outcome is evaluated. One approach to deal with such issues is to consider the survivor causal effect (SCE), which is defined as the effect of treatment on the outcome among the subpopulation that would have survived under either treatment arm. Although several methods have been presented to estimate the SCE with time-to-event outcomes, they are difficult to implement in practice. PURPOSE: We present a simple method to create Kaplan-Meier curves and to estimate the hazard ratio (HR) for the SCE with time-to-event outcomes. METHODS: To develop such a method, we applied the weighted average method presented for the SCE to outcomes with no censoring, where weights are calculated using the probability that a patient would have survived had the patient been in the other treatment arm. By multiplying the weight to each patient, Kaplan-Meier curves can be created for the SCE to outcomes with censoring. The HR is then calculated using a weighted proportional hazard model. For this method, two assumptions need to be introduced to achieve unbiasedness. RESULTS: The proposed method is illustrated using data from a randomized Phase II clinical trial, comparing two chemotherapy treatments with radiotherapy in patients with esophageal cancer. Here, we focus on the loco-regional control rate, which is calculated from the time after randomization until recurrence in the radiation field. The duration is undefined for patients who died without recurrence. The proposed method yielded a HR of 1.026 (95% confidence interval (CI): 0.627, 1.677). The standard method, where data of patients who died without progression were regarded as censored at the time of death, yielded a HR of 1.121 (95% CI: 0.688, 1.827). LIMITATIONS: The proposed method requires two assumptions. As a general problem, unfortunately, whether these assumptions hold cannot be confirmed from the observed data. Thus, we cannot confirm whether the Kaplan-Meier curves and the HR are unbiased. CONCLUSION: We have proposed a simple method for the SCE with time-to-event outcomes, which is easy to implement in practice. The proposed method is a potentially valuable supplement to the standard method.
    PMID: 23610455  [PubMed - as supplied by publisher]
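
Once per-patient survivor-stratum weights are in hand, the weighted Kaplan-Meier and weighted Cox steps described in this abstract are straightforward to reproduce. A minimal Python sketch using the lifelines library follows; the column names and the source of the weights are illustrative assumptions, not the paper's implementation.

    import pandas as pd
    from lifelines import KaplanMeierFitter, CoxPHFitter

    # Hypothetical data: one row per patient with time, event indicator, treatment arm,
    # and a weight approximating membership in the always-survivor stratum.
    df = pd.read_csv("trial.csv")

    # Weighted Kaplan-Meier curve per arm
    for arm, grp in df.groupby("arm"):
        km = KaplanMeierFitter(label="arm %s" % arm)
        km.fit(grp["time"], event_observed=grp["event"], weights=grp["sce_weight"])
        km.plot_survival_function()

    # Weighted proportional hazards model for the survivor causal hazard ratio;
    # robust variance is requested because the weights are estimated.
    cox = CoxPHFitter()
    cox.fit(df[["time", "event", "arm", "sce_weight"]], duration_col="time",
            event_col="event", weights_col="sce_weight", robust=True)
    print(cox.summary)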

  3. Stat Med. 2013 Apr 22. doi: 10.1002/sim.5826. [Epub ahead of print]
     Estimating a marginal causal odds ratio in a case-control design: analyzing the effect of low birth weight on the risk of type 1 diabetes mellitus. Persson E, Waernbaum I. Department of Statistics, USBE, Umeå University, SE-90187 Umeå, Sweden.

    Estimation of marginal causal effects from case-control data has two complications: (i) confounding due to the fact that the exposure under study is not randomized, and (ii) bias from the case-control sampling scheme. In this paper, we study estimators of the marginal causal odds ratio, addressing these issues for matched and unmatched case-control designs when utilizing the knowledge of the known prevalence of being a case. The estimators are implemented in simulations where their finite sample properties are studied and approximations of their variances are derived with the delta method. Also, we illustrate the methods by analyzing the effect of low birth weight on the risk of type 1 diabetes mellitus using data from the Swedish Childhood Diabetes Register, a nationwide population-based incidence register. Copyright © 2013 John Wiley & Sons, Ltd.
    PMID: 23606411  [PubMed - as supplied by publisher]

  4. Pharmacoepidemiol Drug Saf. 2013 Apr 18. doi: 10.1002/pds.3443. [Epub ahead of print]
     Methods of linking mothers and infants using health plan data for studies of pregnancy outcomes. Johnson KE, Beaton SJ, Andrade SE, Cheetham TC, Scott PE, Hammad TA, Dashevsky I, Cooper WO, Davis RL, Pawloski PA, Raebel MA, Smith DH, Toh S, Li DK, Haffenreffer K, Dublin S. Group Health Research Institute, Group Health Cooperative, Seattle, WA, USA.

    PURPOSE: Research on medication safety in pregnancy often utilizes health plan and birth certificate records. This study discusses methods used to link mothers with infants, a crucial step in such research. METHODS: We describe how eight sites participating in the Medication Exposure in Pregnancy Risk Evaluation Program created linkages between deliveries, infants and birth certificates for the 2001-2007 birth cohorts. We describe linkage rates across sites, and for two sites, we compare the characteristics of populations linked using different methods. RESULTS: Of 299 260 deliveries, 256 563 (86%; range by site, 74-99%) could be linked to infants using a deterministic algorithm. At two sites, using birth certificate data to augment mother-infant linkage increased the representation of mothers who were Hispanic or non-White, younger, Medicaid recipients, or had low educational level. A total of 236 460 (92%; range by site, 82-100%) deliveries could be linked to a birth certificate. CONCLUSIONS: Tailored approaches enabled linking most deliveries to infants and to birth certificates, even when data systems differed. The methods used may affect the composition of the population identified. Linkages established with such methods can support sound pharmacoepidemiology studies of maternal drug exposure outside the context of a formal registry. Copyright © 2013 John Wiley & Sons, Ltd.
    PMID: 23596095  [PubMed - as supplied by publisher]

  5. Stat Med. 2013 Mar 31. doi: 10.1002/sim.5795. [Epub ahead of print]
     Propensity scores used for analysis of cluster randomized trials with selection bias: a simulation study. Leyrat C, Caille A, Donner A, Giraudeau B. INSERM UMR-S 738, Paris, France; INSERM CIC 202, Tours, France.

    Cluster randomized trials (CRTs) are often prone to selection bias despite randomization. Using a simulation study, we investigated the use of propensity score (PS) based methods in estimating treatment effects in CRTs with selection bias when the outcome is quantitative. Of four PS-based methods (adjustment on PS, inverse weighting, stratification, and optimal full matching method), three successfully corrected the bias, as did an approach using classical multivariable regression. However, they showed poorer statistical efficiency than classical methods, with higher standard error for the treatment effect, and type I error much smaller than the 5% nominal level. Copyright © 2013 John Wiley & Sons, Ltd.
    PMID: 23553813  [PubMed - as supplied by publisher]

  6. Clin Trials. 2013 Apr 3. [Epub ahead of print]
     Accommodating missingness when assessing surrogacy via principal stratification. Elliott MR, Li Y, Taylor JM. Department of Biostatistics, University of Michigan, Ann Arbor, MI, USA.

    BACKGROUND: When an outcome of interest in a clinical trial is late-occurring or difficult to obtain, surrogate markers can extract information about the effect of the treatment on the outcome of interest. Understanding associations between the causal effect (CE) of treatment on the outcome and the causal effect of treatment on the surrogate is critical to understanding the value of a surrogate from a clinical perspective. PURPOSE: Traditional regression approaches to determine the proportion of the treatment effect explained by surrogate markers suffer from several shortcomings: they can be unstable and can lie outside the 0-1 range. Furthermore, they do not account for the fact that surrogate measures are obtained post randomization, and thus, the surrogate-outcome relationship may be subject to unmeasured confounding. Methods to avoid these problems are of key importance. METHODS: Frangakis and Rubin suggested assessing the CE within prerandomization ‘principal strata’ defined by the counterfactual joint distribution of the surrogate marker under the different treatment arms, with the proportion of the overall outcome CE attributable to subjects for whom the treatment affects the proposed surrogate as the key measure of interest. Li et al. developed this ‘principal surrogacy’ approach for dichotomous markers and outcomes, utilizing Bayesian methods that accommodated nonidentifiability in the model parameters. Because the surrogate marker is typically observed early, outcome data are often missing. Here, we extend Li et al. to accommodate missing data in the observable final outcome under ignorable and nonignorable settings. We also allow for the possibility that missingness has a counterfactual component, a feature that previous literature has not addressed. RESULTS: We apply the proposed methods to a trial of glaucoma control comparing surgery versus medication, where intraocular pressure (IOP) control at 12 months is a surrogate for IOP control at 96 months. We also conduct a series of simulations to consider the impacts of nonignorability, as well as sensitivity to priors and the ability of the deviance information criterion (DIC) to choose the correct model when parameters are not fully identified. LIMITATIONS: Because model parameters cannot be fully identified from data, informative priors can introduce nontrivial bias in moderate sample size settings, while more noninformative priors can yield wide credible intervals. CONCLUSIONS: Assessing the linkage between CEs of treatment on a surrogate marker and CEs of a treatment on an outcome is important to understanding the value of a marker. These CEs are not fully identifiable; hence, we explore the sensitivity and identifiability aspects of these models and show that relatively weak assumptions can still yield meaningful results.
    PMID: 23553326  [PubMed - as supplied by publisher]

CER Scan [Published in the last 30 days]

  1. BMC Med Res Methodol. 2013 Apr 24;13(1):59. [Epub ahead of print]
     Methodological approaches to population based research of screening procedures in the presence of selection bias and exposure measurement error: colonoscopy and colorectal cancer outcomes in Ontario. Jacob BJ, Sutradhar R, Moineddin R, Baxter NN, Urbach DR.

    BACKGROUND: The study describes the methodological challenges encountered in an observational study estimating the effectiveness of colonoscopy in reducing colorectal cancer (CRC) incidence and mortality. METHODS: Using Ontario provincial administrative data, we conducted a population-based retrospective cohort study to assess CRC incidence and mortality in a group of average-risk subjects aged 50–74 years who underwent colonoscopy between 1996–2000. We created two study cohorts: unselected and restricted. The unselected cohort consists of subjects aged 50–74 years who were eligible for CRC screening and who had the same primary care physician (PCP) during the period 1996–2000 with at least two years of follow-up. PCPs are general practitioners/family physicians who are the main source of health care for Ontarians. The restricted cohort was a nested sample of the unselected cohort who were alive and free of CRC as of January 1, 2001 and whose PCPs had at least 10 screen-eligible patients with a colonoscopy referral rate of more than 3%. We compared the outcomes in the two study cohorts: unselected vs. restricted. We then estimated the absolute risk reduction associated with colonoscopy in preventing CRC incidence and mortality in the restricted cohort, using traditional regression analysis, propensity score analysis and instrumental variable analysis. RESULTS: The unselected cohort (N =
    1,341,612) showed that colonoscopy was associated with an increase in CRC incidence (1.61% vs. 4.61%) and mortality (0.36% vs. 1.16%), whereas the restricted cohort (N = 1,089,998) showed that colonoscopy was associated with a reduction in CRC incidence (1.36% vs. 0.84%) and mortality (0.23% vs. 0.15%). For CRC incidence, the absolute risk reduction (ARR) associated with colonoscopy use was 0.52% in an unadjusted model, 0.53% in a multivariate logistic regression model, 0.54% in a propensity score-weighted outcome model, 0.56% in propensity score-matched model, and 0.60% using instrumental variable analysis. For CRC mortality, the ARR was 0.08% in the unadjusted model, multivariate logistic regression model and for a propensity score- weighted outcome model, 0.10% using propensity score matched model and 0.17% using the IVA model. CONCLUSIONS: Colonoscopy use reduced the risk of CRC incidence and mortality in the restricted cohort. The study highlights the importance of appropriate selection of study subjects and use of analytic methods for the evaluation of screening methods using observational data.
    PMID: 23617792  [PubMed - as supplied by publisher]

March 2013

CER Scan [Epub ahead of print]

  1. Stat Med. 2013 Mar 27. doi: 10.1002/sim.5801. [Epub ahead of print]
     Estimating parsimonious models of longitudinal causal effects using regressions on propensity scores. Shinohara RT, Narayan AK, Hong K, Kim HS, Coresh J, Streiff MB, Frangakis CE. Department of Biostatistics and Epidemiology, University of Pennsylvania, Philadelphia, PA.

    Parsimony is important for the interpretation of causal effect estimates of longitudinal treatments on subsequent outcomes. One method for parsimonious estimates fits marginal structural models by using inverse propensity scores as weights. This method leads to generally large variability that is uncommon in more likelihood-based approaches. A more recent method fits these models by using simulations from a fitted g-computation, but requires the modeling of high-dimensional longitudinal relations that are highly susceptible to misspecification. We propose a new method that, first, uses longitudinal propensity scores as regressors to reduce the dimension of the problem and then uses the approximate likelihood for the first estimates to fit parsimonious models. We demonstrate the methods by estimating the effect of anticoagulant therapy on survival for cancer and non-cancer patients who have inferior vena cava filters. Copyright ©2013 John Wiley & Sons, Ltd.
    PMID: 23533091  [PubMed - as supplied by publisher]

  2. Epidemiology. 2013 May;24(3):352-62. [Epub ahead of print]
     COX-2 Selective Nonsteroidal Anti-inflammatory Drugs and Risk of Gastrointestinal Tract Complications and Myocardial Infarction: An Instrumental Variable Analysis. Davies NM, Smith GD, Windmeijer F, Martin RM. Medical Research Council Centre for Causal Analyses and Translational Epidemiology, School of Social and Community Medicine, Faculty of Medicine and Dentistry, University of Bristol, Barley House, Oakfield Grove, Bristol, United Kingdom; Centre for Market and Public Organisation, Department of Economics, Faculty of Social Science and Law, University of Bristol, Bristol, UK

    OBJECTIVE: Instrumental variable analysis can estimate treatment effects in the  presence of residual or unmeasured confounding. We compared ordinary least squares regression versus instrumental variable estimates of the effects of selective cyclooxygenase-2 inhibitors (COX-2s) relative to nonselective nonsteroidal anti-inflammatory drug (NSAID) prescriptions on incidence of upper gastrointestinal complications and myocardial infarction (MI). METHODS: We sampled a cohort of 62,933 first-time users of COX-2s or nonselective NSAIDs older than 60 years in the Clinical Practice Research Datalink. The instruments were physicians’ previous prescriptions of COX-2s or nonselective NSAIDs, which are surrogates for physician preferences. We estimated risk differences of incident upper gastrointestinal complications and MI within 180 days of first COX-2 versus nonselective NSAID prescription. RESULTS: Using ordinary least squares regression, adjusted for baseline confounders, we observed little association of COX-2 prescriptions with incident upper gastrointestinal complications (risk difference = -0.08 [95% confidence interval = -0.20 to 0.04]) or MI (0.06 [-0.06 to 0.17]) per 100 patients treated. Our adjusted instrumental variable results suggested 0.46 per 100 (-0.15 to 1.07) fewer upper gastrointestinal complications and little difference in acute MIs (0.08 per 100 [-0.61 to 0.76]), within 180 days of being prescribed COX-2s. Estimates were more precise when we used 20 previous prescriptions; the instrumental variable analysis implied 0.74 (0.28 to 1.19) fewer MIs per 100 patients prescribed COX-2s. CONCLUSIONS: Using instrumental variable analysis, we found some evidence that COX-2 prescriptions reduced the risk of upper gastrointestinal complications, consistent with randomized controlled trials. Our results using multiple instruments suggest that COX-2s may have heterogeneous within-class effects on MI.
    PMID: 23532054  [PubMed - as supplied by publisher]
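
The physician-preference instrument described above is usually analyzed with two-stage least squares. The sketch below shows a bare-bones two-stage fit in Python; the file and variable names are illustrative assumptions, not the authors' data, and the naive second-stage standard errors shown here are not valid (a dedicated IV routine or corrected variance is needed in a real analysis).

    import statsmodels.api as sm
    import pandas as pd

    df = pd.read_csv("nsaid_cohort.csv")  # hypothetical cohort extract
    covars = ["age", "sex", "prior_gi_bleed", "aspirin_use"]

    # Stage 1: exposure (COX-2 vs nonselective NSAID) on the instrument
    # (physician's previous prescription) plus baseline covariates.
    X1 = sm.add_constant(df[["prior_rx_cox2"] + covars])
    stage1 = sm.OLS(df["cox2"], X1).fit()
    df["cox2_hat"] = stage1.predict(X1)

    # Stage 2: outcome on the predicted exposure plus the same covariates.
    X2 = sm.add_constant(df[["cox2_hat"] + covars])
    stage2 = sm.OLS(df["gi_complication"], X2).fit()
    print(stage2.params["cox2_hat"])  # IV risk difference per treated patient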

  3. Epidemiology. 2013 May;24(3):401-9. [Epub ahead of print]
     Matching by Propensity Score in Cohort Studies with Three Treatment Groups. Rassen JA, Shelat AA, Franklin JM, Glynn RJ, Solomon DH, Schneeweiss S. Division of Pharmacoepidemiology and Pharmacoeconomics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA; and Department of Computer Science, University of Virginia, Charlottesville, VA.

    BACKGROUND: Nonrandomized pharmacoepidemiology generally compares one medication with another. For many conditions, clinicians can benefit from comparing the safety and effectiveness of three or more appropriate treatment options. We sought to compare three treatment groups simultaneously by creating 1:1:1 propensity score-matched cohorts. METHODS: We developed a technique that estimates generalized propensity scores and then creates 1:1:1 matched sets. We compared this methodology with two existing approaches: construction of matched cohorts through a common-referent group and a pairwise match for each possible contrast. In a simulation, we varied unmeasured confounding, presence of treatment effect heterogeneity, and the prevalence of treatments and compared each method’s bias, variance, and mean squared error (MSE) of the treatment effect. We applied these techniques to a cohort of rheumatoid arthritis patients treated with nonselective nonsteroidal anti-inflammatory drugs, COX-2 selective inhibitors, or opioids. RESULTS: We performed 1000 simulation runs. In the base case, we observed an average bias of 0.4% (MSE × 100 = 0.2) in the three-way matching approach and an average bias of 0.3% (MSE × 100 = 2.1) with the pairwise technique. The techniques showed differing bias and MSE with increasing treatment effect heterogeneity and decreasing propensity score overlap. With highly unequal exposure prevalences, strong heterogeneity, and low overlap, we observed a bias of 6.5% (MSE × 100 = 10.8) in the three-way approach and 12.5% (MSE = 12.3) in the pairwise approach. The empirical study displayed better covariate balance using the pairwise approach. Point estimates were substantially similar. CONCLUSIONS: Our matching approach offers an effective way to study the safety and effectiveness of three treatment options. We recommend its use over the pairwise or common-referent approaches.
    PMID: 23532053  [PubMed - as supplied by publisher]

  4. Stat Med. 2013 Mar 18. doi: 10.1002/sim.5753. [Epub ahead of print]
     A tutorial on propensity score estimation for multiple treatments using generalized boosted models. McCaffrey DF, Griffin BA, Almirall D, Slaughter ME, Ramchand R, Burgette LF. The RAND Corporation, 4570 Fifth Avenue, Pittsburgh, PA 15213, U.S.A.

    The use of propensity scores to control for pretreatment imbalances on observed variables in non-randomized or observational studies examining the causal effects of treatments or interventions has become widespread over the past decade. For settings with two conditions of interest such as a treatment and a control, inverse probability of treatment weighted estimation with propensity scores estimated via boosted models has been shown in simulation studies to yield causal effect estimates with desirable properties. There are tools (e.g., the twang package in R) and guidance for implementing this method with two treatments. However, there is not such guidance for analyses of three or more treatments. The goals of this paper are twofold: (1) to provide step-by-step guidance for researchers who want to implement propensity score weighting for multiple treatments and (2) to propose the use of generalized boosted models (GBM) for estimation of the necessary propensity score weights. We define the causal quantities that may be of interest to studies of multiple treatments and derive weighted estimators of those quantities. We present a detailed plan for using GBM to estimate propensity scores and using those scores to estimate weights and causal effects. We also provide tools for assessing balance and overlap of pretreatment variables among treatment groups in the context of multiple treatments. A case study examining the effects of three treatment programs for adolescent substance abuse demonstrates the methods. Copyright © 2013 John Wiley & Sons, Ltd.
    PMID: 23508673  [PubMed - as supplied by publisher]
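
The tutorial's worked examples rely on the twang package in R. The sketch below is only a rough Python analogue of the same idea, not the twang implementation: a boosted classifier supplies generalized propensity scores for three treatments, which are inverted to form ATE weights. The file and column names are illustrative assumptions.

    import numpy as np
    import pandas as pd
    from sklearn.ensemble import GradientBoostingClassifier

    df = pd.read_csv("three_arm_cohort.csv")          # hypothetical cohort
    covariates = ["age", "sex", "baseline_severity"]  # hypothetical pretreatment covariates

    gbm = GradientBoostingClassifier(n_estimators=500, max_depth=3, learning_rate=0.01)
    gbm.fit(df[covariates], df["treatment"])          # treatment coded 0, 1, 2

    # n x 3 matrix of generalized propensity scores; columns line up with codes 0, 1, 2
    ps = gbm.predict_proba(df[covariates])
    ps_observed = ps[np.arange(len(df)), df["treatment"].to_numpy()]
    df["iptw"] = 1.0 / ps_observed                    # ATE weight: 1 / Pr(T = observed arm | X)

    # Weighted outcome means by arm give IPTW estimates of each arm's mean outcome
    print(df.groupby("treatment").apply(lambda g: np.average(g["outcome"], weights=g["iptw"])))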

  5. Stat Med. 2013 Mar 24. doi: 10.1002/sim.5786. [Epub ahead of print]
     Propensity score weighting with multilevel data. Li F, Zaslavsky AM, Landrum MB. Department of Statistical Science, Duke University, Durham, NC, 27708, U.S.A.

    Propensity score methods are being increasingly used as a less parametric alternative to traditional regression to balance observed differences across groups in both descriptive and causal comparisons. Data collected in many disciplines often have analytically relevant multilevel or clustered structure. The propensity score, however, was developed and has been used primarily with unstructured data. We present and compare several propensity-score-weighted estimators for clustered data, including marginal, cluster-weighted, and doubly robust estimators. Using both analytical derivations and Monte Carlo simulations, we illustrate bias arising when the usual assumptions of propensity score analysis do not hold for multilevel data. We show that exploiting the multilevel structure, either parametrically or nonparametrically, in at least one stage of the propensity score analysis can greatly reduce these biases. We applied these methods to a study of racial disparities in breast cancer screening among beneficiaries of Medicare health plans. Copyright © 2013 John Wiley & Sons, Ltd.
    PMID: 23526267  [PubMed - as supplied by publisher]

  6. Pharmacoepidemiol Drug Saf. 2013 Mar 22. [Epub ahead of print]
     Estimation using all available covariate information versus a fixed look-back window for dichotomous covariates. Brunelli SM, Gagne JJ, Huybrechts KF, Wang SV, Patrick AR, Rothman KJ, Seeger JD. Division of Pharmacoepidemiology and Pharmacoeconomics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School.

    PURPOSE: When using claims data, dichotomous covariates (C) are often assumed to be absent unless a claim for the condition is observed. When available historical data differs among subjects, investigators must choose between using all available historical data versus data from a fixed window to assess C. Our purpose was to compare estimation under these two approaches. METHODS: We simulated cohorts of 20,000 subjects with dichotomous variables representing exposure (E), outcome (D), and a single time-invariant C, as well as varying availability of historical data. C was operationally defined under each paradigm and used to estimate the adjusted risk ratio of E on D via Mantel-Haenszel methods. RESULTS: In the base case scenario, less bias and lower mean square error were observed using all available information compared with a fixed window; differences were magnified at higher modeled confounder strength. Upon introduction of an unmeasured covariate (F), the all-available approach remained less biased in most circumstances and rendered estimates that better approximated those that were adjusted for the true (modeled) value of C in all instances. CONCLUSIONS: In most instances considered, operationally defining time-invariant dichotomous C based on all available historical data, rather than on data observed over a commonly shared fixed historical window, results in less biased estimates. Copyright © 2013 John Wiley & Sons, Ltd.
    PMID: 23526818  [PubMed - as supplied by publisher]
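
The Mantel-Haenszel adjustment used in this simulation is easy to reproduce. Below is a generic sketch of the Mantel-Haenszel risk ratio across confounder strata; the counts are illustrative, not the authors' simulated data.

    def mantel_haenszel_rr(strata):
        """strata: iterable of (cases_exposed, n_exposed, cases_unexposed, n_unexposed)."""
        num, den = 0.0, 0.0
        for a, n1, c, n0 in strata:
            n = n1 + n0
            num += a * n0 / n   # exposed cases, reweighted by stratum size
            den += c * n1 / n   # unexposed cases, reweighted by stratum size
        return num / den

    # Two strata defined by the dichotomous covariate C (illustrative counts)
    strata = [(120, 1000, 60, 1000),   # C = 1
              (30, 9000, 25, 9000)]    # C = 0
    print(mantel_haenszel_rr(strata))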

CER Scan [published in the last 30 days]

  1. BMC Med Res Methodol. 2013 Mar 14;13(1):40. [Epub ahead of print]
     Advancing current approaches to disease management evaluation: capitalizing on heterogeneity to understand what works and for whom. Elissen AM, Adams JL, Spreeuwenberg M, Duimel-Peeters IG, Spreeuwenberg C, Linden A, Vrijhoef HJ.

    BACKGROUND: Evaluating large-scale disease management interventions implemented in actual health care settings is a complex undertaking for which universally accepted methods do not exist. Fundamental issues, such as a lack of control patients and limited generalizability, hamper the use of the ‘gold-standard’ randomized controlled trial, while methodological shortcomings restrict the value of observational designs. Advancing methods for disease management evaluation in practice is pivotal to learn more about the impact of population-wide approaches. Methods must account for the presence of heterogeneity in effects, which necessitates a more granular assessment of outcomes. METHODS: This paper introduces multilevel regression methods as valuable techniques to evaluate ‘real-world’ disease management approaches in a manner that produces meaningful findings for everyday practice. In a worked example, these methods are applied to retrospectively gathered routine health care data covering a cohort of 105,056 diabetes patients who receive disease management for type 2 diabetes mellitus in the Netherlands. Multivariable, multilevel regression models are fitted to identify trends in clinical outcomes and correct for differences in characteristics of patients (age, disease duration, health status, diabetes complications, smoking status) and the intervention (measurement frequency and range, length of follow-up). RESULTS: After a median one year follow-up, the Dutch disease management approach was associated with small average improvements in systolic blood pressure and low-density lipoprotein, while a slight deterioration occurred in glycated hemoglobin. Differential findings suggest that patients with poorly controlled diabetes tend to benefit most from disease management in terms of improved clinical measures. Additionally, a greater measurement frequency was associated with better outcomes, while longer length of follow-up was accompanied by less positive results. CONCLUSIONS: Despite concerted efforts to adjust for potential sources of confounding and bias, there ultimately are limits to the validity and reliability of findings from uncontrolled research based on routine intervention data. While our   findings are supported by previous randomized research in other settings, the trends in outcome measures presented here may have alternative explanations. Further practice-based research, perhaps using historical data to retrospectively construct a control group, is necessary to confirm results and learn more about the impact of population-wide disease management.

PMID: 23497125  [PubMed - as supplied by publisher]
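
Multilevel (random-intercept) regression of the kind described above can be sketched in a few lines with statsmodels; the outcome, covariates, and grouping variable below are illustrative assumptions rather than the study's actual data structure.

    import pandas as pd
    import statsmodels.formula.api as smf

    df = pd.read_csv("dm_cohort.csv")  # hypothetical: one row per patient

    # Random intercept for care group / practice captures clustering of patients
    model = smf.mixedlm(
        "sbp_change ~ age + disease_duration + baseline_sbp + smoker "
        "+ measurement_frequency + months_followup",
        data=df,
        groups=df["care_group"],
    )
    result = model.fit()
    print(result.summary())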

CER Scan [published in the last 90 days]

  1. Health Serv Res. 2012 Dec 6. doi: 10.1111/1475-6773.12020.
  2. Squeezing the Balloon: Propensity Scores and Unmeasured Covariate Balance. Brooks JM, Ohsfeldt RL. College of Pharmacy and College of Public Health, University of Iowa, Iowa City, IA.

    OBJECTIVE: To assess the covariate balancing properties of propensity score-based algorithms in which covariates affecting treatment choice are both measured and unmeasured. DATA SOURCES/STUDY SETTING: A simulation model of treatment choice and outcome. STUDY DESIGN: Simulation. DATA COLLECTION/EXTRACTION METHODS: Eight simulation scenarios varied with the values placed on measured and unmeasured covariates and the strength of the relationships between the measured and unmeasured covariates. The balance of both measured and unmeasured covariates was compared across patients either grouped or reweighted by propensity scores methods. PRINCIPAL FINDINGS: Propensity score algorithms require unmeasured covariate variation that is unrelated to measured covariates, and they exacerbate the imbalance in this variation between treated and untreated patients relative to the full unweighted sample. CONCLUSIONS: The balance of measured covariates between treated and untreated patients has opposite implications for unmeasured covariates in randomized and observational studies. Measured covariate balance between treated and untreated patients in randomized studies reinforces the notion that all covariates are balanced. In contrast, forced balance of measured covariates using propensity score methods in observational studies exacerbates the imbalance in the independent portion of the variation in the unmeasured covariates, which can be likened to squeezing a balloon. If the unmeasured covariates affecting treatment choice are confounders, propensity score methods can exacerbate the bias in treatment effect estimates. © Health Research and Educational Trust.
    PMID: 23216471  [PubMed - as supplied by publisher]

February 2013

CER Scan [Epub ahead of print]

  1. Walker AM, Patrick AR, Lauer MS, Hornbrook MC, Marin MG, Platt R, Roger VL, Stang P, Schneeweiss S. A tool for assessing the feasibility of comparative effectiveness research. Comparative Effectiveness Research 2013:3 11–20. World Health Information Science Consultants, Newton, MA; Division of Pharmacoepidemiology and Pharmacoeconomics, Brigham and Women’s Hospital, Boston, MA; National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, MD; The Center for Health Research, Kaiser Permanente Northwest, Portland, OR; Department of Medicine, New Jersey Medical School, Newark, NJ; Department of Population Medicine, Harvard Pilgrim Health Care Institute and Harvard Medical School, Boston, MA; Department of Health Sciences Research, Mayo Clinic, Rochester, MN; Johnson and Johnson Pharmaceutical Research and Development, Titusville, NJ, USA

Background: Comparative effectiveness research (CER) provides actionable information for health care decision-making. Randomized clinical trials cannot provide the patients, time horizons, or practice settings needed for all required CER. The need for comparative assessments and the infeasibility of conducting randomized clinical trials in all relevant areas is leading researchers and policy makers to non-randomized, retrospective CER. Such studies are possible when rich data exist on large populations receiving alternative therapies that are used as-if interchangeably in clinical practice. This setting we call “empirical equipoise.”
Objectives: This study sought to provide a method for the systematic identification of settings of empirical equipoise that offer promise for valid non-randomized CER.
Methods: We used a standardizing transformation of the propensity score called “preference” to assess pairs of common treatments for uncomplicated community-acquired pneumonia and new-onset heart failure in a population of low-income elderly people in Pennsylvania, for whom we had access to de-identified insurance records. Treatment pairs were considered suitable for CER if at least half of the dispensings of each treatment-pair member fell within a preference range of 30% to 70%.
Results: Among 3889 community-acquired pneumonia patients, insurance claims histories were sufficiently similar in seven drug pairs to suggest that observational CER might be effective. Relapse appears to have been less common in levofloxacin recipients than in similar patients given other products. In 6035 heart failure patients, metoprolol, carvedilol, and atenolol were employed in patients with similar claims histories, and thus might be suitable for observational CER. The long-acting succinate formulation of metoprolol had lower failure rates in head-to-head comparisons with all other beta-blockers. Both findings are candidates for further investigation. Confounding by unmeasured factors operating in the same manner as the measured covariates would not have produced the apparent superiority of levofloxacin, which was given to people in poorer respiratory health. The baseline covariate distributions of persons starting beta-blockers suggest only that carvedilol recipients were healthier than others.
Conclusion: A straightforward algorithm can identify empirical equipoise, in which prescribers as a group seem evenly divided on the merits of alternative therapies. This is the setting in which CER may be most necessary and is likely to be most accurate. The imbalances identified by propensity models can identify situations in which the results of screening analyses may be biased in the direction of the observed effect.

There is also a brief presentation that accompanies this manuscript: http://www.youtube.com/watch?v=JdgUa7-Urfw
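
The preference-score transformation and the 30% to 70% screening rule described above can be sketched directly. The formula below follows the standard definition (the logit of the preference score equals the logit of the propensity score minus the logit of the treatment's overall market share); the inputs are illustrative NumPy arrays, not the authors' data.

    import numpy as np

    def preference_score(ps, prevalence_a):
        """Shift the propensity score so that 0.5 means 'no preference' given market share."""
        logit = lambda p: np.log(p / (1 - p))
        return 1 / (1 + np.exp(-(logit(ps) - logit(prevalence_a))))

    def empirical_equipoise(ps, treated_a, lo=0.3, hi=0.7):
        """True if at least 50% of each drug's recipients have preference scores in [lo, hi].

        ps: array of estimated propensities of receiving drug A versus drug B
        treated_a: 0/1 array flagging actual drug A use
        """
        f = preference_score(ps, treated_a.mean())
        in_range = (f >= lo) & (f <= hi)
        return (in_range[treated_a == 1].mean() >= 0.5
                and in_range[treated_a == 0].mean() >= 0.5)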

  1. Stat Med. 2013 Feb 25. doi: 10.1002/sim.5764. [Epub ahead of print]

Incorporating data from various trial designs into a mixed treatment comparison model. Schmitz S, Adams R, Walsh C. Trinity College Dublin, Dublin, Ireland.

Estimates of relative efficacy between alternative treatments are crucial for decision making in health care. Bayesian mixed treatment comparison models provide a powerful methodology to obtain such estimates when head-to-head evidence is not available or insufficient. In recent years, this methodology has become widely accepted and applied in economic modelling of healthcare interventions. Most evaluations only consider evidence from randomized controlled trials, while information from other trial designs is ignored. In this paper, we propose three alternative methods of combining data from different trial designs in a mixed treatment comparison model. Naive pooling is the simplest approach and does not differentiate between trial designs. Utilizing observational data as prior information allows adjusting for bias due to trial design. The most flexible technique is a three-level hierarchical model. Such a model allows for bias adjustment while also accounting for heterogeneity between trial designs. These techniques are illustrated using an application in rheumatoid arthritis. Copyright © 2013 John Wiley & Sons, Ltd.
PMID: 23440610  [PubMed - as supplied by publisher]

  1. Stat Med. 2013 Feb 24. doi: 10.1002/sim.5762. [Epub ahead of print]

Model selection in competing risks regression. Kuk D, Varadhan R. Department of Epidemiology and Biostatistics, Memorial Sloan-Kettering Cancer Center, New York, NY, U.S.A.

In the analysis of time-to-event data, the problem of competing risks occurs when an individual may experience one, and only one, of m different types of events. The presence of competing risks complicates the analysis of time-to-event data, and standard survival analysis techniques such as Kaplan-Meier estimation, log-rank test and Cox modeling are not always appropriate and should be applied with caution. Fine and Gray developed a method for regression analysis that models the hazard that corresponds to the cumulative incidence function. This model is becoming widely used by clinical researchers and is now available in all the major software environments. Although model selection methods for Cox proportional hazards models have been developed, few methods exist for competing risks data. We have developed stepwise regression procedures, both forward and backward, based on AIC, BIC, and BICcr (a newly proposed criterion that is a modified BIC for competing risks data subject to right censoring) as selection criteria for the Fine and Gray model. We evaluated the performance of these model selection procedures in a large simulation study and found them to perform well. We also applied our procedures to assess the importance of bone mineral density in predicting the absolute risk of hip fracture in the Women’s Health Initiative-Observational Study, where mortality was the competing risk. We have implemented our method as a freely available R package called crrstep. Copyright © 2013 John Wiley & Sons, Ltd.
PMID: 23436643  [PubMed - as supplied by publisher]
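
The authors' implementation is the crrstep package in R; no Fine-Gray routine is assumed in the sketch below. It shows only the generic forward-selection loop, driven by whatever criterion function (AIC, BIC, or BICcr from a competing-risks fit) the analyst supplies.

    def forward_select(candidates, fit_criterion):
        """Greedy forward selection: repeatedly add the variable that most lowers the criterion.

        candidates: list of candidate covariate names
        fit_criterion: callable taking a list of covariates and returning AIC/BIC/BICcr
        """
        selected, best = [], fit_criterion([])
        improved = True
        while improved and candidates:
            improved = False
            scores = {v: fit_criterion(selected + [v]) for v in candidates}
            best_var = min(scores, key=scores.get)
            if scores[best_var] < best:
                best = scores[best_var]
                selected.append(best_var)
                candidates = [v for v in candidates if v != best_var]
                improved = True
        return selected, best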

  1. Stat Methods Med Res. 2013 Feb 19. [Epub ahead of print]

Practical and statistical issues in missing data for longitudinal patient reported outcomes. Bell ML, Fairclough DL. The Psycho-Oncology Co-operative Research Group (PoCoG), University of Sydney, Sydney, Australia.

Patient reported outcomes are increasingly used in health research, including randomized controlled trials and observational studies. However, the validity of  results in longitudinal studies can crucially hinge on the handling of missing data. This paper considers the issues of missing data at each stage of research. Practical strategies for minimizing missingness through careful study design and conduct are given. Statistical approaches that are commonly used, but should be avoided, are discussed, including how these methods can yield biased and misleading results. Methods that are valid for data which are missing at random are outlined, including maximum likelihood methods, multiple imputation and extensions to generalized estimating equations: weighted generalized estimating equations, generalized estimating equations with multiple imputation, and doubly robust generalized estimating equations. Finally, we discuss the importance of sensitivity analyses, including the role of missing not at random models, such as pattern mixture, selection, and shared parameter models. We demonstrate many of these concepts with data from a randomized controlled clinical trial on renal cancer patients, and show that the results are dependent on missingness assumptions and the statistical approach.
PMID: 23427225  [PubMed - as supplied by publisher]

  1. Ann Epidemiol. 2013 Feb 15. [Epub ahead of print]

Confounding control in a nonexperimental study of STAR*D data: logistic regression balanced covariates better than boosted CART. Ellis AR, Dusetzina SB, Hansen RA, Gaynes BN, Farley JF, Stürmer T. Cecil G. Sheps Center for Health Services Research, University of North Carolina at Chapel Hill. Electronic address: are@unc.edu.

PURPOSE: Propensity scores (PSs), a powerful bias-reduction tool, can balance treatment groups on measured covariates in nonexperimental studies. We demonstrate the use of multiple PS estimation methods to optimize covariate balance. METHODS: We used secondary data from 1292 adults with nonpsychotic major depressive disorder in the Sequenced Treatment Alternatives to Relieve Depression trial (2001-2004). After initial citalopram treatment failed, patient preference influenced assignment to medication augmentation (n = 565) or switch (n = 727). To reduce selection bias, we used boosted classification and regression trees (BCART) and logistic regression iteratively to identify two potentially optimal PSs. We assessed and compared covariate balance. RESULTS: After iterative selection of interaction terms to minimize imbalance, logistic regression yielded better balance than BCART (average standardized absolute mean difference across 47 covariates: 0.03 vs. 0.08, matching; 0.02 vs. 0.05, weighting). CONCLUSIONS: Comparing multiple PS estimates is a pragmatic way to optimize balance. Logistic regression remains valuable for this purpose. Simulation studies are needed to compare PS models under varying conditions. Such studies should consider more flexible estimation methods, such as logistic models with automated selection of interactions or hybrid models using main effects logistic regression instead of a constant log-odds as the initial model for BCART. Copyright © 2013 Elsevier Inc. All rights reserved.
PMID: 23419508  [PubMed - as supplied by publisher]
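
The balance metric quoted above, the average standardized absolute mean difference across covariates, is simple to compute. Below is a generic weighted version in Python, not the STAR*D analysis code; it accepts matching or weighting weights.

    import numpy as np

    def std_abs_mean_diff(x, treated, weights=None):
        """Absolute standardized difference of covariate x between treatment groups."""
        w = np.ones_like(x, dtype=float) if weights is None else weights
        m1 = np.average(x[treated == 1], weights=w[treated == 1])
        m0 = np.average(x[treated == 0], weights=w[treated == 0])
        v1 = np.average((x[treated == 1] - m1) ** 2, weights=w[treated == 1])
        v0 = np.average((x[treated == 0] - m0) ** 2, weights=w[treated == 0])
        return abs(m1 - m0) / np.sqrt((v1 + v0) / 2)

    def average_balance(X, treated, weights=None):
        """Average the standardized difference over all columns of covariate matrix X."""
        return np.mean([std_abs_mean_diff(X[:, j], treated, weights)
                        for j in range(X.shape[1])])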

  1. Biometrics. 2013 Feb 4. doi: 10.1111/j.1541-0420.2012.01830.x. [Epub ahead of print]

Model Feedback in Bayesian Propensity Score Estimation. Zigler CM, Watts K, Yeh RW, Wang Y, Coull BA, Dominici F. Department of Biostatistics, Harvard School of Public Health, Boston, Massachusetts 02115, U.S.A. Cardiology Division, Department of Medicine, Massachusetts General Hospital and Harvard Medical School, Boston Massachusetts 02114, U.S.A.

Summary: Methods based on the propensity score comprise one set of valuable tools for comparative effectiveness research and for estimating causal effects more generally. These methods typically consist of two distinct stages: (1) a propensity score stage where a model is fit to predict the propensity to receive treatment (the propensity score), and (2) an outcome stage where responses are compared in treated and untreated units having similar values of the estimated propensity score. Traditional techniques conduct estimation in these two stages separately; estimates from the first stage are treated as fixed and known for use in the second stage. Bayesian methods have natural appeal in these settings because separate likelihoods for the two stages can be combined into a single joint likelihood, with estimation of the two stages carried out simultaneously. One key feature of joint estimation in this context is "feedback" between the outcome stage and the propensity score stage, meaning that quantities in a model for the outcome contribute information to posterior distributions of quantities in the model for the propensity score. We provide a rigorous assessment of Bayesian propensity score estimation to show that model feedback can produce poor estimates of causal effects absent strategies that augment propensity score adjustment with adjustment for individual covariates. We illustrate this phenomenon with a simulation study and with a comparative effectiveness investigation of carotid artery stenting versus carotid endarterectomy among 123,286 Medicare beneficiaries hospitalized for stroke in 2006 and 2007. © 2013, The International Biometric Society.
PMID: 23379793  [PubMed - as supplied by publisher]

  1. Pharmacoepidemiol Drug Saf. 2013 Feb 12. doi: 10.1002/pds.3412. [Epub ahead of print]

Near real-time adverse drug reaction surveillance within population-based health networks: methodology considerations for data accrual. Avery TR, Kulldorff M, Vilk Y, Li L, Cheetham TC, Dublin S, Davis RL, Liu L, Herrinton L, Brown JS. Department of Population Medicine, Harvard Medical School and Harvard Pilgrim Health Care Institute, Boston, MA, USA; The HMO Research Network Center for Education and Research in Therapeutics, Boston, MA, USA.

PURPOSE: This study describes practical considerations for implementation of near real-time medical product safety surveillance in a distributed health data network. METHODS: We conducted pilot active safety surveillance comparing generic divalproex sodium to historical branded product at four health plans from April to October 2009. Outcomes reported are all-cause emergency room visits and fractures. One retrospective data extract was completed (January 2002-June 2008), followed by seven prospective monthly extracts (January 2008-November 2009). To evaluate delays in claims processing, we used three analytic approaches: near real-time sequential analysis, sequential analysis with 1.5 month delay, and nonsequential (using final retrospective data). Sequential analyses used the maximized sequential probability ratio test. Procedural and logistical barriers to active surveillance were documented. RESULTS: We identified 6586 new users of  generic divalproex sodium and 43 960 new users of the branded product. Quality control methods identified 16 extract errors, which were corrected. Near real-time extracts captured 87.5% of emergency room visits and 50.0% of fractures, which improved to 98.3% and 68.7% respectively with 1.5 month delay. We did not identify signals for either outcome regardless of extract timeframe, and slight differences in the test statistic and relative risk estimates were found. CONCLUSIONS: Near real-time sequential safety surveillance is feasible, but several barriers warrant attention. Data quality review of each data extract was necessary. Although signal detection was not affected by delay in analysis, when using a historical control group differential accrual between exposure and outcomes may theoretically bias near real-time risk estimates towards the null, causing failure to detect a signal. Copyright © 2013 John Wiley & Sons, Ltd.
PMID: 23401239  [PubMed - as supplied by publisher]


CER Scan [published in the last 30 days]

  1. BMC Med Res Methodol. 2013 Feb 20;13(1):25. [Epub ahead of print]

Assessment of reproducibility of cancer survival risk predictions across medical centers. Chen HC, Chen JJ.

BACKGROUND: The two most important considerations in evaluation of survival prediction models are 1) predictability – ability to predict survival risks accurately and 2) reproducibility – ability to generalize to predict samples generated from different studies. We present approaches for assessment of reproducibility of survival risk score predictions across medical centers. METHODS: Reproducibility was evaluated in terms of consistency and transferability. Consistency is the agreement of risk scores predicted between two centers. Transferability from one center to another center is the agreement of the risk scores of the second center predicted by each of the two centers. The transferability can be: 1) model transferability – whether a predictive model developed from one center can be applied to predict the samples generated from other centers and 2) signature transferability – whether signature markers of a predictive model developed from one center can be applied to predict the samples from other centers. We considered eight prediction models, including two clinical models, two gene expression models, and their combinations. Predictive performance of the eight models was evaluated by several common measures. Correlation coefficients between predicted risk scores of different centers were computed to assess reproducibility – consistency and transferability. RESULTS: Two public datasets, the lung cancer data generated from four medical centers and colon cancer data generated from two medical centers, were analyzed. The risk score estimates for lung cancer patients predicted by three of four centers agree reasonably well. In general, a good prediction model showed better cross-center consistency and transferability. The risk scores for the colon cancer patients from one (Moffitt) medical center that were predicted by the clinical models developed from the other (Vanderbilt) medical center were shown to have excellent model transferability and signature transferability. CONCLUSIONS: This study illustrates an analytical approach to assessing reproducibility of predictive models and signatures. Based on the analyses of the two cancer datasets, we conclude that the models with clinical variables appear to perform reasonably well with a high degree of consistency and transferability. There should be more investigations of the reproducibility of prediction models including gene expression data across studies.
PMID: 23425000  [PubMed - as supplied by publisher]

  1. Med Care. 2013 Mar;51(3):251-8. doi: 10.1097/MLR.0b013e31827da594.

Improved cardiovascular risk prediction using nonparametric regression and electronic health record data. Kennedy EH, Wiitala WL, Hayward RA, Sussman JB. VA Center for Clinical Management Research, Ann Arbor VA Health Services Research and Development (HSR&D) Center of Excellence; Department of Internal Medicine, Robert Wood Johnson Foundation Clinical Scholars Program, University of Michigan, Ann Arbor, MI.

BACKGROUND: Use of the electronic health record (EHR) is expected to increase rapidly in the near future, yet little research exists on whether analyzing internal EHR data using flexible, adaptive statistical methods could improve clinical risk prediction. Extensive implementation of EHR in the Veterans Health Administration provides an opportunity for exploration.
OBJECTIVES: To compare the performance of various approaches for predicting risk of cerebrovascular and cardiovascular (CCV) death, using traditional risk predictors versus more comprehensive EHR data.
RESEARCH DESIGN: Retrospective cohort study. We identified all Veterans Health Administration patients without recent CCV events treated at 12 facilities from 2003 to 2007, and predicted risk using the Framingham risk score, logistic regression, generalized additive modeling, and gradient tree boosting.
MEASURES: The outcome was CCV-related death within 5 years. We assessed each method’s predictive performance with the area under the receiver operating characteristic curve (AUC), the Hosmer-Lemeshow goodness-of-fit test, plots of estimated risk, and reclassification tables, using cross-validation to penalize overfitting.
RESULTS: Regression methods outperformed the Framingham risk score, even with the same predictors (AUC increased from 71% to 73% and calibration also improved). Even better performance was attained in models using additional EHR-derived predictor variables (AUC increased to 78% and net reclassification improvement was as large as 0.29). Nonparametric regression further improved calibration and discrimination compared with logistic regression.
CONCLUSIONS: Despite the EHR lacking some risk factors and its imperfect data quality, health care systems may be able to substantially improve risk prediction for their patients by using internally developed EHR-derived models and flexible statistical methodology.
PMID: 23269109  [PubMed - in process]
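
A cross-validated comparison of a conventional logistic model against gradient tree boosting, in the spirit of the study above, can be sketched as follows. The data file and predictor names are illustrative assumptions, not VHA data.

    import pandas as pd
    from sklearn.ensemble import GradientBoostingClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score

    df = pd.read_csv("ehr_cohort.csv")   # hypothetical EHR extract
    predictors = ["age", "sex", "sbp", "total_chol", "hdl", "smoker", "diabetes", "egfr"]
    X, y = df[predictors], df["ccv_death_5yr"]

    # Cross-validated AUC for each candidate model on the same predictors
    for name, model in [("logistic", LogisticRegression(max_iter=1000)),
                        ("boosting", GradientBoostingClassifier())]:
        auc = cross_val_score(model, X, y, cv=10, scoring="roc_auc")
        print(name, auc.mean())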

  1. BMJ. 2013 Jan 21;346:e8668. doi: 10.1136/bmj.e8668.

Differential dropout and bias in randomised controlled trials: when it matters and when it may not. Bell ML, Kenward MG, Fairclough DL, Horton NJ. Psycho-Oncology Co-operative Research Group, University of Sydney, Sydney, Australia. melanie.bell@sydney.edu.au

PMID: 23338004  [PubMed - in process]

  1. Am J Kidney Dis. 2013 Jan;61(1):13-7. doi: 10.1053/j.ajkd.2012.08.030.

Emerging analytical techniques for comparative effectiveness research. Brunelli SM, Rassen JA.

PMID: 23021799  [PubMed - indexed for MEDLINE]

  1. Implement Sci. 2013 Jan 9;8:6. doi: 10.1186/1748-5908-8-6.

Developing and Evaluating Communication Strategies to Support Informed Decisions  and Practice Based on Evidence (DECIDE): protocol and preliminary results. Treweek S, Oxman AD, Alderson P, Bossuyt PM, Brandt L, Brożek J, Davoli M, Flottorp S, Harbour R, Hill S, Liberati A, Liira H, Schünemann HJ, Rosenbaum S, Thornton J, Vandvik PO, Alonso-Coello P; DECIDE Consortium. Population Health Sciences, University of Dundee, Kirsty Semple Way, Dundee, UK.  streweek@mac.com

BACKGROUND: Healthcare decision makers face challenges when using guidelines, including understanding the quality of the evidence or the values and preferences upon which recommendations are made, which are often not clear. METHODS: GRADE is a systematic approach towards assessing the quality of evidence and the strength of recommendations in healthcare. GRADE also gives advice on how to go from evidence to decisions. It has been developed to address the weaknesses of other grading systems and is now widely used internationally. The Developing and Evaluating Communication Strategies to Support Informed Decisions and Practice Based on Evidence (DECIDE) consortium (http://www.decide-collaboration.eu/), which includes members of the GRADE Working Group and other partners, will explore methods to ensure effective communication of evidence-based recommendations targeted at key stakeholders: healthcare professionals, policymakers, and managers, as well as patients and the general public. Surveys and interviews with guideline producers and other stakeholders will explore how presentation of the evidence could be improved to better meet their information needs. We will collect further stakeholder input from advisory groups, via consultations and user testing; this will be done across a wide range of healthcare systems in Europe, North America, and other countries. Targeted communication strategies will be developed, evaluated in randomized trials, refined, and assessed during the development of real guidelines.
DISCUSSION: Results of the DECIDE project will improve the communication of evidence-based healthcare recommendations. Building on the work of the GRADE Working Group, DECIDE will develop and evaluate methods that address communication needs of guideline users. The project will produce strategies for communicating recommendations that have been rigorously evaluated in diverse settings, and it will support the transfer of research into practice in healthcare systems globally.
PMCID: PMC3553065
PMID: 23302501  [PubMed - in process]

January 2013

CER Scan [Epub ahead of print]

  1. Am J Epidemiol. 2013 Jan 29. [Epub ahead of print]

Mortality Risk Score Prediction in an Elderly Population Using Machine Learning. Rose S.

Standard practice for prediction often relies on parametric regression methods. Interesting new methods from the machine learning literature have been introduced in epidemiologic studies, such as random forest and neural networks. However, a priori, an investigator will not know which algorithm to select and may wish to try several. Here I apply the super learner, an ensembling machine learning approach that combines multiple algorithms into a single algorithm and returns a prediction function with the best cross-validated mean squared error. Super learning is a generalization of stacking methods. I used super learning in the Study of Physical Performance and Age-Related Changes in Sonomans (SPPARCS) to predict death among 2,066 residents of Sonoma, California, aged 54 years or more during the period 1993-1999. The super learner for predicting death (risk score) improved upon all single algorithms in the collection of algorithms, although its performance was similar to that of several algorithms. Super learner outperformed the worst algorithm (neural networks) by 44% with respect to estimated cross-validated mean squared error and had an R(2) value of 0.201. The improvement of super learner over random forest with respect to R(2) was approximately 2-fold. Alternatives for risk score prediction include the super learner, which can provide improved performance.
PMID: 23364879  [PubMed - as supplied by publisher]
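
As a rough illustration of the idea behind the super learner, the following sketch in base R builds a "discrete" super learner: it selects, by V-fold cross-validation, the candidate algorithm with the smallest cross-validated mean squared error. The simulated data, variable names, and candidate learners are purely illustrative and are not those of the SPPARCS analysis; the full super learner additionally estimates an optimal weighted combination of the candidates.

## Discrete super learner sketch: choose, by V-fold cross-validation, the
## candidate learner with the smallest cross-validated MSE.
set.seed(1)
n <- 500
x1 <- rnorm(n); x2 <- rbinom(n, 1, 0.4)
y  <- rbinom(n, 1, plogis(-1 + 0.8 * x1 + 0.6 * x2 + 0.5 * x1 * x2))  # binary outcome (e.g., death)
dat <- data.frame(y, x1, x2)

## Candidate learners: each takes training and test data, returns predicted probabilities
learners <- list(
  mean_only   = function(train, test) rep(mean(train$y), nrow(test)),
  main_logit  = function(train, test)
    predict(glm(y ~ x1 + x2, binomial, data = train), test, type = "response"),
  inter_logit = function(train, test)
    predict(glm(y ~ x1 * x2, binomial, data = train), test, type = "response")
)

V <- 10
fold <- sample(rep(1:V, length.out = n))
cv_mse <- sapply(learners, function(fit_fun) {
  sq_err <- numeric(n)
  for (v in 1:V) {
    train <- dat[fold != v, ]
    test  <- dat[fold == v, ]
    sq_err[fold == v] <- (test$y - fit_fun(train, test))^2
  }
  mean(sq_err)
})
cv_mse                      # cross-validated MSE for each candidate
names(which.min(cv_mse))    # learner selected as the "discrete" super learner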

  1. Eur J Epidemiol. 2013 Jan 25. [Epub ahead of print]

Time-dependent propensity score and collider-stratification bias: an example of beta(2)-agonist use and the risk of coronary heart disease. Sanni Ali M, Groenwold RH, Pestman WR, Belitser SV, Hoes AW, de Boer A, Klungel OH. Department of Pharmacoepidemiology and Clinical Pharmacology, Utrecht Institute for Pharmaceutical Sciences, University of Utrecht, Utrecht, The Netherlands.

Stratification and conditioning on time-varying confounders which are also intermediates can induce collider-stratification bias and adjust away the (indirect) effect of exposure. Similar bias could be expected when one conditions on time-dependent PS. We explored collider-stratification and confounding bias due to conditioning or stratifying on time-dependent PS using a clinical example on the effect of inhaled short- and long-acting beta(2)-agonist use (SABA and LABA, respectively) on coronary heart disease (CHD). In an electronic general practice database we selected a cohort of patients with an indication for SABA and/or LABA use and ascertained potential confounders and SABA/LABA use per three month intervals. Hazard ratios (HR) were estimated using PS stratification as well as covariate adjustment and compared with those of Marginal Structural Models (MSMs) in both SABA and LABA use separately. In MSMs, censoring was accounted for by including inverse probability of censoring weights. The crude HR of CHD was 0.90 [95 % CI: 0.63, 1.28] and 1.55 [95 % CI: 1.06, 2.62] in SABA and LABA users respectively. When PS stratification, covariate adjustment using PS, and MSMs were used, the HRs were 1.09 [95 % CI: 0.74, 1.61], 1.07 [95 % CI: 0.72, 1.60], and 0.86 [95 % CI: 0.55, 1.34] for SABA, and 1.09 [95 % CI: 0.74, 1.62], 1.13 [95 % CI: 0.76, 1.67], 0.77 [95 % CI: 0.45, 1.33] for LABA, respectively. Results were similar for different PS methods, but higher than those of MSMs. When treatment and confounders vary during follow-up, conditioning or stratification on time-dependent PS could induce substantial collider-stratification or confounding bias; hence, other methods such as MSMs are recommended.
PMID: 23354982  [PubMed - as supplied by publisher]

  1. Biostatistics. 2013 Jan 24. [Epub ahead of print]

Adjusting for observational secondary treatments in estimating the effects of randomized treatments. Zhang M, Wang Y. Department of Biostatistics, University of Michigan, Ann Arbor, MI.

In randomized clinical trials, for example, on cancer patients, it is not uncommon that patients may voluntarily initiate a secondary treatment postrandomization, which needs to be properly adjusted for in estimating the "true" effects of randomized treatments. As an alternative to the approach based on a marginal structural Cox model (MSCM) in Zhang and Wang [(2012). Estimating treatment effects from a randomized trial in the presence of a secondary treatment. Biostatistics 13, 625-636], we propose methods that treat the time to start a secondary treatment as a dependent censoring process, which is handled separately from the usual censoring such as the loss to follow-up. Two estimators are proposed, both based on the idea of inversely weighting by the probability of having not started a secondary treatment yet. The second estimator focuses on improving efficiency of inference by a robust covariate-adjustment that does not require any additional assumptions. The proposed methods are evaluated and compared with the MSCM-based method in terms of bias and variance tradeoff using simulations and application to a cancer clinical trial.
PMID: 23349243  [PubMed - as supplied by publisher]

  1. Epidemiology. 2013 Jan 23. [Epub ahead of print]

Sensitivity Analysis for Nonignorable Missingness and Outcome Misclassification from Proxy Reports. Shardell M, Simonsick EM, Hicks GE, Resnick B, Ferrucci L, Magaziner J. Department of Epidemiology and Public Health, University of Maryland School of Medicine; National Institute on Aging; Department of Physical Therapy, University of Delaware; and University of Maryland School of Nursing, Baltimore, MD.

Researchers often recruit proxy respondents, such as relatives or caregivers, for epidemiologic studies of older adults when study participants are unable to provide self-reports (eg, because of illness or cognitive impairment). In most studies involving proxy-reported outcomes, proxies are recruited only to report on behalf of participants who have missing self-reported outcomes; thus, either a proxy report or participant self-report, but not both, is available for each participant. When outcomes are binary and investigators conceptualize participant self-reports as gold standard measures, substituting proxy reports in place of missing participant self-reports in statistical analysis can introduce misclassification error and lead to biased parameter estimates. However, excluding observations from participants with missing self-reported outcomes may also lead to bias. We propose a pattern-mixture model that uses error-prone proxy reports to reduce selection bias from missing outcomes, and we describe a sensitivity analysis to address bias from differential outcome misclassification. We perform model estimation with high-dimensional (eg, continuous) covariates using propensity-score stratification and multiple imputation. We apply the methods to the Second Cohort of the Baltimore Hip Studies, a study of elderly hip fracture patients, to assess the relation between type of surgical treatment and perceived physical recovery. Simulation studies show that the proposed methods perform well. We provide SAS programs in the Appendix (http://links.lww.com/EDE/A646) to enhance the methods’ accessibility.
PMID: 23348065  [PubMed - as supplied by publisher]

  1. Pharmacoepidemiol Drug Saf. 2013 Jan 7. [Epub ahead of print]

Calendar time-specific propensity scores and comparative effectiveness research for stage III colon cancer chemotherapy. Mack CD, Glynn RJ, Brookhart MA, Carpenter WR, Meyer AM, Sandler RS, Sturmer T. Department of Epidemiology, UNC Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, NC.

PURPOSE: Nonexperimental studies of treatment effectiveness provide an important complement to randomized trials by including heterogeneous populations. Propensity scores (PSs) are common in these studies but may not adequately capture changes in channeling experienced by innovative treatments. We use calendar time-specific (CTS) PSs to examine the effect of oxaliplatin during dissemination from off-label to widespread use. METHODS: Stage III colon cancer patients aged 65+ years initiating chemotherapy between 2003 and 2006 were examined using cancer registry data linked with Medicare claims. Two PS approaches for receipt of oxaliplatin versus 5-fluorouracil were constructed using logistic models with key components of age, sex, substage, grade, census-level income, and comorbidities: (i) a conventional, year-adjusted PS and (ii) a CTS PS constructed and matched separately within 1-year intervals, then combined. We compared PS-matched hazard ratios (HRs) for mortality using Cox models. RESULTS: Oxaliplatin use increased significantly; 8% (n=86) of patients received it in the first time period versus 52% (n=386) in the last. Channeling by comorbidities, income, and age appeared to change over time. The CTS PS improved covariate balance within calendar time strata and yielded an attenuated estimated benefit of oxaliplatin (HR = 0.75) compared with the conventional PS (HR = 0.69). CONCLUSION: In settings where prescribing patterns have changed and calendar time acts as a confounder, a CTS PS can characterize changes in treatment choices and estimating separate PSs within specific calendar time periods may result in enhanced confounding control. To increase validity of comparative effectiveness research, researchers should carefully consider drug lifecycles and effects of innovative treatment dissemination over time. Copyright © 2013 John Wiley & Sons, Ltd.
PMID: 23296544  [PubMed - as supplied by publisher]
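
The following R sketch illustrates the calendar time-specific PS idea on simulated data: the propensity score is estimated separately within each calendar-year stratum, treated and untreated patients are then matched within year, and the matched strata are pooled for the outcome model. Variable names, the greedy 1:1 caliper matching, and the simulated channeling pattern are illustrative simplifications, not the procedures used in the study above.

## Calendar time-specific propensity scores: estimate the PS within each
## calendar-year stratum, match within year, then pool the matched sample.
library(survival)
set.seed(2)
n <- 4000
year   <- sample(2003:2006, n, replace = TRUE)
age    <- rnorm(n, 72, 6)
comorb <- rpois(n, 2)
## channeling changes over calendar time: uptake rises and selection shifts by year
trt  <- rbinom(n, 1, plogis(-4 + 0.9 * (year - 2003) - 0.03 * (age - 72) - 0.1 * comorb))
time <- rexp(n, rate = 0.05 * exp(-0.3 * trt + 0.02 * (age - 72) + 0.10 * comorb))
cens <- runif(n, 0, 20)
dat  <- data.frame(year, age, comorb, trt,
                   obs_time = pmin(time, cens),
                   event    = as.integer(time <= cens))

## (i) year-specific propensity scores; (ii) 1:1 greedy matching within year
match_ids <- unlist(lapply(split(seq_len(n), dat$year), function(idx) {
  d  <- dat[idx, ]
  ps <- fitted(glm(trt ~ age + comorb, binomial, data = d))
  tr <- which(d$trt == 1); ut <- which(d$trt == 0)
  keep <- integer(0)
  for (i in tr) {                                   # greedy match, caliper 0.05
    if (length(ut) == 0) break
    j <- ut[which.min(abs(ps[ut] - ps[i]))]
    if (abs(ps[j] - ps[i]) <= 0.05) {
      keep <- c(keep, idx[i], idx[j])
      ut   <- setdiff(ut, j)
    }
  }
  keep
}))

## Hazard ratio in the calendar time-specific PS-matched sample
coxph(Surv(obs_time, event) ~ trt, data = dat[match_ids, ])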

  1. Pharmacoepidemiol Drug Saf. 2012 Dec 28. [Epub ahead of print]

Investigating differences in treatment effect estimates between propensity score matching and weighting: a demonstration using STAR*D trial data. Ellis AR, Dusetzina SB, Hansen RA, Gaynes BN, Farley JF, Sturmer T. Cecil G. Sheps Center for Health Services Research, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA. are@unc.edu.

PURPOSE: The choice of propensity score (PS) implementation influences treatment effect estimates not only because different methods estimate different quantities, but also because different estimators respond in different ways to phenomena such as treatment effect heterogeneity and limited availability of potential matches. Using effectiveness data, we describe lessons learned from sensitivity analyses with matched and weighted estimates. METHODS: With subsample data (N=1292) from Sequenced Treatment Alternatives to Relieve Depression, a 2001-2004 effectiveness trial of depression treatments, we implemented PS matching and weighting to estimate the treatment effect in the treated and conducted multiple sensitivity analyses. RESULTS: Matching and weighting both balanced covariates but yielded different samples and treatment effect estimates (matched RR 1.00, 95% CI: 0.75-1.34; weighted RR 1.28, 95% CI: 0.97-1.69). In sensitivity analyses, as increasing numbers of observations at both ends of the PS distribution were excluded from the weighted analysis, weighted estimates approached the matched estimate (weighted RR 1.04, 95% CI 0.77-1.39 after excluding all observations below the 5th percentile of the treated and above the 95th percentile of the untreated). Treatment appeared to have benefits only in the highest and lowest PS strata. CONCLUSIONS: Matched and weighted estimates differed due to incomplete matching, sensitivity of weighted estimates to extreme observations, and possibly treatment effect heterogeneity. PS analysis requires identifying the population and treatment effect of interest, selecting an appropriate implementation method, and conducting and reporting sensitivity analyses. Weighted estimation especially should include sensitivity analyses relating to influential observations, such as those treated contrary to prediction. Copyright © 2012 John Wiley & Sons, Ltd.
PMID: 23280682  [PubMed - as supplied by publisher]
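
A minimal sketch of the kind of trimming sensitivity analysis described above, on simulated data: the weighted estimate of the effect in the treated (SMR-type weights) is recomputed after excluding observations below the 5th percentile of the propensity score among the treated and above the 95th percentile among the untreated. All variable names and the binary-outcome risk ratio are illustrative assumptions.

## Sensitivity of a weighted "effect in the treated" estimate to extreme
## propensity scores: trim the tails, then re-estimate.
set.seed(3)
n <- 3000
x   <- rnorm(n)
trt <- rbinom(n, 1, plogis(-0.5 + 1.2 * x))
y   <- rbinom(n, 1, plogis(-1 + 0.3 * trt + 0.8 * x))
dat <- data.frame(y, trt, x)

## Weighted risk ratio for the effect in the treated (SMR-type weights)
att_weighted_rr <- function(d) {
  ps <- fitted(glm(trt ~ x, binomial, data = d))
  w  <- ifelse(d$trt == 1, 1, ps / (1 - ps))
  weighted.mean(d$y[d$trt == 1], w[d$trt == 1]) /
    weighted.mean(d$y[d$trt == 0], w[d$trt == 0])
}

## Sensitivity analysis: drop observations below the 5th PS percentile of the
## treated and above the 95th PS percentile of the untreated, then re-estimate
ps_all <- fitted(glm(trt ~ x, binomial, data = dat))
lo <- quantile(ps_all[dat$trt == 1], 0.05)
hi <- quantile(ps_all[dat$trt == 0], 0.95)
trimmed <- dat[ps_all >= lo & ps_all <= hi, ]

c(full_sample = att_weighted_rr(dat), trimmed_sample = att_weighted_rr(trimmed))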

  1. Ann Epidemiol. 2013 Jan 16. [Epub ahead of print]

Adjusting for outcome misclassification: the importance of accounting for case-control sampling and other forms of outcome-related selection. Jurek AM, Maldonado G, Greenland S. Center for Healthcare Research & Innovation, Allina Health, Minneapolis, MN; Division of Environmental Health Sciences, University of Minnesota School of Public Health, Minneapolis. Electronic address: jure0007@umn.edu.

PURPOSE: Special care must be taken when adjusting for outcome misclassification in case-control data. Basic adjustment formulas using either sensitivity and specificity or predictive values (as with external validation data) do not account for the fact that controls are sampled from a much larger pool of potential controls. A parallel problem arises in surveys and cohort studies in which participation or loss is outcome related. METHODS: We review this problem and provide simple methods to adjust for outcome misclassification in case-control studies, and illustrate the methods in a case-control birth certificate study of cleft lip/palate and maternal cigarette smoking during pregnancy. RESULTS: Adjustment formulas for outcome misclassification that ignore case-control sampling can yield severely biased results. In the data we examined, the magnitude of error caused by not accounting for sampling is small when population sensitivity and specificity are high, but increases as (1) population sensitivity decreases, (2) population specificity decreases, and (3) the magnitude of the differentiality increases. Failing to account for case-control sampling can result in an odds ratio adjusted for outcome misclassification that is either too high or too low. CONCLUSIONS: One needs to account for outcome-related selection (such as case-control sampling) when adjusting for outcome misclassification using external information. Copyright © 2013 Elsevier Inc. All rights reserved.
PMID: 23332712  [PubMed - as supplied by publisher]

  1. Evid Based Complement Alternat Med. 2012. Epub 2012 Dec 26.

Building a strategic framework for comparative effectiveness research in complementary and integrative medicine. Witt CM, Chesney M, Gliklich R, Green L, Lewith G, Luce B, McCaffrey A, Rafferty Withers S, Sox HC, Tunis S, Berman BM. Institute for Social Medicine, Epidemiology and Health Economics, Charité University Medical Center, Berlin, Germany; Center for Integrative Medicine, University of Maryland School of Medicine, Baltimore, MD.

The increasing burden of chronic diseases presents not only challenges to the knowledge and expertise of the professional medical community, but also highlights the need to improve the quality and relevance of clinical research in this domain. Many patients now turn to complementary and integrative medicine (CIM) to treat their chronic illnesses; however, there is very little evidence to guide their decision-making in usual care. The following research recommendations were derived from a CIM Stakeholder Symposium on Comparative Effectiveness Research (CER): (1) CER studies should be made a priority in this field; (2) stakeholders should be engaged at every stage of the research; (3) CER study designs should highlight effectiveness over efficacy; (4) research questions should be well defined to enable the selection of an appropriate CER study design; (5) the CIM community should cultivate widely shared understandings, discourse, tools, and technologies to support the use and validity of CER methods; (6) Effectiveness Guidance Documents on methodological standards should be developed to shape future CER studies. CER is an emerging field and its development and impact must be reflected in future research strategies within CIM. This stakeholder symposium was a first step in providing systematic guidance for future CER in this field.
PMCID: PMC3544532; PMID: 23346206  [PubMed]

CER Scan [published in the last 30 days]

  1. Am J Epidemiol. 2013 Jan 15;177(2):131-41. Epub 2013 Jan 4.

Adapting Group Sequential Methods to Observational Postlicensure Vaccine Safety Surveillance: Results of a Pentavalent Combination DTaP-IPV-Hib Vaccine Safety Study. Nelson JC, Yu O, Dominguez-Islas CP, Cook AJ, Peterson D, Greene SK, Yih WK, Daley MF, Jacobsen SJ, Klein NP, Weintraub ES, Broder KR, Jackson LA.

To address gaps in traditional postlicensure vaccine safety surveillance and to promote rapid signal identification, new prospective monitoring systems using large health-care database cohorts have been developed. We newly adapted clinical trial group sequential methods to this observational setting in an original safety study of a combination diphtheria and tetanus toxoids and acellular pertussis adsorbed (DTaP), inactivated poliovirus (IPV), and Haemophilus influenzae type b (Hib) conjugate vaccine (DTaP-IPV-Hib) among children within the Vaccine Safety Datalink population. For each prespecified outcome, we conducted 11 sequential Poisson-based likelihood ratio tests during September 2008-January 2011 to compare DTaP-IPV-Hib vaccinees with historical recipients of other DTaP-containing vaccines. No increased risk was detected among 149,337 DTaP-IPV-Hib vaccinees versus historical comparators for any outcome, including medically attended fever, seizure, meningitis/encephalitis/myelitis,
nonanaphylactic serious allergic reaction, anaphylaxis, Guillain-Barré syndrome, or invasive Hib disease. In end-of-study prespecified subgroup analyses, risk of medically attended fever was elevated among 1- to 2-year-olds who received DTaP-IPV-Hib vaccine versus historical comparators (relative risk = 1.83, 95% confidence interval: 1.34, 2.50) but not among infants under 1 year old (relative risk = 0.83, 95% confidence interval: 0.73, 0.94). Findings were similar in analyses with concurrent comparators who received other DTaP-containing vaccines during the study period. Although lack of a controlled experiment presents numerous challenges, implementation of group sequential monitoring methods in observational safety surveillance studies is promising and warrants further investigation.
PMID: 23292957  [PubMed - in process]
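
For orientation, the sketch below shows the kind of Poisson-based likelihood ratio statistic that can be computed at each sequential look, comparing observed events among vaccinees with the events expected from a comparison group (in the spirit of a maximized sequential probability ratio test). The counts and the signaling boundary are placeholders only; in an actual design the critical value is derived so that the sequence of tests preserves the overall type I error.

## Poisson likelihood ratio statistic at a single sequential "look": observed
## events among vaccinees (obs) versus events expected from the comparison
## group (expct), testing rate ratio = 1 against rate ratio > 1.
pois_llr <- function(obs, expct) {
  if (obs <= expct) return(0)                  # one-sided: only elevated risk can signal
  obs * log(obs / expct) - (obs - expct)       # log-likelihood ratio maximized over RR >= 1
}

## Hypothetical monitoring history over 11 looks (all counts are placeholders)
obs   <- c(1, 3, 6, 9, 13, 18, 22, 27, 33, 38, 44)
expct <- c(1.2, 2.9, 5.5, 8.8, 12.5, 17.0, 21.5, 26.0, 31.0, 36.5, 42.0)
llr   <- mapply(pois_llr, obs, expct)

crit <- 2.85   # placeholder signaling boundary; in practice it comes from the sequential design
data.frame(look = seq_along(obs), obs, expct, llr = round(llr, 3), signal = llr >= crit)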

Theme: “Looking at CER Methods and Ways to Use Electronic Clinical Data”

Increasing availability of electronic clinical data (ECD) presents new opportunities to answer research questions that can drive improvement in patient outcomes.  The EDM Forum’s new issue brief, “Getting Answers We Can Believe In: Methodological Considerations When Using Electronic Clinical Data for CER and PCOR”, outlines important research design considerations when using ECD for comparative effectiveness research (CER) and patient-centered outcomes research.

This brief provides practical orientation to study designs, highlighting the importance of thinking critically about which methods are best suited to answer particular research and quality improvement questions.  It discusses differences between observational studies and randomized controlled trials (RCTs), specific challenges and analytic approaches that can be used with observational ECD for CER, and ways of making information obtained from ECD more useful to clinical and policy decision-makers. For more information and additional issue briefs from the EDM Forum, please visit http://www.edm-forum.org/Publications/BriefsReports.

December 2012

CER Scan [Epub ahead of print]

  1. Med Care. 2012 Dec 23. [Epub ahead of print]

Improved Cardiovascular Risk Prediction Using Nonparametric Regression and Electronic Health Record Data. Kennedy EH, Wiitala WL, Hayward RA, Sussman JB. VA Center for Clinical Management Research, Ann Arbor VA Health Services Research and Development (HSR&D) Center of Excellence, Department of Internal Medicine, Robert Wood Johnson Foundation Clinical Scholars Program, University of Michigan, Ann Arbor, MI.

BACKGROUND: Use of the electronic health record (EHR) is expected to increase rapidly in the near future, yet little research exists on whether analyzing internal EHR data using flexible, adaptive statistical methods could improve clinical risk prediction. Extensive implementation of EHR in the Veterans Health Administration provides an opportunity for exploration. OBJECTIVES: To compare the performance of various approaches for predicting risk of cerebrovascular and cardiovascular (CCV) death, using traditional risk predictors versus more comprehensive EHR data. RESEARCH DESIGN: Retrospective cohort study. We identified all Veterans Health Administration patients without recent CCV events treated at 12 facilities from 2003 to 2007, and predicted risk using the Framingham risk score, logistic regression, generalized additive modeling, and gradient tree boosting. MEASURES: The outcome was CCV-related death within 5 years. We assessed each method’s predictive performance with the area under the receiver operating characteristic curve (AUC), the Hosmer-Lemeshow goodness-of-fit test, plots of estimated risk, and reclassification tables, using cross-validation to penalize overfitting. RESULTS: Regression methods outperformed the Framingham risk score, even with the same predictors (AUC increased from 71% to 73% and calibration also improved). Even better performance was attained in models using additional EHR-derived predictor variables (AUC increased to 78% and net reclassification improvement was as large as 0.29). Nonparametric regression further improved calibration and discrimination compared with logistic regression. CONCLUSION: Despite the EHR lacking some risk factors and its imperfect data quality, health care systems may be able to substantially improve risk prediction for their patients by using internally developed EHR-derived models and flexible statistical methodology.
PMID: 23269109  [PubMed - as supplied by publisher]
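
As a small illustration of comparing a standard logistic model with a more flexible alternative by cross-validated discrimination, the sketch below contrasts logistic regression with a generalized additive model (via the mgcv package) on simulated data with a nonlinear true risk. The predictors are stand-ins for EHR-derived variables, and the AUC is computed directly as the concordance (Wilcoxon) statistic; this is not the authors' code or data.

## Cross-validated AUC for a logistic model versus a flexible (GAM) model.
library(mgcv)   # recommended package shipped with R; provides gam() with smooth terms
set.seed(4)
n <- 3000
age <- rnorm(n, 60, 10); sbp <- rnorm(n, 135, 18); smoker <- rbinom(n, 1, 0.25)
p <- plogis(-6 + 0.06 * age + 0.0004 * (sbp - 110)^2 + 0.7 * smoker)   # nonlinear true risk
y <- rbinom(n, 1, p)
dat <- data.frame(y, age, sbp, smoker)

auc <- function(y, score) {           # AUC as the Wilcoxon/concordance statistic
  r  <- rank(score)
  n1 <- sum(y == 1); n0 <- sum(y == 0)
  (sum(r[y == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}

V <- 5
fold <- sample(rep(1:V, length.out = n))
pred_glm <- pred_gam <- numeric(n)
for (v in 1:V) {
  tr <- dat[fold != v, ]; te <- dat[fold == v, ]
  pred_glm[fold == v] <- predict(glm(y ~ age + sbp + smoker, binomial, data = tr),
                                 te, type = "response")
  pred_gam[fold == v] <- predict(gam(y ~ s(age) + s(sbp) + smoker, family = binomial, data = tr),
                                 te, type = "response")
}
c(auc_logistic = auc(dat$y, pred_glm), auc_flexible = auc(dat$y, pred_gam))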

  1. Stat Med. 2012 Dec 12. doi: 10.1002/sim.5705. [Epub ahead of print]

The performance of different propensity score methods for estimating marginal hazard ratios. Austin PC. Institute for Clinical Evaluative Sciences, Toronto, Ontario, Canada; Institute of Health Management, Policy and Evaluation, University of Toronto, Toronto, Ontario, Canada; Dalla Lana School of Public Health, University of Toronto, Toronto, Ontario, Canada.

Propensity score methods are increasingly being used to reduce or minimize the effects of confounding when estimating the effects of treatments, exposures, or interventions when using observational or non-randomized data. Under the assumption of no unmeasured confounders, previous research has shown that propensity score methods allow for unbiased estimation of linear treatment effects (e.g., differences in means or proportions). However, in biomedical research, time-to-event outcomes occur frequently. There is a paucity of research into the performance of different propensity score methods for estimating the effect of treatment on time-to-event outcomes. Furthermore, propensity score methods allow for the estimation of marginal or population-average treatment effects. We conducted an extensive series of Monte Carlo simulations to examine the performance of propensity score matching (1:1 greedy nearest-neighbor matching within propensity score calipers), stratification on the propensity score, inverse probability of treatment weighting (IPTW) using the propensity score, and covariate adjustment using the propensity score to estimate marginal hazard ratios. We found that both propensity score matching and IPTW using the propensity score allow for the estimation of marginal hazard ratios with minimal bias. Of these two approaches, IPTW using the propensity score resulted in estimates with lower mean squared error when estimating the effect of treatment in the treated. Stratification on the propensity score and covariate adjustment using the propensity score result in biased estimation of both marginal and conditional hazard ratios. Applied researchers are encouraged to use propensity score matching and IPTW using the propensity score when estimating the relative effect of treatment on time-to-event outcomes. Copyright © 2012 John Wiley & Sons, Ltd.
PMID: 23239115  [PubMed - as supplied by publisher]
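
A minimal sketch of one of the approaches evaluated above, inverse probability of treatment weighting (IPTW) using the propensity score with a weighted Cox model and robust variance, on simulated data; all variable names and the data-generating model are illustrative assumptions.

## IPTW estimation of a marginal hazard ratio with a weighted Cox model.
library(survival)
set.seed(5)
n  <- 2000
x1 <- rnorm(n); x2 <- rbinom(n, 1, 0.5)
trt  <- rbinom(n, 1, plogis(-0.3 + 0.8 * x1 + 0.5 * x2))
time <- rexp(n, rate = 0.10 * exp(0.5 * trt + 0.6 * x1 + 0.4 * x2))
cens <- rexp(n, rate = 0.05)
dat  <- data.frame(x1, x2, trt,
                   obs_time = pmin(time, cens),
                   event    = as.integer(time <= cens))

## Propensity score and inverse probability of treatment weights (ATE weights;
## for the effect in the treated, use 1 for treated and ps/(1 - ps) for untreated)
ps <- fitted(glm(trt ~ x1 + x2, binomial, data = dat))
dat$iptw <- ifelse(dat$trt == 1, 1 / ps, 1 / (1 - ps))

## Weighted Cox model for the marginal hazard ratio, with a robust (sandwich) variance
coxph(Surv(obs_time, event) ~ trt, data = dat, weights = iptw, robust = TRUE)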

  1. Eur J Epidemiol. 2012 Dec 7. [Epub ahead of print]

A proposal for an additional clinical trial outcome measure assessing preventive effect as delay of events. Lytsy P, Berglund L, Sundström J. Department of Medical Sciences, Entrance 40, 5th Floor, Uppsala University Hospital, SE-751 85, Uppsala, Sweden.

Many effect measures used in clinical trials are problematic because they are differentially understood by patients and physicians. The emergence of novel methods such as accelerated failure-time models and quantile regression has shifted the focus of effect measurement from probability measures to time-to-event measures. Such modeling techniques are rapidly evolving, but matching non-parametric descriptive measures are lacking. We propose such a measure, the delay of events, demonstrating treatment effect as a gain in event-free time. We believe this measure to be of value for shared clinical decision-making. The rationale behind the measure is given, and it is conceptually explained using the Kaplan-Meier estimate and the quantile regression framework. A formula for calculation of the delay of events is given. Hypothetical and empirical examples are used to demonstrate the measure. The measure is discussed in relation to other measures highlighting the time effects of preventive treatments. There is a need to further investigate the properties of the measure as well as its role in clinical decision-making.
PMID: 23224516  [PubMed - as supplied by publisher]
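
The proposed measure can be read off Kaplan-Meier curves as the difference, between arms, in the time by which a chosen fraction of patients has had an event. A minimal sketch on simulated data follows (using the survival package); the event fraction, data, and variable names are illustrative, and the article also frames the measure within a quantile regression framework, which is not shown here.

## "Delay of events" read off Kaplan-Meier curves: for a chosen event fraction
## (here the first 10% of patients to have an event), the gain in event-free
## time under treatment is the difference in the corresponding KM time points.
library(survival)
set.seed(6)
n    <- 1000
trt  <- rbinom(n, 1, 0.5)
time <- rexp(n, rate = 0.08 * exp(-0.4 * trt))      # treatment delays events
cens <- runif(n, 0, 15)
dat  <- data.frame(trt,
                   obs_time = pmin(time, cens),
                   event    = as.integer(time <= cens))

fit <- survfit(Surv(obs_time, event) ~ trt, data = dat)
q   <- quantile(fit, probs = 0.10)     # time by which 10% of each arm has had an event
q$quantile
diff(q$quantile[, 1])                  # delay of events at the 10% level (in the time units of the data)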

  1. Stat Med. 2012 Dec 3. doi: 10.1002/sim.5690. [Epub ahead of print]

On estimating average effects for multiple treatment groups. Landsman V, Pfeiffer RM. Center for Global Health Research, St. Michael’s Hospital, Toronto, Ontario M5B1W8, Canada.

We propose to estimate average exposure (or treatment) effects from observational data for multiple exposure groups by fitting an approximation of the marginal sample distribution of the response variable in each exposure group to the data. The marginal sample distribution is a function of the true distribution of the response variable in the population and the assignment rule governing the allocation of the subjects to different exposure groups. The assignment rule can depend on the response variable in addition to measured covariates, and hence the method is appropriate even when the assumption of ignorable treatment assignment is not justified. The exposure effects are estimated based on the population expectation (PE) of the outcome variable. We compare the PE approach with an instrumental variable method and with several other methods including propensity score based approaches that assume ignorable assignment mechanisms. We evaluate the robustness of the PE method under model misspecifications and illustrate it using data from a study of the impact of soy consumption on urinary concentrations of estrogen and estrogen metabolites in Asian American women. Published 2012. This article is a US Government work and is in the public domain in the USA.
PMID: 23208873  [PubMed - as supplied by publisher]

  1. Stat Med. 2012 Dec 3. doi: 10.1002/sim.5686. [Epub ahead of print]

Methods for dealing with time-dependent confounding. Daniel RM, Cousens SN, De Stavola BL, Kenward MG, Sterne JA. Centre for Statistical Methodology, London School of Hygiene and Tropical Medicine, London, U.K.

Longitudinal studies, where data are repeatedly collected on subjects over a period, are common in medical research. When estimating the effect of a time-varying treatment or exposure on an outcome of interest measured at a later time, standard methods fail to give consistent estimators in the presence of time-varying confounders if those confounders are themselves affected by the treatment. Robins and colleagues have proposed several alternative methods that, provided certain assumptions hold, avoid the problems associated with standard approaches. They include the g-computation formula, inverse probability weighted estimation of marginal structural models and g-estimation of structural nested models. In this tutorial, we give a description of each of these methods, exploring the links and differences between them and the reasons for choosing one over the others in different settings. Copyright © 2012 John Wiley & Sons, Ltd.
PMID: 23208861  [PubMed - as supplied by publisher]
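
Of the three approaches covered in the tutorial, inverse probability weighting of a marginal structural model is the most easily sketched with standard software. The two-interval example below uses simulated data in which the time-varying confounder L1 is affected by the earlier treatment A0, so conditioning on L1 directly would block part of the treatment effect; all names and the data-generating model are illustrative, and robust or bootstrap standard errors would be needed in practice.

## Two-interval sketch of inverse-probability-of-treatment weighting for a
## marginal structural model with a time-varying confounder.
set.seed(7)
n  <- 5000
L0 <- rnorm(n)
A0 <- rbinom(n, 1, plogis(0.5 * L0))
L1 <- rnorm(n, mean = 0.4 * L0 - 0.4 * A0)           # confounder affected by earlier treatment
A1 <- rbinom(n, 1, plogis(0.5 * L1 + 0.8 * A0))
Y  <- rnorm(n, mean = -0.5 * A0 - 0.5 * A1 + 0.6 * L0 + 0.6 * L1)
dat <- data.frame(L0, A0, L1, A1, Y)

## Stabilized weights: numerators use treatment history only, denominators add
## the confounders
p_num0 <- fitted(glm(A0 ~ 1, binomial, data = dat))
p_den0 <- fitted(glm(A0 ~ L0, binomial, data = dat))
p_num1 <- fitted(glm(A1 ~ A0, binomial, data = dat))
p_den1 <- fitted(glm(A1 ~ A0 + L0 + L1, binomial, data = dat))
pr <- function(p, a) ifelse(a == 1, p, 1 - p)
sw <- (pr(p_num0, dat$A0) * pr(p_num1, dat$A1)) /
      (pr(p_den0, dat$A0) * pr(p_den1, dat$A1))

## The marginal structural model: outcome regressed on treatment history only,
## weighted by sw (robust or bootstrap standard errors would be used in practice)
summary(lm(Y ~ A0 + A1, data = dat, weights = sw))$coefficients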

CER Scan [published in the last 30 days]

  1. Health Serv Outcomes Res Methodol. 2012 Dec;12(4):254-272. Epub 2012 Sep 25.

Instrumental variable specifications and assumptions for longitudinal analysis of mental health cost offsets. O’Malley AJ. Department of Health Care Policy, Harvard Medical School, 180 Longwood Avenue, Boston, MA 02115-5899 USA.

Instrumental variables (IVs) enable causal estimates in observational studies to be obtained in the presence of unmeasured confounders. In practice, a diverse range of models and IV specifications can be brought to bear on a problem, particularly with longitudinal data where treatment effects can be estimated for various functions of current and past treatment. However, in practice the empirical consequences of different assumptions are seldom examined, despite the fact that IV analyses make strong assumptions that cannot be conclusively tested by the data. In this paper, we consider several longitudinal models and specifications of IVs. Methods are applied to data from a 7-year study of mental health costs of atypical and conventional antipsychotics whose purpose was to evaluate whether the newer and more expensive atypical antipsychotic medications lead to a reduction in overall mental health costs.
PMCID: PMC3515775; PMID: 23226968  [PubMed]

  1. Stat Med. 2012 Dec 10;31(28):3563-78. doi: 10.1002/sim.5387.

A new linear model-based approach for inferences about the mean area under the curve. Wilding GE, Chandrasekhar R, Hutson AD. Department of Biostatistics, University at Buffalo, Buffalo, NY.

Outcome versus time data are commonly encountered in biomedical and clinical research. A common strategy adopted in analyzing such longitudinal data is to condense the repeated measurements on each individual into a single summary statistic such as the area under the response versus time curve. Standard parametric or non-parametric methods are then applied to perform inferences on the conditional area under the curve distribution. Disadvantages of this approach include the disregard of the within-subject variation in the longitudinal profile. We propose a general linear model approach, accounting for the within-subject variance, for estimation and hypothesis tests about the mean areas. Inferential properties of our approach are compared with those from standard methods of analysis using Monte Carlo simulation studies. The impact of missing data, within-subject heterogeneity and homogeneity of variance, are also evaluated. A real working example is used to illustrate the methodology. It is seen that the proposed approach is associated with a significant power advantage over traditional methods, especially when missing data are encountered. Copyright © 2012 John Wiley & Sons, Ltd.
PMID: 23175104  [PubMed - in process]
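
For context, the sketch below shows the conventional summary-measure analysis that the article improves upon: each subject's longitudinal profile is condensed into a trapezoidal-rule area under the response-versus-time curve, and the mean AUCs are then compared between groups. The measurement times, data, and group comparison are illustrative assumptions, not the authors' linear-model approach.

## Conventional summary-measure analysis of longitudinal data: per-subject
## trapezoidal AUC, then a two-sample comparison of mean AUCs.
set.seed(8)
times <- c(0, 1, 2, 4, 8)                         # common measurement times
n_per_group <- 30
make_subject <- function(group) {
  resp <- 10 + 2 * group * times / 8 + rnorm(length(times), sd = 1.5)
  data.frame(group, t(resp))
}
dat <- do.call(rbind, c(lapply(rep(0, n_per_group), make_subject),
                        lapply(rep(1, n_per_group), make_subject)))

## Trapezoidal area under each subject's response-versus-time curve
trap_auc <- function(y, x = times) sum(diff(x) * (head(y, -1) + tail(y, -1)) / 2)
dat$auc <- apply(dat[, -1], 1, trap_auc)

## Conventional two-sample comparison of the mean AUC between groups
t.test(auc ~ group, data = dat)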

  1. Stat Med. 2012 Dec 10;31(28):3481-93.

The analysis of record-linked data using multiple imputation with data value priors. Goldstein H, Harron K, Wade A. Medical Research Council Centre of Epidemiology for Child Health, University College London Institute of Child Health, London, WC1N 1EH, U.K.; Centre for Multilevel Modelling, Graduate School of Education, University of Bristol, BS8 1JA, Bristol, U.K.

Probabilistic record linkage techniques assign match weights to one or more potential matches for those individual records that cannot be assigned ‘unequivocal matches’ across data files. Existing methods select the single record having the maximum weight provided that this weight is higher than an assigned threshold. We argue that this procedure, which ignores all information from matches with lower weights and for some individuals assigns no match, is inefficient and may also lead to biases in subsequent analysis of the linked data. We propose that a multiple imputation framework be utilised for data that belong to records that cannot be matched unequivocally. In this way, the information from all potential matches is transferred through to the analysis stage. This procedure allows for the propagation of matching uncertainty through a full modelling process that preserves the data structure. For purposes of statistical modelling, results from a simulation example suggest that a full probabilistic record linkage is unnecessary and that standard multiple imputation will provide unbiased and efficient parameter estimates. Copyright © 2012 John Wiley & Sons, Ltd.
PMID: 22807145  [PubMed - in process]

  1. Stat Med. 2012 Dec 10;31(28):3693-707. doi: 10.1002/sim.5429.

On Bayesian methods of exploring qualitative interactions for targeted treatment. Chen W, Ghosh D, Raghunathan TE, Norkin M, Sargent DJ, Bepler G. Department of Oncology, School of Medicine, Wayne State University, Detroit, MI.

Providing personalized treatments designed to maximize benefits and minimize harms is of tremendous current medical interest. One problem in this area is the evaluation of the interaction between the treatment and other predictor variables. Treatment effects in subgroups having the same direction but different magnitudes are called quantitative interactions, whereas those having opposite directions in subgroups are called qualitative interactions (QIs). Identifying QIs is challenging because they are rare and usually unknown among many potential biomarkers. Meanwhile, subgroup analysis reduces the power of hypothesis testing and multiple subgroup analyses inflate the type I error rate. We propose a new Bayesian approach to search for QI in a multiple regression setting with adaptive decision rules. We consider various regression models for the outcome. We illustrate this method in two examples of phase III clinical trials. The algorithm is straightforward and easy to implement using existing software packages. We provide sample code in the Appendix. Copyright © 2012 John Wiley & Sons, Ltd.
PMCID: PMC3528020 [Available on 2013/12/10]
PMID: 22733620  [PubMed - in process]

Theme: “Data Augmentation in CER”
Data augmentation can be used as an alternative to Markov-chain Monte-Carlo (MCMC) procedures for conducting Bayesian analyses. It allows the analyst to add supplemental data so that each prior distribution is represented by one or more prior-data records. Results are comparable to those from MCMC but can be obtained in a fraction of the time, without specialized software. This compilation of articles provides guidance on the application of data augmentation and examples of its use. Furthermore, the article by Sullivan and Greenland (2012) provides a complete online appendix with SAS code for Bayesian conditional logistic, Poisson loglinear and Cox regression.
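
As a minimal sketch of the prior-data-record idea on simulated data: an approximately normal(0, v) prior on a log odds ratio can be represented by a prior 2x2 table with 4/v observations in each cell (so the prior table has odds ratio 1 and a log odds ratio variance of about v), appended to the analysis data with its own stratum indicator so that only the exposure coefficient, not the intercept, is pulled toward the prior. The R code below is a simplified version of the approach in the articles that follow, which give more refined formulations (offsets, rescaling, non-null prior means, and extensions to conditional logistic, Poisson, and Cox models).

## Data augmentation prior for a single log odds ratio in logistic regression.
set.seed(9)
n <- 120                                       # deliberately sparse "real" data
x <- rbinom(n, 1, 0.3)
y <- rbinom(n, 1, plogis(-2 + 1.0 * x))
real <- data.frame(ev = y, tot = 1, x = x, prior_rec = 0)

v  <- 0.5                                      # prior variance of the log OR (95% prior OR limits ~ 1/4 to 4)
c0 <- 4 / v                                    # equal cell counts for the prior 2x2 table
prior <- data.frame(ev  = c(c0, c0),           # c0 cases and c0 noncases at x = 1 and at x = 0
                    tot = c(2 * c0, 2 * c0),
                    x   = c(1, 0),
                    prior_rec = 1)             # indicator giving the prior table its own stratum

aug <- rbind(real, prior)
fit_ml <- glm(cbind(ev, tot - ev) ~ x, binomial, data = real)
fit_da <- glm(cbind(ev, tot - ev) ~ x + prior_rec, binomial, data = aug)

## Ordinary maximum-likelihood log OR versus the approximate posterior (semi-Bayes) log OR
c(ml = unname(coef(fit_ml)["x"]), data_augmented = unname(coef(fit_da)["x"]))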

  1. Ann Epidemiol. 2012 Dec 5. [Epub ahead of print]

Sparse-data bias accompanying overly fine stratification in an analysis of beryllium exposure and lung cancer risk. Rothman KJ, Mosquin PL. RTI Health Solutions, Research Triangle Institute, 200 Park Offices Drive, Research Triangle Park, NC. Electronic address: KRothman@rti.org.

PURPOSE: Beryllium’s classification as a carcinogen is based on limited human data that show inconsistent associations with lung cancer. Therefore, a thorough examination of those data is warranted. We reanalyzed data from the largest study of occupational beryllium exposure, conducted by the National Institute for Occupational Safety and Health (NIOSH). METHODS: Data had been analyzed using stratification and standardization. We reviewed the strata in the original analysis, and reanalyzed using fewer strata. We also fit a Poisson regression, and analyzed simulated datasets that generated lung cancer cases randomly without regard to exposure. RESULTS: The strongest association reported in the NIOSH study, a standardized rate ratio for death from lung cancer of 3.68 for the highest versus lowest category of time since first employment, is affected by sparse-data bias, stemming from stratifying 545 lung cancer cases and their associated person-time into 1792 categories. For time since first employment, the measure of beryllium exposure with the strongest reported association with lung cancer, there were no strata without zeroes in at least one of the two contrasting exposure categories. Reanalysis using fewer strata or with regression models gave substantially smaller effect estimates. Simulations confirmed that the original stratified analysis was upwardly biased. Other metrics used in the NIOSH study found weaker associations and were less affected by sparse-data bias. CONCLUSIONS: The strongest association reported in the NIOSH study seems to be biased as a result of non-overlap of data across the numerous strata. Simulation results indicate that most of the effect reported in the NIOSH paper for time since first employment is attributable to sparse-data bias. Copyright © 2013 Elsevier Inc. All rights reserved.
PMID: 23219098  [PubMed - as supplied by publisher]

  1. Int J Epidemiol. 2012 Dec 10. [Epub ahead of print]

Bayesian regression in SAS software. Sullivan SG, Greenland S. Department of Epidemiology, University of California, Los Angeles, CA, WHO Collaborating Centre for Reference and Research on Influenza, Melbourne, Australia and Department of Statistics, University of California, Los Angeles, CA.

Bayesian methods have been found to have clear utility in epidemiologic analyses involving sparse-data bias or considerable background information. Easily implemented methods for conducting Bayesian analyses by data augmentation have been previously described but remain in scant use. Thus, we provide guidance on how to do these analyses with ordinary regression software. We describe in detail and provide code for the implementation of data augmentation for Bayesian and semi-Bayes regression in SAS® software, and illustrate their use in a real logistic-regression analysis. For comparison, the same model was fitted using the Markov-chain Monte Carlo (MCMC) procedure. The two methods required a similar number of steps and yielded similar results, although for the main example, data augmentation ran in about 0.5% of the time required for MCMC. We also provide online appendices with details and examples for conditional logistic, Poisson and Cox proportional-hazards regression.
PMID: 23230299  [PubMed - as supplied by publisher]

  1. J Clin Epidemiol. 2010 Apr;63(4):370-83.

A valid and reliable belief elicitation method for Bayesian priors. Johnson SR, Tomlinson GA, Hawker GA, Granton JT, Grosbein HA, Feldman BM. Division of Rheumatology, Department of Medicine, Toronto Western Hospital, University Health Network, Toronto, Ontario M5T 2S8, Canada. Sindhu.Johnson@uhn.on.ca

OBJECTIVE: Bayesian inference has the advantage of formally incorporating prior beliefs about the effect of an intervention into analyses of treatment effect through the use of prior probability distributions or "priors." Multiple methods to elicit beliefs from experts for inclusion in a Bayesian study have been used; however, the measurement properties of these methods have been infrequently evaluated. The objectives of this study were to evaluate the feasibility, validity, and reliability of a belief elicitation method for Bayesian priors.
STUDY DESIGN AND SETTING: A single-center, cross-sectional study using a sample of academic specialists who treat pulmonary hypertension patients was conducted to test the feasibility, face and construct validity, and reliability of a belief elicitation method. Using this method, participants expressed the probability of 3-year survival with and without warfarin. Applying adhesive dots or "chips," each representing 5% probability, in "bins" on a line, participants expressed
their uncertainty and weight of belief about the effect of warfarin on 3-year survival.
RESULTS: Of the 12 participants, 11 (92%) reported that the belief elicitation method had face validity, 10 (83%) found the questions clear, and 11 (92%) found the response option easy to use. The median time to completion was 10 minutes (5-15 minutes). Internal validity testing found moderate agreement (weighted kappa=0.54-0.57). The intraclass correlation coefficient for test-retest reliability was 0.93.
CONCLUSION: This method of belief elicitation for Bayesian priors is feasible, valid, and reliable. It can be considered for application in Bayesian clinical studies. Copyright 2010 Elsevier Inc. All rights reserved.
PMID: 19926253  [PubMed - indexed for MEDLINE]

  1. Int J Epidemiol. 2007 Feb;36(1):195-202. Epub 2007 Feb 28.

Bayesian perspectives for epidemiological research. II. Regression analysis. Greenland S. Departments of Epidemiology and Statistics, University of California-Los Angeles, Los Angeles, CA 90095-1772, USA. lesdomes@ucla.edu

This article describes extensions of the basic Bayesian methods using data priors to regression modelling, including hierarchical (multilevel) models. These methods provide an alternative to the parsimony-oriented approach of frequentist regression analysis. In particular, they replace arbitrary variable-selection criteria by prior distributions, and by doing so facilitate realistic use of imprecise but important prior information. They also allow Bayesian analyses to be conducted using standard regression packages; one need only be able to add variables and records to the data set. The methods thus facilitate the use of Bayesian solutions to problems of sparse data, multiple comparisons, subgroup analyses and study bias. Because these solutions have a frequentist interpretation as "shrinkage" (penalized) estimators, the methods can also be viewed as a means of implementing shrinkage approaches to multiparameter problems.
PMID: 17329317  [PubMed - indexed for MEDLINE]

  1. Biometrics. 2001 Sep;57(3):663-70.

Putting background information about relative risks into conjugate prior distributions. Greenland S. Department of Epidemiology, UCLA School of Public Health, and Topanga, CA.

In Bayesian and empirical Bayes analyses of epidemiologic data, the most easily implemented prior specifications use a multivariate normal distribution for the log relative risks or a conjugate distribution for the discrete response vector. This article describes problems in translating background information about relative risks into conjugate priors and a solution. Traditionally, conjugate priors have been specified through flattening constants, an approach that leads to conflicts with the true prior covariance structure for the log relative risks. One can, however, derive a conjugate prior consistent with that structure by using a data-augmentation approximation to the true log relative-risk prior, although a rescaling step is needed to ensure the accuracy of the approximation. These points are illustrated with a logistic regression analysis of
neonatal-death risk.
PMID: 11550913  [PubMed - indexed for MEDLINE]

  1. Stat Med. 2001 Aug 30;20(16):2421-8.

Data augmentation priors for Bayesian and semi-Bayes analyses of conditional-logistic and proportional-hazards regression. Greenland S, Christensen R. Department of Epidemiology, UCLA School of Public Health, Los Angeles, CA.

Data augmentation priors have a long history in Bayesian data analysis. Formulae for such priors have been derived for generalized linear models, but their accuracy depends on two approximation steps. This note presents a method for using offsets as well as scaling factors to improve the accuracy of the approximations in logistic regression. This method produces an exceptionally simple form of data augmentation that allows it to be used with any standard package for conditional-logistic or proportional-hazards regression to perform Bayesian and semi-Bayes analyses of matched and survival data. The method is illustrated with an analysis of a matched case-control study of diet and breast cancer. Copyright 2001 John Wiley & Sons, Ltd.
PMID: 11512132  [PubMed - indexed for MEDLINE]

  1. J Occup Med Toxicol. 2010 Aug 11;5:23. doi: 10.1186/1745-6673-5-23.

Bayesian bias adjustments of the lung cancer SMR in a cohort of German carbon black production workers. Morfeld P, McCunney RJ. Institute for Occupational Medicine of Cologne University/Germany. peter.morfeld@evonik.com.

BACKGROUND: A German cohort study on 1,528 carbon black production workers estimated an elevated lung cancer SMR ranging from 1.8-2.2 depending on the reference population. No positive trends with carbon black exposures were noted in the analyses. A nested case control study, however, identified smoking and previous exposures to known carcinogens, such as crystalline silica, received prior to work in the carbon black industry as important risk factors. We used a Bayesian procedure to adjust the SMR, based on a prior of seven independent parameter distributions describing smoking behaviour and crystalline silica dust exposure (as indicator of a group of correlated carcinogen exposures received previously) in the cohort and population as well as the strength of the relationship of these factors with lung cancer mortality. We implemented the approach by Markov Chain Monte Carlo Methods (MCMC) programmed in R, a statistical computing system freely available on the internet, and we provide the program code.
RESULTS: When putting a flat prior to the SMR a Markov chain of length 1,000,000 returned a median posterior SMR estimate (that is, the adjusted SMR) in the range between 1.32 (95% posterior interval: 0.7, 2.1) and 1.00 (0.2, 3.3) depending on the method of assessing previous exposures.
CONCLUSIONS: Bayesian bias adjustment is an excellent tool to effectively combine data about confounders from different sources. The usually calculated lung cancer SMR statistic in a cohort of carbon black workers overestimated effect and precision when compared with the Bayesian results. Quantitative bias adjustment should become a regular tool in occupational epidemiology to address narrative discussions of potential distortions.
PMCID: PMC2928247; PMID: 20701747  [PubMed]

November 2012

CER Scan [Epub ahead of print]

    1. J Eval Clin Pract. 2012 Nov 19. doi: 10.1111/jep.12012. [Epub ahead of print]

Predicting post-discharge death or readmission: deterioration of model performance in population having multiple admissions per patient. van Walraven C, Wong J, Forster AJ, Hawken S. Clinical Epidemiology, Ottawa Hospital Research Institute, Ottawa, Ontario,
Canada; Institute for Clinical Evaluative Sciences, Ottawa, Ontario, Canada; Department of Medicine, University of Ottawa, Ottawa, Ontario, Canada.

BACKGROUND: To avoid biased estimates of standard errors in regression models, statisticians commonly limit the analytical dataset to one observation per patient. OBJECTIVE: Measure and explain changes in model performance when a model predicting 30-day risk of death or urgent readmission (derived on a dataset having one hospitalization per patient) was applied to all hospitalizations for study patients. METHODS: Using administrative data from Ontario, we identified all hospitalizations of 499,996 patients between 2004 and 2009. We calculated the expected risk for 30-day death or urgent readmission using a validated model. The observed-to-expected ratio was determined after categorizing patients into quintiles of rates for hospitalization, emergent hospitalizations, hospital day and total diagnostic risk score. RESULTS: Study patients had a total of 858,410 hospitalizations. Compared with a dataset having one hospitalization per patient, model performance declined significantly when applied to all hospitalizations [c-statistic decreased from 0.768 to 0.730; the observed-to-expected ratio increased from 0.998 (95% confidence interval 0.977-0.999) to 1.305 (1.297-1.313)]. Model deterioration was most pronounced in patients with higher hospital utilization, with the observed-to-expected ratio increasing to 1.67 in the highest quintile of emergent hospitalization rates. CONCLUSIONS: The accuracy of predicting 30-day death or urgent readmission decreased significantly when the unit of analysis changed from the patient to the hospitalization. Patients with heavy hospital utilization likely have characteristics, not adequately captured in the model, that increase the risk of death or urgent readmission after discharge from hospital. Adequately capturing the characteristics of such high-end hospital users may improve readmission models. © 2012 Blackwell Publishing Ltd.
PMID: 23163303  [PubMed - as supplied by publisher]

    1. Int J Biostat. 2012 Nov 5;8. doi:10.1515/1557-4679.1397.

Partial Identification arising from Nondifferential Exposure Misclassification: How Informative are Data on the Unlikely, Maybe, and Likely Exposed? Wang D, Shen T, Gustafson P. University of British Columbia.

There is quite an extensive literature on the deleterious impact of exposure misclassification when inferring exposure-disease associations, and on statistical methods to mitigate this impact. Virtually all of this work, however, presumes a common number of states for the true exposure status and the classified exposure status. In the simplest situation, for instance, both the true status and the classified status are binary. The present work diverges from the norm, in considering classification into three states when the actual exposure status is simply binary. Intuitively, the classification states might be labeled as ‘unlikely exposed,’ ‘maybe exposed,’ and ‘likely exposed.’ While this situation has been discussed informally in the epidemiological literature, we provide some theory concerning what can be learned about the exposure-disease relationship, under various assumptions about the classification scheme. We focus on the challenging situation whereby no validation data is available from which to infer classification probabilities, but some prior assertions about these probabilities might be justified.
PMID: 23152432  [PubMed - in process]

    1. J Epidemiol Community Health. 2012 Nov 23. [Epub ahead of print]

Invited commentary: what can epidemiology contribute to comparative effectiveness research? Chubak J. Group Health Research Institute, Seattle, WA.

PMID: 23180807  [PubMed - as supplied by publisher]

CER Scan [published in the last 30 days]

    1. Diabetes Care. 2012;35:2665-2673.

Metformin and the Risk of Cancer: Time-related biases in observational studies. Suissa S, Azoulay L.

OBJECTIVE: Time-related biases in observational studies of drug effects have been described extensively in different therapeutic areas but less so in diabetes. Immortal time bias, time-window bias, and time-lag bias all tend to greatly exaggerate the benefits observed with a drug.
RESEARCH DESIGN AND METHODS: These time-related biases are described and shown to be prominent in observational studies that have associated metformin with impressive reductions in the incidence of and mortality from cancer. As a consequence, metformin received much attention as a potential anticancer agent; these observational studies sparked the conduct of randomized controlled trials of metformin as cancer treatment. However, the spectacular effects reported in these studies are compatible with time-related biases.
RESULTS: We found that 13 observational studies suffered from immortal time bias; 9 studies had not considered time-window bias, whereas other studies did not consider inherent time-lagging issues when comparing the first-line treatment metformin with second- or third-line treatments. These studies, subject to time-related biases that are avoidable with proper study design and data analysis, led to illusory, extraordinarily significant effects, with reductions in cancer risk with metformin ranging from 20 to 94%. Three studies that avoided these biases reported no effect of metformin use on cancer incidence.
CONCLUSIONS: Although observational studies are important to better understand the effects of drugs, their proper design and analysis is essential to avoid major time-related biases. With respect to metformin, the scientific evidence of its potential beneficial effects on cancer would need to be reassessed critically before embarking on further long and expensive trials.
PMCID: PMC3507580 [Available on 2013/12/1]
PMID: 23173135  [PubMed - in process]
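
A small simulated illustration of the first of these biases, immortal time bias, is given below: although the drug has no effect on mortality, a naive comparison of ever users with never users that counts users' person-time from cohort entry (including the pre-initiation period they necessarily survived) makes the drug appear protective, while correctly classifying pre-initiation person-time as unexposed removes the artifact. Rates, times, and variable names are illustrative assumptions.

## Numerical illustration of immortal time bias in R.
set.seed(10)
n <- 20000
death_t <- rexp(n, rate = 0.10)           # death rate unaffected by the drug
init_t  <- rexp(n, rate = 0.25)           # time from cohort entry to starting the drug
user    <- init_t < death_t               # only patients still alive can initiate
fup     <- pmin(death_t, 10)              # administrative censoring at 10 years
died    <- death_t <= 10

## Naive (biased): all follow-up of eventual users counted as "user" time
rate_user_naive <- sum(died & user)  / sum(fup[user])
rate_nonuser    <- sum(died & !user) / sum(fup[!user])

## Corrected: users contribute unexposed person-time before initiation
exposed_time   <- ifelse(user, pmax(fup - init_t, 0), 0)
unexposed_time <- ifelse(user, pmin(init_t, fup), fup)
rate_exposed   <- sum(died & user)  / sum(exposed_time)
rate_unexposed <- sum(died & !user) / sum(unexposed_time)

c(naive_rate_ratio     = rate_user_naive / rate_nonuser,   # well below 1
  corrected_rate_ratio = rate_exposed / rate_unexposed)    # close to 1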

    1. BMC Med Res Methodol. 2012 Nov 26;12(1):180. [Epub ahead of print]

Supplementing claims data with outpatient laboratory test results to improve confounding adjustment in effectiveness studies of lipid-lowering treatments. Schneeweiss S, Rassen JA, Glynn RJ, Myers J, Daniel GW, Singer J, Solomon DH, Kim S, Rothman KJ, Liu J, Avorn J.

BACKGROUND: Adjusting for laboratory test results may result in better confounding control when added to administrative claims data in the study of treatment effects. However, missing values can arise through several mechanisms. METHODS: We studied the relationship between availability of outpatient lab test results, lab values, and patient and system characteristics in a large healthcare database using LDL, HDL, and HbA1c in a cohort of initiators of statins or Vytorin (ezetimibe & simvastatin) as examples. RESULTS: Among 703,484 patients, 68% had at least one lab test performed in the 6 months before treatment. Performing an LDL test was negatively associated with several patient characteristics, including recent hospitalization (OR = 0.32, 95% CI: 0.29-0.34), MI (OR = 0.77, 95% CI: 0.69-0.85), or carotid revascularization (OR = 0.37, 95% CI: 0.25-0.53). Patient demographics, diagnoses, and procedures predicted well who would have a lab test performed (AUC = 0.89 to 0.93). Among those with test results available, claims data explained only 14% of variation. CONCLUSIONS: In a claims database linked with outpatient lab test results, we found that lab tests are performed selectively, corresponding to current treatment guidelines. Poor ability to predict lab values and the high proportion of missingness reduces the added value of lab tests for effectiveness research in this setting.
PMID: 23181419  [PubMed - as supplied by publisher]

    1. Stat Med. 2012 Nov 10;31(25):3051-3. doi: 10.1002/sim.5400.

Commentary: How the debate about comparative effectiveness research should impact the future of clinical trials. Lauer MS. Division of Cardiovascular Sciences, National Heart, Lung, and Blood Institute, Bethesda, MD.

Comparative effectiveness research represents the kind of research that arguably more directly affects clinical practice and policy. It includes observational studies, clinical trials, and systematic syntheses of existing literature. In this commentary, I argue for the ongoing and critical role of randomization in comparative effectiveness, noting the key differences between practical and explanatory trials. Copyright © 2012 John Wiley & Sons, Ltd.
PMID: 23055180  [PubMed - in process]

    1. Stat Med. 2012 Nov 10;31(25):3060-1. doi: 10.1002/sim.5397.

Comments on Lauer’s ‘How the debate about comparative effectiveness research should impact the future of clinical trials’. Hernán MA. Department of Epidemiology, Harvard School of Public Health Boston, MA.

PMID: 23055183  [PubMed - in process]

    1. Stat Med. 2012 Nov 10;31(25):3054-6. doi: 10.1002/sim.5399.

Commentary on ‘How the debate about comparative effectiveness research (CER) should impact the future of clinical trials’ by Michael S. Lauer. Ellenberg JH. Center for Clinical Epidemiology and Biostatistics, Division of Biostatistics Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA.

PMID: 23055181  [PubMed - in process]

DECEMBER THEME: Comparative Effectiveness Research in Cardiology

    1. JAMA. 2012 Nov 7;308(17):1747-8. doi: 10.1001/jama.2012.28745.

The future of cardiovascular clinical research: informatics, clinical investigators, and community engagement. Califf RM, Sanderson I, Miranda ML. Duke Translational Medicine Institute, Duke University Medical Center, 200 Trent Dr, 1117 Davison Bldg, Durham, NC

PMID: 23117773  [PubMed - indexed for MEDLINE]

    1. Heart Fail Clin. 2013 Jan;9(1):15-28.

End points for comparative effectiveness research in heart failure. Allen LA, Spertus JA. Division of Cardiology, University of Colorado School of Medicine, Anschutz Medical Center, Academic Office 1, 12631 East 17th Avenue, Mailstop B130, Aurora, CO.

With the increasing availability of therapeutic strategies and the growing complexity of health care delivery for patients with heart failure, objective evidence of the tangible benefits of different approaches to care is needed. Comparative effectiveness research (CER) offers an important avenue for making progress in the field. CER, like any well-designed research program, requires articulation of clinically important outcomes to be compared. In this review, available CER end-point domains are discussed, the wide variety of end points used in CER are summarized, and future steps for greater standardization of end points across heart failure CER are suggested. Copyright © 2013 Elsevier Inc. All rights reserved.
PMCID: PMC3506122 [Available on 2014/1/1]
PMID: 23168314  [PubMed - in process]

    1. Heart Fail Clin. 2013 Jan;9(1):37-47. doi: 10.1016/j.hfc.2012.09.008.

Comparative effectiveness research: drug-drug comparisons in heart failure. Fosbol EL. Duke Clinical Research Institute, Duke University Medical Center, Durham, NC

This article outlines the strengths and weaknesses of drug-drug comparative effectiveness research (CER), with a specific focus on heart failure research, to characterize the optimal use of and approaches to CER. Although there have been important therapeutic advances in heart failure over the past several decades, CER provides great opportunities for guiding researchers and clinicians in improving management of this important disease characterized by excess morbidity, mortality, and costs. Copyright © 2013 Elsevier Inc. All rights reserved.
PMID: 23168316  [PubMed - in process]

    1. Heart Fail Clin. 2013 Jan;9(1):ix. doi: 10.1016/j.hfc.2012.09.009.

Comparative effectiveness research in heart failure. Hlatky M. Stanford University School of Medicine, HRP Redwood Building, Room 150, Stanford, CA.

PMID: 23168321  [PubMed - in process]

    1. Heart Fail Clin. 2013 Jan;9(1):29-36. doi: 10.1016/j.hfc.2012.09.007.

Epidemiologic and statistical methods for comparative effectiveness research. Hlatky MA, Winkelmayer WC, Setoguchi S. Department of Health Research and Policy, Department of Medicine, Stanford University School of Medicine, Stanford, CA

Observational methods are evolving in response to the widespread availability of data from clinical registries, electronic health records, and administrative databases. These approaches will never eliminate the need for randomized trials, but clearly have a role in evaluating the effect of therapies in unselected populations treated in routine practice. This article reviews several approaches to the analysis of observational data that are in common use, or that may have promise even though they are not yet often applied.  Copyright © 2013 Elsevier Inc. All rights reserved.
PMID: 23168315  [PubMed - in process]

    1. Heart Fail Clin. 2013 Jan;9(1):1-13. doi: 10.1016/j.hfc.2012.09.001.

Data sources for heart failure comparative effectiveness research. Xian Y, Hammill BG, Curtis LH. Duke Clinical Research Institute, Duke University School of Medicine, Durham, NC

Existing data sources for heart failure research offer advantages and disadvantages for comparative effectiveness research. Clinical registries collect detailed information about disease presentation, treatment, and outcomes on a large number of patients and provide the “real-world” population that is the hallmark of comparative effectiveness research. Data are not collected longitudinally, however, and follow-up is often limited. Large administrative datasets provide the broadest population coverage with longitudinal outcomes follow-up but lack clinical detail. Linking clinical registries with other databases to assess longitudinal outcomes holds great promise. Copyright © 2013 Elsevier Inc. All rights reserved.
PMID: 23168313  [PubMed - in process]

October 2012

CER Scan [Epub ahead of print]

    1. Pharmacoepidemiol Drug Saf. 2012 Oct 16. doi: 10.1002/pds.3356. [Epub ahead of print]

Variable selection for propensity score models when estimating treatment effects on multiple outcomes: a simulation study. Wyss R, Girman CJ, Locasale RJ, Brookhart MA, Stürmer T. Epidemiology, Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, NC. til.sturmer@post.harvard.edu.

PURPOSE: It is often preferable to simplify the estimation of treatment effects on multiple outcomes by using a single propensity score (PS) model. Variable selection in PS models impacts the efficiency and validity of treatment effects. However, the impact of different variable selection strategies on the estimated treatment effects in settings involving multiple outcomes is not well understood. The authors use simulations to evaluate the impact of different variable selection strategies on the bias and precision of effect estimates to provide insight into the performance of various PS models in settings with multiple outcomes. METHODS: Simulated studies consisted of dichotomous treatment, two Poisson outcomes, and eight standard-normal covariates. Covariates were selected for the PS models based on their effects on treatment, a specific outcome, or both outcomes. The PSs were implemented using stratification, matching, and weighting (inverse probability treatment weighting). RESULTS: PS models including only covariates affecting a specific outcome (outcome-specific models) resulted in the most efficient effect estimates. The PS model that only included covariates affecting either outcome (generic-outcome model) performed best among the models that simultaneously controlled measured confounding for both outcomes. Similar patterns were observed over the range of parameter values assessed and all PS implementation methods. CONCLUSIONS: A single, generic-outcome model performed well compared with separate outcome-specific models in most scenarios considered. The results emphasize the benefit of using prior knowledge to identify covariates that affect the outcome when constructing PS models and support the potential to use a single, generic-outcome PS model when multiple outcomes are being examined. Copyright © 2012 John Wiley & Sons, Ltd.
PMID: 23070806  [PubMed - as supplied by publisher]
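A small simulation in the spirit of the study above can be sketched as follows, assuming one treatment, one Poisson outcome, and three covariates with different roles (outcome-specific confounder, other-outcome confounder, and an instrument-like treatment predictor). Inverse probability of treatment weighting is used for simplicity; the paper also examines stratification and matching, and the bias/precision comparison of course requires repeating the simulation many times rather than a single run.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 50_000
C1 = rng.normal(size=n)   # affects outcome 1 (the outcome analyzed here)
C2 = rng.normal(size=n)   # affects a second outcome only
C3 = rng.normal(size=n)   # affects treatment only (instrument-like)

p_treat = 1 / (1 + np.exp(-(0.4 * C1 + 0.4 * C2 + 0.6 * C3)))
A = rng.binomial(1, p_treat)
Y1 = rng.poisson(np.exp(-1.0 + 0.3 * A + 0.5 * C1))   # true rate ratio = exp(0.3) ~ 1.35

def iptw_rate_ratio(covs):
    """IPTW rate ratio for outcome 1, with the PS built from the listed covariates."""
    ps = sm.Logit(A, sm.add_constant(np.column_stack(covs))).fit(disp=0).predict()
    w = A / ps + (1 - A) / (1 - ps)
    rate_treated = np.sum(w * A * Y1) / np.sum(w * A)
    rate_control = np.sum(w * (1 - A) * Y1) / np.sum(w * (1 - A))
    return rate_treated / rate_control

print("Outcome-specific PS (C1):        ", round(iptw_rate_ratio([C1]), 3))
print("Generic-outcome PS (C1, C2):     ", round(iptw_rate_ratio([C1, C2]), 3))
print("PS including instrument (C1-C3): ", round(iptw_rate_ratio([C1, C2, C3]), 3))
```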

    1. Stat Methods Med Res. 2012 Oct 14. [Epub ahead of print]

Estimating controlled direct effects in the presence of intermediate confounding of the mediator-outcome relationship: Comparison of five different methods. Lepage B, Dedieu D, Savy N, Lang T. Inserm UMR1027, Toulouse, France.

In mediation analysis between an exposure X and an outcome Y, estimation of the direct effect of X on Y by usual regression after adjustment for the mediator M may be biased if Z is a confounder between M and Y, and is also affected by X. Alternative methods have been described to avoid such a bias: inverse probability of treatment weighting with and without weight truncation, the sequential g-estimator and g-computation. Our aim was to compare the usual linear regression adjusted for M to these methods when estimating the controlled direct effect between X and Y in the causal structure and to explore the size of the potential bias. Estimations were computed in several simulated data sets as well as real data. We observed an increased bias of the controlled direct effect estimation using linear regression adjusted for M for larger effects of X on M and larger effects of Z on M. The sequential g-estimator and g-computation gave unbiased estimations with adequate coverage values in every situation studied. With continuous exposure X and mediator M, inverse probability of treatment weighting resulted in some bias and less satisfactory coverage for large effects of X on M and Z on M.
PMID: 23070596  [PubMed - as supplied by publisher]
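The contrast between ordinary regression adjustment and weighting can be illustrated with a toy linear example in which the true controlled direct effect is known by construction. The sketch below implements the inverse-probability-weighting idea, with weights for a binary mediator given exposure and the intermediate confounder; it is not the authors' simulation set-up, and the sequential g-estimator and g-computation are not shown.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
n = 50_000
X = rng.binomial(1, 0.5, n)                         # exposure (randomized here)
Z = 0.8 * X + rng.normal(size=n)                    # intermediate confounder, affected by X
M = rng.binomial(1, 1 / (1 + np.exp(-(-0.5 + 0.7 * X + 1.0 * Z))))   # binary mediator
Y = 1.0 * X + 0.8 * M + 0.6 * Z + rng.normal(size=n)
# True controlled direct effect of X (fixing M) = 1.0 + 0.6 * 0.8 = 1.48

# Ordinary regression adjusted for M and Z: blocks the X -> Z -> Y part of the direct effect
naive = sm.OLS(Y, sm.add_constant(np.column_stack([X, M, Z]))).fit()

# IPW: weight by P(M | X) / P(M | X, Z), then regress Y on X and M only
num = sm.Logit(M, sm.add_constant(X)).fit(disp=0).predict()
den = sm.Logit(M, sm.add_constant(np.column_stack([X, Z]))).fit(disp=0).predict()
sw = np.where(M == 1, num / den, (1 - num) / (1 - den))
ipw = sm.WLS(Y, sm.add_constant(np.column_stack([X, M])), weights=sw).fit()

print("True controlled direct effect: 1.48")
print("Regression adjusted for M and Z:", round(naive.params[1], 3))
print("IPW estimate:                   ", round(ipw.params[1], 3))
```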

    1. Contemp Clin Trials. 2012 Oct 11. pii: S1551-7144(12)00227-3. doi: 10.1016/j.cct.2012.09.008. [Epub ahead of print]

Competing event risk stratification may improve the design and efficiency of clinical trials: Secondary analysis of SWOG 8794. Zakeri K, Rose BS, Gulaya S, D’Amico AV, Mell LK. Department of Radiation Medicine and Applied Sciences, University of California San Diego, La Jolla, CA, United States.

BACKGROUND: Composite endpoints can be problematic in the presence of competing risks when a treatment does not affect events comprising the endpoint equally. METHODS: We conducted secondary analysis of SWOG 8794 trial of adjuvant radiation therapy (RT) for high-risk post-operative prostate cancer. The primary outcome was metastasis-free survival (MFS), defined as time to first occurrence of metastasis or death from any cause (competing mortality (CM)). We developed separate risk scores for time to metastasis and CM using competing risk regression. We estimated treatment effects using Cox models adjusted for risk scores and identified an enriched subgroup of 75 patients at high risk of metastasis and low risk of CM. RESULTS: The mean CM risk score was significantly lower in the RT arm vs. control arm (p=0.001). The effect of RT on MFS (HR 0.70; 95% CI, 0.53-0.92; p=0.010) was attenuated when controlling for metastasis and CM risk (HR 0.76; 95% CI, 0.58-1.00; p=0.049), and the effect of RT on overall survival (HR 0.73; 95% CI, 0.55-0.96; p=0.02) was no longer significant when controlling for metastasis and CM risk (HR 0.80; 95% CI, 0.60-1.06; p=0.12). Compared to the whole sample, the enriched subgroup had the same 10-year incidence of MFS (40%; 95% CI, 22-57%), but a higher incidence of metastasis (30% (95% CI, 15-47%) vs. 20% (95% CI, 15-26%)). A randomized trial in the subgroup would have achieved 80% power with 56% less patients (313 vs. 709, respectively). CONCLUSION: Stratification on competing event risk may improve the efficiency of clinical trials. Copyright © 2012. Published by Elsevier Inc.
PMID: 23063467  [PubMed - as supplied by publisher]

    1. J Clin Oncol. 2012 Oct 15. [Epub ahead of print]

Recommendations for Incorporating Patient-Reported Outcomes Into Clinical Comparative Effectiveness Research in Adult Oncology. Basch E, Abernethy AP, Mullins CD, Reeve BB, Smith ML, Coons SJ, Sloan J, Wenzel K, Chauhan C, Eppard W, Frank ES, Lipscomb J, Raymond SA, Spencer M, Tunis S. Memorial Sloan-Kettering Cancer Center, New York, NY; Duke University, Durham; University of North Carolina at Chapel Hill, Chapel Hill, NC; University of Maryland; Center for Medical Technology Policy, Baltimore,
MD; Research Advocacy Network, Plano, TX; Critical Path Institute, Tucson, AZ; Mayo Clinic; Mayo Clinic Breast Specialized Programs of Research Excellence; North Central Cancer Treatment Group Patient Advocacy Committee, Rochester, MN; Perceptive Informatics; Dana-Farber Cancer Institute; PHT Corporation, Boston, MA; and Emory University, Atlanta, GA.

Examining the patient’s subjective experience in prospective clinical comparative effectiveness research (CER) of oncology treatments or process interventions is essential for informing decision making. Patient-reported outcome (PRO) measures are the standard tools for directly eliciting the patient experience. There are currently no widely accepted standards for developing or implementing PRO measures in CER. Recommendations for the design and implementation of PRO measures in CER were developed via a standardized process including multistakeholder interviews, a technical working group, and public comments. Key recommendations are to include assessment of patient-reported symptoms as well as health-related quality of life in all prospective clinical CER studies in adult oncology; to identify symptoms relevant to a particular study population and context based on literature review and/or qualitative and quantitative methods; to assure that PRO measures used are valid, reliable, and sensitive in a comparable population (measures particularly recommended include EORTC QLQ-C30, FACT, MDASI, PRO-CTCAE, and PROMIS); to collect PRO data electronically whenever possible; to employ methods that minimize missing patient reports and include a plan for analyzing and reporting missing PRO data; to report the proportion of responders and cumulative distribution of responses in addition to mean changes in scores; and to publish results of PRO analyses simultaneously with other clinical outcomes. Twelve core symptoms are recommended for consideration in studies in advanced or metastatic cancers. Adherence to methodologic standards for the selection, implementation, and analysis/reporting of PRO measures will lead to an understanding of the patient experience that informs better decisions by patients, providers, regulators, and payers.
PMID: 23071244  [PubMed - as supplied by publisher]

    1. Ann Epidemiol. 2012 Oct 4. pii: S1047-2797(12)00363-8. doi: 10.1016/j.annepidem.2012.09.003. [Epub ahead of print]

Correcting for exposure misclassification using survival analysis with a time-varying exposure. Ahrens K, Lash TL, Louik C, Mitchell AA, Werler MM. Slone Epidemiology Center at Boston University, Boston, MA; Boston University School of Public Health, Boston, MA.

PURPOSE: Survival analysis is increasingly being used in perinatal epidemiology to assess time-varying risk factors for various pregnancy outcomes. Here we show how quantitative correction for exposure misclassification can be applied to a Cox regression model with a time-varying dichotomous exposure. METHODS: We evaluated influenza vaccination during pregnancy in relation to preterm birth among 2267 non-malformed infants whose mothers were interviewed as part of the Slone Birth Defects Study during 2006 through 2011. The hazard of preterm birth was modeled using a time-varying exposure Cox regression model with gestational age as the time-scale. The effect of exposure misclassification was then modeled using a probabilistic bias analysis that incorporated vaccination date assignment. The parameters for the bias analysis were derived from both internal and external validation data. RESULTS: Correction for misclassification of prenatal influenza vaccination resulted in an adjusted hazard ratio (AHR) slightly higher and less precise than the conventional analysis: Bias-corrected AHR 1.04 (95% simulation interval, 0.70-1.52); conventional AHR, 1.00 (95% confidence interval, 0.71-1.41). CONCLUSIONS: Probabilistic bias analysis allows epidemiologists to assess quantitatively the possible confounder-adjusted effect of misclassification of a time-varying exposure, in contrast with a speculative approach to understanding information bias. Copyright © 2012 Elsevier Inc. All rights reserved.
PMID: 23041654  [PubMed - as supplied by publisher]
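A stripped-down version of probabilistic bias analysis for nondifferential exposure misclassification is sketched below. It uses a single hypothetical 2x2 table and Beta distributions for sensitivity and specificity rather than the paper's time-varying Cox model with vaccination-date assignment, and it propagates systematic error only, so it illustrates the general idea rather than reproducing the authors' analysis.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical observed counts (exposure classified with error, nondifferentially)
a, b = 40, 960       # exposed cases, exposed non-cases (as classified)
c, d = 60, 1940      # unexposed cases, unexposed non-cases (as classified)

def corrected_rr(se, sp):
    """Back-calculate the 'true' 2x2 table from sensitivity/specificity; return the risk ratio."""
    n_cases, n_noncases = a + c, b + d
    A = (a - (1 - sp) * n_cases) / (se + sp - 1)       # corrected exposed cases
    B = (b - (1 - sp) * n_noncases) / (se + sp - 1)    # corrected exposed non-cases
    C, D = n_cases - A, n_noncases - B
    if min(A, B, C, D) <= 0:
        return np.nan                                   # bias parameters incompatible with the data
    return (A / (A + B)) / (C / (C + D))

# Monte Carlo over bias parameters drawn from (hypothetical) validation-based distributions
draws = [corrected_rr(rng.beta(80, 20), rng.beta(95, 5)) for _ in range(20_000)]
draws = np.array([x for x in draws if np.isfinite(x)])

print("Conventional RR:", round((a / (a + b)) / (c / (c + d)), 2))
print("Bias-adjusted RR, median and 95% simulation interval:",
      round(float(np.median(draws)), 2), np.percentile(draws, [2.5, 97.5]).round(2))
```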

    1. Stat Methods Med Res. 2012 Oct 14. [Epub ahead of print]

Obtaining evidence by a single well-powered trial or several modestly powered trials. Inthout J, Ioannidis JP, Borm GF. Department of Epidemiology, Biostatistics and HTA, Radboud University Nijmegen Medical Centre, Nijmegen, the Netherlands.

There is debate whether clinical trials with suboptimal power are justified and whether results from large studies are more reliable than the (combined) results of smaller trials. We quantified the error rates for evaluations based on single conventionally powered trials (80% or 90% power) versus evaluations based on the random-effects meta-analysis of a series of smaller trials. When a treatment was assumed to have no effect but heterogeneity was present, the error rates for a single trial were increased more than 10-fold above the nominal rate, even for low heterogeneity. Conversely, for meta-analyses of a series of trials, the error rates were correct. When selective publication was present, the error rates were always increased, but they still tended to be lower for a series of trials than for single trials. We conclude that evidence of efficacy based on a series of (smaller) trials may lower the error rates compared with using a single well-powered trial. Only when both heterogeneity and selective publication can be excluded is a single trial able to provide conclusive evidence.

PMID: 23070590  [PubMed - as supplied by publisher]
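The error-rate comparison described above can be approximated with a short simulation under a null average effect with between-trial heterogeneity. The trial sizes, heterogeneity SD, and the use of the DerSimonian-Laird random-effects estimator below are illustrative assumptions; the authors' scenarios (including selective publication) are broader.

```python
import numpy as np

rng = np.random.default_rng(4)

def one_trial(n_per_arm, theta):
    """Two-arm trial with a unit-variance continuous outcome; returns estimate and its variance."""
    y1 = rng.normal(theta, 1, n_per_arm)
    y0 = rng.normal(0, 1, n_per_arm)
    est = y1.mean() - y0.mean()
    var = y1.var(ddof=1) / n_per_arm + y0.var(ddof=1) / n_per_arm
    return est, var

def dl_meta_rejects(ests, variances):
    """DerSimonian-Laird random-effects pooled estimate; two-sided z-test at alpha = 0.05."""
    w = 1 / variances
    fixed = np.sum(w * ests) / np.sum(w)
    Q = np.sum(w * (ests - fixed) ** 2)
    k = len(ests)
    tau2 = max(0.0, (Q - (k - 1)) / (np.sum(w) - np.sum(w ** 2) / np.sum(w)))
    w_star = 1 / (variances + tau2)
    pooled = np.sum(w_star * ests) / np.sum(w_star)
    return abs(pooled * np.sqrt(np.sum(w_star))) > 1.96

n_sims, tau = 2000, 0.10                      # between-trial SD; average true effect is zero
reject_single = reject_meta = 0
for _ in range(n_sims):
    est, var = one_trial(800, rng.normal(0, tau))                    # one large trial
    reject_single += abs(est / np.sqrt(var)) > 1.96
    trials = [one_trial(160, rng.normal(0, tau)) for _ in range(5)]  # five smaller trials
    ests, variances = map(np.array, zip(*trials))
    reject_meta += dl_meta_rejects(ests, variances)

print("Type I error, single large trial:       ", reject_single / n_sims)
print("Type I error, random-effects meta (k=5):", reject_meta / n_sims)
```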

CER Scan [published in the last 30 days]

    1. Int J Epidemiol. 2012 Oct;41(5):1383-93. doi: 10.1093/ije/dys141.

The effect of non-differential measurement error on bias, precision and power in Mendelian randomization studies. Pierce BL, Vanderweele TJ. Department of Health Studies and Comprehensive Cancer Center, University of Chicago, Chicago, IL, USA, Department of Epidemiology, Harvard School of Public Health, Boston, MA, USA and Department of Biostatistics, Harvard School of Public Health, Boston, MA, USA.

BACKGROUND: Mendelian randomization (MR) studies assess the causality of associations between exposures and disease outcomes using data on genetic determinants of the exposure. In this work, we explore the effect of exposure and outcome measurement error in MR studies.
METHODS: For continuous traits, we describe measurement error in terms of a theoretical regression of the measured variable on the true variable. We quantify error in terms of the slope (calibration) and the R(2) values (discrimination or classical measurement error). We simulated cohort data sets under realistic parameters and used two-stage least squares regression to assess the effect of measurement error for continuous exposures and outcomes on bias, precision and power. For simulations of binary outcomes, we varied sensitivity and specificity.
RESULTS: Discrimination error in continuous exposures and outcomes did not bias the MR estimate, and only outcome discrimination error substantially reduced power. Calibration error biased the MR estimate when the exposure and the outcome measures were not calibrated in a similar fashion, but power was not affected. For binary outcomes, exposure calibration error introduced substantial bias (with negligible impact on power), but exposure discrimination error did not. Reduced outcome specificity and, to a lesser degree, reduced sensitivity biased MR estimates towards the null.
CONCLUSIONS: Understanding the potential effects of measurement error is an important consideration when interpreting estimates from MR analyses. Based on these results, future MR studies should consider methods for accounting for such error and minimizing its impact on inferences derived from MR analyses.
PMID: 23045203  [PubMed - in process]
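The paper's finding about classical (discrimination) error in a continuous exposure can be seen in a minimal two-stage least squares sketch, assuming a single genetic instrument and simulated data with an unmeasured confounder. Parameter values are arbitrary, and 2SLS is implemented by hand rather than with a dedicated instrumental-variables package: the ordinary regression is distorted by confounding and measurement error, while the MR estimate remains close to the true effect.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(5)
n = 20_000
G = rng.binomial(2, 0.3, n)                      # instrument: allele count
U = rng.normal(size=n)                           # unmeasured confounder
X_true = 0.5 * G + 0.8 * U + rng.normal(size=n)  # true exposure
Y = 0.3 * X_true + 0.8 * U + rng.normal(size=n)  # true causal effect of X on Y = 0.3
X_obs = X_true + rng.normal(size=n)              # classical (discrimination) measurement error

# Two-stage least squares with a single instrument
stage1 = sm.OLS(X_obs, sm.add_constant(G)).fit()
stage2 = sm.OLS(Y, sm.add_constant(stage1.fittedvalues)).fit()
naive = sm.OLS(Y, sm.add_constant(X_obs)).fit()

print("True effect: 0.30")
print("Ordinary regression on the mismeasured exposure:", round(naive.params[1], 3))
print("MR estimate via 2SLS:                           ", round(stage2.params[1], 3))
```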

    1. J Bone Joint Surg Am. 2012 Jul 18;94 Suppl 1:80-4.

On the prevention and analysis of missing data in randomized clinical trials: the state of the art. Scharfstein DO, Hogan J, Herman A. Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD. dscharf@jhsph.edu

We summarize and elaborate on the recently published National Research Council report entitled “The Prevention and Treatment of Missing Data in Clinical Trials.” We tailor our discussion to orthopaedic trials. In particular, we discuss the intent-to-treat principle, review study design and prevention ideas to minimize missing data, and present state-of-the-art sensitivity analysis methods for analyzing and reporting the results of studies with missing data.
PMCID: PMC3393113 [Available on 2013/7/18]
PMID: 22810454  [PubMed - indexed for MEDLINE]

September 2012

CER Scan [Epub ahead of print]

  1. Stat Methods Med Res. 2012 Sep 11. [Epub ahead of print]
    A Bayesian path analysis to estimate causal effects of bazedoxifene acetate on incidence of vertebral fractures, either directly or through non-linear changes in bone mass density. Detilleux J, Reginster JY, Chines A, Bruyere O. Department of Public Health, Epidemiology and Health Economics, CHU Sart-Tilman, University of Liege, Belgium.
    Background/Aims: Bone mass density values have been related to risk of vertebral fractures in post-menopausal women. However, bone mass density is not perfectly accurate in predicting risk of fracture, which decreases its usefulness as a surrogate in clinical trials. We propose a modeling framework with three interconnected parts to improve the evaluation of bone mass density accuracy in forecasting fractures after treatment. Methods: The modeling framework includes: (1) a piecewise regression to describe non-linear temporal BMD changes more accurately than crude percent changes, (2) a structural equation model to analyze interdependencies among vertebral fractures and their potential risk factors in preference to regression techniques that consider only directional associations, and (3) a counterfactual causal interpretation of the direct and indirect relationships between treatment and occurrence of vertebral fractures. We apply the methods to BMD repeated measurements from a study of the effect of bazedoxifene acetate on incident vertebral fractures in three different geographical regions. Results: We made four observations: (1) bone mass density changes varied largely across participants, (2) baseline age and body mass index influenced baseline bone mass density that, in turn, had an effect on prevalent fractures, (3) direct and/or indirect effects of bazedoxifene acetate on incident fractures were different across regions, and (4) estimates of indirect effects were sensitive to the presence of post-treatment unmeasured confounders. In one region, around 40% of the bazedoxifene acetate effect on the occurrence of fracture is explained by its effect on bone mass density. Under the counterfactual approach, these 40% represent the average difference in the occurrence of fracture observed for untreated individuals when their bone mass density values are set at the value under bazedoxifene acetate versus under placebo. Conclusions: Computational methods are available to evaluate and interpret the capability of a biomarker to serve as a surrogate for a primary outcome.
    PMID: 22967963  [PubMed - as supplied by publisher]
  2. Pharmacoepidemiol Drug Saf. 2012 Oct 1. [Epub ahead of print]
    The incident user design in comparative effectiveness research. Johnson ES, Bartman BA, Briesacher BA, Fleming NS, Gerhard T, Kornegay CJ, Nourjah P, Sauer B, Schumock GT, Sedrakyan A, Stürmer T, West SL, Schneeweiss S. The Center for Health Research, Kaiser Permanente, Portland, Oregon, USA.
    Comparative effectiveness research includes cohort studies and registries of interventions. When investigators design such studies, how important is it to follow patients from the day they initiated treatment with the study interventions? Our article considers this question and related issues to start a dialogue on the value of the incident user design in comparative effectiveness research. By incident user design, we mean a study that sets the cohort’s inception date according to patients’ new use of an intervention. In contrast, most epidemiologic studies enroll patients who were currently or recently using an intervention when follow-up began. We take the incident user design as a reasonable default strategy because it reduces biases that can impact non-randomized studies, especially when investigators use healthcare databases. We review case studies where investigators have explored the consequences of designing a cohort study by restricting to incident users, but most of the discussion has been informed by expert opinion, not by systematic evidence. Published 2012. This article is a U.S. Government work and is in the public domain in the USA.
    PMID: 23023988  [PubMed - as supplied by publisher]
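As a concrete (and entirely hypothetical) illustration of the design discussed above, the sketch below assembles incident users from dispensing claims by taking each patient's first observed fill of a study drug and requiring a preceding washout window with no use. Column names, drug labels, dates, and the 365-day washout are invented; real implementations also check continuous enrollment and drug-specific prior use.

```python
import pandas as pd

# Hypothetical dispensing claims: one row per pharmacy fill
claims = pd.DataFrame({
    "patient_id":   [1, 1, 2, 2, 3, 3],
    "drug":         ["statin", "statin", "statin", "statin", "ezetimibe_simva", "statin"],
    "fill_date":    pd.to_datetime(["2006-03-01", "2006-09-01", "2006-02-01",
                                    "2006-08-01", "2006-05-10", "2006-11-10"]),
    "enroll_start": pd.to_datetime(["2004-01-01", "2004-01-01", "2005-12-01",
                                    "2005-12-01", "2004-01-01", "2004-01-01"]),
})

WASHOUT_DAYS = 365
study_drugs = {"statin", "ezetimibe_simva"}

# First observed fill of any study drug per patient sets the candidate cohort entry date
first_fill = (claims[claims["drug"].isin(study_drugs)]
              .sort_values("fill_date")
              .groupby("patient_id", as_index=False)
              .first())

# Incident users: enough observable history before the first fill to rule out recent prior use
incident = first_fill[
    (first_fill["fill_date"] - first_fill["enroll_start"]).dt.days >= WASHOUT_DAYS
]
print(incident[["patient_id", "drug", "fill_date"]])   # patient 2 is excluded (insufficient washout)
```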

CER Scan [published in the last 30 days]

  1. BMC Med Res Methodol. 2012 Sep 25;12(1):149. [Epub ahead of print]
    Investigating linkage rates among probabilistically linked birth and hospitalization records. Bentley JP, Ford JB, Taylor LK, Irvine KA, Roberts CL.
    BACKGROUND: With the increasing use of probabilistically linked administrative data in health research, it is important to understand whether systematic differences occur between the populations with linked and unlinked records. While probabilistic linkage involves combining records for individuals, population perinatal health research requires a combination of information from both the mother and her infant(s). The aims of this study were to (i) describe probabilistic linkage for perinatal records in New South Wales (NSW) Australia, (ii) determine linkage proportions for these perinatal records, and (iii) assess records with linked mother and infant hospital-birth records, and unlinked records, for systematic differences. METHODS: This is a population-based study of probabilistically linked statutory birth and hospital records from New South Wales, Australia, 2001-2008. Linkage groups were created where the birth record had complete linkage with hospital admission records for both the mother and infant(s), partial linkage (the mother only or the infant(s) only) or neither. Unlinked hospital records for mothers and infants were also examined. Rates of linkage as a percentage of birth records and descriptive statistics for maternal and infant characteristics by linkage groups were determined. RESULTS: Complete linkage (mother hospital record – birth record – infant hospital record) was available for 95.9% of birth records, partial linkage for 3.6%, and 0.5% had no linked hospital records (unlinked). Among live born singletons (complete linkage = 96.5%), the mothers without linked infant records (1.6%) had slightly higher proportions of young, non-Australian born, socially disadvantaged women with adverse pregnancy outcomes. The unlinked birth records (0.5%) had slightly higher proportions of nulliparous, older, Australian born women giving birth in private hospitals by caesarean section. Stillbirths had the highest rate of unlinked records (3-4%). CONCLUSIONS: This study shows that probabilistic linkage of perinatal records can achieve high, representative levels of complete linkage. Records for mothers that did not link to infant records and unlinked records had slightly different characteristics to fully linked records. However, these groups were small and unlikely to bias results and conclusions in a substantive way. Stillbirths present additional challenges to the linkage process due to lower rates of linkage for lower gestational ages, where most stillbirths occur.
    PMID: 23009079  [PubMed - as supplied by publisher]
  2. N Engl J Med. 2012 Sep 20;367(12):1119-27.
    A randomized study of how physicians interpret research funding disclosures. Kesselheim AS, Robertson CT, Myers JA, Rose SL, Gillet V, Ross KM, Glynn RJ, Joffe S, Avorn J. Division of Pharmacoepidemiology and Pharmacoeconomics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA. akesselheim@partners.org
    Comment in: N Engl J Med. 2012 Sep 20;367(12):1152-3.
    BACKGROUND: The effects of clinical-trial funding on the interpretation of trial results are poorly understood. We examined how such support affects physicians’ reactions to trials with a high, medium, or low level of methodologic rigor. METHODS: We presented 503 board-certified internists with abstracts that we designed describing clinical trials of three hypothetical drugs. The trials had high, medium, or low methodologic rigor, and each report included one of three support disclosures: funding from a pharmaceutical company, NIH funding, or none. For both factors studied (rigor and funding), one of the three possible variations was randomly selected for inclusion in the abstracts. Follow-up questions assessed the physicians’ impressions of the trials’ rigor, their confidence in the results, and their willingness to prescribe the drugs. RESULTS: The 269 respondents (53.5% response rate) perceived the level of study rigor accurately. Physicians reported that they would be less willing to prescribe drugs tested in low-rigor trials than those tested in medium-rigor trials (odds ratio, 0.64; 95% confidence interval [CI], 0.46 to 0.89; P=0.008) and would be more willing to prescribe drugs tested in high-rigor trials than those tested in medium-rigor trials (odds ratio, 3.07; 95% CI, 2.18 to 4.32; P<0.001). Disclosure of industry funding, as compared with no disclosure of funding, led physicians to downgrade the rigor of a trial (odds ratio, 0.63; 95% CI, 0.46 to 0.87; P=0.006), their confidence in the results (odds ratio, 0.71; 95% CI, 0.51 to 0.98; P=0.04), and their willingness to prescribe the hypothetical drugs (odds ratio, 0.68; 95% CI, 0.49 to 0.94; P=0.02). Physicians were half as willing to prescribe drugs studied in industry-funded trials as they were to prescribe drugs studied in NIH-funded trials (odds ratio, 0.52; 95% CI, 0.37 to 0.71; P<0.001). These effects were consistent across all levels of methodologic rigor. CONCLUSIONS: Physicians discriminate among trials of varying degrees of rigor, but industry sponsorship negatively influences their perception of methodologic quality and reduces their willingness to believe and act on trial findings, independently of the trial’s quality. These effects may influence the translation of clinical research into practice.
    PMID: 22992075  [PubMed - indexed for MEDLINE]
  3. Circ Cardiovasc Qual Outcomes. 2012 Sep 1;5(5):e61-4.
    Bias in comparative effectiveness studies due to regional variation in medical practice intensity: a legitimate concern, or much ado about nothing? Huybrechts KF, Seeger JD, Rothman KJ, Glynn RJ, Avorn J, Schneeweiss S. Division of Pharmacoepidemiology and Pharmacoeconomics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA.
    PMID: 22991351  [PubMed - in process]

Mini-Theme: Dealing with Missing Data

  1. Biometrics. 2012 Sep 17. [Epub ahead of print]
    Compliance Mixture Modelling with a Zero-Effect Complier Class and Missing Data. Sobel ME, Muthen B. Department of Statistics, Columbia University, New York, NY; Professor Emeritus, University of California, Los Angeles, CA.
    Randomized experiments are the gold standard for evaluating proposed treatments. The intent to treat estimand measures the effect of treatment assignment, but not the effect of treatment if subjects take treatments to which they are not assigned. The desire to estimate the efficacy of the treatment in this case has been the impetus for a substantial literature on compliance over the last 15 years. In papers dealing with this issue, it is typically assumed there are different types of subjects, for example, those who will follow treatment assignment (compliers), and those who will always take a particular treatment irrespective of treatment assignment. The estimands of primary interest are the complier proportion and the complier average treatment effect (CACE). To estimate CACE, researchers have used various methods, for example, instrumental variables and parametric mixture models, treating compliers as a single class. However, it is often unreasonable to believe all compliers will be affected. This article therefore treats compliers as a mixture of two types, those belonging to a zero-effect class, others to an effect class. Second, in most experiments, some subjects drop out or simply do not report the value of the outcome variable, and the failure to take into account missing data can lead to biased estimates of treatment effects. Recent work on compliance in randomized experiments has addressed this issue by assuming missing data are missing at random or latently ignorable. We extend this work to the case where compliers are a mixture of types and also examine alternative types of nonignorable missing data assumptions. © 2012, International Biometric Society.
    PMID: 22985224  [PubMed - as supplied by publisher]
  2. N Engl J Med. 2012 Oct 4;367(14):1355-1360.
    The Prevention and Treatment of Missing Data in Clinical Trials. Little RJ, D’Agostino R, Cohen ML, Dickersin K, Emerson SS, Farrar JT, Frangakis C, Hogan JW, Molenberghs G, Murphy SA, Neaton JD, Rotnitzky A, Scharfstein D, Shih WJ, Siegel JP, Stern H.
    Missing data in clinical trials can have a major effect on the validity of the inferences that can be drawn from the trial. This article reviews methods for preventing missing data and, failing that, dealing with data that are missing.
    PMID: 23034025  [PubMed - as supplied by publisher]
  3. N Engl J Med. 2012 Oct 4;367(14):1353-4. doi: 10.1056/NEJMsm1210043.
    Missing data. Ware JH, Harrington D, Hunter DJ, D’Agostino RB Sr.
    The statistical consultants to the Journal provide guidance on missing data in reports submitted for publication.
    PMID: 23034024  [PubMed - in process]

Theme: Targeted Maximum Likelihood Estimation (TMLE)

  1. Book: M.J. van der Laan, S. Rose, Targeted Learning, Causal Inference for Observational and Experimental Data, Springer, New York, 2012.
  2. M.J. van der Laan, D. Rubin (2006). Targeted Maximum Likelihood Learning. The International Journal of Biostatistics, http://www.bepress.com/ijb/vol2/iss1/11.
    Comprehensive introduction to targeted maximum likelihood estimation: theoretical development and procedures for estimating causal effects, variable importance, and other parameters of interest.
  3. M.J. van der Laan, E.C. Polley, A.E. Hubbard (2007). Super Learner. Statistical Applications in Genetics and Molecular Biology, http://www.bepress.com/sagmb/vol6/iss1/art25.
    Proposes an algorithm for constructing a super learner, which uses cross-validation to select weights to combine an initial set of candidate estimators.
  4. O. Bembom, M.L. Petersen, S.-Y. Rhee, W.J. Fessel, S.E. Sinisi, R.W. Shafer, M.J. van der Laan (2008). Biomarker discovery using targeted maximum likelihood estimation: Application to the treatment of antiretroviral resistant HIV infection. Statistics in Medicine, http://www3.interscience.wiley.com/journal/121422393/abstract.
    Discusses and implements targeted maximum likelihood estimation of variable importance to rank a set of candidate biomarkers.
  5. Gruber, Susan and van der Laan, Mark J., “tmle: An R Package for Targeted Maximum Likelihood Estimation” (February 2011). U.C. Berkeley Division of Biostatistics Working Paper Series. Working Paper 275.
    http://biostats.bepress.com/ucbbiostat/paper275
  6. van der Laan, Mark J. and Gruber, Susan, “Targeted Minimum Loss Based Estimation of an Intervention Specific Mean Outcome” (August 2011). U.C. Berkeley Division of Biostatistics Working Paper Series. Working Paper 290.
    http://biostats.bepress.com/ucbbiostat/paper290, to appear in IJB.
  7. van der Laan, Mark J.; Rose, Sherri; and Gruber, Susan, “Readings in Targeted Maximum Likelihood Estimation” (September 2009). U.C. Berkeley Division of Biostatistics Working Paper Series. Working Paper 254.
    http://biostats.bepress.com/ucbbiostat/paper254
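For orientation, the following is a minimal sketch of the basic TMLE recipe for the average treatment effect of a binary point treatment on a binary outcome: an initial outcome regression, a propensity score, a one-dimensional fluctuation through the "clever covariate", and a targeted plug-in estimate. It assumes simple parametric working models on simulated data; the readings above use Super Learner, cover longitudinal and more general settings, and provide influence-curve-based inference (or the tmle R package) for standard errors.

```python
import numpy as np
import statsmodels.api as sm
from sklearn.linear_model import LogisticRegression

expit = lambda x: 1 / (1 + np.exp(-x))
logit = lambda p: np.log(p / (1 - p))

rng = np.random.default_rng(6)
n = 5000
W = rng.normal(size=(n, 2))
A = rng.binomial(1, expit(0.4 * W[:, 0] - 0.5 * W[:, 1]))
Y = rng.binomial(1, expit(-1 + A + 0.8 * W[:, 0] + 0.6 * W[:, 1]))

# 1) Initial outcome regression Q(A, W)
XQ = np.column_stack([A, W])
Qfit = LogisticRegression(C=1e6, max_iter=1000).fit(XQ, Y)
Q_AW = Qfit.predict_proba(XQ)[:, 1]
Q_1W = Qfit.predict_proba(np.column_stack([np.ones(n), W]))[:, 1]
Q_0W = Qfit.predict_proba(np.column_stack([np.zeros(n), W]))[:, 1]

# 2) Propensity score g(W) = P(A = 1 | W), bounded away from 0 and 1
g1 = np.clip(LogisticRegression(C=1e6, max_iter=1000).fit(W, A).predict_proba(W)[:, 1], 0.01, 0.99)

# 3) Fluctuation: logistic regression of Y on the clever covariate with offset logit(Q)
H_AW = A / g1 - (1 - A) / (1 - g1)
eps = sm.GLM(Y, H_AW.reshape(-1, 1), family=sm.families.Binomial(),
             offset=logit(Q_AW)).fit().params[0]

# 4) Targeted update of the counterfactual predictions and plug-in ATE
Q1_star = expit(logit(Q_1W) + eps / g1)
Q0_star = expit(logit(Q_0W) - eps / (1 - g1))
print("TMLE estimate of the average treatment effect:", round(float(np.mean(Q1_star - Q0_star)), 3))
```

In practice the initial estimators in steps 1 and 2 would be replaced by Super Learner fits, as the references above recommend.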

 

August 2012


CER Scan [Epub ahead of print]

    1. Med Care. 2012 Aug 23. [Epub ahead of print]

The Use of Patient-reported Outcomes (PRO) Within Comparative Effectiveness Research: Implications for Clinical Practice and Health Care Policy. Ahmed S, Berzon RA, Revicki DA, Lenderking WR, Moinpour CM, Basch E, Reeve BB, Wu AW; on behalf of the International Society for Quality of Life Research. School of Physical and Occupational Therapy, McGill University, Clinical Epidemiology, McGill University Health Center, Centre de recherche interdisciplinaire en réadaptation, Montreal, QC; National Institute on Minority Health and Health Disparities, National Institutes of Health, Bethesda, MD; Health Outcomes Research, United BioSource Corporation, Bethesda, MD; Center for Health Outcomes Research, United BioSource Corporation, Lexington, MA; Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA; Memorial Sloan-Kettering Cancer Center, New York, NY; Lineberger Comprehensive Cancer Center & Department of Health Policy and Management, Gillings School of Global Public Health, University of North Carolina, Chapel Hill, NC; Johns Hopkins Bloomberg School of Public Health, Baltimore, MD.

BACKGROUND: The goal of comparative effectiveness research (CER) is to explain the differential benefits and harms of alternate methods to prevent, diagnose, treat, and monitor a clinical condition or to improve the delivery of care. To inform decision making, information from the patient’s perspective that reflects outcomes that patients care about is needed and can be collected rigorously using appropriate patient-reported outcomes (PRO). It can be challenging to select the most appropriate PRO measure given the proliferation of such questionnaires over the past 20 years. OBJECTIVE: In this paper, we discuss the value of PROs within CER, types of measures that are likely to be useful in the CER context, PRO instrument selection, and key challenges associated with using PROs in CER.
METHODS: We delineate important considerations for defining the CER context, selecting the appropriate measures, and for the analysis and interpretation of PRO data. Emerging changes that may facilitate CER using PROs as an outcome are also reviewed including implementation of electronic and personal health records, hospital and population-based registries, and the use of PROs in national monitoring initiatives. The potential benefits of linking the information derived from PRO endpoints in CER to decision making is also reviewed. CONCLUSIONS: The recommendations presented for incorporating PROs in CER are intended to provide a guide to researchers, clinicians, and policy makers to ensure that information derived from PROs is applicable and interpretable for a given CER context. In turn, CER will provide information that is necessary for clinicians, patients, and families to make informed care decisions.

PMID: 22922434  [PubMed - as supplied by publisher]

    1. Clin Trials. 2012 Aug 22. [Epub ahead of print]

Utilizing the integrated difference of two survival functions to quantify the treatment contrast for designing, monitoring, and analyzing a comparative clinical study. Zhao L, Tian L, Uno H, Solomon SD, Pfeffer MA, Schindler JS, Wei LJ. Department of Preventive Medicine, Northwestern University, Chicago, IL, USA.

BACKGROUND: Consider a comparative, randomized clinical study with a specific event time as the primary end point. In the presence of censoring, standard methods of summarizing the treatment difference are based on Kaplan-Meier curves, the logrank test, and the point and interval estimates via Cox’s procedure. Moreover, for designing and monitoring the study, one usually utilizes an event-driven scheme to determine the sample sizes and interim analysis time points. PURPOSE: When the proportional hazards (PHs) assumption is violated, the logrank test may not have sufficient power to detect the difference between two event time distributions. The resulting hazard ratio estimate is difficult, if not impossible, to interpret as a treatment contrast. When the event rates are low, the corresponding interval estimate for the ‘hazard ratio’ can be quite large due to the fact that the interval length depends on the observed numbers of events. This may indicate that there is not enough information for making inferences about the treatment comparison even when there is no difference between two groups. This situation is quite common for a postmarketing safety study. We need an alternative way to quantify the group difference. METHODS: Instead of quantifying the treatment group difference using the hazard ratio, we consider an easily interpretable and model-free parameter, the integrated survival rate difference over a prespecified time interval, as an alternative. We present the inference procedures for such a treatment contrast. This approach is purely nonparametric and does not need any model assumption such as the PHs. Moreover, when we deal with equivalence or noninferiority studies and the event rates are low, our procedure would provide more information about the treatment difference. We used a cardiovascular trial data set to illustrate our approach. RESULTS: The results using the integrated event rate differences have a heuristic interpretation for the treatment difference even when the PHs assumption is not valid. When the event rates are low, for example, for the cardiovascular study discussed in this article, the procedure for the integrated event rate difference provides tight interval estimates in contrast to those based on the event-driven inference method. LIMITATIONS: The design of a trial with the integrated event rate difference may be more complicated than that using the event-driven procedure. One may use simulation to determine the sample size and the estimated duration of the study. CONCLUSIONS: The procedure discussed in this article can be a useful alternative to the standard PHs method in the survival analysis. Clinical Trials 2012; 0: 1-8.

PMID: 22914867  [PubMed - as supplied by publisher]
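The integrated survival-difference contrast discussed above is essentially the difference in area under the two Kaplan-Meier curves up to a prespecified time tau (the restricted-mean-survival-time difference), which can be computed directly from the estimated curves. The sketch below uses simulated exponential data and a hand-rolled step-function integral; the paper's variance estimation and design calculations are not reproduced.

```python
import numpy as np
from lifelines import KaplanMeierFitter

def km_area(kmf, tau):
    """Area under a Kaplan-Meier step function on [0, tau]."""
    sf = kmf.survival_function_.reset_index()
    sf.columns = ["t", "S"]
    sf = sf[sf["t"] <= tau]
    times = np.append(sf["t"].values, tau)
    return float(np.sum(sf["S"].values * np.diff(times)))

rng = np.random.default_rng(7)
n, tau = 400, 12.0
arm = rng.binomial(1, 0.5, n)
T = rng.exponential(scale=np.where(arm == 1, 14.0, 10.0), size=n)   # event times
C = rng.exponential(scale=12.0, size=n)                              # censoring times
obs_T, E = np.minimum(T, C), (T <= C).astype(int)

kmf1 = KaplanMeierFitter().fit(obs_T[arm == 1], E[arm == 1])
kmf0 = KaplanMeierFitter().fit(obs_T[arm == 0], E[arm == 0])

diff = km_area(kmf1, tau) - km_area(kmf0, tau)
print(f"Integrated survival difference over [0, {tau:g}]: {diff:.2f} time units")
```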

CER Scan [published in the last 30 days]

  1. JAMA. 2012 Aug 22;308(8):773-4.
    The value of statistical analysis plans in observational research: defining high-quality research from the start. Thomas L, Peterson ED. Department of Biostatistics and Bioinformatics, Duke University School of Medicine, Durham, North Carolina, USA.
    PMID: 22910753  [PubMed - indexed for MEDLINE]
  2. BMC Med Res Methodol. 2012 Aug 6;12(1):119. [Epub ahead of print]
    Comparing marginal structural models to standard methods for estimating treatment effects of antihypertensive combination therapy. Gerhard T, Delaney JA, Cooper-Dehoff RM, Shuster J, Brumback BA, Johnson JA, Pepine CJ, Winterstein AG.
    INTRODUCTION: Due to time-dependent confounding by blood pressure and differential loss to follow-up, it is difficult to estimate the effectiveness of aggressive versus conventional antihypertensive combination therapies in non-randomized comparisons. METHODS: We utilized data from 22,576 hypertensive coronary artery disease patients, prospectively enrolled in the International VErapamil-Trandolapril STudy (INVEST). Our post-hoc analyses did not consider the randomized treatment strategies, but instead defined exposure time-dependently as aggressive treatment ([greater than or equal to]3 concomitantly used antihypertensive medications) versus conventional treatment ([less than or equal to]2 concomitantly used antihypertensive medications). Study outcome was defined as time to first serious cardiovascular event (non-fatal myocardial infarction, non-fatal stroke, or all-cause death). We compared hazard ratio (HR) estimates for aggressive vs. conventional treatment from a Marginal Structural Cox Model (MSCM) to estimates from a standard Cox model. Both models included exposure to antihypertensive treatment at each follow-up visit, demographics, and baseline cardiovascular risk factors, including blood pressure. The MSCM further adjusted for systolic blood pressure at each follow-up visit, through inverse probability of treatment weights. RESULTS: 2,269 (10.1%) patients experienced a cardiovascular event over a total follow-up of 60,939 person-years. The HR for aggressive treatment estimated by the standard Cox model was 0.96 (95% confidence interval 0.87-1.07). The equivalent MSCM, which was able to account for changes in systolic blood pressure during follow-up, estimated a HR of 0.81 (95% CI 0.71-0.92). CONCLUSIONS: Using a MSCM, aggressive treatment was associated with a lower risk for serious cardiovascular outcomes compared to conventional treatment. In contrast, a standard Cox model estimated similar risks for aggressive and conventional treatments.
    PMID: 22866767  [PubMed - as supplied by publisher]
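A compact illustration of the weighting mechanics described above: stabilized inverse-probability-of-treatment weights are built visit by visit and then used in a weighted pooled logistic regression as a discrete-time stand-in for a marginal structural Cox model. The data-generating process, variable names, and effect sizes are invented, the output is meant to show the mechanics rather than to demonstrate bias, and a real analysis would use robust (cluster) standard errors and richer treatment and censoring models.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

rng = np.random.default_rng(8)
rows = []
for i in range(2000):                       # patients
    a_prev = 0
    for k in range(5):                      # follow-up visits
        L = rng.normal(1.0 - 0.5 * a_prev)                                   # time-varying confounder
        a = rng.binomial(1, 1 / (1 + np.exp(-(-0.5 + 1.0 * L + 1.5 * a_prev))))
        y = rng.binomial(1, 1 / (1 + np.exp(-(-4.0 + 0.8 * L - 0.7 * a))))
        rows.append(dict(id=i, visit=k, L=L, a_prev=a_prev, a=a, y=y))
        if y == 1:
            break
        a_prev = a
df = pd.DataFrame(rows)

# Stabilized weights: P(A_k | past treatment) / P(A_k | past treatment, time-varying confounder)
num = smf.logit("a ~ a_prev", data=df).fit(disp=0).predict(df)
den = smf.logit("a ~ a_prev + L", data=df).fit(disp=0).predict(df)
df["w"] = np.where(df["a"] == 1, num / den, (1 - num) / (1 - den))
df["sw"] = df.groupby("id")["w"].cumprod()

# Weighted pooled logistic regression (MSM); unweighted covariate-adjusted model for comparison
msm = smf.glm("y ~ a", data=df, family=sm.families.Binomial(), var_weights=df["sw"]).fit()
conventional = smf.logit("y ~ a + L", data=df).fit(disp=0)
print("MSM (IPTW) odds ratio:           ", round(float(np.exp(msm.params["a"])), 2))
print("Conventional adjusted odds ratio:", round(float(np.exp(conventional.params["a"])), 2))
```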

July 2012


CER Scan [Epub ahead of print]

    1. Ann Epidemiol. 2012 Aug 2. [Epub ahead of print]

Accounting for context in studies of health inequalities: a review and comparison of analytic approaches
Schempf AH, Kaufman JS.
Office of Epidemiology, Policy & Evaluation, Maternal and Child Health Bureau, Health Resources and Services Administration, Rockville, MD 20857.

BACKGROUND: A common epidemiologic objective is to evaluate the contribution of residential context to individual-level disparities by race or socioeconomic position. PURPOSE: We reviewed analytic strategies to account for the total (observed and unobserved factors) contribution of environmental context to health inequalities, including conventional fixed effects (FE) and hybrid FE implemented within a random effects (RE) or a marginal model. METHODS: To illustrate results and limitations of the various analytic approaches of accounting for the total contextual component of health disparities, we used data on births nested within neighborhoods as an applied example of evaluating neighborhood confounding of racial disparities in gestational age at birth, including both a continuous and a binary outcome. RESULTS: Ordinary and RE models provided disparity estimates that can be substantially biased in the presence of neighborhood confounding. Both FE and hybrid FE models can account for cluster level confounding and provide disparity estimates unconfounded by neighborhood, with the latter having greater flexibility in allowing estimation of neighborhood-level effects and intercept/slope variability when implemented in a RE specification. CONCLUSIONS: Given the range of models that can be implemented in a hybrid approach and the frequent goal of accounting for contextual confounding, this approach should be used more often. Copyright © 2012 Elsevier Inc. All rights reserved.

PMID: 22858050 [PubMed - as supplied by publisher]
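The hybrid (within-between) specification discussed above boils down to splitting an individual-level exposure into its cluster mean and the within-cluster deviation and entering both in a random-effects model, so that the deviation coefficient is free of neighborhood-level confounding. The sketch below uses simulated neighborhoods with an unobserved contextual confounder; variable names and effect sizes are arbitrary assumptions.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(9)
G, m = 200, 30                                   # neighborhoods, births per neighborhood
nbhd = np.repeat(np.arange(G), m)
u = rng.normal(size=G)                           # unobserved neighborhood-level confounder
black = rng.binomial(1, 1 / (1 + np.exp(-u[nbhd])))                 # exposure clusters by neighborhood
ga = 39.0 - 0.3 * black + 0.8 * u[nbhd] + rng.normal(size=G * m)    # true within-disparity: -0.3 weeks
df = pd.DataFrame(dict(nbhd=nbhd, black=black, ga=ga))

# Hybrid specification: cluster mean plus within-cluster deviation, with a neighborhood random effect
df["black_mean"] = df.groupby("nbhd")["black"].transform("mean")
df["black_dev"] = df["black"] - df["black_mean"]
hybrid = smf.mixedlm("ga ~ black_dev + black_mean", data=df, groups=df["nbhd"]).fit()
naive = smf.ols("ga ~ black", data=df).fit()

print("Ordinary (confounded) disparity estimate:    ", round(float(naive.params["black"]), 2))
print("Within-neighborhood disparity (hybrid model):", round(float(hybrid.params["black_dev"]), 2))
```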

    1. Stat Med. 2012 Jul 17. doi: 10.1002/sim.5508. [Epub ahead of print]

The analysis of record-linked data using multiple imputation with data value priors
Goldstein H, Harron K, Wade A.
Medical Research Council Centre of Epidemiology for Child health, University College London Institute of Child health, London, WC1N 1EH, U.K.; Centre for Multilevel Modelling, Graduate School of Education, University of Bristol, BS8 1JA, Bristol, U.K.

Probabilistic record linkage techniques assign match weights to one or more potential matches for those individual records that cannot be assigned ‘unequivocal matches’ across data files. Existing methods select the single record having the maximum weight provided that this weight is higher than an assigned threshold. We argue that this procedure, which ignores all information from matches with lower weights and for some individuals assigns no match, is inefficient and may also lead to biases in subsequent analysis of the linked data. We propose that a multiple imputation framework be utilised for data that belong to records that cannot be matched unequivocally. In this way, the information from all potential matches is transferred through to the analysis stage. This procedure allows for the propagation of matching uncertainty through a full modelling process that preserves the data structure. For purposes of statistical modelling, results from a simulation example suggest that a full probabilistic record linkage is unnecessary and that standard multiple imputation will provide unbiased and efficient parameter estimates. Copyright © 2012 John Wiley & Sons, Ltd.

PMID: 22807145 [PubMed - as supplied by publisher]

    1. Stat Med. 2012 Jul 16. doi: 10.1002/sim.5498. [Epub ahead of print]

Methods for analyzing data from probabilistic linkage strategies based on partially identifying variables.
Hof MH, Zwinderman AH.
Department of Clinical Epidemiology, Biostatistics, and Bioinformatics, Academic Medical Center, University of Amsterdam, 1105 AZ Amsterdam, The Netherlands. m.h.hof@amc.uva.nl.

In record linkage studies, unique identifiers are often not available, and therefore, the linkage procedure depends on combinations of partially identifying variables with low discriminating power. As a consequence, wrongly linked covariate and outcome pairs will be created and bias further analysis of the linked data. In this article, we investigated two estimators that correct for linkage error in regression analysis. We extended the estimators developed by Lahiri and Larsen and also suggested a weighted least squares approach to deal with linkage error. We considered both linear and logistic regression problems and evaluated the performance of both methods with simulations. Our results show that all wrong covariate and outcome pairs need to be removed from the analysis in order to calculate unbiased regression coefficients in both approaches. This removal requires strong assumptions on the structure of the data. In addition, the bias significantly increases when the assumptions do not hold and wrongly linked records influence the coefficient estimation. Our simulations showed that both methods had similar performance in linear regression problems. With logistic regression problems, the weighted least squares method showed less bias. Because the specific structure of the data in record linkage problems often leads to different assumptions, it is necessary that the analyst has prior knowledge on the nature of the data. These assumptions are more easily introduced in the weighted least squares approach than in the Lahiri and Larsen estimator. Copyright © 2012 John Wiley & Sons, Ltd.

PMID: 22807060 [PubMed - as supplied by publisher]

  1. Stat Med. 2012 Jul 16. doi: 10.1002/sim.5482. [Epub ahead of print]

Comparative effectiveness research: does one size fit all?
Kunz LM, Yeh RW, Normand SL.
Department of Biostatistics, Harvard School of Public Health, Boston, MA, U.S.A.

In this commentary, we argue that although randomization has many benefits, not all questions we seek to answer fit into a randomized setting. Our argument utilizes the clinical setting of carotid atherosclerosis management where specific clinical questions are answered by using a variety of comparative effectiveness designs. Observational studies should not be ruled out when designing studies to address questions of comparative effectiveness. Copyright © 2012 John Wiley & Sons, Ltd.

PMID: 22806612 [PubMed - as supplied by publisher]

  1. Biostatistics. 2012 Jul 12. [Epub ahead of print]

Targeted maximum likelihood estimation for marginal time-dependent treatment effects under density misspecification.
Schnitzer ME, Moodie EE, Platt RW.
Department of Epidemiology, Biostatistics, & Occupational Health, McGill University, Montreal, QC, Canada

Targeted maximum likelihood methods have been proposed to estimate treatment effects for longitudinal data in the presence of time-dependent confounders. This class of methods has been mathematically proven to be doubly robust and to optimize the asymptotic estimating efficiency among the class of regular, semi-parametric estimators when all estimated density components are correctly specified. We show that methods previously proposed to build a one-step estimator with a logistic loss function generalize to a generalized linear loss function, and so may be applied naturally to an outcome that can be described by any exponential family member. We evaluate several methods for estimating unstructured marginal treatment effects for data with two time intervals in a simulation study, showing that these estimators have competitively low bias and variance in an array of misspecified situations, and can be made to perform well under near-positivity violations. We apply the methods to the PROmotion of Breastfeeding Intervention Trial data, demonstrating that longer term breastfeeding can protect infants from gastrointestinal infection.

PMID: 22797173 [PubMed - as supplied by publisher]

    1. Pharmacoepidemiol Drug Saf. 2012 Jul 3. doi: 10.1002/pds.3319. [Epub ahead of print]

Validity of health plan and birth certificate data for pregnancy research.
Andrade SE, Scott PE, Davis RL, Li DK, Getahun D, Cheetham TC, Raebel MA, Toh S, Dublin S, Pawloski PA, Hammad TA, Beaton SJ, Smith DH, Dashevsky I, Haffenreffer K, Cooper WO.
Meyers Primary Care Institute and University of Massachusetts Medical School, Worcester, MA, USA. sandrade@meyersprimary.org.

PURPOSE: To evaluate the validity of health plan and birth certificate data for pregnancy research. METHODS: A retrospective study was conducted using administrative and claims data from 11 U.S. health plans and corresponding birth certificate data from state health departments. Diagnoses, drug dispensings, and procedure codes were used to identify infant outcomes (cardiac defects, anencephaly, preterm birth, and neonatal intensive care unit [NICU] admission) and maternal diagnoses (asthma and systemic lupus erythematosus [SLE]) recorded in the health plan data for live born deliveries between January 2001 and December 2007. A random sample of medical charts (n = 802) was abstracted for infants and mothers identified with the specified outcomes. Information on newborn, maternal, and paternal characteristics (gestational age at birth, birth weight, previous pregnancies and live births, race/ethnicity) was also abstracted and compared to birth certificate data. Positive predictive values (PPVs) were calculated with documentation in the medical chart serving as the gold standard.

RESULTS: PPVs were 71% for cardiac defects, 37% for anencephaly, 87% for preterm birth, and 92% for NICU admission. PPVs for algorithms to identify maternal diagnoses of asthma and SLE were ≥ 93%. Our findings indicated considerable agreement (PPVs > 90%) between birth certificate and medical record data for measures related to birth weight, gestational age, prior obstetrical history, and race/ethnicity. CONCLUSIONS: Health plan and birth certificate data can be useful to accurately identify some infant outcomes, maternal diagnoses, and newborn, maternal, and paternal characteristics. Other outcomes and variables may require medical record review for validation. Copyright © 2012 John Wiley & Sons, Ltd.

PMID: 22753079 [PubMed - as supplied by publisher]

    1. Clin Trials. 2012 Jul 2. [Epub ahead of print]

The role for pragmatic randomized controlled trials (pRCTs) in comparative effectiveness research
Chalkidou K, Tunis S, Whicher D, Fowler R, Zwarenstein M.
National Institute for Health and Clinical Excellence, London, UK.

There is a growing appreciation that our current approach to clinical research leaves important gaps in evidence from the perspective of patients, clinicians, and payers wishing to make evidence-based clinical and health policy decisions. This has been a major driver in the rapid increase in interest in comparative effectiveness research (CER), which aims to compare the benefits, risks, and sometimes costs of alternative health-care interventions in ‘the real world’. While a broad range of experimental and nonexperimental methods will be used in conducting CER studies, many important questions are likely to require experimental approaches – that is, randomized controlled trials (RCTs). Concerns about the generalizability, feasibility, and cost of RCTs have been frequently articulated in CER method discussions. Pragmatic RCTs (or ‘pRCTs’) are intended to maintain the internal validity of RCTs while being designed and implemented in ways that would better address the demand for evidence about real-world risks and benefits for informing clinical and health policy decisions. While the level of interest and activity in conducting pRCTs is increasing, many challenges remain for their routine use. This article discusses those challenges and offers some potential ways forward. Clinical Trials 2012; XX: 1 -11. http://ctj.sagepub.com.

PMID: 22752634 [PubMed - as supplied by publisher]

    1. Clin Microbiol Infect. 2012 Jun 14. [Epub ahead of print]

Methods to assess seasonal effects in epidemiological studies of infectious diseases-exemplified by application to the occurrence of meningococcal disease.
Christiansen CF, Pedersen L, Sørensen HT, Rothman KJ.
Department of Clinical Epidemiology, Aarhus University Hospital, Aarhus, Denmark  RTI Health Solutions, Research Triangle Park, NC, USA.

Seasonal variation in occurrence is a common feature of many diseases, especially those of infectious origin. Studies of seasonal variation contribute to healthcare planning and to the understanding of the aetiology of infections. In this article, we provide an overview of statistical methods for the assessment and quantification of seasonality of infectious diseases, as exemplified by their application to meningococcal disease in Denmark in 1995-2011. Additionally, we discuss the conditions under which seasonality should be considered as a covariate in studies of infectious diseases. The methods considered range from the simplest comparison of disease occurrence between the extremes of summer and winter, through modelling of the intensity of seasonal patterns by use of a sine curve, to more advanced generalized linear models. All three classes of method have advantages and disadvantages. The choice among analytical approaches should ideally reflect the research question of interest. Simple methods are compelling, but may overlook important seasonal peaks that would have been identified if more advanced methods had been applied. For most studies, we suggest the use of methods that allow estimation of the magnitude and timing of seasonal peaks and valleys, ideally with a measure of the intensity of seasonality, such as the peak-to-low ratio. Seasonality may be a confounder in studies of infectious disease occurrence when it fulfils the three primary criteria for being a confounder, i.e. when both the disease occurrence and the exposure vary seasonally without seasonality being a step in the causal pathway. In these situations, confounding by seasonality should be controlled as for any confounder. © 2012 The Authors. Clinical Microbiology and Infection © 2012 European Society of Clinical Microbiology and Infectious Diseases.

PMID: 22817396 [PubMed - as supplied by publisher]
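
For readers who want to experiment with the sine-curve (harmonic) modelling approach described above, the following is a minimal Python sketch using simulated monthly counts, not the Danish meningococcal data; the Poisson model, parameter values, and peak-to-low ratio calculation are illustrative assumptions only.

    # Sketch: quantifying seasonality with a sine/cosine (harmonic) Poisson model.
    # Illustrative only; 'monthly_counts' is simulated, not real surveillance data.
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(0)
    months = np.arange(1, 13)
    # simulate counts with a winter peak
    true_rate = 20 * np.exp(0.5 * np.cos(2 * np.pi * (months - 1) / 12))
    monthly_counts = rng.poisson(true_rate)

    # harmonic regressors (one sine and one cosine term with a 12-month period)
    X = sm.add_constant(np.column_stack([
        np.sin(2 * np.pi * months / 12),
        np.cos(2 * np.pi * months / 12),
    ]))
    fit = sm.GLM(monthly_counts, X, family=sm.families.Poisson()).fit()

    # amplitude of the fitted seasonal component and peak-to-low rate ratio
    amplitude = np.hypot(fit.params[1], fit.params[2])
    peak_to_low = np.exp(2 * amplitude)   # fitted peak rate divided by fitted trough rate
    print(fit.params, peak_to_low)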

CER Scan [published in the last 30 days]

    1. BMC Med Res Methodol. 2012 Jul 23;12(1):105. [Epub ahead of print]

Analyzing repeated data collected by mobile phones and frequent text messages. An example of Low back pain measured weekly for 18 weeks. Axén I, Bodin L, Kongsted A, Wedderkopp N, Jensen I, Bergström G.

BACKGROUND: Repeated data collection is desirable when monitoring fluctuating conditions. Mobile phones can be used to gather such data from large groups of respondents by sending and receiving frequently repeated short questions and answers as text messages. The analysis of repeated data involves some challenges. Vital issues to consider are the within-subject correlation, the between measurement occasion correlation and the presence of missing values. The overall aim of this commentary is to describe different methods of analyzing repeated data. It is meant to give an overview for the clinical researcher in order for complex outcome measures to be interpreted in a clinically meaningful way. METHODS: A model data set was formed using data from two clinical studies, where patients with low back pain were followed with weekly text messages for 18 weeks. Different research questions and analytic approaches were illustrated and discussed, as well as the handling of missing data. In the applications the weekly outcome “number of days with pain” was analyzed in relation to the patients’ “previous duration of pain” (categorized as more or less than 30 days in the previous year). Research questions with appropriate analytical methods 1: How many days with pain do patients experience? This question was answered with data summaries. 2: What is the proportion of participants “recovered” at a specific time point? This question was answered using logistic regression analysis. 3: What is the time to recovery? This question was answered using survival analysis, illustrated in Kaplan-Meier curves, Proportional Hazard regression analyses and spline regression analyses. 4: How is the repeatedly measured data associated with baseline (predictor) variables? This question was answered using generalized Estimating Equations, Poisson regression and Mixed linear models analyses. 5: Are there subgroups of patients with similar courses of pain within the studied population? A visual approach and hierarchical cluster analyses revealed different subgroups using subsets of the model data.

CONCLUSIONS: We have illustrated several ways of analysing repeated measures with both traditional analytic approaches using standard statistical packages, as well as recently developed statistical methods that will utilize all the vital features inherent in the data.

PMID: 22824413 [PubMed - as supplied by publisher]
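
As a concrete illustration of one of the approaches named in the abstract (generalized estimating equations), the sketch below fits a Poisson GEE to simulated weekly pain-day counts; the variable names and data are hypothetical and do not come from the SMS studies.

    # Sketch: Poisson GEE for repeated weekly counts with an exchangeable working correlation.
    # Simulated data standing in for the weekly "days with pain" outcome.
    import numpy as np
    import pandas as pd
    import statsmodels.api as sm
    import statsmodels.formula.api as smf

    rng = np.random.default_rng(1)
    n_patients, n_weeks = 50, 18
    df = pd.DataFrame({
        "id": np.repeat(np.arange(n_patients), n_weeks),
        "week": np.tile(np.arange(1, n_weeks + 1), n_patients),
        "long_duration": np.repeat(rng.integers(0, 2, n_patients), n_weeks),
    })
    # simulate counts of pain days (0-7) per week, higher for long-duration patients
    lam = np.exp(0.5 + 0.4 * df["long_duration"] - 0.03 * df["week"])
    df["pain_days"] = np.minimum(rng.poisson(lam), 7)

    # Poisson GEE accounting for correlation of repeated weeks within patient
    model = smf.gee("pain_days ~ long_duration + week", groups="id", data=df,
                    family=sm.families.Poisson(),
                    cov_struct=sm.cov_struct.Exchangeable())
    print(model.fit().summary())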

June 2012

CER Scan [Epub ahead of print]

    1. Pharmacoepidemiol Drug Saf. 2012 Jun 1. doi: 10.1002/pds.3299. [Epub ahead of print]

Comparing the cohort design and the nested case-control design in the presence of both time-invariant and time-dependent treatment and competing risks: bias and precision. Austin PC, Anderson GM, Cigsar C, Gruneir A. Institute for Clinical Evaluative Sciences, Toronto, Ontario, Canada; Institute of Health Management, Policy and Evaluation, University of Toronto, Toronto, Ontario, Canada; Dalla Lana School of Public Health, University of Toronto, Toronto, Ontario, Canada.

PURPOSE: Observational studies using electronic administrative healthcare databases are often used to estimate the effects of treatments and exposures. Traditionally, a cohort design has been used to estimate these effects, but increasingly, studies are using a nested case-control (NCC) design. The relative statistical efficiency of these two designs has not been examined in detail.
METHODS: We used Monte Carlo simulations to compare these two designs in terms of the bias and precision of effect estimates. We examined three different settings: (A) treatment occurred at baseline, and there was a single outcome of interest; (B) treatment was time varying, and there was a single outcome; and (C) treatment occurred at baseline, and there was a secondary event that competed with the primary event of interest. Comparisons were made of percentage bias, length of 95% confidence interval, and mean squared error (MSE) as a combined measure of bias and precision. RESULTS: In Setting A, bias was similar between designs, but the cohort design was more precise and had a lower MSE in all scenarios. In Settings B and C, the cohort design was more precise and had a lower MSE in all scenarios. In both Settings B and C, the NCC design tended to result in estimates with greater bias compared with the cohort design. CONCLUSIONS: We conclude that in a range of settings and scenarios, the cohort design is superior in terms of precision and MSE. Copyright © 2012 John Wiley & Sons, Ltd.
PMID: 22653805  [PubMed - as supplied by publisher]

    1. BMC Med Res Methodol. 2012 May 30;12(1):70. [Epub ahead of print]

Evaluation of the propensity score methods for estimating marginal odds ratios in case of small sample size. Pirracchio R, Resche Rigon M, Chevret S.

BACKGROUND: Propensity score (PS) methods are increasingly used, even when sample sizes are small or treatments are seldom used. However, the relative performance of the two mainly recommended PS methods, namely PS-matching or inverse probability of treatment weighting (IPTW), have not been studied in the context of small sample sizes. METHODS: We conducted a series of Monte Carlo simulations to evaluate the influence of sample size, prevalence of treatment exposure, and strength of the association between the variables and the outcome and/or the treatment exposure, on the performance of these two methods. RESULTS: Decreasing the sample size from 1,000 to 40 subjects did not substantially alter the Type I error rate, and led to relative biases below 10 %. The IPTW method performed better than the PS-matching down to 60 subjects. When N was set at 40, the PS matching estimators were either similarly or even less biased than the IPTW estimators. Including variables unrelated to the exposure but related to the outcome in the PS model decreased the bias and the variance as compared to models omitting such variables. Excluding the true confounder from the PS model resulted, whatever the method used, in a significantly biased estimation of treatment effect. These results were illustrated in a real dataset. CONCLUSION: Even in case of small study samples or low prevalence of treatment, PS-matching and IPTW can yield correct estimations of treatment effect unless the true confounders and the variables related only to the outcome are not included in the PS model.
PMID: 22646911  [PubMed - as supplied by publisher]
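
A minimal sketch of the two estimators being compared, propensity-score matching and IPTW, applied to a small simulated sample; this is illustrative only and is not the authors' simulation code.

    # Sketch: IPTW and crude 1:1 nearest-neighbour PS matching on a small simulated data set.
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(2)
    n = 60
    x = rng.normal(size=n)                                  # confounder
    p_treat = 1 / (1 + np.exp(-(-0.5 + 0.8 * x)))
    t = rng.binomial(1, p_treat)
    y = 1.0 * t + 1.0 * x + rng.normal(size=n)              # true effect = 1

    # propensity score from a logistic model
    ps = sm.Logit(t, sm.add_constant(x)).fit(disp=0).predict()

    # IPTW (ATE weights): weighted difference in outcome means
    w = t / ps + (1 - t) / (1 - ps)
    iptw_effect = (np.average(y[t == 1], weights=w[t == 1])
                   - np.average(y[t == 0], weights=w[t == 0]))

    # crude 1:1 nearest-neighbour matching on the PS (with replacement)
    controls = np.where(t == 0)[0]
    matched = [controls[np.argmin(np.abs(ps[controls] - ps[i]))] for i in np.where(t == 1)[0]]
    match_effect = y[t == 1].mean() - y[matched].mean()

    print(iptw_effect, match_effect)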

    1. J Crit Care. 2012 May 15. [Epub ahead of print]

Prognostic models based on administrative data alone inadequately predict the survival outcomes for critically ill patients at 180 days post-hospital discharge. Bohensky MA, Jolley D, Pilcher DV, Sundararajan V, Evans S, Brand CA. Centre for Research Excellence in Patient Safety, School of Public Health & Preventive Medicine, Monash University, Prahran, VIC 3181, Australia.

There is interest in evaluating the quality of critical care by auditing patient outcomes after hospital discharge. Risk adjustment using acuity of illness scores, such as Acute Physiology and Chronic Health Evaluation (APACHE III) scores, derived from clinical databases is commonly performed for in-hospital mortality outcome measures. However, these clinical databases do not routinely track patient outcomes after hospital discharge. Linkage of clinical databases to administrative data sets that maintain records on patient survival after discharge can allow for the measurement of survival outcomes of critical care patients after hospital discharge while using validated risk adjustment methods.  OBJECTIVE: The aim of this study was to compare the ability of 4 methods of risk adjustment to predict survival of critically ill patients at 180 days after hospital discharge: one using only variables from an administrative data set, one using only variables from a clinical database, a model using a full range of administrative and clinical variables, and a model using administrative variables plus APACHE III scores. DESIGN: This was a population-based cohort study. PATIENTS: The study sample consisted of adult (>15 years of age) residents of Victoria, Australia, admitted to a public hospital intensive care unit between 1 January 2001 and 31 December 2006 (n = 47,312 linked cases). Logistic regression analyses were used to develop the models. RESULTS: The administrative-only model was the poorest predictor of mortality at 180 days after hospital discharge (C = 0.73). The clinical model had substantially better predictive capabilities (C = 0.82), whereas the full-linked model achieved similar performance (C = 0.83). Adding APACHE III scores to the administrative model also had reasonable predictive capabilities (C = 0.83). CONCLUSIONS: The addition of APACHE III scores to administrative data substantially improved model performance to the level of the clinical model. Although linking data systems requires some investment, having the ability to evaluate case ascertainment and accurately risk adjust outcomes of intensive care patients after discharge will add valuable insights into clinical audit and decision-making processes. Copyright © 2012 Elsevier Inc. All rights reserved.
PMID: 22591572  [PubMed - as supplied by publisher]

    1. Health Serv Res. 2012 Jun;47(3 Pt 2):1232-54. doi: 10.1111/j.1475-6773.2012.01387.x. Epub 2012 Feb 21.

Measuring racial/ethnic disparities in health care: methods and practical issues. Cook BL, McGuire TG, Zaslavsky AM. Department of Psychiatry, Center for Multicultural Mental Health Research, Harvard Medical School, Somerville, MA 02143, USA. bcook@charesearch.org

OBJECTIVE: To review methods of measuring racial/ethnic health care disparities. STUDY DESIGN: Identification and tracking of racial/ethnic disparities in health care will be advanced by application of a consistent definition and reliable empirical methods. We have proposed a definition of racial/ethnic health care disparities based in the Institute of Medicine’s (IOM) Unequal Treatment report, which defines disparities as all differences except those due to clinical need and preferences. After briefly summarizing the strengths and critiques of this definition, we review methods that have been used to implement it. We discuss practical issues that arise during implementation and expand these methods to identify sources of disparities. We also situate the focus on methods to measure racial/ethnic health care disparities (an endeavor predominant in the United States) within a larger international literature in health outcomes and health care inequality. EMPIRICAL APPLICATION: We compare different methods of implementing the IOM definition on measurement of disparities in any use of mental health care and mental health care expenditures using the 2004-2008 Medical Expenditure Panel Survey. CONCLUSION: Disparities analysts should be aware of multiple methods available to measure disparities and their differing assumptions. We prefer a method concordant with the IOM definition. © Health Research and Educational Trust.
PMCID: PMC3371391 [Available on 2013/6/1]
PMID: 22353147  [PubMed - in process]

    1. Stat Med. 2012 Jun 26. doi: 10.1002/sim.5429. [Epub ahead of print]

On Bayesian methods of exploring qualitative interactions for targeted treatment.
Chen W, Ghosh D, Raghunathan TE, Norkin M, Sargent DJ, Bepler G. Department of Oncology, School of Medicine, Wayne State University, Detroit, MI 48201, U.S.A. chenw@karmanos.org.

Providing personalized treatments designed to maximize benefits and minimize harms is of tremendous current medical interest. One problem in this area is the evaluation of the interaction between the treatment and other predictor variables. Treatment effects in subgroups having the same direction but different magnitudes are called quantitative interactions, whereas those having opposite directions in subgroups are called qualitative interactions (QIs). Identifying QIs is challenging because they are rare and usually unknown among many potential biomarkers. Meanwhile, subgroup analysis reduces the power of hypothesis testing and multiple subgroup analyses inflate the type I error rate. We propose a new Bayesian approach to search for QI in a multiple regression setting with adaptive decision rules. We consider various regression models for the outcome. We illustrate this method in two examples of phase III clinical trials. The algorithm is straightforward and easy to implement using existing software packages. We provide a sample code in Appendix A. Copyright © 2012 John Wiley & Sons, Ltd.
PMID: 22733620  [PubMed - as supplied by publisher]

    1. Stat Methods Med Res. 2012 Jun 22. [Epub ahead of print]

Causal inference with a quantitative exposure. Zhang Z, Zhou J, Cao W, Zhang J.
Division of Biostatistics, Office of Surveillance and Biometrics, Center for Devices and Radiological Health, Food and Drug Administration, Silver Spring, USA.

The current statistical literature on causal inference is mostly concerned with binary or categorical exposures, even though exposures of a quantitative nature are frequently encountered in epidemiologic research. In this article, we review the available methods for estimating the dose-response curve for a quantitative exposure, which include ordinary regression based on an outcome regression model, inverse propensity weighting and stratification based on a propensity function model, and an augmented inverse propensity weighting method that is doubly robust with respect to the two models. We note that an outcome regression model often imposes an implicit constraint on the dose-response curve, and propose a flexible modeling strategy that avoids constraining the dose-response curve. We also propose two new methods: a weighted regression method that combines ordinary regression with inverse propensity weighting and a stratified regression method that combines ordinary regression with stratification. The proposed methods are similar to the augmented inverse propensity weighting method in the sense of double robustness, but easier to implement and more generally applicable. The methods are illustrated with an obstetric example and compared in simulation studies.
PMID: 22729475  [PubMed - as supplied by publisher]
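
The following sketch illustrates one of the reviewed approaches, inverse propensity weighting for a quantitative exposure with a normal propensity-function model and stabilized weights, on simulated data; it is not the authors' implementation and omits the augmented (doubly robust) variants.

    # Sketch: stabilized inverse propensity weights for a continuous exposure,
    # followed by a weighted regression of outcome on exposure (marginal dose-response).
    import numpy as np
    import statsmodels.api as sm
    from scipy.stats import norm

    rng = np.random.default_rng(3)
    n = 500
    x = rng.normal(size=n)                       # confounder
    a = 0.6 * x + rng.normal(size=n)             # quantitative exposure
    y = 1.0 * a + 1.0 * x + rng.normal(size=n)   # outcome; true dose-response slope = 1

    # propensity-function model: a | x ~ Normal(linear mean in x, constant variance)
    ps_fit = sm.OLS(a, sm.add_constant(x)).fit()
    cond_dens = norm.pdf(a, loc=ps_fit.fittedvalues, scale=np.sqrt(ps_fit.scale))
    marg_dens = norm.pdf(a, loc=a.mean(), scale=a.std(ddof=1))
    sw = marg_dens / cond_dens                   # stabilized weights

    # weighted regression of outcome on exposure alone recovers the marginal slope
    dr = sm.WLS(y, sm.add_constant(a), weights=sw).fit()
    print(dr.params)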

    1. Stat Methods Med Res. 2012 Jun 11. [Epub ahead of print]

A comparison of two methods of estimating propensity scores after multiple imputation. Mitra R, Reiter JP. School of Mathematics, University of Southampton, Southampton, UK.

In many observational studies, analysts estimate treatment effects using propensity scores, e.g. by matching or sub-classifying on the scores. When some values of the covariates are missing, analysts can use multiple imputation to fill in the missing data, estimate propensity scores based on the m completed datasets, and use the propensity scores to estimate treatment effects. We compare two approaches to implement this process. In the first, the analyst estimates the treatment effect using propensity score matching within each completed data set, and averages the m treatment effect estimates. In the second approach, the analyst averages the m propensity scores for each record across the completed datasets, and performs propensity score matching with these averaged scores to estimate the treatment effect. We compare properties of both methods via simulation studies using artificial and real data. The simulations suggest that the second method has greater potential to produce substantial bias reductions than the first, particularly when the missing values are predictive of treatment assignment.
PMID: 22687877  [PubMed - as supplied by publisher]
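
A rough sketch contrasting the two approaches compared in the paper follows; the imputation step is a simplified regression-with-noise stand-in for proper multiple imputation, and all data are simulated.

    # Sketch: (1) estimate the matched effect within each completed data set and average
    # the m estimates, versus (2) average the m propensity scores per record and match once.
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(4)
    n, m = 400, 5
    x = rng.normal(size=n)
    t = rng.binomial(1, 1 / (1 + np.exp(-x)))
    y = 1.0 * t + x + rng.normal(size=n)
    x_obs = x.copy()
    x_obs[rng.random(n) < 0.3] = np.nan          # 30% of the covariate missing
    miss = np.isnan(x_obs)

    def match_effect(ps):
        treated = np.where(t == 1)[0]
        controls = np.where(t == 0)[0]
        matched = [controls[np.argmin(np.abs(ps[controls] - ps[i]))] for i in treated]
        return y[treated].mean() - y[matched].mean()

    effects, ps_stack = [], []
    for _ in range(m):
        # impute missing x from y and t, adding noise to mimic a draw from its distribution
        imp_fit = sm.OLS(x_obs[~miss], sm.add_constant(np.column_stack([y, t])[~miss])).fit()
        x_imp = x_obs.copy()
        pred = imp_fit.predict(sm.add_constant(np.column_stack([y, t]))[miss])
        x_imp[miss] = pred + rng.normal(scale=np.sqrt(imp_fit.scale), size=miss.sum())
        ps = sm.Logit(t, sm.add_constant(x_imp)).fit(disp=0).predict()
        effects.append(match_effect(ps))          # approach 1: effect per completed data set
        ps_stack.append(ps)                       # approach 2: pool the scores

    print(np.mean(effects))                         # average of the m effect estimates
    print(match_effect(np.mean(ps_stack, axis=0)))  # effect from averaged propensity scores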

    1. Pharmacoepidemiol Drug Saf. 2012 Jun 4. doi: 10.1002/pds.3297. [Epub ahead of print]

Exploring large weight deletion and the ability to balance confounders when using inverse probability of treatment weighting in the presence of rare treatment decisions. Kilpatrick RD, Gilbertson D, Brookhart MA, Polley E, Rothman KJ, Bradbury BD. Amgen, Inc., Center for Observational Research, Thousand Oaks, CA, USA. rkilpatr@amgen.com.

PURPOSE: When medications are modified in response to changing clinical conditions, confounding by indication arises that cannot be controlled using traditional adjustment. Inverse probability of treatment weights (IPTWs) can address this confounding given assumptions of no unmeasured confounders and that all patients have a positive probability of receiving all levels of treatment (positivity). We sought to explore these assumptions empirically in the context of epoetin-alfa (EPO) dosing and mortality. METHODS: We developed a single set of IPTWs for seven EPO dose categories and evaluated achieved covariate balance, mortality hazard ratios, and confidence intervals using two levels of treatment model parameterization and weight deletion. RESULTS: We found that IPTWs improved covariate balance for most confounders, but was not optimal for prior hemoglobin. Including more predictors in the treatment model or retaining highly weighted individuals resulted in estimates closer to the null, although precision decreased. CONCLUSION: We chose to evaluate weights and covariate balance at a
single time-point to facilitate an empirical analysis of model assumptions. These same assumptions are applicable to a time-dependent analysis, although empirical examination is not straightforward in that case. We find that the inclusion of rare treatment decisions and the high weights that result is needed for covariate balance under the positivity assumption. Removal of these influential weights can result in bias in either direction relative to the original confounding. It is therefore important to determine the reason for these rare patterns and whether inference is possible for all treatment levels. Copyright © 2012 John Wiley & Sons, Ltd.
PMID: 22674782 [PubMed - as supplied by publisher]
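
The sketch below illustrates the kind of weight diagnostics discussed above: computing stabilized IPT weights, inspecting the largest ones, and comparing estimates with and without capping (shown here as one common variant of the weight-deletion strategies examined in the paper). The data are simulated with a binary treatment, whereas the paper itself studied seven EPO dose categories.

    # Sketch: stabilized IPT weights, inspection of extreme weights, and capped re-estimation.
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(5)
    n = 2000
    x = rng.normal(size=n)
    t = rng.binomial(1, 1 / (1 + np.exp(-2.0 * x)))     # strong confounding -> rare decisions
    y = 0.5 * t + x + rng.normal(size=n)

    ps = sm.Logit(t, sm.add_constant(x)).fit(disp=0).predict()
    sw = np.where(t == 1, t.mean() / ps, (1 - t.mean()) / (1 - ps))   # stabilized weights

    def wls_effect(w):
        return sm.WLS(y, sm.add_constant(t), weights=w).fit().params[1]

    print("max weight:", sw.max())
    print("untruncated estimate:", wls_effect(sw))
    # cap weights at the 1st/99th percentiles and re-estimate
    capped = np.clip(sw, *np.percentile(sw, [1, 99]))
    print("capped estimate:", wls_effect(capped))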

CER Search String – Published in the last 30 days:

  • Epidemiology. 2012 Jul;23(4):574-82.

Estimating the Effects of Multiple Time-varying Exposures Using Joint Marginal Structural Models: Alcohol Consumption, Injection Drug Use, and HIV Acquisition. Howe CJ, Cole SR, Mehta SH, Kirk GD. From the Department of Epidemiology, Center for Population Health and Clinical Epidemiology, Brown University Program in Public Health, Providence, RI; Department of Epidemiology, University of North Carolina Gillings School of Global Public Health, Chapel Hill, NC; Department of Epidemiology, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD; and Division of Infectious Diseases, Department of Medicine, Johns Hopkins School of Medicine, Baltimore, MD.

The joint effects of multiple exposures on an outcome are frequently of interest in epidemiologic research. In 2001, Hernán et al (J Am Stat Assoc. 2001;96:440-448) presented methods for estimating the joint effects of multiple time-varying exposures subject to time-varying confounding affected by prior exposure using joint marginal structural models. Nonetheless, the use of these joint models is rare in the applied literature. Minimal uptake of these joint models, in contrast to the now widely used standard marginal structural model, is due in part to a lack of examples demonstrating the method. In this paper, we review the assumptions necessary for unbiased estimation of joint effects as well as the distinction between interaction and effect measure modification. We demonstrate the use of marginal structural models for estimating the joint effects of alcohol consumption and injection drug use on HIV acquisition, using data from 1525 injection drug users in the AIDS Link to Intravenous Experience cohort study. In the joint model, the hazard ratio (HR) for heavy drinking in the absence of any drug injections was 1.58 (95% confidence interval = 0.67-3.73). The HR for any drug injections in the absence of heavy drinking was 1.78 (1.10-2.89). The HR for heavy drinking and any drug injections was 2.45 (1.45-4.12). The P values for multiplicative and additive interaction were 0.7620 and 0.9200, respectively, indicating a lack of departure from effects that multiply or add. We could not rule out interaction on either scale due to
imprecision.
PMCID: PMC3367098 [Available on 2013/7/1]
PMID: 22495473  [PubMed - in process]

JULY THEME:
Electronic Data Methods (EDM) Forum Special Supplement. Medical Care. 50():i-ii,S1-S101, July 2012.

The Agency for Health Research and Quality funded the Electronic Data Methods Forum (EDM Forum) to share the experiences and learnings from 11 research teams funded through three different grant programs, each of which involve the use of electronic clinical data in Comparative Effectiveness Research and Patient-Centered Outcomes Research.

Link to entire Supplement: http://journals.lww.com/lww-medicalcare/toc/2012/07001

Contents:

  1. Electronic Data Methods (EDM) Forum Building the Infrastructure to Conduct Comparative Effectiveness Research and Patient-Centered Outcomes Research using Electronic Clinical Data. Johnson, Beth Henry
    Medical Care, July 2012,50():ii
    Editorial
  2. EDM Forum Supplement Overview. Calonge, Ned. Medical Care, July 2012,50():S1-S2
    Introduction
  3. Building Sustainable Multi-functional Prospective Electronic Clinical Data Systems. Randhawa, Gurvaneet S.; Slutsky, Jean R. Medical Care, July 2012,50():S3-S6
    Introduction
  4. The Electronic Data Methods (EDM) Forum for Comparative Effectiveness Research (CER). Holve, Erin; Segal, Courtney; Lopez, Marianne Hamilton; Rein, Alison; Johnson, Beth H. Medical Care, July 2012,50():S7-S10
    Introduction
  5. Opportunities and Challenges for Comparative Effectiveness Research (CER) With Electronic Clinical Data: A Perspective From the EDM Forum. Holve, Erin; Segal, Courtney; Hamilton Lopez, Marianne. Medical Care, July 2012,50():S11-S18
    Introduction
  6. Commentary: Electronic Health Records for Comparative Effectiveness Research. Glasgow, Russell E. Medical Care, July 2012,50():S19-S20
    Analytic Methods
  7. A Pragmatic Framework for Single-site and Multisite Data Quality Assessment in Electronic Health Record-based Clinical Research. Kahn, Michael G.; Raebel, Marsha A.; Glanz, Jason M.; Riedlinger, Karen; Steiner, John F. Medical Care, July 2012,50():S21-S29
    Analytic Methods
  8. Diabetes and Asthma Case Identification, Validation, and Representativeness When Using Electronic Health Data to Construct Registries for Comparative Effectiveness and Epidemiologic Research. Desai, Jay R.; Wu, Pingsheng; Nichols, Greg A.; Lieu, Tracy A.; O’Connor, Patrick J. Medical Care, July 2012,50():S30-S35
    Analytic Methods
  9. Commentary: Clinical Informatics. Frisse, Mark E. Medical Care, July 2012,50():S36-S37
    Clinical Informatics
  10. Building the Informatics Infrastructure for Comparative Effectiveness Research (CER): A Review of the Literature. Lopez, Marianne Hamilton; Holve, Erin; Sarkar, Indra Neil; Segal, Courtney. Medical Care, July 2012,50():S38-S48
    Clinical Informatics
  11. A Survey of Informatics Platforms That Enable Distributed Comparative Effectiveness Research Using Multi-institutional Heterogenous Clinical Data. Sittig, Dean F.; Hazlehurst, Brian L.; Brown, Jeffrey; Murphy, Shawn; Rosenman, Marc; Tarczy-Hornoch, Peter; Wilcox, Adam B. Medical Care, July 2012,50():S49-S59
    Clinical Informatics
  12. Data Model Considerations for Clinical Effectiveness Researchers. Kahn, Michael G.; Batson, Deborah; Schilling, Lisa M. Medical Care, July 2012,50():S60-S67
    Clinical Informatics
  13. Research Data Collection Methods: From Paper to Tablet Computers. Wilcox, Adam B.; Gallagher, Kathleen D.; Boden-Albala, Bernadette; Bakken, Suzanne R. Medical Care, July 2012,50():S68-S73
    Clinical Informatics
  14. Commentary: Protecting Human Subjects and Their Data in Multi-site Research. Luft, Harold S. Medical Care, July 2012,50():S74-S76
    Governance
  15. Approaches to Facilitate Institutional Review Board Approval of Multicenter Research Studies. Marsolo, Keith. Medical Care, July 2012,50():S77-S81
    Governance
  16. Strategies for De-identification and Anonymization of Electronic Health Record Data for Use in Multicenter Research Studies. Kushida, Clete A.; Nichols, Deborah A.; Jadrnicek, Rik; Miller, Ric; Walsh, James K.; Griffin, Kara. Medical Care, July 2012,50():S82-S101
    Governance

May 2012

CER Scan [Epub ahead of print]

    1. Contemp Clin Trials. 2012 Apr 20. [Epub ahead of print]

A pilot ‘cohort multiple randomised controlled trial’ of treatment by a homeopath for women with menopausal hot flushes. Relton C, O’Cathain A, Nicholl J.

INTRODUCTION: In order to address the limitations of the standard pragmatic RCT design, the innovative ‘cohort multiple RCT’ design was developed. The design was first piloted by addressing a clinical question: “What is the clinical and cost effectiveness of treatment by a homeopath for women with menopausal hot flushes?”. METHODS: A cohort with the condition of interest (hot flushes) was recruited through an observational study of women’s midlife health and consented to provide observational data and have their data used comparatively. The ‘Hot Flush’ Cohort were then screened in order to identify patients eligible for a trial of the offer of treatment by a homeopath (Eligible Trial Group). A proportion of the Eligible Trial Group was then randomly selected to the Offer Group and offered treatment. A “patient centred” approach to information and consent was adopted. Patients were not (i) told about treatments that they would not be offered, and trial intervention information was only given to the Offer Group after random selection. Patients were not (ii) given prior information that their treatment would be decided by chance. RESULTS: The ‘cohort multiple RCT’ design was acceptable to the NHS Research Ethics Committee. The majority of patients completed multiple questionnaires. Acceptance of the offer was high (17/24). DISCUSSION: This pilot identified the feasibility of an innovative design in practice. Further research is required to test the concept of undertaking multiple trials within a cohort of patients and to assess the acceptability of the “patient centred” approach to information and consent. Copyright © 2012 Elsevier Inc. All rights reserved.

PMID: 22551742  [PubMed - as supplied by publisher]

LINK: http://www.sciencedirect.com/science/article/pii/S1551714412000973

    1. Am J Epidemiol. 2012 Apr 17. [Epub ahead of print]

Comparison of Instrumental Variable Analysis Using a New Instrument With Risk Adjustment Methods to Reduce Confounding by Indication. Fang G, Brooks JM, Chrischilles EA.

Confounding by indication is a vexing problem, especially in evaluating treatment effects using observational data, since treatment decisions are often related to disease severity, prognosis, and frailty. To compare the ability of the instrumental variable (IV) approach with a new instrument based on the local-area practice style and risk adjustment methods, including conventional multivariate regression and propensity score adjustment, to reduce confounding by indication, the authors investigated the effects of long-term control (LTC) therapy on the occurrence of acute asthma exacerbation events among children and young adults with incident and uncontrolled persistent asthma, using Iowa Medicaid claims files from 1997-1999. Established evidence from clinical trials has demonstrated the protective benefits of LTC therapy for persistent asthma. Among patients identified (n = 4,275), those with higher asthma severity at baseline were more likely to receive LTC therapy. The multivariate regression and propensity score adjustment methods suggested that LTC therapy had no effect on the occurrence of acute exacerbation events. Estimates from the new IV approach showed that LTC therapy significantly decreased the occurrence of acute exacerbation events, which is consistent with established clinical evidence. The authors discuss how to interpret estimates from the risk adjustment and IV methods when the treatment effect is heterogeneous.

PMID: 22510277  [PubMed - as supplied by publisher]

LINK: http://aje.oxfordjournals.org/cgi/pmidlookup?view=long&pmid=22510277
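
As a generic illustration of the instrumental variable approach with an area-level practice-style instrument, the following two-stage least squares sketch uses simulated, hypothetical variables; it is not the authors' analysis and reports point estimates only.

    # Sketch: manual two-stage least squares with a practice-style instrument.
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(6)
    n = 5000
    area_style = rng.normal(size=n)              # instrument: local-area prescribing style
    severity = rng.normal(size=n)                # unmeasured severity (confounding by indication)
    ltc = (0.8 * area_style + 1.0 * severity + rng.normal(size=n) > 0).astype(float)
    exacerb = -0.5 * ltc + 1.0 * severity + rng.normal(size=n)   # true protective effect -0.5

    # naive regression is confounded by unmeasured severity
    print(sm.OLS(exacerb, sm.add_constant(ltc)).fit().params[1])

    # 2SLS: first stage predicts treatment from the instrument; second stage uses the prediction
    # (point estimate only; the second-stage standard errors are not valid as computed here)
    ltc_hat = sm.OLS(ltc, sm.add_constant(area_style)).fit().fittedvalues
    print(sm.OLS(exacerb, sm.add_constant(ltc_hat)).fit().params[1])

In practice, a packaged 2SLS routine would be used to obtain correct standard errors, and instrument strength and validity would need to be assessed.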

    1. Epidemiology. 2012 Apr 10. [Epub ahead of print]

Estimating the Effects of Multiple Time-varying Exposures Using Joint Marginal Structural Models: Alcohol Consumption, Injection Drug Use, and HIV Acquisition. Howe CJ, Cole SR, Mehta SH, Kirk GD. Department of Epidemiology, Center for Population Health and Clinical Epidemiology, Brown University Program in Public Health, Providence, RI; Department of Epidemiology, University of North Carolina Gillings School of Global Public Health, Chapel Hill, NC; Department of Epidemiology, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD; and Division of Infectious Diseases, Department of Medicine, Johns Hopkins School of Medicine, Baltimore, MD.

The joint effects of multiple exposures on an outcome are frequently of interest in epidemiologic research. In 2001, Hernán et al (J Am Stat Assoc. 2001;96:440-448) presented methods for estimating the joint effects of multiple time-varying exposures subject to time-varying confounding affected by prior exposure using joint marginal structural models. Nonetheless, the use of these joint models is rare in the applied literature. Minimal uptake of these joint models, in contrast to the now widely used standard marginal structural model, is due in part to a lack of examples demonstrating the method. In this paper, we review the assumptions necessary for unbiased estimation of joint effects as well as the distinction between interaction and effect measure modification. We demonstrate the use of marginal structural models for estimating the joint effects of alcohol consumption and injection drug use on HIV acquisition, using data from 1525 injection drug users in the AIDS Link to Intravenous Experience cohort study. In the joint model, the hazard ratio (HR) for heavy drinking in the absence of any drug injections was 1.58 (95% confidence interval = 0.67-3.73). The HR for any drug injections in the absence of heavy drinking was 1.78 (1.10-2.89). The HR for heavy drinking and any drug injections was 2.45 (1.45-4.12). The P values for multiplicative and additive interaction were 0.7620 and 0.9200, respectively, indicating a lack of departure from effects that multiply or add. We could not rule out interaction on either scale due to imprecision.

PMID: 22495473  [PubMed - as supplied by publisher]

LINK: http://journals.lww.com/epidem/Abstract/publishahead/Estimating_the_Effects_of_Multiple_Time_varying.99490.aspx

    1. Stat Methods Med Res. 2012 Apr 4. [Epub ahead of print]

Sample size and power calculations for medical studies by simulation when closed form expressions are not available. Landau S, Stahl D. King’s College London, Institute of Psychiatry, Department of Biostatistics, London, UK.

This paper shows how Monte Carlo simulation can be used for sample size, power or precision calculations when planning medical research studies. Standard study designs can lead to the use of analysis methods for which power formulae do not exist. This may be because complex modelling techniques with optimal statistical properties are used but power formulae have not yet been derived or because analysis models are employed that divert from the population model due to lack of availability of more appropriate analysis tools. Our presentation concentrates on the conceptual steps involved in carrying out power or precision calculations by simulation. We demonstrate these steps in three examples concerned with (i) drop out in longitudinal studies, (ii) measurement error in observational studies and (iii) causal effect estimation in randomised controlled trials with non-compliance. We conclude that the Monte Carlo simulation approach is an important general tool in the methodological arsenal for assessing power and precision.

PMID: 22491174  [PubMed - as supplied by publisher]

LINK: http://smm.sagepub.com/content/early/2012/04/04/0962280212439578.long
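
The core of the simulation approach can be sketched in a few lines: repeatedly generate the planned study under assumed parameters, analyse each simulated data set, and report the proportion of significant results. The effect size, standard deviation, sample sizes, and two-arm t-test below are illustrative assumptions.

    # Sketch: power for a two-arm comparison estimated by Monte Carlo simulation.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(7)

    def power_by_simulation(n_per_arm, effect, sd=1.0, alpha=0.05, n_sim=2000):
        hits = 0
        for _ in range(n_sim):
            control = rng.normal(0.0, sd, n_per_arm)
            treated = rng.normal(effect, sd, n_per_arm)
            if stats.ttest_ind(treated, control).pvalue < alpha:
                hits += 1
        return hits / n_sim

    for n in (30, 50, 80):
        print(n, power_by_simulation(n, effect=0.5))

The same loop structure extends to designs without closed-form power formulae (drop-out, measurement error, non-compliance) by simulating those features and applying the planned analysis to each replicate.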

CER Scan [published within the last 30 days]

    1. BMC Med Res Methodol. 2012 Apr 10;12(1):46. [Epub ahead of print]

Multiple imputation of missing covariates with non-linear effects and interactions: an evaluation of statistical methods. Seaman SR, Bartlett JW, White IR.

BACKGROUND: Multiple imputation is often used for missing data. When a  model contains as covariates more than one function of a variable, it is not obvious how best to impute missing values in these covariates. Consider a regression with outcome Y and covariates X and X^2. In ‘passive imputation’ a value X* is imputed for X and then X^2 is imputed as (X*)^2. A recent proposal is to treat X^2 as ‘just another variable’ (JAV) and impute X and X^2 under multivariate normality. METHODS: We use simulation to investigate the performance of three methods that can easily be implemented in standard software: 1) linear regression of X on Y to impute X then passive imputation of X^2; 2) the same regression but with predictive mean matching (PMM); and 3) JAV. We also investigate the performance of analogous methods when the analysis involves an interaction, and study the theoretical properties of JAV. The application of the methods when complete or incomplete confounders are also present is illustrated using data from the EPIC Study. RESULTS: JAV gives consistent estimation when the analysis is linear regression with a quadratic or interaction term and X is missing completely at random. When X is missing at random, JAV may be biased, but this bias is generally less than for passive imputation and PMM. Coverage for JAV was usually good when bias was small. However, in some scenarios with a more pronounced quadratic effect, bias was large and coverage poor. When the analysis was logistic regression, JAV’s performance was sometimes very poor. PMM generally improved on passive imputation, in terms of bias and coverage, but did not eliminate the bias. CONCLUSIONS: Given the current state of available software, JAV is the best of a set of imperfect imputation methods for linear regression with a quadratic or interaction effect, but should not be used for logistic regression.

PMID: 22489953  [PubMed - as supplied by publisher]

Available Open-Access: http://www.biomedcentral.com/content/pdf/1471-2288-12-46.pdf
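
A simplified sketch of the distinction between 'passive' imputation and treating X^2 as 'just another variable' follows; a single stochastic regression imputation stands in for proper multiple imputation, so this only illustrates how the squared term is formed under each strategy.

    # Sketch: passive imputation (square the imputed X) versus JAV (impute X^2 as its own variable).
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(8)
    n = 1000
    x = rng.normal(size=n)
    y = 0.5 * x + 0.5 * x**2 + rng.normal(size=n)
    x_obs = x.copy()
    x_obs[rng.random(n) < 0.3] = np.nan                   # X missing completely at random
    miss = np.isnan(x_obs)

    def stochastic_impute(target, predictors):
        """Impute missing entries of target from predictors by linear regression plus noise."""
        fit = sm.OLS(target[~miss], sm.add_constant(predictors[~miss])).fit()
        draw = fit.predict(sm.add_constant(predictors)[miss])
        draw = draw + rng.normal(scale=np.sqrt(fit.scale), size=miss.sum())
        out = target.copy()
        out[miss] = draw
        return out

    # Passive: impute X from Y, then square the imputed values
    x_pass = stochastic_impute(x_obs, y.reshape(-1, 1))
    passive_fit = sm.OLS(y, sm.add_constant(np.column_stack([x_pass, x_pass**2]))).fit()

    # JAV: impute X and X^2 as separate variables (here each from Y, a simplification)
    xsq_obs = x_obs**2
    x_jav = stochastic_impute(x_obs, y.reshape(-1, 1))
    xsq_jav = stochastic_impute(xsq_obs, y.reshape(-1, 1))
    jav_fit = sm.OLS(y, sm.add_constant(np.column_stack([x_jav, xsq_jav]))).fit()

    print(passive_fit.params, jav_fit.params)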

    1. Value Health. 2012 Mar-Apr;15(2):217-30.

Prospective observational studies to assess comparative effectiveness: the ISPOR  good research practices task force report. Berger ML, Dreyer N, Anderson F, Towse A, Sedrakyan A, Normand SL.  OptumInsight, Life Sciences, New York, NY 10026, USA. Marc.Berger@Optum.com

OBJECTIVE: In both the United States and Europe there has been an increased interest in using comparative effectiveness research of interventions to inform health policy decisions. Prospective observational studies will undoubtedly be conducted with increased frequency to assess the comparative effectiveness of different treatments, including as a tool for “coverage with evidence development,” “risk-sharing contracting,” or a key element in a “learning health-care system.” The principal alternatives for comparative effectiveness research include retrospective observational studies, prospective observational studies, randomized clinical trials, and naturalistic (“pragmatic”) randomized clinical trials.

METHODS: This report details the recommendations of a Good Research Practice Task Force on Prospective Observational Studies for comparative effectiveness research. Key issues discussed include how to decide when to do a prospective observational study in light of its advantages and disadvantages with respect to alternatives, and the report summarizes the challenges and approaches to the appropriate design, analysis, and execution of prospective observational studies to make them most valuable and relevant to health-care decision makers.

RECOMMENDATIONS: The task force emphasizes the need for precision and clarity in specifying the key policy questions to be addressed and that studies should be designed with a goal of drawing causal inferences whenever possible. If a study is being performed to support a policy decision, then it should be designed as hypothesis testing-this requires drafting a protocol as if subjects were to be randomized and that investigators clearly state the purpose or main hypotheses, define the treatment groups and outcomes, identify all measured and unmeasured confounders, and specify the primary analyses and required sample size. Separate from analytic and statistical approaches, study design choices may strengthen the ability to address potential biases and confounding in prospective observational studies. The use of inception cohorts, new user designs, multiple comparator groups, matching designs, and assessment of outcomes thought not to be impacted by the therapies being compared are several strategies that should be given strong consideration recognizing that there may be feasibility constraints. The reasoning behind all study design and analytic choices should be transparent and explained in study protocol. Execution of prospective observational studies is as important as their design and analysis in ensuring that results are valuable and relevant, especially capturing the target population of interest, having reasonably complete and nondifferential follow-up. Similar to the concept of the importance of declaring a prespecified hypothesis, we believe that the credibility of many prospective observational studies would be enhanced by their registration on appropriate publicly accessible sites (e.g., clinicaltrials.gov and encepp.eu) in advance of their execution. Copyright © 2012 International Society for Pharmacoeconomics and Outcomes Research (ISPOR). Published by Elsevier Inc. All rights reserved.

PMID: 22433752  [PubMed - in process]

LINK: http://www.valueinhealthjournal.com/article/S1098-3015(12)00007-1/abstract

    1. Pharmacoepidemiol Drug Saf. 2012 May 2. doi: 10.1002/pds.3284. [Epub ahead of print]

Algorithms to estimate the beginning of pregnancy in administrative databases. Margulis AV, Setoguchi S, Mittleman MA, Glynn RJ, Dormuth CR, Hernández-Díaz S. Department of Epidemiology, Harvard School of Public Health, Boston, MA, USA; Division of Pharmacoepidemiology and Pharmacoeconomics, Brigham and Women’s Hospital, Boston, MA, USA. andreamargulis@post.harvard.edu.

PURPOSE: The role of administrative databases for research on drug safety during pregnancy can be limited by their inaccurate assessment of the timing of exposure, as the gestational age at birth is typically unavailable. Therefore, we sought to develop and validate algorithms to estimate the gestational age at birth using information available in these databases. METHODS: Using a population-based cohort of 286,432 mother-child pairs in British Columbia (1998-2007), we validated an ICD-9/10-based preterm-status indicator and developed algorithms to estimate the gestational age at birth on the basis of this indicator, maternal age, singleton/multiple status, and claims for routine prenatal care tests. We assessed the accuracy of the algorithm-based estimates relative to the gold standard of the clinical gestational age at birth recorded in the delivery discharge record. RESULTS: The preterm-status indicator had specificity and sensitivity of 98% and 91%, respectively. Estimates from an algorithm that assigned 35 weeks of gestational age at birth to deliveries with the preterm-status indicator and 39 weeks to those without them were within 2 weeks of the clinical gestational age at birth in 75% of preterm and 99% of term deliveries. CONCLUSIONS: Subtracting 35 weeks (245 days) from the date of birth in deliveries with codes for preterm birth and 39 weeks (273 days) in those without them provided the optimal estimate of the beginning of pregnancy among the algorithms studied. Copyright © 2012 John Wiley & Sons, Ltd.

PMID: 22550030  [PubMed - as supplied by publisher]

LINK: http://dx.doi.org/10.1002/pds.3284
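
The final algorithm reported in the abstract reduces to a simple rule, sketched below: subtract 245 days (35 weeks) from the delivery date when a preterm-birth code is present, otherwise subtract 273 days (39 weeks). The function and example date are illustrative.

    # Minimal sketch of the pregnancy-start estimate described in the abstract.
    from datetime import date, timedelta

    def estimated_pregnancy_start(delivery_date: date, preterm_indicator: bool) -> date:
        gestation_days = 245 if preterm_indicator else 273
        return delivery_date - timedelta(days=gestation_days)

    print(estimated_pregnancy_start(date(2007, 3, 15), preterm_indicator=False))  # 2006-06-15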

    1. Cancer. 2012 Apr 19. doi: 10.1002/cncr.27552. [Epub ahead of print]

Data for cancer comparative effectiveness research: Past, present, and future potential. Meyer AM, Carpenter WR, Abernethy AP, Stürmer T, Kosorok MR. University of North Carolina-Lineberger Comprehensive Cancer Center, University of North Carolina, Chapel Hill, North Carolina; Cecil G. Sheps Center for Health Services Research, University of North Carolina, Chapel Hill, North Carolina.

Comparative effectiveness research (CER) can efficiently and rapidly generate new scientific evidence and address knowledge gaps, reduce clinical uncertainty, and guide health care choices. Much of the potential in CER is driven by the application of novel methods to analyze existing data. Despite its potential, several challenges must be identified and overcome so that CER may be improved, accelerated, and expeditiously implemented into the broad spectrum of cancer care and clinical practice. To identify and characterize the challenges to cancer CER, the authors reviewed the literature and conducted semistructured interviews with 41 cancer CER researchers at the Agency for Healthcare Research and Quality’s Developing Evidence to Inform Decisions about Effectiveness (DEcIDE) Cancer CER Consortium. Several data sets for cancer CER were identified and differentiated into an ontology of 8 categories and were characterized in terms of strengths, weaknesses, and utility. Several themes emerged during the development of this ontology and discussions with CER researchers. Dominant among them was accelerating cancer CER and promoting the acceptance of findings, which will necessitate transcending disciplinary silos to incorporate diverse perspectives and expertise. Multidisciplinary collaboration is required, including those with expertise in nonexperimental data, statistics, outcomes research, clinical trials, epidemiology, generalist and specialty medicine, survivorship, informatics, data, and methods, among others. Recommendations highlight the systematic, collaborative identification of critical measures; application of more rigorous study design and sampling methods; policy-level resolution of issues in data ownership, governance, access, and cost; and development and application of consistent standards for data security, privacy, and confidentiality. Cancer 2012. © 2012 American Cancer Society.

PMID: 22517505  [PubMed - as supplied by publisher]

LINK: http://onlinelibrary.wiley.com/doi/10.1002/cncr.27552/abstract

    1. Arch Intern Med. 2012 Apr 9;172(7):548-54.

Influenza vaccine effectiveness in patients on hemodialysis: an analysis of a natural experiment. McGrath LJ, Kshirsagar AV, Cole SR, Wang L, Weber DJ, Stürmer T, Brookhart MA. Department of Epidemiology, Gillings School of Global Public Health, University of North Carolina, 2105F McGavran-Greenberg, Campus Box CB 7435, Chapel Hill, NC 27599-7435. mabrook@email.unc.edu.

BACKGROUND: Although the influenza vaccine is recommended for patients with end-stage renal disease, little is known about its effectiveness. Observational studies of vaccine effectiveness (VE) are challenging because vaccinated subjects may be healthier than unvaccinated subjects.

METHODS: Using US Renal Data System data, we estimated VE for influenza-like illness, influenza/pneumonia hospitalization, and mortality in adult patients undergoing hemodialysis by using a natural experiment created by the year-to-year variation in the match of the influenza vaccine to the circulating virus. We compared vaccinated patients in matched years (1998, 1999, and 2001) with a mismatched year (1997) using Cox proportional hazards models. Ratios of hazard ratios compared vaccinated patients between 2 years and unvaccinated patients between 2 years. We calculated VE as 1 - effect measure.

RESULTS: Vaccination rates were less than 50% each year. Conventional analysis comparing vaccinated with unvaccinated patients produced average VE estimates of 13%, 16%, and 30% for influenza-like illness, influenza/pneumonia hospitalization, and mortality, respectively. When restricted to the preinfluenza period, results were even stronger, indicating bias. The pooled ratio of hazard ratios comparing matched seasons with a placebo season resulted in a VE of 0% (95% CI, -3% to 2%) for influenza-like illness, 2% (-2% to 5%) for hospitalization, and 0% (-3% to 3%) for death.

CONCLUSIONS: Relative to a mismatched year, we found little evidence of increased VE in subsequent well-matched years, suggesting that the current influenza vaccine strategy may have a smaller effect on morbidity and mortality in the end-stage renal disease population than previously thought. Alternate strategies (eg, high-dose vaccine, adjuvanted vaccine, and multiple doses) should be investigated.

PMID: 22493462  [PubMed - in process]

LINK: http://archinte.ama-assn.org/cgi/content/full/172/7/548
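
The ratio-of-hazard-ratios idea can be illustrated with hypothetical numbers: compare the vaccinated-versus-unvaccinated hazard ratio in a well-matched season with the same contrast in the mismatched ('placebo') season, and take VE = 1 - (HR_matched / HR_mismatched).

    # Worked illustration with hypothetical hazard ratios (not the study's estimates).
    hr_matched = 0.85     # vaccinated vs unvaccinated, well-matched vaccine year
    hr_mismatched = 0.87  # same contrast in the mismatched ("placebo") year
    ve = 1 - hr_matched / hr_mismatched
    print(f"vaccine effectiveness: {ve:.1%}")   # about 2%, i.e. little evidence of added benefit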

    1. Epidemiology. 2012 Mar;23(2):223-32.

Using marginal structural models to estimate the direct effect of adverse childhood social conditions on onset of heart disease, diabetes, and stroke. Nandi A, Glymour MM, Kawachi I, VanderWeele TJ. Institute for Health and Social Policy and Department of Epidemiology, Biostatistics, and Occupational Health, McGill University, Montreal, QC Canada. arijit.nandi@mcgill.ca

Comment in Epidemiology. 2012 Mar;23(2):233-7.

BACKGROUND: Early-life socioeconomic status (SES) is associated with adult chronic disease, but it is unclear whether this effect is mediated entirely via adult SES or whether there is a direct effect of adverse early-life SES on adult disease. Major challenges in evaluating these alternatives include imprecise measurement of early-life SES and bias in conventional regression methods to assess mediation. In particular, conventional regression approaches to direct effect estimation are biased when there is time-varying confounding of the association between adult SES and chronic disease by chronic disease risk factors.

METHODS: First-reported heart disease, diabetes, and stroke diagnoses were assessed in a national sample of 9760 Health and Retirement Study participants followed biennially from 1992 through 2006. Early-life and adult SES measures were derived using exploratory and confirmatory factor analysis. Early-life SES was measured by parental education, father’s occupation, region of birth, and childhood rural residence. Adult SES was measured by respondent’s education, occupation, labor force status, household income, and household wealth. Using marginal structural models, we estimated the direct effect of early-life SES on chronic disease onset that was not mediated by adult SES. Marginal structural models were estimated with stabilized inverse probability-weighted log-linear models to adjust for risk factors that may have confounded associations between adult SES and chronic disease.

RESULTS: During follow-up, 24%, 18%, and 9% of participants experienced first onset of heart disease, diabetes, and stroke, respectively. Comparing those in the most disadvantaged with the least disadvantaged quartile, early-life SES was associated with coronary heart disease (risk ratio = 1.30 [95% confidence interval = 1.12-1.51]) and diabetes (1.23 [1.02-1.48]) and marginally associated  with stroke via pathways not mediated by adult SES.

CONCLUSIONS: Our results suggest that early-life socioeconomic experiences directly influence adult chronic disease outcomes.

PMID: 22317806  [PubMed - in process]

LINK: http://journals.lww.com/epidem/Abstract/2012/03000/Using_Marginal_Structural_Models_to_Estimate_the.8.aspx

MAY THEME: PDS proceedings from the 2011 DEcIDE Methods Symposium on methods for developing and analyzing clinically rich data for patient-centered outcomes

http://www.drugepi.org/recently-at-dope/journal-supplement-from-3rd-decide-now-available/

 

April 2012

CER Scan [Epub ahead of print]

    1. Stat Med. 2012 Mar 22. doi: 10.1002/sim.5312. [Epub ahead of print]

Testing superiority at interim analyses in a non-inferiority trial. Joshua Chen YH, Chen C.
Merck Research Laboratories, Rahway, NJ, PA, USA. Joshua_chen@merck.com.

Shift in research and development strategy from developing follow-on or ‘me-too’ drugs to differentiated medical products with potentially better efficacy than the standard of care (e.g., first-in-class, best-in-class, and bio-betters) highlights the scientific and commercial interests in establishing superiority even when a non-inferiority design, adequately powered for a pre-specified non-inferiority margin, is appropriate for various reasons. In this paper, we propose a group sequential design to test superiority at interim analyses in a non-inferiority trial. We will test superiority at the interim analyses using conventional group sequential methods, and we may stop the study because of better efficacy. If the study fails to establish superior efficacy at the interim and final analyses, we will test the primary non-inferiority hypothesis at the final analysis at the nominal level without alpha adjustment. Whereas superiority/non-inferiority testing no longer has the hierarchical structure in which the rejection region for testing superiority is a subset of that for testing non-inferiority, the impact of repeated superiority tests on the false positive rate and statistical power for the primary non-inferiority test at the final analysis is essentially ignorable. For the commonly used O’Brien-Fleming type alpha-spending function, we show that the impact is extremely small based upon Brownian motion boundary-crossing properties. Numerical evaluation further supports the conclusion for other alpha-spending functions with a substantial amount of alpha being spent on the interim superiority tests. We use a clinical trial example to illustrate the proposed design.
Copyright © 2012 John Wiley & Sons, Ltd.
PMID: 22438208  [PubMed - as supplied by publisher]

LINK: http://onlinelibrary.wiley.com/doi/10.1002/sim.5312/abstract
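
For reference, the O'Brien-Fleming-type alpha-spending function mentioned in the abstract has a simple closed form, sketched below; the overall alpha and the information fractions are illustrative choices.

    # Sketch: Lan-DeMets O'Brien-Fleming-type spending, alpha*(t) = 2 - 2*Phi(z_{alpha/2}/sqrt(t)).
    import numpy as np
    from scipy.stats import norm

    def obrien_fleming_spending(t, alpha=0.025):
        """Cumulative type I error spent by information fraction t (illustrative alpha)."""
        return 2 * (1 - norm.cdf(norm.ppf(1 - alpha / 2) / np.sqrt(t)))

    for frac in (0.25, 0.5, 0.75, 1.0):
        print(frac, round(obrien_fleming_spending(frac), 5))

Very little alpha is spent at early looks, which is why repeated interim superiority tests have almost no impact on the final non-inferiority test in the design described above.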

    1. Am J Epidemiol. 2012 Mar 6. [Epub ahead of print]

Risk Prediction Measures for Case-Cohort and Nested Case-Control Designs: An Application to Cardiovascular Disease. Ganna A, Reilly M, de Faire U, Pedersen N, Magnusson P, Ingelsson E.

Case-cohort and nested case-control designs are often used to select an appropriate subsample of individuals from prospective cohort studies. Despite the great attention that has been given to the calculation of association estimators, no formal methods have been described for estimating risk prediction measures from these 2 sampling designs. Using real data from the Swedish Twin Registry (2004-2009), the authors sampled unstratified and stratified (matched) case-cohort and nested case-control subsamples and compared them with the full cohort (as “gold standard”). The real biomarker (high density lipoprotein cholesterol) and simulated biomarkers (BIO1 and BIO2) were studied in terms of association with cardiovascular disease, individual risk of cardiovascular disease at 3 years, and main prediction metrics. Overall, stratification improved efficiency, with stratified case-cohort designs being comparable to matched nested case-control designs. Individual risks and prediction measures calculated by using case-cohort and nested case-control designs after appropriate reweighting could be assessed with good efficiency, except for the finely matched nested case-control design, where matching variables could not be included in the individual risk estimation. In conclusion, the authors have shown that case-cohort and nested case-control designs can be used in settings where the research aim is to evaluate the prediction ability of new markers and that matching strategies for nested case-control designs may lead to biased prediction measures.
PMID: 22396388  [PubMed - as supplied by publisher]

LINK: http://aje.oxfordjournals.org/content/175/7/715.long

    1. Lifetime Data Anal. 2012 Mar 2. [Epub ahead of print]

Comparison of estimators in nested case-control studies with multiple outcomes. Støer NC, Samuelsen SO. Department of Mathematics, University of Oslo, P.O. Box 1053, 0316, Oslo, Norway, nathalcs@math.uio.no.

Reuse of controls in a nested case-control (NCC) study has not been considered feasible since the controls are matched to their respective cases. However, in the last decade or so, methods have been developed that break the matching and allow for analyses where the controls are no longer tied to their cases. These methods can be divided into two groups: weighted partial likelihood (WPL) methods and full maximum likelihood methods. The weights in the WPL can be estimated in different ways and four estimation procedures are discussed. In addition, we address modifications needed to accommodate left truncation. A full likelihood approach is also presented and we suggest an aggregation technique to decrease the computation time. Furthermore, we generalize calibration for case-cohort designs to NCC studies. We consider a competing risks situation and compare WPL, full likelihood and calibration through simulations and analyses on a real data example.
PMID: 22382602  [PubMed - as supplied by publisher]

LINK: http://www.springerlink.com/content/3101254836k737p4/

    1. Stat Methods Med Res. 2012 Feb 23. [Epub ahead of print]

Consistent causal effect estimation under dual misspecification and implications for confounder selection procedures. Gruber S, van der Laan MJ. Department of Epidemiology, Harvard School of Public Health, 677 Huntington Avenue, Kresge 820, Boston, MA, USA.

In a previously published article in this journal, Vansteelandt et al. [Stat Methods Med Res. Epub ahead of print 12 November 2010. DOI: 10.1177/0962280210387717] address confounder selection in the context of causal effect estimation in observational studies. They discuss several selection strategies and propose a procedure whose performance is guided by the quality of the exposure effect estimator. The authors note that when a particular linearity condition is met, consistent estimation of the target parameter can be achieved even under dual misspecification of models for the association of confounders with exposure and outcome and demonstrate the performance of their procedure relative to other estimators when this condition holds. Our earlier published work on collaborative targeted minimum loss based learning provides a general theoretical framework for effective confounder selection that explains the findings of Vansteelandt et al. and underscores the appropriateness of their suggestions that a confounder selection procedure should be concerned with directly targeting the quality of the estimate and that desirable estimators produce valid confidence intervals and are robust to dual misspecification.
PMID: 22368176  [PubMed - as supplied by publisher]

LINK: http://smm.sagepub.com/content/early/2012/02/23/0962280212437451.long

    1. Stat Med. 2012 Feb 24. doi: 10.1002/sim.4504. [Epub ahead of print]

Variance estimation for stratified propensity score estimators. Williamson EJ, Morley R, Lucas A, Carpenter JR. Centre for MEGA Epidemiology, School of Population Health, University of Melbourne, Melbourne, Australia; Department of Epidemiology and Preventive Medicine, Monash University, Melbourne, Australia. ewi@unimelb.edu.au.

Propensity score methods are increasingly used to estimate the effect of a treatment or exposure on an outcome in non-randomised studies. We focus on one such method, stratification on the propensity score, comparing it with the method of inverse-probability weighting by the propensity score. The propensity score-the conditional probability of receiving the treatment given observed covariates-is usually an unknown probability estimated from the data. Estimators for the variance of treatment effect estimates typically used in practice, however, do not take into account that the propensity score itself has been estimated from the data. By deriving the asymptotic marginal variance of the stratified estimate of treatment effect, correctly taking into account the estimation of the propensity score, we show that routinely used variance estimators are likely to produce confidence intervals that are too conservative when the propensity score model includes variables that predict (cause) the outcome, but only weakly predict the treatment. In contrast, a comparison with the analogous marginal variance for the inverse probability weighted (IPW) estimator shows that routinely used variance estimators for the IPW estimator are likely to produce confidence intervals that are almost always too conservative. Because exact calculation of the asymptotic marginal variance is likely to be complex, particularly for the stratified estimator, we suggest that bootstrap estimates of variance should be used in practice. Copyright © 2012 John Wiley & Sons, Ltd.
PMID: 22362427  [PubMed - as supplied by publisher]

LINK: http://onlinelibrary.wiley.com/doi/10.1002/sim.4504/abstract
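
Illustrative sketch (not the authors' code): the bootstrap the abstract recommends should re-run the whole procedure, including re-estimation of the propensity score, in every resample so that the variance reflects the estimated score. The Python sketch below assumes one continuous confounder, a binary outcome, and quintile strata; all variable names and data are hypothetical.

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)

def stratified_effect(y, t, x, n_strata=5):
    """Propensity-score-stratified difference in outcome means (risk difference)."""
    X = sm.add_constant(x)
    ps = sm.Logit(t, X).fit(disp=0).predict(X)           # estimated propensity score
    edges = np.quantile(ps, np.linspace(0, 1, n_strata + 1))
    strata = np.digitize(ps, edges[1:-1])                 # stratum index 0 .. n_strata-1
    effects, sizes = [], []
    for s in range(n_strata):
        m = strata == s
        if t[m].sum() == 0 or (1 - t[m]).sum() == 0:       # skip strata lacking an arm
            continue
        effects.append(y[m & (t == 1)].mean() - y[m & (t == 0)].mean())
        sizes.append(m.sum())
    return np.average(effects, weights=sizes)

# Hypothetical data: one confounder x that affects both treatment and outcome.
n = 2000
x = rng.normal(size=n)
t = rng.binomial(1, 1 / (1 + np.exp(-0.5 * x)))
y = rng.binomial(1, 1 / (1 + np.exp(-(-1 + 0.3 * t + 0.8 * x))))

estimate = stratified_effect(y, t, x)
boot = []
for _ in range(500):
    idx = rng.integers(0, n, n)                            # resample with replacement
    boot.append(stratified_effect(y[idx], t[idx], x[idx]))
print(f"risk difference {estimate:.3f}, bootstrap SE {np.std(boot, ddof=1):.3f}")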

    1. Health Serv Res. 2012 Feb 21. doi: 10.1111/j.1475-6773.2012.01387.x. [Epub ahead of print]

Measuring Racial/Ethnic Disparities in Health Care: Methods and Practical Issues. Cook BL, McGuire TG, Zaslavsky AM. Department of Psychiatry, Center for Multicultural Mental Health Research, Harvard Medical School, Somerville, MA.

OBJECTIVE: To review methods of measuring racial/ethnic health care disparities. STUDY DESIGN: Identification and tracking of racial/ethnic disparities in health care will be advanced by application of a consistent definition and reliable empirical methods. We have proposed a definition of racial/ethnic health care disparities based in the Institute of Medicine’s (IOM) Unequal Treatment report, which defines disparities as all differences except those due to clinical need and preferences. After briefly summarizing the strengths and critiques of this definition, we review methods that have been used to implement it. We discuss practical issues that arise during implementation and expand these methods to identify sources of disparities. We also situate the focus on methods to measure racial/ethnic health care disparities (an endeavor predominant in the United States) within a larger international literature in health outcomes and health care inequality. EMPIRICAL APPLICATION: We compare different methods of implementing the IOM definition on measurement of disparities in any use of mental health care and mental health care expenditures using the 2004-2008 Medical Expenditure Panel Survey. CONCLUSION: Disparities analysts should be aware of multiple methods available to measure disparities and their differing assumptions. We prefer a method concordant with the IOM definition. © Health Research and Educational Trust.
PMID: 22353147  [PubMed - as supplied by publisher]

LINK: http://onlinelibrary.wiley.com/doi/10.1111/j.1475-6773.2012.01387.x/abstract

CER Scan [published within the last 30 days]

    1. Emerg Themes Epidemiol. 2012 Mar 19;9(1):1. [Epub ahead of print]

Causal diagrams in systems epidemiology. Joffe M, Gambhir M, Chadeau-Hyam M, Vineis P.

Methods of diagrammatic modelling have been greatly developed in the past two decades. Outside the context of infectious diseases, systematic use of diagrams in epidemiology has been mainly confined to the analysis of a single link: that between a disease outcome and its proximal determinant(s). Transmitted causes (“causes of causes”) tend not to be systematically analysed. The infectious disease epidemiology modelling tradition models the human population in its environment, typically with the exposure-health relationship and the determinants of exposure being considered at individual and group/ecological levels, respectively. Some properties of the resulting systems are quite general, and are seen in unrelated contexts such as biochemical pathways. Confining analysis to a single link misses the opportunity to discover such properties. The structure of a causal diagram is derived from knowledge about how the world works, as well as from statistical evidence. A single diagram can be used to characterise a whole research area, not just a single analysis – although this depends on the degree of consistency of the causal relationships between different populations – and can therefore be used to integrate multiple datasets. Additional advantages of system-wide models include: the use of instrumental variables – now emerging as an important technique in epidemiology in the context of mendelian randomisation, but under-used in the exploitation of “natural experiments”; the explicit use of change models, which have advantages with respect to inferring causation; and in the detection and elucidation of feedback.
PMID: 22429606  [PubMed - as supplied by publisher]

Free Full Text: http://www.ete-online.com/content/pdf/1742-7622-9-1.pdf

    1. Pharmacoepidemiol Drug Saf. 2012 Mar;21(3):241-45. doi: 10.1002/pds.2306.

Subtle issues in model specification and estimation of marginal structural models. Yang W, Joffe MM.

We review the concept of time-dependent confounding by using the example in the paper “Comparative effectiveness of individual angiotensin receptor blockers on risk of mortality in patients with chronic heart failure” by Desai et al. and illustrate how to adjust for it by using inverse probability of treatment weighting through a simulated example. We discuss a few subtle issues that arise in specification of the model for treatment required to fit marginal structural models (MSMs) and in specification of the structural model for the outcome. We discuss the differences between the effects estimated in MSMs and intention-to-treat effects estimated in randomized trials, followed by an outline of some limitations of MSMs. Copyright © 2012 John Wiley & Sons, Ltd.

LINK: http://onlinelibrary.wiley.com/doi/10.1002/pds.2306/abstract

Comment on:
Pharmacoepidemiol Drug Saf. 2012 Mar;21(3):233-40. doi: 10.1002/pds.2175. Epub 2011 Jul 22.
Comparative effectiveness of individual angiotensin receptor blockers on risk of mortality in patients with chronic heart failure. Desai RJ, Ashton CM, Deswal A, Morgan RO, Mehta HB, Chen H, Aparasu RR, Johnson ML. Eshelman School of Pharmacy, University of North Carolina, Chapel Hill, NC, USA.

OBJECTIVE: There is little evidence on comparative effectiveness of individual angiotensin receptor blockers (ARBs) in patients with chronic heart failure (CHF). This study compared four ARBs in reducing risk of mortality in clinical practice.
METHODS: A retrospective analysis was conducted on a national sample of patients diagnosed with CHF from 1 October 1996 to 30 September 2002 identified from Veterans Affairs electronic medical records, with supplemental clinical data obtained from chart review. After excluding patients with exposure to ARBs within the previous 6 months, four treatment groups were defined based on initial use of candesartan, valsartan, losartan, and irbesartan between the index date (1 October 2000) and the study end date (30 September 2002). Time to death was measured concurrently during that period. A marginal structural model controlled for sociodemographic factors, comorbidities, comedications, disease severity (left ventricular ejection fraction), and potential time-varying confounding affected by previous treatment (hospitalization). Propensity scores derived from a multinomial logistic regression were used as inverse probability of treatment weights in a generalized estimating equation to estimate causal effects.
RESULTS: Among the 1536 patients identified on ARB therapy, irbesartan was most frequently used (55.21%), followed by losartan (21.74%), candesartan (15.23%), and valsartan (7.81%). When compared with losartan, after adjusting for time-varying hospitalization in marginal structural model, candesartan (OR=0.79, 95%CI=0.42-1.50), irbesartan (OR=1.17, 95%CI=0.72-1.90), and valsartan (OR=0.98, 95%CI=0.45-2.14) were found to have similar effectiveness in reducing mortality in CHF patients.
CONCLUSION: Effectiveness of ARBs in reducing mortality is similar in patients with CHF in everyday clinical practice. Copyright © 2011 John Wiley & Sons, Ltd.
PMID: 21786364  [PubMed - in process]

LINK: http://onlinelibrary.wiley.com/doi/10.1002/pds.2175/abstract
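
Illustrative sketch (hypothetical data and variable names, not the Desai et al. analysis): the inverse-probability-of-treatment weighting discussed in the two entries above can be sketched with stabilized weights over two treatment periods, where a time-varying covariate affected by earlier treatment is handled by weighting rather than by covariate adjustment.

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 5000

# Period 1: baseline confounder L1 affects treatment A1.
L1 = rng.normal(size=n)
A1 = rng.binomial(1, 1 / (1 + np.exp(-0.6 * L1)))
# Period 2: L2 is a time-varying confounder affected by earlier treatment A1.
L2 = rng.normal(0.5 * L1 - 0.4 * A1, 1)
A2 = rng.binomial(1, 1 / (1 + np.exp(-0.6 * L2 - 0.5 * A1)))
Y = 1.0 * (A1 + A2) + 0.8 * L1 + 0.8 * L2 + rng.normal(size=n)

def prob_of_observed(treatment, design):
    """P(observed treatment | design) from a logistic model."""
    X = sm.add_constant(design)
    p = sm.Logit(treatment, X).fit(disp=0).predict(X)
    return np.where(treatment == 1, p, 1 - p)

# Denominator: treatment given full covariate and treatment history.
den = prob_of_observed(A1, L1.reshape(-1, 1)) * \
      prob_of_observed(A2, np.column_stack([L1, L2, A1]))
# Numerator (stabilization): treatment given earlier treatment only.
p1 = A1.mean()
num = np.where(A1 == 1, p1, 1 - p1) * prob_of_observed(A2, A1.reshape(-1, 1))
sw = num / den                                           # stabilized weights

# Weighted outcome regression = marginal structural model for (A1, A2).
# Under this data-generating process the causal coefficients are about
# 0.68 for A1 (direct effect plus the pathway through L2) and 1.0 for A2.
msm = sm.WLS(Y, sm.add_constant(np.column_stack([A1, A2])), weights=sw).fit()
print(msm.params)   # robust or bootstrap standard errors would be needed in practice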

 

March 2012


CER Scan [Epub ahead of print]

    1. Stat Med. 2012 Feb 17. doi: 10.1002/sim.4510. [Epub ahead of print]

Longitudinal structural mixed models for the analysis of surgical trials with noncompliance. Sitlani CM, Heagerty PJ, Blood EA, Tosteson TD. Department of Biostatistics, University of Washington, F-600 Health Sciences Building, Box 357232, Seattle, WA 98195, USA; Cardiovascular Health Research Unit, University of Washington, 1730 Minor Ave, Suite 1360, Box 358085, Seattle, WA. csitlani@u.washington.edu.

Patient noncompliance complicates the analysis of many randomized trials seeking to evaluate the effect of surgical intervention as compared with a nonsurgical treatment. If selection for treatment depends on intermediate patient characteristics or outcomes, then ‘as-treated’ analyses may be biased for the estimation of causal effects. Therefore, the selection mechanism for treatment and/or compliance should be carefully considered when conducting analysis of surgical trials. We compare the performance of alternative methods when endogenous processes lead to patient crossover. We adopt an underlying longitudinal structural mixed model that is a natural example of a structural nested model. Likelihood-based methods are not typically used in this context; however, we show that standard linear mixed models will be valid under selection mechanisms that depend only on past covariate and outcome history. If there are underlying patient characteristics that influence selection, then likelihood methods can be extended via maximization of the joint likelihood of exposure and outcomes. Semi-parametric causal estimation methods such as marginal structural models, g-estimation, and instrumental variable approaches can also be valid, and we both review and evaluate their implementation in this setting. The assumptions required for valid estimation vary across approaches; thus, the choice of methods for analysis should be driven by which outcome and selection assumptions are plausible. Copyright © 2012 John Wiley & Sons, Ltd.
PMID: 22344923 [PubMed - as supplied by publisher]

Link: http://onlinelibrary.wiley.com/doi/10.1002/sim.4510/abstract;jsessionid=B8169932E2A812E3947828E1330A31D8.d02t04

    1. Biometrics. 2012 Feb 2. doi: 10.1111/j.1541-0420.2011.01722.x. [Epub ahead of print]

Assessing Treatment-Selection Markers using a Potential Outcomes Framework.
Huang Y, Gilbert PB, Janes H. Fred Hutchinson Cancer Research Center, Seattle, Washington, 98109, U.S.A. Department of Biostatistics, University of Washington, Seattle, WA

Treatment-selection markers are biological molecules or patient characteristics associated with one’s response to treatment. They can be used to predict treatment effects for individual subjects and subsequently help deliver treatment to those most likely to benefit from it. Statistical tools are needed to evaluate a marker’s capacity to help with treatment selection. The commonly adopted criterion for a good treatment-selection marker has been the interaction between marker and treatment. While a strong interaction is important, it is, however, not sufficient for good marker performance. In this article, we develop novel measures for assessing a continuous treatment-selection marker, based on a potential outcomes framework. Under a set of assumptions, we derive the optimal decision rule based on the marker to classify individuals according to treatment benefit, and characterize the marker’s performance using the corresponding classification accuracy as well as the overall distribution of the classifier. We develop a constrained maximum-likelihood method for estimation and testing in a randomized trial setting. Simulation studies are conducted to demonstrate the performance of our methods. Finally, we illustrate the methods using an HIV vaccine trial where we explore the value of the level of preexisting immunity to adenovirus serotype 5 for predicting a vaccine-induced increase in the risk of HIV acquisition. © 2012, The International Biometric Society.
PMID: 22299708 [PubMed - as supplied by publisher]

Link: http://onlinelibrary.wiley.com/doi/10.1111/j.1541-0420.2011.01722.x/abstract

CER Scan [published within the last 30 days]

    1. Am J Epidemiol. 2012 Feb 1;175(3):210-7. Epub 2011 Dec 23.

Dealing with missing outcome data in randomized trials and observational studies.
Groenwold RH, Donders AR, Roes KC, Harrell FE Jr, Moons KG. Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, the Netherlands. r.h.h.groenwold@umcutrecht.nl

Although missing outcome data are an important problem in randomized trials and observational studies, methods to address this issue can be difficult to apply. Using simulated data, the authors compared 3 methods to handle missing outcome data: 1) complete case analysis; 2) single imputation; and 3) multiple imputation (all 3 with and without covariate adjustment). Simulated scenarios focused on continuous or dichotomous missing outcome data from randomized trials or observational studies. When outcomes were missing at random, single and multiple imputations yielded unbiased estimates after covariate adjustment. Estimates obtained by complete case analysis with covariate adjustment were unbiased as well, with coverage close to 95%. When outcome data were missing not at random, all methods gave biased estimates, but handling missing outcome data by means of 1 of the 3 methods reduced bias compared with a complete case analysis without covariate adjustment. Complete case analysis with covariate adjustment and multiple imputation yield similar estimates in the event of missing outcome data, as long as the same predictors of missingness are included. Hence, complete case analysis with covariate adjustment can and should be used as the analysis of choice more often. Multiple imputation, in addition, can accommodate the missing-not-at-random scenario more flexibly, making it especially suited for sensitivity analyses.
PMID: 22262640 [PubMed - in process]

Link: http://aje.oxfordjournals.org/cgi/pmidlookup?view=long&pmid=22262640
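
Illustrative sketch (hypothetical data, continuous outcome): the comparison described above between covariate-adjusted complete-case analysis and multiple imputation can be reproduced in miniature. The imputation step below draws the regression coefficients from their approximate posterior and pools estimates with Rubin's rules; for brevity the residual variance is not redrawn.

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
n = 2000
x = rng.normal(size=n)
t = rng.binomial(1, 0.5, n)                        # randomized treatment
y = 1.0 * t + 1.5 * x + rng.normal(size=n)         # true treatment effect = 1.0
miss = rng.binomial(1, 1 / (1 + np.exp(-(x - 0.5)))).astype(bool)   # MAR given x
y_obs = np.where(miss, np.nan, y)

X = sm.add_constant(np.column_stack([t, x]))

# 1) Complete-case analysis with covariate adjustment.
cc = sm.OLS(y_obs[~miss], X[~miss]).fit()
print("complete case estimate:", cc.params[1])

# 2) Multiple imputation from the fitted outcome model, pooled with Rubin's rules.
fit_obs = sm.OLS(y_obs[~miss], X[~miss]).fit()
M, ests, within = 20, [], []
for _ in range(M):
    beta = rng.multivariate_normal(fit_obs.params, fit_obs.cov_params())
    y_imp = y_obs.copy()
    y_imp[miss] = X[miss] @ beta + rng.normal(0, np.sqrt(fit_obs.scale), miss.sum())
    f = sm.OLS(y_imp, X).fit()
    ests.append(f.params[1])
    within.append(f.bse[1] ** 2)
qbar = np.mean(ests)
total_var = np.mean(within) + (1 + 1 / M) * np.var(ests, ddof=1)
print("multiple imputation estimate:", qbar, "pooled SE:", np.sqrt(total_var))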

    1. Stat Med. 2012 Feb 20;31(4):383-96. doi: 10.1002/sim.4453.

Hierarchical priors for bias parameters in Bayesian sensitivity analysis for unmeasured confounding. McCandless LC, Gustafson P, Levy AR, Richardson S.

Faculty of Health Sciences, Simon Fraser University, Burnaby, BC V5A 1S6, Canada. mccandless@sfu.ca

Recent years have witnessed new innovation in Bayesian techniques to adjust for unmeasured confounding. A challenge with existing methods is that the user is often required to elicit prior distributions for high-dimensional parameters that model competing bias scenarios. This can render the methods unwieldy. In this paper, we propose a novel methodology to adjust for unmeasured confounding that derives default priors for bias parameters for observational studies with binary covariates. The confounding effects of measured and unmeasured variables are treated as exchangeable within a Bayesian framework. We model the joint distribution of covariates by using a log-linear model with pairwise interaction terms. Hierarchical priors constrain the magnitude and direction of bias parameters. An appealing property of the method is that the conditional distribution of the unmeasured confounder follows a logistic model, giving a simple equivalence with previously proposed methods. We apply the method in a data example from pharmacoepidemiology and explore the impact of different priors for bias parameters on the analysis results. Copyright © 2011 John Wiley & Sons, Ltd.
PMID: 22253142 [PubMed - in process]

Link: http://onlinelibrary.wiley.com/doi/10.1002/sim.4453/abstract

    1. Am J Epidemiol. 2012 Mar 1;175(5):368-75. Epub 2012 Feb 3.

Bayesian posterior distributions without Markov chains. Cole SR, Chu H, Greenland S, Hamra G, Richardson DB.

Bayesian posterior parameter distributions are often simulated using Markov chain Monte Carlo (MCMC) methods. However, MCMC methods are not always necessary and do not help the uninitiated understand Bayesian inference. As a bridge to understanding Bayesian inference, the authors illustrate a transparent rejection sampling method. In example 1, they illustrate rejection sampling using 36 cases and 198 controls from a case-control study (1976-1983) assessing the relation between residential exposure to magnetic fields and the development of childhood cancer. Results from rejection sampling (odds ratio (OR) = 1.69, 95% posterior interval (PI): 0.57, 5.00) were similar to MCMC results (OR = 1.69, 95% PI: 0.58, 4.95) and approximations from data-augmentation priors (OR = 1.74, 95% PI: 0.60, 5.06). In example 2, the authors apply rejection sampling to a cohort study of 315 human immunodeficiency virus seroconverters (1984-1998) to assess the relation between viral load after infection and 5-year incidence of acquired immunodeficiency syndrome, adjusting for (continuous) age at seroconversion and race. In this more complex example, rejection sampling required a notably longer run time than MCMC sampling but remained feasible and again yielded similar results. The transparency of the proposed approach comes at a price of being less broadly applicable than MCMC.
PMCID: PMC3282880 [Available on 2013/3/1] PMID: 22306565 [PubMed - in process]

Link: http://aje.oxfordjournals.org/content/175/5/368.long
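
Illustrative sketch (toy 2x2 counts, not the study data): the rejection sampler the authors describe draws parameters from the prior and accepts each draw with probability equal to its likelihood divided by the maximum likelihood. The sketch below targets the log odds ratio in a case-control table, with normal priors chosen narrow enough to keep the acceptance rate tolerable.

import numpy as np

rng = np.random.default_rng(3)
a, b = 12, 24      # cases: exposed, unexposed (hypothetical counts)
c, d = 40, 160     # controls: exposed, unexposed

def loglik(alpha, beta):
    """Log-likelihood of exposure given case status: logit P(exposed) = alpha + beta*case."""
    p_case = 1 / (1 + np.exp(-(alpha + beta)))
    p_ctrl = 1 / (1 + np.exp(-alpha))
    return (a * np.log(p_case) + b * np.log(1 - p_case)
            + c * np.log(p_ctrl) + d * np.log(1 - p_ctrl))

# The two-parameter model is saturated, so the likelihood peaks at the empirical proportions.
alpha_hat = np.log(c / d)
beta_hat = np.log(a / b) - alpha_hat              # log of the sample odds ratio
max_ll = loglik(alpha_hat, beta_hat)

draws = np.empty(0)
while draws.size < 2000:
    alpha = rng.normal(0, 3, 100000)              # prior on the intercept
    beta = rng.normal(0, 1.5, 100000)             # prior on the log odds ratio
    accept = np.log(rng.uniform(size=100000)) < loglik(alpha, beta) - max_ll
    draws = np.concatenate([draws, beta[accept]])

posterior_or = np.exp(draws)
print(np.median(posterior_or), np.percentile(posterior_or, [2.5, 97.5]))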

    1. Med Care. 2012 Feb;50(2):109-16.

A longitudinal examination of a pay-for-performance program for diabetes care: evidence from a natural experiment. Cheng SH, Lee TT, Chen CC. Institute of Health Policy and Management, College of Public Health, National Taiwan University, Taiwan. shcheng@ntu.edu.tw

BACKGROUND: Numerous studies have examined the impacts of pay-for-performance programs, yet little is known about their long-term effects on health care expenses.
OBJECTIVES: This study aimed to examine the long-term effects of a pay-for-performance program for diabetes care on health care utilization and expenses.
METHODS: This study represents a nationwide population-based natural experiment with a 4-year follow-up period under a compulsory universal health insurance program in Taiwan. The intervention groups consisted of 20,934 patients enrolled in the program in 2005, and 9694 patients continuously participated in the program for 4 years. Two comparison groups were selected by propensity score matching from patients seen by the same group of physicians. Generalized estimating equations were used to estimate differences-in-differences models to examine the effects of the pay-for-performance program.
RESULTS: Patients enrolled in the pay-for-performance program underwent significantly more diabetes specific examinations and tests after enrollment; the differences between the intervention and comparison groups declined gradually over time but remained significant. Patients in the intervention groups had a significantly higher number of diabetes-related physician visits in only the first year after enrollment and had fewer diabetes-related hospitalizations in the follow-up period. Concerning overall health care expenses, patients in the intervention groups spent more than the comparison group in the first year; however, the continual enrollees spent significantly less than their counterparts in the subsequent years.
CONCLUSIONS: The program seemed to achieve its primary goal in improving health care and providing long-term cost benefits.
PMID: 22249920 [PubMed - in process]

Link: http://journals.lww.com/lww-medicalcare/pages/articleviewer.aspx?year=2012&issue=02000&article=00001&type=abstract
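
Illustrative sketch (hypothetical two-period panel, not the Taiwan data): the difference-in-differences estimate obtained from generalized estimating equations corresponds to the coefficient on the interaction of a program-enrollment indicator and a post-period indicator, with clustering on the patient. Variable names and effect sizes below are made up.

import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

rng = np.random.default_rng(7)
n_patients = 1500
df = pd.DataFrame({
    "pid": np.repeat(np.arange(n_patients), 2),             # two periods per patient
    "post": np.tile([0, 1], n_patients),
    "p4p": np.repeat(rng.binomial(1, 0.5, n_patients), 2),   # program enrollee
})
# Outcome, e.g. number of diabetes-specific exams; enrollees gain 1.0 after enrollment.
df["exams"] = (2 + 0.5 * df["post"] + 0.8 * df["p4p"]
               + 1.0 * df["post"] * df["p4p"] + rng.normal(0, 1, len(df)))

did = smf.gee("exams ~ post * p4p", groups="pid", data=df,
              family=sm.families.Gaussian(),
              cov_struct=sm.cov_struct.Exchangeable()).fit()
print(did.params["post:p4p"])   # difference-in-differences estimate, about 1.0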

CER Scan [articles of interest published within the last 4 months]

    1. Value in Health [Available online 8 November 2011] DOI: 10.1016/j.jval.2011.08.1740

Conducting Comparative Effectiveness Research on Medications: The Views of a Practicing Epidemiologist from the Other Washington. Bruce M. Psaty

No Abstract
Link: http://www.valueinhealthjournal.com/article/PIIS1098301511033274/abstract?rss=yes

    1. Health Serv Outcomes Res Method. 2011; 11:95-114

Extending iterative matching methods: an approach to improving covariate balance that allows prioritisation. Ramsahai RR, Grieve R, Sekhon JS.

Comparative effectiveness studies can identify the causal effect of treatment if treatment is unconfounded with outcome conditional on a set of measured covariates. Matching aims to ensure that the covariate distributions are similar between treatment and control groups in the matched samples, and this should be done iteratively by checking and improving balance. However, an outstanding concern facing matching methods is how to prioritise competing improvements in balance across different covariates. We address this concern by developing a ‘loss function’ that an iterative matching method can minimise. Our ‘loss function’ is a transparent summary of covariate imbalance in a matched sample and follows general recommendations in prioritising balance amongst covariates. We illustrate this approach by extending Genetic Matching (GM), an automated approach to balance checking. We use the method to reanalyse a high profile comparative effectiveness study of right heart catheterisation. We find that our loss function improves covariate balance compared to a standard GM approach, and to matching on the published propensity score.

Link: http://www.springerlink.com/content/bl41h30008667400/

 

February 2012


CER Scan [Epub ahead of print]

  1. Am J Epidemiol. 2012 Jan 5. [Epub ahead of print]

Bias in Observational Studies of Prevalent Users: Lessons for Comparative Effectiveness Research From a Meta-Analysis of Statins. Danaei G, Tavakkoli M, Hernán MA.

 

Randomized clinical trials (RCTs) are usually the preferred strategy with which to generate evidence of comparative effectiveness, but conducting an RCT is not always feasible. Though observational studies and RCTs often provide comparable estimates, the questioning of observational analyses has recently intensified because of randomized-observational discrepancies regarding the effect of postmenopausal hormone replacement therapy on coronary heart disease. Reanalyses of observational data that excluded prevalent users of hormone replacement therapy led to attenuated discrepancies, which begs the question of whether exclusion of prevalent users should be generally recommended. In the current study, the authors evaluated the effect of excluding prevalent users of statins in a meta-analysis of observational studies of persons with cardiovascular disease. The pooled, multivariate-adjusted mortality hazard ratio for statin use was 0.77 (95% confidence interval (CI): 0.65, 0.91) in 4 studies that compared incident users with nonusers, 0.70 (95% CI: 0.64, 0.78) in 13 studies that compared a combination of prevalent and incident users with nonusers, and 0.54 (95% CI: 0.45, 0.66) in 13 studies that compared prevalent users with nonusers. The corresponding hazard ratio from 18 RCTs was 0.84 (95% CI: 0.77, 0.91). It appears that the greater the proportion of prevalent statin users in observational studies, the larger the discrepancy between observational and randomized estimates.
PMID: 22223710

CER Scan [published within the last 30 days]

    1. J Clin Epidemiol. 2012 Feb;65(2):132-7. Epub 2011 Aug 12.

The “best balance” allocation led to optimal balance in cluster-controlled trials. de Hoop E, Teerenstra S, van Gaal BG, Moerbeek M, Borm GF. Department of Epidemiology, Biostatistics and HTA, 133, Radboud University Nijmegen Medical Centre, PO Box 9101, 6500 HB Nijmegen, The Netherlands.

OBJECTIVE: Balance of prognostic factors between treatment groups is desirable because it improves the accuracy, precision, and credibility of the results. In cluster-controlled trials, imbalance can easily occur by chance when the number of clusters is small. If all clusters are known at the start of the study, the “best balance” allocation method (BB) can be used to obtain optimal balance. This method will be compared with other allocation methods.
STUDY DESIGN AND SETTING: We carried out a simulation study to compare the balance obtained with BB, minimization, unrestricted randomization, and matching for four to 20 clusters and one to five categorical prognostic factors at cluster level.
RESULTS: BB resulted in a better balance than randomization in 13-100% of the situations, in 0-61% for minimization, and in 0-88% for matching. The superior performance of BB increased as the number of clusters and/or the number of factors increased.
CONCLUSION: BB results in a better balance of prognostic factors than randomization, minimization, stratification, and matching in most situations. Furthermore, BB cannot result in a worse balance of prognostic factors than the other methods. Copyright © 2012 Elsevier Inc. All rights reserved.
PMID: 21840173

    1. Clin Pharmacol Ther. 2012 Feb;91(2):165-7. doi: 10.1038/clpt.2011.208.

Challenges in designing comparative-effectiveness trials for antidepressants. Leon AC. Departments of Psychiatry and Public Health, Weill Cornell Medical College, New York, New York, USA.

Comparative-effectiveness antidepressant trials offer promise to provide empirical evidence for clinicians choosing among interventions. Whether such trials posit superiority or noninferiority (NI) hypotheses, they pose formidable challenges. For instance, if meaningful antidepressant differences are seen in comparative-superiority trials, they will be small. NI hypothesis testing, on the other hand, requires an a priori NI margin and evidence of trial assay sensitivity. Either design demands unusually large samples, which could render such trials infeasible.
PMID: 22261683 [PubMed - in process]

FEBRUARY THEME: Selected Methods Manuscripts from the Pharmacoepidemiology and Drug Safety Mini-Sentinel Supplement
(Link to entire supplement: http://onlinelibrary.wiley.com/doi/10.1002/pds.v21.S1/issuetoc)

    1. Pharmacoepidemiol Drug Saf. 2012 Jan;21 Suppl 1:1-8. doi: 10.1002/pds.2343.

The U.S. Food and Drug Administration’s Mini-Sentinel program: status and direction. Platt R, Carnahan RM, Brown JS, Chrischilles E, Curtis LH, Hennessy S, Nelson JC, Racoosin JA, Robb M, Schneeweiss S, Toh S, Weiner MG.

The Mini-Sentinel is a pilot program that is developing methods, tools, resources, policies, and procedures to facilitate the use of routinely collected electronic healthcare data to perform active surveillance of the safety of marketed medical products, including drugs, biologics, and medical devices. The U.S. Food and Drug Administration (FDA) initiated the program in 2009 as part of its Sentinel Initiative, in response to a Congressional mandate in the FDA Amendments Act of 2007. After two years, Mini-Sentinel includes 31 academic and private organizations. It has developed policies, procedures, and technical specifications for developing and operating a secure distributed data system comprised of separate data sets that conform to a common data model covering enrollment, demographics, encounters, diagnoses, procedures, and ambulatory dispensing of prescription drugs. The distributed data sets currently include administrative and claims data from 2000 to 2011 for over 300 million person-years, 2.4 billion encounters, 38 million inpatient hospitalizations, and 2.9 billion dispensings. Selected laboratory results and vital signs data recorded after 2005 are also available. There is an active data quality assessment and characterization program, and eligibility for medical care and pharmacy benefits is known. Systematic reviews of the literature have assessed the ability of administrative data to identify health outcomes of interest, and procedures have been developed and tested to obtain, abstract, and adjudicate full-text medical records to validate coded diagnoses. Mini-Sentinel has also created a taxonomy of study designs and analytical approaches for many commonly occurring situations, and it is developing new statistical and epidemiologic methods to address certain gaps in analytic capabilities. Assessments are performed by distributing computer programs that are executed locally by each data partner. The system is in active use by FDA, with the majority of assessments performed using customizable, reusable queries (programs). Prospective and retrospective assessments that use customized protocols are conducted as well. To date, several hundred unique programs have been distributed and executed. Current activities include active surveillance of several drugs and vaccines, expansion of the population, enhancement of the common data model to include additional types of data from electronic health records and registries, development of new methodologic capabilities, and assessment of methods to identify and validate additional health outcomes of interest. Copyright © 2012 John Wiley & Sons, Ltd.
PMID: 22262586 [PubMed - in process]

Link to Free PDF: http://onlinelibrary.wiley.com/doi/10.1002/pds.2343/pdf

    1. Pharmacoepidemiol Drug Saf. 2012 Jan;21 Suppl 1:18-22. doi: 10.1002/pds.2319.

A policy framework for public health uses of electronic health data. McGraw D, Rosati K, Evans B.

Successful implementation of a program of active safety surveillance of drugs and medical products depends on public trust. This article summarizes how the initial pilot phase of the FDA’s Sentinel Initiative, Mini-Sentinel, is being conducted in compliance with applicable federal and state laws. The article also sets forth the attributes of Mini-Sentinel that enhance privacy and public trust, including the use of a distributed data system (where identifiable information remains at the data partners) and the adoption by participants of additional mandatory policies and procedures implementing fair information practices. The authors conclude by discussing the implications of this model for other types of secondary health data uses. Copyright © 2012 John Wiley & Sons, Ltd.
PMID:22262589 [PubMed - in process]

Link to Free PDF: http://onlinelibrary.wiley.com/doi/10.1002/pds.2319/pdf

    1. Pharmacoepidemiol Drug Saf. 2012 Jan;21 Suppl 1:23-31. doi: 10.1002/pds.2336.

Design considerations, architecture, and use of the Mini-Sentinel distributed data system. Curtis LH, Weiner MG, Boudreau DM, Cooper WO, Daniel GW, Nair VP, Raebel MA, Beaulieu NU, Rosofsky R, Woodworth TS, Brown JS.

Purpose: We describe the design, implementation, and use of a large, multiorganizational distributed database developed to support the Mini-Sentinel Pilot Program of the US Food and Drug Administration (FDA). As envisioned by the US FDA, this implementation will inform and facilitate the development of an active surveillance system for monitoring the safety of medical products (drugs, biologics, and devices) in the USA.
Methods: A common data model was designed to address the priorities of the Mini-Sentinel Pilot and to leverage the experience and data of participating organizations and data partners. A review of existing common data models informed the process. Each participating organization designed a process to extract, transform, and load its source data, applying the common data model to create the Mini-Sentinel Distributed Database. Transformed data were characterized and evaluated using a series of programs developed centrally and executed locally by participating organizations. A secure communications portal was designed to facilitate queries of the Mini-Sentinel Distributed Database and transfer of confidential data, analytic tools were developed to facilitate rapid response to common questions, and distributed querying software was implemented to facilitate rapid querying of summary data.
Results: As of July 2011, information on 99 260 976 health plan members was included in the Mini-Sentinel Distributed Database. The database includes 316 009 067 person-years of observation time, with members contributing, on average, 27.0 months of observation time. All data partners have successfully executed distributed code and returned findings to the Mini-Sentinel Operations Center.
Conclusion: This work demonstrates the feasibility of building a large, multiorganizational distributed data system in which organizations retain possession of their data that are used in an active surveillance system. Copyright © 2012 John Wiley & Sons, Ltd.
PMID: 22262590 [PubMed - in process]

Link to Free PDF: http://onlinelibrary.wiley.com/doi/10.1002/pds.2336/pdf

    1. Pharmacoepidemiol Drug Saf. 2012 Jan;21 Suppl 1:41-49. doi: 10.1002/pds.2328.

Using high-dimensional propensity scores to automate confounding control in a distributed medical product safety surveillance system. Rassen JA, Schneeweiss S.

Distributed medical product safety monitoring systems such as the Sentinel System, to be developed as a part of Food and Drug Administration’s Sentinel Initiative, will require automation of large parts of the safety evaluation process to achieve the necessary speed and scale at reasonable cost without sacrificing validity. Although certain functions will require investigator intervention, confounding control is one area that can largely be automated. The high-dimensional propensity score (hd-PS) algorithm is one option for automated confounding control in longitudinal healthcare databases. In this article, we discuss the use of hd-PS for automating confounding control in sequential database cohort studies, as applied to safety monitoring systems. In particular, we discuss the robustness of the covariate selection process, the potential for over- or under-selection of variables including the possibilities of M-bias and Z-bias, the computation requirements, the practical considerations in a federated database network, and the cases where automated confounding adjustment may not function optimally. We also outline recent improvements to the algorithm and show how the algorithm has performed in several published studies. We conclude that despite certain limitations, hd-PS offers substantial advantages over non-automated alternatives in active product safety monitoring systems. Copyright © 2012 John Wiley & Sons, Ltd.
PMID: 22262592 [PubMed - in process]

Link to Free PDF: http://onlinelibrary.wiley.com/doi/10.1002/pds.2328/pdf
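
Illustrative sketch (not the Mini-Sentinel implementation): one common reading of the hd-PS prioritization step ranks empirically generated binary codes by the confounding they could transmit, approximated with the Bross bias formula, and enters the top-ranked codes into the propensity score model. The code below is a simplified rendering of that idea; edge cases such as empty cells are not handled, and all names and data are hypothetical.

import numpy as np
import statsmodels.api as sm

def bross_bias(code, treated, outcome):
    """Multiplicative bias a binary code could induce if left unadjusted (Bross formula)."""
    pc1 = code[treated == 1].mean()               # prevalence among exposed
    pc0 = code[treated == 0].mean()               # prevalence among unexposed
    rr = outcome[code == 1].mean() / outcome[code == 0].mean()
    rr = max(rr, 1 / rr)                          # direction-free code-outcome association
    return (pc1 * (rr - 1) + 1) / (pc0 * (rr - 1) + 1)

def hdps_like_selection(codes, treated, outcome, k=20):
    """Rank candidate codes (columns of a 0/1 matrix) and fit a PS on the top k."""
    scores = [abs(np.log(bross_bias(codes[:, j], treated, outcome)))
              for j in range(codes.shape[1])]
    keep = np.argsort(scores)[::-1][:k]
    X = sm.add_constant(codes[:, keep])
    ps = sm.Logit(treated, X).fit(disp=0).predict(X)
    return keep, ps

# Hypothetical demo: 200 candidate codes, of which the first 5 truly confound.
rng = np.random.default_rng(9)
n, p = 4000, 200
codes = rng.binomial(1, 0.1, size=(n, p))
signal = codes[:, :5].sum(axis=1)
treated = rng.binomial(1, 1 / (1 + np.exp(-(signal - 0.5))))
outcome = rng.binomial(1, 1 / (1 + np.exp(-(0.5 * treated + signal - 2))))
keep, ps = hdps_like_selection(codes, treated, outcome)
print(sorted(keep[:10]))        # the confounding codes should rank near the top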

    1. Pharmacoepidemiol Drug Saf. 2012 Jan;21 Suppl 1:50-61. doi: 10.1002/pds.2330.

When should case-only designs be used for safety monitoring of medical products? Maclure M, Fireman B, Nelson JC, Hua W, Shoaibi A, Paredes A, Madigan D.

Purpose: To assess case-only designs for surveillance with administrative databases.
Methods: We reviewed literature on two designs that are observational analogs to crossover experiments: the self-controlled case series (SCCS) and the case-crossover (CCO) design.
Results: SCCS views the ‘experiment’ prospectively, comparing outcome risks in windows with different exposures. CCO retrospectively compares exposure frequencies in case and control windows. The main strength of case-only designs is they entail self-controlled analyses that eliminate confounding and selection bias by time-invariant characteristics not recorded in healthcare databases. They also protect privacy and are computationally efficient, as they require fewer subjects and variables. They are better than cohort designs for investigating transient effects of accurately recorded preventive agents, for example, vaccines. They are problematic if timing of self-administration is sporadic and dissociated from dispensing times, for example, analgesics. They tend to have less exposure misclassification bias and time-varying confounding if exposures are brief. Standard SCCS designs are bidirectional (using time both before and after the first exposure event), so they are more susceptible than CCOs to reverse-causality bias, including immortal-time bias. This is true also for sequence symmetry analysis, a simplified SCCS. Unidirectional CCOs use only time before the outcome, so they are less affected by reverse causality but susceptible to exposure-trend bias. Modifications of SCCS and CCO partially deal with these biases. The head-to-head comparison of multiple products helps to control residual biases.
Conclusion: The case-only analyses of intermittent users complement the cohort analyses of prolonged users because their different biases compensate for one another. Copyright © 2012 John Wiley & Sons, Ltd.
PMID: 22262593 [PubMed - in process]

Link to Free PDF: http://onlinelibrary.wiley.com/doi/10.1002/pds.2330/pdf

    1. Pharmacoepidemiol Drug Saf. 2012 Jan;21 Suppl 1:62-71. doi: 10.1002/pds.2324.

Challenges in the design and analysis of sequentially monitored postmarket safety surveillance evaluations using electronic observational health care data. Nelson JC, Cook AJ, Yu O, Dominguez C, Zhao S, Greene SK, Fireman BH, Jacobsen SJ, Weintraub ES, Jackson LA.

Purpose: Many challenges arise when conducting a sequentially monitored medical product safety surveillance evaluation using observational electronic data captured during routine care. We review existing sequential approaches for potential use in this setting, including a continuous sequential testing method that has been utilized within the Vaccine Safety Datalink (VSD) and group sequential methods, which are used widely in randomized clinical trials.
Methods: Using both simulated data and preliminary data from an ongoing VSD evaluation, we discuss key sequential design considerations, including sample size and duration of surveillance, shape of the signaling threshold, and frequency of interim testing.
Results and Conclusions: All designs control the overall Type 1 error rate across all tests performed, but each yields different tradeoffs between the probability and timing of true and false positive signals. Designs tailored to monitor efficacy outcomes in clinical trials have been well studied, but less consideration has been given to optimizing design choices for observational safety settings, where the hypotheses, population, prevalence and severity of the outcomes, implications of signaling, and costs of false positive and negative findings are very different. Analytic challenges include confounding, missing and partially accrued data, high misclassification rates for outcomes presumptively defined using diagnostic codes, and unpredictable changes in dynamically accessed data over time (e.g., differential product uptake). Many of these factors influence the variability of the adverse events under evaluation and, in turn, the probability of committing a Type 1 error. Thus, to ensure proper Type 1 error control, planned sequential thresholds should be adjusted over time to account for these issues. Copyright © 2012 John Wiley & Sons, Ltd.
PMID: 22262594 [PubMed - in process]

Link to Free PDF: http://onlinelibrary.wiley.com/doi/10.1002/pds.2324/pdf

    1. Pharmacoepidemiol Drug Saf. 2012 Jan;21 Suppl 1:72-81. doi: 10.1002/pds.2320.

Statistical approaches to group sequential monitoring of postmarket safety surveillance data: current state of the art for use in the Mini-Sentinel pilot. Cook AJ, Tiwari RC, Wellman RD, Heckbert SR, Li L, Heagerty P, Marsh T, Nelson JC.

Purpose: This manuscript describes the current statistical methodology available for active postmarket surveillance of pre-specified safety outcomes using a prospective incident user concurrent control cohort design with existing electronic healthcare data.
Methods: Motivation of the active postmarket surveillance setting is provided using the Food and Drug Administration’s Mini-Sentinel Pilot as an example. Four sequential monitoring statistical methods are presented including the Lan–Demets error spending approach, a matched likelihood ratio test statistic approach with the binomial MaxSPRT as a special case, the conditional sequential sampling procedure with stratification, and a generalized estimating equation regression approach using permutation. Information on the assumptions, limitations, and advantages of each approach is provided, including how each method defines sequential monitoring boundaries, what test statistic is used, and how robust it is to settings of rare events or frequent testing.
Results: A hypothetical example of how the approaches could be applied to data comparing a medical product of interest, drug A, to a concurrent control drug, drug B, is presented including providing the type of information one would have available for monitoring such drugs. Copyright © 2012 John Wiley & Sons, Ltd.
PMID: 22262595 [PubMed - in process]

Link to Free PDF: http://onlinelibrary.wiley.com/doi/10.1002/pds.2320/pdf

    1. Pharmacoepidemiol Drug Saf. 2012 Jan;21 Suppl 1:282-290. doi: 10.1002/pds.2337.

A protocol for active surveillance of acute myocardial infarction in association with the use of a new antidiabetic pharmaceutical agent. Fireman B, Toh S, Butler MG, Go AS, Joffe HV, Graham DJ, Nelson JC, Daniel GW, Selby JV.

Purpose: To describe a protocol for active surveillance of acute myocardial infarction (AMI) in users of a recently approved oral antidiabetic medication, saxagliptin, and to provide the rationale for decisions made in drafting the protocol.
Methods: A new-user cohort design is planned for evaluating data from at least four Mini-Sentinel data partners from 1 August 2009 (following US Food and Drug Administration’s approval of saxagliptin) through mid-2013. New users of saxagliptin will be compared in separate analyses with new users of sitagliptin, pioglitazone, long-acting insulins, and second-generation sulfonylureas. Two approaches to controlling for confounding will be evaluated: matching by exposure propensity score and stratification by AMI risk score. The primary analyses will use Cox regression models specified in a way that does not require pooling of patient-level data from the data partners. The Cox models are fit to summarized data on risk sets composed of saxagliptin users and similar comparator users at the time of an AMI. Secondary analyses will use alternative methods including Poisson regression and will explore whether further adjustment for covariates available only at some data partners (e.g., blood pressure) modifies results.
Results: The results of this study are pending.
Conclusions: The proposed protocol describes a design for surveillance to evaluate the safety of a newly marketed agent as postmarket experience accrues. It uses data from multiple partner organizations without requiring sharing of patient-level data and compares alternative approaches to controlling for confounding. It is hoped that this initial active surveillance project of the Mini-Sentinel will provide insights that inform future population-based surveillance of medical product safety. Copyright © 2012 John Wiley & Sons, Ltd.
PMID: 22262618 [PubMed - in process]

Link to Free PDF: http://onlinelibrary.wiley.com/doi/10.1002/pds.2337/pdf

 

January 2012


CER Scan [Epub ahead of print]

    1. Pharmacoepidemiol Drug Saf. 2011 Dec 8. doi: 10.1002/pds.2256. [Epub ahead of print]

Applying propensity scores estimated in a full cohort to adjust for confounding in subgroup analyses. Rassen JA, Glynn RJ, Rothman KJ, Setoguchi S, Schneeweiss S. Division of Pharmacoepidemiology and Pharmacoeconomics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA. jrassen@post.harvard.edu.

BACKGROUND: A correctly specified propensity score (PS) estimated in a cohort (“cohort PS”) should, in expectation, remain valid in a subgroup population.
OBJECTIVE: We sought to determine whether using a cohort PS can be validly applied to subgroup analyses and, thus, add efficiency to studies with many subgroups or restricted data. METHODS: In each of three cohort studies, we estimated a cohort PS, defined five subgroups, and then estimated subgroup-specific PSs. We compared difference in treatment effect estimates for subgroup analyses adjusted by cohort PSs versus subgroup-specific PSs. Then, over 10 million times, we simulated a population with known characteristics of confounding, subgroup size, treatment interactions, and treatment effect and again assessed difference in point estimates. RESULTS: We observed that point estimates in most subgroups were substantially similar with the two methods of adjustment. In simulations, the effect estimates differed by a median of 3.4% (interquartile (IQ) range 1.3-10.0%). The IQ range exceeded 10% only in cases where the subgroup had < 1000 patients or few outcome events. CONCLUSIONS: Our empirical and simulation results indicated that using a cohort PS in subgroup analyses was a feasible approach, particularly in larger subgroups. Copyright © 2011 John Wiley & Sons, Ltd.
PMID: 22162077 [PubMed - as supplied by publisher]

    1. Stat Methods Med Res. 2011 Nov 8. [Epub ahead of print]

Extension of the modified Poisson regression model to prospective studies with correlated binary data. Zou GY, Donner A. Department of Epidemiology & Biostatistics, and Robarts Clinical Trials of Robarts Research Institute, Schulich School of Medicine & Dentistry, Canada.

The Poisson regression model using a sandwich variance estimator has become a viable alternative to the logistic regression model for the analysis of prospective studies with independent binary outcomes. The primary advantage of this approach is that it readily provides covariate-adjusted risk ratios and associated standard errors. In this article, the model is extended to studies with correlated binary outcomes as arise in longitudinal or cluster randomization studies. The key step involves a cluster-level grouping strategy for the computation of the middle term in the sandwich estimator. For a single binary exposure variable without covariate adjustment, this approach results in risk ratio estimates and standard errors that are identical to those found in the survey sampling literature. Simulation results suggest that it is reliable for studies with correlated binary data, provided the total number of clusters is at least 50. Data from observational and cluster randomized studies are used to illustrate the methods.
PMID: 22072596 [PubMed - as supplied by publisher]
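
Illustrative sketch (hypothetical clustered data): one plausible way to realize the extension described above is a GEE with a Poisson mean, a log link, and an independence working correlation, so that the sandwich variance is grouped at the cluster level and the exponentiated coefficient is a covariate-adjusted risk ratio.

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(4)
n_clusters, m = 100, 20
cluster = np.repeat(np.arange(n_clusters), m)
u = np.repeat(rng.normal(0, 0.3, n_clusters), m)        # shared cluster effect
x = np.repeat(rng.binomial(1, 0.5, n_clusters), m)      # cluster-level exposure
p = np.clip(0.2 * np.exp(0.4 * x + u), 0, 1)            # log-link risk model, RR = exp(0.4)
y = rng.binomial(1, p)

X = sm.add_constant(x)
fit = sm.GEE(y, X, groups=cluster, family=sm.families.Poisson(),
             cov_struct=sm.cov_struct.Independence()).fit()
print(np.exp(fit.params[1]), fit.bse[1])   # adjusted risk ratio and cluster-robust SE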

    1. J Clin Psychopharmacol. 2011 Dec 22. [Epub ahead of print]

Treating Depression After Initial Treatment Failure: Directly Comparing Switch and Augmenting Strategies in STAR*D. Gaynes BN, Dusetzina SB, Ellis AR, Hansen RA, Farley JF, Miller WC, Stürmer T. Department of Psychiatry, School of Medicine, UNC at Chapel Hill, Chapel Hill, NC; Department of Health Care Policy, Harvard Medical School, Boston, MA; Harrison School of Pharmacy, Auburn University, Auburn, AL; Division of Pharmaceutical Outcomes and Policy, Eshelman School of Pharmacy, and Department of Epidemiology, Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, NC.

OBJECTIVE: Augmenting and switching antidepressant medications are the 2 most common next-step strategies for depressed patients failing initial medication treatment. These approaches have not been directly compared; thus, our objectives are to compare outcomes for medication augmentation versus switching for patients with major depressive disorder in the Sequenced Treatment Alternatives to Relieve Depression (STAR*D) clinical trial. METHODS: We conducted a retrospective analysis of participants aged 18 to 75 years with DSM-IV nonpsychotic depression who failed to remit with initial treatment in the STAR*D clinical trial (N =1292). We compared depressive symptom remission, response, and quality of life among participants in each study arm using propensity score matching to minimize selection bias. RESULTS: The propensity-score-matched augment (N = 269) and switch (N = 269) groups were well balanced on measured characteristics. Neither the likelihood of remission (risk ratio, 1.14; 95% confidence level, 0.82-1.58) or response (risk ratio, 1.14; 95% confidence level, 0.82-1.58), nor the time to remission (log-rank test, P = 0.946) or response (log-rank test, P = 0.243) differed by treatment strategy. Similarly, quality of life did not differ. Post hoc analyses suggested that augmentation improved outcomes for patients tolerating 12 or more weeks of initial treatment and those with partial initial treatment response. CONCLUSIONS: For patients receiving and tolerating aggressive initial antidepressant trials, there is no clear preference for next-step augmentation versus switching. Findings tentatively suggest that those who complete an initial treatment of 12 weeks or more and have a partial response with residual mild depressive severity may benefit more from augmentation relative to switching.
PMID: 22198447 [PubMed - as supplied by publisher]

    1. J Clin Psychopharmacol. 2011 Dec 22. [Epub ahead of print]

Variation in Antipsychotic Treatment Choice Across US Nursing Homes. Huybrechts KF, Rothman KJ, Brookhart MA, Silliman RA, Crystal S, Gerhard T, Schneeweiss S. Division of Pharmacoepidemiology and Pharmacoeconomics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School; Department of Epidemiology, Boston University School of Public Health, Boston, MA; RTI Health Solutions, Research Triangle Park; UNC, Gillings School of Global Public Health, Chapel Hill, NC; Department of Medicine, Boston University School of Medicine, Boston, MA; and Rutgers University, New Brunswick, NJ.

OBJECTIVE: Despite serious safety concerns, antipsychotic medications continue to be used widely in US nursing homes. The objective of this study was to quantify the variation in antipsychotic treatment choice across US nursing homes, and to characterize its correlates.
METHODS: Prescribing practices were assessed in a cohort of 65,618 patients 65 years or older in 45 states who initiated treatment with an antipsychotic medication after nursing home admission between 2001 and 2005, using merged Medicaid; Medicare; Minimum Data Set; and Online Survey, Certification, and Reporting data. We fit mixed-effects logistic regression models to examine how antipsychotic treatment choice at the patient-level depends on patient and nursing home fixed and random effects. RESULTS: Among antipsychotic medication users, 9% of patients initiated treatment with a conventional agent. After adjustment for case-mix and facility characteristics, 95% of nursing homes had a predicted conventional antipsychotic prescribing rate between 2% and 20%. Individually, patient characteristics accounted for 36% of the explained variation, facility characteristics for 23%, and nursing home prescribing tendency for 81%. Results were consistent in the subgroup of nursing home patients with a diagnosis of dementia. The prescribing physician was not considered as a determinant of treatment choice owing to data limitations.
CONCLUSION: These findings indicate that antipsychotic treatment choice is to some extent influenced by a nursing home’s underlying prescribing “culture.” This culture may reveal strategies for targeting quality improvement interventions. In addition, these findings suggest that a nursing home’s tendency for specific antipsychotics merits further exploration as an instrumental variable for improved confounding adjustment in comparative effectiveness studies.
PMID: 22198446 [PubMed - as supplied by publisher]

    1. Stat Med. 2011 Dec 4. doi: 10.1002/sim.4413. [Epub ahead of print]

Diagnosing imputation models by applying target analyses to posterior replicates of completed data. He Y, Zaslavsky AM. Department of Health Care Policy, Harvard Medical School, Boston, MA, 02115, USA. he@hcp.med.harvard.edu.

Multiple imputation fills in missing data with posterior predictive draws from imputation models. To assess the adequacy of imputation models, we can compare completed data with their replicates simulated under the imputation model. We apply analyses of substantive interest to both datasets and use posterior predictive checks of the differences of these estimates to quantify the evidence of model inadequacy. We can further integrate out the imputed missing data and their replicates over the completed-data analyses to reduce variance in the comparison. In many cases, the checking procedure can be easily implemented using standard imputation software by treating re-imputations under the model as posterior predictive replicates. Thus, it can be applied for non-Bayesian imputation methods. We also sketch several strategies for applying the method in the context of practical imputation analyses. We illustrate the method using two real data applications and study its property using a simulation. Copyright © 2011 John Wiley & Sons, Ltd.
PMID: 22139814 [PubMed - as supplied by publisher]

CER Scan [published within the last 30 days]

    1. Epidemiology. 2012 Jan;23(1):151-8.

Is probabilistic bias analysis approximately Bayesian? Maclehose RF, Gustafson P. From the Divisions of Biostatistics, and Epidemiology and Community Health, University of Minnesota, Minneapolis, MN; and Department of Statistics, University of British Columbia, Vancouver, British Columbia, Canada.

Case-control studies are particularly susceptible to differential exposure misclassification when exposure status is determined following incident case status. Probabilistic bias analysis methods have been developed as ways to adjust standard effect estimates based on the sensitivity and specificity of exposure misclassification. The iterative sampling method advocated in probabilistic bias analysis bears a distinct resemblance to a Bayesian adjustment; however, it is not identical. Furthermore, without a formal theoretical framework (Bayesian or frequentist), the results of a probabilistic bias analysis remain somewhat difficult to interpret. We describe, both theoretically and empirically, the extent to which probabilistic bias analysis can be viewed as approximately Bayesian. Although the differences between probabilistic bias analysis and Bayesian approaches to misclassification can be substantial, these situations often involve unrealistic prior specifications and are relatively easy to detect. Outside of these special cases, probabilistic bias analysis and Bayesian approaches to exposure misclassification in case-control studies appear to perform equally well.
PMID: 22157311 [PubMed - in process]
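
Illustrative sketch (toy counts and priors): a simple probabilistic bias analysis for differential exposure misclassification draws sensitivity and specificity from prior distributions, back-corrects the observed 2x2 table, and summarizes the distribution of corrected odds ratios after adding conventional random error. Whether this procedure matches a formal Bayesian adjustment is exactly the question the article examines.

import numpy as np

rng = np.random.default_rng(5)
a, b = 45, 55      # cases: exposed, unexposed (observed, possibly misclassified)
c, d = 30, 70      # controls: exposed, unexposed

corrected_or = []
for _ in range(20000):
    se_ca, sp_ca = rng.uniform(0.75, 0.95), rng.uniform(0.90, 0.99)   # cases
    se_co, sp_co = rng.uniform(0.70, 0.90), rng.uniform(0.90, 0.99)   # controls
    # Back-correct the expected true exposed counts from the observed counts.
    A = (a - (a + b) * (1 - sp_ca)) / (se_ca + sp_ca - 1)
    C = (c - (c + d) * (1 - sp_co)) / (se_co + sp_co - 1)
    B, D = (a + b) - A, (c + d) - C
    if min(A, B, C, D) <= 0:
        continue                                    # discard impossible corrections
    log_or = np.log(A * D / (B * C))
    se_log = np.sqrt(1 / A + 1 / B + 1 / C + 1 / D)
    corrected_or.append(np.exp(rng.normal(log_or, se_log)))   # add random error
print(np.median(corrected_or), np.percentile(corrected_or, [2.5, 97.5]))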

    1. BMC Med Inform Decis Mak. 2011 Dec 14;11(1):75. [Epub ahead of print]

Evaluation of an automated safety surveillance system using risk adjusted Sequential Probability Ratio Testing. Matheny ME, Normand SL, Gross TP, Marinac-Dabic D, Loyo-Berrios N, Vidi VD, Donnelly S, Resnic FS.

BACKGROUND: Automated adverse outcome surveillance tools and methods have potential utility in quality improvement and medical product surveillance activities. Their use for assessing hospital performance on the basis of patient outcomes has received little attention. We compared risk-adjusted sequential probability ratio testing (RA-SPRT) implemented in an automated tool to Massachusetts public reports of 30-day mortality after isolated coronary artery bypass graft surgery. METHODS: A total of 23,020 isolated adult coronary artery bypass surgery admissions performed in Massachusetts hospitals between January 1, 2002 and September 30, 2007 were retrospectively re-evaluated. The RA-SPRT method was implemented within an automated surveillance tool to identify hospital outliers in yearly increments. We used an overall type I error rate of 0.05, an overall type II error rate of 0.10, and a threshold that signaled if the odds of dying 30 days after surgery were at least twice that expected. Annual hospital outlier status, based on the state-reported classification, was considered the gold standard. An event was defined as at least one occurrence of a higher-than-expected hospital mortality rate during a given year. RESULTS: We examined a total of 83 hospital-year observations. The RA-SPRT method alerted 6 events among three hospitals for 30-day mortality compared with 5 events among two hospitals using the state public reports, yielding a sensitivity of 100% (5/5) and specificity of 98.8% (79/80). CONCLUSIONS: The automated RA-SPRT method performed well, detecting all of the true institutional outliers with a small false positive alerting rate. Such a system could provide confidential automated notification to local institutions in advance of public reporting, providing opportunities for earlier quality improvement interventions.
PMID: 22168892 [PubMed - as supplied by publisher]

Free Full Text: http://www.biomedcentral.com/content/pdf/1472-6947-11-75.pdf
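
Illustrative sketch (simulated admissions, not the study data): a risk-adjusted SPRT accumulates, over consecutive admissions, the log-likelihood ratio of the observed outcome under doubled odds of death versus the risk-model expectation, and signals when the running sum crosses the Wald boundary implied by the chosen error rates.

import numpy as np

def ra_sprt(outcomes, expected_p, odds_ratio=2.0, alpha=0.05, beta=0.10):
    """Cumulative log-likelihood ratio and the first index (if any) crossing the signal bound."""
    y = np.asarray(outcomes)
    p0 = np.asarray(expected_p)                     # risk-model expected mortality
    odds1 = odds_ratio * p0 / (1 - p0)
    p1 = odds1 / (1 + odds1)                        # mortality under the alternative
    llr = np.cumsum(y * np.log(p1 / p0) + (1 - y) * np.log((1 - p1) / (1 - p0)))
    upper = np.log((1 - beta) / alpha)              # Wald signal threshold
    crossed = np.flatnonzero(llr >= upper)
    return llr, (int(crossed[0]) if crossed.size else None)

# Hypothetical use: 500 admissions with expected mortality of 2-10%, and observed
# deaths generated at twice the expected odds, so a signal is likely.
rng = np.random.default_rng(6)
p0 = rng.uniform(0.02, 0.10, 500)
true_p = (2 * p0 / (1 - p0)) / (1 + 2 * p0 / (1 - p0))
y = rng.binomial(1, true_p)
llr, first_signal = ra_sprt(y, p0)
print("first signal at admission:", first_signal)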

    1. Stat Med. 2011 Dec 20;30(29):3447-60. doi: 10.1002/sim.4355.

Gaussian-based routines to impute categorical variables in health surveys. Yucel RM, He Y, Zaslavsky AM. Department of Epidemiology and Biostatistics, School of Public Health, University at Albany, SUNY, One University Place, Rensselaer, NY 12144-3456, USA. ryucel@albany.edu

The multivariate normal (MVN) distribution is arguably the most popular parametric model used in imputation and is available in most software packages (e.g., SAS PROC MI, R package norm). When it is applied to categorical variables as an approximation, practitioners often either apply simple rounding techniques for ordinal variables or create a distinct ‘missing’ category and/or disregard the nominal variable from the imputation phase. All of these practices can potentially lead to biased and/or uninterpretable inferences. In this work, we develop a new rounding methodology calibrated to preserve observed distributions to multiply impute missing categorical covariates. The major attractiveness of this method is its flexibility to use any ‘working’ imputation software, particularly those based on MVN, allowing practitioners to obtain usable imputations with small biases. A simulation study demonstrates the clear advantage of the proposed method in rounding ordinal variables and, in some scenarios, its plausibility in imputing nominal variables. We illustrate our methods on a widely used National Survey of Children with Special Health Care Needs where incomplete values on race posed a valid threat on inferences pertaining to disparities. Copyright © 2011 John Wiley & Sons, Ltd.
PMID: 21976366 [PubMed - in process]
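
The calibration idea in this paper can be pictured for the simplest case of a binary covariate imputed under a normal model: rather than rounding the continuous imputations at a fixed cutoff such as 0.5, the cutoff is chosen so that the imputed values reproduce the category proportion seen in the complete cases. The sketch below is a simplified reading of that idea, not the published algorithm; variable names and the toy data are illustrative.

```python
import numpy as np

def calibrated_round(observed, imputed_continuous):
    """Round continuous (MVN-based) imputations of a binary covariate so that
    the imputed values reproduce the category proportion seen in the observed
    data, instead of rounding at a fixed cutoff of 0.5."""
    p1 = np.mean(observed)                              # observed proportion of 1s
    cutoff = np.quantile(imputed_continuous, 1 - p1)    # so ~p1 of imputations round up
    return (np.asarray(imputed_continuous) > cutoff).astype(int)

rng = np.random.default_rng(0)
observed = rng.binomial(1, 0.3, size=200)            # complete cases of the covariate
imputed = rng.normal(loc=0.3, scale=0.5, size=50)    # continuous draws for missing cases
print(calibrated_round(observed, imputed).mean())    # close to the observed 0.3
```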

JANUARY THEME: Applications of MSMs for Dealing with Time-varying Exposure

    1. Int J Biostat. 2011;7(1):Article 34. Epub 2011 Sep 8.

Antihypertensive medication use and change in kidney function in elderly adults: a marginal structural model analysis. Odden MC, Tager IB, van der Laan MJ, Delaney JA, Peralta CA, Katz R, Sarnak MJ, Psaty BM, Shlipak MG. Oregon State University, USA.

BACKGROUND: The evidence for the effectiveness of antihypertensive medication use for slowing decline in kidney function in older persons is sparse. We addressed this research question by the application of novel methods in a marginal structural model.
METHODS: Change in kidney function was measured by two or more measures of cystatin C in 1,576 hypertensive participants in the Cardiovascular Health Study over 7 years of follow-up (1989-1997 in four U.S. communities). The exposure of interest was antihypertensive medication use. We used a novel estimator in a marginal structural model to account for bias due to confounding and informative censoring.
RESULTS: The mean annual decline in eGFR was 2.41 ± 4.91 mL/min/1.73 m(2). In unadjusted analysis, antihypertensive medication use was not associated with annual change in kidney function. Traditional multivariable regression did not substantially change these estimates. Based on a marginal structural analysis, persons on antihypertensives had slower declines in kidney function; participants had an estimated 0.88 (0.13, 1.63) ml/min/1.73 m(2) per year slower decline in eGFR compared with persons on no treatment. In a model that also accounted for bias due to informative censoring, the estimate for the treatment effect was 2.23 (-0.13, 4.59) ml/min/1.73 m(2) per year slower decline in eGFR.
CONCLUSION: In summary, estimates from a marginal structural model suggested that antihypertensive therapy was associated with preserved kidney function in hypertensive elderly adults. Confirmatory studies may provide power to determine the strength and validity of the findings.
PMCID: PMC3204667 [Available on 2012/9/8]
PMID: 22049266 [PubMed - in process]
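
The CHS analysis above uses a novel estimator; as general orientation to the theme of this cluster, the sketch below shows the more familiar construction of stabilized inverse-probability-of-treatment weights for a time-varying treatment, with a weighted outcome regression as the marginal structural model. Column names and model formulas are hypothetical placeholders, and a real analysis would also model censoring and use the full covariate history.

```python
import numpy as np
import statsmodels.api as sm
import statsmodels.formula.api as smf

def stabilized_weights(df):
    """Stabilized inverse-probability-of-treatment weights for a time-varying
    binary treatment. `df` is long format (one row per person-visit), sorted by
    visit within person; the columns (id, treated, age, sex, sbp, egfr_prev)
    are illustrative only."""
    # numerator: treatment probability given baseline covariates
    num = smf.glm("treated ~ age + sex", data=df,
                  family=sm.families.Binomial()).fit().predict(df)
    # denominator: treatment probability given baseline and time-varying covariates
    den = smf.glm("treated ~ age + sex + sbp + egfr_prev", data=df,
                  family=sm.families.Binomial()).fit().predict(df)
    p_num = np.where(df["treated"] == 1, num, 1 - num)
    p_den = np.where(df["treated"] == 1, den, 1 - den)
    # multiply the visit-specific ratios over each person's history
    return df.assign(ratio=p_num / p_den).groupby("id")["ratio"].cumprod()

# The MSM is then a weighted regression of the outcome on treatment, e.g.:
# df["sw"] = stabilized_weights(df)
# msm = smf.wls("egfr_change ~ treated", data=df, weights=df["sw"]).fit()
```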

    1. Epidemiology. 2011 Nov;22(6):877-8.

Hormonal contraception and HIV risk: evaluating marginal-structural-model assumptions. Chen PL, Cole SR, Morrison CS.

Letter to the editor

PMID: 21968782 [PubMed - in process]

    1. Pharmacoepidemiol Drug Saf. 2011 Jul 22. doi: 10.1002/pds.2175. [Epub ahead of print] Comparative effectiveness of individual angiotensin receptor blockers on risk of mortality in patients with chronic heart failure. Desai RJ, Ashton CM, Deswal A, Morgan RO, Mehta HB, Chen H, Aparasu RR, Johnson ML. Eshelman School of Pharmacy, University of North Carolina, Chapel Hill, NC, USA.

 

OBJECTIVE: There is little evidence on comparative effectiveness of individual angiotensin receptor blockers (ARBs) in patients with chronic heart failure (CHF). This study compared four ARBs in reducing risk of mortality in clinical practice. METHODS: A retrospective analysis was conducted on a national sample of patients diagnosed with CHF from 1 October 1996 to 30 September 2002 identified from Veterans Affairs electronic medical records, with supplemental clinical data obtained from chart review. After excluding patients with exposure to ARBs within the previous 6 months, four treatment groups were defined based on initial use of candesartan, valsartan, losartan, and irbesartan between the index date (1 October 2000) and the study end date (30 September 2002). Time to death was measured concurrently during that period. A marginal structural model controlled for sociodemographic factors, comorbidities, comedications, disease severity (left ventricular ejection fraction), and potential time-varying confounding affected by previous treatment (hospitalization). Propensity scores derived from a multinomial logistic regression were used as inverse probability of treatment weights in a generalized estimating equation to estimate causal effects. RESULTS: Among the 1536 patients identified on ARB therapy, irbesartan was most frequently used (55.21%), followed by losartan (21.74%), candesartan (15.23%), and valsartan (7.81%). When compared with losartan, after adjusting for time-varying hospitalization in marginal structural model, candesartan (OR = 0.79, 95%CI = 0.42-1.50), irbesartan (OR = 1.17, 95%CI = 0.72-1.90), and valsartan (OR = 0.98, 95%CI = 0.45-2.14) were found to have similar effectiveness in reducing mortality in CHF patients. CONCLUSION: Effectiveness of ARBs in reducing mortality is similar in patients with CHF in everyday clinical practice. Copyright © 2011 John Wiley & Sons, Ltd.
PMID: 21786364 [PubMed - as supplied by publisher]

    1. Clin Trials. 2011 Jun;8(3):277-87. doi: 10.1177/1740774511402526.

How to use marginal structural models in randomized trials to estimate the natural direct and indirect effects of therapies mediated by causal intermediates. Oba K, Sato T, Ogihara T, Saruta T, Nakao K. Translational Research and Clinical Trial Center, Hokkaido University Hospital, Hokkaido University, Japan. k.oba@huhp.hokudai.ac.jp

Erratum in
Clin Trials. 2011;8(5):680.

BACKGROUND: Although intention-to-treat analysis is a standard approach, supplemental analyses are often required to evaluate the biological relationship among interventions, intermediates, and outcomes. Therefore, we need to evaluate whether the effect of an intervention on a particular outcome is mediated by a hypothesized intermediate variable.
PURPOSE: To evaluate the size of the direct effect in the total effect, we applied the marginal structural model to estimate the average natural direct and indirect effects in a large-scale randomized controlled trial (RCT).
METHODS: The average natural direct effect is defined as the difference in the probability of a counterfactual outcome between the experimental and control arms, with the intermediate set to what it would have been, had the intervention been a control treatment. We considered two marginal structural models to estimate the average natural direct and indirect effects introduced by VanderWeele (Epidemiology 2009) and applied them in a large-scale RCT – the Candesartan Antihypertensive Survival Evaluation in Japan (CASE-J trial) – that compared angiotensin receptor blockers and calcium-channel blockers in high-risk hypertensive patients.
RESULTS: There were no strong blood pressure-independent or dependent effects; however, a systolic blood pressure reduction of about 1.9  mmHg suppressed all events. Compared to the blood pressure-independent effects of calcium channel blockers, those of angiotensin receptor blockers contributed positively to cardiovascular and cardiac events, but negatively to cerebrovascular events.
LIMITATIONS: There is a particular condition for estimating the average natural direct effect. It is impossible to check whether this condition is satisfied with the available data.
CONCLUSION: We estimated the average natural direct and indirect effects through the achieved systolic blood pressure in the CASE-J trial. This first application of estimating the average natural effects in an RCT can be useful for obtaining an in-depth understanding of the results and further development of similar interventions.
PMID: 21730076 [PubMed - indexed for MEDLINE]
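
For reference, the counterfactual quantities estimated in this trial application are usually written as follows, where Y(a, m) denotes the outcome under treatment a with the mediator (here, achieved systolic blood pressure) set to m, and M(a) denotes the mediator value under treatment a:

```latex
\[
\text{NDE} = E\big[Y(1, M(0))\big] - E\big[Y(0, M(0))\big], \qquad
\text{NIE} = E\big[Y(1, M(1))\big] - E\big[Y(1, M(0))\big],
\]
\[
\text{Total effect} = E\big[Y(1, M(1))\big] - E\big[Y(0, M(0))\big] = \text{NDE} + \text{NIE}.
\]
```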

    1. J Consult Clin Psychol. 2011 Apr;79(2):225-35. A marginal structural model analysis for loneliness: implications for intervention trials and clinical practice. VanderWeele TJ, Hawkley LC, Thisted RA, Cacioppo JT. Harvard University, Department of Epidemiology, Harvard School of Public Health, 677 Huntington Avenue, Boston, MA 02115, USA. tvanderw@hsph.harvard.edu

 

OBJECTIVE: Clinical scientists, policymakers, and individuals must make decisions concerning effective interventions that address health-related issues. We use longitudinal data on loneliness and depressive symptoms and a new class of causal models to illustrate how empirical evidence can be used to inform intervention trial design and clinical practice.
METHOD: Data were obtained from a population-based study of non-Hispanic Caucasians, African Americans, and Latino Americans (N = 229) born between 1935 and 1952. Loneliness and depressive symptoms were measured with the UCLA Loneliness Scale-Revised and Center for Epidemiologic Studies Depression Scale, respectively. Marginal structural causal models were employed to evaluate the extent to which depressive symptoms depend not only on loneliness measured at a single point in time (as in prior studies of the effect of loneliness) but also on an individual’s entire loneliness history.
RESULTS: Our results indicate that if interventions to reduce loneliness by 1 standard deviation were made 1 and 2 years prior to assessing depressive symptoms, both would have an effect; together, they would result in an average reduction in depressive symptoms of 0.33 standard deviations, 95% CI [0.21, 0.44], p < .0001.
CONCLUSIONS: The magnitude and persistence of these effects suggest that greater effort should be devoted to developing practical interventions to alleviate loneliness and that doing so could be useful in the treatment and prevention of depressive symptoms. In light of the persistence of the effects of loneliness, our results also suggest that, in the evaluation of interventions on loneliness, it may be important to allow for a considerable follow-up period in assessing outcomes.
(c) 2011 APA, all rights reserved.
PMCID: PMC3079447 [Available on 2012/4/1]
PMID: 21443322 [PubMed - indexed for MEDLINE]

    1. J Clin Psychopharmacol. 2011 Apr;31(2):226-30.

Differential 3-year effects of first- versus second-generation antipsychotics on subjective well-being in schizophrenia using marginal structural models. Lambert M, Schimmelmann BG, Schacht A, Suarez D, Haro JM, Novick D, Wagner T, Wehmeier PM, Huber CG, Hundemer HP, Dittmann RW, Naber D. Psychosis Centre, Department of Psychiatry and Psychotherapy, Centre for Psychosocial Medicine, University Medical Centre Hamburg-Eppendorf, Germany.

OBJECTIVE: This study examined the differential effects of first- (FGAs) versus second-generation antipsychotics (SGAs) on subjective well-being in patients with schizophrenia.
METHOD: Data were collected in an observational 3-year follow-up study of 2224 patients with schizophrenia. Subjective well-being was assessed with the Subjective Well-being under Neuroleptic Treatment Scale (SWN-K). Differential effects of FGAs versus SGAs were analyzed using marginal structural models in those patients taking antipsychotic monotherapy.
RESULTS: The marginal structural model, which analyzed the differential effect on the SWN-K total score, revealed that patients on SGAs had significantly higher SWN-K total scores, starting at 6 months (3.02 points; P = 0.0061, d = 0.20) and remaining significant thereafter (end point: 5.35 points; P = 0.0074, d = 0.36).
CONCLUSIONS: Results of this large observational study are consistent with a small but clinically relevant superiority of SGAs over FGAs in subjective well-being extending previous positive findings of differential effects on quality of life.
PMID: 21346606 [PubMed - indexed for MEDLINE]

    1. Arch Intern Med. 2011 Jan 24;171(2):110-8. Epub 2010 Sep 27.

Similar outcomes with hemodialysis and peritoneal dialysis in patients with end-stage renal disease. Mehrotra R, Chiu YW, Kalantar-Zadeh K, Bargman J, Vonesh E. Los Angeles Biomedical Research Institute at Harbor-UCLA Medical Center, Torrance, CA 90502, USA. rmehrotra@labiomed.org

Comment in
Arch Intern Med. 2011 Jan 24;171(2):107-9.

BACKGROUND: The annual payer costs for patients treated with peritoneal dialysis (PD) are lower than with hemodialysis (HD), but in 2007, only 7% of dialysis patients in the United States were treated with PD. Since 1996, there has been no change in the first-year mortality of HD patients, but both short- and long-term outcomes of PD patients have improved.
METHODS: Data from the US Renal Data System were examined for secular trends in survival among patients treated with HD and PD on day 90 of end-stage renal disease (HD, 620 020 patients; PD, 64 406 patients) in three 3-year cohorts (1996-1998, 1999-2001, and 2002-2004) for up to 5 years of follow-up using a nonproportional hazards marginal structural model with inverse probability of treatment and censoring weighting.
RESULTS: There was a progressive attenuation in the higher risk for death seen in patients treated with PD in earlier cohorts; for the 2002-2004 cohort, there was no significant difference in the risk of death for HD and PD patients through 5 years of follow-up. The median life expectancy of HD and PD patients was 38.4 and 36.6 months, respectively. Analyses in 8 subgroups based on age (<65 and ≥65 years), diabetic status, and baseline comorbidity (none and ≥1) showed greater improvement in survival among patients treated with PD relative to HD at all follow-up periods.
CONCLUSION: In the most recent cohorts, patients who began treatment with HD or PD have similar outcomes.
PMID: 20876398 [PubMed - indexed for MEDLINE]

    1. Epidemiology. 2010 Jul;21(4):528-39.

Estimating absolute risks in the presence of nonadherence: an application to a follow-up study with baseline randomization. Toh S, Hernández-Díaz S, Logan R, Robins JM, Hernán MA. Department of Epidemiology, Harvard School of Public Health, Boston, MA 02215

The intention-to-treat (ITT) analysis provides a valid test of the null hypothesis and naturally results in both absolute and relative measures of risk. However, this analytic approach may miss the occurrence of serious adverse effects that would have been detected under full adherence to the assigned treatment. Inverse probability weighting of marginal structural models has been used to adjust for nonadherence, but most studies have provided only relative measures of risk. In this study, we used inverse probability weighting to estimate both absolute and relative measures of risk of invasive breast cancer under full adherence to the assigned treatment in the Women’s Health Initiative estrogen-plus-progestin trial. In contrast to an ITT hazard ratio (HR) of 1.25 (95% confidence interval [CI] = 1.01 to 1.54), the HR for 8-year continuous estrogen-plus-progestin use versus no use was 1.68 (1.24 to 2.28). The estimated risk difference (cases/100 women) at year 8 was 0.83 (-0.03 to 1.69) in the ITT analysis, compared with 1.44 (0.52 to 2.37) in the adherence-adjusted analysis. Results were robust across various dose-response models. We also compared the dynamic treatment regimen “take hormone therapy until certain adverse events become apparent, then stop taking hormone therapy” with no use (HR = 1.64; 95% CI = 1.24 to 2.18). The methods described here are also applicable to observational studies with time-varying treatments.
PMID: 20526200 [PubMed - indexed for MEDLINE]
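
The absolute risks reported above come from inverse-probability-weighted survival curves. A bare-bones version of the calculation is sketched below: follow-up is censored at non-adherence, each record carries an inverse-probability-of-censoring weight, and a weighted Kaplan-Meier estimator gives the absolute risk by a given horizon in each arm. This is a schematic of the general approach, not the authors' code; argument names are illustrative.

```python
import numpy as np

def weighted_cumulative_risk(time, event, weight, horizon):
    """Weighted Kaplan-Meier estimate of the absolute risk of the outcome by
    `horizon`. Follow-up (`time`, `event`) is censored at non-adherence and
    `weight` holds inverse-probability-of-censoring weights, so the estimate
    approximates the risk that would have been observed under full adherence."""
    time, event, weight = map(np.asarray, (time, event, weight))
    surv = 1.0
    for t in np.unique(time[event == 1]):       # distinct event times, ascending
        if t > horizon:
            break
        at_risk = weight[time >= t].sum()
        failures = weight[(time == t) & (event == 1)].sum()
        surv *= 1.0 - failures / at_risk
    return 1.0 - surv

# risk difference at 8 years, treated arm minus placebo arm (arrays are per-arm):
# rd = (weighted_cumulative_risk(t1, e1, w1, 8)
#       - weighted_cumulative_risk(t0, e0, w0, 8))
```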

    1. Lifetime Data Anal. 2010 Jan;16(1):71-84. Epub 2009 Nov 6.

Relation between three classes of structural models for the effect of a time-varying exposure on survival. Young JG, Hernán MA, Picciotto S, Robins JM. Department of Epidemiology, Harvard School of Public Health, 677 Huntington Avenue, Kresge Bldg Suite 820, Boston, MA 02115, USA. jyoung@hsph.harvard.edu

Standard methods for estimating the effect of a time-varying exposure on survival may be biased in the presence of time-dependent confounders themselves affected by prior exposure. This problem can be overcome by inverse probability weighted estimation of Marginal Structural Cox Models (Cox MSM), g-estimation of Structural Nested Accelerated Failure Time Models (SNAFTM) and g-estimation of Structural Nested Cumulative Failure Time Models (SNCFTM). In this paper, we describe a data generation mechanism that approximately satisfies a Cox MSM, an SNAFTM and an SNCFTM. Besides providing a procedure for data simulation, our formal description of a data generation mechanism that satisfies all three models allows one to assess the relative advantages and disadvantages of each modeling approach. A simulation study is also presented to compare effect estimates across the three models.
PMID: 19894116 [PubMed - indexed for MEDLINE]

    1. J Rheumatol. 2009 Mar;36(3):560-4. Epub 2009 Feb 4.

Prednisone, lupus activity, and permanent organ damage. Thamer M, Hernán MA, Zhang Y, Cotter D, Petri M. Medical Technology and Practice Patterns Institute, Bethesda, MD 20814

OBJECTIVE: To estimate the effect of corticosteroids (prednisone dose) on permanent organ damage among persons with systemic lupus erythematosus (SLE). METHODS: We identified 525 patients with incident SLE in the Hopkins Lupus Cohort. At each visit, clinical activity indices, laboratory data, and treatment were recorded. The study population was followed from the month after the first visit until June 29, 2006, or attainment of irreversible organ damage, death, loss to follow-up, or receipt of pulse methylprednisolone therapy. We estimated the effect of cumulative average dose of prednisone on organ damage using a marginal structural model to adjust for time-dependent confounding by indication due to SLE disease activity.
RESULTS: Compared with non-prednisone use, the hazard ratio of organ damage for prednisone was 1.16 (95% CI 0.54, 2.50) for cumulative average doses > 0-180 mg/month, 1.50 (95% CI 0.58, 3.88) for > 180-360 mg/month, 1.64 (95% CI 0.58, 4.69) for > 360-540 mg/month, and 2.51 (95% CI 0.87, 7.27) for > 540 mg/month. In contrast, standard Cox regression models estimated higher hazard ratios at all dose levels.
CONCLUSION: Our results suggest that low doses of prednisone do not result in a substantially increased risk of irreversible organ damage.
PMID: 19208608 [PubMed - indexed for MEDLINE]

 

December 2011


CER Scan [Epub ahead of print]

  • Drug Saf. 2012 Jan;35(1):61-78. [Epub ahead of print]

Identifying Adverse Events of Vaccines Using a Bayesian Method of Medically Guided Information Sharing. Crooks CJ, Prieto-Merino D, Evans SJ. Division of Epidemiology and Public Health, University of Nottingham, Nottingham, UK.

Background: The detection of adverse events following immunization (AEFI) fundamentally depends on how these events are classified. Standard methods impose a choice between either grouping similar events together to gain power or splitting them into more specific definitions. We demonstrate a method of medically guided Bayesian information sharing that avoids grouping or splitting the data, and we further combine this with the standard epidemiological tools of stratification and multivariate regression. Objective: The aim of this study was to assess the ability of a Bayesian hierarchical model to identify gastrointestinal AEFI in children, and then combine this with testing for effect modification and adjustments for confounding. Study Design: Reporting odds ratios were calculated for each gastrointestinal AEFI and vaccine combination. After testing for effect modification, these were then re-estimated using multivariable logistic regression adjusting for age, sex, year and country of report. A medically guided hierarchy of AEFI terms was then derived to allow information sharing in a Bayesian model. Setting: All spontaneous reports of AEFI in children under 18 years of age in the WHO VigiBase™ (Uppsala Monitoring Centre, Uppsala, Sweden) before June 2010. Reports with missing age were included in the main analysis in a separate category and excluded in a subsequent sensitivity analysis. Exposures: The 15 most commonly prescribed childhood vaccinations, excluding influenza vaccines. Main Outcome Measures: All gastrointestinal AEFI coded by WHO Adverse Reaction Terminology. Results: A crude analysis identified 132 signals from 655 reported combinations of gastrointestinal AEFI. Adjusting for confounding by age, sex, year of report and country of report, where appropriate, reduced the number of signals identified to 88. The addition of a Bayesian hierarchical model identified four further signals and removed three. Effect modification by age and sex was identified for six vaccines for the outcomes of vomiting, nausea, diarrhoea and salivary gland enlargement.
Conclusion: This study demonstrated a sequence of methods for routinely analysing spontaneous report databases that was easily understandable and reproducible. The combination of classical and Bayesian methods in this study helps to focus the limited resources for hypothesis testing studies towards adverse events with the strongest support from the data.
PMID: 22136183 [PubMed - as supplied by publisher]
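
The "crude analysis" in this abstract rests on the reporting odds ratio, the basic disproportionality measure for spontaneous-report data; the Bayesian hierarchical model then shrinks these estimates by sharing information across medically related outcome terms (not shown here). A minimal version of the crude calculation, with hypothetical counts, is:

```python
import math

def reporting_odds_ratio(a, b, c, d):
    """Crude reporting odds ratio for one vaccine-event pair: a = reports of the
    event with the vaccine, b = other events with the vaccine, c = the event
    with all other vaccines, d = other events with other vaccines. Returns the
    ROR with an approximate 95% confidence interval."""
    ror = (a / b) / (c / d)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    ci = (math.exp(math.log(ror) - 1.96 * se), math.exp(math.log(ror) + 1.96 * se))
    return ror, ci

print(reporting_odds_ratio(20, 980, 200, 48800))   # hypothetical counts
```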

CER Scan [published within the last 30 days]

  • Am J Epidemiol. 2011 Dec 1;174(11):1213-22. Epub 2011 Oct 24.

Effects of adjusting for instrumental variables on bias and precision of effect estimates.
Myers JA, Rassen JA, Gagne JJ, Huybrechts KF, Schneeweiss S, Rothman KJ, Joffe MM, Glynn RJ.

Recent theoretical studies have shown that conditioning on an instrumental variable (IV), a variable that is associated with exposure but not associated with outcome except through exposure, can increase both bias and variance of exposure effect estimates. Although these findings have obvious implications in cases of known IVs, their meaning remains unclear in the more common scenario where investigators are uncertain whether a measured covariate meets the criteria for an IV or is instead a confounder. The authors present results from two simulation studies designed to provide insight into the problem of conditioning on potential IVs in routine epidemiologic practice. The simulations explored the effects of conditioning on IVs, near-IVs (predictors of exposure that are weakly associated with outcome), and confounders on the bias and variance of a binary exposure effect estimate. The results indicate that effect estimates which are conditional on a perfect IV or near-IV may have larger bias and variance than the unconditional estimate. However, in most scenarios considered, the increases in error due to conditioning were small compared with the total estimation error. In these cases, minimizing unmeasured confounding should be the priority when selecting variables for adjustment, even at the risk of conditioning on IVs.
PMID: 22025356 [PubMed - in process]

  • Am J Epidemiol. 2011 Dec 1;174(11):1223-7. Epub 2011 Oct 27.

Invited commentary: understanding bias amplification. Pearl J.

In choosing covariates for adjustment or inclusion in propensity score analysis, researchers must weigh the benefit of reducing confounding bias carried by those covariates against the risk of amplifying residual bias carried by unmeasured confounders. The latter is characteristic of covariates that act like instrumental variables-that is, variables that are more strongly associated with the exposure than with the outcome. In this issue of the Journal (Am J Epidemiol. 2011;174(11):1213-1222), Myers et al. compare the bias amplification of a near-instrumental variable with its bias-reducing potential and suggest that, in practice, the latter outweighs the former. The author of this commentary sheds broader light on this comparison by considering the cumulative effects of conditioning on multiple covariates and showing that bias amplification may build up at a faster rate than bias reduction. The author further derives a partial order on sets of covariates which reveals preference for conditioning on outcome-related, rather than exposure-related, confounders.
PMCID: PMC3224255 [Available on 2012/12/1] PMID: 22034488 [PubMed - in process]

  • Am J Epidemiol. 2011 Dec 1;174(11):1228-9. Epub 2011 Oct 24. Myers et al. Response to “understanding bias amplification”. Myers JA, Rassen JA, Gagne JJ, Huybrechts KF, Schneeweiss S, Rothman KJ, Glynn RJ.

 

Response to Invited Commentary
PMID: 22025355 [PubMed - in process]

  • Epidemiology. 2011 Nov;22(6):815-22.

Estimating bias from loss to follow-up in the Danish National Birth Cohort. Greene N, Greenland S, Olsen J, Nohr EA. Department of Epidemiology, School of Public Health, University of California

Loss to follow-up in cohort studies may result in biased association estimates. Of 61,895 women entering the Danish National Birth Cohort and completing the first data-collection phase, 37,178 (60%) opted to be in the 7-year follow-up. Using national registry data to obtain end point information on all members of the cohort, we estimated associations in the baseline and the 7-year follow-up participant populations for 5 exposure-outcome associations: (a) size at birth and childhood asthma, (b) assisted reproductive treatment and childhood hospitalizations, (c) prepregnancy body mass index and childhood infections, (d) alcohol drinking in early pregnancy and childhood developmental disorders, and (e) maternal smoking in pregnancy and childhood attention-deficit hyperactivity disorder (ADHD). We estimated follow-up bias in the odds or rate ratios by calculating relative ratios. For all but one of the above analyses, the bias appeared to be small, between -10% and +8%. For maternal smoking in pregnancy and childhood ADHD, we estimated a positive bias of approximately 33% (95% bootstrap limits of -30% and +152%). The presence and magnitude of bias due to loss to follow-up depended on the nature of the factors or outcomes examined, with the most pronounced contribution in this study coming from maternal smoking. Our methods and results may inform bias analyses in future pregnancy cohort studies.
PMID: 21918455 [PubMed - in process]

DECEMBER THEME: Methods for Addressing Missing Data in CER

    1. Stat Med. 2011 Dec 4. doi: 10.1002/sim.4413. [Epub ahead of print]

Diagnosing imputation models by applying target analyses to posterior replicates of completed data. He Y, Zaslavsky AM. Department of Health Care Policy, Harvard Medical School, Boston, MA, 02115, USA. he@hcp.med.harvard.edu.

Multiple imputation fills in missing data with posterior predictive draws from imputation models. To assess the adequacy of imputation models, we can compare completed data with their replicates simulated under the imputation model. We apply analyses of substantive interest to both datasets and use posterior predictive checks of the differences of these estimates to quantify the evidence of model inadequacy. We can further integrate out the imputed missing data and their replicates over the completed-data analyses to reduce variance in the comparison. In many cases, the checking procedure can be easily implemented using standard imputation software by treating re-imputations under the model as posterior predictive replicates. Thus, it can be applied for non-Bayesian imputation methods. We also sketch several strategies for applying the method in the context of practical imputation analyses. We illustrate the method using two real data applications and study its properties using a simulation. Copyright © 2011 John Wiley & Sons, Ltd.
PMID: 22139814 [PubMed - as supplied by publisher]
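
In outline, the proposed diagnostic applies the analysis of substantive interest both to each completed dataset and to a replicate of it re-imputed under the same model, then summarizes the differences as a posterior predictive check. The fragment below is a loose sketch of that comparison under simplifying assumptions (a scalar target estimate, one replicate per completed dataset); it is not the authors' procedure, and the helper names are hypothetical.

```python
import numpy as np

def imputation_ppc(completed_estimates, replicate_estimates):
    """Posterior predictive check of an imputation model: `completed_estimates`
    holds a target estimate (e.g., a regression coefficient) computed on each
    multiply imputed dataset; `replicate_estimates` holds the same estimate
    computed on data re-imputed ("replicated") under the same model. Systematic
    differences, summarized here by a two-sided posterior predictive p-value,
    flag model inadequacy for that target analysis."""
    diff = np.asarray(replicate_estimates) - np.asarray(completed_estimates)
    p_one_sided = np.mean(diff >= 0)
    return 2 * min(p_one_sided, 1 - p_one_sided), diff.mean()

# completed  = [target_analysis(d) for d in imputed_datasets]
# replicates = [target_analysis(reimpute(d)) for d in imputed_datasets]
# ppp, mean_diff = imputation_ppc(completed, replicates)
```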

    1. Stat Methods Med Res. 2011 Mar 23. [Epub ahead of print]

Using causal diagrams to guide analysis in missing data problems. Daniel RM, Kenward MG, Cousens SN, De Stavola BL. Faculty of Epidemiology and Population Health, London School of Hygiene and Tropical Medicine, London WC1E 7HT, UK.

Estimating causal effects from incomplete data requires additional and inherently untestable assumptions regarding the mechanism giving rise to the missing data. We show that using causal diagrams to represent these additional assumptions both complements and clarifies some of the central issues in missing data theory, such as Rubin’s classification of missingness mechanisms (as missing completely at random (MCAR), missing at random (MAR) or missing not at random (MNAR)) and the circumstances in which causal effects can be estimated without bias by analysing only the subjects with complete data. In doing so, we formally extend the back-door criterion of Pearl and others for use in incomplete data examples. These ideas are illustrated with an example drawn from an occupational cohort study of the effect of cosmic radiation on skin cancer incidence.
PMID: 21389091 [PubMed - as supplied by publisher]

    1. Stat Med. 2011 Mar 15;30(6):627-41. doi: 10.1002/sim.4124. Epub 2010 Dec 28.

Estimating propensity scores with missing covariate data using general location mixture models. Mitra R, Reiter JP. School of Mathematics, University of Southampton, Southampton, SO17 1BJ, U.K. R.Mitra@soton.ac.uk

In many observational studies, analysts estimate causal effects using propensity scores, e.g. by matching, sub-classifying, or inverse probability weighting based on the scores. Estimation of propensity scores is complicated when some values of the covariates are missing. Analysts can use multiple imputation to create completed data sets from which propensity scores can be estimated. We propose a general location mixture model for imputations that assumes that the control units are a latent mixture of (i) units whose covariates are drawn from the same distributions as the treated units’ covariates and (ii) units whose covariates are drawn from different distributions. This formulation reduces the influence of control units outside the treated units’ region of the covariate space on the estimation of parameters in the imputation model, which can result in more plausible imputations. In turn, this can result in more reliable estimates of propensity scores and better balance in the true covariate distributions when matching or sub-classifying. We illustrate the benefits of the latent class modeling approach with simulations and with an observational study of the effect of breast feeding on children’s cognitive abilities. Copyright © 2010 John Wiley & Sons, Ltd.
PMID: 21337358 [PubMed - indexed for MEDLINE]

    1. Am J Epidemiol. 2010 Nov 1;172(9):1070-6. Epub 2010 Sep 14.

Multiple imputation for missing data via sequential regression trees. Burgette LF, Reiter JP. Department of Statistical Science, Duke University, Durham, North Carolina 27708.

Multiple imputation is particularly well suited to deal with missing data in large epidemiologic studies, because typically these studies support a wide range of analyses by many data users. Some of these analyses may involve complex modeling, including interactions and nonlinear relations. Identifying such relations and encoding them in imputation models, for example, in the conditional regressions for multiple imputation via chained equations, can be daunting tasks with large numbers of categorical and continuous variables. The authors present a nonparametric approach for implementing multiple imputation via chained equations by using sequential regression trees as the conditional models. This has the potential to capture complex relations with minimal tuning by the data imputer. Using simulations, the authors demonstrate that the method can result in more plausible imputations, and hence more reliable inferences, in complex settings than the naive application of standard sequential regression imputation techniques. They apply the approach to impute missing values in data on adverse birth outcomes with more than 100 clinical and survey variables. They evaluate the imputations using posterior predictive checks with several epidemiologic analyses of interest.
PMID: 20841346 [PubMed - indexed for MEDLINE]

Free Full Text: http://aje.oxfordjournals.org/content/172/9/1070.long
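
Readers who want to experiment with tree-based chained equations can approximate the idea with off-the-shelf tools, for example scikit-learn's IterativeImputer with a tree ensemble as the conditional model. This is only a loose analogue of the sequential regression trees described above: the trees here do not draw imputations from a posterior, so repeating the fit over random seeds approximates rather than implements proper multiple imputation.

```python
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer
from sklearn.ensemble import ExtraTreesRegressor

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 4))
X[rng.random(X.shape) < 0.2] = np.nan   # roughly 20% of values set missing at random

# Chained equations with a tree ensemble as the conditional model for each
# incomplete column; repeating over seeds yields several completed datasets,
# which would then be analysed separately and pooled.
imputations = [
    IterativeImputer(estimator=ExtraTreesRegressor(n_estimators=50, random_state=m),
                     max_iter=10, random_state=m).fit_transform(X)
    for m in range(5)
]
print(len(imputations), imputations[0].shape)
```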

    1. Artif Intell Med. 2010 Oct;50(2):105-15. Epub 2010 Jul 16.

Missing data imputation using statistical and machine learning methods in a real breast cancer problem. Jerez JM, Molina I, García-Laencina PJ, Alba E, Ribelles N, Martín M, Franco L. Departamento de Lenguajes y Ciencias de la Computación, Universidad de Málaga, E.T.S.I. Informática, Campus de Teatinos s/n, 29071 Málaga, Spain. jja@lcc.uma.es

OBJECTIVES: Missing data imputation is an important task in cases where it is crucial to use all available data and not discard records with missing values. This work evaluates the performance of several statistical and machine learning imputation methods that were used to predict recurrence in patients in an extensive real breast cancer data set.
MATERIALS AND METHODS: Imputation methods based on statistical techniques, e.g., mean, hot-deck and multiple imputation, and machine learning techniques, e.g., multi-layer perceptron (MLP), self-organisation maps (SOM) and k-nearest neighbour (KNN), were applied to data collected through the “El Álamo-I” project, and the results were then compared to those obtained from the listwise deletion
(LD) imputation method. The database includes demographic, therapeutic and recurrence-survival information from 3679 women with operable invasive breast cancer diagnosed in 32 different hospitals belonging to the Spanish Breast Cancer Research Group (GEICAM). The accuracies of predictions on early cancer relapse were measured using artificial neural networks (ANNs), in which different ANNs were estimated using the data sets with imputed missing values.
RESULTS: The imputation methods based on machine learning algorithms outperformed imputation statistical methods in the prediction of patient outcome. Friedman’s test revealed a significant difference (p=0.0091) in the observed area under the ROC curve (AUC) values, and the pairwise comparison test showed that the AUCs for MLP, KNN and SOM were significantly higher (p=0.0053, p=0.0048 and p=0.0071, respectively) than the AUC from the LD-based prognosis model.
CONCLUSION: The methods based on machine learning techniques were the most suited for the imputation of missing values and led to a significant enhancement of prognosis accuracy compared to imputation methods based on statistical procedures.
Copyright © 2010 Elsevier B.V. All rights reserved.
PMID: 20638252 [PubMed - indexed for MEDLINE]

    1. J Clin Epidemiol. 2010 Jul;63(7):728-36. Epub 2010 Mar 25.

Unpredictable bias when using the missing indicator method or complete case analysis for missing confounder values: an empirical example. Knol MJ, Janssen KJ, Donders AR, Egberts AC, Heerdink ER, Grobbee DE, Moons KG, Geerlings MI. Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Str. 6.131, PO Box 85500, 3508 GA Utrecht, The Netherlands. m.j.knol@umcutrecht.nl

OBJECTIVE: Missing indicator method (MIM) and complete case analysis (CC) are frequently used to handle missing confounder data. Using empirical data, we demonstrated the degree and direction of bias in the effect estimate when using these methods compared with multiple imputation (MI).
STUDY DESIGN AND SETTING: From a cohort study, we selected an exposure (marital status), outcome (depression), and confounders (age, sex, and income). Missing values in “income” were created according to different patterns of missingness: missing values were created completely at random and depending on exposure and outcome values. Percentages of missing values ranged from 2.5% to 30%.
RESULTS: When missing values were completely random, MIM gave an overestimation of the odds ratio, whereas CC and MI gave unbiased results. MIM and CC gave under- or overestimations when missing values depended on observed values. Magnitude and direction of bias depended on how the missing values were related to exposure and outcome. Bias increased with increasing percentage of missing
values.
CONCLUSION: MIM should not be used in handling missing confounder data because it gives unpredictable bias of the odds ratio even with small percentages of missing values. CC can be used when missing values are completely random, but it gives loss of statistical power.
Copyright 2010 Elsevier Inc. All rights reserved.
PMID: 20346625 [PubMed - indexed for MEDLINE]

    1. Pharmacoepidemiol Drug Saf. 2010 Jun;19(6):618-26.

Issues in multiple imputation of missing data for large general practice clinical databases. Marston L, Carpenter JR, Walters KR, Morris RW, Nazareth I, Petersen I. Department of Primary Care and Population Health, University College London, Rowland Hill Street, London NW32PF

PURPOSE: Missing data are a substantial problem in clinical databases. This paper aims to examine patterns of missing data in a primary care database, compare this to nationally representative datasets and explore the use of multiple imputation (MI) for these data.
METHODS: The patterns and extent of missing health indicators in a UK primary care database (THIN) were quantified using 488 384 patients aged 16 or over in their first year after registration with a GP from 354 General Practices. MI models were developed and the resulting data compared to that from nationally representative datasets (14 142 participants aged 16 or over from the Health Survey for England 2006 (HSE) and 4 252 men from the British Regional Heart Study (BRHS)).
RESULTS: Between 22% (smoking) and 38% (height) of health indicator data were missing in newly registered patients, 2004-2006. Distributions of height, weight and blood pressure were comparable to HSE and BRHS, but alcohol and smoking were not. After MI the percentage of smokers and non-drinkers was higher in THIN than the comparison datasets, while the percentage of ex-smokers and heavy drinkers was lower. Height, weight and blood pressure remained similar to the comparison datasets.
CONCLUSIONS: Given available data, the results are consistent with smoking and alcohol data missing not at random whereas height, weight and blood pressure missing at random. Further research is required on suitable imputation methods for smoking and alcohol in such databases.
PMID: 20306452 [PubMed - indexed for MEDLINE]

    1. Circ Cardiovasc Qual Outcomes. 2010 Jan;3(1):98-105.

Missing data analysis using multiple imputation: getting to the heart of the matter. He Y. Department of Health Care Policy, Harvard Medical School

Missing data are a pervasive problem in health investigations. We describe some background of missing data analysis and criticize ad hoc methods that are prone to serious problems. We then focus on multiple imputation, in which missing cases are first filled in by several sets of plausible values to create multiple completed datasets, then standard complete-data procedures are applied to each completed dataset, and finally the multiple sets of results are combined to yield a single inference. We introduce the basic concepts and general methodology and provide some guidance for application. For illustration, we use a study assessing the effect of cardiovascular diseases on hospice discussion for late stage lung cancer patients.
PMCID: PMC2818781; PMID: 20123676 [PubMed - indexed for MEDLINE]

Free PDF: http://circoutcomes.ahajournals.org/content/3/1/98.full.pdf+html
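
The final combining step mentioned in this overview is usually done with Rubin's rules: the pooled estimate is the average of the completed-data estimates, and its variance adds the average within-imputation variance to an inflated between-imputation variance. A small sketch with made-up numbers:

```python
import numpy as np

def rubins_rules(estimates, variances):
    """Combine results from m multiply imputed datasets (Rubin's rules):
    pooled estimate = mean of the estimates;
    total variance  = mean within-imputation variance + (1 + 1/m) * between-imputation variance."""
    q = np.asarray(estimates)
    u = np.asarray(variances)          # squared standard errors from each analysis
    m = len(q)
    total_var = u.mean() + (1 + 1 / m) * q.var(ddof=1)
    return q.mean(), np.sqrt(total_var)

est, se = rubins_rules([1.20, 1.32, 1.25, 1.28, 1.22], [0.04, 0.05, 0.04, 0.05, 0.04])
print(round(est, 3), round(se, 3))
```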

    1. Am J Epidemiol. 2010 Mar 1;171(5):624-32. Epub 2010 Jan 27.

Multiple imputation for missing data: fully conditional specification versus multivariate normal imputation. Lee KJ, Carlin JB. Clinical Epidemiology and Biostatistics Unit, Murdoch Childrens Research Institute, Royal Children’s Hospital, Flemington Road, Parkville, Victoria

Statistical analysis in epidemiologic studies is often hindered by missing data, and multiple imputation is increasingly being used to handle this problem. In a simulation study, the authors compared 2 methods for imputation that are widely available in standard software: fully conditional specification (FCS) or “chained equations” and multivariate normal imputation (MVNI). The authors created data sets of 1,000 observations to simulate a cohort study, and missing data were induced under 3 missing-data mechanisms. Imputations were performed using FCS (Royston’s “ice”) and MVNI (Schafer’s NORM) in Stata (Stata Corporation, College Station, Texas), with transformations or prediction matching being used to manage nonnormality in the continuous variables. Inferences for a set of regression parameters were compared between these approaches and a complete-case analysis. As expected, both FCS and MVNI were generally less biased than complete-case analysis, and both produced similar results despite the presence of binary and ordinal variables that clearly did not follow a normal distribution. Ignoring
skewness in a continuous covariate led to large biases and poor coverage for the corresponding regression parameter under both approaches, although inferences for other parameters were largely unaffected. These results provide reassurance that similar results can be expected from FCS and MVNI in a standard regression analysis involving variously scaled variables.
PMID: 20106935 [PubMed - indexed for MEDLINE]

Free Full Text: http://aje.oxfordjournals.org/content/171/5/624.long

    1. J Sch Psychol. 2010 Feb;48(1):5-37.

An introduction to modern missing data analyses. Baraldi AN, Enders CK. Arizona State University, USA. Amanda.Baraldi@asu.edu

A great deal of recent methodological research has focused on two modern missing data analysis methods: maximum likelihood and multiple imputation. These approaches are advantageous to traditional techniques (e.g. deletion and mean imputation techniques) because they require less stringent assumptions and mitigate the pitfalls of traditional techniques. This article explains the theoretical underpinnings of missing data analyses, gives an overview of traditional missing data techniques, and provides accessible descriptions of maximum likelihood and multiple imputation. In particular, this article focuses on maximum likelihood estimation and presents two analysis examples from the Longitudinal Study of American Youth data. One of these examples includes a description of the use of auxiliary variables. Finally, the paper illustrates ways that researchers can use intentional, or planned, missing data to enhance their research designs.
PMID: 20006986 [PubMed - indexed for MEDLINE]

    1. Int J Epidemiol. 2010 Feb;39(1):118-28. Epub 2009 Oct 25.

Modelling relative survival in the presence of incomplete data: a tutorial. Nur U, Shack LG, Rachet B, Carpenter JR, Coleman MP. Cancer Research UK Cancer Survival Group, London School of Hygiene and Tropical Medicine, London, UK. ula.nur@lshtm.ac.uk

BACKGROUND: Missing data frequently create problems in the analysis of population-based data sets, such as those collected by cancer registries. Restriction of analysis to records with complete data may yield inferences that are substantially different from those that would have been obtained had no data been missing. ‘Naive’ methods for handling missing data, such as restriction of the analysis to complete records or creation of a ‘missing’ category, have drawbacks that can invalidate the conclusions from the analysis. We offer a tutorial on modern methods for handling missing data in relative survival analysis.
METHODS: We estimated relative survival for 29 563 colorectal cancer patients who were diagnosed between 1997 and 2004 and registered in the North West Cancer Intelligence Service. The method of multiple imputation (MI) was applied to account for the common example of incomplete stage at diagnosis, under the missing at random (MAR) assumption. Multivariable regression with a generalized linear model and Poisson error structure was then used to estimate the excess hazard of death of the colorectal cancer patients, over and above the background mortality, adjusting for significant predictors of mortality.
RESULTS: Incomplete information on stage, morphology and grade meant that only 55% of the data could be included in the ‘complete-case’ analysis. All cases could be included after indicator method (IM) or MI method. Handling missing data by MI produced a significantly lower estimate of the excess mortality for stage, morphology and grade, with the largest reductions occurring for late-stage and high-grade tumours, when compared with the results of complete-case analysis.
CONCLUSION: In complete-case analysis, almost 50% of the information could not be included, and with the IM, all records with missing values for stage were combined into a single ‘missing’ category. We show that MI methods greatly improved the results by exploiting all the information in the incomplete records. This method also helped to ensure efficient inferences about survival were made from the multivariate regression analyses.
PMID: 19858106 [PubMed - indexed for MEDLINE]

Free Full Text: http://ije.oxfordjournals.org/cgi/pmidlookup?view=long&pmid=19858106

 

November 2011


CER Scan [Epub ahead of print]

    1. Clin Pharmacol Ther. 2011 Nov 2. doi: 10.1038/clpt.2011.235. [Epub ahead of print]

Assessing the Comparative Effectiveness of Newly Marketed Medications: Methodological Challenges and Implications for Drug Development. Schneeweiss S, Gagne JJ, Glynn RJ, Ruhl M, Rassen JA. Division of Pharmacoepidemiology and Pharmacoeconomics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA

Comparative-effectiveness research (CER) aims to produce actionable evidence regarding the effectiveness and safety of medical products and interventions as they are used outside of controlled research settings. Although CER evidence regarding medications is particularly needed shortly after market approval, key methodological challenges include (i) potential bias due to channeling of patients to the newly marketed medication because of various patient-, physician-, and system-related factors; (ii) rapid changes in the characteristics of the user population during the early phase of marketing; and (iii) lack of timely data and the often small number of users in the first few months of marketing. We propose a mix of approaches to generate comparative-effectiveness data in the early marketing period, including sequential cohort monitoring with secondary health-care data and propensity score (PS) balancing, as well as extended follow-up of phase III and phase IV trials, indirect comparisons of placebo-controlled trials, and modeling and simulation of virtual trials.
PMID: 22048230 [PubMed - as supplied by publisher]

    1. Ann Epidemiol. 2011 Oct 28. [Epub ahead of print]

Antidepressant Use and Cognitive Deficits in Older Men: Addressing Confounding by Indications with Different Methods. Han L, Kim N, Brandt C, Allore HG. Yale University Internal Medicine Program on Aging, New Haven, CT.

PURPOSE: Antidepressant use has been associated with cognitive impairment in older persons. We sought to examine whether this association might reflect an indication bias.

METHODS: A total of 544 community-dwelling hypertensive men aged ≥65 years completed the Hopkins Verbal Learning Test at baseline and 1 year. Antidepressant medications were ascertained by the use of medical records. Potential confounding by indications was examined by adjusting for depression-related diagnoses and severity of depression symptoms using multiple linear regression, a propensity score, and a structural equation model (SEM).

RESULTS: Before adjusting for the indications, a one unit cumulative exposure to antidepressants was associated with a -1.00 (95% confidence interval [CI], -1.94, -0.06) point lower HVLT score. After adjusting for the indications using multiple linear regression or a propensity score, the association diminished to -0.48 (95% CI, -0.62, 1.58) and -0.58 (95% CI, -0.60, 1.58), respectively. The most clinically interpretable empirical SEM with adequate fit involves both direct and indirect paths of the two indications. Depression-related diagnoses and depression symptoms significantly predict antidepressant use (p < .05). Their total standardized path coefficients on Hopkins Verbal Learning Test score were twice as large (0.073) or as large (0.034) as that of antidepressant use (0.035).

CONCLUSION: The apparent association between antidepressant use and memory deficit in older persons may be confounded by indications. SEM offers a heuristic empirical method for examining confounding by indications but not quantitatively superior bias reduction compared with conventional methods.
PMID: 22037381 [PubMed - as supplied by publisher]

    1. Stat Methods Med Res. 2011 Oct 19. [Epub ahead of print]

Observational data for comparative effectiveness research: An emulation of randomised trials of statins and primary prevention of coronary heart disease. Danaei G, García Rodríguez LA, Cantero OF, Logan R, Hernán MA. Department of Epidemiology, Harvard School of Public Health, Boston, MA.

This article reviews methods for comparative effectiveness research using observational data. The basic idea is using an observational study to emulate a hypothetical randomised trial by comparing initiators versus non-initiators of treatment. After adjustment for measured baseline confounders, one can then conduct the observational analogue of an intention-to-treat analysis. We also explain two approaches to conduct the analogues of per-protocol and as-treated analyses after further adjusting for measured time-varying confounding and selection bias using inverse-probability weighting. As an example, we implemented these methods to estimate the effect of statins for primary prevention of coronary heart disease (CHD) using data from electronic medical records in the UK. Despite strong confounding by indication, our approach detected a potential benefit of statin therapy. The analogue of the intention-to-treat hazard ratio (HR) of CHD was 0.89 (0.73, 1.09) for statin initiators versus non-initiators. The HR of CHD was 0.84 (0.54, 1.30) in the per-protocol analysis and 0.79 (0.41, 1.41) in the as-treated analysis for 2 years of use versus no use. In contrast, a conventional comparison of current users versus never users of statin therapy resulted in a HR of 1.31 (1.04, 1.66). We provide a flexible and annotated SAS program to implement the proposed analyses.
PMID: 22016461 [PubMed - as supplied by publisher]

    1. Clin Trials. 2011 Oct 12. [Epub ahead of print]

Challenges in the design and implementation of the Multicenter Uveitis Steroid Treatment (MUST) Trial – lessons for comparative effectiveness trials. Holbrook JT, Kempen JH, Prusakowski NA, Altaweel MM, Jabs DA. Center for Clinical Trials, Department of Epidemiology, Johns Hopkins University Bloomberg School of Public Health, Baltimore, MD, USA.

BACKGROUND: Randomized clinical trials (RCTs) are an important component of comparative effectiveness (CE) research because they are the optimal design for head-to-head comparisons of different treatment options.

PURPOSE: To describe decisions made in the design of the Multicenter Uveitis Steroid Treatment (MUST) Trial to ensure that the results would be widely generalizable.

METHODS: Review of design and implementation decisions and their rationale for the trial.

RESULTS: The MUST Trial is a multicenter randomized controlled CE trial evaluating a novel local therapy (intraocular fluocinolone acetonide implant) versus the systemic therapy standard of care for noninfectious uveitis. Decisions made in protocol design in order to broaden enrollment included allowing patients with very poor vision and media opacity to enroll and including clinical sites outside the United States. The treatment protocol was designed to follow standard care. The primary outcome, visual acuity, is important to patients and can be evaluated in all eyes with uveitis. Other outcomes include patient-reported visual function, quality of life, and disease and treatment related complications.

LIMITATIONS: The trial population is too small for subgroup analyses that are of interest and the trial is being conducted at tertiary medical centers.

CONCLUSION: CE trials require greater emphasis on generalizability than many RCTs but otherwise face similar challenges for design choices as any RCT. The increase in heterogeneity in patients and treatment required to ensure generalizability can be balanced with a rigorous approach to implementation, outcome assessment, and statistical design. This approach requires significant resources that may limit implementation in many RCTs, especially in clinical practice settings. Clinical Trials 2011; XX: 1-8. http://ctj.sagepub.com.
PMID: 21994128 [PubMed - as supplied by publisher]

    1. Stat Methods Med Res. 2011 Oct 3. [Epub ahead of print]

Assessing the sensitivity of methods for estimating principal causal effects. Stuart EA, Jo B. Departments of Mental Health and Biostatistics, Johns Hopkins Bloomberg School of Public Health, 624 N Broadway, 8th Floor, Baltimore, MD, USA.

The framework of principal stratification provides a way to think about treatment effects conditional on post-randomization variables, such as level of compliance. In particular, the complier average causal effect (CACE) – the effect of the treatment for those individuals who would comply with their treatment assignment under either treatment condition – is often of substantive interest. However, estimation of the CACE is not always straightforward, with a variety of estimation procedures and underlying assumptions, but little advice to help researchers select between methods. In this article, we discuss and examine two methods that rely on very different assumptions to estimate the CACE: a maximum likelihood (‘joint’) method that assumes the ‘exclusion restriction’ (ER), and a propensity score-based method that relies on ‘principal ignorability.’ We detail the assumptions underlying each approach, and assess each method’s sensitivity to both its own assumptions and those of the other method using both simulated data and a motivating example. We find that the ER-based joint approach appears somewhat less sensitive to its assumptions, and that the performance of both methods is significantly improved when there are strong predictors of compliance. Interestingly, we also find that each method performs particularly well when the assumptions of the other approach are violated. These results highlight the importance of carefully selecting an estimation procedure whose assumptions are likely to be satisfied in practice and of having strong predictors of principal stratum membership.
PMID: 21971481 [PubMed - as supplied by publisher]
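
For orientation, the identification result that the exclusion-restriction-based ('joint') approach builds on is the familiar instrumental-variable ratio: under randomisation, the exclusion restriction, and monotonicity (no defiers), with Z the randomised assignment and D the treatment actually received,

```latex
\[
\text{CACE} \;=\; \frac{E[Y \mid Z = 1] - E[Y \mid Z = 0]}{E[D \mid Z = 1] - E[D \mid Z = 0]},
\]
```

where the denominator equals the proportion of compliers.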

CER Scan [published within the last 30 days]

    1. Am J Epidemiol. 2011 Nov 15;174(10):1204-10. Epub 2011 Oct 7.

Comparing different strategies for timing of dialysis initiation through inverse probability weighting. Sjölander A, Nyrén O, Bellocco R, Evans M.

Dialysis has been used in the treatment of patients with end-stage renal disease since the 1960s. Recently, several large observational studies have been conducted to assess whether early initiation of dialysis prolongs survival, as compared with late initiation. However, these studies have used analytic approaches which are likely to suffer from either lead-time bias or immortal-time bias. In this paper, the authors demonstrate that recently developed methods in the causal inference literature can be used to avoid both types of bias and accurately estimate the ideal time for dialysis initiation from observational data. This is illustrated using data from a nationwide population-based cohort of patients with chronic kidney disease in Sweden (1996-2003).
PMID: 21984655 [PubMed - in process]

    1. BMJ. 2011 Oct 3;343:d5888. doi: 10.1136/bmj.d5888.

Estimating treatment effects for individual patients based on the results of randomised clinical trials. Dorresteijn JA, Visseren FL, Ridker PM, Wassink AM, Paynter NP, Steyerberg EW, van der Graaf Y, Cook NR. Department of Vascular Medicine, University Medical Center Utrecht, PO Box 85500, 3508 GA Utrecht, Netherlands.

OBJECTIVES: To predict treatment effects for individual patients based on data from randomised trials, taking rosuvastatin treatment in the primary prevention of cardiovascular disease as an example, and to evaluate the net benefit of making treatment decisions for individual patients based on a predicted absolute treatment effect.

SETTING: As an example, data were used from the Justification for the Use of Statins in Prevention (JUPITER) trial, a randomised controlled trial evaluating the effect of rosuvastatin 20 mg daily versus placebo on the occurrence of cardiovascular events (myocardial infarction, stroke, arterial revascularisation, admission to hospital for unstable angina, or death from cardiovascular causes).

POPULATION: 17,802 healthy men and women who had low density lipoprotein cholesterol levels of less than 3.4 mmol/L and high sensitivity C reactive protein levels of 2.0 mg/L or more.

METHODS: Data from the Justification for the Use of Statins in Prevention trial were used to predict rosuvastatin treatment effect for individual patients based on existing risk scores (Framingham and Reynolds) and on a newly developed prediction model. We compared the net benefit of prediction based rosuvastatin treatment (selective treatment of patients whose predicted treatment effect exceeds a decision threshold) with the net benefit of treating either everyone or no one.

RESULTS: The median predicted 10 year absolute risk reduction for cardiovascular events was 4.4% (interquartile range 2.6-7.0%) based on the Framingham risk score, 4.2% (2.5-7.1%) based on the Reynolds score, and 3.9% (2.5-6.1%) based on the newly developed model (optimal fit model). Prediction based treatment was associated with more net benefit than treating everyone or no one, provided that the decision threshold was between 2% and 7%, and thus that the number willing to treat (NWT) to prevent one cardiovascular event over 10 years was between 15 and 50.

CONCLUSIONS: Data from randomised trials can be used to predict treatment effect in terms of absolute risk reduction for individual patients, based on a newly developed model or, if available, existing risk scores. The value of such prediction of treatment effect for medical decision making is conditional on the NWT to prevent one outcome event. Trial registration number Clinicaltrials.gov NCT00239681.
PMCID: PMC3184644
PMID: 21968126 [PubMed - in process]

Free Full Text: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3184644/?tool=pubmed
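A minimal numerical sketch of the decision logic in this abstract (not the authors' prediction models): predicted absolute risk reduction under an assumed constant relative risk, and a comparison of treating selectively, treating everyone, and treating no one. All numbers below are illustrative.

```python
# Minimal sketch (illustrative numbers, not the JUPITER models): predicted
# absolute risk reduction (ARR) and simple treatment policies.
import numpy as np

def predicted_arr(baseline_risk, relative_risk):
    """ARR = p0 - p0 * RR, assuming a constant relative treatment effect."""
    return baseline_risk * (1.0 - relative_risk)

baseline_risk = np.array([0.03, 0.06, 0.10, 0.20])  # hypothetical 10-year risks
arr = predicted_arr(baseline_risk, relative_risk=0.56)

threshold = 0.04           # treat only if predicted ARR exceeds 4%, i.e. NWT of 25
treat = arr >= threshold

# expected events prevented per 100 persons under each policy (harms and costs
# are assumed to be encoded in the choice of threshold)
print("treat selectively:", 100 * np.where(treat, arr, 0).mean())
print("treat everyone:   ", 100 * arr.mean())
print("treat no one:      0.0")
```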

    1. BMC Med Res Methodol. 2011 Sep 21;11:132.

Benefits of ICU admission in critically ill patients: whether instrumental variable methods or propensity scores should be used. Pirracchio R, Sprung C, Payen D, Chevret S. Département de Biostatistique et Informatique Médicale, Unité INSERM UMR 717, Hôpital Saint Louis, APHP, Paris, 75010, France. romainpirracchio@yahoo.fr

BACKGROUND: The assessment of the causal effect of Intensive Care Unit (ICU) admission generally involves usual observational designs and thus requires controlling for confounding variables. Instrumental variable analysis is an econometric technique that allows causal inferences of the effectiveness of some treatments during situations to be made when a randomized trial has not been or cannot be conducted. This technique relies on the existence of one variable or “instrument” that is supposed to achieve similar observations with a different treatment for “arbitrary” reasons, thus inducing substantial variation in the treatment decision with no direct effect on the outcome. The objective of the study was to assess the benefit in terms of hospital mortality of ICU admission in a cohort of patients proposed for ICU admission (ELDICUS cohort).

METHODS: Using this cohort of 8,201 patients triaged for ICU (including 6,752 (82.3%) patients admitted), the benefit of ICU admission was evaluated using 3 different approaches: instrumental variables, standard regression and propensity score matched analyses. We further evaluated the results obtained using different instrumental variable methods that have been proposed for dichotomous outcomes.

RESULTS: The physician’s main specialization was found to be the best instrument. All instrumental variable models adequately reduced baseline imbalances, but failed to show a significant effect of ICU admission on hospital mortality, with confidence intervals far higher than those obtained in standard or propensity-based analyses.

CONCLUSIONS: Instrumental variable methods offer an appealing alternative to handle the selection bias related to nonrandomized designs, especially when the presence of significant unmeasured confounding is suspected. Applied to the ELDICUS database, this analysis failed to show any significant beneficial effect of ICU admission on hospital mortality. This result could be due to the lack of statistical power of these methods.
PMCID: PMC3185268
PMID: 21936926 [PubMed - in process]

Free Full Text: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3185268/?tool=pubmed
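To make the contrast between instrumental variable and conventional analyses concrete, here is a minimal Wald-type IV sketch for a binary instrument and binary treatment; it is not the authors' analysis, and the column names admitted, died, and instrument are hypothetical.

```python
# Minimal sketch (not the ELDICUS analysis): risk-difference IV estimate with a
# binary instrument, i.e. ITT effect on the outcome / ITT effect on treatment.
import pandas as pd

def wald_iv_estimate(df, outcome, treatment, instrument):
    g = df.groupby(instrument)[[outcome, treatment]].mean()   # instrument coded 0/1
    itt_y = g.loc[1, outcome] - g.loc[0, outcome]
    itt_a = g.loc[1, treatment] - g.loc[0, treatment]
    return itt_y / itt_a   # meaningful only if the instrument is strong and excludable

# df = pd.read_csv("icu_triage.csv")                            # hypothetical data
# naive = df.groupby("admitted")["died"].mean().diff().iloc[-1] # unadjusted risk difference
# iv = wald_iv_estimate(df, "died", "admitted", "instrument")
# A weak instrument (small denominator) is what inflates the IV confidence
# intervals relative to the standard and propensity score analyses.
```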

    1. Med Care. 2011 Oct;49(10):940-7.

The mortality risk score and the ADG score: two points-based scoring systems for the johns hopkins aggregated diagnosis groups to predict mortality in a general adult population cohort in Ontario, Canada. Austin PC, Walraven C. Institute for Clinical Evaluative Sciences, Toronto, Ontario, Canada. peter.austin@ices.on.ca

BACKGROUND: Logistic regression models that incorporated age, sex, and indicator variables for the Johns Hopkins’ Aggregated Diagnosis Groups (ADGs) categories have been shown to accurately predict all-cause mortality in adults.

OBJECTIVES: To develop 2 different point-scoring systems using the ADGs. The Mortality Risk Score (MRS) collapses age, sex, and the ADGs to a single summary score that predicts the annual risk of all-cause death in adults. The ADG Score derives weights for the individual ADG diagnosis groups.

RESEARCH DESIGN: Retrospective cohort constructed using population-based administrative data.

PARTICIPANTS: All 10,498,413 residents of Ontario, Canada, between the ages of 20 and 100 years who were alive on their birthday in 2007 participated in this study. Participants were randomly divided into derivation and validation samples.

MEASURES: Death within 1 year.

RESULTS: In the derivation cohort, the MRS ranged from -21 to 139 (median value 29, IQR 17 to 44). In the validation group, a logistic regression model with the MRS as the sole predictor significantly predicted the risk of 1-year mortality with a c-statistic of 0.917. A regression model with age, sex, and the ADG Score had similar performance. Both methods accurately predicted the risk of 1-year mortality across the 20 vigintiles of risk.

CONCLUSIONS: The MRS combined values for a person’s age, sex, and the Johns Hopkins ADGs to accurately predict 1-year mortality in adults. The ADG Score is a weighted score representing the presence or absence of the 32 ADG diagnosis groups. These scores will facilitate health services researchers conducting risk adjustment using administrative health care databases.
PMID: 21921849 [PubMed - in process]
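For readers unfamiliar with points-based systems, the sketch below shows one common way to turn logistic regression coefficients into rounded integer points. It illustrates the general idea only, not the authors' derivation of the MRS or ADG Score, and the column names are hypothetical.

```python
# Minimal sketch (not the authors' derivation): convert logistic regression
# coefficients into integer points by scaling against the smallest coefficient.
import statsmodels.api as sm

def points_from_logistic(df, outcome, predictors):
    X = sm.add_constant(df[predictors])
    coefs = sm.Logit(df[outcome], X).fit(disp=0).params.drop("const")
    points_per_unit = 1.0 / coefs.abs().min()        # smallest effect = 1 point
    return (coefs * points_per_unit).round().astype(int)

# pts = points_from_logistic(df, "died_1yr", ["age_decade", "male"] + adg_dummies)
# df["score"] = df[pts.index].mul(pts).sum(axis=1)   # summary risk score per person
# (adg_dummies would be a hypothetical list of 0/1 diagnosis-group indicators)
```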

    1. Stat Med. 2011 Oct 30;30(24):2947-58. doi: 10.1002/sim.4324. Epub 2011 Jul 29.

Analyzing direct and indirect effects of treatment using dynamic path analysis applied to data from the Swiss HIV Cohort Study. Røysland K, Gran JM, Ledergerber B, von Wyl V, Young J, Aalen OO. Department of Biostatistics, Institute of Basic Medical Sciences, University of Oslo, Norway. kjetil.roysland@medisin.uio.no

When applying survival analysis, such as Cox regression, to data from major clinical trials or other studies, often only baseline covariates are used. This is typically the case even if updated covariates are available throughout the observation period, which leaves large amounts of information unused. The main reason for this is that such time-dependent covariates often are internal to the disease process, as they are influenced by treatment, and therefore lead to confounded estimates of the treatment effect. There are, however, methods to exploit such covariate information in a useful way. We study the method of dynamic path analysis applied to data from the Swiss HIV Cohort Study. To adjust for time-dependent confounding between treatment and the outcome ‘AIDS or death’, we carried out the analysis on a sequence of mimicked randomized trials constructed from the original cohort data. To analyze these trials together, regular dynamic path analysis is extended to a composite analysis of weighted dynamic path models. Results using a simple path model, with one indirect effect mediated through current HIV-1 RNA level, show that most or all of the total effect goes through HIV-1 RNA for the first 4 years. A similar model, but with CD4 level as mediating variable, shows a weaker indirect effect, but the results are in the same direction. There are many reasons to be cautious when drawing conclusions from estimates of direct and indirect effects. Dynamic path analysis is however a useful tool to explore underlying processes, which are ignored in regular analyses.
PMID: 21800346 [PubMed - in process]

    1. Epidemiology. 2011 Sep;22(5):718-23.

A comparison of methods to estimate the hazard ratio under conditions of time-varying confounding and nonpositivity. Naimi AI, Cole SR, Westreich DJ, Richardson DB. Department of Epidemiology, Gillings School of Global Public Health, UNC-Chapel Hill, NC 27599, USA.

In occupational epidemiologic studies, the healthy worker survivor effect refers to a process that leads to bias in the estimates of an association between cumulative exposure and a health outcome. In these settings, work status acts both as an intermediate and confounding variable and may violate the positivity assumption (the presence of exposed and unexposed observations in all strata of the confounder). Using Monte Carlo simulation, we assessed the degree to which crude, work-status adjusted, and weighted (marginal structural) Cox proportional hazards models are biased in the presence of time-varying confounding and nonpositivity. We simulated the data representing time-varying occupational exposure, work status, and mortality. Bias, coverage, and root mean squared error (MSE) were calculated relative to the true marginal exposure effect in a range of scenarios. For a base-case scenario, using crude, adjusted, and weighted Cox models, respectively, the hazard ratio was biased downward 19%, 9%, and 6%; 95% confidence interval coverage was 48%, 85%, and 91%; and root MSE was 0.20, 0.13, and 0.11. Although marginal structural models were less biased in most scenarios studied, neither standard nor marginal structural Cox proportional hazards models fully resolve the bias encountered under conditions of time-varying confounding and nonpositivity.
PMCID: PMC3155387 [Available on 2012/9/1]
PMID: 21747286 [PubMed - in process]
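As a point-treatment simplification of the comparison described above (a full marginal structural model would use time-varying weights), the sketch below fits crude, covariate-adjusted, and inverse-probability-weighted Cox models; the column names exposed, employed, time, and event are hypothetical.

```python
# Minimal sketch (point-treatment simplification, not the authors' simulation):
# crude, covariate-adjusted, and IPT-weighted Cox models for one exposure.
import numpy as np
import statsmodels.api as sm
from lifelines import CoxPHFitter

def iptw(df, treatment, confounders):
    X = sm.add_constant(df[confounders])
    p = sm.Logit(df[treatment], X).fit(disp=0).predict(X)
    a = df[treatment]
    return np.where(a == 1, a.mean() / p, (1 - a.mean()) / (1 - p))

# df["w"] = iptw(df, "exposed", ["employed"])
# crude    = CoxPHFitter().fit(df[["time", "event", "exposed"]],
#                              duration_col="time", event_col="event")
# adjusted = CoxPHFitter().fit(df[["time", "event", "exposed", "employed"]],
#                              duration_col="time", event_col="event")
# weighted = CoxPHFitter().fit(df[["time", "event", "exposed", "w"]],
#                              duration_col="time", event_col="event",
#                              weights_col="w", robust=True)
```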

CER Scan [published within the last 90 days]

    1. Stat Biosci. 2011 Sep;3(1):6-27.

Estimating Decision-Relevant Comparative Effects Using Instrumental Variables. Basu A. Departments of Health Services and Pharmacy, University of Washington, Seattle, 1959 NE Pacific St, Box 357660, Seattle, WA 98195-7660, USA.

Instrumental variables methods (IV) are widely used in the health economics literature to adjust for hidden selection biases in observational studies when estimating treatment effects. Less attention has been paid in the applied literature to the proper use of IVs if treatment effects are heterogeneous across subjects. Such a heterogeneity in effects becomes an issue for IV estimators when individuals’ self-selected choices of treatments are correlated with expected idiosyncratic gains or losses from treatments. We present an overview of the challenges that arise with IV estimators in the presence of effect heterogeneity and self-selection and compare conventional IV analysis with alternative approaches that use IVs to directly address these challenges. Using a Medicare sample of clinically localized breast cancer patients, we study the impact of breast-conserving surgery and radiation with mastectomy on 3-year survival rates. Our results reveal the traditional IV results may have masked important heterogeneity in treatment effects. In the context of these results, we discuss the advantages and limitations of conventional and alternative IV methods in estimating mean treatment-effect parameters, the role of heterogeneity in comparative effectiveness research and the implications for diffusion of technology.
PMCID: PMC3193796 [Available on 2012/9/1]
PMID: 22010051 [PubMed]

 

October 2011

CER Scan [Epub ahead of print]

    1. Pharmacoepidemiol Drug Saf. 2011 Sep 23. doi: 10.1002/pds.2251. [Epub ahead of print]

Balance measures for propensity score methods: a clinical example on beta-agonist use and the risk of myocardial infarction. Groenwold RH, de Vries F, de Boer A, Pestman WR, Rutten FH, Hoes AW, Klungel OH. Department of Pharmacoepidemiology and Clinical Pharmacology, Utrecht Institute for Pharmaceutical Sciences, University of Utrecht, Utrecht, The Netherlands; Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht, The Netherlands. r.h.h.groenwold@umcutrecht.nl.

PURPOSE: Propensity score (PS) methods aim to control for confounding by balancing confounders between exposed and unexposed subjects with the same PS. PS balance measures have been compared in simulated data but only to a limited extent in empirical data. Our objective was to compare balance measures in clinical data and to assess the association between long-acting inhalation beta-agonist (LABA) use and myocardial infarction.

METHODS: We estimated the relationship between LABA use and myocardial infarction in a cohort of adults with a diagnosis of asthma or chronic obstructive pulmonary disorder from the Utrecht General Practitioner Research Network database. More than two thousand PS models, including information on the observed confounders age, sex, diabetes, cardiovascular disease and chronic obstructive pulmonary disorder status, were applied. The balance of these confounders was assessed using the standardised difference (SD), Kolmogorov-Smirnov (KS) distance and overlapping coefficient. Correlations between these balance measures were calculated. In addition, simulation studies were performed to assess the correlation between balance measures and bias.

RESULTS: LABA use was not related to myocardial infarction after conditioning on the PS (median hazard ratio (HR)=1.14, 95%CI=0.47-2.75). When using the different balance measures for selecting a PS model, similar associations were obtained. In our empirical data, SD and KS distance were highly correlated balance measures (r=0.92). In simulations, SD, KS distance and overlapping coefficient were similarly correlated to bias (e.g. r=0.55, r=0.52 and r=-0.57, respectively, when conditioning on the PS).

CONCLUSIONS: We recommend using the SD or the KS distance to quantify the balance of confounder distributions when applying PS methods. Copyright © 2011 John Wiley & Sons, Ltd.

PMID: 21953948 [PubMed - as supplied by publisher]
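The three balance measures discussed above are straightforward to compute; the sketch below does so for a single covariate in two synthetic groups (an illustration only, not the authors' code).

```python
# Minimal sketch (not the authors' code): standardized difference, Kolmogorov-
# Smirnov distance, and overlapping coefficient for one covariate.
import numpy as np
from scipy import stats

def standardized_difference(x1, x0):
    pooled_sd = np.sqrt((np.var(x1, ddof=1) + np.var(x0, ddof=1)) / 2)
    return (np.mean(x1) - np.mean(x0)) / pooled_sd

def ks_distance(x1, x0):
    return stats.ks_2samp(x1, x0).statistic          # max gap between empirical CDFs

def overlapping_coefficient(x1, x0, grid_size=512):
    # overlap of kernel density estimates; 1 means identical distributions
    grid = np.linspace(min(x1.min(), x0.min()), max(x1.max(), x0.max()), grid_size)
    d1, d0 = stats.gaussian_kde(x1)(grid), stats.gaussian_kde(x0)(grid)
    return np.trapz(np.minimum(d1, d0), grid)

rng = np.random.default_rng(0)
x1, x0 = rng.normal(0.2, 1, 500), rng.normal(0.0, 1, 500)   # synthetic exposed/unexposed
print(standardized_difference(x1, x0), ks_distance(x1, x0), overlapping_coefficient(x1, x0))
```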

    1. Clin Trials. 2011 Sep 23. [Epub ahead of print]

Beyond the intention-to-treat in comparative effectiveness research. Hernán MA, Hernández-Díaz S. Department of Epidemiology, Harvard School of Public Health, Boston, MA, USA; Harvard-MIT Division of Health Sciences and Technology, Boston, MA, USA.

BACKGROUND: The intention-to-treat comparison is the primary, if not the only, analytic approach of many randomized clinical trials.

PURPOSE: To review the shortcomings of intention-to-treat analyses, and of ‘as treated’ and ‘per protocol’ analyses as commonly implemented, with an emphasis on problems that are especially relevant for comparative effectiveness research.

METHODS and RESULTS: In placebo-controlled randomized clinical trials, intention-to-treat analyses underestimate the treatment effect and are therefore nonconservative for both safety trials and noninferiority trials. In randomized clinical trials with an active comparator, intention-to-treat estimates can overestimate a treatment’s effect in the presence of differential adherence. In either case, there is no guarantee that an intention-to-treat analysis estimates the clinical effectiveness of treatment. Inverse probability weighting, g-estimation, and instrumental variable estimation can reduce the bias introduced by nonadherence and loss to follow-up in ‘as treated’ and ‘per protocol’ analyses.

LIMITATIONS: These analyses require untestable assumptions, a dose-response model, and time-varying data on confounders and adherence.

CONCLUSIONS: We recommend that all randomized clinical trials with substantial lack of adherence or loss to follow-up be analyzed using different methods. These include an intention-to-treat analysis to estimate the effect of assigned treatment and ‘as treated’ and ‘per protocol’ analyses to estimate the effect of treatment after appropriate adjustment via inverse probability weighting or g-estimation.

PMID: 21948059 [PubMed - as supplied by publisher]

    1. Am J Epidemiol. 2011 Sep 20. [Epub ahead of print]

Comparison of Different Approaches to Confounding Adjustment in a Study on the Association of Antipsychotic Medication With Mortality in Older Nursing Home Patients. Huybrechts KF, Brookhart MA, Rothman KJ, Silliman RA, Gerhard T, Crystal S, Schneeweiss S.

Selective prescribing of conventional antipsychotic medication (APM) to frailer patients is thought to have led to overestimation of the association with mortality in pharmacoepidemiologic studies relying on claims data. The authors assessed the validity of different analytic techniques to address such confounding. The cohort included 82,012 persons initiating APM use after admission to a nursing home in 45 states with 2001-2005 Medicaid/Medicare data, linked to clinical data (Minimum Data Set) and institutional characteristics. The authors compared the association between APM class and 180-day mortality with multivariate outcome modeling, propensity score (PS) adjustment, and instrumental variables. The unadjusted risk difference (per 100 patients) of 10.6 (95% confidence interval (CI): 9.4, 11.7) comparing use of conventional medication with atypical APM was reduced to 7.8 (95% CI: 6.6, 9.0) and 7.0 (95% CI: 5.8, 8.2) after PS adjustment and high-dimensional PS (hdPS) adjustment, respectively. Results were similar in analyses limited to claims-based Medicaid/Medicare variables (risk difference = 8.2 for PS, 7.1 for hdPS). Instrumental-variable estimates were imprecise (risk difference = 8.8, 95% CI: -1.3, 19.0) because of the weak instrument. These results suggest that residual confounding has a relatively small impact on the effect estimate and that hdPS methods based on claims alone provide estimates at least as good as those from conventional analyses using claims enriched with clinical information.

PMID: 21934095 [PubMed - as supplied by publisher]

    1. Pharmacoepidemiol Drug Saf. 2011 Sep 15. doi: 10.1002/pds.2196. [Epub ahead of print]

Study design for a comprehensive assessment of biologic safety using multiple healthcare data systems. Herrinton LJ, Curtis JR, Chen L, Liu L, Delzell E, Lewis JD, Solomon DH, Griffin MR, Ouellet-Hellstom R, Beukelman T, Grijalva CG, Haynes K, Kuriya B, Lii J, Mitchel E, Patkar N, Rassen J, Winthrop KL, Nourjah P, Saag KG. Division of Research, Kaiser Permanente Northern California, Oakland, CA, USA. lisa.herrinton@kp.org.

BACKGROUND: Although biologic treatments have excellent efficacy for many autoimmune diseases, safety concerns persist. Understanding the absolute and comparative risks of adverse events in patient and disease subpopulations is critical for optimal prescribing of biologics.

PURPOSE: The Safety Assessment of Biologic Therapy collaborative was federally funded to provide robust estimates of rates and relative risks of adverse events among biologics users using data from national Medicaid and Medicare plus Medicaid dual-eligible programs, Tennessee Medicaid, Kaiser Permanente, and state pharmaceutical assistance programs supplementing New Jersey and Pennsylvania Medicare programs. This report describes the organizational structure of the collaborative and the study population and methods.

METHODS: This retrospective cohort study (1998-2007) examined risks of seven classes of adverse events in relation to biologic treatments prescribed for seven autoimmune diseases. Propensity scores were used to control for confounding and enabled pooling of individual-level data across data systems while concealing personal health information. Cox proportional hazard modeling was used to analyze study hypotheses.

RESULTS: The cohort was composed of 159,000 subjects with rheumatic diseases, 33,000 with psoriasis, and 46,000 with inflammatory bowel disease. This report summarizes demographic characteristics and drug exposures. Separate reports will provide outcome definitions and estimated hazard ratios for adverse events.

CONCLUSION: This comprehensive research will improve understanding of the safety of these treatments. The methods described may be useful to others planning similar evaluations. Copyright © 2011 John Wiley & Sons, Ltd.

PMID: 21919113 [PubMed - as supplied by publisher]

    1. Contemp Clin Trials. 2011 Sep 6. [Epub ahead of print]

Comparison of statistical approaches for physician-randomized trials with survival outcomes. Stedman MR, Lew RA, Losina E, Gagnon DR, Solomon DH, Brookhart MA. Orthopedics and Arthritis Center for Outcomes Research, Department of Orthopedics, Brigham and Women’s Hospital, Boston, MA, USA; Harvard Medical School, Boston, MA, USA.

This study compares methods for analyzing correlated survival data from physician-randomized trials of health care quality improvement interventions. Several proposed methods adjust for correlated survival data; however the most suitable method is unknown. Applying the characteristics of our study example, we performed three simulation studies to compare conditional, marginal, and non-parametric methods for analyzing clustered survival data. We simulated 1000 datasets using a shared frailty model with (1) fixed cluster size, (2) variable cluster size, and (3) non-lognormal random effects. Methods of analyses included: the nonlinear mixed model (conditional), the marginal proportional hazards model with robust standard errors, the clustered logrank test, and the clustered permutation test (non-parametric). For each method considered we estimated Type I error, power, mean squared error, and the coverage probability of the treatment effect estimator. We observed underestimated Type I error for the clustered logrank test. The marginal proportional hazards method performed well even when model assumptions were violated. Nonlinear mixed models were only advantageous when the distribution was correctly specified.

PMID: 21924382 [PubMed - as supplied by publisher]
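The following sketch mimics, in miniature, the kind of simulation the authors describe: clustered survival times generated from a shared-frailty model and analyzed with a marginal Cox model using cluster-robust standard errors. Parameter values are arbitrary, and the clustered logrank, permutation, and mixed-model analyses are omitted.

```python
# Minimal simulation sketch (not the authors' design): shared-frailty clustered
# survival data analyzed with a marginal Cox model and robust (sandwich) SEs.
import numpy as np
import pandas as pd
from lifelines import CoxPHFitter

rng = np.random.default_rng(1)
n_clusters, cluster_size = 50, 20
rows = []
for c in range(n_clusters):
    frailty = rng.lognormal(mean=0.0, sigma=0.5)       # shared within physician
    treated = int(c < n_clusters // 2)                  # physician-level randomization
    for _ in range(cluster_size):
        rate = 0.1 * frailty * np.exp(-0.3 * treated)   # true log hazard ratio = -0.3
        t = rng.exponential(1.0 / rate)
        rows.append({"cluster": c, "treated": treated,
                     "time": min(t, 10.0), "event": int(t <= 10.0)})
df = pd.DataFrame(rows)

cph = CoxPHFitter()
cph.fit(df[["time", "event", "treated", "cluster"]], duration_col="time",
        event_col="event", cluster_col="cluster")       # cluster-robust variance
print(cph.summary[["coef", "se(coef)"]])
```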

CER Scan [published within the last 30 days]

    1. BMC Med Res Methodol. 2011 Sep 21;11(1):132. [Epub ahead of print]

Benefits of ICU admission in critically ill patients: Whether instrumental variable methods or propensity scores should be used. Pirracchio R, Sprung C, Payen D, Chevret S.

BACKGROUND: The assessment of the causal effect of Intensive Care Unit (ICU) admission generally involves usual observational designs and thus requires controlling for confounding variables. Instrumental variable analysis is an econometric technique that allows causal inferences of the effectiveness of some treatments during situations to be made when a randomized trial has not been or cannot be conducted. This technique relies on the existence of one variable or “instrument” that is supposed to achieve similar observations with a different treatment for “arbitrary” reasons, thus inducing substantial variation in the treatment decision with no direct effect on the outcome. The objective of the study was to assess the benefit in terms of hospital mortality of ICU admission in a cohort of patients proposed for ICU admission (ELDICUS cohort).

METHODS: Using this cohort of 8,201 patients triaged for ICU (including 6,752 (82.3%) patients admitted), the benefit of ICU admission was evaluated using 3 different approaches: instrumental variables, standard regression and propensity score matched analyses. We further evaluated the results obtained using different instrumental variable methods that have been proposed for dichotomous outcomes.

RESULTS: The physician’s main specialization was found to be the best instrument. All instrumental variable models adequately reduced baseline imbalances, but failed to show a significant effect of ICU admission on hospital mortality, with confidence intervals far higher than those obtained in standard or propensity-based analyses.

CONCLUSIONS: Instrumental variable methods offer an appealing alternative to handle the selection bias related to nonrandomized designs, especially when the presence of significant unmeasured confounding is suspected. Applied to the ELDICUS database, this analysis failed to show any significant beneficial effect of ICU admission on hospital mortality. This result could be due to the lack of statistical power of these methods.

PMID: 21936926 [PubMed - as supplied by publisher]

Free Full Text: http://www.biomedcentral.com/content/pdf/1471-2288-11-132.pdf

    1. BMC Med Res Methodol. 2011 Sep 19;11(1):129.

Imputation of missing values of tumour stage in population-based cancer registration. Eisemann N, Waldmann A, Katalinic A. Institute of Cancer Epidemiology, University Luebeck, Ratzeburger Allee 160 (Haus 50), 23562 Luebeck, Germany. nora.eisemann@uksh.de.

BACKGROUND: Missing data on tumour stage information is a common problem in population-based cancer registries. Statistical analyses on the level of tumour stage may be biased, if no adequate method for handling of missing data is applied. In order to determine a useful way to treat missing data on tumour stage, we examined different imputation models for multiple imputation with chained equations for analysing the stage-specific numbers of cases of malignant melanoma and female breast cancer.

METHODS: This analysis was based on the malignant melanoma data set and the female breast cancer data set of the cancer registry Schleswig-Holstein, Germany. The cases with complete tumour stage information were extracted and their stage information partly removed according to a MAR missingness-pattern, resulting in five simulated data sets for each cancer entity. The missing tumour stage values were then treated with multiple imputation with chained equations, using polytomous regression, predictive mean matching, random forests and proportional sampling as imputation models. The estimated tumour stages, stage-specific numbers of cases and survival curves after multiple imputation were compared to the observed ones.

RESULTS: The amount of missing values for malignant melanoma was too high to estimate a reasonable number of cases for each UICC stage. However, multiple imputation of missing stage values led to stage-specific numbers of cases of T-stage for malignant melanoma as well as T- and UICC-stage for breast cancer close to the observed numbers of cases. The observed tumour stages on the individual level, the stage-specific numbers of cases and the observed survival curves were best met with polytomous regression or predictive mean matching but not with random forest or proportional sampling as imputation models.

CONCLUSIONS: This limited simulation study indicates that multiple imputation with chained equations is an appropriate technique for dealing with missing information on tumour stage in population-based cancer registries, if the amount of unstaged cases is on a reasonable level.

PMID: 21929796 [PubMed - as supplied by publisher]

Free Full Text: http://www.biomedcentral.com/content/pdf/1471-2288-11-129.pdf
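A stripped-down sketch of one imputation model considered above (multinomial/polytomous regression), drawing m completed data sets by sampling stage from predicted probabilities; a full chained-equations run would cycle over all incomplete variables. The column names stage, age, sex, and grade are hypothetical.

```python
# Minimal sketch (not the registry's pipeline): impute missing categorical
# tumour stage m times from a multinomial logistic model fitted on complete cases.
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

def impute_stage(df, m=5, predictors=("age", "sex", "grade"), seed=0):
    rng = np.random.default_rng(seed)
    complete = df[df["stage"].notna()]
    model = LogisticRegression(max_iter=1000).fit(complete[list(predictors)],
                                                  complete["stage"])
    missing = df[df["stage"].isna()]
    probs = model.predict_proba(missing[list(predictors)])
    imputed_sets = []
    for _ in range(m):
        draws = [rng.choice(model.classes_, p=p) for p in probs]  # sample, not argmax
        d = df.copy()
        d.loc[missing.index, "stage"] = draws
        imputed_sets.append(d)
    return imputed_sets   # analyze each completed set, then pool with Rubin's rules

# sets = impute_stage(registry_df)    # registry_df is a hypothetical DataFrame
```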

    1. Med Care. 2011 Oct;49(10):940-7.

The Mortality Risk Score and the ADG Score: Two Points-Based Scoring Systems for the Johns Hopkins Aggregated Diagnosis Groups to Predict Mortality in a General Adult Population Cohort in Ontario, Canada. Austin PC, Walraven C. *Institute for Clinical Evaluative Sciences, Toronto, Ontario †Department of Health Management, Policy and Evaluation ‡Dalla Lana School of Public Health, University of Toronto §Ottawa Hospital Research Institute Department of Medicine, University of Ottawa, Ottawa, Ontario, Canada.

BACKGROUND: Logistic regression models that incorporated age, sex, and indicator variables for the Johns Hopkins’ Aggregated Diagnosis Groups (ADGs) categories have been shown to accurately predict all-cause mortality in adults.

OBJECTIVES: To develop 2 different point-scoring systems using the ADGs. The Mortality Risk Score (MRS) collapses age, sex, and the ADGs to a single summary score that predicts the annual risk of all-cause death in adults. The ADG Score derives weights for the individual ADG diagnosis groups.

RESEARCH DESIGN: Retrospective cohort constructed using population-based administrative data.

PARTICIPANTS: All 10,498,413 residents of Ontario, Canada, between the ages of 20 and 100 years who were alive on their birthday in 2007 participated in this study. Participants were randomly divided into derivation and validation samples.

MEASURES: Death within 1 year.

RESULTS: In the derivation cohort, the MRS ranged from -21 to 139 (median value 29, IQR 17 to 44). In the validation group, a logistic regression model with the MRS as the sole predictor significantly predicted the risk of 1-year mortality with a c-statistic of 0.917. A regression model with age, sex, and the ADG Score had similar performance. Both methods accurately predicted the risk of 1-year mortality across the 20 vigintiles of risk.

CONCLUSIONS: The MRS combined values for a person’s age, sex, and the Johns Hopkins ADGs to accurately predict 1-year mortality in adults. The ADG Score is a weighted score representing the presence or absence of the 32 ADG diagnosis groups. These scores will facilitate health services researchers conducting risk adjustment using administrative health care databases.

PMID: 21921849 [PubMed - in process]

    1. Ann Epidemiol. 2011 Oct;21(10):780-6.

Mixture analysis of heterogeneous physical activity outcomes. Lee AH, Xiang L. Department of Epidemiology and Biostatistics, School of Public Health, Curtin University, Perth, WA, Australia.

PURPOSE: The health benefits of physical activity (PA) are well established. PA outcomes, being semicontinuous in nature, often exhibit a large portion of zero values together with continuous positive values that are right-skewed. We propose a novel two-part mixture regression model with random effects to characterize heterogeneity of the clustered PA data.

METHODS: In the binary part, the odds of PA participation are modeled with the use of a logistic mixed regression model. In the continuous part, the PA intensity conditional on those individuals engaging in PA is assessed by a gamma mixture regression model. Random effects are incorporated within the two parts to account for correlation of the observations.

RESULTS: Model fitting and inference are performed through the Gaussian quadrature technique, which is implemented conveniently in the SAS PROC NLMIXED. The development of mixture methodology for analyzing PA is motivated by a study of PA in the daily life of patients with chronic obstructive pulmonary disease.

CONCLUSIONS: The findings demonstrate the usefulness of the mixture analysis, which enables the separate identification of pertinent factors affecting PA participation and PA intensity for different patient subgroups.

PMID: 21684174 [PubMed - in process]

Additional Article of Interest [published within the last 90 days]:

    1. Am J Prev Med 2011;40(6):637–644.

A Proposal to Speed Translation of Healthcare Research Into Practice: Dramatic Change Is Needed. Kessler R, Glasgow RE.

Efficacy trials have generated interventions to improve health behaviors and biomarkers. However, these efforts have had limited impact on practice and policy. It is suggested that key methodologic and contextual issues have contributed to this state of affairs. Current research paradigms generally have not provided the answers needed for more probable and more rapid translation. A major shift is proposed to produce research with more rapid clinical, public health, and policy impact. Copyright © 2011 American Journal of Preventive Medicine. All rights reserved.

PMID: 21565657 [PubMed - indexed for MEDLINE]

 

September 2011

CER Scan [Epub ahead of print]

    1. Biostatistics. 2011 Aug 18. [Epub ahead of print]

A robust method using propensity score stratification for correcting verification bias for binary tests. He H, McDermott MP. Department of Biostatistics and Computational Biology, University of Rochester, Rochester, NY 14642, USA. mikem@bst.rochester.edu.

Sensitivity and specificity are common measures of the accuracy of a diagnostic test. The usual estimators of these quantities are unbiased if data on the diagnostic test result and the true disease status are obtained from all subjects in an appropriately selected sample. In some studies, verification of the true disease status is performed only for a subset of subjects, possibly depending on the result of the diagnostic test and other characteristics of the subjects. Estimators of sensitivity and specificity based on this subset of subjects are typically biased; this is known as verification bias. Methods have been proposed to correct verification bias under the assumption that the missing data on disease status are missing at random (MAR), that is, the probability of missingness depends on the true (missing) disease status only through the test result and observed covariate information. When some of the covariates are continuous, or the number of covariates is relatively large, the existing methods require parametric models for the probability of disease or the probability of verification (given the test result and covariates), and hence are subject to model misspecification. We propose a new method for correcting verification bias based on the propensity score, defined as the predicted probability of verification given the test result and observed covariates. This is estimated separately for those with positive and negative test results. The new method classifies the verified sample into several subsamples that have homogeneous propensity scores and allows correction for verification bias. Simulation studies demonstrate that the new estimators are more robust to model misspecification than existing methods, but still perform well when the models for the probability of disease and probability of verification are correctly specified.

PMID: 21856650 [PubMed - as supplied by publisher]
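The general idea can be sketched as follows: model the probability of verification given the test result and covariates, group subjects into propensity strata, and reweight the verified subjects accordingly. This is a simplified pooled-model illustration, not the authors' estimator (they model the verification propensity separately for test-positive and test-negative subjects); the column names test, verified, disease, x1, and x2 are hypothetical.

```python
# Minimal sketch (simplified, not the authors' estimator): verification-bias-
# corrected sensitivity and specificity via propensity strata for verification.
import numpy as np
import pandas as pd
import statsmodels.api as sm

def corrected_accuracy(df, n_strata=5):
    X = sm.add_constant(df[["test", "x1", "x2"]])
    ps = sm.Logit(df["verified"], X).fit(disp=0).predict(X)      # P(verified | test, covariates)
    stratum = pd.qcut(ps, n_strata, labels=False, duplicates="drop")
    p_verify = df.groupby(stratum)["verified"].transform("mean")  # stratum verification rate
    v = df[df["verified"] == 1].copy()
    v["w"] = 1.0 / p_verify[df["verified"] == 1]                  # inverse verification probability
    sens = (v["w"] * v["test"] * v["disease"]).sum() / (v["w"] * v["disease"]).sum()
    spec = (v["w"] * (1 - v["test"]) * (1 - v["disease"])).sum() / (v["w"] * (1 - v["disease"])).sum()
    return sens, spec
```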

    1. Prev Med. 2011 Aug 17. [Epub ahead of print]

Null misinterpretation in statistical testing and its impact on health risk assessment. Greenland S.

Statistical methods play a pivotal role in health risk assessment, but not always an enlightened one. Problems well known to academics are frequently overlooked in crucial nonacademic venues such as litigation, even though those venues can have profound impacts on population health and medical practice. Statisticians have focused heavily on how statistical significance overstates evidence against null hypotheses, but less on how statistical nonsignificance does not correspond to evidence for the null. I thus present an example of a highly credentialed statistical expert conflating high “nonsignificance” with strong support for the null, via misinterpretation of a P-value as a posterior probability of the null hypothesis. Reverse-Bayes analyses reveal that nearly all the support for the null claimed by the expert must have come from the expert’s prior, rather than the data, and that there was no background data that could support a strong prior. The example illustrates how carelessness about the actual meaning of P-values and confidence limits allows extremely biased prior opinions (including null-spiked opinions) to be presented as if they were objective inferences from the data.

PMID: 21871481 [PubMed - as supplied by publisher]
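A small numerical illustration of the central point (not Greenland's reverse-Bayes calculation): a nonsignificant result translates into strong posterior support for the null only if the prior already concentrates on the null. The effect size, standard error, and priors below are arbitrary.

```python
# Minimal illustration (arbitrary numbers, not the article's analysis): posterior
# probability of a point null given a "nonsignificant" Gaussian estimate.
import numpy as np
from scipy import stats

beta_hat, se = 0.18, 0.15                     # e.g. log relative risk, two-sided p ~ 0.23
p_value = 2 * stats.norm.sf(abs(beta_hat) / se)

def posterior_prob_null(prior_prob_null, prior_sd_alt=0.4):
    # marginal likelihood under H1 integrates a N(0, prior_sd_alt^2) prior on the effect
    like_h0 = stats.norm.pdf(beta_hat, 0, se)
    like_h1 = stats.norm.pdf(beta_hat, 0, np.sqrt(se**2 + prior_sd_alt**2))
    bf01 = like_h0 / like_h1
    odds = bf01 * prior_prob_null / (1 - prior_prob_null)
    return odds / (1 + odds)

for prior in (0.5, 0.9, 0.99):
    print(f"p={p_value:.2f}  prior P(null)={prior:.2f}  "
          f"posterior P(null)={posterior_prob_null(prior):.2f}")
```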

    1. J Clin Epidemiol. 2011 Aug 11. [Epub ahead of print]

The “best balance” allocation led to optimal balance in cluster-controlled trials. de Hoop E, Teerenstra S, van Gaal BG, Moerbeek M, Borm GF. Department of Epidemiology, Biostatistics and HTA, 133, Radboud University Nijmegen Medical Centre, PO Box 9101, 6500 HB Nijmegen, The Netherlands.

OBJECTIVE: Balance of prognostic factors between treatment groups is desirable because it improves the accuracy, precision, and credibility of the results. In cluster-controlled trials, imbalance can easily occur by chance when the number of clusters is small. If all clusters are known at the start of the study, the “best balance” allocation method (BB) can be used to obtain optimal balance. This method will be compared with other allocation methods.

STUDY DESIGN AND SETTING: We carried out a simulation study to compare the balance obtained with BB, minimization, unrestricted randomization, and matching for four to 20 clusters and one to five categorical prognostic factors at cluster level.

RESULTS: BB resulted in a better balance than randomization in 13-100% of the situations, in 0-61% for minimization, and in 0-88% for matching. The superior performance of BB increased as the number of clusters and/or the number of factors increased.

CONCLUSION: BB results in a better balance of prognostic factors than randomization, minimization, stratification, and matching in most situations. Furthermore, BB cannot result in a worse balance of prognostic factors than the other methods.

PMID: 21840173 [PubMed - as supplied by publisher]
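One way to read the “best balance” idea is exhaustive enumeration when all clusters are known in advance, as sketched below with made-up cluster-level data; the authors' exact balance criterion may differ (in practice the factors would also be standardized before summing).

```python
# Minimal sketch (one reading of the idea, not the authors' criterion): enumerate
# every split of the clusters into two equal arms and keep the most balanced one.
import itertools
import pandas as pd

clusters = pd.DataFrame({"size": [12, 30, 25, 18, 22, 15, 28, 20],      # hypothetical
                         "high_risk": [1, 0, 1, 1, 0, 0, 1, 0]})        # cluster factors

def imbalance(arm_a_idx):
    a = clusters.loc[list(arm_a_idx)]
    b = clusters.drop(index=list(arm_a_idx))
    return sum(abs(a[f].mean() - b[f].mean()) for f in ["size", "high_risk"])

n = len(clusters)
best = min(itertools.combinations(range(n), n // 2), key=imbalance)
print("arm A clusters:", best, "imbalance:", round(imbalance(best), 3))
```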

    1. Stat Med. 2011 Aug 4. doi: 10.1002/sim.4322. [Epub ahead of print]

Subgroup identification from randomized clinical trial data. Foster JC, Taylor JM, Ruberg SJ. Department of Biostatistics, University of Michigan, Ann Arbor, MI, 48109, USA.

We consider the problem of identifying a subgroup of patients who may have an enhanced treatment effect in a randomized clinical trial, and it is desirable that the subgroup be defined by a limited number of covariates. For this problem, the development of a standard, pre-determined strategy may help to avoid the well-known dangers of subgroup analysis. We present a method developed to find subgroups of enhanced treatment effect. This method, referred to as ‘Virtual Twins’, involves predicting response probabilities for treatment and control ‘twins’ for each subject. The difference in these probabilities is then used as the outcome in a classification or regression tree, which can potentially include any set of the covariates. We define a measure Q(Â) to be the difference between the treatment effect in the estimated subgroup Â and the marginal treatment effect. We present several methods developed to obtain an estimate of Q(Â), including estimation of Q(Â) using estimated probabilities in the original data, using estimated probabilities in newly simulated data, two cross-validation-based approaches, and a bootstrap-based bias-corrected approach. Results of a simulation study indicate that the Virtual Twins method noticeably outperforms logistic regression with forward selection when a true subgroup of enhanced treatment effect exists. Generally, large sample sizes or strong enhanced treatment effects are needed for subgroup estimation. As an illustration, we apply the proposed methods to data from a randomized clinical trial. Copyright © 2011 John Wiley & Sons, Ltd.

PMID: 21815180 [PubMed - as supplied by publisher]
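A compact sketch of the ‘Virtual Twins’ workflow on simulated data (a simplified reading, not the authors' implementation): predict each subject's outcome probability under treatment and under control with a random forest, then summarize the per-subject difference with a shallow regression tree to describe a candidate subgroup.

```python
# Minimal sketch (simplified, not the authors' implementation) of Virtual Twins.
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeRegressor, export_text

rng = np.random.default_rng(2)
n = 2000
X = pd.DataFrame({"x1": rng.normal(size=n), "x2": rng.normal(size=n)})
treat = rng.integers(0, 2, n)
p = 0.3 + 0.25 * treat * (X["x1"] > 0) - 0.05 * treat      # enhanced effect when x1 > 0
y = rng.binomial(1, p)

rf = RandomForestClassifier(n_estimators=200, random_state=0)
rf.fit(pd.concat([X, pd.Series(treat, name="treat")], axis=1), y)

def predict_with_treat(value):
    Xa = X.copy()
    Xa["treat"] = value
    return rf.predict_proba(Xa)[:, 1]

z = predict_with_treat(1) - predict_with_treat(0)           # per-subject 'twin' difference
tree = DecisionTreeRegressor(max_depth=2).fit(X, z)
print(export_text(tree, feature_names=list(X.columns)))     # candidate subgroup rule
```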

    1. Pharmacoepidemiol Drug Saf. 2011 Aug 2. doi: 10.1002/pds.2205. [Epub ahead of print]

Record linkage for pharmacoepidemiological studies in cancer patients. Herk-Sukel MP, Lemmens VE, Poll-Franse LV, Herings RM, Coebergh JW. PHARMO Institute for Drug Outcomes Research, Utrecht, the Netherlands. myrthe.van.herk@pharmo.nl.

BACKGROUND: An increasing need has developed for the post-approval surveillance of (new) anti-cancer drugs by means of pharmacoepidemiology and outcomes research in the area of oncology.

OBJECTIVES: To create an overview that makes researchers aware of the available database linkages in Northern America and Europe which facilitate pharmacoepidemiology and outcomes research in cancer patients.

METHODS: In addition to our own database, i.e. the Eindhoven Cancer Registry (ECR) linked to the PHARMO Record Linkage System, we considered database linkages between a population-based cancer registry and an administrative healthcare database that at least contains information on drug use and offers a longitudinal perspective on healthcare utilization. Eligible database linkages were limited to those that had been used in multiple published articles in English language included in Pubmed. The HMO Cancer Research Network (CRN) in the US was excluded from this review, as an overview of the linked databases participating in the CRN is already provided elsewhere. Researchers who had worked with the data resources included in our review were contacted for additional information and verification of the data presented in the overview.

RESULTS: The following database linkages were included: the Surveillance, Epidemiology, and End-Results-Medicare; cancer registry data linked to Medicaid; Canadian cancer registries linked to population-based drug databases; the Scottish cancer registry linked to the Tayside drug dispensing data; linked databases in the Nordic Countries of Europe: Norway, Sweden, Finland and Denmark; and the ECR-PHARMO linkage in the Netherlands. Descriptives of the included database linkages comprise population size, generalizability of the population, year of first data availability, contents of the cancer registry, contents of the administrative healthcare database, the possibility to select a cancer-free control cohort, and linkage to other healthcare databases.

CONCLUSIONS: The linked databases offer a longitudinal perspective, allowing for observations of health care utilization before, during, and after cancer diagnosis. They create new powerful data resources for the monitoring of post-approval drug utilization, as well as a framework to explore the cost-effectiveness of new, often expensive, anti-cancer drugs as used in everyday practice. Copyright © 2011 John Wiley & Sons, Ltd.

PMID: 21812067 [PubMed - as supplied by publisher]

CER Scan [published within the last 30 days]

    1. JAMA. 2011 Aug 24;306(8):848-55.

Automated identification of postoperative complications within an electronic medical record using natural language processing. Murff HJ, FitzHenry F, Matheny ME, Gentry N, Kotter KL, Crimin K, Dittus RS, Rosen AK, Elkin PL, Brown SH, Speroff T. Tennessee Valley Healthcare System, Veterans Affairs Medical Center, Nashville, TN, USA. harvey.j.murff@vanderbilt.edu

Comment in: JAMA. 2011 Aug 24;306(8):880-1.

CONTEXT: Currently most automated methods to identify patient safety occurrences rely on administrative data codes; however, free-text searches of electronic medical records could represent an additional surveillance approach. OBJECTIVE: To evaluate a natural language processing search-approach to identify postoperative surgical complications within a comprehensive electronic medical record.

DESIGN, SETTING, AND PATIENTS: Cross-sectional study involving 2974 patients undergoing inpatient surgical procedures at 6 Veterans Health Administration (VHA) medical centers from 1999 to 2006.

MAIN OUTCOME MEASURES: Postoperative occurrences of acute renal failure requiring dialysis, deep vein thrombosis, pulmonary embolism, sepsis, pneumonia, or myocardial infarction identified through medical record review as part of the VA Surgical Quality Improvement Program. We determined the sensitivity and specificity of the natural language processing approach to identify these complications and compared its performance with patient safety indicators that use discharge coding information. RESULTS: The proportion of postoperative events for each sample was 2% (39 of 1924) for acute renal failure requiring dialysis, 0.7% (18 of 2327) for pulmonary embolism, 1% (29 of 2327) for deep vein thrombosis, 7% (61 of 866) for sepsis, 16% (222 of 1405) for pneumonia, and 2% (35 of 1822) for myocardial infarction. Natural language processing correctly identified 82% (95% confidence interval [CI], 67%-91%) of acute renal failure cases compared with 38% (95% CI, 25%-54%) for patient safety indicators. Similar results were obtained for venous thromboembolism (59%, 95% CI, 44%-72% vs 46%, 95% CI, 32%-60%), pneumonia (64%, 95% CI, 58%-70% vs 5%, 95% CI, 3%-9%), sepsis (89%, 95% CI, 78%-94% vs 34%, 95% CI, 24%-47%), and postoperative myocardial infarction (91%, 95% CI, 78%-97% vs 89%, 95% CI, 74%-96%). Both natural language processing and patient safety indicators were highly specific for these diagnoses.

CONCLUSION: Among patients undergoing inpatient surgical procedures at VA medical centers, natural language processing analysis of electronic medical records to identify postoperative complications had higher sensitivity and lower specificity compared with patient safety indicators based on discharge coding.

PMID: 21862746 [PubMed - indexed for MEDLINE]

    1. JAMA. 2011 Aug 17;306(7):709; author reply 709-10.

Efficacy research and unanswered clinical questions. Vohra S, Shamseer L, Sampson M.

Comment on: JAMA. 2011 May 18;305(19):2005-6.

PMID: 21846851 [PubMed - indexed for MEDLINE]

    1. Pharmacoepidemiol Drug Saf. 2011 Aug;20(8):858-65. doi: 10.1002/pds.2160. Epub 2011 Jun 13.

Why do covariates defined by International Classification of Diseases codes fail to remove confounding in pharmacoepidemiologic studies among seniors? Jackson ML, Nelson JC, Jackson LA. Group Health Research Institute, Seattle, WA, USA. jackson.ml@ghc.org.

PURPOSE: The common practice of using administrative diagnosis codes as the sole source of data on potential confounders in pharmacoepidemiologic studies has been shown to leave substantial residual confounding. We explored reasons why adjustment for comorbid illness defined from International Classification of Diseases (ICD) codes fails to remove confounding.

METHODS: We used data from a case-control study among immunocompetent seniors enrolled in Group Health to estimate bias in the estimated association between receipt of influenza vaccine and the risk of community-acquired pneumonia during non-influenza control periods and to estimate the effects of adjusting for comorbid illnesses defined from either ICD codes or the medical record. We also estimated the accuracy of ICD codes for identifying comorbid illnesses compared with the gold standard of medical record review.

RESULTS: Sensitivity of ICD codes for illnesses recorded in the medical record ranged from 59 to 97% (median, 76%). Strong confounding was present in the vaccine/pneumonia association, as evidenced by the non-null odds ratio of 0.60 (95% confidence interval, 0.38-0.95) during this control period. Adjusting for the presence/absence of comorbid illnesses defined from either medical record review (odds ratio, 0.73) or from ICD codes (odds ratio, 0.68) left considerable residual confounding.

CONCLUSIONS: ICD codes may fail to control for confounding because they often lack sensitivity for detecting comorbid illnesses and because measures of the presence/absence of comorbid illnesses may be insufficient to remove confounding. These findings call for caution in the use of ICD codes to control for confounding. Copyright © 2011 John Wiley & Sons, Ltd.

PMID: 21671442 [PubMed - in process]

    1. J Clin Epidemiol. 2011 Aug;64(8):821-9. Epub 2010 Dec 30.

Development and use of reporting guidelines for assessing the quality of validation studies of health administrative data. Benchimol EI, Manuel DG, To T, Griffiths AM, Rabeneck L, Guttmann A. The Institute for Clinical Evaluative Sciences, Toronto, Ontario, Canada. ebenchimol@cheo.on.ca

BACKGROUND AND OBJECTIVES: Validation of health administrative data for identifying patients with different health states (diseases and conditions) is a research priority, but no guidelines exist for ensuring quality. We created reporting guidelines for studies validating administrative data identification algorithms and used them to assess the quality of reporting of validation studies in the literature.

METHODS: Using Standards for Reporting of Diagnostic accuracy (STARD) criteria as a guide, we created a 40-item checklist of items with which identification accuracy studies should be reported. A systematic review identified studies that validated identification algorithms using administrative data. We used the checklist to assess the quality of reporting.

RESULTS: In 271 included articles, goals and data sources were well reported but few reported four or more statistical estimates of accuracy (36.9%). In 65.9% of studies reporting positive predictive value (PPV)/negative predictive value (NPV), the prevalence of disease in the validation cohort was higher than in the administrative data, potentially falsely elevating predictive values. Subgroup accuracy (53.1%) and 95% confidence intervals for accuracy measures (35.8%) were also underreported.

CONCLUSIONS: The quality of studies validating health states in the administrative data varies, with significant deficits in reporting of markers of diagnostic accuracy, including the appropriate estimation of PPV and NPV. These omissions could lead to misclassification bias and incorrect estimation of incidence and health services utilization rates. Use of a reporting checklist, such as the one created for this study by modifying the STARD criteria, could improve the quality of reporting of validation studies, allowing for accurate application of algorithms, and interpretation of research using health administrative data.

PMID: 21194889 [PubMed - indexed for MEDLINE]

SEPTEMBER THEME: Application of Propensity Scores in CER of Surgical Interventions (This is a cross-section of studies published within the last year that demonstrate the level of discussion in the field. The Methods Center does not necessarily endorse the studies’ methodology)

    1. Arch Surg. 2010 Oct;145(10):939-45.

Introduction to propensity scores: A case study on the comparative effectiveness of laparoscopic vs open appendectomy. Hemmila MR, Birkmeyer NJ, Arbabi S, Osborne NH, Wahl WL, Dimick JB. Department of Surgery, University of Michigan Medical School, Ann Arbor, 48109-5033, USA. mhemmila@umich.edu

Comment in Arch Surg. 2010 Oct;145(10):945-6.

OBJECTIVE: To demonstrate the use of propensity scores to evaluate the comparative effectiveness of laparoscopic and open appendectomy.

DESIGN: Retrospective cohort study.

SETTING: Academic and private hospitals.

PATIENTS: All patients undergoing open or laparoscopic appendectomy (n = 21 475) in the Public Use File of the American College of Surgeons National Surgical Quality Improvement Program were included in the study. We first evaluated the surgical approach (laparoscopic vs open) using multivariate logistic regression. We next generated propensity scores and compared outcomes for open and laparoscopic appendectomy in a 1:1 matched cohort. Covariates in the model for propensity scores included comorbidities, age, sex, race, and evidence of perforation.

MAIN OUTCOME MEASURES: Patient morbidity and mortality, rate of return to operating room, and hospital length of stay.

RESULTS: Twenty-eight percent of patients underwent open appendectomy, and 72% had a laparoscopic approach; 33% (open) vs 14% (laparoscopic) had evidence of a ruptured appendix. In the propensity-matched cohort, there was no difference in mortality (0.3% vs 0.2%), reoperation (1.8% vs 1.5%), or incidence of major complications (5.9% vs 5.4%) between groups. Patients undergoing laparoscopic appendectomy experienced fewer wound infections (odds ratio [OR], 0.4; 95% confidence interval [CI], 0.3-0.5) and fewer episodes of sepsis (0.8; 0.6-1.0) but had a greater risk of intra-abdominal abscess (1.7; 1.3-2.2). An analysis using multivariate adjustment resulted in similar findings.

CONCLUSIONS: After accounting for patient severity, open and laparoscopic appendectomy had similar clinical outcomes. In this case study, propensity score methods and multivariate adjustment yielded nearly identical results.

PMID: 20956761 [PubMed - indexed for MEDLINE]
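For readers new to the technique highlighted in this theme, the sketch below performs greedy 1:1 nearest-neighbor matching (with replacement, for brevity) on the logit of the propensity score within a caliper; it illustrates the general approach, not the authors' analysis, and all column names are hypothetical.

```python
# Minimal sketch (not the authors' analysis): 1:1 propensity score matching on
# the logit of the PS with a caliper of 0.2 standard deviations.
import numpy as np
import pandas as pd
import statsmodels.api as sm
from sklearn.neighbors import NearestNeighbors

def ps_match(df, treatment, covariates, caliper=0.2):
    X = sm.add_constant(df[covariates])
    ps = sm.Logit(df[treatment], X).fit(disp=0).predict(X)
    logit_ps = np.log(ps / (1 - ps))
    cal = caliper * logit_ps.std()
    treated = df[df[treatment] == 1].index
    control = df[df[treatment] == 0].index
    nn = NearestNeighbors(n_neighbors=1).fit(logit_ps[control].to_numpy().reshape(-1, 1))
    dist, pos = nn.kneighbors(logit_ps[treated].to_numpy().reshape(-1, 1))
    keep = dist.ravel() <= cal
    return pd.DataFrame({"treated": treated[keep],
                         "control": control[pos.ravel()[keep]]})   # matched pairs

# pairs = ps_match(df, "laparoscopic", ["age", "sex", "perforation", "comorbidity_count"])
# Outcomes are then compared within the matched cohort, with balance checked first.
```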

    1. Arch Surg. 2011 Jul;146(7):887-8.

Propensity score methods: setting the score straight. Mayo SC, Pawlik TM. Department of Surgery, Johns Hopkins University, 600 N Wolfe St, Blalock 655, Baltimore, MD 21287. tpawlik1@jhmi.edu.

PMID: 21768443 [PubMed - in process]

    1. J Thorac Cardiovasc Surg. 2011 Aug 13. [Epub ahead of print]

On-pump and off-pump coronary artery bypass grafting in patients with left main stem disease: A propensity score analysis. Murzi M, Caputo M, Aresu G, Duggan S, Miceli A, Glauber M, Angelini GD. Bristol Heart Institute, University of Bristol, Bristol, UK.

OBJECTIVE: This study compared safety and efficacy between off-pump coronary artery bypass grafting (OPCAB), a relatively new technique, and conventional on-pump coronary artery bypass grafting (CCAB) in patients with left main stem disease.

METHODS: In a retrospective, observational, cohort study of prospectively collected data on 2375 consecutive patients with left main stem disease undergoing isolated CABG (1297 OPCAB, 1078 CCAB) between April 1996 and December 2009 at the Bristol Heart Institute, 548 patients undergoing OPCAB were matched with 548 patients undergoing CCAB by propensity score.

RESULTS: After propensity matching, groups were comparable in preoperative characteristics. Relative to CCAB, OPCAB was associated with lower in-hospital mortality (0.5% vs 2.9%; P = .001), incidence of stroke (0% vs 0.9%; P = .02), postoperative renal dysfunction (4.9% vs 10.8%; P = .001), pulmonary complications (10.2% vs 16.6%; P = .002), and infectious complications (3.5% vs 6.2%; P = .03). The OPCAB group received fewer grafts than did the CCAB group (2.7 ± 0.7 vs 3 ± 0.7; P = .001) and had a lower rate of complete revascularization (88.3% vs 92%; P = .04). In multivariable analysis, cardiopulmonary bypass was confirmed to be an independent predictor of in-hospital mortality (odds ratio, 5.74; P = .001). Survivals at 1, 5, and 10 years were similar between groups (OPCAB, 96.8%, 87.3%, and 71.7%; CCAB, 96.8%, 88.6%, and 69.8%).

CONCLUSIONS: OPCAB in patients with left main stem disease is a safe procedure with reduced early morbidity and mortality and similar long-term survival to conventional on-pump revascularization.

PMID: 21843893 [PubMed - as supplied by publisher]

    1. J Thorac Cardiovasc Surg. 2011 Jun 16. [Epub ahead of print]

Results of matching valve and root repair to aortic valve and root pathology. Svensson LG, Batizy LH, Blackstone EH, Marc Gillinov A, Moon MC, D’Agostino RS, Nadolny EM, Stewart WJ, Griffin BP, Hammer DF, Grimm R, Lytle BW. Department of Thoracic and Cardiovascular Surgery, Cleveland Clinic, Cleveland, Ohio.

OBJECTIVE: For patients with aortic root pathology and aortic valve regurgitation, aortic valve replacement is problematic because no durable bioprosthesis exists, and mechanical valves require lifetime anticoagulation. This study sought to assess outcomes of combined aortic valve and root repair, including comparison with matched bioprosthesis aortic valve replacement.

METHODS: From November 1990 to January 2005, 366 patients underwent modified David reimplantation (n = 72), root remodeling (n = 72), or valve repair with sinotubular junction tailoring (n = 222). Active follow-up was 99% complete, with a mean of 5.6 ± 4.0 years (maximum 17 years); follow-up for vital status averaged 8.5 ± 3.6 years (maximum 19 years). Propensity-adjusted models were developed for fair comparison of outcomes.

RESULTS: Thirty-day and 5-, 10-, and 15-year survivals were 98%, 86%, 74%, and 58%, respectively, similar to that of the US matched population and better than that after bioprosthesis aortic valve replacement. Propensity-score-adjusted survival was similar across procedures (P > .3). Freedom from reoperation at 30 days and 5 and 10 years was 99%, 92%, and 89%, respectively, and was similar across procedures (P > .3) after propensity-score adjustment. Patients with tricuspid aortic valves were more likely to be free of reoperation than those with bicuspid valves at 10 years (93% vs 77%, P = .002), equivalent to bioprosthesis aortic valve replacement and superior after 12 years. Bioprostheses increasingly deteriorated after 7 years, and hazard functions for reoperation crossed at 7 years.

CONCLUSIONS: Valve preservation (rather than replacement) and matching root procedures have excellent early and long-term results, with increasing survival benefit at 7 years and fewer reoperations by 12 years. We recommend this procedure for experienced surgical teams.

PMID: 21683965 [PubMed - as supplied by publisher]

    1. Ann Surg. 2011 Feb;253(2):385-92.

Can the impact of change of surgical teams in cardiovascular surgery be measured by operative mortality or morbidity? A propensity adjusted cohort comparison. Brown ML, Parker SE, Quiñonez LG, Li Z, Sundt TM. Division of Anesthesiology and Pain Medicine, University of Alberta, Edmonton, AB, Canada.

OBJECTIVE: Our objective was to examine the impact of team changeover and unfamiliar teams in cardiovascular surgery on traditional clinical outcome measures.

BACKGROUND: The importance of teamwork in the operating room is increasingly being appreciated, but the impact on more traditional outcome measures is unclear.

METHODS: Elective or urgent cardiovascular procedures were divided into categories: team D (patients who had an operation with a day team); team E (patients who had an operation with an evening team); team C (patients who had an operation which included changeover between a day and evening team). Comparison groups were adjusted using propensity scores.

RESULTS: We identified 6698 patients who met inclusion criteria (team D, n = 3781; team E, n = 518; team C, n = 2399). After propensity score adjustment, there was an increased skin–skin time of 28 minutes in team C when compared with team D (P < 0.001) and of 21 minutes when compared with team E (P < 0.001). There were also more episodes of septicemia among team C patients (OR 1.85, P = 0.013) when compared with team D. Patients operated by a day team had a statistically significantly lower number of ventilated hours and shorter hospital length of stay when compared with team E and team C (P < 0.001 and P < 0.001, respectively). There was no difference between teams in operative death, reoperation for bleeding, blood transfusion, renal failure/dialysis, neurologic events, or deep/superficial wound infections.

CONCLUSIONS: The change in operating room personnel from the day team to the evening team added significant length to the total operating department time in cardiovascular surgery; however, its impact on most traditional outcome measures was difficult to demonstrate. More sensitive outcome measures may be required to assess the impact of teamwork interventions.

PMID: 21173693 [PubMed - indexed for MEDLINE]

    1. J Thorac Cardiovasc Surg. 2011 Jan;141(1):72-80.e1-4. Epub 2010 Nov 19.

Robotic repair of posterior mitral valve prolapse versus conventional approaches: potential realized. Mihaljevic T, Jarrett CM, Gillinov AM, Williams SJ, DeVilliers PA, Stewart WJ, Svensson LG, Sabik JF 3rd, Blackstone EH. Department of Thoracic and Cardiovascular Surgery, Heart and Vascular Institute, Cleveland Clinic, Cleveland, Ohio 44195, USA. mihaljt@ccf.org

OBJECTIVE: Robotic mitral valve repair is the least invasive approach to mitral valve repair, yet there are few data comparing its outcomes with those of conventional approaches. Therefore, we compared outcomes of robotic mitral valve repair with those of complete sternotomy, partial sternotomy, and right mini-anterolateral thoracotomy.

METHODS: From January 2006 to January 2009, 759 patients with degenerative mitral valve disease and posterior leaflet prolapse underwent primary isolated mitral valve surgery by complete sternotomy (n = 114), partial sternotomy (n = 270), right mini-anterolateral thoracotomy (n = 114), or a robotic approach (n = 261). Outcomes were compared on an intent-to-treat basis using propensity-score matching.

RESULTS: Mitral valve repair was achieved in all patients except 1 patient in the complete sternotomy group. In matched groups, median cardiopulmonary bypass time was 42 minutes longer for robotic than complete sternotomy, 39 minutes longer than partial sternotomy, and 11 minutes longer than right mini-anterolateral thoracotomy (P < .0001); median myocardial ischemic time was 26 minutes longer than complete sternotomy and partial sternotomy, and 16 minutes longer than right mini-anterolateral thoracotomy (P < .0001). Quality of mitral valve repair was similar among matched groups (P = .6, .2, and .1, respectively). There were no in-hospital deaths. Neurologic, pulmonary, and renal complications were similar among groups (P > .1). The robotic group had the lowest occurrences of atrial fibrillation and pleural effusion, contributing to the shortest hospital stay (median 4.2 days), 1.0, 1.6, and 0.9 days shorter than for complete sternotomy, partial sternotomy, and right mini-anterolateral thoracotomy (all P < .001), respectively.

CONCLUSIONS: Robotic repair of posterior mitral valve leaflet prolapse is as safe and effective as conventional approaches. Technical complexity and longer operative times for robotic repair are compensated for by lesser invasiveness and shorter hospital stay.

PMID: 21093881 [PubMed - indexed for MEDLINE]

    1. J Urol. 2011 Jan;185(1):111-5. Epub 2010 Nov 12.

Comparative effectiveness of perineal versus retropubic and minimally invasive radical prostatectomy. Prasad SM, Gu X, Lavelle R, Lipsitz SR, Hu JC. Division of Urologic Surgery, Brigham and Women’s Hospital, Boston, Massachusetts, USA. sprasad1@bsd.surgery.uchicago.edu

Comment in

J Urol. 2011 Jul;186(1):350-1; author reply 351.

J Urol. 2011 Jul;186(1):351; author reply 351-2.

PURPOSE: While perineal radical prostatectomy has been largely supplanted by retropubic and minimally invasive radical prostatectomy, it was the predominant surgical approach for prostate cancer for many years. In our population based study we compared the use and outcomes of perineal radical prostatectomy vs retropubic and minimally invasive radical prostatectomy.

MATERIALS AND METHODS: We identified men diagnosed with prostate cancer from 2003 to 2005 who underwent perineal (452), minimally invasive (1,938) and retropubic (6,899) radical prostatectomy using Surveillance, Epidemiology and End Results-Medicare linked data through 2007. We compared postoperative 30-day and anastomotic stricture complications, incontinence and erectile dysfunction, and cancer therapy (hormonal therapy and/or radiotherapy).

RESULTS: Perineal radical prostatectomy comprised 4.9% of radical prostatectomies during our study period and use decreased with time. On propensity score adjusted analysis men who underwent perineal vs retropubic radical prostatectomy had shorter hospitalization (median 2 vs 3 days, p < 0.001), received fewer heterologous transfusions (7.2% vs 20.8%, p < 0.001) and required less additional cancer therapy (4.9% vs 6.9%, p = 0.020). When comparing perineal vs minimally invasive radical prostatectomy men who underwent the former required more heterologous transfusions (7.2% vs 2.7%, p = 0.018) but experienced fewer miscellaneous medical complications (5.3% vs 10.0%, p = 0.045) and erectile dysfunction procedures (1.4 vs 2.3/100 person-years, p = 0.008). The mean and median expenditure for perineal radical prostatectomy in the first 6 months postoperatively was $1,500 less than for retropubic or minimally invasive radical prostatectomy (p < 0.001).

CONCLUSIONS: Men who undergo perineal vs retropubic and minimally invasive radical prostatectomy experienced favorable outcomes associated with lower expenditure. Urologists may be abandoning an underused but cost-effective surgical approach that compares favorably with its successors.

PMID: 21074198 [PubMed - indexed for MEDLINE]

    1. Ann Surg. 2010 Nov;252(5):765-73.

Infrapopliteal percutaneous transluminal angioplasty versus bypass surgery as first-line strategies in critical leg ischemia: a propensity score analysis. Söderström MI, Arvela EM, Korhonen M, Halmesmäki KH, Albäck AN, Biancari F, Lepäntalo MJ, Venermo MA. Department of Vascular Surgery, Helsinki University Central Hospital, Helsinki, Finland.

INTRODUCTION: Recently, endovascular revascularization (percutaneous transluminal angioplasty [PTA]) has challenged surgery as a method for the salvage of critically ischemic legs (CLI). Comparison of surgical and endovascular techniques in randomized controlled trials is difficult because of differences in patient characteristics. To overcome this problem, we adjusted the differences by using propensity score analysis.

MATERIALS AND METHODS: The study cohort comprised 1023 patients treated for CLI with 262 endovascular and 761 surgical revascularization procedures to their crural or pedal arteries. A propensity score was used for adjustment in multivariable analysis, for stratification, and for one-to-one matching.

RESULTS: In the overall series, PTA and bypass surgery achieved similar 5-year leg salvage (75.3% vs 76.0%), survival (47.5% vs 43.3%), and amputation-free survival (37.7% vs 37.3%) rates and similar freedom from any further revascularization (77.3% vs 74.4%), whereas freedom from surgical revascularization was higher after bypass surgery (94.3% vs 86.2%, P < 0.001). In propensity-score-matched pairs, outcomes did not differ, except for freedom from surgical revascularization, which was significantly higher in the bypass surgery group (91.4% vs 85.3% at 5 years, P = 0.045). In a subgroup of patients who underwent isolated infrapopliteal revascularization, PTA was associated with better leg salvage (75.5% vs 68.0%, P = 0.042) and somewhat lower freedom from surgical revascularization (78.8% vs 85.2%, P = 0.17). This significant difference in the leg salvage rate was also observed after adjustment for propensity score (P = 0.044), but not in propensity-score-matched pairs (P = 0.12).

CONCLUSIONS: When feasible, infrapopliteal PTA as a first-line strategy is expected to achieve similar long-term results to bypass surgery in CLI when redo surgery is actively utilized.

PMID: 21037432 [PubMed - indexed for MEDLINE]
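
Several of the surgical comparisons above rest on propensity-score matching or adjustment. As a concrete illustration of the general technique (not a reconstruction of any particular study's analysis), the short Python sketch below estimates a propensity score on synthetic data and performs greedy 1:1 nearest-neighbor matching within a caliper; all variable names, covariates, and parameters are illustrative assumptions.

# Illustrative sketch (not from any cited paper): 1:1 nearest-neighbor
# propensity-score matching on the logit of the PS, within a caliper.
# The synthetic data and variable names are assumptions for illustration.
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 2000
df = pd.DataFrame({
    "age": rng.normal(68, 10, n),
    "diabetes": rng.binomial(1, 0.3, n),
    "renal": rng.binomial(1, 0.15, n),
})
# Treatment assignment depends on covariates (confounding by indication).
lin = -2 + 0.03 * (df["age"] - 68) + 0.8 * df["diabetes"] + 0.5 * df["renal"]
df["treated"] = rng.binomial(1, 1 / (1 + np.exp(-lin)))

# Step 1: estimate the propensity score.
ps_model = LogisticRegression(max_iter=1000).fit(df[["age", "diabetes", "renal"]], df["treated"])
df["ps"] = ps_model.predict_proba(df[["age", "diabetes", "renal"]])[:, 1]
df["logit_ps"] = np.log(df["ps"] / (1 - df["ps"]))

# Step 2: greedy 1:1 matching without replacement, caliper = 0.2 SD of logit PS.
caliper = 0.2 * df["logit_ps"].std()
treated = df[df["treated"] == 1].sort_values("logit_ps")
controls = df[df["treated"] == 0].copy()
pairs = []
for idx, row in treated.iterrows():
    dist = (controls["logit_ps"] - row["logit_ps"]).abs()
    if len(dist) and dist.min() <= caliper:
        j = dist.idxmin()
        pairs.append((idx, j))
        controls = controls.drop(j)   # each control used at most once

print(f"matched {len(pairs)} of {len(treated)} treated patients")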

 

August 2011


CER Scan [Epub ahead of print]

    1. Pharmacoepidemiol Drug Saf. 2011 Jul 29. doi: 10.1002/pds.2188. [Epub ahead of print]

Measuring balance and model selection in propensity score methods. Belitser SV, Martens EP, Pestman WR, Groenwold RH, de Boer A, Klungel OH. Department of Pharmacoepidemiology and Pharmacotherapy, Utrecht Institute of Pharmaceutical Sciences, Utrecht University, Utrecht, Netherlands.

PURPOSE: Propensity score (PS) methods focus on balancing confounders between groups to estimate an unbiased treatment or exposure effect. However, there is a lack of attention to actually measuring, reporting and using the information on balance, for instance for model selection. We propose to use a measure for balance in PS methods and describe several such measures: the overlapping coefficient, the Kolmogorov-Smirnov distance, and the Lévy distance.

METHODS: We performed simulation studies to estimate the association between these three and several mean based measures for balance and bias (i.e., discrepancy between the true and the estimated treatment effect).

RESULTS: For large sample sizes (n=2000) the average Pearson’s correlation coefficients between bias and Kolmogorov-Smirnov distance (r=0.89), the Lévy distance (r=0.89) and the absolute standardized mean difference (r=0.90) were similar, whereas this was lower for the overlapping coefficient (r=-0.42). When sample size decreased to 400, mean based measures of balance had stronger correlations with bias. Models including all confounding variables, their squares and interaction terms resulted in smaller bias than models that included only main terms for confounding variables.

CONCLUSIONS: We conclude that measures for balance are useful for reporting the amount of balance reached in propensity score analysis and can be helpful in selecting the final PS model. Copyright © 2011 John Wiley & Sons, Ltd.

PMID: 21805529 [PubMed - as supplied by publisher]
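
The balance measures discussed by Belitser and colleagues can be computed directly from the estimated propensity scores in the two exposure groups. The Python sketch below, on synthetic propensity-score distributions, illustrates three of them: the Kolmogorov-Smirnov distance, the absolute standardized mean difference, and a histogram-based approximation of the overlapping coefficient; the data and bin choices are assumptions made for this illustration.

# Illustrative computation of balance measures on synthetic propensity-score
# distributions for treated vs. untreated groups.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
ps_treated = rng.beta(3, 2, 500)     # synthetic PS among treated
ps_control = rng.beta(2, 3, 1500)    # synthetic PS among untreated

# Kolmogorov-Smirnov distance: maximum difference between empirical CDFs.
ks = stats.ks_2samp(ps_treated, ps_control).statistic

# Absolute standardized mean difference (pooled-SD version).
pooled_sd = np.sqrt((ps_treated.var(ddof=1) + ps_control.var(ddof=1)) / 2)
smd = abs(ps_treated.mean() - ps_control.mean()) / pooled_sd

# Overlapping coefficient: shared area under the two densities,
# approximated here on a common histogram grid.
bins = np.linspace(0, 1, 51)
d1, _ = np.histogram(ps_treated, bins=bins, density=True)
d2, _ = np.histogram(ps_control, bins=bins, density=True)
ovl = np.sum(np.minimum(d1, d2) * np.diff(bins))

print(f"KS distance={ks:.3f}, |SMD|={smd:.3f}, overlap coefficient={ovl:.3f}")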

    1. Stat Med. 2011 Jul 29. doi: 10.1002/sim.4324. [Epub ahead of print]

Analyzing direct and indirect effects of treatment using dynamic path analysis applied to data from the Swiss HIV Cohort Study. Røysland K, Gran JM, Ledergerber B, Wyl V, Young J, Aalen OO. Department of Biostatistics, Institute of Basic Medical Sciences, University of Oslo, Norway. kjetil.roysland@medisin.uio.no.

When applying survival analysis, such as Cox regression, to data from major clinical trials or other studies, often only baseline covariates are used. This is typically the case even if updated covariates are available throughout the observation period, which leaves large amounts of information unused. The main reason for this is that such time-dependent covariates often are internal to the disease process, as they are influenced by treatment, and therefore lead to confounded estimates of the treatment effect. There are, however, methods to exploit such covariate information in a useful way. We study the method of dynamic path analysis applied to data from the Swiss HIV Cohort Study. To adjust for time-dependent confounding between treatment and the outcome ‘AIDS or death’, we carried out the analysis on a sequence of mimicked randomized trials constructed from the original cohort data. To analyze these trials together, regular dynamic path analysis is extended to a composite analysis of weighted dynamic path models. Results using a simple path model, with one indirect effect mediated through current HIV-1 RNA level, show that most or all of the total effect goes through HIV-1 RNA for the first 4 years. A similar model, but with CD4 level as mediating variable, shows a weaker indirect effect, but the results are in the same direction. There are many reasons to be cautious when drawing conclusions from estimates of direct and indirect effects. Dynamic path analysis is however a useful tool to explore underlying processes, which are ignored in regular analyses. Copyright © 2011 John Wiley & Sons, Ltd.

PMID: 21800346 [PubMed - as supplied by publisher]

    1. Arch Intern Med. 2011 Jul 25. [Epub ahead of print]

Predicting Death: An Empirical Evaluation of Predictive Tools for Mortality. Siontis GC, Tzoulaki I, Ioannidis JP. University of Ioannina School of Medicine, Ioannina, Greece (Drs Siontis, Tzoulaki, and Ioannidis); Department of Epidemiology and Biostatistics, Imperial College of Medicine, London, England (Drs Tzoulaki and Ioannidis); the Institute for Clinical Research and Health Policy Studies, Department of Medicine, Tufts University School of Medicine, Boston, Massachusetts (Dr Ioannidis); the Department of Epidemiology, Harvard School of Public Health, Boston (Dr Ioannidis); and the Stanford Prevention Research Center, Stanford University School of Medicine, Stanford, California (Dr Ioannidis).

BACKGROUND: The ability to predict death is crucial in medicine, and many relevant prognostic tools have been developed for application in diverse settings. We aimed to evaluate the discriminating performance of predictive tools for death and the variability in this performance across different clinical conditions and studies.

METHODS: We used Medline to identify studies published in 2009 that assessed the accuracy (based on the area under the receiver operating characteristic curve [AUC]) of validated tools for predicting all-cause mortality. For tools where accuracy was reported in 4 or more assessments, we calculated summary accuracy measures. Characteristics of studies of the predictive tools were evaluated to determine if they were associated with the reported accuracy of the tool.

RESULTS: A total of 94 eligible studies provided data on 240 assessments of 118 predictive tools. The AUC ranged from 0.43 to 0.98 (median [interquartile range], 0.77 [0.71-0.83]), with only 23 of the assessments reporting excellent discrimination (10%) (AUC, >0.90). For 10 tools, accuracy was reported in 4 or more assessments; only 1 tool had a summary AUC exceeding 0.80. Established tools showed large heterogeneity in their performance across different cohorts (I(2) range, 68%-95%). Reported AUC was higher for tools published in journals with lower impact factor (P = .01), with larger sample size (P = .01), and for those that aimed to predict mortality among the highest-risk patients (P = .002) and among children (P < .001).

CONCLUSIONS: Most tools designed to predict mortality have only modest accuracy, and there is large variability across various diseases and populations. Most proposed tools do not have documented clinical utility.

PMID: 21788535 [PubMed - as supplied by publisher]

    1. Am J Epidemiol. 2011 Jul 16. [Epub ahead of print]

Reducing the Variance of the Prescribing Preference-based Instrumental Variable Estimates of the Treatment Effect. Abrahamowicz M, Beauchamp ME, Ionescu-Ittu R, Delaney JA, Pilote L.

Instrumental variable (IV) methods based on the physician’s prescribing preference may remove bias due to unobserved confounding in pharmacoepidemiologic studies. However, IV estimates, originally defined as the treatment prescribed for a single previous patient of a given physician, show important variance inflation. The authors proposed and validated in simulations a new method to reduce the variance of IV estimates even when physicians’ preferences change over time. First, a potential “change-time,” after which the physician’s preference has changed, was estimated for each physician. Next, all patients of a given physician were divided into 2 homogeneous subsets: those treated before the change-time versus those treated after the change-time. The new IV was defined as the proportion of all previous patients in a corresponding homogeneous subset who were prescribed a specific drug. In simulations, all alternative IV estimators avoided strong bias of the conventional estimates. The change-time method reduced the standard deviation of the estimates by approximately 30% relative to the original previous patient-based IV. In an empirical example, the proposed IV correlated better with the actual treatment and yielded smaller standard errors than alternative IV estimators. Therefore, the new method improved the overall accuracy of IV estimates in studies with unobserved confounding and time-varying prescribing preferences.

PMID: 21765169 [PubMed - as supplied by publisher]
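
To make the instrument construction concrete, the sketch below computes a simple prescribing-preference instrumental variable, defined for each patient as the proportion of the physician's previous patients who received the drug of interest. The change-time refinement proposed in the article is omitted, and the data and column names are illustrative assumptions.

# Simplified sketch of a prescribing-preference instrument: for each patient,
# the instrument is the proportion of the physician's strictly earlier
# patients who received drug A (undefined for a physician's first patient).
import numpy as np
import pandas as pd

rx = pd.DataFrame({
    "physician": ["p1", "p1", "p1", "p1", "p2", "p2", "p2"],
    "visit_date": pd.to_datetime(
        ["2010-01-05", "2010-02-01", "2010-03-10", "2010-04-02",
         "2010-01-20", "2010-02-15", "2010-03-01"]),
    "drug_a": [1, 1, 0, 1, 0, 0, 1],
})
rx = rx.sort_values(["physician", "visit_date"]).reset_index(drop=True)

# Count of earlier patients per physician and how many of them got drug A.
rx["prior_n"] = rx.groupby("physician").cumcount()
rx["prior_a"] = rx.groupby("physician")["drug_a"].cumsum() - rx["drug_a"]
rx["iv_preference"] = rx["prior_a"] / rx["prior_n"].replace(0, np.nan)
print(rx)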

    1. Am J Epidemiol. 2011 Jul 12. [Epub ahead of print]

Performance of Disease Risk Scores, Propensity Scores, and Traditional Multivariable Outcome Regression in the Presence of Multiple Confounders. Arbogast PG, Ray WA.

Propensity scores are widely used in cohort studies to improve performance of regression models when considering large numbers of covariates. Another type of summary score, the disease risk score (DRS), which estimates disease probability conditional on nonexposure, has also been suggested. However, little is known about how it compares with propensity scores. Monte Carlo simulations were conducted comparing regression models using the DRS and the propensity score with models that directly adjust for all of the individual covariates. The DRS was calculated in 2 ways: from the unexposed population and from the full cohort. Compared with traditional multivariable outcome regression models, all 3 summary scores had comparable performance for moderate correlation between exposure and covariates and, for strong correlation, the full-cohort DRS and propensity score had comparable performance. When traditional methods had model misspecification, propensity scores and the full-cohort DRS had superior performance. All 4 models were affected by the number of events per covariate, with propensity scores and traditional multivariable outcome regression least affected. These data suggest that, for cohort studies for which covariates are not highly correlated with exposure, the DRS, particularly that calculated from the full cohort, is a useful tool.

PMID: 21749976 [PubMed - as supplied by publisher]
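
A disease risk score analysis of the kind compared in this article can be sketched as follows: fit the outcome model among the unexposed, predict the score for the full cohort, and then adjust the exposure-outcome model for the score alone. The Python example below uses synthetic data and illustrative variable names; it is a sketch, not the authors' simulation code.

# Sketch of a disease risk score (DRS) analysis on synthetic data.
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(2)
n = 5000
x1 = rng.normal(size=n)
x2 = rng.binomial(1, 0.4, n)
exposure = rng.binomial(1, 1 / (1 + np.exp(-(0.5 * x1 + 0.7 * x2))))
logit_y = -2 + 0.8 * x1 + 0.6 * x2 + 0.4 * exposure   # true log-OR = 0.4
outcome = rng.binomial(1, 1 / (1 + np.exp(-logit_y)))
df = pd.DataFrame({"x1": x1, "x2": x2, "exposure": exposure, "outcome": outcome})

# DRS fitted among the unexposed (the article also considers the full cohort).
unexposed = df[df["exposure"] == 0]
X_u = sm.add_constant(unexposed[["x1", "x2"]])
drs_fit = sm.Logit(unexposed["outcome"], X_u).fit(disp=False)
df["drs"] = drs_fit.predict(sm.add_constant(df[["x1", "x2"]]))

# Outcome model adjusted for exposure and the DRS only.
X = sm.add_constant(df[["exposure", "drs"]])
out_fit = sm.Logit(df["outcome"], X).fit(disp=False)
print(out_fit.params["exposure"])  # adjusted log-odds ratio for exposure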

    1. Epidemiology. 2011 Jul 8. [Epub ahead of print]

A Comparison of Methods to Estimate the Hazard Ratio Under Conditions of Time-varying Confounding and Nonpositivity. Naimi AI, Cole SR, Westreich DJ, Richardson DB. Department of Epidemiology, Gillings School of Global Public Health, UNC-Chapel Hill, NC and Department of Obstetrics and Gynecology and Duke Global Health Institute, Duke University.

In occupational epidemiologic studies, the healthy worker survivor effect refers to a process that leads to bias in the estimates of an association between cumulative exposure and a health outcome. In these settings, work status acts both as an intermediate and confounding variable and may violate the positivity assumption (the presence of exposed and unexposed observations in all strata of the confounder). Using Monte Carlo simulation, we assessed the degree to which crude, work-status adjusted, and weighted (marginal structural) Cox proportional hazards models are biased in the presence of time-varying confounding and nonpositivity. We simulated the data representing time-varying occupational exposure, work status, and mortality. Bias, coverage, and root mean squared error (MSE) were calculated relative to the true marginal exposure effect in a range of scenarios. For a base-case scenario, using crude, adjusted, and weighted Cox models, respectively, the hazard ratio was biased downward 19%, 9%, and 6%; 95% confidence interval coverage was 48%, 85%, and 91%; and root MSE was 0.20, 0.13, and 0.11. Although marginal structural models were less biased in most scenarios studied, neither standard nor marginal structural Cox proportional hazards models fully resolve the bias encountered under conditions of time-varying confounding and nonpositivity.

PMID: 21747286 [PubMed - as supplied by publisher]
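
As a rough illustration of the weighted (marginal structural) Cox approach, the sketch below estimates stabilized inverse-probability-of-treatment weights and fits a weighted Cox model. It simplifies the article's time-varying setting to a single baseline treatment, uses synthetic data, and assumes the lifelines package for the weighted Cox fit; all names and numbers are illustrative.

# Point-treatment sketch of stabilized IPT weights plus a weighted Cox model.
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from lifelines import CoxPHFitter

rng = np.random.default_rng(3)
n = 4000
confounder = rng.normal(size=n)
exposure = rng.binomial(1, 1 / (1 + np.exp(-confounder)))
hazard = 0.05 * np.exp(0.3 * exposure + 0.5 * confounder)
time = rng.exponential(1 / hazard)
event = (time < 10).astype(int)
time = np.minimum(time, 10)                        # administrative censoring
df = pd.DataFrame({"confounder": confounder, "exposure": exposure,
                   "time": time, "event": event})

# Denominator: P(exposure | confounder); numerator: marginal P(exposure).
denom = LogisticRegression().fit(df[["confounder"]], df["exposure"])
p_denom = denom.predict_proba(df[["confounder"]])[:, 1]
p_num = df["exposure"].mean()
df["sw"] = np.where(df["exposure"] == 1, p_num / p_denom,
                    (1 - p_num) / (1 - p_denom))

cph = CoxPHFitter()
cph.fit(df[["time", "event", "exposure", "sw"]],
        duration_col="time", event_col="event", weights_col="sw", robust=True)
print(cph.params_["exposure"])  # weighted log-hazard ratio for exposure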

CER Scan [published within the last 30 days]

    1. BMC Health Serv Res. 2011 Jul 21;11(1):171. [Epub ahead of print]

Does adding risk-trends to survival models improve in-hospital mortality predictions? A cohort study. Wong J, Taljaard M, Forster AJ, van Walraven C.

BACKGROUND: Clinicians informally assess changes in patients’ status over time to prognosticate their outcomes. The incorporation of trends in patient status into regression models could improve their ability to predict outcomes. In this study, we used a unique approach to measure trends in patient hospital death risk and determined whether the incorporation of these trend measures into a survival model improved the accuracy of its risk predictions.

METHODS: We included all adult inpatient hospitalizations between 1 April 2004 and 31 March 2009 at our institution. We used the daily mortality risk scores from an existing time-dependent survival model to create five trend indicators: absolute and relative percent change in the risk score from the previous day; absolute and relative percent change in the risk score from the start of the trend; and number of days with a trend in the risk score. In the derivation set, we determined which trend indicators were associated with time to death in hospital, independent of the existing covariates. In the validation set, we compared the predictive performance of the existing model with and without the trend indicators.

RESULTS: Three trend indicators were independently associated with time to hospital mortality: the absolute change in the risk score from the previous day; the absolute change in the risk score from the start of the trend; and the number of consecutive days with a trend in the risk score. However, adding these trend indicators to the existing model resulted in only small improvements in model discrimination and calibration.

CONCLUSIONS: We produced several indicators of trend in patient risk that were significantly associated with time to hospital death independent of the model used to create them. In other survival models, our approach of incorporating risk trends could be explored to improve their performance without the collection of additional data.

PMID: 21777460 [PubMed - as supplied by publisher]

Open Access: http://www.biomedcentral.com/content/pdf/1472-6963-11-171.pdf
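
The trend indicators described in this article can be derived from a table of daily risk scores with a few grouped operations. The sketch below, using made-up admissions and illustrative column names, computes the change from the previous day, the change from the start of the current trend, and the number of consecutive days in a trend.

# Sketch of risk-trend indicators derived from a daily mortality-risk score.
import pandas as pd

daily = pd.DataFrame({
    "admission_id": [1] * 5 + [2] * 4,
    "day": [1, 2, 3, 4, 5, 1, 2, 3, 4],
    "risk_score": [0.10, 0.12, 0.15, 0.15, 0.11, 0.30, 0.28, 0.25, 0.26],
})

def trend_indicators(g):
    g = g.sort_values("day").copy()
    g["abs_change_prev_day"] = g["risk_score"].diff()
    direction = g["abs_change_prev_day"].apply(
        lambda d: 0 if pd.isna(d) or d == 0 else (1 if d > 0 else -1))
    # A new trend starts whenever the direction of change flips.
    trend_id = (direction != direction.shift()).cumsum()
    g["days_in_trend"] = g.groupby(trend_id).cumcount() + 1
    trend_start = g.groupby(trend_id)["risk_score"].transform("first")
    g["abs_change_from_trend_start"] = g["risk_score"] - trend_start
    return g

daily = daily.groupby("admission_id", group_keys=False).apply(trend_indicators)
print(daily)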

    1. Stat Med. 2011 Jul 20;30(16):1917-32. doi: 10.1002/sim.4262. Epub 2011 May 3.

Alternative methods for testing treatment effects on the basis of multiple outcomes: Simulation and case study. Yoon FB, Fitzmaurice GM, Lipsitz SR, Horton NJ, Laird NM, Normand SL. Harvard Medical School, Boston, MA, U.S.A. yoon@hcp.med.harvard.edu.

In clinical trials multiple outcomes are often used to assess treatment interventions. This paper presents an evaluation of likelihood-based methods for jointly testing treatment effects in clinical trials with multiple continuous outcomes. Specifically, we compare the power of joint tests of treatment effects obtained from joint models for the multiple outcomes with univariate tests based on modeling the outcomes separately. We also consider the power and bias of tests when data are missing, a common feature of many trials, especially in psychiatry. Our results suggest that joint tests capitalize on the correlation of multiple outcomes and are more powerful than standard univariate methods, especially when outcomes are missing completely at random. When outcomes are missing at random, test procedures based on correctly specified joint models are unbiased, while standard univariate procedures are not. Results of a simulation study are reported, and the methods are illustrated in an example from the Clinical Antipsychotic Trials of Intervention Effectiveness for schizophrenia. Copyright © 2011 John Wiley & Sons, Ltd.

PMCID: PMC3116112 [Available on 2012/7/20]

PMID: 21538986 [PubMed - in process]

    1. Health Serv Res. 2011 Aug;46(4):1259-80. doi: 10.1111/j.1475-6773.2011.01253.x. Epub 2011 Mar 17.

Crowd-out and Exposure Effects of Physical Comorbidities on Mental Health Care Use: Implications for Racial-Ethnic Disparities in Access. Lê Cook B, McGuire TG, Alegría M, Normand SL. Center for Multicultural Mental Health Research, 120 Beacon St., 4th Floor, Somerville, MA 02143; Department of Psychiatry, Harvard Medical School, Boston, MA; Department of Health Care Policy, Harvard Medical School, Boston, MA; Center for Multicultural Mental Health Research, Somerville, MA.

Objectives. In disparities models, researchers adjust for differences in “clinical need,” including indicators of comorbidities. We reconsider this practice, assessing (1) if and how having a comorbidity changes the likelihood of recognition and treatment of mental illness; and (2) differences in mental health care disparities estimates with and without adjustment for comorbidities. Data. Longitudinal data from 2000 to 2007 Medical Expenditure Panel Survey (n=11,083) split into pre and postperiods for white, Latino, and black adults with probable need for mental health care. Study Design. First, we tested a crowd-out effect (comorbidities decrease initiation of mental health care after a primary care provider [PCP] visit) using logistic regression models and an exposure effect (comorbidities cause more PCP visits, increasing initiation of mental health care) using instrumental variable methods. Second, we assessed the impact of adjustment for comorbidities on disparity estimates. Principal Findings. We found no evidence of a crowd-out effect but strong evidence for an exposure effect. Number of postperiod visits positively predicted initiation of mental health care. Adjusting for racial/ethnic differences in comorbidities increased black-white disparities and decreased Latino-white disparities. Conclusions. Positive exposure findings suggest that intensive follow-up programs shown to reduce disparities in chronic-care management may have additional indirect effects on reducing mental health care disparities.

PMCID: PMC3130831 [Available on 2012/8/1]

PMID: 21413984 [PubMed - in process]

Theme: CER Education

    1. Pharmacoepidemiol Drug Saf. 2011 Aug;20(8):797-804. doi: 10.1002/pds.2100. Epub 2011 Jan 10.

Curricular considerations for pharmaceutical comparative effectiveness research. Murray MD. Purdue University College of Pharmacy and Regenstrief Institute, Indianapolis, USA. mmurray@regenstrief.org.

In the U.S., pharmacoepidemiology and related health professions can potentially flourish with the congressional appropriation of $1.1 billion of federal funding for comparative effectiveness research (CER). A direct result of this legislation will be the need for sufficient numbers of trained scientists and decision-makers to address the research and implementation associated with CER. An interdisciplinary expert panel comprised mostly of professionals with pharmaceutical interests was convened to examine the knowledge, skills, and abilities to be considered in the development of a CER curriculum for the health professions focusing predominantly on pharmaceuticals. A limitation of the panel’s composition was that it did not represent the breadth of comparative effectiveness research, which additionally includes devices, services, diagnostics, behavioral treatments, and delivery system changes. This bias affects the generalizability of these findings. Notwithstanding, important components of the curriculum identified by the panel included study design considerations and understanding the strengths and limitations of data sources. Important skills and abilities included methods for adjustment of differences in comparator group characteristics to control confounding and bias, data management skills, and clinical skills and insights into the relevance of comparisons. Most of the knowledge, skills, and abilities identified by the panel were consistent with the training of pharmacoepidemiologists. While comparative effectiveness is broader than the pharmaceutical sciences, pharmacoepidemiologists have much to offer academic and professional CER training programs. As such, pharmacoepidemiologists should have a central role in curricular design and provision of the necessary training for needed comparative effectiveness researchers within the realm of pharmaceutical sciences. Copyright © 2011 John Wiley & Sons, Ltd.

PMID: 21796716 [PubMed - in process]

    1. Pharmacoepidemiol Drug Saf. 2011 Aug;20(8):805-6. doi: 10.1002/pds.2122. Epub 2011 May 25.

The central role of pharmacoepidemiology in comparative effectiveness research education: critical next steps. Selker HP. Tufts Clinical and Translational Science Institute, Tufts University, Boston, MA, USA; Institute for Clinical Research and Health Policy Studies, Tufts Medical Center, Boston, MA, USA. hselker@tuftsmedicalcenter.org.

PMID: 21618339 [PubMed - in process]

    1. Pharmacoepidemiol Drug Saf. 2011 Aug;20(8):807-9. doi: 10.1002/pds.2173. Epub 2011 Jun 17.

Starting the conversation. Lawrence W. Center for Outcomes and Evidence, Agency for Healthcare Research and Quality, 540 Gaither Rd., Rockville, MD, 20850, USA. William.lawrence@ahrq.hhs.gov.

PMID: 21681851 [PubMed - in process]

 

July 2011


CER Scan [Published within the past 30 days]

    1. Pharmacoepidemiol Drug Saf. 2011 Jun 30. doi: 10.1002/pds.2152. [Epub ahead of print]

Confounding adjustment via a semi-automated high-dimensional propensity score algorithm: an application to electronic medical records. Toh S, García Rodríguez LA, Hernán MA. Department of Population Medicine, Harvard Medical School/Harvard Pilgrim Health Care Institute, Boston, MA, USA. darrentoh@post.harvard.edu.

PURPOSE: A semi-automated high-dimensional propensity score (hd-PS) algorithm has been proposed to adjust for confounding in claims databases. The feasibility of using this algorithm in other types of healthcare databases is unknown.

METHODS: We estimated the comparative safety of traditional non-steroidal anti-inflammatory drugs (NSAIDs) and selective COX-2 inhibitors regarding the risk of upper gastrointestinal bleeding (UGIB) in The Health Improvement Network, an electronic medical record (EMR) database in the UK. We compared the adjusted effect estimates when the confounders were identified using expert knowledge or the semi-automated hd-PS algorithm.

RESULTS: Compared with the 411,616 traditional NSAID initiators, the crude odds ratio (OR) of UGIB was 1.50 (95%CI: 0.98, 2.28) for the 43,569 selective COX-2 inhibitor initiators. The OR dropped to 0.81 (0.52, 1.27) upon adjustment for known risk factors for UGIB that are typically available in both claims and EMR databases. The OR remained similar when further adjusting for covariates (smoking, alcohol consumption, and body mass index) that are not typically recorded in claims databases (OR 0.81; 0.51, 1.26) or adding 500 empirically identified covariates using the hd-PS algorithm (OR 0.78; 0.49, 1.22). Adjusting for age and sex plus 500 empirically identified covariates produced an OR of 0.87 (0.56, 1.34).

CONCLUSIONS: The hd-PS algorithm can be implemented in pharmacoepidemiologic studies that use primary care EMR databases such as The Health Improvement Network. For the NSAID-UGIB association for which major confounders are well known, further adjustment for covariates selected by the algorithm had little impact on the effect estimate. Copyright © 2011 John Wiley & Sons, Ltd.

PMID: 21717528 [PubMed - as supplied by publisher]
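
A greatly simplified version of the covariate-prioritization step of a high-dimensional propensity score algorithm is sketched below: candidate binary codes are ranked by a Bross-type bias multiplier (based on their prevalence among exposed and unexposed and their crude association with the outcome), and the top-ranked codes are entered into the propensity-score model. This is an illustration on synthetic data, not the published hd-PS implementation.

# Simplified hd-PS-style covariate prioritization on synthetic data.
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(4)
n, n_codes = 5000, 200
codes = pd.DataFrame(rng.binomial(1, 0.1, size=(n, n_codes)),
                     columns=[f"code_{i}" for i in range(n_codes)])
exposure = rng.binomial(1, 0.3, n)
outcome = rng.binomial(1, 0.05 + 0.02 * exposure + 0.03 * codes["code_0"])

def bross_multiplier(col):
    p1 = col[exposure == 1].mean() + 1e-6     # prevalence among exposed
    p0 = col[exposure == 0].mean() + 1e-6     # prevalence among unexposed
    rr = (outcome[col == 1].mean() + 1e-6) / (outcome[col == 0].mean() + 1e-6)
    return abs(np.log((p1 * (rr - 1) + 1) / (p0 * (rr - 1) + 1)))

ranking = codes.apply(bross_multiplier).sort_values(ascending=False)
selected = ranking.index[:50]                  # e.g., keep the top 50 codes

ps_model = LogisticRegression(max_iter=2000).fit(codes[selected], exposure)
ps = ps_model.predict_proba(codes[selected])[:, 1]
print(ranking.head())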

CER Scan [published within the last 2 months]

    1. BMC Med Res Methodol. 2011 May 23;11:77.

Logistic random effects regression models: a comparison of statistical packages for binary and ordinal outcomes. Li B, Lingsma HF, Steyerberg EW, Lesaffre E. Department of Biostatistics, Erasmus MC, Dr. Molewaterplein 50, Rotterdam, the Netherlands. e.lesaffre@erasmusmc.nl.

BACKGROUND: Logistic random effects models are a popular tool to analyze multilevel (also called hierarchical) data with a binary or ordinal outcome. Here, we aim to compare different statistical software implementations of these models.

METHODS: We used individual patient data from 8509 patients in 231 centers with moderate and severe Traumatic Brain Injury (TBI) enrolled in eight Randomized Controlled Trials (RCTs) and three observational studies. We fitted logistic random effects regression models with the 5-point Glasgow Outcome Scale (GOS) as outcome, both dichotomized as well as ordinal, with center and/or trial as random effects, and as covariates age, motor score, pupil reactivity or trial. We then compared the implementations of frequentist and Bayesian methods to estimate the fixed and random effects. Frequentist approaches included R (lme4), Stata (GLLAMM), SAS (GLIMMIX and NLMIXED), MLwiN ([R]IGLS) and MIXOR; Bayesian approaches included WinBUGS, MLwiN (MCMC), the R package MCMCglmm and the SAS experimental procedure MCMC. Three data sets (the full data set and two sub-datasets) were analysed using basically two logistic random effects models with either one random effect for the center or two random effects for center and trial. For the ordinal outcome in the full data set, a proportional odds model with a random center effect was also fitted.

RESULTS: The packages gave similar parameter estimates for both the fixed and random effects and for the binary (and ordinal) models for the main study and when based on a relatively large number of level-1 (patient level) data compared to the number of level-2 (hospital level) data. However, when based on a relatively sparse data set, i.e., when the numbers of level-1 and level-2 data units were about the same, the frequentist and Bayesian approaches showed somewhat different results. The software implementations differ considerably in flexibility, computation time, and usability. There are also differences in the availability of additional tools for model evaluation, such as diagnostic plots. The experimental SAS (version 9.2) procedure MCMC appeared to be inefficient.

CONCLUSIONS: On relatively large data sets, the different software implementations of logistic random effects regression models produced similar results. Thus, for a large data set there seems to be no explicit preference (of course if there is no preference from a philosophical point of view) for either a frequentist or Bayesian approach (if based on vague priors). The choice for a particular implementation may largely depend on the desired flexibility, and the usability of the package. For small data sets the random effects variances are difficult to estimate. In the frequentist approaches the MLE of this variance was often estimated as zero with a standard error that is either zero or could not be determined, while for Bayesian methods the estimates could depend on the chosen "non-informative" prior of the variance parameter. The starting value for the variance parameter may also be critical for the convergence of the Markov chain.

PMCID: PMC3112198 PMID: 21605357 [PubMed - in process]

Free article: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3112198/?tool=pubmed

We recommend reviewing Supplemental Material: Additional File #2

    1. Stat Med. 2011 Jul 10;30(15):1837-51. doi: 10.1002/sim.4240. Epub 2011 Apr 15.

Semiparametric regression models for detecting effect modification in matched case-crossover studies. Kim I, Cheong HK, Kim H. Department of Statistics, Virginia Polytechnic Institute and State University, Blacksburg, VA, U.S.A.

In matched case-crossover studies, it is generally accepted that covariates on which a case and associated controls are matched cannot exert a confounding effect on independent predictors included in the conditional logistic regression model because any stratum effect is removed by the conditioning on the fixed number of sets of a case and controls in the stratum. Hence, the conditional logistic regression model is not able to detect any effects associated with the matching covariates by stratum. In addition, the matching covariates may be effect modification and the methods for assessing and characterizing effect modification by matching covariates are quite limited. In this article, we propose a unified approach in its ability to detect both parametric and nonparametric relationships between the predictor and the relative risk of disease or binary outcome, as well as potential effect modifications by matching covariates. Two methods are developed using two semiparametric models: (1) the regression spline varying coefficients model and (2) the regression spline interaction model. Simulation results show that the two approaches are comparable. These methods can be used in any matched case-control study and extend to multilevel effect modification studies. We demonstrate the advantage of our approach using an epidemiological example of a 1-4 bi-directional case-crossover study of childhood aseptic meningitis associated with drinking water turbidity. Copyright © 2011 John Wiley & Sons, Ltd.

PMID: 21495061 [PubMed - in process]

Theme: Data Linkage

    1. J Clin Epidemiol. 2011 May;64(5):565-72. Epub 2010 Oct 16.

Results from simulated data sets: probabilistic record linkage outperforms deterministic record linkage. Tromp M, Ravelli AC, Bonsel GJ, Hasman A, Reitsma JB. Department of Medical Informatics, Academic Medical Center, University of Amsterdam, 1100 DE Amsterdam, The Netherlands. m.tromp@amc.uva.nl

OBJECTIVE: To gain insight into the performance of deterministic record linkage (DRL) vs. probabilistic record linkage (PRL) strategies under different conditions by varying the frequency of registration errors and the amount of discriminating power.

STUDY DESIGN AND SETTING: A simulation study in which data characteristics were varied to create a range of realistic linkage scenarios. For each scenario, we compared the number of misclassifications (number of false nonlinks and false links) made by the different linking strategies: deterministic full, deterministic N-1, and probabilistic.

RESULTS: The full deterministic strategy produced the lowest number of false positive links but at the expense of missing considerable numbers of matches dependent on the error rate of the linking variables. The probabilistic strategy outperformed the deterministic strategy (full or N-1) across all scenarios. A deterministic strategy can match the performance of a probabilistic approach providing that the decision about which disagreements should be tolerated is made correctly. This requires a priori knowledge about the quality of all linking variables, whereas this information is inherently generated by a probabilistic strategy.

CONCLUSION: PRL is more flexible and provides data about the quality of the linkage process that in turn can minimize the degree of linking errors, given the data provided.

PMID: 20952162 [PubMed - indexed for MEDLINE]
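
The core scoring step of probabilistic record linkage can be illustrated in a few lines: each linking variable contributes log2(m/u) when two records agree and log2((1-m)/(1-u)) when they disagree, and the summed weight is compared with a threshold. In the sketch below the m and u probabilities are simply assumed; in practice they are estimated from the data (for example by EM), and the field names and values are illustrative.

# Minimal sketch of probabilistic record-linkage scoring.
import math

# (m, u) = P(agree | true match), P(agree | true non-match)
field_probs = {"surname": (0.95, 0.01),
               "birth_year": (0.98, 0.05),
               "postcode": (0.90, 0.02)}

def link_weight(record_a, record_b):
    total = 0.0
    for field, (m, u) in field_probs.items():
        if record_a[field] == record_b[field]:
            total += math.log2(m / u)        # agreement weight
        else:
            total += math.log2((1 - m) / (1 - u))  # disagreement weight
    return total

a = {"surname": "smith", "birth_year": 1956, "postcode": "1011"}
b = {"surname": "smith", "birth_year": 1956, "postcode": "1012"}
w = link_weight(a, b)
print(f"total weight = {w:.2f}")   # classify as a link if above a chosen threshold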

    1. Am J Epidemiol. 2011 May 1;173(9):1059-68. Epub 2011 Mar 23.

Use of a medical records linkage system to enumerate a dynamic population over time: the Rochester epidemiology project. St Sauver JL, Grossardt BR, Yawn BP, Melton LJ 3rd, Rocca WA. Division of Epidemiology, Department of Health Sciences Research, College of Medicine, Mayo Clinic, Rochester, Minnesota 55905, USA.

The Rochester Epidemiology Project (REP) is a unique research infrastructure in which the medical records of virtually all persons residing in Olmsted County, Minnesota, for over 40 years have been linked and archived. In the present article, the authors describe how the REP links medical records from multiple health care institutions to specific individuals and how residency is confirmed over time. Additionally, the authors provide evidence for the validity of the REP Census enumeration. Between 1966 and 2008, 1,145,856 medical records were linked to 486,564 individuals in the REP. The REP Census was found to be valid when compared with a list of residents obtained from random digit dialing, a list of residents of nursing homes and senior citizen complexes, a commercial list of residents, and a manual review of records. In addition, the REP Census counts were comparable to those of 4 decennial US censuses (e.g., it included 104.1% of 1970 and 102.7% of 2000 census counts). The duration for which each person was captured in the system varied greatly by age and calendar year; however, the duration was typically substantial. Comprehensive medical records linkage systems like the REP can be used to maintain a continuously updated census and to provide an optimal sampling framework for epidemiologic studies.

PMCID: PMC3105274 [Available on 2012/5/1] PMID: 21430193 [PubMed - indexed for MEDLINE]

    1. Stat Methods Med Res. 2011 Jun 10. [Epub ahead of print]

Linkage of patient records from disparate sources. Li X, Shen C. Division of Biostatistics, Indiana University School of Medicine, Indianapolis, US.

We review ideas, approaches and progress in the field of record linkage. We point out that the latent class models used in probabilistic matching have been well developed and applied in a different context of diagnostic testing when the true disease status is unknown. The methodology developed in the diagnostic testing setting can be potentially translated and applied in record linkage. Although there are many methods for record linkage, a comprehensive evaluation of methods for a wide range of real-world data with different data characteristics and with true match status is absent due to lack of data sharing. However, the recent availability of generators of synthetic data with realistic characteristics renders such evaluations feasible.

PMID: 21665896 [PubMed - as supplied by publisher]

 

June 2011


CER Scan [Published within the past 30 days]

    1. Clin Trials. 2011 May 24. [Epub ahead of print]

Confounding due to changing background risk in adaptively randomized trials. Lipsky AM, Greenland S. Gertner Institute for Epidemiology and Health Policy Research, Chaim Sheba Medical Center, Tel Hashomer, Israel.

BACKGROUND: While adaptive trials tend to improve efficiency, they are also subject to some unique biases. PURPOSE: We address a bias that arises from adaptive randomization in the setting of a time trend in disease incidence.

METHODS: We use a potential-outcome model and directed acyclic graphs to illustrate the bias that arises from a changing subject allocation ratio with a concurrent change in background risk.

RESULTS: In a trial that uses adaptive randomization, time trends in risk can bias the crude effect estimate obtained by naively combining the data from the different stages of the trial. We illustrate how the bias arises from an interplay of departures from exchangeability among groups and the changing randomization proportions.

LIMITATIONS: We focus on risk-ratio and risk-difference analysis.

CONCLUSIONS: Analysis of trials using adaptive randomization should involve attention to or adjustment for possible trends in background risk. Numerous modeling strategies are available for that purpose, including stratification, trend modeling, inverse-probability-of-treatment weighting, and hierarchical regression.

PMID: 21610005 [PubMed - as supplied by publisher]
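
The bias described by Lipsky and Greenland is easy to reproduce in a toy simulation: if the allocation ratio changes between trial stages while the background risk also changes, the crude pooled estimate is biased even under a true null, whereas a stage-stratified estimate is not. The sketch below uses arbitrary illustrative numbers.

# Toy simulation: adaptive allocation plus a time trend in background risk.
import numpy as np

rng = np.random.default_rng(5)

def simulate_stage(n, p_treat, base_risk):
    treat = rng.binomial(1, p_treat, n)
    events = rng.binomial(1, base_risk, n)     # true null: risk ignores treatment
    return treat, events

# Stage 1: 1:1 allocation, 20% background risk; stage 2: 3:1 allocation, 10% risk.
t1, y1 = simulate_stage(20000, 0.50, 0.20)
t2, y2 = simulate_stage(20000, 0.75, 0.10)

t = np.concatenate([t1, t2])
y = np.concatenate([y1, y2])
crude_rd = y[t == 1].mean() - y[t == 0].mean()

stage_rds = [y1[t1 == 1].mean() - y1[t1 == 0].mean(),
             y2[t2 == 1].mean() - y2[t2 == 0].mean()]
stratified_rd = np.mean(stage_rds)             # simple equal-weight average

print(f"crude pooled RD = {crude_rd:.3f}, stage-stratified RD = {stratified_rd:.3f}")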

    1. Pharmacoepidemiol Drug Saf. 2011 May 30. doi: 10.1002/pds.2121. [Epub ahead of print]

Simultaneously assessing intended and unintended treatment effects of multiple treatment options: a pragmatic “matrix design.” Rassen JA, Solomon DH, Glynn RJ, Schneeweiss S. Division of Pharmacoepidemiology and Pharmacoeconomics, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA. jrassen@post.harvard.edu.

PURPOSE: A key aspect of comparative effectiveness research is the assessment of competing treatment options and multiple outcomes rather than a single treatment option and a single benefit or harm. In this commentary, we describe a methodological framework that supports the simultaneous examination of a “matrix” of treatments and outcomes in non-randomized data.

METHODS: We outline the methodological challenges to a matrix-type study (matrix design). We consider propensity score matching with multiple treatment groups, statistical analysis, and choice of association measure when evaluating multiple outcomes. We also discuss multiple testing, use of high-dimensional propensity scores for covariate balancing in light of multiple outcomes, and suitability of available software.

CONCLUSION: The matrix design study methods facilitate examination of the comparative benefits and harms of competing treatment choices, and also provide the input required for calculating the numbers needed to treat and for a broader benefit/harm assessment that weighs endpoints of varying severity. Copyright © 2011 John Wiley & Sons, Ltd.

PMID: 21626604 [PubMed - as supplied by publisher]

    1. Stat Med. 2011 May 20;30(11):1199-217. doi: 10.1002/sim.4156. Epub 2010 Dec 29.

Exploring the benefits of adaptive sequential designs in time-to-event endpoint settings. Emerson SC, Rudser KD, Emerson SS. Department of Statistics, Oregon State University, U.S.A. scemerson@gmail.com.

Sequential analysis is frequently employed to address ethical and financial issues in clinical trials. Sequential analysis may be performed using standard group sequential designs, or, more recently, with adaptive designs that use estimates of treatment effect to modify the maximal statistical information to be collected. In the general setting in which statistical information and clinical trial costs are functions of the number of subjects used, it has yet to be established whether there is any major efficiency advantage to adaptive designs over traditional group sequential designs. In survival analysis, however, statistical information (and hence efficiency) is most closely related to the observed number of events, while trial costs still depend on the number of patients accrued. As the number of subjects may dominate the cost of a trial, an adaptive design that specifies a reduced maximal possible sample size when an extreme treatment effect has been observed may allow early termination of accrual and therefore a more cost-efficient trial. We investigate and compare the tradeoffs between efficiency (as measured by average number of observed events required), power, and cost (a function of the number of subjects accrued and length of observation) for standard group sequential methods and an adaptive design that allows for early termination of accrual. We find that when certain trial design parameters are constrained, an adaptive approach to terminating subject accrual may improve upon the cost efficiency of a group sequential clinical trial investigating time-to-event endpoints. However, when the spectrum of group sequential designs considered is broadened, the advantage of the adaptive designs is less clear. Copyright © 2010 John Wiley & Sons, Ltd.

PMID: 21538450 [PubMed - in process]

    1. Stat Methods Med Res. 2011 Jun;20(3):191-215. Epub 2008 Nov 26.

Estimating dose-response effects in psychological treatment trials: the role of instrumental variables. Maracy M, Dunn G. Biostatistics, Health Methodology Research Group, School of Community Based Medicine, University of Manchester, UK.

We present a relatively non-technical and practically orientated review of statistical methods that can be used to estimate dose-response relationships in randomised controlled psychotherapy trials in which participants fail to attend all of the planned sessions of therapy. Here we are investigating the effects on treatment outcome of the number of sessions attended when the latter is possibly subject to hidden selection effects (hidden confounding). The aim is to estimate the parameters of a structural mean model (SMM) using randomisation, and possibly randomisation by covariate interactions, as instrumental variables. We describe, compare and illustrate the equivalence of the use of a simple G-estimation algorithm and two two-stage least squares procedures that are traditionally used in economics.

PMID: 19036909 [PubMed - in process]
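
The two-stage least squares estimator discussed in this review can be sketched on synthetic data as follows, with randomization as the instrument for the number of sessions attended. The point estimate from the naive second stage is shown for illustration only; in practice the standard errors should come from a proper IV routine, and all names and numbers here are assumptions.

# Sketch of 2SLS for a dose-response (sessions attended) effect,
# with randomization as the instrument.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(6)
n = 3000
randomized = rng.binomial(1, 0.5, n)              # offer of therapy
hidden = rng.normal(size=n)                       # hidden selection (confounder)
# Sessions attended depend on randomization and hidden factors (0 if control).
sessions = np.clip(np.round(randomized * (4 + 2 * hidden + rng.normal(size=n))), 0, 12)
outcome = 0.5 * sessions + 1.5 * hidden + rng.normal(size=n)   # true effect = 0.5/session

# Stage 1: predict sessions from the instrument.
stage1 = sm.OLS(sessions, sm.add_constant(randomized)).fit()
sessions_hat = stage1.predict(sm.add_constant(randomized))

# Stage 2: regress the outcome on predicted sessions.
stage2 = sm.OLS(outcome, sm.add_constant(sessions_hat)).fit()
naive_ols = sm.OLS(outcome, sm.add_constant(sessions)).fit()
print(f"2SLS estimate: {stage2.params[1]:.2f}  (naive OLS: {naive_ols.params[1]:.2f})")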

 

May 2011


CER Scan – Published within the past 30 days

    1. BMC Med Res Methodol. 2011 Apr 25;11(1):57. [Epub ahead of print]

Exploratory trials, confirmatory observations: a new reasoning model in the era of patient-centered medicine. Sacristan JA.
BACKGROUND:
The prevailing view in therapeutic clinical research today is that observational studies are useful for generating new hypotheses and that controlled experiments (i.e., randomized clinical trials, RCTs) are the most appropriate method for assessing and confirming the efficacy of interventions.

DISCUSSION:
The current trend towards patient-centered medicine calls for alternative ways of reasoning, and in particular for a shift towards hypothetico-deductive logic, in which theory is adjusted in light of individual facts. A new model of this kind should change our approach to drug research and development, and regulation. The assessment of new therapeutic agents would be viewed as a continuous process, and regulatory approval would no longer be regarded as the final step in the testing of a hypothesis, but rather, as the hypothesis-generating step. The main role of RCTs in this patient-centered research paradigm would be to generate hypotheses, while observations would serve primarily to test their validity for different types of patients. Under hypothetico-deductive logic, RCTs are considered “exploratory” and observations, “confirmatory”.

SUMMARY:
In this era of tailored therapeutics, the answers to therapeutic questions cannot come exclusively from methods that rely on data aggregation, the analysis of similarities, controlled experiments, and a search for the best outcome for the average patient; they must also come from methods based on data disaggregation, analysis of subgroups and individuals, an integration of research and clinical practice, systematic observations, and a search for the best outcome for the individual patient. We must look not only to evidence-based medicine, but also to medicine-based evidence, in seeking the knowledge that we need.

Free Article: http://www.biomedcentral.com/1471-2288/11/57

PMID: 21518440

    1. JAMA. 2011 Apr 13;305(14):1482-3.

Infection prevention and comparative effectiveness research. Perencevich EN, Lautenbach E.

Division of General Internal Medicine, University of Iowa, Carver College of Medicine, Iowa City, USA. eli-perencevich@uiowa.edu

PMID: 21486981

CER Scan [Epub Ahead of Print]

    1. Stat Med. 2011 May 3. doi: 10.1002/sim.4262. [Epub ahead of print]

Alternative methods for testing treatment effects on the basis of multiple outcomes: Simulation and case study. Yoon FB, Fitzmaurice GM, Lipsitz SR, Horton NJ, Laird NM, Normand SL. Harvard Medical School, Boston, MA, U.S.A. yoon@hcp.med.harvard.edu.

In clinical trials multiple outcomes are often used to assess treatment interventions. This paper presents an evaluation of likelihood-based methods for jointly testing treatment effects in clinical trials with multiple continuous outcomes. Specifically, we compare the power of joint tests of treatment effects obtained from joint models for the multiple outcomes with univariate tests based on modeling the outcomes separately. We also consider the power and bias of tests when data are missing, a common feature of many trials, especially in psychiatry. Our results suggest that joint tests capitalize on the correlation of multiple outcomes and are more powerful than standard univariate methods, especially when outcomes are missing completely at random. When outcomes are missing at random, test procedures based on correctly specified joint models are unbiased, while standard univariate procedures are not. Results of a simulation study are reported, and the methods are illustrated in an example from the Clinical Antipsychotic Trials of Intervention Effectiveness for schizophrenia. Copyright © 2011 John Wiley & Sons, Ltd.

PMID: 21538986 [PubMed - as supplied by publisher]

    1. Stat Med. 2011 Apr 15. doi: 10.1002/sim.4241. [Epub ahead of print]

Two-stage instrumental variable methods for estimating the causal odds ratio: Analysis of bias. Cai B, Small DS, TenHave TR.

Merck Research Laboratories, UG1D-60, P.O. Box 1000, North Wales, PA 19454-1099, U.S.A.; Department of Biostatistics and Epidemiology, University of Pennsylvania School of Medicine, Blockley Hall, 6th FLR 423 Guardian Dr., Philadelphia, PA 19104-6021, U.S.A. bcai@mail.med.upenn.edu.
We present closed-form expressions of asymptotic bias for the causal odds ratio from two estimation approaches of instrumental variable logistic regression: (i) the two-stage predictor substitution (2SPS) method and (ii) the two-stage residual inclusion (2SRI) approach. Under the 2SPS approach, the first stage model yields the predicted value of treatment as a function of an instrument and covariates, and in the second stage model for the outcome, this predicted value replaces the observed value of treatment as a covariate. Under the 2SRI approach, the first stage is the same, but the residual term of the first stage regression is included in the second stage regression, retaining the observed treatment as a covariate. Our bias assessment is for a different context from that of Terza (J. Health Econ. 2008; 27(3):531-543), who focused on the causal odds ratio conditional on the unmeasured confounder, whereas we focus on the causal odds ratio among compliers under the principal stratification framework. Our closed-form bias results show that the 2SPS logistic regression generates asymptotically biased estimates of this causal odds ratio when there is no unmeasured confounding and that this bias increases with increasing unmeasured confounding. The 2SRI logistic regression is asymptotically unbiased when there is no unmeasured confounding, but when there is unmeasured confounding, there is bias and it increases with increasing unmeasured confounding. The closed-form bias results provide guidance for using these IV logistic regression methods. Our simulation results are consistent with our closed-form analytic results under different combinations of parameter settings. Copyright © 2011 John Wiley & Sons, Ltd.


PMID: 21495062
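
The two estimators compared in this article are straightforward to contrast on synthetic data: 2SPS replaces the observed treatment with its first-stage prediction, whereas 2SRI retains the observed treatment and adds the first-stage residual. The sketch below is purely illustrative, with assumed variable names, and makes no claim about which estimand either approach recovers in a given application.

# Sketch contrasting 2SPS and 2SRI instrumental-variable logistic regressions.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(7)
n = 20000
z = rng.binomial(1, 0.5, n)                          # instrument
u = rng.normal(size=n)                               # unmeasured confounder
p_treat = 1 / (1 + np.exp(-(-1 + 1.5 * z + 1.0 * u)))
d = rng.binomial(1, p_treat)                         # treatment
p_out = 1 / (1 + np.exp(-(-1 + 0.7 * d + 1.0 * u)))  # conditional log-OR 0.7
y = rng.binomial(1, p_out)

# First stage (shared): model treatment as a function of the instrument.
first = sm.Logit(d, sm.add_constant(z)).fit(disp=False)
d_hat = first.predict(sm.add_constant(z))
resid = d - d_hat

# 2SPS: replace observed treatment with its first-stage prediction.
fit_2sps = sm.Logit(y, sm.add_constant(d_hat)).fit(disp=False)

# 2SRI: keep observed treatment and add the first-stage residual.
fit_2sri = sm.Logit(y, sm.add_constant(np.column_stack([d, resid]))).fit(disp=False)

print(f"2SPS log-OR: {fit_2sps.params[1]:.2f}, 2SRI log-OR: {fit_2sri.params[1]:.2f}")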

    1. Stat Med. 2011 Apr 15. doi: 10.1002/sim.4240. [Epub ahead of print]

Semiparametric regression models for detecting effect modification in matched case-crossover studies. Kim I, Cheong HK, Kim H.

Department of Statistics, Virginia Polytechnic Institute and State University, Blacksburg, VA, U.S.A.
In matched case-crossover studies, it is generally accepted that covariates on which a case and associated controls are matched cannot exert a confounding effect on independent predictors included in the conditional logistic regression model because any stratum effect is removed by the conditioning on the fixed number of sets of a case and controls in the stratum. Hence, the conditional logistic regression model is not able to detect any effects associated with the matching covariates by stratum. In addition, the matching covariates may be effect modification and the methods for assessing and characterizing effect modification by matching covariates are quite limited. In this article, we propose a unified approach in its ability to detect both parametric and nonparametric relationships between the predictor and the relative risk of disease or binary outcome, as well as potential effect modifications by matching covariates. Two methods are developed using two semiparametric models: (1) the regression spline varying coefficients model and (2) the regression spline interaction model. Simulation results show that the two approaches are comparable. These methods can be used in any matched case-control study and extend to multilevel effect modification studies. We demonstrate the advantage of our approach using an epidemiological example of a 1-4 bi-directional case-crossover study of childhood aseptic meningitis associated with drinking water turbidity. Copyright © 2011 John Wiley & Sons, Ltd.

PMID: 21495061

    1. Pharmacoepidemiol Drug Saf. 2011 Apr 29. doi: 10.1002/pds.2133. [Epub ahead of print]

Near real-time vaccine safety surveillance with partially accrued data. Greene SK, Kulldorff M, Yin R, Yih WK, Lieu TA, Weintraub ES, Lee GM.

Department of Population Medicine, Harvard Medical School and Harvard Pilgrim
Health Care Institute, Boston, MA, USA. Sharon_Greene@harvardpilgrim.org.
PURPOSE: The Vaccine Safety Datalink (VSD) Project conducts near real-time vaccine safety surveillance using sequential analytic methods. Timely surveillance is critical in identifying potential safety problems and preventing additional exposure before most vaccines are administered. For vaccines that are administered during a short period, such as influenza vaccines, timeliness can be improved by undertaking analyses while risk windows following vaccination are ongoing and by accommodating predictable and unpredictable data accrual delays. We describe practical solutions to these challenges, which were adopted by the
VSD Project during pandemic and seasonal influenza vaccine safety surveillance in 2009/2010.

METHODS: Adjustments were made to two sequential analytic approaches.
The Poisson-based approach compared the number of pre-defined adverse events observed following vaccination with the number expected using historical data.
The expected number was adjusted for the proportion of the risk window elapsed and the proportion of inpatient data estimated to have accrued. The binomial-based approach used a self-controlled design, comparing the observed numbers of events in risk versus comparison windows. Events were included in analysis only if they occurred during a week that had already passed for both windows. RESULTS: Analyzing data before risk windows fully elapsed improved the timeliness of safety surveillance. Adjustments for data accrual lags were tailored to each data source and avoided biasing analyses away from detecting a potential safety problem, particularly early during surveillance.

CONCLUSIONS: The timeliness of vaccine and drug safety surveillance can be improved by properly accounting for partially elapsed windows and data accrual delays. Copyright © 2011 John Wiley & Sons, Ltd.

PMID: 21538670 [PubMed - as supplied by publisher]
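
The expected-count adjustment used in the Poisson-based approach can be sketched in a few lines. The function below applies a single elapsed-window fraction and a single accrual proportion to a historical rate, a simplification of the dose-by-dose accounting a real surveillance analysis would use; all figures are invented, not VSD values.

```python
# Rough sketch of scaling the expected count for a partially elapsed risk
# window and incomplete data accrual. In practice each dose has its own
# elapsed fraction and accrual estimates are data-source specific.
def adjusted_expected(historical_rate_per_dose, doses_given, days_elapsed,
                      risk_window_days, prop_data_accrued):
    """Expected events under the null, scaled down for the portion of the
    risk window that has elapsed and for incomplete data accrual."""
    prop_window = min(days_elapsed / risk_window_days, 1.0)
    return historical_rate_per_dose * doses_given * prop_window * prop_data_accrued

# e.g. 1 event per 100,000 doses historically, 2 million doses given,
# 21 of a 42-day risk window elapsed, 80% of inpatient claims accrued
print(adjusted_expected(1e-5, 2_000_000, 21, 42, 0.80))  # -> 8.0
```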

    1. Epidemiology. 2011 Apr 11. [Epub ahead of print]

Validation Data-based Adjustments for Outcome Misclassification in Logistic Regression: An Illustration. Lyles RH, Tang L, Superak HM, King CC, Celentano DD, Lo Y, Sobel JD.
From the Department of Biostatistics and Bioinformatics, The Rollins School of Public Health of Emory University, Atlanta, GA; Centers for Disease Control and Prevention, Atlanta, GA; Department of Epidemiology, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD; Montefiore Medical Center and Albert Einstein College of Medicine, Bronx, NY; and Department of Medicine, Wayne State University School of Medicine, Detroit, MI.
Misclassification of binary outcome variables is a known source of potentially serious bias when estimating adjusted odds ratios. Although researchers have described frequentist and Bayesian methods for dealing with the problem, these methods have seldom fully bridged the gap between statistical research and epidemiologic practice. In particular, there have been few real-world applications of readily grasped and computationally accessible methods that make direct use of internal validation data to adjust for differential outcome misclassification in logistic regression. In this paper, we illustrate likelihood-based methods for this purpose that can be implemented using standard statistical software. Using main study and internal validation data from the HIV Epidemiology Research Study, we demonstrate how misclassification rates can depend on the values of subject-specific covariates, and we illustrate the importance of accounting for this dependence. Simulation studies confirm the effectiveness of the maximum likelihood approach. We emphasize clear exposition of the likelihood function itself, to permit the reader to easily assimilate appended computer code that facilitates sensitivity analyses as well as the efficient handling of main/external and main/internal validation-study data. These methods are readily applicable under random cross-sectional sampling, and we discuss the extent to which the main/internal analysis remains appropriate under outcome-dependent (case-control) sampling.

PMID: 21487295 [PubMed - as supplied by publisher]
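
The core of the likelihood-based adjustment can be written in one line. In generic notation (chosen here, not taken from the paper), with true outcome Y, error-prone observed outcome Y*, covariates x, and possibly covariate-dependent sensitivity SE(x) and specificity SP(x):

P(Y^{*}=1 \mid x) \;=\; SE(x)\,\pi(x) + \{1-SP(x)\}\{1-\pi(x)\}, \qquad \pi(x) = P(Y=1 \mid x) = \operatorname{expit}(\beta_{0} + \beta^{\top}x).

Main-study records contribute P(Y* = y* | x) to the likelihood, while internal-validation records, for which both Y and Y* are observed, contribute terms that identify SE(x) and SP(x); maximizing the joint likelihood yields the misclassification-adjusted logistic coefficients.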

Theme: The Changing Face of Epidemiology

    1. Epidemiology. 2011 May;22(3):295-7.

Making observational studies count: shaping the future of comparative effectiveness research. Dreyer NA. Outcome Sciences Inc., Cambridge, MA.

PMID: 21464648

    1. Epidemiology. 2011 May;22(3):290-1.

With great data comes great responsibility: publishing comparative effectiveness research in epidemiology. Hernán MA. From Harvard School of Public Health, Boston, MA.

PMID: 21464646 [PubMed - in process]

    1. Epidemiology. 2011 May;22(3):302-4.

Improving automated database studies. Ray WA. From the Division of Pharmacoepidemiology, Department of Preventive Medicine, Vanderbilt University School of Medicine, Nashville, TN; and the Geriatric Research, Education and Clinical Center, Nashville Veterans Administration Medical Center, Nashville, TN.

PMID: 21464650 [PubMed - in process]

    1. Epidemiology. 2011 May;22(3):298-301.

Nonexperimental comparative effectiveness research using linked healthcare databases. Stürmer T, Jonsson Funk M, Poole C, Brookhart MA. From the Pharmacoepidemiology Program, Department of Epidemiology, UNC Gillings School of Global Public Health University of North Carolina at Chapel Hill, Chapel Hill, NC.

PMID: 21464649 [PubMed - in process]

    1. Epidemiology. 2011 May;22(3):292-4.

The new world of data linkages in clinical epidemiology: are we being brave or foolhardy? Weiss NS. From the Department of Epidemiology, University of Washington, and the Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, WA.

PMID: 21464647 [PubMed - in process]

 

April 2011

CER Scan – Published within the past 30 days

    1. BMC Med Res Methodol. 2011 Apr 1;11(1):36. [Epub ahead of print]

Design of cohort studies in chronic diseases using routinely collected databases when a prescription is used as surrogate outcome.

Lodi S, Carpenter J, Egger P, Evans S.

BACKGROUND: There has been little research on design of studies based on routinely collected data when the clinical endpoint of interest is not recorded, but can be inferred from a prescription. This often happens when exploring the effect of a drug on chronic diseases. Using the LifeLink claims database in studying the possible anti-inflammatory effects of statins in rheumatoid arthritis (RA), oral steroids (OS) were treated as surrogate of inflammatory flare-ups. We compared two cohort study designs, the first using time to event outcomes and the second using quantitative amount of the surrogate.
METHODS: RA patients were extracted from the LifeLink database. In the first study, patients were split into two sub-cohorts based on whether they were using OS within a specified time window of the RA index date (first record of RA). Using Cox models we evaluated the association between time-varying exposure to statins and (i) initiation of OS therapy in the non-users of OS at RA index date and (ii) cessation of OS therapy in the users of OS at RA index date. In the second study, we matched new statin users to non users on age and sex. Zero inflated negative binomial models were used to contrast the number of days’ prescriptions of OS in the year following date of statin initiation for the two exposure groups.
RESULTS: In the unmatched study, the statin exposure hazard ratio (HR) of initiating OS in the 31,451 non-users of OS at RA index date was 0.96 (95% CI 0.9, 1.1) and the statin exposure HR of cessation of OS therapy in the 6,026 users of OS therapy at RA index date was 0.95 (0.87, 1.05). In the matched cohort of 6,288 RA patients the statin exposure rate ratio for duration on OS therapy was 0.88 (0.76, 1.02). There was digit preference for outcomes in multiples of 7 and 30 days.
CONCLUSIONS: The 'time to event' study design was preferable because it better exploits information on all available patients and provides a degree of robustness toward confounding. We found no convincing evidence that statins reduce inflammation in RA patients.

PMID: 21457565 [PubMed - as supplied by publisher]

Free Full text (PDF) available: http://www.biomedcentral.com/content/pdf/1471-2288-11-36.pdf

CER Scan – Epub Ahead of Print

    1. Am J Epidemiol. 2011 Mar 23. [Epub ahead of print]

Invited Commentary: Causation or “noitasuaC”?

Schisterman E, Whitcomb B, Bowers K.

Longitudinal studies are often viewed as the “gold standard” of observational epidemiologic research. Establishing a temporal association is a necessary criterion to identify causal relations. However, when covariates in the causal system vary over time, a temporal association is not straightforward. Appropriate analytical methods may be necessary to avoid confounding and reverse causality. These issues come to light in 2 studies of breastfeeding described in the articles by Al-Sahab et al. (Am J Epidemiol. 2011;173(00):0000-0000) and Kramer et al. (Am J Epidemiol. 2011;173(00):0000-0000) in this issue of the Journal. Breastfeeding has multiple time points and is a behavior that is affected by multiple factors, many of which themselves vary over time. This creates a complex causal system that requires careful scrutiny. The methods presented here may be applicable to a wide range of studies that involve time-varying exposures and time-varying confounders.
PMID: 21430191 [PubMed - as supplied by publisher]

Free Full text (HTML) available: http://aje.oxfordjournals.org/content/early/2011/03/23/aje.kwq499.long

    1. Pharmacoepidemiol Drug Saf. 2011 Mar 10. doi: 10.1002/pds.2098. [Epub ahead of print]

The implications of propensity score variable selection strategies in pharmacoepidemiology: an empirical illustration.

Patrick AR, Schneeweiss S, Brookhart MA, Glynn RJ, Rothman KJ, Avorn J, Stürmer T. Division of Pharmacoepidemiology, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA. arpatrick@partners.org.

PURPOSE: To examine the effect of variable selection strategies on the performance of propensity score (PS) methods in a study of statin initiation, mortality, and hip fracture assuming a true mortality reduction of ≤15% and no effect on hip fracture.
METHODS: We compared seniors initiating statins with seniors initiating glaucoma medications. Out of 202 covariates with a prevalence >5%, PS variable selection strategies included none, a priori, factors predicting exposure, and factors predicting outcome. We estimated hazard ratios (HRs) for statin initiation on mortality and hip fracture from Cox models controlling for various PSs.
RESULTS: During 1 year follow-up, 2693 of 55,610 study subjects died and 496 suffered a hip fracture. The crude HR for statin initiators was 0.64 for mortality and 0.46 for hip fracture. Adjusting for the non-parsimonious PS yielded effect estimates of 0.83 (95%CI:0.75-0.93) and 0.72 (95%CI:0.56-0.93). Including in the PS only covariates associated with a greater than 20% increase or reduction in outcome rates yielded effect estimates of 0.84 (95%CI:0.75-0.94) and 0.76 (95%CI:0.61-0.95), which were closest to the effects predicted from randomized trials.
CONCLUSION: Due to the difficulty of pre-specifying all potential confounders of an exposure-outcome association, data-driven approaches to PS variable selection may be useful. Selecting covariates strongly associated with exposure but unrelated to outcome should be avoided, because this may increase bias. Selecting variables for PS based on their association with the outcome may help to reduce such bias.
Copyright © 2011 John Wiley & Sons, Ltd.
PMID: 21394812 [PubMed - as supplied by publisher]
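
As a rough illustration of how such strategies might be compared in practice, the sketch below fits a propensity score from a chosen covariate set and adjusts a Cox model for PS deciles. The column names, the use of deciles, and the lifelines/statsmodels workflow are assumptions for illustration, not the authors' implementation.

```python
# Hazard ratio for an exposure under different propensity-score covariate sets.
import numpy as np
import pandas as pd
import statsmodels.api as sm
from lifelines import CoxPHFitter

def hr_with_ps(df, covariates, exposure="statin", time="time", event="death"):
    """HR for `exposure`, adjusting for deciles of a PS built from `covariates`."""
    # 1) propensity score for exposure from the chosen covariate set
    ps_model = sm.Logit(df[exposure], sm.add_constant(df[covariates])).fit(disp=0)
    df = df.assign(ps=ps_model.predict(sm.add_constant(df[covariates])))
    # 2) Cox model for the outcome, adjusting for PS decile indicators
    df["ps_decile"] = pd.qcut(df["ps"], 10, labels=False)
    design = pd.get_dummies(df[[time, event, exposure, "ps_decile"]],
                            columns=["ps_decile"], drop_first=True, dtype=float)
    cph = CoxPHFitter().fit(design, duration_col=time, event_col=event)
    return float(np.exp(cph.params_[exposure]))

# e.g. compare hr_with_ps(cohort, a_priori_covariates)
#      with    hr_with_ps(cohort, outcome_predictors)
```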

    1. Am J Epidemiol. 2011 Mar 8. [Epub ahead of print]

Doubly Robust Estimation of Causal Effects.

Funk MJ, Westreich D, Wiesen C, Stürmer T, Brookhart MA, Davidian M.

Doubly robust estimation combines a form of outcome regression with a model for the exposure (i.e., the propensity score) to estimate the causal effect of an exposure on an outcome. When used individually to estimate a causal effect, both outcome regression and propensity score methods are unbiased only if the statistical model is correctly specified. The doubly robust estimator combines these 2 approaches such that only 1 of the 2 models need be correctly specified to obtain an unbiased effect estimator. In this introduction to doubly robust estimators, the authors present a conceptual overview of doubly robust estimation, a simple worked example, results from a simulation study examining performance of estimated and bootstrapped standard errors, and a discussion of the potential advantages and limitations of this method. The supplementary material for this paper, which is posted on the Journal’s Web site (http://aje.oupjournals.org/), includes a demonstration of the doubly robust property (Web Appendix 1) and a description of a SAS macro (SAS Institute, Inc., Cary, North Carolina) for doubly robust estimation, available for download at http://www.unc.edu/~mfunk/dr/.
PMID: 21385832 [PubMed - as supplied by publisher]

Free Full text (HTML) available: http://aje.oxfordjournals.org/content/173/7/761.long
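
A minimal sketch of the doubly robust (augmented inverse-probability-weighted) estimator for a binary exposure and binary outcome is given below. It follows the general form of the estimator rather than the paper's SAS macro, and the array names are illustrative.

```python
# Doubly robust (AIPW) estimate of E[Y(1)] - E[Y(0)] for binary exposure a.
import numpy as np
import statsmodels.api as sm

def doubly_robust(y, a, X):
    """Combine a propensity model and an outcome regression; the estimator is
    consistent if either of the two models is correctly specified."""
    Xc = sm.add_constant(X)
    ps = sm.Logit(a, Xc).fit(disp=0).predict(Xc)              # propensity score
    om = sm.Logit(y, np.column_stack([Xc, a])).fit(disp=0)    # outcome regression
    m1 = om.predict(np.column_stack([Xc, np.ones_like(a)]))   # predicted Y under a = 1
    m0 = om.predict(np.column_stack([Xc, np.zeros_like(a)]))  # predicted Y under a = 0
    dr1 = np.mean(a * y / ps - (a - ps) / ps * m1)
    dr0 = np.mean((1 - a) * y / (1 - ps) + (a - ps) / (1 - ps) * m0)
    return dr1 - dr0
```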

    1. Am J Epidemiol. 2011 Feb 28. [Epub ahead of print]

Methods for Estimating Remission Rates From Cross-Sectional Survey Data: Application and Validation Using Data From a National Migraine Study.

Roy J, Stewart WF.

Knowledge about remission rates can affect treatment decisions and facilitate etiologic discoveries. However, little is known about remission of many chronic episodic disorders, including migraine. This is partly due to the fact that medical records do not fully capture the history of these conditions, since patients might stop seeking care once they no longer have symptoms. For these disorders, remission rates would typically be obtained from prospective observational studies. Prospective studies of remission for chronic episodic conditions are rarely conducted, however, and suffer from many analytical challenges, such as outcome-dependent dropout. Here the authors propose an alternative approach that is appropriate for use with cross-sectional survey data in which reported age of onset was recorded. The authors estimated migraine remission rates using data from a 2004 national survey. They took a Bayesian approach and modeled sex- and age-specific remission rates as a function of incidence and prevalence. The authors found that remission rates were an increasing function of age and were similar for men and women. Follow-up survey data from migraine cases (2005) were used to validate the methods. The remission curves estimated from the validation data were very similar to the ones from the cross-sectional data.
PMID: 21357656 [PubMed - as supplied by publisher]

    1. Int J Epidemiol. 2011 Mar 30. [Epub ahead of print]

The Simpson’s paradox unraveled.

Hernán MA, Clayton D, Keiding N. Department of Epidemiology, Harvard School of Public Health, Harvard-MIT Division of Health Sciences and Technology, Boston, MA 02115, USA, Department of Medical Genetics, Cambridge University, Addenbrooke’s Hospital, Cambridge, UK and Department of Biostatistics, University of Copenhagen, Copenhagen, Denmark.

BACKGROUND: In a famous article, Simpson described a hypothetical data example that led to apparently paradoxical results.
METHODS: We make the causal structure of Simpson’s example explicit.
RESULTS: We show how the paradox disappears when the statistical analysis is appropriately guided by subject-matter knowledge. We also review previous explanations of Simpson’s paradox that attributed it to two distinct phenomena: confounding and non-collapsibility.
CONCLUSION: Analytical errors may occur when the problem is stripped of its causal context and analyzed merely in statistical terms.

PMID: 21454324 [PubMed - as supplied by publisher]

Free Full text (PDF) available: http://ije.oxfordjournals.org/content/early/2011/03/30/ije.dyr041.full.pdf+html
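
A compact numeric illustration of the reversal the authors discuss, using classic textbook-style counts rather than data from the article:

```python
# Treatment A looks better within each stratum but worse overall, because A
# was given mostly to the harder (severe) stratum. Counts are invented.
strata = {
    "mild":   {"A": (81, 87),   "B": (234, 270)},   # (successes, patients)
    "severe": {"A": (192, 263), "B": (55, 80)},
}
for s, arms in strata.items():
    print(s, {t: f"{k}/{n} = {k/n:.0%}" for t, (k, n) in arms.items()})  # A beats B

totals = {t: tuple(map(sum, zip(*[strata[s][t] for s in strata]))) for t in ("A", "B")}
print({t: f"{k}/{n} = {k/n:.0%}" for t, (k, n) in totals.items()})      # B beats A
```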

    1. Clin Trials. 2011 Jan 31. [Epub ahead of print]

Bayesian models for subgroup analysis in clinical trials.

Jones HE, Ohlssen DI, Neuenschwander B, Racine A, Branson M. School of Social and Community Medicine, University of Bristol, UK.

BACKGROUND: In a pharmaceutical drug development setting, possible interactions between the treatment and particular baseline clinical or demographic factors are often of interest. However, the subgroup analysis required to investigate such associations remains controversial. Concerns with classical hypothesis testing approaches to the problem include low power, multiple testing, and the possibility of data dredging.
PURPOSE: As an alternative to hypothesis testing, the use of shrinkage estimation techniques is investigated in the context of an exploratory post hoc subgroup analysis. A range of models that have been suggested in the literature are reviewed. Building on this, we explore a general modeling strategy, considering various options for shrinkage of effect estimates. This is applied to a case-study, in which evidence was available from seven phase II-III clinical trials examining a novel therapy, and also to two artificial datasets with the same structure.
METHODS: Emphasis is placed on hierarchical modeling techniques, adopted within a Bayesian framework using freely available software. A range of possible subgroup model structures are applied, each incorporating shrinkage estimation techniques.
RESULTS: The investigation of the case-study showed little evidence of subgroup effects. Because inferences appeared to be consistent across a range of well-supported models, and model diagnostic checks showed no obvious problems, it seemed this conclusion was robust. It is reassuring that the structured shrinkage techniques appeared to work well in a situation where deeper inspection of the data suggested little evidence of subgroup effects.
LIMITATIONS: The post hoc examination of subgroups should be seen as an exploratory analysis, used to help make better informed decisions regarding potential future studies examining specific subgroups. To a certain extent, the degree of understanding provided by such assessments will be limited by the quality and quantity of available data.

CONCLUSIONS: In light of recent interest by health authorities in the use of subgroup analysis in the context of drug development, it appears that Bayesian approaches involving shrinkage techniques could play an important role in this area. Hopefully, the developments outlined here provide useful methodology for tackling such a problem, in turn leading to better informed decisions regarding subgroups.

PMID: 21282293 [PubMed - as supplied by publisher]
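
The shrinkage idea at the heart of these models can be illustrated with a simple empirical-Bayes normal-normal approximation, shown below. This is a stand-in for, not a reproduction of, the fully Bayesian hierarchical models the authors fit, and the subgroup estimates and standard errors are invented.

```python
# Shrink per-subgroup treatment-effect estimates toward the overall mean.
import numpy as np

est = np.array([0.40, -0.10, 0.25, 0.05])    # per-subgroup effect estimates
se  = np.array([0.20, 0.25, 0.15, 0.30])     # their standard errors

# DerSimonian-Laird estimate of the between-subgroup variance tau^2
w = 1 / se**2
mu = np.sum(w * est) / np.sum(w)
q = np.sum(w * (est - mu) ** 2)
tau2 = max(0.0, (q - (len(est) - 1)) / (np.sum(w) - np.sum(w**2) / np.sum(w)))

# Posterior mean under a normal-normal model: pull each estimate toward mu
shrink = tau2 / (tau2 + se**2)
posterior_mean = shrink * est + (1 - shrink) * mu
print(np.round(posterior_mean, 3))
```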

    1. Clin Trials. 2011 Jan 26. [Epub ahead of print]

Challenges in the design and conduct of controlled clinical effectiveness trials in schizophrenia.

Rosenheck RA, Krystal JH, Lew R, Barnett PG, Thwin SS, Fiore L, Valley D, Huang GD, Neal C, Vertrees JE, Liang MH. Veterans Affairs (VA) Connecticut Healthcare System, West Haven, CT, USA, Yale School of Medicine, New Haven, CT, USA.

BACKGROUND: The introduction of antipsychotic medication has been a major advance in the treatment of schizophrenia and allows millions of people to live outside of institutions. It is generally believed that long-acting intramuscular antipsychotic medication is the most effective approach to increasing medication adherence and thereby reduce relapse in high-risk patients with schizophrenia, but the data are scant.
PURPOSE: To report the design of a study to assess the effect of long-acting injectable risperidone in unstable patients and under more realistic conditions than previously studied and to evaluate the effect of this medication on psychiatric inpatient hospitalization, schizophrenia symptoms, quality of life, medication adherence, side effects, and health care costs.
METHODS: The trial was an open randomized clinical comparative effectiveness trial in patients with schizophrenia or schizo-affective disorders in which parenteral risperidone was compared to an oral antipsychotic regimen selected by each control patient’s psychiatrist. Participants had unstable psychiatric disease defined by recent hospitalization or exhibition of unusual need for psychiatric services. The primary endpoint was hospitalization for psychiatric indications; the secondary endpoint was psychiatric symptoms.
RESULTS: Overall, 382 patients were randomized. Determination of a person's competency to understand the elements of informed consent was addressed. The use of a closed-circuit TV interview for psychosocial measures provided an economical, high-quality, reliable means of collecting data. A unique method for ensuring that usual care was optimal was incorporated in the follow-up of all subjects.
LIMITATIONS: Patients with schizophrenia or schizo-affective disorders and with the common co-morbid illnesses seen in the VA are a challenging group of subjects to study in long-term trials. Some techniques unique in the VA and found useful may not be generalizable or applicable in other research or treatment settings.
CONCLUSIONS: The trial tested a new antipsychotic medication early in its adoption in the Veterans Health Administration. The VA has a unique electronic medical record and database which can be used to identify the endpoint, that is, first hospitalization due to a psychiatric problem, with complete ascertainment. Several methodologic solutions addressed competency to understand the elements of consent, the costs and reliability of interview data gathering, and ensuring that usual care was optimal.
PMID: 21270143 [PubMed - as supplied by publisher]

Author Scan – Published within the past 30 days/ Epub Ahead of Print

    1. PLoS One. 2011 Mar 22;6(3):e18062.

Multicenter Evaluation of a Novel Surveillance Paradigm for Complications of Mechanical Ventilation.

Klompas M, Khan Y, Kleinman K, Evans RS, Lloyd JF, Stevenson K, Samore M, Platt R; for the CDC Prevention Epicenters Program. Department of Population Medicine, Harvard Medical School and Harvard Pilgrim Health Care Institute, Boston, Massachusetts, United States of America.

BACKGROUND: Ventilator-associated pneumonia (VAP) surveillance is time consuming, subjective, inaccurate, and inconsistently predicts outcomes. Shifting surveillance from pneumonia in particular to complications in general might circumvent the VAP definition’s subjectivity and inaccuracy, facilitate electronic assessment, make interfacility comparisons more meaningful, and encourage broader prevention strategies. We therefore evaluated a novel surveillance paradigm for ventilator-associated complications (VAC) defined by sustained increases in patients’ ventilator settings after a period of stable or decreasing support.
METHODS: We assessed 600 mechanically ventilated medical and surgical patients from three hospitals. Each hospital contributed 100 randomly selected patients ventilated 2-7 days and 100 patients ventilated >7 days. All patients were independently assessed for VAP and for VAC. We compared incidence-density, duration of mechanical ventilation, intensive care and hospital lengths of stay, hospital mortality, and time required for surveillance for VAP and for VAC. A subset of patients with VAP and VAC were independently reviewed by a physician to determine possible etiology.
RESULTS: Of 597 evaluable patients, 9.3% had VAP (8.8 per 1,000 ventilator days) and 23% had VAC (21.2 per 1,000 ventilator days). Compared to matched controls, both VAP and VAC prolonged days to extubation (5.8, 95% CI 4.2-8.0 and 6.0, 95% CI 5.1-7.1 respectively), days to intensive care discharge (5.7, 95% CI 4.2-7.7 and 5.0, 95% CI 4.1-5.9), and days to hospital discharge (4.7, 95% CI 2.6-7.5 and 3.0, 95% CI 2.1-4.0). VAC was associated with increased mortality (OR 2.0, 95% CI 1.3-3.2) but VAP was not (OR 1.1, 95% CI 0.5-2.4). VAC assessment was faster (mean 1.8 versus 39 minutes per patient). Both VAP and VAC events were predominantly attributable to pneumonia, pulmonary edema, ARDS, and atelectasis.
CONCLUSIONS: Screening ventilator settings for VAC captures a similar set of complications to traditional VAP surveillance but is faster, more objective, and a superior predictor of outcomes.
PMID: 21445364 [PubMed - as supplied by publisher]

Free Full text (PDF) available: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3062570/pdf/pone.0018062.pdf
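
A sketch of what screening daily ventilator settings for a VAC-type signal might look like is given below. The specific thresholds and window lengths are placeholders for illustration, not the study's operational definition.

```python
# Flag a sustained rise in ventilator support after a stable-or-decreasing
# baseline. Assumed thresholds: 2 stable days, then a rise of >=0.20 in FiO2
# (as a fraction) or >=3 cmH2O in PEEP sustained for >=2 days.
def first_vac_day(fio2, peep, baseline_days=2, sustain_days=2,
                  fio2_rise=0.20, peep_rise=3.0):
    """Return the index of the first day starting a sustained increase, else None."""
    n = len(fio2)
    for d in range(baseline_days, n - sustain_days + 1):
        base_f = min(fio2[d - baseline_days:d])
        base_p = min(peep[d - baseline_days:d])
        # baseline is stable or decreasing: no day-to-day increase
        stable = all(fio2[i] <= fio2[i - 1] and peep[i] <= peep[i - 1]
                     for i in range(d - baseline_days + 1, d))
        # sustained rise relative to the baseline minimum
        rise = all(fio2[i] >= base_f + fio2_rise or peep[i] >= base_p + peep_rise
                   for i in range(d, d + sustain_days))
        if stable and rise:
            return d
    return None

# e.g. first_vac_day([0.4, 0.4, 0.4, 0.7, 0.7, 0.6], [5, 5, 5, 5, 5, 5]) -> 3
```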

    1. Med Care. 2011 Mar 18. [Epub ahead of print]

Placebo Adherence, Clinical Outcomes, and Mortality in the Women’s Health Initiative Randomized Hormone Therapy Trials.

Curtis JR, Larson JC, Delzell E, Brookhart MA, Cadarette SM, Chlebowski R, Judd S, Safford M, Solomon DH, Lacroix AZ. Division of Clinical Immunology and Rheumatology, University of Alabama at Birmingham, Birmingham, AL; Fred Hutchinson Cancer Research Center, Seattle, WA; Department of Epidemiology, University of Alabama at Birmingham, Birmingham, AL; Department of Epidemiology, Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, NC; Leslie Dan Faculty of Pharmacy, University of Toronto, Toronto, ON, Canada; Los Angeles Biomedical Research Institute at Harbor-UCLA Medical Center, Torrance, CA; Division of Preventive Medicine, University of Alabama at Birmingham, Birmingham, AL; Division of Pharmacoepidemiology and Pharmacoeconomics, Brigham and Women's Hospital, Boston, MA.

BACKGROUND: Medication adherence may be a proxy for healthy behaviors and other factors that affect outcomes. Prior studies of the association between placebo adherence and health outcomes have been limited primarily to men enrolled in clinical trials and cardiovascular disease outcomes. We examined associations between adherence to placebo and the risk of fracture, coronary heart disease, cancer, and all-cause mortality in the 2 Women’s Health Initiative hormone therapy randomized trials.
METHODS: Postmenopausal women randomized to placebo with adherence measured at least once were eligible for analysis. Time-varying adherence was assessed by dispensing history and pill counts. Outcome adjudication was based on physician review of medical records. Cox proportional hazards models evaluated the relation between high adherence (≥80%) to placebo and various outcomes, referent to low adherence (<80%).
RESULTS: A total of 13,444 postmenopausal women were under observation for 106,066 person-years. High placebo adherence was inversely associated with most outcomes including hip fracture [hazard ratio (HR), 0.50; 95% confidence interval (CI), 0.33-0.78], myocardial infarction (HR, 0.69; 95% CI, 0.50-0.95), cancer death (HR, 0.60; 95% CI, 0.43-0.82), and all-cause mortality (HR, 0.64; 95% CI, 0.51-0.80) after adjustment for potential confounders. Women with low adherence to placebo were 20% more likely to have low adherence to statins and osteoporosis medications.
CONCLUSIONS: In the Women’s Health Initiative clinical trials, high adherence to placebo was associated with favorable clinical outcomes and mortality. Until the healthy behaviors and/or other factors for which high adherence is a proxy can be better elucidated, caution is warranted when interpreting the magnitude of benefit of medication adherence.
PMID: 21422960 [PubMed - as supplied by publisher]

    1. Health Serv Res. 2011 Mar 17. doi: 10.1111/j.1475-6773.2011.01253.x. [Epub ahead of print]

Crowd-out and Exposure Effects of Physical Comorbidities on Mental Health Care Use: Implications for Racial-Ethnic Disparities in Access. Lê Cook B, McGuire TG, Alegría M, Normand SL. Center for Multicultural Mental Health Research, 120 Beacon St., 4th Floor, Somerville, MA 02143; Department of Psychiatry, Harvard Medical School, Boston, MA; Department of Health Care Policy, Harvard Medical School, Boston, MA; Center for Multicultural Mental Health Research, Somerville, MA.

Objectives. In disparities models, researchers adjust for differences in “clinical need,” including indicators of comorbidities. We reconsider this practice, assessing (1) if and how having a comorbidity changes the likelihood of recognition and treatment of mental illness; and (2) differences in mental health care disparities estimates with and without adjustment for comorbidities.
Data. Longitudinal data from the 2000-2007 Medical Expenditure Panel Survey (n = 11,083), split into pre- and postperiods, for white, Latino, and black adults with probable need for mental health care.
Study Design. First, we tested a crowd-out effect (comorbidities decrease initiation of mental health care after a primary care provider [PCP] visit) using logistic regression models and an exposure effect (comorbidities cause more PCP visits, increasing initiation of mental health care) using instrumental variable methods. Second, we assessed the impact of adjustment for comorbidities on disparity estimates.
Principal Findings. We found no evidence of a crowd-out effect but strong evidence for an exposure effect. Number of postperiod visits positively predicted initiation of mental health care. Adjusting for racial/ethnic differences in comorbidities increased black-white disparities and decreased Latino-white disparities.
Conclusions. Positive exposure findings suggest that intensive follow-up programs shown to reduce disparities in chronic-care management may have additional indirect effects on reducing mental health care disparities.
© Health Research and Educational Trust.
PMID: 21413984 [PubMed - as supplied by publisher]

 

March 2011

CER Scan [Published within the last 30 days]

    1. Med Care. 2011 Mar;49(3):257-266.

Predicting the Risk of 1-Year Mortality in Incident Dialysis Patients: Accounting for Case-Mix Severity in Studies Using Administrative Data.

Quinn RR, Laupacis A, Hux JE, Oliver MJ, Austin PC.
BACKGROUND: Administrative databases are increasingly being used to study the incident dialysis population and have important advantages. However, traditional methods of risk adjustment have limitations in this patient population.
OBJECTIVE: Our objective was to develop a prognostic index for 1-year mortality in incident dialysis patients using administrative data that was applicable to ambulatory patients, used objective definitions of candidate predictor variables, and was easily replicated in other environments.
RESEARCH DESIGN: Anonymized, administrative health data housed at the Institute for Clinical Evaluative Sciences in Toronto, Canada were used to identify a population-based sample of 16,205 patients who initiated dialysis between July 1, 1998 and March 31, 2005. The cohort was divided into derivation, validation, and testing samples and 4 different strategies were used to derive candidate logistic regression models for 1-year mortality. The final risk prediction model was selected based on discriminatory ability (as measured by the c-statistic) and a risk prediction score was derived using methods adopted from the Framingham Heart Study. Calibration of the predictive model was assessed graphically.
RESULTS: The risk of death during the first year of dialysis therapy was 16.4% in the derivation sample. The final model had a c-statistic of 0.765, 0.763, and 0.756 in the derivation, validation, and testing samples, respectively. Plots of actual versus predicted risk of death at 1-year showed good calibration.
CONCLUSION: The prognostic index and summary risk score accurately predict 1-year mortality in incident dialysis patients and can be used for the purposes of risk adjustment.
PMID: 21301370 [PubMed - as supplied by publisher]

CER Scan [Epub Ahead of Print]

    1. Stat Med. 2011 Feb 24. doi: 10.1002/sim.4168. [Epub ahead of print]

Generalized propensity score for estimating the average treatment effect of multiple treatments.

Feng P, Zhou XH, Zou QM, Fan MY, Li XS.

The propensity score method is widely used in clinical studies to estimate the effect of a treatment with two levels on patient’s outcomes. However, due to the complexity of many diseases, an effective treatment often involves multiple components. For example, in the practice of Traditional Chinese Medicine (TCM), an effective treatment may include multiple components, e.g. Chinese herbs, acupuncture, and massage therapy. In clinical trials involving TCM, patients could be randomly assigned to either the treatment or control group, but they or their doctors may make different choices about which treatment component to use. As a result, treatment components are not randomly assigned. Rosenbaum and Rubin proposed the propensity score method for binary treatments, and Imbens extended their work to multiple treatments. These authors defined the generalized propensity score as the conditional probability of receiving a particular level of the treatment given the pre-treatment variables. In the present work, we adopted this approach and developed a statistical methodology based on the generalized propensity score in order to estimate treatment effects in the case of multiple treatments. Two methods were discussed and compared: propensity score regression adjustment and propensity score weighting. We used these methods to assess the relative effectiveness of individual treatments in the multiple-treatment IMPACT clinical trial. The results reveal that both methods perform well when the sample size is moderate or large. Copyright © 2011 John Wiley & Sons, Ltd.
PMID: 21351291 [PubMed - as supplied by publisher]
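
A small sketch of the generalized propensity score mechanics for a three-level treatment, using a multinomial logit and inverse-probability weighting; the data generation and the weighting estimator are illustrative choices, not the authors' regression-adjustment and weighting procedures as implemented in the paper.

```python
# Generalized propensity score: P(T = t | X) from a multinomial model,
# then inverse-probability weighting by the probability of the treatment received.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 3000
X = rng.normal(size=(n, 2))                              # pre-treatment covariates

# three treatment levels, assignment depending on the covariates
lin = np.column_stack([np.zeros(n), 0.8 * X[:, 0], -0.8 * X[:, 1]])
p = np.exp(lin) / np.exp(lin).sum(axis=1, keepdims=True)
treat = np.array([rng.choice(3, p=pi) for pi in p])
y = 0.5 * (treat == 2) + X @ np.array([0.3, -0.2]) + rng.normal(size=n)

mn = sm.MNLogit(treat, sm.add_constant(X)).fit(disp=0)
probs = mn.predict(sm.add_constant(X))                   # n x 3 matrix of GPS values
gps = probs[np.arange(n), treat]                         # probability of treatment received
w = 1.0 / gps                                            # inverse-probability weights

# IPW-weighted outcome mean by treatment level (one simple use of the GPS)
for t in range(3):
    m = treat == t
    print(t, round(float(np.average(y[m], weights=w[m])), 3))
```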

    1. Eur J Epidemiol. 2011 Feb 23. [Epub ahead of print]

Estimating measures of interaction on an additive scale for preventive exposures.
Knol MJ, Vanderweele TJ, Groenwold RH, Klungel OH, Rovers MM, Grobbee DE.
Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, PO Box 85500, 3508 GA, Utrecht, The Netherlands, m.j.knol@umcutrecht.nl.
Measures of interaction on an additive scale (relative excess risk due to interaction [RERI], attributable proportion [AP], synergy index [S]), were developed for risk factors rather than preventive factors. It has been suggested that preventive factors should be recoded to risk factors before calculating these measures. We aimed to show that these measures are problematic with preventive factors prior to recoding, and to clarify the recoding method to be used to circumvent these problems. Recoding of preventive factors should be done such that the stratum with the lowest risk becomes the reference category when both factors are considered jointly (rather than one at a time). We used data from a case-control study on the interaction between ACE inhibitors and the ACE gene on incident diabetes. Use of ACE inhibitors was a preventive factor and DD ACE genotype was a risk factor. Before recoding, the RERI, AP and S showed inconsistent results (RERI = 0.26 [95%CI: -0.30; 0.82], AP = 0.30 [95%CI: -0.28; 0.88], S = 0.35 [95%CI: 0.02; 7.38]), with the first two measures suggesting positive interaction and the third negative interaction. After recoding the use of ACE inhibitors, they showed consistent results (RERI = -0.37 [95%CI: -1.23; 0.49], AP = -0.29 [95%CI: -0.98; 0.40], S = 0.43 [95%CI: 0.07; 2.60]), all indicating negative interaction. Preventive factors should not be used to calculate measures of interaction on an additive scale without recoding.
PMID: 21344323 [PubMed - as supplied by publisher]
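
For reference, the three additive-scale measures discussed here can be computed directly from the relative risks for each exposure alone and for the joint exposure, relative to the doubly unexposed stratum. The example values are invented, and, as the authors recommend, a preventive factor should first be recoded so that the jointly lowest-risk category is the reference.

```python
# RERI, AP, and S from relative risks RR10, RR01 (each factor alone) and
# RR11 (both factors), relative to the doubly unexposed group.
def additive_interaction(rr10, rr01, rr11):
    reri = rr11 - rr10 - rr01 + 1                  # relative excess risk due to interaction
    ap = reri / rr11                               # attributable proportion
    s = (rr11 - 1) / ((rr10 - 1) + (rr01 - 1))     # synergy index
    return reri, ap, s

print(additive_interaction(rr10=1.8, rr01=1.4, rr11=2.6))   # ~ (0.40, 0.15, 1.33)
```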

    1. Stat Med. 2011 Feb 21. doi: 10.1002/sim.4200. [Epub ahead of print]

Comparing paired vs non-paired statistical methods of analyses when making inferences about absolute risk reductions in propensity-score matched samples.

Austin PC.
Propensity-score matching allows one to reduce the effects of treatment-selection bias or confounding when estimating the effects of treatments when using observational data. Some authors have suggested that methods of inference appropriate for independent samples can be used for assessing the statistical significance of treatment effects when using propensity-score matching. Indeed, many authors in the applied medical literature use methods for independent samples when making inferences about treatment effects using propensity-score matched samples. Dichotomous outcomes are common in healthcare research. In this study, we used Monte Carlo simulations to examine the effect on inferences about risk differences (or absolute risk reductions) when statistical methods for independent samples are used compared with when statistical methods for paired samples are used in propensity-score matched samples. We found that compared with using methods for independent samples, the use of methods for paired samples resulted in: (i) empirical type I error rates that were closer to the advertised rate; (ii) empirical coverage rates of 95 per cent confidence intervals that were closer to the advertised rate; (iii) narrower 95 per cent confidence intervals; and (iv) estimated standard errors that more closely reflected the sampling variability of the estimated risk difference. Differences between the empirical and advertised performance of methods for independent samples were greater when the treatment-selection process was stronger compared with when treatment-selection process was weaker. We recommend using statistical methods for paired samples when using propensity-score matched samples for making inferences on the effect of treatment on the reduction in the probability of an event occurring. Copyright © 2011 John Wiley & Sons, Ltd.

PMID: 21337595 [PubMed - as supplied by publisher]
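
The contrast examined in these simulations shows up directly in the standard error of a risk difference computed from matched pairs. The pair counts below are invented; the paired variance uses the usual discordant-pair formula, while the independent-samples variance ignores the matching.

```python
# Paired vs independent-samples standard errors for a risk difference from
# 1:1 matched pairs with a binary outcome. Pair counts are invented.
import numpy as np

n_pairs = 2000
both = 150            # event in both the treated and the control member
treated_only = 90     # discordant: event in treated only
control_only = 140    # discordant: event in control only

# risk difference is the same under either analysis
rd = (treated_only - control_only) / n_pairs

# paired SE (uses the within-pair correlation via the discordant counts)
se_paired = np.sqrt(treated_only + control_only
                    - (treated_only - control_only) ** 2 / n_pairs) / n_pairs

# independent-samples SE (treats the two matched groups as unrelated)
p1 = (both + treated_only) / n_pairs
p0 = (both + control_only) / n_pairs
se_indep = np.sqrt(p1 * (1 - p1) / n_pairs + p0 * (1 - p0) / n_pairs)

print(f"RD = {rd:.3f}, paired SE = {se_paired:.4f}, independent SE = {se_indep:.4f}")
```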

    1. Int J Epidemiol. 2011 Feb 9. [Epub ahead of print]

Commentary: Can ‘many weak’ instruments ever be ‘strong’? Sheehan NA, Didelez V.
Department of Health Sciences, University of Leicester and Department of Mathematics, University of Bristol, Bristol, UK.
PMID: 21310719 [PubMed - as supplied by publisher]

    1. Stat Methods Med Res. 2011 Feb 7. [Epub ahead of print]

Comparing measurement error correction methods for rate-of-change exposure variables in survival analysis.

Veronesi G, Ferrario MM, Chambless LE.
In this article we focus on comparing measurement error correction methods for rate-of-change exposure variables in survival analysis, when longitudinal data are observed prior to the follow-up time. Motivational examples include the analysis of the association between changes in cardiovascular risk factors and subsequent onset of coronary events. We derive a measurement error model for the rate of change, estimated through subject-specific linear regression, assuming an additive measurement error model for the time-specific measurements. The rate of change is then included as a time-invariant variable in a Cox proportional hazards model, adjusting for the first time-specific measurement (baseline) and an error-free covariate. In a simulation study, we compared bias, standard deviation and mean squared error (MSE) for the regression calibration (RC) and the simulation-extrapolation (SIMEX) estimators. Our findings indicate that when the amount of measurement error is substantial, RC should be the preferred method, since it has smaller MSE for estimating the coefficients of the rate of change and of the variable measured without error. However, when the amount of measurement error is small, the choice of the method should take into account the event rate in the population and the effect size to be estimated. An application to an observational study, as well as examples of published studies where our model could have been applied, are also provided.
PMID: 21300627 [PubMed - as supplied by publisher]
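
As a minimal illustration of the regression-calibration idea (not the authors' rate-of-change survival model), the sketch below uses replicate measurements of an error-prone exposure to estimate a reliability ratio and rescales the naive regression coefficient; all data are simulated and a linear outcome stands in for the survival model to keep the sketch short.

```python
# Regression calibration with two replicates of an error-prone exposure.
import numpy as np

rng = np.random.default_rng(2)
n = 2000
x = rng.normal(size=n)                         # true exposure
w1 = x + rng.normal(scale=0.8, size=n)         # error-prone replicate 1
w2 = x + rng.normal(scale=0.8, size=n)         # error-prone replicate 2
y = 0.5 * x + rng.normal(size=n)               # outcome

wbar = (w1 + w2) / 2
var_err = np.var(w1 - w2, ddof=1) / 2          # measurement-error variance
lam = 1 - (var_err / 2) / np.var(wbar, ddof=1) # reliability of the mean of 2 replicates
beta_naive = np.polyfit(wbar, y, 1)[0]
beta_rc = beta_naive / lam                     # regression-calibration correction
print(round(beta_naive, 3), round(beta_rc, 3))
```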

    1. Clin Trials. 2011 Jan 31. [Epub ahead of print]

Bayesian models for subgroup analysis in clinical trials.

Jones HE, Ohlssen DI, Neuenschwander B, Racine A, Branson M.
BACKGROUND: In a pharmaceutical drug development setting, possible interactions between the treatment and particular baseline clinical or demographic factors are often of interest. However, the subgroup analysis required to investigate such associations remains controversial. Concerns with classical hypothesis testing approaches to the problem include low power, multiple testing, and the possibility of data dredging.
PURPOSE: As an alternative to hypothesis testing, the use of shrinkage estimation techniques is investigated in the context of an exploratory post hoc subgroup analysis. A range of models that have been suggested in the literature are reviewed. Building on this, we explore a general modeling strategy, considering various options for shrinkage of effect estimates. This is applied to a case-study, in which evidence was available from seven phase II-III clinical trials examining a novel therapy, and also to two artificial datasets with the same structure.
METHODS: Emphasis is placed on hierarchical modeling techniques, adopted within a Bayesian framework using freely available software. A range of possible subgroup model structures are applied, each incorporating shrinkage estimation techniques.
RESULTS: The investigation of the case-study showed little evidence of subgroup effects. Because inferences appeared to be consistent across a range of well-supported models, and model diagnostic checks showed no obvious problems, it seemed this conclusion was robust. It is reassuring that the structured shrinkage techniques appeared to work well in a situation where deeper inspection of the data suggested little evidence of subgroup effects.
LIMITATIONS: The post hoc examination of subgroups should be seen as an exploratory analysis, used to help make better informed decisions regarding potential future studies examining specific subgroups. To a certain extent, the degree of understanding provided by such assessments will be limited by the quality and quantity of available data.
CONCLUSIONS: In light of recent interest by health authorities in the use of subgroup analysis in the context of drug development, it appears that Bayesian approaches involving shrinkage techniques could play an important role in this area. Hopefully, the developments outlined here provide useful methodology for tackling such a problem, in turn leading to better informed decisions regarding subgroups.
PMID: 21282293 [PubMed - as supplied by publisher]

    1. Int J Epidemiol. 2011 Jan 13. [Epub ahead of print]

Commentary: Adjusting for bias: a user’s guide to performing plastic surgery on meta-analyses of observational studies.

Ioannidis JP.

Department of Medicine, Stanford Prevention Research Center, Stanford University School of Medicine, MSOB X306, 251 Campus Drive, Stanford, CA 94305, USA. jioannid@stanford.edu.
PMID: 21233141 [PubMed - as supplied by publisher]

    1. Int J Epidemiol. 2010 Dec 23. [Epub ahead of print]

A proposed method of bias adjustment for meta-analyses of published observational studies.
Thompson S, Ekelund U, Jebb S, Lindroos AK, Mander A, Sharp S, Turner R, Wilks D.
MRC Biostatistics Unit, Cambridge, UK, MRC Epidemiology Unit, Cambridge, UK and MRC Human Nutrition Research, Cambridge, UK.
OBJECTIVE: Interpretation of meta-analyses of published observational studies is problematic because of numerous sources of bias. We develop bias assessment, elicitation and adjustment methods, and apply them to a systematic review of longitudinal observational studies of the relationship between objectively measured physical activity and subsequent change in adiposity in children.
METHODS: We separated internal biases that reflect study quality from external biases that reflect generalizability to a target setting. Since published results were presented in different formats, these were all converted to correlation coefficients. Biases were considered as additive or proportional on the correlation scale. Opinions about the extent of each bias in each study, together with its uncertainty, were elicited in a formal process from quantitatively trained assessors for the internal biases and subject-matter specialists for the external biases. Bias-adjusted results for each study were combined across assessors using median pooling, and results combined across studies by random-effects meta-analysis.
RESULTS: Before adjusting for bias, the pooled correlation is difficult to interpret because the studies varied substantially in quality and design, and there was considerable heterogeneity. After adjusting for both the internal and external biases, the pooled correlation provides a meaningful quantitative summary of all available evidence, and the confidence interval incorporates the elicited uncertainties about the extent of the biases. In the adjusted meta-analysis, there was no apparent heterogeneity.
CONCLUSION: This approach provides a viable method of bias adjustment for meta-analyses of observational studies, allowing the quantitative synthesis of evidence from otherwise incompatible studies. From the meta-analysis of longitudinal observational studies, we conclude that there is no evidence that physical activity is associated with gain in body fat.
PMID: 21186183 [PubMed - as supplied by publisher]
Free Full Text: http://ije.oxfordjournals.org/content/early/2010/12/23/ije.dyq248.full.pdf+html
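
A rough sketch of the adjust-then-pool logic: subtract an elicited additive bias from each study's correlation, inflate its variance by the elicited uncertainty about that bias, and combine the adjusted results with a random-effects model. All numbers are invented, and the authors' full method (proportional biases, median pooling across assessors) is omitted.

```python
# Additive bias adjustment followed by DerSimonian-Laird random-effects pooling.
import numpy as np

r        = np.array([0.15, -0.05, 0.30, 0.10])     # reported correlations
var_r    = np.array([0.004, 0.006, 0.010, 0.003])  # sampling variances
bias     = np.array([0.05, 0.00, 0.20, 0.02])      # elicited additive biases
var_bias = np.array([0.002, 0.001, 0.015, 0.001])  # uncertainty about the biases

r_adj = r - bias
v_adj = var_r + var_bias

w = 1 / v_adj
mu_f = np.sum(w * r_adj) / np.sum(w)
q = np.sum(w * (r_adj - mu_f) ** 2)
tau2 = max(0.0, (q - (len(r) - 1)) / (np.sum(w) - np.sum(w**2) / np.sum(w)))
w_re = 1 / (v_adj + tau2)
pooled = np.sum(w_re * r_adj) / np.sum(w_re)
se_pooled = np.sqrt(1 / np.sum(w_re))
print(f"pooled r = {pooled:.3f} (SE {se_pooled:.3f})")
```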

Author Scan

    1. Am J Epidemiol. 2011 Mar 1;173(5):569-77. Epub 2011 Feb 2.

Limitation of Inverse Probability-of-Censoring Weights in Estimating Survival in the Presence of Strong Selection Bias.

Howe CJ, Cole SR, Chmiel JS, Muñoz A.
In time-to-event analyses, artificial censoring with correction for induced selection bias using inverse probability-of-censoring weights can be used to 1) examine the natural history of a disease after effective interventions are widely available, 2) correct bias due to noncompliance with fixed or dynamic treatment regimens, and 3) estimate survival in the presence of competing risks. Artificial censoring entails censoring participants when they meet a predefined study criterion, such as exposure to an intervention, failure to comply, or the occurrence of a competing outcome. Inverse probability-of-censoring weights use measured common predictors of the artificial censoring mechanism and the outcome of interest to determine what the survival experience of the artificially censored participants would be had they never been exposed to the intervention, complied with their treatment regimen, or not developed the competing outcome. Even if all common predictors are appropriately measured and taken into account, in the context of small sample size and strong selection bias, inverse probability-of-censoring weights could fail because of violations in assumptions necessary to correct selection bias. The authors used an example from the Multicenter AIDS Cohort Study, 1984-2008, regarding estimation of long-term acquired immunodeficiency syndrome-free survival to demonstrate the impact of violations in necessary assumptions. Approaches to improve correction methods are discussed.
PMID: 21289029 [PubMed - in process]

    1. Stat Modelling. 2010 Dec;10(4):421-439.

A Bayesian model for repeated measures zero-inflated count data with application to outpatient psychiatric service use.

Neelon BH, O’Malley AJ, Normand SL.
Department of Health Care Policy, Harvard Medical School, Boston, USA.
In applications involving count data, it is common to encounter an excess number of zeros. In the study of outpatient service utilization, for example, the number of utilization days will take on integer values, with many subjects having no utilization (zero values). Mixed-distribution models, such as the zero-inflated Poisson (ZIP) and zero-inflated negative binomial (ZINB), are often used to fit such data. A more general class of mixture models, called hurdle models, can be used to model zero-deflation as well as zero-inflation. Several authors have proposed frequentist approaches to fitting zero-inflated models for repeated measures. We describe a practical Bayesian approach which incorporates prior information, has optimal small-sample properties, and allows for tractable inference. The approach can be easily implemented using standard Bayesian software. A study of psychiatric outpatient service use illustrates the methods.
PMCID: PMC3039917 [Available on 2011/12/1], PMID: 21339863 [PubMed]