Invitation
This page is a companion to our recent editorial, in which we invite readers to suggest a case study illustrating “a problem in medical science that can be usefully solved by propensity scores, but not by other methods” (Stevens & Oke, 2022; Colorectal Disease Volume 24, in press).

For context, consider our role as statistics teachers in evidence-based medicine. Our students are health professionals: anaesthetists, dentists, general practitioners, haematologists, internists, midwives, nurses, psychiatrists, radiologists, surgeons, veterinarians, etc. When we teach propensity scores, we can expect to be asked, “What are the advantages of this over other methods taught on the course?” We would like to be able to give a good example.

We think answers about marginal vs. conditional effect measures are unlikely to persuade students on our evidence-based health care programme. There is no consensus in evidence-based medicine to prefer either marginal or conditional odds ratios: on the contrary, we prefer the absolute risk difference, and this happens to be a collapsible measure (that is, in the absence of confounding, a risk difference that is the same in every stratum of a covariate is also the marginal risk difference).
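The collapsibility point can be checked with a small calculation. The sketch below uses hypothetical risks (they are not taken from any study) in two equally sized strata, and assumes treatment is unrelated to stratum, so there is no confounding. All names and numbers are illustrative assumptions.

```python
# Hypothetical risks, chosen only to illustrate collapsibility.
# Assumption: treatment is independent of stratum (no confounding),
# so marginal risks are simple stratum-share-weighted averages.

def odds(p):
    """Convert a risk (probability) to odds."""
    return p / (1.0 - p)

def risk_difference(risk_untreated, risk_treated):
    return risk_treated - risk_untreated

def odds_ratio(risk_untreated, risk_treated):
    return odds(risk_treated) / odds(risk_untreated)

strata = [
    {"share": 0.5, "risk_untreated": 0.2, "risk_treated": 0.5},
    {"share": 0.5, "risk_untreated": 0.5, "risk_treated": 0.8},
]

# Conditional (stratum-specific) effect measures.
conditional_rd = [risk_difference(s["risk_untreated"], s["risk_treated"]) for s in strata]
conditional_or = [odds_ratio(s["risk_untreated"], s["risk_treated"]) for s in strata]

# Marginal risks: average over strata, then compute the effect measures.
marginal_untreated = sum(s["share"] * s["risk_untreated"] for s in strata)
marginal_treated = sum(s["share"] * s["risk_treated"] for s in strata)
marginal_rd = risk_difference(marginal_untreated, marginal_treated)
marginal_or = odds_ratio(marginal_untreated, marginal_treated)

print("conditional risk differences:", [round(x, 3) for x in conditional_rd])
print("marginal risk difference:    ", round(marginal_rd, 3))
print("conditional odds ratios:     ", [round(x, 3) for x in conditional_or])
print("marginal odds ratio:         ", round(marginal_or, 3))
```

In this toy example the risk difference is 0.3 in each stratum and 0.3 marginally, whereas the odds ratio is 4 in each stratum but about 3.45 marginally, even though nothing is confounded.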

This and other requirements were stated in our article:

Entrants should be careful that their proposed case study is not a comparison of one propensity score method to another; nor a comparison restricted to limited competing methods (e.g. propensity score matching compared to multivariate adjustment); nor an argument based on marginal vs. conditional odds ratios, unless you can first persuade us that the distinction matters.

(Stevens & Oke, 2022; Colorectal Disease Volume 24, in press).

Entries may be sent to Dr Jason Oke at jason.oke@phc.ox.ac.uk.  Before submitting your entry, please consider the paper (not the title and abstract alone) in the light of the requirements listed above.

Here is a list of some of the papers that our colleagues have previously referred to us.

Kurth et al. (2006). Results of multivariable logistic regression, propensity matching, propensity adjustment, and propensity-based weighting under conditions of nonuniform effect. American Journal of Epidemiology, 163(3), 262-270.

In this fascinating paper, the results of the different propensity score methods differ more from each other than from the comparator, multivariable logistic regression. The ability to target different estimands (e.g. ATE vs ATT) by choosing different weighting methods is promising. On the other hand, Kurth et al. write that their findings “... should not be taken as evidence that, compared with other multivariable outcome models, these two methods are a better tool to adjust for covariates in observational research”, noting that if analyses are restricted to the sub-population most likely to be treated, “all adjustment methods gave fairly similar results.”
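To make the ATE vs ATT distinction concrete, here is a minimal sketch of how the choice of weights changes the estimand. It uses the standard inverse-probability weights for the ATE (1/e for treated, 1/(1-e) for untreated, where e is the propensity score) and the standard ATT weights (1 for treated, e/(1-e) for untreated). The two strata, their propensities and their risks are hypothetical and are not taken from Kurth et al.; the propensity score is assumed known rather than estimated.

```python
# Hypothetical population with a non-uniform treatment effect.
# Assumption: the propensity score is known exactly within each stratum.

def ate_weight(treated, propensity):
    """Inverse-probability weight targeting the average treatment effect."""
    return 1.0 / propensity if treated else 1.0 / (1.0 - propensity)

def att_weight(treated, propensity):
    """Weight targeting the average treatment effect in the treated."""
    return 1.0 if treated else propensity / (1.0 - propensity)

strata = [
    # Rarely treated stratum, small effect (risk difference 0.1).
    {"share": 0.5, "propensity": 0.2, "risk_untreated": 0.1, "risk_treated": 0.2},
    # Often treated stratum, larger effect (risk difference 0.3).
    {"share": 0.5, "propensity": 0.8, "risk_untreated": 0.3, "risk_treated": 0.6},
]

def weighted_risk_difference(strata, weight_fn):
    """Weighted mean outcome in the treated minus that in the untreated."""
    sums = {True: [0.0, 0.0], False: [0.0, 0.0]}  # [weighted outcome, weight]
    for s in strata:
        for treated in (True, False):
            # Population share of this (stratum, treatment) cell.
            p_arm = s["propensity"] if treated else 1.0 - s["propensity"]
            mass = s["share"] * p_arm
            risk = s["risk_treated"] if treated else s["risk_untreated"]
            w = mass * weight_fn(treated, s["propensity"])
            sums[treated][0] += w * risk
            sums[treated][1] += w
    return sums[True][0] / sums[True][1] - sums[False][0] / sums[False][1]

ate = weighted_risk_difference(strata, ate_weight)
att = weighted_risk_difference(strata, att_weight)
print("ATE-weighted risk difference:", round(ate, 3))
print("ATT-weighted risk difference:", round(att, 3))
```

With these made-up numbers the ATE-weighted risk difference is 0.20 and the ATT-weighted one is 0.26, because the treated are concentrated in the stratum with the larger effect. Both analyses are correct; they answer different questions.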

Martens et al. (2008). Systematic differences in treatment effect estimates between propensity score methods and logistic regression. International Journal of Epidemiology, 37(5), 1142-1147. 

As an argument for propensity scores over multivariable logistic regression, this paper rests on a preference for marginal odds ratios over conditional odds ratios. See the discussion above and in Stevens and Oke. Interestingly, the simulations in Martens et al. consider only the case in which there are no confounders.

Payet et al. (2021). High-dimensional propensity scores improved the control of indication bias in surgical comparative effectiveness studies. Journal of Clinical Epidemiology, 130, 78-86.

This paper compares different propensity score approaches with each other: specifically, “High-dimensional propensity scores (HdPS)” with propensity scores (PS) of lower dimension.