• Open access
  • Published: 10 November 2020

Case study research for better evaluations of complex interventions: rationale and challenges

  • Sara Paparini   ORCID: orcid.org/0000-0002-1909-2481 1 ,
  • Judith Green 2 ,
  • Chrysanthi Papoutsi 1 ,
  • Jamie Murdoch 3 ,
  • Mark Petticrew 4 ,
  • Trish Greenhalgh 1 ,
  • Benjamin Hanckel 5 &
  • Sara Shaw 1  

BMC Medicine volume 18, Article number: 301 (2020)


The need for better methods for evaluation in health research has been widely recognised. The ‘complexity turn’ has drawn attention to the limitations of relying on causal inference from randomised controlled trials alone for understanding whether, and under which conditions, interventions in complex systems improve health services or the public health, and what mechanisms might link interventions and outcomes. We argue that case study research—currently denigrated as poor evidence—is an under-utilised resource for not only providing evidence about context and transferability, but also for helping strengthen causal inferences when pathways between intervention and effects are likely to be non-linear.

Case study research, as an overall approach, is based on in-depth explorations of complex phenomena in their natural, or real-life, settings. Empirical case studies typically enable dynamic understanding of complex challenges and provide evidence about causal mechanisms and the necessary and sufficient conditions (contexts) for intervention implementation and effects. This is essential evidence not just for researchers concerned about internal and external validity, but also research users in policy and practice who need to know what the likely effects of complex programmes or interventions will be in their settings. The health sciences have much to learn from scholarship on case study methodology in the social sciences. However, there are multiple challenges in fully exploiting the potential learning from case study research. First are misconceptions that case study research can only provide exploratory or descriptive evidence. Second, there is little consensus about what a case study is, and considerable diversity in how empirical case studies are conducted and reported. Finally, as case study researchers typically (and appropriately) focus on thick description (that captures contextual detail), it can be challenging to identify the key messages related to intervention evaluation from case study reports.

Whilst the diversity of published case studies in health services and public health research is rich and productive, we recommend further clarity and specific methodological guidance for those reporting case study research for evaluation audiences.


The need for methodological development to address the most urgent challenges in health research has been well-documented. Many of the most pressing questions for public health research, where the focus is on system-level determinants [ 1 , 2 ], and for health services research, where provisions typically vary across sites and are provided through interlocking networks of services [ 3 ], require methodological approaches that can attend to complexity. The need for methodological advance has arisen, in part, as a result of the diminishing returns from randomised controlled trials (RCTs) where they have been used to answer questions about the effects of interventions in complex systems [ 4 , 5 , 6 ]. In conditions of complexity, there is limited value in maintaining the current orientation to experimental trial designs in the health sciences as providing ‘gold standard’ evidence of effect.

There are increasing calls for methodological pluralism [ 7 , 8 ], with the recognition that complex intervention and context are not easily or usefully separated (as is often the situation when using trial design), and that system interruptions may have effects that are not reducible to linear causal pathways between intervention and outcome. These calls are reflected in a shifting and contested discourse of trial design, seen with the emergence of realist [ 9 ], adaptive and hybrid (types 1, 2 and 3) [ 10 , 11 ] trials that blend studies of effectiveness with a close consideration of the contexts of implementation. Similarly, process evaluation has now become a core component of complex healthcare intervention trials, reflected in MRC guidance on how to explore implementation, causal mechanisms and context [ 12 ].

Evidence about the context of an intervention is crucial for questions of external validity. As Woolcock [ 4 ] notes, even if RCT designs are accepted as robust for maximising internal validity, questions of transferability (how well the intervention works in different contexts) and generalisability (how well the intervention can be scaled up) remain unanswered [ 5 , 13 ]. For research evidence to have impact on policy and systems organisation, and thus to improve population and patient health, there is an urgent need for better methods for strengthening external validity, including a better understanding of the relationship between intervention and context [ 14 ].

Policymakers, healthcare commissioners and other research users require credible evidence of relevance to their settings and populations [ 15 ], to perform what Rosengarten and Savransky [ 16 ] call ‘careful abstraction’ to the locales that matter for them. They also require robust evidence for understanding complex causal pathways. Case study research, currently under-utilised in public health and health services evaluation, can offer considerable potential for strengthening faith in both external and internal validity. For example, in an empirical case study of how the policy of free bus travel had specific health effects in London, UK, a quasi-experimental evaluation (led by JG) identified how important aspects of context (a good public transport system) and intervention (that it was universal) were necessary conditions for the observed effects, thus providing useful, actionable evidence for decision-makers in other contexts [ 17 ].

The overall approach of case study research is based on the in-depth exploration of complex phenomena in their natural, or ‘real-life’, settings. Empirical case studies typically enable dynamic understanding of complex challenges rather than restricting the focus on narrow problem delineations and simple fixes. Case study research is a diverse and somewhat contested field, with multiple definitions and perspectives grounded in different ways of viewing the world, and involving different combinations of methods. In this paper, we raise awareness of such plurality and highlight the contribution that case study research can make to the evaluation of complex system-level interventions. We review some of the challenges in exploiting the current evidence base from empirical case studies and conclude by recommending that further guidance and minimum reporting criteria for evaluation using case studies, appropriate for audiences in the health sciences, can enhance the take-up of evidence from case study research.

Case study research offers evidence about context, causal inference in complex systems and implementation

Well-conducted and described empirical case studies provide evidence on context, complexity and mechanisms for understanding how, where and why interventions have their observed effects. Recognition of the importance of context for understanding the relationships between interventions and outcomes is hardly new. In 1943, Canguilhem berated an over-reliance on experimental designs for determining universal physiological laws: ‘As if one could determine a phenomenon’s essence apart from its conditions! As if conditions were a mask or frame which changed neither the face nor the picture!’ ([ 18 ] p126). More recently, a concern with context has been expressed in health systems and public health research as part of what has been called the ‘complexity turn’ [ 1 ]: a recognition that many of the most enduring challenges for developing an evidence base require a consideration of system-level effects [ 1 ] and the conceptualisation of interventions as interruptions in systems [ 19 ].

The case study approach is widely recognised as offering an invaluable resource for understanding the dynamic and evolving influence of context on complex, system-level interventions [ 20 , 21 , 22 , 23 ]. Empirically, case studies can directly inform assessments of where, when, how and for whom interventions might be successfully implemented, by helping to specify the necessary and sufficient conditions under which interventions might have effects and to consolidate learning on how interdependencies, emergence and unpredictability can be managed to achieve and sustain desired effects. Case study research has the potential to address four objectives for improving research and reporting of context recently set out by guidance on taking account of context in population health research [ 24 ], that is to (1) improve the appropriateness of intervention development for specific contexts, (2) improve understanding of ‘how’ interventions work, (3) better understand how and why impacts vary across contexts and (4) ensure reports of intervention studies are most useful for decision-makers and researchers.

However, evaluations of complex healthcare interventions have arguably not exploited the full potential of case study research and can learn much from other disciplines. For evaluative research, exploratory case studies have had a traditional role of providing data on ‘process’, or initial ‘hypothesis-generating’ scoping, but might also have an increasing salience for explanatory aims. Across the social and political sciences, different kinds of case studies are undertaken to meet diverse aims (description, exploration or explanation) and across different scales (from small N qualitative studies that aim to elucidate processes, or provide thick description, to more systematic techniques designed for medium-to-large N cases).

Case studies with explanatory aims vary in terms of their positioning within mixed-methods projects, with designs including (but not restricted to) (1) single N of 1 studies of interventions in specific contexts, where the overall design is a case study that may incorporate one or more (randomised or not) comparisons over time and between variables within the case; (2) a series of cases conducted or synthesised to provide explanation from variations between cases; and (3) case studies of particular settings within RCT or quasi-experimental designs to explore variation in effects or implementation.

Detailed qualitative research (typically done as ‘case studies’ within process evaluations) provides evidence for the plausibility of mechanisms [ 25 ], offering theoretical generalisations for how interventions may function under different conditions. Although RCT designs reduce many threats to internal validity, the mechanisms of effect remain opaque, particularly when the causal pathways between ‘intervention’ and ‘effect’ are long and potentially non-linear: case study research has a more fundamental role here, in providing detailed observational evidence for causal claims [ 26 ] as well as producing a rich, nuanced picture of tensions and multiple perspectives [ 8 ].

Longitudinal or cross-case analysis may be best suited for evidence generation in system-level evaluative research. Turner [ 27 ], for instance, reflecting on the complex processes in major system change, has argued for the need for methods that integrate learning across cases, to develop theoretical knowledge that would enable inferences beyond the single case, and to develop generalisable theory about organisational and structural change in health systems. Qualitative Comparative Analysis (QCA) [ 28 ] is one such formal method for deriving causal claims, using set theory mathematics to integrate data from empirical case studies to answer questions about the configurations of causal pathways linking conditions to outcomes [ 29 , 30 ].
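The set-theoretic logic behind QCA can be illustrated with a minimal sketch. It assumes the simplest crisp-set form, in which each case is coded 1 or 0 on every condition and on the outcome. The cases, the condition names (`condition_a`, `condition_b`) and the helper functions are hypothetical and are not drawn from any study cited here; the sketch builds a truth table and computes simple consistency scores for necessity and sufficiency, and it omits the logical minimisation step of a full QCA.

```python
from collections import defaultdict

# Hypothetical crisp-set codings: 1 = present, 0 = absent.
# These five cases are invented purely for illustration.
cases = [
    {"condition_a": 1, "condition_b": 1, "outcome": 1},
    {"condition_a": 1, "condition_b": 1, "outcome": 1},
    {"condition_a": 1, "condition_b": 0, "outcome": 0},
    {"condition_a": 0, "condition_b": 1, "outcome": 0},
    {"condition_a": 0, "condition_b": 0, "outcome": 0},
]
CONDITIONS = ("condition_a", "condition_b")


def truth_table(cases, conditions):
    """Group cases by configuration of conditions.

    Returns {configuration: (number of cases, share showing the outcome)}.
    """
    rows = defaultdict(list)
    for case in cases:
        config = tuple(case[name] for name in conditions)
        rows[config].append(case["outcome"])
    return {cfg: (len(outs), sum(outs) / len(outs)) for cfg, outs in rows.items()}


def necessity_consistency(cases, condition):
    """Share of outcome-positive cases in which the condition is present."""
    positives = [c for c in cases if c["outcome"] == 1]
    if not positives:
        return None  # undefined when no case shows the outcome
    return sum(c[condition] for c in positives) / len(positives)


def sufficiency_consistency(cases, conditions):
    """Share of cases with all listed conditions present that show the outcome."""
    members = [c for c in cases if all(c[name] == 1 for name in conditions)]
    if not members:
        return None  # undefined when no case has this configuration
    return sum(c["outcome"] for c in members) / len(members)


if __name__ == "__main__":
    for config, (n, share) in sorted(truth_table(cases, CONDITIONS).items()):
        print(config, n, share)
    print(necessity_consistency(cases, "condition_a"))
    print(sufficiency_consistency(cases, CONDITIONS))
```

In this invented data set, each condition is present in every case that shows the outcome (consistent with necessity), but only the combination of both is always followed by the outcome (consistent with sufficiency). Real applications would also need to consider coverage, calibration of set membership and limited diversity across cases.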

Nonetheless, the single N case study, too, provides opportunities for theoretical development [ 31 ], and theoretical generalisation or analytical refinement [ 32 ]. How ‘the case’ and ‘context’ are conceptualised is crucial here. Findings from the single case may seem to be confined to its intrinsic particularities in a specific and distinct context [ 33 ]. However, if such context is viewed as exemplifying wider social and political forces, the single case can be ‘telling’, rather than ‘typical’, and offer insight into a wider issue [ 34 ]. Internal comparisons within the case can offer rich possibilities for logical inferences about causation [ 17 ]. Further, case studies of any size can be used for theory testing through refutation [ 22 ]. The potential lies, then, in utilising the strengths and plurality of case study to support theory-driven research within different methodological paradigms.

Evaluation research in health has much to learn from a range of social sciences where case study methodology has been used to develop various kinds of causal inference. For instance, Gerring [ 35 ] expands on the within-case variations utilised to make causal claims. For Gerring [ 35 ], case studies come into their own with regard to invariant or strong causal claims (such as X is a necessary and/or sufficient condition for Y) rather than for probabilistic causal claims. For the latter (where experimental methods might have an advantage in estimating effect sizes), case studies offer evidence on mechanisms: from observations of X affecting Y, from process tracing or from pattern matching. Case studies also support the study of emergent causation, that is, the multiple interacting properties that account for particular and unexpected outcomes in complex systems, such as in healthcare [ 8 ].

Finally, efficacy (or beliefs about efficacy) is not the only contributor to intervention uptake, with a range of organisational and policy contingencies affecting whether an intervention is likely to be rolled out in practice. Case study research is, therefore, invaluable for learning about contextual contingencies and identifying the conditions necessary for interventions to become normalised (i.e. implemented routinely) in practice [ 36 ].

The challenges in exploiting evidence from case study research

At present, there are significant challenges in exploiting the benefits of case study research in evaluative health research, which relate to status, definition and reporting. Case study research has been marginalised at the bottom of an evidence hierarchy, seen to offer little by way of explanatory power, if nonetheless useful for adding descriptive data on process or providing useful illustrations for policymakers [ 37 ]. This is an opportune moment to revisit this low status. As health researchers are increasingly charged with evaluating ‘natural experiments’—the use of face masks in the response to the COVID-19 pandemic being a recent example [ 38 ]—rather than interventions that take place in settings that can be controlled, research approaches using methods to strengthen causal inference that does not require randomisation become more relevant.

A second challenge for improving the use of case study evidence in evaluative health research is that, as we have seen, what is meant by ‘case study’ varies widely, not only across but also within disciplines. There is indeed little consensus amongst methodologists as to how to define ‘a case study’. Definitions focus, variously, on small sample size or lack of control over the intervention (e.g. [ 39 ] p194), on in-depth study and context [ 40 , 41 ], on the logic of inference used [ 35 ] or on distinct research strategies which incorporate a number of methods to address questions of ‘how’ and ‘why’ [ 42 ]. Moreover, definitions developed for specific disciplines do not capture the range of ways in which case study research is carried out across disciplines. Multiple definitions of case study reflect the richness and diversity of the approach. However, evidence suggests that a lack of consensus across methodologists results in some of the limitations of published reports of empirical case studies [ 43 , 44 ]. Hyett and colleagues [ 43 ], for instance, reviewing reports in qualitative journals, found little match between methodological definitions of case study research and how authors used the term.

This raises the third challenge we identify: case study reports are typically not written in ways that are accessible or useful for the evaluation research community and policymakers. Case studies may not appear in journals widely read by those in the health sciences, either because space constraints preclude the reporting of rich, thick descriptions, or because of the reported lack of willingness of some biomedical journals to publish research that uses qualitative methods [ 45 ], signalling the persistence of the aforementioned evidence hierarchy. Where they do appear, however, the term ‘case study’ is used to indicate, interchangeably, a qualitative study, an N of 1 sample, or a multi-method, in-depth analysis of one example from a population of phenomena. Definitions of what constitutes the ‘case’ are frequently lacking, and the term appears to be used as a synonym for the settings in which the research is conducted. Despite offering insights for evaluation, the primary aims may not have been evaluative, so the implications may not be explicitly drawn out. Indeed, some case study reports might properly be aiming for thick description without necessarily seeking to inform about context or causality.

Acknowledging plurality and developing guidance

We recognise that definitional and methodological plurality is not only inevitable, but also a necessary and creative reflection of the very different epistemological and disciplinary origins of health researchers, and the aims they have in doing and reporting case study research. Indeed, to provide some clarity, Thomas [ 46 ] has suggested a typology of subject/purpose/approach/process for classifying aims (e.g. evaluative or exploratory), sample rationale and selection and methods for data generation of case studies. We also recognise that the diversity of methods used in case study research, and the necessary focus on narrative reporting, does not lend itself to straightforward development of formal quality or reporting criteria.

Existing checklists for reporting case study research from the social sciences—for example Lincoln and Guba’s [ 47 ] and Stake’s [ 33 ]—are primarily orientated to the quality of narrative produced, and the extent to which they encapsulate thick description, rather than the more pragmatic issues of implications for intervention effects. Those designed for clinical settings, such as the CARE (CAse REports) guidelines, provide specific reporting guidelines for medical case reports about single, or small groups of patients [ 48 ], not for case study research.

The Design of Case Study Research in Health Care (DESCARTE) model [ 44 ] suggests a series of questions to be asked of a case study researcher (including clarity about the philosophy underpinning their research), study design (with a focus on case definition) and analysis (to improve process). The model resembles toolkits for enhancing the quality and robustness of qualitative and mixed-methods research reporting, and it is usefully open-ended and non-prescriptive. However, even if it does include some reflections on context, the model does not fully address aspects of context, logic and causal inference that are perhaps most relevant for evaluative research in health.

Hence, for evaluative research where the aim is to report empirical findings in ways that are intended to be pragmatically useful for health policy and practice, this may be an opportune time to consider how to best navigate plurality around what is (minimally) important to report when publishing empirical case studies, especially with regards to the complex relationships between context and interventions, information that case study research is well placed to provide.

The conventional scientific quest for certainty, predictability and linear causality (maximised in RCT designs) has to be augmented by the study of uncertainty, unpredictability and emergent causality [ 8 ] in complex systems. This will require methodological pluralism, and openness to broadening the evidence base to better understand both causality in, and the transferability of, system change interventions [ 14 , 20 , 23 , 25 ]. Case study research evidence is essential, yet is currently under-exploited in the health sciences. If evaluative health research is to move beyond the current impasse on methods for understanding interventions as interruptions in complex systems, we need to consider in more detail how researchers can conduct and report empirical case studies which do aim to elucidate the contextual factors which interact with interventions to produce particular effects. To this end, supported by the UK’s Medical Research Council, we are embracing the challenge to develop guidance for case study researchers studying complex interventions. Following a meta-narrative review of the literature, we are planning a Delphi study to inform guidance that will, at minimum, cover the value of case study research for evaluating the interrelationship between context and complex system-level interventions; for situating and defining ‘the case’, and generalising from case studies; as well as provide specific guidance on conducting, analysing and reporting case study research. Our hope is that such guidance can support researchers evaluating interventions in complex systems to better exploit the diversity and richness of case study research.

Availability of data and materials

Not applicable (article based on existing available academic publications)

Abbreviations

QCA: Qualitative comparative analysis

QED: Quasi-experimental design

RCT: Randomised controlled trial

Diez Roux AV. Complex systems thinking and current impasses in health disparities research. Am J Public Health. 2011;101(9):1627–34.


Ogilvie D, Mitchell R, Mutrie N, M P, Platt S. Evaluating health effects of transport interventions: methodologic case study. Am J Prev Med 2006;31:118–126.

Walshe C. The evaluation of complex interventions in palliative care: an exploration of the potential of case study research strategies. Palliat Med. 2011;25(8):774–81.

Woolcock M. Using case studies to explore the external validity of ‘complex’ development interventions. Evaluation. 2013;19:229–48.

Cartwright N. Are RCTs the gold standard? BioSocieties. 2007;2(1):11–20.

Deaton A, Cartwright N. Understanding and misunderstanding randomized controlled trials. Soc Sci Med. 2018;210:2–21.

Salway S, Green J. Towards a critical complex systems approach to public health. Crit Public Health. 2017;27(5):523–4.

Greenhalgh T, Papoutsi C. Studying complexity in health services research: desperately seeking an overdue paradigm shift. BMC Med. 2018;16(1):95.

Bonell C, Warren E, Fletcher A. Realist trials and the testing of context-mechanism-outcome configurations: a response to Van Belle et al. Trials. 2016;17:478.

Pallmann P, Bedding AW, Choodari-Oskooei B. Adaptive designs in clinical trials: why use them, and how to run and report them. BMC Med. 2018;16:29.

Curran G, Bauer M, Mittman B, Pyne J, Stetler C. Effectiveness-implementation hybrid designs: combining elements of clinical effectiveness and implementation research to enhance public health impact. Med Care. 2012;50(3):217–26. https://doi.org/10.1097/MLR.0b013e3182408812 .

Moore GF, Audrey S, Barker M, Bond L, Bonell C, Hardeman W, et al. Process evaluation of complex interventions: Medical Research Council guidance. BMJ. 2015 [cited 2020 Jun 27];350. Available from: https://www.bmj.com/content/350/bmj.h1258 .

Evans RE, Craig P, Hoddinott P, Littlecott H, Moore L, Murphy S, et al. When and how do ‘effective’ interventions need to be adapted and/or re-evaluated in new contexts? The need for guidance. J Epidemiol Community Health. 2019;73(6):481–2.

Shoveller J. A critical examination of representations of context within research on population health interventions. Crit Public Health. 2016;26(5):487–500.

Treweek S, Zwarenstein M. Making trials matter: pragmatic and explanatory trials and the problem of applicability. Trials. 2009;10(1):37.

Rosengarten M, Savransky M. A careful biomedicine? Generalization and abstraction in RCTs. Crit Public Health. 2019;29(2):181–91.

Green J, Roberts H, Petticrew M, Steinbach R, Goodman A, Jones A, et al. Integrating quasi-experimental and inductive designs in evaluation: a case study of the impact of free bus travel on public health. Evaluation. 2015;21(4):391–406.

Canguilhem G. The normal and the pathological. New York: Zone Books; 1991. (1949).


Hawe P, Shiell A, Riley T. Theorising interventions as events in systems. Am J Community Psychol. 2009;43:267–76.

King G, Keohane RO, Verba S. Designing social inquiry: scientific inference in qualitative research: Princeton University Press; 1994.

Greenhalgh T, Robert G, Macfarlane F, Bate P, Kyriakidou O. Diffusion of innovations in service organizations: systematic review and recommendations. Milbank Q. 2004;82(4):581–629.

Yin R. Enhancing the quality of case studies in health services research. Health Serv Res. 1999;34(5 Pt 2):1209.


Raine R, Fitzpatrick R, Barratt H, Bevan G, Black N, Boaden R, et al. Challenges, solutions and future directions in the evaluation of service innovations in health care and public health. Health Serv Deliv Res. 2016 [cited 2020 Jun 30];4(16). Available from: https://www.journalslibrary.nihr.ac.uk/hsdr/hsdr04160#/abstract .

Craig P, Di Ruggiero E, Frohlich KL, E M, White M, Group CCGA. Taking account of context in population health intervention research: guidance for producers, users and funders of research. NIHR Evaluation, Trials and Studies Coordinating Centre; 2018.

Grant RL, Hood R. Complex systems, explanation and policy: implications of the crisis of replication for public health research. Crit Public Health. 2017;27(5):525–32.

Mahoney J. Strategies of causal inference in small-N analysis. Sociol Methods Res. 2000;4:387–424.

Turner S. Major system change: a management and organisational research perspective. In: Rosalind Raine, Ray Fitzpatrick, Helen Barratt, Gwyn Bevan, Nick Black, Ruth Boaden, et al. Challenges, solutions and future directions in the evaluation of service innovations in health care and public health. Health Serv Deliv Res. 2016;4(16). https://doi.org/10.3310/hsdr04160.

Ragin CC. Using qualitative comparative analysis to study causal complexity. Health Serv Res. 1999;34(5 Pt 2):1225.

Hanckel B, Petticrew M, Thomas J, Green J. Protocol for a systematic review of the use of qualitative comparative analysis for evaluative questions in public health research. Syst Rev. 2019;8(1):252.

Schneider CQ, Wagemann C. Set-theoretic methods for the social sciences: a guide to qualitative comparative analysis: Cambridge University Press; 2012. 369 p.

Flyvbjerg B. Five misunderstandings about case-study research. Qual Inq. 2006;12:219–45.

Tsoukas H. Craving for generality and small-N studies: a Wittgensteinian approach towards the epistemology of the particular in organization and management studies. Sage Handb Organ Res Methods. 2009:285–301.

Stake RE. The art of case study research. London: Sage Publications Ltd; 1995.

Mitchell JC. Typicality and the case study. Ethnographic research: A guide to general conduct. Vol. 238241. 1984.

Gerring J. What is a case study and what is it good for? Am Polit Sci Rev. 2004;98(2):341–54.

May C, Mort M, Williams T, F M, Gask L. Health technology assessment in its local contexts: studies of telehealthcare. Soc Sci Med 2003;57:697–710.

McGill E. Trading quality for relevance: non-health decision-makers’ use of evidence on the social determinants of health. BMJ Open. 2015;5(4):007053.

Greenhalgh T. We can’t be 100% sure face masks work – but that shouldn’t stop us wearing them | Trish Greenhalgh. The Guardian. 2020 [cited 2020 Jun 27]; Available from: https://www.theguardian.com/commentisfree/2020/jun/05/face-masks-coronavirus .

Hammersley M. So, what are case studies? In: What’s wrong with ethnography? New York: Routledge; 1992.

Crowe S, Cresswell K, Robertson A, Huby G, Avery A, Sheikh A. The case study approach. BMC Med Res Methodol. 2011;11(1):100.

Luck L, Jackson D, Usher K. Case study: a bridge across the paradigms. Nurs Inq. 2006;13(2):103–9.

Yin RK. Case study research and applications: design and methods: Sage; 2017.

Hyett N, A K, Dickson-Swift V. Methodology or method? A critical review of qualitative case study reports. Int J Qual Stud Health Well-Being. 2014;9:23606.

Carolan CM, Forbat L, Smith A. Developing the DESCARTE model: the design of case study research in health care. Qual Health Res. 2016;26(5):626–39.

Greenhalgh T, Annandale E, Ashcroft R, Barlow J, Black N, Bleakley A, et al. An open letter to the BMJ editors on qualitative research. BMJ. 2016;352.

Thomas G. A typology for the case study in social science following a review of definition, discourse, and structure. Qual Inq. 2011;17(6):511–21.

Lincoln YS, Guba EG. Judging the quality of case study reports. Int J Qual Stud Educ. 1990;3(1):53–9.

Riley DS, Barber MS, Kienle GS, Aronson JK, Schoen-Angerer T, Tugwell P, et al. CARE guidelines for case reports: explanation and elaboration document. J Clin Epidemiol. 2017;89:218–35.


Acknowledgements

Not applicable

This work was funded by the Medical Research Council - MRC Award MR/S014632/1 HCS: Case study, Context and Complex interventions (TRIPLE C). SP was additionally funded by the University of Oxford's Higher Education Innovation Fund (HEIF).

Author information

Authors and affiliations.

Nuffield Department of Primary Care Health Sciences, University of Oxford, Oxford, UK

Sara Paparini, Chrysanthi Papoutsi, Trish Greenhalgh & Sara Shaw

Wellcome Centre for Cultures & Environments of Health, University of Exeter, Exeter, UK

Judith Green

School of Health Sciences, University of East Anglia, Norwich, UK

Jamie Murdoch

Public Health, Environments and Society, London School of Hygiene & Tropical Medicine, London, UK

Mark Petticrew

Institute for Culture and Society, Western Sydney University, Penrith, Australia

Benjamin Hanckel


Contributions

JG, MP, SP, JM, TG, CP and SS drafted the initial paper; all authors contributed to the drafting of the final version, and read and approved the final manuscript.

Corresponding author

Correspondence to Sara Paparini .

Ethics declarations

Ethics approval and consent to participate, consent for publication, competing interests.

The authors declare that they have no competing interests.

Additional information

Publisher’s note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ . The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article.

Paparini, S., Green, J., Papoutsi, C. et al. Case study research for better evaluations of complex interventions: rationale and challenges. BMC Med 18 , 301 (2020). https://doi.org/10.1186/s12916-020-01777-6

Received : 03 July 2020

Accepted : 07 September 2020

Published : 10 November 2020

DOI : https://doi.org/10.1186/s12916-020-01777-6

  • Qualitative
  • Case studies
  • Mixed-method
  • Public health
  • Health services research
  • Interventions

BMC Medicine

ISSN: 1741-7015

What is a case study?

Volume 21, Issue 1
  • Roberta Heale 1 ,
  • Alison Twycross 2
  • 1 School of Nursing , Laurentian University , Sudbury , Ontario , Canada
  • 2 School of Health and Social Care , London South Bank University , London , UK
  • Correspondence to Dr Roberta Heale, School of Nursing, Laurentian University, Sudbury, ON P3E2C6, Canada; rheale{at}laurentian.ca

https://doi.org/10.1136/eb-2017-102845

What is it?

Case study is a research methodology, typically seen in social and life sciences. There is no one definition of case study research. 1 However, very simply… ‘a case study can be defined as an intensive study about a person, a group of people or a unit, which is aimed to generalize over several units’. 1 A case study has also been described as an intensive, systematic investigation of a single individual, group, community or some other unit in which the researcher examines in-depth data relating to several variables. 2

Often there are several similar cases to consider such as educational or social service programmes that are delivered from a number of locations. Although similar, they are complex and have unique features. In these circumstances, the evaluation of several, similar cases will provide a better answer to a research question than if only one case is examined, hence the multiple-case study. Stake asserts that the cases are grouped and viewed as one entity, called the quintain. 6 ‘We study what is similar and different about the cases to understand the quintain better’. 6

The steps when using case study methodology are the same as for other types of research. 6 The first step is defining the single case or identifying a group of similar cases that can then be incorporated into a multiple-case study. A search to determine what is known about the case(s) is typically conducted. This may include a review of the literature, grey literature, media, reports and more, which serves to establish a basic understanding of the cases and informs the development of research questions. Data in case studies are often, but not exclusively, qualitative in nature. In multiple-case studies, analysis within cases and across cases is conducted. Themes arise from the analyses and assertions about the cases as a whole, or the quintain, emerge. 6

Benefits and limitations of case studies

If a researcher wants to study a specific phenomenon arising from a particular entity, then a single-case study is warranted and will allow for an in-depth understanding of the single phenomenon and, as discussed above, would involve collecting several different types of data. This is illustrated in example 1 below.

Using a multiple-case research study allows for a more in-depth understanding of the cases as a unit, through comparison of similarities and differences of the individual cases embedded within the quintain. Evidence arising from multiple-case studies is often stronger and more reliable than from single-case research. Multiple-case studies allow for more comprehensive exploration of research questions and theory development. 6

Despite the advantages of case studies, there are limitations. The sheer volume of data is difficult to organise and data analysis and integration strategies need to be carefully thought through. There is also sometimes a temptation to veer away from the research focus. 2 Reporting of findings from multiple-case research studies is also challenging at times, 1 particularly in relation to the word limits for some journal papers.

Examples of case studies

Example 1: nurses’ paediatric pain management practices.

One of the authors of this paper (AT) has used a case study approach to explore nurses’ paediatric pain management practices. This involved collecting several datasets:

Observational data to gain a picture about actual pain management practices.

Questionnaire data about nurses’ knowledge about paediatric pain management practices and how well they felt they managed pain in children.

Questionnaire data about how critical nurses perceived pain management tasks to be.

These datasets were analysed separately and then compared 7–9 and demonstrated that nurses’ level of theoretical knowledge did not impact on the quality of their pain management practices. 7 Nor did individual nurses’ perceptions of how critical a task was affect the likelihood of them carrying out this task in practice. 8 There was also a difference in self-reported and observed practices 9 ; actual (observed) practices did not conform to best practice guidelines, whereas self-reported practices tended to.

Example 2: quality of care for complex patients at Nurse Practitioner-Led Clinics (NPLCs)

The other author of this paper (RH) has conducted a multiple-case study to determine the quality of care for patients with complex clinical presentations in NPLCs in Ontario, Canada. 10 Five NPLCs served as individual cases that, together, represented the quintain. Three types of data were collected including:

Review of documentation related to the NPLC model (media, annual reports, research articles, grey literature and regulatory legislation).

Interviews with nurse practitioners (NPs) practising at the five NPLCs to determine their perceptions of the impact of the NPLC model on the quality of care provided to patients with multimorbidity.

Chart audits conducted at the five NPLCs to determine the extent to which evidence-based guidelines were followed for patients with diabetes and at least one other chronic condition.

The three sources of data collected from the five NPLCs were analysed and themes arose related to the quality of care for complex patients at NPLCs. The multiple-case study confirmed that nurse practitioners are the primary care providers at the NPLCs, and this positively impacts the quality of care for patients with multimorbidity. Healthcare policy, such as lack of an increase in salary for NPs for 10 years, has resulted in issues in recruitment and retention of NPs at NPLCs. This, along with insufficient resources in the communities where NPLCs are located and high patient vulnerability at NPLCs, has a negative impact on the quality of care. 10

These examples illustrate how collecting data about a single case or multiple cases helps us to better understand the phenomenon in question. Case study methodology serves to provide a framework for evaluation and analysis of complex issues. It shines a light on the holistic nature of nursing practice and offers a perspective that informs improved patient care.

  • Gustafsson J
  • Calanzaro M
  • Sandelowski M

Competing interests None declared.

Provenance and peer review Commissioned; internally peer reviewed.

  • Research article
  • Open access
  • Published: 14 November 2019

Organisational change in hospitals: a qualitative case-study of staff perspectives

  • Chiara Pomare   ORCID: orcid.org/0000-0002-9118-7207 1 ,
  • Kate Churruca 1 ,
  • Janet C. Long 1 ,
  • Louise A. Ellis 1 &
  • Jeffrey Braithwaite 1  

BMC Health Services Research volume  19 , Article number:  840 ( 2019 ) Cite this article

Background

Organisational change in health systems is common. Success is often tied to the actors involved, including their awareness of the change, personal engagement and ownership of it. In many health systems, one of the most common changes we are witnessing is the redevelopment of long-standing hospitals. However, we know little about how hospital staff understand and experience such potentially far-reaching organisational change. The purpose of this study is to explore the understanding and experiences of hospital staff in the early stages of organisational change, using a hospital redevelopment in Sydney, Australia as a case study.

Methods

Semi-structured interviews were conducted with 46 clinical and non-clinical staff working at a large metropolitan hospital. Hospital staff were moving into a new building, not moving, or had moved into a different building two years prior. Questions asked staff about their level of awareness of the upcoming redevelopment and their experiences in the early stage of this change. Qualitative data were analysed using thematic analysis.

Results

Some staff expressed apprehension and held negative expectations regarding the organisational change. Concerns included inadequate staffing and potential for collaboration breakdown due to new layout of workspaces. These fears were compounded by current experiences of feeling uninformed about the change, as well as feelings of being fatigued and under-staffed in the constantly changing hospital environment. Nevertheless, balancing this, many staff reported positive expectations regarding the benefits to patients of the change and the potential for staff to adapt in the face of this change.

Conclusions

The results of this study suggest that it is important to understand prospectively how actors involved make sense of organisational change, in order to potentially assuage concerns and alleviate negative expectations. Throughout the processes of organisational change, such as a hospital redevelopment, staff need to be engaged, adequately informed, trained, and to feel supported by management. The use of champions of varying professions and lead departments, may be useful to address concerns, adequately inform, and promote a sense of engagement among staff.

Background

Change is a common experience in complex health care systems. Staff, patients and visitors come and go [ 1 ]; leadership, models of care, workforce and governing structures are reshaped in response to policy and legislative change [ 2 ], and new technologies and equipment are introduced or retired [ 3 ]. In addition to these common changes experienced throughout health care, the acute sector in many countries is constantly undergoing major changes to the physical hospital infrastructure [ 4 , 5 ]. In New South Wales, Australia, several reports have described the increase in hospital redevelopment projects as a ‘hospital building boom’ [ 4 , 6 ], with approximately 100 major health capital projects (i.e., projects over AUD$10 million) currently in train [ 7 ]. In addition to meeting the needs of a growing and ageing population [ 8 ], the re-design and refurbishment of older hospital infrastructure is supported by a range of arguments and anecdotal evidence highlighting the positive relationship between the hospital physical environment and patient [ 9 ] and staff outcomes [ 10 ]. While there are many reasons why hospital redevelopments are taking place, we know little about how hospital staff prospectively perceive change, and their experiences, expectations, and concerns. Hospital staff encapsulates any employee working in the hospital context. This includes clinical and non-clinical staff who provide care, support, cleaning, catering, managerial and administrative duties to patients and the broader community.

One reason why little research has explored the perspectives of hospital staff during a redevelopment may be that hospital redevelopment is often considered a physical, rather than organisational, change. Organisational change means that not only is the physical environment altered, but behavioural operations, structural relationships and roles, and the hospital’s organisational culture may also be transformed. For example, changing the physical health care environment can affect job satisfaction, stress, intention to leave [ 11 ], and the way staff work together [ 12 ].

Redeveloping a hospital can be both an exciting and challenging time for staff. In a recent notable example of opening a new hospital building in Australia, staff attitudes shifted from appreciation and excitement in the early stages of change to frustration and angst as the development progressed [ 13 ]. Similar experiences have been reported elsewhere, such as in a study describing the consequences for staff of hospital change in South Africa [ 14 ]. However, these examples explored staff attitudes towards change retrospectively and considered the change as a physical redevelopment, rather than organisational change. Such retrospective reports may be limited in validity [ 15 ] as prospective experiences and understanding of change reported by staff may be conflated with the final outcome of the change. The hospital redevelopment literature has also prospectively assessed health impacts of proposed redevelopment plans as a means to predetermine the impact of a large change on the population [ 16 ]; while prospective, this research again considers redevelopment as a physical modification, rather than an organisational change. Thus, while the literature has reported retrospective accounts of staff experiences in large hospital change and prospective assessment of the impact of the change, there is little research examining the understanding and experiences of staff in the early stages of redevelopmental change in hospitals through a lens of organisational change.

Seminal research in the organisational change literature highlights that the role of frontline workers (in this case hospital staff) is crucial to implementation of any process or change [ 17 , 18 ]. Specifically, the support of actors (their understanding, ownership, and engagement) can determine the success of a change [ 19 ]. This is consistent with complexity science accounts which suggest that any improvement and transformation of health systems is dependent upon the actors involved, and the extent and quality of their interactions, their emergent behaviours, and localised responses [ 1 , 20 ]. In health care, change can be resisted when it is imposed on actors (in this case, hospital staff), but may be better accepted when people are involved and adopt a sense of ownership of the changes that will affect them [ 21 ]. This may include being involved in the design process. For this reason, it is important to examine the understanding and experiences of actors involved in a change (i.e., hospital staff in a redevelopment), in order to understand and potentially address their concerns, alleviating negative expectations prior to the change.

This study is part of a larger project exploring how hospital redevelopment influences the organisation, staff and patients involved [ 22 ]. The present study aimed to explore the understanding and experiences of staff prior to moving into a new building as one stage in a multidimensional organisational change project. The research questions were: How do staff make sense of this organisational change? How well informed do they feel? What are their expectations and concerns? What are the implications for hospitals undergoing organisational change, particularly redevelopment?

Methods

The study protocol has been published elsewhere [ 22 ]. The Consolidated criteria for reporting qualitative research (COREQ) guidelines were used to ensure comprehensive reporting of the qualitative study results (Additional file 1) [ 23 ].

Study setting and participants

This study was conducted at a large metropolitan, publicly-funded hospital in Sydney, Australia. The facility is undergoing a multimillion-dollar development project to meet the growing needs of the community. This hospital has undergone a number of other changes over the last two decades, including incremental increases in size. Since its opening in the mid 1990s (with approximately 150 beds), several buildings have been added over the years. The hospital now has multiple buildings and over 500 beds.

During the time of this study, the hospital was in the second stage of the multi-stage redevelopment. This stage included: the opening of a new acute services building, the relocation of several wards to this new building (e.g., Intensive care unit (ICU) and Maternity), increases in resources (e.g., equipment, staffing), and the adoption of new ways of working (e.g., activity-based workspaces for support staff). Essentially, the redevelopment involves the opening of a new state-of-the-art building which will include moving services (and staff) from the old to the new building, with some wards staying in the old building. For the wards moving into the new building, this change does not initially involve more patients in existing services, but is intended to increase the number of staff because there will be more physical space to cover and new models of care introduced (e.g., ICU changing to single-bed rooms, more staff needed to individually attend to patients). The current redevelopment includes space for future expansion to account for the growing population. In addition to the redevelopment of the physical infrastructure, the way staff work together is also planned to change. Hospital leadership is aiming to foster a cultural shift towards greater cohesion and unity; highlighting that the hospital redevelopment can be conceptualised as an organisational change of considerable importance and magnitude.

Participants were hospital staff (clinical and non-clinical) working at the hospital under investigation. Staff working on four wards were targeted for interviews, with the intention to capture diverse experiences of the redevelopment and the broader organisational change; two of these wards would be moving into the new building (ICU and Maternity), one ward was not moving (Surgical), and one ward had moved into a new building two years prior (Respiratory). Interviews were also conducted with staff who held responsibilities across wards (e.g., General Services Department: cleaners, porters). The hospital staff were purposively recruited by department heads and snowballed from participants. Fifty staff members were approached (until data saturation was met) with four refusing to participate because they did not have the time.

Semi-structured interviews

Semi-structured interviews were conducted in private settings at the participants’ place of work (e.g., ward interview rooms, private offices). In the event a participant was unable to meet the researcher in person, interviews were conducted over the phone. A semi-structured interview guide was created in collaboration with key stakeholders from the hospital under investigation and following a literature review. The guide (Additional file 2 ) included questions aimed at exploring participants’: (1) understanding of the hospital’s culture and current ways of working; (2) understanding of the redevelopment and other hospital changes; and (3) concerns or expectations about the organisational change. The interviews were audio-taped and transcribed verbatim by the first author who is trained and experienced in conducting semi-structured interviews. No field notes were made during the interview nor were transcripts returned to participants for comment or correction due to the time poor characteristics of the study participants (hospital staff). Participants were informed that the research was part of the first author’s doctoral studies.

Data analysis

Interview data were analysed via thematic analysis [ 24 ] using NVivo [ 25 ]. This approach followed Braun and Clarke’s (2006) six phases of thematic analysis: familiarise, generate initial codes, develop themes, review potential themes, define and name themes, produce the report. Data were initially read multiple times by the first author, then descriptively and iteratively coded according to semantic features. The analysis included the use of inductive coding to identify patterns driven by the data, together with deductive coding, keeping the research questions in mind. Through examination of codes and coded data, themes were developed. The broader research team (KC, LAE, JCL) were included throughout each stage of the analysis process, with frequent discussions concerning the categorization of codes and themes. This process of having one researcher responsible for the analysis while other researchers then checked and clarified emerging themes throughout contributes to the trustworthiness of the findings [ 26 ].

In presenting the results, extracts have been edited minimally to enhance readability, without altering meaning or inference. Where extracts are presented, staff are coded according to their department (G: General – works across several wards; ICU: Intensive care unit; MAT: Maternity ward; RES: Respiratory ward; SUR: Surgical ward) and profession (AD: Administrative staff; CHGTEAM: Change management team staff; DR: Medical staff; GS: General services staff; MW: Midwifery staff; N: Nursing staff; OTH: Other profession).
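The coding scheme above can be read mechanically. The following is a minimal sketch, not the authors' tooling: it assumes each label is an optional department prefix, then a profession code, then a participant number. The function name `parse_participant_code` and the rule that a label such as GS1 or CHGTEAM2 carries no department prefix are assumptions made for illustration.

```python
import re
from typing import NamedTuple, Optional

# Department and profession codes as listed in the text.
DEPARTMENTS = {
    "G": "General (works across several wards)",
    "ICU": "Intensive care unit",
    "MAT": "Maternity ward",
    "RES": "Respiratory ward",
    "SUR": "Surgical ward",
}
PROFESSIONS = {
    "AD": "Administrative staff",
    "CHGTEAM": "Change management team staff",
    "DR": "Medical staff",
    "GS": "General services staff",
    "MW": "Midwifery staff",
    "N": "Nursing staff",
    "OTH": "Other profession",
}


class ParticipantCode(NamedTuple):
    department: Optional[str]  # None when the label has no department prefix
    profession: str
    number: int


def parse_participant_code(code: str) -> ParticipantCode:
    """Split a label such as 'ICUN4' into department, profession and number.

    Hypothetical helper: the prefix-matching rule below is an assumption,
    not something the article specifies.
    """
    match = re.fullmatch(r"([A-Z]+)(\d+)", code)
    if match is None:
        raise ValueError(f"Unrecognised participant code: {code!r}")
    letters, number = match.group(1), int(match.group(2))
    # Try the longest department prefixes first, then no prefix at all.
    candidates = sorted(DEPARTMENTS, key=len, reverse=True) + [""]
    for dept in candidates:
        rest = letters[len(dept):]
        if letters.startswith(dept) and rest in PROFESSIONS:
            return ParticipantCode(dept or None, rest, number)
    raise ValueError(f"Unrecognised participant code: {code!r}")


if __name__ == "__main__":
    for label in ["ICUN4", "GN3", "GS1", "CHGTEAM2", "MATMW2"]:
        print(label, parse_participant_code(label))
```

Under these assumptions, ICUN4 reads as a nurse in the intensive care unit, while GS1 is treated as a general services staff member with no department prefix.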

Results

Forty-six staff members participated in the semi-structured interviews. Interviews were typically conducted face-to-face (n = 41; 89.1%), with five interviews conducted over the phone. No differences were discerned in content between these different mediums. Hospital staff taking part in interviews included those from: nursing and midwifery, medical, general services, administrative, and change management (Table 1). Change management staff are external to the hospital staff, and do not report to hospital executives. Interviews ranged from seven to 33 min in length (M = 17 min). Participating staff had worked at the hospital for an average of 10.5 years (range 5 months to 30 years).
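The recruitment and interview-mode figures reported in the text can be cross-checked with simple arithmetic. The short sketch below uses only the counts stated in the article (50 approached, 4 refusals, 41 face-to-face and 5 phone interviews). The function name `summarise_recruitment` is hypothetical.

```python
def summarise_recruitment(approached: int, refused: int,
                          face_to_face: int, phone: int) -> dict:
    """Recompute participation counts and percentages from reported totals."""
    participated = approached - refused
    # The two interview modes should account for every participant.
    if face_to_face + phone != participated:
        raise ValueError("Interview modes do not sum to the participant total")
    return {
        "participated": participated,
        "participation_pct": round(100 * participated / approached, 1),
        "face_to_face_pct": round(100 * face_to_face / participated, 1),
        "phone_pct": round(100 * phone / participated, 1),
    }


if __name__ == "__main__":
    # Counts as reported in the article.
    print(summarise_recruitment(approached=50, refused=4,
                                face_to_face=41, phone=5))
```

With these inputs the face-to-face share comes out at 89.1%, matching the figure given in the text.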

Five themes were identified related to hospital staff’s understanding and experiences (i.e., expectations and concerns) of the change: staffing; benefits to patients; collaboration; fatigue; and adaptability. These expectations and concerns are schematically presented in Fig.  1 , with shades of red indicating negative expectations and concerns associated with the theme, and green representing positive expectations. Intensity of the colour demonstrated the frequency of positivity or negativity associated with that theme (i.e., deeper shades of red indicate frequency of negative discussion of this theme by different hospital staff). This figure also highlights the complexity and interrelatedness of these themes (e.g., the concern of inadequate staffing for the new building was linked with concerns about patient care, which could possibly impede the way the team work together, leading to staff feeling overworked and worn out; these expectations were all mitigated by the staff member’s understanding and awareness of the change). Explanations and examples are presented below.

Fig. 1 Thematic visualisation of staff understanding and expectations of the change

Staffing

Hospital staff consistently held staffing to be a major concern in this redevelopment. To them, the opening of the new building, and with it the increase in physical size and addition of new services, meant that an increase in staff was crucial to successfully implement the change: “ My biggest uncertainty at the moment is the fact that I’m really concerned about whether I’m actually going to get enough staff ” (GS1). Many participants suggested that this issue would determine the success of the new hospital building. This was particularly important for staff moving into the new building with a bigger work space: “ We just need more staff. Yeah I think that’s the main issue - if we fix that then I believe everything should be smooth ” (ICUN4). For the most part, staff were unaware of how many new staff they would have in the new building. This uncertainty involved two related issues: (1) will we get the budget for new staff that we need? And if so, (2) where will we find all these new staff to employ?

On the first point, staff reported concerns that they would not have enough staff to cover the increased physical space and new ways of working within the new building. This lingering uncertainty was the result of external factors, specifically unresolved budget issues: “ But I suppose some of the issues stem from the fact that you never know how many beds we are able to open based on the funding from the government, and that is what is still up in the air ” (ICUDR1).

Regarding the second point, staff noted that even if budgetary issues were resolved, and there was enough money to hire new staff to fill the new building, a challenge would be finding the staff to recruit: “ I don’t know where these new staff are going to come from” (GN3). Some participants suggested that they already encountered difficulties with employing enough appropriately qualified staff and reported concerns that this issue would be compounded when they moved into the new building: “ Excitement will be way gone. It’s more to deal with that stress and the workload of other staff ” (ICUN4). Participants working on wards that were not moving into the new building also reported concerns about staffing. They noted that, despite not being directly involved in previous stages of the redevelopment, they had still been affected by these changes, because their colleagues were taken from their ward without consultation and moved into a new area. Hence, even staff not moving in the next stage of the redevelopment had concerns that their staffing levels would be affected: “ We have been told that we are not moving in there. And hopefully they don’t take our staff there ” (SURN5).

Benefits to patients

Many hospital staff expressed a positive expectation of the move related to benefits for patients. This was consistent across wards, departments and professions. Staff expected patients to experience benefits including reductions in infection rates and improved satisfaction, due to staying in a well-controlled and physically appealing environment with natural light: “ Any new place will give some joy or some happiness to people… The major change will be that because there are individual rooms, the infection rate will be lower and that I’m very pleased with” (ICUDR1).

Despite these participants reporting the improved physical environment was expected to positively affect patients, they also raised concerns that being in the new building might negatively affect patient safety because the increased physical space could introduce more room for error with the greater workload: “ Brings with it the fear, of how will we treat so many patients with nursing when you have one to one and the rooms are closed. That is a constant worry ” (ICUDR1). Participants indicated that this issue would be compounded if staffing levels were not increased.

Collaboration

Staff expressed multiple negative expectations or concerns about how their ways of working together would be affected by moving into the new building. Staff understood the change as more than just a physical expansion, but as an organisational change that would affect their ways of working. This understanding led to concern regarding how to work together in the new building. Specifically, staff moving into the new building were worried about the new layout of ICU, where nurses would be working alone in rooms with single patients. This would disrupt their ability to easily ask for support currently done by asking the nurse at an adjacent bed, or signalling to someone visible across the room: “ Single rooms are great for patients and everything but I think it becomes a bit more isolated for staffing ” (ICUOTH1). These concerns were also recognised among staff working in the change management team, who may not be directly affected by the change, but acknowledged that this is a major consequence of the move into the new building: “ All the beds, they were able to see each other all the time whereas now it’s a different work environment. They’re a bit more isolated… So that’s what we find is the challenge” (CHGTEAM2). Further, staff were concerned about working in open plan spaces that limit opportunity for private discussions, for example with other staff about workplace conflict or personal matters: “ I’m very concerned about insufficient space for private stuff ” (ICUAD1).

Staff reported negative expectations of collaboration breakdown not only within wards, but across the hospital. The organisational change will include far-flung staff and expanded infrastructure, which may decrease opportunity to collaborate directly. For several participants, the growing size of the hospital was seen as a fracturing of the positive, cohesive culture of what was once a smaller hospital—“ It used to be that the general manager would walk through and know everybody by name, the cleaner, maintenance crew, everybody knew everybody’s name ” (GN1)—into more disconnected, subunits: “ Now we’re very separate ” (ICUOTH1).

Fatigue

During interviews, many participants reported feeling over-worked and under-resourced. While some described being fatigued and unhappy at work, the redevelopment was, nevertheless, clearly a positive: “ We’re not happy because we’re under so much pressure and stress. But, you know, we are looking forward to the new build, it’ll be a beautiful building” (GN3). For others, there were concerns that their feelings of being over-worked would not subside with the opening of the new hospital building and that there was a lack of time to even consider the change. This was expressed by staff moving into the new building, as well as those not moving:

Who has got the time to go and look at those decorative things! (SURN5).
I can’t see how it will make a big difference to me… I don’t pay a lot of attention to the looks (MATDR1).
It doesn’t really matter… I could be providing it [patient care] in a tent or a building (MATMW2).

Further, hospital staff expressed frustration in having to endure poor resourcing, which tempered their excitement for the new building: “We’ve all put up with whatever since whenever and I’m done, I’m so done” (ICUAD1). Some participants reported negative expectations related to the increase in physical space in the new building, as adding to the work load of clinical staff and requiring they travel further to get supplies and attend to patients: “They are worried about, hang on I’m going to have to do so many more laps” (ICUAD1). Similarly, an issue expressed on behalf of staff in the General Services Department was whether they will be able to adequately clean and cater for physically larger areas: “ I’m sitting here and looking at [a previous building that was opened] and seeing how filthy it is ” (CHGTEAM3). Concerns about being over-worked in the face of the redevelopment were further emphasised by some interviewees who discussed a problem with turnover: “ We’ve actually had a few people, I have had three people, which is unusual for us, who have looked for other jobs and are probably resigning. You know which is sort of the opposite of what we’d expect at this time, we’d expect they’d be excited for the new building ” (ICUN5). However, most staff in more junior positions had not seen the new building and thus were unaware of the layout and the degree to which it may impact their work: “ Because I have not seen the actual structure of the area, and I don’t know what they based it on and how they figured out a way to be friendly for both staff and patients at the same time ” (ICUN3). The unawareness and lack of understanding accentuated concerns and negative expectations among staff as they expected the worst.

Also contributing to reports of fatigue, staff described numerous other large changes taking place at the hospital over the years, in addition to the redevelopment: “Basically for seven years we’ve been undergoing changes since I’ve been here. It is utterly exhausting having this many changes all the time” (GS1). This highlights that while this study captures prospective insights into the change, change is constant in health care. While the move into the new building has not yet occurred, the move is part of a broader organisational change grander than the physical expansion of infrastructure. While this was a major concern for many staff, some of the senior medical staff dismissed it as an issue, suggesting constant change is part of health care and should not lead to staff feeling worn out: “I think once you get to my level you get good at kind of jumping through hoops… As you get more experienced, you just go with the flow a bit more” (SURDR2).

Adaptability

An additional theme involved staff’s positive expectation that they would be able to adapt to the changes brought about by the move into the new building. Reflecting on past experiences of organisational and infrastructure changes at the hospital, staff expressed that it could take time to adapt and see the benefits of the change: “At the beginning, of course, everybody was scared of the changes and stuff like that, but eventually we got used to it” (SURN3). However, some staff reported that they saw adapting to the new building as a concern, potentially because of a lack of knowledge pertaining to what the new building entails: “I just don’t know. I’m worried because I don’t know what we’re walking in to” (ICUN2). In general, staff expressed an understanding of the change as one of physical growth (hospital redevelopment) and changes in ways of working (organisational change): “Getting bigger. So, basically taking all of our acute services and putting it in a brand spanking new building where they’re significantly expanding” (GN1); “The biggest change is changing the way they work. Changing the way they deliver care” (CHGTEAM2). When asked why the change was happening, hospital staff were consistent in attributing the need for redevelopment to population growth: “To develop more resources to accommodate for the growing number of patients” (SURDR3).

Feeling uninformed and uncertain about the change was expressed by staff of different professions and different levels throughout the hospital. In fact, even wards that were not moving to the new building were unsure if this was the case: “ There’s been no communication from anyone really. I hear from different people yes we are moving and then somebody says no we’re not. We’re staying here in the old building. So, I’m not sure exactly who’s going” (SURN1).

Our findings suggest that in the early stages of hospital redevelopment, staff experience both positive and negative expectations that are dependent upon the level of personal understanding, awareness of the change to come, and how well-resourced they already feel. Interviews with hospital staff highlighted a general understanding of the change as involving physical expansion of the hospital. However, participants also reported feeling inadequately informed about what is to come and described a range of sometimes differing expectations about the organisational effects of this change (e.g., on collaboration, for patients). This supports the conceptualisation of hospital redevelopment as not only a physical change, but an organisational one too.

The present study is the first to empirically explore the experiences and understanding of staff in the early stages of a hospital redevelopment, and to conceptualise this as an organisational change. This conceptualisation is an important contribution to the organisational change literature because we show that change, even when based on the best evidence-based design, can be disappointing and bring about negative experiences for staff. The concerns and negative expectations of the change expressed by staff in the present study echo past research that retrospectively explored the experiences of staff during a hospital change, in Australia [ 13 ], and elsewhere [e.g., 14]. In the present study, staffing was a major concern reported by hospital staff. This is consistent with other reports of hospital redevelopment in the Australian context. For example, in a report into the opening of a new children’s hospital, staff were frustrated about the progression of the change and that a lack of staffing impacted on service planning. Staffing was also emphasised as an issue in another Australian hospital redevelopment project, where the building opened with insufficient staffing and resources [ 27 ]. Additionally, hospital staff in the present study indicated that they felt fatigued, so much so that excitement for the opening of the new building was diminishing. Low staff morale in hospital redevelopment projects has also been documented in other Australian and international studies [ 13 , 14 ]. Further, participants in this study reported a lack of awareness of the redevelopment, which appears to be common: a report of hospital revitalisation in the United States described a similar finding [ 28 ].

One source of many of the issues expressed by staff was uncertainty, a common and often inevitable experience in health care [ 29 ], for example, systems uncertainty about staffing levels and uncertainty about whether collaboration and support would break down as the hospital expands. While some types of uncertainty cannot be eradicated, it is important to manage uncertainty in times where information is available. One way to do this is to make sure front-line actors have a platform to seek information and ask questions during organisational change; having access to information is a predictor of success for organisational change in healthcare [ 30 ]. This may help alleviate stress associated with change and make the transition period less uncertain for staff, particularly in early stages where uncertainty may be greater. While it is not always possible for all the concerns and expectations of staff to be individually acknowledged and addressed by those coordinating the change (e.g., change management team or hospital executives), an alternative is through the use of ‘champions’ or ‘opinion leaders’. Opinion leaders are actors with a brokerage role; they carry information across social boundaries, such as between groups of professionals or different hospital wards [ 31 ]. Otherwise referred to as a ‘champion’, by virtue of their trustworthiness and connectedness, these actors are able to lead the opinions of others and are integral in the adoption and diffusion of new phenomena. Successful champions are enthusiastic and motivated about the change they are promoting [ 32 ]. In this case, a successful champion in a hospital undergoing organisational change is a staff member who can inform others and influence acceptance, and provide a positive frame for the change.

Implications

While findings may be localised to the hospital we researched, it is important to note that the hospital redevelopment under investigation is similar to other hospital redevelopments in metropolitan cities in Australia [ 7 ] and worldwide [ 5 ]. Specifically, the redevelopment is an expansion of infrastructure to meet the growing needs of the community which the hospital serves. The perceptions and experiences of hospital staff will differ depending on the state of the new facilities, but these findings broadly generalise to any hospital redevelopment where a newer, larger building is opened. The implications of this study provide broad suggestions for other hospitals undergoing this type of hospital redevelopment.

Firstly, hospital redevelopment should be considered not merely a physical change but an organisational change, in order to recognise the ripple effects of changing the infrastructure and how this may influence social and behavioural processes. From this study’s findings of the expectations and present experiences of organisational change, we recommend four strategies to aid in the early stages of hospital redevelopment: engage actors; plan and train; learn from the past; and increase managerial engagement (see Table 2 ). These recommendations correspond with suggestions from a past review examining transforming systems in health care [ 33 ]. Effort must be made to ensure staff are informed of the change and to rectify any confusion about who, what, when, and how the change is taking place. This is consistent with organisational change theory that maintains that large scale change requires significant effort and planning to ensure its success [ 19 ]. Therefore, an implication of this study lies in the importance of exploring the understanding and expectations of staff preceding a large organisational change in order to aid in the acceptance of, rather than resistance to, the change [ 21 ]. Further, this study also highlights the importance of studying the experiences of actors not directly involved in the organisational change but who are a part of the broader system (i.e., wards not moving implied they will be affected).

Strengths and limitations

A strength of this study lies in the number of participants and variability in the professions that contribute to the transferability of the study findings. Further, checking and clarifying themes with other researchers throughout the coding process increases the trustworthiness of the findings [ 26 ]. As to limitations, interviews were on average 17 min long, with the shortest interview lasting seven minutes. While this may be perceived as a short duration for collecting interview data it was appropriate for participants who were incredibly time poor (e.g., nurses on shift who could only get a 10 min break to talk to the researcher). It is important that the opinions of these busy staff are captured to reflect the true nature of a sample of varied hospital staff. Further, the findings may not be generalisable to other instances of organisational change and may be specific to the four wards and hospital examined in this study. Wards were purposively chosen rather than randomised. While findings may be specific to the hospital under investigation, the research has been designed to optimise research credibility in this qualitative analysis. Further, considerable context was provided to help readers infer relevance to different settings. This in-depth analysis of how staff understand and interpret organisational change in hospitals provides the opportunity to uncover theoretical insights into the processes of change in the health care system and the perspectives of staff during times of organisational change.

This study explored the prospective understanding and experiences of staff in organisational change in hospitals, using an Australian hospital redevelopment as a case exemplar. Findings indicated that staff were concerned about staffing levels, fatigue, and the potential for a breakdown of current collaborative working. These concerns are similar to past reports of redevelopment in hospitals. This paper presents recommendations for the early stages of organisational change in hospitals. For present and future hospital organisational change projects, it is important that staff concerns are addressed and that staff are informed adequately about the ongoing changes in order to improve their engagement and ownership of the change.

Availability of data and materials

The datasets analysed during the current study are not publicly available due to individual privacy, but are available from the corresponding author on reasonable request.

Abbreviations

Administrative staff

Change management team staff

Medical staff

General – works across several wards

General services staff

Intensive care unit

Maternity ward

Midwifery staff

Nursing staff

Other profession

Respiratory ward

Surgical ward

Braithwaite J, Churruca K, Ellis LA, Long JC, Clay-Williams R, Damen N, et al. Complexity science in healthcare-aspirations, approaches, applications and accomplishments: a white paper. Sydney, Australia: Macquarie University; 2017.

Braithwaite J, Wears RL, Hollnagel E. Resilient health care: turning patient safety on its head. Int J Qual Health Care. 2015;27(5):418–20.

Malkin RA. Design of health care technologies for the developing world. Annu Rev Biomed Eng. 2007;9:567–87.

Ritchie E. NSW budget 2017: ‘hospital building boom’ at heart of $23bn deal. The Australian. 2017.

Carpenter D, Hoppszallern S. Hospital building report. The boom goes on. Hospitals Health Networks. 2006;80(3):48–50 2-4, 2.

Aubusson K. Berejiklian government pledges $750 million for Sydney's RPA Hospital 2019 [5 Mar]. Available from: https://www.smh.com.au/national/nsw/berejiklian-government-pledges-750-million-for-sydney-s-rpa-hospital-20190304-p511n4.html .

NSW Government. Health Infrastructure 2018. Available from: https://www.hinfra.health.nsw.gov.au/our-projects/project-search .

Australian Institute of Health and Welfare. Australia's health 2016. Canberra: AIHW; 2016.

Schweitzer M, Gilpin L, Frampton S. Healing spaces: elements of environmental design that make an impact on health. J Altern Complement Med. 2004;10:71–83.

Rechel B, Buchan J, McKee M. The impact of health facilities on healthcare workers’ well-being and performance. Int J Nurs Stud. 2009;46(7):1025–34.

Berry LL, Parish JT. The impact of facility improvements on hospital nurses. HERD. 2008;1(2):5–13.

Gharaveis A, Hamilton DK, Pati D. The impact of environmental design on teamwork and communication in healthcare facilities: a systematic literature review. HERD. 2018;11(1):119–37.

Children's Health Queensland Hospital and Health Service. Lady Cilento Children's Hospital clinical review. 2015.

Lourens G, Ballard H. The consequences of hospital revitalisation on staff safety and wellness. Occup Health Southern Africa. 2016;22(6):13–8.

Schwarz N, Sudman S. Autobiographical memory and the validity of retrospective reports. New York: Springer-Verlag; 2012.

Dannenberg AL, Bhatia R, Cole BL, Heaton SK, Feldman JD, Rutt CD. Use of health impact assessment in the US: 27 case studies, 1999–2007. Am J Prev Med. 2008;34(3):241–56.

Austin MJ, Claassen J. Impact of organizational change on organizational culture: implications for introducing evidence-based practice. J Evid Based Soc Work. 2008;5(1–2):321–59.

Fitzgerald L, McDermott A. Challenging perspectives on organizational change in health care: Taylor & Francis; 2017.

Todnem BR. Organisational change management: a critical review. J Chang Manag. 2005;5(4):369–80.

Plsek PE, Greenhalgh T. Complexity science: the challenge of complexity in health care. Br Med J. 2001;323(7313):625.

Braithwaite J. Changing how we think about healthcare improvement. Br Med J. 2018;361:k2014.

Pomare C, Churruca K, Long JC, Ellis LA, Gardiner B, Braithwaite J. Exploring the ripple effects of an Australian hospital redevelopment: a protocol for a longitudinal, mixed-methods study. BMJ Open. 2019;9(7):e027186.

Tong A, Sainsbury P, Craig J. Consolidated criteria for reporting qualitative research (COREQ): a 32-item checklist for interviews and focus groups. Int J Qual Health Care. 2007;19(6):349–57.

Braun V, Clarke V. Using thematic analysis in psychology. Qual Res Psychol. 2006;3(2):77–101.

Castleberry A. NVivo 10 [software program]. Version 10. QSR International; 2012. American journal of pharmaceutical education. 2014;78(1).

Elo S, Kääriäinen M, Kanste O, Pölkki T, Utriainen K, Kyngäs H. Qualitative content analysis: a focus on trustworthiness. SAGE Open. 2014;4(1):2158244014522633.

Braithwaite J. How to fix a sick hospital: attend to its stressed health carers. The Sydney Morning Herald; 2018.

Baker JG. The perspective of the staff regarding facility revitalization at Walter Reed Army Medical Center. Army Medical Material Agency Fort Detrick MD; 2004.

Pomare C, Churruca K, Ellis LA, Long JC, Braithwaite J. A revised model of uncertainty in complex healthcare settings: a scoping review. J Eval Clin Pract. 2019;25(2):176–82.

Kash BA, Spaulding A, Johnson CE, Gamm L. Success factors for strategic change initiatives: a qualitative study of healthcare administrators' perspectives. J Healthc Manag. 2014;59(1):65–81.

Long JC, Cunningham FC, Braithwaite J. Bridges, brokers and boundary spanners in collaborative networks: a systematic review. BMC Health Serv Res. 2013;13(1):158.

Damschroder L, Banaszak-Holl J, Kowalski CP, Forman J, Saint S, Krein S. The role of the “champion” in infection prevention: results from a multisite qualitative study. BMJ Qual Saf. 2009;18(6):434–40.

Best A, Greenhalgh T, Lewis S, Saul JE, Carroll S, Bitz J. Large-system transformation in health care: a realist review. Milbank Q. 2012;90(3):421–56.

Acknowledgements

We thank the hospital executives, ward directors and nursing unit managers for their support in recruitment of interview participants. The authors also thank and acknowledge the interview participants.

CP was funded by the Australian Government Research Training Program (RTP) PhD Scholarship. JB is supported by multiple grants, including the National Health and Medical Research Council (NHMRC) Partnership Grant for Health Systems Sustainability (ID: 9100002). The funders had no role in the design, analysis and drafting of the manuscript.

Author information

Authors and affiliations.

Centre for Healthcare Resilience and Implementation Science, Australian Institute of Health Innovation, Macquarie University, 75 Talavera Rd, Macquarie Park, Australia

Chiara Pomare, Kate Churruca, Janet C. Long, Louise A. Ellis & Jeffrey Braithwaite

Contributions

CP and JB conceptualised the project. CP collected and analysed the data, and drafted the manuscript. KC, LAE and JCL assisted in the coding and interpretation of data. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Chiara Pomare .

Ethics declarations

Ethics approval and consent to participate.

The study was approved by the relevant Ethics Committee in Sydney, New South Wales, Australia (no: 18/233). Due to ethical requirements, the committee cannot be named because it may lead to the identification of the study site. Informed consent was obtained from all study participants.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher’s note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Additional file 1.

Consolidated criteria for reporting qualitative studies (COREQ): 32-item checklist.

Additional file 2.

Semi-structured interview guide.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License ( http://creativecommons.org/licenses/by/4.0/ ), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated.

About this article

Cite this article.

Pomare, C., Churruca, K., Long, J.C. et al. Organisational change in hospitals: a qualitative case-study of staff perspectives. BMC Health Serv Res 19 , 840 (2019). https://doi.org/10.1186/s12913-019-4704-y

Received : 23 May 2019

Accepted : 31 October 2019

Published : 14 November 2019

DOI : https://doi.org/10.1186/s12913-019-4704-y

  • Organisational change
  • Health systems change
  • Hospital redevelopment
  • Hospital expansion
  • Staff expectations

BMC Health Services Research

ISSN: 1472-6963

case study healthcare models

Case study: international healthcare service quality, building a model for cultivating cultural sensitivity

Ya-Ting Yang, Yi-Hsin Elsa Hsu, Kung-Pei Tang, Christine Wang, Stephen Timmon, Wen-Ta Chiu, Saileela Annavajjula, Jan-Show Chu, Case study: international healthcare service quality, building a model for cultivating cultural sensitivity, International Journal for Quality in Health Care , Volume 32, Issue 9, November 2020, Pages 639–642, https://doi.org/10.1093/intqhc/mzaa097

In the context of medical tourism, cultural differences and language barriers are non-negligible factors that compromise shared decision-making between doctors and patients.

This study constructs a cultural sensitivity cultivation (CSC) model that could be used to train medical professionals in the sector of medical tourism.

Since 2016, Kuang Tien General Hospital (KTGH) has explored new strategies to offer better services. A critical added step is to include clients’ perspectives in the re-examination process as a way to cultivate cultural sensitivity among service providers. This practice extends to the sector of medical tourism. From our case study, we derive a new model that could yield quality international healthcare services.

The steps of our CSC model are (i) ‘Promote Awareness’ for shifting mindsets, (ii) ‘Share Scenarios’ for developing empathy and compassion, (iii) ‘Review Process’ for collecting detailed feedback, (iv) ‘Identify Gaps’ for targeting areas for improvement and (v) ‘Improve Systems’ for changing standard operating procedures (SOPs). The model is based on strategies derived from Assmann’s theory, which takes a cultural–anthropological approach.

One year after Kuang Tien General Hospital (KTGH) implemented the new model, the number of international patients had increased by 64%. Future research could cover all the important aspects of providing international medical services and apply the CSC model to different healthcare settings.

To optimize shared decision-making between doctors and medical travelers, healthcare providers should not only overcome language and cultural barriers but also avoid unnecessary gestures of status respect. Inviting patients to be co-investigators for quality improvement is a viable solution.

In the context of medical tourism, cultural differences and language barriers are non-negligible factors that compromise shared decision-making between doctors and patients. This situation has raised the demand for cultural competency training for medical providers. In recent years, Taiwan has gradually been recognized as a destination of choice for high-quality and affordable healthcare [ 1 ].

Therefore, there is an immediate need to rectify the problem. This study aims to construct a cultural sensitivity cultivation (CSC) model to train Taiwanese medical professionals to enhance shared decision-making and healthcare service for medical tourists.

The study was conducted at Kuang Tien General Hospital (KTGH), a regional hospital founded more than a decade ago in Taiwan. To develop the CSC model systematically, we combined literature studies, case report analyses, in-depth interviews and statistical comparisons. In addition to the literature studies, we undertook an exploratory qualitative study to investigate the concept of international healthcare service quality, focusing on cultural sensitivity cultivation. Adopting multiple qualitative and quantitative approaches provides a sound foundation for generating empirical conceptual frameworks of complex concepts. Three experienced researchers applied the concept of respect when reviewing five extensive case reports provided by KTGH. The reports include detailed information such as patients’ previsit, visit and postvisit information (follow-up and testimonials). The Chief Strategy Officer and members of the hospital also participated in in-person interviews. The literature review indicated a lack of standardization in medical tourism regulation, which has led to difficulties relating to ethical issues such as informed consent and health equity control [ 2 ]. On the other hand, healthcare providers’ commitment to improving their cultural sensitivity is the essential component in ensuring high-quality medical tourism services [ 3 ]. However, very few studies address the basic theories needed to establish the ideal relationship between international medical seekers and healthcare providers. Hence, we adopted the cultural–anthropological approach of Assmann [ 4 ] to develop the CSC model, because Assmann’s theory can transcend the economic colonialist globalization framework.
Assmann’s theory addresses five forms of respect: (i) status respect, ‘respect due to the position and office’; (ii) respect for achievement, ‘respect that distinguishes individual according to their ability and achievements’; (iii) social respect, ‘raising emotional temperature of society surplus personal appearance, abilities or possessions, which is focusing of recognition and admiration’; (iv) cultural respect, ‘affirmative action on a global level of intercultural communication and digital self-expression within a larger political framework of decolonization’; and (v) civil respect, ‘multicultural and multiethnic society stimulate new forms of social perception, interaction and empathy.’

Findings and Results

From the interviews and analysis, we identified several fundamental strategies, based on Assmann’s theory and its cultural–anthropological approach, that KTGH employs to improve its staff’s ability to care for international patients. These include practicing ‘social respect’ and ‘civil respect’ through the improvement of verbal and nonverbal communication skills, and embodying ‘cultural respect’ through sharing experiences among different ethnic groups and hosting international holiday promotional events to promote mutual understanding among global citizens. KTGH also applied Assmann’s respect theory across the international patient’s visit journey, from promoting awareness, sharing scenarios and reviewing the process to identifying gaps and improving systems, which connects these strategies to the CSC model. Since 2016, KTGH has been exploring new strategies to offer better services, paying particular attention to cultivating staff’s intercultural competencies. During each service delivery to an international patient, staff record details of patient–doctor interactions. They use the recordings to create case studies or scenarios for hospital-wide training that cultivates empathy for all. To reduce the conventional ‘status respect’ in the doctor–patient relationship, the research team invited ‘the clients,’ the international healthcare seekers, to work as ‘co-investigators.’ The goal is to include patients’ perspectives when examining behaviors that occurred during the healthcare-seeking process.

After implementing these new strategies for 1 year, KTGH received testimonials from international patients on the high quality of its healthcare services. In addition, the number of international patients increased by 64% in the following year. Having witnessed KTGH’s process and success in cultivating cultural awareness, we used its practice to form the CSC model in our study, as shown in Figure 1.

Figure 1. CSC model.

The ‘Promote Awareness’ stage emphasizes the importance of educating staff about cultural differences and characteristics. The ‘Share Scenarios’ stage promotes empathy and compassion through sharing previous experiences and lessons learned with one another. The ‘Review Process’ stage consists of collecting service feedback from teams and discussing step-by-step experiences with patients. The ‘Identify Gaps’ stage targets areas for improvement in each step: pre-, during- and postvisit experiences with patients. Lastly, under ‘Improve Systems,’ the model emphasizes shaping patients’ experiences systematically through discussions with the departments involved to enhance standard operating procedures (SOPs).

The proposed CSC model provides a conceptual framework and a systematic application process in an area where little research has previously been conducted. According to research analyzing publications in the IJQHC over the past 3 years, one of the leading keywords is ‘quality improvement’ [ 5 ]. There are various approaches to quality improvement at different levels, from quality indicators to systematic monitoring [ 6 ]. The CSC model indicates that, besides language ability, which is an obvious contributor to quality service, other elements such as creating cultural awareness and using previous experiences to develop training materials are also crucial to improving service quality and experience for international patients. Although the CSC model was developed from experiences in a single healthcare facility that offers international healthcare, we have made valuable observations that could lead to quality healthcare service in this specific facility. Attention to data can itself significantly improve data quality [ 7 ]; we have observed in this case that focusing on the internal process of delivering healthcare services could also improve the quality of the service itself. Based on the CSC model, we encourage future research that covers all aspects of providing international medical services and considers applying the CSC model in different healthcare settings. Further implementation and investigation of the effects of the CSC model are suggested.

The authors thank CSF and staff members of KTGH who assisted in providing the data for this study.

We have interviewed subjects, and all data are available upon request to reviewers and editors. No new data were generated or analyzed.

Stephano RM. Health & Wellness Destination Guide. Taiwan: Global Health Insurance Publications, 2014. http://taiwan.medicaltourism.com/online-view/document.pdf

Foley BM, Haglin JM, Tanzer JR et al. Patient care without borders: a systematic review of medical and surgical tourism. J Travel Med 2019;26:6.

Rokni L, Park S-H, Avci T. Improving medical tourism services through human behaviour and cultural competence. Iran J Public Health 2019;48:1988–96.

Assmann A. Civilizing societies: recognition and respect in a global world. New Lit Hist 2013;44:69–91.

Iqbal U, Yu CY, Li J. What are the leading keywords of IJQHC in last 3 years? Int J Qual Health Care 2015;27:163–4.

Geboers H, Mokkink H, Montfort P et al. Continuous quality improvement in small general medical practices: the attitudes of general practitioners and other practice staff. Int J Qual Health Care 2001;13:391–7.

Mate KS, Bennett B, Mphatswe W et al. Challenges for routine health system data management in a large public programme to prevent mother-to-child HIV transmission in South Africa. PLoS One 2009;4:e5483.

Author notes

# Equal Contribution.

  • health personnel
  • central serous chorioretinopathy
  • cultural differences
  • quality improvement
  • language barriers
  • medical tourism
  • shared decision making
  • standard operating procedure
  • cultural sensitivity

  • Online ISSN 1464-3677
  • Print ISSN 1353-4505
  • Copyright © 2024 International Society for Quality in Health Care and Oxford University Press


Taking a case study approach to assessing alternative leadership models in health care

Affiliations.

  • 1 Staff Nurse, Emergency Department, Leeds Teaching Hospitals NHS Trust.
  • 2 Lecturer, School of Healthcare, University of Leeds.
  • PMID: 29894257
  • DOI: 10.12968/bjon.2018.27.11.608

Good leadership is essential to patient-centred care and staff satisfaction in the healthcare environment. All members of the healthcare team can be leaders and evidence-based theory should inform their leadership practice. This article uses a case study approach to critically evaluate leadership as exercised by a charge nurse and a student nurse in a clinical scenario. Ineffective leadership styles are identified and alternatives proposed; considerable attention is given to critiquing both 'heroic' and 'post-heroic' transformational leadership theories. The concept of power will also be discussed, as power and leadership are closely related, and the importance of empowering members of the healthcare team through altering organisational structure is emphasised. This article advocates leadership that encourages innovation, enhances patient-centred care, encourages excellence and has ethical integrity. Recommendations of appropriate models of leadership are provided, while existing gaps in the healthcare leadership literature are highlighted.

Keywords: Authentic leaderships; Engaging leadership; Leadership models; Transformational leadership.



National Academies Press: OpenBook

How Modeling Can Inform Strategies to Improve Population Health: Workshop Summary (2016)

Chapter: 3 Case Studies of Models Used to Inform Health Policy

The workshop’s second panel featured three case studies, presented by long-time modelers, that illustrated some of the ways in which models can be used to inform health policy. In each case, said session moderator Pamela Russo, a senior program officer at the Robert Wood Johnson Foundation, the models are nonlinear, dynamic, and interactive, and they cross multiple disciplines. ( Box 3-1 contains highlights from these presentations.) David Mendez, an associate professor in the Department of Health Management and Policy at the University of Michigan School of Public Health, discussed tobacco models. Pasky Pascual, an environmental scientist, lawyer, and former director of the Council for Regulatory Environmental Modeling at the Environmental Protection Agency (EPA), described EPA’s use of models to set clean air standards. Bobby Milstein, a director at ReThink Health, illustrated how communities have used models to engage in regional health reform efforts. An open discussion moderated by Russo followed the three presentations.

COMPUTATIONAL MODELS IN TOBACCO POLICY 1

___________________

1 This section is based on the presentation by David Mendez, an associate professor in the Department of Health Management and Policy at the University of Michigan School of Public Health, and the statements are not endorsed or verified by the Institute of Medicine.

As the previous speakers had already noted, there are several good reasons to model, said David Mendez in the introduction to his presentation on the use of models in tobacco control efforts. One reason is to understand a problem fully, and in that context models can provide a coherent framework with which to analyze a situation and integrate different datasets in a way that mental models cannot. Models, he explained, can be used to monitor, forecast, and evaluate the consequences of policies using “What if?” scenarios that are explored in in silico experiments, and they can identify gaps in knowledge which can guide data collection. In the tobacco control area, models can be used to address such questions as

  • If current conditions continue, what is the likely trajectory of smoking prevalence?
  • If all the tobacco control measures known to be effective were fully implemented, what is the likely trajectory of smoking prevalence?
  • What would be the population health impact of removing menthol cigarettes from the market?
  • What would be the consequences of increasing the minimum purchasing age for tobacco products?
  • What would be the impact of reducing nicotine in combustible tobacco products to nonaddictive levels?
  • What is the estimated impact of tobacco control policies on avoided mortality?

One of the issues that Mendez has been exploring using the Michigan Model of Smoking Prevalence and Health Effects is the effect that tobacco control policies have on the smoking status of individuals. The model identifies individuals by gender, age, when they started smoking, and trajectory, and it tracks these individuals from birth to death. Mendez and his colleagues use these simulations to examine the probability of initiation and cessation and the way those two events interact under different policy scenarios. While the model is constructed to capture the initiation and cessation probabilities at the individual level, it actually follows groups of individuals, which is why this type of model is called an aggregate model (see Figure 3-1 ). One of the assumptions built into the model is that groups comprising individuals with certain characteristics behave in a homogeneous way, Mendez said.


Mendez used a bathtub analogy to explain how this model works. Water flowing into the bathtub represents individuals who initiate smoking, and the rate at which water flows out of the tap is analogous to the probability of initiation. Similarly, water flowing out of the bathtub represents individuals who quit smoking plus the number of people who die from all causes. The level of water in the bathtub represents smoking prevalence, and this is what he and his colleagues are interested in and what the model tracks over time.
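The bathtub analogy can be sketched as a simple stock-and-flow loop. The code below is only an illustration under stated assumptions and is not the Michigan model: it treats the population as constant, uses a single homogeneous group, and all function names and rate values are hypothetical.

```python
def simulate_prevalence(p0, initiation, cessation, death, years):
    """Track the 'water level' (smoking prevalence) year by year.

    p0         -- starting prevalence, as a fraction of the population
    initiation -- yearly inflow of new smokers, as a fraction of the population
    cessation  -- yearly fraction of current smokers who quit
    death      -- yearly fraction of current smokers who die
    All inputs are hypothetical; the population is assumed constant.
    """
    levels = [p0]
    for _ in range(years):
        current = levels[-1]
        inflow = initiation                      # water coming from the tap
        outflow = (cessation + death) * current  # water leaving through the drain
        levels.append(current + inflow - outflow)
    return levels


# Hypothetical inputs, chosen only to keep the arithmetic easy to follow.
trajectory = simulate_prevalence(
    p0=0.20, initiation=0.004, cessation=0.03, death=0.01, years=300
)

# With constant rates the level settles where inflow equals outflow.
steady_state = 0.004 / (0.03 + 0.01)
print(round(trajectory[-1], 4), round(steady_state, 4))
```

In this simplified form the level drifts toward the point where inflow and outflow balance, which is why a change in either the initiation or the cessation rate shifts long-run prevalence only gradually.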

When building such a model, one of the first tasks is to build confidence that the predictions that the model makes fit with the observed data (see Figures 3-2 and 3-3 ). “We have some theory about how the elements interact and how people start smoking and quit smoking,” Mendez said, “and we want to make sure that this is a good representation of what is happening in the real world.” When he and his colleagues used data from the National Health Interview Survey to test the model, they found that the fit between the model’s predictions and the data was very good ( Mendez and Warner, 2004 ). They then developed some projections of what would happen with varying initiation rates (see Figure 3-4 ).


To understand the health effects of smoking, Mendez and his colleagues used data from the Cancer Prevention Study II to develop relative risks for former and current smokers, both female and male. The model predicts the relative risk of death from the time of smoking initiation and the length of time an individual smoked. He explained that the model contains compartments that keep track of individuals by age and by when they stop smoking (see Figure 3-5 ) and that it explores the effects of policies that affect the initiation and cessation rates on mortality and morbidity over time on a population level.

Mendez listed some of the ways in which he and his colleagues have used this model. These included assessing smoking prevalence targets, evaluating the effect of offering smoking cessation programs in managed care organizations, evaluating the impact of menthol cigarettes on a population’s health, evaluating the effectiveness of radon remediation when smoking rates are declining, and evaluating the effect of tobacco control policies on global smoking trends. In the study that evaluated the effect of offering smoking cessation programs in managed care organizations (MCOs), the question of interest was whether offering such programs was cost-beneficial for the MCOs. The answer was that it was not, because of the large turnover in managed care organizations. Mendez said that this study was interesting because the presumption was that there would be a clear benefit, but the analysis showed that while the gains for society are substantial, there is no benefit that accrues to a managed health care organization.

For the radon remediation project, the goal was to examine the effectiveness of EPA guidance about remediating homes with high levels of radon. “The problem here is that there was no clear distinction between smokers and nonsmokers when this policy was put in place,” Mendez said, noting that the data show that health risks associated with radon exposure are much higher for smokers than for nonsmokers. The model’s analysis showed that with declining smoking rates, the effectiveness of the EPA recommendation is questionable ( Lantz et al., 2013 ; Mendez et al., 2011 ).

To evaluate the potential impact of tobacco control policies on global smoking trends, Mendez and his colleagues used data from World Health Organization databases and explored what would happen with the world prevalence of smoking if current conditions continued compared to an environment in which a comprehensive package of well-known effective tobacco control strategies was implemented. “The World Health Organization has found this information useful as it attempts to communicate the possibilities of tobacco control worldwide,” Mendez said.

For the United States, Mendez and University of Michigan colleague Kenneth Warner used this model to analyze the possibility of meeting the Healthy People 2010 goal of achieving a smoking prevalence rate of 13 percent. This analysis showed that this goal was unreachable since even if the initiation rate for smoking went to zero, the cessation rate would have to increase by more than threefold to achieve a 13 percent prevalence in 2010. If the initiation rate dropped to 15 percent by 2010, cessation rates would need to increase more than fourfold to reach the target. “The targets were made without thinking about the mechanisms that drive prevalence,” Mendez said. “Our projections were that prevalence would be between 18 and 19 percent, and that is where we were in 2010, at 19.3 percent.” He commented that the problem with setting unrealistic goals is that the community will get discouraged. “It is not that we fail with tobacco control policies,” he said, “but because the goals were so outrageous, the public health community feels like a failure by not achieving them.”
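The kind of feasibility check described here can be illustrated by asking what cessation rate a target would require. The sketch below is an assumption-laden toy, not the analysis by Mendez and Warner: it uses a one-group stock-and-flow update with a constant population, searches for the required rate by bisection, and every name and number in it is hypothetical.

```python
def prevalence_after(p0, initiation, cessation, death, years):
    """Prevalence after `years` of a one-group stock-and-flow update."""
    p = p0
    for _ in range(years):
        p = p + initiation - (cessation + death) * p
    return p


def required_cessation(p0, target, initiation, death, years, tol=1e-9):
    """Smallest yearly cessation rate that reaches `target` within `years`.

    Bisection works because, for cessation + death <= 1, a higher
    cessation rate always gives a lower final prevalence.
    """
    lo, hi = 0.0, 1.0 - death
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if prevalence_after(p0, initiation, mid, death, years) > target:
            lo = mid   # not enough quitting yet
        else:
            hi = mid   # target reached; try a lower rate
    return hi


# Hypothetical scenario: nobody starts smoking, 10 years to reach the target.
needed = required_cessation(p0=0.23, target=0.13, initiation=0.0, death=0.01, years=10)
assumed_current_rate = 0.025  # hypothetical baseline cessation rate
print(f"required rate {needed:.4f}, multiple of baseline {needed / assumed_current_rate:.1f}")
```

Comparing the required rate with a plausible baseline is what shows whether a target is within reach; the multiple printed here reflects only the made-up inputs above.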

Rather than just pointing out the error of setting unrealistic goals, Mendez and Warner used their model to set a goal for 2020 that would be feasible. To do this, they chose the policies that California had enacted and that have driven the prevalence of smoking in that state to 14.7 percent ( Mendez and Warner, 2008 ). Under the optimistic conditions that assume the nation can achieve California’s initiation and cessation rates, the model found the nation would still not achieve California’s current prevalence level until after the year 2020 (see Figure 3-6 ).

In a project commissioned by the Tobacco Product Scientific Advisory Committee of the Food and Drug Administration (FDA), Mendez modeled likely outcomes from removing menthol cigarettes from the market. The results projected that an estimated 328,000 premature deaths would be avoided over a 40-year period if menthol cigarettes did not exist and that there would be 9 million fewer new smokers over that same period. Based at least in part on this study, the advisory committee determined that removing menthol cigarettes from the market would benefit public health.

In his final comments, Mendez highlighted two modeling efforts conducted by other investigators. The SimSmoke model, developed largely by David Levy at the Georgetown University Medical Center, simulates the dynamics of smoking rates and smoking-attributed deaths in a state or nation and also the effects of policies on those outcomes. This model shows that the effects vary with the way a policy is implemented and by demographics, and it points out the dynamic, nonlinear, and interactive effects of smoking policies ( Levy and Friend, 2002 ; Levy et al., 2013 ). The Cancer Intervention and Surveillance Modeling Network, funded by the National Cancer Institute, is a network of modelers looking to improve understanding of cancer control interventions. Using data from the National Health Interview Survey, this group can reproduce the history of smoking for any cohort in the U.S. population, and it has evaluated the impact on lung cancer and overall mortality of smoking control policies enacted since 1964. By the network’s estimates, some 8 million premature deaths have been avoided, and the lives of those individuals have been extended by an average of about 20 years ( Holford et al., 2014 ).

DEFENDING PUBLIC HEALTH MODELS IN THE COURTROOM

Pasky Pascual started his presentation by talking briefly about a modeling technique that he has used, called hierarchical Bayesian modeling, to investigate the relationship between charged particles and oxygen in two streams just north of Washington, DC. The results that the model produces, he said, depend heavily on the data used to initialize the model, the variables chosen for inclusion in the model, and on the specific equations in the model that are emphasized (see Figure 3-7 ). “If these models remained on my laptop, they continue to be an interesting intellectual exercise,” Pascual said, “but the moment I use any of these models to make a decision that affects other peoples’ lives in a significant way, that is a problem.”

Models are everywhere, Pascual said, and any major regulation that is issued by EPA is ultimately based on a model. With his colleagues Elizabeth Fisher from Oxford University and Wendy Wagner at the University of Texas School of Law, Pascual has written two papers discussing the use and abuse of scientific and social-scientific models in environmental policy ( Fisher et al., 2010 ; Wagner et al., 2010 ). Litigants will challenge models by mischaracterizing them as belonging to two different extremes, Pascual said. At one extreme, models are portrayed as being truth engines, which means that as soon as the model makes an error, it becomes invalid. At the other extreme, models are seen as being completely malleable and therefore should be considered arbitrary and capricious under the law.

The reality is that models fall somewhere between these two extremes, Pascual said. “There are indeed uncertainties that prevent us from making perfect predictions, but there are evaluation methods that help us distinguish between models that are truthful and those that are merely ‘truthy,’” he said, borrowing that last word from the lexicon of Stephen Colbert. In reality, every modeling approach has its own intellectual foundation and its own assumptions, which Pascual and his colleagues refer to as a model’s epistemic frame ( Pascual et al., 2013 ). For example, in 1962 FDA, in responding to the thalidomide crisis, stated by way of regulation that the best scientific evidence for evaluating drug risks comes from randomized clinical trials. Implicit in that decision, Pascual said, was that FDA prescribed one particular statistical method for drawing inferences. Over the years a number of panels, including those organized by the Institute of Medicine, have been recommending that FDA take a more agnostic view of modeling, and, in fact, Congress mandated in the 2007 FDA Reauthorization Act that the agency take a more universal view of modeling. Pascual and his colleagues used the FDA Reauthorization Act to make the case that models used for regulatory decision making need to be transparent in terms of both the methods used and the epistemic framework of those methods.

In an upcoming paper, Pascual and his collaborators discuss how model transparency leads to both legal accountability and defensibility in the context of the Clean Air Act. He said that while the Clean Air Act is complex, four features of the implementation of air quality standards matter for this discussion: The standards have to protect public health, be based on the latest science, be reviewed by a science panel, and be subject to judicial review. That last feature means that if a stakeholder with established standing disagrees with the model that EPA used to make its regulatory decision, the stakeholder can force the agency to defend the model in court.

In the early years of EPA’s regulation of ambient air quality standards, there were 16 court cases challenging the agency’s decisions, and the court’s approach to evaluating the science was, as Pascual characterized it, “very ad hoc.” The court’s approach was reminiscent of Justice Potter Stewart’s comment on pornography in that the court seemed to say that it could not define what good science is but that it would know it when it saw it. “If you read the cases, they are rather incoherent and not very cohesive,” Pascual said. In 2008, though, EPA laid out its causal epistemic framework and challenged the court to judge its decisions against this framework. In the past, courts said that controlled human exposure trials, like randomized clinical trials, engender the highest level of confidence about the causal relationship between ozone exposure and health effects. However, as Hammond pointed out earlier in the workshop, randomizing different groups and exposing them to different concentrations of toxics would not be ethical. The agency’s models computationally control for other covariates that might interfere with this causal relationship. In essence, this approach to science says that given a particular level of a pollutant, the models, based on data from human and animal exposure, suggest that a particular health effect will result. If the data come from controlled human studies or from observational studies that rule out chance, the agency determines that a causal relationship exists. If the data come from observational studies with possible confounders plus animal toxicity studies, the agency considers that it is likely that a causal relationship exists between exposure to a pollutant and an adverse health effect.
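The two evidence rules described above can be written down as a small decision function. This is only a reading of the summary's wording, not EPA's actual framework: the field names, the labels, and the fallback category for evidence that meets neither rule are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Evidence:
    # Hypothetical flags describing the body of evidence for one pollutant.
    controlled_human_studies: bool = False
    observational_ruling_out_chance: bool = False
    observational_with_possible_confounders: bool = False
    animal_toxicity_studies: bool = False


def causal_determination(evidence: Evidence) -> str:
    """Apply the two rules as paraphrased in the workshop summary."""
    if evidence.controlled_human_studies or evidence.observational_ruling_out_chance:
        return "causal relationship exists"
    if (evidence.observational_with_possible_confounders
            and evidence.animal_toxicity_studies):
        return "causal relationship is likely"
    # Assumed fallback: the summary does not say how weaker evidence is labeled.
    return "no determination described"


print(causal_determination(Evidence(observational_with_possible_confounders=True,
                                    animal_toxicity_studies=True)))
```

Writing the rules out this way shows what transparency buys: a reviewer can see exactly which kind of evidence triggered which conclusion.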

In concluding his talk, Pascual offered a quote from the statistician George Box, who said, “Every model is wrong, but some are useful.” No model is perfect, Pascual said, but by making a model transparent and providing a framework for evaluating a model’s performance, an expert scientific panel should be able to evaluate the validity of a model to a degree that the model will hold up in court. In the end, Pascual said, modeling is nothing more than a formalized method of questioning.

MODELING REGIONAL HEALTH REFORM USING THE RETHINK HEALTH DYNAMICS MODEL

Health data show that there are regional patterns that transcend the specifics of any disease or physical exposure, said Bobby Milstein, and this is true for the way that health care is delivered and how much it costs and with regard to the social, economic, and environmental conditions that leave people vulnerable to risk and disease. The challenge of addressing health reform at a regional level is what drove Milstein and his colleagues at ReThink Health to develop a model that can be used by those working to address health reform at the regional level to ask questions about the likely health and economic consequences of their efforts under realistic regional conditions that account for local trends related to a wide range of factors. These factors include insurance expansion, demography, aging, inequities, health status, the quality and cost of health care, the demand for and supply of health care resources, and provider payments, among others. Nobody can keep all of these dynamic factors in mind simultaneously when making choices about what actions to take, Milstein said. In short, the model is meant to be used by people who are immersed in systems they do not fully understand and to help them make better informed decisions.

ReThink Health started its efforts in Pueblo, Colorado, when Milstein and his colleagues met with leaders who were addressing a regional health reform agenda. Together they framed an approach to look at a number of policies and actions that these leaders thought were within their capabilities to influence in the region. From that start, ReThink Health’s efforts expanded to five additional sites in the first year and then three more over the next 3 years for a total of nine sites. The model is also being incorporated into an increasing number of academic curricula as future leaders are beginning to think about how the health care system works and about some of the conditions under which it might be able to change. Recently, for example, the National Association of Schools of Public Policy, Affairs, and Administration, a network of almost 300 academic institutions across the country, held its first student simulation competition on health policy. On one day, some 200 students at 93 schools sat down to think about how the health system might work and what their roles as policy makers would be.

Users of this model, Milstein said, confront a stewardship challenge as they attempt to craft strategies that will steer their health system in a new direction, not just in the short term, but in an enduring way over time. The model’s logic incorporates three questions that policy makers ask in the real world, Milstein said: What shall we do, how will we pay for it, and how proud would we be of the consequences? The model provides an intentionally diverse menu of intervention options, with several dozen initiatives that policy makers could choose to change with regard to upstream and downstream initiatives and also financing within a regional health system (see Figure 3-8 ). For example, planners can focus on actions to cut health care costs, improve quality, or expand capacity as well as wider efforts to enable healthier behaviors, reduce environmental hazards, improve public safety, and expand socioeconomic opportunities that strongly shape health and well-being while also affecting the demand for expensive downstream health care. Every item on this menu, he explained, has been documented to make a difference in and of itself, yet there are open questions about how they might be combined to make a greater difference and how much ought to be invested in different contexts. These items can be tested alone or in combination, and Milstein presented two quick scenarios to illustrate the types of insights that users have begun to obtain with this model.

One scenario explores a suite of strategies designed to move the health system away from its overdependence on tertiary care through hospitals and specialists to one that relies more on primary care combined with a greater role for self-care and medical homes. This approach begins with an assumption, which can be easily varied, that start-up funds are available from a temporary innovation fund set at 1 percent of total health care spending for just 5 years. Also, it assumes that half of any savings that accrue from lower health care costs will be reinvested in the endeavor, which is similar to what many accountable care organizations are doing; furthermore, it is assumed that there will be a shift in provider payment away from fee-for-service to per capita payment. This scenario also emphasizes greater adherence to guidelines for preventive and chronic care and also serious efforts to coordinate care by eliminating unnecessary services that increase cost but do not improve health. Together, Milstein said, this combination of strategies represents an evolution toward higher-value preventive and chronic care. Simulated results show that these downstream investments can deliver relatively fast, focused impacts but that their effects tend to plateau (see Figure 3-9 ). Models can also have mixed results on cost and inequity. While this scenario decreases dependency on hospital inpatient stays, which in turn lowers cost, there are also fewer premature deaths, more primary care visits, more services for prevention and chronic care, and more extensive use of self-care products such as prescription drugs, all of which increase costs elsewhere in the system. “What is the balance of those? I can’t do that in my head, and that is exactly why we need computers to play this out against the other changing dynamics in a region,” said Milstein.
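The financing assumptions in this scenario (a temporary fund worth 1 percent of total spending for 5 years, plus reinvestment of half of any savings) can be sketched in a few lines. This is not the ReThink Health model: the function name, the flat spending figure, the same-year reinvestment timing, and the savings series are all hypothetical.

```python
def program_budget(total_spending, savings_by_year,
                   fund_share=0.01, fund_years=5, reinvest_share=0.5):
    """Yearly money available to the reform effort.

    total_spending  -- total health care spending per year (held flat here)
    savings_by_year -- savings realized in each year (hypothetical inputs)
    """
    budgets = []
    for year, savings in enumerate(savings_by_year):
        # The innovation fund pays out only during the first `fund_years` years.
        fund = fund_share * total_spending if year < fund_years else 0.0
        # Half of any realized savings is assumed to be reinvested the same year.
        reinvested = reinvest_share * max(savings, 0.0)
        budgets.append(fund + reinvested)
    return budgets


# Hypothetical region: spending of 1,000 units a year and slowly growing savings.
budgets = program_budget(1000.0, [0.0, 0.0, 10.0, 20.0, 30.0, 40.0, 50.0])
print(budgets)
```

Under these made-up inputs the budget dips when the temporary fund expires and recovers only if savings keep growing, which is one reason the defaults are described as easy to vary.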

A second scenario tested a more balanced approach, with those same downstream components now coupled with upstream investments to enable healthier behaviors and safer environments. According to that model, the added emphasis on healthier behaviors and environments could unlock much greater health and economic potential, Milstein said, noting that the upstream elements yield broad progress on health, cost, equity, and workforce productivity ( Milstein et al., 2011 ). The effects can be large, he added, but they accumulate gradually (see Figure 3-10 ).

The health consequences from the combined, balanced scenario grow stronger over time, although the distinct benefits compared to the first scenario do not become apparent until after about 8 years. The same is true for cost savings. The larger suite of initiatives is more expensive to implement, but it may still be affordable when coupled with gains from the downstream reforms. Milstein claimed that with health care costs on track to grow even larger, many regions may not be able to afford not to make these cost-saving investments. In one scenario, for example, although downstream investments save nearly $1 billion, when combined with upstream investments funded by gain-sharing agreements, the savings are 50 percent greater (see Figure 3-11 ). The challenge, as this scenario shows, is that there is an initial increase in spending that may make the necessary initial investments difficult to secure. “There is typically a worse before better pattern,” Milstein said. “In all scenarios where you are making these investments, the question is how fast you get that yield and how happy are you with that yield.” Another important insight from these modeling exercises, he added, is that balanced strategies can drive greater economic productivity, which provides a return beyond that from simply reducing health care costs.

This is a well-designed model that has been widely tested and closely matches dozens of observed time series data, Milstein said. But, like all models, it is still an inexact representation of the real world. One limitation, he added, is its 25-year time horizon, which is much shorter than a full life course, which would be needed to accurately represent the benefits of other investments, such as intervention in early childhood. The model also has a relatively high level of aggregation, which is appropriate for its focus on strategy designs, Milstein said, but if one wanted to delve into, for example, the details of which behaviors would make the biggest difference, that would require a different type of tool which was focused on tactics and specific program portfolios.

One of Milstein’s biggest priorities is to guard against two extremes that are fraught with peril: an overreliance on an imperfect model versus an under-reliance on analytical tools to compensate for known flaws in people’s mental models. Decisions made in the absence of a credible model essentially depend on people’s ability to think through the complexity of a vast health system, which turns out to be notoriously difficult, if not impossible.

In his closing comments, Milstein said that he and his colleagues have examined carefully the conditions under which models can help leaders make big breakthroughs in their practice. “It is very rare for a model, in and of itself, to drive change. It really is best done when it is coupled with a sense of strong stewardship by people who come to the table thinking of themselves not as leaders of their institutions or thinking of their own narrow self-interest, but recognizing that they are part of a system upon which we all depend,” said Milstein. Working through the economics of marshalling and governing common resources is a large part of this stewardship, he added. “Having a sound strategy and the muscle to enact it means very little if you can’t gather the resources to direct to those priorities.”

Robert Kaplan from the Agency for Healthcare Research and Quality began the discussion by asking the panelists to react to the following information. Over the past 20 years, Kaplan said, most major randomized controlled trials in preventive medicine have shown that the intervention does not make much difference, even though almost all of those trials have positive effects for surrogate outcomes. On the other hand, almost every modeling study using surrogate outcomes predicts that there should be positive effects. Both the modelers and the researchers carrying out the trials then claim that the other approach is wrong because it is too simplistic. Pasqual replied that he and his colleagues discussed this very issue in detail, using the Vioxx crisis as a starting point. One thing they found is that the design for the Vioxx randomized controlled trial screened out the very people who were most likely to experience the adverse effects that later caused trouble. In his opinion, he said, the current approach to hypothesis testing that drives trial design is flawed, and it is only recently that Bayesian methods and other analytical models that can handle distributions that are not Gaussian, binomial, or in general well-behaved and that can make use of all of the available data have become amenable to analysis in a meaningful way.

John Auerbach from the Centers for Disease Control and Prevention asked the panel to talk about the challenges associated with having non-homogeneous populations and, in particular, with having subpopulations that may be affected differently by different interventions. Too often, he said, models that may be true for general populations are used to make extrapolations about the efficacy of an intervention in a subpopulation with different characteristics. Milstein responded that there is nothing like modeling and the need to make explicit the assumptions that go into a model to expose the fact that the data may not exist to support all of those assumptions for all populations. If there are reasons to suspect that an intervention may be less effective in one group than in another, the model can be run over a range of scenarios, some more pessimistic and others more optimistic. What is important, he said, is to craft a research agenda that asks which interventions need to be tested more consistently across many different contexts and groups. Mendez added that models also provide a framework for understanding where assumptions are important and where they are not and where more data are needed to better understand the effects of heterogeneity. Pasqual noted that hierarchical Bayesian approaches are particularly useful for better understanding how subpopulations fit into the larger general population.
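The idea of running a model over pessimistic and optimistic scenarios for a subpopulation can be sketched in a few lines of Python. This is a toy calculation rather than any panelist's model: the group sizes, the effect values, the scenario labels, and the assumption that the population-level effect is a size-weighted average of group effects are all hypothetical.

```python
def population_effect(groups):
    """Size-weighted average effect across (size, effect) pairs."""
    total = sum(size for size, _ in groups)
    return sum(size * effect for size, effect in groups) / total


def scenario_sweep(general_size, general_effect, sub_size, sub_effects):
    """Recompute the overall effect for each assumed subgroup effect.

    sub_effects maps a scenario label to the effect assumed for the
    subpopulation whose response to the intervention is uncertain.
    """
    return {
        label: population_effect([(general_size, general_effect),
                                  (sub_size, effect)])
        for label, effect in sub_effects.items()
    }


# Hypothetical inputs: 800 people with a known 10% effect, and 200 people
# for whom the effect might be anywhere between 0% and 10%.
sweep = scenario_sweep(800, 0.10, 200,
                       {"pessimistic": 0.00, "central": 0.05, "optimistic": 0.10})
for label, value in sweep.items():
    print(label, round(value, 3))
```

The spread between the pessimistic and optimistic results indicates how much the missing subgroup evidence matters, which is one way of deciding where further testing is most needed.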

Marc Gourevitch from the New York University School of Medicine asked Milstein how the ReThink Health model makes predictions in the aggregate over time. Milstein responded that the model is not actually making predictions about the future because the world is complicated and models cannot anticipate surprises, nor are these models trying to estimate what the health of the population or the cost of health care will be in 25 years. What the models are doing, he explained, is playing out the consequences of actions taken now to see how long and how strong those effects can be and what the general trajectory of their effects might be relative to a clearly defined status quo scenario. “The purpose for building the model is to ask if it is within our latitude of influence to change present circumstances,” Milstein said. From the users’ perspective, these models enable them to think about how the choices that they make today may play out over a long period of time. “My understanding is that this is a valuable perspective for leaders to have,” Milstein said.

A workshop participant from the National Association of County and City Health Officials, commenting on the widespread use of data by local health departments to improve public health and the time limits of those data, asked the panel to talk about the challenges and successes they have experienced in gathering the data needed to support models at the community level. Milstein said that the issue of getting the right kind of data at a regional level in a timely manner was something that those at ReThink Health were very conscious of when they began constructing the organization’s model. The discipline that goes into building such a model is intertwined with how the model represents phenomena, and that representation depends on having data. “Everything that ends up in the model has to be linked to evidence,” Milstein said. It is the case, he added, that there is a great deal of information and uncertainty in the available data, including experiential data that have not been formally documented but that can provide valuable information about how these systems tend to work and how they have been changing over time. Milstein also said that the ReThink Health model maps data from more than a dozen different sources into a common framework, and as such, it does not rely on one dataset. For example, though the goal is to get as much local data as possible, the ReThink Health model also uses datasets from state and national sources, and it makes demographic adjustments that are subjected to uncertainty analysis to fill gaps at the local level. Fortunately, he noted, the health area is a data-rich environment. “That does not make for perfect models, but it does make for well-informed ones,” he said. Having said that, Milstein added that there is a need for more consistent longitudinal data, and he said he hopes that as modeling gains more traction, it will create a demand for better longitudinal datasets.

Addressing a question from Russo about whether ReThink Health uses data from the community, Milstein said that he would never start a modeling project without access to community data, but that there is always a need to turn to other data sources for data elements that are not available locally. In California, for example, the enhanced California Health Interview Survey is a rich and valuable source of information for representing cities in that state. He added that if a region is systematically different from the state average for some reason, his team can address that using local data. As a final comment, he emphasized that the lack of data is not an excuse to not model.

Jeffrey Levi from the Trust for America’s Health commented on the lack of data that are collected on the effectiveness, value, and cost of interventions and asked the panelists what type of standardized information they would like federally funded and philanthropically funded studies to collect so that their models would be more robust. Mendez replied that one of the important data gaps in tobacco control concerns how people transmit smoking behavior and how people communicate their intent to quit or to engage in smoking. “These social networks are becoming more and more important in order to study this area of tobacco control,” Mendez said. “We are looking at the landscape that is much more heterogeneous right now and is much more complex, and we don’t have a good dataset that would inform how these interactions are going to play out.” Milstein added that almost every part of the model and modeling process could be better informed, but that when it comes to representing initiatives and what-if scenarios, his wish list for better data would include information on the cost to implement specific innovations, the extent to which they were actually implemented, and the percentage of the population they reached.

Staying on the topic of data gaps, Robert Grist from the Institute of Social Medicine and Community Health said that one area that had not been mentioned so far was the lack of data on the political constraints that affect interventions. “I am wondering how modeling can begin to incorporate political factors that influence the kinds of interventions that are considered politically realistic,” Grist said. “In the absence of that kind of information, I am not sure we are going to get the kind of political accountability that we would need to change the system in ways that would promote public health.” Pascual replied by asking if there would be a way of coding this kind of unstructured information so that it can be captured in one of his models. Mendez said that there is no formal way to incorporate the type of feedback that can influence decision making through the lens of a political agenda. What is possible is to conduct some type of sensitivity analysis with regard to possible delays in implementing a policy. Patrice Pascual from the Children’s Dental Health Project spoke of a recent modeling project she was involved in that examined ways of reducing cavities early in childhood. Those running the project dealt with the political ramifications of the model’s findings by taking those results to the community and letting the members of the community think about the political ramifications of the different pathways the community might take to achieving a goal. As the final comment in the discussion, she noted that this approach speaks to the comments from Victor Dzau about taking results to the community and letting the community take action.
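Mendez's suggestion of a sensitivity analysis on implementation delays could look something like the following Python sketch. Everything in it is an illustrative assumption: the constant annual benefit, the function names, and the specific delays tested are hypothetical, and the 25-year horizon is borrowed from the time horizon mentioned earlier for the ReThink Health model.

```python
def cumulative_benefit(annual_benefit, horizon_years, delay_years):
    """Total benefit over the horizon if the policy starts after a delay.

    Assumes a constant benefit in every year the policy is in force and
    no benefit before it starts (a deliberately crude simplification).
    """
    active_years = max(0, horizon_years - delay_years)
    return annual_benefit * active_years


def delay_sensitivity(annual_benefit, horizon_years, delays):
    """Map each candidate delay (in years) to the resulting total benefit."""
    return {delay: cumulative_benefit(annual_benefit, horizon_years, delay)
            for delay in delays}


# Hypothetical inputs: 2 units of benefit per year over a 25-year horizon.
for delay, benefit in delay_sensitivity(2.0, 25, [0, 5, 10, 30]).items():
    print(f"delay of {delay} years -> total benefit {benefit}")
```

A table of this kind does not model political factors directly, but it shows how much is lost if political constraints postpone a policy, which is the limited form of analysis Mendez described as feasible.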

In April 2015, the Institute of Medicine convened a workshop to explore the potential uses of simulation and other types of modeling for the purpose of selecting and refining potential strategies, ranging from interventions to investments, to improve the health of communities and the nation's health. Participants worked to identify how modeling could inform population health decision making based on lessons learned from models that have been, or have not been, used successfully, opportunities and barriers to incorporating models into decision making, and data needs and opportunities to leverage existing data and to collect new data for modeling. This report summarizes the presentations and discussions from this workshop.

  • Open access
  • Published: 27 June 2011

The case study approach

  • Sarah Crowe 1 ,
  • Kathrin Cresswell 2 ,
  • Ann Robertson 2 ,
  • Guro Huby 3 ,
  • Anthony Avery 1 &
  • Aziz Sheikh 2  

BMC Medical Research Methodology volume 11, Article number: 100 (2011)

The case study approach allows in-depth, multi-faceted explorations of complex issues in their real-life settings. The value of the case study approach is well recognised in the fields of business, law and policy, but somewhat less so in health services research. Based on our experiences of conducting several health-related case studies, we reflect on the different types of case study design, the specific research questions this approach can help answer, the data sources that tend to be used, and the particular advantages and disadvantages of employing this methodological approach. The paper concludes with key pointers to aid those designing and appraising proposals for conducting case study research, and a checklist to help readers assess the quality of case study reports.

Introduction

The case study approach is particularly useful to employ when there is a need to obtain an in-depth appreciation of an issue, event or phenomenon of interest, in its natural real-life context. Our aim in writing this piece is to provide insights into when to consider employing this approach and an overview of key methodological considerations in relation to the design, planning, analysis, interpretation and reporting of case studies.

The illustrative 'grand round', 'case report' and 'case series' have a long tradition in clinical practice and research. Presenting detailed critiques, typically of one or more patients, aims to provide insights into aspects of the clinical case and, in doing so, illustrate broader lessons that may be learnt. In research, the conceptually-related case study approach can be used, for example, to describe in detail a patient's episode of care, explore professional attitudes to and experiences of a new policy initiative or service development or more generally to 'investigate contemporary phenomena within its real-life context' [ 1 ]. Based on our experiences of conducting a range of case studies, we reflect on when to consider using this approach, discuss the key steps involved and illustrate, with examples, some of the practical challenges of attaining an in-depth understanding of a 'case' as an integrated whole. In keeping with previously published work, we acknowledge the importance of theory to underpin the design, selection, conduct and interpretation of case studies[ 2 ]. In so doing, we make passing reference to the different epistemological approaches used in case study research by key theoreticians and methodologists in this field of enquiry.

This paper is structured around the following main questions: What is a case study? What are case studies used for? How are case studies conducted? What are the potential pitfalls and how can these be avoided? We draw in particular on four of our own recently published examples of case studies (see Tables 1 , 2 , 3 and 4 ) and those of others to illustrate our discussion[ 3 – 7 ].

What is a case study?

A case study is a research approach that is used to generate an in-depth, multi-faceted understanding of a complex issue in its real-life context. It is an established research design that is used extensively in a wide variety of disciplines, particularly in the social sciences. A case study can be defined in a variety of ways (Table 5 ), the central tenet being the need to explore an event or phenomenon in depth and in its natural context. It is for this reason sometimes referred to as a "naturalistic" design; this is in contrast to an "experimental" design (such as a randomised controlled trial) in which the investigator seeks to exert control over and manipulate the variable(s) of interest.

Stake's work has been particularly influential in defining the case study approach to scientific enquiry. He has helpfully characterised three main types of case study: intrinsic , instrumental and collective [ 8 ]. An intrinsic case study is typically undertaken to learn about a unique phenomenon. The researcher should define the uniqueness of the phenomenon, which distinguishes it from all others. In contrast, the instrumental case study uses a particular case (some of which may be better than others) to gain a broader appreciation of an issue or phenomenon. The collective case study involves studying multiple cases simultaneously or sequentially in an attempt to generate a still broader appreciation of a particular issue.

These are, however, not necessarily mutually exclusive categories. In the first of our examples (Table 1 ), we undertook an intrinsic case study to investigate the issue of recruitment of minority ethnic people into the specific context of asthma research studies, but it developed into an instrumental case study through seeking to understand the issue of recruitment of these marginalised populations more generally, generating a number of findings that are potentially transferable to other disease contexts[ 3 ]. In contrast, the other three examples (see Tables 2 , 3 and 4 ) employed collective case study designs to study the introduction of workforce reconfiguration in primary care, the implementation of electronic health records into hospitals, and the ways in which healthcare students learn about patient safety considerations[ 4 – 6 ]. Although our study focusing on the introduction of General Practitioners with Specialist Interests (Table 2 ) was explicitly collective in design (four contrasting primary care organisations were studied), it was also instrumental in that this particular professional group was studied as an exemplar of the more general phenomenon of workforce redesign[ 4 ].

What are case studies used for?

According to Yin, case studies can be used to explain, describe or explore events or phenomena in the everyday contexts in which they occur[ 1 ]. These can, for example, help to understand and explain causal links and pathways resulting from a new policy initiative or service development (see Tables 2 and 3 , for example)[ 1 ]. In contrast to experimental designs, which seek to test a specific hypothesis through deliberately manipulating the environment (for example, in a randomised controlled trial, giving a new drug to randomly selected individuals and then comparing outcomes with controls)[ 9 ], the case study approach lends itself well to capturing information on more explanatory 'how', 'what' and 'why' questions, such as 'how is the intervention being implemented and received on the ground?'. The case study approach can offer additional insights into what gaps exist in its delivery or why one implementation strategy might be chosen over another. This in turn can help develop or refine theory, as shown in our study of the teaching of patient safety in undergraduate curricula (Table 4 )[ 6 , 10 ]. A key question to consider when selecting the most appropriate study design is whether it is desirable or indeed possible to undertake a formal experimental investigation in which individuals and/or organisations are allocated to an intervention or control arm, or whether the wish is instead to obtain a more naturalistic understanding of an issue. The former is ideally studied using a controlled experimental design, whereas the latter is more appropriately studied using a case study design.

Case studies may be approached in different ways depending on the epistemological standpoint of the researcher, that is, whether they take a critical (questioning one's own and others' assumptions), interpretivist (trying to understand individual and shared social meanings) or positivist approach (orientating towards the criteria of natural sciences, such as focusing on generalisability considerations) (Table 6 ). Whilst such a schema can be conceptually helpful, it may be appropriate to draw on more than one approach in any case study, particularly in the context of conducting health services research. Doolin has, for example, noted that in the context of undertaking interpretative case studies, researchers can usefully draw on a critical, reflective perspective which seeks to take into account the wider social and political environment that has shaped the case[ 11 ].

How are case studies conducted?

Here, we focus on the main stages of research activity when planning and undertaking a case study; the crucial stages are: defining the case; selecting the case(s); collecting and analysing the data; interpreting data; and reporting the findings.

Defining the case

Carefully formulated research question(s), informed by the existing literature and a prior appreciation of the theoretical issues and setting(s), are all important in appropriately and succinctly defining the case[ 8 , 12 ]. Crucially, each case should have a pre-defined boundary which clarifies the nature and time period covered by the case study (i.e. its scope, beginning and end), the relevant social group, organisation or geographical area of interest to the investigator, the types of evidence to be collected, and the priorities for data collection and analysis (see Table 7 )[ 1 ]. A theory driven approach to defining the case may help generate knowledge that is potentially transferable to a range of clinical contexts and behaviours; using theory is also likely to result in a more informed appreciation of, for example, how and why interventions have succeeded or failed[ 13 ].

For example, in our evaluation of the introduction of electronic health records in English hospitals (Table 3 ), we defined our cases as the NHS Trusts that were receiving the new technology[ 5 ]. Our focus was on how the technology was being implemented. However, if the primary research interest had been on the social and organisational dimensions of implementation, we might have defined our case differently as a grouping of healthcare professionals (e.g. doctors and/or nurses). The precise beginning and end of the case may however prove difficult to define. Pursuing this same example, when does the process of implementation and adoption of an electronic health record system really begin or end? Such judgements will inevitably be influenced by a range of factors, including the research question, theory of interest, the scope and richness of the gathered data and the resources available to the research team.

Selecting the case(s)

The decision on how to select the case(s) to study is a very important one that merits some reflection. In an intrinsic case study, the case is selected on its own merits[ 8 ]. The case is selected not because it is representative of other cases, but because of its uniqueness, which is of genuine interest to the researchers. This was, for example, the case in our study of the recruitment of minority ethnic participants into asthma research (Table 1 ) as our earlier work had demonstrated the marginalisation of minority ethnic people with asthma, despite evidence of disproportionate asthma morbidity[ 14 , 15 ]. In another example of an intrinsic case study, Hellstrom et al.[ 16 ] studied an elderly married couple living with dementia to explore how dementia had impacted on their understanding of home, their everyday life and their relationships.

For an instrumental case study, selecting a "typical" case can work well[ 8 ]. In contrast to the intrinsic case study, the particular case which is chosen is of less importance than selecting a case that allows the researcher to investigate an issue or phenomenon. For example, in order to gain an understanding of doctors' responses to health policy initiatives, Som undertook an instrumental case study interviewing clinicians who had a range of responsibilities for clinical governance in one NHS acute hospital trust[ 17 ]. Sampling a "deviant" or "atypical" case may however prove even more informative, potentially enabling the researcher to identify causal processes, generate hypotheses and develop theory.

In collective or multiple case studies, a number of cases are carefully selected. This offers the advantage of allowing comparisons to be made across several cases and/or replication. Choosing a "typical" case may enable the findings to be generalised to theory (i.e. analytical generalisation) or to test theory by replicating the findings in a second or even a third case (i.e. replication logic)[ 1 ]. Yin suggests two or three literal replications (i.e. predicting similar results) if the theory is straightforward and five or more if the theory is more subtle. However, critics might argue that selecting 'cases' in this way is insufficiently reflexive and ill-suited to the complexities of contemporary healthcare organisations.

The selected case study site(s) should allow the research team access to the group of individuals, the organisation, the processes or whatever else constitutes the chosen unit of analysis for the study. Access is therefore a central consideration; the researcher needs to come to know the case study site(s) well and to work cooperatively with them. Selected cases need to be not only interesting but also hospitable to the inquiry [ 8 ] if they are to be informative and answer the research question(s). Case study sites may also be pre-selected for the researcher, with decisions being influenced by key stakeholders. For example, our selection of case study sites in the evaluation of the implementation and adoption of electronic health record systems (see Table 3 ) was heavily influenced by NHS Connecting for Health, the government agency that was responsible for overseeing the National Programme for Information Technology (NPfIT)[ 5 ]. This prominent stakeholder had already selected the NHS sites (through a competitive bidding process) to be early adopters of the electronic health record systems and had negotiated contracts that detailed the deployment timelines.

It is also important to consider in advance the likely burden and risks associated with participation for those who (or the site(s) which) comprise the case study. Of particular importance is the obligation for the researcher to think through the ethical implications of the study (e.g. the risk of inadvertently breaching anonymity or confidentiality) and to ensure that potential participants/participating sites are provided with sufficient information to make an informed choice about joining the study. The outcome of providing this information might be that the emotive burden associated with participation, or the organisational disruption associated with supporting the fieldwork, is considered so high that the individuals or sites decide against participation.

In our example of evaluating implementations of electronic health record systems, given the restricted number of early adopter sites available to us, we sought purposively to select a diverse range of implementation cases among those that were available[ 5 ]. We chose a mixture of teaching, non-teaching and Foundation Trust hospitals, and examples of each of the three electronic health record systems procured centrally by the NPfIT. At one recruited site, it quickly became apparent that access was problematic because of competing demands on that organisation. Recognising the importance of full access and co-operative working for generating rich data, the research team decided not to pursue work at that site and instead to focus on other recruited sites.

Collecting the data

In order to develop a thorough understanding of the case, the case study approach usually involves the collection of multiple sources of evidence, using a range of quantitative (e.g. questionnaires, audits and analysis of routinely collected healthcare data) and more commonly qualitative techniques (e.g. interviews, focus groups and observations). The use of multiple sources of data (data triangulation) has been advocated as a way of increasing the internal validity of a study (i.e. the extent to which the method is appropriate to answer the research question)[ 8 , 18 – 21 ]. An underlying assumption is that data collected in different ways should lead to similar conclusions, and approaching the same issue from different angles can help develop a holistic picture of the phenomenon (Table 2 )[ 4 ].

Brazier and colleagues used a mixed-methods case study approach to investigate the impact of a cancer care programme[ 22 ]. Here, quantitative measures were collected with questionnaires before, and five months after, the start of the intervention; these did not yield any statistically significant results. Qualitative interviews with patients, however, helped provide an insight into potentially beneficial process-related aspects of the programme, such as greater perceived patient involvement in care. The authors reported how this case study approach identified a number of contextual factors that were likely to influence the effectiveness of the intervention and that were unlikely to have been obtained from quantitative methods alone.

In collective or multiple case studies, data collection needs to be flexible enough to allow a detailed description of each individual case to be developed (e.g. the nature of different cancer care programmes), before considering the emerging similarities and differences in cross-case comparisons (e.g. to explore why one programme is more effective than another). It is important that data sources from different cases are, where possible, broadly comparable for this purpose even though they may vary in nature and depth.

Analysing, interpreting and reporting case studies

Making sense and offering a coherent interpretation of the typically disparate sources of data (whether qualitative alone or together with quantitative) is far from straightforward. Repeated reviewing and sorting of the voluminous and detail-rich data are integral to the process of analysis. In collective case studies, it is helpful to analyse data relating to the individual component cases first, before making comparisons across cases. Attention needs to be paid to variations within each case and, where relevant, the relationship between different causes, effects and outcomes[ 23 ]. Data will need to be organised and coded to allow the key issues, both derived from the literature and emerging from the dataset, to be easily retrieved at a later stage. An initial coding frame can help capture these issues and can be applied systematically to the whole dataset with the aid of a qualitative data analysis software package.
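The organisation and coding of data described above can be pictured as a simple index from codes to excerpts, grouped by case so that within-case analysis can precede cross-case comparison. The Python sketch below is only an illustration and is not how any particular qualitative data analysis package works; the function names, the tuple layout, the code labels and the example excerpts are all hypothetical.

```python
from collections import defaultdict


def build_code_index(excerpts):
    """Index coded excerpts as code -> case -> list of (source, text).

    excerpts is an iterable of (case_id, source, text, codes) tuples,
    where codes is the list of coding-frame labels applied to the excerpt.
    """
    index = defaultdict(lambda: defaultdict(list))
    for case_id, source, text, codes in excerpts:
        for code in codes:
            index[code][case_id].append((source, text))
    return index


def retrieve(index, code, case_id=None):
    """Return excerpts for a code, within one case or across all cases."""
    by_case = index.get(code, {})
    if case_id is not None:
        return list(by_case.get(case_id, []))
    return [item for case in sorted(by_case) for item in by_case[case]]


# Hypothetical coded excerpts from two case study sites.
data = [
    ("site_A", "interview", "Training came too late.", ["training", "timing"]),
    ("site_A", "observation", "Staff reverted to paper.", ["workarounds"]),
    ("site_B", "interview", "Training was hands-on.", ["training"]),
]
index = build_code_index(data)
print(retrieve(index, "training", case_id="site_A"))  # within-case view
print(retrieve(index, "training"))                    # cross-case view
```

Keeping the case identifier in the index is the design choice that matters here: it lets the same coded dataset support both the detailed description of each case and the later comparison across cases.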

The Framework approach is a practical approach to managing and analysing large datasets, comprising five stages (familiarisation; identifying a thematic framework; indexing; charting; mapping and interpretation). It is particularly useful if time is limited, as was the case in our study of recruitment of South Asians into asthma research (Table 1)[3, 24]. Theoretical frameworks may also play an important role in integrating different sources of data and examining emerging themes. For example, we drew on a socio-technical framework to help explain the connections between different elements (technology, people, and the organisational settings within which they worked) in our study of the introduction of electronic health record systems (Table 3)[5]. Our study of patient safety in undergraduate curricula drew on an evaluation-based approach to design and analysis, which emphasised the importance of the academic, organisational and practice contexts through which students learn (Table 4)[6].
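To make the indexing and charting stages more concrete, the following is a minimal sketch in Python. It is not the procedure used in any of the studies cited here: the coding frame, the site names and the excerpts are all invented for illustration, and simple keyword matching is only a crude stand-in for the researcher's interpretive judgement when assigning codes.

```python
from collections import defaultdict

# Hypothetical coding frame: theme -> keyword stems that signal it.
# Both the themes and the stems are illustrative assumptions.
CODING_FRAME = {
    "trust": ["trust", "suspicio"],
    "language": ["language", "interpreter", "translat"],
}


def index_excerpt(text, frame=CODING_FRAME):
    """Indexing: return the sorted themes whose keyword stems occur in the excerpt."""
    lowered = text.lower()
    return sorted(
        theme
        for theme, stems in frame.items()
        if any(stem in lowered for stem in stems)
    )


def chart(excerpts, frame=CODING_FRAME):
    """Charting: build a case-by-theme matrix of the excerpts indexed to each theme."""
    matrix = defaultdict(lambda: defaultdict(list))
    for case, text in excerpts:
        for theme in index_excerpt(text, frame):
            matrix[case][theme].append(text)
    # Convert nested defaultdicts to plain dicts for easier inspection.
    return {case: dict(themes) for case, themes in matrix.items()}


# Made-up excerpts from two made-up case sites.
excerpts = [
    ("site_A", "Families did not trust researchers they had never met."),
    ("site_A", "We had no interpreter for the consent meeting."),
    ("site_B", "Translated leaflets arrived too late."),
]

charted = chart(excerpts)
for case, themes in charted.items():
    print(case, {theme: len(items) for theme, items in themes.items()})
```

Keeping one row per case and one column per theme mirrors the advice above to analyse each component case first: within-case reading goes along a row, and cross-case comparison goes down a column.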

Case study findings can have implications both for theory development and theory testing. They may establish, strengthen or weaken historical explanations of a case and, in certain circumstances, allow theoretical (as opposed to statistical) generalisation beyond the particular cases studied[12]. These theoretical lenses should not, however, constitute a strait-jacket and the cases should not be "forced to fit" the particular theoretical framework that is being employed.

When reporting findings, it is important to provide the reader with enough contextual information to understand the processes that were followed and how the conclusions were reached. In a collective case study, researchers may choose to present the findings from individual cases separately before amalgamating across cases. Care must be taken to ensure the anonymity of both case sites and individual participants (if agreed in advance) by allocating appropriate codes or withholding descriptors. In the example given in Table 3, we decided against providing detailed information on the NHS sites and individual participants in order to avoid the risk of inadvertent disclosure of identities[5, 25].

What are the potential pitfalls and how can these be avoided?

The case study approach is, as with all research, not without its limitations. When investigating the formal and informal ways undergraduate students learn about patient safety (Table 4), for example, we rapidly accumulated a large quantity of data. The volume of data, together with the time restrictions in place, impacted on the depth of analysis that was possible within the available resources. This highlights the more general importance of avoiding the temptation to collect as much data as possible; adequate time also needs to be set aside for data analysis and interpretation of what are often highly complex datasets.

Case study research has sometimes been criticised for lacking scientific rigour and providing little basis for generalisation (i.e. producing findings that may be transferable to other settings)[1]. There are several ways to address these concerns, including: the use of theoretical sampling (i.e. drawing on a particular conceptual framework); respondent validation (i.e. participants checking emerging findings and the researcher's interpretation, and providing an opinion as to whether they feel these are accurate); and transparency throughout the research process (see Table 8)[8, 18–21, 23, 26]. Transparency can be achieved by describing in detail the steps involved in case selection, data collection, the reasons for the particular methods chosen, and the researcher's background and level of involvement (i.e. being explicit about how the researcher has influenced data collection and interpretation). Seeking potential alternative explanations, and being explicit about how interpretations and conclusions were reached, helps readers to judge the trustworthiness of the case study report. Stake provides a critique checklist for a case study report (Table 9)[8].

Conclusions

The case study approach allows, amongst other things, critical events, interventions, policy developments and programme-based service reforms to be studied in detail in a real-life context. It should therefore be considered when an experimental design is either inappropriate to answer the research questions posed or impossible to undertake. Considering the frequency with which implementations of innovations are now taking place in healthcare settings and how well the case study approach lends itself to in-depth, complex health service research, we believe this approach should be more widely considered by researchers. Though inherently challenging, the research case study can, if carefully conceptualised and thoughtfully undertaken and reported, yield powerful insights into many important aspects of health and healthcare delivery.

References

1. Yin RK: Case study research, design and method. 2009, London: Sage Publications Ltd., 4th ed.

2. Keen J, Packwood T: Qualitative research; case study evaluation. BMJ. 1995, 311: 444-446.

3. Sheikh A, Halani L, Bhopal R, Netuveli G, Partridge M, Car J, et al: Facilitating the Recruitment of Minority Ethnic People into Research: Qualitative Case Study of South Asians and Asthma. PLoS Med. 2009, 6 (10): 1-11.

4. Pinnock H, Huby G, Powell A, Kielmann T, Price D, Williams S, et al: The process of planning, development and implementation of a General Practitioner with a Special Interest service in Primary Care Organisations in England and Wales: a comparative prospective case study. Report for the National Co-ordinating Centre for NHS Service Delivery and Organisation R&D (NCCSDO). 2008, http://www.sdo.nihr.ac.uk/files/project/99-final-report.pdf

5. Robertson A, Cresswell K, Takian A, Petrakaki D, Crowe S, Cornford T, et al: Prospective evaluation of the implementation and adoption of NHS Connecting for Health's national electronic health record in secondary care in England: interim findings. BMJ. 2010, 341: c4564.

6. Pearson P, Steven A, Howe A, Sheikh A, Ashcroft D, Smith P, the Patient Safety Education Study Group: Learning about patient safety: organisational context and culture in the education of healthcare professionals. J Health Serv Res Policy. 2010, 15: 4-10. 10.1258/jhsrp.2009.009052.

7. van Harten WH, Casparie TF, Fisscher OA: The evaluation of the introduction of a quality management system: a process-oriented case study in a large rehabilitation hospital. Health Policy. 2002, 60 (1): 17-37. 10.1016/S0168-8510(01)00187-7.

8. Stake RE: The art of case study research. 1995, London: Sage Publications Ltd.

9. Sheikh A, Smeeth L, Ashcroft R: Randomised controlled trials in primary care: scope and application. Br J Gen Pract. 2002, 52 (482): 746-51.

10. King G, Keohane R, Verba S: Designing Social Inquiry. 1996, Princeton: Princeton University Press.

11. Doolin B: Information technology as disciplinary technology: being critical in interpretative research on information systems. Journal of Information Technology. 1998, 13: 301-311. 10.1057/jit.1998.8.

12. George AL, Bennett A: Case studies and theory development in the social sciences. 2005, Cambridge, MA: MIT Press.

13. Eccles M, the Improved Clinical Effectiveness through Behavioural Research Group (ICEBeRG): Designing theoretically-informed implementation interventions. Implementation Science. 2006, 1: 1-8. 10.1186/1748-5908-1-1.

14. Netuveli G, Hurwitz B, Levy M, Fletcher M, Barnes G, Durham SR, Sheikh A: Ethnic variations in UK asthma frequency, morbidity, and health-service use: a systematic review and meta-analysis. Lancet. 2005, 365 (9456): 312-7.

15. Sheikh A, Panesar SS, Lasserson T, Netuveli G: Recruitment of ethnic minorities to asthma studies. Thorax. 2004, 59 (7): 634.

16. Hellström I, Nolan M, Lundh U: 'We do things together': A case study of 'couplehood' in dementia. Dementia. 2005, 4: 7-22. 10.1177/1471301205049188.

17. Som CV: Nothing seems to have changed, nothing seems to be changing and perhaps nothing will change in the NHS: doctors' response to clinical governance. International Journal of Public Sector Management. 2005, 18: 463-477. 10.1108/09513550510608903.

18. Lincoln Y, Guba E: Naturalistic inquiry. 1985, Newbury Park: Sage Publications.

19. Barbour RS: Checklists for improving rigour in qualitative research: a case of the tail wagging the dog? BMJ. 2001, 322: 1115-1117. 10.1136/bmj.322.7294.1115.

20. Mays N, Pope C: Qualitative research in health care: Assessing quality in qualitative research. BMJ. 2000, 320: 50-52. 10.1136/bmj.320.7226.50.

21. Mason J: Qualitative researching. 2002, London: Sage.

22. Brazier A, Cooke K, Moravan V: Using Mixed Methods for Evaluating an Integrative Approach to Cancer Care: A Case Study. Integr Cancer Ther. 2008, 7: 5-17. 10.1177/1534735407313395.

23. Miles MB, Huberman M: Qualitative data analysis: an expanded sourcebook. 1994, CA: Sage Publications Inc., 2nd ed.

24. Pope C, Ziebland S, Mays N: Analysing qualitative data. Qualitative research in health care. BMJ. 2000, 320: 114-116. 10.1136/bmj.320.7227.114.

25. Cresswell KM, Worth A, Sheikh A: Actor-Network Theory and its role in understanding the implementation of information technology developments in healthcare. BMC Med Inform Decis Mak. 2010, 10 (1): 67. 10.1186/1472-6947-10-67.

26. Malterud K: Qualitative research: standards, challenges, and guidelines. Lancet. 2001, 358: 483-488. 10.1016/S0140-6736(01)05627-6.

27. Yin R: Case study research: design and methods. 1994, Thousand Oaks, CA: Sage Publishing, 2nd ed.

28. Yin R: Enhancing the quality of case studies in health services research. Health Serv Res. 1999, 34: 1209-1224.

29. Green J, Thorogood N: Qualitative methods for health research. 2009, Los Angeles: Sage, 2nd ed.

30. Howcroft D, Trauth E: Handbook of Critical Information Systems Research, Theory and Application. 2005, Cheltenham, UK; Northampton, MA, USA: Edward Elgar.

31. Blaikie N: Approaches to Social Enquiry. 1993, Cambridge: Polity Press.

32. Doolin B: Power and resistance in the implementation of a medical management information system. Info Systems J. 2004, 14: 343-362. 10.1111/j.1365-2575.2004.00176.x.

33. Bloomfield BP, Best A: Management consultants: systems development, power and the translation of problems. Sociological Review. 1992, 40: 533-560.

34. Shanks G, Parr A: Positivist, single case study research in information systems: A critical analysis. Proceedings of the European Conference on Information Systems. 2003, Naples.

Pre-publication history

The pre-publication history for this paper can be accessed here: http://www.biomedcentral.com/1471-2288/11/100/prepub

Acknowledgements

We are grateful to the participants and colleagues who contributed to the individual case studies that we have drawn on. This work received no direct funding, but it has been informed by projects funded by Asthma UK, the NHS Service Delivery Organisation, NHS Connecting for Health Evaluation Programme, and Patient Safety Research Portfolio. We would also like to thank the expert reviewers for their insightful and constructive feedback. Our thanks are also due to Dr. Allison Worth who commented on an earlier draft of this manuscript.

Author information

Authors and affiliations.

Division of Primary Care, The University of Nottingham, Nottingham, UK

Sarah Crowe & Anthony Avery

Centre for Population Health Sciences, The University of Edinburgh, Edinburgh, UK

Kathrin Cresswell, Ann Robertson & Aziz Sheikh

School of Health in Social Science, The University of Edinburgh, Edinburgh, UK

Corresponding author

Correspondence to Sarah Crowe.

Additional information

Competing interests.

The authors declare that they have no competing interests.

Authors' contributions

AS conceived this article. SC, KC and AR wrote this paper with GH, AA and AS all commenting on various drafts. SC and AS are guarantors.

Rights and permissions

This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License ( http://creativecommons.org/licenses/by/2.0 ), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article.

Crowe, S., Cresswell, K., Robertson, A. et al. The case study approach. BMC Med Res Methodol 11, 100 (2011). https://doi.org/10.1186/1471-2288-11-100

Received : 29 November 2010

Accepted : 27 June 2011

Published : 27 June 2011

DOI : https://doi.org/10.1186/1471-2288-11-100

  • Case Study Approach
  • Electronic Health Record System
  • Case Study Design
  • Case Study Site
  • Case Study Report

BMC Medical Research Methodology

ISSN: 1471-2288

Taking a case study approach to assessing alternative leadership models in health care

  • British Journal of Nursing 27(11):608-613

A business journal from the Wharton School of the University of Pennsylvania

Narayana Hrudayalaya: A Model for Accessible, Affordable Health Care?

July 1, 2010 • 13 min read.

Cardiac surgeon Devi Shetty is on a mission to build 5,000-bed "health cities" across India, encouraged by the success of his nine-year-old Narayana Hrudayalaya hospital in Bangalore. He has contained costs at that facility by tweaking pricing and salary structures, driving hard bargains with vendors and negotiating creative partnership deals. Despite the challenges he faces in replicating what he calls "the Walmart approach" to medical services, Shetty believes using economies of scale could lead to a new health care model not only for India, but perhaps also for the world.

Cardiac surgeon Dr. Devi Shetty is on a mission to build 5,000-bed “health cities” across India, encouraged by the success at his nine-year-old Narayana Hrudayalaya hospital in Bangalore. He has contained costs by tweaking processes, driving hard bargains and negotiating creative partnership deals, but faces challenges in replicating that model on a bigger scale. Shetty wants to make quality health care accessible and affordable using economies of scale, or the cost advantages businesses obtain due to expansion. His hospital in Bangalore focuses on cardiac medicine but he wants to extend the model to other specialties, in addition to other locations.

Shetty believes his success could lead to a new health care model not only for India but perhaps also for the world. “The first heart surgery was done over a hundred years ago but even today only 8% of the world’s population can afford heart operations,” Shetty notes. “In India, around 2.5 million people require heart surgeries every year but all of [the country’s doctors] put together perform only 80,000 to 90,000 surgeries a year…. We clearly need to relook and change the way things are being done.”

At his Narayana Hrudayalaya Institute of Cardiac Sciences in Bangalore, the 56-year-old Shetty is doing just that. Patients at his hospital get cardiac care at a cost lower than any other hospital in the country and at a fraction of what it would cost elsewhere in the world, a feat accomplished through what Shetty refers to as “process innovation.” Shetty, who has been in the medical profession for close to 25 years and worked at Guy’s Hospital in London, the Birla Heart Research Foundation in Kolkata (formerly Calcutta) and the Manipal Heart Foundation in Bangalore before branching out on his own, was formerly personal physician to Mother Teresa. His interactions with her, he notes, not only offered the opportunity to closely observe the famed humanitarian’s charitable work but also caused the doctor to begin thinking about how quality health care could be made widely accessible and affordable.

That was how Shetty came to the conclusion that the health care industry needs more process innovation than product innovation. The industry “does not need a magic pill or the fastest scanner or a new procedure,” he states, but instead requires improvements that lower the cost of medical attention and make it more widely available. Shetty’s premise of economies of scale is not radical; in fact, the doctor describes his way as “the Walmart approach.” What sets him apart, however, is that he has successfully adapted the method to a field as complex and costly as cardiac care. “There is no doubt that he has created a very distinct model to take cardiac care to the masses,” notes Vishal Bali, chief executive officer of Fortis Hospitals, a prominent Indian healthcare group.

Now Shetty is ready to aim higher. India currently has around 0.7 hospital beds per thousand people; the key to better aligning those numbers with the population, he states, is creating a chain of large “health cities” across the country. To set the ball rolling, Shetty spearheaded the creation of a 1,400-bed cancer and multispecialty hospital — the largest cancer hospital in the country — at the Bangalore campus. A women and children’s hospital and another for nephrology are also in the works. In addition, the Bangalore facility — which is set to expand to a total of 5,000 beds over the next three years — includes a 500-bed orthopedic hospital, an eye hospital, research facilities and room for about 50 training programs.

Over the next five years, Shetty wants to build similar 5,000-bed health cities across the country. An expansion at his Kolkata hospital is currently underway, and new hospitals in Hyderabad and Jaipur are expected to open for business later this year. Construction is starting on a 1,400-bed hospital in Ahmedabad; additional locations have also been identified. “We want to have around 30,000 beds over the next five years,” Shetty says. “As our volumes increase, we will get further economies of scale. In the next five years we want to be able to do a heart operation for US$800 from point of admission to point of discharge. We believe it is possible.”

Shetty has reason to be confident. Over the years, the Bangalore heart hospital he opened in 2001 grew to 1,000 beds; the facility has added advanced technology and doctors there perform some 30 surgeries a day — the highest number of cardiac surgeries done by any hospital in India. Other hospitals in India, including Escorts, Apollo, Wockhardt and Fortis, perform about half that number. In addition, Shetty’s staff has the capability to do a large number of different cardiac procedures. The hospital’s mortality rate of around 2% and hospital-acquired infection rate of 2.8 per 1000 ICU days are comparable to the best hospitals across the world, Shetty asserts. In an article in Forbes India , the University of Michigan’s C. K. Prahalad said the mortality rate in Narayana Hrudayalaya is “much lower than in New York State for similar kinds of heart disease.”

Serving the Poor

Cardiac surgeries in the United States can cost up to US$50,000. In India, they typically cost around US$5,000-US$7,000. Depending on the complexities of the procedure and the length of the patient’s stay at the hospital, the price tag increases. At Narayana Hrudayalaya, however, surgeries cost less than US$3,000, irrespective of the complexity of the procedure or the length of hospitalization. About 45% of Shetty’s patients pay even less. Of these, about 30% are covered under a micro-insurance plan for health care called Yeshasvini that reimburses Narayana Hrudayalaya at about US$1,200 a surgery. Conceptualized by Shetty and run by an independent trust, Yeshasvini was launched in 2002 in association with the Karnataka state government.

For those who are not part of the insurance plan and can’t afford the hospital’s regular charges, Shetty offers concessional rates. The discounts depend on patients’ financial capacity and are funded either by the hospital’s charitable trust, individual donors or by the hospital itself. Almost 15% of the hospital’s patients benefit from these concessions. In addition, Shetty and his team reach out to patients through a network of rural clinics and via telemedicine facilities. Patients come to the Bangalore facility from more than 50 countries. Shetty’s instructions to his team are clear: No one who comes to Narayana Hrudayalaya will be denied treatment due to a lack of funds.

To ensure the viability of the project, Shetty has devised a hybrid pricing model. Apart from the regular package of US$3,000 a surgery, he also offers semiprivate and private rooms for those who want and can afford better personal amenities. The medical facilities are the same for every patient, however. The upgraded rooms, which comprise around 20% of the total available at the hospital, are priced at US$4,000-US$5,000 and “offset the losses incurred from treating the poor,” Shetty notes.

The managing team at Narayana Hrudayalaya follows the unique accounting practice of studying the profit and loss account on a daily basis. “By monitoring the average realization per surgery and our profitability on a daily basis, we are able to assess how much concession we can afford to give the following day without adversely impacting our profitability,” states Sreenath Reddy, the hospital’s chief financial officer. Reddy expects revenues of US$80 million for the year ending March 2010 and expects the hospital to generate US$200 million annually over the next two years. The hospital has been profitable from the first year. JP Morgan and PineBridge Investments (formerly known as AIG Investments) each hold a 12.5% stake in the company. Kiran Mazumdar-Shaw, chairman and managing director of biotechnology firm Biocon, owns a 2.5% stake, and Shetty and his family own the remainder of the company. Shishir Jain, executive director at JP Morgan, believes Shetty has shown that “it is possible to fulfill a great social need without compromising on the profitability.” Santosh Senapathy, managing director of PineBridge Investments, adds that “Narayana Hrudayalaya will change the way healthcare is delivered across the world.”
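The daily monitoring that Reddy describes can be pictured with a short Python sketch. The article does not say how the hospital actually converts its daily figures into a concession limit, so the rule below (protect a target profit margin and treat any surplus beyond it as available for concessions) is purely an assumption, as are the function name and every figure except the roughly 30 surgeries a day reported earlier.

```python
def concession_budget(surgeries, avg_realisation, avg_cost, target_margin):
    """Return the surplus (US$) above a target margin that could fund concessions.

    Assumed rule, for illustration only: profit must stay at or above
    target_margin * revenue; anything beyond that is treated as available.
    """
    revenue = surgeries * avg_realisation
    profit = revenue - surgeries * avg_cost
    required_profit = target_margin * revenue
    return max(0.0, profit - required_profit)


# Hypothetical day: 30 surgeries, US$2,400 realised and US$2,000 spent per
# surgery on average, with a 10% target margin. Only the surgery count comes
# from the article; the other numbers are invented.
budget = concession_budget(
    surgeries=30, avg_realisation=2400.0, avg_cost=2000.0, target_margin=0.10
)
print(f"Concession headroom for the next day: US${budget:,.0f}")
```

Under this assumed rule, a day on which the average realisation falls to the average cost leaves no headroom at all, which is one way of reading why the figures are reviewed daily instead of monthly.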

Innovations in Operations

Indeed, Shetty has already turned some standard industry practices on their heads. One of his first innovations when he set up Narayana Hrudayalaya in 2001 was in the way doctors are compensated. Typically, cardiac surgeons are paid per surgery and their costs constitute a significant proportion of a hospital’s total expenses. Shetty invited his staff physicians to work for fixed salaries; he did not pay them less than what they would have normally taken home at the end of the month, but he required doctors to perform more surgeries, bringing down the cost per procedure. This approach continues to be one of the core savings areas at Narayana Hrudayalaya.

In addition, Shetty’s father-in-law, who was in the construction business, built the first hospital for him, keeping costs to a minimum. Shetty claims he passed on those savings to patients, and maintains that, even today, construction costs at his hospitals are less than half those of other hospitals. “The way we design the hospitals and our close monitoring of our projects help us to keep a very tight control of our construction costs,” notes Shetty’s son Viren, an engineer and director at the hospital. Shetty’s two other sons are studying medicine.

In the initial days of Narayana Hrudayalaya, patients came because of Shetty’s skill and his reputation. The cost savings he offered started attracting customers in greater numbers. Apart from the surgeries, the Bangalore campus treats about 2,500 people daily in its out-patient department. The increasing volumes in turn have helped lower costs in many ways, staff says. Instead of buying surgical gloves in India, for example, Narayana Hrudayalaya saves about 40% by importing them in container loads from Malaysia. The hospital has moved to digital X-ray technology, saving on the recurring cost of film. Most hospitals use their CT scanners, MRI (magnetic resonance imaging) and other machines for only eight hours a day, but Narayana Hrudayalaya uses them for 14 hours and offers these tests to the patients at lower rates in the late evenings. As volumes increase, per unit costs naturally come down.
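The effect of running scanners for 14 hours instead of eight can be worked through in a few lines of Python. This is only a sketch: the daily fixed cost and the number of scans per hour are invented, and it assumes that throughput per hour stays constant across the longer day. Under that assumption the proportional saving in fixed cost per scan depends only on the ratio of hours, whatever the actual cost figures are.

```python
def fixed_cost_per_scan(daily_fixed_cost, hours_in_use, scans_per_hour):
    """Spread a machine's daily fixed cost over the scans performed that day."""
    return daily_fixed_cost / (hours_in_use * scans_per_hour)


# Hypothetical inputs: US$1,000 of fixed cost per day and 2 scans per hour.
DAILY_FIXED_COST = 1000.0
SCANS_PER_HOUR = 2

typical = fixed_cost_per_scan(DAILY_FIXED_COST, 8, SCANS_PER_HOUR)
extended = fixed_cost_per_scan(DAILY_FIXED_COST, 14, SCANS_PER_HOUR)

# With constant throughput the saving equals 1 - 8/14, independent of the
# invented cost and throughput figures above.
saving = 1 - extended / typical
print(f"Fixed cost per scan falls by {saving:.1%}")
```

Variable costs per scan (staff time, consumables, power) are left out, so the sketch describes only the fixed-cost share of each test.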

For procedures like blood gas analysis, Shetty’s team convinced the equipment vendor that, instead of selling the machine to the hospital, he could simply park it there and make his money by selling the chemical reagents required for the test. The hospital saves on the cost of the machines while the vendor also profits. For the past six months, another vendor has parked his catheterization laboratory equipment at the hospital free of charge. The deal came together, Shetty notes, because the vendor wants to use Narayana Hrudayalaya as a reference: if he can show that his equipment can cope with the patient volumes at Narayana Hrudayalaya, it can work anywhere.

The high patient volumes help Shetty drive a hard bargain with vendors when negotiating prices for everything from basic supplies to sophisticated medical equipment. The new cancer hospital, for example, purchased two linear accelerators (for producing X-rays) that typically cost US$6.4 million each for the price of one machine. The cost of the machines was spread out, interest-free, over seven years. “Given [the hospital’s] volumes and Shetty’s own credibility, every negotiation is as tough as it can be. He certainly gets his pound of flesh,” notes V. Raja, president and CEO of GE Healthcare South Asia, who has been associated with Narayana Hrudayalaya from the beginning. With Shetty now on an expansion drive, Raja is in discussions with him to see how they can structure deals that enable Shetty to achieve economies of scale while bringing more business for GE Healthcare.

Testing an Untested Model

Shetty’s model of 5,000-bed health cities has its share of risks and challenges. It remains to be seen if the doctor can replicate his success in volume-based cardiac care across specialties and cities. He is also considering setting up health care facilities in the Cayman Islands and Malaysia. Observers say that, to succeed, Shetty needs to build organizational and management bandwidth; create teams of medical professionals that share his vision and are willing to work hard; put in place robust processes; and raise the required funding. “The scalability of any model is based on the creation of an organizational structure,” says Bali, of Fortis. “One does not see this at Narayana Hrudayalaya. It has been around for many years and by now the structure should have emerged. One will have to wait and watch if Shetty can indeed scale [his model] beyond one or two institutions.”

Amit Varma, president of healthcare at Religare Enterprises, a financial services group, and director of critical care medicine at the Fortis Escorts group of hospitals, raises another concern. Varma was part of Shetty’s team at Manipal Hospital and at Narayana Hrudayalaya. “The intention is absolutely right but there is a base cost to any procedure and you can bring that down only to a certain level,” he notes. “There is a tipping point beyond which the volume that you do will have an adverse impact on the quality. What that tipping point is remains to be seen.”

But Girdhar Gyani, CEO of the National Accreditation Board for Hospital and Healthcare Providers, believes a commitment to delivering quality service is part of the culture of strong teams. “Shetty’s team in Bangalore is top-of-the-line in terms of quality and I am confident that the rest of the facilities that he builds will be the same too. Shetty is a transformational leader who can bring about a sea change in this industry.”

Mazumdar-Shaw of Biocon, who owns a stake in Shetty’s company, says the doctor brings a missionary work ethic to his efforts and has attracted a talented and committed team of doctors, nurses, paramedics and professionals. She credits Narayana Hrudayalaya with consistently focusing on training and developing specialized skills. “I have no doubt that Narayana Hrudayalaya is scalable in India and Shetty’s concept of 5,000-bed health cities is the way to go. India’s medical talent pool is vast and can certainly sustain this growth.” Raja of GE Healthcare also adds his vote of confidence: “This is a pretty much untested model across the world but Dr. Shetty is fully committed to it and, if anyone can, he can.”

  • Open access
  • Published: 12 September 2024

An open-source framework for end-to-end analysis of electronic health record data

  • Lukas Heumos 1 , 2 , 3 ,
  • Philipp Ehmele 1 ,
  • Tim Treis 1 , 3 ,
  • Julius Upmeier zu Belzen   ORCID: orcid.org/0000-0002-0966-4458 4 ,
  • Eljas Roellin 1 , 5 ,
  • Lilly May 1 , 5 ,
  • Altana Namsaraeva 1 , 6 ,
  • Nastassya Horlava 1 , 3 ,
  • Vladimir A. Shitov   ORCID: orcid.org/0000-0002-1960-8812 1 , 3 ,
  • Xinyue Zhang   ORCID: orcid.org/0000-0003-4806-4049 1 ,
  • Luke Zappia   ORCID: orcid.org/0000-0001-7744-8565 1 , 5 ,
  • Rainer Knoll 7 ,
  • Niklas J. Lang 2 ,
  • Leon Hetzel 1 , 5 ,
  • Isaac Virshup 1 ,
  • Lisa Sikkema   ORCID: orcid.org/0000-0001-9686-6295 1 , 3 ,
  • Fabiola Curion 1 , 5 ,
  • Roland Eils 4 , 8 ,
  • Herbert B. Schiller 2 , 9 ,
  • Anne Hilgendorff 2 , 10 &
  • Fabian J. Theis   ORCID: orcid.org/0000-0002-2419-1943 1 , 3 , 5  

Nature Medicine (2024)

  • Epidemiology
  • Translational research

With progressive digitalization of healthcare systems worldwide, large-scale collection of electronic health records (EHRs) has become commonplace. However, an extensible framework for comprehensive exploratory analysis that accounts for data heterogeneity is missing. Here we introduce ehrapy, a modular open-source Python framework designed for exploratory analysis of heterogeneous epidemiology and EHR data. ehrapy incorporates a series of analytical steps, from data extraction and quality control to the generation of low-dimensional representations. Complemented by rich statistical modules, ehrapy facilitates associating patients with disease states, differential comparison between patient clusters, survival analysis, trajectory inference, causal inference and more. Leveraging ontologies, ehrapy further enables data sharing and training EHR deep learning models, paving the way for foundational models in biomedical research. We demonstrate ehrapy’s features in six distinct examples. We applied ehrapy to stratify patients affected by unspecified pneumonia into finer-grained phenotypes. Furthermore, we reveal biomarkers for significant differences in survival among these groups. Additionally, we quantify medication-class effects of pneumonia medications on length of stay. We further leveraged ehrapy to analyze cardiovascular risks across different data modalities. We reconstructed disease state trajectories in patients with severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) based on imaging data. Finally, we conducted a case study to demonstrate how ehrapy can detect and mitigate biases in EHR data. ehrapy, thus, provides a framework that we envision will standardize analysis pipelines on EHR data and serve as a cornerstone for the community.


Electronic health records (EHRs) are becoming increasingly common due to standardized data collection 1 and digitalization in healthcare institutions. EHRs collected at medical care sites serve as efficient storage and sharing units of health information 2 , enabling the informed treatment of individuals using the patient’s complete history 3 . Routinely collected EHR data are approaching genomic-scale size and complexity 4 , posing challenges in extracting information without quantitative analysis methods. The application of such approaches to EHR databases 1 , 5 , 6 , 7 , 8 , 9 has enabled the prediction and classification of diseases 10 , 11 , study of population health 12 , determination of optimal treatment policies 13 , 14 , simulation of clinical trials 15 and stratification of patients 16 .

However, current EHR datasets suffer from serious limitations, such as data collection issues, inconsistencies and lack of data diversity. EHR data collection and sharing problems often arise due to non-standardized formats, with disparate systems using exchange protocols, such as Health Level Seven International (HL7) and Fast Healthcare Interoperability Resources (FHIR) 17 . In addition, EHR data are stored in various on-disk formats, including, but not limited to, relational databases and CSV, XML and JSON formats. These variations pose challenges with respect to data retrieval, scalability, interoperability and data sharing.

Beyond format variability, inherent biases of the collected data can compromise the validity of findings. Selection bias stemming from non-representative sample composition can lead to skewed inferences about disease prevalence or treatment efficacy 18 , 19 . Filtering bias arises through inconsistent criteria for data inclusion, obscuring true variable relationships 20 . Surveillance bias exaggerates associations between exposure and outcomes due to differential monitoring frequencies 21 . EHR data are further prone to missing data 22 , 23 , which can be broadly classified into three categories: missing completely at random (MCAR), where missingness is unrelated to the data; missing at random (MAR), where missingness depends on observed data; and missing not at random (MNAR), where missingness depends on unobserved data 22 , 23 . Information and coding biases, related to inaccuracies in data recording or coding inconsistencies, respectively, can lead to misclassification and unreliable research conclusions 24 , 25 . Data may even contradict itself, such as when measurements were reported for deceased patients 26 , 27 . Technical variation and differing data collection standards lead to distribution differences and inconsistencies in representation and semantics across EHR datasets 28 , 29 . Attrition and confounding biases, resulting from differential patient dropout rates or unaccounted external variable effects, can significantly skew study outcomes 30 , 31 , 32 . The diversity of EHR data that comprise demographics, laboratory results, vital signs, diagnoses, medications, x-rays, written notes and even omics measurements amplifies all the aforementioned issues.
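The three missingness mechanisms can be illustrated with a small simulation. The sketch below is not part of ehrapy; the variables (`age`, `lab`), thresholds and probabilities are hypothetical and chosen only to make the mechanisms visible. In this toy setup, the mean of the observed values stays close to the true mean under MCAR but drifts below it under MAR and MNAR.

```python
import random
from statistics import mean

def simulate(n=10_000, seed=0):
    """Return toy (age, lab) pairs in which lab rises with age plus noise."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        age = rng.uniform(0, 18)
        lab = 10 + age + rng.gauss(0, 2)
        rows.append((age, lab))
    return rows

def mask_mcar(rows, rng, p=0.3):
    # MCAR: every value has the same chance of being missing.
    return [rng.random() < p for _ in rows]

def mask_mar(rows, rng):
    # MAR: missingness depends only on the observed covariate (age).
    return [rng.random() < (0.5 if age > 9 else 0.1) for age, _ in rows]

def mask_mnar(rows, rng):
    # MNAR: missingness depends on the unobserved lab value itself.
    return [rng.random() < (0.5 if lab > 19 else 0.1) for _, lab in rows]

def observed_mean(rows, mask):
    """Mean of the lab values that are not masked as missing."""
    return mean(lab for (_, lab), missing in zip(rows, mask) if not missing)

rows = simulate()
rng = random.Random(1)
true_mean = mean(lab for _, lab in rows)
for name, make_mask in [("MCAR", mask_mcar), ("MAR", mask_mar), ("MNAR", mask_mnar)]:
    shift = observed_mean(rows, make_mask(rows, rng)) - true_mean
    print(name, "shift of observed mean:", round(shift, 2))
```

Under MAR the shift can in principle be corrected by conditioning on the observed covariate, whereas under MNAR the information needed for a correction is itself missing.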

Addressing these challenges requires rigorous study design, careful data pre-processing and continuous bias evaluation through exploratory data analysis. Several EHR data pre-processing and analysis workflows were previously developed 4 , 33 , 34 , 35 , 36 , 37 , but none of them enables the analysis of heterogeneous data, provides in-depth documentation, is available as a software package or allows for exploratory visual analysis. Current EHR analysis pipelines, therefore, differ considerably in their approaches and are often commercial, vendor-specific solutions 38 . This is in contrast to strategies using community standards for the analysis of omics data, such as Bioconductor 39 or scverse 40 . As a result, EHR data frequently remain underexplored and are commonly investigated only for a particular research question 41 . Even in such cases, EHR data are then frequently input into machine learning models with serious data quality issues that greatly impact prediction performance and generalizability 42 .

To address this lack of analysis tooling, we developed the EHR Analysis in Python framework, ehrapy, which enables exploratory analysis of diverse EHR datasets. The ehrapy package is purpose-built to organize, analyze, visualize and statistically compare complex EHR data. ehrapy can be applied to datasets of different data types, sizes, diseases and origins. To demonstrate this versatility, we applied ehrapy to datasets obtained from EHR and population-based studies. Using the Pediatric Intensive Care (PIC) EHR database 43 , we stratified patients diagnosed with ‘unspecified pneumonia’ into distinct clinically relevant groups, extracted clinical indicators of pneumonia through statistical analysis and quantified medication-class effects on length of stay (LOS) with causal inference. Using the UK Biobank 44 (UKB), a population-scale cohort comprising over 500,000 participants from the United Kingdom, we employed ehrapy to explore cardiovascular risk factors using clinical predictors, metabolomics, genomics and retinal imaging-derived features. Additionally, we performed image analysis to project disease progression through fate mapping in patients affected by coronavirus disease 2019 (COVID-19) using chest x-rays. Finally, we demonstrate how exploratory analysis with ehrapy unveils and mitigates biases in over 100,000 visits by patients with diabetes across 130 US hospitals. We provide online links to additional use cases that demonstrate ehrapy’s usage with further datasets, including MIMIC-II (ref. 45 ), and for various medical conditions, such as patients subject to indwelling arterial catheter usage. ehrapy is compatible with any EHR dataset that can be transformed into vectors and is accessible as a user-friendly open-source software package hosted at https://github.com/theislab/ehrapy and installable from PyPI. It comes with comprehensive documentation, tutorials and further examples, all available at https://ehrapy.readthedocs.io .

ehrapy: a framework for exploratory EHR data analysis

The foundation of ehrapy is a robust and scalable data storage backend that is combined with a series of pre-processing and analysis modules. In ehrapy, EHR data are organized as a data matrix where observations are individual patient visits (or patients, in the absence of follow-up visits), and variables represent all measured quantities ( Methods ). These data matrices are stored together with metadata of observations and variables. By leveraging the AnnData (annotated data) data structure that implements this design, ehrapy builds upon established standards and is compatible with analysis and visualization functions provided by the omics scverse 40 ecosystem. AnnData readers are also available in R, Julia and JavaScript 46 . We additionally provide a dataset module with more than 20 publicly available EHR datasets that can be loaded in AnnData format to kickstart analysis and development with ehrapy.
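The layout described above (a visits-by-variables matrix stored alongside per-visit and per-variable metadata) can be pictured with a small stand-in. The `VisitMatrix` class below is hypothetical and deliberately does not reproduce the AnnData API; the field names only mirror the roles of the matrix, the observation metadata and the variable metadata.

```python
from dataclasses import dataclass, field

@dataclass
class VisitMatrix:
    """Toy stand-in for an annotated data matrix (not the AnnData API)."""
    X: list    # rows = patient visits, columns = variables
    obs: list  # one metadata dict per visit (row)
    var: list  # one metadata dict per variable (column)
    uns: dict = field(default_factory=dict)  # unstructured analysis results

    def __post_init__(self):
        if len(self.X) != len(self.obs):
            raise ValueError("one obs entry is required per row of X")
        if any(len(row) != len(self.var) for row in self.X):
            raise ValueError("one var entry is required per column of X")

    @property
    def shape(self):
        return (len(self.obs), len(self.var))

    def column(self, name):
        """Return all values of the variable called `name`."""
        j = [v["name"] for v in self.var].index(name)
        return [row[j] for row in self.X]

# Hypothetical example data; None marks a missing measurement.
visits = VisitMatrix(
    X=[[38.5, 12.0], [37.1, None], [39.2, 48.3]],
    obs=[{"visit_id": "v1", "unit": "PICU"},
         {"visit_id": "v2", "unit": "GICU"},
         {"visit_id": "v3", "unit": "PICU"}],
    var=[{"name": "temperature_c", "type": "numeric"},
         {"name": "crp_mg_l", "type": "numeric"}],
)
print(visits.shape, visits.column("crp_mg_l"))
```

Keeping the metadata aligned with the matrix axes is what allows downstream functions to filter, group and annotate visits without losing track of which row or column a value belongs to.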

For standardized analysis of EHR data, it is crucial that these data are encoded and stored in consistent, reusable formats. Thus, ehrapy requires that input data are organized in structured vectors. Readers for common formats, such as CSV, OMOP 47 or SQL databases, are available in ehrapy. Data loaded into AnnData objects can be mapped against several hierarchical ontologies 48 , 49 , 50 , 51 ( Methods ). Clinical keywords of free text notes can be automatically extracted ( Methods ).

Powered by scanpy, which scales to millions of observations 52 ( Methods and Supplementary Table 1 ), and the machine learning library scikit-learn 53 , ehrapy provides more than 100 composable analysis functions organized in modules from which custom analysis pipelines can be built. Each function interacts directly with the AnnData object and adds all intermediate results to it for simple access and reuse. To facilitate setting up these pipelines, ehrapy guides analysts through a general analysis pipeline (Fig. 1 ). At any step of an analysis pipeline, community software packages can be integrated without any vendor lock-in. Because ehrapy is built on open standards, it can be purposefully extended to solve new challenges, such as the development of foundational models ( Methods ).

figure 1

a , Heterogeneous health data are first loaded into memory as an AnnData object with patient visits as observational rows and variables as columns. Next, the data can be mapped against ontologies, and key terms are extracted from free text notes. b , The EHR data are subject to quality control where low-quality or spurious measurements are removed or imputed. Subsequently, numerical data are normalized, and categorical data are encoded. Data from different sources with data distribution shifts are integrated, embedded, clustered and annotated in a patient landscape. c , Further downstream analyses depend on the question of interest and can include the inference of causal effects and trajectories, survival analysis or patient stratification.

In the ehrapy analysis pipeline, EHR data are initially inspected for quality issues by analyzing feature distributions that may skew results and by detecting visits and features with high missing rates that ehrapy can then impute ( Methods ). ehrapy tracks all filtering steps while keeping track of population dynamics to highlight potential selection and filtering biases ( Methods ). Subsequently, ehrapy’s normalization and encoding functions ( Methods ) are applied to achieve a uniform numerical representation that facilitates data integration and corrects for dataset shift effects ( Methods ). Calculated lower-dimensional representations can subsequently be visualized, clustered and annotated to obtain a patient landscape ( Methods ). Such annotated groups of patients can be used for statistical comparisons to find differences in features among them to ultimately learn markers of patient states.
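The first quality control step, detecting features with high missing rates, can be sketched in a few lines. This is a minimal illustration under stated assumptions rather than ehrapy's implementation: the function names, the 50% threshold and the toy visits are all hypothetical.

```python
def missing_rates(visits, features):
    """Fraction of visits in which each feature is missing (None)."""
    n = len(visits)
    return {f: sum(v.get(f) is None for v in visits) / n for f in features}

def filter_features(visits, features, max_missing=0.5):
    """Split features into kept and dropped by missing rate.

    Returning the dropped list as well keeps the filtering step auditable.
    """
    rates = missing_rates(visits, features)
    kept = [f for f in features if rates[f] <= max_missing]
    dropped = [f for f in features if rates[f] > max_missing]
    return kept, dropped

# Hypothetical toy visits; None marks a missing measurement.
visits = [
    {"crp": 12.0, "pct": None, "heart_rate": None},
    {"crp": 50.6, "pct": 1.4, "heart_rate": None},
    {"crp": None, "pct": 0.3, "heart_rate": 110},
    {"crp": 8.1, "pct": 0.2, "heart_rate": None},
]
features = ["crp", "pct", "heart_rate"]
print(missing_rates(visits, features))
print(filter_features(visits, features, max_missing=0.5))
```

Whether a sparse feature is dropped or imputed is an analysis decision; recording which features were removed at each step is what makes later checks for filtering bias possible.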

As analysis goals can differ between users and datasets, the ehrapy analysis pipeline is customizable during the final knowledge inference step. ehrapy provides statistical methods for group comparison and extensive support for survival analysis ( Methods ), enabling the discovery of biomarkers. Furthermore, ehrapy offers functions for causal inference to go from statistically determined associations to causal relations ( Methods ). Moreover, patient visits in aggregated EHR data can be regarded as snapshots where individual measurements taken at specific timepoints might not adequately reflect the underlying progression of disease and may instead result from unrelated variation due to, for example, day-to-day differences 54 , 55 , 56 . Therefore, disease progression models should rely on analysis of the underlying clinical data, as disease progression in an individual patient may not be monotonic in time. ehrapy allows for the use of advanced trajectory inference methods to overcome sparse measurements 57 , 58 , 59 . We show that this approach can order snapshots to calculate a pseudotime that can adequately reflect the progression of the underlying clinical process. Given a sufficient number of snapshots, ehrapy increases the potential to understand disease progression, which is likely not robustly captured within a single EHR but, rather, across several.

ehrapy enables patient stratification in pneumonia cases

To demonstrate ehrapy’s capability to analyze heterogeneous datasets from a broad patient set across multiple care units, we applied our exploratory strategy to the PIC 43 database. The PIC database is a single-center database hosting information on children admitted to critical care units at the Children’s Hospital of Zhejiang University School of Medicine in China. It contains 13,499 distinct hospital admissions of 12,881 individual pediatric patients admitted between 2010 and 2018 for whom demographics, diagnoses, doctors’ notes, vital signs, laboratory and microbiology tests, medications, fluid balances and more were collected (Extended Data Figs. 1 and 2a and Methods ). After missing data imputation and subsequent pre-processing (Extended Data Figs. 2b,c and 3 and Methods ), we generated a uniform manifold approximation and projection (UMAP) embedding to visualize variation across all patients using ehrapy (Fig. 2a ). This visualization of the low-dimensional patient manifold shows the heterogeneity of the collected data in the PIC database, with malformations, perinatal and respiratory being the most abundant International Classification of Diseases (ICD) chapters (Fig. 2b ). The most common respiratory disease categories (Fig. 2c ) were labeled pneumonia and influenza ( n  = 984). We focused on pneumonia to apply ehrapy to a challenging, broad-spectrum disease that affects all age groups. Pneumonia is a prevalent respiratory infection that poses a substantial burden on public health 60 and is characterized by inflammation of the alveoli and distal airways 60 . Individuals with pre-existing chronic conditions are particularly vulnerable, as are children under the age of 5 (ref. 61 ). Pneumonia can be caused by a range of microorganisms, encompassing bacteria, respiratory viruses and fungi.

figure 2

a , UMAP of all patient visits in the ICU with primary discharge diagnosis grouped by ICD chapter. b , The prevalence of respiratory diseases prompted us to investigate them further. c , Respiratory categories show the abundance of influenza and pneumonia diagnoses that we investigated more closely. d , We observed the ‘unspecified pneumonia’ subgroup, which led us to investigate and annotate it in more detail. e , The previously ‘unspecified pneumonia’-labeled patients were annotated using several clinical features (Extended Data Fig. 5 ), of which the most important ones are shown in the heatmap ( f ). g , Example disease progression of an individual child with pneumonia illustrating pharmacotherapy over time until positive A. baumannii swab.

We selected the age group ‘youths’ (13 months to 18 years of age) for further analysis, addressing a total of 265 patients who dominated the pneumonia cases and were diagnosed with ‘unspecified pneumonia’ (Fig. 2d and Extended Data Fig. 4 ). Neonates (0–28 d old) and infants (29 d to 12 months old) were excluded from the analysis as the disease context is significantly different in these age groups due to distinct anatomical and physical conditions. Patients were 61% male, had a total of 277 admissions, had a mean age at admission of 54 months (median, 38 months) and had an average LOS of 15 d (median, 7 d). Of these, 152 patients were admitted to the pediatric intensive care unit (PICU), 118 to the general ICU (GICU), four to the surgical ICU (SICU) and three to the cardiac ICU (CICU). Laboratory measurements typically had 12–14% missing data, except for serum procalcitonin (PCT), a marker for bacterial infections, with 24.5% missing, and C-reactive protein (CRP), a marker of inflammation, with 16.8% missing. Measurements assigned as ‘vital signs’ contained between 44% and 54% missing values. Stratifying patients with unspecified pneumonia further enables a more nuanced understanding of the disease, potentially facilitating tailored approaches to treatment.

To deepen clinical phenotyping for the disease group ‘unspecified pneumonia’, we calculated a k-nearest neighbor graph to cluster patients into groups and visualize these in UMAP space ( Methods ). Leiden clustering 62 identified four patient groupings with distinct clinical features that we annotated (Fig. 2e ). To identify the laboratory values, medications and pathogens that were most characteristic for these four groups (Fig. 2f ), we applied t-tests for numerical data and g-tests for categorical data between the identified groups using ehrapy (Extended Data Fig. 5 and Methods ). Based on this analysis, we identified patient groups with ‘sepsis-like’, ‘severe pneumonia with co-infection’, ‘viral pneumonia’ and ‘mild pneumonia’ phenotypes. The ‘sepsis-like’ group of patients (n = 28) was characterized by rapid disease progression as exemplified by an increased number of deaths (adjusted P ≤ 5.04 × 10⁻³, 43% (n = 28), 95% confidence interval (CI): 23%, 62%); indication of multiple organ failure, such as elevated creatinine (adjusted P ≤ 0.01, 52.74 ± 23.71 μmol L⁻¹) or reduced albumin levels (adjusted P ≤ 2.89 × 10⁻⁴, 33.40 ± 6.78 g L⁻¹); and increased expression levels and peaks of inflammation markers, including PCT (adjusted P ≤ 3.01 × 10⁻², 1.42 ± 2.03 ng ml⁻¹), whole blood cell count, neutrophils, lymphocytes, monocytes and lower platelet counts (adjusted P ≤ 6.3 × 10⁻², 159.30 ± 142.00 × 10⁹ per liter) and changes in electrolyte levels, that is, lower potassium levels (adjusted P ≤ 0.09 × 10⁻², 3.14 ± 0.54 mmol L⁻¹). 
Patients whom we associated with the term ‘severe pneumonia with co-infection’ (n = 74) were characterized by prolonged ICU stays (adjusted P ≤ 3.59 × 10⁻⁴, 15.01 ± 29.24 d); organ involvement, such as higher levels of creatinine (adjusted P ≤ 1.10 × 10⁻⁴, 52.74 ± 23.71 μmol L⁻¹) and lower platelet count (adjusted P ≤ 5.40 × 10⁻²³, 159.30 ± 142.00 × 10⁹ per liter); increased inflammation markers, such as peaks of PCT (adjusted P ≤ 5.06 × 10⁻⁵, 1.42 ± 2.03 ng ml⁻¹), CRP (adjusted P ≤ 1.40 × 10⁻⁶, 50.60 ± 37.58 mg L⁻¹) and neutrophils (adjusted P ≤ 8.51 × 10⁻⁶, 13.01 ± 6.98 × 10⁹ per liter); detection of bacteria in combination with additional fungal pathogens in sputum samples (adjusted P ≤ 1.67 × 10⁻², 26% (n = 74), 95% CI: 16%, 36%); and increased application of medication, including antifungals (adjusted P ≤ 1.30 × 10⁻⁴, 15% (n = 74), 95% CI: 7%, 23%) and catecholamines (adjusted P ≤ 2.0 × 10⁻², 45% (n = 74), 95% CI: 33%, 56%). Patients in the ‘mild pneumonia’ group were characterized by positive sputum cultures in the presence of relatively lower inflammation markers, such as PCT (adjusted P ≤ 1.63 × 10⁻³, 1.42 ± 2.03 ng ml⁻¹) and CRP (adjusted P ≤ 0.03 × 10⁻¹, 50.60 ± 37.58 mg L⁻¹), while receiving antibiotics more frequently (adjusted P ≤ 1.00 × 10⁻⁵, 80% (n = 78), 95% CI: 70%, 89%) and additional medications (electrolytes, blood thinners and circulation-supporting medications) (adjusted P ≤ 1.00 × 10⁻⁵, 82% (n = 78), 95% CI: 73%, 91%). 
Finally, patients in the ‘viral pneumonia’ group were characterized by shorter LOSs (adjusted P ≤ 8.00 × 10⁻⁶, 15.01 ± 29.24 d), a lack of non-viral pathogen detection in combination with higher lymphocyte counts (adjusted P ≤ 0.01, 4.11 ± 2.49 × 10⁹ per liter), lower levels of PCT (adjusted P ≤ 0.03 × 10⁻², 1.42 ± 2.03 ng ml⁻¹) and reduced application of catecholamines (adjusted P ≤ 5.96 × 10⁻⁷, 15% (n = 97), 95% CI: 8%, 23%), antibiotics (adjusted P ≤ 8.53 × 10⁻⁶, 41% (n = 97), 95% CI: 31%, 51%) and antifungals (adjusted P ≤ 5.96 × 10⁻⁷, 0% (n = 97), 95% CI: 0%, 0%).
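For the categorical comparisons, the g-test statistic is G = 2 Σ O ln(O/E), where O and E are the observed and expected cell counts of a contingency table. The sketch below computes it for a 2 × 2 table, for example a medication given or not given in one cluster versus all other patients. The counts are hypothetical, the P value helper is valid only for one degree of freedom, and no multiple-testing adjustment is applied, so this illustrates the statistic and not ehrapy's implementation.

```python
import math

def g_statistic(table):
    """G = 2 * sum(O * ln(O / E)) over all cells of a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    g = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            if observed > 0:  # a zero cell contributes nothing to the sum
                g += observed * math.log(observed / expected)
    return max(2.0 * g, 0.0)  # guard against tiny negative rounding errors

def p_value_df1(g):
    """Upper-tail chi-square probability for 1 degree of freedom (2 x 2 table)."""
    return math.erfc(math.sqrt(g / 2.0))

# Hypothetical counts: rows = cluster vs rest, columns = medication yes / no.
table = [[30, 10],
         [10, 30]]
g = g_statistic(table)
print("G =", round(g, 2), "P =", p_value_df1(g))
```

When many features are tested across several clusters, the resulting P values need a correction for multiple testing before they are interpreted, which is why the text reports adjusted P values.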

To demonstrate the ability of ehrapy to examine EHR data from different levels of resolution, we additionally reconstructed a case from the ‘severe pneumonia with co-infection’ group (Fig. 2g ). In this case, the analysis revealed that CRP levels remained elevated despite broad-spectrum antibiotic treatment until a positive Acinetobacter baumannii result led to a change in medication and a subsequent decrease in CRP and monocyte levels.

ehrapy facilitates extraction of pneumonia indicators

ehrapy’s survival analysis module allowed us to identify clinical indicators of disease stages that could be used as biomarkers through Kaplan–Meier analysis. We found strong variance in overall aspartate aminotransferase (AST), alanine aminotransferase (ALT), gamma-glutamyl transferase (GGT) and bilirubin levels (Fig. 3a ), including changes over time (Extended Data Fig. 6a,b ), in all four ‘unspecified pneumonia’ groups. AST, ALT and GGT are routinely used to assess liver function, and studies provide evidence that their levels are elevated during respiratory infections 63 , including severe pneumonia 64 , and can guide diagnosis and management of pneumonia in children 63 . We confirmed reduced survival in more severely affected children (‘sepsis-like pneumonia’ and ‘severe pneumonia with co-infection’) using Kaplan–Meier curves and a multivariate log-rank test (Fig. 3b ; P ≤ 1.09 × 10⁻¹⁸) through ehrapy. To verify the association of this trajectory with altered AST, ALT and GGT expression levels, we further grouped all patients based on liver enzyme reference ranges ( Methods and Supplementary Table 2 ). By Kaplan–Meier survival analysis, cases with peaks of GGT (P ≤ 1.4 × 10⁻², 58.01 ± 2.03 U L⁻¹), ALT (P ≤ 2.9 × 10⁻², 43.59 ± 38.02 U L⁻¹) and AST (P ≤ 4.8 × 10⁻⁴, 78.69 ± 60.03 U L⁻¹) in the ‘outside the norm’ range were found to correlate with lower survival in all groups (Fig. 3c and Extended Data Fig. 6 ), in line with previous studies 63 , 65 . Bilirubin was not found to significantly affect survival (P ≤ 2.1 × 10⁻¹, 12.57 ± 21.22 mg dl⁻¹).
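The Kaplan–Meier estimator behind these curves multiplies, at each event time, the fraction of at-risk patients who survive that time. A minimal pure-Python sketch follows; it is an illustration of the estimator, not ehrapy's survival module, and the durations and event flags are hypothetical.

```python
def kaplan_meier(durations, events):
    """Return [(time, survival)] at each distinct event time.

    durations: follow-up time per patient
    events:    1 if the event (e.g. death) was observed, 0 if censored
    """
    data = sorted(zip(durations, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = leaving = 0
        # Collect everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        # Censored patients leave the risk set without changing survival.
        at_risk -= leaving
    return curve

# Hypothetical follow-up in days; the third patient is censored at day 8.
print(kaplan_meier([5, 5, 8, 10], [1, 1, 0, 1]))
```

Comparing such curves between groups, for example patients with enzyme peaks inside versus outside the reference range, is then a matter of a log-rank test on the same duration and event data.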

figure 3

a , Line plots of major hepatic system laboratory measurements per group show variance in the measurements per pneumonia group. b , Kaplan–Meier survival curves demonstrate lower survival for ‘sepsis-like’ and ‘severe pneumonia with co-infection’ groups. c , Kaplan–Meier survival curves for children with GGT measurements outside the norm range display lower survival.

ehrapy quantifies medication class effect on LOS

Pneumonia requires case-specific medications due to its diverse causes. To demonstrate the potential of ehrapy’s causal inference module, we quantified the effect of medication on ICU LOS to evaluate case-specific administration of medication. In contrast to causal discovery that attempts to find a causal graph reflecting the causal relationships, causal inference is a statistical process used to investigate possible effects when altering a provided system, as represented by a causal graph and observational data (Fig. 4a ) 66 . This approach allows identifying and quantifying the impact of specific interventions or treatments on outcome measures, thereby providing insight for evidence-based decision-making in healthcare. Causal inference relies on datasets incorporating interventions to accurately quantify effects.

figure 4

a , ehrapy’s causal module is based on the strategy of the tool ‘dowhy’. Here, EHR data containing treatment, outcome and measurements and a causal graph serve as input for causal effect quantification. The process includes the identification of the target estimand based on the causal graph, the estimation of causal effects using various models and, finally, refutation where sensitivity analyses and refutation tests are performed to assess the robustness of the results and assumptions. b , Curated causal graph using age, liver damage and inflammation markers as disease progression proxies together with medications as interventions to assess the causal effect on length of ICU stay. c , Determined causal effect strength on LOS in days of administered medication categories.

We manually constructed a minimal causal graph with ehrapy (Fig. 4b ) on records of treatment with corticosteroids, carbapenems, penicillins, cephalosporins and antifungal and antiviral medications as interventions (Extended Data Fig. 7 and Methods ). We assumed that the medications affect disease progression proxies, such as inflammation markers and markers of organ function. The selection of ‘interventions’ is consistent with current treatment standards for bacterial pneumonia and respiratory distress 67 , 68 . Based on the approach of the tool ‘dowhy’ 69 (Fig. 4a ), ehrapy’s causal module identified the application of corticosteroids, antivirals and carbapenems to be associated with shorter LOSs, in line with current evidence 61 , 70 , 71 , 72 . In contrast, penicillins and cephalosporins were associated with longer LOSs, whereas antifungal medication did not strongly influence LOS (Fig. 4c ).
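Why a causal graph matters for such estimates can be seen in a toy example of backdoor adjustment, one of the standard estimation strategies once the estimand has been identified. The sketch below is not the ‘dowhy’ workflow and omits the refutation step entirely; the single confounder (`severity`), the records and the effect size are hypothetical. Sicker patients are treated more often and stay longer, so the naive comparison suggests that the medication prolongs LOS, whereas stratifying on severity recovers the shortening effect built into the data.

```python
from collections import defaultdict
from statistics import mean

def backdoor_ate(records):
    """Average treatment effect via stratification on one confounder.

    records: iterable of (confounder, treated, outcome) tuples.
    ATE = sum_z P(z) * (E[Y | T=1, z] - E[Y | T=0, z])
    """
    strata = defaultdict(lambda: {0: [], 1: []})
    n = 0
    for z, treated, outcome in records:
        strata[z][int(treated)].append(outcome)
        n += 1
    ate = 0.0
    for z, groups in strata.items():
        if not groups[0] or not groups[1]:
            raise ValueError(f"no overlap in stratum {z!r}")
        weight = (len(groups[0]) + len(groups[1])) / n
        ate += weight * (mean(groups[1]) - mean(groups[0]))
    return ate

def naive_difference(records):
    """Unadjusted difference in mean outcome between treated and untreated."""
    treated = [y for _, t, y in records if t]
    control = [y for _, t, y in records if not t]
    return mean(treated) - mean(control)

# Hypothetical records: (severity, medication given, LOS in days).
records = [
    ("mild", 1, 3), ("mild", 0, 5), ("mild", 0, 5), ("mild", 0, 5),
    ("severe", 1, 13), ("severe", 1, 13), ("severe", 1, 13), ("severe", 0, 15),
]
print("naive:", naive_difference(records))
print("adjusted:", backdoor_ate(records))
```

The adjusted estimate is only as good as the graph: an unmeasured confounder would leave the same kind of bias in place, which is what refutation tests are designed to probe.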

ehrapy enables deriving population-scale risk factors

To illustrate the advantages of using a unified data management and quality control framework, such as ehrapy, we modeled myocardial infarction risk using Cox proportional hazards models on UKB 44 data. Large population cohort studies, such as the UKB, enable the investigation of common diseases across a wide range of modalities, including genomics, metabolomics, proteomics, imaging data and common clinical variables (Fig. 5a,b ). From these, we used a publicly available polygenic risk score for coronary heart disease 73 comprising 6.6 million variants, 80 nuclear magnetic resonance (NMR) spectroscopy-based metabolomics 74 features, 81 features derived from retinal optical coherence tomography 75 , 76 and the Framingham Risk Score 77 feature set, which includes known clinical predictors, such as age, sex, body mass index, blood pressure, smoking behavior and cholesterol levels. We excluded features with more than 10% missingness and imputed the remaining missing values ( Methods ). Furthermore, individuals with events up to 1 year after the sampling time were excluded from the analyses, ultimately selecting 29,216 individuals for whom all mentioned data types were available (Extended Data Figs. 8 and 9 and Methods ). Myocardial infarction, as defined by our mapping to the phecode nomenclature 51 , was defined as the endpoint (Fig. 5c ). We modeled the risk for myocardial infarction 1 year after either the metabolomic sample was obtained or imaging was performed.

figure 5

a , The UKB includes 502,359 participants from 22 assessment centers. Most participants have genetic data (97%) and physical measurement data (93%), but fewer have data for complex measures, such as metabolomics, retinal imaging or proteomics. b , We found a distinct cluster of individuals (bottom right) from the Birmingham assessment center in the retinal imaging data, which is an artifact of the image acquisition process and was, thus, excluded. c , Myocardial infarctions are recorded for 15% of the male and 7% of the female study population. Kaplan–Meier estimators with 95% CIs are shown. d , For every modality combination, a linear Cox proportional hazards model was fit to determine the prognostic potential of these for myocardial infarction. Cardiovascular risk factors show expected positive log hazard ratios (log (HRs)) for increased blood pressure or total cholesterol and negative ones for sampling age and systolic blood pressure (BP). log (HRs) with 95% CIs are shown. e , Combining all features yields a C-index of 0.81. c – e , Error bars indicate 95% CIs ( n  = 29,216).

Predictive performance for each modality was assessed by fitting Cox proportional hazards (Fig. 5c ) models on each of the feature sets using ehrapy (Fig. 5d ). The age of the first occurrence served as the time to event; alternatively, date of death or date of the last record in the EHR served as censoring times. Models were evaluated using the concordance index (C-index) ( Methods ). The combination of multiple modalities successfully improved the predictive performance for coronary heart disease by increasing the C-index from 0.63 (genetic) to 0.76 (genetics, age and sex) and to 0.77 (clinical predictors) with 0.81 (imaging and clinical predictors) for combinations of feature sets (Fig. 5e ). Our finding is in line with previous observations of complementary effects between different modalities, where a broader ‘major adverse cardiac event’ phenotype was modeled in the UKB achieving a C-index of 0.72 (ref. 78 ). Adding genetic data improves predictive potential, as it is independent of sampling age and has limited prediction of other modalities 79 . The addition of metabolomic data did not improve predictive power (Fig. 5e ).
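The C-index used here measures how often a model ranks two patients correctly: among all comparable pairs, it is the share in which the patient with the earlier event also received the higher predicted risk. The following sketch implements Harrell's version in its simplest form. It is illustrative only; pairs with tied event times are skipped for brevity, and the inputs are hypothetical.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C: share of comparable pairs ranked correctly by risk.

    A pair (i, j) is comparable when the patient with the shorter time
    had an observed event. Ties in risk score count as 0.5.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if times[i] < times[j] and events[i]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable

# Hypothetical cohort: higher risk scores go with earlier events.
times = [1, 2, 3, 4]
events = [1, 1, 1, 1]
print(concordance_index(times, events, [4, 3, 2, 1]))  # perfect ranking
print(concordance_index(times, events, [1, 1, 1, 1]))  # uninformative scores
```

A value of 0.5 corresponds to uninformative scores and 1.0 to a perfect ranking, which gives a scale for reading the reported values between 0.63 and 0.81.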

Imaging-based disease severity projection via fate mapping

To demonstrate ehrapy’s ability to handle diverse image data and recover disease stages, we embedded pulmonary imaging data obtained from patients with COVID-19 into a lower-dimensional space and computationally inferred disease progression trajectories using pseudotemporal ordering. This describes a continuous trajectory or ordering of individual points based on feature similarity 80 . Continuous trajectories enable mapping the fate of new patients onto precise states to potentially predict their future condition.

In COVID-19, a highly contagious respiratory illness caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), symptoms range from mild flu-like symptoms to severe respiratory distress. Chest x-rays typically show opacities (bilateral patchy, ground glass) associated with disease severity 81 .

We used COVID-19 chest x-ray images from the BrixIA 82 dataset consisting of 192 images (Fig. 6a ) with expert annotations of disease severity. We used the BrixIA database scores, which are based on six regions annotated by radiologists, to classify disease severity ( Methods ). We embedded raw image features using a pre-trained DenseNet model ( Methods ) and further processed this embedding into a nearest-neighbors-based UMAP space using ehrapy (Fig. 6b and Methods ). Fate mapping based on imaging information ( Methods ) determined a severity ordering from mild to critical cases (Fig. 6b–d ). Images labeled as ‘normal’ are projected to stay within the healthy group, illustrating the robustness of our approach. Images of diseased patients were ordered by disease severity, highlighting clear trajectories from ‘normal’ to ‘critical’ states despite the heterogeneity of the x-ray images stemming from, for example, different zoom levels (Fig. 6a ).

Figure 6. a, Randomly selected chest x-ray images from the BrixIA dataset demonstrate its variance. b, UMAP visualization of the BrixIA dataset embedding shows a separation of disease severity classes. c, Calculated pseudotime for all images increases with distance to the ‘normal’ images. d, Stream projection of fate mapping in UMAP space showcases disease severity trajectory of the COVID-19 chest x-ray images.

Detecting and mitigating biases in EHR data with ehrapy

To showcase how exploratory analysis using ehrapy can reveal and mitigate biases, we analyzed the Fairlearn 83 version of the Diabetes 130-US Hospitals 84 dataset. The dataset covers 10 years (1999–2008) of clinical records from 130 US hospitals, detailing 47 features of diabetes diagnoses, laboratory tests, medications and additional data from up to 14 d of inpatient care of 101,766 diagnosed patient visits (Methods). It was originally collected to explore the link between the measurement of hemoglobin A1c (HbA1c) and early readmission.

The cohort primarily consists of White and African American individuals, with only a minority of cases from Asian or Hispanic backgrounds (Extended Data Fig. 10a). ehrapy’s cohort tracker unveiled selection and surveillance biases when filtering for Medicare recipients for further analysis, resulting in a shift of the age distribution toward ages over 60 years in addition to an increasing ratio of White participants. Using ehrapy’s visualization modules, our analysis showed that HbA1c was measured in only 18.4% of inpatients, with a higher frequency in emergency admissions compared to referral cases (Extended Data Fig. 10b). Normalization biases can skew data relationships when standardization techniques ignore subgroup variability or assume incorrect distributions. The choice of normalization strategy must be carefully considered to avoid obscuring important factors. When normalizing the number of applied medications individually, differences in distributions between age groups remained. However, when normalizing both distributions jointly with age group as an additional group variable, differences between age groups were masked (Extended Data Fig. 10c).

To investigate missing data and imputation biases, we introduced missingness for the number of applied medications according to an MCAR mechanism, which we verified using ehrapy’s Little’s test (P ≤ 0.01 × 10⁻²), and an MAR mechanism (Methods). Whereas imputing the mean in the MCAR case did not affect the overall location of the distribution, it led to an underestimation of the variance, with the standard deviation dropping from 8.1 in the original data to 6.8 in the imputed data (Extended Data Fig. 10d). Mean imputation in the MAR case distorted both location and variance: the mean shifted from 16.02 to 14.66, with a standard deviation of only 5.72 (Extended Data Fig. 10d). Applying ehrapy’s multiple-imputation-based MissForest 85 imputation to the MAR data resulted in a mean of 16.04 and a standard deviation of 6.45.

To predict patient readmission in fewer than 30 d, we merged the three smallest race groups, ‘Asian’, ‘Hispanic’ and ‘Other’. Furthermore, we dropped the gender group ‘Unknown/Invalid’ because its small sample size made meaningful assessment impossible, and we performed balanced random undersampling, resulting in 5,677 cases from each condition. We observed an overall balanced accuracy of 0.59 using a logistic regression model. However, the false-negative rate was highest for the races ‘Other’ and ‘Unknown’, whereas their selection rate was lowest, and this model was, therefore, biased (Extended Data Fig. 10e). Using ehrapy’s compatibility with existing machine learning packages, we used Fairlearn’s ThresholdOptimizer (Methods), which improved the selection rates for ‘Other’ from 0.32 to 0.38 and for ‘Unknown’ from 0.23 to 0.42 and the false-negative rates for ‘Other’ from 0.48 to 0.42 and for ‘Unknown’ from 0.61 to 0.45 (Extended Data Fig. 10e).
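The two per-group quantities compared above can be made concrete with a short sketch. It assumes the usual definitions (selection rate as the fraction of a group predicted positive, false-negative rate as the fraction of true positives in a group that are predicted negative); the function name, group labels and toy predictions are hypothetical and do not reproduce the analysis.

```python
from collections import defaultdict


def group_rates(y_true, y_pred, groups):
    """Return {group: (selection_rate, false_negative_rate)} for binary labels."""
    counts = defaultdict(lambda: {"n": 0, "pred_pos": 0, "pos": 0, "fn": 0})
    for truth, pred, group in zip(y_true, y_pred, groups):
        c = counts[group]
        c["n"] += 1
        c["pred_pos"] += pred                         # predicted positive
        c["pos"] += truth                             # truly positive
        c["fn"] += int(truth == 1 and pred == 0)      # missed positive
    rates = {}
    for group, c in counts.items():
        selection_rate = c["pred_pos"] / c["n"]
        # The false-negative rate is undefined for a group without positives.
        fnr = c["fn"] / c["pos"] if c["pos"] else float("nan")
        rates[group] = (selection_rate, fnr)
    return rates


# Toy labels and predictions for two hypothetical groups "A" and "B".
y_true = [1, 1, 0, 0, 1, 1, 1, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 0]
groups = ["A"] * 4 + ["B"] * 4

rates = group_rates(y_true, y_pred, groups)
```

In this toy example, group "B" has both the lower selection rate and the higher false-negative rate, which is the pattern described for the ‘Other’ and ‘Unknown’ groups before threshold optimization.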

Clustering offers a hypothesis-free alternative to supervised classification when clear hypotheses or labels are missing. It has enabled the identification of heart failure subtypes 86 and progression pathways 87 and COVID-19 severity states 88 . This concept, which is central to ehrapy, further allowed us to identify fine-grained groups of ‘unspecified pneumonia’ cases in the PIC dataset while discovering biomarkers and quantifying effects of medications on LOS. Such retroactive characterization showcases ehrapy’s ability to put complex evidence into context. This approach supports feedback loops to improve diagnostic and therapeutic strategies, leading to more efficiently allocated resources in healthcare.

ehrapy’s flexible data structures enabled us to integrate the heterogeneous UKB data to assess predictive performance for myocardial infarction. The different data types and distributions posed a challenge for predictive models that was overcome with ehrapy’s pre-processing modules. Our analysis underscores the potential of combining phenotypic and health data at population scale through ehrapy to enhance risk prediction.

By adapting pseudotime approaches that are commonly used in other omics domains, we successfully recovered disease trajectories from raw imaging data with ehrapy. The determined pseudotime, however, only orders data but does not necessarily provide a future projection per patient. Understanding the driver features for fate mapping in image-based datasets is challenging. The incorporation of image segmentation approaches could mitigate this issue and provide a deeper insight into the spatial and temporal dynamics of disease-related processes.

Limitations of our analyses include the lack of control for informative missingness where the absence of information represents information in itself 89 . Translation from Chinese to English in the PIC database can cause information loss and inaccuracies because the Chinese ICD-10 codes are seven characters long compared to the five-character English codes. Incompleteness of databases, such as the lack of radiology images in the PIC database, low sample sizes, underrepresentation of non-White ancestries and participant self-selection, cannot be accounted for and limit generalizability. This restricts deeper phenotyping of, for example, all ‘unspecified pneumonia’ cases with respect to their survival, which could be overcome by the use of multiple databases. Our causal inference use case is limited by unrecorded variables, such as Sequential Organ Failure Assessment (SOFA) scores, and pneumonia-related pathogens that are missing in the causal graph due to dataset constraints, such as high sparsity and substantial missing data, which risk overfitting and can lead to overinterpretation. We counterbalanced this by employing several refutation methods that statistically reject the causal hypothesis, such as a placebo treatment, a random common cause or an unobserved common cause. The longer hospital stays associated with penicillins and cephalosporins may be dataset specific and stem from higher antibiotic resistance, their use as first-line treatments, more severe initial cases, comorbidities and hospital-specific protocols.

Most analysis steps can introduce algorithmic biases where results are misleading or unfavorably affect specific groups. This is particularly relevant in the context of missing data 22, where determining the type of missing data is necessary to handle it correctly. ehrapy includes an implementation of Little’s test 90, which tests whether data are MCAR, to help discern missing data types. For MCAR data, single-imputation approaches, such as mean, median or mode imputation, can suffice, but these methods are known to reduce variability 91,92. Multiple imputation strategies, such as Multiple Imputation by Chained Equations (MICE) 93 and MissForest 85, as implemented in ehrapy, are effective for both MCAR and MAR data 22,94,95. MNAR data require pattern-mixture or shared-parameter models that explicitly incorporate the mechanism by which data are missing 96. Because MNAR involves unobserved data, the assumptions about the missingness mechanism cannot be directly verified, making sensitivity analysis crucial 21. ehrapy’s wide range of normalization functions and grouping functionality makes it possible to account for intrinsic variability within subgroups, and its compatibility with Fairlearn 83 can potentially mitigate predictor biases. Generally, we recommend assessing all pre-processing in an iterative manner with respect to downstream applications, such as patient stratification. Moreover, sensitivity analysis can help verify the robustness of all inferred knowledge 97.
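Why mean imputation reduces variability even under MCAR can be seen in a small simulation. The sketch below uses synthetic Gaussian data and an assumed missingness rate of 30%; neither is taken from the dataset analyzed here, and the variable names are illustrative.

```python
import random
import statistics

random.seed(0)

# Synthetic feature; mean and spread are arbitrary choices for illustration.
values = [random.gauss(16.0, 8.0) for _ in range(5000)]

# MCAR: every value is dropped with the same probability, independent of the data.
observed = [v if random.random() > 0.3 else None for v in values]

# Mean imputation: fill every gap with the mean of the observed values.
observed_only = [v for v in observed if v is not None]
fill = statistics.fmean(observed_only)
imputed = [fill if v is None else v for v in observed]

sd_original = statistics.stdev(values)
sd_imputed = statistics.stdev(imputed)
```

The imputed column keeps the observed mean, but because roughly 30% of its entries now sit exactly at that mean, its standard deviation shrinks by a factor of about sqrt(0.7) ≈ 0.84 relative to the complete data.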

These diverse use cases illustrate ehrapy’s potential to sufficiently address the need for a computationally efficient, extendable, reproducible and easy-to-use framework. ehrapy is compatible with major standards, such as the Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM) 47, HL7, FHIR or openEHR, with flexible support for common tabular data formats. Once data are loaded into an AnnData object, sharing analysis results is easy because AnnData objects can be stored and read platform independently. ehrapy’s rich documentation of the application programming interface (API) and extensive hands-on tutorials make EHR analysis accessible to both novices and experienced analysts.

As ehrapy remains under active development, users can expect ehrapy to continuously evolve. We are improving support for the joint analysis of EHR, genetics and molecular data where ehrapy serves as a bridge between the EHR and the omics communities. We further anticipate the generation of EHR-specific reference datasets, so-called atlases 98 , to enable query-to-reference mapping where new datasets get contextualized by transferring annotations from the reference to the new dataset. To promote the sharing and collective analysis of EHR data, we envision adapted versions of interactive single-cell data explorers, such as CELLxGENE 99 or the UCSC Cell Browser 100 , for EHR data. Such web interfaces would also include disparity dashboards 20 to unveil trends of preferential outcomes for distinct patient groups. Additional modules specifically for high-frequency time-series data, natural language processing and other data types are currently under development. With the widespread availability of code-generating large language models, frameworks such as ehrapy are becoming accessible to medical professionals without coding expertise who can leverage its analytical power directly. Therefore, ehrapy, together with a lively ecosystem of packages, has the potential to enhance the scientific discovery pipeline to shape the era of EHR analysis.

All datasets that were used during the development of ehrapy and the use cases were used according to their terms of use as indicated by each provider.

Design and implementation of ehrapy

A unified pipeline as provided by our ehrapy framework streamlines the analysis of EHR data by providing an efficient, standardized approach, which reduces the complexity and variability in data pre-processing and analysis. This consistency ensures reproducibility of results and facilitates collaboration and sharing within the research community. Additionally, the modular structure allows for easy extension and customization, enabling researchers to adapt the pipeline to their specific needs while building on a solid foundational framework.

ehrapy was designed from the ground up as an open-source effort with community support. The package, as well as all associated tutorials and dataset preparation scripts, is open source. Development takes place publicly on GitHub where the developers discuss feature requests and issues directly with users. This tight interaction between both groups ensures that we implement the most pressing needs to cater to the most important use cases and can guide users when difficulties arise. The open-source nature, extensive documentation and modular structure of ehrapy are designed for other developers to build upon and extend ehrapy’s functionality where necessary. This allows us to focus ehrapy on the most important features and to keep the number of dependencies to a minimum.

ehrapy was implemented in the Python programming language and builds upon numerous existing numerical and scientific open-source libraries, specifically matplotlib 101, seaborn 102, NumPy 103, numba 104, SciPy 105, scikit-learn 53 and Pandas 106. Although ehrapy benefits considerably from these packages, it also shares their limitations, such as a lack of GPU support or small performance losses due to the translation layer cost for operations between the Python interpreter and the lower-level C language for matrix operations. However, by building on very widely used open-source software, we ensure seamless integration and compatibility with a broad range of tools and platforms to promote community contributions. Additionally, by doing so, we enhance security by allowing a larger pool of developers to identify and address vulnerabilities 107. All functions are grouped into task-specific modules whose implementation is complemented with additional dependencies.

Data preparation

Dataloaders.

ehrapy is compatible with any type of vectorized data, where vectorized refers to the data being stored in structured tables in either on-disk or database form. The input and output module of ehrapy provides readers for common formats, such as OMOP, CSV tables or SQL databases through Pandas. When reading in such datasets, the data are stored in the appropriate slots in a new AnnData 46 object. ehrapy’s data module provides access to more than 20 public EHR datasets that feature diseases, including, but not limited to, Parkinson’s disease, breast cancer, chronic kidney disease and more. All dataloaders return AnnData objects to allow for immediate analysis.

AnnData for EHR data

Our framework required a versatile data structure capable of handling various matrix formats, including NumPy 103 for general use cases and interoperability, SciPy 105 sparse matrices for efficient storage, Dask 108 matrices for larger-than-memory analysis and Awkward array 109 for irregular time-series data. We needed a single data structure that not only stores data but also includes comprehensive annotations for thorough contextual analysis. It was essential for this structure to be widely used and supported, which ensures robustness and continual updates. Interoperability with other analytical packages was a key criterion to facilitate seamless integration within existing tools and workflows. Finally, the data structure had to support both in-memory operations and on-disk storage using formats such as HDF5 (ref. 110) and Zarr 111, ensuring efficient handling and accessibility of large datasets and the ability to easily share them with collaborators.

All of these requirements are fulfilled by the AnnData format, which is a popular data structure in single-cell genomics. At its core, an AnnData object encapsulates diverse components, providing a holistic representation of data and metadata that are always aligned in dimensions and easily accessible. A data matrix (commonly referred to as ‘X’) stands as the foundational element, embodying the measured data. This matrix can be dense (as a NumPy array), sparse (as a SciPy sparse matrix) or ragged (as an Awkward array) where dimensions do not align within the data matrix. The AnnData object can feature several such data matrices stored in ‘layers’. Examples of such layers can be unnormalized or unencoded data. These data matrices are complemented by an observations (commonly referred to as ‘obs’) segment where annotations on the level of patients or visits are stored. Patients’ age or sex, for instance, are often used as such annotations. The variables (commonly referred to as ‘var’) section complements the observations, offering supplementary details about the features in the dataset, such as missing data rates. The observation-specific matrices (commonly referred to as ‘obsm’) section extends the capabilities of the AnnData structure by allowing the incorporation of observation-specific matrices. These matrices can represent various types of information at the level of individual observations (patients or visits), such as principal component analysis (PCA) results, t-distributed stochastic neighbor embedding (t-SNE) coordinates or other dimensionality reduction outputs. Analogously, AnnData features a variable-specific matrices (commonly referred to as ‘varm’) component. The observation-specific pairwise relationships (commonly referred to as ‘obsp’) segment complements the ‘obsm’ section by accommodating observation-specific pairwise relationships. This can include connectivity matrices, indicating relationships between patients. 
The inclusion of an unstructured annotations (commonly referred to as ‘uns’) component further enhances flexibility. This segment accommodates unstructured annotations or arbitrary data that might not conform to the structured observations or variables categories. Any AnnData object can be stored on disk in h5ad or Zarr format to facilitate data exchange.
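The alignment guarantee described above can be illustrated with a toy container. The class below is a hypothetical stand-in written for this illustration, not the anndata API: it stores a few of the slots as plain Python lists and dictionaries and rejects annotations whose length does not match the data matrix. All field values are invented.

```python
from dataclasses import dataclass, field


@dataclass
class ToyAnnData:
    """Hypothetical stand-in mimicking AnnData's aligned slots; NOT the anndata API."""

    X: list                                     # n_obs x n_var matrix as a list of rows
    obs: dict                                   # per-observation annotations (length n_obs)
    var: dict                                   # per-variable annotations (length n_var)
    layers: dict = field(default_factory=dict)  # alternative matrices, same shape as X
    obsm: dict = field(default_factory=dict)    # per-observation matrices (n_obs rows)
    uns: dict = field(default_factory=dict)     # unstructured annotations, never checked

    def __post_init__(self):
        self.validate()

    @property
    def shape(self):
        return (len(self.X), len(self.X[0]) if self.X else 0)

    def validate(self):
        """Raise ValueError if any slot is not aligned with X."""
        n_obs, n_var = self.shape
        matrices = {"X": self.X, **self.layers}
        for name, matrix in matrices.items():
            if len(matrix) != n_obs or any(len(row) != n_var for row in matrix):
                raise ValueError(f"{name!r} does not have shape {(n_obs, n_var)}")
        for name, column in self.obs.items():
            if len(column) != n_obs:
                raise ValueError(f"obs[{name!r}] must have length {n_obs}")
        for name, column in self.var.items():
            if len(column) != n_var:
                raise ValueError(f"var[{name!r}] must have length {n_var}")
        for name, matrix in self.obsm.items():
            if len(matrix) != n_obs:
                raise ValueError(f"obsm[{name!r}] must have {n_obs} rows")


# Three visits (observations) with two measured features (variables).
adata = ToyAnnData(
    X=[[37.2, 110.0], [38.9, 95.0], [36.8, 120.0]],
    obs={"age": [4, 7, 2], "sex": ["f", "m", "f"]},
    var={"unit": ["degC", "mmHg"]},
    uns={"source": "toy example"},
)
adata.layers["unnormalized"] = [row[:] for row in adata.X]
adata.obsm["X_pca"] = [[0.1], [0.9], [-1.0]]
adata.validate()

# An annotation of the wrong length is rejected.
try:
    ToyAnnData(X=[[1.0, 2.0]], obs={"age": [1, 2]}, var={})
    misaligned_rejected = False
except ValueError:
    misaligned_rejected = True
```

The real AnnData object enforces the same kind of alignment while additionally supporting sparse, out-of-core and ragged matrices as well as on-disk storage.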

ehrapy natively interfaces with the scientific Python ecosystem via Pandas 112 and NumPy 103. The development of deep learning models for EHR data 113 is further accelerated through compatibility with pathml 114, a unified framework for whole-slide image analysis in pathology, and scvi-tools 115, which provides data loaders for loading tensors from AnnData objects into PyTorch 116 or Jax arrays 117 to facilitate the development of generalizing foundational models for medical artificial intelligence 118.

Feature annotation

After AnnData creation, any metadata can be mapped against ontologies using Bionty ( https://github.com/laminlabs/bionty-base ). Bionty provides access to the Human Phenotype, Phecodes, Phenotype and Trait, Drug, Mondo and Human Disease ontologies.

Key medical terms stored in an AnnData object in free text can be extracted using the Medical Concept Annotation Toolkit (MedCAT) 119 .

Data processing

Cohort tracking.

ehrapy provides a CohortTracker tool that traces all filtering steps applied to an associated AnnData object. Cohort summary statistics are calculated using tableone 120 and can subsequently be plotted as bar charts together with flow diagrams 121 that visualize the order and reasoning of filtering operations.

Basic pre-processing and quality control

ehrapy encompasses a suite of functionalities for fundamental data processing that are adopted from scanpy 52 but adapted to EHR data:

Regress out: Removes unwanted sources of variation from the data using a regression procedure.

Subsample: Selects a specified fraction of observations.

Balanced sample: Balances groups in the dataset by random oversampling or undersampling.

Highly variable features: Identifies and annotates highly variable features, following the ‘highly variable genes’ function of scanpy, to highlight the features that drive variation in the dataset.
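Balanced sampling by random undersampling, as listed above, can be sketched in a few lines. The function below is an illustrative standard-library version, not ehrapy's implementation; the function name, seed handling and labels are assumptions.

```python
import random
from collections import Counter, defaultdict


def balanced_undersample(labels, seed=0):
    """Return sorted row indices so that every label keeps as many rows as the rarest one."""
    rng = random.Random(seed)  # local generator keeps the sampling reproducible
    indices_by_label = defaultdict(list)
    for index, label in enumerate(labels):
        indices_by_label[label].append(index)
    n_min = min(len(indices) for indices in indices_by_label.values())
    kept = []
    for indices in indices_by_label.values():
        kept.extend(rng.sample(indices, n_min))  # sample without replacement
    return sorted(kept)


# Toy outcome labels: 3 positive cases and 10 negative cases.
labels = ["readmitted"] * 3 + ["not readmitted"] * 10
kept = balanced_undersample(labels)
counts = Counter(labels[i] for i in kept)
```

Undersampling discards data from the majority group, so it trades sample size for balance; random oversampling would instead duplicate rows of the minority group.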

To identify and minimize quality issues, ehrapy provides several quality control functions:

Basic quality control: Determines the relative and absolute number of missing values per feature and per patient.

Winsorization: Replaces values beyond chosen percentiles with the values at those percentiles, creating a version of the input array less susceptible to extreme values.

Feature clipping: Restricts feature values to specified lower and upper limits.

Detect biases: Computes pairwise correlations between features, standardized mean differences for numeric features between groups of sensitive features, categorical feature value count differences between groups of sensitive features and feature importances when predicting a target variable.

Little’s MCAR test: Applies Little’s MCAR test whose null hypothesis is that data are MCAR. Rejecting the null hypothesis may not always mean that data are not MCAR, nor is accepting the null hypothesis a guarantee that data are MCAR. For more details, see Schouten et al. 122 .

Summarize features: Calculates statistical indicators per feature, including minimum, maximum and average values. This can be especially useful to reduce complex data with multiple measurements per feature per patient into sets of columns with single values.
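The difference between winsorization and feature clipping in the list above can be shown with a minimal sketch. Both functions are simplified standard-library illustrations with assumed names and parameters, not the ehrapy functions: winsorization derives its bounds from the data, whereas clipping uses fixed limits.

```python
def winsorize(values, lower=0.05, upper=0.05):
    """Replace the lowest `lower` and highest `upper` fraction of values
    with the nearest value that is retained."""
    n = len(values)
    ordered = sorted(values)
    k_low = int(lower * n)    # number of values to replace at the low end
    k_high = int(upper * n)   # number of values to replace at the high end
    low_bound = ordered[k_low]
    high_bound = ordered[n - 1 - k_high]
    return [min(max(v, low_bound), high_bound) for v in values]


def clip(values, minimum, maximum):
    """Restrict values to the fixed interval [minimum, maximum]."""
    return [min(max(v, minimum), maximum) for v in values]


# Toy measurements with one implausible outlier.
measurements = [1, 2, 3, 4, 5, 6, 7, 8, 9, 1000]

winsorized = winsorize(measurements, lower=0.1, upper=0.1)
clipped = clip(measurements, 0, 100)
```

Here winsorizing 10% on each side maps the outlier 1000 to 9 and the lowest value 1 to 2, whereas clipping to [0, 100] only caps the outlier at 100.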

Imputation is crucial in data analysis to address missing values, ensuring the completeness of datasets that can be required for specific algorithms. The ‘ehrapy’ pre-processing module offers a range of imputation techniques:

Explicit Impute: Replaces missing values, in either all columns or a user-specified subset, with a designated replacement value.

Simple Impute: Imputes missing values in numerical data using mean, median or the most frequent value, contributing to a more complete dataset.

KNN Impute: Uses k -nearest neighbor imputation to fill in missing values in the input AnnData object, preserving local data patterns.

MissForest Impute: Implements the MissForest strategy for imputing missing data, providing a robust approach for handling complex datasets.

MICE Impute: Applies the MICE algorithm for imputing data. This implementation is based on the miceforest ( https://github.com/AnotherSamWilson/miceforest ) package.

Data encoding is required to obtain purely numerical values when the dataset contains categorical features, because most algorithms in ehrapy are compatible only with numerical values. ehrapy offers two encoding algorithms based on scikit-learn 53:

One-Hot Encoding: Transforms categorical variables into binary vectors, creating a binary feature for each category and capturing the presence or absence of each category in a concise representation.

Label Encoding: Assigns a unique numerical label to each category, facilitating the representation of categorical data as ordinal values and supporting algorithms that require numerical input.
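The two encodings can be compared on a small categorical column. The sketch below is a standard-library illustration with assumed function names; it sorts the categories before assigning codes, and the example values are invented.

```python
def label_encode(values):
    """Map each category to an integer code; categories are sorted first."""
    categories = sorted(set(values))
    code = {category: i for i, category in enumerate(categories)}
    return [code[v] for v in values], categories


def one_hot_encode(values):
    """Create one binary column per category (columns follow sorted categories)."""
    categories = sorted(set(values))
    rows = [[int(v == category) for category in categories] for v in values]
    return rows, categories


# Toy admission types for four visits.
admission = ["emergency", "referral", "emergency", "elective"]

labels, label_categories = label_encode(admission)
one_hot, one_hot_categories = one_hot_encode(admission)
```

Label encoding yields a single column but implies an order among categories that may not exist, whereas one-hot encoding avoids that implied order at the cost of one column per category.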

To ensure that the distributions of the heterogeneous data are aligned, ehrapy offers several normalization procedures:

Log Normalization: Applies the natural logarithm function to the data, useful for handling skewed distributions and reducing the impact of outliers.

Max-Abs Normalization: Scales each feature by its maximum absolute value, ensuring that the maximum absolute value for each feature is 1.

Min-Max Normalization: Transforms the data to a specific range (commonly [0, 1]) by scaling each feature based on its minimum and maximum values.

Power Transformation Normalization: Applies a power transformation to make the data more Gaussian like, often useful for stabilizing variance and improving the performance of models sensitive to distributional assumptions.

Quantile Normalization: Aligns the distributions of multiple variables, ensuring that their quantiles match, which can be beneficial for comparing datasets or removing batch effects.

Robust Scaling Normalization: Scales data using the interquartile range, making it robust to outliers and suitable for datasets with extreme values.

Scaling Normalization: Standardizes data by subtracting the mean and dividing by the standard deviation, creating a distribution with a mean of 0 and a standard deviation of 1.

Offset to Positive Values: Shifts all values by a constant offset to make all values non-negative, with the lowest negative value becoming 0.
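Four of the simpler procedures in the list above reduce to one-line formulas per feature. The sketch below writes them out for a single feature using only the standard library; the function names are illustrative, and the standard scaling here assumes the population standard deviation.

```python
import statistics


def min_max(values):
    """Scale to [0, 1]; assumes the feature is not constant."""
    low, high = min(values), max(values)
    return [(v - low) / (high - low) for v in values]


def max_abs(values):
    """Scale by the maximum absolute value; assumes at least one non-zero value."""
    scale = max(abs(v) for v in values)
    return [v / scale for v in values]


def standardize(values):
    """Subtract the mean and divide by the (population) standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]


def offset_to_positive(values):
    """Shift all values so that the lowest negative value becomes 0."""
    offset = -min(values) if min(values) < 0 else 0.0
    return [v + offset for v in values]


scaled_min_max = min_max([2.0, 4.0, 6.0])
scaled_max_abs = max_abs([-4.0, 2.0, 1.0])
scaled_standard = standardize([1.0, 2.0, 3.0])
shifted = offset_to_positive([-3.0, 0.0, 5.0])
```

In a real dataset, each function would be applied per feature, and optionally per subgroup when subgroup variability should be preserved.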

Dataset shifts can be corrected using the scanpy implementation of the ComBat 123 algorithm, which employs a parametric and non-parametric empirical Bayes framework for adjusting data for batch effects that is robust to outliers.

Finally, a neighbors graph can be efficiently computed using scanpy’s implementation.

To obtain meaningful lower-dimensional embeddings that can subsequently be visualized and reused for downstream algorithms, ehrapy provides the following algorithms based on scanpy’s implementation:

t-SNE: Uses a probabilistic approach to embed high-dimensional data into a lower-dimensional space, emphasizing the preservation of local similarities and revealing clusters in the data.

UMAP: Embeds data points by modeling their local neighborhood relationships, offering an efficient and scalable technique that captures both global and local structures in high-dimensional data.

Force-Directed Graph Drawing: Uses a physical simulation to position nodes in a graph, with edges representing pairwise relationships, creating a visually meaningful representation that emphasizes connectedness and clustering in the data.

Diffusion Maps: Applies spectral methods to capture the intrinsic geometry of high-dimensional data by modeling diffusion processes, providing a way to uncover underlying structures and patterns.

Density Calculation in Embedding: Quantifies the density of observations within an embedding, considering conditions or groups, offering insights into the concentration of data points in different regions and aiding in the identification of densely populated areas.

ehrapy further provides algorithms for clustering and trajectory inference based on scanpy:

Leiden Clustering: Uses the Leiden algorithm to cluster observations into groups, revealing distinct communities within the dataset with an emphasis on intra-cluster cohesion.

Hierarchical Clustering Dendrogram: Constructs a dendrogram through hierarchical clustering based on specified group-by categories, illustrating the hierarchical relationships among observations and facilitating the exploration of structured patterns.

Feature ranking

ehrapy provides two ways of ranking feature contributions to clusters and target variables:

Statistical tests: To compare any obtained clusters to obtain marker features that are significantly different between the groups, ehrapy extends scanpy’s ‘rank genes groups’. The original implementation, which features a t -test for numerical data, is complemented by a g -test for categorical data.

Feature importance: Calculates feature rankings for a target variable using linear regression, support vector machine or random forest models from scikit-learn. ehrapy evaluates the relative importance of each predictor by fitting the model and extracting model-specific metrics, such as coefficients or feature importances.
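The g-test mentioned above compares observed category counts per group against the counts expected under independence, with the statistic G = 2 Σ O ln(O/E). The following sketch computes G for a contingency table and, for the special case of one degree of freedom, a P value from the chi-squared distribution. The function names and the toy table are assumptions for illustration, not ehrapy's implementation.

```python
import math


def g_test(table):
    """G statistic and degrees of freedom for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    g = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            if observed > 0:  # cells with a zero count contribute nothing
                g += 2.0 * observed * math.log(observed / expected)
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return g, dof


def chi2_sf_1dof(x):
    """Upper tail probability of the chi-squared distribution with 1 degree of freedom."""
    return math.erfc(math.sqrt(max(x, 0.0) / 2.0))


# Toy 2 x 2 table: rows are two clusters, columns are two values of a categorical feature.
table = [[30, 10], [10, 30]]
g_stat, dof = g_test(table)
p_value = chi2_sf_1dof(g_stat)

# A table with identical proportions in both clusters gives G = 0.
g_null, _ = g_test([[10, 10], [10, 10]])
```

For tables with more than one degree of freedom, the P value would come from the chi-squared distribution with the corresponding degrees of freedom, which is omitted here to keep the sketch dependency-free.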

Dataset integration

Based on scanpy’s ‘ingest’ function, ehrapy facilitates the integration of labels and embeddings from a well-annotated reference dataset into a new dataset, enabling the mapping of cluster annotations and spatial relationships for consistent comparative analysis. This process ensures harmonized clinical interpretations across datasets, especially useful when dealing with multiple experimental diseases or batches.

Knowledge inference

Survival analysis.

ehrapy’s implementation of survival analysis algorithms is based on lifelines 124 :

Ordinary Least Squares (OLS) Model: Creates a linear regression model using OLS from a specified formula and an AnnData object, allowing for the analysis of relationships between variables and observations.

Generalized Linear Model (GLM): Constructs a GLM from a given formula, distribution and AnnData, providing a versatile framework for modeling relationships with nonlinear data structures.

Kaplan–Meier: Fits the Kaplan–Meier curve to generate survival curves, offering a visual representation of the probability of survival over time in a dataset.

Cox Hazard Model: Constructs a Cox proportional hazards model using a specified formula and an AnnData object, enabling the analysis of survival data by modeling the hazard rates and their relationship to predictor variables.

Log-Rank Test: Calculates the P value for the log-rank test, comparing the survival functions of two groups, providing statistical significance for differences in survival distributions.

GLM Comparison: Given two fitted GLMs, where the larger encompasses the parameter space of the smaller, this function returns the P value indicating whether the larger model adds significant explanatory power beyond the smaller model.
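The Kaplan–Meier estimator in the list above multiplies, at each event time, the fraction of at-risk patients who survive that time. The sketch below is a standard-library illustration of this product-limit rule with an assumed function name and invented toy data; it is not the lifelines-based implementation.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one observed event.

    times  -- observed time per subject (event time or censoring time)
    events -- 1 if the event was observed, 0 if the subject was censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Collect all subjects whose observed time equals t.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events:
            survival *= 1.0 - n_events / n_at_risk
            curve.append((t, survival))
        # Subjects with an event and censored subjects both leave the risk set.
        n_at_risk -= n_leaving
    return curve


# Toy follow-up data: five patients, two of them censored.
times = [2, 3, 3, 5, 8]
events = [1, 1, 0, 1, 0]
curve = kaplan_meier(times, events)
```

Censored patients never lower the curve directly; they only shrink the risk set, which is why the estimate stays at 0.3 after the last event even though one patient is followed until time 8.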

Trajectory inference

Trajectory inference is a computational approach that reconstructs and models the developmental paths and transitions within heterogeneous clinical data, providing insights into the temporal progression underlying complex systems. ehrapy offers several inbuilt algorithms for trajectory inference based on scanpy:

Diffusion Pseudotime: Infers the progression of observations by measuring geodesic distance along the graph, providing a pseudotime metric that represents the developmental trajectory within the dataset.

Partition-based Graph Abstraction (PAGA): Maps out the coarse-grained connectivity structures of complex manifolds using a partition-based approach, offering a comprehensive visualization of relationships in high-dimensional data and aiding in the identification of macroscopic connectivity patterns.

Because ehrapy is compatible with scverse, further trajectory inference-based algorithms, such as CellRank, can be seamlessly applied.

Causal inference

ehrapy’s causal inference module is based on ‘dowhy’ 69 and follows four key steps that are all implemented in ehrapy:

Graphical Model Specification: Define a causal graphical model representing relationships between variables and potential causal effects.

Causal Effect Identification: Automatically identify whether a causal effect can be inferred from the given data, addressing confounding and selection bias.

Causal Effect Estimation: Employ automated tools to estimate causal effects, using methods such as matching, instrumental variables or regression.

Sensitivity Analysis and Testing: Perform sensitivity analysis to assess the robustness of causal inferences and conduct statistical testing to determine the significance of the estimated causal effects.

Patient stratification

ehrapy’s complete pipeline from pre-processing to the generation of lower-dimensional embeddings, clustering, statistical comparison between determined groups and more facilitates the stratification of patients.

Visualization

ehrapy features an extensive visualization pipeline that is customizable and yet offers reasonable defaults. Almost every analysis function is matched with at least one visualization function that often shares the name but is available through the plotting module. For example, after importing ehrapy as ‘ep’, ‘ep.tl.umap(adata)’ runs the UMAP algorithm on an AnnData object, and ‘ep.pl.umap(adata)’ would then plot a scatter plot of the UMAP embedding.

ehrapy further offers a suite of more generally usable and modifiable plots:

Scatter Plot: Visualizes data points along observation or variable axes, offering insights into the distribution and relationships between individual data points.

Heatmap: Represents feature values in a grid, providing a comprehensive overview of the data’s structure and patterns.

Dot Plot: Displays count values of specified variables as dots, offering a clear depiction of the distribution of counts for each variable.

Filled Line Plot: Illustrates trends in data with filled lines, emphasizing variations in values over a specified axis.

Violin Plot: Presents the distribution of data through mirrored density plots, offering a concise view of the data’s spread.

Stacked Violin Plot: Combines multiple violin plots, stacked to allow for visual comparison of distributions across categories.

Group Mean Heatmap: Creates a heatmap displaying the mean count per group for each specified variable, providing insights into group-wise trends.

Hierarchically Clustered Heatmap: Uses hierarchical clustering to arrange data in a heatmap, revealing relationships and patterns among variables and observations.

Rankings Plot: Visualizes rankings within the data, offering a clear representation of the order and magnitude of values.

Dendrogram Plot: Plots a dendrogram of categories defined in a group by operation, illustrating hierarchical relationships within the dataset.

Benchmarking ehrapy

We generated a subset of the UKB data selecting 261 features and 488,170 patient visits. We removed all features with missingness rates greater than 70%. To demonstrate speed and memory consumption for various scenarios, we subsampled the data to 20%, 30% and 50%. We ran a minimal ehrapy analysis pipeline on each of those subsets and the full data, including the calculation of quality control metrics, filtering of variables by a missingness threshold, nearest neighbor imputation, normalization, dimensionality reduction and clustering (Supplementary Table 1 ). We conducted our benchmark on a single CPU with eight threads and 60 GB of maximum memory.
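
As a minimal sketch of the missingness-based feature filter described above, the step can be illustrated in plain Python. This is not ehrapy's implementation: the function names, the row-of-dictionaries layout and the toy values are assumptions made for the example.

```python
def missing_rate(rows, feature):
    """Fraction of rows in which `feature` is absent or None."""
    if not rows:
        return 0.0
    missing = sum(1 for row in rows if row.get(feature) is None)
    return missing / len(rows)


def filter_by_missingness(rows, features, threshold=0.7):
    """Keep features whose missingness rate does not exceed `threshold`."""
    return [f for f in features if missing_rate(rows, f) <= threshold]


# Toy data (hypothetical values): 'crp' is missing in 3 of 4 rows.
rows = [
    {"age": 61, "hba1c": None, "crp": None},
    {"age": 58, "hba1c": 6.1, "crp": None},
    {"age": 70, "hba1c": None, "crp": None},
    {"age": 49, "hba1c": 5.4, "crp": 3.2},
]
kept = filter_by_missingness(rows, ["age", "hba1c", "crp"], threshold=0.7)
print(kept)
```

With a threshold of 70%, only the feature that is missing in 75% of the rows is dropped.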

ehrapy further provides out-of-core implementations based on Dask 108 for many of its algorithms, such as the normalization functions and the PCA implementation. Out-of-core computation refers to techniques that process data that do not fit entirely in memory, using disk storage to manage data overflow. This approach is crucial for handling large datasets without being constrained by system memory limits. Because the principal components are reused by other computationally expensive algorithms, such as the neighbors graph calculation, an out-of-core PCA effectively enables the analysis of very large datasets. We are currently working on supporting out-of-core computation for all computationally expensive algorithms in ehrapy.
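
The idea behind out-of-core computation can be sketched without Dask: the statistics needed for scaling are accumulated one chunk at a time, so that only a single chunk has to be held in memory. The following is an illustrative sketch only; the helper names and the in-memory list standing in for data on disk are assumptions, and the sum-of-squares formula is chosen for brevity rather than numerical robustness.

```python
import math


def iter_chunks(values, chunk_size):
    """Yield successive chunks, standing in for blocks read from disk."""
    for start in range(0, len(values), chunk_size):
        yield values[start:start + chunk_size]


def streaming_mean_std(chunks):
    """Accumulate count, sum and sum of squares one chunk at a time."""
    n, total, total_sq = 0, 0.0, 0.0
    for chunk in chunks:
        n += len(chunk)
        total += sum(chunk)
        total_sq += sum(v * v for v in chunk)
    mean = total / n
    variance = total_sq / n - mean * mean  # population variance
    return mean, math.sqrt(max(variance, 0.0))


data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
mean, std = streaming_mean_std(iter_chunks(data, chunk_size=3))
print(mean, std)
```

The result is identical to computing the mean and standard deviation on the full list, but no step ever needs more than three values at once.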

We demonstrate the memory benefits in a hosted tutorial where the in-memory pipeline for 50,000 patients with 1,000 features required about 2 GB of memory, and the corresponding out-of-core implementation required less than 200 MB of memory.

The code for benchmarking is available at https://github.com/theislab/ehrapy-reproducibility . The implementation of ehrapy is accessible at https://github.com/theislab/ehrapy together with extensive API documentation and tutorials at https://ehrapy.readthedocs.io .

PIC database analysis

Study design.

We collected clinical data from the PIC 43 version 1.1.0 database. PIC is a single-center, bilingual (English and Chinese) database hosting information of children admitted to critical care units at the Children’s Hospital of Zhejiang University School of Medicine in China. The requirement for individual patient consent was waived because the study did not impact clinical care, and all protected health information was de-identified. The database contains 13,499 distinct hospital admissions of 12,881 distinct pediatric patients. These patients were admitted to five ICU units with 119 total critical care beds—GICU, PICU, SICU, CICU and NICU—between 2010 and 2018. The mean age of the patients was 2.5 years, of whom 42.5% were female. The in-hospital mortality was 7.1%; the mean hospital stay was 17.6 d; the mean ICU stay was 9.3 d; and 468 (3.6%) patients were admitted multiple times. Demographics, diagnoses, doctors’ notes, laboratory and microbiology tests, prescriptions, fluid balances, vital signs and radiographics reports were collected from all patients. For more details, see the original publication of Zeng et al. 43 .

Study participants

Individuals older than 18 years were excluded from the study. We grouped the data into three distinct groups: ‘neonates’ (0–28 d of age; 2,968 patients), ‘infants’ (1–12 months of age; 4,876 patients) and ‘youths’ (13 months to 18 years of age; 6,097 patients). We primarily analyzed the ‘youths’ group with the discharge diagnosis ‘unspecified pneumonia’ (277 patients).
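
The grouping by age can be written as a small function. This is a sketch under stated assumptions: age is given in days, a month is taken as 365.25/12 days, and ages between 29 d and 1 month are assigned to 'infants'. The source does not specify these boundary conventions.

```python
DAYS_PER_YEAR = 365.25
DAYS_PER_MONTH = DAYS_PER_YEAR / 12  # assumed conversion


def age_group(age_days):
    """Map an age in days to the study group, or None if older than 18 years."""
    if age_days < 0:
        raise ValueError("age must be non-negative")
    if age_days <= 28:
        return "neonates"
    if age_days < 13 * DAYS_PER_MONTH:
        return "infants"
    if age_days <= 18 * DAYS_PER_YEAR:
        return "youths"
    return None  # excluded from the study


for age in (10, 200, 2000, 7000):
    print(age, age_group(age))
```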

Data collection

The collected clinical data included demographics, laboratory and vital sign measurements, diagnoses, microbiology and medication information and mortality outcomes. The five-character English ICD-10 codes were used, whose values are based on the seven-character Chinese ICD-10 codes.

Dataset extraction and analysis

We downloaded version 1.1.0 of the PIC database from Physionet 1 to obtain 17 CSV tables. Using Pandas, we selected all information with a coverage rate of more than 50%, including demographics and laboratory and vital sign measurements (Fig. 2 ). To reduce the amount of noise, we calculated and added only the minimum, maximum and average of all measurements that had multiple values per patient. Examination reports were removed because they describe only diagnostics and not detailed findings. All further diagnoses and microbiology and medication information were included in the observations slot to ensure that the data were not used for the calculation of embeddings but were still available for the analysis. This ensured that any calculated embedding would not separate patients into treated and untreated groups but would, rather, be based solely on phenotypic features. We imputed all missing data through k-nearest neighbors imputation (k = 20) using the knn_impute function of ehrapy. Next, we log normalized the data with ehrapy using the log_norm function. Afterwards, we winsorized the data using ehrapy’s winsorize function to obtain 277 ICU visits (n = 265 patients) with 572 features. Of those 572 features, 254 were stored in the matrix X and the remaining 318 in the ‘obs’ slot in the AnnData object. For clustering and visualization purposes, we calculated 50 principal components using ehrapy’s pca function. The obtained principal component representation was then used to calculate a nearest neighbors graph using the neighbors function of ehrapy. The nearest neighbors graph then served as the basis for a UMAP embedding calculation using ehrapy’s umap function.
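
The reduction of repeated measurements to their minimum, maximum and average per patient can be sketched as follows. The record layout (patient identifier, measurement name, value), the feature-name suffixes and the toy values are assumptions made for illustration and do not reflect the PIC table schema.

```python
from collections import defaultdict
from statistics import mean


def summarize_measurements(records):
    """Collapse repeated (patient, measurement) values into min, max and mean."""
    grouped = defaultdict(list)
    for patient_id, name, value in records:
        grouped[(patient_id, name)].append(value)

    summary = defaultdict(dict)
    for (patient_id, name), values in grouped.items():
        summary[patient_id][name + "_min"] = min(values)
        summary[patient_id][name + "_max"] = max(values)
        summary[patient_id][name + "_avg"] = mean(values)
    return dict(summary)


# Hypothetical long-format records.
records = [
    ("p1", "heart_rate", 120),
    ("p1", "heart_rate", 140),
    ("p1", "heart_rate", 130),
    ("p2", "heart_rate", 110),
]
features = summarize_measurements(records)
print(features)
```

Each patient ends up with exactly three features per measurement type, regardless of how many raw values were recorded.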

We applied the community detection algorithm Leiden with resolution 0.6 on the nearest neighbor graph using ehrapy’s leiden function. The four obtained clusters served as input for two-sided t-tests for all numerical values and two-sided g-tests for all categorical values, each comparing one cluster against the union of the other three clusters. This was conducted using ehrapy’s rank_feature_groups function, which also corrects P values for multiple testing with the Benjamini–Hochberg method 125 . We presented the four groups and the statistically significantly different features between the groups to two pediatricians who annotated the groups with labels.
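
For reference, the Benjamini–Hochberg adjustment itself is short enough to write out. The sketch below is a generic implementation of the procedure, not the code path used inside rank_feature_groups, and the P values are made up for the example.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


p = [0.01, 0.04, 0.03, 0.20]
adj = benjamini_hochberg(p)
print(adj)
```

Each raw P value is multiplied by the number of tests divided by its rank, and a running minimum from the largest rank downward keeps the adjusted values monotone.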

Our determined groups can be confidently labeled owing to their distinct clinical profiles. Nevertheless, we could only take into account clinical features that were measured. Insightful features, such as lung function tests, are missing. Moreover, the feature representation of the time-series data is simplified, which can hide some nuances between the groups. Generally, deciding on a clustering resolution is difficult. However, more fine-grained clusters obtained via higher clustering resolutions may become too specific and not generalize well enough.

Kaplan–Meier survival analysis

We selected patients with up to 360 h of total stay for Kaplan–Meier survival analysis to ensure a sufficiently high number of participants. We proceeded with the AnnData object prepared as described in the ‘Patient stratification’ subsection to conduct Kaplan–Meier analysis among all four determined pneumonia groups using ehrapy’s kmf function. Significance was tested through ehrapy’s test_kmf_logrank function, which tests whether two Kaplan–Meier series differ statistically significantly, employing a chi-squared test statistic under the null hypothesis. Let h_i(t) be the hazard rate of group i at time t and c a constant that represents a proportional change in the hazard between the two groups, then:

$$H_0: h_1(t) = h_2(t), \qquad H_A: h_1(t) = c\,h_2(t), \quad c \neq 1$$

This implicitly uses the log-rank weights. An additional Kaplan–Meier analysis was conducted for all children jointly concerning the liver markers AST, ALT and GGT. To determine whether measurements were inside or outside the norm range, we used reference ranges (Supplementary Table 2 ). P values less than 0.05 were labeled significant.

Our Kaplan–Meier curve analysis depends on the groups being well defined and shares the same limitations as the patient stratification. Additionally, the analysis is sensitive to the reference table where we selected limits that generalize well for the age ranges, but, due to children of different ages being examined, they may not necessarily be perfectly accurate for all children.
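
The Kaplan–Meier estimator underlying this subsection can be sketched in a few lines. This is a generic textbook implementation for illustration, not the lifelines or ehrapy code, and the durations and event flags are invented.

```python
def kaplan_meier(durations, events):
    """Return [(time, survival probability)] at each distinct event time.

    events[i] is True if the event was observed and False if censored.
    """
    event_times = sorted({t for t, e in zip(durations, events) if e})
    survival = 1.0
    curve = []
    for t in event_times:
        at_risk = sum(1 for d in durations if d >= t)
        observed = sum(1 for d, e in zip(durations, events) if e and d == t)
        survival *= 1.0 - observed / at_risk
        curve.append((t, survival))
    return curve


durations = [5, 8, 8, 12, 15]                # hypothetical hours of stay
events = [True, True, False, True, False]    # False marks a censored stay
curve = kaplan_meier(durations, events)
print(curve)
```

Censored observations never lower the curve directly; they only shrink the number at risk for later event times.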

Causal effect of mechanism of action on LOS

Although the dataset was not initially intended for investigating causal effects of interventions, we adapted it for this purpose by focusing on the LOS in the ICU, measured in months, as the outcome variable. This choice aligns with the clinical aim of stabilizing patients sufficiently for ICU discharge. We constructed a causal graph to explore how different drug administrations could potentially reduce the LOS. Based on consultations with clinicians, we included several biomarkers of liver damage (AST, ALT and GGT) and inflammation (CRP and PCT) in our model. Patient age was also considered a relevant variable.

Because several different medications act by the same mechanisms, we grouped specific medications by their drug classes. This grouping was achieved by cross-referencing the drugs listed in the dataset with DrugBank release 5.1 (ref. 126 ), using Levenshtein distances for partial string matching. After manual verification, we extracted the corresponding DrugBank categories, counted the number of features per category and compiled a list of commonly prescribed medications, as advised by clinicians. This approach facilitated the modeling of the causal graph depicted in Fig. 4 , where an intervention is defined as the administration of at least one drug from a specified category.
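
Matching drug names by Levenshtein distance can be illustrated as follows. The sketch assumes a simple nearest-candidate rule with a hypothetical cut-off of two edits; the actual matching criteria and the manual verification step used in the study are not reproduced here.

```python
def levenshtein(a, b):
    """Edit distance between two strings (insert, delete, substitute)."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def closest_match(name, candidates, max_distance=2):
    """Return the nearest candidate, or None if it is too far away."""
    best = min(candidates, key=lambda c: levenshtein(name.lower(), c.lower()))
    if levenshtein(name.lower(), best.lower()) <= max_distance:
        return best
    return None


reference_names = ["Amoxicillin", "Ampicillin", "Ibuprofen"]  # illustrative list
print(closest_match("amoxicilin", reference_names))
print(closest_match("xyz", reference_names))
```

A misspelled entry is mapped to its nearest reference name, whereas a string with no close candidate stays unmatched and would need manual review.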

Causal inference was then conducted with ehrapy’s ‘dowhy’ 69 -based causal inference module using the expert-curated causal graph. Medication groups were designated as causal interventions, and the LOS was the outcome of interest. Linear regression served as the estimation method for analyzing these causal effects. We excluded four patients from the analysis owing to their notably long hospital stays exceeding 90 d, which were deemed outliers. To validate the robustness of our causal estimates, we incorporated several refutation methods:

Placebo Treatment Refuter: This method involved replacing the treatment assignment with a placebo to test the effect of the treatment variable being null.

Random Common Cause: A randomly generated variable was added to the data to assess the sensitivity of the causal estimate to the inclusion of potential unmeasured confounders.

Data Subset Refuter: The stability of the causal estimate was tested across various random subsets of the data to ensure that the observed effects were not dependent on a specific subset.

Add Unobserved Common Cause: This approach tested the effect of an omitted variable by adding a theoretically relevant unobserved confounder to the model, evaluating how much an unmeasured variable could influence the causal relationship.

Dummy Outcome: Replaces the true outcome variable with a random variable. If the causal effect nullifies, it supports the validity of the original causal relationship, indicating that the outcome is not driven by random factors.

Bootstrap Validation: Employs bootstrapping to generate multiple samples from the dataset, testing the consistency of the causal effect across these samples.

The selection of these refuters addresses a broad spectrum of potential biases and model sensitivities, including unobserved confounders and data dependencies. This comprehensive approach ensures robust verification of the causal analysis. Each refuter provides an orthogonal perspective, targeting specific vulnerabilities in causal analysis, which strengthens the overall credibility of the findings.
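
To make the logic of a refuter concrete, the placebo treatment idea can be sketched on synthetic data. This is a deliberately simplified illustration and not the DoWhy implementation: there are no confounders, the effect is estimated as a difference in means (equivalent to a linear regression on a single binary treatment), and all data, names and parameters are assumptions.

```python
import random


def difference_in_means(treatment, outcome):
    """Effect estimate: mean outcome of treated minus mean outcome of untreated."""
    treated = [y for t, y in zip(treatment, outcome) if t == 1]
    control = [y for t, y in zip(treatment, outcome) if t == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)


def placebo_refuter(treatment, outcome, n_simulations=200, seed=0):
    """Average effect estimate after replacing treatment with a shuffled placebo."""
    rng = random.Random(seed)
    effects = []
    for _ in range(n_simulations):
        placebo = treatment[:]
        rng.shuffle(placebo)  # breaks any link between treatment and outcome
        effects.append(difference_in_means(placebo, outcome))
    return sum(effects) / len(effects)


# Synthetic data: treatment shortens the outcome by 2 units on average.
rng = random.Random(42)
treatment = [i % 2 for i in range(200)]
outcome = [-2.0 * t + rng.gauss(0.0, 0.5) for t in treatment]

estimate = difference_in_means(treatment, outcome)
placebo_estimate = placebo_refuter(treatment, outcome)
print(round(estimate, 2), round(placebo_estimate, 2))
```

If the original estimate reflects a real dependence on the treatment, the placebo estimate should collapse toward zero, as it does in this synthetic case.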

UKB analysis

Study population.

We used information from the UKB cohort, which includes 502,164 study participants from the general UK population without enrichment for specific diseases. The study involved the enrollment of individuals between 2006 and 2010 across 22 different assessment centers throughout the United Kingdom. The tracking of participants is still ongoing. Within the UKB dataset, metabolomics, proteomics and retinal optical coherence tomography data are available for a subset of individuals without any enrichment for specific diseases. Additionally, EHRs, questionnaire responses and other physical measures are available for almost everyone in the study. Furthermore, a variety of genotype information is available for nearly the entire cohort, including whole-genome sequencing, whole-exome sequencing, genotyping array data as well as imputed genotypes from the genotyping array 44 . Because only the latter two are available for download, and are sufficient for polygenic risk score calculation as performed here, we used the imputed genotypes in the present study. Participants visited the assessment center up to four times for additional and repeat measurements and completed additional online follow-up questionnaires.

In the present study, we restricted the analyses to data obtained from the initial assessment, including the blood draw, for obtaining the metabolomics data and the retinal imaging as well as physical measures. This restricts the study population to 33,521 individuals for whom all of these modalities are available. We have a clear study start point for each individual with the date of their initial assessment center visit. The study population has a mean age of 57 years, is 54% female and is censored at age 69 years on average; 4.7% experienced an incident myocardial infarction; and 8.1% have prevalent type 2 diabetes. The study population comes from six of the 22 assessment centers due to the retinal imaging being performed only at those.

For the myocardial infarction endpoint definition, we relied on the first occurrence data available in the UKB, which compiles the first date that each diagnosis was recorded for a participant in a hospital in ICD-10 nomenclature. Subsequently, we mapped these data to phecodes and focused on phecode 404.1 for myocardial infarction.

The Framingham Risk Score was developed on data from 8,491 participants in the Framingham Heart Study to assess general cardiovascular risk 77 . It includes easily obtainable predictors and is, therefore, easily applicable in clinical practice, although newer and more specific risk scores exist and might be used more frequently. It includes age, sex, smoking behavior, blood pressure, total and low-density lipoprotein cholesterol as well as information on insulin, antihypertensive and cholesterol-lowering medications, all of which are routinely collected in the UKB and used in this study as the Framingham feature set.

The metabolomics data used in this study were obtained using proton NMR spectroscopy, a low-cost method with relatively low batch effects. It covers established clinical predictors, such as albumin and cholesterol, as well as a range of lipids, amino acids and carbohydrate-related metabolites.

The retinal optical coherence tomography–derived features were returned by researchers to the UKB 75 , 76 . They used the available scans and determined the macular volume, macular thickness, retinal pigment epithelium thickness, disc diameter, cup-to-disk ratio across different regions as well as the thickness between the inner nuclear layer and external limiting membrane, inner and outer photoreceptor segments and the retinal pigment epithelium across different regions. Furthermore, they determined a wide range of quality metrics for each scan, including the image quality score, minimum motion correlation and inner limiting membrane (ILM) indicator.

Data analysis

After exporting the data from the UKB, all timepoints were transformed into participant age entries. Only participants without prevalent myocardial infarction (relative to the first assessment center visit at which all data were collected) were included.

The data were pre-processed for retinal imaging and metabolomics subsets separately, to enable a clear analysis of missing data and allow for the k -nearest neighbors–based imputation ( k  = 20) of missing values when less than 10% were missing for a given participant. Otherwise, participants were dropped from the analyses. The imputed genotypes and Framingham analyses were available for almost every participant and, therefore, not imputed. Individuals without them were, instead, dropped from the analyses. Because genetic risk modeling poses entirely different methodological and computational challenges, we applied a published polygenic risk score for coronary heart disease using 6.6 million variants 73 . This was computed using the plink2 score option on the imputed genotypes available in the UKB.

UMAP embeddings were computed using default parameters on the full feature sets with ehrapy’s umap function. For all analyses, the same time-to-event and event-indicator columns were used. The event indicator is a Boolean variable indicating whether a myocardial infarction was observed for a study participant. If an event was observed, the time to event is defined as the timespan between the start of the study, in this case the date of the first assessment center visit, and the event. Otherwise, it is the timespan from the start of the study to the start of censoring; in this case, this is set to the last date for which EHRs were available, unless a participant died, in which case the date of death is the start of censoring. Kaplan–Meier curves and Cox proportional hazards models were fit using ehrapy’s survival analysis module and the lifelines 124 package’s CoxPHFitter class with default parameters. For Cox proportional hazards models with multiple feature sets, individually imputed and quality-controlled feature sets were concatenated, and the model was fit on the resulting matrix. Models were evaluated using the C-index 127 as a metric. It can be seen as an extension of the common area under the receiver operating characteristic curve to time-to-event datasets, in which events are not observed for every sample, and it ranges from 0.0 (entirely false) over 0.5 (random) to 1.0 (entirely correct). CIs for the C-index were computed based on bootstrapping by sampling 1,000 times with replacement from all computed partial hazards and computing the C-index over each of these samples. The percentiles at 2.5% and 97.5% then give the lower and upper confidence bounds for the 95% CIs.
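
The C-index and its bootstrap confidence interval can be sketched in plain Python. The implementation below is a simplified version of Harrell's C-index (pairs with tied times are ignored) and is not the lifelines code; the toy data, the function names and the small number of bootstrap rounds are assumptions made for the example.

```python
import random


def concordance_index(times, events, risks):
    """Share of comparable pairs in which the earlier event has the higher risk."""
    concordant, comparable = 0.0, 0
    for i in range(len(times)):
        if not events[i]:
            continue  # a censored sample cannot be the earlier member of a pair
        for j in range(len(times)):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    return concordant / comparable


def bootstrap_ci(times, events, risks, n_boot=1000, seed=0):
    """Percentile bootstrap interval (2.5% and 97.5%) for the C-index."""
    rng = random.Random(seed)
    n = len(times)
    scores = []
    while len(scores) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        try:
            scores.append(concordance_index([times[k] for k in idx],
                                            [events[k] for k in idx],
                                            [risks[k] for k in idx]))
        except ZeroDivisionError:
            continue  # resample without comparable pairs; draw again
    scores.sort()
    lower = scores[int(0.025 * n_boot)]
    upper = scores[max(int(0.975 * n_boot) - 1, 0)]
    return lower, upper


times = [2, 4, 6, 8, 10]
events = [True, True, False, True, False]
risks = [0.9, 0.4, 0.6, 0.7, 0.1]  # hypothetical partial hazards

c_index = concordance_index(times, events, risks)
lower, upper = bootstrap_ci(times, events, risks, n_boot=200)
print(c_index, lower, upper)
```

In this toy example, six of the eight comparable pairs are ordered correctly, which gives a C-index of 0.75.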

In all UKB analyses, the unit of study for a statistical test or predictive model is always an individual study participant.

The generalizability of the analysis is limited as the UK Biobank cohort may not represent the general population, with potential selection biases and underrepresentation of the different demographic groups. Additionally, by restricting analysis to initial assessment data and censoring based on the last available EHR or date of death, our analysis does not account for longitudinal changes and can introduce follow-up bias, especially if participants lost to follow-up have different risk profiles.

In-depth quality control of retina-derived features

A UMAP plot of the retina-derived features indicating the assessment centers shows a cluster of samples that lie somewhat outside the general population and mostly attended the Birmingham assessment center (Fig. 5b ). To further investigate this, we performed Leiden clustering with resolution 0.3 (Extended Data Fig. 9a ) and isolated this group in cluster 5. When comparing cluster 5 to the rest of the population in the retina-derived feature space, we noticed that many individuals in cluster 5 showed overall retinal pigment epithelium (RPE) thickness measures that were substantially elevated relative to the rest of the population in both eyes (Extended Data Fig. 9b ), which is mostly a feature of this cluster (Extended Data Fig. 9c ). To investigate potential confounding, we computed ratios between cluster 5 and the rest of the population over the ‘obs’ DataFrame containing the Framingham features, diabetes-related phecodes and genetic principal components. Of the five highest and five lowest ratios observed, six are in genetic principal components, which are commonly used to represent genetic ancestry in a continuous space (Extended Data Fig. 9d ). Additionally, diagnoses for type 1 and type 2 diabetes and antihypertensive use are enriched in cluster 5. Further investigating the ancestry, we computed log ratios for self-reported ancestries and absolute counts, which showed no robust enrichment or depletion effects.

A closer look at three quality control measures of the imaging pipeline revealed that cluster 5 was an outlier in terms of either image quality (Extended Data Fig. 9e ) or minimum motion correlation (Extended Data Fig. 9f ) and the ILM indicator (Extended Data Fig. 9g ), all of which can be indicative of artifacts in image acquisition and downstream processing 128 . Subsequently, we excluded 301 individuals from cluster 5 from all analyses.

COVID-19 chest-x-ray fate determination

Dataset overview.

We used the public BrixIA COVID-19 dataset, which contains 192 chest x-ray images annotated with BrixIA scores 82 . For each image, six regions were annotated with a disease severity score ranging from 0 to 3 by a senior radiologist with more than 20 years of experience and a junior radiologist. A global score was determined as the sum of the scores of all six regions and, therefore, ranges from 0 to 18 (S-Global). S-Global scores of 0 were classified as normal. Images that only had severity values up to 1 in all six regions were classified as mild. Images with severity values greater than or equal to 2, but an S-Global score of less than 7, were classified as moderate. All images that contained at least one 3 in any of the six regions with an S-Global score between 7 and 10 were classified as severe, and all remaining images with S-Global scores greater than 10 with at least one 3 were labeled critical. The dataset and instructions to download the images can be found at https://github.com/ieee8023/covid-chestxray-dataset .
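
The severity rules above translate directly into a small classification function. The sketch follows the stated rules literally; combinations that the rules do not cover (for example, an S-Global score of 7 or more without any region scored 3) are returned as None, which is an assumption of this example rather than part of the described procedure.

```python
def classify_brixia(region_scores):
    """Map six regional severity scores (0..3) to a severity class."""
    if len(region_scores) != 6 or any(s not in (0, 1, 2, 3) for s in region_scores):
        raise ValueError("expected six region scores in the range 0..3")
    s_global = sum(region_scores)
    highest = max(region_scores)
    if s_global == 0:
        return "normal"
    if highest <= 1:
        return "mild"
    if s_global < 7:
        return "moderate"  # at least one region scored 2 or 3
    if highest == 3 and s_global <= 10:
        return "severe"
    if highest == 3:
        return "critical"
    return None  # not covered by the stated rules


examples = [
    [0, 0, 0, 0, 0, 0],
    [1, 1, 0, 0, 1, 0],
    [2, 1, 1, 0, 0, 0],
    [3, 2, 1, 1, 1, 0],
    [3, 3, 3, 2, 1, 1],
]
labels = [classify_brixia(scores) for scores in examples]
print(labels)
```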

We first resized all images to 224 × 224. Afterwards, the images underwent a random affine transformation that involved rotation, translation and scaling. The rotation angle was randomly selected from a range of −45° to 45°. The images were also subject to horizontal and vertical translation, with the maximum translation being 15% of the image size in either direction. Additionally, the images were scaled by a factor ranging from 0.85 to 1.15. The purpose of applying these transformations was to enhance the dataset and introduce variations, ultimately improving the robustness and generalization of the model.

To generate embeddings, we used a pre-trained DenseNet model with weights densenet121-res224-all of TorchXRayVision 129 . A DenseNet is a convolutional neural network that makes use of dense connections between layers (Dense Blocks) where all layers (with matching feature map sizes) directly connect with each other. To maintain a feed-forward nature, every layer in the DenseNet architecture receives supplementary inputs from all preceding layers and transmits its own feature maps to all subsequent layers. The model was trained on the nih-pc-chex-mimic_ch-google-openi-rsna dataset 130 .

Next, we calculated 50 principal components on the feature representation of the DenseNet model of all images using ehrapy’s pca function. The principal component representation served as input for a nearest neighbors graph calculation using ehrapy’s neighbors function. This graph served as the basis for the calculation of a UMAP embedding with three components that was finally visualized using ehrapy.

We randomly picked a root in the group of images that was labeled ‘Normal’. First, we calculated so-called pseudotime by fitting a trajectory through the calculated UMAP space using diffusion maps as implemented in ehrapy’s dpt function 57 . Each image’s pseudotime value represents its estimated position along this trajectory, serving as a proxy for its severity stage relative to others in the dataset. To determine fates, we employed CellRank 58 , 59 with the PseudotimeKernel . This kernel computes transition probabilities for patient visits based on the connectivity of the k -nearest neighbors graph and the pseudotime values of patient visits, which resembles their progression through a process. Directionality is infused in the nearest neighbors graph in this process where the kernel either removes or downweights edges in the graph that contradict the directional flow of increasing pseudotime, thereby refining the graph to better reflect the developmental trajectory. We computed the transition matrix with a soft threshold scheme (Parameter of the PseudotimeKernel ), which downweights edges that point against the direction of increasing pseudotime. Finally, we calculated a projection on top of the UMAP embedding with CellRank using the plot_projection function of the PseudotimeKernel that we subsequently plotted.

This analysis is limited by the small dataset of 192 chest x-ray images, which may affect the model’s generalizability and robustness. Annotation subjectivity from radiologists can further introduce variability in severity scores. Additionally, the random selection of a root from ‘Normal’ images can introduce bias in pseudotime calculations and subsequent analyses.

Diabetes 130-US hospitals analysis

We used data from the Diabetes 130-US hospitals dataset that were collected between 1999 and 2008. It contains clinical care information at 130 hospitals and integrated delivery networks. The extracted database information pertains to hospital admissions specifically for patients diagnosed with diabetes. These encounters required a hospital stay ranging from 1 d to 14 d, during which both laboratory tests and medications were administered. The selection criteria focused exclusively on inpatient encounters with these defined characteristics. More specifically, we used a version that was curated by the Fairlearn team where the target variable ‘readmitted’ was binarized and a few features renamed or binned ( https://fairlearn.org/main/user_guide/datasets/diabetes_hospital_data.html ). The dataset contains 101,877 patient visits and 25 features. The dataset predominantly consists of White patients (74.8%), followed by African Americans (18.9%), with other racial groups, such as Hispanic, Asian and Unknown categories, comprising smaller percentages. Females make up a slight majority in the data at 53.8%, with males accounting for 46.2% and a negligible number of entries listed as unknown or invalid. A substantial majority of the patients are over 60 years of age (67.4%), whereas those aged 30–60 years represent 30.2%, and those 30 years or younger constitute just 2.5%.

All of the following descriptions start by loading the Fairlearn version of the Diabetes 130-US hospitals dataset using ehrapy’s dataloader as an AnnData object.

Selection and filtering bias

An overview of sensitive variables was generated using tableone. Subsequently, ehrapy’s CohortTracker was used to track the age, gender and race variables. The cohort was filtered for all Medicare recipients and subsequently plotted.

Surveillance bias

We plotted the HbA1c measurement ratios using ehrapy’s catplot .

Missing data and imputation bias

MCAR-type missing data for the number of medications variable (‘num_medications’) were introduced by randomly setting 30% of the values to be missing using NumPy’s choice function. We tested that the data are MCAR by applying ehrapy’s implementation of Little’s MCAR test, which returned a non-significant P value of 0.71. MAR data for the number of medications variable (‘num_medications’) were introduced by scaling the ‘time_in_hospital’ variable to have a mean of 0 and a standard deviation of 1, adjusting these values by multiplying by 1.2 and subtracting 0.6 to influence the overall missingness rate, and then using these values to generate MAR data in the ‘num_medications’ variable via a logistic transformation and binomial sampling. We verified that the newly introduced missing values are not MCAR with respect to the ‘time_in_hospital’ variable by applying ehrapy’s implementation of Little’s test, which was significant (P = 0.01 × 10^-2). The missing data were imputed using ehrapy’s mean imputation and MissForest implementation.
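
The MAR mechanism described above can be sketched step by step: standardize the driver variable, apply the linear adjustment, pass the result through a logistic function and draw one Bernoulli sample per value. The sketch uses the population standard deviation and toy inputs, both of which are assumptions; it is not the notebook code of the study.

```python
import math
import random


def standardize(values):
    """Scale values to mean 0 and (population) standard deviation 1."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / std for v in values]


def mar_missing_probabilities(driver, slope=1.2, shift=0.6):
    """Logistic transform of the adjusted, standardized driver variable."""
    return [1.0 / (1.0 + math.exp(-(slope * z - shift))) for z in standardize(driver)]


def apply_mar(target, driver, seed=0):
    """Set target values to None with a probability that depends on the driver."""
    rng = random.Random(seed)
    probabilities = mar_missing_probabilities(driver)
    return [None if rng.random() < p else v for v, p in zip(target, probabilities)]


time_in_hospital = [1, 2, 3, 4, 5]     # toy driver values
num_medications = [10, 12, 15, 20, 25]  # toy target values
probabilities = mar_missing_probabilities(time_in_hospital)
masked = apply_mar(num_medications, time_in_hospital)
print([round(p, 3) for p in probabilities], masked)
```

Because the probability of missingness rises with the length of stay, the missing values depend on an observed variable, which is what distinguishes MAR from MCAR.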

Algorithmic bias

Variables ‘race’, ‘gender’, ‘age’, ‘readmitted’, ‘readmit_binary’ and ‘discharge_disposition_id’ were moved to the ‘obs’ slot of the AnnData object to ensure that they were not used for model training. We built a binary label ‘readmit_30_days’ indicating whether a patient had been readmitted in fewer than 30 d. Next, we combined the ‘Asian’ and ‘Hispanic’ categories into a single ‘Other’ category within the ‘race’ column of our AnnData object, filtered out and discarded any samples labeled as ‘Unknown/Invalid’ under the ‘gender’ column and subsequently moved the ‘gender’ data to the variable matrix X of the AnnData object. All categorical variables were encoded. The data were split into train and test groups with a test size of 50%. The data were scaled, and a logistic regression model was trained using scikit-learn, which was also used to determine the balanced accuracy score. Fairlearn’s MetricFrame function was used to inspect the target model performance against the sensitive variable ‘race’. We subsequently fit Fairlearn’s ThresholdOptimizer using the logistic regression estimator with balanced_accuracy_score as the target objective. The algorithmic demonstration of Fairlearn’s abilities on this dataset is shown here: https://github.com/fairlearn/talks/tree/main/2021_scipy_tutorial .
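
The kind of per-group inspection described above can be illustrated without scikit-learn or Fairlearn. The sketch computes the balanced accuracy overall and separately for each value of a sensitive variable; the labels, predictions and group names are invented, and the helper names are hypothetical.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Mean of the recall on the positive class and on the negative class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return (tp / (tp + fn) + tn / (tn + fp)) / 2


def balanced_accuracy_by_group(y_true, y_pred, groups):
    """Balanced accuracy per group; each group must contain both classes."""
    buckets = defaultdict(lambda: ([], []))
    for t, p, g in zip(y_true, y_pred, groups):
        buckets[g][0].append(t)
        buckets[g][1].append(p)
    return {g: balanced_accuracy(ts, ps) for g, (ts, ps) in buckets.items()}


y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 1, 0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

overall = balanced_accuracy(y_true, y_pred)
per_group = balanced_accuracy_by_group(y_true, y_pred, groups)
print(overall, per_group)
```

An overall score of 0.75 hides that the model is perfect for one group and no better than chance for the other, which is the kind of disparity a disaggregated metric is meant to expose.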

Normalization bias

We one-hot encoded all categorical variables with ehrapy using the encode function. We applied ehrapy’s implementation of scaling normalization with and without the ‘Age group’ variable as group key to scale the data jointly and separately using ehrapy’s scale_norm function.
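
The difference between joint and group-wise scaling can be shown on a toy example. The function below is a plain-Python stand-in for the idea and not ehrapy's scale_norm; the function names, values and group labels are assumptions.

```python
import math
from collections import defaultdict


def zscore(values):
    """Scale to mean 0 and population standard deviation 1."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    if std == 0:
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


def scale_values(values, group_key=None):
    """Scale jointly, or separately within each group if group_key is given."""
    if group_key is None:
        return zscore(values)
    indices = defaultdict(list)
    for i, g in enumerate(group_key):
        indices[g].append(i)
    scaled = [0.0] * len(values)
    for idxs in indices.values():
        for i, z in zip(idxs, zscore([values[i] for i in idxs])):
            scaled[i] = z
    return scaled


values = [1.0, 3.0, 10.0, 30.0]
age_group = ["young", "young", "old", "old"]

jointly = scale_values(values)
separately = scale_values(values, group_key=age_group)
print(jointly, separately)
```

Scaled jointly, the values keep the offset between the two groups; scaled separately, both groups are centered on zero and the between-group difference disappears.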

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Data availability

Physionet provides access to the PIC database 43 at https://physionet.org/content/picdb/1.1.0 for credentialed users. The BrixIA images 82 are available at https://github.com/BrixIA/Brixia-score-COVID-19 . The data used in this study were obtained from the UK Biobank 44 ( https://www.ukbiobank.ac.uk/ ). Access to the UK Biobank resource was granted under application number 49966. The data are available to researchers upon application to the UK Biobank in accordance with their data access policies and procedures. The Diabetes 130-US Hospitals dataset is available at https://archive.ics.uci.edu/dataset/296/diabetes+130-us+hospitals+for+years+1999-2008 .

Code availability

The ehrapy source code is available at https://github.com/theislab/ehrapy under an Apache 2.0 license. Further documentation, tutorials and examples are available at https://ehrapy.readthedocs.io . We are actively developing the software and invite contributions from the community.

Jupyter notebooks to reproduce our analysis and figures, including Conda environments that specify all versions, are available at https://github.com/theislab/ehrapy-reproducibility .

Goldberger, A. L. et al. PhysioBank, PhysioToolkit, and PhysioNet: components of a new research resource for complex physiologic signals. Circulation 101 , E215–E220 (2000).

Atasoy, H., Greenwood, B. N. & McCullough, J. S. The digitization of patient care: a review of the effects of electronic health records on health care quality and utilization. Annu. Rev. Public Health 40 , 487–500 (2019).

Jamoom, E. W., Patel, V., Furukawa, M. F. & King, J. EHR adopters vs. non-adopters: impacts of, barriers to, and federal initiatives for EHR adoption. Health (Amst.) 2 , 33–39 (2014).

Rajkomar, A. et al. Scalable and accurate deep learning with electronic health records. NPJ Digit. Med. 1 , 18 (2018).

Wolf, A. et al. Data resource profile: Clinical Practice Research Datalink (CPRD) Aurum. Int. J. Epidemiol. 48 , 1740–1740g (2019).

Sudlow, C. et al. UK biobank: an open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLoS Med. 12 , e1001779 (2015).

Pollard, T. J. et al. The eICU Collaborative Research Database, a freely available multi-center database for critical care research. Sci. Data 5 , 180178 (2018).

Johnson, A. E. W. et al. MIMIC-III, a freely accessible critical care database. Sci. Data 3 , 160035 (2016).

Hyland, S. L. et al. Early prediction of circulatory failure in the intensive care unit using machine learning. Nat. Med. 26 , 364–373 (2020).

Rasmy, L. et al. Recurrent neural network models (CovRNN) for predicting outcomes of patients with COVID-19 on admission to hospital: model development and validation using electronic health record data. Lancet Digit. Health 4 , e415–e425 (2022).

Marcus, J. L. et al. Use of electronic health record data and machine learning to identify candidates for HIV pre-exposure prophylaxis: a modelling study. Lancet HIV 6 , e688–e695 (2019).

Kruse, C. S., Stein, A., Thomas, H. & Kaur, H. The use of electronic health records to support population health: a systematic review of the literature. J. Med. Syst. 42 , 214 (2018).

Sheikh, A., Jha, A., Cresswell, K., Greaves, F. & Bates, D. W. Adoption of electronic health records in UK hospitals: lessons from the USA. Lancet 384 , 8–9 (2014).

Sheikh, A. et al. Health information technology and digital innovation for national learning health and care systems. Lancet Digit. Health 3 , e383–e396 (2021).

Mc Cord, K. A. & Hemkens, L. G. Using electronic health records for clinical trials: where do we stand and where can we go? Can. Med. Assoc. J. 191 , E128–E133 (2019).

Landi, I. et al. Deep representation learning of electronic health records to unlock patient stratification at scale. NPJ Digit. Med. 3 , 96 (2020).

Ayaz, M., Pasha, M. F., Alzahrani, M. Y., Budiarto, R. & Stiawan, D. The Fast Health Interoperability Resources (FHIR) standard: systematic literature review of implementations, applications, challenges and opportunities. JMIR Med. Inform. 9 , e21929 (2021).

Peskoe, S. B. et al. Adjusting for selection bias due to missing data in electronic health records-based research. Stat. Methods Med. Res. 30 , 2221–2238 (2021).

Haneuse, S. & Daniels, M. A general framework for considering selection bias in EHR-based studies: what data are observed and why? EGEMS (Wash. DC) 4 , 1203 (2016).

Gallifant, J. et al. Disparity dashboards: an evaluation of the literature and framework for health equity improvement. Lancet Digit. Health 5 , e831–e839 (2023).

Sauer, C. M. et al. Leveraging electronic health records for data science: common pitfalls and how to avoid them. Lancet Digit. Health 4 , e893–e898 (2022).

Li, J. et al. Imputation of missing values for electronic health record laboratory data. NPJ Digit. Med. 4 , 147 (2021).

Rubin, D. B. Inference and missing data. Biometrika 63 , 581 (1976).

Scheid, L. M., Brown, L. S., Clark, C. & Rosenfeld, C. R. Data electronically extracted from the electronic health record require validation. J. Perinatol. 39 , 468–474 (2019).

Phelan, M., Bhavsar, N. A. & Goldstein, B. A. Illustrating informed presence bias in electronic health records data: how patient interactions with a health system can impact inference. EGEMS (Wash. DC). 5 , 22 (2017).

Secondary Analysis of Electronic Health Records (ed MIT Critical Data) (Springer, 2016).

Jetley, G. & Zhang, H. Electronic health records in IS research: quality issues, essential thresholds and remedial actions. Decis. Support Syst. 126 , 113137 (2019).

McCormack, J. P. & Holmes, D. T. Your results may vary: the imprecision of medical measurements. BMJ 368 , m149 (2020).

Hobbs, F. D. et al. Is the international normalised ratio (INR) reliable? A trial of comparative measurements in hospital laboratory and primary care settings. J. Clin. Pathol. 52 , 494–497 (1999).

Huguet, N. et al. Using electronic health records in longitudinal studies: estimating patient attrition. Med. Care 58 Suppl 6 Suppl 1 , S46–S52 (2020).

Zeng, J., Gensheimer, M. F., Rubin, D. L., Athey, S. & Shachter, R. D. Uncovering interpretable potential confounders in electronic medical records. Nat. Commun. 13 , 1014 (2022).

Getzen, E., Ungar, L., Mowery, D., Jiang, X. & Long, Q. Mining for equitable health: assessing the impact of missing data in electronic health records. J. Biomed. Inform. 139 , 104269 (2023).

Tang, S. et al. Democratizing EHR analyses with FIDDLE: a flexible data-driven preprocessing pipeline for structured clinical data. J. Am. Med. Inform. Assoc. 27 , 1921–1934 (2020).

Dagliati, A. et al. A process mining pipeline to characterize COVID-19 patients’ trajectories and identify relevant temporal phenotypes from EHR data. Front. Public Health 10 , 815674 (2022).

Sun, Y. & Zhou, Y.-H. A machine learning pipeline for mortality prediction in the ICU. Int. J. Digit. Health 2 , 3 (2022).

Mandyam, A., Yoo, E. C., Soules, J., Laudanski, K. & Engelhardt, B. E. COP-E-CAT: cleaning and organization pipeline for EHR computational and analytic tasks. In Proc. of the 12th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics. https://doi.org/10.1145/3459930.3469536 (Association for Computing Machinery, 2021).

Gao, C. A. et al. A machine learning approach identifies unresolving secondary pneumonia as a contributor to mortality in patients with severe pneumonia, including COVID-19. J. Clin. Invest. 133 , e170682 (2023).

Makam, A. N. et al. The good, the bad and the early adopters: providers’ attitudes about a common, commercial EHR. J. Eval. Clin. Pract. 20 , 36–42 (2014).

Amezquita, R. A. et al. Orchestrating single-cell analysis with Bioconductor. Nat. Methods 17 , 137–145 (2020).

Virshup, I. et al. The scverse project provides a computational ecosystem for single-cell omics data analysis. Nat. Biotechnol. 41 , 604–606 (2023).

Zou, Q. et al. Predicting diabetes mellitus with machine learning techniques. Front. Genet. 9 , 515 (2018).

Cios, K. J. & William Moore, G. Uniqueness of medical data mining. Artif. Intell. Med. 26 , 1–24 (2002).

Zeng, X. et al. PIC, a paediatric-specific intensive care database. Sci. Data 7 , 14 (2020).

Bycroft, C. et al. The UK Biobank resource with deep phenotyping and genomic data. Nature 562 , 203–209 (2018).

Lee, J. et al. Open-access MIMIC-II database for intensive care research. Annu. Int. Conf. Proc. IEEE Eng. Med. Biol. Soc. 2011 , 8315–8318 (2011).

Virshup, I., Rybakov, S., Theis, F. J., Angerer, P. & Alexander Wolf, F. anndata: annotated data. Preprint at bioRxiv https://doi.org/10.1101/2021.12.16.473007 (2021).

Voss, E. A. et al. Feasibility and utility of applications of the common data model to multiple, disparate observational health databases. J. Am. Med. Inform. Assoc. 22 , 553–564 (2015).

Vasilevsky, N. A. et al. Mondo: unifying diseases for the world, by the world. Preprint at medRxiv https://doi.org/10.1101/2022.04.13.22273750 (2022).

Harrison, J. E., Weber, S., Jakob, R. & Chute, C. G. ICD-11: an international classification of diseases for the twenty-first century. BMC Med. Inform. Decis. Mak. 21 , 206 (2021).

Köhler, S. et al. Expansion of the Human Phenotype Ontology (HPO) knowledge base and resources. Nucleic Acids Res. 47 , D1018–D1027 (2019).

Wu, P. et al. Mapping ICD-10 and ICD-10-CM codes to phecodes: workflow development and initial evaluation. JMIR Med. Inform. 7 , e14325 (2019).

Wolf, F. A., Angerer, P. & Theis, F. J. SCANPY: large-scale single-cell gene expression data analysis. Genome Biol. 19 , 15 (2018).

Pedregosa, F. et al. Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12 , 2825–2830 (2011).

de Haan-Rietdijk, S., Kuppens, P. & Hamaker, E. L. What’s in a day? A guide to decomposing the variance in intensive longitudinal data. Front. Psychol. 7 , 891 (2016).

Pedersen, E. S. L., Danquah, I. H., Petersen, C. B. & Tolstrup, J. S. Intra-individual variability in day-to-day and month-to-month measurements of physical activity and sedentary behaviour at work and in leisure-time among Danish adults. BMC Public Health 16 , 1222 (2016).

Roffey, D. M., Byrne, N. M. & Hills, A. P. Day-to-day variance in measurement of resting metabolic rate using ventilated-hood and mouthpiece & nose-clip indirect calorimetry systems. JPEN J. Parenter. Enter. Nutr. 30 , 426–432 (2006).

Haghverdi, L., Büttner, M., Wolf, F. A., Buettner, F. & Theis, F. J. Diffusion pseudotime robustly reconstructs lineage branching. Nat. Methods 13 , 845–848 (2016).

Lange, M. et al. CellRank for directed single-cell fate mapping. Nat. Methods 19 , 159–170 (2022).

Weiler, P., Lange, M., Klein, M., Pe'er, D. & Theis, F. CellRank 2: unified fate mapping in multiview single-cell data. Nat. Methods 21 , 1196–1205 (2024).

Zhang, S. et al. Cost of management of severe pneumonia in young children: systematic analysis. J. Glob. Health 6 , 010408 (2016).

Torres, A. et al. Pneumonia. Nat. Rev. Dis. Prim. 7 , 25 (2021).

Traag, V. A., Waltman, L. & van Eck, N. J. From Louvain to Leiden: guaranteeing well-connected communities. Sci. Rep. 9 , 5233 (2019).

Kamin, W. et al. Liver involvement in acute respiratory infections in children and adolescents—results of a non-interventional study. Front. Pediatr. 10 , 840008 (2022).

Shi, T. et al. Risk factors for mortality from severe community-acquired pneumonia in hospitalized children transferred to the pediatric intensive care unit. Pediatr. Neonatol. 61 , 577–583 (2020).

Dudnyk, V. & Pasik, V. Liver dysfunction in children with community-acquired pneumonia: the role of infectious and inflammatory markers. J. Educ. Health Sport 11 , 169–181 (2021).

Charpignon, M.-L. et al. Causal inference in medical records and complementary systems pharmacology for metformin drug repurposing towards dementia. Nat. Commun. 13 , 7652 (2022).

Grief, S. N. & Loza, J. K. Guidelines for the evaluation and treatment of pneumonia. Prim. Care 45 , 485–503 (2018).

Paul, M. Corticosteroids for pneumonia. Cochrane Database Syst. Rev. 12 , CD007720 (2017).

Sharma, A. & Kiciman, E. DoWhy: an end-to-end library for causal inference. Preprint at arXiv https://doi.org/10.48550/ARXIV.2011.04216 (2020).

Khilnani, G. C. et al. Guidelines for antibiotic prescription in intensive care unit. Indian J. Crit. Care Med. 23 , S1–S63 (2019).

Harris, L. K. & Crannage, A. J. Corticosteroids in community-acquired pneumonia: a review of current literature. J. Pharm. Technol. 37 , 152–160 (2021).

Dou, L. et al. Decreased hospital length of stay with early administration of oseltamivir in patients hospitalized with influenza. Mayo Clin. Proc. Innov. Qual. Outcomes 4 , 176–182 (2020).

Khera, A. V. et al. Genome-wide polygenic scores for common diseases identify individuals with risk equivalent to monogenic mutations. Nat. Genet. 50 , 1219–1224 (2018).

Julkunen, H. et al. Atlas of plasma NMR biomarkers for health and disease in 118,461 individuals from the UK Biobank. Nat. Commun. 14 , 604 (2023).

Ko, F. et al. Associations with retinal pigment epithelium thickness measures in a large cohort: results from the UK Biobank. Ophthalmology 124 , 105–117 (2017).

Patel, P. J. et al. Spectral-domain optical coherence tomography imaging in 67 321 adults: associations with macular thickness in the UK Biobank study. Ophthalmology 123 , 829–840 (2016).

D’Agostino Sr, R. B. et al. General cardiovascular risk profile for use in primary care: the Framingham Heart Study. Circulation 117 , 743–753 (2008).

Buergel, T. et al. Metabolomic profiles predict individual multidisease outcomes. Nat. Med. 28 , 2309–2320 (2022).

Xu, Y. et al. An atlas of genetic scores to predict multi-omic traits. Nature 616 , 123–131 (2023).

Saelens, W., Cannoodt, R., Todorov, H. & Saeys, Y. A comparison of single-cell trajectory inference methods. Nat. Biotechnol. 37 , 547–554 (2019).

Rousan, L. A., Elobeid, E., Karrar, M. & Khader, Y. Chest x-ray findings and temporal lung changes in patients with COVID-19 pneumonia. BMC Pulm. Med. 20 , 245 (2020).

Signoroni, A. et al. BS-Net: learning COVID-19 pneumonia severity on a large chest X-ray dataset. Med. Image Anal. 71 , 102046 (2021).

Bird, S. et al. Fairlearn: a toolkit for assessing and improving fairness in AI. https://www.microsoft.com/en-us/research/publication/fairlearn-a-toolkit-for-assessing-and-improving-fairness-in-ai/ (2020).

Strack, B. et al. Impact of HbA1c measurement on hospital readmission rates: analysis of 70,000 clinical database patient records. BioMed. Res. Int. 2014 , 781670 (2014).

Stekhoven, D. J. & Bühlmann, P. MissForest—non-parametric missing value imputation for mixed-type data. Bioinformatics 28 , 112–118 (2012).

Banerjee, A. et al. Identifying subtypes of heart failure from three electronic health record sources with machine learning: an external, prognostic, and genetic validation study. Lancet Digit. Health 5 , e370–e379 (2023).

Nagamine, T. et al. Data-driven identification of heart failure disease states and progression pathways using electronic health records. Sci. Rep. 12 , 17871 (2022).

Da Silva Filho, J. et al. Disease trajectories in hospitalized COVID-19 patients are predicted by clinical and peripheral blood signatures representing distinct lung pathologies. Preprint at bioRxiv https://doi.org/10.1101/2023.09.08.23295024 (2023).

Haneuse, S., Arterburn, D. & Daniels, M. J. Assessing missing data assumptions in EHR-based studies: a complex and underappreciated task. JAMA Netw. Open 4 , e210184 (2021).

Little, R. J. A. A test of missing completely at random for multivariate data with missing values. J. Am. Stat. Assoc. 83 , 1198–1202 (1988).

Jakobsen, J. C., Gluud, C., Wetterslev, J. & Winkel, P. When and how should multiple imputation be used for handling missing data in randomised clinical trials—a practical guide with flowcharts. BMC Med. Res. Methodol. 17 , 162 (2017).

Dziura, J. D., Post, L. A., Zhao, Q., Fu, Z. & Peduzzi, P. Strategies for dealing with missing data in clinical trials: from design to analysis. Yale J. Biol. Med. 86 , 343–358 (2013).

White, I. R., Royston, P. & Wood, A. M. Multiple imputation using chained equations: issues and guidance for practice. Stat. Med. 30 , 377–399 (2011).

Jäger, S., Allhorn, A. & Bießmann, F. A benchmark for data imputation methods. Front. Big Data 4 , 693674 (2021).

Waljee, A. K. et al. Comparison of imputation methods for missing laboratory data in medicine. BMJ Open 3 , e002847 (2013).

Ibrahim, J. G. & Molenberghs, G. Missing data methods in longitudinal studies: a review. Test (Madr.) 18 , 1–43 (2009).

Li, C., Alsheikh, A. M., Robinson, K. A. & Lehmann, H. P. Use of recommended real-world methods for electronic health record data analysis has not improved over 10 years. Preprint at bioRxiv https://doi.org/10.1101/2023.06.21.23291706 (2023).

Regev, A. et al. The Human Cell Atlas. eLife 6 , e27041 (2017).

Megill, C. et al. cellxgene: a performant, scalable exploration platform for high dimensional sparse matrices. Preprint at bioRxiv https://doi.org/10.1101/2021.04.05.438318 (2021).

Speir, M. L. et al. UCSC Cell Browser: visualize your single-cell data. Bioinformatics 37 , 4578–4580 (2021).

Hunter, J. D. Matplotlib: a 2D graphics environment. Comput. Sci. Eng. 9 , 90–95 (2007).

Waskom, M. seaborn: statistical data visualization. J. Open Source Softw. 6 , 3021 (2021).

Harris, C. R. et al. Array programming with NumPy. Nature 585 , 357–362 (2020).

Lam, S. K., Pitrou, A. & Seibert, S. Numba: a LLVM-based Python JIT compiler. In Proc. of the Second Workshop on the LLVM Compiler Infrastructure in HPC. https://doi.org/10.1145/2833157.2833162 (Association for Computing Machinery, 2015).

Virtanen, P. et al. SciPy 1.0: fundamental algorithms for scientific computing in Python. Nat. Methods 17 , 261–272 (2020).

McKinney, W. Data structures for statistical computing in Python. In Proc. of the 9th Python in Science Conference (eds van der Walt, S. & Millman, J.). https://doi.org/10.25080/majora-92bf1922-00a (SciPy, 2010).

Boulanger, A. Open-source versus proprietary software: is one more reliable and secure than the other? IBM Syst. J. 44 , 239–248 (2005).

Rocklin, M. Dask: parallel computation with blocked algorithms and task scheduling. In Proc. of the 14th Python in Science Conference. https://doi.org/10.25080/majora-7b98e3ed-013 (SciPy, 2015).

Pivarski, J. et al. Awkward Array. https://doi.org/10.5281/ZENODO.4341376

Collette, A. Python and HDF5: Unlocking Scientific Data (O’Reilly Media, Inc., 2013).

Miles, A. et al. zarr-developers/zarr-python: v2.13.6. https://doi.org/10.5281/zenodo.7541518 (2023).

The pandas development team. pandas-dev/pandas: Pandas. https://doi.org/10.5281/ZENODO.3509134 (2024).

Weberpals, J. et al. Deep learning-based propensity scores for confounding control in comparative effectiveness research: a large-scale, real-world data study. Epidemiology 32 , 378–388 (2021).

Rosenthal, J. et al. Building tools for machine learning and artificial intelligence in cancer research: best practices and a case study with the PathML toolkit for computational pathology. Mol. Cancer Res. 20 , 202–206 (2022).

Gayoso, A. et al. A Python library for probabilistic analysis of single-cell omics data. Nat. Biotechnol. 40 , 163–166 (2022).

Paszke, A. et al. PyTorch: an imperative style, high-performance deep learning library. In Advances in Neural Information Processing Systems 32 (eds Wallach, H. et al.). 8024–8035 (Curran Associates, 2019).

Frostig, R., Johnson, M. & Leary, C. Compiling machine learning programs via high-level tracing. https://cs.stanford.edu/~rfrostig/pubs/jax-mlsys2018.pdf (2018).

Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616 , 259–265 (2023).

Kraljevic, Z. et al. Multi-domain clinical natural language processing with MedCAT: the Medical Concept Annotation Toolkit. Artif. Intell. Med. 117 , 102083 (2021).

Pollard, T. J., Johnson, A. E. W., Raffa, J. D. & Mark, R. G. An open source Python package for producing summary statistics for research papers. JAMIA Open 1 , 26–31 (2018).

Ellen, J. G. et al. Participant flow diagrams for health equity in AI. J. Biomed. Inform. 152 , 104631 (2024).

Schouten, R. M. & Vink, G. The dance of the mechanisms: how observed information influences the validity of missingness assumptions. Sociol. Methods Res. 50 , 1243–1258 (2021).

Johnson, W. E., Li, C. & Rabinovic, A. Adjusting batch effects in microarray expression data using empirical Bayes methods. Biostatistics 8 , 118–127 (2007).

Davidson-Pilon, C. lifelines: survival analysis in Python. J. Open Source Softw. 4 , 1317 (2019).

Benjamini, Y. & Hochberg, Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. R. Stat. Soc. Ser. B Stat. Methodol. 57 , 289–300 (1995).

Wishart, D. S. et al. DrugBank: a comprehensive resource for in silico drug discovery and exploration. Nucleic Acids Res. 34 , D668–D672 (2006).

Harrell, F. E. Jr, Califf, R. M., Pryor, D. B., Lee, K. L. & Rosati, R. A. Evaluating the yield of medical tests. JAMA 247 , 2543–2546 (1982).

Currant, H. et al. Genetic variation affects morphological retinal phenotypes extracted from UK Biobank optical coherence tomography images. PLoS Genet. 17 , e1009497 (2021).

Cohen, J. P. et al. TorchXRayVision: a library of chest X-ray datasets and models. In Proc. of the 5th International Conference on Medical Imaging with Deep Learning (eds Konukoglu, E. et al.). 172 , 231–249 (PMLR, 2022).

Cohen, J.P., Hashir, M., Brooks, R. & Bertrand, H. On the limits of cross-domain generalization in automated X-ray prediction. In Proceedings of Machine Learning Research , Vol. 121 (eds Arbel, T. et al.) 136–155 (PMLR, 2020).

Acknowledgements

We thank M. Ansari who designed the ehrapy logo. The authors thank F. A. Wolf, M. Lücken, J. Steinfeldt, B. Wild, G. Rätsch and D. Shung for feedback on the project. We further thank L. Halle, Y. Ji, M. Lücken and R. K. Rubens for constructive comments on the paper. We thank F. Hashemi for her help in implementing the survival analysis module. This research was conducted using data from the UK Biobank, a major biomedical database ( https://www.ukbiobank.ac.uk ), under application number 49966. This work was supported by the German Center for Lung Research (DZL), the Helmholtz Association and the CRC/TRR 359 Perinatal Development of Immune Cell Topology (PILOT). N.H. and F.J.T. acknowledge support from the German Federal Ministry of Education and Research (BMBF) (LODE, 031L0210A), co-funded by the European Union (ERC, DeepCell, 101054957). A.N. is supported by the Konrad Zuse School of Excellence in Learning and Intelligent Systems (ELIZA) through the DAAD program Konrad Zuse Schools of Excellence in Artificial Intelligence, sponsored by the Federal Ministry of Education and Research. This work was also supported by the Chan Zuckerberg Initiative (CZIF2022-007488; Human Cell Atlas Data Ecosystem).

Open access funding provided by Helmholtz Zentrum München - Deutsches Forschungszentrum für Gesundheit und Umwelt (GmbH).

Author information

Authors and affiliations

Institute of Computational Biology, Helmholtz Munich, Munich, Germany

Lukas Heumos, Philipp Ehmele, Tim Treis, Eljas Roellin, Lilly May, Altana Namsaraeva, Nastassya Horlava, Vladimir A. Shitov, Xinyue Zhang, Luke Zappia, Leon Hetzel, Isaac Virshup, Lisa Sikkema, Fabiola Curion & Fabian J. Theis

Institute of Lung Health and Immunity and Comprehensive Pneumology Center with the CPC-M bioArchive; Helmholtz Zentrum Munich; member of the German Center for Lung Research (DZL), Munich, Germany

Lukas Heumos, Niklas J. Lang, Herbert B. Schiller & Anne Hilgendorff

TUM School of Life Sciences Weihenstephan, Technical University of Munich, Munich, Germany

Lukas Heumos, Tim Treis, Nastassya Horlava, Vladimir A. Shitov, Lisa Sikkema & Fabian J. Theis

Health Data Science Unit, Heidelberg University and BioQuant, Heidelberg, Germany

Julius Upmeier zu Belzen & Roland Eils

Department of Mathematics, School of Computation, Information and Technology, Technical University of Munich, Munich, Germany

Eljas Roellin, Lilly May, Luke Zappia, Leon Hetzel, Fabiola Curion & Fabian J. Theis

Konrad Zuse School of Excellence in Learning and Intelligent Systems (ELIZA), Darmstadt, Germany

Altana Namsaraeva

Systems Medicine, Deutsches Zentrum für Neurodegenerative Erkrankungen (DZNE), Bonn, Germany

Rainer Knoll

Center for Digital Health, Berlin Institute of Health (BIH) at Charité – Universitätsmedizin Berlin, Berlin, Germany

Roland Eils

Research Unit, Precision Regenerative Medicine (PRM), Helmholtz Munich, Munich, Germany

Herbert B. Schiller

Center for Comprehensive Developmental Care (CDeCLMU) at the Social Pediatric Center, Dr. von Hauner Children’s Hospital, LMU Hospital, Ludwig Maximilian University, Munich, Germany

Anne Hilgendorff

Contributions

L. Heumos and F.J.T. conceived the study. L. Heumos, P.E., X.Z., E.R., L.M., A.N., L.Z., V.S., T.T., L. Hetzel, N.H., R.K. and I.V. implemented ehrapy. L. Heumos, P.E., N.L., L.S., T.T. and A.H. analyzed the PIC database. J.U.z.B. and L. Heumos analyzed the UK Biobank database. X.Z. and L. Heumos analyzed the COVID-19 chest x-ray dataset. L. Heumos, P.E. and J.U.z.B. wrote the paper. F.J.T., A.H., H.B.S. and R.E. supervised the work. All authors read, corrected and approved the final paper.

Corresponding author

Correspondence to Fabian J. Theis .

Ethics declarations

Competing interests

L. Heumos is an employee of LaminLabs. F.J.T. consults for Immunai Inc., Singularity Bio B.V., CytoReason Ltd. and Omniscope Ltd. and has ownership interest in Dermagnostix GmbH and Cellarity. The remaining authors declare no competing interests.

Peer review

Peer review information

Nature Medicine thanks Leo Anthony Celi and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Primary handling editor: Lorenzo Righetto, in collaboration with the Nature Medicine team.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Overview of the Paediatric Intensive Care (PIC) database.

The database consists of several tables corresponding to several data modalities and measurement types. All tables colored in green were selected for analysis and all tables in blue were discarded based on coverage rate. Despite the high coverage rate, we discarded the ‘OR_EXAM_REPORTS’ table because of the lack of detail in the exam reports.

Extended Data Fig. 2 Preprocessing of the Paediatric Intensive Care (PIC) dataset with ehrapy.

( a ) Heterogeneous data of the PIC database were stored in ‘data’ (the matrix that is used for computations) and ‘observations’ (metadata per patient visit). During quality control, further annotations are added to the ‘variables’ (metadata per feature) slot. ( b ) Preprocessing steps of the PIC dataset. ( c ) Example of the function calls in the data analysis pipeline that correspond to the preprocessing steps in ( b ) using ehrapy.

Extended Data Fig. 3 Missing data distribution for the ‘youths’ group of the PIC dataset.

The x-axis represents the percentage of missing values in each feature. The y-axis reflects the number of features in each bin with text labels representing the names of the individual features.

Extended Data Fig. 4 Patient selection during analysis of the PIC dataset.

Filtering for the pneumonia cohort of the ‘youths’ group excludes all care units except the general intensive care unit and the pediatric intensive care unit.

Extended Data Fig. 5 Feature rankings of stratified patient groups.

Scores reflect the z-score underlying the p-value per measurement for each group. Higher scores (above 0) reflect overrepresentation of the measurement compared to all other groups and vice versa. ( a ) By clinical chemistry. ( b ) By liver markers. ( c ) By medication type. ( d ) By infection markers.
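
How a z-score relates to a P value can be sketched as follows. This is an illustration only: it assumes a two-sided test against a standard normal null distribution, which may differ from the test used for the rankings, and the measurement names and scores are hypothetical.

```python
from statistics import NormalDist


def two_sided_p_value(z):
    """Two-sided p-value of a z-score under a standard normal null."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


# Hypothetical scores: positive means overrepresented in the group,
# negative means underrepresented relative to all other groups.
scores = {"measurement_a": 2.5, "measurement_b": -0.3}
p_values = {name: two_sided_p_value(z) for name, z in scores.items()}
```

The sign of the score carries the direction of the effect, while its magnitude determines the P value, so a score far from 0 in either direction corresponds to a small P value.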

Extended Data Fig. 6 Liver marker value progression for the ‘youths’ group and Kaplan-Meier curves.

( a ) Viral and severe pneumonia with co-infection groups display enriched gamma-glutamyl transferase levels in blood serum. ( b ) Aspartate transaminase (AST) and alanine transaminase (ALT) levels are enriched for severe pneumonia with co-infection during early ICU stay. ( c ) and ( d ) Kaplan-Meier curves for ALT and AST demonstrate lower survival for children with measurements outside the normal range.
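
For readers unfamiliar with Kaplan-Meier curves, the estimator can be sketched in a few lines. This is a minimal illustration, not the implementation used in the paper; the function name `kaplan_meier` and the toy follow-up data are hypothetical.

```python
def kaplan_meier(durations, events):
    """Return [(time, survival_probability)] at each distinct event time.

    durations: follow-up time per patient
    events: 1 if the event (for example, death) was observed, 0 if censored
    """
    survival = 1.0
    curve = []
    # Only times at which at least one event was observed change the estimate.
    for t in sorted({d for d, e in zip(durations, events) if e}):
        at_risk = sum(1 for d in durations if d >= t)
        died = sum(1 for d, e in zip(durations, events) if d == t and e)
        survival *= 1.0 - died / at_risk
        curve.append((t, survival))
    return curve


# Hypothetical follow-up times (days) and event indicators for five patients.
durations = [2, 3, 3, 5, 8]
events = [1, 1, 0, 1, 0]
curve = kaplan_meier(durations, events)
```

Censored patients still count towards the number at risk until their last follow-up time, which is what distinguishes this estimate from a simple proportion of survivors.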

Extended Data Fig. 7 Overview of medication categories used for causal inference.

( a ) Feature engineering process to group administered medications into medication categories using DrugBank. ( b ) Number of medications per medication category. ( c ) Number of patients that received (dark blue) and did not receive (light blue) specific medication categories.

Extended Data Fig. 8 UK-Biobank data overview and quality control across modalities.

( a ) UMAP plot of the metabolomics data demonstrating a clear gradient with respect to age at sampling, and ( b ) type 2 diabetes prevalence. ( c ) Analogously, the features derived from retinal imaging show a less pronounced age gradient, and ( d ) type 2 diabetes prevalence gradient. ( e ) Stratifying myocardial infarction risk by type 2 diabetes (T2D) comorbidity confirms vastly increased risk with a prior T2D diagnosis. Kaplan-Meier estimators with 95% confidence intervals are shown. ( f ) Similarly, the polygenic risk score for coronary heart disease used in this work substantially enriches myocardial infarction risk in its top 5%. Kaplan-Meier estimators with 95% confidence intervals are shown. ( g ) UMAP visualization of the metabolomics features colored by the assessment center shows no discernible biases. (a–g) n = 29,216.

Extended Data Fig. 9 UK-Biobank retina derived feature quality control.

( a ) Leiden clustering of the retina-derived feature space. ( b ) Comparison of ‘overall retinal pigment epithelium (RPE) thickness’ values between cluster 5 (n = 301) and the rest of the population (n = 28,915). ( c ) RPE thickness in the right eye outliers on the UMAP largely corresponds to cluster 5. ( d ) Log ratio of top and bottom 5 fields in obs dataframe between cluster 5 and the rest of the population. ( e ) Image quality of the optical coherence tomography scan as reported in the UKB. ( f ) Minimum motion correlation quality control indicator. ( g ) Inner limiting membrane (ILM) quality control indicator. (d–g) Data are shown for the right eye only; comparable results for the left eye are omitted. (a–g) n = 29,216.

Extended Data Fig. 10 Bias detection and mitigation study on the Diabetes 130-US hospitals dataset (n = 101,766 hospital visits, one patient can have multiple visits).

( a ) Filtering to the visits of Medicare recipients results in an increase of Caucasians. ( b ) Proportion of visits where HbA1c measurements are recorded, stratified by admission type. Adjusted P values were calculated with chi-squared tests and Bonferroni correction (adjusted P values: Emergency vs Referral 3.3E-131, Emergency vs Other 1.4E-101, Referral vs Other 1.6E-4). ( c ) Normalizing feature distributions jointly vs. separately can mask distribution differences. ( d ) Imputing the number of medications for visits. Onto the complete data (blue), MCAR (30% missing data) and MAR (38% missing data) were introduced (orange), with the MAR mechanism depending on the time in hospital. Mean imputation (green) can reduce the variance of the distribution under MCAR and MAR mechanisms, and bias the center of the distribution under an MAR mechanism. Multiple imputation, such as MissForest imputation, can impute meaningfully even in MAR cases when it has access to the variables involved in the MAR mechanism. Each boxplot represents the IQR of the data, with the horizontal line inside the box indicating the median value. The left and right bounds of the box represent the first and third quartiles, respectively. The ‘whiskers’ extend to the minimum and maximum values within 1.5 times the IQR from the lower and upper quartiles, respectively. ( e ) Predicting the early readmission within 30 days after release on a per-stay level. Balanced accuracy can mask differences in selection and false negative rate between sensitive groups.
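
The variance-shrinking effect of mean imputation described for panel ( d ) can be reproduced with a small simulation. This is a sketch on synthetic data, not the analysis from the paper: the Gaussian variable, its parameters, the fixed seed and the 30% MCAR rate are assumptions chosen for illustration.

```python
import random
from statistics import mean, pvariance

rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical complete variable (for example, number of medications per visit).
complete = [rng.gauss(16.0, 8.0) for _ in range(2000)]

# MCAR: every value has the same 30% chance of being missing,
# independent of its own value and of any other variable.
observed = [x if rng.random() >= 0.3 else None for x in complete]

# Mean imputation: replace every missing value with the observed mean.
observed_values = [x for x in observed if x is not None]
fill = mean(observed_values)
imputed = [fill if x is None else x for x in observed]

var_complete = pvariance(complete)
var_imputed = pvariance(imputed)
```

Because every imputed value sits exactly at the observed mean, the imputed values add no spread, so `var_imputed` comes out smaller than `var_complete` while the mean of the imputed data equals the observed mean.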

Supplementary information

Supplementary Tables 1 and 2

Reporting Summary

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .

About this article

Cite this article.

Heumos, L., Ehmele, P., Treis, T. et al. An open-source framework for end-to-end analysis of electronic health record data. Nat Med (2024). https://doi.org/10.1038/s41591-024-03214-0

Received : 11 December 2023

Accepted : 25 July 2024

Published : 12 September 2024

DOI : https://doi.org/10.1038/s41591-024-03214-0

  • Open access
  • Published: 13 September 2024

Understanding disciplinary perspectives: a framework to develop skills for interdisciplinary research collaborations of medical experts and engineers

  • Sophie van Baalen   ORCID: orcid.org/0000-0002-1592-3276 1 , 2 &
  • Mieke Boon   ORCID: orcid.org/0000-0003-2492-2854 1  

BMC Medical Education volume 24, Article number: 1000 (2024)

Health professionals need to be prepared for interdisciplinary research collaborations aimed at the development and implementation of medical technology. Expertise is highly domain-specific, and learned by being immersed in professional practice. Therefore, the approaches and results from one domain are not easily understood by experts from another domain. Interdisciplinary collaboration in medical research faces not only institutional, but also cognitive and epistemological barriers. This is one of the reasons why interdisciplinary and interprofessional research collaborations are so difficult. To explain the cognitive and epistemological barriers, we introduce the concept of disciplinary perspectives. Making explicit the disciplinary perspectives of experts participating in interdisciplinary collaborations helps to clarify the specific approach of each expert, thereby improving mutual understanding.

We developed a framework for making disciplinary perspectives of experts participating in an interdisciplinary research collaboration explicit. The applicability of the framework has been tested in an interdisciplinary medical research project aimed at the development and implementation of diffusion MRI for the diagnosis of kidney cancer, where the framework was applied to analyse and articulate the disciplinary perspectives of the experts involved.

We propose a general framework, in the form of a series of questions, based on new insights from the philosophy of science into the epistemology of interdisciplinary research. We explain these philosophical underpinnings in order to clarify the cognitive and epistemological barriers of interdisciplinary research collaborations. In addition, we present a detailed example of the use of the framework in a concrete interdisciplinary research project aimed at developing a diagnostic technology. This case study demonstrates the applicability of the framework in interdisciplinary research projects.

Interdisciplinary research collaborations can be facilitated by a better understanding of how an expert’s disciplinary perspective enables and guides their specific approach to a problem. Implicit disciplinary perspectives can and should be made explicit in a systematic manner, for which we propose a framework that can be used by disciplinary experts participating in interdisciplinary research projects. Furthermore, we suggest that educators can explore how the framework and its philosophical underpinnings can be implemented in health professions education (HPE) to support the development of students’ interdisciplinary expertise.

Expertise is highly domain-specific, and learned by being immersed in professional practice [ 1 ]. However, today’s rapidly evolving health care systems require clinicians who are capable of meeting complex challenges [ 2 ], which often requires interdisciplinary and interprofessional collaborations between experts from distinct disciplines. Footnote 1 With the increasingly central role of innovative medical technologies in many medical specialties [ 3 ], health professionals will presumably participate in interdisciplinary and interprofessional research collaborations. But interprofessional and interdisciplinary research collaborations are notoriously difficult (e.g., [ 4 , 5 , 6 , 7 ]). Boon et al. (2019) argue that the complexity of current medical practices requires interdisciplinary expertise, which is an extension of adaptive expertise [ 8 ]. Interdisciplinary expertise involves the ability to understand the role of disciplinary perspectives.

In this paper, we combine insights from the philosophy of science on disciplinary perspectives and practice experience from an interdisciplinary medical research project aimed at the development and implementation of diffusion MRI for the diagnosis of kidney cancer. Based on these insights and practice experience, we propose a framework for mitigating cognitive and epistemological barriers caused by different disciplinary perspectives. In addition, we present a detailed example of the use of the framework to analyse and explain the experts’ disciplinary perspectives in the aforementioned interdisciplinary research project aimed at developing a diagnostic technology. This case study demonstrates the use of the framework in interdisciplinary research projects. The framework can be used by health professionals to facilitate their interdisciplinary research projects, by analysing and explaining their disciplinary perspectives.

Interdisciplinary research

To address the barriers to interdisciplinary research, various authors have developed analytical frameworks to guide the research process and help disciplinary experts understand what it takes to execute projects together with experts from other disciplines [ 9 , 10 , 11 , 12 ]. Menken et al. (2016), for example, provide a method for interdisciplinary research that is very similar to the traditional empirical cycle, including steps such as “identify problem or topic,” “formulate preliminary research questions,” “data collection” and “draw conclusions” [ 11 ]. Other frameworks describe which steps need to be taken in the interdisciplinary research process. In the literature on team science, several authors also aim to provide a better understanding of the process of interdisciplinary research. For example, Hasan et al. (2023) focus on the ‘micro’ layer of the team science ecosystem proposed by Stokols et al. (2019), that is, the layer of individual team members collaborating in interdisciplinary research projects [ 13 , 14 ]. From their analysis of online collaborations between early academics from different fields, they provide insights into common issues in interdisciplinary research and methods for dealing with them. By applying their framework from the start of the interdisciplinary research process, they argue, interdisciplinary capture [ 15 ] can be avoided.

Although the aforementioned frameworks provide valuable guidance on the process of interdisciplinary collaboration, they do not address the deeper cognitive and epistemological challenges of interdisciplinary research collaboration [ 5 , 16 ], which is the objective of our contribution. A crucial assumption in current frameworks seems to be that interdisciplinary research collaboration is learned by doing, and that the integration of different disciplines will automatically follow. Footnote 2 In our view, however, the integration of different disciplines is both crucial and one of the most challenging aspects of interdisciplinary research collaboration. In previous work we have argued that the inherent cognitive and epistemological (knowledge-theoretical) challenges of integration have been neglected by most authors providing models for interdisciplinary research [ 8 ]. In this paper, our focus is therefore on challenges of using and producing knowledge in interdisciplinary research collaborations that aim at solving complex real-world problems. Examples are collaborations between distinct medical specialists in the diagnosis and treatment of a specific patient (e.g., an oncologist and radiologist), but also collaborations between medical experts and biomedical engineers aimed at innovative medical technology for clinical uses. In this paper, we focus on interdisciplinary research projects, in which two or more academic fields are integrated to solve real-world problems, and not on transdisciplinary projects in which one or more academic fields are integrated with expertise from outside of academia such as policy-making or practice. Footnote 3

The challenge of interdisciplinary research collaborations aimed at solving a shared problem is that each expert is guided by his/her own disciplinary perspective. However, the results produced by experts from different disciplines, although internally coherent, are not mutually coherent, so that they are not easily integrated. Furthermore, approaches and results understood within a contributing disciplinary perspective are not easily understood by experts specialised in other disciplinary perspectives, even though each expert aims to contribute to the same problem.

In short, the way in which experts use and produce knowledge is guided by the disciplinary perspective typical of their own practice. But experts are often unaware of having a disciplinary perspective. We argue that this is an obstacle to participating in interdisciplinary research collaborations focused on using and producing knowledge for complex problem-solving. Moreover, disciplinary perspectives are often considered impenetrable, as they are acquired by doing, which makes dealing with the disciplinary perspective of other experts a difficult learning objective. In this paper, we argue that disciplinary perspectives can be made explicit in a systematic manner, and that their role in ‘how experts in a specific discipline use and produce knowledge’ can thus be made understandable for experts and students in both their own and other disciplines.

To this end, we have developed a framework, based on new insights in the philosophy of science and on practice experience of interdisciplinary research collaboration aimed at the development of a medical technology, which can be used by experts in a particular discipline to analyse different elements of their discipline and, together with collaborators, to analyse the same elements from other disciplines. We believe that this systematic approach to understanding disciplinary perspectives will facilitate interdisciplinary research collaborations between experts from different fields. It will create awareness of one’s own disciplinary perspective and the ability to understand the disciplinary perspective of other experts at a sufficient level. Our framework thus aims to alleviate the challenge of integration in a collaborative research project by providing a tool for analysing disciplinary perspectives. We suggest that the concrete descriptions of disciplinary perspectives that result from the application of the framework clarify the approaches of experts in a multi-disciplinary team. The framework thus enables effective communication through improved understanding of how each discipline contributes. Once researchers sufficiently understand each other’s discipline, they will be able to construct so-called conceptual models that integrate content relevant to the problems at hand. Footnote 4

Education in interdisciplinary research

In addition to professionals using our framework to facilitate collaboration in interdisciplinary research projects, we suggest that this framework can also be implemented in medical education. It can be used to teach students what it means to have a disciplinary perspective, and to explicate the role of disciplinary perspectives of disciplinary experts participating in an interdisciplinary research collaboration. We have implemented this framework in an innovative, challenge-based educational design that explicitly aims to support and promote the development of interdisciplinary research skills [ 22 ]. Research into the intended learning objectives has not yet been completed, but our initial findings indicate that the proposed framework effectively supports students in their ability to develop crucial skills for conducting interdisciplinary research projects. We suggest therefore that the framework can also be implemented in HPE as a scaffold for teaching and learning metacognitive skills needed in interdisciplinary research collaborations, for example between medical experts and engineers.

Research has shown that interprofessional education courses for healthcare students can have a positive effect on the knowledge, skills and attitudes required for interprofessional collaboration, but that organising such interventions is challenging [ 23 , 24 ]. In the HPE literature, it is generally assumed that the limitations of interprofessional and interdisciplinary teamwork are due to problems of communication, collaboration and cooperation [ 25 , 26 ], which are linked to barriers and enablers at institutional, organizational, infrastructural, professional and individual levels (e.g., [ 27 , 28 ]). Interprofessional and interdisciplinary collaborations are thus discussed extensively in the HPE literature; our focus here is on the challenges of interdisciplinary research collaboration.

The ability to use and produce knowledge and methods in solving (novel) problems is covered in the HPE literature by the notion of adaptive expertise, which encompasses clinical reasoning, integrating basic and clinical sciences, and the transfer of previously learned knowledge, concepts and methods to solve new problems in another context (e.g., [ 1 , 29 , 30 , 31 , 32 , 33 , 34 ]). In previous work, we introduced the concept of interdisciplinary expertise, which expands on the notion of adaptive expertise by including the ability to understand, analyse and communicate disciplinary perspectives [ 8 ]. In this paper, we address the challenge posed by how this ability to understand, analyse and communicate disciplinary perspectives can be learned. The framework that we propose can be implemented in HPE to function as a tool to scaffold metacognitive skills of health professions students, facilitating the development of interdisciplinary expertise.

Aims and contributions of this paper

Our first objective is to show that interdisciplinary collaboration in (medical) research faces not only institutional, but also cognitive and epistemological barriers. Therefore, we first provide a theoretical explanation of the concept of ‘disciplinary perspective’ as developed in the philosophy of science, in order to make it plausible that the cognitive barriers experienced by experts in interdisciplinary collaboration are the result of different disciplinary perspectives on a problem and its solution.

Our second objective is to provide a systematic approach to improve interdisciplinary research, for which we propose a framework, in the form of a series of questions, based on new insights from the philosophy of science into the epistemology of interdisciplinary research. We provide a detailed explanation of the application of the proposed framework in an interdisciplinary medical research project to illustrate its applicability in multidisciplinary research collaborations, by showing that the different disciplinary perspectives that inform researchers and technicians within a multidisciplinary research team can be made transparent in a systematic way.

In short, our intended contribution is (i) to explain cognitive and epistemological barriers by introducing the concept of disciplinary perspectives in medical research collaborations, (ii) to offer a framework that enables the mitigation of these barriers within interdisciplinary research projects that are caused by different disciplinary perspectives, and (iii) to illustrate the applicability of this framework by a concrete case of an interdisciplinary research collaboration in a medical-technical research setting.

We developed a framework for making disciplinary perspectives of experts participating in an interdisciplinary research collaboration explicit, by combining insights from the philosophy of science with practical experience from a medical research project. Philosophy of science provided the theoretical basis for our concept of disciplinary perspectives. Our detailed case-description stems from an interdisciplinary medical research project to develop and implement a new imaging tool for the diagnosis of kidney cancer, in which the first author participated. We then applied the framework to analyze and articulate the disciplinary perspectives of experts involved in this interdisciplinary medical research project.

The usefulness and applicability of the proposed framework were tested by the first author who, in her role as PI, was able to use it successfully in coordinating an interdisciplinary research project aimed at developing a biomedical technology for clinical practice [ 35 , 36 ]. Below, we illustrate how the framework was systematically applied to this specific case, providing initial evidence of its applicability. However, to test whether the proposed framework reduces the cognitive and epistemological barriers caused by different disciplinary perspectives, experts need to be trained in its use. We suggest that training in the use of this framework requires, among other things, some insight into the philosophical underpinnings of the concept of ‘disciplinary perspective’. Our explanation of the so-called epistemology of disciplinary perspectives in this paper aims to provide such insight.

Developing a framework for analysing and articulating a disciplinary perspective

The framework proposed here is based on insights about disciplinary perspectives in the philosophy of science. These insights concern an epistemology (a theory of knowledge) of scientific disciplines. In other words, the framework is based on an account of the knowledge-theoretical (epistemic) and pragmatic aspects that guide the production of knowledge and scientific understanding by a discipline [ 21 ].

The epistemology of scientific disciplines developed in our previous work is based on the philosophical work of Thomas Kuhn [ 37 ]. Building on his seminal ideas, we understand disciplinary perspectives as analysable in terms of a coherent set of epistemic and pragmatic aspects related to the way in which experts trained in the discipline (and who have thus, albeit implicitly, acquired the disciplinary perspective) apply and produce knowledge [ 38 ]. In our approach, the epistemic and pragmatic aspects that generally characterize a discipline are made explicit through a set of questions that form the basis of the proposed framework (see Table 1 and the first column of Table 2). The disciplinary perspective can thus be revealed through this framework. In turn, when used in educational settings, this framework can foster interdisciplinary expertise by acting as a scaffold for teaching and learning metacognitive skills for interdisciplinary research collaborations. Footnote 5

The general aspects indicated by italics in each question in Table 1 are interdependent, so that analysis using this framework results in a coherent description of the disciplinary perspective in terms of these aspects. The framework can be used by experts in an interdisciplinary research project not only to make explicit their disciplinary perspective in a general sense, but also to specify in a systematic way how these aspects relate to the interdisciplinary research problem from their disciplinary perspective (see Table 2, which contains both the general and problem-specific descriptions for each aspect per discipline). In our view, this approach is productive in overcoming the cognitive and epistemological barriers. It thus contributes to productive interdisciplinary collaboration.

Applying the framework in an interdisciplinary medical research project

To test the applicability of this framework, we applied it to an interdisciplinary medical research project. The project, which ran from early 2014 to mid-2018, aimed to develop a new clinical imaging tool, namely diffusion magnetic resonance imaging (diffusion MRI), to characterize the micro-structural makeup of kidney tumours. The first author was involved in this project as principal investigator (PI). As an interdisciplinary expert with a background in technical medicine, which combines medical training with technological expertise [ 41 ], she coordinated and integrated contributions from experts with medical and engineering backgrounds. In her role as PI, she applied the proposed framework to analyse and articulate the disciplinary perspectives of other experts involved in the medical research project.

The aim of the interdisciplinary medical research project was to develop a new imaging tool for the characterization of renal tumours, i.e., diffusion MRI. Diffusion MRI allows for visualization and quantification of water diffusion without administration of exogenous contrast materials and is, therefore, a promising technique for imaging kidney tumours. In earlier studies, several parameters derived from diffusion MRI studies were found to differentiate between different tumour types in the kidney [ 42 , 43 , 44 ]. Existing imaging methods in clinical practice can detect the size and location of kidney tumours, but the tumour type and malignancy can only be determined histologically after surgery. The purpose of the medical research project was to assess whether more advanced parameters that can be obtained from diffusion MRI [ 35 , 45 ] can differentiate between malignant and benign kidney tumours [ 36 ]. Being able to make this distinction could potentially prevent unnecessary surgery in patients with non-malignant tumours.

The interdisciplinary medical research project needed to bring together expertise (knowledge and skills) from different professionals, academic researchers as well as clinicians. Therefore, the research team consisted of a physicist, a biomedical engineer, a radiologist, a urologist and the principal investigator. The complex, interdisciplinary research object can be thought of as a system that encompasses several elements: the MRI machine, the software necessary to produce images, the patient with a (suspected) kidney tumour, and the wider practice of care in which the clinical tool should function. In developing the clinical tool, these elements must be considered interrelated, whereas usually each expert focuses on one of these elements.

The PI used the framework to coordinate and integrate the contributions from different experts in the following manner. Throughout the project, she had meetings with each of the team members, in which she asked them to explain their specific expertise with regard to the research object, as well as their expert contribution to the development of the imaging tool. Her approach in these meetings was guided by the general questions of the framework (Table 1). In this manner, she gained clear insight into the aspects of each discipline relevant to the research object, and also into the specific contribution that needed to be made by each expert (as illustrated in Table 2 below). The level of understanding gained by this approach enabled her, firstly, to facilitate interdisciplinary team meetings in which disciplinary interpretations and questions from the experts about the target system could be aligned and, secondly, to integrate their contributions towards the development of the new imaging tool [ 36 ].

In the presented approach, the framework was used exclusively by the PI, enabling her to acquire relevant information and understanding about the contributions of the disciplines involved. The other team members in the medical research project were not explicitly involved in applying the framework, nor in articulating their own disciplinary perspective or that of others. Hence, the resulting articulation of the disciplinary perspectives and of the contributions per discipline to the research object (in Table 2) was crafted by the PI. The level of understanding of the role of each discipline that the PI acquired in this way appears to have been sufficient for her coordinating task in this complex medical research project. Our suggestion for other research and educational practices, though, is that clinicians (as well as other medical experts) can develop this metacognitive skill by using the scaffold (in Table 1) in order to participate more effectively in these kinds of complex medical research projects.

In the results section we will first present our explanation and justification of the idea that disciplinary perspectives determine the specific approaches of experts (who have been trained in a specific discipline in using and producing knowledge) when faced with a complex problem. In this explanation and justification, we will use insights from the philosophy of science. Next, we will explain and illustrate the systematic use of the proposed framework (Table 1) by showing the results of applying it to the interdisciplinary medical research project.

The proposed framework for the explication of disciplinary perspectives is rooted in insights from the philosophy of science, in particular those of the philosophers Immanuel Kant (1724–1804) and Thomas Kuhn (1922–1996). Their important epistemological insight was that ‘objective’ knowledge of reality does not arise from some kind of imprint in the mind, such as on a photographic plate, but is partly formed by the concepts and theories that scientists hold. These concepts and theories therefore shape the way they perceive the world and produce knowledge about reality. This philosophical insight provides an important explanation for the cognitive and epistemological barriers between disciplines. After all, scientific experts learn these concepts and theories by being trained within a certain discipline. In this way, they develop a disciplinary perspective that determines their view and understanding of reality. Based on this philosophical insight, we can imagine how these barriers can be bridged, namely by experts developing the metacognitive ability to think about their own cognition and how their scientific view of reality is shaped by their specific disciplinary perspective. In order to facilitate this ability, we develop a framework that can be used as a metacognitive scaffold. Finally, we apply this framework to an example interdisciplinary medical-technical research project, to illustrate its use in practice.

Insights from the philosophy of science: disciplinary perspectives

Boon et al. (2019) refer to the notion of disciplinary perspectives and their indelible role in how experts approach problems (in particular, the ways in which experts use and produce knowledge with regard to the problem they aim to solve) and provide a philosophical account of this notion based on so-called constructivist (Kantian) epistemology (i.e., knowledge-theory [ 38 , 46 ]). On a Kantian view, ‘the world does not speak for itself,’ i.e., knowledge of (aspects of) the external world is not acquired passively on the basis of impressions in the mind (physically) caused by the external world (e.g., similar to how pictures of the world are physically imprinted on a photographic plate). Instead, the way in which people produce and use knowledge results from an interaction between the external world, the human senses and the human cognitive system. Crucially, neither our concepts nor our perceptions stem from passive impressions. Instead, ‘pre-given’ concepts ‘in the mind’ are needed in order to be able to perceive something at all and thus to produce knowledge about reality. Conversely, according to Kant, the imaginative (i.e., creative) capacity of the mind is then able to generate new concepts and to draw new connections, the adequacy and usability of which must be tested against our experiences of reality. When new concepts (invented by the creative capacity of the human mind) have been tested against experience, they allow us to see new things in the external world, which we would not see without those concepts. This theoretical insight by Kant is crucial to get past naïve conceptions of knowledge, in particular, by understanding the indelible role of concepts in generating knowledge from observations and experiences.

This philosophical insight already makes it clear, for instance, that ‘descriptions of facts’ in a research project involve discipline-specific concepts, making these descriptions not easy to understand for someone who is not trained in that discipline. After Kant, this role of concepts has been expanded to the role of perspectives. Kuhn [ 37 ] created awareness that the human mind plays, ‘unconsciously’ and ‘unintentionally,’ a much greater role in the way scientific knowledge is created than is usually assumed in the view that scientific knowledge is objective. Kuhn introduced the concept of scientific paradigm to indicate in what sense the mind contributes. His idea was revolutionary because the notion of true and objective knowledge, which is the aim of science, became deeply problematic, as knowledge is only true and objective within the scientific paradigm, whereas it may even be meaningless in another.

Our notion of disciplinary perspectives is in many respects comparable to Kuhn’s idea of scientific paradigm, and is certainly indebted to Kuhn’s invention, particularly, with regard to the idea that it is a more or less coherent, usually implicit ‘background picture’ or ‘conceptual framework,’ which constitutes an inherent part of the cognitive system of an expert, and which forms the basis from which an expert thinks, sees and investigates in a scientific or professional practice. Furthermore, the scientific paradigm is not ‘innate,’ nor individually acquired, but maintained and transferred in scientific or professional practices, usually by being immersed in it. The same can be said about disciplinary perspectives. Yet, there are also important differences.

First, Kuhn believed that the paradigm is so deeply rooted in the cognitive structure of individual scientists, and, moreover, is embedded in how the scientific community functions, that it takes a scientific revolution and a new generation of scientists to shift into another paradigm, which is called a paradigm-shift (sometimes explained as a Gestalt-switch). Kuhn’s belief suggests that humans lack the capacity to reflect on their own paradigm. Footnote 6 Conversely, we argue that humans can develop the metacognitive ability to perform this kind of reflection by which the structure and content of the paradigm or disciplinary perspective is made explicit. We take this as an important part of interdisciplinary expertise. Our suggestion, however, should not be confused with the idea that we can think without any paradigm or disciplinary perspective. We cannot, but we can explicate its workings (and adapt it), which is what we will illustrate in the case-description below.

Second, Kuhn’s focus was science, i.e., the production of objectively true scientific knowledge, in particular, theories. Instead, our focus is on experts trained in specific disciplines, who use and produce knowledge with regard to (practical) problems that have to be solved. Nonetheless, the Kuhnian insight explains why knowledge generated in distinct disciplines often cannot be combined in a straightforward manner (e.g., as in a jigsaw puzzle), which is due to the fact that knowledge is only fully meaningful and understandable relative to the disciplinary perspective in which it has been produced.

Our notion of disciplinary perspectives is similar to Kuhn’s idea of paradigm (which he later specified as disciplinary matrices) in the sense that a paradigm functions as a perspective or a conceptual framework, i.e., a background picture within which a scientific or professional practice of a specific discipline is embedded and which guides and enables this practice. But instead of considering them as replacing each other in a serial historical order as Kuhn did, we assume that disciplinary perspectives co-exist, that is, exist in parallel rather than in series. This view on disciplinary perspectives can be elaborated somewhat further by harking back to Ludwik Fleck [ 47 ], a microbiologist, who already in the 1930s developed a historical philosophy and sociology of science that is very similar to Kuhn’s (also see [ 48 ]) (Footnote 7). Similar to and deeply affected by Kant, Fleck draws a close connection between human knowledge (e.g., facts) and cognition. Hence, Fleck disputes that facts are descriptions of things in reality discovered through properly passive observation of aspects in reality, which is why, according to Fleck, facts are invented, not discovered. Similar to Kuhn, Fleck expands on Kant by also including the role of the community in which scientists and experts are trained. Instead of paradigms, however, Fleck uses the terms thought styles and thought collectives to describe how experts in a certain professional or academic community adopt similar ways of perceiving and thinking that differ between disciplines: “The expert [trained in the discipline] is already a specially moulded individual who can no longer escape the bonds of tradition and of the collective; otherwise he would not be an expert” ([ 47 ], p. 54).
But while Kuhn strove to explain radical changes in science, Fleck’s focus is on ‘normal science,’ that is, on communities (thought collectives, each having their own thought style) that co-exist and change gradually rather than radically, which is closer to our take on disciplines. Importantly, according to Fleck, the community guides which problems members of that community find relevant and how they approach these problems. Translated to our vocabulary, in scientific and professional practices, experts trained in different disciplines each have a different disciplinary perspective, by means of which they recognize different aspects and problems of the same so-called research object, which they approach in accordance with their own discipline.

We propose that disciplinary perspectives can be analysed and made explicit, which we consider a crucial metacognitive skill of interdisciplinary experts. Our proposal for the framework to analyse disciplinary perspectives (in Table 1) takes its cue from Kuhn’s notion of disciplinary matrices. Kuhn’s original notion presents a matrix by which historians and philosophers can analyse the paradigm in hindsight, specifying aspects such as the metaphysical background beliefs and basic concepts, core theories, epistemic values, and methods, which all play a role in how knowledge is generated (also see [ 8 , 50 ]). Our framework includes some of these aspects, but also adds others, thereby generating a scaffold that facilitates interdisciplinary collaborations in which knowledge is applied and produced for complex problem-solving in professional research practices that serve ‘real-world’ practices, such as medical research practice. Below, we will illustrate the application of this framework in a concrete case.

Interdisciplinary research project: diffusion MRI for the diagnosis of kidney tumour

We will illustrate the applicability of the proposed framework (Table 1 ) for the analysis of disciplinary perspectives using the example of a research project that aims to develop a new clinical imaging tool, namely, diffusion MRI to characterize the microstructure of renal tumours. In our analysis, we focus on experts from four different disciplines: (I) clinical practice, (II) medical biology, (III) MRI physics, and (IV) signal and image processing. As indicated in the methods section, the complex, interdisciplinary research object that these experts have to deal with concerns a system consisting of the MRI-machine, the software necessary to produce images, and the patient with a (suspected) renal tumour, including the broader care practice in which the clinical tool should function.

In the following paragraphs we will first present a general explanation of the four disciplines involved in the project, and next, illustrate how the proposed framework can be applied to analyse and articulate each disciplinary perspective as well as the specific contribution of each discipline to the research object (in Table  2 ). It is not our intention to provide comprehensive descriptions of the fields that are involved, but rather to provide insight into how the fields differ from each other across the elements of our framework. In addition, we do not believe that all (disciplinary) experts only adhere to one disciplinary perspective. For example, clinicians usually combine both a clinical and biomedical perspective to fit together a complete picture of a patient for clinical decision-making concerning diagnosis and treatment [ 51 , 52 , 53 ]. Moreover, MRI engineers will usually need to combine insights from MRI physics and signal processing.

I. Clinical practice concerning patients with renal tumours

Clinical practice concerns the patient with a renal tumour. This practice differs from the other disciplines in our example, because it is not primarily a scientific discipline. Nonetheless, to develop a diagnostic tool, the disciplinary perspective of clinicians specialized in patients with kidney tumours is crucial, for example, to determine the conditions that the technology needs to meet in order to be useful for their clinical practice. The knowledge-base of clinical experts is rooted in biomedical sciences, which means that clinical experts often understand their patient’s signs and symptoms from a biomedical perspective (i.e., in terms of tumour formation or healthy renal physiology). Yet, clinicians will usually focus on their patient’s clinical presentation and possible diagnostic and clinical pathways. In clinical practice, several kidney tumour types are distinguished, each with its own histological presentation (visible under the microscope), tumour growth rate and chance of metastases. Unfortunately, all kidney tumour types, including non-malignant types, appear the same on standard imaging modalities, namely, as solid lesions. When the tumour has not metastasized, treatment consists of surgical removal of the whole kidney or of the part of the kidney that contains the tumour (i.e., ‘radical’ or ‘partial’ nephrectomy). If surgery is not possible, other treatments include chemotherapy or radiation. After surgery, a pathologist examines the tumour tissue to determine the tumour type. Occasionally, the pathologist concludes that the removed tumour was non-malignant, a situation that might be prevented if diffusion MRI could be used to distinguish between malignant and non-malignant tumours prior to surgery.

II. Medical biology

In biology, the structure and working of the body are studied at several levels, from the interaction of proteins and other macromolecules within cells to the functioning of organs. In the case at hand, the organ of interest is the kidney. Functions of the kidneys are excretion of waste materials, control of blood pressure via hormone secretion, and regulation of body fluid volume, acid-base balance and salt balance by excretion or resorption of ions. Understanding these functions requires insights into the anatomy, tissue architecture and physiology of the kidneys. The functional unit of the kidney is the nephron, which has two main parts: (1) the renal corpuscle, consisting of a tuft of capillaries (the glomerulus) surrounded by a cup-shaped membrane (Bowman’s capsule), responsible for the first filtration of water and small ions, and (2) the renal tubule, which is responsible for more specific resorption and excretion of ions and water. The arrangement of small tubes that fan out from the centre towards the outside (or cortex) of the kidneys makes it possible to maintain variation in concentrations of ions, which helps to regulate resorption and excretion. The contribution of medical biology to the development of the diagnostic tool is important because knowledge about kidneys such as that just sketched provides an understanding of the properties (i.e., microstructural or physiological properties) by which different tumour types can be distinguished from each other, which is crucial to interpreting the novel diagnostic imaging technology.

III. MRI physics & diffusion MRI

Magnetic resonance imaging is based on the physics of magnetism and the interaction of tissue components with radiofrequency magnetic fields. The main component of the human body that clinical MRI machines are sensitive to is (the amount of) water molecules or, more specifically, hydrogen nuclei (protons). These protons can be thought of as rotating or spinning, producing (tiny) magnetic fields. When tissue is placed in a relatively strong magnetic field (usually 1.5 or 3 tesla, generated by a large coil that surrounds the body), the tiny magnetic fields of protons (in the water phase of the tissue) tend to align with the direction of the strong magnetic field. When a series of radiofrequency pulses is then applied, the protons are pushed out of balance and subsequently return to their original state, producing a changing magnetic flux that induces a voltage in the receiver coils of the MRI machine. The time protons take to return to their original state, the relaxation time, is influenced by the makeup of their environment and will, therefore, differ between tissues, resulting in image contrast between tissues. To be able to form images from the signal, magnetic field gradients are applied; these vary the field spatially, which makes it possible to differentiate between signals from different locations. Computer software then uses mathematical formulas to ‘translate’ the signal into a series of images.
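
As a rough illustration of this ‘translation’ step, the sketch below assumes the simplest textbook case, in which the acquired signal fills a fully sampled rectangular grid of spatial frequencies (often called k-space), so that the image can be recovered with an inverse two-dimensional Fourier transform. The phantom, the array size and the function names are invented for illustration and do not describe the reconstruction software used in the project.

```python
import numpy as np


def simulate_kspace(image):
    """Forward model (assumed): the sampled signal is the 2D Fourier transform of the object."""
    return np.fft.fftshift(np.fft.fft2(image))


def reconstruct(kspace):
    """Recover the magnitude image with the inverse 2D Fourier transform."""
    return np.abs(np.fft.ifft2(np.fft.ifftshift(kspace)))


# Hypothetical 64 x 64 phantom: a bright square on a dark background
phantom = np.zeros((64, 64))
phantom[24:40, 24:40] = 1.0

image = reconstruct(simulate_kspace(phantom))
max_error = float(np.max(np.abs(image - phantom)))  # deviation from the original phantom
```

Clinical reconstructions involve many further steps (coil combination, filtering, correction of distortions), which is part of what the signal and image processing discipline contributes.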

Diffusion MRI is a subfield of MR imaging that is based on a contrast between the ‘diffusion rates’ of water molecules in different tissues. Diffusion is the random (‘Brownian’) motion of water molecules in tissue. This motion is restricted by tissue components such as membranes and macromolecules, and therefore water molecules move (or ‘diffuse’) at different rates in different tissues, depending on the microstructure of the tissue. To measure this, additional magnetic field gradients are applied, which results in a signal attenuation that increases with the diffusion rate, as water molecules move away from their original location.
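
In its simplest and most widely used form, the relation between signal attenuation and diffusion rate is written as a mono-exponential decay. The symbols below are standard in the diffusion MRI literature but are introduced here as assumptions, since they do not appear in the text above: $S_0$ is the signal without diffusion weighting, $b$ (in s/mm²) summarises the strength and timing of the diffusion gradients, and $D$ (in mm²/s) is the apparent diffusion coefficient.

```latex
S(b) = S_0 \, e^{-b D}
\qquad\Longleftrightarrow\qquad
\ln\frac{S(b)}{S_0} = -\,b D
```

Under this model the signal falls exponentially as the diffusion rate increases, and it is the logarithm of the attenuation that scales linearly with $D$.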

The method for acquiring diffusion-weighted images with an MRI machine (i.e., the ‘acquisition sequence’ of applying radiofrequency pulses and switching gradients on and off) is designed to gain sensitivity to the water molecules diffusing from their original location. The measured diffusion coefficient is considered to be related to microstructural properties of the tissue, namely the density of tissue structures such as macromolecules and membranes that restrict water diffusion. Together with other diffusion parameters that can be obtained by fitting the signal to other functions or ‘models’, the diffusion coefficient can be used to characterise and distinguish between different (tumour) tissue types, which is the aim of this new imaging tool.
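
To make the idea of ‘fitting the signal to a model’ concrete, the sketch below assumes the common mono-exponential model S(b) = S0 · exp(−b · D), where b is the diffusion weighting and D the diffusion coefficient, and estimates both parameters with a straight-line fit to the logarithm of the signal. The b-values, the signal values and the function name are hypothetical and noise-free; this is an illustration, not the fitting procedure used in the project.

```python
import numpy as np


def fit_monoexponential(b_values, signal):
    """Estimate S0 and D in S(b) = S0 * exp(-b * D) from a straight-line fit to ln(S)."""
    slope, intercept = np.polyfit(b_values, np.log(signal), 1)
    return float(np.exp(intercept)), float(-slope)


# Hypothetical acquisition: b-values in s/mm^2 and a noise-free synthetic signal
b_values = np.array([0.0, 100.0, 300.0, 500.0, 700.0])
true_s0, true_d = 1000.0, 2.0e-3  # assumed values; D in mm^2/s
signal = true_s0 * np.exp(-b_values * true_d)

s0_hat, d_hat = fit_monoexponential(b_values, signal)
```

Models with more than one exponential term, such as the bi- and tri-exponential models compared in the renal diffusion literature, cannot be linearised in this way and would need a non-linear least-squares routine instead.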

IV. Signal and image processing

The signal acquired by MRI machines undergoes many processing steps before it appears as images on the screen. Some of these steps are performed automatically by the MRI system, others require standardized operations in the software package supplied by the manufacturer, and yet other, more advanced, manipulations are performed in custom-made programs or software packages developed for specific research purposes. In the field of diffusion MRI, software packages that perform the most common fitting procedures are available, but custom-made algorithms are often required. The reason for this is that diffusion MRI was originally developed for brain imaging, while investigation of its feasibility in other organs started more recently and makes up only a small part of the field. New applications generate new challenges. For example, unlike the brain, kidneys (and other abdominal organs) move up and down as a consequence of breathing. Therefore, diffusion MRI of the kidneys requires specific algorithms that manipulate the scan to correct for this respiratory motion. Furthermore, as tissue structure and physiology in the kidneys differ from those in the brain, existing models need to be adjusted to the kidney.
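
A deliberately simplified sketch of what such a motion correction involves is given below. It assumes that breathing only displaces the kidney by a whole number of pixels along one direction and that the displacement can be found as the peak of a cross-correlation between a reference profile and a displaced one. Real pipelines typically register two- or three-dimensional images and often allow for deformation; the profile, the displacement and the function name are invented for illustration.

```python
import numpy as np


def estimate_shift(reference, moving):
    """Return the integer shift (in pixels) that, applied to `moving` with np.roll,
    best aligns it with `reference` (peak of the circular cross-correlation)."""
    corr = np.fft.ifft(np.fft.fft(reference) * np.conj(np.fft.fft(moving))).real
    shift = int(np.argmax(corr))
    n = len(reference)
    # Map shifts in the upper half of the range to negative displacements
    return shift - n if shift > n // 2 else shift


# Hypothetical 1D intensity profile along the head-foot direction
x = np.arange(128)
reference = np.exp(-0.5 * ((x - 60) / 6.0) ** 2)  # kidney-like bump
moving = np.roll(reference, 5)                    # same profile displaced by 5 pixels

correction = estimate_shift(reference, moving)
aligned = np.roll(moving, correction)             # displaced profile moved back into place
```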

In this paper, we have argued that interdisciplinary collaboration is difficult because of the role of experts’ disciplinary perspectives, which shape their view of and approach to a problem and create cognitive and epistemological barriers when collaborating with other disciplines. To overcome these barriers, disciplinary experts involved in interdisciplinary research projects need to be able to explicate their own disciplinary perspective. This ability is part of what is known as interdisciplinary expertise [ 8 ]. We maintain that interdisciplinary expertise begins with creating awareness of the role of disciplinary perspectives in how experts view a problem, interpret it, formulate questions and develop solutions.

Analytical frameworks previously developed by other authors to guide interdisciplinary research typically focus on the process of interdisciplinary collaboration [ 9 , 10 , 11 , 12 , 13 , 14 , 15 ]. The approach we propose here contributes to this literature by addressing the deeper cognitive and epistemological challenges of interdisciplinary research collaboration that stem from the role of the disciplinary perspective as an inherent part of one’s expertise [ 5 , 16 ]. Several authors have already used the concept of ‘disciplinary perspectives’ to point out the challenges of interdisciplinary research (e.g., [ 9 , 15 ]). Our contribution to this literature is twofold: first, the idea, based on philosophical insights into the epistemology of interdisciplinary research, that disciplinary perspectives can be made explicit; and second, an analytical framework with which disciplinary perspectives within an interdisciplinary research context can be systematically described (as in Table 1) with the aim of facilitating interdisciplinary communication within such research projects.

Our further contribution is that we have applied this framework to a concrete case, thereby demonstrating that disciplinary perspectives within a concrete interdisciplinary research project can actually be analysed and explicated in terms of the coherent set of elements that make up the proposed framework. The result of this analysis (in Table 2) shows a coherent description of the discipline in question per column, with an explanation per aspect of what this aspect means for the interdisciplinary research project. It can also be seen that the horizontal comparison (in Table 2) results in very different descriptions per aspect for each discipline. We believe that this example demonstrates that it is possible to explain the nature of a specific discipline in a way that is accessible to experts from other disciplines. We do not claim, however, that this table is an exhaustive description of the four disciplines involved. Instead, our aim is to show that the approach outlined in this table can reduce cognitive and epistemological barriers in interdisciplinary research by enabling communication about the content and nature of the disciplines involved.

We suggest that educators can explore how the framework and its philosophical underpinning can be implemented in HPE to support the development of students’ interdisciplinary expertise. Much has been written, especially in the engineering education literature, about the importance of interdisciplinarity and how to teach it. A recent systematic review article shows that the focus of education aimed at interdisciplinarity is on so-called soft skills such as communication and teamwork. Project-based learning is often used to teach the necessary skills, but without specific support to promote these skills [ 7 ]. In our literature review on education for interdisciplinarity [ 54 , 55 , 56 , 57 , 58 , 59 , 60 , 61 , 62 , 63 , 64 , 65 , 66 , 67 , 68 , 69 , 70 , 71 , 72 , 73 , 74 , 75 , 76 , 77 ], we did not find any authors who specifically address the cognitive and epistemological barriers to interdisciplinary collaboration as described in our article. One possible reason for this is that current epistemological views on the application of science in real-world problem-solving contexts, such as the research project presented here, do not recognise the inherent cognitive and epistemological barriers philosophically explained in this article [ 78 ]. The novelty of our approach is therefore our emphasis on the epistemological and cognitive barriers between disciplines that result from the ineradicable role of disciplinary perspectives in the discipline-bound way in which researchers frame and interpret the common problem. This makes interdisciplinary communication and integration particularly difficult. Specific scaffolds are needed to overcome these barriers. The framework proposed here, which systematically makes the disciplinary perspective explicit, aims to be such a scaffold. We therefore argue that much more attention should be paid to this specific challenge of interdisciplinary collaboration in academic HPE.
This requires both an in-depth philosophical explanation that offers a new view of scientific knowledge that makes clear why interdisciplinary research is difficult, and learning how to make disciplinary perspectives explicit, for which the proposed framework provides a metacognitive scaffold.

We have implemented this framework in a newly designed minor programme that uses challenge-based learning and aims to develop interdisciplinary research skills. In this minor, small groups of students from different disciplines work on the (interdisciplinary) analysis and solution of a complex real-world problem. A number of other scaffolds focused on the overarching learning objective have been included in the educational design, which means that the framework proposed here cannot be tested in isolation. Although our research into whether this new educational design achieves the intended learning goal is not yet complete, our initial experience of using the framework is positive. Students, guided by the teacher, are able to use the framework in their interdisciplinary communication, first in a general sense to get to know each other’s disciplines and then within their research project. This suggests that the framework is useful in education aimed at learning to conduct interdisciplinary research.

This example, in which the framework has been implemented in education aimed at developing interdisciplinary research skills, also suggests that, although the framework was developed in the context of a medical-technical research project, it is general enough to be used in other kinds of interdisciplinary research.

A critical comment should be made regarding our preliminary evidence of the framework’s usefulness. The first author, who was PI of the interdisciplinary medical research project in which she applied this framework in her role as coordinator, was also involved in the development of the framework [ 35 , 36 ]. She therefore has detailed insight into the theoretical underpinnings of the framework in relation to its intended application. The lack of such a theoretical background may make it more difficult to apply the framework in interdisciplinary research (Footnote 8), which is why we have provided an extensive elaboration of these underpinnings in this paper.

Further research should address the question of whether this scaffold can facilitate interdisciplinary collaboration between disciplinary experts.

Further research is also needed to systematically analyse the value of this framework in HPE. This starts with the question of in what type of educational design it can be successfully implemented. Other important questions are: Can interdisciplinary expertise be acquired without knowledge of the other discipline (e.g., biomedical engineering)? In other words, how much education in other disciplines should HPE provide to prepare experts to participate in specific interdisciplinary collaborations?

Furthermore, we emphasize that in addition to learning to use this framework as a metacognitive scaffold to gain a deeper understanding of the epistemological and cognitive barriers, students also need to develop other skills necessary for interdisciplinary research collaboration and working in interdisciplinary teams. The frameworks discussed in our introduction that analyse and guide the interdisciplinary research process provide insights into these skills (e.g. [ 9 , 10 , 11 , 12 ] and [ 54 , 55 , 56 , 57 , 58 , 59 , 60 , 61 , 62 , 63 , 64 , 65 , 66 , 67 , 68 , 69 , 70 , 71 , 72 , 73 , 74 , 75 , 76 , 77 ]).

We suggest that the article as a whole can be used in such educational settings to achieve several goals, provided that students are guided and coached by educators. First, it can foster students’ understanding of the epistemological challenges of interdisciplinary collaboration and help them recognize that these challenges are usually underestimated and not addressed in most approaches. Second, by outlining the philosophical underpinnings of these challenges, it can make students aware that they have a disciplinary perspective and of how it guides their work. Finally, it provides a framework that can be used to analyse these disciplinary perspectives, together with an example of its use in the case description. When successful, this approach encourages students to develop transferable skills that can be used in research projects beyond the initial educational project.

Conclusions

Interdisciplinary research collaborations can be facilitated by a better understanding of how an expert’s disciplinary perspective enables and guides their specific approach to a problem. Implicit disciplinary perspectives can and should be made explicit in a systematic manner, for which we propose a framework that can be used by disciplinary experts participating in interdisciplinary research projects. With this framework, and its philosophical underpinning, we contribute to a fundamental aspect of interdisciplinary collaborations.

Availability of data and materials

All data generated or analysed during this study are included in this published article.

1. In this article, we use ‘disciplines,’ ‘fields’ and ‘specialisms’ interchangeably.

2. Bridle (2013), Klein (1990), Newell (2007) and Szostak (2002) provide activities that are important for interdisciplinary collaborations, such as communication, negotiation and evaluating assumptions. In order to be able to perform such activities, students need to develop the appropriate skills [ 9 , 17 , 18 , 19 ].

3. Roux et al. (2017) provide a clear characterization of transdisciplinary research: “A key aim of transdisciplinary research is for actors from science, policy and practice to co-evolve their understanding of a social–ecological issue, reconcile their diverse perspectives and co-produce appropriate knowledge to serve a common purpose.” ([ 20 ], p. 1).

4. Boon (2020, 2023) explains the notion of conceptual modelling in application oriented research [ 21 , 22 ].

5. i.e., a framework that enables us to think analytically and systematically about our cognitive processes when we use and produce knowledge [ 39 , 40 ].

6. Yet, we recognize that this belief was plausible in Kuhn’s era, where the idea that humans (including scientists) are inevitably and indelibly guided by paradigms and perspectives was revolutionary and devastating with regard to the rational view of man. But nowadays we have become familiar with this idea, which offers an opening for the metacognitive abilities that we suggest.

7. To scholars in HPE, we recommend the entry on Ludwik Fleck in the Stanford Encyclopedia of Philosophy [ 49 ].

8. The point made here touches on a more fundamental issue that is beyond the scope of this article. Namely, that resistance of students, but also of teachers, to the described approach may have to do with more traditional epistemological beliefs about science that do not fit well with the way scientific research works in practice [ 78 , 79 ]. The philosophical underpinnings of the proposed framework explained in this article suggest alternative epistemological beliefs that are more appropriate for interdisciplinary research aimed at (complex) ‘real-world’ problems.

Abbreviations

HPE: Health professions education

MRI: Magnetic resonance imaging

PI: Principal investigator

Mylopoulos M, Regehr G. Cognitive metaphors of expertise and knowledge: prospects and limitations for medical education. Med Educ. 2007. https://doi.org/10.1111/j.1365-2923.2007.02912.x .

Mylopoulos M, Kulasegaram K, Woods NN. Developing the experts we need: fostering adaptive expertise through education. J Eval Clin Pract. 2018. https://doi.org/10.1111/jep.12905 .

World Health Organization (WHO). Medical devices: managing the Mismatch. An outcome of the Priority Medical devices project. WHO; 2010. https://www.who.int/publications/i/item/9789241564045 .

Gilbert JH, Yan J, Hoffman SJ. A WHO report: framework for action on interprofessional education and collaborative practice. J Allied Health. 2010;39(Suppl 1):196–7.

MacLeod M. What makes interdisciplinarity difficult? Some consequences of domain specificity in interdisciplinary practice. Synthese. 2016. https://doi.org/10.1007/s11229-016-1236-4 .

Hudson JN, Croker A. Educating for collaborative practice: an interpretation of current achievements and thoughts for future directions. Med Educ. 2018. https://doi.org/10.1111/medu.13455 .

Van den Beemt A, MacLeod M, van der Veen JT, van de Ven A, van Baalen S, Klaassen RG, Boon M. Interdisciplinary engineering education: a review of vision, teaching and support. J Eng Educ. 2020;109(1). https://doi.org/10.1002/jee.20347 .

Boon M, Van Baalen SJ, Groenier M. Interdisciplinary expertise in medical practice: challenges of using and producing knowledge in complex problem-solving. Med Teach. 2019. https://doi.org/10.1080/0142159X.2018.1544417 .

Klein J. Interdisciplinarity: history, theory and practice. Detroit, MI: Wayne State University; 1990.

Repko A, Navakas F, Fiscella J. Integrating interdisciplinarity: how the theories of common ground and cognitive interdisciplinarity are informing the debate on interdisciplinary integration. Issues Interdisciplinary Stud. 2007;25:1–31.

Menken S, Keestra M, Rutting L, Post G, de Roo M, Blad S, de Greef L. An introduction to interdisciplinary research: theory and practice. Amsterdam: Amsterdam University; 2016.

Repko AF, Szostak R. Interdisciplinary research. Process and theory. 3rd ed. Los Angeles: Sage; 2017.

Hasan MN, Koksal C, Montel L, Le Gouais A, Barnfield A, Bates G, Kwon HR. Developing shared understanding through online interdisciplinary collaboration: reflections from a research project on better integration of health outcomes in future urban. Futures. 2023. https://doi.org/10.1016/j.futures.2023.103176 .

Stokols D, Olson JS, Salazar M, Olson GM. Strengthening the ecosystem for effective team science: a case study from University of California, Irvine, USA. 2019. https://i2insights.org/2019/02/19/team-science-ecosystem/ . Accessed 2 Feb 2024.

Brister E. Disciplinary capture and epistemological obstacles to interdisciplinary research: lessons from Central African conservation disputes. Stud History Philos Sci part C: Stud History Philos Biol Biomedical Sci. 2016. https://doi.org/10.1016/j.shpsc.2015.11.001 .

Boon M, Orozco M, Sivakumar K. Epistemological and educational issues in teaching practice-oriented scientific research: roles for philosophers of science. Eur J Philos Sci. 2022;12(1):16. https://doi.org/10.1007/s13194-022-00447-z .

Bridle H, Vrieling A, Cardillo M, Araya Y, Hinojosa L. Preparing for an interdisciplinary future: a perspective from early-career researchers. Futures. 2013. https://doi.org/10.1016/j.futures.2013.09.003 .

Newell WH. Decision-making in Interdisciplinary studies. In: Morcol G, editor. Handbook of decision making. New York: CRC Press/Taylor & Francis Group; 2007. p. 245–65.

Szostak R. How to do interdisciplinarity. Integrating the debate. Issues Integr Stud. 2002;20:103–22.

Roux DJ, Nel JL, Cundill G, O’farrell P, Fabricius C. Transdisciplinary research for systemic change: who to learn with, what to learn about and how to learn. Sustain Sci. 2017. https://doi.org/10.1007/s11625-017-0446-0 .

Boon M. The role of disciplinary perspectives in an epistemology of models. Eur J Philos Sci. 2020. https://doi.org/10.1007/s13194-020-00295-9 .

Boon M. Conceptual modelling as an overarching research skill in engineering education. SEFI2023. 2023. https://doi.org/10.21427/ZDX4-VV41 . Accessed through https://arrow.tudublin.ie/cgi/viewcontent.cgi?article=1074&context=sefi2023_prapap .

Guraya SY, Barr H. The effectiveness of interprofessional education in healthcare: a systematic review and meta-analysis. Kaohsiung J Med Sci. 2018. https://doi.org/10.1016/j.kjms.2017.12.009 .

Darlow B, Brown M, McKinlay E, Gray L, Purdie G, Pullen S. Longitudinal impact of preregistration interprofessional education on the attitudes and skills of health professionals during their early careers: a non-randomised trial with 4-year outcomes. BMJ Open. 2022;12(7):e060066. https://doi.org/10.1136/bmjopen-2021-060066 .

Clark G. Institutionalizing interdisciplinary health professions programs in higher education: the implications of one story and two laws. J Interprof Care. 2004. https://doi.org/10.1080/13561820410001731296 .

O’Keefe M, Henderson A, Chick R. Defining a set of common interprofessional learning competencies for health profession students. Med Teach. 2017. https://doi.org/10.1080/0142159X.2017.1300246 .

Choi BC, Pak AW. Multidisciplinarity, interdisciplinarity, and transdisciplinarity in health research, services, education and policy: 2. Promotors, barriers, and strategies of enhancement. Clin Invest Med. 2007. https://doi.org/10.25011/cim.v30i6.2950 .

Lawlis TR, Anson J, Greenfield D. Barriers and enablers that influence sustainable interprofessional education: a literature review. J Interprof Care. 2014. https://doi.org/10.3109/13561820.2014.895977 .

Schwartz DL, Bransford JD, Sears D. Efficiency and innovation in transfer. In: Mestre JP, editor. Transfer of learning from a modern multidisciplinary perspective. Charlotte, NC: Information Age Publishing. 2005;3:1–51.

Mylopoulos M, Regehr G. Putting the expert together again. Med Educ. 2011. https://doi.org/10.1111/j.1365-2923.2011.04032.x .

Carbonell KB, Stalmeijer RE, Könings KD, Segers M, van Merriënboer JJ. How experts deal with novel situations: a review of adaptive expertise. Educ Res Rev. 2014. https://doi.org/10.1016/j.edurev.2014.03.001 .

Kulasegaram K, Min C, Howey E, Neville A, Woods N, Dore K, et al. The mediating effect of context variation in mixed practice for transfer of basic science. Adv Health Sci Educ. 2015. https://doi.org/10.1007/s10459-014-9574-9 .

Castillo JM, Park YS, Harris I, Cheung JJH, Sood L, Clark MD, et al. A critical narrative review of transfer of basic science knowledge in health professions education. Med Educ. 2018. https://doi.org/10.1111/medu.13519 .

Dyre L, Tolsgaard MG. The gap in transfer research. Med Educ. 2018. https://doi.org/10.1111/medu.13591 .

Van Baalen S, Leemans A, Dik P, Lilien MR, Ten Haken B, Froeling M. Intravoxel incoherent motion modeling in the kidneys: comparison of mono-, bi-, and triexponential fit. J Magn Reson Imaging. 2017. https://doi.org/10.1002/jmri.25519 .

Van Baalen S, Froeling M, Asselman M, Klazen C, Jeltes C, Van Dijk L, et al. Mono, bi-and tri-exponential diffusion MRI modelling for renal solid masses and comparison with histopathological findings. Cancer Imaging. 2018. https://doi.org/10.1186/s40644-018-0178-0 .

Kuhn TS. The Structure of Scientific Revolutions. 2nd ed. Chicago: The University of Chicago Press; 1970.

Boon M, Van Baalen S. Epistemology for interdisciplinary research–shifting philosophical paradigms of science. Eur J Philos Sci. 2019. https://doi.org/10.1007/s13194-018-0242-4 .

Flavell JH. Metacognition and cognitive monitoring: a new area of cognitive–developmental inquiry. Am Psychol. 1979. https://doi.org/10.1037/0003-066X.34.10.906 .

Pintrich P P.R. The role of metacognitive knowledge in learning, teaching, and assessing. Theory into Pract. 2002. https://doi.org/10.1207/s15430421tip4104_3 .

Groenier M, Pieters JM, Miedema HAT. Technical medicine: designing medical technological solutions for improved health care. Med Sci Educ 2017, https://doi.org/10.1007/s40670-017-0443-z

Chandarana H, Kang SK, Wong S, Rusinek H, Zhang JL, Arizono S et al. Diffusion-Weighted Intravoxel Incoherent Motion Imaging of Renal Tumors with Histopathologic Correlation. Invest Radiol 2012. https://doi.org/10.1097/RLI.0b013e31826a0a49 .

Feng Q, Ma Z, Zhang S, Wu J. Usefulness of diffusion tensor imaging for the differentiation between low-fat angiomyolipoma and clear cell carcinoma of the kidney. SpringerPlus. 2016. https://doi.org/10.1186/s40064-015-1628-x .

Rheinheimer S, Stieltjes B, Schneider F, Simon D, Pahernik S, Kauczor HU, et al. Investigation of renal lesions by diffusion-weighted magnetic resonance imaging applying intravoxel incoherent motion-derived parameters–initial experience. Eur J Radiol. 2012. https://doi.org/10.1016/j.ejrad.2011.10.016 .

Van der Bel R, Gurney-Champion OJ, Froeling M, Stroues ESG, Nederveen AJ, Krediet CTP. A tri-exponential model for intravoxel incoherent motion analysis of the human kidney: in silico and during pharmacological renal perfusion modulation. Eur J Radiol. 2017. https://doi.org/10.1016/j.ejrad.2017.03.008 .

Boon M: Philosophy of science in practice: a proposal for epistemological constructivism. 2015; Helsinki (Finland). Edited by Leitgeb H, Niiniluoto I, Seppälä P, Sober E. Helsinki (Finland): College publications. 2017a:289–310. 2017a.

Fleck L. Genesis and development of a scientific fact. Chicago: University of Chicago Press; 1935/1979.

Mößner N. Thought styles and paradigms—a comparative study of Ludwik Fleck and Thomas S. Kuhn. Stud Hist Philos Sci Part A. 2011. https://doi.org/10.1016/j.shpsa.2010.12.002 .

Sady W. Ludwik Fleck. In: the stanford encyclopedia of philosophy. Zalta EN, editor. 2017. https://plato.stanford.edu/archives/fall2017/entries/fleck/ . Accessed 30 Jul 2020.

Boon M. An engineering paradigm in the biomedical sciences: knowledge as epistemic tool. Prog Biophys Mol Biol. 2017b. doi:j.pbiomolbio.2017.04.001.

Van Baalen S, Boon M. An epistemological shift: from evidence-based medicine to epistemological responsibility. J Eval Clin Pract. 2015. https://doi.org/10.1111/jep.12282 .

Woods NN, Brooks LR, Norman GR. The role of biomedical knowledge in diagnosis of difficult clinical cases. Adv Health Sci Educ. 2007;12:417–26.

Schmidt HG, Rikers RMJP. How expertise develops in medicine: knowledge encapsulation and illness script formation. Med Educ. 2007. https://doi.org/10.1111/j.1365-2923.2007.02915.x .

Newell WH. A theory of interdisciplinary studies. Issues Integr Stud. 2001;19:1–25.

Ivanitskaya L, Clark D, Montgomery G, Primeau R. Interdisciplinary learning: process and outcomes. Innov High Educ. 2002. https://doi.org/10.1023/A:1021105309984 .

Nikitina S. Three strategies for interdisciplinary teaching: contextualizing, conceptualizing, and problem-centring. J Curric stud. 2006. https://doi.org/10.1080/00220270500422632 .

Aram JD. Concepts of interdisciplinarity: configurations of knowledge and action. Hum Relat. 2004. https://doi.org/10.1177/0018726704043893 .

Aboelela SW, Larson E, Bakken S, Carrasquillo O, Formicola A, Glied SA, et al. Defining interdisciplinary research: conclusions from a critical review of the literature. Health Serv Res. 2007. https://doi.org/10.1111/j.1475-6773.2006.00621.x .

Mansilla VB, Duraisingh ED, Wolfe CR, Haynes C. Targeted assessment rubric: an empirically grounded rubric for interdisciplinary writing. J High Educ. 2009;80(3):334–53.

Spelt EJ, Biemans HJ, Tobi H, Luning PA, Mulder M. Teaching and learning in interdisciplinary higher education: a systematic review. Educ Psychol Rev. 2009. https://doi.org/10.1007/s10648-009-9113-z .

Klein JA. A Taxonomy of interdisciplinarity. In: Frodeman R, editor. In the oxford handbook of interdisciplinarity. Oxford: Oxford University press; 2010. p. 15–30.

Terpstra JL, Best A, Abrams DB, Moor G. Health sciences and health services. In: Frodeman R, editor. The Oxford Handbook of Interdisciplinarity. Oxford: Oxford University Press; 2010.

DeZure D. Interdisciplinary pedagogies in higher education. In: Frodeman R, editor. In the oxford handbook of interdisciplinarity. Oxford: Oxford University press; 2010. p. 372–87.

Frenk J, Chen L, Bhutta ZA, Cohen J, Crisp N, Evans T. Health professionals for a new century: transforming education to strengthen health systems in an interdependent world. Lancet. 2010. https://doi.org/10.1016/S0140-6736(10)61854-5 .

Haynes C, Brown-Leonard J. From surprise parties to mapmaking: undergraduate journeys toward interdisciplinary understanding. J High Educ. 2010. https://doi.org/10.1080/00221546.2010.11779070 .

Hirsch-Hadorn G, Pohl C, Bammer G. Solving problems through transdisciplinary research. In: Frodeman R, editor. In the oxford handbook of interdisciplinarity. Oxford: Oxford University press; 2010. p. 431–52.

Szostak R. The interdisciplinary research process. In: Repko AF, Newell WH, Szostak R, editors. In Interdisciplinary research: case studies of integrative understandings of complex problems. Thousand Oaks, CA: Sage; 2011. p. 3–19.

McNair LD, Newswander C, Boden D, Borrego M. Student and faculty interdisciplinary identities in self-managed teams. J Eng Educ. 2011. https://doi.org/10.1002/j.2168-9830.2011.tb00018.x .

Liu SY, Lin CS, Tsai CC. College Students’ scientific epistemological views and thinking patterns in Socioscientific decision making. Sci Educ. 2011. https://doi.org/10.1002/sce.20422 .

Abu-Rish E, Kim S, Choe L, Varpio L, Malik E, White AA, et al. Current trends in interprofessional education of health sciences students: a literature review. J Interprof Care. 2012. https://doi.org/10.3109/13561820.2012.715604 .

Bammer G. Disciplining interdisciplinarity - integration and implementation sciences for researching Complex real-world problems. Canberra: Australian National University E-Press; 2013.

Holbrook JB. What is interdisciplinary communication? Reflections on the very idea of disciplinary integration. Synthese. 2013. https://doi.org/10.1007/s11229-012-0179-7 .

Andersen H. The second essential tension: on tradition and innovation in interdisciplinary research. Topoi. 2013. https://doi.org/10.1007/s11245-012-9133-z .

Andersen H. Collaboration, interdisciplinarity, and the epistemology of contemporary science. Stud Hist Philos Sci Part A. 2016. https://doi.org/10.1016/j.shpsa.2015.10.006 .

Lattuca LR, Knight DB, Bergom IM. Developing a measure of interdisciplinary competence for engineers. Paper presented at the American Society for Engineering Education 2012 Annual Conference & Exposition, San Antonio, Texas, USA; 2013.

Acquavita SP, Lewis MA, Aparicio E, Pecukonis E. Student perspectives on interprofessional education and experiences. J Allied Health. 2014;43(2):e31–6.

Pharo E, Davison A, McGregor H, Warr K, Brown P. Using communities of practice to enhance interdisciplinary teaching: lessons from four Australian institutions. High Educ Res Dev. 2014. https://doi.org/10.1080/07294360.2013.832168 .

Boon M. How philosophical beliefs about science affect science education in academic engineering programs: the context of construction. Eng Stud. 2022. https://doi.org/10.1080/19378629.2022.2125398 .

Bromme R, Pieschl S, Stahl E. Epistemological beliefs are standards for adaptive learning: a functional theory about epistemological beliefs and metacognition. Metacognition Learn. 2010. https://doi.org/10.1007/s11409-009-9053-5 .

Download references

Acknowledgements

We are very grateful to three anonymous reviewers who have provided valuable feedback and suggestions that have helped us improve the paper.

This work is financed by an Aspasia grant (409.40216) of the Dutch National Science Foundation (NWO) for the project Philosophy of Science for the Engineering Sciences, and by the work package Interdisciplinary Engineering Education at the 4TU-CEE (Centre Engineering Education, https://www.4tu.nl/cee/en/) in The Netherlands.

Author information

Authors and affiliations.

Department of Philosophy, University of Twente, Enschede, The Netherlands

Sophie van Baalen & Mieke Boon

Rathenau Instituut, Den Haag, The Netherlands

Sophie van Baalen

Contributions

SvB and MB have co-authored the manuscript and have contributed equally to the article.

Authors' information

Mieke Boon (PhD) graduated in chemical engineering (cum laude) and is a full professor in philosophy of science in practice. Her research aims at a philosophy of science for the engineering sciences, addressing topics such as methodology, technological instruments, scientific modeling, paradigms of science, interdisciplinarity, and science teaching. Sophie van Baalen (PhD) graduated in technical medicine and in philosophy of science, technology and society, both cum laude. She recently finished her PhD project, in which she aimed to understand epistemological aspects of technical medicine from a philosophy of science perspective, such as evidence-based medicine, expertise, interdisciplinarity and technological instruments.

Corresponding author

Correspondence to Mieke Boon.

Ethics declarations

Ethics approval and consent to participate.

No human participants were involved in this research, so ethical approval and/or consent to participate is not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .

About this article

Cite this article

van Baalen, S., Boon, M. Understanding disciplinary perspectives: a framework to develop skills for interdisciplinary research collaborations of medical experts and engineers. BMC Med Educ 24, 1000 (2024). https://doi.org/10.1186/s12909-024-05913-1

Received: 14 July 2023

Accepted: 14 August 2024

Published: 13 September 2024

DOI: https://doi.org/10.1186/s12909-024-05913-1

  • Adaptive expertise
  • Interdisciplinary expertise
  • Metacognitive skills
  • Higher-order cognitive abilities
  • Epistemology
  • Problem-solving
  • Disciplinary perspectives
  • Medical technology

BMC Medical Education

ISSN: 1472-6920

Int J Integr Care. 2020 Jan-Mar; 20(1).

Comparing International Models of Integrated Care: How Can We Learn Across Borders?

Carolyn Steele Gray

1 Bridgepoint Collaboratory for Research and Innovation, Lunenfeld-Tanenbaum Research Institute, Sinai Health System, CA

2 Institute of Health Policy, Management & Evaluation, University of Toronto, Toronto, CA

Nick Zonneveld

3 Vilans, Centre of Excellence for Long-term Care, NL

4 TIAS School for Business and Society/University of Tilburg, NL

Mylaine Breton

5 Université de Sherbrooke, CA

6 Clinical Governance in Primary Health Care, CA

Paul Wankah

7 Institute for Health System Solutions and Virtual Care, Women’s College Research Institute, Women’s College Hospital, CA

Geoff M. Anderson

Walter P. Wodchis

8 Implementation and Evaluation Science, Institute for Better Health, Trillium Health Partners, CA

Introduction:

Providers, managers, health system leaders, and researchers could learn across countries implementing system-wide models of integrated care, but require accessible methods to do so. This study assesses if a common framework could describe and compare key components of international models of integrated care.

Theory and methods:

A framework developed for an international study of programs that address high-needs, high-cost patients was used to describe and compare 11 case studies analyzed in two international research projects: the Implementing Integrated Care for Older Adults with Complex Health Needs (iCOACH) study in Canada and New Zealand, and the Vilans research group exploring models in the Netherlands. Comparative summaries were generated, with findings discussed at a 2019 International Conference on Integrated Care workshop.

Results:

The template was found to be useful to compare integrated case analyses in different contexts, and stands apart from other case comparison approaches as it is easily applied and can provide practical guidance for frontline staff and managers. Areas of improvement for the template are identified and two updated versions are presented.

Conclusions and discussion:

There is value to using a common template to provide guidance in international comparison of models of integrated care. We discuss the applicability of the approach to support scale and spread of integrated care internationally.

Introduction

Health systems globally are still struggling to roll out system-wide models of integrated health and social care [ 1 ]. In part, this is attributable to a lack of understanding of what elements are important for successfully scaling up integrated health and social care initiatives [ 2 , 3 ], and how to overcome associated implementation challenges [ 4 , 5 ]. While examples of innovation exist, they often never expand beyond the pilot phase. Sharing knowledge across these examples may offer insights into how we can scale, spread and sustain innovations as a vital step towards broader health system transformation. This type of comparative work is represented in the approach taken by the World Health Organization (WHO) [ 6 ]. In the WHO practical guide to scaling up health service innovation, they suggest it is essential to have a clear idea of the core components of the innovation, the organization, and the environment (context) to inform the process of scaling up [ 6 ]. It is also useful to consider the needs of adopters and their role in adapting and spreading innovations [ 4 ].

Comparative case study approaches may offer promise in meeting these challenges by sharing successes and identifying causes of ineffective health reform efforts [ 7 ]. To unpack and understand the complexities of integrated models of care across different countries and jurisdictions, many studies have adopted comparative case approaches [ 8 , 9 , 10 ]. Comparative case study methods have a fairly long history and a robust methodology [ 11 , 12 , 13 ]. At their core, they seek to understand phenomena in context. Whereas other methodologies may aim to control for the “noise” of external factors, case studies consider embracing the mess of context to be fundamental to our ability to understand not just what occurred, but why and under what circumstances [ 14 , 15 ]. Studying phenomena in context allows for collection of essential elements of innovation identified by the WHO to inform scale and spread.

Comparative analyses focusing on health system reform have evolved over the last 20 years, beginning with a macro-level policy focus. More recent studies have focused on meso-level organizational processes and practices [ 16 ]. Comparative case studies at the organizational level have been shown to provide valuable insights with regard to effectiveness of interventions in particular contexts, can contribute to theory building, and can be used to guide implementation of new models [ 17 ]. Numerous single and comparative case studies of integrated care have been conducted [ 18 , 19 , 20 , 21 , 22 , 23 ], and can facilitate learning across borders to build strong national knowledge [ 7 ]. However, the purpose of these approaches is often to evaluate programs (comparative case study methods) or to provide evidence to inform policy (comparative policy analysis) within a context; they are generally not intended to offer practical guidance to support scale and spread or to compare among different contexts. What is required are approaches to describe core components of the intervention, organizations and environments that can be applied by adopters, i.e. practitioners implementing on the ground.

This study marks an important step towards development of an international standard for reporting integrated care initiatives, building on tools and lessons learned in developing a template to describe how programs worldwide are addressing a common problem of more efficiently and effectively delivering integrated care to patients with high needs and high cost. Researchers at the University of Toronto in Canada developed a guide to create standardized descriptions of models across nation-states. These descriptions were intended to be easily accessible to providers and managers seeking to adopt models of integrated care in their own settings. This project was initially sponsored by the Commonwealth Fund in 2018. The present study aims to assess whether the same method could be applied to extract similar descriptions of integrated care cases that have been studied as part of unrelated large empirical comparative case studies. This work was driven by two research questions:

  • What modifications and adaptations to the template may be required?
  • What are the recommendations for adopters, researchers and decision-makers who wish to use the Case Template?

Describing models of integrated care to inform scale and spread

In Goodwin’s 2016 perspective paper regarding how we define and understand integrated care, he offers “at its simplest, integrated care is an approach to overcome care fragmentations” [ 24 ]. This “simple” statement is arrived at through an account of the multiple, complex ways health systems address fragmentation via different levels of integration (eg, micro vs. meso level), taking on different forms (eg, horizontal vs. vertical integration), and occurring at varying degrees of intensity. Different heuristic models and frameworks of integrated care are available to unpack this complexity, and help determine which factors should be understood when attempting to describe the salient features and activities of models of integrated care. However, if we are to use descriptions to inform scale and spread of models of care, we must look beyond simple descriptions of key features and better understand the dynamic.

Recent writing from Horton et al. and the Health Foundation about the challenge of spreading complex programs such as integrated care has emphasized the difficulty in “codifying and replicating” complex interventions [ 4 , 25 ]. The difficulty in codifying interventions refers to the challenge of determining which features of the program are most relevant to describe, and the possibility that the features of a program that drive its success might not be those we expect. Horton et al. emphasize that in addition to the basic descriptive features of the design of a program, it is also important to outline the implementation processes or “social mechanisms” by which a program has worked [ 4 ]. Program descriptions must balance a tension between “loosening and tightening” the descriptions of an intervention in order to inform the effort to spread the intervention broadly. A “loosening” approach encourages local adopters to imagine transformations to the program that would promote its success locally, whereas a “tightening” approach emphasizes details about the exact implementation processes and relational contexts that made the program successful. If the conditions of the initial program can be fully reproduced, then the tightening approach is most useful; otherwise some degree of loosening is needed, and the core activities that constitute the program must be described in a way that enables adopters to achieve specific related program goals with the resources available.

Keeping in mind these two essential factors, namely describing key elements of integrated care and attending to social mechanisms, the Integrated Care Case Study Descriptive Template was developed to enable comparison of integrated models of care across diverse geographies and contexts. It was used to describe 35 programs in 11 countries for the Commonwealth Fund.

Development of the Case Template

The initial development of the data collection template was completed by a team at the University of Toronto as part of a project funded by the Commonwealth Fund. This work built on an initial project with the University of Toronto and the King’s Fund, describing international cases of integrated care in Australia, Canada, the Netherlands, New Zealand, Sweden, the United Kingdom, and the United States [ 26 ]. In the Commonwealth Fund project, the team developed two separate templates: one for collecting data on design elements and activities of the program and another for collecting data on the policy context that supported the program. The construction of both data collection templates was based on literature reviews and expert opinion.

The design elements template drew heavily on the report of the Commonwealth Fund’s International Experts Working Group on Patients with Complex Needs [ 27 ]. The survey was structured to assess 10 design dimensions that the report suggested were essential, grouped into three broader areas: 1) population segmentation, 2) care coordination, and 3) patient and caregiver engagement.

The policy support template focused on the external policy and incentives component of the Consolidated Framework for Implementation Research model [ 28 ] and was informed by the National Academy of Medicine report on integrated care [ 29 ]. The template identifies four policy categories: 1) finance and payment, 2) data infrastructure and data sharing, 3) workforce and staffing, and 4) governance and partnerships. It allows for identification and description of policies that were relevant to models of integrated care.

Table 1 summarizes the components of the Integrated Care Case Study Descriptive Template (for brevity, hereafter referred to as the Case Template).

Table 1. Integrated Care Case Study Descriptive Template.

Program Structure (design elements)
  • Segmentation: Defining and applying rules to identify and recruit patients who are likely to benefit.
  • Coordination: A process for intake to characterize needs, mechanisms for coordination across institutions and sectors like health and social care.
  • Engagement: Support for shared decision-making, self-management and support for caregivers.
  • Success measures: How programs defined success, their level of maturity and any evaluation work conducted.

Policy Context
  • Governance: Governance structures in place to support the model of care. Could include committees and/or boards who meet regularly and review performance data.
  • Data: Data and information sharing policies and processes in place related to patient care.
  • Staffing: Staffing needed to support the model of care, including strategies on how to organize and prepare staff.
  • Financing: Financing structures put in place to support the model of care. Includes attention to payment mechanisms, presence of well-defined budgets, and sustainability of funding.

While the components are separated here, it is recognized that they are also interrelated. For example, appropriate approaches to coordination and engagement are likely contingent on the types of patients and caregivers being served which is determined through the intake and recruitment process.
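For readers who want to hold completed templates in a structured form, the two-part layout described above could be sketched roughly as follows. This is only an illustrative sketch, not the instrument used in the study: the class name, field names and short component keys are assumptions introduced here, and the single example value is taken from the South Holland sample case description reported later in the paper.

```python
from dataclasses import dataclass, field

# Short keys are assumed shorthand for the components described in Table 1.
PROGRAM_STRUCTURE = ("segmentation", "coordination", "engagement", "success_measures")
POLICY_CONTEXT = ("governance", "data", "staffing", "financing")


@dataclass
class CaseTemplate:
    """Hypothetical container for one completed Case Template (free text per component)."""
    case_name: str
    jurisdiction: str
    program_structure: dict = field(default_factory=dict)
    policy_context: dict = field(default_factory=dict)

    def missing_components(self):
        """Return the keys of components that have no description yet."""
        missing = [c for c in PROGRAM_STRUCTURE if not self.program_structure.get(c)]
        missing += [c for c in POLICY_CONTEXT if not self.policy_context.get(c)]
        return missing


# Partially completed example; only the segmentation text comes from the paper.
example = CaseTemplate(
    case_name="SUSTAIN South Holland",
    jurisdiction="Netherlands",
    program_structure={
        "segmentation": "frail, multiple health and social care needs (but broadly defined)"
    },
)
print(example.missing_components())
```

Keeping the two parts as separate dictionaries mirrors the split between design elements and policy context, while a helper such as `missing_components` makes gaps in a partially completed template visible.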

To answer our research questions, we applied the Integrated Care Case Study Descriptive Template to case studies conducted by the Implementing Integrated Care for Older Adults with Complex Health Needs (iCOACH) and Vilans research teams. Both these groups have conducted larger international case studies of integrated care undertaken with non-uniform and uniquely, locally defined approaches. We used the template to describe 9 integrated care cases from the iCOACH study which explore models in two jurisdictions in Canada (Ontario and Quebec, each with 3 cases) and in New Zealand (3 cases), as well as 2 cases from the Netherlands studied by the Vilans team.

Setting: iCOACH and Vilans case studies

The iCOACH research team included researchers, decision-makers, trainees and patient and family representatives from Canada (Ontario and Quebec) and New Zealand to explore the implementation of integrated community-based primary health care for older adults with complex needs. The cornerstone work of the team has been in-depth case studies of 9 different integrated care models, 3 in each jurisdiction. The team took a whole systems approach to understand the cases, including patient and caregiver, provider, organizational, and system level factors that play a role in the implementation of the models of care. To meet project objectives a multi-method case study approach was used, collecting qualitative data (interviews), quantitative data (surveys), as well as document analysis from each case [ 29 ]. Overviews of the methods, theoretical frameworks, cases, policy environments, and reflections from decision-makers, patients and caregivers can be found in the iCOACH special issue in IJIC [ 30 ].

The Vilans research team consists of researchers from the Netherlands, working on several national (diabetes networks [ 31 ], stroke services networks [ 32 ]) and international comparative case studies (SUSTAIN [ 20 ], ESN [ 33 ]) on the development and implementation of integrated care initiatives. The researchers use a comprehensive multi-method case study approach, collecting both quantitative (surveys) and qualitative data (interviews, field notes) from multiple perspectives (service users, professionals, managers as well as decision-makers).

All 9 of the iCOACH cases, and 2 cases from the Netherlands were included in the analysis presented here. While these cases were purposefully selected to answer the original research questions (see aforementioned publications), for the purposes of the case comparison study presented in this paper, case selection is more aligned with a convenience and purposeful approach [ 14 ] as they had sufficient data readily available to complete the Case Templates. Additionally, these were cases highly familiar to the study team as they had each engaged in setting up the original studies, collecting data and/or analyzing data for other studies. This afforded the team a wealth of context knowledge around the cases required to align available data to the template.

Data extraction and analysis

The Case Template was originally created as a structured interview guide conducted in two parts. Key informants with knowledge of the models of care would start by rating their models along the four components in each section, and then would be asked probing questions to elicit greater detail. See Supplementary Material 1 for an overview of the original interview guide questions and probes. We adapted this method, using the guide to “interview” ourselves, using the data collected in our case studies to answer questions and probes.

For the present study, research leads with in-depth knowledge of the cases in each jurisdiction (Ontario, Quebec, New Zealand or Netherlands) were assigned to complete Case Templates for cases in their area of expertise. While we did not have one of the local New Zealand research team members available to participate in this work, the Ontario team members participating in this study had previously conducted much of the initial coding and analysis of New Zealand data and had been working closely with New Zealand research team members, providing them the necessary knowledge and expertise of the models to conduct this work. Leads looked at case study data collected as part of the iCOACH project and similar data from Vilans integrated care case studies. Various data sources were reviewed to complete templates for each jurisdiction, including: published articles based on the case studies, documents collected as part of case study work (eg, vision and mission statements, relevant policy documents and websites), coded interview data from interviews with providers and managers, and, where required to fill gaps, original interview transcripts were reviewed. Research leads used these data sources to write answers directly into the structured interview guide for each of the 11 cases reviewed, and maintained analysis memos to track data sources, the process taken, and preliminary analytic thoughts.

To begin, a single Case Template was completed for each of the four jurisdictions and circulated to the team for discussion regarding process and preliminary analytic reflections. Once we were satisfied that a similar process was being used across jurisdictions, the remaining cases were completed following the same procedure. With all templates completed, the lead author distilled the data into a single table to facilitate cross-case comparison. The table was circulated to the team for review, followed by an analysis meeting where similarities and differences across the cases were discussed and agreed upon.
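The step of distilling completed templates into a single comparison table can be pictured as a simple pivot: one row per template component, one column per case. The sketch below is an assumption-laden illustration rather than the authors' procedure; the function name, the component keys and "Hypothetical case B" are invented for the example, and only the South Holland segmentation text is taken from the paper.

```python
def build_comparison_table(cases, components):
    """Pivot completed templates into {component: {case_name: text}}.

    `cases` maps a case name to a dict of component -> free-text answer.
    Components a case has not answered are left as empty strings so that
    gaps stay visible when the table is reviewed.
    """
    table = {}
    for component in components:
        table[component] = {
            name: answers.get(component, "") for name, answers in cases.items()
        }
    return table


# Illustrative input: one entry uses text from the paper, the other is a placeholder.
completed = {
    "SUSTAIN South Holland": {
        "segmentation": "frail, multiple health and social care needs (but broadly defined)",
    },
    "Hypothetical case B": {
        "segmentation": "placeholder description",
        "coordination": "placeholder description",
    },
}

comparison = build_comparison_table(completed, ["segmentation", "coordination"])
for component, row in comparison.items():
    print(component, row)
```

Laying the cases side by side per component is what makes similarities and differences easy to discuss in an analysis meeting; the empty strings flag where a template still needs data.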

Expert discussion and review

The above process and key results were presented in a 90-minute workshop at the International Conference on Integrated Care held in San Sebastian, Spain, on April 1, 2019. In the workshop, delegates were presented with the framework, an overview of the cases, our methods for comparing cases, and key results (presented in the results section). Delegates attending the workshop included researchers working in the field of integrated care, policy-makers and other decision-makers, as well as managers and front-line providers/practitioners engaged in delivering integrated care in their respective countries, representing regions including Europe, North America, Australia/Oceania and South-East Asia. Workshop participants had an opportunity to apply the Case Template to their own cases and engaged in roundtable discussions to help us address research questions regarding adaptation and recommendations for using the template. Workshop facilitators took notes at the session, and co-authors engaged in a post-workshop discussion to identify key learning from the exercise. While no formal ethics process was followed as the conversation was not recorded and names were not collected, delegates were made aware that the discussion would inform the refinement of the Case Template and be included as part of the publication.

The Case Template provided a useful lens to explore the 11 international cases. Table 2 offers a high-level summary of the data across cases, with a full dataset available in Appendix A. The full data set was used to generate analytic discussion across the team. The following two sections highlight key findings from parts 1 and 2 of the Case Template and demonstrate its ability to be used to describe case studies in a comparable way. The 11 case studies represent different models of integrated care; for simplicity we refer to the case examples as “models” of care or cases.

Sample case descriptions [for full descriptions see Appendix A].

Case | Segmentation | Coordination | Engagement | Success measures | Policy Context
SUSTAIN
South Holland frail, multiple health and social care needs (but broadly defined)
self-referral (clients and families) or by professionals in the community (prevention driven with active community communication)
Conducted by any provider using a standard tool (Self-reliance matrix) filled out during a home visit (using a tablet). Patient then assigned navigator (anyone on the team, all trained for this role – likely with most experience with care needs of patient).
Each team includes community nurse or GP practice nurse. Direct connection to GP practice varies.
At minimum social worker, community nurse and municipal social care worker. Can add: dementia case managers, physicians, social housing, etc…
All providers still linked to their parent organizations which can facilitate transition
All providers can access a shared data platform which includes online communication tool (all teams trained on it).
A strong belief of the teams, but not formalized. Also engagement is limited due to the low functional status of clients.
Similar to issue above, believed to be important but difficult to operationalize. Additional challenge of different characteristics of neighbourhoods.
Not yet a clear component of strategy – but family issues captured as part of the assessment process. Experimenting with digital tools to support caregiver engagement (interest in building this long term).
Program admitted first client in 2015 and has served 5,000 since. Served approx. 300 in the past 6 months. Program started as a pilot (3 teams) and scaled up in 2016 (27 teams). Composition, objectives and aims of teams varies by neighbourhood.
Better health outcomes, patient/caregiver experience and lower costs – these are not formalized in measures (not unusual).
No data on program activities are collected. Currently developing performance indicators.
No formal evaluation conducted.
Both municipal (public tender and subsidized funds) and health insurer financed.
All professionals stay employed by their mother organizations. Next to their daily work, they get extra hours for doing the multidisciplinary work/meetings. Professional training is executed by the local (applied) university and funded by the municipality.
Shared governance model. All involved parties (health and social care providers, GPs, municipality, health insurers) are represented in a steering group. However, the two financing institutes (insurers and local government) are directing. No performance data is collected yet.
To facilitate data linkage, a shared IT system has been developed. However, ‘old’ systems are still being used. Administrative burden is a risk.
Most innovative part is that a person/family has 1 contact person, and that integration takes place in all phases of the process: from intake to care delivery.
Integrated Client Care Program (ICCP) – partnership model between primary care and home care The Integrated Client Care Program (ICCP) focuses on the top 1–5% frail older adults in need of integrated services. Patients are assessed using the RAI tool.
Patients can enter the program through multiple entry points including the Family Health Team (FHT), the Community Care Access Centre (CCAC - government agency that connects patients to home care services), and through other partners and community agencies aware of the program.
Intake depends on the RAI evaluation (see target group). The Care Coordinator from the CCAC typically takes responsibility for ICCP patients.
All patients on the ICCP program have sustained access to their primary care provider who is supported through a multi-disciplinary team
There is a high level of professional integration with the multi-disciplinary team, as well as organizational integration between the FHT, the CCAC and hospital.
There is a formal program with the local hospital, Virtual Ward, which supports transitions for clients going from hospital to home. This is a clear process and protocol but only for patients at the local hospital – if patients end up in another hospital there is no process.
Partnering organizations have connecting information systems (hospital and FHT), or individuals able to access multiple platforms (embedded care coordinator can see FHT and CRIS systems)
Patient engagement occurs at this site and is an increasing focus. There are patient and family carer seats on committees and strategic planning groups.
Collaboration and patient goal-setting is a part of the culture at the FHT and embedded into the ICCP program.
Caregiver support less formalized, but providers are attentive to caregiver needs and attempt to provide supports when they can. Not a formal process.
The ICCP program began in 2012 and is a replication from the ICCP program in palliative care run out of the CCAC. Other FHT programs like Virtual Ward and IMPACT are also established and support the integrated model. IMPACT is a replication from another site.
Standard FHT measures apply to the FHT for reporting to the LHIN on performance. It is noted by decision-makers that other measures are currently missing, but they would anticipate that reduced hospitalizations and ER visits be among their key measures.
Data not available
ICCP was not formally evaluated at the time of data collection. A different Virtual Ward program at Women’s College has had a formal evaluation, as has the IMPACT program in other settings.
The ICCP program is funded by the FHT and CCAC through paying for specific staff to run the program. For the FHT the staff is now part of the global budget.
Unique staffing model which collocates the community partner (home care coordinator) in the multi-disciplinary primary care team to improve coordination and information sharing.
The FHT, like other FHTs in Ontario has a board of directors that reviews performance metrics aligned with Ministry reporting requirements.
Some innovative data sharing between the FHT and the local hospital (sharing medical records), electronic referral and information sharing with Toronto EMS (paramedics), and colocation of staff enables seeing health and social care data while in the primary care clinic.
The colocation model of ICCP, along with the virtual care and home visiting programs are innovative practices in Ontario.
Functional Autonomy Measuring System (SMAF) used to determine eligibility – need a particular score to be included. SMAF scores guide a multidisciplinary care plan. Host organization and local organizations may have some flexibility in what is provided. Those with a SMAF >5 receive a case manager through home care services unit. Personalized care plan but shared decision-making difficult to operationalize. Culture of shared decision-making supported by government and leaders. CLSCs operational since the 1970s, with 100,000s since then. It is an established government-run program with secure funding and spread across the province. Publicly funded through taxation. In addition, patients may pay directly for services from community agencies that are mostly not covered by public insurance.
Patients with 2 or more YES answers on PRISMA-7. SMAF is managed by a specialized team at a single point of entry for defined geography. Clients can also self-refer. Some regular contact but challenging to connect to primary care as they are privately owned. Case managers have primary responsibility. No clear self-management support aspects of program. Better health outcomes, patient and caregiver experience and lower costs. Related to government healthy aging policy with specific indicators: reduced wait times, reduced ED visits, # clients in the program. All professionals stay employed by their mother organization. Recent initiatives are in place to “lend” allied professionals (nurses, social workers, dieticians etc.) to privately owned Grouped Medical practices – the allied professionals are still employed by the mother organization but work in private physician clinics. Family physicians in the community are paid through public insurance but are autonomous workers.
Types of services offered varies by local organization but all include primary care in the community, acute and surgical, home care, nursing home, supportive housing, community day care and social supports. Some co-location but not in all sites.
Some organizations have dedicated care transitions provider (engage in pre-discharge meetings)
Have ICT systems to facilitate integration and transitions, in particular tools that send transfer information electronically. Government mandated (RSIPA system). Some variation in access due to location.
Some caregiver supports offered (e.g. respite days) – no information regarding formal training. Performance indicators reported on regularly.
Several formal research studies conducted to evaluate the model. Developed OSIRSIPA tool to monitor implementation and outcomes.
Since 2015, almost full integration of public establishments under the same governance (hospital, rehabilitation, home care, long-term care facilities). Vertical governance structure. The HSSCs are public health and social care agencies that are mandated by the government to organize care delivery in their territories. The HSSCs have to lead in establishing local joint governance boards for various health problems with their local partners in the community (physician clinics, nursing homes, private community agencies etc.).
There is a government mandated IT system (the RSIPA) that is shared between various agencies within the HSSC. However, some private agencies do not have access to this public IT system. Furthermore “older” IT systems co-exist with the public IT system.
Introduction of several initiatives. E.g. formalization of care coordination by case managers, use of multidisciplinary individualized service plans, and use of multidisciplinary health and social care evaluation tools.
NZ Network Model The DHB serves a broad population but the CREST and care coordination programs focus on 65 and older population transitioning home from hospital.
Clients access services through Liaison Nurse who identifies eligible individuals in the hospital. Referrals for case management and care coordination programs for older adults can come through GPs, other providers or through self-referral.
Assessments used by Liaison nurses and care coordinators to assign services based on function and need (eg, interRAI)
GPs play an active role in NZ Network Model, referring patients as needed to programs and following up with other providers. They will engage in case conference calls with other providers as well.
Involves a wide range of health and social care services, some of which are tailored to older adults with complex care needs. Providers regularly speak across boundaries to deliver care.
CREST is a structured transition program from hospital to home. Care coordinators and case managers work to help integrate other services. Teams across services also work together.
Uses a few systems to share information, including CCMS, SAP, Momentum, Health Connect South and One Health Now. Providers can access patient data that sits on these systems from different settings (eg, pharma, labs, clinical care, hospitals).
Goal-setting part of care delivery (particularly for CREST programs), not part of DHB training but embedded in professional training and approach.
Area of focus particularly for the CREST program with an emphasis on enablement and support.
Not an emphasis
New model in DHB established in 2006/7 but gained traction in 2011 post earthquakes. An established program with ongoing funding.
Emphasis on process measures (early discharge), also collect patient satisfaction and engage in peer review meetings
No regular reporting mentioned in interviews – but likely occurring particularly for funded partners
No formal evaluation to our knowledge
DHB shifted to activity-based payment model for hospitals and bottom-up focused alliance contracting where maximum collective gain can only be realised if all parties support one another and agree to share any losses
Unchanged – what has changed is how they work together
Shift towards a Network model reliant on partnerships and governed by Alliance Support team.
Not necessarily new but part of the NZ approach to data where patients have unique identifiers across health and social care data platforms to facilitate finding information.
Most notable shift is in moving clients out of hospital and into the community setting faster through partnerships with social care providers and enablement program (CREST)

Comparing model structure features: Segmentation, Coordination, Engagement, Measurement

The models reflected in case studies followed various segmentation approaches. While all models cover a specific geographical area, they differ in their target group focus. In some cases, models have a broad scope, serving local communities as a whole (Community Health Centre, the Maori health organization), whereas other models focus on a more specific population. For instance, the CREST and care coordination programs in the NZ Network model focus on people aged 65 and older transitioning home from hospital. There were also examples of “in between” models that focus on frail people or older people as wider target groups. When looking at the entry of these people into models, three categories can be distinguished: 1) professional entry, 2) self-entry, and 3) a combination of both. The Integrated Client Care Program (ICCP) model in Ontario, for example, can only be accessed through professional entry points. In the Community Health Centre model, on the other hand, both self-entry and professional entry are possible, but access is subject to availability and wait-lists. In other programs, such as South Holland, Quebec and the Maori health organization, both professional and self-entry are used.

“People or their family and friends can refer themselves to the program by visiting the municipal single access point (visit, phone and online). People can also be referred by professionals working in their neighborhood, having an active signaling/preventing role.” [South Holland program].

After entering the program, intake processes take place in all cases. Our analysis shows a broad spectrum of formal and less formal ways to conduct an intake. Some initiatives established standardized processes using validated instruments, such as the Functional Autonomy Measuring System (SMAF) guiding the development of multidisciplinary care plans in Quebec. The use of this clinical tool to assess the level of autonomy of older adults was mandated to all programs in Quebec. In other models, such as in the Community agency lead model in Ontario, the intake is an informal process and varies from program to program.

Although some variations in consistency and access are reported, in 10 out of 11 cases information sharing takes place through shared or linked digital data platforms to some degree. Only in the Utrecht Hills case is it described that professionals are not allowed to electronically share information and therefore rely on multi-disciplinary meetings occurring every six weeks.

Although many programs state that their practice is strongly driven by a belief in patient engagement, self-management and caregiver engagement, most report that few formal activities to achieve this have been implemented. Some models stress that goal-setting with patients is part of the working processes and happens regularly (e.g. the Maori health organization, ICCP). Other models report educational materials for patients (Community agency lead model, Ontario), information, advice, guidance and support for caregivers (Utrecht Hills) and respite programs for caregivers (Quebec programs). One program, ICCP, has organized patient and family caregiver roles on committees and strategic planning groups.

“Government emphasizes shared decision-making, which is materialized by the personalized care plan. The operationalization of a “shared decision-making” concept is often difficult. Influenced by provider’s time pressure, case-loads, characteristics of clients (cognitive abilities – here providers will share decision making with their caregivers) etc.” [Quebec model]

Besides their different segmentation, coordination and engagement structures, the models analyzed use a broad range of outcome measures. Mainly the Canadian programs (Community agency lead model, ICCP and Community Health Centre in Ontario, and all cases in Quebec) collect a relatively extensive amount of data on health outcomes, patient and caregiver experiences and costs. For example, the Community agency lead model collects data on service utilization, client experience/satisfaction, ER visits and fall rates, quality of life as well as a variety of primary care measures. Other practices measure their success in a less standardized way, for instance by focusing on process measures (NZ Network model) or by using more pragmatic and informal measures (South Holland). Three programs reported that no outcomes are identified or systematically measured. Only one program (representing the three Quebec cases) reports that several formal research studies have been conducted for the evaluation of the model.

Comparing the policy environment: governance, funding, staffing, innovation

The 11 cases analyzed had various governance structures. Most models had a shared governance structure consisting of partnerships between organizations involved in the continuum of care for their target populations (South Holland, Utrecht Hills, the Community agency lead model, NZ Network model). Partner organizations were often represented on steering committees of directors. Other programs were led by a single organization operating with a board of directors (ICCP, Community Health Centre). The Quebec program (representing 3 cases in different sized jurisdictions) follows a fully integrated model with the structural merger of all health and social care organizations under a single governance structure.

Funding approaches also varied across models. The Maori health organization model was funded by multiple sources – government, district health boards and primary health organizations. South Holland and Utrecht Hills adopted mixed funding models through local/municipal governments and private health insurers. Other models had dedicated funding through partnerships of organizations for specific staff within a primary health care clinic. For instance, the ICCP model was jointly funded (in-kind) with staff supported by the primary care practice and the local community agency. The Quebec model is based on a global budget to a single governance structure financed publicly through taxation.

Although multidisciplinary team-based care was an essential component of each case, most staff stayed employed by their mother organizations. Two staffing approaches emerged for ensuring multidisciplinary team-based care. First, South Holland, Utrecht Hills and NZ Network programs did not change their staffing models; these programs focused on changing professional attitudes towards improved inter-professional collaborative relationships. Second, other programs opted for co-location of staff. For instance, the ICCP model co-located community care coordinators with multidisciplinary primary care teams, while the Quebec model co-located nurses and social workers with community-based family medicine groups.

While nearly all cases used some sort of IT system to store and share data, our analysis reveals that the models have two main data sharing issues in common. First, the models faced challenges in linking data between the “newer” IT systems and the “older” IT systems. In fact, the newer IT systems were often layered upon existing IT systems. Furthermore, older technologies like faxing were still used to share data across organizational boundaries. Second, there was a lack of interconnectivity between IT systems of various health and social care providers. For instance, in some programs, the IT systems of nurses, social workers or community-based family physicians were not inter-connected. Co-location of staff in the ICCP model facilitated data sharing because community care coordinators could access the IT system of their primary organization and share relevant data with their primary health team. A further challenge was that professionals differed in whether they could enter data or only read it, which affected interdisciplinary communication.

Innovation was an important aspect of the programs we analyzed. We identified several local care delivery innovations across the programs. Most programs endeavored to assign a single contact person responsible for coordinating health and social services for a user. New professional roles like the care navigator were developed in the Maori health organization model. Co-located hub sites that brought together different professionals from different organizations were an innovative feature of the Community Health Centre model. The Quebec models developed innovative and comprehensive multidisciplinary health and social care evaluation tools (such as the OEMC (outil d’évaluation multiclientèle) tool) that facilitated inter-professional collaborations.

The results presented here represent a step in the development of an international standard for reporting integrated care initiatives, offering a cognitive test and additional validation of the Case Template developed to describe integrated care cases. We have demonstrated that the Case Template can successfully be applied to disparate international research studies, generating comparable data across 11 cases from 3 different research programs across 4 countries. In this discussion, we suggest modifications to the Case Template based on this work, and identify the potential value this approach brings to different stakeholders, with an emphasis on value for adopters of integrated models.

Challenges adopting the Case Template

Based on our application of the template as well as feedback from the ICIC19 workshop, we identified the following adoption challenges:

  • Definitional clarity: During the workshop in particular, delegates struggled with the definitional clarity needed to apply their experiences and models to the Case Template. One notable example provided by a delegate was around the concept of a “care or patient navigator.” This term was not consistently used across different jurisdictions amongst delegates, nor was it used consistently in the iCOACH and Vilans cases, leading to an in-depth discussion of what is meant by navigation as compared to coordination. It was determined that key terms in the template would need to be well-defined to ensure clear understanding and comparability across jurisdictions.
  • Attending to perspective: Another important reflection in the workshop discussion concerned who exactly would be filling out the templates should these be implemented across multiple jurisdictions and programs looking to describe their models of care. It was noted that a front-line clinician and an executive-level manager of the same model may respond to the same questions differently, requiring that we be clear on who in the organization should be filling out the templates to ensure comparability across sites. Divergence in perspective from different stakeholders has been found to impede implementation of integrated care [ 34 ], and as such is a critical component when thinking of scaling and spreading models.
  • Redundant concepts: Another area of struggle for the research team, as well as for delegates in the workshop, was in teasing apart concepts that felt too similar or even redundant. The most prominent example of this was in questions around eligibility in the segmentation section, and the intake process in the coordination section. It was found that models of care would often determine eligibility as part of their intake process via assessments, surveys, or interviews with patients and their families.
  • Capturing culture: Both the research team and workshop delegates noted that the Case Template captures more process-oriented aspects of integrated care with less emphasis on cultural practices that are equally important to driving models of integrated care [ 35 ]; one notable exception is a prompt question regarding having a patient and family engagement culture at the organization. In research team discussions, as well as those in the workshop, it was found that we could not speak about what worked functionally without attending to normative issues of relationship and culture that were considered necessary to make processes work. Even in filling out the templates, jurisdiction leads would often include reflections on these normative aspects of integrated care as they could not be removed from the processes being described.
  • What level of contextual detail matters: A consistent debate amongst the research team, also reflected in workshop discussions, was the level of detail required in filling out the Case Template. This was particularly important with regard to sharing learning on how cases addressed common issues. For example, when discussing the differences between funding models, it was important to drill down on key details such as navigating union agreements and how to engage multiple funders so cases could learn how to navigate these difficulties. Other challenges, however, required less detail to understand across cases. In discussing inter-professional teams, it was determined to be less important to know exactly how an inter-disciplinary team was structured (eg, how many physicians, nurses, or social workers were involved) or communicated than it was to understand how the team built their relationship so they could work together to meet patient needs.

While challenges were noted, the delegates at the workshop generally felt the structure of the template captured key aspects of integrated care. It was clear in the discussion that the template was not considered to be a stand-in for more rigorous comparative case study research methods, but rather is most useful as a practical tool to describe cases and support knowledge sharing across boundaries. The participants felt the relevance of the framework was to summarize case studies and initiate a conversation to share learning on key features of integrated care models.

A critical learning was that we were successful in adopting the Case Template given the team’s research skills and in-depth knowledge of cases. While this allowed us to create comparable data sets, this may not be easily applied by managers who wish to describe their models of care. As such we offer two modifications to the template. The first is a refinement of the template that can be adopted by other researchers seeking to use the template to compare disparate empirical cases of integrated care. The second is a simplified template that we anticipate can be more readily adopted by managers to quickly describe their model in a standardized way.

Modifying the template for researchers and the value of the approach

During the post-workshop discussion, the research team identified the key areas where the template required modification based on: 1) what was discussed at the workshop and 2) notes and minutes from analysis meetings in which the challenges of applying the framework across cases were documented. We determined that many of the challenges identified in our application of the template and workshop with delegates from ICIC19 can be mitigated by modifying the template as well as providing clear definitions and guidelines for its application. Much of the content and structure worked well, and will be strengthened through the following changes:

  • Reframing segmentation questions to focus on general population of interest for the model of care, maintaining the first question as is, and moving the referral question to be a part of the intake section under coordination.
  • Streamlining the prompts to reduce redundancy in questions.
  • Adding prompts to the segmentation, coordination sections, and part 2 of the template to capture normative aspects of integration (eg, relationships and shared values that underpin these processes).

To address the important aspect of perspective and definition, we also recommend adding:

  • Clear definitions of each concept (eg, care coordination) upfront, and a section where respondents can define concepts specific to individual cases as needed.
  • A section where individuals filling out templates can identify their role in the organization.

Finally, we recommend restructuring the approach to improve feasibility of use for secondary data analysis, allowing data to be extracted from available sources rather than using an interview format. We added an introductory page which addresses how to do this work, the issue of describing contributors, and a space to provide a high-level context summary of factors viewed as influential on the model described (addressing the identified issue of context). We reflect these changes to the template in Supplementary Materials #2.
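To make the recommended additions concrete, the sketch below shows one way a completed template could be held as a structured record. It is an illustration under stated assumptions, not the layout of the template in the Supplementary Materials: the class `CaseTemplateRecord`, its field names, and the example values are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class CaseTemplateRecord:
    """Hypothetical record mirroring the recommended additions to the template."""

    case_name: str
    respondent_role: str                               # role of the person filling out the template
    context_summary: str = ""                          # high-level summary of influential context
    definitions: dict = field(default_factory=dict)    # case-specific definitions of key concepts
    data_sources: list = field(default_factory=list)   # sources the answers were extracted from

    def undefined_terms(self, required_terms):
        """Return the key concepts this record has not yet defined."""
        return [term for term in required_terms if term not in self.definitions]


# Illustrative values only; none of these come from the study data.
record = CaseTemplateRecord(
    case_name="Example case",
    respondent_role="front-line clinician",
    definitions={"care coordination": "hypothetical local definition"},
    data_sources=["published articles", "coded interview data"],
)
missing = record.undefined_terms(["care coordination", "care navigator"])
```

Here `missing` would contain "care navigator", flagging a concept that still needs a definition before the case is compared with others. Recording `respondent_role` alongside the answers is one simple way to keep the question of perspective visible when templates from different sites are read together.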

For researchers, this template can be used to determine comparability of case study data as a preliminary step before engaging in more rigorous comparative case study work. One approach to comparative case studies suggested by the WHO is to look at available data with an aim to adapting it to a common unit of comparison [ 7 ]. Our proposed modifications to the Case Template can help to achieve this aim, and serve to address three identified challenges when engaging in service level comparisons across regional boundaries [ 16 ]:

  • Securing comparability in terms of key concepts, as different regions may assign different meanings even to terms that are widely used. In particular, be cautious when creating typologies, which can often trade off accuracy for simplicity. Including definitions, and areas where respondents can define key concepts, can help address this challenge.
  • Attending to both between-system and within-system differences that may influence which contextual factors are at play. Regional differences need to be attended to, and so descriptions should be careful not to generalize one program description to an entire nation, particularly when looking at decentralized models of care delivery. The second part of the template, on policy context, offers a means to tease these differences apart.
  • Finding and selecting data that can be compared across disparate cases. A balance must be struck between comparing aggregate-level data and retaining the important context and nuance unique to individual cases. This is particularly challenging when comparing in-depth case studies, which are rich, detailed and contextual. The proposed template points to key constructs and leaves room for different levels of descriptive detail, as determined necessary by those applying the method.

The proposed modified template can help research teams describe cases including both program and contextual policy-related factors. For non-researchers, further simplification and standardization is useful.

Simplifying the template for managers and providers

Keeping the modifications above in mind, as well as what was learned in applying this method, it is clear that our success in using the Case Template to compare and contrast a highly varied set of programs likely derives from having: 1) strong research backgrounds; 2) expertise in the area of integrated care; and 3) in-depth knowledge of the cases we were describing. There have been attempts by the researchers who developed the Case Template to have front-line managers and providers use it, with much less success, mainly due to its depth and complexity. As part of this work to create a survey that could be used by front-line staff, research team members have been working with IFIC to review other survey tools alongside the Case Template to see if the tool could be simplified. These other tools were reviewed, and, alongside what was learned to modify the Case Template, a simplified template was developed. An initial version was written, then circulated to the team for review and discussion until consensus was reached. This second modification to the template is intended to be used by managers and providers working on the front-line to describe their cases. This simplified template can be found in Supplementary Materials #3. The intention here is to allow for a standardized approach to describing models of integrated care internationally that can be collected quickly and effectively directly from those delivering the model, reducing the need for the resource-intensive approach that relies on research teams.

A simplified, standardized template has value to many stakeholders in the system but, in particular, organizations seeking to provide innovative integrated care either by modifying their existing care delivery or adopting innovations that others have developed. In both circumstances, there is a need to accurately document or describe the innovation and to systematically understand which components or processes have been kept the same and which have been modified. These descriptions help organizations to more clearly see the main components of integrated care models, compare their existing ways of working, and see the path towards a more mature system (eg, moving from having no structured protocols for coordination processes, towards having clear protocols and strong commitment).

In the background section of this paper we presented Horton and the Health Foundation's argument for the need to balance the “tightening” and “loosening” of program sections to support adoption of complex interventions [ 4 , 25 ]. Particularly in the context of adoption of complex integrated care innovations, there is a tension between having a very detailed definition or codification of the innovation that allows for fidelity and assurance of expected outcomes, and allowing for modifications to take into account local context and resources [ 36 ]. The goal then is to find a “middle” way between descriptions that are too tight to be successfully replicated in new settings and too loose to allow for a reasonable expectation of predicted impact. Some recent work has shown that frameworks that are acceptable for descriptions of randomized trials may not be detailed enough to allow for meaningful spread and adoption [ 37 ]. We hope to test our new framework in the context of supporting adopters to determine if it is closer to the middle way than other existing tools [ 38 ].

A final value-add of both the modified and simplified templates is the opportunity to build a community of practice around the implementation of integrated care internationally that consists not only of those studying integrated care, but of those engaging in it as well. Establishing continuous learning and social networks creates opportunities for training and knowledge exchange that are found to be critical factors in supporting scale and spread of health system reform efforts [ 39 ]. We intend to use the simplified template to support sharing of knowledge, enable self-assessment, and help build social networks to advance scale and spread. First, we will pilot the simplified template at ICIC20 in Croatia with attending delegates, as well as through IFIC and its affiliate branches in Canada, Ireland and Australia, with the longer term vision of generating a summary data set of integrated care models worldwide. The summary data set represents important shared knowledge that can be used by providers and managers to compare themselves to other models working in similar contexts. As IFIC already has a wide international member-base, it can also help facilitate additional social networking between models with similar profiles, which can help teams to come together across borders and then ask more detailed and granular questions to deepen learning and support scale and spread.

Limitations and Future Work

To conduct this comparative analysis of integrated care models, the team worked with data already collected through case study research. As there was no ability to probe beyond the information already available, some details regarding descriptions of the models may have been missed. We additionally were unable to determine, at this stage, the “correct” or “optimal” level of detail required to provide more granular guidance. The discussion at the conference offers some indications that focusing on higher level context variables offers insightful information to compare cases, and may be more feasible than providing in-depth detail at all levels. However, we recognize that this approach may miss some micro level differences that could be important for adopters and researchers to consider. More work to tease apart the “right” level of context detail is likely still required.

We also recognize that the issue of differing perspectives between management and front-line staff, raised at the workshop, may be a substantive one, potentially signaling issues with the culture and leadership approach of a model. As these are complex challenges, we do not recommend unpacking them using a descriptive template such as the one presented here. Instead, identification of disparate perspectives within a single model may signal the need for researchers to dig more deeply, and for models to attend to misalignment in the understanding of the program's vision, aims, and processes amongst staff.

The sample of cases we chose for this analysis was necessarily based on a convenience sample of the studies we had already conducted. An application of our method to other cases may yield additional insights on the template, and as such we recommend the modified and simplified templates be viewed as “living documents” to be revisited and refined as they get applied and new insights are generated.

Finally, the two modified versions of the survey require further validation and testing; in particular, the simplified version needs to be tested with front-line providers and managers to ensure that it can indeed be easily applied and provide implementation guidance. As previously noted, we intend to pilot the simplified survey in 2020 through IFIC, as well as at ICIC20, as a step towards further validation.

This paper demonstrates that a standard case description template can be effectively applied as a secondary data extraction method, helping to create comparable descriptions of integrated care cases across international boundaries by drawing on data collected as part of case study research. The presented modified and simplified templates address a number of the challenges identified by the researchers in applying the tool, and by the providers and managers who were presented with the tool at a workshop at ICIC19. As demonstrated by the work presented in this paper, the modified tool will be valuable to researchers studying integrated care across different jurisdictions as a means to provide a high level, comparable summary of key components of integrated care models. The presented simplified tool, we feel, has significant potential to be valuable to adopters of integrated care by offering a simple tool that can be used to summarize and compare cases, helping models to situate themselves relative to peers and to make meaningful connections to other models as a means to further efforts to scale and spread models towards broader health system transformation.

Additional Files

The additional files for this article can be found as follows:

Appendix A.

Full case descriptions.

Supplementary Material 1.

Integrated Care Case Study Descriptive Template Structured Interview Guide – used for Commonwealth Fund study.

Supplementary Material 2.

Integrated Care Case Study Descriptive Template – Modified Version.

Supplementary Material 3.

Integrated Care Case Study Descriptive Template – Simplified Version.

Acknowledgements

We would like to acknowledge the highly engaged and enthusiastic participants in the workshop at the International Conference on Integrated Care held on Monday, April 1st, 2019 in San Sebastian, Spain, who shared their ideas, insights, and experiences. We would additionally like to acknowledge the trainees and colleagues who helped facilitate the session, take notes, and share reflections on the day: Dr. Patrick Feng, Dr. G Ross Baker, Dara Gordon, Jennifer Gutberg, and Jennifer Im. Finally, we would like to acknowledge the support of Dr. Henk Nies, whose leadership at Vilans helped us to build the partnership which resulted in this work.

Dr Teresa Burdett, Senior Lecturer in Integrated Health Care, Unit Lead for Foundations of Integrated Care and Person Centred Services, UK.

Dr Anna Charles, Senior Policy Adviser, The King’s Fund, UK.

Competing Interests

WPW is a facilitator of the IFIC Canada hub site, and leads the Scientific Committee for the North American Conference on Integrated Care (NACIC) planned for October 2020. CSG is also on the Scientific Committee for NACIC. WPW and CSG’s roles are on a voluntary basis. All other authors have no competing interests.

AI tool cuts unexpected deaths in hospital by 26%, Canadian study finds

Researchers say early warning system, launched in 2020 at St. Michael's Hospital, is 'saving lives'.

Inside a bustling unit at St. Michael's Hospital in downtown Toronto, one of Shirley Bell's patients was suffering from a cat bite and a fever, but otherwise appeared fine — until an alert from an AI-based early warning system showed he was sicker than he seemed.

While the nursing team usually checked blood work around noon, the technology flagged incoming results several hours beforehand. That warning showed the patient's white blood cell count was "really, really high," recalled Bell, the clinical nurse educator for the hospital's general medicine program.

The cause turned out to be cellulitis, a bacterial skin infection. Without prompt treatment, it can lead to extensive tissue damage, amputations and even death. Bell said the patient was given antibiotics quickly to avoid those worst-case scenarios, in large part thanks to the team's in-house AI technology, dubbed Chartwatch.

"There's lots and lots of other scenarios where patients' conditions are flagged earlier, and the nurse is alerted earlier, and interventions are put in earlier," she said. "It's not replacing the nurse at the bedside; it's actually enhancing your nursing care."

A year-and-a-half-long study on Chartwatch, published Monday in the Canadian Medical Association Journal, found that use of the AI system led to a striking 26 per cent drop in the number of unexpected deaths among hospitalized patients.

"We're glad to see that we're saving lives," said co-author Dr. Muhammad Mamdani, vice-president of data science and advanced analytics at Unity Health Toronto and director of the University of Toronto Temerty Faculty of Medicine Centre for AI Research and Education in Medicine. 

'A promising sign'

The research team looked at more than 13,000 admissions to St. Michael's general internal medicine ward — an 84-bed unit caring for some of the hospital's most complex patients — to compare the impact of the tool among that patient population to thousands of admissions into other subspecialty units. 

"At the same time period in the other units in our hospital that were not using Chartwatch, we did not see a change in these unexpected deaths," said lead author Dr. Amol Verma, a clinician-scientist at St. Michael's, one of three Unity Health Toronto hospital network sites, and Temerty professor of AI research and education in medicine at University of Toronto. 

"That was a promising sign."

The Unity Health AI team started developing Chartwatch back in 2017, based on suggestions from staff that predicting deaths or serious illness could be key areas where machine learning could make a positive difference.

The technology underwent several years of rigorous development and testing before it was deployed in October 2020, Verma said.

"Chartwatch measures about 100 inputs from [a patient's] medical record that are currently routinely gathered in the process of delivering care," he explained. "So a patient's vital signs, their heart rate, their blood pressure … all of the lab test results that are done every day."

Working in the background alongside clinical teams, the tool monitors any changes in someone's medical record "and makes a dynamic prediction every hour about whether that patient is likely to deteriorate in the future," Verma told CBC News.

That could mean someone getting sicker, or requiring intensive care, or even being on the brink of death, giving doctors and nurses a chance to intervene. 

In some cases, those interventions involve escalating someone's level of treatment to save their life, or providing early palliative care in situations where patients can't be rescued. 

In either case, the researchers said, Chartwatch appears to complement clinicians' own judgment and leads to better outcomes for fragile patients, helping to avoid more sudden and potentially preventable deaths.

AI on the rise in health care

Beyond its uses in medicine, artificial intelligence has been getting plenty of buzz, and blowback, in recent years.

From controversy around the use of machine learning software to crank out academic essays, to concerns over AI's capacity to create realistic audio and video content mimicking real celebrities, politicians, or average citizens, there have been plenty of reasons to be cautious about this emerging technology.

Verma himself said he's long been wary. But in health care, he stressed, these tools have immense potential to combat the staff shortages plaguing Canada's health-care system by supplementing traditional bedside care.

It's still the early days for many of those efforts. Various research teams, including private companies, are exploring ways to use AI for earlier cancer detection. Some studies suggest it has potential for flagging hypertension just by listening to someone's voice; others show it could scan brain patterns to detect signs of a concussion.

Chartwatch is notable, Verma stressed, because of its success in keeping actual patients alive.

"Very few AI technologies have actually been implemented into clinical settings yet. This is, to our knowledge, one of the first in Canada that has actually been implemented to help us care for patients every day in our hospital," he said.

'Real world' look at AI's health-care impact

The St. Michael's-based research does have limitations. The study took place during the COVID-19 pandemic, at a time when the health-care system faced an unusual set of challenges. The urban hospital's patient population is also distinct, the team acknowledged, given its high level of complex patients, including individuals facing homelessness, addiction and overlapping health issues.

"Our study was not a randomized controlled trial across multiple hospitals. It was within one organization, within one unit," Verma said. "So before we say that this tool can be used widely everywhere, I think we do need to do research on its use in multiple contexts."

Dr. John-Jose Nunez, a psychiatrist and researcher with the University of British Columbia who wasn't involved in the study, agreed the research needs to be replicated elsewhere to get a better sense of how well Chartwatch might work in other facilities. There also need to be considerations around patient privacy, he added, with the use of any emerging AI technologies.

Still, he praised the study team for providing a "real-world" example of how machine learning can improve patient care.

"I really think of AI tools as becoming one more team member on the clinical care team," he said.

The Unity Health team is hopeful their technology will roll out more widely in the future, within their own Toronto-based hospital network and beyond.

Much of that work is happening through GEMINI, Canada's largest hospital data-sharing network for research and analytics, said Mamdani, Unity Health's vice-president of data science.

More than 30 hospitals across Ontario are working together, he said, offering opportunities to test Chartwatch and other AI tools in various clinical settings and hospitals. 

"It just sets the groundwork now to be able to deploy these things well beyond our four walls," Mamdani said.

ABOUT THE AUTHOR

Senior Health & Medical Reporter

Lauren Pelley covers the global spread of infectious diseases, pandemic preparedness and the crucial intersection between health and climate change for CBC. She's a two-time Registered Nurses' Association of Ontario Media Award winner for in-depth health reporting in 2020 and 2022 and a silver medallist for best editorial newsletter at the 2024 Digital Publishing Awards for CBC Health's Second Opinion.
