• Open access
  • Published: 10 November 2020

Case study research for better evaluations of complex interventions: rationale and challenges

  • Sara Paparini (ORCID: orcid.org/0000-0002-1909-2481) 1,
  • Judith Green 2,
  • Chrysanthi Papoutsi 1,
  • Jamie Murdoch 3,
  • Mark Petticrew 4,
  • Trish Greenhalgh 1,
  • Benjamin Hanckel 5 &
  • Sara Shaw 1

BMC Medicine volume 18, Article number: 301 (2020)

The need for better methods for evaluation in health research has been widely recognised. The ‘complexity turn’ has drawn attention to the limitations of relying on causal inference from randomised controlled trials alone for understanding whether, and under which conditions, interventions in complex systems improve health services or the public health, and what mechanisms might link interventions and outcomes. We argue that case study research, currently denigrated as poor evidence, is an under-utilised resource not only for providing evidence about context and transferability, but also for helping strengthen causal inferences when pathways between intervention and effects are likely to be non-linear.

Case study research, as an overall approach, is based on in-depth explorations of complex phenomena in their natural, or real-life, settings. Empirical case studies typically enable dynamic understanding of complex challenges and provide evidence about causal mechanisms and the necessary and sufficient conditions (contexts) for intervention implementation and effects. This is essential evidence not just for researchers concerned about internal and external validity, but also for research users in policy and practice who need to know what the likely effects of complex programmes or interventions will be in their settings. The health sciences have much to learn from scholarship on case study methodology in the social sciences. However, there are multiple challenges in fully exploiting the potential learning from case study research. First, there are misconceptions that case study research can only provide exploratory or descriptive evidence. Second, there is little consensus about what a case study is, and considerable diversity in how empirical case studies are conducted and reported. Finally, as case study researchers typically (and appropriately) focus on thick description (that captures contextual detail), it can be challenging to identify the key messages related to intervention evaluation from case study reports.

Whilst the diversity of published case studies in health services and public health research is rich and productive, we recommend further clarity and specific methodological guidance for those reporting case study research for evaluation audiences.

The need for methodological development to address the most urgent challenges in health research has been well-documented. Many of the most pressing questions for public health research, where the focus is on system-level determinants [ 1 , 2 ], and for health services research, where provisions typically vary across sites and are provided through interlocking networks of services [ 3 ], require methodological approaches that can attend to complexity. The need for methodological advance has arisen, in part, as a result of the diminishing returns from randomised controlled trials (RCTs) where they have been used to answer questions about the effects of interventions in complex systems [ 4 , 5 , 6 ]. In conditions of complexity, there is limited value in maintaining the current orientation to experimental trial designs in the health sciences as providing ‘gold standard’ evidence of effect.

There are increasing calls for methodological pluralism [ 7 , 8 ], with the recognition that complex intervention and context are not easily or usefully separated (as is often the situation when using trial design), and that system interruptions may have effects that are not reducible to linear causal pathways between intervention and outcome. These calls are reflected in a shifting and contested discourse of trial design, seen with the emergence of realist [ 9 ], adaptive and hybrid (types 1, 2 and 3) [ 10 , 11 ] trials that blend studies of effectiveness with a close consideration of the contexts of implementation. Similarly, process evaluation has now become a core component of complex healthcare intervention trials, reflected in MRC guidance on how to explore implementation, causal mechanisms and context [ 12 ].

Evidence about the context of an intervention is crucial for questions of external validity. As Woolcock [ 4 ] notes, even if RCT designs are accepted as robust for maximising internal validity, questions of transferability (how well the intervention works in different contexts) and generalisability (how well the intervention can be scaled up) remain unanswered [ 5 , 13 ]. For research evidence to have impact on policy and systems organisation, and thus to improve population and patient health, there is an urgent need for better methods for strengthening external validity, including a better understanding of the relationship between intervention and context [ 14 ].

Policymakers, healthcare commissioners and other research users require credible evidence of relevance to their settings and populations [ 15 ], to perform what Rosengarten and Savransky [ 16 ] call ‘careful abstraction’ to the locales that matter for them. They also require robust evidence for understanding complex causal pathways. Case study research, currently under-utilised in public health and health services evaluation, can offer considerable potential for strengthening faith in both external and internal validity. For example, in an empirical case study of how the policy of free bus travel had specific health effects in London, UK, a quasi-experimental evaluation (led by JG) identified how important aspects of context (a good public transport system) and intervention (that it was universal) were necessary conditions for the observed effects, thus providing useful, actionable evidence for decision-makers in other contexts [ 17 ].

The overall approach of case study research is based on the in-depth exploration of complex phenomena in their natural, or ‘real-life’, settings. Empirical case studies typically enable dynamic understanding of complex challenges rather than restricting the focus to narrow problem delineations and simple fixes. Case study research is a diverse and somewhat contested field, with multiple definitions and perspectives grounded in different ways of viewing the world, and involving different combinations of methods. In this paper, we raise awareness of such plurality and highlight the contribution that case study research can make to the evaluation of complex system-level interventions. We review some of the challenges in exploiting the current evidence base from empirical case studies and conclude by recommending that further guidance and minimum reporting criteria for evaluation using case studies, appropriate for audiences in the health sciences, can enhance the take-up of evidence from case study research.

Case study research offers evidence about context, causal inference in complex systems and implementation

Well-conducted and described empirical case studies provide evidence on context, complexity and mechanisms for understanding how, where and why interventions have their observed effects. Recognition of the importance of context for understanding the relationships between interventions and outcomes is hardly new. In 1943, Canguilhem berated an over-reliance on experimental designs for determining universal physiological laws: ‘As if one could determine a phenomenon’s essence apart from its conditions! As if conditions were a mask or frame which changed neither the face nor the picture!’ ([ 18 ] p126). More recently, a concern with context has been expressed in health systems and public health research as part of what has been called the ‘complexity turn’ [ 1 ]: a recognition that many of the most enduring challenges for developing an evidence base require a consideration of system-level effects [ 1 ] and the conceptualisation of interventions as interruptions in systems [ 19 ].

The case study approach is widely recognised as offering an invaluable resource for understanding the dynamic and evolving influence of context on complex, system-level interventions [ 20 , 21 , 22 , 23 ]. Empirically, case studies can directly inform assessments of where, when, how and for whom interventions might be successfully implemented, by helping to specify the necessary and sufficient conditions under which interventions might have effects and to consolidate learning on how interdependencies, emergence and unpredictability can be managed to achieve and sustain desired effects. Case study research has the potential to address four objectives for improving research and reporting of context recently set out by guidance on taking account of context in population health research [ 24 ], that is to (1) improve the appropriateness of intervention development for specific contexts, (2) improve understanding of ‘how’ interventions work, (3) better understand how and why impacts vary across contexts and (4) ensure reports of intervention studies are most useful for decision-makers and researchers.

However, evaluations of complex healthcare interventions have arguably not exploited the full potential of case study research and can learn much from other disciplines. For evaluative research, exploratory case studies have had a traditional role of providing data on ‘process’, or initial ‘hypothesis-generating’ scoping, but might also have an increasing salience for explanatory aims. Across the social and political sciences, different kinds of case studies are undertaken to meet diverse aims (description, exploration or explanation) and across different scales (from small N qualitative studies that aim to elucidate processes, or provide thick description, to more systematic techniques designed for medium-to-large N cases).

Case studies with explanatory aims vary in terms of their positioning within mixed-methods projects, with designs including (but not restricted to) (1) single N of 1 studies of interventions in specific contexts, where the overall design is a case study that may incorporate one or more (randomised or not) comparisons over time and between variables within the case; (2) a series of cases conducted or synthesised to provide explanation from variations between cases; and (3) case studies of particular settings within RCT or quasi-experimental designs to explore variation in effects or implementation.

Detailed qualitative research (typically done as ‘case studies’ within process evaluations) provides evidence for the plausibility of mechanisms [ 25 ], offering theoretical generalisations for how interventions may function under different conditions. Although RCT designs reduce many threats to internal validity, the mechanisms of effect remain opaque, particularly when the causal pathways between ‘intervention’ and ‘effect’ are long and potentially non-linear: case study research has a more fundamental role here, in providing detailed observational evidence for causal claims [ 26 ] as well as producing a rich, nuanced picture of tensions and multiple perspectives [ 8 ].

Longitudinal or cross-case analysis may be best suited for evidence generation in system-level evaluative research. Turner [ 27 ], for instance, reflecting on the complex processes in major system change, has argued for the need for methods that integrate learning across cases, to develop theoretical knowledge that would enable inferences beyond the single case, and to develop generalisable theory about organisational and structural change in health systems. Qualitative Comparative Analysis (QCA) [ 28 ] is one such formal method for deriving causal claims, using set theory mathematics to integrate data from empirical case studies to answer questions about the configurations of causal pathways linking conditions to outcomes [ 29 , 30 ].
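To make the set-theoretic logic behind QCA more concrete, the following is a minimal sketch of a crisp-set analysis in Python. It is an illustration under stated assumptions, not the procedure of any study cited here: the case names, the two conditions (`good_transport`, `universal_offer`) and the 1/0 codings are all hypothetical, and the functions are written from scratch rather than taken from a QCA package. The sketch computes two standard consistency measures, the share of cases with a condition that also show the outcome (sufficiency) and the share of cases with the outcome that also show the condition (necessity), and groups cases into a simple truth table of configurations.

```python
from collections import defaultdict

# Hypothetical crisp-set data: each case is coded 1/0 on two conditions
# and on the outcome. Names and values are illustrative assumptions only.
cases = {
    "site_A": {"good_transport": 1, "universal_offer": 1, "outcome": 1},
    "site_B": {"good_transport": 1, "universal_offer": 0, "outcome": 0},
    "site_C": {"good_transport": 0, "universal_offer": 1, "outcome": 0},
    "site_D": {"good_transport": 1, "universal_offer": 1, "outcome": 1},
    "site_E": {"good_transport": 0, "universal_offer": 0, "outcome": 0},
}


def sufficiency_consistency(cases, condition, outcome="outcome"):
    """Share of cases showing the condition that also show the outcome."""
    with_condition = [c for c in cases.values() if c[condition] == 1]
    if not with_condition:
        return None  # condition never observed, so nothing can be said
    return sum(c[outcome] for c in with_condition) / len(with_condition)


def necessity_consistency(cases, condition, outcome="outcome"):
    """Share of cases showing the outcome that also show the condition."""
    with_outcome = [c for c in cases.values() if c[outcome] == 1]
    if not with_outcome:
        return None  # outcome never observed, so nothing can be said
    return sum(c[condition] for c in with_outcome) / len(with_outcome)


def truth_table(cases, conditions, outcome="outcome"):
    """Group cases by their configuration of conditions."""
    rows = defaultdict(list)
    for name, c in cases.items():
        configuration = tuple(c[k] for k in conditions)
        rows[configuration].append((name, c[outcome]))
    return dict(rows)


if __name__ == "__main__":
    for cond in ("good_transport", "universal_offer"):
        print(cond,
              "necessity:", necessity_consistency(cases, cond),
              "sufficiency:", sufficiency_consistency(cases, cond))
    print(truth_table(cases, ["good_transport", "universal_offer"]))
```

In this toy data set, each condition is present in every case with the outcome (necessity consistency of 1.0), but neither is sufficient on its own; only the configuration in which both are present is consistently linked to the outcome. Published QCA work typically adds calibration, frequency thresholds and logical minimisation, which this sketch omits.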

Nonetheless, the single N case study, too, provides opportunities for theoretical development [ 31 ], and theoretical generalisation or analytical refinement [ 32 ]. How ‘the case’ and ‘context’ are conceptualised is crucial here. Findings from the single case may seem to be confined to its intrinsic particularities in a specific and distinct context [ 33 ]. However, if such context is viewed as exemplifying wider social and political forces, the single case can be ‘telling’, rather than ‘typical’, and offer insight into a wider issue [ 34 ]. Internal comparisons within the case can offer rich possibilities for logical inferences about causation [ 17 ]. Further, case studies of any size can be used for theory testing through refutation [ 22 ]. The potential lies, then, in utilising the strengths and plurality of case study to support theory-driven research within different methodological paradigms.

Evaluation research in health has much to learn from a range of social sciences where case study methodology has been used to develop various kinds of causal inference. For instance, Gerring [ 35 ] expands on the within-case variations utilised to make causal claims. For Gerring [ 35 ], case studies come into their own with regard to invariant or strong causal claims (such as X is a necessary and/or sufficient condition for Y) rather than for probabilistic causal claims. For the latter (where experimental methods might have an advantage in estimating effect sizes), case studies offer evidence on mechanisms: from observations of X affecting Y, from process tracing or from pattern matching. Case studies also support the study of emergent causation, that is, the multiple interacting properties that account for particular and unexpected outcomes in complex systems, such as in healthcare [ 8 ].

Finally, efficacy (or beliefs about efficacy) is not the only contributor to intervention uptake, with a range of organisational and policy contingencies affecting whether an intervention is likely to be rolled out in practice. Case study research is, therefore, invaluable for learning about contextual contingencies and identifying the conditions necessary for interventions to become normalised (i.e. implemented routinely) in practice [ 36 ].

The challenges in exploiting evidence from case study research

At present, there are significant challenges in exploiting the benefits of case study research in evaluative health research, which relate to status, definition and reporting. Case study research has been marginalised at the bottom of an evidence hierarchy, seen to offer little by way of explanatory power, if nonetheless useful for adding descriptive data on process or providing useful illustrations for policymakers [ 37 ]. This is an opportune moment to revisit this low status. As health researchers are increasingly charged with evaluating ‘natural experiments’ (the use of face masks in the response to the COVID-19 pandemic being a recent example [ 38 ]) rather than interventions that take place in settings that can be controlled, research approaches that use methods to strengthen causal inference without requiring randomisation become more relevant.

A second challenge for improving the use of case study evidence in evaluative health research is that, as we have seen, what is meant by ‘case study’ varies widely, not only across but also within disciplines. There is indeed little consensus amongst methodologists as to how to define ‘a case study’. Definitions focus, variously, on small sample size or lack of control over the intervention (e.g. [ 39 ] p194), on in-depth study and context [ 40 , 41 ], on the logic of inference used [ 35 ] or on distinct research strategies which incorporate a number of methods to address questions of ‘how’ and ‘why’ [ 42 ]. Moreover, definitions developed for specific disciplines do not capture the range of ways in which case study research is carried out across disciplines. Multiple definitions of case study reflect the richness and diversity of the approach. However, evidence suggests that a lack of consensus across methodologists results in some of the limitations of published reports of empirical case studies [ 43 , 44 ]. Hyett and colleagues [ 43 ], for instance, reviewing reports in qualitative journals, found little match between methodological definitions of case study research and how authors used the term.

This raises the third challenge we identify: case study reports are typically not written in ways that are accessible or useful for the evaluation research community and policymakers. Case studies may not appear in journals widely read by those in the health sciences, either because space constraints preclude the reporting of rich, thick descriptions, or because of the reported lack of willingness of some biomedical journals to publish research that uses qualitative methods [ 45 ], signalling the persistence of the aforementioned evidence hierarchy. Where they do appear, however, the term ‘case study’ is used to indicate, interchangeably, a qualitative study, an N of 1 sample, or a multi-method, in-depth analysis of one example from a population of phenomena. Definitions of what constitutes the ‘case’ are frequently lacking, and the term appears to be used as a synonym for the settings in which the research is conducted. Despite offering insights for evaluation, the primary aims may not have been evaluative, so the implications may not be explicitly drawn out. Indeed, some case study reports might properly be aiming for thick description without necessarily seeking to inform about context or causality.

Acknowledging plurality and developing guidance

We recognise that definitional and methodological plurality is not only inevitable, but also a necessary and creative reflection of the very different epistemological and disciplinary origins of health researchers, and the aims they have in doing and reporting case study research. Indeed, to provide some clarity, Thomas [ 46 ] has suggested a typology of subject/purpose/approach/process for classifying aims (e.g. evaluative or exploratory), sample rationale and selection and methods for data generation of case studies. We also recognise that the diversity of methods used in case study research, and the necessary focus on narrative reporting, does not lend itself to straightforward development of formal quality or reporting criteria.

Existing checklists for reporting case study research from the social sciences—for example Lincoln and Guba’s [ 47 ] and Stake’s [ 33 ]—are primarily orientated to the quality of narrative produced, and the extent to which they encapsulate thick description, rather than the more pragmatic issues of implications for intervention effects. Those designed for clinical settings, such as the CARE (CAse REports) guidelines, provide specific reporting guidelines for medical case reports about single, or small groups of patients [ 48 ], not for case study research.

The Design of Case Study Research in Health Care (DESCARTE) model [ 44 ] suggests a series of questions to be asked of a case study researcher (including clarity about the philosophy underpinning their research), study design (with a focus on case definition) and analysis (to improve process). The model resembles toolkits for enhancing the quality and robustness of qualitative and mixed-methods research reporting, and it is usefully open-ended and non-prescriptive. However, even if it does include some reflections on context, the model does not fully address aspects of context, logic and causal inference that are perhaps most relevant for evaluative research in health.

Hence, for evaluative research where the aim is to report empirical findings in ways that are intended to be pragmatically useful for health policy and practice, this may be an opportune time to consider how to best navigate plurality around what is (minimally) important to report when publishing empirical case studies, especially with regard to the complex relationships between context and interventions, information that case study research is well placed to provide.

The conventional scientific quest for certainty, predictability and linear causality (maximised in RCT designs) has to be augmented by the study of uncertainty, unpredictability and emergent causality [ 8 ] in complex systems. This will require methodological pluralism, and openness to broadening the evidence base to better understand both causality in and the transferability of system change interventions [ 14 , 20 , 23 , 25 ]. Case study research evidence is essential, yet is currently under-exploited in the health sciences. If evaluative health research is to move beyond the current impasse on methods for understanding interventions as interruptions in complex systems, we need to consider in more detail how researchers can conduct and report empirical case studies which do aim to elucidate the contextual factors which interact with interventions to produce particular effects. To this end, supported by the UK’s Medical Research Council, we are embracing the challenge to develop guidance for case study researchers studying complex interventions. Following a meta-narrative review of the literature, we are planning a Delphi study to inform guidance that will, at minimum, cover the value of case study research for evaluating the interrelationship between context and complex system-level interventions; for situating and defining ‘the case’, and generalising from case studies; as well as provide specific guidance on conducting, analysing and reporting case study research. Our hope is that such guidance can support researchers evaluating interventions in complex systems to better exploit the diversity and richness of case study research.

Availability of data and materials

Not applicable (article based on existing available academic publications)


Qualitative comparative analysis

Quasi-experimental design

Randomised controlled trial

Diez Roux AV. Complex systems thinking and current impasses in health disparities research. Am J Public Health. 2011;101(9):1627–34.

Ogilvie D, Mitchell R, Mutrie N, Petticrew M, Platt S. Evaluating health effects of transport interventions: methodologic case study. Am J Prev Med. 2006;31:118–26.

Walshe C. The evaluation of complex interventions in palliative care: an exploration of the potential of case study research strategies. Palliat Med. 2011;25(8):774–81.

Woolcock M. Using case studies to explore the external validity of ‘complex’ development interventions. Evaluation. 2013;19:229–48.

Cartwright N. Are RCTs the gold standard? BioSocieties. 2007;2(1):11–20.

Deaton A, Cartwright N. Understanding and misunderstanding randomized controlled trials. Soc Sci Med. 2018;210:2–21.

Salway S, Green J. Towards a critical complex systems approach to public health. Crit Public Health. 2017;27(5):523–4.

Greenhalgh T, Papoutsi C. Studying complexity in health services research: desperately seeking an overdue paradigm shift. BMC Med. 2018;16(1):95.

Bonell C, Warren E, Fletcher A. Realist trials and the testing of context-mechanism-outcome configurations: a response to Van Belle et al. Trials. 2016;17:478.

Pallmann P, Bedding AW, Choodari-Oskooei B. Adaptive designs in clinical trials: why use them, and how to run and report them. BMC Med. 2018;16:29.

Curran G, Bauer M, Mittman B, Pyne J, Stetler C. Effectiveness-implementation hybrid designs: combining elements of clinical effectiveness and implementation research to enhance public health impact. Med Care. 2012;50(3):217–26. https://doi.org/10.1097/MLR.0b013e3182408812 .

Moore GF, Audrey S, Barker M, Bond L, Bonell C, Hardeman W, et al. Process evaluation of complex interventions: Medical Research Council guidance. BMJ. 2015 [cited 2020 Jun 27];350. Available from: https://www.bmj.com/content/350/bmj.h1258 .

Evans RE, Craig P, Hoddinott P, Littlecott H, Moore L, Murphy S, et al. When and how do ‘effective’ interventions need to be adapted and/or re-evaluated in new contexts? The need for guidance. J Epidemiol Community Health. 2019;73(6):481–2.

Shoveller J. A critical examination of representations of context within research on population health interventions. Crit Public Health. 2016;26(5):487–500.

Treweek S, Zwarenstein M. Making trials matter: pragmatic and explanatory trials and the problem of applicability. Trials. 2009;10(1):37.

Rosengarten M, Savransky M. A careful biomedicine? Generalization and abstraction in RCTs. Crit Public Health. 2019;29(2):181–91.

Green J, Roberts H, Petticrew M, Steinbach R, Goodman A, Jones A, et al. Integrating quasi-experimental and inductive designs in evaluation: a case study of the impact of free bus travel on public health. Evaluation. 2015;21(4):391–406.

Canguilhem G. The normal and the pathological. New York: Zone Books; 1991. (1949).

Hawe P, Shiell A, Riley T. Theorising interventions as events in systems. Am J Community Psychol. 2009;43:267–76.

King G, Keohane RO, Verba S. Designing social inquiry: scientific inference in qualitative research: Princeton University Press; 1994.

Greenhalgh T, Robert G, Macfarlane F, Bate P, Kyriakidou O. Diffusion of innovations in service organizations: systematic review and recommendations. Milbank Q. 2004;82(4):581–629.

Yin R. Enhancing the quality of case studies in health services research. Health Serv Res. 1999;34(5 Pt 2):1209.

Raine R, Fitzpatrick R, Barratt H, Bevan G, Black N, Boaden R, et al. Challenges, solutions and future directions in the evaluation of service innovations in health care and public health. Health Serv Deliv Res. 2016 [cited 2020 Jun 30];4(16). Available from: https://www.journalslibrary.nihr.ac.uk/hsdr/hsdr04160#/abstract .

Craig P, Di Ruggiero E, Frohlich KL, Mykhalovskiy E, White M, Group CCGA. Taking account of context in population health intervention research: guidance for producers, users and funders of research. NIHR Evaluation, Trials and Studies Coordinating Centre; 2018.

Grant RL, Hood R. Complex systems, explanation and policy: implications of the crisis of replication for public health research. Crit Public Health. 2017;27(5):525–32.

Mahoney J. Strategies of causal inference in small-N analysis. Sociol Methods Res. 2000;4:387–424.

Turner S. Major system change: a management and organisational research perspective. In: Raine R, Fitzpatrick R, Barratt H, Bevan G, Black N, Boaden R, et al. Challenges, solutions and future directions in the evaluation of service innovations in health care and public health. Health Serv Deliv Res. 2016;4(16). https://doi.org/10.3310/hsdr04160.

Ragin CC. Using qualitative comparative analysis to study causal complexity. Health Serv Res. 1999;34(5 Pt 2):1225.

Hanckel B, Petticrew M, Thomas J, Green J. Protocol for a systematic review of the use of qualitative comparative analysis for evaluative questions in public health research. Syst Rev. 2019;8(1):252.

Schneider CQ, Wagemann C. Set-theoretic methods for the social sciences: a guide to qualitative comparative analysis: Cambridge University Press; 2012. 369 p.

Flyvbjerg B. Five misunderstandings about case-study research. Qual Inq. 2006;12:219–45.

Tsoukas H. Craving for generality and small-N studies: a Wittgensteinian approach towards the epistemology of the particular in organization and management studies. Sage Handb Organ Res Methods. 2009:285–301.

Stake RE. The art of case study research. London: Sage Publications Ltd; 1995.

Mitchell JC. Typicality and the case study. In: Ethnographic research: a guide to general conduct. 1984. p. 238–41.

Gerring J. What is a case study and what is it good for? Am Polit Sci Rev. 2004;98(2):341–54.

May C, Mort M, Williams T, Mair F, Gask L. Health technology assessment in its local contexts: studies of telehealthcare. Soc Sci Med. 2003;57:697–710.

McGill E. Trading quality for relevance: non-health decision-makers’ use of evidence on the social determinants of health. BMJ Open. 2015;5(4):007053.

Greenhalgh T. We can’t be 100% sure face masks work – but that shouldn’t stop us wearing them | Trish Greenhalgh. The Guardian. 2020 [cited 2020 Jun 27]; Available from: https://www.theguardian.com/commentisfree/2020/jun/05/face-masks-coronavirus .

Hammersley M. So, what are case studies? In: What’s wrong with ethnography? New York: Routledge; 1992.

Crowe S, Cresswell K, Robertson A, Huby G, Avery A, Sheikh A. The case study approach. BMC Med Res Methodol. 2011;11(1):100.

Luck L, Jackson D, Usher K. Case study: a bridge across the paradigms. Nurs Inq. 2006;13(2):103–9.

Yin RK. Case study research and applications: design and methods: Sage; 2017.

Hyett N, Kenny A, Dickson-Swift V. Methodology or method? A critical review of qualitative case study reports. Int J Qual Stud Health Well-Being. 2014;9:23606.

Carolan CM, Forbat L, Smith A. Developing the DESCARTE model: the design of case study research in health care. Qual Health Res. 2016;26(5):626–39.

Greenhalgh T, Annandale E, Ashcroft R, Barlow J, Black N, Bleakley A, et al. An open letter to the BMJ editors on qualitative research. BMJ. 2016;352.

Thomas G. A typology for the case study in social science following a review of definition, discourse, and structure. Qual Inq. 2011;17(6):511–21.

Lincoln YS, Guba EG. Judging the quality of case study reports. Int J Qual Stud Educ. 1990;3(1):53–9.

Riley DS, Barber MS, Kienle GS, Aronson JK, Schoen-Angerer T, Tugwell P, et al. CARE guidelines for case reports: explanation and elaboration document. J Clin Epidemiol. 2017;89:218–35.


Not applicable

This work was funded by the Medical Research Council - MRC Award MR/S014632/1 HCS: Case study, Context and Complex interventions (TRIPLE C). SP was additionally funded by the University of Oxford's Higher Education Innovation Fund (HEIF).

Author information

Authors and affiliations.

Nuffield Department of Primary Care Health Sciences, University of Oxford, Oxford, UK

Sara Paparini, Chrysanthi Papoutsi, Trish Greenhalgh & Sara Shaw

Wellcome Centre for Cultures & Environments of Health, University of Exeter, Exeter, UK

Judith Green

School of Health Sciences, University of East Anglia, Norwich, UK

Jamie Murdoch

Public Health, Environments and Society, London School of Hygiene & Tropical Medicine, London, UK

Mark Petticrew

Institute for Culture and Society, Western Sydney University, Penrith, Australia

Benjamin Hanckel


JG, MP, SP, JM, TG, CP and SS drafted the initial paper; all authors contributed to the drafting of the final version, and read and approved the final manuscript.

Corresponding author

Correspondence to Sara Paparini .

Ethics declarations

Ethics approval and consent to participate, consent for publication, competing interests.

The authors declare that they have no competing interests.

Additional information

Publisher’s note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ . The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated in a credit line to the data.


About this article

Cite this article.

Paparini, S., Green, J., Papoutsi, C. et al. Case study research for better evaluations of complex interventions: rationale and challenges. BMC Med 18 , 301 (2020). https://doi.org/10.1186/s12916-020-01777-6


Received : 03 July 2020

Accepted : 07 September 2020

Published : 10 November 2020

DOI : https://doi.org/10.1186/s12916-020-01777-6


  • Qualitative
  • Case studies
  • Mixed-method
  • Public health
  • Health services research
  • Interventions

BMC Medicine

ISSN: 1741-7015


  • Open access
  • Published: 27 June 2011

The case study approach

  • Sarah Crowe 1 ,
  • Kathrin Cresswell 2 ,
  • Ann Robertson 2 ,
  • Guro Huby 3 ,
  • Anthony Avery 1 &
  • Aziz Sheikh 2  

BMC Medical Research Methodology volume 11, Article number: 100 (2011)

788k Accesses

1063 Citations

37 Altmetric


The case study approach allows in-depth, multi-faceted explorations of complex issues in their real-life settings. The value of the case study approach is well recognised in the fields of business, law and policy, but somewhat less so in health services research. Based on our experiences of conducting several health-related case studies, we reflect on the different types of case study design, the specific research questions this approach can help answer, the data sources that tend to be used, and the particular advantages and disadvantages of employing this methodological approach. The paper concludes with key pointers to aid those designing and appraising proposals for conducting case study research, and a checklist to help readers assess the quality of case study reports.



The case study approach is particularly useful to employ when there is a need to obtain an in-depth appreciation of an issue, event or phenomenon of interest, in its natural real-life context. Our aim in writing this piece is to provide insights into when to consider employing this approach and an overview of key methodological considerations in relation to the design, planning, analysis, interpretation and reporting of case studies.

The illustrative 'grand round', 'case report' and 'case series' have a long tradition in clinical practice and research. Presenting detailed critiques, typically of one or more patients, aims to provide insights into aspects of the clinical case and, in doing so, illustrate broader lessons that may be learnt. In research, the conceptually-related case study approach can be used, for example, to describe in detail a patient's episode of care, explore professional attitudes to and experiences of a new policy initiative or service development or more generally to 'investigate contemporary phenomena within its real-life context' [ 1 ]. Based on our experiences of conducting a range of case studies, we reflect on when to consider using this approach, discuss the key steps involved and illustrate, with examples, some of the practical challenges of attaining an in-depth understanding of a 'case' as an integrated whole. In keeping with previously published work, we acknowledge the importance of theory to underpin the design, selection, conduct and interpretation of case studies[ 2 ]. In so doing, we make passing reference to the different epistemological approaches used in case study research by key theoreticians and methodologists in this field of enquiry.

This paper is structured around the following main questions: What is a case study? What are case studies used for? How are case studies conducted? What are the potential pitfalls and how can these be avoided? We draw in particular on four of our own recently published examples of case studies (see Tables 1 , 2 , 3 and 4 ) and those of others to illustrate our discussion[ 3 – 7 ].

What is a case study?

A case study is a research approach that is used to generate an in-depth, multi-faceted understanding of a complex issue in its real-life context. It is an established research design that is used extensively in a wide variety of disciplines, particularly in the social sciences. A case study can be defined in a variety of ways (Table 5 ), the central tenet being the need to explore an event or phenomenon in depth and in its natural context. It is for this reason sometimes referred to as a "naturalistic" design; this is in contrast to an "experimental" design (such as a randomised controlled trial) in which the investigator seeks to exert control over and manipulate the variable(s) of interest.

Stake's work has been particularly influential in defining the case study approach to scientific enquiry. He has helpfully characterised three main types of case study: intrinsic , instrumental and collective [ 8 ]. An intrinsic case study is typically undertaken to learn about a unique phenomenon. The researcher should define the uniqueness of the phenomenon, which distinguishes it from all others. In contrast, the instrumental case study uses a particular case (some of which may be better than others) to gain a broader appreciation of an issue or phenomenon. The collective case study involves studying multiple cases simultaneously or sequentially in an attempt to generate a still broader appreciation of a particular issue.

These are, however, not necessarily mutually exclusive categories. In the first of our examples (Table 1 ), we undertook an intrinsic case study to investigate the issue of recruitment of minority ethnic people into the specific context of asthma research studies, but it developed into an instrumental case study through seeking to understand the issue of recruitment of these marginalised populations more generally, generating a number of findings that are potentially transferable to other disease contexts[ 3 ]. In contrast, the other three examples (see Tables 2 , 3 and 4 ) employed collective case study designs to study the introduction of workforce reconfiguration in primary care, the implementation of electronic health records into hospitals, and the ways in which healthcare students learn about patient safety considerations[ 4 – 6 ]. Although our study focusing on the introduction of General Practitioners with Specialist Interests (Table 2 ) was explicitly collective in design (four contrasting primary care organisations were studied), it was also instrumental in that this particular professional group was studied as an exemplar of the more general phenomenon of workforce redesign[ 4 ].

What are case studies used for?

According to Yin, case studies can be used to explain, describe or explore events or phenomena in the everyday contexts in which they occur[ 1 ]. They can, for example, help to understand and explain causal links and pathways resulting from a new policy initiative or service development (see Tables 2 and 3 )[ 1 ]. In contrast to experimental designs, which seek to test a specific hypothesis through deliberately manipulating the environment (as, for example, in a randomised controlled trial in which a new drug is given to randomly selected individuals and outcomes are then compared with controls),[ 9 ] the case study approach lends itself well to capturing information on more explanatory 'how', 'what' and 'why' questions, such as 'how is the intervention being implemented and received on the ground?'. The case study approach can offer additional insights into what gaps exist in its delivery or why one implementation strategy might be chosen over another. This in turn can help develop or refine theory, as shown in our study of the teaching of patient safety in undergraduate curricula (Table 4 )[ 6 , 10 ]. Key questions to consider when selecting the most appropriate study design are whether it is desirable, or indeed possible, to undertake a formal experimental investigation in which individuals and/or organisations are allocated to an intervention or control arm, or whether the wish is to obtain a more naturalistic understanding of an issue. The former is ideally studied using a controlled experimental design, whereas the latter is more appropriately studied using a case study design.

Case studies may be approached in different ways depending on the epistemological standpoint of the researcher, that is, whether they take a critical (questioning one's own and others' assumptions), interpretivist (trying to understand individual and shared social meanings) or positivist approach (orientating towards the criteria of natural sciences, such as focusing on generalisability considerations) (Table 6 ). Whilst such a schema can be conceptually helpful, it may be appropriate to draw on more than one approach in any case study, particularly in the context of conducting health services research. Doolin has, for example, noted that in the context of undertaking interpretative case studies, researchers can usefully draw on a critical, reflective perspective which seeks to take into account the wider social and political environment that has shaped the case[ 11 ].

How are case studies conducted?

Here, we focus on the main stages of research activity when planning and undertaking a case study; the crucial stages are: defining the case; selecting the case(s); collecting and analysing the data; interpreting data; and reporting the findings.

Defining the case

Carefully formulated research question(s), informed by the existing literature and a prior appreciation of the theoretical issues and setting(s), are all important in appropriately and succinctly defining the case[ 8 , 12 ]. Crucially, each case should have a pre-defined boundary which clarifies the nature and time period covered by the case study (i.e. its scope, beginning and end), the relevant social group, organisation or geographical area of interest to the investigator, the types of evidence to be collected, and the priorities for data collection and analysis (see Table 7 )[ 1 ]. A theory driven approach to defining the case may help generate knowledge that is potentially transferable to a range of clinical contexts and behaviours; using theory is also likely to result in a more informed appreciation of, for example, how and why interventions have succeeded or failed[ 13 ].
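As a purely illustrative aid, the boundary elements listed above can be written down as a simple record that makes gaps in a draft case definition visible. The sketch below is a hypothetical planning helper, not part of the authors' method: the class name `CaseDefinition`, its field names and the example values are all assumptions introduced here.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class CaseDefinition:
    """Hypothetical record of a case boundary; all field names are illustrative."""
    research_questions: List[str]
    unit_of_interest: str            # social group, organisation or geographical area
    scope: str                       # the nature of what the case covers
    start: Optional[date] = None     # beginning of the period covered, if known
    end: Optional[date] = None       # end of the period covered, if known
    evidence_types: List[str] = field(default_factory=list)
    priorities: List[str] = field(default_factory=list)

    def missing_elements(self) -> List[str]:
        """Return the names of boundary elements that are still undefined or empty."""
        names = ("research_questions", "unit_of_interest", "scope",
                 "start", "end", "evidence_types", "priorities")
        return [name for name in names if not getattr(self, name)]


# Example values are invented placeholders, not data from any study.
draft = CaseDefinition(
    research_questions=["How is the intervention being implemented and received?"],
    unit_of_interest="one hospital organisation",
    scope="implementation of a new record system",
    evidence_types=["interviews", "observations", "documents"],
)
print(draft.missing_elements())
```

Leaving `start` and `end` optional is a deliberate choice in this sketch: the time period covered by a case may only become clear as the study proceeds, so the record flags it as undefined rather than forcing a guess.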

For example, in our evaluation of the introduction of electronic health records in English hospitals (Table 3 ), we defined our cases as the NHS Trusts that were receiving the new technology[ 5 ]. Our focus was on how the technology was being implemented. However, if the primary research interest had been on the social and organisational dimensions of implementation, we might have defined our case differently as a grouping of healthcare professionals (e.g. doctors and/or nurses). The precise beginning and end of the case may however prove difficult to define. Pursuing this same example, when does the process of implementation and adoption of an electronic health record system really begin or end? Such judgements will inevitably be influenced by a range of factors, including the research question, theory of interest, the scope and richness of the gathered data and the resources available to the research team.

Selecting the case(s)

The decision on how to select the case(s) to study is a very important one that merits some reflection. In an intrinsic case study, the case is selected on its own merits[ 8 ]. The case is selected not because it is representative of other cases, but because of its uniqueness, which is of genuine interest to the researchers. This was, for example, the case in our study of the recruitment of minority ethnic participants into asthma research (Table 1 ) as our earlier work had demonstrated the marginalisation of minority ethnic people with asthma, despite evidence of disproportionate asthma morbidity[ 14 , 15 ]. In another example of an intrinsic case study, Hellstrom et al.[ 16 ] studied an elderly married couple living with dementia to explore how dementia had impacted on their understanding of home, their everyday life and their relationships.

For an instrumental case study, selecting a "typical" case can work well[ 8 ]. In contrast to the intrinsic case study, the particular case which is chosen is of less importance than selecting a case that allows the researcher to investigate an issue or phenomenon. For example, in order to gain an understanding of doctors' responses to health policy initiatives, Som undertook an instrumental case study interviewing clinicians who had a range of responsibilities for clinical governance in one NHS acute hospital trust[ 17 ]. Sampling a "deviant" or "atypical" case may however prove even more informative, potentially enabling the researcher to identify causal processes, generate hypotheses and develop theory.

In collective or multiple case studies, a number of cases are carefully selected. This offers the advantage of allowing comparisons to be made across several cases and/or replication. Choosing a "typical" case may enable the findings to be generalised to theory (i.e. analytical generalisation) or to test theory by replicating the findings in a second or even a third case (i.e. replication logic)[ 1 ]. Yin suggests two or three literal replications (i.e. predicting similar results) if the theory is straightforward and five or more if the theory is more subtle. However, critics might argue that selecting 'cases' in this way is insufficiently reflexive and ill-suited to the complexities of contemporary healthcare organisations.

The selected case study site(s) should allow the research team access to the group of individuals, the organisation, the processes or whatever else constitutes the chosen unit of analysis for the study. Access is therefore a central consideration; the researcher needs to come to know the case study site(s) well and to work cooperatively with them. Selected cases need to be not only interesting but also hospitable to the inquiry [ 8 ] if they are to be informative and answer the research question(s). Case study sites may also be pre-selected for the researcher, with decisions being influenced by key stakeholders. For example, our selection of case study sites in the evaluation of the implementation and adoption of electronic health record systems (see Table 3 ) was heavily influenced by NHS Connecting for Health, the government agency that was responsible for overseeing the National Programme for Information Technology (NPfIT)[ 5 ]. This prominent stakeholder had already selected the NHS sites (through a competitive bidding process) to be early adopters of the electronic health record systems and had negotiated contracts that detailed the deployment timelines.

It is also important to consider in advance the likely burden and risks associated with participation for those who (or the site(s) which) comprise the case study. Of particular importance is the obligation for the researcher to think through the ethical implications of the study (e.g. the risk of inadvertently breaching anonymity or confidentiality) and to ensure that potential participants/participating sites are provided with sufficient information to make an informed choice about joining the study. The outcome of providing this information might be that the emotive burden associated with participation, or the organisational disruption associated with supporting the fieldwork, is considered so high that the individuals or sites decide against participation.

In our example of evaluating implementations of electronic health record systems, given the restricted number of early adopter sites available to us, we sought purposively to select a diverse range of implementation cases among those that were available[ 5 ]. We chose a mixture of teaching, non-teaching and Foundation Trust hospitals, and examples of each of the three electronic health record systems procured centrally by the NPfIT. At one recruited site, it quickly became apparent that access was problematic because of competing demands on that organisation. Recognising the importance of full access and co-operative working for generating rich data, the research team decided not to pursue work at that site and instead to focus on other recruited sites.

Collecting the data

In order to develop a thorough understanding of the case, the case study approach usually involves the collection of multiple sources of evidence, using a range of quantitative (e.g. questionnaires, audits and analysis of routinely collected healthcare data) and more commonly qualitative techniques (e.g. interviews, focus groups and observations). The use of multiple sources of data (data triangulation) has been advocated as a way of increasing the internal validity of a study (i.e. the extent to which the method is appropriate to answer the research question)[ 8 , 18 – 21 ]. An underlying assumption is that data collected in different ways should lead to similar conclusions, and approaching the same issue from different angles can help develop a holistic picture of the phenomenon (Table 2 )[ 4 ].

Brazier and colleagues used a mixed-methods case study approach to investigate the impact of a cancer care programme[ 22 ]. Here, quantitative measures were collected with questionnaires before, and five months after, the start of the intervention; these did not yield any statistically significant results. Qualitative interviews with patients, however, helped provide an insight into potentially beneficial process-related aspects of the programme, such as greater perceived patient involvement in care. The authors reported how this case study approach identified a number of contextual factors likely to influence the effectiveness of the intervention, which were not likely to have been obtained from quantitative methods alone.

In collective or multiple case studies, data collection needs to be flexible enough to allow a detailed description of each individual case to be developed (e.g. the nature of different cancer care programmes), before considering the emerging similarities and differences in cross-case comparisons (e.g. to explore why one programme is more effective than another). It is important that data sources from different cases are, where possible, broadly comparable for this purpose even though they may vary in nature and depth.

Analysing, interpreting and reporting case studies

Making sense and offering a coherent interpretation of the typically disparate sources of data (whether qualitative alone or together with quantitative) is far from straightforward. Repeated reviewing and sorting of the voluminous and detail-rich data are integral to the process of analysis. In collective case studies, it is helpful to analyse data relating to the individual component cases first, before making comparisons across cases. Attention needs to be paid to variations within each case and, where relevant, the relationship between different causes, effects and outcomes[ 23 ]. Data will need to be organised and coded to allow the key issues, both derived from the literature and emerging from the dataset, to be easily retrieved at a later stage. An initial coding frame can help capture these issues and can be applied systematically to the whole dataset with the aid of a qualitative data analysis software package.
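To make the idea of organising coded data for later retrieval concrete, the following minimal sketch indexes analyst-coded excerpts by code and allows retrieval within a single case or across cases. It does not reproduce any particular qualitative data analysis package; the names `Segment`, `build_index` and `retrieve`, and all example data, are hypothetical placeholders. Codes are assumed to have been assigned by the researcher, not generated automatically.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Optional, Tuple


class Segment(NamedTuple):
    case: str                # the case the excerpt belongs to
    source: str              # e.g. interview, observation, document
    text: str                # the excerpt itself (placeholder text here)
    codes: Tuple[str, ...]   # codes assigned by the analyst from the coding frame


def build_index(segments: List[Segment]) -> Dict[str, List[Segment]]:
    """Group segments under every code they carry."""
    index: Dict[str, List[Segment]] = defaultdict(list)
    for seg in segments:
        for code in seg.codes:
            index[code].append(seg)
    return dict(index)


def retrieve(index: Dict[str, List[Segment]], code: str,
             case: Optional[str] = None) -> List[Segment]:
    """Return segments with the given code, optionally limited to one case."""
    return [s for s in index.get(code, []) if case is None or s.case == case]


# Invented placeholder data for illustration only.
segments = [
    Segment("Site A", "interview", "placeholder excerpt 1", ("access", "workload")),
    Segment("Site B", "observation", "placeholder excerpt 2", ("workload",)),
    Segment("Site A", "document", "placeholder excerpt 3", ("training",)),
]
index = build_index(segments)
within_case = retrieve(index, "workload", case="Site A")   # one case first
across_cases = retrieve(index, "workload")                 # then across cases
print(len(within_case), len(across_cases))
```

Retrieving within a case before retrieving across cases mirrors the order of analysis suggested above for collective case studies.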

The Framework approach is a practical approach to managing and analysing large datasets, comprising five stages (familiarisation; identifying a thematic framework; indexing; charting; mapping and interpretation). It is particularly useful if time is limited, as was the case in our study of recruitment of South Asians into asthma research (Table 1 )[ 3 , 24 ]. Theoretical frameworks may also play an important role in integrating different sources of data and examining emerging themes. For example, we drew on a socio-technical framework to help explain the connections between different elements - technology; people; and the organisational settings within which they worked - in our study of the introduction of electronic health record systems (Table 3 )[ 5 ]. Our study of patient safety in undergraduate curricula drew on an evaluation-based approach to design and analysis, which emphasised the importance of the academic, organisational and practice contexts through which students learn (Table 4 )[ 6 ].
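The charting stage of the Framework approach can be pictured as a matrix with one row per case (or participant) and one column per theme, each cell holding summarised data. The sketch below illustrates only that one stage under stated assumptions: the function names `build_chart` and `empty_cells`, the theme labels and the summaries are hypothetical, and the code is not the procedure used in the studies described here.

```python
from typing import Dict, List, Tuple

Chart = Dict[str, Dict[str, List[str]]]   # case -> theme -> summaries


def build_chart(themes: List[str], entries: List[Tuple[str, str, str]]) -> Chart:
    """Arrange (case, theme, summary) entries into a case-by-theme matrix."""
    chart: Chart = {}
    for case, theme, summary in entries:
        if theme not in themes:
            raise ValueError(f"unknown theme: {theme}")
        # Every row starts with an empty cell for each theme in the framework.
        row = chart.setdefault(case, {t: [] for t in themes})
        row[theme].append(summary)
    return chart


def empty_cells(chart: Chart) -> List[Tuple[str, str]]:
    """List (case, theme) cells with no charted data, e.g. to prompt a re-read."""
    return [(case, theme)
            for case, row in chart.items()
            for theme, cell in row.items()
            if not cell]


# Invented placeholder themes and summaries for illustration only.
themes = ["barriers", "facilitators"]
entries = [
    ("Participant 1", "barriers", "placeholder summary a"),
    ("Participant 1", "facilitators", "placeholder summary b"),
    ("Participant 2", "barriers", "placeholder summary c"),
]
chart = build_chart(themes, entries)
print(empty_cells(chart))
```

Keeping empty cells visible is a design choice in this sketch: an absent entry can itself be informative when comparing cases during mapping and interpretation.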

Case study findings can have implications both for theory development and theory testing. They may establish, strengthen or weaken historical explanations of a case and, in certain circumstances, allow theoretical (as opposed to statistical) generalisation beyond the particular cases studied[ 12 ]. These theoretical lenses should not, however, constitute a strait-jacket and the cases should not be "forced to fit" the particular theoretical framework that is being employed.

When reporting findings, it is important to provide the reader with enough contextual information to understand the processes that were followed and how the conclusions were reached. In a collective case study, researchers may choose to present the findings from individual cases separately before amalgamating across cases. Care must be taken to ensure the anonymity of both case sites and individual participants (if agreed in advance) by allocating appropriate codes or withholding descriptors. In the example given in Table 3 , we decided against providing detailed information on the NHS sites and individual participants in order to avoid the risk of inadvertent disclosure of identities[ 5 , 25 ].

What are the potential pitfalls and how can these be avoided?

The case study approach is, as with all research, not without its limitations. When investigating the formal and informal ways undergraduate students learn about patient safety (Table 4 ), for example, we rapidly accumulated a large quantity of data. The volume of data, together with the time restrictions in place, impacted on the depth of analysis that was possible within the available resources. This highlights a more general point of the importance of avoiding the temptation to collect as much data as possible; adequate time also needs to be set aside for data analysis and interpretation of what are often highly complex datasets.

Case study research has sometimes been criticised for lacking scientific rigour and providing little basis for generalisation (i.e. producing findings that may be transferable to other settings)[ 1 ]. There are several ways to address these concerns, including: the use of theoretical sampling (i.e. drawing on a particular conceptual framework); respondent validation (i.e. participants checking emerging findings and the researcher's interpretation, and providing an opinion as to whether they feel these are accurate); and transparency throughout the research process (see Table 8 )[ 8 , 18 – 21 , 23 , 26 ]. Transparency can be achieved by describing in detail the steps involved in case selection, data collection, the reasons for the particular methods chosen, and the researcher's background and level of involvement (i.e. being explicit about how the researcher has influenced data collection and interpretation). Seeking potential alternative explanations, and being explicit about how interpretations and conclusions were reached, helps readers to judge the trustworthiness of the case study report. Stake provides a critique checklist for a case study report (Table 9 )[ 8 ].


The case study approach allows, amongst other things, critical events, interventions, policy developments and programme-based service reforms to be studied in detail in a real-life context. It should therefore be considered when an experimental design is either inappropriate to answer the research questions posed or impossible to undertake. Considering the frequency with which implementations of innovations are now taking place in healthcare settings and how well the case study approach lends itself to in-depth, complex health service research, we believe this approach should be more widely considered by researchers. Though inherently challenging, the research case study can, if carefully conceptualised and thoughtfully undertaken and reported, yield powerful insights into many important aspects of health and healthcare delivery.

Yin RK: Case study research, design and method. 4th edition. 2009, London: Sage Publications Ltd.


Keen J, Packwood T: Qualitative research; case study evaluation. BMJ. 1995, 311: 444-446.


Sheikh A, Halani L, Bhopal R, Netuveli G, Partridge M, Car J, et al: Facilitating the Recruitment of Minority Ethnic People into Research: Qualitative Case Study of South Asians and Asthma. PLoS Med. 2009, 6 (10): 1-11.


Pinnock H, Huby G, Powell A, Kielmann T, Price D, Williams S, et al: The process of planning, development and implementation of a General Practitioner with a Special Interest service in Primary Care Organisations in England and Wales: a comparative prospective case study. Report for the National Co-ordinating Centre for NHS Service Delivery and Organisation R&D (NCCSDO). 2008, [ http://www.sdo.nihr.ac.uk/files/project/99-final-report.pdf ]

Robertson A, Cresswell K, Takian A, Petrakaki D, Crowe S, Cornford T, et al: Prospective evaluation of the implementation and adoption of NHS Connecting for Health's national electronic health record in secondary care in England: interim findings. BMJ. 2010, 341: c4564.

Pearson P, Steven A, Howe A, Sheikh A, Ashcroft D, Smith P, the Patient Safety Education Study Group: Learning about patient safety: organisational context and culture in the education of healthcare professionals. J Health Serv Res Policy. 2010, 15: 4-10. 10.1258/jhsrp.2009.009052.


van Harten WH, Casparie TF, Fisscher OA: The evaluation of the introduction of a quality management system: a process-oriented case study in a large rehabilitation hospital. Health Policy. 2002, 60 (1): 17-37. 10.1016/S0168-8510(01)00187-7.

Stake RE: The art of case study research. 1995, London: Sage Publications Ltd.

Sheikh A, Smeeth L, Ashcroft R: Randomised controlled trials in primary care: scope and application. Br J Gen Pract. 2002, 52 (482): 746-51.


King G, Keohane R, Verba S: Designing Social Inquiry. 1996, Princeton: Princeton University Press

Doolin B: Information technology as disciplinary technology: being critical in interpretative research on information systems. Journal of Information Technology. 1998, 13: 301-311. 10.1057/jit.1998.8.

George AL, Bennett A: Case studies and theory development in the social sciences. 2005, Cambridge, MA: MIT Press

Eccles M, the Improved Clinical Effectiveness through Behavioural Research Group (ICEBeRG): Designing theoretically-informed implementation interventions. Implementation Science. 2006, 1: 1-8. 10.1186/1748-5908-1-1.


Netuveli G, Hurwitz B, Levy M, Fletcher M, Barnes G, Durham SR, Sheikh A: Ethnic variations in UK asthma frequency, morbidity, and health-service use: a systematic review and meta-analysis. Lancet. 2005, 365 (9456): 312-7.

Sheikh A, Panesar SS, Lasserson T, Netuveli G: Recruitment of ethnic minorities to asthma studies. Thorax. 2004, 59 (7): 634.

Hellström I, Nolan M, Lundh U: 'We do things together': A case study of 'couplehood' in dementia. Dementia. 2005, 4: 7-22. 10.1177/1471301205049188.

Som CV: Nothing seems to have changed, nothing seems to be changing and perhaps nothing will change in the NHS: doctors' response to clinical governance. International Journal of Public Sector Management. 2005, 18: 463-477. 10.1108/09513550510608903.

Lincoln Y, Guba E: Naturalistic inquiry. 1985, Newbury Park: Sage Publications

Barbour RS: Checklists for improving rigour in qualitative research: a case of the tail wagging the dog?. BMJ. 2001, 322: 1115-1117. 10.1136/bmj.322.7294.1115.

Mays N, Pope C: Qualitative research in health care: Assessing quality in qualitative research. BMJ. 2000, 320: 50-52. 10.1136/bmj.320.7226.50.

Mason J: Qualitative researching. 2002, London: Sage

Brazier A, Cooke K, Moravan V: Using Mixed Methods for Evaluating an Integrative Approach to Cancer Care: A Case Study. Integr Cancer Ther. 2008, 7: 5-17. 10.1177/1534735407313395.

Miles MB, Huberman M: Qualitative data analysis: an expanded sourcebook. 2nd edition. 1994, CA: Sage Publications Inc.

Pope C, Ziebland S, Mays N: Analysing qualitative data. Qualitative research in health care. BMJ. 2000, 320: 114-116. 10.1136/bmj.320.7227.114.

Cresswell KM, Worth A, Sheikh A: Actor-Network Theory and its role in understanding the implementation of information technology developments in healthcare. BMC Med Inform Decis Mak. 2010, 10 (1): 67. 10.1186/1472-6947-10-67.

Malterud K: Qualitative research: standards, challenges, and guidelines. Lancet. 2001, 358: 483-488. 10.1016/S0140-6736(01)05627-6.


Yin R: Case study research: design and methods. 2nd edition. 1994, Thousand Oaks, CA: Sage Publishing

Yin R: Enhancing the quality of case studies in health services research. Health Serv Res. 1999, 34: 1209-1224.

Green J, Thorogood N: Qualitative methods for health research. 2nd edition. 2009, Los Angeles: Sage

Howcroft D, Trauth E: Handbook of Critical Information Systems Research, Theory and Application. 2005, Cheltenham, UK: Northampton, MA, USA: Edward Elgar


Blaikie N: Approaches to Social Enquiry. 1993, Cambridge: Polity Press

Doolin B: Power and resistance in the implementation of a medical management information system. Info Systems J. 2004, 14: 343-362. 10.1111/j.1365-2575.2004.00176.x.

Bloomfield BP, Best A: Management consultants: systems development, power and the translation of problems. Sociological Review. 1992, 40: 533-560.

Shanks G, Parr A: Positivist, single case study research in information systems: A critical analysis. Proceedings of the European Conference on Information Systems. 2003, Naples

Pre-publication history

The pre-publication history for this paper can be accessed here: http://www.biomedcentral.com/1471-2288/11/100/prepub



We are grateful to the participants and colleagues who contributed to the individual case studies that we have drawn on. This work received no direct funding, but it has been informed by projects funded by Asthma UK, the NHS Service Delivery Organisation, NHS Connecting for Health Evaluation Programme, and Patient Safety Research Portfolio. We would also like to thank the expert reviewers for their insightful and constructive feedback. Our thanks are also due to Dr. Allison Worth who commented on an earlier draft of this manuscript.

Author information

Authors and affiliations.

Division of Primary Care, The University of Nottingham, Nottingham, UK

Sarah Crowe & Anthony Avery

Centre for Population Health Sciences, The University of Edinburgh, Edinburgh, UK

Kathrin Cresswell, Ann Robertson & Aziz Sheikh

School of Health in Social Science, The University of Edinburgh, Edinburgh, UK

Guro Huby

Corresponding author

Correspondence to Sarah Crowe .

Additional information

Competing interests.

The authors declare that they have no competing interests.

Authors' contributions

AS conceived this article. SC, KC and AR wrote this paper with GH, AA and AS all commenting on various drafts. SC and AS are guarantors.

Rights and permissions

This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License ( http://creativecommons.org/licenses/by/2.0 ), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article.

Crowe, S., Cresswell, K., Robertson, A. et al. The case study approach. BMC Med Res Methodol 11, 100 (2011). https://doi.org/10.1186/1471-2288-11-100

Received: 29 November 2010

Accepted: 27 June 2011

Published: 27 June 2011

DOI: https://doi.org/10.1186/1471-2288-11-100

  • Case Study Approach
  • Electronic Health Record System
  • Case Study Design
  • Case Study Site
  • Case Study Report

BMC Medical Research Methodology

ISSN: 1471-2288

What Is a Case Study? | Definition, Examples & Methods

Published on May 8, 2019 by Shona McCombes. Revised on November 20, 2023.

A case study is a detailed study of a specific subject, such as a person, group, place, event, organization, or phenomenon. Case studies are commonly used in social, educational, clinical, and business research.

A case study research design usually involves qualitative methods, but quantitative methods are sometimes also used. Case studies are good for describing, comparing, evaluating and understanding different aspects of a research problem.

Table of contents

  • When to do a case study
  • Step 1: Select a case
  • Step 2: Build a theoretical framework
  • Step 3: Collect your data
  • Step 4: Describe and analyze the case
  • Other interesting articles

When to do a case study

A case study is an appropriate research design when you want to gain concrete, contextual, in-depth knowledge about a specific real-world subject. It allows you to explore the key characteristics, meanings, and implications of the case.

Case studies are often a good choice in a thesis or dissertation. They keep your project focused and manageable when you don’t have the time or resources to do large-scale research.

You might use just one complex case study where you explore a single subject in depth, or conduct multiple case studies to compare and illuminate different aspects of your research problem.

Case study examples

Research question | Case study
What are the ecological effects of wolf reintroduction? | Case study of wolf reintroduction in Yellowstone National Park
How do populist politicians use narratives about history to gain support? | Case studies of Hungarian prime minister Viktor Orbán and US president Donald Trump
How can teachers implement active learning strategies in mixed-level classrooms? | Case study of a local school that promotes active learning
What are the main advantages and disadvantages of wind farms for rural communities? | Case studies of three rural wind farm development projects in different parts of the country
How are viral marketing strategies changing the relationship between companies and consumers? | Case study of the iPhone X marketing campaign
How do experiences of work in the gig economy differ by gender, race and age? | Case studies of Deliveroo and Uber drivers in London

Step 1: Select a case

Once you have developed your problem statement and research questions, you should be ready to choose the specific case that you want to focus on. A good case study should have the potential to:

  • Provide new or unexpected insights into the subject
  • Challenge or complicate existing assumptions and theories
  • Propose practical courses of action to resolve a problem
  • Open up new directions for future research

Tip: If your research is more practical in nature and aims to simultaneously investigate an issue as you solve it, consider conducting action research instead.

Unlike quantitative or experimental research, a strong case study does not require a random or representative sample. In fact, case studies often deliberately focus on unusual, neglected, or outlying cases which may shed new light on the research problem.

Example of an outlying case study: In the 1960s the town of Roseto, Pennsylvania was discovered to have extremely low rates of heart disease compared to the US average. It became an important case study for understanding previously neglected causes of heart disease.

However, you can also choose a more common or representative case to exemplify a particular category, experience or phenomenon.

Example of a representative case study: In the 1920s, two sociologists used Muncie, Indiana as a case study of a typical American city that supposedly exemplified the changing culture of the US at the time.

Step 2: Build a theoretical framework

While case studies focus more on concrete details than general theories, they should usually have some connection with theory in the field. This way the case study is not just an isolated description, but is integrated into existing knowledge about the topic. It might aim to:

  • Exemplify a theory by showing how it explains the case under investigation
  • Expand on a theory by uncovering new concepts and ideas that need to be incorporated
  • Challenge a theory by exploring an outlier case that doesn’t fit with established assumptions

To ensure that your analysis of the case has a solid academic grounding, you should conduct a literature review of sources related to the topic and develop a theoretical framework. This means identifying key concepts and theories to guide your analysis and interpretation.

Step 3: Collect your data

There are many different research methods you can use to collect data on your subject. Case studies tend to focus on qualitative data using methods such as interviews, observations, and analysis of primary and secondary sources (e.g., newspaper articles, photographs, official records). Sometimes a case study will also collect quantitative data.

Example of a mixed methods case study: For a case study of a wind farm development in a rural area, you could collect quantitative data on employment rates and business revenue, collect qualitative data on local people’s perceptions and experiences, and analyze local and national media coverage of the development.

The aim is to gain as thorough an understanding as possible of the case and its context.

Step 4: Describe and analyze the case

In writing up the case study, you need to bring together all the relevant aspects to give as complete a picture as possible of the subject.

How you report your findings depends on the type of research you are doing. Some case studies are structured like a standard scientific paper or thesis, with separate sections or chapters for the methods, results and discussion.

Others are written in a more narrative style, aiming to explore the case from various angles and analyze its meanings and implications (for example, by using textual analysis or discourse analysis).

In all cases, though, make sure to give contextual details about the case, connect it back to the literature and theory, and discuss how it fits into wider patterns or debates.

Other interesting articles

If you want to know more about statistics, methodology, or research bias, make sure to check out some of our other articles with explanations and examples.

  • Normal distribution
  • Degrees of freedom
  • Null hypothesis
  • Discourse analysis
  • Control groups
  • Mixed methods research
  • Non-probability sampling
  • Quantitative research
  • Ecological validity

Research bias

  • Rosenthal effect
  • Implicit bias
  • Cognitive bias
  • Selection bias
  • Negativity bias
  • Status quo bias

Cite this Scribbr article

McCombes, S. (2023, November 20). What Is a Case Study? | Definition, Examples & Methods. Scribbr. Retrieved June 28, 2024, from https://www.scribbr.com/methodology/case-study/

What is a case study?

Volume 21, Issue 1

  • Roberta Heale 1,
  • Alison Twycross 2
  • 1 School of Nursing, Laurentian University, Sudbury, Ontario, Canada
  • 2 School of Health and Social Care, London South Bank University, London, UK
  • Correspondence to Dr Roberta Heale, School of Nursing, Laurentian University, Sudbury, ON P3E2C6, Canada; rheale{at}laurentian.ca


What is it?

Case study is a research methodology, typically seen in social and life sciences. There is no one definition of case study research. 1 However, very simply… ‘a case study can be defined as an intensive study about a person, a group of people or a unit, which is aimed to generalize over several units’. 1 A case study has also been described as an intensive, systematic investigation of a single individual, group, community or some other unit in which the researcher examines in-depth data relating to several variables. 2

Often there are several similar cases to consider such as educational or social service programmes that are delivered from a number of locations. Although similar, they are complex and have unique features. In these circumstances, the evaluation of several, similar cases will provide a better answer to a research question than if only one case is examined, hence the multiple-case study. Stake asserts that the cases are grouped and viewed as one entity, called the quintain. 6 ‘We study what is similar and different about the cases to understand the quintain better’. 6

The steps when using case study methodology are the same as for other types of research. 6 The first step is defining the single case or identifying a group of similar cases that can then be incorporated into a multiple-case study. A search to determine what is known about the case(s) is typically conducted. This may include a review of the literature, grey literature, media, reports and more, which serves to establish a basic understanding of the cases and informs the development of research questions. Data in case studies are often, but not exclusively, qualitative in nature. In multiple-case studies, analysis within cases and across cases is conducted. Themes arise from the analyses and assertions about the cases as a whole, or the quintain, emerge. 6
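
The within-case and cross-case step described above can be illustrated with a small sketch. The Python snippet below is only an illustration under stated assumptions: the case names, the theme labels and the `cross_case_summary` helper are hypothetical and are not taken from any of the studies cited here. It assumes that each case has already been analysed on its own and reduced to a set of coded themes, and it then tallies which themes recur in every case and which appear in only some.

```python
from collections import Counter

# Hypothetical coded themes per case (illustrative only, not data from any study).
case_themes = {
    "case_A": {"staffing", "funding", "patient_complexity"},
    "case_B": {"staffing", "funding"},
    "case_C": {"staffing", "patient_complexity", "leadership"},
}


def cross_case_summary(themes_by_case):
    """Count, for each theme, the number of cases in which it was coded."""
    counts = Counter(
        theme for themes in themes_by_case.values() for theme in themes
    )
    n_cases = len(themes_by_case)
    # Themes coded in every case versus themes coded in only some cases.
    shared = sorted(t for t, n in counts.items() if n == n_cases)
    partial = sorted(t for t, n in counts.items() if n < n_cases)
    return {"shared": shared, "partial": partial, "counts": dict(counts)}


summary = cross_case_summary(case_themes)
print("Themes in every case:", summary["shared"])
print("Themes in some cases:", summary["partial"])
```

A tally like this may help organise a large volume of coded material, but it does not replace the interpretive work of comparing the cases in context.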

Benefits and limitations of case studies

If a researcher wants to study a specific phenomenon arising from a particular entity, then a single-case study is warranted and will allow for an in-depth understanding of the single phenomenon and, as discussed above, would involve collecting several different types of data. This is illustrated in example 1 below.

Using a multiple-case research study allows for a more in-depth understanding of the cases as a unit, through comparison of similarities and differences of the individual cases embedded within the quintain. Evidence arising from multiple-case studies is often stronger and more reliable than from single-case research. Multiple-case studies allow for more comprehensive exploration of research questions and theory development. 6

Despite the advantages of case studies, there are limitations. The sheer volume of data is difficult to organise and data analysis and integration strategies need to be carefully thought through. There is also sometimes a temptation to veer away from the research focus. 2 Reporting of findings from multiple-case research studies is also challenging at times, 1 particularly in relation to the word limits for some journal papers.

Examples of case studies

Example 1: nurses’ paediatric pain management practices.

One of the authors of this paper (AT) has used a case study approach to explore nurses’ paediatric pain management practices. This involved collecting several datasets:

Observational data to gain a picture about actual pain management practices.

Questionnaire data about nurses’ knowledge about paediatric pain management practices and how well they felt they managed pain in children.

Questionnaire data about how critical nurses perceived pain management tasks to be.

These datasets were analysed separately and then compared 7–9 and demonstrated that nurses’ level of theoretical knowledge did not impact on the quality of their pain management practices. 7 Nor did individual nurses’ perceptions of how critical a task was affect the likelihood of them carrying out this task in practice. 8 There was also a difference in self-reported and observed practices 9; actual (observed) practices did not conform to best practice guidelines, whereas self-reported practices tended to.

Example 2: quality of care for complex patients at Nurse Practitioner-Led Clinics (NPLCs)

The other author of this paper (RH) has conducted a multiple-case study to determine the quality of care for patients with complex clinical presentations in NPLCs in Ontario, Canada. 10 Five NPLCs served as individual cases that, together, represented the quintain. Three types of data were collected including:

Review of documentation related to the NPLC model (media, annual reports, research articles, grey literature and regulatory legislation).

Interviews with nurse practitioners (NPs) practising at the five NPLCs to determine their perceptions of the impact of the NPLC model on the quality of care provided to patients with multimorbidity.

Chart audits conducted at the five NPLCs to determine the extent to which evidence-based guidelines were followed for patients with diabetes and at least one other chronic condition.

The three sources of data collected from the five NPLCs were analysed and themes arose related to the quality of care for complex patients at NPLCs. The multiple-case study confirmed that nurse practitioners are the primary care providers at the NPLCs, and this positively impacts the quality of care for patients with multimorbidity. Healthcare policy, such as lack of an increase in salary for NPs for 10 years, has resulted in issues in recruitment and retention of NPs at NPLCs. This, along with insufficient resources in the communities where NPLCs are located and high patient vulnerability at NPLCs, has a negative impact on the quality of care. 10

These examples illustrate how collecting data about a single case or multiple cases helps us to better understand the phenomenon in question. Case study methodology serves to provide a framework for evaluation and analysis of complex issues. It shines a light on the holistic nature of nursing practice and offers a perspective that informs improved patient care.

  • Gustafsson J
  • Calanzaro M
  • Sandelowski M

Competing interests None declared.

Provenance and peer review Commissioned; internally peer reviewed.

Case study research: opening up research opportunities

RAUSP Management Journal

ISSN : 2531-0488

Article publication date: 30 December 2019

Issue publication date: 3 March 2020

The case study approach has been widely used in management studies and the social sciences more generally. However, there are still doubts about when and how case studies should be used. This paper aims to discuss this approach, its various uses and applications, in light of epistemological principles, as well as the criteria for rigor and validity.


This paper discusses the various concepts of case and case studies in the methods literature and addresses the different uses of cases in relation to epistemological principles and criteria for rigor and validity.

The use of this research approach can be based on several epistemologies, provided the researcher attends to the internal coherence between method and epistemology, or what the authors call “alignment.”


This study offers a number of implications for the practice of management research, as it shows how the case study approach does not commit the researcher to particular data collection or interpretation methods. Furthermore, the use of cases can be justified according to multiple epistemological orientations.

  • Epistemology

Takahashi, A.R.W. and Araujo, L. (2020), "Case study research: opening up research opportunities", RAUSP Management Journal, Vol. 55 No. 1, pp. 100-111. https://doi.org/10.1108/RAUSP-05-2019-0109

Emerald Publishing Limited

Copyright © 2019, Adriana Roseli Wünsch Takahashi and Luis Araujo.

Published in RAUSP Management Journal. Published by Emerald Publishing Limited. This article is published under the Creative Commons Attribution (CC BY 4.0) licence. Anyone may reproduce, distribute, translate and create derivative works of this article (for both commercial and non-commercial purposes), subject to full attribution to the original publication and authors. The full terms of this licence may be seen at http://creativecommons.org/licences/by/4.0/legalcode

1. Introduction

The case study as a research method or strategy brings us to question the very term “case”: after all, what is a case? A case-based approach accords the case a central role in the research process (Ragin, 1992). However, doubts still remain about the status of cases according to different epistemologies and types of research designs.

Despite these doubts, the case study is ever present in the management literature and represents the main method of management research in Brazil (Coraiola, Sander, Maccali, & Bulgacov, 2013). Between 2001 and 2010, 2,407 articles (83.14 per cent of qualitative research) were published in conferences and management journals as case studies (Takahashi & Semprebom, 2013). A search on Spell.org.br for the term “case study” under title, abstract or keywords, for the period ranging from January 2010 to July 2019, yielded 3,040 articles published in the management field. Doing research using case studies allows the researcher to immerse him/herself in the context and gain intensive knowledge of a phenomenon, which in turn demands suitable methodological principles (Freitas et al., 2017).

Our objective in this paper is to discuss notions of what constitutes a case and its various applications, considering epistemological positions as well as criteria for rigor and validity. The alignment between these dimensions is put forward as a principle advocating coherence among all phases of the research process.

This article makes two contributions. First, we suggest that there are several epistemological justifications for using case studies. Second, we show that the quality and rigor of academic research with case studies are directly related to the alignment between epistemology and research design rather than to choices of specific forms of data collection or analysis. The article is structured as follows: the following four sections discuss concepts of what is a case, its uses, epistemological grounding as well as rigor and quality criteria. The brief conclusions summarize the debate and invite the reader to delve into the literature on the case study method as a way of furthering our understanding of contemporary management phenomena.

2. What is a case study?

The debate over what constitutes a case in social science is a long-standing one. In 1988, Howard Becker and Charles Ragin organized a workshop to discuss the status of the case as a social science method. As the discussion was inconclusive, they posed the question “What is a case?” to a select group of eight social scientists in 1989, and later to participants in a symposium on the subject. Participants were unable to come up with a consensual answer. Since then, further debates and different answers have emerged. The original question led to an even broader issue: “How do we, as social scientists, produce results and seem to know what we know?” (Ragin, 1992, p. 16).

An important step that may help us start a reflection on what is a case is to consider the phenomena we are looking at. To do that, we must know something about what we want to understand and how we might study it. The answer may be a causal explanation, a description of what was observed or a narrative of what has been experienced. In any case, there will always be a story to be told, as the choice of the case study method demands an answer to what the case is about.

A case may be defined ex ante, prior to the start of the research process, as in Yin’s (2015) classical definition. But there is no compelling reason why cases must be defined ex ante. Ragin (1992, p. 217) proposed the notion of “casing” to indicate that what the case is emerges from the research process:

Rather than attempt to delineate the many different meanings of the term “case” in a formal taxonomy, in this essay I offer instead a view of cases that follows from the idea implicit in many of the contributions – that concocting cases is a varied but routine social scientific activity. […] The approach of this essay is that this activity, which I call “casing”, should be viewed in practical terms as a research tactic. It is selectively invoked at many different junctures in the research process, usually to resolve difficult issues in linking ideas and evidence.

In other words, “casing” is tied to the researcher’s practice, to the way he/she delimits or declares a case as a significant outcome of a process. In 2013, Ragin revisited the 1992 concept of “casing” and explored its multiple possibilities of use, paying particular attention to “negative cases.”

According to Ragin (1992), a case can be centered on a phenomenon or a population. In the first scenario, cases are representative of a phenomenon, and are selected based on what can be empirically observed. The process highlights different aspects of cases and obscures others according to the research design, and allows for the complexity, specificity and context of the phenomenon to be explored. In the alternative, population-focused scenario, the selection of cases precedes the research. Both positive and negative cases are considered in exploring a phenomenon, with the definition of the set of cases dependent on theory and the central objective to build generalizations. As a passing note, it is worth mentioning here that a study of multiple cases requires a definition of the unit of analysis a priori. Otherwise, it will not be possible to make cross-case comparisons.

These two approaches entail differences that go beyond the mere opposition of quantitative and qualitative data, as a case often includes both types of data. Thus, the confusion about how to conceive cases is associated with Ragin’s (1992) notion of “small vs large N,” or McKeown’s (1999) “statistical worldview” – the notion that relevant findings are only those that can be made about a population based on the analysis of representative samples. In the same vein, Byrne (2013) argues that we cannot generate nomothetic laws that apply in all circumstances, periods and locations, and that no social science method can claim to generate invariant laws. According to the same author, case studies can help us understand that there is more than one ideographic variety and help make social science useful. Generalizations still matter, but they should be understood as part of defining the research scope, and that scope points to the limitations of knowledge produced and consumed in concrete time and space.

Thus, what defines the orientation and the use of cases is not the mere choice of type of data, whether quantitative or qualitative, but the orientation of the study. A statistical worldview sees cases as data units (Byrne, 2013). Put differently, there is a clear distinction between statistical and qualitative worldviews; the use of quantitative data does not by itself mean that the research is (quasi) statistical, or uses a deductive logic:

Case-based methods are useful, and represent, among other things, a way of moving beyond a useless and destructive tradition in the social sciences that have set quantitative and qualitative modes of exploration, interpretation, and explanation against each other ( Byrne, 2013 , p. 9).

Other authors advocate different understandings of what a case study is. To some, it is a research method, to others it is a research strategy (Creswell, 1998). Sharan Merriam and Robert Yin, among others, began to write about case study research as a methodology in the 1980s (Merriam, 2009), while authors such as Eisenhardt (1989) called it a research strategy. Stake (2003) sees the case study not as a method, but as a choice of what to be studied, the unit of study. Regardless of their differences, these authors agree that case studies should be restricted to a particular context as they aim to provide an in-depth knowledge of a given phenomenon: “A case study is an in-depth description and analysis of a bounded system” (Merriam, 2009, p. 40). According to Merriam, a qualitative case study can be defined by the process through which the research is carried out, by the unit of analysis or the final product, as the choice ultimately depends on what the researcher wants to know. As a product of research, it involves the analysis of a given entity, phenomenon or social unit.

Thus, whether it is an organization, an individual, a context or a phenomenon, single or multiple, one must delimit it, and also choose between possible types and configurations (Merriam, 2009; Yin, 2015). A case study may be descriptive, exploratory, explanatory, single or multiple (Yin, 2015); intrinsic, instrumental or collective (Stake, 2003); and confirm or build theory (Eisenhardt, 1989).

both went through the same process of implementing computer labs intended for the use of information and communication technologies in 2007;

both took part in the same regional program (Paraná Digital); and

they shared similar characteristics regarding location (operation in the same neighborhood of a city), number of students, number of teachers and technicians and laboratory sizes.

However, the two institutions differed in the number of hours of program use, with one of them displaying a significant number of hours/use while the other showed a modest number, according to secondary data for the period 2007-2013. Despite the context being similar and the procedures for implementing the technology being the same, the mechanisms of social integration – an idiosyncratic factor of each institution – were different in each case. This explained differences in their use of resources, processes of organizational learning and capacity to absorb new knowledge.

On the other hand, multiple case studies seek evidence in different contexts and do not necessarily require direct comparisons (Stake, 2003). Rather, there is a search for patterns of convergence and divergence that permeate all the cases, as the same issues are explored in every case. Cases can be added progressively until theoretical saturation is achieved. An example is a study that investigated how entrepreneurial opportunity and management skills were developed through entrepreneurial learning (Zampier & Takahashi, 2014). The authors conducted nine case studies, based on primary and secondary data, with each one analyzed separately, so a search for patterns could be undertaken. The convergence aspects found were: the predominant way of transforming experience into knowledge was exploitation; managerial skills were developed by taking advantage of opportunities; and career orientation encompassed more than one style. As for divergence patterns: the experience of success and failure influenced entrepreneurs differently; the prevailing rationality logic of influence was different; and the combination of styles in career orientation was diverse.

A full discussion of the choice of case study design is outside the scope of this article. For the sake of illustration, we briefly mention other selection criteria such as the purpose of the research, the state of the art of the research theme, the time and resources involved and the preferred epistemological position of the researcher. In the next section, we look at the possibilities of carrying out case studies in line with various epistemological traditions, as the answers to the “what is a case?” question reveal varied methodological commitments as well as diverse epistemological and ontological positions (Ragin, 2013).

3. Epistemological positioning of case study research

Ontology and epistemology are like skin, not a garment to be occasionally worn ( Marsh & Furlong, 2002 ). According to these authors, ontology and epistemology guide the choice of theory and method because they cannot or should not be worn as a garment. Hence, one must practice philosophical “self-knowledge” to recognize one’s vision of what the world is and of how knowledge of that world is accessed and validated. Ontological and epistemological positions are relevant in that they involve the positioning of the researcher in social science and the phenomena he or she chooses to study. These positions do not tend to vary from one project to another although they can certainly change over time for a single researcher.

Ontology is the starting point from which the epistemological and methodological positions of the research arise ( Grix, 2002 ). Ontology expresses a view of the world, what constitutes reality, nature and the image one has of social reality; it is a theory of being ( Marsh & Furlong, 2002 ). The central question is the nature of the world out there regardless of our ability to access it. An essentialist or foundationalist ontology acknowledges that there are differences that persist over time and these differences are what underpin the construction of social life. An opposing, anti-foundationalist position presumes that the differences found are socially constructed and may vary – i.e. they are not essential but specific to a given culture at a given time ( Marsh & Furlong, 2002 ).

Epistemology is centered around a theory of knowledge, focusing on the process of acquiring and validating knowledge ( Grix, 2002 ). Positivists look at social phenomena as a world of causal relations where there is a single truth to be accessed and confirmed. In this tradition, case studies test hypotheses and rely on deductive approaches and quantitative data collection and analysis techniques. Scholars in the field of anthropology and observation-based qualitative studies proposed alternative epistemologies based on notions of the social world as a set of manifold and ever-changing processes. In management studies since the 1970s, the gradual acceptance of qualitative research has generated a diverse range of research methods and conceptions of the individual and society ( Godoy, 1995 ).

The interpretative tradition, in direct opposition to positivism, argues that there is no single objective truth to be discovered about the social world. The social world and our knowledge of it are the product of social constructions. Thus, the social world is constituted by interactions, and our knowledge is hermeneutic as the world does not exist independent of our knowledge (Marsh & Furlong, 2002). The implication is that it is not possible to access social phenomena through objective, detached methods. Instead, the interaction mechanisms and relationships that make up social constructions have to be studied. Deductive approaches, hypothesis testing and quantitative methods are not relevant here. Hermeneutics, on the other hand, is highly relevant as it allows the analysis of the individual’s interpretation, of sayings, texts and actions, even though interpretation is always the “truth” of a subject. Methods such as ethnographic case studies, interviews and observations as data collection techniques should feed research designs according to interpretivism. It is worth pointing out that we are, to a large extent, caricaturing polar opposites rather than characterizing a range of epistemological alternatives, such as realism, conventionalism and symbolic interactionism.

If diverse ontologies and epistemologies serve as a guide to research approaches, including data collection and analysis methods, and if they should be regarded as skin rather than clothing, how does one make choices regarding case studies? What are case studies, and what type of knowledge do they provide? The views of case study authors are not always explicit on this point, so we must delve into their texts to glean what their positions might be.

Two of the most cited authors in case study research are Robert Yin and Kathleen Eisenhardt. Eisenhardt (1989) argues that a case study can serve to provide a description, or to test or generate a theory, the latter being the most relevant in contributing to the advancement of knowledge in a given area. She uses terms such as populations and samples, control variables, hypotheses and generalization of findings, and even suggests an ideal number of case studies to allow for theory construction through replication. Although Eisenhardt includes observation and interview among her recommended data collection techniques, the approach is firmly anchored in a positivist epistemology:

Third, particularly in comparison with Strauss (1987) and Van Maanen (1988), the process described here adopts a positivist view of research. That is, the process is directed toward the development of testable hypotheses and theory which are generalizable across settings. In contrast, authors like Strauss and Van Maanen are more concerned that a rich, complex description of the specific cases under study evolve and they appear less concerned with development of generalizable theory ( Eisenhardt, 1989 , p. 546).

This position attracted a fair amount of criticism. Dyer & Wilkins (1991) in a critique of Eisenhardt’s (1989) article focused on the following aspects: there is no relevant justification for the number of cases recommended; it is the depth and not the number of cases that provides an actual contribution to theory; and the researcher’s purpose should be to get closer to the setting and interpret it. According to the same authors, discrepancies from prior expectations are also important as they lead researchers to reflect on existing theories. Eisenhardt & Graebner (2007 , p. 25) revisit the argument for the construction of a theory from multiple cases:

A major reason for the popularity and relevance of theory building from case studies is that it is one of the best (if not the best) of the bridges from rich qualitative evidence to mainstream deductive research.

Although they recognize the importance of single-case research to explore phenomena under unique or rare circumstances, they reaffirm the strength of multiple case designs as it is through them that better accuracy and generalization can be reached.

Likewise, Robert Yin emphasizes the importance of variables, triangulation in the search for “truth” and generalizable theoretical propositions. Yin (2015, p. 18) suggests that the case study method may be appropriate for different epistemological orientations, although much of his work seems to invoke a realist epistemology. Authors such as Merriam (2009) and Stake (2003) suggest an interpretative version of case studies. Stake (2003) looks at cases as a qualitative option, where the most relevant criterion of case selection should be the opportunity to learn and understand a phenomenon. A case is not just a research method or strategy; it is a researcher’s choice about what will be studied:

Even if my definition of case study was agreed upon, and it is not, the term case and study defy full specification (Kemmis, 1980). A case study is both a process of inquiry about the case and the product of that inquiry ( Stake, 2003 , p. 136).

Later, Stake (2003 , p. 156) argues that:

[…] the purpose of a case report is not to represent the world, but to represent the case. […] The utility of case research to practitioners and policy makers is in its extension of experience.

Still according to Stake (2003 , pp. 140-141), to do justice to complex views of social phenomena, it is necessary to analyze the context and relate it to the case, to look for what is peculiar rather than common in cases to delimit their boundaries, to plan the data collection looking for what is common and unusual about facts, what could be valuable whether it is unique or common:

Reflecting upon the pertinent literature, I find case study methodology written largely by people who presume that the research should contribute to scientific generalization. The bulk of case study work, however, is done by individuals who have intrinsic interest in the case and little interest in the advance of science. Their designs aim the inquiry toward understanding of what is important about that case within its own world, which is seldom the same as the worlds of researchers and theorists. Those designs develop what is perceived to be the case’s own issues, contexts, and interpretations, its thick descriptions. In contrast, the methods of instrumental case study draw the researcher toward illustrating how the concerns of researchers and theorists are manifest in the case. Because the critical issues are more likely to be known in advance and following disciplinary expectations, such a design can take greater advantage of already developed instruments and preconceived coding schemes.

The aforementioned authors were listed to illustrate differences and sometimes opposing positions on case research. These differences are not restricted to a choice between positivism and interpretivism. It is worth noting that Ragin’s (2013 , p. 523) approach to “casing” is compatible with the realistic research perspective:

In essence, to posit cases is to engage in ontological speculation regarding what is obdurately real but only partially and indirectly accessible through social science. Bringing a realist perspective to the case question deepens and enriches the dialogue, clarifying some key issues while sweeping others aside.

cases are actual entities, reflecting their operations of real causal mechanism and process patterns;

case studies are interactive processes and are open to revisions and refinements; and

social phenomena are complex, contingent and context-specific.

Ragin (2013 , p. 532) concludes:

Lurking behind my discussion of negative case, populations, and possibility analysis is the implication that treating cases as members of given (and fixed) populations and seeking to infer the properties of populations may be a largely illusory exercise. While demographers have made good use of the concept of population, and continue to do so, it is not clear how much the utility of the concept extends beyond their domain. In case-oriented work, the notion of fixed populations of cases (observations) has much less analytic utility than simply “the set of relevant cases,” a grouping that must be specified or constructed by the researcher. The demarcation of this set, as the work of case-oriented researchers illustrates, is always tentative, fluid, and open to debate. It is only by casing social phenomena that social scientists perceive the homogeneity that allows analysis to proceed.

In summary, case studies are relevant and potentially compatible with a range of different epistemologies. Researchers’ ontological and epistemological positions will guide their choice of theory, methodologies and research techniques, as well as their research practices. The same applies to the choice of authors describing the research method and this choice should be coherent. We call this research alignment , an attribute that must be judged on the internal coherence of the author of a study, and not necessarily its evaluator. The following figure illustrates the interrelationship between the elements of a study necessary for an alignment ( Figure 1 ).

In addition to this broader aspect of the research as a whole, other factors should be part of the researcher’s concern, such as the rigor and quality of case studies. We will look into these in the next section taking into account their relevance to the different epistemologies.

4. Rigor and quality in case studies

Traditionally, at least in positivist studies, validity and reliability are the relevant quality criteria to judge research. Validity can be understood as external, internal and construct. External validity means identifying whether the findings of a study are generalizable to other studies using the logic of replication in multiple case studies. Internal validity may be established through the theoretical underpinning of existing relationships and it involves the use of protocols for the development and execution of case studies. Construct validity implies defining the operational measurement criteria to establish a chain of evidence, such as the use of multiple sources of evidence ( Eisenhardt, 1989 ; Yin, 2015 ). Reliability implies conducting other case studies, instead of just replicating results, to minimize the errors and bias of a study through case study protocols and the development of a case database ( Yin, 2015 ).

Several criticisms have been directed toward case studies, such as lack of rigor, lack of generalization potential, external validity and researcher bias. Case studies are often deemed to be unreliable because of a lack of rigor ( Seuring, 2008 ). Flyvbjerg (2006 , p. 219) addresses five misunderstandings about case-study research, and concludes that:

[…] a scientific discipline without a large number of thoroughly executed case studies is a discipline without systematic production of exemplars, and a discipline without exemplars is an ineffective one.

The five misunderstandings are that:

theoretical knowledge is more valuable than concrete, practical knowledge;

the case study cannot contribute to scientific development because it is not possible to generalize on the basis of an individual case;

the case study is more useful for generating rather than testing hypotheses;

the case study contains a tendency to confirm the researcher’s preconceived notions; and

it is difficult to summarize and develop general propositions and theories based on case studies.

These criticisms question the validity of the case study as a scientific method and should be corrected.

The critique of case studies is often framed from the standpoint of what Ragin (2000) labeled large-N research. The logic of small-N research, to which case studies belong, is different. Cases benefit from depth rather than breadth: they provide theoretical and empirical knowledge; they contribute to theory through propositions; and they serve not only to confirm knowledge but also to challenge and overturn preconceived notions. The difficulty in summarizing their conclusions stems from the complexity of the phenomena studied, not from an intrinsic limitation of the method.

Thus, case studies do not seek large-scale generalizations as that is not their purpose. And yet, this is a limitation from a positivist perspective as there is an external reality to be “apprehended” and valid conclusions to be extracted for an entire population. If positivism is the epistemology of choice, the rigor of a case study can be demonstrated by detailing the criteria used for internal and external validity, construct validity and reliability ( Gibbert & Ruigrok, 2010 ; Gibbert, Ruigrok, & Wicki, 2008 ). An example can be seen in case studies in the area of information systems, where there is a predominant orientation of positivist approaches to this method ( Pozzebon & Freitas, 1998 ). In this area, rigor also involves the definition of a unit of analysis, type of research, number of cases, selection of sites, definition of data collection and analysis procedures, definition of the research protocol and writing a final report. Creswell (1998) presents a checklist for researchers to assess whether the study was well written, if it has reliability and validity and if it followed methodological protocols.

In case studies with a non-positivist orientation, rigor can be achieved through careful alignment (coherence among ontology, epistemology, theory and method). Moreover, the concepts of validity can be understood as concern and care in formulating research, research development and research results ( Ollaik & Ziller, 2012 ), and to achieve internal coherence ( Gibbert et al. , 2008 ). The consistency between data collection and interpretation, and the observed reality also help these studies meet coherence and rigor criteria. Siggelkow (2007) argues that a case study should be persuasive and that even a single case study may be a powerful example to contest a widely held view. To him, the value of a single case study or studies with few cases can be attained by their potential to provide conceptual insights and coherence to the internal logic of conceptual arguments: “[…] a paper should allow a reader to see the world, and not just the literature, in a new way” ( Siggelkow, 2007 , p. 23).

Interpretative studies should not be justified by criteria derived from positivism as they are based on a different ontology and epistemology ( Sandberg, 2005 ). The rejection of an interpretive epistemology leads to the rejection of an objective reality: “As Bengtsson points out, the life-world is the subjects’ experience of reality, at the same time as it is objective in the sense that it is an intersubjective world” ( Sandberg, 2005 , p. 47). In this event, how can one demonstrate what positivists call validity and reliability? What would be the criteria to justify knowledge as truth, produced by research in this epistemology? Sandberg (2005 , p. 62) suggests an answer based on phenomenology:

This was demonstrated first by explicating life-world and intentionality as the basic assumptions underlying the interpretative research tradition. Second, based on those assumptions, truth as intentional fulfillment, consisting of perceived fulfillment, fulfillment in practice, and indeterminate fulfillment, was proposed. Third, based on the proposed truth constellation, communicative, pragmatic, and transgressive validity and reliability as interpretative awareness were presented as the most appropriate criteria for justifying knowledge produced within interpretative approach. Finally, the phenomenological epoché was suggested as a strategy for achieving these criteria.

From this standpoint, the research site must be chosen according to its uniqueness so that one can obtain relevant insights that no other site could provide ( Siggelkow, 2007 ). Furthermore, the view of what is being studied is at the center of the researcher’s attention to understand its “truth,” inserted in a given context.

The case researcher can reduce the probability of misinterpretation by analyzing multiple perceptions and by seeking data triangulation to check the reliability of interpretations (Stake, 2003). It is worth pointing out that this is not an option for studies that specifically seek the individual’s experience in relation to organizational phenomena.

In short, there are different ways of seeking rigor and quality in case studies, depending on the researcher’s worldview. These different forms pervade everything from the research design, the choice of research questions, the theory or theories used to look at a phenomenon, the research methods and the data collection and analysis techniques, to the type and style of research report produced. Validity can also take on different forms. While positivism is concerned with the validity of the research question and results, interpretivism emphasizes research processes without neglecting the importance of articulating pertinent research questions and interpreting results soundly (Ollaik & Ziller, 2012). The means to achieve this can be diverse, such as triangulation (of multiple theories, multiple methods, multiple data sources or multiple investigators), pre-tests of data collection instruments, pilot cases, study protocols, detailed description of procedures (such as a field diary in observations), researcher positioning (reflexivity), theoretical-empirical consistency, thick description and transferability.

5. Conclusions

The central objective of this article was to discuss concepts of case study research, their potential and various uses, taking into account different epistemologies as well as criteria of rigor and validity. Although the literature on methodology in general, and on case studies in particular, is voluminous, it is not easy to relate this approach to epistemology. In addition, method manuals often focus on the details of various case study approaches, which confuses matters further.

Faced with this scenario, we have tried to address some central points in this debate and present various ways of using case studies according to the preferred epistemology of the researcher. We emphasize that this understanding depends on how a case is defined and on the particular epistemological orientation that underpins that conceptualization. We have argued that, whatever the epistemological orientation, it is possible to meet appropriate criteria of research rigor and quality provided there is an alignment among the different elements of the research process. Furthermore, multiple data collection techniques can be used in single or multiple case study designs. Neither the data collection techniques nor the type of data collected define the method or determine whether cases should be used for theory-building or theory-testing.

Finally, we encourage researchers to consider case study research as one way to foster immersion in phenomena and their contexts, stressing that the approach does not imply a commitment to a particular epistemology or type of research, such as qualitative or quantitative. Case study research allows for numerous possibilities, and should be celebrated for that diversity rather than pigeon-holed as a monolithic research method.

Figure 1. The interrelationship between the building blocks of research

References

Byrne, D. (2013). Case-based methods: Why we need them; what they are; how to do them. In D. Byrne & C. C. Ragin (Eds.), The SAGE handbook of case-based methods, pp. 1–10. London: SAGE Publications Inc.

Creswell, J. W. (1998). Qualitative inquiry and research design: Choosing among five traditions. London: Sage Publications.

Coraiola, D. M., Sander, J. A., Maccali, N., & Bulgacov, S. (2013). Estudo de caso. In A. R. W. Takahashi (Ed.), Pesquisa qualitativa em administração: Fundamentos, métodos e usos no Brasil, pp. 307–341. São Paulo: Atlas.

Dyer, W. G., & Wilkins, A. L. (1991). Better stories, not better constructs, to generate better theory: A rejoinder to Eisenhardt. The Academy of Management Review, 16, 613–627.

Eisenhardt, K. (1989). Building theory from case study research. Academy of Management Review, 14, 532–550.

Eisenhardt, K. M., & Graebner, M. E. (2007). Theory building from cases: Opportunities and challenges. Academy of Management Journal, 50, 25–32.

Flyvbjerg, B. (2006). Five misunderstandings about case-study research. Qualitative Inquiry, 12, 219–245.

Freitas, J. S., Ferreira, J. C. A., Campos, A. A. R., Melo, J. C. F., Cheng, L. C., & Gonçalves, C. A. (2017). Methodological roadmapping: A study of centering resonance analysis. RAUSP Management Journal, 53, 459–475.

Gibbert, M., Ruigrok, W., & Wicki, B. (2008). What passes as a rigorous case study? Strategic Management Journal, 29, 1465–1474.

Gibbert, M., & Ruigrok, W. (2010). The “what” and “how” of case study rigor: Three strategies based on published work. Organizational Research Methods, 13, 710–737.

Godoy, A. S. (1995). Introdução à pesquisa qualitativa e suas possibilidades. Revista de Administração de Empresas, 35, 57–63.

Grix, J. (2002). Introducing students to the generic terminology of social research. Politics, 22, 175–186.

Marsh, D., & Furlong, P. (2002). A skin, not a sweater: Ontology and epistemology in political science. In D. Marsh & G. Stoker (Eds.), Theory and methods in political science, pp. 17–41. New York, NY: Palgrave Macmillan.

McKeown, T. J. (1999). Case studies and the statistical worldview: Review of King, Keohane, and Verba’s Designing social inquiry: Scientific inference in qualitative research. International Organization, 53, 161–190.

Merriam, S. B. (2009). Qualitative research: A guide to design and implementation.

Ollaik, L. G., & Ziller, H. (2012). Distintas concepções de validade em pesquisas qualitativas. Educação e Pesquisa, 38, 229–241.

Picoli, F. R., & Takahashi, A. R. W. (2016). Capacidade de absorção, aprendizagem organizacional e mecanismos de integração social. Revista de Administração Contemporânea, 20, 1–20.

Pozzebon, M., & Freitas, H. M. R. (1998). Pela aplicabilidade: com um maior rigor científico – dos estudos de caso em sistemas de informação. Revista de Administração Contemporânea, 2, 143–170.

Sandberg, J. (2005). How do we justify knowledge produced within interpretive approaches? Organizational Research Methods, 8, 41–68.

Seuring, S. A. (2008). Assessing the rigor of case study research in supply chain management. Supply Chain Management: An International Journal, 13, 128–137.

Siggelkow, N. (2007). Persuasion with case studies. Academy of Management Journal, 50, 20–24.

Stake, R. E. (2003). Case studies. In N. K. Denzin & Y. S. Lincoln (Eds.), Strategies of qualitative inquiry, pp. 134–164. London: Sage Publications.

Takahashi, A. R. W., & Semprebom, E. (2013). Resultados gerais e desafios. In A. R. W. Takahashi (Ed.), Pesquisa qualitativa em administração: Fundamentos, métodos e usos no Brasil, pp. 343–354. São Paulo: Atlas.

Ragin, C. C. (1992). Introduction: Cases of “What is a case?”. In H. S. Becker & C. C. Ragin (Eds.), What is a case? Exploring the foundations of social inquiry, pp. 1–18.

Ragin, C. C. (2013). Reflections on casing and case-oriented research. In D. Byrne & C. C. Ragin (Eds.), The SAGE handbook of case-based methods, pp. 522–534. London: SAGE Publications.

Yin, R. K. (2015). Estudo de caso: Planejamento e métodos. Porto Alegre: Bookman.

Zampier, M. A., & Takahashi, A. R. W. (2014). Aprendizagem e competências empreendedoras: Estudo de casos de micro e pequenas empresas do setor educacional. RGO Revista Gestão Organizacional, 6, 1–18.


Author contributions: Both authors contributed equally.

Case study research


  • 1 Faculty of Health, Social Care and Education, Anglia Ruskin University, Cambridge, England.
  • PMID: 26058651
  • DOI: 10.7748/ns.29.41.36.e8856

This article describes case study research for nursing and healthcare practice. Case study research offers the researcher an approach by which a phenomenon can be investigated from multiple perspectives within a bounded context, allowing the researcher to provide a 'thick' description of the phenomenon. Although case study research is a flexible approach for the investigation of complex nursing and healthcare issues, it has methodological challenges, often associated with the multiple methods used in individual studies. These are explored through examples of case study research carried out in practice and education settings. An overview of what constitutes 'good' case study research is proposed.

Keywords: Case study research; case study research approaches; nursing protocols; research design; research methodologies; rigour.


Calculation of the minimum clinically important difference (MCID) using different methodologies: case study and practical guide

  • Original Article
  • Open access
  • Published: 28 June 2024


  • Anita M. Klukowska 1 , 2 ,
  • W. Peter Vandertop 1 ,
  • Marc L. Schröder 3 &
  • Victor E. Staartjes   ORCID: orcid.org/0000-0003-1039-2098 4  



Establishing thresholds of change that are actually meaningful for the patient in an outcome measurement instrument is paramount. This concept is called the minimum clinically important difference (MCID). We summarize available MCID calculation methods relevant to spine surgery, and outline key considerations, followed by a step-by-step working example of how MCID can be calculated, using publicly available data, to enable the readers to follow the calculations themselves.

Thirteen MCID calculation methods were summarized, including anchor-based methods, distribution-based methods, Reliable Change Index, 30% Reduction from Baseline, Social Comparison Approach and the Delphi method. All methods, except the latter two, were used to calculate MCID for improvement of Zurich Claudication Questionnaire (ZCQ) Symptom Severity of patients with lumbar spinal stenosis. Numeric Rating Scale for Leg Pain and Japanese Orthopaedic Association Back Pain Evaluation Questionnaire Walking Ability domain were used as anchors.

The MCID for improvement of ZCQ Symptom Severity ranged from 0.8 to 5.1. On average, distribution-based methods yielded lower MCID values than anchor-based methods. The percentage of patients who achieved the calculated MCID threshold ranged from 9.5% to 61.9%.


MCID calculations are encouraged in spinal research to evaluate treatment success. Anchor-based methods, relying on scales assessing patient preferences, continue to be the “gold-standard” with receiver operating characteristic curve approach being optimal. In their absence, the minimum detectable change approach is acceptable. The provided explanation and step-by-step example of MCID calculations with statistical code and publicly available data can act as guidance in planning future MCID calculation studies.


The notion of minimum clinically important difference (MCID) was introduced to establish thresholds of change in an outcome measurement instrument that are actually meaningful for the patient. Jaeschke et al . originally defined it “as the smallest difference in score in the domain of interest which the patient perceives as beneficial and which would mandate, in the absence of troublesome side-effects and excessive cost, a change in the patient’s management” [ 1 ].

In many clinical trials, statistical analysis focuses only on intergroup comparisons of raw outcome scores using parametric or non-parametric tests, with conclusions derived from the p-value. Using the classical threshold of p-value < 0.05 only suggests that the observed effect is unlikely to have occurred by chance, but it does not equate to a change that is clinically meaningful for the patient [ 2 ]. Calculating MCID scores, and using them as thresholds for “treatment success”, ensures that patients’ needs and preferences are considered and allows the proportion of patients experiencing a clinically relevant improvement to be compared among different groups [ 3 ]. Through MCID, clinicians can better understand the impact of an intervention on their patients’ lives, sample size calculations can become more robust and health policy makers may decide which treatments deserve reimbursement [ 4 , 5 , 6 ].

The MCID can be determined from the patient’s perspective, where it is the patient who decides whether a change in their health was meaningful [ 4 , 7 , 8 , 9 ]. This is the most common “gold-standard” approach and one that we will focus on. Occasionally, the clinician’s perspective can also be used to determine MCID. However, MCID for a clinician may not necessarily mean an increase in a patient’s functionality, but rather a change in disease survival or treatment planning [ 10 ]. MCID can also be defined at a societal level, as e.g. improvement in a patient’s functionality significant enough to aid their return to work [ 11 ].

MCID thresholds are intended to assess an individual’s clinical improvement and ought not to be applied to mean scores of entire groups post-intervention, as doing so may falsely over-estimate treatment effectiveness. It is also worth noting that the MCID values obtained are not treatment-specific but broadly disease category-specific. They rely on a patient’s perception of clinical benefit, which is influenced by their diagnosis and subsequent symptoms, not just treatment modality.

In this study, we summarize available MCID calculation methods and outline key considerations when designing a MCID study, followed by a step-by-step working example of how MCID can be calculated.

Navigating the case study

To illustrate the MCID methods, and to let the reader follow the practical calculation of the different MCID values along the way, a previously published data set of 84 patients, described in Minetama et al ., was used under a CC0 1.0 license [ 12 ]. Data can be downloaded at https://data.mendeley.com/datasets/vm8rg6rvsw/1 . The statistical R code can be found in Supplementary Content 1 , including instructions on formatting the data set for MCID calculations. The titles of the different MCID methods in the paper (listed below) and their numbers correspond to the same titles and respective numbers in the R code. All analyses in this case study were carried out using R version 2023.12 + 402 (The R Foundation for Statistical Computing, Vienna, Austria) [ 13 ].

The aim of Minetama et al . was to compare the effectiveness of supervised physical therapy (PT) with that of unsupervised at-home exercises (HE) in patients with lumbar spinal stenosis (LSS). The main inclusion criteria were the presence of neurogenic intermittent claudication and pain and/or numbness in the lower extremities, with or without back pain; age > 50 years; a diagnosis of LSS confirmed on MRI; and a history of ineffective response to therapy for ≥ 3 months. Patients were then randomized into a 6-week PT or HE programme [ 12 ]. All data were pooled, as a clinically significant benefit for patients is independent of group allocation and because MCID is disease-specific. Therefore, the derived MCID will be applicable to most patients with lumbar spinal stenosis, irrespective of treatment modality. Change scores were calculated by subtracting baseline scores from follow-up scores.

MCID calculation methods

There are multiple approaches to calculate MCID, mainly divided into anchor-based and distribution-based methods (Fig. 1 ) [ 4 , 10 , 14 , 15 , 16 , 17 ]. Before deciding on the method, it needs to be defined whether the calculated MCID will be for improvement or deterioration [ 18 ]. Most commonly, MCID is used to measure improvement (as per the definition of Jaeschke et al .) [ 1 , 4 , 7 , 14 , 15 , 16 , 19 , 20 ]. The value of MCID for improvement should not be directly applied in reverse to determine whether a decrease in patients' scores signifies a clinically meaningful deterioration, as those are two separate concepts [ 18 ]. In addition, the actual MCID value ought to be applied to the post-intervention score of an individual patient (not the overall score for the whole group), to determine whether, at follow-up, he or she experienced a change equating to MCID or more, compared to their baseline score. Such a patient is then classified as a “responder”.

figure 1

Flow diagram presenting range of Minimum clinically important difference calculation methods stratified into anchor, distribution-based and “other” described in the study. MCID, Minimum Clinically Important Difference; MIC, Minimal Important Change

According to the Consensus-based Standards for the selection of health measurement instruments (COSMIN) guidelines, the “anchor-based” approach is regarded as the “gold-standard” [ 21 , 22 , 23 ]. In this approach, we determine the MCID of a chosen outcome measurement, based on whether a pre-defined MCID (usually derived from another published study) was achieved by an external criterion, known as the anchor, usually another patient-reported outcome measure (PROM) or an objective test of functionality [ 4 , 7 , 8 , 15 , 16 , 17 , 18 , 20 ]. It is best to use scales which allow the patient to rate the specific aspect of their health related to the disease of interest post-intervention compared to baseline on a Likert-type scale. This scale may range, for example, from “much worse”, “somewhat worse”, “about the same”, “somewhat better”, to “much better”, such as the established Global Assessment Rating tool [ 7 , 8 , 24 , 25 ]. Depending on the scale, some studies determine MCID by calculating change scores for patients who only ranked themselves as “somewhat better”, and some only consider patients who ranked themselves as “much better” [ 7 , 25 , 26 , 27 , 28 , 29 ]. This discrepancy is likely an explanation for a range of MCID for a single outcome measure dependent on the methodology. There appears to be no singular “correct” approach. One of the alternatives to the Global assessment rating is the use of the health transition item (HTI) from the SF-36 questionnaire, where patients are asked about their overall health compared to one year ago [ 7 , 30 , 31 ]. Although quick and easy to conduct, the patient’s response may be influenced by comorbid health issues other than those targeted by intervention. Nevertheless, any anchor where the patient is the one to decide what change is clinically meaningful, captures the true essence of the MCID. 
One should, however, be mindful of recall bias, which is not easily addressed with such anchors: patients at times do not reliably remember their baseline health status [ 32 ]. Moreover, the above anchors do not consider whether the patient would still choose the intervention for the same condition despite experiencing side-effects or cost. That can be addressed through implementing anchors such as the Satisfaction with Results scale described in Copay et al ., who found that MCID values based on the Satisfaction with Results scale were slightly higher than those derived from HTI-SF-36 [ 7 , 33 ].

Other commonly used outcome scales, such as Oswestry Disability Index (ODI), Roland–Morris Disability Questionnaire (RMDQ), Visual Analogue Scale (VAS), or EQ5D-3L Health-Related Quality of Life, can also act as anchors [ 7 , 14 , 16 , 34 , 35 ]. In such instances, patients complete the “anchor” questionnaire at baseline and post-intervention and the MCID of that anchor is derived from a previous publication [ 12 , 16 , 35 ]. Before deciding on the MCID, full understanding of how it was derived in that previous publication is crucial. Ideally, it should have been derived in a population similar to our study cohort, with comparable follow-up periods [ 18 , 20 ]. Correlations between the anchor instrument and the investigated outcome measurement instrument must be recorded, and ought to be at least moderate (> 0.5), as that is the best indicator of construct validity (whether both the anchor instrument and outcome instrument represent a similar construct of patient health) [ 18 , 36 ]. If such a correlation is not available, the anchor-based MCID credibility instrument is available to aid in assessing construct proximity between the two [ 36 , 37 ].

Once the process for selecting an anchor and classifying “responders” and “non-responders” is established, the MCID can be calculated. The outcome instrument of interest is defined as the outcome for which we want to calculate the MCID. The first anchor-based method (within-patient change) focuses on the average improvement seen among clear responders in the anchor. The between-patient change anchor-based method additionally subtracts the average improvement seen among non-responders (unchanged and/or worsened) and consequently ends up with a smaller MCID value. Finally, an anchor-based method based on Receiver Operating Characteristic (ROC) curve analysis, which can be considered the current “gold standard”, also exists; it effectively treats the MCID calculation as a sort of diagnostic instrument and aims to improve the discriminatory performance of our MCID threshold. In the following paragraphs, the three anchor-based methods are described in more detail. The R code (Supplementary Content 1 ) enables the reader to follow the text and to calculate MCID for the Zurich Claudication Questionnaire (ZCQ) Symptom Severity domain, based on a publicly available dataset [ 12 ].

Choice of outcome measurement instruments for MCID calculation case study

The chosen outcome measurement instrument in this case study, for which MCID for improvement will be calculated, is the ZCQ Symptom Severity domain [ 12 ]. The ZCQ is composed of three subscales: symptom severity (7 questions, score per question ranging from 1 to 5 points); physical function (5 questions, score per question ranging from 1 to 4 points) and patient satisfaction with treatment (6 questions, score per question ranging from 1 to 4 points). Higher scores indicate greater disability/worse satisfaction [ 38 ]. To visualize different MCID values, the Numeric Rating Scale (NRS) for Leg Pain (score from 0 “no pain” to 10 “worst possible pain”) and the Japanese Orthopaedic Association Back Pain Evaluation Questionnaire (JOABPEQ) Walking Ability domain are chosen as anchors, as they showed high responsiveness in patients with LSS post-operatively [ 39 ]. Through 25 questions, the JOABPEQ assesses five distinct domains: pain-related symptoms, lumbar spine dysfunction, walking ability, impairment in social functioning and psychological disturbances. The score for each domain ranges from 0 to 100 points (higher score indicating better health status) [ 40 ]. The correlation of ZCQ Symptom Severity with NRS Leg Pain and JOABPEQ Walking Ability domain is 0.56 and − 0.51, respectively [ 39 ]. For a patient to be classified as a “responder”, using the NRS for Leg Pain or JOABPEQ Walking Ability, the score at 6-week follow-up must have improved by 1.6 points or 20 points, respectively [ 7 , 40 , 41 ].

This publicly available dataset does not report patient satisfaction or any kind of global assessment rating.

To enable calculation of global assessment rating-based MCID methods for educational purposes, despite very limited availability of studies providing MCID for deterioration of JOABPEQ, we decided to stratify patients in this dataset into the three following groups, based on the JOABPEQ Walking Ability as an anchor: likely improved (change score above 20 points according to Kasai et al . ), no significant change (− 20 to + 20 points change score), and likely deteriorated (lower than − 20 points change score) [ 41 ]. As obtained MCID values were expected to be negative, all values, for clarity of presentation, were multiplied by − 1, except in Method (X), where the graphical data distribution was shown.

The different methods in detail

Method (I) calculating MCID using “within-patient” score change

The first method focuses on calculating the change between baseline and post-intervention score of our outcome instrument, for each patient classified as a “responder”. A “responder” is a patient who, at follow-up, has achieved the pre-defined MCID of the anchor (or ranks themselves high enough on Global assessment rating type scale based on our methodology). The MCID is then defined as the mean change in the outcome instrument of interest of those classified as “responders” [ 4 , 7 , 16 , 31 ].

The corresponding R-Code formula is described in Step 5a of Supplementary Content 1 . Calculated within-patient MCID of ZCQ Symptom Severity based on NRS Leg Pain and JOABPEQ Walking Ability domain was 4.4 and 4.2, respectively.
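As a language-neutral illustration of Method (I), the following minimal Python sketch computes the mean change of the outcome instrument among anchor responders. It is not the authors' R code: the function name, the example values and the assumption that both the anchor and the outcome improve when their scores decrease (as with NRS Leg Pain and ZCQ Symptom Severity) are ours.

```python
from statistics import mean


def within_patient_mcid(outcome_change, anchor_change, anchor_mcid):
    """Mean outcome change among anchor responders (Method I sketch).

    Change scores are follow-up minus baseline. Assumption: both scales
    improve when the score decreases, so a responder is a patient whose
    anchor change is at most -anchor_mcid.
    """
    responders = [o for o, a in zip(outcome_change, anchor_change)
                  if a <= -anchor_mcid]
    if not responders:
        raise ValueError("no responders on the anchor")
    # Multiply by -1 so that an improvement is reported as a positive MCID.
    return -mean(responders)


# Hypothetical change scores for six patients (not the study data).
zcq_change = [-5, -3, -1, 0, -6, 1]
nrs_change = [-3, -2, -1, 0, -4, 1]
print(within_patient_mcid(zcq_change, nrs_change, anchor_mcid=1.6))
```

For an anchor on which higher scores are better (such as JOABPEQ Walking Ability), the responder condition would need to be reversed.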

Method (II) calculating MCID using “between-patient” score change

In this approach, the mean change in our outcome instrument is calculated for not only “responders” but also for “non-responders”. “Non-responders” are patients who did not achieve the pre-defined MCID of our anchor or who did not rank themselves high enough (unchanged, or sometimes: unchanged + worsened) on Global Assessment Rating type scale according to our methodology. The minimum clinically important difference of our outcome instrument is then defined as the difference between the mean change scores of “responders” and “non-responders” [ 4 , 7 , 16 , 19 ].

The corresponding R-Code formula is described in Step 5b of Supplementary Content 1 . Calculated between-patient MCID of ZCQ Symptom Severity based on NRS Leg Pain and JOABPEQ Walking Ability domain was 3.5 and 2.8, respectively.
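The between-patient calculation can be sketched in the same way. The snippet below is a hedged Python illustration, not the supplementary R code; the function name and data are hypothetical, and the outcome is again assumed to improve when its score decreases.

```python
from statistics import mean


def between_patient_mcid(outcome_change, is_responder):
    """Difference in mean outcome change, responders minus non-responders.

    Change scores are follow-up minus baseline; the result is multiplied
    by -1 on the assumption that lower outcome scores mean better health.
    """
    resp = [c for c, r in zip(outcome_change, is_responder) if r]
    non = [c for c, r in zip(outcome_change, is_responder) if not r]
    if not resp or not non:
        raise ValueError("need at least one responder and one non-responder")
    return -(mean(resp) - mean(non))


# Hypothetical data: three responders and three non-responders.
zcq_change = [-5, -3, -1, 0, -6, -2]
responder = [True, True, False, False, True, False]
print(between_patient_mcid(zcq_change, responder))
```

Because non-responders usually also change slightly in the direction of improvement, this value tends to be smaller than the within-patient estimate from the same data.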

Method (III) calculating MCID using the ROC analysis

Here the MCID is derived through ROC analysis to identify the “threshold” score of our outcome instrument that best discriminates between “responders” and “non-responders” of the anchor [ 4 , 7 , 16 , 19 , 27 ]. To understand ROC, one must familiarize oneself with the concept of sensitivity and specificity. In ROC analysis, sensitivity is defined as the ability of the test to correctly detect “true positives”, which in this context refers to patients who have achieved a clinically meaningful change.

A “false negative” would be a patient who was classified as a “non-responder” but is really a “responder”. Specificity is defined as the ability of a test to correctly detect a “true negative” result, that is, a patient who did not achieve a clinically meaningful change (a “non-responder”) [ 25 ].

A “false positive” would be a patient who was classified as a “responder” but who was really a “non-responder”. Values for sensitivity and specificity range from 0 to 1. Sensitivity of 1 means that the test can detect 100% of “true positives” (“responders”), while specificity of 1 reflects the ability to detect 100% of “true negatives” (“non-responders”). It is unclear what the minimum sensitivity and specificity should be for a “gold-standard” MCID, which is why the most established approach is to opt for a MCID threshold that maximizes both sensitivity and specificity at the same time, which can be done using ROC analysis [ 4 , 7 , 25 , 31 , 42 ]. During ROC analysis, the “closest-to-(0,1)-criterion” (the top-left-most point of the curve) and the Youden index are the two methods used to automatically determine the optimal threshold point [ 43 ].

When conducting the ROC analysis, the area under the curve (AUC) is also determined; it is a measure of how well the MCID threshold discriminates responders and non-responders in general. AUC values range from 0 to 1. An AUC of 0.5 signifies that the score discriminates no better than random chance, whereas a value of 1 means that the score perfectly discriminates between responders and non-responders. In the literature, an AUC of ≥ 0.7 to < 0.8 is deemed fair (acceptable), while ≥ 0.8 to < 0.9 is considered good and values ≥ 0.9 are considered excellent [ 44 ]. Calculating the AUC provides a rough estimate of how well the chosen MCID threshold performs. The corresponding R-Code formula is described in Step 5c of Supplementary Content 1 . The pROC statistical package was used. The calculated MCID of ZCQ Symptom Severity based on NRS Leg Pain and JOABPEQ Walking Ability domain was 1.5 for both.
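To make the ROC-based threshold search concrete, the sketch below implements it in plain Python rather than with pROC. It is an illustration under stated assumptions: change scores are coded so that larger values mean greater improvement, candidate thresholds are the observed values themselves (pROC may place thresholds differently, so results can differ slightly), and all names and data are hypothetical.

```python
def roc_threshold(improvement, is_responder, criterion="youden"):
    """Return (threshold, sensitivity, specificity) for the best cut-off.

    improvement: change scores coded so that larger means better.
    is_responder: anchor classification (True for responders).
    A patient is predicted to be a responder if improvement >= threshold.
    """
    n_pos = sum(1 for r in is_responder if r)
    n_neg = len(is_responder) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one responder and one non-responder")
    best = None
    for cut in sorted(set(improvement)):
        pairs = list(zip(improvement, is_responder))
        tp = sum(1 for x, r in pairs if r and x >= cut)
        fp = sum(1 for x, r in pairs if not r and x >= cut)
        sens = tp / n_pos
        spec = 1 - fp / n_neg
        if criterion == "youden":
            score = sens + spec - 1  # Youden index J
        elif criterion == "closest":
            # Negative squared distance to the top-left corner (0, 1).
            score = -((1 - sens) ** 2 + (1 - spec) ** 2)
        else:
            raise ValueError("criterion must be 'youden' or 'closest'")
        if best is None or score > best[0]:
            best = (score, cut, sens, spec)
    return best[1], best[2], best[3]


def auc(improvement, is_responder):
    """AUC as the share of responder/non-responder pairs ranked correctly."""
    pos = [x for x, r in zip(improvement, is_responder) if r]
    neg = [x for x, r in zip(improvement, is_responder) if not r]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical data for ten patients.
improvement = [5, 3, 1, 0, 6, 2, 4, 1, 2, 3]
responder = [True, True, False, False, True, False, True, False, True, False]
print(roc_threshold(improvement, responder, criterion="closest"))
print(auc(improvement, responder))
```

When several thresholds tie on the chosen criterion, this sketch keeps the smallest one; other tie-breaking rules are equally defensible and should be reported.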

Calculation of MCID through distribution-based methods

Calculation of MCID using the distribution-based approach focuses on statistical properties of the dataset [ 7 , 14 , 16 , 27 , 45 ]. Those methods are objective, easy to calculate, and in some cases, yield values close to anchor-based MCID. The advantage of this approach is that it does not rely on any external criterion or require additional studies on previously established MCIDs or other validated “gold standard” questionnaires for the specific disease in each clinical setting. However, it fails to include the patient’s perspective of a clinically meaningful change, which will be discussed later in this study. In this sense, distribution-based methods focus on finding MCID thresholds that enable mathematical distinction of what is considered a changed vs. unchanged score, whereas anchor-based methods focus on finding MCID thresholds which represent a patient-centered, meaningful improvement.

Method (IV) calculating MCID through Standard Error of Measurement (SEM)

The standard error of measurement conceptualizes the reliability of the outcome measure, by determining how repeated measurements of an outcome may differ from the “true score”. Greater SEM equates to lower reliability, which is suggestive of meaningful inconsistencies in the values produced by the outcome instrument despite similar measuring conditions. Hence, it has been theorized that 1 SEM is equal to MCID, because a change score ≥ 1 SEM is unlikely to be due to measurement error and therefore is also more likely to be clinically meaningful [ 46 , 47 ]. The following formula is used: SEM = SD × √(1 − ICC), where SD is the standard deviation of the outcome scores and ICC is the intraclass correlation coefficient [ 1 , 7 , 35 , 46 , 48 ].

The ICC, also called the reliability coefficient, signifies the level of agreement or consistency between measurements taken on different occasions or by different raters [ 49 ]. There are various ways of calculating the ICC depending on the model used, with values < 0.5, 0.5–0.75, 0.75–0.9 and > 0.90 indicating poor, moderate, good and excellent reliability, respectively [ 49 ]. While a value of 1 × SEM is probably the most established way to calculate MCID, a range of multiplication factors for SEM-based MCID have been used in the literature, including 1.96 SEM or even 2.77 SEM, to identify a more specific threshold for improvement [ 48 , 50 ]. The corresponding R-Code formula is described in Step 6a of Supplementary Content 1 . The chosen ZCQ Symptom Severity ICC was 0.81 [ 51 ]. The SEM-based MCID was 1.9.
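A minimal Python sketch of the SEM-based calculation follows. It assumes the commonly used form SEM = SD × √(1 − ICC), with SD taken as the sample standard deviation of the baseline scores; the function name, the `multiplier` parameter and the example scores are hypothetical.

```python
import math
from statistics import stdev


def sem_mcid(baseline_scores, icc, multiplier=1.0):
    """SEM-based MCID: multiplier * SD(baseline) * sqrt(1 - ICC).

    icc is a published reliability coefficient for the instrument;
    multiplier = 1.0 gives the 1 x SEM rule (1.96 is also in use).
    """
    if not 0 <= icc <= 1:
        raise ValueError("ICC must lie between 0 and 1")
    return multiplier * stdev(baseline_scores) * math.sqrt(1 - icc)


# Hypothetical baseline scores with a sample SD of exactly 4.
baseline = [20, 24, 28]
print(round(sem_mcid(baseline, icc=0.81), 2))
```

The result depends directly on the ICC chosen from the literature, so the source of that coefficient should match the study population as closely as possible.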

Method (V) calculating MCID through Effect Size (ES)

Effect size (ES) is a standardized measure of the strength of the relationship or difference between two variables [ 52 ]. It is described by Cohen et al . as the “degree to which the null hypothesis (there is no difference between the two groups) is false”. It allows for direct comparison of different instruments with different units between studies. There are multiple ways to calculate ES, but for the purpose of MCID calculations, the ES represents the number of SDs by which the post-intervention score has changed from the baseline score. It is calculated by dividing the average change score by the SD of the baseline score: ES = mean change score / SD of baseline scores [ 52 ].

According to Cohen et al ., 0.2 is considered a small ES, 0.5 a moderate ES and 0.8 or more a large ES [ 53 ]. Most commonly, a change score with an ES of 0.2 is considered equivalent to MCID [ 7 , 16 , 31 , 54 , 55 , 56 ]. Using this method, we are basically identifying the mean change score (in this case reflecting the MCID) that equates to an ES of 0.2: MCID = 0.2 × SD of baseline scores [ 7 , 55 ].

Practically, if a patient experienced a small improvement in an outcome measure post-intervention, the ES will be smaller than for a patient who experienced a large improvement in the outcome measure. The corresponding R-Code formula is described in Step 6b of Supplementary Content 1 . The ES-based MCID was 0.9.
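The ES-based calculation can be sketched as follows in Python. This is an illustrative sketch rather than the supplementary R code; it assumes ES = mean change score / SD of baseline scores and an ES-based MCID of 0.2 × SD of baseline scores, and the names and data are hypothetical.

```python
from statistics import mean, stdev


def effect_size(baseline, follow_up):
    """Observed ES: mean change (follow-up minus baseline) / SD of baseline."""
    changes = [f - b for b, f in zip(baseline, follow_up)]
    return mean(changes) / stdev(baseline)


def es_mcid(baseline, target_es=0.2):
    """Change score that corresponds to the target ES (0.2 = small effect)."""
    return target_es * stdev(baseline)


# Hypothetical scores for three patients (lower = better).
baseline = [20, 24, 28]
follow_up = [18, 20, 26]
print(effect_size(baseline, follow_up))  # negative: scores decreased
print(es_mcid(baseline))
```

Setting `target_es=0.5` in the same function gives the 0.5 × SD rule.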

Method (VI) calculating MCID through Standardized Response Mean (SRM)

The Standardized Response Mean (SRM) aims to gauge the responsiveness of an outcome similarly to ES. Initially described by Cohen et al . as a derivative of ES assessing differences of paired observations in a single sample, and later renamed SRM, it is also considered an “index of responsiveness” [ 38 , 53 ]. However, the denominator is the SD of the change scores, not the SD of the baseline scores, while the numerator remains the average change score from baseline to follow-up: SRM = mean change score / SD of change scores [ 10 , 45 , 57 , 58 , 59 ].

Similarly to Cohen’s rule for interpreting ES, it has been theorized that responsiveness can be considered low if SRM is 0.2–0.5, moderate if > 0.5–0.8 and large if > 0.8 [ 58 , 59 , 60 ]. Again, a change score equating to an SRM of 0.2 (although an SRM of 1/3 or 0.5 has also been proposed) can be considered MCID, although studies have used the overall SRM as MCID as well [ 45 , 54 , 56 , 61 ]. However, since SRM is a standardized index, similarly to ES, the aim of the SRM-based method ought to be to identify a change score that indicates responsiveness of 0.2: MCID = 0.2 × SD of change scores [ 61 ].

Similar to the ES-based method, the SRM-based approach for calculating the MCID is not commonly used in spine surgery studies [ 14 ]. It is a measure of responsiveness, which is the ability to detect change over time in a construct to be measured by the instrument, and ought therefore to be calculated for the study-specific change score rather than extrapolated as a “universal” MCID threshold to other studies. The corresponding R-Code formula is described in Step 6c of Supplementary Content 1 . The SRM-based MCID was 0.8.
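For comparison with the ES-based approach, the SRM-based calculation is sketched below in Python. The only difference is the denominator, which here is the SD of the change scores. The formulas SRM = mean change score / SD of change scores and MCID = 0.2 × SD of change scores are assumed, and the names and data are hypothetical.

```python
from statistics import mean, stdev


def srm(baseline, follow_up):
    """Standardized response mean: mean change / SD of change scores."""
    changes = [f - b for b, f in zip(baseline, follow_up)]
    return mean(changes) / stdev(changes)


def srm_mcid(baseline, follow_up, target_srm=0.2):
    """Change score that corresponds to the target SRM."""
    changes = [f - b for b, f in zip(baseline, follow_up)]
    return target_srm * stdev(changes)


# Hypothetical scores for three patients (lower = better).
baseline = [20, 24, 28]
follow_up = [18, 20, 26]
print(srm(baseline, follow_up))
print(srm_mcid(baseline, follow_up))
```

Because the SD of change scores is specific to the cohort and follow-up period, a value obtained this way describes the study at hand rather than a transferable threshold.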

The limitation of using Method (V) and (VI) in MCID calculations will be later described in Discussion.

Method (VII) calculating MCID through SD

Standard Deviation represents the average spread of individual data points around the mean value of the outcome measure. Norman et al . found in their review of studies using MCID in health-related quality of life instruments that most studies had an average ES of 0.5, which equated to clinically meaningful change score of 0.5 × SD of baseline score [ 7 , 16 ,  30 ].

The corresponding R-Code formula is described in Step 6d of Supplementary Content 1 . The SD-based MCID was 2.1.

Method (VIII) calculating MCID through 95% Minimum Detectable Change (MDC)

The MDC is defined as the minimal change below which there is a 95% chance that it is due to measurement error of the outcome measurement instrument: MDC = z × √2 × SEM [ 7 , 61 ].

The value of z corresponds to the desired level of confidence, which for a 95% confidence level is 1.96. Although MDC, like all distribution-based methods, does not consider whether a change is clinically meaningful, the calculated MCID should be at least the same as or greater than the MDC to enable distinguishing true mathematical change from measurement noise. The 95% MDC calculation is the most common distribution-based approach in spinal surgery, and it appears to most closely resemble anchor-derived MCID values, as demonstrated by Copay et al . [ 7 , 14 , 62 ]. The corresponding R-Code formula is described in Step 6e of Supplementary Content 1 . The 95% MDC was 5.1.
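The MDC calculation can be illustrated with a short Python sketch, assuming the common form MDC = z × √2 × SEM. The function name and the example SEM are hypothetical. Note that 1.96 × √2 ≈ 2.77, which is the multiplication factor for SEM mentioned under Method (IV).

```python
import math


def minimum_detectable_change(sem, z=1.96):
    """MDC = z * sqrt(2) * SEM; z = 1.96 gives the 95% MDC.

    The factor sqrt(2) reflects that a change score combines the
    measurement error of two measurements (baseline and follow-up).
    """
    if sem < 0:
        raise ValueError("SEM cannot be negative")
    return z * math.sqrt(2) * sem


# Hypothetical SEM value (not taken from the study data).
print(round(minimum_detectable_change(1.85), 1))
```

A candidate MCID that falls below this value cannot be distinguished from measurement noise in an individual patient.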

Method (IX) calculating MCID through Reliable Change Index

Another, less frequently applied method through which “responders” and “non-responders” can be classified, but which does not rely on an external criterion, is the Reliable Change Index (RCI), also called the Jacobson–Truax index [ 63 , 64 ]. It indicates whether an individual change score is statistically significantly greater than a change in score that could have occurred due to random measurement error alone [ 63 ].

In theory, a patient can be considered to experience a statistically reliably identifiable improvement ( p < 0.05) if the individual RCI is > 1.96. Again, it does not reflect whether the change is clinically meaningful for the patient but rather that the change should not be attributed to measurement error alone and likely has a component of true score change. Therefore, this method is discouraged in MCID calculations, as it relies on statistical properties of the sample and not on patient preferences, as do all distribution-based methods [ 65 ]. In the example of Bolton et al ., who focused on the Bournemouth Questionnaire in patients with neck pain, RCI was used to discriminate between “responders” and “non-responders”. The ROC analysis approach was then used to determine the MCID [ 64 ]. The corresponding R-Code formula is described in Step 6f of Supplementary Content 1 . Again, the pROC package was used. The ROC-derived MCID was 2.5.
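As an illustration of the RCI classification step, the Python sketch below uses the Jacobson–Truax form RCI = (follow-up − baseline) / (√2 × SEM). The function names, the `lower_is_better` flag and the example values are hypothetical. The flag is an assumption needed because, for scales such as ZCQ Symptom Severity, improvement corresponds to a negative change.

```python
import math


def reliable_change_index(baseline, follow_up, sem):
    """RCI: individual change divided by the standard error of the difference."""
    if sem <= 0:
        raise ValueError("SEM must be positive")
    return (follow_up - baseline) / (math.sqrt(2) * sem)


def is_reliably_improved(baseline, follow_up, sem, lower_is_better=True):
    """True if the change exceeds 1.96 SE units in the direction of improvement."""
    rci = reliable_change_index(baseline, follow_up, sem)
    return rci < -1.96 if lower_is_better else rci > 1.96


# Hypothetical patients, with an assumed SEM of 1.85 points.
print(is_reliably_improved(25, 19, sem=1.85))  # 6-point decrease
print(is_reliably_improved(25, 22, sem=1.85))  # 3-point decrease
```

The resulting labels can then serve as the “responder” classification in an ROC analysis, as in the approach of Bolton et al . described above.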

Other methods

Method (X) calculating MCID through anchor-based minimal important change (MIC) distribution model

In theory, combining anchor- and distribution-based methods could yield superior results. Some suggestions include averaging the values of various methods, or simply combining two different methods (e.g. both an anchor-based criterion, such as ROC-based MCID from patient satisfaction, and 95% MDC-based MCID have to be met to consider a patient as having achieved MCID) [ 25 ]. In 2007, de Vet et al . introduced a new visual method of MCID calculation that not only combines but also integrates anchor- and distribution-based calculations [ 25 ]. In addition, their method allows the calculation of both MCID for improvement and MCID for deterioration, as these can differ.

In short, using an anchor, patients were divided into three groups: “importantly improved”, “not importantly changed” and “importantly deteriorated” (Fig. 2 ) . Then the distribution, expressed in percentiles, of patients who “importantly improved”, “importantly deteriorated” and “not importantly changed” was plotted on a graph. This is the anchor-based part of the approach, ensuring that the MCID thresholds chosen have clinical value.

figure 2

Distribution of the Zurich Claudication Questionnaire Symptom Severity change scores for patients categorized as experiencing “important improvement”, “no important change” or “important deterioration” in JOABPEQ walking ability as an anchor (Method (X)). For ZCQ Symptom Severity score to improve, the actual value must decrease explaining the negative values in the model. ROC , Receiver Operating Characteristic; ZCQ , Zurich Claudication Questionnaire; JOABPEQ , Japanese Orthopaedic Association Back Pain Evaluation Questionnaire

The second part of the approach is then entirely focused on the group of patients determined by the anchor to be “unchanged”, and can be either distribution- or anchor-based:

In the first and more anchor-based method, the ROC-based method described in Method (III) is applied to find the threshold for improvement (by finding the ROC-based threshold point that optimizes sensitivity and specificity of identifying improved vs unchanged patients) or for deterioration (by finding the ROC-based threshold point that optimizes sensitivity and specificity of identifying deteriorated vs unchanged patients). For example, the threshold for improvement is found by combining the improved and unchanged groups, and then testing out different thresholds for discriminating those two groups from each other. The optimal point on the resulting ROC curve based on the closest-to-(0,1)-criterion is then found.

In the second method, which is distribution-based, the upper 95% (for improvement) and lower 95% (for deterioration) limits are found based solely on the group of patients determined to be unchanged. The following formula is used, adding 1.645 × SD for improvement and subtracting it for deterioration: 95% limit = mean change score of unchanged patients ± 1.645 × SD of change scores of unchanged patients [ 25 ]

The corresponding R-Code formula can be found under Step 7a in Supplementary Content 1 . The model is presented in Fig. 2 . The 95% upper limit and 95% lower limit were 4.1 and − 7.2, respectively. The ROC-derived MCID using RCI was − 2.5 (important improvement vs unchanged) and − 0.5 (important deterioration vs unchanged). For the purpose of the model, MCID values were not multiplied by − 1 but remained in original form.
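The distribution-based limits of the unchanged group can be sketched in Python as follows, assuming the form mean ± 1.645 × SD of the change scores of patients classified as unchanged by the anchor. The function name and example data are hypothetical, and no sign reversal is applied, in line with the presentation of the model.

```python
from statistics import mean, stdev


def unchanged_group_limits(unchanged_change_scores, z=1.645):
    """Return (lower, upper) 95% limits of the unchanged group.

    lower = mean - z * SD, upper = mean + z * SD, computed only from the
    change scores of patients whom the anchor classified as unchanged.
    """
    m = mean(unchanged_change_scores)
    sd = stdev(unchanged_change_scores)
    return m - z * sd, m + z * sd


# Hypothetical change scores of unchanged patients (mean 0, SD 2).
lower, upper = unchanged_group_limits([-2, 0, 2])
print(lower, upper)
```

Which of the two limits marks improvement depends on the direction of the scale: for an instrument on which lower scores are better, improvement lies beyond the lower limit.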

Method (XI) calculating MCID as 30% Reduction from Baseline

In recent years, a simple 30% reduction from baseline values has been introduced as an alternative to MCID calculations [ 66 ]. It has been speculated that absolute-point changes are difficult to interpret and have limited value in the context of “ceiling” and “floor” effects (i.e. values at the extremes of the measurement scale) [ 4 ]. In this context, Khan et al . found that a 30% reduction in PROMs was similarly effective to traditional anchor-based or distribution-based methods in detecting patients with clinically meaningful differences after lumbar spine surgery [ 15 ]. The corresponding R-Code formula can be found under Step 7b in Supplementary Content 1 .
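A hedged Python sketch of the 30% reduction rule is given below. It assumes an instrument on which lower scores are better and a strictly positive baseline score; the function names and example scores are hypothetical.

```python
def achieved_percent_reduction(baseline, follow_up, fraction=0.30):
    """True if the score fell by at least `fraction` of its baseline value."""
    if baseline <= 0:
        raise ValueError("baseline must be positive to compute a percentage")
    return (baseline - follow_up) / baseline >= fraction


def responder_proportion(baselines, follow_ups, fraction=0.30):
    """Share of patients who reach the relative reduction threshold."""
    flags = [achieved_percent_reduction(b, f, fraction)
             for b, f in zip(baselines, follow_ups)]
    return sum(flags) / len(flags)


# Hypothetical scores for four patients.
baselines = [20, 20, 25, 30]
follow_ups = [13, 16, 24, 18]
print(responder_proportion(baselines, follow_ups))
```

Because the threshold scales with each patient's own baseline, patients with low baseline scores need a smaller absolute change to be classified as responders than patients with high baseline scores.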

Method (XII) Calculating MCID through Delphi method

The Delphi Method is a systematic approach using the collective opinion of experts to establish a consensus regarding a medical issue [ 67 ]. It has mostly been used to develop best practice guidelines [ 68 ]. However, it can also be used to aid MCID determination [ 69 ]. The method focuses on distributing questionnaires or surveys to a panel of experts. The anonymized answers are grouped together and shared again with the expert panel in subsequent rounds. This allows the experts to reflect on their opinions and consider the strengths and weaknesses of the others’ responses. The process is repeated until consensus is reached. Anonymity prevents potential bias arising from a participant’s concern about how their own opinion is viewed, or from other personal factors [ 67 ].

Method (XIII) calculating MCID through Social Comparison Approach

The final approach is asking patients to compare themselves to other patients, which requires time and resources [ 70 ]. In a study by Redelmeier et al ., patients with chronic obstructive pulmonary disease in a rehabilitation program were organized into small groups and observed each other on multiple occasions [ 70 ]. Additionally, each patient was paired with another participant and had a one-to-one interview with them discussing different aspects of their health. Finally, each patient anonymously rated themselves against their partner on a scale of “much better”, “somewhat better”, “a little bit better”, “about the same”, “a little bit worse”, “somewhat worse” and “much worse”. MCID was then calculated based on the mean change score of patients who graded themselves as “a little bit better” (MCID for improvement) or “a little bit worse” (MCID for deterioration), as in the within-patient change and between-patient change methods described in Method (I) and (II) [ 70 ].

Substantial Clinical Benefit

Over the years, it has been noted that MCID calculations based either purely on distribution-based methods or only on the group of patients rating themselves as “somewhat better” or “slightly better” do not necessarily constitute a change that patients would consider beneficial enough “to mandate, in the absence of troublesome side effects and excessive cost, to undergo the treatment again” [ 3 , 24 ]. Therefore, the concept of substantial clinical benefit (SCB) has been introduced as a way of identifying a threshold of clinical success of an intervention, rather than a “floor” value for improvement, which is what MCID represents [ 24 ]. For example, in Carreon et al ., ROC-derived SCB “thresholds” were defined as a change score with equal sensitivity and specificity to distinguish “much better” from “somewhat better” patients post cervical spinal fusion [ 71 ]. Glassman et al ., on the other hand, used ROC-derived SCB thresholds to discriminate between “much better” and “about the same” patients following lumbar spinal fusion. The authors stress that SCB and MCID are indeed separate entities, and one should not be used to derive the other [ 24 ]. Thus, while the methods to derive SCB and MCID thresholds can be carried out similarly based on anchors, the ultimate goal of applying SCB versus MCID is different.

Using the various methods explained above, the MCID for improvement for the ZCQ Symptom Severity domain ranged overall from 0.8 to 5.1 (Table 1 ). The reader’s own results can be checked for correctness against this table. On average, distribution-based MCID values were lower than anchor-based MCID values. Within the distribution-based approach, Method (VIII) “Minimum detectable change” resulted in an MCID of 5.1, which exceeded the MCIDs derived using the “gold-standard” anchor-based approaches. The average MCID based on the anchors of NRS Leg Pain and JOABPEQ Walking Ability was 3.1 and 2.8, respectively. Depending on the method used, the percentage of responders to HE and PT intervention ranged from 9.5% for the “30% Reduction from Baseline” method to 61.9% using the ES- and SRM-based methods (Table 2 ). Method (X) is graphically presented in Fig. 2 .

As demonstrated above, the MCID is dependent upon the methodology and the chosen anchor, highlighting the necessity for careful preparation in MCID calculations. The lowest MCID of 0.8 was calculated for Method (VI), the SRM. Logically, if patients on average had a baseline ZCQ Symptom Severity score of 23.2, an improvement of 0.8 is unlikely to be clinically meaningful, even if rounded up. It rather informs on the measurement error property of our instrument, as explained by COSMIN. Additionally, the distribution-based methods rely on statistical properties of the sample, which vary from cohort to cohort, making the result generalizable only to patient groups with a similar SD and not applicable to others with a different spread of data [ 52 ]. Not surprisingly, anchor-based methods considering patient preferences yielded on average higher MCID values than distribution-based methods, and these again varied from anchor to anchor. The mean MCID for improvement calculated for NRS Leg Pain was 3.1, while for JOABPEQ Walking Ability it was 2.8; such similar values underline the importance of selecting responsive anchors with at least moderate correlations. Despite assessing different aspects of LSS disease, the MCID remained comparable in this specific case.

Interestingly, Method (VIII), the MDC, yielded the highest value of 5.1, exceeding the “gold-standard” ROC-derived MCID. This suggests that, in this example, using the ROC-derived MCID in clinical practice would be illogical, as that value falls within the measurement error determined by the MDC. Here it would be appropriate to adopt the MDC as the MCID. In addition, ROC-derived MCID values based on a Global Assessment Rating-like stratification of patients by their JOABPEQ Walking Ability (Method X) were higher than those in Method (III). This may be attributed to a more balanced distribution of “responders” and “non-responders” (only unchanged patients) in Method (X), whereas in Method (III) patients were strictly categorized into “responders” and “non-responders” (including both deteriorated and unchanged patients). This further highlights the importance of using global assessment rating-type scales in determining the extent of clinical benefit.
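The ROC-derived threshold discussed here can be illustrated with a minimal Python sketch. It assumes that change scores are coded so that positive values mean improvement, that the anchor has already been dichotomized into responders and non-responders, and that the optimal cut-off is the one maximizing Youden's index (sensitivity + specificity - 1). The selection criterion, the function name `roc_mcid` and the data are assumptions made for illustration and are not taken from the case study's statistical code.

```python
def roc_mcid(change_scores, responder):
    """Return (cut-off, sensitivity, specificity) maximizing Youden's J.

    change_scores: improvement per patient (positive = better, an assumption).
    responder:     True if the anchor classifies the patient as improved.
    A patient is predicted to be a responder when change >= cut-off.
    """
    n_pos = sum(1 for r in responder if r)
    n_neg = len(responder) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one responder and one non-responder")

    best = None
    for cut in sorted(set(change_scores)):
        true_pos = sum(1 for c, r in zip(change_scores, responder) if r and c >= cut)
        true_neg = sum(1 for c, r in zip(change_scores, responder) if not r and c < cut)
        sens = true_pos / n_pos
        spec = true_neg / n_neg
        youden = sens + spec - 1
        # Strict '>' keeps the smallest cut-off when several tie.
        if best is None or youden > best[0]:
            best = (youden, cut, sens, spec)
    return best[1], best[2], best[3]


# Invented example data: seven patients, four classified as improved by the anchor.
changes = [0, 1, 2, 3, 4, 5, 6]
improved = [False, False, True, False, True, True, True]
print(roc_mcid(changes, improved))  # (4, 0.75, 1.0)
```

Because the non-responder group enters the specificity term directly, changing who counts as a non-responder (unchanged only, or unchanged plus deteriorated) can shift the selected cut-off, which is consistent with the difference between Methods (III) and (X) described above.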

Although ES-based (Method (V)) and SRM-based (Method (VI)) MCID calculations have been described in the literature, the ES and SRM were originally created to quantify the strength of the relationship between the scores of two samples (in the case of ES) and the change score of paired observations in one sample (in the case of SRM) [ 53 , 58 , 59 ]. They do offer an alternative for MCID calculations. However, verification with other MCID calculation methods, ideally anchor-based, is strongly recommended. As seen in this case study and in other MCIDs derived similarly, they often result in small estimates [ 7 , 55 ]. There is also no consensus on whether the SD of the change score or the SD of the baseline score should be used as the denominator. Additionally, whether the calculated MCID (mean change score) should correspond to an ES of 0.2, indicating a small effect, or to an ES of 0.5, suggesting a moderate effect, is currently arbitrary and often relies on the researcher’s preference [ 53 , 55 , 59 ]. Both ES and SRM can be used to assess whether the overall change score observed in a single study suggests a clinically meaningful benefit in that specific cohort or, in the case of SRM, whether the outcome measure is responsive. However, in our view, carrying such a value over as the “MCID” from one study to another is not recommended.
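To make the two open choices concrete (which SD goes in the denominator, and which multiplier defines the threshold), the following Python sketch computes the ES and SRM for a small invented cohort and the change score that would correspond to a chosen multiplier. It assumes that lower scores are better, that the ES uses the SD of the baseline score and the SRM the SD of the change score, and that a threshold is obtained as multiplier times SD. The helper names and the data are hypothetical.

```python
from statistics import mean, stdev


def change_scores(baseline, followup):
    # Positive change = improvement, assuming lower scores are better.
    return [b - f for b, f in zip(baseline, followup)]


def effect_size(baseline, followup):
    # ES: mean change divided by the SD of the baseline scores.
    return mean(change_scores(baseline, followup)) / stdev(baseline)


def standardized_response_mean(baseline, followup):
    # SRM: mean change divided by the SD of the change scores.
    changes = change_scores(baseline, followup)
    return mean(changes) / stdev(changes)


def threshold_for(multiplier, sd):
    # Change score corresponding to a chosen standardized value (e.g. 0.2 or 0.5).
    return multiplier * sd


# Invented data for five patients.
baseline = [20, 24, 22, 26, 28]
followup = [18, 20, 21, 22, 25]

print("ES :", round(effect_size(baseline, followup), 3))
print("SRM:", round(standardized_response_mean(baseline, followup), 3))
for k in (0.2, 0.5):
    print(f"threshold at {k} x SD(baseline):", round(threshold_for(k, stdev(baseline)), 2))
    print(f"threshold at {k} x SD(change):  ",
          round(threshold_for(k, stdev(change_scores(baseline, followup))), 2))
```

Running the sketch with both denominators and both multipliers yields four different thresholds from the same data, which illustrates why such values are hard to transfer between studies.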

One can argue whether there is even a place for distribution-based methods in MCID calculations. They ultimately fail to provide an MCID value that meets the original definition of Jaeschke et al ., the “smallest change in the outcome that the patient would identify as important”. At no point are patients asked what constitutes a meaningful change for them, and the value is derived solely from the statistical properties of the sample [ 1 ]. Nevertheless, conducting MCID studies that implement scales such as the Global Assessment Rating is time-consuming, and performing such studies for each patient outcome and each disease is likely not feasible. Distribution-based methods still have some merit in that they, like the 95% MDC method, can help distinguish measurement noise and inaccuracy from true change. Even if anchor-based methods should probably be used to define MCID thresholds, they ought to be supported by a calculation of the MDC, so that it can be decided whether the chosen threshold makes sense mathematically (i.e., can reliably be distinguished from measurement inaccuracies), as seen in our case study.
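The plausibility check described here can be sketched in a few lines of Python. The sketch assumes a commonly used formulation in which the standard error of measurement is SEM = SD x sqrt(1 - ICC) and the 95% MDC is 1.96 x sqrt(2) x SEM; whether this matches the exact computation behind Method (VIII) is an assumption. The SD and ICC values are invented, whereas 3.1 and 5.1 are the anchor-based MCID and the MDC quoted in the text.

```python
from math import sqrt


def sem(sd_baseline, icc):
    # Standard error of measurement from the baseline SD and test-retest reliability.
    if not 0.0 <= icc <= 1.0:
        raise ValueError("ICC must lie between 0 and 1")
    return sd_baseline * sqrt(1.0 - icc)


def mdc95(sd_baseline, icc):
    # 95% minimum detectable change (assumed formulation, see lead-in).
    return 1.96 * sqrt(2.0) * sem(sd_baseline, icc)


def exceeds_measurement_error(mcid, mdc):
    # True if the candidate MCID is at least as large as the MDC.
    return mcid >= mdc


# Hypothetical inputs: baseline SD of 4.0 points and an ICC of 0.75.
print(round(mdc95(4.0, 0.75), 2))

# Values quoted in the text: anchor-based MCID of 3.1 versus an MDC of 5.1.
print(exceeds_measurement_error(3.1, 5.1))  # False
```

A result of `False` means that the anchor-based threshold cannot be reliably separated from measurement error in that cohort, which is the situation reported in the case study.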

Calculating MCID for different diagnoses

Previously, MCID thresholds for outcome measurement instruments were calculated for generic populations, such as patients suffering from low back pain. More recently, MCID values for commonly used PROMs in spine surgery, such as ODI, RMDQ or NRS, have been calculated for more narrowly defined diagnoses, such as lumbar disc herniation (LDH) or LSS. The question arises as to whether a separate MCID is needed for each spinal condition. In general, establishing an MCID specific to these patient groups is only recommended if these patients’ perception of meaningful change differs from that of patients with low back pain in general. Importantly, again, the MCID should not be treatment-specific, but rather broadly disease-specific. Therefore, it is advisable to use an MCID derived from patients whose disease characteristics are most similar to those of one’s own cohort. For example, an MCID for NRS Back Pain based on a study group composed of different types of lumbar degenerative disease may, in some cases, be applied to a study cohort composed solely of patients with LDH. However, no such extrapolation should be performed for populations with back pain secondary to malignancy, because the pathogenesis is entirely different and associated symptoms, such as fatigue or anorexia, may influence the ability to detect a clinically meaningful change in NRS Back Pain.

Study cohort characteristics that influence MCID

Regardless of how robust the methodology is, one cannot expect to obtain the same MCID on different occasions, even in the same population, because of the inherent subjectivity of what is perceived as “clinically beneficial” and day-to-day symptom fluctuation. Moreover, it has been found that patients with worse baseline scores, reflecting e.g., more advanced disease, require a greater overall change at follow-up to report it as clinically meaningful [ 72 ]. One should also be mindful of “regression to the mean”, where patients with extremely high or low scores subsequently score closer to the mean at the second measurement [ 73 ]. Therefore, adequate cohort characteristics need to be presented so that readers can judge how generalizable the MCID may be to their own study cohort. If a patient pre-operatively experiences NRS Leg Pain of 1, and the MCID is 1.6, they cannot achieve the MCID at all, as the maximum possible change score is smaller than the MCID threshold (“floor effect”). A similar situation can occur with patients closer to the higher end of the scale (“ceiling effect”). As a general rule, if at least 15% of the study cohort has the highest or lowest possible score for a given outcome instrument, one can expect significant “ceiling/floor effects” [ 50 ]. One way to overcome this is to convert absolute MCID scores into percentage change scores [ 4 , 45 ]. However, percentage change scores only account for high baseline scores if high baseline scores indicate greater disability (as seen with ODI) and thus leave room for a larger change. If a high score in an instrument reflects better health status (as seen in SF-36), then percentage change scores will increase the association with the baseline score [ 4 ].
In general, it is important to consider which patients to exclude from certain analyses when applying the MCID. For example, patients without relevant disease preoperatively (such as those exhibiting so-called “patient-accepted symptom states”, PASS) should probably be excluded altogether when reporting the percentage of patients achieving the MCID [ 74 ].
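The floor and ceiling considerations above lend themselves to a short worked example. The Python sketch below checks whether a patient can reach the MCID at all, applies the 15% rule to a cohort, and converts an absolute change into a percentage change. It assumes an instrument on which lower scores are better (such as an NRS from 0 to 10); the function names and the cohort data are hypothetical, while the NRS Leg Pain value of 1 and the MCID of 1.6 come from the example in the text.

```python
def can_reach_mcid(baseline, mcid, best_possible=0.0):
    # Largest possible improvement is the distance from baseline to the best score.
    return abs(baseline - best_possible) >= mcid


def extreme_score_share(scores, scale_min, scale_max):
    # Proportion of patients at the lowest and at the highest possible score.
    n = len(scores)
    at_min = sum(1 for s in scores if s == scale_min) / n
    at_max = sum(1 for s in scores if s == scale_max) / n
    return at_min, at_max


def has_floor_or_ceiling_effect(scores, scale_min, scale_max, cutoff=0.15):
    # 15% rule: flag the cohort if either extreme holds at least `cutoff` of patients.
    at_min, at_max = extreme_score_share(scores, scale_min, scale_max)
    return at_min >= cutoff or at_max >= cutoff


def percent_change(baseline, followup):
    # Percentage reduction from baseline; undefined for a baseline of zero.
    if baseline == 0:
        raise ValueError("percentage change is undefined for a baseline of 0")
    return 100.0 * (baseline - followup) / baseline


# Example from the text: NRS Leg Pain of 1 with an MCID of 1.6.
print(can_reach_mcid(1, 1.6))  # False

# Invented cohort of ten baseline NRS scores.
cohort = [0, 0, 3, 5, 10, 7, 0, 2, 4, 6]
print(extreme_score_share(cohort, 0, 10))          # (0.3, 0.1)
print(has_floor_or_ceiling_effect(cohort, 0, 10))  # True
print(percent_change(8, 4))                        # 50.0
```

Patients for whom `can_reach_mcid` returns `False` are candidates for exclusion when reporting the percentage of patients achieving the MCID, in line with the recommendation above.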

Establishing reliable thresholds for MCID is key in clinical research and forms the basis of patient-centered treatment evaluations when using patient-reported outcome measures or objective functional tests. MCID thresholds can be calculated using a variety of methods, each yielding markedly different results, as demonstrated in this practical guide. Generally, anchor-based methods relying on scales assessing patient preferences/satisfaction or global assessment ratings continue to be the “gold-standard” approach, the most common being ROC analysis. In the absence of appropriate anchors, the distribution-based MCID based on the 95% MDC approach is acceptable, as it appears to yield the results most similar to those of anchor-based approaches. Moreover, we recommend using it as a supplement to any anchor-based MCID threshold to check whether that threshold can reliably distinguish true change from measurement inaccuracies. The explanations provided in this practical guide, with step-by-step examples along with public data and statistical code, can serve as guidance for future studies calculating MCID thresholds.

Jaeschke R, Singer J, Guyatt GH (1989) Measurement of health status. Ascertaining the minimal clinically important difference. Control Clin Trials 10:407–415. https://doi.org/10.1016/0197-2456(89)90005-6

Concato J, Hartigan JA (2016) P values: from suggestion to superstition. J Investig Med 64:1166. https://doi.org/10.1136/jim-2016-000206

Zannikos S, Lee L, Smith HE (2014) Minimum clinically important difference and substantial clinical benefit: Does one size fit all diagnoses and patients? Semin Spine Surg 26:8–11. https://doi.org/10.1053/j.semss.2013.07.004

Copay AG, Subach BR, Glassman SD et al (2007) Understanding the minimum clinically important difference: a review of concepts and methods. Spine J 7:541–546. https://doi.org/10.1016/j.spinee.2007.01.008

Lanario J, Hyland M, Menzies-Gow A et al (2020) Is the minimally clinically important difference (MCID) fit for purpose? A planned study using the SAQ. Eur Respir J. https://doi.org/10.1183/13993003.congress-2020.2241

Neely JG, Karni RJ, Engel SH, Fraley PL, Nussenbaum B, Paniello RC (2007) Practical guides to understanding sample size and minimal clinically important difference (MCID). Otolaryngol Head Neck Surg 136(1):14–18. https://doi.org/10.1016/j.otohns.2006.11.001

Copay AG, Glassman SD, Subach BR et al (2008) Minimum clinically important difference in lumbar spine surgery patients: a choice of methods using the Oswestry disability index, medical outcomes study questionnaire short form 36, and pain scales. Spine J 8:968–974. https://doi.org/10.1016/j.spinee.2007.11.006

Andersson EI, Lin CC, Smeets RJ (2010) Performance tests in people with chronic low back pain: responsiveness and minimal clinically important change. Spine 35(26):E1559-1563. https://doi.org/10.1097/BRS.0b013e3181cea12e

Mannion AF, Porchet F, Kleinstück FS, Lattig F, Jeszenszky D, Bartanusz V, Dvorak J, Grob D (2009) The quality of spine surgery from the patient’s perspective. Part 1: the core outcome measures index in clinical practice. Eur Spine J 18:367–373. https://doi.org/10.1007/s00586-009-0942-8

Crosby RD, Kolotkin RL, Williams GR (2003) Defining clinically meaningful change in health-related quality of life. J Clin Epidemiol 56:395–407. https://doi.org/10.1016/S0895-4356(03)00044-1

Gatchel RJ, Mayer TG (2010) Testing minimal clinically important difference: consensus or conundrum? Spine J 10:321–327. https://doi.org/10.1016/j.spinee.2009.10.015

Minetama M, Kawakami M, Teraguchi M et al (2019) Supervised physical therapy vs. home exercise for patients with lumbar spinal stenosis: a randomized controlled trial. Spine J 19:1310–1318. https://doi.org/10.1016/j.spinee.2019.04.009

R Core Team (2021) R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria

Chung AS, Copay AG, Olmscheid N, Campbell D, Walker JB, Chutkan N (2017) Minimum clinically important difference: current trends in the spine literature. Spine 42(14):1096–1105. https://doi.org/10.1097/BRS.0000000000001990

Khan I, Pennings JS, Devin CJ, Asher AM, Oleisky ER, Bydon M, Asher AL, Archer KR (2021) Clinically meaningful improvement following cervical spine surgery: 30% reduction versus absolute point-change MCID values. Spine 46(11):717–725. https://doi.org/10.1097/BRS.0000000000003887

Gautschi OP, Stienen MN, Corniola MV et al (2016) Assessment of the minimum clinically important difference in the timed up and go test after surgery for lumbar degenerative disc disease. Neurosurgery. https://doi.org/10.1227/NEU.0000000000001320

Kulkarni AV (2006) Distribution-based and anchor-based approaches provided different interpretability estimates for the hydrocephalus outcome questionnaire. J Clin Epidemiol 59:176–184. https://doi.org/10.1016/j.jclinepi.2005.07.011

Wang Y, Devji T, Qasim A et al (2022) A systematic survey identified methodological issues in studies estimating anchor-based minimal important differences in patient-reported outcomes. J Clin Epidemiol 142:144–151. https://doi.org/10.1016/j.jclinepi.2021.10.028

Parker SL, Godil SS, Shau DN et al (2013) Assessment of the minimum clinically important difference in pain, disability, and quality of life after anterior cervical discectomy and fusion: clinical article. J Neurosurg Spine 18:154–160. https://doi.org/10.3171/2012.10.SPINE12312

Carrasco-Labra A, Devji T, Qasim A et al (2021) Minimal important difference estimates for patient-reported outcomes: a systematic survey. J Clin Epidemiol 133:61–71. https://doi.org/10.1016/j.jclinepi.2020.11.024

Prinsen CAC, Mokkink LB, Bouter LM et al (2018) COSMIN guideline for systematic reviews of patient-reported outcome measures. Qual Life Res 27:1147–1157. https://doi.org/10.1007/s11136-018-1798-3

Mokkink LB, de Vet HCW, Prinsen CAC et al (2018) COSMIN risk of bias checklist for systematic reviews of patient-reported outcome measures. Qual Life Res 27:1171–1179. https://doi.org/10.1007/s11136-017-1765-4

Terwee CB, Prinsen CAC, Chiarotto A et al (2018) COSMIN methodology for evaluating the content validity of patient-reported outcome measures: a Delphi study. Qual Life Res 27:1159–1170. https://doi.org/10.1007/s11136-018-1829-0

Glassman SD, Copay AG, Berven SH et al (2008) Defining substantial clinical benefit following lumbar spine arthrodesis. J Bone Joint Surg Am 90:1839–1847. https://doi.org/10.2106/JBJS.G.01095

de Vet HCW, Ostelo RWJG, Terwee CB et al (2007) Minimally important change determined by a visual method integrating an anchor-based and a distribution-based approach. Qual Life Res 16:131–142. https://doi.org/10.1007/s11136-006-9109-9

Solberg T, Johnsen LG, Nygaard ØP, Grotle M (2013) Can we define success criteria for lumbar disc surgery? Acta Orthop 84:196–201. https://doi.org/10.3109/17453674.2013.786634

Power JD, Perruccio AV, Canizares M et al (2023) Determining minimal clinically important difference estimates following surgery for degenerative conditions of the lumbar spine: analysis of the Canadian spine outcomes and research network (CSORN) registry. Spine J 23:1323–1333. https://doi.org/10.1016/j.spinee.2023.05.001

Asher AL, Kerezoudis P, Mummaneni PV et al (2018) Defining the minimum clinically important difference for grade I degenerative lumbar spondylolisthesis: insights from the quality outcomes database. Neurosurg Focus 44:E2. https://doi.org/10.3171/2017.10.FOCUS17554

Cleland JA, Whitman JM, Houser JL et al (2012) Psychometric properties of selected tests in patients with lumbar spinal stenosis. Spine J 12:921–931. https://doi.org/10.1016/j.spinee.2012.05.004

Norman GR, Sloan JA, Wyrwich KW (2003) Interpretation of changes in health-related quality of life: the remarkable universality of half a standard deviation. Med Care 41:582–592. https://doi.org/10.1097/01.MLR.0000062554.74615.4C

Parker SL, Mendenhall SK, Shau DN et al (2012) Minimum clinically important difference in pain, disability, and quality of life after neural decompression and fusion for same-level recurrent lumbar stenosis: understanding clinical versus statistical significance. J Neurosurg Spine 16:471–478. https://doi.org/10.3171/2012.1.SPINE11842

Gatchel RJ, Mayer TG, Chou R (2012) What does/should the minimum clinically important difference measure?: a reconsideration of its clinical value in evaluating efficacy of lumbar fusion surgery. Clin J Pain 28:387. https://doi.org/10.1097/AJP.0b013e3182327f20

Lloyd H, Jenkinson C, Hadi M et al (2014) Patient reports of the outcomes of treatment: a structured review of approaches. Health Qual Life Outcomes 12:5. https://doi.org/10.1186/1477-7525-12-5

Beighley A, Zhang A, Huang B et al (2022) Patient-reported outcome measures in spine surgery: a systematic review. J Craniovertebr Junction Spine 13:378–389. https://doi.org/10.4103/jcvjs.jcvjs_101_22

Ogura Y, Ogura K, Kobayashi Y et al (2020) Minimum clinically important difference of major patient-reported outcome measures in patients undergoing decompression surgery for lumbar spinal stenosis. Clin Neurol Neurosurg 196:105966. https://doi.org/10.1016/j.clineuro.2020.105966

Wang Y, Devji T, Carrasco-Labra A et al (2023) An extension minimal important difference credibility item addressing construct proximity is a reliable alternative to the correlation item. J Clin Epidemiol 157:46–52. https://doi.org/10.1016/j.jclinepi.2023.03.001

Devji T, Carrasco-Labra A, Qasim A et al (2020) Evaluating the credibility of anchor based estimates of minimal important differences for patient reported outcomes: instrument development and reliability study. BMJ 369:m1714. https://doi.org/10.1136/bmj.m1714

Stucki G, Daltroy L, Liang MH et al (1996) Measurement properties of a self-administered outcome measure in lumbar spinal stenosis. Spine 21:796

Fujimori T, Ikegami D, Sugiura T, Sakaura H (2022) Responsiveness of the Zurich claudication questionnaire, the Oswestry disability index, the Japanese orthopaedic association back pain evaluation questionnaire, the 8-item short form health survey, and the Euroqol 5 dimensions 5 level in the assessment of patients with lumbar spinal stenosis. Eur Spine J 31:1399–1412. https://doi.org/10.1007/s00586-022-07236-5

Fukui M, Chiba K, Kawakami M et al (2009) JOA back pain evaluation questionnaire (JOABPEQ)/ JOA cervical myelopathy evaluation questionnaire (JOACMEQ) the report on the development of revised versions April 16, 2007: the subcommittee of the clinical outcome committee of the Japanese orthopaedic association on low back pain and cervical myelopathy evaluation. J Orthop Sci 14:348–365. https://doi.org/10.1007/s00776-009-1337-8

Kasai Y, Fukui M, Takahashi K et al (2017) Verification of the sensitivity of functional scores for treatment results–substantial clinical benefit thresholds for the Japanese orthopaedic association back pain evaluation questionnaire (JOABPEQ). J Orthop Sci 22:665–669. https://doi.org/10.1016/j.jos.2017.02.012

Glassman SD, Carreon LY, Anderson PA, Resnick DK (2011) A diagnostic classification for lumbar spine registry development. Spine J 11:1108–1116. https://doi.org/10.1016/j.spinee.2011.11.016

Perkins NJ, Schisterman EF (2006) The inconsistency of “optimal” cutpoints obtained using two criteria based on the receiver operating characteristic curve. Am J Epidemiol 163:670–675. https://doi.org/10.1093/aje/kwj063

Nahm FS (2022) Receiver operating characteristic curve: overview and practical use for clinicians. Korean J Anesthesiol 75:25–36. https://doi.org/10.4097/kja.21209

Angst F, Aeschlimann A, Angst J (2017) The minimal clinically important difference raised the significance of outcome effects above the statistical level, with methodological implications for future studies. J Clin Epidemiol 82:128–136. https://doi.org/10.1016/j.jclinepi.2016.11.016

Wyrwich KW, Tierney WM, Wolinsky FD (1999) Further evidence supporting an SEM-based criterion for identifying meaningful intra-individual changes in health-related quality of life. J Clin Epidemiol 52:861–873. https://doi.org/10.1016/s0895-4356(99)00071-2

Wolinsky FD, Wan GJ, Tierney WM (1998) Changes in the SF-36 in 12 months in a clinical sample of disadvantaged older adults. Med Care 36:1589–1598

Wyrwich KW, Nienaber NA, Tierney WM, Wolinsky FD (1999) Linking clinical relevance and statistical significance in evaluating intra-individual changes in health-related quality of life. Med Care 37:469–478. https://doi.org/10.1097/00005650-199905000-00006

Koo TK, Li MY (2016) A guideline of selecting and reporting intraclass correlation coefficients for reliability research. J Chiropr Med 15:155–163. https://doi.org/10.1016/j.jcm.2016.02.012

McHorney CA, Tarlov AR (1995) Individual-patient monitoring in clinical practice: are available health status surveys adequate? Qual Life Res 4:293–307

Hara N, Matsudaira K, Masuda K et al (2016) Psychometric assessment of the Japanese version of the Zurich claudication questionnaire (ZCQ): reliability and validity. PLoS ONE 11:e0160183. https://doi.org/10.1371/journal.pone.0160183

Kazis LE, Anderson JJ, Meenan RF (1989) Effect sizes for interpreting changes in health status. Med Care 27:S178–S189. https://doi.org/10.1097/00005650-198903001-00015

Cohen J (1988) Statistical power analysis for the behavioral sciences. L Erlbaum Associates, Hillsdale, NJ

Franceschini M, Boffa A, Pignotti E et al (2023) The minimal clinically important difference changes greatly based on the different calculation methods. Am J Sports Med 51:1067–1073. https://doi.org/10.1177/03635465231152484

Samsa G, Edelman D, Rothman ML et al (1999) Determining clinically important differences in health status measures: a general approach with illustration to the health utilities index mark II. Pharmacoeconomics 15:141–155. https://doi.org/10.2165/00019053-199915020-00003

Wright A, Hannon J, Hegedus EJ, Kavchak AE (2012) Clinimetrics corner: a closer look at the minimal clinically important difference (MCID). J Man Manip Ther 20:160–166. https://doi.org/10.1179/2042618612Y.0000000001

Stucki G, Liang MH, Fossel AH, Katz JN (1995) Relative responsiveness of condition-specific and generic health status measures in degenerative lumbar spinal stenosis. J Clin Epidemiol 48:1369–1378. https://doi.org/10.1016/0895-4356(95)00054-2

Liang MH, Fossel AH, Larson MGS (1990) Comparisons of five health status instruments for orthopedic evaluation. Med Care 28:632–642

Middel B, Van Sonderen E (2002) Statistical significant change versus relevant or important change in (quasi) experimental design: some conceptual and methodological problems in estimating magnitude of intervention-related change in health services research. Int J Integr care. https://doi.org/10.5334/ijic.65

Revicki D, Hays RD, Cella D, Sloan J (2008) Recommended methods for determining responsiveness and minimally important differences for patient-reported outcomes. J Clin Epidemiol 61:102–109. https://doi.org/10.1016/j.jclinepi.2007.03.012

Woaye-Hune P, Hardouin J-B, Lehur P-A et al (2020) Practical issues encountered while determining minimal clinically important difference in patient-reported outcomes. Health Qual Life Outcomes 18:156. https://doi.org/10.1186/s12955-020-01398-w

Parker SL, Mendenhall SK, Shau D et al (2012) Determination of minimum clinically important difference in pain, disability, and quality of life after extension of fusion for adjacent-segment disease. J Neurosurg Spine 16:61–67. https://doi.org/10.3171/2011.8.SPINE1194

Jacobson NS, Truax P (1991) Clinical significance: a statistical approach to defining meaningful change in psychotherapy research. J Consult Clin Psychol 59:12–19

Bolton JE (2004) Sensitivity and specificity of outcome measures in patients with neck pain: detecting clinically significant improvement. Spine 29(21):2410–2417. https://doi.org/10.1097/01.brs.0000143080.74061.25

Blampied NM (2022) Reliable change and the reliable change index: Still useful after all these years? Cogn Behav Ther 15:e50. https://doi.org/10.1017/S1754470X22000484

Asher AM, Oleisky ER, Pennings JS et al (2020) Measuring clinically relevant improvement after lumbar spine surgery: Is it time for something new? Spine J 20:847–856. https://doi.org/10.1016/j.spinee.2020.01.010

Barrett D, Heale R (2020) What are Delphi studies? Evid Based Nurs 23:68–69. https://doi.org/10.1136/ebnurs-2020-103303

Droeghaag R, Schuermans VNE, Hermans SMM et al (2021) Evidence-based recommendations for economic evaluations in spine surgery: study protocol for a Delphi consensus. BMJ Open 11:e052988. https://doi.org/10.1136/bmjopen-2021-052988

Henderson EJ, Morgan GS, Amin J et al (2019) The minimum clinically important difference (MCID) for a falls intervention in Parkinson’s: a delphi study. Parkinsonism Relat Disord 61:106–110. https://doi.org/10.1016/j.parkreldis.2018.11.008

Redelmeier DA, Guyatt GH, Goldstein RS (1996) Assessing the minimal important difference in symptoms: a comparison of two techniques. J Clin Epidemiol 49:1215–1219. https://doi.org/10.1016/s0895-4356(96)00206-5

Carreon LY, Glassman SD, Campbell MJ, Anderson PA (2010) Neck disability index, short form-36 physical component summary, and pain scales for neck and arm pain: the minimum clinically important difference and substantial clinical benefit after cervical spine fusion. Spine J 10:469–474. https://doi.org/10.1016/j.spinee.2010.02.007

Wang Y-C, Hart DL, Stratford PW, Mioduski JE (2011) Baseline dependency of minimal clinically important improvement. Phys Ther 91:675–688. https://doi.org/10.2522/ptj.20100229

Tenan MS, Simon JE, Robins RJ et al (2021) Anchored minimal clinically important difference metrics: considerations for bias and regression to the mean. J Athl Train 56:1042–1049. https://doi.org/10.4085/1062-6050-0368.20

Staartjes VE, Stumpo V, Ricciardi L et al (2022) FUSE-ML: development and external validation of a clinical prediction model for mid-term outcomes after lumbar spinal fusion for degenerative disease. Eur Spine J 31:2629–2638. https://doi.org/10.1007/s00586-022-07135-9

Open access funding provided by University of Zurich. This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

Author information

Authors and affiliations.

Department of Neurosurgery, Amsterdam UMC, Vrije Universiteit Amsterdam, Amsterdam Movement Sciences, Amsterdam, The Netherlands

Anita M. Klukowska & W. Peter Vandertop

Department of Neurosurgery, University Clinical Hospital of Bialystok, Bialystok, Poland

Anita M. Klukowska

Department of Neurosurgery, Park Medical Center, Rotterdam, The Netherlands

Marc L. Schröder

Machine Intelligence in Clinical Neuroscience and Microsurgical Neuroanatomy (MICN) Laboratory, Department of Neurosurgery, Clinical Neuroscience Center, University Hospital Zurich, University of Zurich, Zurich, Switzerland

Victor E. Staartjes

Corresponding author

Correspondence to Victor E. Staartjes .

Ethics declarations

Conflict of interest.

The authors declare that the article and its content were composed in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Additional information

Publisher's note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file1 (TXT 6 KB)

Rights and permissions.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .

About this article

Klukowska, A.M., Vandertop, W.P., Schröder, M.L. et al. Calculation of the minimum clinically important difference (MCID) using different methodologies: case study and practical guide. Eur Spine J (2024). https://doi.org/10.1007/s00586-024-08369-5

Received : 03 May 2024

Revised : 17 May 2024

Accepted : 10 June 2024

Published : 28 June 2024

DOI : https://doi.org/10.1007/s00586-024-08369-5

  • Minimum clinically important difference
  • Anchor-based methods
  • Distribution-based methods
  • Change scores
  • Clinical outcomes
  • Spine surgery

ADHD indicates attention-deficit/hyperactivity disorder; CVD, cardiovascular disease.

a Controls were derived from the same base cohort as the cases; thus, a case with a later date of CVD diagnosis could potentially serve as a control for another case in the study.

Crude odds ratios (ORs) were based on cases and controls matched on age, sex, and calendar time. Adjusted ORs (AORs) were based on cases and controls matched on age, sex, and calendar time and adjusted for country of birth, educational level, somatic comorbidities (type 2 diabetes, obesity, dyslipidemia, and sleep disorders), and psychiatric comorbidities (anxiety disorders, autism spectrum disorder, bipolar disorder, conduct disorder, depressive disorder, eating disorders, intellectual disability, personality disorders, schizophrenia, and substance use disorders).

The solid lines represent the adjusted odds ratios, and the shaded areas represent the 95% CIs. In restricted cubic splines analysis, knots were placed at the 10th, 50th, and 90th percentiles of ADHD medication use.

eTable 1. International Classification of Diseases (ICD) Codes from the Swedish National Inpatient Register

eTable 2. Type of Cardiovascular Disease in Cases

eTable 3. Risk of CVD Associated With ADHD Medication Use Across Different Average Defined Daily Doses

eTable 4. Risk of CVD Associated With Cumulative Duration of Use of Different Types of ADHD Medications

eTable 5. Sensitivity Analyses of CVD Risk Associated With Cumulative Use of ADHD Medications, Based On Different Cohort, Exposure, and Outcome Definitions

eFigure. Risk of CVD Associated With Cumulative Use of ADHD Medications, Stratified by Sex

Data Sharing Statement

  • Long-Term ADHD Medications and Cardiovascular Disease Risk JAMA Medical News in Brief December 26, 2023 Emily Harris
  • Long-Term Cardiovascular Effects of Medications for ADHD—Balancing Benefits and Risks of Treatment JAMA Psychiatry Editorial February 1, 2024 Samuele Cortese, MD, PhD; Cristiano Fava, MD, PhD

Zhang L, Li L, Andell P, et al. Attention-Deficit/Hyperactivity Disorder Medications and Long-Term Risk of Cardiovascular Diseases. JAMA Psychiatry. 2024;81(2):178–187. doi:10.1001/jamapsychiatry.2023.4294

Attention-Deficit/Hyperactivity Disorder Medications and Long-Term Risk of Cardiovascular Diseases

  • 1 Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden
  • 2 Unit of Cardiology, Heart and Vascular Division, Department of Medicine, Karolinska University Hospital, Karolinska Institutet, Stockholm, Sweden
  • 3 School of Medical Sciences, Faculty of Medicine and Health, Örebro University, Örebro, Sweden
  • 4 Department of Applied Health Science, School of Public Health, Indiana University, Bloomington
  • 5 Department of Psychological and Brain Sciences, Indiana University, Bloomington

Question   Is long-term use of attention-deficit/hyperactivity disorder (ADHD) medication associated with an increased risk of cardiovascular disease (CVD)?

Findings   In this case-control study of 278 027 individuals in Sweden aged 6 to 64 years who had an incident ADHD diagnosis or ADHD medication dispensation, longer cumulative duration of ADHD medication use was associated with an increased risk of CVD, particularly hypertension and arterial disease, compared with nonuse.

Meaning   Findings of this study suggest that long-term exposure to ADHD medications was associated with an increased risk of CVD; therefore, the potential risks and benefits of long-term ADHD medication use should be carefully weighed.

Importance   Use of attention-deficit/hyperactivity disorder (ADHD) medications has increased substantially over the past decades. However, the potential risk of cardiovascular disease (CVD) associated with long-term ADHD medication use remains unclear.

Objective   To assess the association between long-term use of ADHD medication and the risk of CVD.

Design, Setting, and Participants   This case-control study included individuals in Sweden aged 6 to 64 years who received an incident diagnosis of ADHD or ADHD medication dispensation between January 1, 2007, and December 31, 2020. Data on ADHD and CVD diagnoses and ADHD medication dispensation were obtained from the Swedish National Inpatient Register and the Swedish Prescribed Drug Register, respectively. Cases included individuals with ADHD and an incident CVD diagnosis (ischemic heart diseases, cerebrovascular diseases, hypertension, heart failure, arrhythmias, thromboembolic disease, arterial disease, and other forms of heart disease). Incidence density sampling was used to match cases with up to 5 controls without CVD based on age, sex, and calendar time. Cases and controls had the same duration of follow-up.

Exposure   Cumulative duration of ADHD medication use up to 14 years.

Main Outcomes and Measures   The primary outcome was incident CVD. The association between CVD and cumulative duration of ADHD medication use was measured using adjusted odds ratios (AORs) with 95% CIs.

Results   Of 278 027 individuals with ADHD aged 6 to 64 years, 10 388 with CVD were identified (median [IQR] age, 34.6 [20.0-45.7] years; 6154 males [59.2%]) and matched with 51 672 control participants without CVD (median [IQR] age, 34.6 [19.8-45.6] years; 30 601 males [59.2%]). Median (IQR) follow-up time in both groups was 4.1 (1.9-6.8) years. Longer cumulative duration of ADHD medication use was associated with an increased risk of CVD compared with nonuse (0 to ≤1 year: AOR, 0.99 [95% CI, 0.93-1.06]; 1 to ≤2 years: AOR, 1.09 [95% CI, 1.01-1.18]; 2 to ≤3 years: AOR, 1.15 [95% CI, 1.05-1.25]; 3 to ≤5 years: AOR, 1.27 [95% CI, 1.17-1.39]; and >5 years: AOR, 1.23 [95% CI, 1.12-1.36]). Longer cumulative ADHD medication use was associated with an increased risk of hypertension (eg, 3 to ≤5 years: AOR, 1.72 [95% CI, 1.51-1.97] and >5 years: AOR, 1.80 [95% CI, 1.55-2.08]) and arterial disease (eg, 3 to ≤5 years: AOR, 1.65 [95% CI, 1.11-2.45] and >5 years: AOR, 1.49 [95% CI, 0.96-2.32]). Across the 14-year follow-up, each 1-year increase of ADHD medication use was associated with a 4% increased risk of CVD (AOR, 1.04 [95% CI, 1.03-1.05]), with a larger increase in risk in the first 3 years of cumulative use (AOR, 1.08 [95% CI, 1.04-1.11]) and stable risk over the remaining follow-up. Similar patterns were observed in children and youth (aged <25 years) and adults (aged ≥25 years).

Conclusions and Relevance   This case-control study found that long-term exposure to ADHD medications was associated with an increased risk of CVDs, especially hypertension and arterial disease. These findings highlight the importance of carefully weighing potential benefits and risks when making treatment decisions about long-term ADHD medication use. Clinicians should regularly and consistently monitor cardiovascular signs and symptoms throughout the course of treatment.

Attention-deficit/hyperactivity disorder (ADHD) is a common psychiatric disorder characterized by developmentally inappropriate inattentiveness, impulsivity, and hyperactivity. 1 , 2 Pharmacological therapy, including both stimulants and nonstimulants, is recommended as the first-line treatment for ADHD in many countries. 1 , 3 The use of ADHD medication has increased greatly in both children and adults during the past decades. 4 Although the effectiveness of ADHD medications has been demonstrated in randomized clinical trials (RCTs) and other studies, 5 , 6 concerns remain regarding their potential cardiovascular safety. 7 Meta-analyses of RCTs have reported increases in heart rate and blood pressure associated with both stimulant and nonstimulant ADHD medications. 5 , 7 - 9

As RCTs typically evaluate short-term effects (average treatment duration of 75 days), 7 it remains uncertain whether and to what extent the increases in blood pressure and heart rate associated with ADHD medication lead to clinically significant cardiovascular disease (CVD) over time. Longitudinal observational studies 10 - 12 examining the association between ADHD medication use and serious cardiovascular outcomes have emerged in recent years, but the findings have been mixed. A meta-analysis 13 of observational studies found no statistically significant association between ADHD medication and risk of CVD. However, the possibility of a modest risk increase cannot be ruled out due to several methodological limitations in these studies, including confounding by indication, immortal time bias, and prevalent user bias. Additionally, most of these studies had an average follow-up time of no more than 2 years. 13 , 14 Thus, evidence regarding the long-term cardiovascular risk of ADHD medication use is still lacking.

Examining the long-term cardiovascular risk associated with ADHD medicine use is clinically important given that individuals with a diagnosis of ADHD, regardless of whether they receive treatment, face an elevated risk of CVD. 15 Additionally, a substantial proportion of young individuals with ADHD continues to have impairing symptoms in adulthood, 16 necessitating prolonged use of ADHD medication. Notably, studies have indicated a rising trend in the long-term use of ADHD medications, with approximately half of individuals using ADHD medication for over 5 years. 17 Furthermore, evidence is lacking regarding how cardiovascular risk may vary based on factors such as type of CVD, type of ADHD medication, age, and sex. 13 Therefore, there is a need for long-term follow-up studies to address these knowledge gaps and provide a more comprehensive understanding of the cardiovascular risks associated with ADHD medication use. This information is also crucial from a public health perspective, particularly due to the increasing number of individuals receiving ADHD medications worldwide. 4

This study aimed to assess the association between cumulative use of ADHD medication up to 14 years and the risk of CVD by using nationwide health registers in Sweden. We hypothesized that longer cumulative use of ADHD medication would be associated with increased CVD risk. In addition, we aimed to examine whether the associations differ across types of ADHD medication, types of CVD, sex, and age groups.

We used data from several Swedish nationwide registers linked through unique personal identification numbers. 18 Diagnoses were obtained from the National Inpatient Register, 19 which contains data on inpatient diagnoses since 1973 and outpatient diagnoses since 2001. Information on prescribed medications was retrieved from the Swedish Prescribed Drug Register, which contains all dispensed medications in Sweden since July 2005 and includes information on drug identity based on the Anatomical Therapeutic Chemical (ATC) classification, 20 dispensing dates, and free-text medication prescriptions. Socioeconomic factors were obtained from the Longitudinal Integrated Database for Health Insurance and Labour Market studies. 21 Information on death was retrieved from the Swedish Cause of Death Register, 22 which contains information on all deaths since 1952. The study was approved by the Swedish Ethical Review Authority. Informed patient consent is not required for register-based studies in Sweden. The study followed the Reporting of Studies Conducted Using Observational Routinely Collected Health Data–Pharmacoepidemiological Research ( RECORD-PE ) guideline. 23

We conducted a nested case-control study of all individuals residing in Sweden aged 6 to 64 years who received an incident diagnosis of ADHD or ADHD medication dispensation 15 between January 1, 2007, and December 31, 2020. The diagnosis of ADHD ( International Statistical Classification of Diseases and Related Health Problems, Tenth Revision [ ICD-10 ] code F90) was identified from the National Inpatient Register. Incident ADHD medication dispensation was identified from the Swedish Prescribed Drug Register and was defined as a dispensation after at least 18 months without any ADHD medication dispensation. 24 Baseline (ie, cohort entry) was defined as the date of incident ADHD diagnosis or ADHD medication dispensation, whichever came first. Individuals with ADHD medication prescriptions for indications other than ADHD 25 and individuals who emigrated, died, or had a history of CVD before baseline were excluded from the study. The cohort was followed until the case index date (ie, the date of CVD diagnosis), death, migration, or the study end date (December 31, 2020), whichever came first.
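The washout rule for incident dispensation can be illustrated with a minimal sketch. It is not the authors' code: the function name, the approximation of 18 months as 548 days, and the treatment of a first-ever recorded dispensation as incident are all assumptions made for illustration.

```python
from datetime import date

# Assumption: 18 months approximated as 548 days (about 1.5 x 365.25).
WASHOUT_DAYS = 548


def first_incident_dispensation(dispensation_dates, study_start, study_end):
    """Return the first dispensation inside the study window that follows
    at least WASHOUT_DAYS without any earlier dispensation, else None.

    Assumption: a person's first recorded dispensation counts as incident,
    which presumes the register covers the washout period before it.
    """
    previous = None
    for current in sorted(dispensation_dates):
        in_window = study_start <= current <= study_end
        washed_out = previous is None or (current - previous).days >= WASHOUT_DAYS
        if in_window and washed_out:
            return current
        previous = current
    return None


# Example: two dispensations in 2006, then a gap of more than 18 months.
dates = [date(2006, 3, 1), date(2006, 9, 1), date(2008, 6, 1)]
print(first_incident_dispensation(dates, date(2007, 1, 1), date(2020, 12, 31)))
```

Under the study definition, baseline would then be the earlier of this date and the date of the incident ADHD diagnosis.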

Within the study cohort, we identified cases as individuals with an incident diagnosis of any CVD (including ischemic heart diseases, cerebrovascular diseases, hypertension, heart failure, arrhythmias, thromboembolic disease, arterial disease, and other forms of heart disease; eTable 1 in Supplement 1 ) during follow-up. For each case, the date of their CVD diagnosis was assigned as the index date. Using incidence density sampling, 26 up to 5 controls without CVD were randomly selected for each case from the base cohort of individuals with ADHD. The matching criteria included age, sex, and calendar time, ensuring that cases and controls had the same duration of follow-up from baseline to index date. Controls were eligible for inclusion if they were alive, living in Sweden, and free of CVD at the time when their matched case received a diagnosis of CVD, with the index date set as the date of CVD diagnosis of the matched case ( Figure 1 ). Controls were derived from the same base cohort as the cases. Thus, a case with a later date of CVD diagnosis could potentially serve as a control for another case in the study. 26
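Incidence density (risk-set) sampling as described above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the study's implementation: the dictionary keys (`id`, `sex`, `birth_year`, `baseline`, `cvd_date`, `at_risk_until`) are hypothetical, and matching on exact birth year and on calendar year of baseline is an assumed granularity that the article does not specify.

```python
import random
from datetime import date


def eligible_controls(case, cohort):
    """People still at risk (alive, resident, CVD-free) on the case's index date
    who match the case on sex, birth year, and calendar year of baseline.

    `at_risk_until` is the first date a person is no longer at risk: CVD
    diagnosis, death, emigration, or the day after study end (hypothetical field).
    """
    index_date = case["cvd_date"]
    return [
        p for p in cohort
        if p["id"] != case["id"]
        and p["sex"] == case["sex"]
        and p["birth_year"] == case["birth_year"]
        and p["baseline"].year == case["baseline"].year
        and p["at_risk_until"] > index_date  # a future case is still eligible
    ]


def sample_controls(case, cohort, n_controls=5, rng=None):
    """Randomly draw up to n_controls from the case's risk set."""
    rng = rng or random.Random()
    pool = eligible_controls(case, cohort)
    return rng.sample(pool, min(n_controls, len(pool)))
```

Because eligibility is evaluated at the case's index date, a person who receives a CVD diagnosis later remains in the risk set until that date, which is why a later case can serve as a control for an earlier one.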

The main exposure was cumulative duration of ADHD medication use, which included all ADHD medications approved in Sweden during the study period, including stimulants (methylphenidate [ATC code N06BA04], amphetamine [ATC code N06BA01], dexamphetamine [ATC code N06BA02], and lisdexamfetamine [ATC code N06BA12]) as well as nonstimulants (atomoxetine [ATC code N06BA09] and guanfacine [ATC code C02AC02]). Duration of ADHD medication use was derived from a validated algorithm that estimates treatment duration from free text in prescription records. 25 The cumulative duration of ADHD medication use was calculated by summing all days covered by ADHD medication between baseline and 3 months prior to the index date. The last 3 months before the index date were excluded to reduce reverse causation, as clinicians’ perception of potential cardiovascular risks may influence ADHD medication prescription. This time window was chosen because routine psychiatric practice in Sweden limits a prescription to a maximum 3 months at a time. 27 Individuals with follow-up of less than 3 months were excluded.
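The exposure calculation (days covered between baseline and 3 months before the index date) can be sketched as below. This is an assumption-laden illustration rather than the validated algorithm cited in the text: treatment episodes are taken as already-derived half-open `(start, end)` date intervals, 3 months is approximated as 91 days, and overlapping episodes are merged so that no day is counted twice.

```python
from datetime import date, timedelta

# Assumption: the 3-month lag before the index date is approximated as 91 days.
LAG_DAYS = 91


def cumulative_days(episodes, baseline, index_date, lag_days=LAG_DAYS):
    """Sum days covered by treatment episodes within [baseline, index_date - lag).

    `episodes` is an iterable of (start, end) dates, end exclusive.
    Overlapping episodes are merged before counting (an assumption).
    """
    window_end = index_date - timedelta(days=lag_days)
    if window_end <= baseline:
        return 0
    # Keep only episodes that intersect the window, clipped to its bounds.
    clipped = sorted(
        (max(start, baseline), min(end, window_end))
        for start, end in episodes
        if start < window_end and end > baseline
    )
    total = 0
    current_start = current_end = None
    for start, end in clipped:
        if current_end is None or start > current_end:
            if current_end is not None:
                total += (current_end - current_start).days
            current_start, current_end = start, end
        else:
            current_end = max(current_end, end)
    if current_end is not None:
        total += (current_end - current_start).days
    return total


episodes = [
    (date(2010, 1, 1), date(2010, 4, 1)),
    (date(2010, 3, 1), date(2010, 5, 1)),   # overlaps the first episode
    (date(2011, 9, 1), date(2011, 12, 1)),  # partly inside the excluded lag
]
print(cumulative_days(episodes, date(2010, 1, 1), date(2012, 1, 1)))
```

In this example the first two episodes merge into 120 covered days, and only the part of the third episode before the lag window counts.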

We conducted conditional logistic regression analyses to estimate odds ratios (ORs) for the associations between cumulative durations of ADHD medication use and incident CVD. Crude ORs were adjusted for all matching variables (age, sex, and calendar time) by design. Adjusted ORs (AORs) were additionally controlled for country of birth (Sweden vs other), highest educational level (primary or lower secondary, upper secondary, postsecondary or postgraduate, or unknown; individuals aged <16 years were included as a separate category), and diagnoses of somatic (type 2 diabetes, obesity, dyslipidemia, and sleep disorders) and psychiatric comorbidities (anxiety disorders, autism spectrum disorder, bipolar disorder, conduct disorder, depressive disorder, eating disorders, intellectual disability, personality disorders, schizophrenia, and substance use disorders; eTable 1 in Supplement 1 ) before baseline. The association between cumulative ADHD medication use and incident CVD was assessed using both continuous and categorical measures (no ADHD medication use, 0 to ≤1, 1 to ≤2, 2 to ≤3, 3 to ≤5, and >5 years). To capture potential nonlinear associations, we used restricted cubic splines to examine ADHD medication use as a continuous measure throughout follow-up. 28 The associations were examined in the full sample and stratified by age at baseline, that is, children or youth (<25 years old) and adults (≥25 years old). Furthermore, to evaluate the association with dosage of ADHD medication, we estimated the risk of CVD associated with each 1-year increase in use of ADHD medication across different dosage groups categorized by the average defined daily dose (DDD; for instance, 1 DDD of methylphenidate equals 30 mg) during follow-up. 29
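To make the conditional logistic regression step concrete, the following minimal sketch fits the conditional likelihood for matched sets with a single numeric exposure using Newton's method. It is a teaching sketch only: the study's models were fitted in R with multiple covariates, and the function name and data layout here are hypothetical.

```python
import math


def fit_conditional_logit(matched_sets, max_iter=50, tol=1e-10):
    """Estimate the log odds ratio for one exposure from matched sets.

    `matched_sets` is a list of (case_x, [control_x, ...]) tuples.
    Each set contributes exp(beta * case_x) / sum(exp(beta * x)) over all
    members of the set, so matching factors cancel out of the likelihood.
    """
    beta = 0.0
    for _ in range(max_iter):
        gradient = 0.0
        information = 0.0
        for case_x, control_xs in matched_sets:
            xs = [case_x] + list(control_xs)
            weights = [math.exp(beta * x) for x in xs]
            total = sum(weights)
            mean_x = sum(w * x for w, x in zip(weights, xs)) / total
            mean_x2 = sum(w * x * x for w, x in zip(weights, xs)) / total
            gradient += case_x - mean_x               # score contribution
            information += mean_x2 - mean_x ** 2      # weighted variance of x
        if information <= 0:
            raise ValueError("no informative (discordant) matched sets")
        step = gradient / information
        beta += step
        if abs(step) < tol:
            break
    return beta


# 1:1 matched pairs with a binary exposure: 30 pairs with only the case exposed,
# 10 pairs with only the control exposed, 5 concordant pairs (uninformative).
pairs = [(1, [0])] * 30 + [(0, [1])] * 10 + [(1, [1])] * 5
odds_ratio = math.exp(fit_conditional_logit(pairs))
print(round(odds_ratio, 3))
```

For 1:1 matching with a binary exposure, the conditional estimate equals the ratio of discordant pairs (here 30/10), which gives a quick check on the fit.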

In subgroup analyses, we examined the associations between ADHD medication use and specific CVDs, including arrhythmias, arterial disease, cerebrovascular disease, heart failure, hypertension, ischemic heart disease, and thromboembolic disease (eTable 1 in Supplement 1 ). Additionally, we investigated the associations with CVD risk for the most commonly prescribed ADHD medications in Sweden, ie, methylphenidate, lisdexamfetamine, and atomoxetine, while adjusting for other ADHD medication use. We also examined sex-specific associations.

To further examine the robustness of our findings, we conducted 4 sensitivity analyses. First, we restricted the sample to ever users of ADHD medication to reduce unmeasured confounding between ADHD medication users and nonusers. Second, we assessed ADHD medication exposure over the entire follow-up period without excluding the 3 months prior to the index date. Third, to capture fatal cardiovascular events, we additionally included death by CVD in the outcome definition. Finally, we constructed a conditional logistic regression model that adjusted for propensity scores of ADHD medication use. Data management was performed using SAS, version 9.4 (SAS Institute Inc) and all analyses were performed using R, version 4.2.3 (R Foundation for Statistical Computing).

The study cohort consisted of 278 027 individuals with ADHD aged 6 to 64 years. The incidence rate of CVD was 7.34 per 1000 person-years. After applying exclusion criteria and matching, the analysis included 10 388 cases (median [IQR] age at baseline, 34.6 [20.0-45.7] years; 6154 males [59.2%] and 4234 females [40.8%]) and 51 672 matched controls (median [IQR] age at baseline, 34.6 [19.8-45.6] years; 30 601 males [59.2%] and 21 071 females [40.8%]) (Figure 1 and Table 1). Median (IQR) follow-up in both groups was 4.1 (1.9-6.8) years. Among the controls, 3363 had received a CVD diagnosis after their index dates. The most common types of CVD in cases were hypertension (4210 cases [40.5%]) and arrhythmias (1310 cases [12.6%]; eTable 2 in Supplement 1). Table 1 presents the sociodemographic information and somatic and psychiatric comorbidities in cases and controls. In general, cases had higher rates of somatic and psychiatric comorbidities and a lower level of educational attainment compared with controls.

A similar proportion of cases (83.9%) and controls (83.5%) used ADHD medication during follow-up, with methylphenidate being the most commonly dispensed type, followed by atomoxetine and lisdexamfetamine. Longer cumulative duration of ADHD medication use was associated with an increased risk of CVD compared with nonuse (0 to ≤1 year: AOR, 0.99 [95% CI, 0.93-1.06]; 1 to ≤2 years: AOR, 1.09 [95% CI, 1.01-1.18]; 2 to ≤3 years: AOR, 1.15 [95% CI, 1.05-1.25]; 3 to ≤5 years: AOR, 1.27 [95% CI, 1.17-1.39]; and >5 years: AOR, 1.23 [95% CI, 1.12-1.36]) ( Figure 2 ). The restricted cubic spline model suggested a nonlinear association, with the AORs increasing rapidly for the first 3 cumulative years of ADHD medication use and then becoming stable thereafter ( Figure 3 ). Throughout the entire follow-up, each 1-year increase in the use of ADHD medication was associated with a 4% increased risk of CVD (AOR, 1.04 [95% CI, 1.03-1.05]), and the corresponding increase for the first 3 years was 8% (AOR, 1.08 [95% CI, 1.04-1.11]). We observed similar results when examining children or youth and adults separately ( Figure 2 ). The restricted cubic spline model suggested a similar nonlinear association, with higher AORs in children or youth than in adults, but the 95% CIs largely overlapped ( Figure 3 ). Furthermore, similar associations were observed for females and males (eFigure in Supplement 1 ). The dosage analysis showed that the risk of CVD associated with each 1 year of ADHD medication use increased with higher average DDDs. The risk was found to be statistically significant only among individuals with a mean dose of at least 1.5 times the DDD (eTable 3 in Supplement 1 ). For example, among individuals with a mean DDD of 1.5 to 2 or less (eg, for methylphenidate, 45 to ≤60 mg), each 1-year increase in ADHD medication use was associated with a 4% increased risk of CVD (AOR, 1.04 [95% CI, 1.02-1.05]). 
Among individuals with a mean DDD >2 (eg, for methylphenidate >60 mg), each 1-year increase in ADHD medication use was associated with 5% increased risk of CVD (AOR, 1.05 [95% CI, 1.03-1.06]).

When examining the risk for specific CVDs, we found that long-term use of ADHD medication (compared with no use) was associated with an increased risk of hypertension (AOR, 1.72 [95% CI, 1.51-1.97] for 3 to ≤5 years; AOR, 1.80 [95% CI, 1.55-2.08] for >5 years) (Table 2), as well as arterial disease (AOR, 1.65 [95% CI, 1.11-2.45] for 3 to ≤5 years; AOR, 1.49 [95% CI, 0.96-2.32] for >5 years). However, we did not observe any statistically significant increased risk for arrhythmias, heart failure, ischemic heart disease, thromboembolic disease, or cerebrovascular disease (Table 2). Furthermore, long-term use of methylphenidate (compared with no use) was associated with an increased risk of CVD (AOR, 1.20 [95% CI, 1.10-1.31] for 3 to ≤5 years; AOR, 1.19 [95% CI, 1.08-1.31] for >5 years; eTable 4 in Supplement 1). Compared with no use, lisdexamfetamine was also associated with an elevated risk of CVD (AOR, 1.23 [95% CI, 1.05-1.44] for 2 to ≤3 years; AOR, 1.17 [95% CI, 0.98-1.40] for >3 years), while the AOR for atomoxetine use was significant only for the first year of use (AOR, 1.07 [95% CI, 1.01-1.13]; eTable 4 in Supplement 1).

In sensitivity analyses, we observed a similar pattern of estimates when the analysis was restricted to ever users of ADHD medications. Compared with ADHD medication use for 1 year or less, a significantly increased risk of CVD was found for use of 3 to ≤5 years (AOR, 1.28 [95% CI, 1.18-1.38]) and for use of more than 5 years (AOR, 1.24 [95% CI, 1.13-1.36]) (eTable 5 in Supplement 1). When assessing ADHD medication use across the entire follow-up period, and compared with no use, the pattern of estimates was similar to the main analysis (3 to ≤5 years: AOR, 1.28 [95% CI, 1.18-1.39]; >5 years: AOR, 1.25 [95% CI, 1.14-1.37]) (eTable 5 in Supplement 1). The analysis that included cardiovascular death as a combined outcome also had results similar to the main analysis. Moreover, when adjusting for propensity scores of ADHD medication use, the findings remained consistent (eTable 5 in Supplement 1).

This large, nested case-control study found an increased risk of incident CVD associated with long-term ADHD medication use, and the risk increased with increasing duration of ADHD medication use. This association was statistically significant both for children and youth and for adults, as well as for females and males. The primary contributors to the association between long-term ADHD medication use and CVD risk were increased risks of hypertension and arterial disease. Increased risk was also associated with stimulant medication use.

We found individuals with long-term ADHD medication use had an increased risk of incident CVD in a dose-response manner in the first 3 years of cumulative ADHD medication use. To our knowledge, few previous studies have investigated the association between long-term ADHD medication use and the risk of CVD with follow-up of more than 2 years. 13 The only 2 prior studies with long-term follow-up (median, 9.5 and 7.9 years 30 , 31 ) found an average 2-fold and 3-fold increased risk of CVD with ADHD medication use compared with nonuse during the study period, yet 1 of the studies 30 included only children, and participants in the other study 31 were not the general population of individuals with ADHD (including those with ADHD and long QT syndrome). Furthermore, both studies were subject to prevalent user bias. Results from the current study suggest that the CVD risk associated with ADHD medication use (23% increased risk for >5 years of ADHD medication use compared with nonuse) is lower than previously reported. 30 , 31 Furthermore, we observed that the increased risk stabilized after the first several years of medication use and persisted throughout the 14-year follow-up period.

The association between ADHD medication use and CVD was significant for hypertension and arterial disease, while no significant association was observed with other types of cardiovascular events. To our knowledge, only 1 previous study 12 has examined the association between ADHD medication use and clinically diagnosed hypertension, and it found an increased risk, although the increase was not statistically significant. Furthermore, increased blood pressure associated with ADHD medication use has been well documented. 7 , 9 One study 32 found that blood pressure was mainly elevated during the daytime, suggesting that the cardiovascular system may recover at night. However, the cross-sectional nature of that study cannot preclude a long-term risk of clinically diagnosed hypertension associated with ADHD medication use. We also identified an increased risk for arterial disease. To date, no previous study has explored the association between ADHD medication use and arterial disease. A few studies have reported that ADHD medication may be associated with changes in serum lipid profiles, but the results were not consistent. 33 , 34 Further research is needed on the potential implications of ADHD medications for individuals’ lipid profiles. We did not observe any association between ADHD medication use and the risk of arrhythmias. A recent systematic review of observational studies of ADHD medication use reported an elevated risk of arrhythmias, but the finding was not statistically significant. 13 A review of RCTs also found that the use of stimulants was associated with an average increase in heart rate of 5.7 beats/min, 9 but no evidence of prolonged QT interval or tachycardia was found based on electrocardiograms. 7 Additionally, it is worth noting that some individuals receiving ADHD medications might be prescribed antiarrhythmic β-blockers to alleviate palpitation symptoms, thus potentially attenuating an association between ADHD medications and arrhythmias. 
Nevertheless, the absence of an association between ADHD medication use and clinically diagnosed arrhythmias in the present study does not rule out an increased risk for mild arrhythmias or subclinical symptoms, as palpitations and sinus tachycardia are not routinely coded as arrhythmia diagnoses. Further research is necessary to replicate our findings.

Regarding types of ADHD medication, findings of the present study suggest that increasing cumulative durations of methylphenidate and lisdexamfetamine use were associated with incident CVD, while the associations for atomoxetine were statistically significant only for the first year of use. Previous RCTs have reported increased blood pressure and heart rate with methylphenidate, lisdexamfetamine, and atomoxetine, 5 , 35 , 36 but the mechanisms behind these adverse effects are still a topic of debate; there might be differences in cardiovascular adverse effects in stimulants vs nonstimulants. 37

We found that the association between cumulative duration of ADHD medication use and CVD was similar in females and males. Previous investigations exploring sex-specific association found higher point estimates in females, although the differences were not statistically significant. 13 Research has indicated that females diagnosed with ADHD may demonstrate different comorbidity patterns and potentially have different responses to stimulant medications compared with males. 38 - 40 Therefore, additional studies are needed to explore and better understand the potential sex-specific differences in cardiovascular responses to ADHD medications.

A strength of this study is that data on ADHD medication prescriptions and CVD diagnoses were recorded prospectively, so the results were not affected by recall bias. The findings should, however, be interpreted in the context of several limitations. First, our approach for identification of patients with CVD was based on recorded diagnoses, and there could be underascertainment of cardiovascular diagnoses in the registers used. This means that some controls may have had undiagnosed CVD that did not yet require medical care, which would tend to lead to underestimation of the associations between ADHD medication use and CVD. Second, exposure misclassification may have occurred if patients did not take their medication as prescribed. This misclassification, if nondifferential, would tend to reduce ORs such that the estimates we observed were conservative. Third, while we accounted for a wide range of potential confounding variables, considering the observational nature of the study and the possibility of residual confounding, we could not prove causality. It is possible that the association observed might have been affected by time-varying confounders. For example, other psychotropic medications and lifestyle factors could have affected both ADHD medication use and the occurrence of cardiovascular events. 41 , 42 Confounding by ADHD severity is also a potential factor to consider, as individuals with more severe ADHD symptoms may have more comorbidities and a less healthy lifestyle, which could affect the risk of CVD. Fourth, the study did not examine the risk of CVD among individuals with preexisting CVD. Individuals with preexisting CVD represent a distinct clinical group that requires careful monitoring; thus, evaluating the risk among them necessitates a different study design that carefully considers the potential impact of prior knowledge and periodic monitoring. 
Finally, the results by type of ADHD medication and type of CVD need to be replicated by studies with larger sample sizes.

The results of this population-based case-control study with a longitudinal follow-up of 14 years suggested that long-term use of ADHD medication was associated with an increased risk of CVD, especially hypertension and arterial disease, and the risk was higher for stimulant medications. These findings highlight the importance of carefully weighing potential benefits and risks when making treatment decisions on long-term ADHD medication use. Clinicians should be vigilant in monitoring patients, particularly among those receiving higher doses, and consistently assess signs and symptoms of CVD throughout the course of treatment. Monitoring becomes even more crucial considering the increasing number of individuals engaging in long-term use of ADHD medication.

Accepted for Publication: August 29, 2023.

Published Online: November 22, 2023. doi:10.1001/jamapsychiatry.2023.4294

Open Access: This is an open access article distributed under the terms of the CC-BY License . © 2023 Zhang L et al. JAMA Psychiatry .

Corresponding Authors: Zheng Chang, PhD ( [email protected] ) and Le Zhang, PhD ( [email protected] ), Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Nobels väg 12A, 171 65 Stockholm, Sweden.

Author Contributions: Dr Zhang and Prof Chang had full access to all of the data in the study and take responsibility for the integrity of the data and the accuracy of the data analysis.

Concept and design: Zhang, Johnell, Larsson, Chang.

Acquisition, analysis, or interpretation of data: Zhang, Li, Andell, Garcia-Argibay, Quinn, D'Onofrio, Brikell, Kuja-Halkola, Lichtenstein, Johnell, Chang.

Drafting of the manuscript: Zhang.

Critical review of the manuscript for important intellectual content: All authors.

Statistical analysis: Zhang, Li.

Obtained funding: Larsson, Chang.

Administrative, technical, or material support: Garcia-Argibay, D'Onofrio, Kuja-Halkola, Lichtenstein, Chang.

Supervision: Andell, Lichtenstein, Johnell, Larsson, Chang.

Conflict of Interest Disclosures: Dr Larsson reported receiving grants from Takeda Pharmaceuticals and personal fees from Takeda Pharmaceuticals, Evolan, and Medici Medical Ltd outside the submitted work. No other disclosures were reported.

Funding/Support: This study was supported by grants from the Swedish Research Council for Health, Working Life, and Welfare (2019-01172 and 2022-01111) (Dr Chang) and the European Union’s Horizon 2020 research and innovation program under grant agreement 965381 (Dr Larsson).

Role of the Funder/Sponsor: The funders had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; and decision to submit the manuscript for publication.

Data Sharing Statement: See Supplement 2 .


Research: Using AI at Work Makes Us Lonelier and Less Healthy

  • David De Cremer
  • Joel Koopman


Employees who use AI as a core part of their jobs report feeling more isolated, drinking more, and sleeping less than employees who don’t.

The promise of AI is alluring (optimized productivity, lightning-fast data analysis, and freedom from mundane tasks), and companies and workers alike are fascinated (and more than a little dumbfounded) by how these tools allow them to do more and better work faster than ever before. Yet in their fervor to keep pace with competitors and reap the efficiency gains associated with deploying AI, many organizations have lost sight of their most important asset: the humans whose jobs are being fragmented into tasks that are increasingly becoming automated. Across four studies, employees who use AI as a core part of their jobs reported feeling lonelier, drinking more, and suffering from insomnia more than employees who don’t.

Imagine this: Jia, a marketing analyst, arrives at work, logs into her computer, and is greeted by an AI assistant that has already sorted through her emails, prioritized her tasks for the day, and generated first drafts of reports that used to take hours to write. Jia (like everyone who has spent time working with these tools) marvels at how much time she can save by using AI. Inspired by the efficiency-enhancing effects of AI, Jia feels that she can be so much more productive than before. As a result, she gets focused on completing as many tasks as possible in conjunction with her AI assistant.

  • David De Cremer is a professor of management and technology at Northeastern University and the Dunton Family Dean of its D’Amore-McKim School of Business. His website is daviddecremer.com .
  • Joel Koopman is the TJ Barlow Professor of Business Administration at the Mays Business School of Texas A&M University. His research interests include prosocial behavior, organizational justice, motivational processes, and research methodology. He has won multiple awards from the Academy of Management’s HR Division (Early Career Achievement Award and David P. Lepak Service Award) along with the 2022 SIOP Distinguished Early Career Contributions award, and currently serves on the Leadership Committee for the HR Division of the Academy of Management.

J Med Libr Assoc. 2019 Jan; 107(1).

Distinguishing case study as a research method from case reports as a publication type

The purpose of this editorial is to distinguish between case reports and case studies. In health, case reports are familiar ways of sharing events or efforts of intervening with single patients with previously unreported features. As a qualitative methodology, case study research encompasses a great deal more complexity than a typical case report and often incorporates multiple streams of data combined in creative ways. The depth and richness of case study description helps readers understand the case and whether findings might be applicable beyond that setting.

Single-institution descriptive reports of library activities are often labeled by their authors as “case studies.” By contrast, in health care, single patient retrospective descriptions are published as “case reports.” Both case reports and case studies are valuable to readers and provide a publication opportunity for authors. A previous editorial by Akers and Amos about improving case studies addresses issues that are more common to case reports; for example, not having a review of the literature or being anecdotal, not generalizable, and prone to various types of bias such as positive outcome bias [ 1 ]. However, case study research as a qualitative methodology is pursued for different purposes than generalizability. The authors’ purpose in this editorial is to clearly distinguish between case reports and case studies. We believe that this will assist authors in describing and designating the methodological approach of their publications and help readers appreciate the rigor of well-executed case study research.

Case reports often provide a first exploration of a phenomenon or an opportunity for a first publication by a trainee in the health professions. In health care, case reports are familiar ways of sharing events or efforts of intervening with single patients with previously unreported features. Another type of study categorized as a case report is an “N of 1” study or single-subject clinical trial, which considers an individual patient as the sole unit of observation in a study investigating the efficacy or side effect profiles of different interventions. Entire journals have evolved to publish case reports, which often rely on template structures with limited contextualization or discussion of previous cases. Examples that are indexed in MEDLINE include the American Journal of Case Reports , BMJ Case Reports, Journal of Medical Case Reports, and Journal of Radiology Case Reports . Similar publications appear in veterinary medicine and are indexed in CAB Abstracts, such as Case Reports in Veterinary Medicine and Veterinary Record Case Reports .

As a qualitative methodology, however, case study research encompasses a great deal more complexity than a typical case report and often incorporates multiple streams of data combined in creative ways. Distinctions include the investigator’s definitions and delimitations of the case being studied, the clarity of the role of the investigator, the rigor of gathering and combining evidence about the case, and the contextualization of the findings. Delimitation is a term from qualitative research about setting boundaries to scope the research in a useful way rather than describing the narrow scope as a limitation, as often appears in a discussion section. The depth and richness of description helps readers understand the situation and whether findings from the case are applicable to their settings.


Case study as a qualitative methodology is an exploration of a time- and space-bound phenomenon. As qualitative research, case studies require much more from their authors who are acting as instruments within the inquiry process. In the case study methodology, a variety of methodological approaches may be employed to explain the complexity of the problem being studied [ 2 , 3 ].

Leading authors diverge in their definitions of case study, but a qualitative research text introduces case study as follows:

Case study research is defined as a qualitative approach in which the investigator explores a real-life, contemporary bounded system (a case) or multiple bounded systems (cases) over time, through detailed, in-depth data collection involving multiple sources of information, and reports a case description and case themes. The unit of analysis in the case study might be multiple cases (a multisite study) or a single case (a within-site case study). [ 4 ]

Methodologists writing core texts on case study research include Yin [ 5 ], Stake [ 6 ], and Merriam [ 7 ]. The approaches of these three methodologists have been compared by Yazan, who focused on six areas of methodology: epistemology (beliefs about ways of knowing), definition of cases, design of case studies, and gathering, analysis, and validation of data [ 8 ]. For Yin, case study is a method of empirical inquiry appropriate to determining the “how and why” of phenomena and contributes to understanding phenomena in a holistic and real-life context [ 5 ]. Stake defines a case study as a “well-bounded, specific, complex, and functioning thing” [ 6 ], while Merriam views “the case as a thing, a single entity, a unit around which there are boundaries” [ 7 ].

Case studies are ways to explain, describe, or explore phenomena. Comments from a quantitative perspective about case studies lacking rigor and generalizability fail to consider the purpose of the case study and how what is learned from a case study is put into practice. Rigor in case studies comes from the research design and its components, which Yin outlines as (a) the study’s questions, (b) the study’s propositions, (c) the unit of analysis, (d) the logic linking the data to propositions, and (e) the criteria for interpreting the findings [ 5 ]. Case studies should also provide multiple sources of data, a case study database, and a clear chain of evidence among the questions asked, the data collected, and the conclusions drawn [ 5 ].

Sources of evidence for case studies include interviews, documentation, archival records, direct observations, participant-observation, and physical artifacts. One of the most important sources for data in qualitative case study research is the interview [ 2 , 3 ]. In addition to interviews, documents and archival records can be gathered to corroborate and enhance the findings of the study. To understand the phenomenon or the conditions that created it, direct observations can serve as another source of evidence and can be conducted throughout the study. These can include the use of formal and informal protocols as a participant inside the case or an external or passive observer outside of the case [ 5 ]. Lastly, physical artifacts can be observed and collected as a form of evidence. With these multiple potential sources of evidence, the study methodology includes gathering data, sense-making, and triangulating multiple streams of data. Figure 1 shows an example in which data used for the case started with a pilot study to provide additional context to guide more in-depth data collection and analysis with participants.

Figure 1. Key sources of data for a sample case study


Case study methodology is evolving and regularly reinterpreted. Comparative or multiple case studies are used as a tool for synthesizing information across time and space to research the impact of policy and practice in various fields of social research [ 9 ]. Because case study research is in-depth and intensive, there have been efforts to simplify the method or select useful components of cases for focused analysis. Micro-case study is a term that is occasionally used to describe research on micro-level cases [ 10 ]. These are cases that occur in a brief time frame, occur in a confined setting, and are simple and straightforward in nature. A micro-level case describes a clear problem of interest. Reporting is very brief and about specific points. The lack of complexity in the case description makes obvious the “lesson” that is inherent in the case, although no definitive “solution” is necessarily forthcoming, which makes the case useful for discussion. A micro-case write-up can be distinguished from a case report by its focus on briefly reporting specific features of a case or cases to analyze or learn from those features.


Disciplines such as education, psychology, sociology, political science, and social work regularly publish rich case studies that are relevant to particular areas of health librarianship. Case reports and case studies have been defined as publication types or subject terms by several databases that are relevant to librarian authors: MEDLINE, PsycINFO, CINAHL, and ERIC. Library, Information Science & Technology Abstracts (LISTA) does not have a subject term or publication type related to cases, despite many being included in the database. Whereas “Case Reports” are the main term used by MEDLINE’s Medical Subject Headings (MeSH) and PsycINFO’s thesaurus, CINAHL and ERIC use “Case Studies.”

Case reports in MEDLINE and PsycINFO focus on clinical case documentation. In MeSH, “Case Reports” as a publication type is specific to “clinical presentations that may be followed by evaluative studies that eventually lead to a diagnosis” [ 11 ]. “Case Histories,” “Case Studies,” and “Case Study” are all entry terms mapping to “Case Reports”; however, guidance to indexers suggests that “Case Reports” should not be applied to institutional case reports and refers to the heading “Organizational Case Studies,” which is defined as “descriptions and evaluations of specific health care organizations” [ 12 ].

PsycINFO’s subject term “Case Report” is “used in records discussing issues involved in the process of conducting exploratory studies of single or multiple clinical cases.” The Methodology index offers clinical and non-clinical entries. “Clinical Case Study” is defined as “case reports that include disorder, diagnosis, and clinical treatment for individuals with mental or medical illnesses,” whereas “Non-clinical Case Study” is a “document consisting of non-clinical or organizational case examples of the concepts being researched or studied. The setting is always non-clinical and does not include treatment-related environments” [ 13 ].

Both CINAHL and ERIC acknowledge the depth of analysis in case study methodology. The CINAHL scope note for the thesaurus term “Case Studies” distinguishes between the document and the methodology, though both use the same term: “a review of a particular condition, disease, or administrative problem. Also, a research method that involves an in-depth analysis of an individual, group, institution, or other social unit. For material that contains a case study, search for document type: case study.” The ERIC scope note for the thesaurus term “Case Studies” is simple: “detailed analyses, usually focusing on a particular problem of an individual, group, or organization” [ 14 ].


We call your attention to a few examples published as case studies in health sciences librarianship to consider how their characteristics fit with the preceding definitions of case reports or case study research. All present some characteristics of case study research, but their treatment of the research questions, richness of description, and analytic strategies vary in depth and, therefore, diverge at some level from the qualitative case study research approach. This divergence, particularly in richness of description and analysis, may have been constrained by the publication requirements.

As one example, a case study by Janke and Rush documented a time- and context-bound collaboration involving a librarian and a nursing faculty member [ 15 ]. Three objectives were stated: (1) describing their experience of working together on an interprofessional research team, (2) evaluating the value of the librarian role from librarian and faculty member perspectives, and (3) relating findings to existing literature. Elements that signal the qualitative nature of this case study are that the authors were the research participants and their use of the term “evaluation” is reflection on their experience. This reads like a case study that could have been enriched by including other types of data gathered from others engaging with this team to broaden the understanding of the collaboration.

As another example, the description of the academic context is one of the most salient components of the case study written by Clairoux et al., which had the objectives of (1) describing the library instruction offered and learning assessments used at a single health sciences library and (2) discussing the positive outcomes of instruction in that setting [ 16 ]. The authors focus on sharing what the institution has done more than explaining why this institution is an exemplar to explore a focused question or understand the phenomenon of library instruction. However, like a case study, the analysis brings together several streams of data including course attendance, online material page views, and some discussion of results from surveys. This paper reads somewhat in between an institutional case report and a case study.

The final example is a single author reporting on a personal experience of creating and executing the role of research informationist for a National Institutes of Health (NIH)–funded research team [ 17 ]. There is a thoughtful review of the informationist literature and detailed descriptions of the institutional context and the process of gaining access to and participating in the new role. However, the motivating question in the abstract does not seem to be fully addressed through analysis from either the reflective perspective of the author as the research participant or consideration of other streams of data from those involved in the informationist experience. The publication reads more like a case report about this informationist’s experience than a case study that explores the research informationist experience through the selection of this case.

All of these publications are well written and useful for their intended audiences, but in general, they are much shorter and much less rich in depth than case studies published in social sciences research. It may be that the authors have been constrained by word counts or page limits. For example, the submission category for Case Studies in the Journal of the Medical Library Association (JMLA) limited them to 3,000 words and defined them as “articles describing the process of developing, implementing, and evaluating a new service, program, or initiative, typically in a single institution or through a single collaborative effort” [ 18 ]. This definition’s focus on novelty and description sounds much more like the definition of case report than the in-depth, detailed investigation of a time- and space-bound problem that is often examined through case study research.

Problem-focused or question-driven case study research would benefit from the space provided for Original Investigations that employ any type of quantitative or qualitative method of analysis. One of the best examples in the JMLA of an in-depth multiple case study was authored by a librarian who published the findings from her doctoral dissertation, and it represented all the elements of a case study. In eight pages, she provided a theoretical basis for the research question, a pilot study, and a multiple case design, including integrated data from interviews and focus groups [ 19 ].

We have distinguished between case reports and case studies primarily to assist librarians who are new to research and critical appraisal of case study methodology to recognize the features that authors use to describe and designate the methodological approaches of their publications. For researchers who are new to case research methodology and are interested in learning more, Hancock and Algozzine provide a guide [ 20 ].

We hope that JMLA readers appreciate the rigor of well-executed case study research. We believe that distinguishing between descriptive case reports and analytic case studies in the journal’s submission categories will allow the depth of case study methodology to increase. We also hope that authors feel encouraged to pursue submitting relevant case studies or case reports for future publication.

Editor’s note: In response to this invited editorial, the Journal of the Medical Library Association will consider manuscripts employing rigorous qualitative case study methodology to be Original Investigations (fewer than 5,000 words), whereas manuscripts describing the process of developing, implementing, and assessing a new service, program, or initiative—typically in a single institution or through a single collaborative effort—will be considered to be Case Reports (formerly known as Case Studies; fewer than 3,000 words).



What Extra Information Can Be Provided by Multi-Component Seismic Data: A Case Study of 2D3C Prospecting of a Copper–Molybdenum Mine in Inner Mongolia, China


Li, Y.; Gu, Y.; Zhang, Y.; Wang, Y.; Yu, G.; Xu, M. What Extra Information Can Be Provided by Multi-Component Seismic Data: A Case Study of 2D3C Prospecting of a Copper–Molybdenum Mine in Inner Mongolia, China. Minerals 2024 , 14 , 689. https://doi.org/10.3390/min14070689

  • Open access
  • Published: 05 January 2024

Enhancing geometric representations for molecules with equivariant vector-scalar interactive message passing

  • Yusong Wang 1 , 2   na1 ,
  • Tong Wang   ORCID: orcid.org/0000-0002-9483-0050 1   na1 ,
  • Shaoning Li 1   na1 ,
  • Xinheng He 1 , 3 , 4 ,
  • Mingyu Li 1 , 5 ,
  • Zun Wang   ORCID: orcid.org/0000-0002-8763-8327 1 ,
  • Nanning Zheng 2 ,
  • Bin Shao   ORCID: orcid.org/0000-0002-9790-5687 1 &
  • Tie-Yan Liu   ORCID: orcid.org/0000-0002-0476-8020 1  

Nature Communications volume  15 , Article number:  313 ( 2024 ) Cite this article

5856 Accesses

2 Citations

4 Altmetric


  • Chemical biology
  • Computational biology and bioinformatics
  • Computational models
  • Molecular modelling
  • Protein structure predictions

Geometric deep learning has been revolutionizing the molecular modeling field. Although state-of-the-art neural network models are approaching ab initio accuracy for molecular property prediction, their applications, such as drug discovery and molecular dynamics (MD) simulation, have been hindered by insufficient utilization of geometric information and high computational costs. Here we propose an equivariant geometry-enhanced graph neural network called ViSNet, which elegantly extracts geometric features and efficiently models molecular structures with low computational costs. Our proposed ViSNet outperforms state-of-the-art approaches on multiple MD benchmarks, including MD17, revised MD17 and MD22, and achieves excellent chemical property prediction on the QM9 and Molecule3D datasets. Furthermore, through a series of simulations and case studies, ViSNet can efficiently explore the conformational space and provide reasonable interpretability to map geometric representations to molecular structures.


Molecular modeling plays a crucial role in modern scientific and engineering fields, aiding in the understanding of chemical reactions, facilitating new drug development, and driving scientific and technological advancements 1 , 2 , 3 , 4 . One commonly used method in molecular modeling is density functional theory (DFT). DFT enables accurate calculations of energy, forces, and other chemical properties of molecules 5 , 6 . However, due to the large computational requirements, DFT calculations often demand significant computational resources and time, particularly for large molecular systems or high-precision calculations. Machine learning (ML) offers an alternative solution by learning from reference data with ab initio accuracy and high computational efficiency 7 , 8 . Gradient-domain machine learning (GDML) 9 constructs accurate molecular force fields using conservation of energy and limited samples from ab initio molecular dynamics trajectories, enabling cost-effective simulations while maintaining accuracy. Symmetric GDML (sGDML) 10 further improves force field construction by incorporating physical symmetries, achieving CCSD(T)-level accuracy for flexible molecules. An exact iterative approach (Global sGDML) 11 extends sGDML to global force fields for molecules with several hundred atoms, maintaining correlations of atomic degree and accurately describing complex molecules and materials. In recent years, deep learning (DL) has demonstrated its powerful ability to learn from raw data without any hand-crafted features in many fields and thus attracted more and more attention. However, the inherent drawback of deep learning, which requires large amounts of data, has become a bottleneck for its application to more scenarios 12 . To alleviate the dependency on data for DL potentials, recent works have incorporated the inductive bias of symmetry into neural network design, known as geometric deep learning (GDL). 
Symmetry describes the conservation of physical laws, i.e., physical properties remain unchanged under transformations such as translations or rotations. It allows GDL to be extended to limited data scenarios without any data augmentation.
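The notion of symmetry used here can be made concrete with a toy check. The following is a minimal sketch in plain Python, assuming a hypothetical three-atom geometry and helper names that are not part of any cited method: interatomic distances (a scalar property) are unchanged by a rotation plus translation, whereas a displacement vector rotates together with the molecule.

```python
import math


def rotate_z(point, angle):
    """Rotate a 3D point about the z-axis by `angle` radians."""
    x, y, z = point
    c, s = math.cos(angle), math.sin(angle)
    return (c * x - s * y, s * x + c * y, z)


def translate(point, shift):
    """Shift a 3D point by the vector `shift`."""
    return tuple(p + d for p, d in zip(point, shift))


def pairwise_distances(coords):
    """Distances between all unordered pairs of atoms."""
    return [math.dist(a, b) for i, a in enumerate(coords) for b in coords[i + 1:]]


def displacement(a, b):
    """Vector pointing from atom a to atom b."""
    return tuple(q - p for p, q in zip(a, b))


# Hypothetical three-atom geometry (arbitrary units), used only for illustration.
molecule = [(0.0, 0.0, 0.0), (0.96, 0.0, 0.0), (-0.24, 0.93, 0.0)]
angle, shift = 0.7, (1.5, -2.0, 0.3)
moved = [translate(rotate_z(p, angle), shift) for p in molecule]

# Invariance: scalar quantities such as distances do not change.
distances_before = pairwise_distances(molecule)
distances_after = pairwise_distances(moved)
invariant = all(
    math.isclose(a, b, abs_tol=1e-12)
    for a, b in zip(distances_before, distances_after)
)

# Equivariance: a vector quantity transforms with the rotation (translation cancels).
bond_before = displacement(molecule[0], molecule[1])
bond_after = displacement(moved[0], moved[1])
expected = rotate_z(bond_before, angle)
equivariant = all(
    math.isclose(a, b, abs_tol=1e-12) for a, b in zip(bond_after, expected)
)

print("distances invariant:", invariant)
print("displacement equivariant:", equivariant)
```

A model whose outputs respect these two properties by construction does not need to see rotated or shifted copies of each training structure, which is why no data augmentation is required.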

Equivariant graph neural network (EGNN) is one of the representative approaches in GDL, which has extensive capability to model molecular geometry 12 , 13 , 14 , 15 , 16 , 17 , 18 , 19 , 20 , 21 . A popular kind of EGNN conducts equivariance from directional information and involves geometric features to predict molecular properties. GemNet 20 extends the invariant DimeNet/DimeNet++ 16 , 17 with dihedral information. They explicitly extract geometric information in the Euclidean space with first-order geometric tensor, i.e., setting $l_{\max} = 1$. PaiNN 18 and equivariant transformer 19 further adopt vector embedding and scalarize the angular representation implicitly via the inner product of the vector embedding itself. They reduce the complexity of explicit geometry extraction by taking the angular information into consideration. Another mainstream approach to achieving equivariance is through group representation theory, which can achieve higher accuracy but comes with large computational costs. NequIP, Allegro, and MACE 12 , 22 , 23 achieve state-of-the-art performance on several molecular dynamics simulation datasets leveraging high-order geometric tensors. On the one hand, algorithms based on group representation theory have strong mathematical foundations and are able to fully utilize geometric information using high-order geometric tensors. On the other hand, these algorithms often require computationally expensive operations such as the Clebsch–Gordan product (CG-product) 24 , making them possibly suitable for periodic systems with elaborate model design but impractical for large molecular systems such as chemical and biological molecules without periodic boundary conditions.
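The idea of capturing angular information implicitly through an inner product can be illustrated with a toy calculation. The sketch below is an assumption-laden illustration, not the implementation of any model cited above: it uses plain unit direction vectors instead of learned embeddings, and the function names and coordinates are hypothetical. Summing the unit vectors from a central atom to its neighbors and taking the inner product of that sum with itself yields the sum of cosines over all neighbor pairs, in one pass over the neighbors rather than a loop over every pair.

```python
import math


def unit_direction(center, neighbor):
    """Unit vector pointing from `center` to `neighbor`."""
    diff = tuple(n - c for n, c in zip(neighbor, center))
    norm = math.sqrt(sum(d * d for d in diff))
    return tuple(d / norm for d in diff)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def explicit_cosine_sum(center, neighbors):
    """Sum of cos(angle j-center-k) over all ordered pairs (j, k): quadratic cost."""
    directions = [unit_direction(center, nb) for nb in neighbors]
    return sum(dot(u, v) for u in directions for v in directions)


def implicit_cosine_sum(center, neighbors):
    """Same quantity via the inner product of the summed directions: linear cost."""
    total = (0.0, 0.0, 0.0)
    for nb in neighbors:
        u = unit_direction(center, nb)
        total = tuple(t + x for t, x in zip(total, u))
    return dot(total, total)


# Hypothetical central atom and neighbors, chosen only for illustration.
center = (0.0, 0.0, 0.0)
neighbors = [(1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, 3.0), (1.0, 1.0, 1.0)]

explicit_value = explicit_cosine_sum(center, neighbors)
implicit_value = implicit_cosine_sum(center, neighbors)
print(math.isclose(explicit_value, implicit_value))
```

The two functions agree because the inner product is bilinear; the implicit version never enumerates angle triplets, which is the source of the reduced cost.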

In this study, we propose ViSNet (short for “Vector-Scalar interactive graph neural Network”), which alleviates the dilemma between computational costs and sufficient utilization of geometric information. By incorporating an elaborate runtime geometry calculation (RGC) strategy, ViSNet implicitly extracts various geometric features, i.e., angles, dihedral torsion angles, and improper angles, in accordance with the force field of classical MD and with linear time complexity, thus significantly accelerating model training and inference while reducing memory consumption. To extend the vector representation, we introduce spherical harmonics and replace the computationally expensive Clebsch–Gordan product with the inner product. Furthermore, we present a vector–scalar interactive equivariant message passing (ViS-MP) mechanism, which fully utilizes the geometric features by letting vector hidden representations interact with scalar ones. When comprehensively evaluated on several benchmark datasets, ViSNet outperforms all state-of-the-art algorithms on all molecules in the MD17, revised MD17, and MD22 datasets and shows superior performance on the QM9 and Molecule3D datasets, indicating a powerful capability for molecular geometric representation. ViSNet also won the PCQM4Mv2 track in the OGB-LSC@NeurIPS2022 competition ( https://ogb.stanford.edu/neurips2022/results/ ). We then performed molecular dynamics simulations for each molecule in MD17 driven by ViSNet trained with only limited data (950 samples). The highly consistent interatomic distance distributions and explored potential energy surfaces between ViSNet and quantum simulation illustrate that ViSNet is genuinely data-efficient and can perform simulations with high fidelity.
To further explore the usefulness of ViSNet in real-world applications, we used an in-house dataset that consists of about 10,000 different conformations of the 166-atom mini-protein Chignolin derived from replica exchange molecular dynamics and calculated at the DFT level. When evaluated on this dataset, ViSNet also achieved significantly better performance than empirical force fields, and the simulations driven by ViSNet produced forces very close to those from DFT. In addition, ViSNet exhibits reasonable interpretability in mapping geometric representations to molecular structures. The contributions of ViSNet can be summarized as follows:

Proposing an RGC module that utilizes high-order geometric tensors to implicitly extract various geometric features, including angles, dihedral torsion angles, and improper angles, with linear time complexity.

Introducing ViS-MP mechanism to enable efficient interaction between vector hidden representations and scalar ones and fully exploit the geometric information.

Achieving state-of-the-art performance in six benchmarks for predicting energy, forces, HOMO-LUMO gap, and other quantum properties of molecules.

Performing molecular dynamics simulations driven by ViSNet on both small molecules and 166-atom Chignolin with high fidelity.

Demonstrating reasonable model interpretability between geometric features and molecular structures.

Overview of ViSNet

ViSNet is a versatile EGNN that predicts potential energy, atomic forces as well as various quantum chemical properties by taking atomic coordinates and numbers as inputs. As shown in Fig.  1 a, the model is composed of an embedding block and multiple stacked ViSNet blocks, followed by an output block. The atomic number and coordinates are fed into the embedding block followed by ViSNet blocks to extract and encode geometric representations. The geometric representations are then used to predict molecular properties through the output block. It is worth noting that ViSNet is an energy-conserving potential, i.e., the predicted atomic forces are derived from the negative gradients of the potential energy with respect to the coordinates 9 , 10 .
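
To make the energy-conserving property concrete, the sketch below checks that forces obtained as the negative gradient of a scalar energy agree with a finite-difference derivative. The harmonic pair potential, the function names, and the constants `k` and `r0` are assumptions chosen for illustration; they stand in for the learned energy and are not part of ViSNet.

```python
import math

def energy(pos, k=1.0, r0=1.0):
    """Toy potential: harmonic springs between all atom pairs (not ViSNet)."""
    e = 0.0
    for a in range(len(pos)):
        for b in range(a + 1, len(pos)):
            d = math.dist(pos[a], pos[b])
            e += 0.5 * k * (d - r0) ** 2
    return e

def forces_analytic(pos, k=1.0, r0=1.0):
    """F_a = -dE/dx_a, derived by hand for the toy potential."""
    f = [[0.0, 0.0, 0.0] for _ in pos]
    for a in range(len(pos)):
        for b in range(len(pos)):
            if a == b:
                continue
            d = math.dist(pos[a], pos[b])
            for c in range(3):
                # dE/dx_a = k (d - r0) (x_a - x_b) / d for each partner b
                f[a][c] -= k * (d - r0) * (pos[a][c] - pos[b][c]) / d
    return f

def forces_numeric(pos, eps=1e-6):
    """Negative central finite difference of the energy."""
    f = [[0.0, 0.0, 0.0] for _ in pos]
    for a in range(len(pos)):
        for c in range(3):
            plus = [list(p) for p in pos]
            minus = [list(p) for p in pos]
            plus[a][c] += eps
            minus[a][c] -= eps
            f[a][c] = -(energy(plus) - energy(minus)) / (2 * eps)
    return f

positions = [[0.0, 0.0, 0.0], [1.2, 0.1, 0.0], [0.3, 0.9, 0.4]]
fa = forces_analytic(positions)
fn = forces_numeric(positions)
max_diff = max(abs(fa[a][c] - fn[a][c]) for a in range(3) for c in range(3))
net_force = [sum(fa[a][c] for a in range(3)) for c in range(3)]
```

Because the forces come from a single scalar energy, they sum to zero over the atoms and the resulting force field is conservative, which is the property the text relies on for stable MD.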

figure 1

a Model sketch of ViSNet. ViSNet embeds the 3D structures of molecules, extracts the geometric information through a series of ViSNet blocks, and outputs molecular properties such as energy, forces, and HOMO–LUMO gap through an output block. b Flowchart of one ViSNet block. One ViSNet block consists of two modules: (i) Scalar2Vec , responsible for attaching scalar embeddings to vectors; (ii) Vec2Scalar , which updates scalar embeddings based on the RGC strategy. The inputs of Scalar2Vec are the node embedding h i , edge embedding f i j , direction unit \({\overrightarrow{v}}_{i}\) and the relative positions between two atoms. The edge-fusion graph attention module (serving as \({\phi }_{{\rm {m}}}^{{\rm {s}}}\) ) takes as input h i and the output of the dense layer following f i j , and outputs scalar messages. Before aggregation, each scalar message is transformed through a dense layer, and then fused with the unit vector of the relative position \({\overrightarrow{u}}_{ij}\) and its own direction unit \({\overrightarrow{v}}_{j}\) . We further compute the vector messages and aggregate them over the neighborhood. Through a gated residual connection, the final residual \({{\Delta }}{\overrightarrow{v}}_{i}\) is produced. In the Vec2Scalar module, the final Δ h i is obtained by taking the Hadamard product of the aggregated scalar messages and the output of the RGC-Angle calculation and adding a gated residual connection. Likewise, combining the projected f i j and the output of the RGC-Dihedral calculation, the final Δ f i j is determined.

The success of classical force fields shows that geometric features such as interatomic distances, angles, dihedral torsion angles, and improper angles (Fig. 2 ) are essential for determining the total potential energy of molecules. The explicit extraction of invariant geometric representations in previous studies often consumes a large amount of time or memory during model training and inference. Given an atom, the calculation of angular information scales as \(\mathcal{O}(\mathcal{N}^{2})\) with the number of neighboring atoms, while the computational complexity is even \(\mathcal{O}(\mathcal{N}^{3})\) for dihedrals 20 . To alleviate this problem, inspired by Schütt et al. 18 , we propose runtime geometry calculation (RGC), which uses an equivariant vector representation (termed the direction unit) for each node to preserve its geometric information. RGC calculates the geometric information directly from the direction unit, which only requires summing the vectors from the target node to its neighbors once. Therefore, the computational complexity can be reduced to \(\mathcal{O}(\mathcal{N})\) . Notably, beyond employing the angular information that has been used in PaiNN 18 and ET 19 , ViSNet further considers the dihedral torsion and improper angle calculation with higher-order geometric tensors.

figure 2

The bonded terms consist of bond length, bond angle, dihedral torsion, and improper angle. The RGC module depicts all bonded terms of classical MD as model operations in linear time complexity. Yellow arrow \({\overrightarrow{v}}_{i}\) denotes the direction unit in Eq. ( 1 ).

Considering the sub-structure of a toy molecule with four atoms shown in Fig. 2 , the angular information of the target node i can be derived from the vector \({\overrightarrow{r}}_{ij}\) as follows:
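
A form of Eqs. (1) and (2) consistent with the definitions in the next paragraph is given here as a reconstruction from the surrounding text; the second neighbor index k, the neighbor set \(\mathcal{N}(i)\), and the angle symbol \({\theta }_{jik}\) are notation assumed for this sketch:

\[
{\overrightarrow{u}}_{ij}=\frac{{\overrightarrow{r}}_{ij}}{\Vert {\overrightarrow{r}}_{ij}\Vert },\qquad {\overrightarrow{v}}_{i}=\sum_{j\in \mathcal{N}(i)}{\overrightarrow{u}}_{ij},\qquad \cos {\theta }_{jik}=\langle {\overrightarrow{u}}_{ij},{\overrightarrow{u}}_{ik}\rangle \tag{1}
\]

\[
\langle {\overrightarrow{v}}_{i},{\overrightarrow{v}}_{i}\rangle =\Big\langle \sum_{j}{\overrightarrow{u}}_{ij},\sum_{k}{\overrightarrow{u}}_{ik}\Big\rangle =\sum_{j}\sum_{k}\langle {\overrightarrow{u}}_{ij},{\overrightarrow{u}}_{ik}\rangle =\sum_{j}\sum_{k}\cos {\theta }_{jik} \tag{2}
\]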

where \({\overrightarrow{r}}_{ij}\) is the vector from node i to its neighboring node j , and \({\overrightarrow{u}}_{ij}\) is the unit vector of \({\overrightarrow{r}}_{ij}\) . Here, we define the direction unit \({\overrightarrow{v}}_{i}\) as the sum of all unit vectors from node i to all its neighboring nodes j , where node i is the intersection of all unit vectors. As shown in Eq. ( 2 ), the inner product of the direction unit \({\overrightarrow{v}}_{i}\) with itself equals the sum of the inner products of all pairs of unit vectors from node i to its neighboring nodes. Combined with Eq. ( 1 ), this inner product therefore stands for the sum of the cosines of all angles formed by node i and any two of its neighboring nodes.
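
The following sketch checks this identity numerically on a small hypothetical geometry (the coordinates and helper names are assumptions for illustration). It compares the single inner product of the direction unit, which costs one pass over the neighbors, against the explicit double sum over neighbor pairs.

```python
import math
from itertools import product

def unit(a):
    n = math.sqrt(sum(x * x for x in a))
    return [x / n for x in a]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# Hypothetical toy geometry: central atom i and three neighbors j.
center = [0.0, 0.0, 0.0]
neighbors = [[1.0, 0.2, 0.0], [-0.4, 1.1, 0.3], [0.1, -0.5, 0.9]]

# Unit vectors u_ij and the direction unit v_i = sum_j u_ij.
u = [unit([n[c] - center[c] for c in range(3)]) for n in neighbors]
v_i = [sum(uj[c] for uj in u) for c in range(3)]

# Linear-time route: one inner product of the direction unit with itself.
rgc_value = dot(v_i, v_i)

# Explicit route: enumerate every ordered neighbor pair (j, k), including
# j == k, and sum cos(theta_jik) = <u_ij, u_ik>.
explicit_value = sum(dot(uj, uk) for uj, uk in product(u, repeat=2))
```

Note that the double sum runs over ordered pairs, so it contains one self term (cosine 1) per neighbor and counts each distinct angle twice; the single scalar is therefore an aggregate of the angular information rather than a list of individual angles.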

Similar to runtime angle calculation, we also calculate the vector rejection 25 of the direction unit \({\overrightarrow{v}}_{i}\) of node i and \({\overrightarrow{v}}_{j}\) of node j on the vector \({\overrightarrow{u}}_{ij}\) and \({\overrightarrow{u}}_{ji}\) , respectively.

where \({\rm{Rej}}_{\overrightarrow{b}}(\overrightarrow{a})\) represents the vector component of \(\overrightarrow{a}\) perpendicular to \(\overrightarrow{b}\) , termed the vector rejection. \({\overrightarrow{u}}_{ij}\) and \({\overrightarrow{v}}_{i}\) are defined in Eq. ( 1 ). \({\overrightarrow{w}}_{ij}\) represents the sum of the vector rejections \({\rm{Rej}}_{{\overrightarrow{u}}_{ij}}({\overrightarrow{u}}_{im})\) and \({\overrightarrow{w}}_{ji}\) represents the sum of the vector rejections \({\rm{Rej}}_{{\overrightarrow{u}}_{ji}}({\overrightarrow{u}}_{jn})\) . The inner product between \({\overrightarrow{w}}_{ij}\) and \({\overrightarrow{w}}_{ji}\) is then calculated to obtain the dihedral torsion angle information of the intersecting edge e i j as follows:
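
As a numerical illustration, the sketch below builds a hypothetical four-atom chain m–i–j–n (coordinates and helper names are assumptions) and shows that the normalized inner product of the two rejections reproduces the dihedral cosine computed from plane normals. With a single extra neighbor on each side the match is exact; with several neighbors, the raw inner product becomes a sum over neighbor pairs in which each dihedral cosine is weighted by the lengths of the two rejections.

```python
import math

def sub(a, b):
    return [x - y for x, y in zip(a, b)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    return math.sqrt(dot(a, a))

def unit(a):
    n = norm(a)
    return [x / n for x in a]

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def rej(a, b):
    """Component of a perpendicular to the unit vector b."""
    s = dot(a, b)
    return [x - s * y for x, y in zip(a, b)]

# Hypothetical chain m - i - j - n.
pos = {"m": [1.0, 1.0, 0.0], "i": [0.0, 0.0, 0.0],
       "j": [1.5, 0.0, 0.0], "n": [2.0, 0.8, 0.7]}
u_ij = unit(sub(pos["j"], pos["i"]))
u_ji = [-x for x in u_ij]
u_im = unit(sub(pos["m"], pos["i"]))
u_jn = unit(sub(pos["n"], pos["j"]))

# Direction units: sums of unit vectors to all neighbors of i and of j.
v_i = [a + b for a, b in zip(u_ij, u_im)]
v_j = [a + b for a, b in zip(u_ji, u_jn)]

# Rejecting the direction unit on the shared edge removes the edge's own
# contribution, leaving only the out-of-edge parts of the other neighbors.
w_ij = rej(v_i, u_ij)
w_ji = rej(v_j, u_ji)
cos_from_rejection = dot(w_ij, w_ji) / (norm(w_ij) * norm(w_ji))

# Reference: dihedral cosine from the normals of planes (m, i, j) and (i, j, n).
n1 = cross(u_ij, u_im)
n2 = cross(u_ij, u_jn)
cos_reference = dot(n1, n2) / (norm(n1) * norm(n2))
```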

The improper angle is derived from a pyramid structure formed by four nodes. As shown by the last toy molecule in Fig. 2 , node i is the vertex of the pyramid, and the improper torsion angle is formed by two adjacent planes with an intersecting edge e i j . We can also calculate the improper angle by vector rejection:

In the same way, the inner product between \({\overrightarrow{t}}_{ij}\) and \({\overrightarrow{t}}_{ji}\) indicates the summation of improper angle information formed by e i j :

Multiple works have shown the effectiveness of high-order geometric tensors for molecular modeling 12 , 22 , 26 , 27 . However, the computational overheads of these approaches are generally expensive due to the CG-product, impeding their further application to large systems. In this work, we convert the vectors to a high-order representation with spherical harmonics but replace the CG-product with the inner product, following the idea of RGC. We find that the extended high-order geometric tensors can still represent the above angular information in the form of Legendre polynomials according to the addition theorem:

where P l is the Legendre polynomial of degree l , Y l , m denotes the spherical harmonics function and \({Y}_{l,m}^{*}\) denotes its complex conjugate. We sum the products over the different orders l to obtain the scalar angular representation, which is the same operation as the inner product. It is worth noting that such an extension does not increase the model size and keeps the model architecture unchanged. We also provide a proof of the rotational invariance of the RGC strategy in the section “Proofs of the rotational invariance of RGC”.
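
The addition theorem can be checked numerically for low degrees. The sketch below hard-codes the real spherical harmonics for l = 1 and l = 2 (an assumption made to keep the example dependency-free; with real harmonics no complex conjugate is needed) and verifies that the inner product of the harmonic feature vectors of two unit vectors equals (2l + 1)/(4π) times the Legendre polynomial of the cosine of the angle between them.

```python
import math

def unit(a):
    n = math.sqrt(sum(x * x for x in a))
    return [x / n for x in a]

def real_sph_harm(l, a):
    """Real spherical harmonics of degree l (1 or 2) at the unit vector a."""
    x, y, z = a
    if l == 1:
        c = math.sqrt(3.0 / (4.0 * math.pi))
        return [c * y, c * z, c * x]
    if l == 2:
        c = 0.5 * math.sqrt(15.0 / math.pi)
        return [
            c * x * y,
            c * y * z,
            0.25 * math.sqrt(5.0 / math.pi) * (3.0 * z * z - 1.0),
            c * x * z,
            0.5 * c * (x * x - y * y),
        ]
    raise ValueError("only l = 1 and l = 2 are implemented in this sketch")

def legendre(l, t):
    """Legendre polynomials P_1 and P_2."""
    return t if l == 1 else 0.5 * (3.0 * t * t - 1.0)

# Two arbitrary unit vectors standing in for u_ij and u_ik.
a = unit([0.3, -0.7, 0.5])
b = unit([-0.2, 0.4, 0.9])
cos_theta = sum(p * q for p, q in zip(a, b))

errors = {}
for l in (1, 2):
    # Inner product over m of the degree-l harmonic features.
    lhs = sum(p * q for p, q in zip(real_sph_harm(l, a), real_sph_harm(l, b)))
    rhs = (2 * l + 1) / (4.0 * math.pi) * legendre(l, cos_theta)
    errors[l] = abs(lhs - rhs)
```

The inner product depends on the two directions only through the angle between them, which is why replacing the CG-product by an inner product still yields rotation-invariant angular features.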

In order to make full use of geometric information and enhance the interaction between scalars and vectors, we designed an effective vector–scalar interactive message-passing mechanism with respect to the intersecting nodes and edges for angles and dihedrals, respectively. It is important to note that previous studies 18 , 19 primarily focused on updating node features, whereas our approach updates both node and edge features during message passing, leading to a more comprehensive geometric representation. The key operations in ViS-MP are given as follows:

where h i denotes the scalar embedding of node i , and f i j stands for the edge feature between node i and node j . \({\overrightarrow{v}}_{i}\) represents the embedding of the direction unit mentioned in RGC. The superscript of a variable indicates the index of the block that the variable belongs to. We omit the improper angle here for brevity. A comprehensive version is given in the Supplementary Information. ViS-MP extends the conventional message passing, aggregation, and update processes with vector–scalar interactions. Eqs. ( 8 ) and ( 9 ) depict our message-passing and aggregation processes. Concretely, scalar messages m i j incorporating the scalar embeddings h j , h i , and f i j are passed and then aggregated to node i through a message function \({\phi }_{m}^{s}\) (Eq. ( 8 )). Similar operations are applied to the vector messages \({\overrightarrow{m}}_{i}^{l}\) of node i , which incorporate the scalar message m i j , the vector \({\overrightarrow{r}}_{ij}\) and the vector embedding \({\overrightarrow{v}}_{j}\) (Eq. ( 9 )). Equations ( 10 ) and ( 11 ) describe the update processes. h i is updated from the aggregated scalar message m i and the inner product of \({\overrightarrow{v}}_{i}\) through an update function \({\phi }_{un}^{s}\) . Then f i j is updated from the inner product of the rejections of the vector embeddings \({\overrightarrow{v}}_{i}\) and \({\overrightarrow{v}}_{j}\) through an update function \({\phi }_{ue}^{s}\) . Finally, the vector embedding \({\overrightarrow{v}}_{i}\) is updated from both scalar and vector messages through an update function \({\phi }_{un}^{v}\) . Notably, the vector update function, i.e., ϕ v , is required to be equivariant. The detailed message and update functions can be found in the Methods section. A proof of the equivariance of ViS-MP can be found in Supplementary Methods.
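
The update pattern described above can be sketched with toy stand-ins for the learned functions. Everything specific in the block below is an assumption for illustration: the `tanh`/`exp` message function, the additive updates, the fully connected graph, the single scalar channel per node and edge, and the function name `vis_mp_step`. It is not the ViSNet implementation. The sketch only demonstrates the structural point that scalars built from inner products stay invariant under rotation while vectors built from scalar-weighted sums rotate with the input.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def add(a, b):
    return [x + y for x, y in zip(a, b)]

def scale(s, a):
    return [s * x for x in a]

def rej(a, b):
    """Component of a perpendicular to the unit vector b."""
    return add(a, scale(-dot(a, b), b))

def vis_mp_step(pos, h, v, f):
    """One toy vector-scalar update (hypothetical stand-in functions)."""
    n = len(pos)
    new_h, new_v, new_f = [], [], {}
    for i in range(n):
        m_i = 0.0                    # aggregated scalar message
        vec_m_i = [0.0, 0.0, 0.0]    # aggregated vector message
        for j in range(n):
            if j == i:
                continue
            r_ij = add(pos[j], scale(-1.0, pos[i]))
            d = math.sqrt(dot(r_ij, r_ij))
            u_ij = scale(1.0 / d, r_ij)
            # Scalar message from h_i, h_j, f_ij (placeholder for phi_m^s).
            m_ij = math.tanh(h[i] * h[j] + f[(i, j)]) * math.exp(-d)
            m_i += m_ij
            # Vector message: scalar-weighted vectors keep equivariance.
            vec_m_i = add(vec_m_i, scale(m_ij, add(u_ij, v[j])))
            # Edge update from the inner product of the two rejections.
            w_ij = rej(v[i], u_ij)
            w_ji = rej(v[j], scale(-1.0, u_ij))
            new_f[(i, j)] = f[(i, j)] + dot(w_ij, w_ji)
        # Node update from the scalar message and <v_i, v_i>.
        new_h.append(h[i] + m_i * dot(v[i], v[i]))
        new_v.append(add(v[i], vec_m_i))
    return new_h, new_v, new_f

def rotate(a, angle=0.7):
    """Rotate about the z axis, then about the x axis (a proper rotation)."""
    c, s = math.cos(angle), math.sin(angle)
    x, y, z = a
    x, y = c * x - s * y, s * x + c * y
    y, z = c * y - s * z, s * y + c * z
    return [x, y, z]

pos = [[0.0, 0.0, 0.0], [1.1, 0.2, 0.0], [0.3, 0.9, 0.5]]
h = [0.5, -0.3, 0.8]
v = [[0.2, 0.1, -0.3], [0.0, 0.4, 0.1], [-0.5, 0.2, 0.3]]
f = {(i, j): 0.1 * (i + j + 1) for i in range(3) for j in range(3) if i != j}

h1, v1, f1 = vis_mp_step(pos, h, v, f)
h2, v2, f2 = vis_mp_step([rotate(p) for p in pos], h, [rotate(x) for x in v], f)

scalar_gap = max(abs(a - b) for a, b in zip(h1, h2))
edge_gap = max(abs(f1[k] - f2[k]) for k in f1)
vector_gap = max(abs(a - b) for p, q in zip(v1, v2) for a, b in zip(rotate(p), q))
```

Rotating the inputs leaves the updated node and edge scalars unchanged and rotates the updated vectors by the same rotation, which is the behavior an equivariant message-passing scheme must have.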

In summary, the geometric features are extracted by inner products in the RGC strategy, and the scalar and vector embeddings cyclically update each other in ViS-MP so as to learn a comprehensive geometric representation from molecular structures.

Accurate quantum chemical property predictions

We evaluated ViSNet on several prevailing benchmark datasets including MD17 9 , 10 , 28 , revised MD17 29 , MD22 30 , QM9 31 , Molecule3D 32 , and OGB-LSC PCQM4Mv2 33 for energy, force, and other molecular property prediction. MD17 consists of the MD trajectories of seven small organic molecules; the number of conformations in each molecule dataset ranges from 133,700 to 993,237. The dataset rMD17 is a reproduced version of MD17 with higher accuracy. MD22 is a recently proposed MD trajectories dataset that presents challenges with respect to larger system sizes (42–370 atoms). Large molecules such as proteins, lipids, carbohydrates, nucleic acids, and supramolecules are included in MD22. QM9 consists of 12 kinds of quantum chemical properties of 133,385 small organic molecules with up to 9 heavy atoms. Molecule3D is a recently proposed dataset including 3,899,647 molecules collected from PubChemQC with their ground-state structures and corresponding properties calculated by DFT. We focus on the prediction of the HOMO–LUMO gap following ComENet 34 . OGB-LSC PCQM4Mv2 is a quantum chemistry dataset originally curated under the PubChemQC including a DFT-calculated HOMO–LUMO gap of 3,746,619 molecules. The 3D conformations are provided for 3,378,606 training molecules but not for the validation and test sets. The training details of ViSNet on each benchmark are described in the “Methods” section.

We compared ViSNet with state-of-the-art algorithms, including DimeNet 16 , PaiNN 18 , SpookyNet 21 , ET 19 , GemNet 20 , UNiTE 35 , NequIP 12 , SO3KRATES 36 , Allegro 22 , and MACE 23 , among others. As shown in Table 1 (MD17), Table 2 (rMD17), and Table 3 (MD22), ViSNet outperformed the compared algorithms for both small (MD17 and rMD17) and large molecules (MD22), with the lowest mean absolute errors (MAE) of predicted energy and forces. On the one hand, compared with PaiNN, ET, and GemNet, ViSNet incorporates more geometric information and makes full use of it in ViS-MP, which contributes to the performance gains. On the other hand, the comparison with NequIP, Allegro, SO3KRATES, MACE, and similar models demonstrates the benefit of introducing spherical harmonics in the RGC module.

As shown in Table 4 , ViSNet also achieved superior performance for chemical property predictions on QM9. It outperformed the compared algorithms for 9 of 12 chemical properties and achieved comparable results on the remaining properties. Further evaluations on Molecule3D confirmed the high prediction accuracy of ViSNet, as shown in Table 5 . ViSNet achieved 33.6% and 6.51% improvements over the second-best method for the random split and scaffold split, respectively. Furthermore, ViSNet exhibited good portability to other multimodality methods, e.g., Transformer-M 37 , and outperformed other approaches on OGB-LSC PCQM4Mv2 (see Supplementary Fig. S1). ViSNet also won the PCQM4Mv2 track in the OGB-LSC@NeurIPS2022 competition when tested on unseen molecules 38 ( https://ogb.stanford.edu/neurips2022/results/ ).

To evaluate the computational efficiency of ViSNet, following ref. 23 , we compare the latency of ViSNet with that of prevailing models in Supplementary Fig. S2 . The latency is defined as the time it takes to compute forces on a structure (i.e., the gradient calculation for a set of input coordinates through the whole deep neural network). As shown in Supplementary Fig. S2 , ViSNet ( L = 2) reduced latency by 42.8% compared with MACE ( L = 2). Notably, despite the use of the CG-product, Allegro had a significant speed improvement compared to NequIP and BOTNet. However, ViSNet still reduced latency by 6.1%, 4.1%, and 61% compared to Allegro with L = 1, 2, and 3, respectively.

Efficient molecular dynamics simulations

To evaluate ViSNet as the potential for MD simulations, we incorporated a ViSNet model trained with only 950 samples from MD17 into the ASE simulation framework 39 to perform MD simulations for all seven kinds of organic molecules. All simulations were run with a time step τ = 0.5 fs under the Berendsen thermostat, with the other settings the same as those of the MD17 dataset. As shown in Fig. 3 , we analyzed the interatomic distance distributions derived from MD simulations with ViSNet as the potential and from ab initio molecular dynamics (AIMD) simulations at the DFT level for all seven molecules. As shown in Fig. 3 a, the interatomic distance distribution h ( r ) is defined as the ensemble average of atomic density at a radius r 9 . Figure 3 b–h illustrates that the distributions derived from ViSNet are very close to those generated by DFT. We also compared the potential energy surfaces sampled by ViSNet and DFT for these molecules (Supplementary Fig. S3 ). The consistent potential energy surfaces suggest that ViSNet can recover the conformational space from the simulation trajectories. Moreover, numerous groundbreaking machine learning force fields (MLFFs), including sGDML 10 , ANI 40 , DPMD 41 , and PhysNet 42 , have proven exceptionally fast in MD simulations compared to DFT. Similar to these algorithms, ViSNet also exhibited a significant reduction in computational cost compared to DFT, as shown in Supplementary Fig. S4 and Table S2 .
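
One simple way to estimate such a distribution from a trajectory is a normalized histogram of all pairwise distances, averaged over frames. The sketch below is an assumption about the estimator (the exact normalization used for Fig. 3 is not stated in the text), and the function name, bin settings, and two-frame toy trajectory are hypothetical.

```python
import math

def distance_histogram(frames, r_max, n_bins):
    """Normalized histogram of pairwise distances, pooled over all frames."""
    counts = [0] * n_bins
    width = r_max / n_bins
    total = 0
    for pos in frames:
        for a in range(len(pos)):
            for b in range(a + 1, len(pos)):
                d = math.dist(pos[a], pos[b])
                if d < r_max:
                    # Clamp guards against rounding at the upper edge.
                    counts[min(int(d / width), n_bins - 1)] += 1
                    total += 1
    if total == 0:
        raise ValueError("no pair distance falls below r_max")
    # Normalize so that the histogram integrates to one over [0, r_max).
    return [c / (total * width) for c in counts]

# Hypothetical two-frame trajectory of a three-atom molecule (angstrom).
frames = [
    [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.5, 0.0]],
    [[0.0, 0.0, 0.0], [1.1, 0.0, 0.0], [0.0, 1.4, 0.2]],
]
r_max, n_bins = 3.0, 30
hist = distance_histogram(frames, r_max, n_bins)
integral = sum(hist) * (r_max / n_bins)
```

Computing the same histogram for a ViSNet-driven trajectory and a DFT trajectory and overlaying the two curves gives the kind of comparison shown in Fig. 3 b–h.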

figure 3

a An illustration of the atomic density at a radius r with an arbitrary atom as the center. The interatomic distance distribution is defined as the ensemble average of atomic density. b – h Comparison of the interatomic distance distributions between simulations by ViSNet and DFT for all seven organic molecules in MD17. The curve of ViSNet is shown using a solid blue line, while the dashed orange line is used for the DFT curve. The structures of the corresponding molecules are shown in the upper right corner. Source data are provided as a Source Data file.

To further examine the molecular properties derived from simulations driven by ViSNet, we performed 500 ps MD simulations in the constant-energy (NVE) ensemble for ethanol from the MD17 dataset with a time step of τ = 0.5 fs, and 200 ps simulations for Ac-Ala3-NHMe from the MD22 dataset with a time step of τ = 1 fs. The simulations were driven by ViSNet, sGDML, and DFT, respectively. For ethanol, we analyzed its vibrational spectra and the probability distribution of dihedral angles. For Ac-Ala3-NHMe, we investigated its vibrational spectra and potential energy surface (PES) via the Ramachandran plot. To analyze the Ramachandran plots of the different simulations, the free energy was estimated using the potential of mean force (PMF), with ϕ and ψ set as the two reaction coordinates ( x , y ). All three ϕ and ψ dihedrals in Ac-Ala3-NHMe were calculated and plotted. The free energy is reported relative to its minimum value. To generate the landscape, 40 bins were used in both the x and y directions. Supplementary Fig. S5 a and b demonstrate that ViSNet and sGDML generate similar vibrational spectra, with slight differences in peak intensities compared to DFT. The probability distribution of hydroxyl angles in ethanol (Supplementary Fig. S5 c) reveals three minima: gauche ± ( M g ± ) and trans ( M t ). Furthermore, even though ViSNet showed better performance than sGDML for various conformations in the MD22 dataset, when starting from the same structure of the alanine tetrapeptide the performance difference may not have a notable impact on the sampling efficiency for such small molecules, and may thus lead to similar dynamics on the Ramachandran plots, as shown in Supplementary Fig. S5 d–f. These results demonstrate that with only a few training samples, ViSNet can serve as the potential for high-fidelity molecular dynamics simulations with much lower computational cost and higher accuracy.
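
The free energy estimate described above can be sketched as a histogram-based PMF, F = −kT ln P, shifted so that its minimum is zero. In the block below the function name, the value kT = 0.593 kcal/mol (roughly room temperature, which the text does not specify), the angle range of −180° to 180°, and the six toy samples are assumptions for illustration; only the 40 × 40 binning comes from the text.

```python
import math

def free_energy_surface(phi, psi, n_bins=40, kT=0.593):
    """Relative free energy -kT ln P on an n_bins x n_bins (phi, psi) grid.

    Angles are in degrees within [-180, 180]; unvisited bins are None.
    """
    width = 360.0 / n_bins
    counts = [[0] * n_bins for _ in range(n_bins)]
    for p, s in zip(phi, psi):
        ix = min(int((p + 180.0) / width), n_bins - 1)
        iy = min(int((s + 180.0) / width), n_bins - 1)
        counts[ix][iy] += 1
    total = sum(map(sum, counts))
    surface = [[None] * n_bins for _ in range(n_bins)]
    for ix in range(n_bins):
        for iy in range(n_bins):
            if counts[ix][iy] > 0:
                surface[ix][iy] = -kT * math.log(counts[ix][iy] / total)
    # Report every value relative to the global minimum.
    f_min = min(x for row in surface for x in row if x is not None)
    return [[None if x is None else x - f_min for x in row] for row in surface]

# Hypothetical samples: one basin visited twice as often as the other.
phi = [-60.0] * 4 + [60.0] * 2
psi = [-45.0] * 4 + [45.0] * 2
surface = free_energy_surface(phi, psi)
```

A basin visited half as often as the most populated one lies kT ln 2 higher in free energy, which is the kind of relative value plotted on the Ramachandran landscape.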

Applications for real-world full-atom proteins

To examine the usefulness of ViSNet in real-world applications, we evaluated it on the 166-atom mini-protein Chignolin (Fig. 4 a). The Chignolin dataset consists of about 10,000 conformations that were sampled by replica exchange MD 43 and calculated at the DFT level with Gaussian 16 44 in our other studies 45 , 46 ; we split it into training, validation, and test sets with a ratio of 8:1:1. We trained ViSNet as well as other prevailing MLFFs including ET 19 , PaiNN 18 , GemNet-OC 47 , MACE 23 , NequIP 12 and Allegro 22 and compared them with molecular mechanics (MM) 48 . The DFT results were used as the ground truth. Figure 4 b shows the free energy landscape of Chignolin as a function of d D3−G7 (the distance between the carbonyl oxygen on the D3 backbone and the nitrogen on the G7 backbone) and d E5−T8 (the distance between the carbonyl oxygen on the E5 backbone and the nitrogen on the T8 backbone). The concentrated energy basin on the left corresponds to the folded state and the scattered energy basin on the right corresponds to the unfolded state. We randomly selected six structures from different regions of the potential energy surface for visualization. Among them, four structures were predicted by the model with errors smaller than the MAE, while the other two had larger errors. Interestingly, all models consistently performed poorly on the structures with high potential energies (low probability of sampling) and performed well on the other structures. This implies that the sampling of conformations with high potential energies could be enhanced to ensure the generalization ability of the models.

figure 4

a The visualization of Chignolin structure. The backbone is colored grey while the side chains of each residue in Chignolin are highlighted with a ball and stick. b The energy landscape of Chignolin sampled by REMD. The x -axis of the landscape is the distance between carbonyl oxygen on the D3 backbone and nitrogen on the G7 backbone, while the y -axis is the distance between carbonyl oxygen on the E5 backbone and nitrogen on the T8 backbone. Six structures were then selected for visualization. Each structure is shown as a cartoon and residues are depicted in sticks. The histograms show the absolute error between the energy difference predicted by MLFFs including ViSNet, ET, PaiNN, GemNet-OC, NequIP, Allegro, and MACE or calculated by MM, and the ground truth calculated by DFT on the corresponding structure. c The average root mean square deviation (RMSD) of the Chignolin trajectories simulated by ViSNet was calculated from 10 different trajectories. The shaded areas indicate the standard deviation range. d The MAE of each component of atomic forces during the simulations driven by ViSNet. The ground truth energies and forces were calculated using Gaussian 16. The shaded areas indicate the standard deviation range. Source data are provided as a Source Data file.

Supplementary Fig.  S6 shows the correlations between the energies predicted by MLFFs or MM and the ground truth values calculated by DFT for all conformations in the test set. ViSNet achieved a lower MAE and a higher R 2 score. From the violin plot of the absolute errors shown in Supplementary Fig.  S7 , ViSNet, PaiNN and ET exhibited smaller errors than other MLFFs while MM got a much wider range of prediction errors. Similar results can be seen in the force correlations in each component shown in Supplementary Fig.  S8 . Detailed settings about DFT and MM calculations are shown in Supplementary Materials. Furthermore, we also made a comprehensive comparison by taking model performance, training time consumption, and model size into consideration. ViSNet and other state-of-the-art algorithms such as PaiNN, ET, GemNet-OC, MACE, NequIP, and Allegro were analyzed on the Chignolin dataset and shown in Fig.  5 . Although ViSNet is marginally slower than ET and PaiNN, it introduces more geometric information, significantly enhancing its performance. When compared to GemNet, which also incorporates dihedral angles, ViSNet’s computational cost is significantly more affordable. Similarly, ViSNet proves to be computationally efficient when compared to models employing the CG-product method, such as MACE, Allegro, and NequIP.

figure 5

PaiNN and ET are faster and smaller than ViSNet because ViSNet additionally incorporates the dihedral calculation. ViSNet outperforms GemNet-OC owing to its runtime geometry calculation, which reduces the complexity of dihedral extraction from \(\mathcal{O}(\mathcal{N}^{3})\) to \(\mathcal{O}(\mathcal{N})\) . ViSNet is also faster and smaller than MACE, Allegro, and NequIP because it replaces the CG-product with the inner product. ViSNet achieves the best performance owing to its design, i.e., runtime geometry calculation and vector–scalar interactive message passing. Source data are provided as a Source Data file.

In addition, we performed MD simulations for Chignolin driven by ViSNet. Ten conformations were randomly selected as initial structures, and a 100 ps simulation was run for each. Figure 4 c shows the RMSD of the 10 simulation trajectories against the simulation time. Figure 4 d displays the MAE of each component of the atomic forces predicted by ViSNet relative to those calculated by Gaussian 16 44 at the DFT level. The trajectories driven by ViSNet exhibited small force differences from quantum mechanics in every component, which implies that ViSNet has no bias towards any force component and thus supports its accuracy and potential usefulness for real-world applications.

Interpretability of ViSNet on molecular structures

Prior works have shown the effectiveness of incorporating geometric features, such as angles 16 , 20 . The primary method of geometry extraction utilized by ViSNet is the inner product in its runtime geometry calculation. We therefore illustrate the interpretability of ViSNet by mapping the angle representations derived from the inner product of direction units in the model to the atoms in the molecular structure, with the aim of bridging the gap between the geometric representation in ViSNet and molecular structures. We visualized the embeddings after the inner product of direction units \(\langle {\overrightarrow{v}}_{i},{\overrightarrow{v}}_{i}\rangle\) extracted from 50 aspirin samples in the validation set. The high-dimensional embeddings were reduced to a 2-dimensional space using t-SNE 49 and then clustered using DBSCAN 50 without specifying the number of clusters in advance.

Supplementary Fig. S9 exhibits the clustering results of the nodes’ embeddings after the inner product of their corresponding direction units. We further map the clustered nodes to the atoms of the aspirin chemical structure. Interestingly, the embeddings of these nodes are distinctly gathered into several clusters, shown in different colors. For example, although carbon atoms C 11 and C 12 occupy different positions and connect with different atoms, their inner products \(\langle {\overrightarrow{v}}_{i},{\overrightarrow{v}}_{i}\rangle\) are clustered into the same class because they hold similar substructures ({ C 11 − O 2 O 3 C 6 } and { C 12 − O 1 O 4 C 13 }). To summarize, ViSNet can discriminate different molecular substructures in the embedding space.

Ablation study

To further explore where the performance gains of ViSNet come from, we conducted a comprehensive ablation study. Specifically, we excluded the runtime angle calculation (w/o A), runtime dihedral calculation (w/o D), and both of them (w/o A&D) in ViSNet, in order to evaluate the usefulness of each part. ViSNet-improper denotes the additional improper angles and ViSNet l =1 uses the first-order spherical harmonics.

We designed some model variants with different message-passing mechanisms based on ViS-MP for scalar and vector interaction. ViSNet-N directly aggregates the dihedral information to intersecting nodes, and ViSNet-T leverages another form of dihedral calculation. The details of these model variants are elaborated in Supplementary. The results of the ablation study are shown in Supplementary Table  S3 and Supplementary Fig.  S10 . Based on the results, we can see that both kinds of directional geometric information are useful and the dihedral information contributes a little bit more to the final performance. The significant performance drop from ViSNet-N and ViSNet-T further validates the effectiveness of the ViS-MP mechanism. ViSNet-improper achieves similar performance to ViSNet for small molecules, but the contribution of improper angles is more obvious for large molecules (see Table  3 ). Furthermore, ViSNet using higher-order spherical harmonics achieves better performance.

We propose ViSNet, a geometric deep learning potential for molecular dynamics simulation. Group representation theory-based methods and directional information-based methods are the two mainstream classes of geometric deep learning potentials that enforce SE(3) equivariance 20 . ViSNet takes advantage of both in designing the RGC strategy and the ViS-MP mechanism. On the one hand, the RGC strategy explicitly extracts and exploits the directional geometric information with computationally lightweight operations, making model training and inference fast. On the other hand, ViS-MP employs a series of effective and efficient vector–scalar interactive operations, leading to the full use of geometric information. Furthermore, according to the many-body expansion theory 51 , 52 , 53 , the potential energy of the whole system equals the potential of each single atom plus the energy corrections from two-body to many-body terms. Most previous studies model the truncated energy correction terms hierarchically with k -hop information by stacking k message passing blocks. In contrast, ViSNet encodes the angle, dihedral torsion, and improper information in a single block, which gives the model a much more powerful representation ability. In addition, ViSNet’s universality or completeness is not validated by the geometric Weisfeiler–Leman (GWL) test 54 because the inner product operation, while computationally efficient, fails to distinguish certain atom reflection structures with the same angular information. To handle such counterexamples and pass the GWL test, incorporating the CG-product with higher-order spherical harmonics will be necessary in future studies.

Besides predicting energy, force, and chemical properties with high accuracy, performing molecular dynamics simulations with ab initio accuracy at the cost of an empirical force field is a grand challenge. ViSNet proves its usefulness in real-world ab initio molecular dynamics simulations with lower computational cost and the ability to scale to large molecules such as proteins. Extending ViSNet to support larger and more complex molecular systems will be our future research direction.


In the context of machine learning for atomic systems, equivariance is a pervasive concept. Specifically, atomic vector quantities such as dipoles or forces must rotate consistently with the conformation coordinates. In molecular dynamics, such equivariance can be ensured by computing gradients of a predicted conservative scalar energy. Formally, a function \(\mathcal{F}:\mathcal{X}\to\mathcal{Y}\) is equivariant with respect to a group \(G\) if it satisfies:

\[\rho_{\mathcal{Y}}(g)\,\mathcal{F}(x)=\mathcal{F}\left(\rho_{\mathcal{X}}(g)\,x\right),\quad \forall g\in G,\ x\in\mathcal{X},\]

where \(\rho_{\mathcal{X}}(g)\) and \(\rho_{\mathcal{Y}}(g)\) are group representations in the input and output spaces, respectively. Integrating equivariance into model parameterization has been shown to be effective, as seen in the shift-equivariance of CNNs, which is critical for enhancing generalization capacity.
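The claim that gradients of a conservative scalar energy rotate with the coordinates can be checked numerically. The sketch below is illustrative only: `pair_energy` is a hypothetical toy energy (harmonic springs between all atom pairs), not the ViSNet model, and `rotate_z` stands in for an arbitrary rotation.

```python
import math

def pair_energy(positions):
    """Toy conservative energy: harmonic springs (rest length 1) between all atom pairs."""
    energy = 0.0
    for i in range(len(positions)):
        for j in range(i + 1, len(positions)):
            d = math.dist(positions[i], positions[j])
            energy += 0.5 * (d - 1.0) ** 2
    return energy

def forces(positions):
    """Analytic forces F_i = -dE/dr_i of pair_energy."""
    result = [[0.0, 0.0, 0.0] for _ in positions]
    for i, r_i in enumerate(positions):
        for j, r_j in enumerate(positions):
            if i == j:
                continue
            d = math.dist(r_i, r_j)
            for axis in range(3):
                # Each pair (i, j) adds (d - 1) * (r_i - r_j) / d to dE/dr_i.
                result[i][axis] -= (d - 1.0) * (r_i[axis] - r_j[axis]) / d
    return result

def rotate_z(vector, angle):
    """Rotate a 3D vector by `angle` radians about the z axis."""
    x, y, z = vector
    c, s = math.cos(angle), math.sin(angle)
    return [c * x - s * y, s * x + c * y, z]

positions = [[0.0, 0.0, 0.0], [1.2, 0.1, -0.3], [0.4, 0.9, 0.5], [-0.7, 0.3, 0.8]]
angle = 0.7
rotated_positions = [rotate_z(p, angle) for p in positions]

forces_of_rotated = forces(rotated_positions)                     # F(R x)
rotated_forces = [rotate_z(f, angle) for f in forces(positions)]  # R F(x)

max_gap = max(
    abs(a - b)
    for fa, fb in zip(forces_of_rotated, rotated_forces)
    for a, b in zip(fa, fb)
)
energy_gap = abs(pair_energy(rotated_positions) - pair_energy(positions))
print("max |F(Rx) - R F(x)| =", max_gap)
print("|E(Rx) - E(x)| =", energy_gap)
```

Both gaps should be at the level of floating-point round-off: the energy is invariant under the rotation, and the forces are equivariant.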

Proofs of the rotational invariance of RGC

Assume that the molecule rotates in 3D space, i.e.,

where \(R\in SO(3)\) is an arbitrary rotation matrix that satisfies:

The angular information after rotation is calculated as follows:

As shown in Eq. ( 18 ), the angle information does not change after rotation. The dihedral angular and improper information is also rotationally invariant since:

As Eq. (18) shows, the inner product is rotationally invariant. Eq. (19) can therefore be further simplified as

The dihedral or improper angular information after rotation is calculated as:

Together, Eqs. (18) and (21) establish the rotational invariance of our proposed runtime geometry calculation (RGC).
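The argument above rests on one fact: a rotation \(R\) satisfies \(R^{\top}R=I\), so inner products of rotated vectors are unchanged. A minimal numerical check is sketched below. The definitions used are assumptions made for this sketch, since the paper's own equations are not reproduced in this section: `direction_sums` takes \(\overrightarrow{v}_i\) to be the sum of unit vectors from atom \(i\) to every other atom (no cutoff), the angle information to be \(\langle\overrightarrow{v}_i,\overrightarrow{v}_i\rangle\), and the dihedral information to be the inner product of the rejections of \(\overrightarrow{v}_i\) and \(\overrightarrow{v}_j\) from the bond direction. All function names are hypothetical.

```python
import math

def sub(a, b):
    return [x - y for x, y in zip(a, b)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def unit(a):
    norm = math.sqrt(dot(a, a))
    return [x / norm for x in a]

def direction_sums(positions):
    """Assumed definition: v_i is the sum of unit vectors from atom i to all other atoms."""
    sums = []
    for i, r_i in enumerate(positions):
        acc = [0.0, 0.0, 0.0]
        for j, r_j in enumerate(positions):
            if j != i:
                acc = [a + u for a, u in zip(acc, unit(sub(r_j, r_i)))]
        sums.append(acc)
    return sums

def rejection(v, u):
    """Component of v orthogonal to the unit vector u."""
    scale = dot(v, u)
    return [a - scale * b for a, b in zip(v, u)]

def rgc_invariants(positions, i, j):
    """Return (angle information at atom i, dihedral information on edge i-j)."""
    v = direction_sums(positions)
    u_ij = unit(sub(positions[j], positions[i]))
    u_ji = [-x for x in u_ij]
    angle_info = dot(v[i], v[i])
    dihedral_info = dot(rejection(v[i], u_ij), rejection(v[j], u_ji))
    return angle_info, dihedral_info

def rotate(p, alpha, beta):
    """Rotate p by alpha about the z axis, then by beta about the x axis."""
    x, y, z = p
    x, y = math.cos(alpha) * x - math.sin(alpha) * y, math.sin(alpha) * x + math.cos(alpha) * y
    y, z = math.cos(beta) * y - math.sin(beta) * z, math.sin(beta) * y + math.cos(beta) * z
    return [x, y, z]

molecule = [[0.0, 0.0, 0.0], [1.1, 0.2, -0.1], [0.3, 1.0, 0.4], [-0.8, 0.5, 0.9], [0.6, -0.7, 1.2]]
rotated = [rotate(p, 0.9, -1.3) for p in molecule]

before = rgc_invariants(molecule, 0, 1)
after = rgc_invariants(rotated, 0, 1)
print("before rotation:", before)
print("after rotation: ", after)
```

The two printed pairs should agree up to floating-point round-off, which mirrors the conclusion of the proof under the assumed definitions.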

We also provide proof of the equivariance of our ViS-MP in Supplementary Methods.

Detailed operations and modules in ViSNet

ViSNet predicts molecular properties (e.g., energy \(\hat{E}\), forces \(\overrightarrow{F}\in\mathbb{R}^{N\times 3}\), dipole moment \(\mu\)) from the current states of the atoms, namely the atomic positions \(X\in\mathbb{R}^{N\times 3}\) and atomic numbers \(Z\in\mathbb{N}^{N}\). The architecture of ViSNet is shown in Fig. 1. The overall design follows the vector–scalar interactive message passing described in Eqs. (8)–(11). First, an embedding block encodes the atomic numbers and edge distances into the embedding space. Then, a series of ViSNet blocks update the node-wise scalar and vector representations based on their interactions. A residual connection is placed between two ViSNet blocks. Finally, stacked gated equivariant blocks, as proposed in ref. 18, are attached to the output block for prediction of the specific molecular property.

The embedding block

ViSNet expands the direct node and edge embeddings with their neighbors. It first embeds the chemical symbol \(z_i\) of each atom and computes, through radial basis functions (RBF), the representation of each edge whose distance falls within the cutoff. Then the initial embeddings of atom \(i\), its 1-hop neighbors \(j\), and the directly connected edges \(e_{ij}\) within the cutoff are fused to form the initial node embedding \(h_i^0\) and edge embedding \(f_{ij}^0\). In summary, the embedding block is given by:

\(\mathcal{N}(i)\) denotes the set of 1-hop neighboring nodes of node \(i\), and \(j\) is one of its neighbors. The embedding process is elaborated in the Supplementary Information. The initial vector embedding \(\overrightarrow{v}_i\) is set to \(\overrightarrow{0}\). Following ref. 18, the vector embeddings \(\overrightarrow{v}\) are projected into the embedding space, so that \(\overrightarrow{v}\in\mathbb{R}^{N\times 3\times F}\), where \(F\) is the size of the hidden dimension. The advantage of this projection is that it assigns each embedding a unique high-dimensional representation, which makes the embeddings distinguishable from one another. Further discussion of its effectiveness and interpretability is given in the Results section.
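To make the edge-encoding step concrete, the sketch below expands interatomic distances with radial basis functions and keeps only edges inside the cutoff. It is a minimal illustration under stated assumptions: the text says only that RBFs are used, so the Gaussian form, the evenly spaced centers, the width, and the names `gaussian_rbf` and `edges_within_cutoff` are all hypothetical choices.

```python
import math

def gaussian_rbf(distance, cutoff=5.0, num_basis=8):
    """Expand a scalar distance into `num_basis` Gaussian features.

    Assumption: centers are evenly spaced on [0, cutoff] and the width is
    tied to the center spacing. The paper's exact basis may differ.
    """
    spacing = cutoff / (num_basis - 1)
    gamma = 1.0 / (2.0 * spacing ** 2)
    centers = [k * spacing for k in range(num_basis)]
    return [math.exp(-gamma * (distance - c) ** 2) for c in centers]

def edges_within_cutoff(positions, cutoff=5.0, num_basis=8):
    """Map each directed edge (i, j) with distance below the cutoff to its RBF features."""
    edges = {}
    for i, r_i in enumerate(positions):
        for j, r_j in enumerate(positions):
            if i == j:
                continue
            d = math.dist(r_i, r_j)
            if d < cutoff:
                edges[(i, j)] = gaussian_rbf(d, cutoff, num_basis)
    return edges

# Three atoms on a line; the third is too far away to form an edge.
toy_positions = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [10.0, 0.0, 0.0]]
toy_edges = edges_within_cutoff(toy_positions)
print(sorted(toy_edges))
```

In a full model these features would be combined with learned atom embeddings of the two endpoints; that fusion step is omitted here because its exact form is given only in the paper's equations and Supplementary Information.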

The Scalar2Vec module

In the Scalar2Vec module, the vector embedding \(\overrightarrow{v}\) is updated by both the scalar messages derived from the node and edge scalar embeddings (Eq. (8)) and the vector messages carrying inherent geometric information (Eq. (9)). The message of each atom is calculated through an Edge-Fusion Graph Attention module, which fuses the node and edge embeddings and computes the attention scores. The node and edge embeddings can be fused by concatenation, by the Hadamard product, or by adding a learnable bias 55. We use the Hadamard product together with the vanilla multi-head attention mechanism of the Transformer 56 for edge–node fusion.

Following ref. 19, we pass the fused representations through a nonlinear activation function, as shown in Eq. (23). The value (\(V\)) in the attention mechanism is also fused with edge features before being multiplied by the attention scores, which are weighted by a cosine cutoff, as shown in Eq. (24),

where \(l\in\{0,1,2,\cdots,L\}\) is the index of the block, \(\sigma\) denotes the activation function (SiLU in this paper), \(W\) is the learnable weight matrix, \(\odot\) represents the Hadamard product, \(\phi(\cdot)\) denotes the cosine cutoff, and Dense(\(\cdot\)) refers to one learnable weight matrix followed by an activation function. For brevity, we omit the learnable bias of the linear transformations on scalar embeddings in the equations. No bias is applied to vector embeddings, which ensures equivariance.
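Two of the ingredients named above, the SiLU activation and the cosine cutoff, are simple enough to sketch directly. The cosine cutoff is written here in the form commonly used by related interatomic models, which is an assumption since the paper's equations are not reproduced in this section. The helper `cutoff_weighted_score` is hypothetical and only illustrates how a cutoff can damp a score smoothly to zero at the cutoff radius.

```python
import math

def silu(x):
    """SiLU activation: x * sigmoid(x)."""
    return x / (1.0 + math.exp(-x))

def cosine_cutoff(distance, cutoff):
    """Smooth weight that falls from 1 at distance 0 to 0 at the cutoff (assumed form)."""
    if distance >= cutoff:
        return 0.0
    return 0.5 * (math.cos(math.pi * distance / cutoff) + 1.0)

def cutoff_weighted_score(raw_score, distance, cutoff):
    """Hypothetical helper: activate a raw attention score, then damp it by the cutoff."""
    return silu(raw_score) * cosine_cutoff(distance, cutoff)

for d in (0.0, 2.5, 4.9, 5.0):
    print(d, cosine_cutoff(d, 5.0), cutoff_weighted_score(1.0, d, 5.0))
```

Because the weight reaches zero continuously at the cutoff, atoms entering or leaving the neighborhood change the messages smoothly, which matters when forces are later taken as gradients of the energy.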

Then, the computed \({m}_{ij}^{l}\) is used to produce the geometric messages \({\overrightarrow{m}}_{ij}^{l}\) for vectors:

The vector embedding \(\overrightarrow{v}^{l}\) is then updated by:

The Vec2Scalar module

In the Vec2Scalar module, the node embedding \(h_i^l\) and edge embedding \(f_{ij}^l\) are updated with the geometric information extracted by the RGC strategy, i.e., angles (Eq. (10)) and dihedrals (Eq. (11)), respectively. The residual node embedding \(\Delta h_i^{l+1}\) is calculated as the Hadamard product of the runtime angle information and the aggregated scalar messages, with a gated residual connection:

To compute the residual edge embedding \({{\Delta }}{f}_{ij}^{l+1}\) , we perform the Hadamard product of the runtime dihedral information with the transformed edge embedding:

After the residual hidden representations are calculated, we add them to the original input of block \(l\) and feed the result to the next block.

A comprehensive version that includes improper angles is depicted in Supplementary Methods.

The output block

Following PaiNN 18 , we update the scalar embedding and vector embedding of nodes with multiple gated equivariant blocks:

where \([\cdot,\cdot]\) is the tensor concatenation operation. The final scalar embedding \(h_i^L\in\mathbb{R}^{N\times 1}\) and vector embedding \(\overrightarrow{v}_i^L\in\mathbb{R}^{N\times 3\times 1}\) are used to predict various molecular properties.

On QM9, the molecular dipole is calculated as follows:

where \(\overrightarrow{r}_c\) denotes the center of mass. Similarly, for the prediction of the electronic spatial extent \(\langle R^2\rangle\), we use the following equation:

For the remaining 10 properties y , we simply aggregate the final scalar embedding of nodes as follows:

For models trained on the molecular dynamics datasets, including MD17, revised MD17, and Chignolin, the total potential energy is obtained as the sum of the final scalar embeddings of the nodes. Because ViSNet is an energy-conserving potential, the forces are then calculated as the negative gradients of the predicted total potential energy with respect to the atomic coordinates:
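The two steps just described, summing per-atom contributions into a total energy and taking forces as its negative gradient, can be sketched as follows. This is an illustration under stated assumptions: `total_energy` is a hypothetical toy function standing in for the learned model, and central finite differences are swapped in for the automatic differentiation a trained network would normally use, so that the example has no dependencies.

```python
import math

def total_energy(positions):
    """Toy stand-in for a learned model: total energy as a sum of per-atom contributions."""
    per_atom = []
    for i, r_i in enumerate(positions):
        e_i = 0.0
        for j, r_j in enumerate(positions):
            if i != j:
                # Each pair energy exp(-d) is split evenly between its two atoms.
                e_i += 0.5 * math.exp(-math.dist(r_i, r_j))
        per_atom.append(e_i)
    return sum(per_atom)

def numerical_forces(energy_fn, positions, step=1e-5):
    """Forces F_i = -dE/dr_i, estimated with central finite differences."""
    forces = []
    for i in range(len(positions)):
        f_i = []
        for axis in range(3):
            plus = [list(p) for p in positions]
            minus = [list(p) for p in positions]
            plus[i][axis] += step
            minus[i][axis] -= step
            f_i.append(-(energy_fn(plus) - energy_fn(minus)) / (2.0 * step))
        forces.append(f_i)
    return forces

atoms = [[0.0, 0.0, 0.0], [1.0, 0.2, 0.0], [0.3, 1.1, 0.4], [-0.6, 0.4, 0.9]]
atom_forces = numerical_forces(total_energy, atoms)
net_force = [sum(f[axis] for f in atom_forces) for axis in range(3)]
print("net force:", net_force)
```

Because the toy energy depends only on interatomic distances, the net force should vanish up to numerical error. This is one practical sanity check for any potential whose forces are derived from a scalar energy.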

Statistics and reproducibility

For the QM9 dataset, we randomly split the data into 110,000 samples for training, 10,000 samples for validation, and the rest for testing, following previous studies 18, 19. For the Molecule3D and OGB-LSC PCQM4Mv2 datasets, the splits are those provided in their respective papers 32, 33.

To evaluate the effectiveness of ViSNet on simulation data, ViSNet was trained on MD17 and rMD17 in a limited-data setting, with only 950 uniformly sampled conformations for training and 50 conformations for validation for each molecule. For the MD22 dataset, we used the same number of molecules as in ref. 30 for training and validation, and the rest as the test set.

Furthermore, the whole Chignolin dataset was randomly split into 80%, 10%, and 10% as the training, validation, and test datasets. Six representative conformations were picked from the test set for illustration.
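The count-based and fraction-based random splits described above can be sketched as below. The function names, the seed, and the dataset sizes in the usage lines are illustrative assumptions; the paper does not state a seed, and 130,000 is a placeholder rather than the actual QM9 size.

```python
import random

def split_by_counts(num_samples, num_train, num_val, seed=0):
    """Shuffle indices once, then cut them into train / validation / test."""
    indices = list(range(num_samples))
    random.Random(seed).shuffle(indices)
    train = indices[:num_train]
    val = indices[num_train:num_train + num_val]
    test = indices[num_train + num_val:]  # everything left over
    return train, val, test

def split_by_fractions(num_samples, train_frac=0.8, val_frac=0.1, seed=0):
    """Fractional variant, e.g. 80% / 10% / 10%."""
    num_train = int(num_samples * train_frac)
    num_val = int(num_samples * val_frac)
    return split_by_counts(num_samples, num_train, num_val, seed)

# QM9-style split by counts (placeholder total size).
qm9_train, qm9_val, qm9_test = split_by_counts(130_000, 110_000, 10_000)

# Chignolin-style split by fractions (placeholder total size).
chig_train, chig_val, chig_test = split_by_fractions(1_000)
print(len(qm9_train), len(qm9_val), len(qm9_test))
print(len(chig_train), len(chig_val), len(chig_test))
```

Fixing the seed makes the split reproducible, and deriving all three subsets from a single shuffled index list guarantees that they are disjoint.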

Experimental settings

For the QM9 dataset, we adopted a batch size of 32 and a learning rate of 1e−4 for all properties. For the Molecule3D dataset, we adopted a larger batch size of 512 and a learning rate of 2e−4. For the OGB-LSC PCQM4Mv2 dataset, we trained our model in a mixed 2D/3D mode with a batch size of 256 and a learning rate of 2e−4. The mean squared error (MSE) loss was used for model training. For the molecular dynamics datasets, including MD17, rMD17, MD22, and Chignolin, we used a combined MSE loss for energy and force prediction, with the weight of the energy loss set to 0.05 and the weight of the force loss set to 0.95. The batch size was chosen from 2, 4, and 8 because of GPU memory limits, and the learning rate was chosen from 1e−4 to 4e−4 for different molecules.

The cutoff was set to 5 for small molecules in QM9, MD17, rMD17, and Molecule3D, and reduced to 4 for Chignolin in order to limit the number of edges in the molecular graphs. For the MD22 dataset, the cutoff was set to 5 for relatively small molecules and to 4 for larger ones. No cutoff was used for the OGB-LSC PCQM4Mv2 dataset.

We decayed the learning rate when the validation loss stopped decreasing. The patience was set to 5 epochs for Molecule3D, 15 epochs for QM9, and 30 epochs for MD17, rMD17, MD22, and Chignolin, and the learning rate decay factor was set to 0.8 for these models. Training was stopped when a maximum number of epochs was reached or when the validation loss had not improved within the early-stopping patience.

The ViSNet models trained on the molecular dynamics datasets and Molecule3D had 9 hidden layers and an embedding dimension of 256. We used a larger model for the QM9 dataset, with the embedding dimension increased to 512. For the OGB-LSC PCQM4Mv2 dataset, we used the 12-layer, 768-dimension Transformer-M 37 as the backbone. More details about the hyperparameters of ViSNet can be found in Supplementary Table S4. Experiments were conducted on NVIDIA V100 GPUs with 32 GB of memory.
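The weighted energy/force objective and the plateau-based learning rate decay described in these settings can be sketched in a few lines. The weights (0.05 and 0.95) and the decay factor (0.8) come from the text. The names `combined_loss` and `PlateauDecay`, and the exact rule for counting non-improving epochs, are assumptions for illustration and are not the training code used in the paper.

```python
def mse(pred, target):
    """Mean squared error over two equal-length sequences of floats."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

def combined_loss(e_pred, e_true, f_pred, f_true, w_energy=0.05, w_force=0.95):
    """Weighted sum of energy MSE and force MSE (force components flattened)."""
    return w_energy * mse(e_pred, e_true) + w_force * mse(f_pred, f_true)

class PlateauDecay:
    """Multiply the learning rate by `factor` once the validation loss has
    failed to improve for more than `patience` consecutive epochs (assumed rule)."""

    def __init__(self, lr, factor=0.8, patience=30):
        self.lr = lr
        self.factor = factor
        self.patience = patience
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        if val_loss < self.best:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
            if self.bad_epochs > self.patience:
                self.lr *= self.factor
                self.bad_epochs = 0
        return self.lr

# One energy and two force components, for illustration only.
example_loss = combined_loss([1.0], [0.0], [0.0, 2.0], [0.0, 0.0])

scheduler = PlateauDecay(lr=1e-4, factor=0.8, patience=2)
history = [scheduler.step(loss) for loss in (1.0, 1.0, 1.0, 1.0)]
print(example_loss, history)
```

Weighting forces far more heavily than energies is a common choice for molecular dynamics potentials, since each conformation supplies one energy label but three force components per atom.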

Reporting summary

Further information on research design is available in the  Nature Portfolio Reporting Summary linked to this article.

Data availability

All relevant data supporting the key findings of this study are available within the article and its Supplementary Information files. MD17 dataset [ http://www.quantum-machine.org/gdml/data/npz ], MD22 dataset [ http://www.quantum-machine.org/gdml/data/npz ], rMD17 dataset [ https://archive.materialscloud.org/record/file?filename=rmd17.tar.bz2&record_id=466 ], QM9 dataset [ https://deepchemdata.s3-us-west-1.amazonaws.com/datasets/molnet_publish/qm9.zip ], Molecule3D dataset [ https://github.com/divelab/MoleculeX/tree/molx/Molecule3D ], OGB-LSC PCQM4Mv2 dataset [ https://ogb.stanford.edu/docs/lsc/pcqm4mv2 ] and Chignolin dataset [ https://github.com/microsoft/AI2BMD/tree/ViSNet/chignolin_data ].  Source data are provided with this paper.

Code availability

Most experiments were run with Python 3.9.15, PyTorch 1.11.0, PyTorch Geometric 2.1.0, and PyTorch Lightning 1.8.0. The code used to reproduce our results is available at https://github.com/microsoft/AI2BMD/tree/ViSNet 57 . Matplotlib and Seaborn were used for plotting figures.

1. Chow, E., Klepeis, J., Rendleman, C., Dror, R. & Shaw, D. 9.6 new technologies for molecular dynamics simulations. In Comprehensive Biophysics (ed. Egelman, E. H.) 86–104 (Elsevier, Amsterdam, 2012).
2. Singh, S. & Singh, V. K. Molecular dynamics simulation: methods and application. In Frontiers in Protein Structure, Function, and Dynamics (eds Singh, D. B. & Tripathi, T.) 213–238 (Springer, 2020).
3. Lu, S. et al. Activation pathway of a G protein-coupled receptor uncovers conformational intermediates as targets for allosteric drug design. Nat. Commun. 12, 1–15 (2021).
4. Li, Y. et al. Exploring the regulatory function of the N-terminal domain of SARS-CoV-2 spike protein through molecular dynamics simulation. Adv. Theory Simul. 4, 2100152 (2021).
5. Kohn, W. & Sham, L. J. Self-consistent equations including exchange and correlation effects. Phys. Rev. 140, A1133 (1965).
6. Marx, D. & Hutter, J. Ab Initio Molecular Dynamics: Basic Theory and Advanced Methods (Cambridge University Press, 2009).
7. Christensen, A. S., Bratholm, L. A., Faber, F. A. & Anatole von Lilienfeld, O. FCHL revisited: faster and more accurate quantum machine learning. J. Chem. Phys. 152, 044107 (2020).
8. Bartók, A. P., Payne, M. C., Kondor, R. & Csányi, G. Gaussian approximation potentials: the accuracy of quantum mechanics, without the electrons. Phys. Rev. Lett. 104, 136403 (2010).
9. Chmiela, S. et al. Machine learning of accurate energy-conserving molecular force fields. Sci. Adv. 3, e1603015 (2017).
10. Chmiela, S., Sauceda, H. E., Müller, K.-R. & Tkatchenko, A. Towards exact molecular dynamics simulations with machine-learned force fields. Nat. Commun. 9, 1–10 (2018).
11. Chmiela, S. et al. Accurate global machine learning force fields for molecules with hundreds of atoms. Sci. Adv. 9, eadf0873 (2023).
12. Batzner, S. et al. E(3)-equivariant graph neural networks for data-efficient and accurate interatomic potentials. Nat. Commun. 13, 1–11 (2022).
13. Brandstetter, J., Hesselink, R., van der Pol, E., Bekkers, E. & Welling, M. Geometric and physical quantities improve E(3) equivariant message passing. In International Conference on Learning Representations (OpenReview.net, 2022).
14. Hutchinson, M. J. et al. LieTransformer: equivariant self-attention for Lie groups. In International Conference on Machine Learning (eds Meila, M. & Zhang, T.) 4533–4543 (PMLR, 2021).
15. Fuchs, F., Worrall, D., Fischer, V. & Welling, M. SE(3)-transformers: 3D roto-translation equivariant attention networks. Adv. Neural Inf. Process. Syst. 33, 1970–1981 (2020).
16. Gasteiger, J., Groß, J. & Günnemann, S. Directional message passing for molecular graphs. In International Conference on Learning Representations (OpenReview.net, 2019).
17. Gasteiger, J. et al. Fast and uncertainty-aware directional message passing for non-equilibrium molecules. arXiv preprint arXiv:2011.14115 (2020).
18. Schütt, K., Unke, O. & Gastegger, M. Equivariant message passing for the prediction of tensorial properties and molecular spectra. In International Conference on Machine Learning (eds Meila, M. & Zhang, T.) 9377–9388 (PMLR, 2021).
19. Thölke, P. & De Fabritiis, G. TorchMD-Net: equivariant transformers for neural network based molecular potentials. In The International Conference on Learning Representations (OpenReview.net, 2022).
20. Gasteiger, J., Becker, F. & Günnemann, S. GemNet: universal directional graph neural networks for molecules. Adv. Neural Inf. Process. Syst. 34, 6790–6802 (2021).
21. Unke, O. T. et al. SpookyNet: learning force fields with electronic degrees of freedom and nonlocal effects. Nat. Commun. 12, 1–14 (2021).
22. Musaelian, A. et al. Learning local equivariant representations for large-scale atomistic dynamics. Nat. Commun. 14, 579 (2023).
23. Batatia, I. et al. MACE: higher order equivariant message passing neural networks for fast and accurate force fields. Adv. Neural Inf. Process. Syst. 35, 11423–11436 (2022).
24. Han, J., Rong, Y., Xu, T. & Huang, W. Geometrically equivariant graph neural networks: a survey. arXiv preprint arXiv:2202.07230 (2022).
25. Perwass, C., Edelsbrunner, H., Kobbelt, L. & Polthier, K. Geometric Algebra With Applications in Engineering Vol. 4 (Springer, 2009).
26. Zitnick, L. et al. Spherical channels for modeling atomic interactions. Adv. Neural Inf. Process. Syst. 35, 8054–8067 (2022).
27. Liao, Y.-L. & Smidt, T. Equiformer: equivariant graph attention transformer for 3D atomistic graphs. In The Eleventh International Conference on Learning Representations (2022).
28. Schütt, K. T., Arbabzadah, F., Chmiela, S., Müller, K. R. & Tkatchenko, A. Quantum-chemical insights from deep tensor neural networks. Nat. Commun. 8, 1–8 (2017).
29. Christensen, A. S. & Von Lilienfeld, O. A. On the role of gradients for machine learning of molecular energies and forces. Mach. Learn.: Sci. Technol. 1, 045018 (2020).
30. Chmiela, S. et al. Accurate global machine learning force fields for molecules with hundreds of atoms. Sci. Adv. 9, eadf0873 (2023).
31. Ramakrishnan, R., Dral, P. O., Rupp, M. & Von Lilienfeld, O. A. Quantum chemistry structures and properties of 134 kilo molecules. Sci. Data 1, 1–7 (2014).
32. Xu, Z. et al. Molecule3D: a benchmark for predicting 3D geometries from molecular graphs. arXiv preprint arXiv:2110.01717 (2021).
33. Hu, W. et al. OGB-LSC: a large-scale challenge for machine learning on graphs. In Proc. of the 35th Conference on Neural Information Processing Systems Datasets and Benchmarks Track (Round 2) (eds Vanschoren, J. & Yeung, S.) (Neural Information Processing Systems Foundation, Inc., 2021).
34. Wang, L., Liu, Y., Lin, Y., Liu, H. & Ji, S. ComENet: towards complete and efficient message passing for 3D molecular graphs. Adv. Neural Inf. Process. Syst. 35, 650–664 (2022).
35. Qiao, Z. et al. Informing geometric deep learning with electronic interactions to accelerate quantum chemistry. Proc. Natl Acad. Sci. USA 119, e2205221119 (2022).
36. Frank, T., Unke, O. T. & Muller, K. R. So3krates: equivariant attention for interactions on arbitrary length-scales in molecular systems. In Advances in Neural Information Processing Systems (eds Koyejo, S. et al.) (Curran Associates, Inc., 2022).
37. Luo, S. et al. One transformer can understand both 2D & 3D molecular data. arXiv preprint arXiv:2210.01765 (2022).
38. Wang, Y. et al. An ensemble of ViSNet, Transformer-M, and pretraining models for molecular property prediction in OGB large-scale challenge @ NeurIPS 2022. arXiv preprint arXiv:2211.12791 (2022).
39. Larsen, A. H. et al. The atomic simulation environment—a Python library for working with atoms. J. Phys.: Condens. Matter 29, 273002 (2017).
40. Smith, J. S., Isayev, O. & Roitberg, A. E. ANI-1: an extensible neural network potential with DFT accuracy at force field computational cost. Chem. Sci. 8, 3192–3203 (2017).
41. Zhang, L., Han, J., Wang, H., Car, R. & E, W. Deep potential molecular dynamics: a scalable model with the accuracy of quantum mechanics. Phys. Rev. Lett. 120, 143001 (2018).
42. Unke, O. T. & Meuwly, M. PhysNet: a neural network for predicting energies, forces, dipole moments, and partial charges. J. Chem. Theory Comput. 15, 3678–3693 (2019).
43. Qi, R., Wei, G., Ma, B. & Nussinov, R. Replica exchange molecular dynamics: a practical application protocol with solutions to common problems and a peptide aggregation and self-assembly example. In Peptide Self-assembly (eds Nilsson, B. L. & Doran, T. M.) 101–119 (Springer, 2018).
44. Frisch, M. J. et al. Gaussian 16 Revision C.01 (Gaussian Inc. Wallingford, CT, 2016).
45. Wang, T., He, X., Li, M., Shao, B. & Liu, T.-Y. AIMD-Chig: exploring the conformational space of a 166-atom protein chignolin with ab initio molecular dynamics. Sci. Data 10, 549 (2023).
46. Wang, Z. et al. Improving machine learning force fields for molecular dynamics simulations with fine-grained force metrics. J. Chem. Phys. 159, 035101 (2023).
47. Gasteiger, J. et al. GemNet-OC: developing graph neural networks for large and diverse molecular simulation datasets. Transactions on Machine Learning Research (2022).
48. Case, D. A. et al. Amber 2021 (University of California, San Francisco, 2021).
49. Van der Maaten, L. & Hinton, G. Visualizing data using t-SNE. J. Mach. Learn. Res. 9, 2579–2605 (2008).
50. Ester, M., Kriegel, H.-P., Sander, J., Xu, X. et al. A density-based algorithm for discovering clusters in large spatial databases with noise. In KDD (eds Simoudis, E., Han, J. & Fayyad, U.) Vol. 96, 226–231 (AAAI Press, 1996).
51. Nesbet, R. Atomic Bethe–Goldstone equations. III. Correlation energies of ground states of Be, B, C, N, O, F, and Ne. Phys. Rev. 175, 2 (1968).
52. Hankins, D., Moskowitz, J. & Stillinger, F. Water molecule interactions. J. Chem. Phys. 53, 4544–4554 (1970).
53. Gordon, M. S., Fedorov, D. G., Pruitt, S. R. & Slipchenko, L. V. Fragmentation methods: a route to accurate calculations on large systems. Chem. Rev. 112, 632–672 (2012).
54. Joshi, C. K., Bodnar, C., Mathis, S. V., Cohen, T. & Lio, P. On the expressive power of geometric graph neural networks. arXiv preprint arXiv:2301.09308 (2023).
55. Ying, C. et al. Do transformers really perform badly for graph representation? Adv. Neural Inf. Process. Syst. 34 (2021).
56. Vaswani, A. et al. Attention is all you need. Adv. Neural Inf. Process. Syst. 30 (2017).
57. Wang, T. Enhancing geometric representations for molecules with equivariant vector–scalar interactive message passing. AI2BMD https://doi.org/10.5281/zenodo.10069040 (2023).


We would like to express our sincere gratitude to S. Chmiela, H.E. Sauceda, K.R. Müller, and A. Tkatchenko, for their invaluable assistance in performing the simulations and analyzing the vibrational spectra. Their extensive expertise and knowledge greatly contributed to the completion of the supplementary experiments, making our manuscript more solid.

Author information

These authors contributed equally: Yusong Wang, Tong Wang, Shaoning Li.

Authors and Affiliations

Microsoft Research AI4Science, 100080, Beijing, China

Yusong Wang, Tong Wang, Shaoning Li, Xinheng He, Mingyu Li, Zun Wang, Bin Shao & Tie-Yan Liu

National Key Laboratory of Human–Machine Hybrid Augmented Intelligence, National Engineering Research Center for Visual Information and Applications, and Institute of Artificial Intelligence and Robotics, Xi’an Jiaotong University, 710049, Xi’an, China

Yusong Wang & Nanning Zheng

The CAS Key Laboratory of Receptor Research and State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 201203, Shanghai, China

University of Chinese Academy of Sciences, 100049, Beijing, China

Medicinal Chemistry and Bioinformatics Center, School of Medicine, Shanghai Jiaotong University, Shanghai, 200025, China


T.W. led, conceived, and designed the study. T.W. is the lead contact. Y.W., S.L., X.H., and M.L. conducted the work when they were visiting Microsoft Research. S.L., Y.W., and T.W. carried out algorithm design. Y.W., S.L., X.H., and T.W. carried out experiments, evaluations, analysis, and visualization. Y.W. and S.L. wrote the original manuscript. T.W., X.H., M.L., Z.W., and B.S. revised the manuscript. N.Z. and T.-Y.L. contributed to the writing. All authors reviewed the final manuscript.

Corresponding authors

Correspondence to Tong Wang or Bin Shao .

Ethics declarations

Competing interests.

T.W., B.S., and T.-Y.L. have been filing a patent on the ViSNet model. The remaining authors declare no competing interests.

Peer review

Peer review information.

Nature Communications thanks Zhirong Liu and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. A peer review file is available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/ .

About this article

Cite this article.

Wang, Y., Wang, T., Li, S. et al. Enhancing geometric representations for molecules with equivariant vector-scalar interactive message passing. Nat Commun 15 , 313 (2024). https://doi.org/10.1038/s41467-023-43720-2

Received : 04 May 2023

Accepted : 16 November 2023

Published : 05 January 2024

DOI : https://doi.org/10.1038/s41467-023-43720-2

case study research article


  1. Case Study Methodology of Qualitative Research: Key Attributes and

    A case study is one of the most commonly used methodologies of social research. This article attempts to look into the various dimensions of a case study research strategy, the different epistemological strands which determine the particular case study type and approach adopted in the field, discusses the factors which can enhance the effectiveness of a case study research, and the debate ...

  2. Case Study Method: A Step-by-Step Guide for Business Researchers

    Some famous books about case study methodology (Merriam, 2002; Stake, 1995; Yin, 2011) provide useful details on case study research but they emphasize more on theory as compared to practice, and most of them do not provide the basic knowledge of case study conduct for beginners (Hancock & Algozzine, 2016). This article is an attempt to bridge ...

  3. (PDF) Case Study Research

    This study employed a qualitative case study methodology. The case study method is a research strategy that aims to gain an in-depth understanding of a specific phenomenon by collecting and ...

  4. Continuing to enhance the quality of case study methodology in health

    Introduction. The popularity of case study research methodology in Health Services Research (HSR) has grown over the past 40 years. 1 This may be attributed to a shift towards the use of implementation research and a newfound appreciation of contextual factors affecting the uptake of evidence-based interventions within diverse settings. 2 Incorporating context-specific information on the ...

  5. Case study research for better evaluations of complex interventions

    Case study research, as an overall approach, is based on in-depth explorations of complex phenomena in their natural, or real-life, settings. Empirical case studies typically enable dynamic understanding of complex challenges and provide evidence about causal mechanisms and the necessary and sufficient conditions (contexts) for intervention ...

  6. The case study approach

    A case study is a research approach that is used to generate an in-depth, multi-faceted understanding of a complex issue in its real-life context. It is an established research design that is used extensively in a wide variety of disciplines, particularly in the social sciences. A case study can be defined in a variety of ways (Table 5 ), the ...

  7. What Is a Case Study?

    Revised on November 20, 2023. A case study is a detailed study of a specific subject, such as a person, group, place, event, organization, or phenomenon. Case studies are commonly used in social, educational, clinical, and business research. A case study research design usually involves qualitative methods, but quantitative methods are ...

  8. Case Study Research: In-Depth Understanding in Context

    Abstract. This chapter explores case study as a major approach to research and evaluation. After first noting various contexts in which case studies are commonly used, the chapter focuses on case study research directly Strengths and potential problematic issues are outlined and then key phases of the process.

  9. Case Study Methods and Examples

    The purpose of case study research is twofold: (1) to provide descriptive information and (2) to suggest theoretical relevance. Rich description enables an in-depth or sharpened understanding of the case. It is unique given one characteristic: case studies draw from more than one data source. Case studies are inherently multimodal or mixed ...

  10. Toward Developing a Framework for Conducting Case Study Research

    Articles Purpose of Case Study Research Reasons to Use Case Study Research Types of Case Study Research Methods of Gathering Data Data Analysis; Sustainability and scalability of university spinouts: a business model perspective (Bigdeli et al., 2015) Theory oriented: Theory extension/refinement:

  11. Methodology or method? A critical review of qualitative case study reports

    Definitions of qualitative case study research. Case study research is an investigation and analysis of a single or collective case, intended to capture the complexity of the object of study (Stake, Citation 1995).Qualitative case study research, as described by Stake (Citation 1995), draws together "naturalistic, holistic, ethnographic, phenomenological, and biographic research methods ...

  12. What is a case study?

    Case study is a research methodology, typically seen in social and life sciences. There is no one definition of case study research.1 However, very simply… 'a case study can be defined as an intensive study about a person, a group of people or a unit, which is aimed to generalize over several units'.1 A case study has also been described as an intensive, systematic investigation of a ...

  13. Perspectives from Researchers on Case Study Design

    This article reviews the use of case study research for both practical and theoretical issues especially in management field with the emphasis on management of technology and innovation. Many researchers commented on the methodological issues of the case study research from their point of view thus, presenting a comprehensive framework was missing.

  14. Case study research: opening up research opportunities

    1. Introduction. The case study as a research method or strategy brings us to question the very term "case": after all, what is a case? A case-based approach places accords the case a central role in the research process (Ragin, 1992).However, doubts still remain about the status of cases according to different epistemologies and types of research designs.

  15. Case study research for better evaluations of complex interventions

    The overall approach of case study research is based on the in-depth exploration of complex phenomena in their natural, or 'real-life', settings. Empirical case studies typically enable dynamic understanding of complex challenges rather than restricting the focus on narrow problem delineations and simple fixes.

  16. Planning Qualitative Research: Design and Decision Making for New

    A case study can be a complete research project in itself, such as in the study of a particular organization, community, or program. Case studies are also often used for evaluation purposes, for example, in an external review. In educational contexts, case studies can be used to illustrate, test, or extend a theory, or assist other educators to ...

  17. (PDF) The case study as a type of qualitative research

    Abstract. This article presents the case study as a type of qualitative research. Its aim is to give a detailed description of a case study - its definition, some classifications, and several ...

  18. The case study approach

    A case study is a research approach that is used to generate an in-depth, multi-faceted understanding of a complex issue in its real-life context. It is an established research design that is used extensively in a wide variety of disciplines, particularly in the social sciences. A case study can be defined in a variety of ways (Table.

  19. Case study research

    This article describes case study research for nursing and healthcare practice. Case study research offers the researcher an approach by which a phenomenon can be investigated from multiple perspectives within a bounded context, allowing the researcher to provide a 'thick' description of the phenomenon. Although case study research is a ...

  20. Methodology or method? A critical review of qualitative case study

    Case studies are designed to suit the case and research question, and published case studies demonstrate wide diversity in study design. There are two popular case study approaches in qualitative research. The first, proposed by Stake (1995) and Merriam (2009), is situated in a social constructivist paradigm, whereas the second, by Yin (2012 ...

  21. Innovation in Pursuit of Patient-Centered Care

    This special theme issue of NEJM Catalyst Innovations in Care Delivery, guest edited by James Merlino, MD, FACS, includes articles, case studies, and research reports on patient-centered care, PROMs and PREMs, at-home acute care, and payment for patient safety. Listen to the audio summary of NEJM ...

  22. Calculation of the minimum clinically important difference ...

    All analyses in this case study were carried out using R version 2023.12 + 402 (The R Foundation for Statistical Computing, Vienna ... Establishing reliable thresholds for MCID is key in clinical research and forms the basis of patient-centered treatment evaluations when using patient-reported outcome measures or objective functional tests. ...

  23. ADHD Medications and Long-Term Risk of Cardiovascular Diseases

    To our knowledge, few previous studies have investigated the association between long-term ADHD medication use and the risk of CVD with follow-up of more than 2 years. 13 The only 2 prior studies with long-term follow-up (median, 9.5 and 7.9 years 30,31) found an average 2-fold and 3-fold increased risk of CVD with ADHD medication use compared ...

  24. Case Study Methodology of Qualitative Research: Key Attributes and

    Case Study Methodology: Basic Definitions and Concepts. The article dives into the case study research strategy by highlighting some of its fundamental definitions and aspects. Yin (2009, p. 18) defines case study as an empirical inquiry which investigates a phenomenon in its real-life context. In a case study research, multiple methods

  25. Research: Using AI at Work Makes Us Lonelier and Less Healthy

    Joel Koopman is the TJ Barlow Professor of Business Administration at the Mays Business School of Texas A&M University. His research interests include prosocial behavior, organizational justice ...

  26. Distinguishing case study as a research method from case reports as a

    VARIATIONS ON CASE STUDY METHODOLOGY. Case study methodology is evolving and regularly reinterpreted. Comparative or multiple case studies are used as a tool for synthesizing information across time and space to research the impact of policy and practice in various fields of social research. Because case study research is in-depth and intensive, there have been efforts to simplify the method ...

  27. Minerals

    With the decrease in shallow mineral reserves, deep mineral resources have become the focus of exploration. Seismic exploration, renowned for its deep penetration and high spatial resolution and precision, stands as a primary technique in geophysical exploration. In comparison to traditional P-wave seismic exploration, multi-component seismic techniques offer the advantage of simultaneously ...

  28. What Is a Case, and What Is a Case Study?

    Abstract. Case study is a common methodology in the social sciences (management, psychology, science of education, political science, sociology). A lot of methodological papers have been dedicated to case study but, paradoxically, the question "what is a case?" has been less studied.

  29. Enhancing geometric representations for molecules with ...

    Furthermore, through a series of simulations and case studies, ViSNet can efficiently explore the conformational space and provide reasonable interpretability to map geometric representations to ...

  30. Fostering Sustainability Through Workplace Spirituality: A Qualitative

    Workplace spirituality was explored through this qualitative study in three case study social sector organizations in Pakistan and it was inferred how these organizations carried sustainable organizational practice. The interviews and focus group discussions were carried out with the organizational members.