Show that you understand the current state of research on your topic.
The length of a research proposal can vary quite a bit. A bachelor’s or master’s thesis proposal can be just a few pages, while proposals for PhD dissertations or research funding are usually much longer and more detailed. Your supervisor can help you determine the best length for your work.
One trick to get started is to think of your proposal’s structure as a shorter version of your thesis or dissertation, only without the results, conclusion and discussion sections.
Download our research proposal template
Writing a research proposal can be quite challenging, but a good starting point could be to look at some examples. We’ve included a few for you below.
Like your dissertation or thesis, the proposal will usually have a title page that includes:
The first part of your proposal is the initial pitch for your project. Make sure it succinctly explains what you want to do and why.
Your introduction should:
To guide your introduction, include information about:
As you get started, it’s important to demonstrate that you’re familiar with the most important research on your topic. A strong literature review shows your reader that your project has a solid foundation in existing knowledge or theory. It also shows that you’re not simply repeating what other people have already done or said, but rather using existing research as a jumping-off point for your own.
In this section, share exactly how your project will contribute to ongoing conversations in the field by:
Following the literature review, restate your main objectives. This brings the focus back to your own project. Next, your research design or methodology section will describe your overall approach and the practical steps you will take to answer your research questions.
To finish your proposal on a strong note, explore the potential implications of your research for your field. Emphasize again what you aim to contribute and why it matters.
For example, your results might have implications for:
Last but not least, your research proposal must include correct citations for every source you have used, compiled in a reference list. To create citations quickly and easily, you can use our free APA citation generator.
Some institutions or funders require a detailed timeline of the project, asking you to forecast what you will do at each stage and how long it may take. A timeline is not always required, so be sure to check the requirements of your project.
Here’s an example schedule to help you get started. You can also download a template at the button below.
Download our research schedule template
Research phase | Objectives | Deadline |
---|---|---|
1. Background research and literature review | | 20th January |
2. Research design planning | and data analysis methods | 13th February |
3. Data collection and preparation | with selected participants and code interviews | 24th March |
4. Data analysis | of interview transcripts | 22nd April |
5. Writing | | 17th June |
6. Revision | final work | 28th July |
If you are applying for research funding, chances are you will have to include a detailed budget. This shows your estimates of how much each part of your project will cost.
Make sure to check what type of costs the funding body will agree to cover. For each item, include:
To determine your budget, think about:
If you want to know more about the research process, methodology, research bias, or statistics, make sure to check out some of our other articles with explanations and examples.
Methodology
Statistics
Research bias
Once you’ve decided on your research objectives, you need to explain them in your paper, at the end of your problem statement.
Keep your research objectives clear and concise, and use appropriate verbs to accurately convey the work that you will carry out for each one.
I will compare …
A research aim is a broad statement indicating the general purpose of your research project. It should appear in your introduction at the end of your problem statement, before your research objectives.
Research objectives are more specific than your research aim. They indicate the specific ways you’ll address the overarching aim.
A PhD, which is short for philosophiae doctor (doctor of philosophy in Latin), is the highest university degree that can be obtained. In a PhD, students spend 3–5 years writing a dissertation, which aims to make a significant, original contribution to current knowledge.
A PhD is intended to prepare students for a career as a researcher, whether that be in academia, the public sector, or the private sector.
A master’s is a 1- or 2-year graduate degree that can prepare you for a variety of careers.
All master’s involve graduate-level coursework. Some are research-intensive and intend to prepare students for further study in a PhD; these usually require their students to write a master’s thesis. Others focus on professional training for a specific career.
Critical thinking refers to the ability to evaluate information and to be aware of biases or assumptions, including your own.
Like information literacy, it involves evaluating arguments, identifying and solving problems in an objective and systematic way, and clearly communicating your ideas.
The best way to remember the difference between a research plan and a research proposal is that they have fundamentally different audiences. A research plan helps you, the researcher, organize your thoughts. On the other hand, a dissertation proposal or research proposal aims to convince others (e.g., a supervisor, a funding body, or a dissertation committee) that your research topic is relevant and worthy of being conducted.
McCombes, S. & George, T. (2023, November 21). How to Write a Research Proposal | Examples & Templates. Scribbr. Retrieved June 27, 2024, from https://www.scribbr.com/research-process/research-proposal/
Research Methodology – Types, Examples and Writing Guide
Definition:
Research Methodology refers to the systematic and scientific approach used to conduct research, investigate problems, and gather data and information for a specific purpose. It involves the techniques and procedures used to identify, collect, analyze, and interpret data to answer research questions or solve research problems. It also encompasses the philosophical and theoretical frameworks that guide the research process.
Research methodology formats can vary depending on the specific requirements of the research project, but the following is a basic example of a structure for a research methodology section:
I. Introduction
II. Research Design
III. Data Collection Methods
IV. Data Analysis Methods
V. Ethical Considerations
VI. Limitations
VII. Conclusion
Types of Research Methodology are as follows:
This is a research methodology that involves the collection and analysis of numerical data using statistical methods. This type of research is often used to study cause-and-effect relationships and to make predictions.
This is a research methodology that involves the collection and analysis of non-numerical data such as words, images, and observations. This type of research is often used to explore complex phenomena, to gain an in-depth understanding of a particular topic, and to generate hypotheses.
This is a research methodology that combines elements of both quantitative and qualitative research. This approach can be particularly useful for studies that aim to explore complex phenomena and to provide a more comprehensive understanding of a particular topic.
This is a research methodology that involves in-depth examination of a single case or a small number of cases. Case studies are often used in psychology, sociology, and anthropology to gain a detailed understanding of a particular individual or group.
This is a research methodology that involves a collaborative process between researchers and practitioners to identify and solve real-world problems. Action research is often used in education, healthcare, and social work.
This is a research methodology that involves the manipulation of one or more independent variables to observe their effects on a dependent variable. Experimental research is often used to study cause-and-effect relationships and to make predictions.
This is a research methodology that involves the collection of data from a sample of individuals using questionnaires or interviews. Survey research is often used to study attitudes, opinions, and behaviors.
This is a research methodology that involves the development of theories based on the data collected during the research process. Grounded theory is often used in sociology and anthropology to generate theories about social phenomena.
An Example of Research Methodology could be the following:
Research Methodology for Investigating the Effectiveness of Cognitive Behavioral Therapy in Reducing Symptoms of Depression in Adults
Introduction:
The aim of this research is to investigate the effectiveness of cognitive-behavioral therapy (CBT) in reducing symptoms of depression in adults. To achieve this objective, a randomized controlled trial (RCT) will be conducted using a mixed-methods approach.
Research Design:
The study will follow a pre-test and post-test design with two groups: an experimental group receiving CBT and a control group receiving no intervention. The study will also include a qualitative component, in which semi-structured interviews will be conducted with a subset of participants to explore their experiences of receiving CBT.
Participants:
Participants will be recruited from community mental health clinics in the local area. The sample will consist of 100 adults aged 18-65 years old who meet the diagnostic criteria for major depressive disorder. Participants will be randomly assigned to either the experimental group or the control group.
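The random 1:1 allocation described above can be scripted directly. The sketch below is illustrative only: the seeded shuffle and the hypothetical participant IDs are assumptions, not the trial’s actual allocation procedure.

```python
import random

def randomize(participant_ids, seed=42):
    """Shuffle participant IDs with a fixed seed and split them 1:1
    into experimental and control arms."""
    rng = random.Random(seed)  # seeded so the allocation is reproducible
    ids = list(participant_ids)
    rng.shuffle(ids)
    half = len(ids) // 2
    return {"experimental": ids[:half], "control": ids[half:]}

groups = randomize(range(1, 101))  # 100 hypothetical participant IDs
```

Real trials usually go further, using blocked or stratified randomization to keep arm sizes and prognostic factors balanced over time; a seeded shuffle is the simplest reproducible form.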
Intervention:
The experimental group will receive 12 weekly sessions of CBT, each lasting 60 minutes. The intervention will be delivered by licensed mental health professionals who have been trained in CBT. The control group will receive no intervention during the study period.
Data Collection:
Quantitative data will be collected through the use of standardized measures such as the Beck Depression Inventory-II (BDI-II) and the Generalized Anxiety Disorder-7 (GAD-7). Data will be collected at baseline, immediately after the intervention, and at a 3-month follow-up. Qualitative data will be collected through semi-structured interviews with a subset of participants from the experimental group. The interviews will be conducted at the end of the intervention period, and will explore participants’ experiences of receiving CBT.
Data Analysis:
Quantitative data will be analyzed using descriptive statistics, t-tests, and mixed-model analyses of variance (ANOVA) to assess the effectiveness of the intervention. Qualitative data will be analyzed using thematic analysis to identify common themes and patterns in participants’ experiences of receiving CBT.
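The group comparison at the heart of the quantitative analysis can be illustrated with a self-contained example. The sketch below computes Welch’s t statistic (a t-test variant that does not assume equal group variances) on invented post-treatment BDI-II scores, not study data:

```python
import math
from statistics import mean, variance

def welch_t(a, b):
    """Welch's two-sample t statistic and its degrees of freedom
    (makes no equal-variance assumption between groups)."""
    na, nb = len(a), len(b)
    sa, sb = variance(a) / na, variance(b) / nb  # squared standard errors
    t = (mean(a) - mean(b)) / math.sqrt(sa + sb)
    df = (sa + sb) ** 2 / (sa ** 2 / (na - 1) + sb ** 2 / (nb - 1))
    return t, df

# Invented post-treatment BDI-II scores (lower = fewer depressive symptoms)
cbt_scores = [12, 9, 15, 8, 11, 10, 13, 7]
control_scores = [19, 22, 17, 20, 24, 18, 21, 16]
t_stat, dof = welch_t(cbt_scores, control_scores)
```

In a real analysis the statistic would be referred to the t distribution to obtain a p-value, alongside the mixed-model ANOVA described above.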
Ethical Considerations:
This study will comply with ethical guidelines for research involving human subjects. Participants will provide informed consent before participating in the study, and their privacy and confidentiality will be protected throughout the study. Any adverse events or reactions will be reported and managed appropriately.
Data Management:
All data collected will be kept confidential and stored securely using password-protected databases. Identifying information will be removed from qualitative data transcripts to ensure participants’ anonymity.
Limitations:
One potential limitation of this study is that it only focuses on one type of psychotherapy, CBT, and may not generalize to other types of therapy or interventions. Another limitation is that the study will only include participants from community mental health clinics, which may not be representative of the general population.
Conclusion:
This research aims to investigate the effectiveness of CBT in reducing symptoms of depression in adults. By using a randomized controlled trial and a mixed-methods approach, the study will provide valuable insights into the mechanisms underlying the relationship between CBT and depression. The results of this study will have important implications for the development of effective treatments for depression in clinical settings.
Writing a research methodology involves explaining the methods and techniques you used to conduct research, collect data, and analyze results. It’s an essential section of any research paper or thesis, as it helps readers understand the validity and reliability of your findings. Here are the steps to write a research methodology:
Research methodology is typically written after the research proposal has been approved and before the actual research is conducted. It should be written prior to data collection and analysis, as it provides a clear roadmap for the research project.
The research methodology is an important section of any research paper or thesis, as it describes the methods and procedures that will be used to conduct the research. It should include details about the research design, data collection methods, data analysis techniques, and any ethical considerations.
The methodology should be written in a clear and concise manner, and it should be based on established research practices and standards. It is important to provide enough detail so that the reader can understand how the research was conducted and evaluate the validity of the results.
Here are some of the applications of research methodology:
Research methodology serves several important purposes, including:
Research methodology has several advantages that make it a valuable tool for conducting research in various fields. Here are some of the key advantages of research methodology:
Research Methodology | Research Methods |
---|---|
Research methodology refers to the philosophical and theoretical frameworks that guide the research process. | Research methods refer to the techniques and procedures used to collect and analyze data. |
It is concerned with the underlying principles and assumptions of research. | It is concerned with the practical aspects of research. |
It provides a rationale for why certain research methods are used. | It determines the specific steps that will be taken to conduct research. |
It is broader in scope and involves understanding the overall approach to research. | It is narrower in scope and focuses on specific techniques and tools used in research. |
It is concerned with identifying research questions, defining the research problem, and formulating hypotheses. | It is concerned with collecting data, analyzing data, and interpreting results. |
It is concerned with the validity and reliability of research. | It is concerned with the accuracy and precision of data. |
It is concerned with the ethical considerations of research. | It is concerned with the practical considerations of research. |
Md Mamunur Rashid, Kumar Selvarajoo, Advancing drug-response prediction using multi-modal and -omics machine learning integration (MOMLIN): a case study on breast cancer clinical data, Briefings in Bioinformatics , Volume 25, Issue 4, July 2024, bbae300, https://doi.org/10.1093/bib/bbae300
The inherent heterogeneity of cancer contributes to highly variable responses to any anticancer treatments. This underscores the need to first identify precise biomarkers through complex multi-omics datasets that are now available. Although much research has focused on this aspect, identifying biomarkers associated with distinct drug responders still remains a major challenge. Here, we develop MOMLIN, a multi-modal and -omics machine learning integration framework, to enhance drug-response prediction. MOMLIN jointly utilizes sparse correlation algorithms and class-specific feature selection algorithms, which identify multi-modal and -omics-associated interpretable components. MOMLIN was applied to 147 patients’ breast cancer datasets (clinical, mutation, gene expression, tumor microenvironment cells and molecular pathways) to analyze drug-response class predictions for non-responders and variable responders. Notably, MOMLIN achieves an average AUC of 0.989, which is at least 10% greater when compared with current state-of-the-art methods (data integration analysis for biomarker discovery using latent components, multi-omics factor analysis, sparse canonical correlation analysis). Moreover, MOMLIN not only detects known individual biomarkers such as genes at the mutation/expression level but, most importantly, also correlates multi-modal and -omics network biomarkers for each response class. For example, an interaction between ER-negative-HMCN1-COL5A1 mutations-FBXO2-CSF3R expression-CD8 emerges as a multimodal biomarker for responders, potentially affecting antimicrobial peptides and FLT3 signaling pathways. In contrast, for resistance cases, a distinct combination of lymph node-TP53 mutation-PON3-ENSG00000261116 lncRNA expression-HLA-E-T-cell exclusions emerged as multimodal biomarkers, possibly impacting the neurotransmitter release cycle pathway.
MOMLIN is therefore expected to advance precision medicine, for example by detecting context-specific multi-omics network biomarkers and better predicting drug-response classifications.
The advent of high-throughput sequencing technologies has revolutionized our ability to collect various ‘omics’ data types, such as deoxyribonucleic acid (DNA) methylations, ribonucleic acid (RNA) expressions, proteomics, metabolomics and bioimaging datasets, from the same samples or patients with unprecedented details [ 1 ]. By far, most studies have performed single omics analytics, which capture only a fraction of biological complexity. The integration of these multiple omics datasets offers a more comprehensive understanding of the underlying complex biological processes than single-omic analyses, particularly in human diseases like cancer and cardiovascular disease, where it significantly enhances prediction of clinical outcomes [ 2 , 3 ].
Cancer is a highly complex and deadly disease if left unchecked, and its heterogeneity poses significant challenges for treatment [ 4 ]. Standard treatments, including chemotherapy with or without targeted therapies, aim to reduce tumor burden and improve patient outcomes such as survival rate and quality of life [ 5–7 ]. However, even for the most advanced therapies, such as immunotherapies, treatment effectiveness varies widely across cancer types and even between patients with the same diagnosis [ 8 ]. This heterogeneity is believed to be due to tumor microenvironment heterogeneity and its effects on the resultant complex and myriad molecular interactions within cells and tissues [ 9 , 10 ]. This variability underscores the urgent need to identify precise biomarkers to predict individual patient responses and potential adverse reactions to a particular therapy [ 11 ]. This can be made possible through multi-omics data integration analyses at the individual patient scale [ 12 ].
To assess treatment response, such as pathologic complete response (pCR) and residual cancer burden (RCB), current clinical practice relies on clinical parameters (e.g. tumor size/volume and hormone receptor status), along with genetic biomarkers (e.g. TP53 mutations) [ 13–15 ]. However, these approaches do not fully capture the complex intracellular regulatory dynamics [ 16 , 17 ] or the tumor-immune microenvironment (TiME) interactions that influence outcomes [ 18 , 19 ]. Thus, to enhance personalized cancer treatments, we need novel methodologies that can handle large, complex molecular (omics) and clinical datasets. Machine learning (ML) methods integrating multi-omics data offer a promising avenue to improve prediction accuracy and uncover robust biomarkers across drug-response classes [ 20 ], which may be overlooked by single-omics analytics. This approach can predict patients benefiting from standard treatments and those requiring alternative plans like combination therapies or clinical trials.
The current drug-response prediction methods can be broadly categorized into ML-based and network-based approaches. ML methods often analyze each data type (e.g. mutations and gene expression) independently using univariable selection [ 21 , 22 ] or dimension reduction methods [ 23 ]. These results are then integrated using various classifiers or regressors [e.g. support vector machine, elastic-net regressor, logistic regression (LR) and random forest (RF)] [ 24–26 ] or an ensemble classifier to make predictions [ 9 ]. However, these methods often overlook the crucial interactions among different data modalities. Deep learning methods, while gaining popularity, are limited by the need for large clinical sample sizes to achieve sufficient accuracy [ 27 ]. Recent ML advancements have focused on integrating multimodal omics features with patient phenotypes to improve predictive performance [ 28 , 29 ]. To discover multimodal biomarkers, techniques such as multi-omics factor analysis (MOFA) and sparse canonical correlation analysis (SCCA), including its variant multiset SCCA (SMCCA), offer realistic strategies for integrating diverse data modalities [ 30–32 ]. However, although these methods are suitable for classification tasks, they are unsupervised and do not directly incorporate phenotypic information (e.g. disease status) when integrating diverse data types. As a result, they are limited in their ability to identify phenotype-specific biomarkers.
Recently, advanced supervised approaches like data integration analysis for biomarker discovery using latent components (DIABLO) by Singh et al. (2019) have emerged to overcome these limitations [ 28 ]. DIABLO, an extension of generalized SCCA (GSCCA), considers cross-modality relationships and extracts a set of common factors associated with different response categories. Network-based methods, like unsupervised network fusion or random walk with restart approaches, construct drug–target interaction and sample similarity networks that are effective for patient stratification [ 20 , 33 ]. However, these methods lack a specific feature selection design, limiting their utility for identifying biomarkers for patient classification. Nevertheless, none of these ML methods are rigorous in terms of task/class-specific biomarker discovery and interpretability, and both SMCCA and GSCCA struggle with the gradient dominance problem due to naive data fusion strategies [ 34 ]. Therefore, it is essential to develop novel interpretable methods for identifying robust multimodal network biomarkers across diverse data types to advance our understanding of the complex factors that influence drug responses.
In this study, we introduce MOMLIN, a multi-modal and -omics ML integration framework to enhance the prediction of anticancer drug responses. MOMLIN integrates weighted multi-class SCCA (WMSCCA), which identifies interpretable components and enables effective feature selection across multi-modal and -omics datasets. Our method contributes in three key ways: (i) it innovates a class-specific feature selection strategy within SCCA methods for associating multimodal biomarkers, (ii) it includes an adaptive weighting scheme in multiple pairwise SCCA models to balance the influence of different data modalities, preventing dominance during the training process, and (iii) it ensures robust feature selection by employing a combined constraint mechanism that integrates lasso and GraphNet constraints to select both individual features and subsets of co-expressed features, thereby preventing overfitting to high-dimensional data.
We applied MOMLIN to a multimodal breast cancer (BC) dataset of 147 patients comprising clinical features, DNA mutation, RNA expression, tumor microenvironment and molecular pathway data [ 9 ], to predict drug-response classes, specifically distinguishing responders and non-responders. Our results demonstrate MOMLIN’s superiority in terms of outperforming state-of-the-art methods and interpretability of the underlying biological mechanisms driving these distinct response classes.
The workflow of our proposed method MOMLIN for identifying class- or task-specific biomarkers from multimodal data is shown in Fig. 1 . The core of this pipeline involves three stages: (i) identification of response-specific sparse components, in terms of input features and patients, (ii) development of drug-response predictor using latent components of patients and (iii) interpretation of sparse components and multi-modal and -omics biomarker discovery.
Schematic representation of the proposed framework. In stage 1, multimodal datasets from cancer patients (e.g. BC) were sourced from a published study [ 9 ]. This dataset comprises clinical features, DNA mutations, and gene expression from pre-treatment tumors, alongside post-treatment response classes (pCR, RCB-I to III). TiME and pathway activity were derived from transcriptomic data using statistical algorithms. For identifying class-specific correlated biomarkers, class binarization and oversampling were used to balance between classes. WMSCCA models the multimodal associations across different biomarkers and identifies response-specific sparse components on diverse input features and patients. In stage 2, a binary LR classifier then utilizes these patient latent components for predicting response to therapies, evaluated by AUROC. Next in stage 3, class–specific sparse components are shown in a heatmap, highlighting key signatures (non-zero loading) in colors. Finally, the identified multi-modal and -omics signatures then formed a correlation network, revealing pathways associations with multi-modal and -omics biomarkers for each response class. Nodes with colors in the network indicate multimodal features.
The rationale underpinning this approach is that effective biomarkers are: (i) response-related multimodal features, including genes, cell types and pathways, and (ii) features that demonstrate predictive capability on unseen patients. The first stage, a ‘feature selection step’, selects multimodal features on the generated sparse components based on their relevance to drug-response categories (pCR and RCB-I to III). Features identified with high loadings are considered potential biomarker candidates. The second stage, a ‘classification step’, validates these biomarkers by assessing their predictive power in distinguishing responders from non-responders to anticancer therapy; any predictions indicating chemo-resistant tumors should be considered for enrolment in clinical trials for novel therapies. The third stage, an ‘interpretation step’, analyzes the candidate biomarkers in a multi-modal and -omics network associated with relevant biological pathways. This step aims to elucidate the underlying biological processes differentiating between drug-response phenotypes.
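The AUROC used to evaluate the classification stage has a concrete probabilistic reading: it is the chance that a randomly chosen responder is scored above a randomly chosen non-responder, with ties counting half. A minimal sketch of that rank-based computation, using invented scores and labels rather than MOMLIN output:

```python
def auroc(scores, labels):
    """AUROC via the Mann-Whitney identity: the probability that a
    random positive outscores a random negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical predicted probabilities of response vs. true labels
auc = auroc([0.9, 0.7, 0.8, 0.35, 0.6], [1, 0, 1, 0, 1])
```

With proper tie handling, this rank-based identity agrees with the usual trapezoidal area under the ROC curve for binary labels.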
Multi-modal and -omics data overview and preparation.
This study utilized clinical attributes, DNA mutation and gene expression (transcriptome) data from 147 matched samples of early and locally advanced BC patients (pCR, n = 38; RCB-I, n = 23; RCB-II, n = 61; RCB-III, n = 25), obtained from the TransNEO cohort at Cambridge University Hospitals NHS Foundation [ 9 ]. The dataset includes clinical attributes (8 features; summary attributes are available in Supplementary Table S1 available online at http://bib.oxfordjournals.org/ ), genomic features (31 DNA mutation genes, applying a strict criterion of genes mutated in at least 10 patients) and RNA-sequencing (RNA-Seq) features (18 393 genes), covering the major BC subtypes: normal-like, basal-like, Her2, luminal A and luminal B. Although DNA mutation genes typically represent binary data, we used mutation frequencies to construct a mutation count matrix. Initial data pre-processing involved a log2 transformation on the RNA-Seq features after filtering out less informative features below the 25th percentile (in terms of mean and standard deviation) using the interquartile range. For integrative modeling, we used the top 40% of variable genes (3748 genes, based on median absolute deviation ranking) from the RNA-Seq datasets. Finally, each feature was normalized by dividing by its Frobenius norm, adjusting the offset between high and low intensities across different data modalities.
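The preprocessing steps above (log2 transformation, ranking genes by median absolute deviation, and per-feature norm scaling) can be sketched in a few self-contained functions. This is an illustrative plain-Python reimplementation, not the authors’ code, and the upper-median shortcut inside the MAD is a simplifying assumption:

```python
import math

def log2_transform(counts):
    """log2(x + 1) transform of an RNA-Seq count matrix (rows = samples)."""
    return [[math.log2(v + 1) for v in row] for row in counts]

def _median(values):
    s = sorted(values)
    return s[len(s) // 2]  # upper median; adequate for ranking purposes

def top_variable_genes(matrix, genes, fraction=0.40):
    """Keep the top `fraction` of genes ranked by median absolute deviation."""
    cols = list(zip(*matrix))
    mad = [_median([abs(v - _median(col)) for v in col]) for col in cols]
    ranked = sorted(range(len(genes)), key=lambda j: mad[j], reverse=True)
    keep = sorted(ranked[: max(1, int(len(genes) * fraction))])
    return [[row[j] for j in keep] for row in matrix], [genes[j] for j in keep]

def normalize_features(matrix):
    """Divide each feature (column) by its Euclidean norm so features
    from different modalities sit on a comparable scale."""
    cols = list(zip(*matrix))
    norms = [math.sqrt(sum(v * v for v in col)) or 1.0 for col in cols]
    return [[v / norms[j] for j, v in enumerate(row)] for row in matrix]
```

In practice these operations would be vectorized (e.g. with NumPy), but the plain-Python form makes each step explicit.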
To characterize TiME and pathway markers, we applied various statistical algorithms on the RNA-Seq data. The GSVA algorithm [ 35 ] calculated (i) the GGI gene sets [ 36 ] and (ii) STAT1 immune signature scores [ 37 ]. For immune cell enrichment, three methods were used: (i) MCPcounter [ 37 ] with voom-normalized RNA-Seq counts; (ii) enrichment over 14 cell types using 60 gene markers, employing log2-transformed geometric mean of transcript per million (TPM) expression [ 38 ]; and (iii) z -score scaling of cancer immunity parameters [ 39 ] to classify four immune processes (major histocompatibility complex molecules, immunomodulators, effector cells and suppressor cells). Additionally, the TIDE algorithm [ 40 ] computed T-cell dysfunction and exclusion metrics for each tumor sample using log2-transformed TPM matrix of counts, which can serve as a surrogate biomarker to predict the response to immune checkpoint blockade. Pathway activity scores for each tumor sample were computed using the GSVA algorithm with input gene sets from Reactome [ 41 ], PIP [ 42 ] and BioCarta databases within the MSigDB C2 pathway database [ 43 ].
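Of the derived features above, the z-score scaling applied to the cancer immunity parameters is the simplest to show concretely. A minimal sketch, where the choice of the population standard deviation is an assumption:

```python
from statistics import mean, pstdev

def zscore(values):
    """Standardize a list of scores to mean 0 and standard deviation 1."""
    m, s = mean(values), pstdev(values)  # population SD; sample SD is also common
    return [(v - m) / s for v in values]

scaled = zscore([2.0, 4.0, 6.0, 8.0])
```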
In this study, lowercase letters denote vectors and uppercase letters denote matrices. The term $\|\cdot\|_{1,1}$ denotes the matrix $l_1$-norm, and $\|\cdot\|_{gn}$ denotes the GraphNet regularization. Sparse multiset canonical correlation analysis (SMCCA) is an extension of dual-view SCCA, proposed to model associations among multiple types of datasets [ 31 ]. Given multiple types of datasets, let $X\in\mathcal{R}^{n\times p}$ represent gene expression data with $p$ features, and $Y_k\in\mathcal{R}^{n\times q_k}$ represent the $k$-th data modality (e.g. clinical, DNA mutation and tumor microenvironment) with $q_k$ features. Both $X$ and $Y_k$ have $n$ samples, and $k=(1,\dots,K)$, where $K$ denotes the number of different data modalities. The objective function of SMCCA is defined as follows:

$$\min_{u,\,v_k}\; -\sum_{k=1}^{K} u^{\top}X^{\top}Y_k v_k \;+\; \lambda_u\|u\|_1 \;+\; \sum_{k=1}^{K}\lambda_{v_k}\|v_k\|_1, \quad \text{s.t. } \|u\|_2^2=1,\; \|v_k\|_2^2=1,$$
where $u$ and $v_k$ are the canonical weight vectors corresponding to $X$ and $Y_k$, indicating the importance of each respective biomarker. The term $\|\cdot\|_1$ represents the $l_1$ regularization, used to detect a small subset of discriminative biomarkers and prevent model overfitting. $\lambda_u$ and $\lambda_{v_k}$ are non-negative tuning parameters balancing the loss function against the regularization terms. The term $\|\cdot\|_2^2$ denotes the squared Euclidean norm, used to constrain the weight vectors $u$ and $v_k$ to unit length.
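In practice, an $l_1$ penalty of this kind is typically handled with the soft-thresholding (proximal) operator inside alternating updates of the weight vectors. The sketch below shows that operator and the unit-norm rescaling in isolation; it is a generic illustration, not the paper’s actual solver:

```python
import math

def soft_threshold(w, lam):
    """Proximal operator of the l1 penalty: shrink every weight toward
    zero by lam, zeroing those with magnitude below lam (sparsity)."""
    return [math.copysign(max(abs(x) - lam, 0.0), x) for x in w]

def unit_scale(w):
    """Rescale a weight vector to unit l2 norm (the ||.||_2^2 = 1 constraint)."""
    n = math.sqrt(sum(x * x for x in w))
    return [x / n for x in w] if n > 0 else w

weights = unit_scale(soft_threshold([0.9, -0.05, 0.4, -0.6], lam=0.1))
```

Larger values of the tuning parameter `lam` zero out more weights, which is how the $l_1$ term selects a small subset of discriminative biomarkers.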
However, SMCCA has limitations: (i) it is naturally unsupervised, meaning it cannot leverage phenotypic information (e.g. disease status and drug-response classes); (ii) pairwise associations among multiple data types can vary significantly, which can lead to gradient dominance issues during optimization; and (iii) SMCCA mines a common subset of biomarkers for classifying different tasks, which diminishes its relevance, as each task might require a distinct feature set.
To address the above limitations, we propose weighted multi-class SCCA (WMSCCA), a formal model for class-specific feature selection that differs from conventional SMCCA. Throughout this study, we use the terms tasks, classes and drug-response classes interchangeably. WMSCCA includes phenotypic information as an additional data type, employs a weighting scheme to resolve the gradient dominance issue and incorporates a one-versus-all strategy for class-specific feature selection into its core objective function. The underlying motivation is that WMSCCA can jointly identify drug-response class-specific multimodal biomarkers to improve drug-response prediction. For ease of presentation, we consider |$n$| patients with data matrices |${X}_c\in{\mathcal{R}}^{n\times p}$|, |${Y}_{ck}\in{\mathcal{R}}^{n\times{q}_k}$| and |$Z\in{\mathcal{R}}^{n\times C}$| from |$C$| different drug-response classes. Here, |${X}_c$| denotes |$p$| features from the gene expression dataset, |${Y}_{ck}$| denotes |${q}_k$| features from the |$k$|-th data modality (e.g. mutation, clinical features, TiME and pathway activity), |${z}_c$| denotes the indicator of the |$c$|-th response class, and |$k=1,\dots, K$|, where |$K$| denotes the number of data modalities. The WMSCCA optimization problem can be formulated as follows:
|$\underset{U,\left\{{V}_k\right\}}{\min}\sum \limits_{c=1}^C\left[-\sum \limits_{k=1}^K{\sigma}_{xy}{u}_c^{\top }{X}_c^{\top }{Y}_{ck}{v}_{ck}-{\sigma}_{xz}{u}_c^{\top }{X}_c^{\top }{z}_c-\sum \limits_{k=1}^K{\sigma}_{yz}{v}_{ck}^{\top }{Y}_{ck}^{\top }{z}_c\right]+\psi (U)+\sum \limits_{k=1}^K\psi \left({V}_k\right),\kern1em \mathrm{s}.\mathrm{t}.\kern0.5em {\left\Vert {u}_c\right\Vert}_2^2=1,\kern0.5em {\left\Vert {v}_{ck}\right\Vert}_2^2=1,\kern1em (2)$|
where |$U\in{\mathcal{R}}^{p\times C}$| and |${V}_k\in{\mathcal{R}}^{q_k\times C}$| are canonical loading matrices corresponding to |$X$| and |${Y}_k$|, representing the importance of candidate biomarkers for each of the |$C$| classes. In this equation, the first term models associations among the |$X$| and |${Y}_k$| datasets; the second and third terms correlate the class labels |${z}_c$| with the |$X$| and |${Y}_k$| data modalities for each class |$c$|, aiming to identify class-specific features and their relationships; |$\psi (U)$| and |$\psi \left({V}_k\right)$| represent sparsity constraints on |$U$| and |${V}_k$| that select a subset of discriminative features. To address gradient dominance, the adjusting weight parameters |${\sigma}_{xy}$|, |${\sigma}_{xz}$| and |${\sigma}_{yz}$| can be defined as:
|${\sigma}_{xy}=\frac{1}{{\left\Vert {X}_c{u}_c-{Y}_{ck}{v}_{ck}\right\Vert}_2},\kern1em {\sigma}_{xz}=\frac{1}{{\left\Vert {X}_c{u}_c-{z}_c\right\Vert}_2},\kern1em {\sigma}_{yz}=\frac{1}{{\left\Vert {Y}_{ck}{v}_{ck}-{z}_c\right\Vert}_2},\kern1em (3)$|
where |$k=1,\dots, K$| and |$K$| denotes the number of data modalities. Each |${\sigma}_{\left(\cdot \right)}$| assigns a larger weight when the non-squared loss (the denominator term) between datasets is small, and vice versa.
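A common way to realize such reciprocal weighting, used in adaptive SCCA variants, can be sketched as follows; the exact functional form below is an assumption for illustration, not taken verbatim from the paper:

```python
import numpy as np

def adaptive_weight(a, b, eps=1e-8):
    """Reciprocal non-squared loss: modality pairs whose projections
    already agree get a larger weight, and vice versa. The exact form
    is an assumption modeled on adaptive SCCA variants."""
    return 1.0 / (np.linalg.norm(a - b) + eps)   # eps guards against division by zero

rng = np.random.default_rng(2)
xu = rng.standard_normal(50)                     # projection X @ u
yv_close = xu + 0.01 * rng.standard_normal(50)   # nearly agreeing projection
yv_far = rng.standard_normal(50)                 # unrelated projection
print(adaptive_weight(xu, yv_close) > adaptive_weight(xu, yv_far))  # True: the closer pair gets more weight
```

Scaling each association term this way keeps any one modality pair from dominating the gradient during the alternating updates.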
Given high-dimensional datasets, the model in Equation ( 2 ) is prone to overfitting, so a sparsity constraint is appropriate to address this issue. We hypothesized that gene expression biomarkers can be either single genes or co-expressed gene sets; a combined penalty is therefore designed for the |$X$| dataset, and |$\psi (U)$| takes the following form:
|$\psi (U)={\alpha}_u\left[\beta {\left\Vert U\right\Vert}_{1,1}+\left(1-\beta \right){\left\Vert U\right\Vert}_{gn}\right],\kern1em (4)$|
where |${\mathrm{\alpha}}_u$| and |$\beta$| are non-negative tuning parameters; |$\beta$| balances the effects of co-expressed and individual feature selection. The first sparsity constraint is the matrix |${l}_{1,1}$|-norm, defined as follows:
|${\left\Vert U\right\Vert}_{1,1}=\sum \limits_{c=1}^C{\left\Vert {u}_c\right\Vert}_1=\sum \limits_{c=1}^C\sum \limits_{i=1}^p\left|{u}_{ic}\right|.\kern1em (5)$|
This penalty promotes class-specific feature selection on |$U$|. The second sparsity constraint is the GraphNet regularization, defined as follows:
|${\left\Vert U\right\Vert}_{gn}=\sum \limits_{c=1}^C{u}_c^{\top }{L}_c{u}_c,\kern1em (6)$|
where |${L}_c$| represents the Laplacian matrix of the connectivity structure in |$\boldsymbol{X}$|. The Laplacian matrix is defined as |$L=D-A$|, where |$D$| is the degree matrix of the connectivity matrix |$A$| (e.g. a gene co-expression or correlation network). This penalty promotes the selection of subsets of connected features that discriminate each response class on |$U$|.
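A minimal numeric sketch of the GraphNet penalty (toy adjacency and weight vectors are illustrative): because u^T L u equals the sum over edges of A_ij (u_i - u_j)^2, the penalty is small when connected genes carry similar weights.

```python
import numpy as np

def graphnet_penalty(u, A):
    """GraphNet penalty u^T L u with L = D - A, where A is a symmetric
    non-negative adjacency (e.g. gene co-expression network).
    u^T L u = sum over edges of A_ij * (u_i - u_j)^2, so connected
    genes with similar weights incur a small penalty."""
    L = np.diag(A.sum(axis=1)) - A   # graph Laplacian: degree matrix minus adjacency
    return float(u @ L @ u)

# toy 4-gene network: genes 0-1 and 2-3 form co-expressed pairs
A = np.array([[0, 1, 0, 0],
              [1, 0, 0, 0],
              [0, 0, 0, 1],
              [0, 0, 1, 0]], dtype=float)
u_smooth = np.array([1.0, 1.0, -1.0, -1.0])  # connected genes share weights
u_rough  = np.array([1.0, -1.0, 1.0, -1.0])  # connected genes disagree
print(graphnet_penalty(u_smooth, A), graphnet_penalty(u_rough, A))  # 0.0 8.0
```

The optimizer is thus steered toward weight patterns that respect the co-expression structure rather than scattering weight over disconnected genes.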
Moreover, not every mutation marker or clinical/TiME/pathway feature is involved in predicting the response classes; therefore, the |${l}_{1,1}$|-norm is used on the |${Y}_k$| datasets to select individual markers, i.e. |$\psi \left({V}_k\right)$| for the |${\boldsymbol{Y}}_k$| data modalities takes the following form:
|$\psi \left({V}_k\right)={\alpha}_{vk}{\left\Vert {V}_k\right\Vert}_{1,1},\kern1em (7)$|
where |${\mathrm{\alpha}}_{vk}$| is a non-negative tuning parameter.
Finally, we obtained |$C$| pairs of canonical weight matrices |$\left({U}_c,{V}_{ck}\right)\left(c=1,\dots, C;k=1,\dots, K\right)$| by solving Equation ( 2 ) with an iterative alternating algorithm [ 44 , 45 ]. Features with non-zero weights in each class's weight vectors were extracted as correlated sets.
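The iterative alternating scheme can be sketched for the two-view case: each weight vector is updated with the other held fixed, shrunk by the soft-thresholding operator induced by the l1 penalty and rescaled to unit length. The toy data and hyperparameters below are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def soft(x, lam):
    """Soft-thresholding operator induced by the l1 penalty."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

def sparse_cca_pair(X, Y, lam, n_iter=50):
    """Two-view sketch of the iterative alternating scheme: update each
    weight vector with the other fixed, shrink it by soft-thresholding
    and rescale it to unit length."""
    v = np.ones(Y.shape[1]) / np.sqrt(Y.shape[1])
    for _ in range(n_iter):
        u = soft(X.T @ Y @ v, lam)
        u /= np.linalg.norm(u) + 1e-12
        v = soft(Y.T @ X @ u, lam)
        v /= np.linalg.norm(v) + 1e-12
    return u, v

rng = np.random.default_rng(3)
n = 100
z = rng.standard_normal(n)                    # shared latent signal
X = np.c_[z, rng.standard_normal((n, 4))]     # only column 0 carries the signal
Y = np.c_[z + 0.1 * rng.standard_normal(n), rng.standard_normal((n, 3))]
u, v = sparse_cca_pair(X, Y, lam=5.0)
print(int(np.argmax(np.abs(u))), int(np.argmax(np.abs(v))))  # both recover column 0
```

The non-zero entries of the converged weight vectors are exactly the "correlated sets" extracted per class in the full model.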
The WMSCCA method involves the parameters |${\mathrm{\alpha}}_u$|, |$\beta$| and |${\mathrm{\alpha}}_{vk}$| |$\left(k=1,\dots, K\right)$|. Given the limited number of samples, we applied a nested cross-validation (CV) strategy on the training sets and evaluated the maximum correlation on the test datasets. Optimal values for the regularization parameters were determined within each training set via internal five-fold CV.
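The internal five-fold selection loop can be sketched as follows; the stand-in fit routine, the parameter grid and the toy data are assumptions for illustration only, not the full WMSCCA training step.

```python
import numpy as np
from sklearn.model_selection import KFold

def fit_pair(X, Y, lam, n_iter=30):
    """Stand-in sparse fit (soft-thresholded alternating updates);
    a placeholder for the full WMSCCA training step."""
    soft = lambda x: np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)
    v = np.ones(Y.shape[1]) / np.sqrt(Y.shape[1])
    for _ in range(n_iter):
        u = soft(X.T @ Y @ v); u /= np.linalg.norm(u) + 1e-12
        v = soft(Y.T @ X @ u); v /= np.linalg.norm(v) + 1e-12
    return u, v

def select_lambda(X, Y, grid, n_splits=5, seed=0):
    """Internal five-fold CV on the training set: keep the
    regularization value with the highest mean held-out correlation."""
    kf = KFold(n_splits=n_splits, shuffle=True, random_state=seed)
    mean_corr = {}
    for lam in grid:
        corrs = []
        for tr, te in kf.split(X):
            u, v = fit_pair(X[tr], Y[tr], lam)
            if np.linalg.norm(u) < 1e-9 or np.linalg.norm(v) < 1e-9:
                corrs.append(0.0)            # over-shrunk: no usable weights
                continue
            corrs.append(abs(np.corrcoef(X[te] @ u, Y[te] @ v)[0, 1]))
        mean_corr[lam] = float(np.mean(corrs))
    return max(mean_corr, key=mean_corr.get)

rng = np.random.default_rng(4)
z = rng.standard_normal(120)
X = np.c_[z, rng.standard_normal((120, 5))]
Y = np.c_[z + 0.2 * rng.standard_normal(120), rng.standard_normal((120, 4))]
best = select_lambda(X, Y, grid=[0.5, 5.0, 50.0])
print("selected lambda:", best)
```

In the nested scheme this selection runs entirely inside each outer training fold, so the outer test fold never influences the chosen parameters.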
To predict drug-response categories, we trained a logistic regression (LR) classifier using the latent components of patients (or raw multimodal features) generated by MOMLIN in Fig. 1 : stages 1 and 2. We used a binary classification scheme, distinguishing pCR versus non-pCR, RCB-I versus non-RCB-I, RCB-II versus non-RCB-II and RCB-III versus non-RCB-III, to evaluate model performance. In addition, we performed analyses with existing multi-omics methods, including SMCCA+LR, MOFA+LR, DIABLO and latent principal component analysis (PCA) features with LR classifiers. To assess prediction performance for the response to treatment in an unbiased manner, we used five-fold cross-validated performance and repeated the process over 100 runs. The data partitioning was kept consistent across all models for fair comparison. The accuracy of response prediction was evaluated using the area under the receiver operating characteristic curve (AUROC).
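The one-versus-rest evaluation scheme (e.g. pCR versus non-pCR) with repeated five-fold CV and AUROC can be sketched with scikit-learn; the synthetic stand-in data and the reduced repeat count are illustrative assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold

# toy stand-in for the patient latent components: 4 response classes
X, y = make_classification(n_samples=160, n_features=10, n_informative=6,
                           n_classes=4, n_clusters_per_class=1, random_state=0)

def binary_auroc(X, y, positive_class, n_repeats=5, seed=0):
    """One-vs-rest AUROC (e.g. pCR vs non-pCR) averaged over repeated
    five-fold CV; the paper uses 100 repeats, reduced here for brevity."""
    y_bin = (y == positive_class).astype(int)
    aucs = []
    for rep in range(n_repeats):
        skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=seed + rep)
        for tr, te in skf.split(X, y_bin):
            clf = LogisticRegression(max_iter=1000).fit(X[tr], y_bin[tr])
            aucs.append(roc_auc_score(y_bin[te], clf.predict_proba(X[te])[:, 1]))
    return float(np.mean(aucs))

auc = binary_auroc(X, y, positive_class=0)
print(round(auc, 3))
```

Fixing the fold seeds per repeat is what keeps the partitioning consistent across the compared models.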
After learning sparse latent components of features across the different data modalities using MOMLIN, we identify the most relevant features based on the loading weights of genes, TiME cells and pathways, which reveal underlying interactions that discriminate the response classes. The larger the loading weight, the more important the feature is for discriminating response categories. We then use these selected features to construct a correlation network, or relationship matrix, based on their canonical weights [ 46 ]. In this network, nodes represent selected features, and the edge weights between two interconnected features indicate their correlation or relatedness. The network is visualized using the ggraph package in R ( https://cran.r-project.org ). Finally, we prioritize multi-omics biomarkers based on their degree centrality within the interconnected correlation network.
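Degree-based prioritization over such a network can be sketched as follows; the feature names and edge weights are illustrative assumptions, not results from the paper.

```python
import numpy as np

def prioritize_by_degree(features, W, top_k=3):
    """Rank selected features by weighted degree centrality in the
    cross-feature correlation network; W is the |features| x |features|
    symmetric edge-weight (correlation) matrix."""
    degree = np.abs(W).sum(axis=1) - np.abs(np.diag(W))  # exclude self-loops
    order = np.argsort(degree)[::-1]                     # hubs first
    return [features[i] for i in order[:top_k]]

features = ["FBXO2", "CSF3R", "CD8_T", "HMCN1_mut"]      # hypothetical selected features
W = np.array([[1.0, 0.8, 0.6, 0.1],
              [0.8, 1.0, 0.5, 0.3],
              [0.6, 0.5, 1.0, 0.0],
              [0.1, 0.3, 0.0, 1.0]])
print(prioritize_by_degree(features, W))
```

Features with many strong edges float to the top, matching the hub-centric reading of the visualized networks.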
We applied MOMLIN to a breast cancer (BC) dataset to predict treatment response and gain molecular insights. The dataset comprised 147 BC patients with early and locally advanced pretherapy tumors [ 9 ], categorized as follows: pCR (38 patients), RCB-I (good response; 23 patients), RCB-II (moderate response; 61 patients) and RCB-III (resistance; 25 patients). After preprocessing and filtering out the least informative features, the final dataset comprised 3748 RNA genes (top 40% of 9371 genes), 31 mutated genes, 8 clinical attributes, 64 TiME features and 178 pathway activities ( Fig. 1 : stage 1). Supplementary Table S1, available online at http://bib.oxfordjournals.org/ , summarizes the overall clinical characteristics by patient response class.
While our proposed framework is generally applicable to identifying context-specific multi-omics biomarkers, this study focused on discovering drug-response-specific biomarkers to enhance the prediction of pCR and RCB resistance. MOMLIN decomposed the input multimodal data into response-associated sparse latent components of input features and patients. These sparse components reveal how various features (e.g. genes and mutations) and clinical attributes relate to treatment outcomes ( Fig. 1 : stages 1–3), and their effectiveness was evaluated by measuring prediction performance. We assessed the predictive ability of MOMLIN through five-fold CV repeated 100 times. In each iteration, the dataset was divided into five folds, with one random fold held out as the test set and the remaining folds used as the training set. MOMLIN was trained on the training set, including the detection of candidate predictive markers, and its performance was evaluated on the 'unseen' test set. This process was repeated for all five folds to ensure a robust evaluation of MOMLIN's generalizability. Performance was measured by the AUROC metric ( Fig. 1 : stage 2).
To evaluate the prediction capability of MOMLIN, we modeled each response category as a binary classification problem and compared its prediction accuracy to existing multi-omics integration algorithms. For comparison, we randomly split the dataset into a training set (70%) and a test set (30% unseen data), with balanced inclusion of response classes. We employed LR as the classifier to assess the predictive performance of multimodal biomarkers. We compared MOMLIN with four other classification approaches for omics data: (i) SMCCA, which integrates multi-omics data by projecting them onto latent components for discriminant analysis; (ii) MOFA, which decomposes multi-omics data into common factors for discriminant analysis; (iii) sparse PCA; and (iv) DIABLO, a supervised integrative analysis method that represents the state of the art in classification. All methods were trained on the same preprocessed data.
The classification results showed that MOMLIN outperformed the compared multi-omics integration methods in most classification tasks on unseen test samples ( Fig. 2A ). Notably, DIABLO, the next best performer, was 10 to 15% less effective than MOMLIN. Additionally, we compared the performance of component-based LR models against raw feature-based LR models for predicting RCB response classes. Although the raw feature-based models achieved reasonable predictions, their performance dropped notably compared to the component-based models ( Fig. 2B ). This indicates the superior adaptability and effectiveness of component-based models in leveraging multi-omics data for prediction.
Performance comparison with existing methods and detection of informative data combination. All results in the plots depict test AUROC over five-fold CV obtained from 100 runs. (A) Box plots comparing response prediction performance of MOMLIN against existing state-of-the-art multi-omics methods. (B) Performance comparison between predictors based on latent components and those utilizing a selected subset of multimodal features. (C) Comparing AUROCs for the models with different data subset combinations (clinical, clinical + DNA, clinical + RNA and clinical + DNA + RNA) using MOMLIN.
Moreover, to test and demonstrate the generalizability of this framework, we applied MOMLIN to a preprocessed multi-omics dataset of colorectal adenocarcinoma (COAD) with 256 patients [ 47 ]. This dataset included gene expression, copy number variation and micro-RNA expression data, which we used to classify COAD subtypes: chromosomal instability (CIN, n = 174), genomically stable (GS, n = 34) and microsatellite instability (MSI, n = 48). The performance results, shown in Supplementary Table S2 and Supplementary Figure S1 available online at http://bib.oxfordjournals.org/ , indicate that MOMLIN outperformed all state-of-the-art methods tested in classifying COAD subtypes. Moreover, when comparing raw feature-based accuracies with sparse component-based accuracies (features derived from MOMLIN), we found that the raw feature-based classifier outperformed the existing methods ( Figure S1A and B ) but underperformed the component-based classifier. This observation is consistent with our findings on BC drug-response prediction.
To assess the added value of integrating multimodal data for predicting treatment response, we trained four prediction models with different feature combinations: (i) clinical features only, (ii) clinical + DNA, (iii) clinical + RNA and (iv) clinical + DNA + RNA. We found that adding data modalities improved prediction performance across all response classes ( Fig. 2C ). Notably, the models combining clinical data with RNA, or with both DNA and RNA, showed comparably superior performance, with an average AUROC of 0.978. In contrast, the model based on clinical features alone had a much lower AUROC, ranging from 0.51 to 0.82. These results suggest that the RNA transcriptome is the most informative data modality in this dataset; thus, integrating gene expression with clinical features could significantly improve our ability to predict treatment outcomes in BC.
To understand the molecular landscape of treatment response in BC, we used MOMLIN to model response-specific bi-multivariate associations across multiple data modalities. We observed stronger correlations between RNA gene expression and both TiME ( r = 0.701) and pathway activity ( r = 0.868), indicating greater overlap or shared information between them. Conversely, moderate correlations were found between RNA gene expression and DNA mutations ( r = 0.526) or clinical features ( r = 0.488), indicating partially overlapping or independent information. These results suggest that multimodal biological features provide complementary information in a combinatorial manner.
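Correlations of this kind correspond to Pearson correlations between the modalities' latent scores (their projections onto the canonical weights); a minimal sketch with synthetic data and illustrative names:

```python
import numpy as np

def modality_correlation(X, Y, u, v):
    """Pearson correlation between two modalities' latent scores,
    X @ u versus Y @ v (a sketch with synthetic inputs)."""
    return float(np.corrcoef(X @ u, Y @ v)[0, 1])

rng = np.random.default_rng(5)
z = rng.standard_normal(147)                   # shared signal across 147 "patients"
X = np.c_[z, rng.standard_normal((147, 3))]    # e.g. gene expression components
Y = np.c_[z + 0.5 * rng.standard_normal(147), rng.standard_normal((147, 2))]
u = np.array([1.0, 0.0, 0.0, 0.0])             # toy canonical weights
v = np.array([1.0, 0.0, 0.0])
r = modality_correlation(X, Y, u, v)
print(round(r, 3))  # strong but imperfect cross-modality correlation
```

A high r indicates largely shared information between the two modalities' projections; a moderate r indicates partially independent signal.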
When investigating the importance of each feature for predicting the response classes, MOMLIN identified four distinct loading vectors corresponding to the pCR and RCB response classes, highlighting distinct weight patterns for pCR versus non-pCR and RCB versus non-RCB classes ( Fig. 3 ). For example, in the pCR (complete response) components, examining the top five molecular features across the different modalities revealed distinct molecular patterns. Specifically, gene expression analysis showed that downregulation of FBXO2 and RPS28P7 may inhibit tumor cell proliferation and potentially enhance treatment efficacy, while upregulation of the C2CD4D-AS1, CSF3R and SMPDL3B genes may promote the immune response, increasing tumor cell vulnerability and therapeutic effect ( Fig. 3A ). Mutational analysis revealed negative associations for the marker genes HMCN1 and GATA3, but a positive association for COL5A1 ( Fig. 3C ). Additionally, tumor mutation burden (TMB) and homologous recombination deficiency (HRD)-Telomeric AI signatures were higher in pCR patients, suggesting high genomic instability compared to RCB patients [ 9 ]. TiME analysis showed reduced immunosuppressive mast cells and extracellular matrix (ECM), along with increased infiltration of neutrophils, TIM-3 and CD8+ T-cells ( Fig. 3D ). Pathway analysis further revealed potential downregulation of the PDGFRB pathway, involved in stromal cell activity and associated with improved patient response [ 49 ], along with upregulation of the antimicrobial peptide, FLT3 signaling, ephrin B reverse signaling and potential therapeutics for SARS pathways ( Fig. 3E ), suggesting enhanced immune surveillance and interaction with tumor cells. In summary, MOMLIN reveals a distinct genomic landscape with higher immune activity and genomic instability in pCR that characterizes its favorable treatment response.
Heatmaps illustrating feature importance on the response-associated components identified by MOMLIN. Each row represents a drug-response class (pCR, RCB-I, RCB-II or RCB-III), and columns represent features across the different data modalities. The color gradient indicates feature loading (importance), representing the strength of association with the response classes; the sign (negative or positive) of the gradient denotes the direction of association. All results in the heatmaps depict an average over 100 runs of five-fold CV. (A–E) Response-associated candidate biomarkers detected in the latent components of (A) gene expression data (highlighting DE genes), (B) clinical features, (C) DNA mutations (highlighting mutated genes), (D) TiME cells and (E) functional pathway profiles (highlighting altered pathways).
Similarly, in the RCB-I (good response) components, RNA expression analysis revealed that lower expression of GPX1P1 and HBB is linked to less aggressive tumors [ 48 ], while thiosulfate sulfurtransferase (TST), NPIPA5 and GSDMB were overexpressed, which is linked to enhanced immune response and therapeutic effectiveness [ 49 , 50 ]. Mutational analysis showed positive associations for the therapeutic target signatures TP53, MUC16 and RYR2 [ 51 , 52 ], but negative associations for NEB and CIN scores. TiME analysis demonstrated increased infiltration of Tregs, cancer-associated fibroblasts (CAFs), monocytic lineage cells and natural killer (NK) cells, indicating a more active immune environment [ 9 ], with reduced TEM CD4 cells. Pathway analysis further identified downregulation of the NOD1/2 signaling, EPHA-mediated growth cone collapse and toll-like receptor (TLR1, TLR2) pathways, which are involved in inflammation and immune response, together with upregulation of the allograft rejection and G0/early G1 pathways. In summary, tumors that achieve RCB-I are marked by distinct genomic markers, an active immune response and lower CIN.
In the RCB-II (moderate response) components, RNA expression analysis revealed overexpression of the pseudogenes RPLP0P9, FTH1P20 and RNF5P1, together with overexpression of ERVMER34-1 and PON3, genes that play an oncogenic role in BC [ 53 ]. Mutation analysis revealed positive associations for HRD-LOH, RYR1 and MT-ND4, but negative associations for MACF1 and neoantigen loads, in line with previous reports [ 54 , 55 ]. Analysis of TiME features demonstrated increased IDO1 and TAP2, with reduced CTLA-4, NK cells and PD-L2, indicating a less suppressive immune environment. Pathway analysis further revealed downregulation of the G1/S DNA damage checkpoint and TP53 regulation pathways, highlighting DNA repair issues, together with upregulation of the PDGFRB pathway, E2F targets and Hedgehog signaling, which are associated with cell proliferation. In summary, RCB-II patients display distinct genomic markers, including pseudogenes, a less suppressive immune environment and active proliferation.
In the RCB-III (resistant) components, RNA expression analysis revealed lower expression of the therapeutic targets PON3 and FGFR4 [ 56 ], along with lower expression of the lncRNAs ENSG00000225489, ENSG00000261116 and RNF5P1. Mutation signature analysis identified a positive association for MT-ND1, but negative associations for the therapeutic targets TP53 and MT-ND4 [ 7 , 52 ]. Neoantigen loads were higher while TMB was lower, indicating reduced tumor suppressor activity. TiME analysis revealed reduced T-cell exclusion activity and HLA-E, with increased ECM, HLA-DPA1 and LAG3, suggesting an immunosuppressive tumor environment. Pathway analysis revealed upregulation of pathways involved in neurotransmitter release, cell-cycle progression (RB-1) and immune system diseases, suggesting active cell signaling and proliferation, with downregulation of the EPHB forward signaling pathway and nucleotide catabolism. In summary, patients that attained RCB-III were characterized by a low mutational burden and an immunosuppressive environment, leading to treatment resistance.
To further extract multimodal network biomarkers and understand the complex biological interactions in patients with pCR and RCB, we performed a cross-interaction network analysis using the candidate signatures identified by MOMLIN across the different modalities. This analysis included clinical features, DNA mutations, gene expression, TiME cells and enriched pathways, aiming to elucidate the biology underlying specific treatment responses. Figure 4 shows the interaction networks of selected multimodal features for each RCB class. To identify potential biomarkers associated with pCR and RCB response, we focused on the top ten multimodal features ranked by network edge connections. For example, in tumors that attained pCR, the network analysis revealed co-enrichment of mutations in the HMCN1 and COL5A1 genes, particularly in estrogen receptor (ER)-negative patients. HMCN1 and COL5A1 are therapeutic target-like molecules that encode proteins of the ECM structure, and mutations in these genes alter tumor architecture and cell adhesion, potentially facilitating immune cell infiltration [ 52 ]. We also observed elevated expression of the FBXO2, CSF3R, C2CD4D-AS1 and RPS28P7 genes, alongside increased infiltration of CD8+ T-cells [ 9 , 57 ]. FBXO2 is a component of the ubiquitin-proteasome system, which regulates protein degradation and influences the cell cycle and apoptosis [ 58 ], while CSF3R plays a vital role in granulocyte production and immune response [ 59 ]. These gene expression patterns, coupled with increased CD8+ T-cell infiltration, suggest a robust anti-tumor immune response. Furthermore, these molecular perturbations may be linked to antimicrobial peptide pathways and FLT3 signaling, potentially contributing to the favorable outcome of achieving pCR [ 60 , 61 ].
Supplementary Table S3, available online at http://bib.oxfordjournals.org/ , presents a more detailed list (top 30) of the multi-modal and -omics biomarkers identified using the MOMLIN pipeline.
Multimodal network biomarkers explain drug-response classes. The multimodal networks detail the candidate biomarkers and their interactions for each response class: (A) pCR patients, (B) RCB-I patients (good response), (C) RCB-II patients (moderate response) and (D) RCB-III patients (resistance). Nodes in the network represent candidate biomarkers derived from clinical features, DNA mutations, gene expression, enriched cell types and pathways, each indicated in a different color in the figure legend. Negative edges are light green; positive edges are light magenta. Edge width reflects the strength of the interaction between features. Node size corresponds to the number of connections (degree), and the font size of node labels scales with degree centrality, highlighting the most interconnected biomarkers.
Similarly, RCB-I tumors exhibited co-enriched mutations in MUC16 and TP53, particularly in HER2+ cases [ 14 ]. MUC16 (CA125) is a therapeutic target associated with immune evasion and tumor growth [ 51 ], while TP53 mutations can lead to loss of cell cycle control and genomic instability [ 62 ]. We also observed elevated expression of TST, involved in detoxification processes, and of GPX1P1, a long non-coding RNA (lncRNA) involved in the oxidative stress response. The immune landscape of these tumors showed increased infiltration of TEM CD4 cells (adaptive immunity), monocytic lineage cells (phagocytosis and antigen presentation) and NK cells (innate immunity), as well as CAFs. This immune landscape, coupled with potential perturbations in the allograft rejection pathway, suggests an active but potentially incomplete immune response against the tumor, resulting in minimal residual disease.
RCB-II tumors had lower neoantigen loads compared to pCR tumors, in both ER-negative and HER2+ patients. This reduced neoantigen load might contribute to a weaker immune response. Gene expression analysis showed elevated levels of specific lncRNAs, including FTH1P20 (associated with iron metabolism), RNF5P1 (potentially affecting protein degradation) and RPLP0P9 (involved in protein synthesis), along with ERVMER34-1, which can influence gene expression and immune response in BC patients. Numerous studies have underscored the key regulatory roles of lncRNAs in tumors and the immune system. Notably, increased expression of the immune checkpoint protein IDO1 negatively regulates the expression of CTLA-4, both known to modulate antitumor immune responses [ 63 ]. The combined effect of these molecular alterations suggests potential tumor survival mechanisms, including immune evasion and dysregulation of the G1/S DNA damage checkpoint [ 64 ], contributing to moderate residual disease.
In RCB-III tumors, we observed a reduced prevalence of TP53 and MT-ND4 mutations, typically associated with genomic instability and aggressive tumor behavior [ 51 ], coupled with a higher neoantigen load, suggesting an alternative mechanism (pathway) driving tumor progression. Despite the higher neoantigen loads, increased expression of the HLA-E immune checkpoint and T-cell exclusion in the tumor microenvironment hindered effective anti-tumor immune responses. Additionally, the low-expressed genes PON3, ENSG00000261116 (lncRNA) and RNF5P1 are involved in detoxification, gene regulation and protein degradation, respectively, representing an adaptive response to cellular stress in these tumors. Clinical markers indicating lymph node involvement suggest a more advanced disease state [ 9 ]. These findings, along with potential perturbations in the neurotransmitter release cycle pathway, collectively portray RCB-III tumors as genetically unstable yet effective at evading immune surveillance, contributing to their significant treatment resistance. Overall, further investigation of these interactive molecular networks, comprising both positive and negative interactions, offers a deeper understanding of these candidate biomarkers for distinguishing treatment-sensitive pCR and resistant RCB tumors.
The advent of multi-omics technologies has revolutionized our understanding of cancer biology, offering unprecedented insights into the complex molecular interactions that shape tumor behavior and treatment response. In this study, we presented MOMLIN (multi-modal and -omics ML integration), a novel method to enhance cancer drug-response prediction by integrating multi-omics data. MOMLIN utilizes class-specific feature learning and sparse correlation algorithms to model multi-omics associations, enabling the detection of class-specific multimodal biomarkers from different omics datasets. Applied to a BC multimodal dataset of 147 patients (comprising RNA expression, DNA mutation, tumor microenvironment, clinical features and pathway functional profiles), MOMLIN was highly predictive of responses to anticancer therapies and identified cohesive multi-modal and -omics network biomarkers associated with response (pCR) and various levels of RCB (RCB-I: good response, RCB-II: moderate response and RCB-III: resistance).
Using MOMLIN, we identified that pCR is determined by an interactive set of multimodal network biomarkers driven by distinct genetic alterations, such as HMCN1 and COL5A1, particularly in ER-negative tumors [ 9 , 65 ]. Gene expression signatures, including FBXO2 and CSF3R, were associated with immune cell infiltration (CD8+ T-cells), which has previously been reported as a key determinant of response [ 57 ]. The association of these biomarkers with the antimicrobial peptide and FLT3 signaling pathways suggests a robust immune response [ 61 ] as a critical driver of complete response. Additionally, the lncRNA C2CD4D-AS1 was identified; its exact role in these complex molecular interactions in BC remains to be elucidated. Future work could specifically probe these complex interactions across different molecules to gain more clinically relevant insights into pCR tumors.
RCB-I tumors, despite responding well to treatment, were associated with a distinct multimodal molecular signature. These tumors were enriched for mutations in the therapeutic target MUC16 (CA125), known for its role in immune evasion [ 51 ], and the tumor suppressor gene TP53, particularly in HER2+ cases [ 14 ]. Elevated expression of TST and GPX1P1 (an lncRNA involved in the oxidative stress response) was associated with increased infiltration of diverse immune cells, including TEM CD4+ cells, monocytes and NK cells [ 10 ]. This active immune landscape, and the intricate interactions of these signatures with potential perturbations in the allograft rejection pathway, suggest a robust yet potentially incomplete anti-tumor immune response, contributing to the minimal residual disease observed in this subtype.
RCB-II tumors showed lower neoantigen loads compared to pCR, which could contribute to a weaker immune response, particularly in the ER-negative and HER2+ subtypes. Increased expression of lncRNAs such as FTH1P20, RNF5P1, RPLP0P9 and ERVMER34-1 was associated with increased expression of the immune checkpoint protein IDO1, which negatively regulates CTLA-4 expression, suggesting immune evasion and alterations in tumor cell metabolism and proliferation. These altered molecular interactions implicate dysregulation of the G1/S DNA damage checkpoint as a possible mechanism for the moderate treatment response [ 64 ].
RCB-III tumors, classified as resistant, were associated with a distinct multimodal molecular landscape driven by reduced TP53 and MT-ND4 mutations [ 52 ], accompanied by higher neoantigen loads compared to the other response groups. This suggests an alternative mechanism driving tumor progression and immune evasion. Despite the high neoantigen load, which could potentially trigger an immune response, these tumors exhibited immune evasion through increased HLA-E immune checkpoints and T-cell exclusion [ 40 , 55 ]. Also, the downregulation of genes such as PON3 and the lncRNA ENSG00000261116, along with lymph node involvement, pointed to advanced disease and cellular stress adaptation [ 9 ]. These complex interactions, including potential perturbations in the neurotransmitter release cycle pathway, could contribute to treatment resistance in RCB-III tumors. Future studies targeting these immunosuppressive mechanisms and exploring novel pathways could offer promising avenues to overcome resistance in this aggressive subtype.
These findings emphasize the potential of MOMLIN to enable a deeper understanding of the complex biological mechanisms corresponding to each response class, ultimately paving the way for personalized treatment strategies in cancer. MOMLIN also demonstrated the best prediction performance on unseen patients by utilizing these identified sets of network biomarkers. By identifying response-associated biomarkers, researchers can stratify patients based on their likelihood of achieving pCR or experiencing RCB under anticancer treatments, facilitating more informed treatment decisions and potentially improving patient outcomes. Moreover, the identified biomarkers could serve as valuable targets for developing novel therapeutic interventions and generating new biological hypotheses. However, the clinical translation of multimodal biomarkers necessitates addressing the potential economic burden associated with multi-omics testing. Developing targeted biomarker panels and prioritizing key hub molecules from the large-scale candidate multimodal network biomarkers identified by MOMLIN could be a viable strategy for reducing costs while maintaining predictive accuracy. Furthermore, ongoing advancements in sequencing and diagnostic technologies are expected to make multi-omics testing more accessible and affordable over time.
In conclusion, our study demonstrates MOMLIN’s capacity to uncover nuanced molecular signatures associated with different drug-response classes in BC. By integrating multi-modal and -omics datasets, we have highlighted the complex interplay between genetic alterations, gene expression, immune infiltration and cellular pathways that contribute to treatment response and resistance. Future research in this direction holds promise for refining risk stratification, optimizing treatment selection and ultimately improving patient outcomes.
While MOMLIN demonstrates promising results, a key limitation lies in its reliance on correlation-based algorithms for multi-omics data integration. These algorithms are well suited to identifying associations but fall short when it comes to inferring causality between different omics layers, a challenge faced by most current state-of-the-art methods [ 28 , 30 ]. In future iterations of MOMLIN, we aim to incorporate causal inference methodologies alongside sparse correlation algorithms to better understand the complex causal relationships within multi-omics datasets.
We proposed MOMLIN, a novel framework designed to integrate multimodal data and identify response-associated network biomarkers, enabling insight into biological mechanisms and regulatory roles.
MOMLIN employs adaptive weighting for different data modalities and an innovative regularization constraint to ensure robust feature selection when analyzing high-dimensional omics data.
MOMLIN demonstrates significantly improved performance compared to current state-of-the-art methods.
MOMLIN identifies interpretable and phenotype-specific components, providing insights into the molecular mechanisms driving treatment response and resistance.
We thank Dr Yoshihiro Yamanishi and Mr Chen Yuzhou for their technical help.
This work was supported by the core research budget of the Bioinformatics Institute, A*STAR.
Supplemental information and software are available at the Briefings in Bioinformatics website. Our algorithm’s software is available for free download at https://github.com/mamun41/MOMLIN_softwar/tree/main.
Hasin Y , Seldin M , Lusis A . Multi-omics approaches to disease . Genome Biol 2017 ; 18 : 83 .
Rashid MM , Hamano M , Iida M . et al. Network-based identification of diagnosis-specific trans-omic biomarkers via integration of multiple omics data . Biosystems 2024 ; 236 : 105122 . https://doi.org/10.1016/j.biosystems.2024.105122 .
Zhu B , Song N , Shen R . et al. Integrating clinical and multiple omics data for prognostic assessment across human cancers . Sci Rep 2017 ; 7 : 16954 . https://doi.org/10.1038/s41598-017-17031-8 .
Aly HA . Cancer therapy and vaccination . J Immunol Methods 2012 ; 382 : 1 – 23 .
Debela DT . et al. New approaches and procedures for cancer treatment: current perspectives . SAGE Open Med 2021 ; 9 : 20503121211034366 .
Rauf A , Abu-Izneid T , Khalil AA . et al. Berberine as a potential anticancer agent: a comprehensive review . Molecules 2021 ; 26 :7368. https://doi.org/10.3390/molecules26237368 .
Islam MR , Islam F , Nafady MH . et al. Natural small molecules in breast cancer treatment: understandings from a therapeutic viewpoint . Molecules 2022 ; 27 : 2165 . https://doi.org/10.3390/molecules27072165 .
Emran TB , Shahriar A , Mahmud AR . et al. Multidrug resistance in cancer: understanding molecular mechanisms . Front Oncol 2022 ; 12 : 891652 . https://doi.org/10.3389/fonc.2022.891652 .
Sammut SJ , Crispin-Ortuzar M , Chin SF . et al. Multi-omic machine learning predictor of breast cancer therapy response . Nature 2022 ; 601 : 623 – 9 . https://doi.org/10.1038/s41586-021-04278-5 .
Zhang A , Miao K , Sun H . et al. Tumor heterogeneity reshapes the tumor microenvironment to influence drug resistance . Int J Biol Sci 2022 ; 18 : 3019 – 33 . https://doi.org/10.7150/ijbs.72534 .
Karczewski KJ , Snyder MP . Integrative omics for health and disease . Nat Rev Genet 2018 ; 19 : 299 – 310 .
In GK . et al. Multi-omic profiling reveals discrepant immunogenic properties and a unique tumor microenvironment among melanoma brain metastases . NPJ Precis Oncol 2023 ; 7 : 120 . https://doi.org/10.1038/s41698-023-00471-z .
Denkert C , Untch M , Benz S . et al. Reconstructing tumor history in breast cancer: signatures of mutational processes and response to neoadjuvant chemotherapy (small star, filled) . Ann Oncol 2021 ; 32 : 500 – 11 . https://doi.org/10.1016/j.annonc.2020.12.016 .
Lesurf R , Griffith OL , Griffith M . et al. Genomic characterization of HER2-positive breast cancer and response to neoadjuvant trastuzumab and chemotherapy-results from the ACOSOG Z1041 (alliance) trial . Ann Oncol 2017 ; 28 : 1070 – 7 . https://doi.org/10.1093/annonc/mdx048 .
Choi JH , Yu J , Jung M . et al. Prognostic significance of TP53 and PIK3CA mutations analyzed by next-generation sequencing in breast cancer . Medicine (Baltimore) 2023 ; 102 : e35267 . https://doi.org/10.1097/MD.0000000000035267 .
Simeoni O , Piras V , Tomita M . et al. Tracking global gene expression responses in T cell differentiation . Gene 2015 ; 569 : 259 – 66 . https://doi.org/10.1016/j.gene.2015.05.061 .
Piras V , Hayashi K , Tomita M . et al. Enhancing apoptosis in TRAIL-resistant cancer cells using fundamental response rules . Sci Rep 2011 ; 1 : 144 . https://doi.org/10.1038/srep00144 .
Misetic H , Keddar MR , Jeannon JP . et al. Mechanistic insights into the interactions between cancer drivers and the tumour immune microenvironment . Genome Med 2023 ; 15 : 40 . https://doi.org/10.1186/s13073-023-01197-0 .
Son B , Lee S , Youn HS . et al. The role of tumor microenvironment in therapeutic resistance . Oncotarget 2017 ; 8 : 3933 – 45 . https://doi.org/10.18632/oncotarget.13907 .
Wang C , Lye X , Kaalia R . et al. Deep learning and multi-omics approach to predict drug responses in cancer . BMC Bioinformatics 2021 ; 22 : 632 . https://doi.org/10.1186/s12859-022-04964-9 .
Li F , Yin J , Lu M . et al. ConSIG: consistent discovery of molecular signature from OMIC data . Brief Bioinform 2022 ; 23 :bbac253. https://doi.org/10.1093/bib/bbac253 .
Yang Q , Li B , Tang J . et al. Consistent gene signature of schizophrenia identified by a novel feature selection strategy from comprehensive sets of transcriptomic data . Brief Bioinform 2020 ; 21 : 1058 – 68 . https://doi.org/10.1093/bib/bbz049 .
Picard M , Scott-Boyer MP , Bodein A . et al. Integration strategies of multi-omics data for machine learning analysis . Comput Struct Biotechnol J 2021 ; 19 : 3735 – 46 . https://doi.org/10.1016/j.csbj.2021.06.030 .
Dong Z , Zhang N , Li C . et al. Anticancer drug sensitivity prediction in cell lines from baseline gene expression through recursive feature selection . BMC Cancer 2015 ; 15 : 489 . https://doi.org/10.1186/s12885-015-1492-6 .
Menden MP , Iorio F , Garnett M . et al. Machine learning prediction of cancer cell sensitivity to drugs based on genomic and chemical properties . PloS One 2013 ; 8 : e61318 . https://doi.org/10.1371/journal.pone.0061318 .
Basu A , Bodycombe NE , Cheah JH . et al. An interactive resource to identify cancer genetic and lineage dependencies targeted by small molecules . Cell 2013 ; 154 : 1151 – 61 . https://doi.org/10.1016/j.cell.2013.08.003 .
Adam G , Rampášek L , Safikhani Z . et al. Machine learning approaches to drug response prediction: challenges and recent progress . NPJ Precis Oncol 2020 ; 4 : 19 . https://doi.org/10.1038/s41698-020-0122-1 .
Singh A , Shannon CP , Gautier B . et al. DIABLO: an integrative approach for identifying key molecular drivers from multi-omics assays . Bioinformatics 2019 ; 35 : 3055 – 62 . https://doi.org/10.1093/bioinformatics/bty1054 .
Wang T , Shao W , Huang Z . et al. MOGONET integrates multi-omics data using graph convolutional networks allowing patient classification and biomarker identification . Nat Commun 2021 ; 12 : 3445 . https://doi.org/10.1038/s41467-021-23774-w .
Argelaguet R , Arnol D , Bredikhin D . et al. MOFA+: a statistical framework for comprehensive integration of multi-modal single-cell data . Genome Biol 2020 ; 21 : 111 . https://doi.org/10.1186/s13059-020-02015-1 .
Rodosthenous T , Shahrezaei V , Evangelou M . Integrating multi-OMICS data through sparse canonical correlation analysis for the prediction of complex traits: a comparison study . Bioinformatics 2020 ; 36 : 4616 – 25 . https://doi.org/10.1093/bioinformatics/btaa530 .
Witten DM , Tibshirani RJ . Extensions of sparse canonical correlation analysis with applications to genomic data . Stat Appl Genet Mol Biol 2009 ; 8 : Article 28 . https://doi.org/10.2202/1544-6115.1470 .
Jeong D , Koo B , Oh M . et al. GOAT: gene-level biomarker discovery from multi-omics data using graph ATtention neural network for eosinophilic asthma subtype . Bioinformatics 2023 ; 39 :btad582. https://doi.org/10.1093/bioinformatics/btad582 .
Hu W , Lin D , Cao S . et al. Adaptive sparse multiple canonical correlation analysis with application to imaging (epi)genomics study of schizophrenia . IEEE Trans Biomed Eng 2018 ; 65 : 390 – 9 . https://doi.org/10.1109/TBME.2017.2771483 .
Hanzelmann S , Castelo R , Guinney J . GSVA: gene set variation analysis for microarray and RNA-seq data . BMC Bioinformatics 2013 ; 14 : 7 .
Sotiriou C , Wirapati P , Loi S . et al. Gene expression profiling in breast cancer: understanding the molecular basis of histologic grade to improve prognosis . J Natl Cancer Inst 2006 ; 98 : 262 – 72 . https://doi.org/10.1093/jnci/djj052 .
Desmedt C , Haibe-Kains B , Wirapati P . et al. Biological processes associated with breast cancer clinical outcome depend on the molecular subtypes . Clin Cancer Res 2008 ; 14 : 5158 – 65 . https://doi.org/10.1158/1078-0432.CCR-07-4756 .
Danaher P , Warren S , Dennis L . et al. Gene expression markers of tumor infiltrating leukocytes . J Immunother Cancer 2017 ; 5 : 18 . https://doi.org/10.1186/s40425-017-0215-8 .
Charoentong P , Finotello F , Angelova M . et al. Pan-cancer immunogenomic analyses reveal genotype-immunophenotype relationships and predictors of response to checkpoint blockade . Cell Rep 2017 ; 18 : 248 – 62 . https://doi.org/10.1016/j.celrep.2016.12.019 .
Jiang P , Gu S , Pan D . et al. Signatures of T cell dysfunction and exclusion predict cancer immunotherapy response . Nat Med 2018 ; 24 : 1550 – 8 . https://doi.org/10.1038/s41591-018-0136-1 .
D’Eustachio P . Reactome knowledgebase of human biological pathways and processes . Methods Mol Biol 2011 ; 694 : 49 – 61 .
Schaefer CF , Anthony K , Krupa S . et al. PID: the pathway interaction database . Nucleic Acids Res 2009 ; 37 : D674 – 9 . https://doi.org/10.1093/nar/gkn653 .
Liberzon A , Subramanian A , Pinchback R . et al. Molecular signatures database (MSigDB) 3.0 . Bioinformatics 2011 ; 27 : 1739 – 40 . https://doi.org/10.1093/bioinformatics/btr260 .
Du L . et al. Identifying diagnosis-specific genotype–phenotype associations via joint multitask sparse canonical correlation analysis and classification . Bioinformatics 2020 ; 36 : i371 – 9 . https://doi.org/10.1093/bioinformatics/btaa434 .
Hao X , Li C , Du L . et al. Mining outcome-relevant brain imaging genetic associations via three-way sparse canonical correlation analysis in Alzheimer’s disease . Sci Rep 2017 ; 7 : 44272 . https://doi.org/10.1038/srep44272 .
Shi WJ , Zhuang Y , Russell PH . et al. Unsupervised discovery of phenotype-specific multi-omics networks . Bioinformatics 2019 ; 35 : 4336 – 43 . https://doi.org/10.1093/bioinformatics/btz226 .
Duan R , Gao L , Gao Y . et al. Evaluation and comparison of multi-omics data integration methods for cancer subtyping . PLoS Comput Biol 2021 ; 17 : e1009224 . https://doi.org/10.1371/journal.pcbi.1009224 .
Ponzetti M , Capulli M , Angelucci A . et al. Non-conventional role of haemoglobin beta in breast malignancy . Br J Cancer 2017 ; 117 : 994 – 1006 . https://doi.org/10.1038/bjc.2017.247 .
Yang X , Tang Z . Role of gasdermin family proteins in cancers (review) . Int J Oncol 2023 ; 63 : 100 . https://doi.org/10.3892/ijo.2023.5548 .
Chen Z , Yao N , Zhang S . et al. Identification of critical radioresistance genes in esophageal squamous cell carcinoma by whole-exome sequencing . Ann Transl Med 2020 ; 8 : 998 . https://doi.org/10.21037/atm-20-5196 .
Zhou Y , Zhang Y , Zhao D . et al. TTD: therapeutic target database describing target druggability information . Nucleic Acids Res 2024 ; 52 : D1465 – 77 . https://doi.org/10.1093/nar/gkad751 .
Li F , Yin J , Lu M . et al. DrugMAP: molecular atlas and pharma-information of all drugs . Nucleic Acids Res 2023 ; 51 : D1288 – 99 . https://doi.org/10.1093/nar/gkac813 .
Záveský L , Jandáková E , Weinberger V . et al. Human endogenous retroviruses (HERVs) in breast cancer: altered expression pattern implicates divergent roles in carcinogenesis . Oncology 2024 ; 102 : 1 – 10 . https://doi.org/10.1159/000538021 .
van der Wiel AMA , Schuitmaker L , Cong Y . et al. Homologous recombination deficiency scar: mutations and beyond-implications for precision oncology . Cancers (Basel) 2022 ; 14 : 4157 . https://doi.org/10.3390/cancers14174157 .
Morisaki T , Kubo M , Umebayashi M . et al. Neoantigens elicit T cell responses in breast cancer . Sci Rep 2021 ; 11 : 13590 . https://doi.org/10.1038/s41598-021-91358-1 .
Levine KM , Ding K , Chen L . et al. FGFR4: a promising therapeutic target for breast cancer and other solid tumors . Pharmacol Ther 2020 ; 214 : 107590 . https://doi.org/10.1016/j.pharmthera.2020.107590 .
Ali H , Provenzano E , Dawson SJ . et al. Association between CD8+ T-cell infiltration and breast cancer survival in 12 439 patients . Ann Oncol 2014 ; 25 : 1536 – 43 . https://doi.org/10.1093/annonc/mdu191 .
Liu Y , Pan B , Qu W . et al. Systematic analysis of the expression and prognosis relevance of FBXO family reveals the significance of FBXO1 in human breast cancer . Cancer Cell Int 2021 ; 21 : 130 . https://doi.org/10.1186/s12935-021-01833-y .
Park SD , Saunders AS , Reidy MA . et al. A review of granulocyte colony-stimulating factor receptor signaling and regulation with implications for cancer . Front Oncol 2022 ; 12 : 932608 . https://doi.org/10.3389/fonc.2022.932608 .
Aghamiri S , Zandsalimi F , Raee P . et al. Antimicrobial peptides as potential therapeutics for breast cancer . Pharmacol Res 2021 ; 171 : 105777 . https://doi.org/10.1016/j.phrs.2021.105777 .
Chen R , Wang X , Fu J . et al. High FLT3 expression indicates favorable prognosis and correlates with clinicopathological parameters and immune infiltration in breast cancer . Front Genet 2022 ; 13 : 956869 . https://doi.org/10.3389/fgene.2022.956869 .
Chen X , Zhang T , Su W . et al. Mutant p53 in cancer: from molecular mechanism to therapeutic modulation . Cell Death Dis 2022 ; 13 : 974 . https://doi.org/10.1038/s41419-022-05408-1 .
Azimnasab-Sorkhabi P , Soltani-As M , Yoshinaga TT . et al. IDO blockade negatively regulates the CTLA-4 signaling in breast cancer cells . Immunol Res 2023 ; 71 : 679 – 86 . https://doi.org/10.1007/s12026-023-09378-0 .
Sideris N , Dama P , Bayraktar S . et al. LncRNAs in breast cancer: a link to future approaches . Cancer Gene Ther 2022 ; 29 : 1866 – 77 . https://doi.org/10.1038/s41417-022-00487-w .
Burstein MD . et al. Comprehensive genomic analysis identifies novel subtypes and targets of triple-negative breast cancer . Clin Cancer Res 2015 ; 21 : 1688 – 98 .
Establishing thresholds of change in an outcome measurement instrument that are actually meaningful for the patient is paramount. This concept is called the minimum clinically important difference (MCID). We summarize available MCID calculation methods relevant to spine surgery and outline key considerations, followed by a step-by-step working example of how MCID can be calculated, using publicly available data, to enable readers to follow the calculations themselves.
Thirteen MCID calculation methods were summarized, including anchor-based methods, distribution-based methods, the Reliable Change Index, 30% Reduction from Baseline, the Social Comparison Approach and the Delphi method. All methods except the latter two were used to calculate the MCID for improvement of the Zurich Claudication Questionnaire (ZCQ) Symptom Severity domain in patients with lumbar spinal stenosis. The Numeric Rating Scale for Leg Pain and the Japanese Orthopaedic Association Back Pain Evaluation Questionnaire Walking Ability domain were used as anchors.
The MCID for improvement of ZCQ Symptom Severity ranged from 0.8 to 5.1. On average, distribution-based methods yielded lower MCID values than anchor-based methods. The percentage of patients who achieved the calculated MCID threshold ranged from 9.5% to 61.9%.
MCID calculations are encouraged in spinal research to evaluate treatment success. Anchor-based methods, relying on scales assessing patient preferences, continue to be the “gold standard”, with the receiver operating characteristic (ROC) curve approach being optimal. In their absence, the minimum detectable change approach is acceptable. The provided explanation and step-by-step example of MCID calculations, with statistical code and publicly available data, can act as guidance in planning future MCID calculation studies.
The notion of minimum clinically important difference (MCID) was introduced to establish thresholds of change in an outcome measurement instrument that are actually meaningful for the patient. Jaeschke et al. originally defined it “as the smallest difference in score in the domain of interest which the patient perceives as beneficial and which would mandate, in the absence of troublesome side-effects and excessive cost, a change in the patient’s management” [ 1 ].
In many clinical trials, statistical analysis focuses only on intergroup comparisons of raw outcome scores using parametric/non-parametric tests, with conclusions derived from the p -value. The classical threshold of p -value < 0.05 only suggests that the observed effect is unlikely to have occurred by chance; it does not equate to a change that is clinically meaningful for the patient [ 2 ]. Calculating MCID scores, and using them as thresholds for “treatment success”, ensures that patients’ needs and preferences are considered and allows comparison of the proportion of patients experiencing a clinically relevant improvement among different groups [ 3 ]. Through MCID, clinicians can better understand the impact of an intervention on their patients’ lives, sample size calculations can become more robust and health policy makers may decide which treatments deserve reimbursement [ 4 , 5 , 6 ].
The MCID can be determined from the patient’s perspective, where it is the patient who decides whether a change in their health was meaningful [ 4 , 7 , 8 , 9 ]. This is the most common “gold-standard” approach and one that we will focus on. Occasionally, the clinician’s perspective can also be used to determine MCID. However, MCID for a clinician may not necessarily mean an increase in a patient’s functionality, but rather a change in disease survival or treatment planning [ 10 ]. MCID can also be defined at a societal level, as e.g. improvement in a patient’s functionality significant enough to aid their return to work [ 11 ].
MCID thresholds are intended to assess an individual’s clinical improvement and ought not to be applied to mean scores of entire groups post-intervention, as doing so may falsely overestimate treatment effectiveness. It is also noteworthy that obtained MCID values are not treatment-specific but broadly disease-category-specific: they rely on a patient’s perception of clinical benefit, which is influenced by their diagnosis and subsequent symptoms, not just the treatment modality.
In this study, we summarize available MCID calculation methods and outline key considerations when designing an MCID study, followed by a step-by-step working example of how MCID can be calculated.
To illustrate the MCID methods and to enable the reader to follow the practical calculation guide for the different MCID values along the way, a previously published data set of 84 patients, as described in Minetama et al., was used under a CC0 1.0 license [ 12 ]. Data can be downloaded at https://data.mendeley.com/datasets/vm8rg6rvsw/1 . The statistical R code can be found in Supplementary Content 1, including instructions on formatting the data set for MCID calculations. The titles of the different MCID methods in the paper (listed below) and their numbers correspond to the same titles and respective numbers in the R code. All analyses in this case study were carried out using R version 2023.12 + 402 (The R Foundation for Statistical Computing, Vienna, Austria) [ 13 ].
The aim of Minetama et al. was to compare the effectiveness of supervised physical therapy (PT) with unsupervised at-home exercises (HE) in patients with lumbar spinal stenosis (LSS). The main inclusion criteria were: presence of neurogenic intermittent claudication with pain and/or numbness in the lower extremities, with or without back pain, in patients > 50 years of age; diagnosis of LSS confirmed on MRI; and a history of ineffective response to therapy for ≥ 3 months. Patients were then randomized into a 6-week PT or HE programme [ 12 ]. All data were pooled, as a clinically significant benefit for patients is independent of group allocation and because MCID is disease-specific; therefore, the derived MCID will be applicable to most patients with lumbar spinal stenosis, irrespective of treatment modality. Change scores were calculated by subtracting baseline scores from follow-up scores.
There are multiple approaches to calculating MCID, mainly divided into anchor-based and distribution-based methods (Fig. 1 ) [ 4 , 10 , 14 , 15 , 16 , 17 ]. Before deciding on the method, it needs to be defined whether the calculated MCID will be for improvement or deterioration [ 18 ]. Most commonly, MCID is used to measure improvement (as per the Jaeschke et al. definition) [ 1 , 4 , 7 , 14 , 15 , 16 , 19 , 20 ]. The value of MCID for improvement should not be directly applied in reverse to determine whether a decrease in patients' scores signifies a clinically meaningful deterioration – those are two separate concepts [ 18 ]. In addition, the actual MCID value ought to be applied to the post-intervention score of an individual patient (not the overall score for the whole group) to determine whether, at follow-up, he or she experienced a change equating to the MCID or more compared to their baseline score. Such patients are then classified as “responders”.
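The responder classification that anchor-based methods start from can be sketched in a few lines. Python is used here purely for illustration (the paper's worked example uses R), and all scores and the anchor MCID value below are hypothetical:

```python
# A minimal sketch of anchor-based "responder" classification:
# a patient is a responder if their improvement on the anchor
# instrument meets or exceeds the anchor's pre-defined MCID.
# All scores below are hypothetical.

def classify_responders(anchor_improvements, anchor_mcid):
    """Return one boolean per patient: True if that patient's
    anchor improvement reaches the anchor's MCID."""
    return [imp >= anchor_mcid for imp in anchor_improvements]

# Hypothetical pain scores for five patients; on a pain scale,
# improvement is a *decrease*, so improvement = baseline - follow-up.
baseline = [7, 5, 8, 6, 9]
followup = [4, 5, 3, 6, 8]
improvements = [b - f for b, f in zip(baseline, followup)]

print(classify_responders(improvements, anchor_mcid=1.6))
# [True, False, True, False, False]
```

Each downstream anchor-based method then only differs in how it turns these responder labels, together with the outcome instrument's change scores, into a single MCID value.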
Flow diagram presenting the range of minimum clinically important difference calculation methods, stratified into anchor-based, distribution-based and “other” methods described in the study. MCID, Minimum Clinically Important Difference; MIC, Minimal Important Change
According to the Consensus-based Standards for the selection of health measurement instruments (COSMIN) guidelines, the anchor-based approach is regarded as the “gold standard” [ 21 , 22 , 23 ]. In this approach, we determine the MCID of a chosen outcome measurement based on whether a pre-defined MCID (usually derived from another published study) was achieved on an external criterion, known as the anchor, usually another patient-reported outcome measure (PROM) or an objective test of functionality [ 4 , 7 , 8 , 15 , 16 , 17 , 18 , 20 ]. It is best to use scales which allow the patient to rate the specific aspect of their health related to the disease of interest post-intervention, compared to baseline, on a Likert-type scale. This scale may range, for example, from “much worse”, “somewhat worse”, “about the same”, “somewhat better”, to “much better”, as in the established Global Assessment Rating tool [ 7 , 8 , 24 , 25 ]. Depending on the scale, some studies determine MCID by calculating change scores for patients who ranked themselves as only “somewhat better”, while others only consider patients who ranked themselves as “much better” [ 7 , 25 , 26 , 27 , 28 , 29 ]. This discrepancy likely explains the range of MCIDs reported for a single outcome measure depending on the methodology; there appears to be no single “correct” approach. One alternative to the Global Assessment Rating is the health transition item (HTI) from the SF-36 questionnaire, where patients are asked about their overall health compared to one year ago [ 7 , 30 , 31 ]. Although quick and easy to conduct, the patient’s response may be influenced by comorbid health issues other than those targeted by the intervention. Nevertheless, any anchor where the patient is the one to decide what change is clinically meaningful captures the true essence of the MCID.
One should, however, be mindful of recall bias, which is not easily addressed with such anchors – patients at times do not reliably remember their baseline health status [ 32 ]. Moreover, the above anchors do not consider whether the patient would still choose the intervention for the same condition despite experiencing side-effects or cost. That can be addressed by implementing anchors such as the Satisfaction with Results scale described in Copay et al., who found that MCID values based on this scale were slightly higher than those derived from the HTI of the SF-36 [ 7 , 33 ].
Other commonly used outcome scales, such as the Oswestry Disability Index (ODI), Roland–Morris Disability Questionnaire (RMDQ), Visual Analogue Scale (VAS), or EQ5D-3L Health-Related Quality of Life, can also act as anchors [ 7 , 14 , 16 , 34 , 35 ]. In such instances, patients complete the “anchor” questionnaire at baseline and post-intervention, and the MCID of that anchor is derived from a previous publication [ 12 , 16 , 35 ]. Before deciding on the MCID, a full understanding of how it was derived in that previous publication is crucial. Ideally, it should have been derived in a population similar to our study cohort, with comparable follow-up periods [ 18 , 20 ]. Correlations between the anchor instrument and the investigated outcome measurement instrument must be recorded and ought to be at least moderate (> 0.5), as that is the best indicator of construct validity (whether both the anchor instrument and the outcome instrument represent a similar construct of patient health) [ 18 , 36 ]. If such a correlation is not available, the anchor-based MCID credibility instrument can aid in assessing construct proximity between the two [ 36 , 37 ].
Once the process for selecting an anchor and classifying “responders” and “non-responders” is established, the MCID can be calculated. The outcome instrument of interest is defined as the outcome for which we want to calculate the MCID. The first anchor-based method (within-patient change) focuses on the average improvement seen among clear responders on the anchor. The between-patient change anchor-based method additionally subtracts the average improvement seen among non-responders (unchanged and/or worsened) and consequently yields a smaller MCID value. Finally, there is an anchor-based method based on Receiver Operating Characteristic (ROC) curve analysis – which can be considered the current “gold standard” – that effectively treats the MCID calculation as a sort of diagnostic instrument and aims to improve the discriminatory performance of the MCID threshold. In the following paragraphs, the three anchor-based methods are described in more detail. The R code (Supplementary Content 1 ) enables the reader to follow the text and to calculate the MCID for the Zurich Claudication Questionnaire (ZCQ) Symptom Severity domain based on a publicly available dataset [ 12 ].
The chosen outcome measurement instrument in this case study, for which the MCID for improvement will be calculated, is the ZCQ Symptom Severity domain [ 12 ]. The ZCQ is composed of three subscales: symptom severity (7 questions, score per question ranging from 1 to 5 points); physical function (5 questions, score per question ranging from 1 to 4 points); and patient satisfaction with treatment (6 questions, score per question ranging from 1 to 4 points). Higher scores indicate greater disability/worse satisfaction [ 38 ]. To visualize different MCID values, the Numeric Rating Scale (NRS) for Leg Pain (score from 0, “no pain”, to 10, “worst possible pain”) and the Japanese Orthopaedic Association Back Pain Evaluation Questionnaire (JOABPEQ) Walking Ability domain were chosen, as they showed high responsiveness in patients with LSS post-operatively [ 39 ]. Through 25 questions, the JOABPEQ assesses five distinctive domains: pain-related symptoms, lumbar spine dysfunction, walking ability, impairment in social functioning and psychological disturbances. The score for each domain ranges from 0 to 100 points (a higher score indicating better health status) [ 40 ]. The correlation of ZCQ Symptom Severity with NRS Leg Pain and the JOABPEQ Walking Ability domain is 0.56 and − 0.51, respectively [ 39 ]. For a patient to be classified as a “responder” using the NRS for Leg Pain or JOABPEQ Walking Ability, the score at 6-week follow-up must have improved by 1.6 points or 20 points, respectively [ 7 , 40 , 41 ].
This publicly available dataset does not report patient satisfaction or any kind of global assessment rating.
To enable calculation of global-assessment-rating-based MCID methods for educational purposes, despite the very limited availability of studies providing MCID for deterioration of the JOABPEQ, we decided to stratify patients in this dataset into the following three groups, based on the JOABPEQ Walking Ability as an anchor: likely improved (change score above +20 points, according to Kasai et al.), no significant change (change score between −20 and +20 points), and likely deteriorated (change score below −20 points) [ 41 ]. As obtained MCID values were expected to be negative, all values were, for clarity of presentation, multiplied by −1, except in Method (IX), where the graphical data distribution was shown.
Method (I): Calculating MCID using "within-patient" score change
The first method focuses on calculating the change between the baseline and post-intervention scores of our outcome instrument for each patient classified as a "responder". A "responder" is a patient who, at follow-up, has achieved the pre-defined MCID of the anchor (or ranks themselves high enough on a global assessment rating-type scale, as per our methodology). The MCID is then defined as the mean change in the outcome instrument of interest among those classified as "responders" [4, 7, 16, 31].
The corresponding R code is described in Step 5a of Supplementary Content 1. The calculated within-patient MCID of ZCQ Symptom Severity based on NRS Leg Pain and the JOABPEQ Walking Ability domain was 4.4 and 4.2, respectively.
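The within-patient calculation reduces to a mean over responders. A minimal Python sketch (the supplement uses R; the function name and toy data below are hypothetical), assuming positive change scores denote improvement:

```python
from statistics import mean

def within_patient_mcid(outcome_changes, anchor_changes, anchor_mcid):
    """MCID = mean outcome-instrument change among 'responders', i.e.
    patients whose anchor change score reached the anchor's MCID."""
    responder_changes = [oc for oc, ac in zip(outcome_changes, anchor_changes)
                         if ac >= anchor_mcid]
    return mean(responder_changes)

# Toy example with an anchor MCID of 1.6 (the NRS Leg Pain threshold):
# two of the three patients respond, so the MCID is the mean of 4.0 and 5.0.
mcid = within_patient_mcid([4.0, 5.0, 1.0], [2.0, 3.0, 0.5], anchor_mcid=1.6)
```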
In this approach, the mean change in our outcome instrument is calculated not only for "responders" but also for "non-responders". "Non-responders" are patients who did not achieve the pre-defined MCID of our anchor or who did not rank themselves high enough (unchanged, or sometimes unchanged + worsened) on a global assessment rating-type scale, as per our methodology. The minimum clinically important difference of our outcome instrument is then defined as the difference between the mean change scores of "responders" and "non-responders" [4, 7, 16, 19].
The corresponding R code is described in Step 5b of Supplementary Content 1. The calculated between-patient MCID of ZCQ Symptom Severity based on NRS Leg Pain and the JOABPEQ Walking Ability domain was 3.5 and 2.8, respectively.
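The between-patient variant only adds a second group mean. An illustrative Python sketch under the same assumptions as above (the supplement uses R; data are hypothetical):

```python
from statistics import mean

def between_patient_mcid(outcome_changes, anchor_changes, anchor_mcid):
    """MCID = mean outcome change of 'responders' minus mean outcome
    change of 'non-responders', split by the anchor's pre-defined MCID."""
    responders = [oc for oc, ac in zip(outcome_changes, anchor_changes)
                  if ac >= anchor_mcid]
    non_responders = [oc for oc, ac in zip(outcome_changes, anchor_changes)
                      if ac < anchor_mcid]
    return mean(responders) - mean(non_responders)
```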
Here, the MCID is derived through ROC analysis to identify the "threshold" score of our outcome instrument that best discriminates between "responders" and "non-responders" of the anchor [4, 7, 16, 19, 27]. To understand ROC, one must familiarize oneself with the concepts of sensitivity and specificity. In ROC analysis, sensitivity is defined as the ability of the test to correctly detect "true positives", which in this context refers to patients who have achieved a clinically meaningful change.
A "false negative" would be a patient who was classified as a "non-responder" but is really a "responder". Specificity is defined as the ability of a test to correctly detect a "true negative" result – a patient who did not achieve a clinically meaningful change, a "non-responder" [25].
A "false positive" would be a patient who was classified as a "responder" but who was really a "non-responder". Values for sensitivity and specificity range from 0 to 1. A sensitivity of 1 means that the test detects 100% of "true positives" ("responders"), while a specificity of 1 reflects the ability to detect 100% of "true negatives" ("non-responders"). It is unclear what the minimum sensitivity and specificity should be for a "gold-standard" MCID, which is why the most established approach is to opt for an MCID threshold that maximizes both sensitivity and specificity at the same time, which can be done using ROC analysis [4, 7, 25, 31, 42]. During ROC analysis, the "closest-to-(0,1)-criterion" (the top-left-most point of the curve) and the Youden index are the two methods to automatically determine the optimal threshold point [43].
When conducting the ROC analysis, the area under the curve (AUC) is also determined – a measure of how well the MCID threshold discriminates responders from non-responders in general. AUC values range from 0 to 1. An AUC of 0.5 signifies that the score discriminates no better than random chance, whereas a value of 1 means that the score perfectly discriminates between responders and non-responders. In the literature, an AUC of ≥ 0.7 to < 0.8 is deemed fair (acceptable), ≥ 0.8 to < 0.9 is considered good, and values ≥ 0.9 are considered excellent [44]. Calculating the AUC provides a rough estimate of how well the chosen MCID threshold performs. The corresponding R code is described in Step 5c of Supplementary Content 1. The statistical package pROC was used. The calculated MCID of ZCQ Symptom Severity based on NRS Leg Pain and the JOABPEQ Walking Ability domain was 1.5 for both.
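The threshold search behind this method can be sketched without a statistics package (the supplement uses the R package pROC; this is a hand-rolled Python illustration with a hypothetical function name). Each observed change score is tried as a candidate threshold and the one maximizing Youden's J (sensitivity + specificity − 1) is kept:

```python
def roc_mcid(change_scores, is_responder):
    """Return the change-score threshold maximizing Youden's J for
    separating responders from non-responders (higher change = better)."""
    best_t, best_j = None, float("-inf")
    for t in sorted(set(change_scores)):
        tp = sum(r and c >= t for c, r in zip(change_scores, is_responder))
        fn = sum(r and c < t for c, r in zip(change_scores, is_responder))
        tn = sum((not r) and c < t for c, r in zip(change_scores, is_responder))
        fp = sum((not r) and c >= t for c, r in zip(change_scores, is_responder))
        sens = tp / (tp + fn) if tp + fn else 0.0
        spec = tn / (tn + fp) if tn + fp else 0.0
        if sens + spec - 1 > best_j:
            best_t, best_j = t, sens + spec - 1
    return best_t
```

In pROC the equivalent step is performed by `coords()` with its "best" threshold selection.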
Calculation of the MCID using the distribution-based approach focuses on the statistical properties of the dataset [7, 14, 16, 27, 45]. These methods are objective, easy to calculate and, in some cases, yield values close to anchor-based MCIDs. The advantage of this approach is that it does not rely on any external criterion, nor does it require additional studies on previously established MCIDs or other validated "gold standard" questionnaires for the specific disease in each clinical setting. However, it fails to include the patient's perspective of a clinically meaningful change, which will be discussed later in this study. In this sense, distribution-based methods focus on finding MCID thresholds that enable a mathematical distinction between a changed and an unchanged score, whereas anchor-based methods focus on finding MCID thresholds that represent a patient-centered, meaningful improvement.
The standard error of measurement (SEM) conceptualizes the reliability of the outcome measure by determining how repeated measurements of an outcome may differ from the "true score". A greater SEM equates to lower reliability, suggesting meaningful inconsistencies in the values produced by the outcome instrument despite similar measuring conditions. Hence, it has been theorized that 1 SEM is equal to the MCID, because a change score ≥ 1 SEM is unlikely to be due to measurement error and is therefore also more likely to be clinically meaningful [46, 47]. The following formula is used, where SD is the standard deviation of the baseline scores and ICC the intraclass correlation coefficient: SEM = SD × √(1 − ICC) [1, 7, 35, 46, 48].
The ICC, also called the reliability coefficient, signifies the level of agreement or consistency between measurements taken on different occasions or by different raters [49]. There are various ways of calculating the ICC depending on the model used, with values < 0.5, 0.5–0.75, 0.75–0.9 and > 0.90 indicating poor, moderate, good and excellent reliability, respectively [49]. While a value of 1 × SEM is probably the most established way to calculate the MCID, a range of multiplication factors for SEM-based MCIDs has been used in the literature, including 1.96 × SEM or even 2.77 × SEM, to identify a more specific threshold for improvement [48, 50]. The corresponding R code is described in Step 6a of Supplementary Content 1. The chosen ZCQ Symptom Severity ICC was 0.81 [51]. The SEM-based MCID was 1.9.
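The SEM calculation is a one-liner. An illustrative Python sketch (the supplement uses R); the baseline SD of 4.4 in the test below is a hypothetical value chosen only to reproduce an MCID near the reported 1.9 with the ICC of 0.81:

```python
import math

def sem_based_mcid(sd_baseline, icc, multiplier=1.0):
    """SEM = SD_baseline * sqrt(1 - ICC). 1 x SEM is the most common
    MCID proxy; 1.96 or 2.77 x SEM have also been used."""
    return multiplier * sd_baseline * math.sqrt(1.0 - icc)
```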
Effect size (ES) is a standardized measure of the strength of the relationship or difference between two variables [52]. It is described by Cohen et al. as the "degree to which the null hypothesis (there is no difference between the two groups) is false". It allows direct comparison between studies of different instruments with different units. There are multiple ways to calculate the ES, but for the purpose of MCID calculations, the ES represents the number of SDs by which the post-intervention score has changed from the baseline score. It is calculated with the following formula: ES = mean change score / SD of baseline scores [52].
According to Cohen et al., 0.2 is considered a small ES, 0.5 a moderate ES and 0.8 or more a large ES [53]. Most commonly, a change score with an ES of 0.2 is considered equivalent to the MCID [7, 16, 31, 54, 55, 56]. Using this method, we are essentially identifying the mean change score (in this case reflecting the MCID) that equates to an ES of 0.2: MCID = 0.2 × SD of baseline scores [7, 55].
Practically, if a patient experienced a small improvement in an outcome measure post-intervention, the ES will be smaller than for a patient who experienced a large improvement in the outcome measure. The corresponding R code is described in Step 6b of Supplementary Content 1. The ES-based MCID was 0.9.
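Rearranged for the MCID, the ES method only needs the baseline SD. An illustrative Python sketch (the supplement uses R; the baseline scores in the test are hypothetical):

```python
from statistics import stdev

def es_based_mcid(baseline_scores, es_threshold=0.2):
    """MCID = ES threshold x SD of baseline scores
    (0.2 corresponds to Cohen's 'small' effect)."""
    return es_threshold * stdev(baseline_scores)
```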
The standardized response mean (SRM) aims to gauge the responsiveness of an outcome similarly to the ES. Initially described by Cohen et al. as a derivative of the ES assessing differences of paired observations in a single sample and later renamed the SRM, it is also considered an "index of responsiveness" [38, 53]. However, the denominator is the SD of the change scores – not the SD of the baseline scores – while the numerator remains the average change score from baseline to follow-up: SRM = mean change score / SD of change scores [10, 45, 57, 58, 59].
Similarly to Cohen's rule for interpreting the ES, it has been theorized that responsiveness can be considered low if the SRM is 0.2–0.5, moderate if > 0.5–0.8 and large if > 0.8 [58, 59, 60]. Again, a change score equating to an SRM of 0.2 (although SRMs of 1/3 or 0.5 have also been proposed) can be considered the MCID, although studies have used the overall SRM as the MCID as well [45, 54, 56, 61]. However, since the SRM is a standardized index, similarly to the ES, the aim of the SRM-based method ought to be to identify a change score that indicates a responsiveness of 0.2: MCID = 0.2 × SD of change scores [61].
Similar to the ES-based method, the SRM-based approach for calculating the MCID is not commonly used in spine surgery studies [14]. It is a measure of responsiveness – the ability to detect change over time in the construct measured by the instrument – and ought therefore to be calculated for the study-specific change score rather than extrapolated as a "universal" MCID threshold to other studies. The corresponding R code is described in Step 6c of Supplementary Content 1. The SRM-based MCID was 0.8.
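The SRM and its MCID variant mirror the ES with a different denominator. An illustrative Python sketch (the supplement uses R; the change scores in the test are hypothetical):

```python
from statistics import mean, stdev

def srm(change_scores):
    """Standardized response mean: mean change / SD of the change scores."""
    return mean(change_scores) / stdev(change_scores)

def srm_based_mcid(change_scores, srm_threshold=0.2):
    """Change score equating to an SRM of 0.2 (other thresholds,
    e.g. 1/3 or 0.5, have also been proposed)."""
    return srm_threshold * stdev(change_scores)
```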
The limitations of using Methods (V) and (VI) in MCID calculations will be described later in the Discussion.
The standard deviation (SD) represents the average spread of individual data points around the mean value of the outcome measure. In their review of studies using the MCID in health-related quality of life instruments, Norman et al. found that most studies had an average ES of 0.5, which equated to a clinically meaningful change score of 0.5 × SD of the baseline score [7, 16, 30].
The corresponding R code is described in Step 6d of Supplementary Content 1. The SD-based MCID was 2.1.
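The half-SD rule is the simplest of the distribution-based methods. An illustrative Python sketch (the supplement uses R; the baseline scores in the test are hypothetical):

```python
from statistics import stdev

def half_sd_mcid(baseline_scores):
    """Norman et al.'s rule of thumb: MCID ~ 0.5 x SD of baseline scores."""
    return 0.5 * stdev(baseline_scores)
```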
The MDC is defined as the minimal change below which there is a 95% chance that the observed change is due to measurement error of the outcome measurement instrument: MDC95 = z × SEM × √2 [7, 61].
The value of z corresponds to the desired level of confidence, which for a 95% confidence level is 1.96. Although the MDC – like all distribution-based methods – does not consider whether a change is clinically meaningful, the calculated MCID should be at least the same as or greater than the MDC to enable distinguishing true mathematical change from measurement noise. The 95% MDC calculation is the most common distribution-based approach in spinal surgery, and it appears to most closely resemble anchor-derived MCID values, as demonstrated by Copay et al. [7, 14, 62]. The corresponding R code is described in Step 6e of Supplementary Content 1. The 95% MDC was 5.1.
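The MDC95 builds directly on the SEM. An illustrative Python sketch (the supplement uses R; the baseline SD of 4.4 in the test is hypothetical, chosen together with the ICC of 0.81 to land near the reported values):

```python
import math

def mdc95(sd_baseline, icc, z=1.96):
    """MDC95 = z * SEM * sqrt(2), with SEM = SD_baseline * sqrt(1 - ICC)."""
    sem = sd_baseline * math.sqrt(1.0 - icc)
    return z * sem * math.sqrt(2.0)
```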
Another, less frequently applied method through which "responders" and "non-responders" can be classified, but which does not rely on an external criterion, is the Reliable Change Index (RCI), also called the Jacobson–Truax index [63, 64]. It indicates whether an individual change score is statistically significantly greater than a change in score that could have occurred due to random measurement error alone [63].
In theory, a patient can be considered to experience a statistically reliable improvement (p < 0.05) if the individual RCI is > 1.96. Again, it does not reflect whether the change is clinically meaningful for the patient, but rather that the change should not be attributed to measurement error alone and likely has a component of true score change. Therefore, this method is discouraged in MCID calculations, as it relies on the statistical properties of the sample and not patient preferences – as all distribution-based methods do [65]. In the example of Bolton et al., who focused on the Bournemouth Questionnaire in patients with neck pain, the RCI was used to discriminate between "responders" and "non-responders"; the ROC analysis approach was then used to determine the MCID [64]. The corresponding R code is described in Step 6f of Supplementary Content 1. Again, the pROC package was used. The ROC-derived MCID was 2.5.
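The individual RCI can be sketched as follows; this is an illustrative Python version of the Jacobson–Truax formula (the supplement uses R), with hypothetical inputs in the test:

```python
import math

def reliable_change_index(change_score, sd_baseline, icc):
    """Jacobson-Truax RCI: individual change divided by the standard error
    of the difference, S_diff = sqrt(2) * SEM. |RCI| > 1.96 suggests the
    change exceeds what measurement error alone would explain (p < 0.05)."""
    sem = sd_baseline * math.sqrt(1.0 - icc)
    return change_score / (math.sqrt(2.0) * sem)
```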
Method (X): Calculating MCID through the anchor-based minimal important change (MIC) distribution model
In theory, combining anchor- and distribution-based methods could yield superior results. Suggestions include averaging the values of various methods, or requiring two different criteria to be met simultaneously (e.g., both an anchor-based criterion, such as a ROC-based MCID from patient satisfaction, and a 95% MDC-based MCID must be met to consider a patient as having achieved the MCID) [25]. In 2007, de Vet et al. introduced a new visual method of MCID calculation that does not only combine but also integrates anchor- and distribution-based calculations [25]. In addition, their method allows the calculation of an MCID both for improvement and for deterioration, as these can differ.
In short, using an anchor, patients are divided into three groups: "importantly improved", "not importantly changed" and "importantly deteriorated" (Fig. 2). The distributions of change scores, expressed in percentiles, for patients who "importantly improved", "importantly deteriorated" and were "not importantly changed" are then plotted on a graph. This is the anchor-based part of the approach, ensuring that the chosen MCID thresholds have clinical value.
Distribution of the Zurich Claudication Questionnaire Symptom Severity change scores for patients categorized as experiencing "important improvement", "no important change" or "important deterioration", with JOABPEQ Walking Ability as the anchor (Method (X)). For the ZCQ Symptom Severity score to improve, the actual value must decrease, explaining the negative values in the model. ROC, Receiver Operating Characteristic; ZCQ, Zurich Claudication Questionnaire; JOABPEQ, Japanese Orthopaedic Association Back Pain Evaluation Questionnaire
The second part of the approach is then entirely focused on the group of patients determined by the anchor to be “unchanged”, and can be either distribution- or anchor-based:
In the first and more anchor-based method, the ROC-based method described in Method (III) is applied to find the threshold for improvement (by finding the ROC-based threshold point that optimizes sensitivity and specificity of identifying improved vs unchanged patients) or for deterioration (by finding the ROC-based threshold point that optimizes sensitivity and specificity of identifying deteriorated vs unchanged patients). For example, the threshold for improvement is found by combining the improved and unchanged groups, and then testing out different thresholds for discriminating those two groups from each other. The optimal point on the resulting ROC curve based on the closest-to-(0,1)-criterion is then found.
In the second method, which is distribution-based, the upper 95% limit (for improvement) and the lower 95% limit (for deterioration) are found based solely on the group of patients determined to be unchanged. The following formula is used, adding the 1.645 × SD term for improvement and subtracting it for deterioration: limit = mean change score of the unchanged group ± 1.645 × SD of those change scores [25].
The corresponding R code can be found under Step 7a in Supplementary Content 1. The model is presented in Fig. 2. The 95% upper and lower limits were 4.1 and −7.2, respectively. The ROC-derived MCID using the RCI was −2.5 (important improvement vs. unchanged) and −0.5 (important deterioration vs. unchanged). For the purpose of the model, MCID values were not multiplied by −1 but remained in their original form.
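The distribution-based part of de Vet et al.'s model can be sketched as follows; an illustrative Python version (the supplement uses R), with hypothetical change scores for the "unchanged" group in the test:

```python
from statistics import mean, stdev

def mic_distribution_limits(unchanged_change_scores):
    """95% upper (improvement) and lower (deterioration) limits of the
    change scores of patients the anchor classed as 'not importantly
    changed': mean +/- 1.645 x SD of those change scores."""
    m = mean(unchanged_change_scores)
    s = stdev(unchanged_change_scores)
    return m + 1.645 * s, m - 1.645 * s
```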
In recent years, a simple 30% reduction from baseline values has been introduced as an alternative to MCID calculations [66]. It has been argued that absolute-point changes are difficult to interpret and have limited value in the context of "ceiling" and "floor" effects (i.e., values at the extreme ends of the measurement scale) [4]. To overcome this, Khan et al. found that a 30% reduction in PROMs has similar effectiveness to traditional anchor- or distribution-based methods in detecting patients with clinically meaningful differences after lumbar spine surgery [15]. The corresponding R code can be found under Step 7b in Supplementary Content 1.
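The percentage-reduction rule is easily expressed per patient. An illustrative Python sketch (the supplement uses R; the function name is ours), assuming a scale where lower scores mean improvement, as with pain NRS:

```python
def responder_30_percent(baseline, followup):
    """30%-reduction rule: responder if the score fell by at least 30%
    of the baseline value (lower score = better assumed)."""
    if baseline == 0:
        return False  # no reduction is possible from a floor score
    return (baseline - followup) / baseline >= 0.30
```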
The Delphi method is a systematic approach that uses the collective opinion of experts to establish consensus on a medical issue [67]. It has mostly been used to develop best-practice guidelines [68]. However, it can also aid MCID determination [69]. The method involves distributing questionnaires or surveys to a panel of experts. The anonymized answers are grouped together and shared again with the panel in subsequent rounds. This allows the experts to reflect on their opinions and consider the strengths and weaknesses of the others' responses. The process is repeated until consensus is reached. By ensuring anonymity, the process prevents potential bias linked to a specific participant's concern about their own opinion being identified or influenced by other personal factors [67].
The final approach is asking patients to compare themselves with other patients, which requires time and resources [70]. In a study by Redelmeier et al., patients with chronic obstructive pulmonary disease in a rehabilitation program were organized into small groups and observed each other on multiple occasions [70]. Additionally, each patient was paired with another participant and had a one-to-one interview with them discussing different aspects of their health. Finally, each patient anonymously rated themselves against their partner on a scale of "much better", "somewhat better", "a little bit better", "about the same", "a little bit worse", "somewhat worse" and "much worse". The MCID was then calculated based on the mean change score of patients who graded themselves as "a little bit better" (MCID for improvement) or "a little bit worse" (MCID for deterioration), as in the within-patient and between-patient change methods described in Methods (I) and (II) [70].
Over the years, it has been noted that MCID calculations based purely on distribution-based methods, or only on the group of patients rating themselves as "somewhat better" or "slightly better", do not necessarily constitute a change that patients would consider beneficial enough "to mandate, in the absence of troublesome side effects and excessive cost, to undergo the treatment again" [3, 24]. Therefore, the concept of substantial clinical benefit (SCB) has been introduced as a way of identifying a threshold of clinical success of an intervention, rather than a "floor" value for improvement – that is, the MCID [24]. For example, in Carreon et al., ROC-derived SCB "thresholds" were defined as a change score with equal sensitivity and specificity to distinguish "much better" from "somewhat better" patients after cervical spinal fusion [71]. Glassman et al., on the other hand, used ROC-derived SCB thresholds to discriminate between "much better" and "about the same" patients following lumbar spinal fusion. The authors stress that SCB and MCID are indeed separate entities, and one should not be used to derive the other [24]. Thus, while the methods to derive SCB and MCID thresholds can be carried out similarly based on anchors, the ultimate goals of applying SCB versus MCID are different.
Using the various methods explained above, the MCID for improvement for the ZCQ Symptom Severity domain ranged overall from 0.8 to 5.1 (Table 1). Here, readers can check their obtained results for correctness. On average, distribution-based MCID values were lower than anchor-based MCID values. Within the distribution-based approach, Method (VIII), "minimum detectable change", resulted in an MCID of 5.1, which exceeded the MCIDs derived using the "gold-standard" anchor-based approaches. The average MCID based on the anchors NRS Leg Pain and JOABPEQ Walking Ability was 3.1 and 2.8, respectively. Depending on the method used, the percentage of responders to the home exercise (HE) and physical therapy (PT) interventions ranged from 9.5% for the "30% reduction from baseline" method to 61.9% using the ES- and SRM-based methods (Table 2). Method (X) is graphically presented in Fig. 2.
As demonstrated above, the MCID depends on the methodology and the chosen anchor, highlighting the necessity of careful preparation in MCID calculations. The lowest MCID, 0.8, was calculated with Method (VI), the SRM-based approach. Logically, if a patient on average had a baseline ZCQ Symptom Severity score of 23.2, an improvement of 0.8 is unlikely to be clinically meaningful, even if rounded up. It rather informs on the measurement error properties of our instrument, as explained by COSMIN. Additionally, the distribution-based methods rely on the statistical properties of the sample, which vary from cohort to cohort, making them generalizable only to patient groups with a similar SD and not applicable to others with a different spread of data [52]. Not surprisingly, the anchor-based methods considering patient preferences yielded on average higher MCID values than the distribution-based methods, which again varied from anchor to anchor. The mean MCID for improvement calculated for NRS Leg Pain was 3.1, while for JOABPEQ Walking Ability it was 2.8 – such similar values underline the importance of selecting responsive anchors with at least moderate correlations. Despite assessing different aspects of LSS disease, the MCID remained comparable in this specific case.
Interestingly, Method (VIII), the MDC, yielded the highest value of 5.1, exceeding the "gold-standard" ROC-derived MCID. This suggests that, in this example, using this ROC-derived MCID in clinical practice would be illogical, as the value falls within the measurement error determined by the MDC. Here it would be appropriate to choose the MDC approach as the MCID. Interestingly, the ROC-derived MCID based on a global assessment rating-like stratification of patients by their JOABPEQ Walking Ability (Method (X)) was higher than that in Method (III). This may be attributed to a more balanced distribution of "responders" and "non-responders" (only unchanged patients) in Method (X), unlike in Method (III), where patients were strictly categorized into "responders" and "non-responders" (including both deteriorated and unchanged). This further highlights the importance of using global assessment rating-type scales in determining the extent of clinical benefit.
Although ES-based (Method (V)) and SRM-based (Method (VI)) MCID calculations have been described in the literature, the ES and SRM were originally created to quantify the strength of a relationship between the scores of two samples (in the case of the ES) or the change score of paired observations in one sample (in the case of the SRM) [53, 58, 59]. They do offer an alternative for MCID calculations; however, verification with other MCID calculation methods, ideally anchor-based, is strongly recommended. As seen in this case study and in other MCIDs derived similarly, they often result in small estimates [7, 55]. There is also no consensus regarding the choice of the SD of the change scores vs. the SD of the baseline scores as the denominator. Additionally, whether the calculated MCID (mean change score) should correspond to an ES of 0.2, indicating a small effect, or 0.5, suggesting a moderate effect, is currently arbitrary and often relies on the researcher's preference [53, 55, 59]. Both the ES and SRM can be used to assess whether the overall change score observed in a single study is suggestive of a clinically meaningful benefit in that specific cohort or, in the case of the SRM, whether the outcome measure is responsive. However, it is our perspective that extending such a value as an "MCID" from one study to another is not recommended.
One can argue whether there is even a place for distribution-based methods in MCID calculations. They ultimately fail to provide an MCID value that meets the original definition of Jaeschke et al. of the "smallest change in the outcome that the patient would identify as important". At no point are patients asked what constitutes a meaningful change for them, and the value is derived solely from the statistical properties of the sample [1]. Nevertheless, conducting MCID studies implementing scales such as global assessment ratings is time-consuming, and performing such studies for each patient outcome and each disease is likely not feasible. Distribution-based methods still have some merit in that they – like the 95% MDC method – can help distinguish measurement noise and inaccuracy from true change. Even if anchor-based methods should probably be used to define MCID thresholds, they ought to be supported by a calculation of the MDC so that it can be decided whether the chosen threshold makes sense mathematically (i.e., can reliably be distinguished from measurement inaccuracies), as seen in our case study.
Previously, MCID thresholds for outcome measurement instruments were calculated for generic populations, such as patients suffering from low back pain. More recently, MCID values for commonly used PROMs in spine surgery, such as the ODI, RMDQ or NRS, have been calculated for more narrowly defined diagnoses, such as lumbar disc herniation (LDH) or LSS. The question arises as to whether a separate MCID is needed for each of the different spinal conditions. In general, establishing an MCID specific to these patient groups is only recommended if these patients' perception of meaningful change differs from that of patients with low back pain in general. Importantly, again, the MCID should not be treatment-specific, but rather broadly disease-specific. Therefore, it is advisable to use an MCID based on patients whose disease characteristics are most similar to those of our cohort. For example, an MCID for NRS Back Pain based on a study group composed of different types of lumbar degenerative disease may, in some cases, be applied to a study cohort composed solely of patients with LDH. However, no such extrapolation should be performed for populations with back pain secondary to malignancy, due to a totally different pathogenesis and associated symptoms, such as fatigue or anorexia, that may influence the ability to detect a clinically meaningful change in NRS Back Pain.
Regardless of robust methodology, it can be expected that it is impossible to obtain the same MCID on different occasions, even in the same population, due to the inherent subjectivity of what is perceived as "clinically beneficial" and day-to-day symptom fluctuation. Moreover, it has been found that patients with worse baseline scores, reflecting e.g., more advanced disease, require a greater overall change at follow-up to report it as clinically meaningful [72]. One should also be mindful of "regression to the mean", where extremely high- or low-scoring patients subsequently score closer to the mean at the second measurement [73]. Therefore, adequate cohort characteristics need to be presented for readers to judge how generalizable the MCID may be to their study cohort. If a patient pre-operatively experiences an NRS Leg Pain score of 1, and the MCID is 1.6, they cannot achieve the MCID at all, as the maximum possible change score is smaller than the MCID threshold ("floor effect"). A similar situation can occur with patients closer to the higher end of the scale ("ceiling effect"). The general rule is that if at least 15% of the study cohort has the highest or lowest possible score for a given outcome instrument, one can expect significant "ceiling/floor effects" [50]. One way to overcome this is to convert absolute MCID scores to percentage change scores [4, 45]. However, percentage change scores only account for high baseline scores if high baseline scores indicate larger disability (as with the ODI) and allow the possibility of a larger change. If a high score on an instrument reflects better health status (as with the SF-36), then percentage change scores will increase the association with the baseline score [4].
In general, it is important to consider which patients to exclude from certain analyses when applying the MCID: for example, patients without relevant disease preoperatively (e.g., those already in a so-called "patient acceptable symptom state", PASS) should probably be excluded altogether when reporting the percentage of patients achieving the MCID [74].
Establishing reliable MCID thresholds is key in clinical research and forms the basis of patient-centered treatment evaluations using patient-reported outcome measures or objective functional tests. Calculation of MCID thresholds can be achieved using a variety of different methods, each yielding markedly different results, as demonstrated in this practical guide. Generally, anchor-based methods relying on scales assessing patient preferences/satisfaction or global assessment ratings remain the "gold-standard" approach, the most common being ROC analysis. In the absence of appropriate anchors, a distribution-based MCID using the 95% MDC approach is acceptable, as it appears to yield the results most similar to those of anchor-based approaches. Moreover, we recommend using it as a supplement to any anchor-based MCID threshold to check whether that threshold can reliably distinguish true change from measurement inaccuracies. The explanations provided in this practical guide, with step-by-step examples along with public data and statistical code, can serve as guidance for future studies calculating MCID thresholds.
Jaeschke R, Singer J, Guyatt GH (1989) Measurement of health status. Ascertaining the minimal clinically important difference. Control Clin Trials 10:407–415. https://doi.org/10.1016/0197-2456(89)90005-6
Concato J, Hartigan JA (2016) P values: from suggestion to superstition. J Investig Med 64:1166. https://doi.org/10.1136/jim-2016-000206
Zannikos S, Lee L, Smith HE (2014) Minimum clinically important difference and substantial clinical benefit: Does one size fit all diagnoses and patients? Semin Spine Surg 26:8–11. https://doi.org/10.1053/j.semss.2013.07.004
Copay AG, Subach BR, Glassman SD et al (2007) Understanding the minimum clinically important difference: a review of concepts and methods. Spine J 7:541–546. https://doi.org/10.1016/j.spinee.2007.01.008
Lanario J, Hyland M, Menzies-Gow A et al (2020) Is the minimally clinically important difference (MCID) fit for purpose? A planned study using the SAQ. Eur Respir J. https://doi.org/10.1183/13993003.congress-2020.2241
Neely JG, Karni RJ, Engel SH, Fraley PL, Nussenbaum B, Paniello RC (2007) Practical guides to understanding sample size and minimal clinically important difference (MCID). Otolaryngol Head Neck Surg 136(1):14–18. https://doi.org/10.1016/j.otohns.2006.11.001
Copay AG, Glassman SD, Subach BR et al (2008) Minimum clinically important difference in lumbar spine surgery patients: a choice of methods using the Oswestry disability index, medical outcomes study questionnaire short form 36, and pain scales. Spine J 8:968–974. https://doi.org/10.1016/j.spinee.2007.11.006
Andersson EI, Lin CC, Smeets RJ (2010) Performance tests in people with chronic low back pain: responsiveness and minimal clinically important change. Spine 35(26):E1559-1563. https://doi.org/10.1097/BRS.0b013e3181cea12e
Mannion AF, Porchet F, Kleinstück FS, Lattig F, Jeszenszky D, Bartanusz V, Dvorak J, Grob D (2009) The quality of spine surgery from the patient’s perspective. Part 1: the core outcome measures index in clinical practice. Eur Spine J 18:367–373. https://doi.org/10.1007/s00586-009-0942-8
Crosby RD, Kolotkin RL, Williams GR (2003) Defining clinically meaningful change in health-related quality of life. J Clin Epidemiol 56:395–407. https://doi.org/10.1016/S0895-4356(03)00044-1
Gatchel RJ, Mayer TG (2010) Testing minimal clinically important difference: consensus or conundrum? Spine J 10:321–327. https://doi.org/10.1016/j.spinee.2009.10.015
Minetama M, Kawakami M, Teraguchi M et al (2019) Supervised physical therapy vs. home exercise for patients with lumbar spinal stenosis: a randomized controlled trial. Spine J 19:1310–1318. https://doi.org/10.1016/j.spinee.2019.04.009
R Core Team (2021) R A Language and Environment for Statistical Computing
Chung AS, Copay AG, Olmscheid N, Campbell D, Walker JB, Chutkan N (2017) Minimum clinically important difference: current trends in the spine literature. Spine 42(14):1096–1105. https://doi.org/10.1097/BRS.0000000000001990
Khan I, Pennings JS, Devin CJ, Asher AM, Oleisky ER, Bydon M, Asher AL, Archer KR (2021) Clinically meaningful improvement following cervical spine surgery: 30% reduction versus absolute point-change MCID values. Spine 46(11):717–725. https://doi.org/10.1097/BRS.0000000000003887
Gautschi OP, Stienen MN, Corniola MV et al (2016) Assessment of the minimum clinically important difference in the timed up and go test after surgery for lumbar degenerative disc disease. Neurosurgery. https://doi.org/10.1227/NEU.0000000000001320
Kulkarni AV (2006) Distribution-based and anchor-based approaches provided different interpretability estimates for the hydrocephalus outcome questionnaire. J Clin Epidemiol 59:176–184. https://doi.org/10.1016/j.jclinepi.2005.07.011
Wang Y, Devji T, Qasim A et al (2022) A systematic survey identified methodological issues in studies estimating anchor-based minimal important differences in patient-reported outcomes. J Clin Epidemiol 142:144–151. https://doi.org/10.1016/j.jclinepi.2021.10.028
Parker SL, Godil SS, Shau DN et al (2013) Assessment of the minimum clinically important difference in pain, disability, and quality of life after anterior cervical discectomy and fusion: clinical article. J Neurosurg Spine 18:154–160. https://doi.org/10.3171/2012.10.SPINE12312
Carrasco-Labra A, Devji T, Qasim A et al (2021) Minimal important difference estimates for patient-reported outcomes: a systematic survey. J Clin Epidemiol 133:61–71. https://doi.org/10.1016/j.jclinepi.2020.11.024
Prinsen CAC, Mokkink LB, Bouter LM et al (2018) COSMIN guideline for systematic reviews of patient-reported outcome measures. Qual Life Res 27:1147–1157. https://doi.org/10.1007/s11136-018-1798-3
Article CAS PubMed PubMed Central Google Scholar
Mokkink LB, de Vet HCW, Prinsen CAC et al (2018) COSMIN risk of bias checklist for systematic reviews of patient-reported outcome measures. Qual Life Res 27:1171–1179. https://doi.org/10.1007/s11136-017-1765-4
Terwee CB, Prinsen CAC, Chiarotto A et al (2018) COSMIN methodology for evaluating the content validity of patient-reported outcome measures: a Delphi study. Qual Life Res 27:1159–1170. https://doi.org/10.1007/s11136-018-1829-0
Glassman SD, Copay AG, Berven SH et al (2008) Defining substantial clinical benefit following lumbar spine arthrodesis. J Bone Joint Surg Am 90:1839–1847. https://doi.org/10.2106/JBJS.G.01095
de Vet HCW, Ostelo RWJG, Terwee CB et al (2007) Minimally important change determined by a visual method integrating an anchor-based and a distribution-based approach. Qual Life Res 16:131–142. https://doi.org/10.1007/s11136-006-9109-9
Solberg T, Johnsen LG, Nygaard ØP, Grotle M (2013) Can we define success criteria for lumbar disc surgery? Acta Orthop 84:196–201. https://doi.org/10.3109/17453674.2013.786634
Power JD, Perruccio AV, Canizares M et al (2023) Determining minimal clinically important difference estimates following surgery for degenerative conditions of the lumbar spine: analysis of the Canadian spine outcomes and research network (CSORN) registry. The Spine Journal 23:1323–1333. https://doi.org/10.1016/j.spinee.2023.05.001
Asher AL, Kerezoudis P, Mummaneni PV et al (2018) Defining the minimum clinically important difference for grade I degenerative lumbar spondylolisthesis: insights from the quality outcomes database. Neurosurg Focus 44:E2. https://doi.org/10.3171/2017.10.FOCUS17554
Cleland JA, Whitman JM, Houser JL et al (2012) Psychometric properties of selected tests in patients with lumbar spinal stenosis. Spine J 12:921–931. https://doi.org/10.1016/j.spinee.2012.05.004
Norman GR, Sloan JA, Wyrwich KW (2003) Interpretation of changes in health-related quality of life: the remarkable universality of half a standard deviation. Med Care 41:582–592. https://doi.org/10.1097/01.MLR.0000062554.74615.4C
Parker SL, Mendenhall SK, Shau DN et al (2012) Minimum clinically important difference in pain, disability, and quality of life after neural decompression and fusion for same-level recurrent lumbar stenosis: understanding clinical versus statistical significance. J Neurosurg Spine 16:471–478. https://doi.org/10.3171/2012.1.SPINE11842
Gatchel RJ, Mayer TG, Chou R (2012) What does/should the minimum clinically important difference measure?: a reconsideration of its clinical value in evaluating efficacy of lumbar fusion surgery. Clin J Pain 28:387. https://doi.org/10.1097/AJP.0b013e3182327f20
Lloyd H, Jenkinson C, Hadi M et al (2014) Patient reports of the outcomes of treatment: a structured review of approaches. Health Qual Life Outcomes 12:5. https://doi.org/10.1186/1477-7525-12-5
Beighley A, Zhang A, Huang B et al (2022) Patient-reported outcome measures in spine surgery: a systematic review. J Craniovertebr Junction Spine 13:378–389. https://doi.org/10.4103/jcvjs.jcvjs_101_22
Ogura Y, Ogura K, Kobayashi Y et al (2020) Minimum clinically important difference of major patient-reported outcome measures in patients undergoing decompression surgery for lumbar spinal stenosis. Clin Neurol Neurosurg 196:105966. https://doi.org/10.1016/j.clineuro.2020.105966
Wang Y, Devji T, Carrasco-Labra A et al (2023) An extension minimal important difference credibility item addressing construct proximity is a reliable alternative to the correlation item. J Clin Epidemiol 157:46–52. https://doi.org/10.1016/j.jclinepi.2023.03.001
Devji T, Carrasco-Labra A, Qasim A et al (2020) Evaluating the credibility of anchor based estimates of minimal important differences for patient reported outcomes: instrument development and reliability study. BMJ 369:m1714. https://doi.org/10.1136/bmj.m1714
Stucki G, Daltroy L, Liang MH et al (1996) Measurement properties of a self-administered outcome measure in lumbar spinal stenosis. Spine 21:796
Fujimori T, Ikegami D, Sugiura T, Sakaura H (2022) Responsiveness of the Zurich claudication questionnaire, the Oswestry disability index, the Japanese orthopaedic association back pain evaluation questionnaire, the 8-item short form health survey, and the Euroqol 5 dimensions 5 level in the assessment of patients with lumbar spinal stenosis. Eur Spine J 31:1399–1412. https://doi.org/10.1007/s00586-022-07236-5
Fukui M, Chiba K, Kawakami M et al (2009) JOA back pain evaluation questionnaire (JOABPEQ)/ JOA cervical myelopathy evaluation questionnaire (JOACMEQ) the report on the development of revised versions April 16, 2007: the subcommittee of the clinical outcome committee of the Japanese orthopaedic association on low back pain and cervical myelopathy evaluation. J Orthop Sci 14:348–365. https://doi.org/10.1007/s00776-009-1337-8
Kasai Y, Fukui M, Takahashi K et al (2017) Verification of the sensitivity of functional scores for treatment results–substantial clinical benefit thresholds for the Japanese orthopaedic association back pain evaluation questionnaire (JOABPEQ). J Orthop Sci 22:665–669. https://doi.org/10.1016/j.jos.2017.02.012
Glassman SD, Carreon LY, Anderson PA, Resnick DK (2011) A diagnostic classification for lumbar spine registry development. Spine J 11:1108–1116. https://doi.org/10.1016/j.spinee.2011.11.016
Perkins NJ, Schisterman EF (2006) The inconsistency of “optimal” cutpoints obtained using two criteria based on the receiver operating characteristic curve. Am J Epidemiol 163:670–675. https://doi.org/10.1093/aje/kwj063
Nahm FS (2022) Receiver operating characteristic curve: overview and practical use for clinicians. Korean J Anesthesiol 75:25–36. https://doi.org/10.4097/kja.21209
Angst F, Aeschlimann A, Angst J (2017) The minimal clinically important difference raised the significance of outcome effects above the statistical level, with methodological implications for future studies. J Clin Epidemiol 82:128–136. https://doi.org/10.1016/j.jclinepi.2016.11.016
Wyrwich KW, Tierney WM, Wolinsky FD (1999) Further evidence supporting an SEM-based criterion for identifying meaningful intra-individual changes in health-related quality of life. J Clin Epidemiol 52:861–873. https://doi.org/10.1016/s0895-4356(99)00071-2
Wolinsky FD, Wan GJ, Tierney WM (1998) Changes in the SF-36 in 12 months in a clinical sample of disadvantaged older adults. Med Care 36:1589–1598
Wyrwich KW, Nienaber NA, Tierney WM, Wolinsky FD (1999) Linking clinical relevance and statistical significance in evaluating intra-individual changes in health-related quality of life. Med Care 37:469–478. https://doi.org/10.1097/00005650-199905000-00006
Koo TK, Li MY (2016) A guideline of selecting and reporting intraclass correlation coefficients for reliability research. J Chiropr Med 15:155–163. https://doi.org/10.1016/j.jcm.2016.02.012
McHorney CA, Tarlov AR (1995) Individual-patient monitoring in clinical practice: are available health status surveys adequate? Qual Life Res 4:293–307
Hara N, Matsudaira K, Masuda K et al (2016) Psychometric assessment of the Japanese version of the Zurich claudication questionnaire (ZCQ): reliability and validity. PLoS ONE 11:e0160183. https://doi.org/10.1371/journal.pone.0160183
Kazis LE, Anderson JJ, Meenan RF (1989) Effect sizes for interpreting changes in health status. Med Care 27:S178–S189. https://doi.org/10.1097/00005650-198903001-00015
Cohen J (1988) Statistical power analysis for the behavioral sciences. L Erlbaum Associates, Hillsdale, NJ
Franceschini M, Boffa A, Pignotti E et al (2023) The minimal clinically important difference changes greatly based on the different calculation methods. Am J Sports Med 51:1067–1073. https://doi.org/10.1177/03635465231152484
Samsa G, Edelman D, Rothman ML et al (1999) Determining clinically important differences in health status measures: a general approach with illustration to the health utilities index mark II. Pharmacoeconomics 15:141–155. https://doi.org/10.2165/00019053-199915020-00003
Wright A, Hannon J, Hegedus EJ, Kavchak AE (2012) Clinimetrics corner: a closer look at the minimal clinically important difference (MCID). J Man Manip Ther 20:160–166. https://doi.org/10.1179/2042618612Y.0000000001
Stucki G, Liang MH, Fossel AH, Katz JN (1995) Relative responsiveness of condition-specific and generic health status measures in degenerative lumbar spinal stenosis. J Clin Epidemiol 48:1369–1378. https://doi.org/10.1016/0895-4356(95)00054-2
Liang MH, Fossel AH, Larson MGS (1990) Comparisons of five health status instruments for orthopedic evaluation. Med Care 28:632–642
Middel B, Van Sonderen E (2002) Statistical significant change versus relevant or important change in (quasi) experimental design: some conceptual and methodological problems in estimating magnitude of intervention-related change in health services research. Int J Integr care. https://doi.org/10.5334/ijic.65
Revicki D, Hays RD, Cella D, Sloan J (2008) Recommended methods for determining responsiveness and minimally important differences for patient-reported outcomes. J Clin Epidemiol 61:102–109. https://doi.org/10.1016/j.jclinepi.2007.03.012
Woaye-Hune P, Hardouin J-B, Lehur P-A et al (2020) Practical issues encountered while determining minimal clinically important difference in patient-reported outcomes. Health Qual Life Outcomes 18:156. https://doi.org/10.1186/s12955-020-01398-w
Parker SL, Mendenhall SK, Shau D et al (2012) Determination of minimum clinically important difference in pain, disability, and quality of life after extension of fusion for adjacent-segment disease. J Neurosurg Spine 16:61–67. https://doi.org/10.3171/2011.8.SPINE1194
Jacobson NS, Truax P (1991) Clinical significance: a statistical approach to defining meaningful change in psychotherapy research. J Consult Clin Psychol 59:12–19
Bolton JE (2004) Sensitivity and specificity of outcome measures in patients with neck pain: detecting clinically significant improvement. Spine 29(21):2410–2417. https://doi.org/10.1097/01.brs.0000143080.74061.25
Blampied NM (2022) Reliable change and the reliable change index: Still useful after all these years? Cogn Behav Ther 15:e50. https://doi.org/10.1017/S1754470X22000484
Asher AM, Oleisky ER, Pennings JS et al (2020) Measuring clinically relevant improvement after lumbar spine surgery: Is it time for something new? Spine J 20:847–856. https://doi.org/10.1016/j.spinee.2020.01.010
Barrett D, Heale R (2020) What are Delphi studies? Evid Based Nurs 23:68–69. https://doi.org/10.1136/ebnurs-2020-103303
Droeghaag R, Schuermans VNE, Hermans SMM et al (2021) Evidence-based recommendations for economic evaluations in spine surgery: study protocol for a Delphi consensus. BMJ Open 11:e052988. https://doi.org/10.1136/bmjopen-2021-052988
Henderson EJ, Morgan GS, Amin J et al (2019) The minimum clinically important difference (MCID) for a falls intervention in Parkinson’s: a delphi study. Parkinsonism Relat Disord 61:106–110. https://doi.org/10.1016/j.parkreldis.2018.11.008
Redelmeier DA, Guyatt GH, Goldstein RS (1996) Assessing the minimal important difference in symptoms: a comparison of two techniques. J Clin Epidemiol 49:1215–1219. https://doi.org/10.1016/s0895-4356(96)00206-5
Carreon LY, Glassman SD, Campbell MJ, Anderson PA (2010) Neck disability index, short form-36 physical component summary, and pain scales for neck and arm pain: the minimum clinically important difference and substantial clinical benefit after cervical spine fusion. Spine J 10:469–474. https://doi.org/10.1016/j.spinee.2010.02.007
Wang Y-C, Hart DL, Stratford PW, Mioduski JE (2011) Baseline dependency of minimal clinically important improvement. Phys Ther 91:675–688. https://doi.org/10.2522/ptj.20100229
Tenan MS, Simon JE, Robins RJ et al (2021) Anchored minimal clinically important difference metrics: considerations for bias and regression to the mean. J Athl Train 56:1042–1049. https://doi.org/10.4085/1062-6050-0368.20
Staartjes VE, Stumpo V, Ricciardi L et al (2022) FUSE-ML: development and external validation of a clinical prediction model for mid-term outcomes after lumbar spinal fusion for degenerative disease. Eur Spine J 31:2629–2638. https://doi.org/10.1007/s00586-022-07135-9
Open access funding provided by University of Zurich. This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.
Authors and affiliations.
Department of Neurosurgery, Amsterdam UMC, Vrije Universiteit Amsterdam, Amsterdam Movement Sciences, Amsterdam, The Netherlands
Anita M. Klukowska & W. Peter Vandertop
Department of Neurosurgery, University Clinical Hospital of Bialystok, Bialystok, Poland
Anita M. Klukowska
Department of Neurosurgery, Park Medical Center, Rotterdam, The Netherlands
Marc L. Schröder
Machine Intelligence in Clinical Neuroscience and Microsurgical Neuroanatomy (MICN) Laboratory, Department of Neurosurgery, Clinical Neuroscience Center, University Hospital Zurich, University of Zurich, Zurich, Switzerland
Victor E. Staartjes
Correspondence to Victor E. Staartjes.
Conflict of interest.
The authors declare that the article and its content were composed in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Publisher's note.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
Klukowska, A.M., Vandertop, W.P., Schröder, M.L. et al. Calculation of the minimum clinically important difference (MCID) using different methodologies: case study and practical guide. Eur Spine J (2024). https://doi.org/10.1007/s00586-024-08369-5
Received: 03 May 2024
Revised: 17 May 2024
Accepted: 10 June 2024
Published: 28 June 2024
Sarah Crowe
1 Division of Primary Care, The University of Nottingham, Nottingham, UK
2 Centre for Population Health Sciences, The University of Edinburgh, Edinburgh, UK
3 School of Health in Social Science, The University of Edinburgh, Edinburgh, UK
Aziz Sheikh
The case study approach allows in-depth, multi-faceted explorations of complex issues in their real-life settings. The value of the case study approach is well recognised in the fields of business, law and policy, but somewhat less so in health services research. Based on our experiences of conducting several health-related case studies, we reflect on the different types of case study design, the specific research questions this approach can help answer, the data sources that tend to be used, and the particular advantages and disadvantages of employing this methodological approach. The paper concludes with key pointers to aid those designing and appraising proposals for conducting case study research, and a checklist to help readers assess the quality of case study reports.
The case study approach is particularly useful to employ when there is a need to obtain an in-depth appreciation of an issue, event or phenomenon of interest, in its natural real-life context. Our aim in writing this piece is to provide insights into when to consider employing this approach and an overview of key methodological considerations in relation to the design, planning, analysis, interpretation and reporting of case studies.
The illustrative 'grand round', 'case report' and 'case series' have a long tradition in clinical practice and research. Presenting detailed critiques, typically of one or more patients, aims to provide insights into aspects of the clinical case and, in doing so, illustrate broader lessons that may be learnt. In research, the conceptually-related case study approach can be used, for example, to describe in detail a patient's episode of care, explore professional attitudes to and experiences of a new policy initiative or service development, or more generally to 'investigate contemporary phenomena within its real-life context'[1]. Based on our experiences of conducting a range of case studies, we reflect on when to consider using this approach, discuss the key steps involved and illustrate, with examples, some of the practical challenges of attaining an in-depth understanding of a 'case' as an integrated whole. In keeping with previously published work, we acknowledge the importance of theory to underpin the design, selection, conduct and interpretation of case studies[2]. In so doing, we make passing reference to the different epistemological approaches used in case study research by key theoreticians and methodologists in this field of enquiry.
This paper is structured around the following main questions: What is a case study? What are case studies used for? How are case studies conducted? What are the potential pitfalls and how can these be avoided? We draw in particular on four of our own recently published examples of case studies (see Tables 1, 2, 3 and 4) and those of others to illustrate our discussion[3-7].
Table 1. Example of a case study investigating the reasons for differences in recruitment rates of minority ethnic people in asthma research[3]

Context: Minority ethnic people experience considerably greater morbidity from asthma than the White majority population. Research has shown, however, that these minority ethnic populations are likely to be under-represented in research undertaken in the UK; there is comparatively less marginalisation in the US.

Aim: To investigate approaches to bolster recruitment of South Asians into UK asthma studies through qualitative research with US and UK researchers, and UK community leaders.

Study design: Single intrinsic case study.

Case: Centred on the issue of recruitment of South Asian people with asthma.

Data collection: In-depth interviews were conducted with asthma researchers from the UK and US. A supplementary questionnaire was also provided to researchers.

Analysis: Framework approach.

Key findings: Barriers to ethnic minority recruitment were found to centre around:

1. The attitudes of the researchers towards inclusion: The majority of UK researchers interviewed were generally supportive of the idea of recruiting ethnically diverse participants but expressed major concerns about the practicalities of achieving this; in contrast, the US researchers appeared much more committed to the policy of inclusion.

2. Stereotypes and prejudices: We found that some of the UK researchers' perceptions of ethnic minorities may have influenced their decisions on whether to approach individuals from particular ethnic groups. These stereotypes centred on issues to do with, amongst others, language barriers and lack of altruism.

3. Demographic, political and socioeconomic contexts of the two countries: Researchers suggested that the demographic profile of ethnic minorities, their political engagement and the different configuration of the health services in the UK and the US may have contributed to differential rates.

4. Above all, however, it appeared that the overriding importance of the US National Institutes of Health's policy to mandate the inclusion of minority ethnic people (and women) had a major impact on shaping the attitudes, and in turn the experiences, of US researchers; the absence of any similar mandate in the UK meant that UK-based researchers had not been forced to challenge their existing practices and were hence unable to overcome any stereotypical or prejudicial attitudes through experiential learning.
Table 2. Example of a case study investigating the process of planning and implementing a service in Primary Care Organisations[4]

Context: Health workforces globally need to reorganise and reconfigure in order to meet the challenges posed by the increased numbers of people living with long-term conditions in an efficient and sustainable manner. Through studying the introduction of General Practitioners with a Special Interest in respiratory disorders, this study aimed to provide insights into this important issue by focusing on community respiratory service development.

Aim: To understand and compare the process of workforce change in respiratory services and the impact on patient experience (specifically in relation to the role of general practitioners with special interests) in a theoretically selected sample of Primary Care Organisations (PCOs), in order to derive models of good practice in planning and the implementation of a broad range of workforce issues.

Study design: Multiple-case design of respiratory services in health regions in England and Wales.

Case: Four PCOs.

Data collection: Face-to-face and telephone interviews, e-mail discussions, local documents, patient diaries, news items identified from local and national websites, national workshop.

Analysis: Reading, coding and comparison progressed iteratively.

Key findings:

1. In the screening phase of this study (which involved semi-structured telephone interviews with the person responsible for driving the reconfiguration of respiratory services in 30 PCOs), the barriers of financial deficit, organisational uncertainty, disengaged clinicians and contradictory policies proved insurmountable for many PCOs seeking to develop sustainable services. A key rationale for PCO reorganisation in 2006 was to strengthen their commissioning function, and that of clinicians, through Practice-Based Commissioning. However, the turbulence which surrounded reorganisation was found to have the opposite of the desired effect.

2. Implementing workforce reconfiguration was strongly influenced by the negotiation and contest among local clinicians and managers about "ownership" of work and income.

3. Despite the intention to make the commissioning system more transparent, personal relationships based on common professional interests, past work history, friendships and collegiality remained key drivers of sustainable innovation in service development.

Limitations: It was only possible to undertake in-depth work in a selective number of PCOs and, even within these selected PCOs, it was not possible to interview all informants of potential interest and/or obtain all relevant documents. This work was conducted in the early stages of a major NHS reorganisation in England and Wales and thus events are likely to have continued to evolve beyond the study period; we therefore cannot claim to have seen any of the stories through to their conclusion.
Table 3. Example of a case study investigating the introduction of electronic health records[5]

Context: Healthcare systems globally are moving from paper-based record systems to electronic health record systems. In 2002, the NHS in England embarked on the most ambitious and expensive IT-based transformation in healthcare in history, seeking to introduce electronic health records into all hospitals in England by 2010.

Aim: To describe and evaluate the implementation and adoption of detailed electronic health records in secondary care in England and thereby provide formative feedback for local and national rollout of the NHS Care Records Service.

Study design: A mixed methods, longitudinal, multi-site, socio-technical collective case study.

Case: Five NHS acute hospital and mental health Trusts that have been the focus of early implementation efforts.

Data collection: Semi-structured interviews, documentary data and field notes, observations and quantitative data.

Analysis: Qualitative data were analysed thematically using a socio-technical coding matrix, combined with additional themes that emerged from the data.

Key findings:

1. Hospital electronic health record systems have developed and been implemented far more slowly than was originally envisioned.

2. The top-down, government-led standardised approach needed to evolve to admit more variation and greater local choice for hospitals in order to support local service delivery.

3. A range of adverse consequences were associated with the centrally negotiated contracts, which excluded the hospitals in question.

4. The unrealistic, politically driven timeline (implementation over 10 years) was found to be a major source of frustration for developers, implementers, and healthcare managers and professionals alike.

Limitations: We were unable to access details of the contracts between government departments and the Local Service Providers responsible for delivering and implementing the software systems. This, in turn, made it difficult to develop a holistic understanding of some key issues impacting on the overall slow roll-out of the NHS Care Records Service. Early adopters may also have differed in important ways from NHS hospitals that planned to join the National Programme for Information Technology and implement the NHS Care Records Service at a later point in time.
Table 4. Example of a case study investigating the formal and informal ways students learn about patient safety[6]

Context: There is a need to reduce the disease burden associated with iatrogenic harm, and considering that healthcare education represents perhaps the most sustained patient safety initiative ever undertaken, it is important to develop a better appreciation of the ways in which undergraduate and newly qualified professionals receive and make sense of the education they receive.

Aim: To investigate the formal and informal ways pre-registration students from a range of healthcare professions (medicine, nursing, physiotherapy and pharmacy) learn about patient safety in order to become safe practitioners.

Study design: Multi-site, mixed method collective case study.

Case: Eight case studies (two for each professional group) were carried out in educational provider sites considering different programmes, practice environments and models of teaching and learning.

Data collection: Structured in phases relevant to the three knowledge contexts:

- Documentary evidence (including undergraduate curricula, handbooks and module outlines), complemented with a range of views (from course leads, tutors and students) and observations in a range of academic settings.

- Policy and management views of patient safety and influences on patient safety education and practice. NHS policies included, for example, implementation of the National Patient Safety Agency's , which encourages organisations to develop an organisational safety culture in which staff members feel comfortable identifying dangers and reporting hazards.

- The cultures to which students are exposed, i.e. patient safety in relation to day-to-day working. NHS initiatives included, for example, a hand-washing initiative or the introduction of infection control measures.

Key findings:

1. Practical, informal learning opportunities were valued by students. On the whole, however, students were neither exposed to nor engaged with important NHS initiatives such as risk management activities and incident reporting schemes.

2. NHS policy appeared to have been taken seriously by course leaders. Patient safety materials were incorporated into both formal and informal curricula, albeit largely implicitly rather than explicitly.

3. Resource issues and peer pressure were found to influence safe practice. Variations were also found to exist in students' experiences and the quality of the supervision available.

Limitations: The curriculum and organisational documents collected differed between sites, which possibly reflected gatekeeper influences at each site. The recruitment of participants for focus group discussions proved difficult, so interviews or paired discussions were used as a substitute.
A case study is a research approach that is used to generate an in-depth, multi-faceted understanding of a complex issue in its real-life context. It is an established research design that is used extensively in a wide variety of disciplines, particularly in the social sciences. A case study can be defined in a variety of ways (Table 5), the central tenet being the need to explore an event or phenomenon in depth and in its natural context. It is for this reason sometimes referred to as a "naturalistic" design; this is in contrast to an "experimental" design (such as a randomised controlled trial) in which the investigator seeks to exert control over and manipulate the variable(s) of interest.
Definitions of a case study
Author | Definition
---|---
Stake[ ] | (p. 237)
Yin[ , , ] | (Yin 1999, p. 1211; Yin 1994, p. 13); (Yin 2009, p. 18)
Miles and Huberman[ ] | (p. 25)
Green and Thorogood[ ] | (p. 284)
George and Bennett[ ] | (p. 17)
Stake's work has been particularly influential in defining the case study approach to scientific enquiry. He has helpfully characterised three main types of case study: intrinsic , instrumental and collective [ 8 ]. An intrinsic case study is typically undertaken to learn about a unique phenomenon. The researcher should define the uniqueness of the phenomenon, which distinguishes it from all others. In contrast, the instrumental case study uses a particular case (some of which may be better than others) to gain a broader appreciation of an issue or phenomenon. The collective case study involves studying multiple cases simultaneously or sequentially in an attempt to generate a still broader appreciation of a particular issue.
These are not, however, necessarily mutually exclusive categories. In the first of our examples (Table 1), we undertook an intrinsic case study to investigate the issue of recruitment of minority ethnic people into the specific context of asthma research studies, but it developed into an instrumental case study through seeking to understand the issue of recruitment of these marginalised populations more generally, generating a number of findings that are potentially transferable to other disease contexts[ 3 ]. In contrast, the other three examples (see Tables 2, 3 and 4) employed collective case study designs to study the introduction of workforce reconfiguration in primary care, the implementation of electronic health records into hospitals, and the ways in which healthcare students learn about patient safety considerations[ 4 - 6 ]. Although our study focusing on the introduction of General Practitioners with Specialist Interests (Table 2) was explicitly collective in design (four contrasting primary care organisations were studied), it was also instrumental in that this particular professional group was studied as an exemplar of the more general phenomenon of workforce redesign[ 4 ].
According to Yin, case studies can be used to explain, describe or explore events or phenomena in the everyday contexts in which they occur[ 1 ]. These can, for example, help to understand and explain causal links and pathways resulting from a new policy initiative or service development (see Tables 2 and 3, for example)[ 1 ]. In contrast to experimental designs, which seek to test a specific hypothesis through deliberately manipulating the environment (for example, in a randomised controlled trial giving a new drug to randomly selected individuals and then comparing outcomes with controls),[ 9 ] the case study approach lends itself well to capturing information on more explanatory ' how ', 'what' and ' why ' questions, such as ' how is the intervention being implemented and received on the ground?'. The case study approach can offer additional insights into what gaps exist in its delivery or why one implementation strategy might be chosen over another. This in turn can help develop or refine theory, as shown in our study of the teaching of patient safety in undergraduate curricula (Table 4)[ 6 , 10 ]. Key questions to consider when selecting the most appropriate study design are whether it is desirable or indeed possible to undertake a formal experimental investigation in which individuals and/or organisations are allocated to an intervention or control arm, or whether the wish is to obtain a more naturalistic understanding of an issue. The former is ideally studied using a controlled experimental design, whereas the latter is more appropriately studied using a case study design.
Case studies may be approached in different ways depending on the epistemological standpoint of the researcher, that is, whether they take a critical (questioning one's own and others' assumptions), interpretivist (trying to understand individual and shared social meanings) or positivist approach (orientating towards the criteria of natural sciences, such as focusing on generalisability considerations) (Table 6). Whilst such a schema can be conceptually helpful, it may be appropriate to draw on more than one approach in any case study, particularly in the context of conducting health services research. Doolin has, for example, noted that in the context of undertaking interpretative case studies, researchers can usefully draw on a critical, reflective perspective which seeks to take into account the wider social and political environment that has shaped the case[ 11 ].
Example of epistemological approaches that may be used in case study research
Approach | Characteristics | Criticisms | Key references
---|---|---|---
Critical | Involves questioning one's own assumptions, taking into account the wider political and social environment; interprets the limiting conditions in relation to power and control that are thought to influence behaviour. | May neglect other factors by focusing only on power relationships, and may give the researcher a position that is too privileged. | Howcroft and Trauth[ ]; Blaikie[ ]; Doolin[ , ]; Bloomfield and Best[ ]
Interpretivist | Involves understanding meanings, contexts and processes as perceived from different perspectives; trying to understand individual and shared social meanings. Focus is on theory building. | Often difficult to explain unintended consequences, and may neglect surrounding historical contexts. | Stake[ ]; Doolin[ ]
Positivist | Involves establishing in advance which variables one wishes to study and seeing whether they fit with the findings. Focus is often on testing and refining theory on the basis of case study findings. | Does not take into account the role of the researcher in influencing findings. | Yin[ , , ]; Shanks and Parr[ ]
Here, we focus on the main stages of research activity when planning and undertaking a case study; the crucial stages are: defining the case; selecting the case(s); collecting and analysing the data; interpreting data; and reporting the findings.
Carefully formulated research question(s), informed by the existing literature and a prior appreciation of the theoretical issues and setting(s), are all important in appropriately and succinctly defining the case[ 8 , 12 ]. Crucially, each case should have a pre-defined boundary which clarifies the nature and time period covered by the case study (i.e. its scope, beginning and end), the relevant social group, organisation or geographical area of interest to the investigator, the types of evidence to be collected, and the priorities for data collection and analysis (see Table 7)[ 1 ]. A theory driven approach to defining the case may help generate knowledge that is potentially transferable to a range of clinical contexts and behaviours; using theory is also likely to result in a more informed appreciation of, for example, how and why interventions have succeeded or failed[ 13 ].
Example of a checklist for rating a case study proposal[ 8 ]
Clarity: Does the proposal read well?
Integrity: Do its pieces fit together?
Attractiveness: Does it pique the reader's interest?
The case: Is the case adequately defined?
The issues: Are major research questions identified?
Data resource: Are sufficient data sources identified?
Case selection: Is the selection plan reasonable?
Data gathering: Are data-gathering activities outlined?
Validation: Is the need and opportunity for triangulation indicated?
Access: Are arrangements for start-up anticipated?
Confidentiality: Is there sensitivity to the protection of people?
Cost: Are time and resource estimates reasonable?
For example, in our evaluation of the introduction of electronic health records in English hospitals (Table 3), we defined our cases as the NHS Trusts that were receiving the new technology[ 5 ]. Our focus was on how the technology was being implemented. However, if the primary research interest had been on the social and organisational dimensions of implementation, we might have defined our case differently as a grouping of healthcare professionals (e.g. doctors and/or nurses). The precise beginning and end of the case may however prove difficult to define. Pursuing this same example, when does the process of implementation and adoption of an electronic health record system really begin or end? Such judgements will inevitably be influenced by a range of factors, including the research question, theory of interest, the scope and richness of the gathered data and the resources available to the research team.
The decision on how to select the case(s) to study is a very important one that merits some reflection. In an intrinsic case study, the case is selected on its own merits[ 8 ]. The case is selected not because it is representative of other cases, but because of its uniqueness, which is of genuine interest to the researchers. This was, for example, the case in our study of the recruitment of minority ethnic participants into asthma research (Table 1) as our earlier work had demonstrated the marginalisation of minority ethnic people with asthma, despite evidence of disproportionate asthma morbidity[ 14 , 15 ]. In another example of an intrinsic case study, Hellstrom et al.[ 16 ] studied an elderly married couple living with dementia to explore how dementia had impacted on their understanding of home, their everyday life and their relationships.
For an instrumental case study, selecting a "typical" case can work well[ 8 ]. In contrast to the intrinsic case study, the particular case which is chosen is of less importance than selecting a case that allows the researcher to investigate an issue or phenomenon. For example, in order to gain an understanding of doctors' responses to health policy initiatives, Som undertook an instrumental case study interviewing clinicians who had a range of responsibilities for clinical governance in one NHS acute hospital trust[ 17 ]. Sampling a "deviant" or "atypical" case may however prove even more informative, potentially enabling the researcher to identify causal processes, generate hypotheses and develop theory.
In collective or multiple case studies, a number of cases are carefully selected. This offers the advantage of allowing comparisons to be made across several cases and/or replication. Choosing a "typical" case may enable the findings to be generalised to theory (i.e. analytical generalisation) or to test theory by replicating the findings in a second or even a third case (i.e. replication logic)[ 1 ]. Yin suggests two or three literal replications (i.e. predicting similar results) if the theory is straightforward and five or more if the theory is more subtle. However, critics might argue that selecting 'cases' in this way is insufficiently reflexive and ill-suited to the complexities of contemporary healthcare organisations.
The selected case study site(s) should allow the research team access to the group of individuals, the organisation, the processes or whatever else constitutes the chosen unit of analysis for the study. Access is therefore a central consideration; the researcher needs to come to know the case study site(s) well and to work cooperatively with them. Selected cases need to be not only interesting but also hospitable to the inquiry [ 8 ] if they are to be informative and answer the research question(s). Case study sites may also be pre-selected for the researcher, with decisions being influenced by key stakeholders. For example, our selection of case study sites in the evaluation of the implementation and adoption of electronic health record systems (see Table 3) was heavily influenced by NHS Connecting for Health, the government agency that was responsible for overseeing the National Programme for Information Technology (NPfIT)[ 5 ]. This prominent stakeholder had already selected the NHS sites (through a competitive bidding process) to be early adopters of the electronic health record systems and had negotiated contracts that detailed the deployment timelines.
It is also important to consider in advance the likely burden and risks associated with participation for those who (or the site(s) which) comprise the case study. Of particular importance is the obligation for the researcher to think through the ethical implications of the study (e.g. the risk of inadvertently breaching anonymity or confidentiality) and to ensure that potential participants/participating sites are provided with sufficient information to make an informed choice about joining the study. The outcome of providing this information might be that the emotive burden associated with participation, or the organisational disruption associated with supporting the fieldwork, is considered so high that the individuals or sites decide against participation.
In our example of evaluating implementations of electronic health record systems, given the restricted number of early adopter sites available to us, we sought purposively to select a diverse range of implementation cases among those that were available[ 5 ]. We chose a mixture of teaching, non-teaching and Foundation Trust hospitals, and examples of each of the three electronic health record systems procured centrally by the NPfIT. At one recruited site, it quickly became apparent that access was problematic because of competing demands on that organisation. Recognising the importance of full access and co-operative working for generating rich data, the research team decided not to pursue work at that site and instead to focus on other recruited sites.
In order to develop a thorough understanding of the case, the case study approach usually involves the collection of multiple sources of evidence, using a range of quantitative (e.g. questionnaires, audits and analysis of routinely collected healthcare data) and more commonly qualitative techniques (e.g. interviews, focus groups and observations). The use of multiple sources of data (data triangulation) has been advocated as a way of increasing the internal validity of a study (i.e. the extent to which the method is appropriate to answer the research question)[ 8 , 18 - 21 ]. An underlying assumption is that data collected in different ways should lead to similar conclusions, and approaching the same issue from different angles can help develop a holistic picture of the phenomenon (Table 2)[ 4 ].
Brazier and colleagues used a mixed-methods case study approach to investigate the impact of a cancer care programme[ 22 ]. Quantitative measures were collected with questionnaires before, and five months after, the start of the intervention; these did not yield any statistically significant results. Qualitative interviews with patients, however, helped provide an insight into potentially beneficial process-related aspects of the programme, such as greater perceived patient involvement in care. The authors reported how this case study approach identified a number of contextual factors likely to influence the effectiveness of the intervention which would have been unlikely to emerge from quantitative methods alone.
In collective or multiple case studies, data collection needs to be flexible enough to allow a detailed description of each individual case to be developed (e.g. the nature of different cancer care programmes), before considering the emerging similarities and differences in cross-case comparisons (e.g. to explore why one programme is more effective than another). It is important that data sources from different cases are, where possible, broadly comparable for this purpose even though they may vary in nature and depth.
Making sense and offering a coherent interpretation of the typically disparate sources of data (whether qualitative alone or together with quantitative) is far from straightforward. Repeated reviewing and sorting of the voluminous and detail-rich data are integral to the process of analysis. In collective case studies, it is helpful to analyse data relating to the individual component cases first, before making comparisons across cases. Attention needs to be paid to variations within each case and, where relevant, the relationship between different causes, effects and outcomes[ 23 ]. Data will need to be organised and coded to allow the key issues, both derived from the literature and emerging from the dataset, to be easily retrieved at a later stage. An initial coding frame can help capture these issues and can be applied systematically to the whole dataset with the aid of a qualitative data analysis software package.
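The coding-and-retrieval step described here is usually performed in a qualitative data analysis package, but the underlying data structure is simple enough to sketch. All excerpt IDs, codes and quotations below are invented for illustration:

```python
# Hypothetical coding frame applied to fragments of case study data.
# Every ID, code and quotation here is invented for illustration only.
coding_frame = {"supervision", "peer_pressure", "incident_reporting"}

excerpts = [
    {"id": "siteA-int-01", "codes": {"supervision"},
     "text": "My mentor watched the first few procedures closely."},
    {"id": "siteA-int-02", "codes": {"peer_pressure", "supervision"},
     "text": "Everyone else skipped the checklist, so I felt I should too."},
    {"id": "siteB-focus-01", "codes": {"incident_reporting"},
     "text": "I would not know how to file an incident report."},
]

def retrieve(code):
    """Return the IDs of all excerpts indexed under a given code."""
    if code not in coding_frame:
        raise ValueError(f"unknown code: {code}")
    return [e["id"] for e in excerpts if code in e["codes"]]

print(retrieve("supervision"))  # ['siteA-int-01', 'siteA-int-02']
```

This is what "organised and coded to allow the key issues ... to be easily retrieved at a later stage" amounts to in practice; dedicated packages add auditing, memoing and cross-case comparison on top of the same indexing idea.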
The Framework approach, comprising five stages (familiarisation; identifying a thematic framework; indexing; charting; and mapping and interpretation), is a practical method for managing and analysing large datasets, particularly when time is limited, as was the case in our study of recruitment of South Asians into asthma research (Table 1)[ 3 , 24 ]. Theoretical frameworks may also play an important role in integrating different sources of data and examining emerging themes. For example, we drew on a socio-technical framework to help explain the connections between different elements - technology; people; and the organisational settings within which they worked - in our study of the introduction of electronic health record systems (Table 3)[ 5 ]. Our study of patient safety in undergraduate curricula drew on an evaluation-based approach to design and analysis, which emphasised the importance of the academic, organisational and practice contexts through which students learn (Table 4)[ 6 ].
Case study findings can have implications both for theory development and theory testing. They may establish, strengthen or weaken historical explanations of a case and, in certain circumstances, allow theoretical (as opposed to statistical) generalisation beyond the particular cases studied[ 12 ]. These theoretical lenses should not, however, constitute a strait-jacket and the cases should not be "forced to fit" the particular theoretical framework that is being employed.
When reporting findings, it is important to provide the reader with enough contextual information to understand the processes that were followed and how the conclusions were reached. In a collective case study, researchers may choose to present the findings from individual cases separately before amalgamating across cases. Care must be taken to ensure the anonymity of both case sites and individual participants (if agreed in advance) by allocating appropriate codes or withholding descriptors. In the example given in Table 3, we decided against providing detailed information on the NHS sites and individual participants in order to avoid the risk of inadvertent disclosure of identities[ 5 , 25 ].
The case study approach is, as with all research, not without its limitations. When investigating the formal and informal ways undergraduate students learn about patient safety (Table 4), for example, we rapidly accumulated a large quantity of data. The volume of data, together with the time restrictions in place, impacted on the depth of analysis that was possible within the available resources. This highlights a more general point of the importance of avoiding the temptation to collect as much data as possible; adequate time also needs to be set aside for data analysis and interpretation of what are often highly complex datasets.
Case study research has sometimes been criticised for lacking scientific rigour and providing little basis for generalisation (i.e. producing findings that may be transferable to other settings)[ 1 ]. There are several ways to address these concerns, including: the use of theoretical sampling (i.e. drawing on a particular conceptual framework); respondent validation (i.e. participants checking emerging findings and the researcher's interpretation, and providing an opinion as to whether they feel these are accurate); and transparency throughout the research process (see Table 8)[ 8 , 18 - 21 , 23 , 26 ]. Transparency can be achieved by describing in detail the steps involved in case selection, data collection, the reasons for the particular methods chosen, and the researcher's background and level of involvement (i.e. being explicit about how the researcher has influenced data collection and interpretation). Seeking potential, alternative explanations, and being explicit about how interpretations and conclusions were reached, help readers to judge the trustworthiness of the case study report. Stake provides a critique checklist for a case study report (Table 9)[ 8 ].
Potential pitfalls and mitigating actions when undertaking case study research
Potential pitfall | Mitigating action
---|---
Selecting/conceptualising the wrong case(s) resulting in lack of theoretical generalisations | Developing in-depth knowledge of theoretical and empirical literature, justifying choices made |
Collecting large volumes of data that are not relevant to the case or too little to be of any value | Focus data collection in line with research questions, whilst being flexible and allowing different paths to be explored |
Defining/bounding the case | Focus on related components (either by time and/or space), be clear what is outside the scope of the case |
Lack of rigour | Triangulation, respondent validation, the use of theoretical sampling, transparency throughout the research process |
Ethical issues | Anonymise appropriately as cases are often easily identifiable to insiders, informed consent of participants |
Integration with theoretical framework | Allow for unexpected issues to emerge and do not force fit, test out preliminary explanations, be clear about epistemological positions in advance |
Stake's checklist for assessing the quality of a case study report[ 8 ]
1. Is this report easy to read?
2. Does it fit together, each sentence contributing to the whole?
3. Does this report have a conceptual structure (i.e. themes or issues)?
4. Are its issues developed in a serious and scholarly way?
5. Is the case adequately defined?
6. Is there a sense of story to the presentation?
7. Is the reader provided some vicarious experience?
8. Have quotations been used effectively?
9. Are headings, figures, artefacts, appendices, indexes effectively used?
10. Was it edited well, then again with a last minute polish?
11. Has the writer made sound assertions, neither over- nor under-interpreting?
12. Has adequate attention been paid to various contexts?
13. Were sufficient raw data presented?
14. Were data sources well chosen and in sufficient number?
15. Do observations and interpretations appear to have been triangulated?
16. Is the role and point of view of the researcher nicely apparent?
17. Is the nature of the intended audience apparent?
18. Is empathy shown for all sides?
19. Are personal intentions examined?
20. Does it appear individuals were put at risk?
The case study approach allows, amongst other things, critical events, interventions, policy developments and programme-based service reforms to be studied in detail in a real-life context. It should therefore be considered when an experimental design is either inappropriate to answer the research questions posed or impossible to undertake. Considering the frequency with which implementations of innovations are now taking place in healthcare settings and how well the case study approach lends itself to in-depth, complex health service research, we believe this approach should be more widely considered by researchers. Though inherently challenging, the research case study can, if carefully conceptualised and thoughtfully undertaken and reported, yield powerful insights into many important aspects of health and healthcare delivery.
The authors declare that they have no competing interests.
AS conceived this article. SC, KC and AR wrote this paper with GH, AA and AS all commenting on various drafts. SC and AS are guarantors.
The pre-publication history for this paper can be accessed here:
http://www.biomedcentral.com/1471-2288/11/100/prepub
We are grateful to the participants and colleagues who contributed to the individual case studies that we have drawn on. This work received no direct funding, but it has been informed by projects funded by Asthma UK, the NHS Service Delivery Organisation, NHS Connecting for Health Evaluation Programme, and Patient Safety Research Portfolio. We would also like to thank the expert reviewers for their insightful and constructive feedback. Our thanks are also due to Dr. Allison Worth who commented on an earlier draft of this manuscript.
Scientific Reports volume 14 , Article number: 14841 ( 2024 ) Cite this article
This research introduces a methodology for data-driven regression modeling of components exhibiting nonlinear characteristics, utilizing the sparse identification of nonlinear dynamics (SINDy) method. The SINDy method is extended to formulate regression models for interconnecting components with nonlinear traits, yielding governing equations with physically interpretable solutions. The proposed methodology focuses on extracting a model that balances accuracy and sparsity among various regression models. In this process, a comprehensive model was generated using linear term weights and an error histogram. The applicability of the proposed approach is demonstrated through a case study involving a sponge gasket with nonlinear characteristics. The reliability of the methodology is verified by comparing the predictive model with experimental responses. The results highlight that the regression model, based on the proposed technique, can effectively establish an accurate dynamical system model, accounting for realistic conditions.
Introduction
In mechanical systems, components such as gaskets, mounts, washers, and O-rings play pivotal roles in mitigating vibrations and improving sealing performance. These components, fabricated from materials like rubber, metal, plastic, and foam, are instrumental in preventing leaks, aligning mechanical elements, and optimizing assembly convenience. Rubber components in particular are widely employed for enhanced sealing, vibration reduction, and shock absorption.
Such polymer materials exhibit nonlinear stress–strain behavior, influenced by strain rate variations. Exceeding stress limits can lead to plastic deformation, and prolonged loading may result in deformation over time, accompanied by the conversion of kinetic energy to thermal energy and occasional hysteresis phenomena 1 , 2 , 3 . Accurately modeling the mechanical properties of polymer materials is challenging due to their inherent nonlinearity and time dependence. Consequently, sophisticated analysis and modeling techniques are necessary to capture their actual behavior, prompting various studies in this field. Representative approaches encompass numerical, analytical, experimental, and data-driven methods.
The Numerical Approach provides approximate solutions rather than mathematically exact ones. Luo et al. 4 applied this method to analyze dynamic characteristics in a rubber-based rail fastener system, utilizing the superposition principle for hysteresis loop simulations on rubber components. Chen et al. 5 extended this approach to fabric spacers, introducing asymmetric elastic force and fractional differential force. Roncen et al. 6 expanded the Numerical Approach using the harmonic balance and shooting method, focusing on the softening effect in a rubber isolator subjected to random exciting forces.
The Analytical Approach involves solving rigorous mathematical solutions. Balasubramanian et al. 7 used viscous damping to identify damping characteristics in rubber rectangular plates, exploring various models from energy-based to a nonlinear single-degree-of-freedom Duffing model. Mahsa et al. 8 studied nonlinear vibration behavior in multiscale doubly curved sandwich nanoshells, grounded in Hamilton's principle. Finegan et al. 9 explored loss characteristics in composite materials, utilizing microscopic mechanical equations, elasticity theory, and material mechanics formulas.
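For reference, the nonlinear single-degree-of-freedom Duffing model mentioned above is commonly written as follows (the symbols here are the generic textbook ones, not necessarily those used in reference 7):

```latex
% Duffing-type SDOF oscillator: mass m, viscous damping c, linear stiffness k,
% cubic stiffness k_3, harmonic forcing of amplitude F and frequency \omega.
m\,\ddot{x} + c\,\dot{x} + k\,x + k_3\,x^{3} = F\cos(\omega t)
```

The cubic term $k_3 x^3$ is what produces the hardening or softening behavior that linear models cannot capture.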
The Experimental Approach relies on interpreting experimental results. Kerem et al. 10 quantified the complex modulus of a hybrid layer system, comparing model predictions with actual responses. Nagasankar et al. 11 investigated damping effects in polymer matrix composites, analyzing the impact of fiber stacking direction and diameter. Shangguan et al. 12 constructed models for nonlinear rubber torsional vibration absorbers, exploring various models such as the Maxwell model, Voigt model, and fractional derivative model based on experimental results.
The Data-Driven Approach builds models directly from measured or simulated system data. Conti et al. 13 integrated the Finite Element Method (FEM) and SINDy to construct a Reduced Order Model (ROM) with an encoder neural network. Brunton et al. 14 extended the SINDy method to macroscopic modeling studies on metamaterials. Wang et al. 15 explored the relationship between microstructure and mechanical properties in polymer nanocomposites using a data-driven deep learning approach. Kazi et al. 16 used the SINDy method to predict stress–strain curves of composite materials with consistent mechanical behavior. These approaches, combining microscopic insights with data-driven modeling, contribute to understanding the complex behaviors exhibited by polymer materials. There is a growing trend in contemporary research to employ Sparse Identification of Nonlinear Dynamics (SINDy) for modeling nonlinear systems, typically together with extensive data on the regression targets. However, acquiring sufficient empirical data to regress real-world systems remains a formidable obstacle. As a result, many regression models fail to capture the dynamics of the actual system, particularly under subtle changes in system conditions.
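The core of the SINDy method discussed in this literature can be reduced to a short numerical sketch: evaluate a library of candidate terms on state measurements, then apply sequentially thresholded least squares so that only a few terms survive. The oscillator below, with invented stiffness, damping and cubic coefficients and noise-free synthetic samples, stands in for measured data:

```python
import numpy as np

def stlsq(Theta, dxdt, threshold=0.1, n_iter=10):
    # Sequentially thresholded least squares: the sparse-regression core of SINDy.
    Xi = np.linalg.lstsq(Theta, dxdt, rcond=None)[0]
    for _ in range(n_iter):
        small = np.abs(Xi) < threshold
        Xi[small] = 0.0              # prune negligible terms
        if (~small).any():           # refit the surviving terms
            Xi[~small] = np.linalg.lstsq(Theta[:, ~small], dxdt, rcond=None)[0]
    return Xi

rng = np.random.default_rng(0)
# Synthetic, noise-free samples of a damped oscillator with cubic stiffness:
#   a = -k*x - c*v - k3*x**3   (k, c, k3 are invented for illustration)
k, c, k3 = 4.0, 0.3, 2.0
x = rng.uniform(-1.0, 1.0, 500)
v = rng.uniform(-1.0, 1.0, 500)
a = -k * x - c * v - k3 * x**3

# Candidate library of terms the acceleration might depend on.
Theta = np.column_stack([x, v, x**2, v**2, x * v, x**3])
Xi = stlsq(Theta, a)
# Only the x, v and x**3 columns survive, recovering -k, -c and -k3.
print(np.round(Xi, 3))
```

The formulation in reference 14 generalizes this to multi-state systems; packages such as PySINDy wrap the same idea with richer libraries and optimizers.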
This research introduces an approach designed to enhance SINDy through a simplified process, addressing the pervasive issue of insufficient data. It integrates dynamic background knowledge with data regression techniques. Typically, when nonlinearity is analyzed through data-driven methods, only data are used; this yields high accuracy within the regression domain but poses accuracy challenges elsewhere. This study therefore implements constraints representing the dynamics of the physical domain during data regression, strategically addressing the generation of comprehensive datasets for real systems by incorporating parameters characterized by linear relationships into the set of regressors. Notably, the intentional weighting of linear terms enhances the extraction of models that are both more general and more realistic, even when empirical data are limited. To assess the effectiveness of this novel method, it is applied to a model describing the equation of motion for a vibrating gasket system. The outcomes of this application underscore the reliability of the proposed approach in accurately predicting the response of nonlinear vibration systems. A distinctive aspect is the method's ability to identify real systems without necessitating intricate material models or extensive mathematical analyses during model construction. Intentionally fixing the weights on linear terms contributes to efficient system identification even with a relatively modest amount of available data, and the resultant models can effectively track system responses, even in non-regressive conditions.
This underscores the pragmatic utility of the proposed approach, particularly in scenarios characterized by limited data availability.
This chapter outlines the methodology employed in utilizing the SINDy method to model a nonlinear fastening component within a system. The objective is to estimate the governing equations that describe the behavior of this component. The process involves the application of an algorithm for sparse regression modeling, where weights are introduced to linear terms, and an error histogram facilitates the extraction of a robust model. This method proves particularly advantageous for nonlinear vibration systems prone to model divergence, which often hampers the accurate tracking of actual behavior. By acting as a preventive measure, this approach contributes to the development of more generalized models compared to traditional methods.
The research focus centers on elucidating the nonlinear equations governing the dynamic motions of coupling components, a notably intricate aspect of system dynamics. To assess the nonlinear behavior of an unidentified component, a meticulously designed experimental apparatus was employed. Data obtained from system identification experiments were utilized to construct three regression models. The first model, Case 1, assumed linearity and was created through linear regression to establish a baseline control group. To capture the system's nonlinearity, two sparse regression models, Case 2 and Case 3, were developed using techniques such as error histogram and L1 regularization. These models were separately regressed, with Case 3 incorporating weights assigned to linear elements, offering diverse perspectives on addressing the system's nonlinearity.
The validation process for the regression models entailed designing a separate validation system characterized by diverse physical conditions in comparison to those of the identification system. The reliability of the modeling process was affirmed by comparing the validation system's response to predictions from the three regression models (Cases 1, 2, 3) individually. This chapter provides an in-depth exploration of the theoretical background and methodology employed in the regression modeling of the nonlinear system, encompassing both the experimental setup for identification and the subsequent validation process.
SINDy, a data-driven technique, serves the purpose of estimating dynamic system governing equations from observational data, particularly in cases where modeling proves challenging or unknown. By collecting time-varying data and integrating them with measured system characteristics, SINDy utilizes sparse regression analysis, with a specific focus on Lasso regression, to identify the optimal model. This robust algorithm directly estimates governing equations for nonlinear systems from data, proving especially beneficial in scenarios where dynamic modeling presents challenges 17 , 18 .
To derive governing equations from data, time history data of the subject ( \({\varvec{Y}}\) ) and its parameters ( \({\varvec{X}}\) ) evolving over time were systematically gathered. A library, denoted as \(\boldsymbol{\Theta }({\varvec{X}})\) , comprised potential functions correlating with the subject, utilizing the collected data. Following the selection of suitable functions for the library, a relationship emerged between the subject and the library, expressed by a coefficient vector \(\boldsymbol{\Xi }\) , satisfying Eq. ( 1 ):
$${\varvec{Y}}=\boldsymbol{\Theta }({\varvec{X}})\boldsymbol{\Xi }$$
The coefficient vector \(\boldsymbol{\Xi }\) signifies the connection between \({\varvec{Y}}\) and \(\boldsymbol{\Theta }({\varvec{X}})\) , revealing the active functions from the candidate functions in the library. To prevent overfitting, it is assumed that only a selected subset of candidate functions holds significance in representing \({\varvec{Y}}\) , thereby framing the issue as a sparse regression problem. This approach aims to identify the activated column vectors in the library. The sparse coefficient vector ( \(\boldsymbol{\Xi }\) ) denotes which candidate functions are active and their respective coefficients. The derivation of the coefficient vector \(\boldsymbol{\Xi }\) involves the utilization of the Moore–Penrose pseudo-inverse matrix, as articulated below:
$$\boldsymbol{\Xi }={\boldsymbol{\Theta }({\varvec{X}})}^{\dagger}{\varvec{Y}}$$
The Moore–Penrose pseudo-inverse matrix yields a dense vector \(\boldsymbol{\Xi }\) with the minimum norm. To induce sparsity, techniques such as lasso regression are employed, imposing constraints by setting variables with low linear dependence to zero. This involves minimizing the regularization cost of Eq. ( 3 ), where the regularization parameter \(w\) determines the weight assigned to sparsity; sparse regression modeling then yields \({\boldsymbol{\Xi }}_{{\varvec{i}}}\) :
$${\boldsymbol{\Xi }}_{{\varvec{i}}}=\underset{{\boldsymbol{\Xi }}_{{\varvec{i}}}^{\prime}}{\mathrm{argmin}}{\Vert \boldsymbol{\Theta }({\varvec{X}}){\boldsymbol{\Xi }}_{{\varvec{i}}}^{\prime}-{{\varvec{Y}}}_{{\varvec{i}}}\Vert }_{2}+w{\Vert {\boldsymbol{\Xi }}_{{\varvec{i}}}^{\prime}\Vert }_{1}$$
In Eq. ( 3 ), the coefficient \(w\) (sparsification parameter) balances sparsity against complexity: a higher value of \(w\) emphasizes sparsity over complexity, whereas a lower value emphasizes complexity over sparsity. There is no precise mathematical equation for determining \(w\) ; however, it can be helpful to determine its value through cross-validation using machine learning 17 , 18 .
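As an illustration of this sparse regression step, the following numpy-only sketch minimizes a lasso-type cost of the form \(\Vert \boldsymbol{\Theta}\boldsymbol{\Xi}-{\varvec{Y}}\Vert^{2}+w\Vert \boldsymbol{\Xi}\Vert_{1}\) by iterative soft-thresholding (ISTA). The data, function names, and parameter values are illustrative assumptions; the paper does not specify a particular solver.

```python
import numpy as np

def soft_threshold(z, t):
    """Elementwise soft-thresholding operator used by ISTA."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_ista(Theta, Y, w, n_iter=5000):
    """Minimize (1/2)*||Theta @ Xi - Y||^2 + w * ||Xi||_1 by iterative
    soft-thresholding. `w` plays the role of the sparsification parameter."""
    step = 1.0 / np.linalg.norm(Theta, 2) ** 2   # 1 / Lipschitz constant
    Xi = np.zeros(Theta.shape[1])
    for _ in range(n_iter):
        grad = Theta.T @ (Theta @ Xi - Y)        # gradient of the smooth part
        Xi = soft_threshold(Xi - step * grad, step * w)
    return Xi

# Toy check: recover a sparse coefficient vector from noiseless data.
rng = np.random.default_rng(0)
Theta = rng.standard_normal((200, 10))
Xi_true = np.zeros(10)
Xi_true[[1, 4]] = [2.0, -3.0]
Y = Theta @ Xi_true
Xi_hat = lasso_ista(Theta, Y, w=0.1)
```

With a small `w`, the recovered `Xi_hat` is close to `Xi_true` and the inactive entries shrink toward zero, mirroring how only a subset of library functions is selected.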
The general problem treated here, a single-degree-of-freedom (SDOF) vibration system with a nonlinear component, is schematically depicted in Fig. 1 . The equation governing the SDOF system is expressed as:
$$m\ddot{x}+f\left(x,\dot{x},\omega \right)=F$$
Single-degree-of-freedom vibration system schematic.
Here, \(F\) represents the applied force, \(f\) denotes the nonlinear force, \(m\) is the mass, \(x\) represents the displacement, and \(\dot{x}\) and \(\ddot{x}\) are the first and second derivatives of displacement, respectively. Additionally, \(\omega\) denotes the applied frequency. In practical applications, it is straightforward to discern the characteristics of solid components characterized by high stiffness. However, interpreting the dynamic behavior of connecting components made of soft materials proves challenging. Direct measurement of these components is hindered by the absence of distinct material points or designated measurement points. Consequently, the nonlinear force is measured indirectly from a sensor point on the solid components.
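To make the governing equation concrete, the sketch below integrates an SDOF system of this form with a fourth-order Runge–Kutta scheme. The cubic hardening term and all numerical values are hypothetical stand-ins, not the paper's identified gasket model.

```python
import numpy as np

# Hypothetical parameters for illustration only -- not the paper's values.
m = 0.5                            # mass (kg)
c_lin, k_lin = 0.8, 400.0          # assumed linear damping (N s/m), stiffness (N/m)
k_cub = 1e7                        # assumed cubic hardening stiffness (N/m^3)
A, omega = 2.0, 2 * np.pi * 10.0   # sinusoidal force amplitude (N) and frequency

def f_nl(x, v):
    """Assumed gasket force: linear spring/damper plus a cubic stiffness term."""
    return c_lin * v + k_lin * x + k_cub * x**3

def rhs(t, s):
    """State derivative for m*xddot + f(x, xdot) = A*sin(omega*t)."""
    x, v = s
    return np.array([v, (A * np.sin(omega * t) - f_nl(x, v)) / m])

# Fixed-step fourth-order Runge-Kutta integration from rest.
dt = 1e-4
ts = np.arange(0.0, 2.0, dt)
s = np.array([0.0, 0.0])
xs = np.empty_like(ts)
for i, t in enumerate(ts):
    xs[i] = s[0]
    s1 = rhs(t, s)
    s2 = rhs(t + dt / 2, s + dt / 2 * s1)
    s3 = rhs(t + dt / 2, s + dt / 2 * s2)
    s4 = rhs(t + dt, s + dt * s3)
    s = s + dt / 6 * (s1 + 2 * s2 + 2 * s3 + s4)
```

Time histories generated this way (displacement, its derivatives, and the applied force) are exactly the kind of data the identification procedure consumes.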
The conventional SINDy approach presents challenges in identifying a practical expression for the nonlinear force. In response to this limitation, our study introduces an effective linear weighting process. To assess the efficacy of this proposed process, we examine three regression models focused on identifying a single-degree-of-freedom vibration system with a nonlinear force. Extracting time history data \((x, \dot{x}, \ddot{x}, \omega , F)\) from the identification system, three models emerge: a linear model (Case 1), a sparse model without linear weights (Case 2), and a sparse model with linear weights (Case 3), as detailed in Fig. 2 a. The reliability of these models is validated by predicting system responses to variations in mass within the identification system. The evaluation entails calculating errors based on both least square and peak-to-peak criteria, as illustrated in Fig. 2 b. This approach allows for a comprehensive assessment of the proposed weighting process and its impact on the accuracy of the identified models.
Flowchart of ( a ) identification process of nonlinear component and ( b ) validation process.
The linear vibration model, denoted as Case 1, is formulated with constant damping ( \(c\) ) and stiffness ( \(k\) ) coefficients. Consequently, the physical interpretation of the force from the component can be expressed as \(f\left( {x,\dot{x},\omega } \right) = c\dot{x} + kx\) . To ascertain the values of the damping and stiffness coefficients, linear regression is conducted, as illustrated in Fig. 3 . By taking the Moore–Penrose pseudo-inverse matrix of \(X\) , the coefficient vector \(\beta\) can be determined. In many instances, this problem is overdetermined, forcing coefficients that remain invariant with respect to the excitation frequency ( \(\omega\) ). This observation underscores the inefficacy and instability of the linear regression approach for identifying the damping and stiffness characteristics of the system under consideration.
Linear regression configuration.
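A minimal sketch of the Case 1 regression, assuming synthetic stand-in data: the target \(Y = F - m\ddot{x}\) is regressed onto the columns \(\dot{x}\) and \(x\) with the Moore–Penrose pseudo-inverse. The signal shapes and coefficient values are invented for illustration.

```python
import numpy as np

# Synthetic data standing in for measured time histories (assumed values).
t = np.linspace(0, 1, 1000)
x = 1e-3 * np.sin(2 * np.pi * 5 * t)   # displacement (m)
xdot = np.gradient(x, t)               # velocity (m/s)
c_true, k_true = 2.0, 500.0
Y = c_true * xdot + k_true * x         # stands in for Y = F - m*xddot

# Case 1: f(x, xdot) = c*xdot + k*x, solved via the Moore-Penrose pseudo-inverse.
X = np.column_stack([xdot, x])
beta = np.linalg.pinv(X) @ Y           # beta = [c, k]
```

For a genuinely linear system this recovers the coefficients exactly; for the real gasket, no single constant \((c, k)\) pair fits all excitation frequencies, which is the failure mode described above.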
Sparse regression models (Case 2 and Case 3) were devised using the SINDy approach. Similar to the linear model, the time-varying dependent variable vector \(Y\) is defined as \(F - m\ddot{x},\) where \(F\) represents the external vibration force, and \(m\ddot{x}\) is the inertia force. The nonlinear coefficients, specifically stiffness and damping, may vary based on factors such as boundary force, deformation, deformation rate, and excitation frequency. Consequently, library functions are meticulously chosen depending on the values of \(x, \dot{x},\) and \(\omega\) . In the pursuit of creating a comprehensive sparse regression model, a diverse range of frequencies was systematically integrated into the library, as illustrated in Fig. 4 . The regression target \(Y\) and the library matrix \(\Theta\) were organized as cumulative data collected at multiple excitation frequencies. This methodology streamlines the development of a modeling approach capable of capturing the system's response across a broad spectrum of frequency conditions.
Library stack configuration.
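The cumulative stacking of Fig. 4 can be sketched as follows. The particular candidate functions chosen here (polynomials in \(x\), \(\dot{x}\) and frequency-coupled terms) are illustrative assumptions, not the paper's exact library.

```python
import numpy as np

def build_library(x, xdot, omega):
    """Candidate-function library for one excitation frequency.
    The candidate set is illustrative, not the paper's exact choice."""
    w = np.full_like(x, omega)
    cols = [x, xdot, x**2, x**3, xdot**2, x * xdot, w * x, w * xdot]
    names = ["x", "xdot", "x^2", "x^3", "xdot^2", "x*xdot", "w*x", "w*xdot"]
    return np.column_stack(cols), names

# Stack the target Y and the library Theta over multiple excitation
# frequencies, mirroring the cumulative-data arrangement of Fig. 4.
Theta_blocks, Y_blocks = [], []
for omega in [2 * np.pi * f for f in (1.0, 3.0, 5.0, 8.0, 10.0)]:
    t = np.linspace(0, 2, 500)
    x = 1e-3 * np.sin(omega * t)
    xdot = np.gradient(x, t)
    block, names = build_library(x, xdot, omega)
    Theta_blocks.append(block)
    Y_blocks.append(2.0 * xdot + 500.0 * x)   # stand-in for F - m*xddot
Theta = np.vstack(Theta_blocks)
Y = np.concatenate(Y_blocks)
```

One regression over the stacked `Theta` and `Y` then yields a single model valid across the whole frequency range, rather than one model per frequency.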
During the library construction process, the inclusion of physical quantities is influenced by units and dimensions, impacting the column vector's norm. To ensure accurate derivation of governing equations, all library column vectors underwent normalization, as illustrated in Fig. 5 . In Fig. 5 a, an unnormalized coordinate system for parameter vectors is depicted, introducing scaling issues that can disproportionately affect a large norm vector. Consequently, to address concerns related to vector scaling, all parameter vectors were normalized as shown in Fig. 5 b. Throughout the normalization process, adjustments were made to accommodate any changes in vector norm size after model regression.
Normalizing library column vectors: ( a ) Unnormalized coordinate system and ( b ) Normalized coordinate system.
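The normalization and subsequent norm correction can be sketched as a round trip: scale each column to unit norm before regression, then rescale the coefficients afterwards so they apply to the original library. The data here are synthetic, with deliberately mismatched column scales mimicking mixed physical units.

```python
import numpy as np

def normalize_columns(Theta):
    """Scale each library column to unit 2-norm so that mixed units and
    dimensions do not bias the sparse regression (Fig. 5b)."""
    norms = np.linalg.norm(Theta, axis=0)
    return Theta / norms, norms

def denormalize_coeffs(Xi_norm, norms):
    """Undo the column scaling after regression so the coefficients
    apply to the original, unnormalized library."""
    return Xi_norm / norms

# Round trip on synthetic data with wildly different column scales.
rng = np.random.default_rng(3)
Theta = rng.standard_normal((100, 4)) * np.array([1.0, 1e3, 1e-3, 10.0])
Xi_true = np.array([1.0, 0.002, 300.0, 0.0])
Y = Theta @ Xi_true
Theta_n, norms = normalize_columns(Theta)
Xi_n = np.linalg.pinv(Theta_n) @ Y       # regression in normalized coordinates
Xi = denormalize_coeffs(Xi_n, norms)     # coefficients for the original library
```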
Theoretical expectations dictate that the library's column vectors should depend sparsely on the regression target. However, overfitting may occur when the accumulated data are insufficient or noisy, or when the library is poorly chosen. Additionally, in oscillatory systems the damping value is influential in the convergence of the system, but if its magnitude is small it may not be identified effectively. This paper addresses these challenges by deliberately assigning weights to linear stiffness and linear damping, corresponding to Case 3, as detailed in Algorithm 1. Step 1 normalizes the library column vectors for accurate sparse regression; corrections for the changed norm sizes occur in Step 3. Step 2 is the sparse regression process, in which the constraint value \(\lambda\) adapts based on the number of parameters ( \(n\) ), and the model is extracted assuming that parameters larger than the \(\lambda\) value are linearly dependent. This step is where Case 2 and Case 3 differ: unlike Case 2, Case 3 receives the indices of the linear terms included in the library and assumes they are always included in the regression model. This is done by setting the elements of the linear terms in the dense coefficient vector \({\boldsymbol{\Xi }}_{{\varvec{d}}{\varvec{e}}{\varvec{n}}{\varvec{s}}{\varvec{e}}}\) to the maximum absolute value, expressed as \({\boldsymbol{\Xi }}_{{\varvec{d}}{\varvec{e}}{\varvec{n}}{\varvec{s}}{\varvec{e}}}(m)=\text{max}\left(abs\left({\boldsymbol{\Xi }}_{{\varvec{d}}{\varvec{e}}{\varvec{n}}{\varvec{s}}{\varvec{e}}}\right)\right)\) in Algorithm 1.
Algorithm 1 : model regression algorithm for Case 2, 3
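The pruning logic of Algorithm 1 can be sketched as sequentially thresholded least squares in which the supplied linear-term indices are exempt from pruning. This is one reading of the paper's description, not its exact code; the threshold schedule and data are invented for illustration.

```python
import numpy as np

def sparse_regress(Theta, Y, lam, linear_idx=(), n_sweeps=10):
    """Sequentially thresholded least squares in the spirit of Algorithm 1.
    Coefficients with |value| < lam are zeroed on each sweep; the indices in
    `linear_idx` (linear stiffness/damping terms, Case 3) are never pruned."""
    Xi = np.linalg.pinv(Theta) @ Y
    active = np.ones(Theta.shape[1], dtype=bool)
    for _ in range(n_sweeps):
        small = np.abs(Xi) < lam
        small[list(linear_idx)] = False      # protect the weighted linear terms
        active &= ~small
        Xi = np.zeros_like(Xi)
        if active.any():
            Xi[active] = np.linalg.pinv(Theta[:, active]) @ Y
    return Xi

# Toy system: Y depends on column 0 (a weak linear term) and column 3 only.
rng = np.random.default_rng(4)
Theta = rng.standard_normal((300, 6))
Y = Theta @ np.array([0.05, 0, 0, 2.0, 0, 0])
Xi_case2 = sparse_regress(Theta, Y, lam=0.1)                   # prunes the weak term
Xi_case3 = sparse_regress(Theta, Y, lam=0.1, linear_idx=(0,))  # keeps it
```

The comparison illustrates the paper's point: without protection, a small but physically essential linear coefficient falls below the threshold and is discarded (Case 2), while the protected regression retains it (Case 3).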
This section introduces the experimental setup designed to assess the dynamic characteristics of nonlinear materials, exemplified by sponges. Figure 6 showcases the material employed in the experiment. In Fig. 6 b, a scanning electron microscope (SEM) image provides a detailed depiction of the microscopic features of the sponge. The intricate porous geometry, characterized by numerous tiny pores with irregular patterns, poses a challenge for mathematical modeling due to its geometric complexity. In this regard, Nie et al. 19 studied the behavior of porous structures through numerical methods, confirming their nonlinearity using FEM, and Liu et al. 20 developed a methodology for predicting the nonlinear behavior of porous and heterogeneous structures. From these studies, the porous structure can be expected to behave nonlinearly, so the sponge was treated as a nonlinear component in the experiment. To probe the nonlinear behavior of sponge-like materials, a gasket-fastened vibration system is employed, as depicted in Fig. 7 . The schematic illustrates a 1-D vibration system in which the shaker is securely affixed to the ground, ensuring that the excitation is exclusively transmitted to the mass. This mass is connected to the sponge, which is fixed to the ground. The displacement of the mass is precisely measured using a laser sensor with a resolution of 0.01 µm. This vibration system aids in investigating and understanding the complex nonlinear dynamics inherent in sponge materials.
Geometry of sponge 21 : ( a ) The photograph of porous sponge and ( b ) The microscopic structure of porous sponge.
Schematics of SDOF nonlinear vibration system.
A detailed depiction of the experimental setup is presented in Fig. 8 . Figure 8 a showcases an aluminum ground plate with a central hole and bolting holes on each side, firmly secured by bolts. In Fig. 8 b, a concentrated mass is affixed to the ground alongside a rectangular prism-shaped sponge gasket, featuring a central square hole for the bolt. The load cell establishes a connection between the concentrated mass, the holes in the sponge gasket, and the ground. Figure 8 c exhibits a laser sensor responsible for measuring displacement, while Fig. 8 d provides an overview of the entire experimental arrangement. Maintaining consistent contact conditions for the sponge gasket involved securely fixing the mass ( \(m\) ) in contact with the ground boundary. To prevent the detachment of the concentrated mass, a pre-load was applied, compressing the gasket to achieve a thickness of approximately 10 mm. The amplification of the applied force was facilitated through the voltage gain of a non-inverting amplifier utilizing an operational amplifier (OP-AMP, LM324). Signal measurement was conducted using an oscilloscope (TDS1002B). Comprehensive details regarding the experiment's structures and sensors are presented in Table 1 .
Experiment setup: ( a ) ground plate, ( b ) mass and gasket, ( c ) displacement measurement laser sensor and ( d ) overall experiment design.
As depicted in Fig. 9 , the experimental procedure involved applying an excitation force to the concentrated mass within the gasket system using a shaker. During steady-state vibration, both the displacement and the excitation force of the concentrated mass were measured concurrently, using a laser sensor for the displacement data and a load cell for the excitation force. The displacement data, acquired from the laser sensor, served as the basis for determining velocity and acceleration. To capture nonlinearity effectively, the sampling frequencies of the laser sensor and load cell were kept several tens of times higher than the excitation frequency: the excitation frequency was below 12 Hz, while the laser sensor sampled at up to 750 Hz and the load cell at up to 500 Hz.
Process of SDOF nonlinear vibration system identification.
The collected data underwent thorough processing through a specialized algorithm designed to build a comprehensive library and formulate a robust regression model. This systematic approach allowed for a detailed analysis of the dynamic characteristics of the sponge material under the influence of excitation forces.
The validation system was constructed by modifying the mass of the identification system. As illustrated in Fig. 10 , an additional mass, denoted as \({m}_{2}\) , was introduced to differentiate the validation system's configuration from that of the identification system. This deliberate adjustment in weight introduces distinguishable changes in both the inertia force and the nonlinear reaction of the sponge gasket, providing a controlled variation for examining how alterations in mass influence the dynamic response of the sponge material. This manipulation is a crucial component in validating the robustness and generalizability of the characteristics obtained from the identification system, and the subsequent analysis of these changes contributes to a comprehensive understanding of the nonlinear behavior exhibited by the sponge material under varying experimental conditions.
Schematics of identification system and validation system: ( a ) Identification system and ( b ) Validation system.
The reliability of the proposed methodology is substantiated through the thoughtful design and execution of both identification and validation systems. In the identification experiment, the model adeptly captures the nuanced characteristics of stiffness and damping exhibited by the sponge gasket. To corroborate the robustness of the methodology, validation experiments were conducted, involving a meticulous comparison between the measured responses and the predicted responses. This systematic validation process serves as a critical step in ensuring the model's accuracy and applicability under varying conditions. The assessment of the utility of a regression model with fixed linear weights forms a pivotal aspect of the validation process. Specifically, the accuracy of three distinct responses is comprehensively compared, shedding light on the model's predictive capabilities. Additionally, the validation extends to a scrutiny of convergence/divergence issues and a thorough comparison of both least squares and peak-to-peak errors. These comprehensive validation measures collectively contribute to affirming the reliability and efficacy of the proposed methodology in capturing and predicting the dynamic behavior of the sponge material.
The gasket vibration system manifests a dual nature, embodying characteristics of both a linear mass-stiffness-damper system and a mass-nonlinear stiffness-nonlinear damper system. To systematically investigate these attributes, the identification system's mass was precisely set at 500 g. Utilizing a sinusoidal force as the excitation, the identification system was subjected to a comprehensive range of excitation frequencies spanning from 1 to 10 Hz, covering a total of 18 frequencies. Throughout this experimental spectrum, measurements of responses and applied forces were meticulously recorded. Vibration displacement, ranging between − 0.8 mm and 0.8 mm, corresponded to a strain ( \(\varepsilon\) ) magnitude spanning from − 0.08 to 0.08.
The resulting displacement-force curve, illustrated in Fig. 11 , serves as a graphical representation of the system's nonlinearity. Notably, variations in damping sizes during gasket extension and contraction are evident, affirming the nuanced and nonlinear dynamics inherent in the gasket vibration system. This experimental approach contributes to a comprehensive understanding of the system's behavior under diverse excitation conditions, crucial for accurate modeling and analysis.
Deformation-force curve of identification system (5.4 Hz).
Utilizing \(F-m\ddot{x}\) as the dependent variable and \(x\) and \(\dot{x}\) as independent variables, the linear model (Case 1) was regressed through linear regression. Equation ( 5 ) and Table 2 present the equation of motion and coefficients of the model.
Initiating the sparse regression models (Cases 2 and 3), the library's column vectors underwent normalization. Linear regression using the Moore–Penrose pseudo-inverse matrix was then applied to regress \(Y\) , yielding the dense coefficient vector \({\Xi }_{dense}\) , which defines the candidate functions and their respective weights. The resulting \({\Xi }_{dense}\) is detailed in Table 3 . By sorting the dense coefficient vector \({\Xi }_{dense}\) by the absolute magnitudes of its elements, the parameters affecting the system are identified. Multiple regression models were then constructed, incorporating weights that reflected not only the magnitude of correlation but also the number of parameters used in the analysis. Various modeling scenarios were executed, varying the number of candidate functions in the regression model. Model appropriateness was evaluated by calculating the error between the measured response ( \({Y}_{test}\) ) and the predicted model response ( \({Y}_{model}\) ), which identified the optimal number of candidate functions required for modeling. The prediction model's error was determined by comparing the norm of the time series vector \({Y}_{test}-{Y}_{model}\) with the norm of the time series vector \({Y}_{test}\) , as represented in Eq. ( 6 ).
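The norm-ratio error metric described here can be written in a few lines; the sample vectors are arbitrary illustrative values.

```python
import numpy as np

def relative_l2_error(Y_test, Y_model):
    """Eq. (6)-style error: ||Y_test - Y_model|| / ||Y_test||."""
    return np.linalg.norm(Y_test - Y_model) / np.linalg.norm(Y_test)

# Illustrative measured vs. predicted responses.
Y_test = np.array([1.0, 2.0, 3.0])
Y_model = np.array([1.1, 2.0, 2.9])
err = relative_l2_error(Y_test, Y_model)
```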
The \({L}_{1}\) error trend, depicted in Fig. 12 , was explored by varying the number of candidate functions in the modeling process. Figure 13 illustrates the \({L}_{1}\) cost function, incorporating error and sparsity for optimal model selection; the solid line represents the cumulative cost of the model, comprising the error and sparsity costs. In this paper, the gasket was judged to be a strongly nonlinear material, so a conservative value of 0.01 was assigned to \(w\) , attributing more significance to complexity than to sparsity. The model with the minimum cost, an 8-parameter model (Case 2), was chosen. Equation ( 7 ) represents the model, where \({k}_{0}\) and \({c}_{0}\) signify the linear stiffness and linear damping coefficients, and \({k}_{1}\) ~ \({k}_{n}\) and \({c}_{1}\) ~ \({c}_{n}\) denote the nonlinear stiffness and nonlinear damping coefficients, respectively. Table 4 provides the specific values for these coefficients.
Optimized model selection from error histogram.
L1 cost from error histogram.
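The model-size selection from the cost curve can be sketched as follows, reading the cumulative cost as error plus a sparsity penalty proportional to the number of parameters. The error-vs-size numbers are made up purely to illustrate the mechanism; only the value \(w = 0.01\) comes from the paper.

```python
import numpy as np

def select_model_size(errors_by_size, w=0.01):
    """Pick the number of candidate functions minimizing
    cost(n) = error(n) + w * n, as in the L1-cost curve of Fig. 13.
    errors_by_size[n-1] is the error of the best n-term model."""
    errors = np.asarray(errors_by_size)
    n = np.arange(1, len(errors) + 1)
    cost = errors + w * n
    return int(n[np.argmin(cost)]), cost

# Illustrative (invented) error curve: error falls sharply at 8 terms,
# so the size penalty settles on an 8-parameter model.
errors = [0.40, 0.30, 0.25, 0.22, 0.21, 0.205, 0.20, 0.08, 0.079, 0.078]
best_n, cost = select_model_size(errors, w=0.01)
```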
Unlike Case 2, where we lacked linearity insights, Case 3 incorporates these insights by assigning weights to the linear terms \(x\) and \(\dot{x}\) . Otherwise, the nonlinearity was regressed using the same method as in Case 2, with the same \({L}_{1}\) cost function and sparsity cost, and the resulting regression model contained the same number of parameters. Consequently, we obtained another sparse model, identified as a sparse regression model with linear term weights (Case 3) and expressed in Eq. ( 8 ). In alignment with the Case 2 model, \({k}_{0}\) and \({c}_{0}\) denote the linear stiffness and linear damping coefficients, while \({k}_{1}\) ~ \({k}_{n}\) and \({c}_{1}\) ~ \({c}_{n}\) signify the nonlinear stiffness and nonlinear damping coefficients. Precise values of these coefficients are given in Table 5 .
Validation experiments covered six cases with vibration frequencies ranging from 3 to 12 Hz (3.2 Hz, 4.2 Hz, 6.2 Hz, 8.2 Hz, 8.3 Hz, 11.8 Hz). Frequencies below 10 Hz fall within the regression range, while those above 10 Hz are non-regression frequencies. Displacement, measured within the − 0.3 mm to 0.3 mm range, corresponds to strain ( \(\varepsilon\) ) magnitudes from − 0.03 to 0.03. In Fig. 14 , the displacement-force curve for the validation system under 8.2 Hz excitation confirms nonlinearity, attributed to variations in damping forces during expansion and contraction.
Deformation-force curve of validation system (8.2 Hz).
Figure 15 provides a comparative analysis of displacement-force curves for both the validation system and the regression model, shedding light on the similarity between the predicted and actual system behaviors. In Fig. 15 a, the validation system is juxtaposed with the linear model (Case 1), revealing noticeable disparities in curve slope and the location of the maximum force. Figure 15 b showcases the model without the weighting of linear terms (Case 2), adeptly capturing the curve's slope but diverging at the point of maximum force. Figure 15 c illustrates the model with the weighting of linear terms (Case 3), presenting the closest match to the maximum force point and accurately mirroring the curve's slope.
Deformation-force curve of validation system and modeling (8.2 Hz): ( a ) Case 1, ( b ) Case 2, ( c ) Case 3.
To assess the predictive model's performance under the same excitation force as the validation system, individual comparisons were conducted in the time domain across 6 frequencies, as depicted in Fig. 16 for the linear model (Case 1). Although the linear model exhibited convergence across all frequencies, its effectiveness in tracking the actual behavior was limited. Figure 17 depicts the response of the model without the weighting of linear terms (Case 2), effectively tracking behavior within the regression frequency range (1 Hz to 10 Hz) but experiencing divergence beyond 10 Hz. Finally, Fig. 18 portrays the response of the model with weighting linear terms (Case 3), demonstrating convergence across all validation frequencies and proving most effective in accurately tracking the actual behavior among the three models.
Time-domain response of experiment and modeling (Case 1): ( a ) 3.2 Hz, ( b ) 4.2 Hz, ( c ) 6.2 Hz, ( d ) 8.2 Hz, ( e ) 8.3 Hz and ( f ) 11.8 Hz.
Time-domain response of experiment and modeling (Case 2): ( a ) 3.2 Hz, ( b ) 4.2 Hz, ( c ) 6.2 Hz, ( d ) 8.2 Hz, ( e ) 8.3 Hz and ( f ) 11.8 Hz.
Time-domain response of experiment and modeling (Case 3): ( a ) 3.2 Hz, ( b ) 4.2 Hz, ( c ) 6.2 Hz, ( d ) 8.2 Hz, ( e ) 8.3 Hz and ( f ) 11.8 Hz.
The peak-to-peak value ( \(\Delta x\) ) is calculated as the difference between the maximum and minimum values of signal \(x\) , as shown in Eq. ( 9 ):
$$\Delta x=\mathrm{max}\left(x\right)-\mathrm{min}\left(x\right)$$
Additionally, the peak-to-peak error ( \({\varepsilon }_{p}\) ) is calculated by dividing the difference between the model's and the experiment's peak-to-peak values by the experiment's peak-to-peak value, as shown in Eq. ( 10 ):
$${\varepsilon }_{p}=\frac{\left|\Delta {x}_{model}-\Delta {x}_{exp}\right|}{\Delta {x}_{exp}}$$
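The two metrics translate directly into code; the signals below are synthetic, with the model deliberately underestimating the amplitude by 5%.

```python
import numpy as np

def peak_to_peak(x):
    """Eq. (9): delta_x = max(x) - min(x)."""
    return np.max(x) - np.min(x)

def p2p_error(x_model, x_exp):
    """Eq. (10): relative peak-to-peak error between model and experiment."""
    return abs(peak_to_peak(x_model) - peak_to_peak(x_exp)) / peak_to_peak(x_exp)

# Synthetic example: the model underestimates the amplitude by 5 %.
t = np.linspace(0, 1, 1000)
x_exp = 1.0 * np.sin(2 * np.pi * 5 * t)
x_model = 0.95 * np.sin(2 * np.pi * 5 * t)
err = p2p_error(x_model, x_exp)
```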
The responses of the linear model (Case 1) and the sparse regression models (Cases 2 and 3) were individually compared with the validation system, and errors were computed in two ways: via the least squares method and via response magnitudes (peak to peak). The validation results for six distinct verification scenarios, corresponding to excitation frequencies of 3.2 Hz, 4.2 Hz, 6.2 Hz, 8.2 Hz, 8.3 Hz, and 11.8 Hz, are presented in Fig. 19 and Table 6 based on the least squares method. Validation numbers 1–5 fall within the regression frequency domain, while validation number 6 represents a non-regression frequency. The Case 1 model demonstrated convergence for all validation numbers (1–6), yielding an average least squares error of 1.68%. The Case 2 model exhibited convergence for validation numbers 1 to 5 but diverged at 6, with an average least squares error of 0.46% for validation numbers 1 to 5. The Case 3 model demonstrated convergence for all validation numbers (1–6), registering an average least squares error of 0.59%.
Least square error of regression model and experiment.
In Fig. 20 and Table 7 , a comprehensive comparison of response magnitudes (peak to peak) between the regression models and actual responses is presented. The validation numbers and their corresponding frequencies align with the least squares error graph (Fig. 19 ). The Case 1 model displayed an average error of 9.8% across validation numbers 1 to 6. Meanwhile, the Case 2 model diverged at frequency 6, resulting in an average error of 4.41% for validation numbers 1 to 5. The Case 3 model exhibited an average error of 3.80% across all validation numbers 1 to 6. These findings provide a detailed assessment of the predictive accuracy and robustness of the regression models under varied verification scenarios.
Peak to peak error of regression model and experiment.
This research emphasizes the use of regression methods to identify the dynamic properties of sponge-like materials characterized by substantial nonlinearity. Proficient regression models have been introduced to accurately predict the response of nonlinear systems. This involves integrating dynamical background knowledge with the conventional regression approach, SINDy, and imposing physical constraints to construct the regression model. This is achieved by fixing weights on linear parameters. In this paper, Cases 1–2 serve as control groups, while the regression model obtained through the proposed method is named Case 3. Additionally, the construction of the regression model in this paper involves tasks such as parameter regularization and extracting sparse models through an algorithm employing L1 cost functions, accompanied by error histograms to provide visual aid. Meanwhile, the sources of errors between actual behavior and regression models are outlined below.
Imperfections in Measurement Equipment: In this study, errors in the sampling rate, resolution of sensors, and the small mass of the vibration ground contributed to imperfections in measurement equipment. These issues can interfere with fixed boundary conditions, particularly in high-frequency vibrations, potentially introducing noise.
Limited Frequency Range: Due to imperfections in the measurement equipment, the data used for constructing the regression model is limited to frequencies below 10 Hz. This limitation results in insufficient information about the physical behavior at frequencies above 10 Hz, potentially leading to errors or divergence in the regression model in the higher frequency range.
Modal aliasing: Nonlinear systems exhibit more complex behavior than linear systems, characterized by amplitude dependence, frequency dependence, and the superposition of harmonic and subharmonic modes. In this paper, we attempted to address these complexities using high sampling frequencies. However, employing this strategy alone has limitations when dealing with strong nonlinear effects. To improve these problems, nonlinear system regression and modal parameter analysis can be helpful.
In this research paper, we examined the regression-based predictive models derived through identification experiments of the actual system behavior to validate the proposed methodology. The linear model (Case 1), serving as a control, successfully avoided divergence issues but fell short in accurately replicating the nonlinear characteristics of the system. In contrast, both the Case 2 and Case 3 models effectively tracked responses within the regression frequency domain. However, the Case 2 model faced divergence issues outside the regression frequency domain, while the Case 3 model consistently tracked responses. This indicates that the nonlinear model with physical constraints (Case 3) possesses broader applicability as a predictive model than the unconstrained nonlinear model (Case 2).
Each regression model can be summarized as follows:
Case 1: A linear model derived through linear regression. While this model converges for all frequencies, it fails to effectively predict the system's response.
Case 2: A sparse regression model without weighting the linear candidate function terms. This model effectively predicts the system's response within the regression frequency range. However, in the non-regression frequency range, the predicted response diverges.
Case 3: A sparse regression model with weights on the linear candidate function terms. This model successfully tracks the response in all frequency ranges and does not exhibit divergence issues. Therefore, this model is considered the most general and effective.
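The difference between Cases 2 and 3 comes down to whether the linear candidate terms are protected during sparse regression. The sketch below is a generic sequentially thresholded least-squares routine in Python (with NumPy); it illustrates the weighting idea only and is not the authors' implementation, whose exact weighting scheme may differ. Here a larger weight makes a term harder to prune, so up-weighting the linear term keeps it in the model while a spurious quadratic candidate is discarded.

```python
import numpy as np

def stlsq(theta, y, threshold, w=None, iters=10):
    """Sequentially thresholded least squares (sparse regression).

    theta: matrix of candidate functions, y: measured response.
    w: optional per-term weights; a term is pruned when |coef| * w falls
    below the threshold, so larger weights protect a term (a stand-in
    for weighting the linear candidate terms, as in Case 3).
    """
    w = np.ones(theta.shape[1]) if w is None else np.asarray(w, float)
    coef = np.linalg.lstsq(theta, y, rcond=None)[0]
    for _ in range(iters):
        small = np.abs(coef) * w < threshold
        coef[small] = 0.0
        kept = ~small
        if kept.any():
            # refit the surviving terms by ordinary least squares
            coef[kept] = np.linalg.lstsq(theta[:, kept], y, rcond=None)[0]
    return coef

# synthetic response with a linear and a cubic term: y = 2x - 0.5x^3
x = np.linspace(-2.0, 2.0, 200)
theta = np.column_stack([x, x**2, x**3])   # candidate library [x, x^2, x^3]
y = 2.0 * x - 0.5 * x**3

coef = stlsq(theta, y, threshold=0.1, w=[5.0, 1.0, 1.0])  # protect the linear term
```

Without the weight vector (the Case 2 analogue), the same routine prunes purely by coefficient magnitude, which on noisy, band-limited data can discard the linear backbone and let the identified model diverge outside the fitted frequency range.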
Data regression techniques are typically accurate within the regression domain but much less so outside of it, where data are scarce. In this paper, conventional methods alone could not adequately identify the system outside the regression domain, as is evident in Fig. 17f of our study. To address this issue, we partially incorporated dynamic knowledge and weighted the linear parameters, as in Case 3. As a result, we achieved higher accuracy both inside and outside the regression domain. Constructing sparse models with physical constraints, implemented here by assigning weights to the linear terms, effectively captures the dynamic characteristics of nonlinear systems and offers insight into the complexity of real systems. The methodology is applicable to the dynamic analysis and prediction of joint components with nonlinear materials, such as polymers, and holds potential for diverse applications in mechanical systems. The proposed modeling technique offers a valuable tool for addressing ambiguous boundary conditions and nonlinearity in the real world, contributing to the efficient design and improved performance of mechanical joint components.
The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.
This research was supported by a Korea Basic Science Institute (National Research Facilities and Equipment Center) grant funded by the Ministry of Education (Grant No. 2021R1A6C101A449).
Authors and affiliations.
School of Mechanical Engineering, Pusan National University, 30 Jangjeon-Dong, Geumjeong-Gu, Busan, 46241, Republic of Korea
Taesan Ryu & Seunghun Baek
SHB provided the main research idea and designed the experiments. RTS wrote the main manuscript and carried out the experiments.
Correspondence to Seunghun Baek.
Competing interests.
The authors declare no competing interests.
Publisher's note.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .
Cite this article.
Ryu, T., Baek, S. Development of data-driven modeling method for nonlinear coupling components. Sci Rep 14, 14841 (2024). https://doi.org/10.1038/s41598-024-65680-3
Received: 07 February 2024
Accepted: 24 June 2024
Published: 27 June 2024
BMC Bioinformatics volume 25, Article number: 226 (2024)
The matched case–control design, up until recently mostly pertinent to epidemiological studies, is becoming customary in biomedical applications as well. For instance, in omics studies, it is quite common to compare cancer and healthy tissue from the same patient. Furthermore, researchers today routinely collect data from various and variable sources that they wish to relate to the case–control status. This highlights the need to develop and implement statistical methods that can take these tendencies into account.
We present the R package penalizedclr, which provides an implementation of penalized conditional logistic regression for analyzing matched case–control studies. It allows different penalties for different blocks of covariates and is therefore particularly useful in the presence of multi-source omics data. Both L1 and L2 penalties are implemented. Additionally, the package implements stability selection for variable selection in the considered regression model.
The proposed method fills a gap in the available software for fitting high-dimensional conditional logistic regression models accounting for the matched design and block structure of predictors/features. The output consists of a set of selected variables that are significantly associated with case–control status. These variables can then be investigated in terms of functional interpretation or validation in further, more targeted studies.
The matched case–control design is widely employed in biomedical studies, since matching on potentially confounding variables can significantly improve efficiency and statistical power while mitigating the effect of potential confounders. This design has become popular in studies involving high-throughput assays, leading researchers to propose novel methods for the analysis of high-dimensional matched data, often with the aim of feature or variable selection [12]. Many of these ignore the study design and apply methods not intended for matched data, a strategy that can lead to sub-optimal results [see, for instance, 2, 18] and miss important associations. A classical method that takes the matched design into account is conditional logistic regression, applied either to each variable individually or to all variables jointly in a multiple regression model [see, for instance, 2]; the latter is the approach we consider here.
Studies containing several types of high-dimensional measurements for each individual – for instance, DNA methylation, copy number variation and mRNA expression – are becoming increasingly common. Integrating such heterogeneous data layers poses an additional challenge to variable selection, as the optimal penalty parameters can vary across different layers. An intuitively simple solution is to generalize a well-investigated method of penalized conditional logistic regression to allow for different penalties for different data layers. This approach can be particularly useful when the proportions of relevant variables are expected to vary across layers.
The method proposed here is similar in spirit to the popular IPF-lasso [3] and IPFStructPenalty [19], which also consider blocks of covariates. Compared with these packages, penalizedclr places the emphasis on variable selection, and it therefore also includes a function for performing stability selection automatically (see below). This differs from IPF-lasso and IPFStructPenalty, which can be used for both prediction and variable selection. The difference stems from the fact that in conditional logistic regression models the intercept terms are treated as nuisance parameters, rendering predictions for new observations impossible. In the context of multi-omics data, the method is therefore designed to address the initial challenge of selecting promising biomarker candidates.
Table 1 shows R packages that include functions for estimating penalized logistic regression models. As can be seen from this overview, none of the available packages were designed to take into account both matching and blocks of covariates. The R package penalizedclr is intended to fill this gap.
Results of variable selection procedures in high-dimensional settings are known to suffer from limited replicability. To address this issue, our package provides an implementation of stability selection, a general method in which the results of a selection procedure are aggregated over different data subsamples [14]. Before the selected candidates are used to build prediction algorithms of diagnostic or clinical value, they should be interpreted biologically and further investigated in a prospective study.
penalizedclr is implemented in R and available from CRAN. A development version is also available from github https://github.com/veradjordjilovic/penalizedclr .
In what follows, we describe the two main functions of the package: penalized.clr, which estimates a penalized conditional logistic regression model allowing different penalties for different blocks of covariates, and stable.clr.g, which performs stability selection of variables in the penalized conditional regression model. We then discuss other important aspects of the implementation, such as the choice of the penalization parameters and computation time.
This is a wrapper for the penalized function of the well-established R package of the same name [5, 6]. A routine for conditional logistic regression is not directly available in penalized, but we exploit the fact that the likelihood of a conditional logistic regression model equals that of a Cox model with a specific data structure. The input consists of the response vector; the stratum membership of each observation (in 1:1 matching, the id of the case–control pair the observation belongs to); the overall matrix of covariates to be penalized; the sizes of the blocks of covariates; and the \(L_1\) penalties to be applied to each block. The output is a list including the estimated regression coefficients, along with other useful information about the fitted model. It should be stressed that the vector of penalties has no default value and must be specified by the user.
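For intuition, the conditional likelihood that is being penalized is easy to write down for 1:1 matching: each stratum contributes the probability that the case, rather than its matched control, is the case, so any stratum-specific intercept cancels. A minimal Python sketch (illustrative only; the variable names are ours, and the package itself works through the Cox routine of penalized rather than code like this):

```python
import math

def clogit_loglik(beta, pairs):
    """Conditional log-likelihood for 1:1 matched case-control pairs.

    Each element of pairs is (x_case, x_control), two covariate vectors.
    The stratum-specific intercept cancels in the difference of linear
    predictors, which is also why predictions for new observations are
    not possible.
    """
    ll = 0.0
    for x_case, x_ctrl in pairs:
        # difference of linear predictors, eta_case - eta_control
        d = sum(b * (a - c) for b, a, c in zip(beta, x_case, x_ctrl))
        # log of exp(eta_case) / (exp(eta_case) + exp(eta_control))
        ll += -math.log1p(math.exp(-d))
    return ll

# with beta = 0, each pair contributes log(1/2)
ll0 = clogit_loglik([0.0, 0.0], [([1.0, 0.2], [0.4, 0.1])] * 3)
```

Penalized estimation then maximizes this log-likelihood minus the block-specific penalty terms.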
To increase the replicability of research findings – in this case, the selected variables – we aim to select variables that are robust to small perturbations of the data. To this end, we have implemented stability selection [14] in the function stable.clr.g. Most of the required input arguments are the same as in penalized.clr, with the argument lambda.list replacing lambda. The argument lambda.list consists of vectors of \(L_1\) penalties to be applied to each penalized block of covariates; each vector has length equal to the number of blocks. For advice on how to specify lambda.list in practice, we refer to the data applications in Sections "The NOWAC lung cancer dataset" and "The TCGA lung adenocarcinoma dataset" and Section "Choice of the tuning parameters" in the Appendix. For each vector, 2B random subsamples of \(\lfloor n/2 \rfloor\) (out of the total of n) matched pairs are taken and a penalized model is estimated (\(B = 100\) by default). The factor 2 in 2B is due to a variant of stability selection that uses complementary pairs of subsamples [17]. For each variable and each vector of penalties, a selection probability is estimated as the proportion of fitted models in which the associated coefficient estimate is different from zero. Finally, the selection probability of a variable is estimated as the maximum selection probability over all considered penalty vectors. The user can then select the variables whose estimated selection probability exceeds a desired threshold, typically in the range \(0.55-0.9\).
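The aggregation logic just described can be sketched in a few lines. The Python toy below is not the package code: select_fn is a deliberately simplistic stand-in for fitting a penalized conditional model on a subsample, and the strata and penalty values are invented. What it does reproduce is the complementary-pairs subsampling, the per-penalty selection frequencies, and the final maximum over penalty vectors.

```python
import random

def stability_selection(strata, select_fn, penalty_list, B=50, seed=0):
    """Complementary-pairs stability selection over matched strata.

    select_fn(subsample, penalty) -> set of selected variable names.
    Returns each variable's maximum selection frequency over penalties.
    """
    rng = random.Random(seed)
    probs = {}
    for pen in penalty_list:
        counts = {}
        for _ in range(B):
            idx = list(range(len(strata)))
            rng.shuffle(idx)
            half = len(idx) // 2
            # a complementary pair of subsamples of floor(n/2) strata each
            for sub in (idx[:half], idx[half:2 * half]):
                for v in select_fn([strata[i] for i in sub], pen):
                    counts[v] = counts.get(v, 0) + 1
        for v, c in counts.items():
            probs[v] = max(probs.get(v, 0.0), c / (2 * B))
    return probs

# toy data: variable x0 carries a strong signal, x1 essentially none
strata = [{"x0": 1.0, "x1": 0.01} for _ in range(20)]

def select_fn(sub, penalty):
    # stand-in selector: keep variables whose mean |value| exceeds the penalty
    return {k for k in sub[0] if sum(abs(s[k]) for s in sub) / len(sub) > penalty}

probs = stability_selection(strata, select_fn, penalty_list=[0.5, 0.1])
stable = {v for v, p in probs.items() if p >= 0.55}   # threshold from the text
```

In the package, each call to select_fn corresponds to one penalized conditional logistic regression fit, which is where the computational cost lies.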
The user needs to specify the penalties applied by the main functions. In general, choosing the appropriate amount of penalization is challenging, and even more so in the presence of multiple blocks of predictors with different penalties. Let \(\varvec{\lambda } = (\lambda _1, \lambda _2, \ldots , \lambda _P)\) be a vector of \(L_1\) penalties, where \(\lambda _i\) is the penalty applied to the \(i\)-th block and P is the number of blocks. In principle, the optimal value can be found by a grid search over a P-dimensional grid, but this is computationally prohibitive, so less demanding alternatives are typically considered; for instance, in [20] the authors propose a stochastic search over a grid. We follow a different strategy and combine a grid search for a scalar parameter with a heuristic data-adaptive strategy. The problem of setting \(\varvec{\lambda }\) can be decomposed into two subproblems to be solved independently, since we can write \(\varvec{\lambda } = \lambda (1, \lambda _2/\lambda , \ldots , \lambda _P/\lambda )\), where \(\lambda = \lambda _1\) can be viewed as the overall level of penalization, while the vector \((1, \lambda _2/\lambda , \ldots , \lambda _P/\lambda )\) contains the relative penalties with respect to the first block. In analogy with ipflasso, we refer to this vector as the vector of penalty factors. Our package offers two functions: default.pf, which performs a heuristic search for a data-adaptive vector of penalty factors (see below), and find.default.lambda, which, given a vector of penalty factors, finds the \(\lambda\) that maximizes the cross-validated conditional log-likelihood; see Section "Choice of the tuning parameters" in the Appendix for further details.
To find a data-adaptive vector of penalty factors, we follow the heuristic approach of [16]. In this extension of the original IPF-lasso method, a tentative conditional logistic regression model is fitted to all covariates, and for each block the relative penalty is set inversely proportional to the mean of the estimated coefficients pertaining to that block. In this way, a block with larger estimated coefficients receives a lower penalty, and vice versa. This step can be performed for each block separately, i.e. by fitting P tentative models, or jointly, with all blocks included in a single model (see argument type.step1). Once a vector of penalty factors is obtained, find.default.lambda can be called to find the value of \(\lambda\) determining the overall extent of penalization. For more details, we refer to [16] and the penalizedclr package documentation.
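A schematic version of this heuristic, in Python with hypothetical inputs (the package's default.pf operates on the coefficients of a tentative fitted model, not on raw numbers like these):

```python
def penalty_factors(coefs, block_sizes):
    """Relative penalties inversely proportional to the mean |coefficient|
    of each block in a tentative fit, normalized so block 1 has factor 1.

    Illustrative sketch: assumes every block has a non-zero mean coefficient.
    """
    means, start = [], 0
    for size in block_sizes:
        block = coefs[start:start + size]
        means.append(sum(abs(c) for c in block) / size)
        start += size
    # penalty_i proportional to 1 / mean_i  =>  factor_i = mean_1 / mean_i
    return [means[0] / m for m in means]

# block 1 has larger tentative coefficients, so block 2 is penalized more
pf = penalty_factors([2.0, 2.0, 0.5, 0.5], block_sizes=[2, 2])
```

Here the second block's coefficients are four times smaller on average, so its relative penalty comes out four times larger.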
The main focus of the package is on the \(L_1\) (lasso) penalty which, by producing sparse estimated models, is appropriate for variable selection. Nevertheless, it is well known that with the \(L_1\) penalty the presence of highly correlated variables can have a negative impact on selection stability [11]. Adding a small \(L_2\) (ridge) penalty can alleviate this issue: our implementation offers this possibility through the mixing parameter alpha; see the package documentation and Section "Choice of the tuning parameters" in the Appendix for details.
The computational cost of estimating a penalized conditional logistic model with a given vector of penalties equals that of estimating a penalized Cox model. The time-consuming part of the analysis is stability selection, which requires fitting 2Bs models, where s is the number of penalty vectors in lambda.list. Fortunately, stability selection is highly amenable to parallelization, which greatly reduces computation time, especially on a cluster of computers (see argument parallel of function stable.clr.g).
We illustrate the proposed method with a small simulation study. This simulation study is by no means meant to be exhaustive since many different simulation settings can be envisioned. The main purpose of this study is to illustrate some of the numerous factors that influence performance of the proposed method in real applications. The R code files for reproducing the results reported here are available on github https://github.com/veradjordjilovic/Simulations_penalizedclr .
We considered six different settings described in Table 2 , where \(p_i\) and \(a_i\) denote the dimension and the number of active variables in block i , respectively, while \(\beta _i\) is the coefficient of an active variable in block i , \(i=1,2\) . Common for all settings is the number of blocks (2), the number of matched pairs (200), the total number of covariates (100) and the total number of active variables (20).
For each setting, we generated 100 datasets, to which we applied a variable selection procedure based on conditional logistic regression as follows. First, we computed data adaptive penalties, as described in Section " Data adaptive choice of penalty parameters ". Next, we ran stability selection with \(B=50\) on penalized conditional logistic regression with these penalties (Sect. " stable.clr.g function ") and a default \(\alpha =1\) . Finally, covariates with selection probability exceeding 0.55 were selected.
We evaluated performance by estimating power, defined as the proportion of active variables identified by our procedure, and false discovery rate (FDR), defined as the proportion of false discoveries among all discoveries; in this case, the proportion of inactive variables among the selected variables. Power and FDR were averaged over 100 datasets.
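Concretely, with the truly active variables and the selected variables as index sets, the two metrics reduce to the following (a small helper of ours, not package code):

```python
def power_and_fdr(selected, active):
    """Power: fraction of active variables that were selected.
    FDR: fraction of inactive variables among the selected ones."""
    selected, active = set(selected), set(active)
    tp = len(selected & active)  # true positives
    power = tp / len(active) if active else 0.0
    fdr = (len(selected) - tp) / len(selected) if selected else 0.0
    return power, fdr

# 3 of 5 active variables found, 1 false discovery among 4 selections
p, f = power_and_fdr(selected={1, 2, 3, 10}, active={1, 2, 3, 4, 5})
```

In the simulation study, these quantities are then averaged over the 100 generated datasets.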
We compared our approach to two alternatives that could also be considered in this context. The first is IPF-Lasso [3] with an unconditional logistic regression model; the second is conditional logistic regression with a single block of covariates. The former, implemented in the package of the same name, takes into account the presence of different types of covariates but ignores matching; the latter, implemented in the R package clogitL1 [15], fits the conditional logistic regression model but ignores the block structure of covariates.
Results are shown in Table 2. We fit clogitL1 only in settings 1 and 4, since in the unconditional model all settings but setting 4 are equivalent, differing only in the position of the active variables. We first notice that, in general, the two competing approaches select more variables than our method. In particular, the conditional model achieves reasonable power, 0.85 and 0.64 respectively, with quite high FDR, slightly below 50%. IPF-Lasso in settings 3, 4, 5 and 6 identifies almost all active variables; however, there are also many false positives (FDR in the range 0.57–0.79). Note that this is expected, since variable selection with IPF-Lasso and clogitL1 was performed on the basis of a single model fit; coupling stability selection with these methods would be expected to decrease the number of selected null variables. In settings 1 and 2, the number of variables selected by IPF-Lasso is close to that of our approach, with the latter showing slightly better power and FDR. For our approach, power is lowest in settings 1, 4 and 6, in which either there is no considerable difference in the proportion of active variables between the two blocks (1 and 6) or the signal in one of the blocks is relatively weak (4). Conversely, the highest power is achieved in setting 3, in which all active variables belong to the first block; good power is also observed in setting 5, where the majority of active variables are in the first block. The empirical FDR is comparable across settings, varying in the range 0.18–0.26.
We set the threshold for selection to 0.55, which is at the low end of the suggested range (Sect. "stable.clr.g function"). To evaluate the impact of this choice, we computed the empirical power and FDR for a grid of thresholds across the suggested range (0.55–0.9). Results are shown in Fig. 1.
As expected, both power and FDR decrease with an increasing threshold for selection, since a stricter criterion for selection leads to fewer selected variables, both active and inactive. Ordering of the settings is largely preserved across different thresholds (with some exceptions, for instance, the power for settings 3 and 5). Interestingly, setting 4 stands out from the rest: while its FDR decreases with the increasing threshold, as expected, its power remains constant. Recall that in setting 4, the number of active variables is equal among the two blocks, but the signal strength in the second block is lower. Indeed, this signal seems to be too low to be picked up, and the variable selection procedure selects only the variables of the first block.
A related question is how the proposed method behaves with varying sample size. To investigate this issue, we considered setting 5 and generated 50 datasets of sample sizes 50, 100 and 500. Estimated power and FDR are shown in Table 3 .
We see that for the given signal strength (see Table 2 ) the method has no power for the smallest sample size. Already for \(n=100\) , nearly half of the active variables are identified. For \(n=500\) , the method almost always identifies all active variables. The estimated FDR remains around 0.2. When the sample size is large, and it is desirable to keep the number of false positives low, we might increase the threshold for selection. For instance, in our example, for \(n=500\) , by increasing the threshold to 0.95, the power decreases to 0.86 and FDR to 0.06.
The main purpose of the presented simulation study and the data application is to illustrate the possibilities and limitations of the proposed method. The comparison with IPF-Lasso and clogitL1 , methods that take into account the block structure, but ignore matching, or vice versa, was reported, since, to the best of our knowledge, there are no other methods that implement penalized estimation of the conditional logistic regression model with multiple blocks of predictors.
The small simulation study showed, in line with the reported results for IPF-lasso [3], that taking the block structure of predictors into account brings an advantage when the blocks indeed differ in signal strength and/or the number or proportion of active variables. Otherwise, it is beneficial to treat all variables equally, since fewer tuning parameters are then involved. The comparison between conditional and unconditional penalized logistic models was also studied in [15]; their results show that estimating conditional models when data come from a matched study is beneficial, especially when strata are large and the number of covariates is moderate.
Fig. 1: Empirical power and false discovery rate as a function of the selection threshold of the proposed variable selection procedure in the six considered settings
To illustrate the proposed method in practice, we consider a lung cancer matched case–control study nested within the Norwegian Women and Cancer Study (NOWAC) [13], a prospective cohort study. Our data consist of 125 case–control pairs identified in the NOWAC cohort, matched by time since blood sampling and year of birth. Methylation levels and gene expression were measured in peripheral blood. We focused on CpGs and genes previously reported to be associated with smoking. In particular, we considered a list of CpGs differentially methylated between current smokers and nonsmokers according to [10]. Since the total number of reported CpGs, 18,760, precludes including them all in the considered multivariate model, we selected the top 5000 CpGs according to their reported p-values. After restricting attention to complete observations, we were left with 4370 CpGs. Similarly, we considered a list of genes differentially expressed between current smokers and nonsmokers reported in [9]: of the 1270 reported genes, 943 are available in NOWAC, and we included these in our analysis.
Our goal was to select genes and CpGs that are associated with lung cancer status. Assuming a conditional logistic regression model, this amounts to selecting variables in the joint model that have a non-zero coefficient.
We started our analysis by searching for the data-adaptive vector of penalty factors. We set the elastic net mixing parameter \(\alpha =0.6\) and ran the function default.pf three times, since this function relies on cross-validation to select the penalty of the tentative model, so its results may vary between runs. In our case, the average vector of penalty factors was proportional to (1, 3.6). We then ran find.default.lambda, which gave \(\lambda _1 = 5.3\).
For stability selection, we have considered the following list of penalty vectors: (5, 1), (5, 2), (5, 5), (5, 10), (5, 15), (5, 20). We intentionally included combinations of penalties that appear to be far from the estimated data adaptive penalty factor, both to allow for less overall penalization and to explore different relative penalties for the two blocks. Our motivation comes from the observation that when conducting stability selection, it is more desirable to err on the side of too little penalization than too much. In the former case, non-active variables are expected to vary randomly across different subsamples and achieve low selection probability. In the latter case, however, the large amount of penalization might negatively affect the power to identify active variables.
We set 0.55 as the threshold for selection and ended up selecting two CpGs, cg27039118 (estimated selection probability 0.56) and cg17065712 (0.56), and four genes: MAPRE2 (0.63), KCNMB1 (0.78), ATP1B1 (0.61) and SLC9A2 (0.6). Although all were included in the analysis on the basis of their reported association with smoking, none of the selected genes or CpGs appears to have an established link to lung cancer.
We further illustrate our method on the LUng Adenocarcinoma Dataset (LUAD) publicly available from TCGA. We downloaded the data from https://openml.org/search?type=data&status=any&id=42297, following the instructions in [7].
Data consist of survival times for 426 subjects diagnosed with lung adenocarcinoma. For each subject, data include information on a small number of clinical variables (age, sex, smoking history and cancer stage) as well as gene expression level (mRNA) and copy number variation (cnv).
To illustrate our method, we defined a binary response variable describing survival status at the three-year mark (1 = alive, 0 = dead). Subjects censored before the three-year mark were excluded from the main analysis.
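The response definition can be made precise with a small sketch, assuming (as is typical for TCGA-style survival data, though the actual field names may differ) a follow-up time in days and an event indicator equal to 1 when death was observed:

```python
def three_year_status(followup_days, death_observed, cutoff_days=3 * 365):
    """Binary response at the three-year mark.

    Returns 1 (alive) if follow-up reaches the cutoff, 0 (dead) if death
    occurred before it, and None for subjects censored before the cutoff,
    who are excluded from the main analysis.
    """
    if followup_days >= cutoff_days:
        return 1
    return 0 if death_observed else None

statuses = [three_year_status(t, e) for t, e in [(1200, 0), (400, 1), (400, 0)]]
```

The third subject, censored after 400 days, illustrates the exclusions mentioned above.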
This dataset does not come from a case–control study, so for the purpose of illustration we matched subjects on the basis of the available clinical information. The 1:1 matching was exact on sex and based on the Mahalanobis distance for age and smoking history (for further details, we refer to the documentation of the R package MatchIt [8]). This left us with 65 case–control pairs.
Our goal was to identify, among the measured features, those that are associated with survival status three years from diagnosis. The total number of available mRNAs and copy number variations was more than 80,000, so we performed initial filtering to select 1000 features to include in the conditional logistic regression model. To this end, we defined a binary variable taking value 1 if the subject was diagnosed with stage III cancer and 0 otherwise (stages Ib and IIa). We then carried out two-sample t-tests comparing the mean level of each feature in the stage III group vs. the others and selected the features with the lowest p-values. Among the 1000 selected features, 802 were copy number variations and 198 were mRNAs. Filtering was performed on data not used in the main analysis, that is, on subjects not included in the 65 matched case–control pairs.
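The filtering step can be sketched in a few lines of Python with scipy; the data here are synthetic and the stage indicator is hypothetical.

```python
import numpy as np
from scipy.stats import ttest_ind

def filter_features(X, group, k):
    """Keep the k features with the smallest two-sample t-test p-values
    comparing group == 1 (e.g. stage III) against group == 0."""
    pvals = ttest_ind(X[group == 1], X[group == 0], axis=0).pvalue
    return np.argsort(pvals)[:k]

rng = np.random.default_rng(0)
X = rng.normal(size=(400, 50))
stage3 = (np.arange(400) < 200).astype(int)
X[stage3 == 1, 7] += 3.0          # feature 7 truly differs between groups
keep = filter_features(X, stage3, k=10)
```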
Our algorithm for data adaptive choice of penalty factors suggested excluding cnv from further analysis. For illustration purposes, we decided to keep them and considered an ad hoc vector of penalty factors (4, 1), corresponding to penalizing the cnv block four times as much as the mRNA block. For this vector of penalty factors, we found the optimal \(\lambda = 9.37\) . We thus fit a penalized conditional logistic regression model with the vector \(\varvec{\lambda }=(40,10)^\top\) and \(\alpha = 0.6\) . This gave us 8 non-zero coefficients in the mRNA block, six of which are shown in Fig. 2 . The remaining two correspond to novel genes that are at the moment not annotated. No cnvs were selected.
We then performed stability selection with the following list of penalty vectors: (7, 4), (15, 5), (4, 8), (2, 6). As in the previous example, we considered a wide range of penalty vectors to give the variables of each block a chance to enter the model. For the choice \(B=50\) , the analysis took 26 s on a personal computer. Estimated selection probabilities are shown in Fig. 3 . Setting the threshold at 0.55 leads to three stably selected features, all of them mRNAs also present in Fig. 2 , shown in green. The importance of these features in the given context is, however, unclear.
Fig. 2 Non-zero estimated coefficients in the LUAD study. Those selected by stability selection are plotted in green
Fig. 3 Selection probabilities for the considered features. The dashed line \(y = 0.55\) represents the considered threshold for inclusion
In this work, we have presented our implementation of an algorithm for fitting high-dimensional conditional logistic regression models with covariates coming from different data sources. The output of the proposed method is a set of variables significantly associated with case–control status. To the best of our knowledge, no such software has so far been available in the statistical software R.
In the simulation study and the data applications, we considered 1:1 matching, but the proposed method is also suitable for general 1:k matching with \(k\ge 1\) , where each case is matched to k controls.
In our implementation, we have opted for a data adaptive method for selecting penalty parameters: it estimates tentative penalized model(s) and assigns a smaller penalty to blocks with higher mean estimated coefficients. Of course, there are many other sensible options for the choice of data adaptive penalty factors (see, for example, the ipflasso R package). The user is free to combine the proposed estimation procedure with an arbitrary procedure for selecting penalty parameters.
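As a rough sketch of this heuristic, with an unconditional logistic lasso again standing in for the conditional model and synthetic data, penalty factors inversely proportional to each block's mean absolute coefficient could be computed as follows; the normalization (strongest block gets factor 1) is an illustrative choice.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def penalty_factors(X, y, blocks, eps=1e-8):
    """Fit a tentative L1-penalized model and return one penalty factor
    per block, inversely proportional to the block's mean absolute
    estimated coefficient (the strongest block gets factor 1)."""
    fit = LogisticRegression(penalty="l1", solver="liblinear", C=1.0).fit(X, y)
    coefs = np.abs(fit.coef_.ravel())
    means = np.array([coefs[b].mean() for b in blocks])
    return (means.max() + eps) / (means + eps)

rng = np.random.default_rng(2)
X = rng.normal(size=(300, 10))
y = (X[:, 0] + X[:, 1] - X[:, 2] + 0.5 * rng.normal(size=300) > 0).astype(int)
blocks = [np.arange(0, 5), np.arange(5, 10)]   # two hypothetical omics blocks
factors = penalty_factors(X, y, blocks)        # the noise block is penalized more
```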
We have implemented stability selection with the aim of stabilizing the obtained results in terms of selected variables. However, stability selection can also be used for type I error control. In particular, [ 14 ] show how to bound the expected number of selected inactive variables by means of stability selection. Their method for ensuring error control, however, relies on a nontrivial choice of tuning parameters, which is an interesting research question in its own right. For this reason, we did not pursue it in the present contribution.
penalizedclr is implemented in R. Release versions are available on CRAN and work on all major operating systems. The development version is available at https://github.com/veradjordjilovic/penalizedclr .
Project name: penalizedclr R package
Project home page: https://CRAN.R-project.org/package=penalizedclr
Operating system(s): Platform independent.
Programming language: R
Other requirements: No.
License: MIT + file LICENSE
Any restrictions to use by non-academics: No.
Data analyzed in Section " The NOWAC lung cancer dataset " cannot be shared publicly because of local and national ethical and security policies. Data access for researchers will be conditional on adherence to both the data access procedures of the NOWAC study and those of UiT The Arctic University of Norway (contact: Tonje Braaten [email protected]), in addition to approval from the local ethical committee. Data analyzed in Section " The TCGA lung adenocarcinoma dataset " are publicly available.
Avalos M, Pouyes H, Grandvalet Y, et al. Sparse conditional logistic regression for analyzing large-scale matched data from epidemiological studies: a simple algorithm. BMC Bioinf. 2015;16(6):1–11.
Balasubramanian R, Houseman EA, Coull BA, et al. Variable importance in matched case-control studies in settings of high dimensional data. J R Stat Soc Ser C Appl Stat. 2014;63(4):639–55.
Boulesteix AL, De Bin R, Jiang X, et al. IPF-LASSO: integrative-penalized regression with penalty factors for prediction based on multi-omics data. Comput Math Methods Med. 2017. https://doi.org/10.1155/2017/7691937 .
Friedman J, Hastie T, Tibshirani R. Regularization paths for generalized linear models via coordinate descent. J Stat Softw. 2010;33(1):1.
Goeman JJ. L1 penalized estimation in the Cox proportional hazards model. Biom J. 2010;52(1):70–84.
Goeman JJ, Meijer RJ, Chaturvedi N (2018) Penalized: L1 (lasso and fused lasso) and L2 (ridge) penalized estimation in GLMs and in the Cox model. R package version 0.9-51
Herrmann M, Probst P, Hornung R, et al. Large-scale benchmark study of survival prediction methods using multi-omics data. Brief Bioinf. 2021;22(3):bbaa167.
Ho DE, Imai K, King G, et al. MatchIt: Nonparametric preprocessing for parametric causal inference. J Stat Softw. 2011;42(8):1–28. https://doi.org/10.18637/jss.v042.i08 .
Huan T, Joehanes R, Schurmann C, et al. A whole-blood transcriptome meta-analysis identifies gene expression signatures of cigarette smoking. Hum Mol Genet. 2016;25(21):4611–23.
Joehanes R, Just AC, Marioni RE, et al. Epigenetic signatures of cigarette smoking. Circ Cardiovasc Genet. 2016;9(5):436–47.
Kirk P, Witkover A, Bangham CR, et al. Balancing the robustness and predictive performance of biomarkers. J Comput Biol. 2013;20(12):979–89.
Liang S, Ma A, Yang S, et al. A review of matched-pairs feature selection methods for gene expression data analysis. Comput Struct Biotechnol J. 2018;16:88–97.
Lund E, Dumeaux V, Braaten T, et al. Cohort profile: the Norwegian Women and Cancer study – NOWAC – Kvinner og kreft. Int J Epidemiol. 2008;37(1):36–41.
Meinshausen N, Bühlmann P. Stability selection. J R Stat Soc Series B Stat Methodol. 2010;72(4):417–73.
Reid S, Tibshirani R. Regularization paths for conditional logistic regression: the clogitL1 package. J Stat Softw. 2014;58(12):12.
Schulze G. Clinical outcome prediction based on multi-omics data: extension of IPF-Lasso. Master’s thesis. 2017; https://epub.ub.uni-muenchen.de/59092/1/MA_Schulze.pdf
Shah RD, Samworth RJ. Variable selection with error control: another look at stability selection. J R Stat Soc Series B Stat Methodol. 2013;75(1):55–80.
Shomal Zadeh N, Lin S, Runger GC. Matched forest: supervised learning for high-dimensional matched case-control studies. Bioinformatics. 2020;36(5):1570–6.
Zhao Z, Zucknick M. Structured penalized regression for drug sensitivity prediction. J R Stat Soc Ser C Appl Stat. 2020;69(3):525–45. https://doi.org/10.1111/rssc.12400 .
This research has been funded by Grant Nos. 248804 and 262111 of the Norwegian Research Council.
Authors and affiliations.
Department of Economics, Ca’ Foscari University of Venice, Venice, Italy
Vera Djordjilović
Department of Biostatistics, University of Oslo, Oslo, Norway
Vera Djordjilović, Erica Ponzi & Magne Thoresen
Department of Public Health and Nursing, Norwegian University of Science and Technology, Trondheim, Norway
Therese Haugdahl Nøst
Department of Community Medicine, Faculty of Health Sciences, The Arctic University of Norway, Tromsø, Norway
VD and MT conceived the research idea. VD, EP and THN conducted the statistical analyses. THN was responsible for the acquisition of data and the biological interpretation of the results. VD wrote the manuscript, with inputs from all authors. All authors gave final approval.
Correspondence to Vera Djordjilović .
Ethics approval and consent to participate.
All participants gave written informed consent and the study was approved by the Regional Committee for Medical and Health Research Ethics and the Norwegian Data Inspectorate. More information is available in [ 13 ].
Not applicable.
The authors declare that they have no conflict of interest.
Publisher's note.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
The conditional logistic regression model.
We consider a binary outcome Y and its association with a vector of covariates \({\varvec{X}}\) grouped into P blocks: \({\varvec{X}}^1, {\varvec{X}}^2,\ldots , {\varvec{X}}^P\) , where \(P\ge 2\) . We assume that available observations come from a matched case–control study, so that they are grouped into n clusters, each with one case and at least one control. We assume that the relationship between Y and \({\varvec{X}} = ({\varvec{X}}^1, \ldots , {\varvec{X}}^P)\) can be described by the conditional logistic regression model:

\[ P(Y_{il} = 1 \mid \varvec{X}_{il} = \varvec{x}_{il}) = \frac{\exp (\beta _{0i} + \varvec{\beta }^\top \varvec{x}_{il})}{1 + \exp (\beta _{0i} + \varvec{\beta }^\top \varvec{x}_{il})}, \qquad \mathrm{(A1)} \]
where each observation is indexed by two indices: the first one indicating the cluster \(i \in \left\{ 1,\ldots ,n\right\}\) , and the second one indicating the observation within the cluster \(l=1,\ldots ,K\) , where K is the size of the cluster, common for all clusters.
The interest lies in estimating \(\varvec{\beta }\) , while the cluster specific effects \({\beta }_{0i}\) , \(i=1,\ldots , n\) , are treated as nuisance parameters. If, without loss of generality, we assume that the index of the case is 1 within each cluster, inference is based on the so-called conditional likelihood:

\[ L(\varvec{\beta }) = \prod _{i=1}^{n} \frac{\exp (\varvec{\beta }^\top \varvec{x}_{i1})}{\sum _{l=1}^{K} \exp (\varvec{\beta }^\top \varvec{x}_{il})}, \qquad \mathrm{(A2)} \]
representing the likelihood conditional on there being exactly one case within each cluster.
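Numerically, the conditional log-likelihood is a sum, over clusters, of the case's linear predictor minus a within-cluster log-sum-exp term. A small Python sketch, illustrative rather than the package's implementation:

```python
import numpy as np
from scipy.special import logsumexp

def cond_loglik(beta, X, case_index=0):
    """Conditional log-likelihood: X has shape (n_strata, K, p) with the
    case stored at position case_index within each stratum."""
    eta = X @ beta                                  # (n_strata, K)
    return float(np.sum(eta[:, case_index] - logsumexp(eta, axis=1)))

# Sanity check: with beta = 0 every stratum contributes -log(K).
rng = np.random.default_rng(3)
n, K, p = 7, 2, 4
X = rng.normal(size=(n, K, p))
ll = cond_loglik(np.zeros(p), X)                    # equals -7 * log(2)
```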
Usually, the estimate of \(\varvec{\beta }\) is obtained by maximizing the conditional log likelihood \(\log L(\varvec{\beta })\) . When the dimension of the parameter is high relative to the available sample size, this approach fails, and the problem of estimation is commonly addressed by introducing a penalty term. The estimate of \(\varvec{\beta }\) is then obtained by minimizing:

\[ - \log L(\varvec{\beta }) + \lambda \Vert \varvec{\beta }\Vert _1, \qquad \mathrm{(A3)} \]
where \(\lambda >0\) is a tuning parameter, and the \(L_1\) norm, denoted as \(\Vert \varvec{\beta }\Vert _1\) , represents a popular penalty choice.
In the presence of blocks of covariates, \(\varvec{\beta }\) naturally partitions into subparameters corresponding to the P blocks as \(\varvec{\beta }= (\varvec{\beta }^{1}, \ldots , \varvec{\beta }^P)\) . In our approach we allow for different penalties for different blocks, so that the estimate of \(\varvec{\beta }\) is obtained by minimizing:

\[ - \log L(\varvec{\beta }) + \sum _{p=1}^{P} \lambda _p \Vert \varvec{\beta }^{p}\Vert _1. \qquad \mathrm{(A4)} \]
We also consider and implement the elastic net penalty, so that the estimate of \(\varvec{\beta }\) is obtained by minimizing:

\[ - \log L(\varvec{\beta }) + \sum _{p=1}^{P} \lambda _p \left( \alpha \Vert \varvec{\beta }^{p}\Vert _1 + (1-\alpha ) \Vert \varvec{\beta }^{p}\Vert _2^2 \right) , \qquad \mathrm{(A5)} \]
where \(\alpha \in \left( 0,1\right]\) is the so-called "mixing" parameter. Note that this is a slightly different parametrization of the penalty term from the one used in the glmnet package.
Note also that our algorithm is not able to estimate pure ridge models ( \(\alpha =0\) ).
Fitting the penalized conditional logistic regression model in ( A5 ) requires setting \(P+1\) parameters: the vector of penalties \(\varvec{\lambda } = (\lambda _1, \ldots , \lambda _P)^\top\) and the elastic net parameter \(\alpha\) .
The elastic net parameter determines the balance between the \(L_1\) and the \(L_2\) penalty and is typically considered a higher-order parameter that is either set a priori on subjective grounds or chosen after experimenting with a few different values; see "An introduction to glmnet ", https://glmnet.stanford.edu/articles/glmnet.html .
The vector \(\varvec{\lambda } = (\lambda _1, \ldots , \lambda _P)^\top\) determines the level of penalization. In penalizedclr , the vector of penalties is parameterized as the product of the scalar \(\lambda\) , determining the overall level of penalization and the vector of penalty factors: \((1, \ldots , \lambda _P/\lambda )^\top\) , determining the relative penalization for different blocks.
The scalar parameter is determined by cross-validation as follows. A subset of strata is left out at random; without loss of generality, we assume that the strata indexed \(1, \ldots , m\) are left out. A penalized conditional logistic regression model is fit to the remaining strata for a sequence of values of \(\lambda\) . For each \(\lambda\) , the cross-validated log conditional likelihood is computed as

\[ CV(\lambda ) = \log L_{(m)}\big ( \hat{\varvec{\beta }}_{-(m)}\big ), \qquad \mathrm{(A6)} \]
where \(\log L_{(m)}\) is the conditional log likelihood computed on strata \(1,\ldots ,m\) , and \(\hat{\varvec{\beta }}_{-(m)}\) is the estimate of \(\varvec{\beta }\) obtained by minimizing the penalized conditional log likelihood in ( A5 ) on the data excluding strata \(1,\ldots ,m\) . The optimal \(\lambda\) maximizes ( A6 ). In our implementation, we choose the value of \(\lambda\) following the "1 standard error" rule: we choose the \(\lambda\) yielding the simplest model whose estimated \(CV(\lambda )\) is within one standard error of the maximum of \(CV\) .
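The "1 standard error" rule can be sketched as follows, treating CV(λ) as a criterion to be maximized, with larger λ meaning a simpler model; the numbers are toy values, not from the paper.

```python
import numpy as np

def lambda_1se(lambdas, cv_means, cv_ses):
    """Among all lambdas whose CV value lies within one standard error
    of the best (maximal) CV, return the largest lambda, i.e. the most
    heavily penalized and hence simplest model."""
    lambdas = np.asarray(lambdas, dtype=float)
    cv_means = np.asarray(cv_means, dtype=float)
    best = int(np.argmax(cv_means))
    ok = cv_means >= cv_means[best] - cv_ses[best]
    return float(lambdas[ok].max())

lam = lambda_1se([0.1, 1.0, 10.0], [-10.0, -9.0, -9.5], [0.6, 0.6, 0.6])  # -> 10.0
```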
The vector of penalty factors is chosen in a data adaptive fashion by the heuristic strategy described in Section " Data adaptive choice of penalty parameters ". Note that we set the relative penalties inversely proportional to the means of the estimated coefficients from a tentative conditional logistic regression model in ( A3 ); however, different functions of the estimated coefficients could be used instead. For instance, an empirical study in [ 16 ] compares the performance of the method based on the arithmetic and geometric means.
A related question is how to specify the set of penalty vectors when performing stability selection. Our advice is to use the vector found by the data adaptive strategy and cross validation as a starting point, and to include vectors with lower levels of overall penalization as well as those with different vectors of penalty factors. In this way, the power to select important variables should increase, and variables of all blocks are given a chance to enter the model. We refer to examples in Sections " The NOWAC lung cancer dataset " and " The TCGA lung adenocarcinoma dataset " for illustration.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ . The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
Cite this article.
Djordjilović, V., Ponzi, E., Nøst, T.H. et al. penalizedclr: an R package for penalized conditional logistic regression for integration of multiple omics layers. BMC Bioinformatics 25 , 226 (2024). https://doi.org/10.1186/s12859-024-05850-2
Received : 08 January 2024
Accepted : 20 June 2024
Published : 27 June 2024
DOI : https://doi.org/10.1186/s12859-024-05850-2
ISSN: 1471-2105