Medical billing and coding are essential components of the healthcare system and are pivotal in reimbursing healthcare providers. However, ethical challenges often arise in this field.
Medical billing and coding professionals are a crucial link between healthcare providers, insurance companies, and patients. They are responsible for ensuring accuracy, fairness, and timeliness. Addressing these ethical challenges is essential to maintaining trust and upholding the integrity of healthcare practices.
Below are key considerations and strategies for navigating these challenges effectively:
Table of Contents
Confidentiality and data security, transparency in patient care, handling errors, fraudulent practices, fair billing practices.
Mistakes in coding can result in incorrect billing, leading to overcharging patients or underpayment for healthcare providers. It is essential to use accurate codes and resist the temptation to overcode or undercode to meet financial goals. Regular training and careful oversight help maintain high accuracy and ethical compliance standards.
In any medical practice, it is crucial to maintain extremely high levels of confidentiality, especially in departments such as billing and coding. Professionals in this field can access sensitive patient information, which must be safeguarded from unauthorized access or leaks.
It is a matter of ethics and a legal requirement to work within laws such as the Health Insurance Portability and Accountability Act (HIPAA) and other privacy regulations, ensuring that patient information is kept secure and only shared with appropriate parties. To meet these requirements, it is essential to implement and maintain advanced security measures and provide regular security training.
Mistakes are inevitable in any practice, but how they are corrected matters. This reflects significantly on the ethical considerations of the billing and coding practice. Any instances of overbilling or underbilling should be rectified as soon as possible with as much transparency as possible. Ethical practices are essential in medical billing and coding to ensure errors are identified, reported, and corrected systematically.
It would be a severe ethical violation to engage in fraudulent billing, which could involve billing for services that were not provided or misrepresenting services as medically necessary when they are not.
These practices could have legal consequences and harm the trust between patients, healthcare providers, and insurance companies. Ethical behavior in this area involves actively preventing fraud and abuse by understanding and following the laws and regulations that pertain to medical billing and coding.
Fair billing involves more than just ensuring that services are coded correctly. It also requires ethical billing practices, such as charging the same rates for similar services provided under identical conditions.
The ethical challenges associated with medical billing and coding are substantial and significantly impact healthcare delivery. Professionals in this field must promote fairness, transparency, and trust in the healthcare system.
Ethical billing and coding practices benefit both patients and healthcare providers and are critical for maintaining the integrity of the healthcare system. As the field continues to evolve, an unwavering commitment to ethical practice must exist to support the health and well-being of all involved.
Nancy began her career as a Medical Assistant in 1979. From there, Nancy mastered many other areas of the medical field. She spent 35 years in Ohio, building a successful medical practice with the same Physician until his retirement in…
July 8th, 2024
Ali S Tejani, Brian Bialecki, Kevin O’Donnell, Teri Sippel Schmidt, Marc D Kohli, Tarik Alkasab, Standardizing imaging findings representation: harnessing Common Data Elements semantics and Fast Healthcare Interoperability Resources structures, Journal of the American Medical Informatics Association, 2024, ocae134, https://doi.org/10.1093/jamia/ocae134
To design a framework that represents radiology results in a standards-based data structure, using joint Radiological Society of North America/American College of Radiology Common Data Elements (CDEs) as the semantic labels on standard structures. This allows radiologist-created report data to integrate with artificial intelligence-generated results for use throughout downstream systems.
We developed a framework modeling radiology findings as Health Level 7 (HL7) Fast Healthcare Interoperability Resources (FHIR) observations using CDE set/element identifiers as standardized semantic labels. This framework deploys CDE identifiers to specify radiology findings and attributes, providing consistent labels for radiology report concepts—diagnoses, recommendations, tabular/quantitative data—with built-in integration with RadLex, SNOMED CT, LOINC, and other ontologies. Observation structures fit within larger HL7 FHIR DiagnosticReport resources, providing output including both nuanced text and structured data.
Labeling radiology findings as discrete data for interchange between systems requires two components: structure and semantics. CDE definitions provide semantic identifiers for findings and their component values. The FHIR observation resource specifies a structure for associating identifiers with radiology findings in the context of reports, with CDE-encoded observations referring to definitions for CDE identifiers in a central repository. The discussion includes an example of encoding pulmonary nodules on a chest CT as CDE-labeled observations, demonstrating the application of this framework to exchange findings throughout the imaging workflow, making imaging data available to downstream clinical systems.
CDE-labeled observations establish a lingua franca for encoding, exchanging, and consuming radiology data at the level of individual findings, facilitating use throughout healthcare systems.
CDE-labeled FHIR observation objects can increase the value of radiology results by facilitating their use throughout patient care.
Published on 18.6.2024 in Vol 26 (2024)
Authors of this article:
1 Inserm, Sorbonne Université, université Paris 13, Laboratoire d’informatique médicale et d’ingénierie des connaissances en e-santé, LIMICS, F-75006, Paris, France
2 Service de santé publique et information médicale, CHU de Saint Etienne, 42000 Saint-Etienne, France
3 Institut National de la Santé et de la Recherche Médicale, Université Jean Monnet, SAnté INgéniérie BIOlogie St-Etienne, SAINBIOSE, 42270 Saint-Priest-en-Jarez, France
Marie-Christine Jaulent, PhD
Sorbonne Université
université Paris 13, Laboratoire d’informatique médicale et d’ingénierie des connaissances en e-santé, LIMICS, F-75006
15 rue de l'école de Médecine
Paris, 75006
Phone: 33 144279108
Email: [email protected]
Background: To mitigate safety concerns, regulatory agencies must make informed decisions regarding drug usage and adverse drug events (ADEs). The primary pharmacovigilance data stem from spontaneous reports by health care professionals. However, underreporting poses a notable challenge within the current system. Explorations into alternative sources, including electronic patient records and social media, have been undertaken. Nevertheless, social media’s potential remains largely untapped in real-world scenarios.
Objective: The challenge faced by regulatory agencies in using social media is primarily attributed to the absence of suitable tools to support decision makers. An effective tool should enable access to information via a graphical user interface, presenting data in a user-friendly manner rather than in their raw form. This interface should offer various visualization options, empowering users to choose representations that best convey the data and facilitate informed decision-making. Thus, this study aims to assess the potential of integrating social media into pharmacovigilance and enhancing decision-making with this novel data source. To achieve this, our objective was to develop and assess a pipeline that processes data from the extraction of web forum posts to the generation of indicators and alerts within a visual and interactive environment. The goal was to create a user-friendly tool that enables regulatory authorities to make better-informed decisions effectively.
Methods: To enhance pharmacovigilance efforts, we have devised a pipeline comprising 4 distinct modules, each independently editable, aimed at efficiently analyzing health-related French web forums. These modules were (1) web forums’ posts extraction, (2) web forums’ posts annotation, (3) statistics and signal detection algorithm, and (4) a graphical user interface (GUI). We showcase the efficacy of the GUI through an illustrative case study involving the introduction of the new formula of Levothyrox in France. This event led to a surge in reports to the French regulatory authority.
Results: Between January 1, 2017, and February 28, 2021, a total of 2,081,296 posts were extracted from 23 French web forums. These posts contained 437,192 normalized drug-ADE couples, annotated with the Anatomical Therapeutic Chemical (ATC) Classification and Medical Dictionary for Regulatory Activities (MedDRA). The analysis of the Levothyrox new formula revealed a notable pattern. In August 2017, there was a sharp increase in posts related to this medication on social media platforms, which coincided with a substantial uptick in reports submitted by patients to the national regulatory authority during the same period.
Conclusions: We demonstrated that conducting quantitative analysis using the GUI is straightforward and requires no coding. The results aligned with prior research and also offered potential insights into drug-related matters. Our hypothesis received partial confirmation because the final users were not involved in the evaluation process. Further studies, concentrating on ergonomics and the impact on professionals within regulatory agencies, are imperative for future research endeavors. We emphasized the versatility of our approach and the seamless interoperability between different modules over the performance of individual modules. Specifically, the annotation module was integrated early in the development process and could undergo substantial enhancement by leveraging contemporary techniques rooted in the Transformers architecture. Our pipeline holds potential applications in health surveillance by regulatory agencies or pharmaceutical companies, aiding in the identification of safety concerns. Moreover, it could be used by research teams for retrospective analysis of events.
Social media as a complementary data source for pharmacovigilance.
One primary mission of regulatory agencies such as the FDA (Food and Drug Administration) or the EMA (European Medicines Agency) is to monitor drug usage and adverse drug events (ADEs) to mitigate the risks associated with drugs within the population. This task entails analyzing diverse data sources, including clinical trials, postmarketing surveillance, spontaneous reporting systems, and published scientific literature. Despite the wealth of available data, some ADEs are not always detected promptly, largely because of underreporting. In France, for instance, underreporting was estimated to range between 78% and 99% from 1997 to 2002 [ 1 ]. To tackle this challenge, several countries have implemented systems allowing patients to report ADEs.
Additional sources for detecting ADEs have been under exploration, such as electronic patient records [ 2 - 4 ] and social media platforms [ 5 - 9 ]. While some argue that social media alone cannot serve as a primary source for signal detection [ 10 ], it can be viewed as a valuable secondary source for monitoring emerging adverse drug reactions or reinforcing signals previously identified through spontaneous reports stored in traditional pharmacovigilance databases [ 11 ]. In a prior study by the authors, patient profiles and reported ADEs found in web forums were compared with those in the French Pharmacovigilance Database (FPVD). The forums tended to represent younger patients, more women, less severe cases, and a higher incidence of psychiatric disorder–related ADEs compared with the FPVD [ 12 ]. Moreover, forums reported a greater number of unexpected ADEs. Over the past decade, several tools for evaluating social media posts have been described in the literature [ 13 ]. Specifically, effective ADE detection in social media necessitates both quantitative and qualitative analyses of data [ 14 ].
Qualitative assessment entails evaluating whether users’ messages contain pertinent information for an assessment akin to a pharmacovigilance case report. This includes details such as the patient’s age and gender, the severity of the case, the expectedness and timeline of the adverse event, time-to-onset, dechallenge (outcome upon drug withdrawal), and rechallenge (outcome upon drug reintroduction). For instance, GlaxoSmithKline Inc. implemented the qualitative approach Insight Explorer, which facilitates the collection of extensive data for causality and quality assessment. Users can input data including personal information (eg, age range, gender) and product details (eg, name, route of administration, duration of use, dosage). This approach was adapted for the WEB-RADR (Recognizing Adverse Drug Reactions) project to manually construct a gold standard of curated patient-authored text [ 15 ].
Quantitative evaluation involves analyzing extracted data using descriptive and analytical statistics, such as signal detection and change-point analysis. Numerous projects have been undertaken to monitor ADEs on social media. One of the earliest projects is the PREDOSE (Prescription Drug Abuse Online Surveillance and Epidemiology) project [ 5 ], which investigates the illicit use of pharmaceutical opioids reported in web forums. While the PREDOSE project showcased the potential of leveraging social media for opioid monitoring, notable limitations are the lack of deidentification and signal detection methods. MedWatcher Social, a monitoring platform for health-related web forums, Twitter, and Facebook, represents a prototype application developed in 2014 [ 16 ]. Yeleswarapu et al [ 6 ] outlined a semiautomatic pipeline that applies natural language processing (NLP) tasks to extract ADEs from MEDLINE abstracts and user comments from health-related websites. However, this pipeline was not intended for routine use.
The Domino’s interface [ 17 ], developed in 2018 by the University of Bordeaux in France and funded by the French Medicines Agency (Agence nationale de sécurité du médicament et des produits de santé [ANSM]), was designed to analyze drug misuses in health-related web forums using NLP methods and the summary of product characteristics. Initially tailored for antidepressant drugs, this tool does not primarily focus on ADE surveillance.
Another pipeline, described by Nikfarjam et al in 2019 [ 7 ], used a neural network–based named entity recognition system specifically designed for user-generated content in social media. This platform is dedicated to identifying the association of cutaneous ADEs with cancer therapy drugs. The study focused on a selection of drugs and only examined 8 ADEs.
Magge et al [ 8 ] described a pipeline aimed at the extraction and normalization of adverse drug mentions on Twitter. Their pipeline consisted of an ADE classifier designed to identify tweets mentioning an ADE, which were then mapped to a MedDRA (Medical Dictionary for Regulatory Activities Terminology) code. However, the normalization process was confined to the ADEs present in the training set. Neither Nikfarjam’s nor Magge’s pipeline provides a graphical user interface.
Some private companies also offer tools for analyzing social media for pharmacovigilance purposes. For instance, the DETECT platform was developed as part of a collaborative project in France by Kappa Santé [ 18 ]. This system enabled the labeling of posts with known controlled vocabulary concepts, and signal detection was conducted [ 19 ]. Within the scope of this project, Expert System Company implemented BIOPHARMA Navigator to extract web forum posts, while the Luxid Annotation Server provided web services for the automatic annotation of posts.
An important finding from the studies of the last decade is that while regulatory agencies have begun using data sources beyond spontaneous reports, social media has yet to be fully leveraged in real-world settings due to the immaturity of available solutions. Primarily, these solutions are essentially proofs of concept that lack scalability and are challenging for experts to evaluate routinely, primarily due to the absence of a graphical user interface to present information.
Our aim was to assess the potential of integrating social media into pharmacovigilance and enhancing decision-making with this novel data source. To achieve this, our objective was to develop and assess a pipeline that processes data from the extraction of web forum posts to the generation of indicators and alerts within a visual and interactive environment. The goal was to create a user-friendly tool that enables regulatory authorities to make better-informed decisions effectively.
This article presents the design and implementation of our pipeline dedicated to harnessing posts from social media. In addition, we showcase the use of the pipeline through a specific use case, emphasizing the importance of monitoring drugs in social media to better address patients’ expectations.
The PHARES project (Pharmacovigilance in Social Networks), funded from 2017 to 2019 by the French ANSM, aimed to develop a software suite (a pipeline) enabling pharmacovigilance users to analyze social networks, particularly messages posted on forums. The objective of the pipeline is to facilitate routine use through continuous post extraction and quantitative data analysis from web forums, specifically tailored for the French language.
The pipeline is made up of 4 modules, each referring to its own methods ( Figure 1 ):
The Scraper module, which extracts posts from forums using a previously developed tool, Vigi4Med (V4M) scraper [ 9 ], and produces a comma-separated values (CSV) file filled with the texts extracted.
The Annotation module, which extracts elements of interest from the posts and registers annotations in CSV files, with each line representing an annotation of an ADE or a drug. When a causality relationship is identified, both an ADE and a drug are annotated on the same line.
The Statistical module, which performs quantitative analysis on the annotated posts, generating numerical data, tables, or figures.
The Interface module, which supports query definition and visualization of results.
The methodology used to evaluate the PHARES pipeline involved comparing its performance with existing platforms mentioned above, in accordance with a set of criteria established with prospective PHARES users. The criteria, specific to each module, are as follows:
V4M Scraper is an open-source tool designed for data extraction from web forums [ 9 ]. Its primary functions are optimizing scraping time, filtering out posts primarily focused on advertisements, and structuring the extracted data semantically. The module operates by taking a configuration file as input, which contains the URL of the targeted forum. The algorithm navigates through forum pages and generates resource description framework (RDF) triplets for each extracted element, allowing for potential alignment with external semantic resources. A caching mechanism has been integrated into this tool to maintain a local copy of previously visited pages, thereby avoiding redundant requests to websites for already scraped web pages, particularly in cases of errors or testing, for example. Vigi4Med V4M Scraper was customized for the PHARES project, as indicated by the red elements in Figure S1 in Multimedia Appendix 1 . The database format (Figure S2 in Multimedia Appendix 1 ) was implemented to enhance interaction with the interface. Specifically, the main scraping script was adjusted to produce a simplified tabular format (CSV) of the extracted data and to store these data in a database. This modification aims to facilitate input to the subsequent module of the pipeline (annotation). V4M Scraper was customized to enable a continuous scraping routine, wherein data extracted from web forums are automatically and regularly annotated and registered. A log file was integrated into the scraper structure to maintain a record of the last scraped element. This log file ensures that the daily routine scraping always begins from the last scraped point. An automation tool (crontab) is used to schedule the execution of the pipeline for each forum on a daily basis at a specific time.
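The resume-from-log behaviour described above can be illustrated with a minimal sketch. This is not the V4M Scraper code: the log file layout (a single post ID), the function names, and the in-memory list standing in for a forum are all assumptions made for illustration.

```python
import os
import tempfile


def read_checkpoint(log_path):
    """Return the ID of the last scraped post, or None if no log exists yet."""
    if not os.path.exists(log_path):
        return None
    with open(log_path, encoding="utf-8") as handle:
        value = handle.read().strip()
    return int(value) if value else None


def write_checkpoint(log_path, post_id):
    """Record the ID of the last scraped post."""
    with open(log_path, "w", encoding="utf-8") as handle:
        handle.write(str(post_id))


def scrape_new_posts(forum_posts, log_path):
    """Return posts newer than the checkpoint and advance the checkpoint.

    forum_posts is a hypothetical stand-in for a forum: a list of
    (post_id, text) tuples sorted by increasing post_id.
    """
    last_id = read_checkpoint(log_path)
    new_posts = [p for p in forum_posts if last_id is None or p[0] > last_id]
    if new_posts:
        write_checkpoint(log_path, new_posts[-1][0])
    return new_posts


# Simulate two daily runs against a growing forum (invented posts).
log_file = os.path.join(tempfile.mkdtemp(), "last_scraped.log")
day1 = [(1, "post A"), (2, "post B")]
day2 = day1 + [(3, "post C")]
first_run = scrape_new_posts(day1, log_file)
second_run = scrape_new_posts(day2, log_file)
```

In this sketch the second run returns only the post added since the first run, which is the property a scheduled daily job relies on.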
A total of 23 public French health-related web forums were selected through a combination of Google searches and from a list of certified health websites provided by the HON Foundation, in collaboration with the French National Health Authority (HAS). The selection criteria included the requirement for websites to be hosted in France, feature a discussion board or space for sharing experiences, and have more than 10 patient contributions. Furthermore, Twitter posts are collected and analyzed by the pipeline. This is achieved using the Twitter API for data collection, followed by employing the same modules used for processing web forum posts.
Entities corresponding to drugs and pathological conditions in social media were identified and annotated using an NLP pipeline [ 20 ]. Initially, conditional random fields were used to account for global dependencies [ 21 ]. Specifically, the model considers the entire sequence when making predictions for individual tokens. This approach is advantageous for entity extraction tasks, as the presence of an entity in one part of the text can influence the likelihood of other entities in the vicinity. Second, a support vector machine is used to predict the causality relationship between an entity identified as a drug and another entity identified as an ADE. The annotation method used in this module was implemented at an early stage of the pipeline’s design. Currently, the named entity recognition task of this module is undergoing revision to incorporate more recent advancements in NLP algorithms [ 22 - 26 ].
In a third step, the detected annotations were normalized using codes from MedDRA and the Anatomical Therapeutic Chemical (ATC) classification to ensure they were suitable for signal detection purposes.
MedDRA is an international medical hierarchical terminology comprising 5 levels used to code potential ADEs in pharmacovigilance. The highest level is the system organ class, which is further divided into high-level group terms, then into high-level terms, preferred terms (PTs), and finally lowest level terms. Typically, the PT level is used in pharmacovigilance signal detection.
The ATC classification system is a drug classification used in France for pharmacovigilance purposes. It categorizes the active ingredients of drugs based on the organ system they primarily affect. The classification comprises 5 levels: the anatomical main group (consisting of 14 main groups), the therapeutic subgroup, the therapeutic/pharmacological subgroup, the chemical/therapeutic/pharmacological subgroup, and the chemical substance. Typically, the fifth level (chemical substance) is used in pharmacovigilance signal detection.
The outputs of the annotation module are CSV files with the following variables:
In these CSV files, each line contains an ADE annotation, a drug annotation, or both when a causality relationship has been identified between the drug and the ADE. Table 1 provides a sample of the database.
In a prior study, we selected posts where at least one ADE associated with 6 drugs (agomelatine, baclofen, duloxetine, exenatide, strontium ranelate, and tetrazepam) had been detected by this algorithm. A manual review revealed that among 5149 posts, 1284 (24.94%) were validated as pharmacovigilance cases [ 12 ]. The fundamental metrics used to assess the performance of the annotation module were precision (P), recall (R), and their harmonic mean, the $F_1$-score. To calculate these metrics, it is necessary to evaluate false negatives for nonrecognition of relevant terms, false positives for irrelevant recognitions, and true positives for correct recognitions. Precision, recall, and $F_1$-score are defined as follows:
$$\text{Precision} = \frac{TP}{TP + FP}, \qquad \text{Recall} = \frac{TP}{TP + FN}, \qquad F_1 = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \qquad (1)$$
where $TP$, $FP$, and $FN$ are the numbers of true positives, false positives, and false negatives, respectively.
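Equation 1 can be computed directly from the three counts. The sketch below is illustrative only: the function name is hypothetical, the counts are invented and are not results from this study, and returning 0 when a denominator is empty is an assumed convention.

```python
def precision_recall_f1(true_positive, false_positive, false_negative):
    """Compute precision, recall, and F1-score from raw counts.

    A ratio with an empty denominator is reported as 0.0 (an assumed
    convention, not one stated in the article).
    """
    predicted = true_positive + false_positive
    actual = true_positive + false_negative
    precision = true_positive / predicted if predicted else 0.0
    recall = true_positive / actual if actual else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Invented counts for illustration only.
p, r, f1 = precision_recall_f1(true_positive=80, false_positive=20, false_negative=40)
```

With these invented counts, precision is 0.8, recall is 2/3, and the $F_1$-score is 8/11 (about 0.73).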
In the “Results” section, we present a comparison of the performance of the annotation module with the performance of state-of-the-art methods [ 8 , 22 , 25 , 26 ].
Forum name | Post ID | Date | Time | ADE verbatim | ADE normalized | Concept unique identifier | Drug verbatim | Drug normalized | Active ingredient | MedDRA code | ATC code |
Atoute | 7354 | October 8, 2018 | 21:37:00 | Maux de tête | Céphalée | C0018681 | Lévothyrox | LEVOTHYROX | Levothyroxine sodique | — | H03AA01 |
Atoute | 7354 | October 8, 2018 | 21:37:00 | Maux de tête | Céphalée | C0018681 | Calcium | — | — | — | — |
Atoute | 7354 | October 8, 2018 | 21:37:00 | Nodules cancereux | — | — | Lévothyrox | LEVOTHYROX | Levothyroxine sodique | — | H03AA01 |
Atoute | 7354 | October 8, 2018 | 21:37:00 | Nodules cancereux | — | — | Calcium | — | — | — | — |
Atoute | 7354 | October 8, 2018 | 21:37:00 | Fatigue | Fatigue | C0015672 | Lévothyrox | LEVOTHYROX | Levothyroxine sodique | 10016256 | H03AA01 |
Atoute | 7354 | October 8, 2018 | 21:37:00 | fatigue | Fatigue | C0015672 | Calcium | — | — | 10016256 | — |
Atoute | 7354 | October 8, 2018 | 21:37:00 | Perte de poids | Poids diminué | C0043096 | Lévothyrox | LEVOTHYROX | Levothyroxine sodique | 10048061 | H03AA01 |
Atoute | 7354 | October 8, 2018 | 21:37:00 | Perte de poids | Poids diminué | C0043096 | Calcium | — | — | 10048061 | — |
a ADE: adverse drug event.
b MedDRA: Medical Dictionary for Regulatory Activities Terminology.
c ATC: Anatomical Therapeutic Chemical.
d No data are available for this slot.
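As a rough illustration of how such annotation files might be consumed downstream, the sketch below reads a few rows and keeps those where both the drug and the ADE were normalized. The column names, the comma delimiter, and the use of empty strings for missing slots are assumptions; the rows are a condensed subset of Table 1.

```python
import csv
import io

# Condensed subset of Table 1; column names and layout are hypothetical.
SAMPLE = """forum,post_id,ade_normalized,meddra_code,drug_normalized,atc_code
Atoute,7354,Céphalée,,LEVOTHYROX,H03AA01
Atoute,7354,Céphalée,,,
Atoute,7354,Fatigue,10016256,LEVOTHYROX,H03AA01
Atoute,7354,Poids diminué,10048061,LEVOTHYROX,H03AA01
"""


def normalized_pairs(csv_text):
    """Yield (drug, ADE) tuples for rows where both sides were normalized."""
    reader = csv.DictReader(io.StringIO(csv_text))
    for row in reader:
        if row["drug_normalized"] and row["ade_normalized"]:
            yield row["drug_normalized"], row["ade_normalized"]


pairs = list(normalized_pairs(SAMPLE))
```

The row whose drug could not be normalized is skipped, leaving three pairs that could feed a quantitative analysis.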
This module generates general statistics and diagrams for web forums or Twitter. It provides data such as the number of annotated posts (related to the drug, the ADE, or both), the count of drug-ADE pairs identified, and the distribution of ADEs’ MedDRA-PTs. In addition, a change-point analysis method was used to detect significant changes over time in the mean number of posts mentioning the drug and ADE [ 27 ].
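To give an intuition for change-point analysis on post counts, the sketch below finds a single shift in the mean of a series by least squares. This is a deliberately simple stand-in, not the method of reference [ 27 ], and it performs no significance testing; the function name and the monthly counts are invented.

```python
def single_change_point(counts):
    """Return the index k that best splits counts into two constant-mean segments.

    The split minimizes the total squared deviation from each segment's mean;
    the first segment is counts[:k] and the second is counts[k:].
    """
    def sse(values):
        mean = sum(values) / len(values)
        return sum((v - mean) ** 2 for v in values)

    best_k, best_cost = None, float("inf")
    for k in range(1, len(counts)):
        cost = sse(counts[:k]) + sse(counts[k:])
        if cost < best_cost:
            best_k, best_cost = k, cost
    return best_k


# Invented monthly post counts with an abrupt rise at index 5.
monthly_posts = [12, 15, 11, 14, 13, 90, 85, 95, 88]
change_index = single_change_point(monthly_posts)
```

On this invented series the split falls at index 5, the first month of the surge. A production analysis would also need to handle multiple change points and test whether each shift is statistically significant.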
Several statistical signal detection methods were also implemented to generate potential signals. Safety signals, which provide information on adverse events that may be caused by a medicine, were then evaluated by pharmacovigilance experts to determine the causal relationship between the medicine and the reported adverse event.
The statistical module implements 3 signal detection methods, including 2 well-known and frequently used disproportionality methods: the proportional reporting ratio (PRR) [ 28 ] and the reporting odds ratio (ROR) [ 29 ]. In addition, a complementary logistic regression–based signal detection method, the class imbalanced subsampling lasso [ 30 ], was used.
PRR and ROR are akin to a relative risk and an odds ratio, respectively. However, they differ in their denominators: as the number of exposed patients is typically unknown in pharmacovigilance databases, the denominator in PRR and ROR calculations is the number of cases reported in the pharmacovigilance database.
PRR and ROR are specific to each drug-ADE pair and can be directly computed from the contingency table ( Table 2 ).
Drug | Adverse drug event of interest | Other adverse drug events |
Drug of interest | a | b |
Other drugs | c | d |
The PRR compares the proportion of an ADE among all the ADEs reported for a specific drug with the same proportion for all other drugs in the database (Equation 2). A PRR significantly greater than 1 suggests that the ADE is more frequently reported for patients taking the drug of interest, while a PRR equal to 1 suggests independence between the 2 variables.
$$\mathrm{PRR} = \frac{a/(a + b)}{c/(c + d)} \qquad (2)$$
The ROR quantifies the strength of the association between drug administration and the occurrence of the ADE. It represents the ratio of the odds of drug administration when the ADE is present to the odds of drug administration when the ADE is absent (Equation 3). When the 2 events are independent, the ROR equals 1. An ROR significantly greater than 1 suggests that drug administration is associated with the presence of the ADE.
$$\mathrm{ROR} = \frac{ad}{bc} \qquad (3)$$
We considered events over posts for the calculation of disproportionality statistics. If the same drug-ADE pair was identified multiple times within a post, the pair was counted as many times as it occurred in the calculation.
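The counting rule and Equations 2 and 3 can be sketched together. In this illustration, every annotated drug-ADE occurrence is one element of a list, so repeated pairs are counted each time they occur. The function names and the data are invented, and the sketch assumes all four cells are nonzero (no zero-cell correction or confidence interval is included).

```python
from collections import Counter


def contingency(pairs, drug, ade):
    """Build the 2x2 cell counts (a, b, c, d) for one drug-ADE pair.

    pairs lists every (drug, ADE) occurrence, repeated as often as it was
    annotated, so duplicates within a post are counted each time.
    """
    a = b = c = d = 0
    for (pair_drug, pair_ade), n in Counter(pairs).items():
        if pair_drug == drug and pair_ade == ade:
            a += n  # drug of interest, ADE of interest
        elif pair_drug == drug:
            b += n  # drug of interest, other ADEs
        elif pair_ade == ade:
            c += n  # other drugs, ADE of interest
        else:
            d += n  # other drugs, other ADEs
    return a, b, c, d


def prr(a, b, c, d):
    """Proportional reporting ratio (Equation 2)."""
    return (a / (a + b)) / (c / (c + d))


def ror(a, b, c, d):
    """Reporting odds ratio (Equation 3)."""
    return (a * d) / (b * c)


# Invented occurrences: D1 is the drug of interest, headache the ADE of interest.
occurrences = (
    [("D1", "headache")] * 20
    + [("D1", "nausea")] * 30
    + [("D2", "headache")] * 10
    + [("D2", "nausea")] * 90
)
cells = contingency(occurrences, drug="D1", ade="headache")
prr_value = prr(*cells)
ror_value = ror(*cells)
```

For these invented counts (a=20, b=30, c=10, d=90), the PRR is 4 and the ROR is 6, both above 1, which would flag the pair for expert review.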
Disproportionality analysis has certain limitations, including the confounding effect resulting from coreported drugs and the masking effect, where the background relative reporting rate of an ADE is distorted by extensive reporting on the ADE with a specific drug or drug group. Caster et al [ 31 ] demonstrated through 2 real case examples how multivariate regression–based approaches can address these issues. Harpaz et al also suggested that logistic regression could be used for safety surveillance [ 32 ]. These approaches were initially designed for pharmacovigilance case reports; we hypothesize that they may also be applicable to posts.
The logistic regression model specifically focuses on a particular ADE or a group of ADEs. It involves creating a vector that represents the presence (1) or absence (0) of the ADE of interest in the pharmacovigilance case (in our case, in the post). Additionally, a matrix is generated to represent the administration or nonadministration of all drugs in the database by the patient (1 for administration and 0 for nonadministration). Figure S3 in Multimedia Appendix 1 illustrates an example of using logistic regression. In our case, we assumed that if a drug was annotated in the post, it was taken by the patient. The logistic regression aims to predict the probability of the presence of the ADE (ADE=1) of interest based on the presence of all (N_m) drugs in the database (Equation 4), where X represents the distribution of the presence/absence of the drugs. The adjusted factors included only concomitant medications, as patient-related factors are often missing in web forums’ posts. Therefore, we did not need to address the impact of missing data, which should be evaluated when necessary.
ln([P(X|ADE=1)]/[P(X|ADE=0)]) = a + b_1 × Drug_1 + ... + b_i × Drug_i + ... + b_Nm × Drug_Nm (4)
The selection of the drugs depends on the parameter b_i. If b_i < 0, the drug i decreases the risk of the ADE, and if b_i > 0, the drug i increases the risk of the ADE.
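The response vector and the binary drug matrix described above can be sketched as follows. This is a minimal illustration, not the PHARES implementation: the post layout (a dict with `drugs` and `ades` sets), the function names, and the example posts are all assumptions. The helper `predicted_probability` uses the standard logistic form, in which a positive coefficient raises the predicted probability of the ADE and a negative one lowers it.

```python
import math


def build_design(posts, ade_of_interest):
    """Return (drug names, response vector y, binary drug matrix X)."""
    drugs = sorted({drug for post in posts for drug in post["drugs"]})
    column = {drug: j for j, drug in enumerate(drugs)}
    y, X = [], []
    for post in posts:
        # 1 if the ADE of interest is annotated in the post, else 0.
        y.append(1 if ade_of_interest in post["ades"] else 0)
        # One 0/1 indicator per drug known to the database.
        row = [0] * len(drugs)
        for drug in post["drugs"]:
            row[column[drug]] = 1
        X.append(row)
    return drugs, y, X


def predicted_probability(intercept, coefficients, row):
    """Standard logistic model: sigmoid of the linear predictor."""
    linear = intercept + sum(b * x for b, x in zip(coefficients, row))
    return 1.0 / (1.0 + math.exp(-linear))


# Hypothetical annotated posts.
posts = [
    {"drugs": {"levothyroxine"}, "ades": {"tiredness"}},
    {"drugs": {"levothyroxine", "paracetamol"}, "ades": {"pain"}},
    {"drugs": {"paracetamol"}, "ades": set()},
]
drugs, y, X = build_design(posts, "tiredness")
```

Fitting the coefficients themselves (with the lasso penalty used by the method cited in the text) is left out of this sketch; any penalized logistic regression routine could consume `y` and `X` as built here.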
Then, 2 sets are defined: S_1, the set of the n_1 posts annotated with the ADE of interest, and S_0, the set of the n_0 posts without such an annotation.
In our case, n_0 >> n_1, indicating a significant imbalance toward posts lacking annotations of the ADEs of interest. To address this issue, we took a subsample with a more favorable ratio of posts with annotated ADEs versus those without. Additionally, to enhance result stability, we conducted multiple draws instead of just one.
In practice, we generated B subsamples. Each subsample was constructed by randomly drawing, with replacement, n_1 posts from S_1 and R posts from S_0, where R = max(4n_1, 4N_m). The choice of 4n_1 was inspired by case-control studies, while 4N_m was included to ensure an adequate number of observations considering the multitude of predictors.
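The subsampling step can be sketched as below, reading S1 as the posts annotated with the ADE of interest and S0 as the posts without it. The function name, its parameters, and the integer post identifiers are hypothetical; only the sizes (n1 draws from S1, R = max(4 n1, 4 Nm) draws from S0, with replacement, repeated B times) come from the text.

```python
import random


def draw_subsamples(s1, s0, n_drugs, n_subsamples, seed=0):
    """Draw B balanced subsamples, with replacement, from S1 and S0."""
    rng = random.Random(seed)
    n1 = len(s1)
    r = max(4 * n1, 4 * n_drugs)  # R = max(4 n1, 4 Nm)
    subsamples = []
    for _ in range(n_subsamples):
        cases = rng.choices(s1, k=n1)    # posts with the ADE of interest
        controls = rng.choices(s0, k=r)  # posts without it
        subsamples.append((cases, controls))
    return subsamples


# Hypothetical post identifiers: 5 posts with the ADE, 1000 without.
s1 = list(range(5))
s0 = list(range(100, 1100))
subsamples = draw_subsamples(s1, s0, n_drugs=10, n_subsamples=3)
```

In this toy setting, 4 Nm = 40 exceeds 4 n1 = 20, so each subsample holds 5 cases and 40 controls; a model would then be fitted on each subsample and the results aggregated across the B draws.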
We implemented a change-point analysis method described in [ 27 ] to detect whether there was a change in the evolution over time of a chosen statistic, such as the number of a specific drug-ADE pair, the number of ADEs associated with a specific drug, or the number of drugs associated with a specific ADE. The method uses the Cumulative Sum (CUSUM) algorithm to analyze the evolution of statistics over time, comparing current values with the period mean. It identifies breakpoints by calculating the highest difference in statistical values and comparing it with random samples. The process repeats for periods before and after detected breakpoints until no more are found.
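The procedure described above resembles a CUSUM analysis combined with random reordering of the series, and can be sketched as follows. This is an illustrative reading of the description, not the implementation cited as [ 27 ]: the function names, the number of random reorderings, the 95% confidence threshold, and the minimum segment size are all assumptions.

```python
import random


def cusum(values):
    """Cumulative sums of deviations from the period mean (S_0 = 0)."""
    mean = sum(values) / len(values)
    sums, total = [0.0], 0.0
    for v in values:
        total += v - mean
        sums.append(total)
    return sums


def cusum_range(values):
    sums = cusum(values)
    return max(sums) - min(sums)


def find_breakpoints(values, offset=0, n_boot=1000, confidence=0.95,
                     min_size=4, rng=None):
    """Return sorted indices at which a new regime is estimated to start."""
    rng = rng or random.Random(0)
    if len(values) < min_size:
        return []
    observed = cusum_range(values)
    shuffled = list(values)
    below = 0
    for _ in range(n_boot):
        rng.shuffle(shuffled)  # random reordering removes any time structure
        if cusum_range(shuffled) < observed:
            below += 1
    if below / n_boot < confidence:
        return []  # no breakpoint supported in this period
    sums = cusum(values)
    # The largest absolute CUSUM value marks the last point before the change.
    k = max(range(1, len(values)), key=lambda i: abs(sums[i]))
    left = find_breakpoints(values[:k], offset, n_boot, confidence, min_size, rng)
    right = find_breakpoints(values[k:], offset + k, n_boot, confidence, min_size, rng)
    return left + [offset + k] + right


# Hypothetical monthly counts with a single level shift after 12 months.
monthly_counts = [10] * 12 + [30] * 12
breakpoints = find_breakpoints(monthly_counts)
```

The recursion mirrors the last sentence of the paragraph: once a breakpoint is accepted, the periods before and after it are analyzed separately until no further breakpoint passes the test.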
The user interface module facilitates user interaction with the pipeline in a user-friendly manner. The interface comprises a dashboard divided into 2 main parts. The left dark column ( Figure 2 ) serves as a control sidebar, where users can select parameters to filter the data, including the forum, period, drug(s) according to the ATC classification, and ADE(s) according to a level in the MedDRA hierarchy. On the right side of the interface, various visualizations are available, organized into several tabs such as “Forum Statistics” and “Consultation of Posts,” with additional tabs for statistics that become active upon querying.
Before applying a specific query, the interface provides general information about the currently available data ( Figure 2 ), including the total annotated posts since 2017 (n=2,081,296) and total annotations since 2017 (n=2,454,310). In addition, a “Consultation of Tweets” tab (not visible in the figure) displays the total annotated tweets since March 2020 (n=46,153).
Furthermore, several tabs corresponding to different types of statistics, including “Forums Statistics” and “Twitter Statistics,” provide general statistics and diagrams for web forums and Twitter. Examples of these are pie charts showing forum distribution, line charts depicting the evolution of drug and ADE mentions, histograms displaying ADE distribution by system organ class, and line charts illustrating the temporal trend of posts containing the drug and an ADE, as shown in Figures 3 and 4 . The “Annotations Plot” tab displays annotations of drugs and adverse effects selected by the user, along with forum information, PTs, high-level terms, high-level group terms, dates, and hours. The “Logistic Regression” tab allows users to choose parameters for applying logistic regression. In the “Disproportionality” tab, users can choose between the PRR and ROR methods, with the time evolution of the chosen method displayed. The “Change-Point” tab enables analysis of temporal evolution, with identified breakpoints indicated. The “Consultation of Posts” and “Consultation of Tweets” tabs provide details on annotated posts/tweets, including downloadable tables. The statistical module performs calculations based on user queries, updating the interface accordingly. If multiple drugs or adverse events are selected, they are treated as new entities for analysis.
The interface was implemented using the R language and environment (R Foundation) for statistical computing and graphics [ 33 ], leveraging the Shiny package [ 34 ] for development.
A statement by an Institutional Review Board was not required because we used only publicly available data that do not necessitate Institutional Review Board review.
This study complied with the European General Data Protection Regulation (GDPR), which has been in force since 2018 in Europe [ 35 ]. The GDPR enhances the protection of individuals by introducing the right to be informed about the processing of personal data. However, informing each user individually may be impractical. Therefore, the GDPR introduces 2 legal conditions where informed consent is not mandatory, which can be interpreted as supporting the processing of web forum posts for pharmacovigilance (Article 9): “(e) processing relates to personal data which are manifestly made public by the data subject; [. . .] (i) processing is necessary for reasons of public interest in the area of public health, such as [. . .] ensuring high standards of quality and safety of health care and of medicinal products . . ..” The GDPR also requires data processing to “not permit or no longer permits the identification of data subjects” (Article 89). Deidentification was conducted during the extraction of posts from web forums to ensure privacy [ 9 ]. User identifiers in the main RDF file were encrypted using the SHA1 algorithm [ 36 ]. The correspondence between these encrypted identifiers and the original keys is presented in RDF triplets in a separate file, referred to as the “keys file.” Therefore, the only way to retrieve the original authors’ identities is by concatenating the main RDF containing the encrypted data with the keys file, which is kept in a secured location. Moreover, all our data processing was carried out on a secured server with restricted access.
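The separation between pseudonymized identifiers and the keys file can be sketched with the SHA-1 function from the Python standard library. This is an illustration only: the post layout, the function name, and the example user names are hypothetical, and the actual pipeline stores both parts as RDF triples rather than Python dictionaries. Note that SHA-1 is a one-way hash function, which is why a separate mapping is needed to recover the original identifiers.

```python
import hashlib


def deidentify(posts):
    """Replace author identifiers by SHA-1 digests; return posts and keys."""
    keys = {}  # digest -> original identifier, to be stored separately
    deidentified = []
    for post in posts:
        digest = hashlib.sha1(post["author"].encode("utf-8")).hexdigest()
        keys[digest] = post["author"]
        deidentified.append({**post, "author": digest})
    return deidentified, keys


# Hypothetical posts with made-up user names.
raw_posts = [
    {"author": "user_alpha", "text": "first post"},
    {"author": "user_beta", "text": "second post"},
    {"author": "user_alpha", "text": "third post"},
]
safe_posts, keys_table = deidentify(raw_posts)
```

Because the same identifier always yields the same digest, posts by one author remain linkable in the deidentified data, while re-identification requires access to the separately stored keys table.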
The primary outcome of this study is the operational PHARES pipeline itself. Daily extraction and annotation of posts are initiated and imported into the database linked to the user interface. In this paper, the platform’s use will be demonstrated through a specific use case on the analysis of Levothyrox ADE mentions in forums (discussed later). In addition, we conducted a comparative analysis of the PHARES pipeline with the existing platforms mentioned in the “Introduction” section, based on the criteria listed in the “Methods” section.
Of the 10 identified pipelines, half were public and half were private. While 8 out of 10 focused on ADEs, only 4 were designed for routine usage. Five scrapers were open source, and all posts from considered websites were extracted by only 6 of the scrapers (with others extracting posts under certain conditions). Six scraped web forum posts, but only 3 performed deidentification. Additionally, 4 pipelines focused on the French language. A total of 6 pipelines displayed the temporal evolution of the number of posts, but only 1 conducted a change-point analysis. Signal detection methods were performed by only 4 of them, and none displayed the temporal evolution of the PRR or offered a logistic regression–based method. Finally, 6 of them had an interface ( Table 3 ).
Pipeline | General: focus on ADEs | General: routine usage | General: public/private | Scraper: all posts | Scraper: deidentification | Scraper: web forums | Scraper: open source | Annotation: French language | Statistics: temporal evolution | Statistics: change-point analysis | Signal detection: signal detection | Signal detection: PRR temporal evolution | Signal detection: logistic regression | Interface |
PREDOSE | X | ✓ | Public | ✓ | X | ✓ | ✓ | X | ✓ | X | X | X | X | ✓ |
Insight Explorer | ✓ | X | Private | X | X | X | ✓ | X | X | X | X | X | X | ✓ |
MedWatcher Social | ✓ | ✓ | Public | X | X | ✓ | ✓ | X | ✓ | X | ✓ | X | X | ✓ |
Yeleswarapu et al [ ] | ✓ | X | Private | X | X | X | X | X | X | X | ✓ | X | X | X |
Domino | X | ✓ | Public | ✓ | X | ✓ | ✓ | ✓ | ✓ | X | X | X | X | ✓ |
Nikfarjam et al [ ] | ✓ | X | Public and Private | X | X | X | X | X | X | X | X | X | X | X |
Magge et al [ ] | ✓ | X | Public | ✓ | X | X | ✓ | X | ✓ | X | X | X | X | X |
ADR-PRISM | ✓ | X | Public and Private | ✓ | ✓ | ✓ | X | ✓ | ✓ | X | ✓ | X | X | ✓ |
Kappa Santé | ✓ | ✓ | Private | ✓ | ✓ | ✓ | X | ✓ | ✓ | ✓ | ✓ | X | X | ✓ |
Expert System | ✓ | X | Private | ✓ | ✓ | ✓ | X | ✓ | X | X | X | X | X | ✓ |
a PHARES: Pharmacovigilance in Social Networks.
b The X symbol means that the characteristic is missing and the symbol ✓ means the characteristic is fulfilled.
c ADE: adverse drug event.
d PRR: proportional reporting ratio.
e PREDOSE: Prescription Drug Abuse Online Surveillance and Epidemiology.
f ADR-PRISM: Adverse Drug Reaction from Patient Reports in Social Media.
We also compared the performance of our annotation process with those of up-to-date state-of-the-art methods ( Table 4 ).
While the annotation module demonstrated good performance for named entity recognition (F1-score=0.886), it remains slightly below the state of the art. Presently, in medical texts, the best performances are achieved by Hussain et al [ 25 ] and Ding et al [ 26 ] for the named entity recognition task, and by Xia [ 22 ] for the relationship extraction task. On Twitter, known for its notably more complex data, Hussain et al [ 25 ] achieved slightly better results than our annotator, while Ding et al [ 26 ] achieved slightly worse results.
Annotator | Language | Data | Natural language processing method | Named entity recognition (precision; recall; F1-score) | Relationship extraction (precision; recall; F1-score) |
PHARES | French | Patient’s web drug review | Conditional random fields and support vector machines | 0.926; 0.845; 0.886 | 0.683; 0.956; 0.797 |
Magge et al [ ] | English | Twitter | BERT neural networks | 0.82; 0.76; 0.78 | — |
Xia [ ] | English | Medical texts | HAMLE model | — | 0.929; 0.914; 0.921 |
Hussain et al [ ] | English | Medical texts (PubMed) and Twitter | BERT | 0.982; 0.964; 0.976 (PubMed) and 0.840; 0.861; 0.896 (X/Twitter) | — |
Ding et al [ ] | English | Medical texts (PubMed) and Twitter | BGRU + char LSTM attention + auxiliary classifier | 0.867; 0.948; 0.906 (PubMed) and 0.785; 0.914; 0.844 (Twitter) | — |
a The 2 categories are entity recognition, which is the detection of a drug or ADE mention, and relationship extraction, which is the detection of a relation between a drug and an ADE.
b PHARES: Pharmacovigilance in Social Networks.
c BERT: Bidirectional Encoder Representations from Transformer.
d Not available.
e HAMLE: Historical Awareness Multi-Level Embedding.
f BGRU: Bidirectional Gated Recurrent Unit.
g LSTM: Long-Short-Term-Memory.
From January 1, 2017, to February 28, 2021, a total of 2,081,296 posts were extracted from 23 French web forums ( Table 5 ). We obtained 713,057 normalized annotations of drugs, 1,527,004 normalized annotations of ADEs, and 437,192 annotations of normalized drug-ADE couples. The number of posts annotated with at least one normalized drug-ADE couple was equal to 125,279 (6.02%). Table 5 summarizes the number of posts extracted per forum, the publication dates, and the description of the web forum. For 1 forum, the publication dates were not available. A total of 9 were generalist health forums, 3 were specialized for parents of a young baby, 2 for families, 3 for mothers, 2 specialized in thyroid issues, 1 for pregnant women, 1 for women, 1 for parents of a teenager or for teenagers, 1 for sports persons, and 1 specialized in rare diseases.
Forum | Extracted posts, n | Publication date of the first extracted post | Publication date of the last extracted post | Description |
thyroideNEW | 451,253 | February 15, 2001 | February 25, 2021 | Specialized in thyroid issues |
doctissimoSante | 248,691 | March 19, 2003 | January 16, 2021 | Generalist health forum |
doctissimoNutrition | 183,730 | December 30, 2002 | January 16, 2021 | Specialized in nutrition |
infoBebe | 127,341 | November 30, 2000 | March 08, 2019 | Specialized for parents of a young baby |
atoute | 118,415 | February 05, 2005 | February 28, 2021 | Generalist health forum |
notreFamille | 97,098 | March 16, 2000 | October 26, 2017 | Specialized for families |
magicMaman | 96,713 | June 14, 1999 | February 22, 2021 | Specialized for mothers |
doctissimoMed | 95,531 | August 05, 2002 | January 15, 2021 | Generalist health forum |
doctissimoGrossesse | 93,449 | November 09, 2006 | January 15, 2021 | Specialized for pregnant women |
thyroide | 73,376 | September 25, 2001 | January 07, 2019 | Specialized in thyroid issues |
aufeminin | 72,732 | April 05, 2001 | January 09, 2020 | Specialized for women |
mamanVie | 69,167 | June 07, 2006 | April 10, 2019 | Specialized for mothers |
onmeda | 61,428 | July 25, 2001 | February 24, 2021 | Generalist health forum |
ados | 58,181 | June 20, 2006 | March 08, 2019 | Specialized for parents of a teenager or for teenagers |
carenity | 52,659 | May 16, 2011 | August 29, 2020 | Generalist health forum |
famili | 51,844 | November 06, 2000 | November 17, 2019 | Specialized for families |
babyFrance | 43,806 | January 20, 2003 | April 30, 2018 | Specialized for parents of young baby |
bebeMaman | 38,450 | — | — | Specialized for mothers of young baby |
alloDocteurs | 15,833 | June 15, 2009 | February 09, 2021 | Generalist health forum |
reboot | 9383 | May 04, 2016 | February 25, 2021 | Generalist health forum |
futura | 6765 | May 12, 2003 | February 22, 2021 | Generalist health forum |
sportSante | 6350 | May 10, 2011 | January 14, 2020 | Specialized for sportsperson |
maladieRares | 4827 | October 09, 2012 | May 14, 2020 | Specialized in rare diseases |
queChoisir | 4250 | June 16, 2003 | February 11, 2021 | Generalist health forum |
a Not available.
To demonstrate the usage of the pipeline, we chose to focus on Levothyrox as a case study. Levothyrox is a drug prescribed in France since 1980 for hypothyroidism and circumstances where it is necessary to limit the thyroid-stimulating hormone. In 2017, a new formula of Levothyrox, differing from the 30-year-old drug at the excipient level (with lactose being replaced by mannitol and citric acid in the new formula), was marketed with widespread media coverage. In parallel, an unexpected increase in notifications of ADEs for this drug was detected. Viard et al [ 37 ] were unable to find any pharmacological rationale to explain that signal. Approximately 32,000 adverse effects were reported by patients in France in 2017, representing 42% of all the ADEs collected yearly [ 38 ]. Most of these notifications concerned the new formulation of Levothyrox and led to the “French Levothyrox crisis.” In 2017, 1664 notifications of ADEs were spontaneously reported by patients to the Pharmacovigilance Center of Nice. Among the 1544 reviewed notifications, 1372 concerned Levothyrox while only 172 concerned other drugs [ 37 ].
In this use case, the study period was from January 1, 2017, to February 28, 2021, and the drugs included were 2 drugs from the “H03AA Thyroid hormones” ATC class: “Levothyroxine sodium” and “associations of levothyroxine and liothyronine.” A total of 17 forums were selected as they included at least one post with information about these drugs. Posts were extracted, annotated, and analyzed through the pipeline from several forums ( Table 6 ). Signal detection methods were applied to an ADE chosen as it frequently appeared with Levothyrox in our data: “tiredness.” A signal can be detected when the lower bound of the 95% CI of the logarithm of the PRR is greater than 0. For logistic regression, we applied the tenth quantile. A total of 11,340 posts contained an annotation concerning the drugs of interest. Figure S4 in Multimedia Appendix 1 illustrates the source and evolution over time of these posts. The 50,127 annotations of Levothyrox principally originated from the Vivre sans thyroïde forum and were mostly posted in mid-2017 ( Figure 4 , Table 6 ). The results of the statistical analysis were displayed by the user interface.
ADEs annotated with Levothyrox were mainly from system organ classes: general disorders and administration site conditions (29.6%), metabolism and nutrition disorders (11.6%), and endocrine disorders (11.4%). The PTs mostly found in association with Levothyrox are listed in Table 7 . All this information is accessible in the interface module (Figure S5 in Multimedia Appendix 1 ).
We chose the PT “tiredness” for the signal detection analysis. A total of 85,976 posts were annotated with either one of the drugs of interest or the ADE tiredness. Among them, 1841 Levothyrox-tiredness couples were found, mostly in 2017 ( Table 7 ).
Figure 5 illustrates the time evolution of the PRR for the Levothyrox-tiredness couple. Figure S6 in Multimedia Appendix 1 displays the source and evolution over time of French web forums’ posts for this couple. A signal is consistently generated throughout the period as the logarithm of the PRR is always greater than 0.
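The signal criterion used in this use case (lower bound of the 95% CI of the logarithm of the PRR greater than 0) can be sketched as follows. The standard error used here is the usual large-sample approximation for the logarithm of a ratio of proportions; the text does not state which formula the pipeline uses, so this choice is an assumption, as are the function names and the example counts.

```python
import math


def log_prr_ci(a, b, c, d, z=1.96):
    """Approximate 95% CI of ln(PRR) from the contingency cells a, b, c, d."""
    log_prr = math.log((a / (a + b)) / (c / (c + d)))
    # Large-sample standard error of ln(PRR) (assumed, not from the paper).
    se = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    return log_prr - z * se, log_prr + z * se


def is_signal(a, b, c, d):
    """A signal is raised when the lower CI bound of ln(PRR) exceeds 0."""
    lower, _ = log_prr_ci(a, b, c, d)
    return lower > 0


# Two hypothetical tables with the same PRR of 2 but different sample sizes.
well_supported = is_signal(20, 80, 100, 900)
sparse = is_signal(2, 98, 10, 990)
```

The two examples share the same point estimate, but only the first has enough reports for the interval to exclude 0, which illustrates why the criterion is based on the lower bound and not on the PRR alone.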
Forum | Value, n | Cumulative frequency, % |
Vivre sans thyroïde | 41,211 | 82.21 |
Doctissimo Santé | 4230 | 90.65 |
Doctissimo Grossesse | 1476 | 93.60 |
Doctissimo Nutrition | 1177 | 95.94 |
Carenity | 863 | 97.67 |
Allo docteurs | 502 | 98.67 |
Atoute | 170 | 99.01 |
Doctissimo medicaments | 166 | 99.34 |
Que choisir | 85 | 99.51 |
Maladie rares | 76 | 99.66 |
Au feminin | 58 | 99.77 |
Sport santé | 50 | 99.87 |
Onmeda | 48 | 99.97 |
Famili | 7 | 99.98 |
Futura | 5 | 99.99 |
Maman vie | 2 | 100.00 |
Magic maman | 1 | 100.00 |
Preferred terms | Values, n |
Pain | 1882 |
Tiredness | 1841 |
Faintness | 1267 |
Hypothyroidism | 1110 |
Dizziness | 912 |
Insomnia | 627 |
Palpitations | 571 |
Hyperthyroidism | 568 |
Malignant tumor | 560 |
Anxiety | 498 |
Overdose | 490 |
Nervous tension | 484 |
Myalgia | 409 |
Nausea | 388 |
Stress | 380 |
Diarrhea | 354 |
Tachycardia | 322 |
Muscle spasms | 321 |
Convulsions | 302 |
Arthralgia | 276 |
A total of 11 drugs were found to be associated with tiredness using logistic regression: paclitaxel, pegfilgrastim, Levothyrox, glatiramer acetate, escitalopram ferrous sulfate, the combination of Levothyrox and liothyronine, secukinumab, methotrexate, bismuth potassium, tetracycline, and metronidazole.
Change-point analysis was conducted on the monthly evolution of the number of Levothyrox-ADE couples detected in web forums. Six breakpoints were identified ( Figure 6 ), and 3 of them correlated with an increase in the number of ADEs found with Levothyrox on web forums. These increases occurred in August 2017 and in September and December 2018.
This use case demonstrates that the results obtained through the pipeline, particularly in the context of Levothyrox, align with findings in the literature derived from more traditional data sources such as case reports in pharmacovigilance (see the “Discussion” section). It underscores the potential of leveraging such a pipeline to monitor a drug, not only retrospectively but also in real time using social media. Consequently, PHARES has the capability to potentially uncover new signals in pharmacovigilance.
To align with our objective, we implemented and evaluated a pipeline that processes data from the extraction of web forum posts to the generation of indicators and alerts within a visual and interactive environment. Through this pipeline, we demonstrated that quantitative analysis can be conducted through the interface without requiring the user to code. We found that it is feasible to obtain information consistent with the literature regarding a drug’s ADEs, as well as to identify unexpected ADEs and the dates of significant events related to a drug. This underscores the relevance and utility of such a pipeline.
A conceptual contribution of this research was the proposal of a methodology for designing a pipeline to facilitate pharmacovigilance studies on web forums. This involved describing 4 independent modules and outlining their interactions. Additionally, another contribution was the adaptation of certain pharmacovigilance analysis methods for the examination of data extracted from web forum posts. The logistic regression–based method presented in this article was originally tailored for pharmacovigilance cases to consider co-prescriptions of drugs. We have adapted it to suit the analysis of pharmacovigilance data extracted from web forum posts.
The PHARES pipeline offers added value compared with previous pipelines in terms of the criteria set, which reflects an analysis of experts’ needs for routine monitoring of ADEs in social media. Unlike previous approaches, the scrapers used in PHARES routinely perform deidentification, and the inclusion of change-point analysis, the evolution of PRRs over time, and a logistic regression–based signal detection method were previously unavailable. The temporal evolution of the number of posts and a signal detection method are also seldom supported. Designed for routine usage and focused on ADEs, all posts from selected web forums are scraped and deidentified using an open-source scraper.
The period and selected web forums differed between both studies: Audeh et al [ 38 ] covered the period from January 2015 to December 2017, while our study spanned from January 2017 to February 2021. Additionally, Audeh et al [ 38 ] included only 1 web forum specialized in thyroid issues, whereas we incorporated this specific forum along with 16 others. The main ADEs associated with Levothyrox in our study align with those found by Audeh et al [ 38 ] on similar data, albeit without using the interface. In our study, the 10 most frequent symptoms were pain, tiredness, faintness, hypothyroidism, dizziness, insomnia, palpitations, hyperthyroidism, malignant tumor, and anxiety. By contrast, Audeh et al [ 38 ] reported tiredness, weight gain, pain, ganglions, hot flush, chilly, inflammation, faintness, weight loss, and discomfort.
Furthermore, the PHARES pipeline surpasses previous efforts, particularly regarding several criteria. These include the annotation tool, where only 4 pipelines were identified using a French annotator tool. In terms of available statistics, only 1 pipeline met both criteria we identified. Regarding signal detection, among the 3 criteria identified, 5 pipelines matched with only 1, while the remaining 5 matched with none.
In the use case, a notable increase in the number of ADEs associated with Levothyrox was detected using the change-point analysis method a few months after the introduction of the new formula in March 2017, specifically in August 2017. This surge coincided with the initial declaration to the pharmacovigilance network and a petition initiated by patients to reintroduce the former formula in June 2017. We compared these findings with results from a pharmacovigilance study based on spontaneous reporting. Out of 1554 notifications spontaneously addressed by patients to the Pharmacovigilance Center of Nice from January 1, 2017, to December 31, 2017, 1372 were related to the new formula of Levothyrox, representing 7342 ADEs. Our comparison with these data clarified our findings. The 10 most frequently reported ADEs in these notifications closely resembled our own results [ 37 ]. These were asthenia, headache, dizziness, hair loss, insomnia, cramps, weight gain, nausea, muscle pain, and irritability. Consequently, our results demonstrate coherence with the existing literature. This study illustrates the feasibility of identifying the date of significant events related to a drug. However, it is noteworthy that the detection of such events is not necessarily expedited through social media compared with the traditional pharmacovigilance system.
The method used in our annotation process was integrated at an early stage during the pipeline’s design. Regarding the identification of drugs and symptoms, our annotation process exhibited the following performances: precision=0.926, recall=0.845, and F1-score=0.886 [ 20 ]. Similarly, for discerning the relationship between the drug and the ADEs, the performances were precision=0.683, recall=0.956, and F1-score=0.797 [ 20 ]. This study marked the inaugural publication on using NLP methods to identify ADEs in French-language web forums. The annotation process was thus developed using contemporary state-of-the-art methodologies at the time. However, it would now stand to gain from the integration of more recent NLP algorithms for named entity recognition [ 8 , 23 , 24 ]. These newer algorithms offer comparable performances while effectively handling more complex data, thereby enhancing the efficacy of NLP analysis. However, because of our emphasis on the genericity of the approach and the interoperability between the different modules rather than solely focusing on the performance of each module, we opted not to use these algorithms. Nevertheless, contemporary state-of-the-art methods for annotating ADEs from social media posts encompass convolutional neural networks trained on top of pretrained word vectors for sentence-level classification [ 24 ] and transformers using the bidirectional encoder representations from transformers (BERT) language model [ 39 ]. Hussain et al [ 25 ] introduced a multitask neural network based on BERT with hyperparameter optimization capable of sentence classification and named entity recognition. This model achieved performances of precision=0.840, recall=0.861, and F1-score=0.896 on the Twitter (X)-TwiMed data set. Additionally, Magge et al [ 8 ] presented a pipeline consisting of 3 BERT neural networks designed to classify sentences, extract named entities, and normalize those entities to their respective MedDRA concepts. The performances of this model were as follows: precision=0.82, recall=0.76, and F1-score=0.78 on the SMM4H-2020 data set (Twitter/X). Thanks to our modular design, it will be straightforward to substitute our current annotation process with an enhanced model in the future.
Several limitations should be acknowledged for future work. First, the scraper relies on the HTML structure of web forums, necessitating updates to its configuration files if a forum alters its page design. Additionally, our interface lacks the capability to incorporate alternate identifiers for drugs or ADEs. For instance, patients may commonly refer to the drug “baclofen” as “baclo” on social media platforms. Consequently, the number of posts pertaining to a drug or ADE could potentially be underestimated.
Forums must be selected before query execution to mitigate calculation time. However, selecting forums based on the presence of information related to a particular drug or ADE can introduce bias into signal detection methods, particularly in disproportionality analysis, where the drug-ADE pair may be overrepresented. Another limitation in qualitative analysis of posts is the inability of users to edit annotations or record typical pharmacovigilance qualitative data.
The assumption that all drugs mentioned in a post were consumed simultaneously by the user, as applied in the logistic regression–based method, introduces an evident bias.
One limitation associated with the use of social media data pertains to fraudulent posts. The pseudonymity inherent in these platforms provides malevolent individuals with the opportunity to disseminate false rumors. Additionally, patients might post identical or similar messages across multiple discussion boards, or even multiple times on the same board. Thus, it is crucial to consider these factors to mitigate biases in signal detection.
In the short to medium term, our objectives are updating the annotation module to enhance accuracy, improving the qualitative analysis by enabling users to edit and correct annotations, and expanding the range of signal detection methods available in the statistics module.
This method could indeed be beneficial for identifying potential drug misuse and unknown ADEs [ 40 ]. By categorizing pathological terms found in web forums based on their presence in the summary of product characteristics, we can distinguish between indications, known ADEs, and potential instances of drug misuse or unexpected ADEs. However, it is important to note that considering all pathological terms found in the summary of product characteristics as indications might obscure cases of drug inefficiency. Therefore, a nuanced approach is necessary to ensure comprehensive and accurate analysis.
We next tested our pipeline from the perspective of end users. However, the hypothesis was only partially confirmed, indicating the need for further studies. These studies should include evaluations with ergonomic criteria.
In the long term, our vision is to expand this tool to encompass other languages and themes beyond pharmacovigilance. This includes areas such as drug misuse, the consumption of food supplements, and the use of illegal drugs. French web forums dedicated to recreational drug use already exist, providing a valuable source of data for such endeavors.
Our hypothesis focused on the challenge encountered by regulatory agencies in using social media, primarily because of the lack of appropriate decision-making tools. To tackle this challenge, we devised a pipeline consisting of 4 editable modules aimed at effectively analyzing health-related French web forums for pharmacovigilance purposes. Using this pipeline and its user-friendly interface, we successfully demonstrated the feasibility of conducting quantitative analyses without the need for coding. This approach yielded coherent results and holds the potential to reveal new insights about drugs.
A practical implication of our pipeline is its potential application in health surveillance by regulatory agencies such as the ANSM or pharmaceutical companies. It can be instrumental in detecting issues related to drug safety and efficacy in real time. Furthermore, research teams can leverage this tool to retrospectively analyze events and gain valuable insights into pharmacovigilance trends.
The annotation module was developed by François Morlane-Hondère, Cyril Grouin, Pierre Zweigenbaum, and Leonardo Campillos-Llanos from the Computer Science Laboratory for Mechanics and Engineering Sciences (LIMSI). Code review for the graphical user interface in R language was performed by Stevenn Volant under a contract with the Stat4Decision company. Stat4Decision was not involved in designing the study and writing this article. This work was funded by the Agence nationale de sécurité du médicament et des produits de santé (ANSM) through Convention No. 2016S076 and was supported by a PhD contract with Sorbonne Université.
Our data were extracted from web forums that do not allow data sharing. Because we do not own the data, we cannot make them available. The scraper we developed to extract these data is open source and can be used to extract data from web forum posts. The tool and full documentation (in English and French) of the code and configuration file are available online [ 41 ].
None declared.
Vigi4Med Scraper structure, PHARES database structure, example of data representation, and source and evolution over time of web forum posts. PHARES: Pharmacovigilance in Social Networks.
adverse drug event
Agence nationale de sécurité du médicament et des produits de santé
Anatomical Therapeutic Classification
Bidirectional Encoder Representations from Transformers
comma-separated values
Cumulative Sum
European Medicines Agency
Food and Drug Administration
French Pharmacovigilance Database
General Data Protection Regulation
French National Health Authority
Medical Dictionary for Regulatory Activities Terminology
natural language processing
Pharmacovigilance in Social Networks
Prescription Drug Abuse Online Surveillance and Epidemiology
proportional reporting ratio
preferred term
resource description framework
reporting odds ratio
Recognizing Adverse Drug Reactions
Edited by A Mavragani; submitted 01.02.23; peer-reviewed by S Matsuda, L Shang; comments to author 06.07.23; revised version received 20.10.23; accepted 12.03.24; published 18.06.24.
©Pierre Karapetiantz, Bissan Audeh, Akram Redjdal, Théophile Tiffet, Cédric Bousquet, Marie-Christine Jaulent. Originally published in the Journal of Medical Internet Research (https://www.jmir.org), 18.06.2024.
This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research, is properly cited. The complete bibliographic information, a link to the original publication on https://www.jmir.org/, as well as this copyright and license information must be included.
A Level 4 E/M ED service was provided. Instructions: Assign ICD and CPT codes for this case. CPT: 99284, 93000. ICD: 733.6. Diagnosis: Appendicitis with gangrene and perforation. Procedure: Abdominal appendectomy. A 24-year-old female presented to the emergency department (ED) with complaints of midepigastric pain.
Case Study 4: Radiology Coding. Scenario: A patient comes to the radiology department for an X-ray of the right ankle after twisting it during a sports activity. Question 4: Assign the appropriate CPT code for the X-ray of the right ankle. Answer 4: CPT code 73610 (Radiologic examination, ankle; 2 views) should be assigned for the X-ray of the ...
Below are practice medical coding study cases meant to provide you with some real-world case studies. Keep in mind that medical coding requires knowledge of the current coding systems, such as ICD-10-CM (International Classification of Diseases, 10th Revision, Clinical Modification) for diagnosis coding and CPT (Current Procedural Terminology) for procedure coding.
Because of new coding rules, one unit (15 min) of a prolonged services code can be used. CASE 4 OFFICE VISIT: Return visit for vaccination with series, with separately identifiable E/M. The 28-year-old female from Case 3 returns to her physician's office 2 months later for the second of her series of three HPV vaccines.
CPC (Medical Coding) CPB (Medical Billing) CPC + CPB. All Certification Courses. Continuing Education. Search for CEUs. Webinars. Workshops. Specialty Certificates. ... 2021 E/M Case Study: How the Latest Changes May Impact Your Practice. Read Case Study. Case Study. How AAPC Audit Services Improves Coding Accuracy. Read Case Study.
Medical Coding Training. ... All examples and case studies used in our study guides, exams, and workbooks are actual, redacted office visits and procedure ... interventions, diagnostic test and studies, and treatment outcomes. Coding is the process of translating this written
Let our coding case studies & solutions inspire you. Our Featured Medical Coding Case Studies provide an in-depth review of challenges with coding quality, reimbursements, rework, compliance, etc. Conducting audits, evaluations, and implementation plans are inherent components of any effective solution. If you're searching for medical coding ...
Here are some more examples of Evaluation and Management (E/M) coding scenarios: Case Study 6: Outpatient Clinic Visit. Scenario: A 45-year-old female patient visits an outpatient clinic for a follow-up regarding her diabetes and hypertension. The physician reviews her current medications, orders lab tests, and adjusts her treatment plan.
Case Study 2: Insurance Claims. Medical coding also plays a vital role in insurance claims processing. For instance, imagine a patient visits a primary care physician for a routine check-up. During the visit, the physician diagnoses the patient with hypertension and prescribes medication. The medical coder will translate the diagnosis and ...
This refers you to ICD-10-CM code H65.0-. As you can see, the code requires a fifth digit, which specifies laterality and other conditions, such as recurrence. To find the choices for the additional digit, you need to look in the ICD-10-CM Tabular List under H65.0 (Acute serous otitis media). The fifth digit symbol next to the code is another ...
In the medical billing and coding field, getting paid requires accurate documentation and selecting the correct codes. In our Coding Case Studies, we explore the correct coding for a specific condition based on a hypothetical clinical scenario. This scenario involves a patient presenting for a follow-up visit with symptoms of Type 2 Diabetes.
Learn everything you need to know about transforming medical diagnosis, procedures, medical services, and equipment into universal codes. ... Case Study and Coding. case-studies. Article; Audio; Case Study; E-brief; Tools; Video; White Paper; Case Study; Coding; Case Study. From Rework to Revenue | A Case Study . Read Case Study.
The CPC Exam has 100 questions from 17 categories. This practice exam is likewise composed of 100 questions, drawn from the same 17 medical billing and coding categories you will see on the actual exam: CPT 10000 - Integumentary System (Skin, nails, hair) CPT 20000 - Musculoskeletal System. CPT 30000 - Respiratory System. CPT 40000 - Digestive.
Case Study Example for ICD-10-CM. Here is an example of an ICD-10-CM case study that will help you understand proper coding. CHIEF COMPLAINT: Follow up on Diabetes. HISTORY OF PRESENT ILLNESS: Established female patient presents today for a re-evaluation of her Diabetes Mellitus Type I. Last A1C was done in April with a result of 7.8.
Case 1: Level 3 ESTABLISHED (99213) A 12-year-old female is brought by her mother for a follow-up of known complex partial seizure disorder on carbamazepine. Pre-Visit: Reviewed prior clinic notes [2 MINS] Visit: Performed medically appropriate history and exam. No recent reported seizures since carbamazepine was initiated.
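The bracketed minutes in a case like this are added up to support time-based code selection. The following is a minimal sketch, assuming the 2021 CPT office-visit time thresholds for established patients (at least 10, 20, 30, and 40 minutes for 99212 through 99215). The function name, the activity labels, and every minute value other than the 2-minute pre-visit review are hypothetical and are not taken from the case.

```python
# Minimum total minutes on the date of the encounter for established-patient
# office visits. ASSUMPTION: thresholds follow the 2021 CPT E/M time ranges.
ESTABLISHED_TIME_THRESHOLDS = [
    (40, "99215"),
    (30, "99214"),
    (20, "99213"),
    (10, "99212"),
]


def established_visit_code(minutes_by_activity):
    """Return the highest code whose time threshold the total meets, else None."""
    total = sum(minutes_by_activity.values())
    for minimum, code in ESTABLISHED_TIME_THRESHOLDS:
        if total >= minimum:
            return code
    return None


# Hypothetical breakdown: only the 2-minute pre-visit review comes from the case.
example = {"pre_visit_review": 2, "visit": 15, "post_visit_documentation": 5}
print(established_visit_code(example))
```

With these illustrative values the total is 22 minutes, which falls in the 99213 range. Time is only one of the two permitted bases for selecting the level; medical decision making is the other.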
In this case, there was no noted history of noncompliance. In this note, the side effects of stopping the medication include headache, which remains a patient complaint for this encounter. When documenting headache, specify whether it is intractable or not intractable. Coding ICD-9-CM Diagnosis Codes. V70.0 Routine medical exam
Coding Practice II: Medical Record Case Study. Chapter Objectives: Identify common formats of the medical record. Describe the basic steps taken to review a medical ... Ten Steps for Coding from Medical Records: Before beginning the process of coding, make sure sufficient basic materials are in place, including up-to-date ICD-9-CM codebooks, a ...
Study with Quizlet and memorize flashcards containing terms like Inpatient Physician's Progress Note 12/20/XX Two-day-old infant examined today to follow up after the results of diagnostic tests. BLOOD GAS: Study indicates reduced oxygen tension and ineffective gas exchange. CHEST X-RAY: Presence of infiltrate Infant continues to exhibit signs of infant respiratory distress syndrome, type 2 ...
In this case study on outsourced medical coding services, we recount the story of transition and accuracy of services while ramping up to 70,000 charts. ... Access Healthcare's medical coding outsourcing services are backed by years of experience in healthcare revenue cycle management. Our team uses technology and human expertise to provide ...
H01.004, unspecified blepharitis of left eyelid (lid means upper; "upper" is also stated in the medical record). Blepharitis is the medical term for inflammation of the eyelid. - ICD-10-CM > Index > Blepharitis, left, upper. - Coding guidelines for outpatient encounters state not to code possible or probable diagnoses; therefore, herpes is not assigned a code.
Case study Optum360 helps Truman Medical Centers solve coding challenges, support ICD-10 efficiency and improve financial performance. Truman Medical Centers (TMC) provides accessible, state-of-the-art quality health care to the Kansas City community. Anchored by two acute care academic hospitals, TMC Hospital Hill and TMC Lakewood,
Delve into the complexities of inpatient Evaluation and Management (E/M) coding through our insightful case studies. Designed for medical coder trainees and professionals, this resource provides real-world scenarios illustrating the intricacies of inpatient E/M coding. Learn how to navigate challenging coding situations, ensure accuracy in documentation, and optimize billing processes.
This significantly improves efficiency and accuracy, reducing the burden on medical coders. Reason 5: Accurate Capture of Past Surgical History. Failure to capture past surgical history, such as ...
Competitive salaries. Quick-launch career. High-demand opportunities. Career longevity. Medical billing and coding professionals enjoy financial security working in the healthcare industry and earn an average annual salary of $60,917 with an annual growth rate of 9%. Becoming a medical biller and coder doesn't require a 4-year college education — or even a 2-year college education.
The Medical Coding Program is intended for individuals seeking entry-level employment as well as for advancement or cross-training opportunities for those who are currently employed in a healthcare environment. Students learn ICD-10-CM/PCS and CPT/HCPCS Level II guidelines and apply them while coding various scenarios, including inpatient records, outpatient visits, operative reports, and ...
In addition, in the interest of tracking the progression or resolution of findings over sequential radiology studies, the Observation.focus can be used to attach a tracking identifier by referencing DICOM "tracking-uid" coding. 22 A tracking UID uniquely identifies the observation as part of a series of snapshots of the imaging appearance ...
Background: To mitigate safety concerns, regulatory agencies must make informed decisions regarding drug usage and adverse drug events (ADEs). The primary pharmacovigilance data stem from spontaneous reports by health care professionals. However, underreporting poses a notable challenge within the current system. Explorations into alternative sources, including electronic patient records and ...