As the amount of available real-world data continues to increase, sophisticated methods to analyze large quantities of healthcare data are gaining traction, paving the way for personalized care and treatments. What are these innovative methods, how are they used, and why go beyond traditional data analysis techniques?
What is artificial intelligence?
Let’s begin by defining some key terms that come up frequently in the analysis of real-world data.
Artificial Intelligence (AI) covers a wide range of processes that allow computers to perform tasks that generally require human intelligence, such as visual perception, decision-making, and speech recognition.
Machine learning (ML) is a subset of AI. It refers to algorithms that learn from data to make predictions or decisions, and whose accuracy typically improves as they are exposed to more data and experience. There are two main types of learning algorithms:
- Unsupervised learning uses techniques such as clustering and dimensionality reduction to automatically discover patterns in data.
- Supervised learning uses classification and regression techniques to learn a mapping from inputs to known, labeled outputs. The model is fitted on a training dataset and evaluated on a separate test dataset.
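To make the supervised case concrete, here is a minimal sketch of a classifier that is fitted on labeled training records and then checked against held-out test records. It uses a 1-nearest-neighbor rule purely for brevity; the features, labels, and values are invented for illustration and do not come from any real dataset.

```python
import math

# Hypothetical labeled records: (age, prior hospitalizations) -> outcome label.
# All values are invented for illustration only.
train = [
    ((34, 0), "low_risk"),
    ((41, 1), "low_risk"),
    ((38, 0), "low_risk"),
    ((67, 4), "high_risk"),
    ((72, 3), "high_risk"),
    ((80, 5), "high_risk"),
]

# Held-out records with known labels, used only to evaluate the classifier.
test = [
    ((36, 1), "low_risk"),
    ((75, 4), "high_risk"),
]


def predict(features, labeled_examples):
    """Return the label of the closest training example (1-nearest neighbor)."""
    nearest = min(labeled_examples, key=lambda ex: math.dist(features, ex[0]))
    return nearest[1]


def accuracy(test_set, labeled_examples):
    """Share of held-out records whose predicted label matches the known label."""
    correct = sum(predict(x, labeled_examples) == y for x, y in test_set)
    return correct / len(test_set)


print(accuracy(test, train))
```

An unsupervised method would receive the same feature pairs without the labels and would have to discover the two groups on its own.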
Natural language processing (NLP) processes and analyzes large amounts of unstructured language data, such as free text, to extract meaning from it.
Current applications in HEOR
AI has a variety of applications in Health Economics and Outcomes Research (HEOR), notably to respond to a number of important research questions around disease burden, real-world effectiveness and safety, patient pathways, and more.
Disease burden studies often rely on electronic health records and/or insurance claims databases to evaluate the healthcare costs and resource utilization associated with a specific disease. The ‘human cost’ (i.e. how the disease affects a patient’s quality of life) also needs to be factored in, but it is difficult to capture from these data sources: quality-of-life information is usually collected separately, through cross-sectional surveys designed to give a more holistic picture of patients’ personal experience. Social media listening can complement these surveys by providing a searchable dataset on how patients’ lives are affected by the disease.
Social media listening uses NLP to analyze publicly available social media posts from platforms such as Twitter, YouTube, Reddit, and Instagram. Data points from these sources are then aggregated to identify patterns and draw informed conclusions about the burden of a specific disease. Topic modeling, aspect extraction, and sentiment analysis are NLP techniques that are frequently used to identify trends within the collected texts.
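As a rough illustration of sentiment analysis, the sketch below scores posts against a small word list and aggregates the results. The lexicon and the posts are hypothetical; production pipelines generally rely on much larger, validated lexicons or trained language models, so this should be read as a toy example of the idea rather than a recommended method.

```python
import re
from collections import Counter

# Tiny hypothetical sentiment lexicon, for illustration only.
POSITIVE = {"better", "improved", "relief", "great", "helped"}
NEGATIVE = {"worse", "pain", "tired", "nausea", "awful"}


def sentiment_score(post):
    """Number of positive lexicon words minus number of negative ones."""
    words = re.findall(r"[a-z']+", post.lower())
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


def label(score):
    """Map a numeric score onto a coarse sentiment label."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# Invented example posts.
posts = [
    "The new treatment helped, I feel so much better",
    "Constant pain and nausea, this week was awful",
    "Saw my doctor today",
]

# Aggregate post-level labels into an overall picture.
summary = Counter(label(sentiment_score(p)) for p in posts)
print(summary)
```

A word-count approach like this ignores negation and context ("not better" would score as positive), which is one reason more sophisticated NLP models are used in practice.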
While social media listening may support evidence generation in relation to patients’ quality of life, it can also be used to evaluate the real-world effectiveness and safety of a newly launched drug or treatment. For instance, it can be used to monitor improvement in ability to perform day-to-day activities, measure the frequency of side effects, and analyze patient feedback.
The evaluation of patient pathways is another integral component of HEOR. Robust data on real-world patient journeys can help improve clinical care strategies and patient outcomes by reducing variability in clinical practice. However, standard data analytics techniques require a multidisciplinary approach to extract and analyze data on patient pathways. Advanced ML methods can provide a more detailed overview of patient pathways using clustering techniques, for instance to identify patterns in the data based on treatment phases, their duration, and associated outcomes.
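The clustering idea can be sketched with a plain k-means routine. In this hypothetical example each patient pathway is summarized by two invented features, the duration of first-line treatment in months and the number of treatment lines received. The starting centroids are chosen by hand to keep the run deterministic. Real analyses would normally use an established library, scale the features, and work with far richer pathway data.

```python
import math


def assign(points, centroids):
    """Group each point with its nearest centroid (Euclidean distance)."""
    clusters = [[] for _ in centroids]
    for p in points:
        nearest = min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
        clusters[nearest].append(p)
    return clusters


def kmeans(points, centroids, iterations=10):
    """Plain k-means: alternate between assigning points and updating centroids."""
    for _ in range(iterations):
        clusters = assign(points, centroids)
        # New centroid = mean of the cluster; keep the old one if the cluster is empty.
        centroids = [
            tuple(sum(dim) / len(c) for dim in zip(*c)) if c else old
            for c, old in zip(clusters, centroids)
        ]
    return centroids, assign(points, centroids)


# Hypothetical pathways: (first-line duration in months, number of treatment lines).
pathways = [(2, 1), (3, 1), (4, 1), (14, 3), (16, 2), (18, 3)]

# Hand-picked starting centroids (an assumption, to keep the example deterministic).
centroids, clusters = kmeans(pathways, [pathways[0], pathways[-1]])
print(centroids)
print(clusters)
```

Here the routine separates short single-line pathways from longer multi-line ones; the clinical meaning of any such grouping would still need to be interpreted by domain experts.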
There are many other ways AI can be successfully applied to HEOR to generate previously unavailable insights on a patient population or subgroup of interest. In particular, ML can be used to develop case ascertainment algorithms that identify patients with a specific disease in administrative claims data. Classification algorithms may also be used for predictive modeling, which helps practitioners forecast clinical outcomes and identify which factors play an important role in these forecasts. These applications may help identify which patients should be screened for specific diseases, leading to earlier diagnoses and an increased ability to forecast future clinical events.
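Whatever method produces a case ascertainment algorithm, it is usually judged by comparing its output with a reference standard such as chart review. The sketch below assumes a hypothetical rule (two or more diagnosis claims, or one diagnosis claim plus a prescription claim) and invented patient records. It only illustrates how sensitivity, specificity, and positive predictive value (PPV) would be computed, not how a validated algorithm is defined.

```python
def flag_case(patient):
    """Hypothetical rule: >= 2 diagnosis claims, or >= 1 diagnosis claim plus a prescription claim."""
    return patient["dx_claims"] >= 2 or (
        patient["dx_claims"] >= 1 and patient["rx_claims"] >= 1
    )


def validate(patients):
    """Compare the rule with the reference standard and return accuracy metrics."""
    tp = fp = fn = tn = 0
    for p in patients:
        predicted, actual = flag_case(p), p["confirmed"]
        if predicted and actual:
            tp += 1
        elif predicted and not actual:
            fp += 1
        elif not predicted and actual:
            fn += 1
        else:
            tn += 1
    # Assumes each denominator is non-zero, which holds for the sample data below.
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
    }


# Invented records; "confirmed" stands in for a chart-review reference standard.
patients = [
    {"dx_claims": 3, "rx_claims": 1, "confirmed": True},
    {"dx_claims": 1, "rx_claims": 1, "confirmed": True},
    {"dx_claims": 1, "rx_claims": 0, "confirmed": True},
    {"dx_claims": 2, "rx_claims": 0, "confirmed": False},
    {"dx_claims": 0, "rx_claims": 0, "confirmed": False},
    {"dx_claims": 0, "rx_claims": 1, "confirmed": False},
    {"dx_claims": 1, "rx_claims": 0, "confirmed": False},
    {"dx_claims": 2, "rx_claims": 2, "confirmed": True},
]

metrics = validate(patients)
print(metrics)
```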
High potential, with some caveats
By applying innovative methods to large volumes of healthcare data, AI has the potential to generate real breakthroughs in patient care, patient management, and disease detection. Methods such as NLP and clustering provide novel approaches to identify patterns in healthcare data that could otherwise go unnoticed. Furthermore, findings from a previous targeted literature review show that ML classification methods applied to real-world data have the potential to improve predictive power across a range of health outcomes.
However, as the use of AI on healthcare data continues to rise, some important considerations should be kept in mind. First, these methods are relatively new and results are still difficult to replicate, which is why it is important to be fully transparent about datasets, methodology, and other analytical choices when applying AI to healthcare data. Second, as with any data analytics technique, the quality of the data considerably affects the quality of the results: if relevant variables are unavailable or significant volumes of data are missing, the interpretability and reliability of results may be quite limited. Finally, expert clinicians still need to be consulted, and indeed actively involved in the process, to judge how relevant and applicable a given dataset is to the research question under investigation. This is crucial to ensure the results can be safely interpreted.
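One simple way to act on the data quality point is to report how much of each variable is missing before any modeling starts. The following sketch assumes records stored as dictionaries, with invented variable names and values; a variable counts as missing when it is absent from a record or set to None.

```python
def missing_share(records, variables):
    """Fraction of records in which each variable is absent or None."""
    return {
        v: sum(r.get(v) is None for r in records) / len(records)
        for v in variables
    }


# Invented patient records with gaps in some variables.
records = [
    {"age": 54, "bmi": 27.1, "smoking_status": "never"},
    {"age": 61, "bmi": None, "smoking_status": None},
    {"age": 47, "bmi": 31.4},
    {"age": 70, "bmi": None, "smoking_status": "former"},
]

report = missing_share(records, ["age", "bmi", "smoking_status"])
print(report)
```

What level of missingness is acceptable depends on the research question and on why the data are missing, which is again a judgment that benefits from clinical input.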
If appropriate methodological guidelines are developed and these relatively new approaches become more standardized over time, then it is only a matter of time until AI can unlock the full potential of healthcare data.