At the second Xavier Health AI Summit, Amgen Quality Data Sciences Executive Director Dan Weese and Director Mark DiMartino presented the strategy, approaches, and lessons learned from their implementation of AI tools in Amgen’s quality processes, with a specific focus on the natural language processing (NLP) project used to analyze and trend manufacturing non-conformance reports (NCs).
In the first part of this series, we covered:
- Creation of Amgen’s data science tool
- Amgen’s data science process
- Quality data sciences areas of focus
If you missed the first part, you can find it here.
In this second part, we review:
- Amgen’s current NC trending process
- Designing and piloting a new process
- Q&A focusing on GMP documentation aspects
The Current NC Trending Process Is Manual
DiMartino explained that the purpose of an NC system is to:
- Capture errors or deviations
- Understand the impact on the product and process
- Investigate to find a root cause
- Fix the root cause in order to prevent a recurrence
He pointed to the FDA 2006 guidance on quality systems as well as ISO 9001 and ISO 13485, which indicate such a process is needed for both drug and device manufacturing.
“At Amgen, we do not investigate all of our NCs,” DiMartino said. “We rank them based on risk. If they are at low risk, we do not necessarily investigate them. But we track them to be sure we do not see a recurrence.”
Trending Process
While the goal of the process is finding and correcting root causes, not all root cause investigations are effective. A trending process can aid those investigations by connecting similar occurrences that an individual investigator may not have seen.
DiMartino shared a graphic that outlines Amgen’s current NC trending process, which he characterized as robust. It involves the use of control charts and Pareto charts, with monthly and quarterly activities (see Figure 1).
Figure 1
While robust, the trending program does have weaknesses, DiMartino said. “One of the things I have been asked to defend in inspections is, ‘If you are using a control chart, you are setting a baseline and evaluating against an increase relative to that baseline. In doing so, you are accepting there is a baseline level of deviations that is acceptable.’ It is hard to argue that that is the best approach.”
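For context on that baseline critique, a count-based control chart (c-chart) is one common way such limits are set. The minimal sketch below assumes hypothetical monthly NC counts and a Poisson-based 3-sigma limit; it is illustrative only and not drawn from Amgen's program.

```python
import numpy as np

# Hypothetical monthly NC counts; illustrative only.
nc_counts = np.array([12, 9, 15, 11, 26, 10, 13, 16, 12, 11])

# c-chart limits: the baseline is the mean count, with 3-sigma limits
# based on the Poisson assumption (sigma ~ sqrt(mean)).
c_bar = nc_counts.mean()
ucl = c_bar + 3 * np.sqrt(c_bar)
lcl = max(c_bar - 3 * np.sqrt(c_bar), 0.0)
print(f"baseline={c_bar:.1f}, UCL={ucl:.1f}, LCL={lcl:.1f}")

# Only months that rise above the upper limit are flagged; counts at or
# below the accepted baseline raise no signal.
print("months above UCL:", np.where(nc_counts > ucl)[0])
```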
The NC categorizations are performed using a drop-down menu. While that ensures more consistency than free-form text, the menu selections are based on one or two people’s impressions of what the actual deviation was. This introduces some level of bias and keeps analyses siloed by category, such that “weak signals across categories will most likely be missed,” DiMartino pointed out.
Arbitrary Influence
He also introduced the topic of “arbitrary influence,” illustrated by the use of monthly and quarterly timelines. While useful for planning and scheduling, looking at NCs by month and by quarter could miss correlations that do not fall neatly within a single month or quarter.
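To illustrate the point, the brief sketch below (with hypothetical dates) shows how a burst of related NCs that straddles a month boundary is split by calendar buckets but remains visible to a rolling window.

```python
import pandas as pd

# Hypothetical NC initiation dates straddling a month boundary; illustrative only.
dates = pd.to_datetime([
    "2019-03-28", "2019-03-30", "2019-03-31",
    "2019-04-01", "2019-04-02", "2019-04-03",
])
nc = pd.Series(1, index=dates)

# Calendar-month buckets split the burst: three NCs in March, three in April.
print(nc.resample("M").sum())

# A rolling 7-day window sees all six events together, regardless of the month end.
print(nc.rolling("7D").sum().max())  # -> 6.0
```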
As a first step in beefing up the process, one-sentence descriptions of the NCs were extracted and examined manually for commonalities, which were then reported to functional area leads. The flaw with this approach is that it is labor-intensive and requires judgment at the initial step, which introduces the potential for bias.
Designing And Piloting A New Process
A project team was formed to look for an algorithm that could replicate, and perhaps improve upon, the manual process. The goal was to think big but start small, and to build a product that could be deployed across the manufacturing network.
Using an agile development approach and NLP tools, the team developed a consistent algorithm that was able to reasonably replicate the manual process.
DiMartino described NLP as an AI technology that turns text into numbers, which can be read by a computer and used to identify similar records. Each record has a series of numbers associated with it that can be analyzed to create similarity scores.
The records can then be clustered together. Those clusters can then be given to an SME, who can decide if there is trending and if action should be taken. Feedback can then be given to the algorithm, which can be adjusted.
In addition, the tool “takes away the bias of the categorization because it is based on the actual description of the event. It takes away the arbitrary endpoints because it allows looking across longer time frames, taking away the monthly or quarterly component. It updates every day an NC is initiated,” he explained.
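The article does not detail which models Amgen used, but the pipeline DiMartino describes — text converted to numbers, pairwise similarity scores, and clustering for SME review — can be sketched with common open-source tooling such as scikit-learn. In the minimal sketch below, the TF-IDF features, sample NC descriptions, and distance threshold are all illustrative assumptions, not Amgen's implementation.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity
from sklearn.cluster import AgglomerativeClustering

# Hypothetical one-sentence NC descriptions; illustrative only.
nc_descriptions = [
    "Operator recorded incorrect buffer lot number on the batch record",
    "Wrong buffer lot documented during the formulation step",
    "Temperature excursion observed in cold storage unit 3",
    "Freezer alarm triggered by overnight temperature excursion",
]

# Step 1: turn the text into numbers (here, a TF-IDF vector per record).
vectors = TfidfVectorizer(stop_words="english").fit_transform(nc_descriptions)

# Step 2: similarity scores between every pair of records.
print(cosine_similarity(vectors).round(2))

# Step 3: cluster similar records; the distance threshold is a tuning
# knob that could be adjusted based on SME feedback on the clusters.
clusterer = AgglomerativeClustering(
    n_clusters=None,
    distance_threshold=0.8,
    metric="cosine",
    linkage="average",
)
labels = clusterer.fit_predict(vectors.toarray())

# Step 4: hand each cluster to an SME to judge whether it is a trend.
for cluster_id in sorted(set(labels)):
    members = [d for d, lbl in zip(nc_descriptions, labels) if lbl == cluster_id]
    print(f"cluster {cluster_id}: {members}")
```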
Weak Signals and Ontologies
The tool also enables exploration of “weak signals” — unstructured, fragmented, or incomplete pieces of data that would rarely be found by human analysis, but that combined could show trends or point to more systemic issues.
In the future, by looking at the full text of long descriptions, a more sophisticated NLP algorithm could use topical clustering rather than just text similarity and surpass the current trending and analysis capability.
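As an illustration of that distinction, the sketch below uses a simple topic model (latent Dirichlet allocation) to group hypothetical long-form descriptions by theme rather than by direct text similarity; the documents and topic count are assumptions for demonstration only.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

# Hypothetical long-form NC descriptions; illustrative only.
documents = [
    "Buffer preparation deviated from the batch record; the operator used the wrong lot.",
    "Incorrect raw material lot was selected during buffer formulation.",
    "The chart recorder on freezer 7 showed a temperature excursion overnight.",
    "A cold room alarm was triggered by a temperature excursion during maintenance.",
]

# Topic modeling groups documents by themes of co-occurring words
# rather than by direct text similarity between pairs of records.
counts = CountVectorizer(stop_words="english").fit_transform(documents)
lda = LatentDirichletAllocation(n_components=2, random_state=0)
topic_mix = lda.fit_transform(counts)

# Each row is one document's mixture over the two topics.
print(topic_mix.round(2))
```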
“If you think about clustering based on like words, what if there are language or context differences in the words — for example, the spellings?” DiMartino asked. “There is a lot of time and effort behind our efforts to create libraries, ontologies, and so on. A big part of this is developing and managing those.”
According to an IBM Community blog on language processing, an ontology is “a description of things that exist and how they relate to each other.”
Ontologies are used to aid in the classification of entities and the modeling of relationships between those entities. Ontology-driven NLP involves the use of a semantic model to understand what exists in the unstructured data.
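A minimal sketch of that idea, assuming a tiny hand-built ontology fragment (the terms and variants below are hypothetical), is to normalize spelling and terminology variants to canonical concepts before the text is vectorized:

```python
# A tiny, hand-built ontology fragment mapping spelling and terminology
# variants to canonical concepts; the entries are hypothetical.
ontology = {
    "centrifuge": ["centrifuge", "centrifugation"],
    "label": ["label", "labelling", "labeling"],
    "deviation": ["deviation", "non-conformance", "nonconformance"],
}

# Invert into a lookup of variant -> canonical concept.
lookup = {variant: concept
          for concept, variants in ontology.items()
          for variant in variants}

def normalize(text: str) -> str:
    """Replace known variants with their canonical concept before NLP."""
    return " ".join(lookup.get(token, token) for token in text.lower().split())

print(normalize("labelling nonconformance during centrifugation"))
# -> "label deviation during centrifuge"
```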
“What we would like to get to is real-time predictive models, and not just for NCs,” DiMartino said. “I would like to be able to predict the outcome of a manufacturing operation based upon different factors that may or may not lead to an NC. This is our first step at building some knowledge of how to take some of that data to feed into a more predictive model. Eventually, we would like to connect it to lots of different data sources and get a more holistic view of product quality in our operations.”
[Editor’s Note: Learn more about avoiding common deviation investigation pitfalls from consultant Stephanie Gaulding.]
Q&A Focuses On GMP Documentation Aspects
The Q&A session after the Amgen presentation focused on what documentation surrounding this process should conform to GMPs, how the NLP process compares to the manual one, and the cost of investment versus the benefit derived.
Q: An audience member asked the presenters to discuss the company’s validation strategy regarding GMP documentation.
A: DiMartino responded, “We have deployed this in a production environment. It is a qualified environment, so we know our data access is validated. The transition of the data from one step to the next is all covered.”
He noted the algorithm used for the application is “traceable and repeatable. It is pretty simple math. We have an approach to validate it. What we in the network are having discussions about is how strict the validation has to be.”
Weese said, “The data lake itself is validated. We are pulling from there, and we are certain the data in the data lake from our core systems are correct. As Mark mentioned, we are augmenting humans. How can it be worse than sitting around a table reading all these NCs?”
FDA Center for Devices and Radiological Health (CDRH) Office of Compliance, Case for Quality Program Manager Francisco Vicenty commented, “The system works. It is as good as what you had before, and maybe better. That is what you are testing. You have confidence in it. You want to roll it out. I think I also heard you say something which I liked to hear mentioned, that you are focusing on the result it is trying to deliver, not predicting the non-conformance. That is not the goal. Non-conformances happen. My question is, why do we need to see anything else from a GMP standpoint? We should not be driving extra effort and behavior.”
Weese responded, “We have people who totally agree with you. And that is the pathway we would have probably gone down. But what you heard in this audience is exactly what we have heard frequently from other colleagues throughout this process.”
Q: Another audience member asked if the results had been compared to the manual process output.
A: DiMartino responded that when user acceptance testing was completed, “for the most part we found it matched. There were one or two cases where we missed something. We will take a deep dive into those and understand what we can do to improve our tool.”
Q: Another question dealt with the size of the investment to create the NLP tool and how the team convinced company management to make the investment.
A: DiMartino said the team pointed to three areas of savings or improvement when presenting the idea to company management. The first was the time spent on the manual process, which can be quantified. The second was the dollar value of reducing NCs across the network — for example, by 5 percent. The third, a non-quantitative one, was creating a better process and increasing compliance.
Weese referred to the Amgen Quality Data Sciences Areas of Focus figure (Part 1, Figure 2). He said the eight bubbles “are somewhat sequential, and they build on each other. If we can get this to work correctly with deviations, then looking at safety incidents and things like that, it would use the same underlying platform we have already paid for.”
He said since the projects all rely on the analysis of documents that are text-heavy, “if we get the text analysis correct, then it builds on itself as we go around my circle. The platforms are in place, so we do not need to re-invest in those same platforms. That is another part of the discussion with management.”
DiMartino added, “We did an analysis that sits on existing platforms. We did not have to buy the software. There was still an investment in time and resources to build it and there will be some to maintain it, but this is not a multimillion-dollar project.”
The dialogue on the use of AI in what has been termed “the fourth industrial revolution” — a fusion of technologies that is blurring the lines between the physical, digital, and biological spheres — will continue at the FDA/Xavier PharmaLink Conference in March 2020. Participants will be shown how some of the latest technological tools are being used to rewrite how pharma manufacturing operations, quality systems, and supply chains are designed and executed.