Data integrity remains a top priority for FDA and other regulators. In the pharmaceutical GMP space, quality and compliance teams need to be aware of the types of data integrity issues that tend to draw citations.

On Aug. 3, Redica Systems Senior GMP Quality Expert Jerry Chapman presented the webinar “A.I. for Quality and Compliance Teams.” In it, he used two case studies to show how quality and compliance teams can use regulatory enforcement data to identify critical trends. One case study examined A.I. and data analytics in the context of data integrity; the other looked at enforcement trends affecting 503B outsourcing facilities and can be accessed in Part I.

[Related: To view the webinar, including slides, click here.]

Data Integrity Trends Hiding in Plain Sight

For the data integrity case study, Chapman presented FDA data integrity citations. But as he explained, it can be challenging to find this information by searching publicly available Warning Letters.

In fact, it is rare for FDA to specifically use the phrase “data integrity” in a Warning Letter, unless the violations were “those rare ones where it is so egregious that FDA is pointing to its data integrity guidance.”

Data integrity violations are generally cited under a specific CFR section. Searching on those sections alone is also unreliable, because the same sections are used for other violations. For example, 211.100 covers failure to follow procedures, yet it is also the section usually cited when documentation is not contemporaneous, since it contains the phrase “documented at the time of performance.”

Chapman turned to the Redica Systems platform to analyze the 49 Warning Letters FDA issued to traditional sterile drug manufacturing sites from 2014 to 2019. Data integrity is one of seven categories under the Redica Systems GMP model for human drugs since it is a critical focus for regulators and the industry (the other six categories are the same six used by FDA: Quality System, Packaging and Labeling, Facilities and Equipment, Materials, Laboratory, and Production).

Within this data integrity category, there are 11 subcategories:

Accurate
Attributable
Backup and archival
Contemporaneous
Data destruction
Data manipulation
Legible
Original data
Paper record controls
System controls
Testing into compliance

Surprising Data Integrity Findings

“If we look at data integrity, what issues did our model find? Well, almost 30% had to do with original data. Over a quarter had to do with data manipulation. About 14% for system controls. And nearly 14% for data destruction and you can see the others,” he said.

Figure 1 shows which data integrity subcategories the citations fell under.

**FIGURE 1 | Data Integrity Findings Categorized into the 11 Sub-Topics**

But where were the facilities cited in these data integrity Warning Letters? Within the past decade, high-profile incidents involving data integrity violations have occurred in countries such as India and China, particularly at API manufacturing facilities.

“Is that skewing this dataset? Is that where they came from? The answer is ‘no,’” Chapman emphasized. “Data integrity is really a global issue. We have the ability now to run large datasets. And that can produce some surprises.”

Figure 2 shows the locations of the Warning Letters with at least one “data integrity” n-gram. What is an n-gram? An n-gram is a contiguous sequence of n items from a given text sample. The n-grams used here are created by subject matter experts based on their experience and then tested over time in iterations. They carry meaning: within pharma, for example, the phrase “Your Quality Unit” in a Warning Letter is significant.
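As a rough illustration of the n-gram idea, the Python sketch below splits a passage into word tokens, builds every contiguous run of n tokens, and checks whether any phrase from a short list appears. It is not the Redica Systems model: the function names, the phrase list, and the sample sentence are all assumptions made for this example, with the two phrases borrowed from the ones quoted in this article.

```python
import re
from typing import Iterator


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def ngrams(tokens: list[str], n: int) -> Iterator[tuple[str, ...]]:
    """Yield every contiguous run of n tokens, in order."""
    for i in range(len(tokens) - n + 1):
        yield tuple(tokens[i:i + n])


def matched_phrases(text: str, phrases: list[str]) -> set[str]:
    """Return the phrases that occur as n-grams in the text."""
    tokens = tokenize(text)
    found = set()
    for phrase in phrases:
        target = tuple(tokenize(phrase))
        if target and target in set(ngrams(tokens, len(target))):
            found.add(phrase)
    return found


# Hypothetical phrase list; a real model would use expert-curated n-grams.
PHRASES = ["your quality unit", "documented at the time of performance"]

# Invented sentence for illustration only, not taken from a Warning Letter.
sample = (
    "Your Quality Unit failed to ensure that activities were "
    "documented at the time of performance."
)

print(matched_phrases(sample, PHRASES))
```

Matching on whole token sequences rather than raw substrings keeps the check insensitive to capitalization and punctuation, which is one reason a curated phrase list can work across letters written in slightly different styles.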

**FIGURE 2 | Number of Warning Letters with at Least One Data Integrity N-Gram**

“The first two columns are not a big surprise,” he said. “India and China had an equal number. But there are more Warning Letters issued in those countries. So, we need to be careful about making too many judgments based on that. If you discard those momentarily and look at Canada, the United States, and South Korea, those are not third world countries. But those are still quite a few numbers.”


He then normalized the data by comparing, for each country, the total number of Warning Letters with the number that contained data integrity issues. This produced surprising results, as shown in Figure 3.

**FIGURE 3 | Countries with 7 or More Drug GMP Warning Letters**
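The normalization step can be sketched in a few lines of Python: for each country, divide the number of Warning Letters with a data integrity issue by the total number of Warning Letters, then rank countries by that share. This is an assumed reading of the calculation, not necessarily Chapman's exact method. The function names are hypothetical, the counts are made-up placeholders rather than real FDA figures, and the seven-letter cutoff is taken from the Figure 3 caption.

```python
def data_integrity_rate(total_letters: int, letters_with_di: int) -> float:
    """Share of a country's Warning Letters that cite a data integrity issue."""
    if total_letters <= 0:
        raise ValueError("total_letters must be positive")
    if not 0 <= letters_with_di <= total_letters:
        raise ValueError("letters_with_di must be between 0 and total_letters")
    return letters_with_di / total_letters


def rank_by_rate(
    counts: dict[str, tuple[int, int]], min_letters: int = 7
) -> list[tuple[str, float]]:
    """Rank countries by data integrity rate, highest first.

    counts maps country -> (total letters, letters with a data integrity issue).
    Countries with fewer than min_letters total letters are skipped, since a
    rate computed from a handful of letters says little.
    """
    rates = {
        country: data_integrity_rate(total, with_di)
        for country, (total, with_di) in counts.items()
        if total >= min_letters
    }
    return sorted(rates.items(), key=lambda item: item[1], reverse=True)


# Placeholder counts for illustration only; these are NOT real FDA data.
example_counts = {
    "Country A": (100, 40),
    "Country B": (10, 6),
    "Country C": (5, 5),  # below the cutoff, so it is excluded
}

for country, rate in rank_by_rate(example_counts):
    print(f"{country}: {rate:.0%}")
```

In this made-up example, Country A has the most letters with data integrity issues in absolute terms, but Country B ranks first once the counts are normalized. That is the kind of reordering a raw count chart can hide.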

“When I first looked at these results, I was surprised enough myself,” Chapman said. “I trust the model. I have helped build and perfect the model. But it was still a big surprise to me that I looked at all of the Warning Letters and all of the citations that it pointed to from Canada to make sure that this information was correct and indeed it was correct.”

He encourages anyone interested in how he obtained the data for this and the 503B outsourcing case study to contact Redica Systems to learn more. Further, he explained that new datasets are being added to the platform.

“We expect that we are going to get more surprises and more insights as we examine more datasets with increasingly sophisticated tools,” he said.


