Data Integrity Focus, Part II: Using Data Process Mapping to Identify Integrity Gaps


LCGC North America

02-01-2019
Volume 37
Issue 2
Pages: 118-123

Data integrity is paramount when working in a regulated environment. Data process mapping is an excellent way to identify and mitigate data gaps and record vulnerabilities in a chromatographic process. This approach is simple and practical.

Understanding and mitigating risks to regulatory records is an important part of a data integrity program. We discuss data process mapping as a technique to identify data gaps and record vulnerabilities in a chromatographic process and look at ways to mitigate or eliminate them.

Welcome to the second installment of "Data Integrity Focus." In the last part, we looked at the overall scope of a data integrity and data governance program via a four-layer model (1–3). In this part, we look at a simple and practical methodology that can be applied to identify the risks with any process in a regulated "good practice" (GXP) laboratory. Once identified, the risks can be mitigated or eliminated to ensure the integrity of data and records. The methodology is called data process mapping, and it is a variant of process mapping, which some of you may be familiar with if you have been involved with the implementation of a computerized system or a Six Sigma improvement project. Once the process is mapped, the data and records created, modified, or calculated are identified and assessed to see if there are any data vulnerabilities in a paper process or computerized system.

Ignore Paper Processes at Your Peril!

It is very important to understand that data integrity is not just a computer or information technology (IT) equipment problem. Many manual processes in the laboratory generate paper records, such as sampling, sample preparation, calculation, and review (3–5). Many observation tests such as appearance, color, and odor are typically recorded on paper. Even with a computerized system, there are additional and essential paper records, such as the instrument and column log books.

What Do the Regulators Want?

What do the regulatory guidance documents say about assessment of processes? There are three documents that I would like to focus on. The first is the World Health Organization (WHO) guidance document (6), which notes:

  • 1.4. Mapping of data processes and application of modern quality risk management (QRM) and sound scientific principles throughout the data life cycle;

  • 5.5. Record and data integrity risks should be assessed, mitigated, communicated, and reviewed throughout the data life cycle in accordance with the principles of QRM.

The second is the UK's Medicines and Healthcare products Regulatory Agency (MHRA) in their 2018 GXP guidance, which makes the following statements about assessment of processes and systems (7):

  • 2.6 Users of this guidance need to understand their data processes (as a life cycle) to identify data with the greatest GXP impact. From that, the identification of the most effective and efficient risk-based control and review of the data can be determined and implemented.

  • 3.4 Organizations are expected to implement, design, and operate a documented system that provides an acceptable state of control based on the data integrity risk with supporting rationale. An example of a suitable approach is to perform a data integrity risk assessment (DIRA) where the processes that produce data or where data are obtained are mapped out and each of the formats and their controls are identified, and the data criticality and inherent risks documented.

  • 4.5 The data integrity risk assessment (or equivalent) should consider factors required to follow a process or perform a function. It is expected to consider not only a computerized system, but also the supporting people, guidance, training, and quality systems. Therefore, automation or the use of a "validated system" (such as analytical equipment) may lower but not eliminate data integrity risk.

The third and final guidance is from the Pharmaceutical Inspection Co-operation Scheme (PIC/S) (8):

  • 5.2.2 Data governance system design, considering how data is generated, recorded, processed, retained and used, and risks or vulnerabilities are controlled effectively;

  • 5.3.2 Manufacturers and analytical laboratories should design and operate a system which provides an acceptable state of control based on the data integrity risk, and which is fully documented with supporting rationale.

  • 5.3.4 Not all data or processing steps have the same importance to product quality and patient safety. Risk management should be utilized to determine the importance of each data or processing step. An effective risk management approach to data governance will consider data criticality (impact to decision making and product quality) and data risk (opportunity for data alteration and deletion, and likelihood of detection or visibility of changes by the manufacturer's routine review processes).

From this information, risk-proportionate control measures can be implemented.

Summarizing the guidance documents:

  • Processes should be assessed to identify the data generated and the vulnerabilities of these records, and this assessment should be documented.

  • Vulnerabilities and risks to records must be mitigated or eliminated, and the extent of the controls used depends on the data criticality and risk to the records.

  • In some cases, systems should be replaced, and there should be a plan for this over a reasonable timeframe.

  • Management must accept the process risk and support the migration plan.
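The second summary point, that the extent of control depends on data criticality and the risk to the records, can be illustrated with a minimal sketch. None of the guidance documents prescribes a scoring scheme, so the 1 to 3 scales, the multiplication of the three factors, and the step names and scores below are all assumptions made purely for illustration:

```python
from dataclasses import dataclass

# Hypothetical 1-3 scales; the guidance does not prescribe any scoring scheme.
LOW, MEDIUM, HIGH = 1, 2, 3


@dataclass(frozen=True)
class DataStep:
    name: str
    criticality: int      # impact on decision making and product quality
    alteration_risk: int  # opportunity for data alteration or deletion
    detectability: int    # HIGH = changes unlikely to be seen in routine review

    def priority(self) -> int:
        """Illustrative product of the three factors (higher = act first)."""
        return self.criticality * self.alteration_risk * self.detectability


def rank_steps(steps):
    """Return the steps ordered from highest to lowest priority."""
    return sorted(steps, key=lambda s: s.priority(), reverse=True)


# Example steps and scores are invented for illustration only.
steps = [
    DataStep("Instrument log book entry", MEDIUM, MEDIUM, LOW),
    DataStep("Manual entry of peak areas into a spreadsheet", HIGH, HIGH, HIGH),
    DataStep("Electronic transfer of results to LIMS", HIGH, LOW, LOW),
]

for step in rank_steps(steps):
    print(f"{step.priority():>2}  {step.name}")
```

A ranking like this is only a starting point for discussion; the documented rationale behind each score matters more than the number itself.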

Enter the Checklist

Typically, assessment of computerized systems involves a checklist where questions are posed for a spectrometer and the associated computerized system, such as:

  • Does each user have a unique user identity?

  • Is the audit trail turned on?

  • Is there segregation of duties for system administration?

The checklist questions can go on, and on, and on, and, if you are (un)lucky, it can go into such excruciating detail that it becomes much cheaper and safer than a sleeping pill. There are three main problems with a checklist approach to system assessment:

  • The checklists are not applicable to all computerized systems, as the questions may not cover all functions of the application.

  • Checklists can mislead an assessor into focusing too much on the checklist at the risk of not seeing additional data risks posed by a specific system.

  • Typically, checklists don't cover manual processes, of which there are many in a laboratory.

If a checklist is not the best tool, what tool should be used to identify data and records and then understand the risks posed?

Principles of Data Process Mapping

Instead of starting with a fixed checklist, start with a blank whiteboard or sheet of paper together with some Post-it notes, pencils, and an eraser. Why the eraser? You are not going to get this right the first time, and you'll be rubbing out lines and entries on the notes until you do. You'll need a facilitator who will run the meeting and two to three experts (perhaps laboratory administrators) who know the process, and, if software is involved, how the application works at a technical level.

The first stage is to visualize the process. Define the start and end of an analytical process (for example, from sampling to reportable result). The process experts should write the stages of the process down on the notes, and place them on the whiteboard or paper in order. The first attempt will be rough and will need revising, as the experts can miss activities, or some activities will be in the wrong order, or the detail is uneven. The facilitator should encourage and challenge the experts to revise and refine the process flow, which may take two or three attempts. Although you can use a program like Visio to document the process, this slows the interaction between the participants during the initial mapping. I would suggest paper and pencil or whiteboard is an easier, and more flexible, option at this stage. When the process is agreed, then commit the final maps to software.

The second stage is to document data inputs, outputs, processing, verification steps, and storage for each of the process activities. This can involve manual data recording in log books, laboratory notebooks, and blank forms, as well as inputs to and outputs from any computerized systems involved in the process. Typically, such a process has not been designed, but has evolved over time, and can often look like a still from Custer's Last Stand with the number of arrows involved. This is the data process map or what we can call the current way of working.

Once the process is complete and agreed, look at each step and document:

  • How critical is each activity within the overall process (for example, product submission, release, stability, analytical development, and so on)?

  • Where are the data and records stored in each activity?

  • Are the data vulnerable at each stage?

  • What is the reason for the vulnerability? (Reasons may include, but are not limited to, manual recording, or manual data transfer between a standalone instrument and another application)

  • Are data entered into a computerized system manually, how is this checked, and how are corrections documented?

  • Who can access the data? (Consider both paper and electronic records)

  • Are the access controls for each application adequate, and are there any conflicts of interest?

  • Are data corrections captured in an audit trail and, most importantly, are the entries understandable, transparent, and clear (9)?

  • Is the responsibility for all steps and data clearly described, such as decisions or further actions taken and attributed to an individual?

  • Is the data verification process clearly described, such as assurance of the accuracy of measurement, double checks performed (if any), and the review of the whole data package?
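The answers to these questions can be recorded per process step in any convenient form. As one possible sketch, the structure below captures a few of the questions as fields and derives a list of findings from them. The field names, the finding texts, and the example step are hypothetical and are not taken from any guidance document or software product:

```python
from dataclasses import dataclass


@dataclass
class ProcessStep:
    """One activity on the data process map (all field names are illustrative)."""
    name: str
    criticality: str              # for example, "batch release" or "stability"
    record_location: str          # where the data and records are stored
    manual_entry: bool            # data written or typed by hand
    individual_access: bool       # each user has a unique identity
    audit_trail: bool             # corrections are captured and understandable
    verification_described: bool  # the review process is clearly described


def find_vulnerabilities(step):
    """Translate the recorded answers for one step into a list of findings."""
    findings = []
    if step.manual_entry:
        findings.append("manual recording or transcription")
    if not step.individual_access:
        findings.append("work cannot be attributed to an individual")
    if not step.audit_trail:
        findings.append("corrections not captured in an audit trail")
    if not step.verification_described:
        findings.append("verification process not described")
    return findings


# Hypothetical example step
step = ProcessStep(
    name="Enter peak areas into spreadsheet",
    criticality="batch release",
    record_location="paper printout only",
    manual_entry=True,
    individual_access=False,
    audit_trail=False,
    verification_described=True,
)

for finding in find_vulnerabilities(step):
    print(f"{step.name}: {finding}")
```

The value of such a record is that every step on the map is assessed against the same questions, rather than only those steps that happen to appear on a checklist.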

Any vulnerabilities need to be risk assessed, and remediation plans need to be developed. These plans will fall into two areas: quick fix remediation and long-term solutions. We will look at these two areas now for an example involving a chromatography data system.


Practical Example for Chromatography

From the theory, we need to look at how data process mapping could work in practice with a chromatograph linked to a chromatography data system (CDS). Welcome to a chromatography laboratory near you operating the world's most expensive electronic ruler, as shown in Figure 1.

Figure 1: Current hybrid process for chromatographic analysis.

Let me describe the main features of the simplified process:

  • The chromatograph and CDS are set up for the analysis (we'll not consider the manual, paper-based sampling and sample preparation, because this was discussed recently by Newton and McDowall (4), together with the related data integrity problems).

  • Although there are several instances of the same CDS, they are all standalone systems, and not networked together.

  • There is a shared logon for all users, and this account has all privileges available, including the ability to configure the software.

  • Paper printouts are considered the raw data from each instance of the CDS.

  • Electronic records are backed up by the chromatographers when they have time, using a variety of media such as USB sticks and hard drives.

  • Peaks are integrated, but there is no standard operating procedure (SOP) or control over the integration, such as when manual integration can or cannot be used (2,10).

  • The integrated chromatograms are printed out.

  • Peak areas from the printouts are entered manually into a spreadsheet (unvalidated, naturally!), to calculate system-suitability test (SST) parameters and the reportable results.

  • The calculations are printed and signed, but the spreadsheet file is not saved.

  • The results are entered manually from the spreadsheet printout into a laboratory information management system (LIMS) for release.

  • The second-person review is not shown in this figure for simplicity, but this is a crucial part for ensuring data integrity (11).

Some of you may be reading the process with abject horror, and may think that this would never occur in a 21st century chromatography laboratory. Based on my experience, and this is also seen in numerous U.S. Food and Drug Administration (FDA) warning letters, this process is more common than you may think. Remember that the pharmaceutical industry is ultraconservative, and if it worked for the previous inspection, all is well. However, to quote that world-famous chromatographer, Robert Zimmerman, the times, they are a-changin'. Hybrid systems (discussed in the next part of this series) are not encouraged by at least one regulator (6), and now some inspectors are unwilling to accept procedural controls to mitigate record vulnerabilities.

Identifying Record Vulnerabilities

Once the process is mapped, reviewed, and finalized, the data vulnerabilities can be identified for each process step. The main data vulnerabilities identified in the current chromatographic process steps are listed in Table I. To put it mildly, there are enough regulatory risks to generate a cohort of warning letters. There are many data integrity red flags in this table, including the fact that work cannot be attributed to an individual, defining raw data as paper, and failing to back up, or even save, electronic records. There is also the shambles of the business process, due to the use of the spreadsheet to calculate all the values from SST parameters to reportable results. Overall, the process is slow and inefficient. These risks need to be mitigated as an absolute minimum or, even better, eliminated entirely.
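As a rough illustration of how a mapped process exposes its weak points, the data transfers in the simplified process described above can be written down as a list and filtered for manual steps. This is a sketch only: it is not a reproduction of Table I, and the tuple layout and the "mode" labels are assumptions chosen for the example:

```python
# Transfers taken from the simplified process description; the tuple layout
# (source, destination, mode) and the mode labels are illustrative assumptions.
TRANSFERS = [
    ("CDS", "chromatogram printout", "print"),
    ("chromatogram printout", "spreadsheet", "manual entry"),
    ("spreadsheet", "calculation printout", "print"),
    ("calculation printout", "LIMS", "manual entry"),
]


def manual_transcriptions(transfers):
    """Return the (source, destination) pairs that rely on retyping data."""
    return [(src, dst) for src, dst, mode in transfers if mode == "manual entry"]


def paper_records(transfers):
    """Return the records that are produced by printing."""
    return [dst for _, dst, mode in transfers if mode == "print"]


for src, dst in manual_transcriptions(TRANSFERS):
    print(f"Transcription check needed: {src} -> {dst}")
print("Paper records:", ", ".join(paper_records(TRANSFERS)))
```

Each manual entry found this way is a point where a transcription error can occur and where a second-person check is needed.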

Table I: Main data vulnerabilities identified in a chromatography process

Fix and Forget or Long-Term Solution?

Enter stage left that intrepid group: senior management. These are the individuals who are responsible and accountable for the overall pharmaceutical quality system, including data integrity. The approaches that a laboratory will take are now dependent on them.

Figure 2: Remediate or solve data integrity vulnerabilities?

Figure 2 shows the overall approach that should happen to resolve data integrity issues. There are two outcomes:

1. Short-term remediation to resolve some issues quickly. Ideally, this should involve technical controls where available (for example, giving each user a unique user identity, or creating and allocating user roles for the system and segregation of duties). However, remediation often involves procedural controls, such as the use of SOPs or log books to document work. This slows the process down even further, and will result in longer second-person review times (11).

2. Long-term solutions to implement and validate technical controls, to ensure that work is performed correctly and consistently. This should involve replacement of hybrid systems with electronic working and ensuring business benefit from the investment in time and resources.

The problem is management. In many organizations, they want only to focus on the first option (fix and forget) and not consider the second, as it would detract from the work or cost money. While this may be thought to be an option in the very short term, it is not viable when regulatory authorities become more focused on hybrid systems with procedural controls.

In organizations that claim there is no money to provide long-term solutions, the financial taps are quickly turned on following an adverse regulatory inspection. However, it is better, more efficient, and cheaper to implement the long-term solution yourself, because then the company, not the regulator, is providing the solution.

Quick Fixes and Short-Term Remediation

From the data process map in Figure 1, some short-term solutions can be implemented as shown in Figure 3. Rather than attempt to fix a broken and inefficient process, use the CDS software that the laboratory has paid for to calculate the SST and final results. This would eliminate the spreadsheet, as well as manual entry into the spreadsheet and subsequent transcription checks.

Figure 3: Short-term remediation of the chromatography process.

Attention must also be focused on the CDS application, and some of the main changes for immediate implementation must be:

  • Unique identities for all users

  • Implement user types with access privileges, and allocate the most appropriate one to each user

  • Segregate application administration from normal laboratory use. This is more difficult, as the systems are currently standalone, and it will probably require a two-phase approach. In the short term, laboratory administrators have two access types: one with administration functions (user account management and application configuration) but no access to CDS functions, and one with access to CDS functions but no administration functions. This approach should only be used as a temporary fix, and not a permanent solution.

  • Write an SOP for chromatographic integration, and specifically control when manual integration can be used (2,10,12), and train the staff. Where feasible, restrict a user's ability to perform manual integration for some methods.

  • Validate and use the CDS application's ability to calculate SST parameters and eliminate the spreadsheet calculation (the former should be relatively easy to implement). Calculation of the reportable result may have to wait until the CDS is networked. In the latter case, the spreadsheet calculations will need to be validated and all files saved.
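The segregation of duties described in the list above can be checked mechanically once user types are written down. The sketch below assumes hypothetical role and privilege names; they do not correspond to any particular CDS product, and a real application will define its own privileges:

```python
# Hypothetical privilege names; a real CDS defines its own.
ADMIN_PRIVILEGES = {"manage_user_accounts", "configure_application"}
CDS_PRIVILEGES = {"acquire_data", "integrate_peaks", "calculate_results"}

# Hypothetical user types, including the original shared logon for comparison.
ROLES = {
    "lab_administrator": {"manage_user_accounts", "configure_application"},
    "analyst": {"acquire_data", "integrate_peaks", "calculate_results"},
    "shared_account": ADMIN_PRIVILEGES | CDS_PRIVILEGES,
}


def conflicting_roles(roles):
    """Return the names of roles that mix administration and CDS functions."""
    return sorted(
        name
        for name, privileges in roles.items()
        if privileges & ADMIN_PRIVILEGES and privileges & CDS_PRIVILEGES
    )


print(conflicting_roles(ROLES))
```

In this example only the shared account is flagged, because it combines configuration rights with the ability to acquire and process data.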

This should result in an improved business process, as shown in Figure 3. The CDS is still a hybrid system, but the spreadsheet has been eliminated, along with manual entry into a second system, and the process is under a degree of control. Left like this (the fix and forget option from Figure 2), substantial risk remains in the process, such as the backup of the standalone systems, and plans for a long-term solution are still needed.


Implementing Long-Term Solutions

Long-term solutions require planning, time, and money. However, with the potential business and regulatory benefits that can be obtained, management should be queuing up to hand over money. Let us look at some of the remaining issues to try and solve with this process:

  • Standalone CDS systems need to be consolidated into a networked solution, including the migration of existing data to the central server. This has several advantages: IT backup of records, IT application administration, and time and date stamps from the network time server.

  • Consistency of operation: The same methods can be applied across all chromatographs.

  • Removal of a hybrid system: Design the networked CDS for electronic working and electronic data transfer to the LIMS, which results in minimal or zero paper to be printed out.

  • Efficient, effective, and faster business process, as shown in Figure 4, and this should be compared with that in Figure 1.

Figure 4: Long-term solution for the chromatographic process.

The regulatory risks of the original process have been greatly reduced or eliminated at the end of the long-term solution. The laboratory can face regulatory inspections with confidence.


I would like to thank Christine Mladek for helpful review comments during preparation of this column.


(1) R.D. McDowall, LCGC North Amer. 37(1), 44–51 (2019).

(2) R.D. McDowall, Validation of Chromatography Data Systems: Ensuring Data Integrity, Meeting Business and Regulatory Requirements (Royal Society of Chemistry, Cambridge, UK, 2nd Ed., 2017).

(3) R.D. McDowall, Data Integrity and Data Governance: Practical Implementation in Regulated Laboratories (Royal Society of Chemistry, Cambridge, UK, 2019).

(4) M.E. Newton and R.D. McDowall, LCGC North Amer. 36(1), 46–51 (2018).

(5) M.E. Newton and R.D. McDowall, LCGC North Amer. 36(4), 270–274 (2018).

(6) WHO Technical Report Series No. 996 Annex 5 Guidance on Good Data and Records Management Practices. 2016, World Health Organization: Geneva.

(7) MHRA GXP Data Integrity Guidance and Definitions. 2018, Medicines and Healthcare products Regulatory Agency: London.

(8) PIC/S PI-041 Draft Good Practices for Data Management and Integrity in Regulated GMP / GDP Environments. 2016, Pharmaceutical Inspection Convention / Pharmaceutical Inspection Co-Operation Scheme: Geneva.

(9) R.D. McDowall, Spectroscopy 32(11), 24–27 (2017).

(10) Technical Report 80: Data Integrity Management System for Pharmaceutical Laboratories. 2018, Parenteral Drug Association (PDA): Bethesda, MD.

(11) M.E. Newton and R.D. McDowall, LCGC North Amer. 36(8), 527–529 (2018).

(12) M.E. Newton and R.D. McDowall, LCGC North Amer. 36(7), 458–462 (2018).

R.D. McDowall is the director of RD McDowall Limited in the UK. Direct correspondence to:
