Trends in Pharmaceutical Data Science – A Brief Look at Internally Structured Data in Medical Device Development


October 16, 2023

Pharmaceutical and medical device companies benefit from using a structured approach to data items, connections between data items, and data containers. Working with data that is internally structured reduces the time needed to create submission reports, eliminates the manual verification of large data tables, and ensures the creation of submission deliverables with no technical errors.

Leading medical device manufacturers have employed internal structure with data items, and connections between data items, in their product development processes to support initial health authority submissions and ongoing product lifecycle management. This article highlights some areas where the pharmaceutical industry may find inspiration from the device industry.


In this second article of the series (1), we briefly examine some of the data and connections used by engineering and product development teams in medical device development. Many of the leading device manufacturers have employed processes and technologies to support a truly data-centric approach to their formal submissions as well as their internal reports. This includes managing data for risk, requirements, test, and other typical product development domains. Like the pharmaceutical industry, device manufacturers must comply with both health authority regulations and industry standards. Although the two industries use different data sets and follow different standards, both benefit from a rigorous, structured approach to data and relationships.

Data Items

Consider structure in the fundamental sense of how an individual data item is internally organized: the data item along with its attributes and connections.

As discussed in the previous article, a data item with internal structure is not simply an “entry or row in MS Excel.” It is a unique entity that exists with a set of attributes. It exists over time and tracks who uses it, whether it can be changed, and who approves it. A data item can connect to other data items through dynamic links, which can be traced and displayed to the user. It may be used in many locations while still being a single data item (Fig. 1). If something about the item changes, the change is seen everywhere the data item is used. These different locations may be data tables, prose sections of reports (where the data item is automatically inserted), documents, projects, and traces.


Figure 1: Internally structured data item
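The idea of one data item used in several locations can be sketched in a few lines of Python. This is a minimal illustration only: the `DataItem` class, its fields, and the example values are hypothetical and are not taken from any particular product.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class DataItem:
    """A single structured data item (hypothetical model for illustration)."""
    item_id: str
    value: str
    approved_by: Optional[str] = None
    locked: bool = False
    history: list = field(default_factory=list)  # (timestamp, user, old, new)

    def update(self, new_value, user):
        """Change the value once, recording who changed it and when."""
        if self.locked:
            raise PermissionError(f"{self.item_id} is locked")
        stamp = datetime.now(timezone.utc)
        self.history.append((stamp, user, self.value, new_value))
        self.value = new_value


# One item, referenced (not copied) from two different containers.
flow = DataItem("DI-001", "Max flow rate: 5 mL/h")
data_table = [flow]
report_section = {"title": "Device description", "items": [flow]}

flow.update("Max flow rate: 4 mL/h", user="jsmith")

# Both containers now show the new value because they share one item.
table_view = data_table[0].value
report_view = report_section["items"][0].value
```

Because the table and the report section hold references to the same object, there is no second copy to keep in sync, and the `history` list gives a simple record of who changed what.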

This approach to structuring individual data items creates a foundation for data science initiatives in both formal and informal reporting projects. It provides the flexibility to use data in multiple formats and reports while ensuring data integrity and a trustworthy chain of custody throughout the life span of every data item. Most data items have a life span of years, and this approach to structure supports the many ways in which data is consumed over that time, including fully structured online filings to health authorities.

Risk Management

Risk management activities in device development comprise several processes and deliverables. The foundational industry standard is ISO 14971 (2). The current version of this standard was released in 2019 and describes processes to identify, evaluate, and control risk for medical devices. In the general case, a list of hazards for the device is compiled along with sequences of events that may produce hazardous situations. The risk is assessed based on the probability of a hazardous situation occurring as a result of a sequence of events leading from the hazard, as well as the probability of a hazardous situation leading to a harm (see ISO 14971:2019 Annex C). A harm has a severity, and the overall risk is calculated based on probabilities of occurrence and the severity of harm. The risk is then mitigated by applying risk controls, which themselves may expose additional risks. Hazards, harms, situations, and other entities are captured as data items with specific, dynamic connections between the data items (Fig. 2). If a data item is changed or updated, the connected data items can be updated automatically, or the user can be notified to review the new information. In an actual device development project, there can be thousands of risk items. Using well-structured data items for each element ensures that connected items are updated and the responsible users are notified when changes occur, so that an update to the design is not overlooked.


Figure 2: Medical device risk data items and connections
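As a rough sketch of how a change to one risk item can flag its connected items for review, the connections can be treated as a directed graph and walked from the changed item. The item identifiers and the direction of the links below are assumptions made for illustration; a real project would define its own link types.

```python
from collections import deque

# Directed connections: a change to the key item affects the listed items.
# All identifiers are illustrative.
links = {
    "HAZ-1": ["SIT-1"],             # hazard -> hazardous situation
    "SIT-1": ["HARM-1", "CTRL-1"],  # situation -> harm, risk control
    "CTRL-1": ["REQ-7"],            # risk control -> requirement
}


def items_to_review(changed):
    """Breadth-first walk returning every item downstream of the change."""
    seen = {changed}
    queue = deque([changed])
    order = []
    while queue:
        current = queue.popleft()
        for nxt in links.get(current, []):
            if nxt not in seen:
                seen.add(nxt)
                order.append(nxt)
                queue.append(nxt)
    return order


flagged = items_to_review("HAZ-1")
```

With thousands of risk items, the same walk runs unchanged; only the `links` data grows.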

A common risk activity performed early in device development is a Preliminary Hazard Analysis (PHA). It is often used to identify, and then categorize, hazards that may be associated with the use of a device. In many cases, a PHA uses prompts or questions to drive the process. ISO/TR 24971 (3) includes an annex (Annex A) with a series of questions that are useful when performing a PHA. To perform a PHA using a data item approach with structure and connections, device companies begin by prompting the user with a question and then guiding the user through a series of steps to complete the exercise. Each step of the process generates a data item and connects the data item to other items in the “row” of the risk activity (Fig. 3). Probabilities of occurrence and severities can both be numeric attributes on data items, leading to numeric values for risk and residual (post-mitigation) risk. When a change is made, numeric calculations automatically update. This ensures data integrity, especially when a risk activity has hundreds of “risk rows” and thousands of risk data items. Note also that there are connections between risk mitigations (controls) and device requirements. The manufacturer needs to track that a risk control has been (or will be) implemented in the design. The data items used to describe the design are requirements. This means that there is an inherent connection between risk and requirements data in a device development project.


Figure 3: Medical device PHA example structured “risk row”
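A “risk row” with numeric attributes might look like the sketch below. The scoring rule (probability of the hazardous situation times probability of harm times severity) and the way a control scales the first probability are assumptions chosen for illustration; ISO 14971 does not prescribe a formula, and each manufacturer defines its own risk policy. All names and values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RiskRow:
    hazard: str
    harm: str
    severity: int     # assumed scale, e.g. 1 (negligible) to 5 (catastrophic)
    p_situation: float  # probability the hazardous situation occurs
    p_harm: float       # probability the situation leads to the harm
    control_factor: float = 1.0  # assumed: a risk control scales p_situation

    @property
    def risk(self):
        """Risk before controls, computed on demand from current attributes."""
        return self.p_situation * self.p_harm * self.severity

    @property
    def residual_risk(self):
        """Risk after the control is applied."""
        return self.p_situation * self.control_factor * self.p_harm * self.severity


row = RiskRow("Over-infusion", "Overdose", severity=4,
              p_situation=0.01, p_harm=0.5, control_factor=0.1)
before = (row.risk, row.residual_risk)

row.p_situation = 0.02  # an upstream change to one attribute
after = (row.risk, row.residual_risk)
```

Because `risk` and `residual_risk` are computed properties rather than stored numbers, they cannot drift out of date when an input attribute changes.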

This approach also uses data items and connections that are external to the specific PHA activity. For example, when the user enters a hazard or a harm, it is done by selecting from preapproved repositories of hazards and harms (red boxes in Fig. 3). Using this data-centric, structured approach has four advantages over an unstructured approach:

  1. Approved hazards and harms already exist in the repositories and cannot be “made up” by the user; all projects draw from the same approved repositories (there may be different repositories for different product categories, families, etc.);
  2. Harms in a repository include severity ratings that are approved by the Chief Medical Officer and cannot be changed by a user in a PHA activity; these ratings can be numeric, so an overall risk can be numeric and not just a string or a piece of text;
  3. Approved changes to repositories trigger notifications to PHA activities that review and update actions are required; if the severity of harm is updated in the repository, it triggers review of the risk activity wherever that harm is used;
  4. Items in repositories are reused in many different projects; if an update is made to a repository, it updates (or triggers review) in all projects where the items are used.
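The four advantages above can be illustrated with a small repository sketch. The class names, the `"CMO"` role string, and the activity names are hypothetical; the point is only that severities are read-only for users, unapproved harms cannot be selected, and a repository update reports every activity that must be reviewed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Harm:
    """An approved harm; frozen so a PHA user cannot edit the severity."""
    name: str
    severity: int


class HarmRepository:
    def __init__(self):
        self._harms = {}
        self._used_in = {}  # harm name -> set of PHA activity names

    def approve(self, harm, approver):
        """Add or update a harm; return the activities that need review."""
        if approver != "CMO":
            raise PermissionError("only the CMO role may approve severities")
        self._harms[harm.name] = harm
        return sorted(self._used_in.get(harm.name, set()))

    def select(self, name, activity):
        """Pick an approved harm for an activity; KeyError if not approved."""
        harm = self._harms[name]
        self._used_in.setdefault(name, set()).add(activity)
        return harm


repo = HarmRepository()
repo.approve(Harm("Overdose", 4), approver="CMO")
harm_a = repo.select("Overdose", "PHA-PumpA")
harm_b = repo.select("Overdose", "PHA-PumpB")

# A later severity change reports every activity that uses the harm.
to_review = repo.approve(Harm("Overdose", 5), approver="CMO")
```

In this sketch the activities keep the previously selected rating until they are reviewed, which mirrors the "trigger review" behavior rather than a silent overwrite.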

In pharmaceutical applications, Criticality Analysis (CA) is also a risk management activity. Even though language and approaches may differ from device development, this activity will still benefit from using data-driven approaches combining individual data items with structure, reusable data repositories, and dynamic connections between data items (Fig. 4). In this case, the CA may begin with identification of certain causes. These causes may be Critical Process Parameters, Critical Material Attributes, or other items. The items may reside in a specification document, recipe, or other repository, along with the identification of Quality Attributes (QA) and especially Critical Quality Attributes (CQA). The user in a CA activity would draw from the repositories in the same way that a user in a device PHA would draw from their repositories. The same benefits apply to a CA activity as they do to a PHA activity.


Figure 4: Pharmaceutical Criticality Analysis example “risk row”

Requirements and Test Management

Health authorities have defined certain regulations regarding how device manufacturers must document and report on requirements and testing for medical devices. In the United States, FDA has an existing regulation regarding Design Controls. Manufacturers are required to have procedures in place to manage three broad categories of requirements: needs, inputs, and outputs (4). They are also required to conduct and document comprehensive design validation and design verification testing on the device. Validation is used to ensure the device satisfies the documented needs and intended uses of the device. Verification is used to ensure the device “design outputs” satisfy the “design inputs.” As a result, manufacturers must manage all requirements and tests for the device as well as the relationships between the requirements and tests.

Requirements in a device development project are often created by a combination of methods including directly authoring a new requirement, importing a requirement from another tool, and reusing an existing requirement from another project or repository of requirements. The requirements are organized into multiple levels. For example, certain design input requirements satisfy certain user needs and certain design outputs are allocated to higher level design inputs. Although FDA mentions three specific levels (needs, inputs, and outputs), in reality there may be many levels of requirements. For example, design inputs may be tiered into multiple levels: system, subsystem, software, etc. The result is often a large set of requirements organized into a hierarchical structure with needs at the top and outputs at the bottom (Fig. 5).


Figure 5: Medical device requirements
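A multi-level requirements hierarchy can be sketched as a simple tree. This is a simplification: in practice one input may satisfy several needs, so the connections form a graph rather than a strict tree. The identifiers and requirement texts are invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Requirement:
    req_id: str
    level: str  # "need", "input", or "output"
    text: str
    children: list = field(default_factory=list)  # lower-level requirements

    def add(self, child):
        """Attach a lower-level requirement and return it for chaining."""
        self.children.append(child)
        return child


def outline(req, depth=0):
    """Return the hierarchy as indented lines, needs first, outputs last."""
    lines = [f"{'  ' * depth}{req.req_id} [{req.level}] {req.text}"]
    for child in req.children:
        lines.extend(outline(child, depth + 1))
    return lines


need = Requirement("UN-1", "need", "Clinician can set a dose rate")
system_input = need.add(Requirement("DI-1", "input", "System accepts 0.1 to 10 mL/h"))
software_input = system_input.add(
    Requirement("DI-1.1", "input", "Software validates the entered rate"))
software_input.add(Requirement("DO-1", "output", "Rate validation module specification"))

tree = outline(need)
```

Note that the input level is itself tiered here (system, then software), matching the observation that real projects have more than three levels.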

In addition to requirements, device manufacturers author, execute, and manage tests of many types. Tests (also sometimes known as test cases or test methods) are associated with one or more requirements to provide evidence of success for both validation and verification of needs, inputs, and outputs. Testing is rarely conducted only one time. In reality, multiple test executions are run over time to monitor and report on the trends during development. This leads to large data sets where each requirement in a project may have many test execution runs (Fig. 6).


Figure 6: Medical device tests
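Since each requirement can accumulate many test execution runs, a common need is to reduce the run history to the current status per requirement. The sketch below assumes a flat list of run records with ISO-formatted dates; the record layout and all values are illustrative.

```python
from collections import defaultdict

# (requirement, test, run date as ISO string, result); illustrative data
executions = [
    ("DI-1", "TC-10", "2023-03-01", "fail"),
    ("DI-1", "TC-10", "2023-04-12", "pass"),
    ("DI-2", "TC-11", "2023-03-05", "pass"),
    ("DI-2", "TC-11", "2023-05-20", "fail"),
    ("DI-2", "TC-12", "2023-05-21", "pass"),
]


def latest_results(runs):
    """Return {requirement: {test: most recent result}}.

    ISO dates sort correctly as plain strings, so later runs overwrite
    earlier ones when processed in sorted order.
    """
    latest = defaultdict(dict)
    for req, test, _date, result in sorted(runs, key=lambda r: r[2]):
        latest[req][test] = result
    return {req: dict(tests) for req, tests in latest.items()}


def run_count(runs, req):
    """Number of executions recorded against one requirement."""
    return sum(1 for r in runs if r[0] == req)


status = latest_results(executions)
```

The full run list is kept untouched, so trends over time can still be reported from the same source data.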

Note that for some time, FDA has stated its intention to modernize its Quality System Regulation (21 CFR Part 820) to align more closely with ISO 13485 (5). If or when that happens, manufacturers will still be required to satisfy similar definitions of requirements, tests, and their relationships.

Trace Matrices

One of the most important tools used to help device manufacturers understand their data and connections during development is the trace matrix. Traces are used in many areas to show schematic relationships between risks, requirements, tests, documentation, and other data items and containers. Documents are included in the list because, in a data-centric approach to development, they are technically structured containers of structured data and relationships.

There are innumerable traces possible. A simple example is a trace showing each need (the high-level requirements) and the lower-level inputs that satisfy each need. Such a trace shows any needs for which there are no lower-level inputs. Transposing the trace, it is apparent if any inputs do not satisfy any needs and may therefore be considered “orphaned.” Traces are easy to generate when using the data-centric approach because dynamic connections between data items were created throughout the development process. The word dynamic is important because it implies that connections may change over time. If the need that had no lower-level input satisfying it is updated to have one or more connected inputs, the trace updates accordingly. This is one of the most powerful results of the structured approach; it leads to the ability to generate traces automatically and dynamically update traces over time. Such an approach dramatically reduces resource time to generate important traces and eliminates the need for manual/visual verification of trace accuracy. The data model can, of course, be expanded to include items for BOM, clinical data, and any other data that connects to development items. The fundamental items in the model are risks, requirements, tests, documents, and the connections between them (Fig. 7).


Figure 7: Medical device development data model
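The need-to-input trace and its transpose can be sketched directly from a set of connections. The identifiers and the "satisfies" link set below are assumptions for illustration; the trace is rebuilt from the connections each time, which is what makes it dynamic.

```python
needs = ["UN-1", "UN-2", "UN-3"]
inputs = ["DI-1", "DI-2", "DI-3"]

# "satisfies" connections stored as (input, need). Illustrative data.
satisfies = {("DI-1", "UN-1"), ("DI-2", "UN-1"), ("DI-2", "UN-2")}


def build_traces():
    """Rebuild both views of the trace from the current connections."""
    need_trace = {n: sorted(i for i, m in satisfies if m == n) for n in needs}
    input_trace = {i: sorted(n for j, n in satisfies if j == i) for i in inputs}
    return need_trace, input_trace


def gaps(trace):
    """Items in a trace that have no connected items."""
    return [item for item, linked in trace.items() if not linked]


need_trace, input_trace = build_traces()
unsatisfied_needs = gaps(need_trace)   # needs with no lower-level input
orphaned_inputs = gaps(input_trace)    # inputs that satisfy no need

# A connection is added later; regenerating the trace reflects it.
satisfies.add(("DI-3", "UN-3"))
need_trace_2, input_trace_2 = build_traces()
```

No one edits the trace by hand: closing the gap means adding the connection, and the next generated trace shows the result.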

A trace may be centered on a particular data item. It may show a design input requirement together with whatever connections are desired. For example, a requirement may show the higher-level needs it satisfies, the lower-level requirements associated with it, its tests, current test execution results, risk controls that it implements, and any failure modes defined for the requirement (Fig. 8). This type of trace puts a focus on one specific item and displays its connections.


Figure 8: Medical device data item trace
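An item-centered trace amounts to collecting every connection that touches one item and grouping the results by link type. In this sketch the link type names and identifiers are hypothetical.

```python
from collections import defaultdict

# Typed connections: (source, link type, target). Illustrative data.
connections = [
    ("DI-1", "satisfies", "UN-1"),
    ("DI-1.1", "refines", "DI-1"),
    ("TC-10", "verifies", "DI-1"),
    ("DI-1", "implements", "CTRL-1"),
    ("FM-3", "failure_mode_of", "DI-1"),
    ("TC-11", "verifies", "DI-2"),
]


def item_trace(item_id):
    """Group everything connected to one item by link type and direction."""
    view = defaultdict(list)
    for source, link, target in connections:
        if source == item_id:
            view[f"{link} ->"].append(target)    # outgoing connection
        elif target == item_id:
            view[f"<- {link}"].append(source)    # incoming connection
    return dict(view)


di1 = item_trace("DI-1")
```

The same function works for any item in the project, so the focused view does not have to be built separately for each requirement.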

Another type of trace is an overview trace showing a series of data items with some level of hierarchical order and other related connections. In this situation, the user may need to see a list of the needs in a device project along with evidence of any existing validation tests for the needs. For each need, the trace may then show the next level of requirements (inputs) that satisfy each need, along with any tests or risks (failure modes) for the inputs. Then, the trace displays the ultimate design outputs for each design input, again with any potential risks and tests. Each design output may also display whether it is considered an essential output, something required by 21 CFR 820.30: “…and shall ensure that those design outputs that are essential for the proper functioning of the device are identified” (6).


Figure 9: Medical device overview trace

Any number of traces can be defined one time and then used multiple times in any project. Traces automatically populate and dynamically update. Referring back to Figure 7, a trace traverses the lines between the circles and rectangles. It can start at any point and follow any path. Traces are an almost “free result” of using the data structure approach described in this article.
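Defining a trace once and reusing it can be sketched as a path of link types that is followed from any starting item. The edge data, link type names, and the `run_trace` helper are all hypothetical; chains that stop early are returned separately so that gaps remain visible.

```python
# Edges keyed by (item, link type) -> connected items. Illustrative data.
edges = {
    ("UN-1", "satisfied_by"): ["DI-1", "DI-2"],
    ("DI-1", "allocated_to"): ["DO-1"],
    ("DI-2", "allocated_to"): ["DO-2", "DO-3"],
    ("DO-1", "verified_by"): ["TC-20"],
    ("DO-3", "verified_by"): ["TC-21"],
}


def run_trace(start, path):
    """Follow a sequence of link types from a starting item.

    Returns (complete chains, chains that ended before the path did).
    """
    open_chains = [[start]]
    dead_ends = []
    for link in path:
        extended = []
        for chain in open_chains:
            targets = edges.get((chain[-1], link), [])
            if not targets:
                dead_ends.append(chain)
            extended.extend(chain + [t] for t in targets)
        open_chains = extended
    return open_chains, dead_ends


# The trace definition is just data, so it can be reused across projects.
need_to_test = ["satisfied_by", "allocated_to", "verified_by"]
complete, incomplete = run_trace("UN-1", need_to_test)
```

Starting elsewhere or following a shorter path needs no new code, for example `run_trace("DI-2", ["allocated_to"])`.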


This article has covered some of the basics of using internally structured data for authoring, connecting, and tracing common data items used in medical device product development. The pharmaceutical industry has different data items and connections, but it too must adopt a data-centric approach to gain the advantages that have been demonstrated with devices.

The power is evident in the ability to manage and render data in various ways, over a period of time, without altering the source data. This fundamental aspect of internally structured data is what gives it the power to handle all types of reporting, whether device or pharmaceutical. In the next article, we will examine examples of data mapping and tracing from both medical device and pharmaceutical development and show how they have more in common than many may realize.


About Cognition Corporation

Cognition Corporation, headquartered in Lexington, Massachusetts, develops, sells, and supports product development and compliance solutions for the pharmaceutical and medical device industries. Its Software-as-a-Service solutions help companies meet regulations faster with real-time traceability, guided design controls, and “change once, update everywhere” functionality, turning manual and disconnected data into streamlined, structured submissions that help them get to market faster.

Visit www.cognition.us.

About Astrix

Astrix is the unrivaled market leader in creating & delivering innovative strategies, technology solutions, and people to the life science community. Through world-class people, process, and technology, Astrix works with clients to fundamentally improve business, scientific, and medical outcomes and the quality of life everywhere. Founded by scientists to solve the unique challenges of the life science community, Astrix offers a growing array of fully integrated services designed to deliver value to clients across their organizations. To learn the latest about how Astrix is transforming the way science-based businesses succeed today, visit www.astrixinc.com.


1. Astrix, Inc. Introduction to Pharmaceutical Data Items and Their Structure. Accessed October 12, 2023.

2. ISO. ISO 14971:2019 – Application of risk management to medical devices. December 2019.

3. ISO. ISO/TR 24971:2020 – Guidance on the application of ISO 14971. June 2020.

4. FDA. 21 CFR 820.30 – Design controls. Accessed October 12, 2023.

5. FDA. FDA Proposal to Align its Quality Systems with International Consensus Standard Will Benefit Industry and Other Regulators. February 2022.

6. FDA. 21 CFR 820.30 – Design controls. Accessed October 12, 2023.


