Emanate Blogs

Data Management – Changing Industry


Data is the in-demand resource of the modern world, but instead of being scarce, like gold, it is almost overwhelmingly abundant. Our “gold rush” is not so much a mad dash to get more data, as it is a demand to understand, manage, and apply that data.

Clinical Data Management is no different.

In the days of paper records, it took an average of 125 days to enter data and an average of 189 days to clean it.

  • 48% of queries raised were a result of missing data
  • 35% of queries were a result of inconsistent data
  • 9% were a result of out-of-range data
Electronic Data Capture (EDC) reduced the time required. Now:
  • 5 days to enter data
  • 22 days to clean data
  • zero queries for missing or out-of-range data
  • only 5% queries for inconsistent data

In part, this reduction in queries is due to the fact that we can build safeguards into the database that catch issues before they ever become queries.
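Those safeguards are typically edit checks that fire at data entry. Here is a minimal sketch of that idea in Python; the field names, required-field set, and reference ranges are hypothetical examples, not any specific EDC system's API.

```python
# Minimal sketch of entry-time edit checks like those an EDC can run.
# Field names and reference ranges below are illustrative only.

def check_field(field, value, required_fields, ranges):
    """Return a query message if the value fails a check, else None."""
    # Missing-data check: required fields cannot be blank.
    if value is None or value == "":
        if field in required_fields:
            return f"{field}: missing value"
        return None
    # Out-of-range check: value must fall within the expected range.
    low, high = ranges.get(field, (None, None))
    if low is not None and not (low <= value <= high):
        return f"{field}: {value} outside expected range {low}-{high}"
    return None
```

Because checks like these run the moment the value is keyed in, the site can correct the entry immediately instead of answering a query weeks later.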

When EDC was adopted by the industry, we believed we would be able to lock a database in two weeks. However, in the last 10 years, we've barely reduced the cycle time from Last Patient Last Visit to database lock by two weeks, going from approximately 9 weeks to 7 weeks. This is because we have effectively taken many of the inefficiencies of the paper-based process and carried them into an electronic tool.

There are companies who have created tools that interface with the EDC and seamlessly transfer the data between databases, but the industry has been slow to adopt these options.

Now we are seeing changes to study designs and where data is coming from, both of which affect the way we collect data beyond EDC, and the way we work with it.

Industry Changes Data Managers Need to Understand

A couple of things that are happening in our industry that affect data managers are a shift to adaptive design for studies, and decentralized trials.

Adaptive Studies

Adaptive design consists of combining study phases into one protocol (Phase I/II, Phase II/III). Its intent is to use data collected in the earlier stages of the study to adapt its design moving forward in accordance with pre‐specified rules defined in the study protocol.

Adaptive design by its nature requires multiple protocol amendments, which in turn create multiple database amendments. As a result, a single database that can be used for dose escalation, dose expansion, PK sub-studies, and Phase II needs to be planned at the initial build.

It is no longer practical to expect that a database will escape multiple protocol amendments. As a result, we as data managers need to carefully pre-plan our resourcing for constant database amendments and data cleaning.

Decentralized Trials


Not all trials will be decentralized. However, the healthcare industry is changing and telehealth is becoming mainstream.

Imagine a sponsor conducting a clinical trial on hypertension. The patient receives treatment at the site and is sent home with a blood pressure monitor. All of the follow-up visits are conducted via video chat, and the blood pressure readings are automatically sent to the site and sponsor. During the video chat, the physician collects any adverse events. For the final visit, the subject comes into the office for a last check-up and collection of the equipment. There is no reason this can't be done today.

We already have phone calls embedded into our clinical trials. It isn’t that large of a leap to use telehealth in clinical trials. This means that there will be more disparate sources of data and different ways of actually looking at data. As data managers we have to determine how we manage that.

EDC is No Longer the Center of the Data Management Universe

Not all that long ago, EDC was the center of clinical data management. Everything had to go into the EDC.


Today there is a paradigm shift. Data is being collected in many different places. We still have EDC, but we also have IxRS, where we capture dosing information. We use ePRO or eCOA questionnaires to capture quality-of-life information. There are imaging and safety data, and we have eConsent and eSource.

The landscape is changing, even just from a technology perspective. There are new sources of data seemingly every day, including wearables, sensors, and more. The "data explosion," as we like to call it, is real. I often tell my clients that the clinically relevant data is no longer only in EDC. We still collect demographics, medical history, adverse events, and IP exposure in EDC. What has changed are the volume of data and the number of specialty labs. For example, PK and PD results are not collected in EDC; this data is provided in a different medium by our vendors. We're seeing genotyping and subgenotyping, and where we might have had one file in the past, we now see five different files coming in from five different vendors. Data includes biomarkers, reports from central labs, imaging, wearables, questionnaires, PK/PD, clinical events committees, and sensors.

At Emanate, we have clients that are using sensor and wearable data quite a bit. For example, the accelerometer in a FitBit captures a data point every single time the wearer moves. As a result, millions upon millions more data points are being captured. The data are also more complex: CDISC has provided a clear path for lab data, but the same can't be said for accelerometer data. An accelerometer dataset arrives as millions upon millions of records that need to be summarized. The standard cookie-cutter approach to data no longer applies.
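What "summarizing" those raw records might look like can be sketched in a few lines of Python. This is only an illustration, assuming the raw feed arrives as (subject, date, magnitude) tuples; the record layout and the summary statistics chosen are hypothetical, not a CDISC-defined structure.

```python
# Minimal sketch: collapse raw accelerometer readings into
# per-subject, per-day summary records. The (subject, date, magnitude)
# record layout is an illustrative assumption.
from collections import defaultdict
from statistics import mean

def summarize_accelerometer(records):
    """Group raw readings by (subject, day) and compute summary stats."""
    buckets = defaultdict(list)
    for subject, day, magnitude in records:
        buckets[(subject, day)].append(magnitude)
    return {
        key: {
            "n_readings": len(vals),          # how many raw data points
            "mean_magnitude": round(mean(vals), 3),
            "peak_magnitude": max(vals),
        }
        for key, vals in buckets.items()
    }
```

The point is that millions of raw rows reduce to one analyzable record per subject per day; in practice the reduction would be agreed with biostatistics before the database is built.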

Another example: I received a dataset of over 1,600 human proteins that was not in SDTM format. After discussing it with the biostatistics team, I learned they were not planning to analyze the data using SAS. Instead, they asked me to transfer the raw files to the biomarker group. More different types of data are being collected now than what we think of as "standard data."

Industry Changes Require Flexible Data Managers

It's more important than ever that we, as data managers, are flexible and work to understand how the data needs to be used. We have to sit down with our clients and determine the best way to pull the data in, and how best to work with the vendors. As the life sciences industry lumbers slowly but surely toward acceptance of digital data sources and non-traditional trial designs, we have to be ready to meet new data collection and analysis needs. As data managers, we are going to have to be a step ahead, thinking strategically about the best database designs for different situations.
