Most of a data management career eventually blurs together. You recall the name of the company and the colleagues you worked alongside, but not the studies you worked on. Mainstream DM activities are repetitive, whether that's ensuring data is clean as per expectations or working across functional groups towards a pivotal database lock. It's only the times you were able to make a difference that stand out, and those are what remain in my mind.

Here are just a few examples.

Example 1: Drowning in documentation

The problem:
The amount of documentation that a data manager has to reference constantly is phenomenal. Working practices, SOPs, templates, guidance documents... the list is endless, unlike time! Typically one SOP references another SOP, which in turn references numerous other documents that may or may not be relevant. Not all documents are stored in the same location, and in some cases a referenced document is no longer current or has been replaced by a new one (while the original referring document has yet to be updated). The data manager is left trying to follow correct procedure, but how can they do that when it's akin to searching for a needle in a haystack?

Typically in many DM departments:

  • Documentation of the clinical trial process is important in any department, but gaining access to certain documents, or even knowing where they are located, can be problematic.

  • Many study processes may be study-team-defined, leading to non-standardised methodologies. As a result there is no coherent process or framework for consolidating best practice.

  • Differing approaches are also taken by localised affiliates.

The solution:

I worked with various stakeholders to conduct a comprehensive review of all data management documentation and ascertain its validity. The objective was to provide a clear framework for all DM documentation that references the clinical trial process (e.g. SOPs, ICH guidance, working practices and forms/templates). This was presented as a flow map, the 'DM-ProFloMap', consisting of 3 sections:


  • study-start

  • study conduct

  • study close-out


It was important that the flow map not only described a process incorporating best practices already being followed, but also introduced elements that teams believed would enhance the effectiveness of clinical trial conduct.


The tool guided users to each task,

e.g. "I am currently looking at database lock for a phase 1 study. What procedures exist to undertake that?"

1) The user opens the tool, navigates to the study close-out section and follows a flow chart that displays all actions associated with locking a study.

2) Clicking on any particular action brings up a listing of all associated SOPs, working practices, templates and guidance documents.

3) The user can then easily navigate to any of the referenced documents.


Benefits of the tool:

  • Better relationships between functions and departments

  • Ownership

  • Transparency leading to mutual understanding

  • Basis for Continuous Improvement

  • Training & education tool

  • Better planning

  • Faster study start-up

  • No need to “reinvent the wheel”

  • Allows for focus on important matters

  • Framework to ensure processes are global and compliant

  • Leads to consistent SOPs

  • Reducing amendments / loops in the process

  • Framework for creating relevant metrics for performance, efficiency, resource, cost

  • More professional approach

  • Eliminating waste

  • People taking ownership of processes

  • Fast & efficient

  • Achieving a defined level of quality

  • Ensuring that regulatory requirements are met

The tool ensured that everyone was confident they were always referring to the latest documentation. It ended up as a 'one-stop shop' for all DM documentation globally at a major pharma.

Example 2: Drowning in data

Now obviously data management is about data and how it's managed, and 'data' is the common theme across every client I have worked for. There's nothing different about an AE dataset at Pfizer, AstraZeneca or MSD. What is different, however, is how the data is managed. In some scenarios, it's been quite a shock to see how data is accessed, reviewed and cleaned (not that I'm referring specifically to any of the previously named companies!).


The problem:
In particular, the problems occur with the metadata: data about the data. I'm not referring here to the clinical data itself, which should of course be easily accessible from the trial database or captured in SDTM format. It's more often about getting topline information on the data captured up to a certain timepoint or for a certain subset, e.g.:

 - the clinical operations team needs to know how many patients consented under protocol version 3.6 have more than 20 open queries and have completed at least 4 cycles of treatment. [Typically the lead data manager compiles a listing of all patients consented to v3.6 and VLOOKUPs it against query and treatment-cycle listings, though many DMs end up doing this manually: time-consuming and prone to errors!]

 - management needs to know whether problem sites A, B, C in phase 1 studies are also causing issues for late-phase studies. [Typically an email is sent to all DMs: "Can you all check whether your studies use sites A, B, C and whether you have had ISSUES with them?" Unclear and vague, with no benchmark for DMs to go by.]

 - non-EDC users such as medics need topline data too; they may have access to EDC, but have neither the time nor the inclination to log in, let alone navigate the system.
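The first scenario above is essentially a join-and-filter across three listings. As a minimal sketch (hypothetical column names and toy data, not the actual tool), the manual VLOOKUP exercise could be expressed in a few lines of pandas:

```python
import pandas as pd

# Hypothetical listings a lead DM might pull: consent log,
# query listing, and treatment-cycle listing.
consents = pd.DataFrame({
    "subject_id": ["001", "002", "003"],
    "protocol_version": ["3.6", "3.5", "3.6"],
})
queries = pd.DataFrame({
    "subject_id": ["001"] * 25 + ["003", "003"],
    "status": ["Open"] * 25 + ["Open", "Closed"],
})
cycles = pd.DataFrame({
    "subject_id": ["001", "002", "003"],
    "cycles_completed": [5, 4, 3],
})

# Count open queries per subject.
open_q = (queries[queries["status"] == "Open"]
          .groupby("subject_id").size()
          .rename("open_queries").reset_index())

# Join everything onto the consent log, then apply the filter
# from the scenario: v3.6, >20 open queries, >=4 cycles.
df = (consents.merge(open_q, on="subject_id", how="left")
              .merge(cycles, on="subject_id", how="left")
              .fillna({"open_queries": 0}))
result = df[(df["protocol_version"] == "3.6")
            & (df["open_queries"] > 20)
            & (df["cycles_completed"] >= 4)]
print(len(result))  # -> 1 (only subject 001 meets all criteria)
```

The point is not the specific library: it's that once the listings live in one place with consistent keys, the "how many patients..." questions become one reproducible query instead of an error-prone manual cross-check.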

At one previous client, the department had been working on an in-house tool for many years, with the latest effort programmed in Tableau. The project did not succeed for a number of reasons:

- Led by project managers with no direct DM experience and no programming experience
- Programmers unfamiliar with DM processes or day-to-day EDC activity
- Programmers unable to 'map' EDC raw datasets into a standardised data silo
- Inflexibility of Tableau functions

The solution:

The solution that I co-created was a powerful yet user-friendly metrics tool that extracted numerous patient variables from the clinical database (EDC-independent, so it works with multiple EDC platforms), from IWRS (to ascertain whether a patient's latest visit has been completed), and from a separate medical query database. The system showed key study metrics at various levels, enabling the user to drill down as required:

  • Region

  • Country

  • Site

  • Patient

The metrics tool provided the following (amongst other things):

  • Tracking of studies against department KPIs

  • Patient status: screened, randomised, completed, withdrawn, lost to follow up, in follow-up

  • Query status tally (open, closed, unreviewed)

  • Query text (read-only) from both EDC and medical systems

  • Query ageing

  • Ownership of queries (site/DM/CRA)

  • Entered and missing visits

  • DE backlogs, missing data, incorrect data, unconsented data

  • Visit forecasting (what dates the next 2 visits should be happening based on last visit date and pt status)

  • CT scan dates, missing scans, projected scan dates, Number of scans overdue
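The visit-forecasting metric above is simple date arithmetic once the protocol's visit schedule is encoded. A minimal sketch (assuming a hypothetical fixed 21-day cycle interval; real schedules of assessments vary per protocol):

```python
from datetime import date, timedelta

def forecast_next_visits(last_visit: date, interval_days: int, n: int = 2):
    """Project the dates of the next n visits from the last completed
    visit, assuming a fixed per-protocol interval (illustrative only)."""
    return [last_visit + timedelta(days=interval_days * i)
            for i in range(1, n + 1)]

# e.g. a patient last seen on 1 March on a 21-day treatment cycle
print(forecast_next_visits(date(2023, 3, 1), 21))
# -> [datetime.date(2023, 3, 22), datetime.date(2023, 4, 12)]
```

In the real tool the interval would come from the study's visit schedule and the patient's status (screened, on treatment, in follow-up), but the forecasting logic is just this projection applied across all patients.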

DMs were able to use the tool as a tracking mechanism at the patient level and could enter comments (e.g. "As per discussion with CRA, pt was unable to attend visits 5 and 6 but will discuss with site staff whether pt will continue on study"). The tool enabled all patient data to be refreshed without end users having to re-enter comments.
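Surviving a data refresh is the key design point here: comments must live separately from the refreshed extract and be re-attached by patient key. A hedged sketch of that pattern (hypothetical column names, toy data):

```python
import pandas as pd

# User comments are stored keyed by patient ID, separate from the
# metrics extract, so a refresh can never overwrite them.
comments = pd.DataFrame({
    "subject_id": ["001"],
    "dm_comment": ["Pt missed visits 5-6; site to confirm continuation."],
})

def refresh(metrics_extract: pd.DataFrame) -> pd.DataFrame:
    """Re-attach saved DM comments to a freshly pulled extract."""
    return metrics_extract.merge(comments, on="subject_id", how="left")

# A fresh pull from the clinical database: comments reappear automatically.
new_pull = pd.DataFrame({"subject_id": ["001", "002"],
                         "open_queries": [3, 0]})
refreshed = refresh(new_pull)
```

The same join approach handles new patients gracefully: anyone without a saved comment simply gets a blank comment field.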


A data cut-off date option allowed users to see which patient visits would be expected by a certain date (useful for interim analyses, DBL etc.).

In terms of 'management of data', a DM would use this as their main go-to tool to see where action was required:

"I have been informed the CRA is visiting site XYZ in two weeks' time, so I will ensure all medical and DM queries are ready, and remind her about data entry backlogs at the site that can be discussed on her visit."

"Japan has public holidays in two months' time. Which patient visits will that impact at the Japanese sites?"


Management tool

By collating all metrics across the therapeutic area (TA), the tool enabled management to look at resourcing. As more studies were supported by the tool, it was possible to extrapolate key data into a management view giving a higher-level overview at the TA level. This could show:

- Number of active patients on certain studies

- % of pts in follow up

- Query backlogs with DM, Site, CRA

- Comparison of sites across various studies (Useful also for RBM)

The tool was ultimately used by DM to track where they were with cleaning on a study, and essentially became the go-to tool for data managers as well as other functions. The metrics tool that I co-developed succeeded because it was:

  • Developed with a DM background, i.e. knowledge of what the end users (DM, clinical, medics...) actually need

  • Not over-complex from an end-user perspective: it was MS Excel-native, so users had nothing new to learn

  • Little interaction required from the end user and nothing to build: the tool showed exactly what the status of the study was and where the issues were

  • Programmed from a DM perspective (no external programming resource required)

[Screenshots: study close-out flow maps and DM documents listing]

How my past has influenced my work.

From data management to process improvement
