End of study data transfer to site: Regulations and retention times


Whilst it is often an afterthought, ensuring that a site has a copy of all its patient data remains an important requirement. After study closure or termination, sites should have independent access to all patient CRFs, whether on paper, on CD-ROM or electronically via the cloud. Sites may need to access these patient records for future research, to permit any medical follow-up which may be warranted (e.g. follow-up for delayed adverse events) or to ensure they are inspection-ready for regulatory audits. It is therefore crucial, and indeed mandated by regulatory authorities, that sites have independent access to patient data after study closure.

During a study, the sponsor’s emphasis is on data collection and repeated cycles of data review and cleaning (see below). Little emphasis is placed on many post-study-closure activities, such as the provision of patient data to investigator sites.



ICH/GCP requires that investigators have access to a copy of their patients' CRF data after study close-out. Regulation around this has been in existence for more than 20 years, although the actual retention periods for the records themselves differ at both national and regulatory levels.

Data retention periods vary from 2 to 25 years after study closure/termination, depending on the regulatory authority and the type of study (studies involving children, in-vitro specimens or pregnant women typically mandate longer retention periods, for example). The FDA requires that investigators and sponsors retain records for 2 years following study termination/completion, or for 2 years after a premarket approval (PMA) application or new drug application (NDA) is approved:

21 CFR Part 312.57 “(c) A sponsor shall retain the records and reports required by this part for 2 years after a marketing application is approved for the drug; or, if an application is not approved for the drug, until 2 years after shipment and delivery of the drug for investigational use is discontinued and FDA has been so notified.”


In the EU, TMFs have to be retained for 25 years, whilst medical records (which would include patient data) come under national scope, as set out in the Clinical Trials Regulation (EU) No 536/2014:


Article 58: Archiving of the clinical trial master file: “Unless other Union law requires archiving for a longer period, the sponsor and the investigator shall archive the content of the clinical trial master file for at least 25 years after the end of the clinical trial. However, the medical files of subjects shall be archived in accordance with national law.”

Note, however, that European Directive 2003/63/EC (Annex 1, Chapter 5) states:


Marketing authorisation holders must arrange for essential clinical trial documents (including case report forms) other than subject’s medical files, to be kept by the owners of the data:

- for at least 15 years after completion or discontinuation of the trial,

- or for at least two years after the granting of the last marketing authorisation in the European Community and when there are no pending or contemplated marketing applications in the European Community,

- or for at least two years after formal discontinuation of clinical development of the investigational product.

It is therefore crucial that the sponsor complies with all applicable regulatory and national expectations, and ensures that the media and process chosen for archiving patient data can support the required retention periods. Numerous options are available to the sponsor.
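The retention rules quoted above come down to simple date arithmetic. The following is a minimal sketch only: the function names (`add_years`, `fda_sponsor_retention_end`, `eu_tmf_retention_end`) and the example dates are hypothetical, and it models just the two rules quoted from 21 CFR 312.57(c) and Article 58. National law for subjects' medical files and the Directive 2003/63/EC alternatives are not modelled.

```python
from datetime import date
from typing import Optional


def add_years(d: date, years: int) -> date:
    """Return the same calendar day `years` later (29 Feb falls back to 28 Feb)."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        # Only reached for 29 Feb when the target year is not a leap year.
        return d.replace(year=d.year + years, day=28)


def fda_sponsor_retention_end(approval: Optional[date],
                              discontinued_and_notified: Optional[date]) -> date:
    """Earliest release date under the 21 CFR 312.57(c) text quoted above.

    approval: date the marketing application was approved, if it was.
    discontinued_and_notified: date investigational shipment was discontinued
    and FDA notified, used only when there is no approval.
    """
    if approval is not None:
        return add_years(approval, 2)
    if discontinued_and_notified is not None:
        return add_years(discontinued_and_notified, 2)
    raise ValueError("retention clock has not started yet")


def eu_tmf_retention_end(end_of_trial: date) -> date:
    """Minimum TMF archiving period quoted from Article 58: 25 years."""
    return add_years(end_of_trial, 25)


# Hypothetical study dates, for illustration only.
fda_end = fda_sponsor_retention_end(approval=date(2020, 2, 29),
                                    discontinued_and_notified=None)
eu_end = eu_tmf_retention_end(end_of_trial=date(2019, 6, 30))

# A study subject to both rules must keep records until the later date.
print("FDA:", fda_end, "EU TMF:", eu_end, "keep until:", max(fda_end, eu_end))
```

Where several rules apply to the same study, taking the latest date, as in the last line, is the conservative choice.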

What should be included for investigator sites?


Typically, patient CRFs should be made available to the sites, whether on paper or in electronic form (the latter being a more favourable option in this day and age!). What is often misinterpreted is the need for an audit trail that clearly shows how the final copy of the CRF came into being, through many iterations of data updates and corrections over months and years. Whilst all the data in the CRF should also be evident in the patient's source notes, an audit trail of the CRF should reflect any changes that were made to the CRF, and this is particularly important for future site inspections. This requirement is clearly indicated in ICH E6, which requires the site to hold a copy of the CRFs (8.3.14) and of the corrections (8.3.15).

Whilst some EDC systems will allow full audit trails to be included as part of the PDF, this often proves problematic because of the increase in PDF processing time as well as in the number of pages (and file size!). A separate audit trail (in either PDF or Excel format) may therefore be warranted.

How soon should sites receive access to their data?

All sites should of course have access to their data via an EDC platform (or a paper site-copy if the study is still paper-based). Ultimately the inactivation of the EDC platform will dictate the timelines of patient data delivery to the sites. The sponsor needs to ascertain how soon after the clinical study report (CSR)/database lock (DBL) the study database can be decommissioned, and the following factors should be taken into account:

1) Sites should not be ‘closed’ before CSR/DBL, as the sponsor may still need information from the investigator (a database re-lock may result in additional queries going to site!). Once a site is closed with the IRB, it will in many cases refuse to answer any further queries except from regulatory authorities.

2) Some functions within the sponsor may lag behind data management and therefore it may not be possible to close out a site once data management deems it fit to do so.

3) There are many advantages to early site closure if all sponsor activities follow set procedures and all functions are aligned, with potential savings per site in the region of $20,000 and upwards.

The sponsor will have to decide on a study-by-study basis when to trigger the decommissioning of sites. What is of paramount importance is that sites are never left without any access to their patient CRF data once their access to EDC has been withdrawn (sites should at least have read-only access whilst patient PDFs are being produced!).
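One way to enforce this rule is a simple gate in the close-out checklist. The sketch below is an assumption-laden illustration: `SiteCloseout`, its fields and `can_withdraw_edc_access` are hypothetical names, and treating confirmed receipt by the site as a precondition is a choice made for this example rather than a rule taken from any particular EDC system.

```python
from dataclasses import dataclass


@dataclass
class SiteCloseout:
    """Hypothetical close-out status for one investigator site."""
    site_id: str
    pdfs_delivered: bool = False     # patient CRF PDFs (and audit trail) sent to the site
    receipt_confirmed: bool = False  # site has acknowledged receipt of its data


def can_withdraw_edc_access(site: SiteCloseout) -> bool:
    """EDC access may only be withdrawn once the site holds its own copy."""
    return site.pdfs_delivered and site.receipt_confirmed


# Illustrative sites only.
sites = [
    SiteCloseout("001", pdfs_delivered=True, receipt_confirmed=True),
    SiteCloseout("002", pdfs_delivered=True),  # delivered, not yet acknowledged
    SiteCloseout("003"),                       # PDFs still being produced
]

# Sites that must keep (at least read-only) EDC access for now.
blocked = [s.site_id for s in sites if not can_withdraw_edc_access(s)]
print("Keep read-only access for sites:", blocked)
```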

How can a sponsor ensure delivery of data to sites?


All clinical trial data generated by a site must be available to the investigator at all times, not only during the trial but also after it. These data should be verifiable via a copy that is not in the possession of the sponsor. These requirements are not met if data are captured in an electronic system and stored on a central server under the sole control of the sponsor. In this setup, the investigator does not hold an independent copy of the data and the sponsor therefore has exclusive control of it. This could extend to cloud-based systems owned or managed by the sponsor. Regulatory authorities may also have concerns where a third party is contracted to host the sponsor's CRF data as a long-term solution to give sites ongoing access to their data. Contracts may lapse, or hosting of the data may expire, before the end of a site's data retention period (for example, a sponsor may have negotiated only a 2-year hosting agreement, whereas some sites may have a local 25-year data retention requirement).
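The mismatch between hosting contracts and site retention periods can be checked mechanically. The following is a minimal sketch under stated assumptions: `hosting_gaps`, the site labels and all dates are hypothetical, chosen to mirror the 2-year hosting versus 25-year retention example above.

```python
from datetime import date
from typing import Dict


def hosting_gaps(hosting_end: date,
                 site_retention_end: Dict[str, date]) -> Dict[str, int]:
    """Map each site whose retention period outlasts the hosting
    contract to the shortfall in days."""
    return {
        site: (needed_until - hosting_end).days
        for site, needed_until in site_retention_end.items()
        if needed_until > hosting_end
    }


# Hypothetical study closed on 1 Jan 2024 with a 2-year hosting agreement.
hosting_end = date(2026, 1, 1)
site_retention_end = {
    "site_A": date(2026, 1, 1),  # local 2-year requirement: covered
    "site_B": date(2049, 1, 1),  # local 25-year requirement: not covered
}

gaps = hosting_gaps(hosting_end, site_retention_end)
for site, days in gaps.items():
    print(f"{site}: hosting ends {days} days before retention period does")
```

Any site that appears in the result needs either an extended contract or an independent copy of its data before hosting ends.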


Therefore, whilst CDs may appear archaic in their use for data transfer to sites, they do have their merits and may still be regarded as the safest medium with DVDs offering increased storage for larger sites.

Distribution of site data via cloud-based systems is, however, one area that should not be dismissed, especially where the focus is on a 'grab and drop' facility rather than a longer-term warehouse option for sites. Again, an independent third party providing a sponsor-independent system (whilst not mandated by the global regulatory frameworks) will often have the expertise in secure cloud-based solutions for distributing data to sites, including meeting the regulatory requirement to obtain confirmation of data receipt from the site. Delivery of data to sites can be a resource-intensive process for many sponsors to absorb, especially with larger phase 3 trials.

Whilst some EDC vendors have started to offer data access to sites after study completion as a one-stop solution for their clients, a cost-benefit analysis by sponsors may conclude that longer-term hosting costs outweigh the perceived benefits. From a regulatory stance, the close relationship between sponsor and EDC vendor may be seen as increasingly blurred in this setup, with the risk that sites do not have an independent copy of their own data. This is especially true where a sponsor still has access to the very same system from which the sites are accessing their patient data.

Independent setups may therefore be warranted for the delivery of patient data. A&O now offers a number of different end of study service plans that can be moulded around the needs of sponsors.


[Figure: data flow from site]
