Access and validation

Governance details

Documents or webpages that describe the overall governance of the data source and processes and procedures for data capture and management, data quality check and validation results (governing data access or utilisation for research purposes).

Biospecimen access

Are biospecimens available in the data source (e.g., tissue samples)?

Yes

Biospecimen access conditions

Informed consent forms are signed by participants. https://files.genomicsengland.co.uk/documents/Patient-Information-Research-V1.4.pdf

Access to subject details

Can individual patients/practitioners/practices included in the data source be contacted?

No

Description of data collection

Clinical and phenotypic data: Sourced from Electronic Medical Records from NHS
Genomic data: Sequencing

Event triggering registration

Event triggering registration of a person in the data source

Disease diagnosis
Practice registration
Start of treatment
Other

Event triggering registration of a person in the data source, other

Genomic testing

Event triggering de-registration of a person in the data source

Other

Event triggering de-registration of a person in the data source, other

Exit from the resource is possible only upon a change in consent status

Event triggering creation of a record in the data source

Diagnosis of disease, Hospital discharge, Recording of congenital or genetic abnormality, Hospital stay, Hospital procedure, Genetic sequencing

Data source linkage

Linkage

Is the data source described created by the linkage of other data sources (prelinked data source) and/or can the data source be linked to other data sources on an ad-hoc basis?

Yes

Linkage description, pre-linked

Genomic data generated upon patient enrolment are linked to other data sources to provide additional clinical information. For cancer data, the linkage is to the Cancer Analysis System (CAS); NCRAS and SACT are accessed within CAS. For clinical secondary care data, the linkage is to Hospital Episode Statistics (HES).

Linked data source 1

Pre linked

Is the data source described created by the linkage of other data sources?

Yes

Data source, other

Cancer Analysis System (CAS)

Linkage strategy

Deterministic

Linkage variable

“participant_id” is the main linkage variable between a participant's genomic data and other associated data. Platekeys are also used, but the field name varies slightly between tables: “plate-key”, “platekey”, “germline_sample_platekey”.

Linkage completeness

NCRAS: 94%; SACT: 44%
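The deterministic strategy and the completeness figures above can be illustrated with a minimal sketch. It assumes an exact-match join on “participant_id”, treats the three platekey field names as aliases of one field, and defines completeness as the share of genomic participants with at least one matching external record. The function names, the example rows, and that completeness definition are hypothetical and are not taken from the data source's own pipeline.

```python
# Field-name variants for the sample identifier, as listed under "Linkage variable".
PLATEKEY_ALIASES = ("plate-key", "platekey", "germline_sample_platekey")


def normalise_record(record):
    """Return a copy of the record with any platekey variant renamed to 'platekey'."""
    out = {}
    for key, value in record.items():
        out["platekey" if key in PLATEKEY_ALIASES else key] = value
    return out


def deterministic_link(genomic, external, key="participant_id"):
    """Exact-match join of genomic rows to external rows on `key`.

    Returns (linked_rows, completeness). Completeness here is an assumed
    definition: matched genomic participants / all genomic participants.
    """
    # Index the external rows by the linkage key for constant-time lookup.
    index = {}
    for row in external:
        index.setdefault(row[key], []).append(row)

    linked = []
    matched = set()
    for row in genomic:
        for hit in index.get(row[key], []):
            linked.append({**normalise_record(row), **normalise_record(hit)})
            matched.add(row[key])

    participants = {row[key] for row in genomic}
    completeness = len(matched) / len(participants) if participants else 0.0
    return linked, completeness


# Made-up example rows for illustration only; not real participant data.
genomic = [
    {"participant_id": 1, "plate-key": "LP001"},
    {"participant_id": 2, "plate-key": "LP002"},
]
cancer = [
    {"participant_id": 1, "germline_sample_platekey": "LP001", "diagnosis": "C50"},
]

linked, completeness = deterministic_link(genomic, cancer)
```

In this toy example one of the two genomic participants has a matching cancer record, so the sketch reports a completeness of 0.5; the reported 94% and 44% figures may be computed differently.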

Linked data source 2

Pre linked

Is the data source described created by the linkage of other data sources?

Yes

Data source, other

HES (Hospital Episode Statistics)

Linkage strategy

Deterministic

Linkage variable

“participant_id” is the main linkage variable between a participant's genomic data and other associated data. Platekeys are also used, but the field name varies slightly between tables: “plate-key”, “platekey”, “germline_sample_platekey”.

Linkage completeness

98%

Data management specifications that apply for the data source

Data source refresh

October
January
April
July

Informed consent for use of data for research

Possibility of data validation

Can validity of the data in the data source be verified (e.g., access to original medical charts)?

Yes

Data source preservation

Are records preserved in the data source indefinitely?

Yes

Approval for publication

Is an approval needed for publishing the results of a study using the data source?

Yes

Data source last refresh

Common Data Model (CDM) mapping

CDM mapping

Has the data source been converted (ETL-ed) to a common data model?

Yes

CDM Mappings

Data source ETL status

In progress