Access and validation

Governance details

Documents or webpages describing the overall governance of the data source, the processes and procedures for data capture and management, the data quality checks and validation results, and the rules governing data access or use for research purposes.

Biospecimen access

Are biospecimens available in the data source (e.g., tissue samples)?

No

Access to subject details

Can individual patients/practitioners/practices included in the data source be contacted?

No

Description of data collection

The data collection process starts with business and technical scoping. Data source experts and medical information department experts identify the variables that meet our use cases.

An ELT (extract, load, transform) pipeline was developed in Java to build a data lake. Several data cleaning (data quality) and data transformation operations are then performed to produce our database. The DBMS used is PostgreSQL.

Several tests are performed on this ELT:

Data integration test: confirms that data from all sources has been loaded correctly into the target data lake, with threshold values checked.

Source-target data test: ensures that the intended data is loaded into the target system without being lost or truncated, and that the number of records loaded into the data lake matches the number of records in each source.
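
The source-target data test described above could be sketched as a per-source record count reconciliation plus a simple truncation check. This is only an illustration in Python, not the project's actual Java test code: the names `LoadCheck`, `reconcile_counts`, and `find_truncated`, the `tolerance` parameter, and the sample counts are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class LoadCheck:
    """Row counts for one source system and for its rows in the target."""
    source: str
    source_rows: int
    target_rows: int

    @property
    def missing(self) -> int:
        # Positive when rows were lost, negative when rows were duplicated.
        return self.source_rows - self.target_rows


def reconcile_counts(source_counts, target_counts, tolerance=0):
    """Compare per-source row counts with the counts found in the target.

    Returns all checks and the subset whose difference exceeds `tolerance`.
    """
    results = []
    for source, n_source in sorted(source_counts.items()):
        n_target = target_counts.get(source, 0)
        results.append(LoadCheck(source, n_source, n_target))
    failures = [c for c in results if abs(c.missing) > tolerance]
    return results, failures


def find_truncated(source_values, target_values):
    """Return keys whose target value is a shortened prefix of the source value."""
    return [
        key
        for key, value in source_values.items()
        if key in target_values
        and target_values[key] != value
        and value.startswith(target_values[key])
    ]


# Made-up counts, for illustration only.
source_counts = {"AXIGATE": 1200, "CORA": 950, "PHARMA": 430}
target_counts = {"AXIGATE": 1200, "CORA": 948, "PHARMA": 430}

results, failures = reconcile_counts(source_counts, target_counts)
for check in failures:
    print(f"{check.source}: {check.missing} record(s) missing in target")
```

In a real pipeline the two dictionaries would presumably be filled from `COUNT(*)` queries against each source and against the data lake; that wiring is left out here because the source does not describe it.
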
Event triggering registration

Event triggering creation of a record in the data source

Not available.

The source database is updated by cron jobs scheduled to run overnight.

Data source linkage

Linkage

Is the data source described created by the linkage of other data sources (prelinked data source) and/or can the data source be linked to other data sources on an ad-hoc basis?

Yes

Linked data source 1

Pre linked

Is the data source described created by the linkage of other data sources?

No

Data source, other

Air pollution index

Linkage strategy

Deterministic

Linkage variable

Geographical address

Linked data source 2

Pre linked

Is the data source described created by the linkage of other data sources?

Yes

Data source, other

AXIGATE

Linkage strategy

Deterministic

Linkage variable

IEP: all data sources have this variable. The IEP uniquely identifies an admission to the APHM and is associated with the IPP, which is the patient's number. An IPP can have one or more IEPs.
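
The relationship between IEP and IPP, and the deterministic linkage it supports, could be illustrated as follows. This is a minimal sketch under stated assumptions: the field names (`iep`, `ipp`, `drug`), the sample rows, and the helper functions are hypothetical and do not reflect the actual schema.

```python
from collections import defaultdict

# Hypothetical rows; identifiers and field names are made up for illustration.
admissions = [
    {"iep": "E001", "ipp": "P1"},
    {"iep": "E002", "ipp": "P1"},  # same patient, second admission
    {"iep": "E003", "ipp": "P2"},
]
prescriptions = [
    {"iep": "E001", "drug": "paracetamol"},
    {"iep": "E003", "drug": "amoxicillin"},
    {"iep": "E999", "drug": "ibuprofen"},  # no matching admission
]


def link_on_iep(admissions, records):
    """Deterministic linkage: attach the IPP to each record by exact IEP match."""
    ipp_by_iep = {}
    for row in admissions:
        iep, ipp = row["iep"], row["ipp"]
        # An IEP identifies one admission, so it must map to a single IPP.
        if ipp_by_iep.get(iep, ipp) != ipp:
            raise ValueError(f"IEP {iep} is attached to more than one IPP")
        ipp_by_iep[iep] = ipp

    linked, unlinked = [], []
    for record in records:
        ipp = ipp_by_iep.get(record["iep"])
        if ipp is None:
            unlinked.append(record)
        else:
            linked.append({**record, "ipp": ipp})
    return linked, unlinked


def ieps_per_patient(admissions):
    """Group admissions by patient: one IPP can have one or more IEPs."""
    grouped = defaultdict(list)
    for row in admissions:
        grouped[row["ipp"]].append(row["iep"])
    return dict(grouped)


linked, unlinked = link_on_iep(admissions, prescriptions)
print(ieps_per_patient(admissions))
print(len(linked), "linked,", len(unlinked), "unlinked")
```

Keeping the unlinked records in a separate list, rather than dropping them silently, makes it possible to report how many records failed to link.
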

Linked data source 3

Pre linked

Is the data source described created by the linkage of other data sources?

Yes

Data source, other

CORA(PMSI)

Linkage strategy

Deterministic

Linkage variable

IEP: all data sources have this variable. The IEP uniquely identifies an admission to the APHM and is associated with the IPP, which is the patient's number. An IPP can have one or more IEPs.

Linked data source 4

Pre linked

Is the data source described created by the linkage of other data sources?

No

Data source, other

Deprivation Index

Linkage strategy

Deterministic

Linkage variable

Geographical address

Linked data source 5

Pre linked

Is the data source described created by the linkage of other data sources?

No

Data source, other

Insee death

Linkage strategy

Combination

Linkage variable

Last name, first name, date of birth
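
The record does not say how the combination strategy is implemented. One plausible reading, sketched here purely as an assumption, is an exact match on normalised last name, first name, and date of birth, followed by a fuzzy name comparison restricted to records with the same date of birth. The field names, the 0.9 threshold, and the sample records are all hypothetical.

```python
import unicodedata
from difflib import SequenceMatcher


def normalize(name):
    """Upper-case a name, strip accents, and unify hyphens and spacing."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(no_accents.upper().replace("-", " ").split())


def match_key(record):
    return (
        normalize(record["last_name"]),
        normalize(record["first_name"]),
        record["birth_date"],
    )


def link_deaths(patients, deaths, threshold=0.9):
    """Exact match first; otherwise fuzzy name match within the same birth date."""
    exact = {match_key(d): d for d in deaths}
    links = []
    for patient in patients:
        key = match_key(patient)
        if key in exact:
            links.append((patient["id"], exact[key]["death_date"], "exact"))
            continue
        best, best_score = None, 0.0
        for death in deaths:
            if death["birth_date"] != patient["birth_date"]:
                continue
            other = match_key(death)
            score = SequenceMatcher(
                None, " ".join(key[:2]), " ".join(other[:2])
            ).ratio()
            if score > best_score:
                best, best_score = death, score
        if best is not None and best_score >= threshold:
            links.append((patient["id"], best["death_date"], "fuzzy"))
    return links


# Made-up records, for illustration only.
patients = [
    {"id": 1, "last_name": "Lefèvre", "first_name": "Zoé", "birth_date": "1950-03-02"},
    {"id": 2, "last_name": "Martin", "first_name": "Jean-Piere", "birth_date": "1941-11-20"},
    {"id": 3, "last_name": "Durand", "first_name": "Paul", "birth_date": "1960-01-01"},
]
deaths = [
    {"last_name": "LEFEVRE", "first_name": "ZOE", "birth_date": "1950-03-02", "death_date": "2021-05-14"},
    {"last_name": "MARTIN", "first_name": "JEAN PIERRE", "birth_date": "1941-11-20", "death_date": "2019-08-03"},
]

for patient_id, death_date, method in link_deaths(patients, deaths):
    print(patient_id, death_date, method)
```

Requiring an identical date of birth before any fuzzy comparison keeps the fallback conservative; a production linkage would likely need a reviewed threshold and manual checks of borderline pairs.
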

Linked data source 6

Pre linked

Is the data source described created by the linkage of other data sources?

Yes

Data source, other

PHARMA

Linkage strategy

Deterministic

Linkage variable

IEP: all data sources have this variable. The IEP uniquely identifies an admission to the APHM and is associated with the IPP, which is the patient's number. An IPP can have one or more IEPs.

Data management specifications that apply for the data source

Data source refresh

Monthly

Informed consent for use of data for research

Possibility of data validation

Can validity of the data in the data source be verified (e.g., access to original medical charts)?

Yes

Data source preservation

Are records preserved in the data source indefinitely?

Yes

Approval for publication

Is an approval needed for publishing the results of a study using the data source?

Yes

Data source last refresh

Common Data Model (CDM) mapping

CDM mapping

Has the data source been converted (ETL-ed) to a common data model?

Yes

CDM Mappings

Data source ETL CDM version

1

Data source ETL frequency

4.00 months

Data source ETL status

Completed