Cohort Identification and Feasibility

In the early trial design stage, it can be useful to determine how many subjects are available within Penn Medicine for a trial. This information can be included in a grant proposal or used when planning a recruitment strategy, and an external sponsor may require it during the site selection process. Below is a quick snapshot, with more detail, of some of the resources available to support this.

For information on access to PennOmics, Penn G&P, Clarity, or Cosmos, please contact the Penn Data Analytics Center (DAC).

For research requests in which identifiable data is needed, IRB approval must be supplied and access granted to the appropriate tools. For most researchers, Slicer Dicer or Reporting Workbench is the recommended tool for this activity. The alternative is to use the DAC, which will serve as a broker for the identifiable data.

Slicer Dicer access for research users will be coming in August 2020. Permissible use cases for the system are detailed below.

Slicer Dicer Use Cases in Research

Preparatory to Research:

The preparatory to research provision permits covered entities to use or disclose protected health information for purposes that are in preparation for research, such as to aid study recruitment or to determine whether there is a viable cohort of patients at a site. The provision allows such a researcher to identify prospective research participants for purposes of seeking their authorization to use or disclose their protected health information as part of a research study. However, it does not allow a researcher outside the covered entity to identify patients who may be eligible for a study. That must be done under a full HIPAA waiver, with appropriate agreements and IRB approval in place. It also does not allow the Penn researcher to actually reach out to those patients to recruit or enroll them in a trial without full IRB approval.

Slicer Dicer Use Cases:

  • Feasibility/Cohort Identification: A researcher wants to determine whether there is a large enough pool of available patients to enroll in a research study, or needs to provide a sponsor or funder with an estimate of the number of evaluable patients he or she would be able to enroll. Slicer Dicer can be used as a self-service tool that searches all PennChart data to provide counts of eligible patients.
    • Anyone with PennChart access who has taken Slicer Dicer online training can perform this function.
    • Under this use case a researcher may not:
      • Access identifiers
      • Access HIV data, substance abuse information, or behavioral health encounter information; other record-level data may be accessed
  • Preparatory to Research: A researcher can access the PHI of potential research patients to prepare a recruitment strategy but may not actively reach out to patients until the study has been approved.
    • Under this use case a researcher may not:
      • Access name or SSN; contact information can be viewed
      • Access HIV data, substance abuse information, or behavioral health encounter information; other record-level data may be accessed
    • The query can be saved and rerun by the team so that patient contact information can be obtained once IRB approval is in place
  • IRB Approved Protocol Use Cases: In all of these cases, Slicer Dicer may be used to conduct the research or for the purposes of research operations. A valid IRB number must be entered before accessing the data.
    • Public health research
    • Quality improvement research
    • Retrospective review or biospecimen studies with necessary identifiers when consent and HIPAA authorization are waived
    • Prospective recruitment for a clinical trial
      • Patients can be contacted with IRB-approved messaging via an IRB-approved application (MPM, phone call, etc.)

For the two cases that require IRB approval, the following must be done:

  • The IRB application must describe the use of Slicer Dicer and the identifiers to be obtained
  • At least one member of the research team must have Slicer Dicer access and have completed training. Training and access will be confirmed by the IRB.
  • The IRB will confirm that the PHI meets the minimum necessary standard
  • The following data cannot be accessed through the application: HIV status, substance abuse information, behavioral health encounters
  • Data must be downloaded onto a Penn Medicine device or stored in a secure location

Research Cohort Exploration and Data Analytics Tools: TriNetX

How do I get access to TriNetX?

TriNetX is a local account access tool. Complete this form and submit the completed form to:

Where does TriNetX data come from?

TriNetX data comes mainly from healthcare organizations (HCOs) around the globe. These range from specialty clinics to large academic medical centers. HCOs start by providing data typically found in a structured format (e.g., Diagnoses, Procedures, Medications, Labs, and Vitals) in their electronic health record (EHR) system. From there, HCOs can opt into sharing additional data not typically found in their EHR, such as cancer registry data, genomics data, and data found in notes (extracted via natural language processing).

Availability of data can vary by institution or region. For example, nearly all US HCOs provide four or five of the main data types (Diagnoses, Procedures, Medications, Labs, and Vitals), but Procedures and Medications might not be as readily available to ingest from HCOs outside the US.

See this link for more details (requires a TriNetX login):

How does TriNetX map the data?

As part of onboarding an HCO, their data is mapped to a set of standard terminologies. Demographics data (e.g., race and ethnicity) are mapped to HL7 administrative standards. Diagnoses are represented by ICD-9-CM and ICD-10-CM. Procedures are represented by ICD-9-CM, ICD-10-PCS and CPT. Medications are mapped to RxNorm ingredients. Laboratory test results and vital signs are mapped to LOINC. Molecular genomics data conforms to HGNC for gene naming and HGVS for variant descriptions.
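The mapping described above can be pictured as a simple lookup from data domain to standard terminology. The following is a minimal sketch for illustration only: the dictionary, its keys, and the function name are hypothetical and are not TriNetX identifiers or part of any TriNetX API.

```python
from typing import Dict, List

# Hypothetical lookup built only from the terminologies named in the text.
# The keys are illustrative labels, not TriNetX field names.
TERMINOLOGY_BY_DOMAIN: Dict[str, List[str]] = {
    "demographics": ["HL7 administrative standards"],
    "diagnoses": ["ICD-9-CM", "ICD-10-CM"],
    "procedures": ["ICD-9-CM", "ICD-10-PCS", "CPT"],
    "medications": ["RxNorm"],
    "labs": ["LOINC"],
    "vitals": ["LOINC"],
    "genomics": ["HGNC", "HGVS"],
}


def terminologies_for(domain: str) -> List[str]:
    """Return the standard terminologies for a data domain (case-insensitive)."""
    key = domain.strip().lower()
    if key not in TERMINOLOGY_BY_DOMAIN:
        raise KeyError(f"Unknown data domain: {domain!r}")
    return TERMINOLOGY_BY_DOMAIN[key]


if __name__ == "__main__":
    for name in ("Diagnoses", "Procedures", "Labs"):
        print(name, "->", ", ".join(terminologies_for(name)))
```

When building a query, a lookup like this can serve as a reminder of which code system to search in for each criterion.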

The TriNetX Master Terminology also includes lab roll-ups and derived facts. For example, to make common labs easier to find and use, LOINC codes for the most frequent labs are rolled up to a clinically significant level. One example is the lab TNX:LAB:9029 Sodium [Moles/volume] in Serum, Plasma or Blood, which corresponds to 2947-0 Sodium [Moles/volume] in Blood and 2951-2 Sodium [Moles/volume] in Serum or Plasma.
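To illustrate the roll-up idea, the sketch below groups lab results by roll-up code so that the two sodium LOINC codes are counted together. It uses only the sodium example given in the text. The table and function names are hypothetical, and how TriNetX implements roll-ups internally is not described here.

```python
from collections import Counter
from typing import Dict, Iterable

# Roll-up table containing only the sodium example from the text.
# A real roll-up table would cover many more labs.
ROLLUP_BY_LOINC: Dict[str, str] = {
    "2947-0": "TNX:LAB:9029",  # Sodium [Moles/volume] in Blood
    "2951-2": "TNX:LAB:9029",  # Sodium [Moles/volume] in Serum or Plasma
}


def roll_up(loinc_code: str) -> str:
    """Return the roll-up code for a LOINC code, or the code itself if none exists."""
    return ROLLUP_BY_LOINC.get(loinc_code, loinc_code)


def count_by_rollup(loinc_codes: Iterable[str]) -> Counter:
    """Count lab results after replacing each LOINC code with its roll-up code."""
    return Counter(roll_up(code) for code in loinc_codes)


if __name__ == "__main__":
    # Three sodium results recorded under two different LOINC codes.
    print(count_by_rollup(["2947-0", "2951-2", "2951-2"]))
```

Searching on the roll-up code finds patients regardless of which specimen-specific LOINC code their site recorded.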

Examples of derived and calculated facts include:

  • The Oncology Treatments hierarchy identifies patients who have received radiation, chemotherapy, targeted therapy, hormone therapy, and stem cell transplants.
  • Chemotherapy Lines of Treatment identifies patients who received anywhere from 1 to 5 lines of chemotherapy.
  • Glomerular Filtration Rate (GFR) is calculated from serum creatinine and other information according to the MDRD, CKD-EPI, and Schwartz formulas.
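As an illustration of how an estimated GFR can be derived from serum creatinine, the sketch below implements commonly published forms of the three formula families named above: the IDMS-traceable four-variable MDRD equation, the 2009 CKD-EPI creatinine equation, and the bedside Schwartz equation for children. Which versions and coefficients TriNetX actually uses is not stated here, so these choices and all function names are assumptions.

```python
def _check_positive(value: float, name: str) -> None:
    if value <= 0:
        raise ValueError(f"{name} must be positive")


def egfr_mdrd(scr_mg_dl: float, age_years: float, female: bool, black: bool = False) -> float:
    """Four-variable MDRD estimate (IDMS-traceable), in mL/min/1.73 m^2."""
    _check_positive(scr_mg_dl, "scr_mg_dl")
    _check_positive(age_years, "age_years")
    egfr = 175.0 * scr_mg_dl ** -1.154 * age_years ** -0.203
    if female:
        egfr *= 0.742
    if black:
        egfr *= 1.212
    return egfr


def egfr_ckd_epi_2009(scr_mg_dl: float, age_years: float, female: bool, black: bool = False) -> float:
    """CKD-EPI 2009 creatinine estimate, in mL/min/1.73 m^2."""
    _check_positive(scr_mg_dl, "scr_mg_dl")
    _check_positive(age_years, "age_years")
    kappa = 0.7 if female else 0.9
    alpha = -0.329 if female else -0.411
    ratio = scr_mg_dl / kappa
    egfr = 141.0 * min(ratio, 1.0) ** alpha * max(ratio, 1.0) ** -1.209 * 0.993 ** age_years
    if female:
        egfr *= 1.018
    if black:
        egfr *= 1.159
    return egfr


def egfr_schwartz_bedside(scr_mg_dl: float, height_cm: float) -> float:
    """Bedside Schwartz estimate for children, in mL/min/1.73 m^2."""
    _check_positive(scr_mg_dl, "scr_mg_dl")
    _check_positive(height_cm, "height_cm")
    return 0.413 * height_cm / scr_mg_dl


if __name__ == "__main__":
    # Serum creatinine in mg/dL, age in years, height in cm.
    print(round(egfr_mdrd(1.0, 50, female=False), 1))
    print(round(egfr_ckd_epi_2009(0.9, 50, female=False), 1))
    print(round(egfr_schwartz_bedside(0.5, 120), 1))
```

Because the formulas differ in their inputs and coefficients, the same patient can receive different estimates depending on which one is applied, so it is worth checking which derived GFR fact a query criterion refers to.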

Are the dates in TriNetX shifted?