Penn Medicine Academic Computing Services


IS Leadership Insights from Brian Wells

Leadership Insights

There is growing interest at Penn Medicine in collecting discrete patient outcomes data to add to our existing clinical and research data warehouses. Because Penn Medicine focuses on treating a large volume of high-acuity patients who need advanced diagnostic, therapeutic, and procedural services (cancer, cardiology, neurology), one of the most desirable outcomes to track is survival. But measuring survival rates is a challenge at Penn Medicine's city hospitals, because a large share of patients (approximately 80% of non-emergent patients) are referred by non-Penn providers and return to the care of those external providers after their treatment at Penn. This makes it very difficult for Penn Medicine to determine patient survivorship.

To begin to address this challenge, the Data Analytics Center has subscribed to the Social Security Administration's (SSA) Death Master File (DMF) service. The file's records on 89 million deceased individuals have been carefully matched to the 5 million patients in Penn Data Store. We will soon be updating the deceased status of nearly 400,000 patients in our PennChart electronic medical record to reflect this valuable information. This status will also flow into Penn Data Store and PennOmics. Our performance in identifying valid patient cohorts for research studies, clinical trials, and clinical registries will improve significantly. We will also receive a monthly DMF update from the SSA, which will keep our survivorship data accurate.

While this effort will address a large gap in our survivorship knowledge, some patients do not possess a Social Security Number or decline to provide it to Penn Medicine. About 1.3 million patients in Penn Data Store do not have a value in the Social Security Number field. To address this situation, we will pursue a match of this population against the CDC's National Death Index. This matching process is much more complex because there is no identifying number: we must send patient information to the CDC, which manually matches it, for a fee, against its records of state-provided death certificate data.
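As a rough illustration of the two populations described above, the sketch below splits a patient list by exact Social Security Number match against a death file. It is not the Data Analytics Center's actual matching process; the record layout, field names, and function names are hypothetical, the sample numbers are deliberately invalid placeholders, and a production match would likely also compare fields such as name and date of birth.

```python
import re

def normalize_ssn(raw):
    """Return a 9-digit SSN string, or None if the value is missing or malformed."""
    if not raw:
        return None
    digits = re.sub(r"\D", "", raw)
    return digits if len(digits) == 9 else None

def split_by_death_match(patients, death_records):
    """Split patients into (matched, unmatched, no_ssn) using exact SSN equality."""
    # Index the death file by normalized SSN for constant-time lookups.
    deaths_by_ssn = {}
    for rec in death_records:
        ssn = normalize_ssn(rec.get("ssn"))
        if ssn:
            deaths_by_ssn[ssn] = rec

    matched, unmatched, no_ssn = [], [], []
    for patient in patients:
        ssn = normalize_ssn(patient.get("ssn"))
        if ssn is None:
            # No usable SSN: these patients need a different matching route.
            no_ssn.append(patient)
        elif ssn in deaths_by_ssn:
            death = deaths_by_ssn[ssn]
            matched.append({**patient, "deceased": True,
                            "death_date": death["death_date"]})
        else:
            unmatched.append(patient)
    return matched, unmatched, no_ssn

# Hypothetical sample data (placeholder SSNs that are never issued).
patients = [
    {"id": "P1", "ssn": "000-00-0001"},
    {"id": "P2", "ssn": "000000002"},
    {"id": "P3", "ssn": None},
]
death_records = [{"ssn": "000000001", "death_date": "2015-06-01"}]

matched, unmatched, no_ssn = split_by_death_match(patients, death_records)
```

In this sketch, the `no_ssn` list corresponds to the population that would need a name-and-demographics match such as the one planned against the National Death Index.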

We are pursuing this effort to improve the quality of data in PennChart and Penn Data Store. This improvement will benefit many downstream uses of this data. Some examples include:

  • Research feasibility studies, undertaken by individual investigators at Penn or by external sponsors, that need to determine how many living patients meet inclusion and exclusion criteria
  • Recruiting patients into active clinical trials
  • Surveys sent to patients after their care at Penn Medicine
  • Fundraising activities

We have proven the value of high quality data aggregation with our data assets: Penn Data Store, PennOmics and PennSeek. We continue to strive to add value to those assets. This project is just one example of our ongoing efforts.


HPC Statistics (March 2016)
  • CPU Hours: 1,572,835
  • Disk (TB): 1,448
  • Archive (TB): 413
  • Total Number of Users: 320
LIMS Statistics
  • Total Number of Users: 48
  • Total Samples: 353,907
ACC Velos Statistics
  • Total Studies: 2,780
  • Total Subjects: ~75,000
  • Total Active Accounts: 300
Customer Service Group (CSG) Tickets
  • Total Tickets: 1,600

Technology Initiatives


Faculty Onboarding Application (Request to Recruit)

In November 2015, PMACS began development of the new Faculty Onboarding application, an innovative solution that owes its ideation and design to the Penn Medicine Center for Health Care Innovation. Phase 1 implementation of the Request to Recruit (RTR) consists of a financial approval process for the hiring of a candidate for a faculty position at PSOM. Later phases will integrate processes for provider credentialing and the faculty appointment and promotion process. This new shared database tool will replace the current paper-based process allowing for a candidate's information to be entered and approved online. Dashboards will show the location of candidates in process, global views of all candidates, and delays and attention alerts for the many offices involved, including department and division staff, Faculty Affairs, and UPHS finance including CPUP and hospital administration.

The team is currently working toward making the tool available in April, when we will enable select users to pilot the ability to capture, validate, and output all data, including: candidate and appointment information, distribution of effort, academic plan, work location, space and equipment needs, draft offer letter and comp statement, salary and salary sources, and uploads such as CV, CHOP/VA support, CPUP member agreement, etc. The final milestone is planned for August 2016 and will include the online routing, dashboards, and approval.


PennChart Research

The successful completion of the PennChart 2015 upgrade in February provides users with significant improvements in reporting and managing patient research information within PennChart. Noteworthy new functionality includes a home dashboard providing quick access to various activities and new reporting functionality to help research staff update patient enrollment statuses and upcoming appointments. This upgrade also provides the foundation for the future integration with the CTMS and the additional PennChart applications being implemented in October, 2016 and March, 2017. The PennChart Phase 2 Implementation (which will be going live starting this fall) will include substantial integration with the current PennChart Research application and provide significant improvement for research workflows. The new integrated platform will allow more streamlined enrollment and patient management within PennChart as well as streamline the billing processes related to research activities.

The go-lives are scheduled for October 2016 at Pennsylvania Hospital (PAH) and Chester County Hospital (CCH) and then March 2017 at Hospital of the University of Pennsylvania (HUP), Penn Presbyterian Medical Center (PPMC) and Homecare / Hospice. The project team has been focused on building, testing, and preparing for the education curriculum for users. Education for users will be a combination of online video tutorials and in-classroom training, along with reference materials such as tip sheets and quick start guides. Look for more announcements in the coming months around education and activations. More information on the PennChart implementation is available here:

PSOM Biospecimen Annotation Standard for Medical Research

Under the leadership of Dr. Katherine Nathanson and Dr. Michael Feldman, a PSOM biospecimen annotation standard was developed and released in March 2016. The goal of the standard is to promote consistency across the institution in terms of specimen annotation and sharing of biospecimen data across medical research projects at Penn Medicine and with other institutions.

The standard was developed in a series of workshops attended by several in-house biospecimen lab managers, bioinformaticians, and Drs. Nathanson and Feldman. It borrows heavily from the National Cancer Institute (NCI) caTissue package, the NCI CDEBrowser data, and established data annotations used by pre-existing repositories at Penn. All permissible values (for example, body site locations) are also mapped to ontologies to allow for flexible data analytics.

For information on acquiring a PSOM sponsored biospecimen database to manage your research samples, contact Jason Hughes, Director of Enterprise Research Applications & High-Performance Computing.

Penn Medicine Clinical Trials and Document Management Systems

Penn Medicine has embarked upon a strategic set of projects that will serve to align the clinical and research missions of the organization. These projects serve to aggregate, normalize, share, and analyze clinical and research data that were previously dispersed and dissimilar. Initiatives include a number of projects and systems, two of which are an enterprise clinical trials management system and an enterprise document management system. The ultimate goal of this program is enhanced research and clinical outcomes using improved processes and systems, and the transformation of data into enhanced value.

A clinical trials management system (CTMS) is a software system designed to help principal investigators, study coordinators, and other faculty and staff track and report on nearly every aspect of a clinical trial. Paper-based processes are replaced with electronic processes, and data is aggregated for use in the CTMS and can potentially be exported or interfaced to external systems such as data warehouses and the electronic medical record.

Penn Medicine is extending an existing CTMS, in use for over seven years in the Abramson Cancer Center, called Velos eResearch. The benefits of this centralized Clinical Trials Management System are improved regulatory compliance, study data aggregation, visibility into study status and study progress, and financial oversight. Expansion to encompass the entire research enterprise is underway, and adoption of the system is currently planned to begin in early 2017. Once fully adopted, the system will house an estimated 2,500 clinical trials. The CTMS will have interfaces to the PennChart electronic medical record, such that study, patient, and demographic data will remain current and accurate in both systems.

A document management system (DMS) is a key component of the clinical trials management process. Currently, document management in the clinical trials research space is manual and relies both on electronic and paper documentation. To improve compliance, accuracy, transparency and provide economies of scale, the organization has selected a document management system called Veeva to serve as a process and document aggregation point for clinical research studies. The system will manage the electronic trial master file (eTMF) in total, which is a large collection of documents generated in the course of a clinical trial. The DMS is cloud-based, and is one of the first enterprise cloud software initiatives the school has undertaken. The deployment of this system will begin with a small number of early adopters in June of 2016, and an enterprise phased adoption will occur over 12-18 months to encompass a majority of studies.

The CTMS and DMS will provide valuable tools for managing clinical trials processes, as well as useful reports and insights for Penn Medicine's faculty and staff. If you have questions about these projects, please contact Jason Hughes at 215-573-7079.


PMACS storage options are located within an integrated PSOM/PMACS private network also encompassing all PMACS-supported desktops, servers, users, and lab instrumentation services.

The primary goals of the PMACS enterprise storage systems are to eliminate redundant IT infrastructure, continuously maintain a secure and compliant IT environment, reduce individual risk related to compliance and security, and attain better integration with Penn Medicine clinical & research workflows.

Features of all PMACS enterprise storage environments include file services configured within firewall-protected, secured private networks. File shares have tight logical and physical controls that limit access to approved users via a formal request process, which allows protected health information (PHI) to be stored. In addition, all data storage is housed within physically secure facilities and is protected from unauthorized physical access and direct logical access.

The PMACS standard storage models available for research use within Penn Medicine are listed below. PMACS can also custom fit a storage model for particular use-cases that may not fit into the standards addressed here. All these storage services are available to University staff and faculty, and can be configured for use by UPHS users as well, should that need arise.

  1. Commodity Storage, [$35.00/TB/month]
  2. High Performance Computing (HPC) Storage, [$55.00/TB/month]
  3. Archive Storage, [$15.00/TB/month] (previously known as HPC Archiving)
  4. Custom, pricing based on use-case and funding available
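For budgeting, the listed rates can be turned into a simple estimate. The sketch below assumes charges scale linearly with terabytes and months, with no minimums, tiers, or discounts; the function and dictionary names are illustrative and not part of any PMACS tool.

```python
from decimal import Decimal

# Listed rates in dollars per terabyte per month.
RATES_PER_TB_MONTH = {
    "commodity": Decimal("35.00"),
    "hpc": Decimal("55.00"),
    "archive": Decimal("15.00"),
}

def storage_cost(tier, terabytes, months=1):
    """Estimate cost as rate x terabytes x months (assumed linear billing)."""
    rate = RATES_PER_TB_MONTH[tier]
    return rate * Decimal(str(terabytes)) * months

# Example: 10 TB of commodity storage held for 12 months.
yearly_commodity = storage_cost("commodity", 10, 12)

# Example: 2.5 TB of archive storage for one month.
monthly_archive = storage_cost("archive", 2.5)
```

Under these assumptions, 10 TB of commodity storage comes to $4,200 per year, while the same volume in archive storage would be $1,800.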

Commodity Storage is the primary Penn Medicine research enterprise storage. It can be attached as a block device, commonly used for structured data such as databases, or as a file system, commonly used for unstructured research data. Features of commodity storage include full disk redundancy and automatic failover, with no single point of failure when accessing data, and an identity management protocol that assigns ownership of data storage and places users in the specific groups that may access the data. All data is backed up on several schedules: daily incremental backups kept for 90 days, monthly full backups kept for 4 months, quarterly full backups kept for 1 year, and yearly full backups kept for 5 years. Additional backups can be created on request. Backups are stored in secure locations, separate from the servers and desktops that use this commodity storage environment.
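To make the retention schedule concrete, the sketch below computes the date on which a backup of each type would age out. The retention periods come from the paragraph above; the function names, and the choice to treat "1 year" and "5 years" as 12 and 60 calendar months, are assumptions made for illustration.

```python
import calendar
from datetime import date, timedelta

# Retention per backup type: (unit, amount), taken from the schedule above.
RETENTION = {
    "daily": ("days", 90),
    "monthly": ("months", 4),
    "quarterly": ("months", 12),   # 1 year, assumed to mean 12 calendar months
    "yearly": ("months", 60),      # 5 years, assumed to mean 60 calendar months
}

def add_months(d, months):
    """Shift a date forward by whole months, clamping to the last day of the month."""
    total = d.month - 1 + months
    year = d.year + total // 12
    month = total % 12 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)

def expires_on(kind, taken):
    """Return the date on which a backup of the given kind leaves retention."""
    unit, amount = RETENTION[kind]
    if unit == "days":
        return taken + timedelta(days=amount)
    return add_months(taken, amount)

# Example: a daily incremental taken on 1 March 2016.
daily_expiry = expires_on("daily", date(2016, 3, 1))
```

A practical consequence of the schedule is that a file deleted more than 90 days ago can only be recovered from a monthly, quarterly, or yearly full backup, if one captured it.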

HPC Storage is associated directly with the PSOM HPC compute nodes. The compute nodes are attached to 1.8 Petabytes of IBM Storwize V7000 disk storage, housed in two separate performance tiers. No backup is performed on the HPC storage. The storage is presented to the compute nodes via a ten-node IBM Scale Out Network Attached Storage (SONAS) system leveraging the IBM General Parallel File System (GPFS).

HPC Archiving is a tape-based system that provides data mirroring for fault tolerance and access via a simple Mac, Windows, or Linux drive mapping. This long-term storage is priced on a fee-for-use model, so clients can purchase only the storage they need, for as long as they need it. The long-term active archive is held in a SpectraLogic T950 tape library housing 1,910 LTO-5 tapes, 290 LTO-6 tapes, and 12 LTO-6 drives, with a total raw capacity of 3.6 Petabytes. Because archived data is mirrored to protect against a tape failure, the usable archive capacity is 1.8 Petabytes. While slower than disk-based storage, this tape archive has expansion capabilities into the thousands of terabytes. The system allows users to store data in a cost-effective manner and have that data easily accessible days, months, or years in the future.
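The capacity figures above can be checked with a short calculation. The sketch below assumes the standard native (uncompressed) capacities of 1.5 TB per LTO-5 tape and 2.5 TB per LTO-6 tape, which are not stated in the article, and assumes that mirroring keeps two copies of all data, halving usable capacity. Petabytes are taken as 1,000 TB.

```python
# Native (uncompressed) capacity per tape, in terabytes.
# These per-tape figures are assumptions from the LTO generation specs.
LTO5_TB = 1.5
LTO6_TB = 2.5

# Tape counts from the article.
LTO5_TAPES = 1910
LTO6_TAPES = 290

# Assumed: mirroring keeps two copies of every file.
COPIES = 2

raw_tb = LTO5_TAPES * LTO5_TB + LTO6_TAPES * LTO6_TB
usable_tb = raw_tb / COPIES

raw_pb = round(raw_tb / 1000, 1)
usable_pb = round(usable_tb / 1000, 1)

print(f"raw: {raw_tb:.0f} TB (~{raw_pb} PB), usable: {usable_tb:.0f} TB (~{usable_pb} PB)")
```

Under these assumptions the library holds about 3,590 TB raw and 1,795 TB usable, which agrees with the 3.6 and 1.8 Petabyte figures quoted above.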

Users can copy data from existing Penn HPC directories or from other network and host locations to this archive system. The process of moving files to the archive is very similar to what many labs and users do when backing up their data to DVDs, external hard drives, or USB keys. Data retrieval is equally easy and requires no intervention from PMACS, but it does take somewhat longer than retrieving data from a hard drive or USB key, since the data is read directly from the tape system.

Also see the following past newsletters for additional details: Winter 2015 (HPC Archive System), Summer 2015 (New data backup), and Spring 2014 (Benefits of using PMACS services).

User Tip

Phishing is a form of social engineering that attempts to 'con' users into giving up sensitive information, for example personally identifiable or financial information.

Communications that seem to be from popular social web sites, financial institutions, well-known companies, IT support staff, etc. are used to lure people into giving up information which could then be used to impersonate the individual for financial gain or some other reason.

Phishing is typically carried out through e-mail spoofing or instant messaging and it characteristically directs users to enter information at a fake website that looks very much like the legitimate website.

Some quick tips to stay safe:

  • Be suspicious of attachments and unexpected e-mail messages.
  • Never reveal personal or financial information in e-mail, and never respond to e-mail requests for this information.
    • Legitimate organizations will never ask for this type of information in an e-mail.
  • Pay attention to the makeup of the message. It may look authentic (logos, URLs, etc.), but there are typically mistakes, for example in grammar.
  • Never click on a link in a suspicious message; it could be used to spread malicious software. The domain name may appear to be the real domain name but may have been altered. If you have reason to believe that the message is real, call the organization or open a web browser and type in the web address yourself, but don't click on any links in the message.
  • Ensure that your antivirus is up to date and running in real time scan mode.
  • Take note if the message contains threats or a sense of urgency.
    • Cybercriminals often use threats to get what they want. The threat could make you let your guard down and cause you to respond quickly without thinking.
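The tip about altered domain names can be illustrated with a small check. The sketch below tests whether a link's actual host is a given trusted domain or one of its subdomains; the function names and the example domains are hypothetical, and a check like this is only one signal, not a complete phishing defense.

```python
from urllib.parse import urlsplit

def host_of(url):
    """Return the lowercase host name of a URL, or an empty string if none."""
    return (urlsplit(url).hostname or "").lower()

def belongs_to(url, trusted_domain):
    """True if the URL's host is the trusted domain or one of its subdomains."""
    host = host_of(url)
    trusted = trusted_domain.lower()
    return host == trusted or host.endswith("." + trusted)

# A genuine subdomain of the trusted domain.
genuine = belongs_to("https://www.example.com/login", "example.com")

# The trusted name used as a prefix of an unrelated domain.
prefix_trick = belongs_to("https://example.com.evil.test/login", "example.com")

# A look-alike spelling (digit 1 in place of the letter l).
lookalike = belongs_to("https://examp1e.com/login", "example.com")

# The trusted name placed before "@", so the real host is what follows it.
userinfo_trick = belongs_to("https://example.com@evil.test/login", "example.com")
```

Only the first link passes. The other three show common ways a link can display a familiar name while pointing somewhere else, which is why typing the address yourself is the safer habit.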

Remember, you can usually trust your instincts: if you think a message is suspicious, it probably is, so just delete it. If you continue to have issues with phishing, please contact the PMACS Service Desk or Information Security for assistance.

Remember, security won't work without you. YOU are the key to security at PSOM.

Please refer all information security comments or concerns to David Wargo:

For more security related information, visit the PMACS Information Security web page:

© The Trustees of the University of Pennsylvania || Site Design: PMACS Web Team