Penn Medicine Academic Computing Services


IS Leadership Insights from Brian Wells

Leadership Insights

When "big data" became a popular buzzword, many people in the information systems industry used a test to determine the big-ness of data. This test scored the data utilizing three v's — volume, velocity and variability. Volume is a measure of the size of the data using statistics such as record counts and total disk storage required. Velocity is a measure of the speed or frequency at which the data is generated or received using statistics such as characters or records per minute or second. And lastly, variability is a measure of the degree of uniqueness in each data record or data field.

While the social media industry (Twitter, Facebook and others) clearly has data that scores high on each v (Twitter is now managing over 500 million tweets a day), healthcare traditionally scored highest only on variability. In fact, we may have all industries beat when it comes to variability because of the uniqueness of each patient we treat and the breadth of structured and unstructured data we capture.

On the volume front, Penn Medicine is managing big data, with several petabytes (a petabyte is one million gigabytes) in our data centers. Our high performance computing cluster alone can hold two petabytes of genomic sequencing data. This equates to over 40 million four-drawer filing cabinets filled with text. Relatively speaking, however, our volumes were traditionally not that high, and the velocity of new data generation did not come close to that of other big data industries. With the broad implementation of electronic medical record (EMR) systems and the continued integration of medical devices into EMRs, healthcare is rapidly catching up on the velocity front.
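The storage figures above can be sanity-checked with a few lines of arithmetic. The sketch below is illustrative only: it assumes decimal units (one petabyte equals one million gigabytes) and takes the 40 million filing cabinet figure from the text. The capacity per cabinet is derived from those two numbers, not quoted from any source, and the function names are made up for this example.

```python
GB_PER_PB = 1_000_000  # decimal units: one petabyte is one million gigabytes
MB_PER_GB = 1_000


def petabytes_to_gigabytes(petabytes: float) -> float:
    """Convert petabytes to gigabytes using decimal units."""
    return petabytes * GB_PER_PB


def implied_megabytes_per_cabinet(petabytes: float, cabinets: int) -> float:
    """Megabytes of text each cabinet must hold for the comparison to balance."""
    total_megabytes = petabytes * GB_PER_PB * MB_PER_GB
    return total_megabytes / cabinets


if __name__ == "__main__":
    hpc_capacity_pb = 2      # from the text: the HPC cluster can hold two petabytes
    cabinets = 40_000_000    # from the text: "over 40 million" filing cabinets
    print(petabytes_to_gigabytes(hpc_capacity_pb))
    print(implied_megabytes_per_cabinet(hpc_capacity_pb, cabinets))
```

Because the text says "over 40 million" cabinets, the derived per-cabinet figure is an upper bound on what the comparison assumes.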

At Penn the velocity is the greatest in our ICUs and surgical suites where many electronic monitors are taking readings continuously and sending out billions of data observations a year. Even in normal care settings, more and more clinical monitoring devices are being integrated with our EMR.

And just on the horizon is patient-generated data. The Apple Watch, for example, sends a heart rate measurement to the wearer's iPhone every 10 minutes. If clinically valuable, this data can flow into our EMR today. In the past we have not saved this real-time monitor data because of the cost of storage and the lack of powerful tools to analyze it. All of that has changed.
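To give a rough sense of the velocity this implies, the sketch below counts how many readings a fixed reporting interval produces. Only the 10-minute interval comes from the text. The number of wearers is a hypothetical input chosen for illustration, not a Penn Medicine figure, and the function names are made up for this example.

```python
MINUTES_PER_DAY = 24 * 60


def readings_per_day(interval_minutes: int) -> int:
    """Number of readings one device produces per day at a fixed interval."""
    return MINUTES_PER_DAY // interval_minutes


def readings_per_year(interval_minutes: int, wearers: int = 1, days: int = 365) -> int:
    """Total readings per year across a given number of wearers."""
    return readings_per_day(interval_minutes) * wearers * days


if __name__ == "__main__":
    interval = 10                # from the text: one heart rate reading every 10 minutes
    hypothetical_wearers = 1000  # assumption for illustration only
    print(readings_per_day(interval))
    print(readings_per_year(interval))
    print(readings_per_year(interval, wearers=hypothetical_wearers))
```

Even at this modest sampling rate, the yearly total scales linearly with the number of patients, which is why storage cost and analysis tooling were the limiting factors.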

The Data Analytics Center team is assembling a big data cluster to retain the terabytes of data emanating from patient monitors and the unstructured text content created by our clinicians. This data will be utilized by Penn Medicine's Data Science team to create new predictive analytics capabilities. It will enable our natural language processing and text mining efforts. The monitor data can also be easily de-identified and shared with biomedical informatics researchers, Penn students or medical device innovators. Penn Medicine's investment in electronic medical records continues to support all three of our missions.

Metrics - August

HPC Statistics
  • CPU Hours — 2,450,943
  • Disk (TB) — 1,130
  • Archive (TB) — 265
  • Total Number of Users — 270
LIMS Statistics
  • Total Number of Users — 41
  • Total Samples — 321,724
ACC Velos Statistics
  • Total Studies — 2,668
  • Total Subjects ~ 70,000
  • Total Active Accounts — 475
CSG (Customer Support Group) Tickets
  • Total Tickets — 1,632

Technology Initiatives


Move to a New Home

In response to client demand, PMACS has grown to a ninety-two-person team since its inception over four years ago. The demand for new services has grown in all service areas: software development, information technology (IT), and enterprise research applications (ERA).

To better accommodate our growing team, PMACS recently relocated our IT and ERA staff, as well as several others, to new office space in the C and D towers of the Richards complex. Not only does this move bring these teams closer to the main PMACS office in Anatomy Chemistry, it also puts them in the midst of the several Perelman School informatics groups that recently relocated to Richards as well. The proximity of our staff to many of our primary clients should facilitate greater interaction and collaboration on research computing.

New Analytics Initiative

Until recently, use of analytics in medical schools has primarily been in the research setting, focused on making sense of research data. However, with the increasing availability of business intelligence (BI) tools, medical schools are now turning their attention to using analytics to better manage their administrative and educational missions as well.

Over the next several months, PMACS will be assisting Perelman School administration with identifying opportunities to better manage the school via analytics. Once the administrative project team (led by the Division of Finance and Office of Decision Support and Analysis) identifies these opportunities, PMACS will work with other members of Information Services to identify the appropriate BI tools. Over the last several years, Information Services has completed similar efforts for UPHS. As a result, we have deep experience and a substantial BI infrastructure to support increased use of analytics.


New Clinical Research Systems

As Penn Medicine advances its strategic initiative to align research and clinical domains more closely, the need to collect, codify, organize, store, retrieve, analyze, and share both clinical and research data becomes increasingly imperative. Compliance and privacy concerns also drive the need for more transparent data management and an understanding of how data are used, when, where, and for what purpose. The PMACS Enterprise Research Applications (ERA) team is leading a number of projects under the purview of the Office of Clinical Research (OCR) to realize the goals of data management and aggregation. Some of these projects are:

  • Clinical Trial Management System (CTMS) Upgrade and Expansion: The Abramson Cancer Center (ACC) has used the Velos CTMS to manage ACC studies for approximately seven years. PMACS is working with the vendor to upgrade the current ACC Velos instance to a newer version and to conduct some data cleanup. This effort paves the way to expand the use of Velos across a majority of clinical trials in many departments. Key goals of this effort are greater visibility into clinical trials operations at PSOM, improved compliance with federal regulations, and the ability to interface clinical trial data with data warehouses such as PennOmics.
  • Document Management System: Clinical trials require a Trial Master File (TMF), which is a collection of documents related to the trial. This requirement is an important part of federal compliance regulations, as well as compliance with pharmaceutical and biomedical device vendors with whom PSOM collaborates. A document management system (DMS) helps to maintain the electronic version of the TMF (eTMF) to aid in the creation, tracking, and archiving of these important documents. As with the CTMS, the DMS can prove an invaluable tool in producing evidence of compliance with federal regulations.
  • Epic Interface: Epic is the vendor of Penn Medicine's main electronic medical records system. A multi-year project is well underway to transition the inpatient medical record to Epic, in addition to key clinical and business areas such as ambulatory, ED, radiology, and transplant. In conjunction with this project, PMACS ERA is working with our CTMS vendor and Epic to enable data interfaces between these two systems. This interfacing takes advantage of a relatively new module in the Epic system, called Epic Research. This module is specifically designed to enable the Epic EMR system to aid in the enrollment of research subjects, as well as help clinical research faculty and staff gain valuable insight into the clinical activities surrounding research participants.

These projects will help PSOM manage its clinical research more effectively, gain valuable insight into clinical research activities with a minimal data gathering effort, aid in compliance with federal regulations, and provide both structured and unstructured data from which additional analysis may be possible.

REDCap — Application Update

Over the last year, there have been several newsletter articles detailing the installation of new REDCap hardware infrastructure and version upgrades.

In September, PMACS and CCEB's Clinical Research Computing Unit (CRCU) completed this "upgrade project" and Penn Medicine REDCap will now be maintained as an ongoing "program".

The current STD version of REDCap will be maintained and procedurally upgraded to keep the application current as it evolves. PMACS and the CCEB/CRCU support teams will implement a new testing and quality assurance process to ensure the integrity of each new REDCap release before it is applied to the production environment. As always, these teams will continue to communicate their activities to the REDCap community before any new releases are implemented.


The current REDCap @ UPENN installation is available at: REDCap

Assistance with Grant Proposals and Data Management Plans

PMACS provides the expertise to design, implement, and manage the Information Technology (IT) required by the Perelman School of Medicine's research, academic, and administrative activities. For the preparation of sponsored project proposals and data management plans, investigators should consider and involve PMACS in the initial planning stages.

PMACS's expertise and knowledge can help develop the project's data workflow, identify the IT technologies required for that workflow, and then provide accurate cost models to accomplish collection, management, and use of the project's data. PMACS can also contribute the narrative to accurately and concisely describe these activities for use within the project proposal.

Many times, the total cost of technology is not covered by the project. An institutional letter of support implies both a clear and complete projection of the costs the institution must bear and a commitment by the institution to support the project with resources not provided by the sponsor. PMACS can help develop the total costs to ensure that both the sponsor and the institution know what each will be providing, thus making a stronger proposal.

The Service Information Officers (SIOs), the primary liaisons between the School of Medicine and PMACS, can assist with all of the aforementioned proposal services. Our SIOs and their respective areas of coverage are:

Please contact any of our Service Information Officers for assistance.

User Tip

Using Social Media for Recruitment and Research

Social media can be an instrumental tool while conducting research and recruiting study subjects. However, appropriate measures must be taken to ensure that personally identifiable information (PII), protected health information (PHI), and other types of sensitive information are protected throughout the recruitment and research process.

Issues of privacy, confidentiality, informed consent and potential risks to subjects arise when researchers and subjects interact virtually. With a carefully considered management plan, social media accounts can make it easier to reach potential subjects and can widen the scope of potential data collection.

When appropriate and applicable for the study, the Penn IRB supports the use of social media in research and has issued a guidance document to serve as a resource for assisting with the development of a management plan. Please note that the Penn IRB requires the review of this comprehensive management plan before the initiation of research-specific utilization of social media.

The IRB guide covers the following topics:

The Use of Social Media as a Recruitment Activity —

  • Recruitment Ads:
    1. One-way ads that do not involve direct communication with potential subjects.
  • Interactive Recruitment:
    1. Recruiting subjects through two-way communication via researcher-initiated social media accounts.

The Use of Social Media as a Research Activity —

  • Social media may be used in research procedures for things like data collection or as a means of intervening with subjects for research purposes. When social media is used for research, the IRB requires that investigators carefully consider a plan for protections that will be used for each social media application to minimize privacy, confidentiality and safety risks to subjects.

The guide includes specific information about IRB submission requirements for any planned use of social media.

To view the complete IRB Guide go to:

If you have any questions about the IRB's policies on the use of social media in research, please contact David Heagerty or Jessica Jones.

© The Trustees of the University of Pennsylvania