Open-Sourcing Synthetic Clinical Data for Innovation

October 21, 2024

Particle Health Team

Introducing Particle Health's open-source repository of synthetic patients in CCDA format. This dataset includes five diverse, clinically relevant patient profiles ideal for demos, development, and testing—making clinical data more accessible for all.

Today, Particle Health is open-sourcing a GitHub repository with synthetic clinical data in CCDA format.

At Particle, we specialize in working with CCDA XML files to extract critical clinical data, empowering our customers and ultimately benefiting patients. While CCDA is an older format, it remains the primary standard for sharing clinical data across health information networks.

A common challenge when working with CCDA files is finding synthetic data that clearly indicates the patient’s medical conditions and shows where that information sits within the document. To address this, we collaborated with clinical experts to create five synthetic patients in the CCDA format. These datasets are designed to showcase a variety of clinical use cases and cover multiple disease areas. Because no real patient information is involved, synthetic CCDA data avoids the privacy concerns that come with real records, so organizations can innovate and develop healthcare solutions without violating HIPAA or other privacy regulations. Synthetic data also eliminates the need for anonymization, which can be costly and time-consuming when dealing with real data.
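To give a sense of what locating information within a CCDA document can look like, here is a minimal Python sketch that lists the sections of a document by LOINC code and title. It is an illustration under stated assumptions, not a description of how Particle's own extraction works: the `SAMPLE_CCDA` fragment is a hypothetical, heavily trimmed document rather than a file from the repository, and `list_sections` is a helper name made up for this example. The sketch relies only on the HL7 v3 namespace (`urn:hl7-org:v3`) that CCDA documents use and on Python's standard library.

```python
import xml.etree.ElementTree as ET

# CCDA documents are written in the HL7 v3 XML namespace.
NS = {"hl7": "urn:hl7-org:v3"}

# Hypothetical, heavily trimmed CCDA fragment for illustration only.
# It is NOT taken from the repository; real files are far larger.
SAMPLE_CCDA = """<ClinicalDocument xmlns="urn:hl7-org:v3">
  <component>
    <structuredBody>
      <component>
        <section>
          <code code="11450-4" codeSystem="2.16.840.1.113883.6.1"/>
          <title>Problems</title>
          <text>Type 2 diabetes mellitus</text>
        </section>
      </component>
      <component>
        <section>
          <code code="10160-0" codeSystem="2.16.840.1.113883.6.1"/>
          <title>Medications</title>
          <text>Metformin 500 mg</text>
        </section>
      </component>
    </structuredBody>
  </component>
</ClinicalDocument>
"""


def list_sections(xml_text):
    """Return a (LOINC code, title) pair for every section in the document."""
    root = ET.fromstring(xml_text)
    sections = []
    for section in root.iterfind(".//hl7:section", NS):
        code = section.find("hl7:code", NS)
        title = section.find("hl7:title", NS)
        sections.append((
            code.get("code") if code is not None else None,
            title.text if title is not None else None,
        ))
    return sections


if __name__ == "__main__":
    for loinc_code, title in list_sections(SAMPLE_CCDA):
        print(loinc_code, title)
```

For a file on disk, `ET.parse(path).getroot()` could replace `ET.fromstring`. The section list then serves as a rough map of where problems, medications, and other data appear in that patient's document.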

We initially developed this dataset for internal use during a hackathon a few months ago, and it has proven valuable across several areas:

Customer Use Cases: Identifying clinical use cases that our customers can solve for.
Demos: Showcasing our products with clinically relevant data, now available in our sandbox environment.
Feedback: Gathering input on new features, like AI-powered summarization.
Development: Supporting software testing and refinement.

Given how useful it has been for us, we believe it could be equally beneficial to others.

The repository contains C-CDA XML files for five synthetic patients, each representing different clinical conditions, along with AI-generated summaries of their medical histories. To make them easier to remember, the patients are named mnemonically:

Freda Quently - A patient with multiple recent urgent care visits
Artie Joinston - A patient with musculoskeletal issues
Glynda Sugarman - A patient managing diabetes
Hart Fallon - A patient with cardiac conditions
Tuma Nephro - An oncology patient with myeloma and chronic kidney disease (CKD)

Here is the link to the repository. The data is licensed under the Creative Commons Attribution-ShareAlike license.

About the author

Particle Health Team

Since 2018, we've powered the future of healthcare by connecting innovative provider organizations to patient data.
