Delivering on the Promise of Synthetic Data

Replica Synthesis Software

Replica Synthesis software ingests real data and builds data synthesis models to generate high-utility synthetic datasets.

Key features of Replica Synthesis include:

  • Synthetic Cohort Builder to query and integrate data from multiple data sources and generate synthetic variants of these cohorts.
  • Validation Server to re-run analytics code on the original data.
  • Synthesis report describing the data, the methodology, the synthesis results, the utility results, and any limitations. The report template can be customized.
  • Customizable synthesis parameters.
  • Can be deployed on the cloud or on-premises. Clients do not have to share their data to synthesize it.
  • Comprehensive REST API for integration with multiple and varied front ends.
  • Generates detailed data utility results.
  • The software is built for small and large datasets.
  • SDKs for data science and software engineering end users.
  • Flexible synthesis plan specifications to handle complex datasets.

Data Synthesis

The basic functionality of Replica Synthesis allows the user to define a cohort from the source data and return a synthetic version of that cohort. A report summarizing the synthesis and the utility assessment is sent directly to the user.
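To make the idea of a utility assessment concrete, here is a minimal sketch of one common check: comparing the distribution of a single column in the real cohort against the same column in the synthetic cohort. It is an illustration only and does not describe how Replica Synthesis computes its utility results. The function names (`define_cohort`, `marginal_distribution`, `total_variation_distance`) and the record layout (a list of dictionaries) are assumptions made for this example.

```python
from collections import Counter


def define_cohort(records, predicate):
    """Select the records that belong to a cohort (hypothetical helper)."""
    return [record for record in records if predicate(record)]


def marginal_distribution(records, column):
    """Return the relative frequency of each value in one column."""
    counts = Counter(record[column] for record in records)
    total = sum(counts.values())
    return {value: n / total for value, n in counts.items()}


def total_variation_distance(real, synthetic, column):
    """Compare one column across two datasets.

    0.0 means the two marginal distributions are identical,
    1.0 means they share no values at all.
    """
    p = marginal_distribution(real, column)
    q = marginal_distribution(synthetic, column)
    categories = set(p) | set(q)
    return 0.5 * sum(abs(p.get(c, 0.0) - q.get(c, 0.0)) for c in categories)


if __name__ == "__main__":
    # Made-up records, used only to exercise the functions.
    source = [
        {"age_group": "18-39", "tested_positive": "yes"},
        {"age_group": "18-39", "tested_positive": "no"},
        {"age_group": "40-64", "tested_positive": "no"},
        {"age_group": "65+", "tested_positive": "yes"},
    ]
    synthetic = [
        {"age_group": "18-39", "tested_positive": "no"},
        {"age_group": "18-39", "tested_positive": "no"},
        {"age_group": "40-64", "tested_positive": "yes"},
    ]
    cohort = define_cohort(source, lambda r: r["age_group"] != "65+")
    print(total_variation_distance(cohort, synthetic, "tested_positive"))
```

A single-column comparison like this says nothing about relationships between columns, so a fuller assessment would also compare joint distributions or the results of the analyses the data is meant to support.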

Complex workflows can be defined using the workflow designer tool. These can include cohort definition steps using an interactive tool or using scripts to perform more complex data transformations and selections on the data. Data linking and data pooling can be performed within the workflow definitions. Workflows can also be cascaded to modularize complex data processing requirements that include synthesis.

Data Simulators

Data simulators are stored generative models that can be used to produce an arbitrary number of new records. Each data simulator has its own metadata documentation, utility assessment, and license. They give data consumers an easy way to access realistic data on demand without having access to the real data.
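The sketch below illustrates the general idea of a stored generative model: a model is fitted once, serialized, and later reloaded to produce as many records as needed without touching the original data. It is a deliberately simple toy that samples each column independently from its observed frequencies, so it does not preserve relationships between columns, and it is not the modeling approach used by Replica Synthesis. All function names and the JSON layout are assumptions made for this example.

```python
import json
import random
from collections import Counter


def fit_simulator(records):
    """Learn per-column value frequencies from a list of dict records."""
    model = {}
    for column in sorted(records[0]):
        counts = Counter(record[column] for record in records)
        model[column] = {
            "values": list(counts.keys()),
            "weights": list(counts.values()),
        }
    return model


def dump_simulator(model):
    """Serialize the fitted model so it can be stored and shared."""
    return json.dumps(model)


def load_simulator(text):
    """Restore a model that was serialized with dump_simulator."""
    return json.loads(text)


def simulate(model, n, seed=None):
    """Generate n new records by sampling each column independently."""
    rng = random.Random(seed)
    return [
        {
            column: rng.choices(spec["values"], weights=spec["weights"])[0]
            for column, spec in model.items()
        }
        for _ in range(n)
    ]


if __name__ == "__main__":
    # Made-up records, used only to exercise the functions.
    real = [
        {"job": "admin", "subscribed": "no"},
        {"job": "admin", "subscribed": "yes"},
        {"job": "technician", "subscribed": "no"},
    ]
    stored = dump_simulator(fit_simulator(real))

    # A data consumer only needs the stored model, not the real records.
    simulator = load_simulator(stored)
    for row in simulate(simulator, 5, seed=42):
        print(row)
```

Because only the fitted model is stored, the number of records requested at generation time is independent of the size of the original dataset.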

Replica Synthesis now has a simulator exchange capability. There is a public exchange available to all users on the platform. Users can also create their own simulator exchanges by sharing simulators with other users. This provides a flexible mechanism for making simulators available within teams, within departments, or across whole enterprises.

The following are examples of the platform-wide simulators currently available to users of Replica Synthesis:

  • The Canadian COVID-19 test positive (case) dataset (open license).
  • The Canadian responses to a COVID-19 behavioral survey (open license).
  • The global responses to a COVID-19 behavioral survey (open license).
  • The US CDC COVID-19 dataset (open license).
  • A bank marketing dataset - term deposits (open license).

For more information on Replica Synthesis software, please email