Delivering on the Promise of Synthetic Data

Predictions and resolutions for synthetic data

Author: Dr. Khaled El Emam

The end of the calendar year is an opportunity to take stock of which expectations materialized, and to look ahead and consider what’s to come.

Over the course of 2021, we certainly watched with great interest as Forrester and Gartner laid out a series of expectations and predictions about the future of artificial intelligence (AI), and within their predictions, synthetic data generation (SDG) plays a big role.

In their AI 2.0 report, Forrester [1] noted SDG’s growth and identified it as one of five transformative, breakthrough advances in AI that organizations should be paying very close attention to. They said it is already producing concrete results, eliminating a number of barriers associated with obtaining, using and sharing data, for example by enabling more rapid access to data to build and validate AI and machine learning models. Forrester expects SDG to unlock a wide range of new capabilities and ultimately help organizations become more resilient.

Gartner made similar predictions about synthetic data, some of which we’ve highlighted before, including their #1 prediction that synthetic data will result in better privacy. Gartner anticipates that by 2024, 60% of the data used for the development of AI and analytics solutions will be synthetically generated and synthetic data will halve the volume of real data needed for machine learning. They also say that by 2025, synthetic data will reduce personal customer data collection, avoid 70% of privacy violation sanctions and reduce the risks of privacy breaches.

As a data synthesis company, it is exciting for us to see these predictions materialize, to witness the rapid adoption of SDG and to experience the growing number of use cases we are seeing, in healthcare and other areas where sharing personal information can have such tremendous benefits.

Thanks to the dedication, skill and hard work of the Replica Analytics team, as well as the excellent collaboration and support of our many clients, partners and colleagues, I am happy to say we are ready to take 2022 by storm and to continue to advance our mission to make the world’s health data universally and responsibly accessible for secondary analysis.

Our resolution for 2022 will be to ensure our solutions continue to deliver and build on this central promise.

Now, if you’re considering new year’s resolutions for 2022, I humbly suggest:

  • If you’re an organization facing challenges using and sharing data, reach out to us to see how we can help.
  • If your work involves personal information, have a look at our resources to bolster your knowledge of SDG as a modern privacy-enhancing technology (PET).
  • If you’re a regulator, consider our recommendations for regulating non-identifiable data and creating the conditions for the adoption of PETs.
  • If you’re a smart, curious, skilled data scientist or engineer who wants to advance health care, AI and privacy at the same time – three of the hottest topics in the world today – come join our growing team.

And with that, I wish you all the best for the holidays and a happy, healthy 2022!


[1] AI 2.0: Upgrade Your Enterprise With Five Next Generation AI Advances, Forrester Research, August 2, 2021.