We are lucky to live in the age of Big Data. With rapid technological advances, the cost of creating large datasets, such as those from genomic, transcriptomic or proteomic studies, continues to plummet. Life science companies now have the luxury of creating data relatively quickly and easily. However, these masses of data are only valuable if they can be used. They become even more valuable if they can be reused.
Data management isn’t just a question of value generation; it’s also a regulatory requirement for biotech and pharmaceutical companies. It’s a big challenge for life science organizations, many of which have amassed decades’ worth of data from many different sources, generated in a range of formats.
Moving Away from Data Silos and Data Lakes
Data management was a major topic at the 2019 SmartLabs & Laboratory Informatics Congress in London, a meeting of over 450 life science leaders. Many biotech and pharmaceutical companies face the challenge of ‘data silos’, where each lab or department stores and manages data separately. This approach can lead to data being unused or inaccessible, especially if original team members leave the organization.
Some life science organizations have replaced data silos with ‘data lakes’ – central repositories containing all data within an organization. A data lake may make data more accessible but it doesn’t necessarily make data more useable. For data to be useable, it must first be harmonized, standardized and labelled.
One approach to ‘silo breaking’ is the FAIR (Findable, Accessible, Interoperable, Reusable) data movement, which has been supported by the National Institutes of Health (NIH) in the US, the European Commission, the Pistoia Alliance and many other organizations.
FAIR data management aims to make it easy for humans and machines to find, access and reuse internal and external data.
Companies can make data findable by standardizing all data, generating rich metadata to describe datasets, and assigning unique permanent identifiers to all data and metadata. Another key step is to index all data within a central, searchable database or resource. Ensuring that machines can read all identifiers and data formats will make complex informatic analysis easier to perform.
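To make the idea of findability more concrete, here is a minimal Python sketch of a metadata record with a unique identifier and a small searchable index. Everything in it is an assumption for illustration: the `DatasetRecord` and `MetadataIndex` names, the metadata fields and the example datasets are hypothetical, and a random UUID merely stands in for a properly registered persistent identifier.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class DatasetRecord:
    """Metadata describing one dataset (all field names are hypothetical)."""
    title: str
    assay_type: str
    keywords: list[str]
    # Assumption: a random UUID stands in for a registered persistent identifier.
    identifier: str = field(default_factory=lambda: str(uuid.uuid4()))


class MetadataIndex:
    """A toy in-memory index mapping lowercase search terms to dataset identifiers."""

    def __init__(self) -> None:
        self._records: dict[str, DatasetRecord] = {}
        self._terms: dict[str, set[str]] = {}

    def add(self, record: DatasetRecord) -> str:
        """Store the record and index its assay type, keywords and title words."""
        self._records[record.identifier] = record
        for term in [record.assay_type, *record.keywords, *record.title.split()]:
            self._terms.setdefault(term.lower(), set()).add(record.identifier)
        return record.identifier

    def search(self, term: str) -> list[DatasetRecord]:
        """Return all records indexed under the term, sorted by title."""
        ids = self._terms.get(term.lower(), set())
        return sorted((self._records[i] for i in ids), key=lambda r: r.title)


index = MetadataIndex()
index.add(DatasetRecord("Liver biopsy RNA-seq", "transcriptomics", ["liver", "human"]))
index.add(DatasetRecord("Plasma proteome panel", "proteomics", ["plasma", "human"]))

# Search is case-insensitive, so "Human" matches both records.
human_titles = [record.title for record in index.search("Human")]
print(human_titles)
```

In a real deployment the index would live in a shared catalogue or database rather than in memory, but the principle is the same: machine-readable metadata plus a stable identifier lets both people and software locate a dataset.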
Companies can foster collaboration by making data accessible to different teams. For this to work, there must be clear guidelines about who can access the data, and how and when that data can be used. This may include authentication and authorization steps.
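One way to picture such guidelines is as an explicit, machine-checkable policy attached to each dataset. The Python sketch below is only an illustration under stated assumptions: the `AccessPolicy` fields, the team and purpose names, and the `can_access` helper are all hypothetical, and authentication is reduced to a simple flag that a real system would obtain from its identity provider.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class AccessPolicy:
    """Hypothetical policy stating who may use a dataset, for what, and from when."""
    allowed_teams: frozenset[str]
    allowed_purposes: frozenset[str]
    available_from: date


def can_access(policy: AccessPolicy, team: str, purpose: str,
               on_date: date, authenticated: bool) -> bool:
    """Authorize a request only if the user is authenticated and the
    team (who), purpose (how) and date (when) all satisfy the policy."""
    if not authenticated:
        return False
    return (
        team in policy.allowed_teams
        and purpose in policy.allowed_purposes
        and on_date >= policy.available_from
    )


policy = AccessPolicy(
    allowed_teams=frozenset({"oncology", "bioinformatics"}),
    allowed_purposes=frozenset({"internal-research"}),
    available_from=date(2020, 1, 1),
)

granted = can_access(policy, "oncology", "internal-research", date(2020, 6, 1), True)
too_early = can_access(policy, "oncology", "internal-research", date(2019, 6, 1), True)
print(granted, too_early)
```

Writing the rules down in this form keeps access decisions consistent and auditable, rather than dependent on whoever happens to manage a given folder.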
Interoperable data can be used by many different people and machines because it is standardized and labelled according to standards recognized by everyone in the organization. This includes using standardized vocabulary within data and metadata. That way, anyone within an organization could use a particular dataset.
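As a rough illustration of standardized vocabulary, the Python sketch below maps free-text labels onto a single preferred term before metadata is stored. The vocabulary table, the `organism` field and the function names are hypothetical; in practice an organization would more likely draw its terms from an established ontology than from a hand-written dictionary.

```python
# Hypothetical controlled vocabulary: free-text variants -> preferred term.
CONTROLLED_VOCABULARY = {
    "homo sapiens": "Homo sapiens",
    "human": "Homo sapiens",
    "h. sapiens": "Homo sapiens",
    "mus musculus": "Mus musculus",
    "mouse": "Mus musculus",
}


def standardize_term(raw: str) -> str:
    """Return the preferred term for a label, ignoring case and extra spaces."""
    key = " ".join(raw.lower().split())
    try:
        return CONTROLLED_VOCABULARY[key]
    except KeyError:
        # Reject unknown labels instead of silently storing them.
        raise ValueError(f"'{raw}' is not in the controlled vocabulary") from None


def standardize_metadata(metadata: dict[str, str],
                         fields: tuple[str, ...] = ("organism",)) -> dict[str, str]:
    """Return a copy of the metadata with the listed fields standardized."""
    cleaned = dict(metadata)
    for name in fields:
        if name in cleaned:
            cleaned[name] = standardize_term(cleaned[name])
    return cleaned


record = standardize_metadata({"organism": "  HUMAN ", "tissue": "liver"})
print(record)
```

Rejecting labels that are not in the vocabulary is a deliberate choice in this sketch: it surfaces inconsistent labelling at the point of entry, when it is cheapest to fix.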
Standardization combined with thorough and accurate labelling also makes data reusable across a number of different projects.
How to Make Data FAIR
At the SmartLabs Congress, Kees Van Bochove, Founder of The Hyve, shared a blueprint for implementing FAIR data principles in biopharmaceutical R&D. He described three steps:
1. Create socio-cultural changes within an organization to promote the idea of working together on data. Give people time to become comfortable with sharing data, allowing others access to their data, and using external datasets to supplement their own data.
2. Train data stewards within an organization. Promote awareness of FAIR principles and teach all staff best practices on how to make data available to others. Show people how to share data and implement data standardization, for example using standardized vocabularies to describe and label data.
3. Provide the technological infrastructure to support the change to FAIR data management. This requires technology, such as LabTwin's digital lab assistant, that can acquire, accurately label, securely store and share data.
According to Van Bochove, the effort of implementing FAIR data principles is rewarded with improved scientific collaboration across an organization. Rather than attempting a radical, company-wide change, organizations could take small steps towards switching to FAIR data management. For example, Bayer reported working with the Pistoia Alliance to deploy the FAIR framework in a single lab. Lessons learnt from such a pilot study could then be applied to the whole organization.
Contact us for more information on how LabTwin's digital lab assistant can help you implement FAIR data management in your lab.