After five years, more than 350,000 hours of genome sequencing and more than £200 million of investment, UK Biobank is releasing the world’s largest single set of human sequencing data, completing the most ambitious project of its kind to date. The new data, comprising the complete genome sequences of half a million of its participants, will certainly lead to the discovery of new diagnoses, treatments and cures. The data is available to approved researchers worldwide through a protected database containing only de-identified data.

The advance lies not only in the abundance of genomic data, but in using it in combination with the existing data UK Biobank has collected over the past 15 years on lifestyle, whole body imaging scans, health information and proteins found in the blood. The Pharma Proteomics Project was published last month in Nature, in the paper “Plasma proteomic associations with genetics and health in the UK Biobank.”

UK Biobank

Looking ahead, these data can be used to advance efforts such as more targeted drug discovery and development, the discovery of thousands of disease-causing non-coding genetic variants, the acceleration of precision medicine, and a better understanding of the biological basis of disease.

Sir Rory Collins, FRS FMedSci, UK Biobank Principal Investigator, said: “This is a real treasure trove for approved scientists carrying out health research, and I hope it will have transformative consequences for diagnosis, treatment and cure around the world.”

Nearly 20 years ago, UK Biobank recruited half a million volunteers to create the world’s most comprehensive source of health data. The new addition of sequencing data comes after a series of major advances made using the vast UK Biobank biomedical database. These advances include discovering genes associated with protection against obesity and type 2 diabetes; identifying individuals at very high genetic risk for diseases such as heart disease, breast cancer and prostate cancer; and finding a link between activity and Parkinson’s that can predict the disease up to seven years before diagnosis from smartwatch data, which may lead to early intervention. The new sequencing data will dramatically increase the potential of the existing data.

The sequencing project was funded by Wellcome, UKRI and four biopharmaceutical companies: Amgen, AstraZeneca, GSK and Johnson & Johnson. In return for a significant investment, the UK Biobank grants nine months of exclusive data access to industry members of the consortium. DNA sequencing was completed by Amgen’s subsidiary, deCODE Genetics, and the Wellcome Sanger Institute using Illumina NovaSeq technology, and deCODE provided additional informatics processing support.

The four pharmaceutical companies plan to publicly share the summary statistical analyses arising from the consortium collaboration, including genome-wide association results. This will give the research community highly valuable insights without the burden of the expensive and time-consuming process of analyzing the raw data.

The data, along with the rest of UK Biobank’s de-identified data, is now globally accessible to approved researchers on the UK Biobank Research Analysis Platform, hosted on Amazon Web Services (AWS) in the London region and enabled by DNAnexus. Following completion of sequencing, the industry consortium led efforts to process the genomes using the DRAGEN pipeline on AWS infrastructure and to perform joint calling, allowing this vast amount of data to be converted into a single combined genetic dataset by Illumina. These outputs further enhance the potential of the data to identify less frequent genetic variants and make it more comparable with other large-scale population health studies.

Source: www.genengnews.com