Navigating the Big Data Explosion in Biomedical Sciences
Jun Yin, Ph.D., Director, Bioinformatics Shared Resource, Sanford Burnham Prebys Medical Discovery Institute
Biomedical research is experiencing a big data explosion. This started in the 1990s with the ability to sequence the entire human genome. Since then, DNA sequencing has improved even further: there are now instruments that have lowered the cost of sequencing to less than $1,000 per genome, and that trend is heading toward $100 to sequence an entire human genome.
The invention of DNA sequencing has inspired more tools that are deepening our understanding of the human body. Now there are technologies that can measure total RNA expression in a tissue, an indicator of genes that are “turned on”; investigate how gene expression is regulated; obtain the genetic sequence of a single cell and much more.
This huge amount of information brings many challenges. Logistically, simply managing and storing big data may present difficulties. Additionally, scientists typically are not trained in big data analysis. At Sanford Burnham Prebys, the breadth of research taking place is also a challenge. Our researchers study cancer, neurological disorders such as Alzheimer’s, heart disease, children’s diseases and many more. We also have a state-of-the-art drug discovery center, called the Conrad Prebys Center for Chemical Genomics. As a result, we need to be able to address a broad range of computational biology requests.
What is the unique leadership strategy that you are following after witnessing the current trends?
Because bioinformatic tools are evolving so quickly, our top priority is to make sure we provide our scientists with the most cutting-edge methods available today. We are constantly evaluating new technologies and making sure we stay up to speed on best practices to ensure data reproducibility and integrity.
We are currently working to standardize and streamline workflows for the most commonly used analyses using state-of-the-art computational tools. This allows our team to work efficiently and focus on helping scientists with more difficult and unique problems. For example, after generating analysis reports for multiple assays (such as analyzing RNA expression in tumor tissues, profiling tumor genomic mutations and screening various cancer drugs), we can help scientists overlay these assays to reveal the full story of how the cancer developed and progressed, and which drug is most suitable.
This comprehensive view of data will lead to deeper biological understanding and better medicines.
We also try to put power back into researchers’ hands as much as possible. We routinely provide training on advanced bioinformatics technologies and software—especially for those that are commonly used—so our scientists can analyze their own data. Additionally, we often set up computational infrastructure and customize available software for individual research groups.
Together, these efforts shorten turnaround time for analysis, allowing our scientists to get to important biological answers faster.
In your view, what will be the next big disruption in the market? Are there any new developments that you feel are going to really change things in the bioinformatics market?
Machine learning will absolutely disrupt—and improve—biological research. This method allows patterns to be easily identified and data to be deeply mined. For biologists, this means new insights that humans are unable to spot can be identified simply based on the huge amount of information. Right now, we are working to apply different machine-learning methods for data integration, to help discover new therapeutic targets, to understand drug resistance mechanisms, and to generate deeper insights for drug development—especially personalized therapies—and other research activities.
In addition, I am particularly excited about applying single-cell sequencing technology to biological research. This technology allows gene expression and regulation to be studied at the single-cell level. In contrast, scientists were previously only able to study gene expression in tissues, not at a cellular level. Applying machine learning to this method will reveal even deeper insights into the molecular machinery inside a cell, and how this may lead to disease.
What drives you professionally in your work as director of bioinformatics at Sanford Burnham Prebys?
I am passionate about helping scientists find treatments for serious conditions, which has been the focus of my career. I feel privileged to be able to help experts at our institute discover new therapies for some of the deadliest diseases, including cancer, heart disease and Alzheimer’s. I am also proud to work for a nonprofit research institute, as we can focus on conditions that might otherwise be overlooked due to the lack of a profit motive.
We are lucky to live in the modern world where we have an explosion of new genomics technologies, but we need to integrate and interpret the data to be able to improve our understanding of disease and drug discovery. Imagining a future when children survive brain cancer or a grandparent lives to see the birth of her or his first grandchild drives me every day.
What advice would you give a fellow professional who is just starting out in the bioinformatics space?
Bioinformatics is multidisciplinary, so you will need to have solid training in biology, computer programming, and statistics. With this background, you will be able to understand the question a researcher is asking, know the right algorithm needed to analyze the data and create a program that automates the process and generates robust data. There are many Ph.D. programs in bioinformatics and now you can even get an undergraduate degree in the field.
Equally important, you have to be a good communicator. In bioinformatics, you will collaborate with researchers from diverse fields. The ability to understand the problem at hand and communicate a solution is critical to professional success.
I highly recommend bioinformatics as a profession. Almost every biomedical research institute and pharmaceutical company has a bioinformatics core like ours. The career opportunities are abundant and growing.