Robert van den Berg is currently the Senior Director, Head Data Sciences & Computational Vaccinology for Vaccines R&D at GlaxoSmithKline, a position he reached after over a decade in the company – in addition to his prior work in academia. We caught up with him for a discussion on the current trends in multi-omics approaches and novel technologies.
RB: It was really exciting to start my career in the omics fields – it felt truly pioneering to work at the interface between biology and data science during my PhD. It is perhaps not a single highlight, but in general it is astonishing to see data science join the mainstream of biomedical research. As a corollary, it is great to see talented individuals being introduced to our world – the life sciences – from other sectors such as tech, which will enable them to make a real impact on public health and improve the livelihoods of people around the world. We have seen technology revolutionize biomedical science – from microarrays and next-generation sequencing to single-cell analysis. We have also seen imaging techniques make groundbreaking progress at breakneck speed. This progress is supported by advancements not only in the life sciences, but throughout the technology industry and the other sectors it strongly interfaces with. The computing power available for big data analysis is unprecedented – and I suspect it will remain so as further improvements arrive and as advances in AI and deep learning make their way into the biomedical sciences.
RB: It has definitely been thrilling to see – the possibility of generating this much data has opened many avenues for investigating complex questions and generating novel insights. It has also raised other questions, such as how to improve our own experimental design so that we generate the right data to answer these complex questions. We also have to think about other concerns: assessing the reusability of such large datasets in other applications, considering the future-proofing of such data, and deciding what kind of metadata we need to capture. I also think the community collectively made the right decision in encouraging data sharing across scientific publications, and I am really proud of the large repositories we have for storing and sharing such data. These datasets aren’t always reusable by everyone – sometimes data is partly redacted – but we have made great strides; there is simply room for improvement.
RB: That is actually a great question – I think the architecture of data in the life sciences means that all the players have different motivations for conducting their research and for choosing the tools they use. Compared to the physical sciences, the cost of research is not so high that it becomes impossible to perform, which allows people to generate data independently. There are of course common drivers for harmonizing these endeavors, such as the standards journals set for sharing data and making code available. Obviously standardization and harmonization still have a long way to go – especially compared to physics – as not all datasets have comparable metadata. It could be that this data architecture results in different levels of cooperation across the field – although the COVID-19 pandemic has certainly illustrated the need for greater collaboration, and it led to an enormous amount of research being generated in a very short period of time. The community was feverishly at work trying to find a way out of the pandemic and, in the process, people posted much of their work on preprint servers ahead of publication, accelerating the pace at which other researchers could use it. We have also seen a greater push for transparency and information sharing during the pandemic, both factors that definitely contribute to a shift towards more standardized definitions and greater cooperation in the life sciences.
RB: Definitely an interesting question – from a scientific perspective, I think the key is designing experiments, based on the latest insights, that generate data for testing our hypotheses or for demonstrating advancements in the field. It can be difficult to step back and pause; however, as a company we are not only discussing the value of generating new data, we are also looking at maximizing the value of existing data – good-quality data with the right metadata has extremely high reusability. There are still other things to consider, and there is a lot to gain from new data, so the difficulty of convincing people to stop generating the amount of data they think they need should not be underestimated.
RB: There are so many large challenges in this domain – one cursory look at the papers being published reveals progress aimed at key deficiencies. These include the availability of high-resolution reference maps; consortia like the Human Cell Atlas project have set out to provide these. There is definitely value in harmonizing and “bringing data together”, so to speak, to the benefit of everyone. There is also a constant flow of novel computational methods aimed at improving our approach to multi-omic integration and at performing better comparisons. We can also see that newer concepts, such as AI and machine learning, have begun entering the field. The heterogeneity of datasets and experimental variability is indeed a large part of the problem that needs to be addressed, and for that I think improving our experimental design – particularly with a healthy dose of forward thinking, keeping the data analysis and interpretation in mind – can simplify integration the most. It is similar to the beginning of the microarray and sequencing eras – people were facing big technological issues, but we eventually addressed them through perseverance.
RB: The pandemic, and the impact of vaccines in tackling it, has certainly brought vaccines front and center in people’s minds. We have also seen the pandemic move RNA-based vaccines from a promising future technology (one that was decades in the making) to a proven, commercially viable implementation. However, it still looks like further progress will be required for SARS-CoV-2 vaccines in the future, to deal with waning immunity and novel variants. Having a deep understanding of how various vaccines work – the quality and the duration of the immunity they generate, the severity of potential side effects, and the variants they work against – will remain critical for advancing the field. Multi-omic technologies have a large role to play in facilitating these advancements, specifically by providing an understanding of the mechanisms by which vaccines work at a molecular level – involving the entire adaptive and innate immune systems. Multi-omics can also provide data from different levels – tissue cultures, organs such as lymph nodes or injection sites – which can assist in teasing apart the exact effects of a vaccination.
RB: I think it is very interesting to be active in this field; the opportunities that pioneering technologies provide us with allow us to re-examine our field as a whole as we find ways to adopt them. It will be exciting to see how quantum computing will develop in the life sciences. Currently it is much more developed in physics, which can provide the infrastructure at key expert centers; it will be interesting to see if its eventual adoption in the life sciences brings with it new needs for harmonization and standardization in our field. I am also very curious to see how the algorithms which will be run on quantum computers will be designed to assist in solving multi-omics problems. Formulating questions from multi-omics in architectures that quantum computers can deal with will certainly be an intriguing challenge to see being addressed. For machine learning, I think we already see increasing efforts on that front with integration approaches – it is trending in publications across a plethora of applications, and looks like it is very much here to stay. And indeed, there are other, more familiar challenges we can anticipate: we have a lot of data, though it may not all be as useful as we had hoped. We need to continue to find ways to capture the “spirit” of the data in more innovative and effective ways to increase its reusability and applicability across wider implementations, while also increasing its homogeneity to eliminate biases.
RB: I am absolutely looking forward to physical conferences happening again – and the unique setting of the Bioinformatics Strategy Meeting establishes the most personal setting for scientific discourse. I cannot wait to meet other people active in the field and learn from their experiences; hopefully, they will also learn a little bit from mine. Discussions are always more effective when the attendees bring their own insights and ideas to the table – and the Strategy Meetings are perfect for facilitating this.
Tuğçe Freeborough-Gerrard, Producer, Proventa International
Nick Zoukas, Former Editor, PharmaFEATURES
Join Proventa International’s Bioinformatics Strategy Meeting in Boston to discuss the latest impact of novel data management, analysis and interpretation trends in multi-omics approaches, as well as the wider field of Bioinformatics. Meet and discuss cutting-edge topics with world-renowned experts and stakeholders to kickstart your next collaboration.