How can computers understand what is going on in cells and diagnose diseases? In this interview, Prof. Manfred Claassen explains how our understanding of biology is advancing thanks to technological progress. A key element is the application of machine learning to the wealth of data created by single-cell analysis technologies. The startup Scailyte is pioneering this approach with the goal of developing early diagnostics for complex diseases.
University of Tübingen
Manfred Claassen is a tenured Professor at the University of Tübingen. From 2013 to 2020, he was an Assistant Professor of Computational Biology at the Institute of Molecular Systems Biology at ETH Zurich. He holds a Diploma in Biochemistry and a Diploma in Computer Science from the University of Tübingen and a Ph.D. from ETH Zurich, and he completed his postdoctoral training at Stanford University. Manfred is co-founder and Scientific Advisor of the startup Scailyte.
Analyzing biological data with the help of machine learning algorithms promises to advance our understanding of diseases. But how intelligent is artificial intelligence today?
The models we are using stem from the field of machine learning, which is related to the field of artificial intelligence. Artificial intelligence, in my opinion, evokes the broader notion of human intelligence and our ability to infer connections between facts from scratch, and machine learning models today aren’t capable of that in this generality. One way to think of such a model is as something that is able to faithfully reconstruct an output, such as the weather tomorrow, based on input data, such as the weather today, without knowing the output beforehand. The model learns by being shown existing pairs of input and output data. This very basic and seemingly unintelligent concept works very well in a plethora of application domains, as has been impressively demonstrated for face and speech recognition.
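To make the idea of learning from input and output pairs concrete, here is a minimal sketch in Python. It is not the kind of model discussed in this interview: it fits a simple straight line by least squares, and the temperature values and the function names `fit_line` and `predict` are invented purely for illustration.

```python
def fit_line(pairs):
    """Least-squares fit of output = slope * input + intercept."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply the fitted line to an input the model has not seen."""
    slope, intercept = model
    return slope * x + intercept


# Invented toy data: (temperature today, temperature tomorrow) in °C.
training_pairs = [
    (10.0, 11.0),
    (12.0, 12.5),
    (15.0, 14.0),
    (18.0, 17.5),
    (20.0, 19.0),
]

model = fit_line(training_pairs)   # "learning" from known input/output pairs
forecast = predict(model, 16.0)    # output for a new input
print(f"Predicted temperature tomorrow: {forecast:.1f} °C")
```

The model never sees the answer for 16 °C; it only generalizes from the pairs it was shown, which is the basic principle described above.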
If an algorithm tries to recognize a disease instead of a face, what will it be looking for?
Cancer and immune disorders such as multiple sclerosis or diabetes are complex diseases. What they have in common is that they are caused by dysregulation of a few cells, whose (dys-)function is determined by the interactions of a multitude of proteins and genes. The model will learn a large number of molecular parameters in parallel. For these diseases, not all of the cells are dysregulated. The health issues are caused by a small number of cells that transition to a diseased state and then possibly multiply, as for instance in cancer. Over the last decades, experimental methods have been developed to monitor thousands of proteins or genes at once. However, these methods analyze cells in bulk and are severely limited in diagnosing these diseases early.
That’s why single-cell technologies, which became commercially available only about a decade ago, hold such great promise.
Exactly. Single-cell technologies allow measuring these parameters and thereby establishing a CV, so to speak, of every single cell in a sample. And this information makes it possible to actually look for patterns across individual cells that indicate onset or development of a disease, and extract these patterns.
But how does the model learn these patterns? In the case of cancer cells, there is no standard as these cells mutate over time.
We attempt to identify cell patterns that are the same across a large number of patients, and we do so by analyzing dozens or hundreds of patients with the disease and comparing them with healthy individuals. To address the limited number of patients in such studies, we help our model to learn by incorporating formalized prior knowledge from life science experts. This enables it to come up with meaningful patterns even with small study sizes.
How do you know that teaching your computer model biology lessons is actually useful, and what exactly to teach?
We’ve observed that a vanilla machine learning model is difficult to interpret. Including prior knowledge allows us to relate model predictions to the basic functioning of cells as well as to the different types of cells in the human body. While all cells share the same blueprint, they differentiate over time to become skin cells, liver cells, heart muscle cells, and so on. But the fundamental question of which prior knowledge is useful for training a diagnostic machine learning model is a topic of ongoing basic research. It is conceivable to go much deeper and integrate principles of chemistry and physics into a model, for example.
How different is the work the computer model does from the work of a scientist who also tries to interpret such data?
First of all, what the computer does, based on evaluating the model prediction, is not some kind of voodoo. It replicates the way in which a human would analyze the data. Take the basic question of whether the expression of a given protein correlates with the onset of a disease. The human scientist looks at protein number 1, then at number 2, etc. Sifting through a measurement of hundreds or thousands of proteins would be possible for a human researcher, but it would also be tedious, error-prone and, last but not least, boring work that takes a long time. We are considering a more complicated situation where, thanks to the latest single-cell technologies, we have such measurements for thousands or even millions of individual cells. In this situation, data analysis is possible only by means of more sophisticated computer models. Beyond automating work, these models, if designed properly, also provide a gain in knowledge, because they can identify complex patterns between a multitude of proteins.
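The protein-by-protein screening described above can be sketched in a few lines of Python. This is only an illustration of automating the manual check, not Scailyte's model: the protein names, expression values and disease labels are invented, and the Pearson correlation with a 0/1 disease label is an assumed stand-in for whatever statistic a scientist would actually use.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def rank_proteins(expression, disease):
    """Score every protein against the disease label, strongest association first."""
    scores = {name: pearson(values, disease) for name, values in expression.items()}
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


# Invented toy data: 0 = healthy sample, 1 = diseased sample.
disease = [0, 0, 0, 1, 1, 1]
expression = {
    "protein_1": [1.0, 1.2, 0.9, 3.1, 2.8, 3.3],  # higher in diseased samples
    "protein_2": [2.0, 2.1, 1.9, 2.0, 2.2, 1.8],  # unrelated to disease status
    "protein_3": [5.0, 4.8, 5.2, 2.1, 1.9, 2.3],  # lower in diseased samples
}

ranking = rank_proteins(expression, disease)
for name, r in ranking:
    print(f"{name}: r = {r:+.2f}")
```

A loop like this only looks at one protein at a time. The patterns mentioned in the answer involve many proteins jointly across individual cells, which is where more sophisticated models come in.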
So in a way, it’s not just the scientist that teaches the model basic understanding, it’s also the model that teaches the scientist new insights about interactions between cells. Gaining knowledge about complex diseases is the idea behind the startup Scailyte that you co-founded. How did the idea evolve into a company?
Peter Nestorov, who is now the CEO of Scailyte, came to me after he received his doctorate from the University of Basel. He was fascinated by the potential of single-cell data and was interested in doing postdoctoral research on methods to analyze them at ETH, where I taught back then. However, he opted for an offer in the biotech industry, which was very fortunate, because it allowed him to build a powerful network of users of single-cell analysis machines. Over and over he heard how those people struggled to analyze the data coming out of these machines. In the meantime, at ETH, we established a model that could correlate this kind of data with disease states. Having accomplished this, we realized that this approach has enormous potential to discover and establish diagnostic biomarkers for a wide spectrum of complex diseases. This is why we teamed up with the software specialist Dennis Göhlsdorf and Daniel Sonnleithner to establish Scailyte.
How has the startup evolved since then?
At first, we focused on software development with the aim to improve our model to the point where other researchers could use it in their own projects. Over time it became apparent that we could capture the biggest value of our technology by defining and initiating clinical projects ourselves, with the goal to identify the cell identity biomarkers that allow developing a diagnostic assay.
How can you find these clues that should eventually allow diagnosing diseases much earlier than is possible now?
Complex diseases leave traces, and we’re searching for these traces. For instance, in the case of cancer, our immune system detects it and is altered by it. Even in those cases where the cancer grows and the immune cells aren’t able to fight the disease, they still respond to it by adapting and changing, and we’re trying to identify and detect these changes. Scailyte is focused on diagnostics only and not on developing a cure. This goal allows us to focus on the single-cell analysis of blood samples only, not necessarily requiring patient material from the primary disease site. This facilitates the design of studies and, further down the road, of diagnostic assays, because we don’t need tissue biopsies, which are typically difficult to obtain.
The level of insight that single-cell analysis methods have allowed is unprecedented. How will these methods evolve in the future?
At the moment, these methods can measure one type of molecular parameter at a time: RNA, proteins, or DNA, for example. In the future, they will be able to measure all of them in one go and provide even more complex data, which we expect to yield even more knowledge and enable an even wider spectrum of applications.
How has your startup experience with Scailyte been so far?
I’ve transitioned from an active founder to a scientific advisor role over time. It’s very exciting to see a startup grow from its founding. The development of a startup, with new people coming in, who bring new ideas with them, is very dynamic. But the most important thing has been constant since the day we started, namely our vision to leverage the potential that the combination of single-cell methods and machine learning has. It’s very powerful to have this long-term goal as guidance, even when you often change the ways to get there.
If everything goes according to plan, Scailyte’s most advanced project, a diagnostic test for a type of skin cancer called CTCL, will be available in about two years. What would that signify for you?
I think we need to stay humble and deliver this result first. Nobody has ever gone all the way from a single-cell study to identifying a biomarker and then developing a clinical assay. It would be a world premiere and great news for the whole community.