MLOps Innovator Series: Dispelling Hype vs Reality for AI in Healthcare
Artificial intelligence (AI) is revolutionizing the healthcare industry. Medical practitioners and healthcare stakeholders can address medical problems faster and make more informed decisions with AI and machine learning.
Continuous innovation delivers a promising future for AI in healthcare, but is it hype? With Benjamin Letson, Director of Healthcare & Life Sciences at SFL Scientific, we discover how AI is used in healthcare, its challenges, and its future in the industry.
The medical industry accumulates vast data sets in images, health records, clinical trials, and medical claims. Artificial intelligence can analyze this data and uncover insights and patterns that are indiscernible to humans. As a result, many organizations have utilized AI in healthcare.
How is AI used in healthcare? In 2021, the U.S. Food and Drug Administration (FDA) granted the first-ever approval for an AI product in digital pathology, Paige Prostate, marking a new era for AI in pathology. This AI software is designed to identify signs of cancerous growth in biopsy samples, enabling fast and accurate diagnosis.
AlphaFold, created by Google’s DeepMind, can predict protein structures, a pivotal solution for accelerating disease and drug development research. It has been described as one of the most important AI achievements and a tour de force of technological innovation.
At the height of the COVID-19 pandemic, Moderna leveraged workflow automation, data capture, and AI to accelerate processes and deliver insights to their scientists. This approach sped up Moderna’s COVID vaccine development and is helping the company revolutionize disease treatment.
Continuous research and innovation will drive the use of AI in the healthcare sector for years to come. The industry is predicted to utilize synthetic data to speed up processes and create innovative AI solutions that offer personalized care for end users.
Stay Ahead of the Curve with Pachyderm
It’s not just hype—the role of AI in healthcare signals the growing importance of leveraging this technology and finding ways to innovate and stay competitive. Pachyderm provides what you need to accelerate your machine learning’s life cycle and stay ahead of the pack. Our AI MLOps tools and solutions can optimize, automate, and scale your machine learning operations. See the Pachyderm difference by scheduling a demo today!
Dan: [music] Welcome to another edition of the MLOps Innovator Series. I'm your host Daniel Jeffries, and I'm here with Benjamin Letson, PhD, Data Scientist, Director of AI Solutions for Healthcare and Life Sciences at the esteemed SFL Scientific. Thanks for coming to the show, Benjamin.
Benjamin: Sure. How're you doing? Thanks for having me.
Dan: Maybe tell everyone a little bit about your background and how you got into AI, and anything else sort of fascinating that people would want to know.
Benjamin: Sure. Sure. So by training, I'm an applied mathematician. I have a doctorate from the University of Pittsburgh. I studied primarily, kind of, biophysical modeling of the brain in my thesis. So after that, I kind of looked around and saw that a lot of the interesting work that was happening in the healthcare and life sciences field was happening in AI. So I joined SFL Scientific as a data scientist, where I worked primarily with healthcare clients, doing everything from predictive analytics to developing R&D solutions for transformational AI use cases. After that, I worked as a solutions architect, and now I lead our healthcare division, trying to connect interesting people with interesting problems.
Dan: Speaking of interesting problems, we've seen a bunch of-- it felt like for a long time artificial intelligence really wasn't delivering in terms of almost any of its promises. And then we started-- you get to AlexNet and you get this kind of big breakthrough and a lot of different use cases. But it was still slow progress. In particular, slow progress in the healthcare space. Right? You had the, kind of, disastrous failure of Watson Health, and many people thought, "Oh, maybe it just won't ever work." Right? Of course, I think none of us in the industry thought that that was the case, but still, it's been challenging to find healthcare, biosciences applications. Now, all of a sudden, we're seeing things like AlphaFold2 and DeepMind kind of spinning off a complete machine-learning-focused company based on that. Where do you think we are today? What's the most exciting use cases that you're sort of seeing? And where we're still sort of struggling and it's going to take more time?
Benjamin: Sure. Yeah. I think that we're definitely feeling some growing pains in AI in the healthcare space. I think for a long time, there wasn't a clear value proposition for the buy-in that's required to introduce AI into a clinical setting. But I think tools like Paige.AI being approved by the FDA is kind of opening up a new realm of providing decision support to clinicians to deliver better, faster clinical outcomes for patients. I think the place where the industry is still struggling is finding the data access they need to empower strong decisions. I think a lot of companies are still struggling with-- they have very splintered, very siloed datasets that they can't join together to kind of produce a holistic view of their data resources and how that can be leveraged to buoy patient outcomes, or drug discovery outcomes, or what have you. So I think the next five years of AI in the healthcare space is going to be about connecting datasets together. I think a really cool technique for that is something similar to a knowledge graph that can take a lot of heterogeneous data and join insights together with cause-and-effect type relationships.
So I think bridging data is the next step. And then once that data is in place, building really specific ML tools on top of that to help clinicians and researchers alike. So I think the things that are happening right now with AlphaFold are really interesting developments trying to tackle really fundamental problems in the healthcare space, and I think we're starting to get to the point where people will trust AI enough to really let it into core decision-making processes.
Dan: Now, you raised a number of really interesting points there, right? Specifically, you kind of touched on the challenges with the datasets, and you mostly framed it in terms of whole-based access control or kind of old-school bureaucracy, challenges of having different silos. Some of the other ones seemed to be that maybe there isn't enough quality data, right? So do you still see synthetic data playing a part in the roles and how challenging can that be? In fact, that was one of the big failures of Watson Health, was that they generated synthetic data that just wasn't applicable to the real world, right? And so generating quality synthetic data is interesting. And the other challenges that we saw on papers recently from Google and a few others on trying to find new ways to augment or sort of reduce so many self-supervised learning because labeling, kind of, these huge datasets is incredibly challenging, too. Do you see those both as blockers over the next five years, or do you see, kind of, both of those being solvable problems within the next five years?
Benjamin: Yeah, I think synthetic data is going to be a really important part of almost any training pipeline. I think that to get enough information about either the way a tissue behaves, if you are doing drug discovery, or mocking up patient data to understand how intervention links to outcome is going to be really crucial. I think, like you mentioned, it's a very hard problem. I think that in order to generate synthetic data effectively, you need both subject matter expertise in the room as well as really competent data scientists actually building the synthetic data pipeline. I don't think this can be done without having a doctor in the room to talk about how interventions are going to change based on a patient's demographic information, things like that. But I think that getting that right enables a lot of interesting AI. And to your point about labeling, I think it lets us kind of extrapolate and get more bang for the buck with the labeled data that we have. So moving forward, I see synthetic data tools growing-- even Nvidia announcing the Omniverse tools at GTC in the last week or so. I think people are going to start moving in that direction, and I think the industry as a whole is kind of becoming less distrustful, talking about synthetic data being a really valuable tool in the development process.
Dan: And do you think-- you mentioned a lot of things like clinical trials and you're starting to see certain-- Moderna, being one of the more advanced companies in this space, kind of starting with the digital-first approach and kind of having AI/ML sort of embedded in the entire process from, kind of, risk analysis or whatever, and a lot of other pharmaceuticals sort of retrofitting their processes that are very paper-based. So do you see-- but I've also seen sort of a lot of interesting edge cases when it comes to it. So there was a post the other day about a lady who had bought the latest Apple Watch with the EKG, or whatever, built into it and bought it for her mother on an upgrade, and it had woken the mother up in the middle of the night with a warning that she was in AFib. And then there were a whole bunch of other folks who posted similar sort of stories which were really interesting, right? So do you think that the most interesting use cases are kind of in repurposing sort of the things that we're already doing, right, and kind of enhancing them, like acting in the sort of Centaur methodology where you're working with the AI on the drug discovery, the risk, the clinical analysis, these kinds of things? Or do you think it's also on the edge as well, or is it really kind of both of those things that are the most promising? And which is further along, I guess is another question?
Benjamin: Yeah. I think there's still room to grow on a lot of the edge deployment use cases that we're seeing. So I mean, obviously, internet of things, and reliable data streams, and wearables open up a lot of interesting use cases. But I think there is a ways to go before we reach the potential of embedded AI that lives in a device and kind of operates autonomously without oversight. Because if you think about something like a continuous blood glucose monitor for a diabetic, right? So if that machine makes an incorrect prediction, someone's life is at risk, right? So putting guardrails around AI and making sure that it's operating the way it needs to is still a major barrier to crossing the regulatory lines needed to get those really interesting AI use cases out to patients. So I think thinking about not only how you take advantage of AI, but how you put guardrails around it, is a major step in edge computing and devices running on a patient level. I think that there are also a lot of technical challenges around housing these-- excuse me, housing these models on devices that often have limited compute and limited memory. You don't want to eat up your battery running a giant neural network, right?
So thinking about pruning, model deployment, how you leverage federated learning to improve your models and customize to individual patients, I think there's a lot of room to grow there, and I think we see that as a place where the healthcare industry is really heading in the next three to five years. I think on the other side, you're talking about kind of more institutional AI. So running on the cloud to enable people making decisions, perform drug discovery research, that sort of thing. I think that's an easier playing field to get into right now. It's easy to deploy those models and integrate them into people's workflows. I think there, the thing that's needed to grow is more buy-in from the experts doing the work, right? So getting a decision support tool in front of a research technician doing compound screenings in a drug discovery lab, and showing them that this is making reliable decisions and it can improve their work without any feeling of the AI being superior and dictating terms. Really building a kind of a collaborative relationship between an intelligence and a user that makes both of them stronger over time.
Dan: Interesting. Yeah. I am in total agreement with you that it's an easier playing field to go on the institutional side versus the edge device. And I always feel like we're going back to the earlier days of internet devices where you had to worry about how much memory you had, and you had to-- it's almost like the early video games where you had to make sure, now, we've only got this much on the ROM, and we got to be super-economical programmers and we got to crunch it down. Or you look at the infrastructure side, which I manage, there's-- I think [inaudible] is one of the companies that's just specializing in how do you take a model that exists and compress it, right? And still have it be viable, right? And there's a lot of this research now, how do we prune parameters, for instance, and still make it valuable so it shrinks the size of these things? And the news captures a lot of these trillion-parameter networks. And it's like, well, that's great, but if I want to run it on my watch or my glasses or in my smartphone, it might need to get a little smaller than that. Or make a call out to that one as a backup, but not necessarily be the one that's running inference on my phone.
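The parameter pruning Dan alludes to is most often done as magnitude pruning: zero out the smallest-magnitude weights and keep the rest. A minimal NumPy sketch of the idea; the function name and the 80% sparsity level here are illustrative, not any particular library's API:

```python
import numpy as np

def magnitude_prune(weights: np.ndarray, sparsity: float = 0.8):
    """Zero out the `sparsity` fraction of weights with the smallest magnitude."""
    flat = np.abs(weights).ravel()
    k = int(flat.size * sparsity)
    threshold = np.partition(flat, k)[k]     # k-th smallest magnitude
    mask = np.abs(weights) >= threshold      # True = weight survives
    return weights * mask, mask

rng = np.random.default_rng(0)
w = rng.standard_normal((64, 64))
pruned, mask = magnitude_prune(w, sparsity=0.8)
# roughly 20% of entries survive; the rest are exactly zero
```

In practice frameworks apply masks like this iteratively during fine-tuning, which is how results like the 80%-pruned detection models mentioned below are obtained.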
Benjamin: Absolutely. Yeah. So to your point about video games, there's an interesting process that was used in Doom, actually, to compute inverse square roots that are needed when you're doing things like bounces around the playing field or whatnot. So I think we're at the point where we're starting to see that there are these kind of niche solutions to pruning models that are very equivalent to that, which basically come down to really clever programming techniques to mount these models on small devices with limited battery. But I think there's also a field of research that's growing around the information content stored in neural network parameters. And I just saw some interesting work that suggested that you can prune out about 80% of a YOLO model, which is a computer vision model. You can prune out about 80% of the parameters and still have it function nearly identically. So there are a lot of interesting questions in the space about why neural networks learn a lot of redundant parameters, because it seems to be really important in the training process. But identifying how to take those models and make them lightweight for deployment is a field that we see as really exciting, and a place the field is going to need to go to deploy on the edge really effectively.
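The inverse-square-root trick Benjamin refers to is most famously the `0x5f3759df` bit hack from the Quake III Arena source. The same idea can be sketched in Python by reinterpreting the float's bits as an integer, applying the magic constant, and refining with one Newton-Raphson step:

```python
import struct

def fast_inv_sqrt(x: float) -> float:
    """Approximate 1/sqrt(x) via the classic bit-level trick."""
    # Reinterpret the 32-bit float's bits as an unsigned integer.
    i = struct.unpack('<I', struct.pack('<f', x))[0]
    # Magic-constant initial guess for the exponent/mantissa.
    i = 0x5f3759df - (i >> 1)
    y = struct.unpack('<f', struct.pack('<I', i))[0]
    # One Newton-Raphson iteration sharpens the estimate.
    return y * (1.5 - 0.5 * x * y * y)
```

On modern hardware a dedicated `rsqrt` instruction makes this obsolete, but it illustrates the kind of clever low-level engineering that edge deployment still rewards.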
Dan: Well, we've actually seen similar-- I've seen a bunch of papers recently about using that for sort of multi-capability neural networks, right? Where you would have-- there are a number of different techniques where you might overcome catastrophic forgetting by essentially pruning it down to like 5 or 10 percent of the net and then freezing those weights and then having it train on a new task. And then there are even papers where you essentially have a gigantic neural net in the back and kind of a learning neural net in the front, with the idea of being able to pass information forward so they might learn new things, layer it on top, then learn something new, layer on top, then average them all together. So not only is that important for kind of crunching stuff down, but it's also going to be important for trying to teach these things to be more than a one-trick pony, which I think is quite interesting as well. And you also have been-- when we were talking before the conversation, you were excited about some other techniques. You were excited about graph neural nets, and some of the things around that. Why are you excited about that? What do you think they kind of bring in particular to the health sciences field? We've seen them used a lot in social media networks and graphing personal relationships. So why do you think they're useful in kind of the biosciences and healthcare fields?
Benjamin: Sure. So I think that a lot of healthcare-related data really fits into a graph framework very naturally. So imagine you have a patient node in a graph network that's connected to maybe all of the past health issues this patient has had, demographic information, etc. So you have kind of a little community that describes this person, which is very dynamic and flexible enough to capture the fact that one person might have a lot of different health issues that can be interconnected, right? So I think the promise of a graph neural network is being able to take that structure and learn directly from the interconnected nature of the data to try to predict, if I add a new symptom to this person's kind of community of information, should I be looking at disease X, Y, or Z, right, and trying to figure out dynamically, can I use all of the past information I have in this natural format to see what's next for this patient, how I should intervene? I think on a totally different front, if you look at people doing drug discovery models, you can represent any molecule you want to study as a graph network.
So the Lewis diagrams that we all drew in school that say, "This carbon atom is bonded to that carbon atom," are natural graph objects. So if you train a graph neural network on Lewis diagrams, what you get is a model that understands chemistry and the relationship between atoms in a way that hasn't been explored yet. This is an entirely new capability in the chemical modeling space. So to sum that up and make it succinct, the flexibility of a graph structure allows you to capture really interesting information, and then the new developments in these neural networks allow you to make predictions based on that information, which enables a lot of cool use cases.
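As a toy illustration of treating a molecule as a graph, here is one round of mean-aggregation message passing over the heavy atoms of ethanol (the C-C-O chain). The one-hot atom features and random weight matrix are illustrative assumptions, not a production GNN:

```python
import numpy as np

# Ethanol's heavy atoms as a tiny graph: C - C - O
# Node features: hypothetical one-hot encoding over (carbon, oxygen).
X = np.array([[1.0, 0.0],   # C
              [1.0, 0.0],   # C
              [0.0, 1.0]])  # O
A = np.array([[0, 1, 0],    # adjacency: bonds between atoms
              [1, 0, 1],
              [0, 1, 0]], dtype=float)

def gnn_layer(X, A, W):
    """One message-passing round: average each node with its neighbors,
    then apply a learned linear map and a ReLU."""
    A_hat = A + np.eye(A.shape[0])          # add self-loops
    deg = A_hat.sum(axis=1, keepdims=True)  # per-node neighbor counts
    return np.maximum((A_hat @ X / deg) @ W, 0.0)

rng = np.random.default_rng(0)
W = rng.standard_normal((2, 4)) * 0.1
H = gnn_layer(X, A, W)  # each atom's embedding now mixes in its bonded neighbors
```

Stacking layers like this lets information flow across the whole molecule, which is the mechanism behind the chemistry-aware models Benjamin describes.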
Dan: So if you were-- so essentially, in one of those use cases, if I understand correctly, you're saying that, if we're able to pull in a lot of different information about an individual in their history, where they've been, doctor's notes, this kind of rich unstructured data that kind of exists out there about a patient in a gigantic sort of fog that's around them, that we might be able to represent it in a graph in terms of where they're sort of heading, right, is that sort of a trajectory, to make predictions in a way that says like if you keep going-- if you keep smoking, for instance, and eating a burger every other night and drinking soda, you're likely to end up in this direction? Is that sort of how you're thinking about it?
Benjamin: Yeah, I also think that, beyond that-- so let's imagine that the patient with that full history is presenting a new set of symptoms that a clinician is trying to diagnose. So what I would imagine is, let's say there are three candidate diseases that explain this grouping of symptoms. Now, what a clinician is going to do is order those in terms of, what's the likelihood that X, Y, or Z is the correct diagnosis? And then you're going to run tests, starting with the most probable disease, eliminate it, and move on to the others. So what I was imagining is a tool that takes into account all of the patient's history to help the clinician prioritize in which order to test for these diseases. So most of the time, when you hear hoofbeats, a doctor is going to assume that it's a horse, not a zebra, is kind of the famous idiom, right? But sometimes it is a zebra, and catching the zebra quickly can make a big difference in the clinical outcome for a patient, right? So kind of presenting a decision support tool to help clinicians make the best decisions as fast as possible, given the entire body of a patient's history.
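The prioritization Benjamin describes can be thought of as ranking candidate diagnoses by an unnormalized posterior: prior probability times the likelihood of the observed symptoms. A toy sketch; the diseases and numbers here are made up purely for illustration:

```python
# Hypothetical priors and symptom likelihoods, for illustration only.
candidates = {
    "influenza":   {"prior": 0.60, "likelihood": 0.50},
    "lyme":        {"prior": 0.05, "likelihood": 0.90},
    "common_cold": {"prior": 0.35, "likelihood": 0.20},
}

def rank_diagnoses(candidates):
    """Order candidates by unnormalized posterior: prior x likelihood."""
    scored = {d: v["prior"] * v["likelihood"] for d, v in candidates.items()}
    return sorted(scored, key=scored.get, reverse=True)

order = rank_diagnoses(candidates)  # the "horse" still comes first, but the
                                    # "zebra" is never dropped from the list
```

A real system would learn these probabilities from the patient graph rather than hard-coding them; the point is only that ranking, not replacing, the clinician's differential is the goal.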
Dan: Interesting. And if we look at something like AlphaFold-- and you mentioned that a lot of things can be represented as a graph. In fact, kind of the way they described it is that they were using an attention network, right? And that attention network was looking at, sort of, spatial graphs. Thinking about proteins, essentially, as spatial graphs. Using - what is it called? - a multiple-sequence [inaudible], right? So their attention mechanism was basically trying to understand that spatial graph, right? And then you kind of generate predictions about how other structures would align into a different graph, and then sort of generate their own graph for what that structure would look like. So there's kind of a-- it's turtles all the way down, in a way. These graphs that we're representing-- they were also able to potentially represent a lot of these structures themselves as a graph. So on the institutional side, where do you see that being important? Because we talked about the patient side-- again, that's sort of going back to that edge-institutional dichotomy and understanding the patient. But on the institutional side, where does the rubber meet the road? Where does this become really practical? Because that's where the vital work is going to happen on the institutional side of things.
Benjamin: Sure. Yeah. So I think one of the really obvious use cases here is constructing knowledge graphs that kind of encompass all of the data an organization has. So let's imagine you're a very large pharmaceutical company doing drug discovery, right? So likely, as we talked about, you're probably dealing with heavily splintered, heavily siloed datasets, and there can be a lot of institutional blindness where group A doesn't know about the data that group B is generating. So one thing we see as a really interesting enabling technology is taking all of that data and extracting very high-level cause-effect relationships from all available data. So looking through presentation slides and extracting the fact that compound A will bind with protein B and cause some effect, right? So extracting all this high-level information and populating a large knowledge graph, so that everyone in the organization has some high-level awareness of everything that's been tested before in these labs, and you can eliminate redundancy and time spent on experiments that were tried 20 years in the past.
So I think the first step in a lot of these applications is connecting users to all of the data that's available to allow them to make intelligent decisions. And then after that, putting a layer of artificial intelligence on top of that to say, "Oh, I think there might be an interesting connection between this compound and this target. You should test this," right? So doing the same type of prediction you're talking about - excuse me - with patient analysis, but now we're looking at a high level and then saying, "No one's tried this protein compound pair before, and it seems like a strong candidate for a beneficial interaction."
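At its simplest, the knowledge graph Benjamin describes is a store of (subject, relation, object) triples that can be queried before scheduling an experiment. A minimal sketch; the class, the `binds` relation, and the compound/protein names are all hypothetical:

```python
from collections import defaultdict

class KnowledgeGraph:
    """Minimal triple store for (subject, relation, object) facts."""
    def __init__(self):
        self.triples = set()
        self.by_subject = defaultdict(set)

    def add(self, subj, rel, obj):
        self.triples.add((subj, rel, obj))
        self.by_subject[subj].add((rel, obj))

    def already_tested(self, compound, target):
        """Has this compound-target pair been recorded anywhere in the org?"""
        return (compound, "binds", target) in self.triples

kg = KnowledgeGraph()
kg.add("compound_A", "binds", "protein_B")      # hypothetical extracted facts
kg.add("protein_B", "regulates", "pathway_C")
# Before scheduling a new screen, check for 20-year-old redundancy:
kg.already_tested("compound_A", "protein_B")    # True: skip the repeat experiment
```

Production systems use graph databases and learned link-prediction models on top of structures like this, but the redundancy-elimination idea is the same.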
Dan: So the knowledge graph, in essence, says, "Don't go down this path. It's already been done a bunch of times," or, "If you're going to go down this path, consider a different angle on it," right? It sort of folds together the things that have been tried. And then on the flip side, it's also saying, "Here are angles that you might not have already thought through as potential strong candidates." And so there's an [inaudible] of processing this kind of discovery where it's suggesting potential pathways. Do you think there's also a feedback loop that starts getting roped into those types of applications, too? Because I've often thought about this in, sort of, the arts, right? Sometimes AI is underrepresented there, but if you think about a person playing a guitar, creating a riff, I can start to see the AI working with a person and saying, "Okay, great. Well, go give me 20 continuations of that riff." And okay, now I've listened to them and I go, "Well, you know what? Continuation six is really interesting. Go in that direction and forget the other ones." So do you see a similar feedback loop where, as they kind of test these different things, you go, "You know what? This doesn't look too promising," or, "You know what? This was interesting. The preliminary results were this way. Give me some more ideas on this"? Do you see that kind of feedback loop happening? And how fast do you see that feedback loop happening?
Benjamin: Yeah, yeah. So we typically refer to that as something like guided hypothesis exploration. So you work with the AI to look for interesting directions, and as more data comes in, to refine those directions. So this is definitely a feedback loop where an AI can keep track of a lot more information at one time than a person can, taking that and filtering it down to the bare essentials to help guide someone performing really involved scientific tasks. So I think there's definitely a feedback loop where the human improves the quality of the AI and vice versa over time. Now, I think the interesting thing about positive feedback loops like this is that you're going to accelerate development. So AI helps person, person helps AI. So both of these things are going to get much better, much faster than they would independently. So in the next three to five years, this is something that we're definitely going to see kind of breaking out of the space. And there's going to be a significant reduction in time to market for drug discovery, for instance. Or the speed at which clinicians can diagnose and make effective interventions. So I think you're exactly right, that the name of the game here is setting up these positive feedback loops where we can make better decisions faster.
Dan: And that kind of leads us into maybe the last-- a good place to sort of finish it off on is, where do you see all this thing in 10 years? A lot of this stuff is now very much in the early adopter phase. There's a lot of folks sort of working on this. It's a little bit messy, or a lot messy, depending on where you're at. And that's where a company like SFL comes in to be able to help piece it-- people make sense of it, and piece it all together, weave together a lot of bespoke solutions. How do you see it looking in a decade from an infrastructure standpoint? How do you see it looking from a types of systems that are in place and available to people both at the edge and kind of on the institutional side?
Benjamin: Yeah. I think in 10 years, the primary difference I see in the AI space in healthcare is more complete customization and personalization. So I think we've talked about a couple of different feedback loops today between a human and a device that has embedded medical capabilities - that blood glucose monitor making decisions. And we've also talked about on the institutional side the relationship between a person and a guided hypothesis exploration engine. So I think what's going to happen in the future is using more federated learning frameworks to take a strong general-purpose AI and slowly tailor it to exactly what the user needs. So when we talk about it internally, we kind of talk about it as if it's going to be-- you're going to have a really smart AI solution that over time is going to learn exactly what you want to do and help you get there. So I think this is the promise of a federated learning ecosystem, where you can have individual models kind of personalizing to the data inputs and outputs they're generating. So I think in the same way that medicine is moving towards kind of a personalized medicine paradigm, a lot of the AI in healthcare is going to move in a similar way, really adapting to what the exact needs of the user are and tailoring its advice or its predictive capabilities to that end.
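The federated learning Benjamin mentions typically keeps data on each device and aggregates only model updates, most simply with a data-size-weighted average (the FedAvg scheme). A minimal sketch with toy weight vectors; real systems add secure aggregation, compression, and per-client personalization on top:

```python
import numpy as np

def fed_avg(client_weights, client_sizes):
    """FedAvg: average per-client model weights, weighted by local data size.
    Raw patient data never leaves the device; only weights are shared."""
    total = sum(client_sizes)
    return sum(w * (n / total) for w, n in zip(client_weights, client_sizes))

# Three hypothetical devices with different amounts of local data.
clients = [np.array([1.0, 0.0]),
           np.array([0.0, 1.0]),
           np.array([1.0, 1.0])]
sizes = [100, 50, 50]
global_w = fed_avg(clients, sizes)  # -> [0.75, 0.5]
```

The personalization Benjamin describes usually comes from fine-tuning this shared global model locally on each patient's own stream of data.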
Dan: Awesome. I think that's going to be an exciting time. I can't even imagine what it's going to be 20 or 30 years from now in terms of our ability to do personalized medicine and be able to treat people for diseases that probably never would have seen a drug built for them, right? I mean, because like it or not, we have to build drugs that have a kind of blockbuster potential. But being able to tailor medicine to a very small genetic disease, or a disease that maybe doesn't have as much economic potential but still has a kind of human potential, is an exciting, transformative step that I'm personally looking forward to as a human being. [laughter] I think it'll be quite wonderful. Is there anything else that you wanted to raise? And if not, then we can do kind of one sort of rapid-fire question, and then we can wrap up here.
Benjamin: Yeah. I mean, I think I'm also excited by the potential for AI in the healthcare industry. I think we're still, as you mentioned, in the early adoption stage. And to be honest with you, I can't even imagine what 20 years from now looks like, but I'm really hoping that the industry takes AI seriously, sees its opportunities and its limitations, and that we figure out how to get the most out of the data we have together, in a way that's safe for patients but also increases the likelihood of success.
Dan: All right. Now we'll do three rapid-fire questions here, just to finish off. Favorite math formula of all time?
Benjamin: Oh. I mean, I guess e^(i*pi) + 1 = 0, so Euler's identity, I think it's called, typically.
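Euler's identity is easy to check numerically with Python's `cmath` module; the result is zero up to floating-point rounding:

```python
import cmath
import math

z = cmath.exp(1j * math.pi) + 1  # e^(i*pi) + 1
# abs(z) is on the order of 1e-16, i.e. zero up to rounding error
```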
Dan: Awesome. Favorite movie of all time? And this could be a hard one, because there could be a top five, but give me the first one that comes to the top.
Benjamin: Well, I mean, the first one that popped to mind was Reservoir Dogs. So I'm going to have to go with that. It's an old-school Tarantino.
Dan: Classic. One of the best. They just don't make them like that anymore. And the last one is, if you could change one thing in healthcare, what would it be?
Benjamin: Yeah. I think it would be the belief that AI is there to replace clinicians. So this is the thing that we've come up against a couple of times, and I don't know, I always try to make it clear that these are decision support tools, and no one's trying to remove the human component from the healthcare application. So I think that if I could change anything, it would be to get buy-in from clinicians, to see AI as a tool that can help them make better decisions and not something that's trying to replace the extremely valuable things they do.
Dan: Yeah, that's one of the top three worst stories in AI, which is AI [crosstalk], that AI is going to take all the jobs, and AI is always biased and evil, right? All these things are addressable, and some of them are just outright Frankenstein myths. So I'd probably change that one too, so. [laughter]
Benjamin: Definitely. Just trying to meet some unmet needs and find better outcomes faster is my whole goal.
Dan: Awesome. Well, thanks so much for your time today. It's been a fantastic talk. And we really appreciate having you on the show.