AI and machine learning are a hot topic in engineering and tech right now. It feels like every organization is trying to work out how to get its data to work for it.

The reality is that it’s hard to do, especially at a meaningful scale that will truly impact the business.

In this session, you'll hear from Raj Venkataramani, Chief Architect of AI & Analytics at Cognizant, on the top challenges customers face as they roll out machine learning practices and operationalize them at scale.

Dan: Hello, everyone, and welcome to the MLOps Innovator Series. I'm Daniel Jeffries, your host. I'm the Chief Technical Evangelist at Pachyderm and also the managing director of the AI Infrastructure Alliance. And I am here with my good friend, Raj. He's a chief architect for AI and analytics at Cognizant. Welcome to the show, my friend.

Dan: I'm super happy to have you on. We've had a bunch of amazing conversations in private over the last year about AI and machine learning, and I'm really keen to get that out there so everyone else can benefit from our shared knowledge. Maybe tell everyone a little bit about yourself: where you've come from, how you got to where you are, and what your role is now.

Raj: Sure. Okay. So basically, I come from India, where I did my engineering. And I've pretty much been doing a lot of development globally in different countries: the US, Singapore. Wherever there was technology, I went there. Currently, I work as chief architect at Cognizant in London, leading a team of experts in artificial intelligence and analytics, providing solutions to our clients in the UK, continental Europe, and also what we call global growth markets across the world. So that's what I do right now.

Dan: Fantastic. So let's just jump right into it and talk about some of the fun things that you and I are seeing in the space. It's an incredible space. It feels like it's developing in new ways every single week. There's some new breakthrough, another company coming to the forefront. But it is still an early space, right? It's still very much in the early-adopter phase, despite all the money pouring into it. AI is filtering into almost every aspect of our lives. But it's not like you can go buy the O'Reilly book on the perfect setup for computer vision, or like every company knows exactly which software to buy, the way nobody got fired for buying VMware back in the virtualization days. We're not quite at that level of maturity. But what do you see as some of the most mature solutions in the AI space today?

Raj: Good one. I think you're absolutely right. I think AI is in a very, very nascent stage. And I think that it'll remain that way for many years to come, right, because of the nature of the beast. But a few areas, I think, are fairly well proven and have laid the bedrock, a good foundation for AI in industry. Take, for example, the financial crime and fraud detection space in the financial services world. Practically every financial services firm is operating on that. It's fairly mature. It's not fully mature, and it's continuing to grow, but you have a very proven track record there. You also have good solutions in the retail space. Today, you have a very good understanding of customer behaviors and profiles, what products to upsell and cross-sell, and what the best actions to take are based on their behaviors, buying patterns, etc. It's all driven by AI. And that's influencing each one of our lives directly, right?

You also see a lot of action in the call-center space. There are a number of call-center products [inaudible] natural language processing and also all the proactive addressing of customer needs [inaudible] happening. That is, again, another space which is quite well developed. Then there is the life sciences space, I would think, with clinical process automation, and process automation in general across industries. Even complex areas like clinical process automation have been influenced very heavily by artificial intelligence. And last but not least is manufacturing, where you see asset management being all powered by AI. You have very good ways of tracking assets, understanding asset conditions and failures, preventing those, predicting those behaviors, etc. So I just picked a few that have come up at the top of the list that we have discussed before, but there are a number of other areas where it's very mature. But to your point, Dan, none of these are in their cookie-cutter stage. They are all evolving. They have been there for years, but they're still continuing to evolve.

Dan: Yeah. I think people forget that-- I mean, AlexNet is really only 2012, right? And you look at a platform like Spark, for instance, and its development started even before that. So the tools that deal with this whole new branch are just starting to come online, right? And there are a number of parts of the stack that just don't quite fit a traditional development environment. There's no analog to training, or to a lot of the data ingestion, cleaning, and labeling phases, or to synthetic data. You just generally don't see those kinds of things in traditional coding. And so the tools are still developing along the way. And you and I have talked about this a number of times. We've seen the big FAANG companies roll their own software. But it's just starting to mature in the rest of the industry, where you see lots of companies coming out and solving a little piece of the stack, right? But there's not a clear leader in a lot of these spaces. And that's still very much a developing space.

And you named a bunch of things like healthcare. You look at something like Moderna, who's got machine learning in every step of their process from clinical engineering to drug discovery, right? And in other parts of healthcare, we're starting to see breakthroughs like you mentioned. Retail certainly has been very advanced in this space for a long time. Who do you see, though, that needs the most development and is the furthest away from enterprise readiness, both from a vertical standpoint and from a tooling standpoint? What do you think is really going to need a number of years to catch up?

Raj: Also, many of your real-life interactions with AI are much simpler. So in any common industry, you have the [inaudible] in a big way. You also need to do a lot of edge analytics, edge AI, because most of the intelligence has to happen on the edges, not on the [inaudible], so it's not in the central cloud server. [inaudible] are cloud-centric. And that shift has to happen. So again, you see that it's not really taking off in a big way, right? So you see some of these as the areas where things are very, very far away. If you want to go to the farthest space-- and you can't talk about it without mentioning quantum machines, I think, which are probably two decades away from really happening. But even in the areas where the technology is already available, like edge computing and other [inaudible], we don't have the enterprise-class adoption, the enterprise-class AI in place.

Dan: Yeah, it's interesting that you mentioned edge devices, because I saw an article the other day saying that, for the first time, the vast majority of smartphones are going to have dedicated AI chips. Apple's had one for a number of years, and it looks like the new Google Pixel 6 will have one. It's really a shift to have these kinds of dedicated processors, and that's only going to accelerate in the coming years. But you're right, the infrastructure isn't fully built yet at a number of levels, both the hardware level and the software level. And I think it's interesting that you pointed out that there's not necessarily a specific laggard, right? Everybody is developing at the same time and trying to get their footing in this space. But do you think there are companies that are the most cutting-edge? You could probably pick a number of different companies. But are there any that come to mind as doing some of the most cutting-edge work in AI today, and why?

Raj: Yeah, I should qualify what I say with the fact that it's my personal opinion rather than a Cognizant opinion, to be fair. So I would think that all the hyperscalers, the cloud vendors, are today leading the [story?] on AI, right? They are trying to dominate the AI [inaudible], right? So they are becoming the Facebooks and Twitters of the social media space, right? Of course, FB and Twitter are also AI companies, and they are dominating the landscape in one sense. So that's the cloud vendors like Azure, AWS, GCP, etc. But I don't think it's just them, right? You see a lot of product vendors who are creating very proprietary solutions in spaces like financial crime, asset management, etc. And these product platforms are becoming highly intelligent. They have deep insights to offer, they are able to do product analytics, and they are able to prescribe actions which would have taken weeks with professional assistance. And they are able to do it in a matter of seconds, and they are able to do it in [inaudible]. So you see that shift.

But I think the real players today, the ones driving the innovation, are two kinds of companies. One is boutique firms, which are very focused on providing very cutting-edge, narrow, domain-focused solutions in areas like computer vision, natural language processing, etc. For example, in London you have the Cambridge Image Analytic Center [inaudible] Cognizant is a part of, right? And I lead that initiative. There are a number of boutique firms which are taking very narrow problems around computer vision and others. For example, identifying a defective nerve, right, along with lengths of nerves in the human brain, those kinds of problems. And they're actually solving that, and they are the cutting-edge ones. Then you have the infrastructure companies, right? I think, again, of the AI Infrastructure Alliance, which you are leading and which we are glad to be part of, right? They understand the need for building a true innovation-class AI platform, and they are addressing specific gaps, be it in the area of synthetic data, labeling, feature engineering, data version control, data monitoring, etc. So there are a number of these problems in the cloud platforms, or gaps in the industry, that these AI infrastructure companies are really filling, right? So to sum it up, you have the cloud vendors, you have the product vendors, and then the boutique firms, which are very focused on computer vision, NLP, and all those technologies, solving very specific domain problems. And then the AI Infrastructure Alliance companies, which are taking some specific problems with AI infrastructure and solving those problems. So these are the ones who are driving the innovation, as I see it.

Dan: There's a lot to unpack there, and that's really awesome. I actually had an observation recently that I think you'll probably share, which is that we've reached a sort of tipping point. It dawned on me the other day that I'm starting to see a number of companies whose entire business model comes from artificial intelligence and who, in fact, would not be able to exist without artificial intelligence. To me, that was a huge turning point when I was thinking about it. I don't remember which company it was, but I remember seeing one of the founders post on LinkedIn, and a light bulb just went off: it was generating photorealistic models for clothing catalogs. And what was interesting about it was that generally, you go hire a number of different models, right? You might hire different body shapes and sizes, different ethnicities, whatever you want to showcase. But you're never going to photograph all of those models across 5,000 sets of clothes. But now with artificial intelligence, you can say, "Great, I want this range of models." You're probably still going to hire a set of models for your hero shots, right? Your big ads on the front page. But then if you want to show every piece of clothing in your catalog on any type of body, any ethnicity, so anybody who's coming to your site can visualize what it looks like, that's a total AI business model. And so I agree that there's a lot of innovation coming up from there, right, where they're doing stuff you'd never be able to do without AI.

And then the second part you mentioned was the AI infrastructure. Let's dive into that just a little bit more, right? One of the things we talk about at the AI Infrastructure Alliance is that a lot of folks say, "Great, I just want this all-in-one solution." I see this with Cognizant, and I see this with other SIs that I have the privilege to talk to, where initially the customer comes in and says, "I want the all-in-one solution that does everything," because they're used to a fully developed ecosystem, right, one that's been in the works for 30, 40, 50 years and has matured so much that you can get a platform, or two or three platforms, that will do pretty much everything you can imagine. Whereas my sense in the AI Infrastructure Alliance, the AIIA for short, is that no matter what you buy today, even if you're buying Vertex, SageMaker, Databricks, or whatever, they may do a huge chunk of what you want, but maybe they fall apart with unstructured data, or maybe they don't have a robust synthetic data platform, right, and you're going to need that specifically to augment your dataset. Or their labeling system is kind of rudimentary. I always say the all-in-one solutions only exist in the minds of marketers today and not in reality. What do you think are some of the big pieces that are missing in these all-in-one solutions and that are important for people?

Raj: I think if you really look at-- take something like management of models, right, and datasets and APIs. We have talked about this before, right? There is no single central solution to manage these assets, so you have to manage them as separate entities. Whereas in the AI world, the data, the models, and the functions [inaudible] code which does the processing on the data are all interrelated, and the feature sets are all integrated, so you need a centralized way of managing that. You need to be able to trace the model output all the way back to the models which produced it, the data that led to that, and the features that [managed that?]. And you don't have a centralized way of doing that. In other areas, you're dealing with a lot of unstructured data. And today, if you look at structured data volumes versus unstructured data volumes, the unstructured data volumes are growing. And there are no standard solutions out there which can actually provide the right kind of metadata so you can search through this data appropriately, much less summarize what's there, create a knowledge graph out of it, or provide some insights, etc. So you have to custom-develop a lot of these solutions.

While DevOps, MLOps, DataOps, etc., have all matured, we still don't have a convergence of these tools. We still don't know what the best bet is. Every cloud vendor is trying to push their particular platform as the solution, right, as the way of doing things. But often clients are in a multi-cloud ecosystem, and they are also in a hybrid cloud ecosystem. So there are no universally acceptable solutions. So you have these problems. And apart from this, there's also a big challenge, right: who is the target audience for AI and ML solutions? In traditional software, it used to be developers and engineers who were doing this, but with AI, we are having to do the [inaudible] to the business and the data scientists so that they can actually build the solutions. So the kind of infrastructure support they need is far different. They need a lot more automation, a lot more low-code/no-code environments, click-and-play types of solutions. And I think the industry as a whole is far away from creating that kind of mature ecosystem for doing [inaudible].
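
The traceability Raj describes, following a model output back to the model, features, and data behind it, can be illustrated with a small sketch. This is a minimal, hypothetical in-memory registry, not any vendor's API: the names `LineageStore`, `register`, `trace`, and `content_id` are invented for illustration, and short content hashes stand in for real version identifiers.

```python
import hashlib
import json
from dataclasses import dataclass, field


def content_id(payload: dict) -> str:
    """Derive a stable identifier from an artifact's description."""
    blob = json.dumps(payload, sort_keys=True).encode("utf-8")
    return hashlib.sha256(blob).hexdigest()[:12]


@dataclass
class LineageStore:
    """Hypothetical registry: artifact id -> (kind, description, parent ids)."""
    records: dict = field(default_factory=dict)

    def register(self, kind: str, description: dict, parents: tuple = ()) -> str:
        artifact_id = content_id(
            {"kind": kind, "desc": description, "parents": sorted(parents)}
        )
        self.records[artifact_id] = (kind, description, tuple(parents))
        return artifact_id

    def trace(self, artifact_id: str) -> list:
        """Walk parent links and return every upstream (kind, description)."""
        kind, description, parents = self.records[artifact_id]
        upstream = [(kind, description)]
        for parent in parents:
            upstream.extend(self.trace(parent))
        return upstream


# Illustrative artifacts only; the names and values are made up.
store = LineageStore()
dataset = store.register("dataset", {"name": "transactions", "version": 3})
features = store.register("features", {"columns": ["amount", "country"]}, parents=(dataset,))
model = store.register("model", {"algo": "gradient_boosting"}, parents=(features,))
prediction = store.register("prediction", {"score": 0.91}, parents=(model,))

lineage = store.trace(prediction)
```

Calling `store.trace(prediction)` walks from the prediction through the model and feature set to the dataset. A production system would persist these records and tie them to real data and model versions, which is the centralized piece Raj says is missing.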

Dan: That's very interesting. And actually, one of the big pushes now at Pachyderm, where I'm at, is unstructured data, because we've realized that it's been one of our gigantic sweet spots. The vast majority of our customers fall into that. And when I started to look across the industry, I realized 80% of the data in the enterprise is unstructured, but 80% of the solutions are mostly around structured data. And then they're trying to bolt it on, right, in a very unnatural way, for instance by shoving it into a database. If you're a special effects company doing machine learning, right-- I mean, there were examples recently of Luke Skywalker's face being done with a deepfake in the latest Mandalorian, and they ended up hiring the fellow to do it. You're not going to shove a gigantic high-resolution video into a database table and expect it to work, or high-resolution satellite imagery that's 500 gigabytes for each individual satellite, to do object recognition. Or I saw a company recently, Flawless AI, that was doing deepfakes for lip-synching, so that they would have a lip-synch of the actor, and then they would also use style transfer so that it sounded like the original actor, so the mouth moves perfectly. Again, you're not going to shove that video into a database. Why do you think so many companies have struggled with unstructured data? Is it just that we've known how to deal with structured data for so long? In other words, databases are very mature technologies at this point, and we've just never been able to process the rest. Or do you think there's something else at work that has caused companies to struggle with unstructured data?

Raj: I think one is that culture eats everything for breakfast, right? So that's the fundamental thing. Basically, everybody is used to dealing with structured data. So the way at least most enterprises are still approaching the problem is to treat it as a structured data problem. Even if there's unstructured data, they want to convert it to structured data, extract some insights, and process it as structured data at some point in the data [check?]. But the reality is, as you mentioned, that you get terabytes or petabytes of unstructured data, whether it's videos or call center logs or records and all those. And these fundamentally cannot be structured, because they don't have a particular structure in the traditional relational or document sense, or any of those, right? And you need to start drawing the insights, and you need to have ways to attach some metadata to it and to have some version control over it, and to be able to run your AI agents to draw insights from this data and present them to the users. Because nobody is going to look at this data anyway, right, if you don't have machines to look at it. And that's what the-- because [inaudible], right? So we're getting around it by doing a few things.

One is to convert all unstructured data into files: just store them as files, search them, have some kind of a map around those files, and manage them. That's not very efficient, because you can't deal with data of that volume like that. Or you go for very expensive databases for managing specific types of video formats, speech, and others, and search through that. Then what you're doing is taking some real-life data and converting it into a format which a particular product can use and manage. So there is no natural way. I don't think there's a simple, easy solution for managing this. We have talked about Pachyderm, I think, and how it manages version control and other things for this, right? I'm sure Pachyderm will also evolve as it goes. But one of the things is that it tries to treat unstructured data in its natural form, use the underlying file systems, and build some layers of intelligence around it, right, so you can actually process that. So you need those kinds of solutions. So organizations struggle with that because there are no real solutions out there that have [inaudible] to manage all this kind of unstructured data. And that's, as you said, 80% of the data that's going to feed your ML model. So you need to have a mechanism to process it. That [inaudible] where the big challenge is, and where a lot of thinking has to go in from an AI infrastructure point of view, because data, especially unstructured data, is the fuel for all your machine learning models, right?
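
The idea of keeping unstructured data in its natural form and layering versioning and metadata on top can be sketched in a few lines. The following is an illustrative toy, not a description of how Pachyderm or any other product is implemented: `FileVersionStore` and its methods are hypothetical names, and an in-memory dictionary stands in for files on a real file system or object store.

```python
import hashlib


class FileVersionStore:
    """Hypothetical store: raw bytes kept as-is, addressed by content hash."""

    def __init__(self):
        self.blobs = {}    # content hash -> raw bytes (stand-in for files on disk)
        self.commits = []  # each commit: {path: (content hash, metadata)}

    def commit(self, files: dict) -> int:
        """Record a new version; `files` maps path -> (bytes, metadata dict)."""
        snapshot = dict(self.commits[-1]) if self.commits else {}
        for path, (data, metadata) in files.items():
            digest = hashlib.sha256(data).hexdigest()
            self.blobs.setdefault(digest, data)  # identical content is stored once
            snapshot[path] = (digest, metadata)
        self.commits.append(snapshot)
        return len(self.commits) - 1

    def read(self, version: int, path: str) -> bytes:
        """Return the bytes a path held at a given version."""
        digest, _ = self.commits[version][path]
        return self.blobs[digest]

    def search(self, version: int, **wanted) -> list:
        """Return paths whose metadata matches every requested key/value."""
        return sorted(
            path
            for path, (_, metadata) in self.commits[version].items()
            if all(metadata.get(key) == value for key, value in wanted.items())
        )


# Placeholder byte strings stand in for real audio and image files.
store = FileVersionStore()
v0 = store.commit({"calls/001.wav": (b"audio-bytes-a", {"kind": "audio", "lang": "en"})})
v1 = store.commit({
    "calls/001.wav": (b"audio-bytes-b", {"kind": "audio", "lang": "en"}),
    "scans/xray.png": (b"image-bytes", {"kind": "image"}),
})
```

Each commit is a snapshot of paths to content hashes, so an earlier version of a file stays readable after it is overwritten, and the metadata makes the files searchable without forcing them into database tables.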

Dan: Yeah. That's a good way to put it: the fuel and the food for all of it is data. So maybe let's dig in a little bit to Cognizant and some of the things your team is working on. We've talked a lot about the machine learning lifecycle, and your consulting teams and your resident-level engineers are helping companies figure out their machine learning infrastructure, their pipelining, their modeling, and all that. Where do you find yourselves spending most of the time in the machine learning lifecycle? Is it all over it, or are there a few places you tend to focus on more than anything else?

Raj: Okay. So I think it's a question of where we spend the time versus where we want to spend the time, right?

Dan: That's a good distinction, actually. Yeah.

Raj: Yeah. There's a clear distinction. So where do we spend most of the time today? It's on the data, right? Getting the right data foundation, getting the right data pipelines set up, the [engagement?] flow set up, before we get actual solutions launched at enterprise scale for our customers. Where we want to be spending the time is getting to their business problem and actually solving it, right? Build the [inaudible] is more and straight away, okay, right? And know that it is enterprise-class. So the way I see this job, or at least the way I would like to define this job, is freeing the data scientist from the clutter of data engineering and AI infrastructure, right? Give them a proper AI infrastructure so that the data scientist and the business can actually work with the real business problems and unlock the value. At the end of the day, AI has moved far away from the glamor [radius?], right? It's no longer about glamor. You don't do it for glamor. Now it's because you want to actually address some top-line issues, some bottom-line issues. You want to [inaudible] risks of the business. So you are actually using AI to solve some real problems and create the differentiation. You mentioned companies which purely cannot survive without AI. And that's going to be strength of the [inaudible] for practically every AI organization. If there is no intelligence embedded across their processes, records, and all that, they cannot survive.

And how do you do that? You cannot do it with one [inaudible] use cases, right? Or even 20 use cases. You need to develop hundreds of use cases. Basically, you need to do it at scale across the entire enterprise. And to do that, you need to be able to develop very fast, because you can't be spending huge amounts of money experimenting with the [inaudible] solutions. You need to be able to experiment very quickly, know that it works, and when it works, deploy that solution fast to the market. So we are trying to get to the point where we can create at enterprise scale, right, and create the right kind of infrastructure for them, so that they can have a functional approach towards identifying use cases and implementing them at enterprise scale. That's where we need to go. And that's where our journey is, right? Getting to that level of maturity.

Dan: So that's interesting. So there's where you want to be and where you are, right? And it feels like your teams, and actually a lot of teams, tend to be dealing with the plumbing, the unsexy part of it, right? The data engineering, if you will: navigating the RBAC maze, pulling the data in, transforming and changing it, getting it out of various silos, and getting it into people's hands. And what you want to be doing is solving the problems. And I think you're right, we're going to be there for a number of years. And that, actually, is interesting, because you had put out an AI-readiness pyramid that shows the various stages of a team's maturity in AI. The bottom of the pyramid is just hiring a lone data scientist with their laptop, poking around a few problems, and maybe trying to get a model into production. The top is fully automated, with all kinds of testing and alerting, and being able to put hundreds of thousands of models into production and understand what they're doing. Where are you finding most companies in that pyramid, right? Where are they on that journey?

Raj: I think most of them-- and I think [inaudible] and if I remember [inaudible] it's a five-level pyramid, right? And most of them are at level two or three, right, where they have different processes. They have automated stuff. Because that's something every other enterprise [inaudible] have worked, and they have been doing this in the software engineering world, and they have just replicated that experience in the AI world. But what they're really struggling with is monitoring the models and responding to them, right? Most organizations eventually see that when the models regress, they have to fire it off to a data scientist. And then the model is taken out of production, and you have to deal with it, which means there's downtime, or you need to have multiple models which can take the load [inaudible]. So they have to get to a point where they can achieve Six Sigma-level availability for AI models, so that models don't go down at any point in time. And if they do regress, you can actually respond to it very, very quickly.

I think they're trying to shift to that level of confidence which business needs, right, to deploy AI models. And also, very, very few are at a point where they can actually take a model, explain it, know that it's behaving in an ethical way, know that they can trace it back, create that kind of audit trail, and do all that very, very confidently. And that level of confidence is required for mission-critical applications, right? Whether it is a transport solution or a manufacturing solution, where you know your mistakes could cost lives and have a huge impact on the business as well. Or it could be in life sciences and other areas where errors could be very expensive to fix, right? So that is where we are seeing the customers moving, right? From basic automation to getting that level of intelligence about how the machine learning models work and how the whole pipeline operates. That's where we're trying to move the needle. And that's not easy, right? Because each industry is very different, each has got its own complexities, and we try to solve those problems in a very unique way.
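
One way to picture the monitoring and fast response Raj is asking for is a router that watches the primary model's recent accuracy and fails over to a standby model when it regresses. The sketch below is a simplified assumption of how that could look: `MonitoredRouter`, the window size, the accuracy threshold, and the two toy threshold "models" are all hypothetical, and real systems would also alert someone and handle delayed labels.

```python
from collections import deque


class MonitoredRouter:
    """Serve the primary model until its rolling accuracy drops, then fail over."""

    def __init__(self, primary, fallback, window=5, min_accuracy=0.6):
        self.primary = primary
        self.fallback = fallback
        self.min_accuracy = min_accuracy
        self.outcomes = deque(maxlen=window)  # 1 = primary was right, 0 = wrong
        self.using_fallback = False

    def predict(self, x):
        model = self.fallback if self.using_fallback else self.primary
        return model(x)

    def record_feedback(self, x, truth):
        """Score the primary on labelled feedback and switch over if it regresses."""
        self.outcomes.append(1 if self.primary(x) == truth else 0)
        window_full = len(self.outcomes) == self.outcomes.maxlen
        accuracy = sum(self.outcomes) / len(self.outcomes)
        if window_full and accuracy < self.min_accuracy:
            self.using_fallback = True


def stale_model(x):
    """Toy primary whose threshold no longer matches the data."""
    return x > 10


def standby_model(x):
    """Toy fallback that still matches the data."""
    return x > 5


router = MonitoredRouter(stale_model, standby_model, window=5, min_accuracy=0.6)
before = router.predict(7)  # served by the primary model

for value in [6, 7, 8, 9, 12]:
    router.record_feedback(value, truth=value > 5)

after = router.predict(7)  # served by the fallback once the primary has regressed
```

Because the fallback takes over automatically, the service keeps answering while someone investigates the regressed model, instead of the model simply being pulled from production.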

Raj: I think most of them. Most [inaudible] then, right? If you take, for example, something like accident management, right? Accidents are one of the biggest killers. They kill more people than practically any disease and maim more people than any disease does. And practically everybody has to drive, which puts them all at risk, really, right? So it's not even a choice. So do we have robust solutions? They are evolving, but I don't think you have something completely reliable. You still haven't reached a point where you can say, "There are no more accidents on the road," right? So zero accidents. Even 50% of what's happening is far, far away from what we can achieve today. And I think that's true not just for the transport sector, but for the manufacturing segment, life sciences, and all that. We don't have that level of robustness. But are we completely in the dark? I don't think so, because there are things we can learn from financial services. While the impact in financial services could be high, people do take risks, right? If you take the financial services world, I think there are major investment banks and others that have deployed financial crime and fraud solutions. They're able to detect fraud in real time, have models which can understand the customer behavior, and have multiple models work in parallel to take different decisions, compare decisions in real time, and arrive at a consensus, right? So they're able to evolve a consensus and decide what actions to take, make certain prescriptions, and so on, right? So you see that happen. So I think that's a principle that can be applied, right?

As edge analytics grows and edge AI grows, we'll be able to take multiple models which can actually make decisions on the road, on the manufacturing shop floor, in the life sciences space, etc., and compare them, right? So you have that principle, at least, that we can take and apply in other industries, because we are trying it out in the financial services space. And you can extrapolate that, right? But how do you get there? I think it's not just about the process and technology, right? It's about understanding the domain and the implications of the problem, and creating very custom solutions that will address the domain-specific problem, right? Which is why I believe you cannot have general off-the-shelf solutions and hope that you'll be able to solve your industry-specific problems. While you have the general off-the-shelf solutions to accelerate your journey and get there quickly, you need some very custom solutions to address your specific problems. So that's what we are basically seeing with our clients, and that's what we are also advising our clients. I think many of the clients are aware that this is a journey they have to take over the next 3 years, 5 years, 10 years, probably, right, to get to a point where they have that level of maturity. So it's about solving very, very locally, right, for a particular industry and a particular organization, based on that context.
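
The consensus principle Raj borrows from fraud detection, several models each taking a decision and the system acting only when they agree, can be shown with a small sketch. Everything here is an assumption made for illustration: the three rule-based "models", the field names, and the 0.75 agreement threshold are invented, and the models run one after another rather than in parallel as a real-time system would.

```python
from collections import Counter


def consensus(decisions, min_agreement=0.75):
    """Return the majority decision, or "review" when agreement is too low."""
    counts = Counter(decisions)
    decision, votes = counts.most_common(1)[0]
    if votes / len(decisions) >= min_agreement:
        return decision
    return "review"


# Three toy "models"; real ones would be trained, not hand-written rules.
def amount_model(txn):
    return "block" if txn["amount"] > 5000 else "allow"


def geo_model(txn):
    return "block" if txn["country"] != txn["home_country"] else "allow"


def velocity_model(txn):
    return "block" if txn["txns_last_hour"] > 10 else "allow"


MODELS = (amount_model, geo_model, velocity_model)


def decide(txn):
    """Collect every model's decision and reduce them to a single action."""
    return consensus([model(txn) for model in MODELS])


suspicious = {"amount": 9000, "country": "FR", "home_country": "GB", "txns_last_hour": 14}
routine = {"amount": 40, "country": "GB", "home_country": "GB", "txns_last_hour": 1}
ambiguous = {"amount": 9000, "country": "GB", "home_country": "GB", "txns_last_hour": 1}

results = [decide(suspicious), decide(routine), decide(ambiguous)]
```

With three voters and a 0.75 threshold, only unanimous decisions are acted on automatically, and a split vote is escalated for review. Lowering the threshold trades caution for automation, which is the kind of domain-specific tuning Raj says each industry has to do for itself.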

Dan: That makes a lot of sense. And what do you think is the biggest challenge for a team trying to get to that sort of level of maturity? But also really just what do you see as the biggest challenge for building an effective AI/ML team today? Where are you seeing those challenges?

Raj: I think two things, right? So firstly, the nature of the team. AI/ML is very, very cross-skilled, right? So you need to have business domain experts, data scientists, data engineers, [civil?] engineers all working together. They all come from different backgrounds. They have different perspectives, so you need to kind of create a common language. We could all be speaking German, French, English, and Italian in the room, and it's almost like that, so it's about getting that level of commonality. I think the other problem is we're still not at a state of maturity where the common [inaudible] technology, etc., is mature enough, right, so that we can get off to actually doing things very fast. We still have to spend a lot of time in planning because we have to spend a lot of time in creating a proper [inaudible] organization platform, right? So there is a lot of effort that needs to go into this. These are some of the blockers that we have to remove. I think the biggest challenge, if you ask me, is probably the talent space, right? To be able to build the solutions, you need to understand the domain, you need to understand AI modeling techniques, and be able to apply them. And that level of talent is very, very scarce in the market, and building the talent pool is the most important problem for almost any organization today. And we [crosstalk] resources to be honest.

Dan: Yeah. The talent pool, right? I think the biggest challenge in this space is that for a number of years, it didn't work and when you looked at the kind of universities, they were basically saying, "Don't worry about machine learning. It doesn't work," right? "Don't bother studying it." And there was a very small group of people who kind of kept the research and the torch alive for a period of time and now, you've got this kind of rush as we've reached a level where the datasets were there and processing power was there, the algorithms matured and all of a sudden, now you have this kind of rush of hiring all the people who'd--

Raj: Where you get the talent from. Yep.

Dan: Yeah. Right. And now, they've kind of filled the pool in the same way as if you lose a bunch of doctors and nurses it's very hard to get them back into the pipeline, right, because it's four to eight years of time. And so I think the same is sort of true here. We're going to be facing that for a number of years because I don't think there's any industry that's not going to be touched by AI, so I think we're going to need a lot of folks in that space. Let's maybe switch gears slightly and then see if we can kind of race to the end here. I don't think we'll get through everything you and I planned to talk about, because you and I could talk for a decade on this stuff. It's just, I think, infinitely fascinating to both of us and it's exciting to be in this, but let's talk about a few things. Do you see regulation and legislation having a big impact on the industry? There are a number of talks about regulating things and there have been a lot of proposals, for instance, in the EU and other places. Do you see it having any impact, or do you see it needing to kind of happen at an individual company level? Or do you see kind of an interplay between that sort of regulation and the different companies, and how do you see that playing out over the next 10 years?

Raj: I think regulation will have a huge impact and it always does, right? So basically, you need to decide who's responsible, really, at the end of the day when AI makes the decision. Often, you don't understand why AI takes a particular decision, why it recommends a particular action to be taken. And in cases where we empower the AI to take the action itself, who is really responsible? Is it the CFO or CRO of an organization, or is it the software company which developed the AI models and solutions? So that's the question, right? So it's still at the stage of being a [decision?] support system. Let's say you are managing [Network Rail?] or some kind of infrastructure like that, right? Let's say AI prescribes an action. The field engineer sees that an action guided more by the standards that he or she is used to is something they can trust better, right? Now, what do they do? Do they rely on AI or do they rely on the standards, right? The regulation will dictate that they rely on the standards. So there's a lot of work to do to shift that, to bring that [inaudible]. For that to happen there has to be increased confidence in AI, increased robustness in the AI solutions, and a lot of testing and proving of the solutions in various scenarios for this to be fully adopted. And also the political side of things, right, so how do you create the right kind of technology solutions to make sure AI behaves ethically, right, and is not violating any local laws of equality and other aspects? You have to put organizations in a [trust?], right? So creating that kind of infrastructure, the compliance infrastructure, is important for AI to be meaningfully applied at enterprise scale, right, so that's a big area. So yeah, to answer that in a short way, yes, regulations are going to be very important, and that's going to create a lot of local variations on every single AI solution [inaudible].

Dan: Local variance is going to be challenging. Can you use the output of these algorithms in this jurisdiction? It's going to be very geolocated. Some places are going to be very forward. I can even see companies skipping around legislation by taking it to another country to train it and taking the output there. So it's going to have to be a very global understanding, and it's going to be a push-pull over many, many years for society to come to terms with this in a proper way, and that's going to be a challenge for us. But thinking about kind of the future of that and then kind of helping us get to the final points of our conversation, I always like to talk about the future, where things are going, because it's very exciting to talk about where we are now, but it's important to kind of keep our eyes on the horizon, to know where things are going. And you and I have talked a lot about a canonical stack for artificial intelligence, right? It's a lot of the work in the AI Infrastructure Alliance. But a canonical stack, the example I always use is something like a LAMP stack, right, for web development, where you have a MEAN stack or a LAMP stack and then all of a sudden it's mature enough that it becomes kind of the default set of solutions, and you have all these other applications that can be built on top of it. So you don't have something like WordPress until you have the LAMP stack, and then WordPress doesn't become 50% of the small websites out there until you've reached a certain level of maturity, and then you get higher order problems being solved. I think both you and I agree that we're sort of a bit away from a kind of canonical stack, saying these are the definitive solutions for solving even particular use cases, but certainly not all use cases in machine learning. How long do you think that it's going to take us to get there? How long is it going to take for stacks to develop? In your mind, what would it look like? How will we recognize it when it's here?

Raj: Like what happened with LAMP, MEAN, and others, right, there will be multiple variations of the stack, the [inaudible] in the stack as you rightly called out, and there'll be multiple stacks, and these will be dependent on the kind of industry problem we are trying to solve. And you'll be looking at some common elements, things like the ops, right, the [StoreOps?], the DevOps, [inaudible], MLOps, and the FinOps side of things, where you can actually run the process in a more automated way, a more cost-efficient way, and a more mature way. That's going to be part of practically every stack solution. And then you need to have the data engineering components which are responsible for all the [integration?], handling, transformation, storage, [inaudible] management, etc. So you need to create that data foundation, which again, I think is common, but then you need to start thinking about what are the types of data we need to deal with for different types of industries, and the volumes of data, and how do we create the right kind of data stack that we can say is fit for that particular industry's purpose, with predefined models that go with that so you don't need to start from scratch for building any of these models. So that's the foundation.

So I think that'll take a few years to achieve. I think people have tried this for some time and it's not easy to solve. And it could take some time for us to get to a point where it is mature enough, it's acceptable enough. But even if you're able to reach 50 or 60 percent maturity in this, it's going to be a great accelerator for most instances, because today they have to start from scratch in setting up a platform. They have to take their traditional data lakes and data warehouses and transform them into [modern?] data platforms fit for AI. Then you need to think in terms of the model pipelines themselves. So you have the model pipeline management, and you need to think in terms of what are the tools for model management, deployment of the models, and tracking the model's performance while it's in production. And the key problem of AI is that the models will not behave as expected, and you need to be able to regress in a-- you need to kind of expect the [degradation?] of the models, or drifts, and respond to the drift in a very effective manner without impacting the service availability itself, right? I think that's the biggest challenge to deal with, so all the monitoring capabilities need to be added to that. And then you need to tie this monitoring all the way back to your data sets and feature sets and the feature engineering to support those [extra?]. And create a graph of what really happened, and create the audit trail so that you can justify why your [gene?] auditor already got regulatory [inaudible] you put in earlier. How do you actually prove it?
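The monitoring loop Raj outlines (detect drift, respond, and tie the result back to a feature for the audit trail) might look something like the sketch below. It uses the Population Stability Index, one common drift statistic that the conversation itself does not name. The 0.1 and 0.25 cut-offs are a widely used rule of thumb rather than a standard, and the function names, the record fields, and the `amount` feature are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def psi(expected: Sequence[float], actual: Sequence[float], bins: int = 10) -> float:
    """Population Stability Index of `actual` against the `expected` baseline."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # guard against a constant baseline

    def fractions(values: Sequence[float]) -> List[float]:
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1  # clip out-of-range values
        # A small floor keeps empty bins from producing log(0).
        return [max(c / len(values), 1e-6) for c in counts]

    e, a = fractions(expected), fractions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


def drift_action(score: float) -> str:
    """Map a PSI score to an action using rule-of-thumb thresholds (assumed)."""
    if score < 0.1:
        return "ok"
    if score < 0.25:
        return "investigate"
    return "retrain"


def audit_record(feature: str, expected: Sequence[float], actual: Sequence[float]) -> Dict[str, object]:
    """One audit-trail entry linking a drift decision back to a named feature."""
    score = psi(expected, actual)
    return {"feature": feature, "psi": score, "action": drift_action(score)}


if __name__ == "__main__":
    baseline = [i / 100 for i in range(100)]       # training-time distribution
    live = [0.9 + i / 1000 for i in range(100)]    # production values, shifted upward
    print(audit_record("amount", baseline, live))
```

In practice a record like this would be stored alongside the model version and data snapshot, so that an auditor can trace why a model was retrained or rolled back.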

So these things will take I would-- I think it's all evolving. I think there are bits and pieces of the solution in place. I think we can pull something together today with a lot of [inaudible], right? But for you to have something which you can fully trust, that's fully proven [inaudible], and it's [inaudible] not for a few kinds of use cases, but hundreds of use cases, because we're talking about scale, and [inaudible] scale of AI, not just a small set of AI experiments or a few kinds of use cases. We're talking about scale. And for them to get to that level of maturity I think we are at least two or three years away.

Dan: Yeah, I agree. And maybe even longer. I think one of the hardest parts of being a futurist is getting the exact timing right. It's like you can figure out what's going to happen, but we always tend-- I always find, for instance, that I tend to be either a little too quick, or I think it'll take too long, and it always sort of surprises me a bit. So I've kind of started most--

Raj: It's always an estimation. You try to be quick and then you realize that life happens and it takes longer than what it should, right? So yeah.

Dan: Yeah. I had this picture a long time ago that robotics, for instance, would be one of the first areas where AI would really have a huge impact. But now that AI has really started to mature more, I realize robotics is still going to be one of the hardest problems to solve, right? And I remember there's an old paradox in AI that says, "The things that are easy to us are really hard in AI, and the things that are hard to us are pretty easy." Right? So these kind of higher order problems. And I remember seeing a fellow sweeping the street the other day, and thinking-- I just stopped and I looked and I thought, "Boy, that job's not going away any time soon." I know a lot of people are worried about it, but that person is amazingly adept. And is doing a whole series of movements, from cleaning the broom, and picking it up, and dodging different people going past, and moving over cobblestones, and sweeping it away, and picking it up. There's a whole ballet, almost, of movements, and I think it's very challenging to do. So thinking about that, let's blast forward 10 years, and this is always kind of a fun one to sort of end on. Where do you see-- fast forward, it's 10 years later. How are people working with AI? What kind of tools are they using? Where's it had the biggest impact? What kinds of applications? What does it look like 10 years from now?

Raj: 10 years from now I think, what-- okay, probably people will be doing more of mathematics, or poetry, or art for fun. And AI will probably do most of what we do today, right? It should do practically almost everything that we do today. So many of the things that we do in our day-to-day jobs, in the [inaudible], right? The sweeping of the floor, I think that's a great problem to solve. I think those things will be all fully solved, right? So we don't have any more of the security guards in many of the buildings, right. It's all taken over by AI. Okay. So I think it will free up humans to do what humans should be doing, right, thinking, dreaming, and doing some creative work with mathematics, or art, or whatever, right? So I think AI should practically take on most of the enterprise.

So I think what would-- the way you look at it, already there's massive digitization and automation of day-to-day business processes and all that. They are straight-through processing today, right? Most of the processes are 99% straight-through, right. I don't think any of the enterprise processes need to have human intervention, right? The human intervention is only there for the humanizing of the-- probably we'll be looking at humanizing after 10 years, now that we've got the basic problems solved, right? But I think that there'll still be some tech [shift?] problems to solve, right, which is around how do you make AI sensitive to real needs, right? Where it's probably analytical in its nature, but human decision making is not purely analytical in nature, right? So we need to probably have a deeper understanding of the human mind, and AI needs to start reflecting those-- for enterprises to really relate to people, right? It's not going to be just that, right? And it depends on how fast AI goes. If it's 10 years, maybe it's 20, I don't know, right? But that's what I would see. But the short answer is, I don't know. I don't have a crystal ball.

Dan: I've made a living out of pretending I have a crystal ball, but the truth is nobody does. And it's a bit challenging to predict, but it's always fun to think about it. And companies trying to plan to build their infrastructure and hire the right teams do have to really think about what's going to be possible. And not just figure out what they need today but have an eye towards the future so that they can build for it. And I think it's important. We don't think about the future now, so I always kind of draw people to it. And I'll say one last thing, and this is a good question to just finish on. And I always like to end on a positive note. If you could tell young people who want to get into artificial intelligence and machine learning today one thing to study, and one area to focus on, what would you tell them to focus on?

Raj: Is that a loaded question? Why is it understood [inaudible]? Are you asking what I tell myself or?

Dan: Yeah. I'm asking you as a young person, what are you studying in other words.

Raj: [inaudible] I think if you've been in the industry for some years, right? You should be really looking at the next generation and saying, "Wow, what a super time to enter the industry," right? It's massive, right? It's exploding, it's growing like anything, right? And if anything, there are equally great advances in quantum computing and nanotechnology, and all that, right? And also cybersecurity itself is launching into a massive new space. And there are so many things that are really happening. So I would say two things, right, so [inaudible] young. Algorithms are going to be the heart of what we do. I think there'll always be new algorithms to develop because that's the heart of all the things that we do, right? So that ability to develop algorithms, efficient algorithms that actually solve the problem, is going to be the key. And I think we'd all be algorithm developers for years to come. So if people are good programmers and logical thinkers, that's where I think their focus should be. And I think the second thing is to look at the kind of problems that you want to solve, right, because at the end of the day, AI is the sort of solution to your business problems. So keeping that eye open for the problems that can be solved, what can be improved, is going to be the most important thing, that creative focus, right? How do you create new things? That's going to be the most important thing, and practically looking into the hundreds of problems that need to be solved. And keeping the open mind and creativity [in our lane?] is the most important thing, I would think, right, for the people who are getting into the AI field and all that.

Dan: Creativity is super underrated, and what I always say to young folks out there who are thinking about something is: go towards the things that you're passionate about. Be excited about those things. Because even if you have a job as a lead data scientist or machine learning engineer, and maybe your job is optimizing engagement or doing churn prediction, or whatever it is that already has a business value, but your passion is music or the arts or whatever, dig into that stuff at night, have fun with it, play around with it even if it's still cutting edge. Because in 10 years it won't be, and all the special effects companies and Hollywood and the music companies will be hiring you to do that. And I always think the most beautiful thing in life is the merger of something that you love and you like waking up to every day.

Raj: But I think also when you're passionate about an area, right, whatever it is - it could be art, it could be music, it could be mathematics, whatever, right? - you can apply AI to it, right? So I always wonder how-- I like to do some maths, right? I won't say I'm an expert at that but I do love algebraic geometry and some of these areas. So I always wonder, can you use automated reasoning for proving theorems, for checking theorems, right? When will it be mature enough that we can actually work alongside an AI agent doing mathematics, right? So you can have a lot of things to really apply [on the stick?]. So if you're passionate about some area you automatically see how you can actually apply AI in that space. And that's most important, right? And whatever you do, I think if you're passionate about it you can always come up with a good AI solution for that also.

Dan: Totally. We're already starting to see mathematical checkers and that's only going to explode in the next couple of years. So wherever your passion lies, go towards that. That's a fantastic place to end and I really appreciate you coming on and talking with us today, sharing your wisdom and insight. So thanks so much for being here. I appreciate it.

Raj: Thank you so much and it was a pleasure talking to you as always. Good luck with your event. Thank you so much.

Dan: Thank you. [music]