The Rock ‘n’ Roll of harnessing data

Seth Adler

Elvis was right – not taking an intelligent first step to automation will leave you caught in a trap


Do you truly understand the data you’re utilizing for intelligent automation?

If the answer is yes, stop reading. You’re doing fine.

But if there was even a moment of hesitation, you’re with the majority of corporate enterprise intelligent automation practitioners who don’t necessarily feel fully acquainted with their datasets.

As an incredulous Roger Waters has been asking us for the past 40 years, ‘How can you have any pudding if you don’t eat your meat?!’

We’ve all rushed into intelligent automation without truly considering how the data (meat) is being employed and how it relates to our goals. And we’re now finding that the ‘cheap investment’ required and ‘quick turnaround’ promised by RPA isn’t always resulting in a ‘happy path’ towards AI (pudding). To continue the rock god theme, we’re realizing all over again what Elvis Presley has been telling us since 1961: ‘only fools rush in.’

Old music references apply here because, really, intelligent automation is a retreading of old ground. Each industrial revolutionary step change has been about accomplishing a better understanding of the information available to you – turning that information into insights and then turning those insights into action. But it all starts with the data. And if you didn’t start with the data when you rushed in to RPA, it’s time to reboot.

In Episode 30 of the AiiA podcast, DVB Bank’s Ingo Zimny asked, “What are the opportunities? Who are the clients? What is the status of opportunities? How often did we fail, and how many opportunities went well in the end?” And he answered, “We had to figure out how to support the most crucial processes in that whole process chain. We had to build up the data pipeline of the bank.”

And so it goes for any corporate enterprise intelligent automation practitioner. What data do you have? How structured is that data? What are the business opportunities that come from structuring your unstructured data? Who are the clients that can be affected? And how can you affect the customer experience of the end user through constructing a system with which to process your data?

Intelligent automation often begins with an RPA implementation but the industry is in the process of learning the limitations of RPA. In Episode 23 of the AiiA podcast, Danske Bank’s Suzanne Skarrup noted, “I think where we today see some challenges with RPA are, for example, in areas with only semi-structured data. There are limitations there. There we will need some machine learning to help us to automate for it.”

Circle K’s Jorgen Lislerud agreed (Episode 10): “We have data. We have a lot of data. We have a lot of transactions. AI will for sure be a part of our future somehow.”

And Mads Anderson, who built the Robotics Center of Excellence that supports the entire city of Copenhagen, is applying the same mentality, telling us, “Pretty soon we will run out of processes having structured data, and then we're starting with machine learning and cognitive agents to feed the robots, making order in the data chaos. That's where we are starting to use the next level of technology, so that's going to be a very, very exciting journey.”

So as the industry learns the limitations of RPA, the next steps of machine learning and AI are becoming apparent. But no matter where you are on your intelligent automation journey, notes from the front lines continue to suggest that your intelligent automation plan is simply your data plan. And if your data plan didn’t get you through the initial RPA step, what chances do we have with AI?

Which is to say that ‘AI’, as a bright, shiny object for corporate practitioners, isn’t so bright and shiny. Instead, sleeves have been rolled up and hands have become dirty.

Prudential’s Global Head of AI, Michael Natusch, told us recently that “innovation is under-estimated in the long-term, but over-estimated in the short-term. I think AI falls in exactly that category. There have been major improvements in terms of algorithms and how we can work with data and how easily algorithms can be implemented and how easily they can be put to use in real world scenarios, but also there is an expectation that we can solve world hunger tomorrow. That's clearly not what's going to happen.”

Those who have engaged in an RPA POC, set up a COE and rolled out dozens and even hundreds of bots are suddenly struck by the fact that once you’re through your organization’s neat, pretty, structured data, there’s a cold, hard, unstructured brick wall.

“AI is nothing without data so we need lots of data as well as a specific kind of data,” said Natusch.

“It needs to be labeled data because we want to predict for instance, certain events. The performance of your prediction is better if you have some kind of idea as to what happened in the past, so you want to have some kind of data labels. For instance, if you're a credit card company and you want to predict fraud, then it's good to have in your database a flag of historical transactions that were fraudulent or likely to have been fraudulent. You need some kind of labeled data.”
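A quick way to see how much labeled history you actually have is to count it. The following is a minimal sketch in Python, assuming a hypothetical list of transaction records with an `is_fraud` flag that is `None` when nobody has reviewed the transaction; the field names and sample values are illustrative only and do not come from any of the companies quoted here.

```python
from collections import Counter

# Hypothetical transaction records; field names and values are assumptions.
transactions = [
    {"id": "t1", "amount": 25.00, "is_fraud": False},
    {"id": "t2", "amount": 980.00, "is_fraud": True},
    {"id": "t3", "amount": 12.50, "is_fraud": None},  # never reviewed: no label
    {"id": "t4", "amount": 430.00, "is_fraud": False},
]

def label_summary(records, label_field="is_fraud"):
    """Summarize how many records are labeled, unlabeled, and flagged as fraud."""
    counts = Counter(r.get(label_field) for r in records)
    unlabeled = counts.get(None, 0)
    return {
        "total": len(records),
        "labeled": len(records) - unlabeled,
        "unlabeled": unlabeled,
        "positive": counts.get(True, 0),
    }

summary = label_summary(transactions)
print(summary)
```

If the unlabeled count dominates, or there are almost no positive examples, that is a signal the labeling work Natusch describes has to come before any model building.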

How labeled and how structured is the data you’re using for that fifth, tenth, or one-hundredth process that you’re intelligently automating?

“You need to find a way to label that data. You need to find a way to go through, build some training datasets and then build models based on those training datasets,” adds Natusch.

So how many data scientists should you have on staff?

“Another way to go about this – again, they are not mutually exclusive – is to go and try and find a way to encourage your users to do the labeling themselves.”

That’s an extremely helpful piece of advice from an executive who has boldly placed AI in his job title as opposed to stopping short with an RPA or automation moniker.

Whether or not you have the means with which to ask your customers to label your data, do something with your data. And do it now. Case studies are proving that your actions are fraught with danger if you don’t.

Citi’s SVP, Reengineering Director, Kelly Switt put it bluntly: “You wouldn't take a five-year-old at a kindergarten and throw him into college. So if you don't have good data and if you don't understand what your business does, you're going to be wasting a lot of time and money trying to go straight into machine learning and cognitive AI or cognitive machine learning – whatever the new soup du jour is that people want to sell.”

Her approach is to take baby steps (or at least kindergarten steps) by creating her own solutions. In her words, what it comes down to is really understanding what the core commodities of your business are, how they operate, what data supports them, and making sure that you have at least some “cleansed” data around that.

“Even if you start with some basic rules processing, it's on the continuum,” she said.

“Starting with rules processing, starting to get smarter, having better data, running your analytics will only prepare you to be able to move into that journey. We've got places where we know are good candidates for natural language processing or machine learning that we're starting with rules because we're ‘dumb’ right now. And I can't possibly teach a system to be smarter than me, if I'm dumb – right?”
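To make “basic rules processing” concrete, here is a minimal sketch in Python of what a first step on that continuum might look like. The thresholds, field names and sample transactions are hypothetical, chosen only for illustration; this is not a description of Citi's approach.

```python
# Hypothetical thresholds and field names, chosen only for illustration.
AMOUNT_LIMIT = 500.0
HOME_COUNTRY = "US"

def rule_flags(txn):
    """Return the plain-language reasons a transaction looks suspicious."""
    reasons = []
    if txn["amount"] > AMOUNT_LIMIT:
        reasons.append("amount over limit")
    if txn["country"] != HOME_COUNTRY:
        reasons.append("outside home country")
    return reasons

transactions = [
    {"id": "t1", "amount": 25.0, "country": "US"},
    {"id": "t2", "amount": 980.0, "country": "US"},
    {"id": "t3", "amount": 700.0, "country": "FR"},
]

# Keep the rule output next to each record so a person can review it later
# and turn confirmed cases into labels for a future model.
review_queue = [
    {"id": t["id"], "reasons": rule_flags(t)}
    for t in transactions
    if rule_flags(t)
]
print(review_queue)
```

The point of the sketch is that every rule is something a person already understands and can explain, and the reviewed results become the cleaner, labeled data that a later machine learning step would need.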

This has led to some frustration, given that many within the industry have a ‘set it and forget it’ mindset. Plugging one device into another and hoping the decisions will be made for you is a fantasy. You have to teach it. And you can only teach someone something that you know. That, again, is why understanding the data is so crucial at this stage of the game.

The good news is that the problem is the solution. All companies can have a huge competitive advantage because they now have access to enormous amounts of data, which can be harnessed to create efficient machine learning algorithms. Your competitive advantage is your data.

In the very first AiiA podcast episode, IEEE’s Lee Coulter advised that people think about the nature of their data: “What kind of data do I have? Is it analog or digital? Is it structured or unstructured? Is it illuminated? Is it accessible? Is it affiliated? ... There's this notion of something called a GUID, which is a Globally Unique Identifier. And in order to convert a data set into a ‘data fabric,’ you must seed it with the components that allow for affiliation.”

So this data journey starts with simply understanding what you have in your data. But in order to understand what you have in your data, you have to understand how it’s affiliated with your ‘other data.’
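One way to picture affiliation is two datasets that share an identifier. The sketch below, in Python, seeds hypothetical customer records with a globally unique identifier from the standard `uuid` module and then checks which records in a second dataset can be linked back. The dataset names, fields and values are assumptions for illustration, not anything Coulter describes in detail.

```python
import uuid

# Hypothetical customer records; each is seeded once with a unique identifier.
customers = [{"name": "Acme Shipping"}, {"name": "Nordic Retail"}]
for customer in customers:
    customer["guid"] = str(uuid.uuid4())

guid_by_name = {c["name"]: c["guid"] for c in customers}

# A second, hypothetical dataset that refers to customers by that identifier.
opportunities = [
    {"customer_guid": guid_by_name["Acme Shipping"], "status": "won"},
    {"customer_guid": guid_by_name["Acme Shipping"], "status": "lost"},
    {"customer_guid": str(uuid.uuid4()), "status": "open"},  # no matching customer
]

def affiliate(customers, opportunities):
    """Link opportunities to customers by GUID and report the ones that cannot be linked."""
    by_guid = {c["guid"]: c for c in customers}
    linked, orphans = [], []
    for opp in opportunities:
        customer = by_guid.get(opp["customer_guid"])
        if customer is None:
            orphans.append(opp)
        else:
            linked.append({**opp, "customer_name": customer["name"]})
    return linked, orphans

linked, orphans = affiliate(customers, opportunities)
print(len(linked), "linked,", len(orphans), "orphaned")
```

The orphaned records are the interesting part: they are the data you hold but cannot yet relate to anything else.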

“We haven’t automated a thing here – we’re just investigating data,” Coulter said.

“When you have data, you chunk it into pieces. There's the development data set, there's the training data set and there's a test data set. And, oh, holy moly! If you get how you cut that data wrong, forget it!”

Understanding your data and how it’s affiliated before you bring in technology allows that technology to function as intended. As you look at each of those data sets, you may or may not have a good idea as to which elements in it are going to be relevant for machine learning to occur. And by that, we mean whether an algorithm can return a high-confidence prediction based on the training it has been fed.
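The three-way cut Coulter mentions can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the 70/15/15 proportions and the fixed seed are assumptions, not recommendations from the source, and real projects often need more careful cuts (for example, by time period or by customer) to avoid leaking information between the sets.

```python
import random

def split_dataset(records, train=0.7, dev=0.15, seed=42):
    """Shuffle a copy of the records and cut it into training, development and test sets."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the cut reproducible
    n_train = int(len(shuffled) * train)
    n_dev = int(len(shuffled) * dev)
    train_set = shuffled[:n_train]
    dev_set = shuffled[n_train:n_train + n_dev]
    test_set = shuffled[n_train + n_dev:]  # whatever remains
    return train_set, dev_set, test_set

records = list(range(100))  # stand-in for 100 hypothetical records
train_set, dev_set, test_set = split_dataset(records)
print(len(train_set), len(dev_set), len(test_set))
```

The key property is that no record lands in more than one set, so the test set stays a fair check on what the model learned from the training set.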

If you don’t know what you have, stop. Don’t start until you do.

Failed RPA implementations have shown us that RPA can’t happen without good data. Machine learning can only occur if you understand the data you are using to teach it.

So to paraphrase Sir Mick – if you don’t start understanding your data now, you ‘won’t get no satisfaction’ when it comes to achieving AI.
