TDWI Articles

IoT and the ML Connection

The intersection of machine learning and IoT is creating a need for new ways of thinking about -- and understanding -- data, sensors, citizen data scientists, and a host of other issues.

In an increasingly turbulent technology environment, new ideas are often to be found at the intersection of things. In such cases, the contradictions between trends and possibilities appear in high relief.

Within the data center, one such intersection is the one between machine learning (ML) and the internet of things (IoT). ML is becoming essential in a growing array of process areas involving image recognition, natural language processing, forecasting, prediction, and process optimization.

At the same time, the IoT is creating an explosion in structured and unstructured data from a growing army of sensors capable of registering location, voices, faces, audio, temperature, sentiment, health, and the like. ML is evolving to the point where it can draw interesting patterns and inferences from these real-time data streams, make those results available to analysts, and embed them directly in business processes.

Sanford on Mixing It Up

The issues inherent in this intersection are being explored by Dr. Kenneth Sanford, a lead player with startup Dataiku. Dataiku's software is designed to give users ranging from data scientists to business analysts access to sophisticated analytics processes.

Unlike other self-service platforms, its focus is mainly on very large data sets residing in the cloud.

"One of the first places we are seeing IoT and ML coming together is in manufacturing," says Sanford. "Today, there is a lot of high-frequency, high-value production that benefits from incorporating more sensors to get early warning on positive or negative developments for near real-time decisions in countless areas such as quality control and re-jigging. ML makes it possible to put models directly in the process flow. One of our customers, for example, deploys as many as 12,000 models every week."
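To make the idea of an in-line early warning concrete, here is a minimal sketch in Python. It is not Dataiku's method and not a trained ML model; it is a simple rolling statistical check standing in for one. The function name `early_warnings` and the `window` and `threshold` parameters are hypothetical, chosen only for illustration.

```python
from collections import deque
from statistics import mean, pstdev


def early_warnings(values, window=5, threshold=3.0):
    """Return the indices of readings that look unusual.

    A reading is flagged when it lies more than `threshold` standard
    deviations from the mean of the `window` readings just before it.
    (Hypothetical helper; a production line would likely use a trained model.)
    """
    recent = deque(maxlen=window)  # holds only the most recent readings
    flagged = []
    for index, value in enumerate(values):
        if len(recent) == window:  # wait until a full window is available
            mu = mean(recent)
            sigma = pstdev(recent)
            # Skip perfectly flat windows to avoid comparing against zero spread.
            if sigma > 0 and abs(value - mu) > threshold * sigma:
                flagged.append(index)
        recent.append(value)
    return flagged
```

Calling `early_warnings(readings)` on a list of sensor values returns the positions that deserve a closer look, which a process could then route to an operator or a downstream model.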

Many industries are gearing up to take advantage of new opportunities in ML, but this revolution differs from the recent evolution in big data and analytics. Both the production of data and the need for it are huge, and both often occur in real time. Data must be stored, secured, and prepared for use. The IoT provides vast rivers of structured, semi-structured, and unstructured data, and ML brings a range of new problems, such as algorithm selection and modeling, to the fore. At the same time, corporate needs for security and data governance cannot be ignored.

"Everyone needs to be at least data aware," says Sanford. "Frontline managers need to be constantly reminded that there is data all around them -- and understand that this presents an opportunity to predict the future. Data awareness, combined with a healthy dose of causality, can help you to determine what data you need and how to proactively deploy sensors to collect it."

Data Levees for the Storm

The IoT generates huge volumes of largely repetitive data that need to be controlled. An increasing problem is determining what to keep and what must go. From the beginning of the big data era, data hoarding has been common because there was value in diversity. With the IoT, however, hoarding is both less feasible and less useful.

"It is becoming increasingly important to develop internal strategies about what data to keep and where it should be stored," says Sanford. "It is important to have a sampling strategy and for users to understand why you are storing data. Storing too much data of the wrong type can contribute to problems in governance and resource usage, as well as in security."
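One common way to put a sampling strategy into practice for repetitive sensor data is a deadband filter: store a reading only when it changes meaningfully, plus an occasional "heartbeat" so that gaps are not mistaken for outages. The sketch below is an illustration under stated assumptions, not something described by Sanford; the `Reading` type, the `deadband_sample` function, and the `min_change` and `max_gap_seconds` parameters are all hypothetical names.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Optional


@dataclass(frozen=True)
class Reading:
    timestamp: float  # seconds since an arbitrary start
    value: float      # sensor measurement, e.g. temperature


def deadband_sample(
    readings: Iterable[Reading],
    min_change: float,
    max_gap_seconds: float,
) -> Iterator[Reading]:
    """Yield only the readings worth storing.

    A reading is kept if it is the first one seen, if it differs from the
    last kept value by at least `min_change`, or if at least
    `max_gap_seconds` have passed since the last kept reading.
    """
    last_kept: Optional[Reading] = None
    for reading in readings:
        if (
            last_kept is None
            or abs(reading.value - last_kept.value) >= min_change
            or reading.timestamp - last_kept.timestamp >= max_gap_seconds
        ):
            last_kept = reading
            yield reading
```

The two thresholds make the "why are we storing this?" question explicit: `min_change` records what counts as a meaningful change, and `max_gap_seconds` records how stale the stored picture is allowed to become.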

Building for the Future

In addition to creating a need for new ways of thinking about data, sensors, and business possibilities, the intersection of ML and the IoT also creates a need for greater understanding of data, more "citizen data scientists," and more software capable of helping with the demands of an expanding mandate for data-supported action. Software such as Dataiku can help with this, but hiring a broader range of experts is also important -- particularly those focusing on areas such as physics and the use of time-series data. Centers of excellence (CoEs) need to be attuned to the demands of IoT data, and company management needs to understand the value of planning for a sensor-driven, smart-data future.

The convergence of ML and the IoT is already well underway and highly visible in areas such as autonomous vehicles and robotics. Demands from the IoT will help shape data strategy for years to come and will influence how AI and ML technologies are integrated into the workplace. Gaining an early understanding of these influences will help ensure that your company can cope with the next wave of innovation in this sector.

About the Author

Brian J. Dooley is an author, analyst, and journalist with more than 30 years' experience in analyzing and writing about trends in IT. He has written six books, numerous user manuals, hundreds of reports, and more than 1,000 magazine features. You can contact the author at bjdooley.query@yahoo.com.
