How to Pick the Right Use Case for Machine Learning

By: Fero Labs

Two-thirds of industrial firms report not realizing value from their ML and AI investments. 

Why? They picked the wrong use case—or lacked the data to take advantage of the solution.

Before applying any “smart” solution, it’s essential to find the right way to use it. If you don’t already have a use case in mind, think about problems you already have in your factory, or top-line KPIs that you’re focused on improving. These can be anything from production issues, such as inconsistent yields or cycles that involve significant amounts of unplanned downtime, to more big-picture goals, like minimizing costs or greenhouse gas emissions.

Once you’ve identified your use case, there are two questions you need to ask yourself:

Question 1: Is my use case a good fit for ML?

After seeing customers apply Fero software to hundreds of use cases in the industrial sector, we’ve identified a few things high-ROI use cases have in common.

Say you want to minimize the time you spend stirring ingredients while still getting the best conversion and keeping total batch time under five hours. Doing so would increase your annual capacity by 25%; because demand is significantly higher than your production capacity, revenue would increase by about 25% as well. You have sensors that measure everything around this reactor, as well as carefully maintained records of all batches for the past three years. You also have seven sites around the world that manufacture the same product (or similar ones), where you plan to scale the solution if this pilot is a success.

This is a great use case for machine learning. It’s quantifiable and scalable: the same issue exists across many plants, so optimizing it would be a huge win. Most crucially, you have all the data and process knowledge needed to make this a success. Using the historical data from your records, the model can estimate how much stirring time is needed to get the best conversion and make recommendations accordingly. In a matter of minutes, you can get information that would take a human data scientist weeks to put together.
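To see why a use case like this counts as quantifiable, it helps to write the throughput arithmetic down. The sketch below assumes the reactor is the bottleneck and runs a fixed number of hours per year, so batches per year are inversely proportional to batch time. The 6.25-hour starting point is a hypothetical figure chosen so that a five-hour batch gives a 25% gain; the example above does not state the current batch time.

```python
def capacity_gain(current_batch_hours: float, target_batch_hours: float) -> float:
    """Fractional increase in batches per year when batch time shrinks.

    Assumes fixed annual operating hours and that the reactor is the
    bottleneck, so throughput scales as 1 / batch_time.
    """
    if current_batch_hours <= 0 or target_batch_hours <= 0:
        raise ValueError("batch times must be positive")
    return current_batch_hours / target_batch_hours - 1.0


# Hypothetical numbers: 6.25 h per batch today, 5 h after optimization.
gain = capacity_gain(6.25, 5.0)
print(f"Capacity gain: {gain:.0%}")
```

If you cannot fill in numbers like these for your own process, that is a sign the use case is not yet quantifiable.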

Say, on the other hand, you’re managing a food plant and want to cut down on raw ingredient waste throughout the process. When you walk around the factory, you see wasted food scraps, so you know there’s room for improvement, but you don’t have a good way of quantifying what you lose, because the shipping portion of the process only operates on weekdays. You have sensor data across the process, but have never extracted it for analysis. Your most senior engineer recently retired, taking a lot of process knowledge with them.

Instead of employing a machine learning solution right off the bat, it may be more useful to start with basic data visualization tools, such as Excel or an analytics dashboard, to see what data you have available and what value you can get from it. This process will identify your gaps, so you can get to a point where ML is useful.

Question 2: Is the data good enough for this use case to be successful with ML?

In order for any machine learning optimization to be successful, you need to have good data. Since machine learning algorithms learn from the data they’re supplied, your ROI is only going to be as high as the quality of the data you put in. 

Consider these three questions to assess the quality of your data:

What systems do you use to generate and record the relevant data? Automated systems are best. However, if they aren’t monitored, sensors can fail and record bad data. Manually recorded data, on the other hand, is prone to typos and other issues.
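A quick screening pass can surface both kinds of problem before any modeling starts. The following is a minimal sketch, not a complete data-quality tool: the function name, the plausible range, and the flat-run threshold are all assumptions you would replace with values that make sense for your own sensor. It flags missing readings, readings outside a plausible range (typical of typos or failed sensors), and long runs of identical values (a common symptom of a stuck sensor).

```python
def flag_bad_readings(values, low, high, max_flat_run=5):
    """Return sorted indices of readings that look suspicious.

    A reading is flagged if it is missing (None), outside [low, high],
    or part of a run of more than `max_flat_run` identical values.
    """
    flagged = set()

    # Missing or out-of-range readings.
    for i, v in enumerate(values):
        if v is None or not (low <= v <= high):
            flagged.add(i)

    # Runs of identical consecutive values.
    run_start = 0
    for i in range(1, len(values) + 1):
        run_ended = i == len(values) or values[i] != values[run_start]
        if run_ended:
            if values[run_start] is not None and i - run_start > max_flat_run:
                flagged.update(range(run_start, i))
            run_start = i

    return sorted(flagged)


# Hypothetical temperature readings in degrees Celsius.
readings = [20.1, 20.3, 999.0, None, 21.0, 21.0, 21.0, 21.0, 20.8]
suspicious = flag_bad_readings(readings, low=0.0, high=100.0, max_flat_run=3)
print(suspicious)
```

Flagged readings are candidates for review rather than automatic deletion, since a process that is genuinely steady can also produce flat runs.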

Multiple data systems are often used between the first stage of process sensors and the final step of quality measurements. Is there a way that you can link production times to specific measurements?
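One common way to make that link is through timestamps: summarize each sensor over a batch's production window, then join the summary to the quality result for the same batch. The sketch below assumes you can export batch start and end times, timestamped sensor readings, and a quality value keyed by batch ID. All names and numbers here are hypothetical.

```python
from bisect import bisect_left, bisect_right
from datetime import datetime
from statistics import mean


def mean_in_window(timestamps, values, start, end):
    """Mean of sensor values with start <= timestamp <= end.

    `timestamps` must be sorted ascending. Returns None if no readings
    fall inside the window.
    """
    lo = bisect_left(timestamps, start)
    hi = bisect_right(timestamps, end)
    window = values[lo:hi]
    return mean(window) if window else None


# Hypothetical hourly reactor temperatures from the process historian.
sensor_times = [datetime(2024, 1, 1, h) for h in range(12)]
sensor_temps = [60.0 + h for h in range(12)]

# Hypothetical batch records and lab results from two other systems.
batches = [
    {"batch_id": "B1", "start": datetime(2024, 1, 1, 0), "end": datetime(2024, 1, 1, 4)},
    {"batch_id": "B2", "start": datetime(2024, 1, 1, 6), "end": datetime(2024, 1, 1, 10)},
]
quality = {"B1": 0.91, "B2": 0.88}

# One row per batch: process summary plus the outcome to learn from.
rows = [
    {
        "batch_id": b["batch_id"],
        "mean_temp": mean_in_window(sensor_times, sensor_temps, b["start"], b["end"]),
        "conversion": quality.get(b["batch_id"]),
    }
    for b in batches
]
for row in rows:
    print(row)
```

If the systems do not share a batch ID or a reliable clock, no join of this kind is possible, and fixing that comes before any modeling.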

Do you measure all the relevant phenomena in your process? You may know that ambient temperatures are important, but do you record them? You can’t build a machine learning analysis on data you haven’t recorded. Machine learning can often learn complex multivariate relationships that serve as a proxy for missing inputs, but it’s always best to record those inputs directly.

How much data history do you have available? Sensors might take measurements multiple times a second, but production outcomes like quality might only be measured once a day. Machine learning models learn from outcomes, so the more outcomes you have access to, the better.

Events such as large-scale equipment changes or capital projects can invalidate past data. This means you might have much less usable data than you think. A good rule of thumb is that you should have at least six months’ worth of data, although this depends on how frequently the KPI is measured. If the KPI is measured every minute, you can get away with much less. Conversely, if the KPI is measured only once daily, even six months of data may not be enough.
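What matters is the number of usable outcomes, not the number of calendar days. As a rough illustration, the sketch below counts how many outcome measurements remain after discarding everything recorded before the most recent major change. The dates and the once-daily KPI are hypothetical.

```python
from datetime import date, timedelta


def usable_outcomes(outcome_dates, last_major_change):
    """Count outcomes recorded on or after the last major process change."""
    return sum(1 for d in outcome_dates if d >= last_major_change)


# Hypothetical: one quality measurement per day for about six months.
first_day = date(2024, 1, 1)
daily_outcomes = [first_day + timedelta(days=i) for i in range(180)]

# Hypothetical: a major equipment change on 1 April 2024.
remaining = usable_outcomes(daily_outcomes, date(2024, 4, 1))
print(f"{remaining} of {len(daily_outcomes)} daily outcomes are usable")

# For comparison, a KPI measured every minute over the same 180 days.
per_minute_count = 180 * 24 * 60
print(f"A per-minute KPI would give {per_minute_count} outcomes")
```

In this example, roughly half of the six-month history is lost to a single equipment change, which is why a daily KPI often needs a longer record.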

Once you've followed these steps, you can be confident that you have a good use case for machine learning. Now you're ready to move on and find the right solution.