That's actually something ML is incredibly useful at, when it comes to machines with sensors - failure prediction / anomaly detection, etc.

In industry, (preventive) maintenance eats up a huge chunk of resources. Techs have to do it frequently and it's laborious, but it's obviously done to reduce downtime.

So the business insight, as they like to call it, is to reduce the costs tied to repairs and maintenance.

All critical applications have multiple levels of redundancy, so that a complete breakdown is very unlikely, but it's still a very expensive process if you're dealing with contractors. If you can get techs to swap out parts before the whole unit goes to sh!t, then that's often going to be a much cheaper alternative.

But, in the end, it comes down to the quality of the data and of the models being built. A lot of industrial businesses hire ML / AI engineers for this task alone, but expect some magic black box that will warn x days / hours / minutes ahead that a machine or part is about to break down and it's time to get it fixed. And they unfortunately expect near-perfect accuracy, because someone in sales assured them that this is the future, and the future is now.
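
To make that concrete, here's a minimal sketch of what such an anomaly-detection pipeline often looks like in practice; the file name, column names and contamination rate are all hypothetical:

    # Minimal sketch of sensor-based anomaly detection for predictive maintenance.
    import pandas as pd
    from sklearn.ensemble import IsolationForest

    readings = pd.read_csv("sensor_log.csv")  # hypothetical sensor log
    features = readings[["vibration_rms", "bearing_temp_c", "current_a"]]

    model = IsolationForest(contamination=0.01, random_state=0)
    model.fit(features)

    # Scores below zero get flagged as anomalous; a tech still has to decide
    # whether "anomalous" actually means "about to fail".
    readings["anomaly_score"] = model.decision_function(features)
    suspect = readings[model.predict(features) == -1]
    print(suspect.head())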



Yep, you and the user you're replying to are both right in different ways. One thing's for sure - machines don't generate "insights" on their own.

Let's define an "insight" as "new meaningful knowledge", just for fun. We could talk about what comprises "new" and "meaningful" but it would be beside the point I'm making.

In a supervised learning problem, the range of possible outputs is already known, meaning the model output will never be categorically different from what was in the training data. The knowledge obtained is meaningful as long as the training labels are meaningful, but it can never be new.
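
A toy sketch of that point (sensor values and labels are made up): no matter how strange the input, a supervised classifier can only ever answer with one of the labels it was trained on.

    from sklearn.ensemble import RandomForestClassifier

    # Toy sensor rows: [vibration, temperature]; the labels are invented.
    X_train = [[0.1, 40], [0.2, 45], [0.9, 90], [0.8, 85]]
    y_train = ["healthy", "healthy", "bearing_wear", "bearing_wear"]

    clf = RandomForestClassifier(random_state=0).fit(X_train, y_train)

    # Even for a reading unlike anything in the training set, the answer can
    # only be one of the training labels, never a brand-new failure mode.
    print(clf.predict([[5.0, 300]]))
    print(clf.classes_)   # ['bearing_wear' 'healthy']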

Unsupervised learning doesn't have a notion of labels, which means an unsupervised model's output requires additional interpretation in order to be meaningful. It is possible to uncover new structures and identify anomalies in new ways, but that knowledge isn't meaningful until someone comes in and interprets it.
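
Again as a toy sketch (synthetic data): a clustering model will happily report that some points look different from the rest, but nothing in its output says what that difference means.

    import numpy as np
    from sklearn.cluster import KMeans

    rng = np.random.default_rng(0)
    normal = rng.normal(loc=[0.2, 50.0], scale=[0.05, 2.0], size=(200, 2))
    odd = rng.normal(loc=[0.9, 80.0], scale=[0.05, 2.0], size=(10, 2))
    X = np.vstack([normal, odd])

    labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

    # The model says "these points differ from those", not "this cluster means
    # the gearbox is failing"; that interpretation is still on a human.
    print(np.bincount(labels))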

Applied to the specific example where sensor data is used to try to generate insights about machine functionality: Either you can only predict the types of failures you've already seen, or you can identify states you've never seen but you wouldn't know whether they mean the system is likely to fail soon or not.

It's the Roth vs. traditional 401(k) tradeoff: for model output to be useful, someone must pay an interpretation tax. The only choice is whether it's paid when the insight is deposited or when it's withdrawn.


> Either you can only predict the types of failures you've already seen, or you can identify states you've never seen but you wouldn't know whether they mean the system is likely to fail soon or not.

Yup, this is something I've seen from both sides. The first one you mention is basically the standard, while the second is part of the deep-learning voodoo black magic that executives and sales love.

I've had people approach me with proposals like "What if we just churn [ALL OF] our data through this or that model, and let's see if it comes up with some patterns we've never seen or thought about"

And that's not just for industrial applications. It's everywhere.

What concerns me is that this mentality will surely breed even more unrealistic expectations. Before you know it, business execs will start asking why we need business analysts at all, because surely those fancy deep neural networks can extract all kinds of features: "you only need data scientists to figure out those things".

So yeah, that's my fear. That businesses will blindly start to discard domain knowledge, and just feed black-box models their data, and let the data scientists wrestle with the results.


It’s not only outputs/labels that provide “insights”.

Knowing how the outputs relate to the inputs is where most new insights could come from.

For example, what feature (input) is driving the “failure” of the machine (output/prediction)?

This is where ML explainability comes in.
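
As a sketch of one common way to do that (assuming a fitted failure classifier; the feature names are invented), permutation importance asks which inputs the prediction actually leans on:

    from sklearn.datasets import make_classification
    from sklearn.ensemble import GradientBoostingClassifier
    from sklearn.inspection import permutation_importance

    # Stand-in for labeled sensor data; real data would replace this.
    X, y = make_classification(n_samples=500, n_features=4, random_state=0)
    names = ["vibration_rms", "bearing_temp", "oil_pressure", "rpm"]

    clf = GradientBoostingClassifier(random_state=0).fit(X, y)
    result = permutation_importance(clf, X, y, n_repeats=10, random_state=0)

    # Features whose shuffling hurts accuracy the most are the ones "driving"
    # the predicted failures, at least for this model on this data.
    for name, score in sorted(zip(names, result.importances_mean), key=lambda t: -t[1]):
        print(f"{name}: {score:.3f}")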


This is demonstrably false; AlphaGo made significant new discoveries, for example.


Yeah this is where it would have helped if I had discussed what I meant by "new".

AlphaGo is a supervised learner that outputs optimal Go moves given opposing play. It yields new discoveries in the same sense that a model designed to predict mechanical failures from labeled sensor data would: I didn't know what the model was going to predict until it predicted it, and now I know.

But what the factory owners want is a machine that can take raw, unlabeled sensor data and predict mechanical failures from that. They want insights. "Why not just feed all our data into the model and just see what comes out?" they ask. "I don't see why we need to hire at all if we have this neural net."

The reason you need a human somewhere in the system if you want insights is because someone needed to program AlphaGo specifically to try to win at Go. At the factory, someone needs to tell the machine what a mechanical failure is, in terms of the data, before it can successfully predict them.

So neither "winning at Go" nor "mechanical failure" is a state the system hasn't already been programmed to recognize. That's what I mean when I say a supervised learner cannot generate "new" output.


If the business had wanted to track failure rates, build predictive models of when things break, or detect anomalous behaviour, that's the goal they would have set out with. Perhaps some ML model might have helped, but more likely it would've been too unreliable, and any number of standard predictive models with well-known characteristics would have been used instead.

That's not what they wanted.

What people are being sold is AI/ML as a magic bullet that will do something useful regardless of the situation, and it lets business people avoid making decisions about what they actually want. Because AI/ML can be anything, they just sign up for it and expect to get 20 things they didn't know they wanted handed to them on a plate.

Turns out, it's not enough to just collect a bunch of data and wave your magic wand at it. It wasn't with web analytics 10 years ago, and it still isn't.

What you actually need is someone who has a bunch of tricks up their sleeve, has done this before, and can suggest the Business Insights the business might need before anything gets built; people who actually decide what to do; and action taken to investigate and solve those problems.

I mean, to some degree you're right; perhaps ML models could be useful for tracking hardware failures, but that's not what the parent post is talking about. The previous post was talking about just collecting the data and expecting predictive failure models to magically jump out of it.

That doesn't happen; it needs a person to have the insight that the data could be used for such a thing, and that needs to happen before you go and randomly collect all the wrong frigging metrics.

...but hiring experts is expensive, and making decisions is hard. So ML/AI is sold like snake-oil to managers who want to avoid both of those things. :)


Projects rarely end up doing what was planned when they started. As long as ML is solving real problems in practice, upper management will keep treating it as magic fairy dust to sprinkle around aimlessly.

It's all about how you package things. ML connected to an audio sensor could predict failure modes that are difficult to detect otherwise. Now, that might not be what was asked for, but a win is a win.
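
For instance (purely illustrative, synthetic signals): turn the microphone signal into a handful of spectral features and feed those to an ordinary classifier, since a grinding bearing puts energy into bands a healthy hum doesn't have.

    import numpy as np

    def spectral_features(waveform):
        """Crude FFT band energies, a stand-in for real audio features."""
        spectrum = np.abs(np.fft.rfft(waveform))
        return np.array([band.mean() for band in np.array_split(spectrum, 8)])

    # Synthetic "healthy hum" vs. "grinding" signals, purely illustrative.
    t = np.linspace(0, 1, 16000, endpoint=False)
    healthy = np.sin(2 * np.pi * 120 * t)
    grinding = healthy + 0.5 * np.random.default_rng(0).normal(size=t.size)

    print(spectral_features(healthy)[:3])
    print(spectral_features(grinding)[:3])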


I think both your post and the one you are responding to are correct.

I’ve experienced what the OP was alluding to... namely, it helps tremendously to start out with an understanding of the problem you’re trying to solve, even more so in supervised learning. It’s incredibly frustrating to ask managers what business problem they are trying to solve only to be met with “We don’t know, that’s what we want the software to tell us.”

On the other hand, if they say “we want to predict machine failures” or “reduce maintenance downtime”, now we have a lens through which to view the data.

If AI could work the magic those managers hope for, they would be out of a job.


I work in this space, and you've called out the thing that drives me absolutely bonkers. No matter how accurately I can diagnose the state of the equipment, what everybody wants is to know how long until it explodes, so that they can take it down for maintenance ten minutes before that.
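
For what it's worth, that "how long until it explodes" question usually gets framed as remaining-useful-life regression, which only works if you already have run-to-failure history; a toy sketch with invented numbers:

    import numpy as np
    from sklearn.linear_model import LinearRegression

    # Synthetic run-to-failure data; every number here is invented.
    rng = np.random.default_rng(0)
    hours_run = rng.uniform(0, 1000, size=200)
    wear = hours_run / 1000 + rng.normal(scale=0.05, size=200)
    rul_hours = 1000 - hours_run   # ground truth you rarely have in practice

    model = LinearRegression().fit(wear.reshape(-1, 1), rul_hours)
    print(model.predict([[0.8]]))  # "roughly this many hours left", give or take a lot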



