Prophet: Automatic Forecasting Procedure (github.com/facebook)
298 points by klaussilveira on Sept 26, 2023 | hide | past | favorite | 89 comments


Model development on Prophet stopped this year: https://medium.com/@cuongduong_35162/facebook-prophet-in-202...

They recommend checking out these for cutting-edge time series forecasting:

https://neuralprophet.com/

https://nixtla.github.io/statsforecast/


Fun fact: if you don't care about the auto-regressive aspect of NeuralProphet (it's turned off by default), you can implement the core of NeuralProphet/Prophet (piecewise linear trend + Fourier terms for weekly/daily seasonality) in about 60 lines of code, with no dependency other than torch or numpy+scipy.optimize, and without having to deal with Stan or NeuralProphet's very poorly chosen heuristics.
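For a taste, here's a rough numpy-only sketch of that core (ordinary least squares instead of Stan's MAP fit, changepoints fixed in advance, and no regularization; all simplifications on my part):

```python
import numpy as np

def fourier_features(t, period, order):
    # 2*order columns of sin/cos at harmonics of the given period.
    cols = []
    for k in range(1, order + 1):
        cols.append(np.sin(2 * np.pi * k * t / period))
        cols.append(np.cos(2 * np.pi * k * t / period))
    return np.stack(cols, axis=1)

def design_matrix(t, changepoints, period=7.0, order=3):
    # Piecewise-linear trend: intercept, global slope, plus a slope
    # change ("hinge") that activates after each changepoint.
    trend = [np.ones_like(t), t] + [np.maximum(t - cp, 0.0) for cp in changepoints]
    return np.column_stack(trend + [fourier_features(t, period, order)])

def fit_predict(t_train, y_train, t_future, changepoints):
    # Fit all coefficients jointly by least squares, then extrapolate.
    X = design_matrix(t_train, changepoints)
    beta, *_ = np.linalg.lstsq(X, y_train, rcond=None)
    return design_matrix(t_future, changepoints) @ beta
```

With daily data, period=7 gives weekly seasonality; adding a second Fourier block with period=365.25 covers yearly effects.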

Another thing that both NeuralProphet and Prophet do extremely wrong by default is uncertainty estimation. The coverage probabilities are way off.


Do you have an example implementation of the core of these?


It's literally what I did at work last week, which is why I found this submission timely. I'd have to check with my employer if it can be made public. I don't see any reason why not, there's not much to it.


What did you use to implement the regularization of the trend breakpoints? Prophet by default uses a regular grid of changepoints and thins them out with Stan. I couldn't find a quick regularization replacement in numpy/scipy/statsmodels with equivalent performance. (I don't want to drag in another huge dependency like Torch or TF.)


Why is STAN viewed negatively in this light? I am curious why bayesian libraries are the black sheep.


I am curious too. I have used Stan extensively. I found it extremely polished and pleasant to use.

It generated very efficient samplers for particularly weird (and enormous!) hierarchical models I had. Documentation is also great.

It is also worth reading Andrew Gelman's post about Prophet: https://statmodeling.stat.columbia.edu/2017/03/01/facebooks-...


I think he just means that it can be an incredible pain to install.


Not directly from a machine learning perspective; this is about stability of use in a production setup. At VictoriaMetrics we offer Prophet as one of the models for time series anomaly detection in our vmanomaly product. In cloud environments, Prophet, which uses `cmdstanpy` under the hood, allows little to no control over the backend: during the model-fit stage the backend attempts to create assets in the /tmp directory, resulting in unexpected crashes on read-only filesystems like Red Hat OpenShift. Dependencies like that can limit the usage of a product in real-world scenarios.


This is interesting to me. Do you use a library to estimate the Fourier series of a data series, or have you implemented it from scratch? I've searched for this in the past but always got results about Fourier transforms, not series.


As others have pointed out, Prophet is not a particularly good model for forecasting, and has been superseded by a multitude of other models. If you want to do time series forecasting, I'd recommend using Darts: https://github.com/unit8co/darts. Darts implements a wide range of models and is fairly easy to use.

The problem with time series forecasting tools in general is that they make a lot of assumptions about the shape of your data, and you'll find yourself spending a lot of time reshaping your data to fit them. For example, they expect that your data comes at a very regular interval. This is fine if it's, say, the data from a weather station. This doesn't work well in clinical settings (imagine a patient admitted into the ER -- there is a burst of data, followed by no data).

That said, there's some interesting stuff out there that I've been experimenting with that seems to be more tolerant of irregular time series and can be quite useful. If you're interested in exchanging ideas, drop me a line (email in my profile).


> they expect that your data comes at a very regular interval

Does prophet rely on this assumption? For health timeseries data the tool of choice is survival analysis - typically using Cox proportional hazards regression or similar regression tools that are able to handle irregular or censored data.

I've seen some moves towards using fancy bayesian or fancier machine learning stuff for clinical trials but a big issue is that they are very difficult to communicate to their intended audience.


There’s also the auton survival library. I’ve used it for very big survival models with time varying coefficients:

https://autonlab.org/auton-survival/


I find Bayesian regression models are actually simpler to explain as the assumptions you make are explicit and part of the model specification.

(Though the actual sampling mechanics and tooling can be much more complex.)


PyMC or Pyro/NumPyro make the implementation of Bayesian regression dead simple


I tried Prophet via Darts, and all the models in Darts assume a regular time series.

Re: "fancier machine learning" -- I've seen different flavors of RNNs & LSTMs have some success in analyzing time series data. I've struggled to get them to work on real-world (i.e., messy) data, but have had some encouraging results with a transformer encoder-only NN.


What does Darts do that a multibillion-dollar entity with an excellent open-sourcing track record misses? Perhaps it addresses a niche case well. Genuinely curious.


Darts isn't a specific model, it's a wrapper API for a wide variety of forecasting models, and Prophet is one of them. Other models may or may not outperform Prophet depending on the nature of your specific application and your time series data. You really have to test them to know. And Darts facilitates testing many models on the same data by putting them all behind the same API.

Also, Prophet was developed by a very small number of individuals at Facebook, it's not something they invested massive resources into.


As others have mentioned in the thread, Prophet has been abandoned, and in my experience anyway, wasn't all that great.


Can we employ stochastic processes like the Poisson process to represent irregular data points? Are there any existing models for this?


A common strategy is interpolation. The challenge is that forecasting itself is a form of interpolation. So you're forecasting based on forecasted data.
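For example, a minimal sketch of resampling an irregular series onto a regular grid with numpy (the grid step is a modeling choice; np.interp does the linear interpolation):

```python
import numpy as np

def to_regular_grid(t_irregular, y, step):
    # Build a regular time grid spanning the observations and linearly
    # interpolate the irregular samples onto it.
    t_grid = np.arange(t_irregular[0], t_irregular[-1] + step / 2, step)
    return t_grid, np.interp(t_grid, t_irregular, y)
```

The forecaster then sees evenly spaced points, but the interpolated values carry no new information, which is exactly the "forecasting on forecasted data" problem.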


I think a common “solution” is to use a Gaussian process model.


Prophet is such an appealing package because it promises to abstract away all the difficult parts of forecasting. However, in practice it does not fulfill its promises. I think this is a good discussion of the problems: https://www.microprediction.com/blog/prophet


As others have pointed out, it is a good idea to encode domain knowledge in your time series model through specification and priors. Prophet rarely beats a well-specified GLM or SARIMA in real-world applications, especially when uncertainty estimates are needed. Professionally, I have successfully applied Gaussian Processes to many such cases.

A GP is an intuitive and expressive way to encode time covariance in a model. A famous example is the relative birthdays model, discussed by Gelman et al. in Bayesian Data Analysis and here [1].

[1] https://avehtari.github.io/casestudies/Birthdays/birthdays.h...


This library is old news? Is there anything new that they've added that's noteworthy to take it for another spin?

[disclaimer I'm a maintainer of Hamilton] Otherwise FYI Prophet gels well with https://github.com/DAGWorks-Inc/hamilton for setting up your features and dataset for fitting & prediction[/disclaimer].


I'm no time series expert, but from my experience and what I've heard, using Prophet for time series forecasting isn't recommended. It often leads to less-than-ideal results.

Curiously, in Medium-like (i.e., low-effort) publications it's still the recommended way to tackle a forecasting problem. The promise of a model that can solve any time series problem sounds great, but not all that glitters is gold, and as you gain experience you discover that solutions like this usually don't work.


I used Prophet and personally did not have any problems, but I agree with the criticism that the tool is so focused on ergonomics that it abstracts away important aspects that could be used to build better models [1].

[1] - https://ryxcommar.com/2021/11/06/zillow-prophet-time-series-...


I thought the biggest issue wasn't with the models themselves, but how Zillow decided to apply and act on them, which is why it didn't work in practice.

So on average their predictions may have been pretty good, but since each transaction also depends on the other party accepting the offer, and on whether they get outbid, the transactions that actually went through would skew toward the cases where they slightly overestimated the price.

This tweet from the article summed it up nicely

> Zillow made the same mistake that every new quant trader makes early on: Mistaking an adversarial environment for a random one. https://twitter.com/0xdoug/status/1456032851477028870

I was lucky to make and learn from that mistake pretty quickly with some algorithmic trading on much smaller amounts. With housing transactions being much larger and slower, you wouldn't learn this lesson until it was too late. Models never perform as well in practice as they do in theory, and you need to remember to account for both known unknowns and unknown unknowns.


Great comments! I've learned a lot from them. I'm just getting started with algorithmic trading and time series modeling, so I appreciate your insights.


I've honestly had consistently better results with standard regression models. I really love the idea of it, and maybe I need to be tuning it better somehow, but overall I haven't had a great experience.


Isn't recommended by whom?


Every time I, or someone at work with more experience than me, have tried Prophet, it has ended with changing the approach and trying a different technique. In my experience with time series, hand-crafted recipes tend to work much better than out-of-the-box solutions.


I agree completely. We always end up moving away from Prophet every time. The results from Prophet are just not very good, although it can be useful for a proof-of-concept.


What do you do instead? I think Prophet at least gets close to an answer that isn't "it depends", which is what everyone else here is suggesting as an alternative.


Related. Others?

Zillow, Prophet, time series, and prices - https://news.ycombinator.com/item?id=29137200 - Nov 2021 (143 comments)

Is Facebook's “Prophet” the time-series Messiah or just a naughty boy? - https://news.ycombinator.com/item?id=27695574 - July 2021 (78 comments)


I'm a data engineer in a large consulting company and I have been incredibly impressed with AutoGluon for forecasting. You can build and train a model in around 10 lines of code, and it frequently gets into the top 3 or 4% of Kaggle competitions without much data pre-processing.


Any good tutorials on how to implement it?


Has anyone else struggled with Prophet? I've experimented with it on a few real world datasets and I've had very inconsistent results.


Yes. I've tried using it for pretty straightforward time series forecasts, and I struggled to make it into something useful in a business context.

I'll disclaim that I'm just a finance dude and not a data scientist or programmer. But the documentation leads me to believe that I am in the target audience. I felt like I could grasp the basic mechanics after reading the paper, but I wish the documentation could help someone like me be more intelligent with the 'tuning' of the model. I could never get average error below 15%, which is too large for my use case.

Probably user ignorance, but that's my experience.


You are the primary audience. Time series forecasting with deep learning is fraught with inconsistency. Someone on r/ML went pretty hard on detailing a survey and the stuff that was SOTA 10 years ago still is. Wish I saved that thread. The dude was well published.

edit: found it https://www.reddit.com/r/MachineLearning/comments/pe1lst/r_i...

Turns out it was about time series anomaly detection, but if you can detect, you can forecast if your model is generative



I updated my comment with the thread but it was actually about time series anomaly detection. Turns out it was the same dude in your second link, and your comment includes forecasting in the first link as well. Thank you!


When was this? I might go chasing this lead down, but even a fuzzy estimation of when would help. Will come link it here if I find it.


I updated my comment!


aaaaand I just spent 3 hours watching that, trying to remember some parts of calculus, and reading all of the Wikipedia articles and "see also" links that were grey on white in the video. Then I fell asleep, but I wanted to thank you, as I also thanked the prof who made that video (on Reddit).


This looks to me like something they’d be using for internal capacity planning. If so, they’d be asking it questions like, “how much capacity do we build out for the upcoming holiday rush?” I wouldn't be surprised if financial datasets are very noisy compared to service capacity metrics. I didn’t read the paper though, maybe this is addressed and maybe I’m wrong about the use case! But stuff like the below from the docs reads like capacity planning tool to me:

> As an example, let’s look at a time series of the log daily page views for the Wikipedia page for Peyton Manning. We scraped this data using the Wikipediatrend package in R. Peyton Manning provides a nice example because it illustrates some of Prophet’s features, like multiple seasonality, changing growth rates, and the ability to model special days (such as Manning’s playoff and superbowl appearances).


Also perhaps anomaly detection in a metric.


I'm sad to see no one has responded with a solution to your problem. You are absolutely the target audience, and in my experience, Prophet is "as good as it gets" for generalized forecasting.


While using Prophet in a pure forecasting setup might not guarantee consistently high-quality results out of the box, especially for noisy and complicated time series data, at VictoriaMetrics we have found it practically useful for the anomaly detection task:

In our vmanomaly product, Prophet is one of the go-to models for anomaly detection on metrics data, and it usually requires little tuning to achieve reasonable results. The main purpose of using Prophet or similar forecasting models is to reformulate the task of anomaly detection:

- Given a fitted model M and the ground truth Y_i for a particular data point X_i, we produce a forecast Yhat_i and its uncertainty estimate [Yhat_lb, Yhat_ub].

- If the ground truth Y_i falls outside the range [Yhat_lb, Yhat_ub], we consider the point an anomaly.

- The further Y_i is from the range, the higher the anomaly score. In our implementation, for easier alerting, anomaly_score > 1 means "anomaly".

here's a small visual example: https://docs.victoriametrics.com/vmanomaly.html#examples
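As a simplified sketch of that convention (not the exact vmanomaly formula), the score can be the distance from the interval's center, normalized by its half-width, so that any point outside [Yhat_lb, Yhat_ub] scores above 1:

```python
def anomaly_score(y, yhat_lb, yhat_ub):
    # 0 at the interval center, 1 exactly on a bound, > 1 outside.
    half_width = max((yhat_ub - yhat_lb) / 2.0, 1e-12)
    center = (yhat_lb + yhat_ub) / 2.0
    return abs(y - center) / half_width
```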


Based just on the documentation, it seems there are some assumptions they expect the data to adhere to, and if they don't apply then it would not produce good results.


I have not been able to get good results either, but I have not tried it in the past year. I also tried many of the architectures in Darts. I have found that fairly straightforward architectures work well. That is, I can iterate on my own design for my own specific data (with all its specific covariates) and get better results than I could with Darts or Prophet.


Maybe because time series forecasting, for any time series of interest, is pretty much not possible.


Here are some other similar Python packages for forecasting:

- https://nixtla.github.io/neuralforecast/

- https://github.com/ourownstory/neural_prophet



This is the HN comment thread on a well-written skeptical article with this zinger:

“You can imagine my disappointment when, out-of-the-box, Prophet was beaten soundly by a ‘take the last value’ forecast.”


This example is super classic! XD


Wondering how many people are now downloading this and other libs like Darts and trying to do stock market prediction or crypto price forecasting. Most of the devs I know, myself included, have dabbled in coding up trading algorithms at some point.


It's the classic data nerd trap.

"I'm pretty good at statistics and can predict things using software... I bet I could make money in the stock market"

And then they realize just how hard it is.


The hard part isn't the stats; it's all the information that people buy, and setting up those ingest pipelines! If I had a satellite telling me when a certain big company has a lot of cars parked in the lot after hours, I could make a zillion bucks too!


Buying the data and setting up those ingest pipelines seems easy? It's trivial to detect the number of cars automatically.

in fact, this sort of alternate data is pretty commonplace in firms I've worked at.


you are right; it can be easy sometimes, especially if you have the expertise. i should have said that it is a bit expensive though.


If it were easy, quants wouldn't be getting paid $1M in TC


I just hope they come across the 90/90/90 rule first: 90% of new traders lose 90% of their money within 90 days.

VTSAX and chill? :^)


Can someone explain why the "no free lunch theorem" does not cause problems here?

https://en.wikipedia.org/wiki/No_free_lunch_theorem


Two explanations

First: Prophet is not actually "one model", it's closer to a non-parametric approach than just a single model type. This adds a lot of flexibility on the class of problems it can handle. With that said, Prophet is "flexible" not "universal". A time series of entirely random integers selected from range(0,10) will be handled quite poorly, but fortunately nobody cares about modeling this case.

Second: the same reason that only a small handful of possible stats/ML models get used on virtually all problems. Most problems which people solve with stats/ML share a number of common features which makes it appropriate to use the same model on them (the model's "assumptions"). Applications which don't have these features get treated as edge-cases and ignored, or you write a paper introducing a new type of model to handle it. Consider any ARIMA-type time series model. These are used all the time for many different problem spaces, and are going to do reasonably well on "most" "common" stochastic processes you encounter in "nature", because its constructed to resemble many types of natural processes. It's possible (trivial, even) to conceive of a stochastic process which ARIMA can't really handle (any non-stationary process will work), but in practice most things that ARIMA utterly fails for are not very interesting to model or we have models that work better for that case.


These insights are really awesome! It reminds me of the common aphorism in statistics: 'All models are wrong, but some are useful.' Thank you!


Disclaimer: I haven't looked at the linked library at all, but this is a theoretical discussion which applies to any task of signal prediction.

Out of all possible inputs, there are some that the model works well on and others that it doesn't work well on. The trick is devising an algorithm which works well on the inputs that it will actually encounter in practice.

At the obvious extremes: this library can probably do a great job at predicting linear growth, but there's no way it will ever be better than chance at predicting the output of /dev/random. And in fact, it probably does worse than a constant-zero predictor when applied to a random unbiased input signal.

Except that it's also usually possible to detect such trivially unpredictable signals (obvious way: run the prediction model on all but the last N samples and see how it does at predicting the final N), and fall back to a simpler predictor (like "the next value is always zero" or "the next value is always the same as the previous one") in such cases.
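Sketched concretely (model_forecast here is a hypothetical callable, and mean absolute error is just one possible metric):

```python
def mean_abs_err(pred, actual):
    return sum(abs(p - a) for p, a in zip(pred, actual)) / len(actual)

def choose_predictor(series, model_forecast, n_holdout=20):
    # Backtest the model on the last n_holdout points; fall back to the
    # naive "repeat the last value" predictor if the model is no better.
    train, test = series[:-n_holdout], series[-n_holdout:]
    model_err = mean_abs_err(model_forecast(train, n_holdout), test)
    naive_err = mean_abs_err([train[-1]] * n_holdout, test)
    return "model" if model_err < naive_err else "naive"
```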

But that algorithm also fails on some class of inputs, like "the signal is perfectly predictable before time T and then becomes random noise". The core insight of the "No Free Lunch" theorem is that when summed across all possible input sequences, no algorithm works any better than another, but the crucial point is that you don't apply signal predictors to all possible inputs.

Another place this pops up is in data compression. Many (arguably all) compressors work by having a prediction or probability distribution over possible next values, plus a compact way of encoding which of those values was picked. Proving that it's impossible to predict all possible input signals correctly is equivalent to proving that it's impossible to compress all possible inputs.

Another way of thinking about this: Imagine that you're the prediction algorithm. You receive the previous N datapoints as input and are asked for a probability distribution over possible next values. In a theoretical sense every possible value is equally likely, so you should output a uniform distribution, but that provides no compression or useful prediction. Your probabilities have to sum to 1, so the only way you can increase the probability assigned to symbol A is to decrease the weight of symbol B by an equal amount. If the next symbol is A then congratulations, you've successfully done your job! But if the next symbol was actually B then you have now done worse (by any reasonable error metric) than the dumb uniform distribution. If your performance is evaluated over all possible inputs, the win and the loss balance out and you've done exactly as well as the uniform prediction would have.


Time series forecasting is not at all solved. Prophet does not solve it for you.


The corollary of this is "generalized time series forecasting cannot be solved", which is likely also true.


Tried it once. Its promise is to take the dataset's seasonal trend into account, which makes sense for Facebook's original use case.

We ran it on such a dataset and found out that directly using https://github.com/karpathy/minGPT consistently gives a better result. So we ended up using the output of Prophet as an input feature to a neural network, but the result was not improved in any significant way.


From my own experience, a properly cross-validated lasso regression over a wide range of autoregressive features beats FB Prophet by a good margin and offers nearly the same degree of automation.


I am intrigued on how this would perform on astronomical data.

If anyone is not aware there are many periodic phenomena in astronomy - e.g. variable stars which can have periods from minutes to hundreds of days.

The description of this library sounds like it's very tied to the human world - talking about yearly, weekly and daily seasonality.

[Weirdly though, we do sometimes see variability on 'human' timescales in astronomical data series. If maintenance is carried out weekly on a Monday that can add a signal into the data through missing datapoints.]


Prophet is a PITA to install with PyPy on Apple Silicon. Beware.


On this topic, does anyone know of a suitable time-series forecaster for multivariate analysis? Eg 8 independent/input variables, and one output variable? I've been using multiple linear regression (which works impressively!) but it doesn't take into account the time series, only the single prior day of inputs. Thanks :)


Not really sure what you are looking for, but the easiest might be to just add lags of your input variables in the same linear model that you are using.
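Concretely, building the lagged design matrix might look like this (the number of lags, and whether to include past values of the target itself, are choices of the modeler):

```python
import numpy as np

def add_lags(X, y, n_lags):
    # Row t of the design matrix holds the inputs and the target from
    # t-1 .. t-n_lags, aligned with the target value y[t].
    rows = []
    for t in range(n_lags, len(y)):
        feats = []
        for lag in range(1, n_lags + 1):
            feats.extend(X[t - lag])
            feats.append(y[t - lag])
        rows.append(feats)
    return np.array(rows), y[n_lags:]
```

With 8 input variables and 3 lags this yields 27 columns per row, which you can feed straight into the same linear regression you are already using.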

If you are looking for an actual time series method I would check out either darts [0] or statsforecast [1]. They are currently the most mature time series packages.

[0] https://unit8co.github.io/darts/

[1] https://github.com/Nixtla/statsforecast


In machine learning conference papers, a common approach is to model relationships between variables using Graph Neural Networks (GNNs), which is a powerful and flexible way to go. Maybe you can give it a try!


I use a GLM with my own hand-crafted features, based on my knowledge of the business and the things that influence it. Works very, very well.


Thanks! I take it that this means a Generalised Linear Model? Could I ask for a link to a relevant article to get me started on the flavour of GLM that you recommend?



It's the mechanism used by Grafana's forecasting feature. It's still not well explained, and in many cases it's hard for users to understand its results, since forecasts might go below zero for data that can only be positive (requests per second, for instance).


Prophet has gotten a lot of attention since being released in 2017, I think because the idea of a fully automatic solution is very appealing to people. One of the original developers, Sean Taylor, recently posted a nice retrospective on the project's successes and failures: https://medium.com/@seanjtaylor/a-personal-retrospective-on-... He quotes one of his earlier tweets:

  If I could build it again, I’d start with automating the evaluation of forecasts. It’s silly to build models if you’re not willing to commit to an evaluation procedure. I’d also probably remove most of the automation of the modeling. People should explicitly make these choices.
Having worked on similar Bayesian time-series forecasting tools at Google, this matches my experience (though I've never used Prophet seriously, so please don't take this as any direct judgement of it as a software package). There is a lot of value in a framework that lets you easily experiment with different model structures (our version of this was the structural time series tools in TensorFlow Probability; see, e.g., https://blog.tensorflow.org/2019/03/structural-time-series-m...).

But if you're forecasting something you actually care about, it's usually worth the time to try to understand yourself what structure makes sense for your problem, and do a careful evaluation on held-out data with respect to whatever metric you're really trying to optimize. A fully automated search over model structures is cute, but even when it works, it mostly just ends up rediscovering properties of the data you could or should have already known (e.g., of course traffic to your work-related website will have a day-of-week effect), so the cases where it really adds practical value are harder to find than you might like.

Even in the age of deep learning, I do think these relatively classical Bayesian models have a lot of value for many applications. Time-series forecasting tends to be a case where:

- you don't have a ton of iid data points (often, only a single time series),

- you'd like forecasts with principled uncertainty estimates, e.g., credible intervals, giving you a range of scenarios to plan for,

- you often do have a pretty good idea of what features are relevant to the process you're predicting, and

- you want to understand in detail what features the forecast is accounting for (and what it might be missing),

all of which play to the strengths of more classical, structured statistical models, compared to more data-hungry black-box deep learning models. So the basic ideas in Prophet and similar tools do still have a lot of relevance going forward, IMHO.


You mention classical models but Bayesian deep learning is a thing too. One can even retrofit existing DL models to obtain uncertainty estimates, at the expense of increasing (possibly doubling) the number of model parameters.

The quality of the uncertainty estimates is a question though.


I'd be curious to see how it performs on economics data compared to mainstream models (say DSGE) whose results have never impressed me with their predictive power.


Nonsense vs nonsense. Close call


How can this possibly work? The classic example is a turkey trying to forecast his body weight not knowing that he’s for dinner.


Account for the fact that a lot of turkeys lose their bodily mass very quickly during the thanksgiving week?


Then what do you need forecasting for?


For predicting the next data point?


Facebook developers are doing some really great stuff. For some reason it doesn't translate into a really great facebook or instagram. The experience is worse compared to 10 years ago. If they hired 10,001 of the best developers not working at facebook I think their products would be the same or worse. Is there a single person responsible for the vision?



