The DataRobot Automated Time Series product has traditionally been built on a supervised machine learning workflow, in which users forecast future events by specifying a target variable to train on. However, there are cases in which we would like to infer information from time series data without knowing the target, such as detecting a faulty sensor in a machine or unusually high network activity on a smart home device. To detect anomalous events, we need to look at the dataset holistically, knowing that anomalies can occur anywhere.
In DataRobot Release 6.1, we introduce Time Series Anomaly Detection, a fully unsupervised machine learning workflow that allows users to detect anomalies without specifying a target variable.
Types of Anomalies
As you might imagine, anomalies can occur in different forms. We may have a single spike on an otherwise flat region, a cluster of sine waves, or several different data types layered on top of one another.
With DataRobot’s Anomaly Detection for Time Series, we have a new set of blueprints that leverage leading anomaly detection algorithms and are designed to detect a wide array of anomaly types such as these right out of the box.
Using Time Series Anomaly Detection
A core belief of DataRobot is that our products should help accelerate productivity for your data scientists and even help democratize data science for non-data scientists, such as business analysts. Time Series Anomaly Detection is no exception. We designed the UI to be as familiar and easy to use as any of our other products.
To get started, you follow a few basic steps:
Upload your dataset to AI Catalog or directly to your project as usual. On the Autopilot screen, you will select the “No target?” option.
Next, click through the “Set up time aware modeling” options as usual. You will not have to choose a forecast window for anomaly detection, as we are detecting anomalies in real time. Now click “Start” to begin Autopilot.
Once Autopilot has completed, you will see that models are ranked by “Synthetic AUC”. This metric is generated by binning the most common and the least common values to synthetically label points in time as anomalies. These labels are then used to compute the synthetic AUC for the model.
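To make the idea concrete, here is a minimal Python sketch of one way such a metric could be computed. It is an illustration under stated assumptions, not DataRobot's implementation: the function names `synthetic_labels` and `auc`, the equal-width binning, and the `n_bins` and `rare_fraction` parameters are all hypothetical. The sketch labels points whose values fall in the least-populated bins as synthetic anomalies, treats every other point as normal, and then measures how well a model's anomaly scores rank the synthetic anomalies above the rest.

```python
from collections import Counter


def synthetic_labels(values, n_bins=10, rare_fraction=0.1):
    """Return 1 for points in the rarest value bins, 0 otherwise.

    Assumed labeling rule for illustration only: equal-width bins,
    rarest bins first, until about `rare_fraction` of points are labeled.
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 for a constant series
    bins = [min(int((v - lo) / width), n_bins - 1) for v in values]
    counts = Counter(bins)
    budget = rare_fraction * len(values)
    rare, covered = set(), 0
    # Walk the occupied bins from least to most populated.
    for b in sorted(counts, key=counts.get):
        if covered + counts[b] > budget:
            break
        rare.add(b)
        covered += counts[b]
    return [1 if b in rare else 0 for b in bins]


def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count as half)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one anomaly and one normal point")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy series with one obvious outlier, plus made-up model anomaly scores.
values = [1.0, 1.1, 0.9, 1.0, 1.2, 0.8, 1.0, 1.1, 9.0, 1.0]
scores = [0.1, 0.2, 0.1, 0.1, 0.3, 0.2, 0.1, 0.1, 0.9, 0.1]
labels = synthetic_labels(values, rare_fraction=0.2)
synthetic_auc = auc(labels, scores)
```

The pairwise AUC is quadratic in the number of points, which is fine for a toy series. The same `auc` helper works unchanged when real anomaly labels are available in place of the synthetic ones.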
You can also upload a partial or full dataset with labeled anomalies to generate the actual AUC metric. To use this functionality, select a model, click on the Predict tab, and then upload a dataset with the labels.
You then select “Forecast Range Predictions” and enter the label column name. Click on the compute predictions button.
Once the predictions are computed, go to the menu and click on “Show external test column.” The metric then changes from “Synthetic AUC” to “AUC”.
You can also further investigate anomalies under Evaluate > Anomaly over Time. This feature allows you to flip through different series and backtests to see when anomalies occurred. Additionally, each point is scored with a probability between 0 and 1 to show how certain the model is that an anomaly occurred at that point in time.
Similar to standard Automated Time Series functionality, it is also possible to create a min or max blender model. For anomaly detection, a max blender takes the highest anomaly score across its component models, so a point flagged by any one of them is also flagged by the blend. These blenders will be especially useful for users with a higher tolerance for false positives.
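The effect of min and max blending can be sketched in a few lines of Python. This is a simplified illustration, assuming a blender combines its component models' anomaly scores point by point; the function name `blend_scores`, the example scores, and the 0.5 threshold are hypothetical and not part of the DataRobot API.

```python
def blend_scores(score_lists, how="max"):
    """Combine per-model anomaly scores point by point with min or max."""
    if how not in ("min", "max"):
        raise ValueError("how must be 'min' or 'max'")
    if len({len(scores) for scores in score_lists}) != 1:
        raise ValueError("all models must score the same number of points")
    combine = max if how == "max" else min
    # zip(*...) groups the scores that different models gave the same point.
    return [combine(point_scores) for point_scores in zip(*score_lists)]


# Made-up anomaly scores from two models over three points in time.
model_a = [0.1, 0.9, 0.2]
model_b = [0.3, 0.4, 0.8]

max_blend = blend_scores([model_a, model_b], how="max")
min_blend = blend_scores([model_a, model_b], how="min")

# With an illustrative threshold of 0.5, the max blend flags a point if any
# model flags it, while the min blend requires every model to agree.
flagged_by_max = [score > 0.5 for score in max_blend]
flagged_by_min = [score > 0.5 for score in min_blend]
```

In this toy case the max blend flags the second and third points and the min blend flags none, which shows why a max blender trades more false positives for fewer missed anomalies.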
After these few steps, and assuming you are happy with your model, you are ready to deploy it to detect new anomalies in real time. You can do this in all of the usual ways that you are familiar with from the standard Automated Time Series product.
At DataRobot, we are proud to bring Anomaly Detection for Automated Time Series to the market. We encourage you to explore this new functionality. You can contact us directly if you are interested.
For existing customers, Anomaly Detection is included with your DataRobot Automated Time Series license. We’ve also included videos in the DataRobot Community to show you these capabilities in more detail and to help you build your first few models. Anomaly Detection is generally available today. So check it out!
Fareya works as a software engineer for the Automated Time Series product at DataRobot. She joined DataRobot in February after graduating with a degree in computer science from Worcester Polytechnic Institute. Fareya enjoys working with the product and marketing teams, and learning from customer feedback.