· One min read

SynapseML (previously known as MMLSpark) is an open-source library that simplifies the creation of massively scalable machine learning (ML) pipelines. SynapseML provides simple, composable, and distributed APIs for a wide variety of machine learning tasks such as text analytics, vision, anomaly detection, and many others. SynapseML is built on the Apache Spark distributed computing framework and shares the same API as the SparkML/MLlib library, allowing you to seamlessly embed SynapseML models into existing Apache Spark workflows.

With SynapseML, you can build scalable and intelligent systems to solve challenges in domains such as anomaly detection, computer vision, deep learning, text analytics, and others. SynapseML can train and evaluate models on single-node, multi-node, and elastically resizable clusters of computers. This lets you scale your work without wasting resources. SynapseML is usable across Python, R, Scala, Java, and .NET. Furthermore, its API abstracts over a wide variety of databases, file systems, and cloud data stores to simplify experiments no matter where data is located.

SynapseML requires Scala 2.12, Spark 3.2+, and Python 3.8+.
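The version requirements above can be checked before installing. The following is a minimal sketch using only the Python standard library; the helper names `parse_version` and `check_environment` are hypothetical and not part of SynapseML, and it assumes "3.2+" and "3.8+" mean "this minor version or any later one". Version strings are expected in `major.minor[.patch]` form.

```python
import sys

# Versions taken from the requirement line above.
REQUIRED_SCALA = (2, 12)   # Scala must be a 2.12.x release
MIN_SPARK = (3, 2)         # Spark 3.2 or later (assumed meaning of "3.2+")
MIN_PYTHON = (3, 8)        # Python 3.8 or later (assumed meaning of "3.8+")


def parse_version(text):
    """Turn 'major.minor[.patch]' into (major, minor); the patch level is ignored."""
    major, minor = text.split(".")[:2]
    return int(major), int(minor)


def check_environment(scala, spark, python=None):
    """Return a list of problems; an empty list means the versions look compatible.

    Hypothetical helper for illustration only. If `python` is omitted,
    the version of the running interpreter is used.
    """
    if python is None:
        python = "{}.{}".format(*sys.version_info[:2])

    problems = []
    if parse_version(scala) != REQUIRED_SCALA:
        problems.append(f"Scala {scala} found, 2.12 required")
    if parse_version(spark) < MIN_SPARK:
        problems.append(f"Spark {spark} found, 3.2 or later required")
    if parse_version(python) < MIN_PYTHON:
        problems.append(f"Python {python} found, 3.8 or later required")
    return problems


if __name__ == "__main__":
    # Example version strings; substitute the ones reported by your cluster.
    for line in check_environment("2.12.15", "3.4.1") or ["environment looks compatible"]:
        print(line)
```

Comparing `(major, minor)` tuples rather than raw strings avoids the common mistake of treating "3.10" as older than "3.8".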

· One min read

Deploying machine learning (ML) algorithms within databases is challenging. The varied computational footprints of modern ML algorithms and the myriad database technologies, each with its own restrictive syntax, make such tasks complex. We introduce an Apache Spark-based micro-service orchestration

· One min read

Integrating the power of Azure Cognitive Services into your big data workflows on Apache Spark™

Today at Spark + AI Summit 2019, we're excited to introduce a new set of models in the SparkML ecosystem that make it easy to use Azure Cognitive Services at terabyte scales.

· One min read

In this work, we detail a novel open-source library called MMLSpark that combines the flexible deep learning library Cognitive Toolkit with the distributed computing framework Apache Spark. To achieve this union, we have contributed Java language bindings to the Cognitive Toolkit