Version: 0.10.2

SynapseML Development Setup

  1. Install JDK 11
    • You may need an Oracle login to download.
  2. Install SBT
  3. Fork the repository on GitHub
  4. Clone your fork
    • git clone https://github.com/<your GitHub handle>/SynapseML.git
    • This will automatically add your fork as the default remote, called origin
  5. Go to the directory where you cloned the repo (e.g., SynapseML) with cd SynapseML
  6. Add another Git remote to track the original SynapseML repo. It's recommended to call it upstream:
    • git remote add upstream https://github.com/microsoft/SynapseML.git
  7. Run sbt to compile and grab datasets
    • sbt setup
  8. Install IntelliJ
  9. Configure IntelliJ
    • Install Scala plugin during initialization
    • Open the synapseml directory from IntelliJ
    • If the project does not import automatically, click on build.sbt and import the project
  10. Prepare your Python Environment
    • Install Miniconda
    • Note: if you want to run conda commands from IntelliJ, you may need to select the option to add conda to PATH during installation.
    • Create the synapseml conda environment by running conda env create -f environment.yml from the synapseml directory, then activate it with conda activate synapseml.
      note

      If you're using a Windows machine, please remove the horovod requirement from the environment.yml file, because Horovod installation supports only Linux and macOS. Horovod is used only by the synapse.ml.dl namespace.

  11. Install pre-commit
    • This repository uses the pre-commit tool to manage git hooks and enforce linting/coding styles.
    • The hooks are configured in .pre-commit-config.yaml.
    • To use the hooks, please run the following commands:
    pip install pre-commit
    pre-commit install
    • pre-commit should now run automatically on every git commit to find and fix linting issues.
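
The Git and SBT steps above can be sketched as a single shell function. This is a minimal sketch, not an official script: the function name setup_synapseml_fork is hypothetical, and the upstream URL assumes the original repository lives at github.com/microsoft/SynapseML.

```shell
# Hypothetical helper (not part of the SynapseML repo).
# Usage: setup_synapseml_fork <your GitHub handle>
setup_synapseml_fork() {
  if [ -z "${1:-}" ]; then
    echo "usage: setup_synapseml_fork <your GitHub handle>" >&2
    return 1
  fi
  # Clone your fork; it becomes the default remote, origin.
  git clone "https://github.com/$1/SynapseML.git" &&
  cd SynapseML &&
  # Assumption: the original repo is microsoft/SynapseML.
  git remote add upstream https://github.com/microsoft/SynapseML.git &&
  # Compile and download the test datasets.
  sbt setup
}
```

Because the commands are chained with &&, the function stops at the first step that fails, so sbt setup never runs against a partial clone.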

NOTE

If you contribute to the SynapseML repo regularly, keep your fork synced with the upstream repository. See GitHub's documentation on syncing a fork for the available techniques.
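
One common way to sync from the command line is sketched below. It assumes a remote named upstream has already been added and that the branch to sync is master; both are assumptions, so pass a different branch name if yours differs. The function name sync_fork is hypothetical.

```shell
# Hypothetical helper: update a local branch from upstream and push it to your fork.
# Usage: sync_fork [branch]   (branch defaults to master, which is an assumption)
sync_fork() {
  branch="${1:-master}"
  git fetch upstream &&
  git checkout "$branch" &&
  # --ff-only refuses to create a merge commit if the local branch has diverged.
  git merge --ff-only "upstream/$branch" &&
  git push origin "$branch"
}
```

Using --ff-only is a deliberate choice here: if your local branch carries commits that upstream lacks, the merge stops instead of silently creating a merge commit, and you can then decide whether to rebase or merge by hand.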

Publishing and Using Build Secrets

To use secrets in the build, you must have access to the synapsemlkeyvault and the Azure subscription. If you are a Microsoft-internal contributor and would like to be added, please reach out to synapseml-support@microsoft.com.

SBT Command Guide

Scala build commands

compile, test:compile and it:compile

Compiles the main, test, and integration test classes respectively

test

Runs all synapseml tests

scalastyle

Runs scalastyle check

unidoc

Generates documentation for Scala sources
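
As an illustration, the Scala commands above can be chained into a quick check to run before pushing a change. This is only a sketch: the function name scala_precommit_check is hypothetical, and it deliberately leaves out test, since running the whole suite may take a long time.

```shell
# Hypothetical helper: compile all three configurations, then run the style check.
# sbt accepts several commands in one invocation and runs them in order.
scala_precommit_check() {
  sbt compile test:compile it:compile scalastyle
}

# Runs the full test suite; invoke separately when you want it.
scala_run_all_tests() {
  sbt test
}
```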

Python Commands

createCondaEnv

Creates a conda environment named synapseml from environment.yml if it does not already exist. This environment is used for Python testing. Activate it before using the Python build commands.

cleanCondaEnv

Removes synapseml conda env

packagePython

Compiles Scala, runs the Python generation scripts, and creates a wheel

generatePythonDoc

Generates documentation for generated python code

installPipPackage

Installs generated python wheel into existing env

testPython

Generates and runs python tests
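
A possible Python workflow built from these commands is sketched below. The function name python_build_and_test is hypothetical. The guard relies on CONDA_DEFAULT_ENV, the variable conda sets to the name of the active environment, and assumes the environment is called synapseml as described under createCondaEnv.

```shell
# Hypothetical helper: package, install, and test the generated Python code.
# Create the env once with: sbt createCondaEnv
# Then activate it with:    conda activate synapseml
python_build_and_test() {
  if [ "${CONDA_DEFAULT_ENV:-}" != "synapseml" ]; then
    echo "activate the synapseml conda env first: conda activate synapseml" >&2
    return 1
  fi
  # Build the wheel, install it into the active env, then run the Python tests.
  sbt packagePython installPipPackage testPython
}
```

The guard fails fast when the wrong environment is active, which avoids installing the wheel into an unrelated environment.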

Environment + Publishing Commands

getDatasets

Downloads all datasets used in tests to target folder

setup

Combination of compile, test:compile, it:compile, getDatasets

package

Packages the library into a jar

publishBlob

Publishes the jar to synapseml's Azure blob-based Maven repo. (Requires keys)

publishLocal

Publishes library to local maven repo

publishDocs

Publishes the Scala and Python docs to synapseml's build Azure storage account. (Requires keys)

publishSigned

Publishes the library to sonatype staging repo

sonatypeRelease

Promotes the published sonatype artifact