This article details how Axonize and Microsoft teamed up to optimize Axonize’s build and release pipeline. After a value stream mapping (VSM) session, key areas of focus were identified and a hackfest took place to implement the following DevOps practices:

  • Infrastructure as Code (IaC)
  • Continuous integration (CI)
  • Continuous delivery (CD)
  • Automated testing

These were implemented using Azure Resource Manager templates, Visual Studio Team Services, and the NUnit test framework.

The hackfest team included members from Axonize, Microsoft, and Cloud Valley (a Microsoft partner).

Axonize:

  • Omri Cohen – CTO & VP R&D
  • Shachar Levi – Devices & Connectivity Leader
  • Adiel Oz – Frontend Leader
  • Amir Unger – Stream processing & Analytics Leader
  • Hagai Maya – DevOps Engineer

Microsoft:

  • Ori Zohar – Senior Technical Evangelist
  • Mor Shemesh – Software Development Engineer

Cloud Valley:

  • Asaf Nakash – CTO

The hack team

Customer profile

Axonize is a startup located in Tel-Aviv, Israel, offering an IoT backend-as-a-service platform enabling customers to create multiple applications in multiple industries with minimal time and effort. Axonize simplifies the process of creating IoT applications with depth and richness of business logic via simple configuration and without specialized back-end development resources.

Axonize is an alumnus of the Microsoft Accelerator in Tel Aviv. The company has been working with Cloud Valley, a Microsoft gold partner, to address some of its technological challenges, so Cloud Valley was an important part of the hackfest team.

Problem statement

Because the company offers a platform in the IoT space, the solution architecture uses a variety of Azure services, including Azure IoT Hub, Azure HDInsight, the Web Apps feature of Azure App Service, Azure Redis Cache, and Azure SQL Database, as well as Azure Virtual Machines. For some customers, the system is hosted by Axonize; for others, it is deployed in their own cloud environment.

This complex architecture, together with the growing number of customers and limited resources of a growing startup, has led Axonize to recognize some pain points in its development and deployment process. These pain points included:

  • Automation gaps in the build process.
  • Complex manual deployment into existing Staging and Production environments as well as creation of new environments on customers’ Azure subscriptions.
  • Challenges in deploying and integrating new Apache Storm topologies into the HDInsight cluster.
  • Lack of automated testing.

These pain points manifested as long lead times (for example, deploying the solution in a customer’s environment could take up to a day) and as instability whenever new features were integrated into the Staging environment. In addition, when issues arise, rollback is not possible, so a roll-forward approach is used; because of the automation gap, rolling forward is neither quick nor easy.

“One of our biggest challenges was our deployment process. At Axonize we have a very complex architecture relying on Microsoft Azure IaaS (VMs) and PaaS (HDInsight Cluster, Redis, IoT Hub, App Service and much more…) services. Our deployment process is complex as well. Automating this process is very important to us. …

“Managing our infrastructure in a version control repository (Git) is something we knew we need but haven’t done. …

“Our automated tests were running only on development machines; integrating them to our deployment process is highly important.”

— Omri Cohen, Axonize CTO & VP R&D

Solution, steps, and delivery

Reviewing Axonize's cloud architecture


The hackfest began by reviewing Axonize’s architecture and discussing the pain points mentioned above. This discussion led to the decision to focus on two main system components:

  • The API module: an ASP.NET web app hosted on Azure App Service. This module interacts with many of the other modules in the system, so new versions of its code have broad implications for system stability.
  • The HDInsight Storm cluster: Axonize uses a Windows HDInsight cluster to run .NET Storm topologies that consume events from an Azure IoT Hub. Deploying new topology code and integrating it with the rest of the system is challenging.

Value stream mapping

With a better understanding of challenges and system architecture, the team used the value stream mapping (VSM) methodology to illustrate the delivery process. In a detailed and methodical process, the team reviewed and graphically described Axonize’s current process from start of development to production code deployments while marking manual steps, tools, owners, environments, and process times per step.

Value Stream Map


The VSM exercise helped the team identify key points about the current process:

  1. The process includes two environments, Staging and Production. The Staging environment is meant to be a pre-production environment but in practice is used for initial integration. This leads to a lot of instability and a high scrap rate (S/R) of 50%: components are deployed to Staging, routinely break other components, and have to be fixed and redeployed.
  2. Deployment from Staging into Production is done about once a month while deployments into Staging are much more frequent and may happen daily. This means that improving early steps of the pipeline will have a bigger impact on process efficiency.
  3. Testing can be better automated throughout the pipeline. Currently, some test steps are automated using NUnit, while others are done with Postman or a browser.
  4. Axonize is using Visual Studio Team Services (VSTS) to host its Git repositories and to manage sprint work items, so VSTS is a natural tool to be used for implementing the rest of the pipeline as well.
  5. Deploying new environments (in scenarios where customers need to run the system outside the multitenant deployment) is currently completely manual and takes a long time.

“With this process we identified our bottlenecks and some road-blocks we have. This step helped us a lot to identify the issues that held us back and we were able to then plan forward. I must say, I was a skeptic at first, though; this step proved itself as a crucial step toward success!”

— Omri Cohen, Axonize CTO & VP R&D


After additional discussions, the following tasks were defined for the implementation part of the hackfest:

  • A third environment named Development will be added to the pipeline. This environment will be used for integration tests, allowing issues to arise earlier in the process and for the Staging environment to become more stable and a true pre-production environment.
  • Resource Manager templates will be authored to be used for build and release steps and will also help automate the deployment of new environments for specific customer deployments.
  • Storm topology submission will be automated using scripts.
  • Automated builds will be defined using VSTS for continuous integration and nightly builds.
  • Releases for the three defined environments (Development, Staging, Production) will be defined using VSTS release management and build artifacts from the CI and nightly builds.

Defining a Development environment

As the VSM exercise illustrated, adding an environment for integration testing can help catch integration issues earlier and also help the Staging environment become more stable. The main problem with adding another environment was that Axonize preferred to avoid having three HDInsight clusters deployed at all times (one for each of the three environments) due to resource costs. Therefore, the Development environment was defined to include all resources except for the Storm cluster and IoT Hub. The partial Development environment will still be very valuable for integration tests that do not include the Storm cluster and so will be used in the continuous integration process of the API module.

Infrastructure as Code (IaC) using Resource Manager templates

Azure Resource Manager allows deploying and interacting with Azure resources in a consistent and well-defined manner. One of its useful features is the ability to describe complex collections of resources and their dependencies in JSON template files and then deploy the resources they describe with a single RESTful API call to Resource Manager. It’s important to note that Resource Manager template deployments are incremental by default: deploying a template into a resource group a second time only adds resources that are new to the template and does not conflict with existing resources from a past deployment.
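
For illustration, here is a minimal sketch of deploying such a template with the AzureRM PowerShell cmdlets. The resource group, location, and file names below are placeholders, not Axonize’s actual values:

  # Create (or reuse) the target resource group
  New-AzureRmResourceGroup -Name "axonize-dev-rg" -Location "West Europe" -Force

  # Deploy the template; incremental mode adds or updates only the resources described in the template
  New-AzureRmResourceGroupDeployment `
    -ResourceGroupName "axonize-dev-rg" `
    -TemplateFile ".\environment.json" `
    -TemplateParameterFile ".\environment.dev.parameters.json" `
    -Mode Incremental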

In the context of the hackfest, using Resource Manager templates is a critical part of the solution that will mitigate the pain points Axonize was experiencing:

  • Describing the three environments via Resource Manager templates allows creating automatic pipelines for release management.
  • Resource Manager templates can also be used to simplify and accelerate the deployment of fresh new environments in customers’ subscriptions when these scenarios are needed.
  • Since Resource Manager templates are JSON files, they can be committed in source control together with the code; changes that include a code and environment change can be bundled in a single commit. This practice can remove a lot of the stability issues seen with deploying changes to the system.
  • Rollback and roll-forward can now happen at the infrastructure level and not just at the code level.

The above are common benefits of practicing Infrastructure as Code (IaC) in DevOps pipelines.

To serve the Development, Staging and Production environments, the team defined two templates to be used:

  1. A template that will include all resources except for the HDInsight cluster and IoT Hub.
  2. A template that will include only the HDInsight cluster and IoT Hub.

The first template will be used to deploy the Development environment, and the combination of the two will be used to deploy resources in the Staging and Production environments. It’s important that the first template be shared by all environments so that changes to it are reflected everywhere; creating a completely separate template for the Development environment could have caused inconsistencies between it and Staging and Production.
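
As a sketch (template, parameter file, and resource group names are hypothetical), the Development environment would be deployed from the shared base template alone, while Staging and Production would layer the cluster template on top of it:

  # Development: shared base template only (no HDInsight cluster or IoT Hub)
  New-AzureRmResourceGroupDeployment -ResourceGroupName "axonize-dev-rg" `
    -TemplateFile ".\base-resources.json" -TemplateParameterFile ".\dev.parameters.json"

  # Staging: shared base template plus the HDInsight cluster / IoT Hub template
  New-AzureRmResourceGroupDeployment -ResourceGroupName "axonize-stg-rg" `
    -TemplateFile ".\base-resources.json" -TemplateParameterFile ".\stg.parameters.json"
  New-AzureRmResourceGroupDeployment -ResourceGroupName "axonize-stg-rg" `
    -TemplateFile ".\cluster-iothub.json" -TemplateParameterFile ".\stg.parameters.json"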

Enforcing a resource naming convention

The team decided to use Resource Manager template parameters for a prefix and postfix to implement a naming convention for resources in the different environments. For example, when deploying a template to the Development environment, the prefix “dev” is passed as a parameter, ensuring all resource names begin with that prefix and are easy to identify later.

An example of a prefix parameter definition in the parameters section of the template:

  ...
  "parameters": {
      "envPrefix": {
        "type": "string",
        "allowedValues": [
          "stg",
          "dev",
          "prod"
        ]
      }
      ...
  }
  ...


An example of how a resource name is defined in the template using the prefix and postfix parameters, along with string functions to reference and concatenate them:

  ...
  "databaseName": "[concat(parameters('envPrefix'),'-', parameters('sqlDatabase'), '-db', parameters('envPostfix'))]"
  ...

Automating Storm topology submission

As mentioned above, Axonize runs Apache Storm on a Windows-based Azure HDInsight cluster. The cluster runs .NET Storm topologies, which until now have been deployed manually through the Storm Dashboard. For Storm topology submission to become part of the CI/CD pipeline the team was building in the hackfest, it needed to run as a build or release step via a script or command-line call. Without this, the full deployment of the Axonize system could not be automated.

Using the examples in the HDInsight GitHub repository and the PowerShell scripts it provides, the team created a three-step submission process: creating a spec file, zipping it together with the necessary DLLs, and submitting the .zip file to the cluster:

  # Create spec
  powershell -executionpolicy bypass -File .\CreateScpSpec.ps1 -TopologyAssembly {PathToAssemblyDLL} -SpecFile {SpecFileName}

  # Create zip package
  powershell -executionpolicy bypass -File .\CreateScpPackage.ps1 -TopologyAssemblyDir {Where the bin\debug is} -PackageFile {packageName} -JavaDependencies {PathToJavaDependenciesJar}

  # Submit topology
  powershell -executionpolicy bypass -File .\SubmitSCPNetTopology.ps1 -ClusterUrl {ClusterUrl} -ClusterUsername {ClusterUsername} -ClusterPassword {ClusterPassword} -SpecFile {PathToSpecFile} -PackageFile {packageName}


The above steps can now be integrated and used in VSTS build and release definitions (see below).

Automated builds

The team created two build definitions in VSTS:

  1. A continuous integration (CI) build definition.
  2. A nightly build definition.

CI builds

This build is triggered every time a commit is pushed into the Git integration branch. This trigger is expected to happen frequently during the development process. Build steps include Visual Studio build, running unit tests using the NUnit framework, and publishing the build artifacts.

CI build steps

Nightly builds

Full integration tests cannot run on the Development environment because it lacks the Storm cluster and IoT Hub, yet running them is important for identifying integration issues early, before deploying to the Staging environment. The team therefore defined an additional build process with steps for deploying a temporary HDInsight cluster, running the tests, and then deleting the cluster to save costs. Because these steps take time, they are scheduled to run every night rather than as part of the continuous integration process, so they do not add lead time to the CI pipeline. These nightly builds run full integration tests against the Development environment with minimal added cost, since the Storm cluster is only deployed for a short time.
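
In outline, the nightly flow corresponds to something like the following PowerShell (names are placeholders; in the actual pipeline each step is a separate VSTS build step):

  # 1. Create a temporary resource group and deploy the HDInsight cluster / IoT Hub template into it
  New-AzureRmResourceGroup -Name "axonize-nightly-rg" -Location "West Europe" -Force
  New-AzureRmResourceGroupDeployment -ResourceGroupName "axonize-nightly-rg" `
    -TemplateFile ".\cluster-iothub.json" -TemplateParameterFile ".\nightly.parameters.json"

  # 2. Submit the Storm topology and run the full integration tests (see the earlier sections)

  # 3. Delete the temporary resource group so the cluster is only billed for the duration of the run
  Remove-AzureRmResourceGroup -Name "axonize-nightly-rg" -Force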

The build artifacts from the nightly build will be used for releases to the Staging and Production environments.

Below is a screenshot of the build steps for the nightly build:

Nightly build steps

A few things to note in the above build definition:

  • A Resource Manager template is used to deploy a temporary HDInsight cluster and the last step deletes the resource group holding the cluster.
  • Three PowerShell scripts are used to submit the Storm topology into the cluster as described in the previous section.
  • There are two steps that run tests:
    • The first runs unit tests that do not require an environment, just as in the CI builds.
    • The second runs integration tests that require a full environment.

Integrating NUnit tests

Axonize has chosen the popular NUnit test framework to run all testing scenarios for the API module. To include NUnit tests in a project, the NUnit package needs to be installed in Visual Studio:

NUnit install
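
For reference, the same packages can be added from the Visual Studio Package Manager Console (shown here for NUnit 3; NUnit 2.x projects would use the NUnitTestAdapter package instead). The adapter package is what allows the build agent to discover and run the tests:

  # Add the NUnit framework to the test project
  Install-Package NUnit

  # Add the matching Visual Studio test adapter so the test step can discover and run the tests
  Install-Package NUnit3TestAdapter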


For NUnit to run properly as part of a build definition, the “Path to Custom Test Adapters” field needs to be configured as shown below:

NUnit test step config in VSTS
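
A typical (illustrative) value for this field is the solution’s restored NuGet packages folder, which contains the NUnit test adapter installed in the previous step:

  $(Build.SourcesDirectory)\packages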


Once NUnit has been configured correctly with the test adapter, test results can be viewed in VSTS. In this case, on the first run half of the tests passed and half failed. The test result screen provides additional details to understand which tests failed and why (see Error Message and Stack Trace below):

Test results view in VSTS


Release management

For a continuous deployment implementation, the team used VSTS release management capabilities to define release steps.

Delivery to Development environment

The Development environment deployment was defined in a release definition that uses the CI build artifacts and is triggered automatically whenever a CI build succeeds. The release definition includes Resource Manager template deployment, deployment of the web app build artifacts, and running integration tests with NUnit (excluding integration tests that involve the Storm cluster, which is not deployed in the Development environment).

This means that the part of the pipeline on which Axonize spends most of its time (as identified during the VSM session) is now completely automated:

  • A merge from a feature branch is committed and pushed into the integration branch.
  • This commit triggers a CI build that builds the code and runs unit tests.
  • On build success, a release is triggered that deploys the new build into the Development environment and integration tests are executed.

Delivery to Staging and Production environments

For the Staging and Production environments, delivery uses artifacts from the nightly build, which include more extensive integration tests and therefore carry a higher level of confidence. Although the deployment process to these environments is completely automated using release step definitions in VSTS, it is not automatically triggered. Axonize’s process calls for less frequent deployments to Staging and Production, so delivery to these environments is launched manually by design. In addition, the release process was configured with an approval step to safeguard against unplanned deployments; approval rights were given to only two people in the company.

Below is an outline of the Staging release steps; the Production steps are identical (cloned from the Staging definition):

Release steps

Conclusion

By the end of the hackfest, Axonize had greatly optimized its delivery pipeline, and many of the pain points identified beforehand had been addressed. In summary:

  • A third Development environment was added to the pipeline, helping the team identify issues before they manifest in the Staging environment and thereby contributing to its stability.
  • Storm topology deployment was automated to be used in the CI/CD flow.
  • Continuous integration was implemented with automated builds and tests using VSTS.
  • Nightly builds were introduced to the pipeline to extend the scope of integration tests without having to keep expensive deployments running at all times.
  • Continuous delivery was implemented for the Development environment, and automated release steps were defined for manual, controlled delivery into Staging and Production.
  • Infrastructure as Code practices were introduced using Resource Manager templates to serve all of the above as well as to simplify deployments for customer scenarios that required deploying new environments.

“Axonize’s DevOps architecture and CI/CD process gained a lot from this hackfest. It was an incredibly productive three-day hackfest.”

— Omri Cohen, Axonize CTO & VP R&D

Future directions

During the hackfest, the following directions were recognized as future areas of focus for the Axonize team:

  • Extend the number and scope of unit tests and integration tests.
  • Add front-end tests using the Enzyme framework.
  • Configuration management—use Resource Manager template parameters to inject web apps with configurations at deployment time.
  • Leverage Azure Key Vault for managing secrets.
  • Improve the monitoring capabilities of the system.

Appreciation

Thanks to the Axonize team for sharing their challenges and being open-minded about the opportunity to reexamine their processes. The team was truly committed to learning how to “do DevOps right,” and this attitude was a major factor in the hackfest’s success and in how much got done.

Thanks also to Asaf Nakash, CTO of Cloud Valley, who was a most valuable partner in this engagement.

Resources