Digamore Entertainment GmbH built their new game backend on top of Azure Service Fabric, which helped them to reduce operational overhead. In this hackfest, we worked on the DevOps pipeline to increase the productivity of their small development team. To achieve our goals, we needed to introduce DevOps practices such as Infrastructure as Code (IaC) and continuous delivery, and integrate open source software such as Apache JMeter into Visual Studio Team Services.

“Though we are a small team and have no one dedicated to DevOps, we were able to develop a system that can handle the requirements of a massive multiplayer online game.” —Marko Flod, Software Engineer, Digamore Entertainment GmbH

Key technologies used

Core team

  • Marko Flod – Software Engineer, Digamore Entertainment GmbH
  • Frédéric Nagler – Senior Unity Programmer, Digamore Entertainment GmbH
  • Peter Kirchner – Technical Evangelist, Microsoft
  • Andreas Pohl – Technical Evangelist, Microsoft

Customer profile

Digamore Entertainment GmbH is a new company based in Cologne, Germany, that works on mobile games. Although the company is new, its team is made up of seasoned, well-known game developers from companies such as Ubisoft Blue Byte and Flare Games. They decided to use Azure Service Fabric for their new mobile football manager game, Football Empire. Using Azure Service Fabric helped them develop a highly scalable multiplayer backend in a very short time frame.

Problem statement

Digamore Entertainment faced two major challenges at the beginning of the hackfest. First, they needed a backend architecture that could scale cost-efficiently from thousands to millions of monthly active users, and they had to prove that their architecture could actually do so. Second, to provide that proof, they needed a way to automatically load test their cluster, detect performance bottlenecks, and detect performance degradation across releases. This kind of load test should not be run against a production environment, but against an isolated test environment.

Solution, steps, and delivery

Identifying opportunities for improvement

By creating a value stream mapping that included the configuration for every environment, we discovered that the creation of new environments and the deployment to test, staging, and production were done manually.

Value stream mapping


With the value stream mapping in hand, it was clear that we couldn’t implement continuous testing with Visual Studio Team Services because setting up and deploying an environment took far too long: testing and deploying more than ten changes during a sprint would take days. To achieve our goal of continuous load testing against the backend, we needed to reduce the operations overhead.

To achieve this, we decided to implement the following before we started working on the load test:

  • Automatically create new isolated test clusters by utilizing Infrastructure as Code (IaC).
  • Manage the deployment to the environments by using continuous delivery.

After we improved these DevOps areas, we would be able to start automated load testing by using Visual Studio Team Services and Apache JMeter. In our scenario, we wanted to automate the whole process: building the code, creating the Azure Service Fabric cluster, and deploying into the cluster. This meant we needed a way to automatically deploy a secure Azure Service Fabric cluster without manual interaction.

Therefore, we needed to create an Azure Resource Manager (ARM) template to automate the deployment of resources. This way we would define the infrastructure as code in a repeatable, versionable, and testable way that lets multiple people collaborate. This is the basis for continuous integration and continuous delivery, which we would need later on to deploy automatically to a different environment.

Continuous delivery or continuous deployment?

At the start of the hackfest, we talked about reducing the need for human interaction in the deployment process because we wanted to cut the deployment time from hours to seconds. We discussed continuous deployment and the automation it involves, and noted that a real continuous deployment pipeline needs to be trusted because it publishes changes automatically.

But how do you establish trust in your pipeline? This is not easy to answer because it depends on the team and the infrastructure. However, we agreed that one of the most important prerequisites is very high test coverage of your code. This is not easy to achieve, and in our case it would be even harder because it would require us to mock some of the Service Fabric actor models, which would take time and resources.

Even if you have this test coverage, you still need to establish trust in your pipeline, and we think that continuous delivery is one step in this direction. The difference between continuous delivery and continuous deployment is small but quite important. In continuous delivery, you build a pipeline that still needs a human to confirm a new deployment or start a new pipeline step, whereas in continuous deployment, every step is automated. Because of this, we decided to start with continuous delivery until the test coverage grows and the team is familiar with the concepts.

For more information about the differences between continuous delivery and continuous deployment, see the diagram in this blog post: Continuous Delivery vs. Continuous Deployment: What’s the Diff?.

Creating the ARM template

There are several ways to create an ARM template: writing it from scratch, using the quick start templates, starting with the Visual Studio project templates for Service Fabric, exporting a live environment, or using the export function of a wizard. We chose the last option and started with the wizard in the Azure portal.

Which way you choose depends on the result you expect and on what each approach can provide. For example, we would only recommend starting from scratch for teaching purposes. In real-world projects, you rarely have the time to start from scratch.

In our case, the wizard created the best starting point for a template that we could adapt later. If something is missing from the wizard’s export, you can additionally export the live environment to get specific live settings or properties.

Create a key vault

Because we wanted to create a secure Service Fabric cluster and needed a place to store certificates and secrets, we created a key vault by using Azure Key Vault. From a DevOps and source control perspective, private keys, connection strings, passwords, and similar secrets should not be kept in configuration files, because those files end up in a source control system, accidentally or on purpose.

For our secure Service Fabric cluster, we created a self-signed certificate and uploaded it to the newly created key vault. We could have automated the creation of the key vault, but this was not necessary in our case because we only needed to automate the deployment of the cluster, not the creation of the key vault. If you’re interested in automating the creation of the key vault and the certificate, see the Service Fabric Helper scripts on GitHub.

To create a new key vault, first create a new resource and then search for Key Vault.

Select Azure Key Vault


Next, we named the new key vault, and then selected a subscription, resource group, location, and pricing tier. It was then important to set two advanced access policies:

  • Enable access to Azure Virtual Machines for deployment
  • Enable access to Azure Resource Manager for template deployment

Create Azure Key Vault


Create and store a cluster certificate in the key vault

Creating a self-signed certificate with PowerShell is quite easy. The following PowerShell lines create a self-signed certificate and export it to the current working directory.

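# Create a self-signed certificate for the cluster's DNS name in the local machine store.
$cert = New-SelfSignedCertificate -DnsName mycluster.westeurope.cloudapp.azure.com -CertStoreLocation "cert:\LocalMachine\My"
# Protect the exported private key with a password (replace the placeholder).
$password = ConvertTo-SecureString -String "your-own-password" -Force -AsPlainText
# Export the certificate, including its private key, to the current working directory.
Export-PfxCertificate -Cert $cert -FilePath ".\my-cert-file.pfx" -Password $password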


For the Service Fabric cluster that we create later, we needed three pieces of information:

  • Thumbprint of the certificate
  • Resource ID of the Azure Key Vault (for the source key vault)
  • URL of the certificate in the Azure Key Vault


You can get the thumbprint of the certificate easily via PowerShell:

PS C:\Users\pkirch\Desktop> cd cert:
PS Cert:\> cd .\LocalMachine\my
PS Cert:\LocalMachine\my> ls


You then get the following output.

Get thumbprint of certificate
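The thumbprint that Windows displays is the SHA-1 hash of the DER-encoded certificate, so it can also be computed outside of PowerShell. The following is a minimal sketch, assuming you have the public certificate available as DER bytes or PEM text (the PFX file exported earlier would first have to be converted, which is not shown here). The function names are illustrative only.

```python
import hashlib
import ssl


def thumbprint_from_der(der_bytes: bytes) -> str:
    """Return the SHA-1 thumbprint (uppercase hex) of a DER-encoded certificate."""
    return hashlib.sha1(der_bytes).hexdigest().upper()


def thumbprint_from_pem(pem_text: str) -> str:
    """Convert a PEM certificate to DER, then compute its thumbprint."""
    return thumbprint_from_der(ssl.PEM_cert_to_DER_cert(pem_text))


def thumbprint_from_file(path: str) -> str:
    """Read a DER-encoded .cer file (hypothetical path) and compute its thumbprint."""
    with open(path, "rb") as handle:
        return thumbprint_from_der(handle.read())
```

The result is a 40-character hexadecimal string in the same form as the thumbprint listed by the certificate store.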


You can get the resource ID of the Azure Key Vault from the Properties page of the key vault in the Azure portal.

Key Vault resource ID


You can get the URL of the certificate from the Secrets section of the Azure Key Vault Properties page; under Settings, select Secrets, and in the Secrets list, select the certificate.

certificate resource id


Select the latest (and possibly only) version of the certificate and you get details, including the URL to the certificate.

certificate resource id and URL


With that you have collected the three pieces of information about the key vault and the certificate that you will need to create a secure Service Fabric cluster.

In our case the information looks like the following.

Name | Value
Source key vault | /subscriptions/26a6XXXX-1458-XXXX-8c8b-XXXX8425e92/resourceGroups/digamore-assets/providers/Microsoft.KeyVault/vaults/digamore
Certificate URL | https://digamore.vault.azure.net/secrets/sfccert/827e5ba9ea2f4d2ab8c71866732bee2a
Certificate thumbprint | 11FB42A4103D0E33E80F6B2000870929F86778FE
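Because these three values are copied by hand from the portal, a quick sanity check before pasting them into the wizard or a parameters file can save a failed deployment. The following is a small sketch with hypothetical helper names; it only checks the shape of the values, not whether the resources actually exist.

```python
from urllib.parse import urlparse


def key_vault_resource_id(subscription_id: str, resource_group: str, vault_name: str) -> str:
    """Build the resource ID of a key vault from its parts."""
    return (
        f"/subscriptions/{subscription_id}/resourceGroups/{resource_group}"
        f"/providers/Microsoft.KeyVault/vaults/{vault_name}"
    )


def parse_secret_url(url: str) -> dict:
    """Split a Key Vault secret URL into vault name, secret name, and version."""
    parsed = urlparse(url)
    parts = parsed.path.strip("/").split("/")
    if parsed.scheme != "https" or len(parts) != 3 or parts[0] != "secrets":
        raise ValueError(f"not a versioned Key Vault secret URL: {url}")
    return {
        "vault": (parsed.hostname or "").split(".")[0],
        "secret": parts[1],
        "version": parts[2],
    }


def looks_like_thumbprint(value: str) -> bool:
    """A SHA-1 thumbprint is exactly 40 hexadecimal characters."""
    return len(value) == 40 and all(c in "0123456789abcdefABCDEF" for c in value)
```

For example, `parse_secret_url` applied to the certificate URL above yields the vault name `digamore` and the secret name `sfccert`.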


Create a Service Fabric cluster and template

After creating the key vault and a certificate, we created a new Service Fabric cluster via the Azure portal. Search for Service Fabric Cluster, select it, and then select Create on the Service Fabric cluster blade.

Select Service Fabric cluster


Give the cluster a name, select the operating system (in our case Windows), choose a user name and password (the password will not be exported in the template, but we have to enter it anyway), select the subscription of your choice, enter a new resource group name, and finally select a location.

Create Service Fabric cluster step 1


In the second step of the wizard, we configured the cluster. We have only one node type. Give the node type a name (in our case we just named it nodetype1). For the durability tier, we chose Bronze. For virtual machine size, we selected Standard_D1_v2. For reliability tier, we also chose Bronze. For initial VM scale set capacity, we selected 3. We needed some custom endpoints for ports 80 and 443. At this time, we do not need to configure advanced settings.

node type configuration


We activated Diagnostics and have no custom fabric settings. Fabric upgrade mode is Automatic by default.

Create Service Fabric cluster step 2


In the third step, Security, we configured the Service Fabric cluster as Secure and filled in the information we collected in the previous section (source key vault, certificate URL, and certificate thumbprint).

Create Service Fabric cluster step 3


In the fourth and last step, we just had to verify our settings. Because we were primarily interested in the template that the wizard creates for us, we didn’t click Create. Instead, to the right of the Create button, we selected Download template and parameters.

Create Service Fabric cluster step 4


With that we get a view of the template and the parameters and the chance to download both. In the zip archive we get, we have a lot of other files to use for deploying the template.

download template


We downloaded the template and parameters. After downloading, we closed the current page and created the cluster by clicking the Create button. We used the live cluster as comparison for our template customization.

Creating the Service Fabric cluster takes a little while.

deployment of Service Fabric cluster


Walk through the exported template

Let us briefly walk through the template. After a short analysis, we will customize it in the next section.

The downloaded zip archive contains seven files. Two of them are important; the other five are only there to get you started:

File | Content
template.json | ARM template of our Service Fabric cluster
parameters.json | Parameters used for the template file
deploy.ps1 | Deployment script for PowerShell
deploy.sh | Deployment script for shell
deployer.rb | Deployment script for Ruby
DeploymentHelper.cs | Deployment code for C#
deploy-preview.sh | Deployment script for shell using CLI 2.0


We have four main sections in template.json:

  • The parameters section contains values that can be set from outside the template.
  • The variables section contains values that are defined inside the template.
  • The resources section declares the resources that Azure has to deploy.
  • The outputs section contains values that are collected after the deployment.
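To get a quick overview of an exported template before customizing it, you can load the JSON and list what each section contains. The following is a minimal sketch; the inline example template is a made-up miniature, not the file the wizard exported, and the function name is illustrative.

```python
import json

# An ARM template must have these top-level keys; the others are optional.
REQUIRED_KEYS = ("$schema", "contentVersion", "resources")


def summarize_template(template: dict) -> dict:
    """List parameter, variable, and output names plus the resource types."""
    missing = [key for key in REQUIRED_KEYS if key not in template]
    if missing:
        raise ValueError(f"template is missing required keys: {missing}")
    return {
        "parameters": sorted(template.get("parameters", {})),
        "variables": sorted(template.get("variables", {})),
        "resource_types": [res.get("type") for res in template["resources"]],
        "outputs": sorted(template.get("outputs", {})),
    }


# Hypothetical miniature template, used only to demonstrate the function.
EXAMPLE_TEMPLATE = json.loads("""
{
  "$schema": "https://schema.management.azure.com/schemas/2015-01-01/deploymentTemplate.json#",
  "contentVersion": "1.0.0.0",
  "parameters": {"clusterName": {"type": "string"}, "adminPassword": {"type": "securestring"}},
  "variables": {"vmStorageName": "examplestorage1"},
  "resources": [{"type": "Microsoft.ServiceFabric/clusters", "name": "[parameters('clusterName')]"}],
  "outputs": {}
}
""")

print(summarize_template(EXAMPLE_TEMPLATE))
```

Running this against the real template.json makes it easy to see which of the many wizard-generated parameters are worth keeping as parameters.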

There are a lot of parameters. That’s the way the wizard creates the template. Only a few of them are important for us to parameterize. If you have time, you could demote most of them from parameters to variables.

The variables are mostly about concatenated resource IDs, nothing that’s very important in our scenario to discuss.

Customize the template

For security reasons, the admin password for the virtual machines is not saved in the template or parameters file by default. But we wanted to automate the deployment without having to provide a password at every deployment, and saving the password in plain text is not an option. Therefore, we saved the admin password in the key vault and referenced that secret from the parameters file.

To save the admin password:

  1. Create a new secret for the virtual machine password in the Azure Key Vault.
  2. Select Manual as an upload option.
  3. Name the secret vmpassword.
  4. Enter the admin password as the value, and create the secret.

Instead of providing the password in the parameters.json file directly…

"adminPassword": {
    "value": "some-password-everyone-could-read"
}


… we reference the Azure Key Vault where we have saved the password. You have to provide the resource ID to the key vault and the name of the secret you want to use.

"adminPassword": {
    "reference": {
        "keyVault": {
            "id": "/subscriptions/26a6XXXX-1458-XXXX-8c8b-XXXX8425e92/resourceGroups/digamore-assets/providers/Microsoft.KeyVault/vaults/digamore"
        },
        "secretName": "vmpassword"
    }
}


In our scenario, we need to use the Azure Service Profiler. For the Service Profiler, we need two settings: the corresponding storage account name and the access key. Storing the access key of a storage account is also not appropriate in the template or parameters file, so we store it in the key vault.

"ServiceProfilerStorageName": {
    "value": "dmserviceprofiler"
},
"ServiceProfilerStorageSecret": {
    "reference": {
        "keyVault": {
            "id": "/subscriptions/26a6XXXX-1458-XXXX-8c8b-XXXX8425e92/resourceGroups/digamore-assets/providers/Microsoft.KeyVault/vaults/digamore"
        },
        "secretName": "dmserviceprofilerstorage"
    }
}
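If you create test environments on demand, it can be convenient to generate the parameters file instead of editing it by hand. The following is a sketch under that assumption: the helper names are hypothetical, the parameter and secret names are the ones used above, and the plain-text check is only a naming heuristic.

```python
import json

PARAMETERS_SCHEMA = (
    "https://schema.management.azure.com/schemas/2015-01-01/deploymentParameters.json#"
)


def plain(value):
    """A parameter whose value is stored directly in the file."""
    return {"value": value}


def key_vault_secret(vault_id: str, secret_name: str) -> dict:
    """A parameter whose value is read from Key Vault at deployment time."""
    return {"reference": {"keyVault": {"id": vault_id}, "secretName": secret_name}}


def build_parameters(vault_id: str) -> dict:
    """Assemble a parameters file that contains no secret in plain text."""
    return {
        "$schema": PARAMETERS_SCHEMA,
        "contentVersion": "1.0.0.0",
        "parameters": {
            "adminPassword": key_vault_secret(vault_id, "vmpassword"),
            "ServiceProfilerStorageName": plain("dmserviceprofiler"),
            "ServiceProfilerStorageSecret": key_vault_secret(vault_id, "dmserviceprofilerstorage"),
        },
    }


def plaintext_secrets(parameters_file: dict) -> list:
    """Return names that look like secrets but carry a literal value (heuristic)."""
    suspicious = ("password", "secret", "key")
    return [
        name
        for name, entry in parameters_file["parameters"].items()
        if "value" in entry and any(word in name.lower() for word in suspicious)
    ]


if __name__ == "__main__":
    vault = "/subscriptions/<subscription-id>/resourceGroups/<group>/providers/Microsoft.KeyVault/vaults/<vault>"
    print(json.dumps(build_parameters(vault), indent=4))
```

A check like `plaintext_secrets` could run in the build pipeline to stop a parameters file with a literal password from being committed.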


In the template.json file, we need to add an extension to the cluster’s virtual machine extension list. You can find this list (a JSON array) at the following path:

resource with "name": "[parameters('vmNodeType0Name')]", properties, virtualMachineProfile, extensionProfile, extensions


{
    "properties": {
        "publisher": "Microsoft.VisualStudio.ServiceProfiler",
        "type": "ServiceProfilerAgent",
        "typeHandlerVersion": "0.1",
        "autoUpgradeMinorVersion": true,
        "settings": {
            "config": {
                "_comment": "The config is managed by cloud settings. You can update it in Service Profiler portal -> Configure Agents."
            }
        },
        "protectedSettings": {
            "storageAccountName": "[parameters('ServiceProfilerStorageName')]",
            "storageAccountKey": "[parameters('ServiceProfilerStorageSecret')]",
            "storageAccountEndPoint": "https://core.windows.net"
        }
    },
    "name": "ServiceProfilerAgent"
}
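Adding the extension by hand works well for a single template. If the template is exported again later, the same change can be scripted. The following is a minimal sketch that appends an extension to every virtual machine scale set in a loaded template; it locates the resource by its type rather than by its name, which is an assumption of this sketch and not how the path above identifies it.

```python
import copy

VMSS_TYPE = "Microsoft.Compute/virtualMachineScaleSets"


def add_vmss_extension(template: dict, extension: dict) -> dict:
    """Return a copy of the template with the extension added to each scale set."""
    result = copy.deepcopy(template)
    for resource in result.get("resources", []):
        if resource.get("type") != VMSS_TYPE:
            continue
        # Walk (and create if needed) properties > virtualMachineProfile >
        # extensionProfile > extensions.
        extensions = (
            resource.setdefault("properties", {})
            .setdefault("virtualMachineProfile", {})
            .setdefault("extensionProfile", {})
            .setdefault("extensions", [])
        )
        # Skip the append if an extension with the same name is already present.
        if not any(ext.get("name") == extension["name"] for ext in extensions):
            extensions.append(extension)
    return result
```

The function leaves the input untouched and is safe to run repeatedly, because an extension with the same name is not added twice.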


Test the template

The downloaded template.zip contains several sample files that demonstrate how to deploy the template. In our project, we used PowerShell. The -Verbose parameter gives you detailed information about the deployment process. It’s not strictly necessary, but it helped us a lot to find bugs while customizing the template.

New-AzureRmResourceGroupDeployment -ResourceGroupName myresourcegroup -TemplateFile template.json -TemplateParameterFile parameters.json -Verbose


If you get the unhelpful error message Multiple error occurred: Conflict,Conflict,Conflict., chances are high that you forgot to set the advanced access policy Enable access to Azure Resource Manager for template deployment in the Azure Key Vault (see the section Create a key vault).

VERBOSE: Performing the operation "Creating Deployment" on target "myresourcegroup".
New-AzureRmResourceGroupDeployment : Multiple error occurred: Conflict,Conflict,Conflict. Please see details.


Tips for testing the template

We tested the template deployment via PowerShell several times. To do so, we reused the resource group name and tied the storage account name to the resource group name. We won’t do it this way again because it wasted time. After you delete a storage account, you cannot immediately reuse its name, so deploying the entire template, including storage accounts whose names were deleted shortly before, takes more time than usual. As a result, we no longer reuse storage account names but increase a counter instead; for example, mystorageaccount42.
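The counter approach is easy to script. The following is a small sketch with hypothetical function names; it also enforces the storage account naming rule (3 to 24 characters, lowercase letters and digits only) so that an invalid name fails before the deployment starts.

```python
import re

# Storage account names: 3-24 characters, lowercase letters and digits only.
_NAME_RULE = re.compile(r"[a-z0-9]{3,24}")


def storage_account_name(prefix: str, counter: int) -> str:
    """Combine a prefix and a counter into a valid storage account name."""
    name = f"{prefix}{counter}"
    if not _NAME_RULE.fullmatch(name):
        raise ValueError(f"invalid storage account name: {name!r}")
    return name


def next_storage_account_name(previous: str) -> str:
    """Increase the trailing counter of the previously used name by one."""
    match = re.fullmatch(r"([a-z0-9]*?)(\d*)", previous)
    if match is None:
        raise ValueError(f"invalid storage account name: {previous!r}")
    prefix, digits = match.groups()
    return storage_account_name(prefix, int(digits) + 1 if digits else 1)
```

For example, `next_storage_account_name("mystorageaccount42")` returns `mystorageaccount43`, which can then be passed to the deployment as a parameter value.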

Introducing multiple environments to Service Fabric

After we managed to have the Service Fabric cluster as Infrastructure as Code, we were able to deploy a new environment automatically. But to finally get to the load testing we would need to introduce multiple environments to the backend application. Service Fabric already supports the configuration of multiple environments through an application parameter that can be changed dynamically.

Application parameters are essentially custom key-value pairs that are defined in the settings of a service. The service can read them at runtime through its environment, for example with Environment.GetEnvironmentVariable("MyEnvVariable");. These values can be overridden by ApplicationParameters in Service Fabric, where you define the settings for different environments. The most interesting part is that you can inject custom parameters through the Visual Studio Team Services template without having to predefine your environment. This is important because we create test clusters on demand, and their credentials and settings aren’t relevant for the developer.

For more information about multiple environments, see Manage application parameters for multiple environments.

Missing Service Fabric Context object

Even though it’s easy to integrate multiple environments with Service Fabric, our problem was that the application wasn’t always aware of the Service Fabric Context object, which made the parameters hard to use inside the application. Digamore used a utility project with a configuration class whose values were hardcoded and fixed at compile time.

During the hackfest, we used the Service Fabric parameters to create this configuration object at runtime and inject it into the application. This way we didn’t need to rewrite the application.
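The pattern itself is independent of Service Fabric and of the language used. The following Python sketch illustrates the idea only; it is not Digamore's code. The compile-time values become defaults, values from the environment override them, and the resulting object is created once at startup and handed to the rest of the application. All setting names and the class name are hypothetical.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class BackendConfig:
    """Immutable configuration object that is injected into the application."""
    environment_name: str
    storage_connection: str
    max_players_per_match: int


# Former hardcoded values, now only used when the environment sets nothing.
DEFAULTS = {
    "BACKEND_ENVIRONMENT": "local",
    "STORAGE_CONNECTION": "UseDevelopmentStorage=true",
    "MAX_PLAYERS_PER_MATCH": "2",
}


def load_config(env=None) -> BackendConfig:
    """Build the configuration at runtime; environment values win over defaults."""
    env = os.environ if env is None else env
    merged = {key: env.get(key, default) for key, default in DEFAULTS.items()}
    return BackendConfig(
        environment_name=merged["BACKEND_ENVIRONMENT"],
        storage_connection=merged["STORAGE_CONNECTION"],
        max_players_per_match=int(merged["MAX_PLAYERS_PER_MATCH"]),
    )
```

Because callers receive a ready-made object, code that previously read the hardcoded configuration class does not need to know where the values come from.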



Setting up the load test

We used Apache JMeter to describe our load test because Digamore already used JMeter to do simple load testing from their local computers. We defined a load test against the REST API that the application exposed and exported it as an XML file once.

Following is the JMeter config file from the hackfest.

        <HTTPSamplerProxy guiclass="HttpTestSampleGui" testclass="HTTPSamplerProxy" testname="HTTP Request Login" enabled="true">
          <boolProp name="HTTPSampler.postBodyRaw">true</boolProp>
          <elementProp name="HTTPsampler.Arguments" elementType="Arguments">
            <collectionProp name="Arguments.arguments">
              <elementProp name="" elementType="HTTPArgument">
                <boolProp name="HTTPArgument.always_encode">false</boolProp>
                <stringProp name="Argument.value">&quot;${__UUID}&quot;</stringProp>
                <stringProp name="Argument.metadata">=</stringProp>
              </elementProp>
            </collectionProp>
          </elementProp>
          <stringProp name="HTTPSampler.domain">lasttest01.westeurope.cloudapp.azure.com</stringProp>
          <stringProp name="HTTPSampler.port"></stringProp>
          <stringProp name="HTTPSampler.connect_timeout"></stringProp>
          <stringProp name="HTTPSampler.response_timeout"></stringProp>
          <stringProp name="HTTPSampler.protocol">https</stringProp>
          <stringProp name="HTTPSampler.contentEncoding"></stringProp>
          <stringProp name="HTTPSampler.path">/Login/LoginDeviceId</stringProp>
          <stringProp name="HTTPSampler.method">POST</stringProp>
          <boolProp name="HTTPSampler.follow_redirects">true</boolProp>
          <boolProp name="HTTPSampler.auto_redirects">false</boolProp>
          <boolProp name="HTTPSampler.use_keepalive">false</boolProp>
          <boolProp name="HTTPSampler.DO_MULTIPART_POST">false</boolProp>
          <boolProp name="HTTPSampler.monitor">false</boolProp>
          <stringProp name="HTTPSampler.embedded_url_re"></stringProp>
        </HTTPSamplerProxy>
        <ResultCollector guiclass="SummaryReport" testclass="ResultCollector" testname="Summary Report" enabled="true">
          <boolProp name="ResultCollector.error_logging">false</boolProp>
          <objProp>
            <name>saveConfig</name>
            <value class="SampleSaveConfiguration">
              <time>true</time>
              <latency>true</latency>
              <timestamp>true</timestamp>
              <success>true</success>
              <label>true</label>
              <code>true</code>
              <message>true</message>
              <threadName>true</threadName>
              <dataType>true</dataType>
              <encoding>false</encoding>
              <assertions>true</assertions>
              <subresults>true</subresults>
              <responseData>false</responseData>
              <samplerData>false</samplerData>
              <xml>false</xml>
              <fieldNames>false</fieldNames>
              <responseHeaders>false</responseHeaders>
              <requestHeaders>false</requestHeaders>
              <responseDataOnError>false</responseDataOnError>
              <saveAssertionResultsFailureMessage>false</saveAssertionResultsFailureMessage>
              <assertionsResultsToSave>0</assertionsResultsToSave>
              <bytes>true</bytes>
              <threadCounts>true</threadCounts>
            </value>
          </objProp>
          <stringProp name="filename"></stringProp>
        </ResultCollector>
        <hashTree/>
        <ResultCollector guiclass="StatVisualizer" testclass="ResultCollector" testname="Aggregate Report" enabled="true">
          <boolProp name="ResultCollector.error_logging">false</boolProp>
          <objProp>
            <name>saveConfig</name>
            <value class="SampleSaveConfiguration">
              <time>true</time>
              <latency>true</latency>
              <timestamp>true</timestamp>
              <success>true</success>
              <label>true</label>
              <code>true</code>
              <message>true</message>
              <threadName>true</threadName>
              <dataType>true</dataType>
              <encoding>false</encoding>
              <assertions>true</assertions>
              <subresults>true</subresults>
              <responseData>false</responseData>
              <samplerData>false</samplerData>
              <xml>false</xml>
              <fieldNames>false</fieldNames>
              <responseHeaders>false</responseHeaders>
              <requestHeaders>false</requestHeaders>
              <responseDataOnError>false</responseDataOnError>
              <saveAssertionResultsFailureMessage>false</saveAssertionResultsFailureMessage>
              <assertionsResultsToSave>0</assertionsResultsToSave>
              <bytes>true</bytes>
              <threadCounts>true</threadCounts>
            </value>
          </objProp>
          <stringProp name="filename"></stringProp>
        </ResultCollector>


The complete JMeter file can be found in this sample repository.

As you can see in the test plan, we started by testing the logon endpoint of the application: each request tries to log on with a random user ID, which forces the application into its most expensive path, the creation of a new user. It’s important to note that you can’t use all result collectors with Visual Studio Team Services; because of this, we focused on the simple StatVisualizer.
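Because a JMeter test plan is plain XML, you can inspect it before uploading, for example to confirm that it targets the test cluster and not production. The following is a minimal sketch using only the Python standard library; the sample is a shortened, hypothetical wrapper around the sampler properties shown above.

```python
import xml.etree.ElementTree as ET


def list_http_samplers(jmx_text: str) -> list:
    """Return name, method, and URL for every HTTP sampler in a test plan."""
    root = ET.fromstring(jmx_text)
    samplers = []
    for sampler in root.iter("HTTPSamplerProxy"):
        # Only direct stringProp children describe the request itself.
        props = {p.get("name"): (p.text or "") for p in sampler.findall("stringProp")}
        protocol = props.get("HTTPSampler.protocol") or "http"
        domain = props.get("HTTPSampler.domain", "")
        path = props.get("HTTPSampler.path", "")
        samplers.append({
            "name": sampler.get("testname"),
            "method": props.get("HTTPSampler.method", ""),
            "url": f"{protocol}://{domain}{path}",
        })
    return samplers


# Shortened sample for demonstration; a real .jmx file contains much more.
SAMPLE_JMX = """<hashTree>
  <HTTPSamplerProxy testname="HTTP Request Login" enabled="true">
    <stringProp name="HTTPSampler.domain">lasttest01.westeurope.cloudapp.azure.com</stringProp>
    <stringProp name="HTTPSampler.protocol">https</stringProp>
    <stringProp name="HTTPSampler.path">/Login/LoginDeviceId</stringProp>
    <stringProp name="HTTPSampler.method">POST</stringProp>
  </HTTPSamplerProxy>
</hashTree>"""

for entry in list_http_samplers(SAMPLE_JMX):
    print(entry["method"], entry["url"])
```

A pipeline step could fail the build if any sampler URL points at a host other than the freshly created test cluster.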

Now you can upload this test file to Visual Studio Team Services and define the type of load that you want to generate.

Running the Load Test


To see the impact of the load, we used the Azure Service Profiler. It can be integrated into Azure Application Insights, which lets you monitor the impact of the load test on your backend code. This enabled us to continuously run load tests against new code and identify performance degradation.
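One simple way to flag degradation between releases is to compare summary numbers from two runs. The following sketch assumes that the results of each run are available as CSV text with a header row that includes elapsed (milliseconds) and success columns; that export format, the 20 percent tolerance, and the function names are assumptions of this example, not part of the setup described here.

```python
import csv
import io
import statistics


def summarize_run(results_csv: str) -> dict:
    """Compute sample count, mean response time, and error rate for one run."""
    rows = list(csv.DictReader(io.StringIO(results_csv)))
    if not rows:
        raise ValueError("no samples in results")
    elapsed = [int(row["elapsed"]) for row in rows]
    errors = sum(1 for row in rows if row["success"].strip().lower() != "true")
    return {
        "samples": len(rows),
        "mean_ms": statistics.mean(elapsed),
        "error_rate": errors / len(rows),
    }


def has_regressed(baseline: dict, current: dict, tolerance: float = 0.2) -> bool:
    """True if the mean response time grew by more than the tolerance."""
    return current["mean_ms"] > baseline["mean_ms"] * (1 + tolerance)
```

The mean is only a coarse indicator; percentiles or the profiler data usually tell you more about where the time is spent.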

Conclusion

In the end, we had three very productive days that resulted in a working solution. The facilities of the Microsoft Technology Center (MTC) in Cologne supported us well. We split into two teams, each with one member from Digamore and one from Microsoft. In one of the MTC rooms we had two big screens where each team could present its current progress.

By introducing DevOps practices such as Infrastructure as Code and continuous delivery, we were able to reduce the overall deployment times from days to seconds, and enable Digamore to run automated load tests against their REST API with the help of Apache JMeter and Visual Studio Team Services.

The process can be optimized further by increasing test coverage and introducing continuous deployment to reduce the need for manual interaction.

“With Service Fabric, we can focus on implementing game design features instead of investing a lot of time setting up the infrastructure.” —Marko Flod, Software Engineer, Digamore Entertainment GmbH

The team


Additional resources