A joint team from ICEBERG and Microsoft migrated part of ICEBERG’s hockey analytics system to the Azure cloud using the Web Apps feature of Azure App Service, Azure Storage, Azure Content Delivery Network, and Virtual Machines for Open Source workloads.

Core team:

  • Evgeny Khokhlov (LinkedIn) – Lead Developer, ICEBERG
  • Anton Trubakov – Project Manager, ICEBERG
  • Alex Belotserkovskiy (@ahriman_ru) – Technical Evangelist, Microsoft
  • Viktor Kiselev – Partner Business Evangelist, Microsoft

In this report, we describe the process and what we did before, during, and after the hackfest, including:

  • How we migrated the ICEBERG analytics system to the cloud.
  • What the architecture looked like, how it developed over time, what it looks like now, and future plans.
  • Learnings and architecture questions from the process.

Customer profile

ICEBERG is an IT startup based in Russia that provides enhanced sporting analytics and data collection through modern computer vision algorithms. This data is used to make better management decisions, uncover players’ strengths and weaknesses, improve the training process, and adjust strategy for game preparation.

ICEBERG’s main expertise is in ice hockey. In addition to services for professional teams, ICEBERG offers products that provide high-quality data for media, gaming, betting, and fantasy sports solutions. It uses optical tracking technology along with activity recognition through advanced machine learning algorithms to transform video into raw numbers. The data is then aggregated and visualized for clients in a user-friendly, state-of-the-art web portal that uncovers insights about the game, players, team, and opponents.

Problem statement

The hockey analytics system is a difficult candidate for the cloud. It consists of resources such as web cams and processing units/servers located completely at some venue (like a stadium). The problem was to understand how we could leverage the cloud for that system, where the cloud could bring benefits, and where we won’t be able to touch any part of the system.

We divided the process of migration into steps:

  • Analyze existing architecture and technical requirements
  • Propose the first version of the architecture
  • Start to migrate
  • Gather the experience
  • Propose the second version of the architecture
  • Iterate

We wanted to try doing all of that with the minimum amount of new codelines development efforts. We wanted to leverage the platform as a service (PaaS) architecture and check the limits.

Architecture - the starting point

As shown in Figure 1, the “architecture before” was the traditional client-server monolithic architecture. Stadium cameras capture the video, process it, and upload for verification and further processing to the Dropbox storage. Java server application serves the role of back end that hosts the REST API for customers who want to access data, visualization gateway for Tableau, and data access layer for PostgreSQL.

While that approach is classic, it does not provide the scalability, agility, and ability to offload some heavy workloads to the services instead of writing its own boilerplate code and deploying its own infrastructure or setting up the hosting resources.

Figure 1. Architecture before the hackfest

Architecture

Architecture - v2.0

Thinking about what can be migrated to the cloud is a long process that involves assessing both the technical and economical pros and cons of each subsystem.

As we saw, the original architecture consisted of:

  • Storage subsystem with the programmatic CRUD capabilities

    That was the subsystem where we could identify only a few of the technical issues:

    • Technical. Make changes to the user application (so it can put data into Azure Storage); latency.
    • Business. We could not identify the issues that would be impactful in terms of business.

    At the same time, Azure Storage is virtually unlimited. It also is agile in terms of programming abstractions (blobs, tables, and so on, and supports a variety of clients that are REST-capable), and it has a sophisticated security model (such as for generating the security strings for user access on the fly).

    ICEBERG developers were able to rewrite the storage subsystem in such way that, when launching, it is able to find the Azure Storage account that is nearest to the user location.

  • Front end

    ICEBERG front end uses Java technologies (Spring). The code is being deployed after the Bitbucket commit. The only concern that ICEBERG expressed was with the Java support on Microsoft Azure. That concern was eliminated when we looked at the Java Developer Center. The next step was simple—the website was deployed to Web Apps from the Bitbucket repository into which the out-of-the-box support is embedded. Performance issues with the Azure Storage access were fixed after code analysis and learning the reason—use of the wrong API method. (The information regarding that fix is at the end of this report.) The smart routing was moved out of scope (Azure Traffic Manager was planned for the second iteration).

  • Database layer

    We decided to move PostgreSQL database out of the scope of the first migration iteration and move step by step. We had already moved two important subsystems and decided to monitor the results and identify potential issues.

  • Processing system

    We could not identify the approach to migrate any part of the processing system that would not seriously impact the overall performance of the system. We moved that part out of scope, but during the next iteration came up with a great idea—how to add the value to the processing system with the cloud instead of moving or changing any part of that.

At the end of that iteration, we ended up with the architecture shown in Figure 2.

Figure 2. Architecture v2.0

Architecture v2.0

Architecture - v2.5

After testing and analyzing the hybrid solution we developed and deployed in Iteration 2, what we had left was the database subsystem (PostgreSQL) and most difficult part to deal with—the processing subsystem. We did not give up on the idea of doing something with the component that was initially designed to work only on-premises.

  • Database layer

    The database that should be deployed is PostgreSQL. Because there is no PostgreSQL as a service in Azure, the only option was to install and set this up on the virtual machine, then add it to the high-availability cluster on Azure Virtual Machines and leverage the Azure Redis Cache service.

  • Processing system

    During the discussions of what to do about video processing and analytics, we noticed that if we extract events from the video, form the JSON payload, and send it to the Event or IoT Hub from which these events will be forwarded through Stream Analytics in real time to Power BI, we could provide users (coaches, players, etc.) with the opportunity to see the real-time analytics literally on the fly. We also could give them detailed analytics and conclusions at the end of the play. That is something that was born during the brainstorm discussions. It was a completely new use case for the system, and could be easily implemented using only cloud services. That idea was not implemented during the first iterations, but we will definitely try to do so in a future one.

  • Traffic routing and Content Delivery Network

    When ICEBERG deployed its website and data, a reasonable question arose: What should be done for international users? ICEBERG is planning to expand to Western Europe and the United States, so it is important to deliver content in a fast and agile way. They chose to use Traffic Manager and Content Delivery Network for that. Traffic Manager combined with Web Apps provides not only the opportunity to distribute the traffic to different instances of the web app, but to do A/B testing among other tasks. When planning, we had the idea to use a Linux-based web app when (and if) Java will be available. Because it is in preview, it is important to test it not only in a test environment but also with real users. Traffic Manager allows distribution of the load in such a way that, for example, 5% of the load will go to the Linux-based web app to see if the benefits are worth migrating, and 95% to the main Windows-based one.

After that, the architecture looked completely different.

Figure 3. Architecture v2.5

Architecture v2.5

Results of the technical engagement

At the start, the ICEBERG solution was implemented as a classic client-server solution. Its architecture was tightly coupled, and some components (like the Dropbox storage) were not implemented in a scalable and agile way.

During the technical engagement between Microsoft and ICEBERG, the architecture was heavily revamped in such way that critical components would leverage the PaaS benefits (such as health checking and fast deployment) as much as possible to eliminate the chance of any availability issues in the future and delegate their migration to the cloud platform.

The ICEBERG team spoke positively not only about how the cloud platform works, but about how the Microsoft platform works with the Open Source. The ICEBERG system is written in Java and uses PostgreSQL, so for the successful migration it was necessary to be sure that the technologies used are supported and tested in the cloud. Both Java and PostgreSQL can be described as “first class citizens” on Microsoft Azure, and it was one of the important learnings for the team.

“Azure technologies allow ICEBERG to get large data volumes from our clients, who always move around the globe. Secure data transfer, implemented by Microsoft, keeps us from worrying about our customers’ data confidentiality. Scalable cloud architecture based on Azure VMs, Azure Web Apps, and CDN is perfect for the always-growing number of requests from both our current and new customers.”
– ICEBERG team

ICEBERG’s future plans for using the Azure platform include the IoT Suite (for implementing the real-time processing and visualization of Power BI), Stream Analytics, and Azure Machine Learning. The ICEBERG solution is already working in production. The Open Source project, which is working perfectly in the Microsoft cloud, is proof that Microsoft supports Open Source technologies.

Appendix: performance issues

During the last iteration, we experienced this performance issue: uploading metadata files from Azure Storage took more than 16 seconds. To improve performance, ICEBERG rolled out this performance fix:

  • Parallelize algorithm and upload every file in a separate thread.
  • Instead of working with Azure Input Stream, download data in memory.

    Before:

      CloudBlockBlob blob = ...
      InputStream in = blob.openInputStream()
    

    After:

      CloudBlockBlob blob = ...
      try (ByteArrayOutputStream os = new ByteArrayOutputStream()) {
          blob.download(os);
          InputStream in = new ByteArrayInputStream(os.toByteArray());
      }
    

After performance optimization, computation takes less than 1 second. This is a great example of how to apply traditional best practices to the cloud deployment.