This is the report of a technical engagement between developers from Microsoft and CADEX and its results: the migration of a complex, C++-based 3D CAD visualization and conversion product to the cloud.

CAD Exchanger is the desktop/mobile solution for 3D visualization/conversion aimed at addressing CAD data interoperability challenges. The software enables users to visualize 3D data, convert it across a wide range of CAD formats, build computational meshes for engineering analysis, display 3D model properties, and more. The list of supported formats is constantly growing and includes both neutral formats (IGES, STEP, STL, JT, VRML, X3D, OBJ) and modeling kernel-specific ones (ACIS, Parasolid, Open CASCADE, Rhino/Open NURBS).

The CADEX development team had built the software suite that worked perfectly in a desktop mode, and they provided their services to the customers in that model. This technical architecture proved its efficiency for small and large customers.

Roman Lygin, CEO of CADEX, started to consider creating a software as a service (SaaS) version of that suite. He was interested in understanding how Microsoft Azure capabilities could be used for 3D visualization SaaS needs such as autoscalability, storage, processing, and global distribution. Making desktop software with complex logic work in the cloud and provide its services by using platform as a service (PaaS) was the main challenge in the technical engagement.

Making such solutions cloud-friendly can be a challenge. As seen in this report, there are usually a lot of challenges and questions, and the main technical question is “how feasible is it to efficiently move the app to the cloud?”

Approach

In a nutshell, there are two ways of migrating desktop software to the cloud:

  • “As is,” which means that the company is going to use a virtual machine (VM)
  • Rewrite the software, which needs the company to invest a lot (usually) development hours in refactoring, changing the architecture, and actually rewriting the software

Often, the answer is somewhere between—moving to the cloud requires setting up the network boundaries and autoscalability, among other tasks, so the architecture is going to be changed anyway.

Architecture

The architecture jointly reviewed by the teams assumed that there should be web apps and VMs for processing the front-end and back-end jobs using the existing CAD Exchanger codebase.

Architecture diagram

The client uses a web application framework (Ember) for submitting the job. WebGL is used for visualization; sockets provide real-time communication. Plans include mobile and desktop app cloud support and native visualization.

The web app service deployed into the Web App feature of Azure App Service is the Node.js back end (using the Sails MVC framework) that does the following:

  1. Save uploaded files to Azure Storage
  2. Push messages to Azure Service Bus queue
  3. Interact with Azure DocumentDB or SQL Server to work with user data (such as settings or list of files)
  4. Use cache (such as Redis) to organize real-time working and file sharing

On the other end, the VMs are encapsulated in the VM cluster that is being automatically scaled up and down depending on how many messages (tasks) are in the queue. VMs perform heavyweight operations with 3D models:

  • Importing – converting a source user file to internal binary format for faster restoring in future, saving the most important part of data (like scene graph) to the database to improve the user web-app experience
  • Exporting – converting the internal representation of the 3D model to the target format and saving it
  • Meshing and other compute-intensive operations – running operations that require access to “real” 3D data and spend a lot of CPU time.

Key migration challenges

The CAD Exchanger migration was a challenge for various reasons, including the following:

  • Large codebase, including C++ engine and Qt QML UI layer.
  • Single-user desktop solution (with scalability limited to single machine capabilities).
  • Active interactions of the user with the 3D model via 3D view. (The cloud is dynamic by its nature, so we needed to plan how to “stick” the user to the context.)

The core C++ engine in CAD Exchanger is large and already efficiently uses multi-threading parallelism. CADEX owns two patents related to parallel methods that help to efficiently balance workload on a single machine. So rewriting the core base was not an option, and the CADEX team planned to limit migration efforts to rewriting the UI layer only.

3D models are very distinct in their sizes, contents and complexity, so processing can take extremely different times, from fractions of seconds to minutes. Therefore capability of dynamic balancing to offer good user experience was important.

There are many more challenges in comparison with desktop solutions, including working with multi-file models, support of incremental loading, and identification of model elements.

Still, the team was committed to invest in this migration.

Scalability

VMs need to be powerful enough to be able to process tasks as fast as possible. Here, the question of how to properly implement autoscalability is critical. As widely known and stated in the official documentation (see Horizontal vs vertical scaling), there are two ways to scale a solution: horizontal (increase or decrease the number of VM instances) and vertical (scale up or down the type of the VM, such as adding more computing cores). It is important to assess both options before going forward.

Before doing any analysis about how to implement autoscalability, architects should ensure that the system is scalable. There is no silver bullet for scalability, and the scope of scalability includes many questions from various disciplines. Implementing autoscalability for a solution that is not scalable at all, or only looks like a scalable one, should be avoided. Instead, developers should thoroughly test the autoscalability and see whether the solution is working as usual, users have the same experience, and the performance is not degrading. A lot of guidance is available on the Internet that covers the topic of how to properly design cloud-friendly and scalability-friendly architecture, such as the following:

Vertical scaling

Vertical scaling keeps the same number of VMs but makes the VMs more (“up”) or less (“down”) powerful. Power is measured in memory, CPU speed, disk space, and so on. Vertical scaling has more limitations. It’s dependent on the availability of larger hardware, which quickly hits an upper limit. Vertical scaling also usually requires a VM to stop and restart.

That information was enough to reach the conclusion that it is not exactly what we looked for—and Azure has no built-in autoscaling solution. Developers can use Azure Automation or custom scripts, but that still doesn’t solve the problem of fast scaling when many tasks are coming from the front end.

Horizontal scaling

Horizontal scaling is more flexible in a cloud situation because it allows you to run potentially thousands of VMs to handle the workload. There is no downtime, and if you have VMs already created but switched off, the scale process is performed very fast (in terms of a few minutes usually). That approach is more demanding in terms of maintenance and control but it fits the scenario very well. In the CAD Exchanger case, the horizontal type was chosen as a primary autoscaling pattern.

That gave us another problem to solve.

Sessions

Earlier, I mentioned the life-long question for almost every project in which the user should somehow communicate with the back end by using the front end. If the processing is very (very) fast (so that we do not need sessions), it is not critical—the user clicks and receives the result in a few seconds. The situation becomes more complicated with long processing times. The first idea that comes to mind is that we need to somehow “stick” the user to the same processing resources to ensure that if he leaves the front end and returns later, he will see the same picture.

That idea makes sense, but when it comes to the implementation, it can lead to tight coupling. Instead, the architecture could be more loosely coupled—in this case, we proposed to do as shown in the following diagram.

Diagram: Implementing sessions

The front-end instance puts the file in storage and records its URL, forms a message to the queue, gets the metadata, and sends it all together. On the other end, VMs poll the queue and process the task, communicating with the storage. The front end tracks the corresponding entity in storage and shows the progress to the user.

The sticky session should be implemented only for the user; the front end knows nothing about the back end, the back end knows nothing about the front end, and the only entity they both know about is the task file in the storage. Of course, we can avoid the sticky-session mechanism entirely by using a login/logout mechanism, but that won’t work with anonymous users.

For developer convenience, the Web Apps service that we use for the front end provides a sticky-session “lever” that one can pull to turn on or off the functionality. (See Disable Session affinity cookie in the Azure App Service team blog.)

A few lines earlier I mentioned polling the queue. Let’s dig a little deeper into that process and how we made our decisions.

Polling the queue

When it comes to distributed systems that usually have queues, the question of how to poll this queue effectively is among the most popular (but not simple) questions. If the developer knows that a message arrives every n seconds, it is of x size, and no force will change those numbers, it might be completely OK to poll every n seconds. Reality is different, so worker processes might poll an empty queue. Or, if the delay between polling operations is too long, the queue might be overwhelmed and the messages go stale before processing begins—or the queue might even reject incoming messages. It is important to implement the adaptive algorithm of the retry policy on both front end and back end; it can be custom-made or platform-native if the platform provides such capability. This is important not only from the resources perspective but from the financial side as well—queue services such as Service Bus use a transaction-based billing model. Before going with custom-made rules, the developer should research whether the service she is going to use provides any built-in implementation; for Azure, Retry guidance for specific services is very helpful.

One of popular retry policies is the exponential backoff. Born in the networks world, it is a good fit for many scenarios, including one in CAD Exchanger. It uses the feedback from the system that is being polled (such as “Is it empty?”) and adapts the numbers to the value between the limits set. It is the method we chose for prototyping the solution; because many solutions already exist, our implementation has almost no unique code.

For communication with the queue, we used the Azure Storage Client Library for C++, as referenced in the following code.

// get message from queue and make it invisible for +20 minutes.
    queue_request_options anOptions;
    myData->myInQueue.get_message_async (std::chrono::minutes (20), anOptions, operation_context {})
    .then ([this] (const pplx::task<cloud_queue_message>& theMessageTaks) {
        try {
            cloud_queue_message aMessage = theMessageTaks.get();
            auto aStringMessage = toUTF8Array (aMessage.content_as_string());
            if (!aStringMessage.isEmpty()) {
                // Decoding message. For more information see azure-storage/lib/services/queue/queuemessageencoder.js:171
                aStringMessage.replace ("&lt;", "<")
                .replace ("&gt;", ">")
                .replace ("&quot;", "\"")
                .replace ("&apos;", "'");

                // Propcessing message from JSON format
                processMessage (aStringMessage);
            }
        } catch (const std::exception& theEx) {
            std::cerr << "Exception: " << theEx.what() << std::endl;
            std::cerr << "Cannot peak message." << std::endl;
        }
        scheduleRequired();
    });

Storage

The storage part of the architecture consisted of the following:

  • Azure Blob storage – Using the approach described earlier (the ticket pattern, in which the front end puts the files in Blob storage, gets the URL, and passes it to the queue along with its metadata), Blob storage became the important part of the solution because of its built-in scalability, replication (every entity has three replicas), and simple usage as well as the simple development model. One question that arose during the architecture-planning stage was about the security model of the storage. It was clear that, on the level of the platform, Azure isolates the entities and accounts, but in the multitenant application, it is important to consider isolation on the application level. How to provide the updates to the user? How to secure it? Will it need a specific storage structure that helps to separate one customer from another?

    Once again, the official Azure documentation is the one stop. To cover the security-related questions, two main hyperlinks make everything clear:

    • Azure Trust Center, the one stop for every security-related question, including certifications, standards, procedures, and general recommendations
    • Azure Storage security guide, the comprehensive guidance on how to plan, architect, and create the security model for the Storage services and implement it in code.

    For the CAD Exchanger solution, the appropriate model involved shared access signatures plus role-based access control through the use of the Azure Active Directory service.

  • SQL Server and Azure DocumentDB – Storing the user information (files list, usernames, and so on) in the related storage is the traditional approach. In the current stage, both SQL Server and DocumentDB are operational and being tested for the CAD Exchanger solution in the cloud.

Deployment

For the CAD Exchanger team, it was critical to test their solution in the cloud, because there was no confidence that, given the dynamic nature of the cloud, technologies that backed the VM infrastructure as well as various limits won’t be the issue for the normal workflows.

What we did:

  • Deployed the VM with the C++-based service tasked with the solution load-testing (listening to the queue and running the heavy native-code workloads). The official documentation How to use Queue Storage from C++ was very helpful.
  • Deployed a few VMs for the autoscaling cluster into the availability set.
  • Connected the VM service to the queue and started to emulate the load.
  • Tested the autoscaling. We found that the approach we chose is working, and the Azure built-in autoscaling based on the messages count in the queue is able to perform its work in the appropriate time.

Team

  • Roman Lygin – Founder and CEO, CADEX
  • Sergey Solomin – Lead Developer, CADEX
  • Alex Belotserkovskiy (@ahriman_ru) – Technical Evangelist, Microsoft Russia

Results of the technical engagement

At the starting point, the CAD Exchanger solution was implemented as a complex 3D application for mobile and desktop that primarily written in C++. It was obvious that it migration would take significant effort. We analyzed various approaches, chose to not touch the software, and leveraged the Azure services as much as possible, using VMs for the software and Web Apps, Service Bus queue, Azure Storage, DocumentDB, and SQL Server to provide access to the native functionality and make it more SaaS-ready.

With regard to the issues we identified, mostly of them were architecture-related: which service to use (the Web App feature was the obvious candidate, but it was quite the analysis of the queue service), the limits of these services and the platform itself (covered by documentation, so the challenge was mostly to better understand the future workloads of CAD Exchanger), and the solution security model.

We managed to prototype a working solution, so the results of the technical engagement between Microsoft and CADEX regarding the migration of the CAD Exchanger to Azure cloud services suite may be considered successful. It was important learning that even a complex C++ native application can be wrapped into the PaaS umbrella and work for its cloud users in the multitenant manner. It is important to note that this engagement allowed CADEX to make the migration faster and to avoid lot of technical hiccups and nuances.

Some modifications triggered by the cloud migration were applied back to the common codebase and thus benefit existing desktop and mobile versions.

The CADEX team continues to work on productization of the solution and plans to deploy a public beta version later this year.

“Technical contents available at the Microsoft Azure portal have been instrumental for our migration project. Technical articles, blogs and code snippets were directly used in migration of CAD Exchanger to Microsoft Azure.”

“[The] Microsoft team has been very supportive during this project, from community relations and engineering to business development and marketing. I really appreciate your time and involvement into this engagement and look forward to its further steps.”

“Thanks to cooperation with the Microsoft engineering team and particular recommendations by Alexander Belotserkovskiy, we were able to navigate through the migration path faster and to deliver CAD Exchanger to our users sooner and with greater confidence.”