Remote Cluster User Guide

This document provides an overview of the remote setup process for FarmVibes.AI, detailing the main components, configuration options, and customization possibilities.

The primary focus of this guide is to help users understand the structure and organization of the Terraform scripts used to create and configure the remote cluster and its associated Azure cloud components.

Requirements

The FarmVibes.AI remote management script is a Python-based utility that assumes the user running it has at least the Contributor role in the Azure subscription. Please make sure you have Python installed before trying to set up a cluster.

The az command-line interface is a hard requirement for running the script. Please follow the Azure Command-Line Interface (CLI) documentation for instructions on how to install it. Make sure to install the version appropriate for your architecture, otherwise the install process may fail.

If you have issues installing these utilities, please see the Azure Cloud Shell installation section, which explains how to build a FarmVibes.AI remote cluster from an Azure Cloud Shell.

Required Azure CPU quotas

In the default configuration, your subscription’s CPU quota must support at least 20 vCPUs in the region you choose. You will also need 12 Standard DSv3 Family vCPUs and 8 Standard BS Family vCPUs available. If your quota is lower, please follow the Azure quota increase guide before installing; the installer will fail if the quota is not available.

To summarize, the minimum vCPU quotas required are:

  • Total Regional vCPUs: increase to at least 20

  • Standard DSv3 Family vCPUs: increase to at least 12

  • Standard BS Family vCPUs: increase to at least 8
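The quota minimums above can be checked programmatically. The following is a minimal sketch, not part of the FarmVibes.AI tooling. It assumes you have saved the JSON output of az vm list-usage --location <region>, and that each entry carries limit, currentValue, and name.value fields. The quota identifiers used as dictionary keys (cores, standardDSv3Family, standardBSFamily) are assumptions, so verify them against your own output before relying on the result. The helper name quota_shortfalls is hypothetical.

```python
import json

# Minimums from this guide. The quota identifiers used as keys are
# assumptions about the `name.value` field in `az vm list-usage` output.
REQUIRED_VCPUS = {
    "cores": 20,               # Total Regional vCPUs
    "standardDSv3Family": 12,  # Standard DSv3 Family vCPUs
    "standardBSFamily": 8,     # Standard BS Family vCPUs
}


def quota_shortfalls(usage_json: str) -> dict:
    """Return {quota_name: missing_vcpus} for every quota below the minimum."""
    free = {}
    for entry in json.loads(usage_json):
        name = entry["name"]["value"]
        # Values may arrive as strings, so convert before subtracting.
        free[name] = int(entry["limit"]) - int(entry["currentValue"])
    return {
        name: needed - free.get(name, 0)
        for name, needed in REQUIRED_VCPUS.items()
        if free.get(name, 0) < needed
    }


# Made-up sample data in the assumed shape of the CLI output.
sample = json.dumps([
    {"name": {"value": "cores"}, "limit": "20", "currentValue": "0"},
    {"name": {"value": "standardDSv3Family"}, "limit": "10", "currentValue": "0"},
    {"name": {"value": "standardBSFamily"}, "limit": "10", "currentValue": "4"},
])
print(quota_shortfalls(sample))
```

An empty result means every listed quota has enough free vCPUs. This sketch compares free capacity (limit minus current usage) rather than the raw limit, which is a conservative reading of the requirement.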

You might have to enable the Microsoft.Compute provider in your subscription. For instructions on how to do so, please proceed to the Azure providers section.

Azure Cloud Shell installation

Users who are unable to install the FarmVibes.AI requirements on a local machine can use an Azure Cloud Shell to complete the installation.

To do so, please visit https://shell.azure.com/ to start a new Azure Cloud Shell.

When asked whether to start a “Bash” or “PowerShell” shell, select “Bash”.

Then, run the following commands:

pip install --upgrade pip  # to use the latest version of the package manager
# Do the following to install the `vibe_core` library and the `farmvibes-ai` command
pip install "git+https://github.com/microsoft/farmvibes-ai#egg=vibe_core&subdirectory=src/vibe_core"
az provider register --namespace Microsoft.Compute  # to enable compute provider

After running the above commands, please manually set the processor quota in the Azure region you want to use, if you haven’t done so. (See the section Required Azure CPU quotas for details.)

Then, run az account show and make sure you are connected to the correct Azure subscription. If not, az account list will list all your subscriptions, and az account set --subscription $SUBSCRIPTION_ID will set the current subscription, where $SUBSCRIPTION_ID is the id of the subscription you’d like to use.

Depending on your permissions, you may need to manually create the resource group in which the FarmVibes.AI AKS cluster will be installed. If needed, you can create a new resource group with az group create --name resource_group_name --location location_name, where resource_group_name is the name of the resource group, and location_name is the Azure Region where you will install the cluster.

Once these requirements are met, you can follow the instructions on how to use the farmvibes-ai script.

Azure Providers

Since the FarmVibes.AI remote management script needs to provision new resources on Azure, it needs access to various Azure Providers. The script itself is able to register each required provider. At the time of writing, the required providers are:

  • Microsoft.DocumentDB, for provisioning Cosmos DB

  • Microsoft.KeyVault, for provisioning an Azure Key Vault to manage secrets

  • Microsoft.ContainerService, for managing AKS itself

  • Microsoft.Network, for configuring public IPs and vnets

  • Microsoft.Storage, for provisioning storage accounts

  • Microsoft.Compute, for provisioning Virtual Machines and Virtual Machine Scale Sets

Setting the subscription

Some Azure users will have access to multiple subscriptions. To list all subscriptions you have access to, use the command

az account list

Then, you can set your default subscription with the command

az account set --subscription ${YOUR_DESIRED_SUBSCRIPTION_ID}
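If you prefer to select a subscription by its display name rather than by id, the lookup can be scripted. The sketch below is illustrative only. It assumes you have captured the JSON output of az account list, where each entry has id and name fields, and it only builds the az account set command instead of running it. The helper names subscription_id_by_name and set_subscription_command are hypothetical, and the sample ids are made up.

```python
import json


def subscription_id_by_name(account_list_json: str, name: str) -> str:
    """Find a subscription id by display name in `az account list` JSON output."""
    for sub in json.loads(account_list_json):
        if sub["name"] == name:
            return sub["id"]
    raise LookupError(f"no subscription named {name!r}")


def set_subscription_command(subscription_id: str) -> list:
    """Build the argv for selecting a subscription (not executed here)."""
    return ["az", "account", "set", "--subscription", subscription_id]


# Made-up sample data in the shape of `az account list --output json`.
sample = json.dumps([
    {"id": "00000000-0000-0000-0000-000000000001", "name": "Sandbox", "isDefault": True},
    {"id": "00000000-0000-0000-0000-000000000002", "name": "Research", "isDefault": False},
])
sub_id = subscription_id_by_name(sample, "Research")
print(" ".join(set_subscription_command(sub_id)))
```

To actually apply the selection, the returned list can be passed to subprocess.run on a machine where the az CLI is installed and logged in.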

Azure environments

The AKS environment supports all Azure (National) Cloud environments. Users can choose which cloud they want to interact with by using the --environment argument. Valid values for this argument are:

  • public (the default if not specified)

  • usgovernment (for a US Government cloud)

  • german (for the German sovereign cloud)

  • china (for the China sovereign cloud)

Overriding the cluster admin

The farmvibes-ai install script uses the az CLI to infer the credentials of the cluster administrator (that is, the currently logged-in user). In some situations, the user performing the install might not be the user who will actually manage the cluster with kubectl.

In such cases, you can use the --cluster-admin-name argument to define who will have access to the cluster.

Azure Cloud Components

Architecture diagram of the remote FarmVibes.AI configuration

The cluster configuration process works by provisioning different resource types on the Azure Cloud, always orchestrated by the farmvibes-ai remote script, detailed in the next section.

The first component that needs to be configured is the Resource Group. The script supports either using an existing Resource Group, or creating a new one. We recommend letting the script create a new resource group, as the script will always assume it is able to manage the resource group (even deleting resources when cluster destruction is requested).

A (1) Resource Group provides a logical container for resources that are deployed within an Azure subscription. In this project, a resource group is created to organize and manage the resources related to the remote cluster, making it easier to monitor, secure, and manage resources collectively.

After the farmvibes-ai remote script ensures the existence of the Resource Group, it creates a (2) Metadata Storage account for the cluster and related Azure resources. The script determines the name of this new storage account by using a deterministic function that hashes the cluster name and the resource group name.
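To illustrate what a deterministic naming function of this kind can look like, here is a minimal sketch. It is not the function used by the farmvibes-ai script, whose exact hashing scheme is not described in this guide. The prefix "fv", the choice of SHA-256, and the helper name storage_account_name are all assumptions made for illustration. The only hard constraint encoded here is Azure's rule that storage account names are 3 to 24 characters long and contain only lowercase letters and digits.

```python
import hashlib


def storage_account_name(cluster_name: str, resource_group: str, prefix: str = "fv") -> str:
    """Derive a stable, valid storage account name from the two inputs.

    Hypothetical scheme: the real script may hash and truncate differently.
    """
    digest = hashlib.sha256(f"{cluster_name}/{resource_group}".encode("utf-8")).hexdigest()
    # Azure storage account names: 3-24 characters, lowercase letters and digits only.
    # A hex digest only contains 0-9 and a-f, so truncation is enough.
    return (prefix + digest)[:24]


name_a = storage_account_name("my-cluster", "my-resource-group")
name_b = storage_account_name("my-cluster", "my-resource-group")
name_c = storage_account_name("my-cluster", "another-resource-group")
print(name_a == name_b, name_a != name_c, len(name_a))
```

Because the name depends only on the cluster name and the resource group name, rerunning the setup for the same cluster finds the same storage account instead of creating a new one.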

Once the resource group and the metadata storage account exist, the script uses Terraform to provision the infrastructure, which takes place in three levels, described below.

(3) Infrastructure

  • Cosmos DB: the setup creates two Azure Cosmos DB accounts, each with their respective databases and containers. Azure Cosmos DB is a fully managed, globally distributed, multi-model database service designed for low-latency access to data and seamless horizontal scaling.

  • Key Vault: a component to store and manage sensitive information such as secrets, connection strings, and credentials used by the services in the remote cluster.

  • Azure Kubernetes Service (AKS): a managed Kubernetes service that simplifies the deployment, management, and scaling of containerized applications using Kubernetes. In this project, an AKS cluster is created with the specified configurations, enabling you to deploy and manage your containerized services with ease.

  • Storage Account: a cloud storage service that provides scalable, durable, and highly available storage for various types of data, such as blobs, files, queues, and tables. In this project, the Storage Account is set up to allow access from the virtual network created for the AKS setup, providing secure and isolated access to stored data.

  • Virtual Network and Subnet: Virtual Network (VNet) is the fundamental building block for creating a private and isolated network in Azure. It enables resources within the network to communicate with each other securely and efficiently.

  • Public IP Address: Azure Public IP addresses enable you to communicate with resources over the internet, providing public access to services running in the remote cluster. The domain name label is created using a combination of the prefix and a hash of the resource group name, creating a unique and easily identifiable domain name for your services.

(4) Kubernetes Components

After the infrastructure layer is complete, the script proceeds to provision the Kubernetes components, which are:

  • Redis database for caching data about workflows executed and data generated

  • RabbitMQ for messaging between FarmVibes.AI services

  • Dapr (Distributed Application Runtime) for abstracting service invocation and messaging

  • A persistent volume that uses the storage account created in the previous step as backing store

  • OpenTelemetry service for collecting telemetry data from services

  • A certificate management service for configuring TLS endpoints for FarmVibes.AI services such as the REST API

(5) Services

Finally, after the cluster basic components are configured, the script proceeds to define and deploy the FarmVibes.AI services:

  • Rest-api (Server). A webserver that exposes a REST API so users can call workflows, track workflow execution and retrieve results.

  • Orchestrator. This component manages workflow execution, transmitting requests to workers and updating workflow status.

  • Worker. A scalable component responsible for the actual workflow operation computation. Instead of running the whole workflow at once, it computes the atomic chunks processed by the user.

  • Cache. This component sits between the orchestrator and the workers. It checks whether an operation was previously executed and, if so, returns the cached results to the orchestrator.

  • Data Ops. This component is responsible for managing data operations such as keeping track of assets related to workflow execution and deleting run data when requested.
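The role of the Cache service described above can be sketched in a few lines. This is a conceptual illustration only and does not reflect the actual FarmVibes.AI implementation: the class OperationCache, the in-memory dictionary, and the hashing of operation name plus inputs are all assumptions chosen to show the idea of skipping work that was already done.

```python
import hashlib
import json


class OperationCache:
    """Illustrative in-memory cache keyed by operation name and inputs."""

    def __init__(self):
        self._results = {}

    @staticmethod
    def _key(op_name: str, inputs: dict) -> str:
        # Sorting keys makes the hash independent of argument order.
        payload = json.dumps({"op": op_name, "inputs": inputs}, sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def run(self, op_name: str, inputs: dict, compute):
        """Return a cached result, or call `compute(inputs)` and store it."""
        key = self._key(op_name, inputs)
        if key not in self._results:
            self._results[key] = compute(inputs)
        return self._results[key]


calls = []


def fake_worker(inputs):
    # Stand-in for a worker; records each call so cache hits are visible.
    calls.append(inputs)
    return inputs["a"] + inputs["b"]


cache = OperationCache()
first = cache.run("add", {"a": 1, "b": 2}, fake_worker)
second = cache.run("add", {"b": 2, "a": 1}, fake_worker)
print(first, second, len(calls))  # prints "3 3 1"
```

The second call returns the stored result without invoking the worker, which is the behavior the orchestrator relies on to avoid recomputing operations.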

Installation and Setup

The installation and setup of the AKS cluster for FarmVibes.AI is managed by a script named farmvibes-ai, under the remote action group. This script automates the process of setting up the necessary infrastructure in Azure (discussed above) and deploying the services.

Before running the script, you need to install the vibe_core package which provides the farmvibes-ai command. Here’s how to do it:

  1. Make sure you have Python 3.8 or higher installed, with the pip command available. See the Python installation guide and the pip installation guide for instructions on how to install Python and pip.

  2. Make sure you have git installed and configured. See the git installation guide for instructions on how to install git.

  3. Now you have two options for installing the vibe_core package:

    3.1. Install the package directly from the GitHub repository with pip: pip install "git+https://github.com/microsoft/farmvibes-ai#egg=vibe_core&subdirectory=src/vibe_core"

    3.2. Clone the repository with git clone https://github.com/microsoft/farmvibes-ai.git and install the package from the local copy with pip install ./farmvibes-ai/src/vibe_core.

  4. After installing the vibe_core package, you can run the farmvibes-ai script.

The farmvibes-ai script

The farmvibes-ai script manages the full lifecycle of the remote cluster, including creation, deletion, and updates. The script is organized in subcommands, each one responsible for a specific action. The subcommands, with their descriptions and usage instructions, can be shown by running farmvibes-ai remote -h.

To create a new remote cluster, you have to provide some required arguments to farmvibes-ai remote setup: the name of the cluster, the resource group name, and the location of the cluster, as well as an email address to generate the TLS certificates for the REST API.

The script will create the cluster in the resource group with the specified name, creating the resource group first if it doesn’t exist.

NOTE: We recommend you use an empty resource group to create your cluster. If you ever need to destroy your cluster, all changes in your subscription will be restricted to that resource group.

WARNING: If you decide to reuse an already existing resource group, destroying the cluster will delete ALL resources in that resource group.

An example setup command is shown below:

farmvibes-ai remote setup \
  --region eastus \
  --cert-email testemail@example.com \
  --resource-group some-resource-group-name
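When the setup is driven from another Python program, the command line can be assembled and validated before it is run. The sketch below only uses the flags mentioned in this guide (--region, --cert-email, --resource-group, --environment, --cluster-admin-name). That the last two are accepted by the remote setup subcommand in this position is an assumption, so confirm it with farmvibes-ai remote setup -h. The helper name remote_setup_command is hypothetical and the command is built, not executed.

```python
import shlex

# Values listed in the "Azure environments" section of this guide.
VALID_ENVIRONMENTS = ("public", "usgovernment", "german", "china")


def remote_setup_command(region, cert_email, resource_group,
                         environment="public", cluster_admin_name=None):
    """Build the argv for `farmvibes-ai remote setup` (not executed here)."""
    if environment not in VALID_ENVIRONMENTS:
        raise ValueError(f"unknown environment: {environment!r}")
    argv = [
        "farmvibes-ai", "remote", "setup",
        "--region", region,
        "--cert-email", cert_email,
        "--resource-group", resource_group,
    ]
    if environment != "public":  # `public` is the documented default
        argv += ["--environment", environment]
    if cluster_admin_name:
        argv += ["--cluster-admin-name", cluster_admin_name]
    return argv


argv = remote_setup_command("eastus", "testemail@example.com", "some-resource-group-name")
print(shlex.join(argv))
```

Building the arguments as a list avoids shell quoting problems and lets invalid environment names fail early, before any Azure resources are touched.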

Once the command completes, you will receive a hostname for the cluster, and the script will update the FarmVibes.AI configuration files to point the default client to the newly created cluster’s URL.

INFO - URL for your AKS Cluster is: https://test-build-1234-5678e9-dns.eastus2.cloudapp.azure.com

You can use this URL to access the REST API and the FarmVibes.AI client, following the instructions in our Client user guide:

from vibe_core.client import get_default_vibe_client
client = get_default_vibe_client("remote")

Pre-registering providers

As mentioned before, the installer will register each provider automatically. If you’d prefer to pre-register all providers beforehand, you can execute the PowerShell script below. Depending on your security settings, it might be easier to copy and paste the code into the prompt than to run it as a script file.

PowerShell script example
# Define the list of providers to register
$providers = @(
    "Microsoft.DocumentDB",
    "Microsoft.KeyVault",
    "Microsoft.ContainerService",
    "Microsoft.Network",
    "Microsoft.Storage",
    "Microsoft.Compute"
)

# Initialize arrays for jobs and a hashtable for job-provider mapping
$jobs = @()
$jobProviderMap = @{}

# Start a job for each provider to register it
foreach ($provider in $providers) {
    $job = Start-Job -ScriptBlock {
        param($provider)
        $currentState = az provider show -n $provider --query "registrationState" -o tsv

        if ($currentState -ne "Registered" -and $currentState -ne "Registering") {
            az provider register --namespace $provider
        }

        do {
            $registrationState = az provider show -n $provider --query "registrationState" -o tsv
            if ($registrationState -ne "Registered") {
                Start-Sleep -Seconds 10
            }
        } while ($registrationState -ne "Registered")

        $provider
    } -ArgumentList $provider

    $jobs += $job
    $jobProviderMap[$job.Id] = $provider
}

# Periodically check the status of each job
while ($jobs | Where-Object { $_.State -eq 'Running' }) {
    Clear-Host
    foreach ($job in $jobs) {
        $state = $job.ChildJobs[0].JobStateInfo.State
        $providerName = $jobProviderMap[$job.Id]
        Write-Host "Provider $providerName is in $state state."
    }
    Start-Sleep -Seconds 10
}

# Retrieve and display the final status of each job, then clean up
foreach ($job in $jobs) {
    $providerName = Receive-Job -Job $job -Wait
    Write-Host "Provider $providerName is now registered."
    Remove-Job -Job $job
}
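The PowerShell script above waits indefinitely for each provider to reach the Registered state. As a variation, the same polling idea can be written with an overall timeout. The sketch below is an assumption-laden illustration, not part of the FarmVibes.AI tooling: the function wait_until_registered and its parameters are hypothetical, and the state lookup is injected as a callable so the logic can be exercised without the az CLI. In real use, get_state would wrap az provider show -n <provider> --query "registrationState" -o tsv.

```python
import time


def wait_until_registered(providers, get_state, timeout_s=600.0, poll_s=10.0,
                          sleep=time.sleep, clock=time.monotonic):
    """Poll `get_state(provider)` until every provider reports "Registered".

    Raises TimeoutError if some providers are still pending after `timeout_s`.
    """
    pending = set(providers)
    deadline = clock() + timeout_s
    while pending:
        # Keep only the providers that are not registered yet.
        pending = {p for p in pending if get_state(p) != "Registered"}
        if not pending:
            break
        if clock() >= deadline:
            raise TimeoutError(f"still not registered: {sorted(pending)}")
        sleep(poll_s)
    return True


# Fake state source for demonstration: one provider needs a second poll.
states = {
    "Microsoft.Storage": iter(["Registering", "Registered"]),
    "Microsoft.Compute": iter(["Registered"]),
}
done = wait_until_registered(states, lambda p: next(states[p]), sleep=lambda s: None)
print(done)
```

Injecting sleep and clock keeps the function easy to test and lets callers choose a shorter polling interval than the 10 seconds used in the PowerShell version.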