Remote Cluster User Guide
This document provides an overview of the remote setup process for FarmVibes.AI, detailing the main components, configuration options, and customization possibilities.
The primary focus of this guide is to help users understand the structure and organization of the Terraform scripts used to create and configure the remote cluster and its associated Azure cloud components.
Requirements
The FarmVibes.AI remote management script is a Python-based utility that assumes whoever executes it has at least a Contributor role in the Azure subscription. Please make sure you have Python installed before trying to set up a cluster.
The az command-line interface is a hard requirement for running the script. Please follow the Azure Command-Line Interface (CLI) documentation for instructions on how to install it. Make sure to install the version appropriate for your architecture, otherwise the install process may fail.
If you have issues installing these utilities, please see the next section, with instructions on how to build a FarmVibes.AI remote cluster from an Azure Cloud Shell.
Required Azure CPU quotas
In the default configuration, your subscription’s CPU quota must support at least 20 vCPUs in the region you choose, including 12 Standard DSv3 Family vCPUs and 8 Standard BS Family vCPUs. If your current quota is lower, please follow the Azure quota increase guide to raise it.
The installer will fail if the quota is not available.
To summarize, the minimum numbers of CPUs required are:
Total Regional vCPUs: increase to at least 20
Standard DSv3 Family vCPUs: increase to at least 12
Standard BS Family vCPUs: increase to at least 8
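The minimums above can be checked before running the installer. The following is a minimal sketch, not part of the FarmVibes.AI tooling: it assumes you have saved the JSON output of az vm list-usage --location <region> -o json, and that the quota identifiers in name.value are cores, standardDSv3Family, and standardBSFamily (these identifiers and the helper name quota_shortfalls are assumptions made for illustration). It compares the free headroom (limit minus current usage) against each minimum.

```python
import json

# Minimum free vCPUs from the list above. The keys are the quota identifiers
# ASSUMED to appear in `name.value` of the `az vm list-usage` JSON output.
REQUIRED_VCPUS = {
    "cores": 20,               # Total Regional vCPUs
    "standardDSv3Family": 12,  # Standard DSv3 Family vCPUs
    "standardBSFamily": 8,     # Standard BS Family vCPUs
}


def quota_shortfalls(usages, required=REQUIRED_VCPUS):
    """Return {quota_name: missing_vcpus} for every quota with too little headroom."""
    free = {
        u["name"]["value"]: int(u["limit"]) - int(u["currentValue"])
        for u in usages
    }
    return {
        name: needed - free.get(name, 0)
        for name, needed in required.items()
        if free.get(name, 0) < needed
    }


# Made-up sample data shaped like the output of `az vm list-usage ... -o json`.
sample = json.loads("""
[
  {"name": {"value": "cores"}, "currentValue": 4, "limit": 20},
  {"name": {"value": "standardDSv3Family"}, "currentValue": 0, "limit": 12},
  {"name": {"value": "standardBSFamily"}, "currentValue": 0, "limit": 10}
]
""")
print(quota_shortfalls(sample))  # {'cores': 4}
```

An empty result means every listed quota has enough headroom. A quota that is missing from the input is treated as having no headroom at all.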
You might have to enable the Microsoft.Compute provider in your subscription. For instructions on how to do so, please proceed to the Azure providers section.
Azure Cloud Shell installation
If you are unable to install the FarmVibes.AI requirements on a local machine, you can use an Azure Cloud Shell to complete the installation.
To do so, please visit https://shell.azure.com/ to start a new Azure Cloud Shell.
When asked whether to start a “Bash” or “PowerShell” shell, select “Bash”.
Then, do the following manual steps:
pip install --upgrade pip # to use the latest version of the package manager
# Do the following to install the `vibe_core` library and the `farmvibes-ai` command
pip install "git+https://github.com/microsoft/farmvibes-ai#egg=vibe_core&subdirectory=src/vibe_core"
az provider register --namespace Microsoft.Compute # to enable compute provider
After running the above commands, please manually set the processor quota in the Azure region you want to use, if you haven’t done so. (See the section Required Azure CPU quotas for details.)
Then, run az account show and make sure you are connected to the correct Azure subscription. If not, az account list will list all your subscriptions, and running az account set --subscription $SUBSCRIPTION_ID will set the current subscription, where $SUBSCRIPTION_ID is the id of the subscription you’d like to use.
Depending on your permissions, you may need to create a resource group manually before installing the FarmVibes.AI AKS cluster into it. If needed, you can create a new resource group with az group create --name resource_group_name --location location_name, where resource_group_name is the name of the resource group and location_name is the Azure region where you will install the cluster.
Once these requirements are met, you can follow the instructions on how to use the farmvibes-ai script.
Azure Providers
Since the FarmVibes.AI remote management script needs to provision new resources on Azure, it needs access to various Azure Providers. The script itself is able to register each required provider. At the time of writing, the required providers are:
Microsoft.DocumentDB, for provisioning Cosmos DB
Microsoft.KeyVault, for provisioning an Azure Key Vault to manage secrets
Microsoft.ContainerService, for managing AKS itself
Microsoft.Network, for configuring public IPs and vnets
Microsoft.Storage, for provisioning storage accounts
Microsoft.Compute, for provisioning Virtual Machines and Virtual Machine Scale Sets
Setting the subscription
Some Azure users will have access to multiple subscriptions. To list all subscriptions you have access to, use the command
az account list
Then, you can set your default subscription with the command
az account set --subscription ${YOUR_DESIRED_SUBSCRIPTION_ID}
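If you manage several subscriptions, it can help to confirm which one is currently the default before installing. The sketch below is an illustration only: it assumes you have the JSON output of az account list -o json available, and it relies on the id, name, and isDefault fields of each entry. The helper name and the sample values are made up.

```python
import json


def default_subscription(accounts):
    """Return the entry flagged as default, or None if no entry is flagged."""
    return next((a for a in accounts if a.get("isDefault")), None)


# Made-up sample shaped like the output of `az account list -o json`.
accounts = json.loads("""
[
  {"id": "00000000-0000-0000-0000-000000000001", "name": "Sandbox", "isDefault": false},
  {"id": "00000000-0000-0000-0000-000000000002", "name": "Research", "isDefault": true}
]
""")

current = default_subscription(accounts)
print(current["name"], current["id"])
```

If the printed subscription is not the one you want, pass the desired id to az account set --subscription.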
Azure environments
The AKS environment supports all Azure (National) Cloud environments. Users can choose which cloud they want to interact with by using the --environment argument. Valid values for this argument are:
public (the default if not specified)
usgovernment (for a US Government cloud)
german (for the German sovereign cloud)
china (for the China sovereign cloud)
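To illustrate how an argument restricted to these four values behaves, here is a small argparse sketch. It is not the farmvibes-ai implementation: the parser name is hypothetical, and the mapping to az cloud names (AzureCloud, AzureUSGovernment, AzureGermanCloud, AzureChinaCloud) is an assumption about how such values could be translated, not something this guide specifies.

```python
import argparse

# ASSUMED mapping from --environment values to az cloud names (illustration only).
AZ_CLOUD_NAMES = {
    "public": "AzureCloud",
    "usgovernment": "AzureUSGovernment",
    "german": "AzureGermanCloud",
    "china": "AzureChinaCloud",
}


def build_parser():
    """Build a demo parser that only accepts the four documented values."""
    parser = argparse.ArgumentParser(prog="environment-demo")
    parser.add_argument(
        "--environment",
        choices=sorted(AZ_CLOUD_NAMES),
        default="public",  # the documented default when the argument is omitted
    )
    return parser


args = build_parser().parse_args(["--environment", "usgovernment"])
print(args.environment, "->", AZ_CLOUD_NAMES[args.environment])
```

With this setup, any value outside the four choices makes argparse exit with a usage error, and omitting the argument selects public.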
Overriding the cluster admin
The farmvibes-ai install script uses the az CLI to infer the credentials of the cluster administrator (the currently logged-in user). In some situations, the user performing the install might not be the user that will actually manage the cluster with kubectl. In such cases, you can use the --cluster-admin-name argument to define who will have access to the cluster.
Azure Cloud Components
The cluster configuration process works by provisioning different resource types on the Azure Cloud, always orchestrated by the farmvibes-ai remote script, detailed in the next section.
The first component that needs to be configured is the Resource Group. The script supports either using an existing Resource Group, or creating a new one. We recommend letting the script create a new resource group, as the script will always assume it is able to manage the resource group (even deleting resources when cluster destruction is requested).
A (1) Resource Group provides a logical container for resources that are deployed within an Azure subscription. In this project, a resource group is created to organize and manage the resources related to the remote cluster, making it easier to monitor, secure, and manage resources collectively.
After the farmvibes-ai remote script ensures the existence of the Resource Group, it creates a (2) Metadata Storage account for the cluster and related Azure resources. The script determines the name of this new storage account by using a deterministic function that hashes the cluster name and the resource group name.
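The idea of a deterministic, hash-based name can be sketched as follows. This is a hypothetical scheme, not the function the script actually uses: the function name, the "fv" prefix, the separator, and the choice of SHA-256 are all assumptions. The only external constraint applied is Azure's rule that storage account names are 3 to 24 characters long and contain only lowercase letters and digits.

```python
import hashlib


def storage_account_name(cluster_name, resource_group, prefix="fv"):
    """Derive a repeatable storage account name (hypothetical scheme)."""
    seed = f"{cluster_name}/{resource_group}".encode("utf-8")
    digest = hashlib.sha256(seed).hexdigest()  # lowercase hex: 0-9 and a-f
    # Azure storage account names must be 3-24 lowercase letters and digits,
    # so the prefixed digest is truncated to 24 characters.
    return (prefix + digest)[:24]


name = storage_account_name("my-cluster", "my-resource-group")
print(name, len(name))
```

Because the name depends only on the two inputs, running the setup again for the same cluster and resource group resolves to the same storage account instead of creating a new one.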
Once the resource group and the metadata storage account exist, the script then uses Terraform to provision the infrastructure, which takes place in three levels, described below.
(3) Infrastructure
Cosmos DB: the setup creates two Azure Cosmos DB accounts, each with its own databases and containers. Azure Cosmos DB is a fully managed, globally distributed, multi-model database service designed for low-latency access to data and seamless horizontal scaling.
Key Vault: a component to store and manage sensitive information such as secrets, connection strings, and credentials used by the services in the remote cluster.
Azure Kubernetes Service (AKS): a managed Kubernetes service that simplifies the deployment, management, and scaling of containerized applications using Kubernetes. In this project, an AKS cluster is created with the specified configurations, enabling you to deploy and manage your containerized services with ease.
Storage Account: a cloud storage service that provides scalable, durable, and highly available storage for various types of data, such as blobs, files, queues, and tables. In this project, the Storage Account is set up to allow access from the virtual network created for the AKS setup, providing secure and isolated access to stored data.
Virtual Network and Subnet: Virtual Network (VNet) is the fundamental building block for creating a private and isolated network in Azure. It enables resources within the network to communicate with each other securely and efficiently.
Public IP Address: Azure Public IP addresses enable you to communicate with resources over the internet, providing public access to services running in the remote cluster. The domain name label is created using a combination of the prefix and a hash of the resource group name, creating a unique and easily identifiable domain name for your services.
(4) Kubernetes Components
After the infrastructure layer is complete, the script proceeds to provision the Kubernetes components, which are:
Redis database for caching data about workflows executed and data generated
RabbitMQ for messaging between FarmVibes.AI services
Dapr (Distributed Application Runtime) for abstracting service invocation and messaging
A persistent volume that uses the storage account created in the previous step as backing store
OpenTelemetry service for collecting telemetry data from services
A certificate management service for configuring TLS endpoints for FarmVibes.AI services such as the REST API
(5) Services
Finally, after the cluster basic components are configured, the script proceeds to define and deploy the FarmVibes.AI services:
Rest-api (Server). A webserver that exposes a REST API so users can call workflows, track workflow execution and retrieve results.
Orchestrator. This component manages workflow execution, transmitting requests to workers and updating workflow status.
Worker. A scalable component responsible for the actual workflow operation computation. Instead of running the whole workflow at once, it computes the atomic chunks processed by the user.
Cache. This component sits between the orchestrator and workers. It checks if an operation was previously executed and, if so, returns cached results to the orchestrator.
Data Ops. This component is responsible for managing data operations such as keeping track of assets related to workflow execution and deleting run data when requested.
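The role of the Cache component described above can be illustrated with a toy, in-memory sketch. This is not FarmVibes.AI code: the class, its methods, and the way the key is built (a hash of the operation name and its inputs) are assumptions chosen to show the general pattern of returning a stored result instead of dispatching work again.

```python
import hashlib
import json


class OpCache:
    """Toy cache keyed by operation name and inputs (illustration only)."""

    def __init__(self):
        self._results = {}
        self.hits = 0

    @staticmethod
    def key(op_name, inputs):
        # Sorting keys makes the hash independent of dict insertion order.
        payload = json.dumps({"op": op_name, "inputs": inputs}, sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def run(self, op_name, inputs, compute):
        """Return the cached result, or call compute(inputs) and store it."""
        k = self.key(op_name, inputs)
        if k in self._results:
            self.hits += 1
            return self._results[k]
        result = compute(inputs)  # stands in for sending work to a worker
        self._results[k] = result
        return result


calls = []


def fake_worker(inputs):
    """Record each invocation so repeated work is visible."""
    calls.append(inputs)
    return inputs["a"] + inputs["b"]


cache = OpCache()
first = cache.run("add", {"a": 1, "b": 2}, fake_worker)
second = cache.run("add", {"b": 2, "a": 1}, fake_worker)  # same inputs, cached
print(first, second, len(calls), cache.hits)  # 3 3 1 1
```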
Installation and Setup
The installation and setup of the AKS cluster for FarmVibes.AI is managed by a script named farmvibes-ai, under the remote action group. This script automates the process of setting up the necessary infrastructure in Azure (discussed above) and deploying the services.
Before running the script, you need to install the vibe_core package which provides the farmvibes-ai command. Here’s how to do it:
1. Make sure you have Python 3.8 or higher installed, with the pip command available. See the Python installation guide and the pip installation guide for instructions on how to install Python and pip.
2. Make sure you have git installed and configured. See the git installation guide for instructions on how to install git.
3. Now you have two options for installing the vibe_core package:
3.1. Install the package directly from the GitHub repository, with pip:
pip install "git+https://github.com/microsoft/farmvibes-ai#egg=vibe_core&subdirectory=src/vibe_core"
3.2. Clone the repository with git clone https://github.com/microsoft/farmvibes-ai.git and install the package from the local copy with pip install ./farmvibes-ai/src/vibe_core.
4. After installing the vibe_core package, you can run the farmvibes-ai script.
The farmvibes-ai script
The farmvibes-ai script manages the full lifecycle of the remote cluster, including creation, deletion, and updates. The script is organized in subcommands, each one responsible for a specific action. The subcommands, with their descriptions and usage instructions, can be shown by running farmvibes-ai remote -h.
To create a new remote cluster, you have to provide some required arguments to farmvibes-ai remote setup: the name of the cluster, the resource group name, and the location of the cluster, as well as an email address to generate the TLS certificates for the REST API.
The script will create the cluster in the resource group with the specified name, creating the resource group first if it doesn’t exist.
NOTE: We recommend you use an empty resource group to create your cluster. If you ever need to destroy your cluster, all changes in your subscription will be restricted to that resource group.
WARNING: If you decide to reuse an already existing resource group, destroying the cluster will delete ALL resources in that resource group.
An example setup command is shown below:
farmvibes-ai remote setup \
--region eastus \
--cert-email testemail@example.com \
--resource-group some-resource-group-name
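The example command above can also be assembled from Python, for instance when the setup is driven by another script. The sketch below uses only the three flags shown in the example (--region, --cert-email, --resource-group). The helper name setup_command is hypothetical, and the sketch only builds and prints the command rather than running it.

```python
import shlex


def setup_command(region, cert_email, resource_group):
    """Build the argument list for the example `farmvibes-ai remote setup` call."""
    return [
        "farmvibes-ai", "remote", "setup",
        "--region", region,
        "--cert-email", cert_email,
        "--resource-group", resource_group,
    ]


cmd = setup_command("eastus", "testemail@example.com", "some-resource-group-name")
print(shlex.join(cmd))

# To actually run it (this provisions Azure resources), you could use:
#   import subprocess
#   subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting issues if a resource group name or email address contains unusual characters.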
Once the command finishes, you will receive a hostname for the cluster, and the script will update the FarmVibes.AI configuration files to point the default client to the newly created cluster’s URL.
INFO - URL for your AKS Cluster is: https://test-build-1234-5678e9-dns.eastus2.cloudapp.azure.com
You can use this URL to access the REST API and the FarmVibes.AI client, following the instructions in our Client user guide:
from vibe_core.client import get_default_vibe_client
client = get_default_vibe_client("remote")
Pre-registering providers
As mentioned before, the installer registers each required provider automatically. If you’d prefer to pre-register all providers beforehand, you can execute the PowerShell script below. Depending on your security settings, it might be easier to copy and paste the code into the prompt than to run it as a script file.
PowerShell script example
# Define the list of providers to register
$providers = @(
    "Microsoft.DocumentDB",
    "Microsoft.KeyVault",
    "Microsoft.ContainerService",
    "Microsoft.Network",
    "Microsoft.Storage",
    "Microsoft.Compute"
)

# Initialize arrays for jobs and a hashtable for job-provider mapping
$jobs = @()
$jobProviderMap = @{}

# Start a job for each provider to register it
foreach ($provider in $providers) {
    $job = Start-Job -ScriptBlock {
        param($provider)
        $currentState = az provider show -n $provider --query "registrationState" -o tsv
        if ($currentState -ne "Registered" -and $currentState -ne "Registering") {
            az provider register --namespace $provider
        }
        do {
            $registrationState = az provider show -n $provider --query "registrationState" -o tsv
            if ($registrationState -ne "Registered") {
                Start-Sleep -Seconds 10
            }
        } while ($registrationState -ne "Registered")
        $provider
    } -ArgumentList $provider
    $jobs += $job
    $jobProviderMap[$job.Id] = $provider
}

# Periodically check the status of each job
while ($jobs | Where-Object { $_.State -eq 'Running' }) {
    Clear-Host
    foreach ($job in $jobs) {
        $state = $job.ChildJobs[0].JobStateInfo.State
        $providerName = $jobProviderMap[$job.Id]
        Write-Host "Provider $providerName is in $state state."
    }
    Start-Sleep -Seconds 10
}

# Retrieve and display the final status of each job, then clean up
foreach ($job in $jobs) {
    $providerName = Receive-Job -Job $job -Wait
    Write-Host "Provider $providerName is now registered."
    Remove-Job -Job $job
}
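For environments without PowerShell (such as a Bash Cloud Shell), the same flow can be sketched in Python. This is an illustrative equivalent, not part of the FarmVibes.AI tooling: it assumes az is on your PATH and you are logged in, and it uses the same az provider show and az provider register calls as the PowerShell script. The function names and the injectable run and sleep parameters are hypothetical, added so the logic can be exercised without touching Azure. Unlike the PowerShell version, it polls providers sequentially instead of in parallel jobs.

```python
import subprocess
import time

PROVIDERS = [
    "Microsoft.DocumentDB",
    "Microsoft.KeyVault",
    "Microsoft.ContainerService",
    "Microsoft.Network",
    "Microsoft.Storage",
    "Microsoft.Compute",
]


def az(args):
    """Run `az <args>` and return its standard output without trailing whitespace."""
    done = subprocess.run(["az", *args], check=True, capture_output=True, text=True)
    return done.stdout.strip()


def registration_state(provider, run):
    """Query the registration state of one provider."""
    return run(["provider", "show", "-n", provider,
                "--query", "registrationState", "-o", "tsv"])


def register_providers(providers, run=az, poll_seconds=10, sleep=time.sleep):
    """Register every provider that is not registered yet, then wait for all."""
    pending = []
    for provider in providers:
        state = registration_state(provider, run)
        if state == "Registered":
            continue
        if state != "Registering":
            run(["provider", "register", "--namespace", provider])
        pending.append(provider)
    # Poll until every pending provider reports "Registered".
    while pending:
        sleep(poll_seconds)
        pending = [p for p in pending if registration_state(p, run) != "Registered"]


# Uncomment to run against your current subscription:
# register_providers(PROVIDERS)
```

As with the PowerShell script, the loop has no timeout, so it keeps polling for as long as a provider stays unregistered.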