# Troubleshooting

This document compiles the most common issues encountered when installing and running the FarmVibes.AI platform, grouped into broad categories. Besides the issues listed here, we also maintain a list of [known issues on our GitHub repository](https://github.com/microsoft/farmvibes-ai/labels/known%20issues) that are currently being addressed by the development team.

- **Package installation:**
**Permission denied when installing `vibe_core`**

Old versions of `pip` might fail to install the `vibe_core` library because they erroneously try to write the library to the system's `site-packages` directory. An excerpt of the error follows:

```
× python setup.py develop did not run successfully.
│ exit code: 1
╰─> [32 lines of output]
    running develop
    /usr/lib/python3/dist-packages/setuptools/command/easy_install.py:158: EasyInstallDeprecationWarning: easy_install command is deprecated. Use build and pip and other standards-based tools.
      warnings.warn(
    WARNING: The user site-packages directory is disabled.
    /usr/lib/python3/dist-packages/setuptools/command/install.py:34: SetuptoolsDeprecationWarning: setup.py install is deprecated. Use build and pip and other standards-based tools.
      warnings.warn(
    error: can't create or remove files in install directory
```

If that happens, you might have to upgrade `pip` itself. Please run `pip install --upgrade pip` if you have write access to the directory where `pip` is installed, or `sudo pip install --upgrade pip` if you need root privileges.
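Alternatively, installing into a virtual environment sidesteps the system `site-packages` directory entirely. A minimal sketch, assuming a local checkout of the repository with the library under `src/vibe_core` (adjust the path to your checkout):

```bash
# Create and activate an isolated environment, then install vibe_core into it.
python3 -m venv ~/.venvs/farmvibes
source ~/.venvs/farmvibes/bin/activate
pip install --upgrade pip        # upgrade pip inside the venv first
pip install ./src/vibe_core      # path assumed; point this at your checkout
```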

- **Cluster setup:**
**How to change the storage location during cluster creation**

You may change the storage location by defining the environment variable `FARMVIBES_AI_STORAGE_PATH` prior to installation with the *farmvibes-ai* command. Alternatively, you may use the `--storage-path` flag when running the `farmvibes-ai local setup` command. For more information, please refer to the help message of the *farmvibes-ai* command.
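For example, both of the following set the storage location to a hypothetical `/mnt/farmvibes` directory:

```bash
# Option 1: environment variable, defined before running the setup command
export FARMVIBES_AI_STORAGE_PATH=/mnt/farmvibes
farmvibes-ai local setup

# Option 2: the --storage-path flag
farmvibes-ai local setup --storage-path /mnt/farmvibes
```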
**Missing secrets**

Running a workflow while missing a required secret will yield the following error message:

```
Could not retrieve secret {secret_name} from Dapr.
```

Add the missing secrets to the Kubernetes cluster. [Learn more about secrets here](SECRETS.md).
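As a sketch, a secret can be added directly with the cluster's bundled `kubectl` (the secret name and value below are placeholders; see [SECRETS.md](SECRETS.md) for the exact names and the recommended way to register each secret):

```bash
# Create a generic Kubernetes secret in the FarmVibes.AI cluster.
~/.config/farmvibes-ai/kubectl create secret generic <secret_name> \
  --from-literal=<secret_name>=<secret_value>
```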
**No route to the REST API**

Building a cluster with the *farmvibes-ai* command will set up a REST API service with an address visible only within the cluster. In case the client cannot reach the REST API, make sure to restart the cluster with:

```bash
farmvibes-ai local restart
```
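To verify connectivity after the restart, you can issue a plain HTTP request against the service address printed by the setup command (the address below matches the default shown elsewhere in this document; yours may differ):

```bash
# A JSON response means the client can reach the REST API again.
curl http://192.168.49.2:30000/v0/runs
```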
**Running out of space even after changing storage location**

If, even after setting the `FARMVIBES_AI_STORAGE_PATH` environment variable to point to another location, you are still running out of space with FarmVibes.AI, you might have to change the storage location of the docker daemon. This happens because, even though asset storage goes into `FARMVIBES_AI_STORAGE_PATH`, we still use temporary space in our worker pods. If your operating system's disk is limited in space (especially when running multiple workers), you might run out of space.

If that's the case, you can change the [docker daemon data directory location](https://docs.docker.com/config/daemon/#daemon-data-directory) to another disk with more space. For example, to instruct the docker daemon to save data in `/mnt/docker-data`, you would have to define the contents of `/etc/docker/daemon.json` as

```json
{
  "data-root": "/mnt/docker-data"
}
```

As an alternative, you might also want to delete data from previous workflow runs to free some space. For more information on how to do that and other data management operations, please refer to the [Data Management user guide](CACHE.md).
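Note that editing `daemon.json` only takes effect after the docker daemon is restarted, for example on systemd-based distributions:

```bash
# Restart the daemon so it picks up the new data-root location.
sudo systemctl restart docker
```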
**Unable to run workflows after machine rebooted**

After a reboot, make sure to start the cluster with:

```bash
farmvibes-ai local start
```

- **Composing and running workflows:**
**Calling an unknown workflow**

Calling `client.run()` with a wrong workflow name will yield the following error message:

```
HTTPError: 400 Client Error: Bad Request for url: http://192.168.49.2:30000/v0/runs. Unable to run workflow with provided parameters. Workflow "WORKFLOW_NAME" unknown
```

Solutions:

- Double-check the workflow name and parameters.
- Verify that your cluster and repo are up to date.
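One way to double-check the name is to list the workflows the cluster knows about. A minimal sketch, assuming the client described in the [client documentation](CLIENT.md):

```python
# Connect to the local cluster and print the available workflow names.
from vibe_core.client import get_default_vibe_client

client = get_default_vibe_client()
print(client.list_workflows())
```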
**Tasks fail with "Abnormal Termination"**

Some workflows, such as the SpaceEye workflow (in the `preprocess.s1.preprocess` task) or the Segment Anything Model (SAM) workflow, might use a large amount of memory depending on the input area and/or time range used for processing. When that's the case, the operating system might terminate the offending task, failing it and the workflow. When inspecting the error reason, users might find a text that says `... ProcessExpired: Abnormal termination`.

One solution is to request processing of a smaller region. Another solution is to scale down the number of workers with the command `~/.config/farmvibes-ai/kubectl scale deployment terravibes-worker --replicas=1`. If, even when doing the above, the task still fails, the Kubernetes cluster might need to be migrated to a machine with more RAM.
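Before and after scaling down, you can inspect the worker deployment with the bundled `kubectl` (same path and deployment name as above):

```bash
# Check how many worker replicas are currently running, then scale down to one.
~/.config/farmvibes-ai/kubectl get deployment terravibes-worker
~/.config/farmvibes-ai/kubectl scale deployment terravibes-worker --replicas=1
```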
**Unable to find ONNX model when running workflows**

Make sure the ONNX model was added to the FarmVibes.AI cluster:

```bash
farmvibes-ai local add-onnx
```

If no output is generated, then your model was successfully added.
**Verifying why a workflow run failed**

In case a workflow run fails, you might see a similar status table when monitoring a run with `run.monitor()` (please refer to the [client documentation](CLIENT.md) for more information on `monitor`):

```
>>> run.monitor()
          🌎 FarmVibes.AI 🌍 dataset_generation/datagen_crop_segmentation 🌏
               Run name: Generating dataset for crop segmentation
               Run id: dd541f5b-4f03-46e2-b017-8e88a518dfe6
               Run status: failed
               Run duration: 00:00:16
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Task Name                          ┃ Status ┃ Start Time          ┃ End Time            ┃ Duration ┃ Progress                 ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ spaceeye.preprocess.s2.s2.download │ failed │ 2022/10/03 22:22:16 │ 2022/10/03 22:22:20 │ 00:00:00 │ ━━━━━━━━━━━━━━━━━━━━ 0/1 │
│ cdl.download_cdl                   │ done   │ 2022/10/03 22:22:12 │ 2022/10/03 22:22:15 │ 00:00:05 │ ━━━━━━━━━━━━━━━━━━━━ 1/1 │
│ spaceeye.preprocess.s2.s2.filter   │ done   │ 2022/10/03 22:22:10 │ 2022/10/03 22:22:12 │ 00:00:02 │ ━━━━━━━━━━━━━━━━━━━━ 1/1 │
│ spaceeye.preprocess.s2.s2.list     │ done   │ 2022/10/03 22:22:09 │ 2022/10/03 22:22:10 │ 00:00:01 │ ━━━━━━━━━━━━━━━━━━━━ 1/1 │
│ cdl.list_cdl                       │ done   │ 2022/10/03 22:22:04 │ 2022/10/03 22:22:09 │ 00:00:04 │ ━━━━━━━━━━━━━━━━━━━━ 1/1 │
└────────────────────────────────────┴────────┴─────────────────────┴─────────────────────┴──────────┴──────────────────────────┘
                              Last update: 2022/10/03 22:23:59
```

The platform logs the possible reason why a task failed, which might be recovered with `run.reason` and `run.task_details`.
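A minimal sketch of recovering those fields programmatically, assuming `run.task_details` maps task names to detail objects exposing `status` and `reason` (see the client documentation for the exact types):

```python
# Print the overall failure reason, then per-task status and reasons.
print(run.reason)
for task_name, details in run.task_details.items():
    print(task_name, details.status, details.reason)
```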
**Workflow run with 'pending' status indefinitely**

If the status of a workflow run remains 'pending' indefinitely, make sure to restart the cluster with:

```bash
farmvibes-ai local restart
```

- **Example notebooks:**
**Unable to import modules when running a notebook**

Make sure you have installed and activated the [micromamba environment](https://mamba.readthedocs.io/en/latest/user_guide/micromamba.html) provided with the notebook.
**Workflow run monitor table not rendering inside notebook**

Make sure to have the `ipywidgets` [package](https://pypi.org/project/ipywidgets/) installed in your environment.
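If the package is missing, it can be installed with `pip`:

```bash
pip install ipywidgets
```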

- **Segment Anything Model (SAM):**
**Adding SAM's ONNX models to the cluster**

Running workflows based on SAM requires the image encoder and prompt encoder/mask decoder to be exported as ONNX models and added to the cluster. To do so, run the following command in the repository root:

```bash
python scripts/export_sam_models.py --models <model_type>
```

where `<model_type>` is a list of model types to be exported (`vit_b`, `vit_l`, `vit_h`). For example, to export all three ViT backbones, run:

```bash
python scripts/export_sam_models.py --models vit_b vit_l vit_h
```

The script will download the models from the [SAM repository](https://github.com/facebookresearch/segment-anything), export each component as a separate ONNX file, and add them to the cluster with the `farmvibes-ai local add-onnx` command. If you are using a different storage location, make sure to pass the `--storage-path` flag to the `add-onnx` command.

Before running the script, make sure you have a micromamba environment set up with the required packages. You can use the environments defined by the `env_cpu.yaml` or `env_gpu.yaml` files in the `notebooks/segment_anything` directory.
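As a sketch, setting up the environment and exporting a single backbone might look like the following (the environment name is defined inside the YAML file; `sam_env` below is a placeholder):

```bash
# Create the environment from the provided file, activate it, and export vit_b.
micromamba create -f notebooks/segment_anything/env_cpu.yaml
micromamba activate sam_env   # placeholder; use the name defined in the YAML file
python scripts/export_sam_models.py --models vit_b
```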
**Unreliable segmentation mask when using bounding box as prompt**

As the input Sentinel-2 rasters may be considerably larger than the images expected by SAM, we split the rasters into 1024 x 1024 chips (with an overlap defined by the `spatial_overlap` parameter of the workflow). This may lead to corner cases that yield unreliable segmentation masks, especially when using a bounding box as prompt. To avoid such cases, consider the following (a sketch of a well-formed prompt group follows the list):

- Only a single bounding box is supported per prompt group (i.e., all points with the same `prompt_id`).
- We recommend providing at least one foreground point within the bounding box. Even though the model supports segmenting rasters solely with a bounding box, the results may be unreliable.
- If the prompt contains a foreground point outside the provided bounding box, the workflow will adjust the bounding box to include all foreground points in that prompt group.
- Background points outside the bounding box are ignored.
- Regions outside the bounding box will be masked out in the final segmentation mask.
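A hypothetical sketch of a prompt group following the rules above: one bounding box plus a foreground point inside it, sharing the same `prompt_id`. The GeoDataFrame layout and the `prompt_id`/`label` column names are assumptions modeled on the SAM example notebooks, not a definitive schema:

```python
import geopandas as gpd
from shapely.geometry import Point, box

# One prompt group (prompt_id=0): a single bounding box and one foreground
# point inside it. The label convention (1 = foreground) is an assumption.
prompts = gpd.GeoDataFrame(
    {
        "prompt_id": [0, 0],
        "label": [1, 1],
        "geometry": [
            box(-119.15, 46.95, -119.10, 47.00),  # the single bounding box
            Point(-119.12, 46.97),                # foreground point within the box
        ],
    },
    crs="EPSG:4326",
)
```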