Local deployment

This page explains how to run the Substra stack locally. The deployment is made of:

  • 1 orchestrator (running in standalone mode, i.e. storing data in its own local database)

  • 2 backends (running in two organisations, org-1 and org-2)

  • 1 frontend

It allows you to run the examples and start using the Substra SDK (also known as substra).

Prerequisites

Hardware

The following table indicates the resources needed to run the Substra stack locally. The minimal profile represents resources needed to be able to run the stack, whereas the recommended profile describes how much is needed to have a comfortable development experience.

             CPU          Hard drive space    RAM
Minimal      2 cores      70 GB               10 GB
Recommended  4-8 cores    100 GB              16 GB

Caution

Choose the parameters passed to Kubernetes carefully, as it might try to use all the allocated resources without regard for the rest of your system.

Caution

Check that enough disk space is allocated to Docker, otherwise you might run into errors.
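
As a rough pre-flight check, the table above can be turned into a few lines of Python. This is only a sketch and not part of Substra: `PROFILES` and `shortfalls` are hypothetical names, the thresholds are copied from the table (4 cores is taken as the floor of the recommended 4-8 range), and the probe at the bottom measures the host machine, not what is actually allocated to Docker. RAM is not probed because the standard library offers no portable call for it.

```python
import os
import shutil

# Thresholds copied from the table above.
PROFILES = {
    "minimal": {"cores": 2, "disk_gb": 70, "ram_gb": 10},
    "recommended": {"cores": 4, "disk_gb": 100, "ram_gb": 16},
}


def shortfalls(cores, disk_gb, ram_gb, profile="minimal"):
    """Return the names of the resources that fall below the given profile."""
    wanted = PROFILES[profile]
    have = {"cores": cores, "disk_gb": disk_gb, "ram_gb": ram_gb}
    return [name for name, minimum in wanted.items() if have[name] < minimum]


if __name__ == "__main__":
    cores = os.cpu_count() or 0
    free_gb = shutil.disk_usage("/").free / 1e9
    # RAM is passed as infinity, i.e. it is deliberately not checked here.
    print("below minimal profile:", shortfalls(cores, free_gb, float("inf")) or "nothing")
```

An empty list means the values passed in satisfy the chosen profile.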

Software

Instructions for Mac

First, install Homebrew, then run the following commands:

brew install k3d
brew install kubectl
brew install skaffold
brew install helm
brew install gsed # Needed for k3-create.sh
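
To confirm that the installation worked, one option is to look each tool up on the PATH. The sketch below assumes the executables carry the same names as the Homebrew packages; docker is added to the list because later steps call it. `REQUIRED_TOOLS` and `missing_tools` are illustrative names, not Substra utilities.

```python
import shutil

# Tools installed by the brew commands above, plus docker (used in later steps).
REQUIRED_TOOLS = ["k3d", "kubectl", "skaffold", "helm", "gsed", "docker"]


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools that cannot be found on the PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    print("All tools found" if not missing else "Missing: " + ", ".join(missing))
```

The `which` parameter only exists so the lookup can be replaced in tests.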

First time configuration

  1. Create a Kubernetes cluster, create and patch the Nginx ingress to enable SSL passthrough:

    1. Download k3-create.sh.

    2. Make the script executable.

      chmod +x ./k3-create.sh
      
    3. Run the script

      ./k3-create.sh
      

    Tip

    This script can be used to reset your development environment.

  2. Add the following line to the /etc/hosts file to allow the communication between your local cluster and the host (your machine):

    127.0.0.1 orchestrator.org-1.com orchestrator.org-2.com substra-frontend.org-1.com substra-frontend.org-2.com substra-backend.org-1.com substra-backend.org-2.com
    
  3. Add the helm repositories

    helm repo add bitnami https://charts.bitnami.com/bitnami
    helm repo add twuni https://helm.twun.io
    helm repo add jetstack https://charts.jetstack.io
    
  4. Clone the Substra components repositories

    • orchestrator

      git clone https://github.com/Substra/orchestrator.git
      
    • substra-backend

      git clone https://github.com/Substra/substra-backend.git
      
    • substra-frontend

      git clone https://github.com/Substra/substra-frontend.git
      
  5. Update Helm charts

    cd orchestrator/charts/orchestrator/
    helm dependency update
    cd ../../../
    cd substra-backend/charts/substra-backend/
    helm dependency update
    cd ../../../
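
A hostname missing from the /etc/hosts entry in step 2 tends to show up later as a connection error. The following sketch checks the content of a hosts file against the six hostnames listed in that step. `EXPECTED_HOSTS`, `unmapped_hosts` and `check_hosts_file` are hypothetical helper names, and the parsing is deliberately simple (whitespace-separated fields, `#` comments).

```python
# Hostnames copied from step 2 above.
EXPECTED_HOSTS = [
    "orchestrator.org-1.com", "orchestrator.org-2.com",
    "substra-frontend.org-1.com", "substra-frontend.org-2.com",
    "substra-backend.org-1.com", "substra-backend.org-2.com",
]


def unmapped_hosts(hosts_text, expected=EXPECTED_HOSTS, address="127.0.0.1"):
    """Return the expected hostnames that hosts_text does not map to address."""
    mapped = set()
    for line in hosts_text.splitlines():
        fields = line.split("#", 1)[0].split()  # drop comments, split on whitespace
        if fields and fields[0] == address:
            mapped.update(fields[1:])
    return [host for host in expected if host not in mapped]


def check_hosts_file(path="/etc/hosts"):
    """Read the hosts file at path and return the hostnames still missing."""
    with open(path) as hosts_file:
        return unmapped_hosts(hosts_file.read())
```

Calling `check_hosts_file()` returns an empty list once every hostname is mapped.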
    

Launching

  • Deploy the orchestrator

    cd orchestrator
    skaffold run
    
  • Deploy the backend

    cd substra-backend
    skaffold run
    

Caution

On arm64 architecture (e.g. Apple silicon chips M1 & M2), you need to add the profiles dev and arm64.

skaffold run -p dev,arm64

Tip

If you need to re-run skaffold run for whatever reason, don’t forget to use skaffold delete to reset the state beforehand (or reset your environment by running the k3-create.sh script again).

Tip

When re-launching the orchestrator and the backend, you can speed things up by skipping the update of the chart dependencies with the nodeps profile.

skaffold run -p nodeps

  • Deploy the frontend

    cd substra-frontend
    docker build -f docker/substra-frontend/Dockerfile --target dev -t substra-frontend .
    docker run -it --rm -p 3000:3000 -e API_URL=http://substra-backend.org-1.com -v ${PWD}/src:/workspace/src substra-frontend
    

    You can access the frontend at http://substra-frontend.org-1.com:3000/. The dev credentials are:

    • login: org-1

    • password: p@sswr0d44
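
The arm64 caution and the nodeps tip above can be folded into a small helper that builds the skaffold command line. This is a sketch under two assumptions that the page does not state: that Linux machines reporting `aarch64` should be treated like arm64, and that nodeps can be combined with the other profiles in a single `-p` list. `skaffold_run_command` is a hypothetical name.

```python
import platform


def skaffold_run_command(machine=None, relaunch=False):
    """Build the `skaffold run` argument list for the given (or current) architecture."""
    machine = (machine or platform.machine()).lower()
    profiles = []
    if machine in ("arm64", "aarch64"):
        # The page asks for both profiles on arm64 (e.g. Apple silicon).
        profiles += ["dev", "arm64"]
    if relaunch:
        # Assumption: nodeps may be combined with the profiles above.
        profiles.append("nodeps")
    command = ["skaffold", "run"]
    if profiles:
        command += ["-p", ",".join(profiles)]
    return command
```

The result can be passed to `subprocess.run(..., cwd="orchestrator", check=True)` and then again with `cwd="substra-backend"`.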

Launching computations

One way to check that everything works is to launch computations on your local deployment. To do so, you can use the MNIST federated learning example and set up the clients with the following values:

client_org_1 = substra.Client(url="http://substra-backend.org-1.com")
client_org_1.login(username="org-1", password="p@sswr0d44")

client_org_2 = substra.Client(url="http://substra-backend.org-2.com")
client_org_2.login(username="org-2", password="p@sswr0d45")
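
If you create clients in several scripts, the values above can be kept in one place. The sketch below only reuses the `substra.Client(url=...)` and `login(username=..., password=...)` calls shown above; `ORGANIZATIONS`, `client_settings` and `make_client` are illustrative names, and the URL pattern is inferred from the two URLs on this page.

```python
# Development credentials quoted on this page; only meant for a local test setup.
ORGANIZATIONS = {
    "org-1": "p@sswr0d44",
    "org-2": "p@sswr0d45",
}


def client_settings(org):
    """Return the backend URL and login arguments for one organisation."""
    return {
        "url": f"http://substra-backend.{org}.com",
        "username": org,
        "password": ORGANIZATIONS[org],
    }


def make_client(org):
    """Create and log in a substra.Client, mirroring the snippet above."""
    import substra  # imported lazily so client_settings works without the SDK

    settings = client_settings(org)
    client = substra.Client(url=settings["url"])
    client.login(username=settings["username"], password=settings["password"])
    return client
```

For example, `clients = {org: make_client(org) for org in ORGANIZATIONS}` gives one logged-in client per organisation.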

Monitoring

You can use the kubectl command to monitor the pods. Tools like k9s and k8lens provide graphical interfaces to monitor the pods and get their logs.
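
For a scripted check, the usual kubectl invocations can be wrapped as follows. This is a sketch that assumes the pods live in the org-1 and org-2 namespaces (the namespaces this page mentions in the Stopping section); the function names are hypothetical and pod names have to be taken from the output of the first command.

```python
import subprocess

NAMESPACES = ("org-1", "org-2")  # assumed from the Stopping section of this page


def get_pods_command(namespace):
    """Argument list that lists the pods of one namespace."""
    return ["kubectl", "get", "pods", "--namespace", namespace]


def logs_command(namespace, pod, tail=100):
    """Argument list that prints the last `tail` log lines of one pod."""
    return ["kubectl", "logs", "--namespace", namespace, pod, f"--tail={tail}"]


def print_pods(namespaces=NAMESPACES):
    """Print the pod list of every namespace, continuing if one call fails."""
    for namespace in namespaces:
        subprocess.run(get_pods_command(namespace), check=False)
```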

Stopping

To stop the Substra stack, you need to stop the 3 components (backend, orchestrator and frontend) individually.

  • Stop the frontend: Stop the process running the local server in Docker (using Control+C)

  • Stop the orchestrator:

    cd orchestrator
    skaffold delete
    
  • Stop the backend:

    cd substra-backend
    skaffold delete
    

If this command fails and you still have pods up, you can use the following command to remove the org-1 and org-2 namespaces entirely.

kubectl delete ns org-1 org-2
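
The stop sequence, including the fallback, can be sketched as one function. It assumes the two repositories were cloned side by side in the current directory, as in the first time configuration; `stop_stack` is a hypothetical name and the `run` parameter only exists so the commands can be checked without a cluster. The frontend still has to be stopped by hand.

```python
import subprocess

REPOSITORIES = ("orchestrator", "substra-backend")
NAMESPACES = ("org-1", "org-2")


def stop_stack(run=subprocess.run, repositories=REPOSITORIES):
    """Run `skaffold delete` in each repository; if any call fails,
    delete the namespaces as a fallback. Returns the commands that were run."""
    executed = []
    failed = False
    for repository in repositories:
        command = ["skaffold", "delete"]
        executed.append(command)
        if run(command, cwd=repository).returncode != 0:
            failed = True
    if failed:
        fallback = ["kubectl", "delete", "ns", *NAMESPACES]
        executed.append(fallback)
        run(fallback)
    return executed
```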

Next steps

You are now ready to go: you can run either the Substra examples or the SubstraFL examples.

This local deployment is for developing or testing Substra. If you want to have a more production-ready deployment and a more customized set-up, have a look at the deployment section.

Documentation on running tests on any of the Substra components is available on the component repositories, see substra, substrafl, substra-tools, substra-backend, orchestrator, substra-frontend and substra-tests repositories.

Troubleshooting

Note

Before going further in this section, you should check the following points:
  • Check the versions of Skaffold, Helm and Docker. For example, Skaffold is released very often and sometimes introduces bugs that cause unexpected errors.

  • Check the version of the different Substra components:

    • if you are using a release you can use the compatibility table.

    • if you are using the latest commit from the main git branch, check that you are up to date and see whether there are any open issues in the repositories or any bug fixes in the latest commits.

You can also go through the instructions one more time, as they may have changed since you last read them.

Troubleshooting prerequisites

This section summarizes errors that occur when the hardware requirements are not met. Please check that you meet them first.

Note

The instructions target specific platforms (Docker for Mac and, in certain cases, Docker for Windows), where you can set the resources allowed to Docker in the configuration panel (information available here for Mac and here for Windows).

The following list describes errors that have already occurred, and their resolutions.

  • <ERROR:substra.sdk.backends.remote.rest_client:Requests error status 502: <html>
    <head><title>502 Bad Gateway</title></head>
    <body>
    <center><h1>502 Bad Gateway</h1></center>
    <hr><center>nginx</center>
    </body>
    </html>
    
    WARNING:root:Function _request failed: retrying in 1s>
    

    You may have to increase the number of CPU available in the settings panel.

  • Unable to connect to the server: net/http: request canceled (Client.Timeout exceeded while awaiting headers)
    
    Unable to connect to the server: net/http: TLS handshake timeout
    

    You may have to increase the RAM available in the settings panel.

  • If you’ve got a task with FAILED status and the logs in the worker are of this form:

    substrapp.exceptions.PodReadinessTimeoutError: Pod substra.ai/pod-name=substra-***-compute-*** failed to reach the \"Running\" phase after 300 seconds."
    

    Your Docker disk image might be full. Increase its size or clean it with docker system prune -a.
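
The list above amounts to a lookup from error signature to remedy, which can be sketched as follows. The signatures are substrings taken from the messages quoted above; `KNOWN_ERRORS` and `suggest_fixes` are illustrative names, and a match is only a hint, since the same message can have other causes.

```python
# (substring to look for, remedy suggested by the list above)
KNOWN_ERRORS = [
    ("502 Bad Gateway", "increase the number of CPUs available to Docker"),
    ("Client.Timeout exceeded while awaiting headers", "increase the RAM available to Docker"),
    ("TLS handshake timeout", "increase the RAM available to Docker"),
    ("PodReadinessTimeoutError", "enlarge or clean the Docker disk image (docker system prune -a)"),
]


def suggest_fixes(log_text):
    """Return the remedies whose signature appears in log_text, without duplicates."""
    fixes = []
    for signature, remedy in KNOWN_ERRORS:
        if signature in log_text and remedy not in fixes:
            fixes.append(remedy)
    return fixes
```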

Troubleshooting deployment

Skaffold version

Skaffold schemas have some incompatibilities between version 1.x and version 2.0. Check your version number and upgrade to Skaffold v2 (2.1.0 recommended) if necessary.

skaffold version
brew upgrade skaffold
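
The version check can also be automated. The sketch below assumes that the output of `skaffold version` contains a semantic version such as v2.1.0 and only compares the major number; `skaffold_major` and `needs_upgrade` are hypothetical helper names.

```python
import re


def skaffold_major(version_output):
    """Extract the major version number from the output of `skaffold version`."""
    match = re.search(r"v?(\d+)\.(\d+)\.(\d+)", version_output)
    if match is None:
        raise ValueError(f"no version number found in {version_output!r}")
    return int(match.group(1))


def needs_upgrade(version_output):
    """True when the reported Skaffold version is older than v2."""
    return skaffold_major(version_output) < 2
```

The text to pass in can be obtained with `subprocess.run(["skaffold", "version"], capture_output=True, text=True).stdout`.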

Other errors during backend deployment

If you encounter one of the following errors while deploying the backend:

Error: UPGRADE FAILED: cannot patch "orchestrator-org-1-server" with kind Certificate: Internal error occurred: failed calling webhook "webhook.cert-manager.io": Post "https://cert-manager-webhook.cert-manager.svc:443/mutate?timeout=10s": dial tcp <ip>:443: connect: connection refused
deploying "orchestrator-org-1": install: exit status 1
Error from server (InternalError): error when creating "STDIN": Internal error occurred: failed calling webhook "webhook.cert-manager.io": Post "https://cert-manager-webhook.cert-manager.svc:443/mutate?timeout=10s": x509: certificate signed by unknown authority

Check that the orchestrator is deployed and relaunch the command skaffold run.