Substra is open source federated learning (FL) software. It provides a flexible Python library and a web application to run federated learning training at scale.
The key Substra differentiators are:
Framework agnostic: Any Python library can be used, such as PyTorch, TensorFlow or sklearn.
Flexible: Any kind of computation can be run, including machine learning and analytics.
Scalable: Support for vertical scaling (several training runs on one machine) and horizontal scaling (one training run spread over several machines).
Traceable: All machine learning operations are logged in an auditable read-only database.
Web application: A web application to monitor long-running computations and explore model performance.
Production ready: Packaged in Kubernetes and regularly audited.
Debugging made easy: Remote error logs are accessible to data scientists. The same code can be run in a deployed production environment or on a single machine to debug.
How does it work?
Substra has three user interfaces:
Substra: a low-level Python library (also called SDK). Substra is used to create datasets, functions and machine learning tasks on the platform.
SubstraFL: a high-level federated learning Python library based on Substra. SubstraFL is used to run complex federated learning experiments at scale.
A web application used to monitor the training of experiments and explore their results.
Client side: Install the Substra and SubstraFL Python libraries with the following command:
pip install substrafl
The Substra Python library is a dependency of SubstraFL, so it is installed automatically. More information can be found in the installation documentation.
Server side: There are two options to deploy the server side of Substra (backend, frontend and orchestrator):
Local deployment: to deploy locally on a single machine. Useful for quick tests and for development.
Production deployment: for real deployments.
You can start doing local FL experiments with Substra by installing only the client side.
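Before running local experiments, it can help to confirm that the client-side install brought in both libraries. The snippet below is a minimal sketch that uses only the Python standard library. The helper name installed_versions is hypothetical and is not part of Substra or SubstraFL, and the sketch assumes the distributions are published under the names substrafl and substra.

```python
from importlib import metadata


def installed_versions(packages=("substrafl", "substra")):
    """Map each distribution name to its installed version, or None if absent.

    Hypothetical helper for illustration; it only queries package metadata
    and never imports the libraries themselves.
    """
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The distribution is not installed in this environment.
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

If either entry is reported as not installed, rerunning the pip command in the active environment is the usual first step.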