Remote

This is an internal Substrafl module. Users should not call any of its functions directly, apart from the remote_data and remote decorators. This module defines how user code is wrapped, transformed and registered as Substra algorithms.

Decorator

Decorators to wrap functions so that they are executed on the remote organizations.

substrafl.remote.decorators.remote(method: Callable)

Decorator for a remote function. When a function with this decorator is called, it is not executed; instead, it returns an AggregateOperation object containing all the information needed to execute it later (see substrafl.remote.operations.AggregateOperation).

  • The decorated function definition should have at least a shared_state argument

  • If the decorated function is called without a _skip=True argument, the arguments required are the ones in remote_method_inner

  • If the decorated function is called with a _skip=True argument, it should have the arguments of its original definition

  • The decorated function should be within a class

  • The __init__ of the class must be

    def __init__(self, *args, **kwargs):
        self.args = args
        self.kwargs = kwargs
    
  • self.args and self.kwargs will be passed to the init; any other init argument is ignored (not saved in the RemoteStruct)

Parameters

method (Callable) – Method to wrap so that it is executed on the remote server
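The deferred-execution behaviour described above can be illustrated with a simplified stand-in. This is not Substrafl's implementation: FakeAggregateOperation, fake_remote and MyAggregator are hypothetical names, the shared_states keyword is assumed from the AggregateOperation signature, and the real decorator stores a RemoteStruct rather than the raw values recorded here.

```python
import functools
from dataclasses import dataclass
from typing import Any, Callable, List, Optional


@dataclass
class FakeAggregateOperation:
    """Hypothetical stand-in for AggregateOperation: records what to run later."""

    cls: type
    cls_args: tuple
    cls_kwargs: dict
    method_name: str
    method_parameters: dict
    shared_states: Optional[List[Any]]


def fake_remote(method: Callable) -> Callable:
    """Simplified sketch of a deferring decorator (not the Substrafl code)."""

    @functools.wraps(method)
    def wrapper(self, *args, _skip=False, **kwargs):
        if _skip:
            # Execute the original method with its original arguments.
            return method(self, *args, **kwargs)
        if args:
            raise TypeError("deferred calls take keyword arguments only")
        # Otherwise, describe the call instead of executing it.
        shared_states = kwargs.pop("shared_states", None)
        return FakeAggregateOperation(
            cls=type(self),
            cls_args=self.args,
            cls_kwargs=self.kwargs,
            method_name=method.__name__,
            method_parameters=kwargs,
            shared_states=shared_states,
        )

    return wrapper


class MyAggregator:
    def __init__(self, *args, **kwargs):
        self.args = args
        self.kwargs = kwargs

    @fake_remote
    def average(self, shared_state):
        return sum(shared_state) / len(shared_state)


aggregator = MyAggregator(42, name="demo")
# Deferred call: nothing is computed, an operation description is returned.
operation = aggregator.average(shared_states=[1.0, 2.0, 3.0])
# Local call: _skip=True runs the method as originally defined.
local_result = aggregator.average(shared_state=[1.0, 2.0, 3.0], _skip=True)
```

In this sketch the same method name serves both purposes: building a description of the work by default, and running the user code when _skip=True is passed.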

substrafl.remote.decorators.remote_data(method: Callable)

Decorator for a remote function containing a data_samples argument (e.g. the Algo.train function). When a function with this decorator is called, it is not executed; instead, it returns a DataOperation object containing all the information needed to execute it later (see substrafl.remote.operations.DataOperation).

  • The decorated function definition should have at least a shared_state argument

  • If the decorated function is called without a _skip=True argument, the arguments required are the ones in remote_method_inner, and it should have at least a data_samples argument

  • If the decorated function is called with a _skip=True argument, it should have the arguments of its original definition

  • The decorated function should be within a class

  • The __init__ of the class must be

    def __init__(self, *args, **kwargs):
        self.args = args
        self.kwargs = kwargs
    
  • self.args and self.kwargs will be passed to the init; any other init argument is ignored (not saved in the RemoteStruct)

Parameters

method (Callable) – Method to wrap so that it is executed on the remote server
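The constraint on __init__ appears to exist so that an equivalent instance can be rebuilt later from the saved positional and keyword arguments. A minimal sketch of that idea follows; MyAlgo and rebuild are hypothetical names, and nothing here is imported from Substrafl.

```python
class MyAlgo:
    def __init__(self, *args, **kwargs):
        # Saving the arguments is what lets the class be re-instantiated later.
        self.args = args
        self.kwargs = kwargs

    def train(self, data_samples, shared_state=None):
        # Toy "training": report how many samples were seen and the configured rate.
        return {
            "n_samples": len(data_samples),
            "lr": self.kwargs.get("learning_rate", 0.1),
        }


def rebuild(cls, cls_args, cls_kwargs):
    """Recreate an equivalent instance from the saved constructor arguments."""
    return cls(*cls_args, **cls_kwargs)


local = MyAlgo(learning_rate=0.01)
remote_copy = rebuild(type(local), local.args, local.kwargs)
result = remote_copy.train(data_samples=["a", "b", "c"])
```

Any state set on the instance outside of args and kwargs would be lost in such a rebuild, which matches the note that other init arguments are not saved.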

Remote Struct

class substrafl.remote.remote_struct.RemoteStruct(cls: Type, cls_args: list, cls_kwargs: dict, remote_cls: Union[Type[substrafl.remote.substratools_methods.RemoteDataMethod], Type[substrafl.remote.substratools_methods.RemoteMethod]], method_name: str, method_parameters: dict, algo_name: Optional[str])

Bases: object

Contains the wrapped user code and the necessary functions to transform it into a Substra asset to execute on the platform.

Parameters
  • cls (Type) – Locally defined class

  • cls_args (list) – Arguments (args) to instantiate the class

  • cls_kwargs (dict) – Arguments (kwargs) to instantiate the class

  • remote_cls (Union[Type[RemoteDataMethod], Type[RemoteMethod]]) – Remote class to create from the user code

  • method_name (str) – Name of the method from the local class to execute

  • method_parameters (dict) – Parameters to pass to the method

  • algo_name (str, Optional) – Custom algo name. If None, set to “{method_name}_{class_name}”
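The default naming rule for algo_name can be sketched as follows. The helper default_algo_name is hypothetical, and reading class_name as cls.__name__ is an assumption.

```python
from typing import Optional


def default_algo_name(cls: type, method_name: str, algo_name: Optional[str] = None) -> str:
    """Hypothetical helper applying the documented "{method_name}_{class_name}" default."""
    if algo_name is not None:
        return algo_name
    return f"{method_name}_{cls.__name__}"


class MyAlgo:
    pass


generated = default_algo_name(MyAlgo, "train")  # "train_MyAlgo"
custom = default_algo_name(MyAlgo, "train", algo_name="my custom algo")
```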

get_cls_file_path() pathlib.Path

Get the path to the file where the cls attribute is defined.

Returns

path to the file where the cls is defined.

Return type

pathlib.Path
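One standard-library way to locate the file in which a class is defined is inspect.getfile. Whether get_cls_file_path uses this exact call is an assumption; class_file_path below is a hypothetical helper, shown on a standard-library class so that the snippet runs anywhere.

```python
import inspect
import json
from pathlib import Path


def class_file_path(cls: type) -> Path:
    """Return the path of the source file that defines ``cls``."""
    return Path(inspect.getfile(cls))


# json.JSONDecoder is defined in the json/decoder.py module of the standard library.
path = class_file_path(json.JSONDecoder)
```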

get_instance() Any

Get the class instance.

Returns

Instance of the saved class

Return type

Any

get_remote_instance() Union[substrafl.remote.substratools_methods.RemoteMethod, substrafl.remote.substratools_methods.RemoteDataMethod]

Get the remote class (i.e. Substra algo) instance.

Returns

instance of the remote Substra class

Return type

Union[RemoteMethod, RemoteDataMethod]

classmethod load(src: pathlib.Path) substrafl.remote.remote_struct.RemoteStruct

Load the remote struct from the src directory.

Parameters

src (pathlib.Path) – Path to the directory where the remote struct has been saved.

Return type

substrafl.remote.remote_struct.RemoteStruct

save(dest: pathlib.Path)

Save the instance to the dest directory using cloudpickle.

Parameters

dest (pathlib.Path) – directory where to save the remote struct

summary() Dict[str, str]

Get a summary of what the remote struct represents.

Returns

description

Return type

Dict[str, str]

Operations

Dataclasses describing the operations to execute on the remote.

class substrafl.remote.operations.AggregateOperation(remote_struct: substrafl.remote.remote_struct.RemoteStruct, shared_states: Optional[List])

Bases: object

Aggregation operation

Parameters
  • remote_struct (substrafl.remote.remote_struct.RemoteStruct)

  • shared_states (Optional[List])

Return type

None

class substrafl.remote.operations.DataOperation(remote_struct: substrafl.remote.remote_struct.RemoteStruct, data_samples: List[str], shared_state: Any)

Bases: object

Data operation

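Parameters
  • remote_struct (substrafl.remote.remote_struct.RemoteStruct)

  • data_samples (List[str])

  • shared_state (Any)

Return type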

None

Substra tools methods

class substrafl.remote.substratools_methods.RemoteDataMethod(instance, method_name: str, method_parameters: typing.Dict, shared_state_serializer: typing.Type[substrafl.remote.serializers.serializer.Serializer] = <class 'substrafl.remote.serializers.pickle_serializer.PickleSerializer'>)

Bases: object

Composite algo to register to Substra

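Parameters
  • instance

  • method_name (str)

  • method_parameters (Dict)

  • shared_state_serializer (Type[Serializer])

load_head_model(path: str) Any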

Load the head model from disk

Parameters

path (str) – path to the saved head model

Returns

loaded head model

Return type

Any

load_trunk_model(path: str) Any

Load the trunk model from disk

Parameters

path (str) – path to the saved trunk model

Returns

loaded trunk model

Return type

Any

predict(inputs: TypedDict, outputs: TypedDict, task_properties: TypedDict) None

predict function

Parameters
  • inputs (TypedDict) – dictionary containing: the testing data samples loaded with Opener.get_data(); the head model loaded with CompositeAlgo.load_head_model(); the trunk model loaded with CompositeAlgo.load_trunk_model().

  • outputs (TypedDict) – dictionary containing: the output predictions path to save the predictions.

  • task_properties (TypedDict) – Unused.

Return type

None

register_substratools_functions()

Register the functions that can be accessed and executed by substratools.

save_head_model(model, path: str) None

Save the head model

Parameters
  • model (Any) – Head model to save

  • path (str) – Path where to save the model

Return type

None

save_trunk_model(model, path: str) None

Save the trunk model

Parameters
  • model (Any) – Trunk model to save

  • path (str) – Path where to save the model

Return type

None

train(inputs: TypedDict, outputs: TypedDict, task_properties: TypedDict) None

train method

Parameters
  • inputs (TypedDict) – dictionary containing: the training data samples loaded with Opener.get_data(); the head model loaded with CompositeAlgo.load_head_model() (may be None); the trunk model loaded with CompositeAlgo.load_trunk_model() (may be None); the rank of the training task.

  • outputs (TypedDict) – dictionary containing: the output head model path to save the head model; the output trunk model path to save the trunk model.

  • task_properties (TypedDict) – Unused.

Return type

None

class substrafl.remote.substratools_methods.RemoteMethod(instance, method_name: str, method_parameters: typing.Dict, shared_state_serializer: typing.Type[substrafl.remote.serializers.serializer.Serializer] = <class 'substrafl.remote.serializers.pickle_serializer.PickleSerializer'>)

Bases: object

Aggregate algo to register to Substra.

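Parameters
  • instance

  • method_name (str)

  • method_parameters (Dict)

  • shared_state_serializer (Type[Serializer])

aggregate(inputs: TypedDict, outputs: TypedDict, task_properties: TypedDict) None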

Aggregation operation

Parameters
  • inputs (TypedDict) – dictionary containing: the list of model paths, each loaded with AggregateAlgo.load_model().

  • outputs (TypedDict) – dictionary containing: the output model path to save the aggregated model.

  • task_properties (TypedDict) – dictionary containing: the rank of the aggregate task.

Return type

None

load_model(path: str) Any

Load the model from disk. This may be one of the input models of the aggregation or the aggregated output model.

Parameters

path (str) – Path where the model is saved

Returns

Loaded model

Return type

Any

register_substratools_functions()

Register the functions that can be accessed and executed by substratools.

save_model(model, path: str)

Save the model

Parameters
  • model (Any) – Model to save

  • path (str) – Path where to save the model

Register

Create the Substra algo assets and register them to the platform.

substrafl.remote.register.register.add_metric(client: substra.sdk.client.Client, permissions: substra.sdk.schemas.Permissions, dependencies: substrafl.dependency.Dependency, metric_function: Callable, metric_name: Optional[str] = None) str

Adds a metric to the Substra platform using the given metric function as the algorithm to execute. The metric function must be of type function, and its signature must contain ONLY datasamples and predictions_path as parameters. An error is raised otherwise.

Parameters
  • client (substra.Client) – The substra client.

  • permissions (substra.sdk.schemas.Permissions) – Permissions for the metric function.

  • dependencies (Dependency) – Metric function dependencies.

  • metric_function (Callable) – function to compute the score from the datasamples and the predictions. This function is registered in substra as a metric.

  • metric_name (str, Optional) – Optional name chosen by the user to identify the metric. If None, the metric name is set to ‘metric_{metric_function.__name__}’.

Returns

The metric key of the metric created from the metric function.

Return type

str
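The signature requirement and the default metric name can be checked locally before calling add_metric. The sketch below uses only the standard library; check_metric_function and default_metric_name are hypothetical helpers, the exception types are assumptions, and the body of accuracy is a placeholder rather than a real metric.

```python
import inspect
from typing import Callable, Optional

EXPECTED_PARAMETERS = {"datasamples", "predictions_path"}


def check_metric_function(metric_function: Callable) -> None:
    """Raise if the metric is not a plain function with exactly the expected parameters."""
    if not inspect.isfunction(metric_function):
        raise TypeError("metric_function must be of type function")
    parameters = set(inspect.signature(metric_function).parameters)
    if parameters != EXPECTED_PARAMETERS:
        raise ValueError(
            f"expected parameters {sorted(EXPECTED_PARAMETERS)}, got {sorted(parameters)}"
        )


def default_metric_name(metric_function: Callable, metric_name: Optional[str] = None) -> str:
    """Apply the documented default: 'metric_{metric_function.__name__}'."""
    if metric_name is not None:
        return metric_name
    return f"metric_{metric_function.__name__}"


def accuracy(datasamples, predictions_path):
    # Placeholder body: a real metric would load predictions from predictions_path
    # and compare them with the labels found in datasamples.
    return 0.0


check_metric_function(accuracy)  # passes silently
name = default_metric_name(accuracy)  # "metric_accuracy"
```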

substrafl.remote.register.register.register_algo(client: substra.sdk.client.Client, remote_struct: substrafl.remote.remote_struct.RemoteStruct, permissions: substra.sdk.schemas.Permissions, inputs: List[substra.sdk.schemas.AlgoInputSpec], outputs: List[substra.sdk.schemas.AlgoOutputSpec], dependencies: substrafl.dependency.Dependency) str

Automatically creates the files needed to register the composite algorithm associated with the remote_struct.

Parameters
  • client (substra.Client) – The substra client.

  • remote_struct (RemoteStruct) – The substra submittable algorithm representation.

  • permissions (substra.sdk.schemas.Permissions) – Permissions for the algorithm.

  • inputs (List[substra.sdk.schemas.AlgoInputSpec]) – List of algo inputs to be used.

  • outputs (List[substra.sdk.schemas.AlgoOutputSpec]) – List of algo outputs to be used.

  • dependencies (Dependency) – Algorithm dependencies.

Returns

Substra algorithm key.

Return type

str

Generate wheels for the Substra algo.

substrafl.remote.register.generate_wheel.local_lib_wheels(lib_modules: List, operation_dir: pathlib.Path, python_major_minor: str, dest_dir: str) str

Prepares the private modules from the lib_modules list to be installed in a Docker image and generates the appropriate install command for a dockerfile. It first creates a wheel for each library. Each library must already be installed locally in the correct version. Use the command pip install -e library-name in the directory of each library.

This allows a user to use a custom version of the passed modules.

Parameters
  • lib_modules (list) – list of modules to be installed.

  • operation_dir (pathlib.Path) – PosixPath to the operation directory

  • python_major_minor (str) – Python version to be used in the dockerfile, e.g. ‘3.8’

  • dest_dir (str) – relative directory where the wheels are saved

Returns

dockerfile command for installing the given modules

Return type

str
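If the Docker image is meant to use the same Python version as the local interpreter (an assumption about the caller's setup), a value for python_major_minor can be derived with the standard library:

```python
import sys

# Build a "major.minor" string such as "3.8" from the running interpreter.
python_major_minor = f"{sys.version_info.major}.{sys.version_info.minor}"
```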

substrafl.remote.register.generate_wheel.pypi_lib_wheels(lib_modules: List, operation_dir: pathlib.Path, python_major_minor: str, dest_dir: str) str

Retrieves lib_modules’ wheels from Owkin private repo (if needed) to be installed in a Docker image and generates the appropriate install command for a dockerfile.

Parameters
  • lib_modules (list) – list of modules to be installed.

  • operation_dir (pathlib.Path) – PosixPath to the operation directory

  • python_major_minor (str) – Python version to be used in the dockerfile, e.g. ‘3.8’

  • dest_dir (str) – relative directory where the wheels are saved

Returns

dockerfile command for installing the given modules

Return type

str

Serializers

Serializers to save the user code and wrap it in the Substra algo code.

class substrafl.remote.serializers.PickleSerializer

Bases: substrafl.remote.serializers.serializer.Serializer

static load(path: pathlib.Path) Any

Load an object from a path using pickle.load

Parameters

path (pathlib.Path) – path to the saved file

Returns

loaded state

Return type

Any

static save(state: Any, path: pathlib.Path)

Pickle the state to path

Parameters
  • state (Any) – state to save

  • path (pathlib.Path) – path where to save it
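The save and load pair described above amounts to a pickle round trip. The sketch below uses the standard pickle module directly instead of importing PickleSerializer, so it only illustrates the underlying mechanism; the state dictionary and file name are made up for the example.

```python
import pickle
import tempfile
from pathlib import Path

state = {"weights": [0.1, 0.2], "n_samples": 3}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = Path(tmp_dir) / "state.pkl"
    # Save: pickle the state to the given path.
    with path.open("wb") as f:
        pickle.dump(state, f)
    # Load: read the object back with pickle.load.
    with path.open("rb") as f:
        loaded = pickle.load(f)
```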

class substrafl.remote.serializers.Serializer

Bases: abc.ABC

abstract static save(state: Any, path: pathlib.Path)

Save the state to the path

Parameters
  • state (Any) – state to save

  • path (pathlib.Path) – path where to save it
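A custom serializer would implement the same interface. The sketch below defines a local stand-in for the abstract base class rather than importing it from Substrafl. JsonSerializer is a hypothetical subclass, and treating load as part of the abstract interface is an assumption based on PickleSerializer providing it.

```python
import abc
import json
import tempfile
from pathlib import Path
from typing import Any


class Serializer(abc.ABC):
    """Local stand-in mirroring the documented interface (not the Substrafl class)."""

    @staticmethod
    @abc.abstractmethod
    def save(state: Any, path: Path):
        raise NotImplementedError

    @staticmethod
    @abc.abstractmethod
    def load(path: Path) -> Any:
        raise NotImplementedError


class JsonSerializer(Serializer):
    """Hypothetical serializer storing JSON-compatible states as text."""

    @staticmethod
    def save(state: Any, path: Path):
        Path(path).write_text(json.dumps(state))

    @staticmethod
    def load(path: Path) -> Any:
        return json.loads(Path(path).read_text())


state = {"weights": [0.1, 0.2], "round": 1}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = Path(tmp_dir) / "state.json"
    JsonSerializer.save(state, path)
    loaded = JsonSerializer.load(path)
```

Because both methods are static, the serializer can be passed around as a class, which is consistent with shared_state_serializer being typed as Type[Serializer].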