3. Build and configure your own client/worker

You can develop a custom client to submit jobs to ArmoniK and invoke services embedded in a worker shared library.

3.1. Background: execution paths, blobs, and the convention

3.1.1. Two execution paths

The C++ SDK supports two ways for a client to communicate with a worker:

|                        | Convention path                       | Legacy path                      |
|------------------------|---------------------------------------|----------------------------------|
| Task input type        | TaskDefinition (named blobs)          | TaskPayload (binary, deprecated) |
| Payload wire format    | JSON envelope with named inputs       | C++-specific binary encoding     |
| Library identification | DynamicLibrary struct in task options | filename: appName.appVersion     |
| Cross-SDK compatible   | Yes (C++, Java, C#)                   | No (C++ only)                    |
| Status                 | Recommended                           | Deprecated                       |

The DynamicWorker selects the path at runtime: the presence of the ConventionVersion key in TaskOptions.options triggers the convention path; its absence falls through to the unchanged legacy path.
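The selection rule above can be sketched in a few lines. This is only an illustration of the rule as described, not the DynamicWorker's actual code: OptionsMap, ExecutionPath and select_path are hypothetical names, and the map stands in for TaskOptions.options.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for the flat TaskOptions.options map.
using OptionsMap = std::map<std::string, std::string>;

enum class ExecutionPath { Convention, Legacy };

// Only the presence of the ConventionVersion key matters for the selection;
// any other key (or an empty map) falls through to the legacy path.
inline ExecutionPath select_path(const OptionsMap &options) {
  return options.count("ConventionVersion") != 0 ? ExecutionPath::Convention
                                                 : ExecutionPath::Legacy;
}
```

For example, select_path({{"ConventionVersion", "v1"}}) yields ExecutionPath::Convention, while an options map without that key yields ExecutionPath::Legacy.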

3.1.2. Blobs

A blob is a named, immutable byte sequence stored in ArmoniK’s result storage and identified by a UUID. Blobs are the standard unit for passing data between clients and workers in the convention path — the concept and terminology are shared with the Java and C# SDKs.

In the convention path:

  • Task inputs are uploaded as blobs before submission. The SDK does this automatically when you use BlobDefinition::FromData.

  • The worker library itself can also be uploaded as a blob via SessionService::UploadLibrary, so the worker pod can fetch and load it at execution time without needing it on the local filesystem.

  • The output of a completed task is stored as a blob. Its ID (result_id) is passed to HandleResponse and can be fed directly into a downstream task via BlobDefinition::FromBlobId — no re-upload needed.

In the legacy path the payload is embedded directly in the task submission request. Blobs are still used internally — the SDK serialises the payload into a blob and the worker retrieves it from blob storage — but this is entirely transparent: the TaskPayload interface hides the blob lifecycle from the caller, and HandleResponse does not expose a result_id for the result blob.
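To make the blob lifecycle concrete, here is a toy in-memory model. It is a sketch under stated assumptions and does not use the SDK: ToyBlobStore and run_square are hypothetical, upload plays the role of BlobDefinition::FromData, and reusing a returned ID plays the role of BlobDefinition::FromBlobId. Real blob IDs are UUIDs; the sketch uses a counter to stay deterministic.

```cpp
#include <map>
#include <string>

// Hypothetical in-memory model of blob storage; not an SDK type.
class ToyBlobStore {
public:
  // Stores the bytes once and returns a fresh ID for them.
  std::string upload(const std::string &bytes) {
    const std::string id = "blob-" + std::to_string(next_id_++);
    blobs_.emplace(id, bytes);
    return id;
  }

  // Read-only access: blobs are immutable, so no update method exists.
  const std::string &download(const std::string &id) const {
    return blobs_.at(id);
  }

private:
  std::map<std::string, std::string> blobs_;
  int next_id_ = 0;
};

// A toy "task": reads its input blob, squares it, and stores the output as a
// new blob whose ID can be handed to a downstream task without re-uploading.
inline std::string run_square(ToyBlobStore &store, const std::string &input_id) {
  const int x = std::stoi(store.download(input_id));
  return store.upload(std::to_string(x * x));
}
```

Uploading "3" and passing the returned ID to run_square produces a second blob containing "9"; the input blob is left unchanged.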

3.1.3. The convention

The convention is an agreed encoding that allows a client written on top of any supported SDK (C++, Java, C#) to submit tasks to a worker written on top of any supported SDK. It consists of two parts:

Task options keys — carried in the flat TaskOptions.options map:

| Key               | Set by                            | Purpose |
|-------------------|-----------------------------------|---------|
| ConventionVersion | SetDynamicLibrary                 | Marks the task as a convention task; value is "v1" |
| LibraryPath       | SetDynamicLibrary                 | Path to the .so. When LibraryBlobId is absent the worker dlopens this path directly (must be a valid path on the worker filesystem). When LibraryBlobId is present the worker ignores this field and resolves the library from blob storage instead. |
| LibraryBlobId     | UploadLibrary + SetDynamicLibrary | Blob ID of the uploaded .so; worker downloads it to a temp path and dlopens it at runtime. When set, LibraryPath is not used by the worker. |
| Symbol            | SetDynamicLibrary                 | Method name forwarded to call() as the name argument |

Payload format — a JSON envelope {"inputs":{...},"outputs":{...}} replacing the C++-specific binary encoding. This is an internal wire format managed by the SDK; workers and clients do not need to parse it directly.
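The precedence between LibraryBlobId and LibraryPath described in the table can be sketched as follows. This is a minimal illustration, not the worker's implementation: LibrarySource and resolve_library are hypothetical names, and the map stands in for TaskOptions.options.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for the flat TaskOptions.options map.
using OptionsMap = std::map<std::string, std::string>;

// Hypothetical result type: where the worker should get the .so from.
struct LibrarySource {
  bool from_blob_storage;  // true: download the blob, false: dlopen the path
  std::string location;    // blob ID or filesystem path
};

// LibraryBlobId wins when present; LibraryPath is only used as a fallback.
inline LibrarySource resolve_library(const OptionsMap &options) {
  const auto blob = options.find("LibraryBlobId");
  if (blob != options.end()) {
    return {true, blob->second};
  }
  const auto path = options.find("LibraryPath");
  return {false, path != options.end() ? path->second : std::string()};
}
```

With both keys set, the sketch returns the blob ID and ignores the path, which mirrors the rule that LibraryPath is not used once LibraryBlobId is present.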

3.1.4. Backward compatibility

All changes are backward compatible. Existing code that uses TaskPayload and HandleResponse(result_payload, taskId) continues to compile and run without modification. The only visible change is a compiler deprecation warning at TaskPayload call sites. The convention is strictly opt-in: if ConventionVersion is absent from task options the DynamicWorker follows the legacy path unchanged.

For new code, use TaskDefinition and override HandleResponse(result_payload, taskId, result_id). If you have existing code that overrides HandleResponse(result_payload, taskId), it will still compile and run — the SDK calls the result_id overload internally and the default implementation delegates to the legacy one — but HandleResponse(result_payload, taskId) is deprecated and will be removed in a future release.
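The delegation between the two HandleResponse overloads can be pictured with a reduced stand-in class. This is a sketch of the pattern as described above, not the SDK's source: HandlerBaseSketch and LegacyHandler are hypothetical, and the real interface may declare these methods differently.

```cpp
#include <string>

// Hypothetical stand-in for the handler base class, reduced to two overloads.
class HandlerBaseSketch {
public:
  virtual ~HandlerBaseSketch() = default;

  // Legacy overload (the one marked deprecated in the text above).
  virtual void HandleResponse(const std::string &result_payload,
                              const std::string &taskId) {
    (void)result_payload;
    (void)taskId;
  }

  // New overload: the default forwards to the legacy one and drops result_id,
  // so handlers that only override the legacy overload keep working.
  virtual void HandleResponse(const std::string &result_payload,
                              const std::string &taskId,
                              const std::string &result_id) {
    (void)result_id;
    HandleResponse(result_payload, taskId);
  }
};

// Existing user code that only knows the legacy overload.
class LegacyHandler : public HandlerBaseSketch {
public:
  using HandlerBaseSketch::HandleResponse;  // keep the 3-argument overload visible

  void HandleResponse(const std::string &result_payload,
                      const std::string &taskId) override {
    last_payload = result_payload;
    last_task = taskId;
  }

  std::string last_payload;
  std::string last_task;
};
```

Calling the three-argument overload through a base reference reaches LegacyHandler's two-argument override, which is why existing handlers continue to receive results.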


3.2. Client

The SessionService class in ArmoniK::Sdk::Client is the entry point for task submission. Its constructor takes a Properties object (configuration + default task options) and a logger.

3.2.1. Configure the session

The Properties object is loaded from a JSON file or environment variables:

  • Endpoint — control plane address

  • mTLS — enable/disable mTLS

  • caCert / clientCert / clientKey — certificate files for mTLS

  • sslValidation — strict SSL validation (default: true)

Environment variables override the JSON file. The default task options carried in Properties include:

  • MaxRetries (default: 3)

  • Priority (default: 2)

  • AppName, AppVersion, AppNamespace, AppService — library and service identification

  • PartitionId — infrastructure partition to route tasks to

  • For the convention path: DynamicLibrary configuration encoded via SetDynamicLibrary
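The rule that environment variables override the JSON file amounts to a two-layer merge. The sketch below illustrates only that precedence; Settings and merge_settings are hypothetical, and how the SDK actually names and parses environment variables is not shown here.

```cpp
#include <map>
#include <string>

// Hypothetical flat view of configuration values, keyed by setting name.
using Settings = std::map<std::string, std::string>;

// Start from the values read from the JSON file, then let every value found
// in the environment overwrite the corresponding file value.
inline Settings merge_settings(const Settings &from_json, const Settings &from_env) {
  Settings merged = from_json;
  for (const auto &entry : from_env) {
    merged[entry.first] = entry.second;
  }
  return merged;
}
```

If the file sets Endpoint and sslValidation and the environment sets only Endpoint, the merged result takes Endpoint from the environment and sslValidation from the file.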

3.2.2. Invocation handler

Implement IServiceInvocationHandler before submitting tasks. Override HandleResponse (with result_id) and HandleError:

class MyHandler : public ArmoniK::Sdk::Client::IServiceInvocationHandler {
public:
  void HandleResponse(const std::string &result_payload,
                      const std::string &taskId,
                      const std::string &result_id) override {
    // result_payload: the worker's return value (raw bytes)
    // result_id:      blob ID of the result — pass to BlobDefinition::FromBlobId
    //                 to chain tasks without re-uploading
  }

  void HandleError(const std::exception &e, const std::string &taskId) override {
    // handle task failure
  }
};

Warning: HandleResponse and HandleError can be called concurrently for different tasks. Your implementation must be thread-safe: protect any shared state (counters, result collections, etc.) with a mutex or equivalent.
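One way to meet the thread-safety requirement is to guard all shared state with a single mutex. The sketch below is an assumption-laden illustration: CollectingHandler only mimics the method shapes of the handler above and does not derive from the SDK interface, so that it compiles on its own; simulate_callbacks is a hypothetical helper that imitates concurrent callbacks.

```cpp
#include <cstddef>
#include <exception>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in with the same method shapes as the handler above.
class CollectingHandler {
public:
  void HandleResponse(const std::string &result_payload,
                      const std::string &taskId,
                      const std::string &result_id) {
    (void)taskId;
    (void)result_id;
    std::lock_guard<std::mutex> lock(mutex_);  // guards results_
    results_.push_back(result_payload);
  }

  void HandleError(const std::exception &e, const std::string &taskId) {
    (void)e;
    (void)taskId;
    std::lock_guard<std::mutex> lock(mutex_);  // guards error_count_
    ++error_count_;
  }

  std::size_t result_count() const {
    std::lock_guard<std::mutex> lock(mutex_);
    return results_.size();
  }

  std::size_t error_count() const {
    std::lock_guard<std::mutex> lock(mutex_);
    return error_count_;
  }

private:
  mutable std::mutex mutex_;
  std::vector<std::string> results_;
  std::size_t error_count_ = 0;
};

// Calls the handler from several threads at once to imitate concurrent
// callbacks for different tasks.
inline void simulate_callbacks(CollectingHandler &handler, int threads,
                               int calls_per_thread) {
  std::vector<std::thread> pool;
  for (int t = 0; t < threads; ++t) {
    pool.emplace_back([&handler, calls_per_thread] {
      for (int i = 0; i < calls_per_thread; ++i) {
        handler.HandleResponse("payload", "task", "blob");
      }
    });
  }
  for (auto &worker : pool) {
    worker.join();
  }
}
```

Without the lock, concurrent push_back calls on the same vector would be a data race; with it, four threads making 250 calls each leave exactly 1000 collected results.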

HandleError is called once per task, after the task is permanently aborted. This happens in two cases:

  • The worker threw ArmoniKSdkException — permanent failure, ArmoniK does not retry regardless of max_retries.

  • The worker threw any other exception — transient failure, ArmoniK retried up to max_retries times and all attempts failed.

3.2.4. Submitting tasks — legacy path

TaskPayload is deprecated but still functional. No convention keys are needed in task options.

service.Submit(
    {ArmoniK::Sdk::Common::TaskPayload("methodName", serialized_args)},
    handler);

Migrate to TaskDefinition when convenient. There is no flag-day deadline but TaskPayload will be removed in a future major release.

3.2.5. Task chaining with result_id

result_id in HandleResponse is the blob ID of the completed task’s output blob. Pass it directly to BlobDefinition::FromBlobId for a downstream task — no re-download and re-upload needed:

class ChainHandler : public ArmoniK::Sdk::Client::IServiceInvocationHandler {
public:
  std::string last_result_id;

  void HandleResponse(const std::string &, const std::string &,
                      const std::string &result_id) override {
    last_result_id = result_id;
  }
  void HandleError(const std::exception &e, const std::string &) override { /*...*/ }
};

// Compute 2² and 3² in parallel, then feed their results into an "add" task.
// handler_a and handler_b collect the result blob IDs.
service.Submit({TaskDefinition("square", {{"x", BlobDefinition::FromData("2")}})}, handler_a, opts);
service.Submit({TaskDefinition("square", {{"x", BlobDefinition::FromData("3")}})}, handler_b, opts);
service.WaitResults();

// handler_a->last_result_id and handler_b->last_result_id are now set.
service.Submit(
    {TaskDefinition("add",
        {{"a", BlobDefinition::FromBlobId(handler_a->last_result_id)},
         {"b", BlobDefinition::FromBlobId(handler_b->last_result_id)}})},
    handler_c, opts);
service.WaitResults();
// result: 4 + 9 = 13

3.2.6. Waiting for results

WaitResults blocks until the requested tasks complete and calls the handler for each result:

service.WaitResults();                        // wait for all submitted tasks
service.WaitResults({"id1", "id2"});          // wait for a specific subset
service.WaitResults({}, Any | BreakOnError);  // return on first completion or abort

Parameters:

  • task_ids: IDs to wait on; empty means all tasks submitted with this service.

  • waitBehavior: All (default) or Any; combine either with BreakOnError to return as soon as a result is aborted.

  • waitOptions: polling interval and other timing controls.
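The way the wait-behaviour flags combine can be sketched with a small bitmask. Only the names All, Any and BreakOnError come from the text above; the numeric values, the sketch namespace and the should_return helper are assumptions made for illustration, and the SDK's real definitions may differ.

```cpp
namespace sketch {

// Hypothetical numeric values; only the names come from the text above.
enum WaitBehavior : unsigned { All = 0u, Any = 1u, BreakOnError = 2u };

// Decides whether a wait loop may return, given the counts seen so far.
// Assumption: a task counts as finished once it has completed or aborted.
inline bool should_return(unsigned behavior, int completed, int aborted, int total) {
  if ((behavior & BreakOnError) != 0u && aborted > 0) {
    return true;  // leave as soon as one result is aborted
  }
  if ((behavior & Any) != 0u) {
    return completed + aborted > 0;  // the first finished task is enough
  }
  return completed + aborted == total;  // All: every task has finished
}

}  // namespace sketch
```

Under these assumptions, Any | BreakOnError returns on the first completion or abort, while All | BreakOnError waits for every task unless one aborts first.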

3.2.7. Cross-SDK interoperability samples

A C++ worker built with the convention path can receive tasks from clients written in other SDKs. See the reference clients in the ArmoniK Samples repository.

The worker library used by those reference clients is also in the Samples repository; see the worker code in the ChainedArithmetic sample.


3.3. Worker

The ArmoniK DynamicWorker receives tasks and dispatches them to a shared library. It selects the execution path at runtime:

  • If ConventionVersion is present in TaskOptions.options → convention path: the worker resolves the library (from filesystem or blob storage), parses the JSON payload, and calls ServiceBase::call(ctx, name, inputs_map).

  • Otherwise → legacy path: the worker locates the library by application_name/application_version filename and calls ServiceBase::call(ctx, name, raw_payload).

In both cases the implementation lives in a shared library (ServiceBase subclass + armonik_create_service).

3.3.1. Deploying the shared library

The library must be accessible to the worker at execution time. Two options:

  • Copy to shared storage (legacy path, and convention path without UploadLibrary):

    # HostPath deployment (local):
    kubectl -n armonik get secret -o template={{.data.host_path}} shared-storage | base64 -d
    
    # S3 deployment:
    kubectl -n armonik get secret -o template={{.data.service_url}} shared-storage | base64 -d
    

    For the legacy path the filename must match application_name.application_version or application_name.

  • Upload as a blob (convention path with UploadLibrary): call service.UploadLibrary(path, lib) from the client. The worker downloads the .so from blob storage at task execution time; no manual file copy is required. library_path in DynamicLibrary is then optional.

    The call flow and service/session lifetimes within a worker pod are described in the flowchart in the appendix.

3.3.2. Building the shared library

Link your library to ArmoniK.SDK.Worker (C++14 or later). It provides default implementations of armonik_enter_session, armonik_call, armonik_leave_session, and armonik_destroy_service. You only need to provide armonik_create_service.

Derive from ArmoniK::Sdk::Worker::ServiceBase and implement armonik_create_service. Reference examples: EchoService.h, AdditionService.h, ServiceDispatch.cpp.

3.3.2.2. Legacy-path worker

Override the string-based call() to receive the raw binary payload. The default implementation of the string overload parses the payload as convention JSON and delegates to the map overload — so existing legacy workers that already override the string version are unaffected by the PR #76 changes.

std::string call(void *session_ctx, const std::string &name,
                 const std::string &input) override {
  // deserialize `input` using your own binary format
  return serialized_result;
}

The library is loaded by filename: application_name.application_version or application_name.
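The two accepted filenames can be generated as below. This is a sketch, not the worker's lookup code: legacy_library_candidates is a hypothetical helper, and trying the versioned name before the bare name is an assumption, since the text only states that either form is accepted.

```cpp
#include <string>
#include <vector>

// Returns the filenames the legacy path accepts for a given application.
// Assumption: the versioned name is listed (and would be tried) first.
inline std::vector<std::string> legacy_library_candidates(
    const std::string &application_name,
    const std::string &application_version) {
  return {application_name + "." + application_version, application_name};
}
```

For an application named "myapp" with version "1.0.0", the candidates are "myapp.1.0.0" and "myapp".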

3.3.2.3. Exception handling

The SDK maps exceptions thrown by ServiceBase::call() to two outcomes:

| Exception type                            | Status          | ArmoniK behaviour |
|-------------------------------------------|-----------------|-------------------|
| ArmoniK::Sdk::Common::ArmoniKSdkException | Permanent error | Task is marked Error; no retry is performed regardless of max_retries. HandleError is called on the client. |
| Any other std::exception                  | Transient error | ArmoniK retries the task according to TaskOptions::max_retries. Once all retries are exhausted HandleError is called on the client. |

Use ArmoniKSdkException for unrecoverable failures (bad input, unsupported method, logic errors) where retrying would never succeed. Let any other exception propagate as-is for transient conditions (network timeouts, temporary resource unavailability) that a retry may resolve.

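#include <map>
#include <stdexcept>
#include <string>

#include <armonik/sdk/common/ArmoniKSdkException.h>

std::string call(void *ctx, const std::string &name,
                 const std::map<std::string, std::string> &inputs) override {
  if (name != "square") {
    // Permanent: retrying won't help, an unknown method name won't change.
    throw ArmoniK::Sdk::Common::ArmoniKSdkException("unknown method: " + name);
  }

  auto it = inputs.find("x");
  if (it == inputs.end()) {
    throw ArmoniK::Sdk::Common::ArmoniKSdkException("missing input 'x'");
  }

  int x = 0;
  try {
    x = std::stoi(it->second);
  } catch (const std::logic_error &) {
    // std::stoi throws std::invalid_argument or std::out_of_range. Malformed
    // input is a permanent failure, so convert it instead of letting it
    // propagate and trigger pointless retries.
    throw ArmoniK::Sdk::Common::ArmoniKSdkException("input 'x' is not a valid integer");
  }

  // Any other std::exception propagated from this method (e.g. from a
  // database call) will trigger a retry automatically.
  return std::to_string(x * x);
}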
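The mapping from exception type to outcome can be sketched independently of the SDK. Everything here is a stand-in for illustration: PermanentErrorSketch plays the role of ArmoniKSdkException, and Outcome and classify are hypothetical names; the real SDK performs this mapping internally.

```cpp
#include <functional>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for ArmoniK::Sdk::Common::ArmoniKSdkException.
class PermanentErrorSketch : public std::runtime_error {
public:
  using std::runtime_error::runtime_error;
};

enum class Outcome { Success, PermanentError, TransientError };

// Runs the task and reports which of the outcomes in the table applies.
inline Outcome classify(const std::function<void()> &task) {
  try {
    task();
    return Outcome::Success;
  } catch (const PermanentErrorSketch &) {
    return Outcome::PermanentError;  // most specific handler first: no retry
  } catch (const std::exception &) {
    return Outcome::TransientError;  // anything else is eligible for retry
  }
}
```

The order of the catch clauses matters: because the stand-in derives from std::exception, the more specific handler has to come first or every failure would be classified as transient.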

3.3.2.4. Raw C symbols (advanced)

If you prefer not to use ServiceBase, implement the five mandatory C symbols directly:

  • armonik_create_service — creates and returns a service context (may be null).

  • armonik_enter_session — called on session start; returns a session context.

  • armonik_call — executes a method and sends the result or failure via the provided callback.

  • armonik_leave_session — frees session context resources.

  • armonik_destroy_service — frees service context resources.

See the ArmoniKSDKInterface.h documentation for the full signatures.