Cano

Type-safe async workflow engine with built-in scheduling, retry logic, and state machine semantics.

Cano is still far from a 1.0 release; the API is unstable, and new versions may include breaking changes.

Cano is a high-performance orchestration engine designed for building resilient, self-healing systems in Rust. Unlike simple task queues, Cano uses finite state machines (FSMs) to define strict, type-safe transitions between processing steps.

It excels at managing complex lifecycles where state transitions matter:

Data Pipelines: ETL jobs with parallel processing (Split/Join) and aggregation.
AI Agents: Multi-step inference chains with shared context and memory; see cargo run --example ai_workflow_yes_and (requires a local Ollama instance).
Background Systems: Scheduled maintenance, periodic reporting, and distributed cron jobs.

Features

Processing Models

A whole Task family: plain Task, side-effect-free RouterTask, wait-until PollTask, fan-out BatchTask, resumable SteppedTask — mixed freely in one workflow.

State Machines

Type-safe enum-driven state transitions with compile-time checking.
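
As a rough illustration of what enum-driven transitions buy you, the sketch below uses only the standard library; it is not Cano's API. The State enum and the step and run_to_exit functions are hypothetical names chosen for this example. Because a match must be exhaustive, adding a state without handling it fails to compile.

```rust
// Illustrative only: a std-only state machine, not Cano's API.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum State {
    Start,
    Process,
    Complete,
}

/// One transition step. The match covers every variant, so adding a new
/// state without handling it here is a compile error.
fn step(state: State) -> State {
    match state {
        State::Start => State::Process,
        State::Process => State::Complete,
        State::Complete => State::Complete, // exit state: stays put
    }
}

/// Drive the machine from `initial` until it reaches the exit state,
/// returning every state visited, in order.
fn run_to_exit(initial: State) -> Vec<State> {
    let mut visited = vec![initial];
    let mut current = initial;
    while current != State::Complete {
        current = step(current);
        visited.push(current);
    }
    visited
}
```

Calling run_to_exit(State::Start) walks Start, Process, Complete and then stops.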

Retry Strategies

Fixed delays, exponential backoff with jitter, and custom strategies.
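
To make "exponential backoff with jitter" concrete, here is a minimal std-only sketch of the arithmetic. It is not Cano's retry API: backoff_delay and with_full_jitter are hypothetical helpers, and the random fraction is passed in by the caller so the example needs no external crate.

```rust
use std::time::Duration;

/// Exponential backoff: base * 2^attempt, capped at `max`.
/// (Hypothetical helper for illustration, not part of Cano.)
fn backoff_delay(base: Duration, max: Duration, attempt: u32) -> Duration {
    // Saturate instead of overflowing for very large attempt numbers.
    let factor = 2u32.checked_pow(attempt).unwrap_or(u32::MAX);
    base.checked_mul(factor).unwrap_or(max).min(max)
}

/// "Full jitter": scale the delay by a fraction in [0, 1].
/// The caller supplies the fraction, e.g. from a random number generator.
fn with_full_jitter(delay: Duration, fraction: f64) -> Duration {
    // NaN would make mul_f64 panic, so treat it as "no jitter".
    if fraction.is_nan() {
        return delay;
    }
    delay.mul_f64(fraction.clamp(0.0, 1.0))
}
```

With a 100 ms base and a 5 s cap, attempts 0 through 3 wait at most 100, 200, 400 and 800 ms; later attempts are held at the cap. Jitter spreads retries out so that many failing callers do not retry in lockstep.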

Scheduling

Built-in scheduler with intervals, cron schedules, and manual triggers.

Concurrency

Execute multiple workflow instances in parallel with timeout strategies.

Crash Recovery

Pluggable CheckpointStore records every state entry with an optional workflow_version stamp; resume_from rehydrates a crashed run and refuses checkpoints whose version disagrees with the workflow. An embedded, ACID RedbCheckpointStore is available behind the recovery feature.
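
The version check can be pictured with a small std-only sketch. This is not the CheckpointStore trait or resume_from itself: the Checkpoint struct, ResumeError enum and last_resumable function are hypothetical, and accepting checkpoints that carry no version stamp is an assumption made for this example.

```rust
// Illustrative only: hypothetical types, not Cano's CheckpointStore API.
#[derive(Debug, Clone, PartialEq, Eq)]
struct Checkpoint {
    state: String,
    workflow_version: Option<String>, // the stamp is optional
}

#[derive(Debug, PartialEq, Eq)]
enum ResumeError {
    NoCheckpoint,
    VersionMismatch { expected: String, found: String },
}

/// Return the last recorded checkpoint, refusing it when its version stamp
/// disagrees with the running workflow's version.
/// Assumption: an unstamped checkpoint is accepted.
fn last_resumable<'a>(
    log: &'a [Checkpoint],
    current_version: &str,
) -> Result<&'a Checkpoint, ResumeError> {
    let last = log.last().ok_or(ResumeError::NoCheckpoint)?;
    match &last.workflow_version {
        Some(found) if found != current_version => Err(ResumeError::VersionMismatch {
            expected: current_version.to_string(),
            found: found.clone(),
        }),
        _ => Ok(last),
    }
}
```

Refusing a mismatched checkpoint avoids resuming a run in a state that the redeployed workflow may no longer define.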

Sagas / Compensation

Pair a forward step with a compensate action; a later failure rolls back the work already done, in reverse order, and the rollback is replayed if a crash interrupts it.
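
The reverse-order rollback can be sketched with the standard library alone. This is not Cano's saga API: SagaStep and run_saga are hypothetical, and a boolean stands in for the outcome of a real forward action.

```rust
// Illustrative only: hypothetical types, not Cano's saga API.
struct SagaStep {
    name: &'static str,
    succeeds: bool, // stand-in for the outcome of a real forward action
}

/// Run steps in order. On the first failure, compensate every completed
/// step in reverse order. Returns whether the saga succeeded, plus an
/// event log of what happened.
fn run_saga(steps: &[SagaStep]) -> (bool, Vec<String>) {
    let mut log = Vec::new();
    let mut done: Vec<&SagaStep> = Vec::new();
    for step in steps {
        if step.succeeds {
            log.push(format!("forward:{}", step.name));
            done.push(step);
        } else {
            log.push(format!("failed:{}", step.name));
            // Undo completed work, most recent first.
            while let Some(prev) = done.pop() {
                log.push(format!("compensate:{}", prev.name));
            }
            return (false, log);
        }
    }
    (true, log)
}
```

For steps reserve, charge, ship where ship fails, the log reads forward:reserve, forward:charge, failed:ship, compensate:charge, compensate:reserve.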

Observability

Built-in tracing spans and metrics counters, plus WorkflowObserver hooks and resource health probes.

Resilient, Self-Healing

What the tagline means, concretely. Every one of these is opt-in and zero-cost when unused — the FSM dispatch hot path stays allocation-light whether or not you wire any of it up.

Resilient — recover from transient faults

Retries — fixed, or exponential backoff with jitter
Per-attempt timeouts and a total workflow timeout with a bounded compensation drain
Circuit breaker — short-circuit a failing dependency
Split bulkhead — cap concurrent parallel tasks
Panic safety — a panicking task becomes an error, never unwinds the engine
Scheduler backoff & trip for flaky scheduled flows
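
Of the items above, the circuit breaker is the easiest to picture in a few lines. The sketch below is std-only and is not Cano's implementation: CircuitBreaker and BreakerError are hypothetical names, the breaker trips after a fixed number of consecutive failures, and the timed reset (half-open state) that a production breaker would have is left out.

```rust
// Illustrative only: a simplified breaker, not Cano's implementation.
#[derive(Debug, PartialEq, Eq)]
enum BreakerError<E> {
    /// The breaker is open; the operation was not attempted.
    Open,
    /// The operation ran and failed with this error.
    Inner(E),
}

struct CircuitBreaker {
    threshold: u32,
    consecutive_failures: u32,
}

impl CircuitBreaker {
    fn new(threshold: u32) -> Self {
        Self { threshold, consecutive_failures: 0 }
    }

    fn is_open(&self) -> bool {
        self.consecutive_failures >= self.threshold
    }

    /// Run `op` unless the breaker is open. A success resets the failure
    /// count; a failure increments it.
    fn call<T, E>(&mut self, op: impl FnOnce() -> Result<T, E>) -> Result<T, BreakerError<E>> {
        if self.is_open() {
            return Err(BreakerError::Open);
        }
        match op() {
            Ok(value) => {
                self.consecutive_failures = 0;
                Ok(value)
            }
            Err(err) => {
                self.consecutive_failures += 1;
                Err(BreakerError::Inner(err))
            }
        }
    }
}
```

Once the threshold is reached, further calls return BreakerError::Open without touching the failing dependency, which is the short-circuit behavior described in the list.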

Self-healing — repair & report on its own state

Checkpoint + resume — replay a crashed run from its last state
Sagas / compensation — roll back completed work in reverse on failure
Observer hooks — synchronous lifecycle / failure / retry / checkpoint events
Resource health probes — on-demand health for a workflow's dependencies

Full coverage: the Resilience, Recovery, Saga and Observers guides.

Getting Started

Cano requires Rust 1.89.0+ (edition 2024). Add it to your Cargo.toml:

[dependencies]
cano = { version = "0.13", features = ["all"] }
tokio = { version = "1", features = ["macros", "rt-multi-thread"] }

Cano ships with no features enabled by default. features = ["all"] turns on all four optional features at once:

scheduler — the Scheduler (cron + interval + manual triggers)
tracing — tracing-crate spans and the TracingObserver
recovery — RedbCheckpointStore, the embedded ACID checkpoint store (the CheckpointStore trait itself is always available)
metrics — metrics-crate counters / histograms / gauges and the MetricsObserver

Pick only what you need — e.g. features = ["recovery"], or omit features entirely for the lean core. Cano runs on the Tokio runtime, so tokio is a required direct dependency — you launch the runtime via #[tokio::main] or tokio::runtime::Builder. The two tokio features above are the minimum to do that; add "time", "sync", etc. only if your own code calls into them, or use "full" if you prefer convenience over compile time.

Basic Example

use cano::prelude::*;

// Define workflow states
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
enum WorkflowState {
    Start,
    Process,
    Complete,
}

// #[derive(Resource)] generates a no-op Resource impl for stateless config structs
#[derive(Resource)]
struct AppConfig { batch_size: usize }

// #[task] handles the async-trait rewrite — no external async-trait crate needed
#[derive(Clone)]
struct SimpleTask;

#[task(state = WorkflowState)]
impl SimpleTask {
    async fn run(&self, res: &Resources) -> Result<TaskResult<WorkflowState>, CanoError> {
        let config = res.get::<AppConfig, _>("config")?;
        println!("Processing task (batch_size={})...", config.batch_size);
        Ok(TaskResult::Single(WorkflowState::Process))
    }
}

#[derive(Clone)]
struct DoneTask;

#[task(state = WorkflowState)]
impl DoneTask {
    async fn run_bare(&self) -> Result<TaskResult<WorkflowState>, CanoError> {
        println!("Done!");
        Ok(TaskResult::Single(WorkflowState::Complete))
    }
}

#[tokio::main]
async fn main() -> Result<(), CanoError> {
    let resources = Resources::new()
        .insert("config", AppConfig { batch_size: 64 });

    let workflow = Workflow::new(resources)
        .register(WorkflowState::Start, SimpleTask)
        .register(WorkflowState::Process, DoneTask)
        .add_exit_state(WorkflowState::Complete);

    workflow.orchestrate(WorkflowState::Start).await?;
    Ok(())
}

Run a working version with cargo run --example workflow_simple (or task_simple for the bare-task variant).

Where to go next

New to Cano? Read the docs roughly in this order:

Workflows — defining states, the builder, validation, and how a run executes.
Resources — typed, lifecycle-managed dependency injection (every task receives a &Resources).
Task — the default processing unit, then the rest of the Task family (RouterTask, PollTask, BatchTask, SteppedTask) as you hit a shape that fits.
Split & Join and Scheduler — parallelism within a workflow, and time-driven execution of workflows.
Resilience & recovery: Resilience, Recovery, Saga.
Observability: Tracing, Metrics, Observers.

Every concept has a runnable example under cano/examples/ — each page links the relevant ones.