Cano
Type-safe async workflow engine with built-in scheduling, retry logic, and state machine semantics.
Cano is still far from a 1.0 release; the API is subject to change and may include breaking changes.
Cano is a high-performance orchestration engine designed for building resilient, self-healing systems in Rust. Unlike simple task queues, Cano uses Finite State Machines (FSM) to define strict, type-safe transitions between processing steps.
It excels at managing complex lifecycles where state transitions matter:
- Data Pipelines: ETL jobs with parallel processing (Split/Join) and aggregation.
- AI Agents: Multi-step inference chains with shared context and memory; see cargo run --example ai_workflow_yes_and (needs a local Ollama).
- Background Systems: Scheduled maintenance, periodic reporting, and distributed cron jobs.
Features
Processing Models
A whole Task family: plain Task, side-effect-free RouterTask, wait-until PollTask, fan-out BatchTask, resumable SteppedTask — mixed freely in one workflow.
State Machines
Type-safe enum-driven state transitions with compile-time checking.
Retry Strategies
Fixed delays, exponential backoff with jitter, and custom strategies.
Scheduling
Built-in scheduler with intervals, cron schedules, and manual triggers.
Concurrency
Execute multiple workflow instances in parallel with timeout strategies.
Crash Recovery
A pluggable CheckpointStore records every state entry with an optional workflow_version stamp; resume_from rehydrates a crashed run and refuses checkpoints whose version disagrees with the workflow. An embedded, ACID RedbCheckpointStore is available behind the recovery feature.
Sagas / Compensation
Pair a forward step with a compensate action; a later failure rolls back the work already done, in reverse — and replays the rollback across a crash.
Observability
Built-in tracing spans and metrics counters, plus WorkflowObserver hooks and resource health probes.
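To make the "exponential backoff with jitter" retry strategy concrete, here is a minimal, std-only sketch of how such a delay schedule might be computed. It is not Cano's API: `backoff_delay`, its parameters, and the per-mille jitter input are hypothetical names chosen for illustration, and a real caller would draw the jitter value from a random source.

```rust
use std::time::Duration;

/// Hypothetical helper (not part of Cano): delay before retry `attempt` (0-based).
/// The delay doubles on each attempt starting from `base`, is capped at `max`,
/// and is then scaled into the range [50%, 100%] of the capped value by
/// `jitter_permille` (0..=1000; larger inputs are clamped to 1000).
fn backoff_delay(attempt: u32, base: Duration, max: Duration, jitter_permille: u32) -> Duration {
    // 2^attempt, saturating so large attempt numbers cannot overflow.
    let factor = 2u32.checked_pow(attempt).unwrap_or(u32::MAX);
    // Exponential growth, capped at `max` (also used if the multiply overflows).
    let capped = base.checked_mul(factor).unwrap_or(max).min(max);
    // Keep half of the delay fixed and randomize the other half.
    let half = capped / 2;
    half + half * jitter_permille.min(1000) / 1000
}

fn main() {
    let base = Duration::from_millis(100);
    let max = Duration::from_secs(5);
    for attempt in 0..8 {
        // Jitter is fixed at its maximum here so the output is deterministic.
        let delay = backoff_delay(attempt, base, max, 1000);
        println!("attempt {attempt}: wait {delay:?}");
    }
}
```

Keeping half of the delay fixed guarantees a minimum wait between attempts, while the randomized half spreads out retries from many workflow instances that failed at the same moment.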
Resilient, Self-Healing
What the tagline means, concretely. Every one of these is opt-in and zero-cost when unused — the FSM dispatch hot path stays allocation-light whether or not you wire any of it up.
Resilient — recover from transient faults
- Retries — fixed, or exponential backoff with jitter
- Per-attempt timeouts and workflow total timeout with bounded compensation drain
- Circuit breaker — short-circuit a failing dependency
- Split bulkhead — cap concurrent parallel tasks
- Panic safety — a panicking task becomes an error, never unwinds the engine
- Scheduler backoff & trip for flaky scheduled flows
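As an illustration of what "short-circuit a failing dependency" means, the following std-only sketch models a circuit breaker as a small state machine. It is an assumption-laden teaching example, not Cano's implementation: `CircuitBreaker`, `BreakerState`, `allow`, and `record` are hypothetical names, and time is passed in explicitly so the behavior is deterministic.

```rust
use std::time::{Duration, Instant};

/// Hypothetical breaker states (not Cano's types).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum BreakerState {
    Closed { failures: u32 }, // calls flow; consecutive failures are counted
    Open { since: Instant },  // calls are rejected until the cool-down elapses
    HalfOpen,                 // cool-down elapsed; the next result decides
}

struct CircuitBreaker {
    state: BreakerState,
    threshold: u32,
    cooldown: Duration,
}

impl CircuitBreaker {
    fn new(threshold: u32, cooldown: Duration) -> Self {
        Self { state: BreakerState::Closed { failures: 0 }, threshold, cooldown }
    }

    /// Returns true if a call may proceed at time `now`.
    /// Simplification: every call is allowed while half-open, not just one probe.
    fn allow(&mut self, now: Instant) -> bool {
        match self.state {
            BreakerState::Closed { .. } | BreakerState::HalfOpen => true,
            BreakerState::Open { since } => {
                if now.duration_since(since) >= self.cooldown {
                    self.state = BreakerState::HalfOpen;
                    true
                } else {
                    false
                }
            }
        }
    }

    /// Records the outcome of a call that was allowed to run.
    fn record(&mut self, ok: bool, now: Instant) {
        self.state = match (self.state, ok) {
            // Any success closes the breaker and resets the failure count.
            (_, true) => BreakerState::Closed { failures: 0 },
            // A failure below the threshold only increments the count.
            (BreakerState::Closed { failures }, false) if failures + 1 < self.threshold => {
                BreakerState::Closed { failures: failures + 1 }
            }
            // Threshold reached, or a half-open probe failed: (re)open.
            (_, false) => BreakerState::Open { since: now },
        };
    }
}

fn main() {
    let t0 = Instant::now();
    let mut breaker = CircuitBreaker::new(2, Duration::from_secs(30));
    breaker.record(false, t0);
    breaker.record(false, t0); // second consecutive failure trips the breaker
    println!("allowed right after trip: {}", breaker.allow(t0));
    println!("allowed after cool-down: {}", breaker.allow(t0 + Duration::from_secs(30)));
}
```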
Self-healing — repair & report on its own state
- Checkpoint + resume — replay a crashed run from its last state
- Sagas / compensation — roll back completed work in reverse on failure
- Observer hooks — synchronous lifecycle / failure / retry / checkpoint events
- Resource health probes — on-demand health for a workflow's dependencies
Full coverage: the Resilience, Recovery, Saga and Observers guides.
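The saga idea (roll back completed work in reverse on failure) can be sketched in a few lines of plain Rust. This is a conceptual model only and does not use Cano's saga API: `Step`, `run_saga`, and `demo` are hypothetical names, and the "work" is just entries appended to a log.

```rust
/// Hypothetical step type (not Cano's API): a forward action paired with
/// the compensation that undoes it.
struct Step {
    name: &'static str,
    forward: fn(&mut Vec<String>) -> Result<(), String>,
    compensate: fn(&mut Vec<String>),
}

/// Runs the steps in order. On the first failure, compensates the steps that
/// already completed, in reverse order, and returns the error.
fn run_saga(steps: &[Step], log: &mut Vec<String>) -> Result<(), String> {
    let mut done: Vec<&Step> = Vec::new();
    for step in steps {
        match (step.forward)(log) {
            Ok(()) => done.push(step),
            Err(e) => {
                // Undo the most recently completed step first.
                for completed in done.iter().rev() {
                    (completed.compensate)(log);
                }
                return Err(format!("{} failed: {e}", step.name));
            }
        }
    }
    Ok(())
}

/// Three-step example in which the last step fails.
fn demo() -> (Result<(), String>, Vec<String>) {
    let steps = [
        Step {
            name: "reserve",
            forward: |log| { log.push("reserve".into()); Ok(()) },
            compensate: |log| log.push("release".into()),
        },
        Step {
            name: "charge",
            forward: |log| { log.push("charge".into()); Ok(()) },
            compensate: |log| log.push("refund".into()),
        },
        Step {
            name: "ship",
            forward: |_| Err("carrier unavailable".into()),
            compensate: |_| {},
        },
    ];
    let mut log = Vec::new();
    let result = run_saga(&steps, &mut log);
    (result, log)
}

fn main() {
    let (result, log) = demo();
    println!("result: {result:?}");
    println!("log: {log:?}");
}
```

In this sketch the failing step's own compensation is never run, because its forward action did not complete; only "charge" and then "reserve" are undone, in that order.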
Getting Started
Cano requires Rust 1.89.0+ (edition 2024). Add it to your Cargo.toml:
[dependencies]
cano = { version = "0.13", features = ["all"] }
tokio = { version = "1", features = ["macros", "rt-multi-thread"] }
Cano ships with no features enabled by default. features = ["all"] turns
on all four optional features at once:
- scheduler: the Scheduler (cron + interval + manual triggers)
- tracing: tracing-crate spans and the TracingObserver
- recovery: RedbCheckpointStore, the embedded ACID checkpoint store (the CheckpointStore trait itself is always available)
- metrics: metrics-crate counters / histograms / gauges and the MetricsObserver
Pick only what you need — e.g. features = ["recovery"], or omit features
entirely for the lean core. Cano runs on the Tokio runtime, so tokio is a required
direct dependency — you launch the runtime via #[tokio::main] or
tokio::runtime::Builder. The two tokio features above are the minimum to
do that; add "time", "sync", etc. only if your own code calls into them, or
use "full" if you prefer convenience over compile time.
Basic Example
use cano::prelude::*;

// Define workflow states
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
enum WorkflowState {
    Start,
    Process,
    Complete,
}

// #[derive(Resource)] generates a no-op Resource impl for stateless config structs
#[derive(Resource)]
struct AppConfig { batch_size: usize }

// #[task] handles the async-trait rewrite; no external async-trait crate needed
#[derive(Clone)]
struct SimpleTask;

#[task(state = WorkflowState)]
impl SimpleTask {
    async fn run(&self, res: &Resources) -> Result<TaskResult<WorkflowState>, CanoError> {
        let config = res.get::<AppConfig, _>("config")?;
        println!("Processing task (batch_size={})...", config.batch_size);
        Ok(TaskResult::Single(WorkflowState::Process))
    }
}

#[derive(Clone)]
struct DoneTask;

#[task(state = WorkflowState)]
impl DoneTask {
    async fn run_bare(&self) -> Result<TaskResult<WorkflowState>, CanoError> {
        println!("Done!");
        Ok(TaskResult::Single(WorkflowState::Complete))
    }
}

#[tokio::main]
async fn main() -> Result<(), CanoError> {
    let resources = Resources::new()
        .insert("config", AppConfig { batch_size: 64 });

    let workflow = Workflow::new(resources)
        .register(WorkflowState::Start, SimpleTask)
        .register(WorkflowState::Process, DoneTask)
        .add_exit_state(WorkflowState::Complete);

    workflow.orchestrate(WorkflowState::Start).await?;
    Ok(())
}
Run a working version with cargo run --example workflow_simple (or
task_simple for the bare-task variant).
Where to go next
New to Cano? Read the docs roughly in this order:
- Workflows: defining states, the builder, validation, and how a run executes.
- Resources: typed, lifecycle-managed dependency injection (every task receives a &Resources).
- Task: the default processing unit, then the rest of the Task family (RouterTask, PollTask, BatchTask, SteppedTask) as you hit a shape that fits.
- Split & Join and Scheduler: parallelism within a workflow, and time-driven execution of workflows.
- Resilience & recovery: Resilience, Recovery, Saga.
- Observability: Tracing, Metrics, Observers.
Every concept has a runnable example under cano/examples/ — each page links the relevant ones.