Core System Overview
This document explains the core workflow model and execution flow in BubuStack, and links to the detailed reference docs that define the contract.
Who this is for
- Workflow authors designing Stories and Steps.
- Component authors building Engrams or Impulses.
- Operators who need the resource map and execution flow.
What you'll get
- A mental model of the core resources and how they relate.
- The lifecycle of a StoryRun from trigger to completion.
- Pointers to the deeper contract docs.
Unix philosophy
BubuStack's resource model follows the Unix principle: do one thing and do it well, and write programs to work together.
- An Engram is a program. It does one job — fetch HTTP, summarize text, send a notification — and produces structured output. It has no knowledge of the pipeline it participates in.
- A Story is a pipeline. It composes Engrams into a directed acyclic graph, wiring step outputs to step inputs through templates. Replace one Engram and the rest of the pipeline is unaffected.
- An Impulse is a trigger. It listens for external events — webhooks, cron schedules, Kubernetes resource changes — and creates StoryRuns with mapped inputs. It has no knowledge of the workflow it starts.
- Step outputs are the universal interface. JSON flows between steps. There is no proprietary IPC — just structured data evaluated through the template engine (`{{ steps["fetch"].output.body }}`).
- The operator (bobrapet) is the kernel. It schedules, reconciles, and enforces policy. It never owns business logic or external side effects.
- Engrams and Impulses are userspace. They own API calls, storage writes, and data transformation. The SDK provides the contract between operator and component.
This separation means every component is independently testable, replaceable, and reusable. Want Slack instead of Discord? Change the Engram ref, not the Story. Want Anthropic instead of OpenAI? Point the step at a different Engram. The pipeline doesn't care.
At a glance
- Story defines the workflow spec: DAG, policies, schemas, and templates.
- StoryRun is a concrete execution of a Story with resolved inputs and outputs.
- StepRun is the execution record for a single step within a StoryRun.
- Engram and EngramTemplate define executable components and their schemas.
- Impulse and ImpulseTemplate define triggers that launch StoryRuns from external events.
See CRD Design for the full resource model and relationships.
Core resources and relationships
| Resource | Purpose | Key fields |
|---|---|---|
| Story | Declarative workflow definition. | spec.pattern, spec.steps, spec.inputsSchema, spec.outputsSchema, spec.output, spec.compensations, spec.finally, spec.policy |
| StoryRun | An execution of a Story with concrete inputs and outputs. | spec.storyRef, spec.inputs, status.phase, status.stepStates, status.output |
| StepRun | An execution of a single step within a StoryRun. | spec.stepId, spec.input, spec.timeout, spec.retry, status.phase, status.output |
| Engram | A configured component instance used by Story steps. | spec.templateRef, spec.mode, spec.with, spec.executionPolicy |
| EngramTemplate | Cluster-scoped component definition with schema and defaults. | spec.inputSchema, spec.outputSchema, spec.execution |
| Impulse | A trigger that launches StoryRuns from external events. | spec.templateRef, spec.storyRef, spec.mapping, spec.deliveryPolicy, spec.throttle |
| ImpulseTemplate | Cluster-scoped trigger definition. | spec.contextSchema, spec.deliveryPolicy |
Execution flow (simplified)
- An Impulse receives an event and maps it to StoryRun inputs.
- The StoryRun controller resolves the Story and builds the DAG.
- StepRuns are created as steps become ready.
- If a template references offloaded data, the controller resolves it based on policy: in-process hydration from S3 (`controller`), pod-based materialization (`inject`), or rejection (`error`).
- StepRuns execute Engrams or primitives and publish outputs.
- StoryRun aggregates step states and final output.
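Concretely, a minimal batch Story driving that flow might look like the sketch below. The API version matches the Impulse example later in this doc; the step shape, ids, and engram names (http-fetch, slack-notify) are illustrative assumptions rather than real components.

```yaml
apiVersion: bubustack.io/v1alpha1
kind: Story
metadata:
  name: fetch-and-notify           # hypothetical Story
spec:
  pattern: batch
  steps:
    - id: fetch
      ref:
        name: http-fetch           # hypothetical Engram
      with:
        url: '{{ inputs.url }}'
    - id: notify
      needs: [fetch]               # scheduled once fetch publishes its output
      ref:
        name: slack-notify         # hypothetical Engram
      with:
        text: '{{ steps["fetch"].output.body }}'
```

A StoryRun referencing this Story via `spec.storyRef` would supply `inputs.url` in `spec.inputs` and surface the aggregated result in `status.output`.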
Durable semantics
Delivery guarantees, recovery rules, idempotency expectations, and signal semantics are defined in Durable Semantics.
Component ecosystem
SDK usage, contracts, registry patterns, and reliability semantics are summarized in Component Ecosystem.
Workflow graph and step model
The Story spec defines a DAG. Step order in YAML does not imply execution order.
Dependencies are expressed with steps[].needs. Independent steps run in parallel.
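As a sketch, two steps without `needs` fan out in parallel, and a third joins them (all ids and engram names are hypothetical):

```yaml
steps:
  - id: fetch-a                 # no needs: ready immediately
    ref: {name: http-fetch}
  - id: fetch-b                 # also ready immediately; runs in parallel with fetch-a
    ref: {name: http-fetch}
  - id: merge
    needs: [fetch-a, fetch-b]   # waits for both branches to complete
    ref: {name: aggregate}
```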
Supported step types:
- Engram steps (`type` omitted, `ref` set)
- Primitive steps: `condition`, `parallel`, `sleep`, `stop`, `executeStory`, plus `wait` and `gate` (batch only)
Failure handling and cleanup:
- `steps[].allowFailure` marks a step as non-fatal to the StoryRun.
- `spec.compensations` runs after failure, before `spec.finally`.
- `spec.finally` runs after completion regardless of outcome.
- Cleanup blocks work for both batch and streaming stories. In streaming mode, cleanup steps execute as batch jobs after the streaming topology terminates.
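A sketch of the cleanup blocks, assuming `compensations` and `finally` entries share the step shape (all step ids and engram names are hypothetical):

```yaml
spec:
  steps:
    - id: provision
      ref: {name: create-resource}
    - id: optional-metrics
      allowFailure: true        # a failure here is non-fatal to the StoryRun
      ref: {name: push-metrics}
  compensations:                # runs after a failure, before finally
    - id: teardown
      ref: {name: delete-resource}
  finally:                      # runs after completion regardless of outcome
    - id: report
      ref: {name: send-report}
```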
Details of primitive behavior are in Primitives.
Execution patterns: batch vs streaming
Stories declare spec.pattern:
- `batch`: short-lived StoryRuns, standard DAG scheduling.
- `streaming`: long-lived topologies where steps process packets.
Evaluation behavior differs by field:
| Field | Batch evaluation | Streaming evaluation |
|---|---|---|
| steps[].if | Runtime (DAG controller). | Runtime per packet (hub). |
| steps[].with | Runtime at StepRun creation. | Runtime per packet, inputs only (deterministic). |
| steps[].runtime | Not used. | Runtime per packet (hub). |
| spec.output | Runtime at StoryRun finalize. | Runtime at StoryRun finalize (if applicable). |
Streaming steps can select a transport declared in spec.transports.
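A streaming Story sketch combining these fields. Only `spec.pattern`, `spec.transports`, and the per-packet `steps[].if` come from the text above; the transport entry shape, the per-step `transport` selector field, and all names are assumptions:

```yaml
spec:
  pattern: streaming
  transports:
    - name: grpc-default           # hypothetical transport declaration
  steps:
    - id: classify
      ref: {name: classifier}      # hypothetical Engram
      if: '{{ inputs.enabled }}'   # evaluated per packet by the hub
      transport: grpc-default      # selects a transport from spec.transports
```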
Expression rules and determinism are defined in Expressions.
Streaming notes:
- The bobravoz-grpc hub evaluates `if` per packet and routes to downstream Engram steps that depend on the upstream step.
- If a runtime `if` evaluates to false, that branch is skipped while other branches continue.
- `executeStory` honors `waitForCompletion`; when true the hub waits for the child StoryRun to finish.
- For steps with multiple `needs`, the hub performs a lightweight join when a correlation key is available (`bubu.join.key` or `message_id`).
See Streaming Contract and Transport Settings for transport configuration.
Inputs, outputs, and schemas
Stories define contracts:
- `spec.inputsSchema` validates StoryRun inputs.
- `spec.outputsSchema` validates StoryRun output.
- `spec.output` templates the final output from `inputs` and `steps`.
StoryRuns provide concrete inputs in spec.inputs and store final output in status.output.
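Putting the contract fields together (the JSON Schema shape is assumed to be standard; the summarize step and its output field are hypothetical):

```yaml
spec:
  inputsSchema:                 # validates StoryRun spec.inputs
    type: object
    required: [url]
    properties:
      url: {type: string}
  outputsSchema:                # validates the final StoryRun output
    type: object
    properties:
      summary: {type: string}
  output:                       # templated from inputs and steps
    summary: '{{ steps["summarize"].output.text }}'
```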
Defaults and validation behavior are specified in Inputs.
Final output is capped (1 MiB). When exceeded, the StoryRun succeeds but the output is not stored in status (no automatic storage ref is written). See Payloads.
Templating and expressions
Template strings like {{ inputs.foo }} and raw template forms ($bubuTemplate) are
used throughout Story fields. Use {{ ... }} for evaluation; plain strings are
treated as literals. The engine is Go templates plus Sprig with custom helpers
(see Expressions). Deterministic inputs-only evaluation forbids `now`
to preserve replayability.
When a template references offloaded payloads, the platform either rejects the
evaluation (default) or injects a materialize StepRun to hydrate and resolve the
template, depending on templating.offloaded-data-policy. The materialize engram is
selected via templating.materialize-engram.
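Under those keys, the operator configuration might carry a fragment like the one below. The flat key layout mirrors other operator-config keys mentioned in this doc (e.g. job.backoff-limit), the engram name is a placeholder, and the value vocabulary (controller, inject, error) comes from the execution-flow section earlier:

```yaml
# Fragment of the operator ConfigMap (config/manager/operator-config.yaml)
templating.offloaded-data-policy: inject         # controller | inject | error (rejection is the default)
templating.materialize-engram: s3-materialize    # placeholder engram name
```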
Policies and overrides
Policies are applied hierarchically. For step execution:
StepRun > Story step execution > Engram execution > Template execution.
Story-wide defaults live under spec.policy, including:
- `timeouts.story` and `timeouts.step`
- `retries.stepRetryPolicy` and `retries.continueOnStepFailure`
- `storage` and `execution` defaults
- `execution.placement` for pod scheduling (`nodeSelector`, `tolerations`, `affinity`)
- `execution.cache` for output caching (`key`, `mode`, `ttlSeconds`)
- `queue` and `priority` for scheduling order (strict priority ordering, no preemption)
- `storyrun.queue.<name>.priority-aging-seconds` can raise effective priority for queued runs
retries.stepRetryPolicy supports maxRetries, delay, backoff, maxDelay,
and jitter (0–100 percent jitter applied to the computed delay).
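A sketch of story-wide policy defaults using the fields listed above (the concrete values, duration syntax, and the backoff value name are illustrative assumptions):

```yaml
spec:
  policy:
    timeouts:
      story: 30m
      step: 5m
    retries:
      continueOnStepFailure: false
      stepRetryPolicy:
        maxRetries: 3
        delay: 5s
        backoff: exponential    # assumed value name
        maxDelay: 2m
        jitter: 20              # percent jitter (0-100) applied to the computed delay
```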
Policy layers
BubuStack has three distinct retry/backoff layers. They operate independently and at different scopes:
| Layer | What it retries | Who drives it | Configuration |
|---|---|---|---|
| Job backoff | Pod failures within a single Kubernetes Job. Kubelet restarts the pod container up to backoffLimit times. | Kubernetes Job controller | JobPolicy.backoffLimit, JobWorkloadConfig.backoffLimit, or operator default job.backoff-limit. RestartPolicy is set to Never (new pod per retry) or OnFailure (in-place restart). |
| Step retry | Entire step execution (creates a new Job/pod). Triggered when a StepRun finishes with a retryable exit class and retries remain. | StepRun controller | RetryPolicy.maxRetries, delay, maxDelay, backoff, jitter. Template defaults via TemplateRetryPolicy.recommendedMaxRetries, recommendedBaseDelay, recommendedMaxDelay, recommendedBackoff. Operator default retry.max-retries. |
| Trigger delivery retry | Submission of a trigger event to create a StoryRun. Retried by the SDK inside the Impulse workload. | SDK (runs inside Impulse pod) | TriggerDeliveryPolicy.retry (maxAttempts, baseDelay, maxDelay, backoff). Defaults from ImpulseTemplate.spec.deliveryPolicy, overridable on Impulse.spec.deliveryPolicy. |
Key distinction: Job backoff and step retry both affect the same step but at
different levels. A step with backoffLimit: 3 and maxRetries: 2 can produce
up to (3+1) * (2+1) = 12 pod attempts in the worst case (3 pod restarts per
Job x 3 Jobs). Trigger delivery retry is entirely separate and concerns
StoryRun creation, not step execution.
Template-level fields (TemplateJobPolicy, TemplateRetryPolicy) provide
recommended defaults prefixed with recommended*. These are merged into the
runtime policy when the corresponding field is unset, not when it is explicitly
set to zero. See CRD Design for the full policy resolution chain.
Impulse trigger delivery behavior is configured separately via
ImpulseTemplate.spec.deliveryPolicy and Impulse.spec.deliveryPolicy. This
policy controls trigger deduplication and delivery retries; it does not replace
StepRun retry policies. Per-trigger throttling is configured on
Impulse.spec.throttle and enforced by the SDK
running inside the Impulse workload (templates do not provide throttle defaults).
Impulse.spec.throttle fields:
- `ratePerSecond`: steady-state token bucket rate (0 disables rate limiting)
- `burst`: token bucket burst size (defaults to `ratePerSecond` when omitted)
- `maxInFlight`: cap on concurrent trigger submissions (0 disables)
When throttling delays a trigger, the SDK patches Impulse.status.throttledTriggers
and Impulse.status.lastThrottled so operators can observe backpressure.
Impulse throttling example
```yaml
apiVersion: bubustack.io/v1alpha1
kind: Impulse
metadata:
  name: github-webhook
spec:
  templateRef:
    name: github-webhook
  storyRef:
    name: pr-review
  throttle:
    ratePerSecond: 5
    burst: 10
    maxInFlight: 20
```
Operator-level defaults and limits are documented in Operator Configuration.
Runtime objects and lifecycle
StoryRun and StepRun phases, reasons, and terminal semantics are defined in
Lifecycle. status.conditions is the canonical lifecycle source of
truth for both resources.
Payloads and storage
Payloads can be stored inline or offloaded to storage. Storage references use
$bubuStorageRef and related metadata fields. Size limits and reference
behavior are defined in Payloads. The controller does not auto-offload
oversized Step inputs or StoryRun outputs; workflows must use storage refs and
aggregation engrams when payloads grow.
Design principle (DOTADIW: Do One Thing And Do It Well): The controller is responsible for workflow orchestration and reconciliation only. Engrams and Impulses own external side effects, storage writes, and storage lifecycle (retention/GC). The controller does not manage the lifecycle of storage objects produced by workloads.
Scoping and versioning
Namespace scoping rules are defined in Scoping. By default,
namespaced references must be within the same namespace. The operator config
references.cross-namespace-policy can enable ReferenceGrant-based sharing or
allow all cross-namespace references. Templates are cluster-scoped.
Versioning and pinning are described in Versioning. The CRD version
lifecycle (v1alpha1 -> v1beta1 -> v1), upgrade procedures, and conversion
webhook plan are in Migration.
Operator configuration
Controller configuration, scheduling controls, and defaults are documented in
Operator Configuration. The sample ConfigMap lives at
config/manager/operator-config.yaml.
Where to dive deeper
| Area | Document | Focus |
|---|---|---|
| Architecture | Architecture | Module map, dependency graph, runtime topology |
| Configuration | Operator Configuration | Operator defaults, scheduling keys, and knobs |
| Durable semantics | Durable Semantics | Delivery guarantees, recovery rules, idempotency expectations |
| Component ecosystem | Component Ecosystem | SDK usage, contracts, and component catalog |
| Step semantics | Primitives | Primitive behavior and cleanup blocks |
| Expressions | Expressions | Contexts, determinism, and materialization |
| Schemas | Inputs | Defaults and validation rules |
| Payloads | Payloads | Inline vs storage refs and size limits |
| Caching | Caching | Output cache keys, modes, and TTLs |
| Lifecycle | Lifecycle | Phases, reasons, and terminal rules |
| Streaming | Streaming Contract | Streaming message rules and data flow |
| Transport | Transport Settings | Backpressure, routing, and replay |
| CRD design | CRD Design | Resource model, relationships, and policy chains |
| Scoping | Scoping | Namespace boundaries and reference policy |
| Versioning | Versioning | Pinning behavior and compatibility |
| CRD migration | Migration | API version lifecycle and upgrade procedures |
| SDK | Go SDK | SDK entry points and usage patterns |
| Building components | Building Engrams | Step-by-step guide for Engrams and Impulses |
| Roadmap | Roadmap | What's planned and where to contribute |