# Running Commands

A submitted ops command moves through a state machine on its way from "you want to run this" to "you have the released output." This page walks your side of that flow: how to submit, how to watch progress, what your customer sees in parallel on the `/support/` link they receive, and how to verify the audit chain after the fact.

How a command moves: you submit, your customer approves, the output is released to you.


## Submitting a command

```bash
tensor9 ops command create \
  --appName my-app \
  --customerName acme-corp \
  --template linux-disk-usage \
  --vars MOUNT_PREFIX=/var/lib/myapp \
  --commandName check-myapp-disk \
  --reason "investigating disk pressure on tenant alerts"
```

Required flags:

| Flag | Purpose |
| --- | --- |
| `--appName` | The Tensor9 app the command targets. |
| `--customerName` | Which of your customers' appliances will execute the command. |
| `--commandName` | A memorable identifier (3-64 chars, lowercase + hyphens). You'll use this to retrieve, cancel, etc. |

One of the following picks the body of the command:

| Flag | What it does |
| --- | --- |
| `--template` | Reference an already-imported template by id. Most common. |
| `--command` | Inline ad-hoc command body. Useful for one-off shell snippets that don't merit a template. |

Common modifier flags:

| Flag | Purpose |
| --- | --- |
| `--vars` | Comma-separated `KEY=VALUE` pairs for template variables (`--vars NAMESPACE=prod,DEPLOYMENT=api`). See "A note on `--vars` escaping" below. |
| `--permissions` | `ReadOnly` (default), `ReadWrite`, or `Admin`. Drives the role minted on the appliance for Kubectl-tier commands. |
| `--reason` | Free-text justification. Shown to your customer at approval time and recorded in the audit trail. |
| `--timeout` | How long to wait for customer approval before the command times out. Default `7d`. |
| `--originRsxId` | Required only for Kubectl ad-hoc and Kubectl templates: the Terraform resource address of the target cluster (e.g. `aws_eks_cluster.production`). Ignored for Tf and Script paths. The id is resolved against your **latest published release** of this app, not against the version currently running on your customer's appliance; if those differ and the resource was renamed across versions, the cluster lookup may fail. There is no flag today to pin the resolution to a specific release. |

The `--commandType` flag selects between command shapes. It's almost always inferred from `--template` or auto-defaults; only set explicitly if you're authoring tooling that needs to be specific:

| `--commandType` value | When the system uses it |
| --- | --- |
| `Kubectl` | Ad-hoc kubectl invocation (no template). Default for ad-hoc. |
| `KubectlFromTmpl` | Auto-selected when `--template` resolves to a Kubectl template. |
| `ScriptFromTmpl` | Auto-selected when `--template` resolves to a Script template. |
| `TfFromTmpl` | Auto-selected when `--template` resolves to a Terraform template. |

#### A note on `--vars` escaping

Today `--vars` is a single comma-separated string, which means values cannot contain commas or `=` characters. This is a known limitation; support for repeated `--var KEY=VALUE` flags is planned. Until then, work around with templates whose variable values are constrained to simple alphanumerics + path characters.

The action prints the assigned `commandName` and the initial state, then returns. The command is now `Submitted`; your customer's review experience begins next.
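Because `--vars` is a single comma-separated string, it can help to assemble it with a small pre-flight check rather than by hand. The following is a minimal sketch in POSIX shell; `join_vars` is a hypothetical helper, not part of the `tensor9` CLI, and it checks only the comma and `=` constraint described in the escaping note.

```bash
# join_vars KEY=VALUE [KEY=VALUE ...]
# Hypothetical helper (not part of the tensor9 CLI). Prints the pairs joined
# with commas, or fails when a pair cannot be expressed in a single
# comma-separated --vars string.
join_vars() {
  joined=""
  for pair in "$@"; do
    case $pair in
      *=*) ;;
      *) echo "join_vars: not KEY=VALUE: $pair" >&2; return 1 ;;
    esac
    key=${pair%%=*}     # text before the first '='
    value=${pair#*=}    # text after the first '='
    case $key in
      ""|*,*) echo "join_vars: bad key in: $pair" >&2; return 1 ;;
    esac
    case $value in
      *,*|*=*) echo "join_vars: value of $key contains ',' or '='" >&2; return 1 ;;
    esac
    joined=${joined:+$joined,}$pair
  done
  printf '%s\n' "$joined"
}
```

With this in place, `join_vars NAMESPACE=prod DEPLOYMENT=api` prints `NAMESPACE=prod,DEPLOYMENT=api`, which can be passed to `--vars`; a value such as `LABEL=a,b` is rejected locally instead of being split in an unexpected place.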
## Lifecycle at a glance

Ops command lifecycle: three lanes (Command Approval, Execution, Output Release) with happy-path states across the top, terminal unhappy states across the bottom, and an intermediate Cancelling state reachable from Submitted, CmdApproving, or Executing.


The lifecycle has six in-flight states, two terminal happy states, and five terminal unhappy states. One intermediate state (`Cancelling`) covers the brief window where a cancel request has landed but the appliance is still tearing down.

### Happy path in words

The in-flight stages, in order:

1. The command exists; your customer's appliance has not picked it up yet. You see this immediately after `tensor9 ops command create`.
2. Your customer's appliance has the command in its inbox and is waiting on a human review decision. Your customer sees the `/support/` link and walks the four-step approval UI.
3. Your customer approved execution and the appliance is preparing to run the command.
4. The command body is running inside the appliance's sandboxed working directory. The appliance captures stdout / stderr / exit code, uploads each output stream to your blob store (S3 in your customer's account) and stores a small `[blob: ...]\n` payload on the command record. That payload is then encrypted with a per-command key.
5. Execution finished. The appliance is now waiting for your customer to review the output and decide whether to release it to you.
6. Your customer signed an Ed25519 release manifest. The control plane surfaces the decrypted blob-payload to you; calling `tensor9 ops command retrieve` returns the payload, and you fetch the actual bytes by curling the presigned URL it carries. State advances to `Completed` next.

### Terminal unhappy states

| State | What happened |
| --- | --- |
| `CmdRejected` | Your customer rejected the command at review (Step 1 of their approval UI). |
| `OutputRejected` | Your customer approved execution, but rejected releasing the output. You never see stdout / stderr. |
| `ExecutionFailed` | The command ran but the appliance reported a non-zero exit code or an internal error (or the staleness recovery fired). Terminal: no transition out, no output release path. The failure stderr is uploaded to your blob store like any other output. |
| `Cancelled` | You explicitly cancelled (`tensor9 ops command cancel`) before the command was approved. |
| `Timeout` | Your customer never decided within the `--timeout` window. |

### Encryption mechanism

Output is encrypted with a per-command AES-256-GCM key that the appliance generates on-the-fly. The ciphertext carries a SHA-256 fingerprint of the key (in the AAD), so the decrypt path can find the right key in the appliance's secret store even after a re-execution overwrites the per-scope slot. Keys live in the appliance's secret store under `/t9-private/projection/.../ops-cmd/...` and are deleted after successful release or output rejection.

When you see a "decryption failed" alert from the appliance, the likely causes are: (a) a re-execution clobbered the scope-keyed slot before your customer released the previous run's output, (b) the secret store is unreachable, (c) your customer rotated keys mid-flight. The fingerprint addressing is the defense for (a); see the appliance audit log for the specific failure mode.

## Watching progress

```bash
# Snapshot of every command across this app
tensor9 ops command list --appName my-app

# Same, including completed and rejected history
tensor9 ops command list --appName my-app --history

# Drill into one command
tensor9 ops command retrieve --appName my-app --commandName check-myapp-disk
```

`retrieve` shows the current state, full audit chain (who approved what, when), and (once `Completed`) the released stdout / stderr / exit code. Both commands accept `--output json` for scripting; pipe into `jq .lifecycle` to poll a single state value.
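The `jq .lifecycle` hint can be turned into a small polling loop. This is a sketch under stated assumptions: `jq` is installed, the JSON output exposes the state as a top-level `lifecycle` string (inferred from the hint, not confirmed), and the terminal-state list covers only the states this page names. The function names `is_terminal` and `wait_for_command` and the 30-second default interval are illustrative choices, not part of the CLI.

```bash
# is_terminal STATE
# Succeeds for the terminal states this page names. The page mentions a
# second happy terminal state without naming it, so extend the list as needed.
is_terminal() {
  case $1 in
    Completed|CmdRejected|OutputRejected|ExecutionFailed|Cancelled|Timeout) return 0 ;;
    *) return 1 ;;
  esac
}

# wait_for_command APP COMMAND_NAME [INTERVAL_SECONDS]
# Polls until the command reaches a terminal state.
# Returns 0 on Completed, 2 on any other terminal state, 1 on a lookup error.
wait_for_command() {
  app=$1
  name=$2
  interval=${3:-30}   # arbitrary default; pick what suits your workflow
  while :; do
    # Assumes the JSON output carries the state in a top-level "lifecycle" field.
    state=$(tensor9 ops command retrieve \
      --appName "$app" --commandName "$name" --output json | jq -r '.lifecycle')
    case $state in
      ""|null) echo "could not read state for $name" >&2; return 1 ;;
    esac
    echo "$name: $state"
    if is_terminal "$state"; then
      [ "$state" = "Completed" ] && return 0
      return 2
    fi
    sleep "$interval"
  done
}
```

A call such as `wait_for_command my-app check-myapp-disk 60` would then block until the command settles. Given the default `7d` approval timeout, a long interval is usually the kinder choice.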
## What your customer sees

When you submit, your customer is sent (via your existing notification channel) a unique `/support/` web link. Clicking it opens a four-step approval UI:

| Step | What your customer does |
| --- | --- |
| **Review** | Reads the template description, declared `data_access` tags, your reason, and the exact template body and variable values you submitted. Picks Approve, Reject, or close. |
| **Approve** | Confirms the approval. State advances to `CmdApproved`; the appliance executes immediately. |
| **Execute** | Watches a status pane while the appliance runs the command. State stays `Executing` until the output lands. |
| **Release** | Reviews the decrypted stdout / stderr / exit code, then signs an Ed25519 release manifest. Picks Release or Reject. |

If your customer hasn't subscribed a notification channel for ops-command events, submission silently produces no notification. The command still appears in your `tensor9 ops command list` (state: `Submitted`), and your customer would only see it if they happened to visit the support portal directly. Confirm channel subscription with each customer at onboarding; otherwise an "unanswered" command is more likely to mean "your customer doesn't know about it" than "your customer is ignoring it."

### What "review" actually shows

For Terraform templates, your customer sees the literal HCL source plus the values you're submitting for each `variable`. They do **not** see a `tofu plan` output: a plan would require evaluating data sources, which can only run after the appliance is authorized to do so. Your customer is reviewing "the shape of what will execute" plus the declared `data_access`, `side_effects`, and permission tier, not a fully-resolved diff.
Things the HCL surface does NOT pre-evaluate for the reviewer:

* **`${var.X}` interpolations stay as literal strings in the displayed HCL.** Your customer sees `${var.MOUNT_PREFIX}` in the command body and the submitted value of `MOUNT_PREFIX` separately; they have to substitute mentally at review time. The approval UI shows the submitted variable values next to the HCL.
* **`for_each` cardinality is invisible at review time.** A `for_each = toset(data.aws_s3_buckets.all.buckets[*].name)` does not show whether it will iterate over 3 buckets or 30,000. Bound the potential blast radius via `data_access` + `side_effects` declarations and use `description` to explain the cardinality semantics in plain English.
* **`local-exec` heredocs are reviewed as shell.** A multi-line `command = <<-EOT ... EOT` is shown verbatim. Customers reviewing a `kubectl drain ... && kubectl ...` heredoc are reviewing a shell program, not a Terraform plan. Keep heredocs short and named in the `description`.
* **`null_resource.triggers` are not re-evaluated against prior state** (there is no prior state; see [Authoring templates](/fundamentals/operations/templates)). A `triggers = { mount_prefix = var.MOUNT_PREFIX }` block makes the resource look like it fires only on change, which is misleading. Document the behavior in the template's `description` or omit the triggers block.

Because the review surface is the HCL, the `description` field on `tensor9_command` carries a lot of weight. Treat it as the plain-English equivalent of the HCL: name the exact APIs called, the expected output shape, the cardinality of any fan-out, and the intended side effects.

For Script and Kubectl templates the review surface is the literal script body or kubectl invocation. Same caveats: `${VAR}` references are unsubstituted; the customer reads the script and the variable values side-by-side.
### Trust properties for the release step

Release has two properties to explain to your customer:

* **The plaintext output passes through the appliance your customer already controls before you see it.** Decryption happens on the appliance (which lives in your customer's cloud account under their IAM); the control plane only ever sees the ciphertext before release. Once your customer signs release, the control plane forwards the plaintext to you.
* **The release decision is non-repudiable.** Your customer's signed release manifest is preserved in the audit chain and can be verified independently with [`tensor9 ops command audit verify`](#audit-and-forensics).

Pre-approved templates skip the per-command Review and Approve steps; see [Pre-approvals](/fundamentals/operations/preapproval).

## Resource limits and queueing

Operations is for diagnostics and short-lived interventions, not for bulk data extraction. The appliance enforces a few limits that you should size your templates against:

| Limit | Value | Implication |
| --- | --- | --- |
| Concurrent commands per appliance | 10 | Approved commands dispatch onto the appliance's bounded execution pool (max 10 worker threads). Commands beyond the cap stay in `CmdApproved` and are picked up on the next polling cycle (\~10s) as workers free up. |
| Per-command runtime cap | 10 minutes | A command running past 10 minutes will be killed and surfaced as `ExecutionFailed`. |
| Stuck-execution recovery | 20 minutes (2 × runtime cap) | A command that's been in `Executing` for 20 minutes (e.g. because the appliance restarted mid-run) is auto-transitioned to `ExecutionFailed` with stderr noting the likely cause. |
| Per-stream output cap | 5 GiB | stdout and stderr each route through your blob store (uploaded by the appliance, fetched by the customer's release script and your `retrieve` call via a presigned S3 URL). 5 GiB is the AWS S3 single-PUT ceiling. |
| Children per batch | 50 | See "Batches" below. |

The 5 GiB cap is high enough that you generally don't need to think about it. The appliance uploads each stream to your S3 bucket and stores only a small `[blob: bucket=..., key=..., size=..., sha256=...]\n` payload (encrypted) on the command record. The customer's release script fetches the actual bytes, sha256-verifies them, and shows them in the local preview. Your `tensor9 ops command retrieve` returns the same payload; curling the URL gives you the bytes back. Bulk log dumps (`journalctl --since 24h`, `kubectl logs deployment/...`) flow through without the per-template `| tail -c` self-capping that templates used to need.

On Kubernetes-form-factor appliances whose blob store does not yet support presigned URLs (MinIO is in this category as of this writing), the appliance falls back to an inline 4 MiB cap with a marker like `[stdout truncated - 4 MB cap]`; the release script's preview still works, but the customer and you see only the first 4 MiB of any stream over the cap.

If you're on-call and you see a command stuck in `Executing` for more than five minutes, you can either wait for the 20-minute staleness recovery (automatic) or interrupt it manually with `tensor9 ops command cancel --commandName `. Cancelling an `Executing` command transitions it to `Cancelling` while the appliance tears down; the eventual terminal state depends on what the appliance was doing at the time. See "Cancelling" below for the full state-machine view.

### Not yet enforced

Two limits ship in the codebase as constants but no enforcement call site exists today. Treat these as documentation-of-intent, not as guarantees:

* **Submissions per hour: 100 per appliance.** Plan around it; do not rely on it. The 101st submission this hour will succeed.
* **Cooldown between submissions: 60 seconds per appliance.** Same caveat: not enforced today.

Enforcement will land in a future release; until then, rate-limiting in your own scripts is the only real bound.

## Cancelling

```bash
tensor9 ops command cancel --commandName check-myapp-disk
```

What happens depends on the command's current state:

| State at cancel time | Outcome |
| --- | --- |
| `Submitted`, `CmdApproving` | Transitions directly to `Cancelled`. |
| `CmdApproved` | Transitions to `Cancelled`; the appliance never picks the command up to execute. |
| `Executing` | Transitions to `Cancelling`; the appliance tears down its sandbox while the cancel intent propagates. The eventual terminal state depends on what the appliance was doing (typically `Cancelled`, but may surface as `ExecutionFailed` if the process had already produced partial state that needed teardown). |
| `Executed`, `OutputApproving`, or any terminal state | No-op. The cancel arrives too late; output release proceeds normally. |

The Cancelling intermediate state exists specifically because a running `tofu apply` or `kubectl` invocation needs a moment to wind down cleanly. In-flight side effects (a partially-created cloud resource, a partially-applied K8s manifest) may or may not be backed out depending on the template; if you cancel a mutating template mid-execution, expect to inspect your customer's environment afterwards.
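For scripting, the cancel table above can be encoded as a lookup so that a wrapper knows what to expect before it issues the cancel. This is a sketch only: `cancel_outcome` is a hypothetical helper, the state names are the ones that appear on this page, and the immediate result for `Executing` is reported as `Cancelling` because the eventual terminal state is not known up front.

```bash
# cancel_outcome STATE
# Prints what a cancel request is expected to do for a command in STATE,
# following the cancel table: "Cancelled", "Cancelling", or "no-op".
# Returns 1 for a state the table does not cover.
cancel_outcome() {
  case $1 in
    Submitted|CmdApproving|CmdApproved)
      echo "Cancelled" ;;
    Executing)
      echo "Cancelling" ;;
    Executed|OutputApproving|Completed|CmdRejected|OutputRejected|ExecutionFailed|Cancelled|Timeout)
      echo "no-op" ;;
    *)
      echo "cancel_outcome: unknown state: $1" >&2
      return 1 ;;
  esac
}
```

A wrapper might feed it the state read from `retrieve` and, when the answer is `Cancelling`, flag the command for a follow-up inspection of the customer's environment.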
If you submitted many commands by accident and need to cancel them all in one shot:

```bash
# Cancel everything against this customer regardless of when submitted
tensor9 ops command batch cancel-bulk \
  --appName my-app \
  --customerName acme-corp \
  --yes

# Or scope to a time window
tensor9 ops command batch cancel-bulk \
  --appName my-app \
  --submittedAfter "2026-05-09T13:55:00Z"
```

`cancel-bulk` lists the commands it's about to cancel and prompts for confirmation; pass `--yes` to skip the prompt for scripts. Only commands still in cancellable states are touched; anything past `CmdApproving` is reported as "skipped" so you know what's still running.

## Batches

When you need to fan a command out across multiple customers or appliances, use the batch surface:

```bash
tensor9 ops command batch submit --appName my-app --file ./batch-spec.json
tensor9 ops command batch list --appName my-app
tensor9 ops command batch retrieve --appName my-app --batchId
tensor9 ops command batch cancel --appName my-app --batchId
```

A batch creates one underlying ops command per appliance. The lifecycle tracks each child command independently, so different customers approving at different times is normal. `batch retrieve` rolls the children up into a single status summary.

A single batch is capped at **50 child commands** (one per appliance). For larger fleets, submit multiple batches with a small delay between them to avoid a thundering-herd against the notification path. We are working on a higher cap; let us know what your steady-state fan-out looks like.

## Audit and forensics

Three Ed25519 signatures protect every ops command. Together they form a non-repudiation chain your customer can verify independently:

| Signature | Signed by | What it proves |
| --- | --- | --- |
| `commandApproval` | Appliance signing key | "The command body you submitted, with these specific variable values, was approved by this person at this time." |
| `outputIntegrity` | Appliance signing key | "Exactly these stdout / stderr / exitCode bytes came out of the command's execution, before any encryption or transit." |
| `outputApproval` | Appliance signing key | "These specific output bytes were approved for release to you by this person at this time." |

The signatures are stored on the command's audit record and survive the encrypt/decrypt cycle (the integrity signature is computed over plaintext before encryption, then preserved in the ciphertext metadata).

Either you or your customer can verify the full chain on a specific command:

```bash
tensor9 ops command audit verify \
  --appName my-app \
  --commandName check-myapp-disk
```

The action retrieves the command record + the appliance's pinned signing public key, reconstructs the canonical signed-data for each of the three signatures, and verifies. The three checks use two different trust anchors:

* `commandApproval` and `outputApproval` verify the buyer's signature on the approval / release manifest, using the public key embedded in the manifest itself. The output reports the `signerPublicKeyFingerprint`, which you (or the customer) should cross-reference against the buyer-signing pubkey currently pinned on the appliance vault, which is the actual trust anchor.
* `outputIntegrity` verifies the appliance's signature over the plaintext output, against the customer's pinned `opsCmdPubKey`.
Output (in the healthy case):

```
Audit chain for check-myapp-disk (Completed)
Appliance signer fingerprint: 3a8c4b1f...
✓ commandApproval [OK]
✓ outputIntegrity [OK]
✓ outputApproval [OK]
✓ Audit chain verified.
```

Any failure is surfaced as `[FAIL]` with a one-line reason, and the process exits non-zero. Customers running compliance audits should script this against their full ops-command history; you should run it whenever a customer reports "you ran something I didn't approve" so the disagreement turns into a verifiable record very quickly.

`UNSIGNED_LEGACY` records (commands authored before the signature chain was required, with no buyer-signed manifest attached) are reported but do not cause a failure exit by default; pass `--strict` to fail on those too.

For programmatic use, pass `--output json`. The JSON `checks` array carries per-check `trustAnchor`, `signerPublicKeyFingerprint`, `signedBy`, and `signedAt`, which let you distinguish auto-approved-by-pre-approval from manually-approved commands and archive the chain independently.

## Related

* [Authoring templates](/fundamentals/operations/templates): the templates this command body comes from.
* [Pre-approvals](/fundamentals/operations/preapproval): skip per-command approval for repeated runs.
* [Security model](/fundamentals/operations/security): the keys, signatures, and storage-side audit guarantees that make the audit chain non-repudiable.
* [Permissions model](/fundamentals/permissions-model): how permission tiers map onto appliance-side roles.