What Ships With Every Deploy

Every capability on this page is included on every plan. We don't tier-gate diagnostics, backups, observability, or compliance — the product is the product. The plan tier picks resource size, not feature set.

CRITICAL finding-7a3c2 · 14:32:18 UTC

Cart query reaching 14s p99 in checkout

Evidence

trace 4e9a82c: browser → nginx → php-fpm → mysql
slowest span: SELECT * FROM cart_items WHERE cart_id = ?
              19,840 rows · 14.2s · no index used

Code Location

app/Http/Controllers/CheckoutController.php:127
  $items = $cart->items()->with('product')->get();

Recommended Fix

1. Add index: cart_items (cart_id, created_at)
2. Eager-load product.images to drop N+1
Expected p99: 14s → ~120ms
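
To illustrate why the first recommendation moves the needle, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption for portability; the planner output differs between engines). The table columns other than cart_id and created_at, the row counts, and the index name are made up for the example.

```python
import sqlite3

# In-memory SQLite database standing in for the tenant's MySQL instance.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE cart_items ("
    "id INTEGER PRIMARY KEY, cart_id INTEGER, product_id INTEGER, created_at TEXT)"
)
conn.executemany(
    "INSERT INTO cart_items (cart_id, product_id, created_at) VALUES (?, ?, ?)",
    [(i % 500, i, "2024-01-01") for i in range(20_000)],
)


def plan(sql: str) -> str:
    """Return the planner's description of how it intends to run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, (42,)).fetchall()
    return " | ".join(row[3] for row in rows)  # column 3 holds the detail text


query = "SELECT * FROM cart_items WHERE cart_id = ?"

before = plan(query)  # no usable index: the planner scans the whole table
conn.execute(
    "CREATE INDEX idx_cart_items_cart_created ON cart_items (cart_id, created_at)"
)
after = plan(query)   # the planner can now seek on cart_id through the index

print("before:", before)
print("after: ", after)
```

The composite index leads with cart_id, so the equality filter in the slow span can use it; created_at only helps if the query also sorts or filters on that column.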

Diagnostics

Detect, diagnose, report, recommend, contain. Five steps, in that order. Other hosts kill first and explain never. We diagnose first, capture evidence, and contain only when there's no other option.

  • Diagnostic reports with finding ID, evidence, code location, and recommended fix — delivered via dashboard, email, API, and MCP.
  • Per-tenant code index traces SQL queries back to the function that issued them.
  • Catch the bug before it ships. Pre-deploy static analysis runs in the deploy pipeline — the same code index flags N+1 query patterns before the code reaches production, not after it slows down checkout.
  • Our agent reads three surfaces at once: telemetry, your code, and the live runtime state.
  • Want us to ship the fix? Optional code-patch service — every patch goes through human review before merge.

WARNING finding-9b1f4 · 02:15:44 UTC

PHP Fatal in Stripe webhook handler

Evidence

trace 8c7e3a1: stripe webhook → handler::process
exception: Undefined index "payment_intent"
occurred: 18 times in last 24h · 0.4% of webhooks
side effect: 7 orders left in "awaiting_capture"

Code Location

app/Webhooks/StripeHandler.php:84
  $intent = $payload['payment_intent'];

Recommended Fix

1. Guard with array_key_exists() before access
2. Verify Stripe signature before parsing payload
3. Replay the 18 affected events from the audit trail
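
The first two recommendations can be sketched together. The example below is in Python rather than PHP and is not Stripe's SDK: it hand-rolls the check against Stripe's documented signature scheme (a Stripe-Signature header with t= and v1= fields, where v1 is an HMAC-SHA256 over "timestamp.payload"). The function names, the five-minute tolerance, and the top-level payment_intent key (mirroring the handler above) are assumptions for illustration.

```python
import hashlib
import hmac
import json
import time
from typing import Optional

TOLERANCE_SECONDS = 300  # assumed replay window, not a platform setting


def verify_signature(payload: bytes, header: str, secret: str,
                     now: Optional[float] = None) -> bool:
    """Check a 't=...,v1=...' signature header against the raw request body."""
    parts = dict(item.split("=", 1) for item in header.split(",") if "=" in item)
    timestamp, candidate = parts.get("t"), parts.get("v1")
    if not timestamp or not candidate or not timestamp.isdigit():
        return False
    now = time.time() if now is None else now
    if abs(now - int(timestamp)) > TOLERANCE_SECONDS:
        return False  # too old or too far in the future: treat as a replay
    signed = timestamp.encode() + b"." + payload
    expected = hmac.new(secret.encode(), signed, hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode(), candidate.encode())


def read_payment_intent(payload: bytes, header: str, secret: str,
                        now: Optional[float] = None) -> Optional[str]:
    """Verify first, parse second, and never index a key that may be absent."""
    if not verify_signature(payload, header, secret, now):
        raise PermissionError("webhook signature check failed")
    event = json.loads(payload)
    if not isinstance(event, dict):
        return None
    return event.get("payment_intent")  # None instead of an undefined-index error
```

A caller that receives None can acknowledge the event and skip capture logic, rather than aborting mid-handler and leaving the order in an intermediate state. A production handler would normally rely on the official Stripe library for the verification step.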

Included on every plan.

Observability

100% trace capture. No sampling. Per-tenant cost attribution from day one.

ClickHouse-backed unified store: metrics, logs, traces, business events, and cost data all SQL-queryable in one place.

  • Every Temporal workflow logs to the store for process mining.
  • Per-service cost attribution — every byte tagged to a tenant and a workload.
  • Agents query the telemetry store with SQL — same surface our ops team uses.
  • 12-month audit log retention per PCI Req 10.

Included on every plan.

Self-Healing

A restart destroys the evidence. So we diagnose first — capture the forensics, find the root cause, then take the least destructive action — and we hold both your application and the platform underneath it to that rule. The difference is who pulls the trigger: on your app, you do; on our own infrastructure, we do.

Most hosts kill first and explain never. You get a ticket that says "high CPU detected, we killed your pod" — no diagnosis, no root cause, no record of what went wrong. We do the opposite: nothing is restarted until the diagnostic state is captured and written down.

Your application: you hold the switch.

When something in your app degrades, we diagnose it and tell you exactly what we'd do to fix it. Whether we actually do it is your call — with one exception.

  • Security detection is always on. Threat detection cannot be disabled. Reverse shells, container escapes, crypto mining: we see them on every tenant, every container, no opt-in. The response starts human-in-the-loop and graduates to automatic only for patterns proven safe over time.
  • Everything else is opt-in, off by default. Restarting PHP-FPM, killing a stuck MySQL query, pausing a runaway cron, clearing a cache — business-logic fixes that bridge the gap until the real fix ships. You enable them one action type at a time, or leave them all off and we just tell you what we'd do.
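
The two rules above amount to a small policy check. The sketch below is illustrative only: the class, the action names, and the return values are hypothetical and do not describe the platform's actual configuration surface.

```python
from dataclasses import dataclass, field
from typing import Set

# Assumption: detection is modeled as one always-on action type.
ALWAYS_ON = frozenset({"threat_detection"})


@dataclass
class TenantPolicy:
    # Empty by default: every remediation starts switched off.
    enabled_actions: Set[str] = field(default_factory=set)

    def decide(self, action: str) -> str:
        """Return 'run' or 'recommend_only' for a proposed action type."""
        if action in ALWAYS_ON:
            return "run"  # cannot be disabled by the tenant
        if action in self.enabled_actions:
            return "run"  # tenant opted in to this one action type
        return "recommend_only"  # diagnose and report, but do not act


policy = TenantPolicy()
policy.enabled_actions.add("restart_php_fpm")
print(policy.decide("restart_php_fpm"))   # opted in
print(policy.decide("kill_mysql_query"))  # still off
```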

The platform underneath: it heals itself to the same standard.

The infrastructure your app runs on — the proxies in the request path, the databases, the deploy engine — follows the same diagnose-first discipline. Because it's our infrastructure, we act on it without waiting for you. Least destructive first:

  • Cancel the offending work — kill the runaway query, cancel the stuck task, drop the bad connection. The cheapest fix, and it resolves most incidents on its own.
  • Drain, don't drop — stop sending new work to a degraded instance instead of killing it mid-request.
  • Replace, then retire — scale up a healthy replacement first, shift traffic, then drain the degraded one. No gap in capacity.
  • Restart as last resort — only after the diagnosis is captured, never before. The evidence survives the fix.

Our own databases never get a generic drain-and-restart: we kill specific queries, fail over to a verified-ready replica, and make one change at a time. The destructive actions — a failover, a restart — stay behind a human approval gate. Permanently.
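
One way to picture the escalation order and the approval gate is as a loop over an ordered list of steps. This is a sketch under assumptions: the Step type, the remediate function, and the callback shapes are invented for the example and are not the platform's internals.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Step:
    name: str
    attempt: Callable[[], bool]   # returns True once the incident is resolved
    needs_approval: bool = False  # destructive steps wait for a human


def remediate(capture_diagnosis: Callable[[], dict],
              steps: List[Step],
              approved: Callable[[str], bool]) -> Tuple[dict, List[str]]:
    """Capture evidence first, then walk the steps from least to most destructive."""
    diagnosis = capture_diagnosis()  # runs before anything is touched
    attempted: List[str] = []
    for step in steps:
        if step.needs_approval and not approved(step.name):
            attempted.append(step.name + ":awaiting_approval")
            break  # stop here; do not skip past the gate
        attempted.append(step.name)
        if step.attempt():
            break  # resolved, so nothing more destructive runs
    return diagnosis, attempted


steps = [
    Step("cancel_work", lambda: False),
    Step("drain", lambda: True),
    Step("restart", lambda: True, needs_approval=True),
]
diagnosis, attempted = remediate(lambda: {"cause": "runaway query"}, steps,
                                 approved=lambda name: False)
print(attempted)
```

Because the loop stops at the first step that resolves the incident, the restart at the end of the list is only reached when everything gentler has failed and a human has approved it.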

Included on every plan.

Agent Surface

Same APIs, same data, same access as human operators. Point your coding agent at our MCP server and it drives the platform from your IDE.

Two MCP servers. External for your developer agent (Claude Code, Cursor, and the rest), internal for our ops team. ~35 tools at launch.

  • Deploy, rollback, restart, restore via MCP tool call.
  • Query traces, logs, metrics by tenant + endpoint + window.
  • Self-service actions: restart PHP, kill query, clear cache, manage DNS/SSL/SSH/backups.
  • Tenant API keys with 3 scopes: admin, deploy, read-only.

Included on every plan.

Deploys

Push to GitHub. Or SSH in and rsync. We accept both.

Code is built into an immutable image, deployed blue/green, and gated by health + canary metrics before traffic shifts. Failed deploys auto-revert with full forensics preserved.

  • Webhook-triggered builds for git-backed tenants.
  • The synsmarts deploy CLI command for workspace-backed tenants.
  • Per-tenant blue/green cache + shared session continuity.

True zero-downtime deploys — not a best-effort rolling restart. The new build comes up beside the live one and takes zero production traffic until it clears its health and canary checks. The cutover is atomic: requests in flight finish on the old slot while new requests land on the new one. Because each slot runs its own object cache, the new slot is already warm — no cold start, no thundering-herd cache stampede the moment you go live. Sessions ride a shared session store across the switch, so a customer mid-checkout never gets logged out or loses their cart. And if the new build fails its gates, traffic never moved in the first place — the deploy auto-reverts and your customers never saw it. No maintenance window, no "be right back" page, no dropped requests.
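
The gating rule described above reduces to a single decision: traffic moves only if the idle slot passes both checks. The sketch below assumes a hypothetical GateResult shape and a made-up 1% canary error threshold; the real gates and thresholds are not specified on this page.

```python
from dataclasses import dataclass


@dataclass
class GateResult:
    healthy: bool             # outcome of the health check on the new slot
    canary_error_rate: float  # fraction of canary requests that failed


def cutover(live_slot: str, result: GateResult,
            max_error_rate: float = 0.01) -> str:
    """Return the slot that should receive production traffic after the deploy."""
    idle_slot = "green" if live_slot == "blue" else "blue"
    passed = result.healthy and result.canary_error_rate <= max_error_rate
    # On failure the live slot keeps serving: traffic never moved, so there is
    # nothing to roll back from the customer's point of view.
    return idle_slot if passed else live_slot


print(cutover("blue", GateResult(healthy=True, canary_error_rate=0.002)))
print(cutover("blue", GateResult(healthy=True, canary_error_rate=0.05)))
```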

A real staging environment, not a toy. Spin up a second instance at the smallest tier and you get a true staging environment: the full production stack, real AWS, real diagnostics — not a stripped sandbox that behaves differently from prod. Deploy the same git SHA to both instances to promote a verified build, and run the diagnostic report against staging before it ever reaches your customers.

Multi-model code review on your deploys. Prism is the review gate we run on our own platform changes: every change goes in front of a panel of independent frontier AI models that have to reach consensus before it ships. One reviewer has blind spots — a panel that must agree catches what any single model misses. Opt your deploys in and the same gate reviews your code before it reaches the build.

Migration safety: we run it before you do. Most hosts read your migration files and guess whether they're safe. But schema migrations and data patches execute arbitrary code — file content alone can't tell you what a Magento Data Patch will do to your data. So we don't guess. When a deploy touches the schema, we clone your live database, run the migration against the throwaway clone, and classify what actually happened before a single byte of production data is touched.

  • Code-only — the majority of deploys. No schema change, no classification overhead, straight to blue/green.
  • Additive — new columns, tables, or indexes. Safe to apply while your current code keeps serving traffic. Zero downtime.
  • Breaking — drops, renames, new NOT-NULL columns. Gated behind your explicit approval, a maintenance window, and a fresh backup taken immediately before the migration runs.
  • Can't classify — if the dry-run can't prove a migration is safe, we treat it as breaking. Fail closed, never fail open.
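
The four outcomes can be expressed as a comparison of the clone's schema before and after the dry run. The sketch below is a simplification under stated assumptions: schemas are plain dictionaries (table, then column, then attributes), indexes and data changes are not modeled, and the classify function is hypothetical rather than the platform's classifier.

```python
from typing import Dict, Optional

# table name -> column name -> attributes such as {"nullable": bool}
Schema = Dict[str, Dict[str, dict]]


def classify(before: Schema, after: Optional[Schema]) -> str:
    """Classify a migration from schema snapshots taken around a dry run."""
    if after is None:
        return "breaking"  # dry run failed or could not be inspected: fail closed
    if before == after:
        return "code-only"  # no schema change observed
    for table, columns in before.items():
        if table not in after:
            return "breaking"  # dropped or renamed table
        for column, spec in columns.items():
            if column not in after[table]:
                return "breaking"  # dropped or renamed column
            if after[table][column] != spec:
                return "breaking"  # existing column changed shape
    for table, columns in after.items():
        old_columns = before.get(table)
        if old_columns is None:
            continue  # a brand-new table cannot break current code
        for column, spec in columns.items():
            if column not in old_columns and not spec.get("nullable", False):
                return "breaking"  # new NOT NULL column on an existing table
    return "additive"


base = {"orders": {"id": {"nullable": False}}}
with_note = {"orders": {"id": {"nullable": False}, "note": {"nullable": True}}}
print(classify(base, with_note))
```

Every branch that cannot positively show the change is safe returns "breaking", which is the fail-closed behavior the last bullet describes.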

Included on every plan.

Speed

Fast is the default, not an add-on. Every tenant serves from the edge, with the PHP and cache layers tuned before you deploy a line.

The same diagnostic engine that names a slow query also keeps the fast path fast: static and cacheable responses are served from Cloudflare's global edge, dynamic responses come off a pre-tuned PHP stack, and the object cache survives every deploy so a release never cold-starts your site.

  • Cloudflare edge on every tenant. Every request enters through Cloudflare's global network — cached assets serve from the edge nearest your customer, not a round trip to origin. WAF, CDN, and DDoS protection ride the same path. Included on every plan, not a premium CDN upsell.
  • PHP tuned before you arrive. OPcache and FastCGI Cache are wired into the base image by default, per PHP version. No plugin to install, no config to hand-tune — the fast path is the path you get.
  • Object cache that survives deploys. Per-slot external Redis backs your application object cache. Blue and green slots hold independent cache instances, so a deploy never flushes your live cache and traffic never lands on a cold site.
  • Media served from the edge. Uploads land in a per-tenant S3 bucket and serve through the CDN, so images and downloads don't compete with your application for PHP workers.
  • Scales with the spike. Flash sale or launch day, resources burst up to meet demand and settle back when it passes — burst to 3× your base reservation, the first 10% of the month free.

Included on every plan.

Backups + Restore

Backups are table stakes, not a premium tier.

Hourly XtraBackup. Continuous binlog shipping at 1-minute intervals. Hourly EBS snapshots. Tamper-proof cross-region copies. Same retention for every tenant.

  • 30-day XtraBackup retention.
  • 14-day binlog retention — PITR to any timestamp in the window.
  • 7-day EBS snapshot retention as tertiary safety net.
  • Self-service restore: pick a timestamp, watch the workflow provision a fresh primary, swap endpoints, keep the old for a rollback window.
  • Quarterly cross-region restore drills, full fleet coverage.
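
Point-in-time recovery generally works by restoring the newest base backup taken at or before the target and replaying binlogs up to the target. The sketch below shows that selection step only; the plan_restore function is hypothetical, and the 14-day constant is taken from the binlog retention listed above.

```python
from datetime import datetime, timedelta, timezone
from typing import List, Tuple

BINLOG_RETENTION = timedelta(days=14)  # from the retention list above


def plan_restore(target: datetime, backups: List[datetime],
                 now: datetime) -> Tuple[datetime, timedelta]:
    """Pick the base backup for a target timestamp and the binlog span to replay."""
    if target > now or now - target > BINLOG_RETENTION:
        raise ValueError("target is outside the point-in-time recovery window")
    candidates = [b for b in backups if b <= target]
    if not candidates:
        raise ValueError("no base backup precedes the target")
    base = max(candidates)      # newest backup at or before the target
    return base, target - base  # replay this much binlog on top of the base


now = datetime(2024, 5, 20, 12, 0, tzinfo=timezone.utc)
hourly = [datetime(2024, 5, 20, h, 0, tzinfo=timezone.utc) for h in range(12)]
base, replay = plan_restore(
    datetime(2024, 5, 20, 10, 37, tzinfo=timezone.utc), hourly, now
)
print(base.isoformat(), replay)
```

With hourly base backups the replay span stays under an hour, which keeps restore time dominated by the base restore rather than by log replay.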

Included on every plan.

Security + Compliance

PCI-DSS SAQ D as a Service Provider, with GDPR treated as a co-equal obligation.

The CDE boundary is published and audited annually, and validated through quarterly ASV scans and continuous file-integrity monitoring at the container layer.

  • Per-tenant KMS keys, multi-region.
  • File-integrity monitoring and runtime security on every node, every container.
  • Cardholder-data discovery scanner across MySQL, S3, and the telemetry store — pattern matching for any payment data outside the expected flow.
  • MFA mandatory on all CDE access.
  • Cloudflare for SaaS (WAF + CDN + DDoS; a PCI DSS Level 1 service provider, acting as sub-processor).
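
Pattern matching for card numbers is commonly done in two passes: a regular expression finds digit runs of plausible length, and the Luhn checksum filters out most false positives. The sketch below shows that general technique; it is not the platform's scanner, and the function names are made up for the example.

```python
import re
from typing import List

# 13 to 19 digits, optionally separated by single spaces or hyphens.
CANDIDATE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for position, char in enumerate(reversed(digits)):
        value = int(char)
        if position % 2 == 1:  # double every second digit from the right
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0


def find_card_numbers(text: str) -> List[str]:
    """Return normalized digit strings that look like card numbers."""
    hits = []
    for match in CANDIDATE.finditer(text):
        digits = re.sub(r"\D", "", match.group())
        if luhn_valid(digits):
            hits.append(digits)
    return hits


sample = "order 1234567890123456, card 4242-4242-4242-4242"
print(find_card_numbers(sample))
```

The checksum is what keeps order IDs and tracking numbers from flooding the results: a random 16-digit run passes Luhn only about one time in ten.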

Included on every plan.
