ActivLayer · Offering

Products and solutions
for autonomous operations.

Six platform capabilities, deployable independently or together, and six packaged outcomes ready to deploy against your most pressing operational pain. Start with the piece that solves your most urgent problem.

Products: 6 capabilities · Solutions: 6 packaged outcomes
ORCHESTRATION

The reasoning pipeline that turns a plain-English instruction into a safe, verified infrastructure action.

The Agent Engine is the core of the platform. Every operation — whether triggered by a human typing a command, an automated event from your infrastructure, or an API call — runs through a deterministic multi-stage pipeline. The pipeline never skips a step and never improvises a shortcut.

The stages, in order: Intent classification → Intent refinement → Memory retrieval → Reasoning and plan generation → Policy check → Human approval gate → Execution → Verification → Memory commit. Each stage is independently auditable. The session record contains the output of every stage. If something goes wrong at any stage, the pipeline stops, logs the reason, and either self-corrects or fails gracefully.
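The fixed stage order and stop-on-failure behaviour described above can be illustrated with a minimal Python sketch. The stage names come from the list above; `SessionRecord`, `run_pipeline`, and the handler signature are hypothetical names chosen for illustration, not the platform's actual API.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional

# Stage names follow the order given in the text; everything else is illustrative.
STAGES = [
    "intent_classification",
    "intent_refinement",
    "memory_retrieval",
    "plan_generation",
    "policy_check",
    "human_approval",
    "execution",
    "verification",
    "memory_commit",
]


@dataclass
class SessionRecord:
    outputs: Dict[str, dict] = field(default_factory=dict)  # stage name -> output
    stopped_at: Optional[str] = None  # stage that halted the run, if any
    reason: Optional[str] = None      # logged reason for the halt


def run_pipeline(intent: str, handlers: Dict[str, Callable[[dict], dict]]) -> SessionRecord:
    """Run every stage in order; stop and record the reason on the first failure."""
    record = SessionRecord()
    context = {"intent": intent}
    for stage in STAGES:
        try:
            output = handlers[stage](context)
        except Exception as exc:  # any stage failure halts the run
            record.stopped_at = stage
            record.reason = str(exc)
            break
        record.outputs[stage] = output  # each stage output is kept for audit
        context[stage] = output         # and is visible to later stages
    return record
```

In this sketch a failure at the policy check leaves `execution` absent from the record, which mirrors the claim that no later stage runs once the pipeline stops. Self-correction and retries are left out to keep the example short.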

The engine is designed for composability. Deploy it standalone via API to add autonomous execution to your own tooling, or run it as the backbone of the full ActivLayer stack.

Specifications
Pipeline stages: 9-stage deterministic sequence
Execution targets: K8s, Ansible, Terraform, AWS, GCP, VMware, Proxmox, Vault
Deploy mode: SaaS, self-hosted operator, or API-only
API surface: Full REST + webhook event stream
Key capabilities
Natural language intent input — no scripts, no predefined command syntax
9-stage deterministic pipeline — every action follows the same verified sequence
Self-correction loop — if execution fails, the platform re-analyzes and retries with a corrected plan
Agent templates — define per-agent authorization scopes, channel restrictions, daily budgets, and autonomy levels
Event-driven dispatch — triggered by infrastructure events, API calls, CLI commands, or scheduled jobs
CLI access — activlayer run "intent", activlayer sessions, activlayer approve
Standalone deployment

Teams that already have their own ITSM or runbook tooling can embed just the Agent Engine via API to add autonomous execution capability without replacing their existing workflow layer.

Works with → All other components. The Agent Engine orchestrates everything else.
AI · CLOUD / AIRGAP

Frontier-class AI for cloud environments. Fully self-hosted for air-gapped on-premises — your data never leaves your network.

The AI Reasoning Layer provides the analytical intelligence that powers every reasoning step in the pipeline — classification, plan generation, intent refinement, self-correction, and outcome summarization. It is not a single prompt sent to a chat API. Each pipeline stage uses a purpose-specific system prompt and expects a structured JSON response — the output of each step feeds directly into the next.
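As a rough illustration of what "expects a structured JSON response" can mean in practice, the Python sketch below validates a stage's raw output before it is passed on, and flags low-confidence results for human review. The field names (`category`, `risk_level`, `confidence`) and the 0.8 threshold are assumptions made for this example; the platform's real schemas are not documented here.

```python
import json

# Hypothetical schema for a classification stage (field names are assumptions).
REQUIRED_FIELDS = {
    "category": str,
    "risk_level": str,
    "confidence": (int, float),
}


def parse_stage_output(raw: str) -> dict:
    """Parse and validate one stage's JSON output; raise ValueError if it is unusable."""
    data = json.loads(raw)  # malformed JSON raises JSONDecodeError, a ValueError subclass
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    for name, expected_type in REQUIRED_FIELDS.items():
        if name not in data:
            raise ValueError(f"missing field: {name}")
        if not isinstance(data[name], expected_type):
            raise ValueError(f"field {name} has the wrong type")
    return data


def needs_human_review(data: dict, threshold: float = 0.8) -> bool:
    """Return True when confidence falls below the (assumed) escalation threshold."""
    return data["confidence"] < threshold
```

Rejecting anything that does not match the expected shape means the next stage only ever sees well-formed input, which is the property the paragraph above relies on.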

For cloud deployments, the platform routes requests through a pool of API keys with automatic rotation and failover — distributing load and providing resilience if any single key is rate-limited. For organizations that cannot allow any data to leave their network — government, defense, healthcare, financial services with strict data residency requirements — the platform runs a self-hosted AI model entirely on your own infrastructure, deployed as a container inside your Kubernetes cluster with zero egress.
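Key rotation with failover can be sketched in a few lines of Python. This is a minimal round-robin illustration under stated assumptions: `KeyPool`, `RateLimited`, and the `send` callable are hypothetical names, and the platform's actual rotation strategy may differ.

```python
class RateLimited(Exception):
    """Raised by the (hypothetical) send function when a key is throttled."""


class KeyPool:
    """Round-robin pool: each call starts at the next key and fails over on throttling."""

    def __init__(self, keys):
        if not keys:
            raise ValueError("at least one key is required")
        self._keys = list(keys)
        self._next = 0

    def call(self, send):
        n = len(self._keys)
        start = self._next
        self._next = (self._next + 1) % n  # rotate the starting key to spread load
        last_error = None
        for offset in range(n):
            key = self._keys[(start + offset) % n]
            try:
                return send(key)
            except RateLimited as exc:  # fail over to the next key in the pool
                last_error = exc
        raise RuntimeError("every key in the pool is rate-limited") from last_error
```

A caller would pass a function that performs the real request with the given key. Only throttling triggers failover here; other errors propagate unchanged.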

Airgap mode is not a degraded experience — it is a first-class deployment option. The AI layer is model-agnostic: swap the underlying model without changing any pipeline logic.

Specifications
Cloud mode: Frontier-class AI, multi-key rotation, automatic failover
Airgap mode: Self-hosted AI on your infrastructure, zero egress
Reasoning style: Chain-of-thought with structured JSON output
Low confidence: Auto-escalates to HITL below threshold
Key capabilities
Cloud mode: frontier-class AI with multi-key rotation for reliability and load distribution
Airgap mode: self-hosted AI model — fully offline, GPU-optional, zero external API calls
Structured output: every AI call returns validated JSON, which removes parsing ambiguity and narrows the prompt-injection surface
Per-stage prompts: classify, reason, refine, summarize — each stage has a specialized system prompt
Model-agnostic design: the reasoning interface decouples pipeline logic from the underlying model
Transparent reasoning: every AI output is stored verbatim in the session record — no black box
Air-gapped compliance

Airgap mode requires no external network connectivity at inference time. The model runs on your hardware. No telemetry, no external logging, no egress. Designed to pass FedRAMP, FISMA, and NHS IG requirements.

Works with → Agent Engine (primary consumer) · Policy Enforcement (informed by AI risk classification).
OPA

Open Policy Agent checks every action before execution. Define what's allowed — by environment, risk level, operation type, and team.

No action executes without passing the Policy Enforcement layer. Every step in every execution plan is evaluated by Open Policy Agent (OPA) before it runs. The policy decision is binary — allow or deny — and the denial reason is logged and surfaced to the operator. This is not a best-effort guardrail. It is a hard gate.

Policy rules are written in Rego — OPA's purpose-built policy language — and can express any combination of conditions: environment name, operation type (read / write / delete), risk level, channel, team identity, time of day, or the specific command string. A rule can say: deny any DELETE operation against an environment flagged as production, or require HITL approval for any operation with risk level HIGH or above.

By default, the platform fails closed: if OPA is unreachable, no action is allowed. In addition to OPA rules, the platform enforces agent-level guardrails: command deny patterns, environment scope restrictions, and daily action budgets.
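The fail-closed posture and the agent-level guardrails can be pictured with the Python sketch below. It is an illustration only: `check_action` and the `query_policy` callable are hypothetical, the policy engine is stubbed as a plain function rather than a real OPA client, and the order in which guardrails run is an assumption.

```python
import re


def check_action(command, deny_patterns, actions_today, daily_budget, query_policy):
    """Return (allowed, reason). Any doubt resolves to a denial."""
    # Agent-level guardrails are evaluated independently of the policy engine.
    if any(re.search(pattern, command) for pattern in deny_patterns):
        return False, "matched deny pattern"
    if actions_today >= daily_budget:
        return False, "daily budget exhausted"
    try:
        decision = query_policy(command)
    except Exception as exc:  # engine unreachable or erroring: fail closed
        return False, f"policy engine unreachable: {exc}"
    if decision is True:      # only an explicit allow lets the action through
        return True, "allowed by policy"
    return False, "denied by policy"
```

The key design point is the `except` branch: an outage in the policy engine produces a denial with a logged reason rather than an implicit allow.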

Specifications
Policy engine: Open Policy Agent (OPA)
Policy language: Rego
Default policy library: CIS, NIST, PCI-DSS, HIPAA, STIG controls
Default posture: Fail-closed (deny if OPA unreachable)
Key capabilities
OPA Rego rules evaluated against every execution step before it runs
Define rules by environment, operation type, risk level, channel, team, command content
Agent-level deny patterns — regex rules that block commands independently of OPA
Environment scope enforcement — agents can only operate within their assigned scope
Daily action budgets — configurable cap on operations per agent per day
Fail-closed by default — OPA unreachable = no action permitted
Audit log of every policy decision: allowed reasons and denial reasons
Zero trust by default

ActivLayer ships with a deny-all-if-no-policy baseline. If an action type isn't explicitly covered by a policy, it's blocked. You expand permissions intentionally — you never accidentally allow too much.

Works with → Agent Engine (every execution step evaluated here) · Human-in-the-Loop (routes HIGH/CRITICAL risk to approval).
APPROVAL

Configurable approval gates with full AI briefing — reasoning, planned steps, and exact commands — before any action runs.

Human-in-the-Loop (HITL) is not a notification system. It is a structured decision point that pauses execution and presents the approving human with everything they need to make an informed judgment — the AI's full reasoning, the complete action plan, the exact commands that will execute, and the risk assessment — before a single change is made.

When an operation reaches a HITL gate, the session pauses completely. The approving authority receives a briefing showing: what triggered the operation, what the AI determined was happening, why it chose this approach, and the step-by-step plan with exact commands. The human makes one decision: Approve, Deny, or Submit a revised intent.

HITL configuration is granular — applied at the agent level, severity level, environment level, or risk level. Plans not approved within the configured timeout cancel automatically. There is no risk of a stale approval executing hours later.
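The timeout rule, where a late approval can never execute a stale plan, can be sketched as a small state machine in Python. `ApprovalRequest` and its fields are hypothetical names for this example; timestamps are plain seconds so the logic is easy to follow.

```python
from dataclasses import dataclass
from enum import Enum


class ApprovalState(Enum):
    PENDING = "pending"
    APPROVED = "approved"
    DENIED = "denied"
    CANCELLED = "cancelled"


@dataclass
class ApprovalRequest:
    created_at: float        # seconds since some epoch
    timeout_seconds: float   # configured approval window
    state: ApprovalState = ApprovalState.PENDING

    def resolve(self, decision: ApprovalState, now: float) -> ApprovalState:
        """Apply a human decision, unless the window has closed or a decision exists."""
        if self.state is not ApprovalState.PENDING:
            return self.state  # already decided or cancelled: nothing changes
        if now - self.created_at > self.timeout_seconds:
            self.state = ApprovalState.CANCELLED  # late decisions cancel the plan
        else:
            self.state = decision
        return self.state
```

Because the expiry check happens at decision time, an approval that arrives after the window closes cancels the plan instead of running it.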

Specifications
Approval channels: Ops Console, Slack, Teams, webhook
Briefing content: Root cause, plan steps, exact commands, rollback path
Escalation: Configurable timeout → auto-cancel
Granularity: By action type, risk level, environment, time window
Key capabilities
Pause-and-brief model — execution stops completely until a human decides
Full AI briefing at the approval gate — reasoning, planned steps, exact commands, risk level
Approve / Deny / Revise actions available at every gate
Configurable triggers: always, by severity, by environment, by risk level, or never
Automatic timeout and cancellation — unapproved plans do not execute late
Notifications via Operations Console, Slack, Teams, or webhook
Approval record: approver identity, timestamp, and briefing content — stored immutably
Not a yes/no button

Most approval systems ask you to approve or reject without context. ActivLayer's HITL briefings give you everything the AI knows — so you're making an informed decision, not rubber-stamping a black box.

Works with → Policy Enforcement (routes HIGH/CRITICAL risk to approval) · Agent Engine (approval gate is a pipeline stage) · Operations Console (approval queue is visible there).
MONITORING

Live session traces, HITL approval queue, runbook library, and complete immutable audit history — all in one place.

The Operations Console is the command center for everyone who needs to understand what the platform is doing, has done, or is waiting for. It is not a log viewer — it is an operational interface that surfaces active sessions in real time, makes the HITL approval queue visible and actionable, and gives access to the full audit history, knowledge base, and agent configuration.

Every session in the console shows its complete lifecycle: current phase, the AI's reasoning verbatim, the execution steps with exact commands and outputs, the policy evaluation results, and the final AI-generated summary. Nothing is hidden. Every decision the platform made — and why — is readable.

For teams that need oversight without operational access, the console provides read-only views suitable for compliance officers, security teams, and management. For on-call engineers, the HITL approval queue surfaces pending approvals with all the context needed to decide without opening another tool.

Specifications
Live feed: Step-by-step session trace with decision rationale
HITL queue: Pending approvals with full AI briefings
Audit history: Immutable, searchable, exportable to SIEM
Runbook library: Validated playbooks with version history
Key capabilities
Live session board — real-time phase progression for all active sessions across all environments
HITL approval queue — all pending approvals in one view, with full briefing and one-click approve/deny
Session detail — complete trace: AI reasoning, execution steps, stdout/stderr, policy results, summary
Runbook library — browse, search, and ingest runbooks into the vector knowledge base
Agent management — create, configure, and monitor provisioned agents and their templates
Environment management — register and manage connected environments and their credentials
Audit history — immutable, searchable session history with export for compliance reporting
Knowledge base — view and manage the vector memory store that informs future reasoning
The view your CISO needs

Every regulatory audit starts with the same question: what did your automation do and why? The Operations Console answers it. Immutable logs, decision traces, and approver records are all in one place and all exportable.

Works with → All components. The Operations Console is the human interface to everything the platform does.
8 PLATFORMS + API

K8s, Ansible, Terraform, AWS, GCP, VMware, Proxmox, Vault. REST API for embedding and building on top.

The platform operates across the infrastructure landscape as it actually exists. Eight execution channels are available out of the box. Each channel is a structured connector that translates the platform's action plans into native operations against that platform's API or CLI, captures the output, and returns it to the pipeline. You do not need to rewrite your infrastructure or replace your existing tools.

Every channel operation is structured and typed — the platform doesn't issue raw shell commands and parse free-text output. Each operation has a defined type (k8s.read, ansible.playbook, terraform.apply, vsphere.migrate), a defined scope, and a defined output schema. This makes every action auditable, policy-checkable, and verifiable.
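A typed operation can be modelled as a small immutable record. The Python sketch below uses the four operation types named above; the `Operation` class, its field names, and the `channel_of` helper are illustrative assumptions rather than the platform's actual data model.

```python
from dataclasses import dataclass, field

# Operation types mentioned in the text; the set is illustrative, not exhaustive.
KNOWN_TYPES = {"k8s.read", "ansible.playbook", "terraform.apply", "vsphere.migrate"}


@dataclass(frozen=True)
class Operation:
    type: str    # e.g. "k8s.read"
    scope: str   # hypothetical: the environment or namespace the operation may touch
    params: dict = field(default_factory=dict)


def channel_of(op: Operation) -> str:
    """Return the channel prefix of a known operation type; reject unknown types."""
    if op.type not in KNOWN_TYPES:
        raise ValueError(f"unknown operation type: {op.type}")
    channel, _, _action = op.type.partition(".")
    return channel
```

Because the type is a fixed identifier rather than a free-form command string, a policy rule or an audit query can match on it directly.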

For teams building products on top of the platform, the full REST API exposes every capability: submit intents, receive session state, handle approvals, query history, manage agents and environments. The API is the integration point for embedding autonomous ops intelligence into your own SaaS, PaaS, or internal tooling.

Specifications
Execution channels: 8 platforms (structured, typed operations)
API: Full REST + event webhooks
SDKs: Python, TypeScript
Custom connectors: Available on Enterprise tier
Key capabilities
8 execution channels, each with structured typed operations (not raw shell commands)
REST API: submit intents, poll session state, handle approvals, query audit history
CLI: activlayer run, activlayer approve, activlayer sessions, activlayer agents
Webhook endpoints: receive infrastructure events from monitoring tools and schedulers
Custom connector framework: add support for additional platforms (Enterprise)
White-label SDK for embedding in your own product (Enterprise)
Supported platforms
Kubernetes (K8s, OpenShift, EKS, GKE, AKS): Read/describe pods, deployments, nodes, events; delete pods; apply manifests; rollouts; fetch logs; exec
Ansible: Run playbooks against inventory; ad-hoc module execution; compliance scan playbooks; configuration enforcement
Terraform: State list and show; plan; apply; targeted destroy; workspace management
AWS: EC2, RDS, ECS, Lambda, S3, Cost Explorer, CloudWatch, IAM
GCP: Compute Engine, GKE, Cloud Storage, Cloud Monitoring
VMware vSphere: VM metrics and status; live vMotion migration; host inventory; snapshot management
Proxmox VE + PBS: VM/container backup; datastore management; PBS job monitoring; snapshot lifecycle; prune
HashiCorp Vault: Credential retrieval; secret rotation; dynamic credentials for execution contexts
No agents on your infrastructure

All execution targets are accessed over their existing API surface — no agents, no daemons, no sidecars deployed on your infrastructure. ActivLayer uses the same APIs your engineers use, with the same access controls.

Works with → Agent Engine (channels execute the planned steps) · Policy Enforcement (every channel operation is policy-checked before execution).
Deployment

Deploy independently or as a full stack.

Start with the piece that solves your most urgent problem. Add the others when you're ready.

Community
Free
1 environment
3 agents
Cloud AI only
HITL approvals
OPA enforcement
Audit trail
Professional
Contact us
5 environments
20 agents
Cloud + Airgap AI
HITL approvals
OPA enforcement
Audit trail
VMware + Proxmox connectors
Enterprise
Contact us
Unlimited environments
Unlimited agents
Cloud + Airgap AI
HITL approvals
OPA enforcement
Audit trail
VMware + Proxmox connectors
SSO / SAML
Custom connectors
White-label SDK
24/7 SLA
Solutions

Packaged outcomes, not just features.

Each solution maps platform capabilities directly to a business outcome. Start with the one that fits your most urgent pain.

The problem

Infrastructure failures that wake up your engineers at 3am — pod crashes, failed deployments, connection exhaustion, VM saturation — handled automatically before anyone is paged.

INCIDENT

Known failures handled before on-call wakes up.

The platform watches your infrastructure continuously. When a failure event is detected — a Kubernetes BackOff event, an HTTP error rate spike, a VM CPU saturation alarm, a failed health check — the relevant agent is dispatched immediately. It classifies the failure, pulls logs and resource state, reasons about the root cause, generates a remediation plan, checks the plan against policy, and executes.

For failures that match known patterns (pod crashes that resolve with a restart, deployments that need rollback, connection pools that need tuning), the platform resolves them fully autonomously — with zero human involvement, in under 90 seconds. For novel or high-risk situations, the engineer receives a complete briefing, not a raw alert.

After every resolved incident, the outcome is indexed into the platform's vector memory. The next time a similar failure occurs, the platform retrieves the successful resolution and applies it faster and with higher confidence.

The outcome
Known failures resolved before the pager fires
On-call engineers receive briefings, not investigation tasks
Full session record for every incident: cause, steps taken, outcome, duration
Mean time to resolution measured in seconds, not minutes
Specifications
Avg resolution time: Under 90 seconds
Covered failure types: Pod crashes, OOM, connection exhaustion, cert expiry, drift, scale events
Rollback: Automatic on unexpected mid-execution state
Escalation: Auto-escalates to on-call on failure or low confidence
Capabilities used

Agent Engine · AI Reasoning Layer · Policy Enforcement · Human-in-the-Loop · Integrations (K8s, OpenShift, VMware, Ansible)

Most relevant for
SaaS and technology companies
Telecoms
Financial services
Healthcare
MSPs
What 90 seconds means at 3am

The average MTTR for a Kubernetes pod crash loop with a human on-call is 22 minutes (alert → wake → context → fix → verify). ActivLayer does it in under 90 seconds, without waking anyone. A single prevented P1 incident typically saves 8–12 hours of engineer time.

The problem

Security and compliance audits are quarterly snapshots. Your actual exposure is continuous. The platform makes compliance posture a live measurement, not a periodic project.

COMPLIANCE

Continuous audits with drift detected and remediated before it becomes a finding.

Compliance agents run continuous scans against your infrastructure — checking every environment against the controls you define. Frameworks supported out of the box include CIS Kubernetes Benchmark, PCI-DSS v4, HIPAA Security Rule, NIST 800-53, and STIG baselines. Custom controls are written as OPA Rego rules and added to the policy library.

When drift is detected — a server that has strayed from the hardened baseline, a namespace missing resource governance, a privileged container that shouldn't exist — the platform generates a structured drift report itemizing every violation by host, control ID, current value, and expected value. A remediation plan is generated automatically and routes to the compliance authority for approval.
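The shape of such a drift report can be sketched in Python as a comparison between an expected baseline and the values observed on each host. The function name, the dictionary layout, and the control IDs in the usage note are hypothetical; real scans would come from the Ansible or Kubernetes channels.

```python
def drift_report(expected: dict, observed: dict) -> list:
    """List every violation by host, control ID, current value, and expected value.

    expected: control_id -> required value
    observed: host -> {control_id -> current value}
    """
    findings = []
    for host in sorted(observed):
        for control_id in sorted(expected):
            want = expected[control_id]
            have = observed[host].get(control_id)  # None when the setting is absent
            if have != want:
                findings.append({
                    "host": host,
                    "control_id": control_id,
                    "current": have,
                    "expected": want,
                })
    return findings
```

For example, with a baseline of `{"ssh.permit_root_login": "no"}` and a host reporting `"yes"`, the report contains one finding for that host and control. A missing setting is reported with a current value of `None`.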

Every scan result and remediation is a complete audit artifact: what was found, when, who approved the fix, what changed, and when it was verified. This is the continuous compliance evidence that annual audits require — generated automatically as a side effect of operations.

The outcome
Compliance posture updated continuously, not quarterly
Drift detected and remediated before it becomes an audit finding or a vulnerability
Audit-ready evidence generated automatically — not assembled manually before the audit window
Frameworks covered: CIS, PCI-DSS, HIPAA, NIST 800-53, STIG, custom Rego
Specifications
Frameworks: CIS K8s, PCI-DSS v4, HIPAA, NIST 800-53, STIG, custom
Audit cadence: Continuous, scheduled, or on-demand
Report formats: PDF, JSON, SARIF, direct SIEM export
Remediation: Auto or HITL-gated, configurable per control
Capabilities used

Agent Engine · AI Reasoning Layer · Policy Enforcement · Human-in-the-Loop · Integrations (K8s, Ansible, Terraform)

Most relevant for
Financial services
Healthcare
Government
Manufacturing (IEC 62443, TISAX)
SaaS (SOC 2, ISO 27001)
Compliance as continuous infra state

Most teams treat compliance as a quarterly exercise — they scramble to produce evidence and manually close findings. ActivLayer treats it as an ongoing infrastructure state that's maintained automatically. When the auditor arrives, the evidence is already there.

The problem

Cloud bills grow in the dark. Orphaned resources from forgotten deployments, idle capacity no one is watching, and infrastructure that drifted from your Terraform state — found and flagged before they appear on the invoice.

COST

Cloud waste found and eliminated before it appears on the bill.

The platform continuously compares your Terraform state against actual cloud resource state. Any drift — resources that exist in your cloud account but are not tracked by Terraform, or resources that were provisioned and never destroyed — surfaces as a cost anomaly report. The report itemizes every orphaned resource with its estimated monthly cost, the last time it showed activity, and the reason it's considered orphaned.
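At its core, the comparison described above is a set difference between resources tracked in Terraform state and resources that actually exist in the cloud account. The Python sketch below shows that idea under simplifying assumptions: resource IDs are already normalised to the same form on both sides, and each cloud resource carries a hypothetical `monthly_cost` estimate.

```python
def find_orphans(tracked_ids, cloud_resources):
    """Return (orphans, total_monthly_cost) for resources not tracked in state.

    tracked_ids: iterable of resource IDs present in Terraform state
    cloud_resources: list of dicts with "id" and "monthly_cost" keys (assumed shape)
    """
    tracked = set(tracked_ids)
    orphans = [r for r in cloud_resources if r["id"] not in tracked]
    orphans.sort(key=lambda r: r["monthly_cost"], reverse=True)  # costliest first
    total = sum(r["monthly_cost"] for r in orphans)
    return orphans, total
```

In a real deployment the tracked IDs would be derived from Terraform state and the cloud inventory from the provider's API. Both inputs are passed in here so the sketch stays independent of any particular provider.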

For each identified waste source, the platform generates a Terraform destroy plan — a precise, reviewable list of exactly what will be removed and the estimated monthly saving. Because destroying cloud resources is irreversible, every destroy plan routes through the HITL gate for explicit human approval. Once approved, the destruction executes against your existing Terraform state.

The platform also integrates with AWS Cost Anomaly Detection and similar services — receiving webhook alerts when spend spikes, immediately correlating the spike against Terraform state, and surfacing the specific resources responsible. You receive a root-cause diagnosis and a ready-to-approve action plan, not just a billing notification.

The outcome
Orphaned resources identified before the monthly statement arrives
Cost anomalies traced to their specific infrastructure source, not just a service line
Terraform destroy plans ready for one-click approval — no manual state archaeology
Continuous drift detection between declared state (Terraform) and actual state (cloud account)
Specifications
Detection targets: Untagged resources, idle instances, orphaned storage, overprovisioned VMs
Action modes: Auto-remediate, queue for approval, or report-only
Avg monthly savings found: $9,340 (first 30 days, across deployed customers)
Cloud targets: AWS, GCP, Azure, multi-cloud
Capabilities used

Agent Engine · AI Reasoning Layer · Policy Enforcement · Human-in-the-Loop · Integrations (Terraform, AWS, GCP)

Most relevant for
SaaS and technology companies
Engineering teams on AWS / GCP / Azure
FinOps programs
MSPs managing cloud spend for clients
Not a dashboard — an agent

Cost visibility tools show you the waste. ActivLayer eliminates it. There's no engineer whose job it is to act on a dashboard — the agent does it.

The problem

DR plans exist on paper. Most are not tested at full scale because running a real drill requires coordinating multiple teams, following a manual runbook, and risking the production environment.

DR

Automated DR drills with real RTO/RPO metrics and compliance-ready evidence.

Disaster recovery drills are configured as scheduled jobs — quarterly, monthly, or on demand. When a drill fires, the platform runs a complete, end-to-end DR validation: it checks replication health across all protected VMs or databases, provisions the DR environment via Terraform, validates that all protected services pass health checks in the DR environment, and measures real RTO and RPO against your SLA commitments.

DNS cutover — the one step that could affect production routing if executed against live DNS — always routes through the HITL gate. Every other step runs autonomously. After the drill completes, the platform generates a structured DR report: RTO achieved, RPO delta, services validated, resources provisioned, comparison to previous quarters, and engineer time required.
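Measured RTO and RPO reduce to simple timestamp arithmetic once the drill records when failover began, when services passed health checks, and when the last replicated recovery point was taken. The Python sketch below uses the standard definitions of the two metrics; the function and parameter names are hypothetical, and all times are in seconds.

```python
def drill_metrics(failover_started, services_healthy, last_replicated,
                  rto_target, rpo_target):
    """Compute achieved RTO/RPO for one drill and compare them with SLA targets.

    RTO: time from the start of failover until services pass health checks.
    RPO: age of the most recent replicated recovery point when failover began.
    """
    rto = services_healthy - failover_started
    rpo = failover_started - last_replicated
    return {
        "rto_seconds": rto,
        "rpo_seconds": rpo,
        "rto_met": rto <= rto_target,
        "rpo_met": rpo <= rpo_target,
    }
```

Storing this result per drill is also what makes the quarter-over-quarter comparison in the report possible.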

After validation, the DR environment is automatically deprovisioned via Terraform destroy — no ongoing cost, no orphaned resources.

The outcome
Quarterly DR drills run end-to-end without a coordination call
Real RTO and RPO metrics, not estimates — measured against an actual provisioned environment
DR evidence report generated automatically — suitable for board, auditor, and insurer
DR environment provisioned and deprovisioned within the same workflow — no ongoing cost
Specifications
Drill cadence: Configurable: weekly, monthly, or on-demand
RTO measurement: Real clock time, not estimates
Report output: Compliance-ready PDF with evidence chain
Failure handling: Drill failures surfaced as remediation findings
Capabilities used

Agent Engine · AI Reasoning Layer · Human-in-the-Loop (DNS cutover) · Integrations (Terraform, VMware vSphere, K8s)

Most relevant for
Financial services (DORA, BCM requirements)
Government (FISMA, COOP)
Healthcare (HIPAA continuity)
Any organization with cyber insurance requiring documented DR validation
DR plan vs DR capability

A DR plan is a document. DR capability is what you have when the document has been tested against real infrastructure under realistic conditions. ActivLayer builds the latter — automatically, on a schedule, with evidence.

The problem

Managing infrastructure for multiple clients requires either disproportionate headcount or accepting that some clients' environments go unmonitored overnight.

MSP

More clients, same team. Autonomous 24/7 response at portfolio scale.

Every client environment is fully isolated — separate credentials, separate agents, separate policy rules, separate session history. Your engineers see all clients from a single Operations Console with environment and client filters. Clients can optionally have a read-only view of their own session history and compliance reports.

Per-client configuration is granular. A healthcare client gets HIPAA-aligned OPA policy rules. A PCI client gets payment card security controls. A government client gets FISMA-aligned baselines and airgap deployment. Each client's agent templates define exactly which operations are permitted autonomously and which require approval — configured independently, enforced automatically.

The platform handles the volume — pod crashes, backup failures, VM performance degradation, configuration drift — across all clients simultaneously, autonomously, around the clock. Monthly reports for each client are generated from session data and available via the API or console export.

The outcome
Autonomous 24/7 response across all client environments — without proportional on-call staffing
Per-client policy isolation — one platform instance, every client's rules enforced independently
Cross-platform support — serve clients running K8s, OpenShift, VMware, Proxmox, Ansible, Terraform from one tool
Automated client reporting — compliance posture and incident history as a deliverable, not a manual effort
White-label options available (Enterprise) — present the platform under your own brand
Specifications
Tenant isolation: Hard namespace + policy separation per client
Automation level: Configurable per client: full auto, HITL-gated, report-only
Client reporting: Automated, branded, configurable cadence
Onboarding: Template-based: new client provisioned in under 1 hour
Capabilities used

All six components · Multi-tenant deployment model · Per-client agent templates · REST API for client reporting integration

Most relevant for
MSPs of any size — from boutique shops managing 10 clients to large providers managing hundreds
The economics

The typical MSP using ActivLayer handles 40–60% more client environments with the same engineering team. At standard MSP contract rates, that translates to 1.4–1.6× revenue per engineer.

The problem

Your SaaS or PaaS product serves customers who run infrastructure. You want to add autonomous operations intelligence to your product without building it from scratch.

EMBED

Add an autonomous ops intelligence layer to your product.

The full platform capability is available via REST API. Submit an intent programmatically, receive session state, poll for completion, retrieve the AI reasoning and execution results — all from your own application. Your customers interact with your product's interface; the autonomous ops intelligence runs underneath.
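The submit-then-poll pattern can be sketched in Python without assuming any particular endpoint. In the example below, `get_state` stands in for whatever call your application makes to fetch session state, and the terminal state names are assumptions for illustration; neither is taken from the ActivLayer API reference.

```python
import time

# Hypothetical terminal states; the real API may name these differently.
TERMINAL_STATES = {"completed", "failed", "cancelled"}


def wait_for_session(get_state, session_id, interval=2.0, max_polls=30, sleep=time.sleep):
    """Poll a session until it reaches a terminal state or the poll budget runs out.

    get_state: callable taking a session ID and returning its current state string.
    sleep: injectable so callers and tests can avoid real delays.
    """
    for _ in range(max_polls):
        state = get_state(session_id)
        if state in TERMINAL_STATES:
            return state
        sleep(interval)  # wait before asking again
    raise TimeoutError(f"session {session_id} did not finish after {max_polls} polls")
```

Where the webhook event stream is available, subscribing to events is likely a better fit than polling. The loop above is the simplest option for a first integration.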

For products that need to surface the HITL approval workflow to end users, the API exposes every approval gate — pending intents, briefing content, approve/deny endpoints — so you can build the approval interface natively within your own UI. Your customers see your design; the platform handles the reasoning, policy enforcement, and execution.

The white-label SDK (Enterprise) provides pre-built UI components for the operations console, approval queue, and session trace views — ready to embed under your brand with your color scheme. Combined with per-customer agent isolation, this is the foundation for adding a serious autonomous operations layer to any infrastructure-adjacent product.

The outcome
Add autonomous incident detection and remediation to your product via REST API — no platform rebuild required
Embed the full HITL approval workflow in your own UI, under your own brand
Per-customer isolation — each of your customers has their own agent scope, policy rules, and audit history
Productize operations intelligence as a feature of your platform, not a separate tool your customers have to learn
Specifications
Integration options: REST API, full white-label, OEM
Typical integration time: 2–4 weeks (API), 6–8 weeks (white-label)
SDK availability: Python, TypeScript
Revenue model: Revenue sharing on embedded / referral deployments
Capabilities used

All six components · REST API · Webhook integration · White-label SDK (Enterprise) · Per-tenant agent isolation

Most relevant for
Infrastructure SaaS companies (monitoring, observability, cloud management)
PaaS providers adding ops automation to their platform
ISVs serving industries with operational compliance requirements
Who this is for

If you ship infrastructure software and your customers are doing manual operations work — incident response, compliance checks, cost reviews — that's work ActivLayer can automate inside your product. You differentiate. They stop paging on-call at 3am.

Ready to start?

Start with what you need most.

Every solution is available in the Community edition — free, no credit card required.