Roadmap¶
This roadmap reflects the current shipped state of AION and the remaining work required to move from a local control-plane prototype toward production-grade self-healing.
Current status¶
The repository now ships:
- context-aware Python scanning with repository profiling and Semgrep triage
- deterministic repair artifacts plus standalone verification
- repair evaluation with persisted repair records and metrics
- sandbox orchestration for single events, event queues, inbox items, and webhooks
- release candidates with approval, staged advancement, rejection, and rollback
- runtime-first defense planning for containment before rollout
- drift detection, a knowledge base, and a continuous watch loop for self-evolution
- knowledge-base confidence boosts wired into the policy engine
Completed phases¶
Phase 0: Trusted foundation — shipped in v1.0.0¶
- Findings are upgraded into structured incidents.
- Repair, verification, orchestration, and release flows share JSON models.
- Fixtures cover the incident lifecycle end to end.
Phase 1: Automatic repair closed loop — shipped in v1.0.0¶
- Deterministic repair is implemented for the first supported issue classes.
- Verification runs syntax checks, Semgrep re-scan, assertions, and optional sandbox commands.
- `repair`, `verify`, and `run-incident` provide a local closed loop.
Phase 2: Self-verification and learning — shipped in v1.0.0¶
- Repair attempts can be persisted as audit records.
- `repair-eval` computes repair success, verification pass, false-fix, and rollback rates.
- Failure outcomes are captured for later template and policy refinement.
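The four rates reported by `repair-eval` could be derived from persisted repair records roughly as follows. This is a minimal sketch, not AION's implementation: the `RepairRecord` fields, the example issue-class names, and the choice of denominators for each rate are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RepairRecord:
    """Hypothetical audit record; field names are illustrative, not AION's schema."""
    issue_class: str
    applied: bool      # a patch was produced and applied
    verified: bool     # the patch passed verification
    false_fix: bool    # passed verification but did not actually fix the issue
    rolled_back: bool  # the change was later reverted


def rate(numerator: int, denominator: int) -> float:
    """Return a ratio, treating an empty denominator as 0.0."""
    return numerator / denominator if denominator else 0.0


def evaluate(records: list[RepairRecord]) -> dict[str, float]:
    """Compute the four rates under the assumed definitions above."""
    applied = [r for r in records if r.applied]
    verified = [r for r in applied if r.verified]
    return {
        "repair_success_rate": rate(len(applied), len(records)),
        "verification_pass_rate": rate(len(verified), len(applied)),
        "false_fix_rate": rate(sum(r.false_fix for r in verified), len(verified)),
        "rollback_rate": rate(sum(r.rolled_back for r in verified), len(verified)),
    }


# Illustrative records only; these are not real AION results.
records = [
    RepairRecord("sql-injection", True, True, False, False),
    RepairRecord("sql-injection", True, True, True, True),
    RepairRecord("hardcoded-secret", True, False, False, False),
    RepairRecord("hardcoded-secret", False, False, False, False),
]
metrics = evaluate(records)
```

Grouping the same computation by `issue_class` would give the per-issue-type view that later reliability work calls for.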
Phase 3: Pre-production autonomy prototype — shipped in v1.0.0¶
- The orchestrator accepts JSON events, queue payloads, inbox items, and webhooks.
- Policy gating decides whether an incident can enter automatic sandbox remediation.
- Repository-level sandbox execution supports project-specific verification commands.
- Release candidate management now covers approval and staged rollout decisions.
Phase 4: Self-evolving engine — shipped in v1.1.0¶
- Drift detection: `snapshot` saves a point-in-time security state; `drift` compares the current codebase against any saved snapshot and reports new incidents, resolved incidents, regressed files, and a numeric health delta.
- Knowledge base: every successful repair is persisted as a `RepairPattern` in `.aion/knowledge/patterns.json`; confidence boosts derived from historical success rates are applied by the `PolicyEngine` before the auto-repair threshold check, closing the self-evolving feedback loop.
- Continuous watch loop: `watch` polls a target directory on a configurable interval, auto-repairs newly detected incidents, and refreshes the baseline after each successful fix.
- Engine status dashboard: `status` shows all saved snapshots and the full knowledge-base summary in a single view.
- Expanded LLM provider support: Gemini, Azure OpenAI, DeepSeek, and Qwen are now supported alongside Anthropic and OpenAI, with the provider auto-detected from available environment variables.
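The knowledge-base feedback loop described above (a boost derived from historical success, applied before the auto-repair threshold check) might look like the sketch below. The boost formula, the cap, the minimum sample size, the threshold value, and the `PatternStats` type are assumptions for illustration; the actual `RepairPattern` schema and `PolicyEngine` logic are not shown in this roadmap.

```python
from dataclasses import dataclass
from typing import Optional

AUTO_REPAIR_THRESHOLD = 0.80  # assumed value, for illustration only


@dataclass(frozen=True)
class PatternStats:
    """Hypothetical per-issue-class history; not the real RepairPattern schema."""
    successes: int
    attempts: int


def confidence_boost(stats: Optional[PatternStats],
                     max_boost: float = 0.15,
                     min_attempts: int = 3) -> float:
    """Scale an assumed maximum boost by the historical success rate."""
    if stats is None or stats.attempts < min_attempts:
        return 0.0  # too little history to justify any boost
    return max_boost * (stats.successes / stats.attempts)


def allows_auto_repair(base_confidence: float,
                       stats: Optional[PatternStats]) -> bool:
    """Apply the boost first, then compare against the threshold."""
    boosted = min(1.0, base_confidence + confidence_boost(stats))
    return boosted >= AUTO_REPAIR_THRESHOLD


# A finding just below the threshold is admitted only with a strong track record.
with_history = allows_auto_repair(0.70, PatternStats(successes=9, attempts=10))
without_history = allows_auto_repair(0.70, None)
```

Capping the boost and requiring a minimum number of attempts keeps a handful of lucky repairs from pushing a weak finding over the threshold, which fits the conservative stance stated in this roadmap.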
Next steps¶
Phase 5: Production adapters¶
The next major work is no longer core modeling. It is integration:
- authenticated webhook and queue adapters for real event sources
- deployment adapters for promotion, rollback, and rollout telemetry
- provider adapters for WAF, gateway, and feature flag execution
- approval and audit integration with external systems
- richer repository test selection instead of only configured command lists
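For the authenticated webhook adapter, one widely used approach is an HMAC-SHA256 signature over the raw request body with a shared secret. The roadmap does not say which scheme AION will adopt, so the sketch below is an assumption; the function names, the secret, and the event payload are hypothetical. It uses only the Python standard library.

```python
import hashlib
import hmac
import json


def sign(secret: bytes, body: bytes) -> str:
    """Return the hex HMAC-SHA256 digest of the raw request body."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def parse_authenticated_event(secret: bytes, body: bytes, signature: str) -> dict:
    """Verify the signature before parsing; raise ValueError on mismatch."""
    expected = sign(secret, body)
    # compare_digest runs in constant time, so it does not leak how many
    # leading characters of the signature matched.
    if not hmac.compare_digest(expected, signature):
        raise ValueError("invalid webhook signature")
    return json.loads(body)


# Illustrative values; a real adapter would load the secret from a secret store
# and read the signature from a request header.
secret = b"example-shared-secret"
body = json.dumps({"event": "finding.created", "path": "app/db.py"}).encode()
event = parse_authenticated_event(secret, body, sign(secret, body))
```

Verifying against the raw bytes before calling `json.loads` matters: re-serializing parsed JSON can change whitespace or key order and invalidate an otherwise correct signature.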
Phase 6: Product proof before scale¶
Before adding more adapters, AION should prove it can reliably reduce security and operations toil for at least one clear ideal customer profile (ICP). The priority is measurable product impact, not feature count.
ICP for the next cycle¶
- Python platform teams maintaining 10-50 services
- Existing Semgrep and CI usage, but no standardized automated remediation
- Need for auditable, approval-based rollout due to compliance requirements
90-day execution plan¶
- Weeks 1-4: Reliability hardening
  - make deterministic repair + verification pass rate transparent per issue type
  - reduce false-fix rate with stricter policy thresholds and rollback defaults
  - ship golden-path examples that run end-to-end in CI
- Weeks 5-8: Integration validation
  - deliver one production-ready webhook adapter and one queue adapter
  - add deployment-system handshake for staged promotion and rollback evidence
  - validate operational latency from event intake to release recommendation
- Weeks 9-12: Adoption proof
  - run 2-3 design partner pilots
  - capture baseline vs. after metrics for MTTR, manual remediation time, and rollback frequency
  - publish a public case study and reference architecture
Go / no-go scorecard¶
Continue investing only if these targets are met by the end of the next cycle:
- at least 2 active pilot teams running AION weekly
- >=30% reduction in median remediation time for supported issue classes
- <=5% verified false-fix rate on pilot repositories
- >=90% of auto-approved candidates either advance cleanly or roll back safely
If fewer than 2 of the 4 targets are met, pause expansion and either narrow the scope to a single high-performing workflow (for example, deterministic repair + verification) or sunset the project.
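The scorecard can be expressed as a small decision function. This is a sketch under stated assumptions: the `CycleMetrics` fields are hypothetical names, the two percentage targets for remediation time and candidate outcomes are read as lower bounds, and the `"review"` result for the case where 2 or 3 targets are met is an assumption, since the text above defines only the all-met and fewer-than-2 outcomes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CycleMetrics:
    """Hypothetical end-of-cycle measurements; field names are illustrative."""
    active_pilot_teams: int
    remediation_time_reduction: float      # fraction, e.g. 0.35 for 35%
    false_fix_rate: float                  # fraction of verified fixes
    clean_advance_or_safe_rollback: float  # fraction of auto-approved candidates


def targets_met(m: CycleMetrics) -> list[bool]:
    """Evaluate the four scorecard targets in the order they are listed."""
    return [
        m.active_pilot_teams >= 2,
        m.remediation_time_reduction >= 0.30,
        m.false_fix_rate <= 0.05,
        m.clean_advance_or_safe_rollback >= 0.90,
    ]


def decide(m: CycleMetrics) -> str:
    """Map the number of targets met to a decision."""
    met = sum(targets_met(m))
    if met == 4:
        return "continue"
    if met < 2:
        return "pause-or-narrow"
    return "review"  # assumption: 2 or 3 targets met is not defined in the text


# Illustrative inputs only; these are not measured results.
decision = decide(CycleMetrics(3, 0.35, 0.04, 0.93))
```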
Guiding principle¶
AION is intentionally conservative. Every step favors deterministic artifacts, auditable state, and reversible rollout decisions over opaque automation.