Why did this AI coding session fail?

Reflect puts prompts, responses, tool and MCP calls, errors, retries, subagents, files, verification, and outcomes on a synchronized local timeline so the first visible failure or recovery gap can be found.

How can I improve the way I work with AI agents?

Reflect finds recurring friction and successful routines in past sessions, then suggests bounded changes to context, prompts, tools, verification, workflows, or skills. The evidence remains available for review before any practice becomes reusable.

Where do tokens and cost go across AI agent work?

Reflect shows observed token usage, cache behavior, estimated cost, model mix, tool activity, failures, retries, context churn, and large-session concentration across one session or the full local history. Cost is shown only where model-pricing evidence is recognized.
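
To make the "cost only where pricing is recognized" rule concrete, here is a minimal sketch. It is not Reflect's implementation: the `PRICING` table, the `Usage` record, the model names, and the prices are all hypothetical, chosen only to show that a session with an unrecognized model is left unpriced rather than guessed.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical per-million-token prices in USD; illustrative numbers only.
PRICING = {
    "model-a": {"input": 3.00, "output": 15.00},
}


@dataclass
class Usage:
    session_id: str
    model: str
    input_tokens: int
    output_tokens: int


def estimate_cost(usage: Usage) -> Optional[float]:
    """Return an estimated cost in USD, or None when the model has no known price."""
    price = PRICING.get(usage.model)
    if price is None:
        return None  # unpriced: leave the cost unknown instead of guessing
    return (
        usage.input_tokens * price["input"]
        + usage.output_tokens * price["output"]
    ) / 1_000_000


def summarize(usages: list) -> dict:
    """Total the priced sessions and count the ones left unpriced."""
    priced = [c for c in map(estimate_cost, usages) if c is not None]
    return {
        "priced_cost": round(sum(priced), 6),
        "priced_sessions": len(priced),
        "unpriced_sessions": len(usages) - len(priced),
    }
```

Under these assumptions, a history that mixes recognized and unrecognized models reports a priced total alongside a count of unpriced sessions, so the total is never mistaken for full coverage.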

How can proven AI practices scale across teams, repositories, and agents?

Reflect turns selected session evidence into reviewable, versioned workflows and skills. Proven practices can then move across agents, repositories, and teams, while later sessions measure whether they actually helped.

What evidence supports increasing AI capacity and budget?

Reflect connects local usage and cost to delivered work, identifies avoidable waste, documents improvement commitments, and prepares an evidence-backed internal request for review.

See How Agents Work. Improve What Happens Next.

Reflect turns AI coding sessions into shared evidence for better procedures, smarter capacity decisions, and reusable practices that scale across agents, repositories, and teams.

Local first
Open source
Vendor neutral
OpenTelemetry native

Explore the Live Dashboard · Install in 60 Seconds

Showcase Dataset

184 sessions · 5 agents

64.8 Avg Quality

49,101 Spans

16,009 Tool Calls

691 Tool Failures

1,554 MCP Calls

$0.12 Priced Cost

227 Subagents

Cost covers sessions with recognized model-pricing evidence.

Agent Activity

Cursor · 23.1K
Claude · 16.1K
Copilot · 9.1K
Gemini · 783

Observed Evidence

See exactly where retries, failures, and context churn slow work down.
Find successful routines worth turning into reusable skills and workflows.
Measure whether an applied improvement helped future sessions.

One View Across

Claude Code
Codex
Cursor
GitHub Copilot
Gemini CLI
OpenCode
Antigravity

From Agent Work to Measured Improvement

Understand the Run. Decide What Improves.

Reflect connects sessions, workspaces, repositories, tools, skills, outcomes, and local memory so evidence can explain what happened, guide better procedures, and show what is ready to scale.

Diagnose Sessions Without Reconstructing Them by Hand.

Follow prompts, responses, tool and MCP calls, failures, retries, subagents, token use, cost, and outcomes on one synchronized timeline—even when evidence arrives through native sessions and hooks.

Repeated tool failure: 3 retries

Context handoff gap: 2 sessions

Visible verification: Missing

See the Evidence Behind the Work.

Move beyond isolated transcripts. Stable hook, native-trace, workspace, repository, MCP, and delegation identities connect sessions to the work they actually performed.

Find Loops That Create Rework.

Detect repeated inputs, failed recovery, broad exploration, late verification, and patterns that consume time without changing state or improving the work.
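
One of the signals named above, repeated inputs, can be sketched in a few lines. This is an illustrative sketch only, not Reflect's detector: the `(tool_name, arguments)` tuple shape and the `min_repeats` threshold are assumptions, and it checks only for identical consecutive calls, not whether state changed between them.

```python
from itertools import groupby


def find_repeated_calls(calls, min_repeats=3):
    """Find runs of consecutive identical tool calls.

    `calls` is a list of (tool_name, arguments) tuples in timeline order.
    Returns one (tool_name, arguments, run_length) tuple for each run
    that is at least `min_repeats` long.
    """
    runs = []
    # groupby collapses adjacent equal items, so each group is one run.
    for key, group in groupby(calls):
        length = sum(1 for _ in group)
        if length >= min_repeats:
            runs.append((key[0], key[1], length))
    return runs
```

A run such as the same test command issued three times in a row would be reported once, with its length, which is roughly the shape of a "3 retries" finding.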

Decide What Becomes Reusable.

Review evidence-backed workflow and skill proposals, inspect the exact change, install approved versions into the right repository or agent, and retain their history.

Measure Whether Improvement Is Real.

Compare later sessions against the original cohort before claiming that a workflow, skill, or rule improved the outcome.
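
A cohort comparison of this kind can be sketched as a before/after difference on one metric. The sketch below is an assumption-laden illustration, not Reflect's method: the session dictionaries, the `retries` metric name, and the use of a simple mean are all hypothetical, and a real comparison would also need to account for cohort size and task mix.

```python
from statistics import mean


def compare_cohorts(baseline, later, metric="retries"):
    """Compare the mean of one metric between two lists of session dicts.

    Returns None when either cohort is empty, because there is not
    enough evidence to claim a change in either direction.
    """
    if not baseline or not later:
        return None
    before = mean(s[metric] for s in baseline)
    after = mean(s[metric] for s in later)
    return {"before": before, "after": after, "delta": after - before}
```

A negative `delta` on a metric like retries would suggest improvement, while a `None` result keeps the claim open until later sessions exist.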

Ask Through Your Agent

Use Reflect Where You Already Work.

Start with a normal engineering or business question. The agent uses Reflect's local evidence to explain the work, recommend follow-up actions, and prepare evidence for personal learning, team decisions, or internal requests.

Agent syntax: For other agents, choose Reflect from the skill picker.

Budget Evidence

Why does our team need more AI budget?

Let the agent connect exact usage to past work, separate productive growth from avoidable waste, recommend improvements, and prepare a reviewed internal request.

$reflect-usage Explain our AI budget need and what should improve.

Session Diagnosis

Why did this session stall or fail?

Open the synchronized conversation, timeline, tool activity, subagents, workspace context, failures, and cost evidence.

$reflect Explain why this session stalled or failed.

Recurring Friction

What should we improve first?

Rank repeated problems by impact, then inspect the exact observations and source sessions behind each recommendation.

$reflect Rank what we should improve first.

Loops to Skills

Can this repeated behavior become reusable?

Review bounded loop evidence and let an agent author one pending workflow packaged as a skill for deliberate approval.

$reflect-skills Find repeated behavior worth turning into a reusable skill.

Repository Guidance

How should an agent work in this repository?

Retrieve reviewed workflows, linked evidence, and folder-scoped memory instead of relying on generic advice.

$reflect Explain how an agent should work in this repository.

Memory Recall

What have we already learned here?

Search local, provenance-aware memory for a repository decision, release practice, debugging path, or prior constraint.

$reflect Find what we already learned about the release gate.

Instrumentation Health

Are the agents actually observable?

Check hook installation, native telemetry, local files, gateway state, detected agents, pricing, and integration support.

$reflect Check whether agent telemetry and integrations are healthy.

Personal Insight to Organizational Learning

Reflect on One Session. Improve How AI Work Scales.

Start with a personal question about one session, then follow the same local evidence into shared procedures, capacity decisions, and practices that scale across the organization.

Reflect · Local evidence connected

Evidence, Not Vibes.

Personal reflection

Organizational reflection

You

Why did my AI coding agent hit its token budget limit?

Reflect · Local evidence

Reflect breaks down observed input, output, cache creation, and cache-read tokens by session and model, then connects that usage to tool and MCP calls, retries, context churn, continuations, and outcomes. When the evidence cannot prove an exact cause, Reflect keeps the attribution unknown instead of inventing one.

Token breakdown · Retries · Context
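
The per-session, per-model breakdown described in this answer can be sketched as a simple aggregation. This is a hypothetical illustration, not Reflect's data model: the record keys (`session_id`, `model`, and the four token kinds) are assumed names, and a record with no model is grouped under `"unknown"` rather than being assigned to a guessed model.

```python
from collections import defaultdict

# Assumed names for the four token kinds mentioned in the text.
TOKEN_KINDS = ("input", "output", "cache_creation", "cache_read")


def token_breakdown(records):
    """Sum token counts per (session_id, model) pair.

    Each record is a dict with a `session_id`, an optional `model`,
    and optional integer counts for any of TOKEN_KINDS.
    """
    totals = defaultdict(lambda: dict.fromkeys(TOKEN_KINDS, 0))
    for record in records:
        # Keep attribution unknown when the model was not observed.
        key = (record["session_id"], record.get("model") or "unknown")
        for kind in TOKEN_KINDS:
            totals[key][kind] += record.get(kind, 0)
    return dict(totals)
```

Grouping unattributed usage under an explicit `"unknown"` key keeps it visible in the totals without claiming a cause the evidence does not support.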


How It Works

Private by Default. People Stay in Control.

Connect once, ask through your agent, and use the visual UI whenever you want the complete evidence and review path.

01 / Capture

Connect Your Agents.

Reflect configures local OTLP collection and reads supported native session stores without forcing every agent through one vendor gateway.

reflect setup

02 / Ask

Work Through Your Agent.

Ask natural questions about usage, failures, repeated work, repository guidance, and improvement. Reflect gives the agent bounded, provenance-aware evidence.

Use Reflect to explain this work.

03 / Review

Explore the Evidence Visually.

Open the UI to inspect sessions and costs, review exact workflow or skill changes, approve or reject them, and see their later impact.

reflect

Local-First Architecture

Your Engineering Telemetry Stays Yours.

Reflect keeps its evidence graph and workflow ledger in local SQLite. Its read-only MCP gives agents scoped context with provenance, while optional memory providers such as OMEGA remain independently installed and clearly separated from Reflect-verified evidence.

Local by default: Traces, session stores, the SQLite ledger, and dashboard data remain on your machine.
Text capture is optional: Use metadata-only telemetry or opt into prompt and response capture deliberately.
OpenTelemetry native: Keep your own collector, backend, retention, and governance model.
Agent-ready context: Expose a bounded task lifecycle, six read-only MCP inspection tools, and approval-gated changes.

Memory, With Boundaries

Bring Your Memory Provider.

Reflect keeps evidence and provenance in its own local ledger, then connects to optional memory systems without confusing external recall with verified session facts.

Connected Providers

Available for configured memory operations

Local SQLite · Default source of truth · Built in
OMEGA · Local semantic memory · Integrated
Agent Memory · Generic HTTP memory · Connected
LiteLLM · Proxy memory endpoint · Connected
Memory Palace · Compatible HTTP memory · Connected

Discovery Adapters

Installation and health visibility in this release

Mem0 · Health and discovery · Discovery only
Graphiti · Health and discovery · Discovery only
TencentDB Agent Memory · Health and discovery · Discovery only

Start Locally

From Install to Agent-Ready Evidence in 60 Seconds.

Connect Reflect once, ask through your coding agent, and open the visual dashboard whenever you want the complete evidence and review path.

Read the Documentation · View on PyPI

Quick Start

$ pipx install o11y-reflect
$ reflect setup

# Then ask in your coding agent
› Use Reflect to explain why our team needs more AI budget.

# Open the visual overview
$ reflect

Ask. Review. Improve.

Ask Through Your Agent. See the Evidence in Reflect.

Get the immediate explanation in the conversation, then use the visual UI to inspect the source evidence, control what becomes reusable, and measure what helped.

Explore the Live Dashboard · View reflect on GitHub