Engineering Status¶
Last updated: 2026-06-13
Status: v1 released. Two auth modes (Agent User / Delegated) running locally on macOS, Linux, and ARM64 Windows 11. 1,400 passing tests across the suite (1 skipped), ruff clean. Body-first prompt architecture loads at boot; persona-sati MCP wires personality and memory when configured. ADR-005 cloud-memory Phases 1, 2, 5, 6a shipped — blob storage is opt-in via setup.sh --use-cloud-memory. Work IQ Word migration landed (PR #75) and now emits fail-closed audit events for every Work IQ MCP tool call. The send_teams_message auto-wait pattern is host-gated and deterministic. Confused-deputy authorization fix in add_teams_member / share_file shipped via active-sponsor-channel binding (Gate 3) on 2026-06-04. The Teams Bot Gateway mode was removed on 2026-06-08 (ADR-006) — it bypassed Agent Identity and was superseded by Microsoft Agent 365's managed AI teammate. README, docs site, and GitHub Pages auto-deploy refreshed 2026-05-21.
In Progress¶
Source of truth for detail: TODOS.md in the repository root. One line each below.
- Follow-up: two-phase sponsor confirmation flow — Gate 3 closes Chain A. Chain B (prompt injection from
read_filecontent where sponsor is genuinely active in target chat) needs explicit per-action sponsor approval. SeeTODOS.mdP1. - Follow-up:
read_filecontent spotlighting — broader prompt-injection mitigation than the Gate 3 fix. SeeTODOS.mdP1. - Script-toolkit docs closeout —
./status.shis the canonical entry; finish the remaining script-reference polish and smoke verification. SeeTODOS.mdP1. - Test isolation: blob env leakage —
tmp_data_dirfixture intests/tools/test_interaction_log.pydoesn't clearENTRABOT_BLOB_ENDPOINT; 10 tests fail on any machine with blob env configured. Partially addressed:test_interaction_log.py,test_daily_summary.py, andtest_email_poll.pyfixtures now unset blob env; session-scoped autouse fixture still open. - MCP server orphans on Claude Code exit — background poll tasks sit outside FastMCP's lifespan cancel scope; new sessions spawn a second server, both poll Graph independently.
- Daily summary scheduler — wrong day + double-fire — UTC-based
target_daysummarizes the brand-new UTC day at 5pm PDT; scheduler fired twice at the same second on 2026-04-17.
Recently Shipped¶
Last ~30 days. Full diff: git log --since="2026-05-04".
- A365 Work IQ audit attribution (2026-06-13, branch
security/a365-audit-attribution) —WorkIqProvider.call_toolnow logs pending/success/failure audit events around every Work IQ MCP call before touching customer SharePoint/OneDrive/Word resources. Audit metadata records only{server, tool}— never argument keys or values — and audit failure prevents the MCP call. Resource handle is a stablea365.{server}.{tool}string; operators correlate by action+timestamp+agent_id and walk over to Graph server-side logs for document-level detail. +6 tests intests/a365/test_provider.py. - Teams chat poll cursor persistence (issue #17) (2026-06-09) — per-chat poll cursor (
last_ts,seen_ids_tail,bootstrapped) now persists throughMemoryBackendatchat_cursors/<chat_id>.json. Fixes the "11-day-old replay flood" symptom — every MCP restart used to re-bootstrap from "newest message at boot" and silently drop messages that arrived during a server-down window. 24-hour staleness cap onlast_tsre-baselines genuinely-old chats instead of surfacing stale messages as live. Debounced 1s async save coalesces bursts; graceful shutdown flushes dirty cursors. New modulesrc/entrabot/tools/chat_cursors.py. +35 tests acrosstests/tools/test_chat_cursors.pyandtests/test_mcp_server_chat_cursors.py. - Confused-deputy fix: active-sponsor-channel binding (Gate 3) (2026-06-04, branch
fix/msrc-active-sponsor-channel-binding) — closes Chain A inadd_teams_memberandshare_file. NewActiveChannelBindingsstore keyed by Graphuser_id, TTL ongraph_sent_at(not server-observed time) to defend bootstrap-replay, updated only afterwrite_stream.send()succeeds.share_filerefactored to audit-first so gate failures emit audit events. Audit metadata records bothsupplied_chat_idandbound_chat_id. +50 tests acrosstests/identity/test_active_channel.py,tests/test_mcp_push_channel_binding.py,tests/tools/test_add_member_channel_binding.py,tests/tools/test_share_file_channel_binding.py. Hard-won learning #67. Follow-up: two-phase confirmation for Chain B (tracked in TODOS P1). read_emailMCP tool (2026-05-27) — fetches the full body + all recipient lists + headers of an inbound mail bymessage_id. Fixes the gap where the 60s email-poll channel push truncates the preview of long forwarded mails. Same three-hop Agent User token +Mail.Readscope as the poll. +7 tests.- Email cursor sub-second precision (2026-05-27) —
advance_cursor()bumps the poll watermark by 1 ms so Graph'sgtfilter does not re-fetch messages at the cursor's exact second after a server restart. - README + docs-site refresh (2026-05-21, ff9a8dd, 9b73dee, b495073) — developer-first README rewrite, GitHub Pages auto-deploy, nav restructure.
- OSS sanitization passes (2026-05-21, f2a3c18; 2026-05-18, 6cff243) — PII scrub, personal data and private identifiers removed from repo.
- Script toolkit refactor + E2E smoke harness (2026-05-19, PR #77) —
./status.shconsolidated;setup.sh --statusdelegates to the same implementation. - Sponsor DM wait — host-gated fix (2026-05-19, 905b7d0, 26aa647) —
wait_for_sponsor_dmno longer blocks Claude Code sessions; channel push is the path on hosts that support it. Learning #66. - Targeted agent identity teardown + setup hardening (2026-05-15, f21cf82, c47552b) — granular teardown without nuking the Blueprint; identity consolidation no longer races on partial state.
- Work IQ Word migration (2026-05-15, PR #75) — Word create/read/comment/reply now routes through Microsoft Agent 365's Work IQ Word MCP server; Graph beta
/commentsis no longer the comment-reply path. - Persona-sati host bootstrap + Entra sponsor authority (2026-05-02, PR #72) —
bootstrap_session()returns the assembled mind contract in one call; sponsor allowlist resolved via Entra. - Server-side placeholder + commitment-language discipline hooks (2026-05-04, PR #74) — outbound Teams text gets server-side commitment detection; placeholders post + resolve cleanly.
- Files MCP — share gate inverted + PR2 author/upload/share (2026-04-30, PR #69, PR #64) — sponsor requester required, recipient unrestricted; write + upload + share to sponsor tools land.
Open Issues¶
add_file_comment Word/Excel — Graph beta /comments returns 404¶
Work IQ Word migration shipped on main (PR #75, 2026-05-15) and live smoke passed; Graph beta /comments is no longer the Word-comment reply path. The legacy add_file_comment against .docx files still 404s — the endpoint family is SharePoint list-item metadata comments, not document-content comments. For .xlsx the correct surface is /workbook/comments, not addressed.
Fix tracked in: docs/runbooks/hard-won-learnings.md Learning #60; Work IQ pivot in PR #75 covers the Word path.
CLI commitment-language detection — unenforced¶
Server-side commitment detection ships as part of the send_teams_message outbound hooks, but only fires on outbound Teams text. Commitments uttered to the operator in the host terminal ("I'll batch this up later") never reach the MCP server and silently drift. Server-side enforcement is host-portable; CLI enforcement would require host-specific hooks. We chose host-portable coverage of the Teams path over Claude-Code-only coverage of both paths.
Fix tracked in: scripts/hooks/README.md "Known coverage gaps"; body-prompt strengthening in prompts/anatomy/channel-discipline.md is the current mitigation.
Persona-sati 12h MCP refresh bug — PR #47 paused at Blueprint constraint¶
Every ~12 hours, Claude Code's cached MCP bearer expires and persona-sati tools start returning Zod schema errors until restart. Draft PR persona-sati#47 (550/550 tests pass) implemented OIDC discovery + PRM shim, but the live OAuth flow at the 12h boundary is blocked: Microsoft's Agent Blueprint app type — which the Persona-Sati Blueprint uses — cannot have public-client redirect URIs and cannot be flipped to fallback-public-client mode. Tenant state reverted; no behavioral change for cert-based three-hop or OBO. Possible resolutions: separate Entra app reg for the MCP client (Phase 2A), persona-sati implements OAuth 2.1 itself (Phase 2B), or land #47 as Phase 1 only.
Fix tracked in: persona-sati#47; docs/platform-learnings/agent-id-blueprints-and-users.md for the platform constraint.
Agent Identity missing Application.Read.All after provisioning¶
wait_for_sponsor_dm and sponsor-gated flows fail with 403 Authorization_RequestDenied when calling /servicePrincipals/{id}/microsoft.graph.agentIdentity/sponsors. Root cause: scripts/create_entra_agent_ids.py doesn't grant Application.Read.All to the Agent Identity service principal. Workaround applied manually on the Windows VM via New-MgServicePrincipalAppRoleAssignment.
Fix tracked in: Add Application.Read.All grant to create_entra_agent_ids.py. Partially addressed in 45bec0f (Windows port acceptance); full provisioner fix still pending across platforms.
Architecture Snapshot¶
The agent talks to the MCP server over stdio. The server reads the Blueprint's private key from the OS keystore, walks the three-hop chain to produce a delegated user token, and uses that token for every Graph and Work IQ call. Inbound Teams messages and emails arrive via background polls and push into the client as channel notifications. Operational state lives locally by default or in Azure Blob Storage scoped to the Agent User's object ID when cloud memory is enabled.
Blueprint (client_credentials)
→ Agent Identity (FIC exchange)
→ Agent User (user_fic grant, idtyp=user)
→ Graph API: Teams, Mail, OneDrive
→ Azure Blob Storage (parallel third hop, ADR-005 Phase 5)
┌─────────────────────────────────────────────────────────┐
│ Local Device (Mac / Windows / Linux) │
│ │
│ ┌──────────────────────────────────────────────────┐ │
│ │ Claude Code / Copilot CLI (MCP Client) │ │
│ │ └── stdio + channels ────┐ │ │
│ └────────────────────────────┼─────────────────────┘ │
│ │ │
│ ▼ │
│ ┌──────────────────────────────────────────────────┐ │
│ │ Entrabot MCP Server (Python) │ │
│ │ │ │
│ │ Body prompt: agent_system.md + anatomy/*.md │ │
│ │ + Persona (optional): persona-sati /sse │ │
│ │ │ │
│ │ send_teams_message ───▶ Graph API (Agent User) │ │
│ │ read_teams_messages ──▶ Graph API (Agent User) │ │
│ │ whoami ───────────────▶ cached state │ │
│ │ audit_log ────────────▶ interaction log │ │
│ │ │ │
│ │ Background: Teams 5s, email 60s, discovery 120s,│ │
│ │ daily summary 5pm PDT │ │
│ │ │ │
│ │ Tokens: Agent User (three-hop, idtyp=user) │ │
│ └──────────────────────────────────────────────────┘ │
└───────────┬──────────────────────────┬──────────────────┘
│ │
▼ ▼
┌───────────────┐ ┌──────────────┐
│ Entra ID │ │ Graph API │
│ Agent IDs │ │ Teams Chat │
│ Agent Users │ │ Mail / Drive │
└───────────────┘ └──────────────┘
What Works (Shipped Capabilities)¶
- End-to-end:
setup.sh→ MCP server → Teams message delivery - Three-hop Agent User token flow (Blueprint → Agent Identity → Agent User,
idtyp=user) - Agent User creation, license assignment, and consent grant via Graph
- Dedicated provisioner app (avoids Azure CLI token rejection)
- State persisted in
.entrabot-state.json(idempotent setup, no secret reset) - Certificate auth for Blueprint — private key in OS keystore, no secrets on disk (ADR-003)
- Token auto-refresh: eager (55-min) + lazy (401 retry) for all tools
- Bidirectional Teams channel — background polling +
notifications/claude/channelpush, 2s overlap dedup window - 429 rate-limit handling with
Retry-Afterpropagation - Multi-user group chat support and cross-tenant federated B2B chats (auto-detects guest UPN, resolves home tenant via OpenID discovery)
add_teams_member— add users to a chat at runtime- Two auth modes:
agent_user,delegated(MSAL interactive + device-code fallback) - Progressive identity state machine:
UNAUTHENTICATED → DELEGATED → PROVISIONING → AGENT_USERwithasyncio.Lock-protected transitions - Identity-aware user ID —
_effective_user_id()returns the right object ID per mode - Body-first prompt architecture —
@includeexpansion ofprompts/anatomy/*.md, non-overridable security and channel discipline - Persona-sati MCP integration —
bootstrap_session()returns the assembled mind contract in one call; clean fallback when not configured - Adaptive Cards —
send_cardwithtool_activity,task_status,build_resulttemplates - Microsoft Agent 365 Work IQ Word — create, read, comment, reply-to-comment
- Files MCP — SharePoint / OneDrive read, write, upload, share (two-gate sponsor authorization on share)
- Email — background poll with Purview-encrypted detection, daily summary at 5pm PDT
- Promises —
add_promise/list_promises/resolve_promisebacked by entrabot blob, ETag concurrency, identity-scoped - Storage backends —
LocalBackend(default) andBlobBackend(opt-in viasetup.sh --use-cloud-memory) - ARM64 Windows 11 acceptance — full CNG signing via TPM-backed cert, three-hop flow live against Entra, Copilot CLI MCP registration, Teams DM round-trip
- 66 hard-won learnings documented in
docs/runbooks/hard-won-learnings.md
Test + Lint Discipline¶
1,237 tests collected. pytest -v && ruff check . must pass before every commit. Coverage threshold is 80% via --cov-fail-under=80. Background poll loops, identity transitions, three-hop token mints, and outbound discipline hooks are covered by integration tests with respx-mocked Graph endpoints.