Daniel Lepel | M365 & Azure Architect

An Agent Architecture Case Study

A production agent system, built on Microsoft Graph and Claude Agent SDK, running continuously for ten weeks against a real operating problem.

This is a long-form technical case study of an agent architecture I built and run. It is a production system, not a demo. The numbers are real. The code is shipping. The problem domain is my own job search, which gave me a ninety-day forcing function and the fastest possible feedback loop. The patterns behind it are the ones any organization moving infrastructure teams into agent territory is going to keep running into: grounding, skill isolation, deterministic state, writable memory, and operator integration.

Production agent architecture | Continuous operation since Feb 6, 2026 | Claude Agent SDK · Microsoft Graph · Python

Section 1 | Positioning

I built the agent architecture enterprises are about to need.

My twenty years are in Microsoft infrastructure. Tenants, identity, endpoints, Exchange, the plumbing. In February my position at National Business Technologies was eliminated in a reduction in force, and I had ninety days to replace a principal-level role. I treated it as a forcing function. Rather than running a job search on spreadsheets and manual follow-ups, I rebuilt the whole thing as an agent system and became my own Customer Zero.

Today it runs my pipeline end to end. 145 opportunities tracked, six active interview threads, continuous operation since February 6. Scheduled jobs run unattended on cron, regenerating the dashboard, classifying new email, and scraping job boards without a session open. Inside the session, it scans my inbox via Graph, drafts recruiter replies in my voice, tailors resumes, generates interview prep documents grounded in confirmed experience, and refuses to fabricate anything the source material does not support. It is the same pattern enterprises now need for tenant assessment, license optimization, oversharing review, and agent governance.

The reason this is worth a long read is specific. Most Microsoft infrastructure engineers have not yet shipped a production agent. I have. This document is how it is built, why the choices are the ones that transfer to a customer engagement, and where the pattern is directly portable to the work a Microsoft cloud practice is already being asked to do.

Section 2 | By the numbers

The system is live, not a demo.

These numbers come from the running system. Every row is an opportunity I acted on. Every folder holds drafted correspondence, screening notes, and prep artifacts the agent produced or helped produce.

Metric	Value	Detail
Opportunities managed	145	Feb 6 to Apr 16, one operator
Active interview threads	6	Running concurrently
Opportunity folders	129	JD, resume, prep, correspondence
Days of continuous run	about 70	Autonomous morning digest and triage
Specialist skills	16	Routed by task signature
Python services	15	Graph API, tracker, digest, health
Scheduled and startup jobs	10	6 Cowork cron, 4 Windows Task Scheduler
Context files	17	Confirmed experience, rules, lessons

Every metric above is queryable from the running system. The tracker is an .xlsx with a regenerated HTML view. The skills live in a plain directory the agent reads at session start. The run counts come from the action log.

Section 3 | Capabilities

What it actually does, in operator terms.

Inbox triage with accountability

Paginates the Outlook inbox via Graph. Finds recruiter replies, application confirmations, interview invites, and Department of Labor (DOL) mail. Never relies on keyword search because staffing firms use unpredictable senders. Writes every new recruiter contact to a tracker row in the same pass.
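A minimal sketch of the pagination pass, assuming the standard Microsoft Graph paging shape: each response carries a `value` array and, while more pages remain, an `@odata.nextLink` URL. `fetch_page` is a hypothetical stand-in for the authenticated call (the ms365 connector or graph_api_helper.py in the real system); its signature is an assumption, not the shipped code.

```python
from typing import Callable, Iterator

# Starting URL for the signed-in user's inbox (Microsoft Graph v1.0).
INBOX_URL = "https://graph.microsoft.com/v1.0/me/mailFolders/inbox/messages?$top=50"


def iter_messages(fetch_page: Callable[[str], dict], url: str = INBOX_URL) -> Iterator[dict]:
    """Yield every message, following @odata.nextLink until it is absent.

    fetch_page is a hypothetical stand-in for an authenticated GET that
    returns the decoded JSON body for a URL.
    """
    while url:
        page = fetch_page(url)
        yield from page.get("value", [])
        url = page.get("@odata.nextLink")  # None on the last page ends the loop


# Usage with a fake two-page response, so the sketch runs without network access.
_fake_pages = {
    INBOX_URL: {"value": [{"id": "1"}, {"id": "2"}], "@odata.nextLink": "page2"},
    "page2": {"value": [{"id": "3"}]},
}
all_ids = [m["id"] for m in iter_messages(_fake_pages.__getitem__)]
```

Walking every page is what makes the triage independent of sender or subject keywords: classification sees each message, not a search result.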

Draft generation in operator voice

Produces Outlook drafts directly via Graph instead of asking the operator to copy and paste. Embeds the full signature block inline because the Outlook auto-signature is disabled. Handles decline, interest, redirect, and thank-you flows with separate templates.
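As an illustration, a draft can be created in Graph by sending a message body to `POST /me/messages`. The sketch below only builds that JSON payload with the signature appended inline. The signature text, function name, and paragraph handling are assumptions for illustration, not the shipped templates.

```python
import html

# Hypothetical signature block; the real one lives in the operator's rules.
SIGNATURE_HTML = "<p>Daniel Lepel<br>Principal Microsoft Cloud Architect</p>"


def build_draft_payload(to_address: str, subject: str, body_text: str) -> dict:
    """Build the JSON body for POST /me/messages, which creates a draft in Graph."""
    # Blank-line-separated paragraphs become <p> elements; text is HTML-escaped.
    paragraphs = "".join(
        f"<p>{html.escape(p)}</p>" for p in body_text.split("\n\n") if p.strip()
    )
    return {
        "subject": subject,
        "body": {"contentType": "HTML", "content": paragraphs + SIGNATURE_HTML},
        "toRecipients": [{"emailAddress": {"address": to_address}}],
    }


payload = build_draft_payload(
    "amorgan@fabrikam.example",
    "Re: Principal M365 Architect",
    "Thanks, Alex.\n\nFriday works.",
)
```

Because the payload is never sent by this code path, the operator still reviews and sends every message from Outlook.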

Grounded interview prep

Builds company and role prep docs keyed to the job description and the operator's confirmed experience. Refuses to claim tools the operator has not used. Spells out every acronym on first use. Includes a meeting-join button pulled from the calendar invite.

Posting screener

Reads full job descriptions before queueing anything. Checks that the core platform is Microsoft. Verifies the posting link is live. Rejects with tracker evidence so the record stays auditable for the DOL.
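The screener's checks can be sketched as a function that returns a verdict plus the evidence behind it. Keyword matching here is a deterministic stand-in for the screening judgment, which the real system leaves to the model. The marker list, class names, and field names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Posting:
    title: str
    description: str
    link_is_live: bool  # assumed to be checked upstream, e.g. by an HTTP request


@dataclass
class Verdict:
    accepted: bool
    evidence: list = field(default_factory=list)  # reasons to write to the tracker row


# Hypothetical platform markers; the real criteria are not in the source.
MICROSOFT_MARKERS = ("microsoft 365", "m365", "azure", "entra", "intune", "exchange")


def screen(posting: Posting) -> Verdict:
    """Apply the screener checks and keep the reason for each failure."""
    evidence = []
    text = posting.description.lower()
    if not any(marker in text for marker in MICROSOFT_MARKERS):
        evidence.append("core platform is not Microsoft")
    if not posting.link_is_live:
        evidence.append("posting link is dead")
    return Verdict(accepted=not evidence, evidence=evidence)


verdict = screen(Posting("Cloud Engineer", "AWS and GCP landing zones", link_is_live=True))
```

Returning the evidence with the decision is what keeps a rejection auditable later.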

Resume tailoring with provenance

Variants per role, built from a branded docx template. Strips AI and tool metadata from docProps XML and runs pikepdf on the PDF before delivery. Every change is traceable to the source bullet in the master resume.
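One way to blank author and tool fields in a .docx, sketched with the standard library only. A .docx is a zip archive, and its metadata sits in `docProps/core.xml` and `docProps/app.xml`. Which fields to clear, and the `dc:` and `cp:` prefixes, are assumptions based on what Word conventionally writes. The pikepdf pass on the PDF is not shown.

```python
import io
import re
import zipfile

# Elements blanked in docProps/core.xml and docProps/app.xml. The prefixes
# (dc:, cp:) are the conventional ones Word writes; this is an assumption.
SCRUB_TAGS = ("dc:creator", "cp:lastModifiedBy", "Application", "Company")


def scrub_docx_metadata(docx_bytes: bytes) -> bytes:
    """Return a copy of a .docx with author and tool fields emptied."""
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as src, \
         zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as dst:
        for item in src.infolist():
            data = src.read(item.filename)
            if item.filename.startswith("docProps/") and item.filename.endswith(".xml"):
                xml = data.decode("utf-8")
                for tag in SCRUB_TAGS:
                    # Keep the element, drop its text, so the part stays well-formed.
                    xml = re.sub(rf"(<{tag}>).*?(</{tag}>)", r"\1\2", xml, flags=re.S)
                data = xml.encode("utf-8")
            dst.writestr(item.filename, data)  # every other part is copied unchanged
    return out.getvalue()
```

Emptying the elements rather than deleting the parts leaves the package relationships intact, so the file should still open normally.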

Scheduled execution

Six cron tasks run unattended via the Cowork scheduled-tasks MCP: morning digest (06:00), dashboard and briefing (06:00 weekdays), email classification (09:00 weekdays), LinkedIn sweep (07:00 weekdays), Dice and Indeed sweep (07:15 Mon/Thu), session-usage monitor (22:00). Four more run via Windows Task Scheduler: the action and chatbot API servers start at login, rclone backup fires nightly at 23:00, and a toast notifier waits on a manual trigger. Sessions open onto work already done.

Morning digest and dashboard

Python service regenerates a digest page, a command-center page, and an actions page. Runs on a Cowork cron schedule at 06:00 ET. The operator opens one URL in the morning and sees the whole pipeline.

Section 4 | Architecture

How the pieces fit.

Natural-language in, deterministic state out.

The architecture has four tiers: a natural-language entry point, a skill router, a tool layer that mixes Model Context Protocol (MCP) connectors with Python services, and a state layer on disk. Probabilistic work stays in the model. Deterministic work stays in Python. The two halves never pretend to be each other.

Tier 1 | Interface

Natural language, voice or text

Operator gives a request in plain English. No DSL, no form. The session-start skill fires on every new session and gates all other work until the inbox is scanned and the tracker is current.
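The gate can be pictured as a small piece of deterministic state that every other skill checks first. This is a sketch under assumptions: the class and method names are hypothetical, and in the real system the gate is enforced through the session-start skill rather than a Python class.

```python
class SessionGateError(RuntimeError):
    """Raised when downstream work is requested before the gate has cleared."""


class SessionGate:
    """Blocks every skill except session-start until upstream state is verified."""

    def __init__(self) -> None:
        self.inbox_scanned = False
        self.tracker_current = False

    def require_open(self, skill: str) -> None:
        if skill == "session-start":
            return  # the gate skill itself is always allowed to run
        checks = (("inbox scan", self.inbox_scanned),
                  ("tracker refresh", self.tracker_current))
        missing = [name for name, done in checks if not done]
        if missing:
            raise SessionGateError(f"{skill} blocked; pending: {', '.join(missing)}")


gate = SessionGate()
try:
    gate.require_open("resume-work")
    blocked = False
except SessionGateError:
    blocked = True

gate.inbox_scanned = gate.tracker_current = True
gate.require_open("resume-work")  # no exception once both checks have passed
```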

▼

Tier 2 | Skill Router

16 specialist skills, markdown-defined

Each skill is a markdown file with a description field the model matches against the request. The router picks one or more skills based on task signature. Skills are isolated: the resume-work skill cannot rewrite the email-and-drafts skill. This is the same pattern Claude Agent SDK uses for production agents.
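To make the routing idea concrete, here is a toy router that scores each skill description against the request by shared words. In the real system the model does this matching; word overlap is only a deterministic stand-in, and the descriptions below are hypothetical, not the contents of the actual skill files.

```python
import re

# Hypothetical descriptions; the real ones live in each skill's markdown file.
SKILLS = {
    "email-and-drafts": "scan inbox, classify recruiter email, write outlook drafts",
    "resume-work": "tailor resume variants from the master resume",
    "interview-prep": "build interview prep documents for a scheduled interview",
}


def _words(text: str) -> set:
    """Lowercased alphanumeric tokens of a string."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def route(request: str, skills: dict = SKILLS, min_overlap: int = 2) -> list:
    """Return skill names whose description shares enough words with the request."""
    wanted = _words(request)
    scored = [(len(wanted & _words(desc)), name) for name, desc in skills.items()]
    return [name for score, name in sorted(scored, reverse=True) if score >= min_overlap]


picked = route("Tailor my resume for the Fabrikam role")
```

The router only reads descriptions and returns names, so nothing in this path can alter a skill's behavior.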

▼

Tier 3a | MCP connectors

ms365 | dice | indeed | microsoft-learn

Graph API access for mail, calendar, SharePoint, OneDrive. Job board search. Microsoft documentation lookup. The ms365 connector was added April 2 after the outlook-assistant MCP was deprecated.

Tier 3b | Python services

Deterministic work

action_server.py, chatbot_server.py, generate_morning_digest.py, generate_tracker_html.py, graph_api_helper.py. 15 services total. Anything that needs to be the same every time runs here, not in the model.

→

Tier 4 | State

Disk, not a database

Job_Tracker.xlsx is the spine. action_log.json, sent_log.json, action_queue.json track the agent's own work. MEMORY.md and a memory/ directory hold persistent facts across sessions.
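When state lives in JSON files on disk, the main risk is a half-written file after a crash. A minimal sketch of an append that swaps the file in atomically is below. The entry schema (`ts`, `action`, `detail`) and the function name are assumptions, not the actual layout of action_log.json.

```python
import json
import os
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def append_action(log_path: Path, action: str, detail: str) -> list:
    """Append one entry to a JSON log and replace the file atomically."""
    entries = json.loads(log_path.read_text(encoding="utf-8")) if log_path.exists() else []
    entries.append({
        "ts": datetime.now(timezone.utc).isoformat(),
        "action": action,
        "detail": detail,
    })
    # Write to a temp file in the same directory, then swap it into place so a
    # crash mid-write never leaves a half-written log behind.
    fd, tmp_name = tempfile.mkstemp(dir=str(log_path.parent), suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as tmp:
        json.dump(entries, tmp, indent=2)
    os.replace(tmp_name, log_path)
    return entries
```

Rewriting the whole file on each append is fine at this scale; a log growing by orders of magnitude would want an append-only format instead.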

Ground truth

CLAUDE.md and context/

The agent reads CLAUDE.md at session start. Identity rules, compensation floors, writing rules, thoroughness rules. 17 context files hold confirmed experience. Anything not in here is not a claim the agent will make.
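The grounding rule reduces to a simple check: a claim is allowed only if the confirmed-experience text supports it. The sketch below uses a case-insensitive substring test as a stand-in. The function name and the sample text are hypothetical, and a real check would load the context files rather than a literal string.

```python
def find_unsupported(claimed_tools: list, confirmed_text: str) -> list:
    """Return the claimed tools that never appear in the confirmed-experience text."""
    haystack = confirmed_text.lower()
    return [tool for tool in claimed_tools if tool.lower() not in haystack]


# Hypothetical excerpt of a context file and a draft's tool claims.
confirmed = "Twenty years of Exchange, Intune, and Entra ID administration."
gaps = find_unsupported(["Exchange", "Kubernetes", "Intune"], confirmed)
```

Anything returned in `gaps` is either dropped from the draft or flagged to the operator.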

Legend: Deterministic = Python | Probabilistic = Model + skill

Section 4.5 | The agent map

All sixteen skills, drawn three ways.

Structure, sequence, and system stack.

The four-tier block above is the bones. The three visuals below are the muscle. Constellation shows what each skill does and which ones share a tool. Swim lane shows how they execute in order during a normal morning run. Layered stack shows every moving part named in one place.

Map 1. Skill Constellation

Structure

Main Agent in the center. Sixteen specialist skills grouped into four functional domains. The tool line under each cluster name is the primary MCP or Python library the skills in that domain invoke.

Map 2. Daily Workflow

Sequence

A normal weekday morning, left to right. Each step is one skill acting on state produced by the step before it. No step is skipped. The session gate on the far left forbids any downstream work until the inbox scan and tracker refresh are complete.

Session Gate

session-start

Reads CLAUDE.md. Loads rules, identity facts, comp floors. Refuses to do anything else until the inbox has been scanned.

State: CLAUDE.md | context/

Inbox Triage

email-and-drafts

Paginates Outlook via Graph. Finds recruiter replies, ATS confirmations, interview invites, DOL mail. Stages drafts for each new recruiter contact.

Tool: ms365 MCP

Draft Generation

email-and-drafts | post-screen-thankyou

Writes Outlook drafts directly. Embeds signature inline. Selects template by flow: decline, redirect, interest, thank-you.

Tool: ms365 MCP | create-draft-email

Prep Sync

interview-prep

For any confirmed interview on the calendar, rebuilds the prep doc. Pulls Teams join link. Adds it as a button on the doc and on the dashboard.

Tool: ms365 calendar | python-docx

Tracker Regen

tracker-and-dashboard

Any change to Job_Tracker.xlsx triggers generate_tracker_html.py. The HTML view is never edited by hand.
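The regeneration step is deterministic: the same rows always produce the same HTML. A minimal sketch of that rendering is below. It takes rows that are already loaded (the real generate_tracker_html.py reads Job_Tracker.xlsx), and the function name and bare table markup are assumptions.

```python
import html


def render_tracker_html(rows: list) -> str:
    """Render tracker rows (list of dicts sharing the same keys) as an HTML table."""
    if not rows:
        return "<table></table>"
    columns = list(rows[0])
    head = "".join(f"<th>{html.escape(c)}</th>" for c in columns)
    body = "".join(
        "<tr>" + "".join(f"<td>{html.escape(str(r[c]))}</td>" for c in columns) + "</tr>"
        for r in rows
    )
    return f"<table><tr>{head}</tr>{body}</table>"


page = render_tracker_html([{"Company": "Fabrikam & Co", "Stage": "Applied"}])
```

Escaping every cell matters because company names and recruiter notes come from outside the system.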

Tool: openpyxl | Python

Morning Digest

action-queue | scheduled task

Renders a digest page with open actions, interview schedule, proactive leads, and pending DOL items. One URL for the whole pipeline.

Tool: Python service | Cowork cron

Map 3. Layered System Stack

System

The whole thing as a stack. Operator on top, disk state on the bottom, request flow downward, state flow upward. Every named component in the running system appears somewhere on this map. Orange outlines are MCP connectors. Gold fills are skill boundaries. Italic caption text is the underlying tool.

Layer 1 | Operator

Daniel

Natural-language requests, voice or text. One operator, no queue.

Layer 2 | Orchestrator

Main Agent | Skill Router

Matches request to skill descriptions. Enforces the session-start gate. Loads CLAUDE.md and context/ on every new session.

Layer 3 | Specialist skills (16)

Correspondence

email-and-drafts | post-screen-thankyou | linkedin-posts

Resume & assets

resume-work | workday-resume | docx | pdf | pptx | xlsx

Pipeline & state

session-start | action-queue | tracker-and-dashboard

Interview & meta

interview-prep | folder-layout | schedule | skill-creator

Layer 4 | Tool layer

MCP connectors

ms365 | microsoft-learn | dice | indeed

Python services (15)

action_server.py | chatbot_server.py | generate_morning_digest.py | generate_tracker_html.py | graph_api_helper.py

PowerShell jobs (13)

scheduled tasks | sync scripts | startup hooks

Layer 5 | State on disk

Pipeline state

Job_Tracker.xlsx | action_log.json | action_queue.json | sent_log.json

Ground truth

CLAUDE.md | context/ (17) | MEMORY.md | memory/

Opportunity files

opportunities/ (129) | resume variants | prep docs

Section 4.6 | The Command Center

What the operator actually sees.

The morning view, rendered with sample data.

The maps above describe the shape of the system. The Command Center is where the operator lives. One URL opened at the start of the day, three stacked views, every piece of state the pipeline produced since the last run. The views below are rendered against sample data because the live pipeline holds confidential recruiter correspondence and active interview threads. The layout, the field structure, and the rendering pattern are exactly what ships. Company names are placeholders from the Microsoft documentation tradition: Fabrikam, Contoso, Northwind, Adventure Works, and friends.

Sample data only. Rows, recruiters, dates, and compensation figures are fabricated. The real dashboard runs locally on the operator's own machine.

Pipeline Snapshot

regenerated from Job_Tracker.xlsx on every change

Company	Role	Stage	Next Action	Contact	Updated
Fabrikam Industries	Principal M365 Architect	Round 2 Scheduled	Prep doc due Fri	Alex Morgan | talent recruiter	Apr 15
Contoso Ltd	Director of IT Infrastructure	Awaiting Feedback	Follow up Apr 24	Jordan Chen | direct	Apr 12
Northwind Traders	Senior Cloud Engineer	Applied	ATS confirmation received	direct apply	Apr 14
Adventure Works	Cloud Governance Lead	Screen Scheduled	Screen Apr 21 10:00 AM ET	Sam Rivera | agency	Apr 16
Wingtip Toys	Azure Platform Architect	Prospect	Research company, confirm comp floor	proactive lead	Apr 11
Tailwind Traders	Principal Cloud Architect	Declined	Rate below floor, logged	Casey Patel | staffing	Apr 10
Proseware	Senior M365 Engineer	Applied	ATS silent past SLA	direct apply	Apr 08
Lucerne Publishing	IT Director	Phone Screen Done	Waiting decision	Morgan Bell | search firm	Apr 09

Action Queue

drafts awaiting operator review | stale threshold 48h

Draft reply staged | 2h ago

Recruiter follow-up at Fabrikam Industries. Full signature block embedded inline. Needs operator review and send.

amorgan@fabrikam.example | Staged in Outlook

Interview confirm | 6h ago

Adventure Works screen confirmed for Apr 21 10:00 AM ET. Calendar invite accepted. Prep doc scheduled for build Fri.

srivera@recruitco.example | Prep doc queued

Stale draft audit | 18h ago

Two drafts to Proseware recruiter unsent past 48h threshold. Operator decision required: send, rewrite, or close out.

draft_queue.json | Operator action

Proactive lead | Today 06:15

New posting detected on Dice. Title: Principal Cloud Architect at Fabrikam Industries. Core platform Microsoft: yes. Posting link live: yes. Compensation inside floor. Queued for operator review.

dice MCP | screened by posting-screener skill | Queued
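The 48-hour stale threshold in the queue header is the kind of rule that belongs in deterministic code. A minimal sketch, assuming each queued draft records when it was staged and whether it was sent; the field names and sample entries are hypothetical.

```python
from datetime import datetime, timedelta

STALE_AFTER = timedelta(hours=48)  # threshold shown in the Action Queue header


def stale_drafts(drafts: list, now: datetime) -> list:
    """Return drafts staged more than STALE_AFTER ago and still unsent."""
    return [
        d for d in drafts
        if not d["sent"] and now - d["staged_at"] > STALE_AFTER
    ]


now = datetime(2026, 4, 17, 6, 0)
queue = [  # hypothetical queue entries; field names are assumptions
    {"to": "recruiter@proseware.example", "staged_at": datetime(2026, 4, 14, 9, 0), "sent": False},
    {"to": "amorgan@fabrikam.example", "staged_at": datetime(2026, 4, 17, 4, 0), "sent": False},
]
flagged = stale_drafts(queue, now)
```

Passing `now` in as an argument keeps the check reproducible and easy to test.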

Morning Digest | Apr 17 2026

regenerated each morning by a Cowork cron task at 06:00 ET

New since last scan	3 recruiter replies | 2 ATS confirmations | 1 interview invite
Interview today	Adventure Works | Screen | 10:00 AM ET | Teams join button active on prep doc
Interviews this week	2 confirmed | 1 pending recruiter confirmation
Stale drafts	2 recruiter drafts past 48h threshold | flagged for operator action
Proactive leads	5 new postings matched against criteria | 3 queued | 2 auto-rejected below comp floor
DOL / unemployment	RESEA complete | UI active | next certification Sunday | job-search log current
Run health	session-start last run 06:00 ET | action_server uptime 71 days | no errors in last scan

This view is where the architecture pays off. The operator does not open Job_Tracker.xlsx, action_log.json, or the skill directory during a normal morning run. One URL holds the whole pipeline. Replace the operator with a customer tenant administrator and the rendering pattern still holds. Different data, same state spine, same deterministic rebuild every time the pipeline changes.

Section 5 | Decisions that made it real

The choices I would bring to a customer conversation.

Split deterministic and probabilistic work cleanly.

Anything that must be the same every time runs in Python. Anything that has to sound like a person runs in the model. The tracker calculation, the HTML regeneration, the log write, the file copy: Python. The email draft, the screening judgment, the prep-doc narrative: model. The two halves never pretend to be each other.

Why it matters: Customers will try to let an agent move money, delete tenants, or compute payroll. The right answer is to bracket the probabilistic part with deterministic guardrails, not to trust the model to behave.
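Bracketing the model with deterministic guardrails can be as simple as an allowlist: the model proposes an action as data, and code decides whether it runs. This is a sketch under assumptions; the proposal shape, the handler names, and the function name are hypothetical.

```python
def execute_proposal(proposal: dict, handlers: dict) -> str:
    """Run a model-proposed action only if a handler is registered for it.

    The handlers dict doubles as the allowlist: anything not registered is
    rejected, no matter how the model phrases the request.
    """
    action = proposal.get("action")
    handler = handlers.get(action)
    if handler is None:
        return f"rejected: {action!r} is not an allowed action"
    return handler(proposal.get("args", {}))


# Hypothetical handler table with a single permitted action.
handlers = {"stage_draft": lambda args: f"staged draft to {args['to']}"}
ok = execute_proposal({"action": "stage_draft", "args": {"to": "a@b.example"}}, handlers)
denied = execute_proposal({"action": "delete_tenant"}, handlers)
```

The model never gains a capability by asking for it; a new action exists only once someone writes and registers its handler.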

Ground every claim in source files.

The agent is not allowed to claim experience I cannot back up. CLAUDE.md has explicit identity rules. Context files hold confirmed work history. If a prep doc would need a capability I do not have, the agent either omits it or flags the gap. No fabrication, no inflation, no plausible hallucination.

Why it matters: Customer-facing agents that fabricate will destroy trust on the first contact. Grounding is the only pattern that scales.

Make the agent refuse shortcuts.

If there is a more rigorous method available to verify something, the agent uses it, even if it costs more tool calls. Verify by querying Sent Items, not by asking the operator. Paginate the full inbox instead of running a keyword search, because staffing firms use unpredictable senders. Fix data quality issues in the same pass that finds them instead of deferring them to a list for later.

Why it matters: Shortcut behavior is how agents drift. Building the opposite habit into the system prompt and into the skill descriptions is what keeps behavior stable over weeks of runs.

Close the loop with writable memory.

The agent writes lessons back to skill files, context files, and auto-memory after each task. CLAUDE.md stays lean; archived lessons live in context/lessons-archive.md. The next session gets a sharper tool than the last one. This is the feedback loop most agent pilots are missing.

Why it matters: Without a writable memory tier, an agent starts every engagement cold. With one, it compounds.
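A writable memory tier can start as an append to a plain file that the next session reads. The sketch below records a dated lesson and skips exact duplicates so the file stays lean. The line format and function name are assumptions, and the usage writes to a throwaway directory rather than the real MEMORY.md.

```python
import tempfile
from datetime import date
from pathlib import Path


def record_lesson(memory_file: Path, lesson: str, today: date) -> bool:
    """Append a dated lesson to a memory file unless the same lesson is already there."""
    existing = memory_file.read_text(encoding="utf-8") if memory_file.exists() else ""
    if lesson in existing:
        return False  # already recorded; keep the file lean
    with memory_file.open("a", encoding="utf-8") as fh:
        fh.write(f"- {today.isoformat()}: {lesson}\n")
    return True


# Usage against a throwaway directory.
memory = Path(tempfile.mkdtemp()) / "MEMORY.md"
first = record_lesson(memory, "Verify sends in Sent Items", date(2026, 4, 2))
second = record_lesson(memory, "Verify sends in Sent Items", date(2026, 4, 3))
```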

Treat the operator as the integration point.

The agent pushes state into tools the operator already uses: Outlook drafts, Excel tracker, HTML dashboard. The operator never has to learn a new interface. The agent adapts to the environment rather than asking the environment to adapt.

Why it matters: This is the move that unlocks adoption. Customers will not migrate to a new console. They will accept an agent that makes their existing console better.

Section 6 | Translation

Why this pattern matches the work enterprises are being asked to do.

Same architecture, bigger surface area.

What I built for my own pipeline

What a Microsoft cloud customer now needs

Inbox triage that writes back to a tracker. Agent paginates Graph, classifies recruiter correspondence, stages Outlook drafts, updates the .xlsx spine.

Tenant assessment, oversharing review, license optimization, and migration. All four share the same architectural shape: paginate a Graph source, classify each record, stage an action, write back with a full audit log. My pipeline runs that pattern on mail and a tracker. A customer engagement runs the same pattern on SharePoint permissions, Exchange rules, license entitlements, or a full tenant transfer. The pattern is the portable piece. Scale and the zero-loss bar on migration are what a customer tenant adds.

Grounded interview prep. Refuses to claim tools the operator has not used. Pulls facts from context files. Every acronym spelled out on first use.

Agent governance for customer-facing agents. Grounding is the foundational discipline, and it is the one I have built into my own system. Enterprise agent governance adds what I have not yet shipped at customer scale: multi-tenant data isolation, Purview DLP and sensitivity labels, data residency controls, and audit export for regulated industries. The grounding muscle transfers. The rest is M365 plumbing that already maps to components I know how to wire.

16 skills, one router. Task signatures pick the right specialist. Skills stay isolated. The router cannot rewrite a skill's behavior at runtime.

Copilot Studio and Agent 365 patterns. Topics, actions, and tools in Copilot Studio are the direct analogue of my skills, Python services, and MCP connectors. The primitives are the same; the packaging and licensing differ. Translation between Claude Agent SDK and Copilot Studio is pattern recognition, not net new learning. The architectural vocabulary carries over cleanly.

Session gate, memory tier, deterministic state. Agent cannot do downstream work until upstream state is verified. Memory writes after every task. State lives on disk, not in the model.

Customer Zero discipline for agent rollouts. I ran the pilot on myself, instrumented everything, wrote the lessons back to the guardrails, and widened scope only after the loop closed. A one-operator pipeline is smaller than a customer fleet by several orders of magnitude. The discipline is what transfers. The blast radius is what gets sized up inside a customer engagement.
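The shared shape named in the comparison above (classify each record, stage an action, write an audit entry) can be sketched as one generic pass. The oversharing example, the record fields, and the function names are hypothetical; the classifier is where a model call would sit in a real engagement.

```python
from typing import Callable, Iterable


def run_pass(records: Iterable[dict], classify: Callable[[dict], str],
             stage: Callable[[dict, str], None], audit: list) -> None:
    """One pass of the pattern: classify each record, stage an action, log it."""
    for record in records:
        label = classify(record)
        if label != "ignore":
            stage(record, label)
        # Every record gets an audit entry, including the ones left alone.
        audit.append({"id": record["id"], "label": label, "staged": label != "ignore"})


# Hypothetical oversharing review: flag files shared through anonymous links.
files = [{"id": "f1", "link_scope": "anonymous"}, {"id": "f2", "link_scope": "organization"}]
staged, audit_log = [], []
run_pass(
    files,
    classify=lambda f: "review" if f["link_scope"] == "anonymous" else "ignore",
    stage=lambda f, label: staged.append((f["id"], label)),
    audit=audit_log,
)
```

Swapping the record source and the classifier changes the engagement; the loop and the audit trail stay the same.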

The architecture above is pattern transfer from a pipeline I built and run. I have not yet applied the same pattern at customer scale on a Copilot Studio or Agent 365 engagement. That is the honest distinction, and it is why Customer Zero is the right framing. My twenty years are in the Microsoft infrastructure these agents plug into: tenants, identity, Exchange, SharePoint, endpoints, the M365 control plane. I know how those components wire together. What they need is the right tenant to live in. Given the platform depth and the architectural fluency, the ramp into a customer conversation is short.

The short version.

Most Microsoft infrastructure engineers have not yet shipped a production agent. I have. The system has been running for ten weeks. The architecture is in Section 4. The operator view is in Section 4.6. The decisions that made it real are in Section 5. The translation from my pipeline to customer work is in Section 6.

If you are building a Microsoft cloud practice, running a Copilot or Agent 365 customer-zero program, or staffing a Principal Architect role where this pattern is part of the job, I would be glad to walk through any of it live.

Daniel Lepel

Principal Microsoft Cloud Architect

daniel@lepel.us
212-252-9200
Albany, NY Capital District

Agent Architecture Case Study | daniellepel.com | Published April 2026