Comprehensive architecture, implementation, and test plan for the A.G.I.L.E. recruitment intelligence platform
We evaluated four deployment patterns and selected Cloudflare-first with AI Gateway as the production architecture.
| Option | Runtime | AI Models | Cost | Complexity |
|---|---|---|---|---|
| B. CF Workers + AI Gateway | TS at edge | All via Gateway | Free tier + tokens | Low |
| A. GCP Cloud Run + ADK | Python on GCP | Gemini + LiteLLM | Cloud Run + Vertex | High |
| C. Hybrid (CF + GCP) | Both | Split by task | Both platforms | Highest |
| D. Workers AI only | TS at edge | Llama/Mistral only | Neurons only | Low but limited |
fetch() from Workers
agenticx-agents repo stays as architecture docs + Python reference

| Agent | Actions | Tools Used | Status |
|---|---|---|---|
| Acquire | parse-cv, search | Workers AI, Drive, Sheets, ATS | Working |
| Gauge | score, bulk-score | Gemini Pro, ATS, Sheets | Working |
| Integrate | schedule | Calendar, Gmail, ATS | Stub tools |
| Leverage | report, dashboard | Sheets, ATS, Drive | Working |
| Engage | notify (stage-update, rejection, offer, teams) | Claude, Gmail | Stub tools |
Gauge auto-advances to Integrate if score ≥ 80. Engage fires after each stage transition.
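The hand-off rule above can be sketched as a small pure function. This is a minimal illustration, not the code in router.ts: the names `followUpsAfterScore` and `FollowUp` are hypothetical, and it assumes that a score below 80 causes no stage transition and therefore no Engage notification.

```typescript
type Agent = 'acquire' | 'gauge' | 'integrate' | 'leverage' | 'engage';

interface FollowUp {
  agent: Agent;
  action: string;
}

// Threshold from the text: a score of 80 or more advances the candidate.
const AUTO_ADVANCE_SCORE = 80;

// Returns the follow-up agent calls for a freshly scored candidate.
function followUpsAfterScore(score: number): FollowUp[] {
  if (!Number.isFinite(score) || score < 0 || score > 100) {
    throw new RangeError(`score must be between 0 and 100, got ${score}`);
  }
  const followUps: FollowUp[] = [];
  if (score >= AUTO_ADVANCE_SCORE) {
    // Stage transition: Gauge hands over to Integrate for scheduling.
    followUps.push({ agent: 'integrate', action: 'schedule' });
    // Engage fires because the stage changed.
    followUps.push({ agent: 'engage', action: 'notify' });
  }
  // Assumption: below the threshold nothing changes stage, so nothing fires.
  return followUps;
}
```

Keeping the rule free of I/O makes the threshold easy to unit test before it is wired into the agent dispatch.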
All inference through one OpenAI-compatible endpoint: gateway.ai.cloudflare.com/v1/{account}/{gateway}/compat/chat/completions
| Task | Model | Provider | Why | Fallback |
|---|---|---|---|---|
| CV text → JSON | workers-ai/llama-3.1-8b | Edge | Fast, free, structured | Gemini Flash |
| CV structured parse | google/gemini-2.5-flash | Google | Good JSON output, cheap | Workers AI |
| Candidate scoring | google/gemini-2.5-pro | Google | Strong reasoning | Claude Sonnet |
| Offer letters | anthropic/claude-sonnet-4-6 | Anthropic | Quality writing, legal | Gemini Pro |
| Rejection feedback | anthropic/claude-sonnet-4-6 | Anthropic | Empathetic tone | Gemini Flash |
| Pipeline reports | google/gemini-2.5-flash | Google | Data synthesis, fast | Workers AI 70B |
| Routine emails | workers-ai/llama-3.1-8b | Edge | Volume, templated, free | Gemini Flash |
| Embeddings | workers-ai/bge-base | Edge | Always edge, instant | — |
Identical queries return cached results instantly. TTL: 1h parsing, 24h reports. Expect 30-50% hit rate after 3 months.
cf-aig-metadata header tags every request with agent name, task type, and vacancy ID. Dashboard shows cost per agent.
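A minimal sketch of how the per-request caching and tagging could be expressed as headers. The helper `gatewayHeaders`, the `TaskType` names, and the `vacancyId` metadata key are hypothetical. `cf-aig-cache-ttl` is AI Gateway's per-request cache TTL header, in seconds. The sketch assumes that tasks without a TTL in the text are simply left at the gateway default.

```typescript
type TaskType = 'parse-cv' | 'report' | 'score' | 'notify';

// TTLs from the text: 1h for parsing, 24h for reports.
// Assumption: other task types get no explicit TTL.
const CACHE_TTL_SECONDS: Partial<Record<TaskType, number>> = {
  'parse-cv': 60 * 60,
  report: 24 * 60 * 60,
};

// Builds the request headers for one AI Gateway call.
function gatewayHeaders(
  aigToken: string,
  agent: string,
  task: TaskType,
  vacancyId?: string,
): Record<string, string> {
  // Metadata tags used for the per-agent cost breakdown.
  const metadata: Record<string, string> = { agent, task };
  if (vacancyId !== undefined) metadata.vacancyId = vacancyId;

  const headers: Record<string, string> = {
    'Content-Type': 'application/json',
    'cf-aig-authorization': `Bearer ${aigToken}`,
    'cf-aig-metadata': JSON.stringify(metadata),
  };

  const ttl = CACHE_TTL_SECONDS[task];
  if (ttl !== undefined) headers['cf-aig-cache-ttl'] = String(ttl);
  return headers;
}
```

For example, `gatewayHeaders(token, 'leverage', 'report', 'V-001')` would tag the request for the Leverage agent and ask the gateway to cache the response for 24 hours.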
If Gemini is down, Claude handles scoring. If both fail, Workers AI provides degraded but functional response.
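The fallback order for scoring can be sketched as a client-side loop over model strings. This is an illustration under assumptions: `inferWithFallback` and `InferFn` are hypothetical names, the chain uses the model strings from the routing table, and the actual provider call is injected so the logic can be exercised without network access. The gateway may also be able to handle fallback itself; this sketch does not rely on that.

```typescript
type Message = { role: 'system' | 'user' | 'assistant'; content: string };

// Injected function that performs one inference call and rejects on failure.
type InferFn = (model: string, messages: Message[]) => Promise<string>;

// Order from the text: Gemini first, Claude second, Workers AI last.
const SCORING_CHAIN: string[] = [
  'google/gemini-2.5-pro',
  'anthropic/claude-sonnet-4-6',
  'workers-ai/llama-3.1-8b',
];

async function inferWithFallback(
  infer: InferFn,
  messages: Message[],
  chain: string[] = SCORING_CHAIN,
): Promise<{ model: string; text: string; degraded: boolean }> {
  const errors: string[] = [];
  for (const model of chain) {
    try {
      const text = await infer(model, messages);
      // A Workers AI answer is flagged as degraded but still returned.
      return { model, text, degraded: model.startsWith('workers-ai/') };
    } catch (err) {
      errors.push(`${model}: ${err instanceof Error ? err.message : String(err)}`);
    }
  }
  throw new Error(`all providers failed: ${errors.join('; ')}`);
}
```

Returning the model that answered lets the caller log which tier handled the request and mark degraded results in the activity log.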
```typescript
// Edge AI (free, instant)
const result = await env.AI.run('@cf/meta/llama-3.1-8b-instruct', { messages, max_tokens });

// AI Gateway → Gemini or Claude (same endpoint, different model string)
const resp = await fetch(`https://gateway.ai.cloudflare.com/v1/${accountId}/${gatewayId}/compat/chat/completions`, {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json', // body is JSON
    'cf-aig-authorization': `Bearer ${aigToken}`,
    'cf-aig-metadata': JSON.stringify({ agent: 'gauge', task: 'score' }),
  },
  body: JSON.stringify({ model: 'google/gemini-2.5-pro', messages, max_tokens: 500 }),
});
```
All Google APIs are standard REST, called via fetch() from Cloudflare Workers, so no Python SDK is needed. The tools are currently stubbed: each call returns a simulated response when the service account key is not configured.
| Service | Functions | Agent | Status |
|---|---|---|---|
| Drive | uploadFile, createFolder, listFiles, shareFile | Acquire | Stub |
| Gmail | sendEmail, createDraft, searchEmails | Engage | Stub |
| Calendar | createEvent (+ Meet), findFreeSlots, listEvents | Integrate | Stub |
| Sheets | appendRow, readRange, updateCell | Leverage | Stub |
| Auth | Service account JWT → access token (cached in KV) | All | Stub |
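The Auth row can be illustrated with the claim set Google expects in a service account assertion. This is a sketch, not the stubbed auth module: `buildClaims` and `cacheTtlSeconds` are hypothetical helpers. Signing the JWT with RS256 and exchanging it at the token endpoint (grant type `urn:ietf:params:oauth:grant-type:jwt-bearer`) are left out.

```typescript
interface ServiceAccountClaims {
  iss: string;   // service account email
  sub: string;   // Workspace user to impersonate (domain-wide delegation)
  scope: string; // space-separated OAuth scopes
  aud: string;   // token endpoint
  iat: number;   // issued-at, seconds since epoch
  exp: number;   // expiry, seconds since epoch
}

const GOOGLE_TOKEN_URL = 'https://oauth2.googleapis.com/token';

function buildClaims(
  serviceAccountEmail: string,
  delegatedUser: string,
  scopes: string[],
  nowSeconds: number,
): ServiceAccountClaims {
  return {
    iss: serviceAccountEmail,
    sub: delegatedUser,
    scope: scopes.join(' '),
    aud: GOOGLE_TOKEN_URL,
    iat: nowSeconds,
    exp: nowSeconds + 3600, // assertions may be valid for at most one hour
  };
}

// TTL for the KV-cached access token: a little shorter than the token's own
// lifetime so an expired token is never served. KV requires at least 60s.
function cacheTtlSeconds(expiresIn: number, safetyMargin = 60): number {
  return Math.max(expiresIn - safetyMargin, 60);
}
```

With a typical `expires_in` of 3600, the token would be cached for 3540 seconds and re-minted on the next call after that.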
wrangler secret put GOOGLE_SA_KEY (paste base64 value)
GOOGLE_DELEGATED_USER, folder IDs, sheet ID, calendar ID in wrangler.toml

The portal serves an A2A agent card at /.well-known/agent.json advertising 5 recruitment skills. Any A2A-compatible system can discover and interact with agenticX agents.
```json
{
  "name": "agenticX",
  "url": "https://demo-dev.agilex.co.za",
  "skills": [
    { "id": "recruit/parse-cv", "agent": "acquire" },
    { "id": "recruit/score", "agent": "gauge" },
    { "id": "recruit/schedule", "agent": "integrate" },
    { "id": "recruit/report", "agent": "leverage" },
    { "id": "recruit/notify", "agent": "engage" }
  ],
  "authentication": { "schemes": ["bearer"] }
}
```
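A short sketch of how a caller might use the card to find the agent behind a skill. The `AgentCard` and `Skill` types mirror only the fields shown above, and `resolveAgent` is a hypothetical helper, not part of the A2A specification or the portal code.

```typescript
interface Skill {
  id: string;
  agent: string;
}

interface AgentCard {
  name: string;
  url: string;
  skills: Skill[];
  authentication: { schemes: string[] };
}

// The card as shown above, typed for lookup.
const agentCard: AgentCard = {
  name: 'agenticX',
  url: 'https://demo-dev.agilex.co.za',
  skills: [
    { id: 'recruit/parse-cv', agent: 'acquire' },
    { id: 'recruit/score', agent: 'gauge' },
    { id: 'recruit/schedule', agent: 'integrate' },
    { id: 'recruit/report', agent: 'leverage' },
    { id: 'recruit/notify', agent: 'engage' },
  ],
  authentication: { schemes: ['bearer'] },
};

// Maps an advertised skill id to the agent that handles it.
// Returns undefined for a skill the card does not advertise.
function resolveAgent(card: AgentCard, skillId: string): string | undefined {
  return card.skills.find((skill) => skill.id === skillId)?.agent;
}
```

Here `resolveAgent(agentCard, 'recruit/score')` yields `'gauge'`, while an unknown skill id yields `undefined`, which the caller can turn into a 404.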
| Repo | Language | Files | LOC | Purpose | Deployed |
|---|---|---|---|---|---|
| agilex-portal | TypeScript | 14 | 1,485 | Cloudflare Worker — agents, AI Gateway, D1, portal UI | agilex-app.pages.dev |
| agenticx-agents | Python | 38 | 2,092 | Google ADK reference + architecture docs (GCP stubs) | Reference |
| agilex-site | Astro/TS | 5 | 513 | Marketing site — agilex.co.za | agilex-site.pages.dev |
| File | LOC | Purpose |
|---|---|---|
| src/agents/router.ts | 378 | All 5 A.G.I.L.E. agents — dispatch, tools, Edge AI calls |
| src/lib/ai-gateway.ts | 208 | Multi-model inference — Workers AI + Gemini + Claude via AI Gateway |
| src/index.ts | 126 | Worker entry — router, CORS, A2A card, Env types |
| src/sim/m365.ts | 212 | Legacy M365 simulation (to be replaced by Google stubs) |
| src/lib/google/*.ts | 239 | Google Workspace stubs — auth, Drive, Gmail, Calendar, Sheets |
| src/api/ai.ts | 88 | Raw AI inference + embedding endpoints |
| src/api/candidates.ts | 52 | Candidate CRUD (D1) |
| src/api/seed.ts | 94 | Demo data — 10 clients, 8 vacancies, 15 candidates |
| schema.sql | 86 | D1 schema — 6 tables |
/.well-known/agent.json
CF_AIG_TOKEN secret)
GOOGLE_AI_KEY secret)
ANTHROPIC_API_KEY secret)
demo-dev.agilex.co.za DNS routing

Each stubbed capability activates by setting one secret or env var, with no code changes needed:
| # | Capability | Set This | Activates |
|---|---|---|---|
| 1 | AI Gateway → Gemini | wrangler secret put GOOGLE_AI_KEY | CV parsing via Gemini Flash, scoring via Gemini Pro |
| 2 | AI Gateway → Claude | wrangler secret put ANTHROPIC_API_KEY | Offer letters, rejections via Claude Sonnet |
| 3 | AI Gateway auth | wrangler secret put CF_AIG_TOKEN | All Gateway routing, caching, cost tracking |
| 4 | Google Workspace | wrangler secret put GOOGLE_SA_KEY | Drive, Gmail, Calendar, Sheets — all tools go live |
| 5 | Custom domain | DNS AAAA demo-dev on agilex.co.za | Portal accessible at demo-dev.agilex.co.za |
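One way to picture "activate by setting a secret" is a capability check against the Worker's environment bindings. This is a sketch: `SecretsEnv` and `activeCapabilities` are hypothetical names, and it assumes that the Gemini and Claude routes also need the gateway token, which the table implies but does not state.

```typescript
// Secrets from the activation table; each is optional until it is set.
interface SecretsEnv {
  GOOGLE_AI_KEY?: string;
  ANTHROPIC_API_KEY?: string;
  CF_AIG_TOKEN?: string;
  GOOGLE_SA_KEY?: string;
}

interface Capabilities {
  gateway: boolean;
  gemini: boolean;
  claude: boolean;
  googleWorkspace: boolean;
}

// A secret counts as configured only if it is a non-empty string.
function isSet(value: string | undefined): boolean {
  return typeof value === 'string' && value.trim().length > 0;
}

function activeCapabilities(env: SecretsEnv): Capabilities {
  const gateway = isSet(env.CF_AIG_TOKEN);
  return {
    gateway,
    // Assumption: provider routes go through the gateway, so they need its token too.
    gemini: gateway && isSet(env.GOOGLE_AI_KEY),
    claude: gateway && isSet(env.ANTHROPIC_API_KEY),
    // When false, the Google tools keep returning simulated responses.
    googleWorkspace: isSet(env.GOOGLE_SA_KEY),
  };
}
```

Each tool can consult this result at call time and fall back to its stub when the matching capability is off, which is what keeps activation free of code changes.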
POST /api/agents/acquire/parse-cv with CV text → structured JSON with name, skills, etc.
POST /api/ai/embed with text → 768-dim vector preview
POST /api/agents/gauge/score with candidateId → score 0-100
GET /api/ai/models → edge + gateway models
POST /api/ai/infer with model=google/gemini-2.5-flash → structured candidate JSON
POST /api/agents/gauge/score with Gemini Pro model → score + reasoning + tier
POST /api/agents/leverage/report → pipeline narrative + metrics
Same request twice → second returns faster with cache hit header
POST /api/agents/engage/notify type=offer → professional offer email in SA English
POST /api/agents/engage/notify type=rejection → constructive feedback
Generate 5-8 structured questions for a vacancy + candidate
Block Gemini in gateway → same output from Claude
Block all providers → Llama 8B handles task at edge
PDF bytes → file in Candidates/Active/{name}/ with Drive URL
To, subject, body → email delivered from craig@agilex.co.za
Summary, datetime, attendees → event with Google Meet link
Row values → new row in Pipeline Tracker
Upload PDF → parse → score → Drive upload → Sheets row → activity log
Select candidate → find slots → book Calendar → send Gmail → update stage
Request → gather Sheets data → Gemini narrative → upload Drive PDF
GET /.well-known/agent.json → valid A2A card with 5 skills
Create candidate → consent_given timestamp in D1
Parse CV → only structured fields + Drive URL stored in D1
Every agent action → activity logged with timestamp, agent, action
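A few of the expected results above can be turned into small, reusable checks. The helper names are hypothetical and the response shapes are assumptions. `cf-aig-cache-status` is the header AI Gateway uses to report a cache `HIT` or `MISS`.

```typescript
// Scoring tests: the score must be a finite number between 0 and 100.
function isValidScore(score: unknown): boolean {
  return typeof score === 'number' && Number.isFinite(score) && score >= 0 && score <= 100;
}

// Caching test: the second identical request should report a cache hit.
// Accepts anything with a Headers-like get(), such as a fetch Response's headers.
function isCacheHit(headers: { get(name: string): string | null }): boolean {
  return headers.get('cf-aig-cache-status') === 'HIT';
}

// A2A test: the card should advertise exactly five skills, each with an id.
function hasFiveSkills(card: unknown): boolean {
  if (typeof card !== 'object' || card === null) return false;
  const skills = (card as { skills?: unknown }).skills;
  return (
    Array.isArray(skills) &&
    skills.length === 5 &&
    skills.every((s) => typeof (s as { id?: unknown } | null)?.id === 'string')
  );
}
```

In a test runner these would wrap the live calls, for example `isCacheHit(secondResponse.headers)` after sending the same request twice.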
| Component | Monthly (ZAR) | Tier |
|---|---|---|
| Cloudflare (Workers, D1, KV, Pages, DNS, CDN) | R0 | Free |
| AI Gateway (metering, caching, analytics) | R0 | Free |
| Workers AI (Llama 8B + BGE embeddings) | ~R90 – R180 | Usage |
| Gemini API (Flash + Pro via AI Gateway) | ~R180 – R540 | Usage |
| Anthropic API (Claude Sonnet via AI Gateway) | ~R360 – R720 | Usage |
| Google Workspace | Existing | — |
| Total | R630 – R1,440 | ~$35–80/mo |
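The total row can be checked by summing the three usage-based lines. The sketch below does that arithmetic. The exchange rate of R18 per US dollar is an assumption inferred from the table's own totals, not a figure stated in the source.

```typescript
type Range = [low: number, high: number];

// Usage-based rows from the cost table, in ZAR per month.
const usageCosts: Range[] = [
  [90, 180],  // Workers AI
  [180, 540], // Gemini API
  [360, 720], // Anthropic API
];

// Adds the low ends and the high ends separately.
function totalRange(items: Range[]): Range {
  return items.reduce<Range>(([lo, hi], [l, h]) => [lo + l, hi + h], [0, 0]);
}

// Assumption: roughly R18 per USD, which reproduces the ~$35–80 figure.
const ZAR_PER_USD = 18;

const [lowZar, highZar] = totalRange(usageCosts); // 630 and 1440
const lowUsd = lowZar / ZAR_PER_USD;               // 35
const highUsd = highZar / ZAR_PER_USD;             // 80
```

The free-tier and existing-subscription rows contribute nothing, so the total is driven entirely by token and neuron usage.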
| Phase | Scope | Effort | Status |
|---|---|---|---|
| Phase 0 | Agent scaffold, AI Gateway client, Google stubs, A2A card, architecture doc | Done | Complete |
| Phase 1 | Set API keys (Gemini, Claude, AI Gateway) → activate multi-model routing | 1 hour | Next |
| Phase 2 | Wire agents to use ai-gateway.ts infer() instead of direct Workers AI | 2-3 days | Planned |
| Phase 3 | Google Workspace — implement real auth + Drive + Gmail + Calendar + Sheets | 1 week | Planned |
| Phase 4 | E2E workflows — CV intake → score → schedule → notify | 3-4 days | Planned |
| Phase 5 | Deploy to demo-dev.agilex.co.za, POPIA audit, demo readiness | 2 days | Planned |
aistudio.google.com
console.anthropic.com

```shell
wrangler secret put CF_AIG_TOKEN
wrangler secret put GOOGLE_AI_KEY
wrangler secret put ANTHROPIC_API_KEY
wrangler deploy
```

```shell
curl -X POST agilex-app.pages.dev/api/ai/infer \
  -d '{"model":"google/gemini-2.5-flash","prompt":"Hello from AgileX"}'
curl -X POST agilex-app.pages.dev/api/ai/infer \
  -d '{"model":"anthropic/claude-sonnet-4-6","prompt":"Hello from AgileX"}'
```