Projects & Experiments

My personal lab — side projects, proofs of concept, and experiments at the intersection of AI Agents and Data Governance.

Featured Builds

AI & Agents

Sidekick — AI Co-Worker

A proof-of-concept AI co-worker for exploring LLM agents, RAG, memory, and project-scoped document workflows using LangChain and LangGraph.

The Challenge

Whether a local agent loop (worker + evaluator) with ChromaDB-backed RAG and per-project memory could serve as a genuine productivity tool, not just a demo.

The Outcome

Built a working multi-mode agent with full-file RAG retrieval, SQLite checkpoints, and LangSmith tracing. Validated the worker+evaluator pattern at scale.

LangGraph, LangChain, ChromaDB, Gradio, Python, Claude / OpenAI

Personal experiment / POC

AI & Agents

SQL Agent Loop

Demonstrates that with a data dictionary and an agentic AI, you can answer any business question in natural language — even when column names are cryptic.

The Challenge

Showing that metadata and governance are the true foundation for effective AI. Without good data definitions, the agent guesses wrong; with them, it reliably translates natural language into accurate SQL.

The Outcome

Validated the pattern end-to-end. The agent uses tools to inspect schema, read the dictionary, and run SQL in a loop — no hardcoded queries.

Python, OpenAI, SQLite, Gradio

Personal experiment / POC
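
The inspect-schema, read-dictionary, run-SQL loop described for the SQL Agent Loop can be sketched roughly as follows. This is a minimal illustration, not the project's code: the table `t_ord`, its columns, the dictionary entries, and `scripted_model` (a stand-in for the LLM's tool choice) are all hypothetical.

```python
import sqlite3

# Hypothetical table with deliberately cryptic column names.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t_ord (c1 INTEGER, c2 TEXT, c3 REAL);
    INSERT INTO t_ord VALUES (1, 'DE', 120.0), (2, 'DE', 80.0), (3, 'FR', 50.0);
""")

# Hypothetical data dictionary that gives the cryptic names a meaning.
DATA_DICTIONARY = {
    "t_ord": "Customer orders",
    "t_ord.c1": "Order ID",
    "t_ord.c2": "Customer country (ISO code)",
    "t_ord.c3": "Order total in EUR",
}

def inspect_schema(_arg=""):
    """Return the CREATE TABLE statements SQLite stores for each table."""
    rows = conn.execute("SELECT sql FROM sqlite_master WHERE type = 'table'")
    return "\n".join(sql for (sql,) in rows)

def read_dictionary(_arg=""):
    """Return the data dictionary as plain text for the model to read."""
    return "\n".join(f"{name}: {meaning}" for name, meaning in DATA_DICTIONARY.items())

def run_sql(query):
    """Execute a query and return all rows."""
    return conn.execute(query).fetchall()

TOOLS = {"inspect_schema": inspect_schema, "read_dictionary": read_dictionary, "run_sql": run_sql}

def scripted_model(question, history):
    """Stand-in for the LLM: picks the next (action, argument) from the history so far."""
    if len(history) == 0:
        return ("inspect_schema", "")
    if len(history) == 1:
        return ("read_dictionary", "")
    if len(history) == 2:
        # With the dictionary read, c2 = country and c3 = order total.
        return ("run_sql", "SELECT c2, SUM(c3) FROM t_ord GROUP BY c2 ORDER BY c2")
    return ("final", history[-1][1])

def agent_loop(question, model, max_steps=6):
    """Call tools until the model returns a final answer or the step budget runs out."""
    history = []
    for _ in range(max_steps):
        action, arg = model(question, history)
        if action == "final":
            return arg
        history.append((action, TOOLS[action](arg)))
    raise RuntimeError("step budget exhausted without a final answer")

answer = agent_loop("What is the revenue per country?", scripted_model)
```

In a real run the scripted function would be replaced by a model call that sees the question plus the tool results gathered so far; the loop itself stays the same.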

AI & Agents

Agentic AI POC

Experimental proof of concept for agentic AI patterns — tool-augmented reasoning, agent orchestration loops, memory integration, and plan-and-execute workflows.

The Challenge

How an LLM chooses and invokes tools, receives results, and continues reasoning — without relying on framework abstractions.

The Outcome

Built a custom agent loop with tool registry, long-term memory, planner, and worker from first principles. Useful reference for understanding what frameworks abstract away.

Python, OpenAI

Personal experiment / POC
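
To make the "tool registry" idea from the Agentic AI POC concrete, here is a small framework-free sketch. It is an illustration under assumptions, not the project's implementation: the decorator `tool`, the example tools `add` and `word_count`, and the JSON call shape (`name` plus `arguments`) are hypothetical choices.

```python
import inspect
import json

REGISTRY = {}

def tool(fn):
    """Register a plain function as a callable tool under its own name."""
    REGISTRY[fn.__name__] = fn
    return fn

@tool
def add(a: float, b: float) -> float:
    """Add two numbers."""
    return a + b

@tool
def word_count(text: str) -> int:
    """Count whitespace-separated words."""
    return len(text.split())

def describe_tools():
    """Build the tool descriptions that would be shown to the model."""
    specs = []
    for name, fn in REGISTRY.items():
        params = {
            p.name: p.annotation.__name__
            for p in inspect.signature(fn).parameters.values()
        }
        specs.append({"name": name, "description": inspect.getdoc(fn), "parameters": params})
    return specs

def dispatch(call_json):
    """Run a tool call emitted by the model; errors are returned, not raised,
    so they can be fed back into the model's next reasoning step."""
    call = json.loads(call_json)
    fn = REGISTRY.get(call.get("name"))
    if fn is None:
        return {"error": f"unknown tool: {call.get('name')}"}
    try:
        return {"result": fn(**call.get("arguments", {}))}
    except TypeError as exc:
        return {"error": str(exc)}

ok = dispatch('{"name": "add", "arguments": {"a": 2, "b": 3}}')
missing = dispatch('{"name": "search_web", "arguments": {}}')
```

Returning errors as data is the design choice worth noting: the model receives the failure as an observation and can correct its next call instead of crashing the loop.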

AI & Agents

Career Chatbot

AI agent that acts as a personal career assistant. Uses function calling, a retry loop, and an LLM-as-judge evaluator to stay grounded and in character.

The Challenge

Applying the LLM-as-judge pattern to output quality control: a second model evaluates every reply and triggers a retry with feedback if it fails.

The Outcome

Deployed to Hugging Face Spaces. The evaluator pattern significantly improves response consistency and reduces hallucinations.

Python, OpenAI, Gradio, Hugging Face Spaces

Personal experiment / POC
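
The evaluate-and-retry loop behind the Career Chatbot can be sketched in a few lines. This is a simplified illustration, not the deployed code: `fake_generate` and `fake_judge` are hypothetical stand-ins for the two model calls, and the retry limit of three is an assumption.

```python
from dataclasses import dataclass

@dataclass
class Verdict:
    acceptable: bool
    feedback: str = ""

def answer_with_judge(question, generate, judge, max_attempts=3):
    """Generate a reply, have a second model judge it, and retry with feedback."""
    feedback = ""
    reply = ""
    for attempt in range(1, max_attempts + 1):
        reply = generate(question, feedback)
        verdict = judge(question, reply)
        if verdict.acceptable:
            return reply, attempt
        # Pass the judge's objection into the next generation attempt.
        feedback = verdict.feedback
    # Budget exhausted: return the last reply (a real bot might use a fallback).
    return reply, max_attempts

def fake_generate(question, feedback):
    """Stand-in for the chat model: the first draft invents a fact."""
    if not feedback:
        return "I spent 20 years at NASA."
    return "I don't have that on record, but I'm happy to talk about my projects."

def fake_judge(question, reply):
    """Stand-in for the judge model: rejects claims the profile does not support."""
    if "NASA" in reply:
        return Verdict(False, "Claim is not supported by the profile; stay grounded.")
    return Verdict(True)

reply, attempts = answer_with_judge("Where have you worked?", fake_generate, fake_judge)
```

The judge only sees the question and the candidate reply, which keeps its verdict independent of whatever reasoning produced the draft.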

Data Governance

Data CoPilot

Local-first, chat-centric data copilot that routes messages to internal agents for profiling, harmonization, and governance document generation.

The Challenge

Building a scalable full-stack foundation (React SPA + FastAPI + Postgres with pgvector + Redis + MinIO) while learning each layer step by step.

The Outcome

Core infrastructure operational via Docker Compose. API, workers, and web UI scaffolded. Validates the architecture and data model before full implementation.

FastAPI, React, pgvector, Redis, MinIO, LangGraph, Python

Personal experiment / POC

AI & Agents

Meeting Mind

A local-first agentic platform that converts meeting transcripts into a time-aware knowledge base. Ingest any transcript format, extract structured ideas, decisions, and actions, then ask natural-language questions across your entire meeting history.

The Challenge

Whether meeting knowledge could be made persistent, searchable, and temporally aware — tracking how decisions and ideas evolve across meetings — without cloud dependency or an LLM API.

The Outcome

Built a full ingestion-to-retrieval pipeline: topic segmentation, structured extraction, temporal supersession linking, and a three-layer store (SQLite + JSON knowledge graph + lexical vector index). Works entirely offline with an optional LLM upgrade path.

Python, SQLite, Knowledge Graph, TF-IDF Vector Search, Multi-Agent

Personal experiment / POC
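
The "lexical vector index" in Meeting Mind is what lets retrieval work without an LLM API. A rough, dependency-free sketch of TF-IDF search with cosine similarity is shown below; the class name `TfidfIndex`, the smoothed IDF formula, and the sample notes are assumptions for illustration, not the project's actual index.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase and split into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

class TfidfIndex:
    def __init__(self, docs):
        self.docs = docs
        token_lists = [tokenize(d) for d in docs]
        n = len(docs)
        # Document frequency: in how many documents each term appears.
        df = Counter(t for tokens in token_lists for t in set(tokens))
        # Smoothed inverse document frequency (an assumed, common variant).
        self.idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
        self.vectors = [self._vectorize(tokens) for tokens in token_lists]

    def _vectorize(self, tokens):
        """Build an L2-normalized sparse TF-IDF vector; unknown terms are dropped."""
        tf = Counter(tokens)
        vec = {t: count * self.idf[t] for t, count in tf.items() if t in self.idf}
        norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
        return {t: w / norm for t, w in vec.items()}

    def search(self, query, k=3):
        """Return up to k (document, score) pairs ranked by cosine similarity."""
        q = self._vectorize(tokenize(query))
        scored = [
            (sum(w * vec.get(t, 0.0) for t, w in q.items()), i)
            for i, vec in enumerate(self.vectors)
        ]
        ranked = sorted(scored, reverse=True)[:k]
        return [(self.docs[i], score) for score, i in ranked if score > 0]

notes = [
    "Decision: move the launch date to March",
    "Action: Priya drafts the pricing page",
    "Idea: offer a free tier for students",
]
index = TfidfIndex(notes)
results = index.search("when is the launch")
```

Because both vectors are normalized up front, the dot product in `search` is already the cosine similarity, and everything runs offline on the standard library.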

AI & Agents

IdeaForge

A multi-agent AI system that takes a raw business idea through a full pipeline — research, debate, business case, and product plan — and delivers production-ready documents. 17 specialized agents across 3 stages with durable human-in-the-loop checkpoints.

The Challenge

Whether 17 coordinated LLM agents working in parallel — each owning a distinct role — could reliably turn a single raw idea into a complete, production-ready business case and product plan.

The Outcome

Built a full multi-tenant platform: LangGraph orchestration, FastAPI backend, Stripe billing, BYOK key vault, Jinja2 web UI, and an MCP server for IDE integration. Pipeline runs end-to-end with durable interrupt/resume at every human checkpoint.

LangGraph, FastAPI, Python, SQLite, Stripe, MCP

Personal experiment / POC
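
The idea of a durable human checkpoint in IdeaForge can be illustrated without any framework: persist the pipeline state, stop at a review point, and resume from storage once a human responds. The sketch below uses plain `sqlite3` in place of LangGraph's interrupt/resume; the choice of which steps pause for review (`HUMAN_CHECKPOINTS`), the table layout, and the placeholder step outputs are hypothetical.

```python
import json
import sqlite3

STEPS = ["research", "debate", "business_case", "product_plan"]
HUMAN_CHECKPOINTS = {"debate", "business_case"}  # hypothetical review points

def open_store(path=":memory:"):
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS runs (run_id TEXT PRIMARY KEY, state TEXT NOT NULL)"
    )
    return conn

def save(conn, run_id, state):
    conn.execute(
        "INSERT OR REPLACE INTO runs (run_id, state) VALUES (?, ?)",
        (run_id, json.dumps(state)),
    )
    conn.commit()

def load(conn, run_id):
    row = conn.execute("SELECT state FROM runs WHERE run_id = ?", (run_id,)).fetchone()
    return json.loads(row[0]) if row else {"next": 0, "outputs": {}, "waiting": None}

def advance(conn, run_id, approval=None):
    """Run steps until the next human checkpoint; state is reloaded from disk each call."""
    state = load(conn, run_id)
    if state["waiting"]:
        if approval is None:
            return state  # still blocked on human input
        state["outputs"][state["waiting"] + "_review"] = approval
        state["waiting"] = None
    while state["next"] < len(STEPS):
        step = STEPS[state["next"]]
        state["outputs"][step] = f"{step} done"  # placeholder for real agent work
        state["next"] += 1
        if step in HUMAN_CHECKPOINTS:
            state["waiting"] = step
            break
    save(conn, run_id, state)
    return state

conn = open_store()
first = advance(conn, "idea-1")                          # pauses after "debate"
blocked = advance(conn, "idea-1")                        # no approval yet
second = advance(conn, "idea-1", approval="approved")    # pauses after "business_case"
final = advance(conn, "idea-1", approval="approved")     # runs to completion
```

Since every call starts from `load`, the process can be restarted between any two calls without losing progress, which is the property "durable" refers to here.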

Tooling

Asparagus Operations POC

Full data lake architecture built on free Google services: raw storage (Drive), AI-powered ETL (GPT-4o-mini), structured serving (Sheets), and browser-side analytics via DuckDB WASM — plus a full operations workflow for invoice entry and master data management.

The Challenge

Whether a complete data lake (raw zone, AI ETL, structured serving, data catalog, and interactive analytics) can be built entirely on free Google services, with no dedicated database, no data warehouse, and no cost per query.

The Outcome

Proved the concept: DuckDB WASM runs SQL in the browser querying data from Google Sheets, GPT-4o-mini extracts structured JSON from PDF/XML invoices, and the entire stack costs under $5/month vs $50–$1,000+ for AWS or Snowflake equivalents.

React, Express, DuckDB WASM, Google Sheets, Google Drive, OpenAI GPT-4o-mini, Recharts

Personal experiment / POC

Tooling

Agricultural Settlement Platform

Fully operational agricultural intake, quality, contract, and settlement platform built on Google AppSheet + Apps Script. Replaced manual spreadsheet-driven settlement with a structured, rule-based, auditable engine.

The Challenge

Whether enterprise architecture principles (governed data, controlled calculation engines, workflow automation) can be applied in low-code environments.

The Outcome

Production-deployed system handling multiple contract types, quality-based payout thresholds, advance payments, PDF generation, and automated email distribution.

Google AppSheet, Google Apps Script, Google Sheets

Personal project

Experiments

Local Image Generation POC

Ran Stable Diffusion XL locally to explore text-to-image, image-to-image, and inpainting pipelines — including device support (CUDA, MPS, CPU).

The Challenge

How local diffusion models work in practice: download, memory requirements, precision issues on Apple Silicon, and pipeline differences.

The Outcome

Working t2i, i2i, and inpainting pipelines. Key lesson: MPS requires float32; attention_slicing helps on limited VRAM.

Python, Diffusers, Stable Diffusion XL, Hugging Face

Personal experiment / POC
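
The device lesson from the Local Image Generation POC can be captured as a small selection rule. This sketch is an assumption-laden illustration: `choose_runtime` is a hypothetical helper, float16 on CUDA and the decision to enable attention slicing everywhere except CUDA are common defaults I am assuming here, and only the "MPS requires float32" rule comes from the experiment itself.

```python
def choose_runtime(cuda_available: bool, mps_available: bool) -> dict:
    """Pick device, precision, and memory settings for a local diffusion pipeline."""
    if cuda_available:
        # Assumption: half precision is the usual choice on CUDA GPUs.
        return {"device": "cuda", "dtype": "float16", "attention_slicing": False}
    if mps_available:
        # From the experiment: MPS (Apple Silicon) needed float32.
        return {"device": "mps", "dtype": "float32", "attention_slicing": True}
    # CPU fallback: full precision, and save memory where possible.
    return {"device": "cpu", "dtype": "float32", "attention_slicing": True}

apple_silicon = choose_runtime(cuda_available=False, mps_available=True)
```

With PyTorch installed, the two flags would typically come from `torch.cuda.is_available()` and `torch.backends.mps.is_available()`, and the `attention_slicing` flag would map to calling `enable_attention_slicing()` on the Diffusers pipeline.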

Tooling

Project Setup Kit

A Claude Code plugin that takes an empty repo from "I have an idea" to a design contract, a locked stack, a linted backlog, and its own bespoke build agents — then gets out of the way.

The Challenge

Testing two ideas. First, that agent harnesses should be generated per project, not parameterized, because a scoper that knows your stack writes better briefs. Second, that a design doc works as a contract: fixed sections, locked ADRs, and an honest register of what is deliberately unspecified.

The Outcome

Built an 8-step skill chain (brief → stack → design → scaffold → harness-forge → backlog → plan-lint → harness-doctor) with harness tiers sized to the build. Key lesson: a multi-agent harness is prose that nothing type-checks, so auditing it is a mandatory gate, not an option.

Claude Code, Skills, Subagents, Slash Commands, Markdown

Personal experiment / POC

Interested in collaborating on a technical project?

I'm always open to discussing research prototypes, data infrastructure, or open-source tooling.

Get in Touch