// the explorer

AI Explorer

By day, I am a Strategic AI Advisor and Enterprise Architect. I solve the challenge of integrating modern AI into rigid, legacy core systems in highly regulated industries. I design governed API layers, MCP-based agentic systems, and the cloud-native infrastructure required to make these models scale securely in production — across AWS, Azure, and GCP.

On the weekends, I run empirical experiments on LLMs and SLMs — adversarial tests, benchmarks, and compliance experiments on open-source models to see where they actually break. I replicate research paper findings on local hardware, focusing on structured output failures, adversarial guardrails, context position bias, and compliance enforcement. Real benchmarks. Real limitations. No hype.

I also build with agent frameworks — LangChain, CrewAI — and test how they hold up when wired to real cloud backends: Bedrock, Vertex AI, Azure AI Foundry. The goal is always the same: find what breaks before production does.

UT Austin

McCombs Graduate

Post Graduate Program in AI & ML: Business Applications (2024).

2x Speaker

API Summit

Spoke in 2024 & 2025 on enterprise GenAI gateway integration patterns.

1,500+

Structured JSON Tests

Empirical evaluations on schema accuracy across 7 small models.

Adversarial Scenarios

Compliance test suites built covering finance, medical, and legal domains.

// Core Focus Areas

›
Governed API & MCP Layers
Designing enterprise API ecosystems and MCP-based agentic systems that integrate AI safely into regulated, legacy environments.
›
Agent Framework Testing
Evaluating LangChain and CrewAI agents against real cloud backends — AWS Bedrock, Azure AI Foundry, GCP Vertex AI — to surface production failure modes.
›
RAG Compliance Enforcement
Designing multi-tier verification architectures that prevent LLMs from advice violation (e.g. money laundering or clinical bypass).
›
Structured Output & Context Bias
Benchmarking JSON output modes under schema complexity limits and analyzing middle-context retrieval degradation in small open-source models.
›
Prompt Injection Defense
Hardening models against system role prompt hijacking and jailbreaking using NeMo Guardrails and Llama Guard.

// Models in the Lab

Local Open-Source SLMs

Gemma 4 E2B / E4BGemma-2B / 4B / 9BLlama-3B / 7B / 8BOllama Local Inference

Proprietary API Models

Claude Opus / Sonnet / HaikuGemini Flash / Pro / Live (Voice & Audio)GPT-4o / 3.5

Cloud AI Platforms

AWS BedrockAzure AI FoundryGCP Vertex AI

Agent Frameworks

LangChainCrewAIMCP

Credentials & Certs

UT Austin PGP AI/MLGoogle Cloud Advanced L400

*Focus is predominantly placed on offline-first, small parameter models run on local hardware to establish developer and cost constraints.

// Visual Testing Workflow

Applied Research

Translating academic paper methodologies and arXiv benchmarks into runnable local test code.

→

Agentic Engineering

Orchestrating complex test suites and pipelines using advanced coding agents (Claude Code, Gemini Assist).

→

Statistical Validation

Analyzing boundary conditions, response latency, and compliance rate across thousands of runs.

→

Open Disclosures

Publishing raw, honest failure modes and limitations rather than idealized happy-path demonstrations.