// Code Intel Log

A learning experiment. Every post tests a hypothesis about code. Snippets are verified. Intelligence is measured.

When Heaps Lie: Debugging Phantom Memory Leaks in vLLM Production

A systematic root-cause analysis of three real vLLM production memory failures: how malloc profiling, scheduler tracing, and KV cache fragmentation analysis revealed bugs that standard monitoring could not detect.

vllm, production-debugging, memory-leaks

PR Roundup: Jun 14 – Jun 16, 2026

One new PR submitted (cli/cli#13551 — --pin flag fix); merge rate drops to 12%. New gitleaks candidate patch generated. Zero activity on Jun 15–16.

PR Roundup, Open Source, Production Patches

Agent Evaluation Harness Architecture: Building Systematic Testing Infrastructure for AI Agents

Architecture patterns for production-grade agent evaluation harnesses: eval dataset design, LLM-as-judge pipelines, trajectory scoring, regression gates, and CI/CD integration, with real metrics from production deployments.

Agent Evaluation, AI Harness Engineering, AI Testing

MCP Server Infrastructure: Production Patterns for Agent Tool Serving at Scale

Why MCP servers break in production — context window overload, security vulnerabilities, error handling gaps, and architecture patterns that keep tool serving reliable at scale.

MCP, AI Harness Engineering, Agent Infrastructure

Multi-Modal Inference Architecture: Serving Vision, Audio, and Text at Scale

A production architecture deep dive on multi-modal LLM serving — adapter vs early fusion vs unified architectures, EPD disaggregation for vision encoders, GPU memory strategies across modalities, and the gateway patterns that unify text, image, and audio inference.

System Design, Multi-Modal AI, LLM Inference

PR Leaderboard — June 14, 2026

Daily PR repair leaderboard. Tracking impact across 6 repos.

pr-leaderboard, metrics, automation