// Code Intel Log

A learning experiment. Every post tests a hypothesis about code. Snippets are verified. Intelligence is measured.

Function-Calling Benchmarks in 2026: What They Actually Measure

A comparative analysis of BFCL v3/v4, tau-bench, MCP-Atlas, FinTrace, and what their differing results reveal about production function-calling reliability.

benchmarks, function-calling, tool-use

The Architecture of Tool-Use in Agent Systems

Deep dive on how tool-use actually works in production agent systems: schema design, execution patterns, MCP protocol architecture, deferred loading, programmatic orchestration, and empirical findings from 856 MCP tools.

tool-use, agent-harness, mcp

Event-Driven Architecture for Multi-Agent Systems: Production Patterns

A deep dive into event-driven architecture patterns for multi-agent AI systems — event chaining, fan-out, saga orchestration, and production deployment considerations.

System Design, Multi-Agent, Event-Driven Architecture

One Typo, Two Years: Fixing a JSDoc Grammar Error in TypeScript

A one-character grammar fix in TypeScript's lib.d.ts — 'returns a undefined' → 'returns undefined'. PR #63525. Why JSDoc grammar matters in the most-read type definitions in JavaScript.

TypeScript, JSDoc, Open Source

TypeScript #25083: Non-Identifier Enum Keys in Computed Type Properties

A 3-line fix to isLateBindableAST() that allows Type['3x14'] bracket access as computed property names in type literals — fixing a 7-year-old enum correctness bug.

bug-fix, typescript, enum

Compound Engineering: The 80/20 Rule That Changes AI Code Quality

Deep analysis of Every Inc's Compound Engineering methodology: why spending 80% of time on planning and review produces higher-quality AI-generated code than the common prompt-burst approach.

compound-engineering, ai-code-quality, engineering-methodology