References

Every external source cited in this book, grouped by tier in descending authority.

Cited sources

These are the sources cited inline via <Citation>.

T1 · Official · 86 entries

Vendor-official documentation or release notes. Highest trust for factual claims about the vendor’s own tool.

How we contain Claude across products

Max McGuinness, Mikaela Grace, Jiri De Jonghe, Jake Eaton, and Abel Ribbink 2026 captured 2026-05-29 tool: claude-code

original id: anthropic-containment
How Claude Code works in large codebases: Best practices and where to start

Anthropic 2026 captured 2026-05-26 tool: claude-code

original id: anthropic-large-codebases
Seeing like an agent: how we design tools in Claude Code

Thariq Shihipar (Anthropic) 2026 captured 2026-05-26 tool: claude-code

original id: anthropic-seeing-like-agent
Trustworthy agents in practice

Anthropic 2026 captured 2026-05-27 tool: claude-code

original id: anthropic-trustworthy-agents
Scaling Managed Agents: Decoupling the brain from the hands

Anthropic (Lance Martin, Gabe Cemaj, Michael Cohen) 2026 captured 2026-05-27 tool: claude-code

original id: anthropic-managed-agents
How and when to use subagents in Claude Code

Anthropic 2026 captured 2026-05-27 tool: claude-code

original id: anthropic-subagents-blog
How we built Claude Code auto mode: a safer way to skip permissions

John Hughes (Anthropic) 2026 captured 2026-05-27 tool: claude-code

original id: claude-code-auto-mode
Bringing Code Review to Claude Code

Anthropic 2026 captured 2026-05-27 tool: claude-code

original id: claude-code-review-blog
The 2026 MCP Roadmap

David Soria Parra (Lead Maintainer) 2026 captured 2026-05-27 tool: cross-tool

original id: mcp-roadmap-2026
Measuring AI agent autonomy in practice

McCain, Millar, Huang et al. (Anthropic) 2026 captured 2026-05-27 tool: claude-code

original id: anthropic-measuring-autonomy
Donating the Model Context Protocol and establishing the Agentic AI Foundation

Anthropic 2025 captured 2026-05-27 tool: cross-tool

original id: anthropic-mcp-donation
Effective harnesses for long-running agents

Justin Young 2025 captured 2026-05-27 tool: claude-code

original id: effective-harnesses
Key Changes (Changelog) — MCP Specification 2025-11-25

Model Context Protocol maintainers 2025 captured 2026-05-27 tool: cross-tool

original id: mcp-changelog
Specification — Model Context Protocol

Model Context Protocol contributors 2025 captured 2026-05-27 tool: cross-tool

original id: mcp-spec
Architecture — Model Context Protocol Specification 2025-11-25

Model Context Protocol contributors 2025 captured 2026-05-27 tool: cross-tool

original id: mcp-spec-architecture
Authorization — Model Context Protocol Specification (revision 2025-11-25)

Model Context Protocol contributors 2025 captured 2026-05-27 tool: cross-tool

original id: mcp-spec-authorization
Lifecycle — Model Context Protocol Specification 2025-11-25

Model Context Protocol contributors 2025 captured 2026-05-27 tool: cross-tool

original id: mcp-spec-lifecycle
Prompts — Model Context Protocol Specification (revision 2025-11-25)

Model Context Protocol contributors 2025 captured 2026-05-27 tool: cross-tool

original id: mcp-spec-prompts
Resources — Model Context Protocol Specification (revision 2025-11-25)

Model Context Protocol contributors 2025 captured 2026-05-27 tool: cross-tool

original id: mcp-spec-resources
Tools — Model Context Protocol Specification (revision 2025-11-25)

Model Context Protocol contributors 2025 captured 2026-05-27 tool: cross-tool

original id: mcp-spec-tools
Transports — Model Context Protocol Specification 2025-11-25

Model Context Protocol contributors 2025 captured 2026-05-27 tool: cross-tool

original id: mcp-spec-transports
Equipping agents for the real world with Agent Skills

Anthropic 2025 captured 2026-05-26 tool: claude-code

original id: anthropic-skills
Building agents with the Claude Agent SDK

Thariq Shihipar et al. 2025 captured 2026-05-27 tool: claude-code

original id: building-agents-with-the-sdk
Effective context engineering for AI agents

Prithvi Rajasekaran, Ethan Dixon, Carly Ryan, and Jeremy Hadfield 2025 captured 2026-05-27 tool: claude-code

original id: effective-context-engineering
Piloting Claude in Chrome

Anthropic 2025 captured 2026-05-27 tool: claude-code

original id: anthropic-claude-chrome
How we built our multi-agent research system

Anthropic 2025 captured 2026-05-27 tool: claude-code

original id: anthropic-multi-agent-research
Code execution with MCP: Building more efficient agents

Jones & Kelly (Anthropic) 2025 captured 2026-05-26 tool: claude-code

original id: anthropic-code-exec-mcp
Writing effective tools for agents — with agents

Aizawa (Anthropic) 2025 captured 2026-05-26 tool: claude-code

original id: anthropic-writing-tools
Building effective agents

Erik Schluntz and Barry Zhang 2024 captured 2026-05-27 tool: claude-code

original id: building-effective-agents
A statistical approach to model evaluations

Anthropic 2024 captured 2026-05-27 tool: cross-tool

original id: anthropic-statistical-evals
Building evals

Anthropic 2024 captured 2026-05-27 tool: claude-code

original id: anthropic-building-evals-cookbook
Agent SDK overview

Anthropic captured 2026-05-27 tool: claude-code

original id: agent-sdk-overview
Inspect — Options

UK AI Security Institute captured 2026-05-27 tool: cross-tool

original id: aisi-inspect-options
Using agent memory

Anthropic captured 2026-05-26 tool: claude-code

original id: anthropic-agent-memory
Get structured output from agents

Anthropic captured 2026-05-27 tool: claude-code

original id: anthropic-agent-sdk-structured-outputs
Batch processing — Claude Docs

Anthropic captured 2026-05-27 tool: claude-code

original id: anthropic-batch-processing
Best practices for Claude Code

Anthropic captured 2026-05-26 tool: claude-code

original id: anthropic-cc-best-practices
Explore the context window

Anthropic captured 2026-05-27 tool: claude-code

original id: anthropic-context-window
Define tools

Anthropic captured 2026-05-27 tool: claude-code

original id: anthropic-define-tools
Define success criteria and build evaluations

Anthropic captured 2026-05-27 tool: claude-code

original id: anthropic-develop-tests
Using the Evaluation Tool

Anthropic captured 2026-05-27 tool: claude-code

original id: anthropic-eval-tool
Handle tool calls

Anthropic captured 2026-05-27 tool: claude-code

original id: anthropic-handle-tool-calls
Increase output consistency

Anthropic captured 2026-05-27 tool: claude-code

original id: anthropic-increase-consistency
Prompting Claude for JSON mode

Anthropic captured 2026-05-27 tool: claude-code

original id: anthropic-json-mode-cookbook
Configure permissions

Anthropic captured 2026-05-26 tool: claude-code

original id: anthropic-permissions
Discover and install prebuilt plugins through marketplaces

Anthropic captured 2026-05-27 tool: claude-code

original id: anthropic-plugins
Pricing - Claude API Docs

Anthropic captured 2026-05-27 tool: claude-code

original id: anthropic-pricing
Prompt caching

Anthropic captured 2026-05-26 tool: claude-code

original id: anthropic-prompt-caching
Prompt engineering overview

Anthropic captured 2026-05-27 tool: claude-code

original id: anthropic-prompt-eng-overview
Console prompting tools

Anthropic captured 2026-05-27 tool: claude-code

original id: anthropic-prompt-improver
Prompting best practices

Anthropic captured 2026-05-27 tool: claude-code

original id: anthropic-prompting-best-practices
Beyond permission prompts: making Claude Code more secure and autonomous

Anthropic captured 2026-05-26 tool: claude-code

original id: anthropic-sandboxing-blog
Configure the sandboxed Bash tool

Anthropic captured 2026-05-26 tool: claude-code

original id: anthropic-sandboxing-docs
Claude Code settings

Anthropic captured 2026-05-26 tool: claude-code

original id: anthropic-settings
Skill authoring best practices

Anthropic captured 2026-05-27 tool: claude-code

original id: anthropic-skills-best-practices
Extend Claude with skills

Anthropic captured 2026-05-27 tool: claude-code

original id: anthropic-skills-cc
Agent Skills (overview)

Anthropic captured 2026-05-27 tool: claude-code

original id: anthropic-skills-overview
Agent Skills in the SDK

Anthropic captured 2026-05-27 tool: claude-code

original id: anthropic-skills-sdk
Strict tool use

Anthropic captured 2026-05-27 tool: claude-code

original id: anthropic-strict-tool-use
Structured outputs

Anthropic captured 2026-05-27 tool: claude-code

original id: anthropic-structured-outputs
Create custom subagents

Anthropic captured 2026-05-27 tool: claude-code

original id: anthropic-subagents-docs
Tool search tool

Anthropic captured 2026-05-27 tool: claude-code

original id: anthropic-tool-search
Define tools

Anthropic captured 2026-05-27 tool: claude-code

original id: anthropic-tool-use-define
Tool use with Claude

Anthropic captured 2026-05-27 tool: claude-code

original id: anthropic-tool-use-overview
Prompting best practices: use XML tags

Anthropic captured 2026-05-26 tool: claude-code

original id: anthropic-xml-tags
How Claude remembers your project

Anthropic captured 2026-05-26 tool: claude-code

original id: cc-memory
Subagents in the SDK

Anthropic captured 2026-05-27 tool: claude-code

original id: claude-agent-sdk-subagents
Track team usage with analytics

Anthropic captured 2026-05-27 tool: claude-code

original id: claude-code-analytics
Explore the .claude directory

Anthropic captured 2026-05-27 tool: claude-code

original id: claude-code-claude-directory
Common workflows

Anthropic captured 2026-05-27 tool: claude-code

original id: claude-code-common-workflows
Manage costs effectively

Anthropic captured 2026-05-27 tool: claude-code

original id: claude-code-costs
Claude Code GitHub Actions

Anthropic captured 2026-05-27 tool: claude-code

original id: claude-code-github-actions
Run Claude Code programmatically

Anthropic captured 2026-05-27 tool: claude-code

original id: claude-code-headless
Monitoring

Anthropic captured 2026-05-27 tool: claude-code

original id: claude-code-monitoring
Choose a permission mode

Anthropic captured 2026-05-27 tool: claude-code

original id: claude-code-permission-modes
How Claude Code uses prompt caching - Claude Code Docs

Anthropic captured 2026-05-27 tool: claude-code

original id: claude-code-prompt-caching
Code Review

Anthropic captured 2026-05-27 tool: claude-code

original id: claude-code-review-docs
Security

Anthropic captured 2026-05-27 tool: claude-code

original id: claude-code-security-docs
Persist sessions to external storage

Anthropic captured 2026-05-27 tool: claude-code

original id: claude-code-session-storage
Handle approvals and user input

Anthropic captured 2026-05-27 tool: claude-code

original id: claude-code-user-input
Demystifying evals for AI agents

Grace, Hadfield, Olivares and De Jonghe (Anthropic) captured 2026-05-27 tool: cross-tool

original id: demystifying-evals
How Claude Code works

Anthropic captured 2026-05-27 tool: claude-code

original id: how-claude-code-works
The 2026-07-28 MCP Specification Release Candidate

David Soria Parra and Den Delimarsky (Lead Maintainers) captured 2026-05-27 tool: cross-tool

original id: mcp-rc-2026
CVE-2025-32711

NVD (NIST National Vulnerability Database) captured 2026-05-27 tool: cross-tool

original id: nvd-cve-2025-32711
Semantic Conventions for GenAI agent and framework spans

OpenTelemetry Authors captured 2026-05-27 tool: cross-tool

original id: otel-genai-agent-spans
Semantic conventions for generative AI metrics

OpenTelemetry Authors captured 2026-05-27 tool: cross-tool

original id: otel-genai-metrics

T2 · Release notes · 14 entries

Release blog posts, changelogs, and conference talks. Trustworthy for claims about intent and availability.

The Anatomy of an Agent Harness

Vivek Trivedy 2026 captured 2026-05-27 tool: cross-tool

original id: trivedy-anatomy-agent-harness
Context Engineering

The LangChain Team 2025 captured 2026-05-26 tool: cross-tool

original id: langchain-context
LangMem SDK for agent long-term memory

The LangChain Team 2025 captured 2026-05-26 tool: cross-tool

original id: langmem
Nx and AI — Why They Work so Well Together

Victor Savkin (Nx) 2025 captured 2026-05-26 tool: cross-tool

original id: nx-savkin-ai
Architectural Decision Records (ADRs)

ADR GitHub organization captured 2026-05-26 tool: cross-tool

original id: adr-github
AGENTS.md

Agentic AI Foundation (Linux Foundation) captured 2026-05-26 tool: cross-tool

original id: agents-md
Anthropic's Prompt Engineering Interactive Tutorial

Anthropic captured 2026-05-27 tool: claude-code

original id: anthropic-prompt-eng-tutorial
anthropics/skills: Public repository for Agent Skills

Anthropic captured 2026-05-27 tool: claude-code

original id: anthropics-skills-repo
Introduction (What is CrewAI?)

CrewAI (documentation) captured 2026-05-27 tool: cross-tool

original id: crewai-introduction
Long-term memory

LangChain captured 2026-05-26 tool: cross-tool

original id: langchain-longterm-memory
LangGraph overview

LangChain captured 2026-05-27 tool: cross-tool

original id: langgraph-overview
LangGraph Multi-Agent Supervisor

LangChain (langchain-ai) captured 2026-05-27 tool: cross-tool

original id: langgraph-supervisor
LangGraph Multi-Agent Swarm

LangChain (langchain-ai) captured 2026-05-27 tool: cross-tool

original id: langgraph-swarm
OpenAI Agents SDK

OpenAI (Agents SDK documentation) captured 2026-05-27 tool: cross-tool

original id: openai-agents-sdk

T3 · Practitioner · 61 entries

Respected community writing with a durable argument the author has defended over time.

State of AI Agent Memory 2026: Benchmarks, Architectures & Production Gaps

Mem0 Engineering Team 2026 captured 2026-05-26 tool: cross-tool

original id: mem0-state-2026
Classifier Context Rot: Monitor Performance Degrades with Context Length

Martin & Roger 2026 captured 2026-05-26 tool: cross-tool

original id: martin-classifier-rot
Multi-Agents: What's Actually Working

Walden Yan (Cognition) 2026 captured 2026-05-27 tool: cross-tool

original id: cognition-working
Coding Agents in the Monorepo: Why Context Windows and 50-Service Repos Don't Mix

Tian Pan 2026 captured 2026-05-26 tool: cross-tool

original id: pan-monorepo
AI agents in monorepos: what to configure differently from a single-product repo

Dave Barnwell 2026 captured 2026-05-26 tool: cross-tool

original id: barnwell-monorepo
ShadowPrompt: How Any Website Could Have Hijacked Claude's Chrome Extension

Oren Yomtov (Koi Research) 2026 captured 2026-05-27 tool: cross-tool

original id: shadowprompt-koi
My AI Adoption Journey

Mitchell Hashimoto 2026 captured 2026-05-27 tool: cross-tool

original id: hashimoto-harness-engineering
Evaluating AGENTS.md: Are Repository-Level Context Files Helpful for Coding Agents?

Gloaguen, Mündler, Müller, Raychev, Vechev (ETH Zurich) 2026 captured 2026-05-26 tool: cross-tool

original id: eth-agentsmd-study
My .md files vs Claude's memory tool: a practitioner comparison

Andreas Belitz 2026 captured 2026-05-26 tool: cross-tool

original id: belitz-md-vs-memory
Harness engineering for coding agent users

Birgitta Böckeler 2026 captured 2026-05-26 tool: cross-tool

original id: boeckeler-harness
Maintainability sensors for coding agents

Birgitta Böckeler 2026 captured 2026-05-26 tool: cross-tool

original id: boeckeler-sensors
Agent Memory Engineering

Nicolas Bustamante 2026 captured 2026-05-26 tool: cross-tool

original id: bustamante-agent-memory
Claude Code v2.1.62 — Server-Side KV Cache Stale Context Regression (P1)

Taylor (issue reporter) 2026 captured 2026-05-26 tool: claude-code

original id: cc-cache-regression
Why your AI agent doesn't actually remember anything

Ed Huang 2026 captured 2026-05-26 tool: cross-tool

original id: huang-agent-memory
Don't Break the Cache: An Evaluation of Prompt Caching for Long-Horizon Agentic Tasks

Lumer et al. 2026 captured 2026-05-26 tool: cross-tool

original id: lumer-cache
Agentic Much? Adoption of Coding Agents on GitHub

Robbes, Matricon, Degueule, Hora, Zacchiroli 2026 captured 2026-05-26 tool: cross-tool

original id: robbes-adoption
We removed 80% of our agent's tools

Andrew Qu (Vercel) 2025 captured 2026-05-27 tool: cross-tool

original id: vercel-removed-tools
Writing a good CLAUDE.md

Kyle (HumanLayer) 2025 captured 2026-05-26 tool: cross-tool

original id: humanlayer-claudemd
How we're making GitHub Copilot smarter with fewer tools

Anisha Agarwal and Connor Peet (GitHub) 2025 captured 2026-05-27 tool: cross-tool

original id: github-fewer-tools
Advanced Context Engineering for Coding Agents (ACE-FCA)

Dex Horthy (HumanLayer) 2025 captured 2026-05-26 tool: cross-tool

original id: humanlayer-ace
When MCP Servers Attack: Taxonomy, Feasibility, and Mitigation

Zhao, Liu, Ruan, Li, Liang 2025 captured 2026-05-27 tool: cross-tool

original id: zhao-mcp-attack
When Instructions Multiply: Measuring and Estimating LLM Capabilities of Multiple Instructions Following

Harada et al. (EMNLP) 2025 captured 2026-05-26 tool: cross-tool

original id: harada-manyifeval
An AI-powered coding tool wiped out a software company's database

Beatrice Nolan (Fortune) 2025 captured 2026-05-26 tool: cross-tool

original id: replit-fortune
Vibe coding service Replit deleted user's production database, faked data, told fibs galore

Simon Sharwood (The Register) 2025 captured 2026-05-26 tool: cross-tool

original id: replit-register
Incident 1152: LLM-Driven Replit Agent Executed Unauthorized Destructive Commands During Code Freeze

AI Incident Database (Responsible AI Collaborative) 2025 captured 2026-05-26 tool: cross-tool

original id: replit-aiid
How Many Instructions Can LLMs Follow at Once?

Jaroslawicz et al. 2025 captured 2026-05-26 tool: cross-tool

original id: jaroslawicz-ifscale
Using Architecture Decision Records (ADRs) with AI coding assistants

Chris Swan 2025 captured 2026-05-26 tool: cross-tool

original id: swan-adrs
How Not to Detect Prompt Injections with an LLM

Choudhary, Anshumaan, Palumbo, Jha 2025 captured 2026-05-27 tool: cross-tool

original id: kad-dataflip-choudhary
Context Rot: How Increasing Input Tokens Impacts LLM Performance

Hong, Troynikov & Huber (Chroma Research) 2025 captured 2026-05-26 tool: cross-tool

original id: chroma-context-rot
On 'context engineering': 'filling the context window' (X post)

Andrej Karpathy 2025 captured 2026-05-30 tool: cross-tool

original id: karpathy-context-engineering
On 'context engineering' over 'prompt engineering' (X post)

Tobi Lütke (Shopify) 2025 captured 2026-05-30 tool: cross-tool

original id: lutke-context-engineering
Andrej Karpathy: Software in the Age of AI

Andrej Karpathy (Latent Space transcript-mirror) 2025 captured 2026-05-27 tool: cross-tool

original id: karpathy-decade-of-agents
The lethal trifecta for AI agents: private data, untrusted content, and external communication

Simon Willison 2025 captured 2026-05-27 tool: cross-tool

original id: willison-trifecta
Mitigating Posterior Salience Attenuation in Long-Context LLMs with Positional Contrastive Decoding

Xiao et al. 2025 captured 2026-05-26 tool: cross-tool

original id: xiao-pcd
A Multi-Dimensional Constraint Framework for Evaluating and Improving Instruction Following in LLMs

Ye et al. 2025 captured 2026-05-26 tool: cross-tool

original id: ye-muldimif
Build MCP Tools Like Ogres... With Layers

Richard Moot (Block) 2025 captured 2026-05-27 tool: cross-tool

original id: block-layered-tools
Reasoning on Multiple Needles In A Haystack

Wang 2025 captured 2026-05-26 tool: cross-tool

original id: wang-mniah
NoLiMa: Long-Context Evaluation Beyond Literal Matching

Modarressi et al. (ICML) 2025 captured 2026-05-26 tool: cross-tool

original id: nolima
AI Agent Memory Management: When Markdown Files Are All You Need?

Yaohua Chen (ImagineX) 2025 captured 2026-05-26 tool: cross-tool

original id: chen-markdown-memory
12-Factor Agents — Factor 3: Own your context window

Dex Horthy (HumanLayer) 2025 captured 2026-05-26 tool: cross-tool

original id: horthy-12factor
Context Engineering for AI Agents: Lessons from Building Manus

Yichao Ji (Manus) 2025 captured 2026-05-26 tool: cross-tool

original id: ji-manus
Found in the Middle: Calibrating Positional Attention Bias Improves Long Context Utilization

Hsieh et al. (ACL Findings) 2024 captured 2026-05-26 tool: cross-tool

original id: hsieh-found-middle
RULER: What's the Real Context Size of Your Long-Context Language Models?

Hsieh et al. (NVIDIA, COLM) 2024 captured 2026-05-26 tool: cross-tool

original id: ruler
MemGPT: Towards LLMs as Operating Systems

Packer et al. 2023 captured 2026-05-26 tool: cross-tool

original id: memgpt
Lost in the Middle: How Language Models Use Long Contexts

Liu et al. (TACL) 2023 captured 2026-05-26 tool: cross-tool

original id: liu-lost-middle
ChatGPT Plugins: Data Exfiltration via Images and Cross Plugin Request Forgery

Johann Rehberger (wunderwuzzi) 2023 captured 2026-05-27 tool: cross-tool

original id: rehberger-markdown-exfil
Generative Agents: Interactive Simulacra of Human Behavior

Park et al. 2023 captured 2026-05-26 tool: cross-tool

original id: generative-agents
API Contract Definitions: Contract first, implementation first, OpenAPI, GraphQL, gRPC

Lena Fuhrimann 2022 captured 2026-05-26 tool: cross-tool

original id: fuhrimann-contract-first
Documenting Architecture Decisions

Michael Nygard 2011 captured 2026-05-26 tool: cross-tool

original id: nygard-adr
Defeating Prompt Injections by Design

Debenedetti, Shumailov, Fan, Hayes, Carlini, et al. captured 2026-05-27 tool: cross-tool

original id: camel-debenedetti
Don't Build Multi-Agents

Walden Yan (Cognition) captured 2026-05-27 tool: cross-tool

original id: cognition-dont-build
Agentic Browser Security: Indirect Prompt Injection in Perplexity Comet

Artem Chaikin and Shivan Kaul Sahib (Brave) captured 2026-05-27 tool: cross-tool

original id: comet-brave
Design Patterns for Securing LLM Agents against Prompt Injections

Beurer-Kellner, Buesser, Creţu, Debenedetti, et al. captured 2026-05-27 tool: cross-tool

original id: design-patterns-beurer-kellner
Breaking down 'EchoLeak', the First Zero-Click AI Vulnerability Enabling Data Exfiltration from Microsoft 365 Copilot

Itay Ravia (Cato Networks / Aim Labs) captured 2026-05-27 tool: cross-tool

original id: echoleak-catonetworks
Bypassing LLM Guardrails: An Empirical Analysis of Evasion Attacks against Prompt Injection and Jailbreak Detection Systems

Hackett, Birch, Trawicki, Suri, Garraghan captured 2026-05-27 tool: cross-tool

original id: guardrail-evasion-hackett
LlamaFirewall: An open source guardrail system for building secure AI agents

Chennabasappa, Nikolaidis, Song, et al. (Meta) captured 2026-05-27 tool: cross-tool

original id: llamafirewall
Meta SecAlign: A Secure Foundation LLM Against Prompt Injection Attacks

Chen, Zharmagambetov, Wagner, Guo (Meta) captured 2026-05-27 tool: cross-tool

original id: meta-secalign
LLM01:2025 Prompt Injection

OWASP Gen AI Security Project captured 2026-05-27 tool: cross-tool

original id: owasp-llm01
LLM03:2025 Supply Chain

OWASP Gen AI Security Project captured 2026-05-27 tool: cross-tool

original id: owasp-llm03
WASP: Benchmarking Web Agent Security Against Prompt Injection Attacks

Evtimov, Zharmagambetov, Grattafiori, Guo, Chaudhuri captured 2026-05-27 tool: cross-tool

original id: wasp-evtimov
Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena

Zheng et al. captured 2026-05-27 tool: cross-tool

original id: zheng-judging-llm-judge

T4 · Conjecture · no entries yet

Blog posts, tweets, or unverified claims. Pointers to investigate, not authorities.

No sources at this tier yet.