Current Explorations

A running list of what I'm actively exploring in AI. Completed items are marked with strikethrough.

  • MIT OpenCourseWare 15.773 Hands-On Deep Learning
  • Strategies for Ed-Fi API tool use by LLMs (MCP or other)
  • Current strategies for hallucination detection
  • Hallucination detection: LLM-as-judge
  • Learning to critically read and analyze LLM Benchmarks & Leaderboards
  • LLM output validation / evaluation
  • Hallucination detection: Semantic entropy
  • Hallucination detection: Bayesian estimation of semantic entropy
  • LLM tool use mechanics
  • Token efficiency with tool calling
  • LLM-hosted tool calling costs and implications
  • Forcing JSON/other schema responses by requiring the LLM to call a 'fake' tool
  • Security considerations with LLM tool use
  • MCP Server primitives: tools, resources, prompts
  • Nuances between function/tool calling and various packaged LLM features (agents, memory, etc.)
  • Tokenization, Byte Pair Encoding
  • Prompt Caching
  • Token pricing: input, output, cached, etc.
  • Token count/cost difference by format (JSON, YAML, code, prose, etc.)
  • Designing LLM conversations for efficient cost
  • Developing and testing intuition on model selection (Claude haiku vs sonnet vs opus)
  • LLM batch processing
  • Anthropic's Contextual Retrieval offering
  • RAG pipeline, RAG types (traditional with embeddings, Prompt RAG / table of contents pattern, ELITE, full text search, Graph RAG, etc), vector databases, semantic search
  • pipeline: query -> decompose -> run each -> rerank summed results
  • Generating embeddings via Azure AI tools
  • Embeddings models, tradeoffs, and how to choose
  • Semantic math with embeddings
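A few of the items above are concrete enough to sketch. On the 'fake' tool trick for forcing schema-conforming output: the idea is to declare a single tool whose input schema is the desired output shape, then force the model to call it, so the tool arguments become the structured response. A minimal sketch of the request payload, assuming an Anthropic-style Messages API; the tool name `record_extraction`, its schema, and the model id are all made up for illustration:

```python
# Hypothetical tool whose input_schema IS the JSON shape we want back.
extraction_tool = {
    "name": "record_extraction",
    "description": "Record structured fields extracted from the input text.",
    "input_schema": {
        "type": "object",
        "properties": {
            "title": {"type": "string"},
            "year": {"type": "integer"},
        },
        "required": ["title", "year"],
    },
}

# tool_choice forces the model to call this one tool, so its arguments
# arrive as schema-validated JSON rather than free-form prose.
request = {
    "model": "example-model-id",  # placeholder, not a real model name
    "max_tokens": 1024,
    "tools": [extraction_tool],
    "tool_choice": {"type": "tool", "name": "record_extraction"},
    "messages": [
        {"role": "user", "content": "Extract the title and year from: ..."}
    ],
}
```

The response's `tool_use` block would then carry the structured fields, which the caller reads without ever executing a real tool.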
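On tokenization and Byte Pair Encoding: the core loop is just "count adjacent symbol pairs, merge the most frequent pair, repeat." A toy character-level sketch (real tokenizers work over bytes with a learned merge table, which this omits):

```python
from collections import Counter

def most_frequent_pair(tokens):
    """Count adjacent symbol pairs and return the most frequent one."""
    pairs = Counter(zip(tokens, tokens[1:]))
    return max(pairs, key=pairs.get)

def merge_pair(tokens, pair):
    """Replace every occurrence of `pair` with one merged symbol."""
    merged, i = [], 0
    while i < len(tokens):
        if i < len(tokens) - 1 and (tokens[i], tokens[i + 1]) == pair:
            merged.append(tokens[i] + tokens[i + 1])
            i += 2
        else:
            merged.append(tokens[i])
            i += 1
    return merged

def bpe(text, num_merges):
    """Run `num_merges` rounds of pair merging on a character list."""
    tokens = list(text)
    for _ in range(num_merges):
        tokens = merge_pair(tokens, most_frequent_pair(tokens))
    return tokens
```

After two merges on "low lower lowest", the frequent "lo" and "low" sequences collapse into single tokens, which is the intuition behind common words becoming one token while rare words stay split.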
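The query -> decompose -> run each -> rerank pipeline above can be sketched end to end. Everything here is a stand-in: the docs are toy strings, `decompose` fakes LLM query decomposition with a string split, and `retrieve` scores by word overlap instead of embeddings; only the shape of the pipeline is the point:

```python
DOCS = [
    "prompt caching reduces input token cost",
    "semantic entropy flags likely hallucinations",
    "vector databases power semantic search",
    "tool calling lets an LLM invoke functions",
]

def decompose(query):
    # Stand-in for an LLM decomposition step: split on " and ".
    return [q.strip() for q in query.split(" and ")]

def retrieve(subquery, docs):
    # Toy retriever: score each doc by word overlap with the subquery.
    terms = set(subquery.lower().split())
    return {doc: len(terms & set(doc.split())) for doc in docs}

def pipeline(query, docs=DOCS, k=2):
    # Run each subquery, sum its scores per doc, rerank by total score.
    totals = {doc: 0 for doc in docs}
    for sub in decompose(query):
        for doc, score in retrieve(sub, docs).items():
            totals[doc] += score
    return sorted(docs, key=lambda d: totals[d], reverse=True)[:k]
```

Summing scores across subqueries before the final sort is the "rerank summed results" step; a production version would swap in an LLM for `decompose` and a vector or full-text index for `retrieve`.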
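And on semantic math with embeddings: the classic demonstration is vector arithmetic like king - man + woman landing near queen under cosine similarity. A sketch with tiny hand-made 2-d vectors (one royalty axis, one gender axis), not real model output:

```python
import math

# Toy "embeddings": [royalty, gender]. Real embeddings have hundreds
# of opaque dimensions; these are contrived so the arithmetic is visible.
VOCAB = {
    "king":  [0.9, 0.9],
    "queen": [0.9, -0.9],
    "man":   [0.1, 0.9],
    "woman": [0.1, -0.9],
}

def cosine(a, b):
    """Cosine similarity between two 2-d vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def nearest(vec, exclude=()):
    """Vocabulary word whose embedding is most similar to `vec`."""
    candidates = {w: v for w, v in VOCAB.items() if w not in exclude}
    return max(candidates, key=lambda w: cosine(vec, candidates[w]))

# king - man + woman, element-wise
target = [k - m + w for k, m, w in
          zip(VOCAB["king"], VOCAB["man"], VOCAB["woman"])]
```

Excluding the three input words (standard practice in analogy tests, since the inputs themselves are often the nearest neighbors), `nearest(target)` comes back as "queen".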