A running list of what I'm actively exploring in AI. Completed items are marked with strikethrough.
- MIT OpenCourseWare 15.773 Hands-On Deep Learning
- Strategies for Ed-Fi API tool use by LLMs (MCP or other)
- Current strategies for hallucination detection
- Hallucination detection: LLM-as-judge
- Learning to critically read and analyze LLM benchmarks & leaderboards
- LLM output validation / evaluation
- Hallucination detection: Semantic entropy
- Hallucination detection: Bayesian estimation of semantic entropy
- LLM tool use mechanics
- Token efficiency with tool calling
- LLM-hosted tool calling costs and implications
- Forcing JSON/other schema responses by having the LLM call a 'fake' tool
- Security considerations with LLM tool use
- MCP server primitives: tools, resources, prompts
- Nuances between function/tool calling and various packaged LLM features (agents, memory, etc.)
- Tokenization, Byte Pair Encoding
- Prompt caching
- Token pricing: input, output, cached, etc.
- Token count/cost differences by format (JSON, YAML, code, prose, etc.)
- Designing LLM conversations for cost efficiency
- Developing and testing intuition on model selection (Claude Haiku vs. Sonnet vs. Opus)
- LLM batch processing
- Anthropic's Contextual Retrieval approach
- RAG pipelines and RAG types (traditional with embeddings, prompt RAG / table-of-contents pattern, ELITE, full-text search, Graph RAG, etc.), vector databases, semantic search
- Pipeline pattern: query -> decompose -> run each subquery -> rerank summed results
- Generating embeddings via Azure AI tools
- Embedding models, tradeoffs, and how to choose
- Semantic math with embeddings
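A few of the items above are concrete enough to sketch. Byte Pair Encoding, for instance, builds a vocabulary by repeatedly merging the most frequent adjacent token pair. A minimal toy version of one merge step (real tokenizers operate on bytes and learn thousands of merges):

```python
from collections import Counter

def bpe_merge_step(tokens):
    """One Byte Pair Encoding merge: fuse the most frequent adjacent pair."""
    pairs = Counter(zip(tokens, tokens[1:]))
    if not pairs:
        return tokens
    (a, b), _ = pairs.most_common(1)[0]  # ties break by first occurrence
    merged, i = [], 0
    while i < len(tokens):
        if i + 1 < len(tokens) and tokens[i] == a and tokens[i + 1] == b:
            merged.append(a + b)  # fuse the winning pair into one token
            i += 2
        else:
            merged.append(tokens[i])
            i += 1
    return merged

tokens = bpe_merge_step(list("banana"))  # 'an' occurs most often as a pair
```

Running this repeatedly until no pair repeats is the core of how BPE vocabularies are trained.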
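The "query -> decompose -> run each -> rerank summed results" pipeline from the RAG items can also be sketched end to end. This toy version uses naive word-overlap scoring and splits on " and " purely for illustration; a real system would use an LLM for decomposition and a vector or full-text index for retrieval:

```python
def decompose(query):
    # Toy decomposition: split a compound question on " and "
    # (a real pipeline would prompt an LLM to generate subqueries)
    return [q.strip() for q in query.split(" and ")]

def search(subquery, corpus):
    # Toy retrieval: score each doc by word overlap with the subquery
    qwords = set(subquery.lower().split())
    return {doc: len(qwords & set(doc.lower().split())) for doc in corpus}

def pipeline(query, corpus, top_k=2):
    # query -> decompose -> run each subquery -> rerank by summed scores
    totals = {doc: 0 for doc in corpus}
    for sub in decompose(query):
        for doc, score in search(sub, corpus).items():
            totals[doc] += score
    return sorted(corpus, key=lambda d: -totals[d])[:top_k]

corpus = [
    "prompt caching lowers input token cost",
    "vector databases power semantic search",
    "graph RAG links entities across documents",
]
top = pipeline("prompt caching and token cost", corpus, top_k=1)
```

Summing per-subquery scores is the simplest reranking choice; reciprocal rank fusion or a cross-encoder reranker are common upgrades.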
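Finally, "semantic math with embeddings" refers to vector arithmetic like king - man + woman ≈ queen. A sketch with made-up 3-d vectors (real embedding models produce hundreds or thousands of dimensions; these values are invented to make the analogy work):

```python
import math

def cosine(u, v):
    # Cosine similarity: dot product over the product of magnitudes
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

# Hypothetical embeddings, hand-crafted so the classic analogy holds
emb = {
    "king":  [0.9, 0.8, 0.1],
    "queen": [0.9, 0.1, 0.8],
    "man":   [0.2, 0.9, 0.1],
    "woman": [0.2, 0.1, 0.9],
    "apple": [0.1, 0.5, 0.5],
}

# king - man + woman, element-wise
target = [k - m + w for k, m, w in zip(emb["king"], emb["man"], emb["woman"])]

# Nearest remaining word by cosine similarity
best = max(
    (w for w in emb if w not in {"king", "man", "woman"}),
    key=lambda w: cosine(emb[w], target),
)
```

Semantic search is the same cosine-similarity machinery, just comparing a query vector against document vectors instead of doing analogy arithmetic.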
