Current Explorations

A running list of what I'm actively exploring in AI. Completed items are marked with strikethrough.

  • MIT OpenCourseWare 15.773 Hands-On Deep Learning
  • Strategies for Ed-Fi API tool use by LLMs (MCP or other)
  • Current strategies for hallucination detection
  • Hallucination detection: LLM-as-judge
  • Learning to critically read and analyze LLM Benchmarks & Leaderboards
  • LLM output validation / evaluation
  • Hallucination detection: Semantic entropy
  • Hallucination detection: Bayesian estimation of semantic entropy
  • LLM tool use mechanics
  • Token efficiency with tool calling
  • LLM-hosted tool calling costs and implications
  • Forcing JSON/other schema responses by requiring the LLM to call a 'fake' tool
  • Security considerations with LLM tool use
  • MCP Server primitives: tools, resources, prompts
  • Nuances between function/tool calling and various packaged LLM features (agents, memory, etc.)
  • Tokenization, Byte Pair Encoding
  • Prompt Caching
  • Token pricing: input, output, cached, etc.
  • Token count/cost difference by format (JSON, YAML, code, prose, etc.)
  • Designing LLM conversations for efficient cost
  • Developing and testing intuition on model selection (Claude haiku vs sonnet vs opus)
  • LLM batch processing
  • Anthropic's Contextual Retrieval offering
  • RAG pipeline, RAG types (traditional with embeddings, Prompt RAG / table of contents pattern, ELITE, full text search, Graph RAG, etc), vector databases, semantic search
  • pipeline: query -> decompose -> run each -> rerank summed results
  • Generating embeddings via Azure AI tools
  • Embeddings models, tradeoffs, and how to choose
  • Semantic math with embeddings
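A few of the items above are concrete enough to sketch. On the 'fake' tool trick for forcing schema-conforming output: the idea is to declare a single tool whose input schema is the desired output shape, then force the model to call it, so the tool arguments become the structured response. A minimal sketch of the request payload, assuming an Anthropic-style Messages API; the tool name `record_extraction`, its schema, and the model id are all made up for illustration:

```python
# Hypothetical tool whose input_schema IS the JSON shape we want back.
extraction_tool = {
    "name": "record_extraction",
    "description": "Record structured fields extracted from the input text.",
    "input_schema": {
        "type": "object",
        "properties": {
            "title": {"type": "string"},
            "year": {"type": "integer"},
        },
        "required": ["title", "year"],
    },
}

# tool_choice forces the model to call this one tool, so its arguments
# arrive as schema-validated JSON rather than free-form prose.
request = {
    "model": "example-model-id",  # placeholder, not a real model name
    "max_tokens": 1024,
    "tools": [extraction_tool],
    "tool_choice": {"type": "tool", "name": "record_extraction"},
    "messages": [
        {"role": "user", "content": "Extract the title and year from: ..."}
    ],
}
```

The response's `tool_use` block would then carry the structured fields, which the caller reads without ever executing a real tool.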
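On tokenization and Byte Pair Encoding: the core loop is just "count adjacent symbol pairs, merge the most frequent pair, repeat." A toy character-level sketch (real tokenizers work over bytes with a learned merge table, which this omits):

```python
from collections import Counter

def most_frequent_pair(tokens):
    """Count adjacent symbol pairs and return the most frequent one."""
    pairs = Counter(zip(tokens, tokens[1:]))
    return max(pairs, key=pairs.get)

def merge_pair(tokens, pair):
    """Replace every occurrence of `pair` with one merged symbol."""
    merged, i = [], 0
    while i < len(tokens):
        if i < len(tokens) - 1 and (tokens[i], tokens[i + 1]) == pair:
            merged.append(tokens[i] + tokens[i + 1])
            i += 2
        else:
            merged.append(tokens[i])
            i += 1
    return merged

def bpe(text, num_merges):
    """Run `num_merges` rounds of pair merging on a character list."""
    tokens = list(text)
    for _ in range(num_merges):
        tokens = merge_pair(tokens, most_frequent_pair(tokens))
    return tokens
```

After two merges on "low lower lowest", the frequent "lo" and "low" sequences collapse into single tokens, which is the intuition behind common words becoming one token while rare words stay split.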
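The query -> decompose -> run each -> rerank pipeline above can be sketched end to end. Everything here is a stand-in: the docs are toy strings, `decompose` fakes LLM query decomposition with a string split, and `retrieve` scores by word overlap instead of embeddings; only the shape of the pipeline is the point:

```python
DOCS = [
    "prompt caching reduces input token cost",
    "semantic entropy flags likely hallucinations",
    "vector databases power semantic search",
    "tool calling lets an LLM invoke functions",
]

def decompose(query):
    # Stand-in for an LLM decomposition step: split on " and ".
    return [q.strip() for q in query.split(" and ")]

def retrieve(subquery, docs):
    # Toy retriever: score each doc by word overlap with the subquery.
    terms = set(subquery.lower().split())
    return {doc: len(terms & set(doc.split())) for doc in docs}

def pipeline(query, docs=DOCS, k=2):
    # Run each subquery, sum its scores per doc, rerank by total score.
    totals = {doc: 0 for doc in docs}
    for sub in decompose(query):
        for doc, score in retrieve(sub, docs).items():
            totals[doc] += score
    return sorted(docs, key=lambda d: totals[d], reverse=True)[:k]
```

Summing scores across subqueries before the final sort is the "rerank summed results" step; a production version would swap in an LLM for `decompose` and a vector or full-text index for `retrieve`.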
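And on semantic math with embeddings: the classic demonstration is vector arithmetic like king - man + woman landing near queen under cosine similarity. A sketch with tiny hand-made 2-d vectors (one royalty axis, one gender axis), not real model output:

```python
import math

# Toy "embeddings": [royalty, gender]. Real embeddings have hundreds
# of opaque dimensions; these are contrived so the arithmetic is visible.
VOCAB = {
    "king":  [0.9, 0.9],
    "queen": [0.9, -0.9],
    "man":   [0.1, 0.9],
    "woman": [0.1, -0.9],
}

def cosine(a, b):
    """Cosine similarity between two 2-d vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def nearest(vec, exclude=()):
    """Vocabulary word whose embedding is most similar to `vec`."""
    candidates = {w: v for w, v in VOCAB.items() if w not in exclude}
    return max(candidates, key=lambda w: cosine(vec, candidates[w]))

# king - man + woman, element-wise
target = [k - m + w for k, m, w in
          zip(VOCAB["king"], VOCAB["man"], VOCAB["woman"])]
```

Excluding the three input words (standard practice in analogy tests, since the inputs themselves are often the nearest neighbors), `nearest(target)` comes back as "queen".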