AI Fundamentals. Part 13. The Model Landscape
This video from Pavel Spesivtsev's lecture surveys the current landscape of large language models, grouping them by how they are accessed and what they can do.

Model Categories: When selecting a model for a solution, the choices generally fall into three categories: proprietary, open-source (or open-weights), and self-hosted.

Proprietary Models: The major players are OpenAI (with GPT-5.3), Anthropic (Claude), and Google (Gemini).

OpenAI: Their models are typically multimodal, meaning a single model can process audio, video, and images. They are often preferred for workloads that require voice processing.

Anthropic: Claude can recognize and process images but lacks audio processing capabilities.

Google: The Gemini family is highlighted as a leading choice for teams already integrated into Google's infrastructure, and is described as a highly advanced, bleeding-edge model family. Pavel notes that while OpenAI was first to market, the transformer architecture that drives these technologies originated at Google.

Open-Weights Models: These models can be downloaded and run on the user's own hardware, though users typically do not get access to the underlying training datasets.

Regional Considerations: The speaker advises caution with Chinese models because their training data may carry political biases.

Llama: While considered an alternative to proprietary options, Llama currently trails the rest of the market in intelligence and performance, according to the speaker, though it may remain suitable for specific retrieval or legal workflows.

This is Day 1, Module 1 of the AI Operator Workshop, a 5-day in-person intensive in San Francisco covering secure AI deployment, n8n automation, voice agents, penetration testing, and real-time digital employees.

🔗 Next cohort: https://luma.com/aistartacademy
📍 SF Mission District | hello@aistartacademy.com