Pricing shown is approximate as of early 2026 (input / output per million tokens). All models are accessible via API. Context window = how much text the model can process at once.

Frontier / Closed Models
ModelCompanyContext WindowPricing (Input/Output per 1M tokens)Best For
GPT-4oOpenAI128K$5 / $15General purpose, vision, function calling
o3OpenAI200K$10 / $40Complex multi-step reasoning, math, coding
Claude 3.5 SonnetAnthropic200K$3 / $15Long documents, instruction-following, agents
Claude 3 OpusAnthropic200K$15 / $75Highest reasoning quality, complex tasks
Claude 3 HaikuAnthropic200K$0.25 / $1.25Fast, cheap, high-volume tasks
Gemini 1.5 ProGoogle1M$3.50 / $10.50Extremely long context (books, large codebases)
Gemini 2.0 FlashGoogle1M$0.10 / $0.40Speed + cost, multimodal, real-time apps
Mistral LargeMistral AI128K$3 / $9European alternative, strong in French/multilingual
Grok 2xAI131K$2 / $10Real-time data access via X, current events
Open Source / Self-Hostable Models

Open source models are free to use — you just need hardware (or a hosting service like Groq/Together.ai). Great for privacy, cost control, and customization.

ModelCompanySize OptionsCostBest For
Llama 3.3Meta8B, 70BFree (self-host)Best open-source general model. Rivals GPT-4o in many benchmarks.
Mistral 7B / Mixtral 8x7BMistral AI7B, 47BFree (self-host)Fast, efficient, great for constrained environments
Phi-3 / Phi-4Microsoft3.8B, 14BFree (self-host)Small but surprisingly capable. Runs on a laptop.
Gemma 2Google2B, 9B, 27BFree (self-host)Google's open model. Good reasoning for its size.
Qwen 2.5Alibaba0.5B–72BFree (self-host)Excellent multilingual, strong coding, many size options
DeepSeek R1DeepSeek7B–671BFree (self-host)Reasoning model. Competitive with o1 at a fraction of cost.
Which Model Should I Use?
Your NeedRecommended ModelWhy
Best all-around, cost-effectiveClaude 3.5 SonnetBest instruction-following, long context, great for agents
Cheapest for high volumeClaude 3 Haiku or Gemini Flash~$0.10-0.25 per million tokens
Hardest reasoning taskso3 or Claude 3 OpusHighest accuracy on complex multi-step problems
Very long documents (100K+ words)Gemini 1.5 Pro1M token context window, unmatched
Free, run locally, privacyLlama 3.3 70B via OllamaBest open-source quality, runs locally
Fast local model on a laptopPhi-3 Mini or Mistral 7BRuns on CPU, no GPU needed

These tools let you build agents without writing code — or with very minimal code. Ideal starting points for business users and non-developers.

Microsoft Copilot Studio
No CodeBusiness
Included in Microsoft 365. Build agents in plain English and deploy to Teams, Outlook, SharePoint. Native connectors for all M365 services. Can use Claude or GPT as AI brain.
Included in M365
Visit →
Claude Projects
No CodeBeginner
Create a persistent agent inside claude.ai by writing instructions in plain English. Share via link. Ideal for prototyping any agent in under 5 minutes before building properly.
Free / $20 mo (Pro)
Visit →
OpenAI GPTs
No CodeBeginner
Build custom ChatGPT agents inside ChatGPT. Add a system prompt, upload files as knowledge, add web browsing or DALL-E. Publish publicly or share a link. Very easy to start.
ChatGPT Plus required
Visit →
OpenAI Agent Builder
No CodeIntermediate
Drag-and-drop visual canvas for building agents with web search, code interpreter, file search, and image generation. 75% less time vs. coding from scratch according to OpenAI.
Pay-as-you-go
Visit →
Zapier AI Agents
No CodeIntermediate
Build AI agents using natural language inside Zapier's automation platform. Access to 7,000+ app integrations. Best for teams already using Zapier for automation.
Free tier + paid plans
Visit →
Botpress
No CodeIntermediate
Visual chatbot and agent builder with LLM integration. Deploy to website, WhatsApp, Slack, Teams. Good for customer service agents. Strong free tier.
Free tier available
Visit →
Dify
No CodeIntermediate
Open-source LLM application platform. Visual workflow builder for agents, RAG pipelines, and chatbots. Self-hostable. Supports GPT, Claude, Llama and 30+ models. Very popular.
Free (self-host)
Visit →
Flowise
No CodeIntermediate
Open-source, self-hosted visual builder for LangChain flows and agents. Drag and drop LLM nodes into agentic pipelines. Great for developers who want visual + code flexibility.
Free (open source)
Visit →

These are code-first frameworks for developers. They give you full control over agent behavior, tool use, memory, and multi-agent coordination.

Claude Agent SDK
Intermediate
Anthropic's official SDK for building agents. Handles the agent loop, tool execution, context management, and sub-agents. Available in Python and TypeScript. Best for Claude-based agents.
Free (open source)
Docs →
OpenAI Agents SDK
Intermediate
OpenAI's official Python/Node SDK for building agents with tool use, handoffs between agents, and guardrails. Works with GPT-4o, o3, and other OpenAI models. Simple and well-documented.
Free (open source)
Docs →
LangChain
Intermediate
The most popular agent framework. Connects LLMs to tools, memory, and data. Massive ecosystem with hundreds of integrations. Python and JavaScript. Large community and examples.
Free (open source)
GitHub →
LlamaIndex
Intermediate
Specialized in connecting LLMs to your own data (RAG). Build agents that reason over your documents, databases, and APIs. Best-in-class for enterprise knowledge retrieval.
Free (open source)
GitHub →
CrewAI
Intermediate
Multi-agent framework where AI agents take on roles (researcher, writer, analyst) and collaborate. Great for workflows that need parallel execution or specialization across tasks.
Free (open source)
GitHub →
AutoGen (Microsoft)
Advanced
Microsoft's multi-agent conversation framework. Agents can collaborate, debate, and self-critique. Supports human-in-the-loop. Good for complex reasoning that benefits from multiple AI perspectives.
Free (open source)
GitHub →
Semantic Kernel (Microsoft)
Advanced
Enterprise-grade SDK from Microsoft. C#, Python, Java support. Deeply integrated with Azure. Best for enterprise .NET teams wanting to add AI agents to existing systems.
Free (open source)
GitHub →

These services let you call any LLM via API — including open-source models — without running your own hardware. Great for trying different models at low cost.

Anthropic API
Official API for all Claude models. Best-in-class instruction following and long context. 200K token context. pip install anthropic to get started.
Pay-per-token
Get Key →
OpenAI API
Official API for GPT-4o, o3, and all OpenAI models. Largest ecosystem, most documentation, most integrations. Widely used standard in the industry.
Pay-per-token
Get Key →
Groq
Extremely fast inference for open-source models (Llama, Mixtral, Gemma). Tokens appear almost instantly. Free tier available. Great for real-time applications that need speed.
Free tier + paid
Visit →
Together.ai
Run 100+ open-source models (Llama, Mistral, Qwen, DeepSeek) via API at low cost. Often 5–10x cheaper than equivalent closed-model APIs for similar quality.
Free $25 credit
Visit →
Replicate
Run any model from Hugging Face as an API with one line of code. Great for image, audio, and video models alongside LLMs. Pay per second of compute.
Pay-per-run
Visit →
Hugging Face Inference
Run models directly from the Hugging Face hub via API. 400,000+ models available. Free serverless tier for many models. The largest open-source model repository in the world.
Free tier + paid
Visit →
Google AI Studio / Vertex AI
Access Gemini models via API. AI Studio is free to experiment. Vertex AI is the enterprise-grade version on Google Cloud. Best if you're already on GCP.
Free + enterprise
Visit →
Azure OpenAI
OpenAI models hosted on Azure. Enterprise compliance (SOC 2, HIPAA, EU data residency), private networking, and SLAs. Required for many regulated industries. More expensive than OpenAI direct.
Enterprise pricing
Visit →
AWS Bedrock
Access Claude, Llama, Titan, and other models via AWS. Fully managed, enterprise-grade. Great if you're already on AWS. Supports fine-tuning and RAG via Knowledge Bases.
Pay-per-token
Visit →

Run AI completely free and privately on your own machine. No API keys, no monthly bills, no data leaving your computer. These tools make it trivially easy.

Ollama
Beginner
Run Llama, Mistral, Phi, Gemma, and 100+ models locally with one command: ollama run llama3. macOS, Linux, Windows. Creates a local API compatible with OpenAI clients. Most popular local model runner.
Free (open source)
Download →
LM Studio
Beginner
Beautiful desktop app for discovering, downloading, and running local LLMs. No terminal needed — just a GUI. Built-in chat interface and local server. Best for non-developers who want local AI.
Open WebUI
Intermediate
Self-hosted ChatGPT-style web interface. Works with Ollama, OpenAI API, and Anthropic API. Supports multiple users, conversation history, RAG on files, and image generation. 60K+ GitHub stars.
Free (open source)
GitHub →
Jan.ai
Beginner
Open-source, offline ChatGPT alternative. Desktop app that runs models locally. Privacy-first, no data leaves your device. Simple UI, good for individuals who want a self-contained AI assistant.
Free (open source)
Download →
Dify (self-hosted)
Intermediate
Open-source LLM app development platform. Visual workflow builder, RAG pipelines, agent orchestration, API publishing. Run on your own server with Docker. Excellent for building internal AI tools.
Free (self-host)
GitHub →
LocalAI
Advanced
Free, self-hosted, OpenAI-compatible REST API for running LLMs, image generation, speech-to-text, and more. Drop-in replacement for OpenAI API calls. Great for enterprise data privacy requirements.
Free (open source)
GitHub →

These are purpose-built AI applications for specific tasks. Often the fastest way to get value — no building required.

Search & Research
Perplexity AI
AI search that cites every source. Replaces Google for research. Free tier is very generous.
Free + $20/mo Pro
Visit →
You.com
AI search engine with a customizable mix of web results, AI answers, and specialized search modes.
Free + paid
Visit →
Coding Assistants
Cursor
AI-first code editor. Full codebase context, agent mode, inline edits. Used by millions of developers. Built on VS Code.
Free + $20/mo Pro
Visit →
GitHub Copilot
AI pair programmer integrated into VS Code, JetBrains, and more. Suggests code, explains errors, and has an agent mode for multi-file edits.
$10–19/mo
Visit →
Windsurf (Codeium)
Free Cursor alternative. Agentic code editor with Cascade — an agent that plans and executes multi-step coding tasks across your codebase.
Free + paid
Visit →
v0 by Vercel
Generate web UI components from a text description. Produces React / shadcn / Tailwind code instantly. Best rapid UI prototyping tool available.
Free credits + paid
Visit →
Productivity & Writing
Notion AI
AI built directly into Notion. Summarize, draft, translate, and auto-fill databases. Great for teams already using Notion.
$10/mo addon
Visit →
Microsoft 365 Copilot
AI across Word, Excel, PowerPoint, Outlook, and Teams. Summarizes emails, drafts documents, analyzes spreadsheets. Requires M365 Business/Enterprise.
$30/user/mo
Visit →
Google Gemini for Workspace
AI in Gmail, Docs, Sheets, Slides, and Meet. Drafts emails, summarizes docs, generates images in Slides. Comparable to M365 Copilot for Google Workspace users.
$20/user/mo
Visit →
Cost & Latency Calculator

Estimate the API cost and time for a multi-step agentic loop before you build it.

Cost Per Task
USD
Daily Cost
USD / day
Monthly Cost
USD / month
Est. Latency
seconds / task
* Latency estimate assumes ~0.5–2s per model call (varies by model speed, tool execution, and network). Real latency will vary. Cost estimates based on published pricing as of Q1 2026.
Agentic Benchmark Tracker

How the leading models perform on tasks that actually matter for agents — not just IQ tests. Updated Q1 2026.

Benchmarks are snapshots. Model providers update models frequently. Always run your own evals on your specific task before picking a model for production.

Model Tool Calling Accuracy Long-Context Retrieval Multi-Step Reasoning Instruction Following Cost Efficiency Speed
Claude 3.5 Sonnet Best Overall 97% 96% 92% 94% Fast
GPT-4o 95% 88% 91% 92% Fast
o3 (OpenAI) Best Reasoning 89% 85% 98% 88% Slow
Gemini 1.5 Pro Best Long Context 87% 99% 86% 87% Medium
Gemini 2.0 Flash Best Value 84% 81% 79% 82% Very Fast
Llama 3.3 70B (open source) 81% 77% 84% 83% (free) Varies
DeepSeek R1 (open source) 78% 74% 91% 80% (free) Slow
Visual Workflow Builder

Drag-and-drop sandboxes where you can wire together LLM nodes, tool nodes, and memory nodes to prototype multi-agent systems — no code required.

Dify Workflow Canvas

Full visual canvas: LLM nodes, tool nodes, code nodes, condition branches, and loops. Build complex agentic pipelines and publish as an API or web app. Free self-hosted option.

Open Dify →

n8n AI Workflow Builder

Connect any service with AI reasoning in the middle. Trigger on email, webhook, schedule — run Claude or GPT on the data — output to Slack, CRM, database. 400+ connectors.

Open n8n →

Flowise — LangChain Visual

Open-source drag-and-drop UI for LangChain flows. Chain LLM nodes, retriever nodes, and agent nodes. Best for developers who want visual + code flexibility.

Open Flowise →

OpenAI Agent Builder

Official OpenAI visual canvas for building agents with web search, code interpreter, file search, and handoffs. 75% less time vs. coding from scratch per OpenAI benchmarks.

Open Agent Builder →