F

Foundation Model

A large AI model trained on broad data at massive scale, serving as a general-purpose substrate that other applications fine-tune or prompt against. The category includes GPT, Claude, Gemini, Grok, Llama, and the rest of the frontier model lineup.

What it is

Foundation model is the term coined in 2021 by Stanford's Center for Research on Foundation Models (CRFM) for AI models trained on very broad data (text, and often images, audio, code, and video) at enormous scale, intended to serve as the substrate for many downstream uses rather than a single task. The defining characteristics: (1) trained once at high cost on a vast corpus, (2) general-purpose enough to be adapted (through prompting, fine-tuning, or RAG) to many specific applications, and (3) typically built and operated by a small number of vendors with the compute to train them. Examples include OpenAI's GPT family, Anthropic's Claude family, Google DeepMind's Gemini, xAI's Grok, Meta's Llama (open-weight), DeepSeek's V3 and R1 (open-weight), Mistral's model family, and Alibaba's Qwen (open-weight). The "foundation" framing emphasizes that other software builds on top, the way operating systems were the foundation for applications in the prior software era.
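To make the adaptation idea concrete, here is a minimal sketch of the cheapest route, prompting, using the OpenAI Python SDK. The model name, the prompts, and the adapt() helper are illustrative assumptions rather than a prescribed pattern; fine-tuning and RAG follow the same principle of specializing a general model without training one from scratch.

```python
# Minimal sketch of "adaptation via prompting": one general-purpose
# foundation model is steered to two different downstream tasks purely
# through instructions, with no retraining. Model name and prompts are
# illustrative assumptions.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def adapt(system_prompt: str, user_input: str) -> str:
    """One foundation model, specialized per call by its system prompt."""
    resp = client.chat.completions.create(
        model="gpt-4o",  # any chat-capable foundation model
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_input},
        ],
    )
    return resp.choices[0].message.content

# Two "applications" built on the same substrate:
ticket = "My dashboard has been blank since this morning's deploy."
print(adapt("You are a support-ticket triager. Reply with P1, P2, or P3 only.", ticket))
print(adapt("You summarize text in exactly one sentence.", ticket))
```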

Why it matters

The foundation-model layer is the most consequential infrastructure shift of the decade. The relationship a business has with foundation models — whether as a buyer of API access, an operator of open-weight models on its own infrastructure, or a partner inside a hyperscaler relationship — shapes its AI economics, its data governance posture, and its strategic optionality. For agent operations specifically, the central architectural question is whether the agents are tightly coupled to one foundation model or can swap between several based on capability, cost, and policy (see vendor-neutral AI, capability registry). The vendor landscape consolidates and reshapes constantly — a model that's state-of-the-art today may be third-tier in eighteen months — so designs that bind to a single foundation model carry meaningful migration risk.
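The paragraph above leaves the loose-coupling question abstract, so here is a minimal sketch of what a model-swapping layer can look like in Python. Everything in it (the ChatModel interface, the registry fields, the routing rule) is a hypothetical illustration, not a real library.

```python
# Hypothetical sketch: agents code against a small neutral interface,
# and a router picks a concrete provider by capability, cost, and policy.
from dataclasses import dataclass
from typing import Protocol

class ChatModel(Protocol):
    def complete(self, prompt: str) -> str: ...

@dataclass
class ModelEntry:
    name: str
    model: ChatModel
    capabilities: set[str]   # e.g. {"code", "vision", "long-context"}
    cost_per_mtok: float     # blended $ per million tokens (illustrative)
    data_residency: str      # e.g. "us", "eu", "on-prem"

class ModelRouter:
    def __init__(self, entries: list[ModelEntry]):
        self.entries = entries

    def pick(self, needs: set[str], max_cost: float, residency: str) -> ChatModel:
        """Cheapest registered model meeting capability and policy constraints."""
        eligible = [
            e for e in self.entries
            if needs <= e.capabilities
            and e.cost_per_mtok <= max_cost
            and e.data_residency == residency
        ]
        if not eligible:
            raise LookupError("no registered model satisfies the constraints")
        return min(eligible, key=lambda e: e.cost_per_mtok).model
```

Because agents depend only on the interface, replacing a vendor that has slipped to third tier becomes a registry update rather than a rewrite.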

Key components

  • Trained at scale — vast corpora, large parameter counts, multi-million-dollar training runs
  • General-purpose substrate — adapted downstream via prompting, fine-tuning, or RAG
  • Vendor landscape — Anthropic, OpenAI, Google, xAI, Meta (open-weight), DeepSeek, Mistral, Qwen
  • Open-weight vs closed — Llama, DeepSeek, and Qwen publish downloadable weights; the flagship GPT, Claude, and Gemini models are API-only, and xAI has open-sourced only older Grok weights (the operational difference is sketched after this list)
  • Reshapes constantly — frontier capability shifts among vendors quarterly, making vendor-neutral architecture valuable
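To ground the open-weight vs. closed distinction, a brief sketch of the operational difference, assuming the Hugging Face transformers and OpenAI Python packages. The model IDs are illustrative; Llama weights are license-gated on Hugging Face, and running even an 8B model locally needs a capable GPU.

```python
# Open-weight: download the weights, run them on infrastructure you control.
from transformers import pipeline

local_llm = pipeline("text-generation", model="meta-llama/Llama-3.1-8B-Instruct")
out = local_llm("Define 'foundation model' in one sentence.", max_new_tokens=60)
print(out[0]["generated_text"])

# Closed: no weights; every request leaves your infrastructure for the vendor's API.
from openai import OpenAI

remote = OpenAI().chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Define 'foundation model' in one sentence."}],
)
print(remote.choices[0].message.content)
```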
