A headless CMS can feel deceptively simple: define a few fields, let editors write, ship content everywhere. But the long-term success of a headless setup depends less on the CMS brand and more on your content model: what a “piece of content” is in your business, which parts are structured, how those parts relate, and how predictable the content is for downstream consumers like websites, apps, email templates, and search.
This post is a practical approach to content modeling that favors durability over perfection. You will learn how to pick the right level of structure, how to avoid painful migrations, and how to make editorial work easier instead of harder.
What content modeling is (and why it matters)
Content modeling is the process of defining content types (like “Article” or “Product”), their fields (title, summary, body, author), and relationships (an article references a category, products reference collections, etc.). In a headless CMS, the model is the “contract” between editorial and all the places content is used.
When the contract is clear, teams move faster: designers can build components confidently, developers can integrate reliably, and editors know what “done” looks like. When the contract is fuzzy, the CMS turns into a document dump and every channel ends up with one-off exceptions.
A good model typically has three traits:
- Reusable: the same entry can power multiple pages and channels without copy-paste.
- Predictable: fields mean what they say, across all entries.
- Editable: the structure supports real editorial workflows, not just developer convenience.
Start with outcomes and constraints
Before you touch the CMS schema builder, capture the outcomes your model must support. This prevents over-engineering and keeps everyone aligned on what “good” looks like.
Three questions that clarify 80 percent of the model
- Where will content appear? List channels: marketing site, product UI, email, sales PDFs, in-app help, etc.
- Which parts must be consistent? Examples: SEO titles, call-to-action blocks, disclaimers, pricing notes, author bios.
- What needs governance? What requires review, approvals, scheduled publishing, or audit history?
Then note constraints that influence structure:
- Localization: which fields will be translated? Are URLs localized?
- Search: which fields need to be filterable or faceted?
- Personalization: will content vary by audience segment, plan tier, or region?
- Reuse: do you need shared blocks (like FAQs or feature cards) used across many pages?
If you are content-heavy, it helps to inventory what you already publish. You do not need to plan the migration of every entry at this stage. Just categorize: “long-form articles,” “landing pages,” “release notes,” “case studies,” “policies,” “knowledge base,” and so on. Those categories often become candidate content types, but not always.
Designing types, fields, and relationships
Think of your model as layers: core content types, shared components, and supporting taxonomy. The goal is to keep the core stable and let the outer layers evolve.
1) Content types: define the durable nouns
Start with content types that represent durable business concepts. “Blog Post” is common, but “Case Study,” “Integration,” “FAQ,” and “Author” are often more reusable than pages.
Keep type count small at first. If you are unsure whether “Guide” and “Article” should be separate, model one type with a “format” field and split later if there is a real behavioral difference.
2) Fields: separate meaning from presentation
Fields should describe meaning, not layout. For example, a field named “primary call to action” describes meaning, while a field named “blue button text” describes presentation.
- Prefer explicit fields for things that must be consistent (summary, SEO title, hero image reference, canonical URL).
- Use rich text for editorial flexibility, but avoid hiding critical semantics inside a blob of HTML.
- Add validation where it prevents breakage: required fields, max lengths, allowed values, and patterns for slugs.
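The validation rules in the last bullet can be sketched in a few lines. This is a minimal illustration, not any particular CMS's API: the `validate_article` function, the 200-character limit, the slug pattern, and the allowed `format` values are all assumptions chosen for the example.

```python
import re
from typing import Any

# Hypothetical rules for an "Article" type; names and limits are illustrative.
SLUG_PATTERN = re.compile(r"^[a-z0-9]+(?:-[a-z0-9]+)*$")
ALLOWED_FORMATS = {"article", "guide"}


def validate_article(entry: dict[str, Any]) -> list[str]:
    """Return a list of human-readable problems; an empty list means the entry passes."""
    errors: list[str] = []

    # Required fields: a missing title or slug would break any page template.
    for field in ("title", "slug"):
        if not entry.get(field):
            errors.append(f"{field} is required")

    # Max length: keeps summaries usable in cards and search results.
    summary = entry.get("summary") or ""
    if len(summary) > 200:
        errors.append("summary must be 200 characters or fewer")

    # Allowed values: a closed list instead of free text.
    fmt = entry.get("format")
    if fmt is not None and fmt not in ALLOWED_FORMATS:
        errors.append(f"format must be one of {sorted(ALLOWED_FORMATS)}")

    # Pattern: lowercase words and digits separated by single hyphens.
    slug = entry.get("slug")
    if slug and not SLUG_PATTERN.fullmatch(slug):
        errors.append("slug must be lowercase letters, digits, and hyphens")

    return errors
```

Returning a list of messages rather than raising on the first failure lets an editor see every problem at once, which tends to matter more in editorial tools than in typical application code.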
3) Relationships: reference, do not duplicate
References are the headless CMS superpower. If author bios appear on every article, make an Author type and reference it. If the same FAQ list appears on multiple pages, model FAQs as entries and assemble them via references.
A practical rule: duplicate only when change independence matters. If a disclaimer needs to differ per page and should not retroactively change across the site, copy it. If it must stay consistent, reference it.
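To make the difference concrete, here is a small sketch of referenced versus duplicated content. The in-memory `entries` store, the `{"ref": id}` convention, the `resolve` helper, and all sample data are made up for illustration; a real CMS resolves references through its own API.

```python
# Hypothetical in-memory store keyed by entry ID.
entries = {
    "author-1": {"type": "Author", "name": "Dana Lee", "bio": "Writes about support tooling."},
    "article-1": {"type": "Article", "title": "Connect your inbox", "author": {"ref": "author-1"}},
    "article-2": {
        "type": "Article",
        "title": "Set up forwarding",
        "author": {"ref": "author-1"},
        # Duplicated on purpose: this text belongs to this article only.
        "disclaimer": "Pricing shown is valid for 2024 plans.",
    },
}


def resolve(entry, store):
    """Return a copy of the entry with one level of {"ref": id} values replaced by the target entry."""
    resolved = {}
    for key, value in entry.items():
        if isinstance(value, dict) and "ref" in value:
            resolved[key] = store[value["ref"]]
        else:
            resolved[key] = value
    return resolved


# Editing the shared Author entry changes every article that references it.
entries["author-1"]["bio"] = "Leads the help center team."
```

After the edit, both articles resolve to the new bio, while the copied disclaimer on the second article is untouched by anything that happens elsewhere.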
4) Taxonomy: keep it boring and stable
Taxonomy includes categories, tags, topics, audiences, product lines, and any classification used for navigation or filtering. Model taxonomy as separate types if you need descriptions, ordering, hierarchical relationships, or (later on) icons.
Do not create three different classification systems that all mean “topic.” Decide what each one is for, and keep each list curated. Messy taxonomy becomes messy search and messy analytics.
A concrete example: modeling a product knowledge article
Imagine a SaaS company that publishes help content that also needs to show up inside the product UI. The same “How to connect your inbox” article should appear as a web page and in an in-app drawer, and it should be searchable.
A naive approach is a single rich text field called “body.” That can work for a simple blog, but it creates problems for in-app usage: you might need a short “in-app summary,” a step list, or a warning callout that is styled consistently.
Here is a lightweight model that stays flexible without becoming a giant page builder:
{
  "type": "KnowledgeArticle",
  "fields": {
    "title": "string (required)",
    "slug": "string (required, unique)",
    "summary": "string (max 200)",
    "body": "richText",
    "steps": "array<Step> (optional)",
    "audience": "enum (Admin, Member, Developer)",
    "topics": "ref[] -> Topic",
    "relatedArticles": "ref[] -> KnowledgeArticle",
    "lastReviewedAt": "date (optional)",
    "status": "enum (Draft, Review, Published)"
  }
}
In this setup:
- summary is predictable for search results and in-app previews.
- steps enables consistent rendering as a numbered list in product UI while still allowing the main body to be rich text.
- topics supports filtering and navigation without scraping the body.
- lastReviewedAt enables an editorial process for keeping content accurate.
Notice what is missing: fields like “two column layout” or “background color.” Those belong in the front-end component system, not the content contract.
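As a rough sketch of how two consumers might read this model, the code below represents a subset of the fields as Python dataclasses and renders the same entry for an in-app drawer and for a search index. The shape of `Step` (an instruction plus an optional warning), the `/help/` URL prefix, and both renderer functions are assumptions; the model above does not define them.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Step:
    # Assumed shape: the model above does not spell out what a Step contains.
    instruction: str
    warning: Optional[str] = None


@dataclass
class KnowledgeArticle:
    title: str
    slug: str
    summary: str = ""
    body: str = ""  # rich text, kept as an opaque string in this sketch
    steps: List[Step] = field(default_factory=list)
    audience: str = "Member"


def render_in_app_preview(article: KnowledgeArticle) -> str:
    """Compact drawer preview: title, summary, and numbered steps only."""
    lines = [article.title, article.summary]
    for number, step in enumerate(article.steps, start=1):
        lines.append(f"{number}. {step.instruction}")
        if step.warning:
            lines.append(f"   Warning: {step.warning}")
    return "\n".join(lines)


def render_search_result(article: KnowledgeArticle) -> dict:
    """Search index record built from explicit fields, with no body parsing."""
    return {
        "title": article.title,
        "url": f"/help/{article.slug}",  # hypothetical URL scheme
        "snippet": article.summary,
    }
```

Neither renderer touches `body`, which is the point: the channels that need predictable output read the structured fields, and the rich text stays free for the full web page.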
Checklist: a model review before you build
Use this checklist as a quick review with editorial, design, and engineering before you commit to a schema. It is easiest to change the model now, not after hundreds of entries exist.
- Channel fit: Can each channel consume this content without special cases?
- Field clarity: Would two editors interpret each field the same way?
- Validation: Which fields must be required to prevent broken pages?
- Reuse strategy: Which elements are referenced (shared) vs duplicated (page-specific)?
- Taxonomy discipline: Do categories and tags have clear purposes and owners?
- URL plan: Are slugs unique where they need to be? Do you need nested paths?
- Localization: Which fields are translatable and which are not?
- Editorial workflow: What do Draft, Review, and Published mean, and who can change status?
- Analytics: Do you need fields for campaign attribution, audience, or product area?
- Migration realism: If you already have content, can you reasonably populate these fields?
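The channel fit and migration realism questions can be partly automated. The sketch below assumes each channel declares the fields it cannot render without; the `CHANNEL_REQUIREMENTS` table and the `missing_fields_by_channel` helper are hypothetical and would need to reflect your own channels.

```python
# Hypothetical requirements: the fields each channel needs in order to render an entry.
CHANNEL_REQUIREMENTS = {
    "website": {"title", "slug", "body"},
    "in_app": {"title", "summary"},
    "search": {"title", "summary", "topics"},
}


def missing_fields_by_channel(entry: dict) -> dict:
    """Map each channel to the required fields that are absent or empty in the entry."""
    # Treat None, empty strings, and empty lists as "not filled in".
    present = {name for name, value in entry.items() if value not in (None, "", [])}
    report = {}
    for channel, required in CHANNEL_REQUIREMENTS.items():
        missing = sorted(required - present)
        if missing:
            report[channel] = missing
    return report
```

Running a check like this over a sample of existing content before committing to the schema gives a quick read on how much backfilling a migration would involve.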
Common mistakes (and how to avoid them)
Most content modeling failures are not technical. They come from mixing responsibilities or optimizing for a single page instead of a system.
Mistake 1: building “Page” as your only type
A generic Page type with a giant flexible body pushes structure into ad hoc conventions. That makes content harder to search, reuse, and validate. Instead, create types for durable concepts (Article, Case Study, FAQ, Product) and reserve page builders for exceptional marketing needs.
Mistake 2: encoding presentation in fields
Fields like “hero layout style” or “button color” leak design decisions into content. The result is fragile: redesigns require content edits. Keep styling in the front-end; keep meaning in the CMS.
Mistake 3: overusing rich text to store everything
Rich text is great for narrative, but it is a poor place for metadata and repeatable structures. If you frequently need to extract or render something consistently (steps, warnings, code snippets, specs), consider modeling it explicitly.
Mistake 4: uncontrolled taxonomy growth
Without ownership, tags become a junk drawer. Put someone in charge of each taxonomy list, limit who can create new terms, and periodically merge duplicates.
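A periodic duplicate check does not need to be elaborate. The sketch below groups terms that differ only in case, punctuation, or spacing; the `normalize` and `find_duplicate_terms` helpers are illustrative, and the approach will not catch true synonyms or spelling variants such as “Email” versus “E-mail,” which still need a human owner.

```python
import re
from collections import defaultdict


def normalize(term: str) -> str:
    """Lowercase the term and collapse punctuation and whitespace into single spaces."""
    return re.sub(r"[^a-z0-9]+", " ", term.lower()).strip()


def find_duplicate_terms(terms):
    """Group terms whose normalized forms match; return only groups with more than one member."""
    groups = defaultdict(list)
    for term in terms:
        groups[normalize(term)].append(term)
    return [group for group in groups.values() if len(group) > 1]
```

The output is a list of candidate merges for the taxonomy owner to review, not an instruction to merge automatically.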
Mistake 5: forgetting the “consumer” viewpoint
The real customer of a content model is not the editor interface. It is every consumer downstream: templates, search, feeds, and integrations. Make a habit of asking: “Could a simple renderer reliably display this entry?” If not, you are creating hidden coupling.
Key takeaways
- Model durable concepts as content types, not as page-specific layouts.
- Keep fields semantic. Structure what must be consistent; leave narrative flexible.
- Use references to enable reuse and governance, and duplicate only when independence is required.
- Taxonomy needs ownership and constraints, or it will undermine search and navigation.
- Review models with editorial, design, and engineering before content volume makes changes expensive.
When not to over-model
Structure has a cost: more fields to fill, more validation to satisfy, and more ways to block publishing. A strong model is not the most detailed model. It is the one that matches your operational reality.
Consider keeping things simpler if:
- Your content is highly experimental and the format changes often. Start with a minimal type and add structure as patterns stabilize.
- You do not have editorial capacity to populate extra fields reliably. A “required” field that is always nonsense is worse than no field.
- Only one channel matters and you do not expect reuse. A basic blog can succeed with fewer structured fields.
- You cannot commit to governance for taxonomy and shared components. References without ownership can create accidental site-wide changes.
A helpful compromise is “progressive structuring”: start with a small set of required fields, then add optional structured blocks where they clearly improve consistency or reuse.
Conclusion
Content modeling is product design for your content system. The best models are clear enough to be reliable and flexible enough to evolve without frequent migrations. Start with outcomes, model durable concepts, and be intentional about what you structure, what you reference, and what you leave as rich text.
If you want more system-level posts like this, the Archive is the best place to browse, and RSS is the simplest way to keep up.
FAQ
How many content types should I start with?
Start with the smallest set that matches your real publishing needs, often 3 to 8 types. If you find yourself using a “Page” type for everything, that is a sign you should introduce at least one more durable type (like Article, Case Study, or Knowledge Article).
Should I use rich text or structured blocks?
Use rich text for narrative content and structured fields for anything you need to validate, reuse, filter, or render consistently across channels. A hybrid approach is common: rich text for the main body plus structured fields for summary, steps, and callouts.
When should I use references versus embedded fields?
Use references when you need reuse, shared updates, or governance (authors, topics, shared FAQ entries). Embed or duplicate when the content should be independent per entry, or when references would create unwanted global changes.
How do I keep taxonomy from getting messy?
Define what each taxonomy is for (navigation, filtering, internal reporting), limit term creation to a small group, and periodically merge duplicates. If editors are guessing between similar terms, rename or consolidate them.
What is a quick test for a good content model?
Pick two real entries and imagine rendering them in two different channels (for example, website and in-app). If you need to “interpret” fields differently per entry or hand-edit the output, the model likely needs clearer semantics or more structure in the right places.