
Site Migration — Content Architecture

Sharing domain expertise to inform a collaborative WordPress build


SCA is migrating from Webflow to WordPress. I've prepared this briefing to share my work on the school's content architecture and open a conversation with our implementation partner about how we combine our expertise to build a site that meets SCA's goals.

This briefing is a living document, and can be actively updated as our conversation develops. This document itself is managed through an AI-assisted editorial workflow — I describe a change in plain language, and it's revised, rebuilt, and redeployed automatically.

This document describes the complete target architecture as a roadmap. The same content model can inform a phased approach: even if the main site launches as a Foundation Phase build, specific initiatives can draw on the structured content infrastructure immediately.

The Content Architecture I've Prepared
  • 12 content types modeled with typed fields and relationships, which map directly to WordPress CPT + custom fields
  • All Webflow content extracted: 9 pages, 7 programs, 13 news articles, 87 images
  • Public content API + minimal reference frontend at web-beta-lilac-27.vercel.app — built to prove the data model works, not as a design proposal

This infrastructure is currently operational in Sanity CMS and serving content via a live API. The question is how WordPress best relates to this existing layer. Full schema details are in the Appendix.

Conversations to Have
  • WordPress architecture: how best to implement this content model — CPT + fields, external API, or a hybrid
  • Migration scope: which pages, what functionality, what timeline
  • How to balance editorial usability with structural integrity
1

The Starting Point

The current site and why a migration was needed

Springfield Commonwealth Academy's website at springfieldcommonwealthacademy.org runs on Webflow. It's a competent marketing site — about 20 pages covering academics, athletics, student life, admissions, and news — with professional photography and a clean layout.

What the Webflow Site Contains
Information pages: ~9 pages (About, History, Vision, Admissions, Contact, etc.)
Academic programs: 5 program pages under Academics
Athletics: 3 pages (overview + individual sports)
News articles: CMS-driven listing with individual article pages
Faculty directory: Photo grid with ~12 staff members
Photography: 87+ images across campus, student life, athletics
Gradelink integration: External link to enrollment portal

What Webflow doesn't do well for SCA's needs: Content is locked inside Webflow's visual builder. As the site is built today, school data isn't exposed through an API that other systems can query. News articles can't be reused in newsletters or consumed by other systems. Content workflows can't be automated, so every update requires manual editing in the Webflow designer. As the school grows, these limitations compound.

The school decided to migrate off Webflow. The question was: migrate to what, and how do you make sure the migration creates value beyond just changing platforms?

2

The Content Architecture

How SCA's content needs to be structured — and what I've built so far

Rather than moving content from one website builder to another, I modeled the school's content as structured data — typed, queryable, and available via API. This represents how SCA's content needs to be organized to support our AI-forward positioning. The website becomes one consumer of this data, not the only place it lives. I'm open to guidance on the best WordPress patterns to achieve this structure.

The Core Idea

A news article in Webflow is a visual layout. A news article in a structured content system is data: it has a title, a publish date, a body, a category, an author, and an image — each as a separate, typed field. That same article can render on a website, appear in an email newsletter, respond to an API query, or be read by an AI assistant. The content is written once and used everywhere.
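
As a rough illustration of the idea above, the sketch below models one article as typed fields and renders it for two channels. The class name, field names, and sample values are assumptions made for this example, not the actual Sanity schema.

```python
from dataclasses import dataclass
from datetime import date
from html import escape


@dataclass(frozen=True)
class NewsArticle:
    """One news article as separate typed fields rather than a page layout."""
    title: str
    publish_date: date
    body: str          # plain text here for simplicity
    category: str
    author: str
    image_url: str


def render_html(article: NewsArticle) -> str:
    """Website channel: semantic HTML built from the fields."""
    iso = article.publish_date.isoformat()
    return (
        "<article>"
        f"<h1>{escape(article.title)}</h1>"
        f'<time datetime="{iso}">{iso}</time>'
        f"<p>{escape(article.body)}</p>"
        "</article>"
    )


def render_newsletter(article: NewsArticle) -> str:
    """Email channel: plain text built from the same fields."""
    header = f"{article.title.upper()}\n{article.publish_date.isoformat()} | {article.category}"
    return f"{header}\n\n{article.body}"


# Hypothetical sample record, written once and rendered twice.
sample = NewsArticle(
    title="Robotics & Rockets",
    publish_date=date(2024, 5, 1),
    body="Students presented their spring projects.",
    category="Academics",
    author="Staff",
    image_url="https://example.org/robotics.jpg",
)
print(render_html(sample))
print(render_newsletter(sample))
```

Neither renderer knows about the other; adding a third channel means adding a third function, not rewriting the article.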

Sanity's blog describes this shift as "The Quiet Reshape of Content Operations" — treating content as an institutional asset with its own structure and access layer, not website filler. AI systems, automation tools, and multi-channel publishing all depend on content being structured rather than trapped in page layouts.

What I Built: 12 Document Types + 3 Shared Objects

I defined 12 document types and 3 reusable object types in Sanity CMS that model SCA's operations. The document types are the main content: pages, news, programs, and so on. The object types (SEO metadata, social links, announcements) are reusable structures embedded inside document types for consistency, not standalone content. Full schema details are in the Appendix.

Content Type | What It Represents
Page | Information pages with parent/child hierarchy (About, Vision, History, etc.)
News | Articles with source tracking (manual entry, Instagram import, external link)
Program | Academic and athletic programs with descriptions, coaches, college placement data
Person | Faculty and staff with roles, bios, department relationships
Department | Organizational units that people belong to
Event | School events with dates, locations, descriptions
Student Project | Student work with automated Google Drive provisioning
Alumni Story | Graduate outcomes and testimonials
Media Gallery | Photo collections for campus life, events, athletics
Boarding Feature | Boarding program highlights for international families
Admissions Path | Application steps and requirements by enrollment type
Site Settings | Global configuration: school name, contact info, social links, announcements

These types have relationships — a Program can reference its Coach (a Person), a Person belongs to a Department, Pages have parent/child hierarchies. This is what makes the data queryable in ways that flat pages can't be.
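
To make "queryable relationships" concrete, here is a minimal sketch that follows references between documents in memory. The `{"_type": "reference", "_ref": ...}` shape is how Sanity stores references; the sample documents, IDs, and helper names are hypothetical.

```python
from typing import Any, Optional

Doc = dict[str, Any]

# Hypothetical sample documents; IDs and names are illustrative only.
DOCS: list[Doc] = [
    {"_id": "dept-athletics", "_type": "department", "name": "Athletics"},
    {"_id": "person-1", "_type": "person", "name": "Sample Coach",
     "department": {"_type": "reference", "_ref": "dept-athletics"}},
    {"_id": "program-1", "_type": "program", "name": "Sample Program",
     "coach": {"_type": "reference", "_ref": "person-1"}},
    {"_id": "page-about", "_type": "page", "title": "About"},
    {"_id": "page-history", "_type": "page", "title": "History",
     "parent": {"_type": "reference", "_ref": "page-about"}},
]


def index_by_id(docs: list[Doc]) -> dict[str, Doc]:
    return {doc["_id"]: doc for doc in docs}


def follow(doc: Doc, field: str, by_id: dict[str, Doc]) -> Optional[Doc]:
    """Resolve a single reference field to the document it points at."""
    ref = doc.get(field)
    if not isinstance(ref, dict) or "_ref" not in ref:
        return None
    return by_id.get(ref["_ref"])


def programs_in_department(docs: list[Doc], department_name: str) -> list[str]:
    """Program -> coach (Person) -> Department, in one traversal."""
    by_id = index_by_id(docs)
    names = []
    for doc in docs:
        if doc["_type"] != "program":
            continue
        coach = follow(doc, "coach", by_id)
        department = follow(coach, "department", by_id) if coach else None
        if department and department["name"] == department_name:
            names.append(doc["name"])
    return names


def breadcrumb(page: Doc, by_id: dict[str, Doc]) -> list[str]:
    """Walk parent references up to the root page, guarding against cycles."""
    trail: list[str] = []
    seen: set[str] = set()
    current: Optional[Doc] = page
    while current is not None and current["_id"] not in seen:
        seen.add(current["_id"])
        trail.append(current["title"])
        current = follow(current, "parent", by_id)
    return list(reversed(trail))
```

A question such as "which programs are coached by someone in Athletics" has no equivalent on flat pages, because the coach and department exist there only as text on a layout.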

Key Decisions — My Domain

Four architectural decisions shaped this work. These reflect SCA's institutional needs — how they're best implemented in WordPress is where platform expertise comes in.

  1. Sanity for the content layer. Sanity provides a real-time API, typed schemas defined in code, and a content studio that editors can use without developer help. The content is accessible via HTTP — any system that can make a GET request can read it.
  2. Astro for a reference frontend. I built a deliberately minimal reference website to prove the content layer works. It renders all the structured data into real pages, but the visual design is intentionally secondary — because the architecture keeps data and presentation completely independent. Any frontend can consume the same content with a different design.
  3. Content extraction before rebuilding. Rather than manually recreating content, I programmatically extracted everything from the Webflow site — page text, program descriptions, news articles, images — and imported it into Sanity as structured data. This work doesn't need to be repeated.
  4. AI-readable by design. The site outputs static HTML with JSON-LD structured data and semantic markup. No JavaScript is required to read any page. Any client that fetches the HTML (LLMs with web access such as ChatGPT, search engine crawlers, prospective parents' AI assistants) can parse the content directly. The public API means AI agents can also query school data programmatically.
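
For the fourth decision, a minimal sketch of what a JSON-LD block for one news article could look like. It uses the schema.org `NewsArticle` and `School` types; the function names and the choice of properties are assumptions, not a description of what the reference site emits.

```python
import json
from typing import Any


def news_json_ld(title: str, date_iso: str, summary: str,
                 image_url: str, school_name: str) -> dict[str, Any]:
    """Build a schema.org NewsArticle description from typed fields."""
    return {
        "@context": "https://schema.org",
        "@type": "NewsArticle",
        "headline": title,
        "datePublished": date_iso,
        "description": summary,
        "image": image_url,
        "publisher": {"@type": "School", "name": school_name},
    }


def as_script_tag(data: dict[str, Any]) -> str:
    """Serialize for embedding in static HTML.

    "</" is escaped as "<\\/" (a valid JSON escape) so that field content
    can never close the script element early.
    """
    payload = json.dumps(data, ensure_ascii=False).replace("</", "<\\/")
    return f'<script type="application/ld+json">{payload}</script>'
```

Because the tag is plain text inside static HTML, a crawler or assistant can read it without executing any JavaScript.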
3

What Exists Today

A live reference site, content API, and honest fit-gap analysis

Four things are live today: a reference website, a content API, extracted content ready for any frontend to consume, and the student project automation.

What's Live
Reference Website: web-beta-lilac-27.vercel.app (token-based design foundation, responsive, self-hosted fonts)
Content API: Public GROQ endpoint; any system can query school content via HTTP
Content Imported: 9 info pages, 7 programs, 13 news articles, 87 images, all extracted from Webflow and structured in Sanity
Automation: Student project system with automatic Google Drive folder provisioning

The reference website renders the structured content into working pages: a homepage with program highlights, a news section with article detail pages, and content pages for About, Academics, Athletics, Student Life, Admissions, and Contact. It has a design token foundation with a defined typography scale, color palette, and responsive layouts. It's not a mockup; it's a working site consuming live data from the content API.

Scope Checklist: Webflow Site vs. What's Built

Use this as a scope reference. The Status column shows where each feature stands: equivalent or better, partially built (data model exists, frontend incomplete), or not started. New capabilities not on the current site are marked separately.

Feature | Webflow (Current) | Sanity + Astro (Built) | Status
Homepage | Full design with hero image | Design token foundation with hero (text-only), value props, programs, CTA. No hero image. | Partial
About / History / Vision | Rich pages with photography | Content imported, styled template. Images need migration. | Partial
Academics + Programs | 5 program pages | 7 programs in Sanity with typed fields. Template rendering. | Partial
Athletics | 3 pages with photos | Schema with conditional fields (seasonal sports). Basic page. | Partial
Faculty Directory | Photo grid, ~12 staff | Schema exists (Person + Department). No data imported yet. | Not started
Student Life | Rich photo page | Schema exists, basic page with content. | Partial
Community / Engagement | 2 pages with galleries | Schema exists. No frontend page. | Not started
News & Articles | CMS-driven listing | 13 articles imported. Listing + detail pages. Data complete; frontend is reference-grade. | Data complete
Admissions | Multi-section page | Schema + basic page. Full admissions path schema ready. | Partial
Gradelink Enrollment | External link button | URL field in site settings. Not wired to frontend. | Partial
Contact Page | Form + embedded map | Basic page. No contact form or map integration. | Partial
Alumni Stories | Not on current site | Schema ready: graduate outcomes, testimonials | New
Media Galleries | Not on current site | Schema ready: organized photo collections | New
Student Projects | Not on current site | Drive automation works. Frontend is basic. | System works
Boarding Features | Not on current site | Schema ready: boarding program highlights for international families | New
Content API | None | Full public API; any system can query all school content | Complete
Editorial Workflow | None | Source tracking, QA flags for imported content | New

"Partial" means the data model and content exist but the frontend doesn't match the Webflow original's visual richness. "Not started" means the schema exists but no frontend page or data. "New" marks capabilities the Webflow site doesn't have at all — these are optional scope additions.

4

Where the Value Is

What's available for the WordPress build, and what still needs attention

The content model and extracted data are the foundation for the WordPress build. Here's how my work maps to WordPress concepts — platform expertise will determine how best to implement these patterns.

Content Model → WordPress CPT + Custom Fields

Most of the 12 content types translate directly to WordPress Custom Post Types with Advanced Custom Fields (or equivalent); Pages map to native WordPress pages. Each type has defined fields and relationships:

  • Programs with category, level, description, coach reference, college placement data → CPT with ACF fields + relationship to Person
  • People with roles, bios, department references → CPT with taxonomy or relationship field
  • Pages with parent/child hierarchy, rich text, SEO fields → native WordPress pages with custom fields
  • News, Events, Student Projects, Alumni Stories → each a CPT with typed fields rather than freeform page builder content

Full schema details with every field and relationship are in the Appendix. This can serve as a starting point for CPT registration and field group setup — optimization suggestions from WordPress implementation experience are welcome.
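
One possible import path, sketched under stated assumptions: an exported `program` document is turned into a request body for the WordPress REST API. This assumes the CPT is registered with REST support and that the meta keys shown are registered as REST-visible post meta; the meta key names, the `person_post_ids` lookup, and the treatment of `description` as plain text are all hypothetical, and an ACF-based build would likely expose fields differently.

```python
from typing import Any


def program_to_wp_payload(doc: dict[str, Any],
                          person_post_ids: dict[str, int]) -> dict[str, Any]:
    """Map one Sanity program document to a WordPress REST request body.

    person_post_ids maps Sanity person IDs to the WordPress post IDs created
    when people were imported, so the coach relationship survives the move.
    """
    meta: dict[str, Any] = {
        "program_category": doc.get("category", ""),
        "program_level": doc.get("level", ""),
        "college_placement_summary": doc.get("collegePlacementSummary", ""),
    }
    coach_ref = (doc.get("coach") or {}).get("_ref")
    if coach_ref in person_post_ids:
        meta["coach_post_id"] = person_post_ids[coach_ref]

    return {
        "title": doc["name"],
        "slug": doc["slug"]["current"],   # Sanity slugs are {"current": "..."}
        "status": "draft",                # import as drafts for editorial review
        "content": doc.get("description", ""),
        "meta": meta,
    }
```

Importing people before programs keeps the lookup table complete, so each program can be linked to its coach in a single pass.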

Content Already Extracted
  • All text content extracted programmatically from Webflow — page text, program descriptions, news articles — stored as structured data. Can be imported into WordPress or consumed via API.
  • 87 images downloaded and hosted on Sanity's CDN. Available for re-import into WordPress media library.
  • WordPress can implement this model directly (CPT + fields), integrate with the existing content API, or take a hybrid approach — recommendations on what works best are welcome.
Operational Infrastructure
  • Google Drive automation. Student project system provisions Drive folders when a project is created. Independent of the website frontend.
  • Editorial workflow. Source tracking (manual, Instagram, external) and QA flags for imported content.
  • New content types (alumni stories, media galleries, boarding features, admissions paths) have schemas defined but no content yet.
Implementation Opportunities
  • Contact forms — Standard WordPress form implementation needed.
  • Gradelink integration — URL field exists in site settings; enrollment button to be wired into frontend.
  • Faculty data migration — Requires manual import or custom parsing solution for Webflow's photo grid format. Schema is ready to receive data.
  • Image relationship mapping — Assets extracted and hosted; linking to specific content entries to be completed during WordPress migration.
  • Community/engagement pages — Backend schemas complete; frontend implementation pending design decisions.
5

Implementation Approaches

How we combine content architecture with WordPress expertise

SCA's institutional target is structured content. Implementation may be phased based on organizational priorities — both Foundation Phase and Infrastructure Phase are valid professional approaches. I've outlined the options below; professional input on the recommended path is welcome.

The Strategic Direction

Whether we begin with Foundation Phase or Infrastructure Phase, the goal is architecture that can evolve. Foundation Phase (professional web presence) is a viable starting point provided it can migrate to Infrastructure Phase (structured content) without rebuilding from scratch. The specific risk to manage is not page builders themselves, but content locked into proprietary shortcodes or layout-specific blobs that cannot be cleanly extracted later. I'm not attached to any particular WordPress pattern — what matters is that content remains portable.

Approach A: WordPress Implements the Model Natively

Build Custom Post Types + custom fields that replicate the 12 content types. Content is imported into WordPress directly. Sanity serves as the reference for what the content model looks like; WordPress owns everything going forward.

  • Standard WordPress architecture: CPT + ACF (or equivalent) + templates
  • Content imported from Sanity export or re-entered manually
  • Schema design, field definitions, and relationships documented and ready
  • All frontend work (themes, forms, plugins, SEO) lives in WordPress

This represents the Infrastructure Phase path. The content model defines what "program" and "person" and "event" mean for SCA, so the WordPress build starts with a clear data architecture instead of a blank slate. If starting with Foundation Phase using conventional themes, the architecture should keep content in standard WordPress structures (posts, pages, native fields) rather than embedding it in page builder shortcodes or proprietary block formats — so that migrating to CPTs later is a data operation, not a content re-creation effort. A question for discussion: are there scenarios where consuming the Sanity API directly makes sense for WordPress, or is native CPT implementation the better path for SCA's needs? Input on this is welcome.

Technical Discussion Points

Areas where professional input would be valuable:

  • WordPress architecture recommendations given these content relationships
  • Approaches to balancing editorial usability with structural integrity
  • WordPress best practices for content exportability
  • If starting with Foundation Phase: which theme architectures best support future migration to CPTs? If starting with Infrastructure Phase: recommended CPT architecture for these content types
  • Standard approaches for custom fields: ACF, blocks, or other options
  • Performance/SEO baselines: Core Web Vitals, images, caching
6

What I'm Hoping For From This Partnership

Core requirements, open areas, and how we work together

I want to be transparent about what's fixed and what's open. SCA's institutional requirements set certain constraints. Everything else is a conversation.

Phased Implementation

Angelene's vision for SCA's AI-forward positioning can be realized through progressive stages, depending on implementation scope and timeline:

Foundation Phase focuses on professional web presence with visual polish, mobile responsiveness, and basic content management. Content is presented through established WordPress themes. This phase may utilize page builder tools for rapid deployment, with the understanding that content will be migrated to structured CPTs in Infrastructure Phase. The goal is bridgeable architecture, not permanent page-builder dependency. This delivers immediate institutional value and positions SCA for future enhancement.

Infrastructure Phase implements the full structured content model described in this document — CPTs with defined fields, queryable relationships between content types, API accessibility, and AI-ready architecture. This enables automated content workflows, multi-channel publishing, and the complete AI-forward capabilities that differentiate SCA's positioning.

Both phases serve SCA's journey. The question is whether to implement the complete infrastructure now, or to establish the foundation first and evolve to full structure in a subsequent phase.

Institutional Target Architecture

The following describes SCA's Infrastructure Phase destination: the complete structured content architecture. Not all of these are Foundation Phase requirements. Foundation Phase may use interim solutions (conventional themes, manual curation) as a viable starting point, with the understanding that these evolve toward the target architecture over time. The destination includes:

  • Content stored as structured data. CPTs with defined fields. A "program" has a name, category, coach, and description as separate queryable fields.
  • Relationships between content types are queryable. A program references its coach (a person). A person belongs to a department. Pages have parent/child hierarchies. These connections must be navigable in code.
  • Content is exportable and portable. If SCA changes platforms in the future, all content and relationships can be extracted cleanly.
  • AI-readable output. Static HTML with semantic markup and structured data (JSON-LD). No JavaScript required to parse content. This is how prospective families' AI assistants, search engines, and SCA's own AI tools will interact with the site.
Implementation Flexibility

I'm not presuming to dictate WordPress implementation details — that's the domain of platform expertise:

  • Theme architecture, plugin stack, hosting environment
  • Custom fields approach — ACF, native blocks, or another pattern
  • Editorial workflow and content management UX
  • Performance optimization, caching strategy, deployment pipeline
Shared Understanding — The Outcomes That Matter

To ensure we meet SCA's institutional needs, these are the outcomes I'd like us to align on. Implementation choices will determine how these are achieved in WordPress:

  • Faculty, programs, news, and events are template-driven from structured data — not manually assembled pages
  • Content relationships are preserved (program → coach, person → department, page hierarchy)
  • The site renders semantic HTML that AI tools and search engines can parse without executing JavaScript
  • Content can be exported to JSON or another portable format at any time
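
The export and relationship outcomes above can be checked mechanically. The sketch below writes documents to JSON only if every reference points at a document included in the export. It assumes Sanity-style `_id` and `_ref` keys; the function names are illustrative.

```python
import json
from typing import Any, Iterator


def iter_refs(value: Any) -> Iterator[str]:
    """Yield every "_ref" target found anywhere inside a document."""
    if isinstance(value, dict):
        ref = value.get("_ref")
        if isinstance(ref, str):
            yield ref
        for child in value.values():
            yield from iter_refs(child)
    elif isinstance(value, list):
        for item in value:
            yield from iter_refs(item)


def dangling_refs(docs: list[dict[str, Any]]) -> list[tuple[str, str]]:
    """Return (document id, missing target id) pairs."""
    ids = {doc["_id"] for doc in docs}
    return [(doc["_id"], ref) for doc in docs for ref in iter_refs(doc) if ref not in ids]


def export_json(docs: list[dict[str, Any]]) -> str:
    """Serialize documents in a stable order, refusing a broken export."""
    problems = dangling_refs(docs)
    if problems:
        raise ValueError(f"export has unresolved references: {problems}")
    return json.dumps(sorted(docs, key=lambda d: d["_id"]), indent=2, ensure_ascii=False)
```

Image fields reference asset documents in the same way, so an export that leaves assets out would be reported as incomplete by this check unless those references are handled separately.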
Future Application / Optionality

The structured content architecture described in this document has value beyond the immediate website migration. Once content is stored as typed, relational data:

  • AI tools can query institutional content directly — answering parent questions, generating marketing materials, supporting admissions workflows
  • Content can be published across multiple channels (website, mobile app, printed materials, social media) from a single source
  • Institutional knowledge is preserved as queryable data rather than locked in page layouts
  • Future platform migrations become data transfers rather than content re-creation

These capabilities become available whenever Infrastructure Phase is implemented. The architecture is the prerequisite; timing is an organizational decision.

Collaboration Model

I provide content architecture, business requirements, and AI-readiness strategy. Implementation expertise guides WordPress-specific decisions. If Foundation Phase is the starting point, I can identify which decisions now will ease the transition to Infrastructure Phase later. My priority is ensuring we don't close doors to Angelene's full vision.

7

Appendix: Content Model & Design System

Full schema reference, design tokens, and live API access

This section is for developers and agencies who want to inspect the underlying data model and design system. Everything here is accessible without needing to clone a repository.

Content Model — 12 Document Types

Each document type is a structured content definition with typed fields and relationships. You can query any of these via the public API (see Try the API below).

Type | Key Fields | Relationships
page | title, slug, body (rich text), seo | parent → page (hierarchy)
news | title, slug, date, summary, body, image, source, featured, editorialFlags | (none)
program | name, slug, category, level, description, image, collegePlacementSummary | coach → person
person | name, role, bio, photo, email, order | department → department
department | name, slug | (none)
event | title, slug, startDate, endDate, location, category, description, image | (none)
studentProject | title, slug, studentEmail, year, summary, body, gallery, visibility, status | student → person, program → program. Triggers Google Drive automation.
alumniStory | name, slug, graduationYear, university, achievement, quote, story, photo | (none)
mediaGallery | title, slug, date, description, images[] (with caption/alt), category | (none)
boardingFeature | title, slug, summary, body, order | (none)
admissionsPath | title, slug, audience (athlete/international/domestic), summary, body, applyUrl | (none)
siteSettings | schoolName, tagline, address, phone, email, enrollmentUrl, socialLinks[], announcement | Singleton (one per site)
Shared Object Types (3)

These are reusable structures embedded inside document types — not standalone content. They ensure consistency across the system.

Object | Fields | Used By
seo | title, description (160 char), social image | page, news, siteSettings
socialLink | platform (facebook, instagram, twitter, etc.), url | siteSettings
announcement | enabled, text, link | siteSettings
Current Content Inventory

What's actually in the system right now (live data, queryable via API):

Pages (info pages imported from Webflow): 9
News articles: 13
Programs (academic + athletic): 7
Student projects (with Drive integration): 4
Images uploaded to Sanity CDN: 87
Faculty, events, galleries, alumni, boarding, admissions paths: Schemas ready, no content yet
Try the API — No Authentication Required

The content API is public for reads. You can paste these URLs into any browser to see live data. No setup, no tokens, no plugins.

All programs with their fields:

https://wesg5rw8.api.sanity.io/v2024-01-01/data/query/production?query=*[_type=="program"]{name,slug,category,description}

Latest news articles:

https://wesg5rw8.api.sanity.io/v2024-01-01/data/query/production?query=*[_type=="news"]|order(date desc){title,slug,date,summary}

Page hierarchy (parent-child relationships):

https://wesg5rw8.api.sanity.io/v2024-01-01/data/query/production?query=*[_type=="page"]{title,slug,"parent":parent->title}

Everything (all content types):

https://wesg5rw8.api.sanity.io/v2024-01-01/data/query/production?query=*[!(_type match "system.*")]{_type,_id}

This is standard HTTP. WordPress can consume these endpoints with wp_remote_get(). Any language with HTTP support works.
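
As one example of "any language with HTTP support", a minimal Python sketch using only the standard library. The endpoint is the one shown above; the function names are illustrative, and the code assumes the JSON response carries the rows under a `result` key, as Sanity's query API does.

```python
import json
from typing import Any
from urllib.parse import urlencode
from urllib.request import urlopen

API_BASE = "https://wesg5rw8.api.sanity.io/v2024-01-01/data/query/production"


def query_url(groq: str) -> str:
    """Build a request URL with the GROQ query percent-encoded.

    Browsers encode quotes and brackets automatically when a URL is pasted;
    HTTP client libraries generally do not.
    """
    return f"{API_BASE}?{urlencode({'query': groq})}"


def run_query(groq: str, timeout: float = 10.0) -> Any:
    """Run a read-only query against the public endpoint (network call)."""
    with urlopen(query_url(groq), timeout=timeout) as response:
        return json.load(response)["result"]


if __name__ == "__main__":
    for program in run_query('*[_type=="program"]{name,category}'):
        print(program)
```

The same encoding step applies when calling the endpoint from WordPress or any other client.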

Design Approach — Token Foundation

The reference frontend uses design tokens — named values for colors, typography, and spacing — rather than ad-hoc styles. This is a starting vocabulary, not a complete design system. The value is the approach: tokens can be mapped to any CSS framework, WordPress theme, or design tool, and AI tools can iterate on them systematically. Changing the visual design doesn't require changing the content.

Color Palette
Token | Value | Use
Navy 900 | #0A1628 | Footer, dark backgrounds
Navy 600 | #2C5D8A | Links, interactive elements
Gold 500 | #C9A227 | Primary accent, buttons, badges
Neutral 900 | #1A1F2A | Primary text
Neutral 50 | #FAFBFC | Page backgrounds
Typography
Headings: Cormorant Garamond (self-hosted, variable weight 600–700)
Body text: Inter (self-hosted, variable weight 400–600)
Scale: 12px → 48px (9 steps), responsive with clamp()
Content width: 42rem (672px) optimal reading width, 64rem wide, 80rem full
Component Patterns
Buttons: Primary (gold bg), Secondary (navy outline), Ghost (text only). 3 sizes.
Cards: Shadow + rounded corners. Interactive variant with hover lift. 16:10 image ratio.
Sections: Default (white), Alt (neutral-50 bg), Dark (navy bg with white text).
Responsive: Mobile-first. Breakpoints at 768px, 1024px, 1280px.

To see all of this in action, visit the reference site at web-beta-lilac-27.vercel.app. The design tokens are implemented as CSS custom properties, so any developer can inspect them in the browser's DevTools.
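
To show how tokens can be carried into another stack, here is a small sketch that turns a token table into CSS custom properties. The values come from the palette and typography tables above; the property names are assumptions and may not match the names used on the reference site.

```python
# Token values from the tables above; the names are hypothetical.
TOKENS: dict[str, str] = {
    "color-navy-900": "#0A1628",
    "color-navy-600": "#2C5D8A",
    "color-gold-500": "#C9A227",
    "color-neutral-900": "#1A1F2A",
    "color-neutral-50": "#FAFBFC",
    "width-reading": "42rem",
}


def to_css(tokens: dict[str, str], selector: str = ":root") -> str:
    """Render tokens as a CSS rule declaring one custom property per token."""
    lines = [f"  --{name}: {value};" for name, value in tokens.items()]
    return selector + " {\n" + "\n".join(lines) + "\n}"


print(to_css(TOKENS))
```

A WordPress theme could load the generated rule as an ordinary stylesheet, so a palette change stays a one-line edit to the token table.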

WordPress Concept Mapping

For developers familiar with WordPress, here's how the Sanity concepts translate:

Sanity Concept | WordPress Equivalent
Document type (e.g., program) | Custom Post Type
Object type (e.g., seo) | ACF Field Group
Reference (e.g., coach → person) | ACF Relationship / Post Object field
Portable Text (rich body content) | Gutenberg blocks or ACF WYSIWYG
GROQ query | WPGraphQL or WP_Query