
SCA Content Infrastructure

Technical reference for the school's structured content layer

1

Overview

What this system is and where things live

I've built a structured content layer on Sanity CMS that models the school's operations — programs, people, news, events, student projects — as queryable data exposed via API. A reference frontend on Vercel demonstrates the data layer. I haven't done any frontend design work — the content is available for any frontend to consume.

Sanity Project Config

Project ID: wesg5rw8
Dataset: production
API Version: 2024-01-01
GROQ Endpoint: https://wesg5rw8.api.sanity.io/v2024-01-01/data/query/production
Reference Site: web-beta-lilac-27.vercel.app

Repositories

sca-website: Sanity studio + Astro frontend (monorepo, npm workspaces)
sca-explainers: This briefing site (Astro SSG on Cloudflare Pages)
sca-internal: Project management and cross-project status

Current Status

Component | Status
12 content schemas defined | Done
Content migrated: 9 pages, 7 programs, 13 news articles, 87 images | Done
GAM automation (student Drive folder provisioning) | Done
Reference frontend (SSR for dynamic content) | Done
Faculty data extraction | Not started
Frontend design | Not started

2

Architecture

Schemas, data shapes, automation

I defined the schemas in apps/sca-studio/schemaTypes/. Each defines a content type with typed fields, validation rules, and references to other types.
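
To give a feel for what one of these definitions looks like, here is a minimal sketch of studentProject written as a plain schema object. The four field names come from the Document Types table; the field types, the required rule on title, and every folderStatus value other than "Provisioning" are assumptions for illustration, not a copy of the actual file in apps/sca-studio/schemaTypes/.

```typescript
// Minimal local types so the sketch has no dependencies. In a real studio the
// object would typically be wrapped in Sanity's defineType/defineField helpers.
interface RuleLike {
  required(): RuleLike;
}

interface FieldDef {
  name: string;
  title: string;
  type: string;
  options?: { list: string[] };
  validation?: (rule: RuleLike) => RuleLike;
}

interface DocumentDef {
  name: string;
  title: string;
  type: "document";
  fields: FieldDef[];
}

const studentProject: DocumentDef = {
  name: "studentProject",
  title: "Student Project",
  type: "document",
  fields: [
    // Assumed rule: a project must have a title before it can be saved.
    { name: "title", title: "Title", type: "string", validation: (rule) => rule.required() },
    { name: "description", title: "Description", type: "text" },
    { name: "driveUrl", title: "Drive URL", type: "url" },
    {
      name: "folderStatus",
      title: "Folder Status",
      type: "string",
      // "Provisioning" is the value the automation polls for; the other two
      // values are hypothetical placeholders.
      options: { list: ["Provisioning", "Provisioned", "Failed"] },
    },
  ],
};
```

Restricting folderStatus to a fixed list is the design choice that matters here: an external agent can only rely on a status field if editors cannot type arbitrary values into it.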

Document Types (12)

Type | Key Fields | References
page | title, slug, body (Portable Text), parent | parent → page (hierarchy)
person | name, role, bio, image | department → department
program | title, slug, description, collegePlacementSummary | coach → person
news | title, slug, publishedAt, source, body | -
studentProject | title, description, driveUrl, folderStatus | → triggers GAM automation
event | title, date, location, description | -
department | name, description | -
alumniStory | name, graduationYear, story, outcomes | -
mediaGallery | title, images[], description | -
boardingFeature | title, description, image | -
admissionsPath | title, steps, requirements | -
siteSettings | title, description, announcement | singleton

Most document types include an seo object. The studentProject type includes a folderStatus field that tracks Drive provisioning state — this is what enables the automation described below.

Shared Object Types (3)

Type | Fields | Used By
seo | title, description, image | page, news, siteSettings (as defaultSeo)
announcement | enabled, text, link | siteSettings
socialLink | platform, url | siteSettings

These are reusable structures embedded in document types, not standalone documents. They ensure consistency — an SEO object has the same shape whether it's on a page or a news article.
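
Because the seo object has one shape everywhere, a consumer can handle it with a single function. The sketch below is one possible way to do that, under assumptions: it treats siteSettings.defaultSeo as a fallback for fields a document leaves empty, and it flattens image to a URL string. Neither behaviour is confirmed by the schemas, and resolveSeo is a hypothetical helper name.

```typescript
// Shared shape, mirroring the seo object's three fields (image simplified to a URL).
interface Seo {
  title?: string;
  description?: string;
  image?: string;
}

// Any document type that embeds the seo object.
interface DocWithSeo {
  title: string;
  seo?: Seo;
}

interface SiteSettings {
  title: string;
  defaultSeo?: Seo;
}

interface ResolvedSeo {
  title: string;
  description: string | undefined;
  image: string | undefined;
}

// Hypothetical helper: prefer the document's own seo values, then fall back
// to the document title or the site-wide defaults.
function resolveSeo(doc: DocWithSeo, settings: SiteSettings): ResolvedSeo {
  const fallback: Seo = settings.defaultSeo ?? {};
  return {
    title: doc.seo?.title ?? doc.title,
    description: doc.seo?.description ?? fallback.description,
    image: doc.seo?.image ?? fallback.image,
  };
}
```

The same function works for a page or a news article without any per-type branching, which is the practical payoff of sharing the object type.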

Querying Content

All content is accessible via GROQ (Sanity's query language) or GraphQL. The API is public for reads — no authentication required to query published content. See Code Reference for query examples and a live endpoint you can test in your browser.
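
For consumers that are not using the Sanity client, a plain HTTP call is enough. The following is a minimal sketch, assuming the endpoint listed under Sanity Project Config and Sanity's convention of passing GROQ parameters as `$name=<JSON value>` in the query string. buildQueryUrl, runQuery, and the FetchLike type are hypothetical names introduced here; the fetch function is passed in so the sketch carries no runtime dependencies.

```typescript
const PROJECT_ID = "wesg5rw8";
const DATASET = "production";
const API_VERSION = "2024-01-01";

// Build the GET URL for a GROQ query, URL-encoding the query and any parameters.
function buildQueryUrl(query: string, params: Record<string, unknown> = {}): string {
  const base = `https://${PROJECT_ID}.api.sanity.io/v${API_VERSION}/data/query/${DATASET}`;
  const parts = [`query=${encodeURIComponent(query)}`];
  for (const name of Object.keys(params)) {
    // Parameter values are sent JSON-encoded, so strings keep their quotes.
    const value = encodeURIComponent(JSON.stringify(params[name]));
    parts.push(`${encodeURIComponent("$" + name)}=${value}`);
  }
  return `${base}?${parts.join("&")}`;
}

// Minimal subset of the fetch API that this sketch relies on.
type FetchLike = (url: string) => Promise<{
  ok: boolean;
  status: number;
  json(): Promise<unknown>;
}>;

// Run a query and return the `result` field of the response body.
async function runQuery<T>(
  fetchImpl: FetchLike,
  query: string,
  params: Record<string, unknown> = {},
): Promise<T> {
  const res = await fetchImpl(buildQueryUrl(query, params));
  if (!res.ok) {
    throw new Error(`Sanity query failed with HTTP ${res.status}`);
  }
  const body = (await res.json()) as { result: T };
  return body.result;
}
```

In a browser or a recent Node.js runtime, a call would look like `runQuery(fetch, '*[_type == "program"]{title, slug}')`.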

GAM Automation — Example of What Structured Content Enables

When I create a studentProject document in Sanity, a GAM watch agent automatically provisions a Google Shared Drive folder for the project. The folder name embeds the Sanity document ID as an idempotency key: "{title} [sca-project-{id}]". The same pattern would work with WordPress Custom Post Types: what matters is typed documents with status fields, not the CMS.

  • Polls Sanity for folderStatus == "Provisioning" records
  • Creates Shared Drive folder via GAM CLI
  • Writes Drive URL back to Sanity document
  • Supervised by systemd with crash recovery (ADR-005)
  • Currently on a dedicated Linux server; standard Node.js — no platform-specific dependencies

Staff creates a project → system provisions a Drive workspace → editorial review before publishing. No IT tickets.
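
The agent's polling pass can be sketched roughly as follows. This is a minimal sketch, not the production agent: the AgentDeps interface and its four functions are hypothetical stand-ins for the real Sanity query, Drive lookup, GAM CLI call, and Sanity patch, and the lookup-before-create step is one assumed way of using the folder name as an idempotency key.

```typescript
// Hypothetical shape of a pending project returned by the Sanity query.
interface PendingProject {
  _id: string;
  title: string;
}

// Hypothetical dependencies; the real agent's internals are not shown here.
interface AgentDeps {
  // e.g. *[_type == "studentProject" && folderStatus == "Provisioning"]{_id, title}
  fetchPending(): Promise<PendingProject[]>;
  // Returns the Drive URL of an existing folder with this exact name, or null.
  findFolder(name: string): Promise<string | null>;
  // Wraps the GAM CLI call that creates the Shared Drive folder; returns its URL.
  createFolder(name: string): Promise<string>;
  // Writes the Drive URL back to the Sanity document.
  writeBack(id: string, driveUrl: string): Promise<void>;
}

// Folder name format from the text: "{title} [sca-project-{id}]".
function folderNameFor(title: string, id: string): string {
  return `${title} [sca-project-${id}]`;
}

// Recover the Sanity document ID from a folder name, or null if it has no key.
function projectIdFromFolderName(name: string): string | null {
  const match = name.match(/\[sca-project-([^\]]+)\]$/);
  return match?.[1] ?? null;
}

// One polling pass. Safe to re-run after a crash: a folder that already
// exists is reused rather than created a second time.
async function pollOnce(deps: AgentDeps): Promise<number> {
  const pending = await deps.fetchPending();
  for (const project of pending) {
    const name = folderNameFor(project.title, project._id);
    const url = (await deps.findFolder(name)) ?? (await deps.createFolder(name));
    await deps.writeBack(project._id, url);
  }
  return pending.length;
}
```

Embedding the document ID in the folder name is what makes the retry safe: if the agent crashes after creating a folder but before writing the URL back, the next pass finds the folder by name instead of creating a duplicate.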

Reference Frontend

I built a reference Astro site that demonstrates how the content layer renders. Key patterns in apps/web/src/pages/:

  • SSR for detail pages: /news/[slug] and /projects/[id] fetch at request time (ADR-007)
  • Catch-all route: [...slug].astro resolves the page hierarchy for all 9 imported info pages
  • Static index pages: /news and /projects are build-time rendered
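
The hierarchy lookup behind the catch-all route can be illustrated with a small path index. This is a sketch under assumptions rather than the code in [...slug].astro: it assumes each page has been projected to a flat record with a string slug and an optional parentId, and buildPathIndex is a hypothetical helper name.

```typescript
// Assumed flat projection of a page document (slug as a plain string,
// parent reduced to the parent's _id).
interface PageDoc {
  _id: string;
  title: string;
  slug: string;
  parentId?: string;
}

// Map each page's full path (for example "about/history") to its document
// by walking the parent chain from the page up to the root.
function buildPathIndex(pages: PageDoc[]): Map<string, PageDoc> {
  const byId = new Map<string, PageDoc>();
  for (const page of pages) {
    byId.set(page._id, page);
  }

  const index = new Map<string, PageDoc>();
  for (const page of pages) {
    const segments: string[] = [];
    const seen = new Set<string>(); // guards against accidental parent cycles
    let current: PageDoc | undefined = page;
    while (current !== undefined && !seen.has(current._id)) {
      seen.add(current._id);
      segments.unshift(current.slug);
      current = current.parentId !== undefined ? byId.get(current.parentId) : undefined;
    }
    index.set(segments.join("/"), page);
  }
  return index;
}
```

A catch-all route receives the remaining path as a single string, so resolving a request becomes one `index.get(path)` call, with a 404 when the result is undefined.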

Deployment is manual, via the CLI: run vercel --prod from apps/web. There are no Git-triggered deploys (ADR-002).

3

Integration

How to consume this from WordPress or anything else

WordPress Integration

WordPress can pull structured content from Sanity via REST API, scheduled sync, or webhooks. The Sanity API is a standard HTTP endpoint — any system that can make a GET request can consume the content.

See Code Reference for a working PHP example and a mapping of Sanity concepts to their WordPress + ACF equivalents.

Portability

The Sanity-specific code is limited to queries and client config in apps/web/src/pages/. The schema patterns map directly to WordPress Custom Post Types + ACF Field Groups. Content is exportable as JSON (sanity dataset export). The data layer is portable — schema design is the investment, not vendor lock-in.
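
As a rough illustration of how little is needed to work with an export, the sketch below groups exported documents by type, which is the natural first step before loading them into Custom Post Types. It assumes the export archive has already been unpacked and its newline-delimited JSON file read into a string; groupByType and ExportedDoc are hypothetical names.

```typescript
// Every Sanity document carries _id and _type; other fields vary by type.
interface ExportedDoc {
  _id: string;
  _type: string;
  [field: string]: unknown;
}

// Parse newline-delimited JSON (one document per line) and group by _type.
function groupByType(ndjson: string): Map<string, ExportedDoc[]> {
  const groups = new Map<string, ExportedDoc[]>();
  for (const line of ndjson.split("\n")) {
    if (line.trim() === "") continue; // skip blank lines, e.g. a trailing newline
    const doc = JSON.parse(line) as ExportedDoc;
    const bucket = groups.get(doc._type) ?? [];
    bucket.push(doc);
    groups.set(doc._type, bucket);
  }
  return groups;
}
```

Each resulting group corresponds to one document type, so `groups.get("program")` would hold the records destined for a program post type.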

What Content Modeling Enables for AI

Structured content is more reliably queryable by AI systems than page-based content. When content has typed fields and declared relationships, AI can give precise answers without having to infer structure from HTML. For example, a parent's AI assistant asking "What programs does SCA offer?" gets more reliable results from a query against typed program documents than from scraping a web page that mixes program descriptions with navigation and marketing copy.

This applies regardless of whether the structured data lives in Sanity, WordPress + ACF, or any other system with a query API. The value is in the schema design, not the vendor. Industry context: Sanity's analysis of the shift in content operations, Google's structured data guidelines, Schema.org.

Data Ownership

I want to be explicit about data ownership: both technical teams are consultancies. The school owns all content and data. Sanity content is exportable as JSON (sanity dataset export). Images are on Sanity's CDN but downloadable. I've version-controlled the schemas in GitHub. This document is part of ensuring any developer can pick up where I leave off.

4

Code Reference

GROQ examples, WordPress integration code, concept mapping

GROQ Query Examples

GROQ queries run against the endpoint or via the JavaScript client.

// All programs with their coach
*[_type == "program"]{
  title, slug, description,
  collegePlacementSummary,
  coach->{name, role, image}
}

// News articles, most recent first
*[_type == "news"] | order(publishedAt desc){
  title, slug, publishedAt, source,
  "imageUrl": mainImage.asset->url
}

// Page tree (parent/child hierarchy)
*[_type == "page"]{
  title, slug,
  parent->{title, slug}
}

Try it live by pasting this URL into your browser: https://wesg5rw8.api.sanity.io/v2024-01-01/data/query/production?query=*[_type=="program"]{title,slug}

WordPress: Fetch From Sanity (PHP)

// In your WordPress theme or plugin
$query = urlencode('*[_type == "program"]{title, slug, description}');
$url = "https://wesg5rw8.api.sanity.io/v2024-01-01/data/query/production?query={$query}";
$response = wp_remote_get($url);
// wp_remote_get() returns a WP_Error on failure; fall back to an empty list
$programs = [];
if (!is_wp_error($response) && wp_remote_retrieve_response_code($response) === 200) {
    $data = json_decode(wp_remote_retrieve_body($response));
    $programs = $data->result ?? [];
}

Sanity → WordPress Concept Mapping

Sanity | WordPress + ACF
document type | Custom Post Type
object type | ACF Field Group
reference | ACF Relationship / Post Object field
Portable Text | Gutenberg blocks or ACF WYSIWYG
GROQ query | WPGraphQL or WP_Query