Metadata
Page-level metadata - content type, associated products, last updated, word count - lets you take a broader, more strategic view of your content.
It helps you answer questions like the following:
- As a writer:
  - Am I missing something obvious in the content strategy?
  - What are some pages I should be updating right now?
  - How does X tutorial compare with all tutorials? Is it getting more traffic than the baseline?
- As a manager:
  - Are we over or underinvesting in a specific product area? Or a specific content type?
  - How does the traffic to this set of products compare to another?
  - How can I communicate broader trends to my stakeholders?
You cannot answer these questions without some level of rollup reporting, which you can only get through metadata.
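As a rough illustration of what rollup reporting means here, the sketch below groups page records by one metadata field and computes a per-group baseline that a single page can be compared against. The `PageRecord` shape, the `views` field, the paths, and the numbers are all hypothetical and do not describe the actual reporting pipeline.

```typescript
// Hypothetical page record: metadata plus a traffic number from any analytics source.
interface PageRecord {
  path: string;
  product: string;
  contentType: string;
  views: number;
}

interface GroupStats {
  pages: number;
  avgViews: number;
}

// Roll pages up by one metadata field and compute page count and average views per group.
function rollup(
  pages: PageRecord[],
  key: "product" | "contentType",
): Record<string, GroupStats> {
  const totals: Record<string, { pages: number; views: number }> = {};
  for (const page of pages) {
    const groupName = page[key];
    const group = totals[groupName] ?? { pages: 0, views: 0 };
    group.pages += 1;
    group.views += page.views;
    totals[groupName] = group;
  }

  const stats: Record<string, GroupStats> = {};
  for (const name of Object.keys(totals)) {
    const { pages: count, views } = totals[name];
    stats[name] = { pages: count, avgViews: views / count };
  }
  return stats;
}

// Illustrative data only.
const samplePages: PageRecord[] = [
  { path: "/workers/tutorials/a/", product: "workers", contentType: "tutorial", views: 300 },
  { path: "/workers/tutorials/b/", product: "workers", contentType: "tutorial", views: 100 },
  { path: "/dns/faq/", product: "dns", contentType: "faq", views: 50 },
];

const byContentType = rollup(samplePages, "contentType");
const byProduct = rollup(samplePages, "product");
```

With the sample data, the tutorial baseline is the average of the two tutorial pages, so the first tutorial sits above it and the second below it. The same function answers the manager-level questions by switching the key to `product`.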
At Cloudflare, we track the following information about different pages:
Value | Description | Examples |
---|---|---|
Product | The top-level subfolder of the page. | dns, bots |
Product Group | The primary area that each product falls into. | Application Performance, Developer Platform |
Tags | Specific attributes related to a page's content or purpose. | AI, JavaScript, Headers |
Content type | The primary purpose of the page, which corresponds to our listed content types. | how-to, faq |
Last modified | How many days ago was this page last updated? | 63 |
Last reviewed (optional) | How many days ago was this page last reviewed? | 100 |
Of these values, Last reviewed needs the most explanation. It differs from Last modified because a review is more thorough than an update: a review implies that all contents of the page have been vetted for accuracy.
Because of this extra effort, we only track Last reviewed for content types that are particularly important to the user journey and require an additional level of maintenance. At the moment, those content types are tutorials.
We set these values at two different levels, the folder level and the page level.
We set two values at the folder level, Product and Product Group. We take this approach because we can assume that these values apply to every page within that folder.
For example, here's the content from our DNS folder.
name: DNS
product:
  title: DNS
  url: /dns/
  group: Application performance
meta:
  title: Cloudflare DNS docs
  description: Cloudflare DNS provides the fastest, most resilient, and simplest managed DNS platform to meet your needs.
  author: "@cloudflare"
resources:
  community: https://community.cloudflare.com/tags/c/reliability/7/none
  dashboard_link: https://dash.cloudflare.com/?to=/:account/:zone/dns
  learning_center: https://www.cloudflare.com/learning/dns/what-is-dns/
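One way to picture the folder-level approach: a page's product is the top-level folder in its path, and the Product Group is looked up from that folder's values. The sketch below assumes a hypothetical in-memory `folderMetadata` map standing in for the per-product files, and the page paths and function names are illustrative only.

```typescript
// Folder-level values, keyed by top-level folder.
// Hypothetical stand-in for the per-product files; only DNS is filled in here.
interface FolderMetadata {
  title: string;
  group: string;
}

const folderMetadata: Record<string, FolderMetadata> = {
  dns: { title: "DNS", group: "Application performance" },
};

// The product is the first non-empty path segment, e.g. "/dns/some-page/" -> "dns".
function productFolder(pagePath: string): string | undefined {
  return pagePath.split("/").filter((segment) => segment.length > 0)[0];
}

// Resolve the folder-level values that every page in the folder inherits.
function folderValuesFor(pagePath: string): FolderMetadata | undefined {
  const folder = productFolder(pagePath);
  return folder === undefined ? undefined : folderMetadata[folder];
}
```

Because the lookup depends only on the path, no individual page has to repeat the product or group in its own frontmatter.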
We primarily set page-level attributes through the page's frontmatter.
For example, here are the values set for our Build a Slackbot tutorial.
---
updated: 2024-06-05
difficulty: Beginner
pcx_content_type: tutorial
title: Build a Slackbot
tags:
  - Hono
languages:
  - TypeScript
---
However, the last_modified value is pulled automatically from the git history of a file.
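The table above expresses Last modified as a number of days. A minimal sketch of that conversion follows, assuming the last commit timestamp is already available as an ISO 8601 string (for example, from `git log -1 --format=%cI -- <file>`). The `daysSince` helper is hypothetical and is not the site's actual implementation.

```typescript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Whole days elapsed between an ISO 8601 timestamp and "now".
// `now` is a parameter so the function stays deterministic and easy to test.
function daysSince(isoTimestamp: string, now: Date = new Date()): number {
  const then = new Date(isoTimestamp);
  if (isNaN(then.getTime())) {
    throw new Error(`Invalid timestamp: ${isoTimestamp}`);
  }
  // Clamp at zero so a timestamp slightly in the future never yields a negative age.
  return Math.max(0, Math.floor((now.getTime() - then.getTime()) / MS_PER_DAY));
}
```

Passing the reference date explicitly, rather than reading the clock inside the function, keeps a build reproducible when every page is measured against the same moment.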
We choose to render all of these values as specific meta properties for each page.
For example, these are the meta properties and values on the AI Audit - Get Started page.
<meta name="pcx_content_group" content="Core platform">
<meta name="pcx_product" content="AI Audit">
<meta name="pcx_content_type" content="get-started">
<meta name="pcx_last_modified" content="7">
We render these values using a custom override for our Head.astro file. If specific values are set, we then add them as meta tags onto the page.
if (product.data.product.title) {
  ["pcx_product", "algolia_product_filter"].map((name) => {
    metaTags.push({
      name,
      content: product.data.product.title,
    });
  });
}
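To show how conditionally pushed entries could end up as rendered tags, here is a simplified, framework-free sketch. The `PageMetadata` shape and the `buildMetaTags`, `escapeAttribute`, and `renderMetaTags` helpers are hypothetical; the real site does this work inside its Astro head override.

```typescript
interface MetaTag {
  name: string;
  content: string;
}

// Hypothetical, flattened view of the metadata available for one page.
interface PageMetadata {
  product?: string;
  productGroup?: string;
  contentType?: string;
  lastModifiedDays?: number;
}

// Only emit a tag when the corresponding value is actually set.
function buildMetaTags(page: PageMetadata): MetaTag[] {
  const metaTags: MetaTag[] = [];
  if (page.product) {
    // The same product title feeds both the reporting tag and the search filter tag.
    for (const name of ["pcx_product", "algolia_product_filter"]) {
      metaTags.push({ name, content: page.product });
    }
  }
  if (page.productGroup) {
    metaTags.push({ name: "pcx_content_group", content: page.productGroup });
  }
  if (page.contentType) {
    metaTags.push({ name: "pcx_content_type", content: page.contentType });
  }
  if (page.lastModifiedDays !== undefined) {
    metaTags.push({ name: "pcx_last_modified", content: String(page.lastModifiedDays) });
  }
  return metaTags;
}

// Escape the characters that could break out of a double-quoted attribute.
function escapeAttribute(value: string): string {
  return value.replace(/&/g, "&amp;").replace(/"/g, "&quot;").replace(/</g, "&lt;");
}

// Render one <meta> element per tag, one per line.
function renderMetaTags(tags: MetaTag[]): string {
  return tags
    .map((tag) => `<meta name="${escapeAttribute(tag.name)}" content="${escapeAttribute(tag.content)}">`)
    .join("\n");
}
```

Checking `lastModifiedDays` against `undefined` rather than for truthiness matters here: a page modified today has a value of 0, which should still be rendered.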
We get two primary benefits from structuring our content this way.
First, our metadata is easily consumable by anyone who crawls our pages. We started using these values for our Algolia search configuration and internal reporting, but have since expanded to sharing this data with other teams that consume our content for AI systems too.
Additionally, this decision means that our GitHub repo is always the source of truth. We do not have to keep a spreadsheet or mapping updated elsewhere, so the data is much more likely to be accurate than if we maintained multiple sources of truth.
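As an illustration of the first benefit, a consumer needs nothing more than the rendered HTML to recover these values. The sketch below uses a deliberately simple regular expression that assumes `name` comes before `content` and that both are double-quoted, as in the example tags earlier on this page; a production crawler would more likely use a real HTML parser. The `extractPcxMetadata` name is hypothetical.

```typescript
// Collect every <meta name="pcx_..." content="..."> pair from an HTML string.
function extractPcxMetadata(html: string): Record<string, string> {
  const pattern = /<meta\s+name="(pcx_[^"]+)"\s+content="([^"]*)"\s*\/?>/g;
  const found: Record<string, string> = {};
  let match: RegExpExecArray | null;
  // exec() advances through the string because the pattern has the global flag.
  while ((match = pattern.exec(html)) !== null) {
    found[match[1]] = match[2];
  }
  return found;
}

// Example input modeled on the tags shown earlier, plus one unrelated tag.
const sampleHead =
  '<meta name="pcx_content_group" content="Core platform" >' +
  '<meta name="pcx_product" content="AI Audit" >' +
  '<meta name="description" content="Not a pcx tag">';

const extracted = extractPcxMetadata(sampleHead);
```

Here `extracted` holds the two `pcx_` values and ignores the `description` tag, which is the shape a reporting job or search indexer would work from.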
It's difficult to avoid errors with this kind of metadata, specifically because we are relying on freeform text entry in the frontmatter of individual files.
We use Zod schemas heavily in our Astro site. These schemas are defined in src/schemas/.
These allow us to provide IntelliSense guidance for contributors using IDEs for local development.
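To make the idea concrete without pulling in a dependency, the sketch below hand-rolls the kind of check a schema performs: it rejects a frontmatter object whose `pcx_content_type` is not on an allowed list. The allowed values are only the ones mentioned on this page, and `validateFrontmatter` is a hypothetical stand-in, not the Zod schemas defined in the repo.

```typescript
// Only the content types mentioned on this page; treat the list as illustrative.
const KNOWN_CONTENT_TYPES = ["how-to", "faq", "tutorial", "get-started"] as const;
type ContentType = (typeof KNOWN_CONTENT_TYPES)[number];

interface Frontmatter {
  title: string;
  pcx_content_type: ContentType;
  tags?: string[];
}

type ValidationResult =
  | { success: true; data: Frontmatter }
  | { success: false; errors: string[] };

function isStringArray(value: unknown): value is string[] {
  return Array.isArray(value) && value.every((item) => typeof item === "string");
}

// Check untrusted, freeform frontmatter and report every problem found.
function validateFrontmatter(input: Record<string, unknown>): ValidationResult {
  const errors: string[] = [];
  const { title, pcx_content_type, tags } = input;

  if (typeof title !== "string" || title.trim() === "") {
    errors.push("title must be a non-empty string");
  }
  const contentTypeOk =
    typeof pcx_content_type === "string" &&
    (KNOWN_CONTENT_TYPES as readonly string[]).indexOf(pcx_content_type) !== -1;
  if (!contentTypeOk) {
    errors.push(`pcx_content_type must be one of: ${KNOWN_CONTENT_TYPES.join(", ")}`);
  }
  if (tags !== undefined && !isStringArray(tags)) {
    errors.push("tags must be a list of strings");
  }

  if (errors.length > 0) {
    return { success: false, errors };
  }
  const data: Frontmatter = {
    title: title as string,
    pcx_content_type: pcx_content_type as ContentType,
  };
  if (isStringArray(tags)) {
    data.tags = tags;
  }
  return { success: true, data };
}
```

A misspelled value such as `tutorail` fails this check at build time instead of silently creating a new content type in reports, which is the class of freeform-entry error described above.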
