Volunteer Engagement
PostSecret: making 14,000 anonymous secrets searchable
PostSecret is one of the longest-running community art projects on the web, built on secrets people mail in on postcards. The platform turns two decades of that archive into a searchable experience, with search that understands meaning, not just keywords. I am the volunteer technical lead and primary engineer, working directly with founder Frank Warren.
- Volunteer Technical Lead
- WordPress + Qdrant + embeddings
- ~14,000 anonymous secrets
Overview
For twenty years, PostSecret has collected secrets that strangers mail in on homemade postcards. The collection grew into one of the most recognized community art projects on the internet, but it existed only as images, with no real way to explore it. The digital archive makes the whole body of work searchable while protecting the anonymity and sensitivity the project is built on.
- ~14,000 anonymous secrets brought into a single searchable archive
- Search that understands meaning: semantic, full-text, and faceted search merged into one ranked result set
- A classifier tags every secret by topic, feeling, meaning, style, location, and vibe
- A moderation pipeline keeps sensitive submissions out of public view by default
- Built on WordPress, hosted on Automattic's Pressable
The project
PostSecret's archive is unlike a normal content library. Every item is anonymous, emotionally charged, and submitted in trust. The design problem was to make two decades of that material explorable without ever flattening it into a generic database.
Two decades of postcards
- Thousands of physical cards, front and back
- Years of submissions with uneven metadata
- No structured way to browse by theme or feeling
Hard to explore
- The collection lived as images, not searchable text
- Keyword search alone misses meaning and tone
- Readers could not follow a thread across years
Sensitive by nature
- Anonymity is the whole point and must be preserved
- Some submissions are not appropriate for public view
- Moderation has to be built in, not bolted on
The archive and search
Search is the heart of the experience. A single query runs three ways at once and the results are blended into one ranked list, so a reader can search by exact words, by meaning, or by browsing facets, and still get one coherent answer.
| Component | What it does |
|---|---|
| Full-text | Classic keyword search over the transcribed text of each secret |
| Semantic | Meaning-based search using embeddings stored in a Qdrant vector database, so a query finds related secrets even with no shared words |
| Faceted | Browse by classifier-assigned facets: topics, feelings, meanings, style, locations, and vibe |
| Unified ranking | A query parser detects intent and a ranking method blends the three modes into one result set |
The same engine powers "similar secrets" on the detail page and a faceted homepage feed that renders the archive in an editorial layout rather than a flat list.
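To make the blending step concrete, here is a minimal sketch of one common way to merge several ranked lists: reciprocal rank fusion. This is an illustrative assumption, not necessarily the ranking method the platform uses; the function name, the constant `k=60`, and the secret IDs are all hypothetical.

```python
from collections import defaultdict


def reciprocal_rank_fusion(ranked_lists, k=60):
    """Blend several ranked lists of secret IDs into one ranking.

    Each list contributes 1 / (k + rank) for every ID it contains, so an ID
    that ranks well in more than one mode rises to the top.
    """
    scores = defaultdict(float)
    for ranking in ranked_lists:
        for rank, secret_id in enumerate(ranking, start=1):
            scores[secret_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID so output is deterministic.
    return sorted(scores, key=lambda sid: (-scores[sid], sid))


# Hypothetical per-mode results for one query (IDs are made up).
full_text = [101, 205, 309]
semantic = [205, 412, 101]
faceted = [205, 309]

fused = reciprocal_rank_fusion([full_text, semantic, faceted])
print(fused)
```

A rank-based blend like this sidesteps the problem that keyword scores and vector similarities are not on the same scale, which is one reason it is a popular default for hybrid search.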
Architecture
The platform is a set of custom WordPress plugins, chosen so the project lives natively inside Automattic's ecosystem and the Special Projects team can operate it without a bespoke stack.
The stack
- WordPress with custom plugins for classification and ingestion, search, the homepage feed, and database migrations
- A Qdrant vector database that stores the embeddings powering semantic search and similar-secret recommendations
- A model that classifies each secret into facets and produces the embeddings used for meaning-based search
- MySQL for the WordPress data store plus custom tables for classification, facets, embeddings, audit logs, and search evaluation
- Pressable, Automattic's managed WordPress hosting, where the platform runs
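As a rough illustration of what the vector layer does for semantic search and similar-secret recommendations, the sketch below ranks stored embeddings by cosine similarity in plain Python. It is a stand-in for the idea only: it does not use Qdrant's API, the tiny three-dimensional vectors are made up, and the names `cosine`, `nearest`, and `index` are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def nearest(query_vec, index, top_k=3, exclude_id=None):
    """Return the IDs of the top_k stored vectors closest to query_vec."""
    scored = [
        (cosine(query_vec, vec), secret_id)
        for secret_id, vec in index.items()
        if secret_id != exclude_id
    ]
    # Most similar first; ties broken by ID for a stable order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [secret_id for _, secret_id in scored[:top_k]]


# Toy index: secret ID -> embedding (real embeddings have far more dimensions).
index = {
    1: [1.0, 0.0, 0.0],
    2: [0.9, 0.1, 0.0],
    3: [0.2, 1.0, 0.0],
    4: [0.0, 0.0, 1.0],
}

# "Similar secrets" for secret 1: search with its own vector, excluding itself.
similar_to_1 = nearest(index[1], index, top_k=2, exclude_id=1)
print(similar_to_1)
```

A dedicated vector database does the same job with approximate nearest-neighbor indexes, so the lookup stays fast as the collection grows.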
Ingestion and moderation
Getting thousands of physical postcards into a clean, safe, searchable archive is its own pipeline, and moderation is a first-class part of it rather than an afterthought.
Ingestion
- A cloud worker reads source images from storage and posts them to a secured ingest endpoint
- Duplicate detection keeps the earliest submission date across copies
- Front and back sides of each card are paired into a single record
- New items arrive as "needs review" and are never public until vetted
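The ingestion rules above can be sketched in a few lines. This is a simplified illustration under stated assumptions: duplicates are detected by an exact hash of the image bytes (real scans of the same card would likely need a fuzzier comparison), and the `Scan` type, the `card_id` field, and the `needs_review` status string are hypothetical names, not the platform's schema.

```python
import hashlib
from dataclasses import dataclass
from datetime import date


@dataclass
class Scan:
    card_id: str        # hypothetical ID shared by both sides of one card
    side: str           # "front" or "back"
    image_bytes: bytes
    submitted: date


def fingerprint(sides):
    """Hash both sides together; labels and lengths keep the input unambiguous."""
    digest = hashlib.sha256()
    for name in ("front", "back"):
        data = sides[name].image_bytes if name in sides else b""
        digest.update(name.encode())
        digest.update(len(data).to_bytes(8, "big"))
        digest.update(data)
    return digest.hexdigest()


def ingest(scans):
    # Pair front and back scans into one entry per card.
    cards = {}
    for scan in scans:
        cards.setdefault(scan.card_id, {})[scan.side] = scan

    records = {}
    for card_id, sides in cards.items():
        key = fingerprint(sides)
        submitted = min(s.submitted for s in sides.values())
        if key in records:
            # Duplicate copy: keep one record, with the earliest date seen.
            records[key]["submitted"] = min(records[key]["submitted"], submitted)
        else:
            records[key] = {
                "card_id": card_id,
                "sides": sorted(sides),
                "submitted": submitted,
                "status": "needs_review",  # never public until vetted
            }
    return list(records.values())


scans = [
    Scan("a", "front", b"F1", date(2010, 5, 2)),
    Scan("a", "back", b"B1", date(2010, 5, 2)),
    Scan("b", "front", b"F1", date(2008, 1, 15)),  # earlier copy of card "a"
    Scan("b", "back", b"B1", date(2008, 1, 15)),
    Scan("c", "front", b"F2", date(2012, 3, 9)),
]
records = ingest(scans)
```

Here cards "a" and "b" collapse into one record dated 2008-01-15, and every record starts in the review state.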
Moderation
- Each secret is tiered by severity, including checks for explicit content and personal information
- A review queue surfaces anything flagged or escalated
- A single visibility gate decides what appears, so sensitive material stays out of view by default
- Every change is written to an audit log
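One way to picture the visibility gate and audit log is the sketch below. The tier names, status strings, and function names are assumptions made for illustration; the point is only that a single predicate decides visibility, the default is hidden, and every review is recorded.

```python
from dataclasses import dataclass
from enum import IntEnum


class Severity(IntEnum):
    # Hypothetical tiers; the real pipeline's tiers may differ.
    CLEAR = 0
    SENSITIVE = 1
    BLOCKED = 2


@dataclass
class Secret:
    secret_id: int
    status: str = "needs_review"            # new items start unvetted
    severity: Severity = Severity.BLOCKED   # hidden until a review says otherwise


audit_log = []


def is_public(secret):
    """The single gate: visible only if approved and rated clear."""
    return secret.status == "approved" and secret.severity == Severity.CLEAR


def review(secret, reviewer, status, severity):
    """Apply a moderation decision and record the before/after state."""
    audit_log.append({
        "secret_id": secret.secret_id,
        "reviewer": reviewer,
        "before": (secret.status, secret.severity.name),
        "after": (status, severity.name),
    })
    secret.status = status
    secret.severity = severity


secret = Secret(secret_id=1)
visible_before = is_public(secret)            # unvetted, so hidden
review(secret, "moderator", "approved", Severity.CLEAR)
visible_after = is_public(secret)             # approved and clear
review(secret, "moderator", "approved", Severity.SENSITIVE)
visible_flagged = is_public(secret)           # escalated, hidden again
```

Routing every read path through one predicate means a new feature cannot accidentally expose unvetted material by forgetting a check.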
My role
I am the volunteer technical lead and primary engineer on the project, working directly with founder Frank Warren and Automattic's Special Projects team on a weekly cadence. I designed and built the platform's core systems end to end: the search stack, the classification pipeline, the content-moderation pipeline, the ingestion path, and the schema-migration system. I also built the front end that surfaces them. The platform is hosted on Automattic's Pressable.
- Designed and built the unified search stack (full-text, semantic, faceted)
- Built the classification pipeline that tags every secret across six facets
- Built the content-moderation pipeline with severity tiering and a visibility gate
- Built the cloud-based ingestion path with duplicate detection and front/back pairing
- Built the schema-migration system for the custom data model
- Built the front-end that surfaces search, the detail view, and the homepage feed
Outcomes
Building something that has to handle sensitive content with care?
PostSecret is what happens when a beloved, deeply human project needs real infrastructure: meaning-based search, careful moderation, and a platform a small team can operate. If that is the kind of problem you are working on, let's talk.