Cache similar-threads results instead of searching per view #913
Merged
The "Similar Threads" sidebar ran a full-text search on every thread-page render via getCachedSimilarThreads. Its 5-minute ActionCache missed ~99% of the time (long-tail SEO traffic rarely re-hits a thread within 5 min), so ~9.8M searches/month each scanned the full ~2.42 GB content index: ~23.7M Convex query-GBs, ~99.9% of all search billing (~$2.3k/mo and rising with the corpus).

Replace the per-view search with a persistent `similarThreads` store keyed by thread id:
- Hit: resolve the stored list via indexed reads, no full-text search.
- Cold/stale miss: search once, persist, return. Recompute only past a 30-day staleness window.

22.15M May views came from just 2.09M unique threads (10.6x repeat-view multiplier), so this cuts search volume to ~2.1M/mo (~$500/mo); the staleness window trades freshness for further savings.

Callers without a parent-channel scope (the MCP tool, which sends currentThreadId "0") bypass the store so they can't collide on a shared key; that path already returned [] and does no search work.

Population is lazy: the store warms as threads are viewed, no backfill needed.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
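The hit, miss, and staleness rules in this commit message can be sketched as a read-through store. This is a minimal in-memory model, not the Convex implementation: the `Map`, the `SimilarThreadsEntry` shape, the `SearchFn` type, and the injected `now` clock are all hypothetical, and the value of `SIMILAR_THREADS_STALE_MS` is assumed from the stated 30-day window.

```typescript
// Hypothetical entry shape; the real table schema is not shown in this PR.
type SimilarThreadsEntry = {
  threadIds: string[]; // stored result list
  computedAt: number; // epoch ms when the search last ran
};

// Assumed value: the PR names this constant and describes a 30-day window.
const SIMILAR_THREADS_STALE_MS = 30 * 24 * 60 * 60 * 1000;

// Stand-in for the expensive full-text search.
type SearchFn = (threadId: string) => string[];

function getSimilarThreads(
  store: Map<string, SimilarThreadsEntry>,
  args: { currentThreadId: string; currentParentChannelId?: string },
  search: SearchFn,
  now: number,
): string[] {
  // Callers without a parent-channel scope bypass the store entirely,
  // so they cannot collide on a shared key. Per the PR, this path
  // returns [] and does no search work.
  if (args.currentParentChannelId === undefined) {
    return [];
  }

  const entry = store.get(args.currentThreadId);

  // Hit: a fresh entry is returned without searching.
  if (entry !== undefined && now - entry.computedAt < SIMILAR_THREADS_STALE_MS) {
    return entry.threadIds;
  }

  // Cold or stale miss: search once, persist, return.
  const threadIds = search(args.currentThreadId);
  store.set(args.currentThreadId, { threadIds, computedAt: now });
  return threadIds;
}
```

In this model, repeat views of the same thread inside the window cost one map lookup each, which is the property the 10.6x repeat-view multiplier relies on.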
| Status | Name | Latest Commit | Updated (UTC) |
|---|---|---|---|
| ❌ Deployment failed | answeroverflow-main-site | 98da403 | Jun 01 2026, 05:53 PM |
CI runs the main-site postinstall under the runner's default Node (now 22, NODE_MODULE_VERSION 127), but the cached better-sqlite3 prebuilt is ABI 115 (Node 20), so `node scripts/generate-community-servers.mjs` aborted with ERR_DLOPEN_FAILED and failed install for every job (Lint/Test/Typecheck).

Switch the script to Bun's built-in `bun:sqlite` (no native addon, no ABI to mismatch) and run it with `bun`, matching the repo convention of preferring bun:sqlite over better-sqlite3. Output is byte-identical.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
With install fixed, `bun run lint` now actually runs in CI and flagged latent debt in untouched files:
- code.tsx: file-level biome-ignore for noDangerouslySetInnerHtml (HTML is shiki-generated from trusted code, matching chart.tsx's existing suppression)
- dashboard page: drop an unused `isLoading` destructured binding

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
With install fixed, the Typecheck job now runs main-site's tsgo and flagged latent errors in untouched code:
- generated community-servers: emit rows via JSON.parse with a declared CommunityServerRow[] type, so TS no longer infers a giant per-row union and fails with TS2590
- og/shared.tsx: build the ArrayBuffer explicitly so the return type is ArrayBuffer, not ArrayBuffer | SharedArrayBuffer
- tsconfig: exclude worker.ts and .open-next from typecheck. The Cloudflare Workers entry imports the build-only ./.open-next/worker.js artifact and Workers types, so it can't typecheck without a build (Workers is out of scope)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
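The first two typecheck fixes in this commit might look roughly like the sketch below. The row fields, the sample JSON, and the `toArrayBuffer` helper name are hypothetical, since the generated file and og/shared.tsx are not shown here; only the two techniques (parsing into a declared row type, and copying into a freshly allocated `ArrayBuffer`) come from the commit message.

```typescript
// Hypothetical row shape; the real generated fields are not shown in this PR.
type CommunityServerRow = {
  id: string;
  name: string;
};

// Emitting the rows as a JSON string and parsing it gives the array one
// declared type, instead of asking the compiler to infer a type from a
// very large array literal.
const rowsJson = '[{"id":"1","name":"Example"},{"id":"2","name":"Other"}]';
const communityServers: CommunityServerRow[] = JSON.parse(rowsJson);

// Copying into a newly allocated ArrayBuffer makes the return type exactly
// ArrayBuffer, rather than whatever buffer type backs the incoming view.
function toArrayBuffer(bytes: Uint8Array): ArrayBuffer {
  const out = new ArrayBuffer(bytes.byteLength);
  new Uint8Array(out).set(bytes);
  return out;
}
```

The copy costs one extra allocation per call, which is usually acceptable for build-time or image-generation paths.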
Problem
The Similar Threads sidebar runs a full-text search on every thread-page render (`getCachedSimilarThreads`). Its 5-minute `ActionCache` missed ~99% of the time, because long-tail SEO traffic rarely re-hits the same thread within 5 minutes, so the underlying query ran ~9.8M times in May, each scanning the entire ~2.42 GB content index (constant: p50 = p95 = avg).

That's ~23.7M Convex query-GBs/month, ~99.9% of all search billing (≈ $2.3k/mo, and rising as the corpus grows since each search reads the whole index). Confirmed via Axiom: the volume tracks thread-page views at a flat ~0.47 ratio and is decoupled from MCP traffic, so it's the website sidebar, not the MCP.
Fix
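Replace the per-view search with a persistent `similarThreads` store keyed by thread id:
- Hit: resolve the stored list via indexed reads, with no full-text search.
- Cold or stale miss: search once, persist, return.
- Recompute only past a 30-day staleness window (`SIMILAR_THREADS_STALE_MS`), which bounds staleness so new threads eventually surface.

Removed the ineffective 5-minute `ActionCache`. Population is lazy: the store warms as threads are viewed, so no backfill is needed; the schema change is purely additive (new empty table).

Safety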
Callers without a `currentParentChannelId` (the MCP `find_similar_threads` tool, which sends `currentThreadId: "0"`) bypass the store so they can't collide on a shared key. That path already returned `[]` today and does no search work, so behavior is unchanged.

The Discord bot's separate live `getSimilarThreads` query (tiny volume, under the free tier) is untouched.

Expected impact
22.15M May views came from only 2.09M unique threads (a 10.6× repeat-view multiplier the old cache missed). The staleness window is the cost/freshness knob — e.g. 90 days → ~$165/mo, trending toward the new-thread-creation rate for longer windows.
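The headline figures can be checked with a few lines of arithmetic. This sketch uses only numbers quoted in this PR. It assumes cost scales linearly with search count while the index size stays fixed, and that each unique thread triggers about one search per month. Both are simplifications, not billing rules.

```typescript
// Figures quoted in this PR (May traffic).
const searchesPerMonth = 9.8e6;
const indexSizeGb = 2.42;
const monthlyCostUsd = 2300;
const mayViews = 22.15e6;
const uniqueThreads = 2.09e6;

// Each search scans the whole index, so query-GB = searches x index size.
const queryGb = searchesPerMonth * indexSizeGb; // roughly 23.7M query-GB

// Views per unique thread: the repeat-view multiplier.
const repeatMultiplier = mayViews / uniqueThreads; // roughly 10.6

// Assumption: one search per unique thread per month, cost linear in searches.
const projectedCostUsd = monthlyCostUsd * (uniqueThreads / searchesPerMonth);

console.log({ queryGb, repeatMultiplier, projectedCostUsd });
```

Under those assumptions the projected cost lands a little under $500/mo, consistent with the estimate in the commit message for ~2.1M searches/mo.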
Follow-ups (not in this PR)
- … `search_content` would cut the per-search cost.

Test plan
- `bunx tsgo -p convex/tsconfig.json --noEmit`: clean (only a pre-existing, unrelated Stripe error)
- `public/search:getSimilarThreadsInternal` query-GBs drop in Axiom after rollout

🤖 Generated with Claude Code