v2026.5.12 — Research Corpus Tooling, Engineering Blog Launch, and Release-Workflow Discipline
v2026.5.12 — Research Corpus Tooling, Engineering Blog Launch, and Release-Workflow Discipline
Tag: `v2026.5.12` · Released: 2026-05-27 · npm: `npm install -g [email protected]`
This release lands the bulk of AIWG's research-corpus tooling merge from `section9/research-papers` into the native `research-complete` framework — radar/freshness, profile generation, funder analytics, discovery logging, and a configurable corpus root with a shared parser foundation — and launches the AIWG engineering blog as a single-source-of-truth markdown collection rendered by pagenary 2026.5.3. The Gitea release workflows now state what they actually do (Gitea owns its own registry mirror; npmjs.org publishes from GitHub Actions via OIDC) — closing #1481 once the operator rotated the underlying token.
What's new
The research-corpus tooling merge is the largest piece. AIWG's `research-complete` framework absorbed the radar/freshness subsystem (`radar-init`, `radar-status`, `radar-report` plus matching skills and a template), profile generation (`profile-generate`, `profile-status`, `--fm` variant), entity-profile graph analytics (centrality, communities, temporal), funder-network analytics, and discovery-logging with PROF-S curator tooling. A shared parser foundation backs all of it, and the corpus root is now configurable instead of hardcoded — what was 26 path-bound scripts in a separate repo is now one native subsystem you reach through `aiwg corpus` and `aiwg index`. Epic #1496 is roughly two-thirds done; the remaining sidecar lint, induction-quality, citation-graph, and integrity-scan clusters carry through to the next release.
`aiwg index build` renders corpus markdown views natively in-process — by-year, by-topic, by-venue, by-authors, by-method, by-model-size, by-training-pipeline, citation-network, and the new radar/discovery/funder views — replacing the separate `regenerate_indices.py` step.
Index configuration consolidated into `aiwg.config` under a validated schema (`index.graphs`); `.aiwg/.index/` is gitignored by default and `aiwg doctor` surfaces its freshness.
The AIWG engineering blog is now a single-source-of-truth markdown collection. Write `docs/blog/<slug>.md`, push to `main`, and pagenary 2026.5.3 auto-generates `docs.aiwg.io/blog/index.json` (envelope schema `{title, route, count, generated, posts[]}`) plus a companion `feed.xml` from each post's frontmatter — no hand-authored HTML, no hand-maintained manifest. The first post, "How AIWG builds your customized system prompt," is live; four reviewed companions (A10–A13) are queued. Publishing conventions, frontmatter spec, image placement (`docs/.public/blog/` → `/assets/blog/`), and the publish flow are documented at `docs/contributing/publishing-blog-posts.md`.
Doc-site CI now installs `@pagenary/publisher` as a devDependency and runs `npx pagenary build:tenants aiwg-docs`, replacing the previous `git clone roctinam/pagenary.git → /tmp/pagenary` step. The clone path is retired, `GT_ACCESS_TOKEN` is no longer referenced by any workflow, and the secret can be revoked after one clean deploy on the npm path. `@pagenary/publisher` itself advanced from `2026.5.1` (exact) to `^2026.5.3` (caret) to pick up the collection-support feature that drives the blog manifest. The bump used the documented `--min-release-age=0` override since pagenary is first-party (`roctinam/pagenary`).
The Gitea release workflows had their topology drift cleaned up. `npm-publish.yml`'s header and dry-run text still claimed it published to "both Gitea and npmjs.org," contradicting the fact that the npmjs.org legs were disabled back in #1283 (May 2026). The text now says Gitea-registry-only, names GitHub Actions as the npmjs.org publishing authority via OIDC, and marks `NPMJS_TOKEN` as referenced only by the disabled legs (revokable). Both `npm-publish.yml` and `gitea-release.yml` already documented `NPM_TOKEN` as a Gitea API token; the operator rotated the underlying token to one with both `repository:write` and `package:write` scopes, which closes #1481. The next signed-tag push exercises both Gitea jobs through valid auth.
A small dev-experience fix landed: the CLI's `dist/src/update/notifier.mjs` import path failed on a checkout where `dist/` had been cleaned, because the `.mjs`/JSON/YAML files are copied into `dist/src/` by `build:copy-mjs` (separate from `tsc`). Rebuilding via `npm run build:cli` restores it; #1513 tracks an `aiwg doctor` guard so a missing build fails loud and actionable next time instead of a raw `MODULE_NOT_FOUND`.
Documentation updates
- Added this release announcement and the 2026.5.12 changelog section.
- Added `docs/blog/` (with the first post) and `docs/contributing/publishing-blog-posts.md` (publishing-process contract).
- Registered the Blog section in `docs/_manifest.json` + `docs/contributing/_manifest.json`.
- Rewrote the publishing doc to the real collection mechanism: envelope schema, derived-vs-verbatim fields, no-status-filter (drafts stay out of `docs/blog/`), `docs/.public/blog/` image convention.
- Clarified `.gitea/workflows/npm-publish.yml` header + dry-run text and the `NPMJS_TOKEN` note.
User impact
- Research operators get the section9 tooling natively in AIWG: radar dashboards, profile generation, funder graphs, and discovery logs run through `aiwg corpus` / `aiwg index` with one configurable corpus root.
- Anyone touching the docs site benefits from a faster, secret-free CI clone path (no more `/tmp/pagenary` clone).
- Blog publishing is now write-markdown-and-merge — no HTML, no manifest editing.
- Release operators see a Gitea release pipeline that finally matches the real topology, and the next release proves the rotated token end-to-end.
- Developers who hit the `dist/notifier.mjs` import error from a clean/fresh checkout now have a clear path (`npm run build:cli`) and an upcoming doctor guard tracking under #1513.
Known follow-up
- pagenary#19 — page renderer still shows YAML frontmatter as visible text on the rendered docs.aiwg.io page (the generated manifest is unaffected). Clean page rendering arrives when pagenary lands the render-side strip and ages past the supply-chain release-age gate.
- #1486 — fix 53 pre-existing broken internal doc links and re-enable the `strictLinks: true` gate on `docsite-build`.
- Epic #1496 remaining clusters — sidecar lint & metadata repair (#1503), induction quality (#1504), citation-graph densification (#1505), integrity / submission-risk scan (#1506), and the `needs-infrastructure` items: scanned-page vision extraction (#1507) and Fortemi import (#1508, deferred). Entity-profile graph analytics (#1501) is partial.
- #1513 — `aiwg doctor` should detect a missing/incomplete `dist/` and emit an actionable error.
- Provider field-validation cluster (#1405 / #1407–1415) remains correctly `blocked:field-validation` pending per-provider field reports.
Verification
Release-readiness checks for this line:
npm run check:versions
npm run typecheck
npm run build:cli
npm test
npm run uat