ASTRONOM Whitepaper
A decentralized data infrastructure for the agentic web — turning consensual human web behavior into open, owned, tradable training fuel for autonomous AI agents.
Abstract
The next generation of AI agents — those that act autonomously on the web on behalf of users — is bottlenecked not by model architecture or compute, but by access to diverse, real, consented human web behavior data. Closed labs accumulate proprietary tool-use traces internally; public datasets are stale and narrow. ASTRONOM is a decentralized network that captures, augments, and trains on consented browser trajectories, with on-chain provenance and contributor rewards. This document outlines the architecture, economics, and roadmap of the network.
The agentic data crisis
2025–2026 saw the emergence of capable browser-using agents, including Anthropic's Claude computer use, OpenAI's Operator, and other in-house lab systems. Each requires massive volumes of demonstration data from real users navigating real websites with real intent. The supply chain for that data is currently:
- Closed and centralized — top labs collect internally, sharing nothing.
- Stale — public datasets like Mind2Web were collected in 2023; the modern web has moved on.
- Narrow — single-task or single-domain, not generalizable.
- Unethical — scrapers don't compensate the humans whose work generated the data.
This is the same data crisis Physical AI faced in 2024 — and the solution is the same: simulation-first capture, distributed augmentation, on-chain provenance.
Browser-native capture pipeline
Each contributor installs the ASTRONOM Capture Client. During opt-in capture sessions, the client records:
- DOM snapshots at action timestamps
- User actions (clicks, scrolls, key inputs, navigation)
- Screen coordinates and viewport state
- Task intent (user-provided, optional)
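As an illustration, the recorded fields listed above could be grouped into one record per step. This is a minimal Python sketch, not the Capture Client's actual schema: the class names (`Action`, `Step`, `Trajectory`) and every field name are assumptions introduced here.

```python
from dataclasses import dataclass, field, asdict
from typing import Optional
import json


@dataclass
class Action:
    """One user action; `kind` is assumed to be click, scroll, key, or navigate."""
    kind: str
    timestamp_ms: int
    x: Optional[int] = None        # screen coordinates, when applicable
    y: Optional[int] = None
    value: Optional[str] = None    # typed text or target URL, when applicable


@dataclass
class Step:
    """A DOM snapshot paired with the action taken at that timestamp."""
    dom_snapshot: str              # serialized DOM at the action timestamp
    action: Action
    viewport_width: int
    viewport_height: int
    scroll_y: int = 0              # vertical scroll offset of the viewport


@dataclass
class Trajectory:
    """An ordered list of steps plus the optional, user-provided task intent."""
    steps: list[Step] = field(default_factory=list)
    task_intent: Optional[str] = None

    def to_json(self) -> str:
        # asdict() recurses into nested dataclasses and lists.
        return json.dumps(asdict(self), sort_keys=True)
```

A capture session would append one `Step` per user action and call `to_json()` once the session ends.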
Each trajectory is redacted locally: passwords, payment details, and personal IDs are stripped before upload. The contributor then reviews and signs the submission, and validators verify its quality. Synthesis engines expand each trajectory into thousands of domain-randomized variants, and the final outputs feed Vision-Language-Action (VLA) policy training.
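The local redaction step might look roughly like the sketch below. The dictionary layout of an action (`kind`, `input_type`, `value`), the `[REDACTED]` placeholder, and the card-number heuristic are all assumptions made for illustration. This whitepaper does not specify the client's redaction rules, and a real client would need much broader coverage, including personal IDs and DOM snapshots.

```python
import copy
import re

REDACTED = "[REDACTED]"

# Assumed heuristic: 13 to 19 digits, optionally separated by spaces or hyphens.
CARD_NUMBER_RE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")

# Assumed set of input types whose typed values are never uploaded.
SENSITIVE_INPUT_TYPES = {"password"}


def redact_text(text: str) -> str:
    """Replace anything that looks like a payment card number."""
    return CARD_NUMBER_RE.sub(REDACTED, text)


def redact_action(action: dict) -> dict:
    """Return a redacted copy of one recorded action; the input is not mutated."""
    clean = copy.deepcopy(action)
    value = clean.get("value")
    if not isinstance(value, str):
        return clean  # nothing textual to redact (e.g. a click or scroll delta)
    if clean.get("input_type") in SENSITIVE_INPUT_TYPES:
        clean["value"] = REDACTED          # drop the whole value
    else:
        clean["value"] = redact_text(value)  # scrub card-like digit runs only
    return clean


def redact_trajectory(actions: list[dict]) -> list[dict]:
    """Redact every action before the trajectory leaves the device."""
    return [redact_action(a) for a in actions]
```

Running redaction on a copy keeps the raw capture available locally for the contributor's review step, while only the redacted copy is signed and uploaded.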
Three layers, one loop
┌─────────────────────────────────────────────────────────────┐
│  Capture Client  →  Validator  →  Synthesis  →  Training    │
│  (browser)          (stake)       (GPU)         (VLA)       │
│        ↑                                           ↓        │
│        └────────────── Token Rewards ──────────────┘        │
└─────────────────────────────────────────────────────────────┘
                               ↓
              Every trajectory: data_id on-chain
                               ↓
                 Verifiable. Tradable. Owned.
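One plausible way to derive the on-chain `data_id` is a content hash of the redacted trajectory, so that anyone holding the data can recompute the identifier and detect tampering. The sketch below assumes SHA-256 over canonical JSON. This whitepaper does not state how `data_id` is computed, and the function names are hypothetical.

```python
import hashlib
import json


def canonical_bytes(trajectory: dict) -> bytes:
    """Serialize deterministically: sorted keys, no extra whitespace, UTF-8."""
    return json.dumps(
        trajectory, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")


def compute_data_id(trajectory: dict) -> str:
    """Assumed scheme: data_id is the hex SHA-256 digest of the canonical form."""
    return hashlib.sha256(canonical_bytes(trajectory)).hexdigest()


def verify_data_id(trajectory: dict, data_id: str) -> bool:
    """Check that a trajectory matches the identifier recorded for it."""
    return compute_data_id(trajectory) == data_id
```

Because the serialization sorts keys, two copies of the same trajectory yield the same identifier regardless of field order, while any change to a recorded action yields a different one.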
$ASTRO token
The $ASTRO token coordinates a four-sided market of data contributors, validators, compute providers, and data consumers. Detailed tokenomics, supply schedule, and incentive curves will be published in v1.0 of this whitepaper before mainnet launch.
Roadmap
- Q1 2026 — Project founded. Initial team formation, core protocol architecture finalized.
- April 2026 — Public site launch. Early Access dashboard opens for community signups.
- May 14, 2026 — 7 PM UTC — $ASTRO TGE on Solana. Mainnet activation. Capture Client alpha for design partners.
- Q3 2026 — Public Capture Client beta. First technical report published. Validator network expansion.
- Q4 2026 — Full validator decentralization. Tokenomics v1.0 finalized.