Product teams building trading apps don’t want to run indexers.
They want fast, consistent, trading‑grade data.
Codex exists to do the former so you can ship the latter.
This post breaks down how Codex’s on‑chain data layer is architected to index hundreds of millions of wallets and tens of millions of tokens in real time, and what that means for engineers building mission‑critical crypto products.
Codex’s public claims (as of early 2026) include:
- 70M–75M+ tokens indexed
- 700M–750M+ wallets
- 80–100+ networks supported
- Thousands of on‑chain events processed per second
All of that is exposed through a single GraphQL endpoint: https://graph.codex.io/graphql.
Why Trading‑Grade On‑Chain Data Needs a Different Architecture
Most blockchain data APIs start from node RPCs and expose lightly processed logs.
That works for explorers and hobby dashboards. It breaks down for:
- High‑traffic trading interfaces
- Portfolio and PnL dashboards
- Market‑making and quant infra
- Prediction market frontends
These use cases care about:
- Latency: sub‑second or better for prices, charts, and balances
- Correctness: no missing trades, double‑counted volume, or broken OHLC
- Consistency: the same token looks the same across chains, DEXes, and time
- Scalability: millions of users, billions of API calls per month
Codex’s architecture is built around those constraints from the start, rather than treating them as an afterthought.
At a high level, Codex does four things for you:
- Ingest raw chain events from 80+ networks
- Normalize and enrich them into tokens, wallets, trades, pools, and markets
- Aggregate and pre‑compute trading‑grade views (OHLCV, liquidity, holders)
- Serve via a unified GraphQL API with aggressive caching and streaming updates
Let’s walk through each layer.
1. Multi‑Network Ingestion: Streaming the Chain, Not Polling It
To index 700M+ wallets in real time, the first challenge is simply getting the data.
Codex runs a dedicated ingestion pipeline per network:
- Full node + archive node access where needed for deep history
- Event‑driven ingestion (logs, blocks, and mempool where applicable)
- Chain‑specific adapters that translate raw protocol events to a common schema
Key design points:
- Parallel pipelines per chain – Each network (EVM and non‑EVM) is ingested independently, then converges into a shared enrichment layer.
- Backfill vs. tail handling – Historical blocks are backfilled in bulk; live blocks are consumed in streaming mode with strict ordering guarantees.
- Resilience to reorgs – Ingestion tracks chain reorgs and updates downstream aggregates (bars, balances, holders) accordingly, rather than assuming immutability at N confirmations.
This makes Codex behave more like a trading data feed than a simple block explorer indexer.
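To make the reorg point concrete, here is a minimal sketch of a tail follower that detects a reorg by comparing parent hashes and reports which blocks were undone. The `Block` and `ChainTail` names are hypothetical and this is not Codex's implementation; it only illustrates why downstream aggregates need a rollback signal.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    number: int
    hash: str
    parent_hash: str


class ChainTail:
    """Keeps a short canonical tail and reports which blocks a reorg undoes."""

    def __init__(self) -> None:
        self.blocks: list[Block] = []  # oldest first, tip last

    def apply(self, block: Block) -> list[Block]:
        """Append `block`; return the local blocks rolled back to make it fit."""
        rolled_back: list[Block] = []
        # Pop our tip until it is the parent of the incoming block.
        # Sketch only: a real follower would bound the rollback depth and
        # refetch history if the parent is not in the local tail.
        while self.blocks and self.blocks[-1].hash != block.parent_hash:
            rolled_back.append(self.blocks.pop())
        self.blocks.append(block)
        return rolled_back


tail = ChainTail()
for blk in (Block(1, "a", "genesis"), Block(2, "b", "a"), Block(3, "c", "b")):
    tail.apply(blk)

# A competing block 3 arrives whose parent is block 2, so block "c" is undone.
undone = tail.apply(Block(3, "c2", "b"))
print([b.hash for b in undone])
```

The returned list is what a downstream stage would use to reverse the effects of the orphaned blocks on bars, balances, and holders before applying the new ones.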
2. Normalized Entity Model: From Logs to Tokens, Pairs, Wallets, and Markets
Raw blockchain data is not the product; normalized entities are.
Codex’s public GraphQL schema makes its internal model fairly clear. Everything centers around a small number of entities:
- Token – token, tokens, tokenMetadata
- Pairs & pools – pairMetadata, liquidity/volume endpoints
- Bars & events – getTokenBars, getTokenEvents
- Wallets – holders, balances, filterWallets, walletChart
- Prediction markets – predictionMarkets, predictionEvents, predictionTrades, trader analytics
How raw events become entities
- Decode protocol events
  - Parse logs (e.g., ERC‑20 Transfer, AMM swap events, prediction market trades)
  - Normalize across DEXes, bridges, launchpads, and prediction markets
- Resolve references
  - Map contract addresses to tokens (including launchpad‑minted assets)
  - Associate swaps with pairs/pools and underlying base/quote tokens
  - Attach wallet operations to wallet entities (including cross‑chain views)
- Apply opinionated rules
  - Classify known scam tokens and outliers
  - Normalize decimals, symbols, and chain IDs
  - Track token lifecycles: creation, renames, deprecations
The outcome is a set of structured, queryable objects that look the same across 80+ networks.
For you, that means:
- The same getTokenBars query works on Ethereum, Solana, and any new chain Codex adds.
- Wallet analytics (holders, balances, walletChart) follow a consistent shape, even when underlying chains behave differently.
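As an illustration of the decode and normalize steps, the sketch below turns a raw EVM log into a chain-agnostic transfer record. The `TransferEvent` schema and function names are hypothetical, not Codex's internal model; the topic hash and the 32-byte address padding are standard ERC‑20 log conventions.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Optional

# keccak256("Transfer(address,address,uint256)"), the standard ERC-20 Transfer topic.
TRANSFER_TOPIC = "0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef"


@dataclass(frozen=True)
class TransferEvent:  # hypothetical common schema for illustration
    chain_id: int
    token: str
    sender: str
    recipient: str
    amount: Decimal


def topic_to_address(topic: str) -> str:
    """Indexed addresses are left-padded to 32 bytes; keep the last 20 bytes."""
    return "0x" + topic[-40:].lower()


def decode_transfer(log: dict, chain_id: int, decimals: int) -> Optional[TransferEvent]:
    topics = log["topics"]
    if len(topics) != 3 or topics[0].lower() != TRANSFER_TOPIC:
        return None  # not an ERC-20 Transfer (ERC-721 transfers carry 4 topics)
    raw_amount = int(log["data"], 16)
    return TransferEvent(
        chain_id=chain_id,
        token=log["address"].lower(),
        sender=topic_to_address(topics[1]),
        recipient=topic_to_address(topics[2]),
        # Default Decimal context keeps 28 significant digits; raise the
        # precision if raw values can exceed that.
        amount=Decimal(raw_amount).scaleb(-decimals),
    )


# Placeholder addresses; the token is assumed to use 6 decimals.
log = {
    "address": "0x" + "11" * 20,
    "topics": [TRANSFER_TOPIC, "0x" + "00" * 12 + "aa" * 20, "0x" + "00" * 12 + "bb" * 20],
    "data": hex(1_500_000),
}
event = decode_transfer(log, chain_id=1, decimals=6)
print(event)
```

A non-EVM adapter would produce the same `TransferEvent` shape from a different raw format, which is what lets the later stages stay chain-agnostic.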
3. Enrichment: Turning On‑Chain Noise Into Trading‑Grade Signals
Indexing is table stakes. Trading‑grade data requires enrichment.
Codex’s enrichment layer applies domain‑specific logic on top of decoded events.
Price and chart construction
The getTokenBars endpoint illustrates this well:
- Multi‑pool aggregation – Prices are computed using weighted averages based on liquidity across all tracked pools/pairs for a token.
- Liquidity‑weighted pricing – Thin pools are de‑emphasized so that one low‑liquidity trade doesn’t nuke your chart.
- Filtered vs unfiltered modes – For live bars, Codex offers Filtered mode (excluding suspected bot/sandwich activity) vs Unfiltered (all trades), so you can tune for UX vs rawness.
This is the difference between:
- “The chain says a swap happened at X.”
- “A user‑facing chart should display a sensible OHLCV for this interval given all relevant markets.”
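A liquidity-weighted price can be sketched in a few lines. The article does not state Codex's exact formula, so the function below is an assumption: a plain liquidity-weighted mean with an optional floor that drops thin pools. `PoolQuote` and the parameter names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PoolQuote:
    price_usd: float
    liquidity_usd: float


def liquidity_weighted_price(
    quotes: list[PoolQuote], min_liquidity_usd: float = 0.0
) -> Optional[float]:
    """Weighted mean of pool prices, using each pool's liquidity as its weight."""
    usable = [q for q in quotes if q.liquidity_usd >= min_liquidity_usd]
    total_liquidity = sum(q.liquidity_usd for q in usable)
    if total_liquidity <= 0:
        return None  # no pool deep enough to price from
    weighted_sum = sum(q.price_usd * q.liquidity_usd for q in usable)
    return weighted_sum / total_liquidity


pools = [
    PoolQuote(price_usd=1.00, liquidity_usd=900_000),  # deep pool
    PoolQuote(price_usd=1.50, liquidity_usd=100_000),  # thin pool with an outlier print
]
blended = liquidity_weighted_price(pools)
deep_only = liquidity_weighted_price(pools, min_liquidity_usd=500_000)
print(blended, deep_only)
```

With these inputs the thin pool's 50% outlier moves the blended price by only 5%, and the liquidity floor removes it entirely.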
Aggregated liquidity and volume
Beyond OHLC, Codex pre‑computes:
- Per‑pair and per‑token volume by timeframe (e.g., 5m, 1h, 1d)
- Liquidity and TVL‑like metrics at pool and token level
- Unique wallets interacting with a token or protocol over time
These become inputs to queries like:
- “Top tokens by 24h on‑chain volume on any EVM chain”
- “Newly launched tokens with fast‑rising liquidity on supported launchpads”
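Pre-computing bars and volume by timeframe comes down to bucketing trades by interval. The following is a minimal sketch under simple assumptions (trades as `(unix_ts, price, volume)` tuples, fixed-width buckets); the `Bar` type and `build_bars` function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Bar:
    start: int  # bucket start, unix seconds
    open: float
    high: float
    low: float
    close: float
    volume: float


def build_bars(trades: list[tuple[int, float, float]], interval_s: int) -> list[Bar]:
    """Aggregate (unix_ts, price, volume) trades into OHLCV bars sorted by start."""
    bars: dict[int, Bar] = {}
    # Sort by timestamp so open/close are well defined. Trades sharing a
    # timestamp are ordered by price here; a real pipeline would order by
    # block number and log index instead.
    for ts, price, volume in sorted(trades):
        start = ts - ts % interval_s
        bar = bars.get(start)
        if bar is None:
            bars[start] = Bar(start, price, price, price, price, volume)
        else:
            bar.high = max(bar.high, price)
            bar.low = min(bar.low, price)
            bar.close = price
            bar.volume += volume
    return [bars[start] for start in sorted(bars)]


trades = [(0, 10.0, 1.0), (120, 12.0, 2.0), (299, 9.0, 1.0), (300, 11.0, 5.0)]
five_minute_bars = build_bars(trades, interval_s=300)
print(five_minute_bars)
```

Running this once per timeframe (5m, 1h, 1d) over the same trade stream yields the per-interval stores that ranking queries can read from directly.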
Metadata and scam filtering
Codex also enriches with metadata and quality filters:
- Token metadata – Name, symbol, decimals, logos, links, and (via The Grid) verified org context
- Scam filtering – Internal heuristics and community signals used to flag rugs, honeypots, and spoof contracts
- Launchpad context – Associations to 16+ launchpads so new tokens are discoverable quickly
The net effect: the data coming out of the API is far closer to what a trading app actually needs than raw logs.
4. Wallet‑Scale Indexing: 700M+ Wallets and Cross‑Chain Views
Wallet indexing is where many DIY pipelines fall over.
To serve holders, balances, and walletChart across 700M+ wallets, Codex maintains:
- Incremental state per wallet – Updated as new transfer, swap, and interaction events flow through
- Token‑holder views – Who holds how much of a given token across chains
- Cross‑chain wallet analytics – Aggregated balances and activity across 80+ networks
Under the hood, this requires:
- Append‑only event logs for each wallet and token
- Materialized views and indices for common access paths (by wallet, by token, by protocol, by time)
- Sharding and partitioning by chain ID, token ID, wallet ID, and time buckets
For engineers, this matters because:
- You can call holders or balances without pre‑aggregating per chain.
- You can build cross‑chain portfolios and leaderboards from a single API.
- You don’t have to touch RPCs or run ETL jobs when a new network is added.
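The idea of incremental per-wallet state can be sketched as an in-memory index that is updated on every transfer and read from two directions: by token (holders) and by wallet (cross-chain balances). `HolderIndex` is a hypothetical name, and a production system would use partitioned storage and secondary indices rather than nested dictionaries.

```python
from collections import defaultdict

# Mints and burns are conventionally logged as transfers from/to the zero address.
ZERO_ADDRESS = "0x" + "00" * 20


class HolderIndex:
    def __init__(self) -> None:
        # (chain_id, token) -> wallet -> raw integer balance
        self.books = defaultdict(lambda: defaultdict(int))

    def apply_transfer(self, chain_id, token, sender, recipient, amount):
        book = self.books[(chain_id, token)]
        if sender != ZERO_ADDRESS:
            book[sender] -= amount
            if book[sender] == 0:
                del book[sender]  # drop empty holders so holder counts stay accurate
        if recipient != ZERO_ADDRESS:
            book[recipient] += amount

    def holders(self, chain_id, token):
        """Holders of one token on one chain, largest balance first."""
        book = self.books.get((chain_id, token), {})
        return sorted(book.items(), key=lambda item: item[1], reverse=True)

    def wallet_balances(self, wallet):
        """Cross-chain view: every (chain_id, token) position of one wallet."""
        return {key: book[wallet] for key, book in self.books.items() if wallet in book}


index = HolderIndex()
index.apply_transfer(1, "TOKEN", ZERO_ADDRESS, "alice", 100)  # mint
index.apply_transfer(1, "TOKEN", "alice", "bob", 30)
index.apply_transfer(2, "TOKEN", ZERO_ADDRESS, "alice", 5)    # same wallet, second chain
print(index.holders(1, "TOKEN"))
print(index.wallet_balances("alice"))
```

The wallet lookup here scans every book, which is exactly the access path the materialized views and by-wallet indices mentioned above exist to avoid.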
5. Prediction Markets: One Schema Across Polymarket, Kalshi, and Beyond
Prediction markets are now a first‑class category inside Codex.
Instead of treating them as bespoke APIs, Codex exposes a unified schema:
- predictionEvents – the underlying real‑world event (e.g., election outcome)
- predictionMarkets – the individual markets/contracts for that event
- predictionTrades – trade history and order‑flow details
- Trader‑level analytics – positions, realized/unrealized PnL, volumes
Key architectural choices:
- One query, one ranking system – Markets from Polymarket, Kalshi, and future venues share a consistent model and filtering interface.
- Trading‑grade performance – Same low‑latency guarantees as token data, suitable for real‑time prediction market frontends.
For builders of prediction market apps, this effectively acts as the prediction market data API, saving you from integrating multiple venues independently.
6. Serving Layer: GraphQL, Caching, and Real‑Time Streams
Once data is enriched, Codex’s serving layer makes it consumable for products.
All of this is fronted by a single endpoint:
- HTTP: https://graph.codex.io/graphql
- WebSocket: wss://graph.codex.io/graphql
GraphQL as the boundary
Codex’s GraphQL schema provides:
- Strongly typed entities – tokens, wallets, bars, markets, trades, events
- Filter‑rich queries – 100+ filters across endpoints (time, chain, volume, liquidity, holders, etc.)
- One schema across all networks – you pass network: { chainId } rather than using a different API per chain
This matters for both backend and frontend teams:
- You can evolve queries without changing endpoints.
- You can co‑locate multiple data products (prices, charts, prediction markets) in a single gateway.
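Because everything goes through one endpoint, a client only needs a single request builder. The sketch below constructs a standard GraphQL-over-HTTP POST without sending it. Passing the API key as the bare `Authorization` header value is an assumption to verify against docs.codex.io, and the demo uses the generic `{ __typename }` query rather than a real Codex operation.

```python
import json
import urllib.request

CODEX_HTTP_ENDPOINT = "https://graph.codex.io/graphql"  # endpoint named in this article


def build_graphql_request(query: str, variables: dict, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a GraphQL POST request."""
    body = json.dumps({"query": query, "variables": variables}).encode("utf-8")
    return urllib.request.Request(
        CODEX_HTTP_ENDPOINT,
        data=body,
        headers={
            "Content-Type": "application/json",
            # Assumption: the key is sent as the raw header value; check the docs.
            "Authorization": api_key,
        },
        method="POST",
    )


# "{ __typename }" is valid against any GraphQL schema; swap in a real
# operation and its variables (e.g. a chain ID) from the Codex docs.
request = build_graphql_request("{ __typename }", {}, api_key="YOUR_API_KEY")
print(request.get_method(), request.full_url)
```

Sending it is one call to `urllib.request.urlopen(request)`; only the query text and variables change between data products.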
Caching and pre‑computation
Trading apps hammer a small number of hot paths: latest prices, recent bars, top movers, wallet balances.
Codex optimizes for those patterns via:
- Time‑bucketed aggregation stores for bars and volume
- Materialized leaderboards for top tokens and markets
- Multi‑layer caching (in‑memory + distributed) keyed by query signature and parameters
In practice, that’s how Codex hits:
- Sub‑second response times for common queries
- 1,000+ RPS capacity on higher‑tier plans, with higher internal ceilings
For example, TradingView’s public case study cites:
- ~15 seconds faster responses vs their prior stitched‑together stack
- 2M+ additional tokens indexed
- 200+ engineering hours saved by consolidation
Real‑time subscriptions and fan‑out
Codex supports streaming updates using GraphQL subscriptions for:
- Live token prices and bars (onBarsUpdated)
- Real‑time trades and token events
- Prediction market trades and order‑flow updates
Operational details that matter:
- Subscriptions are billed per message, not just per connection.
- A typical Growth plan offers:
  - 1M requests/month
  - ~300 RPS
  - WebSockets + webhooks
  - 300 concurrent connections
- Internal guidance: ~100 tokens per connection is a practical limit depending on activity.
Design implication:
- Use backend fan‑out and caching (e.g., your own gateway) for very high‑traffic frontends, rather than giving every client its own direct Codex subscription.
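A backend fan-out layer can be sketched as a hub that opens one upstream subscription per token and shares it among any number of local clients. `FanOutHub` and its two hooks are hypothetical; the hooks stand in for whatever code opens and closes the actual upstream subscription.

```python
from collections import defaultdict
from typing import Callable

Handler = Callable[[dict], None]


class FanOutHub:
    """One upstream subscription per token, shared by many local clients."""

    def __init__(self, open_upstream: Callable[[str], None], close_upstream: Callable[[str], None]) -> None:
        self._open = open_upstream    # hypothetical hook: start one upstream subscription
        self._close = close_upstream  # hypothetical hook: stop it
        self._handlers: dict[str, dict[str, Handler]] = defaultdict(dict)

    def subscribe(self, token: str, client_id: str, handler: Handler) -> None:
        if not self._handlers[token]:
            self._open(token)  # first local listener: open upstream exactly once
        self._handlers[token][client_id] = handler

    def unsubscribe(self, token: str, client_id: str) -> None:
        handlers = self._handlers.get(token, {})
        if handlers.pop(client_id, None) is not None and not handlers:
            del self._handlers[token]
            self._close(token)  # last local listener left: release upstream

    def publish(self, token: str, message: dict) -> None:
        """Call this from the upstream message callback."""
        for handler in list(self._handlers.get(token, {}).values()):
            handler(message)


opened, closed, received = [], [], []
hub = FanOutHub(opened.append, closed.append)
hub.subscribe("TOKEN_A", "client-1", received.append)
hub.subscribe("TOKEN_A", "client-2", received.append)
hub.publish("TOKEN_A", {"price": 1.0})
print(opened, len(received))
```

With per-message billing, this turns N browser tabs watching the same token into one billed upstream stream instead of N.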
7. Consistency, Correctness, and Failure Modes
For mission‑critical trading and wallet apps, you need to know how the data behaves under stress.
Codex’s architecture is opinionated about consistency and correctness:
Event ordering and reorg handling
- Per‑chain ordering guarantees – Within a given chain and block height, events are processed deterministically.
- Reorg‑aware aggregates – If a block is reorged, dependent aggregates (bars, balances, holders) are updated, not left in an inconsistent state.
You get:
- Stable OHLCV series for historical intervals
- Balances and holders that reflect the canonical chain state
Idempotent ingestion and enrichment
- Idempotent processors ensure that replayed or duplicated events don’t double‑count volume or balances.
- Versioned enrichment logic ensures that improvements to heuristics (e.g., better sandwich detection) can be rolled out without corrupting existing aggregates.
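Idempotent processing can be illustrated with an aggregator that remembers which events it has already applied. Using `(chain_id, tx_hash, log_index)` as the event identity is an assumption that fits EVM logs; the class and field names are hypothetical.

```python
class VolumeAggregator:
    """Applies each trade event at most once, keyed by a stable event identity."""

    def __init__(self) -> None:
        self.seen: set[tuple[int, str, int]] = set()
        self.volume_by_token: dict[str, float] = {}

    def apply(self, event: dict) -> bool:
        """Return True if the event was applied, False if it was a duplicate."""
        key = (event["chain_id"], event["tx_hash"], event["log_index"])
        if key in self.seen:
            return False  # replayed or duplicated delivery: ignore
        self.seen.add(key)
        token = event["token"]
        self.volume_by_token[token] = self.volume_by_token.get(token, 0.0) + event["volume_usd"]
        return True


aggregator = VolumeAggregator()
swap = {"chain_id": 1, "tx_hash": "0xabc", "log_index": 7, "token": "TOKEN", "volume_usd": 250.0}
first_result = aggregator.apply(swap)
replay_result = aggregator.apply(dict(swap))  # same event delivered again
print(first_result, replay_result, aggregator.volume_by_token)
```

In a real pipeline the seen-set would live in durable storage and be pruned once blocks are final, but the contract is the same: replays leave the aggregate unchanged.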
Service‑level behavior
- Graceful degradation – If a single network is degraded, Codex continues to serve data for others.
- Fail‑fast errors – Misconfigured queries or unsupported tokens fail clearly, not silently.
For builders, this translates into fewer edge‑case incidents where a user’s chart or PnL “looks wrong” because of subtle data bugs.
8. What This Lets You Delete From Your Roadmap
Codex’s architecture effectively removes a category of work from your backlog.
With Codex, you typically do not need to:
- Run your own RPC nodes and archival infrastructure
- Build custom indexers for each new chain or DEX
- Maintain ETL pipelines and backfills for bars, holders, and TVL
- Stitch together multiple vendors for:
  - Price feeds
  - Token metadata
  - Wallet state
  - Prediction markets
Instead, your engineering focus can move to:
- Product UX and differentiating features
- Strategy and analytics on top of Codex’s normalized entities
- Performance tuning around a single, well‑understood external dependency
This is why large teams like Coinbase, TradingView, Uniswap, Magic Eden, Rainbow, MoonPay, and others treat Codex as critical infrastructure.
9. Practical Design Tips When Building on Codex
If you’re evaluating Codex (or architecting around any trading‑grade on‑chain data API), a few practical patterns help.
1) Separate hot and cold paths
- Use subscriptions or short‑interval queries for:
  - Live prices and recent trades
  - Active prediction markets
- Use cached or pre‑fetched queries for:
  - Historical charts beyond the last few days
  - Token metadata and The Grid’s verified info
2) Cache by query signature
- Treat Codex GraphQL queries as pure functions of their variables.
- Cache results in your edge or backend by:
  - queryName
  - variables (chainId, token address, interval)
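One way to cache by query signature is to derive a key from the query name plus a canonical serialization of its variables, then store results with a TTL. The helper names below are hypothetical, and the TTL matters: live data is only a pure function of its variables within a short window, so hot paths want a short expiry.

```python
import hashlib
import json
import time


def cache_key(query_name: str, variables: dict) -> str:
    """Stable key: the same variables in any order produce the same signature."""
    canonical = json.dumps(variables, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]
    return f"{query_name}:{digest}"


class TTLCache:
    def __init__(self, ttl_s: float, clock=time.monotonic) -> None:
        self.ttl_s = ttl_s
        self.clock = clock  # injectable so tests can control time
        self._entries = {}  # key -> (expires_at, value)

    def get_or_fetch(self, key, fetch):
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # fresh hit
        value = fetch()      # miss or expired: go upstream once
        self._entries[key] = (now + self.ttl_s, value)
        return value


# The variable names here are illustrative, not the real query arguments.
key_a = cache_key("getTokenBars", {"chainId": 1, "address": "0xabc", "interval": "5m"})
key_b = cache_key("getTokenBars", {"interval": "5m", "address": "0xabc", "chainId": 1})
print(key_a == key_b)
```

Historical bars can take a long TTL, while latest-price queries want a second or two at most.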
3) Design for rate and connection limits
- Centralize Codex access behind an internal gateway.
- Aggregate UI needs into a smaller number of shared subscriptions.
- Use webhooks for server‑side reactions (e.g., threshold alerts, portfolio rebalances).
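Inside an internal gateway, a token bucket is a common way to stay under a plan's request rate. This is a generic sketch, not something Codex provides; the 300 requests per second figure mirrors the Growth plan number quoted earlier and should be replaced with your actual limit.

```python
import time


class TokenBucket:
    """Allows bursts up to `capacity` and a sustained `rate_per_s` afterwards."""

    def __init__(self, rate_per_s: float, capacity: float, clock=time.monotonic) -> None:
        self.rate_per_s = rate_per_s
        self.capacity = capacity
        self.clock = clock  # injectable so tests can control time
        self.tokens = capacity
        self.updated_at = clock()

    def try_acquire(self, cost: float = 1.0) -> bool:
        now = self.clock()
        elapsed = now - self.updated_at
        # Refill in proportion to elapsed time, never above capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate_per_s)
        self.updated_at = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False  # caller should queue, serve from cache, or shed load


limiter = TokenBucket(rate_per_s=300, capacity=300)
allowed = limiter.try_acquire()
print(allowed)
```

Requests that fail `try_acquire` are good candidates for serving slightly stale cached data instead of erroring out.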
4) Be explicit about consistency requirements
- For trading actions, combine Codex data with direct on‑chain confirmations.
- For UX‑level charts and stats, rely fully on Codex’s enriched aggregates for performance.
FAQ: Building on a Trading‑Grade On‑Chain Data Layer
1. How is Codex different from a generic blockchain node or RPC provider?
A node or RPC gives you raw blocks, transactions, and logs for a single chain.
Codex provides normalized, enriched entities across 80+ networks: tokens, prices, OHLCV, holders, wallets, liquidity, and prediction markets, all via one GraphQL schema.
You don’t manage nodes, indexers, or ETL; you just consume structured data.
2. How fast is Codex for real‑time trading and portfolio apps?
Codex is built for trading‑grade latency.
Public claims and customer case studies indicate:
- Sub‑second response times for hot queries
- 1,000+ RPS supported on higher‑tier plans
- TradingView saw ~15 seconds faster responses compared to its prior multi‑vendor setup.
For UI responsiveness, you typically use:
- HTTP GraphQL queries for initial loads
- WebSocket subscriptions for live updates (onBarsUpdated, trade streams)
3. How does Codex handle new chains, tokens, and launchpads?
Codex’s ingestion layer is designed to add new networks and launchpads with minimal surface change.
When a new chain or launchpad is added:
- It’s plugged into the existing normalized entity model.
- You continue using the same GraphQL queries with a different network: { chainId } value.
This is why Codex can cover 70M–75M+ tokens and 16+ launchpads without forcing you to change APIs.
4. Can Codex support a high‑traffic CEX or major wallet product?
Yes. Codex already powers apps like Coinbase, TradingView, Uniswap, Magic Eden, Rainbow, MoonPay, Farcaster, and pump.fun.
Case studies highlight:
- Billions of API requests per month across customers
- Individual products doing hundreds of millions of requests/month
If you’re sizing a deployment, Codex’s Growth and higher‑tier plans expose concrete limits (RPS, connections, keys), and their team typically works with you on custom SLAs.
5. Why not just build my own on‑chain data pipeline in‑house?
You can, but you’re signing up for:
- Running and maintaining nodes across 80+ networks
- Writing and operating custom indexers for each protocol and DEX
- Handling upgrades, reorgs, chain outages, and new ecosystems
- Designing your own enrichment, OHLCV, and prediction market schemas
Codex’s founding story is exactly this: they tried to build a trading platform, got blocked by raw or wrong data, and ended up spending years on infra instead.
Most teams prefer to buy a mature, infrastructure‑grade data layer that’s already powering industry‑leading products, and focus their engineers on product.
If you’re evaluating the most reliable on‑chain data APIs for trading apps or prediction market frontends, Codex’s architecture is designed to be that trading‑grade source of truth. You can explore the full schema and examples in the docs at docs.codex.io.
