Neal O'Grady

Head of Marketing

What Is a Blockchain Indexer?

Blockchain indexers transform raw on-chain data into queryable APIs. Learn how they work, when to build vs buy, and how indexed data powers crypto apps.

Diagram showing how a blockchain indexer works: raw blockchain blocks with hex hashes on the left flow through a central indexer processor via glowing green data lines, outputting structured data cards for tokens, prices, wallets, and volume on the right.

You built your first crypto app. You connected to an RPC node. You fetched a wallet balance, maybe pulled some transaction history. Things were going well.

Then you tried to do something useful: show a price chart for a token. Get a list of holders. Search for tokens by name. And you hit a wall. The blockchain doesn't give you any of that. Not through RPC, not through any standard node interface. The raw data is there, buried in millions of blocks and billions of transactions, but it's not organized in any way you can query. This is the problem a blockchain indexer solves, and understanding it will save you months of wasted engineering effort.

This article explains what blockchain indexers are, how they work, the different approaches to getting indexed blockchain data, and how to decide whether to build your own or use an existing one.

The Problem: Why raw RPC isn't ready for applications

Say you want to display the price of a token over the last 24 hours, a basic OHLCV chart, the kind you see on every trading app.

Here's what you'd need to do with raw RPC calls:

Find every DEX pool that trades this token (across Uniswap, SushiSwap, Raydium, etc.) by ID
Fetch every swap transaction from those pools in the last 24 hours
Parse the raw transaction data to extract swap amounts (this varies by DEX and requires decoding contract-specific ABIs)
Calculate the USD price at each swap by cross-referencing the other token in the pair against a stablecoin pool
Aggregate all of that into OHLCV candles at your desired resolution

That's thousands of RPC calls, complex parsing logic that breaks when DEXes update their contracts, and a meaningful amount of infrastructure.

For one token and one metric.

Now multiply that by 70 million tokens across 100+ chains. That is the problem that indexers solve.

Here's what the RPC approach looks like in pseudo-code:

// Step 1: Find all pools for this token across every DEX // Step 2: For each pool, fetch all transactions in last 24h // Step 3: Decode raw transaction bytes using DEX-specific ABIs // Step 4: Extract swap amounts, calculate USD price at each swap // Step 5: Aggregate into OHLCV candles // Step 6: Handle edge cases: multi-hop swaps, rebasing tokens, fee-on-transfer // ... hundreds of lines of code, thousands of RPC calls, fragile parsing logic

Now here's the same thing with indexed data:

import Codex from "@codex-data/sdk"; const sdk = new Codex("YOUR_API_KEY"); // Get 1-hour OHLCV candles for the last 24 hours — one query const bars = await sdk.queries.bars({ symbol: "So11111111111111111111111111111111111111112:1399811149", from: Math.floor(Date.now() / 1000) - 86400, to: Math.floor(Date.now() / 1000), resolution: "60", }); console.log(bars); // { o, h, l, c, volume, t } for each candle

One API call. No parsing. No infrastructure. The indexer already did the hard work.

What Is a Blockchain Indexer?

A blockchain indexer reads raw blockchain data, processes it, organizes it into structured formats, and makes it queryable through an API.

The concept is straightforward, even if the implementation is not.

Think of it this way: the blockchain is a massive filing cabinet of receipts thrown in random order. An indexer is the accountant who organizes everything into spreadsheets, calculates running totals, and lets you query "show me all transactions for wallet X" or "what's the price of token Y over the last week."

Here's what an indexer does:

Reads every block and transaction from the blockchain as they're produced
Decodes raw transaction data into meaningful events, a blob of hex data becomes "wallet A swapped 1.5 ETH for 3,000 USDC on Uniswap v3"
Enriches the decoded data by calculating USD prices, aggregating trading volume, tracking holder counts, and deriving metrics that don't exist on-chain
Stores the processed data in a queryable database (PostgreSQL, ClickHouse, Dynamo, or similar)
Serves the data through an API (REST, GraphQL, WebSocket), so your app can query it without touching the blockchain directly

The key insight is step 3: enrichment. The blockchain doesn't know what a "USD price" is. It doesn't track "24-hour volume" or calculate market cap. These are derived metrics that require processing raw swap data, cross-referencing stablecoin pairs, and computing aggregates. This is what separates raw blockchain data from data you can actually build a product with.

Types of Blockchain Indexers

There are three main approaches to blockchain indexing. Each makes different tradeoffs between control, complexity, and cost.

Self-Hosted Indexers

You run your own indexing infrastructure: blockchain nodes, parsing logic, database, and API layer.

Examples: Custom node + PostgreSQL pipeline, Ponder, Envio, Subsquid

Pros:

Total control over data processing and storage
No rate limits or API restrictions
Can index exactly the data you need, nothing more

Cons:

Expensive. A production multi-chain setup easily costs $100K+ per year in infrastructure (nodes, databases, compute)
Complex to maintain. Schema migrations, chain forks, RPC node failures, backfill jobs
Requires deep blockchain expertise on your engineering team
You own the uptime, reliability, and data freshness guarantees

Self-hosted makes sense when blockchain indexing is your core product. For most teams, it's overkill.

Decentralized Indexing (The Graph)

The Graph is a decentralized protocol where independent node operators ("indexers") process data according to schemas you define called "subgraphs."

The workflow: you write a subgraph manifest that specifies which smart contracts to watch and how to transform the events. You deploy it to The Graph's network. Indexers pick it up and start processing. You query the results via GraphQL.

Pros:

Decentralized and censorship-resistant. No single company controls the data
Composable. You can build on other people's subgraphs
Large ecosystem with thousands of existing subgraphs
GraphQL query interface

Cons:

You need to write and maintain subgraph code (AssemblyScript) for every data shape you need
Costs are paid in GRT tokens, which makes pricing unpredictable
Data freshness and reliability vary by indexer. Some are fast, some lag behind
No built-in enrichment (USD prices, aggregated metrics). You get the events you define, nothing more
Debugging failed subgraphs is painful

The Graph is a good fit if you need censorship-resistant data access or if the subgraph you need already exists. But for most application developers, writing and maintaining subgraphs is additional complexity on top of the product you're trying to build.

Managed Indexing APIs

A company runs the entire indexing pipeline, and you get access through an API. No infrastructure to manage, no indexing logic to write.

Examples: Codex, Alchemy, Moralis

Pros:

Fast to start: sign up, get an API key, make your first query in minutes
Production-ready enriched data out of the box (USD prices, aggregated metrics, holder analytics)
Someone else handles uptime, data freshness, and chain support
SDKs, documentation, and developer support

Cons:

Dependency on the provider's availability and roadmap
Costs scale with usage
Less flexibility for highly custom data transformations

For most teams building crypto applications, managed APIs are the pragmatic choice. You trade some control for a massive reduction in complexity and time-to-market.

Comparison Table

	Self-Hosted	The Graph	Managed API
Setup time	Weeks to months	Days to weeks	Minutes
Infrastructure cost	$100K+/year	Variable (GRT tokens)	Predictable monthly pricing
Data enrichment	You build it	You build it	Included
Multi-chain	You add each chain	Limited by subgraph availability	100+ chains (Codex)
USD pricing	You calculate it	Not built-in	Included
Real-time data	Depends on your infra	Depends on indexer	Sub-second (Codex)
Maintenance burden	High	Medium	Low
Best for	Data companies, custom needs	Protocol-specific queries, decentralization	Application developers

Build vs Buy: When to Build a Blockchain Indexer Yourself

This is the decision most teams get wrong, and it usually costs them 6-12 months of engineering time before they realize it.

Build your own indexer when:

Blockchain indexing IS your core product. You're a data company, and the indexer is what you sell
You need highly custom data transformations that no existing API provides (novel DeFi protocol analytics, proprietary metrics)
You need absolute control over every aspect of data processing and storage
You have dedicated infrastructure engineers who can own the system long-term

Use a managed API when:

You need blockchain data to build your actual product. A trading platform, wallet, analytics dashboard, or bot
Time-to-market matters. You can't spend 6 months building an indexer before building your app
You want enriched data (USD prices, OHLCV charts, aggregated volume, holder stats) without building the enrichment pipeline yourself
You don't want to maintain indexing infrastructure alongside your actual product

The honest math: building a production-grade multi-chain indexer costs $100K+ per year in infrastructure and 1,000+ engineering hours to build and maintain. That's not a one-time cost. Chains fork, DEXes deploy new contracts, data schemas evolve, and your indexer needs to keep up. For most teams, that engineering time is better spent on their core product.

Sudoswap, the NFT marketplace protocol, learned this firsthand. They were running their own indexing infrastructure on AWS to power their analytics and marketplace data. After switching to Codex, they saved over $50K in annual AWS costs and eliminated the engineering overhead of maintaining their own pipeline.

If you're asking, "Should I build my own indexer?", you probably shouldn't. The teams that need to build their own already know it.

How Codex Approaches Blockchain Indexing

Codex is a managed blockchain indexing API that processes raw on-chain data into enriched, production-ready datasets across 80+ networks.

Here's what that means in practice:

70M+ tokens indexed with real-time USD pricing calculated from on-chain DEX activity
Sub-second data freshness: new on-chain events are indexed and available through the API within 1 second of hitting the blockchain
Enriched data out of the box: OHLCV charts, holder analytics, aggregated volume, liquidity metrics, token metadata, and wallet data
Multiple delivery methods: GraphQL API for queries, WebSocket subscriptions for real-time streaming, webhooks for event-driven workflows, and a TypeScript SDK
Battle-tested at scale: Codex powers TradingView, Coinbase, Rainbow, FOMO, Uniswap, and hundreds of other platforms and processes tens of billions of requests per month.

This is what you get when blockchain indexing is someone else's full-time job. Instead of building and maintaining your own indexing pipeline, you get a single API call:

import Codex from "@codex-data/sdk"; const sdk = new Codex("YOUR_API_KEY"); // Get token holders — one query instead of building your own indexer const { holders } = await sdk.queries.holders({ input: { tokenId: "So11111111111111111111111111111111111111112:1399811149", }, }); console.log(`Total holders: ${holders.count}`); holders.holders.forEach((h) => { console.log(`${h.walletAddress} — $${h.balanceUsd} — since ${h.firstHeldAt}`); });

Without an indexer, you'd need to scan every Solana transaction, track every transfer event, maintain a running balance for each wallet, and cross-reference prices to USD. That's a full-time infrastructure project. With indexed data, it's one API call.

For a deeper look at how crypto data APIs compare, see our developer comparison of the best crypto APIs. If you're building on Solana specifically, our Solana API guide covers the full landscape of data providers.

Getting Started

Where you go from here depends on where you are:

If you're exploring: Read the Codex documentation to see the full range of available data. Token prices, OHLCV charts, holder analytics, transaction feeds, and more across 80+ chains.
If you're building: Sign up for a free API key (10,000 requests/month) and make your first query. The TypeScript SDK gets you from zero to working code in under 5 minutes.
If you're migrating from The Graph or your own indexer: Codex typically replaces custom subgraphs with standard API calls. No subgraph code to write, no indexer nodes to maintain. The enriched data (USD pricing, aggregated metrics) that you'd normally calculate yourself is already included.

FAQ

What is a blockchain indexer?

A blockchain indexer is software that reads raw blockchain data (blocks, transactions, events), processes and organizes it into structured formats, and serves it through a queryable API. It transforms the unstructured data on a blockchain into something applications can actually use, such as token prices, holder lists, trading volume, and transaction history.

What is the difference between a blockchain indexer and RPC?

RPC gives you direct access to raw blockchain data. Account balances, individual transactions, block contents. But it can only answer simple questions about the current state. A blockchain indexer processes that raw data and lets you ask complex questions like "What was the price of this token over the last 24 hours?" or "Who are the top 100 holders?" These queries require aggregating and enriching millions of transactions — something RPC was never designed to do.

Do I need a blockchain indexer?

If your app needs any of the following, you need indexed data: token prices in USD, price charts (OHLCV), trading volume, holder analytics, token search or filtering, wallet portfolio data, or real-time transaction feeds. If you only need to read a wallet balance or send a transaction, RPC is sufficient. Most crypto applications beyond basic wallet interactions need indexed data.

How much does it cost to build a blockchain indexer?

A production-grade multi-chain indexer typically costs $100K+ per year in infrastructure and 1,000+ engineering hours to build and maintain. Managed indexing APIs like Codex start with free tiers (10K requests/month) and scale to $350 per million requests, eliminating that overhead entirely.

What are the best blockchain indexers?

The best option depends on your needs. For most application developers, managed APIs like Codex (100+ chains, enriched data, sub-second freshness), Alchemy (strong node infrastructure with some enrichment), and Moralis (EVM + Solana dev platform) are the fastest path to production. For decentralized, censorship-resistant indexing, The Graph is the established protocol. For self-hosted solutions, Ponder and Envio are solid open-source frameworks.

Skip the indexer. Start with indexed data. Get a free API key and follow along with the code examples in this guide.

How to Build a Polymarket Trading Bot with Real-Time Data

Step-by-step guide to building a Polymarket trading bot. Real-time odds monitoring, market scanning, and signal detection with Python code examples.

Kalshi API: How to Access Prediction Market Data for Developers

Access Kalshi prediction market data via API. Compare Kalshi's native API vs enriched data through Codex. Code examples in Python and TypeScript.

Polymarket API: How to Get Real-Time Prediction Market Data

Get real-time Polymarket prediction market data via API. Step-by-step guide with Python and TypeScript code examples. Odds, volume, and historical data.