You built your first crypto app. You connected to an RPC node. You fetched a wallet balance, maybe pulled some transaction history. Things were going well.
Then you tried to do something useful: show a price chart for a token. Get a list of holders. Search for tokens by name. And you hit a wall. The blockchain doesn't give you any of that. Not through RPC, not through any standard node interface. The raw data is there, buried in millions of blocks and billions of transactions, but it's not organized in any way you can query. This is the problem a blockchain indexer solves, and understanding it will save you months of wasted engineering effort.
This article explains what blockchain indexers are, how they work, the different approaches to getting indexed blockchain data, and how to decide whether to build your own or use an existing one.
The Problem: Why raw RPC isn't ready for applications
Say you want to display the price of a token over the last 24 hours, a basic OHLCV chart, the kind you see on every trading app.
Here's what you'd need to do with raw RPC calls:
- Find every DEX pool that trades this token (across Uniswap, SushiSwap, Raydium, etc.) by ID
- Fetch every swap transaction from those pools in the last 24 hours
- Parse the raw transaction data to extract swap amounts (this varies by DEX and requires decoding contract-specific ABIs)
- Calculate the USD price at each swap by cross-referencing the other token in the pair against a stablecoin pool
- Aggregate all of that into OHLCV candles at your desired resolution
That's thousands of RPC calls, complex parsing logic that breaks when DEXes update their contracts, and a meaningful amount of infrastructure.
For one token and one metric.
Now multiply that by 70 million tokens across 100+ chains. That is the problem that indexers solve.
Here's what the RPC approach looks like in pseudo-code:
// Step 1: Find all pools for this token across every DEX
// Step 2: For each pool, fetch all transactions in last 24h
// Step 3: Decode raw transaction bytes using DEX-specific ABIs
// Step 4: Extract swap amounts, calculate USD price at each swap
// Step 5: Aggregate into OHLCV candles
// Step 6: Handle edge cases: multi-hop swaps, rebasing tokens, fee-on-transfer
// ... hundreds of lines of code, thousands of RPC calls, fragile parsing logic
Now here's the same thing with indexed data:
import Codex from "@codex-data/sdk";
const sdk = new Codex("YOUR_API_KEY");
// Get 1-hour OHLCV candles for the last 24 hours — one query
const bars = await sdk.queries.bars({
symbol: "So11111111111111111111111111111111111111112:1399811149",
from: Math.floor(Date.now() / 1000) - 86400,
to: Math.floor(Date.now() / 1000),
resolution: "60",
});
console.log(bars); // { o, h, l, c, volume, t } for each candle
One API call. No parsing. No infrastructure. The indexer already did the hard work.
What Is a Blockchain Indexer?
A blockchain indexer reads raw blockchain data, processes it, organizes it into structured formats, and makes it queryable through an API.
The concept is straightforward, even if the implementation is not.
Think of it this way: the blockchain is a massive filing cabinet of receipts thrown in random order. An indexer is the accountant who organizes everything into spreadsheets, calculates running totals, and lets you query "show me all transactions for wallet X" or "what's the price of token Y over the last week."
Here's what an indexer does:
- Reads every block and transaction from the blockchain as they're produced
- Decodes raw transaction data into meaningful events, a blob of hex data becomes "wallet A swapped 1.5 ETH for 3,000 USDC on Uniswap v3"
- Enriches the decoded data by calculating USD prices, aggregating trading volume, tracking holder counts, and deriving metrics that don't exist on-chain
- Stores the processed data in a queryable database (PostgreSQL, ClickHouse, Dynamo, or similar)
- Serves the data through an API (REST, GraphQL, WebSocket), so your app can query it without touching the blockchain directly
The key insight is step 3: enrichment. The blockchain doesn't know what a "USD price" is. It doesn't track "24-hour volume" or calculate market cap. These are derived metrics that require processing raw swap data, cross-referencing stablecoin pairs, and computing aggregates. This is what separates raw blockchain data from data you can actually build a product with.
Types of Blockchain Indexers
There are three main approaches to blockchain indexing. Each makes different tradeoffs between control, complexity, and cost.
Self-Hosted Indexers
You run your own indexing infrastructure: blockchain nodes, parsing logic, database, and API layer.
Examples: Custom node + PostgreSQL pipeline, Ponder, Envio, Subsquid
Pros:
- Total control over data processing and storage
- No rate limits or API restrictions
- Can index exactly the data you need, nothing more
Cons:
- Expensive. A production multi-chain setup easily costs $100K+ per year in infrastructure (nodes, databases, compute)
- Complex to maintain. Schema migrations, chain forks, RPC node failures, backfill jobs
- Requires deep blockchain expertise on your engineering team
- You own the uptime, reliability, and data freshness guarantees
Self-hosted makes sense when blockchain indexing is your core product. For most teams, it's overkill.
Decentralized Indexing (The Graph)
The Graph is a decentralized protocol where independent node operators ("indexers") process data according to schemas you define called "subgraphs."
The workflow: you write a subgraph manifest that specifies which smart contracts to watch and how to transform the events. You deploy it to The Graph's network. Indexers pick it up and start processing. You query the results via GraphQL.
Pros:
- Decentralized and censorship-resistant. No single company controls the data
- Composable. You can build on other people's subgraphs
- Large ecosystem with thousands of existing subgraphs
- GraphQL query interface
Cons:
- You need to write and maintain subgraph code (AssemblyScript) for every data shape you need
- Costs are paid in GRT tokens, which makes pricing unpredictable
- Data freshness and reliability vary by indexer. Some are fast, some lag behind
- No built-in enrichment (USD prices, aggregated metrics). You get the events you define, nothing more
- Debugging failed subgraphs is painful
The Graph is a good fit if you need censorship-resistant data access or if the subgraph you need already exists. But for most application developers, writing and maintaining subgraphs is additional complexity on top of the product you're trying to build.
Managed Indexing APIs
A company runs the entire indexing pipeline, and you get access through an API. No infrastructure to manage, no indexing logic to write.
Examples: Codex, Alchemy, Moralis
Pros:
- Fast to start: sign up, get an API key, make your first query in minutes
- Production-ready enriched data out of the box (USD prices, aggregated metrics, holder analytics)
- Someone else handles uptime, data freshness, and chain support
- SDKs, documentation, and developer support
Cons:
- Dependency on the provider's availability and roadmap
- Costs scale with usage
- Less flexibility for highly custom data transformations
For most teams building crypto applications, managed APIs are the pragmatic choice. You trade some control for a massive reduction in complexity and time-to-market.
Comparison Table
Build vs Buy: When to Build a Blockchain Indexer Yourself
This is the decision most teams get wrong, and it usually costs them 6-12 months of engineering time before they realize it.
Build your own indexer when:
- Blockchain indexing IS your core product. You're a data company, and the indexer is what you sell
- You need highly custom data transformations that no existing API provides (novel DeFi protocol analytics, proprietary metrics)
- You need absolute control over every aspect of data processing and storage
- You have dedicated infrastructure engineers who can own the system long-term
Use a managed API when:
- You need blockchain data to build your actual product. A trading platform, wallet, analytics dashboard, or bot
- Time-to-market matters. You can't spend 6 months building an indexer before building your app
- You want enriched data (USD prices, OHLCV charts, aggregated volume, holder stats) without building the enrichment pipeline yourself
- You don't want to maintain indexing infrastructure alongside your actual product
The honest math: building a production-grade multi-chain indexer costs $100K+ per year in infrastructure and 1,000+ engineering hours to build and maintain. That's not a one-time cost. Chains fork, DEXes deploy new contracts, data schemas evolve, and your indexer needs to keep up. For most teams, that engineering time is better spent on their core product.
Sudoswap, the NFT marketplace protocol, learned this firsthand. They were running their own indexing infrastructure on AWS to power their analytics and marketplace data. After switching to Codex, they saved over $50K in annual AWS costs and eliminated the engineering overhead of maintaining their own pipeline.
If you're asking, "Should I build my own indexer?", you probably shouldn't. The teams that need to build their own already know it.
How Codex Approaches Blockchain Indexing
Codex is a managed blockchain indexing API that processes raw on-chain data into enriched, production-ready datasets across 80+ networks.
Here's what that means in practice:
- 70M+ tokens indexed with real-time USD pricing calculated from on-chain DEX activity
- Sub-second data freshness: new on-chain events are indexed and available through the API within 1 second of hitting the blockchain
- Enriched data out of the box: OHLCV charts, holder analytics, aggregated volume, liquidity metrics, token metadata, and wallet data
- Multiple delivery methods: GraphQL API for queries, WebSocket subscriptions for real-time streaming, webhooks for event-driven workflows, and a TypeScript SDK
- Battle-tested at scale: Codex powers TradingView, Coinbase, Rainbow, FOMO, Uniswap, and hundreds of other platforms and processes tens of billions of requests per month.
This is what you get when blockchain indexing is someone else's full-time job. Instead of building and maintaining your own indexing pipeline, you get a single API call:
import Codex from "@codex-data/sdk";
const sdk = new Codex("YOUR_API_KEY");
// Get token holders — one query instead of building your own indexer
const { holders } = await sdk.queries.holders({
input: {
tokenId: "So11111111111111111111111111111111111111112:1399811149",
},
});
console.log(`Total holders: ${holders.count}`);
holders.holders.forEach((h) => {
console.log(`${h.walletAddress} — $${h.balanceUsd} — since ${h.firstHeldAt}`);
});
Without an indexer, you'd need to scan every Solana transaction, track every transfer event, maintain a running balance for each wallet, and cross-reference prices to USD. That's a full-time infrastructure project. With indexed data, it's one API call.
If you're building on Solana specifically, our Solana API guide covers the full landscape of data providers.
Getting Started
Where you go from here depends on where you are:
- If you're exploring: Read the Codex documentation to see the full range of available data. Token prices, OHLCV charts, holder analytics, transaction feeds, and more across 80+ chains.
- If you're building: Sign up for a free API key (10,000 requests/month, no credit card required) and make your first query. The TypeScript SDK gets you from zero to working code in under 5 minutes.
- If you're migrating from The Graph or your own indexer: Codex typically replaces custom subgraphs with standard API calls. No subgraph code to write, no indexer nodes to maintain. The enriched data (USD pricing, aggregated metrics) that you'd normally calculate yourself is already included.
FAQ
What is a blockchain indexer?
A blockchain indexer is software that reads raw blockchain data (blocks, transactions, events), processes and organizes it into structured formats, and serves it through a queryable API. It transforms the unstructured data on a blockchain into something applications can actually use, such as token prices, holder lists, trading volume, and transaction history.
What is the difference between a blockchain indexer and RPC?
RPC gives you direct access to raw blockchain data. Account balances, individual transactions, block contents. But it can only answer simple questions about the current state. A blockchain indexer processes that raw data and lets you ask complex questions like "What was the price of this token over the last 24 hours?" or "Who are the top 100 holders?" These queries require aggregating and enriching millions of transactions — something RPC was never designed to do.
Do I need a blockchain indexer?
If your app needs any of the following, you need indexed data: token prices in USD, price charts (OHLCV), trading volume, holder analytics, token search or filtering, wallet portfolio data, or real-time transaction feeds. If you only need to read a wallet balance or send a transaction, RPC is sufficient. Most crypto applications beyond basic wallet interactions need indexed data.
How much does it cost to build a blockchain indexer?
A production-grade multi-chain indexer typically costs $100K+ per year in infrastructure and 1,000+ engineering hours to build and maintain. Managed indexing APIs like Codex start with free tiers (10K requests/month) and scale to $350 per million requests, eliminating that overhead entirely.
What are the best blockchain indexers?
The best option depends on your needs. For most application developers, managed APIs like Codex (100+ chains, enriched data, sub-second freshness), Alchemy (strong node infrastructure with some enrichment), and Moralis (EVM + Solana dev platform) are the fastest path to production. For decentralized, censorship-resistant indexing, The Graph is the established protocol. For self-hosted solutions, Ponder and Envio are solid open-source frameworks.
Skip the indexer. Start with indexed data. Get a free API key and follow along with the code examples in this guide.

