Chain State Management

Understanding active state vs archive state in EVM chains, and how to manage disk usage through state sync and offline pruning.

When running an EVM-based blockchain (C-Chain or Subnet-EVM L1s), your node stores blockchain state on disk. Understanding the difference between active state and archive state is crucial for managing disk space and choosing the right sync method.

Active State vs Archive State

Active State

The active state represents the current state of the blockchain—all account balances, contract storage, and code as of the latest block. This is what your node needs to validate new transactions and participate in consensus.

Property | Details
Size | ~500 GB for C-Chain
Contents | Current account balances, contract storage, code
Required for | Validating, sending transactions, reading current state
Sync method | State sync (fast, downloads only current state)

Archive State (Total State)

The archive state includes the complete history of all state changes since genesis. This allows querying historical state at any block height (e.g., "What was this account's balance at block 1,000,000?").

Property | Details
Size | ~3 TB+ for C-Chain (and growing)
Contents | Complete state history at every block
Required for | Historical queries, block explorers, analytics
Sync method | Full sync from genesis (slower, replays all blocks)

Most validators and RPC nodes only need the active state. Archive nodes are typically only required for block explorers, indexers, and specialized analytics applications.

Why State Grows Over Time

Even if you start with just the active state, your node's disk usage will grow over time:

  1. New blocks: Each block adds new state changes
  2. State trie overhead: The Merkle Patricia Trie structure stores intermediate nodes
  3. Deleted state retention: Old trie nodes aren't automatically removed

This means a node that started with 500 GB via state sync might grow to 1 TB+ over months of operation, even though the "current" active state is still ~500 GB.

Future Improvement: Firewood

Firewood is an upcoming database upgrade that will address the issue of total state growing too large. This next-generation storage layer is designed to efficiently manage state growth and reduce disk space requirements for node operators.

Managing Disk Usage

Option 1: State Sync (Re-sync)

The simplest way to reclaim disk space is to delete your node's data and re-sync using state sync. Instead of replaying the entire blockchain history to reconstruct the current state, state sync allows nodes to download only the current state directly from network peers. This shortens the bootstrap process from multiple days to just a couple of hours.

State sync is ideal for:

  • Validator nodes that don't need full transaction history
  • RPC nodes focused on current state queries
  • Any node where historical data queries are not required

State sync is available for the C-Chain and Avalanche L1s, but not for P-Chain or X-Chain. Since the bulk of transactions and state growth occur on the C-Chain, state sync still provides significant benefits for bootstrap time and disk usage management.

Configuring State Sync

State sync is enabled by default for the C-Chain. For Avalanche L1s, you can configure it per-chain in the chain config. To re-sync an existing node, stop it, remove the database, and restart:

# Stop your node first
sudo systemctl stop avalanchego

# Remove the database (adjust path as needed)
rm -rf ~/.avalanchego/db

# Restart - node will state sync automatically
sudo systemctl start avalanchego

Pros | Cons
Simple, no configuration needed | Several hours of downtime
Guarantees minimal disk usage | Loses any local transaction index
Fresh database with no fragmentation | Must re-sync from scratch
Fast bootstrap (hours vs days) | Not available for P-Chain or X-Chain
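
For the per-chain configuration on an Avalanche L1, the following is a minimal sketch rather than a definitive setup. It assumes the Subnet-EVM chain config accepts a "state-sync-enabled" key and that chain configs live under ~/.avalanchego/configs/chains/<blockchainID>/config.json; verify both against the versions you run. The function name write_state_sync_config is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: write a Subnet-EVM chain config that turns on state sync.
# Assumptions (check against your AvalancheGo / Subnet-EVM versions):
#   - the config key is "state-sync-enabled"
#   - chain configs live at <config-root>/<blockchainID>/config.json

write_state_sync_config() {
  local blockchain_id="$1"   # your L1's blockchain ID
  local config_root="${2:-$HOME/.avalanchego/configs/chains}"   # assumed default
  local config_dir="$config_root/$blockchain_id"

  mkdir -p "$config_dir"
  # Note: this overwrites any existing config.json for that chain.
  cat > "$config_dir/config.json" <<'EOF'
{
  "state-sync-enabled": true
}
EOF
  # Print the path so the caller can inspect or edit the file.
  echo "$config_dir/config.json"
}
```

Calling write_state_sync_config with your blockchain ID writes the file and prints its path. If the chain already has a config.json, merge the key into it by hand instead, since this sketch replaces the whole file.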

Zero-Downtime Re-sync for Validators

To avoid extended validator downtime, spin up a fresh node and let it state sync completely. Once it is synced, stop both nodes, copy the ~/.avalanchego/staking/ folder from your current validator to the new node, then start only the new node. Your validator identity (staking keys) moves with the folder, so downtime is limited to the brief window while both nodes are stopped. Never run two nodes with the same staking keys at the same time.
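
The hand-off could be sketched as follows. This is an illustration under stated assumptions, not an official procedure: it assumes the node's API listens on the default 127.0.0.1:9650, that the info.isBootstrapped method is available there, and that both data directories are reachable from one shell (across hosts you would use scp or rsync instead of cp). The helper names is_bootstrapped and copy_staking_keys are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch of the key hand-off between the old validator and a freshly synced node.
# Assumptions: default API address 127.0.0.1:9650 and the default data dir layout.

# Exits 0 if the local node reports the given chain (default: C) as bootstrapped.
is_bootstrapped() {
  local chain="${1:-C}"
  curl -s -X POST -H 'content-type: application/json' \
    --data "{\"jsonrpc\":\"2.0\",\"id\":1,\"method\":\"info.isBootstrapped\",\"params\":{\"chain\":\"$chain\"}}" \
    "http://127.0.0.1:9650/ext/info" | grep -Eq '"isBootstrapped": *true'
}

# Copies <old>/staking to <new>/staking, preserving permissions and timestamps.
# Run this only while BOTH nodes are stopped.
copy_staking_keys() {
  local old_dir="$1" new_dir="$2"
  [ -d "$old_dir/staking" ] || { echo "no staking dir in $old_dir" >&2; return 1; }
  # Keep the new node's auto-generated keys as a backup instead of deleting them.
  if [ -d "$new_dir/staking" ]; then
    mv "$new_dir/staking" "$new_dir/staking.bak.$(date +%s)"
  fi
  cp -a "$old_dir/staking" "$new_dir/staking"
}
```

A typical sequence would be: run is_bootstrapped on the new node until it succeeds, stop both nodes, run copy_staking_keys with the old and new data directories, then start only the new node.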

Option 2: Offline Pruning

Offline pruning removes old state trie nodes while keeping your node's database intact. This is faster than a full re-sync but requires temporary additional disk space.

See the Reduce Disk Usage guide for detailed instructions.

Pros | Cons
Faster than full re-sync | Requires ~30-60 minutes downtime
Preserves transaction index | Needs temporary disk space for bloom filter
No network bandwidth required | Slightly more complex setup
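
As a rough illustration of what the setup involves, the sketch below writes a C-Chain config that enables offline pruning for the next start. It assumes the config keys "offline-pruning-enabled" and "offline-pruning-data-directory" and a config file such as ~/.avalanchego/configs/chains/C/config.json; treat the Reduce Disk Usage guide as the authority on the exact keys and steps. The function name write_pruning_config is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: write a chain config that toggles offline pruning.
# Assumptions (verify in the Reduce Disk Usage guide for your version):
#   - keys "offline-pruning-enabled" and "offline-pruning-data-directory"
#   - the C-Chain config file is ~/.avalanchego/configs/chains/C/config.json

write_pruning_config() {
  local config_file="$1"      # e.g. ~/.avalanchego/configs/chains/C/config.json
  local data_dir="$2"         # scratch directory for the bloom filter
  local enabled="${3:-true}"  # pass "false" to switch pruning back off

  mkdir -p "$(dirname "$config_file")" "$data_dir"
  # Note: this overwrites the file; merge by hand if you have other settings.
  cat > "$config_file" <<EOF
{
  "offline-pruning-enabled": $enabled,
  "offline-pruning-data-directory": "$data_dir"
}
EOF
}
```

Stop the node before changing the config, start it to run the pruning pass, and then write the config again with the third argument set to false so that later restarts do not attempt another pruning run.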

Choosing the Right Approach

Scenario | Recommended Approach
Disk nearly full, need space fast | State sync (re-sync)
Regular maintenance, have spare disk space | Offline pruning
Running a block explorer or indexer | Keep archive state, add more storage
New validator setup | State sync (enabled by default)

Monitoring Disk Usage

Track your node's disk usage over time to plan maintenance:

# Check database size
du -sh ~/.avalanchego/db

# Check available disk space
df -h /

Consider setting up alerts when disk usage exceeds 80% to give yourself time to plan maintenance.
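
One way to implement such an alert is a small script run from cron or a systemd timer. The sketch below is a minimal example and relies only on the POSIX df -P output format; the function names and the way the warning is delivered (a message on stderr and a non-zero exit code) are illustrative choices, not part of AvalancheGo.

```shell
#!/usr/bin/env bash
# Sketch: warn when the filesystem holding the database crosses a threshold.

# Prints the used-space percentage (integer, no % sign) of the filesystem containing $1.
disk_usage_percent() {
  df -P "$1" | awk 'NR==2 { gsub(/%/, "", $5); print $5 }'
}

# Returns 1 and prints a warning if usage exceeds the threshold (default 80).
check_disk_usage() {
  local path="$1" threshold="${2:-80}" used
  used="$(disk_usage_percent "$path")"
  if [ "$used" -gt "$threshold" ]; then
    echo "WARNING: $path is at ${used}% (threshold ${threshold}%)" >&2
    return 1
  fi
  echo "OK: $path is at ${used}% (threshold ${threshold}%)"
}
```

For example, check_disk_usage ~/.avalanchego/db 80 exits non-zero once the disk holding the database is more than 80% full, which a scheduler or monitoring agent can turn into a notification.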

P-Chain and X-Chain State

The P-Chain and X-Chain have significantly smaller state footprints compared to the C-Chain:

  • P-Chain: Stores validator metadata, Avalanche L1 definitions, and staking transactions. State size is typically < 10 GB and grows very slowly.
  • X-Chain: Handles AVAX transfers using the UTXO model. State size is typically < 50 GB and grows slowly.

Important limitations:

  • State sync is not available for P-Chain or X-Chain
  • These chains always sync from genesis by replaying all transactions
  • Even without state sync, these chains bootstrap faster than the C-Chain (typically < 1 hour) because their state is much smaller
  • Disk space management is rarely needed for these chains

L1-Specific Considerations

For Avalanche L1s running Subnet-EVM:

  • State size scales with usage: High-throughput chains accumulate state faster
  • Same pruning tools apply: Offline pruning works identically to C-Chain
  • State sync available: Configure via Subnet-EVM chain config
  • Plan storage accordingly: Reference the system requirements for your throughput tier
