Parity light node sync, warp sync, and full node sync
Goal: We will run a Parity Full Ethereum node and will measure the task of running an Ethereum node on 3 matrices: Time, Infrastructure and Skill.
What is an Ethereum node?
A device/program which interacts with ethereum network is called node or client. This node interacts with blockchain and provides multiple functionalities such as querying blockchain, executing transactions, managing wallets and much more.
Different Ethereum client implementations
There is multiple ethereum clients implementation in which Geth (go-ethereum) and Parity are the most popular. We will be running our Ethereum node using Parity implementation.
List of Ethereum clients:
- Geth (Go) wiki, repo, issues, (fully working, i.e. passing all JSON tests for the VM, plus can run a full node)
- Parity Ethereum (Rust) repo (fully working; light, warp, full, or archive node)
- cpp-ethereum (C++) docs, repo (fully working)
- pyethapp (Python) wiki, used for Casper FFG, repo. Includes py-ethereum, which in turn includes Serenity R&D.
- py-evm/Trinity (Python, has a sharding implementation under development), a full client (near completion as of late April 2018) developed from Py-ethereum. (almost fully working)
- Nethermind .NET Core
- EWasm, just an alternative to the EVM that is compliant with the EVM ABI.
- Harmony/ethereumj (Java) (fully working)
- Exthereum (Elixir) (fully working EVM implementation, not a client).
- ciri (Ruby) (WIP)
- Solidity EVM (Poc), not intended to be a complete client, just for runtime one-off execution.
What is node Sync?
A Blockchain node keeps the copy of transaction history. While starting your own node you need to get this copy from other running nodes. This process is called node sync.
There are different ways to install Parity, depending on your OS. I am using Ubuntu 16.04 for running Parity. You can install Parity binaries using one line.
$ bash <(curl https://get.parity.io -L)
If you are running Windows or MacOS, you can check installation instructions here. Once installed we will choose what kind of sync we want to run on our machine. There are multiple types of syncs are provided by Parity client.
Choosing a Sync type
Blockchain data is normally in GBs, and in some cases *TBs* (Ethereum Archive node is ~3.75 TB). There are different types of syncs which you can run depending on the application you’re building and the security you require. For example — if you are running a mining Grid, you’ll need to have Full Archive nodes; if you are a developer just starting to learn Ethereum, a light node sync may be enough for you.
A Full node normally contains a full copy of blockchain (all history, of every block and every transaction, all validated). Running a Full Ethereum node needs a lot of computing and storage resource. Full node sync verifies blocks validity, replay all transactions (this utilizes CPU and disk IO heavily). To run a Full Ethereum node you’ll need a multi-core CPU machine and at least 500 GB of high-performance disk space (i.e. SSD).
If you want an Archive node, with all the history and intermediary state of every account and contract for every block, you'll need at least 4TB storage (also SSD, or blocks will not process quickly enough to achieve a sync with the network).
Internet bandwidth also important here, you need at least a DSL connection.
Light node or Light clients do not download a full copy of blockchain and rather depends on a Full node to request a certain type of information on-demand basis. These light nodes are not secure as they depend on the honesty of Full nodes in the network.
Ethereum’s light client sync mode allows users to spin up a node that only downloads block headers and relies on Merkle proofs to verify specific parts of the state tree as needed. Light peers are extremely commonplace and critical components in the Ethereum network today. Their architecture serves as a great starting point for anyone extending or redesigning client in a secure, concurrent, and performant way. [source]
Run an Ethereum node using Parity
Before going further let’s discuss different types of full and light node syncs available with Parity client. So you can decide which one you want for your node.
Light node Syncs
Experimental: run in light client mode. Light clients
synchronize a bare minimum of data and fetch necessary
data on-demand from the network. Much lower in storage,
potentially higher in bandwidth. Has no effect with
Parity light client (version >=1.11) starts synchronization from a hardcoded block number if no related database was found on the device. This allows reaching the top of the chain in a matter of seconds.
Parity Light node sync command
Run this command for parity light node sync, after installing the parity.
$ parity --light
2- Use the flag
--no-hardcoded-sync — as
--light option uses a hardcoded block number as a starting point, to prevent this behavior and sync all headers starting from the genesis block.
By default, if there is no existing database the light
client will automatically jump to a block hardcoded in
the chain's specifications. This disables this feature.
Parity Light sync with no hardcoded block height
$ parity --light --no-hardcoded-sync
Warp sync — Parity also provide an option called warp sync, this is default sync method even if you don’t use
--warp option. Warp Sync is an extension to the Ethereum Wire protocol, which involves sending snapshots over the network to get the full state at a given block extremely quickly, and then filling in the blocks between the genesis and the snapshot in the background. Rather than downloading the whole blockchain and replay every transaction, Warp sync downloads all the blocks header and transaction receipt to reach the latest state. This sync can take up 1–2 days depending on your network bandwidth.
Parity Warp sync command
Run one from the below commands for Parity warp sync
$ parity // (Warp included by default)
$ parity --no-ancient-blocks
This disables downloading of old blocks after snapshot restoration or warp sync. The result is you start at current best block minus 30,000 historic blocks and will only keep the future blocks without downloading the full history.
However, this should not be used in production or on any node that is used to manage any value of Ether or Tokens. Because malicious nodes can easily provide you with tampered snapshots or blocks. The only way to ensure full integrity of the received data is a full verification of all blocks including ancient blocks.
Node Sync (Warp + Full) —
The most common node, a Full node, contains the most recent ~100 blocks and prunes the rest. You'll need around 500GB of high-performance storage (SSD) to keep this node in sync. This is the default sync method, or you can specify with the
--pruning fast flag on launch.
Node Sync (Warp + Archive) —
A node with Archive data takes immense CPU and storage resources and can take months to sync. You will need at least 4TB storage and multicore CPU processors. To sync a node with Archive data, you can run below command:
$ parity --pruning archive
Parity provides extensive configuration options for your node. Though you should only use them if you understand them completely.
Parity also provides a GUI configure generator for your node, Check here.
The Elephant in the Room
As we described above running an ethereum node require 3 major resources.
- Time — Apart from Light node sync, every other node sync type takes a lot of time (days-weeks). This can be frustrating for a developer or anyone who wants their node up and running to perform his/her task as quickly as possible.
- Infra — Infra can be a real barrier, though current laptops come with decent SSD storage. If you run a Full node on your home system, probably you won’t be using it in production. So for serious development, you have to use some cloud service (a machine with 500GB external disk and 4 vCPU with 15 GB internal memory costs around $185 per month on Google Cloud). Playing with various providers, we noticed that 8GB+ RAM is better, and dedicated CPU instances are far more consistent at being on the latest block vs. standard/shared instances. While it is possible to run a bare Full node on a few GB of RAM and 2 vCPU, it’s not something that would work reliably in production under a consistent load, over an extended period of time.
- Skill (to maintain and secure the node) — This is where sysadmin skills kick in. If you are not a sysadmin, it can be a real pain point for you. Even if you are a well-seasoned developer with command-line mojo and kernel & network-tuning skills, it’s not exactly ideal for you to spend hours managing your node, learning about security, and handling node client updates — this is all valuable time you could be spending building your application, developing features, and launching quicker than your competition.
As there is no incentive to run a Full node yet [check here] (unless you are building a business on it), running and maintaining a Full Ethereum node is a cumbersome and resource-exhaustive process. This gets weirder when you try to run multiple nodes with different clients. We at QuickNode are solving this problem for you.
With QuickNode, you can spin up your own Ethereum node in minutes. QuickNode is reliable, scalable, and developer friendly.
We have helped a lot of Ethereum business build fast and scale their applications. We learned a lot by running, managing and maintaining thousands of Ethereum nodes since July 2017, which we’ll share with you in upcoming blogs. So sign up for our weekly newsletter and updates!
QuickNode is building infrastructure to support the future of Web3. Since 2017, we’ve worked with hundreds of developers and companies, helping scale dApps and providing high-performance access to 16+ blockchains. Subscribe to our newsletter for more content like this and stay in the loop with what’s happening in Web3! 😃