The Non-Tech People-Friendly Guide to Web3 Backend Architecture

zisland

GTM

Foreword

As a Web3 practitioner, whether technical or not, one should have a basic understanding of a Web3 application's backend architecture. Beyond work-related needs, this matters because these backend features are the technical foundation for most Web3 ideas.

It often takes weeks or months for even experienced engineers to figure out the backend tech stack of Web3 applications. Without a technical background, the learning process can be painful. I hope this content helps readers establish a framework for efficient learning.

If you have any questions or comments, please feel free to contact me.

The backend architecture of Web3 applications is completely different from that of Web2, mainly because some components of the Web2 backend are replaced by the blockchain-based infrastructure.

In Web2:

A Web2 application can be roughly abstracted into 3 parts, namely:

(Twitter as an example)

  1. Frontend code: used to define UI and user interaction. e.g. Swiping down the screen will trigger a refresh on Twitter.
  2. Backend code: used to define business logic. e.g. The content will be synced to the home page after retweeting.
  3. Database: used to store data. e.g. Every piece of content we tweet will be stored in the database.

The abstracted architecture is shown below:

1.jpg

Now let's combine the three. For example, when you come across an interesting tweet and like it, the frontend receives this action and tells the backend. Following its code, the backend verifies that the like should be applied to the tweet and tells the database to record it. After recording it, the database reports back to the backend, which in turn reports to the frontend, so the user sees the "red heart" on the page light up.
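The three layers can be sketched in a few lines of Python. This is only an illustration of the flow described above, not how Twitter is actually built: the function names, the table layout, and the "one like per user per tweet" rule are assumptions made for the example, and an in-memory SQLite database stands in for the real database.

```python
import sqlite3

# Database layer: an in-memory SQLite table stands in for the real database.
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE likes (user_id TEXT, tweet_id TEXT, PRIMARY KEY (user_id, tweet_id))"
)


def backend_like(user_id, tweet_id):
    """Backend layer: apply the business rule, then ask the database to record."""
    already = db.execute(
        "SELECT 1 FROM likes WHERE user_id = ? AND tweet_id = ?", (user_id, tweet_id)
    ).fetchone()
    if already:
        return {"ok": False, "reason": "already liked"}  # hypothetical rule
    db.execute("INSERT INTO likes VALUES (?, ?)", (user_id, tweet_id))
    db.commit()
    return {"ok": True}


def frontend_tap_heart(user_id, tweet_id):
    """Frontend layer: forward the action, then update the UI from the reply."""
    reply = backend_like(user_id, tweet_id)
    return "red heart lit" if reply["ok"] else "no change"


first_tap = frontend_tap_heart("alice", "tweet-1")
second_tap = frontend_tap_heart("alice", "tweet-1")
```

Note that the frontend never touches the database directly. Everything passes through backend code that the company controls.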

This shows how Web2 applications like Twitter work after a high degree of abstraction.

In Web3:

The backend architecture of Web3 applications has changed dramatically, mainly because of changes in the database and backend code.

  1. Database: Web3's decentralized blockchain network replaces Web2's centralized database, and a large amount of value-based data is stored on the blockchain. This data is accessible and can be utilized by anyone.
  2. Backend code: Web2's backend code, which encodes the business logic, is also replaced by on-chain protocols and smart contracts, which are mostly open-source and have a high level of interoperability.

(Here we can briefly touch on how smart contracts, which define business logic, function on the blockchain network. Taking Ethereum as an example, smart contracts compute and process data through the EVM (Ethereum Virtual Machine) on nodes running in various operating environments, with the results agreed upon through consensus. The data is then packaged into blocks and permanently stored on the chain. This whole process is decentralized and not controlled by any single party.)

The abstracted architecture of Web3 applications is shown below:

2.jpg

Now assume that Twitter is a Web3 application, and let's walk through how it works:

First, Twitter's backend engineer creates a smart contract defining the business logic related to likes, and deploys it on the chain. When you like a tweet, the frontend directly calls the smart contract, which automatically adds a like to the tweet. This action is packaged into a block along with other data and permanently stored on the chain. Finally, the frontend receives confirmation that the data has been recorded on the chain, and the red heart on the page is lit.
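The same flow can be imitated with a toy model. The sketch below is not Solidity and not a real blockchain: `LikeContract` and `ToyChain` are hypothetical names, a plain Python class stands in for the smart contract, and SHA-256 over JSON stands in for real block hashing. It only shows the order of events: the frontend submits a call, the network runs it and packages it into a block that points at the previous block, and the frontend lights the heart once the call appears in a block.

```python
import hashlib
import json


class LikeContract:
    """Toy stand-in for a deployed smart contract that holds the like logic."""

    def __init__(self):
        self.likes = {}  # tweet_id -> set of addresses that liked it

    def like(self, sender, tweet_id):
        fans = self.likes.setdefault(tweet_id, set())
        if sender in fans:
            raise ValueError("already liked")
        fans.add(sender)


class ToyChain:
    """Toy chain: every block records the hash of the block before it."""

    def __init__(self, contract):
        self.contract = contract
        self.blocks = [{"number": 0, "prev_hash": "0" * 64, "txs": []}]
        self.pending = []

    @staticmethod
    def block_hash(block):
        encoded = json.dumps(block, sort_keys=True).encode()
        return hashlib.sha256(encoded).hexdigest()

    def send_transaction(self, sender, tweet_id):
        """Queue a contract call; nothing is final until a block includes it."""
        self.pending.append({"from": sender, "call": "like", "tweet_id": tweet_id})

    def produce_block(self):
        """Run the queued calls, then seal the accepted ones into a new block."""
        included = []
        for tx in self.pending:
            try:
                self.contract.like(tx["from"], tx["tweet_id"])
                included.append(tx)
            except ValueError:
                pass  # this toy drops rejected calls; real chains differ
        block = {
            "number": len(self.blocks),
            "prev_hash": self.block_hash(self.blocks[-1]),
            "txs": included,
        }
        self.blocks.append(block)
        self.pending = []
        return block


chain = ToyChain(LikeContract())
chain.send_transaction("0xAlice", "tweet-1")  # the frontend calls the contract
block = chain.produce_block()                 # the network packages the call
heart_lit = any(tx["tweet_id"] == "tweet-1" for tx in block["txs"])
```

Compared with the Web2 version, there is no company-run backend between the frontend and the stored data. The rule lives in the contract and the record lives in the block.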

This is how Web3 applications work after a high degree of abstraction (many details are omitted for ease of understanding).

How does the Web3 frontend interact with the blockchain network?

As mentioned earlier, in Web3, the frontend of an application can directly interact with the blockchain network to call smart contracts and realize the business logic. Interacting with the chain is the key to whether a Web3 application can run normally, so how is it achieved?

The answer is to interact with the chain through nodes.

Blockchain networks rely on numerous nodes for access and decentralization. Each node keeps a copy of the state of the chain, including the code and data of smart contracts. Each node can also submit on-chain transactions, which are then confirmed by miners (or validators on proof-of-stake chains) and synced to other nodes.

Interaction with the chain can be divided into two scenarios: initiating transactions ("write" data) and indexing data ("read" data), both of which need to be implemented through nodes.
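On Ethereum-compatible chains, both scenarios are typically expressed as JSON-RPC requests sent to a node. The sketch below only builds the request bodies and sends nothing over the network. `eth_getBalance` and `eth_sendRawTransaction` are standard Ethereum JSON-RPC methods, while the address and the "signed transaction" bytes are placeholders invented for illustration.

```python
import json
from itertools import count

_request_ids = count(1)


def rpc_request(method, params):
    """Build the JSON body that would be POSTed to a node's RPC endpoint."""
    return json.dumps(
        {"jsonrpc": "2.0", "id": next(_request_ids), "method": method, "params": params}
    )


# "Read": ask the node for an account balance at the latest block.
placeholder_address = "0x" + "00" * 20  # placeholder, not a real account
read_request = rpc_request("eth_getBalance", [placeholder_address, "latest"])

# "Write": hand the node a transaction that was already signed elsewhere.
signed_tx_hex = "0x" + "00" * 4  # placeholder bytes, not a real transaction
write_request = rpc_request("eth_sendRawTransaction", [signed_tx_hex])
```

Whether the node is self-built or rented from a provider, the application sends the same kind of request. Only the endpoint it talks to changes.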

Currently, there are two mainstream solutions:

  1. Self-build and run nodes
  2. Use third-party node service providers

What is the difference between these two ways?

Third-party services are analogous to Web2's cloud services, while self-built nodes are like traditional physical servers. Self-building ensures that the team has complete control over the nodes, at the cost of resources and time. For instance:

  1. Usually, a professional team dedicated to solving this problem is needed, which at least includes backend engineers and operations engineers.
  2. Self-built nodes still require purchasing cloud services for computation and storage, which can be expensive.
  3. In the case of Ethereum, it usually takes more than a week to finish syncing historical data after a new archive node is set up.
  4. A single node cannot meet most business needs, and operating and maintaining a multi-node cluster raises further issues, such as data consistency, that must be solved.
  5. If the business needs multi-chain support, the above workload will be multiplied (plus the tech stack of different chains is different).

Third-party service providers are professional teams operating a large cluster of nodes. They are responsible for solving all the above problems and providing the nodes to projects in the form of APIs. Such services often support multi-chain and are packaged with various operation and maintenance modules.

At present, some services, such as banks' core businesses, still require physical servers, even though cloud services have become mainstream. Likewise, some Web3 applications need to self-build nodes to ensure security. But in most scenarios, using third-party services is the most cost-effective solution.

The operational architecture of Web3 applications is shown below:

3.jpg

Signing is required to initiate a transaction

"Writing data on the chain" is often referred to as "initiating the transaction".

We can freely read data stored on the chain after a Web3 application is connected to the blockchain network through nodes (whether self-built or third-party). However, if we want to "write" data, signing with a private key is needed before initiating a transaction. As a Web3 user, you must have experienced the scenario where you're required to sign via a wallet.

The asymmetric cryptography behind public and private keys is complicated and will not be explained in this article. Ordinary users can think of the "private key" as the key that unlocks an on-chain identity (that's why you should keep the private key in a safe place). Wallets like MetaMask are key management tools. For example, MetaMask runs as a browser extension or mobile app and stores private keys locally on the user's device (phone or PC). When a transaction is needed, the wallet is activated so the user can review and sign it.

Take the Web3 version of Twitter as an example. Because the like action needs to be recorded on the chain, when you click like on the frontend, the application activates the wallet, which asks you to sign the transaction. The transaction is only initiated after you sign it in the wallet.
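For readers who want a feel for what "signing" means without the full cryptography, here is a deliberately simplified sketch. It uses textbook RSA with toy parameters, purely because it fits in a few lines. Ethereum actually uses a different scheme (ECDSA over the secp256k1 curve), and none of the function names below belong to MetaMask or any real wallet. The point is only the division of roles: the private number never leaves the wallet, the user must approve, and anyone holding the public numbers can check the signature.

```python
import hashlib

# Toy key pair built from two known Mersenne primes. NOT secure and NOT the
# scheme Ethereum uses; it only illustrates private-sign / public-verify.
P, Q = 2**61 - 1, 2**89 - 1
N = P * Q                              # public modulus
E = 65537                              # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))      # private exponent, kept inside the wallet


def digest(message):
    """Hash the transaction bytes down to a number smaller than N."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def wallet_sign(message, user_approved):
    """Wallet side: sign only if the user clicked 'confirm'."""
    if not user_approved:
        raise PermissionError("user rejected the signature request")
    return pow(digest(message), D, N)


def network_verify(message, signature):
    """Network side: check the signature using public values only."""
    return pow(signature, E, N) == digest(message)


tx = b"like tweet-1"
signature = wallet_sign(tx, user_approved=True)
accepted = network_verify(tx, signature)
forged = network_verify(b"like tweet-2", signature)
```

A signature made for one transaction does not validate a different one, which is why each write to the chain needs its own approval in the wallet.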

At this point, the operational architecture of Web3 applications becomes the following:

4.jpg

Decentralized storage solutions

As the cost of storing a large amount of data on the chain is very high (because of gas fees), we usually do not store all the data on the chain. Therefore, distributed storage solutions such as IPFS and Swarm are needed.

In both cases, although the data is not stored on the chain, the peer-to-peer distributed file system they rely on helps avoid the potential monopoly of centralized databases and thus the data cannot be tampered with by centralized parties.
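One mechanism behind this tamper resistance is content addressing: a file is looked up by an identifier derived from a hash of its content rather than by its location. The sketch below is a simplification. Real IPFS content identifiers (CIDs) use a richer encoding than a bare SHA-256 hex string, and `ToyContentStore` is a hypothetical in-memory stand-in for a peer-to-peer network.

```python
import hashlib


class ToyContentStore:
    """In-memory stand-in for a content-addressed network."""

    def __init__(self):
        self._blobs = {}

    def add(self, data):
        """Store data under the hash of its own content and return that address."""
        address = hashlib.sha256(data).hexdigest()
        self._blobs[address] = data
        return address

    def get(self, address):
        """Fetch data and re-check that it still matches its address."""
        data = self._blobs[address]
        if hashlib.sha256(data).hexdigest() != address:
            raise ValueError("content does not match its address")
        return data


store = ToyContentStore()
address = store.add(b'{"name": "My NFT", "image": "cat.png"}')
original = store.get(address)
```

Because the address is computed from the content, anyone who swaps the content behind an address is caught by a simple re-hash.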

In particular, some applications even store their frontend code on IPFS/Swarm, aiming for the highest degree of decentralization.

So now our Web3 application architecture looks like this:

5.jpg

Utilizing data on the blockchain network

Interaction with blockchain networks can be divided into two parts: writing data and reading data. Reading is far more common than writing, as we can see by comparing how much of our time on Twitter goes to viewing content versus creating it. Besides, reading data is free whereas writing data requires gas fees.

However, reading data has a higher threshold than writing data. Because the blockchain network is a distributed ledger, each block packages a different set of transactions (each block records whatever users around the world initiated during a fixed time period, so the content of every block differs), and the blocks are chained together like a linked list. Data in this form cannot be used directly. We need to decode and restructure the on-chain data first, and then query or index it.

In addition, different business scenarios have different requirements for data, so not all decoded data is ready to use. Engineers therefore need to clean and process data according to their own needs. Here's what they should do to "read" on-chain data:

  1. Decoding standard protocols: ERC20, ERC721, ERC1155, etc. are commonly used protocols;
  2. Decoding smart contracts: this part is often non-standard because of differences in engineering and business logic;
  3. Cleaning and processing of data: according to business needs, eliminate redundancies and filter out valuable data;
  4. Restructuring data: transform the linked-list data into a structured database for data query/index.
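The steps above can be sketched for the simplest case, an ERC-20 token transfer. On EVM chains, a transfer appears as an event log whose first topic is the hash of the event signature `Transfer(address,address,uint256)`. The sender and recipient are in the next two topics and the amount is in the data field. The sample logs, the addresses, and the table layout below are made up for illustration, and an in-memory SQLite table stands in for the structured database.

```python
import sqlite3

# keccak256("Transfer(address,address,uint256)"): the Transfer event signature.
TRANSFER_TOPIC = "0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef"


def decode_transfer(log):
    """Steps 1 and 3: decode an ERC-20 Transfer log, skip everything else."""
    topics = log["topics"]
    if len(topics) != 3 or topics[0] != TRANSFER_TOPIC:
        return None
    return {
        "token": log["address"],
        "sender": "0x" + topics[1][-40:],      # address is the last 20 bytes
        "recipient": "0x" + topics[2][-40:],
        "amount": int(log["data"], 16),
        "block": int(log["blockNumber"], 16),
    }


def pad(address):
    """Left-pad a 20-byte address to the 32-byte form used in topics."""
    return "0x" + address[2:].rjust(64, "0")


ALICE, BOB, TOKEN = "0x" + "aa" * 20, "0x" + "bb" * 20, "0x" + "11" * 20  # made up
raw_logs = [
    {"address": TOKEN, "blockNumber": "0x10",
     "topics": [TRANSFER_TOPIC, pad(ALICE), pad(BOB)],
     "data": "0x" + format(250, "064x")},
    {"address": TOKEN, "blockNumber": "0x11",
     "topics": ["0x" + "00" * 32], "data": "0x"},  # unrelated event, filtered out
]

# Step 4: load decoded rows into a structured table that can be queried.
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE transfers (token TEXT, sender TEXT, recipient TEXT, amount TEXT, block INTEGER)"
)
for log in raw_logs:
    row = decode_transfer(log)
    if row is not None:
        # Amounts are stored as text because uint256 can exceed SQLite integers.
        db.execute(
            "INSERT INTO transfers VALUES (:token, :sender, :recipient, :amount, :block)",
            {**row, "amount": str(row["amount"])},
        )

received_by_bob = db.execute(
    "SELECT sender, amount FROM transfers WHERE recipient = ?", (BOB,)
).fetchall()
```

Step 2, decoding non-standard contracts, is the hard part in practice, since each contract defines its own events and needs its own decoder.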

Just like node services, there are also teams dedicated to providing the services of reading data, usually delivered in the form of APIs.

The following figure is the Web3 application architecture we have now explored:

6.jpg

Sometimes, we still need servers

Not all data and backend code in Web3 applications will be on-chain since there are high costs associated with storing data and deploying smart contracts on chains. Besides, chains have performance limitations.

Similarly, not all projects need to store the remaining off-chain data in a distributed storage system, which is far from mature compared to a centralized database. It may sacrifice user experience, and greatly increase R&D costs as well as operation and maintenance costs.

Therefore, most applications still operate their own databases and backend code off-chain and only put the core business on the chain. For instance, smart contracts defining the core business logic as well as users' core assets are generally on-chain, while NFT metadata is often stored in distributed networks to improve credibility for users (if it were stored in a centralized database, users might be unable to view or access the NFT because of restrictions imposed by the centralized entity).

As for less core functional logic and data, using Web2 servers is still a common choice. For example, Web3 social tools rarely store chat information on the chain (due to cost, experience, and privacy protection), and GameFi projects generally only put user assets-related data (rather than the frontend, backend, and database) on chains.

Building an off-chain server to handle non-core data and business logic can maximize user experience as well as minimize R&D and operation and maintenance costs.
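The placement rules described in this section can be written down as a small lookup table. This is only a summary of the examples given above in code form. The category names are invented for illustration, and real projects decide case by case.

```python
from enum import Enum


class Layer(Enum):
    ON_CHAIN = "smart contracts / blockchain"
    DISTRIBUTED_STORAGE = "IPFS or Swarm"
    OFF_CHAIN_SERVER = "Web2 server and database"


# Hypothetical categories, following the examples in the text.
PLACEMENT = {
    "core_business_logic": Layer.ON_CHAIN,
    "user_assets": Layer.ON_CHAIN,
    "nft_metadata": Layer.DISTRIBUTED_STORAGE,
    "chat_messages": Layer.OFF_CHAIN_SERVER,
}


def choose_layer(kind):
    """Anything not listed as core defaults to the off-chain server."""
    return PLACEMENT.get(kind, Layer.OFF_CHAIN_SERVER)


game_inventory_layer = choose_layer("user_assets")
leaderboard_layer = choose_layer("leaderboard_cache")
```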

As shown in the figure below:

7.jpg

Most of the time, we need multi-chain support

Web3 is currently a multi-chain ecosystem, and this pattern may exist for a long time.

Projects often choose a chain as the main chain, but they also consider supporting other chains for wider coverage of the client base as well as higher user volume. In particular, platform-based applications are required to support multi-chain, otherwise, the data may be incomplete and usability might be greatly reduced.

The blockchain trilemma is a widely accepted concept. That is, for decentralization, scalability, and security, a chain can only take two out of three. Different chains have different tradeoffs, so the tech stack of chains is different. It is particularly obvious between EVM and non-EVM chains.

When we need to support multiple chains, the R&D and operation and maintenance costs will be multiplied. Every time we add a new chain, repetitive work has to be done.
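A common way to contain this repetition is to hide each chain behind a shared interface, so the rest of the application does not care which chain it is talking to. The sketch below is one possible shape, not a prescribed design, and the class names are hypothetical. It uses `eth_blockNumber`, a standard Ethereum JSON-RPC method shared by EVM chains, and Solana's `getSlot` method as an example of a non-EVM chain (Solana is not discussed in this article and is used here only as an illustration).

```python
from abc import ABC, abstractmethod


class ChainAdapter(ABC):
    """What the application needs from any chain, whatever its tech stack."""

    name = ""

    @abstractmethod
    def latest_block_request(self):
        """Return the RPC request body that asks for the chain's latest height."""


class EvmAdapter(ChainAdapter):
    """One adapter can serve every EVM chain, since they share the same RPC API."""

    def __init__(self, name):
        self.name = name

    def latest_block_request(self):
        return {"jsonrpc": "2.0", "id": 1, "method": "eth_blockNumber", "params": []}


class SolanaAdapter(ChainAdapter):
    """A non-EVM chain needs its own adapter because its API differs."""

    name = "solana"

    def latest_block_request(self):
        return {"jsonrpc": "2.0", "id": 1, "method": "getSlot", "params": []}


adapters = [EvmAdapter("ethereum"), EvmAdapter("polygon"), SolanaAdapter()]
methods = {a.name: a.latest_block_request()["method"] for a in adapters}
```

Adding another EVM chain costs one line here, while each non-EVM chain needs a new adapter. This mirrors the EVM versus non-EVM gap mentioned above, though nodes, data decoding, and operations still have to be repeated per chain.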

Now we have a relatively complete version of the abstracted Web3 application architecture:

8.jpg

Third-party solutions currently available

Combining all the content above, we know that we should do the following to create a Web3 application:

  1. Write smart contracts that define core business logic and deploy them on the chain
  2. Use distributed storage (on demand)
  3. Support tools such as MetaMask for users to confirm transactions
  4. Build and maintain node clusters to interact with the chain
  5. Decode the on-chain data and process it according to the requirements
  6. Establish off-chain servers (on demand) to structure data for more efficient queries and indexes
  7. Develop APIs to support frontend calls
  8. Support for multi-chain

Most of the work above points to the interaction between applications and blockchain networks. Self-built solutions require a lot of time and resources, not to mention that users will not choose an application just because it has a solid infrastructure - they care more about what problems the product actually solves. To conclude, self-built infrastructure is a high-investment low-return choice.

Given this, projects can instead use third-party services, such as Chainbase, a one-stop, cost-effective, high-efficiency solution.

Chainbase is a Web3 interaction layer infrastructure that provides multi-chain data and node APIs, and supports generating custom APIs via SQL. It can greatly lower the threshold for accessing and utilizing the blockchain network and enable projects to spend more time and resources on the application itself instead of infrastructure work.

Now we have the complete version of the Web3 application architecture:

9.jpg

About Chainbase

Chainbase is a leading Web3 blockchain interaction layer infrastructure. By providing cloud-based API services, it helps developers quickly access and utilize blockchain networks and easily build Web3 applications.

Chainbase makes blockchain interaction and data query/index on chains simple and easy to operate. Anyone can use, build and publish open APIs, which allows developers to focus on application-level innovation instead of solving the back-end hassles.

Chainbase currently supports Ethereum, Polygon, BSC, Fantom, Avalanche, Arbitrum, and other chains. This allows projects of all sizes to quickly reduce development time and costs, no matter which chains they are building on!

Want to learn more about Chainbase? Visit our website (chainbase.online), sign up for a free account, and check out our documentation.
