Pollum Infra Report for Syscoin Community
Report Period: JUL-SEP
Report Date: 22nd September
Table of Contents
Executive Summary
The main focus from July to September was to set up and establish the sysnevm RPC nodes for L1 and L2. To accomplish this objective, the first step was to understand the new L2 pieces and their parameters in order to get a node working; after that, all efforts went into moving the L2 to a reliable infrastructure.
There were some blockers and challenges along the way, especially when trying to deploy the RPC on our current infrastructure. During the process, the infrastructure had to be reorganized and a different approach was adopted to support the L2's behavior and also to reuse L2 components for L1, avoiding extra costs.
The main challenge for the Rollux network was that the deployment time for each cluster increased considerably, and the RPC deployment had to be repeated several times due to new chain updates, testing, and debugging.
Even so, several improvements were delivered for the RPCs during the period. The most significant were:
- deployment of L1 and L2 on the same instances, avoiding extra costs;
- eth-proxy development with the following features (a sketch of the cache idea follows this list):
  - cache behavior based on EIP-1898;
  - contract blacklist;
  - metrics collector;
  - WebSocket support;
- eth-proxy replaced the nginx proxy;
- start of the migration from Global Accelerator to CloudFront;
- cost reduction by changing the EC2 instance types and using less disk space;
- optimization of the tools panel to verify RPC usage metrics.
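As a rough illustration of the EIP-1898-based cache mentioned above, the TypeScript sketch below derives a cache key only for requests whose block parameter is pinned to an explicit block, which makes the response deterministic and safe to cache. The method list, hashing scheme, and function names are assumptions for illustration and not the actual eth-proxy code.

```typescript
// Minimal sketch (not the actual eth-proxy code): derive a cache key for
// JSON-RPC requests whose block parameter is pinned per EIP-1898.
import { createHash } from "crypto";

interface JsonRpcRequest {
  method: string;
  params?: unknown[];
}

// Methods that accept a block parameter as their last argument (assumed subset).
const CACHEABLE_METHODS = new Set(["eth_call", "eth_getBalance", "eth_getStorageAt"]);

// EIP-1898: the block parameter may be a hex block number/tag or an object
// containing `blockHash` or `blockNumber`. Only explicit references are cacheable.
function isPinnedBlockRef(block: unknown): boolean {
  if (typeof block === "string") {
    return /^0x[0-9a-fA-F]+$/.test(block); // explicit number, not "latest"/"pending"
  }
  if (block && typeof block === "object") {
    const b = block as { blockHash?: string; blockNumber?: string };
    return Boolean(b.blockHash || b.blockNumber);
  }
  return false;
}

export function cacheKeyFor(req: JsonRpcRequest): string | null {
  const params = req.params ?? [];
  if (!CACHEABLE_METHODS.has(req.method) || params.length === 0) return null;
  const blockRef = params[params.length - 1];
  if (!isPinnedBlockRef(blockRef)) return null; // "latest"/"pending" are never cached
  return createHash("sha256")
    .update(JSON.stringify({ method: req.method, params }))
    .digest("hex");
}
```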
During the same period, some additional deployments and improvements were done for the graph nodes. The main ones were:
- Deployment of a system file node for L2 and of the graph node that consumes it, allowing Pollum and Syscoin partners to deploy subgraphs on the Rollux network, a vital stepping stone for DeFi development;
- Deployment of a graph node for the Rollux testnet, making it even easier for partners to onboard the blossoming DeFi scene on Rollux;
- Improvement of graph node deployment using container orchestration.
As graph node deployment is a well-known process, the main challenges faced were related to deploying the Rollux node as an archive node.
Services
RPC L1/L2 Public Infra
Overview
- Endpoints Breakdown:
  - L1
    - mainnet:
      - https://rpc.syscoin.org
      - wss://rpc.syscoin.org/wss
    - testnet:
      - https://rpc.tanenbaum.io
      - wss://rpc.tanenbaum.io/wss
    - tools:
      - mainnet: https://tools.rpc.syscoin.org/
      - testnet: https://tools.rpc.tanenbaum.io/
  - L2
    - mainnet:
      - https://rollux.rpc.syscoin.org
      - wss://rollux.rpc.syscoin.org/wss
    - testnet:
      - https://rollux.rpc.tanenbaum.io
      - wss://rollux.rpc.tanenbaum.io/wss
    - tools:
      - mainnet: https://tools.rollux.rpc.syscoin.org/
      - testnet: https://tools.rollux.rpc.tanenbaum.io/
- Status:
- https://rpc.syscoin.org - Operational
- wss://rpc.syscoin.org/wss - Operational
- https://rpc.tanenbaum.io - Operational
- wss://rpc.tanenbaum.io/wss - Operational
- https://rollux.rpc.syscoin.org - Operational
- wss://rollux.rpc.syscoin.org/wss - Operational
- https://rollux.rpc.tanenbaum.io - Operational (running on a dedicated machine, without Pollum's proprietary infrastructure)
- wss://rollux.rpc.tanenbaum.io/wss - Operational (running on a dedicated machine, without Pollum's proprietary infrastructure)
- https://tools.rpc.syscoin.org/ - Operational
- https://tools.rollux.rpc.syscoin.org/ - Operational
- https://tools.rpc.tanenbaum.io/ - Operational
- https://tools.rollux.rpc.tanenbaum.io/ - Operational
- Updates:
- The infrastructure for sysnevm's RPC is no longer using ECS, due to the added complexity of the L2 nodes. To keep the same end result of easy scalability and maintenance, we now use scripts to manage the containers inside the EC2 instances: every time a new instance starts, the script sets up all the needed environments and brings the containers up (a simplified sketch of this flow is shown after this list).
- The infra is being migrated to CloudFront, aiming for lower latency.
- The new L2 nodes are up and running without bugs.
- We have an eth-proxy (based on GitHub - shalzz/ethereum-worker, a caching layer for an Ethereum node using Cloudflare's CDN and Cloudflare Workers) with customized code to cache Ethereum node responses, which has given us good cache performance.
- Tools API and frontend refactored and updated to read the nodes from this new infrastructure.
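As a rough illustration of the instance bootstrap described above, the TypeScript sketch below shows the general idea of a startup script that prepares the environment and brings the containers up. The paths, compose file name, and commands are assumptions, not the actual Pollum scripts.

```typescript
// Hypothetical sketch of the instance bootstrap flow (not the actual scripts):
// prepare the environment, then bring the node containers up with Docker Compose.
import { execSync } from "child_process";
import { existsSync, mkdirSync } from "fs";

const DATA_DIR = "/data/sysnevm";                   // assumed chain-data location
const COMPOSE_FILE = "/opt/rpc/docker-compose.yml"; // assumed compose file

function run(cmd: string): void {
  console.log(`$ ${cmd}`);
  execSync(cmd, { stdio: "inherit" });
}

// 1. Make sure the data directory exists before the containers mount it.
if (!existsSync(DATA_DIR)) mkdirSync(DATA_DIR, { recursive: true });

// 2. Pull the latest images and start the L1 + L2 containers on the same instance.
run(`docker compose -f ${COMPOSE_FILE} pull`);
run(`docker compose -f ${COMPOSE_FILE} up -d`);
```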
Short Description
The Rollux testnet was set up and validated by the Syscoin team, and then the eth-proxy was tested, first on Cloudflare, which was discarded as an option due to the price and the multiple limitations of running it there. The final test ran the proxy standalone inside a container. Making the eth-proxy work outside Cloudflare required refactoring the code to run on Express; WSS proxying and blacklist support were also added to the eth-proxy, together with the metrics API for the tools frontend.
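To make the Express refactor and blacklist support more concrete, here is a minimal TypeScript sketch of an Express-based JSON-RPC proxy with a contract blacklist. The upstream URL, port, and blacklist handling are assumptions; this is not the actual eth-proxy implementation.

```typescript
// Illustrative Express JSON-RPC proxy with a contract blacklist
// (assumed structure, not the real eth-proxy).
import express from "express";

const UPSTREAM = process.env.UPSTREAM_RPC ?? "http://127.0.0.1:8545"; // assumed upstream node
const BLACKLIST = new Set<string>([
  // hypothetical blocked contract addresses, lowercase
]);

const app = express();
app.use(express.json());

app.post("/", async (req, res) => {
  const { method, params } = req.body ?? {};

  // Reject calls that target a blacklisted contract.
  if (method === "eth_call" && BLACKLIST.has(String(params?.[0]?.to ?? "").toLowerCase())) {
    return res.status(403).json({ error: "contract is blacklisted" });
  }

  // Forward everything else to the upstream node unchanged.
  const upstream = await fetch(UPSTREAM, {
    method: "POST",
    headers: { "content-type": "application/json" },
    body: JSON.stringify(req.body),
  });
  res.status(upstream.status).json(await upstream.json());
});

app.listen(8080, () => console.log("proxy listening on :8080"));
```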
Once the eth-proxy was finished, the next step was to deploy the L2 to mainnet. At this point, the infrastructure code for sysnevm was updated to work only with auto-scaling groups and load balancers, without ECS, and using CloudFront instead of Global Accelerator.
A meeting was held with a Syscoin team member to resolve some mismatches between the configs used by the infrastructure. After the call, the mainnet node ran without issues. However, the official testnet L2 setup still faces the same issue when started on our infrastructure.
RPC L1/L2 Archive Infra
Overview
- Endpoints Breakdown:
  - L1
    - 18.188.59.171:8545
  - L2
    - 13.59.22.26:8545
- Status:
- 18.188.59.171:8545 - Operational
- 13.59.22.26:8545 - Operational
- Updates:
- A new sysnevm archive node was set up for internal usage. The main service using this node is an instance of The Graph.
Short Description
The nodes are configured on two separate instances. The L1 node uses a custom container image built from version v4.2.2 of sysnevm. The L2 nodes use the current automation provided by Syscoin in the Rollux repository.
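For reference, an archive node can be verified by querying state at an old block, something a pruned node cannot serve. The TypeScript snippet below shows such a check against the internal L1 archive endpoint listed above; the address and block used are placeholder examples.

```typescript
// Historical state query that only an archive node can answer for old blocks.
// The address and block below are placeholders, not production values.
const ARCHIVE_RPC = "http://18.188.59.171:8545"; // internal L1 archive node from this report
const ADDRESS = "0x0000000000000000000000000000000000000000"; // example address

async function balanceAtBlock(blockNumberHex: string): Promise<string> {
  const res = await fetch(ARCHIVE_RPC, {
    method: "POST",
    headers: { "content-type": "application/json" },
    body: JSON.stringify({
      jsonrpc: "2.0",
      id: 1,
      method: "eth_getBalance",
      params: [ADDRESS, blockNumberHex], // an explicit old block, e.g. "0x1"
    }),
  });
  const { result } = await res.json();
  return result; // hex-encoded wei balance at that block
}

balanceAtBlock("0x1").then(console.log).catch(console.error);
```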
Blockbook
Overview
- Endpoints Breakdown:
- UTXO Mainnet
- UTXO Testnet
- Status:
- https://blockbook.elint.services/ - Operational
- https://blockbook-dev.elint.services/ - Operational
- Updates: –
Short Description
The UTXO blockbooks were the first service we started providing to Syscoin and have been operational for a long time; due to their smaller usage they are mostly in maintenance mode, with the focus on keeping them operational. They run on a simpler infrastructure, a load balancer and 3 nodes on EC2, which keeps the service always available for the community.
There are a few open tasks that had not been concluded as of the release date of this report:
- Reduce to two nodes, as usage never exceeded 70% of CPU capacity throughout this year;
- Small UI adjustments to the navbar;
- Spread the two nodes across America and Asia/Europe to provide better latency for people across the globe.
The Graph Nodes
Overview
- Endpoints Breakdown:
  - L1
    - 18.224.43.215
  - L2
    - mainnet
      - 18.117.235.62
    - testnet
      - rollux.graph.rpc.tanenbaum.io
- Status:
- 18.224.43.215 - Operational
- 18.117.235.62 - Operational
- rollux.graph.rpc.tanenbaum.io - Operational
- Updates:
- New The Graph node dedicated to L2;
- New The Graph node dedicated to the L2 testnet;
- The deployment is being automated using docker-compose and shell scripts.
Short Description
All the graph nodes are running on dedicated machines. Both mainnet nodes run on x86 architecture, while the testnet node runs on Arm architecture, aiming for lower cost.
There is no dedicated scalability infrastructure configured for the graph nodes. All of them run on single EC2 instances and are accessible by IP; only the testnet graph node has a DNS entry for partners.
As we progress with testing (mostly infra adjustments) and the stability of these nodes, they will get public DNS entries for community usage.
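As an example of how a partner could consume one of these graph nodes once an IP or DNS is available, the TypeScript snippet below runs a GraphQL query against graph-node's standard HTTP query endpoint; the subgraph name and port are placeholder assumptions, not a real deployment.

```typescript
// Query a subgraph deployed on a graph node (placeholder host, port, and subgraph name).
// graph-node serves GraphQL queries at /subgraphs/name/<org>/<subgraph>.
const GRAPH_NODE = "http://rollux.graph.rpc.tanenbaum.io:8000"; // hypothetical example
const SUBGRAPH = "example-org/example-subgraph";                // hypothetical example

async function querySubgraph(): Promise<void> {
  const res = await fetch(`${GRAPH_NODE}/subgraphs/name/${SUBGRAPH}`, {
    method: "POST",
    headers: { "content-type": "application/json" },
    body: JSON.stringify({ query: "{ _meta { block { number } } }" }),
  });
  console.log(await res.json()); // latest indexed block, useful as a health check
}

querySubgraph().catch(console.error);
```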
New Developments
There are some new developments being done for sysnevm and for the graph nodes. For Syscoin, the nodes were restructured based on the Rollux container orchestration provided in the repository, but the code was changed to fit some infra requirements: deploying both L1 and L2 together and making them public without increasing costs.
On the infra side, we also ran CloudFront tests to better distribute the RPC nodes around the world. CloudFront can act similarly to Global Accelerator, redirecting requests to the nearest node and decreasing latency, while also offering a cache layer on edge servers even closer to where the requests originate. This is still being studied to provide better latency.
The RPC tools page is being improved: it was refactored to use Next.js and now also consumes the eth-proxy metrics.
There are plans to add API keys for partners and an overall rate limit to all publicly available services, including graph nodes and RPCs, to reduce the risk of future DDoS or spamming. A minimal sketch of such a rate limit is shown below.
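To illustrate the planned rate limiting (a sketch only, not a committed design), a fixed-window, per-API-key limiter written as Express middleware could look like the following; the header name, window size, and limit are assumptions.

```typescript
// Hypothetical fixed-window rate limiter keyed by API key (illustration only).
import express, { Request, Response, NextFunction } from "express";

const WINDOW_MS = 60_000;  // assumed 1-minute window
const MAX_REQUESTS = 600;  // assumed per-key limit per window

const counters = new Map<string, { count: number; windowStart: number }>();

function rateLimit(req: Request, res: Response, next: NextFunction): void {
  // The API key is assumed to arrive in a custom header; anonymous traffic shares one bucket.
  const key = req.header("x-api-key") ?? "anonymous";
  const now = Date.now();
  const entry = counters.get(key);

  if (!entry || now - entry.windowStart >= WINDOW_MS) {
    counters.set(key, { count: 1, windowStart: now });
    return next();
  }
  if (entry.count >= MAX_REQUESTS) {
    res.status(429).json({ error: "rate limit exceeded" });
    return;
  }
  entry.count += 1;
  next();
}

const app = express();
app.use(rateLimit);
app.post("/", (_req, res) => res.json({ ok: true })); // placeholder RPC handler
app.listen(8080);
```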