Solana Bot Benchmark: Reducing End-to-End Transaction Latency to 100 ms

Introduction

As of the publication date (2025/8/26), Solana's true TPS remains above 1,000, and Bots are increasingly used in DeFi, NFT, and automated trading. However, many developers still struggle with the negative impact of transaction latency on a Bot's efficiency and competitiveness.

In this article, we build a minimal Bot program based on our understanding of Solana's infrastructure and run latency benchmarks in Server E2E and Client E2E modes, covering common application scenarios such as automated trading and Trading Bots.

We share the architectural design, practical experience, and benchmark methods behind a high-performance, low-latency program, and provide a Bot template and benchmark results as references. Our goal is to give developers who pursue extreme latency optimization a set of basic design ideas and a performance baseline. Developers can compare their own measurements against our results, lower the latency of existing systems, and approach the limits of optimization more easily, which in turn encourages competition on functionality as Bot performance improves.

The Necessity and Significance of E2E Benchmark

In the scope of this article, E2E (end to end) refers to the entire process from receiving a transaction signal or user intent -> constructing a transaction -> sending the transaction -> receiving shreds -> confirming that the transaction has landed. Based on real application scenarios, we divide E2E testing into two modes: Server E2E and Client E2E. Server E2E mode suits automated trading programs and Web Trading Bots that want to optimize internal system latency, while Client E2E mode suits online trading systems that collect user transaction intent from interactive front ends such as browsers or mobile apps.

In the increasingly competitive Solana ecosystem, multiple competitors react to the same transaction signal. Reducing latency at the millisecond level can significantly improve a Bot's chance of winning that race while reducing slippage losses and transaction failure rates.

Therefore, low latency is key to the competitive advantage of Solana Bots. Because reference material on high-performance distributed systems is limited, developers who adopt traditional design patterns and benchmark methods can easily overlook network latency, geographical factors, and external dependencies, which leads to architectural design issues. In comparison, E2E testing can:

Identify bottlenecks from end to end: including RPC calls, transaction construction, transaction broadcasting, and transaction confirmation
Optimize architectural design: help developers think about architecture design tradeoffs based on actual benchmark data
Provide a basis for technical decisions: compare the latency of native Shred parsing with that of Yellowstone (Geyser) gRPC subscription to inform the choice of transaction confirmation method
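To illustrate bottleneck identification, here is a minimal sketch of a per-stage latency breakdown. It assumes every stage boundary in the E2E flow defined above is timestamped in milliseconds. The `StageTimestamps` struct, its field names, and the stage labels are hypothetical and are not taken from the benchmark code.

```rust
/// Hypothetical per-transaction timestamps (milliseconds since an arbitrary epoch),
/// one per stage boundary of the E2E flow described in this article.
#[derive(Debug, Clone, Copy)]
pub struct StageTimestamps {
    pub signal_received: u64,
    pub tx_built: u64,
    pub tx_sent: u64,
    pub shred_seen: u64,
    pub confirmed: u64,
}

impl StageTimestamps {
    /// Returns (stage label, duration in ms) for each consecutive pair of boundaries.
    pub fn breakdown(&self) -> [(&'static str, u64); 4] {
        [
            ("construct", self.tx_built.saturating_sub(self.signal_received)),
            ("send", self.tx_sent.saturating_sub(self.tx_built)),
            ("land", self.shred_seen.saturating_sub(self.tx_sent)),
            ("confirm", self.confirmed.saturating_sub(self.shred_seen)),
        ]
    }

    /// Returns the slowest stage, the first candidate for optimization.
    pub fn bottleneck(&self) -> (&'static str, u64) {
        self.breakdown()
            .iter()
            .copied()
            .max_by_key(|&(_, ms)| ms)
            .expect("breakdown always has four entries")
    }
}
```

Calling `bottleneck()` on each recorded transaction and counting which label appears most often gives a quick view of where the time goes before any architectural change is made.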

Server E2E vs Client E2E: Benchmark Type Introduction and Differential Analysis

Server E2E

Server E2E refers to the latency benchmark method where automated programs directly interact with Solana RPC on the server side. Server E2E simulates the end-to-end scenario of an automated trading Bot from constructing a transaction to confirming the transaction landed on chain, without involving user operations on the interface side, suitable for high-frequency trading Bots that operate in batches on the server side. The Server E2E test chain is shown below:

We build the Bot program in Rust and deploy it on Bot Servers, with the Bot program integrating BlockRazor's Solana transaction sending service. Once a transaction sent by the Bot program is included in a block, it is broadcast across the Solana network in the form of shreds. We use native Shred parsing (integrating BlockRazor's Shred Stream) and Yellowstone gRPC subscription to confirm whether the transaction has been included in a block.

Client E2E

Client E2E refers to the transaction latency benchmark method that simulates users manually operating on the client side (Web applications, mobile apps, etc.) and interacting with the server side, suitable for Trading Bots with end users. The transaction routing is shown below:

Unlike Server E2E, the Bot program in Client E2E needs to provide trading services to end users. Therefore, on top of the Server E2E setup, we add a Bot Client on the sending side, which submits transactions to the Bot Server through HTTP requests.

Because the Bot Client and Bot Server communicate over HTTP, Client E2E introduces additional latency compared to Server E2E, including browser rendering, network proxies, DNS resolution, cross-region network transmission, etc.

Summary

The differences between Server E2E and Client E2E are as follows:

Benchmark Cases: Actual Scenarios and Necessity

Method

Construct, sign, and send a set of transactions in different cases, recording the signature time of each transaction
- Parse native Shreds and listen for the target transactions; when a target transaction's signature is matched, record the timestamp as the transaction confirmation time. This method does not represent the final confirmation result of the transaction, but it matches with high probability at lower latency
- Subscribe to the gRPC stream pushed by Yellowstone and listen for the target transactions; when a target transaction's signature is matched, record the timestamp as the transaction confirmation time. This method listens for transactions at the Processed level
Set high priority fees and tips for each transaction
- Priority Fee: 0.005 SOL
- Tip: 0.003 SOL
100 transactions per group, sending one transaction every 5s
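The bookkeeping behind this method can be sketched as follows. This is a minimal illustration, not the benchmark's actual code: the `Tracker` type, the `Source` enum, and the use of plain millisecond timestamps are assumptions, and the Shred and gRPC stream clients themselves are out of scope here.

```rust
use std::collections::HashMap;

/// Which stream produced the match. Names are illustrative.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum Source {
    Shred,
    Grpc,
}

/// Remembers when each transaction was signed and the first match per source.
#[derive(Default)]
pub struct Tracker {
    signed_at: HashMap<String, u64>,
    first_hit: HashMap<(String, Source), u64>,
}

impl Tracker {
    /// Record the signature time (ms) right after signing a transaction.
    pub fn record_signed(&mut self, signature: &str, now_ms: u64) {
        self.signed_at.insert(signature.to_string(), now_ms);
    }

    /// Call for every signature observed on a stream.
    /// Returns the latency in ms on the first match of a tracked signature
    /// for this source, and None for unknown signatures or repeated matches.
    pub fn record_seen(&mut self, signature: &str, source: Source, now_ms: u64) -> Option<u64> {
        let signed = *self.signed_at.get(signature)?;
        let key = (signature.to_string(), source);
        if self.first_hit.contains_key(&key) {
            return None;
        }
        self.first_hit.insert(key, now_ms);
        Some(now_ms.saturating_sub(signed))
    }
}
```

Keeping one first-match entry per source lets the same transaction yield both a Shred latency and a gRPC latency, which is what makes the two confirmation methods directly comparable.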

Metrics

Transaction latency: the timestamp difference between transaction signature and transaction confirmation
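A group of latency samples can be summarized as an average and a 50th percentile, the two figures reported in the results. The sketch below is one possible way to compute them. The nearest-rank percentile definition is an assumption, since the article does not state which percentile method was used, and `Summary` and `summarize` are hypothetical names.

```rust
/// Summary of one group of latency samples, in milliseconds.
#[derive(Debug, PartialEq)]
pub struct Summary {
    pub average: f64,
    pub p50: u64,
}

/// Computes the average and the nearest-rank 50th percentile.
/// Returns None when there are no samples.
pub fn summarize(samples: &[u64]) -> Option<Summary> {
    if samples.is_empty() {
        return None;
    }
    let mut sorted = samples.to_vec();
    sorted.sort_unstable();
    let sum: u64 = sorted.iter().sum();
    let average = sum as f64 / sorted.len() as f64;
    // Nearest-rank method: 1-based rank = ceil(0.5 * n).
    let rank = (sorted.len() + 1) / 2;
    Some(Summary { average, p50: sorted[rank - 1] })
}
```

Reporting both values matters because a few slow transactions pull the average up while leaving the 50th percentile almost unchanged.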

Cases

Server E2E Multi-Region

Server E2E Multi-Region refers to deploying independent Bot Servers in Frankfurt, Amsterdam, New York, and Tokyo, and connecting them to BlockRazor Solana transaction sending Relays in the same region, as shown below:

According to the aforementioned method, a set of transactions is sent through each Bot Server, and the transaction latency is recorded. The results are as follows:

From a regional perspective, Frankfurt has the lowest transaction latency, followed by Amsterdam and then New York, while Tokyo has the highest. E2E latency is negatively correlated with the concentration of Validators in the region.

This is mainly caused by the current geographical distribution of Solana staking. Suppose the Bot Server is deployed in Tokyo while the Leader node frequently falls in Frankfurt, where Validators are most concentrated: the transaction has to travel from Japan to Germany to reach the Leader node, and the Leader node's block and its shreds then have to travel back from Germany to Japan to reach the Bot Server.
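A back-of-the-envelope lower bound shows why distance alone matters. The sketch below uses the haversine formula for great-circle distance and assumes a signal speed of about 200,000 km/s in optical fiber (roughly two thirds of the speed of light). The function names and the approximate city coordinates in the usage note are assumptions, and real network paths are longer than the great-circle route, so measured delays will be higher.

```rust
/// Great-circle distance in km between two (latitude, longitude) points
/// given in degrees, using the haversine formula.
pub fn great_circle_km(a: (f64, f64), b: (f64, f64)) -> f64 {
    const EARTH_RADIUS_KM: f64 = 6371.0;
    let (lat1, lon1) = (a.0.to_radians(), a.1.to_radians());
    let (lat2, lon2) = (b.0.to_radians(), b.1.to_radians());
    let dlat = lat2 - lat1;
    let dlon = lon2 - lon1;
    let h = (dlat / 2.0).sin().powi(2)
        + lat1.cos() * lat2.cos() * (dlon / 2.0).sin().powi(2);
    2.0 * EARTH_RADIUS_KM * h.sqrt().asin()
}

/// Lower bound on one-way propagation delay in ms.
/// Assumes ~200,000 km/s in fiber along the great-circle path.
pub fn min_one_way_ms(distance_km: f64) -> f64 {
    const FIBER_KM_PER_MS: f64 = 200.0;
    distance_km / FIBER_KM_PER_MS
}
```

With approximate coordinates for Tokyo `(35.68, 139.69)` and Frankfurt `(50.11, 8.68)`, `great_circle_km` gives roughly 9,300 km and `min_one_way_ms` roughly 47 ms. Since the transaction travels out and the shreds travel back, about twice that is spent on propagation alone before any processing time is counted.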

In terms of transaction monitoring methods, parsing shreds locally has lower latency but cannot obtain the actual execution results of transactions. Comparatively, Yellowstone gRPC subscription can obtain the actual execution results of transactions but adds some latency.

Client E2E Multi-Region

Client E2E Multi-Region refers to deploying independent Bot Clients and Bot Servers in Frankfurt, Virginia, and Tokyo. Bot Clients send transaction requests to Cloudflare's multi-region load balancer with geographic proximity routing enabled, and Cloudflare applies its proximity routing logic after receiving each request, as shown below:

According to the aforementioned method, a set of transactions is sent through each Bot Client, with the following results:

Frankfurt: Average 142 ms, 50th percentile 118 ms
Virginia: Average 174 ms, 50th percentile 179 ms
Tokyo: Average 263 ms, 50th percentile 272 ms

Comparing Client E2E and Server E2E in the same region, we can find:

On average, Client E2E transaction latency is 20-30 ms higher than Server E2E in Frankfurt and Tokyo. The extra latency comes from the added HTTP interaction between Bot Client and Bot Server; browser rendering, network proxies, and DNS resolution all contribute to it.

Client E2E Single-Region vs Multi-Region

In the Client E2E Single-Region case, we deploy a single Bot Server in Tokyo, and Bot Clients deployed in multiple regions send transaction requests to Cloudflare's load balancer with geographic proximity routing enabled. After receiving a request at the nearest PoP, Cloudflare routes it to the Bot Server in Tokyo, as shown below:

According to the aforementioned method, a set of transactions is sent through each Bot Client. The comparison with Client E2E Multi-Region is as follows:

Compared to Client E2E Multi-Region, Client E2E Single-Region has an average transaction latency that is 335 ms higher in Frankfurt, 148 ms higher in Virginia, and 23 ms higher in Tokyo. Client E2E Multi-Region therefore has a significant latency advantage, but multi-region deployment increases system complexity and cost. Although Client E2E Single-Region has higher transaction latency and is constrained by geographical location, it is suitable for Bot services with low budgets or a geographically concentrated user base.

Conclusion and Outlook

Through E2E testing, we have verified the extreme speed potential of Solana Bots in different scenarios and the importance of Bot geographical location optimization.

For Bot services with a strong geographical focus, we recommend deploying close to where users are concentrated; global distributed deployment is rarely necessary. For Bot services targeting global customers, we recommend multi-region distributed deployment behind a unified domain name, routing user transaction requests to the nearest service through load balancing and proximity routing to achieve the lowest latency on the transaction request side.
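The idea behind proximity routing can be sketched in a few lines. This is only an illustration of the principle, not Cloudflare's routing logic, which this article does not describe. The `Probe` struct and `nearest_region` function are hypothetical, and the sketch assumes each candidate region has been probed for round-trip time and health beforehand.

```rust
/// Hypothetical probe result for one Bot Server region.
#[derive(Debug, Clone, PartialEq)]
pub struct Probe {
    pub region: String,
    pub rtt_ms: u32,
    pub healthy: bool,
}

/// Picks the healthy region with the lowest measured round-trip time.
/// Returns None when no region is healthy.
pub fn nearest_region(probes: &[Probe]) -> Option<&Probe> {
    probes
        .iter()
        .filter(|p| p.healthy)
        .min_by_key(|p| p.rtt_ms)
}
```

Filtering on health before comparing round-trip times means a nearby but unavailable region is skipped in favor of the next closest one, which is the behavior a multi-region deployment relies on for failover.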

In terms of transaction confirmation, native shred parsing has lower latency and suits automated trading Bots that are sensitive to transaction latency. Yellowstone gRPC subscription has higher latency but delivers transaction execution results, so it suits Bot services that are relatively insensitive to transaction latency or that need execution results.

We have open-sourced the code for this benchmark at https://github.com/BlockRazorinc/solana-e2e-benchmark

If you are interested in benchmark details, data reports, or more extreme performance optimization, please feel free to contact us. We are happy to share more insights and explore cooperation opportunities.