EDX Markets – Building a fault-tolerant, low-latency exchange with Aeron in 7 months

At our recent Aeron MeetUp in New York, Chris Walsh, Head of Trading Systems at EDX Markets (EDXM), shared insights into the crypto exchange’s journey of building a fault-tolerant, low-latency exchange using Aeron Cluster and Hydra. EDXM built and deployed the exchange platform in about 7 months. A year after launch, it reports zero unplanned outages and a median round-trip latency of 73 microseconds.

Who is EDXM and what do they do?

EDXM is a digital asset technology firm catering to institutional clients. It operates a non-custodial, non-conflicted exchange and clearinghouse, hosting a spot cryptocurrency market in the U.S. and a perpetual futures market in APAC. Backed by prominent names in both traditional finance and the crypto space, EDXM aims to bring the market structure of traditional equities markets into the crypto space.

Challenges faced by EDXM in building its exchange

Initially, EDXM’s exchange was running on a vendor-owned platform, which served well during the startup phase. However, to compete and differentiate themselves longer term, EDXM needed to take control of their technology stack. They required a system that was:

  • Always correct and provided consistency
  • High-performance, with microsecond response times
  • Capable of running 24/7 with high availability
  • Flexible enough to let them differentiate at their own pace

Why EDXM chose Aeron for its exchange platform

Aeron, as the underlying trading infrastructure, was chosen for several key reasons:

  • Performance: Aeron is the global technology standard for high-throughput, low-latency trading systems.
  • High availability: Aeron Cluster provides fast failover and ensures the system remains available with five nines uptime.
  • Deterministic execution: Aeron Cluster’s sequenced message log, combined with deterministic business logic, simplifies troubleshooting and testing.
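
To make the deterministic-execution point concrete, here is a minimal sketch in plain Python. It does not use the Aeron API; the `Replica` class, the account names, and the log contents are all hypothetical. It only illustrates the general idea that replicas consuming the same sequenced log through deterministic logic end up in the same state.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """Hypothetical replica that applies sequenced messages to local state."""

    balances: dict = field(default_factory=dict)
    next_seq: int = 0

    def apply(self, seq: int, account: str, delta: int) -> None:
        # Reject gaps and reordering: every replica must consume the same sequence.
        if seq != self.next_seq:
            raise ValueError(f"expected sequence {self.next_seq}, got {seq}")
        # State depends only on (current state, message): no clocks, randomness, or I/O.
        self.balances[account] = self.balances.get(account, 0) + delta
        self.next_seq += 1


# Illustrative sequenced log of (sequence number, account, balance change).
log = [(0, "acct-a", 100), (1, "acct-b", 50), (2, "acct-a", -30)]

replicas = [Replica() for _ in range(3)]
for replica in replicas:
    for seq, account, delta in log:
        replica.apply(seq, account, delta)

# Every replica holds the same state after consuming the same log.
states = [replica.balances for replica in replicas]
```

Because the outcome depends only on the log, a problem seen in production could in principle be reproduced by replaying the same messages in a test environment, which is what makes troubleshooting and testing simpler.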

How EDXM uses Aeron and Hydra in its exchange architecture

EDXM partnered with Adaptive, the creators of Aeron, to build and deploy the exchange platform into production in 7 months. It utilized both Aeron itself and Adaptive’s Hydra development platform to accelerate and de-risk the project delivery. Using those infrastructure components allowed them to focus on their business logic and differentiating code.


EDXM – Exchange Architecture Revisited

EDXM’s exchange architecture includes:

  • Gateways: For order management, market data, and admin functions.
  • History writer and reader: For auditing and maintaining an order history database.
  • Central engine: The core of the system. It runs the matching algorithm and maintains the central limit order books, performs pre-trade risk checks to ensure orders do not violate customer limits, and manages all reference data, such as users, accounts, and instruments. The engine operates as a three-node Aeron Cluster for high availability and fast failover. In production it is deployed on-premises in EDXM’s data center, and it can also be deployed in the cloud for lower environments.
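
As a rough illustration of what a central engine does, the sketch below shows a toy central limit order book with a pre-trade risk check. It is not EDXM’s matching algorithm: the price-time priority rule, the `Order` and `OrderBook` names, and the per-account `max_order_qty` limit are assumptions chosen for the example.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Order:
    order_id: int
    account: str
    side: str   # "buy" or "sell"
    price: int  # price in ticks; integers avoid floating-point rounding
    qty: int


class OrderBook:
    """Toy limit order book using price-time priority (an assumed rule)."""

    def __init__(self, max_order_qty: dict):
        self.bids = {}  # price -> FIFO queue of resting buy orders
        self.asks = {}  # price -> FIFO queue of resting sell orders
        self.max_order_qty = max_order_qty  # hypothetical per-account limit

    def submit(self, order: Order) -> list:
        """Match an order and return (resting_id, incoming_id, price, qty) fills."""
        if order.side not in ("buy", "sell"):
            raise ValueError(f"unknown side {order.side!r}")
        # Pre-trade risk check: reject before the order can touch the book.
        if order.qty <= 0 or order.qty > self.max_order_qty.get(order.account, 0):
            raise ValueError(f"order {order.order_id} rejected by risk check")

        is_buy = order.side == "buy"
        book, opposite = (self.bids, self.asks) if is_buy else (self.asks, self.bids)
        fills = []
        while order.qty > 0 and opposite:
            # Best opposite price: lowest ask for a buy, highest bid for a sell.
            price = min(opposite) if is_buy else max(opposite)
            crosses = price <= order.price if is_buy else price >= order.price
            if not crosses:
                break
            queue = opposite[price]
            resting = queue[0]  # oldest order at this price level goes first
            traded = min(order.qty, resting.qty)
            fills.append((resting.order_id, order.order_id, price, traded))
            order.qty -= traded
            resting.qty -= traded
            if resting.qty == 0:
                queue.popleft()
            if not queue:
                del opposite[price]
        if order.qty > 0:
            # Any remainder rests on the book at its limit price.
            book.setdefault(order.price, deque()).append(order)
        return fills


book = OrderBook({"acct-a": 100, "acct-b": 100})
book.submit(Order(1, "acct-a", "sell", 101, 10))
book.submit(Order(2, "acct-a", "sell", 100, 5))
fills = book.submit(Order(3, "acct-b", "buy", 101, 8))
```

In this example the incoming buy order takes the cheaper resting sell first and then part of the next price level, leaving the remainder of order 1 on the book.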

The entire architecture is designed to be highly modular, allowing for easy scaling and maintenance. Each component, from the gateways to the central engine, can be independently scaled and updated without affecting the overall system. This modularity also allows EDXM to quickly adapt to changing market conditions and regulatory requirements.

Technical elements and key learnings from EDXM’s implementation

The technical implementation of EDXM’s exchange platform includes several critical elements that ensure its robustness and efficiency:

  • Single-threaded business logic: Makes deterministic execution possible and simplifies the domain model.
  • Deterministic collections: Collections are handled carefully so that their members are always accessed in a deterministic order.
  • Command log: Persisting a log of commands allows system state to be recovered by replaying the log, which is essential for resilience.
  • Aeron Cluster: Provides resilience through replication and consensus on state among nodes.
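
The elements above can be sketched together in a few lines of plain Python. This is an illustration under stated assumptions, not EDXM’s code or the Aeron Cluster API: the `Engine` class, the JSON snapshot format, and the SHA-256 state hash are all hypothetical choices. The sketch recovers state from a snapshot plus the tail of the command log, iterates accounts in sorted order rather than relying on hash-based ordering, and compares state hashes as one possible way to detect divergence.

```python
import hashlib
import json


class Engine:
    """Hypothetical single-threaded state machine driven by a command log."""

    def __init__(self):
        self.positions = {}  # account -> net position
        self.applied = 0     # number of commands applied so far

    def apply(self, command: dict) -> None:
        # State changes depend only on the command and the current state.
        account = command["account"]
        self.positions[account] = self.positions.get(account, 0) + command["qty"]
        self.applied += 1

    def accounts(self) -> list:
        # Sorted iteration gives the same order on every node and every run.
        return sorted(self.positions)

    def snapshot(self) -> str:
        # Sorted keys produce a canonical encoding of the state.
        state = {"applied": self.applied, "positions": self.positions}
        return json.dumps(state, sort_keys=True)

    @classmethod
    def restore(cls, snapshot: str) -> "Engine":
        data = json.loads(snapshot)
        engine = cls()
        engine.positions = data["positions"]
        engine.applied = data["applied"]
        return engine

    def state_hash(self) -> str:
        return hashlib.sha256(self.snapshot().encode()).hexdigest()


command_log = [
    {"account": "acct-b", "qty": 5},
    {"account": "acct-a", "qty": 10},
    {"account": "acct-b", "qty": -2},
    {"account": "acct-c", "qty": 7},
]

# The live engine applies every command and snapshots after the second one.
live = Engine()
snapshot = None
for command in command_log:
    live.apply(command)
    if live.applied == 2:
        snapshot = live.snapshot()

# Recovery: load the snapshot, then replay only the commands that came after it.
recovered = Engine.restore(snapshot)
for command in command_log[recovered.applied:]:
    recovered.apply(command)

# Matching hashes indicate the recovered state has not diverged from the live one.
in_sync = recovered.state_hash() == live.state_hash()
```

Snapshotting keeps recovery time bounded, since only the commands after the snapshot need to be replayed rather than the whole log.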

For a detailed overview of EDX Markets’ implementation, watch the full recording of Chris Walsh’s talk. Chris dives into the concepts of deterministic state transitions, snapshotting, gateways, observability, and divergence detection.

Rapid deployment and performance

EDXM was able to build and deploy the exchange platform into production in about 7 months, showcasing the efficiency and effectiveness of their chosen technology stack and development approach. They now achieve a median round-trip latency of 73 microseconds and, after more than a year in production, have recorded zero unplanned outages.

Future plans for EDXM’s exchange platform

EDXM plans to continue building on its Aeron developments. They aim to explore more creative compositions with Aeron Cluster, faster snapshotting, election controls, and continued performance tuning, especially in the business logic.