Solving The Trillion Dollar Data Problem In Edge Computing
Introduction
The last 15 years have seen several successive waves of big data platforms and companies offering new database and analytics capabilities. Built on the public cloud, these large-scale distributed data systems work well in centralized, single-region or single-data-center topologies and are geared toward “read-intensive” data problems.
Cloudera, Snowflake, and more recently Databricks are examples of successful companies that have innovated with novel “read architectures” that reduce the structural cost of reading immutable data, and that have built new big-data-oriented analytics apps and capabilities by exploiting that lower cost of reads.
We now face new challenges from workloads and use cases that are “write intensive”. The write path is far more challenging than the read path because the read path is built on immutable, unchanging data, whereas the write path necessarily involves mutable data whose rate of change and growth varies across use cases.
Additionally, the write path today has evolved from the client/server “request–response” interaction model to new streaming data architectures, where streams of “events” must be processed as they are created or generated by various sources.
A similar cascade of innovation is needed on the write path to change the structural cost of data writes, just as big data platforms changed the structural cost of data reads.
We summarize the challenges of the write path in modern cloud data platforms as follows:
- Expensive writes (e.g., DynamoDB, where writes are 16x to 64x more expensive than reads).
- Centralized architectures, which introduce large access latencies for globally distributed users as well as large transit latencies and network transfer costs for shipping data to the cloud.
- Communication-intensive, coordinated approaches to consistency, such as state machine replication or consensus, which are only feasible within a single cloud region (due to the need for ultra-low-latency, reliable, data-center-class networks).
- Synchronous replication to ensure that all nodes in a distributed system make forward progress on writes with transaction semantics, where either all related writes are accepted or all are rejected. In combination with point #3 above, replication and coordination end up being the Achilles’ heel of write-intensive distributed systems, as the sketch below illustrates.
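To make the last two points concrete, here is a toy sketch (with hypothetical inter-region round-trip times, not measurements of any real system) contrasting the commit latency of a synchronous, quorum-based write with that of a coordination-free write that commits locally and replicates asynchronously:

```python
# Toy comparison of write commit latency under two replication strategies.
# The round-trip times below are hypothetical placeholders, not measurements.

replica_rtts_ms = {
    "us-east": 0,      # the coordinating region itself
    "eu-west": 80,
    "ap-south": 220,
}

def quorum_commit_latency(rtts_ms, quorum_size):
    """Synchronous replication: the write commits only after a quorum of
    replicas acknowledge it, so latency is gated by the slowest replica
    needed to complete the quorum."""
    acks = sorted(rtts_ms.values())
    return acks[quorum_size - 1]

def coordination_free_commit_latency(rtts_ms, local_region):
    """Coordination-free write: commit locally and propagate asynchronously.
    Commit latency is just the local write; remote convergence happens later."""
    return rtts_ms[local_region]

print("quorum (2 of 3):  ", quorum_commit_latency(replica_rtts_ms, 2), "ms")
print("coordination-free:", coordination_free_commit_latency(replica_rtts_ms, "us-east"), "ms")
```

In the quorum case the coordinator cannot acknowledge the write until replicas in other regions respond, so commit latency is bounded from below by wide-area round trips; the coordination-free write acknowledges locally and converges in the background.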
Today’s enterprise apps and cloud workloads are no longer “read intensive”, but are a combination of:
- “Write intensive” (e.g., IoT, monitoring, clickstreams, fraud detection, etc.), or
- “Read-write balanced” (e.g., e-commerce, gaming, adtech, etc.).
This is because modern cloud-native architectures don’t use just one database for persistence but instead use several specialized databases and data stores for different types of semi-structured and unstructured data. This “polyglot persistence” pattern further exacerbates the write-intensity problem, because a single application-level write now generates a successive cascade of amplified writes to multiple data stores underneath the application.
This “multi-database/datastore mutation” pattern, where an application-level write results in several cascading updates to multiple independent databases or data stores, causes very high levels of write amplification and correspondingly high costs.
Attempts to solve these problems with the conventional techniques of “write-back” and “write-through” caching no longer work for such polyglot backends: one simply cannot buffer writes in a single cache and fan them out to multiple databases and data stores underneath without giving up consistency and transaction semantics entirely.
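As a hypothetical illustration (the store names and handler functions below are invented for this sketch, not taken from any particular product), the following snippet shows how a single application-level write fans out into one physical write per backing store, and how a failure partway through leaves the stores inconsistent unless something coordinates them:

```python
# Hypothetical polyglot-persistence fan-out: one application-level write
# produces a cascade of writes to several specialized stores.
# Store names and the simulated failure are illustrative only.

class StoreUnavailable(Exception):
    pass

def write_document_db(order):
    print("document DB  <-", order["id"])

def write_search_index(order):
    print("search index <-", order["id"])

def write_cache(order):
    raise StoreUnavailable("cache node down")  # simulated mid-cascade failure

def write_event_stream(order):
    print("event stream <-", order["id"])

BACKENDS = [write_document_db, write_search_index, write_cache, write_event_stream]

def place_order(order):
    """Naive fan-out with no coordination across stores: each application-level
    write amplifies into one physical write per backing store. If any backend
    fails partway through, the earlier writes remain visible while the later
    ones never happen, so the stores diverge."""
    for write in BACKENDS:
        write(order)

try:
    place_order({"id": "order-42", "total": 99.95})
except StoreUnavailable as err:
    print("partial failure:", err, "-> stores are now inconsistent")
```

A cache placed in front of this fan-out only hides the read side; the amplified, independently failing writes to each store still happen, which is why write-back and write-through buffering cannot restore transactional semantics across a polyglot backend.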
For use cases that require multi-region data or edge-based data processing, the problems are further exacerbated by the physical link latency, topology, and reliability of the wide-area network. These use cases must contend with well-known problems such as:
- Network latencies of hundreds of milliseconds, and
- Unreliable, jittery networks.
Given the challenges just described, today’s centralized distributed data systems and the technologies that underpin them do not generalize well to edge and multi-cloud workloads.
Macrometa has, for several years now, been focused on solving the challenges of the write path. We don’t just want to solve the cost and consistency problems of the write path; we also want to enable globally distributed stateful apps and web services that run concurrently in tens or hundreds of regions worldwide, with less than 50 ms end-to-end data access latency for 95% of the world’s population.
This is a non-trivial computer science problem because:
- It requires our platform to run in wide-area deployments where the nodes forming the database may be separated by more than 100 ms of latency over unreliable network connections.
- It requires our platform to run across tens to hundreds of data centers and yet present a single system image (SSI).
- It requires us to have a significantly better write cost structure and cost-performance ratio than current cloud data systems built on centralized architectures.
Macrometa accomplishes this with a novel architecture that combines geo-distributed, coordination-free write and replication techniques with a multi-modal data platform offering tunable consistency levels, solving these problems in an edge-native way.
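The paper does not detail Macrometa’s exact algorithms, but coordination-free replication is commonly illustrated with conflict-free replicated data types (CRDTs). The sketch below is a minimal last-writer-wins register, shown purely as a generic example of how replicas can accept writes locally and still converge without consensus; it is not Macrometa’s implementation:

```python
# Minimal last-writer-wins (LWW) register, a classic CRDT.
# Replicas accept writes locally and exchange state asynchronously; merge()
# is commutative, associative, and idempotent, so no cross-region coordination
# or consensus round is needed for replicas to converge.
# Generic illustration only, not Macrometa's actual implementation.

from dataclasses import dataclass

@dataclass
class LWWRegister:
    value: object = None
    timestamp: float = 0.0   # a logical or hybrid-logical clock in practice
    replica_id: str = ""     # tie-breaker when timestamps are equal

    def set(self, value, timestamp, replica_id):
        """Local write: commits immediately in the writer's region."""
        if (timestamp, replica_id) > (self.timestamp, self.replica_id):
            self.value, self.timestamp, self.replica_id = value, timestamp, replica_id

    def merge(self, other):
        """Anti-entropy step: fold in a remote replica's state. The order in
        which merges happen does not matter; all replicas converge."""
        self.set(other.value, other.timestamp, other.replica_id)

# Two replicas in different regions accept concurrent writes...
us, eu = LWWRegister(), LWWRegister()
us.set("blue", timestamp=1.0, replica_id="us-east")
eu.set("green", timestamp=2.0, replica_id="eu-west")

# ...and converge after exchanging state, with no cross-region round trip on the write path.
us.merge(eu)
eu.merge(us)
assert us == eu and us.value == "green"
print("converged to:", us.value)
```

In a tunable-consistency model, stronger guarantees can be layered on selectively for the operations that need them, so the default write path can stay local and inexpensive.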
Download the White Paper to learn more