Antipode: Enforcing Cross-Service Causal Consistency in Distributed Applications

Modern internet-scale applications suffer from cross-service inconsistencies, arising because applications combine multiple independent and mutually-oblivious datastores. The end-to-end execution flow of each user request spans many different services and datastores along the way, implicitly establishing ordering dependencies among operations at different datastores. Readers should observe this ordering and, in today's systems, they do not. In this work, we present Antipode, a bolt-on technique for preventing cross-service consistency violations in distributed applications. It enforces cross-service consistency by propagating lineages of datastore operations both alongside end-to-end requests and within datastores. Antipode enables a novel cross-service causal consistency model, which extends existing causality models, and whose enforcement requires us to bring in a series of technical contributions to address fundamental semantic, scalability, and deployment challenges. We implemented Antipode as an application-level library, which can easily be integrated into existing applications with minimal effort, is incrementally deployable, and does not require global knowledge of all datastore operations. We apply Antipode to eight open-source and public cloud datastores and two microservice benchmark applications. Our evaluation demonstrates that Antipode is able to prevent cross-service inconsistencies with limited programming effort and less than 2% impact on end-user latency and throughput.


Introduction
Modern internet-scale applications, such as social networks, online forums, and e-commerce sites, are global-scale, decentralized distributed systems that comprise many different services and back-end datastores. In these systems, the end-to-end request flow of end-user interactions is complex, spanning multiple different services and machines and involving interactions with multiple different datastores [2,23,24,48,58,65]. Popular design patterns like microservice architectures further reinforce this complexity: services are loosely coupled, each implements a small slice of application logic, and each makes independent choices of datastores and consistency model [34,42].
Cross-service inconsistencies are a new challenge that arises in this setting. Although consistency is well-studied in the context of individual distributed datastores, new issues appear when an application uses multiple independent distributed datastores. In particular, a single end-to-end request can make multiple writes to multiple different datastores over the course of its execution; these writes are issued by the different services (and machines) traversed by the request. As a whole, the request establishes an implicit visibility ordering for its writes, which readers must respect if they are to be consistent.
However, in today's systems, datastores are independent and mutually oblivious. Each datastore implements its own consistency model; there is no coordination between datastores when replicating updates; and no single service has global knowledge of all datastore interactions of an end-to-end application request. Consequently, existing systems can neither detect cross-service consistency violations, nor enforce a visibility ordering for readers. This challenge has emerged recently, from both user bug reports [29-32] and reports from practitioners [2,4,42,59,66].
In this work, our goal is to provide developers with principles and tools to prevent cross-service inconsistencies in distributed applications. This goal is particularly challenging due to the size and complexity of the collection of services and their interaction patterns, as we further motivate in §2, using experiments with multiple services and a large-scale trace from Alibaba [48]. Furthermore, the fact that different systems are developed independently calls for a solution that can be incrementally adopted and deployed by each system that wants to prevent this class of inconsistencies.
To address these challenges, we present Antipode, a system that enforces cross-service causal consistency for applications with requests that span multiple processes and interact with multiple datastores. The design of Antipode brings together four main concepts, all of which are essential to the effectiveness and practicality of our solution.
First, we extend Lamport's causal consistency model to a new cross-service causal consistency definition. The new definition introduces the concept of lineages, embodying the dependent actions of a request across multiple processes. Furthermore, developers can select the relevant subset of dependencies that are amassed as requests percolate, leading to a sensible balance between semantics and scalability.
Second, we designed Antipode in a way that does not need global knowledge of all services and datastores. Instead, services only need to communicate lineage metadata with the end-to-end execution flow of requests and within datastore operations, piggybacking on existing request-context propagation mechanisms [50,54].
Third, to avoid stalling every read to check for possible incoming dependent updates, Antipode allows developers to selectively enforce causal relationships through a simple and generic API with a service-specific implementation. This is key to avoiding user-visible delays, while also decoupling the generic specification aspects from the implementation that is specific to each service.
Finally, to facilitate an incremental deployment on top of existing service implementations, Antipode takes a pragmatic bolt-on approach, inspired by prior work in causal consistency [14], where a service developer integrates Antipode as an application-level library and datastore shim.
We demonstrate the practical benefits of Antipode through experiments using eight popular cloud and open-source datastores (MySQL, DynamoDB, Redis, S3, SNS, AMQ, MongoDB, and RabbitMQ), an end-to-end evaluation on the DeathStarBench [36] and TrainTicket [66] microservice benchmarks, and a public-cloud microbenchmark. Our experimental evaluation pinpoints the existence of cross-service inconsistencies in these systems, and shows that Antipode is able to effectively prevent them with minimal performance impact.
In summary, the contributions of this paper are as follows:
• We exemplify empirically how cross-service consistency violations arise, both in open-source benchmark applications and using public cloud datastores.
• We propose cross-service causal consistency (XCY), an extension of Lamport's causal consistency to systems where requests span multiple processes and interact with multiple datastores.
• We present Antipode, a system designed to enforce cross-service causal consistency. The design of Antipode brings together the four main concepts mentioned above, to produce a solution that is (1) scalable, (2) performant, and (3) easy to deploy through an incremental integration with existing microservice systems and datastores (in merely tens of LoC).
• We demonstrate experimentally that Antipode can effectively enforce cross-service causal consistency with minimal impact on end-to-end performance.
The remainder of this paper is structured as follows: in §2, we motivate how the increasingly complex architecture of modern applications renders them vulnerable to cross-service inconsistencies. §3 delves into the challenges associated with enforcing cross-service consistency and presents the insights offered by Antipode to tackle them. In §4, we define XCY and introduce the notion of lineages, and in §5 we detail how Antipode effectively tracks and enforces these concepts. The design and implementation details of Antipode are discussed in §6, and in §7 we present an experimental evaluation of Antipode using a combination of public cloud datastores and microservice benchmarks. In §8 we compare Antipode to existing approaches, and in §9 we conclude by summarizing our findings.

On the complexity of modern applications
At an intuitive level, the more complex an application's architecture and the patterns of interaction between its components, the higher the chances of cross-service inconsistencies occurring, in particular when a single request accesses multiple datastores and triggers numerous state externalizations. This is particularly relevant for modern applications [2,23,24,58,65], as they predominantly resort to architectural patterns that prescribe a loose coupling of services, using various different consistency models for datastores, and allowing for independence between different services [34,42]. Furthermore, the complexity of end-to-end request flows and resulting graphs of interacting services can be daunting: a single user request may span hundreds of sub-queries that traverse multiple microservices [2,36,59]. We therefore argue that these architectural features are fertile ground for cross-service inconsistencies.
Substantiating this claim through real-world examples is, however, a challenging task. While there are many anecdotal reports of inconsistencies in large-scale services, they rarely have enough detail to diagnose whether their cause is due to cross-service issues or not. (An exception is a report from Facebook [2], which we elaborate on in the next section.) Fortunately, Alibaba recently released a comprehensive dataset [48], from which we can glean deeper insights regarding both deployment size and, more interestingly, the respective request patterns. In terms of the scale of the deployment, we found that out of Alibaba's more than 17k microservices, more than 80% are stateful services, namely databases, caches, or message queues. The prevalence of stateful services in the requests' large call graphs is also very high. In fact, Fig. 1 shows that more than 20% of requests perform 20 or more calls to stateful services. Moreover, more than half of the requests call 5 or more unique stateful services, with 10% calling more than 20. Furthermore, Alibaba's situation is by no means an exception: Uber has also disclosed comparable findings regarding the intricacies of their call graphs [65]. In particular, they report that a single request can call up to 1400 unique endpoints, has an average of 112 RPC calls per request (and a maximum of 275k), and has an average request depth of 8.5 and a maximum of 35. Overall, these findings confirm not only the sheer scale of modern large-scale applications, but also how complex the call graphs of a typical request have become. Furthermore, the analysis shows that it is common for a single request to externalize state through multiple stateful services, which we argue can lead to cross-service inconsistencies.
Next, to motivate our work more concretely and precisely characterize cross-service inconsistencies, we abstract away from the complexity of Alibaba's trace, and focus on a distilled, but realistic example that we use throughout the paper.

Example: Post-Notification application
The following motivating example, which closely follows a description of a real problem in Facebook's infrastructure [2], illustrates cross-service consistency violations and highlights how and why these occur in distributed applications. We consider a simplified version of a post-notification application, depicted in Fig. 2. In this application, users can upload posts and followers receive notifications, similarly to applications such as social networks, online forums, and e-commerce sites.
Internally, the application comprises four different services, each responsible for a different task in the end-to-end request flow, namely: a post-upload service that works as a proxy for the clients; a post-storage service responsible for storing and processing the contents of posts; a notifier service in charge of disseminating notification events; and a follower-notify service that notifies followers of new posts.
Cross-service consistency violations can occur in this application: followers in Region B can be notified of posts that do not yet exist in that region. Concretely, we step through the end-to-end request flow to illustrate how this can occur: ① A user in Region A invokes the top-level post-upload API, which internally makes an RPC to post-storage passing the post data as argument. ② post-storage writes the post data to its internal replicated datastore, returning a post ID as the RPC response. ③ post-upload makes an RPC to notifier passing the post ID and user ID, and notifier writes a notification event to its internal replicated queue. ④ Eventually, the notification arrives in Region B, where it is dequeued by notifier and delivered to follower-notify. ⑤ To deliver notifications, follower-notify first retrieves the post data by calling post-storage and passing the post ID. ⑥ post-storage reads the post data from its internal replicated datastore and replies to follower-notify. ⑦ follower-notify delivers the post and notification to followers in that region.
The cross-service inconsistency is visible in step ⑥ of this request flow. The internal replicated datastore of post-storage might not have replicated the post data to Region B yet, and thus reading the post could yield an empty or stale result. This inconsistency occurs because our example uses two independent datastores in the end-to-end flow of application logic: the posts datastore at ② and the notifications queue at ③. Individually, each of these datastores correctly implements some consistency model. Yet together, we fail to achieve our expected consistent behavior: post data should be visible by the time the notification event is delivered.
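To make the race concrete, the following Python-style sketch condenses the flow above; the client stubs (posts_db, notif_queue, deliver_to_followers) are hypothetical stand-ins for the two services' datastores, each of which replicates asynchronously and independently of the other.

    def upload_post(user_id, post_data):          # runs in Region A
        post_id = posts_db.write(post_data)       # step ②: replicates asynchronously
        notif_queue.publish({"user": user_id,     # step ③: separate, independent
                             "post": post_id})    # replication channel

    def on_notification(event):                   # runs in Region B
        post = posts_db.read(event["post"])       # step ⑥: may run before the post
        if post is None:                          # write reaches Region B
            raise LookupError("post not found")   # the cross-service inconsistency
        deliver_to_followers(event, post)

Nothing in this sketch is wrong in isolation; the violation emerges only from the composition of the two independently consistent datastores.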

Exploring cross-service inconsistencies
To further highlight the risk of cross-service inconsistencies and their sensitivity to deployment aspects, we performed an experiment to measure these inconsistencies in several cloud datastores from Amazon Web Services (AWS). We implemented the post-notification example application described in §2.2 and deployed the services across two global regions. We measure inconsistencies as previously described: how often a notification is received before the corresponding post can be read. We used multiple different cloud datastores, namely MySQL (an RDBMS), DynamoDB (a NoSQL database), S3 (an object store), and Redis (an in-memory cache) for post data; and SNS (a publish-subscribe system), AMQ (a message broker), and DynamoDB for notification events. Table 1 details the percentage of observed inconsistencies. The results show that some combinations of systems experience more inconsistencies than others. For example, using S3 as post-storage, we observed numerous inconsistencies across all notifier datastores, suggesting a slower internal replication. In contrast, with DynamoDB as notifier, we observed low rates of inconsistencies across all post-storage datastores, suggesting a less optimized replication for the notification's specific type of payload, which enables the post to replicate sooner than the notification. We study this scenario in more depth in §7.

Challenges & Insights
To enforce cross-service consistency in the challenging setting of large-scale deployments with several independent microservices, the design of Antipode addresses the following main challenges through the corresponding key insights.

Extending causal consistency
First, we need to understand at a conceptual level what is the disconnect between the current view on causal consistency and the architecture of modern distributed systems, and how to address that mismatch. Recall that the original definition of causality stems from the happened-before partial order defined by Lamport [43], which states that two events are (causally) related by this partial order if they are either consecutive events from the same process, or the sending and corresponding receiving events for a given inter-process message (plus the transitive closure of the previous two classes). This definition assumed a simple model where a system was a collection of processes and events were either executing a single machine instruction (or subprogram) or sending and receiving messages between processes.
This definition was later extended in the context of the ISIS system to causal broadcast [16] and shortly after by Ahamad et al. to causal memory [1]. In particular, the latter definition makes the observation that, in a shared-memory system, communication occurs via reading and writing from shared memory positions, and therefore the causality partial order needs to be extended with a writes-into order, capturing the reading of a value that was written by another process.

Insight #1. In this paper, we observe that the brave new world of microservices and large-scale distributed systems no longer has a simple and logically centralized view of a shared memory, through which processes read and write all the side effects that need to be shared among them. Instead, we have complex patterns of interactions between services that stem from a single user request, as exemplified in our simple post-notification example (but translated to a much larger scale, as evidenced by the Alibaba trace [48]). To address this, we strengthen Lamport's notion of causality to have a broader view of the meaning of "writing to a shared memory", so that it encompasses all the side effects that percolate throughout different services: a novel concept named lineage.

Capturing cross-service dependencies
Addressing cross-service inconsistencies through a causal approach also poses inherent scalability challenges, namely because the number of dependencies amassed in our target scenarios can be prohibitively large.
The most common approach for tracking these dependencies is to use vector clocks, where each entry contains the most recent version observed for each process. More recent solutions optimize this by coalescing vector clocks into a single scalar [18,27] (at the cost of requiring frequent state dissemination). Regardless of the technique used to track and enforce causal dependencies, the large dependency trees amassed (and the corresponding metadata size) have been shown to lead to scalability and performance bottlenecks [13,17,20,21,51]. We argue the effects will be greatly magnified in a cross-service setting. For example, in an ecosystem as large as Alibaba's [48], this would require enforcing dependencies from possibly hundreds of services (each of which will depend on the same order of other services, and so on). At the protocol level, this implies a proportional number of entries in a vector clock, or frequent expensive cross-service synchronization calls.
On top of that, we argue that some of these dependencies may not be worth enforcing. Going back to our running example, if the post also triggered a write to a continuously updated data-analytics stream, it would not make sense to wait for the entire set of analytics to be recomputed before reading any data that was produced from it.

Insight #2. Our goal is to devise a system that empowers the developer with tools for capturing and enforcing these dependencies, in a way that is as automated as possible, while also giving the developer some control over the subset of these dependencies that need to be enforced. To automatically detect the relevant dependencies, we track dependent operations across services by communicating metadata, conveying this set of dependencies both within datastores and alongside end-to-end requests. This metadata can easily be extracted and conveyed through existing systems by leveraging causal tracing frameworks [40,50,53,56,60,63] already commonly deployed. Furthermore, we provide developers with an API to explicitly add and remove dependencies that will be enforced at a later stage. We argue that this combination leads to a smaller dependency graph, resulting in improved performance at scale, at the cost of requiring some intervention from developers, which our evaluation shows to be simple in practice.

Enforcing cross-service consistency
Enforcing cross-service consistency is particularly challenging within these increasingly complex environments, especially since each service individually lacks both (a) the end-to-end knowledge about previous datastore operations made by other services, and (b) the knowledge regarding the protocol implementation and semantics of other datastores. To illustrate this through our running example, notifier lacks both (a) the knowledge of the initial write to post-storage, and (b) the knowledge of the asynchronous replication that was triggered in response to the write-post operation.
As straw-man solutions, we could ameliorate both classes of problems by strengthening the guarantees of post-storage to make its replication synchronous, but this introduces undesirable delays that are discouraged in practice [34,47,48]. We could also try to incorporate more global knowledge about end-to-end requests. For example, the notifier service could manually check the post-storage service before delivering the notification. Alternatively, all datastores could synchronize their replication progress. Generically, this requires developers to enforce consistency at an application-wide scale, which, although it is the status quo in microservice-based applications [34,42], is precisely the burden we aim to minimize. Overall, these approaches break the design philosophy of microservice applications, which intentionally imposes strict boundaries and loose coupling between services to enable rapid and independent development [34,42]. Recently, Google introduced Service Weaver [38], a framework that enables developers to build microservice-based applications using a programming model similar to writing a monolithic application. While this framework helps developers tame the complexity of managing a microservice deployment, it is not meant to address either data placement or possible cross-service inconsistencies.
A related challenge is that existing approaches typically enforce the visibility of causal dependencies at either read or write operations [3,5,18,26,27,44-46,55], which, in a scenario with cross-service dependencies, can inadvertently add user-facing delays that degrade the user experience. For instance, in the post-notification application, the application delivers a notification to the end-user, which triggers the fetching of the corresponding post from post-storage. However, if this read is performed with a set of cross-service dependencies, it may result in the end-user having to wait for the replication process to complete before obtaining a consistent view of both the notification and post objects.

Insight #3. In order to avoid the performance penalty of checking and enforcing cross-service dependencies at every single datastore operation, Antipode decouples the enforcement into a separate barrier call. This offers a more flexible approach by allowing the developer to selectively enforce the visibility of cross-service dependencies through a specific API call with a generic interface and semantics, but a service-specific implementation that is opaque to other services. Additionally, it allows the developer to select the best barrier placement in order to hide the delay from end-users, independently of datastore operations.

Incremental deployment
Many existing solutions for preventing cross-service inconsistencies require architectural and internal changes to existing applications (as detailed in §8). However, we believe that the instrumentation for such prevention should not require a forklift upgrade of the entire set of applications: we should aim for a minimal and self-contained set of changes that allows each individual service to benefit from the new consistency semantics.

Insight #4. Inspired by prior work [12], Antipode takes a pragmatic bolt-on design, where its logic runs as a shim layer around existing services. This approach does not require deep changes to the internals of services or datastores (unlike FlightTracker [59]), making it a more flexible and adaptable solution for gradually correcting cross-service inconsistencies. In addition, Antipode is agnostic to the interface or semantics of the services that comprise the system, and provides an API that does not require end-to-end application knowledge.

Cross-Service Causal Consistency (XCY)
In this section, we define XCY, which, unlike prior causality definitions, captures data inconsistencies such as those exemplified in §2 and allows for an efficient and scalable design (§5) that can be readily applied to existing applications (§6).

Lineages
The concept of a lineage captures a tree of events (or actions) across different services, corresponding to the various branches that are spawned as a consequence of a given client request. For example, in the case of the post-notification example (§2.2), there are two distinct lineages. The first has two concurrent branches, corresponding to the write-post and notification operations, including their respective replications. A separate lineage starts when follower-notify reads the notification, followed by the post read at post-storage.
The concept of lineage has actually been used extensively within the distributed tracing community [19,33,50,54,60], but was never formalized nor incorporated into Lamport's causal consistency. (We provide a formal definition of the system model and the concept of lineage in the appendix.) Although lineages are a simple concept, their instantiation can be very complex. For instance, at Alibaba, user requests typically form a tree, where more than 10% of stateless microservices fan out to at least five other services, and where the average call depth is greater than four [48]. Additionally, this tree contains, on average, more than five stateful services (Fig. 1). This attests to how scattered application state is in microservice-based applications, which makes it more challenging to track and aggregate the dependencies between states into lineages.

XCY definition
To define XCY, we begin by outlining the abstract model on which it operates. We restrict our model to a system that encompasses two operations: write(k,v) and read(k). Generalizing to complex queries and updates would be straightforward. We denote lineages as L, and we define L(o) to be the lineage L such that operation o ∈ L. XCY also makes use of the Lamport happened-before relationship [43], denoted →, which relates operations that either succeed each other in the same execution thread, or that correspond to the sending and receiving of the same message across processes.
We denote the cross-service causal order between operations as ⇝, which extends the canonical happened-before to our setting. Given two operations a and b, if a ⇝ b, we say that b depends on a, or that a is a dependency of b. Three rules define this relationship:
1. Happened-before. If a → b then a ⇝ b.
2. Reads-from-lineage. If a′ is a write operation and b is a read operation that returns the value written by a′, then a ⇝ b, for all a ∈ L(a′).
3. Transitivity. For any operations a, b, and c, if a ⇝ b and b ⇝ c, then a ⇝ c.
Note that Lamport's causality definition only uses rules 1 and 3, and therefore the additional dependencies stemming from rule 2 create a stronger definition, as we elaborate next.
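For reference, the three rules can be stated compactly in LaTeX as follows; the arrow symbol is our choice, standing in for the order operator defined above:

    \begin{align*}
    \text{(1) Happened-before: } & a \to b \implies a \rightsquigarrow b \\
    \text{(2) Reads-from-lineage: } & b \text{ reads the value written by } a'
        \implies \forall a \in \mathcal{L}(a'),\; a \rightsquigarrow b \\
    \text{(3) Transitivity: } & a \rightsquigarrow b \,\wedge\, b \rightsquigarrow c
        \implies a \rightsquigarrow c
    \end{align*}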
We can now define XCY consistency in a similar way to causal memory [1], i.e., by taking the ⇝ operator and imposing that each process sees the execution of operations in an order that respects ⇝:

Definition. An execution G is XCY consistent if, for each process pᵢ, there is a serialization of all the write events and pᵢ's read events of G that respects ⇝.
Intuitively, what we capture is that, after an operation b from one lineage reads the value written by an operation a from another lineage, any further operations that depend on b must observe the effects of the entire L(a) (and not just a itself, as in classical definitions). In the context of our post-notification example, given that writing the post and the notification belong to the same lineage, reading the notification will not only establish a causal dependency to that operation, but also to the post itself. This causal dependency will then be transitively carried to subsequent operations, namely reading the post (which thus precludes a post not found scenario).

How XCY relates to Lamport's causal consistency. Given the commonalities between XCY and traditional causal consistency definitions [1,43], it is natural to ask whether one is stronger than the other. The answer is that XCY is stronger (i.e., more restrictive) than Lamport's causality. This is because a read operation adds a causal dependency to the entire lineage, and not just to the operation that wrote the value that was read. In other words, read operations that observe the result of a particular write operation require causally subsequent operations to observe the entire offshoot of execution branches resulting from the external client request that originally generated the observed write.
Fig. 3 illustrates this distinction between the happens-before relation as defined by Lamport [43] and extended by Ahamad et al. [1], and XCY's definition of ⇝. This example includes two requests: request 1 (R₁), highlighted in orange, and request 2 (R₂), in blue. R₁ begins with an initial action that triggers multiple subsequent events across various services. Meanwhile, R₂ originates from reading a value previously written by R₁ at Service A. Note that these two events, write(y) and read(y), are considered to be causally related according to the traditional happens-before relation by Lamport [43] and extended by Ahamad et al. [1]. Then, after that relation was established, R₂ percolated through the application and one of its branches ended in Service B, where it performs read(x). Furthermore, one of the branches of R₁ performed a write operation, write(x), also at Service B. At this point, this raises the question of whether write(x) and read(x) should be causally related. The answer in Lamport's happens-before relation is that these events are concurrent, since they come from different branches of execution, meaning that they can be ordered in any way. In contrast, XCY is more restrictive than Lamport's causality definition because, as soon as there is a read of a value that was written by another lineage (such as read(y) and write(y)), a new relation is established between lineage R₁ and the new event from R₂, which means that all events from R₁ are ordered before read(y) and the causally subsequent events from R₂. Therefore, within XCY, read(x) should wait for the effects of write(x) to be visible.
Note, however, that for scalability and incremental deployment considerations, our implementation enables developers to relax XCY by selectively choosing a relevant subset of operations, as we will describe next.

Enforcing XCY
This section presents the main design choices of Antipode, a system that enforces XCY in a scalable way.

Tracking dependencies in Antipode
Keeping track of dependencies within lineages entails a design choice that was articulated in a recent system [12], namely between potential and explicit causality. Potential causality refers to the traditional definition [43], where all possible influences via (transitive) dependencies are implicitly established. Transparently tracking all dependencies leads to large dependency graphs, which can degrade the performance and scalability of the system [13,17,20,51]. In turn, explicit causality [13,41] requires applications to explicitly identify and declare dependencies between write operations. For example, applications can mark these dependencies based on user interactions (e.g., replying to a post establishes a dependency between the post and the reply). This results in smaller dependency graphs (and hence better performance at scale), at the cost of requiring developer intervention and knowledge.
In Antipode, we aim to strike a balance between the two approaches, namely by automating the dependency-tracking process to the fullest extent, while giving developers control over relevant dependencies so that the dependency set is reduced, leading to improved performance at scale.

Implicit dependency tracking. By leveraging context propagation tools, such as the ones used for distributed tracing [40,50,53,56,60,63], we are able to automatically track dependencies across the graph of traversed services that communicate via message passing. However, this is not sufficient to track and propagate dependencies through replicated datastores (due to writing and then reading from the same key, in possibly different replicas). To address this, we use a bolt-on approach [14], where the interaction with the underlying datastores is conducted indirectly via a shim layer, allowing datastores to remain unchanged. This enables us to interpose write operations and automatically capture dependencies across replicas of traversed datastores. Both techniques allow us to collect dependencies gradually and automatically over time, as requests percolate through the application.

Explicit dependency tracking. To allow developers to explicitly add or remove dependencies from the current lineage context, Antipode provides a generic append and remove API. This granular control over dependency management enables developers to, in exceptional cases, capture dependencies that were not automatically detected and remove irrelevant dependencies for an optimized user experience.
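As an illustration of how the two modes compose, the following is a minimal Python sketch (all names are hypothetical; Antipode's actual shims are per-datastore, as described in §6): the shim grows the lineage implicitly on every interposed write, while append and remove remain available for explicit corrections.

    class LineageShim:
        """Bolt-on wrapper; the underlying datastore is unchanged."""
        def __init__(self, datastore):
            self.ds = datastore

        def write(self, key, value, lineage):
            # The underlying datastore generates a unique write identifier.
            wid = self.ds.write(key, value)
            # Implicit tracking: return the lineage extended with the new write.
            return lineage | {wid}

    # Explicit control, for the exceptional cases described above:
    # lineage = append(lineage, dep)   # capture a dependency the shim missed
    # lineage = remove(lineage, dep)   # drop a dependency not worth enforcing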
In addition, Antipode further reduces the dependency set by dropping the ongoing set when the execution ends (or when stop is called). While this is Antipode's default behavior, developers can selectively override it by explicitly calling the transfer procedure. This procedure transfers the dependency set from one lineage to a subsequent one, explicitly establishing the transitivity between them.
The rationale behind this design choice is simple: if a lineage already has a high number of dependencies, blindly transferring dependency sets between lineages might result in an explosion of the dependency graph, a challenge even traditional causal consistency approaches wrestle with [13]. This is especially relevant for objects that are constantly read and written (known as linchpin objects [2]). By giving developers the ability to control this behavior, we empower them to make informed decisions to manage the dependency graph effectively and optimize performance. While this API increases the programming overhead and changes the original semantics (namely the transitivity rule of the ⇝ operator), we found this to be an acceptable trade-off that promotes scalability, since, by default, long dependency chains across lineages are truncated.
As a result of the previous design choice, developers are required to use transfer to ensure a correct XCY order of the application. To illustrate this scenario, we extend our running example (Fig. 2) so that, before writing a new post, user Alice blocks her follower Bob by writing to an access-control list, held in geo-replicated storage. This results in two lineages: L_block, the request from Alice to block user Bob in the acl-storage, and L_post, the request from Alice to create a post. In this case, after Alice blocks Bob, Bob should not receive a notification for the subsequent post. However, this would happen if the acl-storage replication were slower than the post-storage replication, allowing Bob to see the notification and the post, resulting in an XCY consistency violation. To correct this scenario with Antipode, the dependency set of L_block (containing the write to the acl-storage) is copied to L_post by having the developer explicitly call transfer(L_block, L_post).
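In code, the correction amounts to a single extra call. The following is a minimal sketch (lineage variables and shim names are hypothetical, and the transfer argument order follows the usage above):

    l_block = root()                                        # Alice blocks Bob
    l_block = acl_shim.write("alice/acl", {"bob"}, l_block)
    # By default, l_block's dependency set is dropped when this request ends.

    l_post = root()                                         # Alice writes a post
    transfer(l_block, l_post)   # copy l_block's dependencies into l_post
    l_post = post_shim.write(post_id, post_data, l_post)
    # A later barrier on the notification's lineage now also waits for the
    # acl-storage write, so the block is visible before Bob is notified.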
Overall, the combination of the implicit and explicit approaches (1) facilitates the tracking of dependencies within a lineage, while also (2) preventing the system from amassing huge sets of causal dependencies, and (3) allowing developers to pinpoint which lineages are logically connected.

Enforcing dependencies in Antipode
While dependency tracking provides one dimension of XCY, there is also the need to enforce the visibility of captured dependencies before a new operation takes effect. Traditionally, dependency enforcement is done implicitly, i.e., the underlying service is able to resolve a list of causal dependencies without developer intervention. In a cross-service setting, however, this approach has a significant drawback: it requires the services to enforce the set of dependencies at every read or write operation. This is undesirable in our context since it requires frequent cross-service communication, which introduces unacceptable delays.
For this reason, Antipode uses explicit dependency enforcement, allowing developers to select the places where XCY dependencies must be enforced.To this end, developers place barrier calls in selected locations of their applications.
This primitive takes the current lineage and enforces an order of operations that is consistent with the definition of XCY by unpacking the causally dependent operations that are currently being carried by the lineage. It then enforces the visibility of these operations at the corresponding services by invoking the barrier operation, which has the semantics of blocking until those dependent operations are made visible (or superseded by more recent operations). This approach has two advantages. First, it provides developers with a fine-tuned balance between correctness (by enforcing important dependencies) and performance (by bypassing irrelevant ones). Second, it gives developers control over the best location for that enforcement to happen, which is crucial to avoid negatively affecting the user experience. We elaborate on barrier placement in §6.3 and showcase its trade-offs in §7.4.
One argument that can be made against barrier is that it is as explicit as today's application-level solutions, since both of them require the developer to manually select its locations. What makes Antipode's approach better suited is not only barrier, but its combination with implicit/explicit dependency tracking, which keeps services loosely coupled and does not require end-to-end knowledge of what to enforce. For instance, in the previous ACL example, barrier enforces dependencies that were automatically gathered from all datastores involved in the post lineage (acl-storage and post-storage) in a way that does not require knowledge about which systems were involved and how they are implemented.
As we mentioned, a downside of relying on developer input for enforcing the visibility of dependent operations is that not all consistency violations will necessarily be prevented, and some undesired behaviors may be observed when developers do not place the required barrier calls. We envision that Antipode can also be helpful in this context by additionally working as a testing tool: instead of exhaustively trying to prevent every possible variant of XCY violation, developers can (as part of their development cycle) use Antipode to incrementally correct them.

Antipode
Antipode is an application-level library for enforcing XCY. Table 2 outlines Antipode's API operations, and Fig. 5 illustrates their interactions. Integrating Antipode entails three concerns for service developers. First, datastore invocations (read and write) are replaced with proxy calls to Antipode's Shim API, in the same manner as prior work [14]. Second, Antipode's Lineage API piggybacks on the system's context propagation [49,54], which requires instrumentation if not already present in the system. In practice, context propagation is widespread in microservice systems, as it is a prerequisite for commonplace tools like distributed tracing [40,50,53,56,60,63]. (In our evaluation, for example, context propagation already exists in the benchmark systems.) Third, to actually enforce causal dependencies, developers selectively add calls to Antipode's barrier API to block an execution until its causal dependencies are satisfied.

End-to-end flow. Internally, Antipode comprises several components that together enforce XCY. Fig. 4 depicts the end-to-end flow of the example from §2.2, annotated with Antipode interactions; we refer to the numbers in the figure in our description. ① The request begins at the post-upload service, which starts a new lineage. The lineage is passed with the RPC to post-storage. ② The call to write on the post-storage database is proxied via Antipode's Shim API, and the lineage is included as an argument. The write call returns an updated lineage that reflects this new database operation. ③ In its RPC response, post-storage includes the updated lineage. Likewise, post-upload then passes the updated lineage with the RPC to notifier. ④ The call to write on the notifier queue is proxied via Antipode's Shim API, passing the lineage as an argument. ⑤ When the notification has replicated and is read in Region B, the read call also returns the lineage. ⑥ The notifier includes the lineage in calling follower-notify.
Here, the follower-notify service calls barrier. ⑦ Internally, the call to barrier at follower-notify will inspect the dependencies contained in the lineage, then await replication to finish at the relevant datastores. ⑧ barrier will only return once replication has completed; follower-notify can now safely read the post and deliver the notification to followers.
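In application code, this flow might look as follows; a schematic Python sketch with hypothetical service stubs, not Antipode's actual integration:

    def upload_post(user_id, post_data):              # post-upload, Region A
        lineage = root()                              # ①: start a new lineage
        post_id, lineage = post_storage.write_post(   # ②③: shim write returns an
            post_data, lineage)                       # updated lineage via the RPC
        notifier.notify(user_id, post_id, lineage)    # ④: lineage rides the RPC

    def on_notification(event, lineage):              # follower-notify, Region B
        barrier(lineage)                              # ⑦⑧: block until the post
        post = post_storage.read_post(event["post"])  # write is visible locally
        deliver_to_followers(event, post)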

Core API
  barrier(L)              Enforces the lineage's dependencies
Shim API
  L′ ← write(k, v, L)     Writes key along with the lineage
  v, L ← read(k)          Reads key and returns the lineage
  wait(L)                 Waits for all the lineage's dependencies
Lineage API
  L ← root()              Initializes a lineage in the running process
  stop(L)                 Closes the lineage in the running process
  append(L, dep)          Appends a dependency to the lineage
  remove(L, dep)          Removes a dependency from the lineage
  transfer(La, Lb)        Transfers Lb's dependencies into La
  s ← serialize(L)        Serializes the lineage
  L ← deserialize(s)      Deserializes the lineage

Table 2. Antipode API reference. Shim-layer method arguments might change according to the underlying datastore.

Creating and updating lineages
We implement lineages as a set of write identifiers. A write identifier uniquely identifies a write to a datastore (e.g., a ⟨datastore, key, version⟩ triple [45]). Antipode relies on the underlying datastore to generate the unique write identifiers (e.g., at ② and ④), by assuming a versioned key-object model (we believe that this does not reduce generality; e.g., FlightTracker [59] makes a similar assumption). Furthermore, many existing datastores natively offer this model, such as the rowversion of Azure SQL Database [11] or hlc in CockroachDB [25], while others can be trivially extended to support it, such as AWS DynamoDB [6]. These identifiers are later used by calls to barrier (e.g., at ⑥), in order to check if writes have been replicated. Every Shim API write call takes a lineage as an argument and returns an updated lineage. The returned lineage simply extends the lineage given as an input argument to include the new write identifier.
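A lineage can therefore be sketched as little more than a set of such triples; in Python (field names are our assumptions):

    from dataclasses import dataclass

    @dataclass(frozen=True)
    class WriteId:
        datastore: str   # which shim can resolve this write
        key: str
        version: int     # datastore-generated, e.g., rowversion or hlc

    # A lineage is a set of WriteIds; each Shim write() returns the input
    # lineage extended with the identifier of the new write.
    def extend(lineage, wid):
        return lineage | {wid}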

Propagating lineages
Antipode propagates lineages in two dimensions: alongside end-to-end requests as they traverse services via RPC calls (e.g., ③, ⑥); and with data values as they are replicated within datastores (e.g., ⑤, ⑦).

Request propagation. To maintain and propagate lineages with end-to-end requests, developers use Antipode's Lineage API. At each point in time when a request is executing in a thread, it will have a corresponding lineage; typically, this is stored in a pre-existing (thread-local) request context [50]. The root API call initializes an empty lineage; it is used only at the beginning of a request's execution, before any reads or writes have occurred. Conversely, the stop API call discards a lineage from the request context. In practice, stop calls are rare because execution tends to just end, discarding contexts (and lineages) in the process. After a call to write, the returned lineage is written to the request context. Services must include their lineages with all RPC requests and responses. The transfer API call establishes continuity between two lineages by combining their dependency sets (as detailed in §5.1).

Datastore propagation. Antipode propagates lineages alongside data values within datastores. For each call to the Shim write API, Antipode serializes the lineage given as an argument and writes it alongside the data value in the underlying datastore. For a subsequent call to the Shim read API elsewhere, Antipode calls read′ (i.e., the read operation from the underlying datastore) and deserializes the data value and its corresponding lineage. The caller of read can then combine the returned lineage with their own current lineage using the Lineage transfer API. Datastore propagation requires datastore-specific Shim Layer implementations. However, as described below, this entailed no more than 50 LoC for each of the 8 datastores in our evaluation.
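The datastore side of this propagation can be sketched as a shim that stores the serialized lineage next to the value; a Python sketch, with the envelope format being our assumption:

    import json

    class DatastoreShim:
        def __init__(self, datastore):
            self.ds = datastore

        def write(self, key, value, lineage):
            # Store the serialized lineage alongside the data value, so it
            # replicates with the value inside the unchanged datastore.
            envelope = {"value": value, "lineage": serialize(lineage)}
            wid = self.ds.write(key, json.dumps(envelope))
            return lineage | {wid}

        def read(self, key):
            envelope = json.loads(self.ds.read(key))
            # The caller can transfer() the returned lineage into its own.
            return envelope["value"], deserialize(envelope["lineage"])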

Enforcing consistency
Antipode's barrier API call enforces the visibility of a lineage. It takes a lineage as an argument and will block until all writes contained in the lineage are visible in the underlying datastores. Internally, a barrier will inspect the write identifiers in the lineage and contact the corresponding datastores. For each datastore, barrier will call the datastore-specific wait API, which will block until the write identifier is visible in that datastore. Note that wait is datastore-specific because visibility depends on the design choices and consistency model of the underlying datastore. Once wait has returned for all identifiers in the lineage, barrier will return.
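Internally, barrier can thus be sketched as grouping write identifiers by datastore and delegating to each shim's wait; a Python sketch, where the shims registry is hypothetical:

    from collections import defaultdict

    def barrier(lineage, shims):
        # Group the lineage's write identifiers by their home datastore.
        by_datastore = defaultdict(set)
        for wid in lineage:
            by_datastore[wid.datastore].add(wid)
        # Delegate to each datastore-specific wait; visibility semantics
        # depend on that datastore's design and consistency model.
        for name, wids in by_datastore.items():
            shims[name].wait(wids)
        # Returning here means every write in the lineage is visible.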
For developers adopting Antipode, placing barrier calls is the main new implementation decision. Developers are free to decide where in the code barrier should be called. Naïvely, we could place a barrier call immediately preceding any read call, and this would achieve XCY. While this fully automated solution is attractive, placing barrier on the critical path of every read request would add unacceptable delays and lead to user-visible slowdowns (as we show in §7.4). For a better user experience, it may be more sensible to call barrier earlier in a request's execution; Antipode gives developers the flexibility to make this choice.
To guide developers in choosing barrier locations, Antipode can work as a passive consistency checker by providing a dry-run mode, which allows developers to simulate the enforcement of barrier locations without actually enforcing them. This procedure gives developers insights into lineages that could not be enforced during the first attempt, which might hint at the presence of cross-service inconsistencies. In our experience, we found that relationships between different datastores can be empirically detected by developers, namely from commonalities between their data models and schemas. E.g., notification objects in notifier have a post-id referring to the corresponding post in post-storage. These foreign-key-like relationships provide a practical way of identifying necessary barrier locations.
We also implemented other variants of barrier: one that accepts a timeout, and an asynchronous barrier that triggers a callback to application code once dependencies are visible. Furthermore, we implemented a practical optimization strategy specifically tailored for geo-replicated datastores. This involves implementing the wait procedure to enforce dependencies only from replicas that are co-located with its caller, thereby avoiding (whenever the underlying datastore allows it) global enforcement.

Implementation
Antipode APIs.We have implemented Antipode's Lineage API and Core API in C++ (250 LoC), Java (175 LoC), and Python (40 LoC).Antipode piggybacks lineage metadata on OpenTelemetry baggage [53] and thus requires minimal additional implementation for lineage propagation.Shim layers.Antipode requires a new shim layer implementation for any supported datastore.The shim layer's purpose is to implement datastore-specic lineage propagation and wait logic.This often requires a one-time change by the developer to the underlying data model schema, which is still preferable to changing the complex internals of the underlying datastores.Note that, in a bolt-on approach, concerns such as replication, fault-tolerance, and availability are delegated to the underlying datastore [14].
As we mentioned, Antipode is able to enforce XCY irrespective of the consistency level of the underlying systems, as long as it is possible to implement wait with at least monotonic-reads [62] semantics. For example, even though DynamoDB is originally eventually consistent (which could lead to semantics that make it hard to ensure that an object is visible), we were able to implement wait by simply leveraging the available strongly consistent reads [8]. We implemented Antipode shim layers for the following datastores: MySQL, DynamoDB, Redis, S3, SNS, AMQ, MongoDB, and RabbitMQ. No shim layer implementation exceeded 50 LoC.
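As a sketch of this idea for DynamoDB (using boto3; the table and attribute names are our assumptions, and a production version would bound the polling):

    import time
    import boto3

    table = boto3.resource("dynamodb").Table("post-storage")

    def wait(write_ids):
        # Poll with strongly consistent reads until each write identifier's
        # version is visible (or superseded by a more recent one).
        for wid in write_ids:
            while True:
                resp = table.get_item(Key={"id": wid.key}, ConsistentRead=True)
                item = resp.get("Item")
                if item is not None and item.get("version", -1) >= wid.version:
                    break
                time.sleep(0.05)   # back off before re-checking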
All code is open source and publicly available [10].

Evaluation
We evaluate Antipode in terms of its effectiveness and performance by applying it to three benchmarks representative of real-world, microservice-based applications: a Post-Notification microbenchmark, the DeathStarBench benchmark [36], and the TrainTicket benchmark [66]. Our analysis is structured around two questions, which capture the main costs and benefits of using Antipode:
• How prevalent are XCY violations, and does Antipode effectively prevent them?
• What is the overhead of enforcing XCY dependencies?
Case studies

Post-Notification. As a microbenchmark, we implemented a serverless version of the post-notification example in §2.2, using various public-cloud geo-replicated storage solutions. In our implementation, we have two cloud functions (R and W), which access off-the-shelf services for the post-storage and notifier logic. Each client request spawns a W call, which writes a new post to post-storage and then creates a new notification in the notifier. A new R is spawned when a new notifier replication event is received. For this scenario, we consider that an XCY violation occurs when reading a post outputs object not found. We solve this violation by placing a barrier right after R receives the notification replication event. For the off-the-shelf datastores, we used our previously developed shim layers. These changes resulted in modifying less than 20 LoC.

DeathStarBench. DeathStarBench is a suite of microservice-based common web applications. One such application is a social network where users can perform standard actions like writing posts, following users, and reading their timelines. In comparison to the first benchmark, DeathStarBench has a scale that is closer to a real-world application: it has more than 30 unique microservices, with a mix of datastores, cache services, and pub-sub queues, mainly implemented in C++. For our evaluation, we extended the original services with geo-replication logic.
We focus on the interaction where a user publishes a new post, which causes an asynchronous task to be placed on a pub-sub queue. When that message is processed, the corresponding post is fetched, and the timeline of each follower is updated with the contents of the post. An XCY violation occurs when the follower tries to read a post and it outputs object not found. Antipode solves this error with a barrier call right after it dequeues the notification object. We implemented lineage tracking by leveraging the already existing context propagation tool (Jaeger). We developed the shim layers for the respective datastores (RabbitMQ and MongoDB).

TrainTicket. Developed as a testbed for replicating industrial faults, the TrainTicket benchmark [66] is a microservice-based application that provides typical ticket-booking functionalities, such as ticket reservation and payment. It is implemented in Java and consists of more than 40 services, including web servers, datastores, and queues. We focus on the XCY violation that occurs when a user cancels a ticket. This operation is split into two tasks: (a) changing the status of the ticket to cancelled, and (b) refunding the ticket price to the client. These events are performed by different services, interacting with different datastores. A violation happens when the refund (b) is delayed, resulting in the customer not seeing the refunded amount right away. This scenario was identified as a prevalent issue in the fault-analysis survey performed by the benchmark authors [66]: using asynchronous tasks within a request might result in events being processed in a different order, which might lead to incorrect application behavior. Unlike the previous applications, no replication was needed to observe an XCY violation. Antipode was added by placing a barrier before returning the cancellation output to the user. We detail the consequences of this in §7.4. We implemented lineages leveraging the already existing context propagation tool and reused previously developed shim layers, thereby fixing the violation in just 10 LoC.

Experimental setup
Post-Notification. We deployed the Post-Notification application on the AWS Lambda platform. For the datastores, we used several off-the-shelf products with built-in global replication features. For MySQL, S3, and Redis, the post object size is roughly 1 MB. For DynamoDB, post objects are 400 KB (the maximum allowed object size). Notification objects, in turn, are a ⟨notification-id, post-id⟩ pair of about 120 B. For each experiment, we submit 1000 post-creation requests at the Frankfurt (EU) data center, and notifications were read from a data center in Central US (US).

DeathStarBench. We deployed DeathStarBench on the Google Cloud platform, through a Docker-based deployment. Each

Does Antipode prevent XCY violations?
As a first experiment, we determined the prevalence of XCY violations and validated whether Antipode prevented them.

Post-Notification. Table 1 shows the percentage of inconsistencies found. The observed differences in their prevalence are caused by different levels of delay by different services in replicating data asynchronously.
To confirm this, we ran an experiment where we allow more time for post replication by adding an artificial delay before publishing the notification. Fig. 6 shows the results of this experiment, where each line corresponds to a different post-storage datastore type (the notifier datastore is always SNS). The results show that, as the notification delay increases, fewer inconsistencies are found. This is because, by adding a delay before publishing the notification, we allow the post to replicate sooner than the notification, and hence reduce the possibility of an XCY violation.
Finally, we report that, regardless of the combination of individual datastore consistency semantics, applying Antipode always corrected this inconsistency.

DeathStarBench. On average, we observed 0.1% violations for the US→EU replication pair, whereas for the US→SG replication pair we observed 34%. We note that we found a standard deviation of 42% for the US→SG scenario. A likely explanation for this discrepancy is a mix between the network conditions and MongoDB's replication protocol (which is reported to suffer in the presence of network latency [52]). When we applied Antipode, the inconsistency was always corrected.

TrainTicket. On average, 0.57% violations were found in normal behavior; this relatively low value is expected for an application that has no replication and where all services are in the same datacenter. When we applied Antipode, the inconsistency was always corrected.
What is the overhead of enforcing XCY dependencies?

Preventing XCY violations imposes a coordination penalty on the application, which materializes when Antipode enforces the visibility of a set of dependencies in a barrier call. In this set of experiments, we quantify the performance overhead of Antipode in terms of latency and throughput. In addition, we introduce a new metric:

Consistency Window. This refers to the time window between one client issuing an initial write and another client attempting to read the written data. We measure the consistency window regardless of whether a consistent result is returned; in baseline experiments, many attempted reads result in XCY violations. However, when Antipode is used, the consistency window represents the time-to-consistency, since the barrier call prevents progress until a consistent read is possible.

Post-Notification. In Fig. 7 we show the results of an experiment that measures the consistency window for the Post-Notification application, comparing its original version with the one using Antipode. For this application, we consider that the consistency window spans from the moment the post is written at W until R tries to read it. In the original application, reads are allowed to proceed immediately, despite returning an inconsistent result. With Antipode, the barrier call will block until a consistent result is available. Consequently, the consistency window of the application increases proportionally to the datastore's replication delay. This delay is substantially different across datastores, and thus the consistency window varies based on how long barrier must block. For instance, AWS states that S3 can take up to 15 minutes to fully propagate an object [9]. In contrast, MySQL uses a faster replication scheme, and propagation happens within 1 second of the initial write [7]. These observations are consistent with the longer consistency windows in Fig. 7, but also with the measured inconsistencies in Fig. 6: e.g., with 50 seconds of artificial delay, S3 presents a 20% chance of observing XCY violations. In this case, Antipode was able to fix violations by waiting for a replication confirmation from S3, which took ≈18 seconds on average. Overall, we conclude from this experiment that the overheads induced by Antipode's lineage propagation instrumentation and shim layer mechanisms are negligible, and that the increased duration of the consistency window stems almost exclusively from the replication delays of the underlying systems and the consequent wait for consistency.

DeathStarBench. In Fig. 8 we compare DeathStarBench with and without Antipode, under two replication pairs: US→EU and US→SG. For this application, we consider that the consistency window ranges from the post being written to the datastore (MongoDB), until the notification is read from the message queue (RabbitMQ) and the corresponding post is fetched to be added to the followers' timeline.
DeathStarBench. In Fig. 8 we compare DeathStarBench with and without Antipode, under two replication pairs: US→EU and US→SG. For this application, we consider that the consistency window ranges from the moment the post is written to the datastore (MongoDB) until the notification is read from the message queue (RabbitMQ) and the corresponding post is fetched to be added to the followers' timelines.

The left side of Fig. 8 shows the throughput-latency results of the experiment from the point of view of the post writers. Since we placed the barrier call right after the asynchronous read of the notification object from RabbitMQ, the impact of the barrier call is not felt by the writer. Therefore, in this case, we only observe the effect of creating and propagating lineages and of using the shim layer.

Regarding lineage metadata size, we found that the maximum size was below 200 bytes; this was also the maximum metadata size across all experiments. As a follow-up, we use the Alibaba dataset to assess how metadata size would fare in a realistic deployment. Assuming the worst-case scenario where all stateful operations are part of the dependency chain, we found that for 99% of requests the maximum metadata size is below 1 KB and, on average, just 200 bytes. In addition, we assessed the impact and overhead of modifying the datastore schema to accommodate Antipode's metadata, as summarized in Table 3. The results show that the increase in average object size is under 200 bytes, in line with the reported metadata size. The only notable exception is MySQL, where the average object size increased by 14 KB, which we attribute to the more complex data structures surrounding the new column and index created for lineage identifiers.
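As a rough illustration of why lineage metadata stays small, consider one compactly encoded entry per stateful write in the dependency chain. The field names below are assumptions for illustration, not Antipode's actual encoding.

    import json

    # One illustrative entry per stateful write in the request's
    # dependency chain: datastore id, object key, and written version.
    lineage = [
        {"ds": "mongodb-posts", "key": "post:42", "ver": 17},
        {"ds": "rabbitmq-notif", "key": "notif:42", "ver": 1},
    ]

    encoded = json.dumps(lineage, separators=(",", ":")).encode("utf-8")
    print(len(encoded), "bytes")  # a few tens of bytes per entry

With entries of this size, even a dependency chain of a few dozen stateful operations remains under 1 KB, consistent with the Alibaba-trace estimate above.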
Overall, Antipode is able to fix all XCY violations without incurring a significant performance impact, as we observe a maximum 2% penalty on DeathStarBench throughput.
The placement of the barrier creates a wait for consistency that postpones notification message delivery, increasing the consistency window. The right side of Fig. 8 shows that, at peak throughput, this increase accounted for a maximum of 2ms for US→EU and 6ms for US→SG. The extension of the consistency window is significantly smaller than in the previous application due to the use of MongoDB, which has a faster replication strategy.
This scenario highlights the advantages of explicit enforcement, which allows developers to choose the best placement for barrier in order to minimize the negative effects on the end-user experience. In this instance, it resulted in just a small delay in the time to deliver the notification.

TrainTicket. In contrast to the previous two applications, where the XCY violation resulted from a race condition between the replication protocols of two different datastores, in the TrainTicket application the identified XCY violation results from the "lack of sequence control in the asynchronous invocations of multiple message delivery microservices" [66]. More concretely, the two messages that lack coordination are the cancel order message and the refund money message. In this scenario, the goal is a consistent outcome where both the order is cancelled and the refund is issued.
In order to fix this scenario, the barrier call must be placed in the request's critical path, thereby forcing the user to actively wait for the conclusion of both actions. Fig. 9 showcases the impact of this enforcement on performance. Compared with the original application, the Antipode-corrected version (with no inconsistent outputs) exhibits just over 15% overhead on throughput and 17% on latency.
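The following self-contained sketch contrasts this with the reader-side placement used for DeathStarBench: here the barrier sits in the handler's critical path, so the response waits for both asynchronous effects. Service and message names are illustrative, not TrainTicket's actual code.

    import threading
    import time

    done = {"cancel-order": False, "refund-money": False}

    def deliver(message, delay):
        # Simulates an asynchronous message-delivery microservice
        # applying its effect after some delay.
        time.sleep(delay)
        done[message] = True

    def cancel_order():
        # Fire both messages asynchronously, as in the original app.
        for message, delay in (("cancel-order", 0.1), ("refund-money", 0.3)):
            threading.Thread(target=deliver, args=(message, delay)).start()
        # Critical-path barrier: the user response is withheld until both
        # effects are applied, so the state "cancelled but not refunded"
        # is never observable, at the cost of user-facing latency.
        while not all(done.values()):
            time.sleep(0.01)
        return "order cancelled and refund issued"

    print(cancel_order())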
This TrainTicket scenario highlights the trade-offs between performance and correctness that barrier offers developers. For instance, in the DeathStarBench scenario we were able to hide the consistency window penalty on the reader side, outside the writer request's critical path, whereas in TrainTicket that is not possible. Consequently, due to the placement of the barrier in the critical path, the consistency-window delay the barrier imposes is directly reflected in the throughput-latency analysis (around 4ms, which corresponds to the same 17% increase). In this instance, the developer has to decide whether to expose the user to increased latency in exchange for guaranteeing that no inconsistent intermediate states are observed.

Related Work
Existing noteworthy systems that address cross-service consistency take one of the following approaches: they wrap requests in other abstractions, resort to centralized coordination mechanisms, add transactions to the design, or propose an overall revision of the system architecture.

Wrapping approaches. Developed at Facebook, the FlightTracker [59] metadata server was designed to provide read-your-writes (RYW) guarantees across a variety of datastores. It identifies a user session through a ticket abstraction, with which all of a user's write operations are associated. Tickets are created and updated through a metadata server, and are passed between different services and datastores.
FlightTracker requires applications to correctly identify a user session, which, as its authors acknowledge, is not always easy. In contrast, request contexts in Antipode can be started on demand without the need for user sessions. Furthermore, FlightTracker requires changes to the datastores' public and internal APIs. We argue that this is a heavy implementation burden, whose cost is acceptable only to a few large industry players. Antipode requires no such changes, and is usable with off-the-shelf cloud-provider datastores. Additionally, FlightTracker is an all-or-nothing approach that, unlike Antipode, does not allow for gradual correction of violations.

Coordination-based approaches. Traditionally, for multiple systems to interact, the use of a logically centralized coordination mechanism like ZooKeeper [39] was the natural design choice. However, strongly consistent coordination systems introduce a performance bottleneck and go against microservice principles. In particular, the proponents of this class of architectures encourage the community to embrace eventual consistency due to its better performance and higher scalability [34]. Antipode provides cross-service consistency in a way that is decentralized, can be gradually adopted, and allows for more behaviors than strongly consistent solutions.

Transaction-based approaches. Distributed transactions can also be applied in this context, e.g., in the form of 2PC protocols [15]. Like coordination-based approaches, distributed transactions suffer from low performance [61]. Faced with this problem, the community migrated towards an approach known as Sagas [37]. A saga is a continuous sequence of local transactions; if any saga transaction fails, a costly series of compensating transactions that undo the already-applied effects must be run. Although sagas gained acceptance within the community [22,28], they fall short when compensating mechanisms are hard or impossible to achieve. Reversal is especially challenging when transactions trigger third-party side effects [4]. Furthermore, sagas often still rely on an orchestrator-like entity that sequences the steps of a saga.
Antipode's approach is fundamentally different for several reasons: (1) it is decentralized and therefore does not require any orchestrator; (2) it can be gradually adopted, since it does not require coordination between all entities in the same request; and (3) it does not require compensation mechanisms, since violations are prevented in the first place.

Full-rewrite approaches. There are also proposals that opt for a complete redesign of the application, often ending up merging different services, and their respective datastores, into a single one. A good example of this approach is Diamond [64], where a reactive application with two different datastores (distributed storage and a notification service) was merged into a single service that provides both functionalities. In fact, service redesign is a hot topic in the microservice community [22,35]. Aegean [4] proposes a redesign of the replication layer of datastores based on state machine replication [57], with the intention of ensuring that dependent services always see a strongly consistent view of ongoing operations. While this approach provides strong consistency across services, it assumes that the individual replicated systems implement a serializable state machine, unlike Antipode, which supports different underlying consistency models.
In our view, these represent alternative approaches to the problem, over which Antipode has the advantage of not requiring profound changes to existing code bases or protocols.

Conclusion
Microservices emerged as an architectural style that provides loosely coupled and independently deployable services, leading to good scalability, performance, and maintainability. While they fulfilled this promise, data consistency was sacrificed, and developers were left to accept eventual consistency as the norm.
We presented Antipode, a library with a simple yet powerful API that allows developers to enforce XCY consistency with: (1) no need to rewrite their entire applications, (2) no global management of large-scale applications, and (3) gradual and independent adoption, in keeping with the microservice ethos. Our evaluation with eight open-source and cloud-based datastores, using two large microservice benchmarks, shows that Antipode prevents inconsistencies with limited programming effort and low performance overhead.

Figure 1. (Left) CDF of the number of calls to services per request, and (right) CDF of unique services called per request, on Alibaba's trace dataset. For ease of presentation, we cut off the long tail to only show values within the 95th and 99th percentiles, respectively.

Figure 2. Request flow for publishing a post in the Post-Notification example application.

Figure 3. This example illustrates the difference between Lamport's → relation and XCY's ordering relation. While the red dependency is present in both definitions, the green dependency stems from the concept of lineage and is only present in the latter.

Figure 5. Overview of the interactions between a service and the Antipode APIs. write' and read' are the original operations on the underlying datastore.

Figure 6. Percentage of inconsistencies found in Post-Notification after an artificial delay was added before publishing the notification. The notifier is always SNS.

Figure 7. Consistency window in Post-Notification for the original post-storage, and with Antipode enabled. The notifier is always SNS.

Figure 8. (Left) Average throughput vs. latency and (right) consistency window (at peak 125 req/s) in the original DeathStarBench, and with Antipode enabled. Results are split between US→EU and US→SG replication pairs.

Figure 9. (Left) Average throughput vs. latency and (right) consistency window (at peak 360 req/s) with and without Antipode enabled for TrainTicket. Latency increases due to barrier placement in the request's critical path.

Table 3. Average object size increase from the original applications to the Antipode-enabled version.