A CQRS perspective with Infinispan and A-MQ

Every time we take a new step in application integration we run into blockers, but there are many technologies that can help us support a vision.

When we consider an integration strategy there are two main paradigms to support it: SOA (Service-Oriented Architecture) and EDA (Event-Driven Architecture). Both paradigms represent a new situation that the Development and Operations areas must face. Sharing data is no longer a matter of concurrency over the database, nor of an overnight replication; now every process must be considered a potential user that needs data managed by another application.

The problem

The definition of service owners reveals a problem:

Administration of the integration impact.

 

Is a transactional system responsible for the needs of the systems that consume its data?

Is the consumer responsible for the use it makes of the services?

Can this be managed through governance?

An alternative

Although governance sets rules to keep a healthy coexistence, there is an alternative that allows us to separate those responsibilities: applying a pattern called CQRS (Command Query Responsibility Segregation). I first heard about this concept in Greg Young's article “CQRS, Task Based UIs, Event Sourcing agh!”.

Now, it might seem simple to ask developers to design, build and support a second data structure… (you could try, although I'm not so sure they will do it), but what about legacy applications, or applications not developed by us? Modifying them could be very costly or even impossible, but we can take advantage of integration interfaces or extension points that let us use… events!

How can we apply this pattern?

We can extend the pattern and delegate that responsibility to another application. This application should focus on the role of a service provider, and its design should prioritize scalability.

How can we achieve scalability in the data layer? By betting on innovation and using state-of-the-art technologies such as in-memory databases, whose great strength is scalability. In my experience I have used Infinispan: it has a small learning curve and it is easy to scale. The latest versions are very robust, and the bugs in the transaction manager and the locking problems in the synchronization between instances have finally been fixed.

This gives us a database with full horizontal scalability and a data design oriented toward service provision.

This database receives the information through events (the event sourcing pattern). The advantages of this design are:

  • Decoupling
  • Data federation from several sources

The disadvantages are:

  • Temporal inconsistency with the source. We need to manage the latency between the original transaction and the time needed to process the event (see the sketch below).
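
As an illustration, here is a minimal Java sketch of how a query-side handler might deal with that latency. The event class, its field names and the versioning scheme are assumptions, not part of any specific framework: the write side stamps each event with a growing version number and the read model simply discards anything older than what it already holds.

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    // Hypothetical event published by the transactional (write) side.
    class CustomerUpdated {
        final String customerId;
        final long version;   // assumed to grow monotonically on the write side
        final String payload; // denormalized view, ready to be served by queries

        CustomerUpdated(String customerId, long version, String payload) {
            this.customerId = customerId;
            this.version = version;
            this.payload = payload;
        }
    }

    // Query-side updater: applies events to the read store (a plain Map here,
    // an Infinispan cache in the architecture described below).
    public class CustomerReadModel {
        private final Map<String, CustomerUpdated> store = new ConcurrentHashMap<>();

        public void on(CustomerUpdated event) {
            // Discard stale or duplicated events so a late arrival cannot
            // overwrite newer state; the model converges despite the latency.
            store.merge(event.customerId, event,
                (current, incoming) -> incoming.version > current.version ? incoming : current);
        }

        public CustomerUpdated query(String customerId) {
            return store.get(customerId);
        }
    }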

Infinispan

Infinispan is an in-memory database based on a key/value scheme. Its power lies in the speed of access to the data, not only because it lives in memory, but also because of the direct access by key. Access by key is a clear advantage when the lookup criteria of the services we expose match the keys used when storing the data.
Infinispan can be used either as an API embedded in the application or as a standalone server.
In this type of architecture we see ISPN as a convergence point for many applications, where one writes and the rest read, since a series of services expose the information.
As an API, embedded in the application, we get the following (a sketch follows the list):
  • Less network usage.
  • More flexibility when processing combined data.
  • Processing capacity and storage scale together.
  • No need to overload the ESB with data processing or to add an extra layer.
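
A rough sketch of the embedded (API) usage, assuming an infinispan.xml on the classpath that declares a cache named "read-model" (both names are illustrative):

    import org.infinispan.Cache;
    import org.infinispan.manager.DefaultCacheManager;

    public class EmbeddedReadModel {
        public static void main(String[] args) throws Exception {
            // The cache lives inside the service JVM: reads are local,
            // so processing capacity and storage scale together.
            DefaultCacheManager manager = new DefaultCacheManager("infinispan.xml");
            try {
                Cache<String, String> cache = manager.getCache("read-model");
                cache.put("customer:42", "{\"id\":42,\"status\":\"ACTIVE\"}");
                System.out.println(cache.get("customer:42")); // direct access by key
            } finally {
                manager.stop();
            }
        }
    }
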
As a server, we get the following (a Hot Rod client sketch follows the list):
  • Integration via the Hot Rod protocol.
  • Flexibility in scaling the infrastructure.
  • Problems, restarts or deployments of new versions of the service layer are completely independent from the Infinispan persistence layer.
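
And a sketch of the server mode, connecting through the Hot Rod Java client; host, port and cache name are illustrative (11222 is only the default Hot Rod port):

    import org.infinispan.client.hotrod.RemoteCache;
    import org.infinispan.client.hotrod.RemoteCacheManager;
    import org.infinispan.client.hotrod.configuration.ConfigurationBuilder;

    public class HotRodReadModel {
        public static void main(String[] args) {
            // The data lives in an external Infinispan server, so the service
            // layer can be restarted or redeployed without touching the data.
            ConfigurationBuilder builder = new ConfigurationBuilder();
            builder.addServer().host("ispn-node1").port(11222);

            RemoteCacheManager manager = new RemoteCacheManager(builder.build());
            try {
                RemoteCache<String, String> cache = manager.getCache("read-model");
                cache.put("customer:42", "{\"id\":42,\"status\":\"ACTIVE\"}");
                System.out.println(cache.get("customer:42"));
            } finally {
                manager.stop();
            }
        }
    }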

Integration of Legacy Applications

    • If we are dealing with a commercial application, we need to analyze its integration points. If it has none, we still have the possibility (although not a pleasant one) of implementing a mechanism that polls the database (trigger, polling daemon, etc.) to retrieve any changes.
        • How can we guarantee that every change is notified?
          1. We can make the persistence of the event part of the transaction (a sketch follows this list).
          2. We can define manual/automatic recovery mechanisms for the persistence of the event.
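
The sketch below illustrates option 1 for a legacy application that uses plain JDBC. The CUSTOMER and EVENT_OUTBOX tables are hypothetical; the idea is simply that the event row is committed, or rolled back, together with the business update, and a polling daemon (or trigger-fed process) later publishes and removes those rows.

    import java.sql.Connection;
    import java.sql.PreparedStatement;

    public class OutboxWriter {

        // Persist the business change and the event in the SAME local transaction,
        // so no change can be committed without its corresponding event.
        public void updateCustomer(Connection con, long customerId, String newStatus) throws Exception {
            con.setAutoCommit(false);
            try (PreparedStatement update = con.prepareStatement(
                     "UPDATE CUSTOMER SET STATUS = ? WHERE ID = ?");
                 PreparedStatement outbox = con.prepareStatement(
                     "INSERT INTO EVENT_OUTBOX (EVENT_TYPE, PAYLOAD, CREATED_AT) "
                         + "VALUES (?, ?, CURRENT_TIMESTAMP)")) {

                update.setString(1, newStatus);
                update.setLong(2, customerId);
                update.executeUpdate();

                outbox.setString(1, "CustomerUpdated");
                outbox.setString(2, "{\"id\":" + customerId + ",\"status\":\"" + newStatus + "\"}");
                outbox.executeUpdate();

                con.commit();   // both rows become visible together
            } catch (Exception e) {
                con.rollback(); // neither the change nor the event is persisted
                throw e;
            }
        }
    }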

 

Once we have the data loading mechanism in place, the configuration of the in-memory cluster is completely independent:

  • It can be replicated or distributed, depending on the amount of data to store, the robustness we want to give it and the data reloading time.
  • The cache store gives us the option of recovering from a node or full-cluster crash or error; doing this from the origin is often inefficient or even impossible (see the sketch below).
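
The sketch below shows such a configuration built programmatically: a distributed cache with two owners per entry, backed by a single-file cache store. Cache name, location and numbers are illustrative, and the builder API shown is the one from recent Infinispan releases, so it may need adjusting to the version in use.

    import org.infinispan.Cache;
    import org.infinispan.configuration.cache.CacheMode;
    import org.infinispan.configuration.cache.ConfigurationBuilder;
    import org.infinispan.configuration.global.GlobalConfigurationBuilder;
    import org.infinispan.manager.DefaultCacheManager;

    public class ClusteredReadModelSetup {
        public static void main(String[] args) {
            // Clustered cache manager (default JGroups transport).
            GlobalConfigurationBuilder global = GlobalConfigurationBuilder.defaultClusteredBuilder();

            ConfigurationBuilder cfg = new ConfigurationBuilder();
            // Distributed mode: each entry is kept on numOwners nodes, so capacity
            // grows with the cluster; use CacheMode.REPL_SYNC for full replication.
            cfg.clustering().cacheMode(CacheMode.DIST_SYNC).hash().numOwners(2);
            // Cache store: lets a node, or the whole cluster, recover its data
            // after a crash without reloading everything from the origin.
            cfg.persistence().addSingleFileStore().location("/var/lib/ispn/read-model");

            DefaultCacheManager manager = new DefaultCacheManager(global.build());
            manager.defineConfiguration("read-model", cfg.build());
            Cache<String, String> cache = manager.getCache("read-model");
            System.out.println("Cluster members: " + manager.getMembers());
            manager.stop();
        }
    }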

Event bus

Despite looking for alternatives, I always come to the same conclusion: the best way to implement the Event Bus is with a JMS platform. The Bus must act as a buffer of events, allowing a certain level of elasticity for the consumers and ensuring that the persistence of the events has a minimal impact on the performance and resource cost of the producer.

In order to guarantee the arrival of the events at the Bus, we have two strategies:

  1. The publisher application makes the publication part of the transaction (sketched below), meaning that:
    • If the Event Bus is not available, the transaction cannot take place.
    • The transaction takes longer.
  2. The publisher application has its own persistence and recovery mechanism, which requires:
    • A bigger infrastructure.
    • More responsibilities for the application.

Whichever strategy we choose, the objective is to guarantee greater availability and scalability of the platform. This will allow us to support an increase both in the traffic through the platform and in its availability time.
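
As a sketch of the first strategy with plain JMS 1.1 APIs: the session is transacted, so the message only becomes visible to the broker when the session commits. Tying that commit to the business database transaction would additionally require an XA/JTA setup, and the JNDI names used here are illustrative.

    import javax.jms.Connection;
    import javax.jms.ConnectionFactory;
    import javax.jms.DeliveryMode;
    import javax.jms.MessageProducer;
    import javax.jms.Session;
    import javax.jms.TextMessage;
    import javax.jms.Topic;
    import javax.naming.InitialContext;

    public class EventPublisher {
        public static void main(String[] args) throws Exception {
            InitialContext jndi = new InitialContext();
            ConnectionFactory factory = (ConnectionFactory) jndi.lookup("ConnectionFactory");
            Topic topic = (Topic) jndi.lookup("topic/BusinessEvents");

            Connection connection = factory.createConnection();
            try {
                // Transacted session: the publication is only delivered on commit().
                Session session = connection.createSession(true, Session.SESSION_TRANSACTED);
                MessageProducer producer = session.createProducer(topic);
                producer.setDeliveryMode(DeliveryMode.PERSISTENT); // survives a broker restart

                TextMessage event = session.createTextMessage("{\"type\":\"CustomerUpdated\",\"id\":42}");
                producer.send(event);
                session.commit();   // rollback() would discard the publication
            } finally {
                connection.close();
            }
        }
    }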

JMS strategy:

Publish-Subscribe scheme: it allows scalability in the message traffic and lets us add consumers without restarting or reconfiguring the infrastructure.

Durable subscribers: this feature guarantees the preservation of the messages until the consumer retrieves them.

Persistent messages: this feature preserves the messages even if the platform is restarted.

The strategy consists in putting together a pub-sub scheme where a publisher sends a message to a specific topic and n subscribers retrieve it. This gives us scalability on the subscriber side: each subscriber registers a durable subscription on the topic and, whenever it reconnects with the same client ID, resumes it. Message persistence lets us guarantee the delivery of the message.
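
A sketch of the subscriber side under this scheme, again with plain JMS 1.1 APIs; client ID, subscription name and topic are illustrative. Reconnecting with the same client ID and subscription name resumes the durable subscription, so messages published while the consumer was down are still delivered.

    import javax.jms.Connection;
    import javax.jms.ConnectionFactory;
    import javax.jms.Message;
    import javax.jms.MessageConsumer;
    import javax.jms.Session;
    import javax.jms.TextMessage;
    import javax.jms.Topic;
    import javax.naming.InitialContext;

    public class ReadModelSubscriber {
        public static void main(String[] args) throws Exception {
            InitialContext jndi = new InitialContext();
            ConnectionFactory factory = (ConnectionFactory) jndi.lookup("ConnectionFactory");
            Topic topic = (Topic) jndi.lookup("topic/BusinessEvents");

            Connection connection = factory.createConnection();
            // The client ID identifies this consumer's durable subscription on the broker.
            connection.setClientID("read-model-loader");
            Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
            MessageConsumer consumer = session.createDurableSubscriber(topic, "read-model-events");
            connection.start();

            while (true) {
                Message message = consumer.receive(); // blocks until an event arrives
                if (message instanceof TextMessage) {
                    // Apply the event to the Infinispan read model here.
                    System.out.println("Event received: " + ((TextMessage) message).getText());
                }
            }
        }
    }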

Conclusion

Application integration through services and/or events creates impact problems among applications. Responsibility segregation is a good strategy when facing the problem of administering that integration impact. In-memory databases and JMS are technologies that provide the right functionality for implementing a CQRS architecture: they allow us to decouple consumers from producers and to have structures designed for querying.

My personal experience with Infinispan for information retrieval services and JBoss A-MQ for message administration has been more than satisfactory, allowing me to run a 7×24 infrastructure with guaranteed message delivery and to provide high-traffic services, reducing response times by a factor of 10 to 100.
