Make Services Fault Tolerant & Supportable for Production Use

January 3, 2015

A lot of teams are building services for clients both internal and external to your organization. Typically, there is quite a bit of focus on succeeding from a functional sense – did we get the key requirements addressed? does it cover the plethora of rules across markets / jurisdictions? and so on. Some of the more experienced teams, consider the non-functional aspects as well – e.g. logging, auditing, exception handling, metrics, etc. and I talked about the value of service mediation for addressing these in an earlier post.

There is an expanded set of capabilities that are also necessary when addressing non-functional requirements – those that are very relevant specially when your service grows in popularity and usage. They fall under two categories: operational agility and fault-tolerance.  Here are a few candidate capabilities in both these categories:

Operational Agility / Supportability:

  • Ability to enable / disable both services and operations within a service
  • Ability to provision additional service instances based on demand (elastic scaling)
  • Maintenance APIs for managing resources (reset connection pool, individual connections, clear cached data, etc.)
  • Ability to view Service APIs that are breaching and ones that are the risk of breaching operational SLAs
  • Model and detect out of band behavior with respect to resource consumption, transaction volumes, usage trends during a time period etc.

Fault Tolerance:

  • Failing fast when there is no point executing operations partially
  • Ability to detect denial of service attacks from malicious clients
  • Ability to gracefully handle unexpected usage spikes via load shedding, re-balancing, deferring non-critical tasks, etc.
  • Detecting failures that impact individual operations as well as services as a whole
  • Dealing with unavailable downstream dependencies
  • Leveraging time outs when integrating with one or more dependencies
  • Automatically recovering from component failures

In future posts, I will expand on each of these topics covering both design and implementation strategies. It is also important to point out that both these aspects are heavily interconnected and influence each other.


Track Service Reuse Metrics

December 24, 2011

Service driven systematic reuse takes conscious design decisions, governance, and disciplined execution – project after project. In order to sustain long running efforts such as service orientation, it is critical to track, report, and get buy-in from senior management in the organization. So what metrics are useful? Here are a few:

  • Total number of service operations reused in a time period
  • Total effort saved due to systematic reuse in a time period
  • Number of new service consumers in a time period
  • Number of new consumer integrations in a time period (this includes integrations from both new and existing consumer
  • Service integrations across transports/interface points (for instance, the service operation could be accessed SOAP over HTTP, or as SOAP over JMS, or REST, etc.)

What metrics do your teams track?


Abstract Utility Functions for Effective Service Reuse

February 23, 2011

The Utility abstraction pattern enables non-business centric logic to be separated, reused, and governed independent of other services.

Utility Abstraction pattern encapsulates cross-cutting functionality across service capabilities into a separate component. For instance, cross-cutting functionality such as logging, notification, and auditing are required by several service capabilities. Dedicated services based on the utility service model guided by enterprise architecture specifications.

This pattern provides several benefits: 

  • Reduces and potentially eliminates cross-cutting logic from individual services. This keeps the individual services domain-specific and lightweight.
  • Reduces and potentially eliminates redundant cross-cutting logic that might be implemented across the service inventory. This will reduce development and testing costs while minimizing duplication.
  • This pattern also enables the reuse of utility capabilities across both business processes and service capabilities.
  • In addition to being the central component for a cross-cutting function, this abstraction will facilitate changes to implementation. For example, logging may use a file store and later switch to a database-driven solution. Likewise, with a utility abstraction component, it is simpler to migrate to an alternate provider – replace an in-house implementation with a cheaper cloud provider.

Separating the utility component also has benefits from a non-functional standpoint. The utility function can be executed asynchronously (to save response time for a service) or additional instances can be supported to enable concurrent processing.

This pattern also makes it easier to perform additional functions surrounding the core cross-cutting function. Taking logging as an example again, if archiving/backup policies change there will be one piece of logic to update rather than touch individual services. It is important to note however that this pattern can increase the size, complexity, and performance demands of service compositions.

There are a variety of strategies to realize this pattern – via service mediation layer in a service bus based architecture or using a lightweight proxy that intercepts service methods.


Service Enable Business Processes

October 9, 2010

Business processes hosted in a BPM container (whether running in-process or as an external engine) need to be exposed to consumers in your enterprise. Here’s where your SOA efforts can come in handy – these business processes can be exposed as services via interfaces provided in the service mediation layer. Doing so decouples the vendor specific business process interfaces from enterprise consumers and allows for additional horizontal capabilities to be hooked in (metrics capture, monitoring, authentication, authorization etc.).

So what can these services do? Here’s a very simplified diagram of how these services fit in:

These services are commonly referred to as Task Services and they encapsulate operations on business processes.  Common domain-relevant schemas can be leveraged from other services efforts – facilitating consistency and reduce data transformations between the business and services layers.


%d bloggers like this: