Paul Vincent

Paul Vincent is CTO for Business Rules and Complex Events Processing at TIBCO. He has been applying rule engine technologies for over 20 years in financial services, government, defense, and manufacturing. He is a contributor to the OMG and W3C standards on rule and decision modeling and interchange, and to the Event Processing Technical Society. He has a Masters in Intelligent Systems and frequently lectures on real-time decisions, event processing and operational intelligence.


DEBS2012 on Event Patterns

At DEBS last month, a few members of the EPTS Reference Architecture team tutored on the latest Functional Event Patterns list – with sample and pseudocode implementations – covering all aspects of event preparation, analysis, (complex event) detection, and reaction. From a TIBCO CEP perspective, this version mostly covers TIBCO BusinessEvents rule patterns (i.e. how these functional patterns map to a standard event-based production rule pseudocode), with a few references to BE State Models and the odd BE Continuous Query [*1]. These can be viewed alongside examples from Oracle EP, IBM WODM, IBM Stream Insight and the PROLOG-based Prova.

This remains very much a work-in-progress [*2], but should give a good idea of where we are heading. In the “real world,” it should be noted that many of these functions are often combined into a single operation (e.g. covering preparation/filtering, analysis/transformation, detection/composition and reaction/assessment in a single rule or query).
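
As a rough illustration of how the four functions can collapse into a single rule, here is a minimal Python sketch. It is not TIBCO BusinessEvents syntax and it is not taken from the EPTS pattern list; the `Event` fields, the `make_rule` helper, the threshold and the window size are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Event:
    kind: str          # e.g. "withdrawal" or "deposit" (made-up event kinds)
    account: str
    amount_cents: int


def make_rule(threshold=1000.0, window=3):
    """Build one rule that filters, transforms, detects and reacts."""
    history = {}  # account -> most recent withdrawal amounts

    def rule(event):
        # Preparation / filtering: ignore anything that is not a withdrawal.
        if event.kind != "withdrawal":
            return None
        # Analysis / transformation: convert cents to currency units.
        amount = event.amount_cents / 100.0
        recent = history.setdefault(event.account, [])
        recent.append(amount)
        del recent[:-window]  # keep only the last `window` amounts
        # Detection / composition: several events together form the pattern.
        if len(recent) == window and sum(recent) > threshold:
            # Reaction / assessment: here just a message for some downstream consumer.
            return f"ALERT {event.account}: {sum(recent):.2f} withdrawn in last {window} events"
        return None

    return rule
```

Calling the rule with three withdrawals of 400.00 on the same account returns an alert on the third call; earlier calls, and any deposits, return `None`.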

Decision Latency… Just Part of the Same Problem as Data & Process Latency

Interesting read from Gartner’s Jim Sinur, talking about decision and action latency. Jim talks about how Big Data is meant to solve decision latency – but it’s not at all clear how or why Big Data affects the latency of either designing or making decisions. Surely Big Data is about extractable information for improving the quality of decisions? Maybe I missed something critical in the Big Data hype. If anything, running analytics on larger data sets is sure to increase latency rather than reduce it! Of course, this is why real-time analytics are increasingly important to complement the traditional data analytics…

I would, however, agree that reducing decision, data and process latency is a key weapon in the fight against the costs of Big Data; you want to respond effectively to events as they occur, despite those event rates increasing: that means efficient decision engines, high performance data access, as well as responsive process engines.

Event Server as the 21st-Century App Server

Those interested in the latest use cases for CEP may want to read parts of the transcript for the TIBCO quarterly earnings call from last week. TIBCO CEO Vivek Ranadivé commented on a few of them:

“I have spoken before as to how our event-driven software platform positions us at the convergence of the most significant technology forces of our time: big data, cloud, mobility, social networking and the shift to real time. … For example, … our real-time rules engine, which detects relevant patterns amid the stream of events; … But even more impressive than any of these individual pearls is the various powerful ways in which you can string them together…”

“The first such value pack is around what we refer to as trigger-based marketing or an event-driven approach to up-selling and cross-selling your existing customer base… It’s what we’re doing today for MGM Resorts.”

Big Data vs Fast Data

There was an interesting, but not uncommon, comment (on tibbr) recently about a Proof Of Concept in the CEP space that had been completed in 3 days by the 3-person TIBCO team while the competitor team were still struggling at the 3-week mark. This despite, per certain analysts’ reports, this competitor being one of the “big guns” in CEP. In the past some could argue that high productivity is an opposing requirement to high performance / scalability; I would counter that in event processing they are closely related. Consider:

  • TIBCO BusinessEvents remains today one of the few CEP technologies to include integrated high performance datagrid technology – you develop the concept model with the necessary metadata and methods for interacting with that data, but have no need to step out into a different (database) environment
  • Large (TB-level) datasets can be accommodated in the DataGrid simply by organising several DataGrid service instances (and a fast interconnect!)
  • Without such data interaction, development teams are forced to take on new skillsets and the problems of integrating anything from (at best) other cache or datagrid technologies to (at worst) high-latency databases.

Modelling Choreography (with events, states and business rules)

This week the BCS SPA group held a fascinating session titled “Modelling Choreography” by requirements analyst Ashley McNeile.

Ashley described some of the past efforts to model and implement choreographies, using types of process algebra such as Robin Milner’s Calculus of Communicating Systems (CCS) and its derivative Pi-Calculus. However, Ashley used sequences of events and states (i.e. a state diagram) which he also compared to Michael Jackson’s formalised object lifecycles (e.g. JSD / Jackson Diagrams). Various W3C efforts have described choreographies too – e.g. WS-CDL. Of course the latest modelling construct for choreography is BPMN2!

To illustrate his practice, Ashley described an example – modelling bank account transactions via Protocol Modelling (using simple state diagrams):

  • state model 1: defined the close and withdraw events on an active account
  • state model 2: defined the freeze and release account events
  • state model 3: this had no state transitions, but defined the state by the associated constraints (or business rules)
    • if balance < 0 then account state is overdrawn
    • one cannot close an account if it is overdrawn
  • all 3 state models operate in parallel.

To analyse these state models they can be combined into a single state model (with all combinations of states, and all events), and then the unreachable states can be filtered out. The interesting thing here is (1) the analysis of state models for completeness and (2) the use of incomplete state diagrams as a business notation for textual (policy or constraint) business rules.
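
To make the “combine, then filter out unreachable states” step concrete, here is a small Python sketch. It is only my reading of the idea, not Ashley’s tooling: the three dictionaries, the `deposit` event, the rule that withdrawals are refused while an account is frozen, and the abstraction of the balance into `in_credit` / `overdrawn` are all assumptions made for the example. An event is accepted only if every model that knows about that event can accept it in its current state.

```python
from itertools import product

# Each model maps (state, event) -> set of possible next states.
LIFECYCLE = {
    ("active", "deposit"): {"active"},
    ("active", "withdraw"): {"active"},
    ("active", "close"): {"closed"},
}
FREEZING = {
    ("unfrozen", "freeze"): {"frozen"},
    ("frozen", "release"): {"unfrozen"},
    ("unfrozen", "withdraw"): {"unfrozen"},  # assumption: no withdrawals while frozen
}
BALANCE = {  # derived state, abstracted from "balance < 0"
    ("in_credit", "deposit"): {"in_credit"},
    ("in_credit", "withdraw"): {"in_credit", "overdrawn"},
    ("overdrawn", "deposit"): {"in_credit", "overdrawn"},
    ("overdrawn", "withdraw"): {"overdrawn"},
    ("in_credit", "close"): {"in_credit"},  # rule: cannot close when overdrawn
}
MODELS = [LIFECYCLE, FREEZING, BALANCE]
INITIAL = ("active", "unfrozen", "in_credit")


def alphabet(model):
    """Events that a model knows about."""
    return {event for (_, event) in model}


def states(model):
    """All states mentioned as a source or a target in a model."""
    found = {state for (state, _) in model}
    for targets in model.values():
        found |= targets
    return found


def step(state, event):
    """Composite states reachable on `event`; empty if any model refuses it."""
    options = []
    for current, model in zip(state, MODELS):
        if event in alphabet(model):
            options.append(model.get((current, event), set()))
        else:
            options.append({current})  # models ignore events outside their alphabet
    return set(product(*options))


def reachable(initial):
    """Explore the combined model from the initial composite state."""
    events = set().union(*(alphabet(m) for m in MODELS))
    seen, frontier = {initial}, [initial]
    while frontier:
        state = frontier.pop()
        for event in events:
            for nxt in step(state, event) - seen:
                seen.add(nxt)
                frontier.append(nxt)
    return seen


ALL_STATES = set(product(*(states(m) for m in MODELS)))
UNREACHABLE = ALL_STATES - reachable(INITIAL)
```

With these assumed models there are 8 combined states, and the two that get filtered out are the “closed and overdrawn” combinations, which is exactly what the constraint in state model 3 is meant to rule out.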

Other observations:

  1. These types of business rule apply to states and data; they can be extracted and modified (by a developer, or state modeller) into event rules or guards in a state transition diagram. Is it interesting to specify these business rules up front before mapping to events and processes? Yes from a business perspective, as new events or states might affect or be affected by existing business rules.
  2. Using a state to specify a business rule (in terms of the state and output) is an interesting notation that lends itself well to mapping to appropriate events (or indeed processes). Could it catch on in the business rule community?
  3. The use of an explicit choreography language has not had much success it seems. Google WS-CDL and most entries are dated 2009 or earlier. BPMN2’s choreography may yet prove useful but possibly the concepts are too difficult for business modellers yet imply a co-operative design process for developers that rarely occurs in practice (beyond “this is the interface”!).
  4. At the end of the day, the sequence of events in a business system is just a complex event – which maybe can tell you if the choreography is valid or not.

I’ll add a link to the slides to help explain all the above when they become available…

Annex: a Distributed System Choreography Development Process:

This describes a process for developing state diagrams for choreography purposes:

  1. Define participants and messages (/events) that interact between them
  2. Define states with events as messages from and to, with only 1 sender per state
  3. Project the states out to individual participants – i.e. the parts of the state model for each participant – allowing ambiguous states but ensuring these have no sends
  4. Merge the states for each participant
  5. Enact – check one event at a time to prove feasibility of the interacting state models
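
The five steps above can be sketched in Python as follows. This is one possible interpretation rather than the method presented in the session: the order/payment protocol, the tuple layout and the function names are all invented for illustration, and “merging” is approximated by collapsing states that are linked only by messages a participant cannot see.

```python
# Steps 1-2: a global choreography as (state, sender, receiver, message, next_state).
CHOREOGRAPHY = [
    ("start", "buyer", "seller", "order", "ordered"),
    ("ordered", "seller", "buyer", "invoice", "invoiced"),
    ("invoiced", "buyer", "bank", "pay", "paying"),
    ("paying", "bank", "seller", "paid", "done"),
]
PARTICIPANTS = ["buyer", "seller", "bank"]


def single_sender_per_state(choreography):
    """Step 2 check: every state has messages from exactly one sender."""
    senders = {}
    for state, sender, _, _, _ in choreography:
        senders.setdefault(state, set()).add(sender)
    return all(len(found) == 1 for found in senders.values())


def project(choreography, participant, start="start"):
    """Steps 3-4: the participant's local view, with invisible steps merged away."""
    merged = {}  # state -> the state it has been merged into

    def find(state):
        while merged.get(state, state) != state:
            state = merged[state]
        return state

    for state, sender, receiver, _, nxt in choreography:
        if participant not in (sender, receiver):
            merged[find(nxt)] = find(state)  # participant cannot tell these apart
    local = []
    for state, sender, receiver, message, nxt in choreography:
        if participant == sender:
            local.append((find(state), "send", message, find(nxt)))
        elif participant == receiver:
            local.append((find(state), "receive", message, find(nxt)))
    return find(start), local


def enact(choreography, participants, trace, start="start"):
    """Step 5: replay (sender, receiver, message) triples one event at a time."""
    views = {p: project(choreography, p, start) for p in participants}
    position = {p: views[p][0] for p in participants}
    for sender, receiver, message in trace:
        for who, action in ((sender, "send"), (receiver, "receive")):
            moves = [nxt for (state, act, msg, nxt) in views[who][1]
                     if (state, act, msg) == (position[who], action, message)]
            if not moves:
                return False  # a local model refuses this event
            position[who] = moves[0]
    return True
```

Replaying the messages in their original order is accepted by all three local models, while a trace that starts with the buyer paying the bank is refused by the buyer’s own projection.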


Event Processing Platforms vs Engines

Opher Etzion just made an interesting classification of the CEP tools market in his observations on the Bloor Research comments on CEP and Big Data, part of an increasing amount of coverage on CEP. To wit:

  • Event Processing Platform is a software that enables the creation of event processing network, handle the routing of events among agents, management, and other common infrastructure issues.
  • Event Processing Engine is a software that enables the creation of the actual function – in the EPN term implementing agents.

In the CEP Market analysis we don’t try to distinguish between these – probably because it would be contentious. For example, to some folks an “event processing network” is managed as a single process – possibly multi-threaded, but bounded on a single machine instance. To others (like TIBCO) the network is a message or event distribution mechanism for breaking the constraints of a single process or system (e.g. performance, scalability, and fault tolerance constraints). Furthermore “event processing agents” might be viewed as “event processing operations” – like a single pattern detection query, or a pattern matching rule, arranged in some kind of activity or business process diagram – or as more autonomous processing agents that can handle a number of operations and cooperate declaratively towards some solution.
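
To show the split between the two roles in the simplest (single process) reading, here is a toy Python sketch. The `Platform` class, the agent functions and the event dictionaries are hypothetical and are not modelled on any vendor’s API: the platform only routes events by type, while the agents hold the actual event processing logic and may emit derived events.

```python
from collections import defaultdict, deque


class Platform:
    """The 'platform' role: routes events among agents, knows nothing of their logic."""

    def __init__(self):
        self.routes = defaultdict(list)  # event type -> subscribed agents

    def subscribe(self, event_type, agent):
        self.routes[event_type].append(agent)

    def publish(self, event):
        """Deliver an event and any derived events; return everything that was routed."""
        queue, delivered = deque([event]), []
        while queue:
            current = queue.popleft()
            delivered.append(current)
            for agent in self.routes[current["type"]]:
                queue.extend(agent(current))  # agents return a list of new events
        return delivered


def filter_agent(event):
    """The 'engine' role: pass on only large trades (the threshold is made up)."""
    if event["qty"] >= 1000:
        return [{"type": "large_trade", "qty": event["qty"]}]
    return []


def alert_agent(event):
    """Another agent: turn a large trade into an alert event."""
    return [{"type": "alert", "text": f"large trade of {event['qty']}"}]
```

Subscribing `filter_agent` to `trade` events and `alert_agent` to `large_trade` events gives a two-agent network; in the distributed reading, `publish` would sit on top of messaging middleware rather than an in-memory queue.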

If one views an Event Processing Platform as one that handles routing across multiple processes and distributed systems, then the number of potential candidates is reduced somewhat [*1]. Of course, any CEP engine can be used across multiple systems with a shared middleware infrastructure, but individually they are “blind” to the other agents and the design tools do not handle the cooperative nature of the agents. One can also set up a message type to include management information to allow for some semblance of distributed control, but this is more likely to be a developer task than a platform capability.

Looking at something like TIBCO BusinessEvents, we can see this satisfies the requirements of a (physically distributed) Event Processing Platform:

  1. Enables a (computer) network of event processing agents – typically as a minimum of rule agents and cache /datagrid agents, in pretty much any configuration.
  2. Enables a (single process) network of event processing operations – typically the network is implemented as declarative rules, but can be visualised as a network in a report.
  3. Enables different types of Event Processing Engines – apart from the rule agents, you can also have (continuous) query agents. Rule agents can also be customised as “decision agents” (executing decision rules, or decision tables), “analytics agents” (executing predictive analytics models in Spotfire S+ or R), or “optimization agents” (executing NuOpt optimization routines in Spotfire Statistical Services) [*2]

Notes:

[*1] Other candidates for an Event Processing Platform across distributed systems include IBM Infosphere Streams (although IBM is very quiet these days about that), and EventZero. If there are any others, please mention them in the comments, and if there are enough we’ll update the Market Analysis with this classification…

[*2] Note that invoking Spotfire services involves invoking the Spotfire platform under the control of a rules agent; from an architecture point of view these are just SOA services, like calling BusinessWorks services during event processing.

Human Event Processing at WEF

“Gentlemen’s magazine” Esquire has an article by Ryan D’Agostino about TIBCO CEO Vivek Ranadivé, which mentions the new tibbr-based application for coordinating strategies and tactics among world leaders at WEF.

TopCom, … is a private communications platform for the two hundred most powerful people in the world.

TopCom is being officially launched in late January at the annual meeting of the World Economic Forum in Davos, Switzerland. It is basically a customized, ridiculously secure version of tibbr, a platform developed by Tibco as a kind of combination Facebook, Twitter, e-mail, texting, and Skype. It is a private social network, essentially – in this case, for world leaders.

… The top two hundred WEF members – basically, the people who run the world – can speak to one another on a given subject, and then they can choose to loop in members from lower tiers (experts, academics, etc.) as needed, widening the pool of knowledge on whatever problem is on the table.

…Tibco consulted with both the Japanese prime minister at the time of last year’s tsunami, Naoto Kan, and his successor, Noda, when it was developing its presentation for the WEF board of directors, to find out what would have been useful to them at the time of the disaster. Schwab, too, collaborated. The result, which will be on display in Davos, is the first time a global organization will introduce its own proprietary communications platform. …

Big Data vs Event Processing

Database pundit Curt Monash made a brief mention of event processing (/event stream processing) in his discussion on “big data terminology”, presumably as a response to the discussion he started with Forrester’s Brian Hopkins where Brian (very reasonably IMHO) defined “big data” as:

techniques and technologies that make handling data at extreme scale economical.

with “extreme scale” being defined mainly by the metrics of volume and “velocity” – with the latter being the obvious area of interest from an event processing perspective, as stated by Curt:

Low-volume/high-velocity problems are commonly referred to as “event processing” and/or “streaming”.

Ignoring what might constitute high volume / high velocity problems (see later), Curt replaces “velocity” with “structure” in the “big data metrics” chart (with “velocity” being included in his “bigness” metric). But of course the argument over whether “structure” or “velocity” (or neither or both) are relevant metrics for Big Data is entirely perspective-based:

  1. both are characteristics of data / events and
  2. both affect processing and storage techniques,
  3. … along with other metrics like data lifecycles and data value.

From an event perspective, event payloads (real-time data) can be simple values, tuples (such as the equivalent of a database record), or complex explicit data (such as an XML document), for which something like TIBCO BusinessEvents rules, continuous queries or patterns can be applied. For unstructured text you may want to add TIBCO Patterns, and for non-deterministic data something like TIBCO Spotfire S+ (think neural nets and the like).

From a “big data” perspective, event processing use cases can include customer purchase records, credit card transactions, phone voice packets or text messages, inventory updates, operational sensor reports, etc. But from the event processing perspective (i.e. actually exploiting “big data”) there is another dimension to consider: the scale and velocity of the incoming events versus the scale and velocity (and structure) of the existing data it needs to be related to and/or processed against. Some examples might be:

  • large volumes of data at high velocities, compared to large volume of data
    = national security applications
  • large volumes of data at high velocities, compared to normal volume of data
    = sensor processing like Radar
  • normal volumes of data at high velocities, compared to large volume of data
    = web search
  • normal volumes of data at high velocities, compared to normal volume of data
    = automated trading in Capital Markets

This might be a useful way of comparing Big Data requirements against the multitude of different IT technologies and solutions out there. Today, CEP is mostly dealing with normal volumes of data at low to high velocities being tested against normal(ish) volumes of data (maybe up to Terabytes but not Petabytes), with the higher end values requiring fast datagrid solutions such as TIBCO ActiveSpaces. But as always, it would be interesting to have some metrics against the Big Data use cases to see what we are all talking about…

Bloor bets on CEP for 2012

Thanks to David Luckham for pointing out an interesting set of predictions from Philip Howard, analyst at Bloor Research. The top 3 were directly CEP-related, the others mostly indirectly CEP-related…

  • Real-time everything. … What I think is interesting is the growth in the data replication market specifically to support real-time BI as opposed to failover, disaster recovery, zero-downtime migrations and the like. I would not be at all surprised if we see the introduction of lightweight BI-only data replication products into the marketplace.

Right-on! The world is indeed real-time, and businesses are increasingly realising it (and in some cases, IT departments too). As for BI-only data replication… using real-time data and event technologies (e.g. TIBCO ActiveSpaces) provides the BI data alongside the operational data in real-time – you want to replicate to the database for analytics and archiving only, rather than replicate to the real-time data store!

  • Continuous BI. I think we’ll hear a lot more about this as a generic market for complex event processing as opposed to the vertical markets that CEP has previously addressed.

Again, this is already happening, although the traditional BI vendors are resisting the change as much as possible. CEP vendors are using terms like operational visibility, operational intelligence, continuous intelligence, real-time analytics etc – to pretty much all indicate the same continuous BI capability. So computing statistics on-the-fly is an increasing trend, but note that it is not yet formalised (I haven’t seen any text book on this topic yet).
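
For a flavour of what “statistics on-the-fly” can mean in code, one well-known technique is Welford’s online algorithm, which updates a mean and variance one event at a time without storing the events. The Python sketch below is a generic illustration, not something taken from any CEP product; the `RunningStats` class name is made up.

```python
import math


class RunningStats:
    """Running mean and sample variance via Welford's online algorithm."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, value):
        """Fold one new observation into the running statistics."""
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self._m2 += delta * (value - self.mean)

    @property
    def variance(self):
        """Sample variance; 0.0 until there are at least two observations."""
        return self._m2 / (self.count - 1) if self.count > 1 else 0.0

    @property
    def stddev(self):
        return math.sqrt(self.variance)
```

Each incoming event calls `update()` once, so the cost per event is constant and the statistics are available at any moment, which is the property a continuous BI dashboard needs.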

  • CEP adoption by SIEM vendors…. and smaller, more agile companies … all offering CEP in this space …

Security Information and Event Management is in some ways a subset or specialisation of CEP: SIEM is about managing security events, security event processing, and management of detected security events. CEP tools are already doing this at the cyber-security scale, and most SIEM tools I see are providing (usually constrained) CEP capabilities. The problem is that every event can be considered from a security perspective (e.g. TIBCO Hawk detecting resource usage increases and monitoring log files) so there is often an advantage in taking the wider CEP view than some vendors’ limited perspective on what constitutes a security event.

  • Warehousing adoption by SIEM vendors. …How can you claim to offer analytics against security and log data if you don’t have an analytic platform to support it? …

Although one should not confuse data warehouses with analytics, the implication is clear: apart from real-time analytics you should also consider long-term trend analysis and other predictive analytics against your events (security and otherwise), using appropriate visual and statistical analytic tools (e.g. TIBCO Spotfire and TIBCO Spotfire Miner respectively).

  • Growth in PMML adoption. … the standard for porting data mining models. …

PMML is a great idea and hopefully it will mature more for usage across the SAS and R communities over the coming months. Apart from the analytics standard there is also work on decision modelling (DMN) which could have even wider repercussions (not that I’m biased at all!).

  • Lots more big data. … As more products and companies enter this space, or claim to, the more murky the whole big data thing will become.

As more big data is created, there will be more clamour for processing it before it becomes data (i.e. as events).

  • The emergence of the Data Scientist. …

Or Company Statistician? Or Business Event Analyst? To go with the Business Process Analyst and Business Decision Analyst presumably…

  • The logical data warehouse. …

So is this the start of the demise of the illogical data warehouse? From an event perspective, data warehouses are really for storing old events for long term analytics purposes. That shouldn’t be a big deal…