Closing the Big Data Loop

It has been two weeks since TUCON 2012, the TIBCO user conference held annually in Las Vegas, Nevada. TIBCO is well-known as an integration company but used the event to demonstrate its broad platform approach to the biggest challenges of today, like digital customer experience, loyalty, and the topic of this article, Big Data.

The TIBCO CTO, Matt Quinn, spoke about the patterns hiding within the rapidly increasing amounts of data flowing across the enterprise. Quinn made an important distinction between data at rest (information sitting in databases or flat files waiting to be queried) and data in motion, which includes streaming data and data stored in memory (also known as cache).

Quinn made the point that it takes analytics like those offered within TIBCO’s Spotfire product to be able to see what would be invisible to people trying to keep up with the increasing deluge of information. To Quinn, the smart enterprise finds patterns in historical and machine data (log files that up until recently weren’t mined for patterns) that provide insight that can be applied to data that’s coming at today’s “full speed.” [Read more...]

Big Data in Real Time

Big Data was first characterized in 2001 as having three Vs: Volume, Velocity and Variety. Volume refers to the sheer size of the data that you need to work with, whether it’s gigabytes, terabytes or petabytes. Velocity is about the speed at which new data is generated, coming from more and faster streams of events. Variety refers to the many different ways that data is represented, whether it has different structures or no structure at all.

To these three Vs, I like to add a fourth: Volatility.  When I talk about data Volatility, I’m talking less about the actual data, and more about what it represents.  Events occur every day in your enterprise that are digital representations of threats to, or opportunities for, your organization.  Perhaps the event represents the chance to help a customer in your store find – and buy – an item he’s looking for.  Or maybe the event is telling you that a cyber-thief is making off with your sensitive information.

In either case, the situation isn’t waiting around for you to respond to it; it’s on its own schedule. If you aren’t ready to respond in the appropriate amount of time, then your customer – or the personal information he entrusted to you – has left your premises.

So what do you do about these four Vs?  It all boils down to just three words:  Understand.  Anticipate.  Act. [Read more...]

Gartner Makes the Case for Complex Event Processing to Keep Up with Real-time Big Data

TIBCO’s complex event processing and in-memory data grid solutions address the problem with what Gartner’s Roy W. Schulte and Bill Gassman call “the conventional save and process paradigm,” which isn’t fast enough for today’s big data challenges.

The enterprise world is becoming more time-critical by the moment. The need to analyze real-time data for opportunities, risks and efficiencies before putting it into a database is moving companies to complex event processing combined with in-memory storage.
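As a rough illustration of analyzing events in memory before they reach a database, here is a minimal Python sketch. It is not TIBCO's implementation: the class name, the function names, the window length and the threshold are all assumptions made for the example, and a plain list stands in for the database.

```python
from collections import deque


class SlidingWindowDetector:
    """Keep recent events in memory and flag a burst before anything is saved.

    Hypothetical helper for illustration only.
    """

    def __init__(self, window_seconds, threshold):
        self.window_seconds = window_seconds
        self.threshold = threshold
        self.events = deque()  # in-memory store of (timestamp, payload)

    def on_event(self, timestamp, payload):
        """Process one event as it arrives; return True if the pattern fires."""
        self.events.append((timestamp, payload))
        # Evict events that have fallen out of the time window.
        while self.events and self.events[0][0] <= timestamp - self.window_seconds:
            self.events.popleft()
        return len(self.events) >= self.threshold


def process_stream(stream, detector, database):
    """Analyze each event first, then persist it (process, then save)."""
    alerts = []
    for timestamp, payload in stream:
        if detector.on_event(timestamp, payload):
            alerts.append(timestamp)
        database.append((timestamp, payload))  # the save step happens last
    return alerts
```

The point of the ordering is that the alert is raised while the event is still in flight; the write to storage no longer sits between the event and the decision.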

“We found Roy Schulte and Bill Gassman’s report on complex event processing’s role in Big Data to be very insightful,” said Ivan Casanova, Senior Director of Product Marketing, TIBCO. “We could not agree more that changing conditions, including greater volume, velocity and variety of data is making the conventional save-and-process paradigm obsolete for big data.” [Read more...]

DEBS2012 on Event Patterns

At DEBS last month, a few members of the EPTS Reference Architecture team tutored on the latest Functional Event Patterns list – with sample and pseudocode implementations – covering all aspects of event preparation, analysis, (complex event) detection, and reaction. From a TIBCO CEP perspective, this version mostly covers TIBCO BusinessEvents rule patterns (i.e. how these functional patterns map to a standard event-based production rule pseudocode), with a few references to BE State Models and the odd BE Continuous Query [*1]. These can be viewed alongside examples from Oracle EP, IBM WODM, Microsoft StreamInsight and the PROLOG-based Prova.

This remains very much a work-in-progress [*2], but should give a good idea of where we are heading. It should be noted that in the “real world,” many of these functions are often combined into a single operation (e.g., covering preparation/filtering, analysis/transformation, detection/composition and reaction/assessment in a single rule or query). [Read more...]
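To make the four functional stages concrete, here is a small Python sketch that first writes them as separate functions and then combines them in one operation. It is not BusinessEvents syntax or the EPTS pseudocode; the event fields, function names and thresholds are assumptions chosen for the example.

```python
def prepare(event):
    """Preparation/filtering: drop events that are malformed or irrelevant."""
    if event.get("type") == "temperature" and "value" in event:
        return event
    return None


def analyze(event):
    """Analysis/transformation: derive a value the detection step can use."""
    return {**event, "celsius": (event["value"] - 32) * 5 / 9}


def detect(history, event, limit=3, threshold=40.0):
    """Detection/composition: the complex event is `limit` hot readings in a row."""
    history.append(event["celsius"] > threshold)
    recent = history[-limit:]
    return len(recent) == limit and all(recent)


def react(event):
    """Reaction/assessment: decide what to do about the detected situation."""
    return f"ALERT: sustained overheating at sensor {event['sensor']}"


def single_rule(events):
    """All four functions combined into one operation, as a single rule might be."""
    history, actions = [], []
    for raw in events:
        event = prepare(raw)
        if event is None:
            continue
        event = analyze(event)
        if detect(history, event):
            actions.append(react(event))
    return actions
```

Keeping the stages as separate functions makes the mapping to the pattern list easy to see, even when a production rule or query would fold them together.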

The Power of Patterns – Part 3

Check out The Power of Patterns and The Power of Patterns – Part 2 before reading on to discover the third and final type.

Patterns of Events

Based on our experiences and skills, there are other patterns that we deal with naturally every day (let’s call these type 3 patterns). As complex sequences of events unfold around us, we are able to reach conclusions about likely outcomes. For instance, this might be something as simple as what happens when we are driving: we see the lights ahead turn green (an event) and a large vehicle having difficulty accelerating up the hill (a series of events), while there are still 15 cars ahead of us and we know the green light usually lasts 45 seconds. Since we know the chances are we won’t make it through the green light, we are mentally prepared to slow down rather than accelerate.

Now imagine the more complex situation of an electrical generation and distribution grid supplying power to millions of households and businesses. Events are being generated at every point in the network, from the household meter readings happening every 5 minutes or less; substations and transformers each recording and transmitting their current operating situations (load, temperature, etc.); power stations and wind turbine farms generating and sharing their current workloads. That’s, of course, not the end of it… numerous external factors are being monitored: current and projected weather conditions or a local sports team playing at 8pm and the #1 TV program showing at 9pm. [Read more...]

Fastest Cars in the World Rely on Pirelli Tires

A Pirelli-supported team smashed the outright circuit lap record at the Australian GT Championship at Winton. “We’re very happy with the pace of the Pirelli,” Maranello Motorsport engineer Pat Cahill commented post-event. “Jonny Reid hadn’t driven on the Pirelli before, and he was impressed with the pace and consistency of the tire.”

Let’s take it to the highest level: FORMULA ONE racing has the fastest cars on Earth, reaching speeds of up to 220 mph with frequent hairpin turns, often on slick tracks. Milan-based Pirelli – exclusive tire supplier to FORMULA ONE racing through 2013 – uses TIBCO technology to circulate essential, real-time race data to FORMULA ONE Management (FOM) and FORMULA ONE teams.

Pirelli Decides to Speed Things Up With TIBCO Technology

With TIBCO, Pirelli became faster and more accurate in collecting, sorting, analyzing, and reporting key performance indicators (KPIs). Data processes that once consumed 90 days now, through integration, take just 30 minutes to generate the latest KPIs on business units and individual products. With these real-time indicators now readily available enterprise-wide, decision makers not only see Pirelli more clearly – they manage it more skillfully.

Pirelli’s reliance on TIBCO’s real-time data backbone doesn’t stop at the FORMULA ONE circuit; it extends to the marketplace, where competitive pressures are equally intense.  In the case of Pirelli, “21st-century efficiencies” mean integrating a global network of some 10,000 distributors and retailers. Every time Pirelli integrates a distributor into the system, the company boosts profitable growth.

TIBCO integration made Pirelli more profitable by balancing inventories to ensure that each time a customer enters a store, the right tires are in stock – no more, no less.  With TIBCO-powered integration, Pirelli gathers optimized results on the race track, generates new profit, and optimizes work processes and information flow to reduce total costs of ownership for its distributors.

“We capture practice sessions, qualifying sessions, top speeds, lap times, intermediates, flags, accidents, and the position of each car on the track and in the pit lane. The engineer of each team uses this real-time data to better understand the car, the track, and how to optimize performance. It’s an advantage enabled by Pirelli’s TIBCO integration.” 

– Fabrizio Orioli, Pirelli’s Integration Service Manager

Read the full story here.

Decision Latency… Just Part of the Same Problem as Data & Process Latency

Interesting read from Gartner’s Jim Sinur, talking about decision and action latency. Jim talks about how Big Data is meant to solve decision latency – but it’s not at all clear how or why Big Data affects the latency of either designing or making decisions. Surely Big Data is about extractable information for improving the quality of decisions? Maybe I missed something critical in the Big Data hype. If anything, running analytics on larger data sets is sure to increase latency rather than reduce it! Of course, this is why real-time analytics are increasingly important to complement the traditional data analytics…

I would, however, agree that tackling decision, data and process latency is key in the fight against the costs of Big Data; you want to respond effectively to events as they occur, despite those event rates increasing: that means efficient decision engines, high performance data access, as well as responsive process engines. [Read more...]

The Power of Patterns – Part 2

Check out The Power of Patterns and read on to discover the second type:

Patterns of similarity when people make decisions about duplicate records

The second type of patterns about data (let’s call these type 2 patterns) occurs as humans look at multiple records (an illustration follows) trying to decide, for instance, if they are duplicate records. Without realizing it, they are sensing and evaluating the patterns of data similarities for each of the attributes, and sets of attributes, while of course automatically taking into account any misfielding and differences in the data. What they don’t calculate or really have any knowledge of is the mathematical similarity score that could be calculated for each set of attributes.

Now imagine we get a domain expert to label representative sets of record pairs as duplicate or not duplicate (yes or no, true or false, etc.). We could build a model of the sets (vectors) of attribute similarity scores that lead the domain expert to their particular conclusions. With enough training, the model would converge, that is, reach the same conclusions the domain expert would reach, and we would know it was ready to take over. At this point, the model can run and reach highly accurate conclusions based on the prior training by domain experts. More accurate results mean that higher levels of automation can be achieved in that particular business process.


Looking at the data and the attribute similarity scores, you’ll notice a few key points: misfieldings are taken into account, differences in spelling are recognized and scored appropriately, and the use of nicknames (or any other semantic equivalents, for that matter) is supported.
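The idea of attribute similarity vectors and a model trained on expert labels can be sketched in a few lines of Python. This is a toy under stated assumptions, not the product's algorithm: the nickname table, the function names and the threshold "model" are all hypothetical, and the string score comes from the standard library's `difflib.SequenceMatcher` rather than whatever scoring the real system uses.

```python
from difflib import SequenceMatcher

# Hypothetical nickname table; a real system would need a far larger one.
NICKNAMES = {"bob": "robert", "bill": "william", "liz": "elizabeth"}


def normalize(value):
    """Lower-case the value and map known nicknames to a canonical name."""
    value = value.strip().lower()
    return NICKNAMES.get(value, value)


def similarity(a, b):
    """String similarity in [0, 1] from the standard library's SequenceMatcher."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def similarity_vector(rec_a, rec_b, fields):
    """One score per attribute. Each value is compared with every field of the
    other record and the best score is kept, which tolerates misfielded values."""
    return [max(similarity(rec_a[f], rec_b[g]) for g in fields) for f in fields]


def mean_score(rec_a, rec_b, fields):
    vec = similarity_vector(rec_a, rec_b, fields)
    return sum(vec) / len(vec)


def train_threshold(labeled_pairs, fields):
    """Toy 'model': a cut-off on the mean score, halfway between the
    lowest-scoring duplicate and the highest-scoring non-duplicate."""
    dup = [mean_score(a, b, fields) for a, b, is_dup in labeled_pairs if is_dup]
    non = [mean_score(a, b, fields) for a, b, is_dup in labeled_pairs if not is_dup]
    return (min(dup) + max(non)) / 2


def is_duplicate(rec_a, rec_b, fields, threshold):
    """Apply the learned cut-off to a new pair of records."""
    return mean_score(rec_a, rec_b, fields) >= threshold
```

A real model would learn from the whole vector rather than its mean, so that a strong match on one attribute can be weighed differently from a strong match on another; the single threshold just keeps the sketch short.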

Learn more here!

How We Prevent and Predict Power Outages

Picture by Sheffield Tiger

Today, the power infrastructure has evolved into an automated and intelligent network of information that enables real-time balancing of electricity supply and demand. This information helps electric utility companies provide energy reliability and service quality at the best prices. In addition to meeting the challenges of aging infrastructure and increasing compliance needs, energy companies must be able to measure demand accurately, compare it with supply resources and possible faults, predict outages at any point in time, and set procedures accordingly for the reliable transmission of electricity to customers.

PJM Interconnection (PJM) operates in the largest wholesale power market in the world and coordinates the continuous buying, selling, and delivery of electricity across 13 states and the District of Columbia, serving more than 60 million people. For PJM, real-time operational visibility of regional conditions and reliability issues is essential when you consider that data is drawn from nearly 74,000 points on the grid. [Read more...]

The Election Campaign of the Future is Happening Now

The way we do everything is different… so why are we still running election campaigns and voting in much the same way we did 200+ years ago? Imagine if we could bring the process into the 21st century using the technologies driving innovation today. Guess what? We can.