Is HzB a Good Metric for Event Processing Throughput?

In a previous post about Event Processing at the LHC, we learned that the scientists involved describe event measurements in terms of Hertz, or “cycles per second”. Although this is particularly suited to observation events, one wonders whether it could also be a suitable unit for business event processing (for example, a large parcel delivery company might process its parcel events at 50 kHz, or 50K events per second).

However, the unit of work done should also include the amount of data being processed: a 3-byte sensor reading read at 1 MHz requires “less work” than a 250 KB message processed at 10 kHz. So apart from the throughput in Hz, we should also consider a combined metric: the HertzByte (HzB), the event rate multiplied by the event size. So 10 KB events processed at 10 kHz would mean a throughput of 100 MHzB.
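The arithmetic behind the HzB figure can be sketched in a few lines of Python. This is only an illustration: the function names `hzb` and `format_hzb` are made up for this example, and it assumes decimal prefixes (1 KB = 1,000 bytes), which is what makes 10 KB at 10 kHz come out as exactly 100 MHzB.

```python
def hzb(event_size_bytes: float, rate_hz: float) -> float:
    """Combined throughput in HertzBytes: event size times event rate."""
    return event_size_bytes * rate_hz


def format_hzb(value: float) -> str:
    """Render a HzB value with a decimal prefix (assumed: K = 1,000)."""
    for factor, prefix in ((1e9, "G"), (1e6, "M"), (1e3, "K")):
        if value >= factor:
            return f"{value / factor:g} {prefix}HzB"
    return f"{value:g} HzB"


# The examples from the post, with sizes in bytes and rates in Hz.
print(format_hzb(hzb(10_000, 10_000)))    # 10 KB events at 10 kHz
print(format_hzb(hzb(3, 1_000_000)))      # 3-byte readings at 1 MHz
print(format_hzb(hzb(250_000, 10_000)))   # 250 KB messages at 10 kHz
```

On these numbers the sensor feed rates 3 MHzB and the message feed 2.5 GHzB, which matches the intuition that the slower but larger messages represent more work.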

So what is the peak or sustained HzB rating of your CEP application?

Comments

  1. George Long says:

    This certainly got me thinking. I have had a little exposure to benchmarking event processing platforms and had to be disciplined about how to express performance.

    The factors to determine performance as you have pointed out are the number and size of the events at the source to the CEP solution. These events are not to be confused with derived events generated as part of the solution.

    The affinity here is obviously with TPC: I want to know ahead of time what the target ingest rate is, and we have an implicit assumption that no events can be lost. In reality my solution needs to be rated to handle a sustained load AND accommodate burst activity without operator intervention. Predictive capacity growth models should be a natural output of any CEP workflow.

    Back to a performance index …

    The business wants a solution to handle X # of parcels per time period.
    Any solution evolves over time and would create additional workflows so the index is really a scorecard

    So the scorecard would need to relate the ingress rate and EACH egress rate rated to the source event and the lag to yield each egress.
    I do like the composite notation, however I want to verbally describe my system as yielding 50x2K events per second.
    Dropping the event size (i.e. 2K) does not change the essential narrative in describing the throughput of my system to others, i.e. 50x2K HzB or simply 50 Hz.
    The networking folks can do the math on the volume for the appropriate distribution fabric (Mbps or Gbps).
    The Hertz is a well understood unit of frequency and does not need the ‘B’ qualifier if we use the annotated mode (x2K) as the size unit is again using conventional notation (K,M, G etc).

    Now I have

    50x3K Hz in, 32xK2 out with a lag of 5 seconds (note the nKm notation for fractional units, so K2 means 0.2K)
    So I can fully quantify my event workflow with 50:32x3K:K2/5s Hz
    The colon is used to separate incoming from outgoing, the event ratios span 5 seconds but they occur every second.
    There are (50+32) events externally visible to the system, a total of (50x3K) + (32xK2) data bytes was transported and the elapsed time was 5 seconds

    Now I have all of the information to determine my throughput relative to another flow or another system.

    This ‘index’ can be subjectively reduced as the business sees 50 events a second with a lag of 5 seconds.
    The nature of the lag may be fixed, as the final event may be a 5 second window.
    If my input load increases the system must have capacity to deal with the input.
    As my business evolves and affects the processing then I can express the relative impact of the change against the above baseline, ie.
    From the baseline 50:32x3K:K2/5s Hz my system now yields twice the output events as before, so I can express this as 50:64x3K:K2/5s Hz, or as :+32 relative to the baseline

    So, alas no ‘index’ but a tangible means of expressing yield for a workflow with

    Workflow yield = (# of input events) : (# of output events related to the source event) x (size of input event) : (size of output event) / (lag time) Hz
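The yield notation above can be sketched as a small Python data structure. This is one reading of the notation rather than a definitive one: the class `WorkflowYield` and the helper `size_to_nkm` are hypothetical names, K is assumed to be 1,000 bytes, and K2 is taken to mean 0.2K (200 bytes).

```python
from dataclasses import dataclass


def size_to_nkm(size_bytes: int) -> str:
    """Format a size in nKm style, e.g. 3000 -> '3K', 200 -> 'K2'.

    Assumes K = 1,000 and sizes from 1 byte up to just under 1M.
    """
    whole, frac = divmod(size_bytes, 1000)
    frac_digits = str(frac).zfill(3).rstrip("0")
    return f"{whole if whole else ''}K{frac_digits}"


@dataclass
class WorkflowYield:
    in_events: int    # input events per second
    out_events: int   # output events related to the source events
    in_size: int      # bytes per input event
    out_size: int     # bytes per output event
    lag_s: float      # lag before the output is yielded, in seconds

    def notation(self) -> str:
        """Render as (in):(out)x(in size):(out size)/(lag) Hz."""
        return (f"{self.in_events}:{self.out_events}"
                f"x{size_to_nkm(self.in_size)}:{size_to_nkm(self.out_size)}"
                f"/{self.lag_s:g}s Hz")

    def visible_events(self) -> int:
        """Events externally visible to the system: inputs plus outputs."""
        return self.in_events + self.out_events

    def bytes_transported(self) -> int:
        """Total data bytes moved: (in x in size) + (out x out size)."""
        return (self.in_events * self.in_size
                + self.out_events * self.out_size)


baseline = WorkflowYield(50, 32, 3000, 200, 5)
print(baseline.notation())
print(baseline.visible_events(), baseline.bytes_transported())
```

For the baseline flow this gives 82 visible events and 156,400 bytes transported, the same quantities as the (50+32) and (50x3K) + (32xK2) terms in the comment.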

    • Paul Vincent says:

      Hi George – makes perfect sense to me. Some thoughts:
      a. Size of input and output event are clearly application specific (and also relatively static).
      b. # output events per some set of input events (+ state) is also application specific
      c. variables across implementations of the application will be #input events per some time period (typically also application specific, but often the item to be stressed) and lag time

      Cheers

  2. Paul Vincent says:

    HzB FAQ…
    1. Why not just use MB per sec?
    1a. MB per sec doesn’t really emphasise that we are dealing with discrete events rather than some flow of data.
    1b. HzB can always be defined in terms of its parts, e.g. 1Mx2K HzB to indicate the event rate x event size

    2. How can this be extended to include the all-important latency (especially needed for front office financial systems)?
    2a. Note that latency is a “service response time” ie difference in time between some input and output. So 1st we have to define both input and output Event Processing metrics (ie HzB in and HzB out). Then typically the important latency metric we want is the t(event out) – t(last relevant event in). So for complete coverage we might want
    - peak HzB in,
    - HzB out for peak in,
    - latency (ms) for peak HzB in
    - min latency (ms) for some specified HzB in and HzB out
    or more
    Of course, it would be neat to add this as an instrumented output of any CEP application :)
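As a rough idea of what such instrumented output might look like, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the record types `InEvent` and `OutEvent`, the function `summarize`, and the idea that each output event carries the arrival time of its last relevant input. Latency follows the definition above, t(event out) minus t(last relevant event in).

```python
from dataclasses import dataclass


@dataclass
class InEvent:
    t: float          # arrival time, in seconds
    size: int         # event size, in bytes


@dataclass
class OutEvent:
    t: float          # emission time, in seconds
    size: int         # event size, in bytes
    t_last_in: float  # arrival time of the last relevant input event


def summarize(ins, outs, window_s):
    """Report HzB in, HzB out and latency over one measurement window."""
    latencies_ms = [(o.t - o.t_last_in) * 1000.0 for o in outs]
    return {
        # Bytes per second of discrete events: sum of (size x count) / window.
        "hzb_in": sum(e.size for e in ins) / window_s,
        "hzb_out": sum(e.size for e in outs) / window_s,
        # None when no output was produced in the window.
        "min_latency_ms": min(latencies_ms, default=None),
        "max_latency_ms": max(latencies_ms, default=None),
    }


ins = [InEvent(0.0, 2000), InEvent(0.5, 2000)]
outs = [OutEvent(0.75, 500, t_last_in=0.5)]
print(summarize(ins, outs, window_s=1.0))
```

Peak figures would then come from running this over successive windows and keeping the window with the highest HzB in, together with its HzB out and latency.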

    Cheers
