
Count Windows

A count window defines a subset of events that fall within some period of time and over which you can perform some set-based computation such as an aggregation. Count windows, like snapshot windows, do not have a fixed window size. They are defined by the number of events they contain. This enables a number of scenarios in which a set-based operation must be applied on a specific number of events in a stream that has an irregular event frequency.

Tip

In this version of StreamInsight, count windows can be used only with user-defined aggregates or user-defined operators, not with built-in aggregates or TopK.

For a general description of event windows and their implementation and use in StreamInsight, see Using Event Windows.

The variable window size of count windows is defined by the count of distinct event start times along a timeline. The set of count windows with a count of N is defined as all timespans that contain N consecutive distinct event start times, where N is at least 1. More precisely, each such timespan begins at the first of these start times and ends one tick after the start time of the Nth distinct event. Count windows move along the timeline with each distinct event start time. Hence, each event that defines a new event start time causes the creation of a new count window, as long as the count is fulfilled. That is, no window is created while the number of distinct event start times is smaller than N. As with hopping windows, the offset between count windows is called the hop size; however, unlike the hop size of a hopping window, the hop size for count windows is, in general, not constant.
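The definition above can be sketched in plain C#, without the StreamInsight API. This is a minimal model, assuming time is represented as a tick count of type long; the class name CountWindowSketch, the method Windows, and the sample start times are hypothetical and chosen only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Illustrative model only; this is not StreamInsight code.
public static class CountWindowSketch
{
    // Returns one (start, end) pair in ticks per count window; end is exclusive.
    public static List<Tuple<long, long>> Windows(IEnumerable<long> startTimes, int n)
    {
        // Only distinct start times count toward N.
        List<long> distinct = startTimes.Distinct().OrderBy(t => t).ToList();
        var windows = new List<Tuple<long, long>>();

        // No window exists until N distinct start times are available.
        for (int i = 0; i + n <= distinct.Count; i++)
        {
            long start = distinct[i];
            long end = distinct[i + n - 1] + 1; // Nth distinct start time plus one tick
            windows.Add(Tuple.Create(start, end));
        }
        return windows;
    }

    public static void Main()
    {
        // Hypothetical start times; 40 occurs twice but counts only once.
        foreach (var w in Windows(new long[] { 10, 15, 40, 40, 90 }, 2))
            Console.WriteLine("[{0}, {1})", w.Item1, w.Item2);
        // prints [10, 16), [15, 41), and [40, 91), one per line
    }
}
```

In this sample the windows start at ticks 10, 15, and 40, so the hop sizes are 5 and 25 ticks, which shows why the hop size of a count window is generally not constant.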

If each event on the timeline has a unique timestamp, the number of events in each such window is equal to N. If multiple events have the same event start time, the number of events in the windows that contain that start time will be higher than N. It is important to understand this behavior when applying an aggregation to the window.
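The effect of shared start times on the number of events per window can be illustrated with a small model. The sketch below assumes point events only, so that an event belongs to a window exactly when its start time falls inside it; the names EventsPerWindowSketch and EventCounts and the sample data are hypothetical, and no StreamInsight API is used.

```csharp
using System;
using System.Linq;

// Illustrative model only; this is not StreamInsight code.
public static class EventsPerWindowSketch
{
    // Counts the point events in each count window of n distinct start times.
    public static int[] EventCounts(long[] startTimes, int n)
    {
        long[] distinct = startTimes.Distinct().OrderBy(t => t).ToArray();
        return Enumerable.Range(0, Math.Max(0, distinct.Length - n + 1))
            // A window spans from its first to its Nth distinct start time.
            .Select(i => startTimes.Count(
                t => t >= distinct[i] && t <= distinct[i + n - 1]))
            .ToArray();
    }

    public static void Main()
    {
        // Two events share start time 40, so every window covering 40 holds 3 events.
        int[] counts = EventCounts(new long[] { 10, 15, 40, 40, 90 }, 2);
        Console.WriteLine(string.Join(", ", counts)); // prints "2, 3, 3"
    }
}
```

With N = 2, only the first window contains exactly two events; an aggregate such as an average therefore sees a different number of inputs from window to window.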

The following illustration shows a stream with three events: e1, e2, and e3. The vertical bars show the count window boundaries that are defined by these events. The light blue areas represent the event streams moving through time. The orange boxes show the count windows and the events contained in each window. The first two distinct event start times belong to e1 and e2, so the first count window spans the corresponding period: it starts at the start time of e1 and ends at the start time of e2 plus one tick. The second window contains the start times of e2 and e3:

[Illustration: CountWindowForEvents, count windows defined by events e1, e2, and e3]

Note that this illustration shows the events in the windows already clipped by the window input policy. The input policy, as for all StreamInsight windows, is to clip the events to the window size. A time-sensitive aggregate or operator will use these clipped event lifetimes in the windows, instead of the original ones.
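Clipping can be pictured as intersecting an event lifetime with the window span. The following sketch is a simplified model rather than StreamInsight code; it assumes half-open lifetimes measured in ticks, and the names ClipSketch and ClipToWindow as well as the sample values are hypothetical.

```csharp
using System;

// Illustrative model only; this is not StreamInsight code.
public static class ClipSketch
{
    // Clips the lifetime [start, end) to the window [winStart, winEnd).
    public static Tuple<long, long> ClipToWindow(long start, long end, long winStart, long winEnd)
    {
        return Tuple.Create(Math.Max(start, winStart), Math.Min(end, winEnd));
    }

    public static void Main()
    {
        // Hypothetical event living from tick 10 to tick 50, in a window [10, 26).
        Tuple<long, long> clipped = ClipToWindow(10, 50, 10, 26);
        Console.WriteLine("{0} ticks", clipped.Item2 - clipped.Item1); // prints "16 ticks"
    }
}
```

A time-weighted aggregate in this model would weight the event by 16 ticks, the clipped duration, rather than by its original 40 ticks.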

The following illustration shows count windows with point events, which are valid only for a single instant of time. This further demonstrates that the CountByStartTimeWindow() method counts distinct event start times instead of the events themselves. The illustration shows a stream of point events. The vertical bars show the count window start times defined by the point events. The light blue areas represent the event streams moving through time. Note that two events are collocated at the same time. The orange boxes show the count windows and the events contained in each window. The two collocated events count as one unit toward the window parameter N.

[Illustration: count windows over a stream of point events]

In the case of point events, adding one tick to the start time of the Nth event when defining the window end time ensures that the window contains all of its events entirely. Because a point event has a lifetime of one tick, the Nth event ends exactly at the window end.
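The role of the extra tick can be checked with DateTime arithmetic. This is a sketch under the assumption that a point event's lifetime is modeled as the half-open interval from its start time to its start time plus one tick; the class PointEventSketch, the method FullyContained, and the timestamps are hypothetical.

```csharp
using System;

// Illustrative model only; this is not StreamInsight code.
public static class PointEventSketch
{
    // True if the lifetime [eventStart, eventEnd) lies entirely in [winStart, winEnd).
    public static bool FullyContained(DateTime eventStart, DateTime eventEnd,
                                      DateTime winStart, DateTime winEnd)
    {
        return eventStart >= winStart && eventEnd <= winEnd;
    }

    public static void Main()
    {
        DateTime first = new DateTime(2014, 1, 1, 9, 0, 0); // first start time in the window
        DateTime nth = first.AddSeconds(30);                // Nth distinct start time
        DateTime nthEnd = nth.AddTicks(1);                  // point event modeled as one tick long

        // Window end = Nth start time plus one tick: the Nth event fits entirely.
        Console.WriteLine(FullyContained(nth, nthEnd, first, nth.AddTicks(1))); // prints "True"

        // Without the extra tick, the Nth event would fall outside the window.
        Console.WriteLine(FullyContained(nth, nthEnd, first, nth)); // prints "False"
    }
}
```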

A count window is defined by the count of distinct event start times along the timeline. A count window is specified in Language Integrated Query (LINQ) as shown in the following example. The input policy is always ClipToWindow and the output policy is always PointAlignToWindowEnd.

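// Each window spans 10 consecutive distinct event start times.
// MySum is a user-defined aggregate, not a built-in one.
var agg = from w in inputStream.CountByStartTimeWindow(10)
          select new { sum = w.MySum(e => e.i) };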

Because count windows currently cannot be used with built-in aggregates, the example above assumes a user-defined aggregate named MySum.

As for any other window, the window output policy is applied to the result of the set-based operation. The only available output policy for count windows is to produce a point event at the end of the window. The timestamp of this point event coincides with the last event start time in the window. In other words, the point event still lies within the window span. When an aggregation is applied to the input in the diagram above, the output stream looks as follows:
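The output timestamps can be modeled as follows. This sketch assumes point events with an integer payload and a plain sum as the set-based operation; it does not use the StreamInsight API, and the names OutputPolicySketch and SumPerWindow and the sample events are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Illustrative model only; this is not StreamInsight code.
public static class OutputPolicySketch
{
    // Input: (start time in ticks, payload). Output: (result timestamp, sum) per window.
    public static List<Tuple<long, int>> SumPerWindow(IList<Tuple<long, int>> events, int n)
    {
        List<long> distinct = events.Select(e => e.Item1).Distinct().OrderBy(t => t).ToList();
        var output = new List<Tuple<long, int>>();
        for (int i = 0; i + n <= distinct.Count; i++)
        {
            long first = distinct[i];
            long last = distinct[i + n - 1];
            int sum = events.Where(e => e.Item1 >= first && e.Item1 <= last)
                            .Sum(e => e.Item2);
            // The result is a point event stamped with the last start time in the window.
            output.Add(Tuple.Create(last, sum));
        }
        return output;
    }

    public static void Main()
    {
        // Hypothetical point events; two of them share start time 40.
        var events = new List<Tuple<long, int>>
        {
            Tuple.Create(10L, 1), Tuple.Create(15L, 2),
            Tuple.Create(40L, 3), Tuple.Create(40L, 4)
        };
        foreach (var r in SumPerWindow(events, 2))
            Console.WriteLine("t={0} sum={1}", r.Item1, r.Item2);
        // prints "t=15 sum=3" and then "t=40 sum=9"
    }
}
```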

[Illustration: CountWindowforEvents3, output stream with one point event at the end of each count window]

A time-sensitive user-defined operator (UDO) can produce result events that carry their own timestamps, but the output policy overwrites those timestamps. Each event that the UDO returns is transformed into a point event at the end of the window, aligned with the start time of the last event in the window, just like the single result event from an aggregation shown in the previous diagram.
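The restamping of UDO results can be sketched in a few lines. The model below represents each UDO result as a (timestamp, payload) pair and is not StreamInsight code; the names UdoOutputSketch and AlignToWindowEnd and the sample values are hypothetical.

```csharp
using System;
using System.Linq;

// Illustrative model only; this is not StreamInsight code.
public static class UdoOutputSketch
{
    // Replaces every result timestamp with the last event start time in the window.
    public static Tuple<long, string>[] AlignToWindowEnd(
        Tuple<long, string>[] udoResults, long lastStartTime)
    {
        return udoResults
            .Select(r => Tuple.Create(lastStartTime, r.Item2))
            .ToArray();
    }

    public static void Main()
    {
        // Hypothetical UDO results stamped at ticks 12 and 14 inside a window
        // whose last event start time is tick 15.
        var results = new[] { Tuple.Create(12L, "a"), Tuple.Create(14L, "b") };
        foreach (var r in AlignToWindowEnd(results, 15))
            Console.WriteLine("t={0} payload={1}", r.Item1, r.Item2);
        // prints "t=15 payload=a" and then "t=15 payload=b"
    }
}
```

In this model the payloads pass through unchanged; only the timestamps that the UDO assigned are lost.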

© 2014 Microsoft. All rights reserved.