Module agg
agg [-r] [-g GROUPING_FUNCTION] INITIAL_VALUE AGGREGATION_FUNCTION
agg [-r] [-c GROUPING_FUNCTION] INITIAL_VALUE AGGREGATION_FUNCTION
Aggregates objects from the input stream. If GROUPING_FUNCTION is omitted,
then one output object is generated by initializing an accumulator to
INITIAL_VALUE and then combining the accumulator with input objects using
AGGREGATION_FUNCTION. AGGREGATION_FUNCTION takes two inputs, the current
value of the accumulator and an object from the input stream.
Example: If the input objects are integers 1, 2, 3, then the sum of the
integers is computed as follows:
... ^ agg 0 'sum, x: sum + x'
which yields 6.
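The ungrouped case behaves like a left fold. The following is a minimal plain-Python sketch of that behavior, not osh's implementation; the helper name aggregate is hypothetical.

```python
from functools import reduce


def aggregate(inputs, initial_value, aggregator):
    """Fold every input into one accumulator, starting from initial_value."""
    # reduce calls aggregator(accumulator, item) for each item, left to right.
    return reduce(aggregator, inputs, initial_value)


# Mirrors: ... ^ agg 0 'sum, x: sum + x'
total = aggregate([1, 2, 3], 0, lambda acc, x: acc + x)
print(total)  # 6
```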
If GROUPING_FUNCTION is specified, then a set of accumulators is
maintained, one for each value of GROUPING_FUNCTION. Each output object is
a tuple with two parts, the group value and the accumulated value for the
group.
Example: If the input objects are ('a', 1), ('a', 2), ('b', 3), ('b', 4),
then the sum of ints for each string is computed as follows:
... ^ agg -g 'x, y: x' 0 'sum, x, y: sum + y'
which yields ('a', 3), ('b', 7).
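Grouped aggregation can be modeled as one accumulator per group value, held in a dictionary. This is a sketch under assumptions: the helper name aggregate_by_group is hypothetical, and the output order shown (first appearance of each group) is a property of the Python dict, not something the text above specifies for agg.

```python
def aggregate_by_group(inputs, group, initial_value, aggregator):
    """Keep one accumulator per group value; return (group value, aggregate) pairs."""
    accumulators = {}  # group value -> accumulator; insertion order is preserved
    for row in inputs:
        key = group(*row)
        # A group seen for the first time starts from initial_value.
        acc = accumulators.get(key, initial_value)
        accumulators[key] = aggregator(acc, *row)
    return list(accumulators.items())


# Mirrors: ... ^ agg -g 'x, y: x' 0 'sum, x, y: sum + y'
rows = [('a', 1), ('a', 2), ('b', 3), ('b', 4)]
result = aggregate_by_group(rows, lambda x, y: x, 0, lambda s, x, y: s + y)
print(result)  # [('a', 3), ('b', 7)]
```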
If the grouping function is specified with the -g flag, then agg generates
its output only when the input stream has ended. (It has to, because group
members may appear in any order.) In some situations, however, group
members appear consecutively, and it is useful to get output earlier. If
group members are known to be consecutive, then the grouping function can
be specified using the -c flag instead.
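The reason consecutive groups allow earlier output can be sketched with a generator that holds a single accumulator and emits it as soon as the group value changes. This is an illustrative model only; the name aggregate_consecutive is hypothetical, and the behavior shown for non-consecutive input belongs to this sketch, not necessarily to agg.

```python
def aggregate_consecutive(inputs, group, initial_value, aggregator):
    """Yield (group value, aggregate) as soon as each run of equal group values ends."""
    missing = object()  # sentinel meaning "no group seen yet"
    current_key = missing
    acc = initial_value
    for row in inputs:
        key = group(*row)
        if current_key is not missing and key != current_key:
            # The previous run is complete, so its result can be emitted now.
            yield (current_key, acc)
            acc = initial_value
        current_key = key
        acc = aggregator(acc, *row)
    if current_key is not missing:
        # Emit the final run when the input ends.
        yield (current_key, acc)


rows = [('a', 1), ('a', 2), ('b', 3), ('b', 4)]
stream = aggregate_consecutive(iter(rows), lambda x, y: x, 0, lambda s, x, y: s + y)
first = next(stream)  # available after reading ('b', 3), before the input ends
print(first)          # ('a', 3)
print(list(stream))   # [('b', 7)]
```

In this sketch, input whose group members are not adjacent produces one output per run rather than one per group, which is why the consecutive form should only be used when adjacency is known.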
If the -r flag is specified, then one output object is generated for each
input object; the output object contains the value of the accumulator so
far. The accumulator appears in the output row before the inputs. For
example, if the input stream contains 1, 2, 3, then the running total can
be computed as follows:
... ^ agg -r 0 'sum, x: sum + x' ^ ...
The output stream would be (1, 1), (3, 2), (6, 3). In the last output
object, 6 is the sum of the current input (3) and all preceding inputs
(1, 2).
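The running form can be pictured as a generator that emits the accumulator alongside each input. The sketch below is plain Python with a hypothetical helper name, running_aggregate, and is not taken from osh.

```python
def running_aggregate(inputs, initial_value, aggregator):
    """Yield (accumulator so far, input) for every input object."""
    acc = initial_value
    for x in inputs:
        acc = aggregator(acc, x)
        # The accumulator comes first in the output row, then the input.
        yield (acc, x)


# Mirrors: ... ^ agg -r 0 'sum, x: sum + x' ^ ...
running = list(running_aggregate([1, 2, 3], 0, lambda s, x: s + x))
print(running)  # [(1, 1), (3, 2), (6, 3)]
```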
The -r flag can also be used with grouping. For example, if the input
objects are ('a', 1), ('a', 2), ('b', 3), ('b', 4), then the running
totals for the strings would be computed as follows:
... ^ agg -r -g 'x, y: x' 0 'sum, x, y: sum + y' ^ ...
The output stream would be (1, 'a', 1), (3, 'a', 2), (3, 'b', 3),
(7, 'b', 4).
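Combining running output with grouping amounts to keeping one accumulator per group and emitting the updated value with every input. A minimal sketch, assuming tuple inputs and a hypothetical helper name, running_aggregate_by_group:

```python
def running_aggregate_by_group(inputs, group, initial_value, aggregator):
    """Yield (group's accumulator so far, *input fields) for every input object."""
    accumulators = {}  # group value -> accumulator
    for row in inputs:
        key = group(*row)
        acc = aggregator(accumulators.get(key, initial_value), *row)
        accumulators[key] = acc
        # Prepend the running aggregate to the fields of the input row.
        yield (acc,) + tuple(row)


# Mirrors: ... ^ agg -r -g 'x, y: x' 0 'sum, x, y: sum + y' ^ ...
rows = [('a', 1), ('a', 2), ('b', 3), ('b', 4)]
out = list(running_aggregate_by_group(rows, lambda x, y: x, 0, lambda s, x, y: s + y))
print(out)  # [(1, 'a', 1), (3, 'a', 2), (3, 'b', 3), (7, 'b', 4)]
```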
agg(initial_value, aggregator, group=None, consecutive=None, running=False)
Combine inputs into a smaller number of outputs. If neither group nor
consecutive is specified, then there is one accumulator, initialized to
initial_value. The aggregator function is used to combine the current
value of the accumulator with the input to yield the next value of the
accumulator. The arguments to aggregator are the elements of the
accumulator followed by the elements of one piece of input.

If group is specified, then there is one accumulator for each group value,
defined by applying the function group to each input. consecutive is just
like group except that it is assumed that group values are adjacent in the
input sequence. At most one of group and consecutive may be specified.

If running is false, then the output contains one object per group,
containing the aggregate value. (If neither group nor consecutive is
provided, then there is just one group, representing the aggregate for the
entire input stream.) If running is true, then the aggregate value for the
group is written out with each input object, i.e., the output contains
"running totals". In this case, the aggregate values appear before the
input values in the output object.
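The rule that aggregator receives the elements of the accumulator followed by the elements of the input matters most when the accumulator has more than one element. The following plain-Python sketch illustrates that calling convention with a (count, total) accumulator. It assumes the accumulator and each input are tuples; how osh wraps scalar values is not modeled here, and the helper name aggregate_tuples is hypothetical.

```python
def aggregate_tuples(inputs, initial_value, aggregator):
    """Fold tuple inputs into a tuple accumulator, unpacking both for aggregator."""
    acc = tuple(initial_value)
    for row in inputs:
        # Accumulator elements come first, then the elements of the input.
        acc = tuple(aggregator(*acc, *row))
    return acc


# Track a count and a total in a two-element accumulator.
count, total = aggregate_tuples(
    [(1,), (2,), (3,)],
    (0, 0),
    lambda count, total, x: (count + 1, total + x),
)
print(count, total)  # 3 6
```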