Aggregation

Sometimes you don't want every matching row — you want a summary. FlowLog supports aggregate functions in rule heads for exactly this.

Syntax

Wrap an expression in an aggregate function in the rule head:

Head(group_key, agg_op(value_expr)) :- Body(...).

Columns not wrapped in an aggregate become group-by keys automatically. If you've used SQL's GROUP BY, it's the same idea, except that the grouping is implicit.
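To make the implicit grouping concrete, here is a rough model of it in plain Python. It is only a sketch of the semantics described above, not how FlowLog evaluates rules. The helper name aggregate, the sample Arc facts, and the MinDest rule in the comment are all made up for illustration.

```python
from collections import defaultdict

def aggregate(rows, key_of, value_of, op):
    """Group rows by key_of(row), then fold value_of(row) with op per group."""
    groups = defaultdict(list)
    for row in rows:
        groups[key_of(row)].append(value_of(row))
    return {key: op(values) for key, values in groups.items()}

# Hypothetical Arc(a, b) facts.
arc = [(1, 2), (1, 3), (2, 3)]

# Models a rule shaped like: MinDest(a, min(b)) :- Arc(a, b).
# Column a is not wrapped in an aggregate, so it acts as the group-by key.
min_dest = aggregate(arc, key_of=lambda r: r[0], value_of=lambda r: r[1], op=min)
print(min_dest)  # {1: 2, 2: 3}
```

Swapping op=min for op=sum gives the per-key total instead, with the same grouping.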

Supported operators

Operator         What it computes
min / MIN        Minimum value
max / MAX        Maximum value
sum / SUM        Sum of values
count / COUNT    Number of tuples
average / AVG    Average value

Case doesn't matter — min and MIN are the same.
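As a rough illustration, the table can be read as a case-insensitive lookup from operator name to a fold over a group's values. This Python sketch is an assumption-laden model, not FlowLog's implementation. In particular, the docs above don't say how averages over integer columns are rounded or how duplicate tuples are counted, so the sketch simply uses Python's own behavior. The names AGGREGATES and lookup are hypothetical.

```python
from statistics import mean

# Operator spellings follow the table above; the Python functions are stand-ins.
AGGREGATES = {
    "min": min,
    "max": max,
    "sum": sum,
    "count": len,      # assumption: counts every value in the group
    "average": mean,   # assumption: Python's arithmetic mean
    "avg": mean,
}

def lookup(name):
    """Resolve an operator name without regard to case."""
    return AGGREGATES[name.lower()]

values = [4, 1, 7]
print(lookup("MIN")(values))    # 1
print(lookup("count")(values))  # 3
```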

Examples

Connected components

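Label each node with its component, identified by the smallest node id that reaches it. The rules follow arcs in one direction only, so list each undirected edge in both directions in Arc.csv: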

.decl Arc(a: int32, b: int32)
.input Arc(IO="file", filename="Arc.csv", delimiter=",")

.decl CC(node: int32, cc: int32)
.printsize CC

CC(node, min(node)) :- Arc(node, _).
CC(node, min(cc)) :- Arc(other, node), CC(other, cc).

node is the group-by key; min(...) picks the smallest label across all matches.
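If it helps to see the two rules as a loop, here is a minimal Python sketch of the same min-label propagation, run until nothing changes. It models what the rules compute, not how FlowLog executes them. The function name connected_components and the sample arcs are made up for illustration, and each undirected edge is listed in both directions.

```python
def connected_components(arcs):
    # Rule 1: CC(node, min(node)) :- Arc(node, _).
    cc = {a: a for a, _ in arcs}
    changed = True
    while changed:
        changed = False
        # Rule 2: CC(node, min(cc)) :- Arc(other, node), CC(other, cc).
        for other, node in arcs:
            label = cc[other]  # every arc source was seeded by rule 1
            if label < cc.get(node, float("inf")):
                cc[node] = label  # keep only the smallest label per node
                changed = True
    return cc

# Hypothetical graph: edges 1-2, 2-3 and 4-5, stored in both directions.
arcs = [(1, 2), (2, 1), (2, 3), (3, 2), (4, 5), (5, 4)]
labels = connected_components(arcs)
print(labels)  # {1: 1, 2: 1, 3: 1, 4: 4, 5: 4}
```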

Single-source shortest paths

Compute the shortest distance from the seed nodes in id to every node they can reach:

.decl arc(src: int32, dest: int32, weight: int32)
.decl id(src: int32)
.decl sssp(node: int32, dist: int32)

.input arc(IO="file", filename="Arc.csv", delimiter=",")
.input id(IO="file", filename="Id.csv", delimiter=",")

sssp(x, min(0)) :- id(x).
sssp(y, min(d1 + d2)) :- sssp(x, d1), arc(x, y, d2).

.printsize sssp

The aggregate argument can be an arithmetic expression — min(d1 + d2) picks the shortest combined distance.
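The same two rules can be sketched in Python as repeated relaxation until no distance improves. This is an illustrative model of the result, not FlowLog's evaluation strategy. It assumes there are no negative-weight cycles, since otherwise the loop would never settle. The function name shortest_paths and the sample data are hypothetical.

```python
def shortest_paths(arcs, seeds):
    # Rule 1: sssp(x, min(0)) :- id(x).
    dist = {x: 0 for x in seeds}
    changed = True
    while changed:
        changed = False
        # Rule 2: sssp(y, min(d1 + d2)) :- sssp(x, d1), arc(x, y, d2).
        for x, y, d2 in arcs:
            if x in dist and dist[x] + d2 < dist.get(y, float("inf")):
                dist[y] = dist[x] + d2  # min keeps the shortest total so far
                changed = True
    return dist

# Hypothetical arc(src, dest, weight) facts and a single seed node.
arcs = [(1, 2, 4), (1, 3, 1), (3, 2, 2), (2, 4, 5)]
dist = shortest_paths(arcs, seeds=[1])
print(dist)  # {1: 0, 2: 3, 3: 1, 4: 8}
```

Node 2 ends up at distance 3 rather than 4 because the path through node 3 (1 + 2) beats the direct arc.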

Aggregation and stratification

Aggregated relations participate in FlowLog's stratification analysis, just like negation. The compiler makes sure each aggregate is fully computed before anything downstream reads it.