聚合

A key feature of analytics is reducing many values down to a summary. This act is called “aggregation” and always includes a function — for example, average or sum — that reduces values in the table to a single row.

`aggregate` 變換

The aggregate transform takes a tuple to create one or more new columns that “distill down” data from all the rows.

from invoices
aggregate { sum_of_orders = sum total }

上面的查詢計算 invoices 表所有行的 total 列的總和，以產生單一值。

當元組中包含一個或多個聚合表達式時，aggregate 可以一次產生多個摘要。aggregate 會捨棄元組中不存在的所有列。

from invoices
aggregate {
    num_orders = count this,
    sum_of_orders = sum total,
}

在上面的示例中，結果是有兩列的單一行。count 函式顯示傳入的表中的行數；sum 函式將所有行的 total 列的值相加。

分組

假設我們想要產生表中 每個城市 發票的摘要。我們可以為每個城市建立一個查詢，並聚合其行：

from invoices
filter billing_city == "Oslo"
aggregate { sum_of_orders = sum total }

但我們需要為每個城市做這種操作：London、Frankfurt 等。當然這是重複的（且令人厭倦的）且容易出錯（因為我們需要手動輸入每個 billing_city）。此外，我們需要在開始前建立每個 billing_city 的列表。

`group` 變換

The group transform separates the table into groups (say, those having the same city) using information that’s already in the table. It then applies a transform to each group, and combines the results back together:

from invoices
group billing_city (
    aggregate {
        num_orders = count this,
        sum_of_orders = sum total,
    }
)

熟悉 SQL 的人可能已經注意到，我們剛才將聚合與分組分離了。

雖然這些操作在 SQL 中是相關的，但 PRQL 使得可以直接將 group 和 aggregate分開使用，同時與其他變換函式結合，例如：

from invoices
group billing_city (
    take 2
)

This code collects the first two rows for each city’s group.

Keyboard shortcuts

PRQL language book

聚合

aggregate 變換

分組

group 變換

`aggregate` 變換

`group` 變換