Aggregate functions are functions which take multiple rows as inputs and produce just a single row of the result.
The following is a list of all available aggregate functions. They are divided into several sections:
AVG
, COUNT
, SUM
, MIN
, MAX
, ANY_VALUE
.BIT_AND
, BIT_OR
, BIT_XOR
.STRING_AGG
).BOOL_AND
, BOOL_OR
).AVG(expr: float64) -> float64
AVG(expr: timeperiod) -> timeperiod
Returns the average value of the expression over all inputs.
Return type
The function always returns a float64
result, except in case when the operand is a timeperiod
, when it returns a timeperiod
result.
Null handling
NULL
expression values are ignored. The result will be NULL
only if there are no non-null inputs.
Notes
In the case of a timeperiod
argument, the result will be truncated to the whole number of milliseconds.
If a higher precision is desired, the inputs can be type-casted to float64
.
Examples
SELECT AVG(x)
FROM (VALUES 10, 20, NULL::int64) AS t(x);
--> 15.0
SELECT AVG(p::timeperiod)
FROM (VALUES '3 minutes 20 seconds', '17 minutes', '1 hour', NULL::text) AS t(p);
--> 1606.666::timeperiod
COUNT(expr: T) -> int64
COUNT(*) -> int64
Returns the number of input rows for which the expression is not NULL
, or the total number of input rows.
Arguments
The expression that is counted can be of any type.
Notes
When there are no rows, or all the expressions evaluate to NULL
, the function will return 0
.
Examples
SELECT COUNT(*), COUNT(x)
FROM (VALUES 10, 20, NULL::int64) AS t(x);
--> +----------+----------+
--> | COUNT(*) | COUNT(x) |
--> +----------+----------+
--> | 3 | 2 |
--> +----------+----------+
SUM(expr: int64) -> int64
SUM(expr: float64) -> float64
SUM(expr: timeperiod) -> timeperiod
Returns the sum of values of the expression over all inputs.
Return type
When the expression has an integral type, the result of the function will be an int64
, and when the expression type is floating-point, the result will be a float64
.
Null handling
NULL
expression values are ignored. The result will be NULL
only if there are no non-null inputs.
Examples
SELECT SUM(x)
FROM (VALUES 10, 20, NULL::int64) AS t(x);
--> 30
SELECT SUM(x)
FROM (VALUES 5.4, -2.8, NULL::float64) AS t(x);
--> 2.6
SELECT SUM(p::timeperiod)
FROM (VALUES '3 minutes 20 seconds', '17 minutes', '1 hour', NULL::text) AS t(p);
--> 4820::timeperiod
MAX(expr: T) -> T
Returns the maximum value of the expression over all inputs.
Arguments
The input expression must be of a type supporting comparisons (i.e. any type except spatial types).
Return type
The result will have the same type as the input expression.
Null handling
NULL
expression values are ignored. The result will be NULL
only if there are no non-null inputs.
Examples
SELECT MAX(x)
FROM (VALUES 10, 20, NULL::int64) AS t(x);
--> 20
MIN(expr: T) -> T
Returns the minimum value of the expression over all inputs.
Arguments
The input expression must be of a type supporting comparisons (i.e. any type except spatial types).
Return type
The result will have the same type as the input expression.
Null handling
NULL
expression values are ignored. The result will be NULL
only if there are no non-null inputs.
Examples
SELECT MIN(x)
FROM (VALUES 10, 20, NULL::int64) AS t(x);
--> 10
ANY_VALUE(expr: T) -> T
Returns the value of the expression for an arbitrary row amongst the input rows.
Arguments
The input expression can be of any type.
Return type
The type of the result is the same as the type of the input expression.
Null handling
NULL
expression values are ignored. The result will be NULL
only if there are no non-null inputs.
Examples
SELECT ANY_VALUE(x)
FROM (VALUES 7, 11, NULL::int64) AS t(x);
--> 7 -- could also be 11
BIT_AND(expr: int64) -> int64
Returns the bitwise AND of all values of the expression over the inputs.
Null handling
NULL
expression values are ignored. The result will be NULL
only if there are no non-null inputs.
Examples
SELECT BIT_AND(x)
FROM (VALUES 7, 11, NULL::int64) AS t(x);
--> 3
See also
BIT_OR(expr: int64) -> int64
Returns the bitwise OR of all values of the expression over the inputs.
Null handling
NULL
expression values are ignored. The result will be NULL
only if there are no non-null inputs.
Examples
SELECT BIT_OR(x)
FROM (VALUES 7, 11, NULL::int64) AS t(x);
--> 15
See also
BIT_XOR(expr: int64) -> int64
Returns the bitwise XOR of all values of the expression over the inputs.
Null handling
NULL
expression values are ignored. The result will be NULL
only if there are no non-null inputs.
Examples
SELECT BIT_XOR(x)
FROM (VALUES 7, 11, NULL::int64) AS t(x);
--> 12
See also
STRING_AGG(expr: text, delimiter: text) -> text
Concatenates all the values of the expression over the inputs into a single string, placing a delimiter between every two values.
Null handling
NULL
expression values are ignored and no delimiter is placed after them. The result will be NULL
only when there are no non-null inputs.
Examples
SELECT STRING_AGG(x, ', ')
FROM (VALUES 'Hello', NULL::text, 'world!') AS t(x);
--> 'Hello, world!'
BOOL_AND(expr: bool) -> bool
Returns true when the expression evaluates to true for all inputs, otherwise returns false.
Null handling
NULL
expression values are ignored. The result will be NULL
only if there are no non-null inputs.
Examples
SELECT BOOL_AND(x)
FROM (VALUES true, false, NULL::bool) AS t(x);
--> false
See also
BOOL_OR(expr: bool) -> bool
Returns true when the expression evaluates to true for at least one input, otherwise returns false.
Null handling
NULL
expression values are ignored. The result will be NULL
only if there are no non-null inputs.
Examples
SELECT BOOL_OR(x)
FROM (VALUES true, false, NULL::bool) AS t(x);
--> true
CORR(y: float64, x: float64) -> float64
Returns the coefficient of correlation between values of two expression over all the inputs.
Null handling
Input rows for which either or both args are NULL
are ignored. The result will be NULL
only when there are no valid (non-null) inputs.
Notes
The result of the function is calculated by the formula:
COVAR_POP(y, x)
CORR(y, x) = --------------------------------------------------------
STDDEV_POP(IFNOTNULL(x, y)) * STDDEV_POP(IFNOTNULL(y, x))
Examples
SELECT CORR(y, x) FROM
(VALUES (2.25, 4.5), (3.75, 6.75), (4.5, 6.15), (5, NULL::float64)) AS t(x, y);
--> 0.8287694018176
See also
COVAR_POP(y: float64, x: float64) -> float64
Returns the population covariance between values of two expression over all the inputs.
Null handling
Input rows for which either or both args are NULL
are ignored. The result will be NULL
only when there are no valid (non-null) inputs.
Notes
The result of the function is calculated by the formula:
SUM(IFNOTNULL(x, y)) * SUM(IFNOTNULL(y, x))
SUM(y * x) - -------------------------------------------
REGR_COUNT(y, x)
COVAR_POP(y, x) = ----------------------------------------------------------
REGR_COUNT(y, x)
Examples
SELECT COVAR_POP(y, x)
FROM (VALUES (2.25, 4.5), (3.75, 6.75), (4.5, 6.15), (5, NULL::float64)) AS t(x, y);
--> 0.7375
See also
COVAR_SAMP(y: float64, x: float64) -> float64
Returns the sample covariance between values of two expression over all the inputs.
Null handling
Input rows for which either or both args are NULL
are ignored. The result will be NULL
only when there are no valid (non-null) inputs.
Notes
The result of the function is calculated by the formula:
SUM(IFNOTNULL(x, y)) * SUM(IFNOTNULL(y, x))
SUM(y * x) - -------------------------------------------
REGR_COUNT(y, x)
COVAR_SAMP(y, x) = ---------------------------------------------------------
REGR_COUNT(y, x) - 1
Examples
SELECT COVAR_SAMP(y, x)
FROM (VALUES (2.25, 4.5), (3.75, 6.75), (4.5, 6.15), (5, NULL::float64)) AS t(x, y);
--> 1.10625
See also
STDDEV_POP(expr: float64) -> float64
STDDEV_POP(expr: timeperiod) -> timeperiod
Returns the population standard deviation of values of the expression over all inputs.
Null handling
NULL
expression values are ignored. The result will be NULL
only if there are no non-null inputs.
Notes
The result of the function is computed by taking the square root of the VAR_POP function, i.e. STDDEV_POP(x) = SQRT(VAR_POP(x))
.
Examples
SELECT STDDEV_POP(x) FROM
(VALUES 25, 41, 28, NULL::int64) AS t(x);
--> 6.94422221866655
SELECT STDDEV_POP(p::timeperiod)
FROM (VALUES '3 minutes 20 seconds', '17 minutes', '1 hour', NULL::text) AS t(p);
--> 1448.708::timeperiod
See also
STDDEV_SAMP(expr: float64) -> float64
STDDEV_SAMP(expr: timeperiod) -> timeperiod
Returns the sample standard deviation of values of the expression over all inputs.
Null handling
NULL
expression values are ignored. The result will be NULL
only if there are no non-null inputs.
Notes
The result of the function is computed by taking the square root of the VAR_SAMP function, i.e. STDDEV_SAMP(x) = SQRT(VAR_SAMP(x))
.
Examples
SELECT STDDEV_SAMP(x) FROM
(VALUES 25, 41, 28, NULL::int64) AS t(x);
--> 8.50490054811538
SELECT STDDEV_SAMP(p::timeperiod)
FROM (VALUES '3 minutes 20 seconds', '17 minutes', '1 hour', NULL::text) AS t(p);
--> 1774.297::timeperiod
See also
VAR_POP(expr: float64) -> float64
VAR_POP(expr: timeperiod) -> timeperiod
Returns the population variance of values of the expression over all inputs.
Null handling
NULL
expression values are ignored. The result will be NULL
only if there are no non-null inputs.
Notes
The result of the VAR_POP
function is calculated by the formula:
SUM(x) * SUM(x)
SUM(x * x) - ---------------
COUNT(x)
VAR_POP(x) = --------------------------------
COUNT(x)
Examples
SELECT VAR_POP(x) FROM
(VALUES 25, 41, 28, NULL::int64) AS t(x);
--> 48.22222222222222
SELECT VAR_POP(p::timeperiod)
FROM (VALUES '3 minutes 20 seconds', '17 minutes', '1 hour', NULL::text) AS t(p);
--> 2098755.555555::timeperiod
See also
VAR_SAMP(expr: float64) -> float64
VAR_SAMP(expr: timeperiod) -> timeperiod
Returns the sample variance of values of the expression over all inputs.
Null handling
NULL
expression values are ignored. The result will be NULL
only if there are no non-null inputs.
Notes
The result of the VAR_SAMP
function is calculated by the formula:
SUM(x) * SUM(x)
SUM(x * x) - ---------------
COUNT(x)
VAR_SAMP(x) = --------------------------------
COUNT(x) - 1
Examples
SELECT VAR_SAMP(x) FROM
(VALUES 25, 41, 28, NULL::int64) AS t(x);
--> 72.33333333333333
SELECT VAR_SAMP(p::timeperiod)
FROM (VALUES '3 minutes 20 seconds', '17 minutes', '1 hour', NULL::text) AS t(p);
--> 3148133.333333::timeperiod
The linear regression functions fit an ordinary-least-squares regression line to pairs of expression values over all the inputs.
REGR_AVGX(y: float64, x: float64) -> float64
Returns the average value of the independent variable x
over all valid inputs.
Null handling
Input rows for which either or both args are NULL
are ignored. The result will be NULL
only when there are no valid (non-null) inputs.
Notes
The result of the REGR_AVGX
function is calculated by the formula: REGR_AVGX(y, x) = AVG(IFNOTNULL(y, x))
.
Examples
SELECT REGR_AVGX(y, x) FROM
(VALUES (1, 3.5), (2, 4.25), (4.5, 7.25), (NULL::float64, 8), (10, NULL::float64)) AS t(x, y);
--> 2.5
See also
REGR_AVGY(y: float64, x: float64) -> float64
Returns the average value of the dependent variable y
over all valid inputs.
Null handling
Input rows for which either or both args are NULL
are ignored. The result will be NULL
only when there are no valid (non-null) inputs.
Notes
The result of the REGR_AVGY
function is calculated by the formula: REGR_AVGY(y, x) = AVG(IFNOTNULL(x, y))
.
Examples
SELECT REGR_AVGY(y, x) FROM
(VALUES (1, 3.5), (2, 4.25), (4.5, 7.25), (NULL::float64, 8), (10, NULL::float64)) AS t(x, y);
--> 5.0
See also
REGR_COUNT(y: float64, x: float64) -> int64
Returns the number of valid inputs used to fit the regression line.
Null handling
Input rows for which either or both args are NULL
are ignored. The result will be NULL
only when there are no valid (non-null) inputs.
Notes
The result of the REGR_COUNT
function is calculated by the formula:
REGR_COUNT(y, x) = SUM(IF(x IS NOT NULL AND y IS NOT NULL, 1, 0))
Examples
SELECT REGR_COUNT(y, x) FROM
(VALUES (1, 3.5), (2, 4.25), (4.5, 7.25), (NULL::float64, 8), (10, NULL::float64)) AS t(x, y);
--> 3
REGR_INTERCEPT(y: float64, x: float64) -> float64
Returns the y-intercept of the regression line fitted to the input values.
Null handling
Input rows for which either or both args are NULL
are ignored. The result will be NULL
only when there are no valid (non-null) inputs.
Notes
The result of the REGR_INTERCEPT
function is calculated by the formula:
REGR_INTERCEPT(y, x) = REGR_AVGY(y, x) - REGR_SLOPE(y, x) * REGR_AVGX(y, x)
Examples
SELECT REGR_INTERCEPT(y, x) FROM
(VALUES (1, 3.5), (2, 4.25), (4.5, 7.25), (NULL::float64, 8), (10, NULL::float64)) AS t(x, y);
--> 2.259615384615
See also
REGR_R2(y: float64, x: float64) -> float64
Returns the coefficient of determination (also called R-squared or goodness of fit) for the regression line fitted to the input values.
Null handling
Input rows for which either or both args are NULL
are ignored. The result will be NULL
only when there are no valid (non-null) inputs.
Notes
The result of the REGR_R2
function is calculated by the formula:
REGR_R2(y, x) = CASE
WHEN VAR_POP(IFNOTNULL(y, x)) = 0
THEN NULL
WHEN VAR_POP(IFNOTNULL(x, y)) = 0
THEN 1
ELSE POWER(CORR(y, x), 2)
END
Examples
SELECT REGR_R2(y, x) FROM
(VALUES (1, 3.5), (2, 4.25), (4.5, 7.25), (NULL::float64, 8), (10, NULL::float64)) AS t(x, y);
--> 0.9917582417582
REGR_SLOPE(y: float64, x: float64) -> float64
Returns the slope of the regression line fitted to the input values.
Null handling
Input rows for which either or both args are NULL
are ignored. The result will be NULL
only when there are no valid (non-null) inputs.
Notes
The result of the REGR_SLOPE
function is calculated by the formula:
COVAR_POP(y, x)
REGR_SLOPE(y, x) = ------------------------
VAR_POP(IFNOTNULL(y, x))
Examples
SELECT REGR_SLOPE(y, x) FROM
(VALUES (1, 3.5), (2, 4.25), (4.5, 7.25), (NULL::float64, 8), (10, NULL::float64)) AS t(x, y);
--> 1.096153846154
See also
REGR_SXX(y: float64, x: float64) -> float64
Returns the sum of squares of independent variable x
for all valid inputs.
Null handling
Input rows for which either or both args are NULL
are ignored. The result will be NULL
only when there are no valid (non-null) inputs.
Notes
The result of the REGR_SXX
function is calculated by the formula: REGR_SXX(y, x) = REGR_COUNT(y, x) * VAR_POP(IFNOTNULL(y, x))
.
Examples
SELECT REGR_SXX(y, x) FROM
(VALUES (1, 3.5), (2, 4.25), (4.5, 7.25), (NULL::float64, 8), (10, NULL::float64)) AS t(x, y);
--> 6.5
See also
REGR_SXY(y: float64, x: float64) -> float64
Returns the sum of squares of the product between the independent variable x
and dependent variable y
for all valid inputs.
Null handling
Input rows for which either or both args are NULL
are ignored. The result will be NULL
only when there are no valid (non-null) inputs.
Notes
The result of the REGR_SXY
function is calculated by the formula: REGR_SXY(y, x) = REGR_COUNT(y, x) * COVAR_POP(y, x)
.
Examples
SELECT REGR_SXY(y, x) FROM
(VALUES (1, 3.5), (2, 4.25), (4.5, 7.25), (NULL::float64, 8), (10, NULL::float64)) AS t(x, y);
--> 7.125
See also
REGR_SYY(y: float64, x: float64) -> float64
Returns the sum of squares of dependent variable y
for all valid inputs.
Null handling
Input rows for which either or both args are NULL
are ignored. The result will be NULL
only when there are no valid (non-null) inputs.
Notes
The result of the REGR_SYY
function is calculated by the formula: REGR_SYY(y, x) = REGR_COUNT(y, x) * VAR_POP(IFNOTNULL(x, y))
.
Examples
SELECT REGR_SYY(y, x) FROM
(VALUES (1, 3.5), (2, 4.25), (4.5, 7.25), (NULL::float64, 8), (10, NULL::float64)) AS t(x, y);
--> 7.875
See also