MSQL Reference – Aggregate Functions

Aggregate functions are functions which take multiple rows as inputs and produce just a single row of the result.
The following is a list of all available aggregate functions. They are divided into several sections:

Basic Aggregate Functions - AVG, COUNT, SUM, MIN, MAX, ANY_VALUE.
Bitwise Aggregate Functions - Aggregate functions for doing bitwise operations: BIT_AND, BIT_OR, BIT_XOR.
String Aggregate Functions - Aggregate functions returning strings (currently only STRING_AGG).
Boolean Aggregate Functions - Functions for aggregating Boolean values: BOOL_AND, BOOL_OR).
Statistical Aggregate Functions - Aggregate functions for computing variances, covariances, standard deviations and correlations..
Linear Regression Aggregate Functions - Aggregate functions for fitting the regression line to a set of input.

Basic Aggregate Functions

agg AVG

AVG(expr: float64) -> float64

AVG(expr: timeperiod) -> timeperiod

Returns the average value of the expression over all inputs.

Return type

The function always returns a float64 result, except in case when the operand is a timeperiod, when it returns a timeperiod result.

Null handling

NULL expression values are ignored. The result will be NULL only if there are no non-null inputs.

Notes

In the case of a timeperiod argument, the result will be truncated to the whole number of milliseconds. If a higher precision is desired, the inputs can be type-casted to float64.

Examples

SELECT AVG(x)
  FROM (VALUES 10, 20, NULL::int64) AS t(x);

--> 15.0

SELECT AVG(p::timeperiod)
  FROM (VALUES '3 minutes 20 seconds', '17 minutes', '1 hour', NULL::text) AS t(p);

--> 1606.666::timeperiod

agg COUNT

COUNT(expr: T) -> int64

COUNT(*) -> int64

Returns the number of input rows for which the expression is not NULL, or the total number of input rows.

Arguments

The expression that is counted can be of any type.

Notes

When there are no rows, or all the expressions evaluate to NULL, the function will return 0.

Examples

SELECT COUNT(*), COUNT(x)
  FROM (VALUES 10, 20, NULL::int64) AS t(x);

--> +----------+----------+
--> | COUNT(*) | COUNT(x) |
--> +----------+----------+
--> |        3 |        2 |
--> +----------+----------+

agg SUM

SUM(expr: int64) -> int64

SUM(expr: float64) -> float64

SUM(expr: timeperiod) -> timeperiod

Returns the sum of values of the expression over all inputs.

Return type

When the expression has an integral type, the result of the function will be an int64, and when the expression type is floating-point, the result will be a float64.

Null handling

NULL expression values are ignored. The result will be NULL only if there are no non-null inputs.

Examples

SELECT SUM(x)
  FROM (VALUES 10, 20, NULL::int64) AS t(x);

--> 30

SELECT SUM(x)
  FROM (VALUES 5.4, -2.8, NULL::float64) AS t(x);

--> 2.6

SELECT SUM(p::timeperiod)
  FROM (VALUES '3 minutes 20 seconds', '17 minutes', '1 hour', NULL::text) AS t(p);

--> 4820::timeperiod

agg MAX

MAX(expr: T) -> T

Returns the maximum value of the expression over all inputs.

Arguments

The input expression must be of a type supporting comparisons (i.e. any type except spatial types).

Return type

The result will have the same type as the input expression.

Null handling

NULL expression values are ignored. The result will be NULL only if there are no non-null inputs.

Examples

SELECT MAX(x)
  FROM (VALUES 10, 20, NULL::int64) AS t(x);

--> 20

agg MIN

MIN(expr: T) -> T

Returns the minimum value of the expression over all inputs.

Arguments

The input expression must be of a type supporting comparisons (i.e. any type except spatial types).

Return type

The result will have the same type as the input expression.

Null handling

NULL expression values are ignored. The result will be NULL only if there are no non-null inputs.

Examples

SELECT MIN(x)
  FROM (VALUES 10, 20, NULL::int64) AS t(x);

--> 10

agg ANY_VALUE

ANY_VALUE(expr: T) -> T

Returns the value of the expression for an arbitrary row amongst the input rows.

Arguments

The input expression can be of any type.

Return type

The type of the result is the same as the type of the input expression.

Null handling

NULL expression values are ignored. The result will be NULL only if there are no non-null inputs.

Examples

SELECT ANY_VALUE(x)
  FROM (VALUES 7, 11, NULL::int64) AS t(x);

--> 7  -- could also be 11

Bitwise Aggregate Functions

agg BIT_AND

BIT_AND(expr: int64) -> int64

Returns the bitwise AND of all values of the expression over the inputs.

Null handling

NULL expression values are ignored. The result will be NULL only if there are no non-null inputs.

Examples

SELECT BIT_AND(x)
  FROM (VALUES 7, 11, NULL::int64) AS t(x);

--> 3

See also

BIT_OR BIT_XOR

agg BIT_OR

BIT_OR(expr: int64) -> int64

Returns the bitwise OR of all values of the expression over the inputs.

Null handling

NULL expression values are ignored. The result will be NULL only if there are no non-null inputs.

Examples

SELECT BIT_OR(x)
  FROM (VALUES 7, 11, NULL::int64) AS t(x);

--> 15

See also

BIT_AND BIT_XOR

agg BIT_XOR

BIT_XOR(expr: int64) -> int64

Returns the bitwise XOR of all values of the expression over the inputs.

Null handling

NULL expression values are ignored. The result will be NULL only if there are no non-null inputs.

Examples

SELECT BIT_XOR(x)
  FROM (VALUES 7, 11, NULL::int64) AS t(x);

--> 12

See also

BIT_AND BIT_XOR

String Aggregate Functions

agg STRING_AGG

STRING_AGG(expr: text, delimiter: text) -> text

Concatenates all the values of the expression over the inputs into a single string, placing a delimiter between every two values.

Null handling

NULL expression values are ignored and no delimiter is placed after them. The result will be NULL only when there are no non-null inputs.

Examples

SELECT STRING_AGG(x, ', ')
  FROM (VALUES 'Hello', NULL::text, 'world!') AS t(x);

--> 'Hello, world!'

Boolean Aggregate Functions

agg BOOL_AND

BOOL_AND(expr: bool) -> bool

Returns true when the expression evaluates to true for all inputs, otherwise returns false.

Null handling

NULL expression values are ignored. The result will be NULL only if there are no non-null inputs.

Examples

SELECT BOOL_AND(x)
  FROM (VALUES true, false, NULL::bool) AS t(x);

--> false

See also

BOOL_OR

agg BOOL_OR

BOOL_OR(expr: bool) -> bool

Returns true when the expression evaluates to true for at least one input, otherwise returns false.

Null handling

NULL expression values are ignored. The result will be NULL only if there are no non-null inputs.

Examples

SELECT BOOL_OR(x)
  FROM (VALUES true, false, NULL::bool) AS t(x);

--> true

Statistical Aggregate Functions

agg CORR

CORR(y: float64, x: float64) -> float64

Returns the coefficient of correlation between values of two expression over all the inputs.

Null handling

Input rows for which either or both args are NULL are ignored. The result will be NULL only when there are no valid (non-null) inputs.

Notes

The result of the function is calculated by the formula:

                                  COVAR_POP(y, x)
CORR(y, x) = --------------------------------------------------------
             STDDEV_POP(IFNOTNULL(x, y)) * STDDEV_POP(IFNOTNULL(y, x))

Examples

SELECT CORR(y, x) FROM
  (VALUES (2.25, 4.5), (3.75, 6.75), (4.5, 6.15), (5, NULL::float64)) AS t(x, y);

--> 0.8287694018176

See also

COVAR_POP STDDEV_POP

agg COVAR_POP

COVAR_POP(y: float64, x: float64) -> float64

Returns the population covariance between values of two expression over all the inputs.

Null handling

Input rows for which either or both args are NULL are ignored. The result will be NULL only when there are no valid (non-null) inputs.

Notes

The result of the function is calculated by the formula:

                               SUM(IFNOTNULL(x, y)) * SUM(IFNOTNULL(y, x))
                  SUM(y * x) - -------------------------------------------
                                             REGR_COUNT(y, x)
COVAR_POP(y, x) = ----------------------------------------------------------
                                      REGR_COUNT(y, x)

Examples

SELECT COVAR_POP(y, x)
  FROM (VALUES (2.25, 4.5), (3.75, 6.75), (4.5, 6.15), (5, NULL::float64)) AS t(x, y);

--> 0.7375

See also

COVAR_SAMP STDDEV_POP

agg COVAR_SAMP

COVAR_SAMP(y: float64, x: float64) -> float64

Returns the sample covariance between values of two expression over all the inputs.

Null handling

Input rows for which either or both args are NULL are ignored. The result will be NULL only when there are no valid (non-null) inputs.

Notes

The result of the function is calculated by the formula:

                                SUM(IFNOTNULL(x, y)) * SUM(IFNOTNULL(y, x))
                   SUM(y * x) - -------------------------------------------
                                              REGR_COUNT(y, x)
COVAR_SAMP(y, x) = ---------------------------------------------------------
                                    REGR_COUNT(y, x) - 1

Examples

SELECT COVAR_SAMP(y, x)
  FROM (VALUES (2.25, 4.5), (3.75, 6.75), (4.5, 6.15), (5, NULL::float64)) AS t(x, y);

--> 1.10625

See also

COVAR_POP STDDEV_SAMP

agg STDDEV_POP

STDDEV_POP(expr: float64) -> float64

STDDEV_POP(expr: timeperiod) -> timeperiod

Returns the population standard deviation of values of the expression over all inputs.

Null handling

NULL expression values are ignored. The result will be NULL only if there are no non-null inputs.

Notes

The result of the function is computed by taking the square root of the VAR_POP function, i.e. STDDEV_POP(x) = SQRT(VAR_POP(x)).

Examples

SELECT STDDEV_POP(x) FROM
  (VALUES 25, 41, 28, NULL::int64) AS t(x);

--> 6.94422221866655

SELECT STDDEV_POP(p::timeperiod)
  FROM (VALUES '3 minutes 20 seconds', '17 minutes', '1 hour', NULL::text) AS t(p);

--> 1448.708::timeperiod

See also

VAR_POP STDDEV_SAMP

agg STDDEV_SAMP

STDDEV_SAMP(expr: float64) -> float64

STDDEV_SAMP(expr: timeperiod) -> timeperiod

Returns the sample standard deviation of values of the expression over all inputs.

Null handling

NULL expression values are ignored. The result will be NULL only if there are no non-null inputs.

Notes

The result of the function is computed by taking the square root of the VAR_SAMP function, i.e. STDDEV_SAMP(x) = SQRT(VAR_SAMP(x)).

Examples

SELECT STDDEV_SAMP(x) FROM
  (VALUES 25, 41, 28, NULL::int64) AS t(x);

--> 8.50490054811538

SELECT STDDEV_SAMP(p::timeperiod)
  FROM (VALUES '3 minutes 20 seconds', '17 minutes', '1 hour', NULL::text) AS t(p);

--> 1774.297::timeperiod

See also

VAR_SAMP STDDEV_POP

agg VAR_POP

VAR_POP(expr: float64) -> float64

VAR_POP(expr: timeperiod) -> timeperiod

Returns the population variance of values of the expression over all inputs.

Null handling

NULL expression values are ignored. The result will be NULL only if there are no non-null inputs.

Notes

The result of the VAR_POP function is calculated by the formula:

                            SUM(x) * SUM(x)
               SUM(x * x) - ---------------
                                 COUNT(x)
VAR_POP(x) = --------------------------------
                        COUNT(x)

Examples

SELECT VAR_POP(x) FROM
  (VALUES 25, 41, 28, NULL::int64) AS t(x);

--> 48.22222222222222

SELECT VAR_POP(p::timeperiod)
  FROM (VALUES '3 minutes 20 seconds', '17 minutes', '1 hour', NULL::text) AS t(p);

--> 2098755.555555::timeperiod

See also

STDDEV_POP VAR_SAMP

agg VAR_SAMP

VAR_SAMP(expr: float64) -> float64

VAR_SAMP(expr: timeperiod) -> timeperiod

Returns the sample variance of values of the expression over all inputs.

Null handling

NULL expression values are ignored. The result will be NULL only if there are no non-null inputs.

Notes

The result of the VAR_SAMP function is calculated by the formula:

                             SUM(x) * SUM(x)
                SUM(x * x) - ---------------
                                  COUNT(x)
VAR_SAMP(x) = --------------------------------
                       COUNT(x) - 1

Examples

SELECT VAR_SAMP(x) FROM
  (VALUES 25, 41, 28, NULL::int64) AS t(x);

--> 72.33333333333333

SELECT VAR_SAMP(p::timeperiod)
  FROM (VALUES '3 minutes 20 seconds', '17 minutes', '1 hour', NULL::text) AS t(p);

--> 3148133.333333::timeperiod

Linear Regression Aggregate Functions

The linear regression functions fit an ordinary-least-squares regression line to pairs of expression values over all the inputs.

agg REGR_AVGX

REGR_AVGX(y: float64, x: float64) -> float64

Returns the average value of the independent variable x over all valid inputs.

Null handling

Input rows for which either or both args are NULL are ignored. The result will be NULL only when there are no valid (non-null) inputs.

Notes

The result of the REGR_AVGX function is calculated by the formula: REGR_AVGX(y, x) = AVG(IFNOTNULL(y, x)).

Examples

SELECT REGR_AVGX(y, x) FROM
  (VALUES (1, 3.5), (2, 4.25), (4.5, 7.25), (NULL::float64, 8), (10, NULL::float64)) AS t(x, y);

--> 2.5

See also

REGR_AVGY

agg REGR_AVGY

REGR_AVGY(y: float64, x: float64) -> float64

Returns the average value of the dependent variable y over all valid inputs.

Null handling

Input rows for which either or both args are NULL are ignored. The result will be NULL only when there are no valid (non-null) inputs.

Notes

The result of the REGR_AVGY function is calculated by the formula: REGR_AVGY(y, x) = AVG(IFNOTNULL(x, y)).

Examples

SELECT REGR_AVGY(y, x) FROM
  (VALUES (1, 3.5), (2, 4.25), (4.5, 7.25), (NULL::float64, 8), (10, NULL::float64)) AS t(x, y);

--> 5.0

See also

REGR_AVGX

agg REGR_COUNT

REGR_COUNT(y: float64, x: float64) -> int64

Returns the number of valid inputs used to fit the regression line.

Null handling

Input rows for which either or both args are NULL are ignored. The result will be NULL only when there are no valid (non-null) inputs.

Notes

The result of the REGR_COUNT function is calculated by the formula:

REGR_COUNT(y, x) = SUM(IF(x IS NOT NULL AND y IS NOT NULL, 1, 0))

Examples

SELECT REGR_COUNT(y, x) FROM
  (VALUES (1, 3.5), (2, 4.25), (4.5, 7.25), (NULL::float64, 8), (10, NULL::float64)) AS t(x, y);

--> 3

agg REGR_INTERCEPT

REGR_INTERCEPT(y: float64, x: float64) -> float64

Returns the y-intercept of the regression line fitted to the input values.

Null handling

Input rows for which either or both args are NULL are ignored. The result will be NULL only when there are no valid (non-null) inputs.

Notes

The result of the REGR_INTERCEPT function is calculated by the formula:

REGR_INTERCEPT(y, x) = REGR_AVGY(y, x) - REGR_SLOPE(y, x) * REGR_AVGX(y, x)

Examples

SELECT REGR_INTERCEPT(y, x) FROM
  (VALUES (1, 3.5), (2, 4.25), (4.5, 7.25), (NULL::float64, 8), (10, NULL::float64)) AS t(x, y);

--> 2.259615384615

See also

REGR_SLOPE

agg REGR_R2

REGR_R2(y: float64, x: float64) -> float64

Returns the coefficient of determination (also called R-squared or goodness of fit) for the regression line fitted to the input values.

Null handling

Input rows for which either or both args are NULL are ignored. The result will be NULL only when there are no valid (non-null) inputs.

Notes

The result of the REGR_R2 function is calculated by the formula:

REGR_R2(y, x) = CASE
                     WHEN VAR_POP(IFNOTNULL(y, x)) = 0
                     THEN NULL
                     WHEN VAR_POP(IFNOTNULL(x, y)) = 0
                     THEN 1
                     ELSE POWER(CORR(y, x), 2)
                END

Examples

SELECT REGR_R2(y, x) FROM
  (VALUES (1, 3.5), (2, 4.25), (4.5, 7.25), (NULL::float64, 8), (10, NULL::float64)) AS t(x, y);

--> 0.9917582417582

agg REGR_SLOPE

REGR_SLOPE(y: float64, x: float64) -> float64

Returns the slope of the regression line fitted to the input values.

Null handling

Input rows for which either or both args are NULL are ignored. The result will be NULL only when there are no valid (non-null) inputs.

Notes

The result of the REGR_SLOPE function is calculated by the formula:

                        COVAR_POP(y, x)
REGR_SLOPE(y, x) = ------------------------
                   VAR_POP(IFNOTNULL(y, x))

Examples

SELECT REGR_SLOPE(y, x) FROM
  (VALUES (1, 3.5), (2, 4.25), (4.5, 7.25), (NULL::float64, 8), (10, NULL::float64)) AS t(x, y);

--> 1.096153846154

See also

REGR_INTERCEPT

agg REGR_SXX

REGR_SXX(y: float64, x: float64) -> float64

Returns the sum of squares of independent variable x for all valid inputs.

Null handling

Input rows for which either or both args are NULL are ignored. The result will be NULL only when there are no valid (non-null) inputs.

Notes

The result of the REGR_SXX function is calculated by the formula: REGR_SXX(y, x) = REGR_COUNT(y, x) * VAR_POP(IFNOTNULL(y, x)).

Examples

SELECT REGR_SXX(y, x) FROM
  (VALUES (1, 3.5), (2, 4.25), (4.5, 7.25), (NULL::float64, 8), (10, NULL::float64)) AS t(x, y);

--> 6.5

See also

REGR_SXY REGR_SYY

agg REGR_SXY

REGR_SXY(y: float64, x: float64) -> float64

Returns the sum of squares of the product between the independent variable x and dependent variable y for all valid inputs.

Null handling

Input rows for which either or both args are NULL are ignored. The result will be NULL only when there are no valid (non-null) inputs.

Notes

The result of the REGR_SXY function is calculated by the formula: REGR_SXY(y, x) = REGR_COUNT(y, x) * COVAR_POP(y, x).

Examples

SELECT REGR_SXY(y, x) FROM
  (VALUES (1, 3.5), (2, 4.25), (4.5, 7.25), (NULL::float64, 8), (10, NULL::float64)) AS t(x, y);

--> 7.125

See also

REGR_SXX REGR_SYY

agg REGR_SYY

REGR_SYY(y: float64, x: float64) -> float64

Returns the sum of squares of dependent variable y for all valid inputs.

Null handling

Input rows for which either or both args are NULL are ignored. The result will be NULL only when there are no valid (non-null) inputs.

Notes

The result of the REGR_SYY function is calculated by the formula: REGR_SYY(y, x) = REGR_COUNT(y, x) * VAR_POP(IFNOTNULL(x, y)).

Examples

SELECT REGR_SYY(y, x) FROM
  (VALUES (1, 3.5), (2, 4.25), (4.5, 7.25), (NULL::float64, 8), (10, NULL::float64)) AS t(x, y);

--> 7.875

See also

REGR_SXX REGR_SXY