PostgreSQL is equipped with a robust suite of statistical functions that are essential for performing detailed data analysis directly within the database. These functions allow users to calculate various statistical measures, such as averages, variances, standard deviations, and correlations, directly from stored data. This capability not only streamlines the analytical process but also enhances performance by leveraging PostgreSQL’s powerful query optimization. Whether you’re analyzing financial data, customer metrics, or scientific measurements, PostgreSQL’s statistical functions provide the tools necessary to derive meaningful insights without the need for additional statistical software. This integration of statistical analysis into the database engine supports a wide range of applications, from business intelligence and market research to scientific research and predictive analytics, making PostgreSQL an invaluable tool for data scientists and analysts alike, Here is a comprehensive list of the different types of aggregate functions available, categorized by their use cases:

Table of Contents

1. General-Purpose Aggregates

COUNT(expression): Counts non-null values in a set.
SUM(expression): Sums up the values of the expression.
AVG(expression): Calculates the average of the expression values.
MIN(expression): Finds the minimum value of the expression.
MAX(expression): Finds the maximum value of the expression.

2. Statistical Aggregates

STDDEV(expression): Computes the standard deviation of the input values.
STDDEV_POP(expression): Computes the population standard deviation.
STDDEV_SAMP(expression): Computes the sample standard deviation.
VAR_POP(expression): Computes the population variance.
VAR_SAMP(expression): Computes the sample variance.
VARIANCE(expression): Alias for VAR_SAMP.
COVAR_POP(expression1, expression2): Calculates the population covariance between two expressions.
COVAR_SAMP(expression1, expression2): Calculates the sample covariance.
CORR(expression1, expression2): Calculates the correlation coefficient.

3. Ordered-Set Aggregates

MODE(): Returns the mode of a set of values (the value that appears most frequently).
PERCENTILE_CONT(fraction): Computes a percentile based on a continuous distribution of the column values.
PERCENTILE_DISC(fraction): Computes a percentile based on a discrete distribution of the column values.

4. Hypothetical-Set Aggregates

Similar to ordered-set aggregates but include a hypothetical element:
- RANK(expression)
- DENSE_RANK(expression)
- PERCENT_RANK(expression)
- CUME_DIST(expression)

5. Array Aggregates

ARRAY_AGG(expression): Aggregates values, including nulls, into an array.
STRING_AGG(expression, delimiter): Concatenates input values into a string, separated by a delimiter.

6. JSON Aggregates

JSON_AGG(expression): Aggregates values as a JSON array.
JSONB_AGG(expression): Aggregates values as a JSONB array.
JSON_OBJECT_AGG(key, value): Aggregates input as a JSON object.
JSONB_OBJECT_AGG(key, value): Aggregates input as a JSONB object.

7. XML Aggregates

XMLAGG(expression): Concatenates XML values.

8. Grouping Operations

GROUPING(expression): Identifies the level of aggregation, particularly in GROUP BY GROUPING SETS.

Specialized Aggregates

BIT_AND(expression): Performs a bitwise AND operation on all input values.
BIT_OR(expression): Performs a bitwise OR operation on all input values.
BOOL_AND(expression): Returns TRUE if all input values are TRUE.
BOOL_OR(expression): Returns TRUE if at least one input value is TRUE.

PostgreSQL’s aggregate functions enable powerful data analysis by performing statistical, mathematical, and group-based operations directly within SQL queries, enhancing its analytical capabilities.

The WebScale Database Infrastructure Architecture, Engineering and Operations Company

Full-Stack Database Engineering & Cloud DBaaS Solutions for PostgreSQL, MySQL, MongoDB & More | Performance, Scalability, High Availability, Security & Analytics Experts