# MIN_BY and MAX_BY Functions in Snowflake

Spread the love
Contents

## MIN_BY and MAX_BY Functions in Snowflake

MIN_BY and MAX_BY are aggregate functions in Snowflake that calculate the minimum or maximum value of a column and return the value of another column in that row.

Syntax

``````MIN_BY( <column_1>, <column_2> [ , <maximum_number_of_values_to_return> ] )
MAX_BY( <column_1>, <column_2> [ , <maximum_number_of_values_to_return> ] )``````

Arguments

## Implementation and Use Cases with examples

Let us understand the implementation and use cases of these functions using the following sample SALES data set.

``````----------------------------------------------
|"PRODUCT_ID"  |"SALE_DATE"  |"SALE_AMOUNT"  |
----------------------------------------------
|101           |2024-01-01   |200.00         |
|101           |2024-01-02   |150.00         |
|101           |2024-01-03   |220.00         |
|102           |2024-01-02   |190.00         |
|102           |2024-01-03   |155.00         |
|102           |2024-01-04   |160.00         |
|103           |2024-01-03   |300.00         |
|103           |2024-01-04   |330.00         |
|103           |2024-01-05   |310.00         |
----------------------------------------------``````

For example, if we wanted to calculate the value of the least sale amount, we could use the MIN(SALE_AMOUNT) expression to calculate the value. But if, instead, we wanted to return the SALE_DATE on which the least sale was registered, it would be a multi-step process, first to calculate the minimum sale amount, and then to find the date based on that minimum value.

MIN_BY and MAX_BY functions are designed to handle exactly such scenarios, simplifying the process by avoiding the multi-steps involved.

Example-1:

The following SQL statement returns the SALE_DATE on which the least sale was registered.

``````SELECT MIN_BY(SALE_DATE,SALE_AMOUNT) FROM SALES;

-----------------------------------
|"MIN_BY(SALE_DATE,SALE_AMOUNT)"  |
-----------------------------------
|2024-01-02                       |
-----------------------------------``````

A typical query involves a self join and multiple steps to calculate the same value as shown below.

``````WITH CTE AS(
SELECT MIN(SALE_AMOUNT) AS MIN_SALE_AMOUNT FROM SALES
)
SELECT SALE_DATE FROM SALES a
JOIN CTE b ON a.SALE_AMOUNT = b.MIN_SALE_AMOUNT
;``````

Note that if there are multiple dates on which the least sale occurred, the MIN_BY query would still return a single value. The returned value might be non-deterministic and could vary between executions.

Example-2:

Additionally, we can also include the number of values to be returned using the MIN_BY and MAX_BY functions.

The following SQL statement returns the two SALE_DATE values on which the least sales were registered.

``````SELECT MIN_BY(SALE_DATE,SALE_AMOUNT,2) FROM SALES;

-------------------------------------
|"MIN_BY(SALE_DATE,SALE_AMOUNT,2)"  |
-------------------------------------
|[                                  |
|  "2024-01-02",                    |
|  "2024-01-03"                     |
|]                                  |
-------------------------------------``````

Note that the query returned an array containing multiple dates, ordered by their sale values. If there are multiple dates on which the least sale occurred, the ordering of values will be non-deterministic in the return value.

Example-3:

The following SQL statement returns the SALE_DATE corresponding to the least and the highest SALE_AMOUNT values.

``````SELECT MIN_BY(SALE_DATE,SALE_AMOUNT), MAX_BY(SALE_DATE,SALE_AMOUNT) FROM SALES;

-----------------------------------------------------------------------
|"MIN_BY(SALE_DATE, SALE_AMOUNT)"  |"MAX_BY(SALE_DATE, SALE_AMOUNT)"  |
-----------------------------------------------------------------------
|2024-01-02                        |2024-01-04                        |
-----------------------------------------------------------------------``````

## MIN_BY and MAX_BY functions in GROUP BY Clause

Similar to the MIN and MAX functions, the MIN_BY and MAX_BY functions can be used in conjunction with the GROUP BY clause to determine the minimum and maximum values of one column based on another column within each group of data.

The following SQL statement returns the latest sale amount of each product.

``````-- Calculates the latest sales amount of each product
SELECT
PRODUCT_ID,
MAX_BY(SALE_AMOUNT,SALE_DATE) AS LATEST_SALE_AMOUNT
FROM SALES
GROUP BY PRODUCT_ID
;

---------------------------------------
|"PRODUCT_ID"  |"LATEST_SALE_AMOUNT"  |
---------------------------------------
|101           |220.00                |
|102           |160.00                |
|103           |310.00                |
---------------------------------------``````

MIN and MAX functions can be used in conjunction with MIN_BY and MAX_BY functions in Snowflake. While MIN and MAX functions calculate the minimum and maximum values directly, MIN_BY and MAX_BY allow you to retrieve associated values from other columns based on these extremes, providing a more comprehensive view of your data.

The following SQL statement return the latest sale amount of each product along with the date on which sale is registered.

``````-- Calculates the latest sales amount of each product along with the sales date
SELECT
PRODUCT_ID,
MAX(SALE_DATE) AS LATEST_SALE_DATE,
MAX_BY(SALE_AMOUNT,SALE_DATE) AS LATEST_SALE_AMOUNT
FROM SALES
GROUP BY PRODUCT_ID;

------------------------------------------------------------
|"PRODUCT_ID"  |"LATEST_SALE_DATE"  |"LATEST_SALE_AMOUNT"  |
------------------------------------------------------------
|101           |2024-01-03          |220.00                |
|102           |2024-01-04          |160.00                |
|103           |2024-01-05          |310.00                |
------------------------------------------------------------``````

## MIN_BY and MAX_BY Analytical functions

MIN_BY and MAX_BY functions can be used in conjunction with analytical functions (window functions) in Snowflake. By combining them with a window function, you can perform row-wise computations across partitions of data, rather than just aggregating results across groups.

The following SQL statement calculates the latest sale amount for each product and represents this value against each record of the product.

``````SELECT
PRODUCT_ID, SALE_DATE, SALE_AMOUNT,
MAX_BY(SALE_AMOUNT,SALE_DATE) OVER(PARTITION BY PRODUCT_ID) AS LATEST_SALE_AMOUNT
FROM SALES
ORDER BY PRODUCT_ID, SALE_DATE;

---------------------------------------------------------------------
|"PRODUCT_ID"  |"SALE_DATE"  |"SALE_AMOUNT"  |"LATEST_SALE_AMOUNT"  |
---------------------------------------------------------------------
|101           |2024-01-01   |200.00         |220.00                |
|101           |2024-01-02   |150.00         |220.00                |
|101           |2024-01-03   |220.00         |220.00                |
|102           |2024-01-02   |190.00         |160.00                |
|102           |2024-01-03   |155.00         |160.00                |
|102           |2024-01-04   |160.00         |160.00                |
|103           |2024-01-03   |300.00         |310.00                |
|103           |2024-01-04   |330.00         |310.00                |
|103           |2024-01-05   |310.00         |310.00                |
---------------------------------------------------------------------``````

Subscribe to our Newsletter !!

Related Articles: