1. Introduction
Consider a table that captures the details of marks scored by students in each subject as shown below.
--CREATE STUDENT_MARKS TABLE
CREATE OR REPLACE TABLE STUDENT_MARKS(
ID NUMBER,
MATHS NUMBER(3),
PHYSICS NUMBER(3),
CHEMISTRY NUMBER(3)
);
--INSERT DATA INTO THE TABLE
INSERT INTO STUDENTS VALUES(1,50,65,70);
INSERT INTO STUDENTS VALUES(2,88,90,65);
INSERT INTO STUDENTS VALUES(3,92,87,81);
The requirement is to capture the highest and lowest marks scored by each student. This can be accomplished using custom logic with CASE statements, as demonstrated below.
SELECT *,
CASE WHEN MATHS > PHYSICS AND MATHS > CHEMISTRY THEN MATHS
WHEN PHYSICS > CHEMISTRY THEN PHYSICS
ELSE CHEMISTRY
END AS MAX_MARKS,
CASE WHEN MATHS < PHYSICS AND MATHS < CHEMISTRY THEN MATHS
WHEN PHYSICS < CHEMISTRY THEN PHYSICS
ELSE CHEMISTRY
END AS MIN_MARKS
FROM STUDENT_MARKS;
In the scenario discussed, we have calculated the highest and lowest marks from a set of three subjects. Now consider the prospect of performing the same calculations from a set of 6 subjects. The complexity of the calculations increases as the number of items in the set increases, from which we aim to calculate the highest and lowest values.
Snowflake provides a simple solution to solve this problem through inbuilt functions GREATEST and LEAST.
2. GREATEST and LEAST Functions in Snowflake
The GREATEST function in Snowflake retrieves the largest value from a list of expressions, whereas the LEAST function returns the smallest value from a list of expressions.
Below is the syntax of the GREATEST and LEAST functions.
GREATEST(<exp1>,<exp2>,…)
LEAST(<exp1>,<exp2>,…)
The same scenario of calculating the highest and lowest marks scored by each student can be simplified using the GREATEST and LEAST functions as shown below.
SELECT ID, MATHS, PHYSICS, CHEMISTRY,
GREATEST(MATHS,PHYSICS,CHEMISTRY) MAX_MARKS,
LEAST(MATHS,PHYSICS,CHEMISTRY) MIN_MARKS
FROM STUDENT_MARKS;
The expressions used in GREATEST and LEAST functions can belong to any data type. However, it is essential that all expressions within a single call share either the same data type or compatible types.
3. Using GREATEST and LEAST Functions in Snowflake with NULL values
Both GREATEST and LEAST functions return the result as NULL if any of the expression values passed to the functions are NULL.
For example, let us add a new record in the Students table to understand how NULLs are handled by the GREATEST and LEAST functions.
--INSERTING NEW STUDENT RECORD WITH NULL VALUES
INSERT INTO STUDENT_MARKS VALUES(4,NULL,45,-2);
The query would return NULL as both the highest and lowest values for the student with ID 4 as shown below.
SELECT ID, MATHS, PHYSICS, CHEMISTRY,
GREATEST(MATHS,PHYSICS,CHEMISTRY) MAX_MARKS,
LEAST(MATHS,PHYSICS,CHEMISTRY) MIN_MARKS
FROM STUDENT_MARKS;
To handle the NULL values, Snowflake provides two other functions which are ARRAY_MAX and ARRAY_MIN
4. ARRAY_MAX and ARRAY_MIN functions in Snowflake
ARRAY_MAX and ARRAY_MIN functions in Snowflake are similar to GREATEST and LEAST functions except that they return the highest and lowest element values from an array.
Below is the syntax of ARRAY_MAX and ARRAY_MIN functions.
ARRAY_MAX([<exp1>,<exp2>,…])
ARRAY_MIN([<exp1>,<exp2>,…])
The highest and lowest marks scored by each student can be calculated using ARRAY_MAX and ARRAY_MIN functions by wrapping the field names in square brackets as shown below.
SELECT ID, MATHS, PHYSICS, CHEMISTRY,
ARRAY_MAX([MATHS,PHYSICS,CHEMISTRY]) MAX_MARKS,
ARRAY_MIN([MATHS,PHYSICS,CHEMISTRY]) MIN_MARKS
FROM STUDENT_MARKS;
One important point to note is that ARRAY_MAX and ARRAY_MIN functions return a value of VARIANT data type.
The return fields can be cast to the appropriate data type to return the values in the desired format. In the student’s example discussed, the return fields can be returned as a NUMBER data type as shown below.
SELECT ID, MATHS, PHYSICS, CHEMISTRY,
ARRAY_MAX([MATHS,PHYSICS,CHEMISTRY])::NUMBER MAX_MARKS,
ARRAY_MIN([MATHS,PHYSICS,CHEMISTRY])::NUMBER MIN_MARKS
FROM STUDENT_MARKS;
5. Conclusion
Snowflake’s GREATEST and LEAST functions help in calculating the highest and lowest values from a set of values but returns NULL if any of the argument value is NULL. This behavior is not a limitation but an intended feature, as outlined in the documentation. The developers need to understand how these functions behave with NULL values and operate accordingly.
Alternatively, ARRAY_MAX and ARRAY_MIN functions can be used when NULL values are involved. However, it is important to note that these functions return a value of VARIANT data type which requires an additional step to convert them back to the required data type.
Subscribe to our Newsletter !!
Related Articles:
Learn how to find and kill long running queries in Snowflake using the QUERY_HISTORY table functions available under Information Schema.
The GROUP BY ALL clause is similar to GROUP BY clause except that it automatically include all non-aggregate items in the SELECT clause for grouping.
QUALIFY clause in a SELECT statement allows you to filter query results of Window functions within the same query.