CASE Statement in Snowflake Snowpark

Spread the love

1. Introduction

The CASE statement in SQL is used to evaluate conditional logic. It allows you to perform if-then-else logic within your queries using WHEN and THEN clauses.

  • The CASE statement checks each row to see if the conditional statement is TRUE.
  • When a condition is TRUE, the evaluation stops, and the corresponding result (following THEN) is returned.
  • If no conditions are TRUE, the result after the optional ELSE is returned. Otherwise, NULL is returned.

CASE statements can be useful for transforming data, creating new columns, or applying different calculations based on specific conditions.

SELECT
    column1,
    column2,
    CASE
        WHEN condition1 THEN result1
        WHEN condition2 THEN result2
        ...
        ELSE result
    END AS new_column_name
FROM
    table_name;

2. CASE Statement in Snowflake Snowpark

A CASE statement can be constructed in Snowflake Snowpark using the `when()` function to specify a condition and the corresponding result for that condition. The `otherwise()` method is used to specify the optional result when no other specified conditions are met.

The following is the syntax of the CASE statement in Snowflake Snowpark.

df.select(when(condition, result1).otherwise(result2))

3. Demonstration of CASE Statement in Snowpark

Consider the following DataFrame as a source for the demonstration of the implementation of the CASE statement in Snowpark.

>>> df = session.createDataFrame([1,2,[None],3], schema=["a"])
>>> df.show()
--------
|"A"   |
--------
|1     |
|2     |
|NULL  |
|3     |
--------

The following code demonstrates creating a new DataFrame column using a CASE statement in Snowflake Snowpark.

  • It selects the existing column “a” and creates a new column named “Result” that returns ‘A’ if the value in column “a” is greater than or equal to 0, otherwise, it returns ‘B’.
>>> from snowflake.snowpark.functions import when, col

>>> df_case = df.select(col("a"),
                        when(col("a") >= 0, 'A')
                       .otherwise('B')
                       .alias("Result")
                        )
>>> df_case.show()
-------------------
|"A"   |"RESULT"  |
-------------------
|1     |A         |
|2     |A         |
|NULL  |B         |
|3     |A         |
-------------------

The following SQL query is the equivalent of the provided Snowflake Snowpark code.

SELECT a,
    CASE
        WHEN a >= 0 THEN 'A'
        ELSE 'B'
    END AS Result
FROM df;

The otherwise() method is optional in the code and when not specified returns NULL as shown below.

>>> df_case = df.select(col("a"),
                        when(col("a") >= 0, 'A')
                       .alias("Result")
                        )

>>> df_case.show()
-------------------
|"A"   |"RESULT"  |
-------------------
|1     |A         |
|2     |A         |
|NULL  |NULL      |
|3     |A         |
-------------------

The following SQL query is the equivalent of the provided Snowflake Snowpark code.

SELECT a,
    CASE
        WHEN a >= 0 THEN 'A'
    END AS Result
FROM df;

4. Adding Multiple conditions to a CASE statement

The following code snippet demonstrates creating a new DataFrame column in Snowflake Snowpark using multiple conditions with `when()` and `otherwise()`. It selects the existing column “a” and creates a new column named “Result” based on the following logic:

  • If the value in column “a” is 1, “Result” will be ‘A’.
  • If the value in column “a” is 2, “Result” will be ‘B’.
  • If the value in column “a” is NULL, “Result” will be ‘C’.
  • If none of these conditions are met, “Result” will be ‘D’.
>>> df_multi_case = df.select(col("a"),
                              when(col("a") == 1, 'A')
                             .when(col("a") == 2, 'B')
                             .when(col("a").is_null(), 'C')
                             .otherwise('D')
                             .alias("Result")
                             )

>>> df_multi_case.show()
-------------------
|"A"   |"RESULT"  |
-------------------
|1     |A         |
|2     |B         |
|NULL  |C         |
|3     |D         |
-------------------

The following SQL query is the equivalent of the provided Snowflake Snowpark code.

SELECT a,
    CASE
        WHEN a = 1 THEN 'A'
        WHEN a = 2 THEN 'B'
        WHEN a IS NULL THEN 'C'
        ELSE 'D'
    END AS Result
FROM df;

5. Adding Multiple conditions in a WHEN clause of CASE statement

Consider the following DataFrame as a source for the demonstration.

>>> data = [[1,1],
            [1,0],
            [0,1],
            [0,0]]

>>> df = session.createDataFrame(data, schema=["a","b"])
>>> df.show()
-------------
|"A"  |"B"  |
-------------
|1    |1    |
|1    |0    |
|0    |1    |
|0    |0    |
-------------

The following code snippet demonstrates how multiple conditions can be specified in a WHEN clause to create a new DataFrame column in Snowflake Snowpark that implements AND gate logic. It selects the existing columns “a” and “b”, and creates a new column named “Result” where the value is ‘TRUE’ if both “a” and “b” are 1, otherwise the value is ‘FALSE’.

>>> df_and = df.select(col("a"),col("b"),
              when((col("a") == 1) & (col("b") == 1), 'TRUE')
             .otherwise('FALSE')
             .alias("Result")
              )

>>> df_and.show()
------------------------
|"A"  |"B"  |"RESULT"  |
------------------------
|1    |1    |TRUE      |
|1    |0    |FALSE     |
|0    |1    |FALSE     |
|0    |0    |FALSE     |
------------------------

The following SQL query is the equivalent of the provided Snowflake Snowpark code.

SELECT a, b,
    CASE
        WHEN (a = 1 AND b = 1) THEN 'TRUE'
        ELSE 'FALSE'
    END AS Result
FROM df;

Subscribe to our Newsletter !!

Related Articles:

Leave a Comment

Related Posts