Snowflake Dynamic Data Masking

Spread the love

1. What is Snowflake Dynamic Data Masking?

Snowflake Dynamic Data Masking is a security feature that allows organizations to mask sensitive data in their database tables, views, and query results in real-time. This is useful for protecting sensitive information from unauthorized access or exposure.

Snowflake Dynamic Data Masking allows the data to be masked as it is accessed, rather than being permanently altered in the database.

With Dynamic Data Masking, users can choose which data to mask and how it should be masked, such as by replacing sensitive information with dummy values or by partially revealing data. This can be done at the column level, meaning that different columns can be masked differently depending on the sensitivity of the data they contain.

2. Steps to apply Snowflake Dynamic Data Masking on a column

Follow below steps to perform Dynamic Data Masking in Snowflake.

  • Step-1: Create a Custom Role with Masking Privileges
  • Step-2: Assign Masking Role to an existing Role/User
  • Step-3: Create a Masking Policy
  • Step-4: Apply the Masking Policy to a Table or View Column
  • Step-5: Verify the masking rules by querying data

Step-1: Create a Custom Role with Masking Privileges

The below SQL statement creates a custom role MASKINGADMIN in Snowflake.

create role MASKINGADMIN;

The below SQL statement grants privileges to create masking policies to the role MASKINGADMIN.

grant create masking policy on schema MYDB.MYSCHEMA to role MASKINGADMIN;

The below SQL statement grants privileges to apply masking policies to the role MASKINGADMIN.

grant apply masking policy on account to role MASKINGADMIN;

Step-2: Assign Masking Role to an existing Role/User

The MASKINGADMIN role by default will not have access to any database nor warehouse. The role needs to be assigned to another Custom Role or a User who have privileges to access a database and warehouse.

The below SQL statement assigns MASKINGADMIN to another custom role named DATAENGINEER.

grant role MASKINGADMIN to role DATAENGINEER; 

This allows all users with DATAENGINEER role to inherit masking privileges. Instead if you want to limit the masking privileges, assign the role to individual users.

The below SQL statement assigns MASKINGADMIN to a User named STEVE.

grant role MASKINGADMIN to user STEVE;

Step-3: Create a Masking Policy

The below SQL statement creates a masking policy STRING_MASK that can be applied to columns of type string.

create or replace masking policy STRING_MASK as (val string) returns string ->
  case
    when current_role() in ('DATAENGINEER') then val
    else '*********'
  end;

This masking policy masks the data applied on a column when queried from a role other than DATAENGINEER.

Step-4: Apply (Set) the Masking Policy to a Table or View Column

The below SQL statement applies the masking policy EMAIL_MASK on a column named LAST_NAME in EMPLOYEE table.

alter table if exists EMPLOYEE modify column LAST_NAME set masking policy STRING_MASK;

Note that prior to dropping a policy, the policy needs to be unset from all the tables and views on which it is applied.

Step-5: Verify the masking rules by querying data

Verify the data present in EMPLOYEE table by querying from two different roles.

The below image shows data present in EMPLOYEE when queried from DATAENGINEER role.

Unmasked data when queried from DATAENGINEER role
Unmasked data when queried from DATAENGINEER role

The below image shows data present in EMPLOYEE when queried from ANALYST role where the data present in LAST_NAME column is masked.

Masked data when queried from ANALYST role
Masked data when queried from ANALYST role

3. Remove (Unset) Masking Policy on a column

The below SQL statement removes (unsets) a masking policy applied on a column present in a table.

alter table if exists EMPLOYEE modify LAST_NAME unset masking policy ;

4. Partial Data Masking in Snowflake

Snowflake also supports partially masking the column data.

The below SQL statement creates a masking policy EMAIL_MASK which partially mask the email data when queried from ANALYST role leaving the email domain unmasked.

create or replace masking policy EMAIL_MASK as (val string) returns string ->
case
  when current_role() in ('DATAENGINEER') then val
  when current_role() in ('ANALYST') then regexp_replace(val,'.+\@','*****@') -- leave email domain unmasked
  else '********'
end;

The below SQL statement applies the masking policy EMAIL_MASK on a column named EMAIL in EMPLOYEE table.

alter table if exists EMPLOYEE modify column EMAIL set masking policy EMAIL_MASK;

The below image shows data present in EMPLOYEE when queried from ANALYST role where the data present in EMAIL column is partially masked.

Partial Data Masking
Partial Data Masking

5. Conditional Data Masking in Snowflake

Conditional Data Masking allows you to selectively apply the masking on a column by using a different column to determine whether data in a given column should be masked.

The CREATE MASKING POLICY syntax consists of two arguments. The first column always specifies the column to mask. The second column is a conditional column to evaluate whether the first column should be masked. 

The below SQL statement masks the data when the value of conditional columnis less than 105.

create or replace masking policy EMAIL_MASK as (mask_col string, cond_col number ) returns string ->
  case
    when cond_col < 105 then mask_col
    else '*********'
  end;

The below SQL statement applies the masking policy EMAIL_MASK on a column named EMAIL based on the value of the conditional column EMPLOYEE_ID present in the EMPLOYEE table.

alter table if exists EMPLOYEE modify column EMAIL set masking policy EMAIL_MASK using(email, employee_id);

The below image shows the output of a query from EMPLOYEE table where the EMAIL data is masked based on the value of EMPLOYEE_ID.

Conditional Data Masking
Conditional Data Masking

6. Altering Masking Policies in Snowflake

Snowflake supports modifying the existing masking policy rules with new rules and renaming of a masking policy. The changes done to the masking policy will go into effect when the next SQL query that uses the masking policy runs.

Below is the syntax to alter the existing masking policy in Snowflake.

ALTER MASKING POLICY [ IF EXISTS ] <name> SET BODY -> <expression_on_arg_name>

ALTER MASKING POLICY [ IF EXISTS ] <name> RENAME TO <new_name>

ALTER MASKING POLICY [ IF EXISTS ] <NAME> SET COMMENT = '<string_literal>'

7. Extracting information of existing Masking Policies in Snowflake

SHOW MASKING POLICIES

Lists masking policy information, including the creation date, database and schema names, owner, and any available comments.

The below SQL statements extracts the masking policies present in the database and schema of the current session.

show masking policies;
Listing all masking policies
Listing all masking policies

The below SQL statement extracts all the masking policies present across the account.

show masking policies in account;

DESCRIBE MASKING POLICY

Describes the details about a masking policy, including the creation date, name, data type, and SQL expression.

The below SQL statement extracts information of the masking policy STRING_MASK.

describe masking policy STRING_MASK;
Extracting details of a Masking Policy
Extracting details of a Masking Policy

8. Dropping Masking Policies in Snowflake

A Masking Policy in Snowflake cannot be dropped successfully if it is currently assigned to a column.

Follow below steps to drop a Masking Policy in Snowflake

1. Find the columns on which the policy is applied.

The below SQL statement lists all the columns on which EMAIL_MASK masking policy is applied.

select * from table(information_schema.policy_references(policy_name=>'EMAIL_MASK'));
Finding all the columns on which Masking Policy is applied
Finding all the columns on which Masking Policy is applied

2. Once the columns on which masking policies are applied is found out, UNSET the masking policy from the column

3. Drop the masking Policy.

The below SQL statement drops the masking policy named EMAIL_MASK;

drop masking policy EMAIL_MASK;

9. Limitations of Snowflake Dynamic Data Masking

Currently, Snowflake does not support different input and output data types in a masking policy, i.e you cannot mask a date column with a string value (e.g. ***MASKED***).

The input and output data types of a masking policy must match.

Masking policies cannot be applied to virtual columns. Apply the policy to the source table column or view column.

In conditional masking policies, a virtual column of an external table can be listed as a conditional column argument to determine whether the first column argument should be masked. However, a virtual column cannot be specified as the first column to mask.

Prior to dropping a policy, the policy needs to be unset from the table or view.

Subscribe to our Newsletter !!

Related Articles:

  • Snowflake Information Schema

    The INFORMATION_SCHEMA is a read-only schema available automatically under each database. It stores metadata of all the objects built under the database.

    READ MORE

  • Snowflake Resource Monitors

    Snowflake Resource Monitors assist in cost management and prevent unforeseen credit usage caused by operating warehouses.

    READ MORE

  • HOW TO: Remove Duplicates in Snowflake?

    Learn different ways to remove duplicate records from a Snowflake table using SWAP WITH and INSERT OVERWRITE commands.

    READ MORE

Leave a Comment

Related Posts