1. What is Snowflake Dynamic Data Masking?
Snowflake Dynamic Data Masking is a security feature that allows organizations to mask sensitive data in their database tables, views, and query results in real-time. This is useful for protecting sensitive information from unauthorized access or exposure.
Snowflake Dynamic Data Masking allows the data to be masked as it is accessed, rather than being permanently altered in the database.
With Dynamic Data Masking, users can choose which data to mask and how it should be masked, such as by replacing sensitive information with dummy values or by partially revealing data. This can be done at the column level, meaning that different columns can be masked differently depending on the sensitivity of the data they contain.
2. Steps to apply Snowflake Dynamic Data Masking on a column
Follow below steps to perform Dynamic Data Masking in Snowflake.
- Step-1: Create a Custom Role with Masking Privileges
- Step-2: Assign Masking Role to an existing Role/User
- Step-3: Create a Masking Policy
- Step-4: Apply the Masking Policy to a Table or View Column
- Step-5: Verify the masking rules by querying data
Step-1: Create a Custom Role with Masking Privileges
The below SQL statement creates a custom role MASKINGADMIN in Snowflake.
create role MASKINGADMIN;
The below SQL statement grants privileges to create masking policies to the role MASKINGADMIN.
grant create masking policy on schema MYDB.MYSCHEMA to role MASKINGADMIN;
The below SQL statement grants privileges to apply masking policies to the role MASKINGADMIN.
grant apply masking policy on account to role MASKINGADMIN;
Step-2: Assign Masking Role to an existing Role/User
The MASKINGADMIN role by default will not have access to any database nor warehouse. The role needs to be assigned to another Custom Role or a User who have privileges to access a database and warehouse.
The below SQL statement assigns MASKINGADMIN to another custom role named DATAENGINEER.
grant role MASKINGADMIN to role DATAENGINEER;
This allows all users with DATAENGINEER role to inherit masking privileges. Instead if you want to limit the masking privileges, assign the role to individual users.
The below SQL statement assigns MASKINGADMIN to a User named STEVE.
grant role MASKINGADMIN to user STEVE;
Step-3: Create a Masking Policy
The below SQL statement creates a masking policy STRING_MASK that can be applied to columns of type string.
create or replace masking policy STRING_MASK as (val string) returns string ->
when current_role() in ('DATAENGINEER') then val
This masking policy masks the data applied on a column when queried from a role other than DATAENGINEER.
Step-4: Apply (Set) the Masking Policy to a Table or View Column
The below SQL statement applies the masking policy STRING_MASK on a column named LAST_NAME in EMPLOYEE table.
alter table if exists EMPLOYEE modify column LAST_NAME set masking policy STRING_MASK;
Note that prior to dropping a policy, the policy needs to be unset from all the tables and views on which it is applied.
Step-5: Verify the masking rules by querying data
Verify the data present in EMPLOYEE table by querying from two different roles.
The below image shows data present in EMPLOYEE when queried from DATAENGINEER role.
The below image shows data present in EMPLOYEE when queried from ANALYST role where the data present in LAST_NAME column is masked.
3. Remove (Unset) Masking Policy on a column
The below SQL statement removes (unsets) a masking policy applied on a column present in a table.
alter table if exists EMPLOYEE modify LAST_NAME unset masking policy ;
4. Partial Data Masking in Snowflake
Snowflake also supports partially masking the column data.
The below SQL statement creates a masking policy EMAIL_MASK which partially mask the email data when queried from ANALYST role leaving the email domain unmasked.
create or replace masking policy EMAIL_MASK as (val string) returns string ->
when current_role() in ('DATAENGINEER') then val
when current_role() in ('ANALYST') then regexp_replace(val,'.+\@','*****@') -- leave email domain unmasked
The below SQL statement applies the masking policy EMAIL_MASK on a column named EMAIL in EMPLOYEE table.
alter table if exists EMPLOYEE modify column EMAIL set masking policy EMAIL_MASK;
The below image shows data present in EMPLOYEE when queried from ANALYST role where the data present in EMAIL column is partially masked.
5. Conditional Data Masking in Snowflake
Conditional Data Masking allows you to selectively apply the masking on a column by using a different column to determine whether data in a given column should be masked.
The CREATE MASKING POLICY syntax consists of two arguments. The first column always specifies the column to mask. The second column is a conditional column to evaluate whether the first column should be masked.
The below SQL statement masks the data when the value of conditional columnis less than 105.
create or replace masking policy EMAIL_MASK as (mask_col string, cond_col number ) returns string ->
when cond_col < 105 then mask_col
The below SQL statement applies the masking policy EMAIL_MASK on a column named EMAIL based on the value of the conditional column EMPLOYEE_ID present in the EMPLOYEE table.
alter table if exists EMPLOYEE modify column EMAIL set masking policy EMAIL_MASK using(email, employee_id);
The below image shows the output of a query from EMPLOYEE table where the EMAIL data is masked based on the value of EMPLOYEE_ID.
6. Altering Masking Policies in Snowflake
Snowflake supports modifying the existing masking policy rules with new rules and renaming of a masking policy. The changes done to the masking policy will go into effect when the next SQL query that uses the masking policy runs.
Below is the syntax to alter the existing masking policy in Snowflake.
ALTER MASKING POLICY [ IF EXISTS ] <name> SET BODY -> <expression_on_arg_name>
ALTER MASKING POLICY [ IF EXISTS ] <name> RENAME TO <new_name>
ALTER MASKING POLICY [ IF EXISTS ] <NAME> SET COMMENT = '<string_literal>'
7. Extracting information of existing Masking Policies in Snowflake
SHOW MASKING POLICIES
Lists masking policy information, including the creation date, database and schema names, owner, and any available comments.
The below SQL statements extracts the masking policies present in the database and schema of the current session.
show masking policies;
The below SQL statement extracts all the masking policies present across the account.
show masking policies in account;
DESCRIBE MASKING POLICY
Describes the details about a masking policy, including the creation date, name, data type, and SQL expression.
The below SQL statement extracts information of the masking policy STRING_MASK.
describe masking policy STRING_MASK;
8. Dropping Masking Policies in Snowflake
A Masking Policy in Snowflake cannot be dropped successfully if it is currently assigned to a column.
Follow below steps to drop a Masking Policy in Snowflake
1. Find the columns on which the policy is applied.
The below SQL statement lists all the columns on which EMAIL_MASK masking policy is applied.
select * from table(information_schema.policy_references(policy_name=>'EMAIL_MASK'));
2. Once the columns on which masking policies are applied is found out, UNSET the masking policy from the column
3. Drop the masking Policy.
The below SQL statement drops the masking policy named EMAIL_MASK;
drop masking policy EMAIL_MASK;
9. Limitations of Snowflake Dynamic Data Masking
Currently, Snowflake does not support different input and output data types in a masking policy, i.e you cannot mask a date column with a string value (e.g. ***MASKED***).
The input and output data types of a masking policy must match.
Masking policies cannot be applied to virtual columns. Apply the policy to the source table column or view column.
In conditional masking policies, a virtual column of an external table can be listed as a conditional column argument to determine whether the first column argument should be masked. However, a virtual column cannot be specified as the first column to mask.
Prior to dropping a policy, the policy needs to be unset from the table or view.
Subscribe to our Newsletter !!
The INFORMATION_SCHEMA is a read-only schema available automatically under each database. It stores metadata of all the objects built under the database.
Snowflake Resource Monitors assist in cost management and prevent unforeseen credit usage caused by operating warehouses.
Learn different ways to remove duplicate records from a Snowflake table using SWAP WITH and INSERT OVERWRITE commands.