Snowflake Cortex AISQL Functions

With the introduction of Snowflake Cortex AISQL functions, Snowflake has brought the power of large language models (LLMs) natively into SQL. These built-in AI functions let you perform tasks such as summarization, classification, sentiment analysis, translation, and analysis of unstructured text, all within your Snowflake environment using SQL, without external tools or Python code.

In this article, we’ll walk through practical use cases of various Snowflake AISQL functions using real-world examples.

The queries in this article use the AIRBNB_PROPERTIES_INFORMATION dataset, which you can access for free from the Snowflake Marketplace.

1. AI_COMPLETE

The AI_COMPLETE function is one of the most flexible Cortex AI functions in Snowflake. It generates a natural language response from a custom prompt using the LLM you choose. Because its behavior is driven entirely by the prompt, it suits free-form tasks such as creating summaries, generating documentation, classifying text, and answering questions.

Syntax:

SNOWFLAKE.CORTEX.AI_COMPLETE(
    model => '<model_name>',
    prompt => '<your_prompt>'
)

To find the list of all supported models in Snowflake, refer to the Snowflake documentation.

Example 1: Generate Description for Database Columns

The below query generates short, documentation-style descriptions for column names in tables from the Snowflake Sample database.

It uses a large language model (snowflake-llama-3.3-70b) and prompts it to describe each column without repeating the column name in the description.

SELECT
  TABLE_NAME,
  COLUMN_NAME,
  REPLACE(
    SNOWFLAKE.CORTEX.AI_COMPLETE(
      model  => 'snowflake-llama-3.3-70b',
      prompt => 'Generate a very short one-sentence documentation-style description for a database column named '|| COLUMN_NAME || 
                '. Do not include the column name again in the output.'
    ),
    '"', ''
  ) AS DESCRIPTION
FROM SNOWFLAKE_SAMPLE_DATA.INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_SCHEMA = 'TPCDS_SF10TCL' 
AND TABLE_NAME = 'STORE'
ORDER BY TABLE_NAME, COLUMN_NAME
;

Result:

The below image shows the output of the AI_COMPLETE function generating short documentation-style descriptions for each column in the STORE table.

Example 2: Detect the Language of Each Review

The below query identifies the language used in each review by sending the review text to the model. This is helpful for datasets with multilingual content.

The prompt explicitly asks the model to return only the name of the language (e.g., ‘English’, ‘Spanish’).
It asks for “Unknown” if no language can be detected or the text is empty, and for a comma-separated list of names if multiple languages are present.

SELECT 
    REVIEWS AS RAW_REVIEWS,
    REPLACE(
        SNOWFLAKE.CORTEX.AI_COMPLETE(
            model  => 'mistral-large2',
            prompt => 'Identify the language used in the following text: ' || REVIEWS || '. ' ||
                      'Return only the name of the language (e.g., English, Spanish). ' ||
                      'If no language can be detected or if the text is empty, return Unknown. ' ||
                      'If multiple languages are present, return their names concatenated with commas. ' ||
                      'No additional text should be returned.'
        ),
        '"', ''
    ) AS LANGUAGE
FROM AIRBNB_PROPERTIES_INFORMATION.PUBLIC.AIRBNB_PROPERTIES_INFORMATION
ORDER BY RAW_REVIEWS DESC
;

Result:

The below image shows how AI_COMPLETE identifies the language of each Airbnb review using a prompt-based instruction.
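
Because AI_COMPLETE is invoked once per row, it can help to try a prompt on a small sample before running it across the whole table. The sketch below is one way to do that. The sample size of 10, the shortened prompt, and the filter on empty reviews are illustrative choices and not part of the original query.

```sql
-- Try a language-detection prompt on a handful of non-empty reviews first.
SELECT
    REVIEWS AS RAW_REVIEWS,
    SNOWFLAKE.CORTEX.AI_COMPLETE(
        model  => 'mistral-large2',
        prompt => 'Return only the name of the language used in the following text: ' || REVIEWS
    ) AS LANGUAGE
FROM AIRBNB_PROPERTIES_INFORMATION.PUBLIC.AIRBNB_PROPERTIES_INFORMATION
WHERE REVIEWS IS NOT NULL
  AND TRIM(REVIEWS) <> ''   -- skip rows that would produce an empty prompt
LIMIT 10                    -- illustrative sample size
;
```

Once the output looks right on the sample, the LIMIT can be removed to process the full dataset.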

2. TRANSLATE

The TRANSLATE Snowflake Cortex AISQL function allows you to automatically translate text from one language to another using built-in AI models. It is especially useful when working with multilingual datasets where consistent language output is required for downstream processing, sentiment analysis, or reporting.

Syntax:

SNOWFLAKE.CORTEX.TRANSLATE(
    input_text,
    source_language_code,
    target_language_code
)

Example: Translate Reviews from multiple languages into English

The below query translates customer reviews from their original language into English. Passing an empty string as the source language lets Snowflake detect the source language automatically.

SELECT
    REVIEWS AS RAW_REVIEWS,
    SNOWFLAKE.CORTEX.TRANSLATE(REVIEWS, '', 'en') AS TRANSLATED_REVIEWS
FROM AIRBNB_PROPERTIES_INFORMATION.PUBLIC.AIRBNB_PROPERTIES_INFORMATION
ORDER BY RAW_REVIEWS DESC
;

Result:

The below image shows the output of the TRANSLATE function converting reviews from various languages into English.
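
When the source language is already known, it can be passed explicitly instead of relying on auto-detection. The following is a minimal sketch that compares both forms on a single literal; the Spanish sentence is a made-up sample and does not come from the dataset.

```sql
-- Translate one sample sentence with an explicit source code ('es')
-- and with auto-detection (empty string) to compare the two outputs.
SELECT
    SNOWFLAKE.CORTEX.TRANSLATE('El apartamento estaba muy limpio.', 'es', 'en') AS FROM_SPANISH,
    SNOWFLAKE.CORTEX.TRANSLATE('El apartamento estaba muy limpio.', '', 'en')   AS AUTO_DETECTED
;
```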

3. SUMMARIZE

The SUMMARIZE AISQL function in Snowflake Cortex condenses a block of text into a short, meaningful summary using AI models. This is particularly helpful for lengthy content like reviews, descriptions, or feedback, allowing users to quickly understand the main points without reading the full text.

Syntax:

SNOWFLAKE.CORTEX.SUMMARIZE(input_text)

Example: Summarize Translated Reviews

The below query first translates reviews into English using the TRANSLATE function and then summarizes the translated reviews using SUMMARIZE.

This two-step process is useful when the original reviews are in multiple languages, and you want a quick English summary for each one to assist with sentiment analysis or reporting.

SELECT 
    REVIEWS AS RAW_REVIEWS,
    SNOWFLAKE.CORTEX.TRANSLATE(REVIEWS, '', 'en') AS TRANSLATED_REVIEWS,
    SNOWFLAKE.CORTEX.SUMMARIZE(TRANSLATED_REVIEWS) AS SUMMARIZED_REVIEWS
FROM AIRBNB_PROPERTIES_INFORMATION.PUBLIC.AIRBNB_PROPERTIES_INFORMATION
ORDER BY RAW_REVIEWS DESC
;

Result:

The below image shows how SUMMARIZE generates concise summaries of translated Airbnb reviews.
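
Very short reviews gain little from summarization. As a rough sketch, the query below summarizes only the longer translated reviews and keeps the short ones verbatim. The 500-character cutoff is an arbitrary assumption chosen for illustration, so adjust it to your data.

```sql
SELECT
    REVIEWS AS RAW_REVIEWS,
    SNOWFLAKE.CORTEX.TRANSLATE(REVIEWS, '', 'en') AS TRANSLATED_REVIEWS,
    CASE
        -- Summarize only reviews longer than the (assumed) 500-character cutoff
        WHEN LENGTH(TRANSLATED_REVIEWS) > 500
            THEN SNOWFLAKE.CORTEX.SUMMARIZE(TRANSLATED_REVIEWS)
        -- Short reviews are returned as they are
        ELSE TRANSLATED_REVIEWS
    END AS SHORT_REVIEWS
FROM AIRBNB_PROPERTIES_INFORMATION.PUBLIC.AIRBNB_PROPERTIES_INFORMATION
ORDER BY RAW_REVIEWS DESC
;
```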

4. AI_SUMMARIZE_AGG

AI_SUMMARIZE_AGG is an aggregate AISQL function in Snowflake Cortex that summarizes a collection of text values into a single, coherent summary.

Instead of summarizing each row individually, it produces one summary across all the rows in a group.
It is ideal for compressing large volumes of similar feedback or reviews into a few key points.

Syntax:

AI_SUMMARIZE_AGG(text_column)

Example: Summarize Reviews by Date

The below query groups Airbnb reviews by date and uses AI_SUMMARIZE_AGG to generate a single summary for all reviews submitted on that day.

SELECT 
    TIMESTAMP,
    AI_SUMMARIZE_AGG(REVIEWS) AS AGG_SUMMARIZED_REVIEWS
FROM AIRBNB_PROPERTIES_INFORMATION.PUBLIC.AIRBNB_PROPERTIES_INFORMATION
GROUP BY TIMESTAMP
;

Result:

The below image shows the use of AI_SUMMARIZE_AGG to generate a daily summary of all reviews grouped by timestamp.

5. AI_AGG

AI_AGG is an aggregate AISQL function in Snowflake Cortex that applies a custom natural language prompt to a set of grouped text values. Unlike AI_SUMMARIZE_AGG, which generates a general summary, AI_AGG gives you precise control over the output by letting you define how the grouped text should be interpreted, summarized, or transformed.

Syntax:

AI_AGG(text_column, prompt)

Example: Return a Structured Summary of Review Sentiment by Day

The below query groups Airbnb reviews by day and uses a prompt in AI_AGG to generate a short summary in a structured, numbered format.

For each day, the prompt asks for the total number of properties reviewed, along with a breakdown of properties with positive, negative, and neutral reviews.

SELECT 
    TIMESTAMP,
    AI_AGG(
        REVIEWS,
        'Return a summary in the following format only (no additional text):
        1) Total number of properties: [value]
        2) Properties with positive reviews: [value]
        3) Properties with negative reviews: [value]
        4) Properties with neutral reviews: [value]'
    ) AS AI_AGG_REVIEW_SUMMARY
FROM AIRBNB_PROPERTIES_INFORMATION.PUBLIC.AIRBNB_PROPERTIES_INFORMATION
GROUP BY TIMESTAMP
ORDER BY TIMESTAMP
;

Result:

The below image shows AI_AGG returning structured review insights, such as counts of positive, negative, and neutral reviews per day.
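
Counts that appear inside a model's response are generated text, so they may not match the underlying data exactly. One possible pattern, sketched below, is to compute exact counts with regular SQL aggregates and reserve AI_AGG for the qualitative part. The prompt wording and the column aliases are illustrative assumptions.

```sql
SELECT
    TIMESTAMP,
    COUNT(*) AS LISTING_COUNT,   -- exact row count computed by SQL, not by the model
    AI_AGG(
        REVIEWS,
        'In two sentences, describe the main positive and negative themes in these reviews.'
    ) AS REVIEW_THEMES
FROM AIRBNB_PROPERTIES_INFORMATION.PUBLIC.AIRBNB_PROPERTIES_INFORMATION
GROUP BY TIMESTAMP
ORDER BY TIMESTAMP
;
```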

6. AI_FILTER

AI_FILTER is a Snowflake Cortex AISQL function that determines whether a given piece of text matches a condition described in natural language. It returns a Boolean (TRUE or FALSE) indicating whether the model considers the input text to meet that condition.

AI_FILTER can be used in SELECT, WHERE, and even JOIN conditions, making it versatile for both filtering and logic-based operations in your queries.

Syntax:

SNOWFLAKE.CORTEX.AI_FILTER(prompt_with_text)

Example 1: Detect Reviews Indicating Negative Experience (Used in SELECT)

The below query evaluates each Airbnb review to check if it reflects a negative or disappointing experience. The AI returns TRUE if the review matches the defined context.

SELECT 
    REVIEWS,
    SNOWFLAKE.CORTEX.AI_FILTER(
        'The review indicates a negative or disappointing experience, 
        such as the reviewer not enjoying the stay or mentioning specific issues: ' || REVIEWS
    ) AS REVIEW_SENTIMENT
FROM AIRBNB_PROPERTIES_INFORMATION.PUBLIC.AIRBNB_PROPERTIES_INFORMATION
ORDER BY REVIEW_SENTIMENT;

Result:

The below image shows how AI_FILTER is used in a SELECT clause to detect whether a review indicates a negative experience. The results show that there are 13 negative reviews.

Example 2: Return Only the Negative Reviews (Used in WHERE)

The below query filters the dataset to return only those reviews that the AI classifies as negative, based on the prompt provided.

SELECT 
    REVIEWS AS BAD_EXPERIENCE_REVIEWS
FROM AIRBNB_PROPERTIES_INFORMATION.PUBLIC.AIRBNB_PROPERTIES_INFORMATION
WHERE SNOWFLAKE.CORTEX.AI_FILTER(
    'The review indicates a negative or disappointing experience, 
    such as the reviewer not enjoying the stay or mentioning specific issues: ' || REVIEWS
);

Result:

The below image shows how AI_FILTER is used in a WHERE clause to filter only those reviews that reflect a bad or disappointing experience.
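
Since AI_FILTER returns a Boolean, it can also be passed to a conditional aggregate. The sketch below uses COUNT_IF to count the negative reviews directly instead of reading the number off the result grid. The shortened prompt and the NULL filter are assumptions added for this example.

```sql
SELECT
    COUNT(*) AS TOTAL_REVIEWS,
    -- COUNT_IF counts the rows for which the Boolean expression is TRUE
    COUNT_IF(
        SNOWFLAKE.CORTEX.AI_FILTER(
            'The review indicates a negative or disappointing experience: ' || REVIEWS
        )
    ) AS NEGATIVE_REVIEWS
FROM AIRBNB_PROPERTIES_INFORMATION.PUBLIC.AIRBNB_PROPERTIES_INFORMATION
WHERE REVIEWS IS NOT NULL   -- a NULL review would make the whole prompt NULL
;
```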

7. EXTRACT_ANSWER

EXTRACT_ANSWER is a Snowflake Cortex AISQL function that extracts specific information from a block of text based on a focused question. Unlike general summarization, this function is tailored to return a precise answer to a question, even when the answer is buried inside a longer paragraph or sentence.

In addition to the answer, EXTRACT_ANSWER also returns a confidence score (ranging from 0 to 1) that indicates how certain the model is about the extracted answer.

A score close to 1 suggests high confidence.
A lower score may indicate uncertainty or that the answer is inferred from ambiguous context.

Syntax:

SNOWFLAKE.CORTEX.EXTRACT_ANSWER(input_text, question)

Example: Extract Issues Mentioned in a Review

The below query identifies Airbnb reviews that describe a negative experience using AI_FILTER. For those reviews, it then applies EXTRACT_ANSWER to pull out the specific issues mentioned in the review.

SELECT 
    REVIEWS,
    SNOWFLAKE.CORTEX.EXTRACT_ANSWER(
        REVIEWS, 
        'What are the issues mentioned in the review? Return only the issues, if any.'
    ) AS ISSUES
FROM AIRBNB_PROPERTIES_INFORMATION.PUBLIC.AIRBNB_PROPERTIES_INFORMATION
WHERE SNOWFLAKE.CORTEX.AI_FILTER(
    'The review indicates a negative or disappointing experience, 
    such as the reviewer not enjoying the stay or mentioning specific issues: ' || REVIEWS
);

Result:

The below image shows EXTRACT_ANSWER extracting specific issues from negative reviews, along with a confidence score.
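
To work with the answer and its confidence score as separate columns, the result can be unpacked with Snowflake's semi-structured path notation. The sketch below assumes the function returns an array whose first element carries answer and score fields, so verify the shape against your own output before relying on it. The aliases and the sample size of 10 are illustrative.

```sql
SELECT
    REVIEWS,
    SNOWFLAKE.CORTEX.EXTRACT_ANSWER(
        REVIEWS,
        'What are the issues mentioned in the review?'
    ) AS RAW_RESULT,
    -- Assumed result shape: [ { "answer": "...", "score": 0.87 } ]
    RAW_RESULT[0]:answer::STRING AS ISSUE,
    RAW_RESULT[0]:score::FLOAT   AS CONFIDENCE
FROM AIRBNB_PROPERTIES_INFORMATION.PUBLIC.AIRBNB_PROPERTIES_INFORMATION
WHERE REVIEWS IS NOT NULL
LIMIT 10   -- illustrative sample size
;
```

Splitting out the score makes it possible to filter or sort on confidence in a later step.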

8. SENTIMENT

The SENTIMENT AISQL function in Snowflake Cortex evaluates the emotional tone of a given text and returns a numeric sentiment score between -1 and 1. This score helps quantify how positive, neutral, or negative the text is.

Scores closer to 1 indicate positive sentiment
Scores closer to -1 indicate negative sentiment
Values around 0 represent a neutral tone

It is especially useful in customer feedback analysis, surveys, and social media monitoring to quickly assess user satisfaction or frustration.

Syntax:

SNOWFLAKE.CORTEX.SENTIMENT(input_text)

For best results, translate non-English content into English using TRANSLATE before applying SENTIMENT.

Example: Determine Sentiment of Airbnb Reviews

The below query processes reviews from an Airbnb dataset.

It first translates reviews to English using TRANSLATE, then calculates the sentiment score using SENTIMENT.
The result is a numeric indicator of how positive or negative the review is.

SELECT 
    REVIEWS AS RAW_REVIEWS,
    SNOWFLAKE.CORTEX.TRANSLATE(REVIEWS, '', 'en') AS TRANSLATED_REVIEWS,
    ROUND(SNOWFLAKE.CORTEX.SENTIMENT(TRANSLATED_REVIEWS), 2) AS SENTIMENT_SCORE
FROM AIRBNB_PROPERTIES_INFORMATION.PUBLIC.AIRBNB_PROPERTIES_INFORMATION
ORDER BY RAW_REVIEWS DESC
;

Result:

The below image shows SENTIMENT generating sentiment scores for each Airbnb review after translating them to English.
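
Because the score is numeric, it can be aggregated like any other measure. As a small sketch, the query below averages the sentiment of the translated reviews for each TIMESTAMP value, which is one way to spot days with unusually poor feedback. The grouping column and the aliases are choices made for this example.

```sql
SELECT
    TIMESTAMP,
    COUNT(*) AS REVIEW_ROWS,
    -- Average the per-row sentiment score within each group
    ROUND(
        AVG(SNOWFLAKE.CORTEX.SENTIMENT(SNOWFLAKE.CORTEX.TRANSLATE(REVIEWS, '', 'en'))),
        2
    ) AS AVG_SENTIMENT
FROM AIRBNB_PROPERTIES_INFORMATION.PUBLIC.AIRBNB_PROPERTIES_INFORMATION
WHERE REVIEWS IS NOT NULL
GROUP BY TIMESTAMP
ORDER BY AVG_SENTIMENT   -- lowest average sentiment first
;
```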

9. ENTITY_SENTIMENT

The ENTITY_SENTIMENT AISQL function in Snowflake Cortex identifies specific entities (topics or aspects) in a given text and evaluates the sentiment associated with each one individually.

This is especially useful when analyzing long-form content like product reviews, support tickets, or customer feedback where different parts of the text express different emotions toward different features.

For example, a review might praise the “location” but criticize the “price”. ENTITY_SENTIMENT helps you break this down into granular insights.

Syntax:

SNOWFLAKE.CORTEX.ENTITY_SENTIMENT(input_text, entity_array)

input_text: The input text, typically in English.
entity_array: An array of entities (e.g., ['location', 'value', 'amenities']) you want the function to evaluate.

Example: Analyze Sentiment Toward Key Aspects in Airbnb Reviews

The below query translates each Airbnb review into English and applies ENTITY_SENTIMENT to extract how the reviewer feels about specific aspects: location, value for money, and amenities.

SELECT 
    REVIEWS AS RAW_REVIEWS,
    SNOWFLAKE.CORTEX.TRANSLATE(REVIEWS, '', 'en') AS TRANSLATED_REVIEWS,
    SNOWFLAKE.CORTEX.ENTITY_SENTIMENT(
        TRANSLATED_REVIEWS, 
        ['location', 'value for money', 'amenities']
    ) AS ENTITY_SENTIMENT
FROM AIRBNB_PROPERTIES_INFORMATION.PUBLIC.AIRBNB_PROPERTIES_INFORMATION
ORDER BY RAW_REVIEWS DESC
;

Result:

The below image shows ENTITY_SENTIMENT extracting sentiment specifically related to predefined entities like “location”, “value for money” and “amenities”.
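
For reporting, it is often easier to have one row per review and entity than a nested result. The sketch below flattens the output with LATERAL FLATTEN. It assumes the function returns an object with a categories array whose elements have name and sentiment fields. That shape is an assumption, so inspect the raw output of the query above first and adjust the paths if your result differs.

```sql
SELECT
    p.REVIEWS                 AS RAW_REVIEWS,
    f.value:name::STRING      AS ENTITY,      -- assumed field name
    f.value:sentiment::STRING AS SENTIMENT    -- assumed field name
FROM AIRBNB_PROPERTIES_INFORMATION.PUBLIC.AIRBNB_PROPERTIES_INFORMATION p,
     LATERAL FLATTEN(
         -- Assumed result shape: { "categories": [ { "name": ..., "sentiment": ... }, ... ] }
         input => SNOWFLAKE.CORTEX.ENTITY_SENTIMENT(
             SNOWFLAKE.CORTEX.TRANSLATE(p.REVIEWS, '', 'en'),
             ['location', 'value for money', 'amenities']
         ):categories
     ) f
WHERE p.REVIEWS IS NOT NULL
;
```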

10. AI_CLASSIFY

The AI_CLASSIFY AISQL function in Snowflake Cortex categorizes a given text into one of several predefined labels specified by the user.

It lets you define your own categories, which makes it suitable for custom classification tasks like:

Sentiment tagging (Positive, Negative, Neutral)
Categorizing content types
Classifying support tickets, emails, or user reviews

The model uses natural language understanding to choose the most appropriate label from your list.

Syntax:

SNOWFLAKE.CORTEX.AI_CLASSIFY(input_text, labels_array)

Example 1: Classify Sentiment of Summarized Airbnb Reviews

The below query translates and summarizes Airbnb reviews, and then classifies the overall tone of each review as either Positive, Negative, or Neutral.

SELECT 
    REVIEWS AS RAW_REVIEWS,
    SNOWFLAKE.CORTEX.TRANSLATE(REVIEWS, '', 'en') AS TRANSLATED_REVIEWS,
    SNOWFLAKE.CORTEX.SUMMARIZE(TRANSLATED_REVIEWS) AS SUMMARIZED_REVIEWS,
    SNOWFLAKE.CORTEX.AI_CLASSIFY(SUMMARIZED_REVIEWS, ['Positive', 'Negative', 'Neutral']) AS REVIEW_SENTIMENT
FROM AIRBNB_PROPERTIES_INFORMATION.PUBLIC.AIRBNB_PROPERTIES_INFORMATION
ORDER BY RAW_REVIEWS DESC
;

Result:

The below image shows how AI_CLASSIFY categorizes summarized reviews into Positive, Negative, or Neutral sentiment.

Example 2: Identify Property Type Based on Description

The below query classifies the type of property based on its description using categories like Apartment, House, Cabin, etc. The description is first translated to English and then passed to AI_CLASSIFY.

SELECT 
    DESCRIPTION AS RAW_DESCRIPTION,
    SNOWFLAKE.CORTEX.TRANSLATE(DESCRIPTION, '', 'en') AS EN_DESCRIPTION,
    SNOWFLAKE.CORTEX.AI_CLASSIFY(
        EN_DESCRIPTION,
        ['Apartment', 'House', 'Cabin', 'Studio', 'Villa', 'Shared Room', 'Other']
    ) AS PROPERTY_TYPE
FROM AIRBNB_PROPERTIES_INFORMATION.PUBLIC.AIRBNB_PROPERTIES_INFORMATION
;

Result:

The below image shows AI_CLASSIFY determining the property type based on the English-translated listing description.
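
The classification result is returned as a structured value rather than a plain string. The sketch below pulls the chosen label into its own column, assuming the result is an object with a labels array. The sample description is made up for illustration, and the path should be checked against your actual output.

```sql
SELECT
    SNOWFLAKE.CORTEX.AI_CLASSIFY(
        'Cozy one-bedroom flat on the third floor, close to the metro.',   -- made-up sample text
        ['Apartment', 'House', 'Cabin', 'Studio', 'Villa', 'Shared Room', 'Other']
    ) AS RAW_RESULT,
    -- Assumed result shape: { "labels": [ "Apartment" ] }
    RAW_RESULT:labels[0]::STRING AS PROPERTY_TYPE
;
```

A plain string label is easier to group by or join on than the raw result.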

11. AI_SIMILARITY

The AI_SIMILARITY AISQL function in Snowflake Cortex measures how similar two text inputs are using semantic (meaning-based) comparison, not just string matching.

It returns a numeric score in which higher values mean greater similarity:

Scores close to 1 mean the two inputs are nearly identical in meaning
Scores near 0 or below mean the inputs are unrelated

Syntax:

SNOWFLAKE.CORTEX.AI_SIMILARITY(input_text1, input_text2)

Example: Match Raw Country Names to Standardized Names

The below query compares each raw country name (like “USA”, “U.S.A.”, “America”) against a list of standard country names (like “United States”) using AI_SIMILARITY.

It keeps only the most similar standard name for each raw input by ranking the candidates with the ROW_NUMBER() window function in a QUALIFY clause.

CREATE OR REPLACE TABLE raw_country_names (
    id INT,
    raw_name STRING
);

INSERT INTO raw_country_names (id, raw_name) VALUES
(1, 'USA'),
(2, 'U.S.A.'),
(3, 'America'),
(4, 'United States'),
(5, 'UK'),
(6, 'U.K.'),
(7, 'Great Britain'),
(8, 'England'),
(9, 'India'),
(10, 'Bharat'),
(11, 'Deutschland'),
(12, 'Federal Republic of Germany'),
(13, 'Korea'),
(14, 'South Korea'),
(15, 'Republic of Korea');


CREATE OR REPLACE TABLE standard_country_names (
    standard_name STRING
);

INSERT INTO standard_country_names (standard_name) VALUES
('United States'),
('United Kingdom'),
('India'),
('Germany'),
('South Korea');

-- Keep the best-matching standard name for each raw name using AI_SIMILARITY
SELECT 
    r.ID,
    r.RAW_NAME,
    s.STANDARD_NAME,
    ROUND(
        SNOWFLAKE.CORTEX.AI_SIMILARITY(
            r.RAW_NAME,
            s.STANDARD_NAME
        ),
        2
    ) AS SIMILARITY_SCORE
FROM RAW_COUNTRY_NAMES r
CROSS JOIN STANDARD_COUNTRY_NAMES s
QUALIFY ROW_NUMBER() OVER (PARTITION BY r.ID ORDER BY SIMILARITY_SCORE DESC) = 1
ORDER BY r.ID
;

Result:

The below image shows the AI_SIMILARITY function matching raw country names to their closest standard names based on semantic similarity.
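
The best match is not always a good match, so it can be useful to flag low scores for manual review. The following sketch builds on the two tables created above. The 0.6 threshold and the ACCEPT and REVIEW labels are arbitrary assumptions that would need tuning against real data.

```sql
WITH best_match AS (
    -- Same matching logic as above, keeping the unrounded score
    SELECT
        r.ID,
        r.RAW_NAME,
        s.STANDARD_NAME,
        SNOWFLAKE.CORTEX.AI_SIMILARITY(r.RAW_NAME, s.STANDARD_NAME) AS SIMILARITY_SCORE
    FROM RAW_COUNTRY_NAMES r
    CROSS JOIN STANDARD_COUNTRY_NAMES s
    QUALIFY ROW_NUMBER() OVER (PARTITION BY r.ID ORDER BY SIMILARITY_SCORE DESC) = 1
)
SELECT
    ID,
    RAW_NAME,
    STANDARD_NAME,
    ROUND(SIMILARITY_SCORE, 2) AS SCORE_ROUNDED,
    -- Illustrative threshold: anything below 0.6 is flagged for a human to check
    IFF(SIMILARITY_SCORE >= 0.6, 'ACCEPT', 'REVIEW') AS MATCH_STATUS
FROM best_match
ORDER BY ID
;
```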

12. COUNT_TOKENS

The COUNT_TOKENS AISQL function in Snowflake Cortex counts the number of tokens in the input text for a given Cortex model or function.

Tokens are the chunks of text that AI models process (such as words, subwords, or characters). Counting them helps estimate:

Prompt size
Token limits
Cost and performance planning for AI functions

Different models may tokenize the same text differently, so COUNT_TOKENS helps you check cost estimation and compatibility before calling an AI function.

Syntax:

SNOWFLAKE.CORTEX.COUNT_TOKENS(model_name, input_text)

SNOWFLAKE.CORTEX.COUNT_TOKENS(function_name , input_text )

model_name: The name of the Cortex model (e.g., 'mistral-large2', 'snowflake-arctic')
function_name: Specify one of the following values:
- extract_answer
- sentiment
- summarize
- translate

Example: Check Token Usage Across Models

The below query counts how many tokens each description contains for different Cortex models and functions. This helps assess whether the text is within size limits for each model.

SELECT 
    DESCRIPTION AS RAW_DESCRIPTION,
    SNOWFLAKE.CORTEX.COUNT_TOKENS('mistral-large2', DESCRIPTION) AS TOKEN_COUNT,
    SNOWFLAKE.CORTEX.COUNT_TOKENS('extract_answer', DESCRIPTION) AS EXTRACT_ANS_TC,
    SNOWFLAKE.CORTEX.COUNT_TOKENS('sentiment', DESCRIPTION) AS SENTIMENT_TC,
    SNOWFLAKE.CORTEX.COUNT_TOKENS('summarize', DESCRIPTION) AS SUMMARIZE_TC,
    SNOWFLAKE.CORTEX.COUNT_TOKENS('translate', DESCRIPTION) AS TRANSLATE_TC
FROM AIRBNB_PROPERTIES_INFORMATION.PUBLIC.AIRBNB_PROPERTIES_INFORMATION
ORDER BY TOKEN_COUNT DESC
;

Result:

The below image shows the output of the COUNT_TOKENS function, which calculates the number of tokens for each description across different Cortex models to help estimate processing cost and input limits.
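
For planning purposes, the per-row counts can be rolled up into totals. The sketch below sums the tokens across all descriptions for one model and also reports the largest single description. The choice of model and the column aliases are illustrative.

```sql
SELECT
    COUNT(*) AS DESCRIPTION_COUNT,
    -- Total tokens that would be sent if every description were processed once
    SUM(SNOWFLAKE.CORTEX.COUNT_TOKENS('mistral-large2', DESCRIPTION)) AS TOTAL_TOKENS,
    -- Largest single input, useful for checking against a model's input limit
    MAX(SNOWFLAKE.CORTEX.COUNT_TOKENS('mistral-large2', DESCRIPTION)) AS MAX_TOKENS_PER_ROW
FROM AIRBNB_PROPERTIES_INFORMATION.PUBLIC.AIRBNB_PROPERTIES_INFORMATION
WHERE DESCRIPTION IS NOT NULL
;
```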