SQL Analytical Functions Interview Questions

Spread the love

1. SQL Analytic Functions VS Aggregate Functions

An Analytic function calculates an aggregate value over a group of rows and returns a single result for each row of the group. This is different from an aggregate function, which returns a single result for each group of rows.

Before getting into interview questions let us try to understand the differences between the Analytic Functions and Aggregate Functions.

Let us understand the differences with an example. Oracle HR Schema’s Employee table is used in all the demonstrations in the article.

Employee table sample data:

Average Salary of all employees

SELECT AVG(Salary) as AVG_SAL FROM EMPLOYEES

AVG_SAL
------------------
6461.83

Here the average is calculated for all the employees of all the departments present in the table.

Average Salary of all employees Department wise

SELECT Department_Id, AVG(Salary) as AVG_SAL FROM EMPLOYEES 
GROUP BY Department_Id ORDER BY Department_Id

Here average salary of employees is calculated department wise. Here we can observe that though there may be any number of employee records per department, only one record per department is returned as a result.

Displaying average salary against each employee record

SELECT
    B.Employee_Id,
    B.First_Name,
    B.Department_Id,
    B.Salary,
    A.AVG_SAL
FROM
(SELECT Department_Id, AVG(Salary) as AVG_SAL FROM Employees GROUP BY Department_Id) A,
EMPLOYEES B
WHERE A.Department_Id = B.Department_Id

Here we are displaying all the employee records with average salary of employees of the department in a separate column AVG_SAL.

To achieve this we have performed a self-join on Employee table. We have taken the result of “Average salary per department in Employee table” and “entire Employee table” and joined them based on the Department_Id.

The same result can be achieved by a query using the Analytic function as below.

SELECT Employee_Id, 
       First_Name, 
       Department_Id, 
       Salary, 
       AVG(Salary) OVER(PARTITION BY Department_Id) as AVG_SAL 
FROM EMPLOYEES ORDER BY Employee_Id

The results are same but the queries are different. Using the aggregate functions we can display only those columns which are in GROUP BY clause. But using an Analytic function you can display all the columns along with aggregated result.

I hope the difference between the two is clear now.

Related Article: SQL Scenario based Interview Questions

2. Analytic Functions Syntax

Before proceeding further let us understand the syntax of Analytic Functions. The syntax may vary for each function but below is the basic syntax.

Analytic_Function([ arguments ]) OVER ([partition_clause ] [ order_by_clause [ windowing_clause ] ])

An argument is the column on which the aggregation is done.

Partition Clause defines the group of rows upon which the aggregation needs to be done. For example in the previous example Partition is specified on Department_Id.

If Partition Clause is not specified the aggregation is done on all the records of the table.

SELECT Employee_Id, 
       First_Name, 
       Department_Id, 
       Salary, AVG(Salary) OVER() as AVG_SAL 
FROM EMPLOYEES

Here the AVG_SAL is calculated for all the employees instead of department wise as partition clause is empty.

Order By Clause is used to order the rows with in the partition. If an analytic function behaves differently according to order of rows in a partition, the order by clause needs to be specified. We shall see more about this in below examples.

Windowing Clause just like partition clause gives a further degree of control over the window within which an analytic function can apply.

This is an extension of order by clause. So we can’t use windowing clause without order by clause.

3. Calculate total sum of salary department wise

SELECT Employee_Id, 
       First_Name, 
       Department_Id, 
       Salary, 
       SUM(Salary) OVER(PARTITION BY Department_Id) as SUM_SAL 
FROM EMPLOYEES ORDER BY Employee_Id

Here sum of salary per department is calculated and displayed against each employee record.

For Department_ID=90, the sum is calculated as SUM(24000,17000,17000)=58000

4. Calculate cumulative sum of salary department wise

SELECT Employee_Id, 
       First_Name, 
       Department_Id, 
       Salary, 
       SUM(Salary) OVER(PARTITION BY Department_Id ORDER BY Employee_Id) as CUML_SAL 
FROM EMPLOYEES ORDER BY Employee_Id

The order by clause addition changed the behavior of the sum function allowing it to calculate the cumulative sum.

For emp_id 100, the CUML_SAL is calculated as SUM(24000) =24000
For emp_id 101, the CUML_SAL is calculated as SUM (24000,17000) =41000
For emp_id 102, the CUML_SAL is calculated as SUM (24000,17000,17000) =58000

5. Calculate cumulative sum of the organization

SELECT Employee_Id, 
       First_Name, 
       Department_Id, 
       Salary, 
       SUM(Salary) OVER(ORDER BY Employee_Id) as CUML_SAL 
FROM EMPLOYEES ORDER BY Employee_Id

Since there is no partition clause is mentioned, the sum keeps increasing for each record and final record will give us the total sum of salary of all employees.

6. Calculate Cumulative average of the salary department wise

SELECT Employee_Id, 
       First_Name, 
       Department_Id, 
       Salary, 
       AVG(Salary) OVER(PARTITION BY Department_Id ORDER BY Employee_Id) as AVG_SAL 
FROM EMPLOYEES ORDER BY Employee_Id

For emp_id 100, the AVG_SAL is calculated as AVG(24000) =24000
For emp_id 101, the AVG_SAL is calculated as AVG(24000,17000) =20500
For emp_id 102, the AVG_SAL is calculated as AVG(24000,17000,17000) =19333.33

Hence the last record of the department gives the overall average of the salary for that department.

7. Calculate average of salary for current and previous record department wise

SELECT Employee_Id, 
       First_Name, 
       Department_Id, 
       Salary, 
       AVG(Salary) OVER(PARTITION BY Department_Id ORDER BY Employee_Id ROWS 1 PRECEDING) as AVG_SAL 
FROM EMPLOYEES ORDER BY Employee_Id

For emp_id 100, the AVG_SAL is calculated as AVG(24000) =24000
For emp_id 101, the AVG_SAL is calculated as AVG(24000,17000) =20500
For emp_id 102, the AVG_SAL is calculated as AVG(17000,17000) =17000

Here the AVG_SAL value is the average of the salary of current row and previous row in a department. This is achieved using the windowing function ROWS 1 PRECEDING after the Order by clause.

8. Find the oldest joinee department wise using LAG Analytic function

LAG Analytic function helps to fetch the row details of the previous record.

SELECT Employee_Id, 
       First_Name, 
       Department_Id, 
       Hire_date, 
       LAG(Hire_date) OVER(PARTITION BY Department_Id ORDER BY Hire_date) as PREV_HIREDATE 
FROM EMPLOYEES

For employee 102, the PREV_HIREDATE is NULL as he is the oldest joinee in the department.
For employee 100, the PREV_HIREDATE is the hire date of the employee 102 and so on.

The below query fetches the oldest joinee details department wise

SELECT * FROM(
       SELECT Employee_Id, 
              First_Name, 
              Department_Id, 
              Hire_date, 
              LAG(Hire_date) OVER(PARTITION BY Department_Id ORDER BY Hire_date) as PREV_HIREDATE 
       FROM EMPLOYEES)
WHERE PREV_HIREDATE IS NULL

9. Find the newest joinee department wise using LAG Analytic function

By adding DESC to the ORDER BY clause we can change the order in which LAG Analytic function calculates the previous value.

SELECT Employee_Id, 
       First_Name, 
       Department_Id, 
       Hire_date, 
       LAG(Hire_date) OVER(PARTITION BY Department_Id ORDER BY Hire_date desc) as NEXT_HIREDATE 
FROM EMPLOYEES 
ORDER BY Department_Id, Hire_date

Since we have mentioned DESC, the Lag starts from the highest Hire date and since there won’t be any value before that, the lag value is calculated as NULL

The below query fetches the newst joinee details department wise

SELECT * FROM(
       SELECT Employee_Id, 
              First_Name, 
              Department_Id, 
              Hire_date, 
              LAG(Hire_date) OVER(PARTITION BY Department_Id ORDER BY Hire_date desc) as NEXT_HIREDATE 
       FROM EMPLOYEES )
WHERE NEXT_HIREDATE IS NULL

10. Find the oldest joinee department wise using LEAD Analytic function

LEAD is exact opposite of the LAG Analytic function. LAG fetches the row details of previous record and LEAD fetches the row details of next record.

SELECT Employee_Id, 
       First_Name, 
       Department_Id, 
       Hire_date, 
       LEAD(Hire_date) OVER(PARTITION BY Department_Id ORDER BY Hire_date DESC) as PREV_HIREDATE 
FROM EMPLOYEES 
ORDER BY Department_Id, Hire_date

Since LEAD fetches the next value we used the DESC in the ORDER BY clause so that the LEAD Analytic function starts working from latest Hire date and keep picking the hire date of the employee joined before him. So finally for the oldest joinee the value will be NULL.

 The below query fetches the oldest joinee details department wise

SELECT * FROM(
       SELECT Employee_Id, 
              First_Name, 
              Department_Id, 
              Hire_date, 
              LEAD(Hire_date) OVER(PARTITION BY Department_Id ORDER BY Hire_date DESC) as PREV_HIREDATE 
       FROM EMPLOYEES)
WHERE PREV_HIREDATE IS NULL

11. Find the newest joinee department wise using LEAD Analytic function

SELECT Employee_Id, 
       First_Name, 
       Department_Id, 
       Hire_date, 
       LEAD(Hire_date) OVER(PARTITION BY Department_Id ORDER BY Hire_date) as NEXT_HIREDATE 
FROM EMPLOYEES

Since the LEAD Analytic function fetches the value of the next record, the newest joinee will have the value as NULL.

The below query fetches the newest joinee details department wise

SELECT * FROM(
       SELECT Employee_Id, 
              First_Name, 
              Department_Id, 
              Hire_date, 
              LEAD(Hire_date) OVER(PARTITION BY Department_Id ORDER BY Hire_date) as NEXT_HIREDATE 
       FROM EMPLOYEES)
WHERE NEXT_HIREDATE IS NULL

12. RANK and DENSE_RANK

SELECT Employee_Id, 
       First_Name, 
       Department_Id, 
       Salary, 
       RANK() OVER(PARTITION BY Department_Id ORDER BY Salary) as SAL_RANK 
FROM EMPLOYEES

As the employees 101 and 102 have the same salary amount, the RANK function skipped the rank 2 in above example.

As we have given ORDER BY clause by Salary, the rank starts from the lowest salary to highest salary. The same can be observed for both departments 90 and 100. The ranking started from lowest to the highest salary value. This behavior can be changed by adding the DESC at the end of ORDER BY clause. Then the RANK starts ranking salary from highest to lowest.

SELECT Employee_Id, 
       First_Name, 
       Department_Id, 
       Salary, 
       DENSE_RANK() OVER(PARTITION BY Department_Id ORDER BY Salary) as SAL_RANK 
FROM EMPLOYEES

DENSE_RANK will not skip the rank value like in RANK function.

In the above example employees 101 and 102 both have same salary value and are awarded rank 1. The next salary value is awarded rank 2 unlike RANK.

12.1 Find employee with MAX salary department wise

SELECT * FROM(
       SELECT Employee_Id, 
              First_Name, 
              Department_Id, 
              Salary, 
              RANK() OVER(PARTITION BY Department_Id ORDER BY Salary DESC) as SAL_RANK 
       FROM EMPLOYEES
) 
WHERE SAL_RANK =1

12.2 Find employee with MIN salary department wise

SELECT * FROM(
       SELECT Employee_Id, 
              First_Name, 
              Department_Id, 
              Salary, 
              RANK() OVER(PARTITION BY Department_Id ORDER BY Salary) as SAL_RANK 
       FROM EMPLOYEES
) 
WHERE SAL_RANK =1

If the question is to find the employee with 2nd max or min salary change the SAL_RANK accordingly in the query.

13. Find the difference between the salary of an employee and max salary of the employee in the department

SELECT Employee_Id, 
       First_Name, 
       Department_Id, 
       Salary, 
       MAX(Salary) OVER(PARTITION BY Department_Id ) as MAX_SAL, 
       (MAX(Salary) OVER(PARTITION BY Department_Id )-Salary ) as SAL_DIFF 
FROM EMPLOYEES 
ORDER BY Employee_Id

In the above query MAX_SAL represents the maximum salary of an employee in that department. SAL_DIFF gives the difference between the Max salary in the department and the salary of the employee.

14. Find employee with MAX salary department wise without using RANK or DENSE_RANK

This can be achieved using the MAX function by extending the above query. The employee with MAX salary will have a SAL_DIFF as 0.

SELECT * FROM(
       SELECT Employee_Id, 
              First_Name, 
              Department_Id, 
              Salary, 
              MAX(Salary) OVER(PARTITION BY Department_Id ) as MAX_SAL,
              (MAX(Salary) OVER(PARTITION BY Department_Id )-Salary ) as SAL_DIFF 
       FROM EMPLOYEES) 
WHERE SAL_DIFF = 0


Related Articles:

  • Top Advanced SQL Interview Questions
  • SQL Scenario based Interview Questions
  • SQL Analytical Functions Interview Questions

15. Test your Understanding – SQL Analytic Functions QUIZ

Test Your Understanding:

SQL Analytic Functions Quiz

Five Multiple Choice Questions

16 thoughts on “SQL Analytical Functions Interview Questions”

    • If we order by department_id, we get the total sum of salary of entire department against each employee of that department which would be same as “3.Calculate total sum of salary department wise”
      Since we need cumulative salary we used employee_id as reference.

      Reply
  1. Thank you for your time and effort for this wonderful work of sharing and helping others. Keep up the good work you never know how much you are helping and sharing your knowledge. God bless you

    Reply
  2. Hi. Thank you for your effort and a very good job. I just need to mention one thing. Take a look at this question:

    Which of the below statements about Analytic Functions is FALSE?
    Your answer: The PARTITION clause is not a mandatory clause in every Analytic Function

    Correct answer: The Windowing Clause can be used without order by clause

    In fact, I was able to use the rank and dense_rank functions without using the partition by clause. It was very intuitive. I just thought I could try to rank the salaries organization-wise, so I opted the partition by department_id clause and that gave the salary rankings for the whole organization.

    Reply
    • When you gave partition by department_id, it should not have returned the salaries ranked by whole organization unless there is only one department in the organization. Could you please share the SQL you tried and also confirm if you are querying against the employee data set that was provided by us at the beginning of the article.
      Thanks

      Reply

Leave a Comment

Related Posts