1. SQL Analytic Functions VS Aggregate Functions
An Analytic function calculates an aggregate value over a group of rows and returns a single result for each row of the group. This is different from an aggregate function, which returns a single result for each group of rows.
Before getting into interview questions let us try to understand the differences between the Analytic Functions and Aggregate Functions.
Let us understand the differences with an example. Oracle HR Schema’s Employee table is used in all the demonstrations in the article.
Employee table sample data:
Average Salary of all employees
SELECT AVG(Salary) AS AVG_SAL FROM EMPLOYEES;
AVG_SAL
------------------
6461.83
Here the average is calculated for all the employees of all the departments present in the table.
Average Salary of all employees Department wise
SELECT
Department_Id,
AVG(Salary) AS AVG_SAL
FROM EMPLOYEES
GROUP BY Department_Id ORDER BY Department_Id;
Here average salary of employees is calculated department wise. Here we can observe that though there may be any number of employee records per department, only one record per department is returned as a result.
Displaying average salary against each employee record
SELECT
B.Employee_Id,
B.First_Name,
B.Department_Id,
B.Salary,
A.AVG_SAL
FROM
(SELECT Department_Id, AVG(Salary) AS AVG_SAL FROM Employees GROUP BY Department_Id) A,
EMPLOYEES B
WHERE A.Department_Id = B.Department_Id;
Here we are displaying all the employee records with average salary of employees of the department in a separate column AVG_SAL.
To achieve this we have performed a self-join on Employee table. We have taken the result of “Average salary per department in Employee table” and “entire Employee table” and joined them based on the Department_Id.
The same result can be achieved by a query using the Analytic function as below.
SELECT
Employee_Id,
First_Name,
Department_Id,
Salary,
AVG(Salary) OVER(PARTITION BY Department_Id) AS AVG_SAL
FROM EMPLOYEES ORDER BY Employee_Id;
The results are same but the queries are different. Using the aggregate functions we can display only those columns which are in GROUP BY clause. But using an Analytic function you can display all the columns along with aggregated result.
I hope the difference between the two is clear now.
Related Article: SQL Scenario based Interview Questions
2. Analytic Functions Syntax
Before proceeding further let us understand the syntax of Analytic Functions. The syntax may vary for each function but below is the basic syntax.
Analytic_Function([ arguments ]) OVER ([partition_clause ] [ order_by_clause [ windowing_clause ] ])
An argument is the column on which the aggregation is done.
Partition Clause defines the group of rows upon which the aggregation needs to be done. For example in the previous example Partition is specified on Department_Id.
If Partition Clause is not specified the aggregation is done on all the records of the table.
SELECT
Employee_Id,
First_Name,
Department_Id,
Salary, AVG(Salary) OVER() AS AVG_SAL
FROM EMPLOYEES;
Here the AVG_SAL is calculated for all the employees instead of department wise as partition clause is empty.
Order By Clause is used to order the rows with in the partition. If an analytic function behaves differently according to order of rows in a partition, the order by clause needs to be specified. We shall see more about this in below examples.
Windowing Clause just like partition clause gives a further degree of control over the window within which an analytic function can apply.
This is an extension of order by clause. So we can’t use windowing clause without order by clause.
3. Calculate total sum of salary department wise
SELECT
Employee_Id,
First_Name,
Department_Id,
Salary,
SUM(Salary) OVER(PARTITION BY Department_Id) AS SUM_SAL
FROM EMPLOYEES ORDER BY Employee_Id;
Here sum of salary per department is calculated and displayed against each employee record.
For Department_ID=90, the sum is calculated as SUM(24000,17000,17000)=58000
4. Calculate cumulative sum of salary department wise
SELECT
Employee_Id,
First_Name,
Department_Id,
Salary,
SUM(Salary) OVER(PARTITION BY Department_Id ORDER BY Employee_Id) AS CUML_SAL
FROM EMPLOYEES ORDER BY Employee_Id;
The order by clause addition changed the behavior of the sum function allowing it to calculate the cumulative sum.
For emp_id 100, the CUML_SAL is calculated as SUM(24000) =24000
For emp_id 101, the CUML_SAL is calculated as SUM (24000,17000) =41000
For emp_id 102, the CUML_SAL is calculated as SUM (24000,17000,17000) =58000
5. Calculate cumulative sum of the organization
SELECT
Employee_Id,
First_Name,
Department_Id,
Salary,
SUM(Salary) OVER(ORDER BY Employee_Id) AS CUML_SAL
FROM EMPLOYEES ORDER BY Employee_Id;
Since there is no partition clause is mentioned, the sum keeps increasing for each record and final record will give us the total sum of salary of all employees.
6. Calculate Cumulative average of the salary department wise
SELECT
Employee_Id,
First_Name,
Department_Id,
Salary,
AVG(Salary) OVER(PARTITION BY Department_Id ORDER BY Employee_Id) AS AVG_SAL
FROM EMPLOYEES ORDER BY Employee_Id;
For emp_id 100, the AVG_SAL is calculated as AVG(24000) =24000
For emp_id 101, the AVG_SAL is calculated as AVG(24000,17000) =20500
For emp_id 102, the AVG_SAL is calculated as AVG(24000,17000,17000) =19333.33
Hence the last record of the department gives the overall average of the salary for that department.
7. Calculate average of salary for current and previous record department wise
SELECT
Employee_Id,
First_Name,
Department_Id,
Salary,
AVG(Salary) OVER(PARTITION BY Department_Id ORDER BY Employee_Id ROWS 1 PRECEDING) AS AVG_SAL
FROM EMPLOYEES ORDER BY Employee_Id;
For emp_id 100, the AVG_SAL is calculated as AVG(24000) =24000
For emp_id 101, the AVG_SAL is calculated as AVG(24000,17000) =20500
For emp_id 102, the AVG_SAL is calculated as AVG(17000,17000) =17000
Here the AVG_SAL value is the average of the salary of current row and previous row in a department. This is achieved using the windowing function ROWS 1 PRECEDING after the Order by clause.
8. Find the oldest joinee department wise using LAG Analytic function
LAG Analytic function helps to fetch the row details of the previous record.
SELECT
Employee_Id,
First_Name,
Department_Id,
Hire_date,
LAG(Hire_date) OVER(PARTITION BY Department_Id ORDER BY Hire_date) AS PREV_HIREDATE
FROM EMPLOYEES;
For employee 102, the PREV_HIREDATE is NULL as he is the oldest joinee in the department.
For employee 100, the PREV_HIREDATE is the hire date of the employee 102 and so on.
The below query fetches the oldest joinee details department wise
SELECT * FROM(
SELECT
Employee_Id,
First_Name,
Department_Id,
Hire_date,
LAG(Hire_date) OVER(PARTITION BY Department_Id ORDER BY Hire_date) AS PREV_HIREDATE
FROM EMPLOYEES)
WHERE PREV_HIREDATE IS NULL;
9. Find the newest joinee department wise using LAG Analytic function
By adding DESC to the ORDER BY clause we can change the order in which LAG Analytic function calculates the previous value.
SELECT
Employee_Id,
First_Name,
Department_Id,
Hire_date,
LAG(Hire_date) OVER(PARTITION BY Department_Id ORDER BY Hire_date DESC) AS NEXT_HIREDATE
FROM EMPLOYEES
ORDER BY Department_Id, Hire_date;
Since we have mentioned DESC, the Lag starts from the highest Hire date and since there won’t be any value before that, the lag value is calculated as NULL
The below query fetches the newst joinee details department wise
SELECT * FROM(
SELECT
Employee_Id,
First_Name,
Department_Id,
Hire_date,
LAG(Hire_date) OVER(PARTITION BY Department_Id ORDER BY Hire_date DESC) AS NEXT_HIREDATE
FROM EMPLOYEES )
WHERE NEXT_HIREDATE IS NULL;
10. Find the oldest joinee department wise using LEAD Analytic function
LEAD is exact opposite of the LAG Analytic function. LAG fetches the row details of previous record and LEAD fetches the row details of next record.
SELECT
Employee_Id,
First_Name,
Department_Id,
Hire_date,
LEAD(Hire_date) OVER(PARTITION BY Department_Id ORDER BY Hire_date DESC) AS PREV_HIREDATE
FROM EMPLOYEES
ORDER BY Department_Id, Hire_date;
Since LEAD fetches the next value we used the DESC in the ORDER BY clause so that the LEAD Analytic function starts working from latest Hire date and keep picking the hire date of the employee joined before him. So finally for the oldest joinee the value will be NULL.
The below query fetches the oldest joinee details department wise
SELECT * FROM(
SELECT
Employee_Id,
First_Name,
Department_Id,
Hire_date,
LEAD(Hire_date) OVER(PARTITION BY Department_Id ORDER BY Hire_date DESC) AS PREV_HIREDATE
FROM EMPLOYEES)
WHERE PREV_HIREDATE IS NULL;
11. Find the newest joinee department wise using LEAD Analytic function
SELECT
Employee_Id,
First_Name,
Department_Id,
Hire_date,
LEAD(Hire_date) OVER(PARTITION BY Department_Id ORDER BY Hire_date) AS NEXT_HIREDATE
FROM EMPLOYEES;
Since the LEAD Analytic function fetches the value of the next record, the newest joinee will have the value as NULL.
The below query fetches the newest joinee details department wise
SELECT * FROM(
SELECT
Employee_Id,
First_Name,
Department_Id,
Hire_date,
LEAD(Hire_date) OVER(PARTITION BY Department_Id ORDER BY Hire_date) AS NEXT_HIREDATE
FROM EMPLOYEES)
WHERE NEXT_HIREDATE IS NULL;
12. RANK and DENSE_RANK
SELECT
Employee_Id,
First_Name,
Department_Id,
Salary,
RANK() OVER(PARTITION BY Department_Id ORDER BY Salary) AS SAL_RANK
FROM EMPLOYEES;
As the employees 101 and 102 have the same salary amount, the RANK function skipped the rank 2 in above example.
As we have given ORDER BY clause by Salary, the rank starts from the lowest salary to highest salary. The same can be observed for both departments 90 and 100. The ranking started from lowest to the highest salary value. This behavior can be changed by adding the DESC at the end of ORDER BY clause. Then the RANK starts ranking salary from highest to lowest.
SELECT
Employee_Id,
First_Name,
Department_Id,
Salary,
DENSE_RANK() OVER(PARTITION BY Department_Id ORDER BY Salary) AS SAL_RANK
FROM EMPLOYEES;
DENSE_RANK will not skip the rank value like in RANK function.
In the above example employees 101 and 102 both have same salary value and are awarded rank 1. The next salary value is awarded rank 2 unlike RANK.
12.1 Find employee with MAX salary department wise
SELECT * FROM(
SELECT
Employee_Id,
First_Name,
Department_Id,
Salary,
RANK() OVER(PARTITION BY Department_Id ORDER BY Salary DESC) AS SAL_RANK
FROM EMPLOYEES
)
WHERE SAL_RANK =1;
12.2 Find employee with MIN salary department wise
SELECT * FROM(
SELECT
Employee_Id,
First_Name,
Department_Id,
Salary,
RANK() OVER(PARTITION BY Department_Id ORDER BY Salary) AS SAL_RANK
FROM EMPLOYEES
)
WHERE SAL_RANK =1;
If the question is to find the employee with 2nd max or min salary change the SAL_RANK accordingly in the query.
13. Find the difference between the salary of an employee and max salary of the employee in the department
SELECT
Employee_Id,
First_Name,
Department_Id,
Salary,
MAX(Salary) OVER(PARTITION BY Department_Id ) AS MAX_SAL,
(MAX(Salary) OVER(PARTITION BY Department_Id )-Salary ) AS SAL_DIFF
FROM EMPLOYEES
ORDER BY Employee_Id;
In the above query MAX_SAL represents the maximum salary of an employee in that department. SAL_DIFF gives the difference between the Max salary in the department and the salary of the employee.
14. Find employee with MAX salary department wise without using RANK or DENSE_RANK
This can be achieved using the MAX function by extending the above query. The employee with MAX salary will have a SAL_DIFF as 0.
SELECT * FROM(
SELECT
Employee_Id,
First_Name,
Department_Id,
Salary,
MAX(Salary) OVER(PARTITION BY Department_Id ) AS MAX_SAL,
(MAX(Salary) OVER(PARTITION BY Department_Id )-Salary ) AS SAL_DIFF
FROM EMPLOYEES
)
WHERE SAL_DIFF = 0;
Subscribe to our Newsletter !!
Related Articles:
15. Test your Understanding – SQL Analytic Functions QUIZ
Test Your Understanding:
SQL Analytic Functions Quiz
Five Multiple Choice Questions
4.Calculate cumulative sum of salary department wise,(results are not same as query)
I guess the order by clause should be by deptno
If we order by department_id, we get the total sum of salary of entire department against each employee of that department which would be same as “3.Calculate total sum of salary department wise”
Since we need cumulative salary we used employee_id as reference.
Thank you so much for detailed explaining. I got good understanding for analytical function.
Thank you. Glad it helped!!
Walahi Thank you for such a thing, MORE THAN ENOUGH
👍👍
its very easy understandable for every one
Thank you!! Glad it helped.
Thank you for your time and effort for this wonderful work of sharing and helping others. Keep up the good work you never know how much you are helping and sharing your knowledge. God bless you
Thank you!! Glad it helped!!
Very helpful. I cracker my interview. thank you so much
Congratulations!! All the best👍!!
Very Nice explanation for such scenario based sql for windows function.
Thank you!!
Hi. Thank you for your effort and a very good job. I just need to mention one thing. Take a look at this question:
Which of the below statements about Analytic Functions is FALSE?
Your answer: The PARTITION clause is not a mandatory clause in every Analytic Function
Correct answer: The Windowing Clause can be used without order by clause
In fact, I was able to use the rank and dense_rank functions without using the partition by clause. It was very intuitive. I just thought I could try to rank the salaries organization-wise, so I opted the partition by department_id clause and that gave the salary rankings for the whole organization.
When you gave partition by department_id, it should not have returned the salaries ranked by whole organization unless there is only one department in the organization. Could you please share the SQL you tried and also confirm if you are querying against the employee data set that was provided by us at the beginning of the article.
Thanks
How come “Windowing functions can be used without order by clause” is false?
Window functions can be used without order clause right ?
Eg: SELECT Employee_Id,
First_Name,
Department_Id,
Salary,
SUM(Salary) OVER(PARTITION BY Department_Id) as SUM_SAL
FROM EMPLOYEES ORDER BY Employee_Id
It is Windowing Clause that is mentioned, not the Windowing function. Refer the syntax for more details.