diff --git a/README.md b/README.md index d8e2397..72abe28 100644 --- a/README.md +++ b/README.md @@ -1,59 +1,50 @@ -# Leetcode SQL Questions & Solutions
- -![alt text](https://github.com/cM2908/leetcode-sql/blob/main/LeetCode.png) - -#### Repository Contains :
- -(1) All Leetcode SQL Question Solutions
-(2) PostgreSQL Dump File (leetcodedb.sql)
- -#### Problem statements of all questions including leetcode premium questions :
- -(1) https://leetcode.ca
-(2) https://lifewithdata.com/sql
-(3) https://www.jiakaobo.com/leetcode
- -#### How to Import dump file using command line terminal?
- -(1) Open terminal & open psql utility -``` -user@my-machine:~$ psql -``` -(2) Create Database (To import the dump file, database should be created priorly)
-``` -postgres=# CREATE DATABASE sample_db; -``` -(3) Quit the psql promt -``` -postgres=# \q -``` -(4) From terminal, Load dump file into the newly created database using below command -``` -user@my-machine:~$ pg_restore --host "127.0.0.1" --port "5432" --username "postgres" --dbname "sample_db" --verbose "leetcodedb.sql" -``` -Replace your configurations(host,port,username) in pg_restore command
- -#### How to Import dump file using PgAdmin tool?

- -(1) Open PgAdmin & Create Database -``` -Servers -> Databases -> Create -> Database.. (Create Database dialog will get opened) -``` -(2) Restore Dump File
-``` -Right Click on newly created Database & select Restore option from menu (Restore dialog will get opened) -``` -Just Browse the dump file and keep other options as it is. - -#### Notes :
- -(1) Do not just copy-paste and run the content of dump file into either "psql promt in terminal" or "query tool of pgadmin".
- (Because dump file contains COPY commands not INSERTS,So doing such will cause errors.)
-(2) Table names are suffixed with question number.
-(3) New solutions will get added as I solve them.
- -#### Checkout my another repository which cantains Miscellaneous SQL Questions & Solutions :
+# LeetCode SQL Questions and Solutions + +![LeetCode SQL](https://github.com/cM2908/leetcode-sql/blob/main/LeetCode.png) + +This repository contains LeetCode SQL solutions organized by difficulty and documented as Markdown files. + +## Repository Structure + +- `easy/` +- `medium/` +- `hard/` + +Each solution file is named with the LeetCode question number and title, for example: + +- `easy/1050. Actors and Directors Who Cooperated At Least Three Times.md` +- `medium/176. Second Highest Salary.md` +- `hard/262. Trips and Users.md` + +## What Each Solution Includes + +Each Markdown solution document typically contains: + +- the LeetCode problem link +- the problem description +- table schema details +- sample input and expected output +- the SQL solution +- a short breakdown of the query logic + +## Problem Statement References + +Problem statements, including premium questions, can be referenced from: + +1. https://leetcode.ca +2. https://lifewithdata.com/sql +3. https://www.jiakaobo.com/leetcode + +## Notes + +1. Table names in the SQL queries are suffixed with the question number. +2. The repository is now documentation-oriented, so solutions are stored as `.md` files instead of standalone `.sql` files. +3. New solutions can be added in the same Markdown format as more problems are covered. + +## Other Resources + +Miscellaneous SQL repository: https://github.com/cM2908/misc-sql -#### Checkout my Blogs on interesting SQL topics :
+SQL blog: http://chintan-sql.blogspot.com diff --git a/dump_file/leetcodedb.sql b/dump_file/leetcodedb.sql deleted file mode 100644 index 0fef71e..0000000 Binary files a/dump_file/leetcodedb.sql and /dev/null differ diff --git a/easy/1050. Actors and Directors Who Cooperated At Least Three Times.md b/easy/1050. Actors and Directors Who Cooperated At Least Three Times.md new file mode 100644 index 0000000..8f101c7 --- /dev/null +++ b/easy/1050. Actors and Directors Who Cooperated At Least Three Times.md @@ -0,0 +1,73 @@ +# Question 1050: Actors and Directors Who Cooperated At Least Three Times + +**LeetCode URL:** https://leetcode.com/problems/actors-and-directors-who-cooperated-at-least-three-times/ + +## Description + +Write a solution to find all the pairs (actor_id, director_id) where the actor has cooperated with the director at least three times. Return the result table in any order. The result format is in the following example. + +## Table Schema Structure + +```sql +Create table If Not Exists ActorDirector (actor_id int, director_id int, timestamp int); +``` + +## Sample Input Data + +```sql +insert into ActorDirector (actor_id, director_id, timestamp) values ('1', '1', '0'); +insert into ActorDirector (actor_id, director_id, timestamp) values ('1', '1', '1'); +insert into ActorDirector (actor_id, director_id, timestamp) values ('1', '1', '2'); +insert into ActorDirector (actor_id, director_id, timestamp) values ('1', '2', '3'); +insert into ActorDirector (actor_id, director_id, timestamp) values ('1', '2', '4'); +insert into ActorDirector (actor_id, director_id, timestamp) values ('2', '1', '5'); +insert into ActorDirector (actor_id, director_id, timestamp) values ('2', '1', '6'); +``` + +## Expected Output Data + +```text ++-------------+-------------+ +| actor_id | director_id | ++-------------+-------------+ +| 1 | 1 | ++-------------+-------------+ +``` + +## SQL Solution + +```sql +SELECT actor_id,director_id +FROM actor_director_1050 +GROUP BY 
actor_id,director_id +HAVING COUNT(1)>=3; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `actor_id`, `director_id` from `actor_director`. + +### Result Grain + +One row per unique key in `GROUP BY actor_id,director_id`. + +### Step-by-Step Logic + +1. Aggregate rows with COUNT grouped by actor_id,director_id. +2. Project final output columns: `actor_id`, `director_id`. +3. Filter aggregated groups in `HAVING`: COUNT(1)>=3. + +### Why This Works + +Grouping keys define the reporting grain; aggregate functions then summarize values at exactly that grain. `HAVING` ensures only groups that satisfy business thresholds survive. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are sorting/grouping. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Every non-aggregated selected column must belong to the grouping grain. + diff --git a/easy/1050. Actors and Directors Who Cooperated At Least Three Times.sql b/easy/1050. Actors and Directors Who Cooperated At Least Three Times.sql deleted file mode 100644 index 79ac3c1..0000000 --- a/easy/1050. Actors and Directors Who Cooperated At Least Three Times.sql +++ /dev/null @@ -1,4 +0,0 @@ -SELECT actor_id,director_id -FROM actor_director_1050 -GROUP BY actor_id,director_id -HAVING COUNT(1)>=3; diff --git a/easy/1068. Product Sales Analysis I.md b/easy/1068. Product Sales Analysis I.md new file mode 100644 index 0000000..96a13a1 --- /dev/null +++ b/easy/1068. Product Sales Analysis I.md @@ -0,0 +1,73 @@ +# Question 1068: Product Sales Analysis I + +**LeetCode URL:** https://leetcode.com/problems/product-sales-analysis-i/ + +## Description + +Write a solution to report the product_name, year, and price for each sale_id in the Sales table. Return the resulting table in any order. The result format is in the following example. 
+ +## Table Schema Structure + +```sql +Create table If Not Exists Sales (sale_id int, product_id int, year int, quantity int, price int); +Create table If Not Exists Product (product_id int, product_name varchar(10)); +``` + +## Sample Input Data + +```sql +insert into Sales (sale_id, product_id, year, quantity, price) values ('1', '100', '2008', '10', '5000'); +insert into Sales (sale_id, product_id, year, quantity, price) values ('2', '100', '2009', '12', '5000'); +insert into Sales (sale_id, product_id, year, quantity, price) values ('7', '200', '2011', '15', '9000'); +insert into Product (product_id, product_name) values ('100', 'Nokia'); +insert into Product (product_id, product_name) values ('200', 'Apple'); +insert into Product (product_id, product_name) values ('300', 'Samsung'); +``` + +## Expected Output Data + +```text ++--------------+-------+-------+ +| product_name | year | price | ++--------------+-------+-------+ +| Nokia | 2008 | 5000 | +| Nokia | 2009 | 5000 | +| Apple | 2011 | 9000 | ++--------------+-------+-------+ +``` + +## SQL Solution + +```sql +SELECT p.product_name,s.year,s.price +FROM sales_1068 s +JOIN product_1068 p ON s.product_id = p.product_id; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `product_name`, `year`, `price` from `sales`, `product`. + +### Result Grain + +Row grain follows the post-filter join output. + +### Step-by-Step Logic + +1. Combine datasets using JOIN. Join predicates control row matching and prevent accidental cartesian growth. +2. Project final output columns: `product_name`, `year`, `price`. + +### Why This Works + +Join conditions align related entities so each output row is built from the correct source records. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are join operations. Indexes on join/filter/group keys typically provide the biggest speedup. 
+
+### Common Pitfalls
+
+- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently.
+
diff --git a/easy/1068. Product Sales Analysis I.sql b/easy/1068. Product Sales Analysis I.sql
deleted file mode 100644
index 53207fe..0000000
--- a/easy/1068. Product Sales Analysis I.sql
+++ /dev/null
@@ -1,3 +0,0 @@
-SELECT p.product_name,s.year,s.price
-FROM sales_1068 s
-JOIN product_1068 p ON s.product_id = p.product_id;
diff --git a/easy/1069. Product Sales Analysis II.md b/easy/1069. Product Sales Analysis II.md
new file mode 100644
index 0000000..103ecd9
--- /dev/null
+++ b/easy/1069. Product Sales Analysis II.md
@@ -0,0 +1,72 @@
+# Question 1069: Product Sales Analysis II
+
+**LeetCode URL:** https://leetcode.com/problems/product-sales-analysis-ii/
+
+## Description
+
+Write an SQL query that reports the total quantity sold for every product id. Return the result table in any order. The result format is in the following example.
+ +## Table Schema Structure + +```sql +Create table If Not Exists Sales (sale_id int, product_id int, year int, quantity int, price int); +Create table If Not Exists Product (product_id int, product_name varchar(10)); +``` + +## Sample Input Data + +```sql +insert into Sales (sale_id, product_id, year, quantity, price) values ('1', '100', '2008', '10', '5000'); +insert into Sales (sale_id, product_id, year, quantity, price) values ('2', '100', '2009', '12', '5000'); +insert into Sales (sale_id, product_id, year, quantity, price) values ('7', '200', '2011', '15', '9000'); +insert into Product (product_id, product_name) values ('100', 'Nokia'); +insert into Product (product_id, product_name) values ('200', 'Apple'); +insert into Product (product_id, product_name) values ('300', 'Samsung'); +``` + +## Expected Output Data + +```text ++--------------+----------------+ +| product_id | total_quantity | ++--------------+----------------+ +| 100 | 22 | +| 200 | 15 | ++--------------+----------------+ +``` + +## SQL Solution + +```sql +SELECT product_id,SUM(quantity) AS total_quantity +FROM sales_1068 +GROUP BY product_id; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `product_id`, `total_quantity` from `sales`. + +### Result Grain + +One row per unique key in `GROUP BY product_id`. + +### Step-by-Step Logic + +1. Aggregate rows with SUM grouped by product_id. +2. Project final output columns: `product_id`, `total_quantity`. + +### Why This Works + +Grouping keys define the reporting grain; aggregate functions then summarize values at exactly that grain. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are sorting/grouping. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Every non-aggregated selected column must belong to the grouping grain. + diff --git a/easy/1069. 
Product Sales Analysis II.sql b/easy/1069. Product Sales Analysis II.sql deleted file mode 100644 index 4306b8c..0000000 --- a/easy/1069. Product Sales Analysis II.sql +++ /dev/null @@ -1,3 +0,0 @@ -SELECT product_id,SUM(quantity) AS total_quantity -FROM sales_1068 -GROUP BY product_id; diff --git a/easy/1075. Project Employees I.md b/easy/1075. Project Employees I.md new file mode 100644 index 0000000..c4786e9 --- /dev/null +++ b/easy/1075. Project Employees I.md @@ -0,0 +1,80 @@ +# Question 1075: Project Employees I + +**LeetCode URL:** https://leetcode.com/problems/project-employees-i/ + +## Description + +Return the result table in any order. + +## Table Schema Structure + +```sql +Create table If Not Exists Project (project_id int, employee_id int); +Create table If Not Exists Employee (employee_id int, name varchar(10), experience_years int); +``` + +## Sample Input Data + +```sql +insert into Project (project_id, employee_id) values ('1', '1'); +insert into Project (project_id, employee_id) values ('1', '2'); +insert into Project (project_id, employee_id) values ('1', '3'); +insert into Project (project_id, employee_id) values ('2', '1'); +insert into Project (project_id, employee_id) values ('2', '4'); +insert into Employee (employee_id, name, experience_years) values ('1', 'Khaled', '3'); +insert into Employee (employee_id, name, experience_years) values ('2', 'Ali', '2'); +insert into Employee (employee_id, name, experience_years) values ('3', 'John', '1'); +insert into Employee (employee_id, name, experience_years) values ('4', 'Doe', '2'); +``` + +## Expected Output Data + +```text ++-------------+---------------+ +| project_id | average_years | ++-------------+---------------+ +| 1 | 2.00 | +| 2 | 2.50 | ++-------------+---------------+ +``` + +## SQL Solution + +```sql +SELECT p.project_id,ROUND(AVG(e.experience_years),2) +FROM project_1075 p +JOIN employee_1075 e ON p.employee_id=e.employee_id +GROUP BY p.project_id +ORDER BY 1; +``` + +## Solution 
Breakdown
+
+### Goal
+
+The query builds the final result columns `project_id`, `average_years` from `project`, `employee`.
+
+### Result Grain
+
+One row per unique key in `GROUP BY p.project_id`.
+
+### Step-by-Step Logic
+
+1. Combine datasets using JOIN. Join predicates control row matching and prevent accidental cartesian growth.
+2. Aggregate rows with AVG, ROUND grouped by p.project_id.
+3. Project final output columns: `project_id`, `average_years`.
+4. Order output deterministically with `ORDER BY 1`.
+
+### Why This Works
+
+Join conditions align related entities so each output row is built from the correct source records. Grouping keys define the reporting grain; aggregate functions then summarize values at exactly that grain. The final projection exposes only the columns required by the result contract.
+
+### Performance Notes
+
+Primary cost drivers are sorting/grouping, join operations. Indexes on join/filter/group keys typically provide the biggest speedup.
+
+### Common Pitfalls
+
+- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently.
+- Every non-aggregated selected column must belong to the grouping grain.
+
diff --git a/easy/1075. Project Employees I.sql b/easy/1075. Project Employees I.sql
deleted file mode 100644
index e5536bc..0000000
--- a/easy/1075. Project Employees I.sql
+++ /dev/null
@@ -1,5 +0,0 @@
-SELECT p.project_id,ROUND(AVG(e.experience_years),2)
-FROM project_1075 p
-JOIN employee_1075 e ON p.employee_id=e.employee_id
-GROUP BY p.project_id
-ORDER BY 1;
diff --git a/easy/1076. Project Employees II.md b/easy/1076. Project Employees II.md
new file mode 100644
index 0000000..b3a918c
--- /dev/null
+++ b/easy/1076. 
Project Employees II.md
@@ -0,0 +1,77 @@
+# Question 1076: Project Employees II
+
+**LeetCode URL:** https://leetcode.com/problems/project-employees-ii/
+
+## Description
+
+Write an SQL query that reports all the projects that have the most employees. Return the result table in any order. In the following example, the first project has 3 employees while the second one has 2, so only the first project is reported.
+
+## Table Schema Structure
+
+```sql
+Create table If Not Exists Project (project_id int, employee_id int);
+Create table If Not Exists Employee (employee_id int, name varchar(10), experience_years int);
+```
+
+## Sample Input Data
+
+```sql
+insert into Project (project_id, employee_id) values ('1', '1');
+insert into Project (project_id, employee_id) values ('1', '2');
+insert into Project (project_id, employee_id) values ('1', '3');
+insert into Project (project_id, employee_id) values ('2', '1');
+insert into Project (project_id, employee_id) values ('2', '4');
+insert into Employee (employee_id, name, experience_years) values ('1', 'Khaled', '3');
+insert into Employee (employee_id, name, experience_years) values ('2', 'Ali', '2');
+insert into Employee (employee_id, name, experience_years) values ('3', 'John', '1');
+insert into Employee (employee_id, name, experience_years) values ('4', 'Doe', '2');
+```
+
+## Expected Output Data
+
+```text
++-------------+
+| project_id |
++-------------+
+| 1 |
++-------------+
+```
+
+## SQL Solution
+
+```sql
+SELECT project_id
+FROM project_1075
+GROUP BY project_id
+ORDER BY COUNT(employee_id) DESC
+LIMIT 1;
+```
+
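This query can be sanity-checked outside PostgreSQL with an in-memory SQLite database (a hypothetical check, not part of the repository):

```python
import sqlite3

# Load the sample Project rows and confirm the query returns the
# project with the most employees (project 1 has 3, project 2 has 2).
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE project_1075 (project_id int, employee_id int)")
cur.executemany("INSERT INTO project_1075 VALUES (?, ?)",
                [(1, 1), (1, 2), (1, 3), (2, 1), (2, 4)])
rows = cur.execute(
    "SELECT project_id FROM project_1075 "
    "GROUP BY project_id ORDER BY COUNT(employee_id) DESC LIMIT 1"
).fetchall()
print(rows)  # [(1,)]
```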
+## Solution Breakdown
+
+### Goal
+
+The query builds the final result columns `project_id` from `project`.
+
+### Result Grain
+
+One row per unique key in `GROUP BY project_id`.
+
+### Step-by-Step Logic
+
+1. Aggregate rows with COUNT grouped by project_id.
+2. Project final output columns: `project_id`.
+3. Order output by employee count with `ORDER BY COUNT(employee_id) DESC` and keep the top row with `LIMIT 1`.
+
+### Why This Works
+
+Grouping keys define the reporting grain; aggregate functions then summarize values at exactly that grain. The final projection exposes only the columns required by the result contract.
+
+### Performance Notes
+
+Primary cost drivers are sorting/grouping. Indexes on join/filter/group keys typically provide the biggest speedup.
+
+### Common Pitfalls
+
+- Every non-aggregated selected column must belong to the grouping grain.
+- `LIMIT 1` reports a single project; if several projects tie for the most employees, only one of them is returned.
+
diff --git a/easy/1076. Project Employees II.sql b/easy/1076. Project Employees II.sql
deleted file mode 100644
index 6fb0174..0000000
--- a/easy/1076. Project Employees II.sql
+++ /dev/null
@@ -1,5 +0,0 @@
-SELECT project_id
-FROM project_1075
-GROUP BY project_id
-ORDER BY COUNT(employee_id) DESC
-LIMIT 1;
diff --git a/easy/1082. Sales Analysis I.md b/easy/1082. Sales Analysis I.md
new file mode 100644
index 0000000..07de797
--- /dev/null
+++ b/easy/1082. 
Sales Analysis I.md
@@ -0,0 +1,82 @@
+# Question 1082: Sales Analysis I
+
+**LeetCode URL:** https://leetcode.com/problems/sales-analysis-i/
+
+## Description
+
+Write an SQL query that reports the best seller by total sales price. If there is a tie, report them all. Return the result table in any order. In the following example, both sellers with id 1 and 3 sold products with the most total price of 2800.
+
+## Table Schema Structure
+
+```sql
+Create table If Not Exists Product (product_id int, product_name varchar(10), unit_price int);
+Create table If Not Exists Sales (seller_id int, product_id int, buyer_id int, sale_date date, quantity int, price int);
+```
+
+## Sample Input Data
+
+```sql
+insert into Product (product_id, product_name, unit_price) values ('1', 'S8', '1000');
+insert into Product (product_id, product_name, unit_price) values ('2', 'G4', '800');
+insert into Product (product_id, product_name, unit_price) values ('3', 'iPhone', '1400');
+insert into Sales (seller_id, product_id, buyer_id, sale_date, quantity, price) values ('1', '1', '1', '2019-01-21', '2', '2000');
+insert into Sales (seller_id, product_id, buyer_id, sale_date, quantity, price) values ('1', '2', '2', '2019-02-17', '1', '800');
+insert into Sales (seller_id, product_id, buyer_id, sale_date, quantity, price) values ('2', '2', '3', '2019-06-02', '1', '800');
+insert into Sales (seller_id, product_id, buyer_id, sale_date, quantity, price) values ('3', '3', '4', '2019-05-13', '2', '2800');
+```
+
+## Expected Output Data
+
+```text
++-------------+
+| seller_id |
++-------------+
+| 1 |
+| 3 |
++-------------+
+```
+
+## SQL Solution
+
+```sql
+SELECT seller_id
+FROM sales_1082
+GROUP BY seller_id
+HAVING SUM(price) IN (
+    SELECT SUM(price) AS m_sum
+    FROM sales_1082
+    GROUP BY seller_id
+    ORDER BY m_sum DESC
+    LIMIT 1
+  );
+```
+
+## Solution Breakdown
+
+### Goal
+
+The query builds the final result columns `seller_id` from `sales`.
+
+### Result Grain
+
+One row per unique key in `GROUP BY seller_id`.
+
+### Step-by-Step Logic
+
+1. Aggregate rows with SUM grouped by seller_id.
+2. Project final output columns: `seller_id`.
+3. Filter aggregated groups in `HAVING`: keep the sellers whose `SUM(price)` matches the top per-seller total.
+4. The subquery computes each seller's total, sorts it with `ORDER BY m_sum DESC`, and `LIMIT 1` keeps only the highest total for the `IN` comparison.
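The `HAVING ... IN (... LIMIT 1)` pattern can be exercised with an in-memory SQLite database (a hypothetical check, not part of the repository; SQLite, unlike MySQL, accepts a `LIMIT` inside an `IN` subquery, so the query runs unchanged):

```python
import sqlite3

# Replay the 1082 query: seller totals are 1 -> 2800, 2 -> 800, 3 -> 2800,
# so sellers 1 and 3 tie for the maximum and both survive the HAVING filter.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE sales_1082 (seller_id int, product_id int, buyer_id int, "
            "sale_date date, quantity int, price int)")
cur.executemany("INSERT INTO sales_1082 VALUES (?, ?, ?, ?, ?, ?)", [
    (1, 1, 1, "2019-01-21", 2, 2000),
    (1, 2, 2, "2019-02-17", 1, 800),
    (2, 2, 3, "2019-06-02", 1, 800),
    (3, 3, 4, "2019-05-13", 2, 2800),
])
rows = cur.execute("""
    SELECT seller_id FROM sales_1082
    GROUP BY seller_id
    HAVING SUM(price) IN (SELECT SUM(price) AS m_sum FROM sales_1082
                          GROUP BY seller_id ORDER BY m_sum DESC LIMIT 1)
""").fetchall()
# Sort for a stable comparison, since the outer query has no ORDER BY.
print(sorted(rows))  # [(1,), (3,)]
```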
+ +### Why This Works + +Grouping keys define the reporting grain; aggregate functions then summarize values at exactly that grain. `HAVING` ensures only groups that satisfy business thresholds survive. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are sorting/grouping, subquery execution. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Every non-aggregated selected column must belong to the grouping grain. + diff --git a/easy/1082. Sales Analysis I.sql b/easy/1082. Sales Analysis I.sql deleted file mode 100644 index 930534f..0000000 --- a/easy/1082. Sales Analysis I.sql +++ /dev/null @@ -1,10 +0,0 @@ -SELECT seller_id -FROM sales_1082 -GROUP BY seller_id -HAVING SUM(price) IN ( - SELECT SUM(price) AS m_sum - FROM sales_1082 - GROUP BY seller_id - ORDER BY m_sum DESC - LIMIT 1 - ); diff --git a/easy/1083. Sales Analysis II.md b/easy/1083. Sales Analysis II.md new file mode 100644 index 0000000..3a2e6b5 --- /dev/null +++ b/easy/1083. 
Sales Analysis II.md
@@ -0,0 +1,80 @@
+# Question 1083: Sales Analysis II
+
+**LeetCode URL:** https://leetcode.com/problems/sales-analysis-ii/
+
+## Description
+
+Write an SQL query that reports the buyers who have bought an S8 but not an iPhone (these are products, not brands). Return the result table in any order. In the following example, the buyer with id 1 bought an S8 but didn't buy an iPhone.
+ +## Table Schema Structure + +```sql +Create table If Not Exists Product (product_id int, product_name varchar(10), unit_price int); +Create table If Not Exists Sales (seller_id int, product_id int, buyer_id int, sale_date date, quantity int, price int); +``` + +## Sample Input Data + +```sql +insert into Product (product_id, product_name, unit_price) values ('1', 'S8', '1000'); +insert into Product (product_id, product_name, unit_price) values ('2', 'G4', '800'); +insert into Product (product_id, product_name, unit_price) values ('3', 'iPhone', '1400'); +insert into Sales (seller_id, product_id, buyer_id, sale_date, quantity, price) values ('1', '1', '1', '2019-01-21', '2', '2000'); +insert into Sales (seller_id, product_id, buyer_id, sale_date, quantity, price) values ('1', '2', '2', '2019-02-17', '1', '800'); +insert into Sales (seller_id, product_id, buyer_id, sale_date, quantity, price) values ('2', '1', '3', '2019-06-02', '1', '800'); +insert into Sales (seller_id, product_id, buyer_id, sale_date, quantity, price) values ('3', '3', '3', '2019-05-13', '2', '2800'); +``` + +## Expected Output Data + +```text ++-------------+ +| buyer_id | ++-------------+ +| 1 | ++-------------+ +``` + +## SQL Solution + +```sql +SELECT DISTINCT s.buyer_id +FROM sales_1082 s +LEFT JOIN product_1082 p ON s.product_id = p.product_id +WHERE p.product_name = 'S8' AND + s.buyer_id NOT IN (SELECT s.buyer_id + FROM sales_1082 s LEFT JOIN product_1082 p ON + s.product_id = p.product_id + WHERE p.product_name = 'iPhone'); +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `buyer_id` from `sales`, `product`. + +### Result Grain + +One row per distinct combination of projected columns. + +### Step-by-Step Logic + +1. Combine datasets using LEFT JOIN, JOIN. Join predicates control row matching and prevent accidental cartesian growth. +2. 
Apply row-level filtering in `WHERE`: p.product_name = 'S8' AND s.buyer_id NOT IN (SELECT s.buyer_id FROM sales_1082 s LEFT JOIN product_1082 p ON s.product_id = p.product_id WHERE p.product_name = 'iPhone'). +3. Project final output columns: `buyer_id`. +4. Remove duplicate result tuples using `DISTINCT` where uniqueness is required. + +### Why This Works + +Join conditions align related entities so each output row is built from the correct source records. Predicate filtering removes irrelevant rows before expensive downstream computation. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are join operations, subquery execution. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently. +- If nullable columns are involved, validate null-handling explicitly (`IS NULL`, `COALESCE`). + diff --git a/easy/1083. Sales Analysis II.sql b/easy/1083. Sales Analysis II.sql deleted file mode 100644 index 93623b2..0000000 --- a/easy/1083. Sales Analysis II.sql +++ /dev/null @@ -1,8 +0,0 @@ -SELECT DISTINCT s.buyer_id -FROM sales_1082 s -LEFT JOIN product_1082 p ON s.product_id = p.product_id -WHERE p.product_name = 'S8' AND - s.buyer_id NOT IN (SELECT s.buyer_id - FROM sales_1082 s LEFT JOIN product_1082 p ON - s.product_id = p.product_id - WHERE p.product_name = 'iPhone'); diff --git a/easy/1084. Sales Analysis III.md b/easy/1084. Sales Analysis III.md new file mode 100644 index 0000000..ea8c2c6 --- /dev/null +++ b/easy/1084. Sales Analysis III.md @@ -0,0 +1,77 @@ +# Question 1084: Sales Analysis III + +**LeetCode URL:** https://leetcode.com/problems/sales-analysis-iii/ + +## Description + +Write a solution to report the products that were only sold in the first quarter of 2019. Return the result table in any order. The result format is in the following example. 
+ +## Table Schema Structure + +```sql +Create table If Not Exists Product (product_id int, product_name varchar(10), unit_price int); +Create table If Not Exists Sales (seller_id int, product_id int, buyer_id int, sale_date date, quantity int, price int); +``` + +## Sample Input Data + +```sql +insert into Product (product_id, product_name, unit_price) values ('1', 'S8', '1000'); +insert into Product (product_id, product_name, unit_price) values ('2', 'G4', '800'); +insert into Product (product_id, product_name, unit_price) values ('3', 'iPhone', '1400'); +insert into Sales (seller_id, product_id, buyer_id, sale_date, quantity, price) values ('1', '1', '1', '2019-01-21', '2', '2000'); +insert into Sales (seller_id, product_id, buyer_id, sale_date, quantity, price) values ('1', '2', '2', '2019-02-17', '1', '800'); +insert into Sales (seller_id, product_id, buyer_id, sale_date, quantity, price) values ('2', '2', '3', '2019-06-02', '1', '800'); +insert into Sales (seller_id, product_id, buyer_id, sale_date, quantity, price) values ('3', '3', '4', '2019-05-13', '2', '2800'); +``` + +## Expected Output Data + +```text ++-------------+--------------+ +| product_id | product_name | ++-------------+--------------+ +| 1 | S8 | ++-------------+--------------+ +``` + +## SQL Solution + +```sql +SELECT DISTINCT p.product_id,p.product_name +FROM product_1082 p +INNER JOIN sales_1082 s ON p.product_id = s.product_id AND (s.sale_date BETWEEN '2019-01-01' AND '2019-03-31') +EXCEPT +SELECT DISTINCT p.product_id,p.product_name +FROM product_1082 p +INNER JOIN sales_1082 s ON p.product_id = s.product_id AND (s.sale_date < '2019-01-01' OR s.sale_date > '2019-03-31'); +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `product_id`, `product_name` from `product`, `sales`. + +### Result Grain + +One row per distinct combination of projected columns. + +### Step-by-Step Logic + +1. Combine datasets using INNER JOIN, JOIN. 
Join predicates control row matching and prevent accidental cartesian growth.
+2. Project final output columns: `product_id`, `product_name`.
+3. Remove duplicate result tuples using `DISTINCT` where uniqueness is required.
+
+### Why This Works
+
+Join conditions align related entities so each output row is built from the correct source records. The final projection exposes only the columns required by the result contract.
+
+### Performance Notes
+
+Primary cost drivers are join operations. Indexes on join/filter/group keys typically provide the biggest speedup.
+
+### Common Pitfalls
+
+- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently.
+
diff --git a/easy/1084. Sales Analysis III.sql b/easy/1084. Sales Analysis III.sql
deleted file mode 100644
index eac5b79..0000000
--- a/easy/1084. Sales Analysis III.sql
+++ /dev/null
@@ -1,7 +0,0 @@
-SELECT DISTINCT p.product_id,p.product_name
-FROM product_1082 p
-INNER JOIN sales_1082 s ON p.product_id = s.product_id AND (s.sale_date BETWEEN '2019-01-01' AND '2019-03-31')
-EXCEPT
-SELECT DISTINCT p.product_id,p.product_name
-FROM product_1082 p
-INNER JOIN sales_1082 s ON p.product_id = s.product_id AND (s.sale_date < '2019-01-01' OR s.sale_date > '2019-03-31');
diff --git a/easy/1113.Reported Posts.md b/easy/1113.Reported Posts.md
new file mode 100644
index 0000000..6193316
--- /dev/null
+++ b/easy/1113.Reported Posts.md
@@ -0,0 +1,80 @@
+# Question 1113: Reported Posts
+
+**LeetCode URL:** https://leetcode.com/problems/reported-posts/
+
+## Description
+
+Write an SQL query that reports the number of posts reported yesterday for each report reason. Assume today is 2019-07-05. Note that we only care about report reasons with a non-zero number of reports. The result format is in the following example.
+
+## Table Schema Structure
+
+```sql
+Create table If Not Exists Actions (user_id int, post_id int, action_date date, action ENUM('view', 'like', 'reaction', 'comment', 'report', 'share'), extra varchar(10));
+```
+
+## Sample Input Data
+
+```sql
+insert into Actions (user_id, post_id, action_date, action, extra) values ('1', '1', '2019-07-01', 'view', NULL);
+insert into Actions (user_id, post_id, action_date, action, extra) values ('1', '1', '2019-07-01', 'like', NULL);
+insert into Actions (user_id, post_id, action_date, action, extra) values ('1', '1', '2019-07-01', 'share', NULL);
+insert into Actions (user_id, post_id, action_date, action, extra) values ('2', '4', '2019-07-04', 'view', NULL);
+insert into Actions (user_id, post_id, action_date, action, extra) values ('2', '4', '2019-07-04', 'report', 'spam');
+insert into Actions (user_id, post_id, action_date, action, extra) values ('3', '4', '2019-07-04', 'view', NULL);
+insert into Actions (user_id, post_id, action_date, action, extra) values ('3', '4', '2019-07-04', 'report', 'spam');
+insert into Actions (user_id, post_id, action_date, action, extra) values ('4', '3', '2019-07-02', 'view', NULL);
+insert into Actions (user_id, post_id, action_date, action, extra) values ('4', '3', '2019-07-02', 'report', 'spam');
+insert into Actions (user_id, post_id, action_date, action, extra) values ('5', '2', '2019-07-04', 'view', 
NULL); +insert into Actions (user_id, post_id, action_date, action, extra) values ('5', '2', '2019-07-04', 'report', 'racism'); +insert into Actions (user_id, post_id, action_date, action, extra) values ('5', '5', '2019-07-04', 'view', NULL); +insert into Actions (user_id, post_id, action_date, action, extra) values ('5', '5', '2019-07-04', 'report', 'racism'); +``` + +## Expected Output Data + +```text ++---------------+--------------+ +| report_reason | report_count | ++---------------+--------------+ +| spam | 1 | +| racism | 2 | ++---------------+--------------+ +``` + +## SQL Solution + +```sql +SELECT extra AS report_reason,COUNT(DISTINCT post_id) AS report_count +FROM actions_1113 +WHERE extra IS NOT NULL AND action_date = DATE '2019-07-05'-1 +GROUP BY extra; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `report_reason`, `report_count` from `actions`. + +### Result Grain + +One row per unique key in `GROUP BY extra`. + +### Step-by-Step Logic + +1. Apply row-level filtering in `WHERE`: extra IS NOT NULL AND action_date = DATE '2019-07-05'-1 (reports made on 2019-07-04, i.e. yesterday relative to 2019-07-05). +2. Aggregate rows with COUNT grouped by extra. +3. Project final output columns: `report_reason`, `report_count`. + +### Why This Works + +Grouping keys define the reporting grain; aggregate functions then summarize values at exactly that grain. Predicate filtering removes irrelevant rows before expensive downstream computation. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are sorting/grouping. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Every non-aggregated selected column must belong to the grouping grain. 
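As a cross-check on the grouping logic above, the following Python sketch (illustrative only, not part of the repository) replays the sample `Actions` rows and reproduces the expected distinct-post counts per report reason for 2019-07-04:

```python
from datetime import date

# Sample rows from the Actions table: (user_id, post_id, action_date, action, extra)
actions = [
    (1, 1, date(2019, 7, 1), "view", None),
    (1, 1, date(2019, 7, 1), "like", None),
    (1, 1, date(2019, 7, 1), "share", None),
    (2, 4, date(2019, 7, 4), "view", None),
    (2, 4, date(2019, 7, 4), "report", "spam"),
    (3, 4, date(2019, 7, 4), "view", None),
    (3, 4, date(2019, 7, 4), "report", "spam"),
    (4, 3, date(2019, 7, 2), "view", None),
    (4, 3, date(2019, 7, 2), "report", "spam"),
    (5, 2, date(2019, 7, 4), "view", None),
    (5, 2, date(2019, 7, 4), "report", "racism"),
    (5, 5, date(2019, 7, 4), "view", None),
    (5, 5, date(2019, 7, 4), "report", "racism"),
]

# Mirror WHERE extra IS NOT NULL AND action_date = DATE '2019-07-05'-1,
# then COUNT(DISTINCT post_id) ... GROUP BY extra.
yesterday = date(2019, 7, 4)
posts_by_reason = {}
for _, post_id, action_date, _, extra in actions:
    if extra is not None and action_date == yesterday:
        posts_by_reason.setdefault(extra, set()).add(post_id)

report_counts = {reason: len(posts) for reason, posts in posts_by_reason.items()}
print(report_counts)  # {'spam': 1, 'racism': 2}
```

Post 3 was also reported as spam, but on 2019-07-02, outside the one-day window, which is why spam counts a single post here.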
+ diff --git a/easy/1113.Reported Posts.sql b/easy/1113.Reported Posts.sql deleted file mode 100644 index 6d69af9..0000000 --- a/easy/1113.Reported Posts.sql +++ /dev/null @@ -1,4 +0,0 @@ -SELECT extra,COUNT(DISTINCT post_id) -FROM actions_1113 -WHERE extra IS NOT NULL AND action_date = DATE '2019-07-05'-1 -GROUP BY extra; diff --git a/easy/1141. User Activity for the Past 30 Days I.md b/easy/1141. User Activity for the Past 30 Days I.md new file mode 100644 index 0000000..6340018 --- /dev/null +++ b/easy/1141. User Activity for the Past 30 Days I.md @@ -0,0 +1,79 @@ +# Question 1141: User Activity for the Past 30 Days I + +**LeetCode URL:** https://leetcode.com/problems/user-activity-for-the-past-30-days-i/ + +## Description + +Write a solution to find the daily active user count for a period of 30 days ending 2019-07-27 inclusively. Return the result table in any order. The result format is in the following example. + +## Table Schema Structure + +```sql +Create table If Not Exists Activity (user_id int, session_id int, activity_date date, activity_type ENUM('open_session', 'end_session', 'scroll_down', 'send_message')); +``` + +## Sample Input Data + +```sql +insert into Activity (user_id, session_id, activity_date, activity_type) values ('1', '1', '2019-07-20', 'open_session'); +insert into Activity (user_id, session_id, activity_date, activity_type) values ('1', '1', '2019-07-20', 'scroll_down'); +insert into Activity (user_id, session_id, activity_date, activity_type) values ('1', '1', '2019-07-20', 'end_session'); +insert into Activity (user_id, session_id, activity_date, activity_type) values ('2', '4', '2019-07-20', 'open_session'); +insert into Activity (user_id, session_id, activity_date, activity_type) values ('2', '4', '2019-07-21', 'send_message'); +insert into Activity (user_id, session_id, activity_date, activity_type) values ('2', '4', '2019-07-21', 'end_session'); +insert into Activity (user_id, session_id, activity_date, activity_type) values 
('3', '2', '2019-07-21', 'open_session'); +insert into Activity (user_id, session_id, activity_date, activity_type) values ('3', '2', '2019-07-21', 'send_message'); +insert into Activity (user_id, session_id, activity_date, activity_type) values ('3', '2', '2019-07-21', 'end_session'); +insert into Activity (user_id, session_id, activity_date, activity_type) values ('4', '3', '2019-06-25', 'open_session'); +insert into Activity (user_id, session_id, activity_date, activity_type) values ('4', '3', '2019-06-25', 'end_session'); +``` + +## Expected Output Data + +```text ++------------+--------------+ +| day | active_users | ++------------+--------------+ +| 2019-07-20 | 2 | +| 2019-07-21 | 2 | ++------------+--------------+ +``` + +## SQL Solution + +```sql +SELECT activity_date AS day,COUNT(DISTINCT user_id) AS active_users +FROM activity_1141 +WHERE activity_date <= '2019-07-27' AND activity_date > DATE '2019-07-27'-30 +GROUP BY activity_date; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `day`, `active_users` from `activity`. + +### Result Grain + +One row per unique key in `GROUP BY activity_date`. + +### Step-by-Step Logic + +1. Apply row-level filtering in `WHERE`: activity_date <= '2019-07-27' AND activity_date > DATE '2019-07-27'-30. +2. Aggregate rows with COUNT grouped by activity_date. +3. Project final output columns: `day`, `active_users`. + +### Why This Works + +Grouping keys define the reporting grain; aggregate functions then summarize values at exactly that grain. Predicate filtering removes irrelevant rows before expensive downstream computation. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are sorting/grouping. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Every non-aggregated selected column must belong to the grouping grain. +- A lower bound of `>= DATE '2019-07-27'-30` would span 31 days; the strict `>` keeps the window at exactly 30 days ending 2019-07-27. 
+- If nullable columns are involved, validate null-handling explicitly (`IS NULL`, `COALESCE`). + diff --git a/easy/1141. User Activity for the Past 30 Days I.sql b/easy/1141. User Activity for the Past 30 Days I.sql deleted file mode 100644 index cfd441d..0000000 --- a/easy/1141. User Activity for the Past 30 Days I.sql +++ /dev/null @@ -1,4 +0,0 @@ -SELECT activity_date,COUNT(DISTINCT user_id) -FROM activity_1141 -WHERE activity_date <= '2019-07-27' AND activity_date >= DATE '2019-07-27'-30 -GROUP BY activity_date; diff --git a/easy/1142. User Activity for the Past 30 Days II.md b/easy/1142. User Activity for the Past 30 Days II.md new file mode 100644 index 0000000..6fe4b31 --- /dev/null +++ b/easy/1142. User Activity for the Past 30 Days II.md @@ -0,0 +1,85 @@ +# Question 1142: User Activity for the Past 30 Days II + +**LeetCode URL:** https://leetcode.com/problems/user-activity-for-the-past-30-days-ii/ + +## Description + +Write an SQL query to find the average number of sessions per user for a period of 30 days ending 2019-07-27 inclusively, rounded to 2 decimal places. 
The query result format is in the following example. + +## Table Schema Structure + +```sql +Create table If Not Exists Activity (user_id int, session_id int, activity_date date, activity_type ENUM('open_session', 'end_session', 'scroll_down', 'send_message')); +``` + +## Sample Input Data + +```sql +insert into Activity (user_id, session_id, activity_date, activity_type) values ('1', '1', '2019-07-20', 'open_session'); +insert into Activity (user_id, session_id, activity_date, activity_type) values ('1', '1', '2019-07-20', 'scroll_down'); +insert into Activity (user_id, session_id, activity_date, activity_type) values ('1', '1', '2019-07-20', 'end_session'); +insert into Activity (user_id, session_id, activity_date, activity_type) values ('2', '4', '2019-07-20', 'open_session'); +insert into Activity (user_id, session_id, activity_date, activity_type) values ('2', '4', '2019-07-21', 'send_message'); +insert into Activity (user_id, session_id, activity_date, activity_type) values ('2', '4', '2019-07-21', 'end_session'); +insert into Activity (user_id, session_id, activity_date, activity_type) values 
('3', '2', '2019-07-21', 'open_session'); +insert into Activity (user_id, session_id, activity_date, activity_type) values ('3', '2', '2019-07-21', 'send_message'); +insert into Activity (user_id, session_id, activity_date, activity_type) values ('3', '2', '2019-07-21', 'end_session'); +insert into Activity (user_id, session_id, activity_date, activity_type) values ('3', '5', '2019-07-21', 'open_session'); +insert into Activity (user_id, session_id, activity_date, activity_type) values ('3', '5', '2019-07-21', 'scroll_down'); +insert into Activity (user_id, session_id, activity_date, activity_type) values ('3', '5', '2019-07-21', 'end_session'); +insert into Activity (user_id, session_id, activity_date, activity_type) values ('4', '3', '2019-06-25', 'open_session'); +insert into Activity (user_id, session_id, activity_date, activity_type) values ('4', '3', '2019-06-25', 'end_session'); +``` + +## Expected Output Data + +```text ++---------------------------+ +| average_sessions_per_user | ++---------------------------+ +| 1.33 | ++---------------------------+ +``` + +## SQL Solution + +```sql +WITH cnt AS( + SELECT COUNT(DISTINCT session_id) AS c + FROM activity_1142 + WHERE activity_date <= '2019-07-27' AND activity_date > '2019-07-27'::DATE - 30 + GROUP BY user_id +) + +SELECT ROUND(AVG(c),2) AS average_sessions_per_user +FROM cnt; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `average_sessions_per_user` from `activity`, `cnt`. + +### Result Grain + +A single summary row: the average of the per-user distinct-session counts. + +### Step-by-Step Logic + +1. Create CTE layers (`cnt`) to decompose the logic into smaller, testable steps before the final SELECT. +2. CTE `cnt`: counts `COUNT(DISTINCT session_id)` per `user_id` within the 30-day window ending 2019-07-27 (the strict `>` lower bound keeps the window at exactly 30 days). +3. Project final output columns: `average_sessions_per_user`. + +### Why This Works + +CTEs separate transformation stages, which makes dependencies explicit and easier to validate. The final projection exposes only the columns required by the result contract. 
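The two-stage CTE logic can be sanity-checked outside the database; this Python sketch (illustrative only, not part of the repository) mirrors `cnt` and the final `ROUND(AVG(c),2)` on the sample data:

```python
from datetime import date, timedelta

# Sample rows from the Activity table: (user_id, session_id, activity_date)
activity = [
    (1, 1, date(2019, 7, 20)), (1, 1, date(2019, 7, 20)), (1, 1, date(2019, 7, 20)),
    (2, 4, date(2019, 7, 20)), (2, 4, date(2019, 7, 21)), (2, 4, date(2019, 7, 21)),
    (3, 2, date(2019, 7, 21)), (3, 2, date(2019, 7, 21)), (3, 2, date(2019, 7, 21)),
    (3, 5, date(2019, 7, 21)), (3, 5, date(2019, 7, 21)), (3, 5, date(2019, 7, 21)),
    (4, 3, date(2019, 6, 25)), (4, 3, date(2019, 6, 25)),
]

end = date(2019, 7, 27)
lower = end - timedelta(days=30)  # exclusive bound, matching activity_date > '2019-07-27'::DATE - 30

# CTE cnt: COUNT(DISTINCT session_id) per user inside the window.
sessions_per_user = {}
for user_id, session_id, activity_date in activity:
    if lower < activity_date <= end:
        sessions_per_user.setdefault(user_id, set()).add(session_id)

# Final SELECT: ROUND(AVG(c), 2) over the per-user counts.
counts = [len(sessions) for sessions in sessions_per_user.values()]
average_sessions_per_user = round(sum(counts) / len(counts), 2)
print(average_sessions_per_user)  # 1.33
```

User 4's session falls before the window, so three users with 1, 1, and 2 sessions give 4/3, which rounds to 1.33.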
+ +### Performance Notes + +Primary cost is scanning source rows once; performance usually scales with input size. + +### Common Pitfalls + +- Check edge cases: empty input, single-row input, and duplicate keys. + diff --git a/easy/1142. User Activity for the Past 30 Days II.sql b/easy/1142. User Activity for the Past 30 Days II.sql deleted file mode 100644 index dfeb47d..0000000 --- a/easy/1142. User Activity for the Past 30 Days II.sql +++ /dev/null @@ -1,9 +0,0 @@ -WITH cnt AS( - SELECT COUNT(DISTINCT session_id) AS c - FROM activity_1142 - WHERE activity_date <= '2019-07-27' AND activity_date > '2019-07-27'::DATE - 30 - GROUP BY user_id -) - -SELECT ROUND(AVG(c),2) AS average_sessions_per_user -FROM cnt; diff --git a/easy/1148. Article Views I.md b/easy/1148. Article Views I.md new file mode 100644 index 0000000..fb98511 --- /dev/null +++ b/easy/1148. Article Views I.md @@ -0,0 +1,73 @@ +# Question 1148: Article Views I + +**LeetCode URL:** https://leetcode.com/problems/article-views-i/ + +## Description + +Write an SQL query to find all the authors that viewed at least one of their own articles, sorted in ascending order by their id. The query result format is in the following example. 
+ +## Table Schema Structure + +```sql +Create table If Not Exists Views (article_id int, author_id int, viewer_id int, view_date date); +``` + +## Sample Input Data + +```sql +insert into Views (article_id, author_id, viewer_id, view_date) values ('1', '3', '5', '2019-08-01'); +insert into Views (article_id, author_id, viewer_id, view_date) values ('1', '3', '6', '2019-08-02'); +insert into Views (article_id, author_id, viewer_id, view_date) values ('2', '7', '7', '2019-08-01'); +insert into Views (article_id, author_id, viewer_id, view_date) values ('2', '7', '6', '2019-08-02'); +insert into Views (article_id, author_id, viewer_id, view_date) values ('4', '7', '1', '2019-07-22'); +insert into Views (article_id, author_id, viewer_id, view_date) values ('3', '4', '4', '2019-07-21'); +insert into Views (article_id, author_id, viewer_id, view_date) values ('3', '4', '4', '2019-07-21'); +``` + +## Expected Output Data + +```text ++------+ +| id | ++------+ +| 4 | +| 7 | ++------+ +``` + +## SQL Solution + +```sql +SELECT DISTINCT author_id +FROM views_1148 +WHERE viewer_id = author_id; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `author_id` from `views`. + +### Result Grain + +One row per distinct combination of projected columns. + +### Step-by-Step Logic + +1. Apply row-level filtering in `WHERE`: viewer_id = author_id. +2. Project final output columns: `author_id`. +3. Remove duplicate result tuples using `DISTINCT` where uniqueness is required. + +### Why This Works + +Predicate filtering removes irrelevant rows before expensive downstream computation. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost is scanning source rows once; performance usually scales with input size. + +### Common Pitfalls + +- If nullable columns are involved, validate null-handling explicitly (`IS NULL`, `COALESCE`). + diff --git a/easy/1148. Article Views I.sql b/easy/1148. 
Article Views I.sql deleted file mode 100644 index f7e60a9..0000000 --- a/easy/1148. Article Views I.sql +++ /dev/null @@ -1,3 +0,0 @@ -SELECT DISTINCT author_id -FROM views_1148 -WHERE viewer_id = author_id; diff --git a/easy/1173. Immediate Food Delivery I.md b/easy/1173. Immediate Food Delivery I.md new file mode 100644 index 0000000..06e90ee --- /dev/null +++ b/easy/1173. Immediate Food Delivery I.md @@ -0,0 +1,69 @@ +# Question 1173: Immediate Food Delivery I + +**LeetCode URL:** https://leetcode.com/problems/immediate-food-delivery-i/ + +## Description + +Write an SQL query to find the percentage of immediate orders in the table, rounded to 2 decimal places. The query result format is in the following example. 
+ +## Table Schema Structure + +```sql +Create table If Not Exists Delivery (delivery_id int, customer_id int, order_date date, customer_pref_delivery_date date); +``` + +## Sample Input Data + +```sql +insert into Delivery (delivery_id, customer_id, order_date, customer_pref_delivery_date) values ('1', '1', '2019-08-01', '2019-08-02'); +insert into Delivery (delivery_id, customer_id, order_date, customer_pref_delivery_date) values ('2', '5', '2019-08-02', '2019-08-02'); +insert into Delivery (delivery_id, customer_id, order_date, customer_pref_delivery_date) values ('3', '1', '2019-08-11', '2019-08-11'); +insert into Delivery (delivery_id, customer_id, order_date, customer_pref_delivery_date) values ('4', '3', '2019-08-24', '2019-08-26'); +insert into Delivery (delivery_id, customer_id, order_date, customer_pref_delivery_date) values ('5', '4', '2019-08-21', '2019-08-22'); +insert into Delivery (delivery_id, customer_id, order_date, customer_pref_delivery_date) values ('6', '2', '2019-08-11', '2019-08-13'); +``` + +## Expected Output Data + +```text ++----------------------+ +| immediate_percentage | ++----------------------+ +| 33.33 | ++----------------------+ +``` + +## SQL Solution + +```sql +SELECT ROUND((COUNT(CASE WHEN order_date = customer_pref_delivery_date THEN 1 ELSE NULL END)::NUMERIC/COUNT(*))*100,2) + AS immediate_percentage +FROM delivery_1173; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `immediate_percentage` from `delivery`. + +### Result Grain + +A single summary row over all deliveries. + +### Step-by-Step Logic + +1. Count immediate orders with the conditional `CASE` inside `COUNT`; the `CASE` yields NULL for non-immediate orders, and `COUNT` skips NULLs. +2. Divide by `COUNT(*)`, scale to a percentage, and `ROUND` to two decimal places as `immediate_percentage`. + +### Why This Works + +Because `COUNT` ignores NULL, the conditional `CASE` counts only immediate orders; dividing by `COUNT(*)` yields the requested percentage. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost is scanning source rows once; performance usually scales with input size. + +### Common Pitfalls + +- Check edge cases: empty input, single-row input, and duplicate keys. 
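The conditional-count arithmetic can be replayed in a short Python sketch (illustrative only, not part of the repository) over the sample deliveries:

```python
# Sample rows from the Delivery table: (order_date, customer_pref_delivery_date)
deliveries = [
    ("2019-08-01", "2019-08-02"),
    ("2019-08-02", "2019-08-02"),
    ("2019-08-11", "2019-08-11"),
    ("2019-08-24", "2019-08-26"),
    ("2019-08-21", "2019-08-22"),
    ("2019-08-11", "2019-08-13"),
]

# Mirror COUNT(CASE WHEN order_date = customer_pref_delivery_date THEN 1 END):
# only rows where the two dates match contribute to the numerator.
immediate = sum(1 for ordered, preferred in deliveries if ordered == preferred)
immediate_percentage = round(immediate / len(deliveries) * 100, 2)
print(immediate_percentage)  # 33.33
```

Two of the six orders (deliveries 2 and 3) are immediate, so 2/6 scaled to a percentage rounds to 33.33.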
+ diff --git a/easy/1173. Immediate Food Delivery I.sql b/easy/1173. Immediate Food Delivery I.sql deleted file mode 100644 index 93ca152..0000000 --- a/easy/1173. Immediate Food Delivery I.sql +++ /dev/null @@ -1,3 +0,0 @@ -SELECT ROUND((COUNT(CASE WHEN order_date = customer_pref_delivery_date THEN 1 ELSE NULL END)::NUMERIC/COUNT(*))*100,2) - AS immediate_percentage -FROM delivery_1173; diff --git a/easy/1179. Reformat Department Table.md b/easy/1179. Reformat Department Table.md new file mode 100644 index 0000000..abc149d --- /dev/null +++ b/easy/1179. Reformat Department Table.md @@ -0,0 +1,104 @@ +# Question 1179: Reformat Department Table + +**LeetCode URL:** https://leetcode.com/problems/reformat-department-table/ + +## Description + +Return the result table in any order. The result format is in the following example. + +## Table Schema Structure + +```sql +Create table If Not Exists Department (id int, revenue int, month varchar(5)); +``` + +## Sample Input Data + +```sql +insert into Department (id, revenue, month) values ('1', '8000', 'Jan'); +insert into Department (id, revenue, month) values ('2', '9000', 'Jan'); +insert into Department (id, revenue, month) values ('3', '10000', 'Feb'); +insert into Department (id, revenue, month) values ('1', '7000', 'Feb'); +insert into Department (id, revenue, month) values ('1', '6000', 'Mar'); +``` + +## Expected Output Data + +```text ++------+-------------+-------------+-------------+-----+-------------+ +| id | Jan_Revenue | Feb_Revenue | Mar_Revenue | ... | Dec_Revenue | ++------+-------------+-------------+-------------+-----+-------------+ +| 1 | 8000 | 7000 | 6000 | ... | null | +| 2 | 9000 | null | null | ... | null | +| 3 | null | 10000 | null | ... 
| null | ++------+-------------+-------------+-------------+-----+-------------+ +``` + +## SQL Solution + +```sql +SELECT id, + SUM(CASE WHEN month = 'Jan' THEN revenue ELSE NULL END) AS Jan_Revenue, + SUM(CASE WHEN month = 'Feb' THEN revenue ELSE NULL END) AS Feb_Revenue, + SUM(CASE WHEN month = 'Mar' THEN revenue ELSE NULL END) AS Mar_Revenue, + SUM(CASE WHEN month = 'Apr' THEN revenue ELSE NULL END) AS Apr_Revenue, + SUM(CASE WHEN month = 'May' THEN revenue ELSE NULL END) AS May_Revenue, + SUM(CASE WHEN month = 'Jun' THEN revenue ELSE NULL END) AS Jun_Revenue, + SUM(CASE WHEN month = 'Jul' THEN revenue ELSE NULL END) AS Jul_Revenue, + SUM(CASE WHEN month = 'Aug' THEN revenue ELSE NULL END) AS Aug_Revenue, + SUM(CASE WHEN month = 'Sep' THEN revenue ELSE NULL END) AS Sep_Revenue, + SUM(CASE WHEN month = 'Oct' THEN revenue ELSE NULL END) AS Oct_Revenue, + SUM(CASE WHEN month = 'Nov' THEN revenue ELSE NULL END) AS Nov_Revenue, + SUM(CASE WHEN month = 'Dec' THEN revenue ELSE NULL END) AS Dec_Revenue +FROM department_1179 +GROUP BY id +ORDER BY id; + + +-- Extra +(SELECT id::TEXT, + SUM(CASE WHEN month = 'Jan' THEN revenue ELSE 0 END) AS Jan_Revenue, + SUM(CASE WHEN month = 'Feb' THEN revenue ELSE 0 END) AS Feb_Revenue, + SUM(CASE WHEN month = 'Mar' THEN revenue ELSE 0 END) AS Mar_Revenue, + SUM(revenue) AS Total +FROM department_1179 +GROUP BY id) +UNION +(SELECT NULL, + SUM(CASE WHEN month = 'Jan' THEN revenue ELSE 0 END) AS JTR, + SUM(CASE WHEN month = 'Feb' THEN revenue ELSE 0 END) AS FTR, + SUM(CASE WHEN month = 'Mar' THEN revenue ELSE 0 END) AS MTR, + SUM(revenue) AS TR +FROM department_1179 +GROUP BY 1) +ORDER BY 1; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `id`, `Jan_Revenue`, `Feb_Revenue`, `Mar_Revenue`, `Apr_Revenue`, `May_Revenue` from `department`. + +### Result Grain + +One row per unique key in `GROUP BY id`. + +### Step-by-Step Logic + +1. Aggregate rows with SUM grouped by id. +2. 
Project final output columns: `id` and one revenue column per month, `Jan_Revenue` through `Dec_Revenue`. +3. Order output deterministically with `ORDER BY id`. + +### Why This Works + +Grouping keys define the reporting grain; aggregate functions then summarize values at exactly that grain. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are sorting/grouping. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Every non-aggregated selected column must belong to the grouping grain. + diff --git a/easy/1179. Reformat Department Table.sql b/easy/1179. Reformat Department Table.sql deleted file mode 100644 index 66a1de7..0000000 --- a/easy/1179. Reformat Department Table.sql +++ /dev/null @@ -1,36 +0,0 @@ -SELECT id, - SUM(CASE WHEN month = 'Jan' THEN revenue ELSE NULL END) AS Jan_Revenue, - SUM(CASE WHEN month = 'Feb' THEN revenue ELSE NULL END) AS Feb_Revenue, - SUM(CASE WHEN month = 'Mar' THEN revenue ELSE NULL END) AS Mar_Revenue, - SUM(CASE WHEN month = 'Apr' THEN revenue ELSE NULL END) AS Apr_Revenue, - SUM(CASE WHEN month = 'May' THEN revenue ELSE NULL END) AS May_Revenue, - SUM(CASE WHEN month = 'Jun' THEN revenue ELSE NULL END) AS Jun_Revenue, - SUM(CASE WHEN month = 'Jul' THEN revenue ELSE NULL END) AS Jul_Revenue, - SUM(CASE WHEN month = 'Aug' THEN revenue ELSE NULL END) AS Aug_Revenue, - SUM(CASE WHEN month = 'Sep' THEN revenue ELSE NULL END) AS Sep_Revenue, - SUM(CASE WHEN month = 'Oct' THEN revenue ELSE NULL END) AS Oct_Revenue, - SUM(CASE WHEN month = 'Nov' THEN revenue ELSE NULL END) AS Nov_Revenue, - SUM(CASE WHEN month = 'Dec' THEN revenue ELSE NULL END) AS Dec_Revenue -FROM department_1179 -GROUP BY id -ORDER BY id; - - --- Extra -(SELECT id::TEXT, - SUM(CASE WHEN month = 'Jan' THEN revenue ELSE 0 END) AS Jan_Revenue, - SUM(CASE WHEN month = 'Feb' THEN revenue ELSE 0 END) AS Feb_Revenue, 
- SUM(CASE WHEN month = 'Mar' THEN revenue ELSE 0 END) AS Mar_Revenue, - SUM(revenue) AS Total -FROM department_1179 -GROUP BY id) -UNION -(SELECT NULL, - SUM(CASE WHEN month = 'Jan' THEN revenue ELSE 0 END) AS JTR, - SUM(CASE WHEN month = 'Feb' THEN revenue ELSE 0 END) AS FTR, - SUM(CASE WHEN month = 'Mar' THEN revenue ELSE 0 END) AS MTR, - SUM(revenue) AS TR -FROM department_1179 -GROUP BY 1) -ORDER BY 1; - diff --git a/easy/1211. Queries Quality and Percentage.md b/easy/1211. Queries Quality and Percentage.md new file mode 100644 index 0000000..2751e82 --- /dev/null +++ b/easy/1211. Queries Quality and Percentage.md @@ -0,0 +1,73 @@ +# Question 1211: Queries Quality and Percentage + +**LeetCode URL:** https://leetcode.com/problems/queries-quality-and-percentage/ + +## Description + +Write a solution to find each query_name, the quality and poor_query_percentage. Return the result table in any order. The result format is in the following example. + +## Table Schema Structure + +```sql +Create table If Not Exists Queries (query_name varchar(30), result varchar(50), position int, rating int); +``` + +## Sample Input Data + +```sql +insert into Queries (query_name, result, position, rating) values ('Dog', 'Golden Retriever', '1', '5'); +insert into Queries (query_name, result, position, rating) values ('Dog', 'German Shepherd', '2', '5'); +insert into Queries (query_name, result, position, rating) values ('Dog', 'Mule', '200', '1'); +insert into Queries (query_name, result, position, rating) values ('Cat', 'Shirazi', '5', '2'); +insert into Queries (query_name, result, position, rating) values ('Cat', 'Siamese', '3', '3'); +insert into Queries (query_name, result, position, rating) values ('Cat', 'Sphynx', '7', '4'); +``` + +## Expected Output Data + +```text ++------------+---------+-----------------------+ +| query_name | quality | poor_query_percentage | ++------------+---------+-----------------------+ +| Dog | 2.50 | 33.33 | +| Cat | 0.66 | 33.33 | 
++------------+---------+-----------------------+ +``` + +## SQL Solution + +```sql +SELECT query_name, + ROUND(AVG(rating::NUMERIC/position),2) AS quality, + ROUND((COUNT(CASE WHEN rating < 3 THEN 1 ELSE NULL END)::NUMERIC/COUNT(*))*100,2) AS poor_query_percentage +FROM queries_1211 +GROUP BY query_name; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `query_name`, `quality`, `poor_query_percentage` from `queries`. + +### Result Grain + +One row per unique key in `GROUP BY query_name`. + +### Step-by-Step Logic + +1. Aggregate rows with COUNT, AVG, ROUND grouped by query_name. +2. Project final output columns: `query_name`, `quality`, `poor_query_percentage`. + +### Why This Works + +Grouping keys define the reporting grain; aggregate functions then summarize values at exactly that grain. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are sorting/grouping. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Every non-aggregated selected column must belong to the grouping grain. + diff --git a/easy/1211. Queries Quality and Percentage.sql b/easy/1211. Queries Quality and Percentage.sql deleted file mode 100644 index baa1164..0000000 --- a/easy/1211. Queries Quality and Percentage.sql +++ /dev/null @@ -1,5 +0,0 @@ -SELECT query_name, - ROUND(AVG(rating::NUMERIC/position),2) AS quality, - ROUND((COUNT(CASE WHEN rating < 3 THEN 1 ELSE NULL END)::NUMERIC/COUNT(*))*100,2) AS poor_query_percentage -FROM queries_1211 -GROUP BY query_name; diff --git a/easy/1241. Number of Comments per Post (Easy).md b/easy/1241. Number of Comments per Post (Easy).md new file mode 100644 index 0000000..c06fd88 --- /dev/null +++ b/easy/1241. 
Number of Comments per Post (Easy).md @@ -0,0 +1,101 @@ +# Question 1241: Number of Comments per Post + +**LeetCode URL:** https://leetcode.com/problems/number-of-comments-per-post/ + +## Description + +Write an SQL query to find the number of comments on each post. The query result format is in the following example. The post with id 1 has three comments in the table, with id 3, 4 and 9. + +## Table Schema Structure + +```sql +Create table If Not Exists Submissions (sub_id int, parent_id int); +``` + +## Sample Input Data + +```sql +insert into Submissions (sub_id, parent_id) values ('1', NULL); +insert into Submissions (sub_id, parent_id) values ('2', NULL); +insert into Submissions (sub_id, parent_id) values ('1', NULL); +insert into Submissions (sub_id, parent_id) values ('12', NULL); +insert into Submissions (sub_id, parent_id) values ('3', '1'); +insert into Submissions (sub_id, parent_id) values ('5', '2'); +insert into Submissions (sub_id, parent_id) values ('3', '1'); +insert into Submissions (sub_id, parent_id) values ('4', '1'); +insert into Submissions (sub_id, parent_id) values ('9', '1'); +insert into Submissions (sub_id, parent_id) values ('10', '2'); +insert into Submissions (sub_id, parent_id) values ('6', '7'); +``` + +## Expected Output Data + +```text ++---------+--------------------+ +| post_id | number_of_comments | ++---------+--------------------+ +| 1 | 3 | +| 2 | 2 | +| 12 | 0 | ++---------+--------------------+ +``` + +## SQL Solution + +```sql +WITH posts AS ( + SELECT * + FROM submissions_1241 + WHERE parent_id IS NULL +), +cmnts 
AS ( + SELECT * + FROM submissions_1241 + WHERE parent_id IS NOT NULL +), +cte AS ( + SELECT DISTINCT p.sub_id AS post_id,c.sub_id AS cmnt_id + FROM posts p + LEFT JOIN cmnts c + ON p.sub_id = c.parent_id +) +SELECT post_id,COUNT(cmnt_id) AS number_of_comments +FROM cte +GROUP BY post_id +ORDER BY post_id; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `post_id`, `number_of_comments` from `submissions`, `posts`, `cmnts`, `cte`. + +### Result Grain + +One row per unique key in `GROUP BY post_id`. + +### Step-by-Step Logic + +1. Create CTE layers (`posts`, `cmnts`, `cte`) to decompose the logic into smaller, testable steps before the final SELECT. +2. CTE `posts`: reads `submissions`, keeping rows where `parent_id IS NULL`. +3. CTE `cmnts`: reads `submissions`, keeping rows where `parent_id IS NOT NULL`. +4. CTE `cte`: reads `posts`, `cmnts`, joins related entities. +5. Combine posts with their comments using a LEFT JOIN so posts without comments are kept; `COUNT(cmnt_id)` then yields 0 for them because COUNT ignores NULL. Join predicates control row matching and prevent accidental cartesian growth. +6. Aggregate rows with COUNT grouped by post_id. +7. Project final output columns: `post_id`, `number_of_comments`. +8. Order output deterministically with `ORDER BY post_id`. + +### Why This Works + +CTEs separate transformation stages, which makes dependencies explicit and easier to validate. Join conditions align related entities so each output row is built from the correct source records. Grouping keys define the reporting grain; aggregate functions then summarize values at exactly that grain. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are sorting/grouping, join operations. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently. +- Every non-aggregated selected column must belong to the grouping grain. + diff --git a/easy/1241. 
Number of Comments per Post (Easy).sql deleted file mode 100644 index 3d5b994..0000000 --- a/easy/1241. Number of Comments per Post (Easy).sql +++ /dev/null @@ -1,20 +0,0 @@ -WITH posts AS ( - SELECT * - FROM submissions_1241 - WHERE parent_id IS NULL -), -cmnts AS ( - SELECT * - FROM submissions_1241 - WHERE parent_id IS NOT NULL -), -cte AS ( - SELECT DISTINCT p.sub_id AS post_id,c.sub_id AS cmnt_id - FROM posts p - LEFT JOIN cmnts c - ON p.sub_id = c.parent_id -) -SELECT post_id,COUNT(cmnt_id) -FROM cte -GROUP BY post_id -ORDER BY post_id; diff --git a/easy/1251. Average Selling Price (Easy).md b/easy/1251. Average Selling Price (Easy).md new file mode 100644 index 0000000..6df9d82 --- /dev/null +++ b/easy/1251. Average Selling Price (Easy).md @@ -0,0 +1,79 @@ +# Question 1251: Average Selling Price + +**LeetCode URL:** https://leetcode.com/problems/average-selling-price/ + +## Description + +Write a solution to find the average selling price for each product. Return the result table in any order. The result format is in the following example. 
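Before the SQL itself, the weighted-average logic can be sanity-checked in plain Python. This is an illustrative sketch only (the function name `average_price` is mine, not part of the repository); it mirrors the sample Prices and UnitsSold rows listed in the sections that follow:

```python
from datetime import date

# Mirror of the sample Prices rows: (product_id, start_date, end_date, price).
prices = [
    (1, date(2019, 2, 17), date(2019, 2, 28), 5),
    (1, date(2019, 3, 1), date(2019, 3, 22), 20),
    (2, date(2019, 2, 1), date(2019, 2, 20), 15),
    (2, date(2019, 2, 21), date(2019, 3, 31), 30),
]
# Mirror of the sample UnitsSold rows: (product_id, purchase_date, units).
units_sold = [
    (1, date(2019, 2, 25), 100),
    (1, date(2019, 3, 1), 15),
    (2, date(2019, 2, 10), 200),
    (2, date(2019, 3, 22), 30),
]

def average_price(pid):
    # SUM(units * price) / SUM(units), pricing each sale by the band whose
    # start_date..end_date range covers the purchase date (SQL BETWEEN).
    revenue = total_units = 0
    for product_id, purchase_date, units in units_sold:
        if product_id != pid:
            continue
        price = next(p for i, s, e, p in prices
                     if i == pid and s <= purchase_date <= e)
        revenue += units * price
        total_units += units
    return round(revenue / total_units, 2)

print(average_price(1), average_price(2))  # 6.96 16.96, matching the expected output
```

The date-range lookup here is exactly what the `BETWEEN` predicate in the join condition of the SQL solution expresses.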
+ +## Table Schema Structure + +```sql +Create table If Not Exists Prices (product_id int, start_date date, end_date date, price int); +Create table If Not Exists UnitsSold (product_id int, purchase_date date, units int); +``` + +## Sample Input Data + +```sql +insert into Prices (product_id, start_date, end_date, price) values ('1', '2019-02-17', '2019-02-28', '5'); +insert into Prices (product_id, start_date, end_date, price) values ('1', '2019-03-01', '2019-03-22', '20'); +insert into Prices (product_id, start_date, end_date, price) values ('2', '2019-02-01', '2019-02-20', '15'); +insert into Prices (product_id, start_date, end_date, price) values ('2', '2019-02-21', '2019-03-31', '30'); +insert into UnitsSold (product_id, purchase_date, units) values ('1', '2019-02-25', '100'); +insert into UnitsSold (product_id, purchase_date, units) values ('1', '2019-03-01', '15'); +insert into UnitsSold (product_id, purchase_date, units) values ('2', '2019-02-10', '200'); +insert into UnitsSold (product_id, purchase_date, units) values ('2', '2019-03-22', '30'); +``` + +## Expected Output Data + +```text ++------------+---------------+ +| product_id | average_price | ++------------+---------------+ +| 1 | 6.96 | +| 2 | 16.96 | ++------------+---------------+ +``` + +## SQL Solution + +```sql +SELECT u.product_id,ROUND(SUM(u.units*p.price)::NUMERIC/SUM(u.units),2) AS average_price +FROM unit_sold_1251 u +INNER JOIN prices_1251 p +ON u.purchase_date BETWEEN p.start_date AND p.end_date AND + u.product_id = p.product_id +GROUP BY u.product_id; +``` + +## Solution Breakdown + +### Goal + +The query produces the result columns `product_id` and `average_price` from `unit_sold` and `prices`. + +### Result Grain + +One row per unique key in `GROUP BY u.product_id`. + +### Step-by-Step Logic + +1. Combine datasets using an INNER JOIN on product_id plus a date-range predicate (purchase_date BETWEEN start_date AND end_date) that picks the price band in effect for each sale. +2. Aggregate with SUM(u.units*p.price)/SUM(u.units) grouped by u.product_id, then ROUND to two decimals; the ::NUMERIC cast prevents integer division. +3.
Project final output columns: `product_id`. + +### Why This Works + +Join conditions align related entities so each output row is built from the correct source records. Grouping keys define the reporting grain; aggregate functions then summarize values at exactly that grain. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are sorting/grouping, join operations. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently. +- Every non-aggregated selected column must belong to the grouping grain. + diff --git a/easy/1251. Average Selling Price (Easy).sql b/easy/1251. Average Selling Price (Easy).sql deleted file mode 100644 index 5d276af..0000000 --- a/easy/1251. Average Selling Price (Easy).sql +++ /dev/null @@ -1,6 +0,0 @@ -SELECT u.product_id,ROUND(SUM(u.units*p.price)::NUMERIC/SUM(u.units),2) -FROM unit_sold_1251 u -INNER JOIN prices_1251 p -ON u.purchase_date BETWEEN p.start_date AND p.end_date AND - u.product_id = p.product_id -GROUP BY u.product_id; diff --git a/easy/1280. Students and Examinations (Easy).md b/easy/1280. Students and Examinations (Easy).md new file mode 100644 index 0000000..c5ff8f7 --- /dev/null +++ b/easy/1280. Students and Examinations (Easy).md @@ -0,0 +1,110 @@ +# Question 1280: Students and Examinations + +**LeetCode URL:** https://leetcode.com/problems/students-and-examinations/ + +## Description + +Write a solution to find the number of times each student attended each exam. Return the result table ordered by student_id and subject_name. The result format is in the following example. 
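The counting logic can be previewed with a small Python sketch (illustrative only; variable names are mine) over the sample rows listed in the sections that follow. It mirrors the cross-join-then-left-join shape of the SQL solution:

```python
from collections import Counter

# Mirror of the sample Students, Subjects and Examinations rows.
students = {1: "Alice", 2: "Bob", 13: "John", 6: "Alex"}
subjects = ["Math", "Physics", "Programming"]
examinations = [
    (1, "Math"), (1, "Physics"), (1, "Programming"), (2, "Programming"),
    (1, "Physics"), (1, "Math"), (13, "Math"), (13, "Programming"),
    (13, "Physics"), (2, "Math"), (1, "Math"),
]

# Count exam attendances per (student_id, subject_name) pair.
attended = Counter(examinations)

# Cross join every student with every subject, then left-join the counts,
# defaulting missing pairs to 0 (the COALESCE in the SQL solution).
result = [
    (sid, name, subj, attended.get((sid, subj), 0))
    for sid, name in sorted(students.items())
    for subj in subjects
]
print(result[:3])
```

The cross join guarantees a row for every (student, subject) pair, including students who attended no exams at all.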
+ +## Table Schema Structure + +```sql +Create table If Not Exists Students (student_id int, student_name varchar(20)); +Create table If Not Exists Subjects (subject_name varchar(20)); +Create table If Not Exists Examinations (student_id int, subject_name varchar(20)); +``` + +## Sample Input Data + +```sql +insert into Students (student_id, student_name) values ('1', 'Alice'); +insert into Students (student_id, student_name) values ('2', 'Bob'); +insert into Students (student_id, student_name) values ('13', 'John'); +insert into Students (student_id, student_name) values ('6', 'Alex'); +insert into Subjects (subject_name) values ('Math'); +insert into Subjects (subject_name) values ('Physics'); +insert into Subjects (subject_name) values ('Programming'); +insert into Examinations (student_id, subject_name) values ('1', 'Math'); +insert into Examinations (student_id, subject_name) values ('1', 'Physics'); +insert into Examinations (student_id, subject_name) values ('1', 'Programming'); +insert into Examinations (student_id, subject_name) values ('2', 'Programming'); +insert into Examinations (student_id, subject_name) values ('1', 'Physics'); +insert into Examinations (student_id, subject_name) values ('1', 'Math'); +insert into Examinations (student_id, subject_name) values ('13', 'Math'); +insert into Examinations (student_id, subject_name) values ('13', 'Programming'); +insert into Examinations (student_id, subject_name) values ('13', 'Physics'); +insert into Examinations (student_id, subject_name) values ('2', 'Math'); +insert into Examinations (student_id, subject_name) values ('1', 'Math'); +``` + +## Expected Output Data + +```text ++------------+--------------+--------------+----------------+ +| student_id | student_name | subject_name | attended_exams | ++------------+--------------+--------------+----------------+ +| 1 | Alice | Math | 3 | +| 1 | Alice | Physics | 2 | +| 1 | Alice | Programming | 1 | +| 2 | Bob | Math | 1 | +| 2 | Bob | Physics | 0 | +| 2 
| Bob | Programming | 1 | +| 6 | Alex | Math | 0 | +| 6 | Alex | Physics | 0 | +| 6 | Alex | Programming | 0 | +| 13 | John | Math | 1 | +| 13 | John | Physics | 1 | +| 13 | John | Programming | 1 | ++------------+--------------+--------------+----------------+ +``` + +## SQL Solution + +```sql +WITH exams AS ( + SELECT student_id,subject_name,COUNT(1) AS attended_exams + FROM examinations_1280 + GROUP BY student_id,subject_name +), +combinations AS ( + SELECT st.*,sb.subject_name + FROM students_1280 st + CROSS JOIN subjects_1280 sb +) +SELECT c.student_id,c.student_name,c.subject_name,COALESCE(e.attended_exams,0) AS attended_exams +FROM combinations c +LEFT JOIN exams e ON e.student_id = c.student_id AND e.subject_name = c.subject_name +ORDER BY c.student_id,c.subject_name; +``` + +## Solution Breakdown + +### Goal + +The query produces the result columns `student_id`, `student_name`, `subject_name`, and `attended_exams` from `students`, `subjects`, and `examinations` via the CTEs `exams` and `combinations`. + +### Result Grain + +One row per (student, subject) pair produced by the cross join. + +### Step-by-Step Logic + +1. Create CTE layers (`exams`, `combinations`) to decompose the logic into smaller, testable steps before the final SELECT. +2. CTE `exams`: counts attendances per (student_id, subject_name) in `examinations`. +3. CTE `combinations`: builds every (student, subject) pair by CROSS JOINing `students` with `subjects`. +4. LEFT JOIN the per-pair counts from `exams` onto `combinations`, so pairs with no exams are kept and defaulted to 0 via COALESCE. +5. Project final output columns: `student_id`, `student_name`, `subject_name`, `attended_exams`. +6. Order output deterministically with `ORDER BY c.student_id,c.subject_name`. + +### Why This Works + +CTEs separate transformation stages, which makes dependencies explicit and easier to validate. Join conditions align related entities so each output row is built from the correct source records. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are sorting/grouping and join operations.
Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently. + diff --git a/easy/1280. Students and Examinations (Easy).sql b/easy/1280. Students and Examinations (Easy).sql deleted file mode 100644 index 36a729c..0000000 --- a/easy/1280. Students and Examinations (Easy).sql +++ /dev/null @@ -1,14 +0,0 @@ -WITH exams AS ( - SELECT student_id,subject_name,COUNT(1) AS attended_exams - FROM examinations_1280 - GROUP BY student_id,subject_name -), -combinations AS ( - SELECT st.*,sb.subject_name - FROM students_1280 st - CROSS JOIN subjects_1280 sb -) -SELECT c.student_id,c.student_name,c.subject_name,COALESCE(e.attended_exams,0) -FROM combinations c -LEFT JOIN exams e ON e.student_id = c.student_id AND e.subject_name = c.subject_name -ORDER BY c.student_id,c.subject_name; diff --git a/easy/1294. Weather Type in Each Country (Easy).md b/easy/1294. Weather Type in Each Country (Easy).md new file mode 100644 index 0000000..d9cb238 --- /dev/null +++ b/easy/1294. Weather Type in Each Country (Easy).md @@ -0,0 +1,102 @@ +# Question 1294: Weather Type in Each Country + +**LeetCode URL:** https://leetcode.com/problems/weather-type-in-each-country/ + +## Description + +Write an SQL query to find the type of weather in each country for November 2019. Return result table in any order. 
The Countries and Weather tables and the expected result are shown in the sections below. For example, the average weather_state in the USA in November is 15 / 1 = 15, so the weather type is Cold.
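The classification thresholds used by the solution's CASE expression can be expressed as a tiny Python helper (an illustrative sketch, not part of the repository):

```python
from statistics import mean

def weather_type(november_states):
    # Thresholds from the problem statement: average <= 15 is Cold,
    # average >= 25 is Hot, anything strictly in between is Warm.
    avg = mean(november_states)
    if avg <= 15:
        return "Cold"
    if avg >= 25:
        return "Hot"
    return "Warm"

# November readings per country from the sample data:
print(weather_type([15]))          # USA
print(weather_type([16, 18, 21]))  # China
print(weather_type([25]))          # Peru (only the 2019-11-28 reading falls in November)
```

Note that the boundary values 15 and 25 belong to Cold and Hot respectively, which is why the SQL uses `<=` and `>=` rather than strict comparisons.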
+ +## Table Schema Structure + +```sql +Create table If Not Exists Countries (country_id int, country_name varchar(20)); +Create table If Not Exists Weather (country_id int, weather_state int, day date); +``` + +## Sample Input Data + +```sql +insert into Countries (country_id, country_name) values ('2', 'USA'); +insert into Countries (country_id, country_name) values ('3', 'Australia'); +insert into Countries (country_id, country_name) values ('7', 'Peru'); +insert into Countries (country_id, country_name) values ('5', 'China'); +insert into Countries (country_id, country_name) values ('8', 'Morocco'); +insert into Countries (country_id, country_name) values ('9', 'Spain'); +insert into Weather (country_id, weather_state, day) values ('2', '15', '2019-11-01'); +insert into Weather (country_id, weather_state, day) values ('2', '12', '2019-10-28'); +insert into Weather (country_id, weather_state, day) values ('2', '12', '2019-10-27'); +insert into Weather (country_id, weather_state, day) values ('3', '-2', '2019-11-10'); +insert into Weather (country_id, weather_state, day) values ('3', '0', '2019-11-11'); +insert into Weather (country_id, weather_state, day) values ('3', '3', '2019-11-12'); +insert into Weather (country_id, weather_state, day) values ('5', '16', '2019-11-07'); +insert into Weather (country_id, weather_state, day) values ('5', '18', '2019-11-09'); +insert into Weather (country_id, weather_state, day) values ('5', '21', '2019-11-23'); +insert into Weather (country_id, weather_state, day) values ('7', '25', '2019-11-28'); +insert into Weather (country_id, weather_state, day) values ('7', '22', '2019-12-01'); +insert into Weather (country_id, weather_state, day) values ('7', '20', '2019-12-02'); +insert into Weather (country_id, weather_state, day) values ('8', '25', '2019-11-05'); +insert into Weather (country_id, weather_state, day) values ('8', '27', '2019-11-15'); +insert into Weather (country_id, weather_state, day) values ('8', '31', 
'2019-11-25'); +insert into Weather (country_id, weather_state, day) values ('9', '7', '2019-10-23'); +insert into Weather (country_id, weather_state, day) values ('9', '3', '2019-12-23'); +``` + +## Expected Output Data + +```text ++--------------+--------------+ +| country_name | weather_type | ++--------------+--------------+ +| USA | Cold | +| Australia | Cold | +| Peru | Hot | +| China | Warm | +| Morocco | Hot | ++--------------+--------------+ +``` + +## SQL Solution + +```sql +SELECT country_name, + CASE WHEN AVG(weather_state) <= 15 THEN 'Cold' + WHEN AVG(weather_state) >= 25 THEN 'Hot' + ELSE 'Warm' + END AS weather_type +FROM weather_1294 w +INNER JOIN countries_1294 c ON w.country_id = c.country_id +WHERE EXTRACT(month FROM day) = 11 +GROUP BY country_name; +``` + +## Solution Breakdown + +### Goal + +The query produces the result columns `country_name` and `weather_type` from `weather` and `countries`. + +### Result Grain + +One row per unique key in `GROUP BY country_name`. + +### Step-by-Step Logic + +1. Combine datasets using an INNER JOIN on country_id. Join predicates control row matching and prevent accidental cartesian growth. +2. Apply row-level filtering in `WHERE`: EXTRACT(month FROM day) = 11 keeps November readings (the sample data covers only 2019). +3. Aggregate rows with AVG grouped by country_name. +4. Project final output columns: `country_name`, `weather_type`. + +### Why This Works + +Join conditions align related entities so each output row is built from the correct source records. Grouping keys define the reporting grain; aggregate functions then summarize values at exactly that grain. Predicate filtering removes irrelevant rows before expensive downstream computation. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are sorting/grouping and join operations. Indexes on join/filter/group keys typically provide the biggest speedup.
+ +### Common Pitfalls + +- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently. +- Every non-aggregated selected column must belong to the grouping grain. +- If nullable columns are involved, validate null-handling explicitly (`IS NULL`, `COALESCE`). +- `EXTRACT(month FROM day) = 11` matches November of every year; add a year predicate (e.g. `EXTRACT(year FROM day) = 2019`) if the table can span multiple years. + diff --git a/easy/1294. Weather Type in Each Country (Easy).sql b/easy/1294. Weather Type in Each Country (Easy).sql deleted file mode 100644 index c7a5ba5..0000000 --- a/easy/1294. Weather Type in Each Country (Easy).sql +++ /dev/null @@ -1,10 +0,0 @@ -SELECT country_name, - CASE WHEN AVG(weather_state) <= 15 THEN 'Cold' - WHEN AVG(weather_state) >= 25 THEN 'Hot' - ELSE 'Warm' - END AS weather_type -FROM weather_1294 w -INNER JOIN countries_1294 c ON w.country_id = c.country_id -WHERE EXTRACT(month FROM day) = 11 -GROUP BY country_name; - diff --git a/easy/1303. Find the Team Size (Easy).md b/easy/1303. Find the Team Size (Easy).md new file mode 100644 index 0000000..cf1ad7d --- /dev/null +++ b/easy/1303. Find the Team Size (Easy).md @@ -0,0 +1,77 @@ +# Question 1303: Find the Team Size + +**LeetCode URL:** https://leetcode.com/problems/find-the-team-size/ + +## Description + +Write an SQL query to find the team size of each of the employees. Return the result table in any order. The Employee table and the expected result are shown in the sections below. Employees with id 1, 2 and 3 are part of the team with team_id = 8.
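The window-function behavior used by the solution, where every employee row keeps its team's headcount, can be mimicked with a short Python sketch (illustrative only) over the sample rows:

```python
from collections import Counter

# Mirror of the sample Employee rows: (employee_id, team_id).
employees = [(1, 8), (2, 8), (3, 8), (4, 7), (5, 9), (6, 9)]

# COUNT(employee_id) OVER (PARTITION BY team_id): count rows per team once,
# then attach the team's size to every employee row without collapsing rows.
team_size = Counter(team for _, team in employees)
result = [(emp, team_size[team]) for emp, team in sorted(employees)]
print(result)
```

This is the key difference from a GROUP BY: the window version preserves one output row per employee, while a GROUP BY on team_id would collapse each team to a single row.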
+ +## Table Schema Structure + +```sql +Create table If Not Exists Employee (employee_id int, team_id int); +``` + +## Sample Input Data + +```sql +insert into Employee (employee_id, team_id) values ('1', '8'); +insert into Employee (employee_id, team_id) values ('2', '8'); +insert into Employee (employee_id, team_id) values ('3', '8'); +insert into Employee (employee_id, team_id) values ('4', '7'); +insert into Employee (employee_id, team_id) values ('5', '9'); +insert into Employee (employee_id, team_id) values ('6', '9'); +``` + +## Expected Output Data + +```text ++-------------+------------+ +| employee_id | team_size | ++-------------+------------+ +| 1 | 3 | +| 2 | 3 | +| 3 | 3 | +| 4 | 1 | +| 5 | 2 | +| 6 | 2 | ++-------------+------------+ +``` + +## SQL Solution + +```sql +SELECT employee_id, + COUNT(employee_id) OVER (PARTITION BY team_id) AS team_size +FROM employee_1303 +ORDER BY employee_id; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `employee_id`, `team_size` from `employee`. + +### Result Grain + +Row grain follows the post-filter join output. + +### Step-by-Step Logic + +1. Use window functions (`... OVER (...)`) to compute rankings/running metrics while preserving row-level detail. +2. Project final output columns: `employee_id`, `team_size`. +3. Order output deterministically with `ORDER BY employee_id`. + +### Why This Works + +Window expressions calculate comparative metrics without collapsing rows too early. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are sorting/grouping, window partitions. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Window `PARTITION BY`/`ORDER BY` choices change semantics; test with tied values and nulls. + diff --git a/easy/1303. Find the Team Size (Easy).sql b/easy/1303. 
Find the Team Size (Easy).sql deleted file mode 100644 index 271ccd4..0000000 --- a/easy/1303. Find the Team Size (Easy).sql +++ /dev/null @@ -1,4 +0,0 @@ -SELECT employee_id, - COUNT(employee_id) OVER (PARTITION BY team_id) AS team_size -FROM employee_1303 -ORDER BY employee_id; diff --git a/easy/1322. Ads Performance (Easy).md b/easy/1322. Ads Performance (Easy).md new file mode 100644 index 0000000..3bc251e --- /dev/null +++ b/easy/1322. Ads Performance (Easy).md @@ -0,0 +1,84 @@ +# Question 1322: Ads Performance + +**LeetCode URL:** https://leetcode.com/problems/ads-performance/ + +## Description + +Write an SQL query to find the ctr (click-through rate) of each Ad. The Ads table and the expected result are shown in the sections below.
+ +## Table Schema Structure + +```sql +Create table If Not Exists Ads (ad_id int, user_id int, action ENUM('Clicked', 'Viewed', 'Ignored')); +``` + +## Sample Input Data + +```sql +insert into Ads (ad_id, user_id, action) values ('1', '1', 'Clicked'); +insert into Ads (ad_id, user_id, action) values ('2', '2', 'Clicked'); +insert into Ads (ad_id, user_id, action) values ('3', '3', 'Viewed'); +insert into Ads (ad_id, user_id, action) values ('5', '5', 'Ignored'); +insert into Ads (ad_id, user_id, action) values ('1', '7', 'Ignored'); +insert into Ads (ad_id, user_id, action) values ('2', '7', 'Viewed'); +insert into Ads (ad_id, user_id, action) values ('3', '5', 'Clicked'); +insert into Ads (ad_id, user_id, action) values ('1', '4', 'Viewed'); +insert into Ads (ad_id, user_id, action) values ('2', '11', 'Viewed'); +insert into Ads (ad_id, user_id, action) values ('1', '2', 'Clicked'); +``` + +## Expected Output Data + +```text ++-------+-------+ +| ad_id | ctr | ++-------+-------+ +| 1 | 66.67 | +| 3 | 50.00 | +| 2 | 33.33 | +| 5 | 0.00 | ++-------+-------+ +``` + +## SQL Solution + +```sql +SELECT ad_id, + COALESCE(ROUND(AVG( + CASE WHEN action = 'Clicked' THEN 1 + WHEN action = 'Viewed' THEN 0 + ELSE NULL + END)*100,2),0.00) AS ctr +FROM ads_1322 +GROUP BY ad_id +ORDER BY ctr DESC; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `ad_id`, `ctr` from `ads`. + +### Result Grain + +One row per unique key in `GROUP BY ad_id`. + +### Step-by-Step Logic + +1. Aggregate rows with AVG, ROUND grouped by ad_id. +2. Project final output columns: `ad_id`, `ctr`. +3. Order output deterministically with `ORDER BY ctr DESC`. + +### Why This Works + +Grouping keys define the reporting grain; aggregate functions then summarize values at exactly that grain. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are sorting/grouping. 
Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Every non-aggregated selected column must belong to the grouping grain. + diff --git a/easy/1322. Ads Performance (Easy).sql b/easy/1322. Ads Performance (Easy).sql deleted file mode 100644 index 1d314a2..0000000 --- a/easy/1322. Ads Performance (Easy).sql +++ /dev/null @@ -1,9 +0,0 @@ -SELECT ad_id, - COALESCE(ROUND(AVG( - CASE WHEN action = 'Clicked' THEN 1 - WHEN action = 'Viewed' THEN 0 - ELSE NULL - END)*100,2),0.00) AS ctr -FROM ads_1322 -GROUP BY ad_id -ORDER BY ctr DESC; diff --git a/easy/1327. List the Products Ordered in a Period (Easy).md b/easy/1327. List the Products Ordered in a Period (Easy).md new file mode 100644 index 0000000..f34acfd --- /dev/null +++ b/easy/1327. List the Products Ordered in a Period (Easy).md @@ -0,0 +1,91 @@ +# Question 1327: List the Products Ordered in a Period + +**LeetCode URL:** https://leetcode.com/problems/list-the-products-ordered-in-a-period/ + +## Description + +Write a solution to get the names of products that have at least 100 units ordered in February 2020 and their amount. Return the result table in any order. The result format is in the following example. 
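The filter-aggregate-threshold pipeline described above can be sanity-checked in plain Python (an illustrative sketch; variable names are mine) against the sample rows listed in the sections that follow:

```python
from collections import defaultdict
from datetime import date

# Mirror of the sample Products and Orders rows.
products = {1: "Leetcode Solutions", 2: "Jewels of Stringology",
            3: "HP", 4: "Lenovo", 5: "Leetcode Kit"}
orders = [
    (1, date(2020, 2, 5), 60), (1, date(2020, 2, 10), 70),
    (2, date(2020, 1, 18), 30), (2, date(2020, 2, 11), 80),
    (3, date(2020, 2, 17), 2), (3, date(2020, 2, 24), 3),
    (4, date(2020, 3, 1), 20), (4, date(2020, 3, 4), 30),
    (4, date(2020, 3, 4), 60), (5, date(2020, 2, 25), 50),
    (5, date(2020, 2, 27), 50), (5, date(2020, 3, 1), 50),
]

# Keep February 2020 orders (WHERE), sum units per product (GROUP BY),
# then keep only totals of at least 100 (HAVING).
totals = defaultdict(int)
for pid, order_date, units in orders:
    if order_date.year == 2020 and order_date.month == 2:
        totals[products[pid]] += units
result = {name: units for name, units in totals.items() if units >= 100}
print(result)
```

Note the two distinct filtering stages: the date filter runs per row before aggregation, while the >= 100 threshold runs per group after it, exactly as `WHERE` and `HAVING` divide the work in SQL.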
+ +## Table Schema Structure + +```sql +Create table If Not Exists Products (product_id int, product_name varchar(40), product_category varchar(40)); +Create table If Not Exists Orders (product_id int, order_date date, unit int); +``` + +## Sample Input Data + +```sql +insert into Products (product_id, product_name, product_category) values ('1', 'Leetcode Solutions', 'Book'); +insert into Products (product_id, product_name, product_category) values ('2', 'Jewels of Stringology', 'Book'); +insert into Products (product_id, product_name, product_category) values ('3', 'HP', 'Laptop'); +insert into Products (product_id, product_name, product_category) values ('4', 'Lenovo', 'Laptop'); +insert into Products (product_id, product_name, product_category) values ('5', 'Leetcode Kit', 'T-shirt'); +insert into Orders (product_id, order_date, unit) values ('1', '2020-02-05', '60'); +insert into Orders (product_id, order_date, unit) values ('1', '2020-02-10', '70'); +insert into Orders (product_id, order_date, unit) values ('2', '2020-01-18', '30'); +insert into Orders (product_id, order_date, unit) values ('2', '2020-02-11', '80'); +insert into Orders (product_id, order_date, unit) values ('3', '2020-02-17', '2'); +insert into Orders (product_id, order_date, unit) values ('3', '2020-02-24', '3'); +insert into Orders (product_id, order_date, unit) values ('4', '2020-03-01', '20'); +insert into Orders (product_id, order_date, unit) values ('4', '2020-03-04', '30'); +insert into Orders (product_id, order_date, unit) values ('4', '2020-03-04', '60'); +insert into Orders (product_id, order_date, unit) values ('5', '2020-02-25', '50'); +insert into Orders (product_id, order_date, unit) values ('5', '2020-02-27', '50'); +insert into Orders (product_id, order_date, unit) values ('5', '2020-03-01', '50'); +``` + +## Expected Output Data + +```text ++--------------------+---------+ +| product_name | unit | ++--------------------+---------+ +| Leetcode Solutions | 130 | +| Leetcode Kit 
| 100 | ++--------------------+---------+ +``` + +## SQL Solution + +```sql +SELECT p.product_name,SUM(o.unit) AS unit +FROM orders_1327 o +INNER JOIN products_1327 p ON o.product_id = p.product_id +WHERE DATE_TRUNC('MONTH',o.order_date)::DATE = '2020-02-01' +GROUP BY p.product_name +HAVING SUM(o.unit) >= 100; +``` + +## Solution Breakdown + +### Goal + +The query produces the result columns `product_name` and `unit` from `orders` and `products`. + +### Result Grain + +One row per unique key in `GROUP BY p.product_name`. + +### Step-by-Step Logic + +1. Combine datasets using an INNER JOIN on product_id. Join predicates control row matching and prevent accidental cartesian growth. +2. Apply row-level filtering in `WHERE`: DATE_TRUNC('MONTH',o.order_date)::DATE = '2020-02-01' keeps only February 2020 orders. +3. Aggregate rows with SUM grouped by p.product_name. +4. Filter aggregated groups in `HAVING`: SUM(o.unit) >= 100. +5. Project final output columns: `product_name`, `unit`. + +### Why This Works + +Join conditions align related entities so each output row is built from the correct source records. Grouping keys define the reporting grain; aggregate functions then summarize values at exactly that grain. Predicate filtering removes irrelevant rows before expensive downstream computation. `HAVING` ensures only groups that satisfy business thresholds survive. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are sorting/grouping and join operations. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently. +- Every non-aggregated selected column must belong to the grouping grain. +- If nullable columns are involved, validate null-handling explicitly (`IS NULL`, `COALESCE`). + diff --git a/easy/1327.
List the Products Ordered in a Period (Easy).sql deleted file mode 100644 index 8eeb1cb..0000000 --- a/easy/1327. List the Products Ordered in a Period (Easy).sql +++ /dev/null @@ -1,7 +0,0 @@ -SELECT p.product_name,SUM(o.unit) -FROM orders_1327 o -INNER JOIN products_1327 p ON o.product_id = p.product_id -WHERE DATE_TRUNC('MONTH',o.order_date)::DATE = '2020-02-01' -GROUP BY p.product_name -HAVING SUM(o.unit) >= 100; - diff --git a/easy/1350. Students With Invalid Departments (Easy).md b/easy/1350. Students With Invalid Departments (Easy).md new file mode 100644 index 0000000..a4bf8ff --- /dev/null +++ b/easy/1350. Students With Invalid Departments (Easy).md @@ -0,0 +1,83 @@ +# Question 1350: Students With Invalid Departments + +**LeetCode URL:** https://leetcode.com/problems/students-with-invalid-departments/ + +## Description + +Write an SQL query to find the id and the name of all students who are enrolled in departments that no longer exist. Return the result table in any order. The Departments and Students tables and the expected result are shown in the sections below. John, Daiana, Steve and Jasmine are enrolled in the nonexistent departments 14, 33, 74 and 77 respectively.
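The anti-join at the heart of the solution, keeping students whose department_id has no match, can be previewed with a small Python sketch (illustrative only) over the sample rows:

```python
# Mirror of the sample Departments (ids only) and Students rows.
department_ids = {1, 7, 13}
students = [
    (23, "Alice", 1), (1, "Bob", 7), (5, "Jennifer", 13), (2, "John", 14),
    (4, "Jasmine", 77), (3, "Steve", 74), (6, "Luis", 1),
    (8, "Jonathan", 7), (7, "Daiana", 33), (11, "Madelynn", 1),
]

# Anti-join: keep students whose department_id has no match, which is what
# LEFT JOIN ... WHERE d.id IS NULL expresses in the SQL solution.
invalid = [(sid, name) for sid, name, dept in students
           if dept not in department_ids]
print(invalid)
```

In SQL, the LEFT JOIN produces NULL department columns for unmatched students, and the `WHERE d.id IS NULL` filter keeps exactly those rows.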
+ +## Table Schema Structure + +```sql +Create table If Not Exists Departments (id int, name varchar(30)); +Create table If Not Exists Students (id int, name varchar(30), department_id int); +``` + +## Sample Input Data + +```sql +insert into Departments (id, name) values ('1', 'Electrical Engineering'); +insert into Departments (id, name) values ('7', 'Computer Engineering'); +insert into Departments (id, name) values ('13', 'Bussiness Administration'); +insert into Students (id, name, department_id) values ('23', 'Alice', '1'); +insert into Students (id, name, department_id) values ('1', 'Bob', '7'); +insert into Students (id, name, department_id) values ('5', 'Jennifer', '13'); +insert into Students (id, name, department_id) values ('2', 'John', '14'); +insert into Students (id, name, department_id) values ('4', 'Jasmine', '77'); +insert into Students (id, name, department_id) values ('3', 'Steve', '74'); +insert into Students (id, name, department_id) values ('6', 'Luis', '1'); +insert into Students (id, name, department_id) values ('8', 'Jonathan', '7'); +insert into Students (id, name, department_id) values ('7', 'Daiana', '33'); +insert into Students (id, name, department_id) values ('11', 'Madelynn', '1'); +``` + +## Expected Output Data + +```text ++------+----------+ +| id | name | ++------+----------+ +| 2 | John | +| 7 | Daiana | +| 4 | Jasmine | +| 3 | Steve | ++------+----------+ +``` + +## SQL Solution + +```sql +SELECT s.id,s.name +FROM students_1350 s +LEFT JOIN departments_1350 d ON s.department_id = d.id +WHERE d.id IS NULL; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `id`, `name` from `students`, `departments`. + +### Result Grain + +Row grain follows the post-filter join output. + +### Step-by-Step Logic + +1. Combine datasets using LEFT JOIN, JOIN. Join predicates control row matching and prevent accidental cartesian growth. +2. Apply row-level filtering in `WHERE`: d.id IS NULL. +3. 
Project final output columns: `id`, `name`. + +### Why This Works + +Join conditions align related entities so each output row is built from the correct source records. Predicate filtering removes irrelevant rows before expensive downstream computation. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are join operations. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently. + diff --git a/easy/1350. Students With Invalid Departments (Easy).sql b/easy/1350. Students With Invalid Departments (Easy).sql deleted file mode 100644 index c7ec62f..0000000 --- a/easy/1350. Students With Invalid Departments (Easy).sql +++ /dev/null @@ -1,4 +0,0 @@ -SELECT s.id,s.name -FROM students_1350 s -LEFT JOIN departments_1350 d ON s.department_id = d.id -WHERE d.id IS NULL; diff --git a/easy/1378. Replace Employee ID With The Unique Identifier (Easy).md b/easy/1378. Replace Employee ID With The Unique Identifier (Easy).md new file mode 100644 index 0000000..02c2ec2 --- /dev/null +++ b/easy/1378. Replace Employee ID With The Unique Identifier (Easy).md @@ -0,0 +1,79 @@ +# Question 1378: Replace Employee ID With The Unique Identifier + +**LeetCode URL:** https://leetcode.com/problems/replace-employee-id-with-the-unique-identifier/ + +## Description + +Write a solution to show the unique ID of each user; if a user does not have a unique ID, show null instead. Return the result table in any order. The result format is in the following example.
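The LEFT JOIN behavior, every employee kept and missing unique IDs surfacing as NULL, can be mimicked in a short Python sketch (illustrative only) over the sample rows:

```python
# Mirror of the sample Employees and EmployeeUNI rows.
employees = [(1, "Alice"), (7, "Bob"), (11, "Meir"),
             (90, "Winston"), (3, "Jonathan")]
employee_uni = {3: 1, 11: 2, 90: 3}

# LEFT JOIN keeps every employee; dict.get returns None (SQL NULL)
# when no matching unique_id exists.
result = [(employee_uni.get(emp_id), name)
          for emp_id, name in sorted(employees, key=lambda e: e[1])]
print(result)
```

An INNER JOIN would silently drop Alice and Bob here, which is exactly the cardinality pitfall the breakdown warns about.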
+ +## Table Schema Structure + +```sql +Create table If Not Exists Employees (id int, name varchar(20)); +Create table If Not Exists EmployeeUNI (id int, unique_id int); +``` + +## Sample Input Data + +```sql +insert into Employees (id, name) values ('1', 'Alice'); +insert into Employees (id, name) values ('7', 'Bob'); +insert into Employees (id, name) values ('11', 'Meir'); +insert into Employees (id, name) values ('90', 'Winston'); +insert into Employees (id, name) values ('3', 'Jonathan'); +insert into EmployeeUNI (id, unique_id) values ('3', '1'); +insert into EmployeeUNI (id, unique_id) values ('11', '2'); +insert into EmployeeUNI (id, unique_id) values ('90', '3'); +``` + +## Expected Output Data + +```text ++-----------+----------+ +| unique_id | name | ++-----------+----------+ +| null | Alice | +| null | Bob | +| 2 | Meir | +| 3 | Winston | +| 1 | Jonathan | ++-----------+----------+ +``` + +## SQL Solution + +```sql +SELECT eu.unique_id,e.name +FROM employee_1378 e +LEFT JOIN employee_uni_1378 eu ON e.id = eu.id +ORDER BY e.name; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `unique_id`, `name` from `employee`, `employee_uni`. + +### Result Grain + +Row grain follows the post-filter join output. + +### Step-by-Step Logic + +1. Combine datasets using LEFT JOIN, JOIN. Join predicates control row matching and prevent accidental cartesian growth. +2. Project final output columns: `unique_id`, `name`. +3. Order output deterministically with `ORDER BY e.name`. + +### Why This Works + +Join conditions align related entities so each output row is built from the correct source records. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are sorting/grouping, join operations. Indexes on join/filter/group keys typically provide the biggest speedup. 
+ +### Common Pitfalls + +- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently. + diff --git a/easy/1378. Replace Employee ID With The Unique Identifier (Easy).sql b/easy/1378. Replace Employee ID With The Unique Identifier (Easy).sql deleted file mode 100644 index 37f6053..0000000 --- a/easy/1378. Replace Employee ID With The Unique Identifier (Easy).sql +++ /dev/null @@ -1,4 +0,0 @@ -SELECT eu.unique_id,e.name -FROM employee_1378 e -LEFT JOIN employee_uni_1378 eu ON e.id = eu.id -ORDER BY e.name; diff --git a/easy/1407. Top Travellers (Easy).md b/easy/1407. Top Travellers (Easy).md new file mode 100644 index 0000000..5512fcd --- /dev/null +++ b/easy/1407. Top Travellers (Easy).md @@ -0,0 +1,92 @@ +# Question 1407: Top Travellers + +**LeetCode URL:** https://leetcode.com/problems/top-travellers/ + +## Description + +Write a solution to report the distance traveled by each user. Return the result table ordered by travelled_distance in descending order, if two or more users traveled the same distance, order them by their name in ascending order. The result format is in the following example. 
+ +## Table Schema Structure + +```sql +Create Table If Not Exists Users (id int, name varchar(30)); +Create Table If Not Exists Rides (id int, user_id int, distance int); +``` + +## Sample Input Data + +```sql +insert into Users (id, name) values ('1', 'Alice'); +insert into Users (id, name) values ('2', 'Bob'); +insert into Users (id, name) values ('3', 'Alex'); +insert into Users (id, name) values ('4', 'Donald'); +insert into Users (id, name) values ('7', 'Lee'); +insert into Users (id, name) values ('13', 'Jonathan'); +insert into Users (id, name) values ('19', 'Elvis'); +insert into Rides (id, user_id, distance) values ('1', '1', '120'); +insert into Rides (id, user_id, distance) values ('2', '2', '317'); +insert into Rides (id, user_id, distance) values ('3', '3', '222'); +insert into Rides (id, user_id, distance) values ('4', '7', '100'); +insert into Rides (id, user_id, distance) values ('5', '13', '312'); +insert into Rides (id, user_id, distance) values ('6', '19', '50'); +insert into Rides (id, user_id, distance) values ('7', '7', '120'); +insert into Rides (id, user_id, distance) values ('8', '19', '400'); +insert into Rides (id, user_id, distance) values ('9', '7', '230'); +``` + +## Expected Output Data + +```text ++----------+--------------------+ +| name | travelled_distance | ++----------+--------------------+ +| Elvis | 450 | +| Lee | 450 | +| Bob | 317 | +| Jonathan | 312 | +| Alex | 222 | +| Alice | 120 | +| Donald | 0 | ++----------+--------------------+ +``` + +## SQL Solution + +```sql +SELECT u.name,COALESCE(SUM(r.distance),0) AS travelled_distance +FROM users_1407 u +LEFT JOIN rides_1407 r ON u.id = r.user_id +GROUP BY u.name +ORDER BY travelled_distance DESC,u.name; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `name`, `travelled_distance` from `users`, `rides`. + +### Result Grain + +One row per unique key in `GROUP BY u.name`. + +### Step-by-Step Logic + +1. 
Combine the tables with a LEFT JOIN. Join predicates control row matching and prevent accidental cartesian growth. +2. Aggregate rows with SUM grouped by u.name. +3. Project final output columns: `name`, `travelled_distance`. +4. Order output deterministically with `ORDER BY travelled_distance DESC,u.name`. + +### Why This Works + +Join conditions align related entities so each output row is built from the correct source records. Grouping keys define the reporting grain; aggregate functions then summarize values at exactly that grain. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are sorting/grouping, join operations. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently. +- Every non-aggregated selected column must belong to the grouping grain. + diff --git a/easy/1407. Top Travellers (Easy).sql b/easy/1407. Top Travellers (Easy).sql deleted file mode 100644 index 9a38028..0000000 --- a/easy/1407. Top Travellers (Easy).sql +++ /dev/null @@ -1,5 +0,0 @@ -SELECT u.name,COALESCE(SUM(r.distance),0) AS travelled_distance -FROM users_1407 u -LEFT JOIN rides_1407 r ON u.id = r.user_id -GROUP BY u.name -ORDER BY travelled_distance DESC,u.name; diff --git a/easy/1421. NPV Queries.md b/easy/1421. NPV Queries.md new file mode 100644 index 0000000..f8c5599 --- /dev/null +++ b/easy/1421. NPV Queries.md @@ -0,0 +1,86 @@ +# Question 1421: NPV Queries + +**LeetCode URL:** https://leetcode.com/problems/npv-queries/ + +## Description + +Write an SQL query to find the npv of each query in the Queries table. Return the result table in any order. 
The query result format is in the following example: NPV table: +------+--------+--------+ | id | year | npv | +------+--------+--------+ | 1 | 2018 | 100 | | 7 | 2020 | 30 | | 13 | 2019 | 40 | | 1 | 2019 | 113 | | 2 | 2008 | 121 | | 3 | 2009 | 12 | | 11 | 2020 | 99 | | 7 | 2019 | 0 | +------+--------+--------+ Queries table: +------+--------+ | id | year | +------+--------+ | 1 | 2019 | | 2 | 2008 | | 3 | 2009 | | 7 | 2018 | | 7 | 2019 | | 7 | 2020 | | 13 | 2019 | +------+--------+ Result table: +------+--------+--------+ | id | year | npv | +------+--------+--------+ | 1 | 2019 | 113 | | 2 | 2008 | 121 | | 3 | 2009 | 12 | | 7 | 2018 | 0 | | 7 | 2019 | 0 | | 7 | 2020 | 30 | | 13 | 2019 | 40 | +------+--------+--------+ The npv value of (7, 2018) is not present in the NPV table, so we consider it 0. + +## Table Schema Structure + +```sql +Create Table If Not Exists NPV (id int, year int, npv int); +Create Table If Not Exists Queries (id int, year int); +``` + +## Sample Input Data + +```sql +insert into NPV (id, year, npv) values ('1', '2018', '100'); +insert into NPV (id, year, npv) values ('7', '2020', '30'); +insert into NPV (id, year, npv) values ('13', '2019', '40'); +insert into NPV (id, year, npv) values ('1', '2019', '113'); +insert into NPV (id, year, npv) values ('2', '2008', '121'); +insert into NPV (id, year, npv) values ('3', '2009', '12'); +insert into NPV (id, year, npv) values ('11', '2020', '99'); +insert into NPV (id, year, npv) values ('7', '2019', '0'); +insert into Queries (id, year) values ('1', '2019'); +insert into Queries (id, year) values ('2', '2008'); +insert into Queries (id, year) values ('3', '2009'); +insert into Queries (id, year) values ('7', '2018'); +insert into Queries (id, year) values ('7', '2019'); +insert into Queries (id, year) values ('7', '2020'); +insert into Queries (id, year) values ('13', '2019'); +``` + +## Expected Output Data + +```text ++------+--------+--------+ +| id | year | npv | ++------+--------+--------+ +| 1 | 2019 | 113 | +| 2 | 2008 | 121 | +| 3 | 2009 | 12 | +| 7 | 2018 | 0 | +| 7 | 2019 | 0 | +| 7 | 2020 | 30 | +| 13 | 2019 | 40 | ++------+--------+--------+ +``` + +## SQL Solution + +```sql +SELECT q.id,q.year,COALESCE(n.npv,0) AS npv +FROM queries_1421 q +LEFT JOIN npv_1421 n ON q.id = n.id AND q.year = n.year; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `id`, `year`, `npv` from `queries`, `npv`. + +### Result Grain + +Row grain follows the post-filter join output. + +### Step-by-Step Logic + +1. Combine the tables with a LEFT JOIN on both id and year, so every queried (id, year) pair is kept. +2. Project final output columns: `id`, `year`, `npv`, using COALESCE to default a missing npv to 0. + +### Why This Works + +Join conditions align related entities so each output row is built from the correct source records. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are join operations. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently. + diff --git a/easy/1421. NPV Queries.sql b/easy/1421. NPV Queries.sql deleted file mode 100644 index ce3343a..0000000 --- a/easy/1421. NPV Queries.sql +++ /dev/null @@ -1,3 +0,0 @@ -SELECT q.id,q.year,COALESCE(n.npv,0) -FROM queries_1421 q -LEFT JOIN npv_1421 n ON q.id = n.id AND q.year = n.year; diff --git a/easy/1435. Create a Session Bar Chart (Easy).md b/easy/1435. Create a Session Bar Chart (Easy).md new file mode 100644 index 0000000..67a836c --- /dev/null +++ b/easy/1435. Create a Session Bar Chart (Easy).md @@ -0,0 +1,89 @@ +# Question 1435: Create a Session Bar Chart + +**LeetCode URL:** https://leetcode.com/problems/create-a-session-bar-chart/ + +## Description + +Write an SQL query to report the (bin, total) in any order. The query result format is in the following example. 
+ +## Table Schema Structure + +```sql +Create table If Not Exists Sessions (session_id int, duration int); +``` + +## Sample Input Data + +```sql +insert into Sessions (session_id, duration) values ('1', '30'); +insert into Sessions (session_id, duration) values ('2', '199'); +insert into Sessions (session_id, duration) values ('3', '299'); +insert into Sessions (session_id, duration) values ('4', '580'); +insert into Sessions (session_id, duration) values ('5', '1000'); +``` + +## Expected Output Data + +```text ++--------------+--------------+ +| bin | total | ++--------------+--------------+ +| [0-5> | 3 | +| [5-10> | 1 | +| [10-15> | 0 | +| 15 or more | 1 | ++--------------+--------------+ +``` + +## SQL Solution + +```sql +WITH bins AS ( + SELECT '[0-5>' AS bin, 0 AS min_duration, 5*60 AS max_duration + UNION ALL + SELECT '[5-10>' AS bin, 5*60 AS min_duration, 10*60 AS max_duration + UNION ALL + SELECT '[10-15>' AS bin, 10*60 AS min_duration, 15*60 AS max_duration + UNION ALL + SELECT '15 or more' AS bin, 15*60 as min_duration, 2147483647 AS max_duration +) +SELECT b.bin, COUNT(s.session_id) AS total +FROM bins b +LEFT JOIN sessions_1435 s + ON s.duration >= min_duration + AND s.duration < max_duration +GROUP BY b.bin; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `bin`, `total` from `bins`, `sessions`. + +### Result Grain + +One row per unique key in `GROUP BY b.bin`. + +### Step-by-Step Logic + +1. Create CTE layers (`bins`) to decompose the logic into smaller, testable steps before the final SELECT. +2. CTE `bins`: unions compatible rows. +3. Combine datasets using LEFT JOIN, JOIN. Join predicates control row matching and prevent accidental cartesian growth. +4. Aggregate rows with COUNT grouped by b.bin. +5. Project final output columns: `bin`, `total`. +6. Merge compatible result sets with `UNION`/`UNION ALL` before final projection. 
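The binning steps above can be exercised end to end in SQLite, which also supports CTEs and the half-open range join; a hedged sketch via Python's built-in `sqlite3` module, using this file's sample durations:

```python
import sqlite3

# In-memory copy of sessions_1435 with this file's sample rows (durations in seconds).
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE sessions_1435 (session_id INT, duration INT)")
cur.executemany("INSERT INTO sessions_1435 VALUES (?, ?)",
                [(1, 30), (2, 199), (3, 299), (4, 580), (5, 1000)])

# Each bin is a half-open range [min_duration, max_duration) in seconds;
# the LEFT JOIN guarantees empty bins still appear with a count of 0.
totals = dict(cur.execute("""
    WITH bins AS (
        SELECT '[0-5>' AS bin, 0 AS min_duration, 5*60 AS max_duration
        UNION ALL SELECT '[5-10>', 5*60, 10*60
        UNION ALL SELECT '[10-15>', 10*60, 15*60
        UNION ALL SELECT '15 or more', 15*60, 2147483647
    )
    SELECT b.bin, COUNT(s.session_id) AS total
    FROM bins b
    LEFT JOIN sessions_1435 s
      ON s.duration >= b.min_duration AND s.duration < b.max_duration
    GROUP BY b.bin
""").fetchall())
print(sorted(totals.items()))
# [('15 or more', 1), ('[0-5>', 3), ('[10-15>', 0), ('[5-10>', 1)]
```

Note that the `[10-15>` bin has no sessions yet still appears with a zero count, because the `bins` CTE, not the sessions table, drives the join.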
+ +### Why This Works + +CTEs separate transformation stages, which makes dependencies explicit and easier to validate. Join conditions align related entities so each output row is built from the correct source records. Grouping keys define the reporting grain; aggregate functions then summarize values at exactly that grain. Set-union logic combines multiple valid pathways into one consistent output. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are sorting/grouping, join operations. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently. +- Every non-aggregated selected column must belong to the grouping grain. + diff --git a/easy/1435. Create a Session Bar Chart (Easy).sql b/easy/1435. Create a Session Bar Chart (Easy).sql deleted file mode 100644 index 4dc475c..0000000 --- a/easy/1435. Create a Session Bar Chart (Easy).sql +++ /dev/null @@ -1,15 +0,0 @@ -WITH bins AS ( - SELECT '[0-5>' AS bin, 0 AS min_duration, 5*60 AS max_duration - UNION ALL - SELECT '[5-10>' AS bin, 5*60 AS min_duration, 10*60 AS max_duration - UNION ALL - SELECT '[10-15>' AS bin, 10*60 AS min_duration, 15*60 AS max_duration - UNION ALL - SELECT '15 or more' AS bin, 15*60 as min_duration, 2147483647 AS max_duration -) -SELECT b.bin, COUNT(s.session_id) AS total -FROM bins b -LEFT JOIN sessions_1435 s - ON s.duration >= min_duration - AND s.duration < max_duration -GROUP BY b.bin; diff --git a/easy/1484. Group Sold Products By The Date (Easy).md b/easy/1484. Group Sold Products By The Date (Easy).md new file mode 100644 index 0000000..a362e1f --- /dev/null +++ b/easy/1484. 
Group Sold Products By The Date (Easy).md @@ -0,0 +1,75 @@ +# Question 1484: Group Sold Products By The Date + +**LeetCode URL:** https://leetcode.com/problems/group-sold-products-by-the-date/ + +## Description + +Write a solution to find for each date the number of different products sold and their names. Return the result table ordered by sell_date. The result format is in the following example. + +## Table Schema Structure + +```sql +Create table If Not Exists Activities (sell_date date, product varchar(20)); +``` + +## Sample Input Data + +```sql +insert into Activities (sell_date, product) values ('2020-05-30', 'Headphone'); +insert into Activities (sell_date, product) values ('2020-06-01', 'Pencil'); +insert into Activities (sell_date, product) values ('2020-06-02', 'Mask'); +insert into Activities (sell_date, product) values ('2020-05-30', 'Basketball'); +insert into Activities (sell_date, product) values ('2020-06-01', 'Bible'); +insert into Activities (sell_date, product) values ('2020-06-02', 'Mask'); +insert into Activities (sell_date, product) values ('2020-05-30', 'T-Shirt'); +``` + +## Expected Output Data + +```text ++------------+----------+------------------------------+ +| sell_date | num_sold | products | ++------------+----------+------------------------------+ +| 2020-05-30 | 3 | Basketball,Headphone,T-Shirt | +| 2020-06-01 | 2 | Bible,Pencil | +| 2020-06-02 | 1 | Mask | ++------------+----------+------------------------------+ +``` + +## SQL Solution + +```sql +SELECT sell_date,COUNT(DISTINCT product) AS num_sold,STRING_AGG(DISTINCT product,',' ORDER BY product) AS products +FROM activities_1484 +GROUP BY sell_date +ORDER BY sell_date; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `sell_date`, `num_sold`, `products` from `activities`. + +### Result Grain + +One row per unique key in `GROUP BY sell_date`. + +### Step-by-Step Logic + +1. Aggregate rows grouped by sell_date: COUNT(DISTINCT product) counts the unique products, while STRING_AGG(DISTINCT ...) concatenates their names in alphabetical order. +2. 
Project final output columns: `sell_date`, `num_sold`, `products`. +3. Order output deterministically with `ORDER BY sell_date`. + +### Why This Works + +Grouping keys define the reporting grain; aggregate functions then summarize values at exactly that grain. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are sorting/grouping. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Every non-aggregated selected column must belong to the grouping grain. + diff --git a/easy/1484. Group Sold Products By The Date (Easy).sql b/easy/1484. Group Sold Products By The Date (Easy).sql deleted file mode 100644 index ee7e6c9..0000000 --- a/easy/1484. Group Sold Products By The Date (Easy).sql +++ /dev/null @@ -1,4 +0,0 @@ -SELECT sell_date,COUNT(DISTINCT product) AS num_sold,STRING_AGG(DISTINCT product,',' ORDER BY product) AS products -FROM activities_1484 -GROUP BY sell_date -ORDER BY sell_date; diff --git a/easy/1495. Friendly Movies Streamed Last Month (Easy).md b/easy/1495. Friendly Movies Streamed Last Month (Easy).md new file mode 100644 index 0000000..a616dad --- /dev/null +++ b/easy/1495. Friendly Movies Streamed Last Month (Easy).md @@ -0,0 +1,80 @@ +# Question 1495: Friendly Movies Streamed Last Month + +**LeetCode URL:** https://leetcode.com/problems/friendly-movies-streamed-last-month/ + +## Description + +Write an SQL query to report the distinct titles of the kid-friendly movies streamed in June 2020. Return the result table in any order. The query result format is in the following example. 
+ +## Table Schema Structure + +```sql +Create table If Not Exists TVProgram (program_date date, content_id int, channel varchar(30)); +Create table If Not Exists Content (content_id varchar(30), title varchar(30), Kids_content ENUM('Y', 'N'), content_type varchar(30)); +``` + +## Sample Input Data + +```sql +insert into TVProgram (program_date, content_id, channel) values ('2020-06-10 08:00', '1', 'LC-Channel'); +insert into TVProgram (program_date, content_id, channel) values ('2020-05-11 12:00', '2', 'LC-Channel'); +insert into TVProgram (program_date, content_id, channel) values ('2020-05-12 12:00', '3', 'LC-Channel'); +insert into TVProgram (program_date, content_id, channel) values ('2020-05-13 14:00', '4', 'Disney Ch'); +insert into TVProgram (program_date, content_id, channel) values ('2020-06-18 14:00', '4', 'Disney Ch'); +insert into TVProgram (program_date, content_id, channel) values ('2020-07-15 16:00', '5', 'Disney Ch'); +insert into Content (content_id, title, Kids_content, content_type) values ('1', 'Leetcode Movie', 'N', 'Movies'); +insert into Content (content_id, title, Kids_content, content_type) values ('2', 'Alg. for Kids', 'Y', 'Series'); +insert into Content (content_id, title, Kids_content, content_type) values ('3', 'Database Sols', 'N', 'Series'); +insert into Content (content_id, title, Kids_content, content_type) values ('4', 'Aladdin', 'Y', 'Movies'); +insert into Content (content_id, title, Kids_content, content_type) values ('5', 'Cinderella', 'Y', 'Movies'); +``` + +## Expected Output Data + +```text ++--------------+ +| title | ++--------------+ +| Aladdin | ++--------------+ +``` + +## SQL Solution + +```sql +SELECT DISTINCT c.title +FROM tv_program_1495 t +INNER JOIN content_1495 c +ON t.content_id = c.content_id AND + c.kids_content = 'Y' AND + DATE_TRUNC('MONTH',t.program_date)::DATE = '2020-06-01'; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `title` from `tv_program`, `content`. 
+ +### Result Grain + +One row per distinct combination of projected columns. + +### Step-by-Step Logic + +1. Combine datasets using INNER JOIN, JOIN. Join predicates control row matching and prevent accidental cartesian growth. +2. Project final output columns: `title`. +3. Remove duplicate result tuples using `DISTINCT` where uniqueness is required. + +### Why This Works + +Join conditions align related entities so each output row is built from the correct source records. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are join operations. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently. + diff --git a/easy/1495. Friendly Movies Streamed Last Month (Easy).sql b/easy/1495. Friendly Movies Streamed Last Month (Easy).sql deleted file mode 100644 index 294954b..0000000 --- a/easy/1495. Friendly Movies Streamed Last Month (Easy).sql +++ /dev/null @@ -1,6 +0,0 @@ -SELECT DISTINCT c.title -FROM tv_program_1495 t -INNER JOIN content_1495 c -ON t.content_id = c.content_id AND - c.kids_content = 'Y' AND - DATE_TRUNC('MONTH',t.program_date)::DATE = '2020-06-01'; diff --git a/easy/1511. Customer Order Frequency (Easy).md b/easy/1511. Customer Order Frequency (Easy).md new file mode 100644 index 0000000..e602a0f --- /dev/null +++ b/easy/1511. Customer Order Frequency (Easy).md @@ -0,0 +1,98 @@ +# Question 1511: Customer Order Frequency + +**LeetCode URL:** https://leetcode.com/problems/customer-order-frequency/ + +## Description + +Write an SQL query to report the customer_id and customer_name of customers who have spent at least $100 in each month of June and July 2020. Return the result table in any order. The query result format is in the following example. 
+ +## Table Schema Structure + +```sql +Create table If Not Exists Customers (customer_id int, name varchar(30), country varchar(30)); +Create table If Not Exists Product (product_id int, description varchar(30), price int); +Create table If Not Exists Orders (order_id int, customer_id int, product_id int, order_date date, quantity int); +``` + +## Sample Input Data + +```sql +insert into Customers (customer_id, name, country) values ('1', 'Winston', 'USA'); +insert into Customers (customer_id, name, country) values ('2', 'Jonathan', 'Peru'); +insert into Customers (customer_id, name, country) values ('3', 'Moustafa', 'Egypt'); +insert into Product (product_id, description, price) values ('10', 'LC Phone', '300'); +insert into Product (product_id, description, price) values ('20', 'LC T-Shirt', '10'); +insert into Product (product_id, description, price) values ('30', 'LC Book', '45'); +insert into Product (product_id, description, price) values ('40', 'LC Keychain', '2'); +insert into Orders (order_id, customer_id, product_id, order_date, quantity) values ('1', '1', '10', '2020-06-10', '1'); +insert into Orders (order_id, customer_id, product_id, order_date, quantity) values ('2', '1', '20', '2020-07-01', '1'); +insert into Orders (order_id, customer_id, product_id, order_date, quantity) values ('3', '1', '30', '2020-07-08', '2'); +insert into Orders (order_id, customer_id, product_id, order_date, quantity) values ('4', '2', '10', '2020-06-15', '2'); +insert into Orders (order_id, customer_id, product_id, order_date, quantity) values ('5', '2', '40', '2020-07-01', '10'); +insert into Orders (order_id, customer_id, product_id, order_date, quantity) values ('6', '3', '20', '2020-06-24', '2'); +insert into Orders (order_id, customer_id, product_id, order_date, quantity) values ('7', '3', '30', '2020-06-25', '2'); +insert into Orders (order_id, customer_id, product_id, order_date, quantity) values ('9', '3', '30', '2020-05-08', '3'); +``` + +## Expected Output Data + 
+```text ++--------------+------------+ +| customer_id | name | ++--------------+------------+ +| 1 | Winston | ++--------------+------------+ +``` + +## SQL Solution + +```sql +WITH par_spent AS ( + SELECT c.name,DATE_TRUNC('MONTH',o.order_date)::DATE,SUM(quantity*price) AS spent + FROM orders_1511 o + INNER JOIN product_1511 p + ON o.product_id = p.product_id AND + (DATE_TRUNC('MONTH',o.order_date)::DATE = '2020-06-01' OR DATE_TRUNC('MONTH',o.order_date)::DATE = '2020-07-01') + INNER JOIN customers_1511 c + ON c.customer_id = o.customer_id + GROUP BY c.name,DATE_TRUNC('MONTH',o.order_date)::DATE + HAVING SUM(quantity*price)>=100 +) +SELECT name +FROM par_spent +GROUP BY name +HAVING COUNT(name) = 2; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `name` from `orders`, `product`, `customers`, `par_spent`. + +### Result Grain + +One row per unique key in `GROUP BY name`. + +### Step-by-Step Logic + +1. Create CTE layers (`par_spent`) to decompose the logic into smaller, testable steps before the final SELECT. +2. CTE `par_spent`: reads `orders`, `product`, `customers`, joins related entities. +3. Combine datasets using INNER JOIN, JOIN. Join predicates control row matching and prevent accidental cartesian growth. +4. Aggregate rows with COUNT, SUM grouped by name. +5. Project final output columns: `name`. +6. Filter aggregated groups in `HAVING`: COUNT(name) = 2. + +### Why This Works + +CTEs separate transformation stages, which makes dependencies explicit and easier to validate. Join conditions align related entities so each output row is built from the correct source records. Grouping keys define the reporting grain; aggregate functions then summarize values at exactly that grain. `HAVING` ensures only groups that satisfy business thresholds survive. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are sorting/grouping, join operations. 
Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently. +- Every non-aggregated selected column must belong to the grouping grain. + diff --git a/easy/1511. Customer Order Frequency (Easy).sql b/easy/1511. Customer Order Frequency (Easy).sql deleted file mode 100644 index 86e88d2..0000000 --- a/easy/1511. Customer Order Frequency (Easy).sql +++ /dev/null @@ -1,15 +0,0 @@ -WITH par_spent AS ( - SELECT c.name,DATE_TRUNC('MONTH',o.order_date)::DATE,SUM(quantity*price) AS spent - FROM orders_1511 o - INNER JOIN product_1511 p - ON o.product_id = p.product_id AND - (DATE_TRUNC('MONTH',o.order_date)::DATE = '2020-06-01' OR DATE_TRUNC('MONTH',o.order_date)::DATE = '2020-07-01') - INNER JOIN customers_1511 c - ON c.customer_id = o.customer_id - GROUP BY c.name,DATE_TRUNC('MONTH',o.order_date)::DATE - HAVING SUM(quantity*price)>=100 -) -SELECT name -FROM par_spent -GROUP BY name -HAVING COUNT(name) = 2; diff --git a/easy/1517. Find Users With Valid E-Mails (Easy).md b/easy/1517. Find Users With Valid E-Mails (Easy).md new file mode 100644 index 0000000..65c43b5 --- /dev/null +++ b/easy/1517. Find Users With Valid E-Mails (Easy).md @@ -0,0 +1,72 @@ +# Question 1517: Find Users With Valid E-Mails + +**LeetCode URL:** https://leetcode.com/problems/find-users-with-valid-e-mails/ + +## Description + +Write a solution to find the users who have valid emails. Return the result table in any order. The result format is in the following example. 
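Before reaching for SQL, the validity rule (a prefix that starts with a letter and may then contain letters, digits, underscore, period, or dash, followed by exactly the @leetcode.com domain) can be prototyped with an ordinary regular expression. A hedged sketch using Python's `re` module — the pattern mirrors the `SIMILAR TO` expression used in the SQL solution below, and `re.fullmatch` reproduces `SIMILAR TO`'s whole-string matching:

```python
import re

# Same character rules as the SIMILAR TO pattern in the SQL solution.
VALID_MAIL = re.compile(r"[a-zA-Z][a-zA-Z0-9_.-]*@leetcode\.com")

samples = {
    "winston@leetcode.com": True,      # starts with a letter, allowed charset
    "jonathanisgreat": False,          # no domain at all
    "bella-@leetcode.com": True,       # '-' is allowed after the first letter
    "sally.come@leetcode.com": True,   # '.' is allowed after the first letter
    "quarz#2020@leetcode.com": False,  # '#' is not in the allowed charset
    "david69@gmail.com": False,        # wrong domain
    ".shapo@leetcode.com": False,      # must start with a letter
}
results = {mail: bool(VALID_MAIL.fullmatch(mail)) for mail in samples}
print(results == samples)  # True
```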
+ +## Table Schema Structure + +```sql +Create table If Not Exists Users (user_id int, name varchar(30), mail varchar(50)); +``` + +## Sample Input Data + +```sql +insert into Users (user_id, name, mail) values ('1', 'Winston', 'winston@leetcode.com'); +insert into Users (user_id, name, mail) values ('2', 'Jonathan', 'jonathanisgreat'); +insert into Users (user_id, name, mail) values ('3', 'Annabelle', 'bella-@leetcode.com'); +insert into Users (user_id, name, mail) values ('4', 'Sally', 'sally.come@leetcode.com'); +insert into Users (user_id, name, mail) values ('5', 'Marwan', 'quarz#2020@leetcode.com'); +insert into Users (user_id, name, mail) values ('6', 'David', 'david69@gmail.com'); +insert into Users (user_id, name, mail) values ('7', 'Shapiro', '.shapo@leetcode.com'); +``` + +## Expected Output Data + +```text ++---------+-----------+-------------------------+ +| user_id | name | mail | ++---------+-----------+-------------------------+ +| 1 | Winston | winston@leetcode.com | +| 3 | Annabelle | bella-@leetcode.com | +| 4 | Sally | sally.come@leetcode.com | ++---------+-----------+-------------------------+ +``` + +## SQL Solution + +```sql +SELECT * +FROM users_1517 +WHERE mail SIMILAR TO '[a-zA-Z][a-zA-Z0-9_.-]*@leetcode[.]com'; +``` + +## Solution Breakdown + +### Goal + +The query returns every column (`SELECT *`) of the matching rows from `users`. + +### Result Grain + +One row per user whose mail passes the filter. + +### Step-by-Step Logic + +1. Apply row-level filtering in `WHERE`: mail SIMILAR TO '[a-zA-Z][a-zA-Z0-9_.-]*@leetcode[.]com'. + +### Why This Works + +Predicate filtering removes irrelevant rows before expensive downstream computation. The pattern requires the prefix to start with a letter, allows letters, digits, underscore, period, and dash afterwards, and pins the domain to @leetcode.com; SIMILAR TO must match the entire string. + +### Performance Notes + +Primary cost is scanning source rows once; performance usually scales with input size. + +### Common Pitfalls + +- If nullable columns are involved, validate null-handling explicitly (`IS NULL`, `COALESCE`). + diff --git a/easy/1517. 
Find Users With Valid E-Mails (Easy).sql b/easy/1517. Find Users With Valid E-Mails (Easy).sql deleted file mode 100644 index 9fdc809..0000000 --- a/easy/1517. Find Users With Valid E-Mails (Easy).sql +++ /dev/null @@ -1,3 +0,0 @@ -SELECT * -FROM users_1517 -WHERE mail SIMILAR TO '[a-zA-Z][a-zA-Z0-9_.-]*@leetcode[.]com'; diff --git a/easy/1527. Patients With a Condition (Easy).md b/easy/1527. Patients With a Condition (Easy).md new file mode 100644 index 0000000..34b6d67 --- /dev/null +++ b/easy/1527. Patients With a Condition (Easy).md @@ -0,0 +1,69 @@ +# Question 1527: Patients With a Condition + +**LeetCode URL:** https://leetcode.com/problems/patients-with-a-condition/ + +## Description + +Write a solution to find the patient_id, patient_name, and conditions of the patients who have Type I Diabetes. Return the result table in any order. The result format is in the following example. + +## Table Schema Structure + +```sql +Create table If Not Exists Patients (patient_id int, patient_name varchar(30), conditions varchar(100)); +``` + +## Sample Input Data + +```sql +insert into Patients (patient_id, patient_name, conditions) values ('1', 'Daniel', 'YFEV COUGH'); +insert into Patients (patient_id, patient_name, conditions) values ('2', 'Alice', ''); +insert into Patients (patient_id, patient_name, conditions) values ('3', 'Bob', 'DIAB100 MYOP'); +insert into Patients (patient_id, patient_name, conditions) values ('4', 'George', 'ACNE DIAB100'); +insert into Patients (patient_id, patient_name, conditions) values ('5', 'Alain', 'DIAB201'); +``` + +## Expected Output Data + +```text ++------------+--------------+--------------+ +| patient_id | patient_name | conditions | ++------------+--------------+--------------+ +| 3 | Bob | DIAB100 MYOP | +| 4 | George | ACNE DIAB100 | ++------------+--------------+--------------+ +``` + +## SQL Solution + +```sql +SELECT * +FROM patients_1527 +WHERE conditions LIKE '% DIAB1%' OR conditions LIKE 'DIAB1%'; +``` + +## Solution 
Breakdown + +### Goal + +The query returns every column (`SELECT *`) of the matching rows from `patients`. + +### Result Grain + +One row per patient whose conditions match the filter. + +### Step-by-Step Logic + +1. Apply row-level filtering in `WHERE`: conditions LIKE '% DIAB1%' OR conditions LIKE 'DIAB1%'. + +### Why This Works + +Predicate filtering removes irrelevant rows before expensive downstream computation. The two LIKE patterns together require a condition code beginning with DIAB1 either at the start of the string or immediately after a space, so codes like DIAB201 do not match. + +### Performance Notes + +Primary cost is scanning source rows once; performance usually scales with input size. + +### Common Pitfalls + +- If nullable columns are involved, validate null-handling explicitly (`IS NULL`, `COALESCE`). + diff --git a/easy/1527. Patients With a Condition (Easy).sql b/easy/1527. Patients With a Condition (Easy).sql deleted file mode 100644 index 51a8007..0000000 --- a/easy/1527. Patients With a Condition (Easy).sql +++ /dev/null @@ -1,3 +0,0 @@ -SELECT * -FROM patients_1527 -WHERE conditions LIKE '% DIAB1%' OR conditions LIKE 'DIAB1%'; diff --git a/easy/1543. Fix Product Name Format (Easy).md b/easy/1543. Fix Product Name Format (Easy).md new file mode 100644 index 0000000..34e303d --- /dev/null +++ b/easy/1543. Fix Product Name Format (Easy).md @@ -0,0 +1,75 @@ +# Question 1543: Fix Product Name Format + +**LeetCode URL:** https://leetcode.com/problems/fix-product-name-format/ + +## Description + +Write an SQL query to report, for each product and month: product_name in lowercase without leading or trailing white spaces, sale_date in the format 'YYYY-MM', and total, the number of sales in that month. Return the result table ordered by product_name in ascending order; in case of a tie, order it by sale_date in ascending order. The query result format is in the following example. 
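The normalization pipeline this problem asks for — trim whitespace, lowercase the name, truncate the date to year-month, then count per group — can be cross-checked in plain Python; a minimal sketch over this file's sample rows:

```python
from collections import Counter

# This file's sample rows as (product_name, sale_date) pairs.
sales = [
    ("LCPHONE", "2000-01-16"), ("LCPhone", "2000-01-17"),
    ("LcPhOnE", "2000-02-18"), ("LCKeyCHAiN", "2000-02-19"),
    ("LCKeyChain", "2000-02-28"), ("Matryoshka", "2000-03-31"),
]

# strip() + lower() mirror TRIM/LOWER; slicing 'YYYY-MM-DD' to its first
# seven characters mirrors TO_CHAR(sale_date, 'YYYY-MM').
totals = Counter((name.strip().lower(), date[:7]) for name, date in sales)
print(totals[("lcphone", "2000-01")])     # 2
print(totals[("lckeychain", "2000-02")])  # 2
print(totals[("matryoshka", "2000-03")])  # 1
```

The three spellings of "lcphone" collapse into one key after normalization, which is exactly what grouping by `LOWER(TRIM(product_name))` achieves in the SQL below.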
+ +## Table Schema Structure + +```sql +Create table If Not Exists Sales (sale_id int, product_name varchar(30), sale_date date); +``` + +## Sample Input Data + +```sql +insert into Sales (sale_id, product_name, sale_date) values ('1', 'LCPHONE', '2000-01-16'); +insert into Sales (sale_id, product_name, sale_date) values ('2', 'LCPhone', '2000-01-17'); +insert into Sales (sale_id, product_name, sale_date) values ('3', 'LcPhOnE', '2000-02-18'); +insert into Sales (sale_id, product_name, sale_date) values ('4', 'LCKeyCHAiN', '2000-02-19'); +insert into Sales (sale_id, product_name, sale_date) values ('5', 'LCKeyChain', '2000-02-28'); +insert into Sales (sale_id, product_name, sale_date) values ('6', 'Matryoshka', '2000-03-31'); +``` + +## Expected Output Data + +```text ++--------------+--------------+----------+ +| product_name | sale_date | total | ++--------------+--------------+----------+ +| lcphone | 2000-01 | 2 | +| lckeychain | 2000-02 | 2 | +| lcphone | 2000-02 | 1 | +| matryoshka | 2000-03 | 1 | ++--------------+--------------+----------+ +``` + +## SQL Solution + +```sql +SELECT LOWER(TRIM(product_name)) AS product_name,TO_CHAR(sale_date,'YYYY-MM') AS sale_date,COUNT(sale_id) AS total +FROM sales_1543 +GROUP BY LOWER(TRIM(product_name)),TO_CHAR(sale_date,'YYYY-MM') +ORDER BY product_name,sale_date; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `product_name`, `sale_date`, `total` from `sales`. + +### Result Grain + +One row per unique key in `GROUP BY LOWER(TRIM(product_name)),TO_CHAR(sale_date,'YYYY-MM')`. + +### Step-by-Step Logic + +1. Aggregate rows with COUNT grouped by LOWER(TRIM(product_name)),TO_CHAR(sale_date,'YYYY-MM'). +2. Project final output columns: `product_name`, `sale_date`, `total`. +3. Order output deterministically with `ORDER BY product_name,sale_date`. + +### Why This Works + +Grouping keys define the reporting grain; aggregate functions then summarize values at exactly that grain. 
The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are sorting/grouping. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Every non-aggregated selected column must belong to the grouping grain. + diff --git a/easy/1543. Fix Product Name Format (Easy).sql b/easy/1543. Fix Product Name Format (Easy).sql deleted file mode 100644 index 257447b..0000000 --- a/easy/1543. Fix Product Name Format (Easy).sql +++ /dev/null @@ -1,4 +0,0 @@ -SELECT LOWER(TRIM(product_name)) AS product_name,TO_CHAR(sale_date,'YYYY-MM') AS sale_date,COUNT(sale_id) AS total -FROM sales_1543 -GROUP BY LOWER(TRIM(product_name)),TO_CHAR(sale_date,'YYYY-MM') -ORDER BY product_name,sale_date; diff --git a/easy/1565. Unique Orders and Customers Per Month (Easy).md b/easy/1565. Unique Orders and Customers Per Month (Easy).md new file mode 100644 index 0000000..b43db34 --- /dev/null +++ b/easy/1565. Unique Orders and Customers Per Month (Easy).md @@ -0,0 +1,82 @@ +# Question 1565: Unique Orders and Customers Per Month + +**LeetCode URL:** https://leetcode.com/problems/unique-orders-and-customers-per-month/ + +## Description + +Write an SQL query to find the number of unique orders and the number of unique users with invoices > $20 for each different month. Return the result table sorted in any order. 
The query result format is in the following example; in September 2020 we have two orders from 2 different customers with invoices > $20. + +## Table Schema Structure + +```sql +Create table If Not Exists Orders (order_id int, order_date date, customer_id int, invoice int); +``` + +## Sample Input Data + +```sql +insert into Orders (order_id, order_date, customer_id, invoice) values ('1', '2020-09-15', '1', '30'); +insert into Orders (order_id, order_date, customer_id, invoice) values ('2', '2020-09-17', '2', '90'); +insert into Orders (order_id, order_date, customer_id, invoice) values ('3', '2020-10-06', '3', '20'); +insert into Orders (order_id, order_date, customer_id, invoice) values ('4', '2020-10-20', '3', '21'); +insert into Orders (order_id, order_date, customer_id, invoice) values ('5', '2020-11-10', '1', '10'); +insert into Orders (order_id, order_date, customer_id, invoice) values ('6', '2020-11-21', '2', '15'); +insert into Orders (order_id, order_date, customer_id, invoice) values ('7', '2020-12-01', '4', '55'); +insert into Orders (order_id, order_date, customer_id, invoice) values ('8', '2020-12-03', '4', '77'); +insert into Orders (order_id, order_date, customer_id, invoice) values ('9', '2021-01-07', 
'3', '31'); +insert into Orders (order_id, order_date, customer_id, invoice) values ('10', '2021-01-15', '2', '20'); +``` + +## Expected Output Data + +```text ++---------+-------------+----------------+ +| month | order_count | customer_count | ++---------+-------------+----------------+ +| 2020-09 | 2 | 2 | +| 2020-10 | 1 | 1 | +| 2020-12 | 2 | 1 | +| 2021-01 | 1 | 1 | ++---------+-------------+----------------+ +``` + +## SQL Solution + +```sql +SELECT TO_CHAR(order_date,'YYYY-MM') AS month, + COUNT(order_id) AS order_count, + COUNT(DISTINCT customer_id) AS customer_count +FROM orders_1565 +WHERE invoice > 20 +GROUP BY TO_CHAR(order_date,'YYYY-MM'); +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `month`, `order_count`, `customer_count` from `orders`. + +### Result Grain + +One row per unique key in `GROUP BY TO_CHAR(order_date,'YYYY-MM')`. + +### Step-by-Step Logic + +1. Apply row-level filtering in `WHERE`: invoice > 20. +2. Aggregate rows with COUNT grouped by TO_CHAR(order_date,'YYYY-MM'). +3. Project final output columns: `month`, `order_count`, `customer_count`. + +### Why This Works + +Grouping keys define the reporting grain; aggregate functions then summarize values at exactly that grain. Predicate filtering removes irrelevant rows before expensive downstream computation. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are sorting/grouping. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Every non-aggregated selected column must belong to the grouping grain. +- If nullable columns are involved, validate null-handling explicitly (`IS NULL`, `COALESCE`). + diff --git a/easy/1565. Unique Orders and Customers Per Month (Easy).sql b/easy/1565. Unique Orders and Customers Per Month (Easy).sql deleted file mode 100644 index 93ba790..0000000 --- a/easy/1565. 
Unique Orders and Customers Per Month (Easy).sql +++ /dev/null @@ -1,6 +0,0 @@ -SELECT TO_CHAR(order_date,'YYYY-MM') AS month, - COUNT(order_id) AS order_count, - COUNT(DISTINCT customer_id) AS customer_count -FROM orders_1565 -WHERE invoice > 20 -GROUP BY TO_CHAR(order_date,'YYYY-MM'); diff --git a/easy/1571. Warehouse Manager (Easy).md b/easy/1571. Warehouse Manager (Easy).md new file mode 100644 index 0000000..ae8eaea --- /dev/null +++ b/easy/1571. Warehouse Manager (Easy).md @@ -0,0 +1,80 @@ +# Question 1571: Warehouse Manager + +**LeetCode URL:** https://leetcode.com/problems/warehouse-manager/ + +## Description + +Write an SQL query to report how many cubic feet of volume the inventory occupies in each warehouse. Return the result table in any order. The query result format is in the following example. 
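Since the occupied volume is just Width × Length × Height × units summed per warehouse, the join-and-aggregate shape can be sketched quickly. The following sqlite3 snippet is an illustration only, not the repository's Postgres solution; table and column names mirror this problem's sample data.

```python
import sqlite3

# Illustration: join products to warehouse stock and sum per-warehouse volume.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE warehouse (name TEXT, product_id INT, units INT)")
conn.execute("CREATE TABLE products (product_id INT, width INT, length INT, height INT)")
conn.executemany("INSERT INTO warehouse VALUES (?, ?, ?)",
                 [("LCHouse1", 1, 1), ("LCHouse1", 2, 10), ("LCHouse1", 3, 5),
                  ("LCHouse2", 1, 2), ("LCHouse2", 2, 2), ("LCHouse3", 4, 1)])
conn.executemany("INSERT INTO products VALUES (?, ?, ?, ?)",
                 [(1, 5, 50, 40), (2, 5, 5, 5), (3, 2, 10, 10), (4, 4, 10, 20)])
volumes = conn.execute(
    """
    SELECT w.name AS warehouse_name,
           SUM(p.width * p.length * p.height * w.units) AS volume
    FROM warehouse w
    JOIN products p ON w.product_id = p.product_id
    GROUP BY w.name
    ORDER BY w.name
    """
).fetchall()
```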
values ('4', 'LC-T-Shirt', '4', '10', '20'); +``` + +## Expected Output Data + +```text ++----------------+------------+ +| warehouse_name | volume | ++----------------+------------+ +| LCHouse1 | 12250 | +| LCHouse2 | 20250 | +| LCHouse3 | 800 | ++----------------+------------+ +``` + +## SQL Solution + +```sql +SELECT w.name AS warehouse_name,SUM(p.width*p.height*p.length*w.units) AS volume +FROM warehouse_1571 w +INNER JOIN products_1571 p ON w.product_id = p.product_id +GROUP BY w.name; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `warehouse_name`, `volume` from `warehouse`, `products`. + +### Result Grain + +One row per unique key in `GROUP BY w.name`. + +### Step-by-Step Logic + +1. Combine datasets using INNER JOIN, JOIN. Join predicates control row matching and prevent accidental cartesian growth. +2. Aggregate rows with SUM grouped by w.name. +3. Project final output columns: `warehouse_name`, `volume`. + +### Why This Works + +Join conditions align related entities so each output row is built from the correct source records. Grouping keys define the reporting grain; aggregate functions then summarize values at exactly that grain. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are sorting/grouping, join operations. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently. +- Every non-aggregated selected column must belong to the grouping grain. + diff --git a/easy/1571. Warehouse Manager (Easy).sql b/easy/1571. Warehouse Manager (Easy).sql deleted file mode 100644 index d80b4d5..0000000 --- a/easy/1571. 
Warehouse Manager (Easy).sql +++ /dev/null @@ -1,4 +0,0 @@ -SELECT w.name AS warehouse_name,SUM(p.width*p.height*p.length*w.units) AS volume -FROM warehouse_1571 w -INNER JOIN products_1571 p ON w.product_id = p.product_id -GROUP BY w.name; diff --git a/easy/1581. Customer Who Visited but Did Not Make Any Transactions (Easy).md b/easy/1581. Customer Who Visited but Did Not Make Any Transactions (Easy).md new file mode 100644 index 0000000..2af0415 --- /dev/null +++ b/easy/1581. Customer Who Visited but Did Not Make Any Transactions (Easy).md @@ -0,0 +1,86 @@ +# Question 1581: Customer Who Visited but Did Not Make Any Transactions + +**LeetCode URL:** https://leetcode.com/problems/customer-who-visited-but-did-not-make-any-transactions/ + +## Description + +Write a solution to find the IDs of the users who visited without making any transactions and the number of times they made these types of visits. Return the result table sorted in any order. The result format is in the following example. 
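"Visited but never transacted" is a classic anti-join, and it can be written either as `LEFT JOIN … IS NULL` or as `NOT EXISTS`. The sqlite3 sketch below (illustrative only, using this problem's sample data) shows the two forms agree:

```python
import sqlite3

# Illustration: two equivalent anti-join formulations.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE visits (visit_id INT, customer_id INT)")
conn.execute("CREATE TABLE transactions (transaction_id INT, visit_id INT, amount INT)")
conn.executemany("INSERT INTO visits VALUES (?, ?)",
                 [(1, 23), (2, 9), (4, 30), (5, 54), (6, 96), (7, 54), (8, 54)])
conn.executemany("INSERT INTO transactions VALUES (?, ?, ?)",
                 [(2, 5, 310), (3, 5, 300), (9, 5, 200), (12, 1, 910), (13, 2, 970)])

# Left join, then keep only visits with no matching transaction.
left_join = conn.execute(
    """
    SELECT v.customer_id, COUNT(v.visit_id) AS count_no_trans
    FROM visits v
    LEFT JOIN transactions t ON v.visit_id = t.visit_id
    WHERE t.transaction_id IS NULL
    GROUP BY v.customer_id
    ORDER BY v.customer_id
    """
).fetchall()

# Same answer with a correlated NOT EXISTS.
not_exists = conn.execute(
    """
    SELECT v.customer_id, COUNT(*) AS count_no_trans
    FROM visits v
    WHERE NOT EXISTS (SELECT 1 FROM transactions t WHERE t.visit_id = v.visit_id)
    GROUP BY v.customer_id
    ORDER BY v.customer_id
    """
).fetchall()
```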
+ +## Table Schema Structure + +```sql +Create table If Not Exists Visits(visit_id int, customer_id int); +Create table If Not Exists Transactions(transaction_id int, visit_id int, amount int); +``` + +## Sample Input Data + +```sql +insert into Visits (visit_id, customer_id) values ('1', '23'); +insert into Visits (visit_id, customer_id) values ('2', '9'); +insert into Visits (visit_id, customer_id) values ('4', '30'); +insert into Visits (visit_id, customer_id) values ('5', '54'); +insert into Visits (visit_id, customer_id) values ('6', '96'); +insert into Visits (visit_id, customer_id) values ('7', '54'); +insert into Visits (visit_id, customer_id) values ('8', '54'); +insert into Transactions (transaction_id, visit_id, amount) values ('2', '5', '310'); +insert into Transactions (transaction_id, visit_id, amount) values ('3', '5', '300'); +insert into Transactions (transaction_id, visit_id, amount) values ('9', '5', '200'); +insert into Transactions (transaction_id, visit_id, amount) values ('12', '1', '910'); +insert into Transactions (transaction_id, visit_id, amount) values ('13', '2', '970'); +``` + +## Expected Output Data + +```text ++-------------+----------------+ +| customer_id | count_no_trans | ++-------------+----------------+ +| 54 | 2 | +| 30 | 1 | +| 96 | 1 | ++-------------+----------------+ +``` + +## SQL Solution + +```sql +SELECT v.customer_id,COUNT(v.visit_id) AS count_no_trans +FROM visits_1581 v +LEFT JOIN transactions_1581 t ON v.visit_id = t.visit_id +WHERE t.transaction_id IS NULL +GROUP BY v.customer_id +ORDER BY v.customer_id; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `customer_id`, `count_no_trans` from `visits`, `transactions`. + +### Result Grain + +One row per unique key in `GROUP BY v.customer_id`. + +### Step-by-Step Logic + +1. Combine datasets using LEFT JOIN, JOIN. Join predicates control row matching and prevent accidental cartesian growth. +2. Apply row-level filtering in `WHERE`: t.transaction_id IS NULL. +3. 
Aggregate rows with COUNT grouped by v.customer_id. +4. Project final output columns: `customer_id`, `count_no_trans`. +5. Order output deterministically with `ORDER BY v.customer_id`. + +### Why This Works + +Join conditions align related entities so each output row is built from the correct source records. Grouping keys define the reporting grain; aggregate functions then summarize values at exactly that grain. Predicate filtering removes irrelevant rows before expensive downstream computation. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are sorting/grouping, join operations. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently. +- Every non-aggregated selected column must belong to the grouping grain. + diff --git a/easy/1581. Customer Who Visited but Did Not Make Any Transactions (Easy).sql b/easy/1581. Customer Who Visited but Did Not Make Any Transactions (Easy).sql deleted file mode 100644 index 06c6bca..0000000 --- a/easy/1581. Customer Who Visited but Did Not Make Any Transactions (Easy).sql +++ /dev/null @@ -1,6 +0,0 @@ -SELECT v.customer_id,COUNT(v.visit_id) -FROM visits_1581 v -LEFT JOIN transactions_1581 t ON v.visit_id = t.visit_id -WHERE t.transaction_id IS NULL -GROUP BY v.customer_id -ORDER BY v.customer_id; diff --git a/easy/1587. Bank Account Summary II (Easy).md b/easy/1587. Bank Account Summary II (Easy).md new file mode 100644 index 0000000..d6dc9b0 --- /dev/null +++ b/easy/1587. Bank Account Summary II (Easy).md @@ -0,0 +1,80 @@ +# Question 1587: Bank Account Summary II + +**LeetCode URL:** https://leetcode.com/problems/bank-account-summary-ii/ + +## Description + +Write a solution to report the name and balance of users with a balance higher than 10000. Return the result table in any order. The result format is in the following example. 
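Because the threshold applies to an aggregated balance, it belongs in `HAVING` rather than `WHERE`. A minimal sqlite3 sketch of that pattern (illustrative only, using this problem's sample data with the date column dropped since it plays no role):

```python
import sqlite3

# Illustration: a threshold on an aggregate goes in HAVING, not WHERE.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (account INT, name TEXT)")
conn.execute("CREATE TABLE transactions (trans_id INT, account INT, amount INT)")
conn.executemany("INSERT INTO users VALUES (?, ?)",
                 [(900001, "Alice"), (900002, "Bob"), (900003, "Charlie")])
conn.executemany("INSERT INTO transactions VALUES (?, ?, ?)",
                 [(1, 900001, 7000), (2, 900001, 7000), (3, 900001, -3000),
                  (4, 900002, 1000), (5, 900003, 6000), (6, 900003, 6000),
                  (7, 900003, -4000)])
rich = conn.execute(
    """
    SELECT u.name, SUM(t.amount) AS balance
    FROM transactions t
    JOIN users u ON t.account = u.account
    GROUP BY u.name
    HAVING SUM(t.amount) > 10000
    """
).fetchall()
```

Charlie's balance nets to 8000 and Bob's to 1000, so only Alice survives the `HAVING` filter.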
+ +## Table Schema Structure + +```sql +Create table If Not Exists Users (account int, name varchar(20)); +Create table If Not Exists Transactions (trans_id int, account int, amount int, transacted_on date); +``` + +## Sample Input Data + +```sql +insert into Users (account, name) values ('900001', 'Alice'); +insert into Users (account, name) values ('900002', 'Bob'); +insert into Users (account, name) values ('900003', 'Charlie'); +insert into Transactions (trans_id, account, amount, transacted_on) values ('1', '900001', '7000', '2020-08-01'); +insert into Transactions (trans_id, account, amount, transacted_on) values ('2', '900001', '7000', '2020-09-01'); +insert into Transactions (trans_id, account, amount, transacted_on) values ('3', '900001', '-3000', '2020-09-02'); +insert into Transactions (trans_id, account, amount, transacted_on) values ('4', '900002', '1000', '2020-09-12'); +insert into Transactions (trans_id, account, amount, transacted_on) values ('5', '900003', '6000', '2020-08-07'); +insert into Transactions (trans_id, account, amount, transacted_on) values ('6', '900003', '6000', '2020-09-07'); +insert into Transactions (trans_id, account, amount, transacted_on) values ('7', '900003', '-4000', '2020-09-11'); +``` + +## Expected Output Data + +```text ++------------+------------+ +| name | balance | ++------------+------------+ +| Alice | 11000 | ++------------+------------+ +``` + +## SQL Solution + +```sql +SELECT u.name,SUM(t.amount) AS balance +FROM transactions_1587 t +INNER JOIN users_1587 u ON t.account=u.account +GROUP BY u.name +HAVING SUM(t.amount) > 10000; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `name`, `balance` from `transactions`, `users`. + +### Result Grain + +One row per unique key in `GROUP BY u.name`. + +### Step-by-Step Logic + +1. Combine datasets using INNER JOIN, JOIN. Join predicates control row matching and prevent accidental cartesian growth. +2. Aggregate rows with SUM grouped by u.name. +3. 
Project final output columns: `name`, `balance`. +4. Filter aggregated groups in `HAVING`: SUM(t.amount) > 10000. + +### Why This Works + +Join conditions align related entities so each output row is built from the correct source records. Grouping keys define the reporting grain; aggregate functions then summarize values at exactly that grain. `HAVING` ensures only groups that satisfy business thresholds survive. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are sorting/grouping, join operations. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently. +- Every non-aggregated selected column must belong to the grouping grain. + diff --git a/easy/1587. Bank Account Summary II (Easy).sql b/easy/1587. Bank Account Summary II (Easy).sql deleted file mode 100644 index 882a6c2..0000000 --- a/easy/1587. Bank Account Summary II (Easy).sql +++ /dev/null @@ -1,6 +0,0 @@ -SELECT u.name,SUM(t.amount) -FROM transactions_1587 t -INNER JOIN users_1587 u ON t.account=u.account -GROUP BY u.name -HAVING SUM(t.amount) > 10000 - diff --git a/easy/1607. Sellers With No Sales (Easy).md b/easy/1607. Sellers With No Sales (Easy).md new file mode 100644 index 0000000..43d6077 --- /dev/null +++ b/easy/1607. Sellers With No Sales (Easy).md @@ -0,0 +1,81 @@ +# Question 1607: Sellers With No Sales + +**LeetCode URL:** https://leetcode.com/problems/sellers-with-no-sales/ + +## Description + +Write an SQL query to report the names of all sellers who did not make any sales in 2020. Return the result table ordered by seller_name in ascending order. The query result format is in the following example. 
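This repository's solution uses `NOT IN`, which is safe here because `seller_id` is never NULL in the subquery; with a nullable subquery column, `NOT IN` silently returns no rows, while `NOT EXISTS` keeps working. A hedged sqlite3 demonstration (illustrative only; `strftime('%Y', …)` stands in for `EXTRACT(YEAR FROM …)`, and the NULL-seller order is a hypothetical row, not part of the sample data):

```python
import sqlite3

# Illustration: NOT IN vs NOT EXISTS when the subquery can yield NULL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE seller (seller_id INT, seller_name TEXT)")
conn.execute("CREATE TABLE orders (order_id INT, sale_date TEXT, seller_id INT)")
conn.executemany("INSERT INTO seller VALUES (?, ?)",
                 [(1, "Daniel"), (2, "Elizabeth"), (3, "Frank")])
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                 [(1, "2020-03-01", 1), (2, "2020-05-25", 2), (3, "2019-05-25", 3),
                  (4, "2020-09-13", 2), (5, "2019-02-11", 2)])

NOT_IN = """
    SELECT seller_name FROM seller
    WHERE seller_id NOT IN (SELECT seller_id FROM orders
                            WHERE strftime('%Y', sale_date) = '2020')
"""
no_sales = conn.execute(NOT_IN).fetchall()

# A hypothetical 2020 order with an unknown seller poisons NOT IN ...
conn.execute("INSERT INTO orders VALUES (6, '2020-10-01', NULL)")
poisoned = conn.execute(NOT_IN).fetchall()

# ... while the correlated NOT EXISTS form is unaffected.
robust = conn.execute(
    """
    SELECT seller_name FROM seller s
    WHERE NOT EXISTS (SELECT 1 FROM orders o
                      WHERE o.seller_id = s.seller_id
                        AND strftime('%Y', o.sale_date) = '2020')
    """
).fetchall()
```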
+ +## Table Schema Structure + +```sql +Create table If Not Exists Customer (customer_id int, customer_name varchar(20)); +Create table If Not Exists Orders (order_id int, sale_date date, order_cost int, customer_id int, seller_id int); +Create table If Not Exists Seller (seller_id int, seller_name varchar(20)); +``` + +## Sample Input Data + +```sql +insert into Customer (customer_id, customer_name) values ('101', 'Alice'); +insert into Customer (customer_id, customer_name) values ('102', 'Bob'); +insert into Customer (customer_id, customer_name) values ('103', 'Charlie'); +insert into Orders (order_id, sale_date, order_cost, customer_id, seller_id) values ('1', '2020-03-01', '1500', '101', '1'); +insert into Orders (order_id, sale_date, order_cost, customer_id, seller_id) values ('2', '2020-05-25', '2400', '102', '2'); +insert into Orders (order_id, sale_date, order_cost, customer_id, seller_id) values ('3', '2019-05-25', '800', '101', '3'); +insert into Orders (order_id, sale_date, order_cost, customer_id, seller_id) values ('4', '2020-09-13', '1000', '103', '2'); +insert into Orders (order_id, sale_date, order_cost, customer_id, seller_id) values ('5', '2019-02-11', '700', '101', '2'); +insert into Seller (seller_id, seller_name) values ('1', 'Daniel'); +insert into Seller (seller_id, seller_name) values ('2', 'Elizabeth'); +insert into Seller (seller_id, seller_name) values ('3', 'Frank'); +``` + +## Expected Output Data + +```text ++-------------+ +| seller_name | ++-------------+ +| Frank | ++-------------+ +``` + +## SQL Solution + +```sql +SELECT seller_name +FROM seller_1607 +WHERE seller_id NOT IN +(SELECT DISTINCT seller_id +FROM orders_1607 +WHERE EXTRACT(YEAR FROM sale_date) = 2020); +``` + +## Solution Breakdown + +### Goal + +The query builds the final result column `seller_name` from `seller` and `orders`. + +### Result Grain + +One row per distinct combination of projected columns. + +### Step-by-Step Logic + +1. 
Apply row-level filtering in `WHERE`: seller_id NOT IN (SELECT DISTINCT seller_id FROM orders_1607 WHERE EXTRACT(YEAR FROM sale_date) = 2020). +2. Project final output columns: `seller_name`. +3. Remove duplicate result tuples using `DISTINCT` where uniqueness is required. + +### Why This Works + +Predicate filtering removes irrelevant rows before expensive downstream computation. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are subquery execution. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- If nullable columns are involved, validate null-handling explicitly (`IS NULL`, `COALESCE`). + diff --git a/easy/1607. Sellers With No Sales (Easy).sql b/easy/1607. Sellers With No Sales (Easy).sql deleted file mode 100644 index b1888c7..0000000 --- a/easy/1607. Sellers With No Sales (Easy).sql +++ /dev/null @@ -1,6 +0,0 @@ -SELECT seller_name -FROM seller_1607 -WHERE seller_id NOT IN -(SELECT DISTINCT seller_id -FROM orders_1607 -WHERE EXTRACT(YEAR FROM sale_date) = 2020); diff --git a/easy/1623. All Valid Triplets That Can Represent a Country (Easy).md b/easy/1623. All Valid Triplets That Can Represent a Country (Easy).md new file mode 100644 index 0000000..423e86f --- /dev/null +++ b/easy/1623. All Valid Triplets That Can Represent a Country (Easy).md @@ -0,0 +1,78 @@ +# Question 1623: All Valid Triplets That Can Represent a Country + +**LeetCode URL:** https://leetcode.com/problems/all-valid-triplets-that-can-represent-a-country/ + +## Description + +Write an SQL query to find all the possible triplets representing the country under the given constraints. Return the result table in any order. The query result format is in the following example. 
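With one student per school in a triplet, a three-way CROSS JOIN plus pairwise inequality filters enumerates the valid combinations. The sqlite3 sketch below (an illustration with this problem's sample data, not the repository's Postgres solution) mirrors that shape:

```python
import sqlite3

# Illustration: cross join three schools, then filter out clashing ids/names.
conn = sqlite3.connect(":memory:")
for school, students in {
    "school_a": [(1, "Alice"), (2, "Bob")],
    "school_b": [(3, "Tom")],
    "school_c": [(3, "Tom"), (2, "Jerry"), (10, "Alice")],
}.items():
    conn.execute(f"CREATE TABLE {school} (student_id INT, student_name TEXT)")
    conn.executemany(f"INSERT INTO {school} VALUES (?, ?)", students)

triplets = conn.execute(
    """
    SELECT a.student_name AS member_A, b.student_name AS member_B,
           c.student_name AS member_C
    FROM school_a a CROSS JOIN school_b b CROSS JOIN school_c c
    WHERE a.student_id <> b.student_id AND a.student_id <> c.student_id
      AND b.student_id <> c.student_id AND a.student_name <> b.student_name
      AND a.student_name <> c.student_name AND b.student_name <> c.student_name
    ORDER BY member_A
    """
).fetchall()
```

All six pairwise predicates are needed: dropping any one lets a duplicate id or name slip through, e.g. (Alice, Tom, Alice).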
+ +## Table Schema Structure + +```sql +Create table If Not Exists SchoolA (student_id int, student_name varchar(20)); +Create table If Not Exists SchoolB (student_id int, student_name varchar(20)); +Create table If Not Exists SchoolC (student_id int, student_name varchar(20)); +``` + +## Sample Input Data + +```sql +insert into SchoolA (student_id, student_name) values ('1', 'Alice'); +insert into SchoolA (student_id, student_name) values ('2', 'Bob'); +insert into SchoolB (student_id, student_name) values ('3', 'Tom'); +insert into SchoolC (student_id, student_name) values ('3', 'Tom'); +insert into SchoolC (student_id, student_name) values ('2', 'Jerry'); +insert into SchoolC (student_id, student_name) values ('10', 'Alice'); +``` + +## Expected Output Data + +```text ++----------+----------+----------+ +| member_A | member_B | member_C | ++----------+----------+----------+ +| Alice | Tom | Jerry | +| Bob | Tom | Alice | ++----------+----------+----------+ +``` + +## SQL Solution + +```sql +SELECT a.student_name AS member_A,b.student_name AS member_B,c.student_name AS member_C +FROM school_a_1623 a +CROSS JOIN school_b_1623 b +CROSS JOIN school_c_1623 c +WHERE a.student_id <> b.student_id AND a.student_id <> c.student_id AND b.student_id <> c.student_id AND + a.student_name <> b.student_name AND a.student_name <> c.student_name AND b.student_name <> c.student_name; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `member_A`, `member_B`, `member_C` from `school_a`, `school_b`, `school_c`. + +### Result Grain + +Row grain follows the post-filter join output. + +### Step-by-Step Logic + +1. Combine datasets using CROSS JOIN, JOIN. Join predicates control row matching and prevent accidental cartesian growth. +2. Apply row-level filtering in `WHERE`: a.student_id <> b.student_id AND a.student_id <> c.student_id AND b.student_id <> c.student_id AND a.student_name <> b.student_name AND a.student_name <> c.student_name AND b.student_name <> c.student_name. +3. 
Project final output columns: `member_A`, `member_B`, `member_C`. + +### Why This Works + +Join conditions align related entities so each output row is built from the correct source records. Predicate filtering removes irrelevant rows before expensive downstream computation. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are join operations. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently. +- If nullable columns are involved, validate null-handling explicitly (`IS NULL`, `COALESCE`). + diff --git a/easy/1623. All Valid Triplets That Can Represent a Country (Easy).sql b/easy/1623. All Valid Triplets That Can Represent a Country (Easy).sql deleted file mode 100644 index 3ab8b63..0000000 --- a/easy/1623. All Valid Triplets That Can Represent a Country (Easy).sql +++ /dev/null @@ -1,6 +0,0 @@ -SELECT a.student_name,b.student_name,c.student_name -FROM school_a_1623 a -CROSS JOIN school_b_1623 b -CROSS JOIN school_c_1623 c -WHERE a.student_id <> b.student_id AND a.student_id <> c.student_id AND b.student_id <> c.student_id AND - a.student_name <> b.student_name AND a.student_name <> c.student_name AND b.student_name <> c.student_name; diff --git a/easy/1633. Percentage of Users Attended a Contest (Easy).md b/easy/1633. Percentage of Users Attended a Contest (Easy).md new file mode 100644 index 0000000..01d9e08 --- /dev/null +++ b/easy/1633. Percentage of Users Attended a Contest (Easy).md @@ -0,0 +1,96 @@ +# Question 1633: Percentage of Users Attended a Contest + +**LeetCode URL:** https://leetcode.com/problems/percentage-of-users-attended-a-contest/ + +## Description + +Write a solution to find the percentage of the users registered in each contest rounded to two decimals. Return the result table ordered by percentage in descending order. 
The result format is in the following example. + +## Table Schema Structure + +```sql +Create table If Not Exists Users (user_id int, user_name varchar(20)); +Create table If Not Exists Register (contest_id int, user_id int); +``` + +## Sample Input Data + +```sql +insert into Users (user_id, user_name) values ('6', 'Alice'); +insert into Users (user_id, user_name) values ('2', 'Bob'); +insert into Users (user_id, user_name) values ('7', 'Alex'); +insert into Register (contest_id, user_id) values ('215', '6'); +insert into Register (contest_id, user_id) values ('209', '2'); +insert into Register (contest_id, user_id) values ('208', '2'); +insert into Register (contest_id, user_id) values ('210', '6'); +insert into Register (contest_id, user_id) values ('208', '6'); +insert into Register (contest_id, user_id) values ('209', '7'); +insert into Register (contest_id, user_id) values ('209', '6'); +insert into Register (contest_id, user_id) values ('215', '7'); +insert into Register (contest_id, user_id) values ('208', '7'); +insert into Register (contest_id, user_id) values ('210', '2'); +insert into Register (contest_id, user_id) values ('207', '2'); +insert into Register (contest_id, user_id) values ('210', '7'); +``` + +## Expected Output Data + +```text ++------------+------------+ +| contest_id | percentage | ++------------+------------+ +| 208 | 100.0 | +| 209 | 100.0 | +| 210 | 100.0 | +| 215 | 66.67 | +| 207 | 33.33 | ++------------+------------+ +``` + +## SQL Solution + +```sql +SELECT contest_id,ROUND((COUNT(DISTINCT user_id)*100.0)/user_count.cnt,2) AS percentage +FROM register_1633 +CROSS JOIN (SELECT COUNT(*) AS cnt FROM users_1633) user_count +GROUP BY contest_id,user_count.cnt +ORDER BY percentage DESC,contest_id; + +--OR-- + +SELECT contest_id,ROUND((COUNT(DISTINCT user_id)*100.0)/(SELECT COUNT(*) AS cnt FROM users_1633),2) AS percentage +FROM register_1633 +GROUP BY contest_id +ORDER BY percentage DESC,contest_id; +``` + +## Solution Breakdown + +### 
Goal + +The query builds the final result columns `contest_id`, `percentage` from `register`, `users`. + +### Result Grain + +One row per unique key in `GROUP BY contest_id`. + +### Step-by-Step Logic + +1. Combine datasets using CROSS JOIN, JOIN. Join predicates control row matching and prevent accidental cartesian growth. +2. Aggregate rows with COUNT(DISTINCT user_id) grouped by contest_id, then ROUND the resulting percentage. +3. Project final output columns: `contest_id`, `percentage`. +4. Order output deterministically with `ORDER BY percentage DESC,contest_id`. + +### Why This Works + +Join conditions align related entities so each output row is built from the correct source records. Grouping keys define the reporting grain; aggregate functions then summarize values at exactly that grain. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are sorting/grouping, join operations, subquery execution. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently. +- Every non-aggregated selected column must belong to the grouping grain. + diff --git a/easy/1633. Percentage of Users Attended a Contest (Easy).sql b/easy/1633. Percentage of Users Attended a Contest (Easy).sql deleted file mode 100644 index 90ebd5c..0000000 --- a/easy/1633. Percentage of Users Attended a Contest (Easy).sql +++ /dev/null @@ -1,12 +0,0 @@ -SELECT contest_id,ROUND((COUNT(DISTINCT user_id)*100.0)/user_count.cnt,2) AS percentage -FROM register_1633 -CROSS JOIN (SELECT COUNT(*) AS cnt FROM users_1633) user_count -GROUP BY contest_id,user_count.cnt -ORDER BY percentage DESC,contest_id; - ---OR-- - -SELECT contest_id,ROUND((COUNT(DISTINCT user_id)*100.0)/(SELECT COUNT(*) AS cnt FROM users_1633),2) AS percentage -FROM register_1633 -GROUP BY contest_id -ORDER BY percentage DESC,contest_id; diff --git a/easy/1661. 
Average Time of Process per Machine (Easy).md b/easy/1661. Average Time of Process per Machine (Easy).md new file mode 100644 index 0000000..b84aa34 --- /dev/null +++ b/easy/1661. Average Time of Process per Machine (Easy).md @@ -0,0 +1,83 @@ +# Question 1661: Average Time of Process per Machine + +**LeetCode URL:** https://leetcode.com/problems/average-time-of-process-per-machine/ + +## Description + +Write a solution to find the average time each machine takes to complete a process. Return the result table in any order. The result format is in the following example. + +## Table Schema Structure + +```sql +Create table If Not Exists Activity (machine_id int, process_id int, activity_type ENUM('start', 'end'), timestamp float); +``` + +## Sample Input Data + +```sql +insert into Activity (machine_id, process_id, activity_type, timestamp) values ('0', '0', 'start', '0.712'); +insert into Activity (machine_id, process_id, activity_type, timestamp) values ('0', '0', 'end', '1.52'); +insert into Activity (machine_id, process_id, activity_type, timestamp) values ('0', '1', 'start', '3.14'); +insert into Activity (machine_id, process_id, activity_type, timestamp) values ('0', '1', 'end', '4.12'); +insert into Activity (machine_id, process_id, activity_type, timestamp) values ('1', '0', 'start', '0.55'); +insert into Activity (machine_id, process_id, activity_type, timestamp) values ('1', '0', 'end', '1.55'); +insert into Activity (machine_id, process_id, activity_type, timestamp) values ('1', '1', 'start', '0.43'); +insert into Activity (machine_id, process_id, activity_type, timestamp) values ('1', '1', 'end', '1.42'); +insert into Activity (machine_id, process_id, activity_type, timestamp) values ('2', '0', 'start', '4.1'); +insert into Activity (machine_id, process_id, activity_type, timestamp) values ('2', '0', 'end', '4.512'); +insert into Activity (machine_id, process_id, activity_type, timestamp) values ('2', '1', 'start', '2.5'); +insert into Activity 
(machine_id, process_id, activity_type, timestamp) values ('2', '1', 'end', '5'); +``` + +## Expected Output Data + +```text ++------------+-----------------+ +| machine_id | processing_time | ++------------+-----------------+ +| 0 | 0.894 | +| 1 | 0.995 | +| 2 | 1.456 | ++------------+-----------------+ +``` + +## SQL Solution + +```sql +SELECT s.machine_id,ROUND(AVG(e.timestamp-s.timestamp)::NUMERIC,3) AS processing_time +FROM activity_1661 s +INNER JOIN activity_1661 e +ON s.activity_type = 'start' AND e.activity_type = 'end' AND + s.machine_id = e.machine_id AND s.process_id = e.process_id +GROUP BY s.machine_id; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `machine_id`, `processing_time` from `activity`. + +### Result Grain + +One row per unique key in `GROUP BY s.machine_id`. + +### Step-by-Step Logic + +1. Combine datasets using INNER JOIN, JOIN. Join predicates control row matching and prevent accidental cartesian growth. +2. Aggregate rows with AVG, ROUND grouped by s.machine_id. +3. Project final output columns: `machine_id`, `processing_time`. + +### Why This Works + +Join conditions align related entities so each output row is built from the correct source records. Grouping keys define the reporting grain; aggregate functions then summarize values at exactly that grain. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are sorting/grouping, join operations. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently. +- Every non-aggregated selected column must belong to the grouping grain. + diff --git a/easy/1661. Average Time of Process per Machine (Easy).sql b/easy/1661. Average Time of Process per Machine (Easy).sql deleted file mode 100644 index b763192..0000000 --- a/easy/1661. 
Average Time of Process per Machine (Easy).sql +++ /dev/null @@ -1,6 +0,0 @@ -SELECT s.machine_id,ROUND(AVG(e.timestamp-s.timestamp)::NUMERIC,3) AS processing_time -FROM activity_1661 s -INNER JOIN activity_1661 e -ON s.activity_type = 'start' AND e.activity_type = 'end' AND - s.machine_id = e.machine_id AND s.process_id = e.process_id -GROUP BY s.machine_id; diff --git a/easy/1667. Fix Names in a Table (Easy).md b/easy/1667. Fix Names in a Table (Easy).md new file mode 100644 index 0000000..0ec9218 --- /dev/null +++ b/easy/1667. Fix Names in a Table (Easy).md @@ -0,0 +1,67 @@ +# Question 1667: Fix Names in a Table + +**LeetCode URL:** https://leetcode.com/problems/fix-names-in-a-table/ + +## Description + +Write an SQL query to fix the names so that only the first character is uppercase and the rest are lowercase. Return the result table ordered by user_id. + +## Table Schema Structure + +```sql +Create table If Not Exists Users (user_id int, name varchar(40)); +``` + +## Sample Input Data + +```sql +insert into Users (user_id, name) values ('1', 'aLice'); +insert into Users (user_id, name) values ('2', 'bOB'); +``` + +## Expected Output Data + +```text ++---------+-------+ +| user_id | name | ++---------+-------+ +| 1 | Alice | +| 2 | Bob | ++---------+-------+ +``` + +## SQL Solution + +```sql +SELECT user_id,INITCAP(name) AS name +FROM users_1667 +ORDER BY user_id; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `user_id`, `name` from `users`. + +### Result Grain + +One row per row in `users`; no filtering or grouping changes the grain. + +### Step-by-Step Logic + +1. Transform `name` with `INITCAP`, which uppercases the first letter of each word and lowercases everything else. +2. Project final output columns: `user_id`, `name`, aliasing the expression so the output column name matches the result contract. +3. Order output deterministically with `ORDER BY user_id`. + +### Why This Works + +`INITCAP` normalizes capitalization in a single expression, and the final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are sorting/grouping. Indexes on join/filter/group keys typically provide the biggest speedup. 
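### Portability Note

`INITCAP` is PostgreSQL-specific; it does not exist on MySQL, which is what LeetCode itself grades against. A hedged sketch of an equivalent MySQL expression, assuming the same `users_1667` table name used throughout this repository:

```sql
-- Uppercase the first character, lowercase the rest (MySQL-compatible).
-- Unlike INITCAP, this only touches the first character of the whole string.
SELECT user_id,
       CONCAT(UPPER(SUBSTRING(name, 1, 1)), LOWER(SUBSTRING(name, 2))) AS name
FROM users_1667
ORDER BY user_id;
```

Note that `INITCAP` also capitalizes the letter after every space; that is harmless here because each sample name is a single word.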
+ +### Common Pitfalls + +- Check edge cases: empty input, single-row input, and duplicate keys. + diff --git a/easy/1667. Fix Names in a Table (Easy).sql b/easy/1667. Fix Names in a Table (Easy).sql deleted file mode 100644 index 64fdc4b..0000000 --- a/easy/1667. Fix Names in a Table (Easy).sql +++ /dev/null @@ -1,3 +0,0 @@ -SELECT user_id,INITCAP(name) -FROM users_1667 -ORDER BY user_id; diff --git "a/easy/1677. Product\342\200\231s Worth Over Invoices (Easy).md" "b/easy/1677. Product\342\200\231s Worth Over Invoices (Easy).md" new file mode 100644 index 0000000..c387df4 --- /dev/null +++ "b/easy/1677. Product\342\200\231s Worth Over Invoices (Easy).md" @@ -0,0 +1,79 @@ +# Question 1677: Product's Worth Over Invoices + +**LeetCode URL:** https://leetcode.com/problems/products-worth-over-invoices/ + +## Description + +Return each product name with the total amount due, paid, canceled, and refunded across all invoices. + +## Table Schema Structure + +```sql +Create table If Not Exists Product(product_id int, name varchar(15)); +Create table If Not Exists Invoice(invoice_id int,product_id int,rest int, paid int, canceled int, refunded int); +``` + +## Sample Input Data + +```sql +insert into Product (product_id, name) values ('0', 'ham'); +insert into Product (product_id, name) values ('1', 'bacon'); +insert into Invoice (invoice_id, product_id, rest, paid, canceled, refunded) values ('23', '0', '2', '0', '5', '0'); +insert into Invoice (invoice_id, product_id, rest, paid, canceled, refunded) values ('12', '0', '0', '4', '0', '3'); +insert into Invoice (invoice_id, product_id, rest, paid, canceled, refunded) values ('1', '1', '1', '1', '0', '1'); +insert into Invoice (invoice_id, product_id, rest, paid, canceled, refunded) values ('2', '1', '1', '0', '1', '1'); +insert into Invoice (invoice_id, product_id, rest, paid, canceled, refunded) values ('3', '1', '0', '1', '1', '1'); +insert into Invoice (invoice_id, product_id, rest, paid, canceled, refunded) values ('4', '1', 
'1', '1', '1', '0'); +``` + +## Expected Output Data + +```text ++-------+------+------+----------+----------+ +| name | rest | paid | canceled | refunded | ++-------+------+------+----------+----------+ +| bacon | 3 | 3 | 3 | 3 | +| ham | 2 | 4 | 5 | 3 | ++-------+------+------+----------+----------+ +``` + +## SQL Solution + +```sql +SELECT p.name,SUM(i.rest) AS rest,SUM(i.paid) AS paid,SUM(i.canceled) AS canceled,SUM(i.refunded) AS refunded +FROM invoice_1677 i +INNER JOIN product_1677 p ON i.product_id = p.product_id +GROUP BY p.name +ORDER BY p.name; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `name`, `rest`, `paid`, `canceled`, `refunded` from `invoice`, `product`. + +### Result Grain + +One row per unique key in `GROUP BY p.name`. + +### Step-by-Step Logic + +1. Combine datasets using INNER JOIN, JOIN. Join predicates control row matching and prevent accidental cartesian growth. +2. Aggregate rows with SUM grouped by p.name. +3. Project final output columns: `name`, `rest`, `paid`, `canceled`, `refunded`. +4. Order output deterministically with `ORDER BY p.name`. + +### Why This Works + +Join conditions align related entities so each output row is built from the correct source records. Grouping keys define the reporting grain; aggregate functions then summarize values at exactly that grain. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are sorting/grouping, join operations. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently. +- Every non-aggregated selected column must belong to the grouping grain. + diff --git "a/easy/1677. Product\342\200\231s Worth Over Invoices (Easy).sql" "b/easy/1677. 
Product\342\200\231s Worth Over Invoices (Easy).sql" deleted file mode 100644 index 0ed01cc..0000000 --- "a/easy/1677. Product\342\200\231s Worth Over Invoices (Easy).sql" +++ /dev/null @@ -1,5 +0,0 @@ -SELECT p.name,SUM(i.rest) AS rest,SUM(i.paid) AS paid,SUM(i.canceled) AS canceled,SUM(i.refunded) AS refunded -FROM invoice_1677 i -INNER JOIN product_1677 p ON i.product_id = p.product_id -GROUP BY p.name -ORDER BY p.name; diff --git a/easy/1683. Invalid Tweets (Easy).md b/easy/1683. Invalid Tweets (Easy).md new file mode 100644 index 0000000..ea32fd5 --- /dev/null +++ b/easy/1683. Invalid Tweets (Easy).md @@ -0,0 +1,66 @@ +# Question 1683: Invalid Tweets + +**LeetCode URL:** https://leetcode.com/problems/invalid-tweets/ + +## Description + +Write a solution to find the IDs of the invalid tweets. Return the result table in any order. The result format is in the following example. + +## Table Schema Structure + +```sql +Create table If Not Exists Tweets(tweet_id int, content varchar(50)); +``` + +## Sample Input Data + +```sql +insert into Tweets (tweet_id, content) values ('1', 'Let us Code'); +insert into Tweets (tweet_id, content) values ('2', 'More than fifteen chars are here!'); +``` + +## Expected Output Data + +```text ++----------+ +| tweet_id | ++----------+ +| 2 | ++----------+ +``` + +## SQL Solution + +```sql +SELECT tweet_id +FROM tweets_1683 +WHERE LENGTH(content)>15; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `tweet_id` from `tweets`. + +### Result Grain + +Row grain follows the post-filter join output. + +### Step-by-Step Logic + +1. Apply row-level filtering in `WHERE`: LENGTH(content)>15. +2. Project final output columns: `tweet_id`. + +### Why This Works + +Predicate filtering removes irrelevant rows before expensive downstream computation. The final projection exposes only the columns required by the result contract. 
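### Byte vs Character Length

One portability caveat worth flagging: on MySQL, where LeetCode grades this problem, `LENGTH()` counts bytes rather than characters, so multi-byte content (emoji, accented letters) could be misclassified as invalid. `CHAR_LENGTH()` counts characters on both MySQL and PostgreSQL. A sketch using the same `tweets_1683` table name assumed by this repository:

```sql
-- Character-based length check; equivalent to the solution on ASCII data,
-- but also correct for multi-byte content on MySQL.
SELECT tweet_id
FROM tweets_1683
WHERE CHAR_LENGTH(content) > 15;
```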
+ +### Performance Notes + +Primary cost is scanning source rows once; performance usually scales with input size. + +### Common Pitfalls + +- If nullable columns are involved, validate null-handling explicitly (`IS NULL`, `COALESCE`). + diff --git a/easy/1683. Invalid Tweets (Easy).sql b/easy/1683. Invalid Tweets (Easy).sql deleted file mode 100644 index 28a00e6..0000000 --- a/easy/1683. Invalid Tweets (Easy).sql +++ /dev/null @@ -1,3 +0,0 @@ -SELECT tweet_id -FROM tweets_1683 -WHERE LENGTH(content)>15; diff --git a/easy/1693. Daily Leads and Partners (Easy).md b/easy/1693. Daily Leads and Partners (Easy).md new file mode 100644 index 0000000..cf08b24 --- /dev/null +++ b/easy/1693. Daily Leads and Partners (Easy).md @@ -0,0 +1,77 @@ +# Question 1693: Daily Leads and Partners + +**LeetCode URL:** https://leetcode.com/problems/daily-leads-and-partners/ + +## Description + +For each `date_id` and `make_name`, return the number of distinct lead ids and distinct partner ids. Return the result table in any order. The result format is in the following example. + +## Table Schema Structure + +```sql +Create table If Not Exists DailySales(date_id date, make_name varchar(20), lead_id int, partner_id int); +``` + +## Sample Input Data + +```sql +insert into DailySales (date_id, make_name, lead_id, partner_id) values ('2020-12-8', 'toyota', '0', '1'); +insert into DailySales (date_id, make_name, lead_id, partner_id) values ('2020-12-8', 'toyota', '1', '0'); +insert into DailySales (date_id, make_name, lead_id, partner_id) values ('2020-12-8', 'toyota', '1', '2'); +insert into DailySales (date_id, make_name, lead_id, partner_id) values ('2020-12-7', 'toyota', '0', '2'); +insert into DailySales (date_id, make_name, lead_id, partner_id) values ('2020-12-7', 'toyota', '0', '1'); +insert into DailySales (date_id, make_name, lead_id, partner_id) values ('2020-12-8', 'honda', '1', '2'); +insert into DailySales (date_id, make_name, lead_id, partner_id) values ('2020-12-8', 'honda', '2', '1'); +insert into DailySales (date_id, make_name, lead_id, partner_id) values ('2020-12-7', 
'honda', '0', '1'); +insert into DailySales (date_id, make_name, lead_id, partner_id) values ('2020-12-7', 'honda', '1', '2'); +insert into DailySales (date_id, make_name, lead_id, partner_id) values ('2020-12-7', 'honda', '2', '1'); +``` + +## Expected Output Data + +```text ++-----------+-----------+--------------+-----------------+ +| date_id | make_name | unique_leads | unique_partners | ++-----------+-----------+--------------+-----------------+ +| 2020-12-8 | toyota | 2 | 3 | +| 2020-12-7 | toyota | 1 | 2 | +| 2020-12-8 | honda | 2 | 2 | +| 2020-12-7 | honda | 3 | 2 | ++-----------+-----------+--------------+-----------------+ +``` + +## SQL Solution + +```sql +SELECT date_id,make_name,COUNT(DISTINCT lead_id) AS unique_leads,COUNT(DISTINCT partner_id) AS unique_partners +FROM daily_sales_1693 +GROUP BY date_id,make_name; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `date_id`, `make_name`, `unique_leads`, `unique_partners` from `daily_sales`. + +### Result Grain + +One row per unique key in `GROUP BY date_id,make_name`. + +### Step-by-Step Logic + +1. Aggregate rows with COUNT grouped by date_id,make_name. +2. Project final output columns: `date_id`, `make_name`, `unique_leads`, `unique_partners`. + +### Why This Works + +Grouping keys define the reporting grain; aggregate functions then summarize values at exactly that grain. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are sorting/grouping. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Every non-aggregated selected column must belong to the grouping grain. + diff --git a/easy/1693. Daily Leads and Partners (Easy).sql b/easy/1693. Daily Leads and Partners (Easy).sql deleted file mode 100644 index 399029c..0000000 --- a/easy/1693. 
Daily Leads and Partners (Easy).sql +++ /dev/null @@ -1,3 +0,0 @@ -SELECT date_id,make_name,COUNT(DISTINCT lead_id) AS unique_leads,COUNT(DISTINCT partner_id) AS unique_partners -FROM daily_sales_1693 -GROUP BY date_id,make_name; diff --git a/easy/1729. Find Followers Count (Easy).md b/easy/1729. Find Followers Count (Easy).md new file mode 100644 index 0000000..2e7eb61 --- /dev/null +++ b/easy/1729. Find Followers Count (Easy).md @@ -0,0 +1,72 @@ +# Question 1729: Find Followers Count + +**LeetCode URL:** https://leetcode.com/problems/find-followers-count/ + +## Description + +For each user, return the number of followers. Return the result table ordered by `user_id` in ascending order. The result format is in the following example. + +## Table Schema Structure + +```sql +Create table If Not Exists Followers(user_id int, follower_id int); +``` + +## Sample Input Data + +```sql +insert into Followers (user_id, follower_id) values ('0', '1'); +insert into Followers (user_id, follower_id) values ('1', '0'); +insert into Followers (user_id, follower_id) values ('2', '0'); +insert into Followers (user_id, follower_id) values ('2', '1'); +``` + +## Expected Output Data + +```text ++---------+----------------+ +| user_id | followers_count| ++---------+----------------+ +| 0 | 1 | +| 1 | 1 | +| 2 | 2 | ++---------+----------------+ +``` + +## SQL Solution + +```sql +SELECT user_id,COUNT(follower_id) AS followers_count +FROM followers_1729 +GROUP BY user_id +ORDER BY user_id; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `user_id`, `followers_count` from `followers`. + +### Result Grain + +One row per unique key in `GROUP BY user_id`. + +### Step-by-Step Logic + +1. Aggregate rows with COUNT grouped by user_id. +2. Project final output columns: `user_id`, `followers_count`. +3. Order output deterministically with `ORDER BY user_id`. + +### Why This Works + +Grouping keys define the reporting grain; aggregate functions then summarize values at exactly that grain. 
The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are sorting/grouping. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Every non-aggregated selected column must belong to the grouping grain. + diff --git a/easy/1729. Find Followers Count (Easy).sql b/easy/1729. Find Followers Count (Easy).sql deleted file mode 100644 index d0061c6..0000000 --- a/easy/1729. Find Followers Count (Easy).sql +++ /dev/null @@ -1,4 +0,0 @@ -SELECT user_id,COUNT(follower_id) AS follower_count -FROM followers_1729 -GROUP BY user_id -ORDER BY user_id; diff --git a/easy/1731. The Number of Employees Which Report to Each Employee (Easy).md b/easy/1731. The Number of Employees Which Report to Each Employee (Easy).md new file mode 100644 index 0000000..5b64d35 --- /dev/null +++ b/easy/1731. The Number of Employees Which Report to Each Employee (Easy).md @@ -0,0 +1,73 @@ +# Question 1731: The Number of Employees Which Report to Each Employee + +**LeetCode URL:** https://leetcode.com/problems/the-number-of-employees-which-report-to-each-employee/ + +## Description + +Write a solution to report the ids and the names of all managers, the number of employees who report directly to them, and the average age of the reports rounded to the nearest integer. Return the result table ordered by employee_id. The result format is in the following example. 
+ +## Table Schema Structure + +```sql +Create table If Not Exists Employees(employee_id int, name varchar(20), reports_to int, age int); +``` + +## Sample Input Data + +```sql +insert into Employees (employee_id, name, reports_to, age) values ('9', 'Hercy', NULL, '43'); +insert into Employees (employee_id, name, reports_to, age) values ('6', 'Alice', '9', '41'); +insert into Employees (employee_id, name, reports_to, age) values ('4', 'Bob', '9', '36'); +insert into Employees (employee_id, name, reports_to, age) values ('2', 'Winston', NULL, '37'); +``` + +## Expected Output Data + +```text ++-------------+-------+---------------+-------------+ +| employee_id | name | reports_count | average_age | ++-------------+-------+---------------+-------------+ +| 9 | Hercy | 2 | 39 | ++-------------+-------+---------------+-------------+ +``` + +## SQL Solution + +```sql +SELECT mgr.employee_id,mgr.name,COUNT(emp.name) AS reports_count,ROUND(AVG(emp.age)) AS average_age +FROM employees_1731 mgr +INNER JOIN employees_1731 emp ON mgr.employee_id = emp.reports_to +GROUP BY mgr.employee_id,mgr.name +ORDER BY mgr.employee_id; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `employee_id`, `name`, `reports_count`, `average_age` from `employees`. + +### Result Grain + +One row per unique key in `GROUP BY mgr.employee_id,mgr.name`. + +### Step-by-Step Logic + +1. Combine datasets using INNER JOIN, JOIN. Join predicates control row matching and prevent accidental cartesian growth. +2. Aggregate rows with COUNT, AVG, ROUND grouped by mgr.employee_id,mgr.name. +3. Project final output columns: `employee_id`, `name`, `reports_count`, `average_age`. +4. Order output deterministically with `ORDER BY mgr.employee_id`. + +### Why This Works + +Join conditions align related entities so each output row is built from the correct source records. Grouping keys define the reporting grain; aggregate functions then summarize values at exactly that grain. 
The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are sorting/grouping, join operations. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently. +- Every non-aggregated selected column must belong to the grouping grain. + diff --git a/easy/1731. The Number of Employees Which Report to Each Employee (Easy).sql b/easy/1731. The Number of Employees Which Report to Each Employee (Easy).sql deleted file mode 100644 index 9317600..0000000 --- a/easy/1731. The Number of Employees Which Report to Each Employee (Easy).sql +++ /dev/null @@ -1,5 +0,0 @@ -SELECT mgr.employee_id,mgr.name,COUNT(emp.name) AS reports_count,ROUND(AVG(emp.age)) AS average_age -FROM employees_1731 mgr -INNER JOIN employees_1731 emp ON mgr.employee_id = emp.reports_to -GROUP BY mgr.employee_id,mgr.name -ORDER BY mgr.employee_id; diff --git a/easy/1741. Find Total Time Spent by Each Employee (Easy).md b/easy/1741. Find Total Time Spent by Each Employee (Easy).md new file mode 100644 index 0000000..756f8e9 --- /dev/null +++ b/easy/1741. Find Total Time Spent by Each Employee (Easy).md @@ -0,0 +1,74 @@ +# Question 1741: Find Total Time Spent by Each Employee + +**LeetCode URL:** https://leetcode.com/problems/find-total-time-spent-by-each-employee/ + +## Description + +Write a solution to calculate the total time in minutes spent by each employee on each day at the office. Return the result table in any order. The result format is in the following example. 
+ +## Table Schema Structure + +```sql +Create table If Not Exists Employees(emp_id int, event_day date, in_time int, out_time int); +``` + +## Sample Input Data + +```sql +insert into Employees (emp_id, event_day, in_time, out_time) values ('1', '2020-11-28', '4', '32'); +insert into Employees (emp_id, event_day, in_time, out_time) values ('1', '2020-11-28', '55', '200'); +insert into Employees (emp_id, event_day, in_time, out_time) values ('1', '2020-12-3', '1', '42'); +insert into Employees (emp_id, event_day, in_time, out_time) values ('2', '2020-11-28', '3', '33'); +insert into Employees (emp_id, event_day, in_time, out_time) values ('2', '2020-12-9', '47', '74'); +``` + +## Expected Output Data + +```text ++------------+--------+------------+ +| day | emp_id | total_time | ++------------+--------+------------+ +| 2020-11-28 | 1 | 173 | +| 2020-11-28 | 2 | 30 | +| 2020-12-03 | 1 | 41 | +| 2020-12-09 | 2 | 27 | ++------------+--------+------------+ +``` + +## SQL Solution + +```sql +SELECT event_day AS day,emp_id,SUM(out_time-in_time) AS total_time +FROM employees_1741 +GROUP BY event_day,emp_id +ORDER BY day,emp_id; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `day`, `emp_id`, `total_time` from `employees`. + +### Result Grain + +One row per unique key in `GROUP BY event_day,emp_id`. + +### Step-by-Step Logic + +1. Aggregate rows with SUM grouped by event_day,emp_id. +2. Project final output columns: `day`, `emp_id`, `total_time`. +3. Order output deterministically with `ORDER BY day,emp_id`. + +### Why This Works + +Grouping keys define the reporting grain; aggregate functions then summarize values at exactly that grain. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are sorting/grouping. Indexes on join/filter/group keys typically provide the biggest speedup. 
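### Worked Check

A quick worked check against the sample data confirms the grain, assuming the dump's `employees_1741` table: employee 1 on 2020-11-28 has sessions (4, 32) and (55, 200), so `SUM(out_time - in_time)` = (32 - 4) + (200 - 55) = 28 + 145 = 173, matching the expected output.

```sql
-- Verify a single employee-day before trusting the full aggregate;
-- the sample data gives 173 for this pair.
SELECT SUM(out_time - in_time) AS total_time
FROM employees_1741
WHERE emp_id = 1 AND event_day = DATE '2020-11-28';
```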
+ +### Common Pitfalls + +- Every non-aggregated selected column must belong to the grouping grain. + diff --git a/easy/1741. Find Total Time Spent by Each Employee (Easy).sql b/easy/1741. Find Total Time Spent by Each Employee (Easy).sql deleted file mode 100644 index 8f31b9f..0000000 --- a/easy/1741. Find Total Time Spent by Each Employee (Easy).sql +++ /dev/null @@ -1,4 +0,0 @@ -SELECT event_day AS day,emp_id,SUM(out_time-in_time) AS total_time -FROM employees_1741 -GROUP BY event_day,emp_id -ORDER BY day,emp_id; diff --git a/easy/175. Combine Two Tables.md b/easy/175. Combine Two Tables.md new file mode 100644 index 0000000..f34e457 --- /dev/null +++ b/easy/175. Combine Two Tables.md @@ -0,0 +1,70 @@ +# Question 175: Combine Two Tables + +**LeetCode URL:** https://leetcode.com/problems/combine-two-tables/ + +## Description + +Write a solution to report the first name, last name, city, and state of each person in the Person table. Return the result table in any order. The result format is in the following example. 
+ +## Table Schema Structure + +```sql +Create table If Not Exists Person (personId int, firstName varchar(255), lastName varchar(255)); +Create table If Not Exists Address (addressId int, personId int, city varchar(255), state varchar(255)); +``` + +## Sample Input Data + +```sql +insert into Person (personId, lastName, firstName) values ('1', 'Wang', 'Allen'); +insert into Person (personId, lastName, firstName) values ('2', 'Alice', 'Bob'); +insert into Address (addressId, personId, city, state) values ('1', '2', 'New York City', 'New York'); +insert into Address (addressId, personId, city, state) values ('2', '3', 'Leetcode', 'California'); +``` + +## Expected Output Data + +```text ++-----------+----------+---------------+----------+ +| firstName | lastName | city | state | ++-----------+----------+---------------+----------+ +| Allen | Wang | Null | Null | +| Bob | Alice | New York City | New York | ++-----------+----------+---------------+----------+ +``` + +## SQL Solution + +```sql +SELECT firstname,lastname,city,state +FROM person_175 p +LEFT JOIN address_175 a ON p.personid = a.personid; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `firstname`, `lastname`, `city`, `state` from `person`, `address`. + +### Result Grain + +Row grain follows the post-filter join output. + +### Step-by-Step Logic + +1. Combine datasets using LEFT JOIN, JOIN. Join predicates control row matching and prevent accidental cartesian growth. +2. Project final output columns: `firstname`, `lastname`, `city`, `state`. + +### Why This Works + +Join conditions align related entities so each output row is built from the correct source records. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are join operations. Indexes on join/filter/group keys typically provide the biggest speedup. 
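### Why LEFT JOIN and Not INNER JOIN

The join direction is the crux of this problem. As a contrast sketch over the same tables with only the join type changed, an `INNER JOIN` would silently drop every person without an address row, such as Allen in the sample data:

```sql
-- Incorrect for this problem: people with no address row vanish
-- from the result instead of appearing with NULL city/state.
SELECT firstname, lastname, city, state
FROM person_175 p
INNER JOIN address_175 a ON p.personid = a.personid;
```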
+ +### Common Pitfalls + +- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently. + diff --git a/easy/175. Combine Two Tables.sql b/easy/175. Combine Two Tables.sql deleted file mode 100644 index ac99e32..0000000 --- a/easy/175. Combine Two Tables.sql +++ /dev/null @@ -1,3 +0,0 @@ -SELECT firstname,lastname,city,state -FROM person_175 p -LEFT JOIN address_175 a ON p.personid = a.personid; diff --git a/easy/1757. Recyclable and Low Fat Products (Easy).md b/easy/1757. Recyclable and Low Fat Products (Easy).md new file mode 100644 index 0000000..75334ee --- /dev/null +++ b/easy/1757. Recyclable and Low Fat Products (Easy).md @@ -0,0 +1,70 @@ +# Question 1757: Recyclable and Low Fat Products + +**LeetCode URL:** https://leetcode.com/problems/recyclable-and-low-fat-products/ + +## Description + +Write a solution to find the ids of products that are both low fat and recyclable. Return the result table in any order. The result format is in the following example. + +## Table Schema Structure + +```sql +Create table If Not Exists Products (product_id int, low_fats ENUM('Y', 'N'), recyclable ENUM('Y','N')); +``` + +## Sample Input Data + +```sql +insert into Products (product_id, low_fats, recyclable) values ('0', 'Y', 'N'); +insert into Products (product_id, low_fats, recyclable) values ('1', 'Y', 'Y'); +insert into Products (product_id, low_fats, recyclable) values ('2', 'N', 'Y'); +insert into Products (product_id, low_fats, recyclable) values ('3', 'Y', 'Y'); +insert into Products (product_id, low_fats, recyclable) values ('4', 'N', 'N'); +``` + +## Expected Output Data + +```text ++-------------+ +| product_id | ++-------------+ +| 1 | +| 3 | ++-------------+ +``` + +## SQL Solution + +```sql +SELECT product_id +FROM products_1757 +WHERE low_fats = 'Y' AND recyclable = 'Y'; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `product_id` from `products`. 
+ +### Result Grain + +Row grain follows the post-filter join output. + +### Step-by-Step Logic + +1. Apply row-level filtering in `WHERE`: low_fats = 'Y' AND recyclable = 'Y'. +2. Project final output columns: `product_id`. + +### Why This Works + +Predicate filtering removes irrelevant rows before expensive downstream computation. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost is scanning source rows once; performance usually scales with input size. + +### Common Pitfalls + +- If nullable columns are involved, validate null-handling explicitly (`IS NULL`, `COALESCE`). + diff --git a/easy/1757. Recyclable and Low Fat Products (Easy).sql b/easy/1757. Recyclable and Low Fat Products (Easy).sql deleted file mode 100644 index c552560..0000000 --- a/easy/1757. Recyclable and Low Fat Products (Easy).sql +++ /dev/null @@ -1,3 +0,0 @@ -SELECT product_id -FROM products_1757 -WHERE low_fats = 'Y' AND recyclable = 'Y'; diff --git "a/easy/1777. Product\342\200\231s Price for Each Store (Easy).md" "b/easy/1777. Product\342\200\231s Price for Each Store (Easy).md" new file mode 100644 index 0000000..b3f6cc8 --- /dev/null +++ "b/easy/1777. Product\342\200\231s Price for Each Store (Easy).md" @@ -0,0 +1,104 @@ +# Question 1777: Product's Price for Each Store + +**LeetCode URL:** https://leetcode.com/problems/products-price-for-each-store/ + +## Description + +Write an SQL query to find the price of each product in each store. Return the result table in any order. 
The query result format is in the following example. Product 0's prices are 95 for store1, 100 for store2, and 105 for store3. + +## Table Schema Structure + +```sql +Create table If Not Exists Products (product_id int, store ENUM('store1', 'store2', 'store3'), price int); +``` + +## Sample Input Data + +```sql +insert into Products (product_id, store, price) values ('0', 'store1', '95'); +insert into Products (product_id, store, price) values ('0', 'store3', '105'); +insert into Products (product_id, store, price) values ('0', 'store2', '100'); +insert into Products (product_id, store, price) values ('1', 'store1', '70'); +insert into Products (product_id, store, price) values ('1', 'store3', '80'); +``` + +## Expected Output Data + +```text ++-------------+--------+--------+--------+ +| product_id | store1 | store2 | store3 | ++-------------+--------+--------+--------+ +| 0 | 95 | 100 | 105 | +| 1 | 70 | null | 80 | ++-------------+--------+--------+--------+ +``` + +## SQL Solution + +```sql +CREATE OR REPLACE FUNCTION pivot_products_1777() +RETURNS TEXT +LANGUAGE PLPGSQL +AS +$$ +DECLARE + stores_array TEXT[]; + store_name TEXT; + query_text TEXT := ''; +BEGIN + SELECT ARRAY_AGG(DISTINCT store ORDER BY store ASC) + INTO stores_array + FROM products_1777; + + query_text := query_text || 'SELECT product_id, '; + FOREACH store_name IN ARRAY stores_array + LOOP + query_text := query_text || 'SUM(CASE WHEN store = ''' || store_name || ''' THEN price ELSE NULL END) AS "' || store_name || '",'; + 
END LOOP; + query_text := LEFT(query_text,LENGTH(query_text)-1); + + query_text := query_text || ' FROM products_1777 GROUP BY product_id ORDER BY product_id;'; + + RETURN query_text; +END +$$; + +SELECT pivot_products_1777(); + +/* +SELECT product_id, + SUM(CASE WHEN store = 'store1' THEN price ELSE NULL END) AS "store1", + SUM(CASE WHEN store = 'store2' THEN price ELSE NULL END) AS "store2", + SUM(CASE WHEN store = 'store3' THEN price ELSE NULL END) AS "store3" +FROM products_1777 +GROUP BY product_id +ORDER BY product_id; +*/ +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns the required output columns from `products`. + +### Result Grain + +Row grain follows the post-filter join output. + +### Step-by-Step Logic + +1. Define a SQL function/procedural block first, then execute it to generate or run dynamic SQL for the final shape. + +### Why This Works + +The query maps input columns directly to the requested output shape with minimal transformation. + +### Performance Notes + +Primary cost is scanning source rows once; performance usually scales with input size. + +### Common Pitfalls + +- Check edge cases: empty input, single-row input, and duplicate keys. + diff --git "a/easy/1777. Product\342\200\231s Price for Each Store (Easy).sql" "b/easy/1777. Product\342\200\231s Price for Each Store (Easy).sql" deleted file mode 100644 index b36c0ac..0000000 --- "a/easy/1777. 
Product\342\200\231s Price for Each Store (Easy).sql" +++ /dev/null @@ -1,38 +0,0 @@ -CREATE OR REPLACE FUNCTION pivot_products_1777() -RETURNS TEXT -LANGUAGE PLPGSQL -AS -$$ -DECLARE - stores_array TEXT[]; - store_name TEXT; - query_text TEXT := ''; -BEGIN - SELECT ARRAY_AGG(DISTINCT store ORDER BY store ASC) - INTO stores_array - FROM products_1777; - - query_text := query_text || 'SELECT product_id, '; - FOREACH store_name IN ARRAY stores_array - LOOP - query_text := query_text || 'SUM(CASE WHEN store = ''' || store_name || ''' THEN price ELSE NULL END) AS "' || store_name || '",'; - END LOOP; - query_text := LEFT(query_text,LENGTH(query_text)-1); - - query_text := query_text || ' FROM products_1777 GROUP BY product_id ORDER BY product_id;'; - - RETURN query_text; -END -$$; - -SELECT pivot_products_1777(); - -/* -SELECT product_id, - SUM(CASE WHEN store = 'store1' THEN price ELSE NULL END) AS "store1", - SUM(CASE WHEN store = 'store2' THEN price ELSE NULL END) AS "store2", - SUM(CASE WHEN store = 'store3' THEN price ELSE NULL END) AS "store3" -FROM products_1777 -GROUP BY product_id -ORDER BY product_id; -*/ diff --git a/easy/1789. Primary Department for Each Employee (Easy).md b/easy/1789. Primary Department for Each Employee (Easy).md new file mode 100644 index 0000000..4028eec --- /dev/null +++ b/easy/1789. Primary Department for Each Employee (Easy).md @@ -0,0 +1,83 @@ +# Question 1789: Primary Department for Each Employee + +**LeetCode URL:** https://leetcode.com/problems/primary-department-for-each-employee/ + +## Description + +Write a solution to report all the employees with their primary department. Return the result table in any order. The result format is in the following example. 
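
The pattern this solution relies on — a window `COUNT(*) OVER (PARTITION BY ...)` computed inside a CTE, then filtered — can be sanity-checked outside PostgreSQL. A minimal sketch using Python's built-in sqlite3 (assumptions: a plain `employee` table rather than the dump's `employee_1789`, and an SQLite build >= 3.25 for window-function support):

```python
import sqlite3

# In-memory database with the sample rows from this problem.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE employee (employee_id INT, department_id INT, primary_flag TEXT);
INSERT INTO employee VALUES (1,1,'N'),(2,1,'Y'),(2,2,'N'),(3,3,'N'),
                            (4,2,'N'),(4,3,'Y'),(4,4,'N');
""")

# Same shape as the solution: count each employee's departments in a CTE,
# then keep the 'Y' row, or the only row when the employee has exactly one.
rows = con.execute("""
WITH employee_departments AS (
    SELECT *, COUNT(employee_id) OVER (PARTITION BY employee_id) AS cnt
    FROM employee
)
SELECT employee_id, department_id
FROM employee_departments
WHERE primary_flag = 'Y' OR (primary_flag = 'N' AND cnt = 1)
ORDER BY employee_id
""").fetchall()
print(rows)  # [(1, 1), (2, 1), (3, 3), (4, 3)]
```

The `cnt = 1` branch is what lets employees with a single department through even though their only row is flagged `'N'`.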
+ +## Table Schema Structure + +```sql +Create table If Not Exists Employee (employee_id int, department_id int, primary_flag ENUM('Y','N')); +``` + +## Sample Input Data + +```sql +insert into Employee (employee_id, department_id, primary_flag) values ('1', '1', 'N'); +insert into Employee (employee_id, department_id, primary_flag) values ('2', '1', 'Y'); +insert into Employee (employee_id, department_id, primary_flag) values ('2', '2', 'N'); +insert into Employee (employee_id, department_id, primary_flag) values ('3', '3', 'N'); +insert into Employee (employee_id, department_id, primary_flag) values ('4', '2', 'N'); +insert into Employee (employee_id, department_id, primary_flag) values ('4', '3', 'Y'); +insert into Employee (employee_id, department_id, primary_flag) values ('4', '4', 'N'); +``` + +## Expected Output Data + +```text ++-------------+---------------+ +| employee_id | department_id | ++-------------+---------------+ +| 1 | 1 | +| 2 | 1 | +| 3 | 3 | +| 4 | 3 | ++-------------+---------------+ +``` + +## SQL Solution + +```sql +WITH employee_departments AS ( + SELECT *, + COUNT(employee_id) OVER (PARTITION BY employee_id) AS cnt + FROM employee_1789 +) +SELECT employee_id,department_id +FROM employee_departments +WHERE primary_flag = 'Y' OR (primary_flag = 'N' AND cnt = 1); +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `employee_id`, `department_id` from `employee`, `employee_departments`. + +### Result Grain + +Row grain follows the post-filter join output. + +### Step-by-Step Logic + +1. Create CTE layers (`employee_departments`) to decompose the logic into smaller, testable steps before the final SELECT. +2. CTE `employee_departments`: reads `employee`, computes window metrics. +3. Apply row-level filtering in `WHERE`: primary_flag = 'Y' OR (primary_flag = 'N' AND cnt = 1). +4. Use window functions (`... OVER (...)`) to compute rankings/running metrics while preserving row-level detail. +5. 
Project final output columns: `employee_id`, `department_id`. + +### Why This Works + +CTEs separate transformation stages, which makes dependencies explicit and easier to validate. Window expressions calculate comparative metrics without collapsing rows too early. Predicate filtering removes irrelevant rows before expensive downstream computation. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are window partitions. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Window `PARTITION BY`/`ORDER BY` choices change semantics; test with tied values and nulls. +- If nullable columns are involved, validate null-handling explicitly (`IS NULL`, `COALESCE`). + diff --git a/easy/1789. Primary Department for Each Employee (Easy).sql b/easy/1789. Primary Department for Each Employee (Easy).sql deleted file mode 100644 index 6342c7e..0000000 --- a/easy/1789. Primary Department for Each Employee (Easy).sql +++ /dev/null @@ -1,8 +0,0 @@ -WITH employee_departments AS ( - SELECT *, - COUNT(employee_id) OVER (PARTITION BY employee_id) AS cnt - FROM employee_1789 -) -SELECT employee_id,department_id -FROM employee_departments -WHERE primary_flag = 'Y' OR (primary_flag = 'N' AND cnt = 1); diff --git a/easy/1809. Ad-Free Sessions (Easy).md b/easy/1809. Ad-Free Sessions (Easy).md new file mode 100644 index 0000000..81a0696 --- /dev/null +++ b/easy/1809. Ad-Free Sessions (Easy).md @@ -0,0 +1,80 @@ +# Question 1809: Ad-Free Sessions + +**LeetCode URL:** https://leetcode.com/problems/ad-free-sessions/ + +## Description + +Write an SQL query to report all the sessions that did not get shown any ads. Return the result table in any order. 
The query result format is in the following example: Playback table: +------------+-------------+------------+----------+ | session_id | customer_id | start_time | end_time | +------------+-------------+------------+----------+ | 1 | 1 | 1 | 5 | | 2 | 1 | 15 | 23 | | 3 | 2 | 10 | 12 | | 4 | 2 | 17 | 28 | | 5 | 2 | 2 | 8 | +------------+-------------+------------+----------+ Ads table: +-------+-------------+-----------+ | ad_id | customer_id | timestamp | +-------+-------------+-----------+ | 1 | 1 | 5 | | 2 | 2 | 17 | | 3 | 2 | 20 | +-------+-------------+-----------+ Result table: +------------+ | session_id | +------------+ | 2 | | 3 | | 5 | +------------+ The ad with ID 1 was shown to user 1 at time 5 while they were in session 1, and the ads with IDs 2 and 3 were shown to user 2 at times 17 and 20 while they were in session 4; sessions 2, 3, and 5 received no ads. + +## Table Schema Structure + +```sql +Create table If Not Exists Playback(session_id int,customer_id int,start_time int,end_time int); +Create table If Not Exists Ads (ad_id int, customer_id int, timestamp int); +``` + +## Sample Input Data + +```sql +insert into Playback (session_id, customer_id, start_time, end_time) values ('1', '1', '1', '5'); +insert into Playback (session_id, customer_id, start_time, end_time) values ('2', '1', '15', '23'); +insert into Playback (session_id, customer_id, start_time, end_time) values ('3', '2', '10', '12'); +insert into Playback (session_id, customer_id, start_time, end_time) values ('4', '2', '17', '28'); +insert into Playback (session_id, customer_id, start_time, end_time) values ('5', '2', '2', '8'); +insert into Ads (ad_id, customer_id, timestamp) values ('1', '1', '5'); +insert into Ads (ad_id, customer_id, timestamp) values ('2', '2', '17'); +insert into Ads (ad_id, customer_id, timestamp) values ('3', '2', '20'); +``` + +## Expected Output Data + +```text ++------------+ +| session_id | ++------------+ +| 2 | +| 3 | +| 5 | ++------------+ +``` + +## SQL Solution + +```sql +SELECT DISTINCT session_id +FROM playback_1809 pb +LEFT JOIN ads_1809 ad +ON pb.customer_id = ad.customer_id AND + 
ad.timestamp BETWEEN start_time AND end_time +WHERE ad_id IS NULL; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `session_id` from `playback`, `ads`. + +### Result Grain + +One row per distinct combination of projected columns. + +### Step-by-Step Logic + +1. Combine datasets using LEFT JOIN, JOIN. Join predicates control row matching and prevent accidental cartesian growth. +2. Apply row-level filtering in `WHERE`: ad_id IS NULL. +3. Project final output columns: `session_id`. +4. Remove duplicate result tuples using `DISTINCT` where uniqueness is required. + +### Why This Works + +Join conditions align related entities so each output row is built from the correct source records. Predicate filtering removes irrelevant rows before expensive downstream computation. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are join operations. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently. + diff --git a/easy/1809. Ad-Free Sessions (Easy).sql b/easy/1809. Ad-Free Sessions (Easy).sql deleted file mode 100644 index 5d04d00..0000000 --- a/easy/1809. Ad-Free Sessions (Easy).sql +++ /dev/null @@ -1,7 +0,0 @@ -SELECT DISTINCT session_id -FROM playback_1809 pb -LEFT JOIN ads_1809 ad -ON pb.customer_id = ad.customer_id AND - ad.timestamp BETWEEN start_time AND end_time -WHERE ad_id IS NULL; - diff --git a/easy/181. Employees Earning More Than Their Managers.md b/easy/181. Employees Earning More Than Their Managers.md new file mode 100644 index 0000000..e2daf32 --- /dev/null +++ b/easy/181. 
Employees Earning More Than Their Managers.md @@ -0,0 +1,68 @@ +# Question 181: Employees Earning More Than Their Managers + +**LeetCode URL:** https://leetcode.com/problems/employees-earning-more-than-their-managers/ + +## Description + +Write a solution to find the employees who earn more than their managers. Return the result table in any order. The result format is in the following example. + +## Table Schema Structure + +```sql +Create table If Not Exists Employee (id int, name varchar(255), salary int, managerId int); +``` + +## Sample Input Data + +```sql +insert into Employee (id, name, salary, managerId) values ('1', 'Joe', '70000', '3'); +insert into Employee (id, name, salary, managerId) values ('2', 'Henry', '80000', '4'); +insert into Employee (id, name, salary, managerId) values ('3', 'Sam', '60000', NULL); +insert into Employee (id, name, salary, managerId) values ('4', 'Max', '90000', NULL); +``` + +## Expected Output Data + +```text ++----------+ +| Employee | ++----------+ +| Joe | ++----------+ +``` + +## SQL Solution + +```sql +SELECT e.name as Employee +FROM employee_181 e +JOIN employee_181 m ON e.manager_id = m.id AND e.salary>m.salary; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `Employee` from `employee`. + +### Result Grain + +Row grain follows the post-filter join output. + +### Step-by-Step Logic + +1. Combine datasets using JOIN. Join predicates control row matching and prevent accidental cartesian growth. +2. Project final output columns: `Employee`. + +### Why This Works + +Join conditions align related entities so each output row is built from the correct source records. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are join operations. Indexes on join/filter/group keys typically provide the biggest speedup. 
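
To make the self-join behavior concrete, here is a hedged, runnable sketch of the same pattern using Python's sqlite3 (the table is named `employee` here instead of the dump's `employee_181`):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE employee (id INT, name TEXT, salary INT, manager_id INT);
INSERT INTO employee VALUES (1,'Joe',70000,3),(2,'Henry',80000,4),
                            (3,'Sam',60000,NULL),(4,'Max',90000,NULL);
""")
# Self-join: alias e is the employee row, alias m is that employee's
# manager row; the join predicate keeps only employees out-earning them.
rows = con.execute("""
SELECT e.name AS Employee
FROM employee e
JOIN employee m ON e.manager_id = m.id AND e.salary > m.salary
""").fetchall()
print(rows)  # [('Joe',)]
```

Managers with a NULL `manager_id` (Sam, Max) simply fail the join predicate, so they drop out without any explicit NULL handling.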
+ +### Common Pitfalls + +- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently. + diff --git a/easy/181. Employees Earning More Than Their Managers.sql b/easy/181. Employees Earning More Than Their Managers.sql deleted file mode 100644 index dd1c60e..0000000 --- a/easy/181. Employees Earning More Than Their Managers.sql +++ /dev/null @@ -1,3 +0,0 @@ -SELECT e.name as Employee -FROM employee_181 e -JOIN employee_181 m ON e.manager_id = m.id AND e.salary>m.salary; diff --git a/easy/182. Duplicate Emails.md b/easy/182. Duplicate Emails.md new file mode 100644 index 0000000..de64960 --- /dev/null +++ b/easy/182. Duplicate Emails.md @@ -0,0 +1,69 @@ +# Question 182: Duplicate Emails + +**LeetCode URL:** https://leetcode.com/problems/duplicate-emails/ + +## Description + +Write a solution to report all the duplicate emails. Return the result table in any order. The result format is in the following example. + +## Table Schema Structure + +```sql +Create table If Not Exists Person (id int, email varchar(255)); +``` + +## Sample Input Data + +```sql +insert into Person (id, email) values ('1', 'a@b.com'); +insert into Person (id, email) values ('2', 'c@d.com'); +insert into Person (id, email) values ('3', 'a@b.com'); +``` + +## Expected Output Data + +```text ++---------+ +| Email | ++---------+ +| a@b.com | ++---------+ +``` + +## SQL Solution + +```sql +SELECT email +FROM person_182 +GROUP BY email +HAVING COUNT(email) > 1; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `email` from `person`. + +### Result Grain + +One row per unique key in `GROUP BY email`. + +### Step-by-Step Logic + +1. Group rows by email to enforce one result row per key. +2. Project final output columns: `email`. +3. Filter aggregated groups in `HAVING`: `COUNT(email) > 1`, i.e. keep only emails that appear at least twice. + +### Why This Works + +`HAVING` ensures only groups that satisfy business thresholds survive. The final projection exposes only the columns required by the result contract.
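
The standard duplicate-detection form of this grouping pattern filters on `HAVING COUNT(email) > 1` — an email is a duplicate as soon as it appears at least twice. A small runnable check with Python's sqlite3 (table named `person` rather than `person_182`):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE person (id INT, email TEXT);
INSERT INTO person VALUES (1,'a@b.com'),(2,'c@d.com'),(3,'a@b.com');
""")
# Group by email, then keep only groups holding two or more rows.
rows = con.execute("""
SELECT email
FROM person
GROUP BY email
HAVING COUNT(email) > 1
""").fetchall()
print(rows)  # [('a@b.com',)]
```

Because the grouping key is the projected column, no extra `DISTINCT` is needed — `GROUP BY` already guarantees one row per email.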
+ +### Performance Notes + +The primary cost driver is sorting/grouping. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Every non-aggregated selected column must belong to the grouping grain. + diff --git a/easy/182. Duplicate Emails.sql b/easy/182. Duplicate Emails.sql deleted file mode 100644 index c297a3c..0000000 --- a/easy/182. Duplicate Emails.sql +++ /dev/null @@ -1,4 +0,0 @@ -SELECT email -FROM person_182 -GROUP BY email -HAVING cnt>2; diff --git a/easy/1821. Find Customers With Positive Revenue this Year (Easy).md b/easy/1821. Find Customers With Positive Revenue this Year (Easy).md new file mode 100644 index 0000000..85d8b9d --- /dev/null +++ b/easy/1821. Find Customers With Positive Revenue this Year (Easy).md @@ -0,0 +1,72 @@ +# Question 1821: Find Customers With Positive Revenue this Year + +**LeetCode URL:** https://leetcode.com/problems/find-customers-with-positive-revenue-this-year/ + +## Description + +Write an SQL query to report the customers with positive revenue in the year 2021. Return the result table in any order. The query result format is in the following example: Customers table: +-------------+------+---------+ | customer_id | year | revenue | +-------------+------+---------+ | 1 | 2018 | 50 | | 1 | 2021 | 30 | | 1 | 2020 | 70 | | 2 | 2021 | -50 | | 3 | 2018 | 10 | | 3 | 2016 | 50 | | 4 | 2021 | 20 | +-------------+------+---------+ Result table: +-------------+ | customer_id | +-------------+ | 1 | | 4 | +-------------+ Customer 1 has revenue equal to 30 in the year 2021. 
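
The WHERE-only shape of this solution is simple enough to verify end to end; a quick sqlite3 sketch over the sample rows (table named `customers` rather than the dump's `customers_1821`):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE customers (customer_id INT, year INT, revenue INT);
INSERT INTO customers VALUES (1,2018,50),(1,2021,30),(1,2020,70),
                             (2,2021,-50),(3,2018,10),(3,2016,50),(4,2021,20);
""")
# Row-level WHERE filter: only 2021 rows with strictly positive revenue.
rows = con.execute("""
SELECT customer_id FROM customers
WHERE year = 2021 AND revenue > 0
ORDER BY customer_id
""").fetchall()
print(rows)  # [(1,), (4,)]
```

Customer 2 is excluded because its only 2021 revenue is negative, not because it is missing a 2021 row.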
+ +## Table Schema Structure + +```sql +Create table If Not Exists Customers (customer_id int, year int, revenue int); +``` + +## Sample Input Data + +```sql +insert into Customers (customer_id, year, revenue) values ('1', '2018', '50'); +insert into Customers (customer_id, year, revenue) values ('1', '2021', '30'); +insert into Customers (customer_id, year, revenue) values ('1', '2020', '70'); +insert into Customers (customer_id, year, revenue) values ('2', '2021', '-50'); +insert into Customers (customer_id, year, revenue) values ('3', '2018', '10'); +insert into Customers (customer_id, year, revenue) values ('3', '2016', '50'); +insert into Customers (customer_id, year, revenue) values ('4', '2021', '20'); +``` + +## Expected Output Data + +```text ++-------------+ +| customer_id | ++-------------+ +| 1 | +| 4 | ++-------------+ +``` + +## SQL Solution + +```sql +SELECT customer_id +FROM customers_1821 +WHERE year = 2021 AND revenue > 0; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `customer_id` from `customers`. + +### Result Grain + +Row grain follows the post-filter join output. + +### Step-by-Step Logic + +1. Apply row-level filtering in `WHERE`: year = 2021 AND revenue > 0. +2. Project final output columns: `customer_id`. + +### Why This Works + +Predicate filtering removes irrelevant rows before expensive downstream computation. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost is scanning source rows once; performance usually scales with input size. + +### Common Pitfalls + +- If nullable columns are involved, validate null-handling explicitly (`IS NULL`, `COALESCE`). + diff --git a/easy/1821. Find Customers With Positive Revenue this Year (Easy).sql b/easy/1821. Find Customers With Positive Revenue this Year (Easy).sql deleted file mode 100644 index 1356da7..0000000 --- a/easy/1821. 
Find Customers With Positive Revenue this Year (Easy).sql +++ /dev/null @@ -1,3 +0,0 @@ -SELECT customer_id -FROM customers_1821 -WHERE year = 2021 AND revenue > 0; diff --git a/easy/183. Customers Who Never Order.md b/easy/183. Customers Who Never Order.md new file mode 100644 index 0000000..ba4e762 --- /dev/null +++ b/easy/183. Customers Who Never Order.md @@ -0,0 +1,73 @@ +# Question 183: Customers Who Never Order + +**LeetCode URL:** https://leetcode.com/problems/customers-who-never-order/ + +## Description + +Write a solution to find all customers who never order anything. Return the result table in any order. For the sample data below the query should return the following: +-----------+ | Customers | +-----------+ | Henry | | Max | +-----------+ + +## Table Schema Structure + +```sql +Create table If Not Exists Customers (id int, name varchar(255)); +Create table If Not Exists Orders (id int, customerId int); +``` + +## Sample Input Data + +```sql +insert into Customers (id, name) values ('1', 'Joe'); +insert into Customers (id, name) values ('2', 'Henry'); +insert into Customers (id, name) values ('3', 'Sam'); +insert into Customers (id, name) values ('4', 'Max'); +insert into Orders (id, customerId) values ('1', '3'); +insert into Orders (id, customerId) values ('2', '1'); +``` + +## Expected Output Data + +```text ++-----------+ +| Customers | ++-----------+ +| Henry | +| Max | ++-----------+ +``` + +## SQL Solution + +```sql +SELECT name +FROM customers_183 +WHERE id NOT IN (SELECT DISTINCT customer_id + FROM orders_183); +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `name` from `customers`, `orders`. + +### Result Grain + +One row per customer that has no orders. + +### Step-by-Step Logic + +1. Apply row-level filtering in `WHERE`: id NOT IN (SELECT DISTINCT customer_id FROM orders_183). +2. 
Project final output columns: `name`. +3. The subquery's `DISTINCT` deduplicates customer ids before the `NOT IN` comparison; it is optional for correctness but keeps the comparison list small. + +### Why This Works + +Predicate filtering removes irrelevant rows before expensive downstream computation. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +The primary cost driver is subquery execution. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- `NOT IN` returns no rows at all if the subquery produces a NULL; prefer `NOT EXISTS` (or filter out NULLs) when the subquery column is nullable. +- If nullable columns are involved, validate null-handling explicitly (`IS NULL`, `COALESCE`). + diff --git a/easy/183. Customers Who Never Order.sql b/easy/183. Customers Who Never Order.sql deleted file mode 100644 index 2ff87d4..0000000 --- a/easy/183. Customers Who Never Order.sql +++ /dev/null @@ -1,4 +0,0 @@ -SELECT name -FROM customers_183 -WHERE id NOT IN (SELECT DISTINCT customer_id - FROM orders_183); diff --git a/easy/1853. Convert Date Format (Easy).md b/easy/1853. Convert Date Format (Easy).md new file mode 100644 index 0000000..41b4f12 --- /dev/null +++ b/easy/1853. Convert Date Format (Easy).md @@ -0,0 +1,67 @@ +# Question 1853: Convert Date Format + +**LeetCode URL:** https://leetcode.com/problems/convert-date-format/ + +## Description + +Write an SQL query to convert each date in Days into a string formatted as "day_name, month_name day, year". Return the result table in any order. The query result format is in the following example: Days table: +------------+ | day | +------------+ | 2022-04-12 | | 2021-08-09 | | 2020-06-26 | +------------+ Result table: +-------------------------+ | day | +-------------------------+ | Tuesday, April 12, 2022 | | Monday, August 9, 2021 | | Friday, June 26, 2020 | +-------------------------+ Please note that the output is case-sensitive. 
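
The target "day_name, month_name day, year" rendering can be reproduced outside the database as a cross-check. A small Python sketch (assuming an English/C locale for day and month names; the hypothetical helper name is ours):

```python
from datetime import date

def leetcode_1853_format(d: date) -> str:
    # "day_name, month_name day, year" with no zero padding on the day;
    # d.day is used directly because strftime's %d always zero-pads.
    return f"{d.strftime('%A')}, {d.strftime('%B')} {d.day}, {d.year}"

samples = [date(2022, 4, 12), date(2021, 8, 9), date(2020, 6, 26)]
formatted = [leetcode_1853_format(d) for d in samples]
print(formatted)
# ['Tuesday, April 12, 2022', 'Monday, August 9, 2021', 'Friday, June 26, 2020']
```

Note the analogous padding issue on the PostgreSQL side: plain `Day`/`Month` patterns in `TO_CHAR` are blank-padded to fixed width and `DD` is zero-padded, which is why fill-mode (`FM`) modifiers matter for matching the expected strings exactly.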
+ +## Table Schema Structure + +```sql +Create table If Not Exists Days (day date); +``` + +## Sample Input Data + +```sql +insert into Days (day) values ('2022-04-12'); +insert into Days (day) values ('2021-08-09'); +insert into Days (day) values ('2020-06-26'); +``` + +## Expected Output Data + +```text ++-------------------------+ +| day | ++-------------------------+ +| Tuesday, April 12, 2022 | +| Monday, August 9, 2021 | +| Friday, June 26, 2020 | ++-------------------------+ +``` + +## SQL Solution + +```sql +SELECT TO_CHAR(day,'FMDay, FMMonth FMDD, YYYY') AS day +FROM days_1853; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result column `day` from `days`. + +### Result Grain + +One row per input row in `days`. + +### Step-by-Step Logic + +1. Format each date with `TO_CHAR`; the `FM` (fill mode) prefix suppresses the blank padding that `Day` and `Month` otherwise receive and the leading zero on the day number. +2. Project the formatted string as `day`. + +### Why This Works + +Without `FM`, `TO_CHAR` pads day and month names to fixed widths (e.g. `'Monday   '`) and zero-pads `DD`, which would not match the case-sensitive expected output. + +### Performance Notes + +Primary cost is scanning source rows once; performance usually scales with input size. + +### Common Pitfalls + +- Check edge cases: empty input, single-row input, and duplicate keys. + diff --git a/easy/1853. Convert Date Format (Easy).sql b/easy/1853. Convert Date Format (Easy).sql deleted file mode 100644 index 01bcb44..0000000 --- a/easy/1853. Convert Date Format (Easy).sql +++ /dev/null @@ -1,2 +0,0 @@ -SELECT TO_CHAR(day,'Day, Month DD, YYYY') AS day -FROM days_1853; diff --git a/easy/1873. Calculate Special Bonus (Easy).md b/easy/1873. Calculate Special Bonus (Easy).md new file mode 100644 index 0000000..2973cc4 --- /dev/null +++ b/easy/1873. Calculate Special Bonus (Easy).md @@ -0,0 +1,74 @@ +# Question 1873: Calculate Special Bonus + +**LeetCode URL:** https://leetcode.com/problems/calculate-special-bonus/ + +## Description + +Write a solution to calculate the bonus of each employee. Return the result table ordered by employee_id. The result format is in the following example. 
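
The bonus rule is a single `CASE` expression — full salary for odd ids whose name does not start with 'M', otherwise 0. A hedged sqlite3 sketch of the same expression (table named `employees` rather than `employees_1873`; SQLite has no `LEFT()`, so `substr()` stands in):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE employees (employee_id INT, name TEXT, salary INT);
INSERT INTO employees VALUES (2,'Meir',3000),(3,'Michael',3800),
                             (7,'Addilyn',7400),(8,'Juan',6100),(9,'Kannon',7700);
""")
# CASE: pay the full salary only when the id is odd AND the first
# character of the name is not 'M'; every other row gets 0.
rows = con.execute("""
SELECT employee_id,
       CASE WHEN employee_id % 2 <> 0 AND substr(name, 1, 1) <> 'M'
            THEN salary ELSE 0 END AS bonus
FROM employees
ORDER BY employee_id
""").fetchall()
print(rows)  # [(2, 0), (3, 0), (7, 7400), (8, 0), (9, 7700)]
```

Michael (id 3) shows why both conditions matter: the id is odd, but the leading 'M' zeroes the bonus.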
+ +## Table Schema Structure + +```sql +Create table If Not Exists Employees (employee_id int, name varchar(30), salary int); +``` + +## Sample Input Data + +```sql +insert into Employees (employee_id, name, salary) values ('2', 'Meir', '3000'); +insert into Employees (employee_id, name, salary) values ('3', 'Michael', '3800'); +insert into Employees (employee_id, name, salary) values ('7', 'Addilyn', '7400'); +insert into Employees (employee_id, name, salary) values ('8', 'Juan', '6100'); +insert into Employees (employee_id, name, salary) values ('9', 'Kannon', '7700'); +``` + +## Expected Output Data + +```text ++-------------+-------+ +| employee_id | bonus | ++-------------+-------+ +| 2 | 0 | +| 3 | 0 | +| 7 | 7400 | +| 8 | 0 | +| 9 | 7700 | ++-------------+-------+ +``` + +## SQL Solution + +```sql +SELECT employee_id, + CASE WHEN employee_id%2 <> 0 AND LEFT(name,1)<>'M' THEN salary + ELSE 0 + END AS bonus +FROM employees_1873 +ORDER BY employee_id; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `employee_id`, `bonus` from `employees`. + +### Result Grain + +One row per employee. + +### Step-by-Step Logic + +1. Compute `bonus` with a `CASE` expression: the full salary when `employee_id` is odd and `name` does not start with 'M', otherwise 0. +2. Project final output columns: `employee_id`, `bonus`. +3. Order the result by `employee_id`, as the problem requires. + +### Why This Works + +The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost is scanning source rows once; performance usually scales with input size. + +### Common Pitfalls + +- Check edge cases: empty input, single-row input, and duplicate keys. + diff --git a/easy/1873. Calculate Special Bonus (Easy).sql b/easy/1873. Calculate Special Bonus (Easy).sql deleted file mode 100644 index 9c0c21b..0000000 --- a/easy/1873. Calculate Special Bonus (Easy).sql +++ /dev/null @@ -1,5 +0,0 @@ -SELECT employee_id, - CASE WHEN employee_id%2 <> 0 AND LEFT(name,1)<>'M' THEN salary - ELSE 0 - END AS bonus -FROM employees_1873; diff --git a/easy/1890. 
The Latest Login in 2020 (Easy).md new file mode 100644 index 0000000..028f73e --- /dev/null +++ b/easy/1890. The Latest Login in 2020 (Easy).md @@ -0,0 +1,78 @@ +# Question 1890: The Latest Login in 2020 + +**LeetCode URL:** https://leetcode.com/problems/the-latest-login-in-2020/ + +## Description + +Write a solution to report the latest login for all users in the year 2020. Return the result table in any order. The result format is in the following example. + +## Table Schema Structure + +```sql +Create table If Not Exists Logins (user_id int, time_stamp datetime); +``` + +## Sample Input Data + +```sql +insert into Logins (user_id, time_stamp) values ('6', '2020-06-30 15:06:07'); +insert into Logins (user_id, time_stamp) values ('6', '2021-04-21 14:06:06'); +insert into Logins (user_id, time_stamp) values ('6', '2019-03-07 00:18:15'); +insert into Logins (user_id, time_stamp) values ('8', '2020-02-01 05:10:53'); +insert into Logins (user_id, time_stamp) values ('8', '2020-12-30 00:46:50'); +insert into Logins (user_id, time_stamp) values ('2', '2020-01-16 02:49:50'); +insert into Logins (user_id, time_stamp) values ('2', '2019-08-25 07:59:08'); +insert into Logins (user_id, time_stamp) values ('14', '2019-07-14 09:00:00'); +insert into Logins (user_id, time_stamp) values ('14', '2021-01-06 11:59:59'); +``` + +## Expected Output Data + +```text ++---------+---------------------+ +| user_id | last_stamp | ++---------+---------------------+ +| 6 | 2020-06-30 15:06:07 | +| 8 | 2020-12-30 00:46:50 | +| 2 | 2020-01-16 02:49:50 | ++---------+---------------------+ +``` + +## SQL Solution + +```sql +SELECT user_id,MAX(time_stamp) AS last_stamp +FROM logins_1890 +WHERE EXTRACT(YEAR FROM time_stamp) = 2020 +GROUP BY user_id; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `user_id`, `last_stamp` from `logins`. + +### Result Grain + +One row per unique key in `GROUP BY user_id`. + +### Step-by-Step Logic + +1. 
Apply row-level filtering in `WHERE`: EXTRACT(YEAR FROM time_stamp) = 2020. +2. Aggregate rows with MAX grouped by user_id. +3. Project final output columns: `user_id`, `last_stamp`. + +### Why This Works + +Grouping keys define the reporting grain; aggregate functions then summarize values at exactly that grain. Predicate filtering removes irrelevant rows before expensive downstream computation. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are sorting/grouping. Indexes on join/filter/group keys typically provide the biggest speedup. Note that wrapping the column in `EXTRACT(YEAR FROM time_stamp)` defeats a plain index on `time_stamp`; a sargable range predicate (`time_stamp >= '2020-01-01' AND time_stamp < '2021-01-01'`) allows index use. + +### Common Pitfalls + +- Every non-aggregated selected column must belong to the grouping grain. +- If nullable columns are involved, validate null-handling explicitly (`IS NULL`, `COALESCE`). + diff --git a/easy/1890. The Latest Login in 2020 (Easy).sql b/easy/1890. The Latest Login in 2020 (Easy).sql deleted file mode 100644 index 83fe315..0000000 --- a/easy/1890. The Latest Login in 2020 (Easy).sql +++ /dev/null @@ -1,4 +0,0 @@ -SELECT user_id,MAX(time_stamp) AS last_stamp -FROM logins_1890 -WHERE EXTRACT(YEAR FROM time_stamp) = 2020 -GROUP BY user_id; diff --git a/easy/1939. Users That Actively Request Confirmation Messages (Easy).md b/easy/1939. Users That Actively Request Confirmation Messages (Easy).md new file mode 100644 index 0000000..e70962d --- /dev/null +++ b/easy/1939. Users That Actively Request Confirmation Messages (Easy).md @@ -0,0 +1,80 @@ +# Question 1939: Users That Actively Request Confirmation Messages + +**LeetCode URL:** https://leetcode.com/problems/users-that-actively-request-confirmation-messages/ + +## Description + +Write an SQL query to report the IDs of the users that requested a confirmation message twice within a 24-hour window; two messages exactly 24 hours apart count as within the window. Return the result table in any order. 
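
The core pattern here is a time-window self-join: pair each confirmation with any later confirmation by the same user, then keep pairs at most 24 hours apart. A hedged sqlite3 sketch of the same logic over the sample rows (table named `confirmations`; SQLite has no `EXTRACT(EPOCH ...)`, so `julianday()` differences stand in, and `DISTINCT` is added so each user appears once):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE confirmations (user_id INT, time_stamp TEXT, action TEXT);
INSERT INTO confirmations VALUES
 (3,'2021-01-06 03:30:46','timeout'),  (3,'2021-01-06 03:37:45','timeout'),
 (7,'2021-06-12 11:57:29','confirmed'),(7,'2021-06-13 11:57:30','confirmed'),
 (2,'2021-01-22 00:00:00','confirmed'),(2,'2021-01-23 00:00:00','timeout'),
 (6,'2021-10-23 14:14:14','confirmed'),(6,'2021-10-24 14:14:13','timeout');
""")
# Self-join on user_id, pairing each row with every strictly later row,
# and keep pairs no more than 86400 seconds (24 hours) apart.
rows = con.execute("""
SELECT DISTINCT c1.user_id
FROM confirmations c1
JOIN confirmations c2
  ON c1.user_id = c2.user_id AND c1.time_stamp < c2.time_stamp
WHERE (julianday(c2.time_stamp) - julianday(c1.time_stamp)) * 86400 <= 86400
ORDER BY c1.user_id
""").fetchall()
print(rows)  # [(2,), (3,), (6,)]
```

User 7 is the instructive exclusion: the two confirmations are 24 hours and one second apart, just outside the window, while user 2's pair lands exactly on the 24-hour boundary and is kept.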
+ +## Table Schema Structure + +```sql +Create table If Not Exists Signups (user_id int, time_stamp datetime); +Create table If Not Exists Confirmations (user_id int, time_stamp datetime, action ENUM('confirmed','timeout')); +``` + +## Sample Input Data + +```sql +insert into Signups (user_id, time_stamp) values ('3', '2020-03-21 10:16:13'); +insert into Signups (user_id, time_stamp) values ('7', '2020-01-04 13:57:59'); +insert into Signups (user_id, time_stamp) values ('2', '2020-07-29 23:09:44'); +insert into Signups (user_id, time_stamp) values ('6', '2020-12-09 10:39:37'); +insert into Confirmations (user_id, time_stamp, action) values ('3', '2021-01-06 03:30:46', 'timeout'); +insert into Confirmations (user_id, time_stamp, action) values ('3', '2021-01-06 03:37:45', 'timeout'); +insert into Confirmations (user_id, time_stamp, action) values ('7', '2021-06-12 11:57:29', 'confirmed'); +insert into Confirmations (user_id, time_stamp, action) values ('7', '2021-06-13 11:57:30', 'confirmed'); +insert into Confirmations (user_id, time_stamp, action) values ('2', '2021-01-22 00:00:00', 'confirmed'); +insert into Confirmations (user_id, time_stamp, action) values ('2', '2021-01-23 00:00:00', 'timeout'); +insert into Confirmations (user_id, time_stamp, action) values ('6', '2021-10-23 14:14:14', 'confirmed'); +insert into Confirmations (user_id, time_stamp, action) values ('6', '2021-10-24 14:14:13', 'timeout'); +``` + +## Expected Output Data + +```text ++---------+ +| user_id | ++---------+ +| 2 | +| 3 | +| 6 | ++---------+ +``` + +## SQL Solution + +```sql +SELECT c1.user_id +FROM confirmations_1939 c1 +INNER JOIN confirmations_1939 c2 ON c1.user_id = c2.user_id AND c1.time_stamp < c2.time_stamp +WHERE EXTRACT(EPOCH FROM (c2.time_stamp-c1.time_stamp)) <= 24*60*60; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result column `user_id` from `confirmations`. + +### Result Grain + +One row per qualifying (earlier, later) confirmation pair; a `DISTINCT` would be needed to guarantee one row per user if a user had more than two confirmations inside a window.
+ +### Step-by-Step Logic + +1. Self-join `confirmations_1939` to itself (INNER JOIN), pairing each confirmation with every strictly later confirmation by the same user. Join predicates control row matching and prevent accidental cartesian growth. +2. Apply row-level filtering in `WHERE`: EXTRACT(EPOCH FROM (c2.time_stamp-c1.time_stamp)) <= 24*60*60. +3. Project final output columns: `user_id`. + +### Why This Works + +Join conditions align related entities so each output row is built from the correct source records. Predicate filtering removes irrelevant rows before expensive downstream computation. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are join operations. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently. +- Without `DISTINCT`, a user with more than two confirmations inside the window is reported multiple times. +- If nullable columns are involved, validate null-handling explicitly (`IS NULL`, `COALESCE`). + diff --git a/easy/1939. Users That Actively Request Confirmation Messages (Easy).sql b/easy/1939. Users That Actively Request Confirmation Messages (Easy).sql deleted file mode 100644 index d37c51c..0000000 --- a/easy/1939. Users That Actively Request Confirmation Messages (Easy).sql +++ /dev/null @@ -1,4 +0,0 @@ -SELECT c1.user_id -FROM confirmations_1939 c1 -INNER JOIN confirmations_1939 c2 ON c1.user_id = c2.user_id AND c1.time_stamp < c2.time_stamp -WHERE EXTRACT(EPOCH FROM (c2.time_stamp-c1.time_stamp)) <= 24*60*60; diff --git a/easy/196. Delete Duplicate Emails.md b/easy/196. Delete Duplicate Emails.md new file mode 100644 index 0000000..eb1ce75 --- /dev/null +++ b/easy/196. Delete Duplicate Emails.md @@ -0,0 +1,74 @@ +# Question 196: Delete Duplicate Emails + +**LeetCode URL:** https://leetcode.com/problems/delete-duplicate-emails/ + +## Description + +Write a solution to delete all duplicate emails, keeping only one unique email with the smallest id. The result format is in the following example. 
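
The CTE-plus-DELETE shape used in this solution also runs outside PostgreSQL; a hedged sqlite3 sketch (table named `person` rather than `person_196_ans`):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE person (id INT, email TEXT);
INSERT INTO person VALUES (1,'john@example.com'),(2,'bob@example.com'),
                          (3,'john@example.com');
""")
# The CTE collects every id that has a lower-id row with the same email;
# the DELETE then removes exactly those rows, keeping the smallest id.
con.execute("""
WITH duplicate_id_higher AS (
    SELECT p1.id AS higher_id
    FROM person p1
    JOIN person p2 ON p1.email = p2.email AND p1.id > p2.id
)
DELETE FROM person WHERE id IN (SELECT higher_id FROM duplicate_id_higher)
""")
rows = con.execute("SELECT id, email FROM person ORDER BY id").fetchall()
print(rows)  # [(1, 'john@example.com'), (2, 'bob@example.com')]
```

The `p1.id > p2.id` predicate is the key: for every duplicate group it flags each row except the one with the minimum id.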
+ +## Table Schema Structure + +```sql +Create table If Not Exists Person (Id int, Email varchar(255)); +``` + +## Sample Input Data + +```sql +insert into Person (id, email) values ('1', 'john@example.com'); +insert into Person (id, email) values ('2', 'bob@example.com'); +insert into Person (id, email) values ('3', 'john@example.com'); +``` + +## Expected Output Data + +```text ++----+------------------+ +| id | email            | ++----+------------------+ +| 1  | john@example.com | +| 2  | bob@example.com  | ++----+------------------+ +``` + +## SQL Solution + +```sql +WITH duplicate_id_higher AS ( + SELECT p1.id AS higher_id FROM person_196_ans p1 JOIN person_196_ans p2 ON p1.email = p2.email AND p1.id > p2.id +) + +DELETE FROM person_196_ans WHERE id IN (SELECT higher_id FROM duplicate_id_higher); +``` + +## Solution Breakdown + +### Goal + +The statement deletes from `person_196_ans` every row whose `id` was collected into the CTE `duplicate_id_higher`, leaving one row per email. + +### Result Grain + +The statement returns no rows; it mutates `person_196_ans` in place. + +### Step-by-Step Logic + +1. Create CTE layers (`duplicate_id_higher`) to decompose the logic into smaller, testable steps before the final DELETE. +2. CTE `duplicate_id_higher`: reads `person_196_ans`, joining each row to every row with the same email and a smaller id, so only the non-smallest ids survive. +3. Combine datasets using a self JOIN. Join predicates control row matching and prevent accidental cartesian growth. +4. Apply row-level filtering in `WHERE`: id IN (SELECT higher_id FROM duplicate_id_higher). +5. Delete the flagged rows, keeping the row with the smallest id for each email. + +### Why This Works + +CTEs separate transformation stages, which makes dependencies explicit and easier to validate. Join conditions align related entities so each output row is built from the correct source records. Predicate filtering removes irrelevant rows before expensive downstream computation. The DELETE then removes exactly the flagged duplicates. + +### Performance Notes + +Primary cost drivers are join operations, subquery execution.
Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently. +- If nullable columns are involved, validate null-handling explicitly (`IS NULL`, `COALESCE`). + diff --git a/easy/196. Delete Duplicate Emails.sql b/easy/196. Delete Duplicate Emails.sql deleted file mode 100644 index 2312b71..0000000 --- a/easy/196. Delete Duplicate Emails.sql +++ /dev/null @@ -1,5 +0,0 @@ -WITH duplicate_id_higher AS ( - SELECT p1.id AS higher_id FROM person_196_ans p1 JOIN person_196_ans p2 ON p1.email = p2.email AND p1.id > p2.id -) - -DELETE FROM person_196_ans WHERE id IN (SELECT higher_id FROM duplicate_id_higher); diff --git a/easy/1965. Employees With Missing Information (Easy).md b/easy/1965. Employees With Missing Information (Easy).md new file mode 100644 index 0000000..02f9276 --- /dev/null +++ b/easy/1965. Employees With Missing Information (Easy).md @@ -0,0 +1,73 @@ +# Question 1965: Employees With Missing Information + +**LeetCode URL:** https://leetcode.com/problems/employees-with-missing-information/ + +## Description + +Write a solution to report the IDs of all the employees with missing information. Return the result table ordered by employee_id in ascending order. The result format is in the following example. 
+ +## Table Schema Structure + +```sql +Create table If Not Exists Employees (employee_id int, name varchar(30)); +Create table If Not Exists Salaries (employee_id int, salary int); +``` + +## Sample Input Data + +```sql +insert into Employees (employee_id, name) values ('2', 'Crew'); +insert into Employees (employee_id, name) values ('4', 'Haven'); +insert into Employees (employee_id, name) values ('5', 'Kristian'); +insert into Salaries (employee_id, salary) values ('5', '76071'); +insert into Salaries (employee_id, salary) values ('1', '22517'); +insert into Salaries (employee_id, salary) values ('4', '63539'); +``` + +## Expected Output Data + +```text ++-------------+ +| employee_id | ++-------------+ +| 1           | +| 2           | ++-------------+ +``` + +## SQL Solution + +```sql +SELECT COALESCE(e.employee_id,s.employee_id) AS employee_id +FROM employees_1965 e +FULL OUTER JOIN salaries_1965 s ON e.employee_id = s.employee_id +WHERE e.name IS NULL OR s.salary IS NULL +ORDER BY employee_id; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result column `employee_id` from `employees` and `salaries`. + +### Result Grain + +Row grain follows the post-filter join output. + +### Step-by-Step Logic + +1. Combine datasets using a FULL OUTER JOIN, so employees without a salary row and salary rows without an employee both survive. Join predicates control row matching and prevent accidental cartesian growth. +2. Apply row-level filtering in `WHERE`: e.name IS NULL OR s.salary IS NULL. + +### Why This Works + +Join conditions align related entities so each output row is built from the correct source records. Predicate filtering removes irrelevant rows before expensive downstream computation. + +### Performance Notes + +Primary cost drivers are join operations. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently. + diff --git a/easy/1965. Employees With Missing Information (Easy).sql b/easy/1965.
Employees With Missing Information (Easy).sql deleted file mode 100644 index 68a2e04..0000000 --- a/easy/1965. Employees With Missing Information (Easy).sql +++ /dev/null @@ -1,4 +0,0 @@ -SELECT COALESCE(e.employee_id,s.employee_id) -FROM employees_1965 e -FULL OUTER JOIN salaries_1965 s ON e.employee_id = s.employee_id -WHERE e.name IS NULL OR s.salary IS NULL; diff --git a/easy/197. Rising Temperature.md b/easy/197. Rising Temperature.md new file mode 100644 index 0000000..4d0cdeb --- /dev/null +++ b/easy/197. Rising Temperature.md @@ -0,0 +1,69 @@ +# Question 197: Rising Temperature + +**LeetCode URL:** https://leetcode.com/problems/rising-temperature/ + +## Description + +Write a solution to find all dates' id with higher temperatures compared to its previous dates (yesterday). Return the result table in any order. The result format is in the following example. + +## Table Schema Structure + +```sql +Create table If Not Exists Weather (id int, recordDate date, temperature int); +``` + +## Sample Input Data + +```sql +insert into Weather (id, recordDate, temperature) values ('1', '2015-01-01', '10'); +insert into Weather (id, recordDate, temperature) values ('2', '2015-01-02', '25'); +insert into Weather (id, recordDate, temperature) values ('3', '2015-01-03', '20'); +insert into Weather (id, recordDate, temperature) values ('4', '2015-01-04', '30'); +``` + +## Expected Output Data + +```text ++----+ +| id | ++----+ +| 2 | +| 4 | ++----+ +``` + +## SQL Solution + +```sql +SELECT w1.id +FROM weather_197 w1 +JOIN weather_197 w2 ON w1.record_date-1=w2.record_date AND w1.temperature>w2.temperature; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `id` from `weather`. + +### Result Grain + +Row grain follows the post-filter join output. + +### Step-by-Step Logic + +1. Combine datasets using JOIN. Join predicates control row matching and prevent accidental cartesian growth. +2. Project final output columns: `id`. 
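The previous-day self-join above can be emulated in a few lines of Python (an illustration over the sample rows, not repository code):

```python
from datetime import date, timedelta

# Stand-in for the Weather table: (id, record_date, temperature).
weather = [
    (1, date(2015, 1, 1), 10),
    (2, date(2015, 1, 2), 25),
    (3, date(2015, 1, 3), 20),
    (4, date(2015, 1, 4), 30),
]

# Join each day to the previous calendar day and keep rows that got warmer,
# mirroring w1.record_date - 1 = w2.record_date AND w1.temperature > w2.temperature.
rising = [
    w1_id
    for w1_id, d1, t1 in weather
    for _, d2, t2 in weather
    if d1 - timedelta(days=1) == d2 and t1 > t2
]
print(rising)  # [2, 4]
```

Subtracting one from a date value works here because the join is on exact calendar-day adjacency, so gaps in the date sequence naturally produce no match.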
+ +### Why This Works + +Join conditions align related entities so each output row is built from the correct source records. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are join operations. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently. + diff --git a/easy/197. Rising Temperature.sql b/easy/197. Rising Temperature.sql deleted file mode 100644 index 5e81c8a..0000000 --- a/easy/197. Rising Temperature.sql +++ /dev/null @@ -1,3 +0,0 @@ -SELECT w1.id -FROM weather_197 w1 -JOIN weather_197 w2 ON w1.record_date-1=w2.record_date AND w1.temperature>w2.temperature; diff --git a/easy/1978. Employees Whose Manager Left the Company (Easy).md b/easy/1978. Employees Whose Manager Left the Company (Easy).md new file mode 100644 index 0000000..758e5de --- /dev/null +++ b/easy/1978. Employees Whose Manager Left the Company (Easy).md @@ -0,0 +1,74 @@ +# Question 1978: Employees Whose Manager Left the Company + +**LeetCode URL:** https://leetcode.com/problems/employees-whose-manager-left-the-company/ + +## Description + +Find the IDs of the employees whose salary is strictly less than $30000 and whose manager left the company. When a manager leaves, their row is deleted from the Employees table, but their reports still have their manager_id set to the manager that left. Return the result table ordered by employee_id. The result format is in the following example.
+ +## Table Schema Structure + +```sql +Create table If Not Exists Employees (employee_id int, name varchar(20), manager_id int, salary int); +``` + +## Sample Input Data + +```sql +insert into Employees (employee_id, name, manager_id, salary) values ('3', 'Mila', '9', '60301'); +insert into Employees (employee_id, name, manager_id, salary) values ('12', 'Antonella', NULL, '31000'); +insert into Employees (employee_id, name, manager_id, salary) values ('13', 'Emery', NULL, '67084'); +insert into Employees (employee_id, name, manager_id, salary) values ('1', 'Kalel', '11', '21241'); +insert into Employees (employee_id, name, manager_id, salary) values ('9', 'Mikaela', NULL, '50937'); +insert into Employees (employee_id, name, manager_id, salary) values ('11', 'Joziah', '6', '28485'); +``` + +## Expected Output Data + +```text ++-------------+ +| employee_id | ++-------------+ +| 11          | ++-------------+ +``` + +## SQL Solution + +```sql +SELECT DISTINCT e1.employee_id +FROM employees_1978 e1 +LEFT JOIN employees_1978 e2 +ON e1.manager_id = e2.employee_id +WHERE e1.salary < 30000 AND e2.employee_id IS NULL +ORDER BY e1.employee_id; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result column `employee_id` from `employees`. + +### Result Grain + +One row per distinct combination of projected columns. + +### Step-by-Step Logic + +1. Combine the table with itself using a LEFT JOIN from each employee to their manager's row; paired with the `IS NULL` filter this becomes an anti-join. Join predicates control row matching and prevent accidental cartesian growth. +2. Apply row-level filtering in `WHERE`: e1.salary < 30000 AND e2.employee_id IS NULL. +3. Project final output columns: `employee_id`. +4. Remove duplicate result tuples using `DISTINCT` where uniqueness is required. + +### Why This Works + +Join conditions align related entities so each output row is built from the correct source records. Predicate filtering removes irrelevant rows before expensive downstream computation. The final projection exposes only the columns required by the result contract.
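The LEFT-JOIN-plus-`IS NULL` anti-join can be mirrored in Python with a set membership test — an illustrative sketch over the sample rows, not repository code:

```python
# Stand-in for the Employees table: (employee_id, name, manager_id, salary);
# manager_id is None where the SQL column is NULL.
employees = [
    (3, "Mila", 9, 60301),
    (12, "Antonella", None, 31000),
    (13, "Emery", None, 67084),
    (1, "Kalel", 11, 21241),
    (9, "Mikaela", None, 50937),
    (11, "Joziah", 6, 28485),
]

existing_ids = {e_id for e_id, _, _, _ in employees}

# LEFT JOIN ... WHERE e2.employee_id IS NULL keeps exactly the rows whose
# manager_id matches no current employee; None never matches, just like NULL.
result = sorted({
    e_id
    for e_id, _, mgr, salary in employees
    if salary < 30000 and mgr not in existing_ids
})
print(result)  # [11]
```

Kalel also earns under 30000, but his manager (id 11, Joziah) is still present, so the anti-join drops him.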
+ +### Performance Notes + +Primary cost drivers are join operations. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently. + diff --git a/easy/1978. Employees Whose Manager Left the Company (Easy).sql b/easy/1978. Employees Whose Manager Left the Company (Easy).sql deleted file mode 100644 index 281f217..0000000 --- a/easy/1978. Employees Whose Manager Left the Company (Easy).sql +++ /dev/null @@ -1,5 +0,0 @@ -SELECT DISTINCT e1.employee_id -FROM employees_1978 e1 -LEFT JOIN employees_1978 e2 -ON e1.manager_id = e2.employee_id -WHERE e1.salary < 30000 AND e2.employee_id IS NULL; diff --git a/easy/2026. Low-Quality Problems (Easy).md b/easy/2026. Low-Quality Problems (Easy).md new file mode 100644 index 0000000..d6fbda3 --- /dev/null +++ b/easy/2026. Low-Quality Problems (Easy).md @@ -0,0 +1,72 @@ +# Question 2026: Low-Quality Problems + +**LeetCode URL:** https://leetcode.com/problems/low-quality-problems/ + +## Description + +Report the IDs of the low-quality problems. A problem is low-quality if its like percentage, likes * 100.0 / (likes + dislikes), is strictly less than 60. Return the result table ordered by problem_id in ascending order.
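The 60% like-percentage threshold above can be sanity-checked with a small Python sketch over the sample rows (an illustration, not repository code):

```python
# Stand-in for the Problems table: (problem_id, likes, dislikes).
problems = [
    (6, 1290, 425), (11, 2677, 8659), (1, 4446, 2760),
    (7, 8569, 6086), (13, 2050, 4164), (10, 9002, 7446),
]

# likes * 100.0 / (likes + dislikes) < 60 -- float math avoids the integer
# truncation that likes * 100 // (likes + dislikes) would introduce.
low_quality = sorted(
    pid for pid, likes, dislikes in problems
    if likes * 100.0 / (likes + dislikes) < 60
)
print(low_quality)  # [7, 10, 11, 13]
```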
+ +## Table Schema Structure + +```sql +Create table If Not Exists Problems (problem_id int, likes int, dislikes int); +``` + +## Sample Input Data + +```sql +insert into Problems (problem_id, likes, dislikes) values ('6', '1290', '425'); +insert into Problems (problem_id, likes, dislikes) values ('11', '2677', '8659'); +insert into Problems (problem_id, likes, dislikes) values ('1', '4446', '2760'); +insert into Problems (problem_id, likes, dislikes) values ('7', '8569', '6086'); +insert into Problems (problem_id, likes, dislikes) values ('13', '2050', '4164'); +insert into Problems (problem_id, likes, dislikes) values ('10', '9002', '7446'); +``` + +## Expected Output Data + +```text ++------------+ +| problem_id | ++------------+ +| 7          | +| 10         | +| 11         | +| 13         | ++------------+ +``` + +## SQL Solution + +```sql +SELECT problem_id +FROM problems_2026 +WHERE (likes*100.0/(likes+dislikes)) < 60 +ORDER BY problem_id; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result column `problem_id` from `problems`. + +### Result Grain + +One row per problem that passes the filter. + +### Step-by-Step Logic + +1. Apply row-level filtering in `WHERE`: (likes*100.0/(likes+dislikes)) < 60. +2. Project final output columns: `problem_id`. +3. Order output deterministically with `ORDER BY problem_id`. + +### Why This Works + +Predicate filtering removes irrelevant rows before expensive downstream computation. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are sorting/grouping. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- If nullable columns are involved, validate null-handling explicitly (`IS NULL`, `COALESCE`). + diff --git a/easy/2026. Low-Quality Problems (Easy).sql b/easy/2026. Low-Quality Problems (Easy).sql deleted file mode 100644 index 802ed81..0000000 --- a/easy/2026.
Low-Quality Problems (Easy).sql +++ /dev/null @@ -1,4 +0,0 @@ -SELECT problem_id -FROM problems_2026 -WHERE (likes*100.0/(likes+dislikes)) < 60 -ORDER BY problem_id; diff --git a/easy/2072. The Winner University (Easy).md b/easy/2072. The Winner University (Easy).md new file mode 100644 index 0000000..483012b --- /dev/null +++ b/easy/2072. The Winner University (Easy).md @@ -0,0 +1,86 @@ +# Question 2072: The Winner University + +**LeetCode URL:** https://leetcode.com/problems/the-winner-university/ + +## Description + +Report the winner university: 'New York University' if it has strictly more excellent students (students scoring at least 90) than California University, 'California University' if it has strictly more, and 'No Winner' if the counts are equal. + +## Table Schema Structure + +```sql +Create table If Not Exists NewYork (student_id int, score int); +Create table If Not Exists California (student_id int, score int); +``` + +## Sample Input Data + +```sql +insert into NewYork (student_id, score) values ('1', '90'); +insert into NewYork (student_id, score) values ('2', '87'); +insert into California (student_id, score) values ('2', '89'); +insert into California (student_id, score) values ('3', '88'); +``` + +## Expected Output Data + +```text ++---------------------+ +| winner              | ++---------------------+ +| New York University | ++---------------------+ +``` + +## SQL Solution + +```sql +WITH ny AS ( + SELECT COUNT(student_id) AS ny_cnt + FROM newyork_2072 + WHERE score >= 90 +), +cf AS ( + SELECT COUNT(student_id) AS cf_cnt + FROM california_2072 + WHERE score >= 90 +) +SELECT + CASE WHEN ny_cnt > cf_cnt THEN 'New York University' + WHEN ny_cnt < cf_cnt THEN 'California University' + ELSE 'No Winner' + END AS winner +FROM ny +INNER JOIN cf ON true; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result column `winner` from `newyork` and `california` via the CTEs `ny` and `cf`.
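The two-CTE-plus-CASE structure can be mirrored in Python — an illustrative sketch over the sample rows, not repository code:

```python
# Stand-ins for the NewYork and California tables: (student_id, score).
newyork = [(1, 90), (2, 87)]
california = [(2, 89), (3, 88)]

# Each CTE reduces one table to a single excellent-student count (score >= 90).
ny_cnt = sum(1 for _, score in newyork if score >= 90)
cf_cnt = sum(1 for _, score in california if score >= 90)

# The CASE expression compares the two scalar counts.
if ny_cnt > cf_cnt:
    winner = "New York University"
elif ny_cnt < cf_cnt:
    winner = "California University"
else:
    winner = "No Winner"
print(winner)  # New York University
```

Because each CTE collapses to exactly one row, the `INNER JOIN ... ON true` cross join in the SQL simply places both counts side by side for the comparison.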
+ +### Result Grain + +A single row: the cross join of two one-row aggregates. + +### Step-by-Step Logic + +1. Create CTE layers (`ny`, `cf`) to decompose the logic into smaller, testable steps before the final SELECT. +2. CTE `ny`: counts `newyork` students with score >= 90 into `ny_cnt`. +3. CTE `cf`: does the same for `california` into `cf_cnt`. +4. Combine the two single-row CTEs with a cross join (`INNER JOIN ... ON true`), making both counts available to one CASE expression. +5. Project final output columns: `winner`. + +### Why This Works + +CTEs separate transformation stages, which makes dependencies explicit and easier to validate. Join conditions align related entities so each output row is built from the correct source records. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are join operations. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently. + diff --git a/easy/2072. The Winner University (Easy).sql b/easy/2072. The Winner University (Easy).sql deleted file mode 100644 index 48083e0..0000000 --- a/easy/2072. The Winner University (Easy).sql +++ /dev/null @@ -1,17 +0,0 @@ -WITH ny AS ( - SELECT COUNT(student_id) AS ny_cnt - FROM newyork_2072 - WHERE score >= 90 -), -cf AS ( - SELECT COUNT(student_id) AS cf_cnt - FROM california_2072 - WHERE score >= 90 -) -SELECT - CASE WHEN ny_cnt > cf_cnt THEN 'New York University' - WHEN ny_cnt < cf_cnt THEN 'California University' - ELSE 'No Winner' - END AS winner -FROM ny -INNER JOIN cf ON true; diff --git a/easy/2082. The Number of Rich Customers (Easy).md b/easy/2082. The Number of Rich Customers (Easy).md new file mode 100644 index 0000000..aacb288 --- /dev/null +++ b/easy/2082.
The Number of Rich Customers (Easy).md @@ -0,0 +1,69 @@ +# Question 2082: The Number of Rich Customers + +**LeetCode URL:** https://leetcode.com/problems/the-number-of-rich-customers/ + +## Description + +Report the number of customers who had at least one bill with an amount strictly greater than 500. + +## Table Schema Structure + +```sql +Create table If Not Exists Store (bill_id int, customer_id int, amount int); +``` + +## Sample Input Data + +```sql +insert into Store (bill_id, customer_id, amount) values ('6', '1', '549'); +insert into Store (bill_id, customer_id, amount) values ('8', '1', '834'); +insert into Store (bill_id, customer_id, amount) values ('4', '2', '394'); +insert into Store (bill_id, customer_id, amount) values ('11', '3', '657'); +insert into Store (bill_id, customer_id, amount) values ('13', '3', '257'); +``` + +## Expected Output Data + +```text ++------------+ +| rich_count | ++------------+ +| 2          | ++------------+ +``` + +## SQL Solution + +```sql +SELECT COUNT(DISTINCT customer_id) AS rich_count +FROM store_2082 +WHERE amount > 500; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result column `rich_count` from `store`. + +### Result Grain + +A single aggregate row. + +### Step-by-Step Logic + +1. Apply row-level filtering in `WHERE`: amount > 500. +2. Project final output columns: `rich_count`. + +### Why This Works + +Predicate filtering removes irrelevant rows before expensive downstream computation. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost is scanning source rows once; performance usually scales with input size. + +### Common Pitfalls + +- If nullable columns are involved, validate null-handling explicitly (`IS NULL`, `COALESCE`). + diff --git a/easy/2082.
The Number of Rich Customers (Easy).sql deleted file mode 100644 index 91973bb..0000000 --- a/easy/2082. The Number of Rich Customers (Easy).sql +++ /dev/null @@ -1,3 +0,0 @@ -SELECT COUNT(DISTINCT customer_id) AS rich_count -FROM store_2082 -WHERE amount > 500; diff --git a/easy/2205. The Number of Users That Are Eligible for Discount (Easy).md b/easy/2205. The Number of Users That Are Eligible for Discount (Easy).md new file mode 100644 index 0000000..2b77bfd --- /dev/null +++ b/easy/2205. The Number of Users That Are Eligible for Discount (Easy).md @@ -0,0 +1,68 @@ +# Question 2205: The Number of Users That Are Eligible for Discount + +**LeetCode URL:** https://leetcode.com/problems/the-number-of-users-that-are-eligible-for-discount/ + +## Description + +Report the number of users who purchased at least one item with an amount of at least 1000 between 2022-03-08 and 2022-03-20 inclusive. + +## Table Schema Structure + +```sql +Create table If Not Exists Purchases (user_id int, time_stamp datetime, amount int); +``` + +## Sample Input Data + +```sql +insert into Purchases (user_id, time_stamp, amount) values ('1', '2022-04-20 09:03:00', '4416'); +insert into Purchases (user_id, time_stamp, amount) values ('2', '2022-03-19 19:24:02', '678'); +insert into Purchases (user_id, time_stamp, amount) values ('3', '2022-03-18 12:03:09', '4523'); +insert into Purchases (user_id, time_stamp, amount) values ('3', '2022-03-30 09:43:42', '626'); +``` + +## Expected Output Data + +```text ++----------+ +| user_cnt | ++----------+ +| 1        | ++----------+ +``` + +## SQL Solution + +```sql +SELECT COUNT(DISTINCT user_id) AS user_cnt +FROM purchases_2205 +WHERE (DATE_TRUNC('DAY',time_stamp) BETWEEN '2022-03-08' AND '2022-03-20') AND amount >= 1000; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result column `user_cnt` from `purchases`.
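The date-window count above can be emulated in Python by truncating each timestamp to its calendar day before the range check — an illustrative sketch over the sample rows, not repository code:

```python
from datetime import datetime, date

# Stand-in for the Purchases table: (user_id, time_stamp, amount).
purchases = [
    (1, "2022-04-20 09:03:00", 4416),
    (2, "2022-03-19 19:24:02", 678),
    (3, "2022-03-18 12:03:09", 4523),
    (3, "2022-03-30 09:43:42", 626),
]

start, end = date(2022, 3, 8), date(2022, 3, 20)

# DATE_TRUNC('DAY', ...) BETWEEN ... collapses each timestamp to its calendar
# day; the set deduplicates users like COUNT(DISTINCT user_id).
eligible = {
    user_id
    for user_id, ts, amount in purchases
    if start <= datetime.fromisoformat(ts).date() <= end and amount >= 1000
}
print(len(eligible))  # 1
```

Only user 3's 2022-03-18 purchase satisfies both the window and the 1000 threshold, so the count is 1.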
+ +### Result Grain + +A single aggregate row. + +### Step-by-Step Logic + +1. Apply row-level filtering in `WHERE`: (DATE_TRUNC('DAY',time_stamp) BETWEEN '2022-03-08' AND '2022-03-20') AND amount >= 1000. +2. Project final output columns: `user_cnt`. + +### Why This Works + +Predicate filtering removes irrelevant rows before expensive downstream computation. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost is scanning source rows once; performance usually scales with input size. + +### Common Pitfalls + +- If nullable columns are involved, validate null-handling explicitly (`IS NULL`, `COALESCE`). + diff --git a/easy/2205. The Number of Users That Are Eligible for Discount (Easy).sql b/easy/2205. The Number of Users That Are Eligible for Discount (Easy).sql deleted file mode 100644 index 00e5d6d..0000000 --- a/easy/2205. The Number of Users That Are Eligible for Discount (Easy).sql +++ /dev/null @@ -1,3 +0,0 @@ -SELECT COUNT(DISTINCT user_id) AS user_cnt -FROM purchases_2205 -WHERE (DATE_TRUNC('DAY',time_stamp) BETWEEN '2022-03-08' AND '2022-03-20') AND amount >= 1000; diff --git a/easy/2230. The Users That Are Eligible for Discount (Easy).md b/easy/2230. The Users That Are Eligible for Discount (Easy).md new file mode 100644 index 0000000..2d7c42e --- /dev/null +++ b/easy/2230. The Users That Are Eligible for Discount (Easy).md @@ -0,0 +1,69 @@ +# Question 2230: The Users That Are Eligible for Discount + +**LeetCode URL:** https://leetcode.com/problems/the-users-that-are-eligible-for-discount/ + +## Description + +Report the IDs of the users who purchased at least one item with an amount of at least 1000 between 2022-03-08 and 2022-03-20 inclusive.
+ +## Table Schema Structure + +```sql +Create table If Not Exists Purchases (user_id int, time_stamp datetime, amount int); +``` + +## Sample Input Data + +```sql +insert into Purchases (user_id, time_stamp, amount) values ('1', '2022-04-20 09:03:00', '4416'); +insert into Purchases (user_id, time_stamp, amount) values ('2', '2022-03-19 19:24:02', '678'); +insert into Purchases (user_id, time_stamp, amount) values ('3', '2022-03-18 12:03:09', '4523'); +insert into Purchases (user_id, time_stamp, amount) values ('3', '2022-03-30 09:43:42', '626'); +``` + +## Expected Output Data + +```text ++---------+ +| user_id | ++---------+ +| 3       | ++---------+ +``` + +## SQL Solution + +```sql +SELECT DISTINCT user_id +FROM purchases_2230 +WHERE (DATE_TRUNC('DAY',time_stamp) BETWEEN '2022-03-08' AND '2022-03-20') AND amount >= 1000; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result column `user_id` from `purchases`. + +### Result Grain + +One row per distinct combination of projected columns. + +### Step-by-Step Logic + +1. Apply row-level filtering in `WHERE`: (DATE_TRUNC('DAY',time_stamp) BETWEEN '2022-03-08' AND '2022-03-20') AND amount >= 1000. +2. Project final output columns: `user_id`. +3. Remove duplicate result tuples using `DISTINCT` where uniqueness is required. + +### Why This Works + +Predicate filtering removes irrelevant rows before expensive downstream computation. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost is scanning source rows once; performance usually scales with input size. + +### Common Pitfalls + +- If nullable columns are involved, validate null-handling explicitly (`IS NULL`, `COALESCE`). + diff --git a/easy/2230. The Users That Are Eligible for Discount (Easy).sql b/easy/2230.
The Users That Are Eligible for Discount (Easy).sql +++ /dev/null @@ -1,3 +0,0 @@ -SELECT DISTINCT user_id -FROM purchases_2230 -WHERE (DATE_TRUNC('DAY',time_stamp) BETWEEN '2022-03-08' AND '2022-03-20') AND amount >= 1000; diff --git a/easy/2329. Product Sales Analysis V (Easy).md b/easy/2329. Product Sales Analysis V (Easy).md new file mode 100644 index 0000000..5c0fa4e --- /dev/null +++ b/easy/2329. Product Sales Analysis V (Easy).md @@ -0,0 +1,78 @@ +# Question 2329: Product Sales Analysis V + +**LeetCode URL:** https://leetcode.com/problems/product-sales-analysis-v/ + +## Description + +Report each user's total spending: the sum of quantity times price over all of their sales. Return the result table ordered by spending in descending order, breaking ties by user_id in ascending order. + +## Table Schema Structure + +```sql +Create table If Not Exists Sales (sale_id int, product_id int, user_id int, quantity int); +Create table If Not Exists Product (product_id int, price int); +``` + +## Sample Input Data + +```sql +insert into Sales (sale_id, product_id, user_id, quantity) values ('1', '1', '101', '10'); +insert into Sales (sale_id, product_id, user_id, quantity) values ('2', '2', '101', '1'); +insert into Sales (sale_id, product_id, user_id, quantity) values ('3', '3', '102', '3'); +insert into Sales (sale_id, product_id, user_id, quantity) values ('4', '3', '102', '2'); +insert into Sales (sale_id, product_id, user_id, quantity) values ('5', '2', '103', '3'); +insert into Product (product_id, price) values ('1', '10'); +insert into Product (product_id, price) values ('2', '25'); +insert into Product (product_id, price) values ('3', '15'); +``` + +## Expected Output Data + +```text ++---------+----------+ +| user_id | spending | ++---------+----------+ +| 101     | 125      | +| 102     | 75       | +| 103     | 75       | ++---------+----------+ +``` + +## SQL Solution + +```sql +SELECT user_id,SUM(s.quantity*p.price) AS spending +FROM sales_2329 s +INNER JOIN product_2329 p ON s.product_id =
p.product_id +GROUP BY user_id +ORDER BY spending DESC,user_id; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `user_id`, `spending` from `sales`, `product`. + +### Result Grain + +One row per unique key in `GROUP BY user_id`. + +### Step-by-Step Logic + +1. Combine datasets using an INNER JOIN on `product_id`. Join predicates control row matching and prevent accidental cartesian growth. +2. Aggregate rows with SUM grouped by user_id. +3. Project final output columns: `user_id`, `spending`. +4. Order output deterministically with `ORDER BY spending DESC,user_id`. + +### Why This Works + +Join conditions align related entities so each output row is built from the correct source records. Grouping keys define the reporting grain; aggregate functions then summarize values at exactly that grain. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are sorting/grouping, join operations. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently. +- Every non-aggregated selected column must belong to the grouping grain. + diff --git a/easy/2329. Product Sales Analysis V (Easy).sql b/easy/2329. Product Sales Analysis V (Easy).sql deleted file mode 100644 index 17d9339..0000000 --- a/easy/2329. Product Sales Analysis V (Easy).sql +++ /dev/null @@ -1,5 +0,0 @@ -SELECT user_id,SUM(s.quantity*p.price) AS spending -FROM sales_2329 s -INNER JOIN product_2329 p ON s.product_id = p.product_id -GROUP BY user_id -ORDER BY spending DESC,user_id; diff --git a/easy/2339. All the Matches of the League (Easy).md b/easy/2339. All the Matches of the League (Easy).md new file mode 100644 index 0000000..e8248ac --- /dev/null +++ b/easy/2339.
All the Matches of the League (Easy).md @@ -0,0 +1,67 @@ +# Question 2339: All the Matches of the League + +**LeetCode URL:** https://leetcode.com/problems/all-the-matches-of-the-league/ + +## Description + +Report all the possible matches of the league: every ordered pair of two different teams, one playing as home_team and the other as away_team. + +## Table Schema Structure + +```sql +Create table If Not Exists Teams (team_name varchar(50)); +``` + +## Sample Input Data + +```sql +insert into Teams (team_name) values ('Leetcode FC'); +insert into Teams (team_name) values ('Ahly SC'); +insert into Teams (team_name) values ('Real Madrid'); +``` + +## Expected Output Data + +```text ++-------------+-------------+ +| home_team   | away_team   | ++-------------+-------------+ +| Leetcode FC | Ahly SC     | +| Leetcode FC | Real Madrid | +| Ahly SC     | Leetcode FC | +| Ahly SC     | Real Madrid | +| Real Madrid | Leetcode FC | +| Real Madrid | Ahly SC     | ++-------------+-------------+ +``` + +## SQL Solution + +```sql +SELECT t1.team_name AS home_team,t2.team_name AS away_team +FROM teams_2339 t1 +INNER JOIN teams_2339 t2 ON t1.team_name <> t2.team_name; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `home_team`, `away_team` from `teams`. + +### Result Grain + +One row per ordered pair of distinct teams. + +### Step-by-Step Logic + +1. Combine the table with itself using a self INNER JOIN with predicate `t1.team_name <> t2.team_name`, producing every ordered pair of distinct teams. Join predicates control row matching and prevent accidental cartesian growth. +2. Project final output columns: `home_team`, `away_team`. + +### Why This Works + +Join conditions align related entities so each output row is built from the correct source records. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are join operations. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently. + diff --git a/easy/2339. All the Matches of the League (Easy).sql b/easy/2339.
All the Matches of the League (Easy).sql +++ /dev/null @@ -1,3 +0,0 @@ -SELECT t1.team_name AS home_team,t2.team_name AS away_team -FROM teams_2339 t1 -INNER JOIN teams_2339 t2 ON t1.team_name <> t2.team_name; diff --git a/easy/2356. Number of Unique Subjects Taught by Each Teacher (Easy).md b/easy/2356. Number of Unique Subjects Taught by Each Teacher (Easy).md new file mode 100644 index 0000000..54ef09b --- /dev/null +++ b/easy/2356. Number of Unique Subjects Taught by Each Teacher (Easy).md @@ -0,0 +1,72 @@ +# Question 2356: Number of Unique Subjects Taught by Each Teacher + +**LeetCode URL:** https://leetcode.com/problems/number-of-unique-subjects-taught-by-each-teacher/ + +## Description + +Write a solution to calculate the number of unique subjects each teacher teaches in the university. Return the result table in any order. The result format is shown in the following example. + +## Table Schema Structure + +```sql +Create table If Not Exists Teacher (teacher_id int, subject_id int, dept_id int); +``` + +## Sample Input Data + +```sql +insert into Teacher (teacher_id, subject_id, dept_id) values ('1', '2', '3'); +insert into Teacher (teacher_id, subject_id, dept_id) values ('1', '2', '4'); +insert into Teacher (teacher_id, subject_id, dept_id) values ('1', '3', '3'); +insert into Teacher (teacher_id, subject_id, dept_id) values ('2', '1', '1'); +insert into Teacher (teacher_id, subject_id, dept_id) values ('2', '2', '1'); +insert into Teacher (teacher_id, subject_id, dept_id) values ('2', '3', '1'); +insert into Teacher (teacher_id, subject_id, dept_id) values ('2', '4', '1'); +``` + +## Expected Output Data + +```text ++------------+-----+ +| teacher_id | cnt | ++------------+-----+ +| 1          | 2   | +| 2          | 4   | ++------------+-----+ +``` + +## SQL Solution + +```sql +SELECT teacher_id,COUNT(DISTINCT subject_id) AS cnt +FROM teacher_2356 +GROUP BY teacher_id; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `teacher_id` and `cnt` from `teacher`.
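`GROUP BY` with `COUNT(DISTINCT ...)` can be emulated by collecting values into per-key sets — an illustrative Python sketch over the sample rows, not repository code:

```python
from collections import defaultdict

# Stand-in for the Teacher table: (teacher_id, subject_id, dept_id).
teacher = [
    (1, 2, 3), (1, 2, 4), (1, 3, 3),
    (2, 1, 1), (2, 2, 1), (2, 3, 1), (2, 4, 1),
]

# GROUP BY teacher_id with COUNT(DISTINCT subject_id): gather subject_ids
# into a set per teacher; the set size is the distinct count.
subjects = defaultdict(set)
for teacher_id, subject_id, _ in teacher:
    subjects[teacher_id].add(subject_id)

cnt = {t: len(s) for t, s in sorted(subjects.items())}
print(cnt)  # {1: 2, 2: 4}
```

Teacher 1 appears three times but teaches subject 2 in two departments, so the set collapses those to a distinct count of 2.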
+ +### Result Grain + +One row per unique key in `GROUP BY teacher_id`. + +### Step-by-Step Logic + +1. Aggregate rows with COUNT(DISTINCT subject_id) grouped by teacher_id. +2. Project final output columns: `teacher_id` and the distinct-subject count `cnt`. + +### Why This Works + +Grouping keys define the reporting grain; aggregate functions then summarize values at exactly that grain. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are sorting/grouping. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Every non-aggregated selected column must belong to the grouping grain. + diff --git a/easy/2356. Number of Unique Subjects Taught by Each Teacher (Easy).sql b/easy/2356. Number of Unique Subjects Taught by Each Teacher (Easy).sql deleted file mode 100644 index 6bb5150..0000000 --- a/easy/2356. Number of Unique Subjects Taught by Each Teacher (Easy).sql +++ /dev/null @@ -1,4 +0,0 @@ -SELECT teacher_id,COUNT(DISTINCT subject_id) -FROM teacher_2356 -GROUP BY teacher_id; - diff --git a/easy/2377. Sort the Olympic Table (Easy).md b/easy/2377. Sort the Olympic Table (Easy).md new file mode 100644 index 0000000..77e422e --- /dev/null +++ b/easy/2377. Sort the Olympic Table (Easy).md @@ -0,0 +1,68 @@ +# Question 2377: Sort the Olympic Table + +**LeetCode URL:** https://leetcode.com/problems/sort-the-olympic-table/ + +## Description + +Write a solution to sort the table by the number of gold medals in descending order. If two or more countries tie in gold medals, sort them by silver medals in descending order; if they still tie, sort by bronze medals in descending order; if the tie persists, sort by country in ascending lexicographical order.
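In the sample data below, China and the USA tie on all three medal counts, so the trailing `country ASC` key is what makes the output deterministic. A throwaway check with Python's built-in sqlite3 (not part of this PostgreSQL repo; table name follows the repo's `_2377` suffix convention):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE olympic_2377 "
    "(country text, gold_medals int, silver_medals int, bronze_medals int)"
)
conn.executemany(
    "INSERT INTO olympic_2377 VALUES (?, ?, ?, ?)",
    [("China", 10, 10, 20), ("South Sudan", 0, 0, 1), ("USA", 10, 10, 20),
     ("Israel", 2, 2, 3), ("Egypt", 2, 2, 2)],
)

# China and USA tie on every medal count; country ASC breaks the tie.
order = [row[0] for row in conn.execute(
    "SELECT * FROM olympic_2377 "
    "ORDER BY gold_medals DESC, silver_medals DESC, bronze_medals DESC, country ASC"
)]
print(order)  # ['China', 'USA', 'Israel', 'Egypt', 'South Sudan']
```

Dropping the `country ASC` key would leave the China/USA pair in an unspecified relative order, which is exactly the kind of nondeterminism a grader rejects.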
+ +## Table Schema Structure + +```sql +Create table If Not Exists Olympic (country varchar(50), gold_medals int, silver_medals int, bronze_medals int); +``` + +## Sample Input Data + +```sql +insert into Olympic (country, gold_medals, silver_medals, bronze_medals) values ('China', '10', '10', '20'); +insert into Olympic (country, gold_medals, silver_medals, bronze_medals) values ('South Sudan', '0', '0', '1'); +insert into Olympic (country, gold_medals, silver_medals, bronze_medals) values ('USA', '10', '10', '20'); +insert into Olympic (country, gold_medals, silver_medals, bronze_medals) values ('Israel', '2', '2', '3'); +insert into Olympic (country, gold_medals, silver_medals, bronze_medals) values ('Egypt', '2', '2', '2'); +``` + +## Expected Output Data + +```text ++-------------+-------------+---------------+---------------+ +| country | gold_medals | silver_medals | bronze_medals | ++-------------+-------------+---------------+---------------+ +| China | 10 | 10 | 20 | +| USA | 10 | 10 | 20 | +| Israel | 2 | 2 | 3 | +| Egypt | 2 | 2 | 2 | +| South Sudan | 0 | 0 | 1 | ++-------------+-------------+---------------+---------------+ +``` + +## SQL Solution + +```sql +SELECT * +FROM olympic_2377 +ORDER BY gold_medals DESC,silver_medals DESC,bronze_medals DESC,country ASC; +``` + +## Solution Breakdown + +### Goal + +The query returns every column of `olympic_2377`, reordered according to the medal ranking. + +### Result Grain + +One row per country, unchanged from the source table. + +### Step-by-Step Logic + +1. Order output deterministically with `ORDER BY gold_medals DESC,silver_medals DESC,bronze_medals DESC,country ASC`. + +### Why This Works + +The query maps input columns directly to the requested output shape with minimal transformation. + +### Performance Notes + +Primary cost drivers are sorting/grouping. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Check edge cases: empty input, single-row input, and duplicate keys. + diff --git a/easy/2377. Sort the Olympic Table (Easy).sql b/easy/2377. Sort the Olympic Table (Easy).sql deleted file mode 100644 index cbb5c90..0000000 --- a/easy/2377.
Sort the Olympic Table (Easy).sql +++ /dev/null @@ -1,3 +0,0 @@ -SELECT * -FROM olympic_2377 -ORDER BY gold_medals DESC,silver_medals DESC,bronze_medals DESC,country ASC; diff --git a/easy/511. Game Play Analysis I.md b/easy/511. Game Play Analysis I.md new file mode 100644 index 0000000..12b14b5 --- /dev/null +++ b/easy/511. Game Play Analysis I.md @@ -0,0 +1,88 @@ +# Question 511: Game Play Analysis I + +**LeetCode URL:** https://leetcode.com/problems/game-play-analysis-i/ + +## Description + +Write a solution to report the first login date for each player. Return the result table in any order. The result format is in the following example.
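The task reduces to `MIN(event_date)` per player; since the dates are in ISO `YYYY-MM-DD` form they compare correctly even as text. A quick local check with Python's built-in sqlite3 (not part of this PostgreSQL repo; table name follows the repo's `_511` suffix convention):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE activity_511 "
    "(player_id int, device_id int, event_date text, games_played int)"
)
conn.executemany(
    "INSERT INTO activity_511 VALUES (?, ?, ?, ?)",
    [(1, 2, "2016-03-01", 5), (1, 2, "2016-05-02", 6), (2, 3, "2017-06-25", 1),
     (3, 1, "2016-03-02", 0), (3, 4, "2018-07-03", 5)],
)

# ISO-8601 dates sort lexicographically, so MIN() picks the earliest login.
first_logins = conn.execute(
    "SELECT player_id, MIN(event_date) AS first_login "
    "FROM activity_511 GROUP BY player_id ORDER BY player_id"
).fetchall()
print(first_logins)  # [(1, '2016-03-01'), (2, '2017-06-25'), (3, '2016-03-02')]
```

The `GROUP BY` approach does one pass; the window-function variant shown below gives the same rows while keeping access to the other columns, which is what question 512 needs.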
+ +## Table Schema Structure + +```sql +Create table If Not Exists Activity (player_id int, device_id int, event_date date, games_played int); +``` + +## Sample Input Data + +```sql +insert into Activity (player_id, device_id, event_date, games_played) values ('1', '2', '2016-03-01', '5'); +insert into Activity (player_id, device_id, event_date, games_played) values ('1', '2', '2016-05-02', '6'); +insert into Activity (player_id, device_id, event_date, games_played) values ('2', '3', '2017-06-25', '1'); +insert into Activity (player_id, device_id, event_date, games_played) values ('3', '1', '2016-03-02', '0'); +insert into Activity (player_id, device_id, event_date, games_played) values ('3', '4', '2018-07-03', '5'); +``` + +## Expected Output Data + +```text ++-----------+-------------+ +| player_id | first_login | ++-----------+-------------+ +| 1 | 2016-03-01 | +| 2 | 2017-06-25 | +| 3 | 2016-03-02 | ++-----------+-------------+ +``` + +## SQL Solution + +```sql +SELECT player_id,MIN(event_date) AS first_login +FROM activity_511 +GROUP BY player_id +ORDER BY player_id; + +(OR) + +WITH ranked AS( + SELECT *,DENSE_RANK() OVER w AS rnk + FROM activity_511 + WINDOW w AS (PARTITION BY player_id ORDER BY event_date) +) + +SELECT player_id,event_date AS first_login +FROM ranked +WHERE rnk = 1 +ORDER BY player_id; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `player_id` and `first_login` from `activity_511`, either directly or via the `ranked` CTE. + +### Result Grain + +One row per player. + +### Step-by-Step Logic + +1. Create CTE layers (`ranked`) to decompose the logic into smaller, testable steps before the final SELECT. +2. CTE `ranked`: reads `activity_511`. +3. Apply row-level filtering in `WHERE`: rnk = 1. +4. Project final output columns: `player_id`, `first_login`. +5. Order output deterministically with `ORDER BY player_id`. + +### Why This Works + +CTEs separate transformation stages, which makes dependencies explicit and easier to validate.
Predicate filtering removes irrelevant rows before expensive downstream computation. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are sorting/grouping. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- If nullable columns are involved, validate null-handling explicitly (`IS NULL`, `COALESCE`). + diff --git a/easy/511. Game Play Analysis I.sql b/easy/511. Game Play Analysis I.sql deleted file mode 100644 index 247a0b7..0000000 --- a/easy/511. Game Play Analysis I.sql +++ /dev/null @@ -1,17 +0,0 @@ -SELECT player_id,MIN(event_date) AS first_login -FROM activity_511 -GROUP BY player_id -ORDER BY player_id; - -(OR) - -WITH ranked AS( - SELECT *,DENSE_RANK() OVER w AS rnk - FROM activity_511 - WINDOW w AS (PARTITION BY player_id ORDER BY event_date) -) - -SELECT player_id,event_date -FROM ranked -WHERE rnk = 1 -ORDER BY player_id; diff --git a/easy/512. Game Play Analysis II.md b/easy/512. Game Play Analysis II.md new file mode 100644 index 0000000..59997b8 --- /dev/null +++ b/easy/512. 
Game Play Analysis II.md @@ -0,0 +1,94 @@ +# Question 512: Game Play Analysis II + +**LeetCode URL:** https://leetcode.com/problems/game-play-analysis-ii/ + +## Description + +Write a solution to report the device that is first logged in for each player. Return the result table in any order. The result format is in the following example. + +## Table Schema Structure + +```sql +Create table If Not Exists Activity (player_id int, device_id int, event_date date, games_played int); +``` + +## Sample Input Data + +```sql +insert into Activity (player_id, device_id, event_date, games_played) values ('1', '2', '2016-03-01', '5'); +insert into Activity (player_id, device_id, event_date, games_played) values ('1', '2', '2016-05-02', '6'); +insert into Activity (player_id, device_id, event_date, games_played) values ('2', '3', '2017-06-25', '1'); +insert into Activity (player_id, device_id, event_date, games_played) values ('3', '1', '2016-03-02', '0'); +insert into Activity (player_id, device_id, event_date, games_played) values ('3', '4', '2018-07-03', '5'); +``` + +## Expected Output Data + +```text ++-----------+-----------+ +| player_id | device_id | ++-----------+-----------+ +| 1 | 2 | +| 2 | 3 | +| 3 | 1 | ++-----------+-----------+ +``` + +## SQL Solution + +```sql +WITH ranked AS( + SELECT *,DENSE_RANK() OVER w AS rnk + FROM activity_511 +
WINDOW w AS (PARTITION BY player_id ORDER BY event_date) +) + +SELECT player_id,device_id +FROM ranked +WHERE rnk = 1 +ORDER BY player_id; + + +(OR) + + +WITH cte AS( + SELECT player_id,MIN(event_date) AS first_login + FROM activity_511 + GROUP BY player_id +) + +SELECT player_id,device_id +FROM activity_511 +WHERE (player_id,event_date) IN (SELECT * FROM cte); +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `player_id`, `device_id` from `activity`, `ranked`, `cte`. + +### Result Grain + +Row grain follows the post-filter join output. + +### Step-by-Step Logic + +1. Create CTE layers (`ranked`, `cte`) to decompose the logic into smaller, testable steps before the final SELECT. +2. CTE `ranked`: reads `activity`. +3. Apply row-level filtering in `WHERE`: (player_id,event_date) IN (SELECT * FROM cte). +4. Project final output columns: `player_id`, `device_id`. + +### Why This Works + +CTEs separate transformation stages, which makes dependencies explicit and easier to validate. Predicate filtering removes irrelevant rows before expensive downstream computation. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are subquery execution. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- If nullable columns are involved, validate null-handling explicitly (`IS NULL`, `COALESCE`). + diff --git a/easy/512. Game Play Analysis II.sql b/easy/512. Game Play Analysis II.sql deleted file mode 100644 index bd51641..0000000 --- a/easy/512. 
Game Play Analysis II.sql +++ /dev/null @@ -1,24 +0,0 @@ -WITH ranked AS( - SELECT *,DENSE_RANK() OVER w AS rnk - FROM activity_511 - WINDOW w AS (PARTITION BY player_id ORDER BY event_date) -) - -SELECT player_id,device_id -FROM ranked -WHERE rnk = 1 -ORDER BY player_id; - - -(OR) - - -WITH cte AS( - SELECT player_id,MIN(event_date) AS first_login - FROM activity_511 - GROUP BY player_id -) - -SELECT player_id,device_id -FROM activity_511 -WHERE (player_id,event_date) IN (SELECT * FROM cte); diff --git a/easy/577. Employee Bonus.md b/easy/577. Employee Bonus.md new file mode 100644 index 0000000..df51f6b --- /dev/null +++ b/easy/577. Employee Bonus.md @@ -0,0 +1,71 @@ +# Question 577: Employee Bonus + +**LeetCode URL:** https://leetcode.com/problems/employee-bonus/ + +## Description + +Report the name and bonus of every employee whose bonus is less than 1000, including employees who have no bonus record. In the Employee table, empId is the primary key. + +## Table Schema Structure + +```sql +Create table If Not Exists Employee (empId int, name varchar(255), supervisor int, salary int); +Create table If Not Exists Bonus (empId int, bonus int); +``` + +## Sample Input Data + +```sql +insert into Employee (empId, name, supervisor, salary) values ('3', 'Brad', NULL, '4000'); +insert into Employee (empId, name, supervisor, salary) values ('1', 'John', '3', '1000'); +insert into Employee (empId, name, supervisor, salary) values ('2', 'Dan', '3', '2000'); +insert into Employee (empId, name, supervisor, salary) values ('4', 'Thomas', '3', '4000'); +insert into Bonus (empId, bonus) values ('2', '500'); +insert into Bonus (empId, bonus) values ('4', '2000'); +``` + +## Expected Output Data + +```text ++------+-------+ +| name | bonus | ++------+-------+ +| Brad | null | +| John | null | +| Dan | 500 | ++------+-------+ +``` + +## SQL Solution + +```sql +SELECT name,bonus +FROM employee_577 e +LEFT JOIN bonus_577 b ON e.empId = b.empId +WHERE b.bonus < 1000 OR b.bonus IS NULL; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns
`name`, `bonus` from `employee`, `bonus`. + +### Result Grain + +Row grain follows the post-filter join output. + +### Step-by-Step Logic + +1. Combine datasets using JOIN. Join predicates control row matching and prevent accidental cartesian growth. +2. Project final output columns: `name`, `bonus`. + +### Why This Works + +Join conditions align related entities so each output row is built from the correct source records. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are join operations. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently. + diff --git a/easy/577. Employee Bonus.sql b/easy/577. Employee Bonus.sql deleted file mode 100644 index 0ada1bf..0000000 --- a/easy/577. Employee Bonus.sql +++ /dev/null @@ -1,3 +0,0 @@ -SELECT name,bonus -FROM employee_577 e -JOIN bonus_577 b ON e.empId = b.empId AND b.bonus < 1000; diff --git a/easy/584. Find Customer Referee.md b/easy/584. Find Customer Referee.md new file mode 100644 index 0000000..04b747e --- /dev/null +++ b/easy/584. Find Customer Referee.md @@ -0,0 +1,71 @@ +# Question 584: Find Customer Referee + +**LeetCode URL:** https://leetcode.com/problems/find-customer-referee/ + +## Description + +return the list of customers NOT referred by the person with id '2'. 
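The catch is SQL's three-valued logic: for customers whose `referee_id` is NULL, the predicate `referee_id <> 2` evaluates to UNKNOWN, and the row is silently dropped, so the NULL branch has to be written out explicitly. A throwaway demonstration with Python's built-in sqlite3 (not part of this PostgreSQL repo; table name follows the repo's `_584` suffix convention):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer_584 (id int, name text, referee_id int)")
conn.executemany(
    "INSERT INTO customer_584 VALUES (?, ?, ?)",
    [(1, "Will", None), (2, "Jane", None), (3, "Alex", 2),
     (4, "Bill", None), (5, "Zack", 1), (6, "Mark", 2)],
)

# NULL <> 2 is UNKNOWN, not TRUE, so the naive predicate loses every NULL row.
naive = conn.execute(
    "SELECT name FROM customer_584 WHERE referee_id <> 2"
).fetchall()
# The explicit IS NULL branch keeps customers with no referee at all.
correct = conn.execute(
    "SELECT name FROM customer_584 WHERE referee_id <> 2 OR referee_id IS NULL"
).fetchall()
print(len(naive), len(correct))  # 1 4
```

The naive predicate returns only Zack; the NULL-aware one also keeps Will, Jane, and Bill, which is the answer the problem expects.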
+ +## Table Schema Structure + +```sql +Create table If Not Exists Customer (id int, name varchar(25), referee_id int); +``` + +## Sample Input Data + +```sql +insert into Customer (id, name, referee_id) values ('1', 'Will', NULL); +insert into Customer (id, name, referee_id) values ('2', 'Jane', NULL); +insert into Customer (id, name, referee_id) values ('3', 'Alex', '2'); +insert into Customer (id, name, referee_id) values ('4', 'Bill', NULL); +insert into Customer (id, name, referee_id) values ('5', 'Zack', '1'); +insert into Customer (id, name, referee_id) values ('6', 'Mark', '2'); +``` + +## Expected Output Data + +```text ++----+------+------------+ +| id | name | referee_id | ++----+------+------------+ +| 1 | Will | null | +| 2 | Jane | null | +| 4 | Bill | null | +| 5 | Zack | 1 | ++----+------+------------+ +``` + +## SQL Solution + +```sql +SELECT * FROM customer_584 WHERE referee_id <> 2 OR referee_id IS NULL; + +SELECT * FROM customer_584 +EXCEPT +SELECT * FROM customer_584 WHERE referee_id = 2; +``` + +## Solution Breakdown + +### Goal + +The query returns every qualifying row from `customer_584`. + +### Result Grain + +One row per customer not referred by the customer with id 2. + +### Step-by-Step Logic + +1. Apply row-level filtering in `WHERE`: referee_id <> 2 OR referee_id IS NULL (the EXCEPT variant subtracts the rows where referee_id = 2). + +### Why This Works + +Predicate filtering removes irrelevant rows before expensive downstream computation. Because `referee_id <> 2` evaluates to UNKNOWN for NULL referees, the explicit IS NULL branch is required to keep customers who have no referee. + +### Performance Notes + +Primary cost is scanning source rows once; performance usually scales with input size. + +### Common Pitfalls + +- If nullable columns are involved, validate null-handling explicitly (`IS NULL`, `COALESCE`). + diff --git a/easy/584. Find Customer Referee.sql b/easy/584. Find Customer Referee.sql deleted file mode 100644 index 8dc48b9..0000000 --- a/easy/584. Find Customer Referee.sql +++ /dev/null @@ -1,5 +0,0 @@ -SELECT * FROM customer_584 WHERE reference_id <> 21 OR reference_id IS NULL; - -SELECT * FROM customer_584 -EXCEPT -SELECT * FROM customer_584 WHERE reference_id = 2; diff --git a/easy/586.
Customer Placing the Largest Number of Orders.md b/easy/586. Customer Placing the Largest Number of Orders.md new file mode 100644 index 0000000..1be4b54 --- /dev/null +++ b/easy/586. Customer Placing the Largest Number of Orders.md @@ -0,0 +1,88 @@ +# Question 586: Customer Placing the Largest Number of Orders + +**LeetCode URL:** https://leetcode.com/problems/customer-placing-the-largest-number-of-orders/ + +## Description + +Write a solution to find the customer_number for the customer who has placed the largest number of orders. The result format is in the following example. + +## Table Schema Structure + +```sql +Create table If Not Exists orders (order_number int, customer_number int); +``` + +## Sample Input Data + +```sql +insert into orders (order_number, customer_number) values ('1', '1'); +insert into orders (order_number, customer_number) values ('2', '2'); +insert into orders (order_number, customer_number) values ('3', '3'); +insert into orders (order_number, customer_number) values ('4', '3'); +``` + +## Expected Output Data + +```text ++-----------------+ +| customer_number | ++-----------------+ +| 3 | ++-----------------+ +``` + +## SQL Solution + +```sql +SELECT customer_number +FROM orders_586 +GROUP BY customer_number +ORDER BY COUNT(order_number) DESC +LIMIT 1; + + +--ANSWER OF EXTRA QUESTION: + +WITH cte AS ( +SELECT COUNT(order_number) AS count +FROM orders_586 +GROUP BY customer_number +ORDER BY count DESC +LIMIT 1) + +SELECT customer_number +FROM orders_586 +GROUP BY customer_number +HAVING COUNT(order_number) IN (SELECT count FROM cte); +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `customer_number` from `orders`, `cte`. + +### Result Grain + +One row per unique key in `GROUP BY customer_number`. + +### Step-by-Step Logic + +1. Create CTE layers (`cte`) to decompose the logic into smaller, testable steps before the final SELECT. +2. CTE `cte`: reads `orders`. +3. 
Aggregate rows with COUNT grouped by customer_number. +4. Project final output columns: `customer_number`. +5. Filter aggregated groups in `HAVING`: COUNT(order_number) IN (SELECT count FROM cte). + +### Why This Works + +CTEs separate transformation stages, which makes dependencies explicit and easier to validate. Grouping keys define the reporting grain; aggregate functions then summarize values at exactly that grain. `HAVING` ensures only groups that satisfy business thresholds survive. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are sorting/grouping, subquery execution. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Every non-aggregated selected column must belong to the grouping grain. + diff --git a/easy/586. Customer Placing the Largest Number of Orders.sql b/easy/586. Customer Placing the Largest Number of Orders.sql deleted file mode 100644 index d1183dc..0000000 --- a/easy/586. Customer Placing the Largest Number of Orders.sql +++ /dev/null @@ -1,20 +0,0 @@ -SELECT customer_number -FROM orders_586 -GROUP BY customer_number -ORDER BY COUNT(order_number) DESC -LIMIT 1; - - ---ANSWER OF EXTRA QUESTION: - -WITH cte AS ( -SELECT COUNT(order_number) AS count -FROM orders_586 -GROUP BY customer_number -ORDER BY count DESC -LIMIT 1) - -SELECT customer_number -FROM orders_586 -GROUP BY customer_number -HAVING COUNT(order_number) IN (SELECT count FROM cte); diff --git a/easy/595. Big Countries.md b/easy/595. Big Countries.md new file mode 100644 index 0000000..d8910bb --- /dev/null +++ b/easy/595. Big Countries.md @@ -0,0 +1,70 @@ +# Question 595: Big Countries + +**LeetCode URL:** https://leetcode.com/problems/big-countries/ + +## Description + +Write a solution to find the name, population, and area of the big countries. A country is big if it has an area of at least 3,000,000 km² or a population of at least 25,000,000. Return the result table in any order. The result format is in the following example.
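Because LeetCode phrases the thresholds as "at least", the boundary values matter: `>=` and `>` disagree exactly when a country sits on 3,000,000 km² or 25,000,000 people. A quick sqlite3 demonstration (not part of this PostgreSQL repo; the "Edgeland" boundary row is made up for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE world_595 "
    "(name text, continent text, area int, population int, gdp int)"
)
conn.executemany(
    "INSERT INTO world_595 VALUES (?, ?, ?, ?, ?)",
    [("Afghanistan", "Asia", 652230, 25500100, 20343000000),
     ("Andorra", "Europe", 468, 78115, 3712000000),
     # Hypothetical country sitting exactly on the 3,000,000 km2 threshold.
     ("Edgeland", "Nowhere", 3000000, 1000, 0)],
)

strict = conn.execute(
    "SELECT name FROM world_595 "
    "WHERE area > 3000000 OR population > 25000000 ORDER BY name"
).fetchall()
inclusive = conn.execute(
    "SELECT name FROM world_595 "
    "WHERE area >= 3000000 OR population >= 25000000 ORDER BY name"
).fetchall()
print(strict)     # [('Afghanistan',)]
print(inclusive)  # [('Afghanistan',), ('Edgeland',)]
```

The sample data happens to contain no boundary rows, so both predicates pass the example, but only the inclusive form matches the "at least" wording.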
+ +## Table Schema Structure + +```sql +Create table If Not Exists World (name varchar(255), continent varchar(255), area int, population int, gdp bigint); +``` + +## Sample Input Data + +```sql +insert into World (name, continent, area, population, gdp) values ('Afghanistan', 'Asia', '652230', '25500100', '20343000000'); +insert into World (name, continent, area, population, gdp) values ('Albania', 'Europe', '28748', '2831741', '12960000000'); +insert into World (name, continent, area, population, gdp) values ('Algeria', 'Africa', '2381741', '37100000', '188681000000'); +insert into World (name, continent, area, population, gdp) values ('Andorra', 'Europe', '468', '78115', '3712000000'); +insert into World (name, continent, area, population, gdp) values ('Angola', 'Africa', '1246700', '20609294', '100990000000'); +``` + +## Expected Output Data + +```text ++--------------+-------------+--------------+ +| name | population | area | ++--------------+-------------+--------------+ +| Afghanistan | 25500100 | 652230 | +| Algeria | 37100000 | 2381741 | ++--------------+-------------+--------------+ +``` + +## SQL Solution + +```sql +SELECT name,population,area +FROM world_595 +WHERE area >= 3000000 OR population >= 25000000; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `name`, `population`, `area` from `world_595`. + +### Result Grain + +One row per qualifying country. + +### Step-by-Step Logic + +1. Apply row-level filtering in `WHERE`: area >= 3000000 OR population >= 25000000. +2. Project final output columns: `name`, `population`, `area`. + +### Why This Works + +Predicate filtering removes irrelevant rows before expensive downstream computation. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost is scanning source rows once; performance usually scales with input size.
+ +### Common Pitfalls + +- If nullable columns are involved, validate null-handling explicitly (`IS NULL`, `COALESCE`). + diff --git a/easy/595. Big Countries.sql b/easy/595. Big Countries.sql deleted file mode 100644 index 2018b80..0000000 --- a/easy/595. Big Countries.sql +++ /dev/null @@ -1,3 +0,0 @@ -SELECT name,population,area -FROM world_595 -WHERE area > 3000000 OR population > 25000000; diff --git a/easy/596. Classes More Than 5 Students.md b/easy/596. Classes More Than 5 Students.md new file mode 100644 index 0000000..111b1af --- /dev/null +++ b/easy/596. Classes More Than 5 Students.md @@ -0,0 +1,75 @@ +# Question 596: Classes With at Least 5 Students + +**LeetCode URL:** https://leetcode.com/problems/classes-with-at-least-5-students/ + +## Description + +Write a solution to find all the classes that have at least five students. Return the result table in any order. The result format is in the following example. + +## Table Schema Structure + +```sql +Create table If Not Exists Courses (student varchar(255), class varchar(255)); +``` + +## Sample Input Data + +```sql +insert into Courses (student, class) values ('A', 'Math'); +insert into Courses (student, class) values ('B', 'English'); +insert into Courses (student, class) values ('C', 'Math'); +insert into Courses (student, class) values ('D', 'Biology'); +insert into Courses (student, class) values ('E', 'Math'); +insert into Courses (student, class) values ('F', 'Computer'); +insert into Courses (student, class) values ('G', 'Math'); +insert into Courses (student, class) values ('H', 'Math'); +insert into Courses (student, class) values ('I', 'Math'); +``` + +## Expected Output Data + +```text ++---------+ +| class | ++---------+ +| Math | ++---------+ +``` + +## SQL Solution + +```sql +SELECT class +FROM courses_596 +GROUP BY class +HAVING COUNT(DISTINCT student) >=5; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `class` from `courses`. 
+ +### Result Grain + +One row per unique key in `GROUP BY class`. + +### Step-by-Step Logic + +1. Aggregate rows with COUNT grouped by class. +2. Project final output columns: `class`. +3. Filter aggregated groups in `HAVING`: COUNT(DISTINCT student) >=5. + +### Why This Works + +Grouping keys define the reporting grain; aggregate functions then summarize values at exactly that grain. `HAVING` ensures only groups that satisfy business thresholds survive. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are sorting/grouping. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Every non-aggregated selected column must belong to the grouping grain. + diff --git a/easy/596. Classes More Than 5 Students.sql b/easy/596. Classes More Than 5 Students.sql deleted file mode 100644 index 95a5b81..0000000 --- a/easy/596. Classes More Than 5 Students.sql +++ /dev/null @@ -1,4 +0,0 @@ -SELECT class -FROM courses_596 -GROUP BY class -HAVING COUNT(DISTINCT student) >=5; diff --git a/easy/597. Friend Requests I: Overall Acceptance Rate.md b/easy/597. Friend Requests I: Overall Acceptance Rate.md new file mode 100644 index 0000000..2489f75 --- /dev/null +++ b/easy/597. Friend Requests I: Overall Acceptance Rate.md @@ -0,0 +1,75 @@ +# Question 597: Friend Requests I: Overall Acceptance Rate + +**LeetCode URL:** https://leetcode.com/problems/friend-requests-i-overall-acceptance-rate/ + +## Description + +Write a solution to find the overall acceptance rate of requests: the number of distinct acceptances divided by the number of distinct requests, rounded to two decimal places. Duplicate requests and duplicate acceptances are counted only once.
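Two traps here: duplicate pairs must be collapsed before counting, and integer division would truncate the ratio to 0; the repo's PostgreSQL solution casts with `::NUMERIC` for the second reason. A sqlite3 sketch of the same distinct-pair counting (not part of the repo; SQLite needs `* 1.0` in place of the cast and `||` in place of `CONCAT`):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE friend_request_597 (sender_id int, send_to_id int, request_date text);
CREATE TABLE request_accepted_597 (requester_id int, accepter_id int, accept_date text);
""")
conn.executemany("INSERT INTO friend_request_597 VALUES (?, ?, ?)",
                 [(1, 2, "2016-06-01"), (1, 3, "2016-06-01"), (1, 4, "2016-06-01"),
                  (2, 3, "2016-06-02"), (3, 4, "2016-06-09")])
conn.executemany("INSERT INTO request_accepted_597 VALUES (?, ?, ?)",
                 [(1, 2, "2016-06-03"), (1, 3, "2016-06-08"), (2, 3, "2016-06-08"),
                  (3, 4, "2016-06-09"), (3, 4, "2016-06-10")])

# The duplicate acceptance (3, 4) counts once; * 1.0 forces floating-point
# division (PostgreSQL would use a ::NUMERIC cast instead).
rate = conn.execute("""
SELECT ROUND(
  (SELECT COUNT(DISTINCT requester_id || ' ' || accepter_id)
   FROM request_accepted_597) * 1.0 /
  (SELECT COUNT(DISTINCT sender_id || ' ' || send_to_id)
   FROM friend_request_597), 2)
""").fetchone()[0]
print(rate)  # 0.8
```

Four distinct accepted pairs over five distinct requests gives 0.80; without the float coercion the division would yield 0.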
+ +## Table Schema Structure + +```sql +Create table If Not Exists FriendRequest (sender_id int, send_to_id int, request_date date); +Create table If Not Exists RequestAccepted (requester_id int, accepter_id int, accept_date date); +``` + +## Sample Input Data + +```sql +insert into FriendRequest (sender_id, send_to_id, request_date) values ('1', '2', '2016/06/01'); +insert into FriendRequest (sender_id, send_to_id, request_date) values ('1', '3', '2016/06/01'); +insert into FriendRequest (sender_id, send_to_id, request_date) values ('1', '4', '2016/06/01'); +insert into FriendRequest (sender_id, send_to_id, request_date) values ('2', '3', '2016/06/02'); +insert into FriendRequest (sender_id, send_to_id, request_date) values ('3', '4', '2016/06/09'); +insert into RequestAccepted (requester_id, accepter_id, accept_date) values ('1', '2', '2016/06/03'); +insert into RequestAccepted (requester_id, accepter_id, accept_date) values ('1', '3', '2016/06/08'); +insert into RequestAccepted (requester_id, accepter_id, accept_date) values ('2', '3', '2016/06/08'); +insert into RequestAccepted (requester_id, accepter_id, accept_date) values ('3', '4', '2016/06/09'); +insert into RequestAccepted (requester_id, accepter_id, accept_date) values ('3', '4', '2016/06/10'); +``` + +## Expected Output Data + +```text ++-------------+ +| accept_rate | ++-------------+ +| 0.80 | ++-------------+ +``` + +## SQL Solution + +```sql +SELECT ROUND( + COUNT(DISTINCT CASE WHEN a.requester_id IS NOT NULL AND a.accepter_id IS NOT NULL + THEN CONCAT(a.requester_id,' ',a.accepter_id) END)::NUMERIC + / COUNT(DISTINCT CASE WHEN r.sender_id IS NOT NULL AND r.send_to_id IS NOT NULL + THEN CONCAT(r.sender_id,' ',r.send_to_id) END) + ,2) AS accept_rate +FROM friend_request_597 r +LEFT JOIN request_accepted_597 a ON r.sender_id = a.requester_id AND r.send_to_id = a.accepter_id; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `accept_rate` from `friend_request`,
`request_accepted`. + +### Result Grain + +A single summary row over the joined request/acceptance pairs. + +### Step-by-Step Logic + +1. Combine datasets using LEFT JOIN. Join predicates control row matching and prevent accidental cartesian growth. +2. Project final output columns: `accept_rate`. + +### Why This Works + +Join conditions align related entities so each output row is built from the correct source records. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are join operations. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently. + diff --git a/easy/597. Friend Requests I: Overall Acceptance Rate.sql b/easy/597. Friend Requests I: Overall Acceptance Rate.sql deleted file mode 100644 index 622f8fd..0000000 --- a/easy/597. Friend Requests I: Overall Acceptance Rate.sql +++ /dev/null @@ -1,3 +0,0 @@ -SELECT ROUND(COUNT(DISTINCT CASE WHEN a.requester_id IS NOT NULL AND a.accepter_id IS NOT NULL THEN CONCAT(a.requester_id,' ',a.accepter_id) END)::NUMERIC/COUNT(DISTINCT CASE WHEN r.sender_id IS NOT NULL AND r.send_to_id IS NOT NULL THEN CONCAT(r.sender_id,' ',r.send_to_id) END ),2) AS accept_rate -FROM friend_request_597 r -LEFT JOIN request_accepted_597 a ON r.sender_id = a.requester_id AND r.send_to_id = a.accepter_id; diff --git a/easy/603. Consecutive Available Seats.md b/easy/603. Consecutive Available Seats.md new file mode 100644 index 0000000..4fd0dc0 --- /dev/null +++ b/easy/603. Consecutive Available Seats.md @@ -0,0 +1,106 @@ +# Question 603: Consecutive Available Seats + +**LeetCode URL:** https://leetcode.com/problems/consecutive-available-seats/ + +## Description + +Write a solution to report all the consecutive available seats in the cinema. Return the result table ordered by seat_id in ascending order.
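The third variant in the solution below is the classic gaps-and-islands trick: `seat_id - ROW_NUMBER()` is constant within each run of consecutive free seats, so grouping on the difference isolates the runs. A sqlite3 sketch of that idea (not part of this PostgreSQL repo; table name follows the repo's `_603` suffix convention):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cinema_603 (seat_id int, free int)")
conn.executemany("INSERT INTO cinema_603 VALUES (?, ?)",
                 [(1, 1), (2, 0), (3, 1), (4, 1), (5, 1)])

# seat_id - ROW_NUMBER() stays constant inside each consecutive run of free
# seats, so partitioning on the difference isolates the runs ("islands").
seats = [row[0] for row in conn.execute("""
WITH ranked AS (
  SELECT seat_id, seat_id - ROW_NUMBER() OVER (ORDER BY seat_id) AS diff
  FROM cinema_603 WHERE free = 1
),
islands AS (
  SELECT seat_id, COUNT(*) OVER (PARTITION BY diff) AS run_len FROM ranked
)
SELECT seat_id FROM islands WHERE run_len >= 2 ORDER BY seat_id
""")]
print(seats)  # [3, 4, 5]
```

Seat 1 is free but forms a run of length one, so it is excluded; seats 3-5 form an island of length three and all survive the `run_len >= 2` filter.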
+ +## Table Schema Structure + +```sql +Create table If Not Exists Cinema (seat_id int primary key auto_increment, free bool); +``` + +## Sample Input Data + +```sql +insert into Cinema (seat_id, free) values ('1', '1'); +insert into Cinema (seat_id, free) values ('2', '0'); +insert into Cinema (seat_id, free) values ('3', '1'); +insert into Cinema (seat_id, free) values ('4', '1'); +insert into Cinema (seat_id, free) values ('5', '1'); +``` + +## Expected Output Data + +```text ++---------+ +| seat_id | ++---------+ +| 3 | +| 4 | +| 5 | ++---------+ +``` + +## SQL Solution + +```sql +SELECT DISTINCT c1.seat_id +FROM cinema_603 c1 +INNER JOIN cinema_603 c2 ON ABS(c1.seat_id-c2.seat_id)=1 +WHERE c1.free AND c2.free; + +------------------------------------------------------------------------------------------------------------------------------- + +WITH cte AS( + SELECT seat_id,free, + LEAD(free) OVER(ORDER BY seat_id) as next_seat, + LAG(free) OVER(ORDER BY seat_id) as prev_seat + FROM cinema_603 +) + +SELECT DISTINCT seat_id +FROM cte +WHERE (free=1 AND next_seat=1) OR (free=1 AND prev_seat=1); + +------------------------------------------------------------------------------------------------------------------------------- + +WITH ranked AS ( + SELECT *, + seat_id-ROW_NUMBER () OVER (ORDER BY seat_id) AS diff + FROM cinema_603 + WHERE free = 1 +), +consecutive_free_seats AS ( + SELECT *, + COUNT(seat_id) OVER (PARTITION BY diff) AS cnt + FROM ranked +) +SELECT seat_id +FROM consecutive_free_seats +WHERE cnt >= 2; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `seat_id` from `cinema_603`, `cte`, `ranked`, `consecutive_free_seats`. + +### Result Grain + +One row per free seat that belongs to a run of at least two consecutive free seats. + +### Step-by-Step Logic + +1. Create CTE layers (`cte`, `ranked`, `consecutive_free_seats`) to decompose the logic into smaller, testable steps before the final SELECT. +2. CTE `cte`: reads `cinema_603`. +3. Combine datasets using INNER JOIN.
Join predicates control row matching and prevent accidental cartesian growth. +4. Apply row-level filtering in `WHERE`: cnt >= 2. +5. Use window functions (`... OVER (...)`) to compute rankings/running metrics while preserving row-level detail. +6. Project final output columns: `seat_id`. + +### Why This Works + +CTEs separate transformation stages, which makes dependencies explicit and easier to validate. Join conditions align related entities so each output row is built from the correct source records. Window expressions calculate comparative metrics without collapsing rows too early. Predicate filtering removes irrelevant rows before expensive downstream computation. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are window partitions, join operations. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently. +- Window `PARTITION BY`/`ORDER BY` choices change semantics; test with tied values and nulls. +- If nullable columns are involved, validate null-handling explicitly (`IS NULL`, `COALESCE`). + diff --git a/easy/603. Consecutive Available Seats.sql b/easy/603. Consecutive Available Seats.sql deleted file mode 100644 index df4588b..0000000 --- a/easy/603. 
Consecutive Available Seats.sql +++ /dev/null @@ -1,34 +0,0 @@ -SELECT DISTINCT c1.seat_id -FROM cinema_603 c1 -INNER JOIN cinema_603 c2 ON ABS(c1.seat_id-c2.seat_id)=1 -WHERE c1.free AND c2.free; - -------------------------------------------------------------------------------------------------------------------------------- - -WITH cte AS( - SELECT seat_id,free, - LEAD(free) OVER() as next_seat, - LAG(free) OVER() as prev_seat - FROM cinema_603 -) - -SELECT DISTINCT seat_id -FROM cte -WHERE (free=1 AND next_seat=1) OR (free=1 AND prev_seat=1); - -------------------------------------------------------------------------------------------------------------------------------- - -WITH ranked AS ( - SELECT *, - seat_id-ROW_NUMBER () OVER (ORDER BY seat_id) AS diff - FROM cinema_603 - WHERE free = 1 -), -consecutive_free_seats AS ( - SELECT *, - COUNT(seat_id) OVER (PARTITION BY diff) AS cnt - FROM ranked -) -SELECT seat_id -FROM consecutive_free_seats -WHERE cnt >= 2; diff --git a/easy/607.Sales Person.md b/easy/607.Sales Person.md new file mode 100644 index 0000000..f2f96fd --- /dev/null +++ b/easy/607.Sales Person.md @@ -0,0 +1,95 @@ +# Question 607: Sales Person + +**LeetCode URL:** https://leetcode.com/problems/sales-person/ + +## Description + +Write a solution to find the names of all the salespersons who did not have any orders related to the company with the name "RED". Return the result table in any order. The result format is in the following example. 
+ +## Table Schema Structure + +```sql +Create table If Not Exists SalesPerson (sales_id int, name varchar(255), salary int, commission_rate int, hire_date date); +Create table If Not Exists Company (com_id int, name varchar(255), city varchar(255)); +Create table If Not Exists Orders (order_id int, order_date date, com_id int, sales_id int, amount int); +``` + +## Sample Input Data + +```sql +insert into SalesPerson (sales_id, name, salary, commission_rate, hire_date) values ('1', 'John', '100000', '6', '4/1/2006'); +insert into SalesPerson (sales_id, name, salary, commission_rate, hire_date) values ('2', 'Amy', '12000', '5', '5/1/2010'); +insert into SalesPerson (sales_id, name, salary, commission_rate, hire_date) values ('3', 'Mark', '65000', '12', '12/25/2008'); +insert into SalesPerson (sales_id, name, salary, commission_rate, hire_date) values ('4', 'Pam', '25000', '25', '1/1/2005'); +insert into SalesPerson (sales_id, name, salary, commission_rate, hire_date) values ('5', 'Alex', '5000', '10', '2/3/2007'); +insert into Company (com_id, name, city) values ('1', 'RED', 'Boston'); +insert into Company (com_id, name, city) values ('2', 'ORANGE', 'New York'); +insert into Company (com_id, name, city) values ('3', 'YELLOW', 'Boston'); +insert into Company (com_id, name, city) values ('4', 'GREEN', 'Austin'); +insert into Orders (order_id, order_date, com_id, sales_id, amount) values ('1', '1/1/2014', '3', '4', '10000'); +insert into Orders (order_id, order_date, com_id, sales_id, amount) values ('2', '2/1/2014', '4', '5', '5000'); +insert into Orders (order_id, order_date, com_id, sales_id, amount) values ('3', '3/1/2014', '1', '1', '50000'); +insert into Orders (order_id, order_date, com_id, sales_id, amount) values ('4', '4/1/2014', '1', '4', '25000'); +``` + +## Expected Output Data + +```text ++------+ +| name | ++------+ +| Amy | +| Mark | +| Alex | ++------+ +``` + +## SQL Solution + +```sql +WITH cte AS( + SELECT s.sales_id + FROM orders_607 o + JOIN 
company_607 c ON o.com_id = c.com_id + JOIN salesperson_607 s ON o.sales_id = s.sales_id + WHERE c.name like 'RED' +) + +SELECT DISTINCT name +FROM salesperson_607 +WHERE sales_id NOT IN (SELECT * + FROM cte); +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `name` from `orders`, `company`, `salesperson`, `cte`. + +### Result Grain + +One row per distinct combination of projected columns. + +### Step-by-Step Logic + +1. Create CTE layers (`cte`) to decompose the logic into smaller, testable steps before the final SELECT. +2. CTE `cte`: reads `orders`, `company`, `salesperson`. +3. Combine datasets using JOIN. Join predicates control row matching and prevent accidental cartesian growth. +4. Apply row-level filtering in `WHERE`: sales_id NOT IN (SELECT * FROM cte). +5. Project final output columns: `name`. +6. Remove duplicate result tuples using `DISTINCT` where uniqueness is required. + +### Why This Works + +CTEs separate transformation stages, which makes dependencies explicit and easier to validate. Join conditions align related entities so each output row is built from the correct source records. Predicate filtering removes irrelevant rows before expensive downstream computation. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are join operations, subquery execution. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently. +- If nullable columns are involved, validate null-handling explicitly (`IS NULL`, `COALESCE`). 
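The null-handling pitfall noted above deserves a concrete demonstration: `NOT IN` returns no rows at all if its subquery ever yields a NULL, while `NOT EXISTS` is null-safe. A minimal SQLite sketch from Python (table contents are illustrative, trimmed from the sample data):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE salesperson (sales_id INTEGER, name TEXT)")
conn.executemany("INSERT INTO salesperson VALUES (?, ?)",
                 [(1, 'John'), (2, 'Amy'), (3, 'Mark')])

# If the subquery result contains a NULL, NOT IN matches nothing, because
# `x NOT IN (1, NULL)` evaluates to NULL (unknown), never TRUE.
with_null = conn.execute(
    "SELECT name FROM salesperson WHERE sales_id NOT IN (SELECT 1 UNION SELECT NULL)"
).fetchall()
print(with_null)  # []

# NOT EXISTS ignores the NULL and returns the expected rows.
safe = conn.execute("""
    SELECT name FROM salesperson s
    WHERE NOT EXISTS (SELECT 1 FROM (SELECT 1 AS id UNION SELECT NULL) t
                      WHERE t.id = s.sales_id)
""").fetchall()
print(sorted(r[0] for r in safe))  # ['Amy', 'Mark']
```

The repository's query is safe here because `cte` projects a non-null key, but the guard matters whenever the subquery column is nullable.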
+ diff --git a/easy/607.Sales Person.sql b/easy/607.Sales Person.sql deleted file mode 100644 index 4353bf9..0000000 --- a/easy/607.Sales Person.sql +++ /dev/null @@ -1,12 +0,0 @@ -WITH cte AS( - SELECT s.sales_id - FROM orders_607 o - JOIN company_607 c ON o.com_id = c.com_id - JOIN salesperson_607 s ON o.sales_id = s.sales_id - WHERE c.name like 'RED' -) - -SELECT DISTINCT name -FROM salesperson_607 -WHERE sales_id NOT IN (SELECT * - FROM cte); diff --git a/easy/610. Triangle Judgement.md b/easy/610. Triangle Judgement.md new file mode 100644 index 0000000..2f0122b --- /dev/null +++ b/easy/610. Triangle Judgement.md @@ -0,0 +1,67 @@ +# Question 610: Triangle Judgement + +**LeetCode URL:** https://leetcode.com/problems/triangle-judgement/ + +## Description + +Report for each row of the `Triangle` table whether the three line segments of lengths x, y, and z can form a triangle. The result format is in the following example. + +## Table Schema Structure + +```sql +Create table If Not Exists Triangle (x int, y int, z int); +``` + +## Sample Input Data + +```sql +insert into Triangle (x, y, z) values ('13', '15', '30'); +insert into Triangle (x, y, z) values ('10', '20', '15'); +``` + +## Expected Output Data + +```text ++----+----+----+----------+ +| x | y | z | triangle | ++----+----+----+----------+ +| 13 | 15 | 30 | No | +| 10 | 20 | 15 | Yes | ++----+----+----+----------+ +``` + +## SQL Solution + +```sql +SELECT *, + CASE WHEN x+y>z AND x+z>y AND y+z>x THEN 'Yes' + ELSE 'No' + END AS triangle +FROM triangle_610; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `x`, `y`, `z`, `triangle` from `triangle`. + +### Result Grain + +One row per source row in `Triangle`. + +### Step-by-Step Logic + +1. Evaluate the triangle inequality (every pair of sides must sum to more than the third side) in a CASE expression and project it alongside the source columns as `triangle`. + +### Why This Works + +The final projection exposes only the columns required by the result contract.
+ +### Performance Notes + +Primary cost is scanning source rows once; performance usually scales with input size. + +### Common Pitfalls + +- Check edge cases: empty input, single-row input, and duplicate keys. + diff --git a/easy/610. Triangle Judgement.sql b/easy/610. Triangle Judgement.sql deleted file mode 100644 index 65a5972..0000000 --- a/easy/610. Triangle Judgement.sql +++ /dev/null @@ -1,5 +0,0 @@ -SELECT *, - CASE WHEN x+y>z AND x+z>y AND y+z>x THEN 'Yes' - ELSE 'No' - END AS triangle -FROM triangle_610; diff --git a/easy/613. Shortest Distance in a Line.md b/easy/613. Shortest Distance in a Line.md new file mode 100644 index 0000000..7201f51 --- /dev/null +++ b/easy/613. Shortest Distance in a Line.md @@ -0,0 +1,67 @@ +# Question 613: Shortest Distance in a Line + +**LeetCode URL:** https://leetcode.com/problems/shortest-distance-in-a-line/ + +## Description + +Write a query to report the shortest distance between any two points from the `Point` table. The result format is in the following example. + +## Table Schema Structure + +```sql +Create Table If Not Exists Point (x int not null); +``` + +## Sample Input Data + +```sql +insert into Point (x) values ('-1'); +insert into Point (x) values ('0'); +insert into Point (x) values ('2'); +``` + +## Expected Output Data + +```text ++----------+ +| shortest | ++----------+ +| 1 | ++----------+ +``` + +## SQL Solution + +```sql +SELECT MIN(ABS(a.x - b.x)) AS shortest +FROM point_613 a +JOIN point_613 b ON a.x <> b.x; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result column `shortest` from `point`. + +### Result Grain + +A single aggregate row. + +### Step-by-Step Logic + +1. Self-join `point` on `a.x <> b.x` to enumerate every pair of distinct points. Join predicates control row matching and prevent accidental cartesian growth. +2. Aggregate the pairwise distances `ABS(a.x - b.x)` with MIN and project the result as `shortest`. + +### Why This Works + +Join conditions align related entities so each output row is built from the correct source records.
The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are join operations. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently. + diff --git a/easy/613. Shortest Distance in a Line.sql b/easy/613. Shortest Distance in a Line.sql deleted file mode 100644 index 5b1173c..0000000 --- a/easy/613. Shortest Distance in a Line.sql +++ /dev/null @@ -1,3 +0,0 @@ -SELECT MIN(ABS(ABS(a.x)-ABS(b.x))) AS shortest -FROM point_613 a -JOIN point_613 b ON a.x <> b.x; diff --git a/easy/619. Biggest Single Number.md b/easy/619. Biggest Single Number.md new file mode 100644 index 0000000..b559b94 --- /dev/null +++ b/easy/619. Biggest Single Number.md @@ -0,0 +1,77 @@ +# Question 619: Biggest Single Number + +**LeetCode URL:** https://leetcode.com/problems/biggest-single-number/ + +## Description + +return it. The result format is in the following example. + +## Table Schema Structure + +```sql +Create table If Not Exists MyNumbers (num int); +``` + +## Sample Input Data + +```sql +insert into MyNumbers (num) values ('8'); +insert into MyNumbers (num) values ('8'); +insert into MyNumbers (num) values ('3'); +insert into MyNumbers (num) values ('3'); +insert into MyNumbers (num) values ('1'); +insert into MyNumbers (num) values ('4'); +insert into MyNumbers (num) values ('5'); +insert into MyNumbers (num) values ('6'); +``` + +## Expected Output Data + +```text ++-----+ +| num | ++-----+ +| 6 | ++-----+ +``` + +## SQL Solution + +```sql +SELECT num +FROM number_619 +GROUP BY num +HAVING COUNT(num) = 1 +ORDER BY num DESC +LIMIT 1; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `num` from `number`. + +### Result Grain + +One row per unique key in `GROUP BY num`. + +### Step-by-Step Logic + +1. Aggregate rows with COUNT grouped by num. +2. 
Filter aggregated groups in `HAVING`: COUNT(num) = 1, keeping only the numbers that appear exactly once. +3. Order output deterministically with `ORDER BY num DESC` and keep the largest remaining number with `LIMIT 1`. +4. Project the final output column: `num`. + +### Why This Works + +Grouping keys define the reporting grain; aggregate functions then summarize values at exactly that grain. `HAVING` ensures only groups that satisfy business thresholds survive, and `ORDER BY ... DESC` with `LIMIT 1` returns the biggest of them. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are sorting/grouping. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Every non-aggregated selected column must belong to the grouping grain. + diff --git a/easy/619. Biggest Single Number.sql b/easy/619. Biggest Single Number.sql deleted file mode 100644 index 7d08967..0000000 --- a/easy/619. Biggest Single Number.sql +++ /dev/null @@ -1,6 +0,0 @@ -SELECT num -FROM number_619 -GROUP BY num -HAVING COUNT(num) = 1 -ORDER BY num DESC -LIMIT 1; diff --git a/easy/620. Not Boring Movies.md b/easy/620. Not Boring Movies.md new file mode 100644 index 0000000..de31db2 --- /dev/null +++ b/easy/620. Not Boring Movies.md @@ -0,0 +1,71 @@ +# Question 620: Not Boring Movies + +**LeetCode URL:** https://leetcode.com/problems/not-boring-movies/ + +## Description + +Write a solution to report the movies with an odd-numbered ID and a description that is not "boring". Return the result table ordered by rating in descending order. The result format is in the following example.
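The odd-id and not-"boring" filter can be verified with a quick SQLite session from Python (the unsuffixed `cinema` table name is illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cinema (id INTEGER, movie TEXT, description TEXT, rating REAL)")
conn.executemany("INSERT INTO cinema VALUES (?, ?, ?, ?)", [
    (1, 'War', 'great 3D', 8.9),
    (2, 'Science', 'fiction', 8.5),
    (3, 'irish', 'boring', 6.2),
    (4, 'Ice song', 'Fantacy', 8.6),
    (5, 'House card', 'Interesting', 9.1),
])

# Odd id (id % 2 = 1) and description not equal to 'boring',
# ordered by rating descending.
rows = conn.execute("""
    SELECT id, movie FROM cinema
    WHERE id % 2 = 1 AND description <> 'boring'
    ORDER BY rating DESC
""").fetchall()
print(rows)  # [(5, 'House card'), (1, 'War')]
```

Movie 3 has an odd id but is filtered out by the description predicate.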
+ +## Table Schema Structure + +```sql +Create table If Not Exists cinema (id int, movie varchar(255), description varchar(255), rating float(2, 1)); +``` + +## Sample Input Data + +```sql +insert into cinema (id, movie, description, rating) values ('1', 'War', 'great 3D', '8.9'); +insert into cinema (id, movie, description, rating) values ('2', 'Science', 'fiction', '8.5'); +insert into cinema (id, movie, description, rating) values ('3', 'irish', 'boring', '6.2'); +insert into cinema (id, movie, description, rating) values ('4', 'Ice song', 'Fantacy', '8.6'); +insert into cinema (id, movie, description, rating) values ('5', 'House card', 'Interesting', '9.1'); +``` + +## Expected Output Data + +```text ++----+------------+-------------+--------+ +| id | movie | description | rating | ++----+------------+-------------+--------+ +| 5 | House card | Interesting | 9.1 | +| 1 | War | great 3D | 8.9 | ++----+------------+-------------+--------+ +``` + +## SQL Solution + +```sql +SELECT * +FROM cinema_620 +WHERE id%2 = 1 AND description <> 'boring' +ORDER BY rating DESC; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns the required output columns from `cinema`. + +### Result Grain + +Row grain follows the post-filter join output. + +### Step-by-Step Logic + +1. Apply row-level filtering in `WHERE`: id%2 = 1 AND description <> 'boring'. +2. Order output deterministically with `ORDER BY rating DESC`. + +### Why This Works + +Predicate filtering removes irrelevant rows before expensive downstream computation. + +### Performance Notes + +Primary cost drivers are sorting/grouping. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- If nullable columns are involved, validate null-handling explicitly (`IS NULL`, `COALESCE`). + diff --git a/easy/620. Not Boring Movies.sql b/easy/620. Not Boring Movies.sql deleted file mode 100644 index 8ff1da1..0000000 --- a/easy/620. 
Not Boring Movies.sql +++ /dev/null @@ -1,4 +0,0 @@ -SELECT * -FROM cinema_620 -WHERE id%2 = 1 AND description <> 'boring' -ORDER BY rating DESC; diff --git a/easy/627. Swap Salary.md b/easy/627. Swap Salary.md new file mode 100644 index 0000000..413af4d --- /dev/null +++ b/easy/627. Swap Salary.md @@ -0,0 +1,73 @@ +# Question 627: Swap Salary + +**LeetCode URL:** https://leetcode.com/problems/swap-salary/ + +## Description + +Write a solution to swap all 'f' and 'm' values (i.e., change all 'f' values to 'm' and vice versa) with a single UPDATE statement and no intermediate temporary tables. The result format is in the following example. + +## Table Schema Structure + +```sql +Create table If Not Exists Salary (id int, name varchar(100), sex char(1), salary int); +``` + +## Sample Input Data + +```sql +insert into Salary (id, name, sex, salary) values ('1', 'A', 'm', '2500'); +insert into Salary (id, name, sex, salary) values ('2', 'B', 'f', '1500'); +insert into Salary (id, name, sex, salary) values ('3', 'C', 'm', '5500'); +insert into Salary (id, name, sex, salary) values ('4', 'D', 'f', '500'); +``` + +## Expected Output Data + +```text ++----+------+-----+--------+ +| id | name | sex | salary | ++----+------+-----+--------+ +| 1 | A | f | 2500 | +| 2 | B | m | 1500 | +| 3 | C | f | 5500 | +| 4 | D | m | 500 | ++----+------+-----+--------+ +``` + +## SQL Solution + +```sql +UPDATE salary_627 +SET sex = ( + CASE WHEN sex = 'm' THEN 'f' + ELSE 'm' + END +); +``` + +## Solution Breakdown + +### Goal + +The statement swaps the `sex` value in place with an UPDATE rather than producing a SELECT result set. + +### Result Grain + +Every row of the table is updated exactly once. + +### Step-by-Step Logic + +1. Update each row, flipping `sex` with a CASE expression: 'm' becomes 'f' and anything else becomes 'm'. + +### Why This Works + +A single CASE expression inverts the value in both directions, so one UPDATE pass swaps all rows without a temporary table. + +### Performance Notes + +Primary cost is scanning source rows once; performance usually scales with input size.
+ +### Common Pitfalls + +- Check edge cases: empty input, single-row input, and duplicate keys. + diff --git a/easy/627. Swap Salary.sql b/easy/627. Swap Salary.sql deleted file mode 100644 index 6f35242..0000000 --- a/easy/627. Swap Salary.sql +++ /dev/null @@ -1,6 +0,0 @@ -UPDATE salary_627 -SET sex = ( - CASE WHEN sex = 'm' THEN 'f' - ELSE 'm' - END -); diff --git a/hard/1097. Game Play Analysis V.md b/hard/1097. Game Play Analysis V.md new file mode 100644 index 0000000..57faadd --- /dev/null +++ b/hard/1097. Game Play Analysis V.md @@ -0,0 +1,86 @@ +# Question 1097: Game Play Analysis V + +**LeetCode URL:** https://leetcode.com/problems/game-play-analysis-v/ + +## Description + +The install date of a player is the first login day of that player. The day one retention of an install date x is the number of players whose install date is x and who logged back in on day x + 1, divided by the number of players whose install date is x, rounded to 2 decimal places. Write an SQL query to report, for each install date, the number of players who installed the game on that day and the day one retention. The query result format is in the following example.
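The install-date and day-one-retention logic can be checked end to end with SQLite from Python; the unsuffixed `activity` table name is illustrative, and `date(x, '+1 day')` plays the role of PostgreSQL's `install_date + 1`:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE activity (player_id INT, device_id INT, event_date TEXT, games_played INT)")
conn.executemany("INSERT INTO activity VALUES (?, ?, ?, ?)", [
    (1, 2, '2016-03-01', 5), (1, 2, '2016-03-02', 6),
    (2, 3, '2017-06-25', 1), (3, 1, '2016-03-01', 0),
    (3, 4, '2016-07-03', 5),
])

# Install date = first login per player; day-1 retention = share of those
# players who also logged in exactly one day later (LEFT JOIN keeps the
# non-returning players so the denominator counts every install).
rows = conn.execute("""
    WITH installs AS (
        SELECT player_id, MIN(event_date) AS install_dt FROM activity GROUP BY player_id
    )
    SELECT i.install_dt,
           COUNT(i.player_id) AS installs,
           ROUND(1.0 * COUNT(a.event_date) / COUNT(i.player_id), 2) AS day1_retention
    FROM installs i
    LEFT JOIN activity a
      ON a.player_id = i.player_id AND a.event_date = date(i.install_dt, '+1 day')
    GROUP BY i.install_dt
    ORDER BY i.install_dt
""").fetchall()
print(rows)  # [('2016-03-01', 2, 0.5), ('2017-06-25', 1, 0.0)]
```

Players 1 and 3 install on 2016-03-01 but only player 1 returns the next day, hence the 0.5 retention.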
+ +## Table Schema Structure + +```sql +Create table If Not Exists Activity (player_id int, device_id int, event_date date, games_played int); +``` + +## Sample Input Data + +```sql +insert into Activity (player_id, device_id, event_date, games_played) values ('1', '2', '2016-03-01', '5'); +insert into Activity (player_id, device_id, event_date, games_played) values ('1', '2', '2016-03-02', '6'); +insert into Activity (player_id, device_id, event_date, games_played) values ('2', '3', '2017-06-25', '1'); +insert into Activity (player_id, device_id, event_date, games_played) values ('3', '1', '2016-03-01', '0'); +insert into Activity (player_id, device_id, event_date, games_played) values ('3', '4', '2018-07-03', '5'); +``` + +## Expected Output Data + +```text ++------------+----------+----------------+ +| install_dt | installs | Day1_retention | ++------------+----------+----------------+ +| 2016-03-01 | 2 | 0.50 | +| 2017-06-25 | 1 | 0.00 | ++------------+----------+----------------+ +``` + +## SQL Solution + +```sql +WITH install_dates AS( + SELECT player_id,MIN(event_date) AS install_date + FROM activity_1097 + GROUP BY player_id +), +new AS( + SELECT i.player_id,i.install_date,a.event_date + FROM install_dates i + LEFT JOIN activity_1097 a ON i.player_id = a.player_id AND i.install_date + 1 = a.event_date +) + +SELECT install_date,COUNT(player_id),ROUND(COUNT(event_date)::NUMERIC/COUNT(player_id),2) +FROM new +GROUP BY install_date; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `install_date` from `activity`, `install_dates`, `new`. + +### Result Grain + +One row per unique key in `GROUP BY install_date`. + +### Step-by-Step Logic + +1. Create CTE layers (`install_dates`, `new`) to decompose the logic into smaller, testable steps before the final SELECT. +2. CTE `install_dates`: reads `activity`. +3. CTE `new`: reads `install_dates`, `activity`, joins related entities. +4. Combine datasets using LEFT JOIN, JOIN. 
Join predicates control row matching and prevent accidental cartesian growth. +5. Aggregate rows with COUNT, MIN, ROUND grouped by install_date. +6. Project final output columns: `install_date`. + +### Why This Works + +CTEs separate transformation stages, which makes dependencies explicit and easier to validate. Join conditions align related entities so each output row is built from the correct source records. Grouping keys define the reporting grain; aggregate functions then summarize values at exactly that grain. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are sorting/grouping, join operations. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently. +- Every non-aggregated selected column must belong to the grouping grain. + diff --git a/hard/1097. Game Play Analysis V.sql b/hard/1097. Game Play Analysis V.sql deleted file mode 100644 index 0528f9b..0000000 --- a/hard/1097. Game Play Analysis V.sql +++ /dev/null @@ -1,14 +0,0 @@ -WITH install_dates AS( - SELECT player_id,MIN(event_date) AS install_date - FROM activity_1097 - GROUP BY player_id -), -new AS( - SELECT i.player_id,i.install_date,a.event_date - FROM install_dates i - LEFT JOIN activity_1097 a ON i.player_id = a.player_id AND i.install_date + 1 = a.event_date -) - -SELECT install_date,COUNT(player_id),ROUND(COUNT(event_date)::NUMERIC/COUNT(player_id),2) -FROM new -GROUP BY install_date; diff --git a/hard/1127. User Purchase Platform.md b/hard/1127. User Purchase Platform.md new file mode 100644 index 0000000..6f7afcd --- /dev/null +++ b/hard/1127. 
User Purchase Platform.md +++ b/hard/1127. User Purchase Platform.md @@ -0,0 +1,96 @@ +# Question 1127: User Purchase Platform + +**LeetCode URL:** https://leetcode.com/problems/user-purchase-platform/ + +## Description + +Write an SQL query to find the total number of users and the total amount spent using mobile only, desktop only, and both mobile and desktop together for each date. The query result format is in the following example. On 2019-07-01, user 1 purchased using both desktop and mobile, user 2 purchased using mobile only, and user 3 purchased using desktop only.
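The classification step of the solution (deciding per user and day whether spending was mobile only, desktop only, or both) can be reproduced in SQLite from Python. Note this sketch covers only the self-join classification; it does not emit the zero-count 'both' row for 2019-07-02, which the full solution adds by generating all three platform labels per day with PostgreSQL's UNNEST:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE spending (user_id INT, spend_date TEXT, platform TEXT, amount INT)")
conn.executemany("INSERT INTO spending VALUES (?, ?, ?, ?)", [
    (1, '2019-07-01', 'mobile', 100), (1, '2019-07-01', 'desktop', 100),
    (2, '2019-07-01', 'mobile', 100), (2, '2019-07-02', 'mobile', 100),
    (3, '2019-07-01', 'desktop', 100), (3, '2019-07-02', 'desktop', 100),
])

# A row is 'both' when the same user has a same-day row on the other
# platform; otherwise it keeps its own platform label.
rows = conn.execute("""
    SELECT a.spend_date,
           CASE WHEN b.user_id IS NOT NULL THEN 'both' ELSE a.platform END AS platform_type,
           COUNT(DISTINCT a.user_id) AS total_users,
           SUM(a.amount) AS total_amount
    FROM spending a
    LEFT JOIN spending b
      ON a.user_id = b.user_id AND a.spend_date = b.spend_date AND a.platform <> b.platform
    GROUP BY a.spend_date,
             CASE WHEN b.user_id IS NOT NULL THEN 'both' ELSE a.platform END
    ORDER BY a.spend_date, platform_type
""").fetchall()
print(rows)
```

For 2019-07-01 user 1's two rows both classify as 'both', so the group sums to 200 for one distinct user.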
+ +## Table Schema Structure + +```sql +Create table If Not Exists Spending (user_id int, spend_date date, platform ENUM('desktop', 'mobile'), amount int); +``` + +## Sample Input Data + +```sql +insert into Spending (user_id, spend_date, platform, amount) values ('1', '2019-07-01', 'mobile', '100'); +insert into Spending (user_id, spend_date, platform, amount) values ('1', '2019-07-01', 'desktop', '100'); +insert into Spending (user_id, spend_date, platform, amount) values ('2', '2019-07-01', 'mobile', '100'); +insert into Spending (user_id, spend_date, platform, amount) values ('2', '2019-07-02', 'mobile', '100'); +insert into Spending (user_id, spend_date, platform, amount) values ('3', '2019-07-01', 'desktop', '100'); +insert into Spending (user_id, spend_date, platform, amount) values ('3', '2019-07-02', 'desktop', '100'); +``` + +## Expected Output Data + +```text ++------------+----------+--------------+-------------+ +| spend_date | platform | total_amount | total_users | ++------------+----------+--------------+-------------+ +| 2019-07-01 | desktop | 100 | 1 | +| 2019-07-01 | mobile | 100 | 1 | +| 2019-07-01 | both | 200 | 1 | +| 2019-07-02 | desktop | 100 | 1 | +| 2019-07-02 | mobile | 100 | 1 | +| 2019-07-02 | both | 0 | 0 | ++------------+----------+--------------+-------------+ +``` + +## SQL Solution + +```sql +WITH each_day_platform AS( + SELECT spend_date,UNNEST(ARRAY['both','mobile','desktop']) AS platform_type + FROM spending_1127 + GROUP BY spend_date +), +cte AS( + SELECT a.spend_date, + CASE WHEN b.user_id IS NOT NULL THEN 'both' + WHEN a.platform = 'mobile' THEN 'mobile' + ELSE 'desktop' + END AS platform_type, + COUNT(DISTINCT a.user_id) AS total_users, + SUM(a.amount) AS amount + FROM spending_1127 a + LEFT JOIN spending_1127 b ON a.user_id = b.user_id AND a.spend_date = b.spend_date AND a.platform <> b.platform + GROUP BY a.spend_date,platform_type +) + +SELECT a.spend_date,a.platform_type,COALESCE(total_users,0) AS 
total_users,COALESCE(amount,0) AS amount +FROM each_day_platform a +LEFT JOIN cte b ON a.spend_date=b.spend_date AND a.platform_type = b.platform_type; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `spend_date`, `platform_type`, `total_users`, `amount` from `spending`, `each_day_platform`, `cte`. + +### Result Grain + +Row grain follows the post-filter join output. + +### Step-by-Step Logic + +1. Create CTE layers (`each_day_platform`, `cte`) to decompose the logic into smaller, testable steps before the final SELECT. +2. CTE `each_day_platform`: reads `spending`. +3. CTE `cte`: reads `spending`, joins related entities. +4. Combine datasets using LEFT JOIN, JOIN. Join predicates control row matching and prevent accidental cartesian growth. +5. Project final output columns: `spend_date`, `platform_type`, `total_users`, `amount`. + +### Why This Works + +CTEs separate transformation stages, which makes dependencies explicit and easier to validate. Join conditions align related entities so each output row is built from the correct source records. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are join operations. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently. + diff --git a/hard/1127. User Purchase Platform.sql b/hard/1127. User Purchase Platform.sql deleted file mode 100644 index 529261e..0000000 --- a/hard/1127. 
User Purchase Platform.sql +++ /dev/null @@ -1,21 +0,0 @@ -WITH each_day_platform AS( - SELECT spend_date,UNNEST(ARRAY['both','mobile','desktop']) AS platform_type - FROM spending_1127 - GROUP BY spend_date -), -cte AS( - SELECT a.spend_date, - CASE WHEN b.user_id IS NOT NULL THEN 'both' - WHEN a.platform = 'mobile' THEN 'mobile' - ELSE 'desktop' - END AS platform_type, - COUNT(DISTINCT a.user_id) AS total_users, - SUM(a.amount) AS amount - FROM spending_1127 a - LEFT JOIN spending_1127 b ON a.user_id = b.user_id AND a.spend_date = b.spend_date AND a.platform <> b.platform - GROUP BY a.spend_date,platform_type -) - -SELECT a.spend_date,a.platform_type,COALESCE(total_users,0) AS total_users,COALESCE(amount,0) AS amount -FROM each_day_platform a -LEFT JOIN cte b ON a.spend_date=b.spend_date AND a.platform_type = b.platform_type; diff --git a/hard/1159. Market Analysis II.md b/hard/1159. Market Analysis II.md new file mode 100644 index 0000000..2ef31bc --- /dev/null +++ b/hard/1159. Market Analysis II.md @@ -0,0 +1,105 @@ +# Question 1159: Market Analysis II + +**LeetCode URL:** https://leetcode.com/problems/market-analysis-ii/ + +## Description + +Write an SQL query to find for each user, whether the brand of the second item (by date) they sold is their favorite brand. 
The query result format is in the following example. If a user sold fewer than two items, report the answer for that user as no. For instance, the answer for the user with id 1 is no because they sold nothing.
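The core "second item by date" step can be checked with SQLite's window functions (version 3.25+) from Python; the trimmed, unsuffixed `orders` table below keeps only the columns the ranking needs:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INT, order_date TEXT, item_id INT, seller_id INT)")
conn.executemany("INSERT INTO orders VALUES (?, ?, ?, ?)", [
    (1, '2019-08-01', 4, 2), (2, '2019-08-02', 2, 3),
    (3, '2019-08-03', 3, 3), (4, '2019-08-04', 1, 2),
    (5, '2019-08-04', 1, 4), (6, '2019-08-05', 2, 4),
])

# Rank each seller's orders by date and keep the second one per seller.
rows = conn.execute("""
    WITH ranked AS (
        SELECT seller_id, item_id,
               ROW_NUMBER() OVER (PARTITION BY seller_id ORDER BY order_date) AS rn
        FROM orders
    )
    SELECT seller_id, item_id FROM ranked WHERE rn = 2 ORDER BY seller_id
""").fetchall()
print(rows)  # [(2, 1), (3, 3), (4, 2)]
```

Seller 1 never appears because they sold nothing; the full solution's LEFT JOIN against this CTE is what turns that absence into a 'no'.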
+ +## Table Schema Structure + +```sql +Create table If Not Exists Users (user_id int, join_date date, favorite_brand varchar(10)); +Create table If Not Exists Orders (order_id int, order_date date, item_id int, buyer_id int, seller_id int); +Create table If Not Exists Items (item_id int, item_brand varchar(10)); +``` + +## Sample Input Data + +```sql +insert into Users (user_id, join_date, favorite_brand) values ('1', '2019-01-01', 'Lenovo'); +insert into Users (user_id, join_date, favorite_brand) values ('2', '2019-02-09', 'Samsung'); +insert into Users (user_id, join_date, favorite_brand) values ('3', '2019-01-19', 'LG'); +insert into Users (user_id, join_date, favorite_brand) values ('4', '2019-05-21', 'HP'); +insert into Orders (order_id, order_date, item_id, buyer_id, seller_id) values ('1', '2019-08-01', '4', '1', '2'); +insert into Orders (order_id, order_date, item_id, buyer_id, seller_id) values ('2', '2019-08-02', '2', '1', '3'); +insert into Orders (order_id, order_date, item_id, buyer_id, seller_id) values ('3', '2019-08-03', '3', '2', '3'); +insert into Orders (order_id, order_date, item_id, buyer_id, seller_id) values ('4', '2019-08-04', '1', '4', '2'); +insert into Orders (order_id, order_date, item_id, buyer_id, seller_id) values ('5', '2019-08-04', '1', '3', '4'); +insert into Orders (order_id, order_date, item_id, buyer_id, seller_id) values ('6', '2019-08-05', '2', '2', '4'); +insert into Items (item_id, item_brand) values ('1', 'Samsung'); +insert into Items (item_id, item_brand) values ('2', 'Lenovo'); +insert into Items (item_id, item_brand) values ('3', 'LG'); +insert into Items (item_id, item_brand) values ('4', 'HP'); +``` + +## Expected Output Data + +```text ++-----------+--------------------+ +| seller_id | 2nd_item_fav_brand | ++-----------+--------------------+ +| 1 | no | +| 2 | yes | +| 3 | yes | +| 4 | no | ++-----------+--------------------+ +``` + +## SQL Solution + +```sql +WITH cte AS( + SELECT *, + ROW_NUMBER() OVER (PARTITION 
BY seller_id ORDER BY order_date) AS rn + FROM orders_1159 +), +cte2 AS( + SELECT seller_id,item_id + FROM cte + WHERE rn = 2 +) + +SELECT u.user_id AS seller_id, + CASE WHEN c.item_id=i.item_id THEN 'yes' + ELSE 'no' + END AS "2nd_item_fav_brand" +FROM users_1159 u +INNER JOIN items_1159 i ON u.favorite_brand = i.item_brand +LEFT JOIN cte2 c ON u.user_id = c.seller_id +ORDER BY 1; +``` + +## Solution Breakdown + +### Goal + +For each user acting as a seller, the query reports whether the second item they sold (by `order_date`) carries their favorite brand, producing the columns `seller_id` and `2nd_item_fav_brand` from `orders_1159`, `users_1159`, and `items_1159`. + +### Result Grain + +One row per user whose favorite brand appears in `items_1159`. + +### Step-by-Step Logic + +1. Create CTE layers (`cte`, `cte2`) to decompose the logic into smaller, testable steps before the final SELECT. +2. CTE `cte`: numbers each seller's orders by `order_date` with `ROW_NUMBER()`. +3. CTE `cte2`: keeps only each seller's second sale (`rn = 2`). +4. Combine datasets: the INNER JOIN attaches the item carrying each user's favorite brand, while the LEFT JOIN keeps users with fewer than two sales, whose CASE falls through to 'no'. +5. Use window functions (`... OVER (...)`) to compute rankings/running metrics while preserving row-level detail. +6. Project final output columns: `seller_id`, `2nd_item_fav_brand`. +7. Order output deterministically with `ORDER BY 1`. + +### Why This Works + +CTEs separate transformation stages, which makes dependencies explicit and easier to validate. Join conditions align related entities so each output row is built from the correct source records. Window expressions calculate comparative metrics without collapsing rows too early. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are sorting/grouping, window partitions, join operations. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently. +- Window `PARTITION BY`/`ORDER BY` choices change semantics; test with tied values and nulls. +- The brand join assumes each brand maps to a single item in `items_1159`; duplicate brands would fan out user rows. + diff --git a/hard/1159. 
Market Analysis II.sql deleted file index 58dfd09..0000000 --- a/hard/1159. Market Analysis II.sql +++ /dev/null @@ -1,19 +0,0 @@ -WITH cte AS( - SELECT *, - ROW_NUMBER() OVER (PARTITION BY seller_id ORDER BY order_date) AS rn - FROM orders_1159 -), -cte2 AS( - SELECT seller_id,item_id - FROM cte - WHERE rn = 2 -) - -SELECT u.user_id, - CASE WHEN c.item_id=i.item_id THEN 'yes' - ELSE 'no' - END AS "2nd_item_fav_brand" -FROM users_1159 u -INNER JOIN items_1159 i ON u.favorite_brand = i.item_brand -LEFT JOIN cte2 c ON u.user_id = c.seller_id -ORDER BY 1; diff --git a/hard/1194. Tournament Winners.md b/hard/1194. Tournament Winners.md new file mode 100644 index 0000000..c13095c --- /dev/null +++ b/hard/1194. Tournament Winners.md @@ -0,0 +1,114 @@ +# Question 1194: Tournament Winners + +**LeetCode URL:** https://leetcode.com/problems/tournament-winners/ + +## Description + +Write an SQL query to find the winner in each group. The winner in each group is the player who scored the maximum total points within the group; in case of a tie, the lowest `player_id` wins. (Difficulty: Hard; premium/locked question; company tag: Wayfair.) The query result format is in the following example: + +```text +Players table: ++-----------+----------+ +| player_id | group_id | ++-----------+----------+ +| 15 | 1 | +| 25 | 1 | +| 30 | 1 | +| 45 | 1 | +| 10 | 2 | +| 35 | 2 | +| 50 | 2 | +| 20 | 3 | +| 40 | 3 | ++-----------+----------+ + +Matches table: ++----------+--------------+---------------+-------------+--------------+ +| match_id | first_player | second_player | first_score | second_score | ++----------+--------------+---------------+-------------+--------------+ +| 1 | 15 | 45 | 3 | 0 | +| 2 | 30 | 25 | 1 | 2 | +| 3 | 30 | 15 | 2 | 0 | +| 4 | 40 | 20 | 5 | 2 | +| 5 | 35 | 50 | 1 | 1 | ++----------+--------------+---------------+-------------+--------------+ + +Result table: ++----------+-----------+ +| group_id | player_id | ++----------+-----------+ +| 1 | 15 | +| 2 | 35 | +| 3 | 40 | ++----------+-----------+ +``` + +## Table Schema Structure + +```sql +Create table If Not Exists Players (player_id int, group_id int); +Create table If Not Exists Matches (match_id int, first_player int, second_player int, first_score int, second_score int); +``` + +## Sample Input Data + +```sql +insert into Players (player_id, group_id) values ('10', '2'); +insert into Players (player_id, group_id) values ('15', '1'); +insert into Players (player_id, group_id) values ('20', '3'); +insert into Players (player_id, group_id) values ('25', '1'); +insert into Players (player_id, group_id) values ('30', '1'); +insert into Players (player_id, group_id) values ('35', '2'); +insert into Players (player_id, group_id) values ('40', '3'); +insert into Players (player_id, group_id) values ('45', '1'); +insert into Players (player_id, group_id) values ('50', '2'); +insert into Matches (match_id, first_player, second_player, first_score, second_score) values ('1', '15', '45', '3', '0'); +insert into Matches (match_id, first_player, second_player, first_score, second_score) values ('2', '30', '25', '1', '2'); +insert into Matches (match_id, first_player, second_player, first_score, second_score) values ('3', '30', '15', '2', '0'); +insert into Matches (match_id, first_player, second_player, first_score, second_score) values ('4', '40', '20', '5', '2'); +insert into Matches (match_id, first_player, second_player, first_score, second_score) values ('5', '35', '50', '1', '1'); +``` + +## Expected Output Data + +```text ++-----------+------------+ +| group_id | player_id | ++-----------+------------+ +| 1 | 15 | +| 2 | 35 | +| 3 | 40 | ++-----------+------------+ +``` + +## SQL Solution + +```sql +WITH player_scores AS( + (SELECT first_player AS player,first_score AS score + FROM matches_1194) + UNION ALL + (SELECT second_player AS player,second_score AS score + FROM matches_1194) +), + +all_player_scores AS( + SELECT player,SUM(score) AS score + FROM player_scores + GROUP BY 
player + ORDER BY player +), + +ranked AS ( + SELECT p.*,ps.score AS score, + DENSE_RANK() OVER(PARTITION BY group_id ORDER BY score DESC,player_id ASC) AS rnk + FROM players_1194 p + INNER JOIN all_player_scores ps ON p.player_id = ps.player +) + +SELECT group_id,player_id +FROM ranked +WHERE rnk=1 +ORDER BY group_id; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `group_id`, `player_id` from `matches`, `player_scores`, `players`, `all_player_scores`. + +### Result Grain + +Row grain follows the post-filter join output. + +### Step-by-Step Logic + +1. Create CTE layers (`player_scores`, `all_player_scores`, `ranked`) to decompose the logic into smaller, testable steps before the final SELECT. +2. CTE `player_scores`: reads `matches`. +3. CTE `all_player_scores`: reads `player_scores`. +4. CTE `ranked`: reads `players`, `all_player_scores`, joins related entities. +5. Combine datasets using INNER JOIN, JOIN. Join predicates control row matching and prevent accidental cartesian growth. +6. Apply row-level filtering in `WHERE`: rnk=1. +7. Use window functions (`... OVER (...)`) to compute rankings/running metrics while preserving row-level detail. +8. Project final output columns: `group_id`, `player_id`. +9. Merge compatible result sets with `UNION`/`UNION ALL` before final projection. +10. Order output deterministically with `ORDER BY group_id`. + +### Why This Works + +CTEs separate transformation stages, which makes dependencies explicit and easier to validate. Join conditions align related entities so each output row is built from the correct source records. Window expressions calculate comparative metrics without collapsing rows too early. Predicate filtering removes irrelevant rows before expensive downstream computation. Set-union logic combines multiple valid pathways into one consistent output. The final projection exposes only the columns required by the result contract. 
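As a cross-check of the union-then-rank pattern described above, here is a minimal sketch using Python's built-in `sqlite3` as a stand-in for PostgreSQL (SQLite 3.25+ supports the same window functions; the unsuffixed table names are illustrative, not the repository's `_1194` tables):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE players (player_id INT, group_id INT);
CREATE TABLE matches (match_id INT, first_player INT, second_player INT,
                      first_score INT, second_score INT);
INSERT INTO players VALUES (15,1),(25,1),(30,1),(45,1),(10,2),(35,2),(50,2),(20,3),(40,3);
INSERT INTO matches VALUES (1,15,45,3,0),(2,30,25,1,2),(3,30,15,2,0),(4,40,20,5,2),(5,35,50,1,1);
""")

winners = con.execute("""
WITH player_scores AS (          -- both sides of every match as (player, score) rows
    SELECT first_player AS player, first_score AS score FROM matches
    UNION ALL
    SELECT second_player AS player, second_score AS score FROM matches
),
totals AS (                      -- total points per player
    SELECT player, SUM(score) AS score FROM player_scores GROUP BY player
),
ranked AS (                      -- rank inside each group; lowest player_id breaks ties
    SELECT p.group_id, p.player_id,
           DENSE_RANK() OVER (PARTITION BY p.group_id
                              ORDER BY t.score DESC, p.player_id ASC) AS rnk
    FROM players p
    JOIN totals t ON p.player_id = t.player
)
SELECT group_id, player_id FROM ranked WHERE rnk = 1 ORDER BY group_id
""").fetchall()
print(winners)  # [(1, 15), (2, 35), (3, 40)]
```

The tie in group 2 (players 35 and 50 both total 1 point) resolves to player 35 precisely because `player_id ASC` is the secondary sort key in the window's ORDER BY.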
+ +### Performance Notes + +Primary cost drivers are sorting/grouping, window partitions, join operations. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently. +- Window `PARTITION BY`/`ORDER BY` choices change semantics; test with tied values and nulls. +- If nullable columns are involved, validate null-handling explicitly (`IS NULL`, `COALESCE`). + diff --git a/hard/1194. Tournament Winners.sql b/hard/1194. Tournament Winners.sql deleted file mode 100644 index 5237b40..0000000 --- a/hard/1194. Tournament Winners.sql +++ /dev/null @@ -1,26 +0,0 @@ -WITH player_scores AS( - (SELECT first_player AS player,first_score AS score - FROM matches_1194) - UNION ALL - (SELECT second_player AS player,second_score AS score - FROM matches_1194) -), - -all_player_scores AS( - SELECT player,SUM(score) AS score - FROM player_scores - GROUP BY player - ORDER BY player -), - -ranked AS ( - SELECT p.*,ps.score AS score, - DENSE_RANK() OVER(PARTITION BY group_id ORDER BY score DESC,player_id ASC) AS rnk - FROM players_1194 p - INNER JOIN all_player_scores ps ON p.player_id = ps.player -) - -SELECT group_id,player_id -FROM ranked -WHERE rnk=1 -ORDER BY group_id; diff --git a/hard/1225. Report Contiguous Dates.md b/hard/1225. Report Contiguous Dates.md new file mode 100644 index 0000000..3560b2a --- /dev/null +++ b/hard/1225. Report Contiguous Dates.md @@ -0,0 +1,133 @@ +# Question 1225: Report Contiguous Dates + +**LeetCode URL:** https://leetcode.com/problems/report-contiguous-dates/ + +## Description + +Write an SQL query to generate a report of period_state for each continuous interval of days in the period from 2019-01-01 to 2019-12-31. 
The query result format is in the following example: Failed table: +-------------------+ | fail_date | +-------------------+ | 2018-12-28 | | 2018-12-29 | | 2019-01-04 | | 2019-01-05 | +-------------------+ Succeeded table: +-------------------+ | success_date | +-------------------+ | 2018-12-30 | | 2018-12-31 | | 2019-01-01 | | 2019-01-02 | | 2019-01-03 | | 2019-01-06 | +-------------------+ Result table: +--------------+--------------+--------------+ | period_state | start_date | end_date | +--------------+--------------+--------------+ | succeeded | 2019-01-01 | 2019-01-03 | | failed | 2019-01-04 | 2019-01-05 | | succeeded | 2019-01-06 | 2019-01-06 | +--------------+--------------+--------------+ The report ignored the system state in 2018 as we care about the system in the period 2019-01-01 to 2019-12-31. + +## Table Schema Structure + +```sql +Create table If Not Exists Failed (fail_date date); +Create table If Not Exists Succeeded (success_date date); +``` + +## Sample Input Data + +```sql +insert into Failed (fail_date) values ('2018-12-28'); +insert into Failed (fail_date) values ('2018-12-29'); +insert into Failed (fail_date) values ('2019-01-04'); +insert into Failed (fail_date) values ('2019-01-05'); +insert into Succeeded (success_date) values ('2018-12-30'); +insert into Succeeded (success_date) values ('2018-12-31'); +insert into Succeeded (success_date) values ('2019-01-01'); +insert into Succeeded (success_date) values ('2019-01-02'); +insert into Succeeded (success_date) values ('2019-01-03'); +insert into Succeeded (success_date) values ('2019-01-06'); +``` + +## Expected Output Data + +```text ++--------------+--------------+--------------+ +| period_state | start_date | end_date | ++--------------+--------------+--------------+ +| succeeded | 2019-01-01 | 2019-01-03 | +| failed | 2019-01-04 | 2019-01-05 | +| succeeded | 2019-01-06 | 2019-01-06 | ++--------------+--------------+--------------+ +``` + +## SQL Solution + +```sql +WITH cte1 AS( + 
SELECT fail_date AS dt,'failed' AS status + FROM failed_1225 + WHERE EXTRACT(YEAR from fail_date) = 2019 + UNION + SELECT success_date AS dt,'succeeded' AS status + FROM succeeded_1225 + WHERE EXTRACT(YEAR from success_date) = 2019 +), +cte2 AS ( + SELECT *, + LAG(status) OVER (ORDER BY dt) AS lagged_status + FROM cte1 +), +cte3 AS ( + SELECT *, + (CASE WHEN status = lagged_status THEN 0 ELSE 1 END) AS marker + FROM cte2 +), +cte4 AS ( + SELECT *, + SUM(marker) OVER (ORDER BY dt) AS rolling_sum + FROM cte3 +) +SELECT MAX(status) AS period_state, MIN(dt) AS start_date, MAX(dt) AS end_date +FROM cte4 +GROUP BY rolling_sum; + +--------------------------------------------------------------------------------------------------------------------------------------------- +--Simplified Query +--------------------------------------------------------------------------------------------------------------------------------------------- + +WITH tasks AS ( + SELECT fail_date AS dt,'failed' AS status + FROM failed_1225 + UNION + SELECT success_date AS dt,'succeeded' AS status + FROM succeeded_1225 +), +ranked AS ( + SELECT *, + ROW_NUMBER() OVER (ORDER BY dt)-ROW_NUMBER() OVER (PARTITION BY status ORDER BY dt) AS diff + FROM tasks + WHERE dt BETWEEN '2019-01-01' AND '2019-12-31' -- ISO date literals; 'DD-MM-YYYY' strings fail under PostgreSQL's default DateStyle + ORDER BY dt +) +SELECT status, + MIN(dt) AS start_date,MAX(dt) AS end_date +FROM ranked +GROUP BY status,diff +ORDER BY start_date; +``` + +## Solution Breakdown + +### Goal + +The query reports each contiguous run of days sharing the same state, producing the columns `status` (the period state), `start_date`, and `end_date` from `failed_1225` and `succeeded_1225`. + +### Result Grain + +One row per unique key in `GROUP BY status,diff`, i.e. one row per contiguous same-state interval. + +### Step-by-Step Logic + +1. Create CTE layers (`cte1`, `cte2`, `cte3`, `cte4`, `tasks`, `ranked`) to decompose the logic into smaller, testable steps before the final SELECT. +2. CTE `cte1`: unions `failed_1225` and `succeeded_1225` into a single dated status stream, keeping only 2019 dates. +3. CTE `cte2`: attaches each row's previous status via `LAG(status) OVER (ORDER BY dt)`. +4. CTE `cte3`: flags state changes with a 0/1 `marker`. +5. 
CTE `cte4`: reads `cte3`, computes window metrics. +6. Aggregate rows with SUM, MIN, MAX grouped by status,diff. +7. Use window functions (`... OVER (...)`) to compute rankings/running metrics while preserving row-level detail. +8. Project final output columns: `status`, `start_date`, `end_date`. +9. Order output deterministically with `ORDER BY start_date`. + +### Why This Works + +CTEs separate transformation stages, which makes dependencies explicit and easier to validate. Grouping keys define the reporting grain; aggregate functions then summarize values at exactly that grain. Window expressions calculate comparative metrics without collapsing rows too early. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are sorting/grouping, window partitions. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Every non-aggregated selected column must belong to the grouping grain. +- Window `PARTITION BY`/`ORDER BY` choices change semantics; test with tied values and nulls. + diff --git a/hard/1225. Report Contiguous Dates.sql b/hard/1225. Report Contiguous Dates.sql deleted file mode 100644 index 22a358c..0000000 --- a/hard/1225. 
Report Contiguous Dates.sql +++ /dev/null @@ -1,51 +0,0 @@ -WITH cte1 AS( - SELECT fail_date AS dt,'failed' AS status - FROM failed_1225 - WHERE EXTRACT(YEAR from fail_date) = 2019 - UNION - SELECT success_date AS dt,'succeeded' AS status - FROM succeeded_1225 - WHERE EXTRACT(YEAR from success_date) = 2019 -), -cte2 AS ( - SELECT *, - LAG(status) OVER (ORDER BY dt) AS lagged_status - FROM cte1 -), -cte3 AS ( - SELECT *, - (CASE WHEN status = lagged_status THEN 0 ELSE 1 END) AS marker - FROM cte2 -), -cte4 AS ( - SELECT *, - SUM(marker) OVER (ORDER BY dt) AS rolling_sum - FROM cte3 -) -SELECT MAX(status) AS period_state, MIN(dt) AS start_date, MAX(dt) AS end_date -FROM cte4 -GROUP BY rolling_sum; - ---------------------------------------------------------------------------------------------------------------------------------------------- ---Simplified Query ---------------------------------------------------------------------------------------------------------------------------------------------- - -WITH tasks AS ( - SELECT fail_date AS dt,'failed' AS status - FROM failed_1225 - UNION - SELECT success_date AS dt,'succeeded' AS status - FROM succeeded_1225 -), -ranked AS ( - SELECT *, - ROW_NUMBER() OVER (ORDER BY dt)-ROW_NUMBER() OVER (PARTITION BY status ORDER BY dt) AS diff - FROM tasks - WHERE dt BETWEEN '01-01-2019' AND '31-12-2019' - ORDER BY dt -) -SELECT status, - MIN(dt) AS start_date,MAX(dt) AS end_date -FROM ranked -GROUP BY status,diff -ORDER BY start_date; diff --git a/hard/1336. Number of Transactions per Visit.md b/hard/1336. Number of Transactions per Visit.md new file mode 100644 index 0000000..012aae6 --- /dev/null +++ b/hard/1336. 
Number of Transactions per Visit.md @@ -0,0 +1,128 @@ +# Question 1336: Number of Transactions per Visit + +**LeetCode URL:** https://leetcode.com/problems/number-of-transactions-per-visit/ + +## Description + +Write an SQL query to find how many users visited the bank and didn't do any transactions, how many visited the bank and did one transaction and so on. The query result format is in the following example: Visits table: +---------+------------+ | user_id | visit_date | +---------+------------+ | 1 | 2020-01-01 | | 2 | 2020-01-02 | | 12 | 2020-01-01 | | 19 | 2020-01-03 | | 1 | 2020-01-02 | | 2 | 2020-01-03 | | 1 | 2020-01-04 | | 7 | 2020-01-11 | | 9 | 2020-01-25 | | 8 | 2020-01-28 | +---------+------------+ Transactions table: +---------+------------------+--------+ | user_id | transaction_date | amount | +---------+------------------+--------+ | 1 | 2020-01-02 | 120 | | 2 | 2020-01-03 | 22 | | 7 | 2020-01-11 | 232 | | 1 | 2020-01-04 | 7 | | 9 | 2020-01-25 | 33 | | 9 | 2020-01-25 | 66 | | 8 | 2020-01-28 | 1 | | 9 | 2020-01-25 | 99 | +---------+------------------+--------+ Result table: +--------------------+--------------+ | transactions_count | visits_count | +--------------------+--------------+ | 0 | 4 | | 1 | 5 | | 2 | 0 | | 3 | 1 | +--------------------+--------------+ * For transactions_count = 0, The visits (1, "2020-01-01"), (2, "2020-01-02"), (12, "2020-01-01") and (19, "2020-01-03") did no transactions so visits_count = 4. 
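The counting logic described above (bucket visits by same-day transaction count, with an explicit zero bucket) can be sketched compactly with Python's built-in `sqlite3`. This is a simplified variant of the repository's PostgreSQL solution, not the solution itself, and the unsuffixed table names are illustrative:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE visits (user_id INT, visit_date TEXT);
CREATE TABLE transactions (user_id INT, transaction_date TEXT, amount INT);
INSERT INTO visits VALUES (1,'2020-01-01'),(2,'2020-01-02'),(12,'2020-01-01'),
 (19,'2020-01-03'),(1,'2020-01-02'),(2,'2020-01-03'),(1,'2020-01-04'),
 (7,'2020-01-11'),(9,'2020-01-25'),(8,'2020-01-28');
INSERT INTO transactions VALUES (1,'2020-01-02',120),(2,'2020-01-03',22),
 (7,'2020-01-11',232),(1,'2020-01-04',7),(9,'2020-01-25',33),
 (9,'2020-01-25',66),(8,'2020-01-28',1),(9,'2020-01-25',99);
""")

rows = con.execute("""
WITH RECURSIVE per_visit AS (   -- transactions per (user, visit day); 0 when none match
    SELECT v.user_id, v.visit_date, COUNT(t.user_id) AS tc
    FROM visits v
    LEFT JOIN transactions t
      ON v.user_id = t.user_id AND v.visit_date = t.transaction_date
    GROUP BY v.user_id, v.visit_date
),
nums AS (                       -- integers 0..max(tc), so empty buckets still appear
    SELECT 0 AS n
    UNION ALL
    SELECT n + 1 FROM nums WHERE n < (SELECT MAX(tc) FROM per_visit)
)
SELECT n.n AS transactions_count, COUNT(p.tc) AS visits_count
FROM nums n
LEFT JOIN per_visit p ON p.tc = n.n
GROUP BY n.n
ORDER BY n.n
""").fetchall()
print(rows)  # [(0, 4), (1, 5), (2, 0), (3, 1)]
```

`COUNT(t.user_id)` yields 0 for unmatched LEFT JOIN rows, which folds the no-transaction case into the same aggregation instead of splitting NULL and non-NULL rows into separate CTEs as the full solution below does.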
+ +## Table Schema Structure + +```sql +Create table If Not Exists Visits (user_id int, visit_date date); +Create table If Not Exists Transactions (user_id int, transaction_date date, amount int); +``` + +## Sample Input Data + +```sql +insert into Visits (user_id, visit_date) values ('1', '2020-01-01'); +insert into Visits (user_id, visit_date) values ('2', '2020-01-02'); +insert into Visits (user_id, visit_date) values ('12', '2020-01-01'); +insert into Visits (user_id, visit_date) values ('19', '2020-01-03'); +insert into Visits (user_id, visit_date) values ('1', '2020-01-02'); +insert into Visits (user_id, visit_date) values ('2', '2020-01-03'); +insert into Visits (user_id, visit_date) values ('1', '2020-01-04'); +insert into Visits (user_id, visit_date) values ('7', '2020-01-11'); +insert into Visits (user_id, visit_date) values ('9', '2020-01-25'); +insert into Visits (user_id, visit_date) values ('8', '2020-01-28'); +insert into Transactions (user_id, transaction_date, amount) values ('1', '2020-01-02', '120'); +insert into Transactions (user_id, transaction_date, amount) values ('2', '2020-01-03', '22'); +insert into Transactions (user_id, transaction_date, amount) values ('7', '2020-01-11', '232'); +insert into Transactions (user_id, transaction_date, amount) values ('1', '2020-01-04', '7'); +insert into Transactions (user_id, transaction_date, amount) values ('9', '2020-01-25', '33'); +insert into Transactions (user_id, transaction_date, amount) values ('9', '2020-01-25', '66'); +insert into Transactions (user_id, transaction_date, amount) values ('8', '2020-01-28', '1'); +insert into Transactions (user_id, transaction_date, amount) values ('9', '2020-01-25', '99'); +``` + +## Expected Output Data + +```text ++--------------------+--------------+ +| transactions_count | visits_count | ++--------------------+--------------+ +| 0 | 4 | +| 1 | 5 | +| 2 | 0 | +| 3 | 1 | ++--------------------+--------------+ +``` + +## SQL Solution + +```sql +WITH RECURSIVE 
cte AS( + SELECT v.user_id,v.visit_date,t.amount + FROM visits_1336 v + LEFT JOIN transactions_1336 t ON v.user_id=t.user_id AND v.visit_date=t.transaction_date +), +cte1 AS ( + SELECT user_id,visit_date,COUNT(1) AS transactions_count + FROM cte + WHERE amount IS NOT NULL + GROUP BY user_id,visit_date +), +cte2 AS ( + SELECT 0 AS transactions_count,COUNT(1) AS visits_count + FROM cte + WHERE amount IS NULL + GROUP BY transactions_count +), +cte3 AS ( + SELECT transactions_count,COUNT(transactions_count) AS visit_count + FROM cte1 + GROUP BY transactions_count + UNION + SELECT * + FROM cte2 +), +nums AS ( + SELECT 0 AS n + UNION + SELECT n+1 AS n + FROM nums + WHERE n < (SELECT MAX(transactions_count) FROM cte3) +) + +SELECT n.n AS transactions_count,COALESCE(visit_count,0) AS visits_count +FROM nums n +LEFT JOIN cte3 c ON n.n=c.transactions_count +ORDER BY 1; +``` + +## Solution Breakdown + +### Goal + +For every possible transaction count n, from 0 up to the observed maximum, the query counts how many bank visits involved exactly n same-day transactions, producing `transactions_count` and `visits_count` from `visits_1336` and `transactions_1336`. + +### Result Grain + +One row per integer from 0 to the maximum number of transactions observed in a single visit. + +### Step-by-Step Logic + +1. Create CTE layers (`cte1`, `cte2`, `cte3`, `nums`) to transform data incrementally; recursion expands rows level-by-level until the stop condition is met. +2. CTE `cte`: left-joins visits to same-day transactions, so visits without transactions keep a NULL `amount`. +3. CTE `cte1`: counts transactions per (user, visit date) for visits that had at least one. +4. CTE `cte2`: counts the remaining visits as the `transactions_count = 0` bucket. +5. CTE `cte3`: unions both buckets into one (count, visits) lookup. +6. Recursive CTE `nums`: generates the integers 0..MAX(transactions_count) so empty buckets still appear. +7. Combine datasets using LEFT JOIN, JOIN. Join predicates control row matching and prevent accidental cartesian growth. +8. Project final output columns: `transactions_count`, `visits_count`. +9. Order output deterministically with `ORDER BY 1`. + +### Why This Works + +CTEs separate transformation stages, which makes dependencies explicit and easier to validate. Join conditions align related entities so each output row is built from the correct source records. 
The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are sorting/grouping, recursive expansion, join operations. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently. +- Recursive CTEs need a strict termination condition to avoid runaway recursion. + diff --git a/hard/1336. Number of Transactions per Visit.sql b/hard/1336. Number of Transactions per Visit.sql deleted file mode 100644 index 759b146..0000000 --- a/hard/1336. Number of Transactions per Visit.sql +++ /dev/null @@ -1,37 +0,0 @@ -WITH RECURSIVE cte AS( - SELECT v.user_id,v.visit_date,t.amount - FROM visits_1336 v - LEFT JOIN transactions_1336 t ON v.user_id=t.user_id AND v.visit_date=t.transaction_date -), -cte1 AS ( - SELECT user_id,visit_date,COUNT(1) AS transactions_count - FROM cte - WHERE amount IS NOT NULL - GROUP BY user_id,visit_date -), -cte2 AS ( - SELECT 0 AS transactions_count,COUNT(1) AS visits_count - FROM cte - WHERE amount IS NULL - GROUP BY transactions_count -), -cte3 AS ( - SELECT transactions_count,COUNT(transactions_count) AS visit_count - FROM cte1 - GROUP BY transactions_count - UNION - SELECT * - FROM cte2 -), -nums AS ( - SELECT 0 AS n - UNION - SELECT n+1 AS n - FROM nums - WHERE n < (SELECT MAX(transactions_count) FROM cte3) -) - -SELECT n.n AS transactions_count,COALESCE(visit_count,0) AS visit_count -FROM nums n -LEFT JOIN cte3 c ON n.n=c.transactions_count -ORDER BY 1; diff --git a/hard/1384. Total Sales Amount by Year.md b/hard/1384. Total Sales Amount by Year.md new file mode 100644 index 0000000..86e2771 --- /dev/null +++ b/hard/1384. 
Total Sales Amount by Year.md @@ -0,0 +1,98 @@ +# Question 1384: Total Sales Amount by Year + +**LeetCode URL:** https://leetcode.com/problems/total-sales-amount-by-year/ + +## Description + +Drafted from this solution SQL: split each sales period into daily rows, map each day to its year, and compute total sales amount per `product_id` and year. Return one row per product-year with the yearly total. + +## Table Schema Structure + +```sql +-- Inferred from the solution query +CREATE TABLE sales_1384 ( + product_id INT, + period_start DATE, + period_end DATE, + average_daily_sales INT +); + +CREATE TABLE product_1384 ( + product_id INT, + product_name VARCHAR(100) +); +``` + +## Sample Input Data + +```sql +INSERT INTO product_1384 (product_id, product_name) VALUES +(1, 'LC Phone'), +(2, 'LC T-Shirt'); + +INSERT INTO sales_1384 (product_id, period_start, period_end, average_daily_sales) VALUES +(1, '2019-12-30', '2020-01-02', 100), +(2, '2020-01-01', '2020-01-03', 10); +``` + +## Expected Output Data + +```text ++------------+--------------+-------------+-----+ +| product_id | product_name | report_year | sum | ++------------+--------------+-------------+-----+ +| 1 | LC Phone | 2019 | 200 | +| 1 | LC Phone | 2020 | 200 | +| 2 | LC T-Shirt | 2020 | 30 | ++------------+--------------+-------------+-----+ +``` + +## SQL Solution + +```sql +WITH RECURSIVE exploded_sales AS ( + SELECT product_id,period_start,period_end,average_daily_sales + FROM sales_1384 + UNION + SELECT product_id,period_start+1 AS period_start,period_end,average_daily_sales + FROM exploded_sales + WHERE period_start < period_end +) +SELECT es.product_id,p.product_name,EXTRACT(YEAR FROM period_start) AS report_year,SUM(average_daily_sales) +FROM exploded_sales es +INNER JOIN product_1384 p ON es.product_id=p.product_id +GROUP BY es.product_id,p.product_name,EXTRACT(YEAR FROM period_start) +ORDER BY es.product_id,report_year; +``` + +## 
Solution Breakdown + +### Goal + +The query builds the final result columns `product_id`, `product_name`, `report_year`, and the yearly sales total (the unaliased `sum` column) from `sales_1384`, expanded day-by-day in the recursive CTE `exploded_sales` and joined to `product_1384`. + +### Result Grain + +One row per unique key in `GROUP BY es.product_id,p.product_name,EXTRACT(YEAR FROM period_start)`. + +### Step-by-Step Logic + +1. Recursive CTE `exploded_sales`: expands each sales period into one row per day by repeatedly adding one day to `period_start` until it reaches `period_end`. +2. Combine datasets using INNER JOIN. Join predicates control row matching and prevent accidental cartesian growth. +3. Aggregate rows with SUM grouped by es.product_id,p.product_name,EXTRACT(YEAR FROM period_start). +4. Project final output columns: `product_id`, `product_name`, `report_year`, `sum`. +5. Order output deterministically with `ORDER BY es.product_id,report_year`. + +### Why This Works + +The recursive expansion turns period-level rows into daily rows, so each day carries its `average_daily_sales` and falls into the correct calendar year. Join conditions align related entities so each output row is built from the correct source records. Grouping keys define the reporting grain; aggregate functions then summarize values at exactly that grain. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are sorting/grouping, recursive expansion, join operations. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently. +- Every non-aggregated selected column must belong to the grouping grain. +- Recursive CTEs need a strict termination condition to avoid runaway recursion. + diff --git a/hard/1384. Total Sales Amount by Year.txt b/hard/1384. 
Total Sales Amount by Year.txt +++ /dev/null @@ -1,13 +0,0 @@ -WITH RECURSIVE exploded_sales AS ( - SELECT product_id,period_start,period_end,average_daily_sales - FROM sales_1384 - UNION - SELECT product_id,period_start+1 AS period_start,period_end,average_daily_sales - FROM exploded_sales - WHERE period_start < period_end -) -SELECT es.product_id,p.product_name,EXTRACT(YEAR FROM period_start) AS report_year,SUM(average_daily_sales) -FROM exploded_sales es -INNER JOIN product_1384 p ON es.product_id=p.product_id -GROUP BY es.product_id,p.product_name,EXTRACT(YEAR FROM period_start) -ORDER BY es.product_id,report_year; diff --git a/hard/1412. Find the Quiet Students in All Exams.md b/hard/1412. Find the Quiet Students in All Exams.md new file mode 100644 index 0000000..1240a15 --- /dev/null +++ b/hard/1412. Find the Quiet Students in All Exams.md @@ -0,0 +1,98 @@ +# Question 1412: Find the Quiet Students in All Exams + +**LeetCode URL:** https://leetcode.com/problems/find-the-quiet-students-in-all-exams/ + +## Description + +Write an SQL query to report the students (student_id, student_name) being "quiet" in ALL exams. A "quiet" student took at least one exam and never scored the highest or the lowest score in any of them. Do not return students who have never taken any exam. The query result format is in the following example. 
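As a quick sanity check of the quiet-student rule, the window-min/max-plus-EXCEPT approach used by the solution further down can be exercised with Python's built-in `sqlite3` (SQLite 3.25+ as a stand-in for PostgreSQL; the unsuffixed table names are illustrative):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE student (student_id INT, student_name TEXT);
CREATE TABLE exam (exam_id INT, student_id INT, score INT);
INSERT INTO student VALUES (1,'Daniel'),(2,'Jade'),(3,'Stella'),(4,'Jonathan'),(5,'Will');
INSERT INTO exam VALUES (10,1,70),(10,2,80),(10,3,90),(20,1,80),
 (30,1,70),(30,3,80),(30,4,90),(40,1,60),(40,2,70),(40,4,80);
""")

quiet = con.execute("""
WITH flagged AS (                -- annotate each row with its exam's extreme scores
    SELECT *,
           MIN(score) OVER (PARTITION BY exam_id) AS lo,
           MAX(score) OVER (PARTITION BY exam_id) AS hi
    FROM exam
),
quiet_ids AS (                   -- examined students minus anyone who hit an extreme
    SELECT DISTINCT student_id FROM exam
    EXCEPT
    SELECT student_id FROM flagged WHERE score = lo OR score = hi
)
SELECT s.student_id, s.student_name
FROM quiet_ids q
JOIN student s ON q.student_id = s.student_id
ORDER BY s.student_id
""").fetchall()
print(quiet)  # [(2, 'Jade')]
```

Will (student 5) is excluded because he never took an exam, so he never enters the `SELECT DISTINCT student_id FROM exam` side of the EXCEPT; Daniel, Stella, and Jonathan are excluded for posting a highest or lowest score.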
+ +## Table Schema Structure + +```sql +Create table If Not Exists Student (student_id int, student_name varchar(30)); +Create table If Not Exists Exam (exam_id int, student_id int, score int); +``` + +## Sample Input Data + +```sql +insert into Student (student_id, student_name) values ('1', 'Daniel'); +insert into Student (student_id, student_name) values ('2', 'Jade'); +insert into Student (student_id, student_name) values ('3', 'Stella'); +insert into Student (student_id, student_name) values ('4', 'Jonathan'); +insert into Student (student_id, student_name) values ('5', 'Will'); +insert into Exam (exam_id, student_id, score) values ('10', '1', '70'); +insert into Exam (exam_id, student_id, score) values ('10', '2', '80'); +insert into Exam (exam_id, student_id, score) values ('10', '3', '90'); +insert into Exam (exam_id, student_id, score) values ('20', '1', '80'); +insert into Exam (exam_id, student_id, score) values ('30', '1', '70'); +insert into Exam (exam_id, student_id, score) values ('30', '3', '80'); +insert into Exam (exam_id, student_id, score) values ('30', '4', '90'); +insert into Exam (exam_id, student_id, score) values ('40', '1', '60'); +insert into Exam (exam_id, student_id, score) values ('40', '2', '70'); +insert into Exam (exam_id, student_id, score) values ('40', '4', '80'); +``` + +## Expected Output Data + +```text ++-------------+---------------+ +| student_id | student_name | ++-------------+---------------+ +| 2 | Jade | ++-------------+---------------+ +``` + +## SQL Solution + +```sql +WITH cte AS( + SELECT *, + MIN(score) OVER (PARTITION BY exam_id) AS lowest_score, + MAX(score) OVER (PARTITION BY exam_id) AS highest_score + FROM exams_1412 +), +cte1 AS( + SELECT DISTINCT student_id + FROM exams_1412 + EXCEPT + SELECT DISTINCT student_id + FROM cte c + WHERE score = lowest_score OR score = highest_score +) +SELECT s.* +FROM cte1 c +INNER JOIN students_1412 s ON c.student_id = s.student_id; +``` + +## Solution Breakdown + +### Goal + 
+The query returns `student_id` and `student_name` for every student who took at least one exam but never posted the highest or the lowest score in any exam, built from `exams_1412` and `students_1412` via the CTEs `cte` and `cte1`. + +### Result Grain + +One row per quiet student. + +### Step-by-Step Logic + +1. Create CTE layers (`cte`, `cte1`) to decompose the logic into smaller, testable steps before the final SELECT. +2. CTE `cte`: annotates every exam row with that exam's lowest and highest score via `MIN`/`MAX` window functions. +3. CTE `cte1`: takes all examined students and removes (`EXCEPT`) those who ever hit an extreme score. +4. Combine datasets using INNER JOIN. Join predicates control row matching and prevent accidental cartesian growth. +5. Use window functions (`... OVER (...)`) to compute rankings/running metrics while preserving row-level detail. + +### Why This Works + +CTEs separate transformation stages, which makes dependencies explicit and easier to validate. Join conditions align related entities so each output row is built from the correct source records. Window expressions calculate comparative metrics without collapsing rows too early. + +### Performance Notes + +Primary cost drivers are window partitions, join operations. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently. +- Window `PARTITION BY`/`ORDER BY` choices change semantics; test with tied values and nulls. + diff --git a/hard/1412. Find the Quiet Students in All Exams.sql b/hard/1412. 
Find the Quiet Students in All Exams.sql +++ /dev/null @@ -1,17 +0,0 @@
-WITH cte AS(
-    SELECT *,
-           MIN(score) OVER (PARTITION BY exam_id) AS lowest_score,
-           MAX(score) OVER (PARTITION BY exam_id) AS highest_score
-    FROM exams_1412
-),
-cte1 AS(
-    SELECT DISTINCT student_id
-    FROM exams_1412
-    EXCEPT
-    SELECT DISTINCT student_id
-    FROM cte c
-    WHERE score = lowest_score OR score = highest_score
-)
-SELECT s.*
-FROM cte1 c
-INNER JOIN students_1412 s ON c.student_id = s.student_id;
diff --git a/hard/1479. Sales by Day of the Week.md b/hard/1479. Sales by Day of the Week.md new file mode 100644 index 0000000..aaaa922 --- /dev/null +++ b/hard/1479. Sales by Day of the Week.md @@ -0,0 +1,95 @@
+# Question 1479: Sales by Day of the Week
+
+**LeetCode URL:** https://leetcode.com/problems/sales-by-day-of-the-week/
+
+## Description
+
+Write an SQL query to report how many units in each category have been ordered on each day of the week. Return the result table ordered by category. The query result format is shown by the sample input and expected output below.
+ +## Table Schema Structure + +```sql +Create table If Not Exists Orders (order_id int, customer_id int, order_date date, item_id varchar(30), quantity int); +Create table If Not Exists Items (item_id varchar(30), item_name varchar(30), item_category varchar(30)); +``` + +## Sample Input Data + +```sql +insert into Orders (order_id, customer_id, order_date, item_id, quantity) values ('1', '1', '2020-06-01', '1', '10'); +insert into Orders (order_id, customer_id, order_date, item_id, quantity) values ('2', '1', '2020-06-08', '2', '10'); +insert into Orders (order_id, customer_id, order_date, item_id, quantity) values ('3', '2', '2020-06-02', '1', '5'); +insert into Orders (order_id, customer_id, order_date, item_id, quantity) values ('4', '3', '2020-06-03', '3', '5'); +insert into Orders (order_id, customer_id, order_date, item_id, quantity) values ('5', '4', '2020-06-04', '4', '1'); +insert into Orders (order_id, customer_id, order_date, item_id, quantity) values ('6', '4', '2020-06-05', '5', '5'); +insert into Orders (order_id, customer_id, order_date, item_id, quantity) values ('7', '5', '2020-06-05', '1', '10'); +insert into Orders (order_id, customer_id, order_date, item_id, quantity) values ('8', '5', '2020-06-14', '4', '5'); +insert into Orders (order_id, customer_id, order_date, item_id, quantity) values ('9', '5', '2020-06-21', '3', '5'); +insert into Items (item_id, item_name, item_category) values ('1', 'LC Alg. Book', 'Book'); +insert into Items (item_id, item_name, item_category) values ('2', 'LC DB. 
Book', 'Book');
+insert into Items (item_id, item_name, item_category) values ('3', 'LC SmarthPhone', 'Phone');
+insert into Items (item_id, item_name, item_category) values ('4', 'LC Phone 2020', 'Phone');
+insert into Items (item_id, item_name, item_category) values ('5', 'LC SmartGlass', 'Glasses');
+insert into Items (item_id, item_name, item_category) values ('6', 'LC T-Shirt XL', 'T-shirt');
+```
+
+## Expected Output Data
+
+```text
++------------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+
+| Category   | Monday    | Tuesday   | Wednesday | Thursday  | Friday    | Saturday  | Sunday    |
++------------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+
+| Book       | 20        | 5         | 0         | 0         | 10        | 0         | 0         |
+| Glasses    | 0         | 0         | 0         | 0         | 5         | 0         | 0         |
+| Phone      | 0         | 0         | 5         | 1         | 0         | 0         | 10        |
+| T-Shirt    | 0         | 0         | 0         | 0         | 0         | 0         | 0         |
++------------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+
+```
+
+## SQL Solution
+
+```sql
+SELECT i.item_category AS "Category",
+       SUM(CASE WHEN EXTRACT(ISODOW FROM o.order_date)=1 THEN o.quantity ELSE 0 END) AS "Monday",
+       SUM(CASE WHEN EXTRACT(ISODOW FROM o.order_date)=2 THEN o.quantity ELSE 0 END) AS "Tuesday",
+       SUM(CASE WHEN EXTRACT(ISODOW FROM o.order_date)=3 THEN o.quantity ELSE 0 END) AS "Wednesday",
+       SUM(CASE WHEN EXTRACT(ISODOW FROM o.order_date)=4 THEN o.quantity ELSE 0 END) AS "Thursday",
+       SUM(CASE WHEN EXTRACT(ISODOW FROM o.order_date)=5 THEN o.quantity ELSE 0 END) AS "Friday",
+       SUM(CASE WHEN EXTRACT(ISODOW FROM o.order_date)=6 THEN o.quantity ELSE 0 END) AS "Saturday",
+       SUM(CASE WHEN EXTRACT(ISODOW FROM o.order_date)=7 THEN o.quantity ELSE 0 END) AS "Sunday"
+FROM items_1479 i
+LEFT JOIN orders_1479 o ON o.item_id=i.item_id::INT
+GROUP BY i.item_category
+ORDER BY i.item_category;
+```
+
+## Solution Breakdown
+
+### Goal
+
+The query reports, for each item category, the total quantity ordered on each day of the week (Monday through Sunday), showing 0 for day/category combinations with no orders.
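The pivot above relies on one conditional `SUM` per `ISODOW` value. As a sanity check, here is a minimal sketch in Python with SQLite (an assumption: the repository targets PostgreSQL, and SQLite has no `EXTRACT(ISODOW ...)`, so `strftime('%w')`, which numbers Sunday as '0' and Monday as '1', stands in for it; the tables and values are illustrative):

```python
import sqlite3

# Minimal sketch of the conditional-aggregation pivot, assuming SQLite
# (the repo's solution targets PostgreSQL). strftime('%w', d) returns
# '0' for Sunday through '6' for Saturday, so Monday is '1', matching
# ISODOW's numbering for Monday.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (order_date TEXT, item_id INT, quantity INT);
CREATE TABLE items  (item_id INT, item_category TEXT);
INSERT INTO orders VALUES ('2020-06-01', 1, 10), ('2020-06-08', 2, 10), ('2020-06-05', 1, 10);
INSERT INTO items  VALUES (1, 'Book'), (2, 'Book'), (3, 'Phone');
""")
rows = conn.execute("""
SELECT i.item_category,
       SUM(CASE WHEN strftime('%w', o.order_date) = '1' THEN o.quantity ELSE 0 END) AS monday,
       SUM(CASE WHEN strftime('%w', o.order_date) = '5' THEN o.quantity ELSE 0 END) AS friday
FROM items i
LEFT JOIN orders o ON o.item_id = i.item_id
GROUP BY i.item_category
ORDER BY i.item_category;
""").fetchall()
print(rows)  # [('Book', 20, 10), ('Phone', 0, 0)]
```

Replacing the `LEFT JOIN` with an `INNER JOIN` in the sketch makes the order-less 'Phone' row disappear, which is exactly the bug the LEFT JOIN guards against.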
+
+### Result Grain
+
+One row per `item_category`.
+
+### Step-by-Step Logic
+
+1. LEFT JOIN `items_1479` to `orders_1479` so that categories with no orders still appear in the result.
+2. Pivot with conditional aggregation: one `SUM(CASE WHEN EXTRACT(ISODOW FROM order_date)=n THEN quantity ELSE 0 END)` per weekday, where ISODOW numbers Monday as 1 through Sunday as 7.
+3. Group by `item_category` and order the output by category.
+
+### Why This Works
+
+The LEFT JOIN keeps every category, and each CASE expression routes an order's quantity into exactly one weekday column, with `ELSE 0` turning missing days into zeros rather than NULLs. Grouping by category then collapses the rows to one per category.
+
+### Performance Notes
+
+Primary cost drivers are the join and the grouping sort. An index on the join key (`item_id`) typically provides the biggest speedup.
+
+### Common Pitfalls
+
+- An INNER JOIN would silently drop categories with no orders (the `T-shirt` row would disappear).
+- `EXTRACT(DOW ...)` numbers Sunday as 0; this solution relies on `ISODOW`, which numbers Monday as 1.
+
diff --git a/hard/1479. Sales by Day of the Week.sql deleted file mode 100644 index d93cdb6..0000000 --- a/hard/1479. 
Sales by Day of the Week.sql +++ /dev/null @@ -1,12 +0,0 @@ -SELECT i.item_category AS "Category", - SUM(CASE WHEN EXTRACT(ISODOW FROM o.order_date)=1 THEN o.quantity ELSE 0 END) AS "Monday", - SUM(CASE WHEN EXTRACT(ISODOW FROM o.order_date)=2 THEN o.quantity ELSE 0 END) AS "Tuesday", - SUM(CASE WHEN EXTRACT(ISODOW FROM o.order_date)=3 THEN o.quantity ELSE 0 END) AS "Wednesday", - SUM(CASE WHEN EXTRACT(ISODOW FROM o.order_date)=4 THEN o.quantity ELSE 0 END) AS "Thursday", - SUM(CASE WHEN EXTRACT(ISODOW FROM o.order_date)=5 THEN o.quantity ELSE 0 END) AS "Friday", - SUM(CASE WHEN EXTRACT(ISODOW FROM o.order_date)=6 THEN o.quantity ELSE 0 END) AS "Saturday", - SUM(CASE WHEN EXTRACT(ISODOW FROM o.order_date)=7 THEN o.quantity ELSE 0 END) AS "Sunday" -FROM items_1479 i -LEFT JOIN orders_1479 o ON o.item_id=i.item_id::INT -GROUP BY i.item_category -ORDER BY i.item_category; diff --git a/hard/1635. Hopper Company Queries I.md b/hard/1635. Hopper Company Queries I.md new file mode 100644 index 0000000..e097dda --- /dev/null +++ b/hard/1635. Hopper Company Queries I.md @@ -0,0 +1,156 @@ +# Question 1635: Hopper Company Queries I + +**LeetCode URL:** https://leetcode.com/problems/hopper-company-queries-i/ + +## Description + +Write an SQL query to report the following statistics for each month of 2020: - The number of drivers currently with the Hopper company by the end of the month (active_drivers). Return the result table ordered by month in ascending order, where month is the month's number (January is 1, February is 2, etc. The query result format is in the following example. 
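The calendar dimension is the crux of this problem: months with no activity still need a row in the report. The solution below generates that dense series with a recursive CTE; here is a minimal sketch in Python with SQLite (an assumption, since the repository targets PostgreSQL, though the `WITH RECURSIVE` syntax is the same in both engines):

```python
import sqlite3

# Recursive month generator, as used by the solution below: anchor at 1,
# add 1 per recursion step, and stop once the guard num <= 11 fails,
# so the series is exactly 1 through 12.
conn = sqlite3.connect(":memory:")
months = conn.execute("""
WITH RECURSIVE months(num) AS (
    SELECT 1
    UNION
    SELECT num + 1 FROM months WHERE num <= 11
)
SELECT num FROM months;
""").fetchall()
print([m[0] for m in months])  # [1, 2, ..., 12]
```

Dropping the `WHERE num <= 11` guard would make the recursion run until the engine's recursion limit, which is the runaway-recursion pitfall noted in the breakdown.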
+ +## Table Schema Structure + +```sql +Create table If Not Exists Drivers (driver_id int, join_date date); +Create table If Not Exists Rides (ride_id int, user_id int, requested_at date); +Create table If Not Exists AcceptedRides (ride_id int, driver_id int, ride_distance int, ride_duration int); +``` + +## Sample Input Data + +```sql +insert into Drivers (driver_id, join_date) values ('10', '2019-12-10'); +insert into Drivers (driver_id, join_date) values ('8', '2020-1-13'); +insert into Drivers (driver_id, join_date) values ('5', '2020-2-16'); +insert into Drivers (driver_id, join_date) values ('7', '2020-3-8'); +insert into Drivers (driver_id, join_date) values ('4', '2020-5-17'); +insert into Drivers (driver_id, join_date) values ('1', '2020-10-24'); +insert into Drivers (driver_id, join_date) values ('6', '2021-1-5'); +insert into Rides (ride_id, user_id, requested_at) values ('6', '75', '2019-12-9'); +insert into Rides (ride_id, user_id, requested_at) values ('1', '54', '2020-2-9'); +insert into Rides (ride_id, user_id, requested_at) values ('10', '63', '2020-3-4'); +insert into Rides (ride_id, user_id, requested_at) values ('19', '39', '2020-4-6'); +insert into Rides (ride_id, user_id, requested_at) values ('3', '41', '2020-6-3'); +insert into Rides (ride_id, user_id, requested_at) values ('13', '52', '2020-6-22'); +insert into Rides (ride_id, user_id, requested_at) values ('7', '69', '2020-7-16'); +insert into Rides (ride_id, user_id, requested_at) values ('17', '70', '2020-8-25'); +insert into Rides (ride_id, user_id, requested_at) values ('20', '81', '2020-11-2'); +insert into Rides (ride_id, user_id, requested_at) values ('5', '57', '2020-11-9'); +insert into Rides (ride_id, user_id, requested_at) values ('2', '42', '2020-12-9'); +insert into Rides (ride_id, user_id, requested_at) values ('11', '68', '2021-1-11'); +insert into Rides (ride_id, user_id, requested_at) values ('15', '32', '2021-1-17'); +insert into Rides (ride_id, user_id, requested_at) 
values ('12', '11', '2021-1-19'); +insert into Rides (ride_id, user_id, requested_at) values ('14', '18', '2021-1-27'); +insert into AcceptedRides (ride_id, driver_id, ride_distance, ride_duration) values ('10', '10', '63', '38'); +insert into AcceptedRides (ride_id, driver_id, ride_distance, ride_duration) values ('13', '10', '73', '96'); +insert into AcceptedRides (ride_id, driver_id, ride_distance, ride_duration) values ('7', '8', '100', '28'); +insert into AcceptedRides (ride_id, driver_id, ride_distance, ride_duration) values ('17', '7', '119', '68'); +insert into AcceptedRides (ride_id, driver_id, ride_distance, ride_duration) values ('20', '1', '121', '92'); +insert into AcceptedRides (ride_id, driver_id, ride_distance, ride_duration) values ('5', '7', '42', '101'); +insert into AcceptedRides (ride_id, driver_id, ride_distance, ride_duration) values ('2', '4', '6', '38'); +insert into AcceptedRides (ride_id, driver_id, ride_distance, ride_duration) values ('11', '8', '37', '43'); +... 
(truncated)
+```
+
+## Expected Output Data
+
+```text
++-------+----------------+----------------+
+| month | active_drivers | accepted_rides |
++-------+----------------+----------------+
+| 1     | 2              | 0              |
+| 2     | 3              | 0              |
+| 3     | 4              | 1              |
+| 4     | 4              | 0              |
+| 5     | 5              | 0              |
+| 6     | 5              | 1              |
+| 7     | 5              | 1              |
+| 8     | 5              | 1              |
+| 9     | 5              | 0              |
+| 10    | 6              | 0              |
+| 11    | 6              | 2              |
+| 12    | 6              | 1              |
++-------+----------------+----------------+
+```
+
+## SQL Solution
+
+```sql
+WITH RECURSIVE ac_rides AS (
+    SELECT ar.ride_id,r.requested_at
+    FROM accepted_rides_1635 ar
+    INNER JOIN rides_1635 r ON ar.ride_id = r.ride_id AND EXTRACT(YEAR FROM r.requested_at)<=2020
+),
+months AS (
+    SELECT 1 AS num
+    UNION
+    SELECT num+1 AS num
+    FROM months
+    WHERE num<=11
+),
+ride_details AS (
+    SELECT *
+    FROM months m
+    LEFT JOIN ac_rides ar ON EXTRACT(MONTH FROM ar.requested_at)=m.num
+),
+aggr_details AS (
+    SELECT num,COUNT(DISTINCT ride_id) AS rides
+    FROM ride_details
+    GROUP BY num
+),
+avail_drivers AS (
+    SELECT *,
+           ROW_NUMBER() OVER (ORDER BY join_date) AS drivers
+    FROM drivers_1635
+),
+drivers_2020 AS (
+    SELECT *
+    FROM months m
+    LEFT JOIN avail_drivers a ON EXTRACT(YEAR FROM a.join_date)=2020 AND m.num>=EXTRACT(MONTH FROM a.join_date)
+),
+driver_count AS (
+    SELECT num,MAX(drivers) AS drivers
+    FROM drivers_2020
+    GROUP BY num
+)
+SELECT a.num AS month,d.drivers AS active_drivers,a.rides AS accepted_rides
+FROM aggr_details a
+INNER JOIN driver_count d ON a.num=d.num
+ORDER BY month;
+```
+
+## Solution Breakdown
+
+### Goal
+
+The query reports, for each month of 2020, the number of drivers with the company by the end of that month (`active_drivers`) and the number of accepted rides requested in that month (`accepted_rides`).
+
+### Result Grain
+
+One row per month (1 through 12).
+
+### Step-by-Step Logic
+
+1. Create CTE layers (`months`, `ride_details`, `aggr_details`, `avail_drivers`, `drivers_2020`, `driver_count`) to build the monthly counts step by step; the recursion expands the month list until the stop condition is met.
+2. CTE `ac_rides` joins `accepted_rides_1635` to `rides_1635`, keeping accepted rides requested in 2020 or earlier.
+3. 
CTE `months` recursively generates the month numbers 1 through 12 (guard `num<=11`).
+4. CTE `ride_details` LEFT JOINs `months` to `ac_rides` on the ride's request month, so months with no rides survive with NULL ride IDs.
+5. CTE `aggr_details` counts distinct `ride_id` per month; `COUNT(DISTINCT ride_id)` ignores the NULLs introduced for empty months.
+6. CTE `avail_drivers` numbers all drivers by `join_date` with `ROW_NUMBER()`, so each driver row carries the running headcount at that point.
+7. CTE `drivers_2020` LEFT JOINs `months` to the drivers who joined in 2020 on or before each month.
+8. CTE `driver_count` takes `MAX(drivers)` per month from `drivers_2020`, i.e. the headcount visible by that month.
+9. The final SELECT joins the per-month ride counts to the per-month driver counts and orders by month.
+
+### Why This Works
+
+The recursive month generator guarantees a row for every month even when no ride or driver event happened in it. `ROW_NUMBER()` over `join_date` turns the driver list into a running headcount, and `MAX` per month picks the latest headcount visible by that month, while the DISTINCT ride count tolerates the NULL rows introduced by the LEFT JOIN.
+
+### Performance Notes
+
+Primary cost drivers are the window ordering over `join_date`, the recursive expansion (bounded at 12 rows), and the month joins. Indexes on `ride_id` and `join_date` typically provide the biggest speedup.
+
+### Common Pitfalls
+
+- `ac_rides` filters `EXTRACT(YEAR ...) <= 2020`, but `ride_details` matches on month number only, so an accepted ride from 2019 would be counted into the same-numbered month; filtering to exactly 2020 is safer.
+- Recursive CTEs need a strict termination condition (`num<=11` here) to avoid runaway recursion.
+
diff --git a/hard/1635. Hopper Company Queries I.sql deleted file mode 100644 index cdedc1d..0000000 --- a/hard/1635. 
Hopper Company Queries I.sql +++ /dev/null @@ -1,40 +0,0 @@ -WITH RECURSIVE ac_rides AS ( - SELECT ar.ride_id,r.requested_at - FROM accepted_rides_1635 ar - INNER JOIN rides_1635 r ON ar.ride_id = r.ride_id AND EXTRACT(YEAR FROM r.requested_at)<=2020 -), -months AS ( - SELECT 1 AS num - UNION - SELECT num+1 AS num - FROM months - WHERE num<=11 -), -ride_details AS ( - SELECT * - FROM months m - LEFT JOIN ac_rides ar ON EXTRACT(MONTH FROM ar.requested_at)=m.num -), -aggr_details AS ( - SELECT num,COUNT(DISTINCT ride_id) AS rides - FROM ride_details - GROUP BY num -), -avail_drivers AS ( - SELECT *, - ROW_NUMBER() OVER (ORDER BY join_date) AS drivers - FROM drivers_1635 -), -drivers_2020 AS ( - SELECT * - FROM months m - LEFT JOIN avail_drivers a ON EXTRACT(YEAR FROM a.join_date)=2020 AND m.num>=EXTRACT(MONTH FROM a.join_date) -), -driver_count AS ( - SELECT num,MAX(drivers) AS drivers - FROM drivers_2020 - GROUP BY num -) -SELECT a.num AS month,a.rides,d.drivers -FROM aggr_details a -INNER JOIN driver_count d ON a.num=d.num; diff --git a/hard/1645. Hopper Company Queries II.md b/hard/1645. Hopper Company Queries II.md new file mode 100644 index 0000000..425b023 --- /dev/null +++ b/hard/1645. Hopper Company Queries II.md @@ -0,0 +1,143 @@ +# Question 1645: Hopper Company Queries II + +**LeetCode URL:** https://leetcode.com/problems/hopper-company-queries-ii/ + +## Description + +Write an SQL query to report the percentage of working drivers (working_percentage) for each month of 2020 where: Note that if the number of available drivers during a month is zero, we consider the working_percentage to be 0. Return the result table ordered by month in ascending order, where month is the month's number (January is 1, February is 2, etc. The query result format is in the following example. 
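Two mechanics in this problem are easy to get wrong: the running driver headcount and integer division when computing the percentage. Here is a small sketch of both in Python with SQLite (assumptions: the repository targets PostgreSQL, and the table and values here are illustrative):

```python
import sqlite3

# Sketch of the running driver headcount used below, assuming SQLite.
# With an ORDER BY in the window, each row sees the count of all rows
# up to and including itself, i.e. the headcount as of that join_date.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE drivers (driver_id INT, join_date TEXT);
INSERT INTO drivers VALUES (10, '2019-12-10'), (8, '2020-01-13'), (5, '2020-02-16');
""")
rows = conn.execute("""
SELECT driver_id,
       COUNT(driver_id) OVER (ORDER BY join_date) AS drivers_cnt
FROM drivers
ORDER BY join_date;
""").fetchall()
print(rows)  # [(10, 1), (8, 2), (5, 3)]

# Multiplying by 100.0 before dividing forces non-integer division;
# December's 1 accepted ride over 6 available drivers rounds to 16.67,
# matching the expected output.
pct = conn.execute("SELECT ROUND(1 * 100.0 / 6, 2)").fetchone()[0]
print(pct)
```

Writing `1 * 100 / 6` instead would truncate to an integer in many engines, which is why the solution multiplies by `100.0`.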
+ +## Table Schema Structure + +```sql +Create table If Not Exists Drivers (driver_id int, join_date date); +Create table If Not Exists Rides (ride_id int, user_id int, requested_at date); +Create table If Not Exists AcceptedRides (ride_id int, driver_id int, ride_distance int, ride_duration int); +``` + +## Sample Input Data + +```sql +insert into Drivers (driver_id, join_date) values ('10', '2019-12-10'); +insert into Drivers (driver_id, join_date) values ('8', '2020-1-13'); +insert into Drivers (driver_id, join_date) values ('5', '2020-2-16'); +insert into Drivers (driver_id, join_date) values ('7', '2020-3-8'); +insert into Drivers (driver_id, join_date) values ('4', '2020-5-17'); +insert into Drivers (driver_id, join_date) values ('1', '2020-10-24'); +insert into Drivers (driver_id, join_date) values ('6', '2021-1-5'); +insert into Rides (ride_id, user_id, requested_at) values ('6', '75', '2019-12-9'); +insert into Rides (ride_id, user_id, requested_at) values ('1', '54', '2020-2-9'); +insert into Rides (ride_id, user_id, requested_at) values ('10', '63', '2020-3-4'); +insert into Rides (ride_id, user_id, requested_at) values ('19', '39', '2020-4-6'); +insert into Rides (ride_id, user_id, requested_at) values ('3', '41', '2020-6-3'); +insert into Rides (ride_id, user_id, requested_at) values ('13', '52', '2020-6-22'); +insert into Rides (ride_id, user_id, requested_at) values ('7', '69', '2020-7-16'); +insert into Rides (ride_id, user_id, requested_at) values ('17', '70', '2020-8-25'); +insert into Rides (ride_id, user_id, requested_at) values ('20', '81', '2020-11-2'); +insert into Rides (ride_id, user_id, requested_at) values ('5', '57', '2020-11-9'); +insert into Rides (ride_id, user_id, requested_at) values ('2', '42', '2020-12-9'); +insert into Rides (ride_id, user_id, requested_at) values ('11', '68', '2021-1-11'); +insert into Rides (ride_id, user_id, requested_at) values ('15', '32', '2021-1-17'); +insert into Rides (ride_id, user_id, requested_at) 
values ('12', '11', '2021-1-19'); +insert into Rides (ride_id, user_id, requested_at) values ('14', '18', '2021-1-27'); +insert into AcceptedRides (ride_id, driver_id, ride_distance, ride_duration) values ('10', '10', '63', '38'); +insert into AcceptedRides (ride_id, driver_id, ride_distance, ride_duration) values ('13', '10', '73', '96'); +insert into AcceptedRides (ride_id, driver_id, ride_distance, ride_duration) values ('7', '8', '100', '28'); +insert into AcceptedRides (ride_id, driver_id, ride_distance, ride_duration) values ('17', '7', '119', '68'); +insert into AcceptedRides (ride_id, driver_id, ride_distance, ride_duration) values ('20', '1', '121', '92'); +insert into AcceptedRides (ride_id, driver_id, ride_distance, ride_duration) values ('5', '7', '42', '101'); +insert into AcceptedRides (ride_id, driver_id, ride_distance, ride_duration) values ('2', '4', '6', '38'); +insert into AcceptedRides (ride_id, driver_id, ride_distance, ride_duration) values ('11', '8', '37', '43'); +... 
(truncated)
+```
+
+## Expected Output Data
+
+```text
++-------+--------------------+
+| month | working_percentage |
++-------+--------------------+
+| 1     | 0.00               |
+| 2     | 0.00               |
+| 3     | 25.00              |
+| 4     | 0.00               |
+| 5     | 0.00               |
+| 6     | 20.00              |
+| 7     | 20.00              |
+| 8     | 20.00              |
+| 9     | 0.00               |
+| 10    | 0.00               |
+| 11    | 33.33              |
+| 12    | 16.67              |
++-------+--------------------+
+```
+
+## SQL Solution
+
+```sql
+WITH RECURSIVE months AS (
+    SELECT 1 AS m
+    UNION
+    SELECT m+1 AS m
+    FROM months
+    WHERE m <= 11
+),
+accepted_rides_2020 AS (
+    SELECT mn.m,COUNT(ar.ride_id) AS accepted_rides
+    FROM accepted_rides_1645 ar
+    INNER JOIN rides_1645 r ON ar.ride_id = r.ride_id AND EXTRACT(year FROM r.requested_at)=2020
+    RIGHT JOIN months mn ON mn.m = EXTRACT(month FROM r.requested_at)
+    GROUP BY m
+),
+running_drivers AS (
+    SELECT *,
+           COUNT(driver_id) OVER (ORDER BY join_date) AS drivers_cnt
+    FROM drivers_1645
+),
+drivers AS (
+    SELECT mn.m,ar.accepted_rides,MAX(d.drivers_cnt) AS drivers
+    FROM running_drivers d
+    RIGHT JOIN months mn ON mn.m >= EXTRACT(month FROM d.join_date) AND EXTRACT(year FROM d.join_date)=2020
+    INNER JOIN accepted_rides_2020 ar ON ar.m=mn.m
+    GROUP BY mn.m,ar.accepted_rides
+)
+SELECT m AS month,ROUND(accepted_rides*100.0/drivers,2) AS working_percentage
+FROM drivers
+ORDER BY month;
+```
+
+## Solution Breakdown
+
+### Goal
+
+The query reports, for each month of 2020, the percentage of available drivers who were working that month (`working_percentage`), rounded to two decimals.
+
+### Result Grain
+
+One row per month (1 through 12).
+
+### Step-by-Step Logic
+
+1. Create CTE layers (`accepted_rides_2020`, `running_drivers`, `drivers`) to build the monthly counts step by step; the recursion expands the month list until the stop condition is met.
+2. CTE `months` recursively generates the month numbers 1 through 12 (guard `m <= 11`).
+3. CTE `accepted_rides_2020` joins accepted rides to their 2020 requests and RIGHT JOINs to `months`, yielding an accepted-ride count (possibly 0) for every month.
+4. CTE `running_drivers` attaches a running headcount to every driver row via `COUNT(driver_id) OVER (ORDER BY join_date)`.
+5. 
CTE `drivers` RIGHT JOINs `months` to the drivers who joined in 2020 on or before each month and takes `MAX(drivers_cnt)` per month, carrying that month's accepted-ride count alongside.
+6. The final SELECT divides accepted rides by available drivers, multiplying by 100.0 to force non-integer division, and rounds to two decimals.
+7. `ORDER BY month` returns the rows in ascending month order, as required.
+
+### Why This Works
+
+The recursive month generator guarantees a row for every month, and the RIGHT JOINs preserve months with zero rides or no new drivers. The running window count turns the driver list into an as-of headcount, and `MAX` per month picks the headcount visible by that month.
+
+### Performance Notes
+
+Primary cost drivers are the window ordering over `join_date`, the recursive expansion (bounded at 12 rows), and the month joins. Indexes on `ride_id` and `join_date` typically provide the biggest speedup.
+
+### Common Pitfalls
+
+- `COUNT(ar.ride_id)` counts accepted rides, not distinct working drivers; if one driver can accept several rides in a month, `COUNT(DISTINCT ar.driver_id)` matches the problem's definition.
+- If a month has no available drivers, the division yields NULL, whereas the problem defines the percentage as 0; a `COALESCE` guard would cover that case.
+- Recursive CTEs need a strict termination condition (`m <= 11` here) to avoid runaway recursion.
+
diff --git a/hard/1645. Hopper Company Queries II.sql deleted file mode 100644 index 879a132..0000000 --- a/hard/1645. 
Hopper Company Queries II.sql +++ /dev/null @@ -1,29 +0,0 @@ -WITH RECURSIVE months AS ( - SELECT 1 AS m - UNION - SELECT m+1 AS m - FROM months - WHERE m <= 11 -), -accepted_rides_2020 AS ( - SELECT mn.m,COUNT(ar.ride_id) AS accepted_rides - FROM accepted_rides_1645 ar - INNER JOIN rides_1645 r ON ar.ride_id = r.ride_id AND EXTRACT(year FROM r.requested_at)=2020 - RIGHT JOIN months mn ON mn.m = EXTRACT(month FROM r.requested_at) - GROUP BY m -), -running_drivers AS ( - SELECT *, - COUNT(driver_id) OVER (ORDER BY join_date) AS drivers_cnt - FROM drivers_1645 -), -drivers AS ( - SELECT mn.m,ar.accepted_rides,MAX(d.drivers_cnt) AS drivers - FROM running_drivers d - RIGHT JOIN months mn ON mn.m >= EXTRACT(month FROM d.join_date) AND EXTRACT(year FROM d.join_date)=2020 - INNER JOIN accepted_rides_2020 ar ON ar.m=mn.m - GROUP BY mn.m,ar.accepted_rides -) -SELECT m AS month,ROUND(accepted_rides*100.0/drivers,2) AS working_percentage -FROM drivers -ORDER BY month; diff --git a/hard/1651. Hopper Company Queries III.md b/hard/1651. Hopper Company Queries III.md new file mode 100644 index 0000000..be3ad78 --- /dev/null +++ b/hard/1651. Hopper Company Queries III.md @@ -0,0 +1,133 @@ +# Question 1651: Hopper Company Queries III + +**LeetCode URL:** https://leetcode.com/problems/hopper-company-queries-iii/ + +## Description + +Write an SQL query to compute the average_ride_distance and average_ride_duration of every 3-month window starting from January - March 2020 to October - December 2020. Return the result table ordered by month in ascending order, where month is the starting month's number (January is 1, February is 2, etc. The query result format is in the following example. 
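The 3-month windows in this problem map directly onto a window frame. Here is a minimal sketch of `ROWS BETWEEN CURRENT ROW AND 2 FOLLOWING` in Python with SQLite (an assumption: the repository targets PostgreSQL, and the `monthly` table here is illustrative):

```python
import sqlite3

# Sketch of the sliding 3-month frame used below, assuming SQLite.
# ROWS BETWEEN CURRENT ROW AND 2 FOLLOWING averages each month with the
# two months after it; near the end of the series the frame shrinks
# because there are fewer than two following rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE monthly (m INT, dist INT);
INSERT INTO monthly VALUES (1, 10), (2, 20), (3, 30), (4, 40);
""")
rows = conn.execute("""
SELECT m,
       ROUND(AVG(dist) OVER (ORDER BY m ROWS BETWEEN CURRENT ROW AND 2 FOLLOWING), 2) AS avg3
FROM monthly
ORDER BY m;
""").fetchall()
print(rows)  # [(1, 20.0), (2, 30.0), (3, 35.0), (4, 40.0)]
```

The shrinking frames at the tail are why the last starting months must be filtered out when the problem only asks for full 3-month windows.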
+ +## Table Schema Structure + +```sql +Create table If Not Exists Drivers (driver_id int, join_date date); +Create table If Not Exists Rides (ride_id int, user_id int, requested_at date); +Create table If Not Exists AcceptedRides (ride_id int, driver_id int, ride_distance int, ride_duration int); +``` + +## Sample Input Data + +```sql +insert into Drivers (driver_id, join_date) values ('10', '2019-12-10'); +insert into Drivers (driver_id, join_date) values ('8', '2020-1-13'); +insert into Drivers (driver_id, join_date) values ('5', '2020-2-16'); +insert into Drivers (driver_id, join_date) values ('7', '2020-3-8'); +insert into Drivers (driver_id, join_date) values ('4', '2020-5-17'); +insert into Drivers (driver_id, join_date) values ('1', '2020-10-24'); +insert into Drivers (driver_id, join_date) values ('6', '2021-1-5'); +insert into Rides (ride_id, user_id, requested_at) values ('6', '75', '2019-12-9'); +insert into Rides (ride_id, user_id, requested_at) values ('1', '54', '2020-2-9'); +insert into Rides (ride_id, user_id, requested_at) values ('10', '63', '2020-3-4'); +insert into Rides (ride_id, user_id, requested_at) values ('19', '39', '2020-4-6'); +insert into Rides (ride_id, user_id, requested_at) values ('3', '41', '2020-6-3'); +insert into Rides (ride_id, user_id, requested_at) values ('13', '52', '2020-6-22'); +insert into Rides (ride_id, user_id, requested_at) values ('7', '69', '2020-7-16'); +insert into Rides (ride_id, user_id, requested_at) values ('17', '70', '2020-8-25'); +insert into Rides (ride_id, user_id, requested_at) values ('20', '81', '2020-11-2'); +insert into Rides (ride_id, user_id, requested_at) values ('5', '57', '2020-11-9'); +insert into Rides (ride_id, user_id, requested_at) values ('2', '42', '2020-12-9'); +insert into Rides (ride_id, user_id, requested_at) values ('11', '68', '2021-1-11'); +insert into Rides (ride_id, user_id, requested_at) values ('15', '32', '2021-1-17'); +insert into Rides (ride_id, user_id, requested_at) 
values ('12', '11', '2021-1-19'); +insert into Rides (ride_id, user_id, requested_at) values ('14', '18', '2021-1-27'); +insert into AcceptedRides (ride_id, driver_id, ride_distance, ride_duration) values ('10', '10', '63', '38'); +insert into AcceptedRides (ride_id, driver_id, ride_distance, ride_duration) values ('13', '10', '73', '96'); +insert into AcceptedRides (ride_id, driver_id, ride_distance, ride_duration) values ('7', '8', '100', '28'); +insert into AcceptedRides (ride_id, driver_id, ride_distance, ride_duration) values ('17', '7', '119', '68'); +insert into AcceptedRides (ride_id, driver_id, ride_distance, ride_duration) values ('20', '1', '121', '92'); +insert into AcceptedRides (ride_id, driver_id, ride_distance, ride_duration) values ('5', '7', '42', '101'); +insert into AcceptedRides (ride_id, driver_id, ride_distance, ride_duration) values ('2', '4', '6', '38'); +insert into AcceptedRides (ride_id, driver_id, ride_distance, ride_duration) values ('11', '8', '37', '43'); +... 
(truncated)
+```
+
+## Expected Output Data
+
+```text
++-------+-----------------------+-----------------------+
+| month | average_ride_distance | average_ride_duration |
++-------+-----------------------+-----------------------+
+| 1     | 21.00                 | 12.67                 |
+| 2     | 21.00                 | 12.67                 |
+| 3     | 21.00                 | 12.67                 |
+| 4     | 24.33                 | 32.00                 |
+| 5     | 57.67                 | 41.33                 |
+| 6     | 97.33                 | 64.00                 |
+| 7     | 73.00                 | 32.00                 |
+| 8     | 39.67                 | 22.67                 |
+| 9     | 54.33                 | 64.33                 |
+| 10    | 56.33                 | 77.00                 |
++-------+-----------------------+-----------------------+
+```
+
+## SQL Solution
+
+```sql
+WITH RECURSIVE months AS (
+    SELECT 1 AS m
+    UNION
+    SELECT m+1 AS m
+    FROM months
+    WHERE m<=11
+),
+cte AS (
+    SELECT mn.m,SUM(COALESCE(ar.ride_distance,0)) AS ride_distance,SUM(COALESCE(ar.ride_duration,0)) AS ride_duration
+    FROM accepted_rides_1651 ar
+    INNER JOIN rides_1651 r ON ar.ride_id = r.ride_id AND EXTRACT(year FROM requested_at) = 2020
+    RIGHT JOIN months mn ON mn.m = EXTRACT(month FROM requested_at)
+    GROUP BY mn.m
+)
+SELECT month,average_ride_distance,average_ride_duration
+FROM (
+    SELECT m AS month,
+           ROUND(AVG(ride_distance) OVER (ORDER BY m ROWS BETWEEN CURRENT ROW AND 2 FOLLOWING),2) AS average_ride_distance,
+           ROUND(AVG(ride_duration) OVER (ORDER BY m ROWS BETWEEN CURRENT ROW AND 2 FOLLOWING),2) AS average_ride_duration
+    FROM cte
+) windows
+WHERE month <= 10
+ORDER BY month;
+```
+
+## Solution Breakdown
+
+### Goal
+
+The query computes, for each starting month 1 through 10, `average_ride_distance` and `average_ride_duration` over the 3-month window beginning in that month, rounded to two decimals.
+
+### Result Grain
+
+One row per starting month (1 through 10).
+
+### Step-by-Step Logic
+
+1. A recursive CTE `months` generates the month numbers 1 through 12, so every month appears even if it had no rides.
+2. CTE `cte` joins accepted rides to their 2020 requests and sums `ride_distance`/`ride_duration` per month, with `COALESCE(...,0)` turning empty months into zeros.
+3. The outer query averages each month's totals over the window frame `ROWS BETWEEN CURRENT ROW AND 2 FOLLOWING` (the month itself plus the next two) and keeps starting months 1 through 10.
+4. Combine datasets using RIGHT JOIN and INNER JOIN. 
Join predicates control row matching and prevent accidental cartesian growth.
+5. The `AVG(...) OVER (ORDER BY m ROWS BETWEEN CURRENT ROW AND 2 FOLLOWING)` window averages each month with the two months that follow it, without collapsing the monthly rows.
+6. Project the final output columns: the starting month, `average_ride_distance`, and `average_ride_duration`, each rounded to two decimals.
+7. Order the output by month in ascending order.
+
+### Why This Works
+
+The recursive month generator guarantees a row for every month, so each 3-month frame sees exactly the months it should, with empty months contributing zeros rather than missing rows. The frame `CURRENT ROW AND 2 FOLLOWING` is precisely the problem's "this month plus the next two" window definition.
+
+### Performance Notes
+
+Primary cost drivers are the grouping, the window frames, the (bounded) recursive expansion, and the joins. Indexes on `ride_id` and `requested_at` typically provide the biggest speedup.
+
+### Common Pitfalls
+
+- Months 11 and 12 have incomplete frames (fewer than three months of data); the problem only asks for starting months 1 through 10, so those rows must be filtered out after the window is computed.
+- `ROWS` frames count physical rows, which is only correct here because the month list is dense and duplicate-free.
+- Recursive CTEs need a strict termination condition to avoid runaway recursion.
+
diff --git a/hard/1651. Hopper Company Queries III.sql deleted file mode 100644 index 9e82608..0000000 --- a/hard/1651. 
Hopper Company Queries III.sql +++ /dev/null @@ -1,23 +0,0 @@ -WITH RECURSIVE months AS ( - select 1 as m - union - select m+1 as m - from months - where m<=11 -), -cte AS ( - SELECT mn.m,SUM(COALESCE(ar.ride_distance,0)) AS ride_distance,SUM(COALESCE(ar.ride_duration,0)) AS ride_duration, - CASE WHEN m BETWEEN 1 AND 3 THEN 'q1' - WHEN m BETWEEN 4 AND 6 THEN 'q2' - WHEN m BETWEEN 7 AND 9 THEN 'q3' - ELSE 'q4' - END AS quater - FROM accepted_rides_1651 ar - INNER JOIN rides_1651 r ON ar.ride_id = r.ride_id AND EXTRACT(year FROM requested_at) = 2020 - RIGHT JOIN months mn ON mn.m = EXTRACT(month FROM requested_at) - GROUP BY mn.m -) -SELECT m, - ROUND(AVG(ride_distance) OVER (ORDER BY m ROWS BETWEEN CURRENT ROW AND 2 FOLLOWING),2) AS average_ride_distance, - ROUND(AVG(ride_duration) OVER (ORDER BY m ROWS BETWEEN CURRENT ROW AND 2 FOLLOWING),2) AS average_ride_duration -FROM cte; diff --git a/hard/1767. Find the Subtasks That Did Not Execute.md b/hard/1767. Find the Subtasks That Did Not Execute.md new file mode 100644 index 0000000..8a553b4 --- /dev/null +++ b/hard/1767. Find the Subtasks That Did Not Execute.md @@ -0,0 +1,90 @@ +# Question 1767: Find the Subtasks That Did Not Execute + +**LeetCode URL:** https://leetcode.com/problems/find-the-subtasks-that-did-not-execute/ + +## Description + +Write an SQL query to report the IDs of the missing subtasks for each task_id. Return the result table in any order. 
The query result format is in the following example: Tasks table: +---------+----------------+ | task_id | subtasks_count | +---------+----------------+ | 1 | 3 | | 2 | 2 | | 3 | 4 | +---------+----------------+ Executed table: +---------+------------+ | task_id | subtask_id | +---------+------------+ | 1 | 2 | | 3 | 1 | | 3 | 2 | | 3 | 3 | | 3 | 4 | +---------+------------+ Result table: +---------+------------+ | task_id | subtask_id | +---------+------------+ | 1 | 1 | | 1 | 3 | | 2 | 1 | | 2 | 2 | +---------+------------+ Task 1 was divided into 3 subtasks (1, 2, 3). + +## Table Schema Structure + +```sql +Create table If Not Exists Tasks (task_id int, subtasks_count int); +Create table If Not Exists Executed (task_id int, subtask_id int); +``` + +## Sample Input Data + +```sql +insert into Tasks (task_id, subtasks_count) values ('1', '3'); +insert into Tasks (task_id, subtasks_count) values ('2', '2'); +insert into Tasks (task_id, subtasks_count) values ('3', '4'); +insert into Executed (task_id, subtask_id) values ('1', '2'); +insert into Executed (task_id, subtask_id) values ('3', '1'); +insert into Executed (task_id, subtask_id) values ('3', '2'); +insert into Executed (task_id, subtask_id) values ('3', '3'); +insert into Executed (task_id, subtask_id) values ('3', '4'); +``` + +## Expected Output Data + +```text ++---------+------------+ +| task_id | subtask_id | ++---------+------------+ +| 1 | 1 | +| 1 | 3 | +| 2 | 1 | +| 2 | 2 | ++---------+------------+ +``` + +## SQL Solution + +```sql +WITH RECURSIVE all_subtasks AS ( + SELECT task_id,subtasks_count,1 AS subtask_id + FROM tasks_1767 + UNION ALL + SELECT task_id,subtasks_count,subtask_id+1 AS subtask_id + FROM all_subtasks + WHERE subtask_id l2.page_id +) +SELECT user1_id,page_id,COUNT(DISTINCT user2_id) AS friends_likes +FROM possible_recommendation +WHERE (user1_id,page_id) NOT IN (SELECT * FROM likes_1892) +GROUP BY user1_id,page_id +ORDER BY user1_id,page_id; +``` + +## Solution Breakdown + +### 
Goal + +The query builds the final result columns `user1_id`, `page_id`, `friends_likes` from `friendship`, `friends`, `likes`, `possible_recommendation`. + +### Result Grain + +One row per unique key in `GROUP BY user1_id,page_id`. + +### Step-by-Step Logic + +1. Create CTE layers (`friends`, `possible_recommendation`) to decompose the logic into smaller, testable steps before the final SELECT. +2. CTE `friends`: reads `friendship`. +3. CTE `possible_recommendation`: reads `friends`, `likes`, joins related entities. +4. Combine datasets using INNER JOIN, JOIN. Join predicates control row matching and prevent accidental cartesian growth. +5. Apply row-level filtering in `WHERE`: (user1_id,page_id) NOT IN (SELECT * FROM likes_1892). +6. Aggregate rows with COUNT grouped by user1_id,page_id. +7. Project final output columns: `user1_id`, `page_id`, `friends_likes`. +8. Merge compatible result sets with `UNION`/`UNION ALL` before final projection. +9. Order output deterministically with `ORDER BY user1_id,page_id`. + +### Why This Works + +CTEs separate transformation stages, which makes dependencies explicit and easier to validate. Join conditions align related entities so each output row is built from the correct source records. Grouping keys define the reporting grain; aggregate functions then summarize values at exactly that grain. Predicate filtering removes irrelevant rows before expensive downstream computation. Set-union logic combines multiple valid pathways into one consistent output. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are sorting/grouping, join operations, subquery execution. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently. +- Every non-aggregated selected column must belong to the grouping grain. 
+- If nullable columns are involved, validate null-handling explicitly (`IS NULL`, `COALESCE`). + diff --git a/hard/1892. Page Recommendations II (Hard).sql b/hard/1892. Page Recommendations II (Hard).sql deleted file mode 100644 index c5aa3c4..0000000 --- a/hard/1892. Page Recommendations II (Hard).sql +++ /dev/null @@ -1,18 +0,0 @@ -WITH friends AS ( - SELECT user1_id,user2_id - FROM friendship_1892 - UNION ALL - SELECT user2_id,user1_id - FROM friendship_1892 -), -possible_recommendation AS ( - SELECT f.user1_id,f.user2_id,l2.page_id - FROM friends f - INNER JOIN likes_1892 l1 ON f.user1_id = l1.user_id - INNER JOIN likes_1892 l2 ON f.user2_id = l2.user_id AND l1.page_id <> l2.page_id -) -SELECT user1_id,page_id,COUNT(DISTINCT user2_id) AS friends_likes -FROM possible_recommendation -WHERE (user1_id,page_id) NOT IN (SELECT * FROM likes_1892) -GROUP BY user1_id,page_id -ORDER BY user1_id,page_id; diff --git a/hard/1917. Leetcodify Friends Recommendations (Hard).md b/hard/1917. Leetcodify Friends Recommendations (Hard).md new file mode 100644 index 0000000..1863473 --- /dev/null +++ b/hard/1917. Leetcodify Friends Recommendations (Hard).md @@ -0,0 +1,103 @@ +# Question 1917: Leetcodify Friends Recommendations + +**LeetCode URL:** https://leetcode.com/problems/leetcodify-friends-recommendations/ + +## Description + +Drafted from this solution SQL: write a query on `listens`, `friendship`, `all_recommendations`, `friends` to return `user_id1`, `user_id2`. Apply filter conditions: f.user1_id IS NULL. Group results by: l1.user_id,l2.user_id HAVING COUNT(l1.song_id)>=3 ), friends AS ( SELECT user1_id,user2_id FROM friendship_1917 UNION SELECT user2_id,user1_id FROM friendship_1917 ) SELECT r.user_id1,r.user_id2 FROM all_recommendations r LEFT JOIN friends f ON f.user1_id = r.user_id1 AND f.user2_id = r.user_id2 WHERE f.user1_id IS NULL. 
Keep groups satisfying: COUNT(l1.song_id)>=3 ), friends AS ( SELECT user1_id,user2_id FROM friendship_1917 UNION SELECT user2_id,user1_id FROM friendship_1917 ) SELECT r.user_id1,r.user_id2 FROM all_recommendations r LEFT JOIN friends f ON f.user1_id = r.user_id1 AND f.user2_id = r.user_id2 WHERE f.user1_id IS NULL. + +## Table Schema Structure + +```sql +Create table If Not Exists Listens (user_id int, song_id int, day date); +Create table If Not Exists Friendship (user1_id int, user2_id int); +``` + +## Sample Input Data + +```sql +insert into Listens (user_id, song_id, day) values ('1', '10', '2021-03-15'); +insert into Listens (user_id, song_id, day) values ('1', '11', '2021-03-15'); +insert into Listens (user_id, song_id, day) values ('1', '12', '2021-03-15'); +insert into Listens (user_id, song_id, day) values ('2', '10', '2021-03-15'); +insert into Listens (user_id, song_id, day) values ('2', '11', '2021-03-15'); +insert into Listens (user_id, song_id, day) values ('2', '12', '2021-03-15'); +insert into Listens (user_id, song_id, day) values ('3', '10', '2021-03-15'); +insert into Listens (user_id, song_id, day) values ('3', '11', '2021-03-15'); +insert into Listens (user_id, song_id, day) values ('3', '12', '2021-03-15'); +insert into Listens (user_id, song_id, day) values ('4', '10', '2021-03-15'); +insert into Listens (user_id, song_id, day) values ('4', '11', '2021-03-15'); +insert into Listens (user_id, song_id, day) values ('4', '13', '2021-03-15'); +insert into Listens (user_id, song_id, day) values ('5', '10', '2021-03-16'); +insert into Listens (user_id, song_id, day) values ('5', '11', '2021-03-16'); +insert into Listens (user_id, song_id, day) values ('5', '12', '2021-03-16'); +insert into Friendship (user1_id, user2_id) values ('1', '2'); +``` + +## Expected Output Data + +```text ++----------+----------+ +| user_id1 | user_id2 | ++----------+----------+ +| sample | sample | ++----------+----------+ +``` + +## SQL Solution + +```sql +WITH 
all_recommendations AS ( + SELECT l1.user_id AS user_id1,l2.user_id AS user_id2,COUNT(l1.song_id) AS listened_songs + FROM listens_1917 l1 + INNER JOIN listens_1917 l2 + ON l1.user_id <> l2.user_id AND + l1.song_id = l2.song_id AND + l1.day = l2.day + GROUP BY l1.user_id,l2.user_id + HAVING COUNT(l1.song_id)>=3 +), +friends AS ( + SELECT user1_id,user2_id + FROM friendship_1917 + UNION + SELECT user2_id,user1_id + FROM friendship_1917 +) +SELECT r.user_id1,r.user_id2 +FROM all_recommendations r +LEFT JOIN friends f ON f.user1_id = r.user_id1 AND f.user2_id = r.user_id2 +WHERE f.user1_id IS NULL; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `user_id1`, `user_id2` from `listens`, `friendship`, `all_recommendations`, `friends`. + +### Result Grain + +Row grain follows the post-filter join output. + +### Step-by-Step Logic + +1. Create CTE layers (`all_recommendations`, `friends`) to decompose the logic into smaller, testable steps before the final SELECT. +2. CTE `all_recommendations`: reads `listens`, joins related entities. +3. CTE `friends`: reads `friendship`. +4. Combine datasets using LEFT JOIN, INNER JOIN, JOIN. Join predicates control row matching and prevent accidental cartesian growth. +5. Apply row-level filtering in `WHERE`: f.user1_id IS NULL. +6. Project final output columns: `user_id1`, `user_id2`. + +### Why This Works + +CTEs separate transformation stages, which makes dependencies explicit and easier to validate. Join conditions align related entities so each output row is built from the correct source records. Predicate filtering removes irrelevant rows before expensive downstream computation. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are join operations. Indexes on join/filter/group keys typically provide the biggest speedup. 
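
The `LEFT JOIN ... WHERE f.user1_id IS NULL` anti-join that produces the final rows can be sketched with an in-memory SQLite session; the table names and sample rows below are invented purely for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE recommendations (user_id1 INT, user_id2 INT);
CREATE TABLE friends (user1_id INT, user2_id INT);
INSERT INTO recommendations VALUES (1, 2), (1, 3), (2, 4);
INSERT INTO friends VALUES (1, 2);  -- pair (1,2) is already friends
""")

# Anti-join: keep only recommendation pairs with no matching friends row.
rows = conn.execute("""
    SELECT r.user_id1, r.user_id2
    FROM recommendations r
    LEFT JOIN friends f
      ON f.user1_id = r.user_id1 AND f.user2_id = r.user_id2
    WHERE f.user1_id IS NULL
    ORDER BY r.user_id1, r.user_id2
""").fetchall()
print(rows)  # [(1, 3), (2, 4)]
```

The already-friends pair (1, 2) joins successfully, so its `f.user1_id` is non-null and the `WHERE` clause drops it; unmatched pairs survive with all `f` columns null.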
+ +### Common Pitfalls + +- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently. + diff --git a/hard/1917. Leetcodify Friends Recommendations (Hard).sql b/hard/1917. Leetcodify Friends Recommendations (Hard).sql deleted file mode 100644 index 6b8ccea..0000000 --- a/hard/1917. Leetcodify Friends Recommendations (Hard).sql +++ /dev/null @@ -1,22 +0,0 @@ -WITH all_recommendations AS ( - SELECT l1.user_id AS user_id1,l2.user_id AS user_id2,COUNT(l1.song_id) AS listened_songs - FROM listens_1917 l1 - INNER JOIN listens_1917 l2 - ON l1.user_id <> l2.user_id AND - l1.song_id = l2.song_id AND - l1.day = l2.day - GROUP BY l1.user_id,l2.user_id - HAVING COUNT(l1.song_id)>=3 -), -friends AS ( - SELECT user1_id,user2_id - FROM friendship_1917 - UNION - SELECT user2_id,user1_id - FROM friendship_1917 -) -SELECT r.user_id1,r.user_id2 -FROM all_recommendations r -LEFT JOIN friends f ON f.user1_id = r.user_id1 AND f.user2_id = r.user_id2 -WHERE f.user1_id IS NULL; - diff --git a/hard/1919. Leetcodify Similar Friends (Hard).md b/hard/1919. Leetcodify Similar Friends (Hard).md new file mode 100644 index 0000000..2eb25e0 --- /dev/null +++ b/hard/1919. Leetcodify Similar Friends (Hard).md @@ -0,0 +1,94 @@ +# Question 1919: Leetcodify Similar Friends + +**LeetCode URL:** https://leetcode.com/problems/leetcodify-similar-friends/ + +## Description + +Write an SQL query to report the pairs of users that are similar friends: users x and y are similar friends if they are friends and they listened to at least three of the same songs on the same day. Return the result table in any order. 
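
The core pattern, a self-join on song and day followed by `HAVING COUNT(...) >= 3`, can be sketched in an in-memory SQLite session with invented sample rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE listens (user_id INT, song_id INT, day TEXT);
INSERT INTO listens VALUES
  (1, 10, '2021-03-15'), (1, 11, '2021-03-15'), (1, 12, '2021-03-15'),
  (2, 10, '2021-03-15'), (2, 11, '2021-03-15'), (2, 12, '2021-03-15'),
  (4, 10, '2021-03-15'), (4, 11, '2021-03-15'), (4, 13, '2021-03-15');
""")

# Self-join on (song_id, day); user_id < user_id avoids self-pairs and
# counting each pair twice. HAVING keeps pairs with >= 3 shared songs.
pairs = conn.execute("""
    SELECT l1.user_id, l2.user_id, COUNT(l1.song_id) AS shared
    FROM listens l1
    JOIN listens l2
      ON l1.user_id < l2.user_id
     AND l1.song_id = l2.song_id
     AND l1.day = l2.day
    GROUP BY l1.user_id, l2.user_id
    HAVING COUNT(l1.song_id) >= 3
""").fetchall()
print(pairs)  # [(1, 2, 3)]
```

Users 1 and 2 share three songs on the same day and qualify; user 4 shares only two with each of them and is filtered out by the `HAVING` clause.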
+ +## Table Schema Structure + +```sql +Create table If Not Exists Listens (user_id int, song_id int, day date); +Create table If Not Exists Friendship (user1_id int, user2_id int); +``` + +## Sample Input Data + +```sql +insert into Listens (user_id, song_id, day) values ('1', '10', '2021-03-15'); +insert into Listens (user_id, song_id, day) values ('1', '11', '2021-03-15'); +insert into Listens (user_id, song_id, day) values ('1', '12', '2021-03-15'); +insert into Listens (user_id, song_id, day) values ('2', '10', '2021-03-15'); +insert into Listens (user_id, song_id, day) values ('2', '11', '2021-03-15'); +insert into Listens (user_id, song_id, day) values ('2', '12', '2021-03-15'); +insert into Listens (user_id, song_id, day) values ('3', '10', '2021-03-15'); +insert into Listens (user_id, song_id, day) values ('3', '11', '2021-03-15'); +insert into Listens (user_id, song_id, day) values ('3', '12', '2021-03-15'); +insert into Listens (user_id, song_id, day) values ('4', '10', '2021-03-15'); +insert into Listens (user_id, song_id, day) values ('4', '11', '2021-03-15'); +insert into Listens (user_id, song_id, day) values ('4', '13', '2021-03-15'); +insert into Listens (user_id, song_id, day) values ('5', '10', '2021-03-16'); +insert into Listens (user_id, song_id, day) values ('5', '11', '2021-03-16'); +insert into Listens (user_id, song_id, day) values ('5', '12', '2021-03-16'); +insert into Friendship (user1_id, user2_id) values ('1', '2'); +insert into Friendship (user1_id, user2_id) values ('2', '4'); +insert into Friendship (user1_id, user2_id) values ('2', '5'); +``` + +## Expected Output Data + +```text ++------------------+ +| result | ++------------------+ +| derived values | ++------------------+ +``` + +## SQL Solution + +```sql +WITH similar_friends AS( + SELECT l1.user_id AS user_id1,l2.user_id AS user_id2,COUNT(l1.song_id) + FROM listens_1919 l1 + INNER JOIN listens_1919 l2 + ON l1.user_id < l2.user_id AND + l1.song_id = l2.song_id AND + l1.day = 
l2.day + GROUP BY l1.user_id,l2.user_id + HAVING COUNT(l1.song_id) >= 3 +) +SELECT f.* +FROM similar_friends sf +INNER JOIN friendship_1919 f ON sf.user_id1 = f.user1_id AND sf.user_id2 = f.user2_id; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns the required output columns from `listens`, `similar_friends`, `friendship`. + +### Result Grain + +Row grain follows the post-filter join output. + +### Step-by-Step Logic + +1. Create CTE layers (`similar_friends`) to decompose the logic into smaller, testable steps before the final SELECT. +2. CTE `similar_friends`: reads `listens`, joins related entities. +3. Combine datasets using INNER JOIN, JOIN. Join predicates control row matching and prevent accidental cartesian growth. + +### Why This Works + +CTEs separate transformation stages, which makes dependencies explicit and easier to validate. Join conditions align related entities so each output row is built from the correct source records. + +### Performance Notes + +Primary cost drivers are join operations. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently. + diff --git a/hard/1919. Leetcodify Similar Friends (Hard).sql b/hard/1919. Leetcodify Similar Friends (Hard).sql deleted file mode 100644 index 7ca08f7..0000000 --- a/hard/1919. Leetcodify Similar Friends (Hard).sql +++ /dev/null @@ -1,13 +0,0 @@ -WITH similar_friends AS( - SELECT l1.user_id AS user_id1,l2.user_id AS user_id2,COUNT(l1.song_id) - FROM listens_1919 l1 - INNER JOIN listens_1919 l2 - ON l1.user_id < l2.user_id AND - l1.song_id = l2.song_id AND - l1.day = l2.day - GROUP BY l1.user_id,l2.user_id - HAVING COUNT(l1.song_id) >= 3 -) -SELECT f.* -FROM similar_friends sf -INNER JOIN friendship_1919 f ON sf.user_id1 = f.user1_id AND sf.user_id2 = f.user2_id; diff --git a/hard/1972. 
First and Last Call On the Same Day (Hard).md b/hard/1972. First and Last Call On the Same Day (Hard).md new file mode 100644 index 0000000..4dc390e --- /dev/null +++ b/hard/1972. First and Last Call On the Same Day (Hard).md @@ -0,0 +1,93 @@ +# Question 1972: First and Last Call On the Same Day + +**LeetCode URL:** https://leetcode.com/problems/first-and-last-call-on-the-same-day/ + +## Description + +Drafted from this solution SQL: write a query on `calls`, `call_time`, `first_last_calls` to return `user_id`. + +## Table Schema Structure + +```sql +Create table If Not Exists Calls (caller_id int, recipient_id int, call_time datetime); +``` + +## Sample Input Data + +```sql +insert into Calls (caller_id, recipient_id, call_time) values ('8', '4', '2021-08-24 17:46:07'); +insert into Calls (caller_id, recipient_id, call_time) values ('4', '8', '2021-08-24 19:57:13'); +insert into Calls (caller_id, recipient_id, call_time) values ('5', '1', '2021-08-11 05:28:44'); +insert into Calls (caller_id, recipient_id, call_time) values ('8', '3', '2021-08-17 04:04:15'); +insert into Calls (caller_id, recipient_id, call_time) values ('11', '3', '2021-08-17 13:07:00'); +insert into Calls (caller_id, recipient_id, call_time) values ('8', '11', '2021-08-17 22:22:22'); +``` + +## Expected Output Data + +```text ++---------+ +| user_id | ++---------+ +| sample | ++---------+ +``` + +## SQL Solution + +```sql +WITH calls AS ( + SELECT caller_id,recipient_id,call_time + FROM calls_1972 + UNION + SELECT recipient_id,caller_id,call_time + FROM calls_1972 +), +first_last_calls AS ( + SELECT *, + MIN(call_time) OVER (PARTITION BY caller_id,EXTRACT(DAY FROM call_time)) AS first_call, + MAX(call_time) OVER (PARTITION BY caller_id,EXTRACT(DAY FROM call_time)) AS last_call + FROM calls +) +SELECT DISTINCT f.caller_id AS user_id +FROM first_last_calls f +INNER JOIN first_last_calls l +ON f.caller_id = l.caller_id AND + f.recipient_id = l.recipient_id AND + f.call_time = f.first_call AND + 
l.call_time = l.last_call; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result column `user_id` from the `calls` table, staged through the `calls` and `first_last_calls` CTEs. + +### Result Grain + +One row per distinct combination of projected columns. + +### Step-by-Step Logic + +1. Create CTE layers (`calls`, `first_last_calls`) to decompose the logic into smaller, testable steps before the final SELECT. +2. CTE `calls`: unions both directions of each call so every participant appears as a caller. +3. CTE `first_last_calls`: computes each user's first and last `call_time` per day with windowed `MIN`/`MAX`. +4. Combine datasets using INNER JOIN, JOIN. Join predicates control row matching and prevent accidental cartesian growth. +5. Use window functions (`... OVER (...)`) to compute rankings/running metrics while preserving row-level detail. +6. Project final output columns: `user_id`. +7. Remove duplicate result tuples using `DISTINCT` where uniqueness is required. + +### Why This Works + +CTEs separate transformation stages, which makes dependencies explicit and easier to validate. Join conditions align related entities so each output row is built from the correct source records. Window expressions calculate comparative metrics without collapsing rows too early. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are window partitions, join operations. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently. +- Window `PARTITION BY`/`ORDER BY` choices change semantics; test with tied values and nulls. + diff --git a/hard/1972. First and Last Call On the Same Day (Hard).sql b/hard/1972. First and Last Call On the Same Day (Hard).sql deleted file mode 100644 index 14ef638..0000000 --- a/hard/1972. 
First and Last Call On the Same Day (Hard).sql +++ /dev/null @@ -1,20 +0,0 @@ -WITH calls AS ( - SELECT caller_id,recipient_id,call_time - FROM calls_1972 - UNION - SELECT recipient_id,caller_id,call_time - FROM calls_1972 -), -first_last_calls AS ( - SELECT *, - MIN(call_time) OVER (PARTITION BY caller_id,EXTRACT(DAY FROM call_time)) AS first_call, - MAX(call_time) OVER (PARTITION BY caller_id,EXTRACT(DAY FROM call_time)) AS last_call - FROM calls -) -SELECT DISTINCT f.caller_id AS user_id -FROM first_last_calls f -INNER JOIN first_last_calls l -ON f.caller_id = l.caller_id AND - f.recipient_id = l.recipient_id AND - f.call_time = f.first_call AND - l.call_time = l.last_call; diff --git a/hard/2004. The Number of Seniors and Juniors to Join the Company (Hard).md b/hard/2004. The Number of Seniors and Juniors to Join the Company (Hard).md new file mode 100644 index 0000000..0766ec2 --- /dev/null +++ b/hard/2004. The Number of Seniors and Juniors to Join the Company (Hard).md @@ -0,0 +1,115 @@ +# Question 2004: The Number of Seniors and Juniors to Join the Company + +**LeetCode URL:** https://leetcode.com/problems/the-number-of-seniors-and-juniors-to-join-the-company/ + +## Description + +Drafted from this solution SQL: write a query on `candidates_2004_tc`, `seniors`, `left_budget`, `juniors`, `hired_candidates` to return `NULL`, `experience`. Apply filter conditions: remaining_budget >= 0 UNION SELECT NULL,'Senior' UNION SELECT NULL,'Junior' ) SELECT experience,COALESCE(COUNT(employee_id),0) FROM hired_candidates GROUP BY experience. Group results by: experience. 
Order the final output by: c.salary,c.employee_id) AS remaining_budget FROM candidates_2004_tc_2 c CROSS JOIN left_budget lb WHERE experience = 'Junior' ), hired_candidates AS ( SELECT employee_id,experience FROM juniors WHERE remaining_budget >= 0 UNION SELECT employee_id,experience FROM seniors WHERE remaining_budget >= 0 UNION SELECT NULL,'Senior' UNION SELECT NULL,'Junior' ) SELECT experience,COALESCE(COUNT(employee_id),0) FROM hired_candidates GROUP BY experience. + +## Table Schema Structure + +```sql +Create table If Not Exists Candidates (employee_id int, experience ENUM('Senior', 'Junior'), salary int); +``` + +## Sample Input Data + +```sql +insert into Candidates (employee_id, experience, salary) values ('1', 'Junior', '10000'); +insert into Candidates (employee_id, experience, salary) values ('9', 'Junior', '10000'); +insert into Candidates (employee_id, experience, salary) values ('2', 'Senior', '20000'); +insert into Candidates (employee_id, experience, salary) values ('11', 'Senior', '20000'); +insert into Candidates (employee_id, experience, salary) values ('13', 'Senior', '50000'); +insert into Candidates (employee_id, experience, salary) values ('4', 'Junior', '40000'); +``` + +## Expected Output Data + +```text ++--------+------------+ +| NULL | experience | ++--------+------------+ +| sample | sample | ++--------+------------+ +``` + +## SQL Solution + +```sql +-- Table Name for Test-Case 1: candidates_2004_tc_2 +-- Table Name for Test-Case 2: candidates_2004 + +WITH seniors AS ( + SELECT *, + SUM(salary) OVER (ORDER BY salary,employee_id) AS occupied_budget, + 70000-SUM(salary) OVER (ORDER BY salary,employee_id) AS remaining_budget + FROM candidates_2004_tc_2 + WHERE experience = 'Senior' +), +left_budget AS ( + SELECT COALESCE(MIN(remaining_budget),70000) AS budget + FROM seniors + WHERE remaining_budget >= 0 +), +juniors AS ( + SELECT c.*, + SUM(c.salary) OVER (ORDER BY c.salary,c.employee_id) AS occupied_budget, + lb.budget-SUM(c.salary) 
OVER (ORDER BY c.salary,c.employee_id) AS remaining_budget + FROM candidates_2004_tc_2 c + CROSS JOIN left_budget lb + WHERE experience = 'Junior' +), +hired_candidates AS ( + SELECT employee_id,experience + FROM juniors + WHERE remaining_budget >= 0 + UNION + SELECT employee_id,experience + FROM seniors + WHERE remaining_budget >= 0 + UNION + SELECT NULL,'Senior' + UNION + SELECT NULL,'Junior' +) +SELECT experience,COALESCE(COUNT(employee_id),0) +FROM hired_candidates +GROUP BY experience; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `experience` from `candidates_2004_tc`, `seniors`, `left_budget`, `juniors`. + +### Result Grain + +One row per unique key in `GROUP BY experience`. + +### Step-by-Step Logic + +1. Create CTE layers (`seniors`, `left_budget`, `juniors`, `hired_candidates`) to decompose the logic into smaller, testable steps before the final SELECT. +2. CTE `seniors`: reads `candidates_2004_tc`, computes window metrics. +3. CTE `left_budget`: reads `seniors`. +4. CTE `juniors`: reads `candidates_2004_tc`, `left_budget`, joins related entities, computes window metrics. +5. CTE `hired_candidates`: reads `juniors`, `seniors`. +6. Combine datasets using CROSS JOIN, JOIN. Join predicates control row matching and prevent accidental cartesian growth. +7. Aggregate rows with COUNT, SUM, MIN grouped by experience. +8. Use window functions (`... OVER (...)`) to compute rankings/running metrics while preserving row-level detail. +9. Project final output columns: `experience`. + +### Why This Works + +CTEs separate transformation stages, which makes dependencies explicit and easier to validate. Join conditions align related entities so each output row is built from the correct source records. Grouping keys define the reporting grain; aggregate functions then summarize values at exactly that grain. Window expressions calculate comparative metrics without collapsing rows too early. 
The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are sorting/grouping, window partitions, join operations. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently. +- Every non-aggregated selected column must belong to the grouping grain. +- Window `PARTITION BY`/`ORDER BY` choices change semantics; test with tied values and nulls. + diff --git a/hard/2004. The Number of Seniors and Juniors to Join the Company (Hard).sql b/hard/2004. The Number of Seniors and Juniors to Join the Company (Hard).sql deleted file mode 100644 index 23fa0ff..0000000 --- a/hard/2004. The Number of Seniors and Juniors to Join the Company (Hard).sql +++ /dev/null @@ -1,39 +0,0 @@ --- Table Name for Test-Case 1: candidates_2004_tc_2 --- Table Name for Test-Case 2: candidates_2004 - -WITH seniors AS ( - SELECT *, - SUM(salary) OVER (ORDER BY salary,employee_id) AS occupied_budget, - 70000-SUM(salary) OVER (ORDER BY salary,employee_id) AS remaining_budget - FROM candidates_2004_tc_2 - WHERE experience = 'Senior' -), -left_budget AS ( - SELECT COALESCE(MIN(remaining_budget),70000) AS budget - FROM seniors - WHERE remaining_budget >= 0 -), -juniors AS ( - SELECT c.*, - SUM(c.salary) OVER (ORDER BY c.salary,c.employee_id) AS occupied_budget, - lb.budget-SUM(c.salary) OVER (ORDER BY c.salary,c.employee_id) AS remaining_budget - FROM candidates_2004_tc_2 c - CROSS JOIN left_budget lb - WHERE experience = 'Junior' -), -hired_candidates AS ( - SELECT employee_id,experience - FROM juniors - WHERE remaining_budget >= 0 - UNION - SELECT employee_id,experience - FROM seniors - WHERE remaining_budget >= 0 - UNION - SELECT NULL,'Senior' - UNION - SELECT NULL,'Junior' -) -SELECT experience,COALESCE(COUNT(employee_id),0) -FROM hired_candidates -GROUP BY experience; diff --git a/hard/2010. 
The Number of Seniors and Juniors to Join the Company II (Hard).md b/hard/2010. The Number of Seniors and Juniors to Join the Company II (Hard).md new file mode 100644 index 0000000..35a3ee0 --- /dev/null +++ b/hard/2010. The Number of Seniors and Juniors to Join the Company II (Hard).md @@ -0,0 +1,108 @@ +# Question 2010: The Number of Seniors and Juniors to Join the Company II + +**LeetCode URL:** https://leetcode.com/problems/the-number-of-seniors-and-juniors-to-join-the-company-ii/ + +## Description + +A company has a hiring budget of $70000. It wants to hire as many seniors as possible first, picking cheaper candidates before more expensive ones, and then hire as many juniors as possible with whatever budget remains. Write an SQL query to report the IDs of the candidates the company should hire. Return the result table in any order. 
+ +## Table Schema Structure + +```sql +Create table If Not Exists Candidates (employee_id int, experience ENUM('Senior', 'Junior'), salary int); +``` + +## Sample Input Data + +```sql +insert into Candidates (employee_id, experience, salary) values ('1', 'Junior', '10000'); +insert into Candidates (employee_id, experience, salary) values ('9', 'Junior', '15000'); +insert into Candidates (employee_id, experience, salary) values ('2', 'Senior', '20000'); +insert into Candidates (employee_id, experience, salary) values ('11', 'Senior', '16000'); +insert into Candidates (employee_id, experience, salary) values ('13', 'Senior', '50000'); +insert into Candidates (employee_id, experience, salary) values ('4', 'Junior', '40000'); +``` + +## Expected Output Data + +```text ++-------------+ +| employee_id | ++-------------+ +| sample | ++-------------+ +``` + +## SQL Solution + +```sql +-- Table Name for Test-Case 1: candidates_2010 +-- Table Name for Test-Case 2: candidates_2010_tc_2 + +WITH seniors AS ( + SELECT *, + SUM(salary) OVER (ORDER BY salary,employee_id) AS occupied_budget, + 70000-SUM(salary) OVER (ORDER BY salary,employee_id) AS remaining_budget + FROM candidates_2010 + WHERE experience = 'Senior' +), +left_budget AS ( + SELECT COALESCE(MIN(remaining_budget),70000) AS budget + FROM seniors + WHERE remaining_budget >= 0 +), +juniors AS ( + SELECT c.*, + SUM(c.salary) OVER (ORDER BY c.salary,c.employee_id) AS occupied_budget, + lb.budget-SUM(c.salary) OVER (ORDER BY c.salary,c.employee_id) AS remaining_budget + FROM candidates_2010 c + CROSS JOIN left_budget lb + WHERE experience = 'Junior' +), +hired_candidates AS ( + SELECT * + FROM juniors + WHERE remaining_budget >= 0 + UNION + SELECT * + FROM seniors + WHERE remaining_budget >= 0 +) +SELECT employee_id +FROM hired_candidates; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `employee_id` from `candidates`, `seniors`, `left_budget`, `juniors`. 
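
The running-budget idea behind the `seniors` CTE (a cumulative salary sum compared against the budget) can be sketched with SQLite window functions; the rows below are invented and SQLite >= 3.25 is assumed for `OVER`:

```python
import sqlite3  # window functions require SQLite >= 3.25

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE candidates (employee_id INT, experience TEXT, salary INT);
INSERT INTO candidates VALUES
  (1, 'Junior', 10000), (9, 'Junior', 15000), (2, 'Senior', 20000),
  (11, 'Senior', 16000), (13, 'Senior', 50000), (4, 'Junior', 40000);
""")

# Running salary total over seniors ordered cheapest-first; a senior is
# hired while the cumulative spend stays within the 70000 budget.
hired_seniors = conn.execute("""
    WITH seniors AS (
      SELECT employee_id,
             SUM(salary) OVER (ORDER BY salary, employee_id) AS spent
      FROM candidates
      WHERE experience = 'Senior'
    )
    SELECT employee_id FROM seniors
    WHERE spent <= 70000
    ORDER BY spent
""").fetchall()
print(hired_seniors)  # [(11,), (2,)]
```

`spent <= 70000` is the same condition as the solution's `remaining_budget >= 0`, since `remaining_budget = 70000 - spent`; employee 13 pushes the running total to 86000 and is excluded.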
+ +### Result Grain + +Row grain follows the post-filter join output. + +### Step-by-Step Logic + +1. Create CTE layers (`seniors`, `left_budget`, `juniors`, `hired_candidates`) to decompose the logic into smaller, testable steps before the final SELECT. +2. CTE `seniors`: reads `candidates`, computes window metrics. +3. CTE `left_budget`: reads `seniors`. +4. CTE `juniors`: reads `candidates`, `left_budget`, joins related entities, computes window metrics. +5. CTE `hired_candidates`: reads `juniors`, `seniors`. +6. Combine datasets using CROSS JOIN, JOIN. Join predicates control row matching and prevent accidental cartesian growth. +7. Use window functions (`... OVER (...)`) to compute rankings/running metrics while preserving row-level detail. +8. Project final output columns: `employee_id`. + +### Why This Works + +CTEs separate transformation stages, which makes dependencies explicit and easier to validate. Join conditions align related entities so each output row is built from the correct source records. Window expressions calculate comparative metrics without collapsing rows too early. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are window partitions, join operations. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently. +- Window `PARTITION BY`/`ORDER BY` choices change semantics; test with tied values and nulls. + diff --git a/hard/2010. The Number of Seniors and Juniors to Join the Company II (Hard).sql b/hard/2010. The Number of Seniors and Juniors to Join the Company II (Hard).sql deleted file mode 100644 index b36bdab..0000000 --- a/hard/2010. 
The Number of Seniors and Juniors to Join the Company II (Hard).sql +++ /dev/null @@ -1,34 +0,0 @@ --- Table Name for Test-Case 1: candidates_2010 --- Table Name for Test-Case 2: candidates_2010_tc_2 - -WITH seniors AS ( - SELECT *, - SUM(salary) OVER (ORDER BY salary,employee_id) AS occupied_budget, - 70000-SUM(salary) OVER (ORDER BY salary,employee_id) AS remaining_budget - FROM candidates_2010 - WHERE experience = 'Senior' -), -left_budget AS ( - SELECT COALESCE(MIN(remaining_budget),70000) AS budget - FROM seniors - WHERE remaining_budget >= 0 -), -juniors AS ( - SELECT c.*, - SUM(c.salary) OVER (ORDER BY c.salary,c.employee_id) AS occupied_budget, - lb.budget-SUM(c.salary) OVER (ORDER BY c.salary,c.employee_id) AS remaining_budget - FROM candidates_2010 c - CROSS JOIN left_budget lb - WHERE experience = 'Junior' -), -hired_candidates AS ( - SELECT * - FROM juniors - WHERE remaining_budget >= 0 - UNION - SELECT * - FROM seniors - WHERE remaining_budget >= 0 -) -SELECT employee_id -FROM hired_candidates; diff --git a/hard/2118. Build the Equation (Hard).md b/hard/2118. Build the Equation (Hard).md new file mode 100644 index 0000000..0d02daa --- /dev/null +++ b/hard/2118. Build the Equation (Hard).md @@ -0,0 +1,99 @@ +# Question 2118: Build the Equation + +**LeetCode URL:** https://leetcode.com/problems/build-the-equation/ + +## Description + +Drafted from this solution SQL: write a query on `terms`, `terms_2118_tc`, `grouped_terms` to return `equation`. Group results by: power ), terms AS ( SELECT power,ABS(factor) AS factor, (CASE WHEN factor < 0 THEN '-' ELSE '+' END) AS sign, (CASE WHEN power = 1 THEN CONCAT(ABS(factor),'X') WHEN power = 0 THEN ABS(factor)::TEXT ELSE CONCAT(ABS(factor),'X^',power) END) AS term FROM grouped_terms ) SELECT CONCAT(STRING_AGG(CONCAT(sign,term),''. Order the final output by: power DESC),'=0') AS equation FROM terms. 
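
The per-term formatting and power-descending concatenation can be sketched in SQLite with invented rows; since SQLite's `group_concat` does not reliably honor an ordering the way PostgreSQL's `STRING_AGG(... ORDER BY ...)` does, this sketch joins the ordered rows in Python instead:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE terms (power INT, factor INT);
INSERT INTO terms VALUES (2, 1), (1, -4), (0, 2);
""")

# Format each term as sign + |factor| + optional 'X'/'X^power',
# emitted in power-descending order.
rows = conn.execute("""
    SELECT CASE WHEN factor < 0 THEN '-' ELSE '+' END ||
           CASE WHEN power = 0 THEN CAST(ABS(factor) AS TEXT)
                WHEN power = 1 THEN ABS(factor) || 'X'
                ELSE ABS(factor) || 'X^' || power
           END
    FROM terms
    ORDER BY power DESC
""").fetchall()

equation = "".join(r[0] for r in rows) + "=0"
print(equation)  # +1X^2-4X+2=0
```

Each term carries its own sign, so plain concatenation yields a valid left-hand side, and appending `'=0'` completes the equation exactly as the PostgreSQL solution does in one aggregate.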
+ +## Table Schema Structure + +```sql +Create table If Not Exists Terms (power int, factor int); +``` + +## Sample Input Data + +```sql +insert into Terms (power, factor) values ('2', '1'); +insert into Terms (power, factor) values ('1', '-4'); +insert into Terms (power, factor) values ('0', '2'); +``` + +## Expected Output Data + +```text ++----------+ +| equation | ++----------+ +| sample | ++----------+ +``` + +## SQL Solution + +```sql +-- Table Name for Test-Case 1: terms_2118 +-- Table Name for Test-Case 2: terms_2118_tc_2 + +WITH terms AS ( + SELECT power,ABS(factor) AS factor, + (CASE WHEN factor < 0 THEN '-' ELSE '+' END) AS sign, + (CASE WHEN power = 1 THEN CONCAT(ABS(factor),'X') + WHEN power = 0 THEN ABS(factor)::TEXT + ELSE CONCAT(ABS(factor),'X^',power) + END) AS term + FROM terms_2118 +) +SELECT CONCAT(STRING_AGG(CONCAT(sign,term),'' ORDER BY power DESC),'=0') AS equation +FROM terms; + +-- Solution of the follow-up question (Table : terms_2118_tc_3) + +WITH grouped_terms AS ( + SELECT power,SUM(factor) AS factor + FROM terms_2118_tc_3 + GROUP BY power +), +terms AS ( + SELECT power,ABS(factor) AS factor, + (CASE WHEN factor < 0 THEN '-' ELSE '+' END) AS sign, + (CASE WHEN power = 1 THEN CONCAT(ABS(factor),'X') + WHEN power = 0 THEN ABS(factor)::TEXT + ELSE CONCAT(ABS(factor),'X^',power) + END) AS term + FROM grouped_terms +) +SELECT CONCAT(STRING_AGG(CONCAT(sign,term),'' ORDER BY power DESC),'=0') AS equation +FROM terms; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `equation` from `terms`, `terms_2118_tc`, `grouped_terms`. + +### Result Grain + +Row grain follows the post-filter join output. + +### Step-by-Step Logic + +1. Create CTE layers (`terms`, `grouped_terms`) to decompose the logic into smaller, testable steps before the final SELECT. +2. CTE `terms`: reads `terms`. +3. Project final output columns: `equation`. +4. 
Order the aggregated terms inside `STRING_AGG` with `ORDER BY power DESC`, then append `=0` to complete the equation string. + +### Why This Works + +CTEs separate transformation stages, which makes dependencies explicit and easier to validate. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are sorting/grouping. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Check edge cases: empty input, single-row input, and duplicate keys. + diff --git a/hard/2118. Build the Equation (Hard).sql b/hard/2118. Build the Equation (Hard).sql deleted file mode 100644 index 69ac3ea..0000000 --- a/hard/2118. Build the Equation (Hard).sql +++ /dev/null @@ -1,33 +0,0 @@ --- Table Name for Test-Case 1: terms_2118 --- Table Name for Test-Case 2: terms_2118_tc_2 - -WITH terms AS ( - SELECT power,ABS(factor) AS factor, - (CASE WHEN factor < 0 THEN '-' ELSE '+' END) AS sign, - (CASE WHEN power = 1 THEN CONCAT(ABS(factor),'X') - WHEN power = 0 THEN ABS(factor)::TEXT - ELSE CONCAT(ABS(factor),'X^',power) - END) AS term - FROM terms_2118 -) -SELECT CONCAT(STRING_AGG(CONCAT(sign,term),'' ORDER BY power DESC),'=0') AS equation -FROM terms; - --- Solution of the follow-up question (Table : terms_2118_tc_3) - -WITH grouped_terms AS ( - SELECT power,SUM(factor) AS factor - FROM terms_2118_tc_3 - GROUP BY power -), -terms AS ( - SELECT power,ABS(factor) AS factor, - (CASE WHEN factor < 0 THEN '-' ELSE '+' END) AS sign, - (CASE WHEN power = 1 THEN CONCAT(ABS(factor),'X') - WHEN power = 0 THEN ABS(factor)::TEXT - ELSE CONCAT(ABS(factor),'X^',power) - END) AS term - FROM grouped_terms -) -SELECT CONCAT(STRING_AGG(CONCAT(sign,term),'' ORDER BY power DESC),'=0') AS equation -FROM terms; diff --git a/hard/2173. Longest Winning Streak (Hard).md b/hard/2173. Longest Winning Streak (Hard).md new file mode 100644 index 0000000..b9c1f64 --- /dev/null +++ b/hard/2173.
Longest Winning Streak (Hard).md @@ -0,0 +1,130 @@ +# Question 2173: Longest Winning Streak + +**LeetCode URL:** https://leetcode.com/problems/longest-winning-streak/ + +## Description + +The `Matches` table records one row per match with `player_id`, `match_day`, and a `result` of `Win`, `Draw`, or `Lose`. For each player, return `player_id` and `longest_streak`: the length of the longest run of consecutive wins when that player's matches are ordered by `match_day`. Players with no wins get a streak of 0.
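As a cross-check on the gaps-and-islands SQL below, the streak computation can be sketched in Python with a simple run-length pass (illustrative only; the function name and input shape are assumptions, not part of the repository):

```python
from itertools import groupby

def longest_win_streaks(matches):
    """matches: iterable of (player_id, match_day, result) rows."""
    results_by_player = {}
    # Sort by player, then by match day, so each player's results are chronological.
    for player_id, _, result in sorted(matches):
        results_by_player.setdefault(player_id, []).append(result)
    streaks = {}
    for player_id, results in results_by_player.items():
        best = 0
        # groupby yields maximal runs of equal win/non-win flags.
        for is_win, run in groupby(results, key=lambda r: r == 'Win'):
            if is_win:
                best = max(best, sum(1 for _ in run))
        streaks[player_id] = best
    return streaks
```

On the sample rows in this document this yields `{1: 3, 2: 0, 3: 1}`, matching what `COALESCE(MAX(winning_streak),0)` produces in the SQL.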
+ +## Table Schema Structure + +```sql +Create table If Not Exists Matches (player_id int, match_day date, result ENUM('Win', 'Draw', 'Lose')); +``` + +## Sample Input Data + +```sql +insert into Matches (player_id, match_day, result) values ('1', '2022-01-17', 'Win'); +insert into Matches (player_id, match_day, result) values ('1', '2022-01-18', 'Win'); +insert into Matches (player_id, match_day, result) values ('1', '2022-01-25', 'Win'); +insert into Matches (player_id, match_day, result) values ('1', '2022-01-31', 'Draw'); +insert into Matches (player_id, match_day, result) values ('1', '2022-02-08', 'Win'); +insert into Matches (player_id, match_day, result) values ('2', '2022-02-06', 'Lose'); +insert into Matches (player_id, match_day, result) values ('2', '2022-02-08', 'Lose'); +insert into Matches (player_id, match_day, result) values ('3', '2022-03-30', 'Win'); +``` + +## Expected Output Data + +```text ++-----------+----------------+ +| player_id | longest_streak | ++-----------+----------------+ +| sample | sample | ++-----------+----------------+ +``` + +## SQL Solution + +```sql +-- Table name for Test-Case1 : matches_2173 +-- Table name for Test-Case2 : matches_2173_tc_2 + +WITH ranked_all_matches AS ( + SELECT *, + ROW_NUMBER() OVER (PARTITION BY player_id ORDER BY match_day) AS rn + FROM matches_2173 +), +ranked_won_matches AS ( + SELECT player_id,result,rn, + ROW_NUMBER() OVER (PARTITION BY player_id ORDER BY match_day) AS wrn + FROM ranked_all_matches + WHERE result = 'Win' +), +winning_streaks AS ( + SELECT player_id,result,rn-wrn AS diff,COUNT(1) AS winning_streak + FROM ranked_won_matches + GROUP BY player_id,result,rn-wrn +), +players AS ( + SELECT DISTINCT player_id + FROM matches_2173 +) +SELECT p.player_id,COALESCE(MAX(winning_streak),0) AS longest_streak +FROM players p +LEFT JOIN winning_streaks w ON w.player_id = p.player_id +GROUP BY p.player_id; + +/* +pid rn wrn diff +----------------------------- +1 1 1 0 +1 2 2 0 +1 3 3 0 +1 4 - - +1 
5 4 1 +1 6 5 1 +1 7 - - +1 8 6 2 +1 9 7 2 +1 10 8 2 +1 11 9 2 +1 12 - - + +2 1 - - +2 2 1 1 +2 3 - - +2 4 2 2 +2 5 3 2 + +3 1 1 0 +*/ +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `player_id`, `longest_streak` from `matches`, `ranked_all_matches`, `ranked_won_matches`, `players`. + +### Result Grain + +One row per unique key in `GROUP BY p.player_id`. + +### Step-by-Step Logic + +1. Create CTE layers (`ranked_all_matches`, `ranked_won_matches`, `winning_streaks`, `players`) to decompose the logic into smaller, testable steps before the final SELECT. +2. CTE `ranked_all_matches`: reads `matches`, computes window metrics. +3. CTE `ranked_won_matches`: reads `ranked_all_matches`, computes window metrics. +4. CTE `winning_streaks`: reads `ranked_won_matches`. +5. CTE `players`: reads `matches`. +6. Combine datasets using LEFT JOIN, JOIN. Join predicates control row matching and prevent accidental cartesian growth. +7. Aggregate rows with COUNT, MAX grouped by p.player_id. +8. Use window functions (`... OVER (...)`) to compute rankings/running metrics while preserving row-level detail. +9. Project final output columns: `player_id`, `longest_streak`. + +### Why This Works + +CTEs separate transformation stages, which makes dependencies explicit and easier to validate. Join conditions align related entities so each output row is built from the correct source records. Grouping keys define the reporting grain; aggregate functions then summarize values at exactly that grain. Window expressions calculate comparative metrics without collapsing rows too early. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are sorting/grouping, window partitions, join operations. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently. 
+- Every non-aggregated selected column must belong to the grouping grain. +- Window `PARTITION BY`/`ORDER BY` choices change semantics; test with tied values and nulls. + diff --git a/hard/2173. Longest Winning Streak (Hard).sql b/hard/2173. Longest Winning Streak (Hard).sql deleted file mode 100644 index 0e88dec..0000000 --- a/hard/2173. Longest Winning Streak (Hard).sql +++ /dev/null @@ -1,52 +0,0 @@ --- Table name for Test-Case1 : matches_2173 --- Table name for Test-Case2 : matches_2173_tc_2 - -WITH ranked_all_matches AS ( - SELECT *, - ROW_NUMBER() OVER (PARTITION BY player_id ORDER BY match_day) AS rn - FROM matches_2173 -), -ranked_won_matches AS ( - SELECT player_id,result,rn, - ROW_NUMBER() OVER (PARTITION BY player_id ORDER BY match_day) AS wrn - FROM ranked_all_matches - WHERE result = 'Win' -), -winning_streaks AS ( - SELECT player_id,result,rn-wrn AS diff,COUNT(1) AS winning_streak - FROM ranked_won_matches - GROUP BY player_id,result,rn-wrn -), -players AS ( - SELECT DISTINCT player_id - FROM matches_2173 -) -SELECT p.player_id,COALESCE(MAX(winning_streak),0) AS longest_streak -FROM players p -LEFT JOIN winning_streaks w ON w.player_id = p.player_id -GROUP BY p.player_id; - -/* -pid rn wrn diff ------------------------------ -1 1 1 0 -1 2 2 0 -1 3 3 0 -1 4 - - -1 5 4 1 -1 6 5 1 -1 7 - - -1 8 6 2 -1 9 7 2 -1 10 8 2 -1 11 9 2 -1 12 - - - -2 1 - - -2 2 1 1 -2 3 - - -2 4 2 2 -2 5 3 2 - -3 1 1 0 -*/ diff --git a/hard/2199. Finding the Topic of Each Post (Hard).md b/hard/2199. Finding the Topic of Each Post (Hard).md new file mode 100644 index 0000000..d21200c --- /dev/null +++ b/hard/2199. Finding the Topic of Each Post (Hard).md @@ -0,0 +1,77 @@ +# Question 2199: Finding the Topic of Each Post + +**LeetCode URL:** https://leetcode.com/problems/finding-the-topic-of-each-post/ + +## Description + +Drafted from this solution SQL: write a query on `posts`, `keywords` to return `post_id`, `topic`. Group results by: p.post_id. 
For each post, aggregate the distinct ids of every topic whose keyword occurs as a whole word in the post content (matched case-insensitively), joined by commas in ascending topic-id order; a post that matches no keyword gets the topic `Ambiguous!`. + +## Table Schema Structure + +```sql +Create table If Not Exists Keywords (topic_id int, word varchar(25)); +Create table If Not Exists Posts (post_id int, content varchar(100)); +``` + +## Sample Input Data + +```sql +insert into Keywords (topic_id, word) values ('1', 'handball'); +insert into Keywords (topic_id, word) values ('1', 'football'); +insert into Keywords (topic_id, word) values ('3', 'WAR'); +insert into Keywords (topic_id, word) values ('2', 'Vaccine'); +insert into Posts (post_id, content) values ('1', 'We call it soccer They call it football hahaha'); +insert into Posts (post_id, content) values ('2', 'Americans prefer basketball while Europeans love handball and football'); +insert into Posts (post_id, content) values ('3', 'stop the war and play handball'); +insert into Posts (post_id, content) values ('4', 'warning I planted some flowers this morning and then got vaccinated'); +``` + +## Expected Output Data + +```text ++---------+--------+ +| post_id | topic | ++---------+--------+ +| sample | sample | ++---------+--------+ +``` + +## SQL Solution + +```sql +SELECT p.post_id,COALESCE(STRING_AGG(DISTINCT k.topic_id::TEXT,',' ORDER BY k.topic_id::TEXT),'Ambiguous!') AS topic +FROM posts_2199 p +LEFT JOIN keywords_2199 k ON POSITION(LOWER(' '||k.word||' ') IN LOWER(' '||p.content||' '))!=0 +GROUP BY p.post_id; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `post_id`, `topic` from `posts`, `keywords`. + +### Result Grain + +One row per unique key in `GROUP BY p.post_id`. + +### Step-by-Step Logic + +1. Combine datasets using a LEFT JOIN so every post is kept even when no keyword matches; the join predicate controls row matching and prevents accidental cartesian growth. +2. Group rows by p.post_id to enforce one result row per key. +3.
Project final output columns: `post_id`, `topic`. +4. Order the aggregated topic ids inside `STRING_AGG` with `ORDER BY k.topic_id::TEXT` so each post's topic list is deterministic. + +### Why This Works + +Join conditions align related entities so each output row is built from the correct source records. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are sorting/grouping and join operations. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently. +- Every non-aggregated selected column must belong to the grouping grain. + diff --git a/hard/2199. Finding the Topic of Each Post (Hard).sql b/hard/2199. Finding the Topic of Each Post (Hard).sql deleted file mode 100644 index a59f464..0000000 --- a/hard/2199. Finding the Topic of Each Post (Hard).sql +++ /dev/null @@ -1,4 +0,0 @@ -SELECT p.post_id,COALESCE(STRING_AGG(DISTINCT k.topic_id::TEXT,',' ORDER BY k.topic_id::TEXT),'Ambiguous!') AS topic -FROM posts_2199 p -LEFT JOIN keywords_2199 k ON POSITION(LOWER(' '||k.word||' ') IN LOWER(' '||p.content||' '))!=0 -GROUP BY p.post_id; diff --git a/hard/2252. Dynamic Pivoting of a Table (Hard).md b/hard/2252. Dynamic Pivoting of a Table (Hard).md new file mode 100644 index 0000000..a4e41a4 --- /dev/null +++ b/hard/2252. Dynamic Pivoting of a Table (Hard).md @@ -0,0 +1,116 @@ +# Question 2252: Dynamic Pivoting of a Table + +**LeetCode URL:** https://leetcode.com/problems/dynamic-pivoting-of-a-table/ + +## Description + +The `Products` table stores one row per `(product_id, store, price)`. Pivot it so that each product becomes a single row with one column per store holding that store's price (NULL where the product is not sold in that store), ordered by `product_id`. Because the set of stores is not known in advance, the pivot query itself has to be generated dynamically.
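The query-generation idea used by the PL/pgSQL solution below (discover the distinct stores, then emit one conditional aggregate per store) can be sketched in Python; `build_pivot_sql` and its argument shape are illustrative assumptions, not part of the repository:

```python
def build_pivot_sql(rows, table="products"):
    """rows: iterable of (product_id, store, price); returns the pivot query text."""
    # Mirrors ARRAY_AGG(DISTINCT store ORDER BY store) in the PL/pgSQL function.
    stores = sorted({store for _, store, _ in rows})
    columns = ",\n       ".join(
        f"SUM(CASE WHEN store = '{s}' THEN price ELSE NULL END) AS \"{s}\""
        for s in stores
    )
    return (
        f"SELECT product_id,\n       {columns}\n"
        f"FROM {table}\nGROUP BY product_id\nORDER BY product_id"
    )

print(build_pivot_sql([(1, 'Shop', 110), (1, 'LC_Store', 100)]))
```

A production version would quote identifiers (as the PL/pgSQL function does with double quotes) rather than interpolate untrusted names directly.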
+ +## Table Schema Structure + +```sql +Create table If Not Exists Products (product_id int, store varchar(7), price int); +``` + +## Sample Input Data + +```sql +insert into Products (product_id, store, price) values ('1', 'Shop', '110'); +insert into Products (product_id, store, price) values ('1', 'LC_Store', '100'); +insert into Products (product_id, store, price) values ('2', 'Nozama', '200'); +insert into Products (product_id, store, price) values ('2', 'Souq', '190'); +insert into Products (product_id, store, price) values ('3', 'Shop', '1000'); +insert into Products (product_id, store, price) values ('3', 'Souq', '1900'); +``` + +## Expected Output Data + +```text ++------------+ +| product_id | ++------------+ +| sample | ++------------+ +``` + +## SQL Solution + +```sql +CREATE OR REPLACE FUNCTION pivot_products_2252() +RETURNS TEXT +LANGUAGE PLPGSQL +AS +$$ +DECLARE + store_name_array TEXT[]; + store_name TEXT; + query_text TEXT; +BEGIN + --query to find all the stores given in the table + SELECT ARRAY_AGG(DISTINCT store ORDER BY store) + INTO store_name_array + FROM products_2252; + --RAISE NOTICE 'store_name_array = %',store_name_array; + + --prepare query + query_text := 'SELECT product_id, '; + + --prepare case statements for all the store_name in store_name_array + FOREACH store_name IN ARRAY store_name_array + LOOP + query_text := query_text || 'SUM(CASE WHEN store = ''' || store_name || ''' THEN price ELSE NULL END) AS "' || store_name || '",'; + END LOOP; + + --prepare query + query_text := LEFT(query_text,LENGTH(query_text)-1); + query_text := query_text || ' FROM products_2252 GROUP BY product_id ORDER BY product_id'; + --RAISE NOTICE '%',query_text; + + --return the query as text + RETURN query_text; +END $$; + +SELECT pivot_products_2252(); + +-- output of the function: +SELECT product_id, + SUM(CASE WHEN store = 'LC_Store' THEN price ELSE NULL END) AS "LC_Store", + SUM(CASE WHEN store = 'Nozama' THEN price ELSE NULL END) AS "Nozama", + 
SUM(CASE WHEN store = 'Shop' THEN price ELSE NULL END) AS "Shop", + SUM(CASE WHEN store = 'Souq' THEN price ELSE NULL END) AS "Souq" +FROM products_2252 +GROUP BY product_id +ORDER BY product_id; + +--running this query manually will give us expected results +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `product_id` from `products`. + +### Result Grain + +One row per unique key in `GROUP BY product_id`. + +### Step-by-Step Logic + +1. Define a SQL function/procedural block first, then execute it to generate or run dynamic SQL for the final shape. +2. Aggregate rows with SUM grouped by product_id. +3. Project final output columns: `product_id`. +4. Order output deterministically with `ORDER BY product_id`. + +### Why This Works + +Grouping keys define the reporting grain; aggregate functions then summarize values at exactly that grain. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are sorting/grouping. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Every non-aggregated selected column must belong to the grouping grain. + diff --git a/hard/2252. Dynamic Pivoting of a Table (Hard).sql b/hard/2252. Dynamic Pivoting of a Table (Hard).sql deleted file mode 100644 index 3839457..0000000 --- a/hard/2252. 
Dynamic Pivoting of a Table (Hard).sql +++ /dev/null @@ -1,47 +0,0 @@ -CREATE OR REPLACE FUNCTION pivot_products_2252() -RETURNS TEXT -LANGUAGE PLPGSQL -AS -$$ -DECLARE - store_name_array TEXT[]; - store_name TEXT; - query_text TEXT; -BEGIN - --query to find all the stores given in the table - SELECT ARRAY_AGG(DISTINCT store ORDER BY store) - INTO store_name_array - FROM products_2252; - --RAISE NOTICE 'store_name_array = %',store_name_array; - - --prepare query - query_text := 'SELECT product_id, '; - - --prepare case statements for all the store_name in store_name_array - FOREACH store_name IN ARRAY store_name_array - LOOP - query_text := query_text || 'SUM(CASE WHEN store = ''' || store_name || ''' THEN price ELSE NULL END) AS "' || store_name || '",'; - END LOOP; - - --prepare query - query_text := LEFT(query_text,LENGTH(query_text)-1); - query_text := query_text || ' FROM products_2252 GROUP BY product_id ORDER BY product_id'; - --RAISE NOTICE '%',query_text; - - --return the query as text - RETURN query_text; -END $$; - -SELECT pivot_products_2252(); - --- output of the function: -SELECT product_id, - SUM(CASE WHEN store = 'LC_Store' THEN price ELSE NULL END) AS "LC_Store", - SUM(CASE WHEN store = 'Nozama' THEN price ELSE NULL END) AS "Nozama", - SUM(CASE WHEN store = 'Shop' THEN price ELSE NULL END) AS "Shop", - SUM(CASE WHEN store = 'Souq' THEN price ELSE NULL END) AS "Souq" -FROM products_2252 -GROUP BY product_id -ORDER BY product_id; - ---running this query manually will give us expected results diff --git a/hard/2253. Dynamic Unpivoting of a Table (Hard).md b/hard/2253. Dynamic Unpivoting of a Table (Hard).md new file mode 100644 index 0000000..de9bc28 --- /dev/null +++ b/hard/2253. 
Dynamic Unpivoting of a Table (Hard).md @@ -0,0 +1,119 @@ +# Question 2253: Dynamic Unpivoting of a Table + +**LeetCode URL:** https://leetcode.com/problems/dynamic-unpivoting-of-a-table/ + +## Description + +Each row of `Products` holds a `product_id` plus one price column per store (`LC_Store`, `Nozama`, `Shop`, `Souq` in the sample), with NULL where the product is not sold in that store. Unpivot the table into one `(product_id, store, price)` row per non-NULL price, ordered by `product_id, store`. Because the store columns are not known in advance, the unpivot query itself has to be generated dynamically. + +## Table Schema Structure + +```sql +-- Schema not provided in the source API payload; inferred from the sample inserts below: +Create table If Not Exists Products (product_id int, LC_Store int, Nozama int, Shop int, Souq int); +``` + +## Sample Input Data + +```sql +insert into Products (product_id, LC_Store, Nozama, Shop, Souq) values ('1', '100', NULL, '110', NULL); +insert into Products (product_id, LC_Store, Nozama, Shop, Souq) values ('2', NULL, '200', NULL, '190'); +insert into Products (product_id, LC_Store, Nozama, Shop, Souq) values ('3', NULL, NULL, '1000', '1900'); +``` + +## Expected Output Data + +```text ++------------+--------+ +| product_id | store | ++------------+--------+ +| sample | sample | ++------------+--------+ +``` + +## SQL Solution + +```sql +CREATE OR REPLACE FUNCTION unpivot_products_2253() +RETURNS TEXT +LANGUAGE PLPGSQL +AS +$$ +DECLARE + stores_array TEXT[]; + query_text TEXT := ''; + store_name TEXT; +BEGIN + --query to find all the stores columns of the products table except product_id column + SELECT ARRAY_AGG(column_name) + INTO stores_array + FROM information_schema.columns + WHERE table_name = 'products_2253' AND column_name <> 'product_id'; + + -- prepare query + FOREACH store_name IN ARRAY stores_array + LOOP + query_text := query_text || 'SELECT product_id, ''' || store_name || ''' AS store, "' || store_name ||'" FROM products_2253 WHERE "' || store_name || '" IS NOT NULL'; + query_text := query_text || ' UNION '; + END LOOP; + + query_text := LEFT(query_text,LENGTH(query_text)-6); + query_text := query_text || ' ORDER BY product_id,store;'; + + --return the query as text + RETURN query_text; +END +$$; + +SELECT unpivot_products_2253(); +
+-- output of the function: +SELECT product_id, 'LC_Store' AS store, "LC_Store" +FROM products_2253 +WHERE "LC_Store" IS NOT NULL +UNION +SELECT product_id, 'Nozama' AS store, "Nozama" +FROM products_2253 +WHERE "Nozama" IS NOT NULL +UNION +SELECT product_id, 'Shop' AS store, "Shop" +FROM products_2253 +WHERE "Shop" IS NOT NULL +UNION +SELECT product_id, 'Souq' AS store, "Souq" +FROM products_2253 +WHERE "Souq" IS NOT NULL +ORDER BY product_id,store; + +--running this query manually will give us expected results +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `product_id`, `store` from `information_schema`, `products`. + +### Result Grain + +Row grain follows the post-filter join output. + +### Step-by-Step Logic + +1. Define a SQL function/procedural block first, then execute it to generate or run dynamic SQL for the final shape. +2. Apply row-level filtering in `WHERE`: "Souq" IS NOT NULL. +3. Project final output columns: `product_id`, `store`. +4. Merge compatible result sets with `UNION`/`UNION ALL` before final projection. +5. Order output deterministically with `ORDER BY product_id,store`. + +### Why This Works + +Predicate filtering removes irrelevant rows before expensive downstream computation. Set-union logic combines multiple valid pathways into one consistent output. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are sorting/grouping. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Check edge cases: empty input, single-row input, and duplicate keys. + diff --git a/hard/2253. Dynamic Unpivoting of a Table (Hard).sql b/hard/2253. Dynamic Unpivoting of a Table (Hard).sql deleted file mode 100644 index 902f330..0000000 --- a/hard/2253. 
Dynamic Unpivoting of a Table (Hard).sql +++ /dev/null @@ -1,52 +0,0 @@ -CREATE OR REPLACE FUNCTION unpivot_products_2253() -RETURNS TEXT -LANGUAGE PLPGSQL -AS -$$ -DECLARE - stores_array TEXT[]; - query_text TEXT := ''; - store_name TEXT; -BEGIN - --query to find all the stores columns of the products table except product_id column - SELECT ARRAY_AGG(column_name) - INTO stores_array - FROM information_schema.columns - WHERE table_name = 'products_2253' AND column_name <> 'product_id'; - - -- prepare query - FOREACH store_name IN ARRAY stores_array - LOOP - query_text := query_text || 'SELECT product_id, ''' || store_name || ''' AS store, "' || store_name ||'" FROM products_2253 WHERE "' || store_name || '" IS NOT NULL'; - query_text := query_text || ' UNION '; - END LOOP; - - query_text := LEFT(query_text,LENGTH(query_text)-6); - query_text := query_text || ' ORDER BY product_id,store;'; - - --return the query as text - RETURN query_text; -END -$$; - -SELECT unpivot_products_2253(); - --- output of the function: -SELECT product_id, 'LC_Store' AS store, "LC_Store" -FROM products_2253 -WHERE "LC_Store" IS NOT NULL -UNION -SELECT product_id, 'Nozama' AS store, "Nozama" -FROM products_2253 -WHERE "Nozama" IS NOT NULL -UNION -SELECT product_id, 'Shop' AS store, "Shop" -FROM products_2253 -WHERE "Shop" IS NOT NULL -UNION -SELECT product_id, 'Souq' AS store, "Souq" -FROM products_2253 -WHERE "Souq" IS NOT NULL -ORDER BY product_id,store; - ---running this query manually will give us expected results diff --git a/hard/2362. Generate the Invoice (Hard).md b/hard/2362. Generate the Invoice (Hard).md new file mode 100644 index 0000000..a4752b4 --- /dev/null +++ b/hard/2362. Generate the Invoice (Hard).md @@ -0,0 +1,83 @@ +# Question 2362: Generate the Invoice + +**LeetCode URL:** https://leetcode.com/problems/generate-the-invoice/ + +## Description + +Drafted from this solution SQL: write a query on `purchases`, `products` to return `invoice_id`. 
Concretely: the invoice with the highest total price `SUM(quantity*price)` is chosen (ties broken by the smallest `invoice_id`, via `ORDER BY ... LIMIT 1`), and the query returns the `product_id`, `quantity`, and line total `price` of every product on that invoice. + +## Table Schema Structure + +```sql +Create table If Not Exists Products (product_id int, price int); +Create table If Not Exists Purchases (invoice_id int, product_id int, quantity int); +``` + +## Sample Input Data + +```sql +insert into Products (product_id, price) values ('1', '100'); +insert into Products (product_id, price) values ('2', '200'); +insert into Purchases (invoice_id, product_id, quantity) values ('1', '1', '2'); +insert into Purchases (invoice_id, product_id, quantity) values ('3', '2', '1'); +insert into Purchases (invoice_id, product_id, quantity) values ('2', '2', '3'); +insert into Purchases (invoice_id, product_id, quantity) values ('2', '1', '4'); +insert into Purchases (invoice_id, product_id, quantity) values ('4', '1', '10'); +``` + +## Expected Output Data + +```text ++------------+----------+--------+ +| product_id | quantity | price | ++------------+----------+--------+ +| sample | sample | sample | ++------------+----------+--------+ +``` + +## SQL Solution + +```sql +SELECT pc.product_id,pc.quantity,pc.quantity*pd.price AS price +FROM purchases_2362 pc +INNER JOIN products_2362 pd ON pc.product_id = pd.product_id +WHERE invoice_id = (SELECT pc.invoice_id + FROM purchases_2362 pc + INNER JOIN products_2362 pd ON pc.product_id = pd.product_id + GROUP BY pc.invoice_id + ORDER BY SUM(quantity*price) DESC,pc.invoice_id ASC + LIMIT 1); +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `product_id`, `quantity`, `price` from `purchases`, `products`. + +### Result Grain + +One row per product line on the invoice selected by the subquery; the subquery's `GROUP BY pc.invoice_id` only serves to pick that winning invoice. + +### Step-by-Step Logic + +1. Combine datasets using an INNER JOIN; the join predicate controls row matching and prevents accidental cartesian growth. +2.
Apply row-level filtering in `WHERE`: invoice_id = (SELECT pc.invoice_id FROM purchases_2362 pc INNER JOIN products_2362 pd ON pc.product_id = pd.product_id. +3. Aggregate rows with SUM grouped by pc.invoice_id. +4. Project final output columns: `product_id`, `quantity`, `price`. +5. Order output deterministically with `ORDER BY SUM(quantity*price) DESC,pc.invoice_id ASC`. + +### Why This Works + +Join conditions align related entities so each output row is built from the correct source records. Grouping keys define the reporting grain; aggregate functions then summarize values at exactly that grain. Predicate filtering removes irrelevant rows before expensive downstream computation. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are sorting/grouping, join operations, subquery execution. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently. +- Every non-aggregated selected column must belong to the grouping grain. +- If nullable columns are involved, validate null-handling explicitly (`IS NULL`, `COALESCE`). + diff --git a/hard/2362. Generate the Invoice (Hard).sql b/hard/2362. Generate the Invoice (Hard).sql deleted file mode 100644 index a4ad7ae..0000000 --- a/hard/2362. Generate the Invoice (Hard).sql +++ /dev/null @@ -1,9 +0,0 @@ -SELECT pc.product_id,pc.quantity,pc.quantity*pd.price AS price -FROM purchases_2362 pc -INNER JOIN products_2362 pd ON pc.product_id = pd.product_id -WHERE invoice_id = (SELECT pc.invoice_id - FROM purchases_2362 pc - INNER JOIN products_2362 pd ON pc.product_id = pd.product_id - GROUP BY pc.invoice_id - ORDER BY SUM(quantity*price) DESC,pc.invoice_id ASC - LIMIT 1); diff --git a/hard/2474. Customers With Strictly Increasing Purchases (Hard).md b/hard/2474. 
Customers With Strictly Increasing Purchases (Hard).md new file mode 100644 index 0000000..931551f --- /dev/null +++ b/hard/2474. Customers With Strictly Increasing Purchases (Hard).md @@ -0,0 +1,134 @@ +# Question 2474: Customers With Strictly Increasing Purchases + +**LeetCode URL:** https://leetcode.com/problems/customers-with-strictly-increasing-purchases/ + +## Description + +Return the ids of customers whose total purchase amount, summed per calendar year, was strictly increasing in every year from the year of their first order through the year of their last order. + +## Table Schema Structure + +```sql +Create table If Not Exists Orders (order_id int, customer_id int, order_date date, price int); +``` + +## Sample Input Data + +```sql +insert into Orders (order_id, customer_id, order_date, price) values ('1', '1', '2019-07-01', '1100'); +insert into Orders (order_id, customer_id, order_date, price) values ('2', '1', '2019-11-01', '1200'); +insert into Orders (order_id, customer_id, order_date, price) values ('3', '1', '2020-05-26', '3000'); +insert into Orders (order_id, customer_id, order_date, price) values ('4', '1', '2021-08-31', '3100'); +insert into Orders (order_id, customer_id, order_date, price) values ('5', '1', '2022-12-07', '4700'); +insert into Orders (order_id, customer_id, order_date, price) values ('6', '2', '2015-01-01', '700'); +insert into Orders (order_id, customer_id, order_date, price) values ('7', '2', '2017-11-07', '1000'); +insert into Orders (order_id, customer_id, order_date, price) values ('8', '3', '2017-01-01', '900'); +insert into Orders (order_id, customer_id, order_date, price) values ('9', '3', '2018-11-07', '900'); +``` + +## Expected
Output Data + +```text ++-------------+ +| customer_id | ++-------------+ +| sample | ++-------------+ +``` + +## SQL Solution + +```sql +WITH RECURSIVE customer_purchase_years AS ( + SELECT customer_id,MIN(EXTRACT(YEAR FROM order_date)) AS min_year,MAX(EXTRACT(YEAR FROM order_date)) AS max_year + FROM orders_2474 + GROUP BY customer_id +), +all_years AS ( + SELECT customer_id,min_year AS year,max_year + FROM customer_purchase_years + UNION + SELECT customer_id,year+1 AS year,max_year + FROM all_years + WHERE year 0); + +--OR-- + +WITH cte AS( + SELECT customer_id, EXTRACT(YEAR FROM order_date) AS year, price + FROM orders_2474 +), +cte1 AS( + SELECT customer_id,year,SUM(price) AS prices + FROM cte + GROUP BY customer_id,year +), +cte2 AS( + SELECT *, + DENSE_RANK() OVER(PARTITION BY customer_id ORDER BY prices) AS rn + FROM cte1 +), +cte3 AS( + SELECT DISTINCT customer_id,year-rn AS new_line + FROM cte2 +) +SELECT customer_id +FROM cte3 +GROUP BY customer_id +HAVING (COUNT(new_line)=1); +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `customer_id` from `order_date`, `orders`, `customer_purchase_years`, `all_years`. + +### Result Grain + +One row per unique key in `GROUP BY customer_id`. + +### Step-by-Step Logic + +1. Create CTE layers (`all_years`, `all_year_purchases`, `ranked`, `cte`, `cte1`, `cte2`) to transform data incrementally; recursion expands rows level-by-level until the stop condition is met. +2. CTE `customer_purchase_years`: reads `order_date`, `orders`. +3. CTE `all_years`: reads `customer_purchase_years`, `all_years`. +4. CTE `all_year_purchases`: reads `all_years`, `orders`, `o`, joins related entities. +5. CTE `ranked`: reads `all_year_purchases`, computes window metrics. +6. Combine datasets using LEFT JOIN, JOIN. Join predicates control row matching and prevent accidental cartesian growth. +7. Aggregate rows with COUNT, SUM, MIN, MAX grouped by customer_id. +8. Use window functions (`... 
OVER (...)`) to compute rankings/running metrics while preserving row-level detail. +9. Project final output columns: `customer_id`. +10. Filter aggregated groups in `HAVING`: (COUNT(new_line)=1). + +### Why This Works + +CTEs separate transformation stages, which makes dependencies explicit and easier to validate. Join conditions align related entities so each output row is built from the correct source records. Grouping keys define the reporting grain; aggregate functions then summarize values at exactly that grain. Window expressions calculate comparative metrics without collapsing rows too early. `HAVING` ensures only groups that satisfy business thresholds survive. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are sorting/grouping, window partitions, recursive expansion, join operations. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently. +- Every non-aggregated selected column must belong to the grouping grain. +- Window `PARTITION BY`/`ORDER BY` choices change semantics; test with tied values and nulls. +- Recursive CTEs need a strict termination condition to avoid runaway recursion. + diff --git a/hard/2474. Customers With Strictly Increasing Purchases (Hard).sql b/hard/2474. Customers With Strictly Increasing Purchases (Hard).sql deleted file mode 100644 index 0f1c339..0000000 --- a/hard/2474. 
Customers With Strictly Increasing Purchases (Hard).sql +++ /dev/null @@ -1,53 +0,0 @@ -WITH RECURSIVE customer_purchase_years AS ( - SELECT customer_id,MIN(EXTRACT(YEAR FROM order_date)) AS min_year,MAX(EXTRACT(YEAR FROM order_date)) AS max_year - FROM orders_2474 - GROUP BY customer_id -), -all_years AS ( - SELECT customer_id,min_year AS year,max_year - FROM customer_purchase_years - UNION - SELECT customer_id,year+1 AS year,max_year - FROM all_years - WHERE year 0); - ---OR-- - -WITH cte AS( - SELECT customer_id, EXTRACT(YEAR FROM order_date) AS year, price - FROM orders_2474 -), -cte1 AS( - SELECT customer_id,year,SUM(price) AS prices - FROM cte - GROUP BY customer_id,year -), -cte2 AS( - SELECT *, - DENSE_RANK() OVER(PARTITION BY customer_id ORDER BY prices) AS rn - FROM cte1 -), -cte3 AS( - SELECT DISTINCT customer_id,year-rn AS new_line - FROM cte2 -) -SELECT customer_id -FROM cte3 -GROUP BY customer_id -HAVING (COUNT(new_line)=1); diff --git a/hard/262. Trips and Users.md b/hard/262. Trips and Users.md new file mode 100644 index 0000000..bb1bd02 --- /dev/null +++ b/hard/262. Trips and Users.md @@ -0,0 +1,116 @@ +# Question 262: Trips and Users + +**LeetCode URL:** https://leetcode.com/problems/trips-and-users/ + +## Description + +Write a solution to find the cancellation rate of requests with unbanned users (both client and driver must not be banned) each day between "2013-10-01" and "2013-10-03" with at least one trip. Return the result table in any order. The result format is in the following example. 
+ +## Table Schema Structure + +```sql +Create table If Not Exists Trips (id int, client_id int, driver_id int, city_id int, status ENUM('completed', 'cancelled_by_driver', 'cancelled_by_client'), request_at varchar(50)); +Create table If Not Exists Users (users_id int, banned varchar(50), role ENUM('client', 'driver', 'partner')); +``` + +## Sample Input Data + +```sql +insert into Trips (id, client_id, driver_id, city_id, status, request_at) values ('1', '1', '10', '1', 'completed', '2013-10-01'); +insert into Trips (id, client_id, driver_id, city_id, status, request_at) values ('2', '2', '11', '1', 'cancelled_by_driver', '2013-10-01'); +insert into Trips (id, client_id, driver_id, city_id, status, request_at) values ('3', '3', '12', '6', 'completed', '2013-10-01'); +insert into Trips (id, client_id, driver_id, city_id, status, request_at) values ('4', '4', '13', '6', 'cancelled_by_client', '2013-10-01'); +insert into Trips (id, client_id, driver_id, city_id, status, request_at) values ('5', '1', '10', '1', 'completed', '2013-10-02'); +insert into Trips (id, client_id, driver_id, city_id, status, request_at) values ('6', '2', '11', '6', 'completed', '2013-10-02'); +insert into Trips (id, client_id, driver_id, city_id, status, request_at) values ('7', '3', '12', '6', 'completed', '2013-10-02'); +insert into Trips (id, client_id, driver_id, city_id, status, request_at) values ('8', '2', '12', '12', 'completed', '2013-10-03'); +insert into Trips (id, client_id, driver_id, city_id, status, request_at) values ('9', '3', '10', '12', 'completed', '2013-10-03'); +insert into Trips (id, client_id, driver_id, city_id, status, request_at) values ('10', '4', '13', '12', 'cancelled_by_driver', '2013-10-03'); +insert into Users (users_id, banned, role) values ('1', 'No', 'client'); +insert into Users (users_id, banned, role) values ('2', 'Yes', 'client'); +insert into Users (users_id, banned, role) values ('3', 'No', 'client'); +insert into Users (users_id, banned, role) 
values ('4', 'No', 'client'); +insert into Users (users_id, banned, role) values ('10', 'No', 'driver'); +insert into Users (users_id, banned, role) values ('11', 'No', 'driver'); +insert into Users (users_id, banned, role) values ('12', 'No', 'driver'); +insert into Users (users_id, banned, role) values ('13', 'No', 'driver'); +``` + +## Expected Output Data + +```text ++------------+-------------------+ +| Day | Cancellation Rate | ++------------+-------------------+ +| 2013-10-01 | 0.33 | +| 2013-10-02 | 0.00 | +| 2013-10-03 | 0.50 | ++------------+-------------------+ +``` + +## SQL Solution + +```sql +WITH cancelled AS( +SELECT t.request_at,COUNT(*) AS cancelled_count +FROM trips_262 t +JOIN users_262 c ON t.client_id = c.user_id AND c.banned like 'No' +JOIN users_262 d ON t.driver_id = d.user_id AND d.banned like 'No' +WHERE t.status LIKE 'cancelled_by_client' OR t.status LIKE 'cancelled_by_driver' +GROUP BY t.request_at), + +total AS( +SELECT t.request_at,COUNT(*) AS total_count +FROM trips_262 t +JOIN users_262 c ON t.client_id = c.user_id AND c.banned like 'No' +JOIN users_262 d ON t.driver_id = d.user_id AND d.banned like 'No' +GROUP BY t.request_at) + +SELECT t.request_at,(COALESCE(c.cancelled_count::FLOAT,0.0)/t.total_count::FLOAT) +FROM cancelled c +RIGHT JOIN total t ON c.request_at = t.request_at; + +(OR) + +SELECT request_at,ROUND(COUNT(CASE WHEN status <> 'completed' THEN 1 ELSE NULL END)::NUMERIC/COUNT(*),2) AS cancellation_rate +FROM trips_262 +WHERE request_at BETWEEN '2013-10-01' AND '2013-10-03' AND + client_id NOT IN (SELECT user_id FROM users_262 WHERE banned LIKE 'Yes' AND role LIKE 'client') AND + driver_id NOT IN (SELECT user_id FROM users_262 WHERE banned LIKE 'Yes' AND role LIKE 'driver') +GROUP BY request_at; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `request_at`, `cancellation_rate` from `trips`, `users`, `cancelled`, `total`. 
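The conditional-aggregation pattern from the second query can be sanity-checked with Python's bundled `sqlite3` module. This is a minimal sketch using a simplified, hypothetical `trips` table (no `users` join, made-up rows), not the repository's PostgreSQL setup:

```python
import sqlite3

# In-memory reproduction of the day-level cancellation-rate logic:
# cancelled trips divided by total trips, per request day.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE trips (request_at TEXT, status TEXT)")
con.executemany(
    "INSERT INTO trips VALUES (?, ?)",
    [
        ("2013-10-01", "completed"),
        ("2013-10-01", "cancelled_by_driver"),
        ("2013-10-01", "completed"),
        ("2013-10-02", "completed"),
    ],
)
rows = con.execute(
    """
    SELECT request_at,
           ROUND(SUM(CASE WHEN status <> 'completed' THEN 1 ELSE 0 END) * 1.0
                 / COUNT(*), 2) AS cancellation_rate
    FROM trips
    GROUP BY request_at
    ORDER BY request_at
    """
).fetchall()
print(rows)  # [('2013-10-01', 0.33), ('2013-10-02', 0.0)]
```

Multiplying by `1.0` before dividing avoids integer division, the same reason the PostgreSQL version casts to `NUMERIC`.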
+ +### Result Grain + +One row per unique key in `GROUP BY request_at`. + +### Step-by-Step Logic + +1. Create CTE layers (`cancelled`, `total`) to decompose the logic into smaller, testable steps before the final SELECT. +2. CTE `cancelled`: joins `trips` to `users` twice (client and driver) to count cancelled trips per day among unbanned users. +3. CTE `total`: performs the same joins to count all trips per day among unbanned users. +4. Combine datasets using RIGHT JOIN, JOIN. Join predicates control row matching and prevent accidental cartesian growth. +5. Apply row-level filtering in `WHERE`: request_at BETWEEN '2013-10-01' AND '2013-10-03' AND client_id NOT IN (SELECT user_id FROM users_262 WHERE banned LIKE 'Yes' AND role LIKE 'client') AND driver_id NOT IN (SELECT user_id FROM users_262 WHERE banned LIKE 'Yes' AND role LIKE 'driver'). +6. Aggregate rows with COUNT, ROUND grouped by request_at. +7. Project final output columns: `request_at`, `cancellation_rate`. + +### Why This Works + +CTEs separate transformation stages, which makes dependencies explicit and easier to validate. Join conditions align related entities so each output row is built from the correct source records. Grouping keys define the reporting grain; aggregate functions then summarize values at exactly that grain. Predicate filtering removes irrelevant rows before expensive downstream computation. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are sorting/grouping, join operations, subquery execution. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently. +- Every non-aggregated selected column must belong to the grouping grain. +- If nullable columns are involved, validate null-handling explicitly (`IS NULL`, `COALESCE`). + diff --git a/hard/262. Trips and Users.sql b/hard/262. Trips and Users.sql deleted file mode 100644 index 156cbd2..0000000 --- a/hard/262. 
Trips and Users.sql +++ /dev/null @@ -1,27 +0,0 @@ -WITH cancelled AS( -SELECT t.request_at,COUNT(*) AS cancelled_count -FROM trips_262 t -JOIN users_262 c ON t.client_id = c.user_id AND c.banned like 'No' -JOIN users_262 d ON t.driver_id = d.user_id AND d.banned like 'No' -WHERE t.status LIKE 'cancelled_by_client' OR t.status LIKE 'cancelled_by_driver' -GROUP BY t.request_at), - -total AS( -SELECT t.request_at,COUNT(*) AS total_count -FROM trips_262 t -JOIN users_262 c ON t.client_id = c.user_id AND c.banned like 'No' -JOIN users_262 d ON t.driver_id = d.user_id AND d.banned like 'No' -GROUP BY t.request_at) - -SELECT t.request_at,(COALESCE(c.cancelled_count::FLOAT,0.0)/t.total_count::FLOAT) -FROM cancelled c -RIGHT JOIN total t ON c.request_at = t.request_at; - -(OR) - -SELECT request_at,ROUND(COUNT(CASE WHEN status <> 'completed' THEN 1 ELSE NULL END)::NUMERIC/COUNT(*),2) AS cancellation_rate -FROM trips_262 -WHERE request_at BETWEEN '2013-10-01' AND '2013-10-03' AND - client_id NOT IN (SELECT user_id FROM users_262 WHERE banned LIKE 'Yes' AND role LIKE 'client') AND - driver_id NOT IN (SELECT user_id FROM users_262 WHERE banned LIKE 'Yes' AND role LIKE 'driver') -GROUP BY request_at; diff --git a/hard/569. Median Employee Salary.md b/hard/569. Median Employee Salary.md new file mode 100644 index 0000000..124cd16 --- /dev/null +++ b/hard/569. Median Employee Salary.md @@ -0,0 +1,90 @@ +# Question 569: Median Employee Salary + +**LeetCode URL:** https://leetcode.com/problems/median-employee-salary/ + +## Description + +The Employee table holds all employees. The employee table has three columns: Employee Id, Company Name, and Salary. Write a SQL query to find the median salary of each company. 
Bonus points if you can solve it without using any built-in SQL functions. + +## Table Schema Structure + +```sql +Create table If Not Exists Employee (id int, company varchar(255), salary int); +``` + +## Sample Input Data + +```sql +insert into Employee (id, company, salary) values ('1', 'A', '2341'); +insert into Employee (id, company, salary) values ('2', 'A', '341'); +insert into Employee (id, company, salary) values ('3', 'A', '15'); +insert into Employee (id, company, salary) values ('4', 'A', '15314'); +insert into Employee (id, company, salary) values ('5', 'A', '451'); +insert into Employee (id, company, salary) values ('6', 'A', '513'); +insert into Employee (id, company, salary) values ('7', 'B', '15'); +insert into Employee (id, company, salary) values ('8', 'B', '13'); +insert into Employee (id, company, salary) values ('9', 'B', '1154'); +insert into Employee (id, company, salary) values ('10', 'B', '1345'); +insert into Employee (id, company, salary) values ('11', 'B', '1221'); +insert into Employee (id, company, salary) values ('12', 'B', '234'); +insert into Employee (id, company, salary) values ('13', 'C', '2345'); +insert into Employee (id, company, salary) values ('14', 'C', '2645'); +insert into Employee (id, company, salary) values ('15', 'C', '2645'); +insert into Employee (id, company, salary) values ('16', 'C', '2652'); +insert into Employee (id, company, salary) values ('17', 'C', '65'); +``` + +## Expected Output Data + +```text ++--------+---------+--------+ +| id | company | salary | ++--------+---------+--------+ +| 5 | A | 451 | +| 6 | A | 513 | +| 12 | B | 234 | +| 9 | B | 1154 | +| 14 | C | 2645 | ++--------+---------+--------+ +``` + +## SQL Solution + +```sql +WITH cte AS ( + SELECT id,company,salary, + ABS(ROW_NUMBER() OVER (PARTITION BY company ORDER BY salary,id) - ROW_NUMBER() OVER (PARTITION BY company ORDER BY salary DESC,id DESC)) AS diff + FROM employee_569 +) +SELECT id,company,salary +FROM cte +WHERE diff = 0 OR diff = 1; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `id`, 
`company`, `salary` from `employee`, `cte`. + +### Result Grain + +Row grain follows the post-filter join output. + +### Step-by-Step Logic + +1. Create CTE layers (`cte`) to decompose the logic into smaller, testable steps before the final SELECT. +2. CTE `cte`: reads `employee`, computes window metrics. +3. Apply row-level filtering in `WHERE`: diff = 0 OR diff = 1. +4. Use window functions (`... OVER (...)`) to compute rankings/running metrics while preserving row-level detail. +5. Project final output columns: `id`, `company`, `salary`. + +### Why This Works + +CTEs separate transformation stages, which makes dependencies explicit and easier to validate. Window expressions calculate comparative metrics without collapsing rows too early. Predicate filtering removes irrelevant rows before expensive downstream computation. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are window partitions. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Window `PARTITION BY`/`ORDER BY` choices change semantics; test with tied values and nulls. +- If nullable columns are involved, validate null-handling explicitly (`IS NULL`, `COALESCE`). + diff --git a/hard/569. Median Employee Salary.sql b/hard/569. Median Employee Salary.sql deleted file mode 100644 index ffa42fb..0000000 --- a/hard/569. Median Employee Salary.sql +++ /dev/null @@ -1,8 +0,0 @@ -WITH cte AS ( - SELECT id,company,salary, - ABS(ROW_NUMBER() OVER (PARTITION BY company ORDER BY salary,id) - ROW_NUMBER() OVER (PARTITION BY company ORDER BY salary DESC,id DESC)) AS diff - FROM employee_569 -) -SELECT id,company,salary -FROM cte -WHERE diff = 0 OR diff = 1; diff --git a/hard/571. Find Median Given Frequency of Numbers.md b/hard/571. Find Median Given Frequency of Numbers.md new file mode 100644 index 0000000..0b4d45b --- /dev/null +++ b/hard/571. 
Find Median Given Frequency of Numbers.md @@ -0,0 +1,133 @@ +# Question 571: Find Median Given Frequency of Numbers + +**LeetCode URL:** https://leetcode.com/problems/find-median-given-frequency-of-numbers/ + +## Description + +The Numbers table keeps the value of number and its frequency. In this table, the numbers are 0, 0, 0, 0, 0, 0, 0, 1, 2, 2, 2, 3, so the median is (0 + 0) / 2 = 0. + +## Table Schema Structure + +```sql +Create table If Not Exists Numbers (num int, frequency int); +``` + +## Sample Input Data + +```sql +insert into Numbers (num, frequency) values ('0', '7'); +insert into Numbers (num, frequency) values ('1', '1'); +insert into Numbers (num, frequency) values ('2', '3'); +insert into Numbers (num, frequency) values ('3', '1'); +``` + +## Expected Output Data + +```text ++------------------+ +| median | ++------------------+ +| 0.0 | ++------------------+ +``` + +## SQL Solution + +```sql +WITH cte AS( + SELECT *, + SUM(frequency) OVER w AS e, + SUM(frequency) OVER () AS t + FROM numbers_571_tc_2 + WINDOW w AS (ORDER BY number) +), +cte2 AS( + SELECT number,frequency, + CASE WHEN (LAG(e::INT,1) OVER w) IS NULL THEN 1 ELSE (LAG(e::INT,1) OVER w)+1 END AS s, + e,t + FROM cte + WINDOW w AS (ORDER BY number) +) + +SELECT ROUND(AVG(number),1) +FROM cte2 +WHERE (t::NUMERIC/2 BETWEEN s AND e) OR (t::NUMERIC/2+1 BETWEEN s AND e); + +--------------------------- OR --------------------------- + +WITH RECURSIVE cte AS ( + SELECT number,frequency,1 AS cnt + FROM numbers_571 + UNION ALL + SELECT number,frequency,cnt+1 AS cnt + FROM cte + WHERE cnt < frequency +), +cte2 AS ( + SELECT number, + ROW_NUMBER() OVER (ORDER BY number) AS a, + COUNT(*) OVER () c + FROM cte +) +SELECT ROUND(AVG(number),1) +FROM cte2 +WHERE a BETWEEN (SELECT CEIL(AVG(c)::NUMERIC/2) FROM cte2) AND (SELECT CEIL((AVG(c)+1::NUMERIC)/2) FROM cte2) + +--------------------------- OR --------------------------- + +WITH RECURSIVE cte AS ( + SELECT number,frequency,1 AS cnt + FROM 
numbers_571 + UNION ALL + SELECT number,frequency,cnt+1 AS cnt + FROM cte + WHERE cnt < frequency +), +cte2 AS ( + SELECT number, + ROW_NUMBER() OVER (ORDER BY number) AS a + FROM cte +), +cte3 AS ( + SELECT number,a, + ROW_NUMBER() OVER (ORDER BY a DESC) AS d + FROM cte2 +) +SELECT ROUND(AVG(number),1) +FROM cte3 +WHERE ABS(a-d) = 0 OR ABS(a-d) = 1; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns the required output columns from `numbers_571_tc`, `cte`, `cte2`, `numbers`. + +### Result Grain + +Row grain follows the post-filter join output. + +### Step-by-Step Logic + +1. Create CTE layers (`cte`, `cte2`, `cte3`) to transform data incrementally; recursion expands rows level-by-level until the stop condition is met. +2. CTE `cte`: reads `numbers_571_tc`, computes window metrics. +3. CTE `cte2`: reads `cte`. +4. Apply row-level filtering in `WHERE`: ABS(a-d) = 0 OR ABS(a-d) = 1. +5. Use window functions (`... OVER (...)`) to compute rankings/running metrics while preserving row-level detail. +6. Merge compatible result sets with `UNION`/`UNION ALL` before final projection. + +### Why This Works + +CTEs separate transformation stages, which makes dependencies explicit and easier to validate. Window expressions calculate comparative metrics without collapsing rows too early. Predicate filtering removes irrelevant rows before expensive downstream computation. Set-union logic combines multiple valid pathways into one consistent output. + +### Performance Notes + +Primary cost drivers are window partitions, recursive expansion. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Window `PARTITION BY`/`ORDER BY` choices change semantics; test with tied values and nulls. +- If nullable columns are involved, validate null-handling explicitly (`IS NULL`, `COALESCE`). +- Recursive CTEs need a strict termination condition to avoid runaway recursion. + diff --git a/hard/571. 
Find Median Given Frequency of Numbers.sql b/hard/571. Find Median Given Frequency of Numbers.sql deleted file mode 100644 index ea5cff8..0000000 --- a/hard/571. Find Median Given Frequency of Numbers.sql +++ /dev/null @@ -1,63 +0,0 @@ -WITH cte AS( - SELECT *, - SUM(frequency) OVER w AS e, - SUM(frequency) OVER () AS t - FROM numbers_571_tc_2 - WINDOW w AS (ORDER BY number) -), -cte2 AS( - SELECT number,frequency, - CASE WHEN (LAG(e::INT,1) OVER w) IS NULL THEN 1 ELSE (LAG(e::INT,1) OVER w)+1 END AS s, - e,t - FROM cte - WINDOW w AS (ORDER BY number) -) - -SELECT ROUND(AVG(number),1) -FROM cte2 -WHERE (t::NUMERIC/2 BETWEEN s AND e) OR (t::NUMERIC/2+1 BETWEEN s AND e); - ---------------------------- OR --------------------------- - -WITH RECURSIVE cte AS ( - SELECT number,frequency,1 AS cnt - FROM numbers_571 - UNION ALL - SELECT number,frequency,cnt+1 AS cnt - FROM cte - WHERE cnt < frequency -), -cte2 AS ( - SELECT number, - ROW_NUMBER() OVER (ORDER BY number) AS a, - COUNT(*) OVER () c - FROM cte -) -SELECT ROUND(AVG(number),1) -FROM cte2 -WHERE a BETWEEN (SELECT CEIL(AVG(c)::NUMERIC/2) FROM cte2) AND (SELECT CEIL((AVG(c)+1::NUMERIC)/2) FROM cte2) - ---------------------------- OR --------------------------- - -WITH RECURSIVE cte AS ( - SELECT number,frequency,1 AS cnt - FROM numbers_571 - UNION ALL - SELECT number,frequency,cnt+1 AS cnt - FROM cte - WHERE cnt < frequency -), -cte2 AS ( - SELECT number, - ROW_NUMBER() OVER (ORDER BY number) AS a - FROM cte -), -cte3 AS ( - SELECT number,a, - ROW_NUMBER() OVER (ORDER BY a DESC) AS d - FROM cte2 -) -SELECT ROUND(AVG(number),1) -FROM cte3 -WHERE ABS(a-d) = 0 OR ABS(a-d) = 1; - diff --git a/hard/579. Find Cumulative Salary of an Employee.md b/hard/579. Find Cumulative Salary of an Employee.md new file mode 100644 index 0000000..03fd3bb --- /dev/null +++ b/hard/579. 
Find Cumulative Salary of an Employee.md @@ -0,0 +1,102 @@ +# Question 579: Find Cumulative Salary of an Employee + +**LeetCode URL:** https://leetcode.com/problems/find-cumulative-salary-of-an-employee/ + +## Description + +The Employee table holds the salary information in a year. Write a SQL to get the cumulative sum of an employee's salary over a period of 3 months but exclude the most recent month. The result should be displayed by Id ascending, and then by Month descending. + +## Table Schema Structure + +```sql +Create table If Not Exists Employee (id int, month int, salary int); +``` + +## Sample Input Data + +```sql +insert into Employee (id, month, salary) values ('1', '1', '20'); +insert into Employee (id, month, salary) values ('2', '1', '20'); +insert into Employee (id, month, salary) values ('1', '2', '30'); +insert into Employee (id, month, salary) values ('2', '2', '30'); +insert into Employee (id, month, salary) values ('3', '2', '40'); +insert into Employee (id, month, salary) values ('1', '3', '40'); +insert into Employee (id, month, salary) values ('3', '3', '60'); +insert into Employee (id, month, salary) values ('1', '4', '60'); +insert into Employee (id, month, salary) values ('3', '4', '70'); +insert into Employee (id, month, salary) values ('1', '7', '90'); +insert into Employee (id, month, salary) values ('1', '8', '90'); +``` + +## Expected Output Data + +```text ++----+-------+--------+ +| id | month | salary | ++----+-------+--------+ +| 1 | 7 | 90 | +| 1 | 4 | 130 | +| 1 | 3 | 90 | +| 1 | 2 | 50 | +| 1 | 1 | 20 | +| 2 | 1 | 20 | +| 3 | 3 | 100 | +| 3 | 2 | 40 | ++----+-------+--------+ +``` + +## SQL Solution + +```sql +WITH cte AS( + SELECT id,month, + SUM(salary) OVER (w RANGE BETWEEN 2 PRECEDING AND CURRENT ROW) AS salary, + ROW_NUMBER() OVER w AS row_num, + COUNT(*) OVER w1 AS count + FROM employee_579 + WINDOW w AS (PARTITION BY id ORDER BY month), + w1 AS (PARTITION BY id) +) + +SELECT id,month,salary +FROM cte +WHERE row_num < count +ORDER BY id,month DESC; +``` + +## Solution Breakdown + +The windowed `SUM` totals each employee's salary over the current month and the two preceding months (`RANGE BETWEEN 2 PRECEDING AND CURRENT ROW`), while `row_num` and `count` identify each employee's most recent month so the final `WHERE row_num < count` can exclude it. + +# Question 601: Human Traffic of Stadium + +**LeetCode URL:** https://leetcode.com/problems/human-traffic-of-stadium/ + +## Description + +Write a solution to display the records with three or more rows with consecutive ids, where the number of people is greater than or equal to 100 for each. Return the result table ordered by visit_date in ascending order. + +## Table Schema Structure + +```sql +Create table If Not Exists Stadium (id int, visit_date date, people int); +``` + +## Sample Input Data + +```sql +insert into Stadium (id, visit_date, people) values ('1', '2017-01-01', '10'); +insert into Stadium (id, visit_date, people) values ('2', '2017-01-02', '109'); +insert into Stadium (id, visit_date, people) values ('3', '2017-01-03', '150'); +insert into Stadium (id, visit_date, people) values ('4', '2017-01-04', '99'); +insert into Stadium (id, visit_date, people) values ('5', '2017-01-05', '145'); +insert into Stadium (id, visit_date, people) values ('6', '2017-01-06', '1455'); +insert into Stadium (id, visit_date, people) values ('7', '2017-01-07', '199'); +insert into Stadium (id, visit_date, people) values ('8', '2017-01-09', '188'); +``` + +## Expected Output Data + +```text ++------+------------+-----------+ +| id | visit_date | people | ++------+------------+-----------+ +| 5 | 2017-01-05 | 145 | +| 6 | 2017-01-06 | 1455 | +| 7 | 2017-01-07 | 199 | +| 8 | 2017-01-09 | 188 | ++------+------------+-----------+ +``` + +## SQL Solution + +```sql +SELECT DISTINCT s1.* +FROM stadium_601 s1 +JOIN stadium_601 s2 +JOIN stadium_601 s3 +ON ((s1.id = s2.id-1 AND s1.id = s3.id-2) OR (s1.id = s2.id+1 AND s1.id = s3.id-1) OR (s1.id = s2.id+1 AND s1.id = s3.id+2)) +WHERE s1.people >= 100 AND s2.people >= 100 AND s3.people>=100 +ORDER BY visit_date; + +------------------------------------------------------------------------------------------------------------------------------------ + +WITH ranked AS( + SELECT *, + ROW_NUMBER() OVER w AS rn, + (id - ROW_NUMBER() OVER w) AS diff 
+ FROM stadium_601 + WHERE people>=100 + WINDOW w AS (ORDER BY visit_date) +), +consecutive AS( + SELECT diff,COUNT(diff) count + FROM ranked + GROUP BY diff +) + +SELECT id,visit_date,people +FROM ranked r +LEFT JOIN consecutive c ON r.diff = c.diff +WHERE c.count >=3 +ORDER BY visit_date; + +------------------------------------------------------------------------------------------------------------------------------------ + +WITH ranked AS ( + SELECT *, + id-ROW_NUMBER() OVER (ORDER BY id) AS diff + FROM stadium_601 + WHERE people >= 100 +), +consecutives AS ( + SELECT *, + COUNT(id) OVER (PARTITION BY diff) AS cnt + FROM ranked +) +SELECT id,visit_date,people +FROM consecutives +WHERE cnt >= 3 +ORDER BY visit_date; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `id`, `visit_date`, `people` from `stadium`, `ranked`, `consecutive`, `consecutives`. + +### Result Grain + +Row grain follows the post-filter join output. + +### Step-by-Step Logic + +1. Create CTE layers (`ranked`, `consecutive`, `consecutives`) to decompose the logic into smaller, testable steps before the final SELECT. +2. CTE `ranked`: reads `stadium`. +3. CTE `consecutive`: reads `ranked`. +4. Combine datasets using LEFT JOIN, JOIN. Join predicates control row matching and prevent accidental cartesian growth. +5. Apply row-level filtering in `WHERE`: cnt >= 3. +6. Use window functions (`... OVER (...)`) to compute rankings/running metrics while preserving row-level detail. +7. Project final output columns: `id`, `visit_date`, `people`. +8. Order output deterministically with `ORDER BY visit_date`. + +### Why This Works + +CTEs separate transformation stages, which makes dependencies explicit and easier to validate. Join conditions align related entities so each output row is built from the correct source records. Window expressions calculate comparative metrics without collapsing rows too early. 
Predicate filtering removes irrelevant rows before expensive downstream computation. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are sorting/grouping, window partitions, join operations. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently. +- Window `PARTITION BY`/`ORDER BY` choices change semantics; test with tied values and nulls. +- If nullable columns are involved, validate null-handling explicitly (`IS NULL`, `COALESCE`). + diff --git a/hard/601. Human Traffic of Stadium.sql b/hard/601. Human Traffic of Stadium.sql deleted file mode 100644 index eb90383..0000000 --- a/hard/601. Human Traffic of Stadium.sql +++ /dev/null @@ -1,47 +0,0 @@ -SELECT DISTINCT s1.* -FROM stadium_601 s1 -JOIN stadium_601 s2 -JOIN stadium_601 s3 -ON ((s1.id = s2.id-1 AND s1.id = s3.id-2) OR (s1.id = s2.id+1 AND s1.id = s3.id-1) OR (s1.id = s2.id+1 AND s1.id = s3.id+2)) -WHERE s1.people >= 100 AND s2.people >= 100 AND s3.people>=100 -ORDER BY visit_date; - ------------------------------------------------------------------------------------------------------------------------------------- - -WITH ranked AS( - SELECT *, - ROW_NUMBER() OVER w AS rn, - (id - ROW_NUMBER() OVER w) AS diff - FROM stadium_601 - WHERE people>=100 - WINDOW w AS (ORDER BY visit_date) -), -consecutive AS( - SELECT diff,COUNT(diff) count - FROM ranked - GROUP BY diff -) - -SELECT id,visit_date,people -FROM ranked r -LEFT JOIN consecutive c ON r.diff = c.diff -WHERE c.count >=3 -ORDER BY visit_date; - ------------------------------------------------------------------------------------------------------------------------------------- - -WITH ranked AS ( - SELECT *, - id-ROW_NUMBER() OVER (ORDER BY id) AS diff - FROM stadium_601 - WHERE people >= 100 -), -consecutives AS ( - SELECT *, - COUNT(id) OVER 
(PARTITION BY diff) AS cnt - FROM ranked -) -SELECT id,visit_date,people -FROM consecutives -WHERE cnt >= 3 -ORDER BY visit_date; diff --git a/hard/615. Average Salary: Departments VS Company.md b/hard/615. Average Salary: Departments VS Company.md new file mode 100644 index 0000000..6b5a1f9 --- /dev/null +++ b/hard/615. Average Salary: Departments VS Company.md @@ -0,0 +1,84 @@ +# Question 615: Average Salary: Departments VS Company + +**LeetCode URL:** https://leetcode.com/problems/average-salary-departments-vs-company/ + +## Description + +Given the two tables below, write a query to display the comparison result (higher/lower/same) of the average salary of employees in a department to the company's average salary. + +## Table Schema Structure + +```sql +Create table If Not Exists Salary (id int, employee_id int, amount int, pay_date date); +Create table If Not Exists Employee (employee_id int, department_id int); +``` + +## Sample Input Data + +```sql +insert into Salary (id, employee_id, amount, pay_date) values ('1', '1', '9000', '2017/03/31'); +insert into Salary (id, employee_id, amount, pay_date) values ('2', '2', '6000', '2017/03/31'); +insert into Salary (id, employee_id, amount, pay_date) values ('3', '3', '10000', '2017/03/31'); +insert into Salary (id, employee_id, amount, pay_date) values ('4', '1', '7000', '2017/02/28'); +insert into Salary (id, employee_id, amount, pay_date) values ('5', '2', '6000', '2017/02/28'); +insert into Salary (id, employee_id, amount, pay_date) values ('6', '3', '8000', '2017/02/28'); +insert into Employee (employee_id, department_id) values ('1', '1'); +insert into Employee (employee_id, department_id) values ('2', '2'); +insert into Employee (employee_id, department_id) values ('3', '2'); +``` + +## Expected Output Data + +```text ++-----------+---------------+------------+ +| pay_month | department_id | comparison | ++-----------+---------------+------------+ +| 2017-02 | 1 | same | +| 2017-02 | 2 | same | +| 2017-03 | 1 | higher | +| 2017-03 | 2 | lower | ++-----------+---------------+------------+ +``` + +## SQL Solution + +```sql +SELECT DISTINCT TO_CHAR(pay_date,'YYYY-MM') AS pay_month,b.department_id, + CASE WHEN (AVG(amount) OVER w1) = (AVG(amount) OVER w2) THEN 'same' + WHEN (AVG(amount) OVER w1) > (AVG(amount) OVER w2) THEN 'lower' + ELSE 'higher' + END AS comparison +FROM salary_615 a +JOIN employee_615 b ON a.employee_id=b.employee_id +WINDOW w1 AS (PARTITION BY TO_CHAR(pay_date,'YYYY-MM')), + w2 AS (PARTITION BY TO_CHAR(pay_date,'YYYY-MM'),department_id) +ORDER BY 1; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `pay_month`, `department_id`, `comparison` from `salary`, `employee`. + +### Result Grain + +One row per distinct combination of projected columns. + +### Step-by-Step Logic + +1. Define named windows (`w1`, `w2`) in the `WINDOW` clause so the company-wide monthly average and the per-department monthly average can be compared side by side. +2. Combine datasets using JOIN. Join predicates control row matching and prevent accidental cartesian growth. +3. Project final output columns: `pay_month`, `department_id`, `comparison`. +4. Remove duplicate result tuples using `DISTINCT` where uniqueness is required. +5. Order output deterministically with `ORDER BY 1`. + +### Why This Works + +Named windows separate the two aggregation grains, which makes dependencies explicit and easier to validate. Join conditions align related entities so each output row is built from the correct source records. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are sorting/grouping, join operations. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently. + diff --git a/hard/615. Average Salary: Departments VS Company.sql b/hard/615. Average Salary: Departments VS Company.sql deleted file mode 100644 index 0407e70..0000000 --- a/hard/615. 
Average Salary: Departments VS Company.sql +++ /dev/null @@ -1,10 +0,0 @@ -SELECT DISTINCT TO_CHAR(pay_date,'YYYY-MM') AS pay_month,b.department_id, - CASE WHEN (AVG(amount) OVER w1) = (AVG(amount) OVER w2) THEN 'same' - WHEN (AVG(amount) OVER w1) > (AVG(amount) OVER w2) THEN 'lower' - ELSE 'higher' - END AS comparison -FROM salary_615 a -JOIN employee_615 b ON a.employee_id=b.employee_id -WINDOW w1 AS (PARTITION BY TO_CHAR(pay_date,'YYYY-MM')), - w2 AS (PARTITION BY TO_CHAR(pay_date,'YYYY-MM'),department_id) -ORDER BY 1; diff --git a/hard/618. Students Report By Geography.md b/hard/618. Students Report By Geography.md new file mode 100644 index 0000000..f61ac84 --- /dev/null +++ b/hard/618. Students Report By Geography.md @@ -0,0 +1,87 @@ +# Question 618: Students Report By Geography + +**LeetCode URL:** https://leetcode.com/problems/students-report-by-geography/ + +## Description + +A U.S graduate school has students from Asia, Europe and America. The students' location information are stored in table student as below. 
Pivot the continent column in this table so that each name is sorted alphabetically and displayed underneath its corresponding continent. The output column headers should be America, Asia, and Europe, respectively. + +## Table Schema Structure + +```sql +Create table If Not Exists Student (name varchar(50), continent varchar(7)); +``` + +## Sample Input Data + +```sql +insert into Student (name, continent) values ('Jane', 'America'); +insert into Student (name, continent) values ('Pascal', 'Europe'); +insert into Student (name, continent) values ('Xi', 'Asia'); +insert into Student (name, continent) values ('Jack', 'America'); +``` + +## Expected Output Data + +```text ++---------+--------+--------+ +| America | Europe | Asia | ++---------+--------+--------+ +| Jack | Pascal | Xi | +| Jane | null | null | ++---------+--------+--------+ +``` + +## SQL Solution + +```sql +WITH ranked AS( + SELECT *, + ROW_NUMBER() OVER w AS rnk + FROM student_618 + WINDOW w AS (PARTITION BY continent ORDER BY name) +) +SELECT + MAX(CASE WHEN continent = 'America' THEN name END) AS America, + MAX(CASE WHEN continent = 'Europe' THEN name END) AS Europe, + MAX(CASE WHEN continent = 'Asia' THEN name END) AS Asia +FROM ranked +GROUP BY rnk +ORDER BY rnk; + +--Why do we need to rank the rows? Without ranking, the query below returns one row per student, with NULLs in the other two columns. + +SELECT + CASE WHEN continent = 'America' THEN name END AS America, + CASE WHEN continent = 'Europe' THEN name END AS Europe, + CASE WHEN continent = 'Asia' THEN name END AS Asia +FROM student_618; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `America`, `Europe`, `Asia` from `student`, `ranked`. + +### Result Grain + +One row per rank value (`GROUP BY rnk`): each output row holds the nth name, alphabetically, from every continent. + +### Step-by-Step Logic + +1. Create CTE layers (`ranked`) to decompose the logic into smaller, testable steps before the final SELECT. +2. CTE `ranked`: reads `student` and assigns each name an alphabetical position (`rnk`) within its continent. +3. Project final output columns: `America`, `Europe`, `Asia`. + +### Why This Works + +CTEs separate transformation stages, which makes dependencies explicit and easier to validate. The final projection exposes only the columns required by the result contract. 
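The pivot-by-rank technique is easy to verify outside PostgreSQL; the sketch below reruns the same query with Python's bundled `sqlite3` (window functions need SQLite 3.25 or newer), using the sample student rows:

```python
import sqlite3

# Pivot-by-rank: number names alphabetically within each continent,
# then fold each rank into a single output row per rank.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE student (name TEXT, continent TEXT)")
con.executemany(
    "INSERT INTO student VALUES (?, ?)",
    [("Jane", "America"), ("Pascal", "Europe"), ("Xi", "Asia"), ("Jack", "America")],
)
rows = con.execute(
    """
    WITH ranked AS (
        SELECT name, continent,
               ROW_NUMBER() OVER (PARTITION BY continent ORDER BY name) AS rnk
        FROM student
    )
    SELECT MAX(CASE WHEN continent = 'America' THEN name END) AS America,
           MAX(CASE WHEN continent = 'Europe'  THEN name END) AS Europe,
           MAX(CASE WHEN continent = 'Asia'    THEN name END) AS Asia
    FROM ranked
    GROUP BY rnk
    ORDER BY rnk
    """
).fetchall()
print(rows)  # [('Jack', 'Pascal', 'Xi'), ('Jane', None, None)]
```

The second output row shows why `MAX(...)` plus `GROUP BY rnk` is needed: continents with fewer names pad out with NULLs instead of producing one sparse row per student.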
+ +### Performance Notes + +Primary cost is scanning source rows once; performance usually scales with input size. + +### Common Pitfalls + +- Check edge cases: empty input, single-row input, and duplicate keys. + diff --git a/hard/618. Students Report By Geography.sql b/hard/618. Students Report By Geography.sql deleted file mode 100644 index 1bb7aaa..0000000 --- a/hard/618. Students Report By Geography.sql +++ /dev/null @@ -1,21 +0,0 @@ -WITH ranked AS( - SELECT *, - ROW_NUMBER() OVER w AS rnk - FROM student_618 - WINDOW w AS (PARTITION BY continent ORDER BY name) -) -SELECT - MAX(CASE WHEN continent = 'America' THEN name END) AS America, - MAX(CASE WHEN continent = 'Europe' THEN name END) AS Europe, - MAX(CASE WHEN continent = 'Asia' THEN name END) AS Asia -FROM ranked -GROUP BY rnk -ORDER BY rnk; - ---Why we need to rank the rows? Without it below will be the result. - -SELECT - CASE WHEN continent = 'America' THEN name END AS America, - CASE WHEN continent = 'Europe' THEN name END AS Europe, - CASE WHEN continent = 'Asia' THEN name END AS Asia -FROM student_618; diff --git a/medium/1045. Customers Who Bought All Products.md b/medium/1045. Customers Who Bought All Products.md new file mode 100644 index 0000000..ca90434 --- /dev/null +++ b/medium/1045. Customers Who Bought All Products.md @@ -0,0 +1,77 @@ +# Question 1045: Customers Who Bought All Products + +**LeetCode URL:** https://leetcode.com/problems/customers-who-bought-all-products/ + +## Description + +Write a solution to report the customer ids from the Customer table that bought all the products in the Product table. Return the result table in any order. The result format is in the following example. 
+ +## Table Schema Structure + +```sql +Create table If Not Exists Customer (customer_id int, product_key int); +Create table Product (product_key int); +``` + +## Sample Input Data + +```sql +insert into Customer (customer_id, product_key) values ('1', '5'); +insert into Customer (customer_id, product_key) values ('2', '6'); +insert into Customer (customer_id, product_key) values ('3', '5'); +insert into Customer (customer_id, product_key) values ('3', '6'); +insert into Customer (customer_id, product_key) values ('1', '6'); +insert into Product (product_key) values ('5'); +insert into Product (product_key) values ('6'); +``` + +## Expected Output Data + +```text ++-------------+ +| customer_id | ++-------------+ +| 1 | +| 3 | ++-------------+ +``` + +## SQL Solution + +```sql +SELECT customer_id +FROM customer_1045 +GROUP BY customer_id +HAVING COUNT(DISTINCT product_key) = (SELECT COUNT(product_key) FROM product_1045) +ORDER BY customer_id; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `customer_id` from `customer`, `product`. + +### Result Grain + +One row per unique key in `GROUP BY customer_id`. + +### Step-by-Step Logic + +1. Aggregate rows with COUNT grouped by customer_id. +2. Project final output columns: `customer_id`. +3. Filter aggregated groups in `HAVING`: COUNT(DISTINCT product_key) = (SELECT COUNT(product_key) FROM product_1045). +4. Order output deterministically with `ORDER BY customer_id`. + +### Why This Works + +Grouping keys define the reporting grain; aggregate functions then summarize values at exactly that grain. Counting distinct product keys (rather than rows) keeps the comparison correct even if a customer buys the same product more than once. `HAVING` ensures only groups that satisfy business thresholds survive. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are sorting/grouping, subquery execution. Indexes on join/filter/group keys typically provide the biggest speedup.
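The count-matching trick (relational division) is easy to verify outside PostgreSQL; this sketch replays it in in-memory SQLite, counting distinct product keys so a hypothetical repeat purchase could not inflate a customer's count:

```python
# Replay of the "bought all products" check in in-memory SQLite.
# Table names mirror the dump's suffixed names.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer_1045 (customer_id INT, product_key INT);
CREATE TABLE product_1045 (product_key INT);
INSERT INTO customer_1045 VALUES (1,5),(2,6),(3,5),(3,6),(1,6);
INSERT INTO product_1045 VALUES (5),(6);
""")

rows = conn.execute("""
SELECT customer_id
FROM customer_1045
GROUP BY customer_id
HAVING COUNT(DISTINCT product_key) = (SELECT COUNT(*) FROM product_1045)
ORDER BY customer_id
""").fetchall()

print(rows)  # [(1,), (3,)]
```

Customer 2 bought only product 6, so its distinct count (1) falls short of the catalog size (2) and the group is filtered out.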
+ +### Common Pitfalls + +- Every non-aggregated selected column must belong to the grouping grain. +- `COUNT(customer_id)` counts rows, not distinct products; if a customer can buy the same product twice, count distinct product keys instead. + diff --git a/medium/1045. Customers Who Bought All Products.sql b/medium/1045. Customers Who Bought All Products.sql deleted file mode 100644 index 58093b1..0000000 --- a/medium/1045. Customers Who Bought All Products.sql +++ /dev/null @@ -1,5 +0,0 @@ -SELECT customer_id -FROM customer_1045 -GROUP BY customer_id -HAVING COUNT(customer_id) = (SELECT COUNT(product_key) FROM product_1045) -ORDER BY customer_id; diff --git a/medium/1070. Product Sales Analysis III.md b/medium/1070. Product Sales Analysis III.md new file mode 100644 index 0000000..4373a83 --- /dev/null +++ b/medium/1070. Product Sales Analysis III.md @@ -0,0 +1,72 @@ +# Question 1070: Product Sales Analysis III + +**LeetCode URL:** https://leetcode.com/problems/product-sales-analysis-iii/ + +## Description + +Write an SQL query that selects the product id, year, quantity, and price for the first year of every product sold. Return the resulting table in any order. The query result format is in the following example.
+ +## Table Schema Structure + +```sql +Create table If Not Exists Sales (sale_id int, product_id int, year int, quantity int, price int); +``` + +## Sample Input Data + +```sql +insert into Sales (sale_id, product_id, year, quantity, price) values ('1', '100', '2008', '10', '5000'); +insert into Sales (sale_id, product_id, year, quantity, price) values ('2', '100', '2009', '12', '5000'); +insert into Sales (sale_id, product_id, year, quantity, price) values ('7', '200', '2011', '15', '9000'); +``` + +## Expected Output Data + +```text ++------------+------------+----------+-------+ +| product_id | first_year | quantity | price | ++------------+------------+----------+-------+ +| 100 | 2008 | 10 | 5000 | +| 200 | 2011 | 15 | 9000 | ++------------+------------+----------+-------+ +``` + +## SQL Solution + +```sql +SELECT product_id,year AS first_year,quantity,price +FROM sales_1068 +WHERE (product_id,year) IN (SELECT product_id,MIN(year) + FROM sales_1068 + GROUP BY product_id); +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `product_id`, `first_year`, `quantity`, `price` from `sales`. + +### Result Grain + +One row per sale that took place in its product's first year of selling. + +### Step-by-Step Logic + +1. The subquery computes `MIN(year)` grouped by `product_id`, pinning down each product's first year. +2. Apply row-level filtering in `WHERE`: keep rows whose `(product_id, year)` tuple matches the product's first year. +3. Project final output columns: `product_id`, `first_year`, `quantity`, `price`. + +### Why This Works + +Grouping in the subquery defines each product's first year; the tuple comparison in `WHERE` then keeps every sale row from exactly that year. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are sorting/grouping, subquery execution. Indexes on join/filter/group keys typically provide the biggest speedup.
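The tuple-membership filter can be sanity-checked in in-memory SQLite as well (an assumption of this sketch: SQLite 3.15+, the first version to support row values such as `(a, b) IN (...)`):

```python
# Replay of the first-year filter in in-memory SQLite.
# The table name mirrors the dump's suffixed "sales_1068".
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales_1068 (sale_id INT, product_id INT, year INT, quantity INT, price INT);
INSERT INTO sales_1068 VALUES
  (1,100,2008,10,5000),(2,100,2009,12,5000),(7,200,2011,15,9000);
""")

rows = conn.execute("""
SELECT product_id, year AS first_year, quantity, price
FROM sales_1068
WHERE (product_id, year) IN (SELECT product_id, MIN(year)
                             FROM sales_1068
                             GROUP BY product_id)
ORDER BY product_id
""").fetchall()

print(rows)  # [(100, 2008, 10, 5000), (200, 2011, 15, 9000)]
```

Note that the filter keeps every sale from the first year, so a product with two sales in that year would yield two output rows.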
+ +### Common Pitfalls + +- Every non-aggregated selected column must belong to the grouping grain. +- If nullable columns are involved, validate null-handling explicitly (`IS NULL`, `COALESCE`). + diff --git a/medium/1070. Product Sales Analysis III.sql b/medium/1070. Product Sales Analysis III.sql deleted file mode 100644 index 2f7f778..0000000 --- a/medium/1070. Product Sales Analysis III.sql +++ /dev/null @@ -1,5 +0,0 @@ -SELECT product_id,year,quantity,price -FROM sales_1068 -WHERE (product_id,year) IN (SELECT product_id,MIN(year) - FROM sales_1068 - GROUP BY product_id); diff --git a/medium/1077. Project Employees III.md b/medium/1077. Project Employees III.md new file mode 100644 index 0000000..2e8f0f9 --- /dev/null +++ b/medium/1077. Project Employees III.md @@ -0,0 +1,103 @@ +# Question 1077: Project Employees III + +**LeetCode URL:** https://leetcode.com/problems/project-employees-iii/ + +## Description + +Write an SQL query that reports the most experienced employees in each project. In case of a tie, report all employees with the maximum number of experience years. The query result format is in the following example. Both employees with id 1 and 3 have the most experience among the employees of the first project.
+ +## Table Schema Structure + +```sql +Create table If Not Exists Project (project_id int, employee_id int); +Create table If Not Exists Employee (employee_id int, name varchar(10), experience_years int); +``` + +## Sample Input Data + +```sql +insert into Project (project_id, employee_id) values ('1', '1'); +insert into Project (project_id, employee_id) values ('1', '2'); +insert into Project (project_id, employee_id) values ('1', '3'); +insert into Project (project_id, employee_id) values ('2', '1'); +insert into Project (project_id, employee_id) values ('2', '4'); +insert into Employee (employee_id, name, experience_years) values ('1', 'Khaled', '3'); +insert into Employee (employee_id, name, experience_years) values ('2', 'Ali', '2'); +insert into Employee (employee_id, name, experience_years) values ('3', 'John', '3'); +insert into Employee (employee_id, name, experience_years) values ('4', 'Doe', '2'); +``` + +## Expected Output Data + +```text ++-------------+---------------+ +| project_id | employee_id | ++-------------+---------------+ +| 1 | 1 | +| 1 | 3 | +| 2 | 1 | ++-------------+---------------+ +``` + +## SQL Solution + +```sql +SELECT p.project_id,p.employee_id +FROM project_1077 p +JOIN employee_1077 e ON p.employee_id=e.employee_id +WHERE (p.project_id,e.experience_years) IN (SELECT p.project_id,MAX(e.experience_years) + FROM project_1077 p + JOIN employee_1077 e ON p.employee_id=e.employee_id + GROUP BY p.project_id) +ORDER BY 1; + + +--(OR) + + +WITH ranked AS( + SELECT p.project_id,p.employee_id, + DENSE_RANK() OVER (w) rnk + FROM project_1077 p + JOIN employee_1077 e ON p.employee_id=e.employee_id + WINDOW w AS (PARTITION BY p.project_id ORDER BY e.experience_years DESC) +) + +SELECT project_id,employee_id +FROM ranked +WHERE rnk =1; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `project_id`, `employee_id` from `project`, `employee`, `ranked`. 
+ +### Result Grain + +Row grain follows the post-filter join output. + +### Step-by-Step Logic + +1. Create CTE layers (`ranked`) to decompose the logic into smaller, testable steps before the final SELECT. +2. CTE `ranked`: reads `project`, `employee`, computes window metrics. +3. Combine datasets using JOIN. Join predicates control row matching and prevent accidental cartesian growth. +4. Apply row-level filtering in `WHERE`: rnk =1. +5. Use window functions (`... OVER (...)`) to compute rankings/running metrics while preserving row-level detail. +6. Project final output columns: `project_id`, `employee_id`. + +### Why This Works + +CTEs separate transformation stages, which makes dependencies explicit and easier to validate. Join conditions align related entities so each output row is built from the correct source records. Window expressions calculate comparative metrics without collapsing rows too early. Predicate filtering removes irrelevant rows before expensive downstream computation. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are window partitions, join operations. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently. +- Window `PARTITION BY`/`ORDER BY` choices change semantics; test with tied values and nulls. +- If nullable columns are involved, validate null-handling explicitly (`IS NULL`, `COALESCE`). + diff --git a/medium/1077. Project Employees III.sql b/medium/1077. Project Employees III.sql deleted file mode 100644 index 59b0593..0000000 --- a/medium/1077. 
Project Employees III.sql +++ /dev/null @@ -1,24 +0,0 @@ -SELECT p.project_id,p.employee_id -FROM project_1077 p -JOIN employee_1077 e ON p.employee_id=e.employee_id -WHERE (p.project_id,e.experience_years) IN (SELECT p.project_id,MAX(e.experience_years) - FROM project_1077 p - JOIN employee_1077 e ON p.employee_id=e.employee_id - GROUP BY p.project_id) -ORDER BY 1; - - ---(OR) - - -WITH ranked AS( - SELECT p.project_id,p.employee_id, - DENSE_RANK() OVER (w) rnk - FROM project_1077 p - JOIN employee_1077 e ON p.employee_id=e.employee_id - WINDOW w AS (PARTITION BY p.project_id ORDER BY e.experience_years DESC) -) - -SELECT project_id,employee_id -FROM ranked -WHERE rnk =1; diff --git a/medium/1098. Unpopular Books.md b/medium/1098. Unpopular Books.md new file mode 100644 index 0000000..b3001d4 --- /dev/null +++ b/medium/1098. Unpopular Books.md @@ -0,0 +1,90 @@ +# Question 1098: Unpopular Books + +**LeetCode URL:** https://leetcode.com/problems/unpopular-books/ + +## Description + +The query result format is in the following example: Books table: +---------+--------------------+----------------+ | book_id | name | available_from | +---------+--------------------+----------------+ | 1 | "Kalila And Demna" | 2010-01-01 | | 2 | "28 Letters" | 2012-05-12 | | 3 | "The Hobbit" | 2019-06-10 | | 4 | "13 Reasons Why" | 2019-06-01 | | 5 | "The Hunger Games" | 2008-09-21 | +---------+--------------------+----------------+ Orders table: +----------+---------+----------+---------------+ | order_id | book_id | quantity | dispatch_date | +----------+---------+----------+---------------+ | 1 | 1 | 2 | 2018-07-26 | | 2 | 1 | 1 | 2018-11-05 | | 3 | 3 | 8 | 2019-06-11 | | 4 | 4 | 6 | 2019-06-05 | | 5 | 4 | 5 | 2019-06-20 | | 6 | 5 | 9 | 2009-02-02 | | 7 | 5 | 8 | 2010-04-13 | +----------+---------+----------+---------------+ Result table: +-----------+--------------------+ | book_id | name | +-----------+--------------------+ | 1 | "Kalila And Demna" | | 2 | "28 Letters" | | 5 | "The 
Hunger Games" | +-----------+--------------------+ + +## Table Schema Structure + +```sql +Create table If Not Exists Books (book_id int, name varchar(50), available_from date); +Create table If Not Exists Orders (order_id int, book_id int, quantity int, dispatch_date date); +``` + +## Sample Input Data + +```sql +insert into Books (book_id, name, available_from) values ('1', 'Kalila And Demna', '2010-01-01'); +insert into Books (book_id, name, available_from) values ('2', '28 Letters', '2012-05-12'); +insert into Books (book_id, name, available_from) values ('3', 'The Hobbit', '2019-06-10'); +insert into Books (book_id, name, available_from) values ('4', '13 Reasons Why', '2019-06-01'); +insert into Books (book_id, name, available_from) values ('5', 'The Hunger Games', '2008-09-21'); +insert into Orders (order_id, book_id, quantity, dispatch_date) values ('1', '1', '2', '2018-07-26'); +insert into Orders (order_id, book_id, quantity, dispatch_date) values ('2', '1', '1', '2018-11-05'); +insert into Orders (order_id, book_id, quantity, dispatch_date) values ('3', '3', '8', '2019-06-11'); +insert into Orders (order_id, book_id, quantity, dispatch_date) values ('4', '4', '6', '2019-06-05'); +insert into Orders (order_id, book_id, quantity, dispatch_date) values ('5', '4', '5', '2019-06-20'); +insert into Orders (order_id, book_id, quantity, dispatch_date) values ('6', '5', '9', '2009-02-02'); +insert into Orders (order_id, book_id, quantity, dispatch_date) values ('7', '5', '8', '2010-04-13'); +``` + +## Expected Output Data + +```text ++-----------+--------------------+ +| book_id | name | ++-----------+--------------------+ +| 1 | "Kalila And Demna" | +| 2 | "28 Letters" | +| 5 | "The Hunger Games" | ++-----------+--------------------+ +``` + 
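Before reading the solution, the two filters (fewer than 10 copies sold in the year before 2019-06-23, and available for more than a month) can be replayed in in-memory SQLite; `julianday` stands in for PostgreSQL's direct date subtraction, which SQLite lacks:

```python
# Replay of the aggregate-then-LEFT-JOIN idea in in-memory SQLite.
# Table names mirror the dump's suffixed names; dates are stored as TEXT.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE books_1098 (book_id INT, name TEXT, available_from TEXT);
CREATE TABLE orders_1098 (order_id INT, book_id INT, quantity INT, dispatch_date TEXT);
INSERT INTO books_1098 VALUES
  (1,'Kalila And Demna','2010-01-01'),(2,'28 Letters','2012-05-12'),
  (3,'The Hobbit','2019-06-10'),(4,'13 Reasons Why','2019-06-01'),
  (5,'The Hunger Games','2008-09-21');
INSERT INTO orders_1098 VALUES
  (1,1,2,'2018-07-26'),(2,1,1,'2018-11-05'),(3,3,8,'2019-06-11'),
  (4,4,6,'2019-06-05'),(5,4,5,'2019-06-20'),(6,5,9,'2009-02-02'),
  (7,5,8,'2010-04-13');
""")

rows = conn.execute("""
SELECT b.book_id, b.name
FROM books_1098 b
LEFT JOIN (SELECT book_id, SUM(quantity) AS nsold
           FROM orders_1098
           WHERE dispatch_date BETWEEN '2018-06-23' AND '2019-06-23'
           GROUP BY book_id) o ON b.book_id = o.book_id
WHERE (o.nsold < 10 OR o.nsold IS NULL)
  AND julianday('2019-06-23') - julianday(b.available_from) > 30
ORDER BY b.book_id
""").fetchall()

print(rows)  # [(1, 'Kalila And Demna'), (2, '28 Letters'), (5, 'The Hunger Games')]
```

Book 3 is dropped by the 30-day availability filter and book 4 by its 11 recent sales; books 2 and 5, with no recent orders at all, survive only because of the `IS NULL` branch.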
+## SQL Solution + +```sql +SELECT b.book_id, b.name +FROM books_1098 b +LEFT JOIN ( + SELECT book_id, SUM(quantity) nsold + FROM orders_1098 + WHERE dispatch_date BETWEEN '2018-06-23' AND '2019-06-23' + GROUP BY book_id + ) o +ON b.book_id = o.book_id +WHERE (o.nsold < 10 OR o.nsold IS NULL) AND '2019-06-23'::DATE-b.available_from > 30; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `book_id`, `name` from `books`, `orders`. + +### Result Grain + +One row per book that passes both filters: fewer than 10 copies sold in the last year (or none at all) and available for more than 30 days. + +### Step-by-Step Logic + +1. The subquery aggregates orders with SUM(quantity), grouped by book_id, restricted to dispatch dates in the year before 2019-06-23. +2. LEFT JOIN the per-book totals onto `books_1098` so that books with no recent orders survive with a NULL total. +3. Apply row-level filtering in `WHERE`: keep books with `o.nsold < 10 OR o.nsold IS NULL`, and exclude books available for 30 days or fewer. +4. Project final output columns: `book_id`, `name`. + +### Why This Works + +The LEFT JOIN aligns each book with its recent sales total while preserving books that sold nothing; the `IS NULL` branch of the filter is what keeps them. Grouping keys define the reporting grain; aggregate functions then summarize values at exactly that grain. Predicate filtering removes irrelevant rows before expensive downstream computation. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are sorting/grouping, join operations, subquery execution. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently. +- An inner join here would silently drop books with no orders in the window. +- If nullable columns are involved, validate null-handling explicitly (`IS NULL`, `COALESCE`). + diff --git a/medium/1098.
Unpopular Books.sql b/medium/1098. Unpopular Books.sql deleted file mode 100644 index c29371c..0000000 --- a/medium/1098. Unpopular Books.sql +++ /dev/null @@ -1,10 +0,0 @@ -SELECT b.book_id, b.name -FROM books_1098 b -LEFT JOIN ( - SELECT book_id, SUM(quantity) nsold - FROM orders_1098 - WHERE dispatch_date BETWEEN '2018-06-23' AND '2019-06-23' - GROUP BY book_id - ) o -ON b.book_id = o.book_id -WHERE (o.nsold < 10 OR o.nsold IS NULL) AND '2019-06-23'::DATE-b.available_from > 30; diff --git a/medium/1107. New Users Daily Count.md b/medium/1107. New Users Daily Count.md new file mode 100644 index 0000000..762aa9c --- /dev/null +++ b/medium/1107. New Users Daily Count.md @@ -0,0 +1,106 @@ +# Question 1107: New Users Daily Count + +**LeetCode URL:** https://leetcode.com/problems/new-users-daily-count/ + +## Description + +Write an SQL query that reports, for every date within at most 90 days from today, the number of users that logged in for the first time on that date. Assume today is 2019-06-30. The query result format is in the following example. Note that we only care about dates with a non-zero user count.
+ +## Table Schema Structure + +```sql +Create table If Not Exists Traffic (user_id int, activity ENUM('login', 'logout', 'jobs', 'groups', 'homepage'), activity_date date); +``` + +## Sample Input Data + +```sql +insert into Traffic (user_id, activity, activity_date) values ('1', 'login', '2019-05-01'); +insert into Traffic (user_id, activity, activity_date) values ('1', 'homepage', '2019-05-01'); +insert into Traffic (user_id, activity, activity_date) values ('1', 'logout', '2019-05-01'); +insert into Traffic (user_id, activity, activity_date) values ('2', 'login', '2019-06-21'); +insert into Traffic (user_id, activity, activity_date) values ('2', 'logout', '2019-06-21'); +insert into Traffic (user_id, activity, activity_date) values ('3', 'login', '2019-01-01'); +insert into Traffic (user_id, activity, activity_date) values ('3', 'jobs', '2019-01-01'); +insert into Traffic (user_id, activity, activity_date) values ('3', 'logout', '2019-01-01'); +insert into Traffic (user_id, activity, activity_date) values ('4', 'login', '2019-06-21'); +insert into Traffic (user_id, activity, activity_date) values ('4', 'groups', '2019-06-21'); +insert into Traffic (user_id, activity, activity_date) values ('4', 'logout', '2019-06-21'); +insert into Traffic (user_id, activity, activity_date) values ('5', 'login', '2019-03-01'); +insert into Traffic (user_id, activity, activity_date) values ('5', 'logout', '2019-03-01'); +insert into Traffic (user_id, activity, activity_date) values ('5', 'login', '2019-06-21'); +insert into Traffic (user_id, activity, activity_date) values ('5', 'logout', '2019-06-21'); +``` + +## Expected Output Data + +```text ++------------+-------------+ +| login_date | user_count | ++------------+-------------+ +| 2019-05-01 | 1 | +| 2019-06-21 | 2 | ++------------+-------------+ +``` + +## SQL Solution + +```sql +WITH ranked AS( + SELECT *, + ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY activity_date) AS rnk + FROM traffic_1107 + WHERE activity = 
'login' +) +SELECT activity_date AS login_date,COUNT(DISTINCT user_id) AS user_count +FROM ranked +WHERE ('2019-06-30'::DATE-activity_date)<=90 AND rnk =1 +GROUP BY activity_date; + +--(OR) + +WITH ranked AS( + SELECT user_id,MIN(activity_date) AS activity_date + FROM traffic_1107 + WHERE activity = 'login' + GROUP BY user_id +) +SELECT activity_date AS login_date,COUNT(user_id) AS user_count +FROM ranked +WHERE ('2019-06-30'::DATE-activity_date)<=90 +GROUP BY activity_date; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `login_date`, `user_count` from `traffic`, `ranked`. + +### Result Grain + +One row per unique key in `GROUP BY activity_date`. + +### Step-by-Step Logic + +1. Create CTE layers (`ranked`) to decompose the logic into smaller, testable steps before the final SELECT. +2. CTE `ranked`: reads `traffic`, computes window metrics. +3. Apply row-level filtering in `WHERE`: ('2019-06-30'::DATE-activity_date)<=90. +4. Aggregate rows with COUNT, MIN grouped by activity_date. +5. Use window functions (`... OVER (...)`) to compute rankings/running metrics while preserving row-level detail. +6. Project final output columns: `login_date`, `user_count`. + +### Why This Works + +CTEs separate transformation stages, which makes dependencies explicit and easier to validate. Grouping keys define the reporting grain; aggregate functions then summarize values at exactly that grain. Window expressions calculate comparative metrics without collapsing rows too early. Predicate filtering removes irrelevant rows before expensive downstream computation. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are sorting/grouping, window partitions. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Every non-aggregated selected column must belong to the grouping grain. +- Window `PARTITION BY`/`ORDER BY` choices change semantics; test with tied values and nulls.
+- If nullable columns are involved, validate null-handling explicitly (`IS NULL`, `COALESCE`). + diff --git a/medium/1107. New Users Daily Count.sql b/medium/1107. New Users Daily Count.sql deleted file mode 100644 index e4a40b0..0000000 --- a/medium/1107. New Users Daily Count.sql +++ /dev/null @@ -1,24 +0,0 @@ -WITH ranked AS( - SELECT *, - ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY activity_date) AS rnk - FROM traffic_1107 - WHERE activity = 'login' -) -SELECT activity_date,COUNT(DISTINCT user_id) -FROM ranked -WHERE ('2019-06-30'::DATE-activity_date)<=90 AND rnk =1 -GROUP BY activity_date; - ---(OR) - -WITH ranked AS( - SELECT user_id,MIN(activity_date) AS activity_date - FROM traffic_1107 - WHERE activity = 'login' - GROUP BY user_id -) -SELECT activity_date,COUNT(user_id) -FROM ranked -WHERE ('2019-06-30'::DATE-activity_date)<=90 -GROUP BY activity_date; - diff --git a/medium/1112. Highest Grade For Each Student.md b/medium/1112. Highest Grade For Each Student.md new file mode 100644 index 0000000..5d3edfb --- /dev/null +++ b/medium/1112. 
Highest Grade For Each Student.md @@ -0,0 +1,83 @@ +# Question 1112: Highest Grade For Each Student + +**LeetCode URL:** https://leetcode.com/problems/highest-grade-for-each-student/ + +## Description + +Write a SQL query to find the highest grade with its corresponding course for each student. In case of a tie, you should find the course with the smallest course_id. The output must be sorted by increasing student_id. The query result format is in the following example. + +## Table Schema Structure + +```sql +Create table If Not Exists Enrollments (student_id int, course_id int, grade int); +``` + +## Sample Input Data + +```sql +insert into Enrollments (student_id, course_id, grade) values ('2', '2', '95'); +insert into Enrollments (student_id, course_id, grade) values ('2', '3', '95'); +insert into Enrollments (student_id, course_id, grade) values ('1', '1', '90'); +insert into Enrollments (student_id, course_id, grade) values ('1', '2', '99'); +insert into Enrollments (student_id, course_id, grade) values ('3', '1', '80'); +insert into Enrollments (student_id, course_id, grade) values ('3', '2', '75'); +insert into Enrollments (student_id, course_id, grade) values ('3', '3', '82'); +``` + +## Expected Output Data + +```text ++------------+-----------+-------+ +| student_id | course_id | grade | ++------------+-----------+-------+ +| 1 | 2 | 99 | +| 2 | 2 | 95 | +| 3 | 3 | 82 | ++------------+-----------+-------+ +``` + +## SQL Solution + +```sql +WITH ranked AS( + SELECT *, + RANK()
OVER (PARTITION BY student_id ORDER BY grade DESC,course_id ASC) AS rnk + FROM enrollments_1112 +) + +SELECT student_id,course_id,grade +FROM ranked +WHERE rnk = 1 +ORDER BY student_id; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `student_id`, `course_id`, `grade` from `enrollments`, `ranked`. + +### Result Grain + +One row per student: the tie-break on `course_id` makes the rank-1 row unique within each partition. + +### Step-by-Step Logic + +1. Create CTE layers (`ranked`) to decompose the logic into smaller, testable steps before the final SELECT. +2. CTE `ranked`: reads `enrollments`, computes window metrics. +3. Apply row-level filtering in `WHERE`: rnk = 1. +4. Use window functions (`... OVER (...)`) to compute rankings/running metrics while preserving row-level detail. +5. Project final output columns: `student_id`, `course_id`, `grade`, sorted by `student_id` as the problem requires. + +### Why This Works + +CTEs separate transformation stages, which makes dependencies explicit and easier to validate. Window expressions calculate comparative metrics without collapsing rows too early. Predicate filtering removes irrelevant rows before expensive downstream computation. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are window partitions. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Window `PARTITION BY`/`ORDER BY` choices change semantics; test with tied values and nulls. +- If nullable columns are involved, validate null-handling explicitly (`IS NULL`, `COALESCE`). + diff --git a/medium/1112. Highest Grade For Each Student.sql b/medium/1112.
Highest Grade For Each Student.sql +++ /dev/null @@ -1,9 +0,0 @@ -WITH ranked AS( - SELECT *, - RANK() OVER (PARTITION BY student_id ORDER BY grade DESC,course_id ASC) AS rnk - FROM enrollments_1112 -) - -SELECT student_id,course_id,grade -FROM ranked -WHERE rnk = 1; diff --git a/medium/1126. Active Businesses.md b/medium/1126. Active Businesses.md new file mode 100644 index 0000000..d9655fc --- /dev/null +++ b/medium/1126. Active Businesses.md @@ -0,0 +1,86 @@ +# Question 1126: Active Businesses + +**LeetCode URL:** https://leetcode.com/problems/active-businesses/ + +## Description + +Write an SQL query to find all active businesses. The query result format is in the following example: Events table: +-------------+------------+------------+ | business_id | event_type | occurences | +-------------+------------+------------+ | 1 | reviews | 7 | | 3 | reviews | 3 | | 1 | ads | 11 | | 2 | ads | 7 | | 3 | ads | 6 | | 1 | page views | 3 | | 2 | page views | 12 | +-------------+------------+------------+ Result table: +-------------+ | business_id | +-------------+ | 1 | +-------------+ Average for 'reviews', 'ads' and 'page views' are (7+3)/2=5, (11+7+6)/3=8, (3+12)/2=7. 
+ +## Table Schema Structure + +```sql +Create table If Not Exists Events (business_id int, event_type varchar(10), occurences int); +``` + +## Sample Input Data + +```sql +insert into Events (business_id, event_type, occurences) values ('1', 'reviews', '7'); +insert into Events (business_id, event_type, occurences) values ('3', 'reviews', '3'); +insert into Events (business_id, event_type, occurences) values ('1', 'ads', '11'); +insert into Events (business_id, event_type, occurences) values ('2', 'ads', '7'); +insert into Events (business_id, event_type, occurences) values ('3', 'ads', '6'); +insert into Events (business_id, event_type, occurences) values ('1', 'page views', '3'); +insert into Events (business_id, event_type, occurences) values ('2', 'page views', '12'); +``` + +## Expected Output Data + +```text ++-------------+ +| business_id | ++-------------+ +| 1 | ++-------------+ +``` + +## SQL Solution + +```sql +WITH cte AS( + SELECT *, + AVG(occurences) OVER(PARTITION BY event_type) AS avg + FROM events_1126 +) + +SELECT business_id +FROM cte +WHERE occurences > avg +GROUP BY business_id +HAVING COUNT(business_id) > 1; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `business_id` from `events`, `cte`. + +### Result Grain + +One row per unique key in `GROUP BY business_id`. + +### Step-by-Step Logic + +1. Create CTE layers (`cte`) to decompose the logic into smaller, testable steps before the final SELECT. +2. CTE `cte`: reads `events` and attaches each event type's cross-business average with `AVG(occurences) OVER (PARTITION BY event_type)`. +3. Apply row-level filtering in `WHERE`: occurences > avg. +4. Aggregate the surviving rows with COUNT grouped by business_id. +5. Use window functions (`... OVER (...)`) to compute rankings/running metrics while preserving row-level detail. +6. Project final output columns: `business_id`. +7. Filter aggregated groups in `HAVING`: COUNT(business_id) > 1. + +### Why This Works + +CTEs separate transformation stages, which makes dependencies explicit and easier to validate.
Grouping keys define the reporting grain; aggregate functions then summarize values at exactly that grain. Window expressions calculate comparative metrics without collapsing rows too early. Predicate filtering removes irrelevant rows before expensive downstream computation. `HAVING` ensures only groups that satisfy business thresholds survive. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are sorting/grouping, window partitions. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Every non-aggregated selected column must belong to the grouping grain. +- Window `PARTITION BY`/`ORDER BY` choices change semantics; test with tied values and nulls. +- If nullable columns are involved, validate null-handling explicitly (`IS NULL`, `COALESCE`). + diff --git a/medium/1126. Active Businesses.sql b/medium/1126. Active Businesses.sql deleted file mode 100644 index c25a8c6..0000000 --- a/medium/1126. Active Businesses.sql +++ /dev/null @@ -1,11 +0,0 @@ -WITH cte AS( - SELECT *, - AVG(occurences) OVER(PARTITION BY event_type) AS avg - FROM events_1126 -) - -SELECT business_id -FROM cte -WHERE occurences > avg -GROUP BY business_id -HAVING COUNT(business_id) > 1; diff --git a/medium/1132. Reported Posts II.md b/medium/1132. Reported Posts II.md new file mode 100644 index 0000000..5e17c6e --- /dev/null +++ b/medium/1132. Reported Posts II.md @@ -0,0 +1,108 @@ +# Question 1132: Reported Posts II + +**LeetCode URL:** https://leetcode.com/problems/reported-posts-ii/ + +## Description + +Write an SQL query to find the average for daily percentage of posts that got removed after being reported as spam, rounded to 2 decimal places. 
The query result format is in the following example:
+
+Actions table:
+
+| user_id | post_id | action_date | action | extra  |
+|---------|---------|-------------|--------|--------|
+| 1       | 1       | 2019-07-01  | view   | null   |
+| 1       | 1       | 2019-07-01  | like   | null   |
+| 1       | 1       | 2019-07-01  | share  | null   |
+| 2       | 2       | 2019-07-04  | view   | null   |
+| 2       | 2       | 2019-07-04  | report | spam   |
+| 3       | 4       | 2019-07-04  | view   | null   |
+| 3       | 4       | 2019-07-04  | report | spam   |
+| 4       | 3       | 2019-07-02  | view   | null   |
+| 4       | 3       | 2019-07-02  | report | spam   |
+| 5       | 2       | 2019-07-03  | view   | null   |
+| 5       | 2       | 2019-07-03  | report | racism |
+| 5       | 5       | 2019-07-03  | view   | null   |
+| 5       | 5       | 2019-07-03  | report | racism |
+
+Removals table:
+
+| post_id | remove_date |
+|---------|-------------|
+| 2       | 2019-07-20  |
+| 3       | 2019-07-18  |
+
+Result table:
+
+| average_daily_percent |
+|-----------------------|
+| 75.00                 |
+
+On 2019-07-02 one post was reported as spam and it was removed (100%); on 2019-07-04 two posts were reported as spam and one of them was removed (50%), so the average daily percentage is (100 + 50) / 2 = 75.00.
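The daily-percentage arithmetic above can be mirrored in a short Python sketch over the sample spam reports (an illustrative cross-check, not part of the repository):

```python
from collections import defaultdict

# (post_id, action_date) for actions whose extra = 'spam'
spam_reports = [(2, "2019-07-04"), (4, "2019-07-04"), (3, "2019-07-02")]
removed = {2, 3}  # post_ids present in the Removals table

# Distinct spam-reported posts per day (the GROUP BY action_date step).
by_day = defaultdict(set)
for post_id, day in spam_reports:
    by_day[day].add(post_id)

# Removed fraction per day, then the overall average.
daily_pct = [100 * len(posts & removed) / len(posts) for posts in by_day.values()]
answer = round(sum(daily_pct) / len(daily_pct), 2)
print(answer)  # 75.0
```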
+ +## Table Schema Structure + +```sql +Create table If Not Exists Actions (user_id int, post_id int, action_date date, action ENUM('view', 'like', 'reaction', 'comment', 'report', 'share'), extra varchar(10)); +create table if not exists Removals (post_id int, remove_date date); +``` + +## Sample Input Data + +```sql +insert into Actions (user_id, post_id, action_date, action, extra) values ('1', '1', '2019-07-01', 'view', NULL); +insert into Actions (user_id, post_id, action_date, action, extra) values ('1', '1', '2019-07-01', 'like', NULL); +insert into Actions (user_id, post_id, action_date, action, extra) values ('1', '1', '2019-07-01', 'share', NULL); +insert into Actions (user_id, post_id, action_date, action, extra) values ('2', '2', '2019-07-04', 'view', NULL); +insert into Actions (user_id, post_id, action_date, action, extra) values ('2', '2', '2019-07-04', 'report', 'spam'); +insert into Actions (user_id, post_id, action_date, action, extra) values ('3', '4', '2019-07-04', 'view', NULL); +insert into Actions (user_id, post_id, action_date, action, extra) values ('3', '4', '2019-07-04', 'report', 'spam'); +insert into Actions (user_id, post_id, action_date, action, extra) values ('4', '3', '2019-07-02', 'view', NULL); +insert into Actions (user_id, post_id, action_date, action, extra) values ('4', '3', '2019-07-02', 'report', 'spam'); +insert into Actions (user_id, post_id, action_date, action, extra) values ('5', '2', '2019-07-03', 'view', NULL); +insert into Actions (user_id, post_id, action_date, action, extra) values ('5', '2', '2019-07-03', 'report', 'racism'); +insert into Actions (user_id, post_id, action_date, action, extra) values ('5', '5', '2019-07-03', 'view', NULL); +insert into Actions (user_id, post_id, action_date, action, extra) values ('5', '5', '2019-07-03', 'report', 'racism'); +insert into Removals (post_id, remove_date) values ('2', '2019-07-20'); +insert into Removals (post_id, remove_date) values ('3', '2019-07-18'); +``` + +## 
Expected Output Data + +```text ++-----------------------+ +| average_daily_percent | ++-----------------------+ +| 75.00 | ++-----------------------+ +``` + +## SQL Solution + +```sql +WITH spammed AS( + SELECT * + FROM actions_1132 + WHERE extra = 'spam' +), + +percentage AS( + SELECT (COUNT(r.post_id)::NUMERIC/COUNT(s.post_id))*100 AS per + FROM spammed s + LEFT JOIN removals_1132 r ON s.post_id = r.post_id + GROUP BY s.action_date +) + +SELECT ROUND(AVG(per),2) AS average_daily_percent FROM percentage; + +--------------------------- OR --------------------------- + +WITH cte AS ( + SELECT a.action_date, + ROUND(COUNT(CASE WHEN a.extra = 'spam' AND r.post_id IS NOT NULL THEN 1 ELSE NULL END)*100::NUMERIC/COUNT(DISTINCT a.post_id),2) AS removed_spammed_post_percentage + FROM actions_1132 a + LEFT JOIN removals_1132 r ON a.post_id = r.post_id + GROUP BY a.action_date +) +SELECT ROUND(AVG(removed_spammed_post_percentage),2) +FROM cte +WHERE removed_spammed_post_percentage <> 0; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result column `average_daily_percent` from `actions`, `spammed`, `removals`, `percentage`. + +### Result Grain + +Row grain follows the post-filter join output. + +### Step-by-Step Logic + +1. Create CTE layers (`spammed`, `percentage`, `cte`) to decompose the logic into smaller, testable steps before the final SELECT. +2. CTE `spammed`: reads `actions`. +3. CTE `percentage`: reads `spammed`, `removals`, joins related entities. +4. Combine datasets using LEFT JOIN, JOIN. Join predicates control row matching and prevent accidental cartesian growth. +5. Apply row-level filtering in `WHERE`: removed_spammed_post_percentage <> 0. + +### Why This Works + +CTEs separate transformation stages, which makes dependencies explicit and easier to validate. Join conditions align related entities so each output row is built from the correct source records.
Predicate filtering removes irrelevant rows before expensive downstream computation. + +### Performance Notes + +Primary cost drivers are join operations. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently. +- If nullable columns are involved, validate null-handling explicitly (`IS NULL`, `COALESCE`). + diff --git a/medium/1132. Reported Posts II.sql b/medium/1132. Reported Posts II.sql deleted file mode 100644 index 6bbbc2c..0000000 --- a/medium/1132. Reported Posts II.sql +++ /dev/null @@ -1,27 +0,0 @@ -WITH spammed AS( - SELECT * - FROM actions_1132 - WHERE extra = 'spam' -), - -percentage AS( - SELECT (COUNT(r.post_id)::NUMERIC/COUNT(s.post_id))*100 AS per - FROM spammed s - LEFT JOIN removals_1132 r ON s.post_id = r.post_id - GROUP BY s.action_date -) - -SELECT ROUND(AVG(per),2) AS avg_daily_percent FROM percentage; - ---------------------------- OR --------------------------- - -WITH cte AS ( - SELECT a.action_date, - ROUND(COUNT(CASE WHEN a.extra = 'spam' AND r.post_id IS NOT NULL THEN 1 ELSE NULL END)*100::NUMERIC/COUNT(DISTINCT a.post_id),2) AS removed_spammed_post_percentage - FROM actions_1132 a - LEFT JOIN removals_1132 r ON a.post_id = r.post_id - GROUP BY a.action_date -) -SELECT ROUND(AVG(removed_spammed_post_percentage),2) -FROM cte -WHERE removed_spammed_post_percentage <> 0; diff --git a/medium/1149. Article Views II.md b/medium/1149. Article Views II.md new file mode 100644 index 0000000..275f2e8 --- /dev/null +++ b/medium/1149. Article Views II.md @@ -0,0 +1,77 @@ +# Question 1149: Article Views II + +**LeetCode URL:** https://leetcode.com/problems/article-views-ii/ + +## Description + +Write an SQL query to find all the people who viewed more than one article on the same date, sorted in ascending order by their id. 
The query result format is in the following example:
+
+Views table:
+
+| article_id | author_id | viewer_id | view_date  |
+|------------|-----------|-----------|------------|
+| 1          | 3         | 5         | 2019-08-01 |
+| 3          | 4         | 5         | 2019-08-01 |
+| 1          | 3         | 6         | 2019-08-02 |
+| 2          | 7         | 7         | 2019-08-01 |
+| 2          | 7         | 6         | 2019-08-02 |
+| 4          | 7         | 1         | 2019-07-22 |
+| 3          | 4         | 4         | 2019-07-21 |
+| 3          | 4         | 4         | 2019-07-21 |
+
+Result table:
+
+| id |
+|----|
+| 5  |
+| 6  |
+ +## Table Schema Structure + +```sql +Create table If Not Exists Views (article_id int, author_id int, viewer_id int, view_date date); +``` + +## Sample Input Data + +```sql +insert into Views (article_id, author_id, viewer_id, view_date) values ('1', '3', '5', '2019-08-01'); +insert into Views (article_id, author_id, viewer_id, view_date) values ('3', '4', '5', '2019-08-01'); +insert into Views (article_id, author_id, viewer_id, view_date) values ('1', '3', '6', '2019-08-02'); +insert into Views (article_id, author_id, viewer_id, view_date) values ('2', '7', '7', '2019-08-01'); +insert into Views (article_id, author_id, viewer_id, view_date) values ('2', '7', '6', '2019-08-02'); +insert into Views (article_id, author_id, viewer_id, view_date) values ('4', '7', '1', '2019-07-22'); +insert into Views (article_id, author_id, viewer_id, view_date) values ('3', '4', '4', '2019-07-21'); +insert into Views (article_id, author_id, viewer_id, view_date) values ('3', '4', '4', '2019-07-21'); +``` + +## Expected Output Data + +```text ++------+ +| id | ++------+ +| 5 | +| 6 | ++------+ +``` + +## SQL Solution + +```sql +SELECT DISTINCT viewer_id AS id +FROM views_1149 +GROUP BY viewer_id,view_date +HAVING COUNT(DISTINCT
article_id)>1 +ORDER BY 1 ASC; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `viewer_id` from `views`. + +### Result Grain + +One row per unique key in `GROUP BY viewer_id,view_date`. + +### Step-by-Step Logic + +1. Aggregate rows with COUNT grouped by viewer_id,view_date. +2. Project final output columns: `viewer_id`. +3. Filter aggregated groups in `HAVING`: COUNT(DISTINCT article_id)>1. +4. Order output deterministically with `ORDER BY 1 ASC`. + +### Why This Works + +Grouping keys define the reporting grain; aggregate functions then summarize values at exactly that grain. `HAVING` ensures only groups that satisfy business thresholds survive. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are sorting/grouping. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Every non-aggregated selected column must belong to the grouping grain. + diff --git a/medium/1149. Article Views II.sql b/medium/1149. Article Views II.sql deleted file mode 100644 index 8c335aa..0000000 --- a/medium/1149. Article Views II.sql +++ /dev/null @@ -1,5 +0,0 @@ -SELECT viewer_id -FROM views_1149 -GROUP BY viewer_id,view_date -HAVING COUNT(DISTINCT article_id)>1 -ORDER BY 1 ASC; diff --git a/medium/1158. Market Analysis I.md b/medium/1158. Market Analysis I.md new file mode 100644 index 0000000..fed1a68 --- /dev/null +++ b/medium/1158. Market Analysis I.md @@ -0,0 +1,92 @@ +# Question 1158: Market Analysis I + +**LeetCode URL:** https://leetcode.com/problems/market-analysis-i/ + +## Description + +Write an SQL query to find for each user, the join date and the number of orders they made as a buyer in 2019. 
The query result format is in the following example:
+
+Users table:
+
+| user_id | join_date  | favorite_brand |
+|---------|------------|----------------|
+| 1       | 2018-01-01 | Lenovo         |
+| 2       | 2018-02-09 | Samsung        |
+| 3       | 2018-01-19 | LG             |
+| 4       | 2018-05-21 | HP             |
+
+Orders table:
+
+| order_id | order_date | item_id | buyer_id | seller_id |
+|----------|------------|---------|----------|-----------|
+| 1        | 2019-08-01 | 4       | 1        | 2         |
+| 2        | 2018-08-02 | 2       | 1        | 3         |
+| 3        | 2019-08-03 | 3       | 2        | 3         |
+| 4        | 2018-08-04 | 1       | 4        | 2         |
+| 5        | 2018-08-04 | 1       | 3        | 4         |
+| 6        | 2019-08-05 | 2       | 2        | 4         |
+
+Items table:
+
+| item_id | item_brand |
+|---------|------------|
+| 1       | Samsung    |
+| 2       | Lenovo     |
+| 3       | LG         |
+| 4       | HP         |
+
+Result table:
+
+| buyer_id | join_date  | orders_in_2019 |
+|----------|------------|----------------|
+| 1        | 2018-01-01 | 1              |
+| 2        | 2018-02-09 | 2              |
+| 3        | 2018-01-19 | 0              |
+| 4        | 2018-05-21 | 0              |
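The per-buyer counting (including the LEFT JOIN's "zero orders" behavior) can be sketched in plain Python over the sample data; the `dict.get(..., 0)` call plays the role of `COALESCE`:

```python
users = {1: "2018-01-01", 2: "2018-02-09", 3: "2018-01-19", 4: "2018-05-21"}
orders = [  # (order_date, buyer_id)
    ("2019-08-01", 1), ("2018-08-02", 1), ("2019-08-03", 2),
    ("2018-08-04", 4), ("2018-08-04", 3), ("2019-08-05", 2),
]

# COUNT of 2019 orders per buyer (the filtered, grouped subquery).
counts = {}
for order_date, buyer in orders:
    if order_date.startswith("2019"):
        counts[buyer] = counts.get(buyer, 0) + 1

# LEFT JOIN back to users: every user appears, missing buyers default to 0.
result = [(u, join, counts.get(u, 0)) for u, join in sorted(users.items())]
print(result)
```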
+ +## Table Schema Structure + +```sql +Create table If Not Exists Users (user_id int, join_date date, favorite_brand varchar(10)); +Create table If Not Exists Orders (order_id int, order_date date, item_id int, buyer_id int, seller_id int); +Create table If Not Exists Items (item_id int, item_brand varchar(10)); +``` + +## Sample Input Data + +```sql +insert into Users (user_id, join_date, favorite_brand) values ('1', '2018-01-01', 'Lenovo'); +insert into Users (user_id, join_date, favorite_brand) values ('2', '2018-02-09', 'Samsung'); +insert into Users (user_id, join_date, favorite_brand) values ('3', '2018-01-19', 'LG'); +insert into Users (user_id, join_date, favorite_brand) values ('4', '2018-05-21', 'HP'); +insert into Orders (order_id, order_date, item_id, buyer_id, seller_id) values ('1', '2019-08-01', '4', '1', '2'); +insert into Orders (order_id, order_date, item_id, buyer_id, seller_id) values ('2', '2018-08-02', '2', '1', '3'); +insert into Orders (order_id, order_date, item_id, buyer_id, seller_id) values ('3', '2019-08-03', '3', '2', '3'); +insert into Orders (order_id, order_date, item_id, buyer_id, seller_id) values ('4', '2018-08-04', '1', '4', '2'); +insert into Orders (order_id, order_date, item_id, buyer_id, seller_id) values ('5', '2018-08-04', '1', '3', '4'); +insert into Orders (order_id, order_date, item_id, buyer_id, seller_id) values ('6', '2019-08-05', '2', '2', '4'); +insert into Items (item_id, item_brand) values ('1', 'Samsung'); +insert into Items (item_id, item_brand) values ('2', 'Lenovo'); +insert into Items (item_id, item_brand) values ('3', 'LG'); +insert into Items (item_id, item_brand) values ('4', 'HP'); +``` + +## Expected Output Data + +```text ++-----------+------------+----------------+ +| buyer_id | join_date | orders_in_2019 | ++-----------+------------+----------------+ +| 1 | 2018-01-01 | 1 | +| 2 | 2018-02-09 | 2 | +| 3 | 2018-01-19 | 0 | +| 4 | 2018-05-21 | 0 | ++-----------+------------+----------------+ +``` + +## 
SQL Solution + +```sql +SELECT u.user_id AS buyer_id,u.join_date,COALESCE(b.orders_in_2019,0) AS orders_in_2019 +FROM users_1158 u +LEFT JOIN + (SELECT buyer_id,COUNT(order_id) AS orders_in_2019 + FROM orders_1158 + WHERE EXTRACT(YEAR FROM order_date) = 2019 + GROUP BY buyer_id) b +ON u.user_id = b.buyer_id; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `buyer_id`, `join_date`, `orders_in_2019` from `users`, `orders`. + +### Result Grain + +One row per user in `users`. + +### Step-by-Step Logic + +1. Combine datasets using LEFT JOIN, JOIN. Join predicates control row matching and prevent accidental cartesian growth. +2. Apply row-level filtering in `WHERE`: EXTRACT(YEAR FROM order_date) = 2019. +3. Aggregate rows with COUNT grouped by buyer_id inside the subquery. +4. Project final output columns: `buyer_id`, `join_date`, `orders_in_2019`. + +### Why This Works + +Join conditions align related entities so each output row is built from the correct source records. Grouping keys define the reporting grain; aggregate functions then summarize values at exactly that grain. Predicate filtering removes irrelevant rows before expensive downstream computation. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are sorting/grouping, join operations, subquery execution. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently. +- Every non-aggregated selected column must belong to the grouping grain. +- If nullable columns are involved, validate null-handling explicitly (`IS NULL`, `COALESCE`). + diff --git a/medium/1158. Market Analysis I.sql b/medium/1158. Market Analysis I.sql deleted file mode 100644 index 57f8939..0000000 --- a/medium/1158. Market Analysis I.sql +++ /dev/null @@ -1,8 +0,0 @@ -SELECT u.user_id,u.join_date,COALESCE(b.orders_in_2019,0) -FROM users_1158 u -LEFT JOIN - (SELECT buyer_id,COUNT(order_id) AS orders_in_2019 - FROM orders_1158 - WHERE EXTRACT(YEAR FROM order_date) = 2019 - GROUP BY buyer_id) b -ON u.user_id = b.buyer_id; diff --git a/medium/1164. Product Price at a Given Date.md b/medium/1164. Product Price at a Given Date.md new file mode 100644 index 0000000..07f8070 --- /dev/null +++ b/medium/1164. Product Price at a Given Date.md @@ -0,0 +1,89 @@ +# Question 1164: Product Price at a Given Date + +**LeetCode URL:** https://leetcode.com/problems/product-price-at-a-given-date/ + +## Description + +Write an SQL query to find the prices of all products on 2019-08-16. Initially, the price of all products is 10. The query result format is in the following example:
+
+Products table:
+
+| product_id | new_price | change_date |
+|------------|-----------|-------------|
+| 1          | 20        | 2019-08-14  |
+| 2          | 50        | 2019-08-14  |
+| 1          | 30        | 2019-08-15  |
+| 1          | 35        | 2019-08-16  |
+| 2          | 65        | 2019-08-17  |
+| 3          | 20        | 2019-08-18  |
+
+Result table:
+
+| product_id | price |
+|------------|-------|
+| 2          | 50    |
+| 1          | 35    |
+| 3          | 10    |
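The "latest change on or before the cutoff, else the default price" rule can be cross-checked with a small Python sketch over the sample changes (illustrative only):

```python
changes = [  # (product_id, new_price, change_date)
    (1, 20, "2019-08-14"), (2, 50, "2019-08-14"), (1, 30, "2019-08-15"),
    (1, 35, "2019-08-16"), (2, 65, "2019-08-17"), (3, 20, "2019-08-18"),
]
cutoff = "2019-08-16"  # ISO dates compare correctly as strings

# Walk changes in date order; the last change on or before the cutoff wins
# (the MAX(change_date) per product in the CTE).
prices = {}
for pid, price, day in sorted(changes, key=lambda c: c[2]):
    if day <= cutoff:
        prices[pid] = price

# Products with no change by the cutoff fall back to the initial price of 10.
result = {pid: prices.get(pid, 10) for pid, _, _ in changes}
print(result)  # {1: 35, 2: 50, 3: 10}
```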
+ +## Table Schema Structure + +```sql +Create table If Not Exists Products (product_id int, new_price int, change_date date); +``` + +## Sample Input Data + +```sql +insert into Products (product_id, new_price, change_date) values ('1', '20', '2019-08-14'); +insert into Products (product_id, new_price, change_date) values ('2', '50', '2019-08-14'); +insert into Products (product_id, new_price, change_date) values ('1', '30', '2019-08-15'); +insert into Products (product_id, new_price, change_date) values ('1', '35', '2019-08-16'); +insert into Products (product_id, new_price, change_date) values ('2', '65', '2019-08-17'); +insert into Products (product_id, new_price, change_date) values ('3', '20', '2019-08-18'); +``` + +## Expected Output Data + +```text ++------------+-------+ +| product_id | price | ++------------+-------+ +| 2 | 50 | +| 1 | 35 | +| 3 | 10 | ++------------+-------+ +``` + +## SQL Solution + +```sql +WITH cte AS( + SELECT product_id,MAX(change_date) AS max_date + FROM products_1164 + WHERE change_date <= '2019-08-16' + GROUP BY product_id +) + +SELECT p.product_id, + MAX(CASE WHEN c.product_id IS NULL THEN 10 + WHEN p.change_date = c.max_date THEN p.new_price + END) AS price +FROM products_1164 p +LEFT JOIN cte c ON p.product_id = c.product_id +GROUP BY p.product_id +ORDER BY p.product_id; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `product_id`, `price` from `products`, `cte`. + +### Result Grain + +One row per unique key in `GROUP BY p.product_id`. + +### Step-by-Step Logic + +1. Create CTE layers (`cte`) to decompose the logic into smaller, testable steps before the final SELECT. +2. CTE `cte`: reads `products`. +3. Combine datasets using LEFT JOIN, JOIN. Join predicates control row matching and prevent accidental cartesian growth. +4. Aggregate rows with MAX grouped by p.product_id. +5. Project final output columns: `product_id`, `price`. +6. 
Order output deterministically with `ORDER BY p.product_id`. + +### Why This Works + +CTEs separate transformation stages, which makes dependencies explicit and easier to validate. Join conditions align related entities so each output row is built from the correct source records. Grouping keys define the reporting grain; aggregate functions then summarize values at exactly that grain. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are sorting/grouping, join operations. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently. +- Every non-aggregated selected column must belong to the grouping grain. + diff --git a/medium/1164. Product Price at a Given Date.sql b/medium/1164. Product Price at a Given Date.sql deleted file mode 100644 index 924cdfb..0000000 --- a/medium/1164. Product Price at a Given Date.sql +++ /dev/null @@ -1,15 +0,0 @@ -WITH cte AS( - SELECT product_id,MAX(change_date) AS max_date - FROM products_1164 - WHERE change_date <= '2019-08-16' - GROUP BY product_id -) - -SELECT p.product_id, - MAX(CASE WHEN c.product_id IS NULL THEN 10 - WHEN p.change_date = c.max_date THEN p.new_price - END) AS price -FROM products_1164 p -LEFT JOIN cte c ON p.product_id = c.product_id -GROUP BY p.product_id -ORDER BY p.product_id; diff --git a/medium/1174. Immediate Food Delivery II.md b/medium/1174. Immediate Food Delivery II.md new file mode 100644 index 0000000..623fec9 --- /dev/null +++ b/medium/1174. Immediate Food Delivery II.md @@ -0,0 +1,80 @@ +# Question 1174: Immediate Food Delivery II + +**LeetCode URL:** https://leetcode.com/problems/immediate-food-delivery-ii/ + +## Description + +Write a solution to find the percentage of immediate orders in the first orders of all customers, rounded to 2 decimal places. The result format is in the following example. 
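The first-order-per-customer logic described above can be mirrored in Python on the sample deliveries (a sketch for intuition, not part of the repository):

```python
deliveries = [  # (customer_id, order_date, preferred_date)
    (1, "2019-08-01", "2019-08-02"), (2, "2019-08-02", "2019-08-02"),
    (1, "2019-08-11", "2019-08-12"), (3, "2019-08-24", "2019-08-24"),
    (3, "2019-08-21", "2019-08-22"), (2, "2019-08-11", "2019-08-13"),
    (4, "2019-08-09", "2019-08-09"),
]

# Each customer's first order (the MIN(order_date) CTE).
first = {}
for cust, order_date, pref in deliveries:
    if cust not in first or order_date < first[cust][0]:
        first[cust] = (order_date, pref)

# Immediate = order date equals the preferred delivery date.
immediate = sum(1 for d, p in first.values() if d == p)
pct = round(100 * immediate / len(first), 2)
print(pct)  # 50.0
```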
+ +## Table Schema Structure + +```sql +Create table If Not Exists Delivery (delivery_id int, customer_id int, order_date date, customer_pref_delivery_date date); +``` + +## Sample Input Data + +```sql +insert into Delivery (delivery_id, customer_id, order_date, customer_pref_delivery_date) values ('1', '1', '2019-08-01', '2019-08-02'); +insert into Delivery (delivery_id, customer_id, order_date, customer_pref_delivery_date) values ('2', '2', '2019-08-02', '2019-08-02'); +insert into Delivery (delivery_id, customer_id, order_date, customer_pref_delivery_date) values ('3', '1', '2019-08-11', '2019-08-12'); +insert into Delivery (delivery_id, customer_id, order_date, customer_pref_delivery_date) values ('4', '3', '2019-08-24', '2019-08-24'); +insert into Delivery (delivery_id, customer_id, order_date, customer_pref_delivery_date) values ('5', '3', '2019-08-21', '2019-08-22'); +insert into Delivery (delivery_id, customer_id, order_date, customer_pref_delivery_date) values ('6', '2', '2019-08-11', '2019-08-13'); +insert into Delivery (delivery_id, customer_id, order_date, customer_pref_delivery_date) values ('7', '4', '2019-08-09', '2019-08-09'); +``` + +## Expected Output Data + +```text ++----------------------+ +| immediate_percentage | ++----------------------+ +| 50.00 | ++----------------------+ +``` + +## SQL Solution + +```sql +WITH cte AS( + SELECT customer_id,MIN(order_date) AS first_order + FROM delivery_1174 + GROUP BY customer_id +) + +SELECT ROUND((COUNT(CASE WHEN d.order_date = d.customer_pref_delivery_date THEN 1 ELSE NULL END)::NUMERIC/COUNT(*))*100,2) + AS immediate_percentage +FROM delivery_1174 d +INNER JOIN cte c ON d.customer_id = c.customer_id AND d.order_date = c.first_order; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `immediate_percentage` from `delivery`, `cte`. + +### Result Grain + +Row grain follows the post-filter join output. + +### Step-by-Step Logic + +1. 
Create CTE layers (`cte`) to decompose the logic into smaller, testable steps before the final SELECT. +2. CTE `cte`: reads `delivery`. +3. Combine datasets using INNER JOIN, JOIN. Join predicates control row matching and prevent accidental cartesian growth. +4. Project final output columns: `immediate_percentage`. + +### Why This Works + +CTEs separate transformation stages, which makes dependencies explicit and easier to validate. Join conditions align related entities so each output row is built from the correct source records. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are join operations. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently. + diff --git a/medium/1174. Immediate Food Delivery II.sql b/medium/1174. Immediate Food Delivery II.sql deleted file mode 100644 index c238c78..0000000 --- a/medium/1174. Immediate Food Delivery II.sql +++ /dev/null @@ -1,10 +0,0 @@ -WITH cte AS( - SELECT customer_id,MIN(order_date) AS first_order - FROM delivery_1174 - GROUP BY customer_id -) - -SELECT ROUND((COUNT(CASE WHEN d.order_date = d.customer_pref_delivery_date THEN 1 ELSE NULL END)::NUMERIC/COUNT(*))*100,2) - AS immediate_percentage -FROM delivery_1174 d -INNER JOIN cte c ON d.customer_id = c.customer_id AND d.order_date = c.first_order; diff --git a/medium/1193. Monthly Transactions I.md b/medium/1193. Monthly Transactions I.md new file mode 100644 index 0000000..435b890 --- /dev/null +++ b/medium/1193. Monthly Transactions I.md @@ -0,0 +1,75 @@ +# Question 1193: Monthly Transactions I + +**LeetCode URL:** https://leetcode.com/problems/monthly-transactions-i/ + +## Description + +Write an SQL query to find for each month and country, the number of transactions and their total amount, the number of approved transactions and their total amount. 
The query result format is in the following example:
+
+Transactions table:
+
+| id  | country | state    | amount | trans_date |
+|-----|---------|----------|--------|------------|
+| 121 | US      | approved | 1000   | 2018-12-18 |
+| 122 | US      | declined | 2000   | 2018-12-19 |
+| 123 | US      | approved | 2000   | 2019-01-01 |
+| 124 | DE      | approved | 2000   | 2019-01-07 |
+
+Result table:
+
+| month   | country | trans_count | approved_count | trans_total_amount | approved_total_amount |
+|---------|---------|-------------|----------------|--------------------|-----------------------|
+| 2018-12 | US      | 2           | 1              | 3000               | 1000                  |
+| 2019-01 | US      | 1           | 1              | 2000               | 2000                  |
+| 2019-01 | DE      | 1           | 1              | 2000               | 2000                  |
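The month-and-country grouping can be sketched in Python over the sample transactions (illustrative; `day[:7]` plays the role of `TO_CHAR(trans_date, 'YYYY-MM')`):

```python
from collections import defaultdict

txns = [  # (country, state, amount, trans_date)
    ("US", "approved", 1000, "2018-12-18"),
    ("US", "declined", 2000, "2018-12-19"),
    ("US", "approved", 2000, "2019-01-01"),
    ("DE", "approved", 2000, "2019-01-07"),
]

# Per (month, country): trans_count, approved_count, total_amount, approved_amount.
stats = defaultdict(lambda: [0, 0, 0, 0])
for country, state, amount, day in txns:
    s = stats[(day[:7], country)]
    s[0] += 1
    s[2] += amount
    if state == "approved":  # the conditional COUNT/SUM branches
        s[1] += 1
        s[3] += amount

for key in sorted(stats):
    print(key, stats[key])
```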
+ +## Table Schema Structure + +```sql +Create table If Not Exists Transactions (id int, country varchar(4), state enum('approved', 'declined'), amount int, trans_date date); +``` + +## Sample Input Data + +```sql +insert into Transactions (id, country, state, amount, trans_date) values ('121', 'US', 'approved', '1000', '2018-12-18'); +insert into Transactions (id, country, state, amount, trans_date) values ('122', 'US', 'declined', '2000', '2018-12-19'); +insert into Transactions (id, country, state, amount, trans_date) values ('123', 'US', 'approved', '2000', '2019-01-01'); +insert into Transactions (id, country, state, amount, trans_date) values ('124', 'DE', 'approved', '2000', '2019-01-07'); +``` + +## Expected Output Data + +```text ++----------+---------+-------------+----------------+--------------------+-----------------------+ +| month | country | trans_count | approved_count | trans_total_amount | approved_total_amount | ++----------+---------+-------------+----------------+--------------------+-----------------------+ +| 2018-12 | US | 2 | 1 | 3000 | 1000 | +| 2019-01 | US | 1 | 1 | 2000 | 2000 | +| 2019-01 | DE | 1 | 1 | 2000 | 2000 | ++----------+---------+-------------+----------------+--------------------+-----------------------+ +``` + +## SQL Solution + +```sql +SELECT TO_CHAR(trans_date,'YYYY-MM') AS month,country, + COUNT(id) AS trans_count, + COUNT(CASE WHEN state = 'approved' THEN 1 ELSE NULL END) AS approved_count, + SUM(amount) AS trans_total_amount, + SUM(CASE WHEN state = 'approved' THEN amount ELSE 0 END) AS approved_total_amount +FROM transactions_1193 +GROUP BY month,country +ORDER BY month; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `month`, `country`, `trans_count`, `approved_count`, `trans_total_amount`, `approved_total_amount` from `transactions`. + +### Result Grain + +One row per unique key in `GROUP BY month,country`. + +### Step-by-Step Logic + +1. Aggregate rows with COUNT, SUM grouped by month,country. +2. Project final output columns: `month`, `country`, `trans_count`, `approved_count`, `trans_total_amount`, `approved_total_amount`. +3. Order output deterministically with `ORDER BY month`. + +### Why This Works + +Grouping keys define the reporting grain; aggregate functions then summarize values at exactly that grain. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are sorting/grouping. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Every non-aggregated selected column must belong to the grouping grain. + diff --git a/medium/1193. Monthly Transactions I.sql b/medium/1193. Monthly Transactions I.sql deleted file mode 100644 index b123b3d..0000000 --- a/medium/1193. Monthly Transactions I.sql +++ /dev/null @@ -1,7 +0,0 @@ -SELECT TO_CHAR(trans_date,'YYYY-MM') AS month,country, - COUNT(id) AS trans_count, - COUNT(CASE WHEN state = 'approved' THEN 1 ELSE NULL END) AS approved_count, - SUM(amount) AS trans_total_amount -FROM transactions_1193 -GROUP BY month,country -ORDER BY month; diff --git a/medium/1204. Last Person to Fit in the Elevator.md b/medium/1204. Last Person to Fit in the Elevator.md new file mode 100644 index 0000000..fe19d6f --- /dev/null +++ b/medium/1204. Last Person to Fit in the Elevator.md @@ -0,0 +1,80 @@ +# Question 1204: Last Person to Fit in the Bus + +**LeetCode URL:** https://leetcode.com/problems/last-person-to-fit-in-the-bus/ + +## Description + +Write a solution to find the person_name of the last person that can fit on the bus without exceeding the weight limit of 1000 kilograms; people board one at a time in the order given by `turn`. The result format is in the following example.
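The running-total-with-a-cap idea can be sketched in Python over the sample queue (illustrative only; `sorted(queue)` orders by turn because turn is the first tuple element):

```python
queue = [  # (turn, person_name, weight)
    (1, "Alice", 250), (5, "Bob", 175), (2, "Alex", 350),
    (3, "John Cena", 400), (6, "Winston", 500), (4, "Marie", 200),
]

# Running SUM(weight) OVER (ORDER BY turn); remember the last person who fits.
total, last = 0, None
for turn, name, weight in sorted(queue):
    total += weight
    if total > 1000:  # the weight limit
        break
    last = name
print(last)  # John Cena
```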
+ +## Table Schema Structure + +```sql +Create table If Not Exists Queue (person_id int, person_name varchar(30), weight int, turn int); +``` + +## Sample Input Data + +```sql +insert into Queue (person_id, person_name, weight, turn) values ('5', 'Alice', '250', '1'); +insert into Queue (person_id, person_name, weight, turn) values ('4', 'Bob', '175', '5'); +insert into Queue (person_id, person_name, weight, turn) values ('3', 'Alex', '350', '2'); +insert into Queue (person_id, person_name, weight, turn) values ('6', 'John Cena', '400', '3'); +insert into Queue (person_id, person_name, weight, turn) values ('1', 'Winston', '500', '6'); +insert into Queue (person_id, person_name, weight, turn) values ('2', 'Marie', '200', '4'); +``` + +## Expected Output Data + +```text ++-------------+ +| person_name | ++-------------+ +| John Cena | ++-------------+ +``` + +## SQL Solution + +```sql +WITH cte AS( + SELECT *, + SUM(weight) OVER (ORDER BY turn) AS wsum + FROM queue_1204 +) + +SELECT person_name +FROM cte +WHERE turn = (SELECT MAX(turn) FROM cte WHERE wsum<=1000); +``` + +## Solution Breakdown + +### Goal + +The query builds the final result column `person_name` from `queue`, `cte`. + +### Result Grain + +A single row: the person at the last turn whose running weight total is still within the 1000 limit. + +### Step-by-Step Logic + +1. Create CTE layers (`cte`) to decompose the logic into smaller, testable steps before the final SELECT. +2. CTE `cte`: reads `queue`, computes window metrics. +3. Apply row-level filtering in `WHERE`: turn = (SELECT MAX(turn) FROM cte WHERE wsum<=1000). +4. Use window functions (`... OVER (...)`) to compute rankings/running metrics while preserving row-level detail. +5. Project final output columns: `person_name`. + +### Why This Works + +CTEs separate transformation stages, which makes dependencies explicit and easier to validate. Window expressions calculate comparative metrics without collapsing rows too early.
Predicate filtering removes irrelevant rows before expensive downstream computation. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are window partitions, subquery execution. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Window `PARTITION BY`/`ORDER BY` choices change semantics; test with tied values and nulls. +- If nullable columns are involved, validate null-handling explicitly (`IS NULL`, `COALESCE`). + diff --git a/medium/1204. Last Person to Fit in the Elevator.sql b/medium/1204. Last Person to Fit in the Elevator.sql deleted file mode 100644 index 01e760e..0000000 --- a/medium/1204. Last Person to Fit in the Elevator.sql +++ /dev/null @@ -1,9 +0,0 @@ -WITH cte AS( - SELECT *, - SUM(weight) OVER (ORDER BY turn) AS wsum - FROM queue_1204 -) - -SELECT person_name -FROM cte -WHERE turn = (SELECT MAX(turn) FROM cte WHERE wsum<=1000); diff --git a/medium/1205. Monthly Transactions II.md b/medium/1205. Monthly Transactions II.md new file mode 100644 index 0000000..3e91a52 --- /dev/null +++ b/medium/1205. Monthly Transactions II.md @@ -0,0 +1,101 @@ +# Question 1205: Monthly Transactions II + +**LeetCode URL:** https://leetcode.com/problems/monthly-transactions-ii/ + +## Description + +Write an SQL query to find for each month and country, the number of approved transactions and their total amount, the number of chargebacks and their total amount. 
The query result format is in the following example: Transactions table: +------+---------+----------+--------+------------+ | id | country | state | amount | trans_date | +------+---------+----------+--------+------------+ | 101 | US | approved | 1000 | 2019-05-18 | | 102 | US | declined | 2000 | 2019-05-19 | | 103 | US | approved | 3000 | 2019-06-10 | | 104 | US | approved | 4000 | 2019-06-13 | | 105 | US | approved | 5000 | 2019-06-15 | +------+---------+----------+--------+------------+ Chargebacks table: +------------+------------+ | trans_id | trans_date | +------------+------------+ | 102 | 2019-05-29 | | 101 | 2019-06-30 | | 105 | 2019-09-18 | +------------+------------+ Result table: +----------+---------+----------------+-----------------+-------------------+--------------------+ | month | country | approved_count | approved_amount | chargeback_count | chargeback_amount | +----------+---------+----------------+-----------------+-------------------+--------------------+ | 2019-05 | US | 1 | 1000 | 1 | 2000 | | 2019-06 | US | 3 | 12000 | 1 | 1000 | | 2019-09 | US | 0 | 0 | 1 | 5000 | +----------+---------+----------------+-----------------+-------------------+--------------------+ Difficulty: Medium Lock: Prime Company: Wish 
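The approved-side CTE in the solution below relies on conditional aggregation: a CASE inside COUNT/SUM keeps only approved rows, so one pass over the table yields per-month approved metrics. A minimal sanity check of that pattern (illustrative only: in-memory SQLite via Python's `sqlite3`, with `strftime` standing in for PostgreSQL's `TO_CHAR`):

```python
import sqlite3

# Illustrative in-memory copy of the sample Transactions data (not the dump's transactions_1205).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transactions (id INT, country TEXT, state TEXT, amount INT, trans_date TEXT)")
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?, ?, ?, ?)",
    [(101, "US", "approved", 1000, "2019-05-18"), (102, "US", "declined", 2000, "2019-05-19"),
     (103, "US", "approved", 3000, "2019-06-10"), (104, "US", "approved", 4000, "2019-06-13"),
     (105, "US", "approved", 5000, "2019-06-15")],
)

# Conditional aggregation: the CASE yields NULL for non-approved rows,
# and COUNT/SUM ignore NULLs, so only approved rows are counted/summed.
rows = conn.execute("""
    SELECT strftime('%Y-%m', trans_date) AS month, country,
           COUNT(CASE WHEN state = 'approved' THEN 1 END) AS approved_count,
           SUM(CASE WHEN state = 'approved' THEN amount END) AS approved_amount
    FROM transactions
    GROUP BY month, country
    ORDER BY month
""").fetchall()
print(rows)  # [('2019-05', 'US', 1, 1000), ('2019-06', 'US', 3, 12000)]
```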
+
+## Table Schema Structure
+
+```sql
+Create table If Not Exists Transactions (id int, country varchar(4), state enum('approved', 'declined'), amount int, trans_date date);
+Create table If Not Exists Chargebacks (trans_id int, trans_date date);
+```
+
+## Sample Input Data
+
+```sql
+insert into Transactions (id, country, state, amount, trans_date) values ('101', 'US', 'approved', '1000', '2019-05-18');
+insert into Transactions (id, country, state, amount, trans_date) values ('102', 'US', 'declined', '2000', '2019-05-19');
+insert into Transactions (id, country, state, amount, trans_date) values ('103', 'US', 'approved', '3000', '2019-06-10');
+insert into Transactions (id, country, state, amount, trans_date) values ('104', 'US', 'approved', '4000', '2019-06-13');
+insert into Transactions (id, country, state, amount, trans_date) values ('105', 'US', 'approved', '5000', '2019-06-15');
+insert into Chargebacks (trans_id, trans_date) values ('102', '2019-05-29');
+insert into Chargebacks (trans_id, trans_date) values ('101', '2019-06-30');
+insert into Chargebacks (trans_id, trans_date) values ('105', '2019-09-18');
+```
+
+## Expected Output Data
+
+```text
++----------+---------+----------------+-----------------+-------------------+--------------------+
+| month    | country | approved_count | approved_amount | chargeback_count  | chargeback_amount  |
++----------+---------+----------------+-----------------+-------------------+--------------------+
+| 2019-05  | US      | 1              | 1000            | 1                 | 2000               |
+| 2019-06  | US      | 3              | 12000           | 1                 | 1000               |
+| 2019-09  | US      | 0              | 0               | 1                 | 5000               |
++----------+---------+----------------+-----------------+-------------------+--------------------+
+```
+
+## SQL Solution
+
+```sql
+WITH cte1 AS (
+    SELECT TO_CHAR(c.trans_date,'YYYY-MM') AS month,t.country,
+           COUNT(c.trans_id) AS chargeback_count,
+           SUM(t.amount) AS chargeback_amount
+    FROM chargebacks_1205 c
+    JOIN transactions_1205 t ON t.id = c.trans_id
+    GROUP BY TO_CHAR(c.trans_date,'YYYY-MM'),t.country
+),
+
+cte2 AS (
+    SELECT TO_CHAR(trans_date,'YYYY-MM') AS month,country,
+           COUNT(CASE WHEN state='approved' THEN 1 ELSE NULL END) AS approved_count,
+           SUM(CASE WHEN state='approved' THEN amount ELSE NULL END) AS approved_amount
+    FROM transactions_1205
+    GROUP BY TO_CHAR(trans_date,'YYYY-MM'),country
+)
+
+SELECT COALESCE(c1.month,c2.month) AS month,
+       COALESCE(c1.country,c2.country) AS country,
+       COALESCE(c2.approved_count,0) AS approved_count,
+       COALESCE(c2.approved_amount,0) AS approved_amount,
+       COALESCE(c1.chargeback_count,0) AS chargeback_count,
+       COALESCE(c1.chargeback_amount,0) AS chargeback_amount
+FROM cte1 c1
+FULL OUTER JOIN cte2 c2 ON c1.month = c2.month AND c1.country = c2.country
+ORDER BY month;
+```
+
+## Solution Breakdown
+
+### Goal
+
+The query builds the final result columns `month`, `country`, `approved_count`, `approved_amount`, `chargeback_count`, `chargeback_amount` from `chargebacks_1205` and `transactions_1205`, staged through the CTEs `cte1` and `cte2`.
+
+### Result Grain
+
+One row per (month, country) combination present in either CTE; the FULL OUTER JOIN keeps months that have only approvals or only chargebacks.
+
+### Step-by-Step Logic
+
+1. Create CTE layers (`cte1`, `cte2`) to decompose the logic into smaller, testable steps before the final SELECT.
+2. CTE `cte1`: joins `chargebacks_1205` to `transactions_1205` and aggregates chargeback counts and amounts by chargeback month and country.
+3. CTE `cte2`: aggregates approved counts and amounts from `transactions_1205` by transaction month and country.
+4. Combine datasets using FULL OUTER JOIN, JOIN. Join predicates control row matching and prevent accidental cartesian growth.
+5. Project final output columns: `month`, `country`, `approved_count`, `approved_amount`, `chargeback_count`, `chargeback_amount`, using COALESCE to default missing sides to 0.
+6. Order output deterministically with `ORDER BY month`.
+
+### Why This Works
+
+CTEs separate transformation stages, which makes dependencies explicit and easier to validate. Join conditions align related entities so each output row is built from the correct source records. The final projection exposes only the columns required by the result contract.
+
+### Performance Notes
+
+Primary cost drivers are sorting/grouping, join operations. Indexes on join/filter/group keys typically provide the biggest speedup. 
+ +### Common Pitfalls + +- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently. + diff --git a/medium/1205. Monthly Transactions II.sql b/medium/1205. Monthly Transactions II.sql deleted file mode 100644 index dae6aa8..0000000 --- a/medium/1205. Monthly Transactions II.sql +++ /dev/null @@ -1,25 +0,0 @@ -WITH cte1 AS ( - SELECT TO_CHAR(c.charge_date,'YYYY-MM') AS month,t.country, - COUNT(c.trans_id) AS chargeback_count, - SUM(t.amount) AS chargeback_amount - FROM chargebacks_1205 c - JOIN transactions_1205 t ON t.id = c.trans_id - GROUP BY TO_CHAR(charge_date,'YYYY-MM'),t.country -), - -cte2 AS ( - SELECT TO_CHAR(trans_date,'YYYY-MM') AS month,country, - COUNT(CASE WHEN state='approved' THEN 1 ELSE NULL END) AS approved_count, - SUM(CASE WHEN state='approved' THEN amount ELSE NULL END) AS approved_amount - FROM transactions_1205 - GROUP BY TO_CHAR(trans_date,'YYYY-MM'),country -) - -SELECT c1.month,c1.country, - COALESCE(c2.approved_count,0) AS approved_count, - COALESCE(c2.approved_amount,0) AS approved_amount, - COALESCE(c1.chargeback_count,0) AS chargeback_count, - COALESCE(c1.chargeback_amount,0) AS chargeback_amount -FROM cte1 c1 -FULL OUTER JOIN cte2 c2 ON c1.month = c2.month -ORDER BY c1.month; diff --git a/medium/1212. Team Scores in Football Tournament.md b/medium/1212. Team Scores in Football Tournament.md new file mode 100644 index 0000000..5f1e00b --- /dev/null +++ b/medium/1212. 
Team Scores in Football Tournament.md
@@ -0,0 +1,109 @@
+# Question 1212: Team Scores in Football Tournament
+
+**LeetCode URL:** https://leetcode.com/problems/team-scores-in-football-tournament/
+
+## Description
+
+Write an SQL query that selects the team_id, team_name and num_points of each team in the tournament after all described matches: a team receives three points for a win, one point for a draw, and no points for a loss. Order the result table by num_points in decreasing order; in case of a tie, order by team_id in increasing order. The query result format is in the following example: Teams table: +-----------+--------------+ | team_id | team_name | +-----------+--------------+ | 10 | Leetcode FC | | 20 | NewYork FC | | 30 | Atlanta FC | | 40 | Chicago FC | | 50 | Toronto FC | +-----------+--------------+ Matches table: +------------+--------------+---------------+-------------+--------------+ | match_id | host_team | guest_team | host_goals | guest_goals | +------------+--------------+---------------+-------------+--------------+ | 1 | 10 | 20 | 3 | 0 | | 2 | 30 | 10 | 2 | 2 | | 3 | 10 | 50 | 5 | 1 | | 4 | 20 | 30 | 1 | 0 | | 5 | 50 | 30 | 1 | 0 | +------------+--------------+---------------+-------------+--------------+ Result table: +------------+--------------+---------------+ | team_id | team_name | num_points | +------------+--------------+---------------+ | 10 | Leetcode FC | 7 | | 20 | NewYork FC | 3 | | 50 | Toronto FC | 3 | | 30 | Atlanta FC | 1 | | 40 | Chicago FC | 0 | +------------+--------------+---------------+ Difficulty: Medium Lock: Prime Company: Oracle Wayfair 
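The solution below expands each match into per-team rows and converts goal differences into points with CASE expressions. The same idea can be sanity-checked outside PostgreSQL (illustrative only: in-memory SQLite via Python's `sqlite3`, using a LEFT JOIN from `teams` in place of the repo's RIGHT JOIN, which older SQLite versions lack):

```python
import sqlite3

# Illustrative in-memory copies of the sample Teams/Matches data (not the dump's *_1212 tables).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE teams (team_id INT, team_name TEXT)")
conn.execute("CREATE TABLE matches (match_id INT, host_team INT, guest_team INT, host_goals INT, guest_goals INT)")
conn.executemany("INSERT INTO teams VALUES (?, ?)",
                 [(10, "Leetcode FC"), (20, "NewYork FC"), (30, "Atlanta FC"),
                  (40, "Chicago FC"), (50, "Toronto FC")])
conn.executemany("INSERT INTO matches VALUES (?, ?, ?, ?, ?)",
                 [(1, 10, 20, 3, 0), (2, 30, 10, 2, 2), (3, 10, 50, 5, 1),
                  (4, 20, 30, 1, 0), (5, 50, 30, 1, 0)])

# Expand every match into one row per participating team, scoring 3/1/0 via CASE,
# then LEFT JOIN from teams so teams with no matches still appear with 0 points.
rows = conn.execute("""
    WITH pts AS (
        SELECT host_team AS team_id,
               CASE WHEN host_goals > guest_goals THEN 3
                    WHEN host_goals = guest_goals THEN 1 ELSE 0 END AS p
        FROM matches
        UNION ALL
        SELECT guest_team,
               CASE WHEN guest_goals > host_goals THEN 3
                    WHEN guest_goals = host_goals THEN 1 ELSE 0 END
        FROM matches
    )
    SELECT t.team_id, t.team_name, COALESCE(SUM(pts.p), 0) AS num_points
    FROM teams t LEFT JOIN pts ON pts.team_id = t.team_id
    GROUP BY t.team_id, t.team_name
    ORDER BY num_points DESC, t.team_id
""").fetchall()
print(rows)
# [(10, 'Leetcode FC', 7), (20, 'NewYork FC', 3), (50, 'Toronto FC', 3), (30, 'Atlanta FC', 1), (40, 'Chicago FC', 0)]
```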
+ +## Table Schema Structure + +```sql +Create table If Not Exists Teams (team_id int, team_name varchar(30)); +Create table If Not Exists Matches (match_id int, host_team int, guest_team int, host_goals int, guest_goals int); +``` + +## Sample Input Data + +```sql +insert into Teams (team_id, team_name) values ('10', 'Leetcode FC'); +insert into Teams (team_id, team_name) values ('20', 'NewYork FC'); +insert into Teams (team_id, team_name) values ('30', 'Atlanta FC'); +insert into Teams (team_id, team_name) values ('40', 'Chicago FC'); +insert into Teams (team_id, team_name) values ('50', 'Toronto FC'); +insert into Matches (match_id, host_team, guest_team, host_goals, guest_goals) values ('1', '10', '20', '3', '0'); +insert into Matches (match_id, host_team, guest_team, host_goals, guest_goals) values ('2', '30', '10', '2', '2'); +insert into Matches (match_id, host_team, guest_team, host_goals, guest_goals) values ('3', '10', '50', '5', '1'); +insert into Matches (match_id, host_team, guest_team, host_goals, guest_goals) values ('4', '20', '30', '1', '0'); +insert into Matches (match_id, host_team, guest_team, host_goals, guest_goals) values ('5', '50', '30', '1', '0'); +``` + +## Expected Output Data + +```text ++------------+--------------+---------------+ +| team_id | team_name | num_points | ++------------+--------------+---------------+ +| 10 | Leetcode FC | 7 | +| 20 | NewYork FC | 3 | +| 50 | Toronto FC | 3 | +| 30 | Atlanta FC | 1 | +| 40 | Chicago FC | 0 | ++------------+--------------+---------------+ +``` + +## SQL Solution + +```sql +WITH cte AS( + SELECT host_team,guest_team,host_goals,guest_goals + FROM matches_1212 + UNION ALL + SELECT guest_team AS host_team,host_team AS guest_team,host_goals,guest_goals + FROM matches_1212 + WHERE host_goals=guest_goals +), +cte2 AS ( + SELECT + CASE WHEN host_goals > guest_goals THEN host_team + WHEN host_goals < guest_goals THEN guest_team + ELSE host_team + END AS winner, + CASE WHEN host_goals > guest_goals 
THEN 3
+            WHEN host_goals < guest_goals THEN 3
+            ELSE 1
+       END AS points
+    FROM cte
+)
+
+SELECT t.team_id,t.team_name,COALESCE(SUM(c.points),0) AS num_points
+FROM cte2 c
+RIGHT JOIN teams_1212 t ON t.team_id = c.winner
+GROUP BY t.team_id,t.team_name
+ORDER BY num_points DESC,t.team_id;
+```
+
+## Solution Breakdown
+
+### Goal
+
+The query builds the final result columns `team_id`, `team_name`, `num_points` from `matches`, `cte`, `cte2`, `teams`.
+
+### Result Grain
+
+One row per unique key in `GROUP BY t.team_id,t.team_name`.
+
+### Step-by-Step Logic
+
+1. Create CTE layers (`cte`, `cte2`) to decompose the logic into smaller, testable steps before the final SELECT.
+2. CTE `cte`: reads `matches`, duplicating drawn matches with host/guest swapped so both sides earn their draw point.
+3. CTE `cte2`: reads `cte` and converts each row into a (winner, points) pair via CASE.
+4. Combine datasets using RIGHT JOIN, JOIN. Join predicates control row matching and prevent accidental cartesian growth.
+5. Aggregate rows with SUM grouped by t.team_id,t.team_name.
+6. Project final output columns: `team_id`, `team_name`, `num_points`.
+7. Merge compatible result sets with `UNION`/`UNION ALL` before final projection.
+8. Order output deterministically with `ORDER BY num_points DESC,t.team_id`.
+
+### Why This Works
+
+CTEs separate transformation stages, which makes dependencies explicit and easier to validate. Join conditions align related entities so each output row is built from the correct source records. Grouping keys define the reporting grain; aggregate functions then summarize values at exactly that grain. Set-union logic combines multiple valid pathways into one consistent output. The final projection exposes only the columns required by the result contract.
+
+### Performance Notes
+
+Primary cost drivers are sorting/grouping, join operations. Indexes on join/filter/group keys typically provide the biggest speedup.
+
+### Common Pitfalls
+
+- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently.
+- Every non-aggregated selected column must belong to the grouping grain.
+
diff --git a/medium/1212. 
Team Scores in Football Tournament.sql b/medium/1212. Team Scores in Football Tournament.sql deleted file mode 100644 index 84ca6d1..0000000 --- a/medium/1212. Team Scores in Football Tournament.sql +++ /dev/null @@ -1,26 +0,0 @@ -WITH cte AS( - SELECT host_team,guest_team,host_goals,guest_goals - FROM matches_1212 - UNION ALL - SELECT guest_team AS host_team,host_team AS guest_team,host_goals,guest_goals - FROM matches_1212 - WHERE host_goals=guest_goals -), -cte2 AS ( - SELECT - CASE WHEN host_goals > guest_goals THEN host_team - WHEN host_goals < guest_goals THEN guest_team - ELSE host_team - END AS winner, - CASE WHEN host_goals > guest_goals THEN 3 - WHEN host_goals < guest_goals THEN 3 - ELSE 1 - END AS points - FROM cte -) - -SELECT t.team_id,t.team_name,COALESCE(SUM(c.points),0) AS points -FROM cte2 c -RIGHT JOIN teams_1212 t ON t.team_id = c.winner -GROUP BY t.team_id,t.team_name -ORDER BY points DESC; diff --git a/medium/1264. Page Recommendations.md b/medium/1264. Page Recommendations.md new file mode 100644 index 0000000..ad3dd68 --- /dev/null +++ b/medium/1264. Page Recommendations.md @@ -0,0 +1,132 @@ +# Question 1264: Page Recommendations + +**LeetCode URL:** https://leetcode.com/problems/page-recommendations/ + +## Description + +Write an SQL query to recommend pages to the user with user_id = 1 using the pages that your friends liked. Return result table in any order without duplicates. 
The query result format is in the following example: Friendship table: +----------+----------+ | user1_id | user2_id | +----------+----------+ | 1 | 2 | | 1 | 3 | | 1 | 4 | | 2 | 3 | | 2 | 4 | | 2 | 5 | | 6 | 1 | +----------+----------+ Likes table: +---------+---------+ | user_id | page_id | +---------+---------+ | 1 | 88 | | 2 | 23 | | 3 | 24 | | 4 | 56 | | 5 | 11 | | 6 | 33 | | 2 | 77 | | 3 | 77 | | 6 | 88 | +---------+---------+ Result table: +------------------+ | recommended_page | +------------------+ | 23 | | 24 | | 56 | | 33 | | 77 | +------------------+ User 1 is friends with users 2, 3, 4, and 6.
+
+## Table Schema Structure
+
+```sql
+Create table If Not Exists Friendship (user1_id int, user2_id int);
+Create table If Not Exists Likes (user_id int, page_id int);
+```
+
+## Sample Input Data
+
+```sql
+insert into Friendship (user1_id, user2_id) values ('1', '2');
+insert into Friendship (user1_id, user2_id) values ('1', '3');
+insert into Friendship (user1_id, user2_id) values ('1', '4');
+insert into Friendship (user1_id, user2_id) values ('2', '3');
+insert into Friendship (user1_id, user2_id) values ('2', '4');
+insert into Friendship (user1_id, user2_id) values ('2', '5');
+insert into Friendship (user1_id, user2_id) values ('6', '1');
+insert into Likes (user_id, page_id) values ('1', '88');
+insert into Likes (user_id, page_id) values ('2', '23');
+insert into Likes (user_id, page_id) values ('3', '24');
+insert into Likes (user_id, page_id) values ('4', '56');
+insert into Likes (user_id, page_id) values ('5', '11');
+insert into Likes (user_id, page_id) values ('6', '33');
+insert into Likes (user_id, page_id) values ('2', '77');
+insert into Likes (user_id, page_id) values ('3', '77');
+insert into Likes (user_id, page_id) values ('6', '88');
+```
+
+## Expected Output Data
+
+```text
++------------------+
+| recommended_page |
++------------------+
+| 23               |
+| 24               |
+| 56               |
+| 33               |
+| 77               |
++------------------+
+```
+
+## SQL Solution
+
+```sql
+WITH cte AS(
+    SELECT user1_id,user2_id
+    FROM friendship_1264
+    UNION
+    SELECT user2_id,user1_id
+    FROM friendship_1264
+),
+
+friends AS(
+    SELECT user2_id AS friends
+    FROM cte WHERE user1_id = 1
+)
+
+SELECT DISTINCT page_id AS recommended_page
+FROM likes_1264
+WHERE user_id IN (SELECT * FROM friends) AND
+      page_id NOT IN (SELECT DISTINCT page_id FROM likes_1264 WHERE user_id = 1)
+ORDER BY 1;
+
+--------------------------- OR ----------------------------
+
+WITH likes AS (
+    SELECT user_id,ARRAY_AGG(page_id) as liked_pages
+    FROM likes_1264
+    GROUP BY user_id
+),friends AS (
+    SELECT user1_id,user2_id,page_id
+    FROM friendship_1264 f
+    JOIN likes_1264 l ON f.user1_id = l.user_id
+    UNION ALL
+    SELECT user2_id,user1_id,page_id
+    FROM friendship_1264 f
+    JOIN likes_1264 l ON f.user2_id = l.user_id
+),reco AS(
+    SELECT f.user1_id AS from_user, f.user2_id AS friend, f.page_id AS page_to_reco, liked_pages AS friend_already_liked_pages
+    FROM friends f
+    JOIN likes l ON f.user2_id = l.user_id AND (NOT f.page_id = ANY (liked_pages))
+)
+SELECT DISTINCT page_to_reco
+FROM reco
+WHERE friend = 1
+ORDER BY page_to_reco;
+```
+
+## Solution Breakdown
+
+### Goal
+
+Both queries build a single result column (`recommended_page` / `page_to_reco`) from `friendship_1264` and `likes_1264`, staged through the CTEs `cte`/`friends` (first solution) and `likes`/`friends`/`reco` (second solution).
+
+### Result Grain
+
+One row per distinct combination of projected columns.
+
+### Step-by-Step Logic
+
+1. Create CTE layers (`cte`, `friends`, `likes`, `reco`) to decompose the logic into smaller, testable steps before the final SELECT.
+2. CTE `cte`: reads `friendship_1264`, normalizing each friendship so it appears in both directions.
+3. CTE `friends`: reads `cte`, applies row filters.
+4. Combine datasets using JOIN. Join predicates control row matching and prevent accidental cartesian growth.
+5. Apply row-level filtering in `WHERE`: friend = 1.
+6. Project the final output column.
+7. Merge compatible result sets with `UNION`/`UNION ALL` before final projection.
+8. Remove duplicate result tuples using `DISTINCT` where uniqueness is required.
+9. 
Order output deterministically with `ORDER BY page_to_reco`. + +### Why This Works + +CTEs separate transformation stages, which makes dependencies explicit and easier to validate. Join conditions align related entities so each output row is built from the correct source records. Predicate filtering removes irrelevant rows before expensive downstream computation. Set-union logic combines multiple valid pathways into one consistent output. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are sorting/grouping, join operations. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently. +- If nullable columns are involved, validate null-handling explicitly (`IS NULL`, `COALESCE`). + diff --git a/medium/1264. Page Recommendations.sql b/medium/1264. Page Recommendations.sql deleted file mode 100644 index e0d755e..0000000 --- a/medium/1264. 
Page Recommendations.sql +++ /dev/null @@ -1,42 +0,0 @@ -WITH cte AS( - SELECT user_id1,user_id2 - FROM friendship_1264 - UNION - SELECT user_id2,user_id1 - FROM friendship_1264 -), - -friends AS( - SELECT user_id2 AS friends - FROM cte WHERE user_id1 = 1 -) - -SELECT DISTINCT page_id -FROM likes_1264 -WHERE user_id IN (SELECT * FROM friends) AND - page_id NOT IN (SELECT DISTINCT page_id FROM likes_1264 WHERE user_id = 1) -ORDER BY 1; - ---------------------------- OR ---------------------------- - -WITH likes AS ( - SELECT user_id,ARRAY_AGG(page_id) as liked_pages - FROM likes_1264 - GROUP BY user_id -),friends AS ( - SELECT user_id1,user_id2,page_id - FROM friendship_1264 f - JOIN likes_1264 l ON f.user_id1 = l.user_id - UNION ALL - SELECT user_id2,user_id1,page_id - FROM friendship_1264 f - JOIN likes_1264 l ON f.user_id2 = l.user_id -),reco AS( - SELECT f.user_id1 AS from_user, f.user_id2 AS friend, f.page_id AS page_to_reco, liked_pages AS friend_already_liked_pages - FROM friends f - JOIN likes l ON f.user_id2 = l.user_id AND (NOT f.page_id = ANY (liked_pages)) -) -SELECT DISTINCT page_to_reco -FROM reco -WHERE friend = 1 -ORDER BY page_to_reco; diff --git a/medium/1270. All People Report to the Given Manager.md b/medium/1270. All People Report to the Given Manager.md new file mode 100644 index 0000000..10841d5 --- /dev/null +++ b/medium/1270. All People Report to the Given Manager.md @@ -0,0 +1,130 @@ +# Question 1270: All People Report to the Given Manager + +**LeetCode URL:** https://leetcode.com/problems/all-people-report-to-the-given-manager/ + +## Description + +Write an SQL query to find employee_id of all employees that directly or indirectly report their work to the head of the company. Return result table in any order without duplicates. 
The query result format is in the following example: Employees table: +-------------+---------------+------------+ | employee_id | employee_name | manager_id | +-------------+---------------+------------+ | 1 | Boss | 1 | | 3 | Alice | 3 | | 2 | Bob | 1 | | 4 | Daniel | 2 | | 7 | Luis | 4 | | 8 | Jhon | 3 | | 9 | Angela | 8 | | 77 | Robert | 1 | +-------------+---------------+------------+ Result table: +-------------+ | employee_id | +-------------+ | 2 | | 77 | | 4 | | 7 | +-------------+ The head of the company is the employee with employee_id 1. + +## Table Schema Structure + +```sql +Create table If Not Exists Employees (employee_id int, employee_name varchar(30), manager_id int); +``` + +## Sample Input Data + +```sql +insert into Employees (employee_id, employee_name, manager_id) values ('1', 'Boss', '1'); +insert into Employees (employee_id, employee_name, manager_id) values ('3', 'Alice', '3'); +insert into Employees (employee_id, employee_name, manager_id) values ('2', 'Bob', '1'); +insert into Employees (employee_id, employee_name, manager_id) values ('4', 'Daniel', '2'); +insert into Employees (employee_id, employee_name, manager_id) values ('7', 'Luis', '4'); +insert into Employees (employee_id, employee_name, manager_id) values ('8', 'John', '3'); +insert into Employees (employee_id, employee_name, manager_id) values ('9', 'Angela', '8'); +insert into Employees (employee_id, employee_name, manager_id) values ('77', 'Robert', '1'); +``` + +## Expected Output Data + +```text ++-------------+ +| employee_id | ++-------------+ +| 2 | +| 77 | +| 4 | +| 7 | ++-------------+ +``` + +## SQL Solution + +```sql +SELECT e1.employee_id +FROM employees_1270 e1 +INNER JOIN employees_1270 e2 +ON e1.manager_id = e2.employee_id + AND e1.manager_id = 1 + AND e1.employee_id <> e2.employee_id +UNION +SELECT e1.employee_id +FROM employees_1270 e1 +INNER JOIN employees_1270 e2 +ON e1.manager_id = e2.employee_id + AND e1.employee_id <> e2.employee_id +INNER JOIN 
employees_1270 e3
+ON e2.manager_id = e3.employee_id
+   AND e2.manager_id = 1
+   AND e2.employee_id <> e3.employee_id
+UNION
+SELECT e1.employee_id
+FROM employees_1270 e1
+INNER JOIN employees_1270 e2
+ON e1.manager_id = e2.employee_id
+   AND e1.employee_id <> e2.employee_id
+INNER JOIN employees_1270 e3
+ON e2.manager_id = e3.employee_id
+   AND e2.employee_id <> e3.employee_id
+INNER JOIN employees_1270 e4
+ON e3.manager_id = e4.employee_id
+   AND e3.manager_id = 1
+   AND e3.employee_id <> e4.employee_id;
+
+--------------(OR)------------
+
+SELECT e1.employee_id
+FROM employees_1270 e1
+INNER JOIN employees_1270 e2
+ON e1.manager_id = e2.employee_id
+INNER JOIN employees_1270 e3
+ON e2.manager_id = e3.employee_id
+WHERE e3.manager_id = 1 AND e1.employee_id <> 1;
+
+--------------(OR)------------
+
+WITH RECURSIVE cte AS (
+    SELECT employee_id,employee_name,manager_id,1 AS level
+    FROM employees_1270
+    WHERE employee_id = 1
+    UNION
+    SELECT e.employee_id,e.employee_name,e.manager_id,level+1 AS level
+    FROM cte c
+    INNER JOIN employees_1270 e ON c.employee_id = e.manager_id
+    WHERE level < 4
+)
+SELECT DISTINCT employee_id FROM cte WHERE employee_id <> 1;
+```
+
+## Solution Breakdown
+
+### Goal
+
+The query builds the single result column `employee_id` from `employees_1270`; the recursive variant stages rows through the CTE `cte`.
+
+### Result Grain
+
+One row per distinct combination of projected columns.
+
+### Step-by-Step Logic
+
+1. Combine datasets using INNER JOIN, JOIN. Join predicates control row matching and prevent accidental cartesian growth.
+2. Apply row-level filtering in `WHERE`: employee_id <> 1.
+3. Project the final output column: `employee_id`.
+4. Remove duplicate result tuples using `DISTINCT` where uniqueness is required.
+
+### Why This Works
+
+Join conditions align related entities so each output row is built from the correct source records. Predicate filtering removes irrelevant rows before expensive downstream computation. 
The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are recursive expansion, join operations. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently. +- If nullable columns are involved, validate null-handling explicitly (`IS NULL`, `COALESCE`). +- Recursive CTEs need a strict termination condition to avoid runaway recursion. + diff --git a/medium/1270. All People Report to the Given Manager.sql b/medium/1270. All People Report to the Given Manager.sql deleted file mode 100644 index c1e4a02..0000000 --- a/medium/1270. All People Report to the Given Manager.sql +++ /dev/null @@ -1,54 +0,0 @@ -SELECT e1.employee_id -FROM employees_1270 e1 -INNER JOIN employees_1270 e2 -ON e1.manager_id = e2.employee_id - AND e1.manager_id = 1 - AND e1.employee_id <> e2.employee_id -UNION -SELECT e1.employee_id -FROM employees_1270 e1 -INNER JOIN employees_1270 e2 -ON e1.manager_id = e2.employee_id - AND e1.employee_id <> e2.employee_id -INNER JOIN employees_1270 e3 -ON e2.manager_id = e3.employee_id - AND e2.manager_id = 1 - AND e2.employee_id <> e3.employee_id -UNION -SELECT e1.employee_id -FROM employees_1270 e1 -INNER JOIN employees_1270 e2 -ON e1.manager_id = e2.employee_id - AND e1.employee_id <> e2.employee_id -INNER JOIN employees_1270 e3 -ON e2.manager_id = e3.employee_id - AND e2.employee_id <> e3.employee_id -INNER JOIN employees_1270 e4 -ON e3.manager_id = e4.employee_id - AND e3.manager_id = 1 - AND e3.employee_id <> e4.employee_id; - ---------------(OR)------------ - -SELECT e1.employee_id -FROM employees_1270 e1 -INNER JOIN employees_1270 e2 -ON e1.manager_id = e2.employee_id -INNER JOIN employees_1270 e3 -ON e2.manager_id = e3.employee_id -WHERE e3.manager_id = 1 AND e1.employee_id <> 1 - - ---------------(OR)------------ - -WITH RECURSIVE cte AS ( - SELECT 
employee_id,employee_name,manager_id,1 AS level
-    FROM employees_1270
-    WHERE employee_id = 1
-    UNION
-    SELECT e.employee_id,e.employee_name,e.manager_id,level+1 AS level
-    FROM cte c
-    INNER JOIN employees_1270 e ON c.employee_id = e.manager_id
-    WHERE level < 4
-)
-SELECT DISTINCT employee_id,employee_name FROM cte WHERE employee_id <> 1;
diff --git a/medium/1285. Find the Start and End Number of Continuous Ranges.md b/medium/1285. Find the Start and End Number of Continuous Ranges.md
new file mode 100644
index 0000000..f434b84
--- /dev/null
+++ b/medium/1285. Find the Start and End Number of Continuous Ranges.md
@@ -0,0 +1,84 @@
+# Question 1285: Find the Start and End Number of Continuous Ranges
+
+**LeetCode URL:** https://leetcode.com/problems/find-the-start-and-end-number-of-continuous-ranges/
+
+## Description
+
+Since some IDs have been removed from the Logs table, write an SQL query to find the start and end number of each continuous range in the table. The query result format is in the following example: Logs table: +------------+ | log_id | +------------+ | 1 | | 2 | | 3 | | 7 | | 8 | | 10 | +------------+ Result table: +------------+--------------+ | start_id | end_id | +------------+--------------+ | 1 | 3 | | 7 | 8 | | 10 | 10 | +------------+--------------+ The result table should contain all ranges in table Logs. 
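The solution below uses the classic gaps-and-islands trick: `log_id` minus its `ROW_NUMBER()` is constant within each consecutive run. A self-contained sanity check of that trick (illustrative only: in-memory SQLite via Python's `sqlite3`, SQLite >= 3.25 for window functions):

```python
import sqlite3

# Illustrative in-memory copy of the sample Logs data (not the dump's logs_1285).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE logs (log_id INT)")
conn.executemany("INSERT INTO logs VALUES (?)", [(1,), (2,), (3,), (7,), (8,), (10,)])

# Within a consecutive run, log_id and ROW_NUMBER() both increase by 1,
# so their difference is constant: 1,2,3 -> diff 0; 7,8 -> diff 3; 10 -> diff 4.
rows = conn.execute("""
    WITH ranked AS (
        SELECT log_id,
               log_id - ROW_NUMBER() OVER (ORDER BY log_id) AS diff
        FROM logs
    )
    SELECT MIN(log_id) AS start_id, MAX(log_id) AS end_id
    FROM ranked
    GROUP BY diff
    ORDER BY start_id
""").fetchall()
print(rows)  # [(1, 3), (7, 8), (10, 10)]
```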
+ +## Table Schema Structure + +```sql +Create table If Not Exists Logs (log_id int); +``` + +## Sample Input Data + +```sql +insert into Logs (log_id) values ('1'); +insert into Logs (log_id) values ('2'); +insert into Logs (log_id) values ('3'); +insert into Logs (log_id) values ('7'); +insert into Logs (log_id) values ('8'); +insert into Logs (log_id) values ('10'); +``` + +## Expected Output Data + +```text ++------------+--------------+ +| start_id | end_id | ++------------+--------------+ +| 1 | 3 | +| 7 | 8 | +| 10 | 10 | ++------------+--------------+ +``` + +## SQL Solution + +```sql +WITH ranked AS ( + SELECT log_id, + log_id-ROW_NUMBER() OVER (ORDER BY log_id) AS diff + FROM logs_1285 +) + +SELECT MIN(log_id) AS start_id,MAX(log_id) AS end_id +FROM ranked +GROUP BY diff +ORDER BY start_id; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `start_id`, `end_id` from `logs`, `ranked`. + +### Result Grain + +One row per unique key in `GROUP BY diff`. + +### Step-by-Step Logic + +1. Create CTE layers (`ranked`) to decompose the logic into smaller, testable steps before the final SELECT. +2. CTE `ranked`: reads `logs`, computes window metrics. +3. Aggregate rows with MIN, MAX grouped by diff. +4. Use window functions (`... OVER (...)`) to compute rankings/running metrics while preserving row-level detail. +5. Project final output columns: `start_id`, `end_id`. +6. Order output deterministically with `ORDER BY start_id`. + +### Why This Works + +CTEs separate transformation stages, which makes dependencies explicit and easier to validate. Grouping keys define the reporting grain; aggregate functions then summarize values at exactly that grain. Window expressions calculate comparative metrics without collapsing rows too early. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are sorting/grouping, window partitions. 
Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Every non-aggregated selected column must belong to the grouping grain. +- Window `PARTITION BY`/`ORDER BY` choices change semantics; test with tied values and nulls. + diff --git a/medium/1285. Find the Start and End Number of Continuous Ranges.sql b/medium/1285. Find the Start and End Number of Continuous Ranges.sql deleted file mode 100644 index f3000a5..0000000 --- a/medium/1285. Find the Start and End Number of Continuous Ranges.sql +++ /dev/null @@ -1,10 +0,0 @@ -WITH ranked AS ( - SELECT log_id, - log_id-ROW_NUMBER() OVER (ORDER BY log_id) AS diff - FROM logs_1285 -) - -SELECT MIN(log_id) AS start_id,MAX(log_id) AS end_id -FROM ranked -GROUP BY diff -ORDER BY start_id; diff --git a/medium/1308. Running Total for Different Genders.md b/medium/1308. Running Total for Different Genders.md new file mode 100644 index 0000000..48d4ccd --- /dev/null +++ b/medium/1308. Running Total for Different Genders.md @@ -0,0 +1,82 @@ +# Question 1308: Running Total for Different Genders + +**LeetCode URL:** https://leetcode.com/problems/running-total-for-different-genders/ + +## Description + +Write an SQL query to find the total score for each gender at each day. 
+The query result format is in the following example:
+
+Scores table:
+
++-------------+--------+------------+--------------+
+| player_name | gender | day        | score_points |
++-------------+--------+------------+--------------+
+| Aron        | F      | 2020-01-01 | 17           |
+| Alice       | F      | 2020-01-07 | 23           |
+| Bajrang     | M      | 2020-01-07 | 7            |
+| Khali       | M      | 2019-12-25 | 11           |
+| Slaman      | M      | 2019-12-30 | 13           |
+| Joe         | M      | 2019-12-31 | 3            |
+| Jose        | M      | 2019-12-18 | 2            |
+| Priya       | F      | 2019-12-31 | 23           |
+| Priyanka    | F      | 2019-12-30 | 17           |
++-------------+--------+------------+--------------+
+
+Result table:
+
++--------+------------+-------+
+| gender | day        | total |
++--------+------------+-------+
+| F      | 2019-12-30 | 17    |
+| F      | 2019-12-31 | 40    |
+| F      | 2020-01-01 | 57    |
+| F      | 2020-01-07 | 80    |
+| M      | 2019-12-18 | 2     |
+| M      | 2019-12-25 | 13    |
+| M      | 2019-12-30 | 26    |
+| M      | 2019-12-31 | 29    |
+| M      | 2020-01-07 | 36    |
++--------+------------+-------+
+
+For the female team: the first day is 2019-12-30, when Priyanka scored 17 points, so the team's running total is 17.
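The per-gender running total used in the solution below can be sanity-checked locally. A minimal sketch using Python's sqlite3 (this assumes an SQLite build with window-function support, 3.25+; the unsuffixed table name `scores` is local to the sketch):

```python
import sqlite3

# Build an in-memory copy of the sample Scores data and compute a
# per-gender running total with a partitioned window sum.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE scores (player_name TEXT, gender TEXT, day TEXT, score_points INTEGER);
INSERT INTO scores VALUES
  ('Aron','F','2020-01-01',17), ('Alice','F','2020-01-07',23),
  ('Bajrang','M','2020-01-07',7), ('Khali','M','2019-12-25',11),
  ('Slaman','M','2019-12-30',13), ('Joe','M','2019-12-31',3),
  ('Jose','M','2019-12-18',2), ('Priya','F','2019-12-31',23),
  ('Priyanka','F','2019-12-30',17);
""")

# SUM(...) OVER (PARTITION BY gender ORDER BY day) accumulates within each
# gender and restarts at the partition boundary.
rows = conn.execute("""
    SELECT gender, day,
           SUM(score_points) OVER (PARTITION BY gender ORDER BY day) AS total
    FROM scores
    ORDER BY gender, day
""").fetchall()
```

The female rows accumulate as 17, 40, 57, 80, matching the expected totals.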
+ +## Table Schema Structure + +```sql +Create table If Not Exists Scores (player_name varchar(20), gender varchar(1), day date, score_points int); +``` + +## Sample Input Data + +```sql +insert into Scores (player_name, gender, day, score_points) values ('Aron', 'F', '2020-01-01', '17'); +insert into Scores (player_name, gender, day, score_points) values ('Alice', 'F', '2020-01-07', '23'); +insert into Scores (player_name, gender, day, score_points) values ('Bajrang', 'M', '2020-01-07', '7'); +insert into Scores (player_name, gender, day, score_points) values ('Khali', 'M', '2019-12-25', '11'); +insert into Scores (player_name, gender, day, score_points) values ('Slaman', 'M', '2019-12-30', '13'); +insert into Scores (player_name, gender, day, score_points) values ('Joe', 'M', '2019-12-31', '3'); +insert into Scores (player_name, gender, day, score_points) values ('Jose', 'M', '2019-12-18', '2'); +insert into Scores (player_name, gender, day, score_points) values ('Priya', 'F', '2019-12-31', '23'); +insert into Scores (player_name, gender, day, score_points) values ('Priyanka', 'F', '2019-12-30', '17'); +``` + +## Expected Output Data + +```text ++--------+------------+-------+ +| gender | day | total | ++--------+------------+-------+ +| F | 2019-12-30 | 17 | +| F | 2019-12-31 | 40 | +| F | 2020-01-01 | 57 | +| F | 2020-01-07 | 80 | +| M | 2019-12-18 | 2 | +| M | 2019-12-25 | 13 | +| M | 2019-12-30 | 26 | +| M | 2019-12-31 | 29 | +| M | 2020-01-07 | 36 | ++--------+------------+-------+ +``` + +## SQL Solution + +```sql +SELECT gender,day, + SUM(score_points) OVER (PARTITION BY gender ORDER BY day) +FROM scores_1308; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `gender`, `day` from `scores`. + +### Result Grain + +Row grain follows the post-filter join output. + +### Step-by-Step Logic + +1. Use window functions (`... OVER (...)`) to compute rankings/running metrics while preserving row-level detail. +2. 
Project the final output columns: `gender`, `day`, and the running total (alias the windowed sum, e.g. `AS total`, so the column name matches the expected output).
+3. The window's `ORDER BY day` only defines the accumulation order inside each gender partition; the query itself has no final `ORDER BY`, so add one if a deterministic row order is required.
+
+### Why This Works
+
+Window expressions calculate comparative metrics without collapsing rows too early. The final projection exposes only the columns required by the result contract.
+
+### Performance Notes
+
+Primary cost drivers are sorting/grouping and window partitions. Indexes on join/filter/group keys typically provide the biggest speedup.
+
+### Common Pitfalls
+
+- Window `PARTITION BY`/`ORDER BY` choices change semantics; test with tied values and nulls.
+
diff --git a/medium/1308. Running Total for Different Genders.sql b/medium/1308. Running Total for Different Genders.sql
deleted file mode 100644
index 976b86a..0000000
--- a/medium/1308. Running Total for Different Genders.sql
+++ /dev/null
@@ -1,3 +0,0 @@
-SELECT gender,day,
-    SUM(score_points) OVER (PARTITION BY gender ORDER BY day)
-FROM scores_1308;
diff --git a/medium/1321. Restaurant Growth.md b/medium/1321. Restaurant Growth.md
new file mode 100644
index 0000000..3d07f0a
--- /dev/null
+++ b/medium/1321. Restaurant Growth.md
@@ -0,0 +1,118 @@
+# Question 1321: Restaurant Growth
+
+**LeetCode URL:** https://leetcode.com/problems/restaurant-growth/
+
+## Description
+
+You are the restaurant owner and want to analyze a possible expansion (there will be at least one customer every day). Compute the moving average of how much customers paid in a seven-day window (the current day plus the six days before it), rounded to two decimal places. Return the result table ordered by visited_on in ascending order. The result format is in the following example.
+ +## Table Schema Structure + +```sql +Create table If Not Exists Customer (customer_id int, name varchar(20), visited_on date, amount int); +``` + +## Sample Input Data + +```sql +insert into Customer (customer_id, name, visited_on, amount) values ('1', 'Jhon', '2019-01-01', '100'); +insert into Customer (customer_id, name, visited_on, amount) values ('2', 'Daniel', '2019-01-02', '110'); +insert into Customer (customer_id, name, visited_on, amount) values ('3', 'Jade', '2019-01-03', '120'); +insert into Customer (customer_id, name, visited_on, amount) values ('4', 'Khaled', '2019-01-04', '130'); +insert into Customer (customer_id, name, visited_on, amount) values ('5', 'Winston', '2019-01-05', '110'); +insert into Customer (customer_id, name, visited_on, amount) values ('6', 'Elvis', '2019-01-06', '140'); +insert into Customer (customer_id, name, visited_on, amount) values ('7', 'Anna', '2019-01-07', '150'); +insert into Customer (customer_id, name, visited_on, amount) values ('8', 'Maria', '2019-01-08', '80'); +insert into Customer (customer_id, name, visited_on, amount) values ('9', 'Jaze', '2019-01-09', '110'); +insert into Customer (customer_id, name, visited_on, amount) values ('1', 'Jhon', '2019-01-10', '130'); +insert into Customer (customer_id, name, visited_on, amount) values ('3', 'Jade', '2019-01-10', '150'); +``` + +## Expected Output Data + +```text ++--------------+--------------+----------------+ +| visited_on | amount | average_amount | ++--------------+--------------+----------------+ +| 2019-01-07 | 860 | 122.86 | +| 2019-01-08 | 840 | 120 | +| 2019-01-09 | 840 | 120 | +| 2019-01-10 | 1000 | 142.86 | ++--------------+--------------+----------------+ +``` + +## SQL Solution + +```sql +WITH grouped AS( + SELECT visited_on,SUM(amount) AS amount + FROM customer_1321 + GROUP BY visited_on +), +cte AS ( + SELECT *,ROW_NUMBER() OVER (ORDER BY visited_on) AS num + FROM grouped +), +cte2 AS( + SELECT *, + SUM(amount) OVER (ORDER BY visited_on ROWS 
BETWEEN 6 PRECEDING AND CURRENT ROW) AS sum_amount,
+    ROUND(AVG(amount) OVER (ORDER BY visited_on ROWS BETWEEN 6 PRECEDING AND CURRENT ROW),2) AS average_amount
+    FROM cte
+)
+SELECT visited_on,sum_amount,average_amount
+FROM cte2
+WHERE num>=7;
+
+---------------------------------------------------------------------------------------------------------------------------------
+
+WITH daily_spent AS (
+    SELECT visited_on,SUM(amount) AS amount,
+    ROW_NUMBER() OVER (ORDER BY visited_on) AS rn
+    FROM customer_1321
+    GROUP BY visited_on
+),
+moving_averages AS (
+    SELECT visited_on,rn,
+    ROUND((AVG(amount) OVER (ORDER BY visited_on ROWS BETWEEN 6 PRECEDING AND CURRENT ROW)),2) AS running_avg
+    FROM daily_spent
+    ORDER BY visited_on
+)
+SELECT *
+FROM moving_averages
+WHERE rn >= 7;
+
+-- Window functions are evaluated after the GROUP BY clause has executed, so they are applied to the grouped result.
+```
+
+## Solution Breakdown
+
+### Goal
+
+The query builds the final result columns `visited_on`, `sum_amount`, and `average_amount` from `customer_1321` through the CTEs `grouped`, `cte`, and `cte2`.
+
+### Result Grain
+
+One row per `visited_on` day, starting from the seventh distinct day, so every reported row has a complete seven-day window.
+
+### Step-by-Step Logic
+
+1. Create CTE layers (`grouped`, `cte`, `cte2`, `daily_spent`, `moving_averages`) to decompose the logic into smaller, testable steps before the final SELECT.
+2. CTE `grouped`: reads `customer_1321` and sums the amounts per day.
+3. CTE `cte`: reads `grouped` and numbers the days with `ROW_NUMBER()`.
+4. CTE `cte2`: reads `cte` and computes the seven-day windowed sum and average.
+5. Apply row-level filtering in `WHERE`: `num >= 7` (or `rn >= 7` in the alternative), discarding days that lack a full seven-day window.
+6. Use window functions (`... OVER (...)`) to compute rankings/running metrics while preserving row-level detail.
+
+### Why This Works
+
+CTEs separate transformation stages, which makes dependencies explicit and easier to validate. Window expressions calculate comparative metrics without collapsing rows too early. Predicate filtering removes irrelevant rows before expensive downstream computation.
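The seven-day frame used in both solutions above can be exercised locally. A sketch with Python's sqlite3 (assumes SQLite 3.25+; the one-row-per-day table mirrors what the daily grouping stage produces, with the two 2019-01-10 visits already summed to 280):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE daily (visited_on TEXT, amount INTEGER)")
# One row per day, as produced by the GROUP BY visited_on stage
# (130 + 150 on 2019-01-10 has already been summed to 280).
daily = [(1, 100), (2, 110), (3, 120), (4, 130), (5, 110),
         (6, 140), (7, 150), (8, 80), (9, 110), (10, 280)]
conn.executemany("INSERT INTO daily VALUES (?, ?)",
                 [(f"2019-01-{d:02d}", a) for d, a in daily])

# ROWS BETWEEN 6 PRECEDING AND CURRENT ROW is a 7-row frame; filtering on
# ROW_NUMBER() >= 7 keeps only days with a complete window.
rows = conn.execute("""
    WITH w AS (
        SELECT visited_on,
               SUM(amount) OVER (ORDER BY visited_on
                                 ROWS BETWEEN 6 PRECEDING AND CURRENT ROW) AS sum_amount,
               ROUND(AVG(amount) OVER (ORDER BY visited_on
                                       ROWS BETWEEN 6 PRECEDING AND CURRENT ROW), 2) AS average_amount,
               ROW_NUMBER() OVER (ORDER BY visited_on) AS rn
        FROM daily
    )
    SELECT visited_on, sum_amount AS amount, average_amount
    FROM w
    WHERE rn >= 7
""").fetchall()
```

Only the four fully-windowed days survive, matching the expected output.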
+ +### Performance Notes + +Primary cost drivers are window partitions. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Window `PARTITION BY`/`ORDER BY` choices change semantics; test with tied values and nulls. +- If nullable columns are involved, validate null-handling explicitly (`IS NULL`, `COALESCE`). + diff --git a/medium/1321. Restaurant Growth.sql b/medium/1321. Restaurant Growth.sql deleted file mode 100644 index 3c81cdb..0000000 --- a/medium/1321. Restaurant Growth.sql +++ /dev/null @@ -1,38 +0,0 @@ -WITH grouped AS( - SELECT visited_on,SUM(amount) AS amount - FROM customer_1321 - GROUP BY visited_on -), -cte AS ( - SELECT *,ROW_NUMBER() OVER (ORDER BY visited_on) AS num - FROM grouped -), -cte2 AS( - SELECT *, - SUM(amount) OVER (ORDER BY visited_on ROWS BETWEEN 6 PRECEDING AND CURRENT ROW) AS sum_amount, - ROUND(AVG(amount) OVER (ORDER BY visited_on ROWS BETWEEN 6 PRECEDING AND CURRENT ROW),2) AS average_amount - FROM cte -) -SELECT visited_on,sum_amount,average_amount -FROM cte2 -WHERE num>=7; - ---------------------------------------------------------------------------------------------------------------------------------- - -WITH daily_spent AS ( - SELECT visited_on,SUM(amount) AS amount, - ROW_NUMBER() OVER (ORDER BY visited_on) AS rn - FROM customer_1321 - GROUP BY visited_on -), -moving_averages AS ( - SELECT visited_on,rn, - ROUND((AVG(amount) OVER (ORDER BY visited_on ROWS BETWEEN 6 PRECEDING AND CURRENT ROW)),2) AS running_avg - FROM daily_spent - ORDER BY visited_on -) -SELECT * -FROM moving_averages -WHERE rn >= 7; - --- Window functions are evaluated after group by clause got executed, and window functions are applied to the result of group by clause. diff --git a/medium/1341. Movie Rating.md b/medium/1341. Movie Rating.md new file mode 100644 index 0000000..94ad5fc --- /dev/null +++ b/medium/1341. 
Movie Rating.md @@ -0,0 +1,99 @@
+# Question 1341: Movie Rating
+
+**LeetCode URL:** https://leetcode.com/problems/movie-rating/
+
+## Description
+
+Write a solution to (1) find the name of the user who has rated the greatest number of movies (in case of a tie, return the lexicographically smaller user name), and (2) find the movie name with the highest average rating in February 2020 (in case of a tie, return the lexicographically smaller movie name). The result format is in the following example.
+
+## Table Schema Structure
+
+```sql
+Create table If Not Exists Movies (movie_id int, title varchar(30));
+Create table If Not Exists Users (user_id int, name varchar(30));
+Create table If Not Exists MovieRating (movie_id int, user_id int, rating int, created_at date);
+```
+
+## Sample Input Data
+
+```sql
+insert into Movies (movie_id, title) values ('1', 'Avengers');
+insert into Movies (movie_id, title) values ('2', 'Frozen 2');
+insert into Movies (movie_id, title) values ('3', 'Joker');
+insert into Users (user_id, name) values ('1', 'Daniel');
+insert into Users (user_id, name) values ('2', 'Monica');
+insert into Users (user_id, name) values ('3', 'Maria');
+insert into Users (user_id, name) values ('4', 'James');
+insert into MovieRating (movie_id, user_id, rating, created_at) values ('1', '1', '3', '2020-01-12');
+insert into MovieRating (movie_id, user_id, rating, created_at) values ('1', '2', '4', '2020-02-11');
+insert into MovieRating (movie_id, user_id, rating, created_at) values ('1', '3', '2', '2020-02-12');
+insert into MovieRating (movie_id, user_id, rating, created_at) values ('1', '4', '1', '2020-01-01');
+insert into MovieRating (movie_id, user_id, rating, created_at) values ('2', '1', '5', '2020-02-17');
+insert into MovieRating (movie_id, user_id, rating, created_at) values ('2', '2', '2', '2020-02-01');
+insert into MovieRating (movie_id, user_id, rating, created_at) values ('2', '3', '2', '2020-03-01');
+insert into MovieRating (movie_id, user_id, rating, created_at) values ('3', '1', '3', '2020-02-22');
+insert into MovieRating (movie_id, user_id, rating, created_at) values ('3', '2', '4', '2020-02-25');
+```
+
+## Expected Output Data
+
+```text
++--------------+
+| results |
++--------------+
+| Daniel | +| Frozen 2 | ++--------------+ +``` + +## SQL Solution + +```sql +(SELECT u.name +FROM movie_rating_1341 mr +INNER JOIN users_1341 u ON mr.user_id = u.user_id +GROUP BY u.name +ORDER BY COUNT(mr.rating) DESC,u.name +LIMIT 1) +UNION +(SELECT m.title +FROM movie_rating_1341 mr +INNER JOIN movies_1341 m ON mr.movie_id = m.movie_id +WHERE EXTRACT(MONTH FROM created_at)=2 +GROUP BY m.title +ORDER BY AVG(mr.rating) DESC,m.title +LIMIT 1); +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `name` from `movie_rating`, `users`, `movies`, `created_at`. + +### Result Grain + +One row per unique key in `GROUP BY u.name`. + +### Step-by-Step Logic + +1. Combine datasets using INNER JOIN, JOIN. Join predicates control row matching and prevent accidental cartesian growth. +2. Apply row-level filtering in `WHERE`: EXTRACT(MONTH FROM created_at)=2. +3. Aggregate rows with COUNT, AVG grouped by u.name. +4. Project final output columns: `name`. +5. Order output deterministically with `ORDER BY COUNT(mr.rating) DESC,u.name`. + +### Why This Works + +Join conditions align related entities so each output row is built from the correct source records. Grouping keys define the reporting grain; aggregate functions then summarize values at exactly that grain. Predicate filtering removes irrelevant rows before expensive downstream computation. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are sorting/grouping, join operations, subquery execution. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently. +- Every non-aggregated selected column must belong to the grouping grain. +- If nullable columns are involved, validate null-handling explicitly (`IS NULL`, `COALESCE`). + diff --git a/medium/1341. Movie Rating.sql b/medium/1341. 
Movie Rating.sql deleted file mode 100644
index 9089f1a..0000000
--- a/medium/1341. Movie Rating.sql
+++ /dev/null
@@ -1,14 +0,0 @@
-(SELECT u.name
-FROM movie_rating_1341 mr
-INNER JOIN users_1341 u ON mr.user_id = u.user_id
-GROUP BY u.name
-ORDER BY COUNT(mr.rating) DESC,u.name
-LIMIT 1)
-UNION
-(SELECT m.title
-FROM movie_rating_1341 mr
-INNER JOIN movies_1341 m ON mr.movie_id = m.movie_id
-WHERE EXTRACT(MONTH FROM created_at)=2
-GROUP BY m.title
-ORDER BY AVG(mr.rating) DESC,m.title
-LIMIT 1);
diff --git a/medium/1355. Activity Participants.md b/medium/1355. Activity Participants.md
new file mode 100644
index 0000000..905ba7b
--- /dev/null
+++ b/medium/1355. Activity Participants.md
@@ -0,0 +1,90 @@
+# Question 1355: Activity Participants
+
+**LeetCode URL:** https://leetcode.com/problems/activity-participants/
+
+## Description
+
+Write an SQL query to find the names of all the activities with neither the maximum nor the minimum number of participants. Return the result table in any order. The query result format is in the following example.
+ +## Table Schema Structure + +```sql +Create table If Not Exists Friends (id int, name varchar(30), activity varchar(30)); +Create table If Not Exists Activities (id int, name varchar(30)); +``` + +## Sample Input Data + +```sql +insert into Friends (id, name, activity) values ('1', 'Jonathan D.', 'Eating'); +insert into Friends (id, name, activity) values ('2', 'Jade W.', 'Singing'); +insert into Friends (id, name, activity) values ('3', 'Victor J.', 'Singing'); +insert into Friends (id, name, activity) values ('4', 'Elvis Q.', 'Eating'); +insert into Friends (id, name, activity) values ('5', 'Daniel A.', 'Eating'); +insert into Friends (id, name, activity) values ('6', 'Bob B.', 'Horse Riding'); +insert into Activities (id, name) values ('1', 'Eating'); +insert into Activities (id, name) values ('2', 'Singing'); +insert into Activities (id, name) values ('3', 'Horse Riding'); +``` + +## Expected Output Data + +```text ++--------------+ +| results | ++--------------+ +| Singing | ++--------------+ +``` + +## SQL Solution + +```sql +WITH cte AS ( + SELECT activity,COUNT(activity) AS cnt + FROM friends_1355 + GROUP BY activity +), +cte1 AS ( + SELECT activity,cnt, + MAX(cnt) OVER () AS max_cnt, + MIN(cnt) OVER () AS min_cnt + FROM cte +) +SELECT activity +FROM cte1 +WHERE cnt <> max_cnt AND cnt <> min_cnt; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `activity` from `friends`, `cte`, `cte1`. + +### Result Grain + +Row grain follows the post-filter join output. + +### Step-by-Step Logic + +1. Create CTE layers (`cte`, `cte1`) to decompose the logic into smaller, testable steps before the final SELECT. +2. CTE `cte`: reads `friends`. +3. CTE `cte1`: reads `cte`, computes window metrics. +4. Apply row-level filtering in `WHERE`: cnt <> max_cnt AND cnt <> min_cnt. +5. Use window functions (`... OVER (...)`) to compute rankings/running metrics while preserving row-level detail. +6. Project final output columns: `activity`. 
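The steps above can be exercised end-to-end with Python's sqlite3 (assumes SQLite 3.25+ for the empty `OVER ()` windows; the unsuffixed table name `friends` is local to this sketch):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE friends (id INTEGER, name TEXT, activity TEXT);
INSERT INTO friends VALUES
  (1,'Jonathan D.','Eating'), (2,'Jade W.','Singing'), (3,'Victor J.','Singing'),
  (4,'Elvis Q.','Eating'), (5,'Daniel A.','Eating'), (6,'Bob B.','Horse Riding');
""")

# MAX/MIN with an empty OVER () attach the global extremes to every row,
# so the final WHERE can exclude both without a second aggregation pass.
rows = conn.execute("""
    WITH cte AS (
        SELECT activity, COUNT(*) AS cnt FROM friends GROUP BY activity
    ),
    cte1 AS (
        SELECT activity, cnt,
               MAX(cnt) OVER () AS max_cnt,
               MIN(cnt) OVER () AS min_cnt
        FROM cte
    )
    SELECT activity FROM cte1 WHERE cnt <> max_cnt AND cnt <> min_cnt
""").fetchall()
```

With counts Eating = 3, Singing = 2, Horse Riding = 1, only Singing survives.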
+ +### Why This Works + +CTEs separate transformation stages, which makes dependencies explicit and easier to validate. Window expressions calculate comparative metrics without collapsing rows too early. Predicate filtering removes irrelevant rows before expensive downstream computation. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are window partitions. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Window `PARTITION BY`/`ORDER BY` choices change semantics; test with tied values and nulls. +- If nullable columns are involved, validate null-handling explicitly (`IS NULL`, `COALESCE`). + diff --git a/medium/1355. Activity Participants.sql b/medium/1355. Activity Participants.sql deleted file mode 100644 index 0ffeac7..0000000 --- a/medium/1355. Activity Participants.sql +++ /dev/null @@ -1,14 +0,0 @@ -WITH cte AS ( - SELECT activity,COUNT(activity) AS cnt - FROM friends_1355 - GROUP BY activity -), -cte1 AS ( - SELECT activity,cnt, - MAX(cnt) OVER () AS max_cnt, - MIN(cnt) OVER () AS min_cnt - FROM cte -) -SELECT activity -FROM cte1 -WHERE cnt <> max_cnt AND cnt <> min_cnt; diff --git a/medium/1364. Number of Trusted Contacts of a Customer.md b/medium/1364. Number of Trusted Contacts of a Customer.md new file mode 100644 index 0000000..5b1a260 --- /dev/null +++ b/medium/1364. Number of Trusted Contacts of a Customer.md @@ -0,0 +1,105 @@ +# Question 1364: Number of Trusted Contacts of a Customer + +**LeetCode URL:** https://leetcode.com/problems/number-of-trusted-contacts-of-a-customer/ + +## Description + +Write an SQL query to find the following for each invoice_id: - customer_name: The name of the customer the invoice is related to. 
+- price: The price of that invoice.
+- contacts_cnt: The number of contacts related to that customer.
+- trusted_contacts_cnt: The number of contacts related to that customer who are also customers of the shop (their email appears in the Customers table).
+
+Return the result table ordered by invoice_id. The query result format is in the following example.
+
+## Table Schema Structure
+
+```sql
+Create table If Not Exists Customers (customer_id int, customer_name varchar(20), email varchar(30));
+Create table If Not Exists Contacts (user_id int, contact_name varchar(20), contact_email varchar(30));
+Create table If Not Exists Invoices (invoice_id int, price int, user_id int);
+```
+
+## Sample Input Data
+
+```sql
+insert into Customers (customer_id, customer_name, email) values ('1', 'Alice', 'alice@leetcode.com');
+insert into Customers (customer_id, customer_name, email) values ('2', 'Bob', 'bob@leetcode.com');
+insert into Customers (customer_id, customer_name, email) values ('13', 'John', 'john@leetcode.com');
+insert into Customers (customer_id, customer_name, email) values ('6', 'Alex', 'alex@leetcode.com');
+insert into Contacts (user_id, contact_name, contact_email) values ('1', 'Bob', 'bob@leetcode.com');
+insert into Contacts (user_id, contact_name, contact_email) values ('1', 'John', 'john@leetcode.com');
+insert into Contacts (user_id, contact_name, contact_email) values ('1', 'Jal', 'jal@leetcode.com');
+insert into Contacts (user_id, contact_name, contact_email) values ('2', 'Omar', 'omar@leetcode.com');
+insert into Contacts (user_id, contact_name, contact_email) values ('2', 'Meir', 'meir@leetcode.com');
+insert into Contacts (user_id, contact_name, contact_email) values ('6', 'Alice', 'alice@leetcode.com');
+insert into Invoices (invoice_id, price, user_id) values ('77', '100', '1');
+insert into Invoices (invoice_id, price, user_id) values ('88', '200', '1');
+insert into Invoices (invoice_id, price, user_id) values ('99', '300', '2');
+insert into Invoices (invoice_id, price, user_id) values ('66', '400', '2');
+insert into Invoices (invoice_id, price, 
user_id) values ('55', '500', '13');
+insert into Invoices (invoice_id, price, user_id) values ('44', '60', '6');
+```
+
+## Expected Output Data
+
+```text
++------------+---------------+-------+--------------+----------------------+
+| invoice_id | customer_name | price | contacts_cnt | trusted_contacts_cnt |
++------------+---------------+-------+--------------+----------------------+
+| 44         | Alex          | 60    | 1            | 1                    |
+| 55         | John          | 500   | 0            | 0                    |
+| 66         | Bob           | 400   | 2            | 0                    |
+| 77         | Alice         | 100   | 3            | 2                    |
+| 88         | Alice         | 200   | 3            | 2                    |
+| 99         | Bob           | 300   | 2            | 0                    |
++------------+---------------+-------+--------------+----------------------+
+```
+
+## SQL Solution
+
+```sql
+WITH all_contacts AS(
+    SELECT user_id,COUNT(contact_name) AS a_contacts
+    FROM contacts_1364
+    GROUP BY user_id
+),
+trusted_contacts AS(
+    SELECT user_id,COUNT(contact_name) AS t_contacts
+    FROM contacts_1364 ct
+    INNER JOIN customers_1364 cs ON ct.contact_email = cs.email
+    GROUP BY user_id
+)
+
+SELECT i.invoice_id,cs.customer_name,i.price,
+       COALESCE(a_contacts,0) AS contacts_cnt,
+       COALESCE(t_contacts,0) AS trusted_contacts_cnt
+FROM invoices_1364 i
+INNER JOIN customers_1364 cs ON i.user_id = cs.customer_id
+LEFT JOIN all_contacts ac ON i.user_id = ac.user_id
+LEFT JOIN trusted_contacts tc ON i.user_id = tc.user_id
+ORDER BY 1;
+```
+
+## Solution Breakdown
+
+### Goal
+
+The query builds the final result columns `invoice_id`, `customer_name`, `price`, `contacts_cnt`, and `trusted_contacts_cnt` from `invoices_1364`, `customers_1364`, and the `all_contacts`/`trusted_contacts` CTEs over `contacts_1364`.
+
+### Result Grain
+
+One row per invoice.
+
+### Step-by-Step Logic
+
+1. Create CTE layers (`all_contacts`, `trusted_contacts`) to decompose the logic into smaller, testable steps before the final SELECT.
+2. CTE `all_contacts`: reads `contacts_1364` and counts every contact per user.
+3. CTE `trusted_contacts`: joins `contacts_1364` to `customers_1364` on email and counts only the contacts who are themselves customers.
+4. Combine datasets using LEFT JOIN and INNER JOIN. Join predicates control row matching and prevent accidental cartesian growth; the LEFT JOINs keep invoices whose customer has no contacts, with `COALESCE` turning the missing counts into 0.
+5. Project final output columns: `invoice_id`, `customer_name`, `price`, `contacts_cnt`, `trusted_contacts_cnt`.
+6. 
Order output deterministically with `ORDER BY 1`. + +### Why This Works + +CTEs separate transformation stages, which makes dependencies explicit and easier to validate. Join conditions align related entities so each output row is built from the correct source records. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are sorting/grouping, join operations. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently. + diff --git a/medium/1364. Number of Trusted Contacts of a Customer.sql b/medium/1364. Number of Trusted Contacts of a Customer.sql deleted file mode 100644 index d4d5477..0000000 --- a/medium/1364. Number of Trusted Contacts of a Customer.sql +++ /dev/null @@ -1,17 +0,0 @@ -WITH all_contacts AS( - SELECT user_id,COUNT(contact_name) AS a_contacts - FROM contacts_1364 - GROUP BY user_id -), -trusted_contacts AS( - SELECT user_id,COUNT(contact_name) AS t_contacts - FROM contacts_1364 ct - INNER JOIN customers_1364 cs ON ct.contact_name = cs.customer_name - GROUP BY user_id -) - -SELECT i.*,COALESCE(a_contacts,0) all_contacts,COALESCE(t_contacts,0) trusted_contacts -FROM invoices_1364 i -LEFT JOIN all_contacts ac ON i.user_id = ac.user_id -LEFT JOIN trusted_contacts tc ON i.user_id = tc.user_id -ORDER BY 1; diff --git a/medium/1393. Capital Gain Loss.md b/medium/1393. Capital Gain Loss.md new file mode 100644 index 0000000..28ead21 --- /dev/null +++ b/medium/1393. Capital Gain Loss.md @@ -0,0 +1,80 @@ +# Question 1393: Capital Gain/Loss + +**LeetCode URL:** https://leetcode.com/problems/capital-gainloss/ + +## Description + +Write a solution to report the Capital gain/loss for each stock. Return the result table in any order. The result format is in the following example. 
+ +## Table Schema Structure + +```sql +Create Table If Not Exists Stocks (stock_name varchar(15), operation ENUM('Sell', 'Buy'), operation_day int, price int); +``` + +## Sample Input Data + +```sql +insert into Stocks (stock_name, operation, operation_day, price) values ('Leetcode', 'Buy', '1', '1000'); +insert into Stocks (stock_name, operation, operation_day, price) values ('Corona Masks', 'Buy', '2', '10'); +insert into Stocks (stock_name, operation, operation_day, price) values ('Leetcode', 'Sell', '5', '9000'); +insert into Stocks (stock_name, operation, operation_day, price) values ('Handbags', 'Buy', '17', '30000'); +insert into Stocks (stock_name, operation, operation_day, price) values ('Corona Masks', 'Sell', '3', '1010'); +insert into Stocks (stock_name, operation, operation_day, price) values ('Corona Masks', 'Buy', '4', '1000'); +insert into Stocks (stock_name, operation, operation_day, price) values ('Corona Masks', 'Sell', '5', '500'); +insert into Stocks (stock_name, operation, operation_day, price) values ('Corona Masks', 'Buy', '6', '1000'); +insert into Stocks (stock_name, operation, operation_day, price) values ('Handbags', 'Sell', '29', '7000'); +insert into Stocks (stock_name, operation, operation_day, price) values ('Corona Masks', 'Sell', '10', '10000'); +``` + +## Expected Output Data + +```text ++---------------+-------------------+ +| stock_name | capital_gain_loss | ++---------------+-------------------+ +| Corona Masks | 9500 | +| Leetcode | 8000 | +| Handbags | -23000 | ++---------------+-------------------+ +``` + +## SQL Solution + +```sql +SELECT stock_name, + SUM(CASE WHEN operation = 'Buy' THEN price * -1 + ELSE price END) AS capital_gain_loss +FROM stocks_1393 +GROUP BY stock_name +ORDER BY capital_gain_loss DESC; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `stock_name`, `capital_gain_loss` from `stocks`. + +### Result Grain + +One row per unique key in `GROUP BY stock_name`. 
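The CASE-inside-SUM pattern from the solution can be checked locally with Python's sqlite3 (a sketch, not the repository's PostgreSQL setup; the unsuffixed table name `stocks` is local to the example):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE stocks (stock_name TEXT, operation TEXT, operation_day INTEGER, price INTEGER);
INSERT INTO stocks VALUES
  ('Leetcode','Buy',1,1000), ('Corona Masks','Buy',2,10), ('Leetcode','Sell',5,9000),
  ('Handbags','Buy',17,30000), ('Corona Masks','Sell',3,1010), ('Corona Masks','Buy',4,1000),
  ('Corona Masks','Sell',5,500), ('Corona Masks','Buy',6,1000), ('Handbags','Sell',29,7000),
  ('Corona Masks','Sell',10,10000);
""")

# Buys contribute negatively, sells positively; one SUM per stock nets them.
rows = conn.execute("""
    SELECT stock_name,
           SUM(CASE WHEN operation = 'Buy' THEN -price ELSE price END) AS capital_gain_loss
    FROM stocks
    GROUP BY stock_name
    ORDER BY capital_gain_loss DESC
""").fetchall()
```

The netted values reproduce the expected output: 9500, 8000, and -23000.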
+ +### Step-by-Step Logic + +1. Aggregate rows with SUM grouped by stock_name. +2. Project final output columns: `stock_name`, `capital_gain_loss`. +3. Order output deterministically with `ORDER BY capital_gain_loss DESC`. + +### Why This Works + +Grouping keys define the reporting grain; aggregate functions then summarize values at exactly that grain. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are sorting/grouping. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Every non-aggregated selected column must belong to the grouping grain. + diff --git a/medium/1393. Capital Gain Loss.sql b/medium/1393. Capital Gain Loss.sql deleted file mode 100644 index 4fadd4c..0000000 --- a/medium/1393. Capital Gain Loss.sql +++ /dev/null @@ -1,6 +0,0 @@ -SELECT stock_name, - SUM(CASE WHEN operation = 'Buy' THEN price * -1 - ELSE price END) AS capital_gain_loss -FROM stocks_1393 -GROUP BY stock_name -ORDER BY capital_gain_loss DESC; diff --git a/medium/1398. Customers Who Bought Products A and B but Not C.md b/medium/1398. Customers Who Bought Products A and B but Not C.md new file mode 100644 index 0000000..c8419bf --- /dev/null +++ b/medium/1398. Customers Who Bought Products A and B but Not C.md @@ -0,0 +1,83 @@ +# Question 1398: Customers Who Bought Products A and B but Not C + +**LeetCode URL:** https://leetcode.com/problems/customers-who-bought-products-a-and-b-but-not-c/ + +## Description + +Write an SQL query to report the customer_id and customer_name of customers who bought products "A", "B" but did not buy the product "C" since we want to recommend them buy this product. Return the result table ordered by customer_id. The query result format is in the following example. 
+ +## Table Schema Structure + +```sql +Create table If Not Exists Customers (customer_id int, customer_name varchar(30)); +Create table If Not Exists Orders (order_id int, customer_id int, product_name varchar(30)); +``` + +## Sample Input Data + +```sql +insert into Customers (customer_id, customer_name) values ('1', 'Daniel'); +insert into Customers (customer_id, customer_name) values ('2', 'Diana'); +insert into Customers (customer_id, customer_name) values ('3', 'Elizabeth'); +insert into Customers (customer_id, customer_name) values ('4', 'Jhon'); +insert into Orders (order_id, customer_id, product_name) values ('10', '1', 'A'); +insert into Orders (order_id, customer_id, product_name) values ('20', '1', 'B'); +insert into Orders (order_id, customer_id, product_name) values ('30', '1', 'D'); +insert into Orders (order_id, customer_id, product_name) values ('40', '1', 'C'); +insert into Orders (order_id, customer_id, product_name) values ('50', '2', 'A'); +insert into Orders (order_id, customer_id, product_name) values ('60', '3', 'A'); +insert into Orders (order_id, customer_id, product_name) values ('70', '3', 'B'); +insert into Orders (order_id, customer_id, product_name) values ('80', '3', 'D'); +insert into Orders (order_id, customer_id, product_name) values ('90', '4', 'C'); +``` + +## Expected Output Data + +```text ++-------------+---------------+ +| customer_id | customer_name | ++-------------+---------------+ +| 3 | Elizabeth | ++-------------+---------------+ +``` + +## SQL Solution + +```sql +SELECT DISTINCT o1.customer_id,c.customer_name +FROM orders_1398 o1 +LEFT JOIN orders_1398 o2 ON o1.product_name = 'A' AND o2.product_name = 'B' AND o1.customer_id = o2.customer_id +LEFT JOIN orders_1398 o3 ON o1.product_name = 'A' AND o3.product_name = 'C' AND o1.customer_id = o3.customer_id +INNER JOIN customers_1398 c ON o1.customer_id = c.customer_id +WHERE o1.product_name IS NOT NULL AND o2.product_name IS NOT NULL AND o3.product_name IS NULL; +``` + +## 
Solution Breakdown + +### Goal + +The query builds the final result columns `customer_id`, `customer_name` from `orders`, `customers`. + +### Result Grain + +One row per distinct combination of projected columns. + +### Step-by-Step Logic + +1. Combine datasets using LEFT JOIN, INNER JOIN, JOIN. Join predicates control row matching and prevent accidental cartesian growth. +2. Apply row-level filtering in `WHERE`: o1.product_name IS NOT NULL AND o2.product_name IS NOT NULL AND o3.product_name IS NULL. +3. Project final output columns: `customer_id`, `customer_name`. +4. Remove duplicate result tuples using `DISTINCT` where uniqueness is required. + +### Why This Works + +Join conditions align related entities so each output row is built from the correct source records. Predicate filtering removes irrelevant rows before expensive downstream computation. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are join operations. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently. + diff --git a/medium/1398. Customers Who Bought Products A and B but Not C.sql b/medium/1398. Customers Who Bought Products A and B but Not C.sql deleted file mode 100644 index e95507f..0000000 --- a/medium/1398. Customers Who Bought Products A and B but Not C.sql +++ /dev/null @@ -1,6 +0,0 @@ -SELECT DISTINCT o1.customer_id,c.customer_name -FROM orders_1398 o1 -LEFT JOIN orders_1398 o2 ON o1.product_name = 'A' AND o2.product_name = 'B' AND o1.customer_id = o2.customer_id -LEFT JOIN orders_1398 o3 ON o1.product_name = 'A' AND o3.product_name = 'C' AND o1.customer_id = o3.customer_id -INNER JOIN customers_1398 c ON o1.customer_id = c.customer_id -WHERE o1.product_name IS NOT NULL AND o2.product_name IS NOT NULL AND o3.product_name IS NULL; diff --git a/medium/1440. 
Evaluate Boolean Expression.md b/medium/1440. Evaluate Boolean Expression.md
new file mode 100644
index 0000000..ae4f6b5
--- /dev/null
+++ b/medium/1440. Evaluate Boolean Expression.md
@@ -0,0 +1,78 @@
+# Question 1440: Evaluate Boolean Expression
+
+**LeetCode URL:** https://leetcode.com/problems/evaluate-boolean-expression/
+
+## Description
+
+Write an SQL query to evaluate the boolean expressions in the Expressions table. Return the result table in any order. The query result format is in the following example.
+
+## Table Schema Structure
+
+```sql
+Create Table If Not Exists Variables (name varchar(3), value int);
+Create Table If Not Exists Expressions (left_operand varchar(3), operator ENUM('>', '<', '='), right_operand varchar(3));
+```
+
+## Sample Input Data
+
+```sql
+insert into Variables (name, value) values ('x', '66');
+insert into Variables (name, value) values ('y', '77');
+insert into Expressions (left_operand, operator, right_operand) values ('x', '>', 'y');
+insert into Expressions (left_operand, operator, right_operand) values ('x', '<', 'y');
+insert into Expressions (left_operand, operator, right_operand) values ('x', '=', 'y');
+insert into Expressions (left_operand, operator, right_operand) values ('y', '>', 'x');
+insert into Expressions (left_operand, operator, right_operand) values ('y', '<', 'x');
+insert into Expressions (left_operand, operator, right_operand) values ('x', '=', 'x');
+```
+
+## Expected Output Data
+
+```text
++--------------+----------+---------------+-------+
+| left_operand | operator | right_operand | value |
++--------------+----------+---------------+-------+
+| x            | >        | y             | false |
+| x            | <        | y             | true  |
+| x            | =        | y             | false |
+| y            | >        | x             | true  |
+| y            | <        | x             | false |
+| x            | =        | x             | true  |
++--------------+----------+---------------+-------+
+```
+
+## SQL Solution
+
+```sql
+SELECT e.*,
+ CASE WHEN operator = '=' THEN v1.value = v2.value
+ WHEN operator = '>' THEN v1.value > v2.value
+ WHEN operator = '<' THEN v1.value < v2.value
+ END AS value
+FROM expressions_1440 e
+INNER JOIN variables_1440 v1 ON e.left_operand = v1.name
+INNER JOIN variables_1440 v2 ON e.right_operand = v2.name;
+```
+
+## Solution Breakdown
+
+### Goal
+
+The query builds the final result columns `value` from 
`expressions`, `variables`.
+
+### Result Grain
+
+Row grain follows the post-filter join output.
+
+### Step-by-Step Logic
+
+1. Combine datasets using INNER JOIN. Join predicates control row matching and prevent accidental cartesian growth.
+2. Project final output columns: `value`.
+
+### Why This Works
+
+Join conditions align related entities so each output row is built from the correct source records. Joining `Variables` twice resolves the left and right operand names to their numeric values so the CASE expression can compare them. The final projection exposes only the columns required by the result contract.
+
+### Performance Notes
+
+Primary cost drivers are join operations. Indexes on join/filter/group keys typically provide the biggest speedup.
+
+### Common Pitfalls
+
+- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently.
+
diff --git a/medium/1440. Evaluate Boolean Expression.sql b/medium/1440. Evaluate Boolean Expression.sql
deleted file mode 100644
index 35d219f..0000000
--- a/medium/1440. Evaluate Boolean Expression.sql
+++ /dev/null
@@ -1,8 +0,0 @@
-SELECT e.*,
- CASE WHEN operator = '=' THEN v1.value = v2.value
- WHEN operator = '>' THEN v1.value > v2.value
- WHEN operator = '<' THEN v1.value < v2.value
- END AS value
-FROM expressions_1440 e
-INNER JOIN variables_1440 v1 ON e.left_operand = v1.name
-INNER JOIN variables_1440 v2 ON e.right_operand = v2.name;
diff --git a/medium/1445. Apples & Oranges.md b/medium/1445. Apples & Oranges.md
new file mode 100644
index 0000000..a974979
--- /dev/null
+++ b/medium/1445. Apples & Oranges.md
@@ -0,0 +1,75 @@
+# Question 1445: Apples & Oranges
+
+**LeetCode URL:** https://leetcode.com/problems/apples-oranges/
+
+## Description
+
+Write an SQL query to report the difference between the number of apples and oranges sold each day. Return the result table ordered by sale_date in format ('YYYY-MM-DD'). 
The query result format is in the following example: Sales table: +------------+------------+-------------+ | sale_date | fruit | sold_num | +------------+------------+-------------+ | 2020-05-01 | apples | 10 | | 2020-05-01 | oranges | 8 | | 2020-05-02 | apples | 15 | | 2020-05-02 | oranges | 15 | | 2020-05-03 | apples | 20 | | 2020-05-03 | oranges | 0 | | 2020-05-04 | apples | 15 | | 2020-05-04 | oranges | 16 | +------------+------------+-------------+ Result table: +------------+--------------+ | sale_date | diff | +------------+--------------+ | 2020-05-01 | 2 | | 2020-05-02 | 0 | | 2020-05-03 | 20 | | 2020-05-04 | -1 | +------------+--------------+ Day 2020-05-01, 10 apples and 8 oranges were sold (Difference 10 - 8 = 2). + +## Table Schema Structure + +```sql +Create table If Not Exists Sales (sale_date date, fruit ENUM('apples', 'oranges'), sold_num int); +``` + +## Sample Input Data + +```sql +insert into Sales (sale_date, fruit, sold_num) values ('2020-05-01', 'apples', '10'); +insert into Sales (sale_date, fruit, sold_num) values ('2020-05-01', 'oranges', '8'); +insert into Sales (sale_date, fruit, sold_num) values ('2020-05-02', 'apples', '15'); +insert into Sales (sale_date, fruit, sold_num) values ('2020-05-02', 'oranges', '15'); +insert into Sales (sale_date, fruit, sold_num) values ('2020-05-03', 'apples', '20'); +insert into Sales (sale_date, fruit, sold_num) values ('2020-05-03', 'oranges', '0'); +insert into Sales (sale_date, fruit, sold_num) values ('2020-05-04', 'apples', '15'); +insert into Sales (sale_date, fruit, sold_num) values ('2020-05-04', 'oranges', '16'); +``` + +## Expected Output Data + +```text ++------------+--------------+ +| sale_date | diff | ++------------+--------------+ +| 2020-05-01 | 2 | +| 2020-05-02 | 0 | +| 2020-05-03 | 20 | +| 2020-05-04 | -1 | ++------------+--------------+ +``` + +## SQL Solution + +```sql +SELECT s1.sale_date,s1.sold_num-s2.sold_num AS diff +FROM sales_1445 s1 +INNER JOIN sales_1445 s2 ON 
s1.sale_date=s2.sale_date AND s1.fruit <> s2.fruit AND s1.fruit = 'apples'
+ORDER BY s1.sale_date;
+```
+
+## Solution Breakdown
+
+### Goal
+
+The query builds the final result columns `sale_date`, `diff` from `sales`.
+
+### Result Grain
+
+Row grain follows the post-filter join output.
+
+### Step-by-Step Logic
+
+1. Combine datasets using INNER JOIN. Join predicates control row matching and prevent accidental cartesian growth.
+2. Project final output columns: `sale_date`, `diff`.
+3. Order output deterministically with `ORDER BY s1.sale_date`, as the problem statement requires.
+
+### Why This Works
+
+Join conditions align related entities so each output row is built from the correct source records. The self-join pairs each day's apples row with that day's oranges row, so the subtraction happens within a single output row. The final projection exposes only the columns required by the result contract.
+
+### Performance Notes
+
+Primary cost drivers are sorting/grouping, join operations. Indexes on join/filter/group keys typically provide the biggest speedup.
+
+### Common Pitfalls
+
+- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently.
+
diff --git a/medium/1445. Apples & Oranges.sql b/medium/1445. Apples & Oranges.sql
deleted file mode 100644
index aca4fcb..0000000
--- a/medium/1445. Apples & Oranges.sql
+++ /dev/null
@@ -1,3 +0,0 @@
-SELECT s1.sale_date,s1.sold_num-s2.sold_num AS diff
-FROM sales_1445 s1
-INNER JOIN sales_1445 s2 ON s1.sale_date=s2.sale_date AND s1.fruit <> s2.fruit AND s1.fruit = 'apples';
diff --git a/medium/1454. Active Users.md b/medium/1454. Active Users.md
new file mode 100644
index 0000000..146d26e
--- /dev/null
+++ b/medium/1454. Active Users.md
@@ -0,0 +1,94 @@
+# Question 1454: Active Users
+
+**LeetCode URL:** https://leetcode.com/problems/active-users/
+
+## Description
+
+Write an SQL query to find the id and the name of active users. Return the result table ordered by the id. 
The query result format is in the following example: Accounts table: +----+----------+ | id | name | +----+----------+ | 1 | Winston | | 7 | Jonathan | +----+----------+ Logins table: +----+------------+ | id | login_date | +----+------------+ | 7 | 2020-05-30 | | 1 | 2020-05-30 | | 7 | 2020-05-31 | | 7 | 2020-06-01 | | 7 | 2020-06-02 | | 7 | 2020-06-02 | | 7 | 2020-06-03 | | 1 | 2020-06-07 | | 7 | 2020-06-10 | +----+------------+ Result table: +----+----------+ | id | name | +----+----------+ | 7 | Jonathan | +----+----------+ User Winston with id = 1 logged in 2 times only in 2 different days, so, Winston is not an active user. + +## Table Schema Structure + +```sql +Create table If Not Exists Accounts (id int, name varchar(10)); +Create table If Not Exists Logins (id int, login_date date); +``` + +## Sample Input Data + +```sql +insert into Accounts (id, name) values ('1', 'Winston'); +insert into Accounts (id, name) values ('7', 'Jonathan'); +insert into Logins (id, login_date) values ('7', '2020-05-30'); +insert into Logins (id, login_date) values ('1', '2020-05-30'); +insert into Logins (id, login_date) values ('7', '2020-05-31'); +insert into Logins (id, login_date) values ('7', '2020-06-01'); +insert into Logins (id, login_date) values ('7', '2020-06-02'); +insert into Logins (id, login_date) values ('7', '2020-06-02'); +insert into Logins (id, login_date) values ('7', '2020-06-03'); +insert into Logins (id, login_date) values ('1', '2020-06-07'); +insert into Logins (id, login_date) values ('7', '2020-06-10'); +``` + +## Expected Output Data + +```text ++----+----------+ +| id | name | ++----+----------+ +| 7 | Jonathan | ++----+----------+ +``` + +## SQL Solution + +```sql +WITH dedup AS ( + SELECT * + FROM logins_1454 + GROUP BY id,login_date +), +cte AS ( + SELECT id,login_date, + LEAD(login_date,4) OVER (PARTITION BY id ORDER BY login_date) AS date_5 + FROM dedup +) +SELECT a.id,a.name +FROM cte c +INNER JOIN accounts_1454 a ON a.id = c.id +WHERE 
c.date_5-c.login_date=4
+ORDER BY a.id;
+```
+
+## Solution Breakdown
+
+### Goal
+
+The query builds the final result columns `id`, `name` from `logins`, `dedup`, `cte`, `accounts`.
+
+### Result Grain
+
+Row grain follows the post-filter join output.
+
+### Step-by-Step Logic
+
+1. Create CTE layers (`dedup`, `cte`) to decompose the logic into smaller, testable steps before the final SELECT.
+2. CTE `dedup`: reads `logins`, collapsing duplicate (id, login_date) rows with GROUP BY.
+3. CTE `cte`: reads `dedup`, computes window metrics.
+4. Combine datasets using INNER JOIN. Join predicates control row matching and prevent accidental cartesian growth.
+5. Apply row-level filtering in `WHERE`: c.date_5-c.login_date=4.
+6. Use window functions (`... OVER (...)`) to compute rankings/running metrics while preserving row-level detail.
+7. Project final output columns: `id`, `name`.
+8. Order output deterministically with `ORDER BY a.id`, as the problem statement requires.
+
+### Why This Works
+
+CTEs separate transformation stages, which makes dependencies explicit and easier to validate. `LEAD(login_date,4)` fetches the fourth-next distinct login date for the same user; after deduplication, that date equals `login_date + 4` exactly when the user logged in on five consecutive days. Join conditions align related entities so each output row is built from the correct source records. Window expressions calculate comparative metrics without collapsing rows too early. Predicate filtering removes irrelevant rows before expensive downstream computation. The final projection exposes only the columns required by the result contract.
+
+### Performance Notes
+
+Primary cost drivers are window partitions, join operations. Indexes on join/filter/group keys typically provide the biggest speedup.
+
+### Common Pitfalls
+
+- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently.
+- Window `PARTITION BY`/`ORDER BY` choices change semantics; test with tied values and nulls.
+- If nullable columns are involved, validate null-handling explicitly (`IS NULL`, `COALESCE`).
+- A user active for more than five consecutive days matches the filter several times; add `DISTINCT` to the final SELECT if duplicate ids must be avoided.
+
diff --git a/medium/1454. Active Users.sql b/medium/1454. Active Users.sql
deleted file mode 100644
index f51360a..0000000
--- a/medium/1454. 
Active Users.sql
+++ /dev/null
@@ -1,14 +0,0 @@
-WITH dedup AS (
- SELECT *
- FROM logins_1454
- GROUP BY id,login_date
-),
-cte AS (
- SELECT id,login_date,
- LEAD(login_date,4) OVER (PARTITION BY id ORDER BY login_date) AS date_5
- FROM dedup
-)
-SELECT a.id,a.name
-FROM cte c
-INNER JOIN accounts_1454 a ON a.id = c.id
-WHERE c.date_5-c.login_date=4;
diff --git a/medium/1459. Rectangles Area.md b/medium/1459. Rectangles Area.md
new file mode 100644
index 0000000..d1498d0
--- /dev/null
+++ b/medium/1459. Rectangles Area.md
@@ -0,0 +1,70 @@
+# Question 1459: Rectangles Area
+
+**LeetCode URL:** https://leetcode.com/problems/rectangles-area/
+
+## Description
+
+Write an SQL query to report all possible rectangles which can be formed by any two points of the table.
+
+## Table Schema Structure
+
+```sql
+Create table If Not Exists Points (id int, x_value int, y_value int);
+```
+
+## Sample Input Data
+
+```sql
+insert into Points (id, x_value, y_value) values ('1', '2', '7');
+insert into Points (id, x_value, y_value) values ('2', '4', '8');
+insert into Points (id, x_value, y_value) values ('3', '2', '10');
+```
+
+## Expected Output Data
+
+```text
++----------+-------------+-------------+
+| p1 | p2 | area |
++----------+-------------+-------------+
+| 2 | 3 | 6 |
+| 1 | 2 | 2 |
++----------+-------------+-------------+
+```
+
+## SQL Solution
+
+```sql
+SELECT p1.id AS p1,p2.id AS p2,ABS(p1.x_value-p2.x_value)*ABS(p1.y_value-p2.y_value) AS area
+FROM points_1459 p1
+INNER JOIN points_1459 p2 ON p1.id < p2.id AND ABS(p1.x_value-p2.x_value)*ABS(p1.y_value-p2.y_value)<>0
+ORDER BY area DESC;
+```
+
+## Solution Breakdown
+
+### Goal
+
+The query builds the final result columns `p1`, `p2`, `area` from `points`.
+
+### Result Grain
+
+Row grain follows the post-filter join output.
+
+### Step-by-Step Logic
+
+1. Combine datasets using INNER JOIN. Join predicates control row matching and prevent accidental cartesian growth.
+2. 
Project final output columns: `p1`, `p2`, `area`. +3. Order output deterministically with `ORDER BY area DESC`. + +### Why This Works + +Join conditions align related entities so each output row is built from the correct source records. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are sorting/grouping, join operations. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently. + diff --git a/medium/1459. Rectangles Area.sql b/medium/1459. Rectangles Area.sql deleted file mode 100644 index 2fb4a59..0000000 --- a/medium/1459. Rectangles Area.sql +++ /dev/null @@ -1,4 +0,0 @@ -SELECT p1.id AS p1,p2.id AS p2,ABS(p1.x_value-p2.x_value)*ABS(p1.y_value-p2.y_value) AS area -FROM points_1459 p1 -INNER JOIN points_1459 p2 ON p1.id < p2.id AND ABS(p1.x_value-p2.x_value)*ABS(p1.y_value-p2.y_value)<>0 -ORDER BY area DESC; diff --git a/medium/1468. Calculate Salaries.md b/medium/1468. Calculate Salaries.md new file mode 100644 index 0000000..e7ba97d --- /dev/null +++ b/medium/1468. Calculate Salaries.md @@ -0,0 +1,94 @@ +# Question 1468: Calculate Salaries + +**LeetCode URL:** https://leetcode.com/problems/calculate-salaries/ + +## Description + +Write an SQL query to find the salaries of the employees after applying taxes. Return the result table in any order. 
The query result format is in the following example: Salaries table: +------------+-------------+---------------+--------+ | company_id | employee_id | employee_name | salary | +------------+-------------+---------------+--------+ | 1 | 1 | Tony | 2000 | | 1 | 2 | Pronub | 21300 | | 1 | 3 | Tyrrox | 10800 | | 2 | 1 | Pam | 300 | | 2 | 7 | Bassem | 450 | | 2 | 9 | Hermione | 700 | | 3 | 7 | Bocaben | 100 | | 3 | 2 | Ognjen | 2200 | | 3 | 13 | Nyancat | 3300 | | 3 | 15 | Morninngcat | 7777 | +------------+-------------+---------------+--------+ Result table: +------------+-------------+---------------+--------+ | company_id | employee_id | employee_name | salary | +------------+-------------+---------------+--------+ | 1 | 1 | Tony | 1020 | | 1 | 2 | Pronub | 10863 | | 1 | 3 | Tyrrox | 5508 | | 2 | 1 | Pam | 300 | | 2 | 7 | Bassem | 450 | | 2 | 9 | Hermione | 700 | | 3 | 7 | Bocaben | 76 | | 3 | 2 | Ognjen | 1672 | | 3 | 13 | Nyancat | 2508 | | 3 | 15 | Morninngcat | 5911 | +------------+-------------+---------------+--------+ For company 1, Max salary is 21300. 
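+
+The tier rule above keys off each company's maximum salary, not the employee's own pay. As an illustrative sketch only (the `salaries_1468` name follows this repository's table-suffix convention, and decimal factors stand in for the integer arithmetic used in the stored solution), the whole rule fits in a single windowed CASE expression:
+
+```sql
+-- Pick the tax rate from the company-wide MAX(salary),
+-- then apply it to each employee's own salary.
+SELECT company_id, employee_id, employee_name,
+       ROUND(salary * CASE
+           WHEN MAX(salary) OVER (PARTITION BY company_id) < 1000   THEN 1.00 -- 0% tax
+           WHEN MAX(salary) OVER (PARTITION BY company_id) <= 10000 THEN 0.76 -- 24% tax
+           ELSE 0.51                                                          -- 49% tax
+       END) AS salary
+FROM salaries_1468;
+```
+
+On the sample data this reproduces the expected rows (for example Tony: company 1's maximum is 21300, so 2000 * 0.51 rounds to 1020), but the rounding behavior should be re-checked against the expected output before relying on this form.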
+ +## Table Schema Structure + +```sql +Create table If Not Exists Salaries (company_id int, employee_id int, employee_name varchar(13), salary int); +``` + +## Sample Input Data + +```sql +insert into Salaries (company_id, employee_id, employee_name, salary) values ('1', '1', 'Tony', '2000'); +insert into Salaries (company_id, employee_id, employee_name, salary) values ('1', '2', 'Pronub', '21300'); +insert into Salaries (company_id, employee_id, employee_name, salary) values ('1', '3', 'Tyrrox', '10800'); +insert into Salaries (company_id, employee_id, employee_name, salary) values ('2', '1', 'Pam', '300'); +insert into Salaries (company_id, employee_id, employee_name, salary) values ('2', '7', 'Bassem', '450'); +insert into Salaries (company_id, employee_id, employee_name, salary) values ('2', '9', 'Hermione', '700'); +insert into Salaries (company_id, employee_id, employee_name, salary) values ('3', '7', 'Bocaben', '100'); +insert into Salaries (company_id, employee_id, employee_name, salary) values ('3', '2', 'Ognjen', '2200'); +insert into Salaries (company_id, employee_id, employee_name, salary) values ('3', '13', 'Nyancat', '3300'); +insert into Salaries (company_id, employee_id, employee_name, salary) values ('3', '15', 'Morninngcat', '7777'); +``` + +## Expected Output Data + +```text ++------------+-------------+---------------+--------+ +| company_id | employee_id | employee_name | salary | ++------------+-------------+---------------+--------+ +| 1 | 1 | Tony | 1020 | +| 1 | 2 | Pronub | 10863 | +| 1 | 3 | Tyrrox | 5508 | +| 2 | 1 | Pam | 300 | +| 2 | 7 | Bassem | 450 | +| 2 | 9 | Hermione | 700 | +| 3 | 7 | Bocaben | 76 | +| 3 | 2 | Ognjen | 1672 | +| 3 | 13 | Nyancat | 2508 | +| 3 | 15 | Morninngcat | 5911 | ++------------+-------------+---------------+--------+ +``` + +## SQL Solution + +```sql +WITH cte AS ( + SELECT *, + MAX(salary) OVER (PARTITION BY company_id) AS max_salary + FROM salaries_1468 +) +SELECT *, + ROUND( + CASE WHEN max_salary<1000 
THEN salary + WHEN max_salary<=10000 THEN salary-(salary*24)/100 + ELSE salary-(salary*49)/100 + END) AS new_salary +FROM cte; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `new_salary` from `salaries`, `cte`. + +### Result Grain + +Row grain follows the post-filter join output. + +### Step-by-Step Logic + +1. Create CTE layers (`cte`) to decompose the logic into smaller, testable steps before the final SELECT. +2. CTE `cte`: reads `salaries`, computes window metrics. +3. Use window functions (`... OVER (...)`) to compute rankings/running metrics while preserving row-level detail. +4. Project final output columns: `new_salary`. + +### Why This Works + +CTEs separate transformation stages, which makes dependencies explicit and easier to validate. Window expressions calculate comparative metrics without collapsing rows too early. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are window partitions. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Window `PARTITION BY`/`ORDER BY` choices change semantics; test with tied values and nulls. + diff --git a/medium/1468. Calculate Salaries.sql b/medium/1468. Calculate Salaries.sql deleted file mode 100644 index fe038a2..0000000 --- a/medium/1468. Calculate Salaries.sql +++ /dev/null @@ -1,12 +0,0 @@ -WITH cte AS ( - SELECT *, - MAX(salary) OVER (PARTITION BY company_id) AS max_salary - FROM salaries_1468 -) -SELECT *, - ROUND( - CASE WHEN max_salary<1000 THEN salary - WHEN max_salary<=10000 THEN salary-(salary*24)/100 - ELSE salary-(salary*49)/100 - END) AS new_salary -FROM cte; diff --git a/medium/1501. Countries You Can Safely Invest In.md b/medium/1501. Countries You Can Safely Invest In.md new file mode 100644 index 0000000..61c6fa4 --- /dev/null +++ b/medium/1501. 
Countries You Can Safely Invest In.md @@ -0,0 +1,110 @@ +# Question 1501: Countries You Can Safely Invest In + +**LeetCode URL:** https://leetcode.com/problems/countries-you-can-safely-invest-in/ + +## Description + +Write an SQL query to find the countries where this company can invest. Return the result table in any order. The query result format is in the following example. + +## Table Schema Structure + +```sql +Create table If Not Exists Person (id int, name varchar(15), phone_number varchar(11)); +Create table If Not Exists Country (name varchar(15), country_code varchar(3)); +Create table If Not Exists Calls (caller_id int, callee_id int, duration int); +``` + +## Sample Input Data + +```sql +insert into Person (id, name, phone_number) values ('3', 'Jonathan', '051-1234567'); +insert into Person (id, name, phone_number) values ('12', 'Elvis', '051-7654321'); +insert into Person (id, name, phone_number) values ('1', 'Moncef', '212-1234567'); +insert into Person (id, name, phone_number) values ('2', 'Maroua', '212-6523651'); +insert into Person (id, name, phone_number) values ('7', 'Meir', '972-1234567'); +insert into Person (id, name, phone_number) values ('9', 'Rachel', '972-0011100'); +insert into Country (name, country_code) values ('Peru', '051'); +insert into Country (name, country_code) values ('Israel', '972'); +insert into Country (name, country_code) values ('Morocco', '212'); +insert into Country (name, country_code) values ('Germany', '049'); +insert into Country (name, country_code) values ('Ethiopia', '251'); +insert into Calls (caller_id, callee_id, duration) values ('1', '9', '33'); +insert into Calls (caller_id, callee_id, duration) values ('2', '9', '4'); +insert into Calls (caller_id, callee_id, duration) values ('1', '2', '59'); +insert into Calls (caller_id, callee_id, duration) values ('3', '12', '102'); +insert into Calls (caller_id, callee_id, duration) values ('3', '12', '330'); +insert into Calls (caller_id, callee_id, duration) 
values ('12', '3', '5');
+insert into Calls (caller_id, callee_id, duration) values ('7', '9', '13');
+insert into Calls (caller_id, callee_id, duration) values ('7', '1', '3');
+insert into Calls (caller_id, callee_id, duration) values ('9', '7', '1');
+insert into Calls (caller_id, callee_id, duration) values ('1', '7', '7');
+```
+
+## Expected Output Data
+
+```text
++----------+
+| country |
++----------+
+| Peru |
++----------+
+```
+
+## SQL Solution
+
+```sql
+WITH cte AS (
+ SELECT caller_id AS person_id,duration
+ FROM calls_1501
+ UNION ALL -- UNION ALL keeps every call; plain UNION would drop repeated (person_id, duration) pairs and skew the averages
+ SELECT callee_id AS person_id,duration
+ FROM calls_1501
+),
+avg_duration AS (
+ SELECT cn.name AS country_name,c.duration AS duration,
+ AVG(duration) OVER () avg_global_duration,
+ AVG(duration) OVER (PARTITION BY cn.name) avg_country_duration
+ FROM cte c
+ INNER JOIN person_1501 p ON c.person_id=p.id
+ INNER JOIN country_1501 cn ON cn.country_code=SUBSTR(p.phone_number,1,3)
+)
+SELECT DISTINCT country_name
+FROM avg_duration
+WHERE avg_country_duration>avg_global_duration;
+```
+
+## Solution Breakdown
+
+### Goal
+
+The query builds the final result columns `country_name` from `calls`, `cte`, `person`, `country`.
+
+### Result Grain
+
+One row per distinct combination of projected columns.
+
+### Step-by-Step Logic
+
+1. Create CTE layers (`cte`, `avg_duration`) to decompose the logic into smaller, testable steps before the final SELECT.
+2. CTE `cte`: reads `calls`, stacking caller and callee rows with UNION ALL so each call contributes its duration to both participants.
+3. CTE `avg_duration`: reads `cte`, `person`, `country`, joins related entities, computes window metrics.
+4. Combine datasets using INNER JOIN. Join predicates control row matching and prevent accidental cartesian growth.
+5. Apply row-level filtering in `WHERE`: avg_country_duration>avg_global_duration.
+6. Use window functions (`... OVER (...)`) to compute rankings/running metrics while preserving row-level detail.
+7. Project final output columns: `country_name`.
+8. 
Remove duplicate result tuples using `DISTINCT` where uniqueness is required. + +### Why This Works + +CTEs separate transformation stages, which makes dependencies explicit and easier to validate. Join conditions align related entities so each output row is built from the correct source records. Window expressions calculate comparative metrics without collapsing rows too early. Predicate filtering removes irrelevant rows before expensive downstream computation. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are window partitions, join operations. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently. +- Window `PARTITION BY`/`ORDER BY` choices change semantics; test with tied values and nulls. +- If nullable columns are involved, validate null-handling explicitly (`IS NULL`, `COALESCE`). + diff --git a/medium/1501. Countries You Can Safely Invest In.sql b/medium/1501. Countries You Can Safely Invest In.sql deleted file mode 100644 index f29283e..0000000 --- a/medium/1501. Countries You Can Safely Invest In.sql +++ /dev/null @@ -1,18 +0,0 @@ -WITH cte AS ( - SELECT caller_id AS person_id,duration - FROM calls_1501 - UNION - SELECT callee_id AS person_id,duration - FROM calls_1501 -), -avg_duration AS( - SELECT cn.name AS country_name,c.duration AS duration, - AVG(duration) OVER () avg_global_duration, - AVG(duration) OVER (PARTITION BY cn.name) avg_country_duration - FROM cte c - INNER JOIN person_1501 p ON c.person_id=p.id - INNER JOIN country_1501 cn ON cn.country_code=SUBSTR(p.phone_number,1,3) -) -SELECT DISTINCT country_name -FROM avg_duration -WHERE avg_country_duration>avg_global_duration; diff --git a/medium/1532. The Most Recent Three Orders.md b/medium/1532. 
The Most Recent Three Orders.md new file mode 100644 index 0000000..b4b8322 --- /dev/null +++ b/medium/1532. The Most Recent Three Orders.md @@ -0,0 +1,114 @@ +# Question 1532: The Most Recent Three Orders + +**LeetCode URL:** https://leetcode.com/problems/the-most-recent-three-orders/ + +## Description + +Write an SQL query to find the most recent 3 orders of each user. return all of their orders. The query result format is in the following example: Customers +-------------+-----------+ | customer_id | name | +-------------+-----------+ | 1 | Winston | | 2 | Jonathan | | 3 | Annabelle | | 4 | Marwan | | 5 | Khaled | +-------------+-----------+ Orders +----------+------------+-------------+------+ | order_id | order_date | customer_id | cost | +----------+------------+-------------+------+ | 1 | 2020-07-31 | 1 | 30 | | 2 | 2020-07-30 | 2 | 40 | | 3 | 2020-07-31 | 3 | 70 | | 4 | 2020-07-29 | 4 | 100 | | 5 | 2020-06-10 | 1 | 1010 | | 6 | 2020-08-01 | 2 | 102 | | 7 | 2020-08-01 | 3 | 111 | | 8 | 2020-08-03 | 1 | 99 | | 9 | 2020-08-07 | 2 | 32 | | 10 | 2020-07-15 | 1 | 2 | +----------+------------+-------------+------+ Result table: +---------------+-------------+----------+------------+ | customer_name | customer_id | order_id | order_date | +---------------+-------------+----------+------------+ | Annabelle | 3 | 7 | 2020-08-01 | | Annabelle | 3 | 3 | 2020-07-31 | | Jonathan | 2 | 9 | 2020-08-07 | | Jonathan | 2 | 6 | 2020-08-01 | | Jonathan | 2 | 2 | 2020-07-30 | | Marwan | 4 | 4 | 2020-07-29 | | Winston | 1 | 8 | 2020-08-03 | | Winston | 1 | 1 | 2020-07-31 | | Winston | 1 | 10 | 2020-07-15 | +---------------+-------------+----------+------------+ Winston has 4 orders, we discard the order of "2020-06-10" because it is the oldest order. 
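+
+The window-function solution below is the standard tool for this "most recent N per group" ask; as a point of comparison, the same cut can be expressed with a correlated subquery and LIMIT. This is an untested PostgreSQL-flavored sketch (it assumes the repository's `orders_1532` naming and omits the join to Customers for the name column):
+
+```sql
+-- Keep an order only when its date is among that customer's three newest order dates.
+SELECT o.customer_id, o.order_id, o.order_date
+FROM orders_1532 o
+WHERE o.order_date IN (
+    SELECT o2.order_date
+    FROM orders_1532 o2
+    WHERE o2.customer_id = o.customer_id
+    ORDER BY o2.order_date DESC
+    LIMIT 3
+);
+```
+
+Note that LIMIT counts rows rather than distinct dates, so on data where one customer places several orders on the same day its tie handling differs slightly from a DENSE_RANK cutoff.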
+ +## Table Schema Structure + +```sql +Create table If Not Exists Customers (customer_id int, name varchar(10)); +Create table If Not Exists Orders (order_id int, order_date date, customer_id int, cost int); +``` + +## Sample Input Data + +```sql +insert into Customers (customer_id, name) values ('1', 'Winston'); +insert into Customers (customer_id, name) values ('2', 'Jonathan'); +insert into Customers (customer_id, name) values ('3', 'Annabelle'); +insert into Customers (customer_id, name) values ('4', 'Marwan'); +insert into Customers (customer_id, name) values ('5', 'Khaled'); +insert into Orders (order_id, order_date, customer_id, cost) values ('1', '2020-07-31', '1', '30'); +insert into Orders (order_id, order_date, customer_id, cost) values ('2', '2020-7-30', '2', '40'); +insert into Orders (order_id, order_date, customer_id, cost) values ('3', '2020-07-31', '3', '70'); +insert into Orders (order_id, order_date, customer_id, cost) values ('4', '2020-07-29', '4', '100'); +insert into Orders (order_id, order_date, customer_id, cost) values ('5', '2020-06-10', '1', '1010'); +insert into Orders (order_id, order_date, customer_id, cost) values ('6', '2020-08-01', '2', '102'); +insert into Orders (order_id, order_date, customer_id, cost) values ('7', '2020-08-01', '3', '111'); +insert into Orders (order_id, order_date, customer_id, cost) values ('8', '2020-08-03', '1', '99'); +insert into Orders (order_id, order_date, customer_id, cost) values ('9', '2020-08-07', '2', '32'); +insert into Orders (order_id, order_date, customer_id, cost) values ('10', '2020-07-15', '1', '2'); +``` + +## Expected Output Data + +```text ++---------------+-------------+----------+------------+ +| customer_name | customer_id | order_id | order_date | ++---------------+-------------+----------+------------+ +| Annabelle | 3 | 7 | 2020-08-01 | +| Annabelle | 3 | 3 | 2020-07-31 | +| Jonathan | 2 | 9 | 2020-08-07 | +| Jonathan | 2 | 6 | 2020-08-01 | +| Jonathan | 2 | 2 | 2020-07-30 | +| 
Marwan | 4 | 4 | 2020-07-29 |
+| Winston | 1 | 8 | 2020-08-03 |
+| Winston | 1 | 1 | 2020-07-31 |
+| Winston | 1 | 10 | 2020-07-15 |
++---------------+-------------+----------+------------+
+```
+
+## SQL Solution
+
+```sql
+WITH ranked AS (
+ SELECT *,
+ DENSE_RANK() OVER (PARTITION BY customer_id ORDER BY order_date DESC) AS rn
+ FROM orders_1532
+)
+SELECT c.name AS customer_name,r.customer_id,r.order_id,r.order_date
+FROM ranked r
+INNER JOIN customers_1532 c ON r.customer_id=c.customer_id
+WHERE rn<=3
+ORDER BY c.name,c.customer_id,r.order_date DESC;
+
+---------------Without Window function---------------
+
+SELECT o1.customer_id,o1.order_date,COUNT(o2.order_date)
+FROM orders_1532 o1
+INNER JOIN orders_1532 o2 ON o1.customer_id=o2.customer_id AND o1.order_date<=o2.order_date
+GROUP BY o1.customer_id,o1.order_date
+HAVING COUNT(o2.order_date)<=3
+ORDER BY 1,2 DESC;
+
+-- The main logic is done; two further joins (Customers, and Orders again) would bring in the remaining columns.
+```
+
+## Solution Breakdown
+
+### Goal
+
+The query builds the final result columns `customer_id`, `order_date` from `orders`, `ranked`, `customers`.
+
+### Result Grain
+
+One row per unique key in `GROUP BY o1.customer_id,o1.order_date`.
+
+### Step-by-Step Logic
+
+1. Create CTE layers (`ranked`) to decompose the logic into smaller, testable steps before the final SELECT.
+2. CTE `ranked`: reads `orders`, computes window metrics.
+3. Combine datasets using INNER JOIN. Join predicates control row matching and prevent accidental cartesian growth.
+4. Aggregate rows with COUNT grouped by o1.customer_id,o1.order_date.
+5. Use window functions (`... OVER (...)`) to compute rankings/running metrics while preserving row-level detail.
+6. Project final output columns: `customer_id`, `order_date`.
+7. Filter aggregated groups in `HAVING`: COUNT(o2.order_date)<=3.
+8. Order output deterministically with `ORDER BY 1,2 DESC`. 
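+
+One detail behind the step list: the choice of ranking function changes what "most recent 3" means when order dates tie. A small illustrative probe against the same `orders_1532` table makes the difference visible:
+
+```sql
+-- Compare the three ranking functions over identical partitions:
+--   ROW_NUMBER() always yields 1,2,3,... (ties broken arbitrarily),
+--   RANK()       gives tied dates the same rank and then skips numbers,
+--   DENSE_RANK() gives tied dates the same rank without gaps,
+-- so `rn <= 3` with DENSE_RANK keeps every order from the three newest dates.
+SELECT customer_id, order_id, order_date,
+       ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY order_date DESC) AS row_num,
+       RANK()       OVER (PARTITION BY customer_id ORDER BY order_date DESC) AS rnk,
+       DENSE_RANK() OVER (PARTITION BY customer_id ORDER BY order_date DESC) AS dense_rnk
+FROM orders_1532;
+```
+
+With the sample data the three functions agree because no customer has two orders on one date; on tied data they diverge, which is why the function choice should match the problem's tie rules.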
+ +### Why This Works + +CTEs separate transformation stages, which makes dependencies explicit and easier to validate. Join conditions align related entities so each output row is built from the correct source records. Grouping keys define the reporting grain; aggregate functions then summarize values at exactly that grain. Window expressions calculate comparative metrics without collapsing rows too early. `HAVING` ensures only groups that satisfy business thresholds survive. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are sorting/grouping, window partitions, join operations. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently. +- Every non-aggregated selected column must belong to the grouping grain. +- Window `PARTITION BY`/`ORDER BY` choices change semantics; test with tied values and nulls. + diff --git a/medium/1532. The Most Recent Three Orders.sql b/medium/1532. The Most Recent Three Orders.sql deleted file mode 100644 index bb81d6e..0000000 --- a/medium/1532. 
The Most Recent Three Orders.sql +++ /dev/null @@ -1,21 +0,0 @@ -WITH ranked AS ( - SELECT *, - DENSE_RANK() OVER (PARTITION BY customer_id ORDER BY order_date DESC) AS rn - FROM orders_1532 -) -SELECT c.name AS customer_name,r.customer_id,r.order_id,r.order_date -FROM ranked r -INNER JOIN customers_1532 c ON r.customer_id=c.customer_id -WHERE rn<=3 -ORDER BY c.name,c.customer_id,r.order_date DESC; - ----------------Without Window function--------------- - -SELECT o1.customer_id,o1.order_date,COUNT(o2.order_date) -FROM orders_1532 o1 -INNER JOIN orders_1532 o2 ON o1.customer_id=o2.customer_id AND o1.order_date<=o2.order_date -GROUP BY o1.customer_id,o1.order_date -HAVING COUNT(o2.order_date)<=3 -ORDER BY 1,2 DESC; - --- Main logic is over now we only need to apply 2 joins to bring other columns. diff --git a/medium/1549. The Most Recent Orders for Each Product.md b/medium/1549. The Most Recent Orders for Each Product.md new file mode 100644 index 0000000..9bda71c --- /dev/null +++ b/medium/1549. The Most Recent Orders for Each Product.md @@ -0,0 +1,98 @@ +# Question 1549: The Most Recent Orders for Each Product + +**LeetCode URL:** https://leetcode.com/problems/the-most-recent-orders-for-each-product/ + +## Description + +Write an SQL query to find the most recent order(s) of each product. Return the result table sorted by product_name in ascending order and in case of a tie by the product_id in ascending order. 
The query result format is in the following example:

```text
Customers
+-------------+-----------+
| customer_id | name      |
+-------------+-----------+
| 1           | Winston   |
| 2           | Jonathan  |
| 3           | Annabelle |
| 4           | Marwan    |
| 5           | Khaled    |
+-------------+-----------+

Orders
+----------+------------+-------------+------------+
| order_id | order_date | customer_id | product_id |
+----------+------------+-------------+------------+
| 1        | 2020-07-31 | 1           | 1          |
| 2        | 2020-07-30 | 2           | 2          |
| 3        | 2020-08-29 | 3           | 3          |
| 4        | 2020-07-29 | 4           | 1          |
| 5        | 2020-06-10 | 1           | 2          |
| 6        | 2020-08-01 | 2           | 1          |
| 7        | 2020-08-01 | 3           | 1          |
| 8        | 2020-08-03 | 1           | 2          |
| 9        | 2020-08-07 | 2           | 3          |
| 10       | 2020-07-15 | 1           | 2          |
+----------+------------+-------------+------------+

Products
+------------+--------------+-------+
| product_id | product_name | price |
+------------+--------------+-------+
| 1          | keyboard     | 120   |
| 2          | mouse        | 80    |
| 3          | screen       | 600   |
| 4          | hard disk    | 450   |
+------------+--------------+-------+

Result table:
+--------------+------------+----------+------------+
| product_name | product_id | order_id | order_date |
+--------------+------------+----------+------------+
| keyboard     | 1          | 6        | 2020-08-01 |
| keyboard     | 1          | 7        | 2020-08-01 |
| mouse        | 2          | 8        | 2020-08-03 |
| screen       | 3          | 3        | 2020-08-29 |
+--------------+------------+----------+------------+
```

keyboard's most recent order date is 2020-08-01, and it was ordered twice on that day.
+ +## Table Schema Structure + +```sql +Create table If Not Exists Customers (customer_id int, name varchar(10)); +Create table If Not Exists Orders (order_id int, order_date date, customer_id int, product_id int); +Create table If Not Exists Products (product_id int, product_name varchar(20), price int); +``` + +## Sample Input Data + +```sql +insert into Customers (customer_id, name) values ('1', 'Winston'); +insert into Customers (customer_id, name) values ('2', 'Jonathan'); +insert into Customers (customer_id, name) values ('3', 'Annabelle'); +insert into Customers (customer_id, name) values ('4', 'Marwan'); +insert into Customers (customer_id, name) values ('5', 'Khaled'); +insert into Orders (order_id, order_date, customer_id, product_id) values ('1', '2020-07-31', '1', '1'); +insert into Orders (order_id, order_date, customer_id, product_id) values ('2', '2020-07-30', '2', '2'); +insert into Orders (order_id, order_date, customer_id, product_id) values ('3', '2020-08-29', '3', '3'); +insert into Orders (order_id, order_date, customer_id, product_id) values ('4', '2020-07-29', '4', '1'); +insert into Orders (order_id, order_date, customer_id, product_id) values ('5', '2020-06-10', '1', '2'); +insert into Orders (order_id, order_date, customer_id, product_id) values ('6', '2020-08-01', '2', '1'); +insert into Orders (order_id, order_date, customer_id, product_id) values ('7', '2020-08-01', '3', '1'); +insert into Orders (order_id, order_date, customer_id, product_id) values ('8', '2020-08-03', '1', '2'); +insert into Orders (order_id, order_date, customer_id, product_id) values ('9', '2020-08-07', '2', '3'); +insert into Orders (order_id, order_date, customer_id, product_id) values ('10', '2020-07-15', '1', '2'); +insert into Products (product_id, product_name, price) values ('1', 'keyboard', '120'); +insert into Products (product_id, product_name, price) values ('2', 'mouse', '80'); +insert into Products (product_id, product_name, price) values ('3', 'screen',
'600'); +insert into Products (product_id, product_name, price) values ('4', 'hard disk', '450'); +``` + +## Expected Output Data + +```text ++--------------+------------+----------+------------+ +| product_name | product_id | order_id | order_date | ++--------------+------------+----------+------------+ +| keyboard | 1 | 6 | 2020-08-01 | +| keyboard | 1 | 7 | 2020-08-01 | +| mouse | 2 | 8 | 2020-08-03 | +| screen | 3 | 3 | 2020-08-29 | ++--------------+------------+----------+------------+ +``` + +## SQL Solution + +```sql +WITH recent_orders AS ( + SELECT o1.* + FROM orders_1549 o1 + LEFT JOIN orders_1549 o2 ON o1.product_id = o2.product_id AND o1.order_date < o2.order_date + WHERE o2.order_id IS NULL +) +SELECT p.product_name,p.product_id,ro.order_id,ro.order_date +FROM recent_orders ro +INNER JOIN products_1549 p ON ro.product_id = p.product_id +ORDER BY p.product_name,p.product_id,ro.order_id; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `product_name`, `product_id`, `order_id`, `order_date` from `orders`, `recent_orders`, `products`. + +### Result Grain + +Row grain follows the post-filter join output. + +### Step-by-Step Logic + +1. Create CTE layers (`recent_orders`) to decompose the logic into smaller, testable steps before the final SELECT. +2. CTE `recent_orders`: reads `orders`, joins related entities. +3. Combine datasets using LEFT JOIN, INNER JOIN, JOIN. Join predicates control row matching and prevent accidental cartesian growth. +4. Project final output columns: `product_name`, `product_id`, `order_id`, `order_date`. +5. Order output deterministically with `ORDER BY p.product_name,p.product_id,ro.order_id`. + +### Why This Works + +CTEs separate transformation stages, which makes dependencies explicit and easier to validate. Join conditions align related entities so each output row is built from the correct source records. The final projection exposes only the columns required by the result contract. 
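The anti-join pattern at the heart of `recent_orders` (keep a row only when no strictly newer order exists for the same product) can be checked with a small sqlite3 sketch using hypothetical mini data:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE orders_1549 (order_id INT, order_date TEXT, product_id INT);
INSERT INTO orders_1549 VALUES
  (1, '2020-07-31', 1), (6, '2020-08-01', 1),
  (7, '2020-08-01', 1), (5, '2020-06-10', 2);
""")
rows = con.execute("""
SELECT o1.order_id, o1.product_id, o1.order_date
FROM orders_1549 o1
LEFT JOIN orders_1549 o2
  ON o1.product_id = o2.product_id
 AND o1.order_date < o2.order_date
WHERE o2.order_id IS NULL   -- keep rows with no strictly newer order
ORDER BY o1.product_id, o1.order_id
""").fetchall()
# Ties on the maximum date (orders 6 and 7) both survive, because the
# join condition is strict (<), matching the ranked "most recent" semantics.
```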
+ +### Performance Notes + +Primary cost drivers are sorting/grouping, join operations. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently. + diff --git a/medium/1549. The Most Recent Orders for Each Product.sql b/medium/1549. The Most Recent Orders for Each Product.sql deleted file mode 100644 index 26d4c41..0000000 --- a/medium/1549. The Most Recent Orders for Each Product.sql +++ /dev/null @@ -1,10 +0,0 @@ -WITH recent_orders AS ( - SELECT o1.* - FROM orders_1549 o1 - LEFT JOIN orders_1549 o2 ON o1.product_id = o2.product_id AND o1.order_date < o2.order_date - WHERE o2.order_id IS NULL -) -SELECT p.product_name,p.product_id,ro.order_id,ro.order_date -FROM recent_orders ro -INNER JOIN products_1549 p ON ro.product_id = p.product_id -ORDER BY p.product_name,p.product_id,ro.order_id; diff --git a/medium/1555. Bank Account Summary.md b/medium/1555. Bank Account Summary.md new file mode 100644 index 0000000..0441852 --- /dev/null +++ b/medium/1555. Bank Account Summary.md @@ -0,0 +1,95 @@ +# Question 1555: Bank Account Summary + +**LeetCode URL:** https://leetcode.com/problems/bank-account-summary/ + +## Description + +Write an SQL query to report, for each user, the current credit after performing all transactions, and whether the user's current credit has dropped below zero (i.e. the credit limit has been breached). Return the result table in any order. The query result format is in the following example.
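The core idea of the solution below — split every transaction into a negative ledger entry for the payer and a positive one for the payee, then sum per user — can be sketched in plain Python using this problem's sample data:

```python
# (paid_by, paid_to, amount) rows and starting credits from the sample data.
transactions = [(1, 3, 400), (3, 2, 500), (2, 1, 200)]
credits = {1: 100, 2: 200, 3: 10000, 4: 800}

net = dict.fromkeys(credits, 0)
for paid_by, paid_to, amount in transactions:
    net[paid_by] -= amount   # payer side: negative entry (amount * -1 in the CTE)
    net[paid_to] += amount   # payee side: positive entry

# COALESCE(..., 0) in the SQL corresponds to users with no transactions
# keeping net 0 here (e.g. user 4).
final_credit = {u: credits[u] + net[u] for u in credits}
breached = {u: "Yes" if bal < 0 else "No" for u, bal in final_credit.items()}
```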
+ +## Table Schema Structure + +```sql +Create table If Not Exists Users (user_id int, user_name varchar(20), credit int); +Create table If Not Exists Transactions (trans_id int, paid_by int, paid_to int, amount int, transacted_on date); +``` + +## Sample Input Data + +```sql +insert into Users (user_id, user_name, credit) values ('1', 'Moustafa', '100'); +insert into Users (user_id, user_name, credit) values ('2', 'Jonathan', '200'); +insert into Users (user_id, user_name, credit) values ('3', 'Winston', '10000'); +insert into Users (user_id, user_name, credit) values ('4', 'Luis', '800'); +insert into Transactions (trans_id, paid_by, paid_to, amount, transacted_on) values ('1', '1', '3', '400', '2020-08-01'); +insert into Transactions (trans_id, paid_by, paid_to, amount, transacted_on) values ('2', '3', '2', '500', '2020-08-02'); +insert into Transactions (trans_id, paid_by, paid_to, amount, transacted_on) values ('3', '2', '1', '200', '2020-08-03'); +``` + +## Expected Output Data + +```text ++------------+------------+------------+-----------------------+ +| user_id | user_name | credit | credit_limit_breached | ++------------+------------+------------+-----------------------+ +| 1 | Moustafa | -100 | Yes | +| 2 | Jonathan | 500 | No | +| 3 | Winston | 9900 | No | +| 4 | Luis | 800 | No | ++------------+------------+------------+-----------------------+ +``` + +## SQL Solution + +```sql +WITH trans AS ( + SELECT paid_by AS usr,amount*-1 AS amount + FROM transactions_1555 + UNION ALL + SELECT paid_to AS usr,amount + FROM transactions_1555 +), +agg_trans AS ( + SELECT usr,SUM(amount) AS cr + FROM trans + GROUP BY usr +) + +SELECT u.user_id,u.user_name,(COALESCE(t.cr,0)+u.credit) AS credit, + CASE WHEN (COALESCE(t.cr,0)+u.credit) < 0 THEN 'Yes' + ELSE 'No' + END AS credit_limit_breached +FROM users_1555 u +LEFT JOIN agg_trans t ON t.usr = u.user_id; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `user_id`, `user_name`,
`credit`, `credit_limit_breached` from `transactions`, `trans`, `users`, `agg_trans`. + +### Result Grain + +Row grain follows the post-filter join output. + +### Step-by-Step Logic + +1. Create CTE layers (`trans`, `agg_trans`) to decompose the logic into smaller, testable steps before the final SELECT. +2. CTE `trans`: reads `transactions`. +3. CTE `agg_trans`: reads `trans`. +4. Combine datasets using LEFT JOIN, JOIN. Join predicates control row matching and prevent accidental cartesian growth. +5. Project final output columns: `user_id`, `user_name`, `credit`, `credit_limit_breached`. +6. Merge compatible result sets with `UNION`/`UNION ALL` before final projection. + +### Why This Works + +CTEs separate transformation stages, which makes dependencies explicit and easier to validate. Join conditions align related entities so each output row is built from the correct source records. Set-union logic combines multiple valid pathways into one consistent output. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are join operations. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently. + diff --git a/medium/1555. Bank Account Summary.sql b/medium/1555. Bank Account Summary.sql deleted file mode 100644 index 0a0d5f8..0000000 --- a/medium/1555. 
Bank Account Summary.sql +++ /dev/null @@ -1,19 +0,0 @@ -WITH trans AS ( - SELECT paid_by AS usr,amount*-1 AS amount - FROM transactions_1555 - UNION ALL - SELECT paid_to AS usr,amount - FROM transactions_1555 -), -agg_trans AS ( - SELECT usr,SUM(amount) AS cr - FROM trans - GROUP BY usr -) - -SELECT u.user_id,u.user_name,(COALESCE(t.cr,0)+u.credit) AS credit, - CASE WHEN (COALESCE(t.cr,0)+u.credit) < 0 THEN 'Yes' - ELSE 'No' - END AS credit_limit_breached -FROM users_1555 u -LEFT JOIN agg_trans t ON t.usr = u.user_id; diff --git "a/medium/1596. The Most Frequently Ordered Products for Each\nCustomer.md" "b/medium/1596. The Most Frequently Ordered Products for Each\nCustomer.md" new file mode 100644 index 0000000..fa3124a --- /dev/null +++ "b/medium/1596. The Most Frequently Ordered Products for Each\nCustomer.md" @@ -0,0 +1,109 @@ +# Question 1596: The Most Frequently Ordered Products for Each Customer + +**LeetCode URL:** https://leetcode.com/problems/the-most-frequently-ordered-products-for-each-customer/ + +## Description + +Write an SQL query to find the most frequently ordered product(s) for each customer. Return the result table in any order. 
The query result format is in the following example:

```text
Customers
+-------------+-------+
| customer_id | name  |
+-------------+-------+
| 1           | Alice |
| 2           | Bob   |
| 3           | Tom   |
| 4           | Jerry |
| 5           | John  |
+-------------+-------+

Orders
+----------+------------+-------------+------------+
| order_id | order_date | customer_id | product_id |
+----------+------------+-------------+------------+
| 1        | 2020-07-31 | 1           | 1          |
| 2        | 2020-07-30 | 2           | 2          |
| 3        | 2020-08-29 | 3           | 3          |
| 4        | 2020-07-29 | 4           | 1          |
| 5        | 2020-06-10 | 1           | 2          |
| 6        | 2020-08-01 | 2           | 1          |
| 7        | 2020-08-01 | 3           | 3          |
| 8        | 2020-08-03 | 1           | 2          |
| 9        | 2020-08-07 | 2           | 3          |
| 10       | 2020-07-15 | 1           | 2          |
+----------+------------+-------------+------------+

Products
+------------+--------------+-------+
| product_id | product_name | price |
+------------+--------------+-------+
| 1          | keyboard     | 120   |
| 2          | mouse        | 80    |
| 3          | screen       | 600   |
| 4          | hard disk    | 450   |
+------------+--------------+-------+

Result table:
+-------------+------------+--------------+
| customer_id | product_id | product_name |
+-------------+------------+--------------+
| 1           | 2          | mouse        |
| 2           | 1          | keyboard     |
| 2           | 2          | mouse        |
| 2           | 3          | screen       |
| 3           | 3          | screen       |
| 4           | 1          | keyboard     |
+-------------+------------+--------------+
```

Alice (customer 1) ordered the mouse three times and the keyboard one time, so the mouse is the most frequently ordered product for them.
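The keep-all-ties logic the SQL implements (count orders per customer/product, then keep every product tied at the per-customer maximum) can be sketched in plain Python with the sample orders above:

```python
from collections import Counter

# (customer_id, product_id) pairs from the sample Orders table.
orders = [(1, 1), (2, 2), (3, 3), (4, 1), (1, 2),
          (2, 1), (3, 3), (1, 2), (2, 3), (1, 2)]

counts = Counter(orders)              # order count per (customer, product)
result = []
for customer in sorted({c for c, _ in orders}):
    per_product = {p: n for (c, p), n in counts.items() if c == customer}
    top = max(per_product.values())   # MAX(cnt) OVER (PARTITION BY customer_id)
    # keep every product tied at the maximum, mirroring m.cnt = m.maximum
    result += [(customer, p) for p in sorted(per_product) if per_product[p] == top]
```

Customer 2 ordered each of the three products once, so all three survive the tie, exactly as in the expected result table.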
+ +## Table Schema Structure + +```sql +Create table If Not Exists Customers (customer_id int, name varchar(10)); +Create table If Not Exists Orders (order_id int, order_date date, customer_id int, product_id int); +Create table If Not Exists Products (product_id int, product_name varchar(20), price int); +``` + +## Sample Input Data + +```sql +insert into Customers (customer_id, name) values ('1', 'Alice'); +insert into Customers (customer_id, name) values ('2', 'Bob'); +insert into Customers (customer_id, name) values ('3', 'Tom'); +insert into Customers (customer_id, name) values ('4', 'Jerry'); +insert into Customers (customer_id, name) values ('5', 'John'); +insert into Orders (order_id, order_date, customer_id, product_id) values ('1', '2020-07-31', '1', '1'); +insert into Orders (order_id, order_date, customer_id, product_id) values ('2', '2020-07-30', '2', '2'); +insert into Orders (order_id, order_date, customer_id, product_id) values ('3', '2020-08-29', '3', '3'); +insert into Orders (order_id, order_date, customer_id, product_id) values ('4', '2020-07-29', '4', '1'); +insert into Orders (order_id, order_date, customer_id, product_id) values ('5', '2020-06-10', '1', '2'); +insert into Orders (order_id, order_date, customer_id, product_id) values ('6', '2020-08-01', '2', '1'); +insert into Orders (order_id, order_date, customer_id, product_id) values ('7', '2020-08-01', '3', '3'); +insert into Orders (order_id, order_date, customer_id, product_id) values ('8', '2020-08-03', '1', '2'); +insert into Orders (order_id, order_date, customer_id, product_id) values ('9', '2020-08-07', '2', '3'); +insert into Orders (order_id, order_date, customer_id, product_id) values ('10', '2020-07-15', '1', '2'); +insert into Products (product_id, product_name, price) values ('1', 'keyboard', '120'); +insert into Products (product_id, product_name, price) values ('2', 'mouse', '80'); +insert into Products (product_id, product_name, price) values ('3', 'screen', '600'); +insert
into Products (product_id, product_name, price) values ('4', 'hard disk', '450'); +``` + +## Expected Output Data + +```text ++-------------+------------+--------------+ +| customer_id | product_id | product_name | ++-------------+------------+--------------+ +| 1 | 2 | mouse | +| 2 | 1 | keyboard | +| 2 | 2 | mouse | +| 2 | 3 | screen | +| 3 | 3 | screen | +| 4 | 1 | keyboard | ++-------------+------------+--------------+ +``` + +## SQL Solution + +```sql +WITH cte AS ( + SELECT customer_id,product_id,COUNT(1) AS cnt + FROM orders_1596 + GROUP BY customer_id,product_id + ORDER BY 1,3 DESC +), +mx AS ( + SELECT *, + MAX(cnt) OVER (PARTITION BY customer_id) AS maximum + FROM cte +) + +SELECT m.customer_id,m.product_id,p.product_name +FROM mx m +INNER JOIN products_1596 p ON m.product_id = p.product_id AND m.cnt = m.maximum +ORDER BY 1; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `customer_id`, `product_id`, `product_name` from `orders`, `cte`, `mx`, `products`. + +### Result Grain + +Row grain follows the post-filter join output. + +### Step-by-Step Logic + +1. Create CTE layers (`cte`, `mx`) to decompose the logic into smaller, testable steps before the final SELECT. +2. CTE `cte`: reads `orders`. +3. CTE `mx`: reads `cte`, computes window metrics. +4. Combine datasets using INNER JOIN, JOIN. Join predicates control row matching and prevent accidental cartesian growth. +5. Use window functions (`... OVER (...)`) to compute rankings/running metrics while preserving row-level detail. +6. Project final output columns: `customer_id`, `product_id`, `product_name`. +7. Order output deterministically with `ORDER BY 1`. + +### Why This Works + +CTEs separate transformation stages, which makes dependencies explicit and easier to validate. Join conditions align related entities so each output row is built from the correct source records. Window expressions calculate comparative metrics without collapsing rows too early. 
The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are sorting/grouping, window partitions, join operations. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently. +- Window `PARTITION BY`/`ORDER BY` choices change semantics; test with tied values and nulls. + diff --git "a/medium/1596. The Most Frequently Ordered Products for Each\nCustomer.sql" "b/medium/1596. The Most Frequently Ordered Products for Each\nCustomer.sql" deleted file mode 100644 index 38d40b9..0000000 --- "a/medium/1596. The Most Frequently Ordered Products for Each\nCustomer.sql" +++ /dev/null @@ -1,16 +0,0 @@ -WITH cte AS ( - SELECT customer_id,product_id,COUNT(1) AS cnt - FROM orders_1596 - GROUP BY customer_id,product_id - ORDER BY 1,3 DESC -), -mx AS ( - SELECT *, - MAX(cnt) OVER (PARTITION BY customer_id) AS maximum - FROM cte -) - -SELECT m.customer_id,m.product_id,p.product_name -FROM mx m -INNER JOIN products_1596 p ON m.product_id = p.product_id AND m.cnt = m.maximum -ORDER BY 1; diff --git a/medium/1613. Find the Missing IDs.md b/medium/1613. Find the Missing IDs.md new file mode 100644 index 0000000..9bb3350 --- /dev/null +++ b/medium/1613. Find the Missing IDs.md @@ -0,0 +1,89 @@ +# Question 1613: Find the Missing IDs + +**LeetCode URL:** https://leetcode.com/problems/find-the-missing-ids/ + +## Description + +Write an SQL query to find the missing customer IDs. Return the result table ordered by ids in ascending order. The query result format is in the following example. 
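The shape of the solution below — generate the full id range from 1 up to `MAX(customer_id)`, then keep the ids with no matching customer row — can be sketched in plain Python with the sample data:

```python
customer_ids = [1, 4, 5]  # sample customer_id values from this problem

existing = set(customer_ids)
# Enumerate 1..MAX(customer_id) (the recursive CTE's job), then keep the
# ids no customer carries (the anti-join step).
missing = [i for i in range(1, max(customer_ids) + 1) if i not in existing]
```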
+ +## Table Schema Structure + +```sql +Create table If Not Exists Customers (customer_id int, customer_name varchar(20)); +``` + +## Sample Input Data + +```sql +insert into Customers (customer_id, customer_name) values ('1', 'Alice'); +insert into Customers (customer_id, customer_name) values ('4', 'Bob'); +insert into Customers (customer_id, customer_name) values ('5', 'Charlie'); +``` + +## Expected Output Data + +```text ++-----+ +| ids | ++-----+ +| 2 | +| 3 | ++-----+ +``` + +## SQL Solution + +```sql +WITH RECURSIVE max_id AS( + SELECT MAX(customer_id) AS mx_id + FROM customers_1613 +), +cte AS ( + SELECT 1 AS id + UNION ALL + SELECT id+1 AS id + FROM cte c + INNER JOIN max_id m ON true + WHERE c.id < m.mx_id The account was active from "2021-02-01 09:00:00" to "2021-02-01 09:30:00" with two different IP addresses (1 and 2). + +## Table Schema Structure + +```sql +Create table If Not Exists LogInfo (account_id int, ip_address int, login datetime, logout datetime); +``` + +## Sample Input Data + +```sql +insert into LogInfo (account_id, ip_address, login, logout) values ('1', '1', '2021-02-01 09:00:00', '2021-02-01 09:30:00'); +insert into LogInfo (account_id, ip_address, login, logout) values ('1', '2', '2021-02-01 08:00:00', '2021-02-01 11:30:00'); +insert into LogInfo (account_id, ip_address, login, logout) values ('2', '6', '2021-02-01 20:30:00', '2021-02-01 22:00:00'); +insert into LogInfo (account_id, ip_address, login, logout) values ('2', '7', '2021-02-02 20:30:00', '2021-02-02 22:00:00'); +insert into LogInfo (account_id, ip_address, login, logout) values ('3', '9', '2021-02-01 16:00:00', '2021-02-01 16:59:59'); +insert into LogInfo (account_id, ip_address, login, logout) values ('3', '13', '2021-02-01 17:00:00', '2021-02-01 17:59:59'); +insert into LogInfo (account_id, ip_address, login, logout) values ('4', '10', '2021-02-01 16:00:00', '2021-02-01 17:00:00'); +insert into LogInfo (account_id, ip_address, login, logout) values ('4', '11', '2021-02-01
17:00:00', '2021-02-01 17:59:59'); +``` + +## Expected Output Data + +```text ++------------+ +| account_id | ++------------+ +| 1 | +| 4 | ++------------+ +``` + +## SQL Solution + +```sql +SELECT l1.account_id +FROM log_info_1747 l1 +INNER JOIN log_info_1747 l2 ON l1.account_id = l2.account_id AND l1.login BETWEEN l2.login AND l2.logout AND l1.ip_address <> l2.ip_address; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `account_id` from `log_info`. + +### Result Grain + +Row grain follows the post-filter join output. + +### Step-by-Step Logic + +1. Combine datasets using INNER JOIN, JOIN. Join predicates control row matching and prevent accidental cartesian growth. +2. Project final output columns: `account_id`. + +### Why This Works + +Join conditions align related entities so each output row is built from the correct source records. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are join operations. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently. + diff --git a/medium/1747. Leetflex Banned Accounts.sql b/medium/1747. Leetflex Banned Accounts.sql deleted file mode 100644 index 4291736..0000000 --- a/medium/1747. Leetflex Banned Accounts.sql +++ /dev/null @@ -1,3 +0,0 @@ -SELECT l1.account_id -FROM log_info_1747 l1 -INNER JOIN log_info_1747 l2 ON l1.account_id = l2.account_id AND l1.login BETWEEN l2.login AND l2.logout AND l1.ip_address <> l2.ip_address; diff --git a/medium/176. Second Highest Salary.md b/medium/176. Second Highest Salary.md new file mode 100644 index 0000000..bdfcd1e --- /dev/null +++ b/medium/176. 
Second Highest Salary.md @@ -0,0 +1,70 @@ +# Question 176: Second Highest Salary + +**LeetCode URL:** https://leetcode.com/problems/second-highest-salary/ + +## Description + +Write a solution to find the second highest distinct salary from the Employee table. If there is no second highest salary, return null (return None in Pandas). The result format is in the following example. + +## Table Schema Structure + +```sql +Create table If Not Exists Employee (id int, salary int); +``` + +## Sample Input Data + +```sql +insert into Employee (id, salary) values ('1', '100'); +insert into Employee (id, salary) values ('2', '200'); +insert into Employee (id, salary) values ('3', '300'); +``` + +## Expected Output Data + +```text ++---------------------+ +| SecondHighestSalary | ++---------------------+ +| 200 | ++---------------------+ +``` + +## SQL Solution + +```sql +SELECT DISTINCT salary +FROM employee_176 +ORDER BY salary DESC +LIMIT 1 +OFFSET 1; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `salary` from `employee`. + +### Result Grain + +One row per distinct combination of projected columns. + +### Step-by-Step Logic + +1. Project final output columns: `salary`. +2. Remove duplicate result tuples using `DISTINCT` where uniqueness is required. +3. Order output deterministically with `ORDER BY salary DESC`. + +### Why This Works + +The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are sorting/grouping. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Check edge cases: empty input, single-row input, and duplicate keys. + diff --git a/medium/176. Second Highest Salary.sql b/medium/176. Second Highest Salary.sql deleted file mode 100644 index 5832af1..0000000 --- a/medium/176. Second Highest Salary.sql +++ /dev/null @@ -1,5 +0,0 @@ -SELECT DISTINCT salary -FROM employee_176 -ORDER BY salary DESC -LIMIT 1 -OFFSET 1; diff --git a/medium/177. 
Nth Highest Salary.md b/medium/177. Nth Highest Salary.md new file mode 100644 index 0000000..acfa7c3 --- /dev/null +++ b/medium/177. Nth Highest Salary.md @@ -0,0 +1,70 @@ +# Question 177: Nth Highest Salary + +**LeetCode URL:** https://leetcode.com/problems/nth-highest-salary/ + +## Description + +Write a solution to find the nth highest distinct salary from the Employee table. If there is no nth highest salary, return null. The result format is in the following example. + +## Table Schema Structure + +```sql +Create table If Not Exists Employee (Id int, Salary int); +``` + +## Sample Input Data + +```sql +insert into Employee (id, salary) values ('1', '100'); +insert into Employee (id, salary) values ('2', '200'); +insert into Employee (id, salary) values ('3', '300'); +``` + +## Expected Output Data + +```text ++------------------------+ +| getNthHighestSalary(2) | ++------------------------+ +| 200 | ++------------------------+ +``` + +## SQL Solution + +```sql +-- N is the function's parameter; OFFSET N-1 skips the N-1 higher distinct salaries. +SELECT DISTINCT salary +FROM employee_176 +ORDER BY salary DESC +LIMIT 1 +OFFSET N-1; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `salary` from `employee`. + +### Result Grain + +One row per distinct combination of projected columns. + +### Step-by-Step Logic + +1. Project final output columns: `salary`. +2. Remove duplicate result tuples using `DISTINCT` where uniqueness is required. +3. Order output deterministically with `ORDER BY salary DESC`. + +### Why This Works + +The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are sorting/grouping. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Check edge cases: empty input, single-row input, and duplicate keys. + diff --git a/medium/177. Nth Highest Salary.sql b/medium/177. Nth Highest Salary.sql deleted file mode 100644 index a56da20..0000000 --- a/medium/177. 
Nth Highest Salary.sql +++ /dev/null @@ -1,5 +0,0 @@ -SELECT DISTINCT salary -FROM employee_176 -ORDER BY salary DESC -LIMIT 1 -OFFSET N-1; diff --git a/medium/178. Rank Scores.md b/medium/178. Rank Scores.md new file mode 100644 index 0000000..781365c --- /dev/null +++ b/medium/178. Rank Scores.md @@ -0,0 +1,72 @@ +# Question 178: Rank Scores + +**LeetCode URL:** https://leetcode.com/problems/rank-scores/ + +## Description + +Write a SQL query to rank scores. If there is a tie between two scores, both should have the same ranking. Note that after a tie, the next ranking number should be the next consecutive integer value. In other words, there should be no "holes" between ranks. + +## Table Schema Structure + +```sql +Create table If Not Exists Scores (id int, score DECIMAL(3,2)); +``` + +## Sample Input Data + +```sql +insert into Scores (id, score) values ('1', '3.5'); +insert into Scores (id, score) values ('2', '3.65'); +insert into Scores (id, score) values ('3', '4.0'); +insert into Scores (id, score) values ('4', '3.85'); +insert into Scores (id, score) values ('5', '4.0'); +insert into Scores (id, score) values ('6', '3.65'); +``` + +## Expected Output Data + +```text ++-------+------+ +| score | rank | ++-------+------+ +| 4.00 | 1 | +| 4.00 | 1 | +| 3.85 | 2 | +| 3.65 | 3 | +| 3.65 | 3 | +| 3.50 | 4 | ++-------+------+ +``` + +## SQL Solution + +```sql +SELECT score, + DENSE_RANK() OVER(w) as rank +FROM scores_178 +WINDOW w AS (ORDER BY score DESC); +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `score`, `rank` from `scores`. + +### Result Grain + +One row per row of `Scores`. + +### Step-by-Step Logic + +1. Use window functions (`... OVER (...)`) to compute rankings/running metrics while preserving row-level detail. +2. Project final output columns: `score`, `rank`. +3. The `WINDOW w AS (ORDER BY score DESC)` clause supplies the ranking order; note the query has no final `ORDER BY`, so output order is not guaranteed. + +### Why This Works + +Window expressions calculate comparative metrics without collapsing rows too early. 
The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are sorting/grouping, window partitions. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Window `PARTITION BY`/`ORDER BY` choices change semantics; test with tied values and nulls. + diff --git a/medium/178. Rank Scores.sql b/medium/178. Rank Scores.sql deleted file mode 100644 index 5a04658..0000000 --- a/medium/178. Rank Scores.sql +++ /dev/null @@ -1,4 +0,0 @@ -SELECT score, - DENSE_RANK() OVER(w) as rank -FROM scores_178 -WINDOW w AS (ORDER BY score DESC); diff --git a/medium/1783. Grand Slam Titles.md b/medium/1783. Grand Slam Titles.md new file mode 100644 index 0000000..d4c9520 --- /dev/null +++ b/medium/1783. Grand Slam Titles.md @@ -0,0 +1,88 @@ +# Question 1783: Grand Slam Titles + +**LeetCode URL:** https://leetcode.com/problems/grand-slam-titles/ + +## Description + +Write an SQL query to report the number of grand slam tournaments won by each player. Return the result table in any order. The query result format is in the following example:

```text
Players table:
+-----------+-------------+
| player_id | player_name |
+-----------+-------------+
| 1         | Nadal       |
| 2         | Federer     |
| 3         | Novak       |
+-----------+-------------+

Championships table:
+------+-----------+---------+---------+---------+
| year | Wimbledon | Fr_open | US_open | Au_open |
+------+-----------+---------+---------+---------+
| 2018 | 1         | 1       | 1       | 1       |
| 2019 | 1         | 1       | 2       | 2       |
| 2020 | 2         | 1       | 2       | 2       |
+------+-----------+---------+---------+---------+

Result table:
+-----------+-------------+-------------------+
| player_id | player_name | grand_slams_count |
+-----------+-------------+-------------------+
| 2         | Federer     | 5                 |
| 1         | Nadal       | 7                 |
+-----------+-------------+-------------------+
```

Player 1 (Nadal) won 7 titles: Wimbledon (2018, 2019), Fr_open (2018, 2019, 2020), US_open (2018), and Au_open (2018).
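The unpivot-then-count idea used by the solution can be sketched in plain Python with the example data above:

```python
from collections import Counter

players = {1: "Nadal", 2: "Federer", 3: "Novak"}
championships = [          # (year, Wimbledon, Fr_open, US_open, Au_open)
    (2018, 1, 1, 1, 1),
    (2019, 1, 1, 2, 2),
    (2020, 2, 1, 2, 2),
]

# "Unpivot": flatten the four winner columns into one stream of winner
# ids (what the UNION ALL branches do), then count wins per player.
wins = Counter(w for _, *winner_cols in championships for w in winner_cols)
result = {players[p]: n for p, n in wins.items()}
```

Players with zero wins (Novak) never appear in the unpivoted stream, which is why the inner join naturally drops them from the result.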
+ +## Table Schema Structure + +```sql +Create table If Not Exists Players (player_id int, player_name varchar(20)); +Create table If Not Exists Championships (year int, Wimbledon int, Fr_open int, US_open int, Au_open int); +``` + +## Sample Input Data + +```sql +insert into Players (player_id, player_name) values ('1', 'Nadal'); +insert into Players (player_id, player_name) values ('2', 'Federer'); +insert into Players (player_id, player_name) values ('3', 'Novak'); +insert into Championships (year, Wimbledon, Fr_open, US_open, Au_open) values ('2018', '1', '1', '1', '1'); +insert into Championships (year, Wimbledon, Fr_open, US_open, Au_open) values ('2019', '1', '1', '2', '2'); +insert into Championships (year, Wimbledon, Fr_open, US_open, Au_open) values ('2020', '2', '1', '2', '2'); +``` + +## Expected Output Data + +```text ++-----------+-------------+-------------------+ +| player_id | player_name | grand_slams_count | ++-----------+-------------+-------------------+ +| 2 | Federer | 5 | +| 1 | Nadal | 7 | ++-----------+-------------+-------------------+ +``` + +## SQL Solution + +```sql +WITH winners AS ( + SELECT wimbledon AS winner_id FROM championships_1783 + UNION ALL + SELECT fr_open AS winner_id FROM championships_1783 + UNION ALL + SELECT us_open AS winner_id FROM championships_1783 + UNION ALL + SELECT au_open AS winner_id FROM championships_1783 +) + +SELECT p.player_id,p.player_name,COUNT(p.player_id) AS grand_slams_count +FROM winners w +INNER JOIN players_1783 p ON w.winner_id = p.player_id +GROUP BY p.player_id,p.player_name; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `player_id`, `player_name`, `grand_slams_count` from `championships`, `winners`, `players`. + +### Result Grain + +One row per unique key in `GROUP BY p.player_id,p.player_name`. + +### Step-by-Step Logic + +1. Create CTE layers (`winners`) to decompose the logic into smaller, testable steps before the final SELECT. +2. 
CTE `winners`: reads `championships`. +3. Combine datasets using INNER JOIN, JOIN. Join predicates control row matching and prevent accidental cartesian growth. +4. Aggregate rows with COUNT grouped by p.player_id,p.player_name. +5. Project final output columns: `player_id`, `player_name`, `num_wins`. +6. Merge compatible result sets with `UNION`/`UNION ALL` before final projection. + +### Why This Works + +CTEs separate transformation stages, which makes dependencies explicit and easier to validate. Join conditions align related entities so each output row is built from the correct source records. Grouping keys define the reporting grain; aggregate functions then summarize values at exactly that grain. Set-union logic combines multiple valid pathways into one consistent output. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are sorting/grouping, join operations. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently. +- Every non-aggregated selected column must belong to the grouping grain. + diff --git a/medium/1783. Grand Slam Titles.sql b/medium/1783. Grand Slam Titles.sql deleted file mode 100644 index 2fd654b..0000000 --- a/medium/1783. Grand Slam Titles.sql +++ /dev/null @@ -1,14 +0,0 @@ -WITH winners AS ( - SELECT wimbledon AS winner_id FROM championships_1783 - UNION ALL - SELECT fr_open AS winner_id FROM championships_1783 - UNION ALL - SELECT us_open AS winner_id FROM championships_1783 - UNION ALL - SELECT au_open AS winner_id FROM championships_1783 -) - -SELECT p.player_id,p.player_name,COUNT(p.player_id) AS num_wins -FROM winners w -INNER JOIN players_1783 p ON w.winner_id = p.player_id -GROUP BY p.player_id,p.player_name; diff --git a/medium/180. Consecutive Numbers.md b/medium/180. 
Consecutive Numbers.md
new file mode 100644
index 0000000..77e50c8
--- /dev/null
+++ b/medium/180. Consecutive Numbers.md
@@ -0,0 +1,126 @@
+# Question 180: Consecutive Numbers
+
+**LeetCode URL:** https://leetcode.com/problems/consecutive-numbers/
+
+## Description
+
+Find all numbers that appear at least three times consecutively. Return the result table in any order. The result format is in the following example.
+
+## Table Schema Structure
+
+```sql
+Create table If Not Exists Logs (id int, num int);
+```
+
+## Sample Input Data
+
+```sql
+insert into Logs (id, num) values ('1', '1');
+insert into Logs (id, num) values ('2', '1');
+insert into Logs (id, num) values ('3', '1');
+insert into Logs (id, num) values ('4', '2');
+insert into Logs (id, num) values ('5', '1');
+insert into Logs (id, num) values ('6', '2');
+insert into Logs (id, num) values ('7', '2');
+```
+
+## Expected Output Data
+
+```text
++-----------------+
+| ConsecutiveNums |
++-----------------+
+| 1               |
++-----------------+
+```
+
+## SQL Solution
+
+```sql
+-----------------------------------------------------------
+--Solution 1 :
+-----------------------------------------------------------
+WITH cte AS(
+    SELECT id,Num,
+           LEAD(Num,1) OVER(ORDER BY id) as Next1,
+           LEAD(Num,2) OVER(ORDER BY id) as Next2
+    FROM logs_180
+)
+SELECT DISTINCT Num AS ConsecutiveNums
+FROM cte
+WHERE Num = Next1 AND Num = Next2;
+
+
+-----------------------------------------------------------
+--Solution 2 :
+-----------------------------------------------------------
+
+WITH cte AS(
+    SELECT id,Num,
+           LAG(Num) OVER(ORDER BY id) as Prev,
+           LEAD(Num) OVER(ORDER BY id) as Next
+    FROM logs_180
+)
+SELECT DISTINCT Num AS ConsecutiveNums
+FROM cte
+WHERE Num = Prev AND Num = Next;
+
+
+-----------------------------------------------------------
+--Solution 3 :
+-----------------------------------------------------------
+SELECT DISTINCT l1.Num AS ConsecutiveNums
+FROM logs_180 l1
+JOIN logs_180 l2 ON l1.id=l2.id-1 AND l1.Num=l2.Num
+JOIN logs_180 l3 ON l1.id=l3.id-2 AND l2.Num=l3.Num;
+
+
+-----------------------------------------------------------
+--Extensible Solution (Best) :
+-----------------------------------------------------------
+
+WITH ranked AS (
+    SELECT *,
+           (id-ROW_NUMBER() OVER (PARTITION BY num ORDER BY id)) AS diff
+    FROM logs_180
+)
+SELECT DISTINCT num AS "ConsecutiveNums"
+FROM ranked
+GROUP BY diff,num
+HAVING COUNT(id) >= 3;
+```
+
+## Solution Breakdown
+
+### Goal
+
+The query builds the required output column `ConsecutiveNums` from `logs`, `cte`, `ranked`.
+
+### Result Grain
+
+One group per unique `(diff, num)` key; `DISTINCT` then collapses the groups to one row per qualifying `num`.
+
+### Step-by-Step Logic
+
+1. Create CTE layers (`cte`, `ranked`) to decompose the logic into smaller, testable steps before the final SELECT.
+2. CTE `cte`: reads `logs`.
+3. Combine datasets using self-joins (third solution). Join predicates control row matching and prevent accidental cartesian growth.
+4. Aggregate rows with COUNT grouped by diff,num.
+5. Use window functions (`... OVER (...)`) to compute rankings/running metrics while preserving row-level detail.
+6. Filter aggregated groups in `HAVING`: COUNT(id) >= 3.
+7. Remove duplicate result tuples using `DISTINCT` where uniqueness is required.
+
+### Why This Works
+
+CTEs separate transformation stages, which makes dependencies explicit and easier to validate. Join conditions align related entities so each output row is built from the correct source records. Grouping keys define the reporting grain; aggregate functions then summarize values at exactly that grain. Window expressions calculate comparative metrics without collapsing rows too early. `HAVING` ensures only groups that satisfy business thresholds survive.
+
+### Performance Notes
+
+Primary cost drivers are sorting/grouping, window partitions and join operations. Indexes on join/filter/group keys typically provide the biggest speedup.
+
+### Common Pitfalls
+
+- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently.
+- Every non-aggregated selected column must belong to the grouping grain. +- Window `PARTITION BY`/`ORDER BY` choices change semantics; test with tied values and nulls. + diff --git a/medium/180. Consecutive Numbers.sql b/medium/180. Consecutive Numbers.sql deleted file mode 100644 index ccf81c1..0000000 --- a/medium/180. Consecutive Numbers.sql +++ /dev/null @@ -1,51 +0,0 @@ ------------------------------------------------------------ ---Solution 1 : ------------------------------------------------------------ -WITH cte AS( - SELECT id,Num, - LEAD(Num,1) OVER() as Next1, - LEAD(Num,2) OVER() as Next2 - FROM logs_180 -) -SELECT DISTINCT Num AS ConsecutiveNums -FROM cte -WHERE Num = Next1 AND Num = Next2; - - ------------------------------------------------------------ ---Solution 2 : ------------------------------------------------------------ - -WITH cte AS( - SELECT id,Num, - LAG(Num) OVER() as Prev, - LEAD(Num) OVER() as Next - FROM logs_180 -) -SELECT DISTINCT Num AS ConsecutiveNums -FROM cte -WHERE Num = Prev AND Num = Next; - - ------------------------------------------------------------ ---Solution 3 : ------------------------------------------------------------ -SELECT DISTINCT l1.Num AS ConsecutiveNums -FROM logs_180 l1 -JOIN logs_180 l2 ON l1.id=l2.id-1 AND l1.Num=l2.Num -JOIN logs_180 l3 ON l1.id=l3.id-2 AND l2.Num=l3.Num - - ------------------------------------------------------------ ---Exensible Solution (Best) : ------------------------------------------------------------ - -WITH ranked AS ( - SELECT *, - (id-ROW_NUMBER() OVER (PARTITION BY num ORDER BY id)) AS diff - FROM logs_180 -) -SELECT DISTINCT num AS "ConsecutiveNums" -FROM ranked -GROUP BY diff,num -HAVING COUNT(id) >= 3; diff --git a/medium/1811. Find Interview Candidates (Medium).md b/medium/1811. Find Interview Candidates (Medium).md new file mode 100644 index 0000000..8ca7545 --- /dev/null +++ b/medium/1811. 
Find Interview Candidates (Medium).md
@@ -0,0 +1,158 @@
+# Question 1811: Find Interview Candidates
+
+**LeetCode URL:** https://leetcode.com/problems/find-interview-candidates/
+
+## Description
+
+Write an SQL query to report the name and the mail of all interview candidates. An interview candidate is a user who achieved any medal in 3 or more consecutive contests or won the gold medal in 3 or more different contests (not necessarily consecutive). Return the result table in any order. The query result format is in the following example:
+
+Contests table:
+
+```text
++------------+------------+--------------+--------------+
+| contest_id | gold_medal | silver_medal | bronze_medal |
++------------+------------+--------------+--------------+
+| 190        | 1          | 5            | 2            |
+| 191        | 2          | 3            | 5            |
+| 192        | 5          | 2            | 3            |
+| 193        | 1          | 3            | 5            |
+| 194        | 4          | 5            | 2            |
+| 195        | 4          | 2            | 1            |
+| 196        | 1          | 5            | 2            |
++------------+------------+--------------+--------------+
+```
+
+Users table:
+
+```text
++---------+--------------------+-------+
+| user_id | mail               | name  |
++---------+--------------------+-------+
+| 1       | sarah@leetcode.com | Sarah |
+| 2       | bob@leetcode.com   | Bob   |
+| 3       | alice@leetcode.com | Alice |
+| 4       | hercy@leetcode.com | Hercy |
+| 5       | quarz@leetcode.com | Quarz |
++---------+--------------------+-------+
+```
+
+## Table Schema Structure
+
+```sql
+Create table If Not Exists Contests (contest_id int, gold_medal int, silver_medal int, bronze_medal int);
+Create table If Not Exists Users (user_id int, mail varchar(50), name varchar(30));
+```
+
+## Sample Input Data
+
+```sql
+insert into Contests (contest_id, gold_medal, silver_medal, bronze_medal) values ('190', '1', '5', '2');
+insert into Contests (contest_id, gold_medal, silver_medal, bronze_medal) values ('191', '2', '3', '5');
+insert into Contests (contest_id, gold_medal, silver_medal, bronze_medal) values ('192', '5', '2', '3');
+insert into Contests (contest_id, gold_medal, silver_medal, bronze_medal) values ('193', '1', '3', '5');
+insert into Contests (contest_id, gold_medal, silver_medal, bronze_medal) values ('194', '4', '5', '2');
+insert into Contests (contest_id, gold_medal, silver_medal, bronze_medal) values ('195', '4', '2', '1');
+insert into Contests (contest_id, gold_medal, silver_medal, bronze_medal) values ('196', '1', '5', '2');
+insert into Users (user_id, mail, name) values ('1', 'sarah@leetcode.com', 'Sarah');
+insert into Users (user_id, mail, name) values ('2', 'bob@leetcode.com', 'Bob');
+insert into Users (user_id, mail, name) values ('3', 'alice@leetcode.com', 'Alice');
+insert into Users (user_id, mail, name) values ('4', 'hercy@leetcode.com', 'Hercy');
+insert into Users (user_id, mail, name) values ('5', 'quarz@leetcode.com', 'Quarz');
+```
+
+## Expected Output Data
+
+```text
++-------+--------------------+
+| name  | mail               |
++-------+--------------------+
+| Sarah | sarah@leetcode.com |
+| Bob   | bob@leetcode.com   |
+| Alice | alice@leetcode.com |
+| Quarz | quarz@leetcode.com |
++-------+--------------------+
+```
+
+## SQL Solution
+
+```sql
+-- (Works for 3 consecutive contests only)
+
+
+WITH gold_medal_users AS (
+    SELECT DISTINCT gold_medal AS usr
+    FROM contests_1811
+    GROUP BY gold_medal
+    HAVING COUNT(contest_id) >= 3
+),
+all_users AS (
+    SELECT gold_medal AS usr,contest_id FROM contests_1811
+    UNION ALL
+    SELECT silver_medal AS usr,contest_id FROM contests_1811
+    UNION ALL
+    SELECT bronze_medal AS usr,contest_id FROM contests_1811
+),
+consecutive_medal_users AS (
+    SELECT DISTINCT a1.usr
+    FROM all_users a1
+    INNER JOIN all_users a2 ON a1.usr = a2.usr AND a1.contest_id - 1 = a2.contest_id
+    INNER JOIN all_users a3 ON a1.usr = a3.usr AND a1.contest_id + 1 = a3.contest_id
+),
+interview_candidates AS (
+    SELECT usr
+    FROM gold_medal_users
+    UNION
+    SELECT usr
+    FROM consecutive_medal_users
+)
+SELECT name,mail
+FROM interview_candidates ic
+INNER JOIN users_1811 u ON ic.usr = u.user_id;
+
+
+
+--OR-- (Generic Query : Works for any number of consecutive contests)
+
+
+
+WITH gold_medal_users AS (
+    SELECT DISTINCT gold_medal AS usr
+    FROM contests_1811
+    GROUP BY gold_medal
+    HAVING COUNT(contest_id) >= 3
+),
+all_users AS (
+    SELECT gold_medal AS usr,contest_id FROM contests_1811
+    UNION ALL
+    SELECT silver_medal AS usr,contest_id FROM contests_1811
+    UNION ALL
+    SELECT bronze_medal AS usr,contest_id FROM contests_1811
+),
+ranked_users AS (
+    SELECT *,
+           contest_id-ROW_NUMBER() OVER (PARTITION BY usr ORDER BY contest_id) AS diff
+    FROM all_users
+),
+consecutive_medal_users AS (
+    SELECT usr,contest_id,
+           COUNT(diff) OVER (PARTITION BY usr,diff) AS num_consecutive_contests
+    FROM ranked_users
+),
+interview_candidates AS (
+    SELECT usr
+    FROM gold_medal_users
+    UNION
+    SELECT DISTINCT usr
+    FROM consecutive_medal_users
+    WHERE num_consecutive_contests >= 3
+)
+SELECT name,mail
+FROM interview_candidates ic
+INNER JOIN users_1811 u ON ic.usr = u.user_id;
+```
+
+## Solution Breakdown
+
+### Goal
+
+The query builds the final result columns `name`, `mail` from `contests`, `all_users`, `gold_medal_users`, `consecutive_medal_users`.
+
+### Result Grain
+
+Row grain follows the post-filter join output.
+
+### Step-by-Step Logic
+
+1. Create CTE layers (`gold_medal_users`, `all_users`, `consecutive_medal_users`, `interview_candidates`, `ranked_users`) to decompose the logic into smaller, testable steps before the final SELECT.
+2. CTE `gold_medal_users`: reads `contests`.
+3. CTE `all_users`: reads `contests`.
+4. CTE `consecutive_medal_users`: reads `all_users`, joins related entities.
+5. CTE `interview_candidates`: reads `gold_medal_users`, `consecutive_medal_users`.
+6. Combine datasets using INNER JOINs. Join predicates control row matching and prevent accidental cartesian growth.
+7. Use window functions (`... OVER (...)`) to compute rankings/running metrics while preserving row-level detail.
+8. Project final output columns: `name`, `mail`.
+9. Merge compatible result sets with `UNION`/`UNION ALL` before final projection.
+
+### Why This Works
+
+CTEs separate transformation stages, which makes dependencies explicit and easier to validate. Join conditions align related entities so each output row is built from the correct source records. Window expressions calculate comparative metrics without collapsing rows too early. Set-union logic combines multiple valid pathways into one consistent output.
The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are window partitions, join operations. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently. +- Window `PARTITION BY`/`ORDER BY` choices change semantics; test with tied values and nulls. + diff --git a/medium/1811. Find Interview Candidates (Medium).sql b/medium/1811. Find Interview Candidates (Medium).sql deleted file mode 100644 index 624e534..0000000 --- a/medium/1811. Find Interview Candidates (Medium).sql +++ /dev/null @@ -1,73 +0,0 @@ --- (Works for 3 consecutive contests only) - - -WITH gold_medal_users AS ( - SELECT DISTINCT gold_medal AS usr - FROM contests_1811 - GROUP BY gold_medal - HAVING COUNT(contest_id) = 3 -), -all_users AS ( - SELECT gold_medal AS usr,contest_id FROM contests_1811 - UNION ALL - SELECT silver_medal AS usr,contest_id FROM contests_1811 - UNION ALL - SELECT bronz_medal AS usr,contest_id FROM contests_1811 -), -consecutive_medal_users AS ( - SELECT DISTINCT a1.usr - FROM all_users a1 - INNER JOIN all_users a2 ON a1.usr = a2.usr AND a1.contest_id - 1 = a2.contest_id - INNER JOIN all_users a3 ON a1.usr = a3.usr AND a1.contest_id + 1 = a3.contest_id -), -inerview_candidates AS ( - SELECT usr - FROM gold_medal_users - UNION - SELECT usr - FROM consecutive_medal_users -) -SELECT name,mail -FROM inerview_candidates ic -INNER JOIN users_1811 u ON ic.usr = u.user_id; - - - ---OR-- (Generic Query : Works for any number of consecutive contests) - - - -WITH gold_medal_users AS ( - SELECT DISTINCT gold_medal AS usr - FROM contests_1811 - GROUP BY gold_medal - HAVING COUNT(contest_id) = 3 -), -all_users AS ( - SELECT gold_medal AS usr,contest_id FROM contests_1811 - UNION ALL - SELECT silver_medal AS usr,contest_id FROM contests_1811 - UNION ALL - SELECT bronz_medal AS usr,contest_id FROM 
contests_1811
-),
-ranked_users AS (
-    SELECT *,
-           contest_id-ROW_NUMBER() OVER (PARTITION BY usr ORDER BY contest_id) AS diff
-    FROM all_users
-),
-consecutive_medal_users AS (
-    SELECT usr,contest_id,
-           COUNT(diff) OVER (PARTITION BY usr,diff) AS num_consecutive_contests
-    FROM ranked_users
-),
-inerview_candidates AS (
-    SELECT usr
-    FROM gold_medal_users
-    UNION
-    SELECT DISTINCT usr
-    FROM consecutive_medal_users
-    WHERE num_consecutive_contests >= 3
-)
-SELECT name,mail
-FROM inerview_candidates ic
-INNER JOIN users_1811 u ON ic.usr = u.user_id;
diff --git a/medium/1831. Maximum Transaction Each Day (Medium).md b/medium/1831. Maximum Transaction Each Day (Medium).md
new file mode 100644
index 0000000..68fbf87
--- /dev/null
+++ b/medium/1831. Maximum Transaction Each Day (Medium).md
@@ -0,0 +1,112 @@
+# Question 1831: Maximum Transaction Each Day
+
+**LeetCode URL:** https://leetcode.com/problems/maximum-transaction-each-day/
+
+## Description
+
+Write an SQL query to report the IDs of the transactions with the maximum amount on their respective day. In case of a tie, return all of them. Return the result table in ascending order by transaction_id. The query result format is in the following example:
+
+Transactions table:
+
+```text
++----------------+--------------------+--------+
+| transaction_id | day                | amount |
++----------------+--------------------+--------+
+| 8              | 2021-4-3 15:57:28  | 57     |
+| 9              | 2021-4-28 08:47:25 | 21     |
+| 1              | 2021-4-29 13:28:30 | 58     |
+| 5              | 2021-4-28 16:39:59 | 40     |
+| 6              | 2021-4-29 23:39:28 | 58     |
++----------------+--------------------+--------+
+```
+
+Result table:
+
+```text
++----------------+
+| transaction_id |
++----------------+
+| 1              |
+| 5              |
+| 6              |
+| 8              |
++----------------+
+```
+
+"2021-4-3" --> We have one transaction with ID 8, so we add 8 to the result table.
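The greatest-per-group pattern used below — compute the per-day maximum with a window `MAX`, then keep every row that matches it so ties survive — can be checked on the sample data with Python's bundled `sqlite3` module (window functions require SQLite 3.25+; the unsuffixed table name is illustrative):

```python
import sqlite3  # window functions need the bundled SQLite to be >= 3.25

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
# Dates are zero-padded here because SQLite's DATE() parser requires it.
cur.executescript("""
CREATE TABLE transactions (transaction_id INT, day TEXT, amount INT);
INSERT INTO transactions VALUES
  (8,'2021-04-03 15:57:28',57),(9,'2021-04-28 08:47:25',21),
  (1,'2021-04-29 13:28:30',58),(5,'2021-04-28 16:39:59',40),
  (6,'2021-04-29 23:39:28',58);
""")
# Window MAX per calendar day; every row keeps its day's maximum alongside it,
# so the outer filter retains all tied top transactions.
rows = cur.execute("""
WITH t AS (
  SELECT transaction_id, amount,
         MAX(amount) OVER (PARTITION BY DATE(day)) AS max_amount
  FROM transactions
)
SELECT transaction_id FROM t
WHERE amount = max_amount
ORDER BY transaction_id
""").fetchall()
print([r[0] for r in rows])  # [1, 5, 6, 8]
```

`DATE(day)` plays the role PostgreSQL's `DATE_TRUNC('DAY', day)` plays in the solutions below: both collapse timestamps to their calendar day before partitioning.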
+
+## Table Schema Structure
+
+```sql
+Create table If Not Exists Transactions (transaction_id int, day timestamp, amount int);
+```
+
+## Sample Input Data
+
+```sql
+insert into Transactions (transaction_id, day, amount) values ('8', '2021-4-3 15:57:28', '57');
+insert into Transactions (transaction_id, day, amount) values ('9', '2021-4-28 08:47:25', '21');
+insert into Transactions (transaction_id, day, amount) values ('1', '2021-4-29 13:28:30', '58');
+insert into Transactions (transaction_id, day, amount) values ('5', '2021-4-28 16:39:59', '40');
+insert into Transactions (transaction_id, day, amount) values ('6', '2021-4-29 23:39:28', '58');
+```
+
+## Expected Output Data
+
+```text
++----------------+
+| transaction_id |
++----------------+
+| 1              |
+| 5              |
+| 6              |
+| 8              |
++----------------+
+```
+
+## SQL Solution
+
+```sql
+-- (Using MAX() function)
+
+WITH updated_transactions AS (
+    SELECT *,
+           MAX(amount) OVER (PARTITION BY DATE_TRUNC('DAY',day)) as max_amount
+    FROM transactions_1831
+)
+SELECT transaction_id
+FROM updated_transactions
+WHERE amount = max_amount
+ORDER BY transaction_id;
+
+--OR (Without using MAX() function)
+
+WITH updated_transactions AS (
+    SELECT *,
+           RANK() OVER (PARTITION BY DATE_TRUNC('DAY',day) ORDER BY amount DESC) as rn
+    FROM transactions_1831
+)
+SELECT transaction_id
+FROM updated_transactions
+WHERE rn = 1
+ORDER BY transaction_id;
+
+--OR (Without using window functions)
+
+WITH max_amounts AS (
+    SELECT DISTINCT t1.day,t1.amount AS max_amt
+    FROM transactions_1831 t1
+    LEFT JOIN transactions_1831 t2 ON DATE_TRUNC('DAY',t1.day)=DATE_TRUNC('DAY',t2.day) AND t2.amount>t1.amount
+    WHERE t2.transaction_id IS NULL
+)
+SELECT transaction_id
+FROM transactions_1831
+WHERE (day,amount) IN (SELECT * FROM max_amounts)
+ORDER BY transaction_id;
+```
+
+## Solution Breakdown
+
+### Goal
+
+The query builds the final result column `transaction_id` from `transactions`, `updated_transactions`, `max_amounts`.
+
+### Result Grain
+
+Row grain follows the post-filter join output.
+
+### Step-by-Step Logic
+
+1. Create CTE layers (`updated_transactions`, `max_amounts`) to decompose the logic into smaller, testable steps before the final SELECT.
+2. CTE `updated_transactions`: reads `transactions`, computes window metrics.
+3. Combine datasets using a LEFT self-join (third solution only). Join predicates control row matching and prevent accidental cartesian growth.
+4. Apply row-level filtering in `WHERE`: (day,amount) IN (SELECT * FROM max_amounts).
+5. Use window functions (`... OVER (...)`) to compute rankings/running metrics while preserving row-level detail.
+6. Project final output columns: `transaction_id`.
+7. Order output deterministically with `ORDER BY transaction_id`.
+
+### Why This Works
+
+CTEs separate transformation stages, which makes dependencies explicit and easier to validate. Join conditions align related entities so each output row is built from the correct source records. Window expressions calculate comparative metrics without collapsing rows too early. Predicate filtering removes irrelevant rows before expensive downstream computation. The final projection exposes only the columns required by the result contract.
+
+### Performance Notes
+
+Primary cost drivers are sorting/grouping, window partitions, join operations and subquery execution. Indexes on join/filter/group keys typically provide the biggest speedup.
+
+### Common Pitfalls
+
+- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently.
+- Window `PARTITION BY`/`ORDER BY` choices change semantics; test with tied values and nulls.
+- If nullable columns are involved, validate null-handling explicitly (`IS NULL`, `COALESCE`).
+
diff --git a/medium/1831. Maximum Transaction Each Day (Medium).sql b/medium/1831. Maximum Transaction Each Day (Medium).sql
deleted file mode 100644
index ef42bad..0000000
--- a/medium/1831.
Maximum Transaction Each Day (Medium).sql +++ /dev/null @@ -1,36 +0,0 @@ --- (Using MAX() function) - -WITH updated_transactions AS ( - SELECT *, - MAX(amount) OVER (PARTITION BY DATE_TRUNC('DAY',day)) as max_amount - FROM transactions_1831 -) -SELECT transactions_id -FROM updated_transactions -WHERE amount = max_amount -ORDER BY transactions_id; - ---OR (Without using MAX() function) - -WITH updated_transactions AS ( - SELECT *, - RANK() OVER (PARTITION BY DATE_TRUNC('DAY',day) ORDER BY amount DESC) as rn - FROM transactions_1831 -) -SELECT transactions_id -FROM updated_transactions -WHERE rn= 1 -ORDER BY transactions_id; - ---OR (Without using window functions) - -WITH max_amounts AS ( - SELECT DISTINCT t1.day,t1.amount AS max_amt - FROM transactions_1831 t1 - LEFT JOIN transactions_1831 t2 ON DATE_TRUNC('DAY',t1.day)=DATE_TRUNC('DAY',t2.day) AND t2.amount>t1.amount - WHERE t2.transactions_id IS NULL -) -SELECT transactions_id -FROM transactions_1831 -WHERE (day,amount) IN (SELECT * FROM max_amounts) -ORDER BY transactions_id; diff --git a/medium/184. Department Highest Salary.md b/medium/184. Department Highest Salary.md new file mode 100644 index 0000000..148fb79 --- /dev/null +++ b/medium/184. Department Highest Salary.md @@ -0,0 +1,95 @@ +# Question 184: Department Highest Salary + +**LeetCode URL:** https://leetcode.com/problems/department-highest-salary/ + +## Description + +Write a solution to find employees who have the highest salary in each of the departments. Return the result table in any order. The result format is in the following example. 
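The per-department top-salary idea — rank employees inside each department and keep rank 1 so tied top earners all survive — can be reproduced on the sample data with Python's bundled `sqlite3` module (window functions require SQLite 3.25+; the unsuffixed table and column names here are illustrative):

```python
import sqlite3  # window functions need the bundled SQLite to be >= 3.25

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
CREATE TABLE employee (id INT, name TEXT, salary INT, department_id INT);
CREATE TABLE department (id INT, name TEXT);
INSERT INTO employee VALUES (1,'Joe',70000,1),(2,'Jim',90000,1),
  (3,'Henry',80000,2),(4,'Sam',60000,2),(5,'Max',90000,1);
INSERT INTO department VALUES (1,'IT'),(2,'Sales');
""")
# DENSE_RANK within each department by salary; filtering on rank 1 keeps
# every employee tied for the department's highest salary.
rows = cur.execute("""
WITH ranked AS (
  SELECT d.name AS dep, e.name AS emp, e.salary,
         DENSE_RANK() OVER (PARTITION BY d.id ORDER BY e.salary DESC) AS rnk
  FROM employee e JOIN department d ON e.department_id = d.id
)
SELECT dep, emp, salary FROM ranked WHERE rnk = 1 ORDER BY dep, emp
""").fetchall()
print(rows)  # [('IT', 'Jim', 90000), ('IT', 'Max', 90000), ('Sales', 'Henry', 80000)]
```

The rank filter is the window-function counterpart of the `(department_id, MAX(salary))` subquery approach shown in the solution below; both return all tied top earners.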
+
+## Table Schema Structure
+
+```sql
+Create table If Not Exists Employee (id int, name varchar(255), salary int, departmentId int);
+Create table If Not Exists Department (id int, name varchar(255));
+```
+
+## Sample Input Data
+
+```sql
+insert into Employee (id, name, salary, departmentId) values ('1', 'Joe', '70000', '1');
+insert into Employee (id, name, salary, departmentId) values ('2', 'Jim', '90000', '1');
+insert into Employee (id, name, salary, departmentId) values ('3', 'Henry', '80000', '2');
+insert into Employee (id, name, salary, departmentId) values ('4', 'Sam', '60000', '2');
+insert into Employee (id, name, salary, departmentId) values ('5', 'Max', '90000', '1');
+insert into Department (id, name) values ('1', 'IT');
+insert into Department (id, name) values ('2', 'Sales');
+```
+
+## Expected Output Data
+
+```text
++------------+----------+--------+
+| Department | Employee | Salary |
++------------+----------+--------+
+| IT         | Jim      | 90000  |
+| Sales      | Henry    | 80000  |
+| IT         | Max      | 90000  |
++------------+----------+--------+
+```
+
+## SQL Solution
+
+```sql
+SELECT d.name AS Department,e.name AS Employee,e.salary AS Salary
+FROM employee_184 e
+JOIN department_184 d ON e.department_id = d.id
+WHERE (e.department_id,e.salary) IN (SELECT department_id,MAX(salary)
+                                     FROM employee_184
+                                     GROUP BY department_id);
+
+-- OR --
+
+WITH cte AS(
+SELECT d.name AS Department,e.name AS Employee,e.salary AS Salary,
+       DENSE_RANK() OVER w AS rnk
+FROM employee_184 e
+JOIN department_184 d ON e.department_id = d.id
+WINDOW w AS (PARTITION BY d.name ORDER BY e.salary DESC)
+)
+
+SELECT Department,Employee,Salary
+FROM cte
+WHERE rnk = 1;
+```
+
+## Solution Breakdown
+
+### Goal
+
+The query builds the final result columns `Department`, `Employee`, `Salary` from `employee`, `department`, `cte`.
+
+### Result Grain
+
+Row grain follows the post-filter join output.
+
+### Step-by-Step Logic
+
+1. Create CTE layers (`cte`) to decompose the logic into smaller, testable steps before the final SELECT.
+2.
CTE `cte`: reads `employee`, `department`.
+3. Combine datasets using a JOIN. Join predicates control row matching and prevent accidental cartesian growth.
+4. Apply row-level filtering in `WHERE`: rnk = 1.
+5. Project final output columns: `Department`, `Employee`, `Salary`.
+
+### Why This Works
+
+CTEs separate transformation stages, which makes dependencies explicit and easier to validate. Join conditions align related entities so each output row is built from the correct source records. Predicate filtering removes irrelevant rows before expensive downstream computation. The final projection exposes only the columns required by the result contract.
+
+### Performance Notes
+
+Primary cost drivers are join operations. Indexes on join/filter/group keys typically provide the biggest speedup.
+
+### Common Pitfalls
+
+- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently.
+- If nullable columns are involved, validate null-handling explicitly (`IS NULL`, `COALESCE`).
+
diff --git a/medium/184. Department Highest Salary.sql b/medium/184. Department Highest Salary.sql
deleted file mode 100644
index ff91af1..0000000
--- a/medium/184. Department Highest Salary.sql
+++ /dev/null
@@ -1,20 +0,0 @@
-SELECT d.name AS Department,e.name AS Employee,e.salary AS Salary
-FROM employee_184 e
-JOIN department_184 d ON e.department_id = d.id
-WHERE (e.department_id,e.salary) IN (SELECT department_id,MAX(salary)
-                                     FROM employee_184
-                                     GROUP BY department_id);
-
-(OR)
-
-WITH cte AS(
-SELECT d.name AS Department,e.name AS Employee,
-       DENSE_RANK() OVER w rnk
-FROM employee_184 e
-JOIN department_184 d ON e.department_id = d.id
-WINDOW w AS (PARTITION BY d.name ORDER BY e.salary DESC)
-)
-
-SELECT Department,Employee
-FROM cte
-WHERE rnk = 1;
diff --git a/medium/1841. League Statistics (Medium).md b/medium/1841. League Statistics (Medium).md
new file mode 100644
index 0000000..4bdfcbb
--- /dev/null
+++ b/medium/1841.
League Statistics (Medium).md
@@ -0,0 +1,106 @@
+# Question 1841: League Statistics
+
+**LeetCode URL:** https://leetcode.com/problems/league-statistics/
+
+## Description
+
+Write an SQL query to report the statistics of the league. Return the result table in descending order by points. In case of a tie, order the records by goal_diff in descending order, then by team_name in lexicographical order. The query result format is in the following example:
+
+Teams table:
+
+```text
++---------+-----------+
+| team_id | team_name |
++---------+-----------+
+| 1       | Ajax      |
+| 4       | Dortmund  |
+| 6       | Arsenal   |
++---------+-----------+
+```
+
+Matches table:
+
+```text
++--------------+--------------+-----------------+-----------------+
+| home_team_id | away_team_id | home_team_goals | away_team_goals |
++--------------+--------------+-----------------+-----------------+
+| 1            | 4            | 0               | 1               |
+| 1            | 6            | 3               | 3               |
+| 4            | 1            | 5               | 2               |
+| 6            | 1            | 0               | 0               |
++--------------+--------------+-----------------+-----------------+
+```
+
+Result table:
+
+```text
++-----------+----------------+--------+----------+--------------+-----------+
+| team_name | matches_played | points | goal_for | goal_against | goal_diff |
++-----------+----------------+--------+----------+--------------+-----------+
+| Dortmund  | 2              | 6      | 6        | 2            | 4         |
+| Arsenal   | 2              | 2      | 3        | 3            | 0         |
+| Ajax      | 4              | 2      | 5        | 9            | -4        |
++-----------+----------------+--------+----------+--------------+-----------+
+```
+
+Ajax (team_id=1) played 4 matches: 2 losses and 2 draws.
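The standings technique used below — view every match from both teams' perspectives via a union, score it with a `CASE`, then aggregate per team — can be reproduced on the sample data with Python's bundled `sqlite3` module (shortened, unsuffixed table and column names are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
CREATE TABLE teams (team_id INT, team_name TEXT);
CREATE TABLE matches (home_id INT, away_id INT, home_goals INT, away_goals INT);
INSERT INTO teams VALUES (1,'Ajax'),(4,'Dortmund'),(6,'Arsenal');
INSERT INTO matches VALUES (1,4,0,1),(1,6,3,3),(4,1,5,2),(6,1,0,0);
""")
rows = cur.execute("""
WITH all_matches AS (          -- each match appears once per participating team
  SELECT home_id AS team, home_goals AS gf, away_goals AS ga FROM matches
  UNION ALL
  SELECT away_id, away_goals, home_goals FROM matches
)
SELECT t.team_name,
       COUNT(*) AS matches_played,
       SUM(CASE WHEN gf > ga THEN 3 WHEN gf = ga THEN 1 ELSE 0 END) AS points,
       SUM(gf) AS goal_for,
       SUM(ga) AS goal_against,
       SUM(gf) - SUM(ga) AS goal_diff
FROM all_matches m JOIN teams t ON m.team = t.team_id
GROUP BY t.team_name
ORDER BY points DESC, goal_diff DESC
""").fetchall()
print(rows)
# [('Dortmund', 2, 6, 6, 2, 4), ('Arsenal', 2, 2, 3, 3, 0), ('Ajax', 4, 2, 5, 9, -4)]
```

Swapping the home/away columns in the second branch of the union means a single `CASE` on `gf`/`ga` scores both perspectives, so no separate away-points expression is needed.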
+ +## Table Schema Structure + +```sql +Create table If Not Exists Teams (team_id int, team_name varchar(20)); +Create table If Not Exists Matches (home_team_id int, away_team_id int, home_team_goals int, away_team_goals int); +``` + +## Sample Input Data + +```sql +insert into Teams (team_id, team_name) values ('1', 'Ajax'); +insert into Teams (team_id, team_name) values ('4', 'Dortmund'); +insert into Teams (team_id, team_name) values ('6', 'Arsenal'); +insert into Matches (home_team_id, away_team_id, home_team_goals, away_team_goals) values ('1', '4', '0', '1'); +insert into Matches (home_team_id, away_team_id, home_team_goals, away_team_goals) values ('1', '6', '3', '3'); +insert into Matches (home_team_id, away_team_id, home_team_goals, away_team_goals) values ('4', '1', '5', '2'); +insert into Matches (home_team_id, away_team_id, home_team_goals, away_team_goals) values ('6', '1', '0', '0'); +``` + +## Expected Output Data + +```text ++-----------+----------------+--------+----------+--------------+-----------+ +| team_name | matches_played | points | goal_for | goal_against | goal_diff | ++-----------+----------------+--------+----------+--------------+-----------+ +| Dortmund | 2 | 6 | 6 | 2 | 4 | +| Arsenal | 2 | 2 | 3 | 3 | 0 | +| Ajax | 4 | 2 | 5 | 9 | -4 | ++-----------+----------------+--------+----------+--------------+-----------+ +``` + +## SQL Solution + +```sql +WITH all_matches AS( + SELECT home_team_id,away_team_id,home_team_goals,away_team_goals + FROM matches_1841 + UNION + SELECT away_team_id,home_team_id,away_team_goals,home_team_goals + FROM matches_1841 +), +report_data AS ( + SELECT *, + CASE WHEN home_team_goals < away_team_goals THEN 0 + WHEN home_team_goals > away_team_goals THEN 3 + ELSE 1 + END AS home_team_points, + CASE WHEN home_team_goals < away_team_goals THEN 3 + WHEN home_team_goals > away_team_goals THEN 0 + ELSE 1 + END AS away_team_points + FROM all_matches +) +SELECT t.team_name AS team_name, + COUNT(t.team_name) AS 
matches_played,
+       SUM(home_team_points) AS points,
+       SUM(home_team_goals) AS goal_for,
+       SUM(away_team_goals) AS goal_against,
+       SUM(home_team_goals)-SUM(away_team_goals) AS goal_diff
+FROM report_data r
+INNER JOIN teams_1841 t ON r.home_team_id = t.team_id
+GROUP BY t.team_name
+ORDER BY points DESC,goal_diff DESC,t.team_name;
+```
+
+## Solution Breakdown
+
+### Goal
+
+The query builds the final result columns `team_name`, `matches_played`, `points`, `goal_for`, `goal_against`, `goal_diff` from `matches`, `all_matches`, `report_data`, `teams`.
+
+### Result Grain
+
+One row per unique key in `GROUP BY t.team_name`.
+
+### Step-by-Step Logic
+
+1. Create CTE layers (`all_matches`, `report_data`) to decompose the logic into smaller, testable steps before the final SELECT.
+2. CTE `all_matches`: reads `matches`.
+3. CTE `report_data`: reads `all_matches`.
+4. Combine datasets using an INNER JOIN. Join predicates control row matching and prevent accidental cartesian growth.
+5. Aggregate rows with COUNT, SUM grouped by t.team_name.
+6. Project final output columns: `team_name`, `matches_played`, `points`, `goal_for`, `goal_against`, `goal_diff`.
+7. Order output deterministically with `ORDER BY points DESC,goal_diff DESC,t.team_name`.
+
+### Why This Works
+
+CTEs separate transformation stages, which makes dependencies explicit and easier to validate. Join conditions align related entities so each output row is built from the correct source records. Grouping keys define the reporting grain; aggregate functions then summarize values at exactly that grain. The final projection exposes only the columns required by the result contract.
+
+### Performance Notes
+
+Primary cost drivers are sorting/grouping and join operations. Indexes on join/filter/group keys typically provide the biggest speedup.
+
+### Common Pitfalls
+
+- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently.
+- Every non-aggregated selected column must belong to the grouping grain. + diff --git a/medium/1841. League Statistics (Medium).sql b/medium/1841. League Statistics (Medium).sql deleted file mode 100644 index 26aa4be..0000000 --- a/medium/1841. League Statistics (Medium).sql +++ /dev/null @@ -1,29 +0,0 @@ -WITH all_matches AS( - SELECT home_team_id,away_team_id,home_team_goals,away_team_goals - FROM matches_1841 - UNION - SELECT away_team_id,home_team_id,away_team_goals,home_team_goals - FROM matches_1841 -), -report_data AS ( - SELECT *, - CASE WHEN home_team_goals < away_team_goals THEN 0 - WHEN home_team_goals > away_team_goals THEN 3 - ELSE 1 - END AS home_team_points, - CASE WHEN home_team_goals < away_team_goals THEN 3 - WHEN home_team_goals > away_team_goals THEN 0 - ELSE 1 - END AS away_team_points - FROM all_matches -) -SELECT t.team_name AS team_name, - COUNT(t.team_name) AS matches_played, - SUM(home_team_points) AS points, - SUM(home_team_goals) AS goals_for, - SUM(away_team_goals) AS goals_against, - SUM(home_team_goals)-SUM(away_team_goals) AS goals_diff -FROM report_data r -INNER JOIN teams_1841 t ON r.home_team_id = t.team_id -GROUP BY t.team_name -ORDER BY points DESC,goals_diff DESC,t.team_name; diff --git a/medium/1843. Suspicious Bank Accounts (Medium).md b/medium/1843. Suspicious Bank Accounts (Medium).md new file mode 100644 index 0000000..6773a68 --- /dev/null +++ b/medium/1843. Suspicious Bank Accounts (Medium).md @@ -0,0 +1,88 @@ +# Question 1843: Suspicious Bank Accounts + +**LeetCode URL:** https://leetcode.com/problems/suspicious-bank-accounts/ + +## Description + +A bank account is suspicious if the total income of the account exceeds the `max_income` of that account for two or more consecutive months; the total income of a month is the sum of all 'Creditor' transactions in that month. Write an SQL query to report the IDs of all suspicious bank accounts. Return the result table in ascending order by transaction_id. 
The query result format is in the following example:
+
+Accounts table:
+
+```text
++------------+------------+
+| account_id | max_income |
++------------+------------+
+| 3          | 21000      |
+| 4          | 10400      |
++------------+------------+
+```
+
+Transactions table:
+
+```text
++----------------+------------+----------+--------+---------------------+
+| transaction_id | account_id | type     | amount | day                 |
++----------------+------------+----------+--------+---------------------+
+| 2              | 3          | Creditor | 107100 | 2021-06-02 11:38:14 |
+| 4              | 4          | Creditor | 10400  | 2021-06-20 12:39:18 |
+| 11             | 4          | Debtor   | 58800  | 2021-07-23 12:41:55 |
+| 1              | 4          | Creditor | 49300  | 2021-05-03 16:11:04 |
+| 15             | 3          | Debtor   | 75500  | 2021-05-23 14:40:20 |
+| 10             | 3          | Creditor | 102100 | 2021-06-15 10:37:16 |
+| 14             | 4          | Creditor | 56300  | 2021-07-21 12:12:25 |
+| 19             | 4          | Debtor   | 101100 | 2021-05-09 15:21:49 |
+| 8              | 3          | Creditor | 64900  | 2021-07-26 15:09:56 |
+| 7              | 3          | Creditor | 90900  | 2021-06-14 11:23:07 |
++----------------+------------+----------+--------+---------------------+
+```
+
+Result table:
+
+```text
++------------+
+| account_id |
++------------+
+| 3          |
++------------+
+```
+
+For account 3: in 6-2021, the user had an income of 107100 + 102100 + 90900 = 300100.
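The two-stage logic (monthly creditor income, then two consecutive over-limit months) is easy to sanity-check outside the database. A minimal Python sketch over the sample rows above (hard-coded tuples stand in for the Accounts and Transactions tables; this is a cross-check, not part of the repository):

```python
from collections import defaultdict

# Sample rows from above: (account_id, type, amount, month)
accounts = {3: 21000, 4: 10400}
transactions = [
    (3, "Creditor", 107100, 6), (4, "Creditor", 10400, 6),
    (4, "Debtor", 58800, 7), (4, "Creditor", 49300, 5),
    (3, "Debtor", 75500, 5), (3, "Creditor", 102100, 6),
    (4, "Creditor", 56300, 7), (4, "Debtor", 101100, 5),
    (3, "Creditor", 64900, 7), (3, "Creditor", 90900, 6),
]

# Stage 1 (the monthly_income_data CTE): total 'Creditor' income per (account, month)
income = defaultdict(int)
for acc, typ, amount, month in transactions:
    if typ == "Creditor":
        income[(acc, month)] += amount

# HAVING clause: keep only months where the total exceeded the account's max_income
over_limit = {(acc, m) for (acc, m), total in income.items() if total > accounts[acc]}

# Stage 2 (the self-join on m1.mnth + 1 = m2.mnth): two consecutive over-limit months
suspicious = sorted({acc for acc, m in over_limit if (acc, m + 1) in over_limit})
print(suspicious)  # -> [3]
```

Note that grouping by calendar month alone, as the SQL's `EXTRACT(MONTH FROM day)` does, would conflate the same month across different years; the sample data stays within 2021, so it does not matter here.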
+ +## Table Schema Structure + +```sql +Create table If Not Exists Accounts (account_id int, max_income int); +Create table If Not Exists Transactions (transaction_id int, account_id int, type ENUM('creditor', 'debtor'), amount int, day datetime); +``` + +## Sample Input Data + +```sql +insert into Accounts (account_id, max_income) values ('3', '21000'); +insert into Accounts (account_id, max_income) values ('4', '10400'); +insert into Transactions (transaction_id, account_id, type, amount, day) values ('2', '3', 'Creditor', '107100', '2021-06-02 11:38:14'); +insert into Transactions (transaction_id, account_id, type, amount, day) values ('4', '4', 'Creditor', '10400', '2021-06-20 12:39:18'); +insert into Transactions (transaction_id, account_id, type, amount, day) values ('11', '4', 'Debtor', '58800', '2021-07-23 12:41:55'); +insert into Transactions (transaction_id, account_id, type, amount, day) values ('1', '4', 'Creditor', '49300', '2021-05-03 16:11:04'); +insert into Transactions (transaction_id, account_id, type, amount, day) values ('15', '3', 'Debtor', '75500', '2021-05-23 14:40:20'); +insert into Transactions (transaction_id, account_id, type, amount, day) values ('10', '3', 'Creditor', '102100', '2021-06-15 10:37:16'); +insert into Transactions (transaction_id, account_id, type, amount, day) values ('14', '4', 'Creditor', '56300', '2021-07-21 12:12:25'); +insert into Transactions (transaction_id, account_id, type, amount, day) values ('19', '4', 'Debtor', '101100', '2021-05-09 15:21:49'); +insert into Transactions (transaction_id, account_id, type, amount, day) values ('8', '3', 'Creditor', '64900', '2021-07-26 15:09:56'); +insert into Transactions (transaction_id, account_id, type, amount, day) values ('7', '3', 'Creditor', '90900', '2021-06-14 11:23:07'); +``` + +## Expected Output Data + +```text ++------------+ +| account_id | ++------------+ +| 3 | ++------------+ +``` + +## SQL Solution + +```sql +WITH monthly_income_data AS ( + SELECT 
t.account_id,EXTRACT(MONTH FROM day) AS mnth,SUM(amount) AS total_income + FROM transactions_1843 t + INNER JOIN accounts_1843 a ON t.account_id = a.account_id AND t.type = 'Creditor' + GROUP BY t.account_id,EXTRACT(MONTH FROM day),a.max_income + HAVING SUM(amount)>a.max_income +) +SELECT DISTINCT m1.account_id +FROM monthly_income_data m1 +INNER JOIN monthly_income_data m2 +ON m1.account_id=m2.account_id AND m1.mnth + 1 = m2.mnth; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `account_id` from `day`, `transactions`, `accounts`, `monthly_income_data`. + +### Result Grain + +One row per distinct combination of projected columns. + +### Step-by-Step Logic + +1. Create CTE layers (`monthly_income_data`) to decompose the logic into smaller, testable steps before the final SELECT. +2. CTE `monthly_income_data`: reads `day`, `transactions`, `accounts`, joins related entities. +3. Combine datasets using INNER JOIN, JOIN. Join predicates control row matching and prevent accidental cartesian growth. +4. Project final output columns: `account_id`. +5. Remove duplicate result tuples using `DISTINCT` where uniqueness is required. + +### Why This Works + +CTEs separate transformation stages, which makes dependencies explicit and easier to validate. Join conditions align related entities so each output row is built from the correct source records. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are join operations. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently. + diff --git a/medium/1843. Suspicious Bank Accounts (Medium).sql b/medium/1843. Suspicious Bank Accounts (Medium).sql deleted file mode 100644 index 331615b..0000000 --- a/medium/1843. 
Suspicious Bank Accounts (Medium).sql +++ /dev/null @@ -1,11 +0,0 @@ -WITH monthly_income_data AS ( - SELECT t.account_id,EXTRACT(MONTH FROM day) AS mnth,SUM(amount) AS total_income - FROM transactions_1843 t - INNER JOIN accounts_1843 a ON t.account_id = a.account_id AND t.type = 'Creditor' - GROUP BY t.account_id,EXTRACT(MONTH FROM day),a.max_income - HAVING SUM(amount)>a.max_income -) -SELECT DISTINCT m1.account_id -FROM monthly_income_data m1 -INNER JOIN monthly_income_data m2 -ON m1.account_id=m2.account_id AND m1.mnth + 1 = m2.mnth; diff --git a/medium/1867. Orders With Maximum Quantity Above Average (Medium).md b/medium/1867. Orders With Maximum Quantity Above Average (Medium).md new file mode 100644 index 0000000..b188fc9 --- /dev/null +++ b/medium/1867. Orders With Maximum Quantity Above Average (Medium).md @@ -0,0 +1,88 @@ +# Question 1867: Orders With Maximum Quantity Above Average + +**LeetCode URL:** https://leetcode.com/problems/orders-with-maximum-quantity-above-average/ + +## Description + +Each row of `OrdersDetails` holds the quantity of one product in one order. The average quantity of an order is its total quantity divided by the number of different products in it. An order is imbalanced if its maximum quantity is strictly greater than the average quantity of every order, including itself. Write an SQL query to report the `order_id` of all imbalanced orders. Return the result table in any order. 
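The imbalanced-order rule can be cross-checked with a small Python sketch over this problem's sample quantities (hard-coded tuples stand in for the OrdersDetails table; a sanity check, not part of the repository):

```python
from collections import defaultdict

# Sample rows: (order_id, quantity); product_id does not affect the logic
rows = [(1, 12), (1, 10), (1, 15), (2, 8), (2, 4), (2, 6),
        (3, 5), (3, 18), (4, 2), (4, 8), (5, 9), (5, 9),
        (3, 20), (2, 4)]

quantities = defaultdict(list)
for order_id, q in rows:
    quantities[order_id].append(q)

# Per-order statistics, mirroring the orders_stat CTE
avg_q = {o: sum(qs) / len(qs) for o, qs in quantities.items()}
max_q = {o: max(qs) for o, qs in quantities.items()}

# "max_quantity > ALL (SELECT avg_quantity ...)" is equivalent to
# "max_quantity > the largest average", since every comparison must hold
threshold = max(avg_q.values())
imbalanced = sorted(o for o in quantities if max_q[o] > threshold)
print(imbalanced)  # -> [1, 3]
```

The rewrite of `> ALL` as a comparison against the maximum is also a common way to express this query without the `ALL` operator.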
+ +## Table Schema Structure + +```sql +Create table If Not Exists OrdersDetails (order_id int, product_id int, quantity int); +``` + +## Sample Input Data + +```sql +insert into OrdersDetails (order_id, product_id, quantity) values ('1', '1', '12'); +insert into OrdersDetails (order_id, product_id, quantity) values ('1', '2', '10'); +insert into OrdersDetails (order_id, product_id, quantity) values ('1', '3', '15'); +insert into OrdersDetails (order_id, product_id, quantity) values ('2', '1', '8'); +insert into OrdersDetails (order_id, product_id, quantity) values ('2', '4', '4'); +insert into OrdersDetails (order_id, product_id, quantity) values ('2', '5', '6'); +insert into OrdersDetails (order_id, product_id, quantity) values ('3', '3', '5'); +insert into OrdersDetails (order_id, product_id, quantity) values ('3', '4', '18'); +insert into OrdersDetails (order_id, product_id, quantity) values ('4', '5', '2'); +insert into OrdersDetails (order_id, product_id, quantity) values ('4', '6', '8'); +insert into OrdersDetails (order_id, product_id, quantity) values ('5', '7', '9'); +insert into OrdersDetails (order_id, product_id, quantity) values ('5', '8', '9'); +insert into OrdersDetails (order_id, product_id, quantity) values ('3', '9', '20'); +insert into OrdersDetails (order_id, product_id, quantity) values ('2', '9', '4'); +``` + +## Expected Output Data + +```text
++----------+
+| order_id |
++----------+
+| 1        |
+| 3        |
++----------+
+``` + +## SQL Solution + +```sql +WITH orders_stat AS ( + SELECT order_id, + AVG(quantity) AS avg_quantity, + MAX(quantity) AS max_quantity + FROM orders_details_1867 + GROUP BY order_id +) +SELECT order_id +FROM orders_stat +WHERE max_quantity > ALL( SELECT avg_quantity + FROM orders_stat); +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `order_id` from `orders_details`, `orders_stat`. + +### Result Grain + +One row per order whose maximum quantity clears the `> ALL` filter. 
+ +### Step-by-Step Logic + +1. Create CTE layers (`orders_stat`) to decompose the logic into smaller, testable steps before the final SELECT. +2. CTE `orders_stat`: reads `orders_details`. +3. Apply row-level filtering in `WHERE`: max_quantity > ALL( SELECT avg_quantity FROM orders_stat). +4. Project final output columns: `order_id`. + +### Why This Works + +CTEs separate transformation stages, which makes dependencies explicit and easier to validate. Predicate filtering removes irrelevant rows before expensive downstream computation. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are subquery execution. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- If nullable columns are involved, validate null-handling explicitly (`IS NULL`, `COALESCE`). + diff --git a/medium/1867. Orders With Maximum Quantity Above Average (Medium).sql b/medium/1867. Orders With Maximum Quantity Above Average (Medium).sql deleted file mode 100644 index bc3014b..0000000 --- a/medium/1867. Orders With Maximum Quantity Above Average (Medium).sql +++ /dev/null @@ -1,11 +0,0 @@ -WITH orders_stat AS ( - SELECT order_id, - AVG(quantity) AS avg_quantity, - MAX(quantity) AS max_quantity - FROM orders_details_1867 - GROUP BY order_id -) -SELECT order_id -FROM orders_stat -WHERE max_quantity > ALL( SELECT avg_quantity - FROM orders_stat); diff --git a/medium/1875. Group Employees of the Same Salary (Medium).md b/medium/1875. Group Employees of the Same Salary (Medium).md new file mode 100644 index 0000000..3b33b8d --- /dev/null +++ b/medium/1875. 
Group Employees of the Same Salary (Medium).md new file mode 100644 index 0000000..3b33b8d --- /dev/null +++ b/medium/1875. Group Employees of the Same Salary (Medium).md @@ -0,0 +1,80 @@ +# Question 1875: Group Employees of the Same Salary + +**LeetCode URL:** https://leetcode.com/problems/group-employees-of-the-same-salary/ + +## Description + +A company wants to divide its employees into teams such that all members of each team have the same salary. Every team must have at least two employees, all employees with the same salary must be on the same team, and an employee whose salary is unique is not assigned to any team. Team IDs rank the team salaries in ascending order, so the team with the lowest salary has `team_id` 1. Write an SQL query to report each teamed employee together with their `team_id`. Return the result table ordered by `team_id` and, in case of a tie, by `employee_id`. + +## Table Schema Structure + +```sql +Create table If Not Exists Employees (employee_id int, name varchar(30), salary int); +``` + +## Sample Input Data + +```sql +insert into Employees (employee_id, name, salary) values ('2', 'Meir', '3000'); +insert into Employees (employee_id, name, salary) values ('3', 'Michael', '3000'); +insert into Employees (employee_id, name, salary) values ('7', 'Addilyn', '7400'); +insert into Employees (employee_id, name, salary) values ('8', 'Juan', '6100'); +insert into Employees (employee_id, name, salary) values ('9', 'Kannon', '7400'); +``` + +## Expected Output Data + +```text
++-------------+---------+--------+---------+
+| employee_id | name    | salary | team_id |
++-------------+---------+--------+---------+
+| 2           | Meir    | 3000   | 1       |
+| 3           | Michael | 3000   | 1       |
+| 7           | Addilyn | 7400   | 2       |
+| 9           | Kannon  | 7400   | 2       |
++-------------+---------+--------+---------+
+``` + +## SQL Solution + +```sql +WITH cnts AS ( + SELECT *, + COUNT(salary) OVER (PARTITION BY salary) AS c + FROM employees_1875 +) +SELECT employee_id,name,salary, + DENSE_RANK() OVER (ORDER BY salary) AS team_id +FROM cnts +WHERE c <> 1; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `employee_id`, `name`, `salary`, `team_id` from `employees`, `cnts`. + +### Result Grain + +One row per employee whose salary is shared by at least one other employee. + +### Step-by-Step Logic + +1. Create CTE layers (`cnts`) to decompose the logic into smaller, testable steps before the final SELECT. +2. CTE `cnts`: reads `employees`, computes window metrics. +3. 
Apply row-level filtering in `WHERE`: c <> 1. +4. Use window functions (`... OVER (...)`) to compute rankings/running metrics while preserving row-level detail. +5. Project final output columns: `employee_id`, `name`, `salary`, `team_id`. +6. Note that `ORDER BY salary` belongs to the `DENSE_RANK()` window, not to the final output; add a closing `ORDER BY team_id, employee_id` if the result must be returned in the problem's required order. + +### Why This Works + +CTEs separate transformation stages, which makes dependencies explicit and easier to validate. Window expressions calculate comparative metrics without collapsing rows too early. Predicate filtering removes irrelevant rows before expensive downstream computation. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are sorting/grouping, window partitions. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Window `PARTITION BY`/`ORDER BY` choices change semantics; test with tied values and nulls. +- If nullable columns are involved, validate null-handling explicitly (`IS NULL`, `COALESCE`). + diff --git a/medium/1875. Group Employees of the Same Salary (Medium).sql b/medium/1875. Group Employees of the Same Salary (Medium).sql deleted file mode 100644 index 4fdd7cd..0000000 --- a/medium/1875. Group Employees of the Same Salary (Medium).sql +++ /dev/null @@ -1,9 +0,0 @@ -WITH cnts AS ( - SELECT *, - COUNT(salary) OVER (PARTITION BY salary) AS c - FROM employees_1875 -) -SELECT employee_id,name,salary, - DENSE_RANK() OVER (ORDER BY salary) AS team_id -FROM cnts -WHERE c <> 1; diff --git a/medium/1907. Count Salary Categories (Medium).md b/medium/1907. 
Count Salary Categories (Medium).md @@ -0,0 +1,87 @@ +# Question 1907: Count Salary Categories + +**LeetCode URL:** https://leetcode.com/problems/count-salary-categories/ + +## Description + +Write a solution to calculate the number of bank accounts for each salary category. return 0. The result format is in the following example. + +## Table Schema Structure + +```sql +Create table If Not Exists Accounts (account_id int, income int); +``` + +## Sample Input Data + +```sql +insert into Accounts (account_id, income) values ('3', '108939'); +insert into Accounts (account_id, income) values ('2', '12747'); +insert into Accounts (account_id, income) values ('8', '87709'); +insert into Accounts (account_id, income) values ('6', '91796'); +``` + +## Expected Output Data + +```text ++----------------+----------------+ +| category | accounts_count | ++----------------+----------------+ +| Low Salary | 1 | +| Average Salary | 0 | +| High Salary | 3 | ++----------------+----------------+ +``` + +## SQL Solution + +```sql +WITH tagged_accounts AS ( + SELECT *, + CASE WHEN income < 20000 THEN 'Low Salary' + WHEN income >= 20000 AND income <= 50000 THEN 'Average Salary' + ELSE 'High Salary' + END AS salary_tag + FROM accounts_1907 +), +salary_tags AS ( + SELECT UNNEST(ARRAY['Low Salary','Average Salary','High Salary']) AS salary_tag +) +SELECT st.salary_tag,COALESCE(COUNT(account_id),0) AS accounts_count +FROM salary_tags st +LEFT JOIN tagged_accounts ta ON st.salary_tag = ta.salary_tag +GROUP BY st.salary_tag; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `salary_tag`, `accounts_count` from `accounts`, `salary_tags`, `tagged_accounts`. + +### Result Grain + +One row per unique key in `GROUP BY st.salary_tag`. + +### Step-by-Step Logic + +1. Create CTE layers (`tagged_accounts`, `salary_tags`) to decompose the logic into smaller, testable steps before the final SELECT. +2. CTE `tagged_accounts`: reads `accounts`. +3. 
CTE `salary_tags`: prepares intermediate rows. +4. Combine datasets using LEFT JOIN, JOIN. Join predicates control row matching and prevent accidental cartesian growth. +5. Aggregate rows with COUNT grouped by st.salary_tag. +6. Project final output columns: `salary_tag`, `accounts_count`. + +### Why This Works + +CTEs separate transformation stages, which makes dependencies explicit and easier to validate. Join conditions align related entities so each output row is built from the correct source records. Grouping keys define the reporting grain; aggregate functions then summarize values at exactly that grain. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are sorting/grouping, join operations. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently. +- Every non-aggregated selected column must belong to the grouping grain. + diff --git a/medium/1907. Count Salary Categories (Medium).sql b/medium/1907. Count Salary Categories (Medium).sql deleted file mode 100644 index be100ca..0000000 --- a/medium/1907. Count Salary Categories (Medium).sql +++ /dev/null @@ -1,15 +0,0 @@ -WITH tagged_accounts AS ( - SELECT *, - CASE WHEN income < 20000 THEN 'Low Salary' - WHEN income >= 20000 AND income <= 50000 THEN 'Average Salary' - ELSE 'High Salary' - END AS salary_tag - FROM accounts_1907 -), -salary_tags AS ( - SELECT UNNEST(ARRAY['Low Salary','Average Salary','High Salary']) AS salary_tag -) -SELECT st.salary_tag,COALESCE(COUNT(account_id),0) AS accounts_count -FROM salary_tags st -LEFT JOIN tagged_accounts ta ON st.salary_tag = ta.salary_tag -GROUP BY st.salary_tag; diff --git a/medium/1934. Confirmation Rate (Medium).md b/medium/1934. Confirmation Rate (Medium).md new file mode 100644 index 0000000..20666ce --- /dev/null +++ b/medium/1934. 
Confirmation Rate (Medium).md @@ -0,0 +1,88 @@ +# Question 1934: Confirmation Rate + +**LeetCode URL:** https://leetcode.com/problems/confirmation-rate/ + +## Description + +Write a solution to find the confirmation rate of each user. The confirmation rate of a user is the number of 'confirmed' messages divided by the total number of requested confirmation messages, rounded to two decimal places; a user who did not request any confirmation messages has a confirmation rate of 0. Return the result table in any order. The result format is in the following example. + +## Table Schema Structure + +```sql +Create table If Not Exists Signups (user_id int, time_stamp datetime); +Create table If Not Exists Confirmations (user_id int, time_stamp datetime, action ENUM('confirmed','timeout')); +``` + +## Sample Input Data + +```sql +insert into Signups (user_id, time_stamp) values ('3', '2020-03-21 10:16:13'); +insert into Signups (user_id, time_stamp) values ('7', '2020-01-04 13:57:59'); +insert into Signups (user_id, time_stamp) values ('2', '2020-07-29 23:09:44'); +insert into Signups (user_id, time_stamp) values ('6', '2020-12-09 10:39:37'); +insert into Confirmations (user_id, time_stamp, action) values ('3', '2021-01-06 03:30:46', 'timeout'); +insert into Confirmations (user_id, time_stamp, action) values ('3', '2021-07-14 14:00:00', 'timeout'); +insert into Confirmations (user_id, time_stamp, action) values ('7', '2021-06-12 11:57:29', 'confirmed'); +insert into Confirmations (user_id, time_stamp, action) values ('7', '2021-06-13 12:58:28', 'confirmed'); +insert into Confirmations (user_id, time_stamp, action) values ('7', '2021-06-14 13:59:27', 'confirmed'); +insert into Confirmations (user_id, time_stamp, action) values ('2', '2021-01-22 00:00:00', 'confirmed'); +insert into Confirmations (user_id, time_stamp, action) values ('2', '2021-02-28 23:59:59', 'timeout'); +``` + +## Expected Output Data + +```text ++---------+-------------------+ +| user_id | confirmation_rate | ++---------+-------------------+ +| 6 | 0.00 | +| 3 | 0.00 | +| 7 | 1.00 | +| 2 | 0.50 | ++---------+-------------------+ +``` + +## SQL Solution + +```sql +WITH users AS ( + SELECT user_id, + ROUND(COUNT(CASE WHEN action = 'confirmed' THEN 1 
ELSE NULL END)::NUMERIC/ + COUNT(1),2) AS confirmation_rate + FROM confirmations_1934 + GROUP BY user_id +) +SELECT s.user_id,COALESCE(u.confirmation_rate,0.00) AS confirmation_rate +FROM signups_1934 s +LEFT JOIN users u ON s.user_id = u.user_id; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `user_id`, `confirmation_rate` from `confirmations`, `signups`, `users`. + +### Result Grain + +One row per user in `signups`; the LEFT JOIN keeps users with no confirmation requests. + +### Step-by-Step Logic + +1. Create CTE layers (`users`) to decompose the logic into smaller, testable steps before the final SELECT. +2. CTE `users`: reads `confirmations`. +3. Combine datasets using LEFT JOIN, JOIN. Join predicates control row matching and prevent accidental cartesian growth. +4. Project final output columns: `user_id`, `confirmation_rate`. + +### Why This Works + +CTEs separate transformation stages, which makes dependencies explicit and easier to validate. Join conditions align related entities so each output row is built from the correct source records. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are join operations. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently. + diff --git a/medium/1934. Confirmation Rate (Medium).sql b/medium/1934. Confirmation Rate (Medium).sql deleted file mode 100644 index b3d0960..0000000 --- a/medium/1934. Confirmation Rate (Medium).sql +++ /dev/null @@ -1,10 +0,0 @@ -WITH users AS ( - SELECT user_id, - ROUND(COUNT(CASE WHEN action = 'confirmed' THEN 1 ELSE NULL END)::NUMERIC/ - COUNT(1),2) AS confirmation_rate - FROM confirmations_1934 - GROUP BY user_id -) -SELECT s.user_id,COALESCE(u.confirmation_rate,0.00) -FROM signups_1934 s -LEFT JOIN users u ON s.user_id = u.user_id; diff --git a/medium/1949. 
Strong Friendship (Medium).md new file mode 100644 index 0000000..9224ca0 --- /dev/null +++ b/medium/1949. Strong Friendship (Medium).md @@ -0,0 +1,94 @@ +# Question 1949: Strong Friendship + +**LeetCode URL:** https://leetcode.com/problems/strong-friendship/ + +## Description + +Two users x and y are friends if a row (x, y) or (y, x) exists in `Friendship`. A friendship between x and y is strong if x and y have at least three friends in common. Write an SQL query to report all strong friendships as pairs with `user1_id < user2_id`, together with their number of common friends. Return the result table in any order. + +## Table Schema Structure + +```sql +Create table If Not Exists Friendship (user1_id int, user2_id int); +``` + +## Sample Input Data + +```sql +insert into Friendship (user1_id, user2_id) values ('1', '2'); +insert into Friendship (user1_id, user2_id) values ('1', '3'); +insert into Friendship (user1_id, user2_id) values ('2', '3'); +insert into Friendship (user1_id, user2_id) values ('1', '4'); +insert into Friendship (user1_id, user2_id) values ('2', '4'); +insert into Friendship (user1_id, user2_id) values ('1', '5'); +insert into Friendship (user1_id, user2_id) values ('2', '5'); +insert into Friendship (user1_id, user2_id) values ('1', '7'); +insert into Friendship (user1_id, user2_id) values ('3', '7'); +insert into Friendship (user1_id, user2_id) values ('1', '6'); +insert into Friendship (user1_id, user2_id) values ('3', '6'); +insert into Friendship (user1_id, user2_id) values ('2', '6'); +``` + +## Expected Output Data + +```text
++----------+----------+---------------------+
+| user1_id | user2_id | num_mututal_friends |
++----------+----------+---------------------+
+| 1        | 2        | 4                   |
+| 1        | 3        | 3                   |
++----------+----------+---------------------+
+``` + +## SQL Solution + +```sql +WITH friends AS ( + SELECT user1_id,user2_id + FROM friendship_1949 + UNION + SELECT user2_id,user1_id + FROM friendship_1949 +) +SELECT f1.user1_id,f2.user1_id AS user2_id,COUNT(f2.user1_id) AS num_mututal_friends 
+FROM friends f1 +INNER JOIN friends f2 +ON f1.user1_id <> f2.user1_id AND f1.user2_id = f2.user2_id +WHERE f1.user1_id < f2.user1_id +GROUP BY f1.user1_id,f2.user1_id +HAVING COUNT(f2.user1_id) >=3; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `user1_id`, `user2_id`, `num_mututal_friends` from `friendship`, `friends`. + +### Result Grain + +One row per unique key in `GROUP BY f1.user1_id,f2.user1_id`. + +### Step-by-Step Logic + +1. Create CTE layers (`friends`) to decompose the logic into smaller, testable steps before the final SELECT. +2. CTE `friends`: reads `friendship`. +3. Combine datasets using INNER JOIN, JOIN. Join predicates control row matching and prevent accidental cartesian growth. +4. Apply row-level filtering in `WHERE`: f1.user1_id < f2.user1_id. +5. Aggregate rows with COUNT grouped by f1.user1_id,f2.user1_id. +6. Project final output columns: `user1_id`, `user2_id`, `num_mututal_friends`. +7. Filter aggregated groups in `HAVING`: COUNT(f2.user1_id) >=3. + +### Why This Works + +CTEs separate transformation stages, which makes dependencies explicit and easier to validate. Join conditions align related entities so each output row is built from the correct source records. Grouping keys define the reporting grain; aggregate functions then summarize values at exactly that grain. Predicate filtering removes irrelevant rows before expensive downstream computation. `HAVING` ensures only groups that satisfy business thresholds survive. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are sorting/grouping, join operations. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently. +- Every non-aggregated selected column must belong to the grouping grain. +- This query counts common friends for every pair of users; the problem also requires the pair themselves to be friends. That holds on the sample data, but a fully general solution should additionally check that the pair appears in `friends`. 
+- If nullable columns are involved, validate null-handling explicitly (`IS NULL`, `COALESCE`). + diff --git a/medium/1949. Strong Friendship (Medium).sql b/medium/1949. Strong Friendship (Medium).sql deleted file mode 100644 index d6f9fc8..0000000 --- a/medium/1949. Strong Friendship (Medium).sql +++ /dev/null @@ -1,14 +0,0 @@ -WITH friends AS ( - SELECT user1_id,user2_id - FROM friendship_1949 - UNION - SELECT user2_id,user1_id - FROM friendship_1949 -) -SELECT f1.user1_id,f2.user1_id,COUNT(f2.user1_id) AS num_mututal_friends -FROM friends f1 -INNER JOIN friends f2 -ON f1.user1_id <> f2.user1_id AND f1.user2_id = f2.user2_id -WHERE f1.user1_id < f2.user1_id -GROUP BY f1.user1_id,f2.user1_id -HAVING COUNT(f2.user1_id) >=3 diff --git a/medium/1951. All the Pairs With the Maximum Number of Common Followers (Medium).md b/medium/1951. All the Pairs With the Maximum Number of Common Followers (Medium).md new file mode 100644 index 0000000..a881b63 --- /dev/null +++ b/medium/1951. All the Pairs With the Maximum Number of Common Followers (Medium).md @@ -0,0 +1,93 @@ +# Question 1951: All the Pairs With the Maximum Number of Common Followers + +**LeetCode URL:** https://leetcode.com/problems/all-the-pairs-with-the-maximum-number-of-common-followers/ + +## Description + +Write an SQL query to report the pairs of users with the maximum number of common followers. In other words, if the maximum number of common followers between any two users is `maxCommon`, report every pair of users (with the lower ID first) that has exactly `maxCommon` common followers. Return the result table in any order. 
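As a quick cross-check of the pair logic, here is a small Python sketch over this problem's sample rows (hard-coded tuples stand in for the Relations table; a sanity check, not part of the repository):

```python
from collections import defaultdict
from itertools import combinations

# Sample rows: (user_id, follower_id)
rows = [(1, 3), (2, 3), (7, 3), (1, 4), (2, 4), (7, 4),
        (1, 5), (2, 6), (7, 5)]

followers = defaultdict(set)
for user, follower in rows:
    followers[user].add(follower)

# Common-follower count per pair with user1 < user2 (the self-join + GROUP BY)
common = {(a, b): len(followers[a] & followers[b])
          for a, b in combinations(sorted(followers), 2)}

# Keep every pair tied with the maximum (the MAX(...) OVER () window + filter)
best = max(common.values())
pairs = sorted(p for p, c in common.items() if c == best)
print(pairs)  # -> [(1, 7)]
```

Unlike the SQL's INNER JOIN, this sketch also enumerates pairs with zero common followers; since those can never tie the maximum here, the result is the same.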
+ +## Table Schema Structure + +```sql +Create table If Not Exists Relations (user_id int, follower_id int); +``` + +## Sample Input Data + +```sql +insert into Relations (user_id, follower_id) values ('1', '3'); +insert into Relations (user_id, follower_id) values ('2', '3'); +insert into Relations (user_id, follower_id) values ('7', '3'); +insert into Relations (user_id, follower_id) values ('1', '4'); +insert into Relations (user_id, follower_id) values ('2', '4'); +insert into Relations (user_id, follower_id) values ('7', '4'); +insert into Relations (user_id, follower_id) values ('1', '5'); +insert into Relations (user_id, follower_id) values ('2', '6'); +insert into Relations (user_id, follower_id) values ('7', '5'); +``` + +## Expected Output Data + +```text
++-------+-------+
+| user1 | user2 |
++-------+-------+
+| 1     | 7     |
++-------+-------+
+``` + +## SQL Solution + +```sql +WITH common_followers AS ( + SELECT r1.user_id AS user1,r2.user_id AS user2, + COUNT(r1.follower_id) AS cmn_fwlr + FROM relations_1951 r1 + INNER JOIN relations_1951 r2 + ON r1.user_id < r2.user_id AND r1.follower_id = r2.follower_id + GROUP BY r1.user_id,r2.user_id +), +max_common_followers AS ( + SELECT user1,user2,cmn_fwlr, + MAX(cmn_fwlr) OVER () mx_cmn_fwlr + FROM common_followers +) +SELECT user1,user2 +FROM max_common_followers +WHERE mx_cmn_fwlr = cmn_fwlr; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `user1`, `user2` from `relations`, `common_followers`, `max_common_followers`. + +### Result Grain + +One row per user pair whose common-follower count ties the maximum. + +### Step-by-Step Logic + +1. Create CTE layers (`common_followers`, `max_common_followers`) to decompose the logic into smaller, testable steps before the final SELECT. +2. CTE `common_followers`: reads `relations`, joins related entities. +3. CTE `max_common_followers`: reads `common_followers`, computes window metrics. +4. Combine datasets using INNER JOIN, JOIN. 
Join predicates control row matching and prevent accidental cartesian growth. +5. Apply row-level filtering in `WHERE`: mx_cmn_fwlr = cmn_fwlr. +6. Use window functions (`... OVER (...)`) to compute rankings/running metrics while preserving row-level detail. +7. Project final output columns: `user1`, `user2`. + +### Why This Works + +CTEs separate transformation stages, which makes dependencies explicit and easier to validate. Join conditions align related entities so each output row is built from the correct source records. Window expressions calculate comparative metrics without collapsing rows too early. Predicate filtering removes irrelevant rows before expensive downstream computation. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are window partitions, join operations. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently. +- Window `PARTITION BY`/`ORDER BY` choices change semantics; test with tied values and nulls. +- If nullable columns are involved, validate null-handling explicitly (`IS NULL`, `COALESCE`). + diff --git a/medium/1951. All the Pairs With the Maximum Number of Common Followers (Medium).sql b/medium/1951. All the Pairs With the Maximum Number of Common Followers (Medium).sql deleted file mode 100644 index f398047..0000000 --- a/medium/1951. 
All the Pairs With the Maximum Number of Common Followers (Medium).sql +++ /dev/null @@ -1,16 +0,0 @@ -WITH common_followers AS ( - SELECT r1.user_id AS user1,r2.user_id AS user2, - COUNT(r1.follower_id) AS cmn_fwlr - FROM relations_1951 r1 - INNER JOIN relations_1951 r2 - ON r1.user_id < r2.user_id AND r1.follower_id = r2.follower_id - GROUP BY r1.user_id,r2.user_id -), -max_common_followers AS ( - SELECT user1,user2,cmn_fwlr, - MAX(cmn_fwlr) OVER () mx_cmn_fwlr - FROM common_followers -) -SELECT user1,user2 -FROM max_common_followers -WHERE mx_cmn_fwlr = cmn_fwlr; diff --git a/medium/1988. Find Cutoff Score for Each School (Medium).md b/medium/1988. Find Cutoff Score for Each School (Medium).md new file mode 100644 index 0000000..37dcec3 --- /dev/null +++ b/medium/1988. Find Cutoff Score for Each School (Medium).md @@ -0,0 +1,80 @@ +# Question 1988: Find Cutoff Score for Each School + +**LeetCode URL:** https://leetcode.com/problems/find-cutoff-score-for-each-school/ + +## Description + +Drafted from this solution SQL: write a query on `school`, `exam` to return `school_id`, `max_filled_students`. Group results by: s.school_id ORDER BY s.school_id. Order the final output by: s.school_id. 
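+This question's solution joins each school to every exam it can seat (`s.capacity >= student_count`), then takes the minimum qualifying score, with `COALESCE(..., -1)` covering schools that fit nothing. A minimal sketch with made-up inline values (not the repository tables):
+
+```sql
+SELECT s.cap, MIN(e.score) AS cutoff
+FROM (VALUES (9), (151)) AS s(cap)
+LEFT JOIN (VALUES (975, 10), (744, 100)) AS e(score, cnt)
+  ON s.cap >= e.cnt
+GROUP BY s.cap;
+-- cap 9   -> cutoff NULL (no exam fits; COALESCE would map this to -1)
+-- cap 151 -> cutoff 744  (both exams fit; the lowest qualifying score is the cutoff)
+```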
+ +## Table Schema Structure + +```sql +Create table If Not Exists Schools (school_id int, capacity int); +Create table If Not Exists Exam (score int, student_count int); +``` + +## Sample Input Data + +```sql +insert into Schools (school_id, capacity) values ('11', '151'); +insert into Schools (school_id, capacity) values ('5', '48'); +insert into Schools (school_id, capacity) values ('9', '9'); +insert into Schools (school_id, capacity) values ('10', '99'); +insert into Exam (score, student_count) values ('975', '10'); +insert into Exam (score, student_count) values ('966', '60'); +insert into Exam (score, student_count) values ('844', '76'); +insert into Exam (score, student_count) values ('749', '76'); +insert into Exam (score, student_count) values ('744', '100'); +``` + +## Expected Output Data + +```text ++-----------+---------------------+ +| school_id | max_filled_students | ++-----------+---------------------+ +| sample | sample | ++-----------+---------------------+ +``` + +## SQL Solution + +```sql +SELECT s.school_id,COALESCE(MIN(e.score),-1) AS max_filled_students +FROM school_1988 s +LEFT JOIN exam_1988 e +ON s.capacity >= student_count +GROUP BY s.school_id +ORDER BY s.school_id; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `school_id`, `max_filled_students` from `school`, `exam`. + +### Result Grain + +One row per unique key in `GROUP BY s.school_id`. + +### Step-by-Step Logic + +1. Combine datasets using LEFT JOIN, JOIN. Join predicates control row matching and prevent accidental cartesian growth. +2. Aggregate rows with MIN grouped by s.school_id. +3. Project final output columns: `school_id`, `max_filled_students`. +4. Order output deterministically with `ORDER BY s.school_id`. + +### Why This Works + +Join conditions align related entities so each output row is built from the correct source records. 
Grouping keys define the reporting grain; aggregate functions then summarize values at exactly that grain. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are sorting/grouping, join operations. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently. +- Every non-aggregated selected column must belong to the grouping grain. + diff --git a/medium/1988. Find Cutoff Score for Each School (Medium).sql b/medium/1988. Find Cutoff Score for Each School (Medium).sql deleted file mode 100644 index e285d11..0000000 --- a/medium/1988. Find Cutoff Score for Each School (Medium).sql +++ /dev/null @@ -1,6 +0,0 @@ -SELECT s.school_id,COALESCE(MIN(e.score),-1) AS max_filled_students -FROM school_1988 s -LEFT JOIN exam_1988 e -ON s.capacity >= student_count -GROUP BY s.school_id -ORDER BY s.school_id; diff --git a/medium/1990. Count the Number of Experiments (Medium).md b/medium/1990. Count the Number of Experiments (Medium).md new file mode 100644 index 0000000..16b9e4a --- /dev/null +++ b/medium/1990. Count the Number of Experiments (Medium).md @@ -0,0 +1,89 @@ +# Question 1990: Count the Number of Experiments + +**LeetCode URL:** https://leetcode.com/problems/count-the-number-of-experiments/ + +## Description + +Drafted from this solution SQL: write a query on `platforms`, `activities`, `combinations`, `experiments` to return `platform`, `experiment_name`, `num_experiments`. Group results by: c.pf,c.act. 
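+The counting for this question only works if every (platform, experiment) pair exists before the join. The `UNNEST`/`CROSS JOIN` grid-building trick used in the solution can be sketched in isolation (PostgreSQL syntax, standalone toy query):
+
+```sql
+SELECT p.pf AS platform, a.act AS experiment_name
+FROM UNNEST(ARRAY['Android','IOS','Web']) AS p(pf)
+CROSS JOIN UNNEST(ARRAY['Reading','Sports','Programming']) AS a(act);
+-- 9 rows: every platform paired with every experiment name
+```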
+ +## Table Schema Structure + +```sql +Create table If Not Exists Experiments (experiment_id int, platform ENUM('Android', 'IOS', 'Web'), experiment_name ENUM('Reading', 'Sports', 'Programming')); +``` + +## Sample Input Data + +```sql +insert into Experiments (experiment_id, platform, experiment_name) values ('4', 'IOS', 'Programming'); +insert into Experiments (experiment_id, platform, experiment_name) values ('13', 'IOS', 'Sports'); +insert into Experiments (experiment_id, platform, experiment_name) values ('14', 'Android', 'Reading'); +insert into Experiments (experiment_id, platform, experiment_name) values ('8', 'Web', 'Reading'); +insert into Experiments (experiment_id, platform, experiment_name) values ('12', 'Web', 'Reading'); +insert into Experiments (experiment_id, platform, experiment_name) values ('18', 'Web', 'Programming'); +``` + +## Expected Output Data + +```text ++----------+-----------------+-----------------+ +| platform | experiment_name | num_experiments | ++----------+-----------------+-----------------+ +| sample | sample | sample | ++----------+-----------------+-----------------+ +``` + +## SQL Solution + +```sql +WITH platforms AS ( + SELECT UNNEST(ARRAY['Android','IOS','Web']) AS pf +), +activities AS ( + SELECT UNNEST(ARRAY['Programming','Sports','Reading']) AS act +), +combinations AS ( + SELECT * + FROM platforms p + CROSS JOIN + activities a +) +SELECT c.pf AS platform,c.act AS experiment_name,COALESCE(COUNT(e.experiment_id),0) AS num_experiments +FROM combinations c +LEFT JOIN experiments_1990 e ON e.platform = c.pf AND e.experiment_name = c.act +GROUP BY c.pf,c.act; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `platform`, `experiment_name`, `num_experiments` from `platforms`, `activities`, `combinations`, `experiments`. + +### Result Grain + +One row per unique key in `GROUP BY c.pf,c.act`. + +### Step-by-Step Logic + +1. 
Create CTE layers (`platforms`, `activities`, `combinations`) to decompose the logic into smaller, testable steps before the final SELECT. +2. CTE `platforms`: prepares intermediate rows. +3. CTE `activities`: prepares intermediate rows. +4. CTE `combinations`: reads `platforms`, `activities`. +5. Combine datasets using LEFT JOIN, CROSS JOIN, JOIN. Join predicates control row matching and prevent accidental cartesian growth. +6. Aggregate rows with COUNT grouped by c.pf,c.act. +7. Project final output columns: `platform`, `experiment_name`, `num_experiments`. + +### Why This Works + +CTEs separate transformation stages, which makes dependencies explicit and easier to validate. Join conditions align related entities so each output row is built from the correct source records. Grouping keys define the reporting grain; aggregate functions then summarize values at exactly that grain. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are sorting/grouping, join operations. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently. +- Every non-aggregated selected column must belong to the grouping grain. + diff --git a/medium/1990. Count the Number of Experiments (Medium).sql b/medium/1990. Count the Number of Experiments (Medium).sql deleted file mode 100644 index fc2ef7a..0000000 --- a/medium/1990. 
Count the Number of Experiments (Medium).sql +++ /dev/null @@ -1,16 +0,0 @@ -WITH platforms AS ( - SELECT UNNEST(ARRAY['Android','IOS','Web']) AS pf -), -activities AS ( - SELECT UNNEST(ARRAY['Programming','Sports','Reading']) AS act -), -combinations AS ( - SELECT * - FROM platforms p - CROSS JOIN - activities a -) -SELECT c.pf AS platform,c.act AS experiment_name,COALESCE(COUNT(e.experiment_id),0) AS num_experiments -FROM combinations c -LEFT JOIN experiments_1990 e ON e.platform = c.pf AND e.experiment_name = c.act -GROUP BY c.pf,c.act; diff --git a/medium/2020. Number of Accounts That Did Not Stream (Medium).md b/medium/2020. Number of Accounts That Did Not Stream (Medium).md new file mode 100644 index 0000000..29c9d4f --- /dev/null +++ b/medium/2020. Number of Accounts That Did Not Stream (Medium).md @@ -0,0 +1,84 @@ +# Question 2020: Number of Accounts That Did Not Stream + +**LeetCode URL:** https://leetcode.com/problems/number-of-accounts-that-did-not-stream/ + +## Description + +Drafted from this solution SQL: write a query on `subscriptions` and `streams` to return the required result columns. Apply filter conditions: EXTRACT(YEAR FROM stream_date) <> 2021 AND account_id IN (SELECT * FROM accounts).
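+The year filter for this question relies on `EXTRACT`, which pulls one part of a date out as a number. A standalone sketch with a literal date (not repository data):
+
+```sql
+SELECT EXTRACT(YEAR FROM DATE '2021-11-07') AS yr;
+-- yr = 2021
+```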
+ +## Table Schema Structure + +```sql +Create table If Not Exists Subscriptions (account_id int, start_date date, end_date date); +Create table If Not Exists Streams (session_id int, account_id int, stream_date date); +``` + +## Sample Input Data + +```sql +insert into Subscriptions (account_id, start_date, end_date) values ('9', '2020-02-18', '2021-10-30'); +insert into Subscriptions (account_id, start_date, end_date) values ('3', '2021-09-21', '2021-11-13'); +insert into Subscriptions (account_id, start_date, end_date) values ('11', '2020-02-28', '2020-08-18'); +insert into Subscriptions (account_id, start_date, end_date) values ('13', '2021-04-20', '2021-09-22'); +insert into Subscriptions (account_id, start_date, end_date) values ('4', '2020-10-26', '2021-05-08'); +insert into Subscriptions (account_id, start_date, end_date) values ('5', '2020-09-11', '2021-01-17'); +insert into Streams (session_id, account_id, stream_date) values ('14', '9', '2020-05-16'); +insert into Streams (session_id, account_id, stream_date) values ('16', '3', '2021-10-27'); +insert into Streams (session_id, account_id, stream_date) values ('18', '11', '2020-04-29'); +insert into Streams (session_id, account_id, stream_date) values ('17', '13', '2021-08-08'); +insert into Streams (session_id, account_id, stream_date) values ('19', '4', '2020-12-31'); +insert into Streams (session_id, account_id, stream_date) values ('13', '5', '2021-01-05'); +``` + +## Expected Output Data + +```text ++------------------+ +| result | ++------------------+ +| derived values | ++------------------+ +``` + +## SQL Solution + +```sql +WITH accounts AS ( + SELECT account_id + FROM subscriptions_2020 + WHERE EXTRACT(YEAR FROM start_date)<=2021 AND EXTRACT(YEAR FROM end_date)>=2021 +) +SELECT COUNT(DISTINCT account_id) AS accounts_count +FROM streams_2020 +WHERE EXTRACT(YEAR FROM stream_date) <> 2021 AND account_id IN (SELECT * FROM accounts); +``` + +## Solution Breakdown + +### Goal + +The query builds the 
final result columns `accounts_count` from `subscriptions`, `start_date`, `end_date`, `streams`. + +### Result Grain + +Row grain follows the post-filter join output. + +### Step-by-Step Logic + +1. Create CTE layers (`accounts`) to decompose the logic into smaller, testable steps before the final SELECT. +2. CTE `accounts`: reads `subscriptions`, `start_date`, `end_date`. +3. Apply row-level filtering in `WHERE`: EXTRACT(YEAR FROM stream_date) <> 2021 AND account_id IN (SELECT * FROM accounts). +4. Project final output columns: `accounts_count`. + +### Why This Works + +CTEs separate transformation stages, which makes dependencies explicit and easier to validate. Predicate filtering removes irrelevant rows before expensive downstream computation. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are subquery execution. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- If nullable columns are involved, validate null-handling explicitly (`IS NULL`, `COALESCE`). + diff --git a/medium/2020. Number of Accounts That Did Not Stream (Medium).sql b/medium/2020. Number of Accounts That Did Not Stream (Medium).sql deleted file mode 100644 index bd7054d..0000000 --- a/medium/2020. Number of Accounts That Did Not Stream (Medium).sql +++ /dev/null @@ -1,8 +0,0 @@ -WITH accounts AS ( - SELECT account_id - FROM subscriptions_2020 - WHERE EXTRACT(YEAR FROM start_date)<=2021 AND EXTRACT(YEAR FROM end_date)>=2021 -) -SELECT COUNT(DISTINCT account_id) AS accounts_count -FROM streams_2020 -WHERE EXTRACT(YEAR FROM stream_date) <> 2021 AND account_id IN (SELECT * FROM accounts); diff --git a/medium/2041. Accepted Candidates From the Interviews (Medium).md b/medium/2041. Accepted Candidates From the Interviews (Medium).md new file mode 100644 index 0000000..d9e7b00 --- /dev/null +++ b/medium/2041. 
Accepted Candidates From the Interviews (Medium).md @@ -0,0 +1,88 @@ +# Question 2041: Accepted Candidates From the Interviews + +**LeetCode URL:** https://leetcode.com/problems/accepted-candidates-from-the-interviews/ + +## Description + +Drafted from this solution SQL: write a query on `candidates`, `rounds` to return `candidate_id`. Apply filter conditions: years_of_exp >= 2 AND interview_id IN (SELECT interview_id FROM rounds_2041 GROUP BY interview_id HAVING SUM(score) > 15). + +## Table Schema Structure + +```sql +Create table If Not Exists Candidates (candidate_id int, name varchar(30), years_of_exp int, interview_id int); +Create table If Not Exists Rounds (interview_id int, round_id int, score int); +``` + +## Sample Input Data + +```sql +insert into Candidates (candidate_id, name, years_of_exp, interview_id) values ('11', 'Atticus', '1', '101'); +insert into Candidates (candidate_id, name, years_of_exp, interview_id) values ('9', 'Ruben', '6', '104'); +insert into Candidates (candidate_id, name, years_of_exp, interview_id) values ('6', 'Aliza', '10', '109'); +insert into Candidates (candidate_id, name, years_of_exp, interview_id) values ('8', 'Alfredo', '0', '107'); +insert into Rounds (interview_id, round_id, score) values ('109', '3', '4'); +insert into Rounds (interview_id, round_id, score) values ('101', '2', '8'); +insert into Rounds (interview_id, round_id, score) values ('109', '4', '1'); +insert into Rounds (interview_id, round_id, score) values ('107', '1', '3'); +insert into Rounds (interview_id, round_id, score) values ('104', '3', '6'); +insert into Rounds (interview_id, round_id, score) values ('109', '1', '4'); +insert into Rounds (interview_id, round_id, score) values ('104', '4', '7'); +insert into Rounds (interview_id, round_id, score) values ('104', '1', '2'); +insert into Rounds (interview_id, round_id, score) values ('109', '2', '1'); +insert into Rounds (interview_id, round_id, score) values ('104', '2',
'7'); +insert into Rounds (interview_id, round_id, score) values ('107', '2', '3'); +insert into Rounds (interview_id, round_id, score) values ('101', '1', '8'); +``` + +## Expected Output Data + +```text ++--------------+ +| candidate_id | ++--------------+ +| sample | ++--------------+ +``` + +## SQL Solution + +```sql +SELECT candidate_id +FROM candidates_2041 +WHERE years_of_exp >= 2 AND + interview_id IN (SELECT interview_id + FROM rounds_2041 + GROUP BY interview_id + HAVING SUM(score) > 15); +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `candidate_id` from `candidates`, `rounds`. + +### Result Grain + +One row per candidate that clears both filters. + +### Step-by-Step Logic + +1. In the subquery, aggregate `rounds_2041` rows with SUM grouped by interview_id. +2. Filter aggregated groups in `HAVING`: SUM(score) > 15. +3. Apply row-level filtering in `WHERE`: years_of_exp >= 2 AND interview_id IN the qualifying set. +4. Project final output columns: `candidate_id`. + +### Why This Works + +Grouping keys define the reporting grain; aggregate functions then summarize values at exactly that grain. Predicate filtering removes irrelevant rows before expensive downstream computation. `HAVING` ensures only groups that satisfy business thresholds survive. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are sorting/grouping, subquery execution. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Every non-aggregated selected column must belong to the grouping grain. +- If nullable columns are involved, validate null-handling explicitly (`IS NULL`, `COALESCE`). + diff --git a/medium/2041. Accepted Candidates From the Interviews (Medium).sql b/medium/2041. Accepted Candidates From the Interviews (Medium).sql deleted file mode 100644 index 0a14dbf..0000000 --- a/medium/2041.
Accepted Candidates From the Interviews (Medium).sql +++ /dev/null @@ -1,7 +0,0 @@ -SELECT candidate_id -FROM candidates_2041 -WHERE years_of_exp >= 2 AND - interview_id IN (SELECT interview_id - FROM rounds_2041 - GROUP BY interview_id - HAVING SUM(score) > 15); diff --git a/medium/2051. The Category of Each Member in the Store (Medium).md b/medium/2051. The Category of Each Member in the Store (Medium).md new file mode 100644 index 0000000..4b8a1d6 --- /dev/null +++ b/medium/2051. The Category of Each Member in the Store (Medium).md @@ -0,0 +1,96 @@ +# Question 2051: The Category of Each Member in the Store + +**LeetCode URL:** https://leetcode.com/problems/the-category-of-each-member-in-the-store/ + +## Description + +Drafted from this solution SQL: write a query on `visits`, `purchases`, `members` to return every member along with a `category`. Group the visit-to-purchase ratio by: v.member_id. Default members with no visits to 'Bronze'. Order the final output by: m.member_id.
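+One subtlety behind this question's percentage buckets: PostgreSQL truncates integer division, so the solution multiplies by 100 before dividing. A toy illustration (literal numbers, not repository data):
+
+```sql
+SELECT 1*100/3 AS pct_ok,     -- 33: multiply first, then divide
+       1/3*100 AS pct_wrong;  -- 0: 1/3 truncates to 0 before the multiply
+```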
+ +## Table Schema Structure + +```sql +Create table If Not Exists Members (member_id int, name varchar(30)); +Create table If Not Exists Visits (visit_id int, member_id int, visit_date date); +Create table If Not Exists Purchases (visit_id int, charged_amount int); +``` + +## Sample Input Data + +```sql +insert into Members (member_id, name) values ('9', 'Alice'); +insert into Members (member_id, name) values ('11', 'Bob'); +insert into Members (member_id, name) values ('3', 'Winston'); +insert into Members (member_id, name) values ('8', 'Hercy'); +insert into Members (member_id, name) values ('1', 'Narihan'); +insert into Visits (visit_id, member_id, visit_date) values ('22', '11', '2021-10-28'); +insert into Visits (visit_id, member_id, visit_date) values ('16', '11', '2021-01-12'); +insert into Visits (visit_id, member_id, visit_date) values ('18', '9', '2021-12-10'); +insert into Visits (visit_id, member_id, visit_date) values ('19', '3', '2021-10-19'); +insert into Visits (visit_id, member_id, visit_date) values ('12', '11', '2021-03-01'); +insert into Visits (visit_id, member_id, visit_date) values ('17', '8', '2021-05-07'); +insert into Visits (visit_id, member_id, visit_date) values ('21', '9', '2021-05-12'); +insert into Purchases (visit_id, charged_amount) values ('12', '2000'); +insert into Purchases (visit_id, charged_amount) values ('18', '9000'); +insert into Purchases (visit_id, charged_amount) values ('17', '7000'); +``` + +## Expected Output Data + +```text ++----------+ +| category | ++----------+ +| sample | ++----------+ +``` + +## SQL Solution + +```sql +WITH categorized_members AS ( + SELECT v.member_id, + CASE WHEN COUNT(p.visit_id)*100/COUNT(v.visit_id)>=80 THEN 'Diamond' + WHEN COUNT(p.visit_id)*100/COUNT(v.visit_id)>=50 AND + COUNT(p.visit_id)*100/COUNT(v.visit_id)< 80 THEN 'Gold' + WHEN COUNT(p.visit_id)*100/COUNT(v.visit_id)< 50 THEN 'Silver' + END AS category + FROM visits_2051 v + LEFT JOIN purchases_2051 p ON v.visit_id = p.visit_id 
+ GROUP BY v.member_id +) +SELECT m.*,COALESCE(category,'Bronze') AS category +FROM members_2051 m +LEFT JOIN categorized_members c ON m.member_id=c.member_id +ORDER BY m.member_id; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `category` from `visits`, `purchases`, `members`, `categorized_members`. + +### Result Grain + +Row grain follows the post-filter join output. + +### Step-by-Step Logic + +1. Create CTE layers (`categorized_members`) to decompose the logic into smaller, testable steps before the final SELECT. +2. CTE `categorized_members`: reads `visits`, `purchases`, joins related entities. +3. Combine datasets using LEFT JOIN, JOIN. Join predicates control row matching and prevent accidental cartesian growth. +4. Project final output columns: `category`. +5. Order output deterministically with `ORDER BY m.member_id`. + +### Why This Works + +CTEs separate transformation stages, which makes dependencies explicit and easier to validate. Join conditions align related entities so each output row is built from the correct source records. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are sorting/grouping, join operations. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently. + diff --git a/medium/2051. The Category of Each Member in the Store (Medium).sql b/medium/2051. The Category of Each Member in the Store (Medium).sql deleted file mode 100644 index e14ac25..0000000 --- a/medium/2051. 
The Category of Each Member in the Store (Medium).sql +++ /dev/null @@ -1,15 +0,0 @@ -WITH categorized_members AS ( - SELECT v.member_id, - CASE WHEN COUNT(p.visit_id)*100/COUNT(v.visit_id)>=80 THEN 'Diamond' - WHEN COUNT(p.visit_id)*100/COUNT(v.visit_id)>=50 AND - COUNT(p.visit_id)*100/COUNT(v.visit_id)< 80 THEN 'Gold' - WHEN COUNT(p.visit_id)*100/COUNT(v.visit_id)< 50 THEN 'Silver' - END AS category - FROM visits_2051 v - LEFT JOIN purchases_2051 p ON v.visit_id = p.visit_id - GROUP BY v.member_id -) -SELECT m.*,COALESCE(category,'Bronze') AS category -FROM members_2051 m -LEFT JOIN categorized_members c ON m.member_id=c.member_id -ORDER BY m.member_id; diff --git a/medium/2066. Account Balance (Medium).md b/medium/2066. Account Balance (Medium).md new file mode 100644 index 0000000..d774e27 --- /dev/null +++ b/medium/2066. Account Balance (Medium).md @@ -0,0 +1,80 @@ +# Question 2066: Account Balance + +**LeetCode URL:** https://leetcode.com/problems/account-balance/ + +## Description + +Drafted from this solution SQL: write a query on `transactions`, `fixed_amount` to return `account_id`, `day`, `balance`. Order the final output by: account_id,day. 
+ +## Table Schema Structure + +```sql +Create table If Not Exists Transactions (account_id int, day date, type ENUM('Deposit', 'Withdraw'), amount int); +``` + +## Sample Input Data + +```sql +insert into Transactions (account_id, day, type, amount) values ('1', '2021-11-07', 'Deposit', '2000'); +insert into Transactions (account_id, day, type, amount) values ('1', '2021-11-09', 'Withdraw', '1000'); +insert into Transactions (account_id, day, type, amount) values ('1', '2021-11-11', 'Deposit', '3000'); +insert into Transactions (account_id, day, type, amount) values ('2', '2021-12-07', 'Deposit', '7000'); +insert into Transactions (account_id, day, type, amount) values ('2', '2021-12-12', 'Withdraw', '7000'); +``` + +## Expected Output Data + +```text ++------------+--------+---------+ +| account_id | day | balance | ++------------+--------+---------+ +| sample | sample | sample | ++------------+--------+---------+ +``` + +## SQL Solution + +```sql +WITH fixed_amount AS ( + SELECT account_id,day, + CASE WHEN type = 'Deposit' THEN amount + ELSE amount*-1 + END AS amount + FROM transactions_2066 +) +SELECT account_id,day, + SUM(amount) OVER (PARTITION BY account_id ORDER BY day) AS balance +FROM fixed_amount +ORDER BY account_id,day; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `account_id`, `day`, `balance` from `transactions`, `fixed_amount`. + +### Result Grain + +One row per transaction row in `transactions`. + +### Step-by-Step Logic + +1. Create CTE layers (`fixed_amount`) to decompose the logic into smaller, testable steps before the final SELECT. +2. CTE `fixed_amount`: reads `transactions` and signs each amount (withdrawals become negative). +3. Use window functions (`... OVER (...)`) to compute rankings/running metrics while preserving row-level detail. +4. Project final output columns: `account_id`, `day`, `balance`. +5. Order output deterministically with `ORDER BY account_id,day`.
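+The window step above can be sketched on its own: with an `ORDER BY` inside `OVER`, `SUM` becomes a running total (inline toy rows, not the repository table):
+
+```sql
+SELECT day,
+       SUM(amount) OVER (ORDER BY day) AS balance
+FROM (VALUES (DATE '2021-11-07',  2000),
+             (DATE '2021-11-09', -1000)) AS t(day, amount);
+-- balances: 2000 on the first day, then 1000
+```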
+ +### Why This Works + +CTEs separate transformation stages, which makes dependencies explicit and easier to validate. Window expressions calculate comparative metrics without collapsing rows too early. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are sorting/grouping, window partitions. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Window `PARTITION BY`/`ORDER BY` choices change semantics; test with tied values and nulls. + diff --git a/medium/2066. Account Balance (Medium).sql b/medium/2066. Account Balance (Medium).sql deleted file mode 100644 index d79a300..0000000 --- a/medium/2066. Account Balance (Medium).sql +++ /dev/null @@ -1,11 +0,0 @@ -WITH fixed_amount AS ( - SELECT account_id,day, - CASE WHEN type = 'Deposit' THEN amount - ELSE amount*-1 - END AS amount - FROM transactions_2066 -) -SELECT account_id,day, - SUM(amount) OVER (PARTITION BY account_id ORDER BY day) AS balance -FROM fixed_amount -ORDER BY account_id,day; diff --git a/medium/2084. Drop Type 1 Orders for Customers With Type 0 Orders (Medium).md b/medium/2084. Drop Type 1 Orders for Customers With Type 0 Orders (Medium).md new file mode 100644 index 0000000..9caea90 --- /dev/null +++ b/medium/2084. Drop Type 1 Orders for Customers With Type 0 Orders (Medium).md @@ -0,0 +1,75 @@ +# Question 2084: Drop Type 1 Orders for Customers With Type 0 Orders + +**LeetCode URL:** https://leetcode.com/problems/drop-type-1-orders-for-customers-with-type-0-orders/ + +## Description + +Drafted from this solution SQL: write a query on `orders` to return every order that survives the filter. Apply filter conditions: (customer_id, order_type) IN (SELECT customer_id, MIN(order_type) FROM orders_2084 GROUP BY customer_id).
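+The tuple membership filter described above can be tried in isolation; PostgreSQL compares row values position by position (standalone toy query):
+
+```sql
+SELECT (2, 1) IN (VALUES (1, 0), (2, 1)) AS keeps_row;
+-- keeps_row = true
+```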
+ +## Table Schema Structure + +```sql +Create table If Not Exists Orders (order_id int, customer_id int, order_type int); +``` + +## Sample Input Data + +```sql +insert into Orders (order_id, customer_id, order_type) values ('1', '1', '0'); +insert into Orders (order_id, customer_id, order_type) values ('2', '1', '0'); +insert into Orders (order_id, customer_id, order_type) values ('11', '2', '0'); +insert into Orders (order_id, customer_id, order_type) values ('12', '2', '1'); +insert into Orders (order_id, customer_id, order_type) values ('21', '3', '1'); +insert into Orders (order_id, customer_id, order_type) values ('22', '3', '0'); +insert into Orders (order_id, customer_id, order_type) values ('31', '4', '1'); +insert into Orders (order_id, customer_id, order_type) values ('32', '4', '1'); +``` + +## Expected Output Data + +```text ++----------+-------------+------------+ +| order_id | customer_id | order_type | ++----------+-------------+------------+ +| sample | sample | sample | ++----------+-------------+------------+ +``` + +## SQL Solution + +```sql +SELECT * FROM orders_2084 +WHERE (customer_id, order_type) +IN (SELECT customer_id, MIN(order_type) + FROM orders_2084 + GROUP BY customer_id) +``` + +## Solution Breakdown + +### Goal + +The query returns the full surviving order rows (`order_id`, `customer_id`, `order_type`) from `orders`. + +### Result Grain + +One row per order that survives the tuple filter. + +### Step-by-Step Logic + +1. In the subquery, aggregate rows with MIN grouped by customer_id to find each customer's lowest order_type (0 whenever the customer has any type-0 order). +2. Apply row-level filtering in `WHERE`: keep orders whose (customer_id, order_type) tuple matches that per-customer minimum. + +### Why This Works + +Grouping keys define the reporting grain; aggregate functions then summarize values at exactly that grain. Predicate filtering removes irrelevant rows before expensive downstream computation. + +### Performance Notes + +Primary cost drivers are sorting/grouping, subquery execution. Indexes on join/filter/group keys typically provide the biggest speedup.
+ +### Common Pitfalls + +- Every non-aggregated selected column must belong to the grouping grain. +- If nullable columns are involved, validate null-handling explicitly (`IS NULL`, `COALESCE`). + diff --git a/medium/2084. Drop Type 1 Orders for Customers With Type 0 Orders (Medium).sql b/medium/2084. Drop Type 1 Orders for Customers With Type 0 Orders (Medium).sql deleted file mode 100644 index e5cfa39..0000000 --- a/medium/2084. Drop Type 1 Orders for Customers With Type 0 Orders (Medium).sql +++ /dev/null @@ -1,5 +0,0 @@ -SELECT * FROM orders_2084 -WHERE (customer_id, order_type) -IN (SELECT customer_id, MIN(order_type) - FROM orders_2084 - GROUP BY customer_id) diff --git a/medium/2112. The Airport With the Most Traffic (Medium).md b/medium/2112. The Airport With the Most Traffic (Medium).md new file mode 100644 index 0000000..84cc966 --- /dev/null +++ b/medium/2112. The Airport With the Most Traffic (Medium).md @@ -0,0 +1,83 @@ +# Question 2112: The Airport With the Most Traffic + +**LeetCode URL:** https://leetcode.com/problems/the-airport-with-the-most-traffic/ + +## Description + +Drafted from this solution SQL: write a query on `flights` to return the busiest airport(s). Group results by: airport. Apply filter conditions: flights_count = (SELECT MAX(flights_count) FROM grouped_airport).
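+This question's solution stacks departures and arrivals with `UNION ALL` before grouping; a toy sketch (inline values, not the repository tables) shows why `UNION ALL`, not `UNION`, is needed:
+
+```sql
+SELECT airport, SUM(cnt) AS total_traffic
+FROM (SELECT 1 AS airport, 5 AS cnt  -- departures from airport 1
+      UNION ALL
+      SELECT 1, 5) AS t              -- arrivals at airport 1
+GROUP BY airport;
+-- total_traffic = 10; plain UNION would collapse the two identical
+-- (1, 5) rows and report only 5
+```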
+ +## Table Schema Structure + +```sql +Create table If Not Exists Flights (departure_airport int, arrival_airport int, flights_count int); +``` + +## Sample Input Data + +```sql +insert into Flights (departure_airport, arrival_airport, flights_count) values ('1', '2', '4'); +insert into Flights (departure_airport, arrival_airport, flights_count) values ('2', '1', '5'); +insert into Flights (departure_airport, arrival_airport, flights_count) values ('2', '4', '5'); +``` + +## Expected Output Data + +```text ++------------------+ +| result | ++------------------+ +| derived values | ++------------------+ +``` + +## SQL Solution + +```sql +WITH airports AS ( + SELECT departure_airport AS airport,flights_count + FROM flights_2112 + UNION ALL + SELECT arrival_airport AS airport,flights_count + FROM flights_2112 +), +grouped_airport AS ( + SELECT airport,SUM(flights_count) AS flights_count + FROM airports + GROUP BY airport +) +SELECT airport +FROM grouped_airport +WHERE flights_count = (SELECT MAX(flights_count) FROM grouped_airport); +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `airport` from `flights`, `airports`, `grouped_airport`. + +### Result Grain + +Row grain follows the post-filter join output. + +### Step-by-Step Logic + +1. Create CTE layers (`airports`, `grouped_airport`) to decompose the logic into smaller, testable steps before the final SELECT. +2. CTE `airports`: reads `flights`. +3. CTE `grouped_airport`: reads `airports`. +4. Apply row-level filtering in `WHERE`: flights_count = (SELECT MAX(flights_count) FROM grouped_airport). +5. Project final output columns: `airport`. +6. Merge compatible result sets with `UNION`/`UNION ALL` before final projection. + +### Why This Works + +CTEs separate transformation stages, which makes dependencies explicit and easier to validate. Predicate filtering removes irrelevant rows before expensive downstream computation. 
Set-union logic combines multiple valid pathways into one consistent output. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are subquery execution. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- If nullable columns are involved, validate null-handling explicitly (`IS NULL`, `COALESCE`). + diff --git a/medium/2112. The Airport With the Most Traffic (Medium).sql b/medium/2112. The Airport With the Most Traffic (Medium).sql deleted file mode 100644 index 75dbc40..0000000 --- a/medium/2112. The Airport With the Most Traffic (Medium).sql +++ /dev/null @@ -1,15 +0,0 @@ -WITH airports AS ( - SELECT departure_airport AS airport,flights_count - FROM flights_2112 - UNION ALL - SELECT arrival_airport AS airport,flights_count - FROM flights_2112 -), -grouped_airport AS ( - SELECT airport,SUM(flights_count) AS flights_count - FROM airports - GROUP BY airport -) -SELECT airport -FROM grouped_airport -WHERE flights_count = (SELECT MAX(flights_count) FROM grouped_airport); diff --git a/medium/2142. The Number of Passengers in Each Bus I (Medium).md new file mode 100644 index 0000000..7b0e8f3 --- /dev/null +++ b/medium/2142. The Number of Passengers in Each Bus I (Medium).md @@ -0,0 +1,98 @@ +# Question 2142: The Number of Passengers in Each Bus I + +**LeetCode URL:** https://leetcode.com/problems/the-number-of-passengers-in-each-bus-i/ + +## Description + +Buses and passengers arrive at a station. Each passenger boards the first bus that arrives at or after their own arrival time (bus capacity is unlimited in this version of the problem). For each bus, report `bus_id` and `passengers_cnt`, the number of passengers that board it, ordered by `bus_id`.
 + +## Table Schema Structure + +```sql +Create table If Not Exists Buses (bus_id int, arrival_time int); +Create table If Not Exists Passengers (passenger_id int, arrival_time int); +``` + +## Sample Input Data + +```sql +insert into Buses (bus_id, arrival_time) values ('1', '2'); +insert into Buses (bus_id, arrival_time) values ('2', '4'); +insert into Buses (bus_id, arrival_time) values ('3', '7'); +insert into Passengers (passenger_id, arrival_time) values ('11', '1'); +insert into Passengers (passenger_id, arrival_time) values ('12', '5'); +insert into Passengers (passenger_id, arrival_time) values ('13', '6'); +insert into Passengers (passenger_id, arrival_time) values ('14', '7'); +``` + +## Expected Output Data + +```text ++--------+----------------+ +| bus_id | passengers_cnt | ++--------+----------------+ +| 1 | 1 | +| 2 | 0 | +| 3 | 3 | ++--------+----------------+ +``` + +## SQL Solution + +```sql +WITH running_total_passengers AS ( + SELECT *, + COUNT(passenger_id) OVER (PARTITION BY bus_id ORDER BY b.arrival_time) AS passengers + FROM buses_2142 b + LEFT JOIN passengers_2142 p ON p.arrival_time <= b.arrival_time +) +SELECT r1.bus_id,r1.passengers-COALESCE(r2.passengers,0) AS passengers_cnt +FROM running_total_passengers r1 +LEFT JOIN running_total_passengers r2 ON r1.bus_id=r2.bus_id+1 +ORDER BY r1.passengers,r1.bus_id; + +--(Here we have made an assumption that smaller bus_id arrived first which will not be the case always) +-- Better Query + +WITH running_total_passengers AS ( + SELECT b.bus_id,b.arrival_time AS bus_arrival_time, + p.passenger_id,p.arrival_time AS passenger_arrival_time, + 
COUNT(passenger_id) OVER (PARTITION BY bus_id ORDER BY b.arrival_time) AS passengers + FROM buses_2142 b + LEFT JOIN passengers_2142 p ON p.arrival_time <= b.arrival_time +) +SELECT DISTINCT r1.bus_id,r1.passengers-COALESCE(r2.passengers,0) AS passengers_cnt +FROM running_total_passengers r1 +LEFT JOIN running_total_passengers r2 ON r1.bus_arrival_time > r2.bus_arrival_time AND r1.bus_id=r2.bus_id+1 +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `bus_id`, `passengers_cnt` from `buses`, `passengers`, `running_total_passengers`. + +### Result Grain + +One row per distinct combination of projected columns. + +### Step-by-Step Logic + +1. Create CTE layers (`running_total_passengers`) to decompose the logic into smaller, testable steps before the final SELECT. +2. CTE `running_total_passengers`: reads `buses`, `passengers`, joins related entities, computes window metrics. +3. Combine datasets using LEFT JOIN, JOIN. Join predicates control row matching and prevent accidental cartesian growth. +4. Use window functions (`... OVER (...)`) to compute rankings/running metrics while preserving row-level detail. +5. Project final output columns: `bus_id`, `passengers_cnt`. +6. Remove duplicate result tuples using `DISTINCT` where uniqueness is required. + +### Why This Works + +CTEs separate transformation stages, which makes dependencies explicit and easier to validate. Join conditions align related entities so each output row is built from the correct source records. Window expressions calculate comparative metrics without collapsing rows too early. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are window partitions, join operations. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently. 
+- Window `PARTITION BY`/`ORDER BY` choices change semantics; test with tied values and nulls. + diff --git a/medium/2142. The Number of Passengers in Each Bus I (Medium).sql b/medium/2142. The Number of Passengers in Each Bus I (Medium).sql deleted file mode 100644 index cb1d4a6..0000000 --- a/medium/2142. The Number of Passengers in Each Bus I (Medium).sql +++ /dev/null @@ -1,24 +0,0 @@ -WITH running_total_passengers AS ( - SELECT *, - COUNT(passenger_id) OVER (PARTITION BY bus_id ORDER BY b.arrival_time) AS passengers - FROM buses_2142 b - LEFT JOIN passengers_2142 p ON p.arrival_time <= b.arrival_time -) -SELECT r1.bus_id,r1.passengers-COALESCE(r2.passengers,0) AS passengers_cnt -FROM running_total_passengers r1 -LEFT JOIN running_total_passengers r2 ON r1.bus_id=r2.bus_id+1 -ORDER BY r1.passengers,r1.bus_id; - ---(Here we have made an assumption that smaller bus_id arrived first which will not be the case always) --- Better Query - -WITH running_total_passengers AS ( - SELECT b.bus_id,b.arrival_time AS bus_arrival_time, - p.passenger_id,p.arrival_time AS passenger_arrival_time, - COUNT(passenger_id) OVER (PARTITION BY bus_id ORDER BY b.arrival_time) AS passengers - FROM buses_2142 b - LEFT JOIN passengers_2142 p ON p.arrival_time <= b.arrival_time -) -SELECT DISTINCT r1.bus_id,r1.passengers-COALESCE(r2.passengers,0) AS passengers_cnt -FROM running_total_passengers r1 -LEFT JOIN running_total_passengers r2 ON r1.bus_arrival_time > r2.bus_arrival_time AND r1.bus_id=r2.bus_id+1 diff --git a/medium/2159. Order Two Columns Independently (Medium).md b/medium/2159. Order Two Columns Independently (Medium).md new file mode 100644 index 0000000..c0cb094 --- /dev/null +++ b/medium/2159. 
Order Two Columns Independently (Medium).md @@ -0,0 +1,83 @@ +# Question 2159: Order Two Columns Independently + +**LeetCode URL:** https://leetcode.com/problems/order-two-columns-independently/ + +## Description + +Independently reorder the two columns of the `Data` table: `first_col` in ascending order and `second_col` in descending order, pairing the values by position. + +## Table Schema Structure + +```sql +Create table If Not Exists Data (first_col int, second_col int); +``` + +## Sample Input Data + +```sql +insert into Data (first_col, second_col) values ('4', '2'); +insert into Data (first_col, second_col) values ('2', '3'); +insert into Data (first_col, second_col) values ('3', '1'); +insert into Data (first_col, second_col) values ('1', '4'); +``` + +## Expected Output Data + +```text ++-----------+------------+ +| first_col | second_col | ++-----------+------------+ +| 1 | 4 | +| 2 | 3 | +| 3 | 2 | +| 4 | 1 | ++-----------+------------+ +``` + +## SQL Solution + +```sql +WITH ranked_first_column AS ( + SELECT first_col, + ROW_NUMBER() OVER (ORDER BY first_col) AS rn + FROM data_2159 +), +ranked_second_column AS ( + SELECT second_col, + ROW_NUMBER() OVER (ORDER BY second_col DESC) AS rn + FROM data_2159 +) +SELECT f.first_col,s.second_col +FROM ranked_first_column f +JOIN ranked_second_column s ON f.rn = s.rn; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `first_col`, `second_col` from `data`, `ranked_first_column`, `ranked_second_column`. + +### Result Grain + +One row per position `rn`: the n-th smallest `first_col` is paired with the n-th largest `second_col`. + +### Step-by-Step Logic + +1. Create CTE layers (`ranked_first_column`, `ranked_second_column`) to decompose the logic into smaller, testable steps before the final SELECT. +2. CTE `ranked_first_column`: reads `data`, computes window metrics. +3. 
CTE `ranked_second_column`: reads `data`, computes window metrics. +4. Combine datasets using JOIN. Join predicates control row matching and prevent accidental cartesian growth. +5. Use window functions (`... OVER (...)`) to compute rankings/running metrics while preserving row-level detail. +6. Project final output columns: `first_col`, `second_col`. + +### Why This Works + +CTEs separate transformation stages, which makes dependencies explicit and easier to validate. Join conditions align related entities so each output row is built from the correct source records. Window expressions calculate comparative metrics without collapsing rows too early. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are window partitions, join operations. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently. +- Window `PARTITION BY`/`ORDER BY` choices change semantics; test with tied values and nulls. + diff --git a/medium/2159. Order Two Columns Independently (Medium).sql b/medium/2159. Order Two Columns Independently (Medium).sql deleted file mode 100644 index 13a49ef..0000000 --- a/medium/2159. Order Two Columns Independently (Medium).sql +++ /dev/null @@ -1,13 +0,0 @@ -WITH ranked_first_column AS ( - SELECT first_col, - ROW_NUMBER() OVER (ORDER BY first_col) AS rn - FROM data_2159 -), -ranked_second_column AS ( - SELECT second_col, - ROW_NUMBER() OVER (ORDER BY second_col DESC) AS rn - FROM data_2159 -) -SELECT f.first_col,s.second_col -FROM ranked_first_column f -JOIN ranked_second_column s ON f.rn = s.rn; diff --git a/medium/2175. The Change in Global Rankings (Medium).md b/medium/2175. The Change in Global Rankings (Medium).md new file mode 100644 index 0000000..50d0876 --- /dev/null +++ b/medium/2175. 
The Change in Global Rankings (Medium).md @@ -0,0 +1,101 @@ +# Question 2175: The Change in Global Rankings + +**LeetCode URL:** https://leetcode.com/problems/the-change-in-global-rankings/ + +## Description + +A team's global rank is its position when all teams are ordered by points descending, with ties broken by name ascending. After applying each team's points change, report `team_id`, `name`, and `rank_diff`, the change in the team's global rank. + +## Table Schema Structure + +```sql +Create table If Not Exists TeamPoints (team_id int, name varchar(100), points int); +Create table If Not Exists PointsChange (team_id int, points_change int); +``` + +## Sample Input Data + +```sql +insert into TeamPoints (team_id, name, points) values ('3', 'Algeria', '1431'); +insert into TeamPoints (team_id, name, points) values ('1', 'Senegal', '2132'); +insert into TeamPoints (team_id, name, points) values ('2', 'New Zealand', '1402'); +insert into TeamPoints (team_id, name, points) values ('4', 'Croatia', '1817'); +insert into PointsChange (team_id, points_change) values ('3', '399'); +insert into PointsChange (team_id, points_change) values ('2', '0'); +insert into PointsChange (team_id, points_change) values ('4', '13'); +insert into PointsChange (team_id, points_change) values ('1', '-22'); +``` + +## Expected Output Data + +```text ++---------+-------------+-----------+ +| team_id | name | rank_diff | ++---------+-------------+-----------+ +| 1 | Senegal | 0 | +| 3 | Algeria | -1 | +| 4 | Croatia | 1 | +| 2 | New Zealand | 0 | ++---------+-------------+-----------+ +``` + +## SQL Solution + +```sql +-- More Readable, But Requires two Joins +WITH before_update_ranked_teams AS ( + SELECT *, + DENSE_RANK() OVER (ORDER BY points DESC,name) before_rn + FROM team_points_2175 +), +after_update_ranked_teams AS ( + SELECT tp.team_id,tp.name,tp.points+pc.points_change AS points, + DENSE_RANK() OVER (ORDER BY tp.points+pc.points_change DESC,tp.name) after_rn + FROM team_points_2175 tp + INNER JOIN points_change_2175 pc ON tp.team_id = 
pc.team_id +) +SELECT au.team_id,au.name,au.points,au.after_rn-bu.before_rn AS rank_diff +FROM before_update_ranked_teams bu +INNER JOIN after_update_ranked_teams au ON bu.team_id = au.team_id +ORDER BY au.points DESC,au.name; + +-- Using a single Join + +SELECT tp.team_id,tp.name, + DENSE_RANK() OVER (ORDER BY tp.points+pc.points_change DESC,tp.name)- + DENSE_RANK() OVER (ORDER BY tp.points DESC,name) AS rank_diff +FROM team_points_2175 tp +INNER JOIN points_change_2175 pc ON tp.team_id = pc.team_id +ORDER BY tp.points+pc.points_change DESC,tp.name; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `team_id`, `name`, `rank_diff` from `team_points`, `points_change`, `before_update_ranked_teams`, `after_update_ranked_teams`. + +### Result Grain + +One row per team. + +### Step-by-Step Logic + +1. Create CTE layers (`before_update_ranked_teams`, `after_update_ranked_teams`) to decompose the logic into smaller, testable steps before the final SELECT. +2. CTE `before_update_ranked_teams`: reads `team_points`, computes window metrics. +3. CTE `after_update_ranked_teams`: reads `team_points`, `points_change`, joins related entities, computes window metrics. +4. Join the before-update and after-update rankings on `team_id`. Join predicates control row matching and prevent accidental cartesian growth. +5. Use window functions (`... OVER (...)`) to compute rankings/running metrics while preserving row-level detail. +6. Project final output columns: `team_id`, `name`, `rank_diff`. +7. Order output deterministically with `ORDER BY tp.points+pc.points_change DESC,tp.name` (new points descending, then name ascending). + +### Why This Works + +CTEs separate transformation stages, which makes dependencies explicit and easier to validate. Join conditions align related entities so each output row is built from the correct source records. 
Window expressions calculate comparative metrics without collapsing rows too early. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are sorting/grouping, window partitions, join operations. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently. +- Window `PARTITION BY`/`ORDER BY` choices change semantics; test with tied values and nulls. + diff --git a/medium/2175. The Change in Global Rankings (Medium).sql b/medium/2175. The Change in Global Rankings (Medium).sql deleted file mode 100644 index c2b3ae9..0000000 --- a/medium/2175. The Change in Global Rankings (Medium).sql +++ /dev/null @@ -1,25 +0,0 @@ --- More Readable, But Requires two Joins -WITH before_update_ranked_teams AS ( - SELECT *, - DENSE_RANK() OVER (ORDER BY points DESC,name) before_rn - FROM team_points_2175 -), -after_update_ranked_teams AS ( - SELECT tp.team_id,tp.name,tp.points+pc.points_change AS points, - DENSE_RANK() OVER (ORDER BY tp.points+pc.points_change DESC,tp.name) after_rn - FROM team_points_2175 tp - INNER JOIN points_change_2175 pc ON tp.team_id = pc.team_id -) -SELECT au.team_id,au.name,au.points,au.after_rn-bu.before_rn AS rank_diff -FROM before_update_ranked_teams bu -INNER JOIN after_update_ranked_teams au ON bu.team_id = au.team_id -ORDER BY au.points DESC,au.name; - --- Using a single Join - -SELECT tp.team_id,tp.name, - DENSE_RANK() OVER (ORDER BY tp.points+pc.points_change DESC,tp.name)- - DENSE_RANK() OVER (ORDER BY tp.points DESC,name) AS rank_diff -FROM team_points_2175 tp -INNER JOIN points_change_2175 pc ON tp.team_id = pc.team_id -ORDER BY tp.points+pc.points_change DESC,tp.name; diff --git a/medium/2228. Users With Two Purchases Within Seven Days (Medium).md b/medium/2228. 
Users With Two Purchases Within Seven Days (Medium).md new file mode 100644 index 0000000..75f3603 --- /dev/null +++ b/medium/2228. Users With Two Purchases Within Seven Days (Medium).md @@ -0,0 +1,74 @@ +# Question 2228: Users With Two Purchases Within Seven Days + +**LeetCode URL:** https://leetcode.com/problems/users-with-two-purchases-within-seven-days/ + +## Description + +Report the IDs of users who made two different purchases at most seven days apart. Return each qualifying `user_id` once. + +## Table Schema Structure + +```sql +Create table If Not Exists Purchases (purchase_id int, user_id int, purchase_date date); +``` + +## Sample Input Data + +```sql +insert into Purchases (purchase_id, user_id, purchase_date) values ('4', '2', '2022-03-13'); +insert into Purchases (purchase_id, user_id, purchase_date) values ('1', '5', '2022-02-11'); +insert into Purchases (purchase_id, user_id, purchase_date) values ('3', '7', '2022-06-19'); +insert into Purchases (purchase_id, user_id, purchase_date) values ('6', '2', '2022-03-20'); +insert into Purchases (purchase_id, user_id, purchase_date) values ('5', '7', '2022-06-19'); +insert into Purchases (purchase_id, user_id, purchase_date) values ('2', '2', '2022-06-08'); +``` + +## Expected Output Data + +```text ++---------+ +| user_id | ++---------+ +| 2 | +| 7 | ++---------+ +``` + +## SQL Solution + +```sql +SELECT DISTINCT p1.user_id +FROM purchases_2228 p1 +INNER JOIN purchases_2228 p2 +ON p1.purchase_id <> p2.purchase_id AND + p1.user_id = p2.user_id AND + ABS(p1.purchase_date - p2.purchase_date) <= 7 -- date subtraction yields whole days in PostgreSQL +``` + +## Solution Breakdown + +### Goal + +The query builds the final result column `user_id` from a self-join of `purchases` (aliases `p1`, `p2`). + +### Result Grain + +One row per distinct qualifying user. + +### Step-by-Step Logic + +1. Self-join `purchases_2228` on the same user, different purchase IDs, and purchase dates at most seven days apart. Join predicates control row matching and prevent accidental cartesian growth. +2. Project final output columns: `user_id`. +3. 
Remove duplicate result tuples using `DISTINCT` where uniqueness is required. + +### Why This Works + +Join conditions align related entities so each output row is built from the correct source records. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are join operations. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently. +- `EXTRACT(DAY FROM a_date)` returns only the day of the month, so comparing those values breaks across month boundaries; compare whole dates instead (in PostgreSQL, `p1.purchase_date - p2.purchase_date` is an integer number of days). + diff --git a/medium/2228. Users With Two Purchases Within Seven Days (Medium).sql b/medium/2228. Users With Two Purchases Within Seven Days (Medium).sql deleted file mode 100644 index a94bbaf..0000000 --- a/medium/2228. Users With Two Purchases Within Seven Days (Medium).sql +++ /dev/null @@ -1,6 +0,0 @@ -SELECT DISTINCT p1.user_id -FROM purchases_2228 p1 -INNER JOIN purchases_2228 p2 -ON p1.purchase_id <> p2.purchase_id AND - p1.user_id = p2.user_id AND - ABS(EXTRACT(DAY FROM p1.purchase_date)-EXTRACT(DAY FROM p2.purchase_date))<=7 diff --git a/medium/2238. Number of Times a Driver Was a Passenger (Medium).md b/medium/2238. Number of Times a Driver Was a Passenger (Medium).md new file mode 100644 index 0000000..64e86ad --- /dev/null +++ b/medium/2238. Number of Times a Driver Was a Passenger (Medium).md @@ -0,0 +1,73 @@ +# Question 2238: Number of Times a Driver Was a Passenger + +**LeetCode URL:** https://leetcode.com/problems/number-of-times-a-driver-was-a-passenger/ + +## Description + +For each driver, report `driver_id` and `cnt`, the number of rides in which that driver appeared as a passenger (zero if they never were one). 
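As a quick check of the self LEFT JOIN counting approach on the sample rows below, a small sketch using Python's built-in SQLite (SQLite stands in for PostgreSQL; the table name follows the repo's `rides_2238` convention, and an `ORDER BY` is added only to make the output deterministic):

```python
import sqlite3

# Replay the self LEFT JOIN + COUNT(DISTINCT ...) logic on the sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE rides_2238 (ride_id INT, driver_id INT, passenger_id INT);
INSERT INTO rides_2238 VALUES
 (1, 7, 1), (2, 7, 2), (3, 11, 1), (4, 11, 7), (5, 11, 7), (6, 11, 3);
""")
rows = conn.execute("""
SELECT r1.driver_id, COUNT(DISTINCT r2.ride_id) AS cnt
FROM rides_2238 r1
LEFT JOIN rides_2238 r2 ON r1.driver_id = r2.passenger_id
GROUP BY r1.driver_id
ORDER BY r1.driver_id
""").fetchall()
print(rows)  # driver 7 was a passenger in rides 4 and 5; driver 11 never was
```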
+ +## Table Schema Structure + +```sql +Create table If Not Exists Rides (ride_id int, driver_id int, passenger_id int); +``` + +## Sample Input Data + +```sql +insert into Rides (ride_id, driver_id, passenger_id) values ('1', '7', '1'); +insert into Rides (ride_id, driver_id, passenger_id) values ('2', '7', '2'); +insert into Rides (ride_id, driver_id, passenger_id) values ('3', '11', '1'); +insert into Rides (ride_id, driver_id, passenger_id) values ('4', '11', '7'); +insert into Rides (ride_id, driver_id, passenger_id) values ('5', '11', '7'); +insert into Rides (ride_id, driver_id, passenger_id) values ('6', '11', '3'); +``` + +## Expected Output Data + +```text ++-----------+--------+ +| driver_id | cnt | ++-----------+--------+ +| 7 | 2 | +| 11 | 0 | ++-----------+--------+ +``` + +## SQL Solution + +```sql +SELECT r1.driver_id,COUNT(DISTINCT r2.ride_id) AS cnt +FROM rides_2238 r1 +LEFT JOIN rides_2238 r2 ON r1.driver_id = r2.passenger_id +GROUP BY r1.driver_id; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `driver_id`, `cnt` from `rides`. + +### Result Grain + +One row per `driver_id`. + +### Step-by-Step Logic + +1. Self-join `rides_2238` with a LEFT JOIN on `r1.driver_id = r2.passenger_id`, so drivers who were never passengers are kept. Join predicates control row matching and prevent accidental cartesian growth. +2. Aggregate rows with `COUNT(DISTINCT r2.ride_id)` grouped by `r1.driver_id`; unmatched rows contribute NULL, which COUNT ignores, yielding 0. +3. Project final output columns: `driver_id`, `cnt`. + +### Why This Works + +Join conditions align related entities so each output row is built from the correct source records. Grouping keys define the reporting grain; aggregate functions then summarize values at exactly that grain. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are sorting/grouping, join operations. Indexes on join/filter/group keys typically provide the biggest speedup. 
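One LEFT JOIN subtlety worth spelling out: when a driver has no matching passenger rows, `COUNT(*)` still counts the NULL-padded row as 1, while `COUNT(r2.ride_id)` correctly yields 0. A minimal SQLite illustration (the one-row table here is hypothetical, not the repo's data):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE rides (ride_id INT, driver_id INT, passenger_id INT);
-- driver 11 is never a passenger, so the LEFT JOIN produces one NULL-padded row
INSERT INTO rides VALUES (1, 11, 3);
""")
star, col = conn.execute("""
SELECT COUNT(*), COUNT(r2.ride_id)
FROM rides r1
LEFT JOIN rides r2 ON r1.driver_id = r2.passenger_id
""").fetchone()
print(star, col)  # 1 0
```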
+ +### Common Pitfalls + +- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently. +- Every non-aggregated selected column must belong to the grouping grain. + diff --git a/medium/2238. Number of Times a Driver Was a Passenger (Medium).sql b/medium/2238. Number of Times a Driver Was a Passenger (Medium).sql deleted file mode 100644 index a7fec03..0000000 --- a/medium/2238. Number of Times a Driver Was a Passenger (Medium).sql +++ /dev/null @@ -1,4 +0,0 @@ -SELECT r1.driver_id,COUNT(DISTINCT r2.ride_id) AS cnt -FROM rides_2238 r1 -LEFT JOIN rides_2238 r2 ON r1.driver_id = r2.passenger_id -GROUP BY r1.driver_id; diff --git a/medium/2292. Products With Three or More Orders in Two Consecutive Years (Medium).md b/medium/2292. Products With Three or More Orders in Two Consecutive Years (Medium).md new file mode 100644 index 0000000..40287f5 --- /dev/null +++ b/medium/2292. Products With Three or More Orders in Two Consecutive Years (Medium).md @@ -0,0 +1,82 @@ +# Question 2292: Products With Three or More Orders in Two Consecutive Years + +**LeetCode URL:** https://leetcode.com/problems/products-with-three-or-more-orders-in-two-consecutive-years/ + +## Description + +Report the IDs of products that were ordered three or more times in each of two consecutive years. Return each qualifying `product_id` once. 
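The per-year counting plus self-join on adjacent years can be checked on the sample rows below with Python's built-in SQLite (a sketch only: SQLite has no `EXTRACT`, so `strftime('%Y', ...)` stands in for `EXTRACT(year FROM ...)`):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders_2292 (order_id INT, product_id INT, quantity INT, purchase_date TEXT);
INSERT INTO orders_2292 VALUES
 (1, 1, 7, '2020-03-16'), (2, 1, 4, '2020-12-02'), (3, 1, 7, '2020-05-10'),
 (4, 1, 6, '2021-12-23'), (5, 1, 5, '2021-05-21'), (6, 1, 6, '2021-10-11'),
 (7, 2, 6, '2022-10-11');
""")
rows = conn.execute("""
WITH order_counts AS (
    -- product-year pairs with at least three orders
    SELECT product_id,
           CAST(strftime('%Y', purchase_date) AS INT) AS yr,
           COUNT(order_id) AS order_count
    FROM orders_2292
    GROUP BY product_id, yr
    HAVING COUNT(order_id) >= 3
)
SELECT DISTINCT oc1.product_id
FROM order_counts oc1
JOIN order_counts oc2
  ON oc1.product_id = oc2.product_id AND oc1.yr + 1 = oc2.yr
""").fetchall()
print(rows)  # product 1: three orders in 2020 and three in 2021
```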
+ +## Table Schema Structure + +```sql +Create table If Not Exists Orders (order_id int, product_id int, quantity int, purchase_date date); +``` + +## Sample Input Data + +```sql +insert into Orders (order_id, product_id, quantity, purchase_date) values ('1', '1', '7', '2020-03-16'); +insert into Orders (order_id, product_id, quantity, purchase_date) values ('2', '1', '4', '2020-12-02'); +insert into Orders (order_id, product_id, quantity, purchase_date) values ('3', '1', '7', '2020-05-10'); +insert into Orders (order_id, product_id, quantity, purchase_date) values ('4', '1', '6', '2021-12-23'); +insert into Orders (order_id, product_id, quantity, purchase_date) values ('5', '1', '5', '2021-05-21'); +insert into Orders (order_id, product_id, quantity, purchase_date) values ('6', '1', '6', '2021-10-11'); +insert into Orders (order_id, product_id, quantity, purchase_date) values ('7', '2', '6', '2022-10-11'); +``` + +## Expected Output Data + +```text ++------------+ +| product_id | ++------------+ +| 1 | ++------------+ +``` + +## SQL Solution + +```sql +WITH order_counts AS ( + SELECT product_id,EXTRACT(year FROM purchase_date) AS yr,COUNT(order_id) AS order_count + FROM orders_2292 + GROUP BY product_id,EXTRACT(year FROM purchase_date) + HAVING COUNT(order_id) >= 3 + ORDER BY 1,2 +) +SELECT DISTINCT oc1.product_id +FROM order_counts oc1 +INNER JOIN order_counts oc2 +ON oc1.product_id = oc2.product_id AND oc1.yr+1 = oc2.yr; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result column `product_id` from `orders` via the `order_counts` CTE. + +### Result Grain + +One row per qualifying product. + +### Step-by-Step Logic + +1. Create CTE layers (`order_counts`) to decompose the logic into smaller, testable steps before the final SELECT. +2. CTE `order_counts`: groups `orders_2292` by product and `EXTRACT(year FROM purchase_date)`, keeping only product-year groups with `HAVING COUNT(order_id) >= 3`. +3. Self-join `order_counts` on `oc1.product_id = oc2.product_id AND oc1.yr+1 = oc2.yr` to detect consecutive qualifying years. 
Join predicates control row matching and prevent accidental cartesian growth. +4. Project final output columns: `product_id`. +5. Remove duplicate result tuples using `DISTINCT` where uniqueness is required. + +### Why This Works + +CTEs separate transformation stages, which makes dependencies explicit and easier to validate. Join conditions align related entities so each output row is built from the correct source records. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are join operations. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently. + diff --git a/medium/2292. Products With Three or More Orders in Two Consecutive Years (Medium).sql b/medium/2292. Products With Three or More Orders in Two Consecutive Years (Medium).sql deleted file mode 100644 index 1792cc3..0000000 --- a/medium/2292. Products With Three or More Orders in Two Consecutive Years (Medium).sql +++ /dev/null @@ -1,11 +0,0 @@ -WITH order_counts AS ( - SELECT product_id,EXTRACT(year FROM purchase_date) AS yr,COUNT(order_id) AS order_count - FROM orders_2292 - GROUP BY product_id,EXTRACT(year FROM purchase_date) - HAVING COUNT(order_id) >= 3 - ORDER BY 1,2 -) -SELECT DISTINCT oc1.product_id -FROM order_counts oc1 -INNER JOIN order_counts oc2 -ON oc1.product_id = oc2.product_id AND oc1.yr+1 = oc2.yr; diff --git a/medium/2298. Tasks Count in the Weekend (Medium).md new file mode 100644 index 0000000..8d69dec --- /dev/null +++ b/medium/2298. Tasks Count in the Weekend (Medium).md @@ -0,0 +1,70 @@ +# Question 2298: Tasks Count in the Weekend + +**LeetCode URL:** https://leetcode.com/problems/tasks-count-in-the-weekend/ + +## Description + +Report two counts over the `Tasks` table: `weekend_cnt`, the number of tasks submitted on a weekend (Saturday or Sunday), and `working_cnt`, the number submitted on a working day. 
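The conditional-count idea can be checked on the sample rows below with Python's built-in SQLite (a sketch only: SQLite has no `EXTRACT(ISODOW ...)`, so `strftime('%w', ...)` is used, where '0' is Sunday and '6' is Saturday):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tasks_2298 (task_id INT, assignee_id INT, submit_date TEXT);
INSERT INTO tasks_2298 VALUES
 (1, 1, '2022-06-13'), (2, 6, '2022-06-14'), (3, 6, '2022-06-15'),
 (4, 3, '2022-06-18'), (5, 5, '2022-06-19'), (6, 7, '2022-06-19');
""")
weekend, working = conn.execute("""
SELECT
  COUNT(CASE WHEN strftime('%w', submit_date) IN ('0','6') THEN 1 END),
  COUNT(CASE WHEN strftime('%w', submit_date) NOT IN ('0','6') THEN 1 END)
FROM tasks_2298
""").fetchone()
print(weekend, working)  # 2022-06-18 is a Saturday, 2022-06-19 a Sunday
```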
+ +## Table Schema Structure + +```sql +Create table If Not Exists Tasks (task_id int, assignee_id int, submit_date date); +``` + +## Sample Input Data + +```sql +insert into Tasks (task_id, assignee_id, submit_date) values ('1', '1', '2022-06-13'); +insert into Tasks (task_id, assignee_id, submit_date) values ('2', '6', '2022-06-14'); +insert into Tasks (task_id, assignee_id, submit_date) values ('3', '6', '2022-06-15'); +insert into Tasks (task_id, assignee_id, submit_date) values ('4', '3', '2022-06-18'); +insert into Tasks (task_id, assignee_id, submit_date) values ('5', '5', '2022-06-19'); +insert into Tasks (task_id, assignee_id, submit_date) values ('6', '7', '2022-06-19'); +``` + +## Expected Output Data + +```text ++-------------+-------------+ +| weekend_cnt | working_cnt | ++-------------+-------------+ +| 3 | 3 | ++-------------+-------------+ +``` + +## SQL Solution + +```sql +SELECT + COUNT(CASE WHEN EXTRACT(ISODOW FROM submit_date) > 5 THEN 1 ELSE NULL END) AS weekend_cnt, + COUNT(CASE WHEN EXTRACT(ISODOW FROM submit_date) <= 5 THEN 1 ELSE NULL END) AS working_cnt +FROM tasks_2298; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `weekend_cnt` and `working_cnt` from `tasks`. + +### Result Grain + +A single summary row. + +### Step-by-Step Logic + +1. Classify each row with `EXTRACT(ISODOW FROM submit_date)`, where 6 is Saturday and 7 is Sunday. +2. Count each class with conditional aggregation: `COUNT(CASE ... END)` ignores the NULLs produced for non-matching rows. + +### Why This Works + +Conditional aggregation produces both counts in a single scan; the final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost is scanning source rows once; performance usually scales with input size. + +### Common Pitfalls + +- Check edge cases: empty input, single-row input, and duplicate keys. + diff --git a/medium/2298. Tasks Count in the Weekend (Medium).sql b/medium/2298. Tasks Count in the Weekend (Medium).sql deleted file mode 100644 index db32923..0000000 --- a/medium/2298. 
Tasks Count in the Weekend (Medium).sql +++ /dev/null @@ -1,4 +0,0 @@ -SELECT - COUNT(CASE WHEN EXTRACT(ISODOW FROM submit_date) > 5 THEN 1 ELSE NULL END) AS weekend_cnt, - COUNT(CASE WHEN EXTRACT(ISODOW FROM submit_date) <= 5 THEN 1 ELSE NULL END) AS working_cnt -FROM tasks_2298; diff --git a/medium/2308. Arrange Table by Gender (Medium).md b/medium/2308. Arrange Table by Gender (Medium).md new file mode 100644 index 0000000..85f6b22 --- /dev/null +++ b/medium/2308. Arrange Table by Gender (Medium).md @@ -0,0 +1,85 @@ +# Question 2308: Arrange Table by Gender + +**LeetCode URL:** https://leetcode.com/problems/arrange-table-by-gender/ + +## Description + +Rearrange the `Genders` table so that the rows cycle through the genders in the order female, other, male, taking `user_id` in ascending order within each gender. + +## Table Schema Structure + +```sql +Create table If Not Exists Genders (user_id int, gender ENUM('female', 'other', 'male')); +``` + +## Sample Input Data + +```sql +insert into Genders (user_id, gender) values ('4', 'male'); +insert into Genders (user_id, gender) values ('7', 'female'); +insert into Genders (user_id, gender) values ('2', 'other'); +insert into Genders (user_id, gender) values ('5', 'male'); +insert into Genders (user_id, gender) values ('3', 'female'); +insert into Genders (user_id, gender) values ('8', 'male'); +insert into Genders (user_id, gender) values ('6', 'other'); +insert into Genders (user_id, gender) values ('1', 'other'); +insert into Genders (user_id, gender) values ('9', 'female'); +``` + +## Expected Output Data + +```text ++---------+--------+ +| user_id | gender | ++---------+--------+ +| 3 | female | +| 1 | other | +| 4 | male | +| 7 | female | +| 2 | other | +| 5 | male | +| 9 | female | +| 6 | other | +| 8 | male | ++---------+--------+ +``` + +## SQL Solution + +```sql +WITH ranked_genders AS ( + SELECT *, + RANK() OVER (PARTITION BY gender ORDER BY user_id) AS rnk, + CASE WHEN gender = 'female' THEN 0 + WHEN gender = 'other' THEN 1 + ELSE 2 + END AS rnk2 + FROM genders_2308 +) +SELECT user_id,gender +FROM ranked_genders +ORDER BY rnk,rnk2; 
+```
+
+## Solution Breakdown
+
+### Goal
+
+Return `user_id` and `gender` for every row of `genders`, rearranged through the `ranked_genders` CTE.
+
+### Result Grain
+
+One row per row of `genders`; the ranking columns are additive, so no rows are filtered or collapsed.
+
+### Step-by-Step Logic
+
+1. Create a CTE layer (`ranked_genders`) to decompose the logic into smaller, testable steps before the final SELECT.
+2. CTE `ranked_genders`: reads `genders` and computes window metrics.
+3. Use window functions (`... OVER (...)`) to compute per-gender rankings while preserving row-level detail.
+4. Project the final output columns: `user_id`, `gender`.
+5. Order the output deterministically with `ORDER BY rnk,rnk2`.
+
+### Why This Works
+
+CTEs separate transformation stages, which makes dependencies explicit and easier to validate. Window expressions calculate comparative metrics without collapsing rows too early. The final projection exposes only the columns required by the result contract.
+
+### Performance Notes
+
+Primary cost drivers are sorting and window partitions. Indexes on join/filter/group keys typically provide the biggest speedup.
+
+### Common Pitfalls
+
+- Window `PARTITION BY`/`ORDER BY` choices change semantics; test with tied values and nulls.
+
diff --git a/medium/2308. Arrange Table by Gender (Medium).sql b/medium/2308. Arrange Table by Gender (Medium).sql
deleted file mode 100644
index a2d5e37..0000000
--- a/medium/2308. Arrange Table by Gender (Medium).sql
+++ /dev/null
@@ -1,12 +0,0 @@
-WITH ranked_genders AS (
-    SELECT *,
-           RANK() OVER (PARTITION BY gender ORDER BY user_id) AS rnk,
-           CASE WHEN gender = 'female' THEN 0
-                WHEN gender = 'other' THEN 1
-                ELSE 2
-           END AS rnk2
-    FROM genders_2308
-)
-SELECT user_id,gender
-FROM ranked_genders
-ORDER BY rnk,rnk2;
diff --git a/medium/2314. The First Day of the Maximum Recorded Degree in Each City (Medium).md b/medium/2314.
The First Day of the Maximum Recorded Degree in Each City (Medium).md
new file mode 100644
index 0000000..6c29c80
--- /dev/null
+++ b/medium/2314. The First Day of the Maximum Recorded Degree in Each City (Medium).md
@@ -0,0 +1,82 @@
+# Question 2314: The First Day of the Maximum Recorded Degree in Each City
+
+**LeetCode URL:** https://leetcode.com/problems/the-first-day-of-the-maximum-recorded-degree-in-each-city/
+
+## Description
+
+For each city, return the first day on which its maximum recorded degree was observed, ordered by `city_id`. (Description reconstructed from the solution query below.)
+
+## Table Schema Structure
+
+```sql
+Create table If Not Exists Weather (city_id int, day date, degree int);
+```
+
+## Sample Input Data
+
+```sql
+insert into Weather (city_id, day, degree) values ('1', '2022-01-07', '-12');
+insert into Weather (city_id, day, degree) values ('1', '2022-03-07', '5');
+insert into Weather (city_id, day, degree) values ('1', '2022-07-07', '24');
+insert into Weather (city_id, day, degree) values ('2', '2022-08-07', '37');
+insert into Weather (city_id, day, degree) values ('2', '2022-08-17', '37');
+insert into Weather (city_id, day, degree) values ('3', '2022-02-07', '-7');
+insert into Weather (city_id, day, degree) values ('3', '2022-12-07', '-6');
+```
+
+## Expected Output Data
+
+```text
++---------+------------+--------+
+| city_id | day        | degree |
++---------+------------+--------+
+| 1       | 2022-07-07 | 24     |
+| 2       | 2022-08-07 | 37     |
+| 3       | 2022-12-07 | -6     |
++---------+------------+--------+
+```
+
+## SQL Solution
+
+```sql
+WITH ranked AS (
+    SELECT *,
+           RANK() OVER (PARTITION BY city_id ORDER BY degree DESC,day) AS rnk
+    FROM weather_2314
+)
+SELECT city_id,day,degree
+FROM ranked
+WHERE rnk = 1
+ORDER BY city_id;
+```
+
+## Solution Breakdown
+
+### Goal
+
+Return `city_id`, `day`, and `degree` for the first day of each city's maximum recorded degree.
+
+### Result Grain
+
+One row per `city_id`: the earliest day with that city's maximum degree, kept by the `rnk = 1` filter.
+
+### Step-by-Step Logic
+
+1.
Create CTE layers (`ranked`) to decompose the logic into smaller, testable steps before the final SELECT. +2. CTE `ranked`: reads `weather`, computes window metrics. +3. Apply row-level filtering in `WHERE`: rnk = 1. +4. Use window functions (`... OVER (...)`) to compute rankings/running metrics while preserving row-level detail. +5. Project final output columns: `city_id`, `day`, `degree`. +6. Order output deterministically with `ORDER BY city_id`. + +### Why This Works + +CTEs separate transformation stages, which makes dependencies explicit and easier to validate. Window expressions calculate comparative metrics without collapsing rows too early. Predicate filtering removes irrelevant rows before expensive downstream computation. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are sorting/grouping, window partitions. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Window `PARTITION BY`/`ORDER BY` choices change semantics; test with tied values and nulls. +- If nullable columns are involved, validate null-handling explicitly (`IS NULL`, `COALESCE`). + diff --git a/medium/2314. The First Day of the Maximum Recorded Degree in Each City (Medium).sql b/medium/2314. The First Day of the Maximum Recorded Degree in Each City (Medium).sql deleted file mode 100644 index 98e10d5..0000000 --- a/medium/2314. The First Day of the Maximum Recorded Degree in Each City (Medium).sql +++ /dev/null @@ -1,9 +0,0 @@ -WITH ranked AS ( - SELECT *, - RANK() OVER (PARTITION BY city_id ORDER BY degree DESC,day) AS rnk - FROM weather_2314 -) -SELECT city_id,day,degree -FROM ranked -WHERE rnk = 1 -ORDER BY city_id; diff --git a/medium/2324. Product Sales Analysis IV (Medium).md b/medium/2324. Product Sales Analysis IV (Medium).md new file mode 100644 index 0000000..803a948 --- /dev/null +++ b/medium/2324. 
Product Sales Analysis IV (Medium).md @@ -0,0 +1,92 @@
+# Question 2324: Product Sales Analysis IV
+
+**LeetCode URL:** https://leetcode.com/problems/product-sales-analysis-iv/
+
+## Description
+
+For each user, report the product(s) on which that user spent the most money in total (summed quantity times unit price); if several products tie for the highest spend, report all of them. (Description reconstructed from the solution query below.)
+
+## Table Schema Structure
+
+```sql
+Create table If Not Exists Sales (sale_id int, product_id int, user_id int, quantity int);
+Create table If Not Exists Product (product_id int, price int);
+```
+
+## Sample Input Data
+
+```sql
+insert into Sales (sale_id, product_id, user_id, quantity) values ('1', '1', '101', '10');
+insert into Sales (sale_id, product_id, user_id, quantity) values ('2', '3', '101', '7');
+insert into Sales (sale_id, product_id, user_id, quantity) values ('3', '1', '102', '9');
+insert into Sales (sale_id, product_id, user_id, quantity) values ('4', '2', '102', '6');
+insert into Sales (sale_id, product_id, user_id, quantity) values ('5', '3', '102', '10');
+insert into Sales (sale_id, product_id, user_id, quantity) values ('6', '1', '102', '6');
+insert into Product (product_id, price) values ('1', '10');
+insert into Product (product_id, price) values ('2', '25');
+insert into Product (product_id, price) values ('3', '15');
+```
+
+## Expected Output Data
+
+```text
++---------+------------+
+| user_id | product_id |
++---------+------------+
+| 101     | 3          |
+| 102     | 1          |
+| 102     | 2          |
+| 102     | 3          |
++---------+------------+
+```
+
+## SQL Solution
+
+```sql
+WITH grouped_sales AS (
+    SELECT product_id,user_id,SUM(quantity) AS quantity
+    FROM sales_2324
+    GROUP BY
product_id,user_id +), +ranked_sales AS ( + SELECT s.product_id,s.user_id,s.quantity*p.price AS spent, + RANK() OVER (PARTITION BY s.user_id ORDER BY s.quantity*p.price DESC) AS rnk + FROM grouped_sales s + INNER JOIN product_2324 p ON s.product_id = p.product_id +) +SELECT user_id,product_id +FROM ranked_sales +WHERE rnk = 1; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `user_id`, `product_id` from `sales`, `grouped_sales`, `product`, `ranked_sales`. + +### Result Grain + +Row grain follows the post-filter join output. + +### Step-by-Step Logic + +1. Create CTE layers (`grouped_sales`, `ranked_sales`) to decompose the logic into smaller, testable steps before the final SELECT. +2. CTE `grouped_sales`: reads `sales`. +3. CTE `ranked_sales`: reads `grouped_sales`, `product`, joins related entities, computes window metrics. +4. Combine datasets using INNER JOIN, JOIN. Join predicates control row matching and prevent accidental cartesian growth. +5. Apply row-level filtering in `WHERE`: rnk = 1. +6. Use window functions (`... OVER (...)`) to compute rankings/running metrics while preserving row-level detail. +7. Project final output columns: `user_id`, `product_id`. + +### Why This Works + +CTEs separate transformation stages, which makes dependencies explicit and easier to validate. Join conditions align related entities so each output row is built from the correct source records. Window expressions calculate comparative metrics without collapsing rows too early. Predicate filtering removes irrelevant rows before expensive downstream computation. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are window partitions, join operations. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently. 
+- Window `PARTITION BY`/`ORDER BY` choices change semantics; test with tied values and nulls.
+- If nullable columns are involved, validate null-handling explicitly (`IS NULL`, `COALESCE`).
+
diff --git a/medium/2324. Product Sales Analysis IV (Medium).sql b/medium/2324. Product Sales Analysis IV (Medium).sql
deleted file mode 100644
index 48cf6e3..0000000
--- a/medium/2324. Product Sales Analysis IV (Medium).sql
+++ /dev/null
@@ -1,14 +0,0 @@
-WITH grouped_sales AS (
-    SELECT product_id,user_id,SUM(quantity) AS quantity
-    FROM sales_2324
-    GROUP BY product_id,user_id
-),
-ranked_sales AS (
-    SELECT s.product_id,s.user_id,s.quantity*p.price AS spent,
-           RANK() OVER (PARTITION BY s.user_id ORDER BY s.quantity*p.price DESC) AS rnk
-    FROM grouped_sales s
-    INNER JOIN product_2324 p ON s.product_id = p.product_id
-)
-SELECT user_id,product_id
-FROM ranked_sales
-WHERE rnk = 1;
diff --git a/medium/2346. Compute the Rank as a Percentage (Medium).md b/medium/2346. Compute the Rank as a Percentage (Medium).md
new file mode 100644
index 0000000..626dccf
--- /dev/null
+++ b/medium/2346. Compute the Rank as a Percentage (Medium).md
@@ -0,0 +1,71 @@
+# Question 2346: Compute the Rank as a Percentage
+
+**LeetCode URL:** https://leetcode.com/problems/compute-the-rank-as-a-percentage/
+
+## Description
+
+For each student, report `student_id`, `department_id`, and `percentage`: the student's rank within their department by `mark` descending, rescaled as `(rank - 1) * 100 / (students in department - 1)` and rounded to two decimal places, so the top student scores 0 and the bottom student scores 100. (Description reconstructed from the solution query below.)
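The rescaling formula can be sanity-checked by hand before running the query. A minimal plain-Python sketch (the `dept_percentages` helper is hypothetical, written here only to mirror the SQL arithmetic for one department's marks):

```python
def dept_percentages(marks):
    """Mirror the SQL expression (RANK() - 1) * 100 / (COUNT(*) - 1).

    Rank is competition-style (RANK() with ORDER BY mark DESC):
    1 + the number of strictly higher marks, so ties share a rank.
    """
    n = len(marks)
    out = []
    for m in marks:
        rank = 1 + sum(1 for other in marks if other > m)
        out.append((m, round((rank - 1) * 100 / (n - 1), 2)))
    return out

# Department 1 from the sample data: marks 920, 610, 530
print(dept_percentages([920, 610, 530]))  # [(920, 0.0), (610, 50.0), (530, 100.0)]
# Department 2: two tied marks both rank 1, so both map to 0
print(dept_percentages([650, 650]))       # [(650, 0.0), (650, 0.0)]
```

Note the sketch, like the SQL, divides by `n - 1` and therefore fails for a single-student department; the SQL side would need a `NULLIF` guard for that case.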
+
+## Table Schema Structure
+
+```sql
+Create table If Not Exists Students (student_id int, department_id int, mark int);
+```
+
+## Sample Input Data
+
+```sql
+insert into Students (student_id, department_id, mark) values ('2', '2', '650');
+insert into Students (student_id, department_id, mark) values ('8', '2', '650');
+insert into Students (student_id, department_id, mark) values ('7', '1', '920');
+insert into Students (student_id, department_id, mark) values ('1', '1', '610');
+insert into Students (student_id, department_id, mark) values ('3', '1', '530');
+```
+
+## Expected Output Data
+
+```text
++------------+---------------+------------+
+| student_id | department_id | percentage |
++------------+---------------+------------+
+| 2          | 2             | 0.00       |
+| 8          | 2             | 0.00       |
+| 7          | 1             | 0.00       |
+| 1          | 1             | 50.00      |
+| 3          | 1             | 100.00     |
++------------+---------------+------------+
+```
+
+## SQL Solution
+
+```sql
+SELECT student_id,department_id,
+       ROUND((RANK() OVER (PARTITION BY department_id ORDER BY mark DESC)-1)*100/
+             (COUNT(student_id) OVER (PARTITION BY department_id)-1),2) AS percentage
+FROM students_2346;
+```
+
+## Solution Breakdown
+
+### Goal
+
+Return `student_id`, `department_id`, and each student's departmental rank expressed as a `percentage`.
+
+### Result Grain
+
+One row per student in `students`; the window functions add metrics without collapsing rows.
+
+### Step-by-Step Logic
+
+1. Use window functions (`... OVER (...)`) to compute each student's rank and the department's student count while preserving row-level detail.
+2. Derive `percentage` as `(rank - 1) * 100 / (count - 1)`, rounded to two decimal places.
+3. Project the final output columns: `student_id`, `department_id`, `percentage`. The query has no final `ORDER BY`, so row order is unspecified.
+
+### Why This Works
+
+Window expressions calculate comparative metrics without collapsing rows too early. The final projection exposes only the columns required by the result contract.
+
+### Performance Notes
+
+Primary cost drivers are sorting/grouping, window partitions.
Indexes on join/filter/group keys typically provide the biggest speedup.
+
+### Common Pitfalls
+
+- Window `PARTITION BY`/`ORDER BY` choices change semantics; test with tied values and nulls.
+- The divisor `(COUNT(...) - 1)` is zero when a department has a single student, which raises a division-by-zero error; wrap the divisor in `NULLIF(..., 0)` if that case can occur.
+
diff --git a/medium/2346. Compute the Rank as a Percentage (Medium).sql b/medium/2346. Compute the Rank as a Percentage (Medium).sql
deleted file mode 100644
index 9656b35..0000000
--- a/medium/2346. Compute the Rank as a Percentage (Medium).sql
+++ /dev/null
@@ -1,4 +0,0 @@
-SELECT student_id,department_id,
-       ROUND((RANK() OVER (PARTITION BY department_id ORDER BY mark DESC)-1)*100/
-             (COUNT(student_id) OVER (PARTITION BY department_id)-1),2) AS percentage
-FROM students_2346;
diff --git a/medium/2372. Calculate the Influence of Each Salesperson (Medium).md b/medium/2372. Calculate the Influence of Each Salesperson (Medium).md
new file mode 100644
index 0000000..0bb08d7
--- /dev/null
+++ b/medium/2372. Calculate the Influence of Each Salesperson (Medium).md
@@ -0,0 +1,84 @@
+# Question 2372: Calculate the Influence of Each Salesperson
+
+**LeetCode URL:** https://leetcode.com/problems/calculate-the-influence-of-each-salesperson/
+
+## Description
+
+For each salesperson, report `salesperson_id`, `name`, and `total`: the summed price of all sales made to that salesperson's customers, with `0` for salespeople whose customers have no sales. (Description reconstructed from the solution query below.)
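The LEFT JOIN plus COALESCE pattern this solution relies on can be reproduced end to end with Python's built-in `sqlite3` module. A self-contained sketch over the sample data (table names are simplified here, not the repository's `_2372` tables):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE salesperson (salesperson_id INT, name TEXT);
CREATE TABLE customer (customer_id INT, salesperson_id INT);
CREATE TABLE sales (sale_id INT, customer_id INT, price INT);
INSERT INTO salesperson VALUES (1,'Alice'),(2,'Bob'),(3,'Jerry');
INSERT INTO customer VALUES (1,1),(2,1),(3,2);
INSERT INTO sales VALUES (1,2,892),(2,1,354),(3,3,988),(4,3,856);
""")

rows = conn.execute("""
WITH totals AS (
    -- inner join: only customers that actually have sales contribute
    SELECT c.salesperson_id, SUM(s.price) AS total
    FROM customer c
    JOIN sales s ON c.customer_id = s.customer_id
    GROUP BY c.salesperson_id
)
-- left join back to salesperson so zero-sale people survive, COALESCE fills 0
SELECT sp.salesperson_id, sp.name, COALESCE(t.total, 0) AS total
FROM salesperson sp
LEFT JOIN totals t ON sp.salesperson_id = t.salesperson_id
ORDER BY sp.salesperson_id
""").fetchall()
print(rows)  # [(1, 'Alice', 1246), (2, 'Bob', 1844), (3, 'Jerry', 0)]
```

Jerry appears with `0` only because the outer query is a LEFT JOIN; doing the aggregation and the salesperson lookup in one inner join would silently drop him.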
+ +## Table Schema Structure + +```sql +Create table If Not Exists Salesperson (salesperson_id int, name varchar(30)); +Create table If Not Exists Customer (customer_id int, salesperson_id int); +Create table If Not Exists Sales (sale_id int, customer_id int, price int); +``` + +## Sample Input Data + +```sql +insert into Salesperson (salesperson_id, name) values ('1', 'Alice'); +insert into Salesperson (salesperson_id, name) values ('2', 'Bob'); +insert into Salesperson (salesperson_id, name) values ('3', 'Jerry'); +insert into Customer (customer_id, salesperson_id) values ('1', '1'); +insert into Customer (customer_id, salesperson_id) values ('2', '1'); +insert into Customer (customer_id, salesperson_id) values ('3', '2'); +insert into Sales (sale_id, customer_id, price) values ('1', '2', '892'); +insert into Sales (sale_id, customer_id, price) values ('2', '1', '354'); +insert into Sales (sale_id, customer_id, price) values ('3', '3', '988'); +insert into Sales (sale_id, customer_id, price) values ('4', '3', '856'); +``` + +## Expected Output Data + +```text ++----------------+--------+--------+ +| salesperson_id | name | total | ++----------------+--------+--------+ +| sample | sample | sample | ++----------------+--------+--------+ +``` + +## SQL Solution + +```sql +WITH sales AS ( + SELECT c.salesperson_id,SUM(s.price) AS total + FROM customer_2372 c + INNER JOIN sales_2372 s ON c.customer_id = s.customer_id + GROUP BY c.salesperson_id +) +SELECT sp.salesperson_id,sp.name,COALESCE(s.total,0) AS total +FROM salesperson_2372 sp +LEFT JOIN sales s ON sp.salesperson_id = s.salesperson_id; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `salesperson_id`, `name`, `total` from `customer`, `sales`, `salesperson`. + +### Result Grain + +Row grain follows the post-filter join output. + +### Step-by-Step Logic + +1. Create CTE layers (`sales`) to decompose the logic into smaller, testable steps before the final SELECT. +2. 
CTE `sales`: reads `customer`, `sales`, joins related entities. +3. Combine datasets using LEFT JOIN, INNER JOIN, JOIN. Join predicates control row matching and prevent accidental cartesian growth. +4. Project final output columns: `salesperson_id`, `name`, `total`. + +### Why This Works + +CTEs separate transformation stages, which makes dependencies explicit and easier to validate. Join conditions align related entities so each output row is built from the correct source records. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are join operations. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently. + diff --git a/medium/2372. Calculate the Influence of Each Salesperson (Medium).sql b/medium/2372. Calculate the Influence of Each Salesperson (Medium).sql deleted file mode 100644 index ce792b1..0000000 --- a/medium/2372. Calculate the Influence of Each Salesperson (Medium).sql +++ /dev/null @@ -1,9 +0,0 @@ -WITH sales AS ( - SELECT c.salesperson_id,SUM(s.price) AS total - FROM customer_2372 c - INNER JOIN sales_2372 s ON c.customer_id = s.customer_id - GROUP BY c.salesperson_id -) -SELECT sp.salesperson_id,sp.name,COALESCE(s.total,0) AS total -FROM salesperson_2372 sp -LEFT JOIN sales s ON sp.salesperson_id = s.salesperson_id; diff --git a/medium/2388. Change Null Values in a Table to the Previous Value (Medium).md b/medium/2388. Change Null Values in a Table to the Previous Value (Medium).md new file mode 100644 index 0000000..0a1f2f0 --- /dev/null +++ b/medium/2388. 
Change Null Values in a Table to the Previous Value (Medium).md @@ -0,0 +1,84 @@
+# Question 2388: Change Null Values in a Table to the Previous Value
+
+**LeetCode URL:** https://leetcode.com/problems/change-null-values-in-a-table-to-the-previous-value/
+
+## Description
+
+Replace every NULL `drink` with the most recent non-NULL `drink` value that precedes it in the table's insertion order, keeping all rows. (Description reconstructed from the solution query below.)
+
+## Table Schema Structure
+
+```sql
+Create table If Not Exists CoffeeShop (id int, drink varchar(20));
+```
+
+## Sample Input Data
+
+```sql
+insert into CoffeeShop (id, drink) values ('9', 'Rum and Coke');
+insert into CoffeeShop (id, drink) values ('6', NULL);
+insert into CoffeeShop (id, drink) values ('7', NULL);
+insert into CoffeeShop (id, drink) values ('3', 'St Germain Spritz');
+insert into CoffeeShop (id, drink) values ('1', 'Orange Margarita');
+insert into CoffeeShop (id, drink) values ('2', NULL);
+```
+
+## Expected Output Data
+
+```text
++----+-------------------+
+| id | drink             |
++----+-------------------+
+| 9  | Rum and Coke      |
+| 6  | Rum and Coke      |
+| 7  | Rum and Coke      |
+| 3  | St Germain Spritz |
+| 1  | Orange Margarita  |
+| 2  | Orange Margarita  |
++----+-------------------+
+```
+
+## SQL Solution
+
+```sql
+WITH flagged_coffee AS (
+    SELECT *,
+           ROW_NUMBER() OVER () AS rn,
+           CASE WHEN drink IS NOT NULL THEN 1 ELSE 0 END AS null_flag
+    FROM coffee_shop_2388
+),
+running_sum AS (
+    SELECT *,
+           SUM(null_flag) OVER (ORDER BY rn) AS rsum
+    FROM flagged_coffee
+)
+SELECT id,
+       FIRST_VALUE(drink) OVER (PARTITION BY rsum) AS drink
+FROM running_sum;
+```
+
+## Solution Breakdown
+
+### Goal
+
+Return every `id` with its `drink`, where NULL drinks are filled from the nearest preceding non-NULL row.
+
+### Result Grain
+
+One row per row of `coffee_shop`; NULL drinks are filled in, not removed.
+
+### Step-by-Step Logic
+
+1. Create CTE layers (`flagged_coffee`, `running_sum`) to decompose the logic into smaller, testable steps before the final SELECT.
+2.
CTE `flagged_coffee`: reads `coffee_shop`, computes window metrics. +3. CTE `running_sum`: reads `flagged_coffee`, computes window metrics. +4. Use window functions (`... OVER (...)`) to compute rankings/running metrics while preserving row-level detail. +5. Project final output columns: `id`, `drink`. + +### Why This Works + +CTEs separate transformation stages, which makes dependencies explicit and easier to validate. Window expressions calculate comparative metrics without collapsing rows too early. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are window partitions. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Window `PARTITION BY`/`ORDER BY` choices change semantics; test with tied values and nulls. + diff --git a/medium/2388. Change Null Values in a Table to the Previous Value (Medium).sql b/medium/2388. Change Null Values in a Table to the Previous Value (Medium).sql deleted file mode 100644 index eccf7c1..0000000 --- a/medium/2388. Change Null Values in a Table to the Previous Value (Medium).sql +++ /dev/null @@ -1,14 +0,0 @@ -WITH flagged_coffee AS ( - SELECT *, - ROW_NUMBER() OVER () AS rn, - CASE WHEN drink IS NOT NULL THEN 1 ELSE 0 END AS null_flag - FROM coffee_shop_2388 -), -running_sum AS ( - SELECT *, - SUM(null_flag) OVER (ORDER BY rn) AS rsum - FROM flagged_coffee -) -SELECT id, - FIRST_VALUE(drink) OVER (PARTITION BY rsum) AS drink -FROM running_sum; diff --git a/medium/2394. Employees With Deductions (Medium).md b/medium/2394. Employees With Deductions (Medium).md new file mode 100644 index 0000000..1d31993 --- /dev/null +++ b/medium/2394. 
Employees With Deductions (Medium).md @@ -0,0 +1,78 @@
+# Question 2394: Employees With Deductions
+
+**LeetCode URL:** https://leetcode.com/problems/employees-with-deductions/
+
+## Description
+
+Report the IDs of employees whose total worked hours, summed across all of their log entries, are strictly less than their `needed_hours`. In this solution each entry's seconds are rounded up to a full minute and the entry is then truncated to whole hours; employees with no log entries count as 0 hours. (Description reconstructed from the solution query below.)
+
+## Table Schema Structure
+
+```sql
+Create table If Not Exists Employees (employee_id int, needed_hours int);
+Create table If Not Exists Logs (employee_id int, in_time datetime, out_time datetime);
+```
+
+## Sample Input Data
+
+```sql
+insert into Employees (employee_id, needed_hours) values ('1', '20');
+insert into Employees (employee_id, needed_hours) values ('2', '12');
+insert into Employees (employee_id, needed_hours) values ('3', '2');
+insert into Logs (employee_id, in_time, out_time) values ('1', '2022-10-01 09:00:00', '2022-10-01 17:00:00');
+insert into Logs (employee_id, in_time, out_time) values ('1', '2022-10-06 09:05:04', '2022-10-06 17:09:03');
+insert into Logs (employee_id, in_time, out_time) values ('1', '2022-10-12 23:00:00', '2022-10-13 03:00:01');
+insert into Logs (employee_id, in_time, out_time) values ('2', '2022-10-29 12:00:00', '2022-10-29 23:58:58');
+```
+
+## Expected Output Data
+
+```text
++-------------+
+| employee_id |
++-------------+
+| 2           |
+| 3           |
++-------------+
+```
+
+## SQL Solution
+
+```sql
+SELECT e.employee_id
+FROM employees_2394 e
+LEFT JOIN logs_2394 l ON e.employee_id = l.employee_id
+GROUP BY e.employee_id,e.needed_hours
+HAVING COALESCE(SUM(EXTRACT(hour FROM (out_time-in_time))+
+
FLOOR((EXTRACT(minute FROM (out_time-in_time)) + CEIL(EXTRACT(second FROM (out_time-in_time))/60))/60)),0) < e.needed_hours +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `employee_id` from `employees`, `logs`. + +### Result Grain + +One row per unique key in `GROUP BY e.employee_id,e.needed_hours`. + +### Step-by-Step Logic + +1. Combine datasets using LEFT JOIN, JOIN. Join predicates control row matching and prevent accidental cartesian growth. +2. Aggregate rows with SUM grouped by e.employee_id,e.needed_hours. +3. Project final output columns: `employee_id`. +4. Filter aggregated groups in `HAVING`: COALESCE(SUM(EXTRACT(hour FROM (out_time-in_time))+ FLOOR((EXTRACT(minute FROM (out_time-in_time)) + CEIL(EXTRACT(second FROM (out_time-in_time))/60))/60)),0) < e.needed_hours. + +### Why This Works + +Join conditions align related entities so each output row is built from the correct source records. Grouping keys define the reporting grain; aggregate functions then summarize values at exactly that grain. `HAVING` ensures only groups that satisfy business thresholds survive. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are sorting/grouping, join operations. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently. +- Every non-aggregated selected column must belong to the grouping grain. + diff --git a/medium/2394. Employees With Deductions (Medium).sql b/medium/2394. Employees With Deductions (Medium).sql deleted file mode 100644 index d4a13a7..0000000 --- a/medium/2394. 
Employees With Deductions (Medium).sql +++ /dev/null @@ -1,6 +0,0 @@
-SELECT e.employee_id
-FROM employees_2394 e
-LEFT JOIN logs_2394 l ON e.employee_id = l.employee_id
-GROUP BY e.employee_id,e.needed_hours
-HAVING COALESCE(SUM(EXTRACT(hour FROM (out_time-in_time))+
-       FLOOR((EXTRACT(minute FROM (out_time-in_time)) + CEIL(EXTRACT(second FROM (out_time-in_time))/60))/60)),0) < e.needed_hours
diff --git a/medium/534. Game Play Analysis III.md b/medium/534. Game Play Analysis III.md
new file mode 100644
index 0000000..ec5de8a
--- /dev/null
+++ b/medium/534. Game Play Analysis III.md
@@ -0,0 +1,83 @@
+# Question 534: Game Play Analysis III
+
+**LeetCode URL:** https://leetcode.com/problems/game-play-analysis-iii/
+
+## Description
+
+Write a query that reports, for each player and each of their login dates, the total number of games played so far, i.e. the running total of `games_played` up to and including that date. The query result format is in the following example:
+
+Activity table:
+```text
++-----------+-----------+------------+--------------+
+| player_id | device_id | event_date | games_played |
++-----------+-----------+------------+--------------+
+| 1         | 2         | 2016-03-01 | 5            |
+| 1         | 2         | 2016-05-02 | 6            |
+| 1         | 3         | 2017-06-25 | 1            |
+| 3         | 1         | 2016-03-02 | 0            |
+| 3         | 4         | 2018-07-03 | 5            |
++-----------+-----------+------------+--------------+
+```
+
+Result table:
+```text
++-----------+------------+---------------------+
+| player_id | event_date | games_played_so_far |
++-----------+------------+---------------------+
+| 1         | 2016-03-01 | 5                   |
+| 1         | 2016-05-02 | 11                  |
+| 1         | 2017-06-25 | 12                  |
+| 3         | 2016-03-02 | 0                   |
+| 3         | 2018-07-03 | 5                   |
++-----------+------------+---------------------+
+```
+
+For the player with id 1, 5 + 6 = 11 games played by 2016-05-02, and 5 + 6 + 1 = 12 games played by 2017-06-25.
+
+## Table Schema Structure
+
+```sql
+Create table If Not Exists Activity (player_id int, device_id int, event_date date, games_played int);
+```
+
+## Sample Input Data
+
+```sql
+insert into Activity (player_id, device_id, event_date, games_played) values ('1', '2', '2016-03-01', '5');
+insert into Activity (player_id, device_id, event_date, games_played) values ('1', '2', '2016-05-02', '6');
+insert into Activity (player_id, device_id, event_date, games_played) values ('1', '3', '2017-06-25', '1');
+insert into Activity (player_id, device_id, event_date, games_played) values ('3', '1', '2016-03-02', '0');
+insert into Activity (player_id, device_id, event_date, games_played) values ('3', '4', '2018-07-03', '5');
+```
+
+## Expected Output Data
+
+```text
++-----------+------------+---------------------+
+| player_id | event_date | games_played_so_far |
++-----------+------------+---------------------+
+| 1         | 2016-03-01 | 5                   |
+| 1         | 2016-05-02 | 11                  |
+| 1         | 2017-06-25 | 12                  |
+| 3         | 2016-03-02 | 0                   |
+| 3         | 2018-07-03 | 5                   |
++-----------+------------+---------------------+
+```
+
+## SQL Solution
+
+```sql
+SELECT a.player_id,a.event_date,SUM(b.games_played)
+FROM activity_534 a
+JOIN activity_534 b ON a.player_id = b.player_id AND a.event_date >= b.event_date
+GROUP BY a.player_id,a.event_date
+ORDER BY 1,2;
+
+(OR)
+
+SELECT player_id,event_date,
+       SUM(games_played) OVER w AS games_played
+FROM activity_534
+WINDOW w AS (PARTITION BY player_id ORDER BY event_date);
+```
+
+## Solution Breakdown
+
+### Goal
+
+Return, for each player and login date, the cumulative number of games played up to and including that date.
+
+### Result Grain
+
+One row per (`player_id`, `event_date`) pair in `activity`.
+
+### Step-by-Step Logic
+
+1. Self-join `activity` on `player_id` with `a.event_date >= b.event_date`, so each row is matched with all of that player's rows up to its date (the alternative query achieves the same with a cumulative window `SUM`).
+2. Aggregate with `SUM(b.games_played)` grouped by `a.player_id,a.event_date` to produce the running total.
+3. Order output deterministically with `ORDER BY 1,2` (player, then date).
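The window-function variant above can be exercised directly with Python's bundled `sqlite3` module (window functions need SQLite 3.25+, which ships with CPython 3.8 and later); a minimal sketch over the sample rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE activity (player_id INT, device_id INT, event_date TEXT, games_played INT)"
)
conn.executemany(
    "INSERT INTO activity VALUES (?,?,?,?)",
    [(1, 2, "2016-03-01", 5), (1, 2, "2016-05-02", 6), (1, 3, "2017-06-25", 1),
     (3, 1, "2016-03-02", 0), (3, 4, "2018-07-03", 5)],
)

rows = conn.execute("""
SELECT player_id, event_date,
       -- ORDER BY inside OVER turns SUM into a running total per player
       SUM(games_played) OVER (PARTITION BY player_id ORDER BY event_date)
           AS games_played_so_far
FROM activity
ORDER BY player_id, event_date
""").fetchall()
print(rows)
# [(1, '2016-03-01', 5), (1, '2016-05-02', 11), (1, '2017-06-25', 12),
#  (3, '2016-03-02', 0), (3, '2018-07-03', 5)]
```

The running total comes from the default window frame that an `ORDER BY` in the `OVER` clause implies (everything from the partition start through the current row's peers), so no explicit frame clause is needed here.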
+ +### Why This Works + +Join conditions align related entities so each output row is built from the correct source records. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are sorting/grouping, join operations. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently. + diff --git a/medium/534. Game Play Analysis III.sql b/medium/534. Game Play Analysis III.sql deleted file mode 100644 index a692464..0000000 --- a/medium/534. Game Play Analysis III.sql +++ /dev/null @@ -1,12 +0,0 @@ -SELECT a.player_id,a.event_date,SUM(b.games_played) -FROM activity_534 a -JOIN activity_534 b ON a.player_id = b.player_id AND a.event_date >= b.event_date -GROUP BY a.player_id,a.event_date -ORDER BY 1,2; - -(OR) - -SELECT player_id,event_date, - SUM(games_played) OVER w AS games_played -FROM activity_534 -WINDOW w AS (PARTITION BY player_id ORDER BY event_date); diff --git a/medium/550. Game Play Analysis IV.md b/medium/550. Game Play Analysis IV.md new file mode 100644 index 0000000..7fbf29d --- /dev/null +++ b/medium/550. Game Play Analysis IV.md @@ -0,0 +1,68 @@ +# Question 550: Game Play Analysis IV + +**LeetCode URL:** https://leetcode.com/problems/game-play-analysis-iv/ + +## Description + +Write a solution to report the fraction of players that logged in again on the day after the day they first logged in, rounded to 2 decimal places. The result format is in the following example. 
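The "fraction" described above can be prototyped in plain Python before writing SQL. A rough sketch over the sample data (hand-coded login dates; note it follows the statement strictly by checking the day after each player's *first* login):

```python
from datetime import date, timedelta

# player_id -> login dates, taken from the sample Activity rows
logins = {
    1: [date(2016, 3, 1), date(2016, 3, 2)],
    2: [date(2017, 6, 25)],
    3: [date(2016, 3, 2), date(2018, 7, 3)],
}

# A player counts as retained if they logged in the day after their first login.
retained = sum(
    1 for days in logins.values() if min(days) + timedelta(days=1) in days
)
fraction = round(retained / len(logins), 2)
print(fraction)  # 0.33
```

Only player 1 logged in the day after their first login, so the fraction is 1/3, rounded to 0.33.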
+
+## Table Schema Structure
+
+```sql
+Create table If Not Exists Activity (player_id int, device_id int, event_date date, games_played int);
+```
+
+## Sample Input Data
+
+```sql
+insert into Activity (player_id, device_id, event_date, games_played) values ('1', '2', '2016-03-01', '5');
+insert into Activity (player_id, device_id, event_date, games_played) values ('1', '2', '2016-03-02', '6');
+insert into Activity (player_id, device_id, event_date, games_played) values ('2', '3', '2017-06-25', '1');
+insert into Activity (player_id, device_id, event_date, games_played) values ('3', '1', '2016-03-02', '0');
+insert into Activity (player_id, device_id, event_date, games_played) values ('3', '4', '2018-07-03', '5');
+```
+
+## Expected Output Data
+
+```text
++----------+
+| fraction |
++----------+
+| 0.33     |
++----------+
+```
+
+## SQL Solution
+
+```sql
+SELECT ROUND(COUNT(DISTINCT b.player_id)::NUMERIC/COUNT(DISTINCT a.player_id),2)
+FROM activity_550 a
+LEFT JOIN activity_550 b ON a.player_id = b.player_id AND a.event_date + 1 = b.event_date;
+```
+
+## Solution Breakdown
+
+### Goal
+
+Return a single value, `fraction`: the share of players who logged in again the day after a login.
+
+### Result Grain
+
+One summary row for the whole table.
+
+### Step-by-Step Logic
+
+1. Self LEFT JOIN `activity` on `player_id` with `a.event_date + 1 = b.event_date`, so a `b` row exists exactly when a player logged in on two consecutive days.
+2. Divide `COUNT(DISTINCT b.player_id)` (players with a matched next-day login) by `COUNT(DISTINCT a.player_id)` (all players), casting to `NUMERIC` to avoid integer division.
+3. Round the result to 2 decimal places.
+
+### Why This Works
+
+Join conditions align related entities so each output row is built from the correct source records. The LEFT JOIN keeps every player in the denominator even when no next-day login exists.
+
+### Performance Notes
+
+Primary cost drivers are join operations. Indexes on join/filter/group keys typically provide the biggest speedup.
+
+### Common Pitfalls
+
+- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently.
+- As written, the query counts players with logins on any two consecutive days; the problem asks specifically about the day after each player's first login, which would require joining on the player's `MIN(event_date) + 1` instead.
+
diff --git a/medium/550. Game Play Analysis IV.sql b/medium/550. Game Play Analysis IV.sql
deleted file mode 100644
index c4cfe44..0000000
--- a/medium/550.
Game Play Analysis IV.sql +++ /dev/null @@ -1,3 +0,0 @@ -SELECT ROUND(COUNT(DISTINCT b.player_id)::NUMERIC/COUNT(DISTINCT a.player_id),2) -FROM activity_550 a -LEFT JOIN activity_550 b ON a.player_id = b.player_id AND a.event_date + 1 = b.event_date; diff --git a/medium/570. Managers with at Least 5 Direct Reports.md b/medium/570. Managers with at Least 5 Direct Reports.md new file mode 100644 index 0000000..9113193 --- /dev/null +++ b/medium/570. Managers with at Least 5 Direct Reports.md @@ -0,0 +1,79 @@ +# Question 570: Managers with at Least 5 Direct Reports + +**LeetCode URL:** https://leetcode.com/problems/managers-with-at-least-5-direct-reports/ + +## Description + +The Employee table holds all employees including their managers. Every employee has an Id, and there is also a column for the manager Id. Given the Employee table, write a SQL query that finds out managers with at least 5 direct reports. + +## Table Schema Structure + +```sql +Create table If Not Exists Employee (id int, name varchar(255), department varchar(255), managerId int); +``` + +## Sample Input Data + +```sql +insert into Employee (id, name, department, managerId) values ('101', 'John', 'A', NULL); +insert into Employee (id, name, department, managerId) values ('102', 'Dan', 'A', '101'); +insert into Employee (id, name, department, managerId) values ('103', 'James', 'A', '101'); +insert into Employee (id, name, department, managerId) values ('104', 'Amy', 'A', '101'); +insert into Employee (id, name, department, managerId) values ('105', 'Anne', 'A', '101'); +insert into Employee (id, name, department, managerId) values ('106', 'Ron', 'B', '101'); +``` + +## Expected Output Data + +```text ++------+ +| name | ++------+ +| John | ++------+ +``` + +## SQL Solution + +```sql +WITH managers AS( + SELECT manager_id + FROM employee_570 + GROUP BY manager_id + HAVING COUNT(manager_id)>=5 +) + +SELECT name +FROM employee_570 +WHERE id IN (SELECT * FROM managers); +``` + 
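The CTE above can be exercised without a PostgreSQL install using Python's built-in sqlite3; the table name is simplified to `employee` here (the repository uses `employee_570`):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE employee (id INT, name TEXT, department TEXT, manager_id INT)")
con.executemany("INSERT INTO employee VALUES (?, ?, ?, ?)", [
    (101, "John", "A", None), (102, "Dan", "A", 101),
    (103, "James", "A", 101), (104, "Amy", "A", 101),
    (105, "Anne", "A", 101), (106, "Ron", "B", 101),
])

# Same shape as the solution above: the CTE keeps manager ids with
# at least five direct reports; the outer query resolves their names.
rows = con.execute("""
    WITH managers AS (
        SELECT manager_id
        FROM employee
        GROUP BY manager_id
        HAVING COUNT(manager_id) >= 5
    )
    SELECT name FROM employee WHERE id IN (SELECT * FROM managers)
""").fetchall()
print(rows)  # [('John',)]
```

Note that the `NULL` manager_id group never passes the `HAVING` clause, because `COUNT(manager_id)` ignores NULLs.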
+## Solution Breakdown + +### Goal + +The query builds the final result columns `name` from `employee`, `managers`. + +### Result Grain + +One row per employee whose `id` qualifies as a manager. + +### Step-by-Step Logic + +1. Create CTE layers (`managers`) to decompose the logic into smaller, testable steps before the final SELECT. +2. CTE `managers`: reads `employee`. +3. Apply row-level filtering in `WHERE`: id IN (SELECT * FROM managers). +4. Project final output columns: `name`. + +### Why This Works + +CTEs separate transformation stages, which makes dependencies explicit and easier to validate. Predicate filtering removes irrelevant rows before expensive downstream computation. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are subquery execution. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- If nullable columns are involved, validate null-handling explicitly (`IS NULL`, `COALESCE`). + diff --git a/medium/570. Managers with at Least 5 Direct Reports.sql b/medium/570. Managers with at Least 5 Direct Reports.sql deleted file mode 100644 index bc93075..0000000 --- a/medium/570. Managers with at Least 5 Direct Reports.sql +++ /dev/null @@ -1,10 +0,0 @@ -WITH managers AS( - SELECT manager_id - FROM employee_570 - GROUP BY manager_id - HAVING COUNT(manager_id)>=5 -) - -SELECT name -FROM employee_570 -WHERE id IN (SELECT * FROM managers); diff --git a/medium/574. Winning Candidate.md b/medium/574. Winning Candidate.md new file mode 100644 index 0000000..41114ca --- /dev/null +++ b/medium/574. Winning Candidate.md @@ -0,0 +1,84 @@ +# Question 574: Winning Candidate + +**LeetCode URL:** https://leetcode.com/problems/winning-candidate/ + +## Description + +Report the name of the winning candidate, i.e., the candidate who received the highest number of votes. In the example data, the winner is B. 
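A quick sqlite3 sketch of the vote-counting approach (table names simplified to `candidate` and `vote`; illustrative only):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE candidate (id INT, name TEXT)")
con.execute("CREATE TABLE vote (id INT, candidate_id INT)")
con.executemany("INSERT INTO candidate VALUES (?, ?)",
                [(1, "A"), (2, "B"), (3, "C"), (4, "D"), (5, "E")])
con.executemany("INSERT INTO vote VALUES (?, ?)",
                [(1, 2), (2, 4), (3, 3), (4, 2), (5, 5)])

# The subquery ranks candidates by vote count and keeps only the top one
winner = con.execute("""
    SELECT name FROM candidate
    WHERE id IN (SELECT candidate_id FROM vote
                 GROUP BY candidate_id
                 ORDER BY COUNT(candidate_id) DESC
                 LIMIT 1)
""").fetchone()[0]
print(winner)  # B
```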
+ +## Table Schema Structure + +```sql +Create table If Not Exists Candidate (id int, name varchar(255)); +Create table If Not Exists Vote (id int, candidateId int); +``` + +## Sample Input Data + +```sql +insert into Candidate (id, name) values ('1', 'A'); +insert into Candidate (id, name) values ('2', 'B'); +insert into Candidate (id, name) values ('3', 'C'); +insert into Candidate (id, name) values ('4', 'D'); +insert into Candidate (id, name) values ('5', 'E'); +insert into Vote (id, candidateId) values ('1', '2'); +insert into Vote (id, candidateId) values ('2', '4'); +insert into Vote (id, candidateId) values ('3', '3'); +insert into Vote (id, candidateId) values ('4', '2'); +insert into Vote (id, candidateId) values ('5', '5'); +``` + +## Expected Output Data + +```text ++--------------+ +| candidate_id | ++--------------+ +| sample | ++--------------+ +``` + +## SQL Solution + +```sql +SELECT name +FROM candidate_574 +WHERE id IN ( + SELECT candidate_id + FROM vote_574 + GROUP BY candidate_id + ORDER BY COUNT(candidate_id) DESC + LIMIT 1 +); +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `name` from `candidate`, `vote`. + +### Result Grain + +One row per unique key in `GROUP BY candidate_id`. + +### Step-by-Step Logic + +1. Apply row-level filtering in `WHERE`: id IN ( SELECT candidate_id FROM vote_574. +2. Aggregate rows with COUNT grouped by candidate_id. +3. Project final output columns: `name`. +4. Order output deterministically with `ORDER BY COUNT(candidate_id) DESC`. + +### Why This Works + +Grouping keys define the reporting grain; aggregate functions then summarize values at exactly that grain. Predicate filtering removes irrelevant rows before expensive downstream computation. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are sorting/grouping, subquery execution. 
Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Every non-aggregated selected column must belong to the grouping grain. +- If nullable columns are involved, validate null-handling explicitly (`IS NULL`, `COALESCE`). + diff --git a/medium/574. Winning Candidate.sql b/medium/574. Winning Candidate.sql deleted file mode 100644 index 2c0d600..0000000 --- a/medium/574. Winning Candidate.sql +++ /dev/null @@ -1,9 +0,0 @@ -SELECT name -FROM candidate_574 -WHERE id IN ( - SELECT candidate_id - FROM vote_574 - GROUP BY candidate_id - ORDER BY COUNT(candidate_id) DESC - LIMIT 1 -); diff --git a/medium/578. Get Highest Answer Rate Question.md b/medium/578. Get Highest Answer Rate Question.md new file mode 100644 index 0000000..dd5fa09 --- /dev/null +++ b/medium/578. Get Highest Answer Rate Question.md @@ -0,0 +1,71 @@ +# Question 578: Get Highest Answer Rate Question + +**LeetCode URL:** https://leetcode.com/problems/get-highest-answer-rate-question/ + +## Description + +Get the highest answer rate question from a table survey_log with these columns: uid, action, question_id, answer_id, q_num, timestamp. 
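The answer-rate ranking can be prototyped in plain Python first. One caveat worth knowing: in PostgreSQL, dividing two `COUNT` results performs integer division, which happens to rank this sample correctly but loses precision; Python's `/` divides exactly.

```python
from collections import Counter

# (action, question_id) events from the sample data
events = [("show", 285), ("answer", 285), ("show", 369), ("skip", 369)]

shows = Counter(q for action, q in events if action == "show")
answers = Counter(q for action, q in events if action == "answer")

# Highest answers-per-show ratio wins (every question here is shown at least once)
best = max(shows, key=lambda q: answers[q] / shows[q])
print(best)  # 285
```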
+ +## Table Schema Structure + +```sql +Create table If Not Exists SurveyLog (id int, action varchar(255), question_id int, answer_id int, q_num int, timestamp int); +``` + +## Sample Input Data + +```sql +insert into SurveyLog (id, action, question_id, answer_id, q_num, timestamp) values ('5', 'show', '285', NULL, '1', '123'); +insert into SurveyLog (id, action, question_id, answer_id, q_num, timestamp) values ('5', 'answer', '285', '124124', '1', '124'); +insert into SurveyLog (id, action, question_id, answer_id, q_num, timestamp) values ('5', 'show', '369', NULL, '2', '125'); +insert into SurveyLog (id, action, question_id, answer_id, q_num, timestamp) values ('5', 'skip', '369', NULL, '2', '126'); +``` + +## Expected Output Data + +```text ++-------------+ +| survey_log | ++-------------+ +| 285 | ++-------------+ +``` + +## SQL Solution + +```sql +SELECT question_id +FROM surveylog_578 +GROUP BY question_id +ORDER BY COUNT(CASE WHEN action='answer' THEN question_id ELSE NULL END)/COUNT(CASE WHEN action='show' THEN question_id ELSE NULL END) DESC +LIMIT 1; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `question_id` from `surveylog`. + +### Result Grain + +One row per unique key in `GROUP BY question_id`. + +### Step-by-Step Logic + +1. Aggregate rows with COUNT grouped by question_id. +2. Project final output columns: `question_id`. +3. Order output deterministically with `ORDER BY COUNT(CASE WHEN action='answer' THEN question_id ELSE NULL END)/COUNT(CASE WHEN action='show' THEN question_id ELSE NULL END) DESC`. + +### Why This Works + +Grouping keys define the reporting grain; aggregate functions then summarize values at exactly that grain. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are sorting/grouping. Indexes on join/filter/group keys typically provide the biggest speedup. 
+ +### Common Pitfalls + +- Every non-aggregated selected column must belong to the grouping grain. + diff --git a/medium/578. Get Highest Answer Rate Question.sql b/medium/578. Get Highest Answer Rate Question.sql deleted file mode 100644 index 99d478b..0000000 --- a/medium/578. Get Highest Answer Rate Question.sql +++ /dev/null @@ -1,5 +0,0 @@ -SELECT question_id -FROM surveylog_578 -GROUP BY question_id -ORDER BY COUNT(CASE WHEN action='answer' THEN question_id ELSE NULL END)/COUNT(CASE WHEN action='show' THEN question_id ELSE NULL END) DESC -LIMIT 1; diff --git a/medium/580. Count Student Number in Departments.md b/medium/580. Count Student Number in Departments.md new file mode 100644 index 0000000..7f74c82 --- /dev/null +++ b/medium/580. Count Student Number in Departments.md @@ -0,0 +1,81 @@ +# Question 580: Count Student Number in Departments + +**LeetCode URL:** https://leetcode.com/problems/count-student-number-in-departments/ + +## Description + +A university uses 2 data tables, student and department, to store data about its students and the departments associated with each major. 
Write a query to print the respective department name and number of students majoring in each department, including departments with no current students. + +## Table Schema Structure + +```sql +Create table If Not Exists Student (student_id int,student_name varchar(45), gender varchar(6), dept_id int); +Create table If Not Exists Department (dept_id int, dept_name varchar(255)); +``` + +## Sample Input Data + +```sql +insert into Student (student_id, student_name, gender, dept_id) values ('1', 'Jack', 'M', '1'); +insert into Student (student_id, student_name, gender, dept_id) values ('2', 'Jane', 'F', '1'); +insert into Student (student_id, student_name, gender, dept_id) values ('3', 'Mark', 'M', '2'); +insert into Department (dept_id, dept_name) values ('1', 'Engineering'); +insert into Department (dept_id, dept_name) values ('2', 'Science'); +insert into Department (dept_id, dept_name) values ('3', 'Law'); +``` + +## Expected Output Data + +```text ++-------------+---------------+ +| dept_name | student_count | ++-------------+---------------+ +| Engineering | 2 | +| Science | 1 | +| Law | 0 | ++-------------+---------------+ +``` + +## SQL Solution + +```sql +WITH cte AS( + SELECT dept_id,COUNT(*) AS student_count + FROM student_580 + GROUP BY dept_id +) + +SELECT dept_name,COALESCE(student_count,0) AS student_count +FROM department_580 d +LEFT JOIN cte c ON c.dept_id = d.dept_id +ORDER BY student_count DESC,dept_name ASC; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `dept_name`, `student_count` from `student`, `department`, `cte`. + +### Result Grain + +One row per department. + +### Step-by-Step Logic + +1. Create CTE layers (`cte`) to decompose the logic into smaller, testable steps before the final SELECT. +2. CTE `cte`: reads `student`. +3. Combine datasets using a LEFT JOIN so departments without students are kept; `COALESCE` turns their missing counts into 0. +4. Project final output columns: `dept_name`, `student_count`. +5. 
Order output deterministically with `ORDER BY student_count DESC,dept_name ASC`. + +### Why This Works + +CTEs separate transformation stages, which makes dependencies explicit and easier to validate. Join conditions align related entities so each output row is built from the correct source records. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are sorting/grouping, join operations. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently. + diff --git a/medium/580. Count Student Number in Departments.sql b/medium/580. Count Student Number in Departments.sql deleted file mode 100644 index ff77712..0000000 --- a/medium/580. Count Student Number in Departments.sql +++ /dev/null @@ -1,10 +0,0 @@ -WITH cte AS( - SELECT dept_id,COUNT(*) AS student_count - FROM student_580 - GROUP BY dept_id -) - -SELECT dept_name,COALESCE(student_count,0) AS student_count -FROM department_580 d -LEFT JOIN cte c ON c.dept_id = d.dept_id -ORDER BY student_count DESC,dept_name ASC; diff --git a/medium/585. Investments in 2016.md b/medium/585. Investments in 2016.md new file mode 100644 index 0000000..8dc7daa --- /dev/null +++ b/medium/585. Investments in 2016.md @@ -0,0 +1,93 @@ +# Question 585: Investments in 2016 + +**LeetCode URL:** https://leetcode.com/problems/investments-in-2016/ + +## Description + +Write a solution to report the sum of all total investment values in 2016 (`tiv_2016`) for all policyholders who: - have the same `tiv_2015` value as one or more other policyholders, and - are not located in the same city as any other policyholder (i.e., the (`lat`, `lon`) pair must be unique). Round the answer to two decimal places. The result format is in the following example. 
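Both filter conditions, a shared `tiv_2015` value and a unique `(lat, lon)` location, can be mirrored in plain Python with two frequency maps (a sketch over the example rows):

```python
from collections import Counter

# (pid, tiv_2015, tiv_2016, lat, lon) rows from the example
rows = [(1, 10, 5, 10, 10), (2, 20, 20, 20, 20),
        (3, 10, 30, 20, 20), (4, 10, 40, 40, 40)]

tiv_2015_counts = Counter(r[1] for r in rows)
location_counts = Counter((r[3], r[4]) for r in rows)

# Keep policyholders whose tiv_2015 is shared and whose location is unique
total = sum(r[2] for r in rows
            if tiv_2015_counts[r[1]] > 1 and location_counts[(r[3], r[4])] == 1)
print(total)  # 45
```

Policyholders 1 and 4 qualify (tiv_2016 of 5 and 40), giving the expected 45.00.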
+ +## Table Schema Structure + +```sql +Create Table If Not Exists Insurance (pid int, tiv_2015 float, tiv_2016 float, lat float, lon float); +``` + +## Sample Input Data + +```sql +insert into Insurance (pid, tiv_2015, tiv_2016, lat, lon) values ('1', '10', '5', '10', '10'); +insert into Insurance (pid, tiv_2015, tiv_2016, lat, lon) values ('2', '20', '20', '20', '20'); +insert into Insurance (pid, tiv_2015, tiv_2016, lat, lon) values ('3', '10', '30', '20', '20'); +insert into Insurance (pid, tiv_2015, tiv_2016, lat, lon) values ('4', '10', '40', '40', '40'); +``` + +## Expected Output Data + +```text ++----------+ +| tiv_2016 | ++----------+ +| 45.00 | ++----------+ +``` + +## SQL Solution + +```sql +--(The 1st approach uses a correlated subquery: the inner query depends on the outer query) + +SELECT SUM(tiv_2016) AS tiv_2016_sum +FROM insurance_585 i +WHERE tiv_2015 IN (SELECT tiv_2015 + FROM insurance_585 + WHERE pid <> i.pid) AND + (lat,lon) NOT IN (SELECT lat,lon + FROM insurance_585 + WHERE pid <> i.pid); + + +-- (OR) + + +SELECT SUM(tiv_2016) AS tiv_2016_sum +FROM insurance_585 +WHERE tiv_2015 IN (SELECT tiv_2015 + FROM insurance_585 + GROUP BY tiv_2015 + HAVING COUNT(*) > 1) AND +(lat,lon) IN (SELECT lat,lon + FROM insurance_585 + GROUP BY lat,lon + HAVING COUNT(*) = 1); +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `tiv_2016_sum` from `insurance`. + +### Result Grain + +A single summary row containing the `tiv_2016` total. + +### Step-by-Step Logic + +1. Apply row-level filtering in `WHERE` with two membership tests: `tiv_2015` must appear for another policyholder, and `(lat,lon)` must not. +2. In the second approach, subqueries precompute the qualifying `tiv_2015` values and unique `(lat,lon)` locations with `GROUP BY`/`HAVING`. +3. Aggregate the surviving rows with `SUM(tiv_2016)`. +4. Project the final output column: `tiv_2016_sum`. + +### Why This Works + +Grouping keys define the reporting grain; aggregate functions then summarize values at exactly that grain. 
Predicate filtering removes irrelevant rows before expensive downstream computation. `HAVING` ensures only groups that satisfy business thresholds survive. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are sorting/grouping, subquery execution. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Every non-aggregated selected column must belong to the grouping grain. +- If nullable columns are involved, validate null-handling explicitly (`IS NULL`, `COALESCE`). + diff --git a/medium/585. Investments in 2016.sql b/medium/585. Investments in 2016.sql deleted file mode 100644 index 94f27c0..0000000 --- a/medium/585. Investments in 2016.sql +++ /dev/null @@ -1,25 +0,0 @@ ---(1st approach is called co-relevant subquery because inner query is dependent on outer query) - -SELECT SUM(tiv_2016) AS tiv_2016_sum -FROM insurance_585 i -WHERE tiv_2015 IN (SELECT tiv_2015 - FROM insurance_585 - WHERE pid <> i.pid) AND - (lat,lon) NOT IN (SELECT lat,lon - FROM insurance_585 - WHERE pid <> i.pid); - - -(OR) - - -SELECT SUM(tiv_2016) AS tiv_2016_sum -FROM insurance_585 -WHERE tiv_2015 IN (SELECT tiv_2015 - FROM insurance_585 - GROUP BY tiv_2015 - HAVING COUNT(*) > 1) AND -(lat,lon) IN (SELECT lat,lon - FROM insurance_585 - GROUP BY lat,lon - HAVING COUNT(*) = 1); diff --git a/medium/602. Friend Requests II: Who Has the Most Friends.md b/medium/602. Friend Requests II: Who Has the Most Friends.md new file mode 100644 index 0000000..cd27938 --- /dev/null +++ b/medium/602. Friend Requests II: Who Has the Most Friends.md @@ -0,0 +1,103 @@ +# Question 602: Friend Requests II: Who Has the Most Friends + +**LeetCode URL:** https://leetcode.com/problems/friend-requests-ii-who-has-the-most-friends/ + +## Description + +Write a solution to find the people who have the most friends and the most friends number. The result format is in the following example. 
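The `UNION ALL` counting idea runs unchanged on sqlite3 (table name simplified to `request_accepted`; illustrative only):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE request_accepted (requester_id INT, accepter_id INT, accept_date TEXT)")
con.executemany("INSERT INTO request_accepted VALUES (?, ?, ?)", [
    (1, 2, "2016/06/03"), (1, 3, "2016/06/08"),
    (2, 3, "2016/06/08"), (3, 4, "2016/06/09"),
])

# Each accepted request contributes one friendship to both sides, so
# stacking both id columns and counting gives per-user friend totals.
row = con.execute("""
    WITH cte AS (
        SELECT requester_id AS uid FROM request_accepted
        UNION ALL
        SELECT accepter_id AS uid FROM request_accepted
    )
    SELECT uid, COUNT(uid) AS num FROM cte
    GROUP BY uid
    ORDER BY num DESC
    LIMIT 1
""").fetchone()
print(row)  # (3, 3)
```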
+ +## Table Schema Structure + +```sql +Create table If Not Exists RequestAccepted (requester_id int not null, accepter_id int null, accept_date date null); +``` + +## Sample Input Data + +```sql +insert into RequestAccepted (requester_id, accepter_id, accept_date) values ('1', '2', '2016/06/03'); +insert into RequestAccepted (requester_id, accepter_id, accept_date) values ('1', '3', '2016/06/08'); +insert into RequestAccepted (requester_id, accepter_id, accept_date) values ('2', '3', '2016/06/08'); +insert into RequestAccepted (requester_id, accepter_id, accept_date) values ('3', '4', '2016/06/09'); +``` + +## Expected Output Data + +```text ++----+-----+ +| id | num | ++----+-----+ +| 3 | 3 | ++----+-----+ +``` + +## SQL Solution + +```sql +WITH cte AS( + SELECT requester_id AS uid + FROM request_accepted_602 + UNION ALL + SELECT accepter_id AS uid + FROM request_accepted_602 +) + +SELECT uid,COUNT(uid) num_of_friends +FROM cte +GROUP BY uid +ORDER BY num_of_friends DESC +LIMIT 1; + + +--Answer of follow-up question: + +WITH cte AS( + SELECT requester_id AS uid + FROM request_accepted_602 + UNION ALL + SELECT accepter_id AS uid + FROM request_accepted_602 +) + +SELECT uid +FROM cte +GROUP BY uid +HAVING COUNT(uid) = (SELECT COUNT(uid) num_of_friends + FROM cte + GROUP BY uid + ORDER BY num_of_friends DESC + LIMIT 1); +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `uid` from `request_accepted`, `cte`. + +### Result Grain + +One row per unique key in `GROUP BY uid`. + +### Step-by-Step Logic + +1. Create CTE layers (`cte`) to decompose the logic into smaller, testable steps before the final SELECT. +2. CTE `cte`: reads `request_accepted`. +3. Aggregate rows with COUNT grouped by uid. +4. Project final output columns: `uid`. +5. Filter aggregated groups in `HAVING`: COUNT(uid) = (SELECT COUNT(uid) num_of_friends FROM cte GROUP BY uid. +6. Merge compatible result sets with `UNION`/`UNION ALL` before final projection. +7. 
Order output deterministically with `ORDER BY num_of_friends DESC`. + +### Why This Works + +CTEs separate transformation stages, which makes dependencies explicit and easier to validate. Grouping keys define the reporting grain; aggregate functions then summarize values at exactly that grain. `HAVING` ensures only groups that satisfy business thresholds survive. Set-union logic combines multiple valid pathways into one consistent output. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are sorting/grouping, subquery execution. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Every non-aggregated selected column must belong to the grouping grain. + diff --git a/medium/602. Friend Requests II: Who Has the Most Friends.sql b/medium/602. Friend Requests II: Who Has the Most Friends.sql deleted file mode 100644 index 681dc00..0000000 --- a/medium/602. Friend Requests II: Who Has the Most Friends.sql +++ /dev/null @@ -1,33 +0,0 @@ -WITH cte AS( - SELECT requester_id AS uid - FROM request_accepted_602 - UNION ALL - SELECT accepter_id AS uid - FROM request_accepted_602 -) - -SELECT uid,COUNT(uid) num_of_friends -FROM cte -GROUP BY uid -ORDER BY num_of_friends DESC -LIMIT 1; - - ---Answer of follow-up question: - -WITH cte AS( - SELECT requester_id AS uid - FROM request_accepted_602 - UNION ALL - SELECT accepter_id AS uid - FROM request_accepted_602 -) - -SELECT uid -FROM cte -GROUP BY uid -HAVING COUNT(uid) = (SELECT COUNT(uid) num_of_friends - FROM cte - GROUP BY uid - ORDER BY num_of_friends DESC - LIMIT 1); diff --git a/medium/608. Tree Node.md b/medium/608. Tree Node.md new file mode 100644 index 0000000..feec840 --- /dev/null +++ b/medium/608. 
Tree Node.md @@ -0,0 +1,79 @@ +# Question 608: Tree Node + +**LeetCode URL:** https://leetcode.com/problems/tree-node/ + +## Description + +Write a solution to report the type of each node in the tree. Return the result table in any order. The result format is in the following example. + +## Table Schema Structure + +```sql +Create table If Not Exists Tree (id int, p_id int); +``` + +## Sample Input Data + +```sql +insert into Tree (id, p_id) values ('1', NULL); +insert into Tree (id, p_id) values ('2', '1'); +insert into Tree (id, p_id) values ('3', '1'); +insert into Tree (id, p_id) values ('4', '2'); +insert into Tree (id, p_id) values ('5', '2'); +``` + +## Expected Output Data + +```text ++----+-------+ +| id | type | ++----+-------+ +| 1 | Root | +| 2 | Inner | +| 3 | Leaf | +| 4 | Leaf | +| 5 | Leaf | ++----+-------+ +``` + +## SQL Solution + +```sql +SELECT DISTINCT t1.id, + CASE WHEN t1.p_id IS NULL THEN 'Root' + WHEN t2.id IS NULL THEN 'Leaf' + ELSE 'Inner' END AS Type +FROM tree_608 t1 +LEFT JOIN tree_608 t2 ON t1.id = t2.p_id +ORDER BY t1.id; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `id`, `Type` from `tree`. + +### Result Grain + +One row per distinct combination of projected columns. + +### Step-by-Step Logic + +1. Combine datasets using LEFT JOIN, JOIN. Join predicates control row matching and prevent accidental cartesian growth. +2. Project final output columns: `id`, `Type`. +3. Remove duplicate result tuples using `DISTINCT` where uniqueness is required. +4. Order output deterministically with `ORDER BY t1.id`. + +### Why This Works + +Join conditions align related entities so each output row is built from the correct source records. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are sorting/grouping, join operations. Indexes on join/filter/group keys typically provide the biggest speedup. 
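The Root/Inner/Leaf classification above also runs unchanged on sqlite3 (table name simplified to `tree`; illustrative only):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE tree (id INT, p_id INT)")
con.executemany("INSERT INTO tree VALUES (?, ?)",
                [(1, None), (2, 1), (3, 1), (4, 2), (5, 2)])

# LEFT JOIN each node to its children; a NULL parent means Root,
# no matched child means Leaf, anything else is Inner.
rows = con.execute("""
    SELECT DISTINCT t1.id,
           CASE WHEN t1.p_id IS NULL THEN 'Root'
                WHEN t2.id IS NULL THEN 'Leaf'
                ELSE 'Inner' END AS type
    FROM tree t1
    LEFT JOIN tree t2 ON t1.id = t2.p_id
    ORDER BY t1.id
""").fetchall()
print(rows)  # [(1, 'Root'), (2, 'Inner'), (3, 'Leaf'), (4, 'Leaf'), (5, 'Leaf')]
```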
+ +### Common Pitfalls + +- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently. + diff --git a/medium/608. Tree Node.sql b/medium/608. Tree Node.sql deleted file mode 100644 index 81d71cb..0000000 --- a/medium/608. Tree Node.sql +++ /dev/null @@ -1,7 +0,0 @@ -SELECT DISTINCT t1.id, - CASE WHEN t1.p_id IS NULL THEN 'Root' - WHEN t2.id IS NULL THEN 'Leaf' - ELSE 'Inner' END AS Type -FROM tree_608 t1 -LEFT JOIN tree_608 t2 ON t1.id = t2.p_id -ORDER BY t1.id; diff --git a/medium/612. Shortest Distance in a Plane.md b/medium/612. Shortest Distance in a Plane.md new file mode 100644 index 0000000..5d36dba --- /dev/null +++ b/medium/612. Shortest Distance in a Plane.md @@ -0,0 +1,67 @@ +# Question 612: Shortest Distance in a Plane + +**LeetCode URL:** https://leetcode.com/problems/shortest-distance-in-a-plane/ + +## Description + +Report the shortest distance between any two points from the `Point2D` table, rounded to two decimal places. + +## Table Schema Structure + +```sql +Create Table If Not Exists Point2D (x int not null, y int not null); +``` + +## Sample Input Data + +```sql +insert into Point2D (x, y) values ('-1', '-1'); +insert into Point2D (x, y) values ('0', '0'); +insert into Point2D (x, y) values ('-1', '-2'); +``` + +## Expected Output Data + +```text ++----------+ +| shortest | ++----------+ +| 1.00 | ++----------+ +``` + +## SQL Solution + +```sql +SELECT ROUND(MIN(SQRT(POWER(a.x-b.x,2)+POWER(a.y-b.y,2)))::NUMERIC,2) AS shortest +FROM point_2d_612 a +JOIN point_2d_612 b ON (a.x,a.y) <> (b.x,b.y); +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `shortest` from `point_2d`. + +### Result Grain + +A single summary row containing the minimum pairwise distance. + +### Step-by-Step Logic + +1. Self-join the table so every pair of distinct points is compared; the `<>` predicate only excludes pairing a point with itself. +2. Project final output columns: `shortest`. 
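The pairwise-minimum logic can be mirrored in a few lines of Python with `itertools.combinations`, which enumerates the same unordered pairs the self-join produces:

```python
from itertools import combinations
from math import dist  # Euclidean distance (Python 3.8+)

# (x, y) points from the sample data
points = [(-1, -1), (0, 0), (-1, -2)]

# Minimum distance over every unordered pair of distinct points
shortest = round(min(dist(a, b) for a, b in combinations(points, 2)), 2)
print(shortest)  # 1.0
```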
+ +### Why This Works + +Join conditions align related entities so each output row is built from the correct source records. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are join operations. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently. + diff --git a/medium/612. Shortest Distance in a Plane.sql b/medium/612. Shortest Distance in a Plane.sql deleted file mode 100644 index db95841..0000000 --- a/medium/612. Shortest Distance in a Plane.sql +++ /dev/null @@ -1,3 +0,0 @@ -SELECT ROUND(MIN(SQRT(POWER(a.x-b.x,2)+POWER(a.y-b.y,2)))::NUMERIC,2) AS shortest -FROM point_2d_612 a -JOIN point_2d_612 b ON (a.x,a.y) <> (b.x,b.y); diff --git a/medium/614. Second Degree Follower.md b/medium/614. Second Degree Follower.md new file mode 100644 index 0000000..1139806 --- /dev/null +++ b/medium/614. Second Degree Follower.md @@ -0,0 +1,72 @@ +# Question 614: Second Degree Follower + +**LeetCode URL:** https://leetcode.com/problems/second-degree-follower/ + +## Description + +In Facebook, there is a follow table with two columns: followee and follower. Write a SQL query to get the number of followers for each user who is themselves a follower of someone else, i.e., a second-degree follower. 
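The self-join runs unchanged on sqlite3 (table name simplified to `follow`); an `ORDER BY` is added here only to make the demo output deterministic:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE follow (followee TEXT, follower TEXT)")
con.executemany("INSERT INTO follow VALUES (?, ?)",
                [("Alice", "Bob"), ("Bob", "Cena"),
                 ("Bob", "Donald"), ("Donald", "Edward")])

# Keep followees that also appear as followers, counting their followers
rows = con.execute("""
    SELECT b.followee AS follower, COUNT(b.follower) AS num
    FROM follow a
    JOIN follow b ON a.follower = b.followee
    GROUP BY b.followee
    ORDER BY b.followee
""").fetchall()
print(rows)  # [('Bob', 2), ('Donald', 1)]
```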
+ +## Table Schema Structure + +```sql +Create table If Not Exists Follow (followee varchar(255), follower varchar(255)); +``` + +## Sample Input Data + +```sql +insert into Follow (followee, follower) values ('Alice', 'Bob'); +insert into Follow (followee, follower) values ('Bob', 'Cena'); +insert into Follow (followee, follower) values ('Bob', 'Donald'); +insert into Follow (followee, follower) values ('Donald', 'Edward'); +``` + +## Expected Output Data + +```text ++-------------+------------+ +| follower | num | ++-------------+------------+ +| B | 2 | +| D | 1 | ++-------------+------------+ +``` + +## SQL Solution + +```sql +SELECT b.followee AS follower,COUNT(b.follower) AS num +FROM follow_614 a +JOIN follow_614 b ON a.follower = b.followee +GROUP BY b.followee; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `follower`, `num` from `follow`. + +### Result Grain + +One row per unique key in `GROUP BY b.followee`. + +### Step-by-Step Logic + +1. Combine datasets using JOIN. Join predicates control row matching and prevent accidental cartesian growth. +2. Aggregate rows with COUNT grouped by b.followee. +3. Project final output columns: `follower`, `num`. + +### Why This Works + +Join conditions align related entities so each output row is built from the correct source records. Grouping keys define the reporting grain; aggregate functions then summarize values at exactly that grain. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are sorting/grouping, join operations. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Verify join keys and cardinality; wrong joins can duplicate or drop rows silently. +- Every non-aggregated selected column must belong to the grouping grain. + diff --git a/medium/614. Second Degree Follower.sql b/medium/614. 
Second Degree Follower.sql deleted file mode 100644 index 605845d..0000000 --- a/medium/614. Second Degree Follower.sql +++ /dev/null @@ -1,4 +0,0 @@ -SELECT b.followee AS follower,COUNT(b.follower) AS num -FROM follow_614 a -JOIN follow_614 b ON a.follower = b.followee -GROUP BY b.followee; diff --git a/medium/626. Exchange Seats.md b/medium/626. Exchange Seats.md new file mode 100644 index 0000000..e2af9af --- /dev/null +++ b/medium/626. Exchange Seats.md @@ -0,0 +1,89 @@ +# Question 626: Exchange Seats + +**LeetCode URL:** https://leetcode.com/problems/exchange-seats/ + +## Description + +Write a solution to swap the seat id of every two consecutive students. Return the result table ordered by id in ascending order. The result format is in the following example. + +## Table Schema Structure + +```sql +Create table If Not Exists Seat (id int, student varchar(255)); +``` + +## Sample Input Data + +```sql +insert into Seat (id, student) values ('1', 'Abbot'); +insert into Seat (id, student) values ('2', 'Doris'); +insert into Seat (id, student) values ('3', 'Emerson'); +insert into Seat (id, student) values ('4', 'Green'); +insert into Seat (id, student) values ('5', 'Jeames'); +``` + +## Expected Output Data + +```text ++----+---------+ +| id | student | ++----+---------+ +| 1 | Doris | +| 2 | Abbot | +| 3 | Green | +| 4 | Emerson | +| 5 | Jeames | ++----+---------+ +``` + +## SQL Solution + +```sql +WITH ranked AS( + SELECT id,student, + LAG(id) OVER (w) AS lag, + LEAD(id) OVER (w) AS lead + FROM seat_626 + WINDOW w AS (ORDER BY id) +) + +SELECT + CASE WHEN MOD(id,2) = 1 AND lead IS NOT NULL THEN lead + WHEN MOD(id,2) = 0 THEN lag + ELSE id + END AS id, + student +FROM ranked +ORDER BY id; +``` + +## Solution Breakdown + +### Goal + +The query builds the final result columns `id`, `student` from `seat`, `ranked`. + +### Result Grain + +Row grain follows the post-filter join output. + +### Step-by-Step Logic + +1. 
Create CTE layers (`ranked`) to decompose the logic into smaller, testable steps before the final SELECT. +2. CTE `ranked`: reads `seat`, computes window metrics. +3. Use window functions (`... OVER (...)`) to compute rankings/running metrics while preserving row-level detail. +4. Project final output columns: `id`, `student`. +5. Order output deterministically with `ORDER BY id`. + +### Why This Works + +CTEs separate transformation stages, which makes dependencies explicit and easier to validate. Window expressions calculate comparative metrics without collapsing rows too early. The final projection exposes only the columns required by the result contract. + +### Performance Notes + +Primary cost drivers are sorting/grouping, window partitions. Indexes on join/filter/group keys typically provide the biggest speedup. + +### Common Pitfalls + +- Window `PARTITION BY`/`ORDER BY` choices change semantics; test with tied values and nulls. + diff --git a/medium/626. Exchange Seats.sql b/medium/626. Exchange Seats.sql deleted file mode 100644 index 8d75730..0000000 --- a/medium/626. Exchange Seats.sql +++ /dev/null @@ -1,16 +0,0 @@ -WITH ranked AS( - SELECT id,student, - LAG(id) OVER (w) AS lag, - LEAD(id) OVER (w) AS lead - FROM seat_626 - WINDOW w AS (ORDER BY id) -) - -SELECT - CASE WHEN MOD(id,2) = 1 AND lead IS NOT NULL THEN lead - WHEN MOD(id,2) = 0 THEN lag - ELSE id - END AS id, - student -FROM ranked -ORDER BY id;
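As a final check, the window-function variant of Exchange Seats also runs on sqlite3 (3.25 or newer); the named `WINDOW` clause is inlined here because named-window support varies across SQLite versions:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE seat (id INT, student TEXT)")
con.executemany("INSERT INTO seat VALUES (?, ?)",
                [(1, "Abbot"), (2, "Doris"), (3, "Emerson"),
                 (4, "Green"), (5, "Jeames")])

# Odd ids take the next id, even ids take the previous id;
# a trailing odd id with no successor keeps its own seat.
rows = con.execute("""
    WITH ranked AS (
        SELECT id, student,
               LAG(id)  OVER (ORDER BY id) AS prev_id,
               LEAD(id) OVER (ORDER BY id) AS next_id
        FROM seat
    )
    SELECT CASE WHEN id % 2 = 1 AND next_id IS NOT NULL THEN next_id
                WHEN id % 2 = 0 THEN prev_id
                ELSE id END AS id,
           student
    FROM ranked
    ORDER BY 1
""").fetchall()
print(rows)
# [(1, 'Doris'), (2, 'Abbot'), (3, 'Green'), (4, 'Emerson'), (5, 'Jeames')]
```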