You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
The ANALYZE command collects statistics of value distribution in table columns and stores them in the pg_statistic system catalog. The planner will only be able to determine an optimal execution plan if the ANALYZE command was executed for a given table and its columns significant for the query.
The EXPLAIN command displays the query execution plan. Using the VERBOSE option will show more details. The ANALYZE option automatically executes ANALYZE and the query itself, letting EXPLAIN display not only the plan but also the actual execution results.
Projects
Exercise for Indexing with SQL.
Includes 8 exercises including answers.
Task 1: Hash-Based Indexes
Execute a query that displays all orders for the buk1 composition. Check the execution plan and take note of the execution times.
Add a hash-based index that can accelerate the aforementioned query to the orders table. Take note of the plan and execution plan.
Solution for Task 1, should be like this:
exercise1.txt-- Execute a query that displays all orders for the buk1 composition. Check the execution plan and take note of the execution times.SELECT*FROM all_orders WHERE compositions_no ='buk1'; -- compositions_no also refers composition_idCREATEINDEXcompositions_inON orders USING hash (compositions_no);
EXPLAIN ANALYZE SELECT DESCRIPTION *FROM all_orders WHERE composition_no ='buk1';
Task 2: B-tree indexes
Delete the index created in the previous exercise and create a similar one, but using B-trees. Repeat the previous query and take a note of the results.
Execute a query displaying orders of all compositions with IDs beginning with letters appearing before the letter 'c' in the alphabet. Is the index in use?
Run another query, displaying the remaining orders (beginning with 'c' and so on). Is the index in use now?
Try forcing the use of indexes by switching the enable_seqscan parameter:
SET ENABLE_SEQSCAN TO OFF;
This will "encourage" PostgreSQL to use indexes – this kind of optimisation may be useful when using mass storage with short seek times, such as SSD drives. Repeat the two queries and compare their execution plans and times.
Solution for Task 2, should be like this:
exercise2.txt-- Delete the index created in the previous exercise and create a similar one, but using B-trees. Repeat the previous query and take a note of the results.CREATEINDEXcompositions_inON all_orders (compositions_no);
EXPLAIN ANALYZE SELECT DESCRIPTION *FROMWHERE compositions_no ='buk1';
EXPLAIN ANALYZE SELECT*FROMWHERE compositions_no <'c'-- Execute a query displaying orders of all compositions with IDs beginning with letters appearing before the letter 'c' in the alphabet. Is the index in use?
EXPLAIN ANALYZE SELECT compositions_no from all_orders where compositions_no ='c'AND paid='f';
EXPLAIN ANALYZE SELECT*FROM all_orders WHERE compositions_no ~ '^c'; -- can be removed DROPINDEX compositions_in;
Task 3: Indexes and pattern matching
Create an index for the remarks column and query the database for all orders with remarks beginning with the "do" string. Is the index in use?
Delete the index and create a new one, but this time explicitly set its operator class (varchar_pattern_ops):
exercise3.txt-- Create an index for the remarks column and query the database for all orders with remarks beginning with the "do" string. Is the index in use?CREATEINDEXorders_remarksON orders (remarks);
EXPLAIN ANALYZE SElECT*FROM orders WHERE remarks LIKE'do%';
DROPINDEX orders_remarks;
CREATEINDEXorders_remarks_idxON orders (remarks varchar_pattern_ops);
DROPINDEX orders_remarks;
Task 4: Multi-column indexes
Create a multi-column index covering the client_id, recipient_id and composition_id columns.
Choose one value existing in these columns and run:
a query which joins the constraints for the three columns using the AND operator,
a similar query, but using OR
Compare the execution plans.
Now query the database for all orders of the buk1 composition.
Remove the index and create separate indexes for each of the three columns. Re-run the previous queries and compare the results.
Solution for Task 4, should be like this:
exercise4.txt-- Create a multi-column index covering the client_id, recipient_id and composition_id columns.CREATEINDEXmulti_column_indexON orders (client_id, recipient_id, composition_id);
EXPLAIN ANALYZE SELECT*FROM orders WHERE client_id ='msowins'AND recipient_id =1AND composition_id ='buk1';
EXPLAIN ANALYZE SELECT*FROM orders WHERE client_id ='msowins'OR recipient_id =1OR composition_id ='buk1';
EXPLAIN ANALYZE SELECT*FROM orders WHERE composition_id ='buk1';
DROPINDEX multi_column_index;
CREATEINDEXcomposition_indexON orders (composition_id);
CREATEINDEXrecipient_indexON orders (recipient_id);
CREATEINDEXclient_indexON orders (client_id);
EXPLAIN ANALYZE SELECT*FROM orders WHERE client_id ='msowins'AND recipient_id =1AND composition_id ='buk1';
EXPLAIN ANALYZE SELECT*FROM orders WHERE client_id ='msowins'OR recipient_id =1OR composition_id ='buk1';
EXPLAIN ANALYZE SELECT*FROM orders WHERE composition_id ='buk1';
Task 5: Indexes and sorting
Run a query which returns all orders sorted by their composition ID. Is the index in use?
Now delete the index and repeat the query. Compare result, then drop all indexes.
Solution for Task 5, should be like this:
exercise5.txt-- Run a query which returns all orders sorted by their composition ID. Is the index in use?
EXPLAIN ANALYZE SELECT*FROM all_orders ORDER BY composition_id ORDER BY composition_id ASC;
DROPINDEX composition_index;
DROPINDEX recipient_id_idx;
DROPINDEX client_index;
EXPLAIN ANALYZE SELECT all_orders, composition_id FROM orders ORDER BY composition_id ASC;
Task 6: Partial indexes
Create an index on the client_id column, but only for orders which have been paid. Check it by retrieving all paid orders for a given client. Repeat the query for unpaid orders.
Now calculate the sum of all unpaid orders. Is the index in use?
Solution for Task 6, should be like this:
exercise6.txt-- Create an index on the client_id column, but only for orders which have been paid.CREATEINDEXclient_indexON orders (client_id) WHEREall.paid='t';
EXPLAIN ANALYZE SELECT*FROM orders WHEREall.paid='t';
EXPLAIN ANALYZE SELECT*FROM orders WHERE client_id='msowins'ANDall.paid='t';
EXPLAIN ANALYZE SELECTsum(price) FROM orders WHEREall.paid='f';
DROPINDEX client_index;
Task 7: Indexes on expressions
Create an index and write a query which retrieves clients living in a city which begins with a given string (e.g. "krak"), regardless of the case (e.g. "krak"="Krak"="KRAK", etc.).
Check if the index runs properly.
Solution for Task 7, should be like this:
exercise7.txt-- Create an index and write a query which retrieves clients living in a city which begins with a given string (e.g. "krak"), regardless of the case (e.g. "krak"="Krak"="KRAK", etc.).
EXPLAIN ANALYZE SELECT*FROM all_clients WHERE city LIKE'Krak%';
CREATEINDEXname_inON clients (client_id);
DROPINDEX name_in;
EXPLAIN ANALYZE SELECT*FROM all_clients WHERE city LIKE'Krak%';
Task 7: Indexes on expressions
Add a column called the location of type point to the orders table. Populate this column with random data within the (0,0)–(100,100) range:
retrieve all orders within 10 units from the city center (50, 50),
retrieve all orders for the north-west quarter of the city.
Solution for Task 8, should be like this:
exercise8.txt-- Add a column called location of type point to the orders table. Populate this column with random data within the (0,0)–(100,100) range:ALTERTABLE orders ADD COLUMN location point;
UPDATE all_orders SET location =point ( random ( ) *100 , random ( ) *100 ) ;
EXPLAIN ANALYZE SELECT*FROM all_orders ORDER BY location <->point'(50,50)'LIMIT10;
CREATEINDEXlocation_idxON all_orders USING gist(location);
EXPLAIN ANALYZE SELECT*FROM all_orders ORDER BY location <->point'(50,50)'LIMIT10;
DROPINDEX index_of_location;
ALTERTABLE all_orders DROP COLUMN location;