What is Indexing?
Indexing is a data structure technique that allows the database to quickly locate and access specific data. It’s similar to the index at the back of a book, which helps you find specific pages quickly.
How Indexing Works
- Index Creation: The database creates an index on a specified column(s).
- Index Data Structure: The index is stored in a data structure (e.g., B-tree).
- Query Execution: When a query is executed, the database checks if an index exists for the filtered column(s).
- Index Scanning: If an index exists, the database scans the index to quickly locate the required data.
Types of Indexes
- Clustered Index: Reorders the physical records of the table according to the index keys. Each table can have only one clustered index.
- Non-Clustered Index: Creates a separate data structure that contains the index keys and pointers to the corresponding data rows.
- Unique Index: Ensures that each value in the indexed column(s) is unique.
- Composite Index: Indexes multiple columns together.
- Function-Based Index: Indexes the result of a function or expression.
- Full-Text Index: Optimizes queries that search for specific words or phrases within a column.
- Spatial Index: Optimizes spatial queries
Syntax for Creating Indexes
Create Index
CREATE INDEX index_name ON table_name (column_name);
Create Unique Index
CREATE UNIQUE INDEX index_name ON table_name (column_name);
Drop Index
DROP INDEX index_name ON table_name; -- MySQL
DROP INDEX index_name; -- SQL Server, Oracle
Clustered Index
CREATE CLUSTERED INDEX idx_name
ON table_name (column1, column2, ...);
Non-Clustered Index
CREATE NONCLUSTERED INDEX idx_name
ON table_name (column1, column2, ...);
Composite Index
CREATE INDEX idx_name
ON table_name (column1, column2, ...);
Function-Based Index
CREATE INDEX idx_name
ON table_name (FUNCTION(column1));
Full-Text Index
CREATE FULLTEXT INDEX idx_name
ON table_name (column1);
Spatial Index
CREATE SPATIAL INDEX idx_name
ON table_name (column1);
Uses of Indexes
- Speed up query execution: Indexes can significantly reduce the time it takes to retrieve data.
- Improve data retrieval: Indexes can help retrieve data more efficiently, especially when filtering or sorting data.
- Enforce uniqueness: Unique indexes can ensure that duplicate values are not inserted into a column.
Advantages of Indexes
- Improved query performance: Indexes can significantly speed up query execution.
- Reduced disk I/O: Indexes can reduce the amount of disk I/O required to retrieve data.
- Improved data integrity: Unique indexes can help maintain data integrity.
Disadvantages of Indexes
- Additional storage space: Indexes require additional storage space.
- Insert, update, and delete overhead: Maintaining indexes can slow down insert, update, and delete operations.
- Index fragmentation: Indexes can become fragmented over time, leading to decreased performance.
Scenarios for Using Indexes
- Frequently queried columns: Index columns that are frequently used in WHERE, JOIN, and ORDER BY clauses.
- Unique or primary key columns: Create unique indexes on columns that require unique values.
- Large tables: Indexes can significantly improve query performance on large tables.
- Columns with low cardinality: Indexes can be beneficial for columns with low cardinality (e.g., boolean or enum columns).
Example Use Case
Suppose we have a table employees
with columns id
, name
, email
, and department
. We frequently query employees by email
and department
. We can create a composite index on email
and department
to improve query performance.
CREATE TABLE employees (
id INT PRIMARY KEY,
name VARCHAR(255),
email VARCHAR(255),
department VARCHAR(255)
);
CREATE INDEX idx_email_department ON employees (email, department);
By creating an index on email
and department
, we can significantly speed up queries that filter employees by these columns.
Example:–
CREATE INDEX idx_column_name ON table_name (column_name);
SELECT
column_name,
COUNT() AS occurrence FROM table_name WHERE some_filter = 'value' GROUP BY column_name HAVING COUNT() > 1;
EXPLAIN PLAN FOR
SELECT
column_name,
COUNT(*)
FROM
table_name
GROUP BY
column_name
HAVING
COUNT(*) > 1;
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
Leave a Reply