Nested Subqueries in SQL

Introduction

A subquery (also called an inner query or nested query) is an SQL query placed inside another SQL statement. A subquery is most commonly embedded in the WHERE clause of the outer (or main) query and returns data that the main query uses to restrict or compute its result set.

  • Subqueries may be used with SELECT, INSERT, UPDATE, and DELETE statements.
  • Subqueries are frequently combined with operators such as =, <, >, , , IN, EXISTS, ANY, ALL, etc.
  • Subqueries may be either correlated (reference columns from the outer query) or non-correlated (independent queries evaluated once).

Rules and common restrictions for subqueries

  • Subqueries must be enclosed within parentheses unless the SQL dialect provides an explicit clause (for example, some DBMS allow WITH or inline table expressions without parentheses).
  • When a subquery is used where a single value is expected (a scalar context), it must return at most one column and at most one row.
  • When a subquery is used with a multivalue operator (for example, IN), it may return multiple rows but should return a single column, unless the context explicitly compares tuples (for example, (a,b) IN (SELECT x,y ...)).
  • In many SQL implementations an ORDER BY is not permitted inside a subquery unless it is paired with a limiting clause such as LIMIT or TOP. The outer query may use ORDER BY.
  • Aggregate functions may appear inside subqueries; however, an aggregate cannot be applied directly to the entire subquery result as if it were an ordinary column (that is, you cannot immediately wrap the subquery itself in a set function).
  • Subqueries that return more than one column must be used in contexts that allow row/tuple comparison.
  • Some operators are not directly usable with a subquery that returns multiple values. For example, the BETWEEN operator is not typically used with a multirow subquery; however, BETWEEN may be used inside a subquery, and scalar subqueries can be used to supply the lower and upper bounds.
  • The SELECT list of a subquery should not include columns whose types are unsupported in that context (for example, some DBMS restrict using BLOB/CLOB results directly in comparison contexts).

Types of subqueries

  • Scalar subquery: returns a single value (one row, one column). It can be used wherever a single expression is allowed.
  • Single-row subquery: returns one row (one or more columns). Typically used with single-row comparison operators (=, <, >).
  • Multi-row subquery: returns multiple rows (single column). Used with IN, ANY, ALL, etc.
  • Multi-column (tuple) subquery: returns one or more rows and multiple columns; used where row comparison is intended (for example, (a,b) < (SELECT x,y ...) or (a,b) IN (SELECT x,y ...)).
  • Correlated subquery: the subquery refers to columns of the outer query; it is evaluated once per row of the outer query.
  • Non-correlated subquery: independent query evaluated once; its result is then used by the outer query.

Operators and constructs commonly used with subqueries

  • IN - tests membership in the set returned by a subquery.
  • EXISTS / NOT EXISTS - tests whether the subquery returns any rows.
  • ANY / SOME - compares a value to each value in the subquery result; value > ANY (subquery) means value greater than at least one returned value.
  • ALL - requires the condition to be true for all values returned by the subquery.
  • Direct comparison operators (=, <, >) may be used with scalar subqueries.
  • EXISTS is often more efficient than IN when the subquery is correlated, especially when NULLs or large result sets are involved; implementation details vary by DBMS.

Subqueries with the SELECT statement

Subqueries are most frequently used within SELECT statements. The general form is:

SELECT column_list FROM table_list WHERE column_name operator (SELECT column_list FROM table_list [WHERE ...]);

Example: CUSTOMERS table

The sample CUSTOMERS table (columns: ID, NAME, AGE, ADDRESS, SALARY) contains the following rows:

+----+----------+-----+-----------+----------+ | ID | NAME | AGE | ADDRESS | SALARY | +----+----------+-----+-----------+----------+ | 1 | Ramesh | 35 | Ahmedabad | 2000.00 | | 2 | Khilan | 25 | Delhi | 1500.00 | | 3 | Kaushik | 23 | Kota | 2000.00 | | 4 | Chaitali | 25 | Mumbai | 6500.00 | | 5 | Hardik | 27 | Bhopal | 8500.00 | | 6 | Komal | 22 | MP | 4500.00 | | 7 | Muffy | 24 | Indore | 10000.00 | +----+----------+-----+-----------+----------+

Example query using a subquery with IN:

SELECT * FROM CUSTOMERS WHERE ID IN ( SELECT ID FROM CUSTOMERS WHERE SALARY > 4500 );

Result produced by the above query:

+----+----------+-----+---------+----------+ | ID | NAME | AGE | ADDRESS | SALARY | +----+----------+-----+---------+----------+ | 4 | Chaitali | 25 | Mumbai | 6500.00 | | 5 | Hardik | 27 | Bhopal | 8500.00 | | 7 | Muffy | 24 | Indore | 10000.00 | +----+----------+-----+---------+----------+

Correlated subquery example

Correlated subqueries reference a column from the outer query. They are evaluated once per outer row. Example: find customers whose salary is greater than the average salary of customers from the same address (city):

SELECT c.ID, c.NAME, c.SALARY, c.ADDRESS FROM CUSTOMERS c WHERE c.SALARY > ( SELECT AVG(c2.SALARY) FROM CUSTOMERS c2 WHERE c2.ADDRESS = c.ADDRESS );

This correlated subquery computes the average salary per city for each row of the outer query and returns those outer rows whose salary is above the local average.

Subqueries with the INSERT statement

A subquery can supply rows to insert into another table. The INSERT INTO ... SELECT form copies query results into a target table.

INSERT INTO table_name [(column1, column2, ...)] SELECT column_list FROM source_tables [WHERE ...];

Example: create a backup copy of the CUSTOMERS table by inserting all rows into CUSTOMERS_BKP (assumed to have compatible columns):

INSERT INTO CUSTOMERS_BKP SELECT * FROM CUSTOMERS WHERE ID IN ( SELECT ID FROM CUSTOMERS );

The inner SELECT in this example is redundant (SELECT * FROM CUSTOMERS could be used directly), but it demonstrates that a subquery can be used to supply the rowset for the INSERT.

Subqueries with the UPDATE statement

Subqueries may be used to determine new values or to select which rows to update.

UPDATE table_name SET column_name = new_value_expression WHERE some_column operator ( SELECT column_name FROM other_table [WHERE ...] );

Example: using CUSTOMERS_BKP (a backup of CUSTOMERS) to update the salaries of customers whose age is greater than or equal to 27. The intent is to set salary to 0.25 times the current value for matching rows.

UPDATE CUSTOMERS SET SALARY = SALARY * 0.25 WHERE AGE IN ( SELECT AGE FROM CUSTOMERS_BKP WHERE AGE >= 27 );

Explanation and effect:

The subquery returns the ages that are ≥ 27 from the backup table. The outer UPDATE updates rows in CUSTOMERS whose AGE matches any of those ages. In the sample data, IDs 1 (AGE 35) and 5 (AGE 27) match; hence their salaries change.

Final state of the CUSTOMERS table after the UPDATE (calculated from original salaries):

+----+----------+-----+-----------+----------+ | ID | NAME | AGE | ADDRESS | SALARY | +----+----------+-----+-----------+----------+ | 1 | Ramesh | 35 | Ahmedabad | 500.00 | | 2 | Khilan | 25 | Delhi | 1500.00 | | 3 | Kaushik | 23 | Kota | 2000.00 | | 4 | Chaitali | 25 | Mumbai | 6500.00 | | 5 | Hardik | 27 | Bhopal | 2125.00 | | 6 | Komal | 22 | MP | 4500.00 | | 7 | Muffy | 24 | Indore | 10000.00 | +----+----------+-----+-----------+----------+

Subqueries with the DELETE statement

Subqueries can select which rows to delete:

DELETE FROM table_name WHERE some_column operator ( SELECT column_name FROM other_table [WHERE ...] );

Example: using CUSTOMERS_BKP to delete rows from CUSTOMERS where AGE is ≥ 27:

DELETE FROM CUSTOMERS WHERE AGE IN ( SELECT AGE FROM CUSTOMERS_BKP WHERE AGE >= 27 );

Effect on the sample data: rows with AGE 35 (ID 1) and AGE 27 (ID 5) are removed. The resulting CUSTOMERS table contains the remaining rows:

+----+----------+-----+---------+----------+ | ID | NAME | AGE | ADDRESS | SALARY | +----+----------+-----+---------+----------+ | 2 | Khilan | 25 | Delhi | 1500.00 | | 3 | Kaushik | 23 | Kota | 2000.00 | | 4 | Chaitali | 25 | Mumbai | 6500.00 | | 6 | Komal | 22 | MP | 4500.00 | | 7 | Muffy | 24 | Indore | 10000.00 | +----+----------+-----+---------+----------+

Performance, rewriting, and best practices

  • Prefer non-correlated subqueries when possible: non-correlated subqueries are executed once and can be more efficient than correlated subqueries that execute per outer row.
  • For many queries a JOIN can be used instead of a subquery; rewriting as a join can improve performance because the optimizer may choose better join algorithms and reduce repeated work.
  • EXISTS is often more efficient than IN when the subquery result is large or when NULLs are present; semantics differ when NULLs are involved, so choose the operator deliberately.
  • Use scalar subqueries where a single value is logically required (for example, fetching a single aggregate). Ensure they return at most one row or use aggregation/limits to enforce it.
  • When using subqueries in WHERE clauses, ensure appropriate indexes exist on the referenced columns to avoid full-table scans.
  • Limit nesting depth; multiple nested levels are supported by many DBMS but can become difficult to read and maintain. Consider using WITH (common table expressions) to factor out complex subqueries for clarity when supported.
  • Test and compare execution plans for subquery and join rewrites to choose the most efficient form for your DBMS and data distribution.

Common pitfalls and notes

  • Be careful with NULLs: IN and relational operators may yield different results from EXISTS when NULL values are present in the subquery result.
  • Some SQL dialect differences exist (for example, allowing ORDER BY in subqueries only with TOP/LIMIT, or allowing lateral joins or inline table functions). Write portable queries if portability is required.
  • When a subquery returns multiple columns, ensure the outer context supports tuple comparisons; not all SQL contexts accept multi-column subqueries directly.
  • Avoid assuming the subquery result is ordered unless the outer query explicitly orders the final result.

Summary

Subqueries are a powerful and flexible way to express queries that depend on the results of other queries. Understanding the types (scalar, single-row, multi-row, correlated), the operators that accept subqueries (IN, EXISTS, ANY, ALL), and how to rewrite subqueries as joins or common table expressions helps produce correct, maintainable and performant SQL. Apply the rules and best practices above to design clear and efficient queries.

The document Nested Subqueries in SQL is a part of the Computer Science Engineering (CSE) Course Database Management System (DBMS).
All you need of Computer Science Engineering (CSE) at this link: Computer Science Engineering (CSE)
Explore Courses for Computer Science Engineering (CSE) exam
Get EduRev Notes directly in your Google search
Related Searches
Sample Paper, Objective type Questions, Nested Subqueries in SQL, shortcuts and tricks, Summary, Free, past year papers, study material, ppt, Nested Subqueries in SQL, Viva Questions, mock tests for examination, pdf , MCQs, Exam, Previous Year Questions with Solutions, Semester Notes, Extra Questions, video lectures, practice quizzes, Nested Subqueries in SQL, Important questions;