SQL Set Operators
UNION
Description:
The UNION operator combines the results of two or more SELECT queries into a single result set, removing duplicate rows.
Conditions:
Each SELECT query within the UNION must have the same number of columns.
The corresponding columns must have compatible data types.
Columns are returned in the order they appear in the first query.
Example:
Assume you have two tables, ClientsUSA and ClientsCanada, with columns ID and Name.
SELECT ID, Name FROM ClientsUSA UNION SELECT ID, Name FROM ClientsCanada;
This query returns a list of clients from both tables, without duplicates.
Key Points:
UNION sorts the results by default, which can impact performance.
Use UNION ALL if you do not want to remove duplicates.
UNION ALL
Description:
The UNION ALL operator combines the results of multiple SELECT queries and includes all results, including duplicates.
Conditions:
The conditions on columns and the number of columns are the same as for UNION.
Example:
SELECT ID, Name FROM ClientsUSA UNION ALL SELECT ID, Name FROM ClientsCanada;
This query returns all clients from ClientsUSA and ClientsCanada, including duplicates.
Key Points:
UNION ALL is often faster than UNION because it does not check for duplicates.
Ideal when duplicates are needed for further analysis.
INTERSECT
Description:
The INTERSECT operator returns only the rows that are present in the results of all SELECT queries involved. It gives the intersection of the result sets.
Conditions:
Each query must have the same number of columns with compatible data types.
Columns are compared for equality and must match to be included in the result.
Example:
SELECT ID, Name FROM ClientsUSA INTERSECT SELECT ID, Name FROM ClientsCanada;
This query returns clients who are present in both ClientsUSA and ClientsCanada.
Key Points:
Duplicates are removed from the final result set.
Used to find common elements between different data sets.
EXCEPT (or MINUS in some DBMS)
Description:
The EXCEPT operator returns rows that are present in the first SELECT query but not in the subsequent queries. It is used to get the difference between data sets.
Conditions:
Queries must have the same number of columns with compatible data types.
Duplicates are removed in the final result set.
Example:
SELECT ID, Name FROM ClientsUSA EXCEPT SELECT ID, Name FROM ClientsCanada;
This query returns clients who are present in ClientsUSA but not in ClientsCanada.
Key Points:
Duplicates in the first query are removed, but the entire row must be different to be excluded.
Useful for excluding specific data sets from a main set.
Common Points and Specifics of Set Operators
Column Compatibility: All set operators require that the queries involved have the same number of columns with compatible data types.
Column Order: The order of columns in the combined result is determined by the order of columns in the first query.
Performance:
UNION is more resource-intensive than UNION ALL as it involves duplicate removal.
INTERSECT and EXCEPT can also be more performance-costly as they require comparison and filtering operations.
ORDER BY Clause:
The ORDER BY clause must be used after combining the queries to sort the final result set.
Sorting orders should be applied to the final result set.
Complete Example with ORDER BY
To combine results from two tables, keep duplicates, and sort the results, you can use:
SELECT ID, Name FROM ClientsUSA UNION ALL SELECT ID, Name FROM ClientsCanada ORDER BY Name;
This query combines clients from both tables, retains duplicates, and sorts the results by name.