The UNION Operator in SQL

The UNION Operator in SQL

Functionality of UNION

The UNION operator in SQL is used to combine the result sets of two or more SELECT queries into a single result set. The key characteristic of UNION is that it removes duplicate rows from the result set. The result set will only include unique rows across all the combined queries.

Syntax of UNION

The basic syntax for using UNION is: 

SELECT column1, column2, ...
FROM table1
UNION
SELECT column1, column2, ...
FROM table2;

Points to Remember:

  • Column Consistency: Each SELECT statement must have the same number of columns, and the columns must be in the same order. The data types of the corresponding columns should be compatible.
  • Column Names: The column names in the result set are taken from the first SELECT statement.

Example of UNION

Consider the following tables:

  • Table: EmployeesUSA
ID Name Position
1 John Smith Manager
2 Emily Johnson Developer
  • Table: EmployeesCanada
ID Name Position
3 Michael Brown Developer
4 Sarah Davis Designer

Query Using UNION: 

SELECT ID, Name, Position
FROM EmployeesUSA
UNION
SELECT ID, Name, Position
FROM EmployeesCanada;

 Result:

ID Name Position
1 John Smith Manager
2 Emily Johnson Developer
3 Michael Brown Developer
4 Sarah Davis Designer

Here, UNION combines the rows from both tables and removes any duplicate rows (if any existed).

Detailed Points and Use Cases

Column Matching and Compatibility

  • Data Type Compatibility: The data types of corresponding columns in each SELECT statement must be compatible. For example, if the first column in one SELECT is an INTEGER, the corresponding column in the other SELECT should also be an INTEGER or a type that can be implicitly converted to INTEGER.
  • Column Order: The order of columns must be the same across all SELECT statements. If SELECT column1, column2 is used in the first query, the subsequent queries should also use SELECT column1, column2.

Performance Considerations

  • Duplicate Removal: Since UNION removes duplicate rows, it can be slower than UNION ALL, especially when dealing with large datasets. This is because the database engine has to perform additional processing to identify and eliminate duplicates.
  • Indexes: Having appropriate indexes on the columns involved in UNION queries can improve performance, especially for large datasets.

Sorting Results

  • ORDER BY Clause: To sort the results of a UNION query, you need to use the ORDER BY clause at the end of the final query. The ORDER BY clause can sort based on any of the columns from the SELECT statements.

Example with Sorting: 

SELECT ID, Name, Position
FROM EmployeesUSA
UNION
SELECT ID, Name, Position
FROM EmployeesCanada
ORDER BY Name;

This query sorts the combined result set by the Name column.

Use Cases

  • Combining Similar Data: UNION is useful for combining data from similar tables that might have the same structure but are stored in different tables. For instance, combining employees from different country-specific tables into a single list.
  • Creating Reports: When creating reports that need to aggregate data from various sources or segments, UNION allows you to merge these segments into a comprehensive report.
  • Handling Partitioned Data: In cases where data is partitioned across multiple tables or databases, UNION helps in querying and aggregating this distributed data into a unified result.

Common Pitfalls and Tips

  • Check Data Types: Ensure that corresponding columns have compatible data types to avoid errors or unexpected results.
  • Column Count: Always match the number of columns and their order across all SELECT statements.
  • Performance Tuning: For large datasets, consider using UNION ALL if duplicates are not a concern, or ensure that appropriate indexes are in place to optimize performance.

Example of Using UNION in Practice

Let’s consider a scenario where you have different tables for different branches of a company and you want to consolidate the employee data.

Tables:

  • Table: EmployeesNY
ID Name Department
1 Alice Green HR
2 Bob White IT
  • Table: EmployeesLA
ID Name Department
3 Charlie Black IT
4 Dana Blue Marketing

Query to Consolidate Data: 

SELECT ID, Name, Department
FROM EmployeesNY
UNION
SELECT ID, Name, Department
FROM EmployeesLA;

This query combines employees from the New York and Los Angeles branches into a single result set.

Laisser un commentaire

Votre adresse e-mail ne sera pas publiée. Les champs obligatoires sont indiqués avec *

Facebook
Twitter
LinkedIn
WhatsApp
Email
Print