Everything You Need To Know About SQL SET Operators

Everything You Need To Know About SQL SET Operators

SQL Set operators are conceptually derived from the mathematical set principle. The three fundamental set operators are UNION, INTERSECT, and EXCEPT.

All three, with variations (in particular, OUTER UNION) and options, could be carried out in PROC SQL for combining data from a number of tables in the SQL procedure.

JOINS VS SQL SET OPERATORS

Before we delve into the main points of PROC SQL set operators, let’s set up the basic distinctions between joins and set operators.

DATA first;
A = 1;
RUN;
DATA second;
B = 2;
RUN;

So each of these tables has one row and one column. You can use a PROC SQL step to combine the two through a simple cross join:

SELECT *
FROM first, second
;

Now take a look at the UNION, which is the simplest type of probably the most extensively used of the set operators. The code to combine our two tables is:

SELECT *
FROM first
UNION
SELECT *
FROM second
;

UNION is inserted between two SELECTs (each of which has, as it should, a subordinate FROM clause).

A set operator works on the outcomes of two SELECTs. This is not like a join, which is carried out inside the FROM clause of a single SELECT.

See the two numeric values, this time arranged vertically rather than horizontally. This
demonstrates the fundamental difference between joins and set operators.

Joins align rows and accrete columns whereas set operators align columns and accrete rows.

This is something of an oversimplification in fact. SQL just isn’t a matrix language and provides comparatively little symmetry between rows and columns. So the distinction is drawn here between joins and set operators is barely a basis for the main points to observe.

Rules for SET operators

Set operators adhere to fundamental guidelines of operation.

  1. If a SELECT statement consists of multiple set operator, set operators will likely be applied in the order specified.
  2. By default, duplicate rows are eliminated from the results.
  3. To allow duplicates, the ALL option should be specified with a set operator.
  4. Arguments are evaluated from left to right.
  5. Set operators can be used in:
    1. Queries
    2. Subqueries
    3. Derived tables
    4. View definitions
    5. INSERT with SELECT clause

SQL Operators and Precedence

Set operators adhere to an order of precedence. The following precedence rules apply:

  1. When multiple set operator is specified, each is applied in the order specified:
    1. Top to bottom
    2. to right
  2. The default order of precedence for processing set operators follows:
    1. INTERSECT
    2. UNION and/or EXCEPT
    3. When parentheses are specified, the default order of precedence could be altered.

Data Type Compatibility

The alignment of columns in these examples has laboured smoothly because the aligned columns have matched with respect to the data type (numeric or character). Since column alignment is an important aspect of virtually all the set operators, it’s worth exploring this a bit more.

DATA num;
id = 3;
worth = 0;
RUN;
DATA char;
id = 4;
worth = 'abc';
RUN;

VALUE is numeric in data set NUM but a character in data set CHAR. So once you
try a DATA step concatenation with

DATA both;
SET num char;
RUN;

SAS will give this log message:

The new data set (BOTH) is created but contains no observations.

If you run the parallel SQL code:

CREATE TABLE both AS
SELECT *
FROM num
OUTER UNION CORRESPONDING
SELECT *
FROM char
;
ERROR: Column 2 from the first contributor of OUTER UNION is not the same
type as to its counterpart from the second.

Unlike the DATA step, PROC SQL doesn’t even create an empty table on this scenario.

There is only one set operator which is resistant to data type mismatches as a result of it does no column alignment; that’s the OUTER UNION operator without the CORRESPONDING option.

 

Accessing Rows from the Intersection of Two Queries (INTERSECT)

The INTERSECT operator creates query results that consist of all of the distinctive rows from the intersection of the two queries. Put another approach, the intersection of two queries (A and B) is represented by C, which signifies that the rows which are produced occur in both A and B

SQL SET Operators

proc sql;
select * from sashelp.prdsale
where precise < 20
intersect
select * from sashelp.prdsale
where prodtype = "OFFICE";
quit;

Everything You Need To Know About SQL SET Operators 1

  • It is assumed that the tables in each query are structurally an identical as a result of the wildcard character is specified within the SELECT assertion.
  • The INTERSECT operator produces rows which are widespread to each query.

Accessing Rows from the Combination of Two Queries (UNION)

The UNION operator preserves all the distinctive rows from the combination of queries.

The result is identical as if an OR operator is used to combine the results of each question. Put another approach, the union of two queries (A and B) represents rows in A or in B or in both A and B.

SQL Union

The UNION operator automatically eliminates duplicate rows from the results, until the ALL keyword, is specified as a part of the UNION operator. The column names assigned to the results are derived from the names in the first query.

In order for the union of two or more queries to achieve success, each query should specify the same number of columns of the same or compatible types.

Type compatibility signifies that column attributes are defined the same way. Because column names and attributes are derived from the primary table, data types should be of the same type. The data types of the result columns are derived from the source table(s).

proc sql;
select * from sashelp.prdsale
where actual < 5
intersect
select * from sashelp.prdsale
where prodtype = "OFFICE";
quit;

SQL union

Concatenating Rows from Two Queries (OUTER UNION)

The OUTER UNION operator concatenates the results of two queries. As with a DATA step or PROC APPEND concatenation, the results include rows which are combined vertically.

Put another way, the outer union of two queries (A and B) represents all rows in both A and B with no overlap. 

outer union

The OUTER UNION operator automatically concatenates rows from two queries with no overlap, until the CORRESPONDING (CORR) keyword is specified as a part of the operator.

proc sql;
create table combine as 
SELECT name,age,height FROM sashelp.class where age <12
OUTER UNION 
SELECT name,age,weight FROM sashelp.class where age < 12;
quit;

Everything You Need To Know About SQL SET Operators 3

The column names which are assigned to the results are derived from the names in the first query. In the following example, the CORR keyword enables tables columns with the same name and attributes to be overlaid.

proc sql;
create table combine as 
SELECT name,age,height FROM sashelp.class where age <12
OUTER UNION corr
SELECT name,age,weight FROM sashelp.class where age < 12;
quit;

Everything You Need To Know About SQL SET Operators 5

Comparing Rows from Two Queries (EXCEPT)

The EXCEPT operator compares rows from two queries to find out the modifications made to the primary table that aren’t present in the second table.

The following results show new and altered rows in the first table that is in the second table however don’t show rows which were deleted from the second table.

When working with two tables that consist of similar data, you need to use the EXCEPT operator to find out new and modified rows.

data subjects_original;
input name $ city $;
datalines;
AAA Pune
BBB Delhi
CCC Mumbai
EEE Kolkata
;
run;

data subjects_new;
input name $ city $;
datalines;
AAA Patna
BBB Delhi
CCC Mumbai
DDD Bangalore
;
run;

proc sql;
create table checkdata as
select * from subjects_original 
except 
select * from subjects_new;
quit;

Everything You Need To Know About SQL SET Operators 7

The EXCEPT operator is used to determine rows in the first table (or a question), however, shouldn’t be used to determine rows in the second table (or query).

It additionally uniquely identifies rows which have modified from the primary table to the second table. Columns are compared within the order that they appear in the SELECT statement.

About the Author - Subhro Kar

Been in the realm with the professionals of the IT industry. I am passionate about Coding, Blogging, Web Designing and deliver creative and useful content for a wide array of audience.

Leave a Comment

This site uses Akismet to reduce spam. Learn how your comment data is processed.