One of the most common data manipulation tasks is finding records in table one that also exist in table two, that is, the rows common to both tables. This post covers three methods using PROC SQL and one method using a DATA step merge.
Suppose you have two data sets (tables), one and two. You want to find the records that are present in both the tables.
Creating two datasets – one and two.
data one;
input id $ value;
datalines;
A 1
B 2
C 3
D 4
E 5
;
run;
input id $ value;
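This method uses a subquery in the WHERE clause to select ID from table two. Because the subquery is used with the IN operator, it must return a single column but may return any number of rows. The subquery is evaluated first, and the id values it returns from table two are then used by the outer query to keep only the matching rows of table one.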
proc sql;
select *
from one
where id in (select id from two);
quit;
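The same filter can also be written as a correlated subquery with EXISTS. The following is only a sketch of that variation, assuming both tables carry the id variable as above; the aliases a and b are illustrative and not part of the original example.

```sas
proc sql;
   /* Keep each row of one for which at least one row of two has the same id */
   select *
   from one as a
   where exists (select *
                 from two as b
                 where b.id = a.id);
quit;
```

Here the inner query refers to a.id from the outer query, so it is checked for each row of table one rather than evaluated once up front.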
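PROC SQL INNER JOIN returns rows common to both tables (data sets). The query below returns values B and D from the variable ID in the combined table, as these two values are common to datasets one and two. The DISTINCT keyword removes the duplicate rows that would otherwise appear if an id occurred more than once in table two.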
proc sql;
select distinct t1.*
from one as t1
inner join two as t2
on t1.id = t2.id;
quit;
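The INTERSECT operator returns the rows common to both query results. It compares every selected column, so with select * a row is returned only when both id and value match in the two tables. Duplicate rows are removed from the result.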
proc sql;
select * from one
intersect
select * from two;
quit;
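INTERSECT matches on all the columns listed in each SELECT, so a query on select * finds rows where the whole record is identical. If the goal is only to find the id values present in both tables, regardless of value, one option is to select just the key. A minimal sketch, assuming id is the only matching key:

```sas
proc sql;
   /* Compare on id alone; the result holds one row per common id */
   select id from one
   intersect
   select id from two;
quit;
```

Unlike the subquery and join methods, this returns only the id column, not the other variables from table one.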
proc sort data=one;
by id;
run;
proc sort data=two;
by id;
run;
data final;
merge one(in=t1) two(in=t2);
by id;
if t1 and t2;
run;
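Both data sets contain a variable named value, so in the merge above the value read from two (the data set listed last) replaces the one from one for matching ids. If the intent is to return the rows of one, as the subquery and join methods do, one option is to read only the key from two. This is a sketch under that assumption; the output name final_one is hypothetical, and both data sets are assumed to be already sorted by id.

```sas
data final_one;
   /* keep=id reads only the key from two, so value comes from one */
   merge one(in=t1) two(in=t2 keep=id);
   by id;
   /* t1 and t2 are 1 when the current id is present in that data set */
   if t1 and t2;
run;
```

If an id occurs more than once in two, the output will contain a row for each occurrence, so duplicates in two may need to be removed first (for example with proc sort and the nodupkey option).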