PROC DATASETS is a very useful SAS Procedure to manage, manipulate and modify your SAS datasets. it is more efficient than performing the same tasks with a Data Step.
PROC DATASET is efficient because it does not need to read in or write observations of datasets to make modifications to them.
PROC DATASET can perform a very wide range of tasks and is usually faster than other methods. You can even manage several datasets in one PROC DATASETS statement.
Here are some of the tasks you can do with PROC DATASETS:
1. Renaming SAS Files
data class; set sashelp.class; run; Proc datasets lib=WORK; change class=students; Run;
The lib= option is not required if your dataset is in the work library.
2. Renaming Variables
Proc datasets lib=students; Modify LakeTravisISD; rename ID=studentid; Run;
If there’s already a variable name that you are trying to rename, an error will be generated.
If there is a simple index for the variable you rename, PROC DATASETS also renames the
index if there is a simple index for the variable. If the old variable has a composite index, the composite index will be automatically referencing the new variable name.
3. Copying Datasets
Proc datasets library=<source-libname> Copy out=<destination-libanem> Run;
libname new '/home/sasdatasets/examples'; Proc datasets library=WORK; Copy out=new; Run;
In this example, I have copied all the datasets from the work library to a newly created library.
To copy only some of the files, add a
EXCLUDE statement as below:
Proc datasets library=sashelp; Copy out=work; Select class cars shoes hearts; Run; Proc datasets library=work; Copy out=new; Exclude class: ; Run;
4. Deleting SAS Files
Proc datasets lib=new; Delete cars; Run;
Be careful – the datasets are deleted immediately. If the dataset you delete is indexed, PROC DATASETS also deletes the indexes.
To delete all the datasets in the library, you can use the
proc datasets lib=new kill; quit; run;
5. Appending Datasets
Proc datasets; Append base=class force data=sashelp.classfit; Run;
This example appends the dataset sashelp.classfit to the base dataset work. class.
If there are variables in the BASE= dataset that are not in the DATA= dataset, those variables will be missing for the observations from the DATA= datasets.
If there are variables in the DATA= dataset that are not present in the BASE= dataset as in the above example, you can still append using a
FORCE option. The additional variables will not be included in the BASE dataset and a warning message will be written to the log.
If the variables in the two datasets have different attributes, the attributes in the BASE= dataset will apply.
You can also use a WHERE statement to restrict the observations in the new dataset that are appended to the base dataset.
Proc datasets; Append base=class force data=sashelp.classfit; where name = :('A'); Run;
Note, the use of =: Operator. This is to select only the observation in the variable name that begins with ‘A’.
Read more on Where Operators.
A SET statement can also be used for appending datasets but when you use a DATA step with a SET statement to append one dataset to another, SAS reads all the observations from both datasets into the Program Data Vector. But when you use PROC DATASETS, only the observations in the new dataset (data=dataset-name) are read into the Program Data Vector. Hence, PROC DATASET is considered to be more efficient when it comes to appending large datasets.
Also, Read, Exploring the SET Statement in SAS
6. Formatting Variables
The FORMAT statement, used with a MODIFY statement, is used to assign, change or remove variable formats:
Proc datasets; Modify class; Format name $30.; Run;
7. Labelling a Dataset
Proc datasets; Modify class; Label="Student's Dataset"; Run;
8. Labelling Variables
Proc datasets; Modify class; Label name="Student Name" age="Student age"; Run;
9. Removing Labels and Formats
Proc datasets; Modify students; Attrib _all_ label=''; Attrib _all_ format=; Run;
10. Password Management
To password protect a dataset with the password “sas”:
When you password protect a SAS dataset you will not be able to alter the datasets without the password. For, example If I have to apply the format to a password-protected dataset I have to use the alter option with the password, else it will show an error in the log.
ERROR: Invalid or missing ALTER password on member WORK.CLASS.DATA.
Proc datasets; Modify class(alter="sas"); Label name="Student Name" age="Student age"; Run;
To change a password (in this example, from “sas” to “sasdata”):
Proc datasets; Modify class (pw=sas/sasdata); run;