Rolling Data also known as Moving average is a time-based calculations to get an insight of trends for a defined period of time.
If time frame for moving average is 12 months, on each month the data which is 13 month old be dropped and the new months data will be added.
In the below example, we will be calculating rolling 3 months of sales data.
data raw_Data; a="01JAN2019"d; b="30NOV2019"d; do i=1 to 100; sales=rand("Integer",100,10000); transaction_Date=a+floor((b-a)* rand("uniform")); month=month(transaction_Date); output; end; format transaction_Date mmddyy10.; drop a b i; run;
- I have prepared the data by generating random dates (between 01JAN2019 to 30NOV2019) and numbers between 100 to 100 which represents sales amount.
proc Summary data=raw_Data nway; class month; var sales; output out=trends(drop=_TYPE_ _FREQ_) sum=sales; run;
- The sum of sales amount has is grouped so that we get the total sales data for each month.
data Rolling_data; set trends; rolling_total + sales - coalesce(lag3(sales),0); run;
- Lag of Sales is calculated for looking back 3 observation.
- Running total is calculated by adding the current sales and subtraction the first sales of the window period. In this case the window period is 3.
- Once, the 4th sales is added to the running total , the 1st sales is subtracted.
- Coalesce function computes, the first non missing values and assigns 0. This sis required as the 1st 3 sales will have missing values.