Rolling Data, known as Moving average, is a time-based calculation to get an insight into trends for a defined period.
If the time frame for the moving average is 12 months, the data that is 13 months old be dropped, and the new month’s data will be added.
In the below example, we will be calculating rolling three months of sales data.
data raw_Data; a="01JAN2019"d; b="30NOV2019"d; do i=1 to 100; sales=rand("Integer",100,10000); transaction_Date=a+floor((b-a)* rand("uniform")); month=month(transaction_Date); output; end; format transaction_Date mmddyy10.; drop a b i; run;
We have prepared the data by generating random dates (between 01JAN2019 to 30NOV2019) and numbers between 1 to 100, which represent sales amount.
proc Summary data=raw_Data nway; class month; var sales; output out=trends(drop=_TYPE_ _FREQ_) sum=sales; run;
The sum of the sales amount is grouped so that we get the total sales data for each month.
data Rolling_data; set trends; rolling_total + sales - coalesce(lag3(sales),0); run;
- Lag of Sales is calculated by looking back at three observations.
- The running total is calculated by adding the current sales and subtracting the first sales of the window period. In this case, the window period is 3.
- Once the 4th sale is added to the running total, the 1st sale is subtracted.
- Coalesce function computes the first non-missing values and assigns 0. This is required as the 1st three sales will have missing values.