Do you need to calculate a rolling average in Power BI?
Rolling averages (or moving averages) allow the smoothing of data fluctuations over a specified period. While calculating it may seem complex, it’s a pretty straightforward process.
All you need to do is determine a fixed window or interval, take the average of the values within it, and then shift the window through the dataset.
The rolling average is a very useful tool because it conveniently highlights trends and portrays data patterns and trajectories with greater clarity and precision. It’s also a useful tool that simplifies the process of monitoring and tracking various trends in different areas.
In this article, you will learn how you can calculate rolling averages in Power BI.
Creating Rolling Average with DAX
The rolling average calculates the value of a particular dimension over a defined period. Therefore, to create a rolling average measure in Power BI, it is essential to have a date table in place.
When creating a date table for any time-based analysis, you should take certain requirements into account. These requirements are as follows.
The date table must:
- contain every day for all years within your fact table.
- have at least one field set as a Date or DateTime datatype.
- only contain unique date or datetime values, without repetition.
- be marked as a date table (best practice).
You can create a date table in Power BI using the Power Query editor or DAX’s CALENDAR OR CALENDARAUTO function.
Date Table = VAR MinYear = YEAR( MIN( ‘Sales Table’[transaction_date])) VAR MaxYear = YEAR( MAX( ‘Sales Table’[transaction_date])) RETURN ADDCOLUMNS( FILTER( CALENDARAUTO(), AND( YEAR( [Date] ) >= MinYear, YEAR( [Date] ) <= MaxYear ) ), "Year", YEAR( [Date] ), "Month Number", MONTH( [Date] ), "Month Name", FORMAT( [Date], "mmm" ), "Quater", "Q" & INT( FORMAT( [Date], "q")) )
If your data doesn’t come with a date table, this is a syntax you can use to create a date table. It’s a dynamic and reusable syntax that only requires you to change the ‘Sales Table’[transaction_date] column name to reflect the date column in your data set.
The syntax not only removes the need to extract date components into separate columns manually, as you would in Power Query, but it also streamlines the entire process.
Let’s break down the syntax step by step:
VAR MinYear = YEAR(MIN(‘Sales Table’[transaction_date])) – the MinYear variable is assigned the value of the minimum year from the ‘Sales Table’[transaction_date] column. It uses the YEAR function to extract the year component from the minimum date.
VAR MaxYear = YEAR(MAX(‘Sales Table’[transaction_date])) – The MaxYear variable is assigned the value of the maximum year from the ‘Sales Table’[transaction_date] column. It also uses the YEAR function to extract the year component from the maximum date.
RETURN – The RETURN statement includes the main calculation logic. It uses the ADDCOLUMNS function to create a date table whose values are based on the filtered years between MinYear and MaxYear.
ADDCOLUMNS(FILTER(CALENDARAUTO(), AND(YEAR([Date]) >= MinYear, YEAR([Date]) <= MaxYear)), "Year", YEAR([Date]), "Month Number", MONTH([Date]), "Month Name", FORMAT([Date], "mmm"), "Quarter", "Q" & INT(FORMAT([Date], "q"))) – This is the syntax that creates the date table. The ADDCOLUMNS function returns a table with one or more columns.
Using the FILTER, CALENDARAUTO, and AND functions, ADDCOLUMNS creates a table with dates that fall within the year range specified by the MinYear and MaxYear variables.
Next, the Year, Month Number, Month Name, and Quarter columns are populated with their respective values with the help of the INT and FORMAT functions. These functions extract the desired values from the [Date] column created by the CALENDARAUTO function.
Now that you have created a date table, you can proceed with creating the moving average measure. You can do this in two ways – by using the AVERAGEX or DATESINPERIOD function.
Create a Rolling Average with the AVERAGEX Function
The AVERAGEX function is designed to calculate the arithmetic mean of an expression evaluated across each row in a specified table. This characteristic makes it an ideal tool for computing the rolling average.
In a way, you can say the AVERAGEX function calculates the rolling average of a dimension daily since it returns the arithmetic mean of a set of values.
To use it to calculate the moving average over longer periods, such as 7 days, 30 days, or a year, the AVERAGEX function would require modifications.
The syntax for utilizing the AVERAGEX function to calculate a 30-day moving average is as follows.
Moving Average (AVERAGEX) = VAR LastTransactionDate = MAX('Dates'[Transaction_Date]) VAR AverageDay = 30 VAR PeriodInVisual = FILTER( ALL( 'Dates'[Transaction_Date] ), AND( 'Dates'[Transaction_Date] > LastTransactionDate - AverageDay, 'Dates'[Transaction_Date] <= LastTransactionDate ) ) VAR OutPut = CALCULATE( AVERAGEX( 'Dates', [Total Sales] ), PeriodInVisual ) RETURN OutPut
This syntax calculates a 30-day moving average using the AVERAGEX function. Let’s break it down step by step:
VAR LastTransactionDate = MAX('Dates'[Transaction_Date]) – This variable returns the maximum (latest) value from the ‘Dates'[Transaction_Date] column. It represents the reference point for calculating the moving average.
VAR AverageDay = 30 – This variable, AverageDay, sets the number of days used as the moving average window. In this case, it is set to 30 days.
VAR PeriodInVisual = FILTER(ALL('Dates'[Transaction_Date]), AND('Dates'[Transaction_Date] > LastTransactionDate - AverageDay, 'Dates'[Transaction_Date] <= LastTransactionDate)) –This variable, PeriodInVisual, uses the FILTER function to create a filtered table. It includes all the dates that fall within the window of the last AverageDay days, starting from the LastTransactionDate.
VAR OutPut = CALCULATE(AVERAGEX('Dates', [Total Sales]), PeriodInVisual) –This variable, OutPut, uses the CALCULATE function along with the AVERAGEX function. It calculates the average of the [Total Sales] column for each date in the Dates table but considers only the dates included in the PeriodInVisual filtered table.
RETURN OutPut –This line returns the value of the OutPut variable, which represents the calculated moving average.
The syntax calculates the moving average by defining a window of the last 30 days from the maximum transaction date as defined by the current filter context. It then calculates the average of the [Total Sales] column within that window and returns the result as the moving average.
When you use the measure in a visual, at first glance, it looks like the measure only returns the arithmetic mean. And that’s what it does but for the first 29 days. On the 30th day, the Moving Average (AVERAGEX ) measure starts to return the 30-day average of [Total Sales] measure in this visual.
An alternative approach to understanding how the calculation works is by creating a moving total measure. You can do this by replacing the AVERAGEX with the SUMX function in the Moving Average (AVERAGEX) syntax.
When you place this measure in the visual, you will notice a discrepancy between the total value for January in the [Moving TOTAL (SUMX)] column and the [Total Sales] column.
This difference arises because the sum in the Moving [TOTAL (SUMX)] column is based on a 30-day period, whereas the [Total Sales] column considers the total for the entire month of January, which has 31 days.
If you divide the moving total value in the January 5 row by 5 (since it’s the total sales value over 5 days), you will get the moving average value in the [Moving Average (AVERAGEX)] column.
Same way when you divide the January 30 value in the moving total column by 30, you will get the moving average value in the [Moving Average (AVERAGEX)] column. And from this point on, every moving total value is divided by 30 to return the 30-day moving average.
If you want to use a different period for your running average calculation, you can change the value of the AverageDay variable to 7 or 365. This will return the weekly or yearly moving average values.
Create a Rolling Average with the DATESINPERIOD Function
Power BI has several time intelligence functions that assist in measuring over a predetermined period. One such function is DATESINPERIOD. It returns a single-column date table that starts on the chosen start date and continues backward or forwards for the specified interval duration.
The DATESINPERIOD function has the following parameters:
<Dates>– a column containing dates.
<StartDate>– the start or end date of the period based on the interval’s positive or negative value.
<NumberOfIntervals>– a positive or negative number that specifies the interval between the date period. When
<NumberOfIntervals>is negative, the dates are counted backwards from
<StartDate>. Conversely, if
<NumberOfIntervals>is positive, the dates are counted forward from
<Interval>– the type of interval. This could be Year, Quarter, Month, or Day intervals.
Consider this case: If
<NumberOfIntervals> is -7 while
<Interval> is DAY, DATESINPERIOD would return a one-column date table containing all the dates from the previous week starting from the designated
Moving AVERAGE (DATESINPERIOD) = CALCULATE( [Total Sales], DATESINPERIOD( Dates[Transaction_Date], MAX( Dates[Transaction_Date] ), -30, DAY ))/30
Use this DATESINPERIOD syntax to calculate the running average.
Let’s break it down step by step:
DATESINPERIOD(Dates[Transaction_Date], MAX(Dates[Transaction_Date]), -30, DAY) –This function, DATESINPERIOD, is used to create a table of dates within a specified period. It takes four arguments
- the column that contains the dates (‘Dates[Transaction_Date]’),
- the end date of the period (MAX(Dates[Transaction_Date])),
- the number of periods to go back (-30),
- and the granularity of the period (DAY).
It returns a table of dates starting from the end date and going back by 30 days at a daily granularity.
CALCULATE([Total Sales], DATESINPERIOD(Dates[Transaction_Date], MAX(Dates[Transaction_Date]), -30, DAY)) – The CALCULATE function is used to modify the context in which the [Total Sales] measure is calculated. It takes two arguments: the measure to calculate [Total Sales], and the filter or context to apply – the DATESINPERIOD syntax. This modifies the calculation of [Total Sales] to only include the dates within the period defined by the DATESINPERIOND syntax.
/30 – This division is performed to obtain the average. Without this division, the syntax will only return the running total. Since the period is set to 30 days, dividing the value CALCULATE syntax returns by 30 provides the average sales per day within that period.
When you add the measure to a visual, you’ll get the running total for each date in the visual.
In general, the DATESINPERIOD and AVERAGEX moving average measures operate similarly. However, there is a subtle distinction in the values observed for the first 29 days.
The DATESINPERIOD measure divides the running total values by 30, while the AVERAGEX function takes the sum of the running total values and returns their arithmetic mean.
Nevertheless, as the duration for which the running average is being calculated progresses, the values for both measures tend to converge. Notably, on the 30th of January, the values for both measures become similar.
Another notable distinction between the DATESINPERIOD and AVERAGEX measures lies in their Total values. Specifically, the two measures yield different totals due to the way they handle blank rows within the data.
The AVERAGEX function excludes dates with blank rows from the calculation, leading to a discrepancy in the total values. On the other hand, the DATESINPERIOD measure does not exclude dates with blank rows.
In this table, the AVERAGEX function excludes April 30 from the calculation due to the presence of a blank row. As a result, the running total for that date is divided by 29 days instead of the intended 30-day period.
When selecting a syntax for calculating your rolling average measure, it is essential to consider this factor. While both options are viable, opting for the DATESINPERIOD syntax is the recommended choice if uncertainty exists regarding dates with blank values.
This ensures a more accurate and reliable calculation, considering any potential gaps in the data.
Power BI offers a comprehensive set of tools and techniques that are vital for analyzing data trends and patterns effectively.
By leveraging the functionalities of the DATESINPERIOD and AVERAGEX functions, you gain the ability to create insightful time-based calculations that greatly aid in identifying and understanding trends.
While the DATESINPERIOD and AVERAGEX measures operate similarly in many aspects, they do exhibit differences in handling the initial days and total values.
The DATESINPERIOD measure divides running total values by the specified period, whereas the AVERAGEX function calculates the arithmetic mean of the running total values, excluding blank rows.
Understanding these nuances is crucial for accurately interpreting and comparing results. It is also important to consider the specific requirements of the analysis and choose the appropriate measure accordingly.