DAX introduction

This article shows basic differences between SQL and DAX.

Using the DAX studio query builder, you will soon find out DAX is not that difficult, it is just different than SQL.

Simpliest Query

You need to use EVALUATE keyword to return data. In it’s simpliest form, you just tell DAX what table you want to query.

SELECT * FROM DIM.AIRLINES
EVALUATE ('DIM AIRLINES')

Count Rows

It is easy to retrieve number of rows of a table, just use COUNTROWS function.

SELECT COUNT(*) as [Number of Rows]
FROM DIM.AIRLINES
EVALUATE
ROW (
    "Number of Rows",
    COUNTROWS ( 'DIM AIRLINES' )
)

Filtering

If you want only a part of the records in a table, use FILTER function:

SELECT *
FROM  DIM.AIRLINES
WHERE COUNTRY_OF_ORIGIN = 'Italy'
EVALUATE
FILTER (
  'DIM AIRLINES',
  [COUNTRY_OF_ORIGIN] = "Italy"
)

If you need more conditions, just use && for AND and || for OR. And as many parentheses as you wish.

Grouping / Distinct values

Use SUMMARIZECOLUMNS to get distinct values:

SELECT  AIRLINE_TYPE, COUNTRY_OF_ORIGIN
FROM    DIM.AIRLINES
GROUP BY AIRLINE_TYPE, COUNTRY_OF_ORIGIN
EVALUATE
SUMMARIZECOLUMNS(
  'DIM AIRLINES'[AIRLINE_TYPE],
  'DIM AIRLINES'[COUNTRY_OF_ORIGIN]
)

Measures

Nice, distinct values, but how do I add a measure value to those? Like this:

SELECT a.AIRLINE_NAME,
   AVG(d.DELAY_MINUTES) AS [AvgDelay]
FROM DIM.AIRLINES AS a
  JOIN FACT.DEPARTURES AS d
      ON a.AIRLINE_ID = d.AIRLINE_ID
GROUP BY a.AIRLINE_NAME;
EVALUATE
SUMMARIZECOLUMNS (
  'DIM AIRLINES'[AIRLINE_NAME],
  "AVG Delay in Minutes", [AVG Delay in Minutes]
)

Notice the difference between SQL and DAX. You need to join two tables in SQL, in DAX it is much easier to retrieve the data. The query uses defined relationships from the model.

Sometimes you may need a value of a measure, without any context / filters. Use ROW function to display it. This DAX query returns value of Total Revenue measure:

EVALUATE
ROW ( "Total Revenue", [Total Revenue] )

Sorting

When comparing data, sets match expectation needs sorted data. This is very similar, just use ORDER BY as in SQL. You can use any column name(s).

SELECT a.AIRLINE_NAME,
  AVG(d.DELAY_MINUTES) AS [AvgDelay]
FROM DIM.AIRLINES AS a
  JOIN FACT.DEPARTURES AS d
     ON a.AIRLINE_ID = d.AIRLINE_ID
GROUP BY a.AIRLINE_NAME
ORDER BY AvgDelay DESC;
EVALUATE
SUMMARIZECOLUMNS (
  'DIM AIRLINES'[AIRLINE_NAME],
  "AVG Delay in Minutes", [AVG Delay in Minutes]
)
ORDER BY [AVG Delay in Minutes] DESC

Limit Number of Rows

Here is an example that gets top three airlines with biggest average delay.

SELECT TOP (3)
   a.AIRLINE_NAME,
   AVG(d.DELAY_MINUTES * 1.0) AS [Avg Delay in Minutes]
FROM DIM.AIRLINES AS a
   JOIN FACT.DEPARTURES AS d
      ON d.AIRLINE_ID = a.AIRLINE_ID
GROUP BY a.AIRLINE_NAME
ORDER BY [Avg Delay in Minutes] DESC;
EVALUATE
TOPN (
  3,
  SUMMARIZECOLUMNS (
      'DIM AIRLINES'[AIRLINE_NAME],
      "AVG Delay in Minutes", [AVG Delay in Minutes]
  ),
  [AVG Delay in Minutes],
  DESC
)
ORDER BY [AVG Delay in Minutes] DESC

Trick

This example shows two tricks. First one uses function SWITCH - it compares hardcoded value TRUE with two conditions and returns text.

The second returns hardcoded value within expected range. Both queries are DAX.

EVALUATE
ROW (
  "Revenue Actual",
  SWITCH (
    TRUE,
    [Total Revenue] > 500000000, "to high",
    [Total Revenue] < 100000000, "to low",
    "within expected range"
  )
)
EVALUATE
ROW (
  "Revenue Expected",
  "within expected range"
)

You can use similar solutions as a fallback if you have troubles to express your expectation using built-in expectations in CAT. Issue similar two queires and use sets match expectation.