Data Studio
1.7K members online now
1.7K members online now
Ask questions about Data Studio
Guide Me
star_border
Reply

ROW_NUMBER() with PARTITION BY in BigQuery

Visitor ✭ ✭ ✭
# 1
Visitor ✭ ✭ ✭

Hello,

 

We are trying to use Data Studio with ETL data in BigQuery.

 

As suggested by Cloud Support, we are using a model similar to WePay [1].

 

In our example, we have events emitted by our checkout process and to make it simple, let's say it is "sessionId, status, timestamp".

 

If we want to only grab the most recent status for each sessionId, we would use "ROW_NUMBER() OVER (PARTITION BY [sessionId] ORDER BY [timestamp] DESC) etl_row_num", then filter on "etl_row_num = 1", as suggested in the WePay article.

 

Does anyone know of a way to achieve this in Data Studio without using a BigQuery view?

As using a view, we either couldn't make use of the date partition filter, or we'd have to restrict the date range manually in the view.

 

Example query to get the results we are looking for directly in BigQuery:

 

SELECT
  COUNT(id) cnt,
  status
FROM
  (
    SELECT
      id, status, ROW_NUMBER() OVER (PARTITION BY [id] ORDER BY [timestamp] DESC) etl_row_num
    FROM
      [******:checkout.events]
    WHERE
      _PARTITIONTIME = TIMESTAMP("2017-06-28")
  )
WHERE
  etl_row_num = 1
GROUP BY status
ORDER BY cnt DESC

 

Regards,

Samuel Melrose

 

[1]: https://wecode.wepay.com/posts/bigquery-wepay