6.2K members online now
Ask questions about Data Studio
Guide Me
star_border
Reply
Highlighted

Wildcards

Visitor ✭ ✭ ✭
# 1
Visitor ✭ ✭ ✭

Hi, I try to extract different numbers out of a row of numbers.

 

E.g.: I try to extract 8 and use 

 

CASE WHEN REGEXP_MATCH(Text "Source", "^.*8.*") THEN "8_XYZ" ELSE "Others" END

or maybe just CASE WHEN REGEXP_MATCH(Text "Id [Courses]", ".*8.*") THEN "8_XYZ" ELSE "Others" END

 

For my understanding these expressions should extract the 8 out of every series but it doesn't... I have the following options with 8 in the series:

39x(7,8) + 3x(8) + 2x(8,1) + 3x(8,9) + 1x(11,8) + 3x(11,7,8) + 11x(11,7,8,9) + 31x(11,7,8,9,10) + 1x(11,7,8,9,10,12) + 2x(7,8,10) + 224x(7,8,9) + 13x(7,8,9,10) = 334 times the 8 in a series.

But my result is 48 so my assumption is that only (7,8), (8), (8,1), (8,9) and (11,8) are detected...

 

Does any body know why? :/ Thank you for any advice.

Wildcards

Follower ✭ ✭ ☆
# 2
Follower ✭ ✭ ☆

Hi Tonja,

 

as you describe you try to extract a part of a string. Therefore I'd suggest to make use of the REGEXP_EXTRACT function.

 

What I understand is that you want to

a) Check if there is a single "8" listed in the comma separated list of numbers inside of the brackets and

b) If so, extract the number before the "x" as a result

 

So in the end you'd need to do a combination of the REGEXP_MATCH and REGEXP_EXTRACT functions inside a CASE statement:

 

CASE
  WHEN REGEXP_MATCH(Text "Source", "x\\([^8\s]*8[^8\s]*\\)") THEN REGEXP_EXTRACT(Text "Source", "([\d]*)x")
ELSE "0"
END

 

Here are the regex patterns in action:

x\\([^8\s]*8[^8\s]*\\) (for some reason we need to have two backslashes in data studio to escape a character)

([\d]*)x

 

 

Now there are two issues with this (the solution won't work!):

* THEN results can only contain a Metric, a Dimension or a Literal. We cannot put a function as a THEN result

* We cannot conversion Dimensions to Metrics, so even if we extract numbers from the Dimensions, we won't be able to calculate a SUM of them.

 

That is my understanding. Hope it helps.

Wildcards

Visitor ✭ ✭ ✭
# 3
Visitor ✭ ✭ ✭

Hello @Tom2,

 

thanks a lot for your reply. The first step fits my problem ....

 

CASE 
  WHEN REGEXP_MATCH(Text "Source", "x\\([^8\s]*8[^8\s]*\\)") THEN "Hello" 
ELSE "0" 
END

 

.... without the regexp in the then clause... unfortunetly data studio replies that its not possible to parse the function. 

 

Any idea why the REGEXP_MATCH is not working as you intended?  

 

Anyway I didn't know before that \\ is needed to escape a character and the regex-tool is great. Thank you. 

 

Hope you or anybody knows whats wrong with the syntax above. Many Thanks

 

 

Wildcards

Follower ✭ ✭ ☆
# 4
Follower ✭ ✭ ☆

Hi Tonja,

 

looks like I didn't follow my own advice. (double escape characters were missing)

 

Please use "x\\([^8\\s]*8[^8\\s]*\\)" instead.

 

Hope that helps!

Wildcards

Visitor ✭ ✭ ✭
# 5
Visitor ✭ ✭ ✭

Oh - sometimes it is worthy to think about things before writing Smiley Very Happy 
The formula is accepted now but the result is not like expected - unfortunetly data studio is only returning "others". I tried to reduce the formula ... with [^8\s]*8[^8\s]* it detects some of the cases but again just (7,8), (8), (8,1), (8,9) and (11,8)*

 

*Reduced Formula will also detect "88" but I just have numbers from 5-16 so thats okay...

 

I will keep on trying or wait till postgres connection supports SSL :/

 

Thank you very much @Tom2 for your fast replies and great help Smiley Happy