*star_border*

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Subscribe
- Printer Friendly Page

# Wildcards

June

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Email to a Friend
- Report Abuse

Hi, I try to extract different numbers out of a row of numbers.

E.g.: I try to extract 8 and use

CASE WHEN REGEXP_MATCH(Text "Source", "^.*8.*") THEN "8_XYZ" ELSE "Others" END

or maybe just CASE WHEN REGEXP_MATCH(Text "Id [Courses]", ".*8.*") THEN "8_XYZ" ELSE "Others" END

For my understanding these expressions should extract the 8 out of every series but it doesn't... I have the following options with 8 in the series:

39x(7,8) + 3x(8) + 2x(8,1) + 3x(8,9) + 1x(11,8) + 3x(11,7,8) + 11x(11,7,8,9) + 31x(11,7,8,9,10) + 1x(11,7,8,9,10,12) + 2x(7,8,10) + 224x(7,8,9) + 13x(7,8,9,10) = 334 times the 8 in a series.

But my result is 48 so my assumption is that only (7,8), (8), (8,1), (8,9) and (11,8) are detected...

Does any body know why? :/ Thank you for any advice.

## Wildcards

June

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Email to a Friend
- Report Abuse

Hi Tonja,

as you describe you try to extract a part of a string. Therefore I'd suggest to make use of the REGEXP_EXTRACT function.

What I understand is that you want to

a) Check if there is a single "8" listed in the comma separated list of numbers inside of the brackets and

b) If so, extract the number before the "x" as a result

So in the end you'd need to do a combination of the REGEXP_MATCH and REGEXP_EXTRACT functions inside a CASE statement:

**CASE**

**WHEN** REGEXP_MATCH(Text "Source", "x\\([^8\s]*8[^8\s]*\\)") **THEN** REGEXP_EXTRACT(Text "Source", "([\d]*)x") **ELSE** "0" **END **

Here are the regex patterns in action:

* x\\([^8\s]*8[^8\s]*\\) (for some reason we need to have two backslashes in data studio to escape a character)

* ([\d]*)x

Now there are two issues with this (**the solution won't work!**):

* **THEN** results can only contain a *Metric*, a *Dimension* or a *Literal*. We cannot put a function as a **THEN** result

* We cannot conversion *Dimensions* to *Metrics*, so even if we extract numbers from the *Dimensions*, we won't be able to calculate a **SUM** of them.

That is my understanding. Hope it helps.

## Wildcards

June

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Email to a Friend
- Report Abuse

Hello @Tom2,

thanks a lot for your reply. The first step fits my problem ....

**CASE**

**WHEN** REGEXP_MATCH(Text "Source", "x\\([^8\s]*8[^8\s]*\\)") **THEN** "Hello" **ELSE** "0" **END**

** **

.... without the regexp in the then clause... unfortunetly data studio replies that its not possible to parse the function.

Any idea why the REGEXP_MATCH is not working as you intended?

Anyway I didn't know before that \\ is needed to escape a character and the regex-tool is great. Thank you.

Hope you or anybody knows whats wrong with the syntax above. Many Thanks

## Wildcards

June

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Email to a Friend
- Report Abuse

Hi Tonja,

looks like I didn't follow my own advice. (double escape characters were missing)

Please use "x\\([^8\\s]*8[^8\\s]*\\)" instead.

Hope that helps!

## Wildcards

June

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Email to a Friend
- Report Abuse

Oh - sometimes it is worthy to think about things before writing

The formula is accepted now but the result is not like expected - unfortunetly data studio is only returning "others". I tried to reduce the formula ... with [^8\s]*8[^8\s]* it detects some of the cases but again just (7,8), (8), (8,1), (8,9) and (11,8)*

*Reduced Formula will also detect "88" but I just have numbers from 5-16 so thats okay...

I will keep on trying or wait till postgres connection supports SSL :/

Thank you very much @Tom2 for your fast replies and great help