AdWords is now Google Ads. Our new name reflects the full range of advertising options we offer across Search, Display, YouTube, and more. Learn more

Data Studio
5.8K members online now
5.8K members online now
Ask questions about Data Studio
Guide Me
star_border
Reply

Regex - how to extract substring when delimiter is more than one character

Visitor ✭ ✭ ✭
# 1
Visitor ✭ ✭ ✭

This is a follow-up question from my earlier one about finding the nth substring in a delimited Campaign Name:

https://www.en.advertisercommunity.com/t5/Data-Studio/Regex-how-to-find-substring-at-nth-delimiter-s...

 

I have a Regex that works to find the nth substring of Campaign  when the delimiter is a single character (and there may or may not be a delimiter after my substring):

 

REGEXP_EXTRACT(Campaign, '^(?:[^-]*-){n-1}([^-]*)-*')  

 

Where "-" is my delimiter and n is the number of the substring I want.

 

Does anyone know how I would change this to work with a multi-character delimiter - e.g. " - "  (space-dash-space)?

 

I've tried replacing all of the '-'s with all of the following:

( - )    " - "  \Q - \E   "\Q - \E"  

but none seem to work - any ideas, please?

 

 

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
Marked as Best Answer.
Solution
Accepted by topic author JennyJ
July

Regex - how to extract substring when delimiter is more than one character

Visitor ✭ ✭ ✭
# 6
Visitor ✭ ✭ ✭

Hi Goncalo - Thank you so much for your reply. I'm sorry it has taken me so long to get back to you!

 

I understand your answer, and I'm sure it will work (I haven't had a chance to try it). It is precise if I want to ignore delimiters with NO spaces (i.e. only use delimiters WITH spaces), but in the meantime, I have found a simpler solution (that will work if I don't care whether the delimiter has a space around it or not):  just use the TRIM() function to strip off leading / trailing spaces.

 

So, for example, TRIM(REGEXP_EXTRACT(Campaign, '^(?:[^-]*-){n-1}([^-]*)-*'))

View solution in original post

Regex - how to extract substring when delimiter is more than one character

Visitor ✭ ✭ ✭
# 2
Visitor ✭ ✭ ✭

Hi Jenny, 

 

I’m having a similar issue as you had above. I was wondering whether you found a solution for the “ - “ problem?

 

Thanks in advance.

Regex - how to extract substring when delimiter is more than one character

Visitor ✭ ✭ ✭
# 3
Visitor ✭ ✭ ✭

Hi Nicholas

 

Sadly not.  I am still keen to find a solution! 

 

In the meantime, I'm advising users to (where possible), do 1 of the following:

 

1. Set up campaign names with single character delimiters (i.e. remove the problem)

2. Ensure all campaign names have the same number of delimited segments.  This means that, for a campaign name with N segments:

     Segment 1 will have no leading space, but a trailing space

     Segments 2 to (N-1) will all have leading and trailing spaces

     Segment N will have a leading but not a trailing space

At least this will mean that the segments can be grouped successfully when used in a calculated field.  I think the leading spaces don't show in Data Studio, and as long as any formulae take account of the trailing space, it should be workable.

 

The problems come with historic data or (in an agency setting) when you don't have control over the naming of campaigns...

Regex - how to extract substring when delimiter is more than one character

Visitor ✭ ✭ ✭
# 4
Visitor ✭ ✭ ✭

Still looking for answers - can anyone help please?

Regex - how to extract substring when delimiter is more than one character

Visitor ✭ ✭ ✭
# 5
Visitor ✭ ✭ ✭

just add the space after and before the "-" (I'll replace the [space] for _ just for the sake of explanation 

REGEXP_EXTRACT(Campaign, '^(?:[^-]*- ){n-1}([^-]*) -*') ---> with _ instead of [Space] REGEXP_EXTRACT(Campaign, '^(?:[^-]*-_){n-1}([^-]*)_-*')

Marked as Best Answer.
Solution
Accepted by topic author JennyJ
July

Regex - how to extract substring when delimiter is more than one character

Visitor ✭ ✭ ✭
# 6
Visitor ✭ ✭ ✭

Hi Goncalo - Thank you so much for your reply. I'm sorry it has taken me so long to get back to you!

 

I understand your answer, and I'm sure it will work (I haven't had a chance to try it). It is precise if I want to ignore delimiters with NO spaces (i.e. only use delimiters WITH spaces), but in the meantime, I have found a simpler solution (that will work if I don't care whether the delimiter has a space around it or not):  just use the TRIM() function to strip off leading / trailing spaces.

 

So, for example, TRIM(REGEXP_EXTRACT(Campaign, '^(?:[^-]*-){n-1}([^-]*)-*'))