Analytics
Ask questions about filter set-up and issues with using filters in Google Analytics reports
 

Remove text before URL in GA or GTM

[ Edited ]
Visitor ✭ ✭ ✭
# 1

I am trying to remove a whole bunch of junk that appears before my URLs and causes issues with pageview counts.


Currently, when I go to the All Pages report and export a monthly report, I end up with URLs that have a lot of junk in front of them. Is there a way to remove everything before my site URL in GTM, so that when the hit is sent to GA it only shows the URL?

 

Here are some examples, but the list is not all-inclusive, which is why I want to remove everything that comes before my URL: #$#*%_!)#($~@^&*$*#*#(%)@($*$/blank.com/.

What's really annoying is that Google Translate generates a decent amount of traffic to my site, and so does Google Web Light. I do have the spam and bots box checked, but we have legitimate visitors who use a translator to access our site, so the traffic is legitimate.

 

If I can't do this through GTM, can I do it in GA?

 

/transpage?query=https://blank.com/&from=en&to=zh&source=url
/transpage?query=https://blank.com/blogs/issues-watch&from=en&to=zh&source=url
/transpage?query=https://blank.com/commentary/currency-idea&from=en&to=zh&source=url
/transpage?query=https://blank.com/united-states-world-economy&from=auto&to=zh&source=url
act=url&depth=1&hl=es&ie=UTF8&prev=_t&rurl=translate.google.com&sl=en&tl=es&u=https://blank.com/blogs&usg=ALkJrhg4-yg2KUgU7WOQILvSuSrSOyDCtg
/translate_c?act=url&depth=1&hl=pt-BR&ie=UTF8&prev=_t&rurl=translate.google.com.br&sl=pt-BR&tl=en&u=https://blank.com/watch/light-elections-expansion-and-inequality&usg=ALkJrhjM-N2rme-LIbF4TRmq0A_lQpl...

Re: Remove text before URL in GA or GTM

[ Edited ]
Explorer ✭ ✭ ☆
# 2

Hi @Jeremey T,

You can try Analytics view filters, using either a Search and Replace filter or an Advanced filter, to strip the unwanted part of the URL. You'll need a suitable regex, so try it in a test view first.
Check here : https://support.google.com/analytics/answer/1033162?hl=en

 
Or

 

In GTM you can use the Fields to Set option to send a cleaned-up page URL to Analytics (so you don't need a filter in Analytics).
Check : http://www.lunametrics.com/blog/2016/09/27/fields-set-google-tag-manager/ 
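
The GTM route amounts to computing a clean page path before the hit is sent and passing it through Fields to Set (for example as the `page` field). In GTM itself that logic would live in a variable written in JavaScript; the sketch below only illustrates the idea in Python. The helper name `clean_page_path` and the rule "take the first query value that starts with http:// or https://" are assumptions for illustration, not something taken from the linked article.

```python
from urllib.parse import urlsplit, parse_qsl


def clean_page_path(page: str) -> str:
    """Return the path of a URL embedded in a query parameter, if any.

    Hypothetical helper: if some query value looks like a full URL
    (as in the translate-proxy examples), keep only that URL's path.
    Otherwise keep the original path without its query string.
    """
    # Some reported pages have no "?" at all, so fall back to the whole string.
    query = urlsplit(page).query or page
    for _name, value in parse_qsl(query, keep_blank_values=True):
        if value.startswith(("http://", "https://")):
            return urlsplit(value).path or "/"
    return urlsplit(page).path or page


print(clean_page_path(
    "/transpage?query=https://blank.com/blogs/issues-watch&from=en&to=zh&source=url"
))
# /blogs/issues-watch
```

Parsing the query string avoids a long regex, at the cost of maintaining a small piece of code in the tag setup.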

Hope this helps
Thanks,
Ritwik

Re: Remove text before URL in GA or GTM

Visitor ✭ ✭ ✭
# 3

Thanks for the response. 

 

As for the regex, that is where I am lost; I have no clue what it would look like. I also wonder whether, with so many variations, the regex may be too complicated.

 

Thanks for pointing me towards that link. I've read it from top to bottom, but can't seem to find how it will help me.

Remove text before URL in GA or GTM

Explorer ✭ ✭ ☆
# 4

Hi @Jeremey T,

For a URL such as:
/transpage?query=https://blank.com/commentary/currency-idea&from=en&to=zh&source=url
you can use a Search and Replace view filter in Analytics.
In Search String enter .*https\:\/\/ and leave Replace String empty. The result will be just:

blank.com/commentary/currency-idea&from=en&to=zh&source=url
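
As a rough check of the pattern above, the same search and replace can be reproduced in Python. This is only a sketch: Python's `re` module stands in for the Analytics filter engine, which may differ in details, and `strip_before_https` is a hypothetical helper name.

```python
import re

# Search pattern from the post above; the replacement is an empty string.
SEARCH = r".*https\:\/\/"


def strip_before_https(page: str) -> str:
    """Remove everything up to and including the last 'https://'."""
    return re.sub(SEARCH, "", page)


example = (
    "/transpage?query=https://blank.com/commentary/currency-idea"
    "&from=en&to=zh&source=url"
)
print(strip_before_https(example))
# blank.com/commentary/currency-idea&from=en&to=zh&source=url
```

Note that this pattern only matches `https://`; a URL that embeds `http://` is left unchanged, and the trailing query parameters are still present.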

 

You can use this tool to test the regex:
http://www.regexe.com/

Here's a good interactive site to learn regex :
https://regexone.com/


You can see some examples for search and replace Analytics view filters here:
https://support.google.com/analytics/answer/1034834?hl=en

 


Hope this helps
Thanks,
Ritwik

Re: Remove text before URL in GA or GTM

Visitor ✭ ✭ ✭
# 5

Thanks for that! A couple of questions for clarification. Will this work for more than the transpage URLs? You can see more examples here in a Google Sheet. Also, sometimes the URL is http and not https.

 

What if I added a filter to incorporate the full domain name? Would that make this regex easier, or would it complicate things because there would then be two domain URLs?

 

Would this affect the filter I plan on using to remove the query parameters at the end of the URL? If so, should the filter you are suggesting be #1 in the filter order?

 

 

Capture1.PNG

Capture.PNG

 

Your help is very much appreciated!!!

Remove text before URL in GA or GTM

Explorer ✭ ✭ ☆
# 6

Hi @Jeremey T,

You can try using one filter: ((&|\?)(.*)=[^&]*)|(.*https?\:\/\/)
((&|\?)(.*)=[^&]*) : removes the query parameters (the filter you already planned to apply).
(.*https?\:\/\/) : removes everything up to and including http:// or https://.

Or
the same thing can be done with two separate filters, applying (.*https?\:\/\/) first and then ((&|\?)(.*)=[^&]*).

Eg: /transpage?cb=translateCallback&ie=utf8&source=url&query=http://www.blank.com/experts/research-analysts&from=en&to=zh&token=&monLang=zh

 

Would give : www.blank.com/experts/research-analysts
(try testing on links above)
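
The combined filter and the two-filter variant can be compared with a small Python sketch. Python's `re` module is used here as a stand-in for the Analytics filter engine (an assumption; the two may differ in details), and `apply_filters` is a hypothetical helper. The last line also tries the two filters in the reverse order, to show why the prefix filter has to come first.

```python
import re

STRIP_PREFIX = r"(.*https?\:\/\/)"
STRIP_PARAMS = r"((&|\?)(.*)=[^&]*)"
COMBINED = STRIP_PARAMS + "|" + STRIP_PREFIX


def apply_filters(page, patterns):
    """Apply each search pattern in order, replacing matches with nothing."""
    for pattern in patterns:
        page = re.sub(pattern, "", page)
    return page


example = (
    "/transpage?cb=translateCallback&ie=utf8&source=url"
    "&query=http://www.blank.com/experts/research-analysts"
    "&from=en&to=zh&token=&monLang=zh"
)

one_filter = apply_filters(example, [COMBINED])
prefix_first = apply_filters(example, [STRIP_PREFIX, STRIP_PARAMS])
params_first = apply_filters(example, [STRIP_PARAMS, STRIP_PREFIX])

print(one_filter)    # www.blank.com/experts/research-analysts
print(prefix_first)  # www.blank.com/experts/research-analysts
print(params_first)  # /transpage
```

In this sketch, running the parameter filter first removes the embedded URL along with the rest of the query string, so only /transpage is left.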

 

Also, you can apply a further filter to this URL too (if you want to attach another domain name, etc.)

Let me know if that helps.

Thanks,
Ritwik 

Re: Remove text before URL in GA or GTM

Visitor ✭ ✭ ✭
# 7

This does help, but I am not clear on what you mean by your last comment. 

 

"Also, you can apply a further filter to this URL too (if you want to attach another domain name, etc.)"

 

I would like to try the all-in-one filter approach, which I have set up as shown in the screenshot. Is there a way I can also exclude my domain name?

 

Eg: /transpage?cb=translateCallback&ie=utf8&source=url&query=http://www.blank.com/experts/research-analysts&from=en&to=zh&token=&monLang=zh

 

Would give : www.blank.com/experts/research-analysts

 

My ultimate goal is to have no domain: /experts/research-analysts

 

Capture.PNG

 

 

Remove text before URL in GA or GTM

[ Edited ]
Explorer ✭ ✭ ☆
# 8

Hi @Jeremey T,

OK, I thought your goal was to keep the domain name in the URL. What I meant was that you can modify the output www.blank.com/experts/research-analysts further (for example, excluding the domain name or changing it).

Now, to get to your final goal,
use this as the search string: ((&|\?)(.*)=[^&]*)|(.*blank\.com)

Eg: /transpage?cb=translateCallback&ie=utf8&source=url&query=http://www.blank.com/experts/research-analysts&from=en&to=zh&token=&monLang=zh

 

Would give : /experts/research-analysts
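
A quick way to try this search string on several of the URLs from the thread is a short Python sketch. It assumes Python's `re.sub` behaves like the Analytics Search and Replace filter with an empty replacement, which may not hold in every detail; `to_site_path` is a hypothetical helper name.

```python
import re

# Search string from the post above.
SEARCH = r"((&|\?)(.*)=[^&]*)|(.*blank\.com)"


def to_site_path(page: str) -> str:
    """Drop everything through 'blank.com' and the trailing parameters."""
    return re.sub(SEARCH, "", page)


examples = [
    "/transpage?cb=translateCallback&ie=utf8&source=url"
    "&query=http://www.blank.com/experts/research-analysts"
    "&from=en&to=zh&token=&monLang=zh",
    "/transpage?query=https://blank.com/blogs/issues-watch&from=en&to=zh&source=url",
    "/transpage?query=https://blank.com/&from=en&to=zh&source=url",
]
for page in examples:
    print(to_site_path(page))
# /experts/research-analysts
# /blogs/issues-watch
# /
```

A page that is already a plain path, such as /blogs/issues-watch, contains nothing the pattern matches and passes through unchanged.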

Let me know if it worked

Thanks,
Ritwik

Re: Remove text before URL in GA or GTM

[ Edited ]
Visitor ✭ ✭ ✭
# 9

Sorry for the confusion. 

 

Does the .* before the blank\.com also remove other variations of a domain name? For example, if I had lank.com as a domain and people still link to it, will it remove that as well?

 

Domain variations

blank.com

lank.com

blankinstitute.org
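
This question can be checked directly with a small Python sketch. The `.*` only controls how much text before the domain is consumed; the literal blank\.com still has to appear in the URL. The URLs, the helper names, and the `ALL_DOMAINS` pattern below are assumptions made for illustration, and Python's `re` module is standing in for the Analytics filter engine.

```python
import re

SINGLE_DOMAIN = r".*blank\.com"
# Hypothetical pattern listing every domain variation explicitly.
ALL_DOMAINS = r".*(b?lank\.com|blankinstitute\.org)"


def matches_single(url: str) -> bool:
    """True if the single-domain pattern finds anything to remove."""
    return re.search(SINGLE_DOMAIN, url) is not None


def strip_any_domain(url: str) -> str:
    """Remove everything through any of the listed domain variations."""
    return re.sub(ALL_DOMAINS, "", url)


urls = [
    "/transpage?query=http://www.blank.com/experts",
    "/transpage?query=http://lank.com/experts",
    "/transpage?query=http://blankinstitute.org/experts",
]
for url in urls:
    print(matches_single(url), strip_any_domain(url))
# True /experts
# False /experts
# False /experts
```

So, in this sketch, the single-domain pattern leaves lank.com and blankinstitute.org URLs untouched, while a pattern that lists the variations strips all three.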

 

You've been a tremendous help, I really appreciate it! 

 

Update:

 

It seems to have removed all but one query parameter. I will upload the examples tomorrow as the data is still coming in. 
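
The thread does not show which parameter survived. One case where the search string, as written, does leave a parameter behind is a trailing parameter with no "=value" part, because the parameter half of the pattern requires an "=". The sketch below demonstrates that with a made-up URL; the looser `ALTERNATIVE` pattern is an assumption, not something proposed in the thread, and Python's `re` module stands in for the Analytics filter engine.

```python
import re

SEARCH = r"((&|\?)(.*)=[^&]*)|(.*blank\.com)"
# Hypothetical alternative: drop everything from the first "?" or "&" onward.
ALTERNATIVE = r"(.*blank\.com)|([?&].*)"

# Made-up URL whose last parameter has no "=value" part.
page = "http://blank.com/blogs?a=1&flag"

print(re.sub(SEARCH, "", page))       # /blogs&flag
print(re.sub(ALTERNATIVE, "", page))  # /blogs
```

The alternative is blunter: it removes every parameter, including any that a site search report might rely on.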

 

Re: Remove text before URL in GA or GTM

[ Edited ]
Visitor ✭ ✭ ✭
# 10

Would it be possible for me to share a test view in GA with you? That may be easier for the both of us.

 

As I mentioned in my previous post, it removed all query parameters but one. It may also have caused a couple of issues with URLs. It's not a lot, but I am wondering if it could snowball.

 

Here is a link to a Google Sheet that gives you a snapshot of the issues.

https://drive.google.com/file/d/0B43l1TGiK4KlZ1cxemUzT25EMDg/view?usp=sharing

 

In that sheet you'll also see an issue with displaying search results on our site. I have attached screenshots to show how I am capturing and filtering them.

search.PNG

 

 

search-term-filter-test.PNG

filters.PNG

 

 Update: I added a filter that I plan on using going forward. 

 

filters-1.png