Analytics
Ask questions about filter set-up and issues with using filters in Google Analytics reports
 

Track subfolders on many levels with and without trailing slash

Visitor ✭ ✭ ✭
# 1

I want to track subfolders, which can be nested up to five levels deep.

 

The folders look like:

  • /a-1/ and /a-1,
  • /a-1/b-2/ and /a-1/b-2,
  • /a-1/b-2/c-3/ and /a-1/b-2/c-3,
  • /a-1/b-2/c-3/d-4/ and /a-1/b-2/c-3/d-4,
  • /a-1/b-2/c-3/d-4/e-5/ and /a-1/b-2/c-3/d-4/e-5.

As the examples show, each URL comes in two variants: with and without a trailing slash. I need to track them based on their path alone, so that /a-1/ and /a-1 are matched together, as are /a-1/b-2/ and /a-1/b-2, and so on.

 

Matching them manually is not an option: there are already many subfolders, and many more will be added in the future. The matching should be based on a regular expression that works as an advanced filter (or segment).

 

My idea for the regex was to count the slashes in the string to determine the nesting level, and then to check whether the string ends with a trailing slash. Following this idea, matching the two subfolders /a-1/b-2/ and /a-1/b-2 would take two rules:

  • if there are 3 slashes and the string ends with a slash
  • if there are 2 slashes and the string does not end with a slash

The regex I wrote for matching second-level subfolders,

(.*\/){3}\/$|(.*\/){2}[^\/]$

, doesn't work: it matches something completely different.
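
A quick check outside GA suggests why the pattern misses. Because `.*` can swallow slashes, `{3}` and `{2}` only require a minimum number of slashes rather than counting folder levels. In addition, each alternative pins the end of the string too narrowly: the first needs the string to end in a double slash, the second needs exactly one character after the last slash. The sketch below uses JavaScript's regex engine as a stand-in, which is an assumption, since GA's engine may differ in details. The sample paths are made up for illustration.

```javascript
// The pattern from the post, unchanged.
var original = /(.*\/){3}\/$|(.*\/){2}[^\/]$/;

// Hypothetical sample paths, used only for illustration.
var samplePaths = ['/a-1/b-2/', '/a-1/b-2', '/a/b', '/x/y//'];

samplePaths.forEach(function (path) {
  console.log(path, original.test(path));
});
// /a-1/b-2/ false   (intended match, missed)
// /a-1/b-2  false   (intended match, missed)
// /a/b      true    (one character after the last slash)
// /x/y//    true    (ends in a double slash)
```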

 

 

Please point me to the solution, or just into the right direction.

 

Best regards from Berlin

egon

 


Re: Track subfolders on many levels with and without trailing slash

Top Contributor
# 2

Hi there,

as much as I love regex, GA does not play well with repeating elements in a regular expression, let alone with URL structure.

I would advise updating your tagging to capture folder levels in Custom Dimensions (reserve 5 of those, hit-scope, 1 for each "level").


In your tracking code, use the following (assuming you use Universal Analytics):

 

<script>
  (function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
  (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),
  m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
  })(window,document,'script','https://www.google-analytics.com/analytics.js','ga');

  ga('create', 'UA-7634164-5', 'auto');

  // parse the URL structure into folder names; dropping empty segments
  // makes /a-1/b-2/ and /a-1/b-2 yield the same levels
  var _dp = document.location.pathname;
  var folders = _dp.split('/').filter(Boolean);

  // change pageview tracking to include custom dimensions with folder "levels"
  ga('send', 'pageview',{
    'dimension1': folders[0],
    'dimension2': folders[1],
    'dimension3': folders[2],
    'dimension4': folders[3],
    'dimension5': folders[4]
  });

</script>
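
To see how a path maps onto the five dimensions, here is a minimal sketch of the path parsing as standalone functions. The names `folderLevels` and `folderDimensions` are hypothetical, not part of analytics.js. Empty segments are dropped, so the variants with and without a trailing slash give the same levels.

```javascript
// Split a pathname into folder levels, dropping the empty strings that
// leading, trailing, or doubled slashes produce.
function folderLevels(pathname) {
  return pathname.split('/').filter(function (segment) {
    return segment !== '';
  });
}

// Build the custom-dimension fields for at most maxLevels levels.
// Dimension indexes 1..5 are assumed to be the ones reserved for folder levels.
function folderDimensions(pathname, maxLevels) {
  var fields = {};
  folderLevels(pathname).slice(0, maxLevels).forEach(function (name, i) {
    fields['dimension' + (i + 1)] = name;
  });
  return fields;
}

// Both calls log an object with dimension1 = 'a-1' and dimension2 = 'b-2'.
console.log(folderDimensions('/a-1/b-2/', 5));
console.log(folderDimensions('/a-1/b-2', 5));
```

The resulting object could then be passed as the fields argument of the pageview call, for example `ga('send', 'pageview', folderDimensions(document.location.pathname, 5));`.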

Hope that helps.

VP & Chief Evangelist at Hub'Scan | Contact me
Level 80 Digital Analytics Warrior, KPI Therapist and Keeper of the One True Tagging Plan

Re: Track subfolders on many levels with and without trailing slash

[ Edited ]
Visitor ✭ ✭ ✭
# 3

Thank you for the answer, but it doesn't help.

 

In the meantime I found a solution myself using regular expressions. I'm posting it here in case somebody else is looking for the same thing:

 

Two regular expressions are needed to get the job done: the first matches the wanted subfolders, the second excludes subfolders nested deeper than wanted.

 

For example, say we need to track subfolders at the second nesting level, with and without a trailing slash. They look like /a-1/b-2/ and /a-1/b-2:

 

1. Go into Behavior → Site Content → All Pages

 

2. Match the wanted subfolders. Create an advanced filter; first filtering rule: Include, Dimensions → Page, Matching RegExp

(/[^\s]+){2}?$

 

3. Exclude subfolders nested deeper than level 2. Create a second filtering rule: Exclude, Dimensions → Page, Matching RegExp

(/[^\s]+){3}?$

Mission accomplished - Enjoy!
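
The include/exclude pair from the steps above can be tried out outside GA. The sketch below runs both rules in JavaScript, which is an assumption, since GA's regex engine may differ in details. The function name `passesFilter` and the sample pages are made up for illustration. It also shows a hypothetical single-pattern variant (`singleRule`, not tested in GA) that anchors both ends and keeps slashes out of the segment class.

```javascript
// The two filter rules from steps 2 and 3 (slashes escaped for JS literals).
var includeRule = /(\/[^\s]+){2}?$/;
var excludeRule = /(\/[^\s]+){3}?$/;

// A page stays in the report if it matches the include rule
// and does not match the exclude rule.
function passesFilter(page) {
  return includeRule.test(page) && !excludeRule.test(page);
}

// Hypothetical alternative: exactly two non-empty segments,
// with an optional trailing slash.
var singleRule = /^(\/[^\/\s]+){2}\/?$/;

// Made-up sample pages.
var pages = ['/a-1', '/a-1/', '/a-1/b-2', '/a-1/b-2/',
             '/a-1/b-2/c-3', '/a-1/b-2/c-3/'];

pages.forEach(function (page) {
  console.log(page, passesFilter(page), singleRule.test(page));
});
// Only /a-1/b-2 and /a-1/b-2/ yield true for both checks.
```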

 

PS: To keep the subfolder statistics within easy reach, it is convenient to create shortcuts.

Re: Track subfolders on many levels with and without trailing slash

Top Contributor
# 4
Nice!
With hindsight, couldn't you have used the drilldown URL reports?

Re: Track subfolders on many levels with and without trailing slash

Visitor ✭ ✭ ✭
# 5

How do you mean?

 

I think what I've done is a kind of drilled-down report.

 

BTW: do you happen to know why regular expressions, used the way I used them in the filter, don't work in advanced segments? I tried to create an advanced segment with the same regular expressions but failed miserably: the regex delivered completely different, unexpected results.

Re: Track subfolders on many levels with and without trailing slash

Top Contributor
# 6

That's what I was referring to regarding regex: they get tricky and sometimes "break" segments. Not sure exactly why, though.

 

About drilldowns, go to Behavior > Site Content > Content Drilldown

Screenshot 2016-07-05 13.27.23.png

From there you can drill down the folder structure.


Re: Track subfolders on many levels with and without trailing slash

Visitor ✭ ✭ ✭
# 7

Ah, got it with the drilldown. I tested it, but it doesn't process my URLs cleanly enough, so I can't use it :( It would be very convenient, however.

 

I tested advanced segments with regex a bit further: whether an advanced segment delivers correct output for a regex seems to depend on the account. I tested the same regex on 5 Analytics accounts (5 different websites). I don't know why, but in 4 accounts the results of the same regex in the filter and in the segment are pretty much the same, while in one account the segment shows something completely different from the filter and behaves as if it did not interpret the regex at all.

 

Why, why, just why?