page views much lower than internal counters and access log analysis

Visitor ✭ ✭ ✭
# 1

We have noticed a significant difference (105%) between Google Analytics page views and the counts we get from analyzing the access log with Sawmill. We also track views on certain pages internally in our database (views per story). Our internal numbers match Sawmill.

 

I've verified our analytics code with Tag Assistant, and it's just a basic install, nothing fancy.

 

The main page I am focusing on is the view story page, so the user would generally stay on it a good amount of time, allowing analytics plenty of time to load.

 

Can anyone advise me on what I could check next?

 

Here is one of the pages in question:  http://www.cubshq.com/update/player/Arrieta-If-I-Id-stay-19230

 

Thanks very much!

Re: page views much lower than internal counters and access log analysis

Rising Star
# 2
Hey Brian,

It's been a LONG time since I parsed any server logs; with that caveat, let me just say that GA works differently and collects data more selectively than server logs do. That is, server logs record every request regardless of where it came from.

Outside of the Measurement Protocol and referral spam, GA requires JavaScript and images to be enabled in order for a hit to be generated. For example, most web crawlers will not register any activity in GA, while they will be fully represented in your server logs. Bottom line: GA's numbers mostly reflect human visitors.
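
To sanity-check how much of an access log is crawler traffic, a rough Python sketch along these lines may help. It assumes the log uses the common "combined" format, where the user agent is the last quoted field, and the marker list is a hypothetical, deliberately short one; real log analyzers such as Sawmill use far larger signatures.

```python
import re

# Hypothetical, incomplete list of crawler markers (an assumption for this sketch).
BOT_PATTERN = re.compile(r"bot|crawler|spider|slurp", re.IGNORECASE)

# In the combined log format the user agent is the last quoted field on the line.
USER_AGENT_FIELD = re.compile(r'"([^"]*)"\s*$')


def is_probable_bot(log_line: str) -> bool:
    """Return True if the line's user agent looks like a crawler."""
    match = USER_AGENT_FIELD.search(log_line)
    if match is None:
        return False  # no user agent field found; count it as unknown, not as a bot
    return BOT_PATTERN.search(match.group(1)) is not None


def bot_share(log_lines: list[str]) -> float:
    """Fraction of log lines whose user agent matches the crawler pattern."""
    if not log_lines:
        return 0.0
    return sum(is_probable_bot(line) for line in log_lines) / len(log_lines)


# Two made-up sample lines, for illustration only.
sample_lines = [
    '1.2.3.4 - - [01/Jan/2000:00:00:00 +0000] "GET /a HTTP/1.1" 200 512 "-" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '5.6.7.8 - - [01/Jan/2000:00:00:05 +0000] "GET /a HTTP/1.1" 200 512 "-" '
    '"Mozilla/5.0 (iPhone; CPU iPhone OS 9_1 like Mac OS X)"',
]
print(f"bot share: {bot_share(sample_lines):.0%}")
```

User-agent matching only catches crawlers that identify themselves, so the result is best read as a lower bound on non-human traffic.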

Hope that helps and Go Cubbies!

Best,

Theo Bennett
Analytics Evangelist at MoreVisibility

Re: page views much lower than internal counters and access log analysis

Visitor ✭ ✭ ✭
# 3
Thanks for the reply, Theo. Yes, I understand that GA requires JavaScript and images. Sawmill, which I am using to parse the access log, reports that only 3% of visitors were spiders, so that does not seem to be the issue. Most ad blockers do not disable JavaScript either, and I've read that the majority of ad blockers do not block GA.

One thing to note is that our traffic is around 90% mobile, but since I have the GA code right below the Google DFP code in the head, it seems that it would load before the page does. So it doesn't seem like the traffic being mobile would have any effect, unless users just didn't wait for the page to load.

Here is an example from yesterday:

GA: 536 views
Sawmill: 3,972 views (after subtracting 3% for spiders)

So that is roughly a 7.4x difference.
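
For reference, the arithmetic behind that comparison can be checked with a few lines of Python. The two counts are the ones quoted above; the variable names are just for this sketch.

```python
ga_views = 536     # Google Analytics page views, from the post
log_views = 3972   # Sawmill page views after the 3% spider adjustment, from the post

ratio = log_views / ga_views     # how many times larger the log count is
captured = ga_views / log_views  # share of logged views that GA recorded
missing = 1 - captured           # share of logged views that GA never saw

print(f"log/GA ratio: {ratio:.2f}x")     # 7.41x
print(f"GA captured:  {captured:.1%}")   # 13.5%
print(f"GA missing:   {missing:.1%}")    # 86.5%
```

Put another way, GA is recording only about one in every seven page views that the server logged for that day.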

Thanks again for any suggestions!

--Brian.

Re: page views much lower than internal counters and access log analysis

Rising Star
# 4
Hey Brian,

I would hesitate to compare any log parser against tag-based analytics. You may want to run another tag-based platform in parallel (your results will still vary from one to the next). This is not an endorsement, but you could try something like Piwik, which is also free, and see how the data compares to GA.

A properly behaved ad blocker should not block GA, and GA works fine on mobile.

Hope that helps...

-Theo

Analytics Evangelist at MoreVisibility

Re: page views much lower than internal counters and access log analysis

Visitor ✭ ✭ ✭
# 5
Thanks again, Theo. I did try Piwik, and it shows almost exactly the same numbers as GA. We are still seeing the huge discrepancy between GA on one side and the verified page views in the logs and our internal hit tracking on the other. Someone from Sawmill is trying to help figure it out, which I really appreciate. If I ever get an answer I'll post it here. Thanks.

Re: page views much lower than internal counters and access log analysis

Rising Star
# 6
Brian, one thing to keep in mind is that server log files work at the server level, while GA works at the application or website level. My server logs show almost 400% more pageviews than GA, but that is OK because they are apples and oranges: you cannot and never will match them one for one. If Piwik is similar to GA in its counts, then you have no issue, just different fruit. Somewhere on this forum there is a thread or two about this subject and what can cause GA not to count a pageview. Sorry, I am just too busy at the end and beginning of the month to really search for it, but it is out there.

Re: page views much lower than internal counters and access log analysis

Visitor ✭ ✭ ✭
# 7
Thanks Brian. Yes, I totally realize they are apples and oranges and would never match, but I would expect them to be much closer than they are.

I am focusing on article pages for my testing, so the majority of people should stay on the page long enough for GA to register the page view. There has to be a reason why a given page view is recorded on the server and not by GA.

Reasons I know of:

* Bots / spiders - my log analyzer shows only 3% are bots/spiders, but let's just say 10%
* Real users with JavaScript disabled - would have to be extremely low, < 1%
* User clicked away before the page view was recorded - not very likely, especially in my scenario, but let's just say 20%
* Ad blockers - not supposed to block GA normally, and our measured ad block rate is only 5% anyway

That adds up to around 36%. Can you think of any other reasons I am missing that could get me to 400%? I do load my ad server code before GA, but I don't think errors in the ads would have any effect on executing GA. I am going to put GA in front of it, though, just to rule that out.
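
As a quick check of that budget, here is a small Python sketch that sums the upper-bound estimates from the list and converts the total into the largest log-to-GA ratio those causes alone could produce. Adding the shares together is a simplifying assumption (the causes can overlap), and the dictionary keys are names made up for this sketch.

```python
# Upper-bound estimates from the list above, as fractions of server-side page views.
loss_estimates = {
    "bots_and_spiders": 0.10,
    "javascript_disabled": 0.01,   # the post says "< 1%", so 1% is used as a ceiling
    "left_before_hit_sent": 0.20,
    "ad_blockers": 0.05,
}

total_loss = sum(loss_estimates.values())  # simple additive upper bound
max_expected_ratio = 1 / (1 - total_loss)  # log views per GA view if only these causes apply

print(f"explained loss:        {total_loss:.0%}")
print(f"expected log/GA ratio: {max_expected_ratio:.2f}x")
```

Even with these generous estimates, the listed causes would account for a log count only about 1.56 times the GA count, which is far short of the multiples being reported in this thread.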

Thanks again for any advice!

Re: page views much lower than internal counters and access log analysis

Rising Star
# 8
Brian, I cannot think of any additional items that would cause a page not to be counted in GA.

Re: page views much lower than internal counters and access log analysis

Visitor ✭ ✭ ✭
# 9
I tried using the Google Analytics Measurement Protocol (https://developers.google.com/analytics/devguides/collection/protocol/v1/), which is essentially Google Analytics called from the server side. Those numbers match my internal counters and server logs after subtracting Googlebot.
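
For readers who have not used it, a server-side pageview hit under Measurement Protocol v1 looks roughly like the Python sketch below. This is not the code used on the site in question; the function names are invented for the example, and the tracking ID and client ID shown in the usage note are placeholders.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

COLLECT_URL = "https://www.google-analytics.com/collect"  # Measurement Protocol v1 endpoint


def build_pageview_payload(tracking_id: str, client_id: str, host: str, path: str) -> bytes:
    """Encode a v1 pageview hit as a form-encoded POST body."""
    params = {
        "v": "1",            # protocol version
        "tid": tracking_id,  # property ID; "UA-XXXXX-Y" is a placeholder
        "cid": client_id,    # anonymous client ID, typically a UUID
        "t": "pageview",     # hit type
        "dh": host,          # document host name
        "dp": path,          # document path
    }
    return urlencode(params).encode("utf-8")


def send_pageview(payload: bytes, timeout: float = 2.0) -> int:
    """POST the hit and return the HTTP status code."""
    request = Request(COLLECT_URL, data=payload, method="POST")
    with urlopen(request, timeout=timeout) as response:
        return response.status
```

A call such as `send_pageview(build_pageview_payload("UA-XXXXX-Y", "35009a79-1a05-49d7-b876-2b884d0f825b", "www.example.com", "/some/story"))` would send one hit. Because the hit is generated by the server for every request, it bypasses everything that can stop the browser tag from firing, which is consistent with these numbers lining up with the logs.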

I also put an iframe inside a noscript tag so I could count how many page views had JavaScript disabled. There were only 5 out of 15k page views, which really surprised me. For some reason Googlebot does not trigger the noscript, but I can see exactly how many page views came from Googlebot in the webmaster control panel.

Could there be some kind of cache issue happening on the client side? I couldn't find any kind of cachebuster parameter for the GA tag.

Thanks again for any help. There has to be a way to find out why so many hits are not counted by GA.

Re: page views much lower than internal counters and access log analysis

Visitor ✭ ✭ ✭
# 10
I thought of one more interesting piece of info. The views on our articles range from 500 to 15,000, but the discrepancy stays at around the same percentage. If the uncounted views were coming from some kind of bot or spammer traffic, I would expect the missing percentage to be greater for the articles with fewer views. So to me that more or less rules out bots and other non-human traffic as the source of the problem.
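
The reasoning here can be illustrated with a toy model in Python. It assumes that non-human traffic hits every article in roughly the same absolute volume, which is an assumption of this sketch and not something measured on the site; the figure of 400 hits per article is purely hypothetical.

```python
def missing_share_fixed_noise(total_views: int, noise_views: int) -> float:
    """Share of logged views that are noise when every article gets the same noise volume."""
    return noise_views / total_views


NOISE_PER_ARTICLE = 400  # hypothetical constant number of non-human hits per article

for total in (500, 5000, 15000):  # 500 and 15,000 are the range given in the post
    share = missing_share_fixed_noise(total, NOISE_PER_ARTICLE)
    print(f"{total:>6} logged views -> {share:.1%} missing under the fixed-noise model")
```

Under that model the missing share drops from 80.0% to 2.7% as an article's views grow, so a discrepancy that stays flat across articles points toward something proportional to real traffic.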

I know this is unlikely, but could there be a DNS issue with some big ISP where it's not resolving google-analytics.com?
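
A resolution check is easy to script, with the caveat that it only tests the DNS resolver of whatever machine runs it, so it would have to be run from inside the suspect ISP's network to say anything useful. A minimal Python sketch, where `can_resolve` is a helper name made up for this example:

```python
import socket


def can_resolve(hostname: str, resolver=socket.getaddrinfo) -> bool:
    """Return True if the resolver returns at least one address for hostname."""
    try:
        return len(resolver(hostname, 443)) > 0
    except socket.gaierror:
        return False  # name lookup failed on this machine's resolver
```

Calling `can_resolve("www.google-analytics.com")` returns `True` when the lookup succeeds and `False` when it fails. The `resolver` argument exists only so the function can be exercised without network access.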