
Bug in google crawler for merchant data feed

# 1
Follower ✭ ☆ ☆

When we first uploaded our data feed, our product images were blocked by robots.txt. I have since changed the robots.txt, but I am still getting the error "Images cannot be crawled because of robots.txt restriction".

Our robots.txt can be found here: http://www.klokkegiganten.no/robots.txt

An example of a product image we are providing in the data feed is below:

Primary image:
<g:image_link>
</g:image_link>

Additional image:
<g:additional_image_link>
</g:additional_image_link>
I have tested in Google Webmaster Tools (GWT) and there is no problem crawling the image! Is this a bug in Merchant Center, or am I doing something wrong?
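
For anyone wanting to reproduce the check outside GWT, a quick local
sanity check is possible with Python's standard-library
urllib.robotparser (the image path below is a placeholder, not one of
our real feed URLs):

import urllib.robotparser

# point the stdlib parser at the live robots.txt
rp = urllib.robotparser.RobotFileParser()
rp.set_url("http://www.klokkegiganten.no/robots.txt")
rp.read()

# placeholder path -- substitute a real <g:image_link> URL from the feed
image_url = "http://www.klokkegiganten.no/images/example-product.jpg"

for agent in ("Googlebot-Image", "Googlebot"):
    print(agent, rp.can_fetch(agent, image_url))

# caveat: this parser does plain prefix matching and ignores wildcard
# extensions, so its verdict can differ from googlebot's own matching.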

Re: Bug in google crawler for merchant data feed

# 2
Top Contributor

google may take 24-72 hours or so to re-crawl the website and
all product images, and for the results to be reflected in the
account, after any changes are made and the (feed) data is resubmitted.

also, the same robots message may be generated if an image
violates any google policy or the website cannot keep pace
with all googlebot and googlebot-image crawl requests.

if so, try reducing the image to approximately 300x300 pixels
and be certain the image has very limited borders and white-space --
generally, the image must show only the product itself.

as to the robots.txt -- wild-card matches are typically unsupported and
user-agent directives should only be used at the start of a rule-record.

also, regardless of what might seem fine, or may have worked in the past,
or until the issue is resolved directly with google, try adding the following
nine (9) lines (two rule-records) to the very end of the controlling robots.txt
#

User-agent: Googlebot-image
Disallow:

User-agent: Googlebot
Disallow:

#
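
for reference, these appended records work because a crawler obeys only
the user-agent group that matches it best -- an empty Disallow under
Googlebot-image allows everything for that bot, even if earlier groups
block the same paths. a rough python sketch (standard library only,
with a made-up archive path) illustrates the effect:

import urllib.robotparser

# a miniature robots.txt: a general block plus the appended record
rules = """
User-agent: *
Disallow: /private-archive/

User-agent: Googlebot-image
Disallow:
""".splitlines()

rp = urllib.robotparser.RobotFileParser()
rp.parse(rules)

# the named googlebot-image group wins for that bot...
print(rp.can_fetch("Googlebot-image", "/private-archive/photo.jpg"))  # True
# ...while other crawlers still fall through to the general block
print(rp.can_fetch("SomeOtherBot", "/private-archive/photo.jpg"))     # False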
otherwise, the best likely course is to contact google directly so a person
can look at the specific images being flagged, the website, and account.

Re: Bug in google crawler for merchant data feed

# 3
Follower ✭ ☆ ☆
@Celebird
We do have user-agent directives at the start of a rule record (http://www.klokkegiganten.no/robots.txt). We do not want to open up the whole website to crawling by Googlebot-image and Googlebot, as we have other image archives that we do not want crawled and indexed. Hence the wild-cards.
Marked as Best Answer. Solution accepted by topic author Klelund, September 2015.

Re: Bug in google crawler for merchant data feed

# 4
Top Contributor


generally, rule-records begin after blank-line boundaries,
and wild-cards are not necessarily recognized by all robots.
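
as a concrete illustration, a spec-only parser such as python's
standard library treats a wild-card pattern literally, so a rule like
the one below blocks nothing at all (the paths are made-up):

import urllib.robotparser

# a wildcard rule, written for crawlers that extend the original spec
rules = """
User-agent: *
Disallow: /*images
""".splitlines()

rp = urllib.robotparser.RobotFileParser()
rp.parse(rules)

# plain prefix matching: "/*images" only matches a path that literally
# starts with "/*images", so this archive path comes back as allowed
print(rp.can_fetch("SomeBot", "/archive/images/old.jpg"))  # True

# googlebot's extended matching would instead read /*images as "any
# path containing images", so behavior diverges between crawlers.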

again, one possibility is to simply (a) reduce the image sizes to about
300x300 pixels or so and (b) lessen the white-space boundaries for
one or two sets of flagged images -- while (c) fixing the user-agent
entry and (d) adding the nine googlebot-image and (e) googlebot
entries to the very end of the controlling robots.txt file -- exactly
as indicated -- temporarily -- then, resubmitting the data-feed file
and waiting 24-72 hours or so for a full re-crawl.

these each might be done in five 72-hour stages to help pinpoint the
issue -- a data-feed re-upload would be required between each change
to trigger another re-crawl and to update the account status.
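
for steps (a) and (b), a rough python sketch using the Pillow imaging
library (the filenames are hypothetical) -- trim uniform white borders,
then cap the longest side at 300 pixels:

from PIL import Image, ImageChops

img = Image.open("flagged-product.jpg").convert("RGB")

# (b) crop away uniform white borders by diffing against a white canvas
white = Image.new("RGB", img.size, (255, 255, 255))
bbox = ImageChops.difference(img, white).getbbox()
if bbox:
    img = img.crop(bbox)

# (a) shrink in place so neither side exceeds 300 pixels, keeping aspect
img.thumbnail((300, 300))
img.save("flagged-product-300.jpg", quality=90)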

otherwise, the best likely course is to contact a person at google directly.