Scheduled Fetch Failing
We have a scheduled fetch set up for a URL like the following: https://example.com/media/feeds/test_google_base_default.txt
This link is accessible in the browser just fine, and the file adheres to all of the specifications.
However, when we issue a fetch, whether automatic or manual, the following error is returned:
Failed to connect to the remote server. Please make sure the source URL in your feed configuration is a valid one.
Can anyone provide any insight on this issue? We have also checked that robots.txt is not blocking access to this file/directory.
Any help is appreciated. Thanks!
Re: Scheduled Fetch Failing
March 2017 - last edited March 2017
normally, a scheduled-fetch triggers a get-request by google and is indicated
by a web/server log-entry from one of google's user-agents via one of google's
ip-addresses; for example:
126.96.36.199 - - [14/Mar/2017:22:23:29 -0400] "GET /help-forum/help-forum-au.xml HTTP/1.1" 200 - - "google-xrawler" "celebird-support.appspot.com" ms=7 cpu_ms=0 cpm_usd=0 loading_request=0 instance=- app_engine_release=1.9.48 trace_id=fa5020116f2d4de10e4724f8fcbaa22f
such details are one reason why most browsers simply cannot effectively verify
google's automated requests -- network/server configurations may block such
requests, but not those from a typical browser/user-agent.
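a minimal sketch of locating such a log-entry, assuming an apache-style
combined access-log; the log path is an assumption -- set ACCESS_LOG to
the server's real path before running.

```shell
# assumption: apache-style access log at a common default location
LOG="${ACCESS_LOG:-/var/log/apache2/access.log}"

# show the most recent requests from google's feed fetcher / crawlers
grep -iE 'google-xrawler|googlebot|adsbot-google' "$LOG" | tail -n 20
```

if no lines appear around a fetch-time, the request most likely never
reached the server -- which would point at a network or firewall block.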
generally, a browser cannot be used to verify data-feed fetches by google --
most browsers, by default, cannot effectively verify the automated googlebot,
adsbot-google, or googlebot-image crawls or related feed fetches by google.
a similar log-entry should occur within the server's logs, a few seconds or
so after a fetch-now is requested, or soon after a scheduled-time is reached.
also, depending on the hosting-company, the servers may have separate error,
access, conditional, or syslog logging -- these should all likely be checked;
asking the hosting-company's support organization directly may sometimes help.
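a rough sketch for scanning the most common error-log locations for
block/deny entries; the paths are common defaults, not guaranteed --
they vary by hosting-company and distro, so adjust as needed.

```shell
# assumption: typical apache/nginx error-log locations; override via LOG_CANDIDATES
CANDIDATES="${LOG_CANDIDATES:-/var/log/apache2/error.log /var/log/nginx/error.log /var/log/httpd/error_log}"

for f in $CANDIDATES; do
  if [ -r "$f" ]; then
    echo "== $f =="
    # surface likely block/deny entries near the fetch-time
    tail -n 100 "$f" | grep -iE 'denied|forbidden|blocked|limit' || echo "(no obvious blocks)"
  fi
done
```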
scheduled-fetch details are also available when using a test-feed --
which may be a good option when attempting to locate a root-cause,
without adversely affecting a live/standard feed or live/active products.
a test-feed has the added advantage of allowing more tests
with different file locations, while not interfering with live data.
one possibility would be to use a test-feed with http, rather than https, with
a fetch-now, to at least verify the crawler/ip/user-agent can access any file
within the website -- assuming the server responds to both protocols.
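a sketch of comparing the two protocols from the command line; the url
below is the example from the question, not a working endpoint --
substitute the actual test-feed host/path before running.

```shell
# assumption: hypothetical feed location; replace with the real one
FEED_PATH="${FEED_PATH:-example.com/media/feeds/test_google_base_default.txt}"

check_status() {
  # discard the response body, print only the numeric http status code
  curl -s -o /dev/null --max-time 15 -w '%{http_code}' "$1"
}

for proto in http https; do
  echo "$proto -> $(check_status "$proto://$FEED_PATH")"
done
```

note this still runs from your own network/ip -- a 200 here does not
prove google's crawler gets the same response, only that the server
answers each protocol at all.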
another test with a test-feed would be to simply duplicate the https url
with a few fetch-nows in succession while watching the server's logs --
possibly again but with much smaller feed-files for quicker analysis.
check the server's log files; check any server or
networking blocks of user-agents or ip-addresses;
verify the exact https/protocol response is 200/OK,
at all times -- especially around fetch-times.
however, this is mainly a peer-to-peer forum -- forum-members cannot look
into any submitted feed or account details; forum-members can mainly offer
suggestions based on the details posted here in public.
posting the exact link/url being used for the fetch here within the
public forums may sometimes help others offer more specific suggestions.
otherwise, the best likely course is to simply contact google directly --