Naming convention for large feeds split into multiple smaller files
I have a feed that is 1.2GB in size; the documentation states the file must be under 1GB.
From what I have seen and tested, the files I upload must match the name of the file set up on the data feed, so a wrongly named file will not be processed.
How would I go about doing what the documentation says about breaking a large file into smaller files?
Would I just need to create another data feed and upload 2 files that way?
Or would it need to be something along the lines of GoogleFeed_1.csv and GoogleFeed_2.csv, while the data feed is set to use GoogleFeed.csv?
I can GZ-compress the feed file to shrink it down to 57MB, but when I uploaded it, it was not processed.
Any direction or clear answer would be appreciated!
Re: Naming convention for large feeds split into multiple smaller files
September 2015 - last edited September 2015
(1) a feed-file that is uploaded manually must be less than 20mb.
(2) a feed-file that is uploaded via ftp or a scheduled-fetch
must be less than 1gb uncompressed -- 500mb compressed.
compression is mainly to save upload time (network bandwidth) --
not to submit larger data-feed files than the 1gb maximum.
a best-practice is for each file to contain 50,000 - 100,000 items or so.
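a minimal sketch of splitting along those lines -- this is not an official tool, just an illustration in python, assuming a tab-delimited (.txt) feed whose first row is the header; the items-per-file count and the derived part file names are illustrative, and each part file would still need to be registered as its own feed:

```python
# Sketch: split a large tab-delimited feed into parts of at most
# items_per_file rows each, repeating the header row in every part.
# Output names like feed-1.txt, feed-2.txt are only an example --
# each part must be registered separately and have a unique name.
import csv

def split_feed(path, items_per_file=50_000):
    written = []  # paths of the part files we create
    with open(path, newline="", encoding="utf-8") as src:
        reader = csv.reader(src, delimiter="\t")
        header = next(reader)  # first row is the attribute header
        part = 1
        rows = []

        def flush():
            # write the buffered rows (plus header) to the next part file
            nonlocal part
            out_path = path.replace(".txt", f"-{part}.txt")
            with open(out_path, "w", newline="", encoding="utf-8") as dst:
                writer = csv.writer(dst, delimiter="\t")
                writer.writerow(header)
                writer.writerows(rows)
            written.append(out_path)
            part += 1

        for row in reader:
            rows.append(row)
            if len(rows) == items_per_file:
                flush()
                rows = []
        if rows:  # remaining rows that did not fill a whole part
            flush()
    return written
```

each part could then optionally be gzipped (e.g. with python's gzip module) purely to save upload time, per point (2) above.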
(3) the registered feed file name should match the uploaded file name.
(4) a google-xml data-feed file name should end in .xml
a tab-delimited data-feed file name should end in .txt
csv formatted files are not documented as
supported -- the results may be unexpected.
(5) there is no requirement for any naming-convention for any (split) feed --
each data-feed file-name should simply be unique across all target-countries;
generally, use plain (ascii) text and no spaces for file names.
typically, for splitting, if the original file were store1-us-products-feed.txt
then two files would exist -- the original and a newly registered split file
with its own unique name; a feed should be registered only once.
(6) items are tracked by id
id values must be unique per item across all feeds and all target-countries --
a best-practice is to use a combination of numbers and letters for id.
an id value should never be changed once
assigned to a physical inventory item --
any missing item (id) from a feed will
trigger a delete.
if a data-feed file is split, any item (id) now missing from the original feed
will be deleted -- until the second feed file is submitted with that item id;
the first time the split files are uploaded, the entire process may take 72 hours
or more before all the deletes and re-inserts are completed.
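a quick sanity check along these lines can help before uploading -- a rough sketch, not an official tool, assuming tab-delimited feed files with an "id" column; it verifies that the split parts together contain every id from the original feed (so nothing goes missing and triggers a delete) and that no id appears in more than one part:

```python
# Sketch: check split feed parts against the original feed.
# Assumes tab-delimited files whose header includes an "id" column.
import csv

def feed_ids(path):
    # collect the set of id values in one feed file
    with open(path, newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f, delimiter="\t")
        return {row["id"] for row in reader}

def check_split(original, parts):
    combined = set()
    for part in parts:
        ids = feed_ids(part)
        dupes = combined & ids
        if dupes:  # ids must be unique across all feed files
            raise ValueError(f"duplicate ids across parts: {dupes}")
        combined |= ids
    missing = feed_ids(original) - combined
    if missing:  # a missing id would be treated as a delete
        raise ValueError(f"ids missing from split parts: {missing}")
```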
generally, be certain to pause any automated feed re-uploads --
until after all the (initially split) files are fully
processed and all items have been given a final (searchable) status.
subsequent re-uploads should simply be processed in-place, minus
the deletes and re-inserts -- assuming that id values never change.
otherwise, very large inventory may require using the
content-api to submit items, depending on the details --
rather than any feed-files.
importantly, be certain that all items submitted as in stock
are physically in your on-hand physical inventory; especially
with respect to very large product-feeds.