If you attempt to store a file larger than 5GB (5,368,709,120 bytes, to be exact), Amazon S3 will generate the following error and fail to store the file:
<Error>
  <Code>EntityTooLarge</Code>
  <Message>Your proposed upload exceeds the maximum allowed object size</Message>
  <ProposedSize>6091682399</ProposedSize>
  <MaxSizeAllowed>5368709120</MaxSizeAllowed>
</Error>
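If you want to catch this condition before attempting the upload, a quick size check from the shell does the trick. A minimal sketch, assuming the file is named outfile.tmp:
# [ $(stat -c %s outfile.tmp) -gt 5368709120 ] && echo "too large for a single S3 upload"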
One way around this limitation is to use the GNU/Linux split command to divide the file into several smaller chunks, which are then stored in Amazon S3 individually. It’s important to preserve the order of the chunks so the original file can be reassembled later, and the good news is that split does exactly that by adding an ordered suffix to each chunk. Here’s the split command in action. In this example, we’re splitting the 2.3GB file outfile.tmp into chunks of 1GB each:
# du -sh outfile.tmp
# split -a 1 -b 1073741824 outfile.tmp outfile.tmp.
# ls -alh outfile.tmp.*
-rw-r--r-- 1 root root 1.0G Apr 11 09:51 outfile.tmp.a
-rw-r--r-- 1 root root 1.0G Apr 11 09:51 outfile.tmp.b
-rw-r--r-- 1 root root 274M Apr 11 09:51 outfile.tmp.c
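Each chunk is now under the 5GB limit and can be stored in S3 individually. As a rough sketch, assuming the AWS command line tool is installed and using a hypothetical bucket named my-bucket (substitute your own), a simple loop uploads them all:
# for chunk in outfile.tmp.*; do aws s3 cp "$chunk" s3://my-bucket/; done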
As the listing above shows, split appends alphabetical suffixes to the chunk file names, preserving their order. (With -a 1 the suffix is a single letter, which limits you to 26 chunks; increase the value for larger files.) The ordered suffixes come in handy when reassembling the file. Here’s a quick proof of concept following our earlier example:
# md5sum outfile.tmp
# cat outfile.tmp.a outfile.tmp.b outfile.tmp.c > reassembled.tmp
# md5sum reassembled.tmp
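In practice the chunks will be coming back from S3 rather than sitting on local disk. Again assuming the AWS command line tool and the hypothetical my-bucket bucket, each chunk can be streamed to standard output and concatenated in a single pass:
# for s in a b c; do aws s3 cp s3://my-bucket/outfile.tmp.$s -; done > reassembled.tmp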
All as planned: the two checksums match, so the reassembled file is byte-for-byte identical to the original. That’s all folks.