Viewing last 25 versions of post by twkr in topic Tumblr NSFW Upload thread

twkr

So, I think my Tumblr dumping queue just finished downloading, which is supposed to be a good thing considering I've fed the whole stage2 list to the dumper. Now I only need to find out how good or bad the stuff I have is and how I can compress or store it efficiently (@>IMPLYING@ not on my server). Thought about using S3 at first, but it costs a bloody arm and a leg to use, FML.

All Tumblr dumps combined weigh 4144GB...


"@Xaxu-Slyph":/uppers/tumblr-nsfw-upload-thread/post/4167484#post_4167484
My nickname reads TWeaKeR and yeah, after 2010 people tend to misspell it very frequently. As for the archive.org dumper's speed fluctuations, the cause could be almost anything, from flaky bandwidth on the archive.org servers to an improper implementation of concurrency in the dumper itself. I skimmed through the code and it seems fine at first glance, but that doesn't mean it's completely error-free.
Reason: Cleaned excessive whitespace
Edited by twkr
twkr

So, I think my Tumblr dumping queue just finished downloading, which is supposed to be a good thing considering I've fed the whole stage2 list to the dumper. Now I only need to find out how good or bad the stuff I have is and how I can compress or store it efficiently (@>IMPLYING@ not on my server). Thought about using S3 at first, but it costs a bloody arm and a leg to use, FML.

All Tumblr dumps combined weigh 4144GB...


"@Xaxu-Slyph":/uppers/tumblr-nsfw-upload-thread/post/4167484#post_4167484
My nickname reads TWeaKeR and yeah, after 2010 people tend to misspell it very frequently. As for the archive.org dumper's speed fluctuations, the cause could be almost anything, from flaky bandwidth on the archive.org servers to an improper implementation of concurrency in the dumper itself. I skimmed through the code and it seems fine at first glance, but that doesn't mean it's completely error-free.
No reason given
Edited by twkr
twkr

So, I think my Tumblr dumping queue just finished downloading, which is supposed to be a good thing considering I've fed the whole stage2 list to the dumper. Now I only need to find out how good or bad the stuff I have is and how I can compress or store it efficiently (@>IMPLYING@ not on my server). Thought about using S3 at first, but it costs a bloody arm and a leg to use, FML.

Started counting storage used by tumblr dumps. If I were to take a guess, it'd be in the 4TB range.


"@Xaxu-Slyph":/uppers/tumblr-nsfw-upload-thread/post/4167484#post_4167484
My nickname reads TWeaKeR and yeah, after 2010 people tend to misspell it very frequently. As for the archive.org dumper's speed fluctuations, the cause could be almost anything, from flaky bandwidth on the archive.org servers to an improper implementation of concurrency in the dumper itself. I skimmed through the code and it seems fine at first glance, but that doesn't mean it's completely error-free.
No reason given
Edited by twkr