Tumblr NSFW Upload thread

Poll results: halp

I'll do my part and help
53.11% 777 votes
I just want to sit back and wank
46.89% 686 votes

Poll ended with 1463 votes.

Xaxu-Slyph
My Little Pony - 1992 Edition
Wallet After Summer Sale -
Not a Llama - Happy April Fools Day!
Artist -

Joltin' Jojo
@CMC Scootaloo
It grabs just about everything that the Wayback Machine grabs, but mainly as HTML files, which is fine. They are a bit annoying to parse, as you have to bring them up one by one, but it is manageable. The only script I know of runs on Ruby. It will get whatever you want from the Wayback Machine. I have only had exactly ONE blog that it refuses to grab, and that is Femmegasm. For anyone familiar with it, it's by the same artist as Ask Meanie Belle. If anyone has an archive of it or knows where to download it, I would appreciate it. Here is the script: https://github.com/hartator/wayback-machine-downloader
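For going through the downloaded HTML files without opening them one by one, something like the following might help. It is only a rough Python sketch and not part of the Ruby script above: the names `ImageLinkParser`, `image_urls_in_html` and `image_urls_in_folder` are made up for illustration, and it assumes the pages were saved as `.html` files somewhere under one folder.

```python
from html.parser import HTMLParser
from pathlib import Path


class ImageLinkParser(HTMLParser):
    """Collects the src attribute of every <img> tag in a page."""

    def __init__(self):
        super().__init__()
        self.image_urls = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value is None for bare attributes.
        if tag == "img":
            for name, value in attrs:
                if name == "src" and value:
                    self.image_urls.append(value)


def image_urls_in_html(html_text):
    """Returns the image URLs found in one HTML document, in page order."""
    parser = ImageLinkParser()
    parser.feed(html_text)
    parser.close()
    return parser.image_urls


def image_urls_in_folder(folder):
    """Maps each .html file under `folder` (hypothetical layout) to its image URLs."""
    results = {}
    for path in sorted(Path(folder).rglob("*.html")):
        text = path.read_text(encoding="utf-8", errors="replace")
        results[str(path)] = image_urls_in_html(text)
    return results
```

Calling `image_urls_in_folder("websites/some-blog")` (the path is just a placeholder) would give one list of image links per saved page, which can then be checked or fed to a downloader.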

Twrk mentioned they might be able to rewrite it in something else.

@twkr
CMC Scootaloo
Duck - Common sense 'n stuff
Wallet After Summer Sale -
Magical Inkwell - Wrote MLP fanfiction consisting of at least around 1.5k words, and has a verified link to the platform of their choice
Not a Llama - Happy April Fools Day!
Artist -

Scootaloo Fanclub Member
@Xaxu-Slyph

I see. I was talking about a tool/script that does the job of putting sites on the Wayback Machine for you, though. It would work the same way as pasting the URL into it and clicking "Save Page", except that the script does it in an automated way.
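A minimal Python sketch of that idea could look like this. It assumes that requesting `https://web.archive.org/save/` followed by the page address triggers the same capture as the "Save Page" button; the function names, the delay between requests and the example addresses are all made up for illustration, and the service may throttle or refuse requests.

```python
import time
import urllib.request

# Assumption: this prefix asks the Wayback Machine to capture the URL that follows it.
SAVE_ENDPOINT = "https://web.archive.org/save/"


def build_save_url(page_url):
    """Returns the address that requests a capture of page_url."""
    return SAVE_ENDPOINT + page_url


def save_pages(page_urls, delay_seconds=10, fetch=urllib.request.urlopen):
    """Requests a capture for each URL and records the HTTP status (or the error text)."""
    statuses = {}
    for index, page_url in enumerate(page_urls):
        if index > 0:
            time.sleep(delay_seconds)  # pause between requests to stay polite
        try:
            with fetch(build_save_url(page_url)) as response:
                statuses[page_url] = response.status
        except OSError as error:  # urllib's URLError and HTTPError are OSError subclasses
            statuses[page_url] = str(error)
    return statuses
```

Usage would be something like `save_pages(["https://example-blog.tumblr.com/"])`, with a list of blog or post addresses read from a text file. The `fetch` parameter is only there so the function can be tried out without touching the network.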
Raiden Gekkou
Perfect Pony Plot Provider - Uploader of 10+ images with 350 upvotes or more (Questionable/Explicit)

@The Smiling Pony
@Raiden Gekkou
They aren't deleting anything yet. What they changed was fixing yet another hole in their system where other people (i.e. not the blog owner) could see flagged posts.

RIP that dream. I did find out that you can reblog some posts from Google's cache. Here's (NSFW) an example of a post that's deleted but can still be reblogged from the web cache. Quick reblogging (alt+left click) doesn't work, though.
twkr

@Xaxu-Slyph
I did, and I still stand by my proposal to do that, but it won't be of any real benefit: the Wayback Machine is very slow, so there won't be any noticeable performance difference, if any at all, and there's no other reason to rewrite it. And please, don't butcher my nickname in the future…
Xaxu-Slyph

Joltin' Jojo
@twkr
I sincerely apologise for that. I don't have dyslexia, per se, but I do jumble letters occasionally. Probably my mind trying to make something where there is nothing. May I ask, if you don't mind, what the letters stand for? You are, of course, free to tell me or not. Just curious. I suppose others have made this mistake as well? (I also happen to have a name IRL that gets misheard on the phone a lot, so I can relate.)

As for the rewrite, I don't personally need it. What I have to work with does just fine for me. I do agree about how slow it can be at times. It's weird how it seems to speed up or slow down on my end of things. I would also like to thank you for all you've done for this endeavor.

Might I further ask anyone what the torrent folders are exactly? I don't have it in front of me at the moment, but I remember the smaller ones containing only .json files? I couldn't originally open the torrent file itself in uTorrent, as I would get an error. I switched to, I think, BitTorrent… which crashes if you do much of anything. I remember it being mentioned that this is partly due to the size of the torrent file itself?

Sorry for the long message. I had some inquiries that I was going to bring up anyway.
twkr

So, I think my Tumblr dumping queue just finished downloading, which should be a good thing considering I've fed the whole stage2 list to the dumper. Now I only need to find out how good or bad the stuff I have is and how I can compress or store it efficiently (>IMPLYING not on my server). I thought about using S3 at first, but it costs a bloody arm and a leg to use, FML.

All Tumblr dumps combined weigh 4144GB…

@Xaxu-Slyph
My nickname reads TWeaKeR and yeah, after 2010 people tend to misspell it very frequently. As for the archive.org dumper's speed fluctuations, the cause can be almost anything, ranging from flaky bandwidth on the archive.org servers to an improper implementation of concurrency in the dumper itself. I skimmed through the code and it seems fine at first glance, but that doesn't mean it's completely error-free.
Lisboeta
Not a Llama - Happy April Fools Day!

@The Smiling Pony
Pardon me if this is a stupid question, but when exactly did they patch it? I archived quite a few things before the deadline, but then saw that access was still possible, so I kinda relaxed and dropped the archiving.
imsesaok

I'm quite late to the party, but better late than never, I guess.

I only backed up a handful of blogs (flashytheflashpone, lethanvas, milkmare-of-trottingham, nastylittlepest, rarecandy19, sosoftandtender, xrandomxnessx), but it ain't small (12GB). Download link for the torrent is here.

I'm in a bad position because I have to run my laptop to seed. I'll keep it up as long as I can, but if there is no activity by Saturday, I won't be seeding for the following 3 days while I'm away for the lunar new year.

EDIT: Somebody's been downloading for a while and is already at 10%. I'll keep the thing running and hopefully one or more people will get a full copy. Thanks!
twkr

So, I've been slowly working through my 4TB dump and found one very interesting thing. Not sure if it's common knowledge already, but meh. The blog meta object contains a "header_image" field. This is the link to a raw image that was uploaded by the blog owner to be used as the blog's header image. I just checked multiple blogs manually and found at least one where the header image is a raw version of art that was also posted and uploaded here.
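For anyone who wants to pull that field out of their own dumps, here is a rough Python sketch. The only thing taken from the dump is the field name "header_image"; where exactly it sits inside the JSON is an assumption I'm avoiding by searching the whole parsed object, and the function names are made up for illustration.

```python
import json


def find_header_images(node):
    """Recursively collects every "header_image" string in parsed JSON data."""
    found = []
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "header_image" and isinstance(value, str):
                found.append(value)
            else:
                # Keep searching nested objects and lists for the field.
                found.extend(find_header_images(value))
    elif isinstance(node, list):
        for item in node:
            found.extend(find_header_images(item))
    return found


def header_images_in_file(path):
    """Loads one JSON file from a dump and returns the header image links in it."""
    with open(path, encoding="utf-8") as handle:
        return find_header_images(json.load(handle))
```

Running `header_images_in_file` over each blog's metadata file (whatever it is called in a given dump) should give a list of header links to check against the art already uploaded.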

Manual checking using the DOG technique revealed that the header image is not an upscale, though I doubt my own eyes as of late. If someone has a better method of upscale detection, I'd be glad to have this "raw" header image checked. You can find both images here.
ElectricGears

I guess it would be a good time to ask what the end-game is here. I have about 2600 blogs totaling about 9TB captured with TumblThree. Obviously I'm not going to be able to publicly host this or manually upload everything. Plus I ended up with a ton of non-pony furry stuff. I'm wondering what everyone else's plans are for the project. It might make sense to have some kind of special tag (tumblrSuicide?) for these uploads in case there is a need to sort them out in the future.
CMC Scootaloo

Scootaloo Fanclub Member
@ElectricGears

I am still in this, too, even though I have been quiet lately. There are still a good number of Tumblrs I want to save, because the new automated system might still axe an inactive blog that isn't against the rules, if it hasn't already.
I will upload all I have, too, I just want to collect a bit more. And the end game is pretty much that: uploading it all to Derpibooru. There is little use to the archiving efforts if the archived material stays hidden on the HDs of the archivists and no one other than them can access it.
That you can't upload it all is not as obvious as you think. You would drop dead rather soon if you tried to upload all of this at once, but that isn't necessary. Set aside a certain amount of time each day and upload as much as you can during that time. That is going to take a while, but eventually, the pile will all be up here.

Other than that, and I asked this before, is it possible to re-create an entire Tumblr blog, with design, layout and posts? TumblThree downloads a lot more than just the pictures themselves and I guess all these files have to be good for something.