edit: an anon on 4chan /mlp/ is working on getting the data we need. Go there if you care.
Pastebin is deleting a bunch of pastes. A cursory search shows some pastebin.com URLs here on Derpibooru. It is impossible to tell from a basic search whether these Pastebin URLs have also been posted elsewhere (e.g. 4chan) or whether they are content unique to Derpibooru that needs to be preserved.
The kind folks at ArchiveTeam are willing to archive URLs - but they need a list of URLs to grab.
Is it possible to query the Derpi database(s) for pastebin-y URLs? Something like a wildcard match on pastebin.com (in SQL, that’d be LIKE '%pastebin.com%' or even LIKE '%pastebin%').
For posterity, the output should probably be something like
comment-id,comment text
but I don’t know how your DB is structured or if this is possible at all. I’m not a Derpi expert, but here are the areas to search that I can think of:
- comments
- uploader comment
- source URL
- forum posts
Of course, we could just run most of these queries independently of site admins, against the public DB exports, and parse the results ourselves. I guess this post can serve a second purpose - is there anywhere on Derpi that could contain a Pastebin URL that isn’t in the public DB exports?
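For anyone going the parse-it-ourselves route, here is a rough sketch of what the filtering step could look like. It assumes the rows have already been loaded as (id, text) pairs; I don't know the actual export schema, so the function names, the sample rows, and the URL pattern (including the guess that paste IDs are alphanumeric) are all hypothetical.

```python
import csv
import io
import re

# Assumption: paste IDs are alphanumeric; also matches links without a scheme
# and the /raw/ form. Adjust if real pastes use other URL shapes.
PASTEBIN_RE = re.compile(
    r"(?:https?://)?(?:www\.)?pastebin\.com/(?:raw/)?[A-Za-z0-9]+",
    re.IGNORECASE,
)


def find_pastebin_rows(rows):
    """Yield (id, text) pairs whose text contains at least one Pastebin URL."""
    for row_id, text in rows:
        if text and PASTEBIN_RE.search(text):
            yield row_id, text


def extract_urls(rows):
    """Return a sorted, de-duplicated list of Pastebin URLs found in the rows."""
    urls = set()
    for _, text in rows:
        urls.update(m.group(0) for m in PASTEBIN_RE.finditer(text or ""))
    return sorted(urls)


def to_csv(rows):
    """Render rows in the 'comment-id,comment text' format as a CSV string."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(["comment-id", "comment text"])
    writer.writerows(rows)
    return buf.getvalue()


# Hypothetical sample rows standing in for comments, uploader comments,
# source URLs, or forum posts.
sample = [
    (1, "see https://pastebin.com/AbCd1234 for the story"),
    (2, "no links here"),
    (3, "mirror: pastebin.com/raw/XyZ98765, thanks"),
]

hits = list(find_pastebin_rows(sample))
print(to_csv(hits))
print("\n".join(extract_urls(hits)))
```

The CSV keeps the full text for posterity, while the de-duplicated URL list is the part that would presumably be handed to ArchiveTeam.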
Thanks for any help.