Archive.moe (late 2014 threads lost)

Using wget, we just scrape the images off the server. It's not elegant, but it works, and thankfully the admin has provided some image lists. This will take about a month at least, and that's assuming we're scraping in parallel.

First, grab the image list for a board (change the board name in the URL to view another list):

`wget $board/image/to_be_removed_in_order.txt`

Then rewrite each relative path in the list into a full per-board image URL:

`sed -e "s|^\./|$board/image/|g" -i to_be_removed_in_order.txt`

Finally, feed the list back to wget: run in the background (`-b`), retry up to 10 times (`--tries=10`), skip files we already have (`-nc`), resume partial downloads (`-c`), identify ourselves, and wait 1 second between requests (`-w 1`):

`wget -b --tries=10 -nc -c -i to_be_removed_in_order.txt --user-agent="Bibliotheca Anonoma Website Archiver/1.1 (+)" -w 1`

I also made rough estimates for scraping time, procedurally calculated from the number of images in each list. These were intentionally overestimated to ensure that my VPS actually had enough space and time, but the time estimates turned out to be off by a factor of 9: it only took 15 hours to scrape 5 GB of data (from /s4s/), not 6 days.
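The estimate arithmetic is just the list length multiplied by per-image figures. Here is a minimal sketch of that calculation, assuming an average image size of roughly 350 KB and about 2 seconds per image (one second of transfer plus the one-second `-w` wait); both numbers are my assumptions, not figures from the original estimates:

```bash
# Rough estimate of disk and time from the image list alone.
# AVG_BYTES and SECS_PER_IMAGE are assumed figures, not measured ones.
count=$(wc -l < to_be_removed_in_order.txt)
AVG_BYTES=350000        # assumed ~350 KB per image
SECS_PER_IMAGE=2        # assumed: ~1 s download + the 1 s -w wait
echo "images:      $count"
echo "disk  (GB):  $(( count * AVG_BYTES / 1000000000 ))"
echo "time (days): $(( count * SECS_PER_IMAGE / 86400 ))"
```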
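Since the month-long figure already assumes scraping boards in parallel, here is a sketch of how the three commands might be run per board, each in its own backgrounded subshell. `BASE_URL` and the board names other than /s4s/ are placeholders (the post's actual links did not survive), so substitute the archive's real host:

```bash
#!/usr/bin/env bash
# Parallel per-board scrape: one backgrounded subshell per board.
# BASE_URL is a placeholder; the real host was elided in the post.
BASE_URL="https://archive.example"   # hypothetical host
BOARDS="s4s a co"                    # s4s is from the post; the rest are examples

for board in $BOARDS; do
  (
    mkdir -p "$board" && cd "$board" || exit 1
    # Fetch the admin-provided image list for this board.
    wget "$BASE_URL/$board/image/to_be_removed_in_order.txt"
    # Turn each relative "./..." path into a full image URL.
    sed -e "s|^\./|$BASE_URL/$board/image/|g" -i to_be_removed_in_order.txt
    # Batch download: retry, no-clobber, resume, polite 1 s wait.
    # (-b is dropped here because the subshell itself is backgrounded.)
    wget --tries=10 -nc -c -i to_be_removed_in_order.txt \
         --user-agent="Bibliotheca Anonoma Website Archiver/1.1 (+)" -w 1
  ) &
done
wait   # block until every board finishes
```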