Tell me your scraping stories, Jow Forums

Tell me how your data is.

Attached: Untitled.png (1920x1080, 276K)

Your mom scraped my ass with her tongue

You wanna get anally vored again?

how does archive.org get away with so much theft?

kek kek

>playstation
>roms

I'm guessing they can claim fair use?

>theft
>not abandonware

not him but lol i draw lots of porn of this and people thought it was sexy and gave me attention for it but then all my work got purged from eka's portal because it was underage :(

An ISO file is just a disc image, and that's read-only memory.
A bin file is still read-only memory.

nice.
Post some censored ones. This is a blue board, after all.

My question is how some people got the Redump archives. Like, isn't that project manned by a small group of people who don't give out the archives?

>using chromedriver for something you could easily just parse from the html
b l o a t w a r e
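For anyone wondering what "just parse it from the html" looks like without a browser driver, here's a rough stdlib-only sketch. The `extract_links` helper, the `.zip`/`.7z` suffixes, and the example.org URL are all made up for illustration, and it assumes the links sit in plain `<a href>` tags rather than being built by JavaScript.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects href values from <a> tags, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Turn relative links into absolute ones.
                self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url, suffixes=(".zip", ".7z")):
    """Return absolute URLs of links whose path ends in one of `suffixes`.

    The suffix list is an assumption; adjust it to whatever the site serves.
    """
    parser = LinkCollector(base_url)
    parser.feed(html)
    return [url for url in parser.links if url.lower().endswith(suffixes)]
```

Feed it the page source you already fetched and you get a list of download URLs back, no chromedriver involved.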

>Need login credentials in order to download the files.
>Even if I did parse the html (which I do to get the links), I would not be able to download from each link due to auth.

Why don't you do it, if you're so good?

Share py script?

Scraped Cvent a while back. Must have sent too many requests at one go, and they changed some settings on their site.
Also tip: run torify to not get IP banned.
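If the problem is too many requests at once, backing off between retries might help before reaching for Tor. A minimal sketch, assuming the site signals overload with HTTP 429 or 503 (that is a guess, the actual site may behave differently); `fetch_with_backoff` and its parameters are hypothetical names.

```python
import time
import urllib.error
import urllib.request


def fetch_with_backoff(url, opener=urllib.request.urlopen, retries=4,
                       base_delay=1.0, sleep=time.sleep):
    """Fetch `url`, doubling the wait after each 429/503 response.

    `opener` and `sleep` are injectable so the logic can be tried offline.
    """
    for attempt in range(retries):
        try:
            with opener(url) as resp:
                return resp.read()
        except urllib.error.HTTPError as err:
            # Give up on other errors, or once the retries are used up.
            if err.code not in (429, 503) or attempt == retries - 1:
                raise
            sleep(base_delay * 2 ** attempt)
```

With the defaults it waits 1, 2, then 4 seconds before the final attempt. Adding a fixed pause between successful requests as well would be gentler still.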

>due to auth
wtf that's literally just adding a header
this is a very autistic way of mass downloading files
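To show what "just adding a header" could mean in practice, here's a hedged sketch using only the standard library. Whether the site wants a session cookie or a bearer token is an assumption (you'd copy the real value out of your logged-in browser), and `build_authed_request`, `download`, and the User-Agent string are hypothetical.

```python
import shutil
import urllib.request


def build_authed_request(url, cookie=None, token=None):
    """Build a request carrying whichever credential the site expects."""
    headers = {"User-Agent": "rom-fetcher/0.1"}  # made-up client name
    if cookie:
        headers["Cookie"] = cookie
    if token:
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(url, headers=headers)


def download(url, dest_path, **auth):
    """Stream the response body to `dest_path` without loading it all in RAM."""
    req = build_authed_request(url, **auth)
    with urllib.request.urlopen(req) as resp, open(dest_path, "wb") as out:
        shutil.copyfileobj(resp, out)
```

Something like `download(link, "game.zip", cookie="session=...")` per parsed link would then replace the whole browser automation, assuming the cookie is all the server checks.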

>writing a python download script when you could just use the provided torrent

you know you could do this using downthemall on an older version of firefox? you wouldn't learn web scraping, but it would be faster/easier.

anyone tried scraping with c?

Archive.org creates torrents for everything and they add web seeds to them. I think torrenting would be more efficient.

Archive.org is considered a library in California, but they will delete content like new games or new movies if the holder asks nicely.

Don't forget to donate.

Huh. Well, it's too late now.

Torrent wasn't working for some reason when I was using it.

why scrape when you can have others host for free? why not just store the search terms/links to the content instead?

>Why scrape when you can have others host for free?
Time. The amount of time it has taken me to download roms when I want to play them has been inordinate. I'd rather have all of them downloaded and at my disposal when I want to play them. I'm just frontloading the downloading.
The second reason is that I'd like to curate a database, and I think a rom database would be an awesome first project for that. Just something hosted on a computer that I can pull from and store all the data on.
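A first pass at that database could be as small as one SQLite table. This is only a sketch: the `roms` table, its columns, and the helper names are all invented here, and storing `source_url` alongside `local_path` also covers the "just store the links" idea from the post above.

```python
import sqlite3


def open_catalog(path=":memory:"):
    """Open (or create) the catalog; the schema is a hypothetical starting point."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS roms (
               id INTEGER PRIMARY KEY,
               title TEXT NOT NULL,
               platform TEXT NOT NULL,
               source_url TEXT UNIQUE,
               local_path TEXT
           )"""
    )
    return conn


def add_rom(conn, title, platform, source_url, local_path=None):
    """Insert one entry; a repeated source_url is silently skipped."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR IGNORE INTO roms (title, platform, source_url, local_path) "
            "VALUES (?, ?, ?, ?)",
            (title, platform, source_url, local_path),
        )


def find_roms(conn, platform):
    """Return (title, source_url, local_path) rows for a platform, by title."""
    rows = conn.execute(
        "SELECT title, source_url, local_path FROM roms "
        "WHERE platform = ? ORDER BY title",
        (platform,),
    )
    return rows.fetchall()
```

Pass a file path instead of `:memory:` to keep the catalog between runs.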

I was practicing Python so I wrote a scraper that scraped every tutorialspoint PDF. Was a good learning experience.

do you still have them?
asking for myself.

yes, playstation uses CD-ROMs to store games