Web Scraping General

Can we get a web automation thread going?

I never really see this discussed on Jow Forums, but I am sure there are plenty of people who are also interested in this type of programming.

I enjoy using Python and Selenium to build web scrapers, account automation, and stuff like that. What kinds of things is Jow Forums automating?

Other urls found in this thread:

github.com/zalando/zalenium
twitter.com/SFWRedditGifs

Seriously? All there is to it is HTML parsers and regular expressions.
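
For what it's worth, here is a minimal sketch of that "HTML parser plus regex" approach using only the Python standard library. The sample markup, the `chapter-N` URL pattern, and the names `LinkCollector` and `latest_chapter` are all made up for illustration; a real site will need its own selectors and pattern.

```python
import re
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects (href, link text) pairs for every <a href=...> tag."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only keep text that appears inside an open <a> tag.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


# Hypothetical page content, invented for this example.
SAMPLE_HTML = """
<ul>
  <li><a href="/series/chapter-12">Chapter 12</a></li>
  <li><a href="/series/chapter-13">Chapter 13</a></li>
  <li><a href="/about">About</a></li>
</ul>
"""

# Assumed URL scheme: chapter links end in "chapter-<number>".
CHAPTER_RE = re.compile(r"chapter-(\d+)$")


def latest_chapter(html):
    """Return the highest chapter number linked in the page, or None."""
    parser = LinkCollector()
    parser.feed(html)
    numbers = []
    for href, _text in parser.links:
        match = CHAPTER_RE.search(href)
        if match:
            numbers.append(int(match.group(1)))
    return max(numbers) if numbers else None
```

The parser handles the document structure and the regex is only applied to the extracted `href` values, which tends to be less fragile than running a regex over raw HTML. Pages that build their content with JavaScript won't work this way, and that is where something like Selenium comes in.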

If you teach me how I'll give you 3 cisco courses of content for free

I scrape latest manga chapters because I like using comicrack

If you know a bit of Java and CSS selectors, you're there with jsoup.

Do you know how to run Selenium in Docker containers? I want to break the few web scrapers I have out into containers to make the logging cleaner.

Can I ask why you guys bother? Is there some profit in this? Are you scraping for content that interests you personally? What's the deal?

Personally, I've been writing my own crawler to self-host a search engine. All part of my quest to de-Google myself. I don't technically scrape, I just crawl, scan for keywords in the doc, and index them.
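
A rough sketch of that crawl, scan, index loop, standard library only. This is not the poster's actual crawler: the `fetch` callback, the `FAKE_SITE` dictionary, and the `example.test` URLs are assumptions standing in for real HTTP requests, and "keywords" here just means every lowercase alphanumeric token in the page text.

```python
import re
from collections import defaultdict, deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class PageParser(HTMLParser):
    """Gathers outgoing link targets and all text nodes from one page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []
        self.text_parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)

    def handle_data(self, data):
        self.text_parts.append(data)


def crawl_and_index(start_url, fetch, max_pages=50):
    """Breadth-first crawl; returns a dict mapping word -> set of URLs.

    `fetch` is any callable that takes a URL and returns HTML or None.
    """
    index = defaultdict(set)
    seen = {start_url}
    queue = deque([start_url])
    fetched = 0
    while queue and fetched < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        fetched += 1
        parser = PageParser()
        parser.feed(html)
        # Scan: every alphanumeric token on the page counts as a keyword.
        text = " ".join(parser.text_parts).lower()
        for word in re.findall(r"[a-z0-9]+", text):
            index[word].add(url)
        # Crawl: resolve relative links and queue the ones not seen yet.
        for href in parser.hrefs:
            link = urljoin(url, href)
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return index


# Hypothetical two-page site used instead of real network access.
FAKE_SITE = {
    "http://example.test/": '<a href="/a">next</a> <p>selenium tips</p>',
    "http://example.test/a": '<a href="/">home</a> <p>docker and selenium</p>',
}

index = crawl_and_index("http://example.test/", FAKE_SITE.get)
```

Looking up `index["selenium"]` then gives the set of pages that mention the word. For real use, `fetch` would wrap an HTTP client, and you would want to honor robots.txt, rate-limit requests, and skip text inside script and style tags, none of which this sketch does.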

So I'm a student with a netacad account. I want to be able to review my course material 5 years from now, but I don't have faith I'll have perpetual access to the content. If I scrape the course content, it's like I own 3 Cisco textbooks I can keep and share. I downloaded a website scraper program, but it didn't retrieve any of the actual course content.

>scrape content
>use content to train algorithm
>???