How do you use regex to parse html?

Question

How do you use regex to parse html?

Daniel Sanders

Attached: regex_examples.png (700x496, 146K)

April 22, 2018 - 00:46

Owen Richardson

>How do you use regex to parse html?
don't do that

April 22, 2018 - 00:47

Jayden Moore

H̸̡̪̯ͨ͊̽̾̎Ȩ̬̩̾͛ͪ̈́̀́͘ ̶̧̨̱̹̭̯ͧ̾ͬC̷̙̝͖ͭ̏ͮ͟O̮̪̝͍ͮM̖͊̒ͪͩͬ̚̚͜Ȇ̴̟̟͙̞ͩ͌͝S̨̥̫͎̭ͯ̿̔̀ͅ

April 22, 2018 - 00:49

Brayden Young

stackoverflow.com/a/1732454/2378146

April 22, 2018 - 00:50

Levi Watson

If you absolutely need to parse HTML, use something like goquery:
github.com/PuerkitoBio/goquery
Example:
package main

import (
	"fmt"
	"log"
	"net/http"

	"github.com/PuerkitoBio/goquery"
)

func ExampleScrape() {
	// Request the HTML page. http.Get needs a full URL, scheme included.
	res, err := http.Get("http://metalsucks.net")
	if err != nil {
		log.Fatal(err)
	}
	defer res.Body.Close()
	if res.StatusCode != 200 {
		log.Fatalf("status code error: %d %s", res.StatusCode, res.Status)
	}

	// Load the HTML document
	doc, err := goquery.NewDocumentFromReader(res.Body)
	if err != nil {
		log.Fatal(err)
	}

	// Find the review items
	doc.Find(".sidebar-reviews article .content-block").Each(func(i int, s *goquery.Selection) {
		// For each item found, get the band and title
		band := s.Find("a").Text()
		title := s.Find("i").Text()
		fmt.Printf("Review %d: %s - %s\n", i, band, title)
	})
}

func main() {
	ExampleScrape()
}

April 22, 2018 - 00:50

Charles Brown

haha le EPIC zalgo meme, upvoted my r/stackoverflow friends :^)

April 22, 2018 - 00:50

Connor Fisher

Writing an HTML parser (and validator) is a great exercise for anyone trying to learn programming. No regex required.
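As a rough idea of where such an exercise might start, here is a minimal hand-written tokenizer sketch in Go. The `tokenize` function is a hypothetical helper, not part of any library, and it assumes very simple input: it ignores comments, `<script>` content, and `>` characters inside quoted attribute values, all of which a real HTML parser has to handle.

```go
package main

import (
	"fmt"
	"strings"
)

// tokenize splits input into tag tokens ("<...>") and text tokens.
// It is a toy: it knows nothing about comments, scripts, or quoting.
func tokenize(s string) []string {
	var tokens []string
	for len(s) > 0 {
		if s[0] == '<' {
			end := strings.IndexByte(s, '>')
			if end < 0 {
				// Unterminated tag: treat the remainder as text.
				tokens = append(tokens, s)
				break
			}
			tokens = append(tokens, s[:end+1])
			s = s[end+1:]
			continue
		}
		// Text runs until the next '<' or the end of input.
		next := strings.IndexByte(s, '<')
		if next < 0 {
			next = len(s)
		}
		tokens = append(tokens, s[:next])
		s = s[next:]
	}
	return tokens
}

func main() {
	for _, tok := range tokenize("<p>hi <b>there</b></p>") {
		fmt.Printf("%q\n", tok)
	}
}
```

Turning these tokens into a tree, and recovering from malformed input, is where most of the real work would be.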

April 22, 2018 - 00:51

Zachary Sanders

>The <center> cannot hold
gets me every time

April 22, 2018 - 00:53

Oliver Smith

Just use an HTML parser. Regex sucks for this task.
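To illustrate one way regex goes wrong here, the sketch below runs a naive pattern over nested tags using Go's standard `regexp` package. The pattern and the `divContents` helper are made up for this example, and the input is deliberately tiny.

```go
package main

import (
	"fmt"
	"regexp"
)

// divPattern is a naive, non-greedy attempt to capture a <div>'s contents.
var divPattern = regexp.MustCompile(`<div>(.*?)</div>`)

// divContents returns what the regex believes the first <div> contains,
// or "" if nothing matches.
func divContents(html string) string {
	m := divPattern.FindStringSubmatch(html)
	if m == nil {
		return ""
	}
	return m[1]
}

func main() {
	// The outer <div> really contains: a<div>b</div>c
	fmt.Println(divContents(`<div>a<div>b</div>c</div>`)) // prints: a<div>b
}
```

The match stops at the first `</div>` it sees, which belongs to the inner element. Making the pattern greedy only moves the problem: it would then run to the last `</div>` on the line, even across sibling elements.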

April 22, 2018 - 01:27

Carson Rogers

Good luck with that - the spec is ridiculously complex.
HTML is not XML, it's a really fault-tolerant language and that's what makes it so difficult to properly parse.

Look at it: html.spec.whatwg.org/multipage/parsing.html#parsing Do you think that is "a great exercise for anyone trying to learn programming"? That's a great exercise for a fucking whole team of experienced programmers.
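A small way to see the "HTML is not XML" point: feed ordinary HTML to a strict XML parser. The sketch below uses Go's standard `encoding/xml` decoder in its default strict mode; `strictXMLError` is a hypothetical helper written for this example.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"io"
	"strings"
)

// strictXMLError reads every token from input with a strict XML decoder
// and returns the first error, or nil if the input is well-formed XML.
func strictXMLError(input string) error {
	dec := xml.NewDecoder(strings.NewReader(input))
	for {
		_, err := dec.Token()
		if err == io.EOF {
			return nil
		}
		if err != nil {
			return err
		}
	}
}

func main() {
	// Typical HTML: <br> is never closed, which browsers accept.
	fmt.Println("html-style:", strictXMLError(`<p>one<br>two</p>`))
	// The XML-style spelling of the same content.
	fmt.Println("xml-style: ", strictXMLError(`<p>one<br/>two</p>`))
}
```

The first call reports a syntax error and the second returns nil. An HTML parser has to accept both, plus unclosed and misnested tags, and that recovery behavior is what takes up so much of the spec.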

April 22, 2018 - 01:29
