I have a firefox bookmarks html file. The schema:

<DT><A HREF="$url" ADD_DATE="...">$bookmark_name</A>

I want to parse it and get lines like : $url $bookmark_name
How do I do that using gnu/linux tools? Thanks in advance!

Attached: gedit_2018-11-04_18-38-31.png (937x340, 41K)

plz help guys

try asking in the mozilla forums

nah, I am asking your mom out tonight already.

regexr.com

html is not regular

xmllint

Could you be more specific? Do you just want the variables?

With awk you stupid nigger

Attached: 2c6b381.jpg (2580x2450, 140K)

This

>Firefox

Those are not variables in the usual sense, I just showed the pattern. I want the value inside HREF="value" and the value between >value</A>

And how would one do it? I am in a hurry, so if you could just help me out with this that'd be so fucking great.

sed, awk, tr, cut, idk, do some fucking research you lazy fuck. Also anyone using linux rn without knowing how to use its coreutils should uninstall linux immediately and install wangblows or osx.

use cut, looks like = can be used as your delimiter

But urls or data containing an equals sign could fuck you, so maybe =" would be where to cut

Why limit yourself? Use a parsing package designed for the job of walking through xml or write a script in a common Linux language like python3 that utilises their language-native xml parsing library. Cut and regex and etc are awful hack jobs full of corner cases.
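
For what it's worth, a minimal sketch of that approach using Python's standard-library html.parser. It assumes each bookmark is an <A HREF="..."> element whose text is the bookmark name, which is how the export looks in this thread. The names BookmarkParser and extract_bookmarks are made up for illustration.

```python
from html.parser import HTMLParser


class BookmarkParser(HTMLParser):
    """Collect (url, name) pairs from every <a href=...> element."""

    def __init__(self):
        super().__init__()
        self.bookmarks = []  # finished (url, name) pairs
        self._href = None    # href of the <a> we are currently inside
        self._text = []      # text chunks seen inside that <a>

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag and attribute names, so HREF arrives as href
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only keep text that appears inside a link
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.bookmarks.append((self._href, "".join(self._text).strip()))
            self._href = None


def extract_bookmarks(html_text):
    """Return a list of (url, name) tuples found in html_text."""
    parser = BookmarkParser()
    parser.feed(html_text)
    parser.close()
    return parser.bookmarks
```

To get the "$url $bookmark_name" lines, read bookmarks.html into a string, pass it to extract_bookmarks, and print each pair. Colons, equals signs and quotes inside the URL stop mattering because nothing is split on delimiters.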

Well the thing is that I don't know how to program. Embarrassing, I know.

I think
$ sed "s/.*|(.*)/\1 \2/"
Or something close to it should work. Good luck

So I tried this:
cat bookmarks.html | sed -r -e 's/.*HREF="(.*)"\ ADD.*>(.*)/"\1":"\2"/' | grep : > file.txt

But I get picrelated
I tried to replace : between url and name with:
cat file.txt | tr '\042:\042' ' '

but it replaces : in the url too. What should I do?

Attached: bash_2018-11-05_02-13-43.png (599x70, 6K)
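
The reason the tr attempt eats the colon in the url: tr works on sets of single characters, not on strings, so '\042:\042' means "any double quote or colon", not the three-character sequence ":". A rough Python illustration of the difference, using a made-up sample line:

```python
# Hypothetical sample line in the "url":"name" shape produced by the sed command
line = '"https://example.org/page":"Example"'

# What tr does: every character in the set {", :} is mapped to a space
per_char = line.translate(str.maketrans({'"': " ", ":": " "}))

# What was wanted: treat ":" (quote, colon, quote) as one unit,
# then drop the remaining outer quotes
as_string = line.replace('":"', " ").strip('"')

print(repr(per_char))
print(repr(as_string))
```

Only the second variant leaves the colon in https:// alone.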

you could prolly find a way to get jquery to work from the command line in node, it'd make navigating HTML like that pretty easy

good project to learn on—very specific victory condition for you

I added the .*| Because you can't print what you don't match.
On the other hand you could change the print from "\1":"\2" to "\1"dicksdicksdicks"\2". Then grep for the dicks instead
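
Same idea sketched in Python, in case it helps: match only the known bookmark pattern and join the two captures with a separator that cannot appear in a url (a tab here). This assumes each bookmark looks like <A HREF="url" ...>name</A> with the whole element on a single line. BOOKMARK_RE and bookmark_lines are illustrative names, not from any library.

```python
import re

# Assumed shape: <A HREF="url" ADD_DATE="...">name</A>
# [^"]* stops at the closing quote of the url, [^>]* skips the other attributes,
# and the non-greedy (.*?) stops at the first </A>.
BOOKMARK_RE = re.compile(r'<A\s+HREF="([^"]*)"[^>]*>(.*?)</A>', re.IGNORECASE)


def bookmark_lines(text, sep="\t"):
    """Return one 'url<sep>name' string per bookmark found in text."""
    return [f"{url}{sep}{name}" for url, name in BOOKMARK_RE.findall(text)]
```

Because the separator is chosen up front, there is no second pass needed to tell the colon in the url apart from the one between url and name.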

Do it locally on the server

pypi.org/project/pyquery/

you can't parse html with regex:
stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454

Luckily enough he's not parsing HTML with regex, you imbecile. He's pulling values out of very specific places in a file.
And it's not can't it's shouldn't.

I have a Jow Forums post full of insults directed at my (You)'s. I need to read the HTML document to find just the insults. Can I do this with regex?

>Luckily enough he's not parsing HTML

>I want to parse it and get lines like

you can still parse parts of it with regex

Attached: grammars aren't everything.png (352x412, 16K)

>While it is true that asking regexes to parse arbitrary HTML is like asking a beginner to write an operating system, it's sometimes appropriate to parse a limited, known set of HTML.
Read before you post.

OP doesn't even know how to use regex, he's obviously not a programmer, and if you had read the question you would know he's not parsing HTML because he's not validating it as proper HTML. Just because OP doesn't know how to phrase his dumb question doesn't give you the right to be obtuse on purpose.
The file he's getting does not need to be valid HTML for him to get what he needs
Shut the fuck up and consider suicide.

dude you have your answer already. just match ":" (quotes included) and replace it with " " instead. It's not the best way but it works for your case.

something like this in BASH, same concept if you're using sed
bookmarks=$( bookmarks.txt

It only works for the first line tho

ok, found sed solution myself:
cat file | sed 's/:/ /2'
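
For anyone finding this later: a numeric flag on sed's s command (as in s/:/ /2) replaces only that numbered match on each line, so here the first colon (the one in https:) is skipped and the second one is replaced. A small Python sketch of the same idea, with replace_nth as a made-up helper name:

```python
def replace_nth(line, old, new, n):
    """Replace only the n-th (1-based) occurrence of old in line.

    If the line has fewer than n occurrences it is returned unchanged.
    """
    start = 0
    pos = -1
    for _ in range(n):
        pos = line.find(old, start)
        if pos == -1:
            return line
        start = pos + len(old)
    return line[:pos] + new + line[pos + len(old):]


# Hypothetical sample lines in the "url":"name" shape
print(replace_nth('"https://a.example/":"Name"', ":", " ", 2))
print(replace_nth('"http://a.example:8080/":"Name"', ":", " ", 2))
```

The second call shows the catch: a url with a port (or any extra colon) makes the second colon land inside the url, so this only holds up when every url has exactly one colon.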

Oh boy, I should really learn coreutils

For sure, piping cats is basically a sin.