UTF8

Attached: Photoshop_2019-02-26_12-13-44.png (1106x296, 50K)

anyone noticed the recent 9front patches?
The fellas have non-standard behavior for %.*s in printf, where the length is in UTF-8 codepoints and not in bytes.
It's so non-standard that they have been using it the other way all over the place, and now they are adding O(n) length lookups.
Fucking retards.

I CAN'T BELIEVE IT ACTUALLY EXISTS FUCK

amazing

Why would you use printf for bytes?

>oh noes, UTF-16 with only 2 bytes are not enough to store all these pointless characters no one uses and gay emojis with nigger and pajeet variants
a slightly modified and less gay UTF-16 would be perfection
fuck the unicode consortium

I was dealing with parsing a UTF-8 string recently and would keep track of offset and length values for whatever symbol I was dealing with, and using printf("%.*s", len, str+offset); made sense, and it doesn't need to calculate pointless codepoints and shit
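For anyone following along, here is a minimal sketch of that offset-plus-length pattern in standard C, where the precision of %s is a maximum number of bytes and is passed as an int. The helper name format_slice is made up for illustration; it is not from 9front or any library.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: format the byte range [offset, offset + len)
 * of str into out. In standard C the precision given to %s is a
 * maximum number of BYTES, and the '*' argument must be an int. */
int format_slice(char *out, size_t outsz,
                 const char *str, size_t offset, size_t len)
{
    assert(len <= INT_MAX);   /* the precision argument is an int */
    return snprintf(out, outsz, "%.*s", (int)len, str + offset);
}
```

The caller is still responsible for making sure offset and len land on codepoint boundaries, which a parser tracking byte offsets already does. No scan of the string is needed to find the end of the slice.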

because all data is bytes
the length is there so you don't have to rely on nul-termination (or so you can defensively ignore it and do bounded access instead)
why would you have it there in utf-8 codepoints? codepoints don't reliably correspond to anything useful about the text

There are a lot of shenanigans you can pull with unicode "homoglyphs" like that. For example, a lot of spam/abuse filters aren't smart enough to normalize (meaning replacing all 'a'-looking characters with the ASCII 'a'). Or, many systems will render URLs written in unicode differently based on how they normalize (or don't normalize) the strings before running URL detectors on them.
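A rough sketch of what that kind of lookalike folding could look like, assuming a tiny hand-picked table of five Cyrillic letters. The table and the name fold_confusables are made up for illustration; a real filter would want the full Unicode confusables data rather than this toy list.

```c
#include <stddef.h>
#include <string.h>

/* Toy table: a few Cyrillic letters that look like ASCII ones.
 * Hand-picked for illustration only, nowhere near complete. */
struct confusable { const char *utf8; char ascii; };

static const struct confusable confusables[] = {
    { "\xD0\xB0", 'a' },  /* U+0430 CYRILLIC SMALL LETTER A  */
    { "\xD0\xB5", 'e' },  /* U+0435 CYRILLIC SMALL LETTER IE */
    { "\xD0\xBE", 'o' },  /* U+043E CYRILLIC SMALL LETTER O  */
    { "\xD1\x80", 'p' },  /* U+0440 CYRILLIC SMALL LETTER ER */
    { "\xD1\x81", 'c' },  /* U+0441 CYRILLIC SMALL LETTER ES */
};

/* Copy in to out, replacing known lookalikes with their ASCII twin.
 * Writes at most outsz bytes including the terminating nul and
 * returns the number of bytes written, not counting the nul. */
size_t fold_confusables(const char *in, char *out, size_t outsz)
{
    const size_t n = sizeof confusables / sizeof confusables[0];
    size_t o = 0;

    if (outsz == 0)
        return 0;
    while (*in != '\0' && o + 1 < outsz) {
        size_t k;
        for (k = 0; k < n; k++) {
            size_t len = strlen(confusables[k].utf8);
            if (strncmp(in, confusables[k].utf8, len) == 0) {
                out[o++] = confusables[k].ascii;
                in += len;
                break;
            }
        }
        if (k == n)           /* no table entry matched: copy the byte */
            out[o++] = *in++;
    }
    out[o] = '\0';
    return o;
}
```

Running the filter on the folded string instead of the raw one is the whole trick. Note this is a different thing from Unicode normalization forms like NFC or NFKC, which don't fold cross-script lookalikes.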

lmao what a nerd u'll never get any pusy

UTF-8 is horrible honestly, UTF-32 should be the standard.

fuck you, you stole my catchphrase

what does an encoding have to do with the unicode consortium adding pointless characters
do you even know what you're talking about

UTF-8 is not your average encoding.

This is the turning point where I go from being angry at the chaos to enjoying it. Burn it all.

that's why I said we should use the UTF-16 encoding but with sane codepoints not made by trannies and sjws

Good. That's how it should be and makes total sense in that context. You're determining how many lexical characters to print, nothing to do with the type char or (size in) bytes.

If you want to print 1 character, you want to print 1 character, regardless of how many bytes wide it is. ASCII-dependent applications deserve to be broken.

⫸deleted
kek

no
it's for bounded memory access, thus an offset in bytes
glyphs or encoded characters can be composed from multiple codepoints
specifying the number of codepoints while only knowing the length of the buffer in bytes can cause a buffer overflow or, again, an O(n) lookup
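To make the O(n) point concrete, a sketch of what turning a codepoint count into a byte count involves, assuming the input is valid UTF-8. The function names are invented for this post and are not the 9front implementation.

```c
#include <stddef.h>

/* A UTF-8 continuation byte has the bit pattern 10xxxxxx. */
int is_continuation(unsigned char b)
{
    return (b & 0xC0) == 0x80;
}

/* Count the codepoints in the first nbytes bytes of s by counting
 * every byte that is not a continuation byte. Always a full scan. */
size_t utf8_count(const char *s, size_t nbytes)
{
    size_t n = 0;
    for (size_t i = 0; i < nbytes; i++)
        if (!is_continuation((unsigned char)s[i]))
            n++;
    return n;
}

/* Number of bytes spanned by the first ncp codepoints of s, never
 * reading past nbytes. The clamp to nbytes is what prevents the
 * overflow when ncp is larger than what the buffer really holds. */
size_t utf8_prefix_bytes(const char *s, size_t nbytes, size_t ncp)
{
    size_t i = 0;
    while (i < nbytes && ncp > 0) {
        i++;                                  /* lead byte */
        while (i < nbytes && is_continuation((unsigned char)s[i]))
            i++;                              /* its continuation bytes */
        ncp--;
    }
    return i;
}
```

For the 4-byte string "e" + U+0301 (combining acute) + "x", which renders as two glyphs, utf8_count gives 3 codepoints, and the first 2 codepoints span 3 bytes. So bytes, codepoints and glyphs all disagree, and the only number you get for free is the byte one.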

also, for what you're claiming, what you want is the %.*S extension (which works over an array of runes, i.e. wide characters, rather than bytes)

DELETE
THIS