Using Hangul as base 11,172 instead of base64

So I was thinking,

Base64 is used to serialize binary data pretty often even though it inflates the size of the data (4 bytes of output for every 3 bytes of raw binary). What I was thinking is that in order to save as much screen space as possible we could use Korean Hangul syllables (the Hangul Syllables Unicode block). The reasons for this are:

1. They are composable. You can spell them out and sound them out, unlike shitty Chinese characters.
2. They take up much less space than base64 or hex.

sha512 hex - 128 characters
sha512 b64 - 86 characters
sha512 Korean Syllable block - U+AC00-U+D7A3. 38 characters

057622eb43740427d3dacd50a1
fe4b8b3d768b683b2550b6a9b14
eadf0c7ad77fc64a8a19d692ee3b
66345bdeb421ec2e9b1b6663815
6b830add02319e6974c6

BQmWA9RhOBEJT116ChhsVqq
yhDb55oqWYekyZ8R10pz2juGI0
hOwNLI3kvAIPMQ3dWNspvSDL
F95P9fNrnyGai

귳언쐏눨뻋됙맅쪅몥뜶턮삔풀퓅똅쳥절
훴쭫쑜꽪쬐쉞쉳뱱쳢쥮꾭녒숶깵믜퓠퉋
맫픜멶멞
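
For reference, the character counts above can be checked with a few lines of Python. This is only a sketch: `chars_needed` is a made-up helper name, and it assumes the whole digest is treated as one big number written in the given base, using only the 11,172 precomposed syllables from U+AC00 to U+D7A3.

```python
HASH_BITS = 512                        # size of a sha512 digest
HANGUL_COUNT = 0xD7A3 - 0xAC00 + 1     # 11,172 precomposed syllables


def chars_needed(bits: int, alphabet_size: int) -> int:
    """Smallest n such that alphabet_size**n can represent every `bits`-bit value."""
    n = 0
    capacity = 1
    while capacity < 2 ** bits:
        capacity *= alphabet_size
        n += 1
    return n


print("hex   :", chars_needed(HASH_BITS, 16))
print("base64:", chars_needed(HASH_BITS, 64))            # without '=' padding
print("hangul:", chars_needed(HASH_BITS, HANGUL_COUNT))
```

By this count the worst case for a 512-bit hash is 39 syllables rather than 38. 38 is still enough for the example hash because it starts with 05, which makes the number a few bits shorter.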

I think this would be pretty cool for sharing links. What do you think?

Attached: base64.jpg (352x200, 14K)

I'm not really good at binary data so excuse my ignorance, but I do speak Korean and use it on my computer all the time.

There are often issues with hangul encoding on certain systems/programs where UTF-8 isn't supported, causing hangul characters to be displayed as boxes.

Wouldn't this cause some issue somehow?

you're right, it would cause issues for any system that can't support UTF-8. But that's why they should install support. I think it would be ok for most.

>causing hangul characters to be displayed as boxes.
>Wouldn't this cause some issue somehow?
only display issues for utf-8 capable software, clicking the link would still work even if the glyphs for some/all code points are missing.
>though it inflates the size of the data
it still would, even more than b64, because some bits (those indicating the hangul code point range) would be the same for every character. b64 is somewhat efficient because it uses only single-byte characters, for which the ascii range is implied.

It takes less space visually, but requires multi-byte encoding, thus ends up consuming more space than base64 with utf-8 encoding.

Hangul characters are more than 1 byte, retard. You're not actually saving any data.

>characters
>characters
>characters
Computers speak in bits, not characters. Your 38 Korean moonspeak is just as bad as a normal hex.

who gives A FUCK? systems where UTF8 isn't supported? should we also care about making all our graphics 16-bit to support people without 24-bit color?

trips and dubs of PURE LIES FROM SATAN HIMSELF. the only reason it would take more space in bits is because of poorfags stuck on binary systems.

jesus titty sucking christ, of course it takes up more space data-wise. I'm talking about screen space when you're trying to share a long hashed link. think ipfs or something else.

i'm not a gook so I honestly would prefer latin encoding than some sort of pictograph stoneage bullshit

ok now make some sort of converter so we can use it
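
A converter along those lines could look roughly like the sketch below. Nothing here comes from OP: the names `hangul_encode` and `hangul_decode` are invented, and the leading 0x01 sentinel byte is just one assumed way to keep leading zero bytes from getting lost when the data is treated as a single big integer.

```python
import hashlib

HANGUL_BASE = 0xAC00                   # first precomposed syllable
HANGUL_COUNT = 0xD7A3 - 0xAC00 + 1     # 11,172 syllables


def hangul_encode(data: bytes) -> str:
    """Write `data` as base-11172 digits taken from the Hangul Syllables block."""
    # Assumption: a 0x01 sentinel is prepended so leading zero bytes survive.
    number = int.from_bytes(b"\x01" + data, "big")
    digits = []
    while number:
        number, digit = divmod(number, HANGUL_COUNT)
        digits.append(chr(HANGUL_BASE + digit))
    return "".join(reversed(digits))


def hangul_decode(text: str) -> bytes:
    """Inverse of hangul_encode; rejects anything outside U+AC00..U+D7A3."""
    number = 0
    for ch in text:
        digit = ord(ch) - HANGUL_BASE
        if not 0 <= digit < HANGUL_COUNT:
            raise ValueError(f"not a Hangul syllable: {ch!r}")
        number = number * HANGUL_COUNT + digit
    raw = number.to_bytes((number.bit_length() + 7) // 8, "big")
    if not raw or raw[0] != 1:
        raise ValueError("missing sentinel byte")
    return raw[1:]


digest = hashlib.sha512(b"hello").digest()
encoded = hangul_encode(digest)
assert hangul_decode(encoded) == digest
print(len(encoded), encoded)
```

The decoder only looks at code points, so it works whether or not the machine has a font that can draw the syllables. With the sentinel, a 64-byte digest always comes out as 39 characters.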

I'd rather see 128 aesthetic alphanumerical characters than 38 vomit inducing subhuman runes.
82df2f1cb9eb4883d183de19d449f6fd.png
Look at that. That is FUCKING AESTHETIC. Makes me cum just as much as the image itself.
Take your fucking subhuman runes elsewhere, chink-eyed nigger.

english speakers chastising koreans for their spelling system just makes people laugh

Uh, sweety, a Hangul character takes up 3 bytes in UTF-8 (4 only in UTF-32)

>takes up more space
>becomes unreadable
>needs unicode support suddenly
>breaks 80% of applications
This is a good idea.

>Spelling system
Nice reading comprehension. I'm talking about how butt fucking ugly the runes are. Not how they spell things.

We will never move forward if worrying about such things as compatibility. Everyone will come to our system when it is shown to be better. Just like the iPhone

Your system is worse and it is a step backwards.

the solution is that we go full ham and instead of using a gookaboo rune system, we use emojis

why not full han chinese

chinkrunes are shit.

This wouldn't be an issue, would it? You don't need to display it, you need to decode it - and your decoder would treat it as a simple byte stream.

got a better solution?

Yes. Alphanumeric.

So you're saying stoneage tech is good enough?

qr code characters

base64 is what it is because it's printable, in every sense of the word: it can be physically printed, displayed on screen, typed on a keyboard, and stored as 7-bit ascii text
it's not going anywhere

i can't read it, and i don't even have a font for this shit
if i can't read it with my eyes and type it into another machine, it has failed and might as well just be a qrcode or something

Attached: 2019-08-17-183732_278x265_scrot.png (278x265, 20K)

Alphanumeric is the best and simplest way. Hangul or any other unicode encoding is fucking retarded. Why the fuck do you want to throw away the simplicity of ascii for variable fucking width encoding?
There's zero fucking advantage to using a unicode encoding.
Also, most people can actually fucking read and type alphanumeric strings, most people can't read or type hangul.
>w-why would you need to do that
I commonly look at the first few characters, and compare it to another hash to see if it's roughly the same. That's much harder to do with hangul because I can't fucking read hangul nor do I know their runes and therefore they are much harder to remember and compare. It's already hard enough comparing Japanese strings which I already do, but at least I can read half of Japanese, unlike hangul.

this idea reeks of newbie programmer/webdeveloper who thinks he's real fuckin smart and has no clue why things are the way they are, and just fucks everything up by making a more modern octagonal wheel

If using Unicode characters was acceptable, Base 65536 is a thing. The reason we use Base64 is because we specifically want to use ASCII only characters. We want to encode binary data as something we can type out on a standard US-English Keyboard.

can some smartass here tell me how to convert an image into base64 on arduino. I know there are libraries for it but I have no idea and not enough ram apparently on my ESP32

Attached: 1566018686472.png (603x604, 497K)
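
On the RAM question, one common approach is to encode in small chunks instead of loading the whole image. The sketch below shows the idea in Python rather than Arduino C++; `b64_stream` is a hypothetical helper, and the only thing it relies on is that base64 works on independent 3-byte groups, so any prefix whose length is a multiple of 3 can be encoded and flushed right away.

```python
import base64
import io


def b64_stream(source, sink, chunk_size=512):
    """Base64-encode `source` into `sink` while holding only a small buffer."""
    pending = b""
    while True:
        chunk = source.read(chunk_size)
        if not chunk:
            break
        pending += chunk
        # Only whole 3-byte groups can be encoded without padding.
        usable = len(pending) - len(pending) % 3
        sink.write(base64.b64encode(pending[:usable]))
        pending = pending[usable:]
    # The last 0-2 leftover bytes get the usual '=' padding.
    sink.write(base64.b64encode(pending))


data = bytes(range(256)) * 5 + b"xy"
out = io.BytesIO()
b64_stream(io.BytesIO(data), out, chunk_size=100)
assert out.getvalue() == base64.b64encode(data)
```

The same buffering scheme should carry over to a microcontroller, with the file reads and the output writes swapped for whatever the board's libraries provide.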

> t. brainlet that's happy with binary systems

Also, OP... by the looks of things, the Hangul you typed out in your example takes up 114 bytes when encoded as UTF-8. By comparison, in the same encoding, the Base64 takes up 86 bytes.

And yes, that is BYTES we are concerned with, NOT CHARACTERS. The point of Base64 is to encode as much binary data as possible into as few printable BYTES as possible. For every 6 bits of binary, it takes 8 bits to encode. Simple stupid. Each Hangul character takes 24 bits in UTF-8 (16 in UTF-16), more if it is spelled out as separate jamo. And no, you can't choose another encoding than UTF-8, because the other requirement is that it be PRINTABLE.
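
The byte counts are easy to reproduce. The snippet below is a rough check, not OP's data: the 38 syllables are arbitrary stand-ins generated from the start of the Hangul Syllables block, and the digest is of a made-up input.

```python
import base64
import hashlib

digest = hashlib.sha512(b"example").digest()            # 64 raw bytes

hex_text = digest.hex()
b64_text = base64.b64encode(digest).decode("ascii").rstrip("=")
# Assumption: 38 arbitrary syllables standing in for a Hangul-encoded hash.
hangul_text = "".join(chr(0xAC00 + i) for i in range(38))

for name, text in [("hex", hex_text), ("base64", b64_text), ("hangul", hangul_text)]:
    utf8 = len(text.encode("utf-8"))
    utf16 = len(text.encode("utf-16-le"))
    print(f"{name:7} {len(text):4} chars {utf8:4} bytes UTF-8 {utf16:4} bytes UTF-16")
```

Every syllable in U+AC00..U+D7A3 is 3 bytes in UTF-8 and 2 bytes in UTF-16, which is where 38 x 3 = 114 comes from.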

that was not the purpose though. why don't you go ahead and calculate the number of pixels that each encoding takes on the screen?

op is concerned with characters on screen, not the actual byte size

These are not relevant factors when it comes to encoding binary data as printable text. If we were concerned about pixels on a screen, you can just encode the binary data as an image. 32 bits per pixel.

it's a fucking character, the squiggles are absolutely arbitrary, what the fuck are you talking about

if screen space is paramount, and human readability doesn't matter, then use a qrcode or something

Attached: a.png (41x41, 336)

yeah but an image cant be copy/pasted as text. plus consider gpu power, when you consider that the hangul characters take fewer pixels, it becomes obvious that you will actually save resources on any system with a gpu because the character encoding is only the most basic part of the picture, whereas screen rendering takes the most processing power and memory. once you account for things like anti-aliasing, bitmap rendering and compositing, hangul is much more efficient than any existing system

pls ignore the relevant arguments and answer this.

Attached: download.jpg (228x221, 13K)

The number of characters on screen is irrelevant. Hex has survived because it's easier to parse, not because it's tighter. Being able to sound them out in a tiny-ass part of the world is of no relevance to *anything*.

not on topic, post it in /sqt/

pay for amazon compute instance time and run the conversion there

if you're just copy/pasting it, and not transcribing it, then why in the holy fuck does it matter how long it is?

how the hell do i send an image from esp32 to amazon aws?

Hangeul is completely regular, deterministic and alphabetic. You're confusing it with the Chinese characters.

op, you need to understand why base64 is the way it is before making up ways of improving on it
if all you're concerned about is saving screen space and only making it copy/pastable, then the best solution is to make a button in your program to place the link in your clipboard directly, or send it to the other program, rather than displaying it on screen at all

network block device

Base64 exists to send data over a channel that only supports ASCII.
If you can send arbitrary binary data already, then why not just send the data?
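
To make the ASCII point concrete, here is a small check using Python's standard `base64` module. The payload is just every possible byte value, picked for illustration.

```python
import base64
import string

payload = bytes(range(256))              # includes NUL, control bytes, 0x80-0xFF
encoded = base64.b64encode(payload)

# The standard base64 alphabet: letters, digits, '+', '/', and '=' for padding.
alphabet = set((string.ascii_letters + string.digits + "+/=").encode("ascii"))

assert set(encoded) <= alphabet          # nothing outside 7-bit printable ASCII
assert base64.b64decode(encoded) == payload
print(len(payload), "raw bytes ->", len(encoded), "ASCII bytes")
```

Whatever bytes go in, only those 65 printable characters come out, which is the property an ASCII-only channel needs.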

the only reason anyone pushes hex is because they've been convinced by russian hackers/social engineers that hex is good. Russians prefer hex because software like wxhexeditor allows them to do hacking. SO we've uncovered another reason OP is right -- moving away from hex will reduce hacking

Nevermind, now I get why you're doing this.
Sounds like a good idea, OP.

here's an example of this
the BTIH (the hex text) is usable to download the torrent just the same as the "download" button next to it, but why would you copy/paste the btih when you can just click the download button, which then brings up your associated program?

hangul is not arbitrary

pic

he didn't claim otherwise

Attached: 2019-08-17-192506_537x93_scrot.png (537x93, 9K)

>hangul is not arbitrary
>he didn't claim otherwise
>If you can send arbitrary binary data already

are you just talking out your ass? I said with arduino. there is no info.........................

Attached: hqdefault.jpg (246x138, 16K)

in the context of making up a new text-based data display format, where the only limit is the number of displayed characters on screen, yes, the data is arbitrary
op has made it clear he's not concerned with what the actual data is

op wrote
>I think this would be pretty cool for sharing links. What do you think?
which seems to imply, but im no expert, that this would be a way of putting links into text forms using a more compact display format that is within the accepted data format of the text form

What problem are you solving? You could link to every atom in the universe multiple times with 128 characters. Screen space for links is a non-issue. Use a smaller font.

use a network block device. I don't know what the fuck an esp32 is, but if it shows up in linux you can forward the device to a remote server. if it is a block device you can use nbd, if it's a character device you can just use netcat or one of the many drivers or methods for forwarding character devices.

If you use zero width characters, it will use up even less screen space.

op doesn't have control over user font size but he does have control over character count. you are just throwing red herrings

>I don't know what the fuck an esp32 is
it's a microcontroller, it doesn't run linux. he also mentioned arduino, which is a device and suite for microcontroller development

from what i can tell, op wants to make things like
ipfs.io/ipfs/QmbmmXPqwJESzeMctCZmAV1php93XGQrx6ry2VfEAJ4t8t/
shorter, which is fine, but going beyond what ascii can do introduces various problems with basically no advantages

Have you heard of link shorteners OP?

sounds like botnet

Attached: bitcoin_miners.jpg (3263x1839, 1.13M)

Yes, what problem is that solving? How many people are struggling with the life-threatening problem of links being 46 characters long rather than 20 characters long? How many people would have their lives qualitatively improved by shortening links by a few characters?

what if your browser limits visible url to 24 characters and the 20 characters is just under that limit, allowing you to clearly see your location? seems to me that you actually just hit the nail on the head in terms of explaining why OP would want to achieve this

op doesn't have control over whether the user's client can actually display hangul text either or if the user's font has hangul characters.

Then get a better browser. Increasing a limit is much easier than implementing an entirely new encoding; this is like fixing a faucet by building a new dam. This is a solution looking for a problem.

better to have 20 boxes in the text field than 21 and three dots

What the fuck are you talking about?

How is a random string of hangul characters more useful to the user than a random string of alphanumeric characters, truncated or not?

And those arbitrary squiggles are uglier than my language's arbitrary squiggles.

because it is accepted as form input and it fits in the fucking url bar. is this really so complicated? if you were a developer and I was steve jobs i'd fire you on the spot

You have no FUCKING IDEA what you're talking about. Fuck off and never talk about anything computer related ever again.

>it fits in the fucking url bar
what. how is this even relevant?

basically nobody and no device can read/write/type hangul
being able to display it and copy/paste it alone is not as useful as base64

because that is the entire purpose. How is using text at all even relevant? I guess we should just shove vibrators up our ass that give us signals through pulse modulation since visual representation is irrelevant

>You can spell them out and sound them out, unlike shitty Chinese characters.

Umm, no i can't.
How am I supposed to type that on my keyboard? How am I supposed to pronounce these?
Am I supposed to learn an entire alphabet and an entire keyboard layout?
And for what? So I can just save a few characters of screen real estate when dealing with hashed data?

No.

Attached: pQo7e8E.jpg (514x536, 51K)

So the entire problem this is solving is so that users can see □□□□□□□□□□□ in the URL bar instead of pXzsh1GDqLrohKwobY...?

> he doesn't know that those chars take up two or three bytes instead of one
100% retard.

that's one part, yeah. are you seriously this short sighted and tunnel-visioned? if we were building the Apple iMac you'd be the retard that says "DOES U RALLY WANNA PUT DA CPU IN DA MONITA? DAT GONNA GET HOT N CUNFUSING"

Attached: linus telling it.jpg (425x445, 56K)

it takes like 30 minutes to learn hangul, 병신 .

Attached: 640144_900.jpg (653x716, 89K)

>It takes less space visually,
nobody.. and i MEAN fucking nobody in this universe gives one SINGLE FUCKING FUCK WHAT BASE(whatever) LOOKS LIKE ON THE SCREEN, YOU BRAIN DAMAGED FUCKING MORON.

hurr dur
ok so what if your browser does not support hangul?
what if your os does not support hangul?
what if your browser supports only the letter 'A'?
what if there is some other ARBITRARY limitation that I decide to impose to make your idea look stupid just like you are trying to pull ARBITRARY limitations out of your ass to make it look better?

For every limitation you see this idea working in, I can come up with 2 more that make it not work.

every non-shit browser and non-shit OS supports hangul out of the box, kiddo.

> what if you don't know how to read
> what if you are blind
> what if you lost all your fingers in the war
> what if my IQ is too low to comprehend
> what if I don't even use linux
see this shit goes both ways, dumbass

Attached: pleading stevie.jpg (456x349, 25K)

The use case you are proposing is too narrow to justify replacing the existing systems in place.
Running out of screen space is not a frequent enough issue that I would agree to memorize a new alphabet (and keyboard layout, that's important too).
What will end up happening is that I will have to use these symbols once every five years, and relearn the system all over. It just wastes more time than anyone would gain from using it.

That's literally what he's saying you nigger

already done with emojis

the reason you want less characters is typeability, not screen space. korean characters are not typeable unless you know korean

> why would we make a touch screen without a stylus? every touch screen for 20 years has used a stylus. a finger will never be as accurate. people know how to use a pointing device. a finger is confusing and strange, no one will want to use a touchscreen that way

>see this shit goes both ways, dumbass
my point exactly.
We can keep pulling arbitrary limitations out of our asses till we are blue in the face, but it doesn't mean that this idea is actually any good.

yea and most modern browsers support urls longer than 20 characters, which was my point. The limitation is just not an issue these days.
I guess i should have quoted this:
>what if your browser limits visible url to 24 characters and the 20 characters is just under that limit

You're being fking dumb. People use and point to shit with their fingers every fucking day.

These moon runes are only useful once in a very specific circumstance, not justifying the need to spend time to learn a new alphabet and keyboard layout to use it.

people didnt need to relearn how to press and point at things when they got an iphone, they had that skill from other areas of their lives. It was already intuitive, which is why it worked. Your idea is highly unintuitive, which is why I do not like it one bit.

so we can all agree that OP had a good idea but he chose the wrong character set?

I think Katakana is way more a e s t h e t i c, would take a lot more characters but who cares it's /cyb/ as fuck.

Attached: 1555021105619.png (499x808, 299K)

computers with built in monitors existed well before the iMac
ever heard of portables, or laptops?

and don't use "it's new, so some adjustment is to be expected" as a justification for everything, it's lazy and helps nobody
making a change to the way things are done is ok if the advantages of changing outweigh the advantages of staying with what is in place
saving a few chars in hashes which will end up being relatively long anyway, while losing:
- ascii compatibility
- printability
- human readability
- ability to be typed on a standard keyboard
is nonsense, it creates problems, while not solving anything