A new character table

I took a closer look at the ASCII table and couldn't find out why it was set up the way it is.

I'm planning a little RISC-V project, which is basically starting from scratch to teach myself everything about computing.
So for the purposes of this project I thought, why not make an easier-to-remember character table.
Looking at ASCII and EBCDIC, I noticed that the control characters are encoded at the start of the table.
Any logical, non-deprecated reason for that?

I made a little text file in which I wrote down my current table.
The project would of course translate it back to ASCII or UTF-8 if it's needed.

Currently it starts with the numbers 0-9, followed by the standard Latin uppercase letters, then the lowercase ones.
The rationale behind this is that the hex codes for the numbers are the numbers themselves, and the letters A-F correspond to the correct hex values as well, which makes them easy to remember.
The start of the alphabet has an offset of 10, and to convert between upper- and lowercase you only have to know the length of the alphabet, which is 26.
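
Roughly, in C, converting to and from that layout is just offset arithmetic; something like this (a sketch only, the helper names and the 0xFF fallback are placeholders, not part of my actual table):

#include <stdio.h>

/* Sketch: digits '0'-'9' -> 0x00-0x09, 'A'-'Z' -> 0x0A-0x23, 'a'-'z' -> 0x24-0x3D. */
static unsigned char to_custom(char ascii)
{
    if (ascii >= '0' && ascii <= '9') return (unsigned char)(ascii - '0');
    if (ascii >= 'A' && ascii <= 'Z') return (unsigned char)(ascii - 'A' + 10);
    if (ascii >= 'a' && ascii <= 'z') return (unsigned char)(ascii - 'a' + 10 + 26);
    return 0xFF; /* placeholder for everything not assigned yet */
}

/* Case conversion is just +/- 26, since the alphabet is 26 letters long. */
static unsigned char custom_tolower(unsigned char c)
{
    return (c >= 10 && c <= 35) ? (unsigned char)(c + 26) : c;
}

int main(void)
{
    printf("'7' -> 0x%02X\n", to_custom('7'));                           /* 0x07 */
    printf("'F' -> 0x%02X\n", to_custom('F'));                           /* 0x0F */
    printf("lowercase 'F' -> 0x%02X\n", custom_tolower(to_custom('F'))); /* 0x29 */
    return 0;
}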

Attached: new char table.jpg (239x915, 109K)

>and couldn't find out why it was set up the way it is.
Brainlet

Notice 'A' (uppercase) is 100 0001, while 'a' (lowercase) is 110 0001. This goes for all the alphabetic characters. It makes things easy: you just flip the 6th least significant bit to go between upper and lower case.
Or, if you clear the 7th lsb, 'A' turns into Ctrl-A.

Both of these things made it extremely easy for early computers and programmers, because both shift and control could be implemented simply by flipping a single bit.
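
In C, both tricks are a single operation each (just a sketch to illustrate):

#include <stdio.h>

int main(void)
{
    char c = 'A';                /* 0x41 = 100 0001 */

    char swapped = c ^ 0x20;     /* flip the 6th lsb: 'A' (0x41) <-> 'a' (0x61) */
    char control = c & ~0x40;    /* clear the 7th lsb: 'A' (0x41) -> 0x01, i.e. Ctrl-A (SOH) */

    printf("'%c' ^ 0x20  -> '%c' (0x%02X)\n", c, swapped, (unsigned char)swapped);
    printf("'%c' & ~0x40 -> 0x%02X (Ctrl-%c)\n", c, (unsigned char)control, c);
    return 0;
}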

>I took a closer look at the ASCII table and couldn't find out why it was set up the way it is.

Attached: file.png (645x765, 464K)

afaik the Commodore 64 and Terry rejected ASCII in their systems, maybe look at those
>I took a closer look at the ASCII table and couldn't find out why it was set up the way it is.
you suck at searching then
>non-deprecated reasons
no, it's for typewriters and it's only alive because of the stubbornness and dogmatism of Unix culture, plus backward compatibility
preserved forever in UTF-8

Good luck having an ASCII system where NUL is not 0x00. If you make NUL 0xAB, you'll likely have to change all of your memory-clearing routines to use 0xAB.
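
Something like this in every place that currently zeroes a text buffer (CUSTOM_NUL and 0xAB are just the hypothetical value above):

#include <string.h>

#define CUSTOM_NUL 0xAB   /* hypothetical: whatever code the custom table gives NUL */

/* Every "clear this string buffer" routine now has to know the custom value,
 * because memset(buf, 0, n) no longer fills it with NUL characters. */
static void clear_text_buffer(char *buf, size_t n)
{
    memset(buf, CUSTOM_NUL, n);
}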

This is a very neat observation, thanks for sharing.

First, you won't be able to use 0 as a null terminator.
Second, you won't be able to write string literals in C.
Just stick to the standard, my dude, and make your life a million times easier.
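
For example (a sketch: the byte values assume a normal ASCII toolchain plus the digits-first layout from the OP):

#include <stdio.h>

int main(void)
{
    /* The compiler encodes this literal in its execution character set
     * (ASCII on basically every toolchain), not in the custom table: */
    char msg[] = "R2";                            /* bytes: 0x52 0x32 0x00 */

    /* Under the custom table you'd have to spell the bytes out by hand,
     * and there's no obvious terminator, since 0x00 is the digit '0': */
    unsigned char custom_msg[] = { 0x1B, 0x02 };  /* 'R' = 17 + 10, '2' = 2 */

    printf("ASCII literal: %02X %02X %02X\n",
           (unsigned char)msg[0], (unsigned char)msg[1], (unsigned char)msg[2]);
    printf("custom bytes:  %02X %02X\n", custom_msg[0], custom_msg[1]);
    return 0;
}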

Attached: 1550101874334.png (272x265, 74K)

Does it have a benefit for modern computing then?

No, which is why UTF exists

no, he's a retard, and as we see with more complex, non-Latin characters, the same rules don't apply anyhow.

OP, it's pointless to try. ASCII is basically baked in at this point, and the most popular Unicode encoding scheme (UTF-8) is basically a superset of ASCII.

Well, as I said, I'm starting from scratch, basically having nothing but a hex editor and something to convert the 32-bit instructions to hex,
maybe getting an assembler running from there, and also hoping I don't end up losing my sanity in the process.

But ASCII and UTF-8 are encoded exactly the same for the first 128 characters.

>no, he's a retard, and as we see with more complex, non-Latin characters, the same rules don't apply anyhow.
Sure, which is why Unicode is a thing. But the first 128 Unicode code points are still the same as ASCII. UTF-8 even encodes them with the same bytes, while UTF-16 and -32 use the same code points but encode them differently.
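
A quick way to see that (C11 sketch, nothing OP-specific):

#include <stdio.h>
#include <uchar.h>

int main(void)
{
    /* U+0041 'A' is the same code point everywhere; only the code units differ. */
    char     a8[]  = u8"A";   /* UTF-8:  one 1-byte unit, 0x41       */
    char16_t a16[] = u"A";    /* UTF-16: one 2-byte unit, 0x0041     */
    char32_t a32[] = U"A";    /* UTF-32: one 4-byte unit, 0x00000041 */

    printf("UTF-8  unit: 0x%02X (%zu byte)\n",  (unsigned)(unsigned char)a8[0], sizeof a8[0]);
    printf("UTF-16 unit: 0x%04X (%zu bytes)\n", (unsigned)a16[0], sizeof a16[0]);
    printf("UTF-32 unit: 0x%08X (%zu bytes)\n", (unsigned)a32[0], sizeof a32[0]);
    return 0;
}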

Either way, there's no reason to reinvent the wheel here.

WHY DID THEY NEVER TEACH US THIS IN SCHOOL

It's basically just for myself, so I can remember these things more easily. From the outside you wouldn't even notice the custom encoding running there.

there is no fucking difference, reading uninitialized memory is a bug anyway

An extremely common bug. If he ever hopes to use a library he didn't write, he's going to be squashing bug after bug (e.g. strlen won't work unless he modifies it).
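
For instance, with the digits-at-zero layout from the OP, any stored text containing the digit '0' contains a 0x00 byte, so strlen cuts it short (sketch):

#include <stdio.h>
#include <string.h>

int main(void)
{
    /* "105" in the custom layout: '1' = 0x01, '0' = 0x00, '5' = 0x05 */
    char custom[] = { 0x01, 0x00, 0x05 };

    /* strlen stops at the first 0x00 byte, which here is the digit '0',
     * not an end-of-string marker, so it reports 1 instead of 3. */
    printf("strlen sees %zu byte(s)\n", strlen(custom));
    return 0;
}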

>implying someone sane would use nul-terminated strings when doing anything from scratch
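
The usual alternative carries an explicit length instead of a terminator, something like this (a generic sketch, not anything specific to OP's project):

#include <stddef.h>
#include <string.h>

/* Length-carrying string: no terminator byte, so the character table is
 * free to assign anything (including a digit) to code 0x00. */
struct str {
    size_t         len;
    unsigned char *data;
};

/* Comparisons work on (pointer, length), never on a sentinel byte. */
static int str_eq(struct str a, struct str b)
{
    return a.len == b.len && memcmp(a.data, b.data, a.len) == 0;
}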

>It's basicly just for myself, so I can remember these things more easily.
Print this out and tape it to your desk.

Attached: table.gif (715x488, 27K)

Because of all the onions CS students who post whining about having to learn about assembly and compilers because it's "not necessary to code."

> tfw this is not only a real article, but there are literally dozens of stories (including video) of this nigger bragging about all of it

that reminds me of this:
en.wikipedia.org/wiki/Bucky_bit

Attached: standards.png (500x283, 24K)