Programming Challenge - Binary Parser

The objective of this challenge is to build a program able to translate binary digit characters (i.e. '1' and '0') from a file into raw binary data to another file. You may use any language to solve this challenge. You must translate this text into raw binary data:
01000011
01101111
01101110
01100111
01110010
01100001
01110100
01110101
01101100
01100001
01110100
01101001
01101111
01101110
01110011
00101100
00100000
01111001
01101111
01110101
00100000
01101000
01100001
01110110
01100101
00100000
01110011
01101111
01101100
01110110
01100101
01100100
00100000
01110100
01101000
01100101
00100000
01100011
01101000
01100001
01101100
01101100
01100101
01101110
01100111
01100101
00101110
00001010


Good luck.

Attached: computerwhiz.gif (250x180, 562K)

Other urls found in this thread:

scratch.mit.edu/
github.com/thulsadum/bash-turing-machine/blob/master/turing.sh
godbolt.org/z/CsEdZn
en.wikipedia.org/wiki/Bit_Manipulation_Instruction_Sets#BMI2_(Bit_Manipulation_Instruction_Set_2)
felixcloutier.com/x86/BSWAP.html
software.intel.com/sites/landingpage/IntrinsicsGuide/
godbolt.org/z/dHuRSG
github.com/Wunkolo/qreverse
prettyuglylittleliar.net/topic/10146-jenna-lynn-meowri-cosplayer/?page=1
twitter.com/AnonBabble

how is that a challenge?

Not doing your 1st semester homework/

to lazy to type out the 1-liner to solve this desu

boring ... and stupid break line character use a least 8 bits per line.

Easy enough, did it in one statement in the browser console (minus the trivial file I/O part).
Why do JS strings not implement reduce() anyway?

while read c ; do echo -e -n \\x0$c ; done < the-file.bin

Not going to do you much good though, that is just one learning opportunity wasted for you.

i made my own solution before posting this thread, but for some reason Jow Forums isn't letting me post it. also that looks really easy, why don't you make the same program is a real programming language, faggot.

psst dude, fyi bash is turing-complete

so are turing machines and punch cards

I actually was trying to be helpful (not going the sed route), using:
> while (i.e loops)
> grep (tokenization)
> ">" and "

Code golf in C? Code golf in c.

main(a,b){while(a=getchar()+1)a-11?b

Attached: 1572391472982.png (1129x971, 2.13M)

bump

how tf is this a challenge??? this is literally something all 1st year cs students can do

It's a challenge because you have to do it in the smallest amount of characters possible, baka.

Attached: 1538698118826.webm (852x478, 2.58M)

looks like nobody's up for a late saturday programming session OP

Attached: challenge.jpg.png (687x946, 55K)

Based and redpilled

main(a,b) {
while(a = getchar() + 1) { //aka terminate if getchar() returned 255?
if(a - 11) { //if a is not a newline
b

well, at least yours has error checking and shit

Attached: Symbos21pcw.gif (720x510, 28K)

In Unix everything is a file sweaty :^)

Attached: Symbos msx2.gif (512x424, 43K)

Yeah, you got everything right except
>terminate if getchar() returned 255
Actually EOF is represented as -1, so terminate at EOF.
That's why I had to increase 10 and 47 by 1 to compensate, giving 11 and 48 respectively.
Function arguments without a type are assumed to be ints, but it doesn't matter because the variable is truncated to a char when passing it to putchar.
I didn't remember all those operators from memory, finding a short solution took quite a bit of research and trial and error. Tbqh I think it's the first time I've ever used bitwise operators.

Attached: 1538563744451.png (1200x796, 561K)

that 16 bit OS looks pretty gool but also it's 16 fucking bits

Actually it's an 8 bit OS, Hitler.

Attached: Symbos21msx.gif (512x424, 32K)

how the hell am I supposed to do that. I know nothing about coding or how binary works.

>doesn't know how to program
>doesn't know binary

get the fuck out, go back to your gayms, have your mommy buy you glamour-wear in fortnite zoomy

ah okay, thank you!

Maybe scratch.mit.edu/ is more your speed, user.

Attached: 1567825187262.png (1527x1017, 2.87M)

>bash is turing-complete
proof ?

I'm interested in this as well. Sure you can control the execution flow pretty well, but how can you arbitrarily request and use memory at runtime?

Why

Because it's a programmers tradition to play that game. It's called code golf.

bit = char - 0x30

doesn't compile bro

Attached: 1573267849363.jpg (500x500, 43K)

>fopen
read from stdin, write to stdout. no need for FILE fuckery if you're on a proper OS

for u

void Main()
{
var input = new []
{
01000011, 01101111, 01101110, 01100111, 01110010, 01100001, 01110100, 01110101, 01101100, 01100001,
01110100, 01101001, 01101111, 01101110, 01110011, 00101100, 00100000, 01111001, 01101111, 01110101,
00100000, 01101000, 01100001, 01110110, 01100101, 00100000, 01110011, 01101111, 01101100, 01110110,
01100101, 01100100, 00100000, 01110100, 01101000, 01100101, 00100000, 01100011, 01101000, 01100001,
01101100, 01101100, 01100101, 01101110, 01100111, 01100101, 00101110, 00001010
};

Console.WriteLine(string.Join("", input.Select(x => Encoding.ASCII.GetString(new byte[] { Convert.ToByte(x.ToString(), 2) }))));
}

lame

>this is literally something all 1st year cs students can do
>all
You clearly never been in uni to study cs

void Main(){foreach(var l in File.ReadAllLines("i")){Console.Write(Encoding.ASCII.GetString(new byte[]{Convert.ToByte(l,2)}));}}

128 characters

How do I run it?

In LinqPad. Create a file name i (no extension) in LinqPad folder (eg. C:\Program Files (x86)\LINQPad 5). Copy those binary numbers into that file.

It's 115 chars, if I use C# Statements project type.

Attached: linqpad.png (1360x728, 24K)

test

You missed an extremely important problem. What happens if characters other than 1 or 0 are encountered?

UB

github.com/thulsadum/bash-turing-machine/blob/master/turing.sh

Now 89 chars.

Attached: linqpad.png (1360x728, 23K)

This is the fastest implementation I can come up with. Just try and beat its speed.
Lemme know if you need anything explained.
godbolt.org/z/CsEdZn
#include
#include
#include
// compile with -O2 -mbmi2 -mmovbe
int main()
{
std::uint64_t Bin64;
while( std::fscanf( stdin, "%8s", &Bin64) == 1 )
{
std::putchar(
_pext_u64(
__builtin_bswap64(Bin64),
0x0101010101010101
)
);
}
return EXIT_SUCCESS;
}

Attached: qpavae.webm (800x450, 1.84M)

>the whole program is just 19 instructions and a 3 byte string
this guy wins

Ok I'll give it a go
Hope the thread doesn't die before then

the only speedup i can imagine is to optimize fscanf while also skipping whitespace somehow

Can we agree on an objective way of measuring 'fast'?

Maybe
time for i in {1..5000}; do ./a.out >/dev/null < input.txt; done

why do you need the "== 1"

even sed is turing complete

We're talking TC without calling external programs

bump

does bash even support arrays?

My language of choice is Swedish this is the translation. These are the first three lines
Noll, Ett, Noll, Noll, Noll , Noll, Ett, Ett
Noll, Ett, Ett, Noll, Ett, Ett, Ett, Ett, Ett
Noll, Ett, Ett, Noll, Ett, Ett, Ett, Ett, Noll

can you explain more or less what you did here? what do these functions do?

It isn't critically important for the program to be able to handle any other character, you can see that some of the programs other people have posted don't deal with non-binary digit characters and those programs are able to solve the challenge nonetheless, however it will make your program more robust if it is able to handle such characters.

this is my solution and it is able to handle non-binary digit characters, if it encounters such a character it will ignore it and move the array index back by 1, the reason for this is that without the instruction there would be zero value gaps in the array or characters left over from the last byte if a non-binary digit character is encountered.

This is my ASM solution on WSL (it didn't want to run natively on Linux x64 last time I tried but I made some changes so who knows):

.globl main
main:
call getchar #load char into ax
mov %al,%cl #save it on cx for EOF testing
cmp $10, %al #test for newline
cmove %bx, %di
jne .+7 #if they aren't equal (it's not a newline), don't print anything
call putchar
shl $1, %bl #bitshift
sub $48, %al #substract 48
or %al, %bl #bitwise or
cmp $-1,%cl #test for EOF
jne main #jump to main


Performance is similar to this user's when testing using .
90% of the time is spent in the kernel doing IO, which is a testament of how complex and bloated modern kernels are, although sure, WSL does not to it any favors with its notably slow IO.

This is that user's solution:
user@desktop:/mnt/c/Users/user/Documents/temp/fast$ g++ -std=c++11 -O2 -mbmi2 -mmovbe anons.cpp -o anons.out
user@desktop:/mnt/c/Users/user/Documents/temp/fast$ time for i in {1..5000}; do ./anons.out >/dev/null < input.txt; done
real 0m29.915s
user 0m1.156s
sys 0m27.078s
user@desktop:/mnt/c/Users/user/Documents/temp/fast$ time for i in {1..5000}; do ./anons.out >/dev/null < input.txt; done
real 0m30.184s
user 0m1.281s
sys 0m26.953s


This is mine:
user@desktop:/mnt/c/Users/user/Documents/temp/fast$ gcc mine.s -o mine.out
user@desktop:/mnt/c/Users/user/Documents/temp/fast$ time for i in {1..5000}; do ./mine.out >/dev/null < input.txt; done
real 0m30.329s
user 0m1.313s
sys 0m27.266s
user@desktop:/mnt/c/Users/user/Documents/temp/fast$ time for i in {1..5000}; do ./mine.out >/dev/null < input.txt; done
real 0m30.144s
user 0m1.328s
sys 0m26.703s

User-mode performance could likely be improved using SIMD instructions but it's probably not worth it.

Attached: 1547281429453.jpg (970x545, 208K)

Fuck, fucked up the formatting. Oh well

Attached: 169386197594.png (716x1024, 374K)

Any suggestions on how to improve my ASM? I'm new to this tbqh

Since we are expecting binary "0"(0x30) and "1"(0x31)
All we have to do to convert from 0x30 to 0x00 and 0x31 to 0x1 is test
the last bit, and then use pext to collect all these bits, and pack them
into just one byte. We also must do an endian reverse as the ascii binary
characters are given to us in the reverse order of the actual bits

pext is a BMI2 instruction(bit manipulation instruction set)
en.wikipedia.org/wiki/Bit_Manipulation_Instruction_Sets#BMI2_(Bit_Manipulation_Instruction_Set_2)
this is supported with all processors since haswell(intel) and excavator(amd)

__builtin_bswap64 gets the compiler to emit a bswap instruction, which
reverses the bytes of a given register.
felixcloutier.com/x86/BSWAP.html
this would involve a regular "mov" and then a "bswap". To get some
extra speed I used the "movbe" instruction, which does both the "mov"
and "bswap" at once using one instruction. The compiler flag "-mmovbe"
allows the compiler to emit this instruction which is supported since
haswell.

fscanf documents that it will return an integer of the amount of successfully
parsed arugments (the %8s) here counts as "one" formatting argument
by testing that it consistantly returns 1, then this loop will run
so long as we can successfully read 8 continuous non-whitespace characters.
Otherwise it will return "EOF"

And where did you learn all that shit? Do you read Intel's manuals?

software.intel.com/sites/landingpage/IntrinsicsGuide/

>processing one character at a time vs processing 8 characters at once

Is it though? It's slightly faster, like 15% maybe. But who knows how the fascanf function is implemented, it still has to load the file byte by byte I think.
I can't wrap my head around the rest of his code, don't get why he needs to swap the bytes around, or what 0x0101010101010101 is.
And x86 is an abstracted frontend anyways, in reality who knows what instructions are actually being executed in the pipeline per cycle.

Give the sed man pages a read.

you actually can know what instructions are being executed in the pipeline.
godbolt.org/z/dHuRSG

yeah, that's not the point. we're talking about the bash executable itself, not the whole range of unix utilities

fstream might even be faster, hell even a memory mapped file could be faster

so what do you do for a living that you need to keep your code so tight?

Attached: 1475179818035.jpg (700x512, 24K)

if you write code that is expected to run in real-time or run for years and years at a time then yea this kind of stuff would matter.
You also wouldnt want to bottle neck or use an instruction that could potentially down-clock your entire core just because you used one of its instructions (AVX2, AVX512) and it could be used to identify where data-level parallelism can take place.

probably kernel shit you fucking trumper

Why is the aspect ratio so crappy here?

Yeah that's what I thought, videogames, embedded systems or some crypto shit for the government

Ah, yeah, didn't think of that one

Dunno I haven't actually ran the system yet, just some pics I took from the internet

some dude made a writeup where he got a x33 speedup by centering his algorithm around some SIMD instructions. So being aware of this kind of shit seems to actually matter when you can get practically near-ASIC speed for your particular algorithm.

Attached: nasa.webm (360x696, 137K)

found it
github.com/Wunkolo/qreverse

>this thread went from bash to highly optimized and enlightening assembly
Jow Forums on a good day

I dunno man, think about all the patents Intlel must hold on that stuff

I'd probably prefer something simpler that wasn't so proprietary

>bits = stdin.read
>count = bits.count
>buffer = allocate count / 8
>for each bit, index in bits
> buffer[index] |= (bit stdout.write buffer

Too lazy to actually test it

How does that work? Is it something I can run on my own machine?

bump

Post more of this slut. Now.

t. round earth shill

fuck off kike she looks like th average braindead 38 yo camwhore

listen to ignore

I don't care. Post more. Now.

she is a fake sociopath
prettyuglylittleliar.net/topic/10146-jenna-lynn-meowri-cosplayer/?page=1

also her boobs in that pic are fake rubber mounds.

it took me like 5 minutes to understand what you meant by "she's a fake sociopath". like, she pretends to be a sociopath? wut

fuck I'm autistic

Yes, and?

sudo post-more -t now --slut jenna-lynn

dude just go to a porn site wtf

No.

su -c

its right in the link you fuck. just look up her name.
Theres some lolcow thread full of her pics.

please lets go back to talking about intel intrinsics and quit bumping the thread with your horny package manager posts

for i in $(radix -d 2 `cat bin` -b 16); do
printf "\x$i"
done
Congratulations, you have solved the challenge.

Just post more already.

I thought he was just bumping the threads so more anons could see it and post their own solutions?

i dont want the thread to reach the bump limit and get deleted just because some horny fuck cant click a link.

I'm actually enjoying the thread otherwise