/dpt/ — Daily Programming Thread

Previous: What are you working on, Jow Forums?

Attached: Screen Shot 2018-06-30 at 11.08.08 PM.png (1280x720, 1.06M)

Is this specifically for Unicode names? Honestly I hadn't checked.

Shit you're right. Still, 5 bits should be manageable, I'll just have to mask off pairs of bytes rather than single bytes, so only a little more effort.

Idris is the future of industrial FP.

I started out intending to compile Julia, but got stuck reading about m4

Good. Julia fans are insufferable and perpetually angry.

What does Idris have that other languages don't?

dependent types and a form of linear types
docs.idris-lang.org/en/latest/reference/uniqueness-types.html

it's probably the dynamic type system. Honestly I'll pass on it just because of that, but I figured I'd check it out a little.
Reading some computing history via `info m4` is proving a lot more enjoyable anyhow, so I'll check out m4 instead of Julia.

Bunched types > linear types

why's that?

I made a quick PowerShell script to check for duplicate files in my Jow Forums folder and clean it up a bit.
pastebin.com/raw/F35NF0Pn
post your results if you want
careful! it's recursive

I don't remember that scene. Is that a spinoff or something?

second season

only one here

Attached: dup.png (354x51, 3K)

None. A while ago I made a script that keeps hashes of all files I have hoarded; when I save a bunch of new files it checks them against the hash database and auto-removes the duplicates. It also keeps track of files I removed so I don't see them again either. The list is also plugged into my scraper.
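
The core of it is tiny, by the way. A minimal C sketch of the same idea, if anyone wants to roll their own (toy FNV-1a content hash, paths taken from argv; the flat seen[] array stands in for a real database, and a real tool should byte-compare on hash collision before removing anything):

#include <stdint.h>
#include <stdio.h>

/* Toy version of the hash-database dedup: hash each file's contents
 * with FNV-1a and flag any path whose hash was already seen.
 * Collisions are possible, so byte-compare before actually deleting. */
static uint64_t fnv1a_file(const char *path)
{
    FILE *f = fopen(path, "rb");
    uint64_t h = 0xcbf29ce484222325ULL;   /* FNV-1a 64-bit offset basis */
    int c;

    if (!f)
        return 0;
    while ((c = fgetc(f)) != EOF)
        h = (h ^ (uint64_t)c) * 0x100000001b3ULL;   /* FNV 64-bit prime */
    fclose(f);
    return h;
}

int main(int argc, char **argv)
{
    static uint64_t seen[4096];   /* stand-in for the hash database */
    int n = 0;

    for (int i = 1; i < argc && n < 4096; ++i) {
        uint64_t h = fnv1a_file(argv[i]);
        int dup = 0;

        for (int j = 0; j < n && !dup; ++j)
            dup = (seen[j] == h);
        if (dup)
            printf("duplicate: %s\n", argv[i]);
        else
            seen[n++] = h;
    }
    return 0;
}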

Anime is dead, for now. Possibly forever.

If something like Texhnolyze did come along though, I wouldn't know. Despera probably won't ever exist, and perhaps should not exist.

Bunched logic has simpler semantics, it forms the basis of separation logic which is actually used to verify pointer programs in practice, and it is also perfect for higher-order abstract syntax. Linear types on the other hand are pretty much limited to reducing GC impact without being able to verify interesting pointer programs (see Cyclone, Rust, etc.).

>Cyclone
i was actually looking at it a few days ago, guess i'll give it a try.

It's abandoned, though you might still pick up a thing or two

That post wasn't supposed to be a Cyclone endorsement, point was it's limited in much the same ways Rust is - the substructural type system is just there to reduce GC/refcounting, not verify anything interesting that you couldn't do safely in any functional language.

>not verify anything interesting that you couldn't do safely in any functional language.
eh if i need verification, there's always f*

on linux you can use fdupes

I'd fill their holes, if you know what I mean.

I MADE AN ONION ROUTING THINGY WOAH.

Attached: Screen Shot 2018-07-01 at 6.48.54 PM.png (1089x590, 617K)

gwan.com/blog/20140524.html
>micro-optimization is the root of all evil! focus on algorithms and data structure, and pay for an expensive college to learn it! the compiler will fix everything!
CS cucks completely and eternally BTFO

Attached: 1507688062719.png (858x1006, 66K)

Maybe markov chains?
For words, you can use base 26.

I'm still not going to hire you without a degree no matter how much unmaintainable C you write

very nice user, it werks like a charm. could you add some prompt that lets you decide which duplicate to delete?

Attached: werks.png (1161x845, 177K)

Correctness is the most important property of 100% of your program. Speed is the second most important factor in maybe 2% of it. Yes, those 2% deserve due attention and optimization, but optimizing the remaining 98% is absolutely pointless.

Again with the separation logic shilling. Weren't you already blown the fuck out yesterday when I asked you to implement a circular linked list and prove its correctness using your shitty little toy logic?

Fuck off back to whatever misdirected effort CS department you came from, you idiot.

I wouldn't work for a company with this policy, even if it means taking a pay cut. It's "company smell." Working on my master's, also.

The image was unrelated and only intended to showcase the deluded mindset of the CS cuck.
The code in the post was far from correct.

You said concurrent circular linked list, and I asked for a use case so that I wasn't wasting my time trying to verify your useless data structure you made up by mashing together a few buzzwords.

>CS weenie seething that he wasted money and time on a worthless degree anyone can get at home

>The code in the post was far from correct.
In the linked blog post, yes. But it wasn't about micro-optimizations; it was about experience and viewpoints and how they affect programs and algorithms (and, more broadly, how we research and create new things). The Jow Forums post is just another uninformed cry from some CS major at a shitty uni about a hilariously bad homework assignment, which doesn't relate to the blog post at all.

A segment that only executes once per program / subroutine run will still potentially execute several million times, compounded greatly if your program achieves any widespread use.

Optimize everything. I can't tell you how many missed opportunities for trivial optimizations I've seen across entire programs, not to mention the insidious, frankly lazy garbage written out of that very attitude. They might get around to "optimizing" one loop they happen to think is important for whatever reason, while the rest gets left a suboptimal mess full of historical artifacts that were thoughtlessly hacked around as the program changed.

No. That attitude needs to get gone and stay gone. Think a bit in advance and do it right the first time.

The pic was just filler, you bloody mong. And it wasn't a homework assignment, it's from this book.

Attached: 1518758931308.jpg (600x831, 96K)

here you go :)
it lets you choose the file to remove
pastebin.com/raw/88s9BTpC

This. In 99% of cases, clean code and optimized code go hand in hand. The remaining 1% is the freaky bit hacks, and those you can limit to the inner loops. But the amount of garbage produced by, say, C++ or Java programmers is simply unfathomable.

Of course I don't mean that you should deliberately write slow code. But you should think and prioritize. Thankfully, at my job I have reliable data about the most common use cases, so I know the performance characteristics of the most computationally expensive paths.

omg user you're doing god's work. thank you!!!

Attached: 1530478064887.png (372x391, 98K)

Let's have a code jam of some sort /dpt/.
Opening the door to suggestions of theme, constraints, or to kys myself

Attached: 1527278345618.jpg (598x699, 133K)

>useless data structure you made up by mashing together a few buzzwords
Never reply to me ever again, you fucking monkey. You are the intellectual equivalent of toilet paper. You are not even deserving of this third reply I'm giving (you), but you've pissed me off now by trying to call me out despite it being clear that you're an undereducated little moron regurgitating buzzwords he read on the internet/heard in his department.

Fact 1: Separation logic is unwieldy. Try writing a non-trivial program (the kind of program that actually needs to be formally verified) and verifying non-trivial aspects of it with your shitty toy logic, and tell me how it goes. This isn't exclusive to SL, mind you, but it's still a huge defect.

Fact 2: Separation logic has produced nothing of worth since its inception. Its biggest achievement is VeriFast, which, despite generating some buzz, has only ever had actual programs written for it by its creators. Hell, it's even an open question whether VeriFast's implementation of SL is sound.

Fact 3: Separation logic is inexpressive. Trying to use locks? Good luck expressing yourself when trying to prove the correctness of anything other than a simple critical region (e.g. try implementing a useful read/write lock, I'll wait LOL). And let's not even mention lock-free data structures (another flaw that's not exclusive to SL, but still huge).

Fact 4: Separation logic is INCAPABLE, by DESIGN, of modelling circular data structures. Want a use case for them? Fine: Fibonacci heaps use circular linked lists to maintain their roots. Prove the correctness of THAT in your precious SL. I'LL FUCKING WAIT!

Asking you for an implementation of a concurrent circular linked list wasn't the result of me typing out random words that I thought of, it was me taking a shit all over your dumb post by exposing the weaknesses of your toy logic using 4 words.

Fuck off and never ever post again, you are dragging down /dpt/'s collective IQ.

* opening the floor, sorry am retarded

*snap*

Theme is compilers, write a working compiler/interpreter in a week.

Many audio APIs use circular buffers.

>ring buffers

HRT and fursuits will save systems programming!

For you

Make a game in m4
Make a game in MUMPS
Themes: alcohol, paper, birdsong
Measurable success: pick an open source project and improve its performance
Measurable success: pick a major open source project and have a PR accepted

You'd be *snap*ping his post instead if you understood how much of an idiot he is.

Circular buffer implementations are actually reasonably trivial to prove correct in SL though, since they're a regular array with 2 indices. Where SL actually shits itself is when you have circular references.
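
For anyone following along, the kind of two-index structure being talked about is just this (a minimal sketch, not any particular audio API's layout):

#include <stddef.h>

/* Minimal ring buffer: one array, two indices. 'count' disambiguates
 * full from empty; many real implementations use head/tail modulo
 * arithmetic alone instead. */
#define RB_CAP 256

struct ring {
    int    buf[RB_CAP];
    size_t head;   /* next slot to read  */
    size_t tail;   /* next slot to write */
    size_t count;
};

static int rb_push(struct ring *r, int v)
{
    if (r->count == RB_CAP)
        return -1;                      /* full */
    r->buf[r->tail] = v;
    r->tail = (r->tail + 1) % RB_CAP;
    r->count++;
    return 0;
}

static int rb_pop(struct ring *r, int *v)
{
    if (r->count == 0)
        return -1;                      /* empty */
    *v = r->buf[r->head];
    r->head = (r->head + 1) % RB_CAP;
    r->count--;
    return 0;
}

int main(void)
{
    struct ring r = {0};
    int v;

    rb_push(&r, 42);
    rb_pop(&r, &v);
    return v == 42 ? 0 : 1;
}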

Keeping the name while this discussion is on-going btw.

>a circular linked list is a circular buffer

In all seriousness, it's not hard to have circular references in SL, such as a circular or doubly linked list. Even an XOR doubly linked list is pretty straightforward. Are you perhaps thinking of decidable fragments of SL?

Codelet here but that's fun. I will participate and probably feel stupid

>In all seriousness, it's not hard to have circular references in SL, such as a circular or doubly linked list. Even an XOR doubly linked list is pretty straightforward. Are you perhaps thinking of decidable fragments of SL?

Uh? Yes it is. A separating conjunction requires that the two memory pieces be disjoint. How do you prove any interesting conditions about a circular data structure in SL?

>You are not even deserving of this third reply I'm giving (you)
Third sentence and I've already lost it.

>Measurable success: pick an open source project and improve its performance
would we individually choose projects or all decide on one to work on? The former seems problematic (find helloworld.py) and the latter sounds like more fun anyhow

>Measurable success: pick an open source project and improve its performance
This is pretty much the bulk of my programming activities.
>Need thing to do thing
>Find thing that does thing pretty good
>Make thing do thing better
>Make best thing to do thing

p->q * q->p is a simple circular assertion. p points to q, which points to p, but p != q. A circular list predicate is the same as a regular list predicate but instead of using null at the end it carries the head pointer through and goes back to it, e.g. isList h [1,2,3] = exists p,q. h->(1,p) * p->(2,q) * q->(3,h).
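
Same thing typeset, in case the ASCII soup is hard to read (my own LaTeX rendering of the post above, not anon's):

% * is the separating conjunction: the two sides hold on disjoint sub-heaps,
% which is what forces p != q.
\[
p \mapsto q \;*\; q \mapsto p \qquad (p \neq q)
\]
% Circular list predicate: thread the head pointer h through instead of nil.
\[
\mathit{isList}\;h\;[1,2,3] \;=\; \exists\, p, q.\;\; h \mapsto (1,p) \;*\; p \mapsto (2,q) \;*\; q \mapsto (3,h)
\]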

I'd love for /dpt/ to have their own gamejam, and see how different it is to /agdg/. But most of the nerds here will never admit they like games.

>base 26
>:thinking:
It's all-caps alpha plus space and dash, and sometimes a hex number that can be calculated from the code point value. Also, how would Markov chains help here? I've only used them for generating random strings.

:^)

en.wikipedia.org/wiki/Separation_logic

Give me h1 _|_ h2 for p->q * q->p (I'm using wikipedia's notation).

Too much overlap. Want to do a gamejam? Just go to /agdg/, we'll have one after this demoday.
/dpt/ codejam should be different.

I like this the best. It fits /dpt/ and leaves room for everybody.
I think while having all of /dpt/ converging on a project sounds fun, it is a bit too specific and would leave some otherwise enthusiastic /dpt/ers in the dust.
Picking your own project and improving on it seems really good. Of course you could pick a helloworld.py but nobody will be impressed, including yourself hopefully. A jam doesn't need a winner, /agdg/ jams don't have that and it's still fun.

h1(p) = q
h2(q) = p

/agdg/ are a boring bunch of engine babies who can't actually program
>imp-
/dpt/ is at least marginally better, and actually interested in languages.

Cool post

For p =/= q, obviously.

So h1 (resp. h2) doesn't have enough of the heap to prove p->q (resp. q->p), meaning that that's not a valid statement in SL.

Thanks for playing. You (and your shitty toy logic) are the weakest link, goodbye.

Encode the words as base 27, little endian.
Use Markov chains to see if a certain word always comes after another word, or does so with a high enough probability that, instead of storing A + B (97%) or A + C (3%), you can store AB (97%) or AC (3%). You duplicate the string but save on pointers.
Here's another trick: don't use a char pointer, use an index into the local char table. That way it takes up 16 bits instead of 64.

How would submissions work? Presumably GitLab, but how?

>I like this the best. It fits /dpt/ and leaves room for everybody.
And it is feasible. I've actually done this: to my knowledge there's no PNG encoder in existence that generates smaller output than mine, and it's nowhere near done yet. Gifsicle might well be next.

Also allows ideas for other projects in the back of one's mind to coalesce.

>So h1 (resp. h2) doesn't have enough of the heap to prove p->q (resp. q->p)
p->q by itself is only an assertion "about" p. Likewise, q->p by itself is only an assertion "about" q. q and p respectively could be anything. When you conjoin the assertions (and heaps), you get the full picture.

Clearly you are the one who doesn't get it.

We should go with that. It's a very decent idea for a first /dpt/ codejam.
"Find a project and improve it" ? Also leaving out improving performance, still allows for improving performance, but also allows for frontend people to participate.

@66574730
No more (you)s for (you), baiter.

Think about what you just posted and realise why it's senseless to say that h1 is enough to prove p->q when h1 doesn't hold p.

This is my last reply. Go read up on SL before you try to evangelise /dpt/ with your autistic toy logic. Bye.

It also leads towards contributing pull requests, which is something no college teaches to my knowledge

my memory scanner/editor varedit

Attached: ftl_varedit_small.gif (640x360, 3M)

are you gonna make it open source? it looks nice

Someday I will be able to do this kind of shit

not super familiar with licenses but it's on my github: github.com/asherlie/varedit
i've been focusing on making it very sensible to use and keeping it very lightweight

No need for a central repository, at least initially. Only a list of submissions, then each user is responsible for their own showcase be it gitlab, pastebin, gitea or whatever else

>base 27 little endian
I still don't understand this, it's all binary. If I could magically get more values per bit that'd be great.
>You duplicate the string but save on pointers.
My biggest issue is space taken by strings, so I'm trying to do the opposite.
>don't use a char pointer, use an index into the local char table
Already doing this, saves only about a MB.

how can i use my programming skills to get passive income?

>when h1 doesn't hold p

Attached: 1525891708311.png (229x220, 6K)

Alright codejam0: find a project and improve it. Host your own solution, gitlab, pastebin whatever.

Two weeks from now. I'll try to make the next OP with it.
Sunday, July 15, 2018.
1531630800

Attached: lenna_by_zaidot-d6ccp4i.png (512x512, 164K)

Not per bit. But you can do it more exactly.
If you use 5 bits, that gives you 32 values. But you only need 27. So 4 characters take 20 bits, but 19 bits are enough.
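
A rough C sketch of the arithmetic version, since the digit order is easy to get confused (the alphabet mapping is my assumption: 0 = space, 1-26 = A-Z; dash would make it base 28; 27^13 < 2^64, so up to 13 characters fit in one 64-bit word):

#include <stdint.h>
#include <stdio.h>

/* Pack up to 13 chars from the space/A-Z alphabet into one uint64_t,
 * base 27, little endian: s[0] is the least significant digit. */
static uint64_t pack27(const char *s, int n)
{
    uint64_t v = 0;
    for (int i = n - 1; i >= 0; --i) {
        int d = (s[i] == ' ') ? 0 : s[i] - 'A' + 1;
        v = v * 27 + (uint64_t)d;
    }
    return v;
}

static void unpack27(uint64_t v, char *out, int n)
{
    for (int i = 0; i < n; ++i) {
        int d = (int)(v % 27);
        out[i] = d ? (char)('A' + d - 1) : ' ';
        v /= 27;
    }
    out[n] = '\0';
}

int main(void)
{
    char buf[14];
    uint64_t v = pack27("LATIN LETTER", 12);
    unpack27(v, buf, 12);
    printf("%llu -> \"%s\"\n", (unsigned long long)v, buf);
    return 0;
}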

What about splitting up the words into tokens, then storing LATIN LETTER A as something like {26, 119, 1}? You can use the varint scheme for this, and put the most common entries at the start of the table so they get the short codes. The rest is just finding a neat way to store all the numbers. Maybe do a delta representation, resetting on each new block. Then store a bitfield for each character marking which tokens changed, run-length encode that, and you only need to store the changed tokens back to back, since the indexing works out easily.
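
The varint scheme in question, sketched in C (same idea as LEB128/protobuf varints; small token IDs cost one byte, which is why you want the common words at the front of the table):

#include <stdint.h>
#include <stddef.h>
#include <stdio.h>

/* 7 data bits per byte, high bit = "more bytes follow". */
static size_t varint_encode(uint32_t v, uint8_t *out)
{
    size_t n = 0;
    while (v >= 0x80) {
        out[n++] = (uint8_t)(v | 0x80);   /* low 7 bits + continuation bit */
        v >>= 7;
    }
    out[n++] = (uint8_t)v;                /* final byte, high bit clear */
    return n;
}

static size_t varint_decode(const uint8_t *in, uint32_t *v)
{
    size_t n = 0;
    uint32_t r = 0;
    int shift = 0;
    do {
        r |= (uint32_t)(in[n] & 0x7F) << shift;
        shift += 7;
    } while (in[n++] & 0x80);
    *v = r;
    return n;
}

int main(void)
{
    uint8_t buf[5];
    uint32_t back;
    size_t n = varint_encode(300, buf);   /* 300 -> 0xAC 0x02 */
    varint_decode(buf, &back);
    printf("%zu bytes, decoded %u\n", n, back);
    return 0;
}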

Or you could just do huffman encoding.

Explanation for the 2nd:
Say you have something like
LATIN CAPITAL LETTER A
LATIN CAPITAL LETTER B
...
LATIN CAPITAL LETTER Z
LATIN SMALL LETTER A
...
LATIN SMALL LETTER Z
GREEK LETTER ALPHA
...
GREEK LETTER OMEGA

Then that can be represented like
27 28 29 1
27 28 29 2
...
27 28 29 26
27 30 29 1
...
27 30 29 26
54 29 31
...
54 29 53

Then you subtract each from the previous, as such
27 28 29 1
0 0 0 1
...
0 0 0 1
0 2 0 -25
...
0 0 0 1
{reset, since new category}
54 29 31 //maybe change the last one to some designated IGNORE value, look at how varint handles it, but otherwise treating 0 = unprintable is probably good enough
...
0 0 1

Then generate bitmaps for each
1 1 1 1
0 0 0 1
...
0 0 0 1
0 1 0 1
...
0 0 0 1
...
1 1 1

For variable lengths, use some kind of null terminator, store the lengths in a global table (maybe using this same encoding?), or pad (maybe modify varint so it rounds to a power of two rather than a multiple of 8).

Many questions. Optimize however much you need. If there are only 16k words, a short for each is fine, and none of this is needed.

Also, your application is probably fucked, it's not Unicode's fault.
unicode.org/Public/UNIDATA/UnicodeData.txt is 1.7M, and that's as a CSV.

/* Digit-sum hash: add up the bytes, folding the running total into
   its decimal digits so the result is a single digit 0-9. */
int strhsh(const char *str)
{
    unsigned int hash = 0;

    for (const char *ptr = str; *ptr; ++ptr) {
        hash += (unsigned char)*ptr;
        /* 'while', not 'if': one fold of a large sum can leave hash >= 10 */
        while (hash >= 10)
            hash = (hash / 10) + (hash % 10);
    }
    return (int)hash;
}

my first hash generator :3

Attached: 239538046345.gif (500x280, 984K)

Feel good about yourself because you've done well!

Here's what I was thinking of trying next: 5-bit encoding of A-Z, space, and dash, and building a lookup table of the ~11,000 unique words; this will just be a bunch of offset computation & bit-shifting. Then I'll store the code point names as strings of ushort IDs into this table, but if I use diffs like you suggest I may be able to change this to one ushort and several signed chars.
>UnicodeData.txt is 1.7M
I'm well aware, fuck, I'd be happy with double that. So far my two approaches have been 12MB and 22MB.
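
For what it's worth, the shape I'd expect that to take (toy data, made-up names, just to show the two-level indirection through the word table):

#include <stdint.h>
#include <stdio.h>

/* Hypothetical shape of the two-level scheme: one table of the ~11k
 * unique words, and each code point name stored as a short run of
 * 16-bit word IDs. Table contents here are toy data. */
static const char *words[] = { "LATIN", "CAPITAL", "SMALL", "LETTER", "A" };

static const uint16_t name_ids[] = { 0, 1, 3, 4 };   /* LATIN CAPITAL LETTER A */

static void print_name(const uint16_t *ids, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        printf("%s%s", i ? " " : "", words[ids[i]]);
    putchar('\n');
}

int main(void)
{
    print_name(name_ids, 4);   /* -> LATIN CAPITAL LETTER A */
    return 0;
}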

How much of this applies to you, /dpt/?

Attached: nu-g.png (800x600, 309K)

thx senpai

Attached: teeheethxsenpai.gif (500x294, 601K)

Figured out how to do it well.
pastebin.com/yTub7kX9

But you're overcomplicating. Just write it in C and it'll sort itself out.

>when you congratulate someone and they turn out to be an anime poster
disgusting

>Just write it in C and it'll sort itself out.
What do you think I'm using

Python?
Are you indexing on codepoint, or contiguous-transformed codepoint?
Anyway, see my answer in the pastebin; Cloudflare didn't like it.
You could also skip the prefix compression part and index data[] directly, it's not that much of a waste anyway.

I'm using Python to process UnicodeData.txt and generate C/sepples (std::vector and std::map have too much memory overhead so it's basically C)
But yeah a solution that essentially dumps the text file into the binary might be acceptable.

I think you save ~10 chars/codepoint (320k/860k) with my "prefix-based compression", based on eyeballing cut -f 2 -d ';' UnicodeData.txt | head -n $(($RANDOM)) | tail -n 2.
Don't use std::vector or std::map, just do strlen over and over again.
Here, this is all the preprocessing you need to go down to 860k:
cat
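
And if you do want the prefix compression on top of that, assuming it's the front-coding kind I described: each sorted name is stored as a shared-prefix length plus the differing suffix. Toy encoder over an in-memory list (a real one would run over all the names):

#include <stdio.h>

/* Front coding: emit <common prefix length with previous name> + suffix.
 * Capped at 255 so the length fits in one byte in a binary format. */
static void front_encode(const char *names[], size_t n)
{
    const char *prev = "";
    for (size_t i = 0; i < n; ++i) {
        size_t k = 0;
        while (prev[k] && prev[k] == names[i][k] && k < 255)
            ++k;
        printf("%zu|%s\n", k, names[i] + k);
        prev = names[i];
    }
}

int main(void)
{
    const char *names[] = {
        "LATIN CAPITAL LETTER A",
        "LATIN CAPITAL LETTER B",   /* -> "21|B" */
        "LATIN SMALL LETTER A",     /* -> "6|SMALL LETTER A" */
    };
    front_encode(names, 3);
    return 0;
}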

I don't even know what most of those things are.

Keep it that way, stay pure.