Have we reached the theoretical limits of data compression or is there still room for advancement?

Attached: file-compression-ch.jpg (200x200, 26K)

Other urls found in this thread:

github.com/philipl/pifs
en.wikipedia.org/wiki/Pigeonhole_principle
en.wikipedia.org/wiki/Kolmogorov_complexity

I've heard Pied Piper has made some advancements, but that's practically it.

nice meme

github.com/philipl/pifs

Once we have quantum computers it might become viable

That's silly; storing the index of the file in pi can take as much storage as storing the file itself.
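Rough sketch of why (a toy illustration, assuming the mpmath package is installed; this is not how the actual pifs project works): the offset at which an n-digit string first appears in pi is typically around 10^n, so writing the offset down costs roughly as many bits as the data itself.

# Toy illustration only, not the real pifs scheme; assumes mpmath is installed.
import math
from mpmath import mp

mp.dps = 100_005                       # compute ~100,000 decimal digits of pi
digits = mp.nstr(mp.pi, 100_000)[2:]   # drop the leading "3."

target = "1337"                        # 4 decimal digits of "data"
offset = digits.find(target)

if offset < 0:
    print("not found in the first 100,000 digits")
else:
    offset_bits = math.ceil(math.log2(offset + 1))
    data_bits = math.ceil(len(target) * math.log2(10))
    print(f"offset {offset}: ~{offset_bits} bits to store vs ~{data_bits} bits of data")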

How about storing the SHA-512 hash or something and then recreating the file by generating every possible file until the SHA-512 matches? You might get a few false positives, but eventually you'll also get the correct file, and then you can store which one it was (e.g. 125 collisions before it, so it's the 126th matching file).

The theoretical limit is only reachable asymptotically.
It so happens that the classic LZ algorithms approach the entropy bound asymptotically, so in that sense the problem was solved 40 years ago.
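To make "the bound" concrete: it is the Shannon entropy of the source, which no lossless compressor can beat on average. A quick standard-library sketch (the 16-symbol source is just a made-up example):

# A source emitting one of 16 byte values uniformly at random has an entropy of
# 4 bits per byte, so no lossless compressor can average below n/2 bytes on it.
# DEFLATE (an LZ77 variant plus Huffman coding, via zlib) gets close to that bound.
import random
import zlib

random.seed(0)
n = 1_000_000
data = bytes(random.randrange(16) for _ in range(n))   # entropy: 4 bits/byte

bound = n * 4 // 8                                     # 500,000 bytes
out = len(zlib.compress(data, level=9))

print(f"entropy bound : {bound:,} bytes")
print(f"zlib output   : {out:,} bytes ({out / n:.3f} of the original size)")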

A few collisions? Try infinitely many.
If you understood the pigeonhole principle, you'd know why this is impossible without storing the file itself:
en.wikipedia.org/wiki/Pigeonhole_principle
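The counting argument is tiny (plain arithmetic, nothing else assumed):

# There are 2**n files of exactly n bits, but only 2**n - 1 files that are
# strictly shorter, so any lossless scheme that shrinks some inputs must
# expand (or leave untouched) others.
n = 16
files_of_length_n = 2 ** n
strictly_shorter = sum(2 ** k for k in range(n))   # equals 2**n - 1
print(files_of_length_n, strictly_shorter)         # 65536 vs 65535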

Yes. Nothing will beat stealth.

Don't worry, I've improved it even more: store the file size of the original file, and then you only have to generate the possible combinations of ones and zeroes of that length until you get a matching hash.

You just added a few extra bits to the hash...

There are a lot of advancements being made, for example micro-compression, where the data is compressed so that each bit is only 2 atoms wide.

Yes, but if you have a known file size then you do at least have a finite number of possible combinations.

I've actually given this a lot of thought. If you store the file size, multiple hashes of different kinds (SHA-2, MD5, etc.) and perhaps even the file type, then maybe it would be possible to reduce the number of collisions to a manageable amount.
But this would require unlimited processing power to be feasible anyway.

All forms of compression are a tradeoff between CPU time and storage space. This is just a very extreme form.

It doesn't really matter if compression gets better, because storage space and bandwidth are exploding.

Wasn't there some guy that claimed to be able to compress files by 99%, but it turned out he was just creating shortcuts?

>and then recreate the file by generating every possible file
Do you realize just how rapidly this becomes impossible? Let's say you have a 64-byte file, which is smaller than this post. There are 256^64 = 2^512 possible 64-byte files; at billions of files per second you would need unimaginably more than billions of years to create them all. Now do it with a 400KB image file.
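The arithmetic behind that, for anyone who wants to check it in a REPL:

# 256**64 = 2**512 distinct 64-byte files. Even at a wildly optimistic
# 10**18 candidates per second, enumeration never finishes.
candidates = 256 ** 64                      # about 1.3e154
rate = 10 ** 18                             # candidates per second
seconds_per_year = 60 * 60 * 24 * 365
years = candidates // (rate * seconds_per_year)
print(f"{candidates:.3e} files, roughly {years:.3e} years to enumerate")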

Google had some cool bleeding-edge compression for pictures, which stores only a few details and recreates everything else using AI.
Of course it isn't lossless and won't work on generic files, only on pictures, video, and audio, which constitute 90% of storage space anyway.

>theoretical limits

Well, "in theory" some files should blow up, since the compression has to be an injective mapping, but in practice that has literally never happened to my files, so the current state of the art can't be all too bad. (Although I guess the reason for that is simply that compression algorithms check, if they actually compress what they are trying to compress, and if they do not, then they set a flag to indicate that and use the raw data.)
I suppose there is still stuff that can be done for certain special purpose compression tasks, but it doesn't feel like there is a breakthrough for general purpose compression in the making. If at all, then in terms of speed, but not due to new algorithms for current computers.
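That fallback is easy to see with zlib from the standard library: already-random input can only grow by a small, bounded overhead, because DEFLATE can emit "stored" (raw) blocks, while redundant input shrinks a lot.

# Standard-library sketch of the stored-block fallback.
import os
import zlib

random_data = os.urandom(100_000)
text_data = b"the quick brown fox jumps over the lazy dog\n" * 2_000

for name, data in [("random", random_data), ("text", text_data)]:
    out = zlib.compress(data, level=9)
    print(f"{name:>6}: {len(data):,} -> {len(out):,} bytes")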

>being this impatient
My anime images are worth the wait.

There is always pigzip.

Attached: cartoon310.png (759x280, 30K)

I don't think there will be any big improvement in lossless compression.
In lossy, there will be. Just look at the media formats that appeared semi-recently, like H.265, WebP or WebM. With motion pictures going to absurd sizes for 8K-17K-32K images, there will probably be new codecs.

No. Just no. There is no "quantum" anything; this isn't some poorly understood, near-magic effect of a mythical theoretical particle. This is simply electrons being so small that they move through any material along the path of least resistance, because nothing can exert 100% perfect electrical control over them. It is current leakage, nothing but current leakage, in short-channel devices, and it happens at literally every feature size; it is not exclusive to small FinFET devices like upcoming 5nm EUV FinFETs. Even planar devices have extremely high degrees of leakage through their channels: directly under the gates, electrons still leak out. Yet despite this the transistors still function.

Quantum tunneling is a meme regurgitated by people who know nothing about the field of FETs.

>I don't think there will be any big improvement in lossless compression.
Compared to what we had 10 years ago, WebP and FLIF both provide pretty good lossless compression. And with AVIF on the horizon we might get an image format that even beats FLIF.

amen to that

based

there's just no need

storage is so cheap, bandwidth is so cheap

Attached: 36974486_173620796835367_6330065337924452352_n.jpg (720x635, 45K)

magnet link is a form of file compression

>mobile bandwidth
>cheap

In what world do you live?
Sure, it's not ridiculously expensive anymore, but I still always disable images in browser when not connected to Wi-Fi.

You're insane.

What.

You are mixing up compression and distributed storage.

Creating a UID for a file doesn't make it compressed.

what the fuck?

I have unlimited 4G mobile internet for €25 per month. How cheap are you?

>How cheap are you?
In theory unlimited 4G at €4, but it gets slowed down past one or two GB (not sure, I hardly ever reach the "limit").

just use black holes

Attached: serveimage.jpg (400x400, 16K)

There is still room for advancement.
Theoretically it would be possible for an AI to determine the best way to compress a file, reducing the file size even further than what we currently achieve.

Here's a possibly stupid idea: it may be possible that an AI could analyze a file and determine an algorithm that re-creates that file instead of compressing it. The question is, will the algorithm determined by the AI have the same amount of entropy as the file itself?

Wow racist and sexist much?

The only advancement I really see is a potential speed-up with quantum computers.
Those could attempt many ways of compression in parallel, and then choose the best one.
Doesn't improve the result, but the speed.

I tried this once, storing only the MD5 hash and the file size, and managed to "unpack" a 2-byte text file. I posted about it here too, although obviously in that case the hash was bigger than the original file. The thing is that "collisions are impossibly rare" only holds in practical scenarios. When you're generating every possible file, you are also going to hit every possible collision, so you should expect the number of collisions (and therefore the collision index) to grow with the file size. So it has the same downfall as the pi storage.
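Something like that experiment can be reconstructed in a few lines (a sketch, not the original poster's code; it only works because a 2-byte file has just 65,536 candidates, and the stored 16-byte hash is already far bigger than the file):

# Brute-force "decompression" of a 2-byte file from its MD5 hash and length.
import hashlib
from itertools import product

original = b"hi"
stored_hash = hashlib.md5(original).digest()
stored_size = len(original)

matches = [bytes(c) for c in product(range(256), repeat=stored_size)
           if hashlib.md5(bytes(c)).digest() == stored_hash]
print(matches)   # [b'hi'] -- no collisions at this size, but see the pigeonhole post above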

What?

keked

Attached: deedf5b0.jpg (850x720, 49K)

Compression ratios are likely not going to move a whole lot unless some completely unexpected and novel compression method is discovered. More recent advancements in compression algorithms have been more about getting the smallest size at close to real time speeds.

We already have ways to compress data to very small sizes, but the more something is compressed, the more work is required to uncompress it. If decompression takes so long that it's more practical to keep the file uncompressed, you've only really wasted your own time.

There are hard limits on how much you can compress something, and it gets harder and harder to squeeze out more as the remaining redundancy shrinks.

Aren't they using autoencoders, basically?

AI assisted lossy compression and interpolating decompression

Then just store the index itself in pi too

Cloud compression. Upload the file to the cloud. You now reference the file by its SHA512 hash. To decompress the hash back to the original file, send the hash to the server.

sandy bridge ought to be enough for everybody

You technically can compress data down to the size of a bit-generating algorithm. The time it would take to discover such an algorithm for your data would be like brute-forcing a very long hash.

Store the start and ending position.

That's not compression, that's just a shitty torrent knock off.

The length of the shortest algorithm that produces a file is its Kolmogorov complexity (en.wikipedia.org/wiki/Kolmogorov_complexity). It is also uncomputable, so not even an AI is going to help.

>AI
can we ban these letters?

Attached: 1549606791362.jpg (800x450, 84K)

>Wasn't their some guy that claimed to be able to compress files by 99%

My friend once said that. Turned out he was lying.

>AI compression
I would literally never ever use this. Triggers me immensely knowing I wouldn't be getting the same image back.

(USER WAS BANNED FOR THIS POST)

Attached: 1549678430994.jpg (500x579, 83K)

All data is just a really really long number.
Theoretically, if an AI can find a really short representation of that number, you can have lossless compression with very high compression ratios. You'd just have to find an AI that can do that.
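Some very long numbers really do have tiny descriptions, which is the entire hope here; the catch (by the same pigeonhole counting as earlier in the thread) is that almost all numbers of a given length have no description shorter than just writing them out.

# A 1001-digit number with a description that fits in a handful of characters.
n = 10 ** 1000
print(len(str(n)))   # 1001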

Are we at peak human civilization that has perfected physics, math, and engineering?

Always room for advancement.
Algorithms too memory and processor intensive today could be cheap in the future.

Until someone breaks physics and signal theory, then yes. And I'm not sure I want to be around when that happens.

Possibly. If we fall to socialism.

Who else remembers the 1MB GTA SA KGB meme from a decade ago?

Attached: meme.jpg (480x360, 33K)

You must never use jpg images ever then.

yes, jpegs are turbogay.

It's math. There never was any room for advancement. You can't compress 1 bit of data to less than 1 bit of data.

The only advancement is in lossy compression, which is just figuring out how much data you can throw out without a human being able to tell. For example, for a blind person you can compress any video to 0 bytes. For a colorblind person, you can throw out a range of color data completely.
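Throwing out color detail is literally what mainstream lossy formats already do via chroma subsampling. A minimal sketch of the idea, assuming Pillow is installed and "input.png" is whatever image you have lying around (real JPEG/video encoders do this internally and far more carefully):

# Keep luma at full resolution, store chroma at quarter resolution
# (roughly the idea behind 4:2:0 subsampling).
from PIL import Image

img = Image.open("input.png").convert("YCbCr")
y, cb, cr = img.split()
w, h = img.size

cb_sub = cb.resize((w // 2, h // 2)).resize((w, h))
cr_sub = cr.resize((w // 2, h // 2)).resize((w, h))

Image.merge("YCbCr", (y, cb_sub, cr_sub)).convert("RGB").save("subsampled.png")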

Underrated and best post in thread. Wave functions can violate neither the unitarity principle of quantum mechanics nor the third law of thermodynamics, even in a gravitational singularity. So the information isn't lost in a black hole.

>Jow Forums
>or /sci/
>into science

This thread proves Jow Forums is full of uneducated retards

Your brain can't comprehend just how large the numbers are.
Think about it this way: SHA-512 is considered by basically everyone to be a very safe way to ensure a file is exactly what it should be, right? Yet there are infinitely many possible collisions for each hash, /infinite/. So how can we treat it as pinning down a single file? Because finding a collision, either accidentally or intentionally, is designed to be near impossible; the search space is literally infinite.
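Even restricted to files of one fixed size the counting is hopeless, never mind "infinite":

# There are 2**(8*n) possible n-byte files but only 2**512 SHA-512 digests.
n = 1_000_000                               # 1 MB files
log2_files = 8 * n                          # 8,000,000
log2_digests = 512
log2_files_per_digest = log2_files - log2_digests
print(f"on average 2**{log2_files_per_digest} one-megabyte files share each digest")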

they thought the same when Newton was around

then one asshole came to tell them they were all wrong

Attached: 1538918550368.png (611x611, 350K)

ITT people with room temperature IQs

Retard

For general, lossless compression, yes, probably.
For specific types of data (like images or web traffic) and lossy compression?
Probably not.