Question

Bentley Rogers

Why don't we store bits instead of raster images? Then when an image needs to be displayed, it can be generated from the bits that are already stored on that substrate. For instance, if I wanted a 1x1 image of a red square, and that red square was used in another image on my hard drive (as part of a mosaic of red squares of varying shades), why can't that property be reused, saving memory? I understand this would probably take some computing power but I want to know exactly why we don't do it.

Attached: 1547313557809.jpg (1318x912, 1.49M)

January 13, 2019 - 17:46

Other urls found in this thread:

github.com/philipl/pifs
libraryofbabel.info/search.html
kernel.org/doc/html/latest/vm/ksm.html
youtube.com/watch?v=M5c_RFKVkko
youtube.com/watch?v=RXJKdh1KZ0w
twitter.com/SFWRedditGifs

Aiden Brooks

the current state of Jow Forums
install gentoo

Attached: foto_no_exif-min.jpg (4096x3072, 2.84M)

January 13, 2019 - 17:53

Cooper Ross

Can you explain what's stupid about my question? I just want to know more about how and why information is stored the way it is.

January 13, 2019 - 17:56

James Moore

What the actual fuck

January 13, 2019 - 17:57

Ayden Young

duuuuude

January 13, 2019 - 17:58

Kevin Cooper

Care to explain your idea more? I don't really understand what you mean.

January 13, 2019 - 17:58

Bentley Hughes

Interesting concept, but it would be way too slow to store an image, because we would have to go through all the images to find matching pixel groups.

January 13, 2019 - 17:59

Nicholas Moore

Because storing 1s and 0s is the most efficient way to store information

January 13, 2019 - 18:02

Jeremiah Roberts

As if all images are part of a big mp4 and you only need to know the frame number?

January 13, 2019 - 18:03

Evan Price

How would you reference the location of the red image and its shade in less space than 3 0-255 values? I don't really understand what you're even proposing.

January 13, 2019 - 18:03

Jayden Ward

Capacity is cheap and your method would be computationally expensive.

January 13, 2019 - 18:04

Ayden Gutierrez

It's absolutely retarded. Any sub-image addressing scheme would only expand the final size of images.

To save space we already developed compression algorithms, which do exactly what OP wants, but properly.

January 13, 2019 - 18:04

Camden Turner

That's why I was thinking AI could do it.

I know but can't they be reused to save storage space?

Reusing bits for separate files.

Is that really how compression algorithms work?

January 13, 2019 - 18:06

Oliver Bell

Just order them through similarity and make a h256 video from it

January 13, 2019 - 18:06

John Cook

That's basically JPEG compression.
Rather than store the image, it stores how individual segments compare to a list of predefined patterns.
Look it up

Also, what happens to your photo if the song or whatever is deleted or changed?

January 13, 2019 - 18:07

Samuel Thompson

No, the patterns I am speaking of are actual information stored on the same substrate as the image being uploaded. I'm talking about images sharing and reusing properties (in this case a single color)

January 13, 2019 - 18:10

Dominic Gonzalez

i'll be more specific
in your example we'd have two copies of the same color but used in different images. yes, there are fewer colors used, but the same color is still duplicated, taking up twice as much space. my method would allow for images with richer colors.

January 13, 2019 - 18:11

Ryder Peterson

>Is that really how compression algorithms work?
There are plenty of compression mechanisms and each works in its own way; even using a static dictionary is better than referencing dynamic data on the storage.

January 13, 2019 - 18:11

Christopher Sanders

Why? 16 million colors could be stored in 16 megabytes.

January 13, 2019 - 18:15

Caleb Phillips

/thread

January 13, 2019 - 18:16

Levi Howard

You cannot have a more efficient storage of 16 million colors than a sequence of bits with 16 million combinations. The only way to save space in an image is by compressing areas, not individual pixels, see . You wouldn't be able to effectively reference them without losing information.

I recommend you to study some basics of data and image compression before you start thinking.

January 13, 2019 - 18:21

Thomas Carter

I wasn't trying to put you out of a job, I was just curious.

The question isn't about saving space in an image, it's about how to avoid having to store images entirely. I'm not sure you understand the question. Are you from Netherlands?

January 13, 2019 - 18:23

Juan Sullivan

Oh you're asking why don't we just associate cached images to every picture file? Because data changes all the time.
Look I get what you're saying and that's vector graphics and it works much faster and more reliably.

January 13, 2019 - 18:24

Luke Bell

github.com/philipl/pifs

January 13, 2019 - 18:26

Luis Ortiz

Ohhh OK. How does it change? Thanks for trying to understand my misdirected inquiry. Some of the other anons here think I'm trying to show off.

January 13, 2019 - 18:26

Grayson Reyes

What am I looking at here? I'm scared...

January 13, 2019 - 18:28

Matthew White

you will need an index of which color to use and where its position is, and the image will end up bigger than if you just saved the repeated colors
or something like that, it is a stupid idea.

Attached: 47501528_940352676161721_1614859561007579136_n.jpg (960x956, 146K)

January 13, 2019 - 18:30

Christopher Lee

>if I wanted a 1x1 image of a red square, and that red square was used in another image on my hard drive (as part of a mosaic of red squares of varying shades), why can't that property be reused, saving memory?
If you break this idea down you'll pretty soon see what's wrong with it. In order to be able to retrieve a red pixel of the correct shade, the file needs to contain information stating what shade of red the pixel is... Which could just be used to render the pixel in the first place.

You could get around this by looking for blocks of identical data on the computer when the image is first saved and storing their addresses instead of the colors, but that's very complicated. You need to check all files for these addresses when you delete one of the original files (or keep some kind of massive reverse index), some of the files available in storage may be on network storage and so only available temporarily, etc. It's also unlikely to save space unless you're storing very regular images: most images are very noisy, which makes finding large common blocks between them hard, and the addresses will be much larger than the colors (probably 24 bits for the colors and 64 bits for the addresses, plus some information about the dimensions of the block).
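To put rough numbers on that last point, here is a minimal Python sketch of the break-even calculation. The 24-bit color and 64-bit address figures come from the post above; the 32-bit field for the block's dimensions is purely an assumption for illustration, as are all the function names.

```python
COLOR_BITS = 24      # one RGB pixel, 8 bits per channel (figure from the post)
ADDRESS_BITS = 64    # one disk address (figure from the post)
LENGTH_BITS = 32     # hypothetical field describing the block's dimensions


def reference_cost_bits() -> int:
    """Bits needed to point at an existing block instead of storing it."""
    return ADDRESS_BITS + LENGTH_BITS


def raw_cost_bits(pixels: int) -> int:
    """Bits needed to store the pixels directly, uncompressed."""
    return pixels * COLOR_BITS


def break_even_pixels() -> int:
    """Smallest block (in pixels) for which a reference is strictly smaller."""
    return reference_cost_bits() // COLOR_BITS + 1


if __name__ == "__main__":
    for pixels in (1, 4, break_even_pixels()):
        print(pixels, "pixels:", raw_cost_bits(pixels), "bits raw vs",
              reference_cost_bits(), "bits as a reference")
```

Under these assumptions a reference only wins once the shared block is at least 5 identical pixels long, and that is before counting the cost of finding the block or keeping the reverse index.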

January 13, 2019 - 18:30

Logan Roberts

The one technology that is able to rival Stealth's compression efficiency.

Attached: Stealth.png (756x2038, 58K)

January 13, 2019 - 18:31

Nathaniel Mitchell

It's a version of the Library Of Babel, where every sequence of words and letters that could ever exist is contained somewhere within the library already.

libraryofbabel.info/search.html

January 13, 2019 - 18:32

Wyatt Miller

I'll continue my layman's research from here. Thanks for all the help, fellas.

Attached: coop.gif (350x350, 2.92M)

January 13, 2019 - 18:34

Henry Williams

>Why don't we store bits instead of raster images?
Raster images are stored in bits, everything is. What do you mean?

A 1x1 red jpeg looks like this:
d8ff dbff 4300 0600 0504 0506 0604 0506
0706 0607 0a08 0a10 090a 0a09 0e14 0c0f
1710 1814 1718 1614 1a16 251d 1a1f 231b
161c 2016 202c 2623 2927 292a 1f19 302d
282d 2530 2928 ff28 00db 0143 0707 0a07
0a08 0a13 130a 1a28 1a16 2828 2828 2828
2828 2828 2828 2828 2828 2828 2828 2828
0000 0000 0000 0000 0000 0000 0000 0000
2828 2828 2828 2828 2828 2828 c0ff 1100
0008 0001 0301 2201 0200 0111 1103 ff01
00c4 0015 0101 0000 0000 0000 0000 0000
0000 0000 0700 c4ff 1400 0110 0000 0000
0000 0000 0000 0000 0000 0000 c4ff 1500
0101 0001 0000 0000 0000 0000 0000 0000
0600 ff08 00c4 1114 0001 0000 0000 0000
0000 0000 0000 0000 ff00 00da 030c 0001
1102 1103 3f00 9c00 1c00 1fa4 d9ff

If you don't know how to translate hex to bits then google it.
You can run hexdump [filename] on linux to get that
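One thing to watch out for when reading that listing: plain hexdump prints 16-bit words, so on a little-endian machine each pair of bytes appears swapped. The sketch below assumes that is how the dump above was produced (the function names are made up for illustration) and turns the first line back into file order and then into bits.

```python
def hexdump_words_to_bytes(words: str) -> bytes:
    """Convert little-endian 16-bit hexdump words back to the original bytes."""
    out = bytearray()
    for word in words.split():
        value = int(word, 16)
        out.append(value & 0xFF)         # low byte comes first in the file
        out.append((value >> 8) & 0xFF)  # high byte comes second
    return bytes(out)


def to_bits(data: bytes) -> str:
    """Render bytes as a string of 0s and 1s, eight per byte."""
    return " ".join(f"{byte:08b}" for byte in data)


first_line = "d8ff dbff 4300 0600 0504 0506 0604 0506"
raw = hexdump_words_to_bytes(first_line)
print(raw[:4].hex())     # ffd8ffdb
print(to_bits(raw[:2]))  # 11111111 11011000
```

The file really starts with ff d8, which is the JPEG start-of-image marker, followed by ff db, which introduces a quantization table.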
>Then when an image needs to be displayed, it can be generated from the bits that are already stored on that substrate. For instance, if I wanted a 1x1 image of a red square, and that red square was used in another image on my hard drive
You can just compress it to 1 color channel and then compress it with xz and it will be smaller than a pointer pointing to that place on the disk
A 1x1 red bitmap would actually be smaller by itself than the pointer.
*a pointer is just an address (aka a bunch of bits) that point to a place on disk or in memory
But pointing to a place in memory instead of just loading 2 exact same things is actually done sometimes, in linux you can enable this in the kernel config. So if you start the same program twice linux just replaces the similar bits in the second one with a pointer. Well it's actually a bit more complicated but that's kinda how it works.
For more info:
kernel.org/doc/html/latest/vm/ksm.html

Attached: mm.jpg (690x862, 229K)

January 13, 2019 - 18:45

Carson Walker

Thanks for dedicating this much effort to your answer. I suppose I underestimated how much space the "pointer" would take up (I swear I did consider it, though).

January 13, 2019 - 18:55

Josiah Murphy

Well, in some image formats a pixel can be represented by as little as 1 bit, so a 40 x 40 black and white image takes 1600 bits, while a pointer to an image that, let's say, is located at 16GiB on a disk would take 41 bits. Unless the image is just blank, it would take ages to find even 1 pointer to a pattern that matches, because in order to save space you would need to point to a pattern 45+ bits long, and even then you would only be saving 1-4 bits, because you don't want to scan the entire drive. It would also take longer to access the file, because you would have to read the pointer and the actual bits.
In RAM this is a lot different because RAM is fast and you could just start 2 of the same program which will most certainly have some shared read only data.

Attached: fancy image so you read my post.png (1920x938, 159K)

January 13, 2019 - 19:29

Jack Sanchez

By bits you probably mean small parts of an image, not computer bits. And the answer to that is you will lose more space storing those parts if you want a set that can represent any significant number of images. For text compression this actually works and it's used in some archivers. Filesystem compression can do what you have in mind and more.

January 13, 2019 - 19:35

Samuel Wilson

>Why don't we store bits instead of raster images? Then when an image needs to be displayed, it can be generated from the bits that are already stored on that substrate.
There are compression algorithms that do that. They use a fractal compression scheme that maps how different parts of an image are similar to each other. They can get very good compression, but are complex as hell to implement and have also been very patent-encumbered in the past.

January 13, 2019 - 20:12

Adrian Morales

The pointer is only 64 bits. What you're looking at is the header data and data about the format.

January 13, 2019 - 20:19

Michael Sanchez

read

January 13, 2019 - 20:25

Bentley Lopez

What you said made no sense, the access time is going to be the same whether it's 8 bits or 200.

January 13, 2019 - 20:32

Brayden Rogers

i'm thinking there are multiple pointers, because if you have a png with random noise there's a chance you could go through the entire filesystem and not find a match

January 13, 2019 - 20:35

Elijah Johnson

?? Let me explain pointers with an analogy. Pointers store an address to a place in memory. Let us say I give you the address to a store where I need you to pick something up for me. You're not going to scan the entire city until you arrive at the address by chance. You're going to drive directly there. It's the same with a pointer on your computer.

If the image is all stored contiguously in the secondary storage, it only needs one pointer to the head of the file. It can scan the remaining data sequentially.

January 13, 2019 - 20:44

Hunter Turner

graphically, you don't store the "color red", you just store a value that tells the computer that the color it's displaying is red. If you had a pointer instead, it would just be another identifier for red. If what you mean is reusing data that the developer already knows that you store locally, or at least have easy access to, that's called dependencies.

sorry if everyone else on this thread was laughing at you, they're just too high on their ego trip to provide any useful information. for future reference, ask these types of questions on Jow Forumssqt to have a higher chance of finding people who are willing to take you seriously

January 13, 2019 - 20:44

Nathan Lee

You still have to somehow store what color to search for. If the image uses a reduced number of colors, you could get away with using a lower number of bits to store each color, otherwise you're pretty much fucked.
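The reduced-color case is what indexed (palette) images do. Here is a rough Python sketch of the size arithmetic, assuming 24-bit RGB palette entries and ignoring headers and any further compression; all names are made up for illustration.

```python
def bits_per_pixel(color_count: int) -> int:
    """Bits needed to index one of color_count palette entries."""
    return max(1, (color_count - 1).bit_length())


def build_palette(pixels):
    """Return (palette, indices): unique colors plus one index per pixel."""
    palette = []
    position = {}
    indices = []
    for color in pixels:
        if color not in position:
            position[color] = len(palette)
            palette.append(color)
        indices.append(position[color])
    return palette, indices


def indexed_size_bits(pixels) -> int:
    """Palette (24 bits per entry) plus one small index per pixel."""
    palette, indices = build_palette(pixels)
    return len(palette) * 24 + len(indices) * bits_per_pixel(len(palette))


def raw_size_bits(pixels) -> int:
    """Every pixel stored as a full 24-bit color."""
    return len(pixels) * 24


# 40 x 40 mosaic using only four shades of red.
mosaic = [(200 + (i % 4) * 10, 0, 0) for i in range(1600)]
# 40 x 40 image where every pixel has a different color.
noisy = [(i % 256, i // 256, 0) for i in range(1600)]

print(raw_size_bits(mosaic), indexed_size_bits(mosaic))  # 38400 3296
print(raw_size_bits(noisy), indexed_size_bits(noisy))    # 38400 56000
```

With four shades the indices shrink to 2 bits each and the image gets much smaller. When every pixel is different, the palette is as big as the image and the indices are pure overhead.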

January 13, 2019 - 20:45

Eli Mitchell

you scan the disk to find a matching pattern and you point the part of that file to the pattern
i know how pointers work trust me

January 13, 2019 - 20:51

Mason Ross

OP is basically asking for the visual equivalent of MIDI for audio.

I often think about things like static linked binaries and non-compressed formats like BMP and think it should be the responsibility of the filesystem to deduplicate data.
While not exactly the same, it's somewhat similar.
Optimally you'd have 1 copy of each pattern, this could be aided by type information to know bounds and sections of file formats that could still potentially cross other formats.
like raw images acting as raw keyframes in a video.

While technically possible, it is likely to be expensive and hard to implement.
Although I've seen people talking about filesystems being more aware of file formats than they are now. To better optimize instead of just relying on generic/arbitrary blocks.

January 13, 2019 - 20:52

Jacob Hall

That's the most reddit comment of the day, congrats you fucking retard.

January 13, 2019 - 20:54

Jacob Baker

I doubt you're being ironic on purpose, but you probably will say you are.

January 13, 2019 - 20:56

Austin Mitchell

You realize accessing a file by a pointer has a time complexity of O(1). What you're proposing would have a time complexity of O(n). Linear time on the slow secondary storage of your computer. Why the fuck would anyone do that?

January 13, 2019 - 20:57

Jaxon Hernandez

You win the world's most retarded post award.

January 13, 2019 - 20:57

Nolan Russell

christianity was just as divided back then as it is now, if not more

January 13, 2019 - 21:02

Landon Murphy

what im thinking:
(file to compress split for visibility)
addr FFF000:
00000100100100
00001001000100
00010010000100

(pattern 1)
addr FFBBA0:
0000010010010010101...

(pattern 2)
addr AAAFFF:
00001001000100110101...

(pattern 3)
addr BBABBB:
000100100001001000111...

(final file split for visibility)
(pointer size, 2 hex digits|pointer|bits to read, 2 hex digits)
06FFBBA00E
06AAAFFF0E
06BBABBB0E

so you would have to read the pointers and the file which is slower than just the file
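A rough Python sketch of that record layout, with a dict standing in for the disk. It reads the last field as a count of bits, which is an assumption: 0E is 14, and each line of the example file is 14 bits long. The function names and the DISK dict are hypothetical, and the stored patterns are cut off where the post elides them.

```python
# Simulated "disk": address -> bit string stored there (values from the post,
# truncated where the post elides them with "...").
DISK = {
    0xFFBBA0: "0000010010010010101",
    0xAAAFFF: "00001001000100110101",
    0xBBABBB: "000100100001001000111",
}


def encode_record(address: int, bit_count: int) -> str:
    """Build one record: pointer length, pointer, number of bits to read."""
    pointer = f"{address:X}"
    return f"{len(pointer):02X}{pointer}{bit_count:02X}"


def decode_record(record: str):
    """Split a record back into (address, bit_count)."""
    size = int(record[:2], 16)
    address = int(record[2:2 + size], 16)
    bit_count = int(record[2 + size:2 + size + 2], 16)
    return address, bit_count


def resolve(records):
    """Follow every pointer and return the bits each one refers to."""
    chunks = []
    for record in records:
        address, bit_count = decode_record(record)
        chunks.append(DISK[address][:bit_count])
    return chunks


records = [encode_record(address, 14) for address in DISK]
chunks = resolve(records)
print(records)
print(chunks)
print(sum(len(r) * 4 for r in records), "bits of records for",
      sum(len(c) for c in chunks), "bits of data")
```

Besides the extra reads, each record here is 40 bits long and stands in for only 14 bits of data, so in this example the "compressed" file is also bigger than the original.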

January 13, 2019 - 21:13

Chase Martinez

Because knowing where the 1x1 red pixel you want is stored takes up more space than just making a copy.

If you go look in the infinite digits of Pi, all the images that exist are somewhere in there, so you could just write down at which digit of Pi your image is instead of storing the image.
Except the position in the digits, well that turns out to be bigger than your image almost always.

It's the same with deduplicating your hard drive. If you try to deduplicate too small things like individual bits and pixels, you waste WAY more space remembering where everything is.
If you try to deduplicate larger things (this exists, see filesystems like ZFS), then you'll notice that most of the things on your hard drives are not actually duplicate so you waste a lot of computing power and gain little.
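For a feel of what deduplicating larger things looks like, here is a toy Python sketch of block-level deduplication keyed by a hash of each block. It is not how ZFS or any real filesystem implements it; the 4096-byte block size, the class name and the size accounting are all assumptions for illustration.

```python
import hashlib
import os

BLOCK_SIZE = 4096  # bytes; an assumed block size, not taken from a real filesystem


class DedupStore:
    """Toy content-addressed store: identical blocks are kept only once."""

    def __init__(self):
        self.blocks = {}  # digest -> block bytes
        self.files = {}   # file name -> list of digests

    def write(self, name: str, data: bytes) -> None:
        digests = []
        for start in range(0, len(data), BLOCK_SIZE):
            block = data[start:start + BLOCK_SIZE]
            digest = hashlib.sha256(block).digest()
            self.blocks.setdefault(digest, block)  # keep first copy only
            digests.append(digest)
        self.files[name] = digests

    def read(self, name: str) -> bytes:
        return b"".join(self.blocks[d] for d in self.files[name])

    def stored_bytes(self) -> int:
        """Unique block data plus the per-file lists of 32-byte digests."""
        data = sum(len(b) for b in self.blocks.values())
        index = sum(len(d) for ds in self.files.values() for d in ds)
        return data + index


# Eight blocks of the same solid red RGBA pixel vs. eight blocks of noise.
red = bytes([255, 0, 0, 255]) * 1024 * 8
noise = os.urandom(len(red))

regular, noisy = DedupStore(), DedupStore()
regular.write("red.raw", red)
noisy.write("noise.raw", noise)
print(len(red), regular.stored_bytes())
print(len(noise), noisy.stored_bytes())
```

The solid red data collapses to a single stored block plus a short list of digests. The noisy data shares nothing, so the store ends up slightly larger than the input while still paying for all the hashing.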

January 13, 2019 - 21:33

Landon Murphy

>so you would have to read the pointers and the file which is slower than just the file

I don't follow. The os is going to have a table to lookup. Scanning a drive is really slow plus you would have to constantly load memory and perform a comparison to see if you're at the right file. A pointer is quick and direct. It's O(1), no debate about it.

January 13, 2019 - 21:35

Oliver Gomez

you have to scan the drive to get the pattern, then you put the pointer to the pattern in the file. when you want to read the file you have to read the pointer from the file and follow it, then read the next pointer, and so on until you run out of pointers to follow. reading and following a bunch of pointers is slower than just having the file

January 13, 2019 - 21:40

Nolan Diaz

You can't store the pointer pointing to itself in the file. That would defeat the point of a pointer. You store the pointer to the file elsewhere such as a lookup table so the os can easily lookup where the file is.

Storing the pointer telling you where the file is in the file itself is ultimately pointless.

January 13, 2019 - 21:46

Lucas Johnson

that's not what im saying
you have 2 patterns in a file
you found a pattern in a second file
and another in a third file
you store, in the first file, the pointer to the pattern in the second file and another pointer to the pattern in third file

January 13, 2019 - 21:55

Jack Murphy

What if the original file changes

January 13, 2019 - 22:05

Jack Perry

idk this idea is impractical for disks anyway
you need paging for it to work properly
maybe store a checksum?

January 13, 2019 - 22:10

Angel Martin

You could do that but images are highly compressed already. And except for situations exactly like you describe (a monotone red square) sequences of pixels are unlikely to be duplicated in a way that would improve the storage size if you referenced them. For example any real picture would have almost no sections in common with another image.

January 13, 2019 - 22:16

Kayden Rivera

You want a linked list. That's slow. It's not as bad on an SSD, but it's what happens when an older computer gets really slow because the data is heavily fragmented: you defragment the hard drive and the performance is significantly better. Sharing redundant data is already done for programs with common dependencies; that's what the linker does.

January 13, 2019 - 22:16

Gabriel Brooks

youtube.com/watch?v=M5c_RFKVkko

January 13, 2019 - 22:26

Blake Torres

This is you OP
youtube.com/watch?v=RXJKdh1KZ0w

January 13, 2019 - 22:35

Xavier Cook

1. Storing a reference to a byte is at least as storage-heavy as saving the byte itself.

2. In practise, how often do you think, say, the same large (large enough that storing an address becomes efficient) string of bytes is going to be identically replicated across a drive?

3. A single write operation will now be hundreds, maybe millions of times slower, and require vast amounts of RAM and CPU as it chews through the contents of your drive

4. Adds a huge set of security weaknesses to the storage

January 13, 2019 - 23:20
