Hey guys! I posted here a few times in the past with a pretty large thread each time. I made a Python library that lets you embed files into pictures and video in a way that is resistant to corruption and compression. Larger files are broken into multiple frames, so they are 'streamed' over a video. Conventional steganography only works if you're receiving the untouched file. My library doesn't have that limitation... you can change the format, resize the image, and even change codecs, and it can still work. The difference is that the color data on the frame is the carrier of the data itself, rather than the binary data that composes the file.

This isn't an efficient way to send data, nor is it designed to be. What it gives you is greatly increased data portability... anywhere a video or image can be hosted, you can essentially store as many files as you want there.

I wanted to just let you guys know that the library got finished a couple days ago, and is ready to be tried out! You're free to ask me anything about it.
github.com/MarkMichon1/BitGlitter

Here's a demo video with a ~170KB payload:
youtube.com/watch?v=HrY4deFrOoA

Here's some common questions/comments I've received in the past:

> This breaks Jow Forums's rules about embedding data inside of images!
This doesn't apply to academic discussion of it. All of my threads last through archiving.

> Why did you make it?
Because I'm new to programming and this seemed like a great way to dive deep into many concepts. It was a lot of fun.

> This seems stupid, it has no use, etc...
It was never made with any special intentions. If anyone sees potential in it and wants to do something with it, that's just icing on the cake. I've already received two offers for commercial deals along with a few others wanting to integrate my library into projects of their own, so there's that.

(cont)

Attached: bg.png (1656x918, 606K)

(cont)

> How does multi-frames work with larger files?
Each "stream" (bundle of files/folders) has a unique ID generated at creation. This ID sits in a header on each frame, along with the frame number. As frames get loaded, sequentially or not, each one is verified, and then its payload is saved as a small binary file. Once the reader detects that all of the frames have been loaded, the main payload is assembled from all of those frames, decrypted (if enabled), decompressed (if enabled), and then finally unpackaged. Files can be spread across 2 frames, 100, 500, or 8000. It functions the same way.
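
A rough sketch of the frame bookkeeping described above, for anyone who wants to picture it. The header layout here (16-byte stream ID, frame number, frame count, SHA-256 of the payload) and all function names are assumptions made up for illustration; this is not BitGlitter's actual format or API.

```python
import hashlib
import struct

# Hypothetical header: 16-byte stream ID, frame number, total frame count,
# and a SHA-256 digest of the frame payload. Not the library's real layout.
HEADER = struct.Struct(">16sII32s")


def build_frame(stream_id: bytes, number: int, total: int, payload: bytes) -> bytes:
    """Prefix a payload chunk with the hypothetical header."""
    digest = hashlib.sha256(payload).digest()
    return HEADER.pack(stream_id, number, total, digest) + payload


def read_frame(frame: bytes):
    """Return (stream_id, number, total, payload), or None if the hash check fails."""
    stream_id, number, total, digest = HEADER.unpack_from(frame)
    payload = frame[HEADER.size:]
    if hashlib.sha256(payload).digest() != digest:
        return None  # corrupted frame: reject it
    return stream_id, number, total, payload


def assemble(frames, stream_id: bytes):
    """Collect verified frames in any order; return the payload once all are present."""
    chunks = {}
    total = None
    for frame in frames:
        parsed = read_frame(frame)
        if parsed is None or parsed[0] != stream_id:
            continue  # skip corrupted frames and frames from other streams
        _, number, total, payload = parsed
        chunks[number] = payload
    if total is None or len(chunks) != total:
        return None  # still waiting on some frames
    return b"".join(chunks[i] for i in range(total))
```

Decryption, decompression, and unpackaging would run on the assembled bytes afterwards; they are left out here.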

Didn't read, lol

Attached: 1530853677995.gif (112x112, 20K)

Thanks for the thread bump.

>python

Attached: 1558401772168.png (645x773, 23K)

It's been a good first language to learn. I'm planning to learn some C next so I can move some of the more CPU-intensive parts of my library over to it.

Neat. Will give it a look, user.

Cool, let me know if you have any questions. All dependencies install with pip, but you need to install ffmpeg.exe manually.... I have yet to figure out the best way to automate this. The readme explains further, and has a download link for it.

github.com/MarkMichon1/BitGlitter#installation

I work in cryptography and your encryption scheme is broken. AES-CBC PKCS#7 is a vulnerable algorithm and practical, easy attacks already exist against it. I recommend replacing your current cryptography library with "pynacl"; the default settings should be sufficient.
You can keep scrypt for passwords since it looks like you've set it up correctly. Let me know if you have questions.

Thank you for letting me know about this. I'm looking into the library now, and will have more to say in a few minutes.

Is it the padding algorithm itself that is the issue? I spoke with a few people about the crypto I'm using and I was under the impression that CBC was the 'best' choice of the other modes. I'm trying to find information on its vulnerability but haven't found anything so far.

How does it feel to know you're enabling child abusers?

About the same way car manufacturers feel for enabling drunk drivers.

>frogposter
>retarded
All in order.

>Is it the padding algorithm itself that is the issue?
No.
>I spoke with a few people about the crypto I'm using and I was under the impression that CBC was the 'best' choice of the other modes.
Also no. For AES, the best mode is typically GCM. However, pynacl doesn't use AES.
>I'm trying to find information on it's vulnerability but haven't found anything so far.
Look up the Cryptopals challenges. The exploits are described in problems 11, 16, 17, and 27. If those don't help, the general principle is that CBC mode is weak because it's unauthenticated.
Very roughly speaking, attackers can edit the encrypted file without detection.
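
To make "unauthenticated" concrete, here is a minimal standard-library sketch of the missing piece: a keyed tag over the ciphertext that is checked before anything is decrypted (encrypt-then-MAC). The function names are hypothetical and the ciphertext is treated as opaque bytes. Libraries such as pynacl bundle this kind of check into their default encryption, so treat this as an illustration of the idea, not a recommended replacement.

```python
import hashlib
import hmac
import os

TAG_LEN = 32  # HMAC-SHA256 output size in bytes


def seal(mac_key: bytes, ciphertext: bytes) -> bytes:
    """Append an HMAC-SHA256 tag to an already-encrypted blob."""
    tag = hmac.new(mac_key, ciphertext, hashlib.sha256).digest()
    return ciphertext + tag


def open_sealed(mac_key: bytes, sealed: bytes) -> bytes:
    """Return the ciphertext if the tag verifies; raise ValueError otherwise."""
    ciphertext, tag = sealed[:-TAG_LEN], sealed[-TAG_LEN:]
    expected = hmac.new(mac_key, ciphertext, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("ciphertext was modified")
    return ciphertext


# Stand-in values: a random MAC key (kept separate from the encryption key)
# and random bytes playing the role of the encrypted payload.
mac_key = os.urandom(32)
fake_ciphertext = os.urandom(48)
sealed = seal(mac_key, fake_ciphertext)
```

Unlike a plain SHA-256 of the data, the tag cannot be recomputed by someone who edits the file without knowing the key.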

but op you arent allowed to encode things into images here

Thank you for elaborating; from what I've been reading the past few minutes, that makes a lot of sense. Security and file integrity have been my top priority in designing this. There are a lot of mechanisms in place to prevent data corruption, but I didn't have as much confidence with the crypto... even after studying it for a few days. The frames already have their own SHA-256 for their payload, so direct tampering of the frames themselves wouldn't work. The reader compares the read hash with the calculated one, and if the two aren't equal the frame is rejected. So I think what you're talking about mainly applies to the payload sitting on their computer before its automatic deletion.

I'll be making these changes soon though.

And thanks for taking the time to look at my code. Much appreciated.

You're welcome! Your project is very interesting. Good luck with your changes.

>not a frogposter
>easily amused
All in order.

Attached: 1532263155354.jpg (1209x756, 460K)

good work, mark

Thanks user

I love to fiddle with ffmpeg converting video to audio and audio to video, so, this project is pretty nice, appreciated!

You're welcome! Funny you mention audio.... it's actually on my project's roadmap to embed data into audio streams using some kind of modulation scheme (I haven't gone too far in figuring out how to do it yet). That would allow cassette tapes, records, or any other audio media to essentially store digital data.

nobody is going to use this shit until you make a windows binary

you clearly want attention

A desktop app with a GUI is the next major goal of the project. And yeah, this is how releasing open source software works.

Please make it an electron app just to piss off Jow Forums

Wouldn't this only preserve data for formats where the color doesn't get mangled? It'd only work for lossless formats.

cassette tapes, records, and anything else that's analog can simply have bits written to it, so that would be kind of pointless. But along the lines of using youtube as a filehost, you could embed files into mp3s and they could be posted on soundcloud. I don't see very many white-hat ethical use cases for this sort of thing. The only thing I can think of, is keeping an encryption key or important file safe by obfuscating what it is, making it appear as if it's not data, but video or audio, but to really do that, it would be ideal to make the program interleave data with a source media.

That was actually suggested to me the other day. The other option is PyQt. What do people have against electron?

It doesn't, and that's what separates this from all the steganography apps available. If you check out the readme, there are numerous color palettes to choose from by default. The simplest are black and white; at the other end are 6 bit and 24 bit color (as in, that many bits of data per color block).

The program scans the inner 50% of the block, takes the mean RGB value of what's scanned, and then compares its value to all of the colors in the palette. The "closest" one (thinking of it spatially) is selected. It does indeed work after compression; I've tried it in various environments online and you get 100% frame reads with the default config of the library.
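
The block-reading step described above can be sketched in plain Python. Two assumptions here: "inner 50%" is taken to mean the middle half of the block in each dimension, and "closest" is taken to mean smallest squared Euclidean distance in RGB space. The function names are made up for the example.

```python
def mean_inner_rgb(block):
    """Average the RGB values over the middle half of a square block.

    `block` is a list of rows, each row a list of (r, g, b) tuples.
    """
    size = len(block)
    start, end = size // 4, size - size // 4  # skip a quarter on each side
    pixels = [block[y][x] for y in range(start, end) for x in range(start, end)]
    count = len(pixels)
    return tuple(sum(p[c] for p in pixels) / count for c in range(3))


def nearest_color(rgb, palette):
    """Return the index of the palette entry closest to `rgb`."""
    return min(
        range(len(palette)),
        key=lambda i: sum((rgb[c] - palette[i][c]) ** 2 for c in range(3)),
    )


# A block that was pure red before compression smeared its edges and
# shifted its color slightly.
palette = [(0, 0, 0), (255, 255, 255), (255, 0, 0), (0, 0, 255)]
block = [[(250, 10, 5)] * 8 for _ in range(8)]
block[0] = [(90, 90, 90)] * 8  # damaged top edge, outside the scanned area
index = nearest_color(mean_inner_rgb(block), palette)
```

Sampling only the middle of the block is what lets blurred or bleeding edges be ignored.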

With that said.... the more colors you use the lower that margin (but you carry more data), and the fewer colors you use you have a much greater margin, yet you're transporting less per frame.
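
The tradeoff can be put in numbers with a back-of-the-envelope calculation. The 20 pixel block size is an assumed value for illustration, not a library default, and header overhead is ignored.

```python
def bits_per_block(palette_size: int) -> int:
    """Bits carried by one block, assuming the palette size is a power of two."""
    return palette_size.bit_length() - 1


def frame_capacity_bytes(width: int, height: int, block_px: int, palette_size: int) -> int:
    """Raw bytes one frame can carry, ignoring any header overhead."""
    blocks = (width // block_px) * (height // block_px)
    return blocks * bits_per_block(palette_size) // 8


# 1080p frame with hypothetical 20x20 pixel blocks.
black_and_white = frame_capacity_bytes(1920, 1080, 20, 2)        # 1 bit per block
six_bit = frame_capacity_bytes(1920, 1080, 20, 64)               # 6 bits per block
twenty_four_bit = frame_capacity_bytes(1920, 1080, 20, 2 ** 24)  # 24 bits per block
```

Under those assumptions a frame holds 648 bytes in black and white, 3888 bytes with 64 colors, and 15552 bytes with 24 bit color. The catch is that the 24 bit palette leaves almost no distance between neighboring colors.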

Yeah I just like brainstorming. I have an audio engineer friend who told me higher frequencies can be utilized for the AFSK or whatever scheme I'd use, so that listening to the stream wouldn't reveal any sort of distortion at all. It would require some kind of analytic software to even detect it.

Attached: gNMti.jpg (774x236, 24K)

Converting digital files to audio is nice, but it uses too much space in time.
ffmpeg -i c.mp3 -f u8 -ar 48000 -ac 2 - | mpv --demuxer=rawvideo --demuxer-rawvideo-fps=15 --demuxer-rawvideo-w=480 --demuxer-rawvideo-h=360 --demuxer-rawvideo-mp-format=yuv420p -
a.uguu.se/rhQTvqicnoF2_c.mp3

Yeah it wouldn't be practical to store large files in, just smaller text files, documents, etc. Do you know if that stream is compressed prior to encoding to audio? At least from casually listening to it, there seems to be a lot of empty space. Compression was something I integrated into my project early. Running payloads without compression increased their size by a good margin (depending on the formats), and some frames were full of big black areas (which were long strings of 0 bits). Pic related.... it's the only example of this I have saved on my machine.

I have to get to sleep so any replies from me will have to wait until then.

Attached: test_10.png (1920x1080, 18K)

I'm the one that talked about the color mangling. Sounds really interesting OP, well done. I'll check your repo out later.

>Do you know if that stream is compressed prior to encoding to audio?
No, it's not.
The empty space would be dark areas in the video, I don't think removing it would be great for the final output...

in what way is this different than zipping up my files and slapping a BMP header at the beginning of it?

it survives lossy compression

hm, thats interesting, you made me look into this

Final post before I go to sleep-

I'm talking about compressing the bytes of whatever files/data prior to encoding, just so we're both on the same page. gzip, zlib, some library like that.
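
For reference, compressing the bytes before encoding takes only a few lines with Python's standard zlib module. The sample payload below is made up; it mimics a file with long runs of zero bytes like the black areas mentioned earlier in the thread.

```python
import zlib


def pack(payload: bytes) -> bytes:
    """Compress the payload before it is turned into frames or audio."""
    return zlib.compress(payload, 9)


def unpack(blob: bytes) -> bytes:
    """Restore the original bytes after decoding."""
    return zlib.decompress(blob)


# A payload dominated by zero bytes shrinks dramatically.
payload = b"\x00" * 100_000 + b"actual content"
packed = pack(payload)
print(len(payload), "->", len(packed), "bytes")
```

The flip side is what came up with the audio experiment: compressed output looks like noise, so every bit matters and the carrier has to deliver it exactly.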

Yep, this. The concept really isn't much different from a QR code. That's in fact one of the main inspirations behind doing this; maximizing the concept behind that and barcodes in general.

Thanks, I'm looking forward to people trying it out

>desktop app with a GUI

Just write a web application in Java (seriously).
That way you also make it 100% portable to any system with a web browser

Attached: 1460934318461.jpg (640x640, 114K)

>compressing the bytes of whatever files/data prior to encoding
Just did that, it worked! But the audio now just sounds like random noise, meaning it can't go to a lossy compression.

Now THIS is the kind of threads I want to see on Jow Forums. Thanks op, you're pretty cool

ffmpeg -hide_banner -i b.wav -f u8 -ar 48000 -ac 2 - | gzip -d - | mpv --demuxer=rawvideo --demuxer-rawvideo-fps=15 --demuxer-rawvideo-w=480 --demuxer-rawvideo-h=360 --demuxer-rawvideo-mp-format=yuv420p -
To play a.wav just remove the gzip pipe
mega.nz/#!wT4m3KLQ!UTnPeuqtJrvyYJNOOHBt1D1nyR4T1BXzN2k0Fa_hOfs
mega.nz/#!0SxmyQgS!Gymqfr5Hd9JV8TWm-DXC11ALIYXsFFsShW4J-93CsAU

this desu

oh it's this thread again
i="out.jpg"; ffmpeg -r 10 -pix_fmt monob -f rawvideo -s 160x120 -i

Attached: out.jpg (4096x995, 3.73M)

How is this any different to dial up internet?

Sending information over audio is not a new thing
en.wikipedia.org/wiki/List_of_ITU-T_V-series_recommendations#Simultaneous_transmission_of_data_and_other_signals

This already deals with the limitation that the phone network is only able to send frequencies important for human speech. You'd need something similar for a cassette tape machine because it will have limitations too, but as someone else pointed out it's far more efficient to just use the magnetic tape to store bits.

Jow Forums did this years ago with what became known as Schillsaver.

warosu.org/g/thread/51159395

github.com/Valkryst/Schillsaver

Attached: 1446554048822.png (512x512, 56K)

Please fuck off and die, normalfag scum.

You can upload private videos to YouTube and have unlimited cloud storage.

>A desktop app with a GUI is the next major goal of the project.

Just use Schillsaver then.

github.com/Valkryst/Schillsaver/releases

Schillsaver also works for large files, 3GB file stored in a video = 9GB video file. Not bad.

>You can upload private videos to YouTube and have unlimited cloud storage.

Until YouTube decides to nuke all such videos.

Good.

This seems really really cool. Are you seriously new to programming though? Obviously you already know a lot about tech in general if you know about steg and crypto etc, but this is still quite advanced for a supposed first project, even the code base is properly split up into different files. Like I'm surprised that someone new to programming had even made a project that *works* let alone have it actually seem really useful. Mind if I ask how you've been learning?

Why only 24-bit color depth and not 32?

Fucking based, fuck nigger OP
Ever have of a sage , new faggot? Back to red.dit you uncreative swine

I remember someone posting about a year or two ago?
Was it you? Has it really been that long? You actually made it?

The slightest bit of compression will completely break the entire thing. Like I've been saying throughout this thread, this is where already available steganography breaks. My library is resistant to it.

Yeah, this is indeed my first project, and truthfully, when I first started I had no idea how most of it was going to be coded. Aside from needing to learn quite a bit to make this work, planning everything out was another challenge. I first heard of Python about a year ago, and I got serious about working on this project in the late fall/early winter, working and learning simultaneously. It was quite challenging at times to figure out the solution for a given problem, but I always made it a point to stick it through. Breaking into this industry is something important to me, so that's why I aimed high with a relatively ambitious first project. This was a deliberate choice.

The last 8 bits of a 32 bit palette are for transparency. This wouldn't make much sense for a standalone stream like this. It's on my roadmap to allow streams to be "partial frame," meaning a normal video can be playing with a stream playing on some portion of the screen alongside it. You could in theory compare the alpha there, but reading it would require an original lossless copy of the video and steganographic methods to extract the data, which are outside the scope of this project (for now at least). 24 bit is the practical maximum size for now without getting very fancy in how to decode it, if it's at all possible.

Not me, it was probably that other thing people are linking to.

It will still work, you just have to slow down the transmission rate or use fewer frequencies... hard for me to answer specifically now because I'm half asleep and not completely sure how what you're using works. Back in the planning phase of this, I spent a lot of time learning about EAS, or the Emergency Alert System (you'll know what I'm talking about if you live in the US). Those scary sounding beeps at the beginning of messages are SAME headers: AFSK-encoded messages to various alerting hardware. Without getting into how they work, they are pretty resistant to corruption when broadcast over long distances, while streaming ~65 B of data per second. I think the answer to this problem lies somewhere in the middle, just like how my library sits "between" steganography that breaks with any compression and QR codes that were never built for high density data transmission. A little more info on the SAME headers if you're interested:

en.wikipedia.org/wiki/Specific_Area_Message_Encoding
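
As a rough illustration of AFSK, the sketch below turns bytes into two-tone audio samples using the published SAME parameters (520.83 bits per second, 2083.3 Hz for a 1, 1562.5 Hz for a 0), which works out to the ~65 bytes per second mentioned above. The sample rate, the bit order, and the function name are assumptions for the example; it leaves out the preamble and everything else a real SAME header needs.

```python
import math

SAMPLE_RATE = 48_000  # assumption: output sample rate in Hz
BIT_RATE = 520.83     # SAME bit rate in bits per second
MARK_HZ = 2083.3      # tone used for a 1 bit
SPACE_HZ = 1562.5     # tone used for a 0 bit


def afsk_samples(data: bytes):
    """Return float samples in [-1, 1] encoding `data` as two-tone AFSK."""
    samples = []
    phase = 0.0
    bits_done = 0
    for byte in data:
        for shift in range(8):  # least significant bit first (assumed)
            bit = (byte >> shift) & 1
            freq = MARK_HZ if bit else SPACE_HZ
            bits_done += 1
            # Emit samples up to the end of this bit so rounding never accumulates.
            end = round(bits_done * SAMPLE_RATE / BIT_RATE)
            while len(samples) < end:
                samples.append(math.sin(phase))
                phase += 2 * math.pi * freq / SAMPLE_RATE
    return samples


one_second = afsk_samples(bytes(65))  # roughly one second of audio
```

Keeping the phase continuous across bit boundaries avoids clicks when the tone switches. Slowing the bit rate, as suggested above, gives each bit more cycles and more tolerance for a lossy codec.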

(cont) Error correcting code is another interesting avenue to consider:

en.wikipedia.org/wiki/Error_correction_code
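
For a feel of what an error correcting code does, here is a small sketch of the classic Hamming(7,4) code: 4 data bits become 7 transmitted bits, and any single flipped bit can be located and fixed. It is a textbook example chosen for brevity, not something the library is known to use.

```python
def hamming74_encode(data_bits):
    """Encode 4 data bits as the 7-bit codeword [p1, p2, d1, p3, d2, d3, d4]."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4
    p2 = d1 ^ d3 ^ d4
    p3 = d2 ^ d3 ^ d4
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_decode(codeword):
    """Correct up to one flipped bit and return the 4 data bits."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    error_pos = s1 + 2 * s2 + 4 * s3  # 1-based position of the bad bit, 0 if none
    if error_pos:
        c[error_pos - 1] ^= 1
    return [c[2], c[4], c[5], c[6]]


word = hamming74_encode([1, 0, 1, 1])
word[4] ^= 1  # simulate one bit of corruption in transit
recovered = hamming74_decode(word)
```

The price is throughput: 7 bits are sent for every 4 bits of payload.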

After all, the entire point of this isn't about performance necessarily (although we'll maximize it as we can). This is mostly about exotic reliable alternatives to data storage.

god you're an insufferable cunt


if people want to post a video they can use an app which allows it.


and if the app doesn't allow it, you didn't NEED to post it in the first place.

It would benefit you to read the thread. This has been gone over at least three times now.

>The slightest bit of compression will completely break the entire thing.

No it won't, have you even tried it? The size of the blocks are big enough to be resistant to most compression, I've personally tried it. You can make the blocks bigger if you need more resistance.

>What do people have against electron?
Absurdly high RAM usage.

Example:

Etcher
>basically a GUI for dd and drive listing
>uses 300MB RAM while doing nothing
meanwhile
>using the terminal will use less than 50MB depending on your terminal emulator
>mintstick uses 60-70MB
>woeusb uses less than 50MB
>unetbootin uses less than 25MB

If you need 250+MB just for a fucking GUI then your software is trash. Qt is cross-platform and uses significantly less RAM. Electron is just a web browser wrapping your software, it's shit and only used by garbage devs.

>im seething because i've never made anything and this hurts my feelies
>mommy more tendies

>Steganography
This is not steganography, this is just an encoding. Steganography would require that the file in which the data is hidden appears to be a normal thing.

this is really cool even though I don't know what I'd use it for :3 good job friend

Perhaps I should've used a different word. Any changes in geometry will break the stream. It says that in its documentation. I've tested converting 1080p streams to 720p, and they still work. A lot of this test data is why certain default configs are the way they are (although you can still customize it however you want).

Thanks for explaining.

You're right. It was brought up because this is the benchmark a lot of people hold it to, when in fact it's functionally different.

Thank you. The entire thing is just a prototype/proof of concept. I learned a lot making it.

>Any changes in geometry will break the stream

Only if aspect ratios change. And they don't in 99% of cases.

(cont) If you look closely, the first frame of my demo video has an alternating color bar. This serves as a calibrator; an unsigned integer representing frame geometry is encoded in it.

Oh, very cool. Do you know if there's any kind of embedded ECC or hashes to ensure file integrity?

Attached: calibrator.png (1032x444, 31K)

So how did you go about learning Python and good programming practice etc? Again, it's very impressive for a first project, and I'd love to see some of the resources you drew on to know what to do.

>I'm talking about compressing the bytes of whatever files/data prior to encoding, just so we're both on the same page. gzip, zlib, some library like that
what surprises me is that you went to all that effort making an encoder/decoder and compression was the last thing you thought of?

>Until YouTube decides to nuke all such videos.
which would be incredibly easy for youtube's content matching algorithms to detect. i'm not sure if anyone would bother considering using such methods to hide data as the video size is inherently much larger than the encoded data. identifying files that have encoded data using this method is also staggeringly easy.
>Steganography
that isn't steganography, /g/enius. we can clearly see your data. with steganography it is the opposite: the data is hidden in plain sight.

Sorry, I'm confused by what you mean. For my library, compression was actually one of the first things that was coded. It's enabled in all streams by default. I don't see any reason to not have it enabled, but nonetheless I still give people that choice.

github.com/MarkMichon1/BitGlitter#write-converting-files-into-bitglitter-streams

I didn't go down this route because I didn't have time and I don't want to turn this into a black-hat product, but there are ways to make this a lot harder to detect. Throughput will suffer from all of them.

Refer to , 3rd response.

I started with Udemy courses that ranged from mediocre to terrible. That got me to the point of at least being able to write code, and understand the fundamentals of loops, objects, modules, etc. From there, I got two books, "The Illustrated Guide to Python 3" and "Python Programming" by John Zelle. Reading those back to front helped. I watched a lot of programming talks from dev conferences, a lot of stuff from Bob Martin, SOLID principle stuff. There were 2-3 courses from MIT OpenCourseWare I went through that helped a lot.

Even though it seems outside of the scope of my project, learning about safety-critical systems, aircraft-control systems, software running on hospital machines, and other must-work systems benefited me quite a bit. It instills a sort of mindset in how you approach problems. What I've learned affected how I coded things, error messages, how things fail, and more. There is an excellent programming talk on Youtube about how rigorously aircraft control systems are built, discussing warnings, how to alert the pilot, how things should fail, etc. I'll try to find it.

(Cont due to character limit)

Jow Forums thought my links were spam, so these are the courses online I watched:

MIT 6.0001 Introduction to Computer Science and Programming
MIT 6.002 Circuits and Electronics
MIT 16.885J Aircraft Systems Engineering

Corey Schafer and Sentdex are two great Youtube channels that helped a lot as well. I think I'm missing some things, but that's about 90% of it. I learned as much as I could on programming concepts itself, as well as application in various fields.

>Oh, very cool. Do you know if there's any kind of embedded ECC or hashes to ensure file integrity?

I don't know, but if you're embedding files in videos you'll take care of that with embedded winrar recovery records anyways.

if this works with Google Photos you can literally use unlimited storage and have a translator on your pc to decipher the file

Pycon talks as well were pretty good. Don't watch them all (or even most), but just what you think you need.

github.com/hellerve/programming-talks#on-theory
Here's a great link I used.

Last talk, this was the aircraft one I was talking about earlier. I loved this one.

youtube.com/watch?v=we4G_X91e5w

So yeah, I think that's most of it. It may seem like a lot, and that's because it is lol. But you learn a little bit here, a little bit there, and I could see myself becoming increasingly better and faster at developing.

My library prevents that from happening at the frame level, and it will alert you of this through the terminal. This way the end user will know from the beginning that there was a problem, rather than after reading and decoding all of the other frames. All other valid frames will still get read after it. Let's pretend you watched a 468 frame video and 3 random frames were corrupted. The reader, if you give it a better quality version of the video, will 'fast-forward' to the frames it needs without having to read everything all over again.
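
The 468 frame example can be sketched as simple bookkeeping: remember which frame numbers passed verification, then on a second pass touch only the ones still missing. The function names, the zero-based frame numbering, and the `is_valid` callback are assumptions for illustration, not the library's actual interface.

```python
def frames_still_needed(total_frames, accepted):
    """Frame numbers (0-based) that have not been read successfully yet."""
    return sorted(set(range(total_frames)) - set(accepted))


def reread_missing(better_copy, needed, is_valid):
    """Read only the frames listed in `needed` from a better quality copy.

    `better_copy` maps frame number to frame data; `is_valid` is whatever
    per-frame check the reader uses (for example a hash comparison).
    """
    recovered = {}
    for number in needed:
        frame = better_copy[number]
        if is_valid(frame):
            recovered[number] = frame
    return recovered


# First pass: 468 frames, 3 of them failed verification.
accepted = [n for n in range(468) if n not in (17, 203, 455)]
needed = frames_still_needed(468, accepted)

# Second pass on a cleaner copy only visits those 3 frames.
better_copy = {n: b"frame data" for n in range(468)}
recovered = reread_missing(better_copy, needed, lambda frame: frame == b"frame data")
```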

These are two people.
And here's you:
>my first project! I better post it in 4channel to get some attention! To compensate lack of affection from my mommy!
Not everyone is attention whores like you.

Attached: 1549208878750s.jpg (250x250, 8K)

Hey that's cool.

Thanks!

Attached: 1558716666042.jpg (1200x911, 128K)

Yeah this all seemed very QR code -ish to me

Have a bump

Will monitor thread

>FBI: the thread

Attached: 1535376687843.png (190x300, 28K)

>I made a Python library that allows you to embed files into pictures and video that is resistant to corruption and compression.
wew, that's a doozy

New tripcode, on a new machine. Verifying with another picture.

Thanks user.

It was an awesome experience, that's for sure. Check out the code if you can understand Python. I spent a while at the end making sure it's easy to understand.

Attached: lockon.png (1611x775, 38K)

^^^

why'd you change your github from MarkMichon7 to MarkMichon1?
who's @designerkyra and why'd she(he) give you a shoutout?

It's an ongoing matter I can't talk about yet for the first question, and that is family who helped with designing the logo. Hit her up if you want some good design services.

user you really sure you want to post your stuff on 4chinz? I guess it's too late now, but don't be surprised if you get doxxed

I'm just here sharing a project I've been working on. That requires being public. 99/100 people who have contacted me have been nothing but nice, but unfortunately it's just part of the territory to attract a small percentage of malevolent people as well. It is what it is.

bump

I'm still here if anyone has any questions/comments. I look at the tab every now and then.

Can you make a thing that converts the hexadecimal data of a file into an image where each pixel is colored with 3 bytes. So 0a 1b 2c (the three bytes) would be one pixel colored #0a1b2c in the image and so on. Is that what this is?

You pretty much exactly described how it works for 24 bit color:
github.com/MarkMichon1/BitGlitter/blob/staging/bitglitter/palettes/paletteutilities.py#L27

That's just one config of many that it can do, though. While you get the greatest densities this way, ANY compression or corruption will render the entire thing unreadable. That's why I play around with larger block sizes, and smaller color palettes. I already explained this somewhere in the thread, but I've done a lot of testing to see what configs can survive compression, size changes (1080p to 720p), and minor corruption of the file. The default config for write() gets 100% readability out of the box through all of these conditions. Also note that the RGB values on the video/image are the carrier of data itself, not the binary data of the file. This is essentially how it's resistant to changes, and why larger blocks and fewer colors (64 color, 6 bit default) are a necessity for "real world" applications where services compress and distort your multimedia.
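
The three-bytes-per-pixel mapping from the question is easy to sketch on its own. This is a standalone illustration with made-up function names, not the code at the link above; zero-padding the last pixel and passing the original length back in are choices made for this example.

```python
def bytes_to_pixels(data: bytes):
    """Group bytes into (r, g, b) tuples, zero-padding the last pixel if needed."""
    padded = data + b"\x00" * (-len(data) % 3)
    return [tuple(padded[i:i + 3]) for i in range(0, len(padded), 3)]


def pixels_to_bytes(pixels, length: int) -> bytes:
    """Flatten pixels back into bytes and drop the padding."""
    return bytes(channel for pixel in pixels for channel in pixel)[:length]


def to_hex_color(pixel) -> str:
    """Format a pixel the way the question wrote it, e.g. '#0a1b2c'."""
    return "#%02x%02x%02x" % pixel


pixels = bytes_to_pixels(bytes.fromhex("0a1b2c"))
color = to_hex_color(pixels[0])
```

Here the bytes 0a 1b 2c become the single pixel (10, 27, 44), which formats back to #0a1b2c. Decoding is exact only if every channel value survives untouched, which is why this density does not hold up under lossy compression.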

I was just bumping because it was on page 10 and it was a good thread

Try D instead for modernity

It's appreciated. I'm learning from this as well to see if my readme lacks proper explanation in some parts.

I didn't read all that bullshit, but looks neat

>not really matters
>I just want to watch the world burn
>WOW STOP POSTING YOUR CRINGE TOPICS HERE YOU'RE RUINING THE BOARD
>STUPID LIBERALS ARE KILLING SOCIETY WHY IS NO ONE TAKING THIS SERIOUSLY
Really makes you think if they think

based