Backup data

>backup data
>silent data corruption and bitrot kills your files over time

Why aren't you adding redundancy and checksumming your files regularly?

Attached: 1200px-Data_loss_of_image_file.jpg (1200x1500, 331K)


what kind of data could possibly be that important?

>bitrot
Yeah next thing you know rotational velocidensity is killing all your MP3s

Rare music, film and porn. Would be incredibly hard, if not impossible to find some of it again.

frankly, my 12TB archive of Bailey Jay videos isn't that important, and can easily be replaced.

the digital kind

This is baffling. The world runs on computers. You don't think there's any data used by banks, businesses, governments, artists, museums, etc where this would be a concern?

>copy files onto sd card
>insert into ice tray and freeze
>produce sd card popsicle
then all you have to do is melt it when needed

Attached: birth.png (225x225, 4K)

The Apple Macbook Pro with Retina Display doesn't have this problem.

Never thought about cooling flash memory to retain info?
How well does dropping 20C work?

Attached: 1522787426564.gif (295x218, 1.69M)

>Never thought about cooling flash memory to retain info?
how is that a question?

>condensation collects internally and completely rapes your sd card
brilliant

i use mdraid10+ext4 and tbqh i dont know what happens if a bit flips in one of the mirrors
im expecting the worst

That's not bitrot, that's digital dust. It's an important distinction. Maybe you will learn to use uncompressed extensions from now on.

No

running a raid1 btrfs with weekly scrub
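For anyone wondering what that looks like in practice, a weekly btrfs scrub is a one-liner in cron. Sketch below, with the mountpoint being an example path (needs root):

```shell
# /etc/cron.d/btrfs-scrub -- example config fragment
# every Sunday at 03:00: read every block, verify its checksum,
# and repair from the good mirror copy where possible (RAID1)
0 3 * * 0  root  /usr/bin/btrfs scrub start -B /mnt/storage
```

The -B keeps the scrub in the foreground so cron can report a non-zero exit if it fails.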

OP's image looks just like my pictures when my SSD was failing. There were no problems except random corrupted images. I never would've even known if I didn't run CRC scans.

What brand/model?

That's what you get for allowing your HDDs to spin faster than 5400rpm. Bits start jumping around and never come back to their initial position.

I have BtrFS RAID10 that autocorrects bit rot.

>Ctrl+F
>ECC
>0/0
stay poor

MD RAID 1 can detect inconsistencies but without checksums it can't tell which side of the mirror is correct. It picks a side at random and updates the other side to match it.

>It picks a side at random and updates the other side to match it.
no it doesn't, it has an option to do that, but it's stupid unless the upper layer can handle that
normally picking which one to keep is a manual operation, such inconsistencies are logged for an admin to handle
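for reference, this is all driven through sysfs. a sketch, assuming an array at /dev/md0 (needs root):

```shell
# compare both halves of the mirror and log any mismatches
echo check > /sys/block/md0/md/sync_action

# non-zero here means the halves disagree somewhere
cat /sys/block/md0/md/mismatch_cnt

# 'repair' rewrites inconsistent blocks, but md has no checksums,
# so it just copies one half over the other -- it can't know which is good
echo repair > /sys/block/md0/md/sync_action
```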

well what's stupid is that this needs manual attention at all. The "layering violations" of ZFS and Btrfs are what enable them to take care of this shit for you, on every read. The filesystem and the RAID layer have to be in cahoots for that to happen.

>porn
i don't get this meme. Why would people ever save porn? It's just a waste of hdd space.
Just stream it, you dumb niggers.

some of us actually have preferences and tastes beyond "whatever over-compressed 480p garbage happens to be on the front page of pornhub today"

yea, mdadm /could/ talk to btrfs to find out automatically which block is correct, but that would be a layering violation, as you said
they're not necessarily a bad thing. i personally would like to have the option
with btrfs > mdadm, there's enough information to automatically sort out which file(s) are affected and figure out which stripe is accurate; all it needs is some glue, some option in mdadm to let it ask btrfs which copy is good

>"preferences and tastes"

how long till the FBI knock down your door and arrest you for all that cp?

Get a gf then? I swear you won't see any pixels irl

my porn is almost entirely still pictures, with a number of text files, and just a handful of videos
my interests are too specific and uncommon to pass up downloading them

I don't know why the hell you'd put btrfs on top of mdadm instead of using the built-in RAID functions. That's the whole point, since btrfs (or ZFS) is doing both, and ignoring layering-violation concerns, the filesystem can tell the RAID system "Hey, this block you gave me has a bad checksum. Give me the same thing, but from the other side of the mirror/rebuilt from parity"

as much of a pain in the ass as storage management is, it beats dealing with women.

>I don't know why the hell you'd put btrfs on top of mdadm instead of using the built-in RAID functions.
because btrfs raid5/6 is... well, it's hard to trust it

a.) a lot of fixes went into recent kernels. off the top of my head I know that 4.12 and 4.16 improved a lot in the btrfs RAID56 world. Btrfs RAID1 has been fine for quite a while.
b.) If you still aren't sure, that's fine, there's ZFS, which has an excellent stability record and, like Btrfs, gets you the checksumming and self-correcting advantages that MD RAID lacks.

i tried btrfs raid5 back when they first said it was all good, and it turned out to not be all that good
i used zfs before, but switched to btrfs because it's way more flexible and integrates better with linux (compared to last i used it)
at this point i'm just holding out for bcachefs to be mainlined (for the love of fuck please don't let that be another btrfs)

I get the strong impression that the bcache people don't really care about checksumming and bit-rot protection; they care about performance and tiering. (same with that Stratis thing Red Hat was developing a while back)

>checksumming your files regularly
What's a good way to do this?
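
One simple option that doesn't need a fancy filesystem: keep a sha256 manifest next to the data and re-verify it on a schedule. This only gets you detection, not repair. A minimal sketch using coreutils, against a throwaway directory standing in for your real data:

```shell
set -e
dir=$(mktemp -d)                  # stand-in for your real data directory
echo "irreplaceable bits" > "$dir/song.flac"

# 1. build the manifest once (the manifest excludes itself)
( cd "$dir" && find . -type f ! -name MANIFEST.sha256 -exec sha256sum {} + > MANIFEST.sha256 )

# 2. re-run this from cron; a non-zero exit means something changed or rotted
( cd "$dir" && sha256sum -c --quiet MANIFEST.sha256 ) && echo "all files OK"
```

Pair it with par2 or a second copy if you also want to repair whatever the manifest flags.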

it already has checksumming implemented, scrubbing is planned
they do care about performance and tiering of course, since that's what bcache was all about, and it's built on top of that

>not uploading your backups to Google Drive and letting the botnet do it for you

Have you ever thought "Wow that scene was great, I might use that again" and remember the actors/actresses' name? Then several months later you try to find it again to discover:
* You don't remember the name(s)
* You don't remember the producer
* You don't remember which site(s) it was on
* The scene was taken down

For me, that's a big deal. I just save everything I like. That one scene from 5 years ago might suddenly pop into my head, and if I can't find it again I'm going to have a wicked horn going on without a satisfying nut bust.

They are, you're just using the wrong file systems it seems.

What's the best way to archive rare music long term without worrying about errors? I have about 70GB of music, and a fair chunk of it is irreplaceable. A while back I was listening and noticed skips and errors in some files that were never there before, yet somehow that drive still tested good. Luckily I had a backup on another drive, but since then I've been terrified of data corruption. I now keep all my music on 2 separate drives in my pc plus another backup on my nas, but I'm looking for something more long term.

nice copypasta, loser

My filesystem does that. Both NTFS and APFS have protection against this.

>as much of a pain in the ass as storage management is, it beats dealing with women.
I'm 150% sure you are a virgin.

>* The scene was taken down
I'm mostly worried about this one. Now I youtube-dl lots of stuff even from youtube itself. Hate it when I scroll my liked videos and stuff was deleted or made private

Parchive (parity archive)
github.com/Parchive/par2cmdline
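Typical par2cmdline usage looks roughly like this (a sketch, assuming par2 is installed; -r sets the redundancy percentage, check your build's help output):

```shell
# create recovery data with ~10% redundancy alongside the files
par2 create -r10 music.par2 album/*.flac

# later: verify, and repair if anything rotted
par2 verify music.par2
par2 repair music.par2
```

More redundancy means bigger .par2 files but more damage survivable.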

>text files
Do you fap to fiction or have an ASCII art fetish

because basically none of my data is important

Attached: [HorribleSubs] Asobi Asobase - 05 [1080p].mkv_snapshot_21.40.jpg (1920x1080, 65K)

OCZ Vertex or Vector, I can't remember which one. It was the highest end one.

Size fetish porn

You're using shitty backup software.

>bitrot
Are you the same dude from the NTFS thread?

"Apple File System uses checksums to ensure data integrity for metadata, but not user data."
en.wikipedia.org/wiki/Apple_File_System#Data_integrity

"A file system journal is used to guarantee the integrity of the file system metadata but not individual files' content."
en.wikipedia.org/wiki/NTFS#Internals

i have lots of photos of friends from back when people uploaded their photos to facebook, as well as instagram posts, stories, etc... that i download occasionally.

it's like 6 or 7GB of fapping material that's extremely personally relevant. can't post it anywhere, can't get it back if it's lost.

>Now... this is fapping

>silent data corruption
Redpill me on this.

I do regular incremental backups that apparently are useless because silent data corruption will be propagated to the backup.
What can I do to combat it?

I have warez burned on CD-Rs from around 97-99 and they still read without any problem. just back up really important files regularly and use dvdisaster to create a recovery record, and your important data will be fine

Dump

family photos? graduation videos? life savings?
use your brain

This guy gets it.

22+ year old photos / gifs. No issues after all this time. Have saved in several locations and different HD / SSDs.

Shrugs. I don't believe this stuff will ever "rot" unless you simply never bother to use the files again and don't keep at least one extra backup.

>Why aren't you adding redundancy and checksumming your files regularly?
I don't need to do that because ZFS does it for me.
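To be fair, ZFS only verifies checksums when data is read or scrubbed, so you still want scheduled scrubs. A cron sketch, with "tank" as an example pool name:

```shell
# /etc/cron.d/zfs-scrub -- example config fragment
# monthly: read everything, verify checksums, self-heal from redundancy
0 2 1 * *  root  /usr/sbin/zpool scrub tank
```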

But I am user

pool: zpool0
state: ONLINE
scan: scrub repaired 0B in 2h20m with 0 errors on Sun Sep 9 02:44:05 2018
config:

NAME STATE READ WRITE CKSUM
zpool0 ONLINE 0 0 0
raidz2-0 ONLINE 0 0 0
wwn-0x50014ee26562f217 ONLINE 0 0 0
wwn-0x50014ee2100db7a2 ONLINE 0 0 0
wwn-0x50014ee2bab8184c ONLINE 0 0 0
wwn-0x50014ee2100dbd59 ONLINE 0 0 0
wwn-0x50014ee2656306d0 ONLINE 0 0 0
wwn-0x50014ee26563100f ONLINE 0 0 0

errors: No known data errors

Ive always been worried about this after I had some corruption in some files a while back

whats the best solution?
btrfs?
zfs?

I remember hearing bad things about both

an error correcting file system is important (still unsure which is best), and multiple backups in case any of them get corrupted and are unrecoverable

Ive always been worried that if one of the incremental backups is fucked, then you can't restore any backups made after it, depending on how the deltas are done

you're just lucky, some cd roms are unreadable in only 5 years, and heat can destroy them easily

>I don't know why the hell you'd put btrfs on top of mdadm instead of using the built-in RAID functions.

btrfs has had some *terrible* data corruption bugs in its RAID5/6 implementation. They've been fixed, but only in like the last 4 kernel releases. The write-hole still exists.

Use ZFSonLinux.
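A RAIDZ2 pool like the ones pasted in this thread is quick to set up. A sketch with placeholder device names (use /dev/disk/by-id paths, not /dev/sdX, so the pool survives device reordering; needs root):

```shell
# 6-disk double-parity pool: any two drives can die without data loss
zpool create tank raidz2 \
    /dev/disk/by-id/wwn-DISK1 /dev/disk/by-id/wwn-DISK2 \
    /dev/disk/by-id/wwn-DISK3 /dev/disk/by-id/wwn-DISK4 \
    /dev/disk/by-id/wwn-DISK5 /dev/disk/by-id/wwn-DISK6

# common tuning for a media pool
zfs set compression=lz4 tank
zfs set atime=off tank
```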

He's not, you moron. He's talking about silent data corruption on disk, plus bad sectors that weren't reallocated in time to prevent corruption, and then there's unrecoverable read errors on top of that.

It doesn't matter what backup tool you use if you can't trust any of your data to be correct.

give me a super simple, minimalist program to do that in the background and i'll start right away

You can also use SnapRAID + some sort of drive pooling software as a good-but-not-zfs-level solution. It has some advantages in flexibility.
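For reference, a minimal snapraid.conf looks something like this (all paths are examples); you run snapraid sync after changing files and snapraid scrub periodically:

```shell
# /etc/snapraid.conf -- example config fragment
parity  /mnt/parity1/snapraid.parity     # dedicated parity disk
content /var/snapraid/snapraid.content   # checksum/file lists
content /mnt/disk1/snapraid.content     # keep extra copies of the content file
data d1 /mnt/disk1/
data d2 /mnt/disk2/
exclude *.tmp
```

Unlike ZFS, the data disks stay plain filesystems you can read anywhere, which is where the flexibility comes from.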

>an error correcting file system is important
How good is NTFS on this front?

>multiple backups
This won't be of much help since the corruption is silent and it will be propagated to ALL the backups. This is my major worry.

What can we do about this?

BON is fully encrypted with the keyfile located on rpool and multipathed. Quite nice.
pool: BON
state: ONLINE
scan: scrub repaired 0B in 11h18m with 0 errors on Sun Sep 9 11:42:19 2018
config:

NAME STATE READ WRITE CKSUM
BON ONLINE 0 0 0
raidz1-0 ONLINE 0 0 0
vdev1disk1 ONLINE 0 0 0
vdev1disk2 ONLINE 0 0 0
vdev1disk3 ONLINE 0 0 0
raidz1-1 ONLINE 0 0 0
vdev2disk1 ONLINE 0 0 0
vdev2disk2 ONLINE 0 0 0
vdev2disk3 ONLINE 0 0 0
raidz1-2 ONLINE 0 0 0
vdev3disk1 ONLINE 0 0 0
vdev3disk2 ONLINE 0 0 0
vdev3disk3 ONLINE 0 0 0

errors: No known data errors

pool: LEON
state: ONLINE
scan: scrub repaired 0B in 0h6m with 0 errors on Sun Sep 9 00:31:02 2018
config:

NAME STATE READ WRITE CKSUM
LEON ONLINE 0 0 0
mirror-0 ONLINE 0 0 0
mpathc ONLINE 0 0 0
mpathd ONLINE 0 0 0
mpathl ONLINE 0 0 0

errors: No known data errors

pool: rpool
state: ONLINE
scan: scrub repaired 0B in 2h56m with 0 errors on Sun Sep 9 03:20:31 2018
config:

NAME STATE READ WRITE CKSUM
rpool ONLINE 0 0 0
mirror-0 ONLINE 0 0 0
wwn-0x5000c50040a7e177 ONLINE 0 0 0
wwn-0x5000c50056b1b943 ONLINE 0 0 0

>How good is NTFS on this front?
It is so good I lost 400GB to a silently bit-flipping NTFS volume.

bcachefs

just wait(tm)

zfs fans, please comment on these

jodybruchon.com/2017/03/07/zfs-wont-save-you-fancy-filesystem-fanatics-need-to-get-a-clue-about-bit-rot-and-raid-5/

louwrentius.com/the-hidden-cost-of-using-zfs-for-your-home-nas.html

blog.fosketts.net/2017/07/10/zfs-best-filesystem-now/

snapraid

>How good is NTFS on this front?
It isn't one. On Windows your best bet is SnapRAID. On Linux or FreeBSD, use ZFS or possibly btrfs (wouldn't recommend btrfs honestly, it has yet to stabilise and prove itself, and the code is a clusterfuck).

Noice. I was a bit too much of a pussy to encrypt my ZFS volume. What sizes are your disks in BON?

>SnapRAID

How can SnapRAID help with silent corruption? If a software bug writes something to a place on the disk it shouldn't, how does SnapRAID know it was unintentional?

will look into it

will look into it

the multiple backups are so you can still recover if one of them gets corrupted/destroyed; a corrupted backup can stop all restores from working, which is as disastrous as not having any backups at all

they're not there to prevent bitrot in your base files, which would need a self-correcting filesystem

Put like this, the only way my data can be destroyed is if my house burns. Servers + Network gear all ran from UPS.
Primary Server - ReFS File System
Core Server with everything - ZFS
Backup Archive server - ZFS
Backup NAS with everything (Shutdown)

8 drives all have to go tits up at once before everything is lost. I ain't worried about shit.
My media files will outlive me and be 100% intact as the day I dumped them on the server's drives. Eventually I'll build a server with two Raid Z3 arrays and mirror them. 6 drives would then all have to kick before things got messy.

your disks protect you from "bitrot" already with CRC regardless of your filesystem, if a sector fails and can't be saved you'll be alerted. Just keep a backup.

4TB.
Every single one has bad sectors, so I've set copies=2 on the music dataset just in case.
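The copies=2 trick is per-dataset, so only the music pays the space cost. Sketch, with the dataset name as an example:

```shell
# store two copies of every block in this dataset, even on a single disk;
# halves usable space for the dataset but lets ZFS self-heal bad sectors
zfs set copies=2 BON/music
zfs get copies BON/music
```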

who cares
the only things you should really preserve are documents which might be required for job, retirement etc.
everything else is disposable and aside from historical preservation of historically-relevant material, entirely useless
your 100TB archive of music and movies isn't going to matter to anybody but other data hoarders, worthwhile movies, music and other art pieces have already been archived or at least well documented
you're going to die one day and leave all that data behind, just learn to live without it

sure, it sucks not being able to find that one movie or game or porn after a few years, but it also sucks not being able to go back in time and relive your past. data hoarders despair super hard over digital shit they lost, yet somehow feel no despair over lost time, which is arguably way more important: every year that passes puts you closer to death; losing that uncompressed FLAC collection of the world's rarest vinyl records doesn't

data hoarding is a mental disorder just like hoarding of physical objects

>If a software bug writes something to a place on the disk it shouldn't, how does SnapRAID know it was unintentional?
No filesystem can know that the data it was handed to write was already borked. A good filesystem can't fix broken software.

ZFS also will think it was intentional when fed garbage data. We're talking about silent data corruption on-disk when stored for a long time, when a bad sector isn't caught in time or simply too many bits flip at once so the hardware ECC can't recover it.

>if a sector fails and can't be saved you'll be alerted. Just keep a backup.
It's almost as if a good filesystem or snapraid could help you with that.

Of course, keep a backup anyway, but it's nice to not worry about the data on your disks.

>jodybruchon.com/2017/03/07/zfs-wont-save-you-fancy-filesystem-fanatics-need-to-get-a-clue-about-bit-rot-and-raid-5/
This article is seriously a pile of shit. Checksumming undoubtedly helps the filesystem detect data corruption *regardless* of the cause, and to perfection. You then restore the data from parity (or in a mirror: known good data). It's far more reliable given that ECC is limited in terms of how many bits can flip before it can't correct your data and you're fucked, or it doesn't even notice.

Furthermore, this bit here:
>While it is a possible scenario, it is also very unlikely. A drive that has this many bit errors in close proximity is likely to be failing and the S.M.A.R.T. status should indicate a higher reallocated sectors count or even worse when this sort of failure is going on.
Yeah, sure, but if your drive is failing and it's extremely obvious in the SMART tables, then you probably lost data already that if you ran ZFS you'd still have. Holy shit, is this guy retarded?

1/?

The problem is that it doesn't point out -what- needs to be restored.
One bad sector dropping some mysterious file somewhere is no reason to restore your entire backup instead of plucking the damaged file from backup in like a minute.
Unless you want to reverse lookup sectors to files on a rw and replace them manually...

Attached: 1501774852878.jpg (530x530, 58K)

>worthwhile movies, music and other art pieces have already been archived or at least well documented

who says you will have access to them?

how are you going to watch the original starwars trilogy when disney bans everything except the sjw editions?

if sjws had their way they'd ban everything made before 2008

Uh, which would you rather do: re-rip/download all your movies after some kind of failure, which would take several months if you've got a lot, or simply restore from a backup, which takes maybe a day?
I'd take the day over several months anytime.

Attached: Untitled-1.jpg (1440x900, 271K)

>your 100TB archive of music and movies isn't going to matter to anybody but other data hoarders
Who do you think I cater to?

Also stuff like IPFS can make for efficient distribution of these files to whoever wants to look them up and also share them.

>If the drive can’t recover and sends back the dreaded uncorrectable error (UNC) for the requested sector(s), the drive’s error detection has already done the job that the ZFS CRCs are supposed to do; namely, the damage was detected and reported.
Err, yeah, but then the ZFS checksum alerts the filesystem that it needs to recover from good data, either parity information or a mirrored disk. What the fuck is your point? The disk on its own has no way to recover that. It's gone, and on ZFS it isn't, so you just buy a new drive and resilver.

>ZFS with CRC checking will detect the damage despite the drive failing to do so and the damage can be handled by the OS appropriately…but what has this gained us? Unless you’re using specific kinds of RAID with ZFS or have an external backup you can restore from, it won’t save your data
Yes, it fucking will. Who in fuck's name uses ZFS with fully striped vdevs and no redundancy? Let's see, where does ZFS save your data?

Full stripe/RAID0: Nope
Mirrored/RAID1: Check
RAIDz1/2/3: Check
Striped and mirrored/RAID10: Check

Oh, would you look at that. There's ONE way to configure your pool that fucks you over. Maybe this is why people don't use RAID0 even with traditional RAID, when data integrity is important. Maybe it's why people DON'T MAKE FUCKING RAID0 ZFS POOLS.

What a fucking stupid argument. Whoever made a striped zpool knows they're at risk of losing data and isn't going to come with claims of data integrity, but no-one does this. Everyone uses mirrors or RAIDzX.

2/3

are there any file systems designed for quick incremental backups, by marking which directories have had their contents modified since the last backup?

it could be done by comparing timestamps but that's not always accurate

I did not have any "silent data corruption" even when my HDD was literally scraping its platters. Files were either lost entirely or not at all.

OP confirmed to use an 8 disk hardware RAID0

Attached: NEVER EVER.gif (800x430, 564K)

ZFS and BTRFS snapshots.
With how CoW works it allows you to temporally read only old data and start streaming only the changes from there.
These snapshots can be transfered to other disks or just pumped into a lone file for later retrieval.
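Concretely, an incremental btrfs send/receive looks like this (subvolume and mount paths are examples; ZFS is analogous with zfs snapshot and zfs send -i):

```shell
# take a read-only snapshot of the live data
btrfs subvolume snapshot -r /data /data/.snapshots/today

# send only the delta against the previous snapshot to the backup disk
btrfs send -p /data/.snapshots/yesterday /data/.snapshots/today \
    | btrfs receive /mnt/backup
```

Because CoW tracks changed blocks itself, no timestamp comparison is involved.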

>If your drive’s on-board controller hardware, your data cable, your power supply, your chipset with your hard drive interface inside, your RAM’s physical slot connection, or any other piece of the hardware chain that goes from the physical platters to the CPU have some sort of problem, your data will be damaged.
Uhh, yeah? No-one ever said RAID or ZFS is a complete replacement for off-site backups, but it undeniably protects you from DISK FAILURES. A filesystem will NEVER protect you if your power supply decides to fuck up your entire machine, holy shit, this isn't even an argument against ZFS because it applies to literally any situation of having a hard drive and a PSU in the same computer.

>By now, you’re probably realizing something about the data CRC gimmick: it doesn’t hold much value for data integrity and it’s only useful for detecting damage, not correcting it and recovering good data.
By now, *you* should realise that the checksums are there to alert the filesystem so it can make use of its data recovery features, you dumb fucking asshole. This is just an autistic rant which specifically attacks the imagined boogeyman using a striped vdevs (RAID0) zpool, and doesn't actually say anything that people don't know already. The whole *point* of the checksumming is that it can restore data from known good data on e.g. a mirror or calculate it from parity information.

Fucking no-one ever said that you shouldn't make backups if you care about your data. This whole blog post is just an imagined strawman.

3/3

Also, Netflix (and others) can pull content anytime. Torrents/direct links can all die as well, leaving you fucked when you want something. Nope, better to keep it all local so you don't gotta worry about shit like that.

thanks for that

based and redpilled

>who says you will have access to them?
>Uh, which would you rather do, re-rip/download all your movies after a failure of some type which would take several months
did you two faggots miss the point of my post
it's all disposable
if you can't access it then move the fuck on. it's just a bunch of audiovisual data; not being able to rewatch a movie or listen to a song again isn't going to make your life any worse, and it only makes the nostalgia sweeter
you'll die one day and whether or not you had that original star wars trilogy (which is vastly overrated anyway) saved among a million TBs of data that lay untouched for decades isn't going to fucking matter

I wasn't calling you a dumb asshole btw, I was angry at the blog writer for writing that stupid crap in the first place. Yeah, don't configure your ZFS pool to striped vdevs, and do mirrors/RAIDz instead and literally none of the criticism in this article applies. ZFS will guard your data, but you still need to keep backups obviously, since who knows, maybe your house is an asteroid crater the next day.

Whoever wrote this decided to take some ridiculous strawman of people claiming data integrity after they had configured their ZFS pool to act like RAID0, at which point they didn't need to be told that they'd lose data if a disk fails because they already fucking knew when they made that decision.

t. uncultured swine

>I wasn't calling you a dumb asshole btw
I didn't write the blog post, so I never took it that way lol
just wanted to hear some opposition to it to see how much is valid

>The problem is
The problem is preventing silent bitrot, and that is handled by the disks themselves; you'll have plenty of time to replace a faulty disk _before_ you reach UNC sectors.
Moreover, it's definitely possible to look up which file a sector belongs to without a degree in engineering.
>It's almost as if a good filesystem or snapraid could help you with that.
If you're not using ECC ram you're wasting time on placebo. A "good filesystem" (I assume you mean ZFS or btrfs) is just feel-good wankery if you're choosing it to "prevent bitrot".
Hopefully you're taking countermeasures against rotational velocidensity as well.

>you're going to die one day and leave all that data behind
talk for yourself.

Oh by the way, in the response section:
>I believe ZFS checksums constitute a lot of additional computational effort to protect against a few very unlikely hardware errors once the built-in error checking and correction in most modern hardware is removed from the overall picture.
No, it's not unneeded and it's not extra computational load for no gain. ZFS is developed to *guarantee* that your data is intact. It's not there to guarantee in the 95% cases, it's there to ALWAYS guarantee that your data is correct.

Hardware ECC can fail, the implementation can be poor or too many bits flip at once so it doesn't notice. Maybe your hard drive is failing at a rapid pace and data is already corrupt before you manage to move it off the drive, because that's exactly the sort of situation where hardware ECC would be unreliable while your ZFS pool wouldn't be. ZFS in this case would absolutely guarantee that you can recover your data, through resilvering after replacing the drive. This is EXACTLY the point where a good filesystem setup is important. It's about being 100% correct, not 90% correct.

Even if you back up every week, your backups may be incomplete, and your goal may be to lose NO data. That is very possible with any sort of checksumming setup that has redundant data.

Some people just toss any old hardware together to make a "server" and expect it to work. Well, yeah, it may work, but for how long, and how long till something fucks up and your data goes down the shit hole?
Using desktop-class drives in a server? Bad idea, imo. They're not made to run 24/7, so they won't last as long.
Using regular ram and not ECC in your FreeNAS build (when the docs plainly state ECC)? Well, when your data gets fucked up, don't blame ZFS. ZFS only protects data once it's on the drives; whatever sits in ram before that is prone to errors, which is why ECC is a must.

Not using a proper server case but packing it full of drives and expecting cool temps? Fat chance. Proper cooling (39C or lower) is a must if you want your drives to last. I could go on, but you get the point. Do research, people.

>We're talking about silent data corruption on-disk when stored for a long time, when a bad sector isn't caught in time or simply too many bits flip at once so the hardware ECC can't recover it.

But this kind of corruption is not a problem. Won't any modern filesystem detect that corruption occurred in this situation, making it non-silent and recoverable from ordinary backups?