
Thread: How do i know files are OK after recovery?

  1. #1
    Xtreme Member
    Join Date
    Nov 2007
    Posts
    120

    How do i know files are OK after recovery?

    Hello,

    Well yesterday i got a great New Year's present from Gigabyte & Samsung... they decided that 2 of my drives were suddenly not 931GB drives anymore, but 32MB drives

    Luckily only one of the drives had any data on it, and it was about 500GB of it. (about 80% i have backed up)

    So i finally managed to recover my drives using some tool called HDAT2 (a big thanx to the author, wherever he is)

    And then left the computer all night to save the files i found using "File Scavenger"

    It found a lot of my files (about 70%), and now my question is:

    Since there are a lot of .ISO files in there, and an occasional .avi and .mkd, how do i know the files are OK??

    I mean if i mount an .iso, and it mounts, is it safe to say that it's OK? Also the same goes for avi and mkd files. If i can read them in BSPlayer are they OK??

    Thanx, any help is greatly appreciated
    Tom,

  2. #2
    Xtreme Member
    Join Date
    Jan 2007
    Location
    Dorset, UK
    Posts
    439
    Basically, you don't. You'll have to trust that the recovery software did its job correctly. Without some external verification made from the original files (PAR2, MD5, CRC) you're just going to have to try them and cross your fingers.
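For anyone who does have a checksum made from the original file, the check itself is simple. This is only a minimal sketch: `file_md5` and `matches_known_digest` are made-up helper names, and it assumes the known-good MD5 is available as a hex string.

```python
import hashlib


def file_md5(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, reading it in chunks
    so that multi-gigabyte ISO or video files do not fill memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_known_digest(path, expected_hex):
    """True if the file's MD5 equals a digest recorded from the original."""
    return file_md5(path) == expected_hex.strip().lower()
```

Without a digest taken before the failure there is nothing to compare against, which is the point made above.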

    You could run GSpot over the video files and see if it shows anything odd - it will sometimes catch broken files that are a different length from what is reported in the header - but playing them and skipping through to check that they are consistent is about the only way to be sure. I don't know of a tool that could verify an ISO.
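One partial check that is possible for an ISO is the same idea as the header-length check for video: compare the file's real size with the size the image declares about itself. The sketch below assumes a plain ISO 9660 image whose primary volume descriptor is the first descriptor at sector 16 (the usual layout); `iso_size_check` is a made-up name, and UDF-only or other image formats will simply raise an error.

```python
import os
import struct

SECTOR = 2048
PVD_OFFSET = 16 * SECTOR  # ISO 9660 volume descriptors start at sector 16


def iso_size_check(path):
    """Return (declared_bytes, actual_bytes) for an ISO 9660 image.

    Raises ValueError if no primary volume descriptor is found at
    sector 16 (assumption: it is the first descriptor in the image).
    """
    actual = os.path.getsize(path)
    with open(path, "rb") as handle:
        handle.seek(PVD_OFFSET)
        pvd = handle.read(SECTOR)
    # Byte 0 is the descriptor type (1 = primary), bytes 1-5 are "CD001".
    if len(pvd) < 132 or pvd[0] != 1 or pvd[1:6] != b"CD001":
        raise ValueError("no ISO 9660 primary volume descriptor found")
    # Volume space size (in blocks) at offset 80, block size at offset 128;
    # both are stored little-endian first.
    block_count = struct.unpack_from("<I", pvd, 80)[0]
    block_size = struct.unpack_from("<H", pvd, 128)[0]
    return block_count * block_size, actual
```

A file shorter than its declared size has been truncated. A file that is the right length (or slightly longer because of padding) has only passed a length check; this says nothing about whether the contents are intact.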

  3. #3
    Xtreme Member
    Join Date
    Nov 2007
    Posts
    120
    I tried mounting a few .ISO files, and they seem to mount OK... would they still mount if they were faulty?

    I did lose a few ISOs a while back, but they didn't even want to mount... so i thought that was "the test"

  4. #4
    Xtreme Member
    Join Date
    Nov 2007
    Posts
    120
    Okay, i just remembered i did a WhereIsIt Catalogue a few weeks ago, is there any way i can use that to check the iso images??

  5. #5
    Xtreme Addict
    Join Date
    Jul 2006
    Posts
    1,124
    Depends on the level of checking you want. Images, movies, whatever can have damage or changed bits that will not be detected unless you do a complete bit check and then compare those files against a known-good copy or checksum. (What matters there is the digest size, not the file size: a 128-bit MD5 should miss a random change only about 1 time in 2^128, and you would need on the order of 2^64 different files before two shared a digest by chance; a 32-bit CRC32 misses about 1 in 2^32 and runs into chance matches at around 2^16 files, so it is the weaker check.) All that mounting an image will do is tell you whether the error is in the first couple of blocks. I am not familiar with WhereIsIt, but it sounds like a file system catalogue utility, which won't do any checking. Like IanB said above, without an external check you have no way of knowing.
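As a rough guide to how digest size relates to these checks, the arithmetic can be sketched as below. It treats the checksum as an ideal random function (an assumption; CRC32 in particular is not one), and both function names are made up for the example.

```python
import math


def undetected_change_probability(digest_bits):
    """Chance that a random corruption leaves an n-bit digest unchanged,
    assuming the digest behaves like an ideal random function."""
    return 2.0 ** -digest_bits


def files_for_even_collision_odds(digest_bits):
    """Approximate number of distinct files at which the chance that any
    two share a digest reaches 50% (the birthday bound)."""
    return math.sqrt(2 * math.log(2)) * 2.0 ** (digest_bits / 2)


for name, bits in (("CRC32", 32), ("MD5", 128)):
    print(name,
          undetected_change_probability(bits),
          files_for_even_collision_odds(bits))
```

Neither figure depends on how large an individual file is; the file size only affects how long the checksum takes to compute.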

    |.Server/Storage System.............|.Gaming/Work System..............................|.Sundry...... ............|
    |.Supermico X8DTH-6f................|.Asus Z9PE-D8 WS.................................|.HP LP3065 30"LCD Monitor.|
    |.(2) Xeon X5690....................|.2xE5-2643 v2....................................|.Mino lta magicolor 7450..|
    |.(192GB) Samsung PC10600 ECC.......|.2xEVGA nVidia GTX670 4GB........................|.Nikon coolscan 9000......|
    |.800W Redundant PSU................|.(8x8GB) Kingston DDR3-1600 ECC..................|.Quantum LTO-4HH..........|
    |.NEC Slimline DVD RW DL............|.Corsair AX1200..................................|........ .................|
    |.(..6) LSI 9200-8e HBAs............|.Lite-On iHBS112.................................|.Dell D820 Laptop.........|
    |.(..8) ST9300653SS (300GB) (RAID0).|.PA120.3, Apogee, MCW N&S bridge.................|...2.33Ghz; 8GB Ram;......|
    |.(112) ST2000DL003 (2TB) (RAIDZ2)..|.(1) Areca ARC1880ix-8 512MiB Cache..............|...DVDRW; 128GB SSD.......|
    |.(..2) ST9146803SS (146GB) (RAID-1)|.(8) Intel SSD 520 240GB (RAID6).................|...Ubuntu 12.04 64bit.....|
    |.Ubuntu 12.04 64bit Server.........|.Windows 7 x64 Pro...............................|............... ..........|

  6. #6
    Xtreme Member
    Join Date
    Nov 2007
    Posts
    120
    Okay... thanx for the info... I just looked around a bit, checked the video files with GSpot and it says they're OK... so that isn't totally lost.

    I also checked the ISO files and it looks like i only lost about 20GB of unbacked up data.
    Guess i'm gonna go RAID 1 now, but how can i be sure that my motherboard won't suddenly decide my drives are 32MB again? Does anyone know something about this? I checked google a bit but came up mostly with solutions.... not the actual cause or how to prevent it.

  7. #7
    Xtreme Member
    Join Date
    Nov 2007
    Posts
    120
    Oh and i forgot to ask... i know there are tools out there for making MD5 signatures of folders, so can anyone point me to some?

    I know there are probably threads about this, but if anyone can drop in a few links, or complete topics on this subject...

  8. #8
    Xtreme Addict
    Join Date
    Jul 2006
    Posts
    1,124
    Do a search for file integrity checkers on Google. I wrote a small one for Linux/Unix, posted here in the BER thread (http://www.xtremesystems.org/forums/...d.php?t=212417). afick is another one that I've looked at, but it's not multi-threaded (written in Perl). There are bunches of them, though. I would suggest MD5 or even MD4 if you're just doing it for file checking (i.e., not for detecting security tampering, which many of the tools are marketed towards; for security, look at SHA-256 or better). Anyway, no RAID level protects data; it only keeps things available when hardware (media) fails. If you want data protection you need to have backups.
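The basic idea behind such a checker can be sketched in a few lines. This is not the tool from the BER thread or afick, just a minimal single-threaded illustration; the function names and the "digest, two spaces, relative path" manifest layout are assumptions made for the example.

```python
import hashlib
import os


def file_md5(path, chunk_size=1024 * 1024):
    """Hex MD5 of one file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its MD5 digest."""
    manifest = {}
    for folder, _subdirs, files in os.walk(root):
        for name in files:
            full = os.path.join(folder, name)
            manifest[os.path.relpath(full, root)] = file_md5(full)
    return manifest


def save_manifest(manifest, manifest_path):
    """Write one 'digest  relative/path' line per file."""
    with open(manifest_path, "w", encoding="utf-8") as out:
        for rel, digest in sorted(manifest.items()):
            out.write(f"{digest}  {rel}\n")


def load_manifest(manifest_path):
    """Read a manifest written by save_manifest."""
    manifest = {}
    with open(manifest_path, encoding="utf-8") as src:
        for line in src:
            digest, rel = line.rstrip("\n").split("  ", 1)
            manifest[rel] = digest
    return manifest


def verify(root, manifest):
    """Return (missing, changed) lists of relative paths."""
    missing, changed = [], []
    for rel, digest in manifest.items():
        full = os.path.join(root, rel)
        if not os.path.isfile(full):
            missing.append(rel)
        elif file_md5(full) != digest:
            changed.append(rel)
    return sorted(missing), sorted(changed)
```

The manifest is only useful if it is created while the files are known to be good and is kept somewhere other than the drive it describes.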


  9. #9
    Xtreme Member
    Join Date
    Nov 2007
    Posts
    120
    Really? i mean about that RAID thing. Because i was under the impression that if i do RAID6 at home with say 12 drives i'll be pretty safe from any kind of data loss? (which was my kind of long term storage plan, that's why i mention it)

    So what you're saying is i need to have it all on DVDs or separate drives that are not plugged in the computer and only serve as backups?

  10. #10
    Xtreme CCIE
    Join Date
    Dec 2004
    Location
    Atlanta, GA
    Posts
    3,842
    Quote: Originally Posted by Lightning98
    Really? i mean about that RAID thing. Because i was under the impression that if i do RAID6 at home with say 12 drives i'll be pretty safe from any kind of data loss? (which was my kind of long term storage plan, that's why i mention it)

    So what you're saying is i need to have it all on DVDs or separate drives that are not plugged in the computer and only serve as backups?
    There's concern and there's paranoia. I'll let you decide where this falls, it depends on how important things are to you I guess.
    Dual CCIE (Route\Switch and Security) at your disposal. Have a Cisco-related or other network question? My PM box is always open.

    Xtreme Network:
    - Cisco 3560X-24P PoE Switch
    - Cisco ASA 5505 Firewall
    - Cisco 4402 Wireless LAN Controller
    - Cisco 3502i Access Point

  11. #11
    Xtreme Addict
    Join Date
    Jul 2006
    Posts
    1,124
    Nope, NO RAID level provides any protection against data loss. Their sole purpose is to protect against HARDWARE (namely drive) failure (and only certain types of that). Depending on how much risk you want to live with and how important the data is, there is a wide range of data backup options: from none, to doing multiple backups to different media and moving them to geographically distant locations (sometimes more than one). Through all of this there is also the data integrity checking I mentioned before, which lets you check that what you /think/ you're doing is actually happening.

    The idea of backing up to other drives that may not be on-line may help (it's better than NO backup), but it also rests on the assumption that the drive you're backing up to is good (that data stays good once written, etc.). This has two issues at the high level: 1) drives are NOT designed for many start/stop operations (250-500/year only), and the more you do, the more wear you put on the drive; 2) errors can creep into the data once stored (part of what we were talking about in the BER discussion(s) here): any media type is prone to errors, and the larger the size, the more errors will statistically show up.

    I would suggest at least /some/ type of backup and to check that backup (to make sure the data is good) on a regular basis if your data or your time to re-create the data is important. More in-depth backup strategies are better served once you have a better idea on the value of the data to you (and what you can afford to spend to protect that value).

    edit: As Serra says above, there is a wide range and the only person that can make that decision is you.


  12. #12
    Xtreme Mentor
    Join Date
    Sep 2006
    Posts
    3,246

  13. #13
    Xtreme Member
    Join Date
    Jan 2007
    Location
    Dorset, UK
    Posts
    439
    If what you have are mainly large flat folders with many items in (archival, that don't change often), rather than many nested folders with smaller amounts in, then a simple option for a reasonable level of data integrity is to use QuickPar to make a few repair blocks for the fileset in each folder. That way you'll get an MD5 integrity check for every file that you can test from time to time, and a way of repairing a couple of bit errors if you ever find any. One repair block can be as little as 1/32768 of the total data size, so you won't lose a whole lot of disk space adding some repair capability.

    The only catch with QuickPar (at least back when I was involved with it and writing my own version) is that it will not scan and verify whole folder trees, only single folders. This is not a limitation of the PAR2 standard, but was a deliberate limitation in QuickPar, which ignores folder information in the filenames and removes it if found. A long time ago I began my own full Assembler version that could do whole folder trees, but got sidetracked badly. It's still a useful idea and might have relevance to stevecs's BER discussion anyway, so I'll try and pick up the thread of that.

  14. #14
    Xtreme Addict
    Join Date
    Jul 2006
    Posts
    1,124
    I was not aware that QuickPar did not recurse directory trees; that's important for general use (i.e., not as a quick file backup scheme but as an on-line data integrity checking program). As for the repair block size being 1/32768, that is much too large to recover bit errors (multiple bits being in error), even more so w/ enterprise drives. Some of the spatial data we've seen so far shows that sectors/bits with errors are in close proximity to each other on the physical media, so a large block group will not be able to recover the block when multiple bits are in error.


  15. #15
    Xtreme Member
    Join Date
    Nov 2007
    Posts
    120
    Quote: Originally Posted by IanB
    The only catch with QuickPar (at least back when I was involved with it and writing my own version) is that it will not scan and verify whole folder trees, only single folders. This is not a limitation of the PAR2 standard, but was a deliberate limitation in QuickPar, which ignores folder information in the filenames and removes it if found. A long time ago I began my own full Assembler version that could do whole folder trees, but got sidetracked badly. It's still a useful idea and might have relevance to stevecs's BER discussion anyway, so I'll try and pick up the thread of that.
    I will look into it, and study the BER discussion a bit more... because that's exactly what i need. I have separate folders with files in them. No subfolders or anything. "root\folder\files" and a bunch of that.

    The recovery program did a great job of returning the entire folder structure, which was great. I also downloaded "FastSum" and did a quick MD5 check of the recovered files and the ones i had backed up. They were identical, so i guess the files should be OK.
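The same comparison of recovered files against backed-up copies can be sketched like this. It is not how FastSum works internally, just an illustration of the idea; `compare_trees` and the other names are made up, and it assumes both folders use the same relative layout.

```python
import hashlib
import os


def file_md5(path, chunk_size=1024 * 1024):
    """Hex MD5 of one file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def relative_files(root):
    """Set of file paths under root, relative to root."""
    found = set()
    for folder, _subdirs, files in os.walk(root):
        for name in files:
            found.add(os.path.relpath(os.path.join(folder, name), root))
    return found


def compare_trees(recovered_root, backup_root):
    """Return (identical, different, no_reference) lists of relative paths.

    no_reference holds recovered files with no backup copy to check against.
    """
    recovered = relative_files(recovered_root)
    backup = relative_files(backup_root)
    identical, different = [], []
    for rel in sorted(recovered & backup):
        same = (file_md5(os.path.join(recovered_root, rel))
                == file_md5(os.path.join(backup_root, rel)))
        (identical if same else different).append(rel)
    return identical, different, sorted(recovered - backup)
```

Files that land in the third list are the ones this method cannot say anything about, which is the roughly 20% that was never backed up.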

  16. #16
    Xtreme Member
    Join Date
    Jan 2007
    Location
    Dorset, UK
    Posts
    439
    Quote: Originally Posted by stevecs
    As for the repair block size being 1/32768, that is much too large to recover bit errors (multiple bits being in error), even more so w/ enterprise drives. Some of the spatial data we've seen so far shows that sectors/bits with errors are in close proximity to each other on the physical media, so a large block group will not be able to recover the block when multiple bits are in error.
    I replied in the BER thread before seeing this.

    The fileset you want to protect ("repair set") can be split virtually into blocks that are at smallest 1/32768 (~0.003%) of the total size of the set. This is a limitation in the maths: you can have 32768 blocks maximum. A repair block is the same size as the virtual blocks, and can be used to repair any number of bit errors within a single block that goes bad. You can repair a fileset where you don't have ANY data from a small area with a single block, as long as all the missing data falls within the virtual block boundaries set up when the repair set was PAR2ed. And if two blocks are missing/bad then you need two repair blocks, etc.

    So if bit errors are spatially aggregated due to local physical defects rather than occurring randomly through an entire fileset, then I'd argue that PAR2 is extremely efficient and you'll need few repair blocks to cover that sort of defect rate, compared to where the bit errors are spread over many virtual blocks simultaneously and therefore more repair blocks are needed, one per defective block.
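The block arithmetic described here can be illustrated with a simplified model. The sketch treats the whole repair set as one contiguous run of bytes and ignores details of the real format (such as how individual files line up with block boundaries), so take it as an assumption-laden illustration; the function names and the 500 GB example set are made up.

```python
MAX_BLOCKS = 32768  # maximum number of virtual blocks, as described above


def smallest_block_size(total_bytes, max_blocks=MAX_BLOCKS):
    """Smallest block size in bytes that covers the whole set
    without exceeding the block-count limit (integer ceiling)."""
    return max(1, (total_bytes + max_blocks - 1) // max_blocks)


def repair_blocks_needed(error_offsets, block_size):
    """Number of distinct virtual blocks touched by the damaged byte
    offsets; one repair block is needed per damaged virtual block."""
    return len({offset // block_size for offset in error_offsets})


total = 500 * 10**9                      # hypothetical 500 GB repair set
block = smallest_block_size(total)

# 100 errors clustered within about 100 KB of each other
clustered = [10 * block + 1000 + i * 1000 for i in range(100)]
# 100 errors spread evenly, one per virtual block
spread = [i * block for i in range(100)]

print(block)
print(repair_blocks_needed(clustered, block))
print(repair_blocks_needed(spread, block))
```

In this model the same number of bad bytes costs one repair block when clustered and a hundred when scattered, which is the efficiency argument being made.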
