2009-04-13:
Fixes in low level data recovery from CDROM and DVD code.
Raw sector reading is not possible on DVDs, but DVDs had been incorrectly detected as CDs, leading to failures on low level access.
DVDs are now detected successfully and offer a drive reset as the only low level operation.
The type of a CD (Audio, Mode1, Mode2, etc.) is now successfully detected, and sector alignments are calculated correctly. Since it is not possible to calculate sector addressing on mixed-mode CDs with reasonable overhead (the sector size can change with each track), safecopy no longer attempts raw reads from these. A device reset will still be attempted if necessary.
Device resets will now only be issued in case of previous read errors.
The "twaddle" system call is now supported on floppy drives to potentially aid in data resurrection.
Safecopy now has a manpage, which includes basically all the information of the README file in a more standardized format.
Project by Corvus Corax (corvuscorax<at>cybertrench.com) - distributed under the GPL (v2 or higher)
Safe copying of files and partitions.

Idea:

A main problem with damaged storage hardware is that once you get an unrecoverable IO error, further reading from the file / device often fails until the file has been closed and re-opened. The normal copy tools like cat, cp or dd do not allow creation of an image file from a disk or CD-ROM once reading of a sector has failed.

Safecopy tries to get as much data from the source as possible, even resorting to device specific low level operations if applicable. This is achieved by identifying problematic or damaged areas, skipping over them and continuing reading afterwards. The corresponding area in the destination file is either skipped (on initial creation that means padded with zeros) or deliberately filled with a recognizable pattern to later find affected files on a corrupted device.

Safecopy uses an incremental algorithm to identify the exact beginning and end of bad areas, allowing the user to trade minimum accesses to bad areas for thorough data resurrection. Multiple passes over the same file are possible, to first retrieve as much data from a device as possible with minimum harm, and then to try to retrieve some of the remaining data with increasingly aggressive read attempts.

For this to work, the source device or file has to be seekable. For unseekable devices (like tapes) you can try to use an external script to execute a controlled skip over the damaged part for you. (For example by using "mt seek" and "mt tell" on a SCSI tape device.) See the "-S <seekscript>" parameter for details.

Performance and success of this tool depend extremely on the device driver, firmware and underlying hardware.

Currently safecopy supports RAW access to CDROM drives to read data directly off a CD, bypassing some driver dependent error correction. This can speed up data retrieval from CDs and reduce system load during recovery, as well as sometimes increase the success rate.
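
The skip-and-continue idea described above can be illustrated with a small Python sketch. This is not safecopy's actual implementation (safecopy is not written in Python, and it additionally narrows down the edges of a bad area and retries reads); it only shows the basic loop: read block by block, and on a read error fill the destination and jump ahead instead of aborting. The names FlakySource and rescue_copy and all parameters are hypothetical, chosen for this example only.

```python
import io


class FlakySource(io.BytesIO):
    """In-memory stand-in for a damaged device (hypothetical test helper).

    Any read that touches one of the bad byte ranges raises OSError,
    which is how a failed read on a real device surfaces in Python.
    """

    def __init__(self, data, bad_ranges):
        super().__init__(data)
        self.bad_ranges = bad_ranges  # list of (start, end), end exclusive

    def read(self, size=-1):
        start = self.tell()
        if size is None or size < 0:
            end = len(self.getvalue())
        else:
            end = start + size
        for lo, hi in self.bad_ranges:
            if start < hi and lo < end:
                raise OSError("simulated unrecoverable read error")
        return super().read(size)


def rescue_copy(src, dst, total_size, block_size=4096, fault_skip=65536,
                marker=None):
    """Copy src to dst, skipping unreadable areas instead of aborting.

    Returns a list of (offset, length) byte ranges that could not be read.
    Those ranges are zero-padded in dst, or filled with `marker` if given.
    """
    bad_areas = []
    pos = 0
    while pos < total_size:
        try:
            src.seek(pos)  # always re-seek: do not trust the position after an error
            chunk = src.read(min(block_size, total_size - pos))
        except OSError:
            skip = min(fault_skip, total_size - pos)
            if marker:
                filler = (marker * (skip // len(marker) + 1))[:skip]
            else:
                filler = b"\x00" * skip
            dst.seek(pos)
            dst.write(filler)
            bad_areas.append((pos, skip))
            pos += skip
            continue
        if not chunk:
            break  # source ended earlier than expected
        dst.seek(pos)
        dst.write(chunk)
        pos += len(chunk)
    return bad_areas


if __name__ == "__main__":
    source = FlakySource(bytes(range(32)), bad_ranges=[(10, 13)])
    target = io.BytesIO()
    print(rescue_copy(source, target, 32, block_size=4, fault_skip=4))
```

A larger fault_skip touches the damaged area less often but may jump over readable data between two bad spots, which is the trade-off the -f option exposes.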
Safecopy uses the disc status syscall to determine sector size and addressing of CDs. This fails on mixed-mode or multi-session CDs, since the sector layout can change within the disk, but still works on the big majority of disks. Other disks can still be recovered using normal high level data access. Safecopy auto-detects the disk type involved when first attempting low level access.

Some CD/DVD drives are known to cause the ATAPI bus to crash on errors, causing the device driver to freeze for times up to and beyond a minute per error. Try to avoid using such drives for media recovery. Using safecopy's low level access features might help under some circumstances.

Some drives can read bad media better than others. Be sure to attempt data recovery of CDs and DVDs on several different drives and computers. You can use safecopy's incremental recovery feature to read previously unreadable sectors only.

Different use cases:

How do I...

- resurrect a file from a mounted but damaged media, that copy will fail on:
    safecopy /path/to/problemfile ~/saved-file

- create a filesystem image of a damaged disk/cdrom:
    safecopy /dev/device ~/diskimage

- resurrect data as thoroughly as possible?
  (assuming a physical block size of 512 bytes)
    safecopy source dest -b 512 -f 512 -r 512 -R 8 -Z 2
  (assuming logical misalignment of blocks to sectors)
    safecopy source dest -b 512 -f 512 -r 1 -R 8 -Z 2

- resurrect data as fast as possible, or
- resurrect data with low risk of damaging the media further:
  (you can use even higher values for -f and -r)
    safecopy source dest -f 65536 -r 16384 -R 0 -Z 0

- resurrect some data fast, then read more data thoroughly later:
  (assuming a physical sector size of 512 bytes)
    safecopy source dest -b 512 -f 65536 -r 16384 -R 0 -Z 0 -o badblockfile
    safecopy source dest -b 512 -f 512 -r 512 -R 8 -Z 2 -I badblockfile

- utilize some friends' CD-ROM drives to complete the data from my damaged CD:
    safecopy /dev/mydrive imagefile <someoptions> -b <myblocksize> \
        -o myblockfile;
    safecopy /dev/otherdrive imagefile <someoptions> -b <otherblocksize> \
        -I myblockfile -i <myblocksize> -o otherblockfile;
    safecopy /dev/anotherdrive imagefile <someoptions> \
        -b <anotherblocksize> -I otherblockfile -i <otherblocksize>

- interrupt and later resume a data rescue operation:
    safecopy source dest
    <CTRL+C> (safecopy aborts)
    safecopy source dest -I /dev/null

- interrupt and later resume a data rescue operation with correct badblocks output:
    safecopy source dest <options> -o badblockfile
    <CTRL+C> (safecopy aborts)
    mv badblockfile savedbadblockfile
    safecopy source dest -I /dev/null -o badblockfile
    cat badblockfile >>savedbadblockfile

- find the corrupted files on a partially successful rescued file system:
    safecopy /dev/filesystem image -M CoRrUpTeD
    fsck image
    mount -o loop image /mnt/mountpoint
    grep -R "CoRrUpTeD" /mnt/mountpoint
  (hint: this might not find all affected files if the unreadable parts are smaller in size than your marker string)

- exclude the previously known badblocks list of a filesystem from filesystem image creation:
    dumpe2fs -b /dev/filesystem >badblocklist
    safecopy /dev/filesystem image \
        -X badblocklist -x <blocksize of your fs>

- create an image of a device that starts at X and is Y in size:
    safecopy /dev/filesystem image -b <bsize> -s <X/bsize> -l <Y/bsize>

- combine two partial images of rescued data without access to the actual (damaged) source data:
  (This is a bit tricky. You need to get badblocks lists for both files somehow to make safecopy know where the missing data is. If you used the -M (mark) feature you might be able to automatically compute these, however this feature is not provided by safecopy. Let's assume you have two badblocks files.
  You have:
    image1.dat
    image1.badblocks (blocksize1)
    image2.dat
    image2.badblocks (blocksize2)
  The file size of image1 needs to be greater than or equal to that of image2. If not, swap them.)
    cp image2.dat combined.dat
    safecopy image1.dat combined.dat -I image2.badblocks -i blocksize2 \
        -X image1.badblocks -x blocksize1
  (This gets you the combined data, but no output badblocklist. The resulting badblocks list would be the badblocks that are
    a: in both badblocks lists, or
    b: in image1.badblocks and beyond the file size of image2
  It should be reasonably easy to solve this logic in a short shell script. One day this might be shipped with safecopy, until then consider this your chance to contribute to a random open source project.)

- rescue data of a tape device:
  If the tape device driver supports lseek(), treat it as any file. Otherwise use the "-S" option of safecopy with a script you write yourself to skip over the bad blocks (for example using "mt seek"). Make sure your tape device doesn't auto-rewind on close. Send me feedback if you had any luck doing so, so I can update this documentation.

FAQ:

Q: Why create this tool if there already is something like dd-rescue and other tools for that purpose?
A: Because I didn't know of dd(-)rescue when I started, and I felt like it. Also I think safecopy suits the needs of a user in data loss peril better due to more readable output and more understandable options than some of the other tools. (Then again I am biased. Compare them yourself.) Meanwhile safecopy supports low level features other tools don't.

Q: What exactly does the -Z option do?
A: Remember back in MS-DOS times when a floppy would make a "neek nark" sound 3 times every time it ran into a read error? This happened when the BIOS or DOS disk driver moved the IO head to its boundaries to possibly correct small cylinder misalignment, before it tried again. Linux doesn't do that by default, neither do common CDROM drives or drivers. Nevertheless forcing this behaviour can increase your chance of reading bad sectors from a CD __BIG__ time. (Unlike floppies, where it usually has little effect.)

Q: What's my best chance to resurrect a CD that has become unreadable?
A: Try making a backup image on many different computers and drives. The abilities to read from bad media vary extremely. I have a 6 year old Lite On CDRW drive that even reads deeply and purposely scratched CDs (as in with my key, to make it unreadable) flawlessly. A CDRW drive of the same age at work doesn't read any data from that part of the CD at all, while most DVD and combo drives have bad blocks every couple hundred bytes. Make full use of safecopy's RAW access features if applicable. (-L 2 option)
   As a general guideline:
   - CDRW drives usually do better than read-only CD drives.
   - CD only drives sometimes do better on CDs than DVD drives.
   - PC drives are sometimes better than laptop ones.
   - A drive with a clean lens does better than a dirtball.
   - Cleaning up CDs helps.
   - Unless you use chemicals.

Q: What's my best chance to resurrect a floppy that became unreadable?
A: Again, try different floppy drives. Keep in mind that it might be easier to further damage data on a bad floppy than on a CD. (Don't overdo read attempts.)

Q: What about Blu-ray/HD DVD disks?
A: Hell if I knew, but generally they should be similar to DVDs. It probably depends on how the drive's firmware acts up.

Q: My hard drive suddenly has many bad sectors, what should I do?
A: Avoid accessing bad areas as much as possible to prevent further damage, while rescuing the still good data. Accessing bad sectors will make the drive perform lots of error recovery on its own, leading to lots of physical movement, and potentially lockdown of more disk areas by the firmware. You could use smartmontools to check drive error statistics and details about what's wrong / internal error logs.
   If you have a list of affected blocks/sectors, write a badblocks file manually and use the -X option to prevent safecopy from accessing them altogether at first. (Syslog may list them, too.)
   Then slowly do incremental recovery: start with a high fault skip (-f) and resolution (-r), and set retry (-R) and head recalibration (-Z) to 0. (Don't set the resolution lower than the physical sector size if your driver does correct sector alignments.)
   Then decrease resolution and fault skip down to the physical block size, increase the retry factor and at last try to add the -Z factor. (It probably won't help much on hard disks but it's worth a try.)
   If your drive stops responding, reboot, and let it cool down for a while if necessary. (I heard from people who used ice-packs successfully as a last resort.)

!!! If the data is really important, go to a professional data recovery specialist right away, before doing further damage to the drive. !!!

Safecopy 1.2 by CorvusCorax
Usage: safecopy [options] <source> <target>
Options:
    -b <bytes> : Blocksize in bytes for default read operations.
                 Set this to the physical sectorsize of your media.
                 Default: Driver blocksize of input device,
                          if determinable, otherwise 4096
    -f <bytes> : Blocksize in bytes when skipping over badblocks.
                 Higher settings put less strain on your hardware,
                 but you might miss good areas in between two bad ones.
                 Default: Blocksize as in -b times 16
    -r <bytes> : Resolution in bytes when searching for the exact
                 beginning or end of a bad area.
                 If you read data directly from a device there is no
                 need to set this lower than the hardware blocksize.
                 On mounted filesystems however, read blocks and
                 physical blocks could be misaligned.
                 Smaller values lead to very thorough attempts to read
                 data at the edge of damaged areas, but increase the
                 strain on the damaged media.
                 Default: Blocksize as in -b
    -R <number> : At least that many read attempts are made on the
                  first bad block of a damaged area with minimum
                  resolution. More retries can sometimes recover a
                  weak sector, but at the cost of additional strain.
                  Default: 3
    -Z <number> : On each error, force seek the read head from start
                  to end of the source device as often as specified.
                  That takes time, creates additional strain and might
                  not be supported by all devices or drivers.
                  Default: 1
    -L <mode> : Use low level device calls as specified:
                    0  Do not use low level device calls
                    1  Attempt low level device calls
                       for error recovery only
                    2  Always use low level device calls
                       if available
                Supported low level features in this version are:
                    SYSTEM  DEVICE TYPE  FEATURE
                    Linux   cdrom/dvd    bus/device reset
                    Linux   cdrom        read sector in raw mode
                    Linux   floppy       controller reset, twaddle
                Default: 1
    --sync : Use synchronized read calls (disable driver buffering)
             Default: Asynchronous read buffering by the OS is allowed
    -s <blocks> : Start position where to start reading.
                  Will correspond to position 0 in the destination file.
                  Default: block 0
    -l <blocks> : Maximum length of data to be read.
                  Default: Entire size of input file
    -I <badblockfile> : Incremental mode. Assume the target file already
                        exists and has holes specified in a badblockfile.
                        It will be attempted to retrieve more data from
                        the missing areas only.
                        Default: none
    -i <bytes> : Blocksize to interpret the badblockfile given with -I.
                 Default: Blocksize as specified by -b
    -X <badblockfile> : Exclusion mode. Do not attempt to read blocks in
                        badblockfile. If used together with -I,
                        excluded blocks override included blocks.
                        Default: none
    -x <bytes> : Blocksize to interpret the badblockfile given with -X.
                 Default: Blocksize as specified by -b
    -o <badblockfile> : Write a badblocks/e2fsck compatible bad block file.
                        Default: none
    -S <seekscript> : Use external script for seeking in input file.
                      (Might be useful for tape devices and similar.)
                      Seekscript must be an executable that takes the
                      number of blocks to be skipped as argv1 (1-64),
                      the blocksize in bytes as argv2
                      and the current position (in bytes) as argv3.
                      Return value needs to be the number of blocks
                      successfully skipped, or 0 to indicate seek failure.
                      The external seekscript will only be used
                      if lseek() fails and we need to skip over data.
                      Default: none
    -M <string> : Mark unrecovered data with this string instead of
                  skipping / zero-padding it. This helps in later
                  finding affected files on file system images
                  that couldn't be rescued completely.
                  Default: none
    -h | --help : Show this text

Description of output:
    . : Between 1 and 1024 blocks successfully read.
    _ : Read of block was incomplete. (possibly end of file)
        The blocksize is now reduced to read the rest.
    |/| : Seek failed, source can only be read sequentially.
    > : Read failed, reducing blocksize to read partial data.
    ! : A low level error on read attempt of smallest allowed size
        leads to a retry attempt.
    [xx](+yy){ : Current block and number of bytes continuously
                 read successfully up to this point.
    X : Read failed on a block with minimum blocksize and is skipped.
        Unrecoverable error, destination file is padded with zeros.
        Data is now skipped until end of the unreadable area is reached.
    < : Successful read after the end of a bad area causes
        backtracking with smaller blocksizes to search for the first
        readable data.
    }[xx](+yy) : Current block and number of bytes of recent
                 continuous unreadable data.

Copyright 2009, distributed under terms of the GPL
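
For the "combine two partial images" use case, the rule for the resulting badblocks list (bad in both lists, or bad in image1 and beyond the size of image2) could be sketched roughly as follows. This is not shipped with safecopy. The function names are hypothetical, and the sketch assumes a badblocks file holds one decimal block number per line. Because the two lists may use different block sizes, a block of image1 is treated as still missing if any part of it overlaps a bad block of image2 or reaches past the end of image2.

```python
def parse_badblocks(text):
    """Parse a bad block list (assumed format: one block number per line)."""
    return [int(token) for token in text.split()]


def merge_badblocks(bad1, blocksize1, bad2, blocksize2, size2):
    """Return the blocks (in units of blocksize1) still missing after
    image1 has been combined into a copy of image2.

    bad1, bad2 : bad block numbers of image1 / image2
    size2      : size of image2 in bytes
    """
    bad2_set = set(bad2)
    still_bad = []
    for block in sorted(set(bad1)):
        start = block * blocksize1
        end = start + blocksize1  # exclusive
        if end > size2:
            # image2 has no (or not all) data for this block
            still_bad.append(block)
            continue
        first = start // blocksize2
        last = (end - 1) // blocksize2
        if any(b in bad2_set for b in range(first, last + 1)):
            # the same bytes are unreadable in image2 as well
            still_bad.append(block)
    return still_bad


if __name__ == "__main__":
    bad1 = parse_badblocks("1\n5\n9\n")
    print(merge_badblocks(bad1, 512, [1], 2048, 4096))
```

The result is expressed in blocksize1 units, so it could be passed back to safecopy with -I or -X together with a matching -i or -x value.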