TechSomething

Restoring data from a broken disk

example: #

what happened: #

the system started reporting errors on a non-redundant disk:

Device: /dev/sdb [SAT], 1025 Currently unreadable (pending) sectors
Device: /dev/sdb [SAT], 255 Offline uncorrectable sectors
Device: /dev/sdb [SAT], ATA error count increased from 306 to 903
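
to read the same counters straight from the drive, the SMART attribute table can be filtered for the sector-related entries. a minimal sketch, assuming smartmontools is installed, the drive uses the usual ATA attribute names, and the raw value is the 10th column of the "smartctl -A" output; smart_sector_attrs is a made-up helper name:

```shell
# print name and raw value of the sector-related SMART attributes
# (reads `smartctl -A` output on stdin; helper name is hypothetical)
smart_sector_attrs() {
    awk '$2 ~ /^(Current_Pending_Sector|Offline_Uncorrectable|Reallocated_Sector_Ct)$/ { print $2, $10 }'
}

# usage (as root):
#   smartctl -A /dev/sdb | smart_sector_attrs
```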

trying to rsync the files to another disk produced errors and failed

data recovery with ddrescue: #

ddrescue can copy the faulty disk to an image file (or to another disk), with options that control how aggressively it reads the damaged areas

we want to create a full disk image on the first pass without putting too much stress on an already faulty drive,
so we'll tune our ddrescue settings to reflect that, for example:

ddrescue --idirect --no-scrape /dev/sdx sdx.img sdx.log

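the options used here:

--idirect uses direct disk access for the input, bypassing the kernel cache
--no-scrape skips the scraping phase, so no time is spent on the most damaged areas during this pass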

the sdx.log file is very important because it maps the bad parts of the disk, so in a second pass we can avoid rescanning the whole drive and concentrate on the faulty parts
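
since the log (mapfile) is plain text, it can be inspected between passes. a minimal sketch that sums the bytes still marked as bad, assuming the GNU ddrescue mapfile layout where data lines read "pos size status" with hexadecimal numbers and "-" marks a bad-sector block; mapfile_bad_bytes is a made-up helper name:

```shell
# sum the bytes marked as bad-sector ("-") in a ddrescue mapfile
# (function name is hypothetical; runs in a subshell so no variables leak)
mapfile_bad_bytes() (
    total=0
    while read -r pos size status rest; do
        case "$pos,$size" in
            0x*,0x*) ;;        # data line: pos size status
            *) continue ;;     # comment, blank or current-status line
        esac
        if [ "$status" = "-" ]; then
            total=$(( total + $size ))   # the shell parses the 0x... hex size
        fi
    done < "$1"
    echo "$total"
)

# usage:
#   mapfile_bad_bytes sdx.log
```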

in the subsequent passes we can omit "--no-scrape" since we already have a full disk image and we want to recover as much as we can, even at the risk of breaking the drive:

ddrescue --idirect -r3 /dev/sdx sdx.img sdx.log

the only difference from the first pass is removing "--no-scrape" and adding "-r3", which sets the number of retry passes ddrescue makes over the bad sectors

actual data recovery: #

umount /dev/sdb

we are working on the external USB drive, mounted at /mnt/recovery_8TB:
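
before starting, it is worth checking that the destination has room for an image as large as the whole source disk. a small sketch, assuming GNU df and util-linux blockdev are available; enough_space is a made-up helper name:

```shell
# succeed if the available bytes ($2) cover the needed bytes ($1)
# (helper name is hypothetical)
enough_space() {
    [ "$2" -ge "$1" ]
}

# usage (as root; device and path from this example):
#   needed=$(blockdev --getsize64 /dev/sdb)
#   avail=$(df --output=avail -B1 /mnt/recovery_8TB | tail -n 1 | tr -d ' ')
#   enough_space "$needed" "$avail" || echo "not enough room for the image"
```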

first pass:

ddrescue --idirect --no-scrape /dev/sdb sdb.img sdb.log

output:

Current status
     ipos:   60025 MB, non-trimmed:   131072 B,  current rate:    166 MB/s
     opos:   60025 MB, non-scraped:        0 B,  average rate:  48489 kB/s
     ipos:    1781 GB, non-trimmed:        0 B,  current rate:   13312 B/s
     opos:    1781 GB, non-scraped:    1213 kB,  average rate:  72448 kB/s
non-tried:        0 B,  bad-sector:    93184 B,    error rate:     170 B/s
  rescued:    2000 GB,   bad areas:      145,        run time:  7h 39m 25s
pct rescued:   99.99%, read errors:      306,  remaining time:      1m 35s
                              time since last successful read:          0s
Finished

second pass:

ddrescue --idirect -r3 /dev/sdb sdb.img sdb.log

output:

Current status
     ipos:    1781 GB, non-trimmed:        0 B,  current rate:       0 B/s
     opos:    1781 GB, non-scraped:        0 B,  average rate:      79 B/s
non-tried:        0 B,  bad-sector:   524800 B,    error rate:     170 B/s
  rescued:    2000 GB,   bad areas:      182,        run time:  2h 44m 23s
pct rescued:   99.99%, read errors:     3924,  remaining time:         n/a
                              time since last successful read:     30m 35s
Finished
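
to put the final bad-sector figure in perspective, it can be converted to a sector count. a quick sketch, assuming 512-byte logical sectors (an assumption, check your drive); bad_sectors is a made-up helper name:

```shell
# convert a byte count to a number of sectors
# $1 = bytes, $2 = sector size in bytes (defaults to 512, an assumption)
bad_sectors() {
    echo $(( $1 / ${2:-512} ))
}

bad_sectors 524800    # prints 1025
```

with 512-byte sectors the 524800 B left after the second pass are 1025 sectors, the same number smartd reported as pending at the start.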

transferring recovered data: #

now we can move the data from the image file we created to another mountpoint;
to do that we first need to mount the image file

detect partitions on the image file:

kpartx -av sdb.img

check what your loop device looks like:

root@machine:/mnt/recovery_8TB# lsblk
NAME      MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
loop0       7:0    0   1,8T  0 loop  
`-loop0p1 253:0    0   1,8T  0 part  
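
the mapper device name can also be taken from the kpartx output instead of reading lsblk by eye. a small sketch, assuming kpartx prints lines that start with "add map loop0p1 ..." (check the output on your system); kpartx_mapper_paths is a made-up helper name:

```shell
# turn `kpartx -av` output (on stdin) into /dev/mapper/... paths
# (helper name is hypothetical)
kpartx_mapper_paths() {
    awk '$1 == "add" && $2 == "map" { print "/dev/mapper/" $3 }'
}

# usage (as root):
#   kpartx -av sdb.img | kpartx_mapper_paths
# when finished (after unmounting), the mappings can be removed with:
#   kpartx -d sdb.img
```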

mount the image in read-only mode:

mount -o loop,ro /dev/mapper/loop0p1 /mnt/recovery_8TB/restored_volume

rsync the data somewhere else:

rsync -avh --progress /mnt/recovery_8TB/restored_volume/ /mnt/recovery_8TB/recovered_data/

compare data: #

tree -pugsDx -o tree_faulty.txt -a /mnt/faulty/
tree -pugsDx -o tree_restored.txt -a /mnt/recovery_8TB/recovered_data/
diff tree_*
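
the tree listings compare names and metadata, not file contents. to check contents as well, checksums of the rsync copy can be compared with the mounted image (not with the faulty disk, so it is not stressed again). a minimal sketch, assuming GNU coreutils sha256sum; dir_checksums is a made-up helper name:

```shell
# print "checksum  ./relative/path" for every regular file under a directory,
# sorted by path (helper name is hypothetical; runs in a subshell because of cd)
dir_checksums() (
    cd "$1" || exit 1
    find . -type f -exec sha256sum {} + | sort -k 2
)

# usage (paths from this example):
#   dir_checksums /mnt/recovery_8TB/restored_volume > sums_image.txt
#   dir_checksums /mnt/recovery_8TB/recovered_data  > sums_copy.txt
#   diff sums_image.txt sums_copy.txt && echo "copy matches the image"
```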

tree:

sources: