Restoring data from a broken disk
example: #
- broken disk: /dev/sdb
- was mounted on /mnt/faulty
- external usb disk: /dev/sde
- formatted in ext4 and mounted on /mnt/recovery_8TB
what happened: #
the system started reporting errors on a non-redundant disk:
Device: /dev/sdb [SAT], 1025 Currently unreadable (pending) sectors
Device: /dev/sdb [SAT], 255 Offline uncorrectable sectors
Device: /dev/sdb [SAT], ATA error count increased from 306 to 903
trying to rsync the files to another disk produced errors and failed
data recovery with ddrescue: #
ddrescue can copy a faulty disk to an image (or to another disk), with options that control how aggressively it goes after unreadable areas
on the first pass we want to create a full disk image without putting too much stress on an already faulty drive,
so we tune the ddrescue settings to reflect that, for example:
ddrescue --idirect --no-scrape /dev/sdx sdx.img sdx.log
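the options and arguments are:
- --idirect: use direct access for the input, bypassing the kernel cache and reading straight from the device
- --no-scrape: skip the scraping phase, which avoids hammering the bad areas of the disk
- /dev/sdx: the faulty device
- sdx.img: the target disk image
- sdx.log: the mapfile, where ddrescue records the status of every area of the disk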
the sdx.log file is very important because it maps the bad parts of the disk, so a second pass can avoid rescanning the whole drive and concentrate on the faulty parts
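to see what the mapfile currently records, its block lines can be summed per status. a minimal sketch, assuming the mapfile layout described in the GNU ddrescue manual (comment lines starting with "#", one current-position line, then "pos size status" lines with hexadecimal values); the function name mapfile_summary is hypothetical:

```shell
#!/usr/bin/env bash
# Sum the bytes per status in a ddrescue mapfile.
# Status characters: '+' rescued, '-' bad sector,
# '/' non-scraped, '*' non-trimmed, '?' non-tried.
mapfile_summary() {
    local file=$1 pos size status seen_status_line=0
    local rescued=0 bad=0 pending=0
    while read -r pos size status _; do
        case $pos in ''|'#'*) continue ;; esac   # skip blank and comment lines
        if [ "$seen_status_line" -eq 0 ]; then    # first data line is the current-position line
            seen_status_line=1
            continue
        fi
        case $status in
            '+') rescued=$(( rescued + size )) ;;  # hex sizes like 0x200 are valid in shell arithmetic
            '-') bad=$(( bad + size )) ;;
            *)   pending=$(( pending + size )) ;;  # anything not yet fully processed
        esac
    done < "$file"
    echo "rescued=$rescued bad=$bad pending=$pending"
}

# usage: mapfile_summary sdx.log
```

all values are printed in bytes; "pending" lumps together everything that a later pass could still try.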
in the subsequent passes we can drop "--no-scrape", since the first pass already gave us a disk image and we now want to recover as much as we can, even at the risk of killing the drive:
ddrescue --idirect -r3 /dev/sdx sdx.img sdx.log
the only difference from the first pass is that "--no-scrape" is removed and "-r3" is added, which sets the number of retry passes ddrescue makes over the bad sectors
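the two passes can be wrapped in one small script that reuses the same mapfile. this is only a sketch: the function name two_pass_rescue and the DRY_RUN switch are made up for illustration, and by default it just prints the commands instead of running them:

```shell
#!/usr/bin/env bash
# Run the gentle pass first, then the retry pass, sharing one mapfile.
# DRY_RUN=1 (the default here) only prints the commands.
two_pass_rescue() {
    local dev=$1 img=$2 map=$3
    local run=""
    [ "${DRY_RUN:-1}" = 1 ] && run="echo"
    # Pass 1: copy everything that reads easily, skip the scraping phase.
    $run ddrescue --idirect --no-scrape "$dev" "$img" "$map" || return
    # Pass 2: go back to the areas the mapfile marks as bad, with 3 retry passes.
    $run ddrescue --idirect -r3 "$dev" "$img" "$map"
}

two_pass_rescue /dev/sdx sdx.img sdx.log
```

setting DRY_RUN=0 would actually start ddrescue, so check the device name twice before doing that.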
actual data recovery: #
umount /dev/sdb
we are working on the external USB drive, at /mnt/recovery_8TB:
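before starting it is worth checking that the target has room for an image as large as the whole faulty disk. a minimal sketch, with a hypothetical helper named enough_space that compares a byte count against the free space reported by df:

```shell
#!/usr/bin/env bash
# Return success if DIR has at least NEEDED bytes available.
enough_space() {
    local needed=$1 dir=$2 avail_kb
    # df -P prints one line per filesystem; column 4 is the available space in 1K blocks.
    avail_kb=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')
    [ $(( avail_kb * 1024 )) -ge "$needed" ]
}

# usage (as root; blockdev --getsize64 prints the device size in bytes):
#   enough_space "$(blockdev --getsize64 /dev/sdb)" /mnt/recovery_8TB && echo "ok"
```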
first pass:
ddrescue --idirect --no-scrape /dev/sdb sdb.img sdb.log
output:
Current status
ipos: 60025 MB, non-trimmed: 131072 B, current rate: 166 MB/s
opos: 60025 MB, non-scraped: 0 B, average rate: 48489 kB/s
ipos: 1781 GB, non-trimmed: 0 B, current rate: 13312 B/s
opos: 1781 GB, non-scraped: 1213 kB, average rate: 72448 kB/s
non-tried: 0 B, bad-sector: 93184 B, error rate: 170 B/s
rescued: 2000 GB, bad areas: 145, run time: 7h 39m 25s
pct rescued: 99.99%, read errors: 306, remaining time: 1m 35s
time since last successful read: 0s
Finished
second pass:
ddrescue --idirect -r3 /dev/sdb sdb.img sdb.log
output:
Current status
ipos: 1781 GB, non-trimmed: 0 B, current rate: 0 B/s
opos: 1781 GB, non-scraped: 0 B, average rate: 79 B/s
non-tried: 0 B, bad-sector: 524800 B, error rate: 170 B/s
rescued: 2000 GB, bad areas: 182, run time: 2h 44m 23s
pct rescued: 99.99%, read errors: 3924, remaining time: n/a
time since last successful read: 30m 35s
Finished
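the bad-sector figure is given in bytes; dividing it by the sector size gives the number of unreadable sectors. a small sketch, assuming 512-byte logical sectors (an assumption, check the real value for your drive) and a hypothetical helper named bad_sectors:

```shell
#!/usr/bin/env bash
# Convert ddrescue's "bad-sector" byte count into a number of sectors.
# The default of 512 bytes per sector is an assumption.
bad_sectors() {
    local bytes=$1 sector_size=${2:-512}
    echo $(( bytes / sector_size ))
}

bad_sectors 93184    # first pass  -> 182
bad_sectors 524800   # second pass -> 1025
```

under that assumption the second pass ends with 1025 unreadable sectors, the same number smartd reported as pending. the count is higher than after the first pass presumably because the 1213 kB that were left non-scraped had not yet been classified.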
transferring recovered data: #
now we can copy the data from the image file we created to another location;
to do that we first need to mount the image file
detect partitions on the image file:
kpartx -av sdb.img
check the name of the loop device that was created:
root@machine:/mnt/recovery_8TB# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
loop0 7:0 0 1,8T 0 loop
`-loop0p1 253:0 0 1,8T 0 part
mount the image in read-only mode:
mount -o loop,ro /dev/mapper/loop0p1 /mnt/recovery_8TB/restored_volume
rsync the data somewhere else:
rsync -avh --progress /mnt/recovery_8TB/restored_volume/ /mnt/recovery_8TB/recovered_data/
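when the copy is finished (and verified, if desired), the image can be released again. a sketch with a hypothetical function release_image and a made-up DRY_RUN switch; by default it only prints the commands:

```shell
#!/usr/bin/env bash
# Unmount the restored filesystem and drop the kpartx mappings.
# DRY_RUN=1 (the default here) only prints the commands.
release_image() {
    local mountpoint=$1 img=$2
    local run=""
    [ "${DRY_RUN:-1}" = 1 ] && run="echo"
    $run umount "$mountpoint" || return   # unmount the read-only filesystem
    $run kpartx -dv "$img"                # remove the partition mappings created by kpartx -a
}

release_image /mnt/recovery_8TB/restored_volume sdb.img
```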
compare data: #
tree -pugsDx -o tree_faulty.txt -a /mnt/faulty/
tree -pugsDx -o tree_restored.txt -a /mnt/recovery_8TB/recovered_data/
diff tree_*
tree options:
- p: print the permissions of each file
- u: print the file owner or UID number
- g: print the file group or GID number
- s: print the size of each file in bytes
- D: print the date of last modification
- x: stay on the current filesystem only
- a: include hidden files
- o: write the output to the given file
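tree only compares names and metadata. to also compare file contents, checksums of the two trees can be diffed. a minimal sketch, assuming sha256sum from coreutils is available; the function name checksum_tree and the output file names are hypothetical:

```shell
#!/usr/bin/env bash
# Print "checksum  ./relative/path" for every regular file under a directory,
# sorted by path so that two listings can be compared with diff.
checksum_tree() {
    ( cd "$1" && find . -type f -exec sha256sum {} + | sort -k 2 )
}

# usage, comparing the mounted image with the rsync copy:
#   checksum_tree /mnt/recovery_8TB/restored_volume > sums_image.txt
#   checksum_tree /mnt/recovery_8TB/recovered_data  > sums_copy.txt
#   diff sums_image.txt sums_copy.txt
```

this reads every file in full, so it is better run against the mounted image than against the faulty disk itself.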
sources:
- https://superuser.com/questions/905811/faster-recovery-from-a-disk-with-bad-sectors
- https://www.technibble.com/guide-using-ddrescue-recover-data/