Checking all hard drives for errors
First of all, I fired up my trusty SysRescueCD USB stick.
Start with getting some SMART information:
smartctl -a /dev/sda
Perform a short test - takes a few minutes only:
smartctl -t short /dev/sda
smartctl -l selftest /dev/sda
General health report:
smartctl -H /dev/sda
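For scripting the health check, smartctl's exit status can be used instead of reading the report by eye. A minimal sketch, assuming the bit layout from the smartctl man page, where bit 3 (value 8) means the SMART status check returned "DISK FAILING". The function name smart_disk_failing is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: decide from a smartctl exit status whether the drive
# reported itself as failing.
# $1: exit status of a `smartctl -H <device>` run
# Assumption: bit 3 (value 8) is the "DISK FAILING" bit, as documented in the
# smartctl man page; the other bits (open errors, log entries, ...) are ignored.
smart_disk_failing() {
    [ $(( $1 & 8 )) -ne 0 ]
}
```

Possible usage: run smartctl -H /dev/sda, store $? in a variable right away, and pass that variable to smart_disk_failing.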
My external USB 3.0 hard drive was not automatically recognized; I guessed that it was
smartctl -d scsi -a /dev/sdc
smartctl -t short -d scsi /dev/sdc
smartctl -l selftest -d scsi /dev/sdc
smartctl -H -d scsi /dev/sdc
This took the longest; around 1.5h for a 250GB hard drive:
badblocks -b 4096 -c 4096 -s -v /dev/sda
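To get a feeling for how long larger drives might take, the single data point above (roughly 90 minutes for 250 GB) can be scaled linearly. This is only a back-of-the-envelope sketch: it assumes a drive and interface of comparable speed, and the function name badblocks_minutes is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: rough read-only badblocks duration in minutes.
# $1: drive size in GB
# Assumption: time scales linearly from the one measurement in this post
# (about 90 minutes for 250 GB); real drives will differ.
badblocks_minutes() {
    echo $(( $1 * 90 / 250 ))
}
```

For example, badblocks_minutes 1000 prints 360, so a 1 TB drive of similar speed would take about six hours.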
An additional filesystem check - make sure that the partitions are not mounted.
With the -f switch (which forces a check even if the filesystem looks clean), this took a minute or so per partition.
fdisk -l /dev/sda
fsck -fV /dev/sda1
fsck -fV /dev/sda3
fsck -fV /dev/sda4
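Since running fsck on a mounted partition can do damage, it may be worth guarding each call with a mount check. A minimal sketch, assuming the kernel's mount table in /proc/mounts lists the partition under the same device path that is passed in. The function name is_mounted and its optional second parameter are made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the device appears as a mount source.
# $1: device path, e.g. /dev/sda1
# $2: mount table to read (defaults to /proc/mounts; the parameter only exists
#     so the function can be tried against a sample file)
# Assumption: the device is listed under exactly this path, not under an alias
# such as /dev/root or a /dev/mapper name.
is_mounted() {
    awk -v dev="$1" '$1 == dev { found = 1 } END { exit !found }' "${2:-/proc/mounts}"
}
```

Possible usage: is_mounted /dev/sda1 || fsck -fV /dev/sda1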
It did find some errors on one partition, which it asked me to fix, and I answered yes. Alarmed by this, I re-ran the SMART test on that hard drive, a bit more thoroughly this time. First, let's see how long this will take:
smartctl -c /dev/sdb
Over an hour, ugh. Nevertheless:
smartctl -t long /dev/sdb
But after only a few minutes, smartctl -l selftest /dev/sdb tells me that the extended test completed without error. Hmmm :-|
I guess this will have to do for now.
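The estimated duration of the extended test can also be pulled out of the smartctl -c output with a small filter. This is a sketch only: it assumes the usual ATA layout, where a line starting with "Extended self-test routine" is followed by a "recommended polling time: ( NN) minutes." line. The function name extended_test_minutes is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: reads `smartctl -c` output on stdin and prints the
# recommended polling time of the extended self-test in minutes.
# Assumption: the two-line ATA layout described above; prints nothing if the
# output looks different.
extended_test_minutes() {
    awk '
        /^Extended self-test routine/ { want = 1; next }
        want && /recommended polling time/ {
            gsub(/[^0-9]/, "", $0)   # keep only the digits
            print $0
            exit
        }
    '
}
```

Possible usage: smartctl -c /dev/sdb | extended_test_minutes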
Rinse and repeat for all hard drives; make sure they're not mounted.
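To avoid typing the device names by hand, the list of whole disks can be taken from lsblk. A minimal sketch, assuming lsblk -dn -o NAME,TYPE prints one "name type" pair per line. The function name list_disks is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: reads `lsblk -dn -o NAME,TYPE` output on stdin and
# prints the /dev path of every entry whose type is "disk"
# (skipping loop devices, optical drives and the like).
list_disks() {
    awk '$2 == "disk" { print "/dev/" $1 }'
}
```

Possible usage: lsblk -dn -o NAME,TYPE | list_disks | while read -r d; do smartctl -H "$d"; done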
What if Problems Are Found?
# fsck -fV /dev/sdc1
fsck from util-linux 2.32
[/usr/bin/fsck.ext4 (1) -- /home/backup] fsck.ext4 -f /dev/sdc1
e2fsck 1.44.1 (24-Mar-2018)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
/lost+found not found. Create<y>? yes
Pass 4: Checking reference counts
Pass 5: Checking group summary information
Block bitmap differences: -(131366912--131371007)
Fix<y>? yes
Free blocks count wrong for group #4009 (10240, counted=32768).
Fix<y>? yes
Free blocks count wrong for group #4046 (28672, counted=32768).
Fix<y>? yes
Free blocks count wrong (123602180, counted=123628804).
Fix<y>? yes
recovery+backup: ***** FILE SYSTEM WAS MODIFIED *****
recovery+backup: 205354/61046784 files (1.8% non-contiguous), 120553212/244182016 blocks

# fsck -fV /dev/sdc1
fsck from util-linux 2.32
[/usr/bin/fsck.ext4 (1) -- /home/backup] fsck.ext4 -f /dev/sdc1
e2fsck 1.44.1 (24-Mar-2018)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
recovery+backup: 205354/61046784 files (1.8% non-contiguous), 120553212/244182016 blocks
I had a look in lost+found, and it was empty. I assume that means no data loss, and this chapter of another useful article seems to confirm the assumption.
Nevertheless, this partition hosts my backups, so I want to be very sure:
borg check --info --verify-data /path/to/borgbackupdir
Starting repository check
Starting repository index check
Completed repository check, no problems found.
Starting archive consistency check...
Starting cryptographic data integrity verification...
Finished cryptographic data integrity verification, verified 74519 chunks with 0 integrity errors.
Analyzing archive 201709030041 (1/9)
Analyzing archive 201709031032 (2/9)
Analyzing archive 201709031221 (3/9)
Analyzing archive 201709091605 (4/9)
Analyzing archive 201709161743 (5/9)
Analyzing archive 201709240050 (6/9)
Analyzing archive 201709301658 (7/9)
Analyzing archive 201710071610 (8/9)
Analyzing archive 201710141636 (9/9)
Archive consistency check complete, no problems found.
This is very slow; for a daily or weekly check one might want to remove