Setting up smartmontools and hddtemp to analyse and monitor storage devices.
What is SMART?
Self-Monitoring, Analysis, and Reporting Technology (S.M.A.R.T.) is a supplementary component built into many modern storage devices, through which the devices monitor, store, and analyse the health of their operation.
Installation on Debian
# apt-get update && apt-get install smartmontools hddtemp
Basic smartctl Usage
Smartctl is a command line utility designed to perform SMART tasks such as printing the SMART self-test and error logs, enabling and disabling SMART automatic testing, and initiating device self-tests.
Scan for Devices
# smartctl --scan
/dev/sda -d scsi # /dev/sda, SCSI device
/dev/sdb -d scsi # /dev/sdb, SCSI device
Scan and Try to Open Devices
# smartctl --scan-open
/dev/sda -d sat # /dev/sda [SAT], ATA device
/dev/sdb -d sat # /dev/sdb [SAT], ATA device
Check for SMART support for Device
# smartctl -i /dev/sda | grep support
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
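The same check can be scripted. The following is a minimal sketch, assuming the two "SMART support is:" lines look like the output above; the function name smart_ready is hypothetical and not part of smartmontools.

```shell
# Hypothetical helper: reads `smartctl -i` output on stdin and succeeds
# only if the device reports SMART as both available and enabled.
smart_ready() {
    awk '
        /^SMART support is:/ && /Available/ { available = 1 }
        /^SMART support is:/ && /Enabled/   { enabled = 1 }
        END { exit !(available && enabled) }
    '
}

# Usage on a real system (needs root and smartmontools installed):
#   smartctl -i /dev/sda | smart_ready && echo "SMART is ready on /dev/sda"
```

If SMART is available but disabled, it can usually be switched on with "smartctl -s on /dev/sda".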
Show Identity Information for Device
# smartctl -i /dev/sda
smartctl 5.41 2011-06-09 r3365 [x86_64-linux-3.2.0-4-amd64] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net
=== START OF INFORMATION SECTION ===
Model Family: Western Digital Scorpio Blue Serial ATA
Device Model: WDC WD3200BEVT-75ZCT1
Serial Number: WD-WXE608NN5731
LU WWN Device Id: 5 0014ee 2ac77d26d
Firmware Version: 11.01A11
User Capacity: 320,072,933,376 bytes [320 GB]
Sector Size: 512 bytes logical/physical
Device is: In smartctl database [for details use: -P show]
ATA Version is: 8
ATA Standard is: Exact ATA specification draft version not indicated
Local Time is: Tue Apr 29 22:18:03 2014 BST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
# smartctl -i /dev/sdb
smartctl 5.41 2011-06-09 r3365 [x86_64-linux-3.2.0-4-amd64] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net
=== START OF INFORMATION SECTION ===
Model Family: SandForce Driven SSDs
Device Model: OCZ-VERTEX2
Serial Number: OCZ-SACTRPJY59L43FH3
LU WWN Device Id: 5 e83a97 f02ec6a4d
Firmware Version: 1.37
User Capacity: 50,020,540,416 bytes [50.0 GB]
Sector Size: 512 bytes logical/physical
Device is: In smartctl database [for details use: -P show]
ATA Version is: 8
ATA Standard is: ATA-8-ACS revision 6
Local Time is: Tue Apr 29 22:18:04 2014 BST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
Show SMART Capabilities
# smartctl -c /dev/sda
Show SMART Health Status
# smartctl -H /dev/sda
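Besides the printed result, smartctl reports problems through its exit status, which the smartctl manual page documents as a bit mask. The sketch below decodes that mask into readable messages; the function name explain_smartctl_status is hypothetical, and the message texts are paraphrased from the manual page rather than quoted.

```shell
# Hypothetical helper: decode the bit mask that smartctl returns as its
# exit status (bits 0 to 7, as described in the smartctl manual page).
explain_smartctl_status() {
    status=$1
    if [ "$status" -eq 0 ]; then
        echo "no problems reported"
        return 0
    fi
    bit=0
    for message in \
        "command line did not parse" \
        "device open failed" \
        "a SMART command failed or a checksum error was found" \
        "SMART status check returned DISK FAILING" \
        "prefail attributes at or below threshold" \
        "attributes at or below threshold at some time in the past" \
        "device error log contains errors" \
        "self-test log contains errors"
    do
        # Print the message only if this bit is set in the status.
        if [ $(( (status >> bit) & 1 )) -eq 1 ]; then
            echo "bit $bit: $message"
        fi
        bit=$((bit + 1))
    done
}

# Usage on a real system:
#   smartctl -H /dev/sda > /dev/null; explain_smartctl_status $?
```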
Perform a Short Test
Run a short test (normally takes several minutes) on /dev/sda:
# smartctl -t short /dev/sda
Show Selftest Log for Device
# smartctl -l selftest /dev/sda
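To check the result of the latest self-test from a script, the log can be filtered. This is a rough sketch that assumes the most recent entry is the line numbered "# 1" and that a passed test is reported with the text "Completed without error"; the function name last_selftest_ok is hypothetical.

```shell
# Hypothetical helper: reads `smartctl -l selftest` output on stdin and
# succeeds only if the most recent entry (numbered 1) completed without error.
last_selftest_ok() {
    awk '
        /^# *1 / { found = 1; ok = ($0 ~ /Completed without error/) }
        END { exit !(found && ok) }
    '
}

# Usage on a real system:
#   smartctl -l selftest /dev/sda | last_selftest_ok || echo "check /dev/sda"
```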
Show Error Log for Device
# smartctl -l error /dev/sda
Get Device’s Temperature
# smartctl -a /dev/sda | grep Temp | cut -d" " -f 2,37
Temperature_Celsius 42
# smartctl -a /dev/sdb | grep Temp | cut -d" " -f 2,37
Temperature_Celsius 30
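The field number passed to cut depends on the exact spacing of the attribute table, so it may differ between drives and smartctl versions. A less fragile sketch uses awk, which splits on runs of whitespace, and takes the tenth column (RAW_VALUE). It assumes the drive names the attribute Temperature_Celsius, as both drives above do; the function name smart_temperature is hypothetical.

```shell
# Hypothetical helper: reads `smartctl -A` output on stdin and prints the
# raw value (column 10) of the Temperature_Celsius attribute.
smart_temperature() {
    awk '$2 == "Temperature_Celsius" { print $10; exit }'
}

# Usage on a real system:
#   smartctl -A /dev/sda | smart_temperature
```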
Configure SMART Disk Monitoring Daemon
Back up the default /etc/default/smartmontools configuration file first:
# cp /etc/default/smartmontools /etc/default/smartmontools.backup
Start smartd on system startup:
# echo "start_smartd=yes" > /etc/default/smartmontools
Also back up the default /etc/smartd.conf configuration file:
# cp /etc/smartd.conf /etc/smartd.conf.backup
Create a script to execute if any disk problems are detected:
# touch /root/script.sh
Open the file and add the following lines:
#!/bin/bash

# Save the email message (STDIN) to a file:
cat > /root/smartdmsg

# Append the output of smartctl -a to the message:
/usr/sbin/smartctl -a -d $SMARTD_DEVICETYPE $SMARTD_DEVICE >> /root/smartdmsg

# Now email the message to the address given with -m:
/usr/bin/mail -s "$SMARTD_SUBJECT" $SMARTD_ADDRESS < /root/smartdmsg
Make sure the paths defined are correct. Double-check if unsure:
# which smartctl mail
/usr/sbin/smartctl
/usr/bin/mail
Make the script file executable:
# chmod 0755 /root/script.sh
Make the smartd configuration file empty:
# >/etc/smartd.conf
Now add the following lines to /etc/smartd.conf:
/dev/sda -a -d sat -o on -S on -s (S/../.././19) -m root -M exec /root/script.sh
/dev/sdb -a -d sat -o on -S on -s (S/../.././19) -m root -M exec /root/script.sh
These two lines set up monitoring for the /dev/sda and /dev/sdb disks.
- -a : turns on -H, -f, -t, -l selftest, -l error, -C 197, -U 198.
- -d sat : the device type is SCSI to ATA Translation (SAT).
- -o on : enables SMART Automatic Offline Testing when smartd starts up.
- -S on : enables Attribute Autosave when smartd starts up.
- -s (S/../.././19) : run a Short Self-Test (S) every day between 7pm and 8pm.
- -m root : send a warning email to root.
- -M exec /root/script.sh : run the executable /root/script.sh instead of the default mail command, when smartd needs to send email.
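The argument to -s is a regular expression that smartd compares against a time stamp of the form T/MM/DD/d/HH (test type, month, day of month, day of week with 1 for Monday, hour). The sketch below tries a pattern against a stamp with grep; the function name schedule_matches is hypothetical, and treating the pattern as a whole-string extended regular expression is an assumption of this sketch rather than a statement about how smartd is implemented.

```shell
# Hypothetical helper: does a smartd -s pattern match a T/MM/DD/d/HH stamp?
# Assumption: the pattern must match the whole stamp (grep -x).
schedule_matches() {
    pattern=$1
    stamp=$2
    printf '%s\n' "$stamp" | grep -Eqx "$pattern"
}

# Tuesday 29 April, hour 19: matched by the daily 7pm pattern used above.
schedule_matches '(S/../.././19)' 'S/04/29/2/19' && echo "short test would be due"

# Stamp for the current hour on this machine:
#   date '+S/%m/%d/%u/%H'
```

Changing the pattern to (S/../../7/03), for example, would restrict the short test to Sundays between 3am and 4am.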
Our root email address is set under /etc/aliases. It is recommended to use “-M test” first, which sends a single test email immediately upon smartd startup. This makes it possible to verify that email is delivered correctly. The “-M test” directive goes on the device lines in /etc/smartd.conf, next to the other directives.
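For a quick delivery check, "-M test" can be appended temporarily to each device line. The following sketch does this with sed; it assumes simple one-line entries without trailing comments or backslash continuations, and the function name add_mail_test is hypothetical.

```shell
# Hypothetical helper: append "-M test" to every device line of a smartd.conf
# read from stdin, leaving blank lines and comment lines untouched.
add_mail_test() {
    sed -e '/^[[:space:]]*$/b' \
        -e '/^[[:space:]]*#/b' \
        -e 's/[[:space:]]*$/ -M test/'
}

# Example with one of the lines from this article:
echo '/dev/sda -a -d sat -m root -M exec /root/script.sh' | add_mail_test
```

Once the test email has arrived, remove "-M test" again and restart the daemon, otherwise a test message is sent on every smartd startup.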
Restart the smartmontools daemon:
# /etc/init.d/smartmontools restart
[ ok ] Restarting S.M.A.R.T. daemon: smartd.
For troubleshooting, check syslog:
# tail -f /var/log/syslog
Monitor Hard Drive Temperature with hddtemp
The hddtemp utility can monitor hard drive temperature by reading S.M.A.R.T. information on drives that support this feature.
# hddtemp /dev/sda /dev/sdb
/dev/sda: WDC WD3200BEVT-75ZCT1: 41°C
/dev/sdb: OCZ-VERTEX2: 30°C
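The hddtemp output is easy to filter when only hot drives are of interest. This is a minimal sketch, assuming the "device: model: temperature" layout shown above; the function name hot_drives and the limit of 40 degrees are hypothetical choices, not recommendations from hddtemp.

```shell
# Hypothetical helper: reads hddtemp output on stdin and prints the devices
# whose temperature exceeds the limit given as the first argument.
hot_drives() {
    limit=$1
    awk -F': ' -v limit="$limit" '
        {
            temp = $NF
            sub(/[^0-9].*$/, "", temp)   # strip the unit, leaving only digits
            # Skip lines without a numeric reading (for example a sleeping drive).
            if (temp != "" && temp + 0 > limit + 0) print $1, temp
        }
    '
}

# Usage on a real system:
#   hddtemp /dev/sda /dev/sdb | hot_drives 40
```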
Thank you very much for this great introduction and getting started to SMART monitoring. I just followed your steps and even cut and paste for some of the config and script files. I just finished a little troubleshooting and wanted to mention I think I found 2 typos in the root script.sh.
“cat > /root/smartdmsd”, I think should be “cat > /root/smartdmsg” (the d at the end should be g).
and
“/usr/sbin/smartctl -a -d $SMART_DEVICETYPE $SMARTD_DEVICE >> /root/smartdmsg”, I believe there is a “D” missing in $SMART_DEVICETYPE, should be $SMARTD_DEVICETYPE.
I may be wrong, I’m very new to this, and thank you again for your time and effort so others of us can get SMARTer.
Hi David. You’re right, thanks very much for spotting them. I fixed the article.
If you don’t make mistakes then you don’t learn I guess :)
Hi Tomas,
Thank you so much for this extremely useful tutorial. I followed it through, but not entirely sure if I got everything right. In the script, I replaced “$SMARTD_SUBJECT” $SMARTD_ADDRESS with a subject of my choosing and an email address that I want the report delivered to. Is that correct?
Also, kind of touching on the same subject, I’m not quite sure what to do with the recommendation “to use “-M test” first which sends a single test email immediately upon smartd startup.” Where do I apply that? In the configuration file /etc/smartd.conf? Sorry for my ignorance, I’m still figuring this out…
Best,
Jakob
Tomas, never mind! I found out. Also changed the script back the way it was… Thanks a lot again!
You’re welcome.
great post thanks.. got me started fast, one comment though since i have a ton of drives I changed the config to :
DEFAULT -a -d sat -o on -S on -s (S/../.././19) -m root -M exec /root/script.sh
DEVICESCAN
saved me a lot of lines and smartmontools daemon does not fail on a missing/removed drive
Nicely done, thanks for the tip.