WD, Green drives and the dreaded LCC
I've been care-free for the last one and a half years, thanks to FreeNAS and a home-brew 4-disk ZFS setup. That is, until some weeks ago when I got a DEGRADED pool warning due to a read error on one of the disks.
Resilvering took some minutes, without any loss of data, but when I started digging around I uncovered the sad truth about WD IntelliPower and LCC, or Load Cycle Count. In short, WD advocates the use of a mildly aggressive power management feature in their desktop-class drives. I have 4 EARX, Green, desktop-class drives. The result:
9 Power_On_Hours 14246
12 Power_Cycle_Count 62
193 Load_Cycle_Count 887606
62 on/offs on the NAS machine in the last one and a half years. 593 days, to be exact (or 14246 hours). Heads were parked a total of 887606 times, or about once every minute. Which is fortunate, considering that IntelliPower tries to park the heads after 8 seconds of inactivity.
Sadly, WD rates Green drives for 300000 load cycles. Reds get a better rating of 600000, being server-class and all. At 887606 I was almost three times over the expected rating for my disks.
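The arithmetic behind those figures is easy to check. The short Python sketch below uses only the SMART values and the two ratings quoted above; the variable names are made up for illustration.

```python
# SMART values copied from the output above.
power_on_hours = 14246
load_cycles = 887606

# Load cycle ratings as quoted in this post.
GREEN_RATING = 300_000
RED_RATING = 600_000

days = power_on_hours / 24
loads_per_minute = load_cycles / (power_on_hours * 60)
seconds_between_loads = power_on_hours * 3600 / load_cycles

print(f"{days:.1f} days powered on")
print(f"{loads_per_minute:.2f} head parks per minute")
print(f"one park every {seconds_between_loads:.0f} seconds on average")
print(f"{load_cycles / GREEN_RATING:.2f}x the Green rating")
print(f"{load_cycles / RED_RATING:.2f}x the Red rating")
```

The same few lines work for any drive: swap in its own Power_On_Hours and Load_Cycle_Count.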
Panic ensues. Backups are quickly refreshed. And a long exchange of mails with WD support begins on RMAing the drives.
To cut a long story short, WD offers an advanced RMA option, where one can submit a credit card to cover the dispatch of replacement drives in advance. It took us a month to realize that an advanced security feature on the standard credit card I use was causing problems with the card clearing service they use. After switching cards it took just one week for the recertified drives to arrive.
I am currently resilvering #4 of the new drives. There were no surprises while exchanging each old disk with its replacement; it takes about 10 hours to resilver ~5.5TB of data. The process is essentially one-click, with FreeNAS offering a Replace button on a missing disk to kickstart the rebuild.
I've also been running a simple script in the background that keeps reading one KB of data from each raw disk device every 7 seconds, to keep IntelliPower from kicking in. As soon as all disks are safely replaced I'll be running either of the following two utilities to increase the default timeout to 2 minutes:
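As an aside on the background script (not one of the utilities), a keep-alive loop of that kind might look roughly like the Python sketch below. It is an illustration under assumptions, not the script actually used: the device names are hypothetical (FreeNAS usually exposes SATA disks as /dev/adaN), and `touch_disk` and `keep_awake` are made-up helper names. It always reads from the start of the device, so on a system that caches block-device reads the request may never reach the disk.

```python
import os
import time

# Hypothetical device names; adjust to whatever the system actually reports.
DEVICES = ["/dev/ada0", "/dev/ada1", "/dev/ada2", "/dev/ada3"]
READ_BYTES = 1024      # one KB per read
INTERVAL_SECONDS = 7   # just under the 8-second idle timeout


def touch_disk(path, nbytes=READ_BYTES):
    """Read nbytes from the start of path; return the number of bytes read."""
    fd = os.open(path, os.O_RDONLY)
    try:
        return len(os.read(fd, nbytes))
    finally:
        os.close(fd)


def keep_awake(devices, interval=INTERVAL_SECONDS, rounds=None):
    """Read from every device once per interval; rounds=None loops forever."""
    done = 0
    while rounds is None or done < rounds:
        for dev in devices:
            try:
                touch_disk(dev)
            except OSError as exc:
                # A missing or failing disk should not stop the other reads.
                print(f"{dev}: {exc}")
        done += 1
        time.sleep(interval)


# To run it (needs read access to the raw devices, loops until interrupted):
# keep_awake(DEVICES)
```

This is a stopgap at best: it trades head parking for disks that never idle, which is why raising the timeout in the drive itself is the better long-term fix.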
Always back up your NAS. No matter what. Especially if you've walked into a store and bought 4 desktop-class disks to put into your shiny new N40L enclosure. Those were probably manufactured within seconds of each other. Most internal parts were almost certainly molded in the same batch! And you'll be running them in almost the same way, in the same machine, day and night. Hmm.
Follow the manufacturer's advice regarding the suggested use of their products.
Having said that, don't trust the manufacturer. Always verify your assumptions with data from the actual setup. And always LISTEN to the machine during its first hours in use. I always get that crazy-person-do-avoid-him-look when I put my ear against a machine that misbehaves. Each disk is singing you his story. Try to listen.
IntelliPower is shit, WD, green energy and all. If you've been parking the heads every 8 seconds, for the past DAY, you should have some fail-safe in the firmware to reconsider that timeout.
Kudos to WD support staff (David, in particular) and their no-questions-asked replacement policy.