Vibration and Strange SATA Performance


Almost two years ago I blogged about a strange performance problem with SATA disks [1]. The problem was that certain regions of a disk gave poor linear read performance on some machines, but performed well on machines which appeared to be identical. I discovered the cause shortly afterwards but was prevented from disclosing it by an SGI NDA. Now that SGI no longer exists as a separate company my obligations under the NDA are reduced, and since the sysadmins of the University of Toronto have published all the most important data [2], my obligations in this regard are removed entirely.

In their Wiki they write “after SGI installed rubber grommits around the 5 or 6 tiny fans in the xe210 nodes, the read and write plots now look like”, followed by some graphs showing good disk performance.

The problem was that a certain brand and model of disk was particularly sensitive to vibrations. When that model of disk was installed in some machines the vibrations would interfere with disk reads. It seems that there was some sort of harmonic relationship between the vibration of the disk and that of the cooling fans, which would explain why some sections of the disk were read slowly and some gave normal performance (my previous post has the graphs which show a pattern). Some other servers of the same make and model didn’t have that problem, so it seemed that some slight manufacturing differences in the machines determined whether the vibration would affect the disk performance.
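The per-region pattern described above can be looked for with a simple linear read test. The following is only a minimal sketch and not the tool I used for the original graphs: the function name `region_throughput` and its parameters are made up for illustration. It assumes you can open the disk (or a large file) for reading, and it does nothing to bypass the operating system's cache, so repeated runs over the same regions may report cached rather than physical read speeds.

```python
import os
import time


def region_throughput(path, region_bytes, sample_bytes, block_bytes=1 << 20):
    """Return a list of (offset, MB/s) pairs, one per region of `path`.

    The device or file is split into regions of `region_bytes`; at the start
    of each region up to `sample_bytes` are read sequentially and timed.
    """
    results = []
    with open(path, "rb", buffering=0) as dev:
        total = dev.seek(0, os.SEEK_END)  # size of the file or block device
        for offset in range(0, total, region_bytes):
            dev.seek(offset)
            remaining = min(sample_bytes, total - offset)
            done = 0
            start = time.perf_counter()
            while done < remaining:
                chunk = dev.read(min(block_bytes, remaining - done))
                if not chunk:  # unexpected end of data
                    break
                done += len(chunk)
            elapsed = max(time.perf_counter() - start, 1e-9)
            results.append((offset, done / elapsed / 1e6))
    return results
```

Calling something like `region_throughput("/dev/sdX", 10 << 30, 256 << 20)` as root (where `/dev/sdX` is a placeholder for the disk under test) would sample 256MiB at every 10GiB boundary. Plotting the results from a good machine and a bad machine side by side should show whether the slow regions line up.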

One thing that I’ve been meaning to do is to test the performance of disks while they are being vibrated. I was thinking of getting a large bass speaker and a powerful amplifier, using the sound hardware in a PC to produce a range of frequencies, and securely attaching the hard disk to a piece of plywood in front of the speaker. But as I haven’t had time to do this over the last couple of years it seems unlikely that I will do it any time soon. Hopefully this blog post will inspire someone to do such tests. One thing to note if you want to do this is that it’s quite likely to damage the speaker, as sustained powerful bass sounds can melt parts of the coil. So buy a speaker second-hand.
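As a starting point for anyone who wants to try this, here is a rough sketch of how the test tones could be produced using only the Python standard library. The function name `write_tone_steps` and its defaults are my own invention for illustration. It writes a mono 16-bit WAV file containing one sine tone after another, which can then be played through the sound hardware while a disk benchmark runs. Each tone starts at zero phase, so there may be a click at the boundaries between tones.

```python
import math
import struct
import wave


def write_tone_steps(path, freqs_hz, seconds_each=1.0,
                     sample_rate=44100, amplitude=0.5):
    """Write consecutive sine tones to a mono 16-bit WAV file.

    Returns the total number of audio frames written.
    """
    frames_per_tone = int(seconds_each * sample_rate)
    peak = int(amplitude * 32767)  # scale to the signed 16-bit range
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit audio
        wav.setframerate(sample_rate)
        for freq in freqs_hz:
            samples = (
                int(peak * math.sin(2 * math.pi * freq * n / sample_rate))
                for n in range(frames_per_tone)
            )
            # Pack the samples as little-endian signed 16-bit integers.
            wav.writeframes(struct.pack("<%dh" % frames_per_tone, *samples))
    return frames_per_tone * len(freqs_hz)
```

For example, `write_tone_steps("sweep.wav", range(20, 400, 20), seconds_each=30)` would give 30 seconds at each of 20Hz, 40Hz and so on up to 380Hz. If you note the time each tone starts, any drop in disk throughput can be matched to a frequency.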

If someone in my region (Melbourne) wants to try this then I can donate some old IDE disks. I can offer advice on how to run the tests for anyone who is interested.

Also it’s worth considering that systems which make less noise might deliver better performance.

8 thoughts on “Vibration and Strange SATA Performance”

  1. Kieran Morrissey says:

    I recently saw this problem writ large with a series of 1TB SATA drives in a server that had a fan with a dodgy bearing. The bearing was letting out a screaming noise that was barely noticeable above the roar of the datacentre but I suspect had a significant component that was above human hearing range but in a range to which drives would be particularly susceptible.

    At first one drive started showing SMART issues (Offline sectors), then claimed bad sectors, then the next one in the same cage started showing similar issues. Read performance was noticeably below that of identical drives in another cage, but write performance was *more than an order of magnitude worse*.

    In the end iostat was showing await figures of >25000 for the drives in the problematic cage and we had to scale back what was running on the box substantially.

    It was only because I’d seen this video that I got the hint:

    .. the instant I shut off the problematic fan the drives’ performance became identical, and the ‘bad sectors’ proved to be an artefact of the fan noise.

    As you did, I noticed that some regions of the drives were mostly fine, whereas others were useless (basically the first 80GB was okay, the next 150GB was passable, but the remainder was pathetic). To stab in the dark, this is probably standing waves on the drive platter or along the head arm.

    I suspect bass may not be the issue and you might get more interesting results with frequencies up around 7kHz or higher. Remember that like a CD mech, a drive head has to track with any wobble in the platter, so anything much below the drive’s rotational speed in terms of frequency will probably be cancelled out mechanically by the drive. Also plywood may absorb and soften the vibrations. A rigid metal cage that will transfer vibrations directly onto the drive chassis might be more telling (the cage we had an issue with is rigid, with no grommets).

  2. Don Marti says:

    I do know someone who got crummy benchmark results with drives mounted on sleds in a flimsy enclosure, and much better results when he permanently mounted the drives to a heavier enclosure. From a desktop POV, this has to have some effect on the sound level attributed to the drive, too.

    (If you want a speaker for testing this stuff, you might check the used section at a music store for an old bass amp with an effects loop — pull the heavy-duty driver (speaker) out and mount it to your enclosure, then run line-level test tones in to the effects return.)

  3. Kieran Morrissey says:

    Whoops, left brain obviously not operative last night, 7200rpm being 120Hz of course so bass might well have an effect.

  4. Robin Humble says:

    the next batch of SGI (supermicro) nodes we got were xe310’s which we got with one better quality SATA disk (seagate NS, no WCE) in each node. this time there was ok rubber grommet vibration isolation for the fan assemblies. sadly we still got bad disk performance! there were 2 issues

    a) vibration again – there are 3 pairs of small fans in each 1/2 of the dual node. one of the pair runs at ~11k rpm, and the one behind which gets pre-accelerated air spins at ~14k rpm (~2x 7200 rpm). argh. in BIOS I turned the fans down from their ‘full speed’ setting to ‘server speed’ which varies with load, but doesn’t get up as high as 14k rpm. this got the fans away from the resonance and the disk performance and variability improved a lot. again, it wasn’t every disk in every node that had this problem, but resetting all the BIOS’s (156 of them!! sigh. thank god for ipmi) was the solution.

    b) controllers/drivers – the seagate drives still didn’t perform nearly as well with the intel SATA on-board controllers (5400 chipset) as they did in a few other desktops with random sata chipsets in them. again the solution was in BIOS. as they shipped, ata_piix was the Linux SATA driver needed which I thought was ok. however, once I turned on ‘SATA RAID’ which is ‘Intel RAID’ in the BIOS (but presumably just in JBOD mode), then Linux could use the ahci driver, and drive write performance increased by about 2x. NCQ enabled was the only user-space visible difference between the two BIOS/driver combos, but there’s absolutely no way that should give a 2x anything improvement, so I still don’t know what really happened there…
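The rotation rates mentioned in these comments can be compared by converting them to Hz and expressing them as a multiple of the disk's rotation rate. This is only a back-of-envelope sketch: the helper names are made up, the fan speeds are the approximate figures quoted above, and it looks only at rotation rates, not at whatever frequencies actually reach the disk.

```python
def rpm_to_hz(rpm):
    """Convert revolutions per minute to revolutions per second (Hz)."""
    return rpm / 60.0


def harmonic_ratio(source_rpm, disk_rpm=7200):
    """Return the source rotation rate as a multiple of the disk's rate."""
    return source_rpm / disk_rpm


# 7200 rpm disk, plus the approximate fan speeds quoted in the comment above.
for rpm in (7200, 11000, 14000):
    print(f"{rpm:>6} rpm = {rpm_to_hz(rpm):6.1f} Hz, "
          f"{harmonic_ratio(rpm):.2f}x the disk rate")
```

This gives 120.0Hz for the disk, about 183.3Hz (1.53x) for the ~11k rpm fan, and about 233.3Hz (1.94x) for the ~14k rpm fan, which is consistent with the "~2x 7200 rpm" observation.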

  5. Robin Humble says:

    oops, s/xe310/xe320/

    BTW, this isn’t really about SATA or SGI, just disks and vibration. I expect some more enterprisey disks are a bit more hardened against vibration, but they all will suffer at some level. also, we are very happy with SGI as a vendor otherwise we wouldn’t have bought the xe320 based machine from them – they are generally extremely helpful and excellent with parts replacement.

  6. etbe says:

    The above Spanish blog post notes a similar vibration problem when burning CDs.

  7. Pete says:

    So would installation of really effective data centre and server sound proofing increase performance of servers in the data centre overall?

    Would be an interesting experiment…

  8. etbe says:

    Pete: I have not seen any evidence to show that sound through the air can impact performance – at least not any volume of sound that wouldn’t hurt humans in the area. So it seems that the issue is just having solid racks that don’t transmit vibrations much – or at least racks that have harmonic frequencies that don’t support transmitting the problematic vibrations.

    It might do some good to rearrange servers in a rack. The harmonics of a rack that has the bottom half full will be quite different to a rack that has every second position occupied all the way up.
