<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>etbe - Russell Coker &#187; Benchmark</title>
	<atom:link href="http://etbe.coker.com.au/category/benchmark/feed/" rel="self" type="application/rss+xml" />
	<link>http://etbe.coker.com.au</link>
	<description>Linux, politics, and other interesting things</description>
	<lastBuildDate>Wed, 08 Feb 2012 13:24:03 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>DRBD Benchmarking</title>
		<link>http://etbe.coker.com.au/2012/01/05/drbd-benchmarking/</link>
		<comments>http://etbe.coker.com.au/2012/01/05/drbd-benchmarking/#comments</comments>
		<pubDate>Thu, 05 Jan 2012 08:31:32 +0000</pubDate>
		<dc:creator>etbe</dc:creator>
				<category><![CDATA[Benchmark]]></category>

		<guid isPermaLink="false">http://etbe.coker.com.au/?p=3112</guid>
		<description><![CDATA[I&#8217;ve got some performance problems with a mail server that&#8217;s using DRBD so I&#8217;ve done some benchmark tests to try and improve things. I used Postal for testing delivery to an LMTP server [1]. The version of Postal I released a few days ago had a bug that made LMTP not work, I&#8217;ll release a [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve got some performance problems with a mail server that&#8217;s using DRBD so I&#8217;ve done some benchmark tests to try and improve things. I used <a href="http://doc.coker.com.au/projects/postal/">Postal for testing delivery to an LMTP server [1]</a>. The version of Postal I released a few days ago had a bug that made LMTP not work, I&#8217;ll release a new version to fix that next time I work on Postal &#8211; or when someone sends me a request for LMTP support (so far no-one has asked for LMTP support so I presume that most users don&#8217;t mind that it&#8217;s not yet working).</p>
<p>The local spool on my test server is managed by Dovecot: the Dovecot delivery agent stores the mail and the Dovecot POP and IMAP servers provide user access. For delivery I&#8217;m using the LMTP server I wrote, which has been almost ready for GPL release for a couple of years. All I need to write is a command-line parser to support delivery options for different local delivery agents. Currently my LMTP server is hard-coded to run /usr/lib/dovecot/deliver and has its parameters hard-coded too. As an aside, if someone would like to contribute some GPL C/C++ code to convert a string like &#8220;<b>/usr/lib/dovecot/deliver -e -f %from% -d %to% -n</b>&#8221; into something that will populate an argv array for execvp() then that would be really appreciated.</p>
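<p>To illustrate the kind of template expansion described above, here is a minimal sketch in Python rather than the C/C++ that the LMTP server would need. The function name <b>build_argv</b> and the example addresses are assumptions made for this illustration; only the template string comes from this post. Splitting the template before substituting means that a substituted value containing spaces stays in a single argv element.</p>

```python
import shlex

def build_argv(template, values):
    """Split a command template into argv, then expand %name% placeholders.

    Hypothetical helper: the split happens before substitution so that a
    substituted value is never broken into several arguments.
    """
    argv = []
    for word in shlex.split(template):
        for name, value in values.items():
            word = word.replace("%" + name + "%", value)
        argv.append(word)
    return argv

template = "/usr/lib/dovecot/deliver -e -f %from% -d %to% -n"
argv = build_argv(template, {"from": "sender@example.com", "to": "user@example.com"})
print(argv)
# The list could then be handed to os.execvp(argv[0], argv).
```

<p>A C version would follow the same two steps: tokenise the template once at startup, then fill in the placeholders for each delivery.</p>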
<p>Authentication is to a MySQL server running on a fast P4 system. The MySQL server was never anywhere near its CPU or disk IO capacity, so using a different authentication system probably wouldn&#8217;t have given different results. I used MySQL because it&#8217;s what I&#8217;m using in production. Apart from my LMTP server and the new version of Postal, all software involved in the testing is from Debian/Squeeze.</p>
<h3>The Tests</h3>
<p>All tests were done on a 20G IDE disk. I started testing with a Pentium-4 1.5GHz system with 768M of RAM but then moved to a Pentium-4 2.8GHz system with 1G of RAM when I found CPU time to be a bottleneck with barrier=0. All test results are for the average number of messages delivered per minute for a 19 minute test run where the first minute&#8217;s results are discarded. The delivery process used 12 threads to deliver mail.</p>
<table>
<tr>
<th></th>
<th>P4-1.5</th>
<th>P4-2.8</th>
</tr>
<tr>
<td>Default Ext4</td>
<td>1468</td>
<td>1663</td>
</tr>
<tr>
<td>Ext4 max_batch_time=30000</td>
<td>1385</td>
<td>1656</td>
</tr>
<tr>
<td>Ext4 barrier=0</td>
<td>1997</td>
<td>2875</td>
</tr>
<tr>
<td>Ext4 on DRBD no secondary</td>
<td>1810</td>
<td>2409</td>
</tr>
</table>
<p>When doing the above tests the 1.5GHz system was using 100% CPU time when the filesystem was mounted with barrier=0, about half of that was for system (although I didn&#8217;t make notes at the time). So the testing on the 1.5GHz system showed that increasing the Ext4 max_batch_time number doesn&#8217;t give a benefit for a single disk, that mounting with barrier=0 gives a significant performance benefit, and that using DRBD in disconnected mode gives a good performance benefit through forcing barrier=0. As an aside I wonder why they didn&#8217;t support barriers on DRBD given all the other features that they have for preserving data integrity.</p>
<p>The tests with the 2.8GHz system demonstrate the performance benefits of having adequate CPU power. As an aside, I hope that Ext4 is optimised for multi-core CPUs, because if a 20G IDE disk needs a 2.8GHz P4 then modern RAID arrays probably require more CPU power than a single core can provide.</p>
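<p>The gains discussed above can be worked out from the first table. This is only a small arithmetic sketch; the numbers are the messages per minute from the table and the helper name <b>percent_gain</b> is made up for the example.</p>

```python
# Messages per minute from the first table: (P4-1.5, P4-2.8).
results = {
    "Default Ext4": (1468, 1663),
    "Ext4 max_batch_time=30000": (1385, 1656),
    "Ext4 barrier=0": (1997, 2875),
    "Ext4 on DRBD no secondary": (1810, 2409),
}

def percent_gain(old, new):
    """Percentage increase when going from old to new."""
    return (new - old) * 100.0 / old

# Benefit of the faster CPU for each configuration.
for name, (slow, fast) in results.items():
    print(f"{name}: {percent_gain(slow, fast):.0f}% faster on the 2.8GHz system")

# Benefit of barrier=0 over the default mount options on each machine.
default_slow, default_fast = results["Default Ext4"]
barrier_slow, barrier_fast = results["Ext4 barrier=0"]
print(f"barrier=0 gain on P4-1.5: {percent_gain(default_slow, barrier_slow):.0f}%")
print(f"barrier=0 gain on P4-2.8: {percent_gain(default_fast, barrier_fast):.0f}%")
```

<p>The barrier=0 gain is roughly twice as large on the faster machine, which is consistent with the slower machine being limited by CPU time.</p>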
<p>It&#8217;s also interesting to note that a degraded DRBD device (where the secondary has never been enabled) only gives 84% of the performance of /dev/sda4 when mounted with barrier=0.</p>
<table>
<tr>
<th></th>
<th>P4-2.8</th>
</tr>
<tr>
<td>Default Ext4</td>
<td>1663</td>
</tr>
<tr>
<td>Ext4 max_batch_time=30000</td>
<td>1656</td>
</tr>
<tr>
<td>Ext4 min_batch_time=15000,max_batch_time=30000</td>
<td>1626</td>
</tr>
<tr>
<td>Ext4 max_batch_time=0</td>
<td>1625</td>
</tr>
<tr>
<td>Ext4 barrier=0</td>
<td>2875</td>
</tr>
<tr>
<td>Ext4 on DRBD no secondary</td>
<td>2409</td>
</tr>
<tr>
<td>Ext4 on DRBD connected C</td>
<td>1575</td>
</tr>
<tr>
<td>Ext4 on DRBD connected B</td>
<td>1428</td>
</tr>
<tr>
<td>Ext4 on DRBD connected A</td>
<td>1284</td>
</tr>
</table>
<p>Of all the options for batch times that I tried it seemed that every change decreased the performance slightly but as the greatest decrease in performance was only slightly more than 2% it doesn&#8217;t matter much.</p>
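<p>As a quick check of the figure above, this sketch computes how far each batch time setting falls below the default Ext4 result on the 2.8GHz system. The helper name <b>percent_drop</b> is an assumption for the example; the numbers are from the table.</p>

```python
# P4-2.8 results (messages per minute) for the batch time options.
default = 1663
batch_options = {
    "max_batch_time=30000": 1656,
    "min_batch_time=15000,max_batch_time=30000": 1626,
    "max_batch_time=0": 1625,
}

def percent_drop(baseline, value):
    """Percentage decrease of value relative to baseline."""
    return (baseline - value) * 100.0 / baseline

drops = {name: percent_drop(default, value) for name, value in batch_options.items()}
for name, drop in drops.items():
    print(f"{name}: {drop:.2f}% slower than default")
print(f"largest drop: {max(drops.values()):.2f}%")
```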
<p>One thing that really surprised me was the test results from different replication protocols. The <a href="http://www.drbd.org/users-guide/s-replication-protocols.html">DRBD replication protocols are documented here [2]</a>. Protocol C is fully synchronous &#8211; a write request doesn&#8217;t complete until the remote node has it on disk. Protocol B is memory synchronous &#8211; the write is complete when it&#8217;s on a local disk and in RAM on the other node. Protocol A is fully asynchronous &#8211; a write is complete when it&#8217;s on a local disk. I had expected protocol A to give the best performance, as it has lower latency for critical write operations, and protocol C to give the worst, but the results were the reverse. My theory is that DRBD has a performance bug for the protocols that the developers don&#8217;t recommend.</p>
<p>One other thing I can&#8217;t explain is that according to iostat the data partition on the secondary DRBD node had almost 1% more sectors written than the primary and the number of writes was more than 1% greater on the secondary. I had hoped that with protocol A the writes would be combined on the secondary node to give a lower disk IO load.</p>
<p><a href="http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=654206">I filed Debian bug report #654206 about the kernel not exposing the correct value for max_batch_time</a>. The fact that no-one else has reported that bug (which is in kernels from at least 2.6.32 to 3.1.0) is an indication that not many people have found that mount option useful.</p>
<h3>Conclusions</h3>
<p>When using DRBD use protocol C as it gives better integrity and better performance.</p>
<p>Significant CPU power is apparently required for modern filesystems. The fact that a <a href="http://support.dell.com/support/edocs/storage/7j376/specs.htm">Maxtor 20G 7200rpm IDE disk [3]</a> can&#8217;t be driven at full speed by a 1.5GHz P4 was a surprise to me.</p>
<p>DRBD significantly reduces performance when compared to a plain disk mounted with barrier=0 (for a fair comparison). The best that DRBD could do in my tests was 55% of native performance when connected and 84% of native performance when disconnected.</p>
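<p>The percentages above come from dividing each DRBD result by the barrier=0 result on the 2.8GHz system. A short sketch of that arithmetic, with the hypothetical helper name <b>percent_of_native</b>:</p>

```python
# P4-2.8 results (messages per minute) from the second table.
native_barrier0 = 2875
drbd = {
    "no secondary": 2409,
    "connected C": 1575,
    "connected B": 1428,
    "connected A": 1284,
}

def percent_of_native(value, native=native_barrier0):
    """Throughput as a percentage of the plain disk with barrier=0."""
    return value * 100.0 / native

for mode, value in drbd.items():
    print(f"DRBD {mode}: {percent_of_native(value):.0f}% of barrier=0 performance")
```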
<p>When comparing a cluster of cheap machines running DRBD on RAID-1 arrays to a single system running RAID-6 with redundant PSUs etc the performance loss from DRBD is a serious problem that can push the economic benefit back towards the single system.</p>
<p>Next I will benchmark DRBD on RAID-1 and test the performance hit of using bitmaps with Linux software RAID-1.</p>
<p>If anyone knows how to make a HTML table look good then please let me know. It seems that the new blog theme that I&#8217;m using prevents borders.</p>
<p><b>Update:</b></p>
<p>I added a mention of my Debian bug report about the mount option and of the fact that it&#8217;s all on Debian/Squeeze.</p>
<ul>
<li>[1]<a href="http://doc.coker.com.au/projects/postal/"> http://doc.coker.com.au/projects/postal/</a></li>
<li>[2]<a href="http://www.drbd.org/users-guide/s-replication-protocols.html"> http://www.drbd.org/users-guide/s-replication-protocols.html</a></li>
<li>[3]<a href="http://support.dell.com/support/edocs/storage/7j376/specs.htm"> http://support.dell.com/support/edocs/storage/7j376/specs.htm</a></li>
</ul>
<p>Related posts:</p><ol>
<li><a href='http://etbe.coker.com.au/2009/03/09/i-need-an-lmtp-server/' rel='bookmark' title='I need an LMTP server'>I need an LMTP server</a> <small>I am working on a system where a front-end mail...</small></li>
<li><a href='http://etbe.coker.com.au/2011/12/17/drbd-notes/' rel='bookmark' title='Some Notes on DRBD'>Some Notes on DRBD</a> <small>DRBD is a system for replicating a block device across...</small></li>
<li><a href='http://etbe.coker.com.au/2007/04/26/paper-about-zcav/' rel='bookmark' title='paper about ZCAV'>paper about ZCAV</a> <small>This paper by Rodney Van Meter about ZCAV (Zoned Constant...</small></li>
</ol>]]></content:encoded>
			<wfw:commentRss>http://etbe.coker.com.au/2012/01/05/drbd-benchmarking/feed/</wfw:commentRss>
		<slash:comments>7</slash:comments>
		</item>
		<item>
		<title>Released Bonnie++ 1.96</title>
		<link>http://etbe.coker.com.au/2009/07/05/released-bonnie-1-96/</link>
		<comments>http://etbe.coker.com.au/2009/07/05/released-bonnie-1-96/#comments</comments>
		<pubDate>Sun, 05 Jul 2009 07:50:38 +0000</pubDate>
		<dc:creator>etbe</dc:creator>
				<category><![CDATA[Benchmark]]></category>

		<guid isPermaLink="false">http://etbe.coker.com.au/?p=1232</guid>
		<description><![CDATA[I have released version 1.96 of Bonnie++ in the experimental branch [1]. The main changes are: Made it compile on Solaris again (version 1.95 broke that) Now supports more files for the small file creation test (16^10 files is the limit), and it handles an overflow better. Incidentally this will in some situations change the [...]]]></description>
			<content:encoded><![CDATA[<p>I have <a href="http://www.coker.com.au/bonnie++/experimental/">released version 1.96 of Bonnie++ in the experimental branch [1]</a>.</p>
<p>The main changes are:</p>
<ol>
<li>Made it compile on Solaris again (version 1.95 broke that)</li>
<li>Now supports more files for the small file creation test (16^10 files is the limit), and it handles an overflow better.  Incidentally this will in some situations change the results so I changed the result version in the CSV file.</li>
<li>Fixed some bugs in bon_csv2html and added some new features to give nicer looking displays and correct colors</li>
</ol>
<p>I still plan to add support for semi-random data and validation of data when reading it back before making a 2.0 release.  But 2.0 is getting close.</p>
<ul>
<li>[1]<a href="http://www.coker.com.au/bonnie++/experimental/"> http://www.coker.com.au/bonnie++/experimental/</a></li>
</ul>
<p>Related posts:</p><ol>
<li><a href='http://etbe.coker.com.au/2007/12/03/new-bonnie-releases/' rel='bookmark' title='New Bonnie++ Releases'>New Bonnie++ Releases</a> <small>Today I released new versions of my Bonnie++ [1] benchmark....</small></li>
<li><a href='http://etbe.coker.com.au/2008/12/10/new-version-of-bonnie-and-violin-memory/' rel='bookmark' title='New version of Bonnie++ and Violin Memory'>New version of Bonnie++ and Violin Memory</a> <small>I have just released version 1.03e of my Bonnie++ benchmark...</small></li>
<li><a href='http://etbe.coker.com.au/2007/07/29/bonnie-and-postal-shirts/' rel='bookmark' title='Bonnie++ and Postal shirts'>Bonnie++ and Postal shirts</a> <small>Dear lazyweb, I want to design T-Shirts for my Bonnie++...</small></li>
</ol>]]></content:encoded>
			<wfw:commentRss>http://etbe.coker.com.au/2009/07/05/released-bonnie-1-96/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>Vibration and Strange SATA Performance</title>
		<link>http://etbe.coker.com.au/2009/04/22/vibration-strange-sata-performance/</link>
		<comments>http://etbe.coker.com.au/2009/04/22/vibration-strange-sata-performance/#comments</comments>
		<pubDate>Wed, 22 Apr 2009 09:00:47 +0000</pubDate>
		<dc:creator>etbe</dc:creator>
				<category><![CDATA[Benchmark]]></category>

		<guid isPermaLink="false">http://etbe.coker.com.au/?p=1136</guid>
		<description><![CDATA[Almost two years ago I blogged about a strange performance problem with SATA disks [1]. The problem was that certain regions of a disk gave poor linear read performance on some machines, but performed well on machines which appeared to be identical. I discovered what the problem was shortly after that but was prevented from [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://etbe.coker.com.au/2007/06/27/strange-sata-disk-performance/">Almost two years ago I blogged about a strange performance problem with SATA disks [1]</a>.  The problem was that certain regions of a disk gave poor linear read performance on some machines, but performed well on machines which appeared to be identical.  I discovered what the problem was shortly after that but was prevented from disclosing the solution due to an SGI NDA.  The fact that SGI now no longer exists as a separate company decreases my obligations under the NDA.  The fact that <a href="http://wiki.cita.utoronto.ca/mediawiki/index.php/Xe">the sysadmins of the University of Toronto published all the most important data entirely removes my obligations in this regard [2]</a>.</p>
<p>In their Wiki they write &#8220;<b>after SGI installed rubber grommits around the 5 or 6 tiny fans in the xe210 nodes, the read and write plots now look like</b>&#8221; and then some graphs showing good disk performance appear.</p>
<p>The problem was that a certain brand and model of disk was particularly sensitive to vibrations.  When that model of disk was installed in some machines then the vibrations would interfere with disk reads.  It seems that there was some sort of harmonic frequency between the vibration of the disk and that of the cooling fans which explains why some sections of the disk were read slowly and some gave normal performance (my previous post has the graphs which show a pattern).  Some other servers of the same make and model didn&#8217;t have that problem, so it seemed that some slight manufacturing differences in the machines determined whether the vibration would affect the disk performance.</p>
<p>One thing that I&#8217;ve been meaning to do is to test the performance of disks while being vibrated.  I was thinking of getting a large bass speaker, a powerful amplifier, and using the sound hardware in a PC to produce a range of frequencies.  Then having the hard disk securely attached to a piece of plywood which would be in front of the speaker.  But as I haven&#8217;t had time to do this over the last couple of years it seems unlikely that I will do it any time soon.  Hopefully this blog post will inspire someone to do such tests.  One thing to note if you want to do this is that it&#8217;s quite likely to damage the speaker, powerful bass sounds that are sustained can melt parts of the coil in a speaker.  So buy a speaker second-hand.</p>
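<p>For anyone who wants to try this, producing the range of frequencies could look something like the sketch below. It writes a sine sweep to a WAV file that any audio player can send to the amplifier. The function name <b>write_sweep</b>, the 20Hz to 200Hz range, the duration and the volume are all assumptions chosen for illustration, not values I have tested against a disk.</p>

```python
import math
import wave

def write_sweep(path, start_hz=20.0, end_hz=200.0, seconds=60.0, rate=44100):
    """Write a mono 16-bit WAV file containing a linear sine sweep.

    All defaults are guesses for illustration, not tested values.
    Returns the number of samples written.
    """
    total = int(seconds * rate)
    frames = bytearray()
    phase = 0.0
    for i in range(total):
        # Frequency rises linearly from start_hz to end_hz over the run.
        freq = start_hz + (end_hz - start_hz) * i / total
        phase += 2.0 * math.pi * freq / rate
        # 80% of full scale, stored as a little-endian signed 16-bit sample.
        sample = int(32767 * 0.8 * math.sin(phase))
        frames += sample.to_bytes(2, "little", signed=True)
    with wave.open(path, "wb") as out:
        out.setnchannels(1)
        out.setsampwidth(2)
        out.setframerate(rate)
        out.writeframes(bytes(frames))
    return total
```

<p>Calling write_sweep with a file name such as sweep.wav produces a one minute sweep. Running a disk read benchmark while it plays and noting the time of any slow regions would show which frequencies matter.</p>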
<p>If someone in my region (Melbourne) wants to try this then I can donate some old IDE disks.  I can offer advice on how to run the tests for anyone who is interested.</p>
<p>Also it&#8217;s worth considering that systems which make less noise might deliver better performance.</p>
<ul>
<li>[1]<a href="http://etbe.coker.com.au/2007/06/27/strange-sata-disk-performance/"> http://etbe.coker.com.au/2007/06/27/strange-sata-disk-performance/</a></li>
<li>[2]<a href="http://wiki.cita.utoronto.ca/mediawiki/index.php/Xe"> http://wiki.cita.utoronto.ca/mediawiki/index.php/Xe</a></li>
</ul>
<p>Related posts:</p><ol>
<li><a href='http://etbe.coker.com.au/2007/06/27/strange-sata-disk-performance/' rel='bookmark' title='Strange SATA Disk Performance'>Strange SATA Disk Performance</a> <small>Below is a GNUPlot graph of ZCAV output from a...</small></li>
<li><a href='http://etbe.coker.com.au/2007/04/26/paper-about-zcav/' rel='bookmark' title='paper about ZCAV'>paper about ZCAV</a> <small>This paper by Rodney Van Meter about ZCAV (Zoned Constant...</small></li>
<li><a href='http://etbe.coker.com.au/2007/07/02/new-storage-developments/' rel='bookmark' title='New Storage Developments'>New Storage Developments</a> <small>Eweek has an article on a new 1TB Seagate drive....</small></li>
</ol>]]></content:encoded>
			<wfw:commentRss>http://etbe.coker.com.au/2009/04/22/vibration-strange-sata-performance/feed/</wfw:commentRss>
		<slash:comments>8</slash:comments>
		</item>
		<item>
		<title>New version of Bonnie++ and Violin Memory</title>
		<link>http://etbe.coker.com.au/2008/12/10/new-version-of-bonnie-and-violin-memory/</link>
		<comments>http://etbe.coker.com.au/2008/12/10/new-version-of-bonnie-and-violin-memory/#comments</comments>
		<pubDate>Wed, 10 Dec 2008 08:33:15 +0000</pubDate>
		<dc:creator>etbe</dc:creator>
				<category><![CDATA[Benchmark]]></category>
		<category><![CDATA[tech]]></category>

		<guid isPermaLink="false">http://etbe.coker.com.au/?p=968</guid>
		<description><![CDATA[I have just released version 1.03e of my Bonnie++ benchmark [1]. The only change is support for direct IO in Bonnie++ (via the -D command-line parameter). The patch for this was written by Dave Murch of Violin Memory [2]. Violin specialise in 2RU storage servers based on DRAM and/or Flash storage. One of their products [...]]]></description>
			<content:encoded><![CDATA[<p>I have just released version 1.03e of <a href="http://www.coker.com.au/bonnie++/">my Bonnie++ benchmark [1]</a>.  The only change is support for direct IO in Bonnie++ (via the <b>-D</b> command-line parameter).  The patch for this was written by Dave Murch of <a href="http://violin-memory.com/">Violin Memory [2]</a>.  Violin specialise in 2RU storage servers based on DRAM and/or Flash storage.  One of their products is designed to handle a sustained load of 100,000 write IOPS (in 4K blocks) and 200,000 read IOPS per second for it&#8217;s 10 year life (but it&#8217;s not clear whether you could do 100,000 writes AND 200,000 reads in a second).  The only pricing information that they have online is a claim that flash costs less than $50 per gig, while that would be quite affordable for dozens of gigs and not really expensive for hundreds of gigs, as they are discussing a device with 4TB capacity it sounds rather expensive &#8211; but of course it would be a lot cheaper than using hard disks if you need that combination of capacity and performance.</p>
<p>I wonder how much benefit you would get from using a Violin device to manage the journals for 100 servers in a data center.  It seems that 1000 writes per second is near the upper end of the capacity of a 2RU server for many common work-loads, this is of course just a rough estimation based on observations of some servers that I run.  If the main storage was on a SAN then using data journaling and putting the journals on a Violin device seems likely to improve latency (data is committed faster and the application can report success to the client sooner) while also reducing the load on the SAN disks (which are really expensive).</p>
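<p>The rough sizing behind that idea can be written down as a few lines of arithmetic. This sketch only uses the estimates from this post; it ignores the fact that one application write may turn into more than one journal write.</p>

```python
servers = 100                  # number of servers in the scenario above
writes_per_server = 1000       # rough upper-end estimate per 2RU server
device_write_iops = 100_000    # sustained write rating quoted for the device

aggregate = servers * writes_per_server
print(f"aggregate journal writes: {aggregate} per second")
print(f"share of device write capacity: {aggregate * 100 // device_write_iops}%")
```

<p>So 100 servers at the upper end of that estimate would use the whole rated write capacity, which suggests that the real limit would be somewhat below 100 servers per device.</p>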
<p>Now given that their price point is less than $50 per gig, it seems that a virtual hosting provider could provide really fast storage to their customers for a quite affordable price.  $5 per month per gig for flash storage in a virtual hosting environment would be an attractive option for many people.  Currently if you have a small service that you want hosted a virtual server is the best way to do it, and as most providers offer little information on the disk IO capacity of their services it seems quite unlikely that anyone has taken any serious steps to prevent high load from one customer from degrading the performance of the rest. With flash storage you not only get a much higher number of writes per second, but one customer writing data won&#8217;t seriously impact read speed for other customers (with hard drives one process that does a lot of writes can cripple the performance of processes that do reads).</p>
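<p>A back-of-the-envelope version of the pricing argument, using only the figures quoted in this post. It assumes 1TB is 1024 gigs and ignores every cost other than the flash itself (chassis, power, hosting, support), so it is an illustration and not a business case.</p>

```python
price_per_gig = 50.0         # upper bound quoted by the vendor, in dollars
capacity_gigs = 4 * 1024     # 4TB device, assuming 1TB = 1024 gigs
monthly_rent_per_gig = 5.0   # the hosting price suggested above

max_device_cost = price_per_gig * capacity_gigs
months_to_recoup = price_per_gig / monthly_rent_per_gig
print(f"upper bound on flash cost for 4TB: ${max_device_cost:,.0f}")
print(f"months of rent to cover the flash cost: {months_to_recoup:.0f}")
```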
<p>The experimental versions of Bonnie++ have better support for testing some of these usage scenarios.  One new feature is measuring the worst-case latency of all operations in each section of the test run.  I will soon release Bonnie++ version 1.99 which includes direct IO support, it should show some significant benefits for all usage cases involving Violin devices, ZFS (when configured with multiple types of storage hardware), NetApp Filers, and other advanced storage options.</p>
<p>For a while I have been dithering about the exact feature list of Bonnie++ 2.x.  After some pressure from a contributor to the OpenSolaris project I have decided to freeze the feature list at the current 1.94 level plus direct IO support.  This doesn&#8217;t mean that I will stop adding new features in the 2.0x branch, but I will avoid doing anything that can change the results.  So in future benchmark results made from Bonnie++ version 1.94 can be directly compared to results that will be made from version 2.0 and above.  There is one minor issue, new versions of GCC have in the past made differences to some of the benchmark results (the per-character IO test was the main one)  &#8211; but that&#8217;s not my problem.  As far as I am concerned Bonnie++ benchmarks everything from the compiler to the mass storage device in terms of disk IO performance.  If you compare two systems with different kernels, different versions of GCC, or other differences then it&#8217;s up to you to make appropriate notes of what was changed.</p>
<p>This means that the OpenSolaris people can now cease using the 1.0x branch of Bonnie++, and other distributions can do the same if they wish.  I have just uploaded version 1.03e to Debian and will request that it goes in Lenny &#8211; I believe that it is way too late to put 1.9x in Lenny.  But once Lenny is released I will upload version 2.00 to Debian/Unstable and that will be the only version supported in Debian after that time.</p>
<ul>
<li>[1]<a href="http://www.coker.com.au/bonnie++/"> http://www.coker.com.au/bonnie++/</a></li>
<li>[2]<a href="http://violin-memory.com/"> http://violin-memory.com/</a></li>
</ul>
<p>Related posts:</p><ol>
<li><a href='http://etbe.coker.com.au/2007/12/03/new-bonnie-releases/' rel='bookmark' title='New Bonnie++ Releases'>New Bonnie++ Releases</a> <small>Today I released new versions of my Bonnie++ [1] benchmark....</small></li>
<li><a href='http://etbe.coker.com.au/2007/07/29/bonnie-and-postal-shirts/' rel='bookmark' title='Bonnie++ and Postal shirts'>Bonnie++ and Postal shirts</a> <small>Dear lazyweb, I want to design T-Shirts for my Bonnie++...</small></li>
<li><a href='http://etbe.coker.com.au/2006/08/13/big-and-cheap-usb-flash-devices/' rel='bookmark' title='big and cheap USB flash devices'>big and cheap USB flash devices</a> <small>It&#8217;s often the case with technology that serious changes occur...</small></li>
</ol>]]></content:encoded>
			<wfw:commentRss>http://etbe.coker.com.au/2008/12/10/new-version-of-bonnie-and-violin-memory/feed/</wfw:commentRss>
		<slash:comments>7</slash:comments>
		</item>
		<item>
		<title>New HP Server</title>
		<link>http://etbe.coker.com.au/2008/08/05/new-hp-server/</link>
		<comments>http://etbe.coker.com.au/2008/08/05/new-hp-server/#comments</comments>
		<pubDate>Tue, 05 Aug 2008 03:19:16 +0000</pubDate>
		<dc:creator>etbe</dc:creator>
				<category><![CDATA[Benchmark]]></category>

		<guid isPermaLink="false">http://etbe.coker.com.au/?p=683</guid>
		<description><![CDATA[I&#8217;ve just started work on a new HP server running RHEL5 AS (needs to be AS to support more than 4 DomU&#8217;s). While I still have the Xen issues that made me give up using it on Debian [1] (the killer one being that an AMD64 Xen Dom0 would kernel panic on any serious disk [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve just started work on a new HP server running RHEL5 AS (needs to be AS to support more than 4 DomU&#8217;s).  While <a href="http://etbe.coker.com.au/2008/05/02/the-future-of-xen/">I still have the Xen issues that made me give up using it on Debian [1]</a> (the killer one being that an AMD64 Xen Dom0 would kernel panic on any serious disk IO) but the Xen implementation in RHEL is quite solid.</p>
<p>The first thing I did was run <a href="http://www.coker.com.au/bonnie++/">zcav (part of my Bonnie++ benchmark suite) [2]</a> to see how the array performs (as well as ensuring that the entire array actually appears to work).  The result is below.  For a single disk performance is expected to decrease as you read along the disk (from outer to inner tracks).  I don&#8217;t know why the performance decreases until the half-way point and then starts with good performance again and again decreases.</p>
<p><img alt="zcav results from HP CCISS RAID-6 array" src="http://www.coker.com.au/bonnie++/zcav/raid/raid.png" /></p>
<p>The next thing was to ensure that the machine had RAID-6 (I have been convinced that using only RAID-5 verges on professional malpractice).  As the machine is rented from a hosting company there was no guarantee that they would follow my clear written instructions involving running RAID-6.</p>
<p>The machine is a HP rack-mounted server with a CCISS RAID controller, so to manage the array the command <b>/usr/sbin/hpacucli</b> is used.</p>
<p>The command <b>hpacucli controller all show</b> reveals that there is a &#8220;<b>Smart Array P400 in Slot 1</b>&#8221;.</p>
<p>The command <b>hpacucli controller slot=1 show</b> gives the following (amongst a lot of other output):<br />
RAID 6 (ADG) Status: Enabled<br />
Cache Board Present: True<br />
Cache Status: OK<br />
Accelerator Ratio: 25% Read / 75% Write<br />
Drive Write Cache: Disabled<br />
Total Cache Size: 512 MB<br />
Battery Pack Count: 1<br />
Battery Status: OK<br />
SATA NCQ Supported: True</p>
<p>So the write-back cache is enabled, 384M of data is for the write-back cache and 128M is for the read cache (hopefully all for read-ahead &#8211; the OS should do all the real caching for reads).</p>
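<p>The cache split can be derived from two lines of the output above. The sketch below assumes that the output always uses exactly the wording shown in this post; the function name <b>cache_split</b> is made up for the example.</p>

```python
import re

# Two lines copied from the controller output quoted above.
output = """Accelerator Ratio: 25% Read / 75% Write
Total Cache Size: 512 MB"""

def cache_split(text):
    """Return (read_mb, write_mb) from the ratio and total cache size lines."""
    ratio = re.search(r"Accelerator Ratio: (\d+)% Read / (\d+)% Write", text)
    size = re.search(r"Total Cache Size: (\d+) MB", text)
    total = int(size.group(1))
    read_mb = total * int(ratio.group(1)) // 100
    write_mb = total * int(ratio.group(2)) // 100
    return read_mb, write_mb

read_mb, write_mb = cache_split(output)
print(f"read cache: {read_mb}M, write-back cache: {write_mb}M")
```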
<p>The command <b>hpacucli controller slot=1 array all show</b> reveals that there is one array: &#8220;<b>array A (SAS, Unused Space: 0 MB)</b>&#8221;.</p>
<p>The command <b>hpacucli controller slot=1 array a show status</b> tells me that the status is &#8220;<b>array AOK</b>&#8221;.</p>
<p>Finally the command <b>hpacucli controller slot=1 show config</b> gives me the data that I really want at this time and says:<br />
Smart Array P400 in Slot 1  (sn: *****)<br />
array A (SAS, Unused Space: 0 MB)<br />
logicaldrive 1 (820.2 GB, RAID 6 (ADG), OK)</p>
<p>Then it gives all the data on the disks.  It would be nice if there was a command to just dump all that.  I would like to be able to show the configuration of all controllers with a single command.</p>
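<p>Until such a command exists, a small wrapper could get close to it. This is a sketch under assumptions: it only uses the two hpacucli commands shown in this post, and it assumes that every controller is listed with the same &#8220;in Slot N&#8221; wording as the one controller in this machine. The function names are hypothetical.</p>

```python
import re

def controller_slots(all_show_output):
    """Extract slot numbers from 'hpacucli controller all show' output.

    Assumes each controller line contains the text 'in Slot N'.
    """
    return [int(n) for n in re.findall(r"in Slot (\d+)", all_show_output)]

def config_commands(slots):
    """Build one 'show config' command line per controller slot."""
    return [["hpacucli", "controller", f"slot={n}", "show", "config"] for n in slots]

sample = "Smart Array P400 in Slot 1"
for command in config_commands(controller_slots(sample)):
    print(" ".join(command))
    # On a real system each command could be run with subprocess.run(command).
```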
<p>Also it would be nice if the fact that <b>hpacucli</b> is the tool to use for managing CCISS RAID arrays when using Linux on HP servers was more widely documented.  It took me an unreasonable amount of effort to discover what tool to use for CCISS RAID management.</p>
<ul>
<li>[1] <a href="http://etbe.coker.com.au/2008/05/02/the-future-of-xen/">http://etbe.coker.com.au/2008/05/02/the-future-of-xen/</a></li>
<li>[2] <a href="http://www.coker.com.au/bonnie++/">http://www.coker.com.au/bonnie++/</a></li>
</ul>
<p>Related posts:</p><ol>
<li><a href='http://etbe.coker.com.au/2007/11/21/raid-and-bus-bandwidth/' rel='bookmark' title='RAID and Bus Bandwidth'>RAID and Bus Bandwidth</a> <small>As correctly pointed out by cmot [1] my previous post...</small></li>
<li><a href='http://etbe.coker.com.au/2007/11/16/software-vs-hardware-raid/' rel='bookmark' title='Software vs Hardware RAID'>Software vs Hardware RAID</a> <small>Should you use software or hardware RAID? Many people claim...</small></li>
<li><a href='http://etbe.coker.com.au/2008/06/03/moving-a-mail-server/' rel='bookmark' title='Moving a Mail Server'>Moving a Mail Server</a> <small>Nowadays it seems that most serious mail servers (IE mail...</small></li>
</ol>]]></content:encoded>
			<wfw:commentRss>http://etbe.coker.com.au/2008/08/05/new-hp-server/feed/</wfw:commentRss>
		<slash:comments>7</slash:comments>
		</item>
		<item>
		<title>New ZCAV Development</title>
		<link>http://etbe.coker.com.au/2008/07/08/new-zcav-development/</link>
		<comments>http://etbe.coker.com.au/2008/07/08/new-zcav-development/#comments</comments>
		<pubDate>Mon, 07 Jul 2008 21:00:20 +0000</pubDate>
		<dc:creator>etbe</dc:creator>
				<category><![CDATA[Benchmark]]></category>

		<guid isPermaLink="false">http://etbe.coker.com.au/?p=633</guid>
		<description><![CDATA[I have just been running some ZCAV tests on some new supposedly 1TB disks (10^12 bytes is about 931*2^30 so is about 931G according to almost everyone in the computer industry who doesn&#8217;t work for a hard disk vendor). I&#8217;ve added a new graph to my ZCAV results page [1] with the results. One interesting [...]]]></description>
			<content:encoded><![CDATA[<p>I have just been running some ZCAV tests on some new supposedly 1TB disks (10^12 bytes is about 931*2^30 so is about 931G according to almost everyone in the computer industry who doesn&#8217;t work for a hard disk vendor).</p>
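<p>The conversion is a single division. A vendor terabyte is 10^12 bytes, and the sketch below divides that by 2^30 to get the size in binary gigabytes.</p>

```python
vendor_tb_bytes = 10 ** 12   # 1TB as defined by hard disk vendors
binary_gig = 2 ** 30         # 1G as most of the computer industry counts it

size_in_gigs = vendor_tb_bytes / binary_gig
print(f"{size_in_gigs:.1f}G")
```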
<p>I&#8217;ve added a new graph to my <a href="http://www.coker.com.au/bonnie++/zcav/results.html">ZCAV results page [1]</a> with the results.</p>
<p>One interesting thing that I discovered is that the faster disks can deliver contiguous data at a speed of more than 110MB/s, previously the best I&#8217;d seen from a single disk was about 90MB/s.  When I first wrote ZCAV the best disks I had to test with all had a maximum speed of about 10MB/s so KB/s was a reasonable unit.  Now I plan to change the units to MB/s to make it easier to read the graphs.  Of course it&#8217;s not that difficult to munge the data before graphing it, but I think that it will give a better result for most users if I just change the units.</p>
<p>The next interesting thing I discovered is that GNUplot defaults to using exponential notation at the value of 1,000,000 (or 1e+06).  I&#8217;m sure that I could override that but it would still make it difficult to read for the users.  So I guess it&#8217;s time to change the units to GB.</p>
<p>I idly considered using the hard drive manufacturer&#8217;s definition of GB so that a 1TB disk would actually display as having 1000GB (<a href="http://en.wikipedia.org/wiki/Gibibyte">the Wikipedia page for Gibibyte has the different definitions [2]</a>).  But of course having decimal and binary prefixes used in the X and Y axis of a graph would be a horror.  Also the block and chunk sizes used have to be multiples of a reasonably large power of two (at least 2^14) to get reasonable performance from the OS.</p>
<p>The next implication of this is that it&#8217;s a bad idea to have a default block size that is not a power of two.  The previous block sizes were 100M and 200M (for 1.0x and 1.9x branches respectively).  Expressing these as 0.0976G and 0.1953G respectively would not be user-friendly.  So I&#8217;m currently planning on 0.25G as the block size for both branches.</p>
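<p>The awkward fractions come from dividing by 1024. A tiny sketch of the conversion, with a made-up helper name, shows why a power of two such as 256M is easier to express:</p>

```python
def mb_to_gb(megabytes):
    """Convert binary megabytes to binary gigabytes."""
    return megabytes / 1024

# The two old block sizes and the planned 0.25G (256M) block size.
for block_mb in (100, 200, 256):
    print(f"{block_mb}M = {mb_to_gb(block_mb):.4f}G")
```

<p>Note that this prints rounded values, so the last digit for 100M can differ from a truncated figure.</p>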
<p>While changing the format it makes sense to change as many things as possible at once to reduce the number of incompatible file formats that are out there.  The next thing I&#8217;m considering is the precision.  In the past the speed in K/s was an integer.  Obviously an integer for the speed in M/s is not going to work well for some of the slower devices that are still in use (EG a 4* CD-ROM drive maxes out at 600KB/s).  Of course the accuracy of this is determined by the accuracy of the system clock.  The gettimeofday() system call returns the time in micro-seconds.  I expect that most systems don&#8217;t approach micro-second accuracy.  I expect that it&#8217;s not worth reporting with a precision that is greater than the accuracy.  Then there&#8217;s no point in making the precision of the speed any greater than the precision of the time.</p>
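<p>One way to put a number on this: speed is bytes divided by time, so the relative error of the speed is about the clock error divided by the elapsed time. The sketch below turns that into a count of significant digits. The 1ms clock accuracy is purely an assumption for illustration, not a measurement, and the function name is hypothetical.</p>

```python
import math

def justified_digits(elapsed_seconds, clock_error_seconds):
    """Significant digits of a speed figure supported by the timing accuracy.

    The relative error of bytes/time is roughly clock_error/elapsed, so the
    number of meaningful digits is about -log10 of that ratio.
    """
    relative_error = clock_error_seconds / elapsed_seconds
    return max(1, math.floor(-math.log10(relative_error)))

block_mib = 256      # a 0.25G block
speed_mib_s = 110    # the fast disk speed mentioned earlier in this post
elapsed = block_mib / speed_mib_s
# Assumed clock accuracy of 1ms; a guess for illustration only.
print(justified_digits(elapsed, 0.001))
```

<p>Under those assumptions about three significant digits are justified, so one decimal place on a speed in the hundreds of M/s would be about the limit.</p>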
<p>Things were easier with the Bonnie++ program when I just reduced the precision as needed to fit in an 80 column display.  ;)</p>
<p>Finally I ran my tests on my new Dell T105 system.  While I didn&#8217;t get time to do as many tests as I desired before putting the machine in production I did get to do a quick test of two disks running at full speed.  Previously when testing desktop systems I had not found a system which when run with two disks of the same age as the machine could extract full performance from both disks simultaneously.  While the Dell T105 is a server-class system, it is a rather low-end server and I had anticipated that it would lack performance in this regard.  I was pleased to note that I could run both 1TB disks at full speed at the same time.  I didn&#8217;t get a chance to test three or four disks though (maybe for scheduled down-time in the future).</p>
<ul>
<li>[1] <a href="http://www.coker.com.au/bonnie++/zcav/results.html">http://www.coker.com.au/bonnie++/zcav/results.html</a></li>
<li>[2] <a href="http://en.wikipedia.org/wiki/Gibibyte">http://en.wikipedia.org/wiki/Gibibyte</a></li>
</ul>
<p>Related posts:</p><ol>
<li><a href='http://etbe.coker.com.au/2007/04/26/paper-about-zcav/' rel='bookmark' title='paper about ZCAV'>paper about ZCAV</a> <small>This paper by Rodney Van Meter about ZCAV (Zoned Constant...</small></li>
<li><a href='http://etbe.coker.com.au/2008/01/24/how-i-partition-disks/' rel='bookmark' title='How I Partition Disks'>How I Partition Disks</a> <small>Having had a number of hard drives fail over the...</small></li>
<li><a href='http://etbe.coker.com.au/2007/11/21/raid-and-bus-bandwidth/' rel='bookmark' title='RAID and Bus Bandwidth'>RAID and Bus Bandwidth</a> <small>As correctly pointed out by cmot [1] my previous post...</small></li>
</ol>]]></content:encoded>
			<wfw:commentRss>http://etbe.coker.com.au/2008/07/08/new-zcav-development/feed/</wfw:commentRss>
		<slash:comments>7</slash:comments>
		</item>
	</channel>
</rss>

