In February I replaced a Dell T320 server with a HP Z640 workstation for a home server/workstation [1]. The T320 has 8*3.5″ drive bays which I had used to put 3*4TB disks in a BTRFS RAID-10 array for 6TB of usable capacity. The Z640 has only 2*3.5″ bays and 4*2.5″ bays, so one option would have been to buy a 4TB 2.5″ SSD and keep the same 3*4TB array as before. Instead I chose to use an 8TB disk I had spare in an array with one of the original 4TB disks and some extra space on NVMe devices (the system has 2*1TB NVMe devices which are used as a 380G RAID-1 for the root filesystem, with the rest going to the storage array). It’s nice how BTRFS allows putting any storage you have into a RAID-10 configuration.
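For anyone who hasn’t done this with BTRFS, there’s nothing special needed to mix device sizes; a minimal sketch of the commands (with hypothetical device names and mount point, not my actual layout) is:

# create a new filesystem with data and metadata both in RAID-10 across mixed-size devices
mkfs.btrfs -d raid10 -m raid10 /dev/sda /dev/sdb /dev/nvme0n1p3 /dev/nvme1n1p3
# or add a device to an existing filesystem and convert it to RAID-10 with a balance
btrfs device add /dev/sdb /storage
btrfs balance start -dconvert=raid10 -mconvert=raid10 /storage

BTRFS allocates chunks across whichever devices have the most free space, which is why odd combinations of disk sizes work.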
Unfortunately it seems that I chose the wrong 4TB disk to use for this, as it failed three days ago. It gave thousands of read and write errors and Linux decided that the drive no longer existed. I tried rebooting the system to get it back into the BTRFS array, but it failed again, and so quickly that it wasn’t even possible to use the data on it as part of a RAID rebuild. So I removed that disk and put in one of the other 4TB disks.
As the array is made up of an 8TB disk and three other devices that don’t add up to 8TB, the layout has one copy of everything on the 8TB disk and the other copies spread across the other devices. So the rebuild process consisted of copying data from the 8TB disk to the new 4TB disk. For a RAID-1 array run in the manner of Linux software RAID the rebuild involves a linear copy of data, which is the optimal case for hard disks; copying 4TB of data in that manner would average a bit over 100MB/s and take about 11 hours. With BTRFS the source disk has to be updated for each block that is recreated, so the process was bottlenecked on writing to the 8TB disk. It took 2 days and 23 hours to complete. The process involved reading 3,478,031MB and writing 4,405,545MB. The system was live for the duration and some cron jobs etc were writing to the array, but in the 12 hours since the rebuild completed the array has had 7,038MB written. So presumably during the rebuild about 42G of actual data were written to the array and the other 4.3TB written to the 8TB disk came from the process of copying 3.5TB from it to another device. Iostat reported 645.36 TPS for the duration of the rebuild, which seems like a decent number for a hard drive, and it reported that the drive had 99%+ of its IO capacity in use for the whole time.
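For anyone wanting to watch a rebuild like this, something like the following is all that’s needed (the mount point is hypothetical):

# per-device TPS, throughput, and %util every 60 seconds
iostat -xm 60
# how much data is allocated to each device in the array
btrfs device usage /storage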
While waiting for this to complete I wrote a blog post about storage trends [2]. One thing I didn’t mention in that post is that if you are the type of person who checks the rebuild progress fifty times a day then that should be counted as part of the cost of using slow storage. If, instead of an 8TB disk plus some SSD storage, I had used 2*4TB disks and 1*4TB SSD as I had considered doing, then instead of having 3.8TB on one device I would have had about 2.5TB and the rebuild would probably have taken 2/3 of the time. If I had moved the array to 3*4TB SSDs then it would have taken a small fraction of the time.
One thing to note is that I made a mistake in this operation by removing the failed device instead of doing a “btrfs replace” operation, which can be significantly faster. If I had done that then I would be writing a blog post about the rebuild taking 2 days or so; the issues of hard drives being slow and me compulsively checking the progress would still apply.
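For reference, a minimal sketch of the better approach (the devid and device names here are hypothetical, the real devid comes from the usage output):

# find the devid of the failed or missing device
btrfs device usage /storage
# replace that devid with the new disk in place, -r avoids reading from the failing source device
btrfs replace start -r 3 /dev/sdc /storage
# check progress
btrfs replace status /storage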
It’s been 2 years since my last blog post about storage trends [1].
Minimum Storage <=2TB
In 2021 I stated that as MSY had 2TB disks for $72 and 2TB SSDs for $245 it was barely worth considering a 2TB disk and anything less than 2TB wasn’t worth considering. Now for 2TB of storage from MSY NVMe starts at $129, SATA SSD starts at $143, and hard disks start at $75. I guess that NVMe is slightly cheaper due to some combination of economies of scale for manufacture/sales and lower shipping costs. It really doesn’t make sense to consider hard disks for storing 2TB or less.
For storage for a small system (PC or laptop) the cheapest storage device is $19 for a 128G SATA SSD. But it wouldn’t make sense to buy that when you can get a 256G SATA SSD for $22 or a 240G NVMe device for $23, saving $3 on storage wouldn’t make any sense. For 512G of storage the prices are $32 for NVMe and $33 for SATA SSD. For 1TB of storage the prices start at $68 for SATA SSD and $74 for NVMe. Probably for the vast majority of home users 1TB of SATA SSD or NVMe is the minimum storage capacity to consider, the $50 price difference isn’t much when considering the entire price of a PC or laptop and anything less than 1TB will run out quickly with modern use.
Larger Storage 4TB+
The price for 4TB of storage from MSY is NVMe starting at $349, SATA SSD starting at $369, and hard disks starting at $115. If you need 4TB of RAID-1 storage then it might be worth saving $470 and getting hard drives for a home user. For business use it wouldn’t make sense. Some laptops have two NVMe sockets so 8TB of storage (or 4TB of RAID-1) in a laptop would be interesting.
For 8TB of storage the MSY prices are SATA SSD for $739 and hard drives starting at $179. Probably hard drives are the best choice for most situations where there is a need to store 8TB or more of data. But the prices are low enough to make 8TB SSD something that can be considered for home use, it doesn’t seem that long ago that the 4TB hard drives I bought for my home server were almost that expensive.
Big Storage
MSY doesn’t have 8TB NVMe, such devices are on eBay for $1700 for regular M.2 NVMe and just under $1000 for U.2 (server hot-swap devices). So if you need more than 8TB of NVMe storage then probably buying a server with U.2 built in is the correct solution.
For home users who need more than 8TB of storage hard drives are a good solution. One issue is that the more affordable and larger drives use Shingled Magnetic Recording (SMR) which has some different performance characteristics for certain workloads. Apparently SMR performs badly for anything other than large file storage.
Why MSY?
I primarily used MSY prices for this post because they are a reliable local store that has a list of prices that is easy to read. For everything in this post I can get better prices by using eBay, the StaticIce.com.au price comparison site [2], and the computing section of the OzBargain site that gamifies finding good prices [3]. But a good shopping strategy nowadays is to compare prices in a store to determine what items are in your price range and then shop around for price on the item you want. Checking all the different bargain sites for all these items would take much more time than I want to spend writing a blog post!
Conclusion
Hard drives don’t make sense for the vast majority of systems. Not for laptops, not for typical desktop PCs, and not for small business servers (say 8TB or less of RAID storage). Hard drives only make sense for dozens or hundreds of TB of storage and even then finding out how to deal with SMR issues is going to increase the pain of deployment. Maybe using a combination of SSD and hard drives to deal with the SMR issues is going to be a competitive advantage for NAS vendors in future.
NVMe looks like it’s on the way to being cheaper than SATA SSD. There is likely to be a good market for systems with NVMe as the only internal storage option.
The long term trend of systems without DVD drives, and with maybe 2.5″ SATA devices but no 3.5″ SATA devices, seems to lead to the GPU being the major component that determines the overall size of a PC case. Maybe there will be a new trend of GPUs connected to riser cards so they can sit parallel to the motherboard for compact PCs.
For business desktop systems (IE low powered graphics hardware as it’s not for gaming) I expect that the trend will be towards NUC type devices which are already based around M.2 as the storage form factor.
Interesting paper about a plan for eugenics in dogs with an aim to get human equivalent IQ within 100 generations [1]. It gets a bit silly when the author predicts IQs of 8000+ as there will eventually be limits of what can fit in one head. But the basic concept is good.
Interesting article about what happens inside a proton [2]. This makes some aspects of the Trisolar series and the Dragon’s Egg series seem less implausible.
Insightful article about how crypto-currencies really work [3]. Basically the vast majority of users trust some company that’s outside the scope of most financial regulations to act as their bank. Surprisingly the author doesn’t seem to identify such things as a Ponzi scheme.
Bruce Schneier wrote an interesting blog post about AIs as hackers [4].
Cory Doctorow wrote an insightful article titled “The ‘Enshittification’ of TikTok” which is about the enshittification of commercial Internet platforms in general [5]. We need more regulation of such things.
Cat Valente wrote an insightful article titled “Stop Talking to Each Other and Start Buying Things: Three Decades of Survival in the Desert of Social Media” about the desire to profit from social media repeatedly destroying platforms [6].
This Onion video has a good point, I don’t want to watch videos on news sites etc [7]. We need ad-blockers that can block video on all sites other than YouTube etc.
Wired has an interesting article about the machines that still need floppy disks, including early versions of the 747 [8]. There are devices to convert the floppy drive interface to a USB storage device which are being used on some systems but which presumably aren’t certified for a 747. The article says that 3.5″ disks cost $1 each because they are rare – that’s still cheaper than when they were first released.
Android Police has an interesting article about un-redacting information in PNG files [9]. It seems that some software on Pixel devices hasn’t been truncating files when editing them, just writing the new data over the top, and some platforms (notably Discord) send the entire file without parsing it (unlike Twitter for example which removes EXIF data to protect users). Then even though PNG data is compressed, the earlier data can be deduced from the later part of the file.
Teen Vogue has an insightful article about the harm that “influencer parents” do to their children [10].
Jonathan McDowell wrote a very informative blog post about his new RISC-V computer running Debian [11]. He says that it takes 10 hours to do a full Debian kernel build (compared to 14 minutes for my 18 core E5-2696) so it’s about 2% of the CPU speed of a high end 2015 server CPU, which is pretty good for an embedded device. That is similar to some of the low end Thinkpads that were on sale in 2015.
The Surviving Tomorrow site has an interesting article about a community where all property is community owned [12]. It’s an extremist Christian group and the article is written by a slightly different Christian extremist, but the organisation is interesting. A technology-positive atheist version of this would be good.
Bruce Schneier and Nathan E. Sanders co-wrote an insightful article about how AI could exploit the process of making laws [13]. We really need to crack down on political lobbying; any time a constitution is being amended a prohibition on lobbying should be included.
Anarcat wrote a very informative blog post about the Framework laptops that are designed to be upgraded by the user [14]. The motherboard can be replaced and there are cases designed so you can use the old laptop motherboard as an embedded PC. Before 2017 I would have been very interested in such a laptop. Now I’ve moved to low power laptops and use servers for serious compiles, and a second-hand Thinkpad X1 Carbon costs less than a new Framework motherboard. But this will be a really good product for people with more demanding needs than mine. Pity they don’t have a keyboard with the Thinkpad TrackPoint.
A couple of days ago I upgraded my home server from Debian/Bullseye to Debian/Testing (soon to be Bookworm). Since then KDE sessions on that system have had problems with the input queue locking up: the mouse can move and mouse-over events work, but clicking the mouse or pressing keys on the keyboard does nothing. Various web pages suggested that the xdotool program (in the xdotool package in Debian) can address this. The problem is apparently programs “grabbing” the input and not letting it go.
The command “xdotool key XF86LogGrabInfo” causes the Xorg server to dump information on its “grabs”. After running that command I looked in /var/log/Xorg.0.log and found that active grabs were only held by /usr/bin/kwin_x11 and /usr/bin/kglobalaccel5. So it seems like a KDE issue. Other systems running X11 with Debian/Testing (such as the laptop I’m using to write this blog post) don’t have the problem, so it could be something related to the KDE configuration of the account used on that system.
The command “xdotool key XF86Ungrab” is supposed to break out of such a grab, but for me didn’t do so.
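For anyone hitting the same problem, the diagnosis condenses to the following; the commands need to be run with DISPLAY pointed at the stuck session (eg from an ssh login), and the display number here is an assumption:

# from another login, target the stuck X session
export DISPLAY=:0
# ask the X server to log which clients hold active grabs
xdotool key XF86LogGrabInfo
# the grab information ends up in the Xorg log
grep -i grab /var/log/Xorg.0.log
# supposed to break the grab, but it didn’t work for me
xdotool key XF86Ungrab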
On the same system running KDE with Wayland works fine in this regard. Does Wayland do things differently and not allow this “grabbing” to block everything? Does KDE have an X11 specific bug? Is there a race condition that just gets triggered by the speed of Xorg on that system but not by the slightly different timings of Wayland? I might never find out.
I previously wrote about problems with Wayland/KDE on laptops [1]. Fortunately this bug happened to occur on a server, so the inability to reconfigure monitors isn’t necessarily a deal breaker, although being unable to use some of the high-DPI settings for the 4K monitor it has may be an issue. It will be really annoying if some of the laptop configurations I support get this grabbing problem. But since that time I have learned of the kscreen-doctor command which is included in Debian/Testing and can do some of the necessary things; it doesn’t have a man page so you have to run “kscreen-doctor -h” for documentation.
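As a quick example of kscreen-doctor use (the output name and scale factor are hypothetical, and the syntax is from memory so check “kscreen-doctor -h”):

# list outputs with their modes and current scale
kscreen-doctor -o
# set 200% scaling on one output
kscreen-doctor output.DP-1.scale.2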
After reading Bálint’s blog post about Firebuild (a compile cache) [1] I decided to give it a go. It’s non-free, the project web site [2] says that it’s free for non-commercial use or commercial trials.
My first attempt at building a Debian package failed due to man-recode using a seccomp() sandbox, I filed Debian bug #1032619 [3] about this (thanks for the quick response Bálint). The solution for me was to edit /etc/firebuild.conf and add man-recode to the dont_intercept list. The new version that’s just been uploaded to Debian fixes it by disabling seccomp() and will presumably allow slightly better performance.
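For anyone who hasn’t used it, the basic usage (as I understand it) is just prefixing the normal build command with firebuild, eg for a make based build:

# regular build for comparison
time make
# the same build wrapped by Firebuild, which populates ~/.cache/firebuild
time firebuild make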
Here are the results of building the refpolicy package: a regular build, the first build with Firebuild (about a third slower), and a rebuild with Firebuild that reduced the time by almost 42%.
Regular build without Firebuild:

real 1m32.026s
user 4m20.200s
sys 2m33.324s

First build with Firebuild (cold cache):

real 2m4.111s
user 6m31.769s
sys 3m53.681s

Rebuild with Firebuild (hot cache):

real 0m53.632s
user 1m41.334s
sys 3m36.227s
Next I did a test of building a Linux 6.1.10 kernel with “make bzImage -j18”, here are the results from a normal build, the first build with Firebuild, and the second build. The real time is worse with Firebuild for this on my machine. I think that the relative speeds of my CPU (reasonably fast 18 core) and storage (two of the slower NVMe devices in a BTRFS RAID-1) are the cause of the first build being relatively so much slower for “make bzImage” than for building refpolicy, as the kernel build process involves a lot more data. For the final build I moved ~/.cache/firebuild to a tmpfs (I have 128G of RAM and not much running on my machine at the time of the tests), a rough sketch of the commands for that is below the timing results. Even then building with Firebuild was slightly slower in real time but took significantly less CPU time (user+sys being 20 minutes instead of 36). I also ran several tests with the kernel source tree on a tmpfs but for unknown reasons those tests each took about 6 minutes. Does Firebuild or the Linux kernel build process dislike tmpfs for some reason?
Normal build without Firebuild:

real 2m43.020s
user 31m30.551s
sys 5m15.279s

First build with Firebuild:

real 8m49.675s
user 64m11.258s
sys 19m39.016s

Second build with Firebuild:

real 3m6.858s
user 7m47.556s
sys 9m22.513s

Final build with Firebuild, cache on tmpfs:

real 2m51.910s
user 10m53.870s
sys 9m21.307s
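Moving the cache to RAM for that final test can be done with something like the following (the tmpfs size is arbitrary given 128G of RAM):

# keep the on-disk cache and mount a tmpfs in its place
mv ~/.cache/firebuild ~/.cache/firebuild.disk
mkdir ~/.cache/firebuild
sudo mount -t tmpfs -o size=32g,uid=$(id -u),gid=$(id -g) tmpfs ~/.cache/firebuild
cp -a ~/.cache/firebuild.disk/. ~/.cache/firebuild/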
One thing I noticed from the kernel build tests is that the total CPU time taken by the firebuild process (as reported by ps) was more than 2/3 of the run time, and top usually reported it as taking around 75% of a CPU core. It seems to me that the firebuild process itself is a bottleneck on build speed. Building refpolicy without Firebuild has an average of 4.5 cores in use while building the kernel has 13.5. Unless they make a multi-threaded version of firebuild it seems that it won’t give the performance one would hope for from a CPU with 18+ cores. I presume that if I had been running with hyper-threading enabled then firebuild would have been even worse for kernel builds as it would sometimes get scheduled on the second thread of a core. It looks like firebuild would perform better on AMD CPUs as they tend to have fewer CPU cores with greater average performance per core, so a single CPU core for firebuild will be less limiting. I presume that the firebuild developers will make it perform better with large numbers of cores in future; the latest Intel laptop CPUs have 16+ cores and servers with 2*40 core CPUs are common.
The performance improvement for refpolicy is significant as a portion of build time, but insignificant in terms of real time. A full build of refpolicy doesn’t take enough time to go and get a Coke, so reducing it doesn’t offer a huge benefit. If Firebuild had been available in past years when refpolicy took 20 minutes to build (when DDR2 was the best RAM available) then it would have been a different story.
There is some potential to optimise the build of refpolicy for the non-Firebuild case. Getting it to average more than 4.5 cores in use when there are 18 available should be possible; there are a number of shell for loops in the main Makefile and maybe some of them can be replaced by make constructs to allow running in parallel, as sketched below. If it used 7 cores on average then it would be faster in a regular build than it currently is with Firebuild and a hot cache. Any advice from make experts would be appreciated.
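As a hypothetical illustration (this is not the actual refpolicy Makefile, the module names and build script are made up), the sort of change I mean is turning a serial shell loop into per-item targets that “make -j” can schedule in parallel:

# before: a serial shell loop runs every module on one core
#
# modules:
#	for m in $(MODULES); do ./build-module.sh $$m; done
#
# after: one target per module so make -j can build them concurrently
MODULES := mod_a mod_b mod_c
MODULE_OUTPUTS := $(MODULES:%=%.pp)

modules: $(MODULE_OUTPUTS)

%.pp: %.te
	./build-module.sh $*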
For a while I’ve had my monitoring systems alert me via XMPP (Jabber). To do that I used the sendxmpp command-line program which worked well for its basic tasks. I recently noticed that my laptop and workstation, which I had upgraded to Debian/Testing, weren’t sending messages. I’m not sure when it started as my main monitoring of such machines is to touch a key and see if there’s a response – if I’m not at the keyboard then a failure doesn’t bother me too much.
I’ve filed Debian bug #1032868 [1] about this. As sendxmpp is apparently not supported upstream and we are preparing for a release it could be that the next version of Debian is released without this working (if it’s specific to talking to Prosody) or without sendxmpp (if it fails on all Jabber servers).
I next tested xmppc, which doesn’t send messages (it gives no error when I have apparently correct parameters, it just doesn’t send anything) and doesn’t display any text output for info related commands, again without giving error messages or an error return code. I filed Debian bug #1032869 [2] about this.
Currently the only success I’ve found with Debian/Testing for this is with go-sendxmpp. To configure it you set up a file named ~/.config/go-sendxmpp/config with the following contents:
username: JABBER-ID
password: PASSWORD
Go-sendxmpp can take a username and password on the command-line but that’s bad for security, as in the absence of SE Linux or other advanced security systems the password can be seen by any user on the same system who runs ps. To send a message run “echo $MESSAGE | go-sendxmpp $ADDR”, which sends $MESSAGE to the Jabber ID $ADDR. It also has the option “go-sendxmpp -l” to listen for incoming messages. I don’t have an immediate need to receive messages from the command-line but it’s handy to have the option.
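As an example of how it can be used from a script (the Jabber ID is hypothetical and the credentials come from the config file above):

#!/bin/sh
# send a one-line alert to a Jabber address via go-sendxmpp
ADDR="alerts@example.org"
echo "$(hostname): $*" | go-sendxmpp "$ADDR"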
I probably won’t be able to get a new version of etbemon in Debian for the Bookworm release. So to get go-sendxmpp to work with etbemon you need to edit /usr/lib/mon/alert.d/mailxmpp.alert and change this sendxmpp line to this go-sendxmpp line:
open (XMPP, "| /usr/bin/sendxmpp -a /etc/ssl/certs -t @xmpprec -r $host") ||
open (XMPP, "| /usr/bin/go-sendxmpp @xmpprec") ||
I just did some quick tests of hyper-threading on my new E5-2696v3 CPU. I compiled the Linux 6.0.10 kernel with and without hyper-threading enabled. Here are the times for “make -j36 bzImage” and “make -j36 modules” with HT enabled:
make -j36 bzImage:

real 2m26.540s
user 55m25.121s
sys 9m56.443s

make -j36 modules:

real 10m57.374s
user 309m21.531s
sys 58m1.070s
Here are the times for “make -j18 bzImage” and “make -j18 modules” with HT disabled:
make -j18 bzImage:

real 2m40.501s
user 31m35.295s
sys 5m43.523s

make -j18 modules:

real 11m39.313s
user 170m46.840s
sys 31m37.756s
That’s 9.6% faster for bzImage and 6.4% faster for modules.
So for a performance boost that’s between 5% and 10% I get greater exposure to kernel security issues and more difficulty tracking CPU time. That doesn’t seem like a good trade-off so I’ve put the “nosmt” kernel command-line option back.
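For reference, SMT can be turned off at runtime via sysfs as well as via the kernel command line; on Debian the persistent way is editing GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub:

# turn SMT off immediately, reverts at the next boot
echo off | sudo tee /sys/devices/system/cpu/smt/control
# after adding nosmt to GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub run
sudo update-grub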
Vox has an insightful interview with the author of “Slouching Towards Utopia: An Economic History of the Twentieth Century” [1]. The main claim of that book is that “The 140 years from 1870 to 2010 of the long twentieth century were, I strongly believe, the most consequential years of all humanity’s centuries”. A claim that seems well supported.
PostMarketOS is an interesting OS for hardware designed for Android [2]. It is based on Alpine Linux and is small and modular. If you want to change something just change that package, not the entire image. Also an aim is to have as much commonality between devices as possible; all phones with the same CPU family can run the same packages apart from the kernel and maybe some utilities related to hardware. Abhijithpa blogged about getting started with pmOS; it seems easy to do [3].
Interesting article about gay samurai [4]. Regarding sex with men or women “an elderly arbiter, after hearing the impassioned arguments of the two sides, counsels that the wisest course is to follow both paths in moderation, thereby helping to prevent overindulgence in either”. Wow.
The SCP project is an interesting collaborative SciFi/horror fiction project [5] based on an organisation that aims to Secure and Contain dangerous objects and beings and Protect the world from them. The series of stories about the Anti-Memetics Division [6] is a good place to start reading.
I just got an E5-2696 v3 CPU for my ML110 Gen9 home workstation, this has a Passmark score of 23326 which is about 2.5 times that of the E5-2620 v4 which rated 9224. Previously it took over 40 minutes of real time to compile a 6.0.10 kernel that was based on the Debian kernel configuration, now it takes 14 minutes of real time, 202 minutes of user time, and 37 minutes of system CPU time. That’s a definite benefit of having a faster CPU, I don’t often compile kernels but when I do I don’t want to wait 40+ minutes for a result. I also expanded the system from 96G of RAM to 128G, most of the time I don’t need so much RAM but it’s better to have too much than too little, particularly as my friend got me a good deal on the RAM. The extra RAM might have helped improve performance too, going from 6/8 DIMM slots full to 8/8 might help the CPU balance memory access across channels.
That series of HP machines has a plastic mounting bracket for the CPU, see this video about the HP Proliant Smart Socket for details [1]. I was working on this with a friend who has the same model of HP server as I do. After buying myself a system I was so happy with it that I bought another the same when I saw it going for a good price, and then sold it to my friend when I realised that I had too many tower servers at home. It turns out that getting the same model of computer as a friend is a really good strategy as you can then work together to solve problems with it. My friend’s first idea was to try and buy new clips for the new CPUs (which would have delayed things and cost more money), but Reddit and some blog posts suggested that you can just skip the smart-socket guide clip. When the chip was resting in the socket it felt secure, as the protrusions on the sides of the socket fit firmly enough into the notches in the CPU to prevent it moving far enough to short a connection. Testing on 2 systems showed that you don’t need the clip. As an aside it would be nice if Intel made every CPU that fits a particular socket have the same physical dimensions so clips and heatsinks can work well on all CPUs.
The TDP of the new CPU is 145W and the old one was 85W. One would hope that in a server class system that wouldn’t make a lot of difference but unfortunately the difference was significant. Previously I could have the system running 7/8 cores with BOINC 24*7 and I wouldn’t notice the fans being louder. It is possible that 100% CPU use on a hot day might make the fans sound louder if I didn’t have an air-conditioner on that was loud enough to drown them out, but the noteworthy fact is that with the previous CPU the system fans were a minor annoyance. Now if I have 16 cores running BOINC it’s quite loud, the sort of noise that makes most people avoid using tower servers as workstations! I’ve found that if I limit it to 4 or 5 cores then the system is about as quiet as it was before. As a rough approximation I can use as much CPU power as before without making the fans louder but if I use more CPU power than was previously available it gets noisy.
I also got some new NVMe devices. I was previously using 2*Crucial 1TB P1 NVMe devices in a BTRFS RAID-1 and now I have 2*Crucial 1TB P3 NVMe devices (where P1 is the slowest Crucial offering, P3 is better and more expensive, P5 is even better, etc). When doing the BTRFS migrations to move my workstation to the new NVMe devices and my server to the old NVMe devices I found that the P3 series seem to have a limit of about 70MB/s for sustained random writes and the P1 series about 35MB/s. Apparently the cheaper NVMe devices slow down if you do lots of random writes, a pity that all the review articles talking about GB/s speeds don’t mention this. To see how bad reviews are, Google some reviews of these SSDs: you will find a couple of comment threads on places like Reddit about them slowing down under sustained writes, and lots of review articles on well known sites that don’t mention it. Generally I’d recommend not upgrading from P1 to P3 NVMe devices, the benefit isn’t enough to cover the effort. For every capacity of NVMe device the most expensive devices cost more than twice as much as the cheapest devices, and sometimes it will be worth the money. Getting the most expensive device won’t guarantee great performance but getting cheap devices will guarantee that it’s slow.
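I didn’t run a formal benchmark at the time (the numbers above are just what I observed during the BTRFS migration), but for anyone wanting to test sustained random write speed themselves fio is a common tool; a sketch with arbitrary sizes and a hypothetical file path:

# sustained 4k random writes, bypassing the page cache, long enough to get past any SLC cache
fio --name=randwrite --filename=/mnt/test/fio.dat --rw=randwrite --bs=4k \
  --direct=1 --ioengine=libaio --iodepth=32 --size=32g --runtime=600 --time_based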
It seems that CPU development isn’t progressing as well as it used to, the CPU I just bought was released in 2015 and scored 23,343 according to Passmark [2]. The most expensive Intel CPU on offer at my local computer store is the i9-13900K which was released this year and scores 62,914 [3]. One might say that CPUs designed for servers are different from ones designed for desktop PCs, but the i9 in question has a “TDP Up” of 253W which is too big for the PSU I have! According to the HP web site the new ML110 Gen10 servers aren’t sold with a CPU as fast as the E5-2696 v3! In the period from 1988 to about 2015 every year there were new CPUs with new capabilities that were worth an upgrade. Now for the last 8 years or so there hasn’t been much improvement at all. Buy a new PC for better USB ports or something not for a faster CPU!
In response to a post about my latest laptop I had someone ask why I chose an Intel CPU. I’ve been a fan of the Thinkpad series of laptops since the 90s. They have always seemed well constructed (given the constraints of being light etc) and had a good feature set. Also I really like the TrackPoint. I’ve been a fan of the smaller Thinkpads since I got an X-301 from e-waste [1] and the X1-Carbon series is the latest and greatest line of small Thinkpads.
AMD makes some nice laptop CPUs which appear to have low power use and good performance particularly for smaller numbers of threads, it seems that generally AMD CPUs are designed for fewer cores with higher performance per core which is good for laptops. But Lenovo only makes the Thinkpad Carbon X1 series with Intel CPUs so choosing that model of laptop means choosing Intel. It could be that for some combination of size, TDP, speed, etc Intel just happens to beat AMD for all the times when Lenovo was designing a new motherboard for the Carbon X1. But it seems more likely that Intel has been lobbying Lenovo for this. It would be nice if there was an anti-trust investigation into Intel, everyone who’s involved in the computer industry knows of some of the anti-competitive things that they have done.
Also it would be nice if Lenovo started shipping laptops with ARM CPUs across their entire range. But for the moment I guess I have to keep buying laptops with Intel CPUs.