Links May 2023

Petter Reinholdtsen wrote an interesting blog post about their work on packaging speech to text for Debian [1]. The work of the Debian Deep Learning Team seems really interesting and I look forward to playing with this sort of thing after the release of Bookworm (the packages in question will NOT go in Bookworm but I’ll run at least one system on Testing after Bookworm). It would be nice to get more information on the hardware used for running such programs; the minimum hardware needed for real-time speech to text would be particularly interesting to know.

Brian Krebs wrote an informative article about attacks involving supply chain compromise and fake LinkedIn profiles [2]. The attacks targeted Linux as well as Windows.

Interesting, though somewhat harsh, video about the Illium cameras [3]; it criticises the Illium devices for being too low resolution, too expensive, and taking too much CPU time to process images. The Illium cameras still sell for decent prices on eBay; I wonder if it’s because of curious people like me who would like to play with them and have money to spare or whether some other interesting things are being done with them. I wonder how a 4*4 array of the rectangular cameras secured together with duct tape would go. The ideas of Illium should work better if implemented for multi-core CPUs or GPUs.

Bruce Schneier with Henry Farrell and Nathan Sanders wrote an insightful blog post about how AI chatbots could improve democracy [4].

Wired has an interesting article about the way DJI drones transmit the location of the drone operator without encryption – by design [5]. Apparently this has been used for targeting attacks on drone operators in Ukraine.

This video about robot “mice” navigating mazes is interesting [6]. But I think it became less interesting when they got to the stage where milliseconds count for the win; it’s very optimised for one case, just like F1. I think it would be interesting if they had a rally contest where they go across grass or sand, 3D mazes both in air and water, and contests where tungsten weights have to be transported. They should push some of the other limits of engineering, as completing a maze quickly has been solved.

The Guardian has an interesting article about a blood test for sleepy driving [7]. Once they have an objective test they can punish people for it.

This github repository listing public APIs is interesting [8]. Lots of fun ideas for phone apps there.

Simon Josefsson wrote an insightful blog post about the threat model of security devices [9]. Unfortunately the security of most people is way below the level where this is an issue. But it’s good to think about future steps needed for good security.

Cory Doctorow wrote an interesting article “The Swivel Eyed Loons have a Point” [10] about the fact that some of the nuttiest people are protesting about real issues, just in the wrong way.

Genesis GV60

I recently test drove a Genesis GV70, but the GV60 [1], which I didn’t test drive, is a nicer car.

The GV70 and GV60 are all electric so they are quiet and perform well. The GV70 has a sun-roof that opens; it was the first car I’ve driven with one and I decided I don’t like it. Having the shade open so I can see the sky while stuck in a traffic jam is nice though. The GV60 has a non-opening sun-roof with a shade that can be retracted; this is a feature I’d really like to have in my next car.

Electric cars as a general rule have good acceleration and are quiet, and the GV70 performed as expected in that regard. It has a head-up display projected on the windscreen showing the speed and the speed limit on the road in question, which is handy. When driving in a car park it showed images from all sides, which is really handy; I wish I had explored that feature more.

The console is all electronic with a TFT display instead of mechanical instruments, but the only significant difference this makes in driving is that when a turn indicator is used the console display shows a video feed for the blind-spot matching the lane change direction. This is a significant safety feature and will reduce the incidence of collisions. But the capabilities of the hardware seem underutilised; hopefully they will release a software update at some future time to do more with it.

The most significant benefit of the GV60 over the GV70 is that it has cameras instead of mirrors at the sides of the car. This reduces drag and also removes the need to adjust mirrors to match the height of the driver. Also for driver instruction the instructor and learner get to see the same view. A logical development of such cars is an expansion pack for instruction that has displays in the passenger seat to show the instructor the same instrument view as the driver sees.

The minimum list driveaway price for the GV60 is $117,171.50 and for the GV70 it is $138,119.89 – both of which are more than I’m prepared to pay for a car. The GV60 apparently can be started by fingerprint, which seems like a bad idea given the poor security of fingerprint sensors, but as regular car keys tend not to be too difficult to work around it probably doesn’t matter. The Genesis web site makes it difficult to find the ranges of the electric cars, which is surprising. A Google search suggests that the GV60 can do 466km and the GV70 can do 410km, which are both reasonable numbers and nothing to be ashamed of.

The GV70 was a fun car to drive and the GV60 looks like it would be even better. I recommend that everyone who likes technology take one for a test drive, but for my own use I’m looking for something that costs less than half as much.

Considering Convergence

What is Convergence

In 2013 Kyle Rankin (at the time Linux Journal columnist and CSO of Purism) wrote a Linux Journal article about Linux convergence [1] (which means using a phone and a dock to replace a desktop) featuring the Nokia N900 smart phone and a chroot environment on the Motorola Droid 4 Android phone. Both of them had very limited hardware even by the standards of the day, and neither was a system I’d consider using all the time. None of the Android phones I used at that time were at all comparable to any sort of desktop system I’d want to use.

Hardware for Convergence – Comparing a Phone to a Laptop

The first hardware issue for convergence is docks and other accessories to attach a small computer to hardware designed for larger computers. Laptop docks have been around for decades and for decades I haven’t been using them because they have all been expensive and specific to a particular model of laptop. Having an expensive dock at home and an expensive dock at the office and then replacing them both when the laptop is replaced may work well for some people but wasn’t something I wanted to do. The USB-C interface supports data, power, and DisplayPort video over the same cable and now USB-C docks start at about $20 on eBay and dock functionality is built in to many new monitors. I can take a USB-C device to the office of any large company and know there’s a good chance that there will be a USB-C dock ready for me to use. The fact that USB-C is a standard feature for phones gives obvious potential for convergence.

The next issue is performance. The Passmark benchmark seems like a reasonable way to compare CPUs [2]. It may not be the best benchmark but it has an excellent set of published results for Intel and AMD CPUs. I ran that benchmark on my Librem5 [3] and got a result of 507 for the CPU score. At the end of 2017 I got a Thinkpad X301 [4] which rates 678 on Passmark. So the Librem5 has 3/4 the CPU power of a laptop that was OK for my use in 2018. Given that the X301 was about the minimum spec for a PC that I can use (for things other than serious compiles, running VMs, etc), the Librem5 has 3/4 the CPU power, only 3G of RAM compared to 6G, and 32G of storage compared to 64G. Here is the Passmark page for my Librem5 [5]. As an aside my Librem5 is apparently 25% faster than the other results for the same CPU – did the Purism people do something to make their device faster than most?

For me the Librem5 would be at the very low end of what I would consider a usable desktop system. A friend’s N900 (like the one Kyle used) won’t complete the Passmark test, apparently due to the “Extended Instructions (NEON)” test failing. Of the tests that do run, most gave results well below 10% of the Librem5’s, and only the “Compression” and “CPU Single Threaded” tests managed to exceed 1/4 the speed of the Librem5. One thing to note when comparing the specs of phones and desktop systems is that the MicroSD cards designed for use in dashcams and other continuous recording devices have TBW ratings that compare well to SSDs designed for use in PCs, so swapping to a MicroSD card should work reasonably well and be significantly faster than the hard disks I was using for swap in 2013!
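
As a minimal sketch of using a MicroSD card for swap (the device name /dev/mmcblk0p2 is a hypothetical example, not from the original post):

# /dev/mmcblk0p2 is a hypothetical spare partition on the MicroSD card.
mkswap /dev/mmcblk0p2
# Enable it with a higher priority than any existing slower swap.
swapon -p 10 /dev/mmcblk0p2
# Verify that the new swap area is active.
swapon --show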

In 2013 I was using a Thinkpad T420 as my main system [6], it had 8G of RAM (the same as my current laptop) although I noted that 4G was slow but usable at the time. Basically it seems that the Librem5 was about the sort of hardware I could have used for convergence in 2013. But by today’s standards and with the need to drive 4K monitors etc it’s not that great.

The N900 hardware specs seem very similar to the Thinkpads I was using from 1998 to about 2003. However a device for convergence will usually do more things than a laptop (IE phone and camera functionality) and software became significantly more bloated over the 1998 to 2013 time period. A Linux desktop system performed reasonably with 32MB of RAM in 1998 but by 2013 even 2G was limiting.

Software Issues for Convergence

Jeremiah Foster (Director PureOS at Purism) wrote an interesting overview of some of the software issues of convergence [7]. One of the most obvious is that the best app design for a small screen is often very different from that for a large screen. Phone apps usually have a single window that shows a view of only one part of the data that is being worked on (EG an email program that shows a list of messages or the contents of a single message but not both). Desktop apps of any complexity will either have support for multiple windows for different data (EG two messages displayed in different windows) or a single window with multiple different types of data (EG message list and a single message). What we ideally want is all the important apps to support changing modes when the active display is changed to one of a different size/resolution. The Purism people are doing some really good work in this regard. But it is a large project that needs to involve a huge range of apps.

The next thing that needs to be addressed is the OS interface for managing apps and metadata. On a phone you swipe from one part of the screen to get a list of apps while on a desktop you will probably have a small section of a large monitor reserved for showing a window list. On a desktop you will typically have an app to manage a list of items copied to the clipboard while on Android and iOS there is AFAIK no standard way to do that (there is a selection of apps in the Google Play Store to do this sort of thing).

Purism has a blog post by Sebastian Krzyszkowiak about some of the development of the OS to make it work better for convergence and the status of getting it in Debian [8].

The limitations in phone hardware force changes to the software. Software needs to use less memory because phone RAM can’t be upgraded. The OS needs to be configured for low RAM use which includes technologies like the zram kernel memory compression feature.
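
As a minimal sketch of the kind of setup in question (the algorithm and size are arbitrary example values; on Debian the zram-tools package automates this):

# Create a compressed swap device in RAM via the zram kernel module.
modprobe zram
echo lz4 > /sys/block/zram0/comp_algorithm
echo 1G > /sys/block/zram0/disksize
# Use it as high priority swap so it is preferred over disk swap.
mkswap /dev/zram0
swapon -p 100 /dev/zram0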

Security

When mobile phones first came out they were used for less secret data. Loss of a phone was annoying and expensive but not a security problem. Now phone theft for the purpose of gaining access to resources stored on the phone is becoming a known crime; here is a news report about a thief stealing credit cards and phones to receive the SMS notifications from banks [9]. We should expect that trend to continue; stealing mobile devices for ssh keys, management tools for cloud services, etc is something we should expect to happen.

A problem with mobile phones in current use is that they have one login used for all access, from trivial things done in low security environments (EG paying for public transport) to sensitive things done in more secure environments (EG online banking and healthcare). Some applications take extra precautions for this, EG the Android app I use for online banking requires authentication before performing any operations. The Samsung version of Android has a system called Knox for running a separate secured workspace [10]. I don’t think that the Knox approach would work well for a full Linux desktop environment, but something that provides some similar features would be a really good idea. Also running apps in containers as much as possible would be a good security feature; this is done by default in Android, and desktop OSs could benefit from it.
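
As one hedged example of what is already possible on a Linux desktop, the firejail program (packaged in Debian) can run a single app in a restricted sandbox:

# Run a browser with a throw-away home directory and seccomp filtering,
# so compromising the app doesn't expose the real home directory.
firejail --private --seccomp firefox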

The Linux desktop security model of logging in to a single account and getting access to everything has been outdated for a long time, probably ever since single-user Linux systems became popular. We need to change this for many reasons and convergence just makes it more urgent.

Conclusion

I have become convinced that convergence is the way of the future. It has the potential to make transporting computers easier, purchasing cheaper (buy just a phone and not buy desktop and laptop systems), and access to data more convenient. The Librem5 doesn’t seem up to the task for my use due to being slow and having short battery life, the PinePhone Pro has more powerful hardware and allegedly has better battery life [11] so it might work for my needs. The PinePhone Pro probably won’t meet the desktop computing needs of most people, but hardware keeps getting faster and cheaper so eventually most people could have their computing needs satisfied with a phone.

The current state of software for convergence and for Linux desktop security needs some improvement. I have some experience with Linux security so this is something I can help work on.

To work on improving this I asked Linux Australia for a grant for me and a friend to get PinePhone Pro devices and a selection of accessories to go with them. Having both a Librem5 and a PinePhone Pro means that I can test software in different configurations which will make developing software easier. Also having a friend who’s working on similar things will help a lot, especially as he has some low level hardware skills that I lack.

Linux Australia awarded the grant and now the PinePhones are in transit. Hopefully I will have a PinePhone in a couple of weeks to start work on this.

Links April 2023

Cory Doctorow has an insightful article Gig Work is the Opposite of Steampunk [1] about the horrors that companies like Amazon are forcing on their employees.

Valerie Aurora and Leigh Honeywell wrote an insightful article about the Al Capone theory of sexual harassment [2]. It explains why people who sexually harass others usually also engage in other anti-social activity that is easier to prosecute.

The IEEE has an interesting article about using ML for parts of the CPU design process, both the technical issues and the controversy about competing ideas which is probably caused by sexism [3].

“Love and taxes are forever my heart” is a line from an anime dating sim game that prepares US taxes [4]. Unfortunately it was removed from Steam. The existence of the game is a weird social commentary and removing the game because you can’t have an anime hottie do taxes is bizarre but also understandable given liability issues. There’s no mention in the review of whether male hotties are available for people who prefer dating guys. As an aside my accountant looks like he is allergic to exercise…

The Killdozer Book web site (which has an invalid SSL certificate so you have to click on “advanced” in Chrome to get to the content) has an insightful article debunking some of the stories about the Killdozer [5]. He wasn’t some sort of hero of freedom, he was just a jerk who reneged on a deal hoping to get more money, thought that laws shouldn’t apply to him, and killed himself because of it.

Apparently some big tech companies are knowingly hiring people to not work unlike the usual large corporate case of unknowingly hiring people to not work [6]. Silicon Valley is a good TV show, and it’s apparently realistic.

Ron Garrett wrote an insightful blog post about theoretical attacks on Bitcoin and how Bitcoin could be used [7]. The conclusion is not good for Bitcoin.

Compiler Explorer is a program that shows how various C++ compilers produce assembly code for various architectures, this site hosts the main active instance [8]. There are other instances, here is an instance that produces code for the Ruzzian Elbrus architecture [9]. The Elbrus Wikipedia page is interesting [10]. Apparently the Ruzzians don’t want this information to be spread, LOL.

The Smithsonian Magazine has an interesting article about pet parrots being taught to video call each other [11]. Apparently parrots are social animals and can develop psychological problems if kept alone, so the video calls can be good for them. Also the owners had to monitor the chats to ensure that they played nicely together, just like play-dates for kids!

Phoronix has an amusing article about the drama regarding the AMD Spectral Chicken bit in the Linux kernel source [12].

This page listing bad free software licenses is amusing [13].

The ACS has an interesting article about how Samsung fakes photos of the moon and presumably could fake other photos of notable objects that don’t change [14]. The way that they proved the forgery was interesting.

Write a blog post in the style of Russell Coker

Feeling a bit bored I asked ChatGPT “Write a blog post in the style of Russell Coker” and the result is in the section below. I don’t know if ChatGPT knows that the person asking the question is the same as the person being asked about. If a human had created that I’d be certain that “great computer scientist and writer” was an attempt at flattery, for a machine I’m not sure.

I have not written a single book, but I expect that in some alternate universe some version of me has written several. I don’t know if humans would describe my writing as being known for “clarity, precision, and depth”. I would not be surprised if “no-one else wrote about it so I guess I’m forced to read what he wrote” would be a more common response.

The actual “article” part doesn’t seem to be in my style at all. Firstly it’s very short at only 312 words; while I have written some short posts, most of them are much longer. To find this out I did some MySQL queries to get the lengths of posts (I used this blog post as inspiration [1]). Note that multiple sequential spaces count as multiple words.

# get post ID and word count
SELECT id, LENGTH(post_content) - LENGTH(REPLACE(REPLACE(REPLACE(REPLACE(post_content, "\r", ""), "\n", ""), "\t", ""), " ", "")) + 1 AS wordcount FROM wp_posts WHERE post_status = 'publish' AND post_type = 'post';
# get average word count
SELECT AVG(LENGTH(post_content) - LENGTH(REPLACE(REPLACE(REPLACE(REPLACE(post_content, "\r", ""), "\n", ""), "\t", ""), " ", "")) + 1) FROM wp_posts WHERE post_status = 'publish' AND post_type = 'post';
# get the shortest posts
SELECT id, LENGTH(post_content) - LENGTH(REPLACE(REPLACE(REPLACE(REPLACE(post_content, "\r", ""), "\n", ""), "\t", ""), " ", "")) + 1 AS wordcount, post_content FROM wp_posts WHERE post_status = 'publish' AND post_type = 'post' ORDER BY wordcount LIMIT 10;
# get a count of the posts less than 312 words
SELECT COUNT(*) FROM wp_posts WHERE (LENGTH(post_content) - LENGTH(REPLACE(REPLACE(REPLACE(REPLACE(post_content, "\r", ""), "\n", ""), "\t", ""), " ", "")) + 1) < 312 AND post_status = 'publish' AND post_type = 'post';
# get a count of all posts
SELECT COUNT(*) FROM wp_posts WHERE post_status = 'publish' AND post_type = 'post';

It turns out that there are 333/1521 posts that are less than 312 words and the average length is 665 words. Of the shortest posts a large portion were written before Twitter became popular and had the sort of unimportant content that Twitter is good for.

It seems ironic that ChatGPT says that I'm known for "depth" and then writes a short post allegedly in my style.

As for the actual content of the "post", it's not something that I greatly disagree with but it's definitely not something I'd write. If some manager or marketing person wrote something like this and asked my opinion I'd probably suppress a sigh and say it's OK for its target market.

I think that free software is better in many ways and for many reasons and that separately from the issue of merit free software is important because of the ways that it is used. We could debate various issues related to the quality of free software vs proprietary software but even people who don't believe that free software is of high quality have to admit that software that runs most mobile phones and most web servers is important. As much as I dislike Windows most desktop PCs and laptops run it so I can't claim it to be unimportant. So my first disagreement with the content is the issue of importance vs quality.

One thing that ChatGPT is known for is weak arguments that appear convincing. "This diversity often leads to better software, as bugs are found and fixed faster, and new features are added more quickly" could be opposed by "the payment for proprietary software often leads to better software, as bugs are found and fixed faster, and new features are added more quickly". I think that any case where "feature of option A gives this list of good results" can be rewritten as "feature of the opposite of option A gives the same list of good results" and be equally convincing is a poor argument. Also any argument that seems to rest on "free software is all written by hobbyists" is ignoring the demands of users who need things done on a non-hobby timeframe, the many profitable companies that develop and support free software, and the employment status of a large portion of my Facebook friends (many of whom are paid for ~2000 hours a year of free software development and many more of whom have paid free software development as part of their work supporting the use of free software). Really the entire blog post it wrote seems to me like something a human might write if their boss assigned them to write about the benefits of free software, and that they would then write a new article saying the exact opposite if their boss asked them to.

Finally I prefer the term "free software" to "open source software". The difference doesn't seem important to some people, but any being that doesn't know the difference won't do well at impersonating me.

ChatGPT Result

Hello everyone, I'm ChatGPT, and today I'm going to try my best to write a blog post in the style of the great computer scientist and writer, Russell Coker.

Russell Coker is a renowned software engineer and writer, who is best known for his contributions to the Linux community, especially the Debian project. He has authored several books and articles on topics ranging from security to software engineering, and his writing style is known for its clarity, precision, and depth.

So, without further ado, let's dive into the topic of this blog post, which is the importance of open-source software.

Open-source software is software whose source code is available for anyone to inspect, modify, and distribute. This means that anyone can use, study, and improve the software, without any restrictions or fees. Open-source software has been around for a long time, but it has become more prevalent and important in recent years, thanks to the rise of the internet and the digital economy.

Open-source software has several benefits, both for individuals and for society as a whole. One of the most significant benefits is that it fosters innovation and collaboration. When software is open-source, it means that anyone can contribute to it, which leads to a diverse range of perspectives and ideas. This diversity often leads to better software, as bugs are found and fixed faster, and new features are added more quickly.

Another benefit of open-source software is that it is often more secure than proprietary software. Because the source code is available for anyone to inspect, vulnerabilities and security flaws can be found and fixed faster. This means that open-source software is often more reliable and trustworthy than proprietary software, which may have hidden backdoors or vulnerabilities that can be exploited by attackers.

Open-source software is also often more cost-effective than proprietary software. Because it is free to use and distribute, organizations can save money on licensing fees and other costs associated with proprietary software. This can be particularly important for non-profit organizations or small businesses that may not have the resources to invest in expensive software licenses.

In conclusion, open-source software is an important part of the digital landscape, and its benefits are numerous. Whether you are a software developer, a business owner, or an individual user, open-source software has something to offer. So, the next time you are looking for software to use, consider open-source options, and help support the collaborative and innovative spirit of the open-source community.

BTRFS Rebuild Time

In February I replaced a Dell T320 server with an HP Z640 workstation for a home server/workstation [1]. The T320 has 8*3.5″ drive bays which I had used to put 3*4TB disks in a BTRFS RAID-10 array for 6TB of usable capacity. The Z640 has only 2*3.5″ bays and 4*2.5″ bays, so one option I could have taken was to buy a 4TB 2.5″ SSD and keep the same 3*4TB array as before. Instead I chose to use an 8TB disk I had spare in an array with one of the original 4TB disks and some extra on NVMe devices (the system has 2*1TB NVMe devices which are used as a 380G RAID-1 for the root filesystem and the rest for the storage array). It’s nice how BTRFS allows putting any storage you have in a RAID-10 configuration.
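
For reference, here is a rough sketch of how such a mixed-size array can be assembled (the device name and mount point are hypothetical examples):

# Add a device of any size to a mounted BTRFS filesystem.
btrfs device add /dev/sdX /mnt/storage
# Rebalance data and metadata into the RAID-10 profile across all devices.
btrfs balance start -dconvert=raid10 -mconvert=raid10 /mnt/storage
# Show how the data is spread across the devices.
btrfs filesystem usage /mnt/storage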

Unfortunately it seems that I chose the wrong 4TB disk to use for this, as it failed three days ago. It gave thousands of read and write errors and Linux decided that the drive no longer existed. I tried rebooting the system to get it back into the BTRFS array, but it failed again, and so quickly that it wasn’t even possible to use the data on it as part of a RAID rebuild. So I removed that disk and put in one of the other 4TB disks.
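
A rough sketch of the remove-and-add approach I used (device names are hypothetical); as noted later in this post, a “btrfs replace” operation is usually the better option:

# Mount degraded if the failed device has disappeared entirely.
mount -o degraded /dev/sda /mnt/storage
# Add the replacement disk, then remove the missing device; the remove
# operation triggers the rebuild of the lost copies.
btrfs device add /dev/sdY /mnt/storage
btrfs device remove missing /mnt/storage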

As the array comprises an 8TB disk and 3 other devices that together add up to less than 8TB, the layout is the 8TB disk having one copy of everything and the other devices having parts of it. So the rebuild process consisted of copying data from the 8TB disk to the 4TB disk. For a RAID array run in the manner of Linux software RAID the rebuild of a RAID-1 involves a linear copy of data, which is the optimal case for hard disks; copying 4TB of data in that manner would have an average speed of a bit over 100MB/s and take about 11 hours. With BTRFS the source disk has to be updated for each block that is recreated, so the process was bottlenecked on writing to the 8TB disk. It took 2 days 23 hours to complete. The process involved reading 3,478,031MB and writing 4,405,545MB. The system was live for the process and some cron jobs etc were writing to the array, but in the 12 hours since the rebuild completed the array has had 7,038MB written. So presumably during the rebuild time about 42G of actual data were written to the array and the other 4.3TB written to the 8TB disk were from the process of copying 3.5TB from it to another device. Iostat reported 645.36 TPS for the duration of the rebuild, which seems like a decent number for a hard drive; it also reported that the drive had 99%+ of IO capacity used for the duration.

While waiting for this to complete I wrote a blog post about storage trends [2]. One thing I didn’t mention in that post is that if you are the type of person who checks the rebuild process fifty times a day then that should be counted as part of the cost of using slow storage. If instead of an 8TB disk plus some SSD storage I had used 2*4TB disks and 1*4TB SSD as I had considered doing then instead of having 3.8TB on one device I would have had about 2.5TB and the reconstruct would have probably taken 2/3 of the time. If I had moved the array to 3*4TB SSDs then it would have taken a small fraction of the time.

One thing to note is that I made a mistake in this operation by removing the failed device instead of doing a “btrfs replace” operation, which can be significantly faster. If I had correctly done this then I would have written a blog post about the rebuild taking 2 days or something; the issues of hard drives being slow and me compulsively checking the progress would still apply.
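
For comparison, a sketch of the “btrfs replace” approach (the devid and device name are hypothetical examples):

# Get the devid of the failed device.
btrfs filesystem show /mnt/storage
# Replace it in-place; the devid form works even if the old device is gone.
btrfs replace start 2 /dev/sdZ /mnt/storage
# Check progress.
btrfs replace status /mnt/storage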

Storage Trends 2023

It’s been 2 years since my last blog post about storage trends [1].

Minimum Storage <=2TB

In 2021 I stated that as MSY had 2TB disks for $72 and 2TB SSDs for $245 it was barely worth considering a 2TB disk and anything less than 2TB wasn’t worth considering. Now for 2TB of storage from MSY, NVMe starts at $129, SATA SSD starts at $143, and hard disks start at $75. I guess that NVMe is slightly cheaper due to some combination of economies of scale for manufacture/sales and lower postage costs. It really doesn’t make sense to consider hard disks for storing 2TB or less.

For storage for a small system (PC or laptop) the cheapest storage device is $19 for a 128G SATA SSD. But it wouldn’t make sense to buy that when you can get a 256G SATA SSD for $22 or a 240G NVMe device for $23, saving $3 on storage wouldn’t make any sense. For 512G of storage the prices are $32 for NVMe and $33 for SATA SSD. For 1TB of storage the prices start at $68 for SATA SSD and $74 for NVMe. Probably for the vast majority of home users 1TB of SATA SSD or NVMe is the minimum storage capacity to consider, the $50 price difference isn’t much when considering the entire price of a PC or laptop and anything less than 1TB will run out quickly with modern use.

Larger Storage 4TB+

The price for 4TB of storage from MSY is NVMe starting at $349, SATA SSD starting at $369, and hard disks starting at $115. If you need 4TB of RAID-1 storage then it might be worth saving $470 and getting hard drives for a home user. For business use it wouldn’t make sense. Some laptops have two NVMe sockets so 8TB of storage (or 4TB of RAID-1) in a laptop would be interesting.

For 8TB of storage the MSY prices are SATA SSD for $739 and hard drives starting at $179. Probably hard drives are the best choice for most situations where there is a need to store 8TB or more of data. But the prices are low enough to make 8TB SSD something that can be considered for home use, it doesn’t seem that long ago that the 4TB hard drives I bought for my home server were almost that expensive.

Big Storage

MSY doesn’t have 8TB NVMe; such devices are on eBay for $1700 for regular M.2 NVMe and just under $1000 for U.2 (server hot-swap devices). So if you need more than 8TB of NVMe storage then probably buying a server with U.2 built in is the correct solution.

For home users who need more than 8TB of storage hard drives are a good solution. One issue is that the more affordable and larger drives use Shingled Magnetic Recording (SMR) which has some different performance characteristics for certain workloads. Apparently SMR performs badly for anything other than large file storage.

Why MSY?

I primarily used MSY prices for this post because they are a reliable local store that has a list of prices that is easy to read. For everything in this post I can get better prices by using eBay, the StaticIce.com.au price comparison site [2], and the computing section of the OzBargain site that gamifies finding good prices [3]. But a good shopping strategy nowadays is to compare prices in a store to determine what items are in your price range and then shop around for price on the item you want. Checking all the different bargain sites for all these items would take much more time than I want to spend writing a blog post!

Conclusion

Hard drives don’t make sense for the vast majority of systems. Not for laptops, not for typical desktop PCs, and not for small business servers (say 8TB or less of RAID storage). Hard drives only make sense for dozens or hundreds of TB of storage and even then finding out how to deal with SMR issues is going to increase the pain of deployment. Maybe using a combination of SSD and hard drives to deal with the SMR issues is going to be a competitive advantage for NAS vendors in future.

NVMe looks like it’s on the way to being cheaper than SATA SSD. There is likely to be a good market for systems with NVMe as the only internal storage option.

The long term trend of systems without DVD drives and with maybe 2.5″ SATA devices but no 3.5″ SATA devices seems to make the GPU the major part determining the minimum size of a PC case. Maybe there will be a new trend of GPUs connected to riser cards so they can sit parallel to the motherboard in compact PCs.

For business desktop systems (IE low powered graphics hardware as it’s not for gaming) I expect that the trend will be towards NUC type devices which are already based around the M.2 as a storage device size.

Links March 2023

Interesting paper about a plan for eugenics in dogs with an aim to get human equivalent IQ within 100 generations [1]. It gets a bit silly when the author predicts IQs of 8000+, as there will eventually be limits on what can fit in one head. But the basic concept is good.

Interesting article about what happens inside a proton [2]. This makes some aspects of the Trisolar series and the Dragon’s Egg series seem less implausible.

Insightful article about how crypto-currencies really work [3]. Basically the vast majority of users trust some company that’s outside the scope of most financial regulations to act as their bank. Surprisingly the author doesn’t seem to identify such things as a Ponzi scheme.

Bruce Schneier wrote an interesting blog post about AIs as hackers [4].

Cory Doctorow wrote an insightful article titled “The ‘Enshittification’ of TikTok” which is about the enshittification of commercial Internet platforms in general [5]. We need more regulation of such things.

Cat Valente wrote an insightful article titled “Stop Talking to Each Other and Start Buying Things: Three Decades of Survival in the Desert of Social Media” about the desire to profit from social media repeatedly destroying platforms [6].

This Onion video has a good point, I don’t want to watch videos on news sites etc [7]. We need ad-blockers that can block video on all sites other than YouTube etc.

Wired has an interesting article about the machines that still need floppy disks, including early versions of the 747 [8]. There are devices to convert the floppy drive interface to a USB storage device which are being used on some systems but which presumably aren’t certified for a 747. The article says that 3.5″ disks cost $1 each because they are rare – that’s still cheaper than when they were first released.

Android Police has an interesting article about un-redacting information in PNG files [9]. It seems that some software on Pixel devices hasn’t been truncating files when editing them, just writing the new data over the top, and some platforms (notably Discord) send the entire file without parsing it (unlike Twitter, for example, which removes EXIF data to protect users). Even though a PNG file is compressed, the earlier data can be deduced from the leftover later part of the file.
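
As a hedged sketch of how one might check a file for this problem (the file name is a hypothetical example): valid PNG data ends just after the IEND chunk, so a file size much larger than the IEND offset indicates trailing data from a previous save:

# Print the byte offset of the IEND chunk marker (-a treats binary as text,
# -b prints byte offsets). A valid PNG ends about 8 bytes after this point.
grep -abo IEND image.png
# Compare with the total file size; a large difference means trailing data.
stat -c %s image.png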

Teen Vogue has an insightful article about the harm that “influencer parents” do to their children [10].

Jonathan McDowell wrote a very informative blog post about his new RISC-V computer running Debian [11]. He says that it takes 10 hours to do a full Debian kernel build (compared to 14 minutes for my 18 core E5-2696), so it’s about 2% the CPU speed of a high end 2015 server CPU, which is pretty good for an embedded device. That is similar to some of the low end Thinkpads that were on sale in 2015.

The Surviving Tomorrow site has an interesting article about a community where all property is community owned [12]. It’s an extremist Christian group and the article is written by a slightly different Christian extremist, but the organisation is interesting. A technology-positive atheist version of this would be good.

Bruce Schneier and Nathan E. Sanders co-wrote an insightful article about how AI could exploit the process of making laws [13]. We really need to crack down on political lobbying; any time a constitution is being amended, prohibiting lobbying should be included.

Anarcat wrote a very informative blog post about the Framework laptops that are designed to be upgraded by the user [14]. The motherboard can be replaced and there are cases designed so you can use the old laptop motherboard as an embedded PC. Before 2017 I would have been very interested in such a laptop. Now I’ve moved to low power laptops and servers for serious compiles and a second-hand Thinkpad X1 Carbon costs less than a new Framework motherboard. But this will be a really good product for people with more demanding needs than mine. Pity they don’t have a keyboard with the Thinkpad Trackpoint.

Strange X11 Grabbing

A couple of days ago I upgraded my home server from Debian/Bullseye to Debian/Testing (soon to be Bookworm). Since then KDE sessions on that system have had a problem where the input queue locks up: the mouse can move and mouse-over events work, but clicking the mouse or pressing the keyboard does nothing. Various web pages suggested that the xdotool program (in the xdotool package in Debian) can address this. The problem is apparently programs “grabbing” the input and not letting it go.

The command “xdotool key XF86LogGrabInfo” causes the xorg server to dump information on its “grabs”. After running that command I looked in /var/log/Xorg.0.log and found that active grabs were only held by /usr/bin/kwin_x11 and /usr/bin/kglobalaccel5. So it seems like a KDE issue. Other systems running X11 with Debian/Testing (such as the laptop I’m using to write this blog post) don’t have the problem, so it could be something related to the KDE configuration of the account used on that system.

The command “xdotool key XF86Ungrab” is supposed to break out of such a grab, but for me didn’t do so.
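
For reference, the diagnostic steps described above in one place:

# Ask the X server to dump its current grab state.
xdotool key XF86LogGrabInfo
# The grab information is appended to the Xorg log.
tail -n 50 /var/log/Xorg.0.log
# Supposed to force active grabs to be released (it didn't work for me).
xdotool key XF86Ungrab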

On the same system running KDE with Wayland works fine in this regard. Does Wayland do things differently and not allow this “grabbing” to block everything? Does KDE have an X11 specific bug? Is there a race condition that just gets triggered by the speed of Xorg on that system but not by the slightly different timings of Wayland? I might never find out.

I previously wrote about problems with Wayland/KDE on laptops [1]. Fortunately this bug happened to occur on a server so inability to reconfigure monitors isn’t necessarily a deal breaker, although being unable to use some of the high-DPI settings for the 4K monitor it has may be an issue. It will be really annoying if some of the laptop configurations I support get this grabbing problem. But since that time I have learned of the kscreen-doctor command which is included in Debian/Testing and can do some of the necessary things, it doesn’t have a man page so you have to run “kscreen-doctor -h” for documentation.
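
As a hedged sketch of kscreen-doctor usage (the output name DP-1 and the mode are hypothetical examples, and the setter syntax may vary between versions; “kscreen-doctor -h” lists what your version supports):

# List outputs with their available modes and current settings.
kscreen-doctor -o
# Example setters: select a mode and scale factor on a named output.
kscreen-doctor output.DP-1.mode.3840x2160@60 output.DP-1.scale.2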

Firebuild

After reading Bálint’s blog post about Firebuild (a compile cache) [1] I decided to give it a go. It’s non-free; the project web site [2] says that it’s free for non-commercial use or commercial trials.

My first attempt at building a Debian package failed due to man-recode using a seccomp() sandbox, I filed Debian bug #1032619 [3] about this (thanks for the quick response Bálint). The solution for me was to edit /etc/firebuild.conf and add man-recode to the dont_intercept list. The new version that’s just been uploaded to Debian fixes it by disabling seccomp() and will presumably allow slightly better performance.
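
For anyone wanting to try it, basic usage (as I understand it) is just prefixing the build command with firebuild; the package build command here is an example:

# First run populates the cache (slower than a plain build).
firebuild dpkg-buildpackage -us -uc
# A subsequent identical build reuses the cached results.
firebuild dpkg-buildpackage -us -uc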

Here are the results of building the refpolicy package: a regular build, the first build with Firebuild (about 35% slower), and a rebuild with Firebuild that reduced the time by almost 42%.

real    1m32.026s
user    4m20.200s
sys     2m33.324s

real    2m4.111s
user    6m31.769s
sys     3m53.681s

real    0m53.632s
user    1m41.334s
sys     3m36.227s

Next I did a test of building a Linux 6.1.10 kernel with “make bzImage -j18“; here are the results from a normal build, the first build with firebuild, and a second build. The real time is worse with firebuild for this on my machine. I think that the relative speeds of my CPU (reasonably fast 18 core) and storage (two of the slower NVMe devices in a BTRFS RAID-1) are the cause of the first build being relatively so much slower for “make bzImage” than for building the refpolicy, as the kernel build process involves a lot more data. For the final build I moved ~/.cache/firebuild to a tmpfs (I have 128G of RAM and not much running on my machine at the time of the tests; a sketch of this is below the timing results). Even then building with firebuild was slightly slower in real time but took significantly less CPU time (user+sys being 20 mins instead of 36m). I also ran several tests with the kernel source tree on a tmpfs but for unknown reasons those tests each took about 6 minutes. Does firebuild or the Linux kernel build process dislike tmpfs for some reason?

real    2m43.020s
user    31m30.551s
sys     5m15.279s

real    8m49.675s
user    64m11.258s
sys     19m39.016s

real    3m6.858s
user    7m47.556s
sys     9m22.513s

real    2m51.910s
user    10m53.870s
sys     9m21.307s
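
A sketch of the tmpfs cache experiment mentioned above (the size is an arbitrary example and the home directory path is hypothetical):

# As root: mount a RAM-backed filesystem over the firebuild cache directory.
# Note this hides any existing cache contents for the duration of the test.
mount -t tmpfs -o size=32G tmpfs /home/user/.cache/firebuild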

One thing I noticed from the kernel build tests is that the total CPU time taken by the firebuild process (as reported by ps) was more than 2/3 of the run time and top usually reported it as taking around 75% of a CPU core. It seems to me that the firebuild process itself is a bottleneck on build speed. Building refpolicy without firebuild has an average of 4.5 cores in use while building the kernel has 13.5. Unless they make a multi-threaded version of firebuild it seems that it won’t give the performance one would hope for from a CPU with 18+ cores. I presume that if I had been running with hyper-threading enabled then firebuild would have been even worse for kernel builds as it would sometimes end up on the second thread of a core. It looks like firebuild would perform better on AMD CPUs as they tend to have fewer CPU cores with greater average performance per core, so a single CPU core for firebuild will be less of a limitation. I presume that the firebuild developers will make it perform better with large numbers of cores in future; the latest Intel laptop CPUs have 16+ cores and servers with 2*40core CPUs are common.

The performance improvement for refpolicy is significant as a portion of build time, but insignificant in terms of real time. A full build of refpolicy doesn’t take enough time to get a Coke and reducing it doesn’t offer a huge benefit, if Firebuild was available in past years when refpolicy took 20 minutes to build (when DDR2 was the best RAM available) then it would be a different story.

There is some potential to optimise the build of refpolicy for the non-firebuild case. Getting it to average more than 4.5 cores in use when there are 18 available should be possible; there are a number of shell for loops in the main Makefile and maybe some of them can be replaced by make constructs to allow running in parallel (a sketch of the idea is below). If it used 7 cores on average then it would be faster in a regular build than it currently is with firebuild and a hot cache. Any advice from make experts would be appreciated.
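
As a hedged illustration of the kind of change that might help (the MODULES variable and build_module command are hypothetical, not from the actual refpolicy Makefile): a shell for loop in a recipe runs serially even under make -j, but per-item targets can be scheduled in parallel. Note that recipe lines must be indented with tabs.

# Before: the loop builds modules one at a time even with "make -j18".
modules:
	for m in $(MODULES); do build_module $$m; done

# After: one target per module, so make can schedule them in parallel.
modules: $(addsuffix .built,$(MODULES))

%.built:
	build_module $*
	touch $@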