RISC-V and Qemu

RISC-V is the latest RISC architecture to become popular. It is the 5th RISC architecture from the University of California, Berkeley. It seems to be a competitor to ARM because it has no license fees or restrictions on alterations to the architecture (things you have to pay extra for when using ARM). RISC-V also seems to be the most popular architecture to implement in FPGA.

When I first tried to run RISC-V under QEMU it didn’t work, which was probably due to running Debian/Unstable on my QEMU/KVM system and there being QEMU bugs in Unstable at the time. I have just tried it again and got it working.

The Debian Wiki page about RISC-V is pretty good [1]. The instructions there got it going for me. One thing I wasted some time on before reading that page was trying to get a netinst CD image, which is what I usually do for setting up a VM. Apparently there isn’t RISC-V hardware that boots from a CD/DVD so there isn’t a Debian netinst CD image. But debootstrap can install directly from the Debian web server (something I’ve never wanted to do in the past) and that gave me a successful installation.

Here are the commands I used to set up the base image:

apt-get install debootstrap qemu-user-static binfmt-support debian-ports-archive-keyring

debootstrap --arch=riscv64 --keyring /usr/share/keyrings/debian-ports-archive-keyring.gpg --include=debian-ports-archive-keyring unstable /mnt/tmp http://deb.debian.org/debian-ports

I first tried running RISC-V Qemu on Buster, but even ls didn’t work properly and the installation failed.

chroot /mnt/tmp bin/bash
# ls -ld .
/usr/bin/ls: cannot access '.': Function not implemented

When I ran it on Unstable, ls worked but strace didn’t work in a chroot; that gave enough functionality to complete the installation.

chroot /mnt/tmp bin/bash
# strace ls -l
/usr/bin/strace: test_ptrace_get_syscall_info: PTRACE_TRACEME: Function not implemented
/usr/bin/strace: ptrace(PTRACE_TRACEME, ...): Function not implemented
/usr/bin/strace: PTRACE_SETOPTIONS: Function not implemented
/usr/bin/strace: detach: waitpid(1602629): No child processes
/usr/bin/strace: Process 1602629 detached

When running the VM the operation was noticeably slower than the emulation of PPC64 and S/390x, which both ran at an apparently normal speed. When running on a server with an equivalent speed CPU an ssh login was obviously slower due to the CPU time taken for encryption; an ssh connection from a system on the same LAN took 6 seconds to connect. I presume that because RISC-V is a newer architecture there hasn’t been as much effort made on optimising the Qemu emulation, and that a future version of Qemu will be faster. But I don’t think that Debian/Bullseye will give good Qemu performance for RISC-V; probably more changes are needed than can happen before the freeze. Maybe a version of Qemu with better RISC-V performance can be uploaded to backports some time after Bullseye is released.

Here’s the Qemu command I use to run RISC-V emulation:

qemu-system-riscv64 -machine virt -m 1024 \
  -kernel /boot/riscv/vmlinux-5.10.0-1-riscv64 \
  -initrd /boot/riscv/initrd.img-5.10.0-1-riscv64 \
  -append "net.ifnames=0 noresume security=selinux root=/dev/vda ro" \
  -device virtio-blk-device,drive=hd0 -drive file=/vmstore/riscv,format=raw,id=hd0 \
  -device virtio-blk-device,drive=hd1 -drive file=/vmswap/riscv,format=raw,id=hd1 \
  -object rng-random,filename=/dev/urandom,id=rng0 -device virtio-rng-device,rng=rng0 \
  -device virtio-net-device,netdev=net0,mac=02:02:00:00:01:03 \
  -netdev tap,id=net0,helper=/usr/lib/qemu/qemu-bridge-helper \
  -nographic

Note that the -append argument needs to be quoted so that the whole kernel command line is passed as a single option.

Currently the program /usr/sbin/sefcontext_compile from the selinux-utils package needs execmem access on RISC-V, while it doesn’t on any other architecture I have tested. I don’t know why, and support for debugging such things seems to be at an early stage of development; for example the execstack program doesn’t currently work on RISC-V.

RISC-V emulation in Unstable seems adequate for people who are serious about RISC-V development. But if you just want to try a different architecture then PPC64 and S/390x will work better.

Monopoly the Game

The Smithsonian Mag has an informative article about the history of the game Monopoly [1]. The main point, that Monopoly was designed to teach about the problems of inequality, was one I was already aware of, but there are some aspects of the history that I learned from the article.

Here’s an article about using a modified version of Monopoly to teach Sociology [2].

Maria Paino and Jeffrey Chin wrote an interesting paper about using Monopoly with revised rules to teach Sociology [3]. The paper includes the rules, which are interesting and seem well suited to a class.

I think it would be good to have some new games which can teach about class differences. Maybe have an “Escape From Poverty” game where you have choices that include drug dealing as a way to try to improve your situation, or a cooperative game where people try to create a small business. While Monopoly can be instructive it’s based on the economic circumstances of the past; the vast majority of rich people today aren’t rich from land ownership.

Planet Linux Australia

Linux Australia have decided to cease running the Planet installation on planet.linux.org.au. I believe that blogging is still useful and a web page with a feed of Australian Linux blogs is a useful service. So I have started running a new Planet Linux Australia on https://planet.luv.asn.au/. There has been discussion about getting some sort of redirection from the old Linux Australia page, but they don’t seem able to do that.

If you have a blog that has a reasonable portion of Linux and FOSS content and is based in or connected to Australia then email me on russell at coker.com.au to get it added.

When I started running this I took the old list of feeds from planet.linux.org.au, deleted all blogs that didn’t have posts for 5 years and all blogs that were broken and had no recent posts. I emailed people who had recently broken blogs so they could fix them. It seems that many people who run personal blogs aren’t bothered by a bit of downtime.

As an aside, I would be happy to set up the monitoring system I use to monitor any personal web site of a Linux person and notify them by Jabber or email of an outage. I could set it to not alert for a specified period (10 mins, 1 hour, whatever you like) so it doesn’t alert needlessly on routine sysadmin work, and I could have it check SSL certificate validity as well as the basic page header.

Weather and Boinc

I just wrote a Perl script that looks at the Australian Bureau of Meteorology pages to find the current temperature in an area and then adjusts BOINC settings accordingly. The Perl script (in this post after the break, which shouldn’t be in the RSS feed) takes the URL of a Bureau of Meteorology observation point as ARGV[0] and parses that to find the current (within the last hour) temperature. Successive command line arguments are of the form “24:100” and “30:50”, which indicate that below 24C 100% of CPU cores should be used and below 30C 50% of CPU cores should be used. In warm weather having a couple of workstations in a room running BOINC (or any other CPU intensive task) will increase the temperature and also make excessive noise from cooling fans.
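The temperature-to-CPU mapping can be sketched like this (a Python re-implementation of the idea for illustration, not the actual Perl script; what happens above the highest threshold is my assumption):

```python
# Map a temperature to a CPU percentage using "THRESHOLD:PERCENT" rules,
# e.g. ["24:100", "30:50"]: below 24C use 100% of cores, below 30C use 50%.
def cpu_percent(temperature, rules):
    # Parse "24:100" style arguments into sorted (threshold, percent) pairs.
    pairs = sorted(tuple(map(int, r.split(":"))) for r in rules)
    for threshold, percent in pairs:
        if temperature < threshold:
            return percent
    # Above every threshold: use no cores (an assumption, the original
    # script's behaviour here isn't shown in the post).
    return 0

print(cpu_percent(20, ["24:100", "30:50"]))  # 100
print(cpu_percent(27, ["24:100", "30:50"]))  # 50
```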

To change the number of CPU cores used the script changes /etc/boinc-client/global_prefs_override.xml and then tells BOINC to reload that config file. This code is a little ugly (it doesn’t properly parse XML, it just replaces a line of text) and could fail on a valid configuration file that wasn’t produced by the current BOINC code.
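The replace-a-line approach can be sketched like this (max_ncpus_pct is the element BOINC uses for the CPU core percentage in global_prefs_override.xml; this is an illustration, not the Perl script from the post):

```python
import re

# Replace the max_ncpus_pct value in the text of a global_prefs_override.xml
# file. Like the Perl script, this is a text substitution rather than real
# XML parsing, so it could fail on a valid but differently formatted file.
def set_ncpus_pct(xml_text, percent):
    return re.sub(r"<max_ncpus_pct>[^<]*</max_ncpus_pct>",
                  f"<max_ncpus_pct>{percent}</max_ncpus_pct>",
                  xml_text)

print(set_ncpus_pct("<max_ncpus_pct>100.000000</max_ncpus_pct>", 50))
# <max_ncpus_pct>50</max_ncpus_pct>

# After rewriting the file, tell the running client to reload it with:
#   boinccmd --read_global_prefs_override
```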

The parsing of the BoM page is a little ugly too; it relies on the HTML layout of the BoM page – they could make a page that looks identical but breaks the parsing, or even a page that contains the same data but looks different. It would be nice if the BoM published some APIs for getting the weather. One thing that would be good is TXT records in the DNS. DNS supports caching with a specified lifetime and is designed for high throughput in aggregate. If you had a million IOT devices polling the current temperature and forecasts every minute via DNS the people running the servers wouldn’t even notice the load, while a million devices polling a web based API would be a significant load. As an aside I recommend playing nice and only running such a script every 30 minutes; the BoM page seems to be updated on the half hour, so I have my cron jobs running at 5 and 35 minutes past the hour.

If this code works for you then that’s great. If it merely acts as an inspiration for developing your own code then that’s great too! BOINC users outside Australia could replace the code for getting meteorological data (or even interface to a digital thermometer). Australians who use other CPU intensive batch jobs could take the BoM parsing code and replace the BOINC related code. If you write scripts inspired by this please blog about it and comment here with a link to your blog post.

MPV vs Mplayer

After writing my post about VDPAU in Debian [1] I received two great comments from anonymous people. One pointed out that I should be using VA-API (also known as VAAPI) on my Intel based Thinkpad and gave a reference to an Arch Linux Wiki page; as usual the Arch Linux Wiki is awesome and I learnt a lot of great stuff there. I also found the Debian Wiki page on Hardware Video Acceleration [2] which has some good information (unfortunately I had already found all that out through more difficult methods first, I should read the Debian Wiki more often).

It seems that mplayer doesn’t support VAAPI. The other comment suggested that I try mpv, a fork of Mplayer which does support VAAPI, although that feature is disabled by default in Debian.

I did a number of tests playing different videos on my laptop running Debian/Buster with Intel video and on my workstation running Debian/Unstable with ATI video. The first thing I noticed is that mpv was unable to use VAAPI on my laptop, and that VDPAU won’t decode VP9 videos on my workstation – and most 4K videos from YouTube seem to be VP9. So in most cases hardware decoding isn’t going to help me.

The Wikipedia page about Unified Video Decoder [3] shows that only VCN (Video Core Next) supports VP9 decoding while my R7-260x video card [4] has version 4.2 of the Unified Video Decoder which doesn’t support VP9, H.265, or JPEG. Basically I need a new high-end video card to get VP9 decoding and that’s not something I’m interested in buying now (I only recently bought this video card to do 4K at 60Hz).

The next thing I noticed is that, for my combination of hardware and software at least, mpv took about 2/3 of the CPU time that mplayer did on every video I tested. So it seems that using mpv will save me 1/3 of the power and heat from playing videos on my laptop, and save me 1/3 of the CPU power on my workstation in the worst case while sometimes saving significantly more than that.

Conclusion

To summarise quite a bit of time spent experimenting with video playing and testing things: I shouldn’t think too much about hardware decoding until VP9 capable hardware is available (which will be years away for me). But mpv provides some real benefits right now on the same hardware; I’m not sure why.

SMART and SSDs

The Hetzner server that hosts my blog among other things has 2*256G SSDs for the root filesystem. The smartctl and smartd programs report both SSDs as in FAILING_NOW state for the Wear_Leveling_Count attribute. I don’t have a lot of faith in SMART. I run it because it would be stupid not to consider data about possible drive problems, but don’t feel obliged to immediately replace disks with SMART errors when not convenient (if I ordered a new server and got those results I would demand replacement before going live).

Doing any sort of SMART scan will cause a service outage, and replacing the devices would mean 2 outages, 1 for each device.

I noticed the SMART errors 2 weeks ago, so I guess that the SMART claims that both of the drives are likely to fail within 24 hours have been disproved. The system is running BTRFS so I know there aren’t any unseen data corruption issues and it uses BTRFS RAID-1 so if one disk has an unreadable sector that won’t cause data loss.

Currently Hetzner Server Bidding has ridiculous offerings for systems with SSD storage. Search for a server with 16G of RAM and SSD storage and the minimum prices are only €2 cheaper than a new server with 64G of RAM and 2*512G NVMe. In the past Server Bidding has had servers with specs not much smaller than the newest systems going for rates well below the costs of the newer systems. The current Hetzner server is under a contract from Server Bidding which is significantly cheaper than the current Server Bidding offerings, so financially it wouldn’t be a good plan to replace the server now.

Monitoring

I have just released a new version of etbe-mon [1] which has a new monitor for SMART data (smartctl). It also changes the sslcert check to search all IPv6 and IPv4 addresses for each hostname, makes the freespace check look for the filesystem mountpoint, and makes the smtpswaks check use the latest swaks command line.

For the new smartctl check there is an option to treat “Marginal” alert status from smartctl as errors and there is an option to name attributes that will be treated as marginal even if smartctl thinks they are significant. So now I have my monitoring system checking the SMART data on the servers of mine which have real hard drives (not VMs) and aren’t using RAID hardware that obscures such things. Also it’s not alerting me about the Wear_Leveling_Count on that particular Hetzner server.
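As an illustration of what such a check can do (this is a sketch, not the actual etbe-mon code; the sample text is in the table format that smartctl -A produces):

```python
# Sketch of a SMART attribute check: parse "smartctl -A" output and report
# attributes whose WHEN_FAILED column is set, unless they are in an ignore
# list (as with Wear_Leveling_Count on the Hetzner server above).
def failing_attributes(smartctl_output, ignore=()):
    failed = []
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Attribute rows start with the numeric attribute ID and have the
        # WHEN_FAILED value as the 9th column.
        if len(fields) >= 10 and fields[0].isdigit():
            name, when_failed = fields[1], fields[8]
            if when_failed != "-" and name not in ignore:
                failed.append(name)
    return failed

sample = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
177 Wear_Leveling_Count     0x0013   001   001   005    Pre-fail  Always   FAILING_NOW 3335
"""
print(failing_attributes(sample))                                    # ['Wear_Leveling_Count']
print(failing_attributes(sample, ignore=("Wear_Leveling_Count",)))   # []
```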

Testing VDPAU in Debian

VDPAU is the Video Decode and Presentation API for Unix [1]. I noticed an error with mplayer “Failed to open VDPAU backend libvdpau_i965.so: cannot open shared object file: No such file or directory”. Googling that turned up Debian Bug #869815 [2], which suggested installing the packages vdpau-va-driver and libvdpau-va-gl1 and setting the environment variable “VDPAU_DRIVER=va_gl” to enable VDPAU.

The command vdpauinfo from the vdpauinfo package shows the VDPAU capabilities, and it showed that VDPAU was working with va_gl.

When mplayer was getting the error about the missing i965 driver it took 35.822s of user time and 1.929s of system time to play Self Control by Laura Branigan [3] (a good music video to watch several times while testing IMHO) on my Thinkpad X1 Carbon Gen1 with Intel video and an i7-3667U CPU. When I set “VDPAU_DRIVER=va_gl” mplayer took 50.875s of user time and 4.207s of system time but didn’t have the error.

It’s possible that other applications on my Thinkpad might benefit from VDPAU with the va_gl driver, but it seems unlikely that any would benefit enough to make up for mplayer taking more time. It’s also possible that the Self Control video I tested with was a worst case scenario, but even so, taking almost 50% more CPU time makes it unlikely that other videos would get a benefit.

For this sort of video (640×480 resolution) it’s not a problem; 38 seconds of CPU time to play a 5 minute video isn’t a real problem (although it would be nice to use less battery). For a 1600×900 resolution video (the resolution of the laptop screen) it took 131 seconds of user time to play a 433 second video. That’s still not going to be a problem when playing on mains power but will suck a bit when on battery. Most Thinkpads have Intel video and some have NVidia as well (which has issues from having 2 video cards and from having poor Linux driver support). So it seems that the only option for battery efficient video playing on the go right now is to use a tablet.

On the upside, screen resolution is not increasing at a comparable rate to Moore’s law so eventually CPUs will get powerful enough to do all this without using much electricity.

Thinkpad Storage Problem

For a while I’ve had a problem with my Thinkpad X1 Carbon Gen 1 [1] where storage pauses for 30 seconds. It’s become more common recently, and unfortunately everything seems to depend on storage (ideally a web browser playing a video from the Internet could keep doing so with no disk access, but that’s not what happens).

echo 3 > /sys/block/sda/device/timeout

I’ve put the above in /etc/rc.local, which should make recovery from storage problems take only 3 seconds instead of 30. The problem hasn’t recurred since that change, so I don’t know if it works as desired. The Thinkpad is running BTRFS and no data is reported as being lost, so it’s not a problem to use it at this time.

The bottom of this post has an example of the errors I’m getting. A friend said that his searches for such errors show people claiming that it’s a “SATA controller or cable” problem, which for a SATA device bolted directly to the motherboard means it’s likely to be a motherboard problem (ie more expensive to fix than the $289 I paid for the laptop almost 3 years ago).

A Lenovo forum discussion says that the X1 Carbon Gen1 uses an unusual connector that’s not M.2 SATA and not mSATA [2]. This means that a replacement is going to be expensive: $100US for a replacement from eBay, when an M.2 SATA or NVMe device would cost about $50AU from my local store. It also means that I can’t use the replacement for anything else. If my laptop took a regular NVMe I’d buy one and test it out; even if it didn’t solve the problem a spare NVMe device is always handy to have. But I’m not interested in spending $100US for a part that may turn out to be useless.

I bought the laptop expecting that there was nothing I could fix inside it. While it is theoretically possible to upgrade the CPU etc in a laptop, most people find that the effort and expense make it better to just replace the entire laptop. With the ultra light category of laptops the RAM is soldered to the motherboard and there are even fewer options for replacing things. So while it’s annoying that Lenovo didn’t use a standard interface for this device (they did for other models in the range) it’s not a situation that I hadn’t expected.

Now the question is what to do next. The Thinkpad has an SD socket and micro SD cards are getting large capacities nowadays. If it won’t boot from an SD card I could boot from USB and then run from SD. So even if the internal SATA device fails entirely it should still be useful. I wonder if I could find a Thinkpad with a similar problem going cheap, as I’m pretty sure that a Windows user couldn’t make any storage device usable the way I can with Linux.

I’ve looked at prices on auction sites but haven’t seen a Thinkpad X1 Carbon going for anywhere near the price I got this one. The nearest I’ve seen is around $350 for a laptop with significant damage.

While I am a little unhappy at Lenovo’s choice of interface, I’ve definitely got a lot more than $289 of value out of this laptop. So I’m not really complaining.

[315041.837612] ata1.00: status: { DRDY }
[315041.837613] ata1.00: failed command: WRITE FPDMA QUEUED
[315041.837616] ata1.00: cmd 61/20:48:28:1e:3e/00:00:00:00:00/40 tag 9 ncq dma 16384 out
                         res 40/00:01:00:00:00/00:00:00:00:00/e0 Emask 0x4 (timeout)
[315041.837617] ata1.00: status: { DRDY }
[315041.837618] ata1.00: failed command: READ FPDMA QUEUED
[315041.837621] ata1.00: cmd 60/08:50:e0:26:84/00:00:00:00:00/40 tag 10 ncq dma 4096 in
                         res 40/00:01:00:00:00/00:00:00:00:00/e0 Emask 0x4 (timeout)
[315041.837622] ata1.00: status: { DRDY }
[315041.837625] ata1: hard resetting link
[315042.151781] ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[315042.163368] ata1.00: ACPI cmd ef/02:00:00:00:00:a0 (SET FEATURES) succeeded
[315042.163370] ata1.00: ACPI cmd f5/00:00:00:00:00:a0 (SECURITY FREEZE LOCK) filtered out
[315042.163372] ata1.00: ACPI cmd ef/10:03:00:00:00:a0 (SET FEATURES) filtered out
[315042.183332] ata1.00: ACPI cmd ef/02:00:00:00:00:a0 (SET FEATURES) succeeded
[315042.183334] ata1.00: ACPI cmd f5/00:00:00:00:00:a0 (SECURITY FREEZE LOCK) filtered out
[315042.183336] ata1.00: ACPI cmd ef/10:03:00:00:00:a0 (SET FEATURES) filtered out
[315042.193332] ata1.00: configured for UDMA/133
[315042.193789] sd 0:0:0:0: [sda] tag#10 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[315042.193791] sd 0:0:0:0: [sda] tag#10 Sense Key : Illegal Request [current]
[315042.193793] sd 0:0:0:0: [sda] tag#10 Add. Sense: Unaligned write command
[315042.193795] sd 0:0:0:0: [sda] tag#10 CDB: Read(10) 28 00 00 84 26 e0 00 00 08 00
[315042.193797] print_req_error: I/O error, dev sda, sector 8660704
[315042.193810] ata1: EH complete

Phone Plan Comparison 2020

It’s been over 6 years since my last post comparing 3G data plans in Australia.

I’m currently using Aldi Mobile [1]. They have a $15 package which gives 3G of data and unlimited calls and SMS in Australia for 30 days, and a $25 package that gives 20G of data and unlimited international calls to 15 unspecified countries (presumably the US and much of the EU). It’s on the Telstra network so has good access all around Australia. I’m on the grandfathered $20 per 30 days plan which gives me more than 3G of data (not sure how much but I need more than 3G) while costing less than the current offer of $25.

Telco        Cost          Data   Network    Calling
Aldi         $15/30 days   3G     Telstra    Unlimited National
Aldi         $25/30 days   20G    Telstra    Unlimited International
Dodo [2]     $5/month      none   Optus      Unlimited National + Unlimited International SMS
Moose [3]    $8.80/month   1G     Optus      300min national calls + Unlimited National SMS
AmaySIM [4]  $10/month     2G     Optus      Unlimited National
Kogan [5]    $135/year     100G   Vodafone   Unlimited National
Kogan        $180/year     180G   Vodafone   Unlimited National
Dodo         $20/month     12G    Optus      Unlimited National + Unlimited International SMS + 100mins International talk

In the above table I put Aldi at the top because Telstra has the best network. If you want to use a mobile phone anywhere in Australia then Telstra is the best choice. If you only care about city areas then the other networks are usually good enough. The rest of the services are in order of price, cheapest first. For every service I only mentioned the offerings that are price competitive.

I didn’t list offerings larger than these. Every service in this table has more expensive plans with more data, including 300G/year or more, but most people don’t need them.
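One rough way to compare plans with different billing periods is to annualise them (a quick sketch using figures from the table above; the comparison method is mine, not part of any telco’s marketing):

```python
# Convert a plan charged per billing period into yearly cost and yearly data.
def annualise(price, period_days, gb):
    """Return (cost per year, GB per year) for a plan charged per period."""
    periods = 365 / period_days
    return round(price * periods, 2), round(gb * periods, 1)

# Aldi $15 per 30 days with 3G of data:
print(annualise(15, 30, 3))      # (182.5, 36.5)
# Kogan $135 per year with 100G of data:
print(annualise(135, 365, 100))  # (135.0, 100.0)
```

So the yearly Kogan plans give far more data per dollar, while the Aldi plans pay for the better Telstra coverage.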

Please inform me in the comments of any services I missed.

Electromagnetic Space Launch

The G-Force Wikipedia page says that humans can survive 20G horizontally “eyes in” for up to 10 seconds and 10G for 1 minute.

An accelerator of 14G for 10 seconds (well below the level that’s unsafe) gives a speed of mach 4 and an acceleration distance of just under 7km. Launching a 100 metric ton spacecraft in that way would require a force of about 14MN, which at the final speed of mach 4 works out to a peak power of about 19GW, plus some extra for the weight of the part that contains magnets which would be retrieved by parachute. That’s far more power than a train or tram network uses, so rather than diverting power from a transit network the launcher would need local energy storage (flywheels or similar) charged up between launches. The Rocky Mountains in the US peak at 4.4KM above sea level, so a magnetic launch that starts 2.6KM below sea level and extends the height of the Rocky Mountains would do.

A speed of mach 4 vertically would get a height of 96Km if we disregard drag, that’s almost 1/4 of the orbital altitude of the ISS. This seems like a more practical way to launch humans into space than a space elevator.
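The arithmetic above can be checked directly (taking g as 9.8m/s^2 and mach 1 as 343m/s at sea level); it also shows why the peak power at the end of the launch is in the gigawatt range:

```python
g = 9.8        # gravitational acceleration, m/s^2
mach = 343.0   # speed of sound at sea level, m/s

a = 14 * g              # 14G acceleration
t = 10                  # seconds of acceleration
v = a * t               # final speed, m/s
d = 0.5 * a * t ** 2    # length of the launch track, m
h = v ** 2 / (2 * g)    # coasting height with no drag, m
F = 100_000 * a         # force on a 100 metric ton craft, N
P = F * v               # peak power at the end of the track, W

print(round(v / mach, 1))   # 4.0 (mach 4)
print(round(d / 1000, 1))   # 6.9 (km of track)
print(round(h / 1000))      # 96 (km of altitude)
print(round(P / 1e9))       # 19 (GW of peak power)
```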

The Mass Driver page on Wikipedia documents some of the past research on launching satellites that way, with shorter launch hardware and significantly higher G forces.