Linux Job Ads

It seems to me that we need to have syndicated feeds for Linux job adverts. To start this I have created a new blog for Linux job adverts. It will have categories for the states and territories of Australia (with a feed for each category), and I will also create Planet installations that take feeds from the blog as well as any other Linux jobs RSS feeds. I will create the Planet installations when I have feeds to add.

If your company regularly advertises Linux jobs in Australia then please create an RSS feed of the adverts and I will syndicate it. If your company occasionally advertises Linux jobs then you can email them to me.

As of this moment there are no positions advertised, but I hope that will change soon.

Committing Data to Disk

I’ve just watched the video of Stewart Smith’s LCA talk Eat My Data about writing applications to store data reliably and not lose it. The reason I watched it was not to learn about how to correctly write such programs, but so that I could recommend it to other people.

Recently I have had problems with a system (that I won’t name) which used fwrite() to write data to disk and then used fflush() to commit it! Below is a section from the fflush(3) man page:

NOTES
       Note that fflush() only flushes the user space buffers provided by  the
       C  library.   To  ensure that the data is physically stored on disk the
       kernel buffers must be flushed too, e.g. with sync(2) or fsync(2).

Does no-one read the man pages for library calls that they use?
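The distinction in that man page note can be sketched in a few lines. This is a minimal illustration in Python rather than C, on the assumption that the same three steps apply: write() fills a user-space buffer much as fwrite() does, flush() hands the data to the kernel as fflush() does, and os.fsync() asks the kernel to commit it to the storage device. The function name write_durably is hypothetical.

```python
import os
import tempfile


def write_durably(path, data):
    """Write bytes to path and ask the kernel to commit them to disk."""
    with open(path, "wb") as f:
        f.write(data)          # data sits in a user-space buffer (like fwrite)
        f.flush()              # buffer is handed to the kernel (like fflush)
        os.fsync(f.fileno())   # kernel is asked to write it to the device


if __name__ == "__main__":
    demo_dir = tempfile.mkdtemp()
    demo_path = os.path.join(demo_dir, "record.dat")
    write_durably(demo_path, b"important data\n")
    print("wrote", demo_path)
```

Stopping after flush() leaves the data in kernel buffers, which is exactly the situation the man page warns about.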

Then recently I discovered (after losing some data) that both dpkg and rpm do not call fsync() after writing package files to disk. The vast majority of Linux systems use either dpkg or rpm to manage their packages. All those systems are vulnerable to data loss if the power fails, a cluster STONITH event occurs, or any other unexpected reboot happens shortly after a package is installed. This means that you can use the distribution defined interface for installing a package, be told that the package was successfully installed, have a crash or power failure, and then find that only some parts of the package were installed. So far I have agreement from Jeff Johnson that RPM 5 will use fsync(), no agreement from Debian people that this would be a good idea, and I have not yet reported it as a bug in SUSE and Red Hat (I’d rather get it fixed upstream first).

During his talk Stewart says sarcastically “everyone uses the same filesystem because it’s the one true way”. Unfortunately I’m getting this reaction from many people when reporting data consistency issues that arise on XFS. The fact that Ext3 by default will delay writes by up to 5 seconds for performance (which can be changed by a mount option) and that XFS will default to delaying up to 30 seconds means that some race conditions will be more likely to occur on XFS than in the default configuration of Ext3. This doesn’t mean that they won’t occur on Ext3, and certainly doesn’t mean that you can rely on such programs working on Ext3.

Ext3 does however have the data=ordered mount option (which seems to be the default configuration on Debian and on Red Hat systems). This means that meta-data is committed to disk after the data blocks that it refers to, so an operation of writing to a temporary file and then renaming it should give the desired result. Of course it’s bad luck for dpkg and rpm users who use Ext3 but chose data=writeback, as they get better performance but significantly less reliability.
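For programs that should not depend on the filesystem's ordering behaviour, the temporary-file-and-rename operation can be combined with explicit fsync() calls. The following is a sketch only, assuming a POSIX system where a directory can be opened and passed to fsync(); the function name atomic_replace is hypothetical and this is not how dpkg or rpm are implemented.

```python
import os
import tempfile


def atomic_replace(path, data):
    """Replace path with data so a crash leaves either the old or the new file."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file lives in the same directory so that the rename
    # never has to cross a filesystem boundary.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())   # commit the data blocks before renaming
        os.replace(tmp_path, path)   # rename over the destination
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)      # do not leave a stray temporary file
        raise
    # Sync the directory so that the rename itself is committed (POSIX only).
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

Because the data is synced before the rename, the result does not rely on data=ordered or on any particular write-delay interval.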

Also we have to consider the range of filesystems that may be used. Debian supports Linux and HURD kernels as main projects and there are less supported sub-projects for the NetBSD, FreeBSD, and OpenBSD kernels as well as Solaris. Each of these kernels has different implementations of the filesystems that are in common and some have native filesystems that are not supported on Linux at all. It is not reasonable to assume that all of these filesystems have the same caching algorithms as Ext3 or that they are unlike XFS. The RPM tool is mainly known for being used on Red Hat distributions (Fedora and RHEL) and on SuSE – these distributions include support for Ext2/3, ReiserFS, and XFS as root filesystems. RPM is also used on BSD Unix and on other platforms that have different filesystems and different caching algorithms.

One objection that was made to using fsync() was the fact that cheap and nasty hard drives have write-back caches that are volatile (their contents disappear on power loss). With such drives reliable operation is impossible, so why not just give up! Pity about the people with good hard drives that don’t do such foolishness, maybe they are expected to lose data as an expression of solidarity with people who buy the cheap and nasty hardware.

Package installation would be expected to be slower if all files are sync’d. One method of mitigating this is to write a large number of files (e.g. up to a maximum of 900) and then call fsync() on each of them in a loop. By the time the last file has been written the first file may already have been entirely committed to disk, and calling fsync() on one file may result in other files being synchronised too. Another issue is that the only time package installation speed really matters is during an initial OS install. It should not be difficult to provide an option to not call fsync() for use during the OS install (where any error would result in aborting the install anyway).
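The batching idea can be sketched as follows. This is an illustration under stated assumptions, not a description of any package manager: the function name install_files and its (name, data) input format are hypothetical, and keeping up to 900 descriptors open assumes the per-process limit on open files is higher than that.

```python
import os
import tempfile

BATCH_LIMIT = 900  # the maximum batch size suggested in the text


def install_files(files, dest_dir, batch_limit=BATCH_LIMIT):
    """Write (name, data) pairs into dest_dir, calling fsync() in batches.

    Returns the number of files written.
    """
    pending = []   # descriptors that have been written but not yet synced
    count = 0

    def sync_pending():
        # Sync in the order the files were written; the oldest files are
        # the most likely to have reached the disk already.
        while pending:
            fd = pending.pop(0)
            try:
                os.fsync(fd)
            finally:
                os.close(fd)

    try:
        for name, data in files:
            fd = os.open(os.path.join(dest_dir, name),
                         os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
            pending.append(fd)
            view = memoryview(data)
            while len(view) > 0:          # os.write may write only part
                written = os.write(fd, view)
                view = view[written:]
            count += 1
            if len(pending) >= batch_limit:
                sync_pending()
        sync_pending()                    # sync the final partial batch
    finally:
        while pending:                    # only non-empty after an error
            os.close(pending.pop())
    return count
```

An installer could skip the sync_pending() calls when run from an OS install, which corresponds to the opt-out suggested above.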

Update: If you are interested in disk performance then you might want to visit the Benchmark category of my blog, my Bonnie++ storage benchmark and my Postal mail server benchmark.

Update: This is the most popular post I’ve written so far. I would appreciate some comments about what you are interested in so I can write more posts that get such interest. Also please see the Future Posts page for any other general suggestions.

LUG talks today

Today I gave three talks at my local LUG. The first was my latest SE Linux talk (I’ll put the notes online soon). The second was a talk about voting.

I asked for a show of hands: who has already decided which party they will vote for at the next federal election? About 12 people put their hands up. I then asked people to put their hands down if they were not a member of the party that they intend to vote for. Only two hands remained raised in the room (including mine)!

With the way party politics works nowadays the major parties are not very interested in representing their core voters. Why try to please people who will vote for you anyway? Instead they try to appeal to swinging voters and pressure groups. If you have decided to vote for a party they have no reason to try and impress you. Therefore you should join the party and try and influence the policy decision making process from within.

The issues that I believe are most important to the Linux community are free software use in government, sane intellectual property laws, the right to a fair trial, and not pandering to the US (which is related to the previous two points).

If you have already decided who to vote for then you should join that party and make your vote count in the party room.

One member of the audience said that he had been a member of one of the major parties but that the internal politics turned him off. If that is your experience then I think you should ask yourself whether you want to vote for a group of people that you can’t work with.

The final talk I gave was about getting speakers for Linux Users’ Groups. There is always difficulty in finding speakers for clubs. Ideally we would have meetings planned a few months ahead of time so that they could be advertised in various ways. Newspapers often have columns dedicated to providing information about public meetings but the lead time is usually at least a week (and the meeting would have to be advertised at least two weeks in advance – so more than a month’s planning ahead is required).

Getting a larger number and variety of speakers will attract new members, encourage existing members to attend more meetings, and inspire members in their Linux work.

Talks can be given by almost anyone. There is a constant demand for speakers who have expert knowledge in the topic, but anyone who is a decent speaker and has the confidence to stand up at the podium can give a good talk. For expert speakers possibilities include academics, industry leaders, leaders of free software development projects, and journalists. But that’s not all, anyone who wants to spend the time researching a topic can give a talk on it. For example I’ve been learning about MySQL recently for my own servers and will probably offer a talk about MySQL aimed at sys-admins who don’t want to become DBAs but who just want to get a database running. I’m not a MySQL expert (and don’t plan to become one) but I believe that there are many people who want to do the things I do with MySQL and who could benefit from a talk that I might give.

The best place to find speakers is a conference or trade-show. If they give a talk that works well you can suggest that they give it again for your local LUG. You can also find speakers at conferences that you can’t attend. If someone visits your country for a trade-show in a different city you could send them an email saying “unfortunately I can’t attend your talk, but if you are interested in visiting my city in the same trip then there will be an audience of X people interested in seeing you”.

There’s no harm in asking; the worst that they can do is decline. Ask everyone who you think can do a good job. Also make sure that you don’t make any commitment (unless you are a member of the LUG committee).

Linux Tour Bus

I have seen buses used for tours that contain bunk beds. If one or more such buses were hired then a group of Linux people could go on a moving Linux conference. This would have to take place in an area with many reasonably sized cities close together and a good number of Linux people in those cities. Probably the EU is the only such area.

A bus (or several buses depending on demand) would then take a group of Linux people through the major cities and have a conference in each one.

Currently there are conferences such as the Debian conference DebConf which receive sufficient sponsorship money to pay for many speakers to attend. Having a similar conference traveling around Europe shouldn’t cost any more money and will give plenty of time for the people in the bus to do some coding along the way.

We already have the Geek Cruises, my idea is to do a similar thing but on the road. Also it isn’t practical to transport an entire conference, so it would probably just be speakers on buses and the audiences would vary from city to city.

BLUG

This weekend I went to the Ballarat install-fest, mini-conf, and inaugural meeting of the Ballarat Linux Users’ Group (BLUG).

This was the second install-fest; the first one was quite successful so it was decided that there was demand for a second. I suggested that we get some of the more experienced members of LUV to attend and give talks about their areas of expertise, making it a mini-conference. I also suggested that we hire a large vehicle to take a number of people to the meeting. Both my suggestions were accepted.

So on Friday evening I was in a Kia XXX with five other people from LUV on our way to Ballarat.

On Saturday we had the install-fest. We started at about 10AM, there were about a dozen people getting help installing Linux and many more attending the mini-conf and just hanging out. For lunch we had a BBQ. In the afternoon I gave a talk on SE Linux and then a brief impromptu talk on Poly-Instantiated Directories while the next speaker was setting up their laptop.

At the end there was the inaugural meeting of BLUG. The president was appointed, and there were some brief discussions about when to schedule meetings. I suggested that BLUG meetings should be either the day before or the day after LUV meetings to increase the incidence of speakers from other regions attending both meetings (LUV is a larger group and has better ability to get speakers from other regions). My suggestion was being seriously considered at the time the meeting adjourned. It was also agreed that a weekend combined LUV and BLUG meeting would be arranged twice a year.

I traveled back to Melbourne by train which was cheap at $9 and comfortable. There was even a power point in the carriage (which I didn’t use as my laptop was charged and the location was not convenient). For the next such event I’ll try and arrange a group to travel on the train together.

The next thing to do is to find other regional centers in Victoria where we can do the same thing. Bendigo might be a possibility.

Also if you are a member of a LUG in a city please consider the possibilities for helping form a LUG in a regional center that’s nearby. I would be happy to provide whatever advice I can to help people replicate this success in areas surrounding other cities, so please email me if you have any questions.

meeting people at Linux conferences

One thing that has always surprised me is how few people talk to speakers after they have finished their lecture. A lecture might have many questions and the questions may be cut off, but when the speaker leaves the room they will usually do so alone.

When I give lectures at conferences I’m always happy to spend more time talking to people who are interested in the topic and disappointed that so few people choose to do so. It seems that other people have similar experiences, there have been several occasions when I have invited speakers to join me for lunch and no-one else has shown interest in joining us.

Usually the most significant factor in making someone offer a talk at a Linux conference is the opportunity to teach other people about the technology that they are working on. People with that motivation will take the opportunity to teach people at lunch, dinner, whenever.

Linux Conf Au has an event called the “Professional Delegates Networking Session” which is regarded by some people as the way to meet speakers (about half the delegates don’t attend so the ratio of speakers to delegates is significantly better than at other conference events). But it seems to me that it’s more efficient to just offer to buy them dinner. When I worked for Red Hat the maximum value for a gift I could accept was $100US, I expect that Red Hat has not changed this policy since then and that most companies that employ speakers at Linux conferences have similar policies. $100US is more than a meal costs at most restaurants that are near a Linux conference.

If I was a manager at a company that sent employees to a Linux conference I would first send email to some speakers who were working in areas of Linux development that were related to the projects that the employees were working on. I would ask the speakers if they would be interested in having dinner bought for them by my company and give them the option of bringing one or two friends along for a free meal (the friends would probably be people who work in similar areas).

some random Linux tips

  • echo 1 > /proc/sys/vm/block_dump
    The above command sets a sysctl to cause the kernel to log all disk writes. Below is a sample of the output from it. Beware that there is a lot of data.
    Jan 10 09:05:53 aeon kernel: kjournald(1048): WRITE block XXX152 on dm-6
    Jan 10 09:05:53 aeon kernel: kjournald(1048): WRITE block XXX160 on dm-6
    Jan 10 09:05:53 aeon kernel: kjournald(1048): WRITE block XXX168 on dm-6
    Jan 10 09:05:54 aeon kernel: kpowersave(5671): READ block XXX384 on dm-7
    Jan 10 09:05:54 aeon kernel: kpowersave(5671): READ block XXX400 on dm-7
    Jan 10 09:05:54 aeon kernel: kpowersave(5671): READ block XXX408 on dm-7
    Jan 10 09:05:54 aeon kernel: bash(5803): dirtied inode XXXXXX1943 (block_dump) on proc
  • Prefixing a bash command with ‘ ’ will prevent a ! operator from running it. For example if you had just entered the command “ ls -al /” then “!l” would not repeat it but would instead match the preceding command that started with a ‘l’. On SLES-10 a preceding space also makes the command not appear in the history while on Debian/etch it does (both run Bash 3.1).
  • LD_PRELOAD=/lib/libmemusage.so ls > /dev/null

    The above LD_PRELOAD will cause a dump to stderr of data about all memory allocations performed by the program in question. Below is a sample of the output.

    Memory usage summary: heap total: 28543, heap peak: 20135, stack peak: 9844
             total calls   total memory   failed calls
     malloc|         85          28543              0
    realloc|         11              0              0   (in place: 11, dec: 11)
     calloc|          0              0              0
       free|         21          12107
    Histogram for block sizes:
        0-15             29  30% ==================================================
       16-31              5   5% ========
       32-47             10  10% =================
       48-63             14  14% ========================
       64-79              4   4% ======
       80-95              1   1% =
       96-111            20  20% ==================================
      112-127             2   2% ===
      208-223             1   1% =
      352-367             4   4% ======
      384-399             1   1% =
      480-495             1   1% =
     1536-1551            1   1% =
     4096-4111            1   1% =
     4112-4127            1   1% =
    12800-12815           1   1% =
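Since block_dump (the first tip above) produces a lot of data, it can help to summarise the log per process. The following is a small sketch that assumes the log lines look exactly like the sample shown in that tip; the function name summarise and the choice of grouping by process, operation and device are illustrative assumptions.

```python
import re
from collections import Counter

# Matches kernel lines of the form shown in the sample, for example:
#   "... kernel: kjournald(1048): WRITE block 152 on dm-6"
LINE_RE = re.compile(
    r"kernel: (?P<proc>.+?)\((?P<pid>\d+)\): "
    r"(?P<op>READ|WRITE) block \S+ on (?P<dev>\S+)"
)


def summarise(lines):
    """Count READ and WRITE entries per (process, operation, device)."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match:   # "dirtied inode" lines and unrelated lines are skipped
            key = (match.group("proc"), match.group("op"), match.group("dev"))
            counts[key] += 1
    return counts


if __name__ == "__main__":
    import sys
    # Usage sketch: pipe the relevant part of the system log into the script.
    for (proc, op, dev), n in summarise(sys.stdin).most_common():
        print(n, proc, op, dev)
```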
    

more about Fedora

In a comment on a previous blog entry I was described as an active Fedora advocate. I don’t think that is an accurate description. I advocate it to appropriate people, which is mostly non-programmers – but as I mentioned that means a larger proportion of the population than to whom I can advocate Debian. It’s not that I’m trying to advocate Fedora, just that it fills a need for many people. I believe that the term Fedora advocate means someone who has the objective of increasing the use of Fedora, and I don’t have such an objective. I am a Linux advocate, a Free Software advocate, and sometimes a Unix advocate (Unix meaning the entire family of Unix-like operating systems). Merely promoting something does not make you an advocate for it. I don’t think of myself as a Debian advocate at this time, but as I am a Debian developer this may change.

It seems that the people who run the Fedora Planet think that my blog has suitable Fedora content, as it’s been added to that planet. Also the Fedora Planet appears to be running an older version of the Planet software as it has the same problem with my blog that Debian Planet had before the upgrade.

Now on the issue of gratis vs libre: As I am not a Red Hat employee I can’t maintain a kernel-xen-nopae package and give it the same status as the kernel-xen package. Even when I was a Red Hat employee I couldn’t have done that – it would require some amount of management approval. I believe that this fundamentally makes Fedora less of a libre distribution. There is no room in Fedora for someone who is an upstream developer and who just wants to maintain their own package. There is Fedora-Extras, but that has a second-class status. Only Red Hat employees can maintain packages in Fedora Core. This makes Fedora fundamentally less libre than Debian. I am not trying to suggest that Red Hat change things in this regard, I believe that Fedora is meeting all its goals and that making Fedora as libre as Debian is not possible given the goals of making a profit on selling support of RHEL.

Chris made a good point. I also believe that MP3 codecs should not be in Debian/main. But I believe that people making mistakes about some issues is not a factor in judging the entire project. I believe that Debian is more libre although some bad decisions were made – largely due to lack of overall management. Fedora has hierarchical management, so when the legal team declares that some software can not be distributed then it gets removed without debate. I guess I could propose a GR to exclude MP3 codecs from main.

Also it should be noted that RHEL Extras has some of this software that is not in Fedora (RealPlayer for example). The Red Hat legal advice was that MP3 codecs need a license, so they ship a licensed version in their commercial distribution. This is the right thing to do for their customers (it’s handy to have and I’m sure that they get a good deal by paying license fees for all their customers) and removing such things from Fedora is the right way to offer a gratis product without unreasonable legal liability.

Naturally Fedora is much more libre than any secret-source OS. Every user has the option of downloading the Fedora source and recompiling it as they wish. I could compile Fedora with a Xen kernel that runs on my hardware and with SE Linux policy that is more restrictive than that which Fedora currently has. I could build custom Fedora install CDs to install things the way I want (which I considered doing when I worked for Red Hat). But the liberty to fork a project does not compare to the liberty to join it, and the liberty to create your own packages in extras does not compare to the liberty to add your own packages that do things differently to the default package.

There are of course positive and negative aspects to this. I started work on SE Linux in Debian in 2001. In 2003 I joined Red Hat to work on SE Linux, in Red Hat I was not the only person dedicated to SE Linux work and other people spent part of their time working on it. The SE Linux work in Red Hat soon eclipsed that of Debian because there was management support. There was no possibility for a package maintainer to refuse to fix a bug that affected SE Linux simply because they didn’t care for it. The positive side of this is that the SE Linux work proceeded quickly and efficiently. The negative side of this is that things which don’t have management support don’t appear in Fedora Core. Exim is a fine MTA but is not in Fedora Core. Some people think that AppArmor is a better option than SE Linux, they are wrong – but in Debian any developer has the option to add AppArmor support and neither I nor any other DD can prevent them. The libre nature of Debian means that as long as basic technical criteria are met DDs can add any package that they wish to the distribution.

These issues however are all related to people who are actively involved in Free Software development. For a typical Free Software user it often doesn’t make much difference, until of course your favourite program doesn’t get management approval to appear in Fedora Core. But the counter argument is that the quality of some of the >10,000 packages in Debian is not so high. You can install a Fedora Core package and have a reasonable expectation about how well it works, but Debian packages are sometimes rather experimental.

I also don’t believe that Debian is a very functional Democracy. Some of the problems of Direct Democracy are demonstrated in Debian. In many ways it is more anarchistic, anarchy gives you liberty for good and bad. Maybe we should consider a Representative Democracy model for Debian.

The benefits of SE Linux

Today I discovered a bug in one of my programs: it called system() and didn’t correctly escape shell meta-characters. Fortunately I had written custom SE Linux policy for it which did domain_auto_trans(foo_t, shell_exec_t, very_restricted_t) so there was no possibility of damage.

The log files (which the daemon could not write to, due to both SE Linux access control and Unix permissions) indicated that no-one had attempted to exploit the bug.
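The class of bug described here, passing unescaped input to a shell through system(), can be avoided in two common ways. The sketch below is in Python rather than the language of the original program (which is not stated), and every name in it is hypothetical. It assumes a POSIX /bin/sh, which is what shlex.quote() is designed for. CHILD stands in for whatever external command the real program runs.

```python
import shlex
import subprocess
import sys

# A tiny child program that prints its first argument. It is a stand-in
# for the external command that the real program would run.
CHILD = [sys.executable, "-c", "import sys; print(sys.argv[1])"]


def run_without_shell(arg):
    """Pass the argument directly to the child so no shell ever parses it."""
    result = subprocess.run(CHILD + [arg], capture_output=True,
                            text=True, check=True)
    return result.stdout.rstrip("\n")


def run_with_quoting(arg):
    """Go through the shell as system() does, but quote every word first."""
    command = " ".join(shlex.quote(word) for word in CHILD + [arg])
    result = subprocess.run(command, shell=True, capture_output=True,
                            text=True, check=True)
    return result.stdout.rstrip("\n")


if __name__ == "__main__":
    hostile = "x; echo INJECTED $(id)"
    print(run_without_shell(hostile))
    print(run_with_quoting(hostile))
```

Avoiding the shell entirely is usually the simpler choice. Either way, a restrictive SE Linux domain for the shell remains useful as a second line of defence for the cases where the escaping is wrong.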

dunc-tank and motivation

The dunc-tank project was established to raise money to compensate some Debian developers who are essential to producing a timely release of Debian. There has been a lot of acrimonious debate about whether this is a good or bad thing. The positive side of it is that the release managers will get to spend more time working on Debian, the negative side is that some volunteers will lose motivation.

However I have felt more motivated to do my unpaid Debian work. During the time that I was employed by Red Hat I was fairly slack about my Debian development work (incidentally Red Hat management were happy for me to continue Debian work so there was no pressure from Red Hat in this regard). Since leaving Red Hat I have been busy doing paid work.

Recently I have started getting involved in Debian work again. I am about to upload a new version of Postal for the first time in three years, I have set up a Xen server for Debian SE Linux development, and I am about to start serious Debian SE Linux development work again.

One factor in this has been my impression that other DDs are taking the release seriously. In the past schedules for release have slipped repeatedly without end. Now there is a schedule and this gives me more motivation to get bugs fixed!