The use of HTTP cookies [1] for tracking the web browsing habits of users has been well known for a long time, but I am not aware of any good solution to the problem. A large part of the problem is the needless use of cookies: it seems that many blog servers use cookies even though they provide no benefit to the user. A major culprit in this regard is the Google Analytics service, which sets a cookie with a two-year expiry time when you first visit a web site. The CustomizeGoogle.com Firefox plugin allows you to block the Google Analytics cookies [2] and much more.
It’s unfortunate that Firefox/Iceweasel seems to lack the cookie management functions of Konqueror. Konqueror (the KDE web browser) can be configured to prompt the user for the appropriate action when a cookie is offered. The options include once-only accept or reject, and permanent accept or reject status for the site in question. Of course even this has some issues. When a web site is on the “permanently block cookies” list, it is one that has obviously been viewed intensively on at least one occasion (i.e. many page views) or viewed on multiple occasions, and in some situations this may be a fact that the user does not want revealed. An option to store a list of the hashes of the names of web sites which should be blocked would be useful. It’s also unfortunate that Konqueror (like most browsers) is unable to use Firefox plugins, so given a choice between Konqueror and Firefox I’m always going to lose some features.
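To illustrate the hashed block list idea, here is a minimal sketch in Python. It is not something Konqueror or Firefox implements; the names `site_digest` and `HashedBlocklist` and the per-user salt are assumptions made for the example. Note that hashing only obscures the list from casual inspection: someone who has the salt and a list of candidate site names can still test each name against the stored digests.

```python
import hashlib


def site_digest(hostname: str, salt: bytes) -> str:
    """Return a salted SHA-256 digest of a normalised host name."""
    # Normalise so that "Example.COM." and "example.com" hash identically.
    name = hostname.strip().lower().rstrip(".")
    return hashlib.sha256(salt + name.encode("utf-8")).hexdigest()


class HashedBlocklist:
    """Cookie block list that stores digests instead of site names."""

    def __init__(self, salt: bytes, digests=()):
        self.salt = salt              # hypothetical per-user secret
        self.digests = set(digests)   # what would be written to disk

    def block(self, hostname: str) -> None:
        self.digests.add(site_digest(hostname, self.salt))

    def is_blocked(self, hostname: str) -> bool:
        return site_digest(hostname, self.salt) in self.digests
```

A browser using this scheme would call `is_blocked()` with the host name of each site that offers a cookie, and would only ever write `digests` to its configuration file.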
Update: Andrew Pollock points out that Firefox does allow you to control when cookies are accepted [5]. It’s listed as “Keep Until” with the value of “ask me every time”.
The next issue relates to the storage of cookies. It is a good security feature to have certain types of cookie expire after some period of time. Unfortunately the expiry process requires that the user run the web browser in question. So if for example my browser preferences were to change then I would probably end up with the cookies from the old browser remaining in my home directory for years after their planned expiry date. My home directory has the untouched configuration and data files of many programs that I have not used for four years or more. I’m not sure whether any of them include cookies from web browsers (I have used many web browsers over the years).
I think that the best solution to this problem would be to have a common directory such as ~/.session-state which has files with an mtime indicating when they should expire. A program that wants to store such session data could create a subdirectory such as ~/.session-state/Firefox and then use one file per cookie under that directory. Then the user could have a cron job which deletes all session state files whose mtime is earlier than the current time. Such a cron job would not need to know anything about the actual data in the files; it would just delete the files that are out of date. The exact format of the files would be determined by the application. If there were thousands of cookies (which would lead to a performance problem on some systems if one file was used for each) then there could be one file for each week, with its mtime set to the end of that week. If deleting the old cookies as much as 6 days too late is a serious problem then you are probably going to suffer anyway. Such a state directory could be used for any data which has a fixed expiry time; it would not need to be limited to cookies.
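The cleanup job described above could be sketched as follows. This is a minimal sketch, assuming Python is available to the cron job; the function name `purge_expired` is made up for the example, and ~/.session-state is only the directory name proposed above, not an existing convention.

```python
import os
import stat
import time
from pathlib import Path


def purge_expired(state_dir, now=None):
    """Delete regular files under state_dir whose mtime is in the past.

    The mtime is interpreted as the expiry time, so the file contents
    are never read. Returns the list of paths that were removed.
    """
    now = time.time() if now is None else now
    removed = []
    # os.walk does not follow symlinked directories by default.
    for root, _dirs, files in os.walk(state_dir):
        for name in files:
            path = Path(root) / name
            try:
                info = path.lstat()  # lstat: inspect the link, not its target
                if stat.S_ISREG(info.st_mode) and info.st_mtime < now:
                    path.unlink()
                    removed.append(path)
            except FileNotFoundError:
                # The owning application removed the file first.
                continue
    return removed


if __name__ == "__main__":
    purge_expired(Path.home() / ".session-state")
```

Run daily from cron, this removes expired state for every application that uses the directory, including applications that are no longer installed.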
This would be a minor misuse of the mtime field, but it’s the most reliable way of implementing this and it makes it difficult to mess it up (in terms of exposing private data). Note that the mtime would not have to be the sole source of such data: an application such as Firefox could reset the mtimes on the files to values it considers appropriate (based on file name, file contents, or some metadata stored elsewhere). This matters because certain backup/restore operations, among other things, can be expected to result in the timestamp data on files being lost.
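On the application side, storing data with its expiry time in the mtime could look like the sketch below. The helper names `store_with_expiry` and `reset_expiry` are hypothetical; the only real API relied on is `os.utime`, which sets a file's access and modification times.

```python
import os
from pathlib import Path


def store_with_expiry(path, data: bytes, expires_at: float) -> None:
    """Write data to path and set its mtime to the expiry time.

    expires_at is a Unix timestamp in seconds.
    """
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    # Create the file readable and writable by the owner only.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "wb") as f:
        f.write(data)
    # Set mtime after writing, because writing updates it to "now".
    os.utime(path, (expires_at, expires_at))


def reset_expiry(path, expires_at: float) -> None:
    """Restore the expiry mtime, e.g. after a backup lost the timestamps."""
    os.utime(path, (expires_at, expires_at))
```

An application that keeps the expiry time elsewhere (in the file contents or name) could call `reset_expiry()` on startup for each of its files, so that a restore from backup does not extend or shorten the lifetime of the data.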
Now cookies are not the extent of the problem. It seems that Macromedia/Adobe have some similar functionality in the Flash player [3], but the insidious thing is that Flash cookies are used to respawn HTTP cookies if the user deletes them! After reading about that I discovered some Flash cookies that had been stored on my laptop since 2005 (which was probably the last time I ran Flash). It seems that if you desire security you need to first avoid software from companies that are at best uninterested in, and sometimes seem overtly hostile towards, the privacy needs of users – this is why I haven’t used Flash on machines that matter to me for many years. If I had a lot of spare time I would help out with the GNASH project.
One thing I have been considering is to change my browsing habits to use a different account for untrusted content. The switch user functionality that has been in most Linux distributions for a few years seems to have the potential to make this practical. I am considering setting up a system to allow me to ssh to a guest account to open a web browser window. Then I can switch to the X desktop that has untrusted web sites open and read them. It would be nice if I could extend a web browser to add an extra entry to the menu that is displayed when the secondary mouse button is pressed on a link; then I could make that entry run a script to launch the URL in a new window. I could also use that when I’m at home to launch the URL on a different system.
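The launch script mentioned above might look something like this sketch. Everything specific in it is an assumption made for illustration: the account `guest@localhost`, the guest's X display `:1`, the browser command `firefox`, and the function names. It also assumes that key-based ssh login to the guest account is already set up.

```python
import shlex
import subprocess
from urllib.parse import urlsplit


def build_command(url, account="guest@localhost", display=":1",
                  browser="firefox"):
    """Build an ssh command that opens url in the guest account's browser."""
    # Refuse anything that is not a plain web URL (e.g. file: or options).
    if urlsplit(url).scheme not in ("http", "https"):
        raise ValueError(f"refusing to open non-web URL: {url!r}")
    # ssh passes the remote command through the guest's shell, so every
    # word has to be shell-quoted; URLs often contain '&' and '?'.
    remote = shlex.join(["env", f"DISPLAY={display}", browser, url])
    return ["ssh", account, remote]


def launch(url, **kwargs):
    """Run the browser in the untrusted account."""
    subprocess.run(build_command(url, **kwargs), check=True)
```

Passing a different `account` value, such as a user on another machine, would cover the case of launching the URL on a different system at home.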
One thing that I have to do is to get XGuest (the SE Linux Kiosk Mode) [4] running in Debian. It’s been in Fedora since version 8. With XGuest used for untrusted browsing, nothing gets stored.
This is not the extent of security issues related to web browsing. It’s just a small set of issues that need to be fixed; we have to start somewhere.
- [1] http://en.wikipedia.org/wiki/HTTP_cookie
- [2] http://www.customizegoogle.com/block-google-analytics-cookies.html
- [3] http://www.schneier.com/blog/archives/2009/08/flash_cookies.html
- [4] http://james-morris.livejournal.com/25640.html
- [5] http://blog.andrew.net.au/2009/08/17#firefox_cookie_handling
I’ve noted Arora has the option “accept cookies only from sites you’ve navigated to” (with configurable exceptions). Quite how this is defined and implemented I’m not sure, but it did intrigue me and at first glance seems a plausible restriction to bring into the mix.
But I think ultimately it is like the spam problem. Some cookies are useful to you, some are useful only to the marketers, and some are in between. Unlike spam, it is often hard to tell the difference.
I got miffed at some of the tracking I saw early on, with sites like preferences.com, and have tried various approaches to cookies.
For a long time I had the browser ask me about the first cookie from each domain, but this gets tedious. Then I tried deleting cookies on exiting the browser, but this is also tedious because so many sites use cookies to store authentication tokens (and you have to log in too much as it is).
I’m trying the “privacy is dead, get over it” approach now, using Iceweasel with NoScript to keep JS and Flash well behaved.
You can use the NoScript Firefox plugin – http://noscript.net – since all of Google’s cookies are set by its scripts.