Linux, politics, and other interesting things
I had a problem where the email address email@example.com spammed a Request Tracker (RT)  installation (one of the rules for running a vaction program is that you never respond twice to the same address, another rule is that you never respond to automatically generated messages).
Deleting these tickets was not easy, the RT web interface only supports deleting 50 tickets at a time.
To delete them I first had to find the account ID in RT, the following query does that:
select id from Users where EmailAddressfirstname.lastname@example.org';
Then to mark the tickets as deleted I ran the following SQL command (where X was the ID):
update Tickets set Status='deleted' where Creator=X;
Finally to purge the deleted entries from the database (which was growing overly large) I used the RTx-Shredder  tool. RTx-Shredder doesn’t seem to support deleting tickets based on submitter, which is why I had to delete them first.
I am currently using the following command to purge the tickets. The “limit,500” directive tells rtx-shredder to remove 500 tickets at one time (the default is to only remove 10 tickets).
./sbin/rtx-shredder --force --plugin 'Tickets=status,deleted;limit,500'
There are currently over 34,000 deleted tickets to remove, and rtx-shredder is currently proceeding at a rate of 9 tickets per minute, so it seems that it will take almost three days of database activity to clear the tickets out.
I also need to purge some tickets that have been resolved for a long time, I’m running the following command to remove them:
./sbin/rtx-shredder --force --plugin 'Tickets=status,resolved;updated_before,2008-03-01 01:01:34;limit,500'
With both the rtx-shredder commands running at once I’m getting a rate of 15 tickets per minute, so it seems that the bottleneck is more related to rtx-shredder than MySQL (which is what I expected). Although with two copies running at once I have mysqld listed as taking about 190% of CPU (two CPUs running two capacity). The machine in question has two P4 CPUs with hyper-threading enabled, so maybe running two copies of rtx-shredder causes mysqld to become CPU bottlenecked. I’m not sure how to match up CPU use as reported via top to actual CPU power in a system with hyper-threading (the hyper-threaded virtual CPUs do not double the CPU power). I wonder if this means that the indexes on the RT tables are inadequate to the task.
I tried adding the following indexes (as suggested in the rtx-shredder documentation), but it didn’t seem to do any good – it might have improved performance by 10% but that could be due to sampling error.
CREATE INDEX SHREDDER_CGM1 ON CachedGroupMembers(MemberId, GroupId, Disabled);
CREATE INDEX SHREDDER_CGM2 ON CachedGroupMembers(ImmediateParentId, MemberId);
CREATE UNIQUE INDEX SHREDDER_GM1 ON GroupMembers(MemberId, GroupId);
CREATE INDEX SHREDDER_TXN1 ON Transactions(ReferenceType, OldReference);
CREATE INDEX SHREDDER_TXN2 ON Transactions(ReferenceType, NewReference);
CREATE INDEX SHREDDER_TXN3 ON Transactions(Type, OldValue);
CREATE INDEX SHREDDER_TXN4 ON Transactions(Type, NewValue);