servermgrd bus error

January 28, 2007

This happened after a failed attempt to add a signed cert from a CA - servermgrd just crashed. Trying to disable all SSL (in /Library/Preferences/com.apple.servermgrd.plist) had no effect. Starting it in debug mode just said this:

# servermgrd -d

2007-01-28 23:39:04.717 servermgrd[20540] *** _NSAutoreleaseNoPool(): Object 0x306030 of class NSCFData autoreleased with no pool in place - just leaking

2007-01-28 23:39:04.717 servermgrd[20540] *** _NSAutoreleaseNoPool(): Object 0x306420 of class NSCFData autoreleased with no pool in place - just leaking

2007-01-28 23:39:04.733 servermgrd[20540] Entering initialize

2007-01-28 23:39:05.600 servermgrd[20540] Starting idle processing

Bus error

Well, it turns out that the stuff about memory leaking is “normal”. Here’s the output of the same command on a totally unrelated, perfectly in-order Tiger server:

# servermgrd -d

2007-01-28 23:52:46.348 servermgrd[21665] *** _NSAutoreleaseNoPool(): Object 0x306020 of class NSCFData autoreleased with no pool in place - just leaking

2007-01-28 23:52:46.348 servermgrd[21665] *** _NSAutoreleaseNoPool(): Object 0x306410 of class NSCFData autoreleased with no pool in place - just leaking

2007-01-28 23:52:46.349 servermgrd[21665] Entering initialize

It’s the Bus error that I’m worried about. Most of the output of ktrace servermgrd -d and kdump -f ktrace.out is incomprehensible, and so is pretty much all of /Library/Logs/CrashReporter/servermgrd.crash.log.
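One way to make the kdump output less overwhelming is to keep only the pathname lookups, since those show which files the process touched before dying. This is a minimal sketch, assuming kdump labels those records NAMI as BSD kdump does; last_paths is a made-up helper name, not part of any tool.

```shell
# Keep only the pathname lookups (NAMI records) from kdump output read on
# stdin, and print the last N of them (default 20).
# "last_paths" is a hypothetical helper name chosen for this sketch.
last_paths() {
    grep 'NAMI' | tail -n "${1:-20}"
}

# Usage on the affected server (ktrace writes ktrace.out in the current
# directory):
#   ktrace servermgrd -d
#   kdump -f ktrace.out | last_paths 20
```

The last few paths printed are the files opened or looked up right before the crash, which narrows down where to look.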

Checking the last lines of kdump (kdump -f ktrace.out | tail -n 20) did show /Library/Keychains/System.keychain, just shortly before the crash. A find -ctime 2 confirmed that System.keychain was modified right around that fateful moment when this problem started. For the heck of it, I decided to move the old keychain aside and create a new one:

# mv System.keychain System.keychain.old

# security create-keychain /Library/Keychains/System.keychain
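The check and the move-aside step above could be wrapped roughly like this. It is only a sketch: recently_changed and set_aside are hypothetical helper names, and it uses -ctime -2 (changed less than two days ago) rather than the exact -ctime 2, so the window is a bit wider.

```shell
# List files under a directory whose status changed within the last N days
# (default 2), to spot what was touched around the time of the crash.
# "recently_changed" is a hypothetical helper name.
recently_changed() {
    find "$1" -type f -ctime "-${2:-2}"
}

# Move a file aside under a timestamped name, so an earlier ".old" copy is
# never overwritten. Prints the new name. "set_aside" is hypothetical too.
set_aside() {
    backup="$1.$(date +%Y%m%d%H%M%S).old"
    mv "$1" "$backup" && echo "$backup"
}

# Usage, as root:
#   recently_changed /Library/Keychains
#   set_aside /Library/Keychains/System.keychain
#   security create-keychain /Library/Keychains/System.keychain
```

The timestamp in the backup name only matters if you end up repeating the experiment; with a plain .old suffix a second attempt would clobber the first copy.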

Sure enough, servermgrd was open for business again:

server:/Library/Keychains root# servermgrd -d

2007-01-29 00:20:48.654 servermgrd[20712] *** _NSAutoreleaseNoPool(): Object 0x306030 of class NSCFData autoreleased with no pool in place - just leaking

2007-01-29 00:20:48.655 servermgrd[20712] *** _NSAutoreleaseNoPool(): Object 0x306420 of class NSCFData autoreleased with no pool in place - just leaking

2007-01-29 00:20:48.655 servermgrd[20712] Entering initialize

2007-01-29 00:20:48.946 servermgrd[20712] Starting idle processing

2007-01-29 00:20:51.534 servermgrd[20712] Done with idle processing

I was actually able to salvage the certs and private keys from the damaged keychain file like this:

# security export -k /Library/Keychains/System.keychain.old -t all -o ./all.pem

and then import them back into the fresh keychain:

# security import ./all.pem -P -k /Library/Keychains/System.keychain

2 keys imported.

3 certificates imported.
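To double-check that nothing got lost between export and import, the exported file can be counted before importing it. A small sketch, assuming all.pem is ordinary PEM text with the usual BEGIN markers; count_pem is a made-up helper name.

```shell
# Count private keys and certificates in a PEM file by their BEGIN markers.
# "count_pem" is a hypothetical helper name; it assumes plain PEM text.
count_pem() {
    # grep -c prints 0 but exits non-zero when nothing matches, hence "|| true"
    certs=$(grep -c -e '^-----BEGIN CERTIFICATE-----' "$1" || true)
    keys=$(grep -c -e '^-----BEGIN .*PRIVATE KEY-----' "$1" || true)
    echo "$keys keys, $certs certificates"
}

# Usage:
#   count_pem ./all.pem
```

If the numbers match what security import reports afterwards, everything in the export made it into the new keychain.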

The bad news is that although Server Admin works again, I’m unable to use the Certificate Manager. Any attempt to either add or import a cert is answered with a dull “The selected certificate could not be retrieved. Going back to the list.” Oh well, just another good reason to get more comfortable with the CLI - it’s not as fragile… By the way, if you change SSL certs in httpd conf files, it seems better to stop and then start the server rather than restart it (otherwise the old cert is still used).
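A way to see whether the web server really picked up the new cert is to compare the fingerprint of the file on disk with the one the server presents. This is a sketch using openssl, not something from the original troubleshooting session; the function names, the host and the cert path in the usage lines are placeholders.

```shell
# Fingerprint of a PEM certificate file on disk.
file_fingerprint() {
    openssl x509 -in "$1" -noout -fingerprint
}

# Fingerprint of the certificate a running server presents.
# $1 = host, $2 = port (default 443)
served_fingerprint() {
    openssl s_client -connect "$1:${2:-443}" </dev/null 2>/dev/null |
        openssl x509 -noout -fingerprint
}

# Succeeds only if the file's fingerprint is non-empty and matches the
# one served. $1 = cert file, $2 = host, $3 = port (optional)
cert_is_live() {
    want=$(file_fingerprint "$1") || return 2
    [ -n "$want" ] && [ "$want" = "$(served_fingerprint "$2" "$3")" ]
}

# Usage (path and host are placeholders):
#   apachectl stop && apachectl start
#   cert_is_live /path/to/server.crt localhost && echo "new cert is live"
```

If cert_is_live fails right after a plain restart but succeeds after a stop and start, that matches the behaviour described above.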

Just for the record, the cert and key in /etc/servermgrd are disposable. If you delete them, they will be re-created by servermgrd on the next launch. Oh, and there’s also certadmin, but it did absolutely nothing for me.