Sebastian Kirsch: Blog

Friday, 20 August 2004

unison; or: Why I don’t like Mac OS X software

Filed under: — Sebastian Kirsch @ 00:18

I recently bought the most wonderful little toy, a PowerBook G4 12″ with all the bells and whistles. I’m a Unix professional, and since it’s almost impossible to find a Unix notebook, the PowerBook seemed to be a good choice. It’s based on a BSDified Mach kernel, with a BSD userland. Having worked with some half-dozen Unix operating systems, I thought that one more or less wouldn’t make a difference.

Well, I was half right. As long as you only look at the Unix part, working with Mac OS X is indeed quite pleasant and familiar.

The problem starts when you realize that there’s more to Mac OS X than the Unix part. For one thing, there’s the original (pre-OS X) Mac OS culture, and there are also people who write for Unix but don’t understand Unix.

This was exemplified when I recently looked for an application to synchronize my files with my desktop computer. (I’m running Linux on the desktop.)

Asking the local OS X gurus didn’t provide helpful answers. Most answers pointed me to tools for synchronizing with other Macs, which isn’t what I wanted to do. (For example, there’s psync, which uses a Perl module called MacOSX::File. Not likely to work under Linux.)

Now what does an old Unix hand think of for file synchronization? rsync, of course. The problem is that rsync only works in one direction: It can create and maintain a mirror of a filesystem very efficiently, but it can’t synchronize two filesystems. One helpful soul directed me to a tool called syncIt! that was mentioned on a Mac news site that very day. Hm, it’s supposed to be based on rsync, so I thought – what kind of magic does the author work to use it for synchronization?

None, as it turned out. The whole “syncIt!” application, apart from the cool name, is just a shell script that does nothing but run rsync twice: first in one direction, then in the other. All nicely packaged with a Cocoa dialog that asks you for the hostname of the remote computer. That’s it. You can’t tell it to exclude some files, or browse differences, or use any of the myriad other features one might envision for a synchronization tool. You can’t even save the hostname of the remote site. Nothing. It’s just a fricking shell script. So that’s how you get on the front page of a Mac news site.
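To make the “run rsync twice” approach concrete, here is a minimal sketch that builds the two command lines. The function name, the paths and the choice of flags are assumptions for illustration (I don’t know which options syncIt! actually passes); `-a` (archive) and `-u` (skip files that are newer on the receiving side) are ordinary rsync options.

```python
import shlex


def two_way_rsync_commands(local_dir, remote_host, remote_dir):
    """Return the two rsync invocations of a naive two-pass "sync".

    Hypothetical helper: the flag set is an assumption, not what syncIt! uses.
    """
    # A trailing slash tells rsync to copy the directory's contents.
    local = local_dir.rstrip("/") + "/"
    remote = f"{remote_host}:{remote_dir.rstrip('/')}/"
    flags = ["-a", "-u"]
    push = ["rsync", *flags, local, remote]  # pass 1: local -> remote
    pull = ["rsync", *flags, remote, local]  # pass 2: remote -> local
    return [push, pull]


if __name__ == "__main__":
    # Only prints the commands; nothing is executed.
    for cmd in two_way_rsync_commands("/Users/me/Documents", "desktop", "/home/me/Documents"):
        print(shlex.join(cmd))
```

Neither pass knows what the two sides looked like after the previous run. A file deleted on one side is therefore simply copied back from the other, and when both sides changed the same file, the newer timestamp silently wins. That is the gap a real synchronizer has to fill.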

That did piss me off a bit. At the risk of sounding elitist: where I come from, we don’t create hundreds of KB of “application” just for running two commands. We may pass those two commands around and tell people, “That’s how I do it, have a look at it.” But an application? No. That word is reserved for something that actually achieves something, and does not merely repackage a ten-year-old tool.

But there’s hope. Specifically, there’s unison. Supposedly, it’s a real file synchronization tool, written in OCaml, cross-platform (Unix and Windows), fast, and stable. Sounds good. In principle. It could be better.

I tried unison 2.9.1, compiled with OCaml 3.07 from fink.

The first problem is the Mac’s file system. Mac OS X doesn’t use a regular Unix filesystem, but the regular Mac OS filesystem, called HFS+. The Unix part of Mac OS X just sees a POSIX-like view of HFS+. One piece of brain damage that HFS+ shares with Windows is that it doesn’t distinguish the case of filenames, and that at the filesystem level! This is quite contrary to the Unix way of treating filenames: stuff anything you want into them; as long as it doesn’t contain “/” or a null byte, we don’t care. Unfortunately, Mac OS X does care.
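On the Linux side, two names that differ only in case can happily live in the same directory; on a case-insensitive filesystem they would land on the same file. A minimal sketch for spotting such pairs before syncing follows. The function names are hypothetical, and `str.casefold()` is only an approximation of whatever case-folding rules HFS+ really applies.

```python
import os
from collections import defaultdict


def colliding_names(names):
    """Group names that become identical once case is ignored."""
    by_folded = defaultdict(list)
    for name in names:
        # casefold() approximates case-insensitive comparison (assumption).
        by_folded[name.casefold()].append(name)
    return [sorted(group) for group in by_folded.values() if len(group) > 1]


def case_collisions(root):
    """Walk a tree and report, per directory, names that differ only in case."""
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        for group in colliding_names(dirnames + filenames):
            found.append((dirpath, group))
    return found
```

Calling `case_collisions` on the Linux home directory would list every directory in which, say, `Makefile` and `makefile` sit side by side.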

Being used to this woe from the Windows world, unison has a switch called “ignorecase”. This would come in handy, if not for one thing: once you activate this switch, unison presumes that you are working on a Windows filesystem. And that means that several filenames that are perfectly legal on HFS+ are presumed to be illegal, for example filenames that contain “:”.

unison detects when you try to synchronize such a file, and aborts the whole synchronization process. But it doesn’t tell you the filename of the offending file. You are left to guess which one of your 2GB of files caused the error. Then you start the synchronization again, only to find that halfway through those 2GB, there’s another offending filename that you hadn’t thought of. I still haven’t managed to weed all of those files out of my home directory.
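Since unison won’t name the offender, one can go looking for candidates up front. The following is a minimal sketch under a loud assumption: that unison rejects the characters Windows itself forbids in filenames. Only “:” is confirmed by my own experience; the rest of the set and the function names are illustrative.

```python
import os

# Characters Windows does not allow in filenames. "/" is omitted because it
# cannot occur inside a Unix filename anyway. Whether unison 2.9.1 checks
# exactly this set is an assumption.
WINDOWS_FORBIDDEN = set('<>:"\\|?*')


def offending_chars(name):
    """Return the sorted list of suspicious characters found in a filename."""
    return sorted(c for c in set(name) if c in WINDOWS_FORBIDDEN or ord(c) < 32)


def find_offending(root):
    """Walk a tree and list (path, bad characters) for every suspicious name."""
    hits = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            bad = offending_chars(name)
            if bad:
                hits.append((os.path.join(dirpath, name), bad))
    return hits
```

Running `find_offending` over the home directory once gives the whole list at a time, instead of one aborted synchronization per file.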

Please, dear unison developers: when you provide a switch, make sure that it does what its name says it does. If the name says “ignorecase”, then it should be set to ignore the case of filenames. If it’s called “windowsfilesystem” or “fatfs”, it should accommodate the quirks of Windows.

Once you have managed that, a more graceful way of failing would be nice. unison is interactive anyway, and it’s supposed to change both replicas. So why don’t you provide a way of resolving those conflicts? For example, a way of renaming the offending files before transmitting them? That would be really dandy.

As of this release, unison also does not work with filenames that contain accented characters. I haven’t been able to work out yet whether this is unison’s, OCaml’s or Mac OS X’s fault. Another class of files to weed out of your home directory, because they will cause your synchronization to abort halfway through.

UPDATE: It appears that Mac OS X itself blocks filenames that are not valid Unicode (or UTF-8) strings. A small C program confirms that calling open("b\374rger",…) (“bürger” in ISO-8859-1) fails with an “invalid argument” error. open("buerger\xC2\xA9lars",…) (“buerger©lars” in UTF-8) works correctly, though, and also displays correctly in Finder.
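The same distinction can be checked without touching the filesystem: treat the name as raw bytes and see whether it decodes as UTF-8. This is a minimal sketch, not a reproduction of what the kernel does; the function names are hypothetical, and the two sample names are the ones from the C test above.

```python
import os


def is_valid_utf8(name: bytes) -> bool:
    """Return True if the raw filename bytes form a valid UTF-8 string."""
    try:
        name.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def non_utf8_names(root):
    """Walk a tree in bytes mode and list paths whose names are not UTF-8."""
    bad = []
    # Passing bytes to os.walk makes it yield raw, undecoded names.
    for dirpath, dirnames, filenames in os.walk(os.fsencode(root)):
        for name in dirnames + filenames:
            if not is_valid_utf8(name):
                bad.append(os.path.join(dirpath, name))
    return bad
```

With this, `is_valid_utf8(b"b\374rger")` is false and `is_valid_utf8(b"buerger\xc2\xa9lars")` is true, matching which of the two open() calls succeeded.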

So unison would have to include character set conversions when synchronizing between different operating systems. This is not a nice prospect.
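For a sense of what such a conversion involves, here is a minimal sketch for a single filename. It assumes the source side uses ISO-8859-1, as in the “bürger” example; the function name is hypothetical. The normalization form is a parameter because HFS+ is generally described as storing names in a decomposed form, which I have not verified here.

```python
import unicodedata


def latin1_to_utf8(name: bytes, form: str = "NFC") -> bytes:
    """Re-encode an ISO-8859-1 filename as UTF-8 in the given normalization form.

    Assumption: the input really is ISO-8859-1. Every byte sequence decodes
    under that charset, so a wrong guess produces garbage, not an error.
    """
    text = name.decode("iso-8859-1")
    return unicodedata.normalize(form, text).encode("utf-8")
```

`latin1_to_utf8(b"b\xfcrger")` gives the precomposed UTF-8 name, while passing `form="NFD"` splits the “ü” into “u” plus a combining diaeresis. A synchronizer would also have to map names back in the other direction and remember which charset each replica uses, which is where the prospect stops being nice.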

I still don’t think that an operating system should impose semantics on filenames at system call level.

Still, unison is more powerful than rsync, and I hope I can use it on a daily basis once those quirks have been worked out.

Next I’ll be trying to put together some kind of backup solution. I haven’t found a suitable native OS X application, so I’ll try to cobble something together using amanda and hfstar. Should be fun …

Copyright © 1999--2004 Sebastian Marius Kirsch webmaster@sebastian-kirsch.org , all rights reserved.