On 16.10.2008, at 08:53, Mark Cockfield wrote:
My initial thought is that if the number of users grows into the thousands, a file-system-based repository could become unwieldy, resulting in maintenance/management issues.
You keep repeating that, but w/o saying why? :-) It's much easier to deal with a million individual files than with one super fat database. Just consider the FS a database.
Then I started wondering about the scalability of time range filtered reports and performance in general.
OK. First, you seem to mix up scalability (how easily you can grow, how much you can grow) with performance. Of course they are related, but they are different.

Consider a calendar collection. You could (and many 'traditional' servers do) store all events of all users in a single SQL table. This works fine up to a certain count, maybe ~10,000,000 records to give a very rough figure. That leaves room for very good performance with just 1,000 users at 10,000 events each. Beyond that you will need to work increasingly hard to actually scale the database. The system does not scale well beyond this point (though it has great performance below it).

Contrast that with CalServer. It stores the structured data per server, each in its own (SQLite3) database. Queries across different calendars are slower because multiple databases need to be touched. BUT you can easily add more databases, hence the system scales very well (in fact w/ super high limits on the number of calendars).

But more importantly, keep in mind that the primary consumers of the CalServer are calendaring clients: iCal, Outlook, Thunderbird etc. Those clients always synchronize full calendars with their local cache and run queries/reports *inside* that cache. Hence the server is optimized to deliver the raw data quickly to the clients, which then do the actual work. Well, and this works best with plain files. Now either you also do the work in the client or your middleware, or you use a backend which is optimized for queries instead of content delivery :-)
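
To make the tradeoff a bit more concrete, here is a rough sketch of the "one database per calendar" layout and of what a time-range report across calendars then costs. This is not CalServer code: the schema, the function names (`open_calendar`, `add_event`, `time_range_report`) and the use of integer timestamps are all assumptions made up for illustration.

```python
import sqlite3

def open_calendar(path=":memory:"):
    """Open one calendar's own SQLite database (hypothetical schema)."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS events ("
        " uid TEXT PRIMARY KEY, start_ts INTEGER, end_ts INTEGER, title TEXT)"
    )
    db.execute("CREATE INDEX IF NOT EXISTS idx_events_start ON events (start_ts)")
    return db

def add_event(db, uid, start_ts, end_ts, title):
    """Store or overwrite a single event; the write is one transaction."""
    with db:
        db.execute(
            "INSERT OR REPLACE INTO events VALUES (?, ?, ?, ?)",
            (uid, start_ts, end_ts, title),
        )

def time_range_report(calendars, range_start, range_end):
    """Collect events overlapping [range_start, range_end) from every calendar.

    Each calendar is a separate database, so the cost grows with the number
    of calendars touched. That is the price paid for being able to add
    calendars (databases) without limit.
    """
    hits = []
    for name, db in calendars.items():
        rows = db.execute(
            "SELECT uid, start_ts, title FROM events"
            " WHERE start_ts < ? AND end_ts > ?",
            (range_end, range_start),
        )
        hits.extend((name, uid, start, title) for uid, start, title in rows)
    return sorted(hits, key=lambda hit: hit[2])  # order by start time
```

A report over one calendar touches one small database and is cheap; a report over N calendars runs N queries. With a single shared table it would be one query, but everything then lives in (and is limited by) that one database.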
I was also thinking that atomic commits would be an important design requirement
You probably refer to the ability to commit multiple events in a single transaction? That's indeed not possible with CalDAV/CalServer. [Obviously a single PUT in CalServer *is* atomic :-)]

Anyway, a customized SOGo server might deal with your requirements best. You could enhance the SOGo index tables to contain the additional information you need to query (your X- attrs) and you could add transactions (as long as you limit scalability and keep related calendars in a single database, which is possible with SOGo). Still, you get good iCal/vCard interop.

Have fun,
  Helge
--
Helge Hess
http://helgehess.eu/
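
For reference, the "multiple events in a single transaction" case can be sketched as below, assuming the related calendars share one database. SQLite is used only as a stand-in so the example is self-contained; the table layout and the names `open_shared_store` and `commit_events_atomically` are hypothetical and are not SOGo's actual schema or API.

```python
import sqlite3

def open_shared_store():
    """One database holding several calendars (hypothetical layout)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE events ("
        " calendar TEXT NOT NULL, uid TEXT NOT NULL, title TEXT,"
        " PRIMARY KEY (calendar, uid))"
    )
    return db

def commit_events_atomically(db, events):
    """Insert all (calendar, uid, title) rows, or none of them."""
    try:
        # The connection context manager commits on success and rolls
        # back the whole batch if any statement raises.
        with db:
            db.executemany(
                "INSERT INTO events (calendar, uid, title) VALUES (?, ?, ?)",
                events,
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

This only works because both calendars live in the same database. Once every calendar has its own database, a batch spanning two calendars can no longer be committed in a single transaction, which is the scalability tradeoff mentioned above.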