[CalendarServer-dev] Reproducible pg lock timeout during future events cleanup

Andre LaBranche dre at apple.com
Mon Jul 25 11:29:51 PDT 2016


Hi,

Some recent work aims to avoid a deadlock we have encountered in CalendarServer's work queue (twext.enterprise.jobs) by using the skip-locked <http://michael.otacoo.com/postgresql-2/postgres-9-5-feature-highlight-skip-locked-row-level/> feature in Postgres >= 9.5.

http://trac.calendarserver.org/changeset/15735/CalendarServer/trunk - if you take this, you might as well go all the way to the latest for a small handful of other fixes and enhancements.

For CS in 'dev' mode, this may require nuking the build root to pull down the newer version automatically. Also, when using Postgres older than 9.5, take special note of this portion of conf/caldavd-stdconfig.plist, which lists the defaults for all parameters. With older Postgres, you need to define DBFeatures as an empty array.

        <!-- Features supported by the database

             'skip-locked': SKIP LOCKED available with SELECT (remove if using postgres
             &lt; v9.5) -->
        <key>DBFeatures</key>
        <array>
                <string>skip-locked</string>
        </array>
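
For Postgres older than 9.5, the empty-array override might look like the following. This is a sketch only; it assumes the override goes in your own site config plist rather than in caldavd-stdconfig.plist itself.

        <!-- Postgres older than 9.5: no SKIP LOCKED, so list no DB features -->
        <key>DBFeatures</key>
        <array>
        </array>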


If you don't take the above fix in the short term, as a bandaid you can enable the transaction timeout and set it to some reasonably high-ish value like 10 minutes (600 seconds). This goes in DatabaseConnection:txnTimeoutSeconds, for example:

    <key>DatabaseConnection</key>
    <dict>
        <key>endpoint</key>
        <string>tcp:localhost</string>
        <key>database</key>
        <string>caldav</string>
        <key>user</key>
        <string>caldav</string>
        <key>password</key>
        <string></string>
        <!-- needed when DBType = postgres -->
        <key>txnTimeoutSeconds</key>
        <integer>600</integer>
    </dict>

Attentive readers may note that DatabaseConnection:txnTimeoutSeconds is not mentioned in either twistedcaldav/stdconfig.py or conf/caldavd-stdconfig.plist. This sort of thing loosely falls under the category of "params we pass to someone else's code", and we make no effort to expose in our config every feature of the other software we use. However, this one probably deserves a mention, so I'll add that.

We might want additional detail about the error you are hitting, as I can't say for sure whether your problem is the one addressed by r15735. The authority on this part of CS is currently on holiday, so I will get back to you if we need more info. If you just want to cast a wide net and collect something before trying to get out of the problem state, you could simply turn up DefaultLogLevel to debug, reproduce the problem, and save off the log. This may only be practical if the error reproduces fairly easily and quickly; otherwise the log grows rather large!
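
If you go the debug-logging route, the setting would look something like this. It is a sketch that assumes DefaultLogLevel takes a plain string value in your caldavd plist; revert it once you have captured the failure.

    <key>DefaultLogLevel</key>
    <string>debug</string>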

-dre

> On Jul 23, 2016, at 4:06 AM, Axel Rau <Axel.Rau at Chaos1.DE> wrote:
> 
> Hi,
> 
> This error is reproducible on my development server:
> - - -
> 2016-07-23T10:07:08+0000 [caldav-0]  [calendarserver.tools.purge#warn] Cleaning up future events for principal 54371764-438E-4D2A-8E25-A1A15E8CB14B since they are no longer in directory
> 2016-07-23T10:07:40+0000 [caldav-0]  [twext.enterprise.jobs.jobitem#error] JobItem: 4699, WorkItem: 74300 failed: [Failure instance: Traceback: <class 'pg8000.core.ProgrammingError'>: (u'ERROR', u'57014', u'canceling statement due to statement timeout', u'while locking tuple (456,4) in relation "calendar_object"\nSQL statement "SELECT 1 FROM ONLY "caldavd"."calendar_object" x WHERE "resource_id" OPERATOR(pg_catalog.=) $1 FOR KEY SHARE OF x"', u'postgres.c', u'2967', u'ProcessInterrupts', u'', u'')
> - - -
> Shall I try upgrading to 8.1 or could it help to collect more info on 8.0?
> 
> Thanks, Axel
> ---
> PGP-Key:29E99DD6  ☀  computing @ chaos claudius
> 
> _______________________________________________
> calendarserver-dev mailing list
> calendarserver-dev at lists.macosforge.org
> https://lists.macosforge.org/mailman/listinfo/calendarserver-dev
