Is there anything that can prevent launchd from working
In a case where launchd works just fine all the time, for example:

    <dict>
        <key>Label</key>
        <string>com.host.ntpdate</string>
        <key>ProgramArguments</key>
        <array>
            <string>/bin/bash</string>
            <string>/Users/me/bin/set_date</string>
        </array>
        <key>RunAtLoad</key>
        <true/>
        <key>StartInterval</key>
        <integer>3600</integer>
    </dict>

script... (slightly shortened)

    #!/bin/bash
    # Create a place to store this log file
    log_path=/Users/me/Library/Logs/ntpdate.log

    echo `date` >> $log_path

    # Sleep for a duration so the system can get the interfaces up
    # Running in chunks just so the log can get updated
    # and I can see that it is working
    echo "Sleeping...90 Seconds `date`" >> $log_path
    /bin/sleep 90;

    # Do the work
    echo "About to run ntpdate -u" >> $log_path
    /usr/sbin/ntpdate -u
    echo "Finished running ntpdate -u" >> $log_path

log... (full output, not shortened)

    ----- BEGIN -----
    Started to set time and date
    Mon Jun 1 11:18:35 PDT 2009
    Sleeping...30 Seconds Mon Jun 1 11:18:36 PDT 2009
    Sleeping...30 Seconds Mon Jun 1 11:19:06 PDT 2009
    Sleeping...15 Seconds Mon Jun 1 11:19:36 PDT 2009
    About to run ntpdate -u
    Finished running ntpdate -u
    Finished setting time and date
    ----- END -----

Syslog tells me it works:

    Jun 1 10:57:41 host com.host.ntpdate[19696]: 1 Jun 10:57:41 ntpdate[19776]: adjust time server 17.151.16.21 offset -0.070842 sec

It works all the time, sans one case, which is when I need it most. I hope there are people here who know a bit about the kernel of OS X. The machine panics, and I am still working on determining why; I suspect RAM, though I am on the third batch. I have 100% ruled out a bad PRAM battery.
After a panic, the "automatically restart after power failure" feature does not work. It works at any other time; only after a panic does it not. I have to physically reboot. This also adds more credence to the battery being fine.
When the machine comes up, there is a message on screen about the date and time being at 1969. Do a Google search for "panic 1969" and you will see a good number of others with panic logs dated 1969. I conclude that panics mess with the date and time for some reason, or nuke NVRAM; I am not sure which.
No big deal really, though this is an email server, and if I forget, or do not have access, the date and time will continue to be wrong, and all emails will be wrongly dated.
My plist is set to run at load, and also to run every 60 minutes, which I see it does. I can find a syslog line that shows it ran at load, and ones that show it runs every hour as well.
launchd totally fails to run this item after a kernel panic, when the date is set into the 1969 range. I have the syslog where the dates are all Dec 31, which is just after the panic and after I called in a reboot, which then foiled my script from setting the date correctly.
I see named started, which uses launchd, so I know it is working. I see a lot of other things start. But then I see this:

    Dec 31 16:00:54 host-domain-com com.apple.launchd[1] (com.domain.ntpdate[50]): Exited with exit code: 1

Right after that, I have a launchd item that checks S.M.A.R.T., which runs. Since this is after bind came up, I am pretty sure the interfaces are up enough that I could talk to a time server.
Is there something about ntpdate that will error out on insanely large time offsets? All I end up doing to solve it is `sudo /usr/sbin/ntpdate -u`, the same thing launchd should do.
I will solve the panics; in the meantime, I would love to know how to protect myself from living in the past for too long. Thank you all.
--
Scott
* If you contact me off list replace talklists@ with scott@ *
On Jun 1, 2009, at 12:00 PM, Scott Haneda wrote:
In a case where launchd works just fine all the time, for example:

    <dict>
        <key>Label</key>
        <string>com.host.ntpdate</string>
        <key>ProgramArguments</key>
        <array>
            <string>/bin/bash</string>
            <string>/Users/me/bin/set_date</string>
        </array>
        <key>RunAtLoad</key>
        <true/>
        <key>StartInterval</key>
        <integer>3600</integer>
    </dict>

script... (slightly shortened)

    #!/bin/bash
    # Create a place to store this log file
    log_path=/Users/me/Library/Logs/ntpdate.log

    echo `date` >> $log_path

    # Sleep for a duration so the system can get the interfaces up
    # Running in chunks just so the log can get updated
    # and I can see that it is working
    echo "Sleeping...90 Seconds `date`" >> $log_path
    /bin/sleep 90;

    # Do the work
    echo "About to run ntpdate -u" >> $log_path
    /usr/sbin/ntpdate -u
    echo "Finished running ntpdate -u" >> $log_path

log... (full output, not shortened)

    ----- BEGIN -----
    Started to set time and date
    Mon Jun 1 11:18:35 PDT 2009
    Sleeping...30 Seconds Mon Jun 1 11:18:36 PDT 2009
    Sleeping...30 Seconds Mon Jun 1 11:19:06 PDT 2009
    Sleeping...15 Seconds Mon Jun 1 11:19:36 PDT 2009
    About to run ntpdate -u
    Finished running ntpdate -u
    Finished setting time and date
    ----- END -----

You should use the StandardOutPath key (pointing to /Users/me/Library/Logs/ntpdate.log) in your launchd job; that way your script doesn't have to worry about echo(1)ing to a log file. Its stdout will just be redirected to the log file, which makes your script less complex.
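A minimal sketch of that suggestion, assuming the job definition from the original post. StandardOutPath and StandardErrorPath are launchd.plist keys; the label and paths are taken from the thread, while the temporary directory and the choice to send stderr to the same file are assumptions made only for illustration.

```shell
#!/bin/bash
# Sketch: write a job definition that lets launchd redirect the job's output.
# The label and paths come from the original post; the temp directory is an
# assumption so this example never touches a real LaunchDaemons folder.
plist_dir=$(mktemp -d)
plist_file="$plist_dir/com.host.ntpdate.plist"

cat > "$plist_file" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>Label</key>
    <string>com.host.ntpdate</string>
    <key>ProgramArguments</key>
    <array>
        <string>/bin/bash</string>
        <string>/Users/me/bin/set_date</string>
    </array>
    <key>RunAtLoad</key>
    <true/>
    <key>StartInterval</key>
    <integer>3600</integer>
    <!-- launchd opens these files and attaches them to the job's stdout/stderr -->
    <key>StandardOutPath</key>
    <string>/Users/me/Library/Logs/ntpdate.log</string>
    <key>StandardErrorPath</key>
    <string>/Users/me/Library/Logs/ntpdate.log</string>
</dict>
</plist>
EOF

echo "wrote $plist_file"
```

With keys like these in place, the script could drop every `>> $log_path` and simply echo.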
Syslog tells me it works: Jun 1 10:57:41 host com.host.ntpdate[19696]: 1 Jun 10:57:41 ntpdate[19776]: adjust time server 17.151.16.21 offset -0.070842 sec
It works all the time, sans one case, which is when I need it most. I hope there are people here who know a bit about the kernel of OS X. The machine panics, and I am still working on determining why; I suspect RAM, though I am on the third batch. I have 100% ruled out a bad PRAM battery.
After a panic, the "automatically restart after power failure" feature does not work. It works at any other time; only after a panic does it not. I have to physically reboot. This also adds more credence to the battery being fine.
When the machine comes up, there is a message on screen about the date and time being at 1969. Do a Google search for "panic 1969" and you will see a good number of others with panic logs dated 1969. I conclude that panics mess with the date and time for some reason, or nuke NVRAM; I am not sure which.
No big deal really, though this is an email server, and if I forget, or do not have access, the date and time will continue to be wrong, and all emails will be wrongly dated.
My plist is set to run at load, and also to run every 60 minutes, which I see it does do. I can find a syslog line that shows it ran at load, and ones that show it runs every hour as well.
launchd totally fails to run this item after a kernel panic and the date is set into the 1969 range. I have the sys log for where the dates are all Dec 31, which is just after the panic, and after I called in a reboot, which then foiled my script from setting the date correctly.
I see named started, which uses launchd, so I know it is working. I see a lot of other things start. But then I see this:
Dec 31 16:00:54 host-domain-com com.apple.launchd[1] (com.domain.ntpdate[50]): Exited with exit code: 1
You're drawing bad conclusions. launchd did, in fact, start the job. It has a PID and exit status and everything. The job just exited unsuccessfully. Not really launchd's fault. Don't interpret "The job failed" as "launchd did not start the job". All we do is start the job and get out of the way.
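As an analogy only (launchd is not a shell), the same distinction between "could not be started" and "started, then failed" shows up in ordinary shell exit statuses. The sketch below assumes the usual shell convention that 126 and 127 mean the command could not be run; `classify_exit` is a hypothetical helper name.

```shell
#!/bin/bash
# Sketch: tell "the program ran and failed" apart from "the shell could not
# run it". 126 and 127 are the conventional shell statuses for "found but not
# executable" and "command not found"; any other nonzero status means the
# program itself started and then failed or was killed.
classify_exit() {
    case "$1" in
        0)   echo "ran and succeeded" ;;
        126) echo "found but could not be executed" ;;
        127) echo "command not found" ;;
        *)   echo "ran and failed with status $1" ;;
    esac
}

# Hypothetical stand-in for the failing job: a child shell that exits 1.
status=0
/bin/sh -c 'exit 1' || status=$?
classify_exit "$status"
```

An exit code of 1 in the launchd log line falls in the last branch: the job was started and reported failure itself.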
Right after that, I have a launchd item that checks S.M.A.R.T. which runs. Since this is after bind came up, I am pretty sure the interfaces are up enough that I could talk to a time server.
Is there something about ntpdate that will error out on insanely large time offsets? All I end up doing to solve it is `sudo /usr/sbin/ntpdate -u`, the same thing launchd should do.
I will solve the panics, in the meantime, I would love to know how to protect myself from living in the past for too long. Thank you all.
Are you sure you even need to do this? Mac OS X should adjust clock drift automatically. -- Damien Sorresso BSD Engineering Apple Inc.
On Jun 1, 2009, at 12:32 PM, Damien Sorresso wrote:
On Jun 1, 2009, at 12:00 PM, Scott Haneda wrote:
You should use the StandardOutPath key (pointing to /Users/me/Library/Logs/ntpdate.log) in your launchd job; that way your script doesn't have to worry about echo(1)ing to a log file. Its stdout will just be redirected to the log file, which makes your script less complex.
Nice, thanks. What gets passed into StandardOutPath? I assume anything the script outputs via a plain echo, will be intercepted to the log? This is really nice, as I can then debug scripts as usual, and not have to run a tail -f on some log somewhere.
You're drawing bad conclusions. launchd did, in fact, start the job. It has a PID and exit status and everything. The job just exited unsuccessfully. Not really launchd's fault. Don't interpret "The job failed" as "launchd did not start the job". All we do is start the job and get out of the way.
Ok, understood. The trouble is that ntpdate is failing then, and I have no way to get the date set to 1969 to reproduce it. Any attempt to do so via the GUI drops it back to the current date. I can shove it to some other date, but then ntpdate does not error when run.
How can I mimic from an interactive shell exactly what launchd is doing, so I know I am on the same page? Any idea how to get the system clock into 1969? I really do not want to pull a PRAM battery in this laptop :)
Thanks so much, I will follow up and see if I can find an ntpdate list.
--
Scott
* If you contact me off list replace talklists@ with scott@ *
Ok, understood. The trouble is that ntpdate is failing then, and I have no way to get the date set to 1969 to reproduce it. Any attempt to do so via the GUI drops it back to the current date. I can shove it to some other date, but then ntpdate does not error when run.
It's probably overkill, but as a last resort you might try inducing a panic. To do so, I'd create a simple kext and have it do something "bad." Then, when you're ready to panic, simply do a kextload.
How can I mimic from an interactive shell exactly what launchd is doing, so I know I am on the same page?
I've found the source to be a pretty good place to get answers:

    svn co https://svn.macosforge.org/repository/launchd/branches/Leopard/launchd launchd

David
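Short of reading the source, one rough way to approximate a launchd job from an interactive shell is to strip the environment before running the script. This is only a sketch: the PATH value is an assumed minimal default and is not confirmed anywhere in this thread, and `run_like_daemon` is a hypothetical helper name.

```shell
#!/bin/bash
# Sketch: run a command with an almost empty environment, roughly the way a
# background job sees the world (no shell profile, no inherited variables).
# ASSUMPTION: the PATH below is a typical minimal default, not a value taken
# from this thread or verified against launchd.
run_like_daemon() {
    env -i PATH=/usr/bin:/bin:/usr/sbin:/sbin "$@"
}

# Demonstration: a variable exported in the interactive shell is not visible
# to the child process.
export MY_INTERACTIVE_SETTING=1
run_like_daemon /bin/sh -c 'echo "MY_INTERACTIVE_SETTING=${MY_INTERACTIVE_SETTING:-unset}"'
```

The demonstration prints `MY_INTERACTIVE_SETTING=unset`. Applied to the job in question, the equivalent would be running `/bin/bash /Users/me/bin/set_date` through the helper and checking `$?` afterwards; whether the job also needs to run as root depends on where the plist is installed, which the thread does not say.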
At 11:04 -1000 1/6/09, Dave Keck wrote:
It's probably overkill, but as a last resort you might try inducing a panic. To do so, I'd create a simple kext and have it do something "bad." Then, when you're ready to panic, simply do a kextload.
Just FYI, on 10.5 and later you can trigger a panic via DTrace. <http://developer.apple.com/technotes/tn2004/tn2118.html#DTRACEPANICTRIGGER> S+E -- Quinn "The Eskimo!" <http://www.apple.com/developer/> Apple Developer Relations, Developer Technical Support, Core OS/Hardware
On Jun 1, 2009, at 12:47 PM, Scott Haneda wrote:
On Jun 1, 2009, at 12:32 PM, Damien Sorresso wrote: On Jun 1, 2009, at 12:00 PM, Scott Haneda wrote:
You should use the StandardOutPath key (pointing to /Users/me/Library/Logs/ntpdate.log) in your launchd job; that way your script doesn't have to worry about echo(1)ing to a log file. Its stdout will just be redirected to the log file, which makes your script less complex.
Nice, thanks. What gets passed into StandardOutPath? I assume anything the script outputs via a plain echo, will be intercepted to the log? This is really nice, as I can then debug scripts as usual, and not have to run a tail -f on some log somewhere.
Anything written to stdout. So echo qualifies as well as any command output you've piped into your own stdout.
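A small sketch of what "anything written to stdout" means in practice: error messages normally go to stderr and would not reach a file that only receives stdout. The function name `emit` and the message texts are made up for the demonstration.

```shell
#!/bin/bash
# Sketch: only stdout lands in a file opened for stdout; stderr needs its own
# redirection (for a launchd job, StandardErrorPath plays that role).
out_file=$(mktemp)
err_file=$(mktemp)

emit() {
    echo "About to run ntpdate -u"        # goes to stdout
    echo "simulated error message" >&2    # goes to stderr
}

# Send the two streams to two different files.
emit > "$out_file" 2> "$err_file"

echo "stdout file: $(cat "$out_file")"
echo "stderr file: $(cat "$err_file")"
```

If a tool reports its failures on stderr, a job that sets only StandardOutPath would log the progress messages but not the reason for the failure.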
You're drawing bad conclusions. launchd did, in fact, start the job. It has a PID and exit status and everything. The job just exited unsuccessfully. Not really launchd's fault. Don't interpret "The job failed" as "launchd did not start the job". All we do is start the job and get out of the way.
Ok, understood. The trouble is that ntpdate is failing then, and I have no way to get the date set to 1969 to reproduce it. Any attempt to do so via the GUI drops it back to the current date. I can shove it to some other date, but then ntpdate does not error when run.
How can I mimic from an interactive shell exactly what launchd is doing, so I know I am on the same page? Any idea how to get the system clock into 1969? I really do not want to pull a pram battery in this laptop :)
I'm not sure I understand what you're asking. If you want to see why ntpdate is returning an exit code of 1, you should read their documentation or their source. Also, logging the output of the invocation of ntpdate would probably be useful for you. -- Damien Sorresso BSD Engineering Apple Inc.
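One way to log the output of the invocation, sketched under assumptions: `run_and_log` is a hypothetical helper, and the failing command in the demonstration is a stand-in, not real ntpdate output.

```shell
#!/bin/bash
# Sketch: run a command, keep everything it prints (stdout and stderr), and
# record its exit status next to the output.
# run_and_log is a hypothetical helper name, not part of any tool.
run_and_log() {
    local log_file=$1
    shift
    local output status
    output=$("$@" 2>&1)   # capture both streams
    status=$?             # exit status of the command itself
    {
        echo "$(date) command: $*"
        echo "$(date) output: $output"
        echo "$(date) exit status: $status"
    } >> "$log_file"
    return "$status"
}

# Demonstration with a stand-in command that complains on stderr and exits 1.
demo_log=$(mktemp)
run_and_log "$demo_log" /bin/sh -c 'echo "simulated failure" >&2; exit 1' \
    || echo "helper returned $?"
cat "$demo_log"
```

In the script from the original post, the ntpdate line could be wrapped the same way, for example `run_and_log "$log_path" /usr/sbin/ntpdate -u` with whatever arguments the full script passes, so that the message behind exit code 1 ends up in the log after the next bad boot.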
On Jun 1, 2009, at 2:19 PM, Damien Sorresso wrote:
Ok, understood. The trouble is that ntpdate is failing then, and I have no way to get the date set to 1969 to reproduce it. Any attempt to do so via the GUI drops it back to the current date. I can shove it to some other date, but then ntpdate does not error when run.
How can I mimic from an interactive shell exactly what launchd is doing, so I know I am on the same page? Any idea how to get the system clock into 1969? I really do not want to pull a pram battery in this laptop :)
I'm not sure I understand what you're asking. If you want to see why ntpdate is returning an exit code of 1, you should read their documentation or their source. Also, logging the output of the invocation of ntpdate would probably be useful for you.
It's getting a little OT, so I really appreciate this. Either a crash or a panic causes the machine to freeze, and I have to manually reboot. When I do, there is a dialog on screen telling me there is a date and time issue. I wish I could find a screenshot, but I do not remember the exact terminology. When that type of message comes up, ntpdate returns exit 1. I need a test case in order to reproduce it on another machine. The previous suggestions for inducing a panic do not mess with the date and time on my MacBook, and I cannot really fiddle with a live server.
As to reading the docs and source: I looked at the man page, but where is the source for the BSD version? I cannot find it, not that I would be very good at reading it anyway.
Actually, I believe a PMU reset triggers that message, so maybe I can try that as well. Thanks all.
--
Scott
* If you contact me off list replace talklists@ with scott@ *
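Until the cause of the exit 1 is known, one possible hedge is to retry the failing command a few times before giving up. This is a sketch, not a diagnosis: whether retrying helps depends on why ntpdate fails after the bad boot, which this thread does not establish. `retry` and `flaky` are hypothetical names, and the attempt count and delay are arbitrary.

```shell
#!/bin/bash
# Sketch: retry a command a few times with a pause between attempts.
# retry is a hypothetical helper; attempt count and delay are arbitrary.
retry() {
    local attempts=$1 delay=$2
    shift 2
    local n=1
    while true; do
        if "$@"; then
            return 0
        fi
        if [ "$n" -ge "$attempts" ]; then
            echo "giving up after $n attempts: $*" >&2
            return 1
        fi
        echo "attempt $n failed, retrying in ${delay}s: $*" >&2
        sleep "$delay"
        n=$((n + 1))
    done
}

# Demonstration with a stand-in that fails twice and then succeeds.
counter_file=$(mktemp)
echo 0 > "$counter_file"
flaky() {
    local count
    count=$(( $(cat "$counter_file") + 1 ))
    echo "$count" > "$counter_file"
    [ "$count" -ge 3 ]
}

retry 5 0 flaky && echo "succeeded on attempt $(cat "$counter_file")"
```

The demonstration prints `succeeded on attempt 3`. A stand-in like `flaky` also gives a way to exercise the failure path of a wrapper script on another machine without touching that machine's clock.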
On Jun 1, 2009, at 12:32 PM, Damien Sorresso wrote:
I will solve the panics, in the meantime, I would love to know how to protect myself from living in the past for too long. Thank you all.
Are you sure you even need to do this? Mac OS X should adjust clock drift automatically.
I find this to not be the case at all. Perhaps it holds on a desktop machine or laptop that can reasonably be expected to be rebooted. A server, which stays up as long as possible and more than likely has no one logged in at all, will drift quite a bit depending on which apps you are running. There are some things you can do that will just halt the clock; I am not sure whether Apple or the software maker is to blame. Either way, I get clock drift of about 5 minutes per 10 days, so I need this for sure. Thank you.
--
Scott
* If you contact me off list replace talklists@ with scott@ *
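To put the reported drift in perspective, the arithmetic can be worked through in a few lines. The only inputs are the figures from the thread: about 5 minutes per 10 days, and the hourly StartInterval of 3600 seconds.

```shell
#!/bin/bash
# Sketch: convert "5 minutes per 10 days" into drift per day and per
# hourly run (StartInterval of 3600 seconds in the plist).
drift_seconds=$((5 * 60))        # 300 s of drift ...
period_hours=$((10 * 24))        # ... over 240 hours

# Integer arithmetic in milliseconds avoids needing bc or awk.
drift_ms_per_hour=$((drift_seconds * 1000 / period_hours))
drift_s_per_day=$((drift_seconds * 24 / period_hours))

echo "drift per day:  ${drift_s_per_day} s"
echo "drift per hour: ${drift_ms_per_hour} ms"
```

That works out to 30 seconds per day, or about 1.25 seconds for each hourly run to correct.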
participants (4)
- Damien Sorresso
- Dave Keck
- Quinn
- Scott Haneda