Greetings,

I'm having a number of issues with launchd, particularly under Leopard. (Most, if not all, of this works just fine under Tiger.) My current, and most pressing, issue is trying to replace a daemon when the user upgrades their software.

Here's what I've got running:

- A daemon is installed in /Library/LaunchDaemons/QRecallScheduler501.plist. I install the file by writing the plist, then executing 'launchctl load /path-to-plist' as root. The .plist file has the UserName property set to an individual user.

This has been working (mostly) until I need to upgrade the user's software. I've tried various techniques, but the one I'm using now, which I think should work, doesn't.

Setup (if this makes any difference): The path to the daemon is a symbolic link to the actual binary in my application bundle. When the application is upgraded (via Sparkle), the application bundle is replaced with the new application bundle containing an updated daemon.

Step #1: Run 'launchctl unload -w /path-to-plist' as root. I wait for launchctl to return and then wait for the daemon to terminate.
Step #2: Delete the .plist at /Library/LaunchDaemons/QRecallScheduler501.plist.
Step #3: Write a new .plist at /Library/LaunchDaemons/QRecallScheduler501.plist.
Step #4: Run 'launchctl load /path-to-plist' as root. Wait for launchctl to finish.

Here's what happens:

In step #1, launchctl returns with a status of 0. The daemon receives a SIGTERM and shuts down (2-3 seconds).

IMMEDIATELY, launchd starts the new daemon. I can only assume this is because the location of the updated binary is the same as the old one that was just terminated.

Steps #2 and #3 delete and rewrite the .plist.

Step #4 runs 'launchctl load', which returns with a status of 0. What I get in the console log is

11/28/07 4:09:44 PM [0x0-0x1e01e].com.qrecall.client[145] launchctl: Error unloading: com.qrecall.scheduler
11/28/07 4:09:56 PM com.apple.launchd[62] (com.qrecall.scheduler.501) Ignored this key: UserName

Contrary to the second error, the daemon is run with a UID of 501 (which is correct -- the daemon will terminate if it's started as root).

The new daemon is then repeatedly restarted. (The daemon checks for a duplicate running instance of itself and terminates immediately if it's already running.)

11/28/07 4:09:57 PM com.apple.launchd[62] (com.qrecall.scheduler.501) Throttling respawn: Will start in 10 seconds
11/28/07 4:10:07 PM com.apple.launchd[62] (com.qrecall.scheduler.501) Throttling respawn: Will start in 10 seconds
11/28/07 4:10:17 PM com.apple.launchd[62] (com.qrecall.scheduler.501) Throttling respawn: Will start in 10 seconds
11/28/07 4:10:27 PM com.apple.launchd[62] (com.qrecall.scheduler.501) Throttling respawn: Will start in 10 seconds

The really bad part is that the instance of the daemon that is running does not seem to be running in the correct environment/namespace. When it launches a sub-process, it attempts to connect with it using Mach ports; the connection fails. Restarting the OS doesn't fix the problem.

I think the problem is that (in Leopard) there seem to be two instances of launchd running: one as root and one as user 501. But I'm getting very confused as to which one I should be trying to deal with.

I changed the code to call launchctl while running as user 501. That sometimes works and sometimes doesn't. If I can get the daemon to stop and issue 'launchctl load /...' as user 501, the daemon starts working: specifically, it can communicate with the sub-processes that it starts. Once started using 'launchctl' as user 501, it can be stopped again using launchctl as 501. But after a restart, it's running in the wrong namespace again and I have to perform a 'sudo launchctl unload' to get it to stop.

So after an entire day, I'm really confused as to what launchctl commands I should be issuing to replace a running daemon, in what order, and using what UID.

James
--
James Bucanek
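For readers following along, the four steps in James's message can be restated as a small script. This is only a sketch of the sequence as described, not a fix for the respawn problem: the `NEW_PLIST` location, the `DAEMON_PID` variable, and the `RUN` dry-run hook are assumptions introduced for illustration. By default every command is echoed rather than executed.

```shell
#!/bin/sh
# Sketch of the four-step daemon replacement described in the post.
# Assumptions (not from the original): NEW_PLIST, DAEMON_PID, and the RUN hook.
# RUN defaults to "echo" (dry run); set RUN="" and run as root to execute for real.
PLIST="${PLIST:-/Library/LaunchDaemons/QRecallScheduler501.plist}"
NEW_PLIST="${NEW_PLIST:-/tmp/QRecallScheduler501.plist.new}"
RUN="${RUN-echo}"

replace_daemon() {
    # Step 1: ask launchd to unload the job, then wait for the old daemon to exit.
    $RUN launchctl unload -w "$PLIST" || return 1
    if [ -n "${DAEMON_PID:-}" ]; then
        while kill -0 "$DAEMON_PID" 2>/dev/null; do sleep 1; done
    fi
    # Steps 2 and 3: delete the old plist and write the new one in its place.
    $RUN rm -f "$PLIST" || return 1
    $RUN cp "$NEW_PLIST" "$PLIST" || return 1
    # Step 4: load the new job definition.
    $RUN launchctl load "$PLIST"
}

replace_daemon
```

In dry-run mode the script prints the four commands in order, which makes it easy to confirm the sequence and the UID it will run under before touching a live system.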
I recommend reading the "Daemonomicon" <http://developer.apple.com/technotes/tn2005/tn2083.html#SECDAEMONOMICON>.

Short summary pertinent to your symptoms:

- Root (uid 0) launchd loads jobs from LaunchDaemons directories. Use `sudo launchctl` to communicate with it.
- Per-user (uid 501) launchd loads jobs from LaunchAgents directories. Use `launchctl` to communicate with it.
- Per-user launchd ignores the UserName key because it does not have the privilege to execute as any user other than the current user.
- Per-user Mach bootstrap is a sub-bootstrap of the root Mach bootstrap: agents can look up daemons, but daemons cannot look up agents.

- Kevin
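Kevin's first two bullets amount to a rule for choosing how to invoke launchctl based on where the plist lives. A minimal sketch of that rule follows; `launchctl_for_plist` is a hypothetical helper name, not part of launchd, and it only inspects the path.

```shell
#!/bin/sh
# Print the launchctl invocation suited to a plist path:
# LaunchDaemons -> root launchd (sudo launchctl),
# LaunchAgents  -> per-user launchd (launchctl).
# launchctl_for_plist is a hypothetical helper, not a launchd command.
launchctl_for_plist() {
    case "$1" in
        */LaunchDaemons/*) echo "sudo launchctl" ;;
        */LaunchAgents/*)  echo "launchctl" ;;
        *) echo "unknown job directory: $1" >&2; return 1 ;;
    esac
}

launchctl_for_plist /Library/LaunchDaemons/QRecallScheduler501.plist
```

Keying the decision off the directory, rather than off whoever happens to be running the installer, avoids loading a LaunchDaemons plist into the per-user launchd by accident.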
_______________________________________________ launchd-dev mailing list launchd-dev@lists.macosforge.org http://lists.macosforge.org/mailman/listinfo/launchd-dev
Kevin Van Vechten <mailto:kvv@apple.com> wrote (Wednesday, November 28, 2007 12:08 AM -0800):

> I recommend reading the "Daemonomicon" <http://developer.apple.com/technotes/tn2005/tn2083.html#SECDAEMONOMICON>.

Been there, read that. (This is the one technote I keep on my desktop.)

> Short summary pertinent to your symptoms:
>
> - Root (uid 0) launchd loads jobs from LaunchDaemons directories. Use `sudo launchctl` to communicate with it.

Exactly. Since this is a daemon installed in /Library/LaunchDaemons, I was forking launchctl when running as root. But that's when all the trouble started. :(

> - Per-user (uid 501) launchd loads jobs from LaunchAgents directories. Use `launchctl` to communicate with it.

This isn't an agent.

> - Per-user launchd ignores the UserName key because it does not have the privilege to execute as any user other than the current user.
> - Per-user Mach bootstrap is a sub-bootstrap of the root Mach bootstrap: agents can look up daemons, but daemons cannot look up agents.

I'll try to clean out my system (each instance of launchd seems to have its own persistent history of what's been loaded and unloaded) and run my tests again, verifying that each execution of launchctl occurs when running as root.

From your notes, and from reading technote 2083, it appears that I was doing (or I *thought* I was doing) exactly what I should have been doing: Copy the .plist file to /Library/LaunchDaemons, then invoke 'launchctl load ...' as root to install the daemon and 'launchctl unload ...' to uninstall the daemon.

James
--
James Bucanek
James,

You've clearly seen where the OS is headed. For better and for worse, you are also discovering the edge cases where our vision isn't completely implemented.

Let's have a conference call. I'll contact you offline.

For the rest of the people on this email list, please note: I cannot offer this level of service to any developer/customer, but at the moment, I think I have the time.

davez
Hello again,

I've documented a little more of my problem. I think I might have a solution, but I'm still trying to understand what's going on and why. (Skip to the summary of the problem at the end if you want the short version.)

- A launchd.plist configuration file is stored in /Library/LaunchDaemons/. The configuration includes the UserName=james property. I made sure everything was unloaded before starting.

- Wrote a little script so I can see what's going on:

    echo 'per-user'; launchctl list | fgrep -i qre
    echo 'per-user namespace'; launchctl bslist | fgrep -i qre
    echo 'root'; sudo launchctl list | fgrep -i qre
    echo 'root namespace'; sudo launchctl bslist | fgrep -i qre

- Load the daemon using the per-user launchd:

    crocodile:~ james$ launchctl load /Library/LaunchDaemons/QRecallScheduler501.plist
    crocodile:~ james$ ~/Desktop/seedaemons.sh
    per-user
    789 - [0x0-0x4d04d].com.qrecall.client
    115 - [0x0-0x16016].com.qrecall.monitor
    1251 - 0x1109d0.QRecallScheduler
    1250 - com.qrecall.scheduler.501
    per-user namespace
    A QRecallMonitor
    root
    root namespace

(Ignore the extra child instance of QRecallScheduler; the daemon spawns a copy of itself to generate notifications -- a solution that I'm going to replace with another mechanism.)

I can see that the scheduler is running in the context of the per-user launchd. When it starts up its helper process, that too appears in the context of the user session:

    crocodile:~ james$ ~/Desktop/seedaemons.sh
    per-user
    789 - [0x0-0x4d04d].com.qrecall.client
    115 - [0x0-0x16016].com.qrecall.monitor
    1263 - 0x1112c0.QRecallHelper
    1251 - 0x1109d0.QRecallScheduler
    1250 - com.qrecall.scheduler.501
    per-user namespace
    A QRecallMonitor
    root
    root namespace

After starting the helper tool, the scheduler connects with the helper and does its thing. So far, so good.

Now, I try to install the scheduler process as a system-wide daemon.

    crocodile:~ james$ launchctl unload /Library/LaunchDaemons/QRecallScheduler501.plist
    crocodile:~ james$ ~/Desktop/seedaemons.sh
    per-user
    789 - [0x0-0x4d04d].com.qrecall.client
    115 - [0x0-0x16016].com.qrecall.monitor
    per-user namespace
    A QRecallMonitor
    root
    root namespace

(All traces of the scheduler are gone.)

    crocodile:~ james$ sudo launchctl load /Library/LaunchDaemons/QRecallScheduler501.plist
    crocodile:~ james$ ~/Desktop/seedaemons.sh
    per-user
    789 - [0x0-0x4d04d].com.qrecall.client
    115 - [0x0-0x16016].com.qrecall.monitor
    1170 - 0x10dcc0.QRecallScheduler
    1167 - 0x1109d0.QRecallScheduler
    per-user namespace
    A QRecallMonitor
    root
    1170 - 0x10be90.QRecallScheduler
    1167 - com.qrecall.scheduler.501
    root namespace
    A QRecallScheduler.501

It looks OK. The scheduler appears in the bootstrap launchd and also in the context of the per-user launchd (since I assume that the per-user launchd inherits everything from its parent). The "QRecallScheduler.501" is its registered Mach communications port.

Now when the scheduler starts the helper tool:

    crocodile:~ james$ ~/Desktop/seedaemons.sh
    per-user
    789 - [0x0-0x4d04d].com.qrecall.client
    115 - [0x0-0x16016].com.qrecall.monitor
    1182 - 0x10df80.QRecallHelper
    1170 - 0x10dcc0.QRecallScheduler
    1167 - 0x1109d0.QRecallScheduler
    per-user namespace
    A QRecallMonitor
    root
    1182 - 0x10b0b0.QRecallHelper
    1170 - 0x10be90.QRecallScheduler
    1167 - com.qrecall.scheduler.501
    root namespace
    A QRecallScheduler.501

The helper process appears in both contexts, but its communications port doesn't appear in the bootstrap namespace. The helper successfully registers its distributed objects connection port, but when the scheduler process attempts to connect with that port, it fails.

It appears that the problem is this: The scheduler is started by the bootstrap launchd. It creates and registers a Mach port for communications. That works. The scheduler then launches a child process (the helper, using NSTask), but when it tries to connect with the Mach port created by the helper, that doesn't work. No process can see the port and it doesn't appear in the bootstrap namespace according to launchctl, even though the helper is told that the port was successfully created and registered.

--
James Bucanek
A quick query: do the five sessiontypes (Aqua, LoginWindow, Background, StandardIO and System) correspond to the five locations (~/Library/LaunchAgents, /Library/LaunchAgents, /Library/LaunchDaemons, /System/Library/LaunchAgents, /System/Library/LaunchDaemons)? If not, how should we interpret the meaning of the sessiontypes?

Hamish
On Nov 29, 2007, at 3:50 PM, Hamish Allan wrote:

> A quick query: do the five sessiontypes (Aqua, LoginWindow, Background, StandardIO and System) correspond to the five locations (~/Library/LaunchAgents, /Library/LaunchAgents, /Library/LaunchDaemons, /System/Library/LaunchAgents, /System/Library/LaunchDaemons)?

No, it's not a direct mapping.

> If not, how should we interpret the meaning of the sessiontypes?

<http://developer.apple.com/technotes/tn2005/tn2083.html#TABLAUNCHAGENTSUBTYP...

- LaunchDaemons are always loaded exclusively by the root launchd.
- LaunchAgents are typically loaded by the per-user launchd (unless you log in as root, or have a "LoginWindow" agent as described below).

Here's a summary of sessions:

- LaunchDaemons are only loaded into the "System" session of the root launchd. No per-user launchd has a "System" session.
- Unless otherwise specified by the LimitLoadToSessionType key in the plist, LaunchAgents are loaded into the "Aqua" session.
- The "Aqua" session is the same session all GUI applications are executed in.
- Logins via SSH, telnet, and others are executed in a "StandardIO" session.
- Each user is given a single "Background" session. Jobs in the "Background" session may live longer than the last logout of the user.
- The "LoginWindow" session is active when the system is at the login window. It disappears once a user has logged in.

The list of directories is the result of:

{LaunchDaemons, LaunchAgents} x {~/Library, /Library, /System/Library}

However, ~/Library/LaunchDaemons is omitted because there is not yet a current user when the root launchd starts, and per-user launchd instances never execute daemons.

- Kevin
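The directory list in Kevin's message is a cross product with one combination removed, which can be spelled out as a short loop. This is a sketch only: `launchd_job_dirs` is a hypothetical helper name, and the home directory argument (defaulting to `$HOME`, with `/Users/example` as a made-up fallback) is an assumption for illustration.

```shell
#!/bin/sh
# Enumerate {LaunchDaemons, LaunchAgents} x {~/Library, /Library, /System/Library},
# skipping ~/Library/LaunchDaemons, which is omitted from the real list.
# launchd_job_dirs and the /Users/example fallback are hypothetical.
launchd_job_dirs() {
    home_dir="${1:-${HOME:-/Users/example}}"
    for base in "$home_dir/Library" /Library /System/Library; do
        for kind in LaunchAgents LaunchDaemons; do
            # Per-user launchd never executes daemons, so skip this pairing.
            if [ "$base" = "$home_dir/Library" ] && [ "$kind" = LaunchDaemons ]; then
                continue
            fi
            echo "$base/$kind"
        done
    done
}

launchd_job_dirs
```

The loop yields five directories, which is where the coincidental match with the five session types in Hamish's question comes from.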
On Nov 29, 2007, at 5:14 PM, Kevin Van Vechten wrote:

> However, ~/Library/LaunchDaemons is omitted because there is not yet a current user when the root launchd starts, and per-user launchd instances never execute daemons.

For whatever it may be worth: In hindsight, there should never have been a LaunchAgent versus LaunchDaemon distinction. It should have been just LaunchJobs, and the scoping should have been controlled by LimitLoadToSessionType.

davez
On Nov 29, 2007, at 5:14 PM, Kevin Van Vechten wrote:

> Here's a summary of sessions:
>
> - LaunchDaemons are only loaded into the "System" session of the root launchd. No per-user launchd has a "System" session.
> - Unless otherwise specified by the LimitLoadToSessionType key in the plist, LaunchAgents are loaded into the "Aqua" session.
> - The "Aqua" session is the same session all GUI applications are executed in.
> - Logins via SSH, telnet, and others are executed in a "StandardIO" session.
> - Each user is given a single "Background" session. Jobs in the "Background" session may live longer than the last logout of the user.

What are the odds of this information finding its way into the launchctl man page?
On Nov 29, 2007, at 7:37 PM, Nathan Duran wrote:

> What are the odds of this information finding its way into the launchctl man page?

A lot better when a bug report is filed into our tracking system. :-)

http://bugreport.apple.com/

davez
On Nov 29, 2007, at 11:36 AM, James Bucanek wrote:

> The helper process appears in both contexts, but its communications port doesn't appear in the bootstrap namespace. The helper successfully registers its distributed objects connection port, but when the scheduler process attempts to connect with that port, it fails.
>
> It appears that the problem is this: The scheduler is started by the bootstrap launchd. It creates and registers a Mach port for communications. That works. The scheduler then launches a child process (the helper, using NSTask), but when it tries to connect with the Mach port created by the helper, that doesn't work. No process can see the port and it doesn't appear in the bootstrap namespace according to launchctl, even though the helper is told that the port was successfully created and registered.

Just to clear up some terminology... A bootstrap is a namespace for looking up Mach services by name. I would describe the above as "the helper's port does not appear in the root bootstrap." And "the scheduler is started by the root launchd."

> crocodile:~ james$ ~/Desktop/seedaemons.sh
> per-user
> 789 - [0x0-0x4d04d].com.qrecall.client
> 115 - [0x0-0x16016].com.qrecall.monitor
> 1182 - 0x10df80.QRecallHelper
> 1170 - 0x10dcc0.QRecallScheduler
> 1167 - 0x1109d0.QRecallScheduler
> per-user namespace
> A QRecallMonitor
> root
> 1182 - 0x10b0b0.QRecallHelper
> 1170 - 0x10be90.QRecallScheduler
> 1167 - com.qrecall.scheduler.501
> root namespace
> A QRecallScheduler.501

This output doesn't actually tell us which bootstrap the QRecallHelper or QRecallScheduler are running in. All it tells us is that both of these processes have made themselves known to each of these launchd processes (which could have happened a number of ways).

The fact that the lookup fails in the root bootstrap strongly implies the helper is registered in the per-user bootstrap (if it's registered at all). What is the name of the helper's registration? (I'm assuming it's neither "QRecallScheduler.501" nor "QRecallMonitor", which are the only two names you've listed.)

As an aside, which API are you using to register the Mach port? bootstrap_register? CFMessagePort?

- Kevin
Kevin Van Vechten <mailto:kvv@apple.com> wrote (Thursday, November 29, 2007 5:32 PM -0800):

> Just to clear up some terminology...
>
> A bootstrap is a namespace for looking up Mach services by name. I would describe the above as "the helper's port does not appear in the root bootstrap." And "the scheduler is started by the root launchd."

That's correct.

> This output doesn't actually tell us which bootstrap the QRecallHelper or QRecallScheduler are running in. All it tells us is that both of these processes have made themselves known to each of these launchd processes (which could have happened a number of ways).
>
> The fact that the lookup fails in the root bootstrap strongly implies the helper is registered in the per-user bootstrap (if it's registered at all). What is the name of the helper's registration? (I'm assuming it's neither "QRecallScheduler.501" nor "QRecallMonitor", which are the only two names you've listed.)

A helper is given a name that includes its user and a job number. So it would register a port with a name like "QRecallHelper.501.6f3b21e0".

> As an aside, which API are you using to register the Mach port? bootstrap_register? CFMessagePort?

Both the scheduler and the helper register their distributed object connection using the following code (this is actually in a subroutine that's shared by all binaries, so I'm confident there are no differences between tasks):

    NSConnection* serverConnection = [NSConnection defaultConnection];
    if (![serverConnection registerName:serviceName]) {
        // log error
        serverConnection = nil;
    }
    return (serverConnection);

--
James Bucanek
On Nov 29, 2007, at 5:12 PM, James Bucanek wrote:

> A helper is given a name that includes its user and a job number. So it would register a port with a name like "QRecallHelper.501.6f3b21e0".

So then let me ask why you think that name has been registered in the root bootstrap when it doesn't appear in the following output?

> echo 'per-user namespace'; launchctl bslist | fgrep -i qre
> echo 'root namespace'; sudo launchctl bslist | fgrep -i qre
>
> per-user namespace
> A QRecallMonitor
>
> root namespace
> A QRecallScheduler.501

As you've indicated NSConnection's registerName is succeeding, we can probably assume that the name is being registered in _some_ bootstrap, but some subtlety of your NSTask approach may be landing it in a bootstrap that you don't expect.

- Kevin
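Kevin's question can be checked mechanically by searching saved `launchctl bslist` listings for an exact service name instead of grepping for a fragment. The sketch below assumes the two listings were captured to files beforehand; `in_list` and `which_bootstrap` are hypothetical helper names, and the sample data is the two-line listing quoted in this message.

```shell
#!/bin/sh
# Report which saved bslist listing mentions a service name exactly.
# Arguments to which_bootstrap: NAME, per-user listing file, root listing file.
# in_list and which_bootstrap are hypothetical helpers, not launchd commands.
in_list() {
    # Succeeds only when the last field of some line equals the name exactly.
    awk -v n="$1" '$NF == n { found = 1 } END { exit !found }' "$2"
}

which_bootstrap() {
    # The per-user listing is checked first, then the root listing.
    if in_list "$1" "$2"; then
        echo per-user
    elif in_list "$1" "$3"; then
        echo root
    else
        echo neither
    fi
}

# Sample listings copied from the output quoted in the thread.
tmp=$(mktemp -d)
printf 'A QRecallMonitor\n' > "$tmp/user.txt"
printf 'A QRecallScheduler.501\n' > "$tmp/root.txt"
which_bootstrap QRecallHelper.501.6f3b21e0 "$tmp/user.txt" "$tmp/root.txt"
rm -rf "$tmp"
```

With the sample listings, the helper's name is reported as `neither`, which matches the observation that the registration succeeded somewhere other than the two bootstraps being inspected.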
Kevin Van Vechten <mailto:kvv@apple.com> wrote (Thursday, November 29, 2007 6:28 PM -0800):

> On Nov 29, 2007, at 5:12 PM, James Bucanek wrote:
>
>> A helper is given a name that includes its user and a job number. So it would register a port with a name like "QRecallHelper.501.6f3b21e0".
>
> So then let me ask why you think that name has been registered in the root bootstrap when it doesn't appear in the following output?

I guess I wasn't clear. Since the registration isn't throwing any errors, I'm assuming that the helper is registering its name in *some* bootstrap. I expected that to be the same bootstrap as the parent process that launched it, but it's clearly not. Would it help to know the parent of the scheduler daemon? I could add debug code to log the PPID of each process as it starts up.

> echo 'per-user namespace'; launchctl bslist | fgrep -i qre
> echo 'root namespace'; sudo launchctl bslist | fgrep -i qre
>
> per-user namespace
> A QRecallMonitor
>
> root namespace
> A QRecallScheduler.501
>
> As you've indicated NSConnection's registerName is succeeding, we can probably assume that the name is being registered in _some_ bootstrap, but some subtlety of your NSTask approach may be landing it in a bootstrap that you don't expect.

The helper is either a normal executable or a SUID root executable (a la MoreAuthSample) that gets launched with:

    + (NSTask*)launchHelper:(NSString*)helperPath
                withCommand:(NSString*)commandName
       passingAuthorization:(AuthorizationRef)authorizationRef
          returningPortName:(NSString**)portNamePtr
    {
        NSTask* helperTask = [[NSTask new] autorelease];
        NSString* portName = nil;

        NS_DURING
            // If a helper wasn't specified, use the bundled helper
            if (helperPath==nil)
                helperPath = [[NSBundle mainBundle] pathForResource:kHelperName ofType:nil];

            // Prepare the helper to execute
            [helperTask setLaunchPath:helperPath];

            // Redirect the task's stdin and stdout to these NSPipe objects, so we can talk to the task
            NSPipe* inPipe = [NSPipe pipe];
            NSPipe* outPipe = [NSPipe pipe];
            [helperTask setStandardInput:inPipe];
            [helperTask setStandardOutput:outPipe];
            NSFileHandle* stdInHandle = [inPipe fileHandleForWriting];
            NSFileHandle* stdOutHandle = [outPipe fileHandleForReading];

            // Start the task running
            [helperTask launch];
            ...

This is followed by a bunch of convoluted code that passes all of the startup parameters to the helper via stdin (which the helper does get, because those get logged as the helper starts up).

My assumption was that [NSTask launch] would start the process in the same environment/session/bootstrap as the parent process, but something seems to be interfering with that.

I can see a couple of possible workarounds:

- Figure out why the helper's ports aren't getting registered in a namespace that's accessible by its parent. Maybe I need to use something other than NSTask to start the child process.
- Return (once again!) to using UNIX domain sockets for communications, which don't have these kinds of scoping issues.

--
James Bucanek
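The PPID logging idea raised earlier in this message can be tried without touching the Objective-C code, for example from a small wrapper script placed in front of the daemon. The following is a minimal sketch under that assumption; `describe_self` is a hypothetical name, and the output format is made up for illustration.

```shell
#!/bin/sh
# Emit one line identifying this process, its parent, and its effective user.
# describe_self is a hypothetical helper; the key=value format is illustrative.
describe_self() {
    echo "pid=$$ ppid=$PPID uid=$(id -u)"
}

describe_self
```

On Mac OS X, PID 1 is the root launchd, so a logged `ppid=1` would suggest the process was started by the root launchd, while any other parent PID would need to be looked up (for example with `ps`) to see which launchd instance or process it belongs to.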
On Nov 29, 2007 6:28 AM, James Bucanek <subscriber@gloaming.com> wrote:

> The really bad part is that the instance of the daemon that is running does not seem to be running in the correct environment/namespace. When it launches a sub-process, it attempts to connect with it using Mach ports; the connection fails. Restarting the OS doesn't fix the problem.

Have you tried passing various different sessiontypes (Background, StandardIO and System) to launchctl? I'd have thought that if this worked, so would an OS restart, but it might be worth a try.

Hamish
participants (5)
- Dave Zarzycki
- Hamish Allan
- James Bucanek
- Kevin Van Vechten
- Nathan Duran