100% repeatable Kernel panic caused by launchd
I was pointed at this mailing list to try and figure out what seems to be a pretty bad launchd bug.
I recently added an opentsdb formula to homebrew, including a launchd plist. However when I load the plist with launchctl load ~/Library/LaunchAgents/homebrew.mxcl.opentsdb.plist my system hangs, then kernel panics, then on reboot it just hangs and doesn’t ever finish booting, requiring me to boot into single user mode, fsck the filesystem, mount the fs as read/write and then delete the plist.
This is 100% reproducible on my MacBook Pro (Retina, 15-inch, Mid 2014) running OS X 10.11.5 (15F34).
Steps to reproduce (warning, this will almost certainly cause a kernel panic):
/usr/bin/ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"
brew install opentsdb
ln -s `brew --prefix hbase`/homebrew.mxcl.hbase.plist ~/Library/LaunchAgents/
ln -s `brew --prefix opentsdb`/homebrew.mxcl.opentsdb.plist ~/Library/LaunchAgents/
launchctl load ~/Library/LaunchAgents/homebrew.mxcl.hbase.plist
sleep 15 && launchctl load ~/Library/LaunchAgents/homebrew.mxcl.opentsdb.plist
Note that running hbase via launchd and starting opentsdb manually does not cause a kernel panic (or any other problems that I’ve noticed).
Does anyone have any ideas about why this is happening?
On 24 May 2016, at 18:52, Camden Narzt <camden.narzt@hotmail.com> wrote:
Does anyone have any ideas about why this is happening?
The answer to the question “Why does my system kernel panic?” is pretty much always “Because of a bug in the OS?” The system is not supposed to panic unless you do obviously dodgy things (like load KEXTs).
Whatever else you do, please file a bug report about this.
<https://developer.apple.com/bug-reporting/>
Make sure to:
* check whether there was a panic log created and, if so, attach that to your bug
* post your bug number here, for the record
Share and Enjoy
--
Quinn "The Eskimo!" <http://www.apple.com/developer/>
Apple Developer Relations, Developer Technical Support, Core OS/Hardware
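For the "check whether there was a panic log" step, something along these lines could list candidate files to attach. This is only a sketch: the directory and the ".panic" suffix are assumptions about where OS X keeps panic reports, and find_panic_logs is a hypothetical helper name, not an Apple tool.

```python
import glob
import os

# Assumption: panic reports are written as "*.panic" files in this directory.
DEFAULT_REPORT_DIR = "/Library/Logs/DiagnosticReports"


def find_panic_logs(report_dir=DEFAULT_REPORT_DIR):
    """Return the paths of *.panic files in report_dir, newest first."""
    paths = glob.glob(os.path.join(report_dir, "*.panic"))
    # Sort by modification time so the most recent panic comes first.
    return sorted(paths, key=os.path.getmtime, reverse=True)
```

Calling find_panic_logs() with no argument looks in the assumed default directory; an empty list means no file matched there, not that no panic happened.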
On May 24, 2016, at 11:50 AM, Quinn The Eskimo! <eskimo1@apple.com> wrote:
The answer to the question “Why does my system kernel panic?” is pretty much always “Because of a bug in the OS?” The system is not supposed to panic unless you do obviously dodgy things (like load KEXTs).
One quick note on this: what Quinn is saying here is 100% correct, but not necessarily good for you. The fact that the kernel shouldn’t panic does NOT mean that your code should work. While it’s possible that this is entirely an OS level issue, what’s often the case is that OS X is failing to catch something dodgy and that failure eventually leads to a panic. Fixing the issue in OS X means catching the problem earlier and doing something “nice” like crashing/hanging the triggering process, not making the dodgy thing work.
I have no idea what homebrew.mxcl.opentsdb.plist is kicking off, but it’s very likely that getting this to work is going to involve changes to homebrew and/or opentsdb, even after the kernel panic is fixed.
-Kevin
Radar:26456448 Camden
On May 24, 2016, at 1:10 PM, Kevin Elliott <kelliott@mac.com> wrote:
One quick note on this: what Quinn is saying here is 100% correct, but not necessarily good for you. The fact that the kernel shouldn’t panic does NOT mean that your code should work. While it’s possible that this is entirely an OS level issue, what’s often the case is that OS X is failing to catch something dodgy and that failure eventually leads to a panic. Fixing the issue in OS X means catching the problem earlier and doing something “nice” like crashing/hanging the triggering process, not making the dodgy thing work.
I have no idea what homebrew.mxcl.opentsdb.plist is kicking off, but it’s very likely that getting this to work is going to involve changes to homebrew and/or opentsdb, even after the kernel panic is fixed.
-Kevin
You should attach the problematic launchd.plist to that Radar. -damien
On 24 May, 2016, at 16:07, Camden Narzt <camden.narzt@hotmail.com> wrote:
Radar:26456448
Camden
On 24 May 2016, at 19:52, Camden Narzt wrote:
[…]
This is 100% reproducible on my MacBook Pro (Retina, 15-inch, Mid 2014) running OS X 10.11.5 (15F34).
Steps to reproduce (warning, this will almost certainly cause a kernel panic):
/usr/bin/ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"
brew install opentsdb
ln -s `brew --prefix hbase`/homebrew.mxcl.hbase.plist ~/Library/LaunchAgents/
ln -s `brew --prefix opentsdb`/homebrew.mxcl.opentsdb.plist ~/Library/LaunchAgents/
launchctl load ~/Library/LaunchAgents/homebrew.mxcl.hbase.plist
sleep 15 && launchctl load ~/Library/LaunchAgents/homebrew.mxcl.opentsdb.plist
Note that running hbase via launchd and starting opentsdb manually does not cause a kernel panic (or any other problems that I’ve noticed).
Does anyone have any ideas about why this is happening?
Hello Camden,
Do you mean we have to download and install a lot of things to be able to answer your seemingly innocuous question "any ideas about why this is happening"? We all (well, at least I) on this list are now alarmed about a possible problem with launchd (or even Mac OS X), yet without any clue about that problem. Could you provide us with a case reduced to the bare minimum needed to reproduce the problem you are encountering?
TIA, Axel
I’m sorry, I just assumed anyone doing any dev work on OSX would already run homebrew.
A much smaller test case with no installs (so definitely a problem with launchd/OS X) is as follows:
cat <<EOF > php.plist
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>KeepAlive</key>
    <dict>
        <key>OtherJobEnabled</key>
        <string>org.apache.httpd</string>
    </dict>
    <key>Label</key>
    <string>php</string>
    <key>ProgramArguments</key>
    <array>
        <string>/usr/bin/php</string>
        <string>--server</string>
        <string>127.0.0.1:9000</string>
    </array>
</dict>
</plist>
EOF
sudo apachectl start
launchctl load php.plist
Cam
On May 25, 2016, at 3:23 PM, Axel Luttgens <axel.luttgens@skynet.be> wrote:
Hello Camden,
Do you mean we have to download and install a lot of things to be able to answer your seemingly innocuous question "any ideas about why this is happening"? We all (well, at least I) on this list are now alarmed about a possible problem with launchd (or even Mac OS X), yet without any clue about that problem. Could you provide us with a case reduced to the bare minimum needed to reproduce the problem you are encountering?
TIA, Axel
On May 25, 2016, at 3:05 PM, Camden Narzt <camden.narzt@hotmail.com> wrote:
I’m sorry, I just assumed anyone doing any dev work on OSX would already run homebrew.
I do lots of development work on OS X. I don't run homebrew.
On 26 May 2016, at 00:05, Camden Narzt wrote:
I’m sorry, I just assumed anyone doing any dev work on OSX would already run homebrew.
A much smaller test case with no installs (so definitely a problem with launchd/OS X) is as follows:
cat <<EOF > php.plist
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>KeepAlive</key>
    <dict>
        <key>OtherJobEnabled</key>
        <string>org.apache.httpd</string>
    </dict>
    <key>Label</key>
    <string>php</string>
    <key>ProgramArguments</key>
    <array>
        <string>/usr/bin/php</string>
        <string>--server</string>
        <string>127.0.0.1:9000</string>
    </array>
</dict>
</plist>
EOF
sudo apachectl start
launchctl load php.plist
Hello Camden,
Thanks for sharing.
So, I guess the moral is "currently, don’t even try to make use of the OtherJobEnabled key"?
Thanks again, Axel
Camden,
OtherJobEnabled is documented to be a dictionary of booleans, not a single string. launchd should log an error message pointing this out and ignore the key. Instead it crashes, which is definitely a bug! :)
The bug will be addressed in a future release; meanwhile you can work around it by changing the plist to
<key>OtherJobEnabled</key>
<dict>
    <key>org.apache.httpd</key>
    <true/>
</dict>
Thank you very much for reporting this!
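The shape described here (a dictionary mapping job labels to booleans, rather than a single string) can be checked before a plist is ever handed to launchctl. The following is a minimal sketch using Python's standard plistlib module; other_job_enabled_problems and the trimmed-down TEMPLATE are illustrative names made up for this example, not part of launchd, and the check covers only this one key.

```python
import plistlib

# Hypothetical, trimmed-down job definition; %s is filled with the value under test.
TEMPLATE = b"""<?xml version="1.0" encoding="UTF-8"?>
<plist version="1.0">
<dict>
    <key>KeepAlive</key>
    <dict>
        <key>OtherJobEnabled</key>
        %s
    </dict>
    <key>Label</key>
    <string>php</string>
</dict>
</plist>
"""

# The string form from the test case, and the dictionary-of-booleans form.
BROKEN = TEMPLATE % b"<string>org.apache.httpd</string>"
FIXED = TEMPLATE % b"<dict><key>org.apache.httpd</key><true/></dict>"


def other_job_enabled_problems(plist_bytes):
    """Return a list of problems with KeepAlive/OtherJobEnabled (empty if none)."""
    job = plistlib.loads(plist_bytes)
    keep_alive = job.get("KeepAlive")
    # KeepAlive may also be a plain boolean; only the dictionary form has sub-keys.
    if not isinstance(keep_alive, dict) or "OtherJobEnabled" not in keep_alive:
        return []
    value = keep_alive["OtherJobEnabled"]
    if not isinstance(value, dict):
        return ["OtherJobEnabled should be a dictionary of booleans, got %s"
                % type(value).__name__]
    # Every entry must map a job label to a real boolean.
    return ["OtherJobEnabled[%r] should be a boolean" % label
            for label, flag in value.items() if not isinstance(flag, bool)]
```

Running other_job_enabled_problems(BROKEN) reports the string value, while the FIXED variant yields an empty list. A check like this does not replace the fix in launchd itself; it only catches the malformed key early.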
On May 27, 2016, at 03:43, Axel Luttgens <axel.luttgens@skynet.be> wrote:
Hello Camden,
Thanks for sharing.
So, I guess the moral is "currently, don’t even try to make use of the OtherJobEnabled key"?
Thanks again, Axel
participants (7)
- Axel Luttgens
- Camden Narzt
- Damien Sorresso
- Gregory Neagle
- Joe Auricchio
- Kevin Elliott
- Quinn "The Eskimo!"