[Xquartz-dev] Help requested debugging rgl under XQuartz

Duncan Murdoch murdoch.duncan at gmail.com
Wed Feb 24 12:10:53 PST 2021


The only call it makes is to libGFXShared.dylib`gfxIODataBindSurface, 
and when it returns from that it jumps to the error exit.

Inside that function, it checks whether a pointer is non-null, then uses 
it to jump to libGPUSupportMercury.dylib`gldAttachDrawable.

In gldAttachDrawable, it looks like it is detecting something wrong, 
then it calls gpuiReleaseDrawable, IOAccelGLContextClearDrawable, and 
then returns the 0x2715 = 10005 = kCGLBadDrawable value.
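
For reference, the hex-to-decimal step and the name lookup can be checked with a short Python sketch. The table is an assumption: it is a subset of the CGLError enum as I understand it from CGLTypes.h, and should be verified against the SDK header. The helper name cgl_error_name is hypothetical.

```python
# Subset of CGLError values. This table is an assumption transcribed from
# the CGLError enum in <OpenGL/CGLTypes.h>; check it against the SDK header.
CGL_ERRORS = {
    0: "kCGLNoError",
    10000: "kCGLBadAttribute",
    10001: "kCGLBadProperty",
    10002: "kCGLBadPixelFormat",
    10003: "kCGLBadRendererInfo",
    10004: "kCGLBadContext",
    10005: "kCGLBadDrawable",
    10006: "kCGLBadDisplay",
    10007: "kCGLBadState",
    10008: "kCGLBadValue",
}


def cgl_error_name(raw):
    """Map a raw return value (e.g. read from a register) to a name."""
    code = raw & 0xFFFFFFFF  # keep only the low 32 bits of the register value
    return CGL_ERRORS.get(code, "unknown (%d)" % code)


# The value seen in the debugger, as a 64-bit register dump.
print(hex(10005), cgl_error_name(0x0000000000002715))
```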

I don't have the source (do I?), and I don't know the argument passing 
conventions.  Can you tell me how type would be passed in?
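
A rough answer, assuming this is the x86_64 build and the usual System V AMD64 calling convention applies: the first six integer or pointer arguments go in rdi, rsi, rdx, rcx, r8 and r9, in that order. The sketch below maps the parameters of the gldAttachDrawable prototype quoted further down to registers. It assumes GLDContext and GLDDrawable are pointer-sized and that nothing is passed as a floating-point value or as a struct by value; the helper name arg_registers is hypothetical.

```python
# Integer/pointer argument registers, in order, for the System V AMD64 ABI.
SYSV_INT_REGS = ["rdi", "rsi", "rdx", "rcx", "r8", "r9"]

# Parameter names taken from the gldAttachDrawable prototype in this thread.
PARAMS = ["ctx", "type", "drawable", "options", "size_ret"]


def arg_registers(params):
    """Pair each integer/pointer parameter with its argument register."""
    if len(params) > len(SYSV_INT_REGS):
        raise ValueError("further arguments are passed on the stack")
    return dict(zip(params, SYSV_INT_REGS))


for name, reg in arg_registers(PARAMS).items():
    print(name, "->", reg)
```

If those assumptions hold, stopping at the first instruction of gldAttachDrawable and running `register read rsi` in lldb should show type in the low 32 bits (esi), before the function body has a chance to move it elsewhere.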

I don't think we get to either of the other functions.

Duncan Murdoch



On 24/02/2021 12:59 p.m., Jeremy Huddleston Sequoia wrote:
> IOAccelGLContextClearDrawable is called on the error-out path of that function, so yeah, we need to see how we got there.
> 
> enum32_t gldAttachDrawable(GLDContext ctx, enum32_t type, const GLDDrawable drawable, bitfield32_t options, GLTDimensions *size_ret)
> 
> Can you tell me what the type is here?
> 
> Do we get to IOAccelGLContextSetDrawable()?  If so, what does it return?
> Do we get to gpulUpdateDrawableDepth()?  If so, what does it return?
> 
> 
>> On Feb 24, 2021, at 09:46, Duncan Murdoch <murdoch.duncan at gmail.com> wrote:
>>
>> Yes, I did get it wrong.  It looks like the error was detected before the call to IOAccelGLContextClearDrawable and the stack checking.  I'll see if I can figure out where.
>>
>> Duncan Murdoch
>>
>> On 24/02/2021 12:11 p.m., Duncan Murdoch wrote:
>>> I don't see any calls to __stack_chk_fail .  It's possible I
>>> misinterpreted what was going on after the IOAccelGLContextClearDrawable
>>> call.  I'll take another look.
>>> Duncan Murdoch
>>> On 24/02/2021 11:41 a.m., Jeremy Huddleston Sequoia wrote:
>>>> __stack_chk_guard is part of stack protector.
>>>>
>>>> If it's not liking the value in __stack_chk_guard, it means the stack
>>>> was smashed.
>>>>
>>>> When this is detected, the compiler runtime should
>>>> call __stack_chk_fail() if implemented or abort if not.  Given that
>>>> we're not crashing, I wonder if there's a handler somewhere that ends up
>>>> causing us to return the bad value instead of crashing.
>>>>
>>>> Can you break on __stack_chk_fail and see if that gives us anything useful?
>>>>
>>>>
>>>>
>>>>
>>>>> On Feb 24, 2021, at 06:26, Duncan Murdoch <murdoch.duncan at gmail.com> wrote:
>>>>>
>>>>> Tracing in with lldb, it appears to be this sequence of calls leading
>>>>> to the 10005 error value:
>>>>>
>>>>> r
>>>>>   * frame #0: 0x00007fff5afc19e0 libGPUSupportMercury.dylib`gldAttachDrawable + 1
>>>>>     frame #1: 0x00007fff4467f396 GLEngine`gliAttachDrawableWithOptions + 251
>>>>>     frame #2: 0x00007fff4465d9f5 OpenGL`___lldb_unnamed_symbol40$$OpenGL + 972
>>>>>     frame #3: 0x00007fff446618e2 OpenGL`___lldb_unnamed_symbol59$$OpenGL + 82
>>>>>     frame #4: 0x00007fff44661c29 OpenGL`CGLSetSurface + 330
>>>>>     frame #5: 0x00007fff70c6ca63 libXplugin.1.dylib`xp_attach_gl_context + 95
>>>>>     frame #6: 0x0000000108590dee libGL.1.dylib`surface_make_current + 206
>>>>>     frame #7: 0x000000010858df6a libGL.1.dylib`apple_glx_make_current_context + 1274
>>>>>     frame #8: 0x0000000108574579 libGL.1.dylib`applegl_bind_context + 185
>>>>>     frame #9: 0x000000010856237e libGL.1.dylib`MakeContextCurrent + 414
>>>>>     frame #10: 0x00000001085621d9 libGL.1.dylib`glXMakeCurrent + 41
>>>>>
>>>>>
>>>>> The libGPUSupportMercury.dylib`gldAttachDrawable function calls
>>>>>
>>>>> IOAccelGLContextClearDrawable
>>>>>
>>>>> then does some sort of check of __stack_chk_guard and doesn't like
>>>>> what it sees, and sets the error.
>>>>>
>>>>> Does this give any hint about what's wrong, or a way to fix it?
>>>>>
>>>>> Duncan Murdoch
>>>>>
>>>>>
>>>>>
>>>>> On 23/02/2021 4:31 p.m., Duncan Murdoch wrote:
>>>>>> On 23/02/2021 3:47 p.m., Jeremy Huddleston Sequoia wrote:
>>>>>>>
>>>>>>>
>>>>>>>> On Feb 23, 2021, at 06:14, Duncan Murdoch <murdoch.duncan at gmail.com> wrote:
>>>>>>>>
>>>>>>>> On 23/02/2021 12:47 a.m., Jeremy Huddleston Sequoia wrote:
>>>>>>>>>> On Feb 22, 2021, at 14:38, Duncan Murdoch <murdoch.duncan at gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>> I've made a little bit of progress.
>>>>>>>>>>
>>>>>>>>>> The message "error: xp_attach_gl_context returned: 2" comes from the
>>>>>>>>>> Mesa routine surface_make_current, which calls xp_attach_gl_context.
>>>>>>>>>>   I haven't found where xp_attach_gl_context is defined.
>>>>>>>>> xp_attach_gl_context is in libXplugin (check Xplugin.h in the SDK).
>>>>>>>>> 2 is XP_BadValue, which is returned if cgl_ctx is NULL... so I'd
>>>>>>>>> suggest looking into why mesa is calling xp_attach_gl_context with a
>>>>>>>>> NULL context.
>>>>>>>>
>>>>>>>> Thanks, that's helpful.  The context is not NULL, so I need to think
>>>>>>>> of other ways it could be "bad".
>>>>>>>
>>>>>>> Ok, well xp_attach_gl_context is just a wrapper around CGLSetSurface(),
>>>>>>> which is an internal function to do exactly what we're trying to do
>>>>>>> here.  If it returns any error, xp_attach_gl_context returns bad value.
>>>>>>>
>>>>>>> Are you able to capture this in the debugger and figure out what the
>>>>>>> return value from CGLSetSurface() is?  That will tell us what the
>>>>>>> underlying CGLError is, which might help shed some light on this.
>>>>>> I believe it's returning  0x0000000000002715 when there's an error.
>>>>>> That's 10005, kCGLBadDrawable.  So now I need to find out what happened
>>>>>> to the drawable.
>>>>>> This feels like progress!  Thanks again.
>>>>>> Duncan
>>>>>>>
>>>>>>>> Here's what I see with LIBGL_DIAGNOSTIC=1.  For a successful open,
>>>>>>>>
>>>>>>>>> rgl.open()
>>>>>>>> function is no-op
>>>>>>>> Debug     ../src/glx/apple/apple_glx_context.c:205
>>>>>>>> apple_glx_create_context(4295810496): apple_glx_create_context: ac
>>>>>>>> 0x100a10a00 ac->context_obj 0x107cdce00
>>>>>>>> 2021-02-23 08:23:00.041711-0500 R[45754:1283995]
>>>>>>>> apple_glx_create_context: ac 0x100a10a00 ac->context_obj 0x107cdce00
>>>>>>>> Debug     ../src/glx/apple/apple_glx_drawable.c:342
>>>>>>>> apple_glx_drawable_create(4295810496): apple_glx_drawable_create: new
>>>>>>>> drawable 0x107ce0e00
>>>>>>>> 2021-02-23 08:23:00.042235-0500 R[45754:1283995]
>>>>>>>> apple_glx_drawable_create: new drawable 0x107ce0e00
>>>>>>>> Debug     ../src/glx/apple/apple_glx_surface.c:154
>>>>>>>> create_surface(4295810496): create_surface: created a surface for
>>>>>>>> drawable 0x600066 with uid 621
>>>>>>>> 2021-02-23 08:23:00.044773-0500 R[45754:1283995] create_surface:
>>>>>>>> created a surface for drawable 0x600066 with uid 621
>>>>>>>> Debug     ../src/glx/apple/apple_glx_surface.c:69
>>>>>>>> surface_make_current(4295810496): surface_make_current:
>>>>>>>> ac->context_obj 0x107cdce00 s->surface_id 9
>>>>>>>> 2021-02-23 08:23:00.044839-0500 R[45754:1283995] surface_make_current:
>>>>>>>> ac->context_obj 0x107cdce00 s->surface_id 9
>>>>>>>> Debug     ../src/glx/apple/apple_glx_surface.c:89
>>>>>>>> surface_make_current(4295810496): surface_make_current: drawable
>>>>>>>> 0x600066
>>>>>>>> 2021-02-23 08:23:00.045680-0500 R[45754:1283995] surface_make_current:
>>>>>>>> drawable 0x600066
>>>>>>>> ... (more lines deleted)
>>>>>>>>
>>>>>>>> After I run quartz(), I see this:
>>>>>>>>
>>>>>>>>> rgl.open()
>>>>>>>> Debug     ../src/glx/apple/apple_glx_context.c:205
>>>>>>>> apple_glx_create_context(4295810496): apple_glx_create_context: ac
>>>>>>>> 0x10262bb00 ac->context_obj 0x1058c4800
>>>>>>>> 2021-02-23 08:23:35.666675-0500 R[45754:1283995]
>>>>>>>> apple_glx_create_context: ac 0x10262bb00 ac->context_obj 0x1058c4800
>>>>>>>> Debug     ../src/glx/apple/apple_glx_drawable.c:342
>>>>>>>> apple_glx_drawable_create(4295810496): apple_glx_drawable_create: new
>>>>>>>> drawable 0x107648000
>>>>>>>> 2021-02-23 08:23:35.667040-0500 R[45754:1283995]
>>>>>>>> apple_glx_drawable_create: new drawable 0x107648000
>>>>>>>> Debug     ../src/glx/apple/apple_glx_surface.c:154
>>>>>>>> create_surface(4295810496): create_surface: created a surface for
>>>>>>>> drawable 0x6000c9 with uid 629
>>>>>>>> 2021-02-23 08:23:35.669119-0500 R[45754:1283995] create_surface:
>>>>>>>> created a surface for drawable 0x6000c9 with uid 629
>>>>>>>> Debug     ../src/glx/apple/apple_glx_surface.c:69
>>>>>>>> surface_make_current(4295810496): surface_make_current:
>>>>>>>> ac->context_obj 0x1058c4800 s->surface_id 13
>>>>>>>> 2021-02-23 08:23:35.669195-0500 R[45754:1283995] surface_make_current:
>>>>>>>> ac->context_obj 0x1058c4800 s->surface_id 13
>>>>>>>> error: xp_attach_gl_context returned: 2
>>>>>>>> Debug     ../src/glx/applegl_glx.c:60
>>>>>>>> applegl_bind_context(4295810496): applegl_bind_context: error YES
>>>>>>>> 2021-02-23 08:23:35.669834-0500 R[45754:1283995] applegl_bind_context:
>>>>>>>> error YES
>>>>>>>>
>>>>>>>> and then I get my own messages from the failure of glXMakeCurrent().
>>>>>>>>   As far as I can see, everything appears fine until the call to
>>>>>>>> xp_attach_gl_context.
>>>>>>>>
>>>>>>>>
>>>>>>>> Everything looks very similar up to the failure of
>>>>>>>> xp_attach_gl_context.  Any idea why the value returned a few lines
>>>>>>>> earlier from apple_glx_create_context() should be a bad value?
>>>>>>>>
>>>>>>>> Duncan Murdoch
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>
>>>>
>>
> 


