Question
I have a rather large client-server network application, written in Python. I'm using select.poll to provide asynchronous capabilities. For the past six months, everything has worked fine. However, I recently changed some things to allow the client to reliably log off from the server. At first glance it appeared that the client was never receiving the request, and furthermore, that it was blocking. When I killed the process, I received the following output:
*** glibc detected *** /usr/bin/python: corrupted double-linked list: 0x0a9fea60 ***
======= Backtrace: =========
/lib/i386-linux-gnu/libc.so.6(+0x6cbe1)[0xd96be1]
/lib/i386-linux-gnu/libc.so.6(+0x6fc1c)[0xd99c1c]
/lib/i386-linux-gnu/libc.so.6(__libc_malloc+0x63)[0xd9b1d3]
/usr/lib/i386-linux-gnu/libxcb.so.1(+0x8ff6)[0xb30ff6]
/usr/lib/i386-linux-gnu/libxcb.so.1(+0x706d)[0xb2f06d]
/usr/lib/i386-linux-gnu/libxcb.so.1(+0x75b5)[0xb2f5b5]
/usr/lib/i386-linux-gnu/libxcb.so.1(xcb_writev+0x67)[0xb2f667]
/usr/lib/i386-linux-gnu/libX11.so.6(_XSend+0x14b)[0x59b42b]
/usr/lib/i386-linux-gnu/libX11.so.6(_XFlush+0x39)[0x59b889]
/usr/lib/i386-linux-gnu/libX11.so.6(XFlush+0x31)[0x57ba81]
/usr/lib/libSDL-1.2.so.0(+0x34dfe)[0x16adfe]
/usr/lib/libSDL-1.2.so.0(+0x37998)[0x16d998]
/usr/lib/libSDL-1.2.so.0(+0x393db)[0x16f3db]
/usr/lib/libSDL-1.2.so.0(SDL_PumpEvents+0x3d)[0x140d7d]
/usr/lib/libSDL-1.2.so.0(SDL_PollEvent+0x17)[0x140db7]
/usr/lib/libSDL-1.2.so.0(SDL_EventState+0x58)[0x140f78]
/usr/lib/libSDL-1.2.so.0(SDL_JoystickEventState+0x5b)[0x16810b]
/usr/lib/python2.7/dist-packages/pygame/joystick.so(+0x196d)[0x55896d]
/usr/lib/python2.7/dist-packages/pygame/base.so(+0x178a)[0x56078a]
/usr/lib/python2.7/dist-packages/pygame/base.so(+0x17c7)[0x5607c7]
/usr/bin/python(PyEval_EvalFrameEx+0x4332)[0x80de822]
/usr/bin/python(PyEval_EvalCodeEx+0x127)[0x80e11e7]
/usr/bin/python[0x8105a61]
/usr/bin/python(PyObject_Call+0x4a)[0x80a464a]
/usr/bin/python(PyEval_CallObjectWithKeywords+0x44)[0x80da034]
/usr/bin/python(Py_Finalize+0xc7)[0x8070ee1]
/usr/bin/python(Py_Main+0xc66)[0x805c109]
/usr/bin/python(main+0x1b)[0x805b25b]
/lib/i386-linux-gnu/libc.so.6(__libc_start_main+0xe7)[0xd40e37]
/usr/bin/python[0x81074ad]
followed by a memory map, which I'm not posting for the sake of brevity. I ran the code under PDB and found that the client was blocking on the call to pollingObject.poll(0), which shouldn't block. So I changed that call to select.select([socket], [], [], 0), still without success. I'm using PyGame, if that makes a difference, as I know it sometimes does. I'm completely lost here. I know that Python overrides malloc; could it have something to do with that?
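For reference, here is a minimal sketch of how a zero timeout is expected to behave with both calls. It is not the application's code: socket.socketpair() stands in for the real client-server connection, and the names server_side, client_side, and poller are made up for illustration. It also assumes a platform where select.poll is available (Linux, for example).

```python
import select
import socket
import time

# Hypothetical stand-in for the real connection: two already-connected sockets.
server_side, client_side = socket.socketpair()
client_fd = client_side.fileno()

poller = select.poll()
poller.register(client_side, select.POLLIN)

# Nothing has been sent yet, so poll(0) should return an empty list right away.
start = time.time()
idle_events = poller.poll(0)
idle_elapsed = time.time() - start

# Once the peer has written something, poll(0) reports the socket as readable.
server_side.sendall(b"logoff")
ready_events = poller.poll(0)

# select.select with a timeout of 0 is also a non-blocking readiness check.
readable, _, _ = select.select([client_side], [], [], 0)
readable_count = len(readable)

server_side.close()
client_side.close()
```

If a check like this returns immediately in isolation but the real client still appears stuck at the same line, the socket call itself is probably not what is blocking.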
Answer 1:
I managed to fix it by implementing the network code in C and calling it from Python.
Answer 2:
It looks to me like PyGame is checking for input events after the X connection has been closed, due to finalizers. Calling anything in Xlib with a Display * that's already been passed to XCloseDisplay means accessing already-freed memory, of course, and if that's what's going on it isn't surprising that glibc's heap becomes corrupted.
If my diagnosis is correct, you won't be able to truly fix it at the application level, but producing a minimal test case and submitting it to the PyGame developers might be productive.
Source: https://stackoverflow.com/questions/12134948/python-select-select-select-poll-corrupted-double-linked-list