How to solve a ClickHouse deadlock?

Posted by 感情迁移 on 2020-04-18 12:35:51

Question


I was running a set of 10 concurrent tests when ClickHouse deadlocked.

The tests ran the following SQL:

select
id,
sum(a) as a,
sum(b) as b,
sum(c) as c,
round(sum(d), 2) as d
from f_table
where xxx

I then ran pstack my-clickhouse-server-process-id and found a number of threads stuck in __lll_lock_wait.
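With a dump of more than a thousand threads, a quick summary can be easier to read than the raw stacks. The following is a minimal Python sketch, not part of the original post, that assumes the pstack output uses the "Thread N (...)" header and "#k 0x... in function ()" frame layout. It counts how many threads have __lll_lock_wait as their innermost frame, and how many of those also have a "<signal handler called>" frame somewhere in the stack. The function names split_threads and summarize are hypothetical helpers chosen for illustration.

```python
import re
from typing import Dict, List

# Header line such as: Thread 1278 (Thread 0x7f89a23c6700 (LWP 18345)):
THREAD_RE = re.compile(r"^Thread (\d+) \(")
# Frame line such as: #0  0x00007f89a8b944ed in __lll_lock_wait () from ...
# The address part is optional so "#7  <signal handler called>" also matches.
FRAME_RE = re.compile(r"^#(\d+)\s+(?:0x[0-9a-fA-F]+ in )?(.+)$")


def split_threads(dump: str) -> Dict[int, List[str]]:
    """Map each pstack thread number to its frames, innermost frame first."""
    threads: Dict[int, List[str]] = {}
    current = None
    for raw in dump.splitlines():
        line = raw.strip()
        header = THREAD_RE.match(line)
        if header:
            current = int(header.group(1))
            threads[current] = []
            continue
        frame = FRAME_RE.match(line)
        if frame and current is not None:
            threads[current].append(frame.group(2))
    return threads


def summarize(dump: str) -> Dict[str, int]:
    """Count threads blocked in __lll_lock_wait, and those blocked under a signal handler."""
    threads = split_threads(dump)
    blocked = {
        tid: frames
        for tid, frames in threads.items()
        if frames and frames[0].startswith("__lll_lock_wait")
    }
    in_handler = [
        tid for tid, frames in blocked.items()
        if "<signal handler called>" in frames
    ]
    return {
        "threads": len(threads),
        "blocked_on_lock": len(blocked),
        "blocked_inside_signal_handler": len(in_handler),
    }
```

To use it, save the pstack output to a file (for example stacks.txt, a placeholder name) and call summarize() on the file's contents.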

Sorry for posting so many thread stacks; I thought more information might give you some ideas. Since I can't reproduce this reliably yet, I haven't opened an issue on GitHub. I read https://github.com/ClickHouse/ClickHouse/issues/4316 but I'm not sure what that fix actually achieved. My current version is 19.14.7.15.

Here are a few typical stacks:

Thread 1278 (Thread 0x7f89a23c6700 (LWP 18345)):
#0  0x00007f89a8b944ed in __lll_lock_wait () from /lib64/libpthread.so.0
#1  0x00007f89a8b910e2 in pthread_rwlock_rdlock () from /lib64/libpthread.so.0
#2  0x000055dc6dd3c87f in ?? ()
#3  0x000055dc6dd35393 in unw_step ()
#4  0x000055dc6dd354b0 in unw_backtrace ()
#5  0x000055dc68058221 in StackTrace::StackTrace(ucontext_t const&) ()
#6  0x000055dc6806c91e in ?? ()
#7  <signal handler called>
#8  0x000055dc6b87d21d in ?? ()
#9  0x000055dc6b87da1e in LZ4::decompress(char const*, char*, unsigned long, unsigned long, LZ4::PerformanceStatistics&) ()
#10 0x000055dc6b184eb6 in DB::ICompressionCodec::decompress(char const*, unsigned int, char*) const ()
#11 0x000055dc6b17dd37 in DB::CompressedReadBufferBase::decompress(char*, unsigned long, unsigned long) ()
#12 0x000055dc6b8544b5 in DB::CompressedReadBufferFromFile::nextImpl() ()
#13 0x000055dc6b854609 in DB::CompressedReadBufferFromFile::seek(unsigned long, unsigned long) ()
#14 0x000055dc6b6519c8 in DB::MergeTreeReaderStream::seekToMark(unsigned long) ()
#15 0x000055dc6bb1d637 in ?? ()
#16 0x000055dc681ecc78 in DB::IDataType::deserializeBinaryBulkWithMultipleStreams(DB::IColumn&, unsigned long, DB::IDataType::DeserializeBinaryBulkSettings&, std::shared_ptr<DB::IDataType::DeserializeBinaryBulkState>&) const ()
#17 0x000055dc6bb1f64c in DB::MergeTreeReader::readData(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, DB::IDataType const&, DB::IColumn&, unsigned long, bool, unsigned long, bool) ()
#18 0x000055dc6bb1fdfb in DB::MergeTreeReader::readRows(unsigned long, bool, unsigned long, DB::Block&) ()
#19 0x000055dc6bb197be in DB::MergeTreeRangeReader::DelayedStream::finalize(DB::Block&) ()
#20 0x000055dc6bb1b988 in DB::MergeTreeRangeReader::continueReadingChain(DB::MergeTreeRangeReader::ReadResult&) ()
#21 0x000055dc6bb1c1ae in DB::MergeTreeRangeReader::read(unsigned long, std::vector<DB::MarkRange, std::allocator<DB::MarkRange> >&) ()
#22 0x000055dc6baec6d3 in DB::MergeTreeBaseSelectBlockInputStream::readFromPartImpl() ()
#23 0x000055dc6baed1a5 in DB::MergeTreeBaseSelectBlockInputStream::readImpl() ()
#24 0x000055dc6b191637 in DB::IBlockInputStream::read() ()
#25 0x000055dc6b8bf6c6 in DB::FilterBlockInputStream::readImpl() ()
#26 0x000055dc6b191637 in DB::IBlockInputStream::read() ()
#27 0x000055dc6b8b81ef in DB::ExpressionBlockInputStream::readImpl() ()
#28 0x000055dc6b191637 in DB::IBlockInputStream::read() ()
#29 0x000055dc6b8f1b9a in DB::ParallelInputsProcessor<DB::ParallelAggregatingBlockInputStream::Handler>::thread(std::shared_ptr<DB::ThreadGroupStatus>, unsigned long) ()
#30 0x000055dc6b8f265d in _ZZN20ThreadFromGlobalPoolC4IMN2DB23ParallelInputsProcessorINS1_35ParallelAggregatingBlockInputStream7HandlerEEEFvSt10shared_ptrINS1_17ThreadGroupStatusEEmEJPS5_S8_RmEEEOT_DpOT0_ENKUlvE_clEv ()
#31 0x000055dc6809338c in ThreadPoolImpl<std::thread>::worker(std::_List_iterator<std::thread>) ()
#32 0x000055dc6dd01b60 in ?? ()
#33 0x00007f89a8b8ddd5 in start_thread () from /lib64/libpthread.so.0
#34 0x00007f89a84ab02d in clone () from /lib64/libc.so.6

and

Thread 1231 (Thread 0x7f897ebff700 (LWP 18392)):
#0  0x00007f89a8b944ed in __lll_lock_wait () from /lib64/libpthread.so.0
#1  0x00007f89a8b910e2 in pthread_rwlock_rdlock () from /lib64/libpthread.so.0
#2  0x000055dc6dd3c87f in ?? ()
#3  0x000055dc6dd35393 in unw_step ()
#4  0x000055dc6dd354b0 in unw_backtrace ()
#5  0x000055dc68058221 in StackTrace::StackTrace(ucontext_t const&) ()
#6  0x000055dc6806c91e in ?? ()
#7  <signal handler called>
#8  0x00007f89a8b91963 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#9  0x000055dc6dc8dc0c in std::condition_variable::wait(std::unique_lock<std::mutex>&) ()
#10 0x000055dc6d527cab in Poco::Event::wait() ()
#11 0x000055dc6b8ef2fa in DB::ParallelInputsProcessor<DB::ParallelAggregatingBlockInputStream::Handler>::wait() ()
#12 0x000055dc6b8eaff8 in DB::ParallelAggregatingBlockInputStream::execute() ()
#13 0x000055dc6b8ee7e0 in DB::ParallelAggregatingBlockInputStream::readImpl() ()
#14 0x000055dc6b191637 in DB::IBlockInputStream::read() ()
#15 0x000055dc6b18a15b in DB::AsynchronousBlockInputStream::calculate() ()
#16 0x000055dc6b18a520 in ?? ()
#17 0x000055dc680958de in ThreadPoolImpl<ThreadFromGlobalPool>::worker(std::_List_iterator<ThreadFromGlobalPool>) ()
#18 0x000055dc68095eee in _ZZN20ThreadFromGlobalPoolC4IZN14ThreadPoolImplIS_E12scheduleImplIvEET_St8functionIFvvEEiSt8optionalImEEUlvE1_JEEEOS4_DpOT0_ENKUlvE_clEv ()
#19 0x000055dc6809338c in ThreadPoolImpl<std::thread>::worker(std::_List_iterator<std::thread>) ()
#20 0x000055dc6dd01b60 in ?? ()
#21 0x00007f89a8b8ddd5 in start_thread () from /lib64/libpthread.so.0
#22 0x00007f89a84ab02d in clone () from /lib64/libc.so.6

and

Thread 1203 (Thread 0x7f8969ff9700 (LWP 18420)):
#0  0x00007f89a8b944ed in __lll_lock_wait () from /lib64/libpthread.so.0
#1  0x00007f89a8b910e2 in pthread_rwlock_rdlock () from /lib64/libpthread.so.0
#2  0x000055dc6dd3c87f in ?? ()
#3  0x000055dc6dd35393 in unw_step ()
#4  0x000055dc6dd354b0 in unw_backtrace ()
#5  0x000055dc68058221 in StackTrace::StackTrace(ucontext_t const&) ()
#6  0x000055dc6806c91e in ?? ()
#7  <signal handler called>
#8  0x00007f89a8b944eb in __lll_lock_wait () from /lib64/libpthread.so.0
#9  0x00007f89a8b910e2 in pthread_rwlock_rdlock () from /lib64/libpthread.so.0
#10 0x000055dc6dd3c87f in ?? ()
#11 0x000055dc6dd35393 in unw_step ()
#12 0x000055dc6dd354b0 in unw_backtrace ()
#13 0x000055dc68058221 in StackTrace::StackTrace(ucontext_t const&) ()
#14 0x000055dc6806c91e in ?? ()
#15 <signal handler called>
#16 0x000055dc68bb02ee in DB::NumComparisonImpl<long, unsigned int, DB::EqualsOp<long, unsigned int> >::vector_constant(DB::PODArray<long, 4096ul, Allocator<false, false>, 15ul, 16ul> const&, unsigned int, DB::PODArray<unsigned char, 4096ul, Allocator<false, false>, 15ul, 16ul>&) ()
#17 0x000055dc68be2aac in bool DB::FunctionComparison<DB::EqualsOp, DB::NameEquals>::executeNumLeftType<long>(DB::Block&, unsigned long, DB::IColumn const*, DB::IColumn const*) ()
#18 0x000055dc68cb9496 in DB::FunctionComparison<DB::EqualsOp, DB::NameEquals>::executeImpl(DB::Block&, std::vector<unsigned long, std::allocator<unsigned long> > const&, unsigned long, unsigned long) ()
#19 0x000055dc6b8329e4 in DB::PreparedFunctionImpl::execute(DB::Block&, std::vector<unsigned long, std::allocator<unsigned long> > const&, unsigned long, unsigned long, bool) ()
#20 0x000055dc6ba20231 in DB::ExpressionAction::execute(DB::Block&, bool) const ()
#21 0x000055dc6ba21945 in DB::ExpressionActions::execute(DB::Block&, bool) const ()
#22 0x000055dc6b8bfa08 in DB::FilterBlockInputStream::readImpl() ()
#23 0x000055dc6b191637 in DB::IBlockInputStream::read() ()
#24 0x000055dc6b8b81ef in DB::ExpressionBlockInputStream::readImpl() ()
#25 0x000055dc6b191637 in DB::IBlockInputStream::read() ()
#26 0x000055dc6b8f1b9a in DB::ParallelInputsProcessor<DB::ParallelAggregatingBlockInputStream::Handler>::thread(std::shared_ptr<DB::ThreadGroupStatus>, unsigned long) ()
#27 0x000055dc6b8f265d in _ZZN20ThreadFromGlobalPoolC4IMN2DB23ParallelInputsProcessorINS1_35ParallelAggregatingBlockInputStream7HandlerEEEFvSt10shared_ptrINS1_17ThreadGroupStatusEEmEJPS5_S8_RmEEEOT_DpOT0_ENKUlvE_clEv ()
#28 0x000055dc6809338c in ThreadPoolImpl<std::thread>::worker(std::_List_iterator<std::thread>) ()
#29 0x000055dc6dd01b60 in ?? ()
#30 0x00007f89a8b8ddd5 in start_thread () from /lib64/libpthread.so.0
#31 0x00007f89a84ab02d in clone () from /lib64/libc.so.6

and

Thread 1202 (Thread 0x7f89697f8700 (LWP 18421)):
#0  0x00007f89a8b944ed in __lll_lock_wait () from /lib64/libpthread.so.0
#1  0x00007f89a8b910e2 in pthread_rwlock_rdlock () from /lib64/libpthread.so.0
#2  0x000055dc6dd3c87f in ?? ()
#3  0x000055dc6dd35393 in unw_step ()
#4  0x000055dc6dd354b0 in unw_backtrace ()
#5  0x000055dc680581e0 in StackTrace::StackTrace() ()
#6  0x000055dc6b287535 in DB::Context::getTable(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) const ()
#7  0x000055dc6b2d32dc in DB::InterpreterInsertQuery::getTable(DB::ASTInsertQuery const&) ()
#8  0x000055dc6b2d3ce7 in DB::InterpreterInsertQuery::execute() ()
#9  0x000055dc680b807a in DB::SystemLog<DB::QueryLogElement>::flushImpl(DB::SystemLog<DB::QueryLogElement>::EntryType) ()
#10 0x000055dc680b87f2 in DB::SystemLog<DB::QueryLogElement>::threadFunction() ()
#11 0x000055dc680b938a in _ZZN20ThreadFromGlobalPoolC4IZN2DB9SystemLogINS1_15QueryLogElementEEC4ERNS1_7ContextERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEESE_SE_mEUlvE_JEEEOT_DpOT0_ENKUlvE_clEv ()
#12 0x000055dc6809338c in ThreadPoolImpl<std::thread>::worker(std::_List_iterator<std::thread>) ()
#13 0x000055dc6dd01b60 in ?? ()
#14 0x00007f89a8b8ddd5 in start_thread () from /lib64/libpthread.so.0
#15 0x00007f89a84ab02d in clone () from /lib64/libc.so.6

and

Thread 1175 (Thread 0x7f888a1f3700 (LWP 18449)):
#0  0x00007f89a8b944ed in __lll_lock_wait () from /lib64/libpthread.so.0
#1  0x00007f89a8b910e2 in pthread_rwlock_rdlock () from /lib64/libpthread.so.0
#2  0x000055dc6dd3c87f in ?? ()
#3  0x000055dc6dd35393 in unw_step ()
#4  0x000055dc6dd354b0 in unw_backtrace ()
#5  0x000055dc68058221 in StackTrace::StackTrace(ucontext_t const&) ()
#6  0x000055dc6806c91e in ?? ()
#7  <signal handler called>
#8  0x00007f89a84ab603 in epoll_wait () from /lib64/libc.so.6
#9  0x000055dc6bea48b6 in Poco::Net::SocketImpl::poll(Poco::Timespan const&, int) ()
#10 0x000055dc6bea1d5b in Poco::Net::SocketImpl::receiveBytes(void*, int, int) ()
#11 0x000055dc6be5afaa in DB::ReadBufferFromPocoSocket::nextImpl() ()
#12 0x000055dc6b6e676f in DB::Connection::receivePacket() ()
#13 0x000055dc6b6f69ae in DB::MultiplexedConnections::receivePacket() ()
#14 0x000055dc6b1a8396 in DB::RemoteBlockInputStream::readImpl() ()
#15 0x000055dc6b191637 in DB::IBlockInputStream::read() ()
#16 0x000055dc6b3036ac in DB::ParallelInputsProcessor<DB::UnionBlockInputStream::Handler>::loop(unsigned long) ()
#17 0x000055dc6b303d95 in DB::ParallelInputsProcessor<DB::UnionBlockInputStream::Handler>::thread(std::shared_ptr<DB::ThreadGroupStatus>, unsigned long) ()
#18 0x000055dc6b30470d in _ZZN20ThreadFromGlobalPoolC4IMN2DB23ParallelInputsProcessorINS1_21UnionBlockInputStream7HandlerEEEFvSt10shared_ptrINS1_17ThreadGroupStatusEEmEJPS5_S8_RmEEEOT_DpOT0_ENKUlvE_clEv ()
#19 0x000055dc6809338c in ThreadPoolImpl<std::thread>::worker(std::_List_iterator<std::thread>) ()
#20 0x000055dc6dd01b60 in ?? ()
#21 0x00007f89a8b8ddd5 in start_thread () from /lib64/libpthread.so.0
#22 0x00007f89a84ab02d in clone () from /lib64/libc.so.6

and

Thread 1171 (Thread 0x7f88881ef700 (LWP 18453)):
#0  0x00007f89a8b944ed in __lll_lock_wait () from /lib64/libpthread.so.0
#1  0x00007f89a8b910e2 in pthread_rwlock_rdlock () from /lib64/libpthread.so.0
#2  0x000055dc6dd3c87f in ?? ()
#3  0x000055dc6dd35393 in unw_step ()
#4  0x000055dc6dd354b0 in unw_backtrace ()
#5  0x000055dc68058221 in StackTrace::StackTrace(ucontext_t const&) ()
#6  0x000055dc6806c91e in ?? ()
#7  <signal handler called>
#8  0x00007f89a8b91963 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#9  0x000055dc6dc8dc0c in std::condition_variable::wait(std::unique_lock<std::mutex>&) ()
#10 0x000055dc6d580c63 in Poco::Semaphore::wait() ()
#11 0x000055dc6b304da6 in DB::UnionBlockInputStream::readImpl() ()
#12 0x000055dc6b191637 in DB::IBlockInputStream::read() ()
#13 0x000055dc6b940442 in DB::Aggregator::mergeStream(std::shared_ptr<DB::IBlockInputStream> const&, DB::AggregatedDataVariants&, unsigned long) ()
#14 0x000055dc6b8d7a5a in DB::MergingAggregatedBlockInputStream::readImpl() ()
#15 0x000055dc6b191637 in DB::IBlockInputStream::read() ()
#16 0x000055dc6b8b81ef in DB::ExpressionBlockInputStream::readImpl() ()
#17 0x000055dc6b191637 in DB::IBlockInputStream::read() ()
#18 0x000055dc6b8b81ef in DB::ExpressionBlockInputStream::readImpl() ()
#19 0x000055dc6b191637 in DB::IBlockInputStream::read() ()
#20 0x000055dc6b18a15b in DB::AsynchronousBlockInputStream::calculate() ()
#21 0x000055dc6b18a520 in ?? ()
#22 0x000055dc680958de in ThreadPoolImpl<ThreadFromGlobalPool>::worker(std::_List_iterator<ThreadFromGlobalPool>) ()
#23 0x000055dc68095eee in _ZZN20ThreadFromGlobalPoolC4IZN14ThreadPoolImplIS_E12scheduleImplIvEET_St8functionIFvvEEiSt8optionalImEEUlvE1_JEEEOS4_DpOT0_ENKUlvE_clEv ()
#24 0x000055dc6809338c in ThreadPoolImpl<std::thread>::worker(std::_List_iterator<std::thread>) ()
#25 0x000055dc6dd01b60 in ?? ()
#26 0x00007f89a8b8ddd5 in start_thread () from /lib64/libpthread.so.0
#27 0x00007f89a84ab02d in clone () from /lib64/libc.so.6

Do you know what caused the deadlock? Why is ClickHouse taking locks while reading? The concurrency doesn't seem high.


Answer 1:


Most probably you hit this bug: https://github.com/ClickHouse/ClickHouse/issues/7383

It is fixed in 19.14.10 and newer.
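To check whether a given server is older than that release, the version string (for example as returned by SELECT version()) can be compared against 19.14.10. The following is a minimal Python sketch; parse_version and predates_fix are hypothetical helper names. It only reports whether a build is older than the fixed release, not whether the bug was actually present in that build.

```python
from typing import Tuple

# First release reported above to contain the fix.
FIXED_IN = (19, 14, 10)


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn a dotted version string such as '19.14.7.15' into a tuple of ints."""
    return tuple(int(part) for part in version.strip().split("."))


def predates_fix(version: str, fixed_in: Tuple[int, ...] = FIXED_IN) -> bool:
    """Return True if the version is older than the release containing the fix."""
    # Compare only as many components as the fixed release specifies.
    return parse_version(version)[: len(fixed_in)] < fixed_in


# The version from the question is older than 19.14.10.
print(predates_fix("19.14.7.15"))
```

Tuple comparison is used instead of string comparison because "19.14.7" sorts after "19.14.10" as text.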



Source: https://stackoverflow.com/questions/60698597/how-to-solve-clickhouse-deadlock
