One ZeroMQ socket per thread or per call?

名媛妹妹 2021-01-06 16:32

As we all know, a ZeroMQ socket shall not be shared among application threads.
context_t instances, however, can be.

I have a multi-threaded application …

4 Answers
  • 2021-01-06 16:37

    Nota Bene: this answer was posted before the O/P was changed from 20k TPS to 140k TPS on the ipc:// transport-class.

    Q: Is there a more correct ZeroMQ-way of doing this?

    A: It is not easy to say what "this" is, nor what the parameters of the "correctness" metric are.

    Given that, the points below are more general and applicable to system design-phase reasoning:


    Resource-Utilisation Overheads Avoidance

    This point is a double-edged sword. Some overheads are always associated with both the setup and the disposal ( yes, even the closing and dismantling ) of an infrastructure element. The REQ-AccessPoint to the REQ/REP-pattern and the associated socket-based transport-class impose remarkable overheads on both the REQ-side host and also the REP-side.

    It was fair that you noted having taken some care to quantitatively test this, at a level of some 20k TPS, without observing any adverse effects of such an approach. What was not clear is whether any other scenario was also tested in-vivo on the same SUT ( System-under-Test ), so as to provide a baseline for comparing the respective designs ( and to allow the difference in overheads per se to be determined ).

    While a well-designed framework hides this part of the system's internal behaviour from the user-maintained code, that does not mean it is all cheap, much less free-of-charge, processing.

    It is obvious that there is work performed under the hood in the Context()-instance thread(s) ( ... yes, plural is correct here, as some high-performance code may benefit from using more than one I/O-thread per Context() instance, and may positively influence the workload distribution by an explicitly defined affinity-mapping between a pattern-socket and its respective I/O-thread handler, so as to somehow balance, if not deterministically level, the expected I/O-throughput, incl. all the associated overheads ).
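
    For illustration only, a minimal sketch of the I/O-thread count and affinity-mapping mentioned above ( the thread count, bitmask and endpoint are assumptions, not recommendations from this answer ):

    // Hedged sketch: a Context() instance with two I/O-threads and a socket whose
    // I/O work is explicitly mapped onto the second I/O-thread via the ZMQ_AFFINITY bitmask.
    #include <zmq.hpp>
    #include <cstdint>

    int main()
    {
        zmq::context_t context( 2 );                  // 2 I/O-threads instead of the default 1

        zmq::socket_t  socket( context, ZMQ_REQ );
        uint64_t affinity = 0x2;                      // bitmask: route this socket's I/O onto I/O-thread #2 ( bit 1 )
        socket.setsockopt( ZMQ_AFFINITY, &affinity, sizeof( affinity ) );

        socket.connect( "ipc://some-endpoint" );      // illustrative endpoint
        // ... .send() / .recv() as usual; this socket's I/O is now handled by the second I/O-thread
        return 0;
    }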

    If still in doubt, one shall always remember that an imperative-programming-style function or an object-oriented method is principally a victim of the external caller, who decides at which moment & how often such an "en-slaved" code-execution unit is called on duty and executed. The function/method does not have any natural means of back-throttling ( suppressing ) the frequency of its own invocations from the external caller(s), and robust designs simply cannot rely on the optimistic assumption that such calls do not come more often than XYZ-k TPS ( the 20k cited above may be fine for in-vitro testing, but a real deployment may shift that by several orders of magnitude, be it artificially during testing, or not, during some peak-hour or user-(system-)panic, or due to some technical error or hardware failure; we have all heard many times about a NIC-card flooding L1/L2 traffic beyond all imaginable limits et al. We just do not and cannot know when / where it will happen next time again ).

    Avoiding Risk of Blocking

    The mentioned REQ/REP Scalable Formal Communication Pattern is known for its risk of falling into an externally unresolvable distributed internal deadlock. This is always a risk to avoid. Mitigation strategies may depend on the actual use-case's value at risk ( a need to certify a medical instrument, fintech use-cases, control-loop use-cases, academic research-paper code, or a private hobby toy ); a minimal poll-with-timeout mitigation is sketched after Fig.1 below.

    Ref.: REQ/REP Deadlocks >>> https://stackoverflow.com/a/38163015/3666197

    Fig.1: Why it is wrong to use a naive REQ/REP
    All cases where [App1]in_WaitToRecvSTATE_W2R + [App2]in_WaitToRecvSTATE_W2R
    are principally an unsalvageable distributed mutual deadlock of REQ-FSA/REP-FSA ( each of the two Finite-State-Automata waits for "the other" to move ) and will never reach the "next" in_WaitToSendSTATE_W2S internal state.

                   XTRN_RISK_OF_FSA_DEADLOCKED ~ {  NETWORK_LoS
                                             :   || NETWORK_LoM
                                             :   || SIG_KILL( App2 )
                                             :   || ...
                                             :      }
                                             :
    [App1]      ![ZeroMQ]                    :    [ZeroMQ]              ![App2] 
    code-control! code-control               :    [code-control         ! code-control
    +===========!=======================+    :    +=====================!===========+
    |           ! ZMQ                   |    :    |              ZMQ    !           |
    |           ! REQ-FSA               |    :    |              REP-FSA!           |
    |           !+------+BUF> .connect()|    v    |.bind()  +BUF>------+!           |
    |           !|W2S   |___|>tcp:>---------[*]-----(tcp:)--|___|W2R   |!           |
    |     .send()>-o--->|___|           |         |         |___|-o---->.recv()     |
    | ___/      !| ^  | |___|           |         |         |___| ^  | |!      \___ |
    | REQ       !| |  v |___|           |         |         |___| |  v |!       REP |
    | \___.recv()<----o-|___|           |         |         |___|<---o-<.send()___/ |
    |           !|   W2R|___|           |         |         |___|   W2S|!           |
    |           !+------<BUF+           |         |         <BUF+------+!           |
    |           !                       |         |                     !           |
    |           ! ZMQ                   |         |   ZMQ               !           |
    |           ! REQ-FSA               |         |   REP-FSA           !           |
    ~~~~~~~~~~~~~ DEADLOCKED in W2R ~~~~~~~~ * ~~~~~~ DEADLOCKED in W2R ~~~~~~~~~~~~~
    |           ! /\/\/\/\/\/\/\/\/\/\/\|         |/\/\/\/\/\/\/\/\/\/\/!           |
    |           ! \/\/\/\/\/\/\/\/\/\/\/|         |\/\/\/\/\/\/\/\/\/\/\!           |
    +===========!=======================+         +=====================!===========+
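
    A minimal sketch of one common mitigation follows: a REQ-side receive guarded by a poll-with-timeout, discarding the socket if no reply arrives so the caller can retry with a fresh one. The endpoint, timeout and function name are illustrative assumptions, not part of the original post:

    // Hedged sketch: never block forever inside the REQ-FSA's W2R state;
    // on timeout the socket is discarded, so a fresh REQ can be used for a retry.
    #include <zmq.hpp>
    #include <cstring>
    #include <string>

    bool requestWithTimeout( zmq::context_t &ctx,
                             const std::string &endpoint,    // e.g. "tcp://localhost:5555" ( illustrative )
                             const std::string &payload,
                             std::string       &reply,
                             long               timeout_ms = 500 )
    {
        zmq::socket_t socket( ctx, ZMQ_REQ );
        int linger = 0;                                      // do not block on close
        socket.setsockopt( ZMQ_LINGER, &linger, sizeof( linger ) );
        socket.connect( endpoint );

        zmq::message_t request( payload.size() );
        memcpy( request.data(), payload.data(), payload.size() );
        socket.send( request );

        zmq::pollitem_t items[] = { { static_cast<void*>( socket ), 0, ZMQ_POLLIN, 0 } };
        zmq::poll( items, 1, timeout_ms );                   // wait for a reply, but only up to timeout_ms

        if ( items[0].revents & ZMQ_POLLIN )
        {
            zmq::message_t msg;
            socket.recv( &msg );
            reply.assign( static_cast<char*>( msg.data() ), msg.size() );
            return true;
        }
        return false;                                        // timed out: socket goes out of scope, caller may retry
    }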
    

  • 2021-01-06 16:55

    An alternative could be to have one dedicated thread for the ZeroMQ communication, fed by some FIFO queue (which must be guarded with a mutex or similar, of course...). This dedicated thread should sleep as long as the queue is empty and wake up (being signalled appropriately) whenever this state changes.

    Depending on general needs, whenever the response for some outgoing message is received, the dedicated thread could simply call some callback (on some dedicated per-thread object); be aware that you are in a different thread context then, so you might need some means of synchronisation to prevent race conditions.

    Alternatively, the sending threads could just wait for the response, being signalled by the ZeroMQ thread when the response is received (well, this actually is one of those means to prevent race conditions...).
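
    A minimal sketch of such a dedicated ZeroMQ thread with a mutex-guarded FIFO queue might look like this (class name, endpoint and details are illustrative, not taken from the question; it also assumes the REP peer always replies, otherwise a poll-with-timeout as in the answer above would be needed):

    // Hedged sketch: all ZeroMQ traffic goes through one dedicated thread; other
    // threads only push strings into a mutex-guarded queue and signal the worker.
    #include <zmq.hpp>
    #include <condition_variable>
    #include <cstring>
    #include <mutex>
    #include <queue>
    #include <string>
    #include <thread>

    class ZmqWorker
    {
        zmq::context_t          ctx_{ 1 };
        std::queue<std::string> fifo_;
        std::mutex              mtx_;
        std::condition_variable cv_;
        bool                    stop_ = false;
        std::thread             worker_{ &ZmqWorker::run, this };   // started last, after the members above

        void run()
        {
            // the socket lives and is used only inside this thread
            zmq::socket_t socket( ctx_, ZMQ_REQ );
            socket.connect( "ipc://some-endpoint" );        // illustrative endpoint

            for ( ;; )
            {
                std::string payload;
                {
                    std::unique_lock<std::mutex> lock( mtx_ );
                    cv_.wait( lock, [this]{ return stop_ || !fifo_.empty(); } );   // sleep while the queue is empty
                    if ( stop_ && fifo_.empty() ) return;
                    payload = std::move( fifo_.front() );
                    fifo_.pop();
                }
                zmq::message_t request( payload.size() );
                memcpy( request.data(), payload.data(), payload.size() );
                socket.send( request );

                zmq::message_t reply;
                socket.recv( &reply );                      // the response could be routed to a callback here
            }
        }

    public:
        void post( std::string s )                          // called from any application thread
        {
            {
                std::lock_guard<std::mutex> lock( mtx_ );
                fifo_.push( std::move( s ) );
            }
            cv_.notify_one();                               // wake the dedicated thread
        }

        ~ZmqWorker()
        {
            {
                std::lock_guard<std::mutex> lock( mtx_ );
                stop_ = true;
            }
            cv_.notify_one();
            worker_.join();
        }
    };

    Application threads then only ever call post() and never touch the socket, so the one-socket-one-thread rule is kept by construction.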

  • 2021-01-06 17:00

    Here is my (current) solution. In C++11 you can place an object in thread_local storage. Keeping the socket_t instance static and thread_local inside a function gives me the functionality I was looking for:

    #include <zmq.hpp>
    #include <string>

    // hands out one REQ socket per calling thread, lazily connected on first use
    class socketPool
    {
        std::string endpoint_;

    public:
        socketPool(const std::string &ep) : endpoint_(ep) {}

        zmq::socket_t & operator()()
        {
            // one socket per thread, created on that thread's first call;
            // globalContext() is assumed to return the shared zmq::context_t
            thread_local static zmq::socket_t socket(
                    globalContext(),
                    ZMQ_REQ);
            thread_local static bool connected = false;

            if (!connected) {
                socket.connect(endpoint_);   // connect only once per thread
                connected = true;            // set the flag only after a successful connect
            }

            return socket;
        }
    };

    // creating a pool for each endpoint
    socketPool httpReqPool("ipc://http-concentrator");
    

    In my sendMessage() function, instead of creating and connecting a socket each time, I simply do:

    bool sendMessage(std::string s)
    {
        zmq::socket_t &socket = httpReqPool();
    
        // the rest as above
    }
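
    For completeness, a hedged guess at what the elided part might look like (the original question body is truncated, so the reply handling below is only an illustration, not the O/P's actual code):

    // hypothetical round-trip through the pooled, thread_local REQ socket
    #include <cstring>

    bool sendMessage(std::string s)
    {
        zmq::socket_t &socket = httpReqPool();   // per-thread socket, connected once

        zmq::message_t request(s.size());
        memcpy(request.data(), s.data(), s.size());
        if (!socket.send(request))
            return false;

        zmq::message_t reply;
        return socket.recv(&reply);              // REQ/REP: the reply must be read before the next send
    }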
    

    Regarding performance, well, it's 7 times faster on my machine. (140k REQ/REP per second).

  • 2021-01-06 17:00

    I think one difference is performance.

    With the above code you need to create a socket, establish the connection, send the message and close the socket 20k times, which is time-consuming from my perspective. You can run some performance-tool analysis to check how much time is spent in the sendMessage() function; a minimal timing sketch is shown below.

    An alternative approach is to create one request socket for each thread and send data through the socket belonging to that thread. A ZeroMQ socket does not support being used from multiple threads; doing so leads to errors such as assertion failures (in debug mode) or crashes.
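
    For illustration only, a minimal wall-clock timing sketch around sendMessage() (the message content and iteration count are assumptions):

    // Hedged sketch: measure how long 20k calls to sendMessage() take,
    // so the per-call socket setup/teardown cost becomes visible.
    #include <chrono>
    #include <iostream>
    #include <string>

    void timeSendMessage()
    {
        const int N = 20000;                                 // assumed iteration count
        auto t0 = std::chrono::steady_clock::now();

        for ( int i = 0; i < N; ++i )
            sendMessage( "ping" );                           // the function under test

        auto t1 = std::chrono::steady_clock::now();
        auto us = std::chrono::duration_cast<std::chrono::microseconds>( t1 - t0 ).count();
        std::cout << N << " calls took " << us << " us ("
                  << ( us / static_cast<double>( N ) ) << " us/call)\n";
    }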
