One ZeroMQ socket per thread or per call?

名媛妹妹 2021-01-06 16:32

As we all know, a ZeroMQ socket shall not be shared among application threads; context_t instances, however, can be.

I have a multi-threaded-application a

4 Answers
  •  离开以前
    2021-01-06 16:37

    Nota Bene: this answer was posted before the O/P was changed from 20k TPS to 140k TPS on the ipc:// transport-class.

    Q: Is there a more correct ZeroMQ-way of doing this?

    A: It is not easy to say what "this" is, nor what the parameters of the "correctness"-metric are.

    Given that,
    the points below will be more general
    and applicable to system design-phase reasoning:


    Avoiding Resource-Utilisation Overheads

    This point is a double-edged sword. Some overheads are always associated with both the setup and the disposal ( yes, even the closing and dismantling ) of an infrastructure element: a REQ-AccessPoint to the REQ/REP-pattern, together with its associated socket-based transport-class, imposes remarkable overheads on both the REQ-side host and on the REP-side.

    It was fair that you noted you took some care to quantitatively test this, at a level of some 20k TPS, and did not observe any adverse effects of such an approach. What was not clear is whether any other scenario was also tested in-vivo on the same SUT ( System-under-Test ), so as to provide a baseline for comparing each respective design ( and to allow determining the difference in overheads per se ).

    While a well designed framework hides this part of the system-internal behaviour from the user-maintained code, that does not mean the processing is cheap, let alone free of charge.
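
    To make that overhead argument concrete, here is a minimal sketch ( assuming cppzmq / zmq.hpp, an already .bind()-ed REP side and an illustrative tcp:// endpoint -- none of these names come from the O/P's code ) contrasting a per-thread REQ socket, created once and reused, with a per-call REQ socket, which pays the full setup / disposal cost on every single request:

        // A sketch only -- assumes cppzmq ( zmq.hpp ) and an already .bind()-ed REP side
        // listening on "tcp://127.0.0.1:5555"; names and endpoint are illustrative.
        #include <zmq.hpp>
        #include <thread>

        static zmq::context_t ctx( 1 );            // one shared Context()-instance per process

        // per-thread variant: the REQ-AccessPoint is set up once and reused for every call,
        // so the infrastructure setup / disposal overheads are paid only once per worker thread
        void worker_reuse( int n_requests )
        {
            zmq::socket_t req( ctx, zmq::socket_type::req );
            req.connect( "tcp://127.0.0.1:5555" );
            for ( int i = 0; i < n_requests; ++i )
            {
                req.send( zmq::str_buffer( "ping" ), zmq::send_flags::none );
                zmq::message_t reply;
                static_cast<void>( req.recv( reply, zmq::recv_flags::none ) );
            }
        }                                          // the socket is closed once, on scope exit

        // per-call variant: every request pays the full AccessPoint setup cost ...
        void worker_per_call( int n_requests )
        {
            for ( int i = 0; i < n_requests; ++i )
            {
                zmq::socket_t req( ctx, zmq::socket_type::req );
                req.connect( "tcp://127.0.0.1:5555" );
                req.send( zmq::str_buffer( "ping" ), zmq::send_flags::none );
                zmq::message_t reply;
                static_cast<void>( req.recv( reply, zmq::recv_flags::none ) );
            }                                      // ... and the disposal cost, too
        }

        int main()
        {
            std::thread t1( worker_reuse,    1000 );   // overheads paid once
            std::thread t2( worker_per_call, 1000 );   // overheads paid 1000x
            t1.join();
            t2.join();
        }

    The per-thread variant also keeps the "one socket per thread" rule intact: each worker owns its socket exclusively and only the Context()-instance is shared.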

    It is obvious that there are jobs performed under the hood in the Context()-instance thread(s) ( ... yes, the plural is correct here, as some high-performance code may benefit from using more than one I/O-thread per Context() instance and may positively influence the workload distribution by an explicitly defined affinity-mapping between a pattern-socket and its respective I/O-thread handler, so as to somehow balance, if not deterministically level, the expected I/O-throughput, incl. all the associated overheads ).
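
    For illustration, a hedged sketch of such an explicit mapping ( again assuming cppzmq; the two-I/O-thread split, the PUSH socket and the endpoint are illustrative assumptions only ):

        // A sketch only: a Context() with two I/O-threads and an explicitly defined
        // affinity-mapping, pinning this socket's traffic onto I/O-thread #1
        // ( bit 0 of the ZMQ_AFFINITY bitmask ).
        #include <zmq.hpp>
        #include <cstdint>

        int main()
        {
            zmq::context_t ctx( 2 );                     // 2 I/O-threads for this Context()-instance

            zmq::socket_t data_pump( ctx, zmq::socket_type::push );
            std::uint64_t affinity_mask = 1;             // bitmask: only I/O-thread #1 serves this socket
            data_pump.setsockopt( ZMQ_AFFINITY,          // newer cppzmq also offers .set( zmq::sockopt::..., ... )
                                  &affinity_mask,
                                  sizeof( affinity_mask ) );
            data_pump.connect( "tcp://127.0.0.1:5556" ); // endpoint is illustrative

            data_pump.send( zmq::str_buffer( "payload" ), zmq::send_flags::none );
            return 0;
        }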

    If still in doubt, one shall always remember that an imperative-programming-style function or an object-oriented method is principally a victim of the external caller, who decides at which moment & how often such an "en-slaved" code-execution unit is called on duty and executed. The function/method has no natural means of back-throttling ( suppressing ) the frequency of its own invocations by the external caller(s), and robust designs simply cannot rely on the optimistic assumption that such calls will not come more often than XYZ-k TPS ( the 20k cited above may be fine for in-vitro testing, but a real deployment may shift that by several orders of magnitude, be it artificially during testing, or not, during some peak-hour, a user-/system-panic, a technical error or a hardware failure; we have all heard many times about a NIC-card flooding L1/L2 traffic beyond all imaginable limits et al, and we simply do not and cannot know when or where it will happen next ).

    Avoiding Risk of Blocking

    The mentioned REQ/REP Scalable Formal Communication Pattern is known for its risk of falling into an externally unresolvable distributed internal dead-lock. This is always a risk to avoid. Mitigation strategies may depend on the actual use-case's value at risk ( a need to certify a medical instrument, fintech use-cases, control-loop use-cases, academia research paper code or a private hobby toy ); a defensive sketch follows after Fig. 1 below.

    Ref.: REQ/REP Deadlocks >>> https://stackoverflow.com/a/38163015/3666197

    Fig. 1: Why it is wrong to use a naive REQ/REP
    All cases in which [App1]in_WaitToRecvSTATE_W2R + [App2]in_WaitToRecvSTATE_W2R
    are principally an unsalvageable distributed mutual deadlock of REQ-FSA/REP-FSA ( each of the two Finite-State-Automata waits for "the other" to move ) and they will never reach the "next" in_WaitToSendSTATE_W2S internal state.

                   XTRN_RISK_OF_FSA_DEADLOCKED ~ {  NETWORK_LoS
                                             :   || NETWORK_LoM
                                             :   || SIG_KILL( App2 )
                                             :   || ...
                                             :      }
                                             :
    [App1]      ![ZeroMQ]                    :    [ZeroMQ]              ![App2] 
    code-control! code-control               :    [code-control         ! code-control
    +===========!=======================+    :    +=====================!===========+
    |           ! ZMQ                   |    :    |              ZMQ    !           |
    |           ! REQ-FSA               |    :    |              REP-FSA!           |
    |           !+------+BUF> .connect()|    v    |.bind()  +BUF>------+!           |
    |           !|W2S   |___|>tcp:>---------[*]-----(tcp:)--|___|W2R   |!           |
    |     .send()>-o--->|___|           |         |         |___|-o---->.recv()     |
    | ___/      !| ^  | |___|           |         |         |___| ^  | |!      \___ |
    | REQ       !| |  v |___|           |         |         |___| |  v |!       REP |
    | \___.recv()<----o-|___|           |         |         |___|<---o-<.send()___/ |
    |           !|   W2R|___|           |         |         |___|   W2S|!           |
    |           !+------
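
    A common mitigation of that blocking risk is never to call a blind .recv(), but to .poll() with a finite timeout and, if no reply arrives, to dispose of and re-create the REQ socket ( in the spirit of the ZeroMQ Guide's "Lazy Pirate" pattern ). A hedged sketch, assuming cppzmq; the timeout value, the endpoint parameter and the ask_once() helper are illustrative assumptions only:

        // A sketch only: a REQ-side that refuses to block forever.
        // If no reply arrives within RECV_TIMEOUT_MS, the REQ socket is closed and rebuilt,
        // because a REQ-FSA stuck in its wait-to-receive state cannot be re-used otherwise.
        #include <zmq.hpp>
        #include <chrono>
        #include <optional>
        #include <string>

        constexpr auto RECV_TIMEOUT_MS = std::chrono::milliseconds( 2500 );   // illustrative value

        std::optional<std::string> ask_once( zmq::context_t& ctx,
                                             const std::string& endpoint,
                                             const std::string& request )
        {
            zmq::socket_t req( ctx, zmq::socket_type::req );
            int linger = 0;
            req.setsockopt( ZMQ_LINGER, &linger, sizeof( linger ) );   // do not hang on close
            req.connect( endpoint );
            req.send( zmq::buffer( request.data(), request.size() ), zmq::send_flags::none );

            zmq::pollitem_t items[] = { { req.handle(), 0, ZMQ_POLLIN, 0 } };
            zmq::poll( items, 1, RECV_TIMEOUT_MS );                    // bounded wait, no blind .recv()

            if ( items[0].revents & ZMQ_POLLIN )
            {
                zmq::message_t reply;
                static_cast<void>( req.recv( reply, zmq::recv_flags::none ) );
                return reply.to_string();
            }
            return std::nullopt;          // the caller decides: retry, log, fail over, ...
        }                                 // the possibly dead-locked REQ socket is disposed of here

    This keeps the calling thread from blocking indefinitely; it does not, by itself, remove the distributed REQ-FSA/REP-FSA deadlock risk described above.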
