Periodic latency spikes from UDP socket caused by periodic sendto()/recvfrom() delay, C++ for Linux RT-PREEMPT system

Question

I have set up two Raspberry Pis to use UDP sockets, one as the client and one as the server. The kernel has been patched with RT-PREEMPT (4.9.43-rt30+). The client echoes each message back to the server so that the Round-Trip Latency (RTL) can be calculated. At the moment the server sends at 10Hz using 2 threads: one for sending the messages to the client and one for receiving the messages from the client. The threads are set up with a scheduling priority of 95 using Round-Robin scheduling.

The server constructs a message containing the time the message was sent and the time elapsed since messages started being sent. This message is sent from the server to the client and then immediately returned to the server. Upon receiving the message back from the client, the server calculates the Round-Trip Latency and stores it in a .txt file, to be used for plotting with Python.
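
The message layout and RTL calculation described above can be sketched as follows. This is a minimal illustration, not the code from the question: the helper names (`nowNs`, `packMessage`, `unpackSentNs`, `roundTripSeconds`) are hypothetical, and it assumes `std::chrono::steady_clock` rather than `high_resolution_clock`, since on some standard libraries the latter is an alias for the adjustable system clock. Both timestamps are taken on the server, so the two Pis do not need synchronised clocks.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <cstring>

using Clock = std::chrono::steady_clock;   // assumption: monotonic clock

constexpr std::size_t kMessageSize = 16;   // same payload size as in the question

// Current time in nanoseconds since the clock's epoch, as a fixed-width integer.
inline std::int64_t nowNs() {
    return std::chrono::duration_cast<std::chrono::nanoseconds>(
               Clock::now().time_since_epoch()).count();
}

// Bytes 0-7: time the message was sent. Bytes 8-15: time elapsed since the first send.
inline void packMessage(unsigned char* buf, std::int64_t sentNs, std::int64_t elapsedNs) {
    assert(buf != nullptr);
    std::memcpy(buf, &sentNs, sizeof sentNs);
    std::memcpy(buf + 8, &elapsedNs, sizeof elapsedNs);
}

// Recover the send timestamp from an echoed message.
inline std::int64_t unpackSentNs(const unsigned char* buf) {
    assert(buf != nullptr);
    std::int64_t sentNs = 0;
    std::memcpy(&sentNs, buf, sizeof sentNs);
    return sentNs;
}

// Round-trip latency in seconds: receive time minus the echoed send time.
inline double roundTripSeconds(const unsigned char* echoed, std::int64_t receivedNs) {
    return static_cast<double>(receivedNs - unpackSentNs(echoed)) * 1e-9;
}
```

In the send thread this would be `packMessage(buf, nowNs(), nowNs() - beginNs)` before `sendto()`, and in the receive thread `roundTripSeconds(buf, nowNs())` right after `recvfrom()` returns.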

The problem is that when analysing the graphs I noticed a periodic spike in the RTL. It is visible in the top graph of the image (RTL latency and sendto() + recvfrom() times); in the legend I have used RTT instead of RTL. These spikes are directly related to the spikes shown in the server-side sendto() and recvfrom() calls. Any suggestions on how to remove these spikes? My application is very reliant on consistency.

Things I have tried and noticed:

The size of the message being sent has no effect. I have tried larger messages (1024 bytes) and smaller messages (0 bytes) and the periodic delay does not change. This suggests to me that it is not a buffer issue as there is nothing filling up?
The frequency at which the messages are sent does play a big role: if the frequency is doubled then the latency spikes occur twice as often. This then suggests that something is filling up and while it empties the sendto()/recvfrom() functions experience a delay?
Changes to the buffer size with setsockopt() have no effect.
I have tried quite a few other settings (MSG_DONTWAIT, etc.) to no avail.
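
When testing buffer sizes as in the list above, it can help to confirm that a size change actually took effect by reading the value back. The sketch below is an illustration only; `requestBufferSize` is a hypothetical helper and not the `setBuff()` function from the question. On Linux the kernel typically reports double the requested value and caps requests at the `net.core.rmem_max` / `net.core.wmem_max` limits.

```cpp
#include <cassert>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

// Request a buffer size for `option` (SO_SNDBUF or SO_RCVBUF) on socket `fd`,
// then return the size the kernel reports, or -1 if either call fails.
inline int requestBufferSize(int fd, int option, int bytes) {
    assert(option == SO_SNDBUF || option == SO_RCVBUF);
    if (setsockopt(fd, SOL_SOCKET, option, &bytes, sizeof bytes) < 0) {
        return -1;
    }
    int actual = 0;
    socklen_t len = sizeof actual;
    if (getsockopt(fd, SOL_SOCKET, option, &actual, &len) < 0) {
        return -1;
    }
    return actual;
}
```

Calling `requestBufferSize(master, SO_RCVBUF, recv_buff)` and printing the result shows whether the requested size was applied or silently capped.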

I am by no means an expert in sockets/C++ programming/Linux so any suggestions will be greatly appreciated as I am out of ideas. Below is the code used to create the socket and start the server threads for sending and receiving the messages. Below that is the code for sending the messages from the server. For now my concern is centred around the delay caused by the sendto() function; if you need the rest of the code or anything else, please let me know. Thanks.

    thread_priority = priority;  
    recv_buff = recv_buff_len;
    std::cout << del << " Second start-up delay..." << std::endl;
    sleep(del);
    std::cout << "Delay complete..." << std::endl;

    master = socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP);

    // master socket creation
    if(master < 0){// socket() returns -1 on failure; 0 is a valid descriptor
        perror("Could not create the socket");
        exit(EXIT_FAILURE);
    }
    std::cout << "Master Socket Created..." << std::endl;
    std::cout << "Adjusting send and receive buffers..." << std::endl;
    setBuff();

    // Server address and port creation
    serv.sin_family = AF_INET;// Address family
    serv.sin_addr.s_addr = INADDR_ANY;// Server IP address, INADDR_ANY will work on the server side only
    serv.sin_port = htons(portNum);
    server_len = sizeof(serv);

    // Binding of master socket to specified address and port
    if (bind(master, (struct sockaddr *) &serv, sizeof (serv)) < 0) {
    //Attempt to bind master socket to address
        perror("Could not bind socket...");
        exit(EXIT_FAILURE);
    }

    // Show what address and port is being used
    char IP[INET_ADDRSTRLEN];                                 
    inet_ntop(AF_INET, &(serv.sin_addr), IP, INET_ADDRSTRLEN);// INADDR_ANY allows all network interfaces so it will always show 0.0.0.0
    // ntohs() converts the stored port from network to host byte order for printing
    std::cout << "Listening on port: " << ntohs(serv.sin_port) << ", and address: " << IP << "..." << std::endl;

    // Options specific to the server RPi
    if(server){
        std::cout << "Run Time: " << duration << " seconds." << std::endl;
        client.sin_family = AF_INET;// Address family
        inet_pton(AF_INET, clientIP.c_str(), &(client.sin_addr));
        client.sin_port = htons(portNum);
        client_len = sizeof(client);
        serv_send = std::thread(&SocketServer::serverSend, this);  
        serv_send.detach();// The server send thread just runs continuously
        serv_receive = std::thread(&SocketServer::serverReceive, this);
        serv_receive.join();        
    }else{// Specific to client RPi
        SocketServer::clientReceiveSend();
    }

And the code for sending the messages:

    // Setup the priority of this thread
    param.sched_priority = thread_priority;
    int result = sched_setscheduler(getpid(), SCHED_RR, &param);
    if(result){
        perror ("The following error occurred while setting serverSend() priority");
    }
    int ched = sched_getscheduler(getpid());
    printf("serverSend() priority result %i : Scheduler priority id %i \n", result, ched);

    std::ofstream Out;
    std::ofstream Out1;

    Out.open(file_name);
    Out << duration << std::endl; 
    Out << frequency << std::endl;
    Out << thread_priority << std::endl;
    Out.close(); 

    Out1.open("Server Side Send.txt");
    packets_sent = 0;

    Tbegin = std::chrono::high_resolution_clock::now();    

    // Send messages for a specified time period at a specified frequency
    while(!stop){ 
        // Setup the message to be sent
        Tstart = std::chrono::high_resolution_clock::now();
        TDEL = std::chrono::duration_cast< std::chrono::duration<double>>(Tstart - Tbegin); // Total time passed before sending message
        memcpy(&message[0], &Tstart, sizeof(Tstart));// Send the time the message was sent with the message
        memcpy(&message[8], &TDEL, sizeof(TDEL));// Send the time that had passed since Tbegin

        // Send the message to the client
        T1 = std::chrono::high_resolution_clock::now();
        sendto(master, &message, 16, MSG_DONTWAIT, (struct sockaddr *)&client, client_len);  
        T2 = std::chrono::high_resolution_clock::now();
        T3 = std::chrono::duration_cast< std::chrono::duration<double>>(T2-T1);
        Out1 << T3.count() << std::endl;

        packets_sent++;

        // Pause so that the required message send frequency is met

        while(true){
            Tend = std::chrono::high_resolution_clock::now();
            Tdel = std::chrono::duration_cast< std::chrono::duration<double>>(Tend - Tstart);
            if(Tdel.count() > 1.0/frequency){// 1.0 avoids integer division if frequency is an integer type
                break;
            }            
        }

        TDEL = std::chrono::duration_cast< std::chrono::duration<double>>(Tend - Tbegin);


        // Check to see if the program has run as long as required
        if(TDEL.count() > duration){
            stop = true;
            break;
        }        
    } 

    std::cout << "Exiting serverSend() thread..." << std::endl;   

    // Save extra results to the end of the last file    
    Out.open(file_name, std::ios_base::app);
    Out << packets_sent << "\t\t " << packets_returned << std::endl;    
    Out.close();   
    Out1.close();
    std::cout << "^C to exit..." << std::endl;
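
The pacing loop in the code above busy-waits, so the send thread stays runnable for the whole period. A possible alternative, which is not what the original code does and is offered only as a sketch, is to sleep until an absolute deadline with `clock_nanosleep()` so that other threads can run in between sends. The helper names `addNanoseconds` and `sleepUntil` are hypothetical.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdint>
#include <time.h>

// Advance a timespec by `ns` nanoseconds (ns >= 0), keeping tv_nsec in [0, 1e9).
inline timespec addNanoseconds(timespec t, std::int64_t ns) {
    assert(ns >= 0);
    const std::int64_t total = static_cast<std::int64_t>(t.tv_nsec) + ns;
    t.tv_sec += total / 1000000000LL;
    t.tv_nsec = total % 1000000000LL;
    return t;
}

// Sleep until an absolute CLOCK_MONOTONIC deadline, retrying if a signal
// interrupts the sleep. Returns 0 on success or an error number.
inline int sleepUntil(const timespec& deadline) {
    int rc;
    do {
        rc = clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &deadline, nullptr);
    } while (rc == EINTR);
    return rc;
}
```

In a send loop, the deadline would be read once with `clock_gettime(CLOCK_MONOTONIC, &deadline)` and then advanced by one period (1e9 / frequency nanoseconds) each iteration before calling `sleepUntil(deadline)`. Advancing an absolute deadline, rather than sleeping for a relative interval, keeps timing errors from accumulating.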

Answer 1:

I have sorted out the problem. It was not the ARP tables as even with the ARP functionality disabled there was a periodic spike. With the ARP functionality disabled there would only be a single spike in latency as opposed to a series of latency spikes.

It turned out to be a problem with the threads I was using as there were two threads on a CPU only capable of handling one thread at a time. The one thread that was sending the information was being affected by the second thread that was receiving information. I changed the thread priorities around a lot (send priority higher than receive, receive higher than send and send equal to receive) to no avail. I have now bought a Raspberry Pi that has 4 cores and I have set the send thread to run on core 2 while the receive thread runs on core 3, preventing the threads from interfering with each other. This has not only removed the latency spikes but also reduced the mean latency of my setup.
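
The answer does not show how the threads were assigned to cores. One way to do it on Linux, given here as a sketch under that assumption, is `sched_setaffinity()` with a pid of 0, which applies to the calling thread. The helper names `pinCurrentThreadToCore` and `firstAllowedCore` are hypothetical; g++ defines `_GNU_SOURCE` by default, which these calls and the `CPU_*` macros require.

```cpp
#include <cassert>
#include <sched.h>

// Restrict the calling thread to a single core (pid 0 = calling thread).
// Returns 0 on success, -1 on failure with errno set.
inline int pinCurrentThreadToCore(int core) {
    assert(core >= 0 && core < CPU_SETSIZE);
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(core, &set);
    return sched_setaffinity(0, sizeof(cpu_set_t), &set);
}

// Lowest-numbered core the calling thread is currently allowed to run on,
// or -1 if the affinity mask cannot be read.
inline int firstAllowedCore() {
    cpu_set_t set;
    CPU_ZERO(&set);
    if (sched_getaffinity(0, sizeof(cpu_set_t), &set) != 0) {
        return -1;
    }
    for (int core = 0; core < CPU_SETSIZE; ++core) {
        if (CPU_ISSET(core, &set)) {
            return core;
        }
    }
    return -1;
}
```

To mirror the setup described above, `pinCurrentThreadToCore(2)` would be called at the start of `serverSend()` and `pinCurrentThreadToCore(3)` at the start of `serverReceive()`, before the scheduling priority is set.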

Source: https://stackoverflow.com/questions/47618619/periodic-latency-spikes-from-udp-socket-caused-by-periodic-sendto-recvfrom-d

Tags

c++

Linux

sockets

latency

sendto