Changing offset from child processes

Submitted by 喜夏-厌秋 on 2019-12-12 04:31:39

Question


Let's say that I have a parent process that creates some number of child processes, all of which read from the same file.

  1. When one process reads from the file descriptor, does the offset change for all of its sibling processes as well?

  2. If so, will each process read a unique line, or, without synchronization in the application, will each process read the same lines as its siblings?

    /* Create the pipe before forking so that parent and child share it. */
    if (pipe(fd) == -1)
        exit(EXIT_FAILURE);

    id = fork();

    if (id < 0)
        exit(EXIT_FAILURE);

    switch (id) {
    case 0:
        /* Child process */
        readFromFile(filename);
        exit(0);
    default:
        /* Parent process doing something... */
        break;
    }
    

Answer 1:


On a POSIX system, file descriptors inherited by a child process through a fork call refer to the same open file descriptions, held in a system-wide table, as the parent's descriptors. Here's a relevant quotation from the Linux manual page for open(2):

The term open file description is the one used by POSIX to refer to the entries in the system-wide table of open files... When a file descriptor is duplicated (using dup(2) or similar), the duplicate refers to the same open file description as the original file descriptor, and the two file descriptors consequently share the file offset and file status flags. Such sharing can also occur between processes: a child process created via fork(2) inherits duplicates of its parent's file descriptors, and those duplicates refer to the same open file descriptions.

This means that the parent and child share the same file offset: a read in one process advances the offset seen by every other process that shares the open file description. If the processes read in parallel without calling lseek between reads, no two of them will read the same data.

You can see this in action in the following test program, which prints the first 20 characters of the file given on the command line. (If the position information weren't shared, it would print the first 10 characters twice.)

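    #include <stdlib.h>
    #include <unistd.h>
    #include <fcntl.h>
    #include <sys/types.h>
    #include <sys/stat.h>

    char buffer[256];

    int
    main(int argc, char ** argv)
    {
        if (argc < 2)
            return EXIT_FAILURE;

        /* Open before forking so both processes share one open file description. */
        int fd = open(argv[1], O_RDONLY);
        if (fd == -1)
            return EXIT_FAILURE;

        fork();  /* parent and child both continue from here */

        /* Each process reads up to 10 bytes; the shared offset advances each time. */
        ssize_t n = read(fd, buffer, 10);
        if (n > 0)
            write(1, buffer, n);
        return 0;
    }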
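To check the shared offset directly rather than inferring it from the output, here is a minimal sketch of a variation on the program above. The child reads some bytes, and the parent, which reads nothing itself, then asks the kernel for the current offset with lseek(fd, 0, SEEK_CUR). The helper name `offset_after_child_read` is made up for this illustration.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the original answer): opens `path`, forks,
 * lets the child read up to `nbytes` bytes, and returns the offset that the
 * parent observes afterwards. Returns -1 on error.
 */
off_t offset_after_child_read(const char *path, size_t nbytes)
{
    char buf[256];
    if (nbytes > sizeof buf)
        nbytes = sizeof buf;

    int fd = open(path, O_RDONLY);
    if (fd == -1)
        return -1;

    pid_t pid = fork();
    if (pid == -1) {
        close(fd);
        return -1;
    }
    if (pid == 0) {
        /* Child: read through the inherited descriptor, then leave at once. */
        ssize_t n = read(fd, buf, nbytes);
        _exit(n < 0 ? 1 : 0);
    }

    /* Parent: wait for the child, then query the offset without reading. */
    if (waitpid(pid, NULL, 0) == -1) {
        close(fd);
        return -1;
    }
    off_t pos = lseek(fd, 0, SEEK_CUR);
    close(fd);
    return pos;
}
```

Called on a file of at least 10 bytes, `offset_after_child_read(path, 10)` should return 10, because the child's read moved the one offset that both processes share.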

However (and this is a huge "however"), this applies only to the low-level system call interface for reading files: open(2), read(2), etc. If you are using a higher-level buffered interface, like fgets and the other functions in stdio.h, things get complicated. A forked child inherits copies of file descriptors that refer to a single, shared structure of file information in the kernel, but it also inherits its own separate copy of the user-space buffering state used by stdio.h calls. That state includes the buffer itself and a position within it, and neither is synchronized between processes.
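The stdio effect can be sketched as follows. This is an illustration under assumptions, not a definitive test: it assumes the file is small enough that the first fgets pulls all of it into the stdio buffer (buffer sizes are implementation-dependent), and the helper name `next_line_in_both` and the constant `LINE_CAP` are invented for the example. After the fork, parent and child each read "the next line" from their own copy of the buffer, so both can get the same line.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define LINE_CAP 128  /* hypothetical maximum line length for this sketch */

/*
 * Hypothetical helper: reads one line in the parent (which fills the stdio
 * buffer), forks, then reads the next line in BOTH processes. The child's
 * line comes back over a pipe. Returns 0 on success, -1 on error.
 */
int next_line_in_both(FILE *fp, char parent_line[LINE_CAP],
                      char child_line[LINE_CAP])
{
    char first[LINE_CAP];
    int pipefd[2];

    /* stdio reads a whole block into user space here, not just one line. */
    if (fgets(first, sizeof first, fp) == NULL)
        return -1;
    if (pipe(pipefd) == -1)
        return -1;

    pid_t pid = fork();
    if (pid == -1) {
        close(pipefd[0]);
        close(pipefd[1]);
        return -1;
    }
    if (pid == 0) {
        /* Child: works on its own private copy of the FILE buffer. */
        char line[LINE_CAP];
        close(pipefd[0]);
        if (fgets(line, sizeof line, fp) != NULL &&
            write(pipefd[1], line, strlen(line)) < 0)
            _exit(1);
        _exit(0);  /* _exit: do not flush or rewind the inherited streams */
    }

    close(pipefd[1]);
    ssize_t n = read(pipefd[0], child_line, LINE_CAP - 1);
    close(pipefd[0]);
    waitpid(pid, NULL, 0);
    if (n < 0)
        return -1;
    child_line[n] = '\0';

    /* Parent: its buffer copy knows nothing about the child's fgets. */
    if (fgets(parent_line, LINE_CAP, fp) == NULL)
        parent_line[0] = '\0';
    return 0;
}
```

For a short file whose lines are "first", "second" and "third", both `parent_line` and `child_line` would be expected to hold "second\n" when the whole file fits in the buffer. With a larger file the results depend on where the buffer boundary falls, which is why mixing fork with stdio reads is hard to reason about.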




Answer 2:


and so, is it possible that each process will read a unique line

As K. A. Buhr says, the processes will read different parts of the file, as a read from one will update the position on the other.

But if you're reading lines, you're in for trouble.

Unless you know the lengths of the lines in advance (i.e. they have a fixed length), you'll likely read partial lines, leaving the other process to possibly read the other part. To work around that, you'd need to either read one character at a time, or seek back to a line boundary after reading. Both of those will be subject to race conditions: the other process might read between your reads, or between your read and the seek.
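As a rough sketch of the "one character at a time" workaround mentioned above (the function name `read_line_bytewise` is hypothetical), the reader below stops exactly at the newline, so the shared offset never moves past the line boundary. It does nothing about the race: a sibling can still read between two of the one-byte reads, so some form of synchronization would still be needed on top of it.

```c
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: reads one line from `fd` one byte at a time.
 * Stores at most cap-1 bytes plus a terminating NUL in `buf`.
 * Returns the number of bytes stored, 0 at end of file, or -1 on error.
 *
 * Not safe against sibling processes on its own: another process sharing
 * the offset may take bytes from the middle of the line between reads.
 */
ssize_t read_line_bytewise(int fd, char *buf, size_t cap)
{
    size_t len = 0;

    if (cap == 0)
        return -1;

    while (len + 1 < cap) {
        char c;
        ssize_t n = read(fd, &c, 1);
        if (n < 0)
            return -1;
        if (n == 0)        /* end of file */
            break;
        buf[len++] = c;
        if (c == '\n')     /* stop exactly at the line boundary */
            break;
    }
    buf[len] = '\0';
    return (ssize_t)len;
}
```

The cost is one system call per byte, which is slow for large files. The alternative from the text, reading a block and seeking back to the newline, uses fewer calls but widens the window in which another process can read.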



Source: https://stackoverflow.com/questions/43708287/changing-offset-from-child-processes
