What does the operating system do when a file is opened?

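  Opening a file involves the process, the file descriptor, the file descriptor table, the open file table, the directory entry (dentry), and the inode table, together with the links between them.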

The discussion below centers on two diagrams.
[Figure 1: image not preserved]

[Figure 2: image not preserved]

Part one: how the process control block (PCB) relates to file descriptors

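  The process control block (task_struct in Linux) holds a pointer to a files_struct structure, which can be viewed as an array of pointers to file structures (struct file *fd[n]). A file descriptor is simply an index into this table (the array subscript), and each table entry is a pointer to a file structure.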
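
Because a descriptor is just an index into the per-process table, the numbers a process gets back from open() are small integers, and a freed slot gets handed out again. The sketch below is one way to observe this from user space; the function name fd_slot_is_reused is made up for illustration, and it assumes a single-threaded process on a system that has /dev/null.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Opens /dev/null twice, closes the first descriptor, then opens once more.
 * Returns 1 if the third open reuses the freed slot, 0 if not, -1 on error.
 * Assumes a single-threaded caller, so nothing else can take the slot.
 */
int fd_slot_is_reused(void)
{
    int a = open("/dev/null", O_RDONLY);
    if (a < 0)
        return -1;

    int b = open("/dev/null", O_RDONLY);
    if (b < 0) {
        close(a);
        return -1;
    }
    /* a was the lowest free index, so b must be a higher one. */
    assert(b > a);
    printf("first open -> fd %d, second open -> fd %d\n", a, b);

    close(a);                            /* frees table slot a */
    int c = open("/dev/null", O_RDONLY); /* lowest free slot is a again */
    if (c < 0) {
        close(b);
        return -1;
    }
    printf("after close(%d), open -> fd %d\n", a, c);

    int reused = (c == a);
    close(b);
    close(c);
    return reused;
}
```

Calling fd_slot_is_reused() from main should return 1, since POSIX specifies that open() returns the lowest-numbered descriptor not currently open in the process.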

The file structure: the file control block

struct file {
    union {
        struct llist_node    fu_llist;
        struct rcu_head     fu_rcuhead;
    } f_u;
    struct path        f_path;
#define f_dentry    f_path.dentry
    struct inode        *f_inode;    /* cached value */
    const struct file_operations    *f_op;

    /*
     * Protects f_ep_links, f_flags.
     * Must not be taken from IRQ context.
     */
    spinlock_t        f_lock;
    atomic_long_t        f_count;
    unsigned int         f_flags;
    fmode_t            f_mode;
    struct mutex        f_pos_lock;
    loff_t            f_pos;
    struct fown_struct    f_owner;
    const struct cred    *f_cred;
    struct file_ra_state    f_ra;

    u64            f_version;
#ifdef CONFIG_SECURITY
    void            *f_security;
#endif
    /* needed for tty driver, and maybe others */
    void            *private_data;

#ifdef CONFIG_EPOLL
    /* Used by fs/eventpoll.c to link all the hooks to this file */
    struct list_head    f_ep_links;
    struct list_head    f_tfile_llink;
#endif /* #ifdef CONFIG_EPOLL */
    struct address_space    *f_mapping;
#ifdef CONFIG_DEBUG_WRITECOUNT
    unsigned long f_mnt_write_state;
#endif
} __attribute__((aligned(4)));    /* lest something weird decides that 2 is OK */

struct file

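Main members of the file structure
1.file.f_flags (File Status Flag) and file.f_count
The file structure holds the File Status Flag (member f_flags) and the current read/write position (member f_pos). In the figure above, process 1 and process 2 both open the same file but get different file structures, so each can have its own File Status Flag and read/write position. Another important member is f_count, the reference count. System calls such as dup and fork make several file descriptors point to the same file structure. For example, if fd1 and fd2 both refer to the same file structure, its reference count is 2. close(fd1) does not free the file structure; it only drops the count to 1. A subsequent close(fd2) drops the count to 0 and frees the file structure, and only then is the file really closed.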
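
The difference between sharing a file structure and merely opening the same file can be checked from user space. The following is a minimal sketch, not a definitive test: the function name offset_sharing_demo and the temporary file path are illustrative choices. It compares a descriptor made with dup (same file structure, shared f_pos) against a second open of the same path (new file structure, separate f_pos), and then checks that the file stays open after one of the two sharing descriptors is closed.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Stores the offset seen through a dup()ed descriptor in *via_dup and the
 * offset seen through an independently opened descriptor in *via_open,
 * after seeking the original descriptor to byte 4.
 * Returns 0 on success, -1 on any failure.
 */
int offset_sharing_demo(off_t *via_dup, off_t *via_open)
{
    assert(via_dup != NULL && via_open != NULL);

    char path[] = "/tmp/offset_demo_XXXXXX";
    int rc = -1;
    int fd2 = -1, fd3 = -1;

    int fd1 = mkstemp(path);            /* creates and opens a temp file */
    if (fd1 < 0)
        return -1;
    if (write(fd1, "0123456789", 10) != 10)
        goto out;

    fd2 = dup(fd1);                     /* shares fd1's file structure */
    fd3 = open(path, O_RDONLY);         /* gets a file structure of its own */
    if (fd2 < 0 || fd3 < 0)
        goto out;

    if (lseek(fd1, 4, SEEK_SET) != 4)   /* move the shared position */
        goto out;
    *via_dup = lseek(fd2, 0, SEEK_CUR);
    *via_open = lseek(fd3, 0, SEEK_CUR);

    /* Closing fd1 only drops one reference; fd2 must still be readable. */
    close(fd1);
    fd1 = -1;
    char c;
    if (read(fd2, &c, 1) == 1 && c == '4')
        rc = 0;

out:
    if (fd1 >= 0)
        close(fd1);
    if (fd2 >= 0)
        close(fd2);
    if (fd3 >= 0)
        close(fd3);
    unlink(path);
    return rc;
}
```

On success, *via_dup is 4 because fd2 shares the position that was moved through fd1, while *via_open is 0 because fd3 has its own position.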

2.file.file_operations
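Each file structure points to a file_operations structure whose members are all function pointers to the kernel functions that implement the various file operations. For example, when a user program calls read on a file descriptor, read enters the kernel through a system call, the kernel finds the file structure the descriptor points to, follows it to the file_operations structure, and calls the kernel function that its read member points to (this is the call path from the application layer to the kernel layer). Calls such as lseek, read, write, ioctl, and open in a user program are all ultimately completed by the kernel calling the functions that the members of file_operations point to. The release member serves the user program's close request. It is called release rather than close because close does not necessarily close the file: it decrements the reference count, and the file is closed only when the count drops to 0. For regular files opened on the same file system, the steps and methods for read, write, and so on should be identical, so the same functions are called; that is why the three open files in the figure point to the same file_operations structure. If a character device file is opened, its read and write operations necessarily differ from those of a regular file, because they access a hardware device rather than disk blocks. Its file structure therefore points to a different file_operations structure (which is why some file_operations objects are defined by the kernel and others by driver authors), and the operations in it are implemented by the device's driver.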
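
The dispatch described above can be imitated in ordinary user-space C. This is a toy model only: every name starting with toy_, as well as regular_fops and zero_fops, is invented for illustration and is not kernel code. It shows how one generic entry point can serve both a "regular file" and a "device" by calling through a per-file table of function pointers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

struct toy_file;

/* Toy counterpart of file_operations: one function pointer per operation. */
struct toy_file_operations {
    ssize_t (*read)(struct toy_file *f, char *buf, size_t len);
};

/* Toy counterpart of struct file. */
struct toy_file {
    const struct toy_file_operations *f_op;
    const char *data;   /* backing bytes, used by the regular-file case */
    size_t size;
    size_t f_pos;       /* current read position */
};

/* "Regular file": copy from the backing bytes and advance f_pos. */
static ssize_t regular_read(struct toy_file *f, char *buf, size_t len)
{
    size_t left = f->size - f->f_pos;
    if (len > left)
        len = left;
    memcpy(buf, f->data + f->f_pos, len);
    f->f_pos += len;
    return (ssize_t)len;
}

/* "Device": behaves like /dev/zero and ignores the position. */
static ssize_t zero_read(struct toy_file *f, char *buf, size_t len)
{
    (void)f;
    memset(buf, 0, len);
    return (ssize_t)len;
}

const struct toy_file_operations regular_fops = { .read = regular_read };
const struct toy_file_operations zero_fops = { .read = zero_read };

/* Generic entry point: dispatch through whatever table the file carries. */
ssize_t toy_vfs_read(struct toy_file *f, char *buf, size_t len)
{
    assert(f != NULL && buf != NULL);
    if (f->f_op == NULL || f->f_op->read == NULL)
        return -1;
    return f->f_op->read(f, buf, len);
}
```

All regular files in this model can share regular_fops, just as regular files on one file system share a single file_operations object, while the device-like file carries a different table.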

Let's look at the file_operations structure

struct file_operations {
    struct module *owner;
    loff_t (*llseek) (struct file *, loff_t, int);
    ssize_t (*read) (struct file *, char __user *, size_t, loff_t *);
    ssize_t (*write) (struct file *, const char __user *, size_t, loff_t *);
    ssize_t (*aio_read) (struct kiocb *, const struct iovec *, unsigned long, loff_t);
    ssize_t (*aio_write) (struct kiocb *, const struct iovec *, unsigned long, loff_t);
    int (*iterate) (struct file *, struct dir_context *);
    unsigned int (*poll) (struct file *, struct poll_table_struct *);
    long (*unlocked_ioctl) (struct file *, unsigned int, unsigned long);
    long (*compat_ioctl) (struct file *, unsigned int, unsigned long);
    int (*mmap) (struct file *, struct vm_area_struct *);
    int (*open) (struct inode *, struct file *);
    int (*flush) (struct file *, fl_owner_t id);
    int (*release) (struct inode *, struct file *);
    int (*fsync) (struct file *, loff_t, loff_t, int datasync);
    int (*aio_fsync) (struct kiocb *, int datasync);
    int (*fasync) (int, struct file *, int);
    int (*lock) (struct file *, int, struct file_lock *);
    ssize_t (*sendpage) (struct file *, struct page *, int, size_t, loff_t *, int);
    unsigned long (*get_unmapped_area)(struct file *, unsigned long, unsigned long, unsigned long, unsigned long);
    int (*check_flags)(int);
    int (*flock) (struct file *, int, struct file_lock *);
    ssize_t (*splice_write)(struct pipe_inode_info *, struct file *, loff_t *, size_t, unsigned int);
    ssize_t (*splice_read)(struct file *, loff_t *, struct pipe_inode_info *, size_t, unsigned int);
    int (*setlease)(struct file *, long, struct file_lock **);
    long (*fallocate)(struct file *file, int mode, loff_t offset,
              loff_t len);
    int (*show_fdinfo)(struct seq_file *m, struct file *f);
};

struct file_operations

3.file.dentry
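Each file structure has a pointer to a dentry structure; "dentry" is short for directory entry. The argument we pass to functions such as open and stat is a path, for example /home/akaedu/a, and the kernel must find the file's inode from that path. To reduce disk reads, the kernel caches the directory tree in memory as the dentry cache. Each node is a dentry structure, so a lookup only has to walk the dentries for each path component: from the root directory / to the home directory, then to the akaedu directory, then to the file a. The dentry cache holds only recently accessed directory entries; if the entry being looked up is not in the cache, it has to be read from disk into memory.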
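
The component-by-component walk can be sketched with a small in-memory tree. This is a simplified model under stated assumptions: struct toy_dentry, toy_lookup_child, and toy_path_walk are hypothetical names, children live in a fixed-size array instead of kernel lists and hash chains, and a cache miss simply returns NULL instead of reading the directory from disk.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_CHILDREN 4

/* Toy stand-in for a cached directory entry. */
struct toy_dentry {
    const char *name;
    struct toy_dentry *parent;
    struct toy_dentry *children[TOY_MAX_CHILDREN];
    size_t nchildren;
};

/* Finds the child of dir whose name equals the first len bytes of name. */
static struct toy_dentry *toy_lookup_child(struct toy_dentry *dir,
                                           const char *name, size_t len)
{
    for (size_t i = 0; i < dir->nchildren; i++) {
        const char *cn = dir->children[i]->name;
        if (strlen(cn) == len && memcmp(cn, name, len) == 0)
            return dir->children[i];
    }
    return NULL; /* a real kernel would go to disk at this point */
}

/* Walks an absolute path one component at a time, starting at root. */
struct toy_dentry *toy_path_walk(struct toy_dentry *root, const char *path)
{
    assert(root != NULL && path != NULL);

    struct toy_dentry *cur = root;
    const char *p = path;
    while (cur != NULL && *p != '\0') {
        while (*p == '/')               /* skip separators */
            p++;
        size_t len = strcspn(p, "/");   /* length of the next component */
        if (len == 0)
            break;                      /* trailing slash or end of path */
        cur = toy_lookup_child(cur, p, len);
        p += len;
    }
    return cur;
}
```

With a tree built as / -> home -> akaedu -> a, toy_path_walk(&root, "/home/akaedu/a") returns the node for a, and a path with a missing component returns NULL.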

4.dentry.inode
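Each dentry structure has a pointer to an inode structure, which holds the information read from the on-disk inode, such as the owner, file size, file type, and permission bits. (One might expect the file structure to hold this information, but it actually lives in the inode that the file structure's f_inode member points to.) In the example in the figure above there are two dentries, /home/akaedu/a and /home/akaedu/b, that point to the same inode, which means the two files are hard links to each other. Each inode structure has a pointer to an inode_operations structure, which is also a set of function pointers to kernel functions that perform file and directory operations. Unlike file_operations, the functions in inode_operations do not operate on the contents of one file; they affect the layout of files and directories, for example creating and deleting files and directories or following symbolic links. The inode structures belonging to one file system can all point to the same inode_operations structure.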
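
The "two names, one inode" relationship is visible from user space through stat(). The sketch below assumes /tmp is writable and on a file system that supports hard links; the function name hard_link_shares_inode and the file names are illustrative.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Creates a temporary file and a hard link to it, then compares stat()
 * results for the two names.
 * Returns 1 if both names report the same inode with link count 2,
 * 0 if they do not, and -1 on error.
 */
int hard_link_shares_inode(void)
{
    char a[] = "/tmp/hlink_demo_XXXXXX";
    char b[64];
    struct stat sa, sb;
    int result = -1;

    int fd = mkstemp(a);
    if (fd < 0)
        return -1;
    close(fd);

    int n = snprintf(b, sizeof b, "%s.lnk", a);
    assert(n > 0 && (size_t)n < sizeof b);

    if (link(a, b) == 0) {              /* second name for the same inode */
        if (stat(a, &sa) == 0 && stat(b, &sb) == 0) {
            printf("%s: ino=%llu nlink=%lu\n", a,
                   (unsigned long long)sa.st_ino, (unsigned long)sa.st_nlink);
            printf("%s: ino=%llu nlink=%lu\n", b,
                   (unsigned long long)sb.st_ino, (unsigned long)sb.st_nlink);
            result = (sa.st_dev == sb.st_dev && sa.st_ino == sb.st_ino &&
                      sa.st_nlink == 2 && sb.st_nlink == 2);
        }
        unlink(b);
    }
    unlink(a);
    return result;
}
```

Both names are expected to report the same inode number and a link count of 2, which matches the two-dentries-one-inode picture described above.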

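5.inode.super_block
The inode structure has a pointer to a super_block structure, which holds the information read from the superblock of the disk partition, such as the file system type and block size. The s_root member of super_block is a pointer to the dentry of this file system's root directory; in the example in the figure above, the partition is mounted at /home.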
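
Some of this per-file-system information can be queried from user space with the POSIX statvfs() call. The following is a small sketch; the function name fs_block_size is made up, and which internal structure the kernel fills the values from is an implementation detail that this example does not depend on.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/statvfs.h>

/*
 * Returns the block size reported for the file system that contains path,
 * or 0 if statvfs() fails.
 */
unsigned long fs_block_size(const char *path)
{
    assert(path != NULL);

    struct statvfs sv;
    if (statvfs(path, &sv) != 0)
        return 0;

    printf("%s: block size=%lu, total blocks=%llu, max name length=%lu\n",
           path, (unsigned long)sv.f_bsize,
           (unsigned long long)sv.f_blocks, (unsigned long)sv.f_namemax);
    return (unsigned long)sv.f_bsize;
}
```

Calling fs_block_size("/") prints the values for the root file system; the numbers depend on the machine, so no particular output is assumed here.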

Now let's look at the dentry structure

struct dentry {
    atomic_t d_count;               /* usage count (2 if the same entry is opened twice) */
    unsigned int d_flags;           /* dentry flags */
    spinlock_t d_lock;              /* per-dentry lock */
    int d_mounted;                  /* mount point indicator for this dentry */
    struct inode *d_inode;          /* inode associated with this dentry */

    /*
     * The next three fields are touched by __d_lookup.  Place them here
     * so they all fit in a cache line.
     */
    struct hlist_node d_hash;       /* hash list linkage */
    struct dentry *d_parent;        /* parent dentry */
    struct qstr d_name;             /* entry name, hashed for fast lookup */

    struct list_head d_lru;         /* LRU list of unused dentries */
    /*
     * d_child and d_rcu can share memory
     */
    union {
        struct list_head d_child;   /* child of parent list */
        struct rcu_head d_rcu;
    } d_u;
    struct list_head d_subdirs;     /* list of this dentry's children */
    struct list_head d_alias;       /* inode alias list */
    unsigned long d_time;           /* revalidation time */
    const struct dentry_operations *d_op;   /* dentry operations */
    struct super_block *d_sb;       /* superblock (root of the dentry tree) */
    void *d_fsdata;                 /* file-system-specific data */

    unsigned char d_iname[DNAME_INLINE_LEN_MIN];    /* short names stored inline */
};

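1>The inode's i_dentry points to its dentries; the dentry's d_alias and d_inode point back to the inode; and the dentry's d_sb points back to the superblock object.
2>Unlike the VFS inode and superblock objects, the dentry object has no corresponding on-disk data structure. A dentry is therefore never actually stored on disk, and so it has no dirty flag.
3>Dentry states (in use, unused, and negative)
a.The states are distinguished by d_count together with d_inode. A positive d_count means the dentry is in use. d_count=0 with d_inode still pointing to its inode means the dentry is unused: it still holds valid information, but nobody currently references it. A negative dentry is one whose d_inode pointer is NULL: the inode associated with it no longer exists (the on-disk inode may have been deleted). Such a dentry is still kept in the dcache so that later lookups of the same file name can complete quickly, and it is among the first to be freed when memory is reclaimed.
4>d_subdirs: if the current dentry is a directory, all of its child entries form a list, and this field is the head of that list;
d_child: the dentry uses this field to join its parent directory's d_subdirs list. The next and prev pointers in this field point to the neighboring sibling entries in the parent directory;
d_alias: one inode may correspond to several dentries, all of which form a list whose head is i_dentry in the inode structure. The dentry sits on the i_dentry list through this field, and its prev and next pointers point to the neighboring dentries (if any) that share the same inode.
3.Difference between dentry and inode:
　An inode (think of an ext2 inode) corresponds to a concrete object on the physical disk, while a dentry is an in-memory entity whose d_inode member points to the corresponding inode. In other words, one inode can be linked to several dentries at run time; those dentries are chained on the inode's i_dentry list through d_alias, while d_count is the usage count of each individual dentry.
4.dentry and dentry_cache
dentry_cache, or dcache for short, is the directory entry cache that Linux uses to make handling dentry objects more efficient. It consists mainly of two data structures:
1>The hash table dentry_hashtable: every dentry object in the dcache is linked into the corresponding hash chain through its d_hash field.
2>The list of unused dentry objects, dentry_unused: every dentry in the dcache that is in the unused or negative state is linked into dentry_unused through its d_lru field. This list is also called the LRU list.
The dcache is the master of the inode cache (icache): the dentry objects in the dcache control the lifetime of the inode objects in the icache. As long as a dentry exists in the dcache in a non-negative state, its inode always exists too, because the inode's reference count i_count stays greater than 0. When a dentry in the dcache is released, iput() is called on the corresponding inode.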
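
The three dentry states from the notes above can be written down as a small classification function. This is a user-space sketch with invented types (state_inode, state_dentry, and the DENTRY_* constants are not kernel names), and it assumes the simplified rule that a NULL d_inode means negative and that d_count separates in-use from unused otherwise.

```c
#include <assert.h>
#include <stddef.h>

enum dentry_state { DENTRY_IN_USE, DENTRY_UNUSED, DENTRY_NEGATIVE };

/* Minimal stand-ins carrying only the two fields that decide the state. */
struct state_inode {
    unsigned long i_ino;
};

struct state_dentry {
    int d_count;                  /* number of current users */
    struct state_inode *d_inode;  /* NULL when no inode backs the name */
};

/* Classifies a dentry according to the simplified rule described above. */
enum dentry_state classify_dentry(const struct state_dentry *d)
{
    assert(d != NULL);

    if (d->d_inode == NULL)
        return DENTRY_NEGATIVE;   /* the name is cached but has no inode */
    if (d->d_count > 0)
        return DENTRY_IN_USE;     /* valid inode and at least one user */
    return DENTRY_UNUSED;         /* valid inode, nobody is using it */
}
```

In this model both DENTRY_UNUSED and DENTRY_NEGATIVE entries would be the ones sitting on the LRU list, waiting either to be reused or to be reclaimed.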

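The file, dentry, inode, and super_block structures make up the core concepts of the VFS. The ext2 file system also has inodes and a superblock in its on-disk layout, so it maps easily onto the VFS concepts. Other file system formats come from non-UNIX systems (for example Windows FAT32 and NTFS) and may have no notion of an inode or superblock, so to be mountable on Linux their drivers have to improvise these structures. On Linux, FAT32 and NTFS partitions typically show permission bits that look wrong, with every file appearing as rwxrwxrwx, because these formats have no UNIX-style inode or permission bits and the values are synthesized by the driver.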

dentry.operations

struct dentry_operations {
    int (*d_revalidate)(struct dentry *, struct nameidata *);
    int (*d_hash) (struct dentry *, struct qstr *);
    int (*d_compare) (struct dentry *, struct qstr *, struct qstr *);
    int (*d_delete)(struct dentry *);
    void (*d_release)(struct dentry *);
    void (*d_iput)(struct dentry *, struct inode *);
    char *(*d_dname)(struct dentry *, char *, int);
};

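1.int d_revalidate(struct dentry *dentry, struct nameidata *nd): checks whether the dentry object is still valid. The VFS calls it when it is about to use a dentry from the dcache.

2.int d_hash(struct dentry *dentry, struct qstr *name): generates a hash value for the entry. The VFS calls it when a dentry is about to be added to the hash table.

3.int d_compare(struct dentry *dentry, struct qstr *name1, struct qstr *name2): compares the two file names name1 and name2. The dcache_lock must be held when calling it.

4.int d_delete(struct dentry *dentry): called by the VFS when d_count reaches 0. The dcache_lock must be held when calling it.

5.void d_release(struct dentry *dentry): called by the VFS when the dentry object is about to be freed.

6.void d_iput(struct dentry *dentry, struct inode *inode): called by the VFS when a dentry loses its inode.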
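
One reason these hooks exist is that file systems differ in how they treat names. As a rough user-space illustration (all toy_ names are hypothetical, and the two-argument compare signature is a simplification of the kernel one listed above), the sketch below lets a "file system" supply its own d_compare for case-insensitive names and falls back to a byte-wise comparison when no hook is provided. As with the kernel hook, a return value of 0 means the names match.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Toy version of struct qstr: a name plus its length. */
struct toy_qstr {
    const char *name;
    size_t len;
};

struct toy_dentry_operations {
    int (*d_compare)(const struct toy_qstr *a, const struct toy_qstr *b);
};

/* Default rule: names match only if they are byte-for-byte identical. */
int toy_default_compare(const struct toy_qstr *a, const struct toy_qstr *b)
{
    if (a->len != b->len)
        return 1;
    return memcmp(a->name, b->name, a->len) != 0;
}

/* Case-insensitive rule, as a FAT-like file system might want. */
int toy_ci_compare(const struct toy_qstr *a, const struct toy_qstr *b)
{
    if (a->len != b->len)
        return 1;
    for (size_t i = 0; i < a->len; i++) {
        if (tolower((unsigned char)a->name[i]) !=
            tolower((unsigned char)b->name[i]))
            return 1;
    }
    return 0;
}

/* Returns 1 if the names match under the file system's rule, else 0. */
int toy_names_match(const struct toy_dentry_operations *d_op,
                    const struct toy_qstr *a, const struct toy_qstr *b)
{
    assert(a != NULL && b != NULL);

    int (*cmp)(const struct toy_qstr *, const struct toy_qstr *) =
        (d_op != NULL && d_op->d_compare != NULL) ? d_op->d_compare
                                                  : toy_default_compare;
    return cmp(a, b) == 0;
}
```

With no operations table, "README" and "readme" are different names; with a table whose d_compare is toy_ci_compare, they are treated as the same name.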

inode

struct inode {
    unsigned long           i_ino;      /* inode number */
    atomic_t                i_count;    /* reference count */
    umode_t                 i_mode;     /* file type and permission bits */
    unsigned int            i_nlink;    /* hard link count */
    uid_t                   i_uid;      /* owner user ID */
    gid_t                   i_gid;      /* owner group ID */
    dev_t                   i_rdev;
    loff_t                  i_size;     /* file size */
    struct timespec         i_atime;
    unsigned long           i_blocks;
    unsigned short          i_bytes;
    unsigned char           i_sock;     /* socket flag */
    struct inode_operations *i_op;      /* inode operations, e.g. deleting a file */
    struct file_operations  *i_fop;     /* former ->i_op->default_file_ops */
    struct super_block      *i_sb;      /* superblock; identifies the file system the inode belongs to */
    ......
};
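
Several of these fields have direct user-space counterparts in the result of stat(): st_ino, st_mode, st_nlink, st_uid, st_gid, and st_size. A minimal sketch that prints them for a given path follows; the function name print_inode_info is an illustrative choice.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>

/*
 * Fills *out with the stat() result for path and prints the fields that
 * mirror the inode members discussed above.
 * Returns 0 on success, -1 if stat() fails.
 */
int print_inode_info(const char *path, struct stat *out)
{
    assert(path != NULL && out != NULL);

    if (stat(path, out) != 0)
        return -1;

    printf("%s: ino=%llu mode=%o nlink=%lu uid=%u gid=%u size=%lld\n",
           path,
           (unsigned long long)out->st_ino,
           (unsigned int)out->st_mode,
           (unsigned long)out->st_nlink,
           (unsigned int)out->st_uid,
           (unsigned int)out->st_gid,
           (long long)out->st_size);
    return 0;
}
```

Running it on "/" reports a directory in st_mode; the remaining values depend on the machine.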

The dentry and inode described above are kernel objects: they belong to the virtual file system, not to any on-disk file system.

Source: CSDN

Author: Flower-Cui

Link: https://blog.csdn.net/weixin_43408374/article/details/104770835
