Question
When I read about files in textbooks, it seems that some concepts I know from operating systems are repeated for files at the application level.
For example, the terms block and page are used for the logical representation of data in files (so we are not at the level of hard disk organization). But I cannot understand what the idea is here. Do we define a block size and a page size in the application and use them when accessing files, e.g. using NIO or blocking IO?
How would we normally define these sizes? Arbitrarily? Am I confused here?
UPDATE at the request of @RobinGreen
An example of what I mean is the slotted-page structure, or the list representation for variable-length records, described in Silberschatz's book Database System Concepts in the section on files.
Answer 1:
In Linux, the operating system block size is unrelated to the hard drive's reported sector size in bytes (which is itself fake these days, for BIOS compatibility reasons!).
At the application level, you can store data in fixed-size or variable-size blocks, which are unrelated to OS level blocks.
So there are many levels, all unrelated to each other!
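As one illustration of an application-level block, the slotted-page structure mentioned in the question can be sketched in a few lines of Java. This is only a minimal sketch under stated assumptions: the class name `SlottedPage`, the 4096-byte `PAGE_SIZE`, and the header layout are hypothetical choices made for the example, not the textbook's exact format. The point is that the application picks the page size itself and manages variable-length records inside it.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/**
 * Hypothetical slotted page: the header and slot directory grow from the
 * front of the page, the variable-length records grow from the back.
 */
final class SlottedPage {
    static final int PAGE_SIZE = 4096;         // assumed size, chosen by the application
    private static final int HEADER_BYTES = 4; // slot count (short) + start of record area (short)
    private static final int SLOT_BYTES = 4;   // record offset (short) + record length (short)

    private final ByteBuffer buf = ByteBuffer.allocate(PAGE_SIZE);

    SlottedPage() {
        buf.putShort(0, (short) 0);         // no slots yet
        buf.putShort(2, (short) PAGE_SIZE); // record area is empty, so it starts at the page end
    }

    int slotCount() {
        return buf.getShort(0);
    }

    /** Stores a record and returns its slot number, or -1 if it does not fit. */
    int insert(byte[] record) {
        int slots = buf.getShort(0);
        int recordAreaStart = buf.getShort(2);
        int directoryEnd = HEADER_BYTES + slots * SLOT_BYTES;
        if (recordAreaStart - directoryEnd < record.length + SLOT_BYTES) {
            return -1; // not enough free space for the record plus its slot entry
        }
        int offset = recordAreaStart - record.length;
        ByteBuffer view = buf.duplicate();  // separate position, shared contents
        view.position(offset);
        view.put(record);
        buf.putShort(directoryEnd, (short) offset);
        buf.putShort(directoryEnd + 2, (short) record.length);
        buf.putShort(0, (short) (slots + 1));
        buf.putShort(2, (short) offset);
        return slots;
    }

    /** Returns a copy of the record stored in the given slot. */
    byte[] read(int slot) {
        if (slot < 0 || slot >= slotCount()) {
            throw new IndexOutOfBoundsException("no such slot: " + slot);
        }
        int slotPos = HEADER_BYTES + slot * SLOT_BYTES;
        int offset = buf.getShort(slotPos);
        int length = buf.getShort(slotPos + 2);
        byte[] out = new byte[length];
        ByteBuffer view = buf.duplicate();
        view.position(offset);
        view.get(out);
        return out;
    }

    public static void main(String[] args) {
        SlottedPage page = new SlottedPage();
        int slot = page.insert("alice".getBytes(StandardCharsets.UTF_8));
        System.out.println(new String(page.read(slot), StandardCharsets.UTF_8));
    }
}
```

A real storage engine would typically also handle deletion and compaction, and would write each page to the file at offset `pageNumber * PAGE_SIZE`; those parts are left out here.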
Even so, it is a good idea to read and write data in chunks. Reading or writing data 1 byte at a time would involve too many round trips to the kernel, for example. But the right chunk size is an empirical question: which size is most efficient for your use case?
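To make the chunk-size point concrete, here is a small NIO sketch that counts how many `FileChannel.read` calls it takes to consume a file with a given buffer size. The names `ChunkedReader` and `countReads` are hypothetical, and the count is only a rough proxy for kernel round trips, since a read may return fewer bytes than the buffer holds.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class ChunkedReader {
    /**
     * Reads the whole file through a buffer of chunkSize bytes and returns
     * the number of read() calls that delivered data.
     */
    static long countReads(Path file, int chunkSize) throws IOException {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        ByteBuffer buf = ByteBuffer.allocate(chunkSize);
        long calls = 0;
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            while (channel.read(buf) != -1) { // -1 signals end of file
                calls++;
                buf.clear();                  // reuse the buffer for the next chunk
            }
        }
        return calls;
    }
}
```

Calling `countReads(path, 1)` on a file needs one call per byte, while a buffer of a few kilobytes needs far fewer. Timing a handful of candidate sizes on your own workload is one way to answer the empirical question.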
Page sizes are slightly different. They are defined by the CPU architecture (at least on the x86/x86_64 family) and affect paging/swapping. An application does not directly encounter paging and swapping, but it encounters the effects, in terms of lower performance.
Source: https://stackoverflow.com/questions/20017744/what-do-we-mean-with-pages-blocks-for-files-in-the-application-level