Non-file FileSystems?

ぐ巨炮叔叔 提交于 2019-12-03 09:14:37

Emails are often stored in folders. But ever since I have migrated to Gmail, I have become accustomed to classifying my emails with tags.

I often wondered if we could manage a whole file-system that way: instead of storing files in folders, you could tag files with the tags you like. A file identifier would not look like this:

/home/john/personal/contacts.txt

but more like this:

contacts[john,personal]

Well... just food for thought (maybe this already exists!)

You can for example have dedicated solutions, like Oracle Raw Partitions. Other databases support similar thing. In these cases the filesystem provides unnecessary overhead and can be ommited - DB software will take care of organising the structure.

The problem seems very application dependent and files/folders seem to be a reasonable compromise for many applications (and is easy for human beings to comprehend).

Mainframes used to just give programmers a number of 'devices' to use. The device corresponsed to a drive or a partition thereof and the programmer was responsible for the organisation of all data on it. Of course they quickly built up libraries to help with that.

The only OS I think think of that does use the common hierachical arrangement of flat files (like UNIX) is PICK. That used a sort of relational database as the filesystem.

Microsoft had originally planned to introduce a new file-system for windows vista (WinFS - windows future storage). The idea was to store everything in a relational database (SQL Server). As far as I know, this project was never (or not yet?) finished.

There's more information about it on wikipedia.

I knew a guy who wrote his doctorate about a hard disk that comes with its own file system. It was based on an extension of SCSI commands that allowed the usual open, read, write and close commands to be sent to the disk directly, bypassing the file system drivers of the OS. I think the conclusion was that it is inflexible, and does not add much efficiency.

Anyway, this disk based file system still had a folder like structure I believe, so I don't think it really counts for you ;-)

Well, there's always Pick, where the OS and file system were an integrated database.

Traditional file systems are optimized for fast file access if you know the name of the file you want (including its path). Directories are a way of grouping files together so that they're easier to find if you know properties of the file but not its actual name.

Traditional file systems are not good at finding files if you know very little about them, however they are robust enough that one can add a layer on top of them to aid in retrieving files based on content or meta-information such as tags. That's what indexers are for.

The bottom line is we need a way to store persistently the bytes that the CPU needs to execute. So we have traditional file systems which are very good at organizing sequential sets of bytes. We also need to store persistently the bytes of files that aren't executed directly, but are used by things that do execute. Why create a new system for the same fundamental thing?

What more should a file system do other than store and retrieve bytes?

I'll echo the other responses. If I could pick a filesystem type, I personally would rather see a hybrid approach: a flat database of subtrees, where each subtree is considered as a cohesive unit, but if you consider the subtrees themselves as discrete units they would have no hierarchy, but instead could have metadata + be queryable on that metadata.

The reason for files is that humans like to attach names to "things" they have to use. Otherwise, it becomes hard to talk or think about or even distinguish them.

When we have too many things on a heap, we like to separate the heap. We sort it by some means and we like to build hierarchies where you can navigate arbitrarily sized amounts of things.

Hence directories and files just map our natural way of working with real objects. Since you can put anything in a file. On Unix, even hardware is mapped as "device nodes" into the file system which are special files which you can read/write to send commands to the hardware.

I think the metaphor is so powerful, it will stay.

I spent a while trying to come up with an automagically versioning file system that would maintain versions (and version history) of any specific file and/or directory structure.

The idea was that all of the standard access command (e.g. dir, read, etc.) would have an optional date/time parameter that could be passed to access the file system as it looked at that point in time.

I got pretty far with it, but had to abandon it when I had to actually go out and earn some money. It's been on the back-burner since then.

If you take a look at the start-up times for operating systems, it should be clear that improvements in accessing disks can be made. I'm not sure if the changes should be in the file system or rather in the OS start-up code.

Personally, I'm really sorry WinFS didn't fly. I loved the concept.. From Wikipedia (http://en.wikipedia.org/wiki/WinFS) :

WinFS includes a relational database for storage of information, and allows any type of information to be stored in it, provided there is a well defined schema for the type. Individual data items could then be related together by relationships, which are either inferred by the system based on certain attributes or explicitly stated by the user. As the data has a well defined schema, any application can reuse the data; and using the relationships, related data can be effectively organized as well as retrieved. Because the system knows the structure and intent of the information, it can be used to make complex queries that enable advanced searching through the data and aggregating various data items by exploiting the relationships between them.

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!