Java File IO vs Local database

孤独总比滥情好 2021-01-23 06:11

I am working on a project that involves parsing through a LARGE amount of data rapidly. Currently this data is on disk and broken down into a directory hierarchy:



        
3 Answers
  • 2021-01-23 06:30

    what is the fastest way I can selectively load entries from my filesystem from varying DataSources and Days?

    "Selectively" means filtering, so my answer is a localhost database. Generally speaking, if you filter, sort, paginate, or extract distinct records from a large number of records, it's hard to beat a localhost SQL server. You get a query optimizer (nobody does that in Java), a cache (which requires effort in Java, especially the invalidation), database indexes (I have not seen that being done in Java either), etc. It's possible to implement these things manually, but then you are writing a database in Java.

    On top of this, you gain access to higher-level SQL functions like window aggregates, so in most cases there is no need to post-process data in Java.

  • 2021-01-23 06:40

    The issue could be solved both ways, but it depends on a few factors.

    Go for file IO:

    1. if the volume is less than millions of rows
    2. if you don't do complicated queries, like Jon Skeet said
    3. if your reference for fetching a row is the folder name ("DataSource") used as the key
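A minimal sketch of the file IO route, where the folder names themselves act as the lookup key. The directory layout in the question is not shown, so the layout used here (`root/<dataSource>/<day>/<files>`) and the names `DataSourceFiles`, `filesFor`, and `linesFor` are assumptions made only for illustration:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class DataSourceFiles {
    private DataSourceFiles() {}

    /**
     * Lists the regular files stored for one data source on one day.
     * Assumed (hypothetical) layout: root/<dataSource>/<day>/<files>.
     */
    static List<Path> filesFor(Path root, String dataSource, String day) throws IOException {
        // The folder names are the key, so the lookup is just a path resolution.
        Path dir = root.resolve(dataSource).resolve(day);
        if (!Files.isDirectory(dir)) {
            return new ArrayList<>(); // nothing stored for this data source and day
        }
        try (Stream<Path> entries = Files.list(dir)) {
            return entries.filter(Files::isRegularFile)
                          .sorted()
                          .collect(Collectors.toList());
        }
    }

    /** Reads every line of every file for the given data source and day. */
    static List<String> linesFor(Path root, String dataSource, String day) throws IOException {
        List<String> lines = new ArrayList<>();
        for (Path file : filesFor(root, dataSource, day)) {
            lines.addAll(Files.readAllLines(file)); // decodes as UTF-8
        }
        return lines;
    }
}
```

This only stays simple as long as the key is the folder name. As soon as you need to filter on something inside the files, every matching file has to be read and scanned, which is where a database starts to pay off.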

    Go for a DB:

    1. if you see your program reading through millions of records
    2. if you want to do complicated selections, even of multiple rows, using a single SELECT
    3. if you know how to create a basic table structure for a DB
  • 2021-01-23 06:41

    Depending on the architecture you are using, you can implement caching in different ways. JBoss has a built-in cache (JBoss Cache), and there is also third-party open-source software that provides caching, such as Redis or EhCache, depending on your needs. Basically, a cache stores objects in memory; some are passivated or activated on demand, and when memory is exhausted they are written to a physical file on disk, from which the caching mechanism can easily activate and unmarshal them again. This reduces how often your program has to go to the database. There are other caches, but here are some that I've worked with:

    • JBoss Cache: http://www.jboss.org/jbosscache/
    • Redis: http://redis.io/
    • EhCache: http://ehcache.org/
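To illustrate the basic idea of keeping recently used objects in memory, here is a small in-process sketch built on `java.util.LinkedHashMap`. It is not the API of JBoss Cache, Redis, or EhCache, and it leaves out passivation to disk entirely; the class name `LruCache` and the `loader` function are hypothetical stand-ins for whatever expensive read (file or database) you want to avoid repeating:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

final class LruCache<K, V> {
    private final Map<K, V> entries;
    private final Function<K, V> loader; // hypothetical: the expensive read being cached

    LruCache(final int capacity, Function<K, V> loader) {
        this.loader = loader;
        // accessOrder = true keeps the least recently used entry at the head,
        // so removeEldestEntry evicts it once the capacity is exceeded.
        this.entries = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > capacity;
            }
        };
    }

    /** Returns the cached value, loading and storing it on a miss. */
    synchronized V get(K key) {
        V value = entries.get(key);
        if (value == null) {
            value = loader.apply(key); // a null result is stored but reloaded next time
            entries.put(key, value);
        }
        return value;
    }

    /** Number of entries currently held in memory. */
    synchronized int cachedCount() {
        return entries.size();
    }
}
```

A cache like this helps when the same DataSource and day are requested repeatedly. The libraries listed above add what this sketch lacks, such as expiry, overflow to disk, and sharing between processes.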