Read file at a certain rate in Java

梦谈多话 2021-01-05 10:44

Is there an article/algorithm on how I can read a long file at a certain rate?

Say I do not want to exceed 10 KB/sec while issuing reads.

6 Answers
  • 2021-01-05 11:21
    • while !EOF
      • store System.currentTimeMillis() + 1000 (1 sec) in a long variable
      • read a 10K buffer
      • check whether the stored time has passed
        • if it hasn't, Thread.sleep() for (stored time - current time) milliseconds

    Creating a ThrottledInputStream that wraps another InputStream, as suggested in other answers, would be a nice solution.
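
    The loop above could be sketched in Java roughly as follows. The class name ChunkedThrottledReader, the method readThrottled, and its parameters are made up for illustration; the sketch only counts bytes where a real caller would process each chunk.

```java
import java.io.IOException;
import java.io.InputStream;

class ChunkedThrottledReader {
    /**
     * Reads the stream to EOF in chunks of at most chunkSize bytes, starting
     * at most one chunk per windowMillis. Returns the total number of bytes read.
     */
    static long readThrottled(InputStream in, int chunkSize, long windowMillis)
            throws IOException, InterruptedException {
        byte[] buffer = new byte[chunkSize];
        long total = 0;
        while (true) {
            // Earliest time at which the next chunk may be read.
            long deadline = System.currentTimeMillis() + windowMillis;
            int n = in.read(buffer);
            if (n == -1) {
                return total; // EOF
            }
            total += n; // a real caller would process buffer[0..n) here
            long remaining = deadline - System.currentTimeMillis();
            if (remaining > 0) {
                Thread.sleep(remaining); // wait out the rest of the window
            }
        }
    }
}
```

    For the limit in the question, this would be called with a chunk size of 10 * 1024 and a window of 1000 ms.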

  • 2021-01-05 11:29

    You can use a RateLimiter (for example Guava's com.google.common.util.concurrent.RateLimiter) and write your own InputStream whose read methods acquire one permit per byte. An example can be seen below:

    import com.google.common.util.concurrent.RateLimiter; // assumes Guava's RateLimiter
    import java.io.IOException;
    import java.io.InputStream;

    public class InputStreamFlow extends InputStream {
        private final InputStream inputStream;
        private final RateLimiter maxBytesPerSecond; // one permit = one byte
    
        public InputStreamFlow(InputStream inputStream, RateLimiter limiter) {
            this.inputStream = inputStream;
            this.maxBytesPerSecond = limiter;
        }
    
        @Override
        public int read() throws IOException {
            maxBytesPerSecond.acquire(1);
            return inputStream.read();
        }
    
        @Override
        public int read(byte[] b) throws IOException {
            return read(b, 0, b.length);
        }
    
        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            if (len == 0) {
                return 0; // acquire() rejects a non-positive permit count
            }
            // Permits are taken for the requested length, which may be more than is actually read.
            maxBytesPerSecond.acquire(len);
            return inputStream.read(b, off, len);
        }
    }
    

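    If you want to limit the flow to 1 MB/s, you can wrap the input stream like this (assuming Guava's RateLimiter.create(double permitsPerSecond), with one permit per byte; Guava has no ONE_MB constant, so the rate is spelled out):

    final RateLimiter limiter = RateLimiter.create(1024 * 1024); // bytes per second
    final InputStreamFlow inputStreamFlow = new InputStreamFlow(originalInputStream, limiter);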
    
  • 2021-01-05 11:30

    A simple solution is to create a ThrottledInputStream.

    This should be used like this:

    final InputStream slowIS = new ThrottledInputStream(
            new BufferedInputStream(new FileInputStream("c:\\file.txt"), 8000), 300);
    

    300 is the rate limit in kilobytes per second. 8000 is the buffer size in bytes for the BufferedInputStream.

    As written, System.currentTimeMillis() is called once for every byte read, which causes some overhead. The class should of course be generalized by implementing read(byte b[], int off, int len), which would spare you a ton of those calls. It should also be possible to store the number of bytes that can safely be read without calling System.currentTimeMillis() again.

    Be sure to put a BufferedInputStream in between, otherwise the FileInputStream will be polled in single bytes rather than in blocks. This reduces the CPU load from 10% to almost 0. The trade-off is that you risk exceeding the data rate by up to the number of bytes in one block.

    import java.io.InputStream;
    import java.io.IOException;
    
    public class ThrottledInputStream extends InputStream {
        private final InputStream rawStream;
        private long totalBytesRead;
        private long startTimeMillis;
    
        private static final int BYTES_PER_KILOBYTE = 1024;
        private static final int MILLIS_PER_SECOND = 1000;
        private final int ratePerMillis; // allowed bytes per millisecond
    
        public ThrottledInputStream(InputStream rawStream, int kBytesPersecond) {
            this.rawStream = rawStream;
            // Integer division; clamped to 1 so the division in read() can never be by zero.
            ratePerMillis = Math.max(1, kBytesPersecond * BYTES_PER_KILOBYTE / MILLIS_PER_SECOND);
        }
    
        @Override
        public int read() throws IOException {
            if (startTimeMillis == 0) {
                startTimeMillis = System.currentTimeMillis();
            }
            long now = System.currentTimeMillis();
            long interval = now - startTimeMillis;
            // See if we are too fast.
            if (interval * ratePerMillis < totalBytesRead + 1) { // +1 because we are reading 1 byte
                try {
                    // Time at which (totalBytesRead + 1) bytes are allowed, minus the time already elapsed.
                    final long sleepTime = (totalBytesRead + 1) / ratePerMillis - interval;
                    Thread.sleep(Math.max(1, sleepTime));
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // keep the interrupt visible to the caller
                }
            }
            totalBytesRead += 1;
            return rawStream.read();
        }
    }
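
    The generalization to read(byte b[], int off, int len) mentioned above might look something like the sketch below, which checks the clock once per block instead of once per byte. The class name BlockThrottledInputStream and the bytes-per-second constructor parameter are assumptions made for this example, and it uses System.nanoTime() rather than System.currentTimeMillis() because it is monotonic.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InterruptedIOException;

class BlockThrottledInputStream extends InputStream {
    private final InputStream rawStream;
    private final long bytesPerSecond;
    private long totalBytesRead;
    private long startNanos;
    private boolean started;

    BlockThrottledInputStream(InputStream rawStream, long bytesPerSecond) {
        if (bytesPerSecond <= 0) {
            throw new IllegalArgumentException("bytesPerSecond must be positive");
        }
        this.rawStream = rawStream;
        this.bytesPerSecond = bytesPerSecond;
    }

    @Override
    public int read() throws IOException {
        byte[] one = new byte[1];
        return read(one, 0, 1) == -1 ? -1 : (one[0] & 0xFF);
    }

    @Override
    public int read(byte[] b, int off, int len) throws IOException {
        if (!started) {
            started = true;
            startNanos = System.nanoTime(); // the clock starts on the first read
        }
        int n = rawStream.read(b, off, len);
        if (n > 0) {
            totalBytesRead += n;
            sleepUntilAllowed();
        }
        return n;
    }

    @Override
    public void close() throws IOException {
        rawStream.close();
    }

    /** Sleeps until the average rate since the first read is back at or below the limit. */
    private void sleepUntilAllowed() throws IOException {
        long earliestMillis = totalBytesRead * 1000 / bytesPerSecond;
        long elapsedMillis = (System.nanoTime() - startNanos) / 1_000_000;
        long sleepMillis = earliestMillis - elapsedMillis;
        if (sleepMillis > 0) {
            try {
                Thread.sleep(sleepMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new InterruptedIOException("interrupted while throttling");
            }
        }
    }
}
```

    Because the wait happens after the underlying read, a single call can still deliver up to len bytes at once; the limit holds on average over the life of the stream.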
    
  • 2021-01-05 11:34

    If you have used Java I/O then you should be familiar with decorating streams. I suggest an InputStream subclass that takes another InputStream and throttles the flow rate. (You could subclass FileInputStream but that approach is highly error-prone and inflexible.)

    Your exact implementation will depend upon your exact requirements. Generally you will want to note the time your last read returned (using System.nanoTime()). On the current read, after the underlying read, wait until sufficient time has passed for the amount of data transferred. A more sophisticated implementation may buffer and return (almost) immediately with only as much data as the rate dictates (be careful: a read should only return a length of 0 if the requested length is zero).
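
    As a rough sketch of the simpler variant described above (remember when the last read returned, then wait after the underlying read), a decorator could look like this. PacedInputStream and its bytesPerSecond parameter are names invented for the example; skip() and the other inherited methods pass through unthrottled.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InterruptedIOException;
import java.util.concurrent.TimeUnit;

class PacedInputStream extends FilterInputStream {
    private final long nanosPerByte;
    private long lastReturnNanos;

    PacedInputStream(InputStream in, long bytesPerSecond) {
        super(in);
        if (bytesPerSecond <= 0) {
            throw new IllegalArgumentException("bytesPerSecond must be positive");
        }
        this.nanosPerByte = TimeUnit.SECONDS.toNanos(1) / bytesPerSecond;
        this.lastReturnNanos = System.nanoTime();
    }

    @Override
    public int read() throws IOException {
        int value = in.read();
        if (value != -1) {
            pace(1);
        }
        return value;
    }

    @Override
    public int read(byte[] b, int off, int len) throws IOException {
        int n = in.read(b, off, len);
        if (n > 0) {
            pace(n);
        }
        return n;
    }

    /** Waits until enough time has passed since the previous read returned. */
    private void pace(int bytes) throws IOException {
        long remaining = lastReturnNanos + bytes * nanosPerByte - System.nanoTime();
        if (remaining > 0) {
            try {
                TimeUnit.NANOSECONDS.sleep(remaining);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new InterruptedIOException("interrupted while pacing");
            }
        }
        lastReturnNanos = System.nanoTime();
    }
}
```

    Each read is paced on its own here, so time the caller spends idle between reads is not banked as extra allowance for later bursts.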

  • 2021-01-05 11:38

    It depends a little on whether you mean "don't exceed a certain rate" or "stay close to a certain rate."

    If you mean "don't exceed", you can guarantee that with a simple loop:

     while not EOF do
        read a buffer
        Thread.sleep(time)
        write the buffer
     od
    

    The amount of time to wait is a simple function of the size of the buffer and the target rate: with the 10 KB/sec limit from the question and a buffer size of 10K bytes, you want to wait a second between reads.
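
    That function can be written down directly. In this small sketch the names ThrottleMath and delayMillis are made up, and the result is rounded up so that rounding never pushes the rate above the limit.

```java
class ThrottleMath {
    /**
     * Milliseconds to wait after reading bufferSize bytes so that the average
     * rate stays at or below bytesPerSecond (rounded up).
     */
    static long delayMillis(int bufferSize, long bytesPerSecond) {
        if (bufferSize < 0 || bytesPerSecond <= 0) {
            throw new IllegalArgumentException("sizes must be non-negative and the rate positive");
        }
        return (bufferSize * 1000L + bytesPerSecond - 1) / bytesPerSecond;
    }
}
```

    With a buffer of 10 * 1024 bytes and a rate of 10 * 1024 bytes per second this gives 1000 ms, matching the one-second wait above.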

    If you want to get closer than that, you probably need to use a timer.

    • create a Runnable to do the reading
    • create a Timer with a TimerTask to do the reading
    • schedule the TimerTask n times a second.
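
    A minimal sketch of those steps with java.util.Timer follows. TimedReader and readWithTimer are hypothetical names, and the sketch only counts bytes where real code would hand each chunk on. It uses fixed-delay schedule() rather than scheduleAtFixedRate(), on the assumption that a late tick should not be followed by a catch-up burst.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Timer;
import java.util.TimerTask;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.AtomicReference;

class TimedReader {
    /**
     * Reads one chunk of at most chunkSize bytes every periodMillis until EOF,
     * blocking the caller until the stream is exhausted. Returns the byte count.
     */
    static long readWithTimer(InputStream in, int chunkSize, long periodMillis)
            throws IOException, InterruptedException {
        Timer timer = new Timer("throttled-reader", true); // daemon timer thread
        CountDownLatch done = new CountDownLatch(1);
        AtomicLong total = new AtomicLong();
        AtomicReference<IOException> failure = new AtomicReference<>();
        byte[] buffer = new byte[chunkSize];

        TimerTask task = new TimerTask() {
            @Override
            public void run() {
                try {
                    int n = in.read(buffer);
                    if (n == -1) {
                        cancel();          // EOF: stop further ticks
                        done.countDown();
                    } else {
                        total.addAndGet(n); // real code would pass buffer[0..n) on
                    }
                } catch (IOException e) {
                    failure.set(e);
                    cancel();
                    done.countDown();
                }
            }
        };
        timer.schedule(task, 0, periodMillis);
        try {
            done.await();
        } finally {
            timer.cancel();
        }
        if (failure.get() != null) {
            throw failure.get();
        }
        return total.get();
    }
}
```

    To read n times a second at 10 KB/sec, the period would be 1000 / n ms and the chunk size 10 * 1024 / n bytes.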

    If you're concerned about the speed at which you're passing the data on to something else, instead of controlling the read, put the data into a data structure like a queue or circular buffer, and control the other end; send data periodically. You need to be careful with that, though, depending on the data set size and such, because you can run into memory limitations if the reader is very much faster than the writer.

  • 2021-01-05 11:43

    The crude solution is just to read a chunk at a time and then sleep, e.g. read 10 KB, then sleep for a second. But the first question I have to ask is: why? There are a couple of likely answers:

    1. You don't want to create work faster than it can be done; or
    2. You don't want to create too great a load on the system.

    My suggestion is not to control it at the read level. That's kind of messy and inaccurate. Instead control it at the work end. Java has lots of great concurrency tools to deal with this. There are a few alternative ways of doing this.

    I tend to like using a producer-consumer pattern for solving this kind of problem. It gives you great options for monitoring progress (by having a reporting thread, for example), and it can be a really clean solution.

    Something like an ArrayBlockingQueue can be used for the kind of throttling needed for both (1) and (2). With a limited capacity the reader will eventually block when the queue is full so won't fill up too fast. The workers (consumers) can be controlled to only work so fast to also throttle the rate covering (2).
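
    A small sketch of that arrangement is shown below. The names BoundedPipeline and run, and the fixed per-chunk consumer delay, are assumptions for illustration; the consumer only counts bytes where real work would happen.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

class BoundedPipeline {
    /**
     * A reader thread fills a bounded queue with chunks; the calling thread
     * consumes them, pausing consumerDelayMillis after each. Returns bytes consumed.
     */
    static long run(InputStream in, int chunkSize, int queueCapacity, long consumerDelayMillis)
            throws IOException, InterruptedException {
        BlockingQueue<byte[]> queue = new ArrayBlockingQueue<>(queueCapacity);
        AtomicReference<IOException> failure = new AtomicReference<>();

        Thread producer = new Thread(() -> {
            byte[] buffer = new byte[chunkSize];
            try {
                int n;
                while ((n = in.read(buffer)) != -1) {
                    // put() blocks while the queue is full, which throttles the reader.
                    queue.put(Arrays.copyOf(buffer, n));
                }
            } catch (IOException e) {
                failure.set(e);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "reader");
        producer.setDaemon(true);
        producer.start();

        long total = 0;
        while (true) {
            byte[] chunk = queue.poll(50, TimeUnit.MILLISECONDS);
            if (chunk == null) {
                // Nothing arrived in time: finish only once the reader has exited and the queue is drained.
                if (!producer.isAlive() && queue.isEmpty()) {
                    break;
                }
                continue;
            }
            total += chunk.length;             // a real consumer would do its work here
            Thread.sleep(consumerDelayMillis); // caps how fast the consumer works
        }
        if (failure.get() != null) {
            throw failure.get();
        }
        return total;
    }
}
```

    The queue capacity bounds memory use: at most queueCapacity chunks are ever waiting, however fast the underlying stream is.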
