Smart progress bar ETA computation

谎友^ 2020-12-22 16:10

In many applications, we have a progress bar for a file download, a compression task, a search, etc. We often use progress bars to let users know something is happening.

11 Answers
  • 2020-12-22 16:15

    First off, it helps to generate a weighted running average, which weights more recent events more heavily.

    To do this, keep a bunch of samples around (circular buffer or list), each a pair of progress and time. Keep the most recent N seconds of samples. Then generate a weighted average of the samples:

    totalProgress += (curSample.progress - prevSample.progress) * scaleFactor
    totalTime += (curSample.time - prevSample.time) * scaleFactor
    

    where scaleFactor ramps linearly from 0 to 1 as samples get more recent (so older samples in the window count less). You can play around with this weighting, of course.

    At the end, you can get the average rate of change:

     averageProgressRate = (totalProgress / totalTime);
    

    You can use this to figure out the ETA by dividing the remaining progress by this number.
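
    As an illustration, here is a minimal Python sketch of that weighted average (the window length and the linear ramp are tunable assumptions, as noted above); ETA is then the remaining progress divided by the returned rate:

    import time

    WINDOW_SECONDS = 10.0   # keep the most recent N seconds of samples (assumed N)
    samples = []            # (progress, timestamp) pairs, oldest first

    def add_sample(progress):
        now = time.monotonic()
        samples.append((progress, now))
        while samples and now - samples[0][1] > WINDOW_SECONDS:
            samples.pop(0)  # drop samples that fell out of the window

    def average_progress_rate():
        if len(samples) < 2:
            return None
        now = samples[-1][1]
        total_progress = total_time = 0.0
        for prev, cur in zip(samples, samples[1:]):
            # scaleFactor ramps linearly from 0 (oldest) to 1 (newest)
            scale = 1.0 - (now - cur[1]) / WINDOW_SECONDS
            total_progress += (cur[0] - prev[0]) * scale
            total_time += (cur[1] - prev[1]) * scale
        return total_progress / total_time if total_time > 0 else None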

    However, while this gives you a good trending number, you have one other issue - jitter. If, due to natural variations, your rate of progress moves around a bit (it's noisy) - e.g. maybe you're using this to estimate file downloads - you'll notice that the noise can easily cause your ETA to jump around, especially if it's pretty far in the future (several minutes or more).

    To avoid jitter from affecting your ETA too much, you want this average rate of change number to respond slowly to updates. One way to approach this is to keep around a cached value of averageProgressRate, and instead of instantly updating it to the trending number you've just calculated, you simulate it as a heavy physical object with mass, applying a simulated 'force' to slowly move it towards the trending number. With mass, it has a bit of inertia and is less likely to be affected by jitter.

    Here's a rough sample:

    // desiredAverageProgressRate is computed from the weighted average above
    // m_averageProgressRate is a member variable, also in progress units/sec
    // lastTimeElapsed = the time delta in seconds (since the last simulation)
    // m_averageSpeed is a member variable used to hold the velocity of
    // m_averageProgressRate (how fast the displayed rate itself is changing)
    
    const float frictionCoeff = 0.75f;
    const float mass = 4.0f;
    const float maxSpeedCoeff = 0.25f;
    
    // lose 25% of our speed per sec, simulating friction
    m_averageSpeed *= pow(frictionCoeff, lastTimeElapsed);
    
    float delta = desiredAverageProgressRate - m_averageProgressRate;
    
    // update the velocity: v += a*t, where a = delta / mass
    float oldSpeed = m_averageSpeed;
    float accel = delta / mass;
    m_averageSpeed += accel * lastTimeElapsed;
    
    // clamp the top speed to 25% of the current rate
    float sign = (m_averageSpeed > 0.0f ? 1.0f : -1.0f);
    float maxVal = m_averageProgressRate * maxSpeedCoeff;
    if (fabs(m_averageSpeed) > maxVal)
    {
        m_averageSpeed = sign * maxVal;
    }
    
    // only apply the movement when the velocity points toward the target
    if ((m_averageSpeed > 0.0f) == (delta > 0.0f))
    {
        float adjust = (oldSpeed + m_averageSpeed) * 0.5f * lastTimeElapsed;
    
        // don't overshoot the target
        if (fabs(adjust) > fabs(delta))
        {
            adjust = delta;
            // apply damping
            m_averageSpeed *= 0.25f;
        }
    
        m_averageProgressRate += adjust;
    }
    
  • 2020-12-22 16:17

    I have tried and simplified your "easy"/"wrong"/"OK" formula and it works best for me:

    t / p - t
    

    In Python:

    >>> done=0.3; duration=10; "time left: %i" % (duration / done - duration)
    'time left: 23'
    

    That saves one op compared to (dur*(1-done)/done). And, in the edge case you describe, possibly ignoring the dialog for 30 minutes extra hardly matters after waiting all night.

    Comparing this simple method to the one used by Transmission, I found it to be up to 72% more accurate.

  • 2020-12-22 16:23

    In certain instances, when you need to perform the same task on a regular basis, it might be a good idea to use past completion times in your average.

    For example, I have an application that loads the iTunes library via its COM interface. The size of a given iTunes library generally does not increase dramatically from launch to launch in terms of the number of items, so in this example you could track the last three load times and load rates, average them, and compute your current ETA from that.

    This would be hugely more accurate than an instantaneous measurement and probably more consistent as well.
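
    A minimal Python sketch of that idea (the history size of three follows the example above; persisting the history between launches is left out):

    from collections import deque

    past_rates = deque(maxlen=3)  # items/sec from the last three runs

    def record_run(total_items, total_seconds):
        past_rates.append(total_items / total_seconds)

    def eta_seconds(items_remaining):
        if not past_rates:
            return None  # no history yet; fall back to a live estimate
        avg_rate = sum(past_rates) / len(past_rates)
        return items_remaining / avg_rate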

    However, this method depends on the size of the task being relatively similar to previous ones, so it would not work for something like a decompression routine, where any given byte stream might be the data to be crunched.

    Just my $0.02

  • 2020-12-22 16:23

    Uniform averaging

    The simplest approach would be to predict the remaining time linearly:

    t_rem := t_spent * ( n - prog ) / prog
    

    where t_rem is the predicted ETA, t_spent is the time elapsed since the commencement of the operation, prog the number of microtasks completed out of their full quantity n. To explain—n may be the number of rows in a table to process or the number of files to copy.
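
    For instance, in Python, with 300 of 1000 rows done after 12 seconds:

    t_spent, prog, n = 12.0, 300, 1000
    t_rem = t_spent * (n - prog) / prog   # 28.0 seconds remaining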

    This method having no parameters, one need not worry about the fine-tuning of the exponent of attenuation. The trade-off is poor adaptation to a changing progress rate, because all samples contribute equally to the estimate, whereas it is only meet that recent samples should have more weight than old ones, which leads us to

    Exponential smoothing of rate

    in which the standard technique is to estimate progress rate by averaging previous point measurements:

    rate := 1 / (n * dt); { rate equals normalized progress per unit time }
    if prog = 1 then      { if first microtask just completed }
        rate_est := rate  { initialize the estimate }
    else
    begin
        weight   := Exp( - dt / DECAY_T );
        rate_est := rate_est * weight + rate * (1.0 - weight);
        t_rem    := (1.0 - prog / n) / rate_est;
    end;
    

    where dt denotes the duration of the last completed microtask and is equal to the time passed since the previous progress update. Notice that weight is not a constant and must be adjusted according to the length of time during which a certain rate was observed, because the longer we observed a certain speed, the higher the exponential decay of the previous measurements. The constant DECAY_T denotes the length of time during which the weight of a sample decreases by a factor of e. SPWorley himself suggested a similar modification to gooli's proposal, although he applied it to the wrong term. An exponential average for equidistant measurements is:

    Avg_e(n) = Avg_e(n-1) * alpha + m_n * (1 - alpha)
    

    but what if the samples are not equidistant, as is the case with times in a typical progress bar? Take into account that alpha above is but an empirical quotient whose true value is:

    alpha = Exp( - lambda * dt ),
    

    where lambda is the parameter of the exponential window and dt the amount of change since the previous sample, which need not be time, but any linear and additive parameter. alpha is constant for equidistant measurements but varies with dt.

    Mark that this method relies on a predefined time constant and is not scalable in time. In other words, if the exact same process were uniformly slowed down by a constant factor, this rate-based filter would become proportionally more sensitive to signal variations, because at every step weight would be decreased. If we, however, desire a smoothing independent of the time scale, we should consider

    Exponential smoothing of slowness

    which is essentially the smoothing of rate turned upside down, with the added simplification of a constant weight because prog grows by equidistant increments:

    slowness := n * dt;   { slowness is the amount of time per unit of normalized progress }
    if prog = 1 then      { if first microtask just completed }
        slowness_est := slowness  { initialize the estimate }
    else
    begin
        weight       := Exp( - 1 / (n * DECAY_P) );
        slowness_est := slowness_est * weight + slowness * (1.0 - weight);
        t_rem        := (1.0 - prog / n) * slowness_est;
    end;
    

    The dimensionless constant DECAY_P denotes the normalized progress difference between two samples of which the weights are in the ratio of one to e. In other words, this constant determines the width of the smoothing window in progress domain, rather than in time domain. This technique is therefore independent of the time scale and has a constant spatial resolution.
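
    A minimal Python transcription of the slowness smoother (the value of DECAY_P, here one tenth of the total progress, is an assumed choice):

    import math

    DECAY_P = 0.1  # smoothing window as a fraction of total progress (assumed)

    def make_estimator(n):
        """Returns a callback to call after each of the n microtasks completes."""
        prog = 0
        slowness_est = None

        def on_microtask_done(dt):   # dt: seconds since the previous update
            nonlocal prog, slowness_est
            prog += 1
            slowness = n * dt        # time per unit of normalized progress
            if slowness_est is None:
                slowness_est = slowness           # first sample initializes
            else:
                weight = math.exp(-1.0 / (n * DECAY_P))
                slowness_est = slowness_est * weight + slowness * (1.0 - weight)
            return (1.0 - prog / n) * slowness_est  # t_rem

        return on_microtask_done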

    Further research: adaptive exponential smoothing

    You are now equipped to try the various algorithms of adaptive exponential smoothing. Only remember to apply it to slowness rather than to rate.

  • 2020-12-22 16:24

    Whilst all the examples are valid, for the specific case of 'time left to download', I thought it would be a good idea to look at existing open source projects to see what they do.

    From what I can see, Mozilla Firefox is the best at estimating the time remaining.

    Mozilla Firefox

    Firefox keeps track of the last estimate for time remaining, and by using this and the current estimate for time remaining, it performs a smoothing function on the time. See the ETA code here. This uses a 'speed' which is previously calculated here and is a smoothed average of the last 10 readings.

    This is a little complex, so to paraphrase:

    • Take a smoothed average of the speed based 90% on the previous speed and 10% on the new speed.
    • With this smoothed average speed, work out the estimated time remaining.
    • Use this estimated time remaining, and the previous estimated time remaining, to create a new estimated time remaining (in order to avoid jumping); see the sketch below.
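
    A rough Python sketch of that scheme (the 90/10 split is from the paraphrase above; reusing 90/10 for the final ETA blend is my assumption, not Firefox's exact code):

    def update_eta(smoothed_speed, smoothed_eta, new_speed, bytes_remaining):
        # 90% previous speed, 10% new measurement
        smoothed_speed = (new_speed if smoothed_speed is None
                          else 0.9 * smoothed_speed + 0.1 * new_speed)
        raw_eta = (bytes_remaining / smoothed_speed
                   if smoothed_speed > 0 else float("inf"))
        # blend with the previous estimate so the display doesn't jump
        smoothed_eta = (raw_eta if smoothed_eta is None
                        else 0.9 * smoothed_eta + 0.1 * raw_eta)
        return smoothed_speed, smoothed_eta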

    Google Chrome

    Chrome seems to jump about all over the place, and the code shows this.

    One thing I do like with Chrome though is how they format the time remaining: for > 1 hour it says '1 hrs left'; for < 1 hour it says '59 mins left'; for < 1 minute it says '52 secs left'.

    You can see how it's formatted here

    DownThemAll! Manager

    It doesn't use anything clever, meaning the ETA jumps about all over the place.

    See the code here

    pySmartDL (a python downloader)

    Takes the average ETA of the last 30 ETA calculations. Sounds like a reasonable way to do it.

    See the code in pySmartDL/pySmartDL.py#L651 (commit 916f2592).
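
    That approach is only a few lines in Python (the 30 matches the figure quoted above):

    from collections import deque

    recent_etas = deque(maxlen=30)  # the last 30 instantaneous ETA values

    def smoothed_eta(instantaneous_eta):
        recent_etas.append(instantaneous_eta)
        return sum(recent_etas) / len(recent_etas)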

    Transmission

    Gives a pretty good ETA in most cases (except when starting off, as might be expected).

    Uses a smoothing factor over the past 5 readings, similar to Firefox but not quite as complex. Fundamentally similar to Gooli's answer.

    See the code here

  • 2020-12-22 16:24

    I always wish these things would tell me a range. If it said, "This task will most likely be done in between 8 min and 30 minutes," then I have some idea of what kind of break to take. If it's bouncing all over the place, I'm tempted to watch it until it settles down, which is a big waste of time.
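
    One way to produce such a range (my own sketch; the answer does not specify a method) is to derive the ETA from both the fastest and slowest rates seen over a recent window:

    def eta_range(recent_rates, remaining):
        """recent_rates: recent progress rates in units/sec, all > 0."""
        optimistic = remaining / max(recent_rates)
        pessimistic = remaining / min(recent_rates)
        return optimistic, pessimistic   # e.g. "between 8 and 30 minutes"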
