LINQ aggregate and group by periods of time

醉酒成梦 2020-11-28 06:28

I'm trying to understand how LINQ can be used to group data by intervals of time, and then ideally aggregate each group.

Finding numerous examples with explicit dat

8 answers
  • 2020-11-28 07:04

    To group by hour, group on the hour part of your timestamp, which can be done like this:

    var groups = from s in series
                 let groupKey = new DateTime(s.timestamp.Year, s.timestamp.Month, s.timestamp.Day, s.timestamp.Hour, 0, 0)
                 group s by groupKey into g
                 select new
                 {
                     TimeStamp = g.Key,
                     Value = g.Average(a => a.value)
                 };
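
    As a minimal, self-contained sketch of the same idea in method syntax: the `Sample` type below is an assumption (a `timestamp` and a `value`, as the query implies), and `HourlyAverages.Compute` is a hypothetical helper name used only for illustration.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Linq;

    // Assumed shape of the data; not taken from the original question.
    public class Sample
    {
        public DateTime timestamp;
        public double value;
    }

    public static class HourlyAverages
    {
        // Truncates each timestamp to the start of its hour, then averages the values per hour.
        public static List<(DateTime Hour, double Average)> Compute(IEnumerable<Sample> series) =>
            series
                .GroupBy(s => new DateTime(s.timestamp.Year, s.timestamp.Month, s.timestamp.Day, s.timestamp.Hour, 0, 0))
                .OrderBy(g => g.Key)
                .Select(g => (Hour: g.Key, Average: g.Average(s => s.value)))
                .ToList();
    }
    ```

    With samples at 10:05 (value 1), 10:55 (value 3) and 11:10 (value 10) on the same day, this yields one group keyed 10:00 with average 2 and one keyed 11:00 with average 10.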
    
  • 2020-11-28 07:05

    A generalised solution:

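        // Splits an ordered sequence into consecutive windows of length timeSpan.
        // A new window starts at the first element past the end of the previous one,
        // so windows are anchored to the data, not to clock boundaries.
        // Declare this inside a static class so it can be used as an extension method.
        static IEnumerable<IGrouping<DateRange, T>> GroupBy<T>(this IOrderedEnumerable<T> enumerable, TimeSpan timeSpan, Func<T, DateTime> predicate)
        {
            Grouping<T> grouping = null;
            foreach (var a in enumerable)
            {
                var dt = predicate(a);
                if (grouping == null || dt > grouping.Key.End)
                {
                    // Yield the previous group only once it is complete; yielding it early
                    // would let lazy consumers see a group with just its first element.
                    if (grouping != null)
                        yield return grouping;
                    grouping = new Grouping<T>(new DateRange(dt, dt + timeSpan), a);
                }
                else
                    grouping.Add(a);
            }
            if (grouping != null)
                yield return grouping;
        }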
    
        class Grouping<T> : IGrouping<DateRange, T>
        {
    
            readonly List<T> elements = new List<T>();
    
            public DateRange Key { get; }
    
            public Grouping(DateRange key) => Key = key;
    
            public Grouping(DateRange key, T element) : this(key) => Add(element);
    
            public void Add(T element) => elements.Add(element);
    
            public IEnumerator<T> GetEnumerator()=> this.elements.GetEnumerator();
    
            IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
        }
    
        class DateRange
        {
        
            public DateRange(DateTime start, DateTime end)
            {
                this.Start = start;
                this.End = end;
            }
    
            public DateTime Start { get; set; }
            public DateTime End { get; set; }
        }
    

    Test based on question (using AutoFixture library)

         void Test()
        {
            var many = new Fixture().CreateMany<Sample>(100);
    
            var groups = many.OrderBy(a => a.timestamp).GroupBy(TimeSpan.FromDays(365), a => a.timestamp).Select(a => a.Average(b => b.value)).ToArray();
    
        }
    
        public class Sample
        {
            public DateTime timestamp;
            public double value;
        }
    
  • 2020-11-28 07:07

    Even though I am really late, here are my 2 cents:

    I wanted to Round() the time values down AND up in 5 minute intervals:

    10:31 --> 10:30
    10:33 --> 10:35
    10:36 --> 10:35
    

    This can be achieved by dividing the DateTime's Ticks by the interval's TimeSpan.Ticks, rounding with Math.Round(), and converting the result back to a DateTime:

    public DateTime GetShiftedTimeStamp(DateTime timeStamp, int minutes)
    {
        return
            new DateTime(
                Convert.ToInt64(
                    Math.Round(timeStamp.Ticks / (decimal)TimeSpan.FromMinutes(minutes).Ticks, 0, MidpointRounding.AwayFromZero)
                        * TimeSpan.FromMinutes(minutes).Ticks));
    }
    

    The shifted timestamp can then be used as the key in a LINQ grouping, as in the answers above.
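
    A minimal sketch of that last step, assuming the input is a plain sequence of `DateTime` values. `NearestIntervalGrouping`, `RoundToNearest` and `CountPerInterval` are hypothetical names introduced here; the rounding follows the same idea as `GetShiftedTimeStamp` but takes a `TimeSpan` and preserves the `DateTimeKind`.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Linq;

    public static class NearestIntervalGrouping
    {
        // Rounds a timestamp to the nearest multiple of the interval (midpoints round up).
        public static DateTime RoundToNearest(DateTime timeStamp, TimeSpan interval)
        {
            decimal buckets = Math.Round(timeStamp.Ticks / (decimal)interval.Ticks, 0, MidpointRounding.AwayFromZero);
            return new DateTime((long)(buckets * interval.Ticks), timeStamp.Kind);
        }

        // Groups timestamps by their rounded value and counts the members of each group.
        public static Dictionary<DateTime, int> CountPerInterval(IEnumerable<DateTime> stamps, TimeSpan interval) =>
            stamps
                .GroupBy(t => RoundToNearest(t, interval))
                .ToDictionary(g => g.Key, g => g.Count());
    }
    ```

    For the three example times above and a 5 minute interval, 10:31 lands in the 10:30 group while 10:33 and 10:36 both land in the 10:35 group.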

  • 2020-11-28 07:15

    I'm very late to the game on this one, but I came across this while searching for something else, and I thought I had a better way.

    series.GroupBy (s => s.timestamp.Ticks / TimeSpan.FromHours(1).Ticks)
            .Select (s => new {
                series = s
                ,timestamp = s.First ().timestamp
                ,average = s.Average (x => x.value )
            }).Dump();
    

    Here is a sample LINQPad program so you can validate and test it:

    void Main()
    {
        List<Sample> series = new List<Sample>();
    
        Random random = new Random(DateTime.Now.Millisecond);
        for (DateTime i = DateTime.Now.AddDays(-5); i < DateTime.Now; i += TimeSpan.FromMinutes(1))
        {
            series.Add(new UserQuery.Sample(){ timestamp = i, value = random.NextDouble() * 100 });
        }
        //series.Dump();
        series.GroupBy (s => s.timestamp.Ticks / TimeSpan.FromHours(1).Ticks)
            .Select (s => new {
                series = s
                ,timestamp = s.First ().timestamp
                ,average = s.Average (x => x.value )
            }).Dump();
    }
    
    // Define other methods and classes here
    public class Sample
    {
         public DateTime timestamp;
         public double value;
    }
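
    One detail worth noting: `s.First().timestamp` reports the first sample in each group, not the start of the bucket. If the bucket start is wanted, it can be recovered from the integer key. The following is a sketch under that assumption; `TickBuckets.AveragePerBucket` is a hypothetical name, and the input is modelled as tuples rather than the `Sample` class to keep it short.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Linq;

    public static class TickBuckets
    {
        // Groups by (ticks / bucket size) and multiplies the key back to get each bucket's start.
        public static List<(DateTime Start, double Average)> AveragePerBucket(
            IEnumerable<(DateTime Timestamp, double Value)> series, TimeSpan bucket) =>
            series
                .GroupBy(s => s.Timestamp.Ticks / bucket.Ticks)
                .OrderBy(g => g.Key)
                .Select(g => (Start: new DateTime(g.Key * bucket.Ticks), Average: g.Average(s => s.Value)))
                .ToList();
    }
    ```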
    
  • 2020-11-28 07:16

    I know this doesn't directly answer the question, but I was googling for a very similar solution to aggregate candle data for stocks and cryptocurrencies from a smaller minute period to a larger one (5, 10, 15, 30). You can't simply go back from the current minute taking X candles at a time, because the timestamps of the aggregated periods won't be consistent. You also have to make sure there is enough data at the start and end of the list to populate a full candlestick of the larger period. Given that, the solution I came up with is as follows. (It assumes that the candles for the smaller period, as indicated by rawPeriod, are sorted by ascending Timestamp.)

    public class Candle
    {
        public long Id { get; set; }
        public Period Period { get; set; }
        public DateTime Timestamp { get; set; }
        public double High { get; set; }
        public double Low { get; set; }
        public double Open { get; set; }
        public double Close { get; set; }
        public double BuyVolume { get; set; }
        public double SellVolume { get; set; }
    }
    
    public enum Period
    {
        Minute = 1,
        FiveMinutes = 5,
        QuarterOfAnHour = 15,
        HalfAnHour = 30
    }
    
    private List<Candle> AggregateCandlesIntoRequestedTimePeriod(Period rawPeriod, Period requestedPeriod, List<Candle> candles)
    {
        if (rawPeriod != requestedPeriod)
        {
            int requestedMinutes = (int) requestedPeriod;
            // Number of raw candles that make up one complete candle of the requested period.
            int candlesPerGroup = requestedMinutes / (int) rawPeriod;
            candles = candles
                        // Floor each timestamp to the start of its requested-period boundary.
                        .GroupBy(g => new { TimeBoundary = new DateTime(g.Timestamp.Year, g.Timestamp.Month, g.Timestamp.Day, g.Timestamp.Hour, (g.Timestamp.Minute / requestedMinutes) * requestedMinutes, 0) })
                        // Drop incomplete groups at the start and end of the list.
                        .Where(g => g.Count() == candlesPerGroup)
                        .Select(s => new Candle
                        {
                            Period = requestedPeriod,
                            Timestamp = s.Key.TimeBoundary,
                            High = s.Max(z => z.High),
                            Low = s.Min(z => z.Low),
                            Open = s.First().Open,
                            Close = s.Last().Close,
                            BuyVolume = s.Sum(z => z.BuyVolume),
                            SellVolume = s.Sum(z => z.SellVolume),
                        })
                        .OrderBy(o => o.Timestamp)
                        .ToList();
        }

        return candles;
    }
    
  • 2020-11-28 07:19

    I improved on BrokenGlass's answer by making it more generic and adding safeguards. With his current answer, if you choose an interval of 9, it will not do what you'd expect. The same goes for any number that does not divide 60 evenly. For this example, I'm using 9 and starting at midnight (0:00).

    • Everything from 0:00 to 0:08.999 will be put into a group of 0:00 as you'd expect. It will keep doing this until you get to the grouping that starts at 0:54.
    • At 0:54, it will only group things from 0:54 to 0:59.999 instead of going up to 01:03.999.

    For me, this is a massive issue.

    I'm not sure how to fix that, but you can add safeguards.
    Changes:

    1. Any minute interval for which 60 % [interval] equals 0 is acceptable. The if statements below enforce this.
    2. Hour intervals work as well.

              double minIntervalAsDouble = Convert.ToDouble(minInterval);
              if (minIntervalAsDouble <= 0)
              {
                  string message = "minInterval must be a positive number, exiting";
                  Log.getInstance().Info(message);
                  throw new Exception(message);
              }
              else if (minIntervalAsDouble < 60.0 && 60.0 % minIntervalAsDouble != 0)
              {
                  string message = "60 must be divisible by minInterval...exiting";
                  Log.getInstance().Info(message);
                  throw new Exception(message);
              }
              else if (minIntervalAsDouble >= 60.0 && (24.0 % (minIntervalAsDouble / 60.0)) != 0 && (24.0 % (minIntervalAsDouble / 60.0) != 24.0))
              {
                  //hour part must be divisible...
                  string message = "If minInterval is greater than 60, 24 must be divisible by minInterval/60 (hour value)...exiting";
                  Log.getInstance().Info(message);
                  throw new Exception(message);
              }
              var groups = datas.GroupBy(x =>
              {
                  if (minInterval < 60)
                  {
                      var stamp = x.Created;
                      stamp = stamp.AddMinutes(-(stamp.Minute % minInterval));
                      stamp = stamp.AddMilliseconds(-stamp.Millisecond);
                      stamp = stamp.AddSeconds(-stamp.Second);
                      return stamp;
                  }
                  else
                  {
                      var stamp = x.Created;
                      int hourValue = minInterval / 60;
                      stamp = stamp.AddHours(-(stamp.Hour % hourValue));
                      stamp = stamp.AddMilliseconds(-stamp.Millisecond);
                      stamp = stamp.AddSeconds(-stamp.Second);
                      stamp = stamp.AddMinutes(-stamp.Minute);
                      return stamp;
                  }
              }).Select(o => new
              {
                  o.Key,
                  min = o.Min(f=>f.Created),
                  max = o.Max(f=>f.Created),
                  o
              }).ToList();
      

    Put whatever you'd like in the select statement! I put in min/max because it was easier to test it.
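
    One possible way around the 0:54 problem described above, offered as a sketch rather than as part of the original answer, is to floor on the total tick count instead of on the minute-of-hour, and to require only that the interval divides a 24-hour day evenly. `IntervalFloor` and its members are hypothetical names, and the input is assumed to be a plain sequence of `DateTime` values.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Linq;

    public static class IntervalFloor
    {
        // Throws unless the interval is positive and divides a 24-hour day evenly,
        // which keeps the buckets aligned with midnight on every day.
        public static void EnsureDividesDay(TimeSpan interval)
        {
            if (interval <= TimeSpan.Zero)
                throw new ArgumentOutOfRangeException(nameof(interval), "Interval must be positive.");
            if (TimeSpan.TicksPerDay % interval.Ticks != 0)
                throw new ArgumentException("Interval must divide 24 hours evenly.", nameof(interval));
        }

        // Rounds a timestamp down to the start of its interval.
        public static DateTime Floor(DateTime stamp, TimeSpan interval) =>
            new DateTime(stamp.Ticks - stamp.Ticks % interval.Ticks, stamp.Kind);

        // Groups timestamps by the start of the interval they fall into.
        public static ILookup<DateTime, DateTime> GroupByInterval(IEnumerable<DateTime> stamps, TimeSpan interval)
        {
            EnsureDividesDay(interval);
            return stamps.ToLookup(s => Floor(s, interval));
        }
    }
    ```

    With a 9 minute interval, timestamps at 0:54, 0:59:59 and 1:02 all fall into the group keyed 0:54, and 1:03 starts the next group. An interval such as 7 minutes is rejected because it does not divide 24 hours.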
