MongoDB: How to query a time-series with incomplete data?

南方客 2021-01-26 22:20

I'm storing time-series data in a MongoDB collection, with one data point every 15 minutes. Sometimes, due to bad conditions, some data points get lost. I have a dataset as follo

1 answer
  •  一生所求
    2021-01-26 23:08

    Here is an aggregation that uses the approach I mentioned in my first comment:

    db.collection.aggregate( [
      // Sort by time so that $first and $last in the next stage pick the range's endpoints.
      {
          $sort: { timestamp: 1 } 
      },
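      // Collapse everything into one document: all readings in docs, plus the first and
      // last timestamps as epoch seconds. With _id: null the whole collection is treated
      // as a single series, so this assumes one device_id.
      {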
          $group: { 
               _id: null,
               docs: { $push: { timestamp: "$timestamp", device_id: "$device_id", temp: "$temp", missing: false } },
               device_id: { $first: "$device_id" },
               start: { $first: { $toInt: { $divide: [ { "$toLong": "$timestamp" }, 1000 ] } } }, 
               end: { $last: { $toInt: { $divide: [ { "$toLong": "$timestamp" }, 1000 ] } } }
          } 
      },
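      // Generate every expected slot from start to end in steps of 900 s (15 min) and
      // attach the matching reading, if there is one, as ts_exists.
      {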
          $addFields: {
               docs: {
                   $map: {
                        input: { $range: [ { $toInt: "$start" }, { $add: [ { $toInt: "$end" }, 900 ] }, 900 ] }, 
                        as: "ts",
                        in: {
                            ts_exists: { $arrayElemAt: [ 
                                                  { $filter: { 
                                                          input: "$docs", as: "d", 
                                                          cond: { $eq: [ { $toInt: { $divide: [ { "$toLong": "$$d.timestamp" }, 1000 ] } },
                                                                          "$$ts"
                                                                 ] }
                                                   }}, 
                                         0 ] },
                             ts: "$$ts"
                        }
                  }
              }
          }
      },
      // Emit one document per slot.
      {
          $unwind: "$docs" 
      },
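      // Keep the real reading where one exists; otherwise build a placeholder
      // with temp: 0 and missing: true.
      {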
          $addFields: { 
              docs: { 
                  $ifNull: [ "$docs.ts_exists", { timestamp: { $toDate: { $multiply: [ "$docs.ts", 1000 ] } }, 
                                                  temp: 0, device_id: "$device_id", missing: true 
                                                 } 
                           ] 
              }
          }
      },
      // Promote each slot's document to the top level.
      {
          $replaceRoot: { newRoot: "$docs" } 
      }
    ] ).pretty()
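
The stages above can be easier to follow as ordinary code. Here is a minimal sketch of the same gap-filling logic in plain JavaScript, run client-side on an array of documents. The names `fillGaps` and `STEP_SECONDS` are illustrative and not part of MongoDB, and like the pipeline it assumes all documents belong to one device.

```javascript
// Plain-JavaScript sketch of the pipeline's logic (illustrative names, not a MongoDB API).
const STEP_SECONDS = 900; // 15 minutes, the sampling interval from the question

function fillGaps(docs, stepSeconds = STEP_SECONDS) {
  if (docs.length === 0) return [];

  // Like $sort: order the readings by timestamp, ascending.
  const sorted = [...docs].sort((a, b) => a.timestamp - b.timestamp);

  // Like $toInt of ($toLong / 1000): whole epoch seconds.
  const toSeconds = (date) => Math.trunc(date.getTime() / 1000);

  // Like $group: remember the existing readings, the endpoints, and the device.
  const bySecond = new Map();
  for (const doc of sorted) {
    const key = toSeconds(doc.timestamp);
    // Keep the first reading for a slot, as $arrayElemAt [..., 0] does.
    if (!bySecond.has(key)) bySecond.set(key, { ...doc, missing: false });
  }
  const start = toSeconds(sorted[0].timestamp);
  const end = toSeconds(sorted[sorted.length - 1].timestamp);
  const deviceId = sorted[0].device_id;

  // Like $range + $map + $ifNull: walk every expected slot and either keep
  // the real reading or emit a placeholder marked missing: true.
  const result = [];
  for (let ts = start; ts < end + stepSeconds; ts += stepSeconds) {
    const existing = bySecond.get(ts);
    result.push(
      existing !== undefined
        ? existing
        : { timestamp: new Date(ts * 1000), temp: 0, device_id: deviceId, missing: true }
    );
  }
  return result;
}

// Usage with the four sample readings from this answer (18:15 is absent).
const sample = [
  { device_id: "ABC", temp: 12, timestamp: new Date("2020-01-04T17:45:00Z") },
  { device_id: "ABC", temp: 10, timestamp: new Date("2020-01-04T18:00:00Z") },
  { device_id: "ABC", temp: 4, timestamp: new Date("2020-01-04T18:30:00Z") },
  { device_id: "ABC", temp: 23, timestamp: new Date("2020-01-04T18:45:00Z") }
];
const filled = fillGaps(sample);
console.log(filled.filter((d) => d.missing).map((d) => d.timestamp.toISOString()));
```

Doing this inside the aggregation keeps the work on the server. The client-side version is only meant to show what each stage contributes.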
    

    Using the following input documents:

    {"device_id": "ABC","temp": 12,"timestamp": ISODate("2020-01-04T17:45:00.000+00:00") },
    {"device_id": "ABC","temp": 10,"timestamp": ISODate("2020-01-04T18:00:00.000+00:00") },
    {"device_id": "ABC","temp": 4,"timestamp": ISODate("2020-01-04T18:30:00.000+00:00") },
    {"device_id": "ABC","temp": 23,"timestamp": ISODate("2020-01-04T18:45:00.000+00:00") }
    

    The result:

    {
            "timestamp" : ISODate("2020-01-04T17:45:00Z"),
            "device_id" : "ABC",
            "temp" : 12,
            "missing" : false
    }
    {
            "timestamp" : ISODate("2020-01-04T18:00:00Z"),
            "device_id" : "ABC",
            "temp" : 10,
            "missing" : false
    }
    {
            "timestamp" : ISODate("2020-01-04T18:15:00Z"),
            "temp" : 0,
            "device_id" : "ABC",
            "missing" : true
    }
    {
            "timestamp" : ISODate("2020-01-04T18:30:00Z"),
            "device_id" : "ABC",
            "temp" : 4,
            "missing" : false
    }
    {
            "timestamp" : ISODate("2020-01-04T18:45:00Z"),
            "device_id" : "ABC",
            "temp" : 23,
            "missing" : false
    }
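
On MongoDB 5.1 or later, the `$densify` stage can generate the missing slots directly, which would replace the `$group` / `$range` / `$unwind` steps. This is a different technique from the pipeline above and does not work on older servers. A minimal sketch, assuming the same `timestamp`, `device_id`, and `temp` fields (the variable name `densifyPipeline` is illustrative):

```javascript
// Sketch only: needs MongoDB 5.1+, where the $densify stage exists.
const densifyPipeline = [
  {
    // Insert a document for every 15-minute step that has no reading,
    // separately for each device_id, between that device's first and last timestamp.
    $densify: {
      field: "timestamp",
      partitionByFields: ["device_id"],
      range: { step: 15, unit: "minute", bounds: "partition" }
    }
  },
  {
    // Generated documents have no temp field: flag them and default temp to 0.
    $set: {
      missing: { $eq: [{ $type: "$temp" }, "missing"] },
      temp: { $ifNull: ["$temp", 0] }
    }
  }
];

// In the shell: db.collection.aggregate(densifyPipeline)
```

Because `partitionByFields` handles each `device_id` on its own, this variant should also cope with several devices in one collection.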
    
