Select and sum DataTable rows with criteria

Question

I have this data table:

DataTable dt = new DataTable();
dt.Columns.Add("BBG IPC code", typeof(double));
dt.Columns.Add("Issuer Group", typeof(string));
dt.Columns.Add("Seniority", typeof(string));
dt.Columns.Add("Nom Value", typeof(double));
dt.Columns.Add("Mkt Value", typeof(double));
dt.Columns.Add("Rating", typeof(string));
dt.Columns.Add("Sector", typeof(string));
dt.Columns.Add("Analyst", typeof(string));
dt.Rows.Add(new object[] { 117896, "Financiere", "Senior", 101, 20000.76, "BB", "Materials", "BAETZ" });
dt.Rows.Add(new object[] { 117896, "Financiere", "Senior", 356, 300500, "BBB", "Materials", "BAETZ" });
dt.Rows.Add(new object[] { 117896, "Financiere", "Senior", 356, 30000, "BBB", "Energy", "BAETZ" });
dt.Rows.Add(new object[] { 117896, "Financiere", "Covered", 4888, 10000, "BB", "Energy", "BAETZ" });
dt.Rows.Add(new object[] { 117896, "Financiere", "Covered", 645, 50000, "BBB", "Energy", "BAETZ" });
dt.Rows.Add(new object[] { 117897, "Scentre Group", "Senior", 46452, 51066.5, "AA", "Energy", "BAETZ" });
dt.Rows.Add(new object[] { 117898, "Vereniging Achmea", "Senior", 778, 90789.9, "C", "Insurance", "BAETZ" });
dt.Rows.Add(new object[] { 117898, "Vereniging Achmea", "Senior", 7852, 10055.66, "C", "Utilities", "BAETZ" });

For each pair of BBG IPC code and Seniority values, I need to check whether the values in the Rating and Sector columns are the same. If they are, the rows should be merged and their Mkt Value and Nom Value summed. If one or both differ, I need to keep the row with the highest Mkt Value (if there is a tie, any one of the tied rows will do) and discard the other rows, BUT the Mkt Value and Nom Value columns must still hold the sum over all the rows in the group.

For example, for BBG IPC code 117896 with Seniority "Senior" there are different values of Rating and Sector, so I need the row with the highest Mkt Value (the second row, 300500) and want to discard the other 2 rows with lower Mkt Value. Before discarding them, I need to sum 300500+20000+30000 and 356+356+101. The result is {117896,"Financiere","Senior",813,350500,"BBB", "Materials", "BAETZ"}

I've tried something like this, but I get an error telling me that I can't pass a string value from the "Seniority" field to CopyToDataTable...

DataTable maxIPC_Seniority = dt.AsEnumerable()
            .OrderByDescending(x => x.Field<double>("Mkt Value"))
            .GroupBy(x => x.Field<double>("IPC"), x => x.Field<string>("Seniority"))
            .Select(x => x.FirstOrDefault())
            .CopyToDataTable();

That still leaves the problem of summing the discarded rows. Thank you for your help.

Answer 1:

One problem is that when you're calling GroupBy, you're setting the "IPC" column as the Key selector, but there is no "IPC" column in the table. Instead you should use the actual column name, "BBG IPC code".

The next problem is that you're calling an overload of GroupBy which takes a key selector as the first argument and an element selector as the second argument, so it's just selecting the "Seniority" column in the groups.
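To make the overload difference concrete, here is a minimal sketch. The class name, method names and the three sample rows are made up for illustration and are not part of the question's code. With the key-selector/element-selector overload, each group contains only the projected "Seniority" strings, so taking the first element of each group yields a sequence of string rather than DataRow. CopyToDataTable is only defined for sequences of DataRow, which would explain the error you're seeing.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

public static class GroupByOverloadDemo
{
    // Builds a small two-column table; the rows are illustrative only.
    public static DataTable BuildTable()
    {
        var table = new DataTable();
        table.Columns.Add("BBG IPC code", typeof(double));
        table.Columns.Add("Seniority", typeof(string));
        table.Rows.Add(117896d, "Senior");
        table.Rows.Add(117896d, "Covered");
        table.Rows.Add(117897d, "Senior");
        return table;
    }

    // GroupBy(keySelector, elementSelector): the key is the IPC code alone,
    // and each group holds only the projected Seniority strings.
    public static List<IGrouping<double, string>> GroupWithElementSelector(DataTable table)
    {
        return table.AsEnumerable()
            .GroupBy(row => row.Field<double>("BBG IPC code"),
                     row => row.Field<string>("Seniority"))
            .ToList();
    }

    public static void Main()
    {
        foreach (var group in GroupWithElementSelector(BuildTable()))
        {
            // Every element of the group is a string, not a DataRow.
            Console.WriteLine($"{group.Key}: {string.Join(", ", group)}");
        }
    }
}
```

Note that the groups here are keyed by the IPC code only, so "Senior" and "Covered" rows with the same code end up in the same group.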

Instead, to group by two columns as the key, we need to create a new anonymous object for the Key that contains properties with the column values:

var maxIPC_Seniority = dt.AsEnumerable()
    .OrderByDescending(row => row.Field<double>("Mkt Value"))
    .GroupBy(row =>
        new
        {
            IPC = row.Field<double>("BBG IPC code"),
            Seniority = row.Field<string>("Seniority")
        })
    .Select(group => group.FirstOrDefault())
    .CopyToDataTable();

Now, to combine the rows the way you want, one way is to select a collection of object[] holding the new data and then add those to the resulting table, since we can't just create a DataRow without a DataTable. So my answer does three things:

1. Create a new DataTable with the required columns
2. Select the merged data from the original table as an IEnumerable<object[]>
3. Add each object[] as a DataRow to the DataTable from step 1

For example:

// Create a new DataTable with the same columns as `dt`
DataTable maxIpcSeniority = dt.Clone();

// Group our set of original data, do the merging of rows as necessary
// and then return the row data as a list of object[]
var maxIpcSeniorityRowData = dt.AsEnumerable()
    .OrderByDescending(row => row.Field<double>("Mkt Value"))
    .GroupBy(row =>
        new
        {
            IPC = row.Field<double>("BBG IPC code"),
            Seniority = row.Field<string>("Seniority")
        })
    .Select(group =>
    {
        // Since the data is ordered by MktValue already, we can just grab 
        // the first one to use for filling in the non-merged fields
        var firstRow = group.First();

        return new object[]
        {
            group.Key.IPC,
            firstRow.Field<string>("Issuer Group"),
            group.Key.Seniority,
            group.Sum(row => row.Field<double>("Nom Value")),
            group.Sum(row => row.Field<double>("Mkt Value")),
            firstRow.Field<string>("Rating"),
            firstRow.Field<string>("Sector"),
            firstRow.Field<string>("Analyst")
        };
    })
    .ToList();

// Add each set of rowData to our new table
foreach (var rowData in maxIpcSeniorityRowData)
{
    maxIpcSeniority.Rows.Add(rowData);
}

If you can't use curly braces for some reason, you could use a Tuple (or even create a separate class) to store the GroupBy fields instead of an anonymous type. That way you can pass the values through the constructor instead of initializing properties in curly braces. (Note that if you do create a class for this, you'd need to override Equals and GetHashCode for the grouping to work correctly.)

Here's an example using a Tuple<double, string>:

var maxIpcSeniorityRowData = dt.AsEnumerable()
    .OrderByDescending(row => row.Field<double>("Mkt Value"))
    .GroupBy(row => new Tuple<double, string>(
        row.Field<double>("BBG IPC code"), 
        row.Field<string>("Seniority")))
    .Select(group =>
    {
        var firstRow = group.First();

        return new object[]
        {
            group.Key.Item1,
            firstRow.Field<string>("Issuer Group"),
            group.Key.Item2,
            group.Sum(row => row.Field<double>("Nom Value")),
            group.Sum(row => row.Field<double>("Mkt Value")),
            firstRow.Field<string>("Rating"),
            firstRow.Field<string>("Sector"),
            firstRow.Field<string>("Analyst")
        };
    })
    .ToList();
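For the separate-class option mentioned above, here is a rough sketch of what such a key type could look like. IpcSeniorityKey, KeyClassDemo and the sample rows are hypothetical names and data chosen for illustration. The point is only that Equals and GetHashCode compare the two column values, so that GroupBy treats two keys built from matching rows as the same key.

```csharp
using System;
using System.Data;
using System.Linq;

// Hypothetical key type; the name and shape are illustrative only.
public sealed class IpcSeniorityKey
{
    public double Ipc { get; }
    public string Seniority { get; }

    public IpcSeniorityKey(double ipc, string seniority)
    {
        Ipc = ipc;
        Seniority = seniority;
    }

    // Two keys are equal when both column values match.
    public override bool Equals(object obj)
    {
        return obj is IpcSeniorityKey other
            && Ipc.Equals(other.Ipc)
            && string.Equals(Seniority, other.Seniority);
    }

    // Equal keys must produce equal hash codes.
    public override int GetHashCode()
    {
        unchecked
        {
            int hash = 17;
            hash = hash * 31 + Ipc.GetHashCode();
            hash = hash * 31 + (Seniority == null ? 0 : Seniority.GetHashCode());
            return hash;
        }
    }
}

public static class KeyClassDemo
{
    // Builds a small table with two matching rows and one different row.
    public static DataTable BuildTable()
    {
        var table = new DataTable();
        table.Columns.Add("BBG IPC code", typeof(double));
        table.Columns.Add("Seniority", typeof(string));
        table.Rows.Add(117896d, "Senior");
        table.Rows.Add(117896d, "Senior");
        table.Rows.Add(117896d, "Covered");
        return table;
    }

    // Counts the groups produced when grouping by the custom key class.
    public static int CountGroups(DataTable table)
    {
        return table.AsEnumerable()
            .GroupBy(row => new IpcSeniorityKey(
                row.Field<double>("BBG IPC code"),
                row.Field<string>("Seniority")))
            .Count();
    }

    public static void Main()
    {
        Console.WriteLine(CountGroups(BuildTable()));
    }
}
```

Without the two overrides, the class would fall back to reference equality, every row would get its own distinct key, and each group would contain exactly one row.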

Source: https://stackoverflow.com/questions/63195468/select-and-sum-datatable-rows-with-criteria

Tags

linq

datatable