OpenXML Excel how to change value of a cell when value is in SharedStringTable

纵然是瞬间 submitted on 2019-12-24 17:17:27

Question


I am looking for a safe and efficient way to update the value of a cell whose text may be stored in the SharedStringTable (this appears to be the case for any spreadsheet created by MS Excel).

As the name implies, the SharedStringTable contains strings that may be used by multiple cells.

So simply finding the item in the string table and updating its value is NOT the way to go, as the item may be in use by other cells as well.

As far as I understand one must do the following:

  1. Check if the cell is using the string table

  2. If so, check if the new string is already in the table, in which case just use it (remember to remove the item with the old string if it is no longer in use by any other cell!)

  3. If not, check if the item with the old string is referred to by any other cell in the spreadsheet

  4. If so, create a new item with the new string and refer to it

  5. If not, just update the existing item with the new string
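
Steps 3 to 5 depend on knowing how many cells use a given item. A minimal sketch of counting the references in a single pass, assuming the OpenXML SDK (`DocumentFormat.OpenXml`) is referenced; `SharedStringUsage` and `CountReferences` are hypothetical names, not part of the SDK:

```csharp
using System.Collections.Generic;
using DocumentFormat.OpenXml.Spreadsheet;

static class SharedStringUsage
{
    // Hypothetical helper: maps each shared string index to the number of
    // cells that refer to it, across all the given sheets.
    public static Dictionary<int, int> CountReferences(IEnumerable<SheetData> sheets)
    {
        var counts = new Dictionary<int, int>();
        foreach (SheetData sheetData in sheets)
        {
            foreach (Cell cell in sheetData.Descendants<Cell>())
            {
                // Only cells typed as shared strings hold a table index.
                if (cell.DataType == null
                    || cell.DataType.Value != CellValues.SharedString) continue;
                if (cell.CellValue == null
                    || !int.TryParse(cell.CellValue.Text, out int index)) continue;

                counts.TryGetValue(index, out int n); // n stays 0 if the key is absent
                counts[index] = n + 1;
            }
        }
        return counts;
    }
}
```

An item whose count is 1 is used only by the cell being changed, so it could be updated in place (step 5); any higher count means a new item is needed (step 4).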

Is there an easier solution to this using the OpenXML SDK?

Also consider that one may want to update not only one cell but rather set new (different) values for several cells. So we may be calling the update cell method in a loop ...
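
For the loop case, one possible approach is to index the table once and then only ever reuse or append items, never modify or remove them, so that no other cell can be affected. This is a sketch under assumptions, not an SDK feature: `SharedStringIndex`, `GetOrAdd` and `SetCellText` are hypothetical names, the cell is assumed to hold no formula, and orphaned items are simply left in the table.

```csharp
using System.Collections.Generic;
using DocumentFormat.OpenXml.Spreadsheet;

// Hypothetical helper, not part of the OpenXML SDK.
sealed class SharedStringIndex
{
    private readonly SharedStringTable _table;
    private readonly Dictionary<string, int> _indexByText = new Dictionary<string, int>();
    private int _count; // number of items currently in the table

    public SharedStringIndex(SharedStringTable table)
    {
        _table = table;
        foreach (SharedStringItem item in table.Elements<SharedStringItem>())
        {
            // Keep the first index seen for each distinct text.
            // InnerText flattens rich-text runs into one plain string.
            if (!_indexByText.ContainsKey(item.InnerText))
                _indexByText[item.InnerText] = _count;
            _count++;
        }
    }

    // Returns the index of an existing item with this text, or appends a new item.
    // Note: the table's Count/UniqueCount attributes are not updated here.
    public int GetOrAdd(string text)
    {
        if (_indexByText.TryGetValue(text, out int index)) return index;
        _table.AppendChild(new SharedStringItem(new Text(text)));
        _indexByText[text] = _count;
        return _count++;
    }

    // Points the cell at the shared string item holding the given text.
    public void SetCellText(Cell cell, string text)
    {
        cell.RemoveAllChildren<InlineString>(); // drop any inline text first
        cell.CellValue = new CellValue(GetOrAdd(text).ToString());
        cell.DataType = CellValues.SharedString;
    }
}
```

The index would be built once per document and `SetCellText` called for each cell in the loop, which avoids rescanning the table and all sheets on every update. One caveat: a rich-text item with the same flattened text is reused, so the cell would pick up that item's formatting.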


Answer 1:


First take on this. It appears to work for my particular case, but it must be possible to improve on it or, even better, to do it in a totally different way:

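// Requires:
// using System.Collections.Generic;
// using System.Linq;
// using DocumentFormat.OpenXml.Packaging;
// using DocumentFormat.OpenXml.Spreadsheet;

private static void UpdateCell(SharedStringTable sharedStringTable, 
   Dictionary<string, SheetData> sheetDatas, string sheetName, 
   string cellReference, string text)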
{
   // CellReference is optional in the file format, so guard against null.
   Cell cell = sheetDatas[sheetName].Descendants<Cell>()
    .FirstOrDefault(c => c.CellReference?.Value == cellReference);
   if (cell == null) return;
   if (cell.DataType == null || cell.DataType != CellValues.SharedString)
   {
    cell.RemoveAllChildren();
    cell.AppendChild(new InlineString(new Text { Text = text }));
    cell.DataType = CellValues.InlineString;
    return;
   }
   // The cell refers to the string table. Check if the new text is already in the string table; if so, use it.
   IEnumerable<SharedStringItem> sharedStringItems 
    = sharedStringTable.Elements<SharedStringItem>();
   int i = 0;
   foreach (SharedStringItem sharedStringItem in sharedStringItems)
   {
    if (sharedStringItem.InnerText == text)
    {
       cell.CellValue = new CellValue(i.ToString());
       // TODO: Should clean up, ie remove item with old text from string table if it is no longer in use.
       return;
    }
    i++;
   }
   // The new text is not in the string table. Check whether any other cell in the workbook refers to the item with the old text.
   foreach (SheetData sheetData in sheetDatas.Values)
   {
    var cells = sheetData.Descendants<Cell>();
    foreach (Cell cell0 in cells)
    {
       if (cell0.Equals(cell)) continue;
       if (cell0.DataType != null 
       && cell0.DataType == CellValues.SharedString 
       && cell0.CellValue != null
       && cell0.CellValue.InnerText == cell.CellValue.InnerText)
       {
        // Another cell refers to the item with the old text, so it cannot be updated in place. Append a new item instead.
        // After the loop above, i equals the number of existing items, i.e. the index of the appended item.
        sharedStringTable.AppendChild(new SharedStringItem(new Text(text)));
        cell.CellValue.Text = i.ToString();
        return;
       }
    }
   }
   // No other cell refers to the old item, so update it in place.
   sharedStringItems.ElementAt(int.Parse(cell.CellValue.InnerText)).Text = new Text(text);
}

....

private static void DoIt(string filePath)
{
   using (SpreadsheetDocument spreadSheet = SpreadsheetDocument.Open(filePath, true))
   {
    SharedStringTable sharedStringTable 
       = spreadSheet.WorkbookPart.GetPartsOfType<SharedStringTablePart>()
        .First().SharedStringTable;
    Dictionary<string, SheetData> sheetDatas = new Dictionary<string, SheetData>();
    foreach (var sheet in spreadSheet.WorkbookPart.Workbook.Descendants<Sheet>())
    {
       SheetData sheetData 
        = (spreadSheet.WorkbookPart.GetPartById(sheet.Id) as WorksheetPart)
           .Worksheet.GetFirstChild<SheetData>();
       sheetDatas.Add(sheet.Name, sheetData);
    }
    UpdateCell(sharedStringTable, sheetDatas, "Sheet1", "A2", "Mjau");
   }
}

WARNING: Do NOT use the above as is. It works with one particular spreadsheet, and it very likely leaves cases unhandled in other situations. This was my first attempt at OpenXML for spreadsheets. I ended up following the suggestion made by George Polevoy, which is much easier and appears to have no ill side effects. (That said, there are a million other issues to handle when manipulating spreadsheets that may be edited outside your control...)




Answer 2:


As you can see, updating the shared string table really keeps developers busy.

In my experience the shared string table does not add anything in terms of performance or file size. The parts of an OpenXML file are compressed inside the package container anyway, so even massively duplicated strings hardly affect the file size.

Microsoft Excel writes all strings to the shared string table, even when there is no duplication.

I'd recommend simply converting everything to InlineStrings before modifying the document; further operations then become as simple as they get.

You can write the strings simply as InlineStrings, and the result is a functionally equivalent document file.
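
A minimal sketch of such a conversion, assuming the OpenXML SDK (`DocumentFormat.OpenXml`) is referenced and the document was opened for editing. `InlineStringConverter` and `ConvertSharedStringsToInline` are hypothetical names, and rich-text formatting inside shared items is assumed to be expendable:

```csharp
using System.Collections.Generic;
using System.Linq;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Spreadsheet;

static class InlineStringConverter
{
    // Hypothetical helper: rewrites every shared-string cell as an inline string.
    public static void ConvertSharedStringsToInline(SpreadsheetDocument document)
    {
        WorkbookPart workbookPart = document.WorkbookPart;
        SharedStringTablePart tablePart = workbookPart.SharedStringTablePart;
        if (tablePart == null) return; // the workbook has no shared strings

        // Materialize the items once so that lookup by index is cheap.
        List<SharedStringItem> items = tablePart.SharedStringTable
            .Elements<SharedStringItem>().ToList();

        foreach (WorksheetPart worksheetPart in workbookPart.WorksheetParts)
        {
            foreach (Cell cell in worksheetPart.Worksheet.Descendants<Cell>())
            {
                if (cell.DataType == null
                    || cell.DataType.Value != CellValues.SharedString) continue;
                if (cell.CellValue == null
                    || !int.TryParse(cell.CellValue.Text, out int index)) continue;
                if (index < 0 || index >= items.Count) continue; // dangling index

                // InnerText flattens rich-text runs (and any phonetic runs)
                // into plain text; per-run formatting is lost.
                string text = items[index].InnerText;

                cell.RemoveAllChildren();
                cell.AppendChild(new InlineString(new Text(text)));
                cell.DataType = CellValues.InlineString;
            }
        }
    }
}
```

After this pass, changing a cell's text only touches that cell, as in the non-shared branch of `UpdateCell` in Answer 1. The sketch leaves the now unreferenced string table in the package.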

Microsoft Excel would convert it back to shared string tables when the file is edited, but who cares.

I would suggest that the shared string table feature be removed in future versions of the standard, unless it is justified by some sound benchmarks.



Source: https://stackoverflow.com/questions/33719193/openxml-excel-how-to-change-value-of-a-cell-when-value-is-in-sharedstringtable
