OpenXML - Cell.DateType is null

≯℡__Kan透↙ 提交于 2020-12-05 05:35:40

问题


I can't determine when a Cell is a date.

I noticed the DataType is null so I can't distinguish if it's a Number of a Date.

I am using the next code to extract the cells:

WorksheetPart worksheetPart = (WorksheetPart)workbookPart.GetPartById(worksheetId);
SheetData sheetData = worksheetPart.Worksheet.GetFirstChild<SheetData>();
Row[] rows = worksheetPart.Worksheet.Descendants<Row>().ToArray();
for (int i = 0; i < rows.Length; i++)
{
    List<Cell> cells = rows[i].Elements<Cell>().ToList();
    foreach (var cell in cells) 
    {
        if (cell.DataType != null && cell.DataType.Value == CellValues.Date)
            //this line is not hit for some reason
         }
    }
}

Am I missing something?


回答1:


In short it is null because it is supposed to be for numeric and date types.

OpenXML documentation on msdn

The value of the DataType property is null for numeric and date types. It contains the value CellValues.SharedString for strings, and CellValues.Boolean for Boolean values.

There is a way though to distinguish between date and number cell formats using the NumberFormatId on the CellFormat. The trick is finding what id maps to what format. You can find out what format to use by creating new excel file and set a cell to the format in question (i.e. date):

Then extract excel file using 7zip and look inside xl/styles.xml file:

In the image above you can see this formatId 14 translates to short date. For a complete list of formats please refer to ECMA-376 documentation for Office Open XML formats (table should be buried somewhere inside part 4).

I created an enumeration for the most common formatIds:

private enum Formats
    {
        General = 0,
        Number = 1,
        Decimal = 2,
        Currency = 164,
        Accounting = 44,
        DateShort = 14,
        DateLong = 165,
        Time = 166,
        Percentage = 10,
        Fraction = 12,
        Scientific = 11,
        Text = 49
    }

You could then create a helper function that will get you the formatted value the way you would like:

private static string GetFormattedCellValue(WorkbookPart workbookPart, Cell cell)
    {
        if (cell == null)
        {
            return null;
        }

        string value = "";
        if (cell.DataType == null) // number & dates
        {
            int styleIndex = (int)cell.StyleIndex.Value;
            CellFormat cellFormat = (CellFormat)workbookPart.WorkbookStylesPart.Stylesheet.CellFormats.ElementAt(styleIndex);
            uint formatId = cellFormat.NumberFormatId.Value;

            if (formatId == (uint)Formats.DateShort || formatId == (uint)Formats.DateLong)
            {
                double oaDate;
                if (double.TryParse(cell.InnerText, out oaDate))
                {
                    value = DateTime.FromOADate(oaDate).ToShortDateString();
                }
            }
            else
            {
                value = cell.InnerText;
            }
        }
        else // Shared string or boolean
        {
            switch (cell.DataType.Value)
            {
                case CellValues.SharedString:
                    SharedStringItem ssi = workbookPart.SharedStringTablePart.SharedStringTable.Elements<SharedStringItem>().ElementAt(int.Parse(cell.CellValue.InnerText));
                    value = ssi.Text.Text;
                    break;
                case CellValues.Boolean:
                    value = cell.CellValue.InnerText == "0" ? "false" : "true";
                    break;
                default:
                    value = cell.CellValue.InnerText;
                    break;
            }
        }

        return value;
    }



回答2:


The previous answer states that the formats are documented in Part 4 but apparently they have been moved to Part 1. I found them by searching for #,##0.00. They are currently in 18.8.30 numFmt (Number Format) for the number formats whose formatCode value is implied rather than explicitly saved in the file. See 18.8.31 numFmts (Number Formats) for number formats in this workbook, consisting of a sequence of numFmt records, where each numFmt record defines a particular number format, indicating how to format and render the numeric value of a cell.

The following are the implied formatCode attribute for all languages.

ID formatCode
0 General
1 0
2 0.00
3 #,##0
4 #,##0.00
9 0%
10 0.00%
11 0.00E+00
12 # ?/?
13 # ??/??
14 mm-dd-yy
15 d-mmm-yy
16 d-mmm
17 mmm-yy
18 h:mm AM/PM
19 h:mm:ss AM/PM
20 h:mm
21 h:mm:ss
22 m/d/yy h:mm
37 #,##0 ;(#,##0)
38 #,##0 ;[Red](#,##0)
39 #,##0.00;(#,##0.00)
40 #,##0.00;[Red](#,##0.00)
45 mm:ss
46 [h]:mm:ss
47 mmss.0
48 ##0.0E+0
49 @

There are many other implied format codes that are language-dependent, such as:

28 [$-404]e"年"m"月"d"日" m"月"d"日"



回答3:


Have you tried cell.HasValue insteed? Because int and Datetime is not always nullable, it depends of how the code is written.



来源:https://stackoverflow.com/questions/36670768/openxml-cell-datetype-is-null

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!