How to find datetime values in an Epplus ExcelWorksheet

淺唱寂寞╮ 提交于 2020-01-21 07:30:31

问题


I have a working ExcelPackage function in an MVC 5 application where I can successfully output a strongly-typed model to a new Excel file.

I have a case where a specific column is nullable DateTime with four possible values:

  • null
  • date only
  • time only
  • date and time

I have extracted a function that does the job of identifying any DateTime values in the Worksheet. Although the DateTime values are consistently identified and formatted correctly, I can't help but think there might be a better way to achieve the same outcome than a nested loop.

Because this function (and the containing class) are designed to accept any IEnumerable, I cannot predict which column -- or even if -- the IEnumerable will contain DateTime values. For that reason, I cannot hard code for a known range of cells.

To provide better context, the class constructor is pasted here, followed by the function that I am trying to make better. Even if you take away the comments, it's still a pretty ugly set of nested code.

In summary, my question is this: am I missing a more elegant or simpler coding approach in terms of the C# language or the Nuget Epplus package?

public class EpplusExcelPackage<T>
{
    private IEnumerable<T> _data;
    private string _reportName;

    public EpplusExcelPackage(IEnumerable<T> Data, string ReportName)
    {
        this._data = Data;
        this._reportName = ReportName;    
    }

    // much more code...

This is the method I hope to make more efficient:

private static void FormatDateTimeValuesInWorksheet(ExcelWorksheet worksheet)
{
    /* correctly format datetime values as:
     *     if date only, format as shortdate 
     *     if time only, format as am/pm time
     *     if date & time present, format as default datetime */

    // the worksheet is data is a strongly-typed model, populated in the model constructor

    System.DateTime dateValue; // used as the out variable of DateTime.TryParse()

    // nested for-loop to find datetime values in worksheet
    for (int i = worksheet.Dimension.Start.Column; i < worksheet.Dimension.End.Column; i++)
    {
        for (int j = worksheet.Dimension.Start.Row; j < worksheet.Dimension.End.Row; j++)
        {
            // ignore null cell values to prevent null exception error
            if (worksheet.Cells[i, j].Value != null)
            {
                // convert the cell value to string: required by TryParse()
                string cellValue = worksheet.Cells[i, j].Value.ToString();

                // identify type of datetime and format accordingly
                if (DateTime.TryParse(cellValue, out dateValue))
                {
                    if (dateValue.Date == Convert.ToDateTime("12/30/1899"))
                    {
                        worksheet.Cells[i, j].Value = dateValue.ToShortTimeString();
                    }
                    else if (dateValue.TimeOfDay.TotalSeconds == 0)
                    {
                        worksheet.Cells[i, j].Value = dateValue.ToShortDateString();
                    }
                    else // do not change
                    {
                        worksheet.Cells[i, j].Value = worksheet.Cells[i, j].Value;
                    }
                }
            }
        }
    }
}

回答1:


It comes down to how confident you are that the source excel tables are "properly" formatted. By that I mean are they stored as proper dates (i.e. numeric) or could you have the common excel problem of "numbers stored as string".

If the data is generally clean then you could avoid much of the casting back and forth you are doing to strings and dates by checking their types. Even this is not completely straight forward since Epplus likes to do its own interpretation when importing dates.

Take a look at this table (focus on col A):

Rows 1 - 4 have "properly" formatted data. That means that the dates and time are stored in excel as doubles. Rows 5-8 are "badly" formatted - numbers (and date/time) stored as string. If you run this:

var workbook = pck.Workbook;
var worksheet = workbook.Worksheets.First();
var cells = worksheet.Cells;

foreach (var cell in cells)
    Console.WriteLine($"{{Cell: {cell.Address}, Display: {cell.Text}, Value: {cell.Value}, Type: {cell.Value.GetType()}}}");

You get this in the output:

{Cell: A1, Display: 11:33:00 AM, Value: 0.48125, Type: System.Double}
{Cell: A2, Display: 1/1/2016, Value: 1/1/2016 12:00:00 AM, Type: System.DateTime}
{Cell: A3, Display: 1/1/16 11:33 AM, Value: 42370.48125, Type: System.Double}
{Cell: A4, Display: 1264, Value: 1264, Type: System.Double}
{Cell: A5, Display: 11:33:00 AM, Value: 11:33:00 AM, Type: System.String}
{Cell: A6, Display: 1/1/2016, Value: 1/1/2016, Type: System.String}
{Cell: A7, Display: 1/1/2016  11:33:00 AM, Value: 1/1/2016  11:33:00 AM, Type: System.String}
{Cell: A8, Display: 1264, Value: 1264, Type: System.String}

Since dates and times are just numbers technically (the integer part is the date and the decimal is the time) this gives you a way to convert or separate them. A time double of 0.0 means 00:00:00. Note that row 3 shows as a System.DateTime because, like I said, Epplus just happens to recognize that Excel Style as such but the others are Doubles.

So, you could use a Type check and avoid much of the string conversion and comparison. Again, if you are worried about badly formatted data then your approach is probably as good as any. I would suggest making the value Convert.ToDateTime("12/30/1899")) a constant somewhere before the for loops rather then recreating it every time you increment to save on some cpu cycles.



来源:https://stackoverflow.com/questions/37495707/how-to-find-datetime-values-in-an-epplus-excelworksheet

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!