问题
I would like to be able to efficiently retrieve a multi-dimensional array of formatted cell values from Excel. When I say formatted values, I mean I would like to get them exactly as they appear in Excel with all the cell NumberFormat applied.
The Range.Value and Range.Value2 properties work great for retrieving the cell values of a large number of cells into a multi-dimensional array. But those are the actual cell values (well at least with Range.Value2 is, I'm not quite sure what Range.Value is doing with respect to some of the values).
If I want to retrieve the actual text that is displayed in the cells, I can use the Range.Text property. This has some caveats. First, you need to AutoFit the cells or else you may get something like #### if not all the text is visible with the current cell width. Secondly, Range.Text does not work for more than one cell at a time so you would have to loop through all of the cells in the range and this can be extremely slow for large data sets.
The other method that I tried is to copy the range into the clipboard and then parse the clipboard text as a tab-separated data stream and transfer it into a multi-dimensional array. This seems to work great, although it is slower than getting Range.Value2, it is much faster for large datasets than getting Range.Text. However, I don't like the idea of using the system clipboard. If this was a really long operation that takes 60 seconds and while that operation is running, the user may decide to switch to another application and would be very unhappy to find that their clipboard either doesn't work or has mysterious data in it.
Is there a way that I can retrieve the formatted cell values to a multi-dimensional array efficiently?
I have added some sample code that is run from a couple ribbon buttons in a VSTO app. The first set some good test values and number formats and the second button will display what they look like when retrieved using one of these methods in a MessageBox.
The sample output on my system is(It could be different on yours due to Regional Settings):
Output using Range.Value
1/25/2008 3:19:32 PM 5.12345
2008-01-25 15:19:32 0.456
Output using Range.Value2
39472.6385648148 5.12345
2008-01-25 15:19:32 0.456
Output using Clipboard Copy
1/25/2008 15:19 5.12
2008-01-25 15:19:32 45.60%
Output using Range.Text and Autofit
1/25/2008 15:19 5.12
2008-01-25 15:19:32 45.60%
The Range.Text and Clipboard methods produce the correct output, but as explained above they both have problems: Range.Text is slow and Clipboard is bad practice.
private void SetSampleValues()
{
var sheet = (Microsoft.Office.Interop.Excel.Worksheet) Globals.ThisAddIn.Application.ActiveSheet;
sheet.Cells.ClearContents();
sheet.Cells.ClearFormats();
var range = sheet.Range["A1"];
range.NumberFormat = "General";
range.Value2 = "2008-01-25 15:19:32";
range = sheet.Range["A2"];
range.NumberFormat = "@";
range.Value2 = "2008-01-25 15:19:32";
range = sheet.Range["B1"];
range.NumberFormat = "0.00";
range.Value2 = "5.12345";
range = sheet.Range["B2"];
range.NumberFormat = "0.00%";
range.Value2 = ".456";
}
private string ArrayToString(ref object[,] vals)
{
int dim1Start = vals.GetLowerBound(0); //Excel Interop will return index-1 based arrays instead of index-0 based
int dim1End = vals.GetUpperBound(0);
int dim2Start = vals.GetLowerBound(1);
int dim2End = vals.GetUpperBound(1);
var sb = new StringBuilder();
for (int i = dim1Start; i <= dim1End; i++)
{
for (int j = dim2Start; j <= dim2End; j++)
{
sb.Append(vals[i, j]);
if (j != dim2End)
sb.Append("\t");
}
sb.Append("\n");
}
return sb.ToString();
}
private void GetCellValues()
{
var sheet = (Microsoft.Office.Interop.Excel.Worksheet)Globals.ThisAddIn.Application.ActiveSheet;
var usedRange = sheet.UsedRange;
var sb = new StringBuilder();
sb.Append("Output using Range.Value\n");
var vals = (object [,]) usedRange.Value; //1-based array
sb.Append(ArrayToString(ref vals));
sb.Append("\nOutput using Range.Value2\n");
vals = (object[,])usedRange.Value2; //1-based array
sb.Append(ArrayToString(ref vals));
sb.Append("\nOutput using Clipboard Copy\n");
string previousClipboardText = Clipboard.GetText();
usedRange.Copy();
string clipboardText = Clipboard.GetText();
Clipboard.SetText(previousClipboardText);
vals = new object[usedRange.Rows.Count, usedRange.Columns.Count]; //0-based array
ParseClipboard(clipboardText,ref vals);
sb.Append(ArrayToString(ref vals));
sb.Append("\nOutput using Range.Text and Autofit\n");
//if you dont autofit, Range.Text may give you something like #####
usedRange.Columns.AutoFit();
usedRange.Rows.AutoFit();
vals = new object[usedRange.Rows.Count, usedRange.Columns.Count];
int startRow = usedRange.Row;
int endRow = usedRange.Row + usedRange.Rows.Count - 1;
int startCol = usedRange.Column;
int endCol = usedRange.Column + usedRange.Columns.Count - 1;
for (int r = startRow; r <= endRow; r++)
{
for (int c = startCol; c <= endCol; c++)
{
vals[r - startRow, c - startCol] = sheet.Cells[r, c].Text;
}
}
sb.Append(ArrayToString(ref vals));
MessageBox.Show(sb.ToString());
}
//requires reference to Microsoft.VisualBasic to get TextFieldParser
private void ParseClipboard(string text, ref object[,] vals)
{
using (var tabReader = new TextFieldParser(new StringReader(text)))
{
tabReader.SetDelimiters("\t");
tabReader.HasFieldsEnclosedInQuotes = true;
int row = 0;
while (!tabReader.EndOfData)
{
var fields = tabReader.ReadFields();
for (int i = 0; i < fields.Length; i++)
vals[row, i] = fields[i];
row++;
}
}
}
private void button1_Click(object sender, RibbonControlEventArgs e)
{
SetSampleValues();
}
private void button2_Click(object sender, RibbonControlEventArgs e)
{
GetCellValues();
}
回答1:
I've found a partial solution. Apply the NumberFormat value to the parsed double of Value2. This only works for single cells as returning an array for NumberFormat with different formats in the array returns System.DBNull.
double.Parse(o.Value2.ToString()).ToString(o.NumberFormat.ToString())
The dates don't work with this though. If you know which columns contains certain things, like a formatted date, you can use DateTime.FromOADate on the double and then value.ToString(format) with the NumberFormat. The code below gets close but is not complete.
<snip>
sb.Append("\nOutput using Range.Value2\n");
vals = (object[,])usedRange.Value2; //1-based array
var format = GetFormat(usedRange);
sb.Append(ArrayToString(ref vals, format));
</snip>
private static object[,] GetFormat(Microsoft.Office.Interop.Excel.Range range)
{
var rows = range.Rows.Count;
var cols = range.Columns.Count;
object[,] vals = new object[rows, cols];
for (int r = 1; r <= rows; ++r)
{
for (int c = 1; c <= cols; ++c)
{
vals[r-1, c-1] = range[r, c].NumberFormat;
}
}
return vals;
}
private static string ArrayToString(ref object[,] vals, object[,] numberformat = null)
{
int dim1Start = vals.GetLowerBound(0); //Excel Interop will return index-1 based arrays instead of index-0 based
int dim1End = vals.GetUpperBound(0);
int dim2Start = vals.GetLowerBound(1);
int dim2End = vals.GetUpperBound(1);
var sb = new StringBuilder();
for (int i = dim1Start; i <= dim1End; i++)
{
for (int j = dim2Start; j <= dim2End; j++)
{
if (numberformat != null)
{
var format = numberformat[i-1, j-1].ToString();
double v;
if (double.TryParse(vals[i, j].ToString(), out v))
{
if (format.Contains(@"/") || format.Contains(":"))
{// parse a date
var date = DateTime.FromOADate(v);
sb.Append(date.ToString(format));
}
else
{
sb.Append(v.ToString(format));
}
}
else
{
sb.Append(vals[i, j].ToString());
}
}
else
{
sb.Append(vals[i, j]);
}
if (j != dim2End)
sb.Append("\t");
}
sb.Append("\n");
}
return sb.ToString();
}
回答2:
One solution to your problem is to use:
Range(XYZ).Value(11) = Range(ABC).Value(11)
As described here, this will:
Returns the recordset representation of the specified Range object in an XML format.
Assuming that your excel is configured in OpenXML format, this will copy the value/formula AND the formatting of range ABC and inject it into range XYZ.
Additionally, this answer explains the difference between Value and Value2.
.Value2 gives you the underlying value of the cell (could be empty, string, error, number (double) or boolean)
.Value gives you the same as .Value2 except if the cell was formatted as currency or date it gives you a VBA currency (which may truncate decimal places) or VBA date.
来源:https://stackoverflow.com/questions/29993785/get-formatted-cell-values-efficiently