I am parsing CSV files to lists of objects with strongly-typed properties. This involves parsing each string value from the file to an IConvertible
type (
If you have a known set of types to convert, you can do a series of if/else if/else checks (or a switch/case on the type name) to distribute each value to a specialized parsing method. This should be pretty fast. This is as described in @Fabio's answer.
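As a rough illustration of that dispatch, here is a minimal sketch. The class name TypeDispatch and the method TryConvert are hypothetical, and the use of the invariant culture is an assumption made so the example behaves the same on every machine; the original question may well need culture-sensitive parsing instead.

```csharp
using System;
using System.Globalization;

public static class TypeDispatch
{
    // Hypothetical helper: routes the input to a specialized TryParse
    // based on the requested target type.
    public static bool TryConvert(Type type, string input, out object value)
    {
        if (type == typeof(int))
        {
            int parsed;
            bool ok = int.TryParse(input, NumberStyles.Integer, CultureInfo.InvariantCulture, out parsed);
            value = parsed; // boxed; holds 0 when parsing failed
            return ok;
        }
        else if (type == typeof(double))
        {
            double parsed;
            bool ok = double.TryParse(input, NumberStyles.Float, CultureInfo.InvariantCulture, out parsed);
            value = parsed;
            return ok;
        }
        else if (type == typeof(DateTime))
        {
            DateTime parsed;
            bool ok = DateTime.TryParse(input, CultureInfo.InvariantCulture, DateTimeStyles.None, out parsed);
            value = parsed;
            return ok;
        }
        else if (type == typeof(string))
        {
            value = input; // strings need no conversion
            return true;
        }

        // Unknown types are treated as a programming error, not a parse failure.
        throw new NotSupportedException("No parser for type " + type.FullName);
    }
}
```

The chain is easy to read for a handful of types, but every new type means editing this method, which is the motivation for a lookup table.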
If you still have performance issues, you can also create a lookup table which will let you add new parsing methods as you need to support them:
Given some basic parsing wrappers:
public delegate bool TryParseMethod<T>(string input, out T value);

public interface ITryParser
{
    bool TryParse(string input, out object value);
}

public class TryParser<T> : ITryParser
{
    private TryParseMethod<T> ParsingMethod;

    public TryParser(TryParseMethod<T> parsingMethod)
    {
        this.ParsingMethod = parsingMethod;
    }

    public bool TryParse(string input, out object value)
    {
        T parsedOutput;
        bool success = ParsingMethod(input, out parsedOutput);
        value = parsedOutput;
        return success;
    }
}
You can then set up a conversion helper which does the lookup and calls the appropriate parser:
using System;
using System.Collections.Generic;

public static class DataConversion
{
    private static Dictionary<Type, ITryParser> Parsers;

    static DataConversion()
    {
        Parsers = new Dictionary<Type, ITryParser>();
        AddParser<DateTime>(DateTime.TryParse);
        AddParser<int>(Int32.TryParse);
        AddParser<double>(Double.TryParse);
        AddParser<decimal>(Decimal.TryParse);
        AddParser<string>((string input, out string value) => { value = input; return true; });
    }

    public static void AddParser<T>(TryParseMethod<T> parseMethod)
    {
        Parsers.Add(typeof(T), new TryParser<T>(parseMethod));
    }

    public static bool Convert<T>(string input, out T value)
    {
        object parseResult;
        bool success = Convert(typeof(T), input, out parseResult);
        if (success)
            value = (T)parseResult;
        else
            value = default(T);
        return success;
    }

    public static bool Convert(Type type, string input, out object value)
    {
        ITryParser parser;
        if (Parsers.TryGetValue(type, out parser))
            return parser.TryParse(input, out value);
        else
            throw new NotSupportedException(String.Format("The specified type \"{0}\" is not supported.", type.FullName));
    }
}
Then usage might look like:
// for a type known at compile time
int intValue;
if (!DataConversion.Convert<int>("3", out intValue))
{
    // log failure
}
// or for a type not known at compile time:
object objectValue;
if (!DataConversion.Convert(myType, dataValue, out objectValue))
{
    // log failure
}
This could probably be made more generic to avoid object boxing and type casting, but as it stands this works fine; optimize that aspect only if you can measure a performance problem from it.
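One possible way to avoid the boxing, sketched under assumptions: keep each parser in a static field of a generic class, so every closed type gets its own slot and the value never passes through object. The names TypedTryParse, TypedParsers and TypedConversion are hypothetical and are not part of the code above; whether this is measurably faster for your data is something to benchmark, not assume.

```csharp
using System;

// Hypothetical delegate, equivalent in shape to TryParseMethod<T> above.
public delegate bool TypedTryParse<T>(string input, out T value);

public static class TypedParsers<T>
{
    // One static field per closed generic type (TypedParsers<int>,
    // TypedParsers<double>, ...), so no dictionary lookup and no boxing.
    public static TypedTryParse<T> Parser;
}

public static class TypedConversion
{
    static TypedConversion()
    {
        // Register the parsers this sketch supports.
        TypedParsers<int>.Parser = int.TryParse;
        TypedParsers<double>.Parser = double.TryParse;
        TypedParsers<decimal>.Parser = decimal.TryParse;
        TypedParsers<DateTime>.Parser = DateTime.TryParse;
    }

    public static bool Convert<T>(string input, out T value)
    {
        TypedTryParse<T> parser = TypedParsers<T>.Parser;
        if (parser == null)
            throw new NotSupportedException("No parser registered for " + typeof(T).FullName);

        // The parsed value is written straight into the typed out parameter.
        return parser(input, out value);
    }
}
```

This only helps when the target type is known at compile time; a Type-based overload for the runtime case would still have to return object.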
EDIT: You can update the DataConversion.Convert method so that if the specified converter is not registered, it either falls back to your TypeConverter method or throws an appropriate exception. It's up to you whether you want a catch-all or a predefined set of supported types that avoids bringing your try/catch back. As it stands, the code has been updated to throw a NotSupportedException with a message indicating the unsupported type. Feel free to tweak it as makes sense. Performance-wise, the catch-all may make sense, since those cases should be few and far between once you register specialized parsers for the most commonly used types.
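If you do want the catch-all, it could look roughly like the following. This is a sketch, not the asker's actual TypeConverter method: the class FallbackConversion and its TryConvert method are hypothetical names, and the choice of ConvertFromInvariantString (rather than a culture-sensitive conversion) is an assumption.

```csharp
using System;
using System.ComponentModel;

public static class FallbackConversion
{
    // Hypothetical catch-all for types that have no registered TryParse wrapper.
    public static bool TryConvert(Type type, string input, out object value)
    {
        TypeConverter converter = TypeDescriptor.GetConverter(type);
        if (converter != null && converter.CanConvertFrom(typeof(string)))
        {
            try
            {
                value = converter.ConvertFromInvariantString(input);
                return true;
            }
            catch (Exception)
            {
                // TypeConverter signals bad input by throwing, and the exact
                // exception type varies by converter, so this catch is broad.
            }
        }

        value = null;
        return false;
    }
}
```

DataConversion.Convert could call something like this in its else branch instead of throwing, so the try/catch cost is paid only for the rarely used types.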
How about constructing a regular expression for each type and applying it to the string before calling Parse? You'd have to build each regular expression so that a string that doesn't match is guaranteed not to parse. This would be a little slower when the string does parse, since you'd pay for the regex test as well, but much faster when it doesn't, because no exception is thrown.
You could put the regex strings in a Dictionary<Type, string>, which would make it simple to determine which regex to use.
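A minimal sketch of that pre-check follows. It stores compiled Regex objects rather than pattern strings, which is a small departure from the Dictionary<Type, string> suggested above. The class name RegexPrecheck and both patterns are illustrative assumptions, not a complete grammar for what Int32.Parse or Double.Parse accept.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class RegexPrecheck
{
    // Hypothetical patterns; real data (thousands separators, whitespace,
    // culture-specific decimal marks) may need looser ones.
    private static readonly Dictionary<Type, Regex> Patterns = new Dictionary<Type, Regex>
    {
        { typeof(int), new Regex(@"\A[+-]?[0-9]{1,10}\z", RegexOptions.Compiled) },
        { typeof(double), new Regex(@"\A[+-]?([0-9]+(\.[0-9]*)?|\.[0-9]+)([eE][+-]?[0-9]+)?\z", RegexOptions.Compiled) },
    };

    // Returns false only when the input certainly does not fit the pattern.
    public static bool LooksParsable(Type type, string input)
    {
        Regex pattern;
        if (!Patterns.TryGetValue(type, out pattern))
            return true; // no pattern registered: let the parser decide

        return input != null && pattern.IsMatch(input);
    }
}
```

A match is only a filter: "9999999999" passes the int pattern but overflows Int32, so the Parse call after it can still fail and still needs handling.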
You could use the TryParse method:
DateTime dateTime;
if (DateTime.TryParse(input, out dateTime))
{
    Console.WriteLine(dateTime);
}
If you know the type you are trying to parse to, use that type's TryParse method:
String value = "123"; // sample input
Int32 parsedValue;
if (Int32.TryParse(value, out parsedValue))
{
    // actions if parsed OK
}
else
{
    // actions if not parsed
}
The same works for other types, given a parsedValue variable of the matching type:
Decimal.TryParse(value, out parsedValue)
Double.TryParse(value, out parsedValue)
DateTime.TryParse(value, out parsedValue)
Or you can use the following workaround:
Create a parse method for every type, all with the same name but different signatures (wrapping TryParse inside each of them):
private bool TryParsing(String value, out Int32 parsedValue)
{
    return Int32.TryParse(value, out parsedValue);
}
private bool TryParsing(String value, out Double parsedValue)
{
    return Double.TryParse(value, out parsedValue);
}
private bool TryParsing(String value, out Decimal parsedValue)
{
    return Decimal.TryParse(value, out parsedValue);
}
private bool TryParsing(String value, out DateTime parsedValue)
{
    return DateTime.TryParse(value, out parsedValue);
}
Then you can call TryParsing with any of these types; the compiler picks the overload from the type of the out argument.
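To make the overload selection concrete, here is a small self-contained sketch. The class OverloadedParsing and its Demo method are hypothetical, and the methods are made public and static purely so the example can stand alone.

```csharp
using System;

public static class OverloadedParsing
{
    public static bool TryParsing(string value, out int parsedValue)
    {
        return int.TryParse(value, out parsedValue);
    }

    public static bool TryParsing(string value, out double parsedValue)
    {
        return double.TryParse(value, out parsedValue);
    }

    public static void Demo()
    {
        int count;
        double price;

        // An out argument must match the parameter type exactly, so the
        // declared type of the variable decides which overload is called.
        bool countOk = TryParsing("42", out count); // int overload
        bool priceOk = TryParsing("7", out price);  // double overload

        Console.WriteLine("{0}: {1}", countOk, count);
        Console.WriteLine("{0}: {1}", priceOk, price);
    }
}
```

Note that this resolution happens at compile time, so it does not help when the target type is only known at runtime, for example from a Type object.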
It depends. If you're using a DateTime, you can always use the TryParse function. This can be an order of magnitude faster than catching an exception from a failed parse.