I am using the code below to read a ~2.5 GB XML file as fast as I can (thanks to MemoryMappedFile). However, I am getting the following exception: "'.', hexadecimal value 0x00, is an invalid character. Line 9778, position 73249406." I believe it is due to an encoding problem. How do I make sure that the MemoryMappedViewStream reads the file as UTF-8?
static void Main(string[] args)
{
    using (var file = MemoryMappedFile.CreateFromFile(@"d:\temp\temp.xml", FileMode.Open, "MyMemMapFile"))
    {
        using (MemoryMappedViewStream stream = file.CreateViewStream())
        {
            Read(stream);
        }
    }
}

static void Read(Stream stream)
{
    using (XmlReader reader = XmlReader.Create(stream))
    {
        reader.MoveToContent();
        while (reader.Read())
        {
        }
    }
}
You could use the StreamReader class to set the encoding:
static void Main(string[] args)
{
    using (var file = MemoryMappedFile.CreateFromFile(@"d:\temp\temp.xml", FileMode.Open, "MyMemMapFile"))
    {
        using (MemoryMappedViewStream stream = file.CreateViewStream())
        {
            Read(stream);
        }
    }
}

static void Read(Stream stream)
{
    // Wrapping the stream in a StreamReader fixes the encoding to UTF-8 (Encoding is in System.Text).
    using (XmlReader reader = XmlReader.Create(new StreamReader(stream, Encoding.UTF8)))
    {
        reader.MoveToContent();
        while (reader.Read())
        {
        }
    }
}
Hope this helps.
MSDN says the following:
"The XmlReader scans the first bytes of the stream looking for a byte order mark or other sign of encoding"
Does your XML file specify an encoding? For example:
<?xml version="1.0" encoding="UTF-8"?>
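Another thing that may be worth checking, since the exception complains about 0x00 rather than a malformed byte sequence: the documentation for CreateViewStream() notes that views are provided in units of system pages, so a view created without an explicit size can be larger than the file on disk, with the extra bytes being zeros. The following is a minimal sketch under the assumption that this padding is what the reader trips over; it is not a confirmed diagnosis. It passes the file length to CreateViewStream(offset, size) so the stream ends where the file ends. The class name MappedXml and the method ReadAll are made up for illustration, and the map name from the question is left out because it is not needed here.

```csharp
using System.IO;
using System.IO.MemoryMappedFiles;
using System.Xml;

static class MappedXml
{
    // Hypothetical helper: reads every node of the XML file at 'path'
    // through a memory-mapped view and returns how many nodes were read.
    public static int ReadAll(string path)
    {
        // Actual size of the file on disk, in bytes.
        long length = new FileInfo(path).Length;
        int nodes = 0;

        using (var file = MemoryMappedFile.CreateFromFile(path, FileMode.Open))
        // Limit the view to the file length instead of taking the default
        // view, which may be rounded up to the next system page.
        using (MemoryMappedViewStream stream = file.CreateViewStream(0, length))
        using (XmlReader reader = XmlReader.Create(stream))
        {
            reader.MoveToContent();
            while (reader.Read())
            {
                nodes++;
            }
        }
        return nodes;
    }

    static void Main(string[] args)
    {
        // Falls back to the path from the question when no argument is given.
        string path = args.Length > 0 ? args[0] : @"d:\temp\temp.xml";
        System.Console.WriteLine(ReadAll(path));
    }
}
```

With the view limited this way, XmlReader can still detect the encoding from the byte order mark or the XML declaration as described above, so no StreamReader wrapper should be needed for a file that really is UTF-8.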
Source: https://stackoverflow.com/questions/7125050/change-the-encoding-to-utf-8-on-a-stream-memorymappedviewstream