Question
Is it possible to submit input to an FParsec parser in chunks, as from a socket? If not, is it possible to retrieve the current result and unparsed portion of an input stream so that I might accomplish this? I'm trying to run the chunks of input coming in from SocketAsyncEventArgs without buffering entire messages.
Update
The reason for noting the use of SocketAsyncEventArgs was to denote that sending data to a CharStream might result in asynchronous access to the underlying Stream. Specifically, I'm looking at using a circular buffer to push the data coming in from the socket. I remember the FParsec documentation noting that the underlying Stream should not be accessed asynchronously, so I had planned on manually controlling the chunked parsing.
Ultimate questions:
- Can I use a circular buffer under my Stream passed to the CharStream?
- Do I not need to worry myself with manually controlling the chunking in this scenario?
Answer 1:
The normal version of FParsec (though not the Low-Trust version) reads the input chunk-wise, or "block-wise", as I call it in the CharStream documentation. Thus, if you construct a CharStream from a System.IO.Stream and the content is large enough to span multiple CharStream blocks, you can start parsing before you've fully retrieved the input.
Note, however, that the CharStream will consume the input stream in chunks of a fixed (but configurable) size, i.e. it will call the Read method of the System.IO.Stream as often as is necessary to fill a complete block. Hence, if you parse the input faster than you can retrieve new input, the CharStream may block even though there is already some unparsed input, because there's not yet enough input to fill a complete block.
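As a rough illustration of parsing straight from a stream, the sketch below runs a parser with FParsec's runParserOnStream on a background task, so that any blocking Read inside the CharStream does not stall the caller. The message format (whitespace-separated integers), the stream name "socket", and the function names messages, parseInBackground and parseSocket are assumptions made for this example, not part of the question.

```fsharp
open System.IO
open System.Net.Sockets
open System.Text
open FParsec

// Hypothetical message format: whitespace-separated integers up to end of input.
let messages : Parser<int32 list, unit> =
    spaces >>. many (pint32 .>> spaces) .>> eof

// Runs the parser on a thread-pool task. The CharStream created by
// runParserOnStream pulls blocks from `input` as the parser advances,
// and may block inside Read while it waits for a block to fill.
let parseInBackground (input: Stream) =
    async { return runParserOnStream messages () "socket" input Encoding.UTF8 }
    |> Async.StartAsTask

// Wraps a connected socket in a NetworkStream (not seekable) and parses it.
// The second argument leaves ownership of the socket with the caller.
let parseSocket (socket: Socket) =
    parseInBackground (new NetworkStream(socket, false))
```

With this particular parser the task only completes once the stream reports end of input (for a socket, when the peer closes the connection), because the parser ends with eof. To act on messages as they arrive, the parser itself would have to hand results over while it runs, for example through the user state or a callback.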
Update
The answer(s) to your ultimate questions: 42.
- How you implement the Stream from which you construct the CharStream is entirely up to you. The restriction you're remembering that excludes parallel access only applies to the CharStream class, which isn't thread safe.
- Implementing the Stream as a circular buffer will likely restrict the maximum distance over which you can backtrack.
- The block size of the CharStream influences how far you can backtrack when the Stream does not support seeking.
- The simplest way to parse input asynchronously is to do the parsing in an async task (i.e. on a background thread). In the task you could simply read the socket synchronously, or, if you don't trust the buffering by the OS, you could use a stream class like the BlockingStream described in the article you linked in the second comment below.
- If the input can be easily separated into independent chunks (e.g. lines for a line-based text format), it might be more efficient to chunk it up yourself and then parse the input chunk by chunk.
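The last point might look roughly like the following sketch for a line-based format. Each line is read from the stream and handed to FParsec's run as an independent string, so no CharStream ever waits for more than one line of input. The line format (comma-separated integers) and the names lineParser, parseLines and demo are assumptions made for illustration.

```fsharp
open System.IO
open System.Text
open FParsec

// Hypothetical line format: comma-separated integers, e.g. "1,2,3".
let lineParser : Parser<int32 list, unit> =
    sepBy1 pint32 (pchar ',') .>> eof

// Reads the stream line by line and parses each line on its own,
// passing every ParserResult to the `handle` callback as soon as it is available.
let parseLines (input: Stream) (handle: ParserResult<int32 list, unit> -> unit) =
    use reader = new StreamReader(input, Encoding.UTF8)
    let mutable line = reader.ReadLine()
    while not (isNull line) do
        handle (run lineParser line)
        line <- reader.ReadLine()

// Usage with an in-memory stream standing in for the socket.
let demo () =
    use stream = new MemoryStream(Encoding.UTF8.GetBytes "1,2,3\n4,5\n")
    parseLines stream (function
        | Success (xs, _, _) -> printfn "parsed %A" xs
        | Failure (msg, _, _) -> printfn "error: %s" msg)
```

Because every line gets a fresh parser run, backtracking can never cross a line boundary, which is only acceptable when the lines really are independent.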
Source: https://stackoverflow.com/questions/8891019/chunked-parsing-with-fparsec