I have a VB.NET program that handles the content of documents. The program handles high volumes of documents as a "batch" (> 2 million documents; 1 TB total volume). Some of this doc
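The POSIX character class for control characters is [:cntrl:] (from "Regular expression" on Wikipedia). Note that the .NET regex engine does not support POSIX bracket classes; the closest .NET equivalent is the Unicode category \p{Cc}.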
Try
resultString = Regex.Replace(subjectString, "\p{C}+", "")
This will remove all "other" Unicode characters (control, format, private use, surrogate, and unassigned) from your string.
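One caveat with \p{C}: the control category (Cc) includes tab, line feed, and carriage return, and the surrogate category (Cs) matches the UTF-16 code units that make up characters outside the Basic Multilingual Plane, so those would be stripped as well. If that is too aggressive for your documents, a narrower pattern may be a better fit. Below is a minimal sketch, assuming you only want to drop control characters and keep ordinary whitespace. The module name ControlCharCleaner and the function name RemoveControlChars are made up for illustration. It uses a single precompiled Regex, which is probably worthwhile for a batch of this size.

```vbnet
Imports System.Text.RegularExpressions

Module ControlCharCleaner
    ' \p{Cc} matches control characters only. The character class subtraction
    ' -[\t\n\r] keeps tab, line feed, and carriage return in the text.
    ' The Regex is created once and reused for every document in the batch.
    Private ReadOnly ControlChars As New Regex("[\p{Cc}-[\t\n\r]]+", RegexOptions.Compiled)

    ' Hypothetical helper: returns the input with control characters removed.
    Public Function RemoveControlChars(input As String) As String
        Return ControlChars.Replace(input, String.Empty)
    End Function
End Module
```

Usage would look like resultString = RemoveControlChars(subjectString). Whether tab and line breaks should survive depends on what the downstream processing expects, so adjust the subtracted set to suit.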