Question

In an application that accepts, stores, processes, and displays Unicode text (for the sake of discussion, say it's a web application), which characters should always be removed from incoming text?

I can think of some, mostly listed in the Wikipedia article on C0 and C1 control codes:

- The range 0x00 - 0x1F (the C0 control characters), excluding 0x09 (tab), 0x0A (LF), and 0x0D (CR)
- The range 0x7F - 0x9F (DEL and the C1 control characters)

Ranges of characters that can safely be accepted would