We are using the vertical bar |
(|
) character as a field separator in one of our modules. so the users should not use this character in a title.
If they do use it, I would like to replace it with a similar character.
Is there a Unicode replacement for it? The only character that I have found and looks similar to it is broken vertical bar ¦
(¦
).
I do not understand what you really need. Do you need to change the separator sequence to something guaranteed not to exist in the dataset?
If so, then that’s what Unicode’s 66 “non-character” code points are specifically designed for. You can use them as internal sentinels knowing that they cannot occur in valid data.
If you’re just looking for a visual lookalike, that’s very different. I would not suggest that, because there are lots of confusables. Here are just a few of those:
U+0007C | GC=Sm SC=Common VERTICAL LINE
U+000A6 ¦ GC=So SC=Common BROKEN BAR
U+002C8 ˈ GC=Lm SC=Common MODIFIER LETTER VERTICAL LINE
U+002CC ˌ GC=Lm SC=Common MODIFIER LETTER LOW VERTICAL LINE
U+02016 ‖ GC=Po SC=Common DOUBLE VERTICAL LINE
U+023D0 ⏐ GC=So SC=Common VERTICAL LINE EXTENSION
U+02758 ❘ GC=So SC=Common LIGHT VERTICAL BAR
U+02759 ❙ GC=So SC=Common MEDIUM VERTICAL BAR
U+0275A ❚ GC=So SC=Common HEAVY VERTICAL BAR
U+02AF4 ⫴ GC=Sm SC=Common TRIPLE VERTICAL BAR BINARY RELATION
U+02AF5 ⫵ GC=Sm SC=Common TRIPLE VERTICAL BAR WITH HORIZONTAL STROKE
U+02AFC ⫼ GC=Sm SC=Common LARGE TRIPLE VERTICAL BAR OPERATOR
U+02AFE ⫾ GC=Sm SC=Common WHITE VERTICAL BAR
U+02AFF ⫿ GC=Sm SC=Common N-ARY WHITE VERTICAL BAR
U+0FF5C | GC=Sm SC=Common FULLWIDTH VERTICAL LINE
U+0FFE4 ¦ GC=So SC=Common FULLWIDTH BROKEN BAR
There's a 'light vertical bar' in Unicode: ❘, codepoint U+2758
http://www.fileformat.info/info/unicode/char/007c/index.htm
See Also:
- latin letter dental click U+01C0
- hebrew punctuation paseq U+05C0
- divides U+2223
- light vertical bar U+2758
Unicode, and indeed ASCII before it, has characters that are designed to be used for exactly your situation.
There are characters that are designed to be used as:
- Unit Separator (
␟
): Between fields of a record, or members of a row. - Record Separators (
␞
): End of a record or row
Those characters you see are the visual representations:
Now in reality you aren't supposed to use those characters. The actual characters go back to the ASCII days:
Character Symbol ASCII Unicode Unicode name
---------------- ------ ----- ------- -------------------------
Unit separator ␟ 0x0F U+001F Information separator one
Record separator ␞ 0x1E U+001E Information separator two
Unfortunately the actual record separator and unit separator characters are unprintable:
- Field separator:
- Record separator:
Which is why it is nice that the symbols exist for those characters:
- Field separator: ␟
- Record separator: ␞
And nothing stops you from using those characters themselves:
AUD␟Australian dollar␟0.923
BRL␟Brazilian real␟0.3443
CNY␟Chinese renminbi␟0.1926
EUR␟European euro␟1.5009
JPY␟Japanese yen␟0.01229
MXN␟Mexican peso␟0.06894
NOK␟Norwegian krone␟0.154
RUB␟Russian ruble␟0.02074
CHF␟Swiss franc␟1.3448
GBP␟UK pound sterling␟1.6844
VND␟Vietnamese dong␟0.000057
I know you said you wanted something visually similar. But:
- stackoverflow is a wiki, where we add useful information
- it's nice when there's an exact intended solution for a given problem
来源:https://stackoverflow.com/questions/10572627/vertical-bar-unicode-replacement