Vertical bar (|) Unicode replacement

Submitted by 二次信任 on 2019-12-03 16:25:37

I do not understand what you really need. Do you need to change the separator sequence to something guaranteed not to exist in the dataset?

If so, then that’s what Unicode’s 66 noncharacter code points are for. They are permanently reserved for internal use and will never be assigned to a character, so you can use them as internal sentinels knowing that they should not occur in well-behaved interchanged data.
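As a rough sketch of the sentinel idea, the Python below tests whether a code point is one of the 66 noncharacters and uses one as a field separator. The names `is_noncharacter`, `SENTINEL` and `join_fields` are made up for illustration, and picking U+FDD0 is just an assumption; any noncharacter would do. It still validates the input, since nothing physically prevents a noncharacter from turning up in a string.

```python
def is_noncharacter(cp: int) -> bool:
    """Return True if cp is one of Unicode's 66 noncharacter code points."""
    # 32 contiguous noncharacters: U+FDD0..U+FDEF
    if 0xFDD0 <= cp <= 0xFDEF:
        return True
    # The last two code points of each of the 17 planes: U+nFFFE and U+nFFFF
    return 0 <= cp <= 0x10FFFF and (cp & 0xFFFE) == 0xFFFE


# Hypothetical choice of sentinel; any noncharacter would serve.
SENTINEL = "\uFDD0"


def join_fields(fields):
    """Join fields with the sentinel, refusing input that already contains it."""
    for field in fields:
        if SENTINEL in field:
            raise ValueError("field already contains the sentinel")
    return SENTINEL.join(fields)


if __name__ == "__main__":
    count = sum(is_noncharacter(cp) for cp in range(0x110000))
    print(count)  # 32 + 2 * 17 noncharacters
    print(join_fields(["a|b", "c"]).split(SENTINEL))
```

A literal `|` inside a field survives the round trip because the separator is no longer a character that real text uses.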

If you’re just looking for a visual lookalike, that’s very different. I would not suggest that, because there are lots of confusables. Here are just a few of those:

U+0007C ‭ |  GC=Sm SC=Common       VERTICAL LINE
U+000A6 ‭ ¦  GC=So SC=Common       BROKEN BAR
U+002C8 ‭ ˈ  GC=Lm SC=Common       MODIFIER LETTER VERTICAL LINE
U+002CC ‭ ˌ  GC=Lm SC=Common       MODIFIER LETTER LOW VERTICAL LINE
U+02016 ‭ ‖  GC=Po SC=Common       DOUBLE VERTICAL LINE
U+023D0 ‭ ⏐  GC=So SC=Common       VERTICAL LINE EXTENSION
U+02758 ‭ ❘  GC=So SC=Common       LIGHT VERTICAL BAR
U+02759 ‭ ❙  GC=So SC=Common       MEDIUM VERTICAL BAR
U+0275A ‭ ❚  GC=So SC=Common       HEAVY VERTICAL BAR
U+02AF4 ‭ ⫴  GC=Sm SC=Common       TRIPLE VERTICAL BAR BINARY RELATION
U+02AF5 ‭ ⫵  GC=Sm SC=Common       TRIPLE VERTICAL BAR WITH HORIZONTAL STROKE
U+02AFC ‭ ⫼  GC=Sm SC=Common       LARGE TRIPLE VERTICAL BAR OPERATOR
U+02AFE ‭ ⫾  GC=Sm SC=Common       WHITE VERTICAL BAR
U+02AFF ‭ ⫿  GC=Sm SC=Common       N-ARY WHITE VERTICAL BAR
U+0FF5C  | GC=Sm SC=Common       FULLWIDTH VERTICAL LINE
U+0FFE4  ￤ GC=So SC=Common       FULLWIDTH BROKEN BAR
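If you want to check a listing like this yourself, here is a small sketch using Python's standard `unicodedata` module. The helper names `describe` and `folds_to_ascii_bar` and the shortened `CANDIDATES` list are my own, not from any tool. The module does not expose the script property, so the `SC=` column is left out. The second helper shows one concrete risk with lookalikes: NFKC normalization turns some of them back into a plain `|`.

```python
import unicodedata

# A few of the code points from the listing above (shortened for illustration).
CANDIDATES = [0x007C, 0x00A6, 0x2016, 0x2758, 0xFF5C, 0xFFE4]


def describe(cp: int) -> str:
    """Format one code point with its general category and character name."""
    ch = chr(cp)
    return f"U+{cp:05X}  {ch}  GC={unicodedata.category(ch)}  {unicodedata.name(ch)}"


def folds_to_ascii_bar(cp: int) -> bool:
    """True if NFKC normalization maps this character to the ASCII bar."""
    return unicodedata.normalize("NFKC", chr(cp)) == "|"


if __name__ == "__main__":
    for cp in CANDIDATES:
        note = "  (NFKC folds this to |)" if folds_to_ascii_bar(cp) else ""
        print(describe(cp) + note)
```

So if anything downstream normalizes the text, a FULLWIDTH VERTICAL LINE "separator" can silently turn into the very character you were trying to avoid.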
user1254893

There's a 'light vertical bar' in Unicode: ❘, codepoint U+2758

http://www.fileformat.info/info/unicode/char/007c/index.htm

See Also:

  • latin letter dental click U+01C0
  • hebrew punctuation paseq U+05C0
  • divides U+2223
  • light vertical bar U+2758

Unicode, and indeed ASCII before it, has characters that are designed to be used for exactly your situation.

There are characters that are designed to be used as:

  • Unit Separator (␟): Between fields of a record, or members of a row.
  • Record Separator (␞): End of a record or row.

Those characters you see are the visual representations:

  • ␟: U+241F - Symbol for unit separator
  • ␞: U+241E - Symbol for record separator

Now in reality you aren't supposed to use those characters. The actual characters go back to the ASCII days:

Character         Symbol  ASCII  Unicode  Unicode name
----------------  ------  -----  -------  -------------------------
Unit separator        ␟  0x1F   U+001F   Information separator one
Record separator      ␞  0x1E   U+001E   Information separator two

Unfortunately the actual record separator and unit separator characters are unprintable:

  • Field separator:
  • Record separator:

Which is why it is nice that the symbols exist for those characters:

  • Field separator: ␟
  • Record separator: ␞
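The symbols sit in the Control Pictures block, where U+2400 through U+241F line up one-to-one with the control characters U+0000 through U+001F. That makes it easy to display data that contains the real separators. A minimal sketch, with `show_controls` being a made-up helper name:

```python
def show_controls(text: str) -> str:
    """Replace each C0 control character (U+0000..U+001F) with its Control Picture.

    Note that this also affects tab, line feed and carriage return,
    which become U+2409, U+240A and U+240D respectively.
    """
    return "".join(
        chr(0x2400 + ord(ch)) if ord(ch) < 0x20 else ch
        for ch in text
    )


if __name__ == "__main__":
    raw = "AUD\x1fAustralian dollar\x1f0.923\x1e"
    print(show_controls(raw))
```

This is for display only; the stored data keeps the real U+001F and U+001E characters.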

And nothing stops you from using those characters themselves:

AUD␟Australian dollar␟0.923
BRL␟Brazilian real␟0.3443
CNY␟Chinese renminbi␟0.1926
EUR␟European euro␟1.5009
JPY␟Japanese yen␟0.01229
MXN␟Mexican peso␟0.06894
NOK␟Norwegian krone␟0.154
RUB␟Russian ruble␟0.02074
CHF␟Swiss franc␟1.3448
GBP␟UK pound sterling␟1.6844
VND␟Vietnamese dong␟0.000057
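For comparison, here is a sketch of the same kind of data written and read with the real control characters (U+001F between fields, U+001E after each record) rather than their visible symbols. `dump` and `load` are hypothetical helper names, not a standard API, and the sketch assumes fields never contain either separator.

```python
US = "\x1f"  # unit separator: between fields of a record
RS = "\x1e"  # record separator: terminates each record


def dump(rows):
    """Serialize rows of string fields, terminating every record with RS."""
    for row in rows:
        for field in row:
            if US in field or RS in field:
                raise ValueError("field contains a separator character")
    return "".join(US.join(row) + RS for row in rows)


def load(data):
    """Parse the output of dump() back into a list of rows."""
    records = data.split(RS)
    # Every record ends with RS, so the final split piece is empty.
    if records and records[-1] == "":
        records.pop()
    return [record.split(US) for record in records]


if __name__ == "__main__":
    rows = [
        ["AUD", "Australian dollar", "0.923"],
        ["BRL", "Brazilian real", "0.3443"],
    ]
    encoded = dump(rows)
    print(load(encoded) == rows)
```

Fields can now contain `|`, commas, quotes or even newlines without any escaping, which is the whole point of having dedicated separator characters.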

I know you said you wanted something visually similar. But:

  • Stack Overflow is a wiki, where we add useful information
  • it's nice when there's an exact intended solution for a given problem