How do you echo a 4-digit Unicode character in Bash?


I'd like to add the Unicode skull and crossbones to my shell prompt (specifically the 'SKULL AND CROSSBONES' (U+2620)), but I can't figure out the magic incantation to m

18 answers

你的背包

2020-11-29 15:13
Bash's printf builtin (like coreutils' printf) understands the \u escape sequence, which takes a Unicode code point written as four hex digits:
```
   \uHHHH Unicode (ISO/IEC 10646) character with hex value HHHH (4 digits)
```
Tested with Bash 4.2.37(1):
```
$ printf '\u2620\n'
☠
```
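
To get the character into the prompt, as the question asks, one option is to capture the printf output once and reuse it in PS1. This is only a sketch: it assumes Bash 4.2 or later and a UTF-8 locale, and the variable name skull and the prompt layout are made up for illustration.

```shell
# Sketch for ~/.bashrc; assumes Bash 4.2+ and a UTF-8 locale.
# "skull" is an illustrative variable name, not something Bash defines.
skull=$(printf '\u2620')

# \$ is expanded by Bash when the prompt is drawn: "#" for root, "$" otherwise.
PS1="${skull} \\$ "
```

Capturing the character in a variable first keeps the PS1 line readable and makes it easy to swap in a different symbol later.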

[愿得一人]

2020-11-29 15:14

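Here is a list of the Unicode blocks that contain emoji:

https://en.wikipedia.org/wiki/Emoji#Unicode_blocks

Example (code points above U+FFFF need the \U escape, which accepts up to eight hex digits):

```
echo -e "\U1F304"
```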


星月不相逢

2020-11-29 15:14

Based on the Stack Overflow question "Unix cut, remove first token" and https://stackoverflow.com/a/15903654/781312:

```
(octal=$(echo -n ☠ | od -t o1 | head -1 | cut -d' ' -f2- | sed -e 's#\([0-9]\+\) *#\\0\1#g')
echo "Octal representation is following $octal"
echo -e "$octal")
```

The output is the following:

```
Octal representation is following \0342\0230\0240
☠
```
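
The same conversion can be wrapped in a small function that avoids the GNU-only `\+` in the sed expression. This is a sketch rather than a drop-in replacement: the name to_octal is hypothetical, and it assumes POSIX versions of od, tr, and sed.

```shell
# to_octal (hypothetical helper): print the bytes of "$1" as \0NNN octal escapes.
to_octal() {
  printf '%s' "$1" |
    od -An -v -t o1 |   # bytes as octal numbers, without the address column
    tr '\n' ' ' |       # join od's output lines into one
    sed 's/  */ /g; s/ \([0-7][0-7]*\)/\\0\1/g; s/ *$//'
}

octal=$(to_octal '☠')
printf '%s\n' "$octal"   # the escape text itself
printf '%b\n' "$octal"   # %b interprets the \0NNN escapes and prints the character
```

Using printf with %s and %b instead of echo sidesteps the differences between shells in how echo treats backslashes.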

长发绾君心

2020-11-29 15:20

Any of these three commands will print the character you want in a console, provided the console accepts UTF-8 characters (most current ones do):

```
echo -e "SKULL AND CROSSBONES (U+2620) \U02620"
echo $'SKULL AND CROSSBONES (U+2620) \U02620'
printf "%b" "SKULL AND CROSSBONES (U+2620) \U02620\n"
```

Each one prints:

```
SKULL AND CROSSBONES (U+2620) ☠
```

Afterwards, you can copy and paste the actual glyph into any UTF-8-enabled text editor.

If you need to see how such a Unicode code point is encoded in UTF-8, use xxd (a much better hex viewer than od):

```
echo $'(U+2620) \U02620' | xxd
0000000: 2855 2b32 3632 3029 20e2 98a0 0a         (U+2620) ....
```

That means that the UTF-8 encoding is: e2 98 a0

Or, in hex to avoid errors: 0xE2 0x98 0xA0. That is, the values between the space (hex 20) and the line feed (hex 0A).

If you want a deep dive into converting numbers to characters, see the article on Greg's wiki (BashFAQ) about ASCII encoding in Bash.
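
The bytes e2 98 a0 follow directly from the UTF-8 bit layout, and the arithmetic can be reproduced in the shell. The sketch below is illustrative only: utf8_hex is a hypothetical helper name, and it does no validation (for example, it does not reject surrogates or values above U+10FFFF).

```shell
# utf8_hex (hypothetical helper): print the UTF-8 bytes of a code point in hex.
# Accepts decimal or 0x-prefixed hex. Note: sets the global variable "code".
utf8_hex() {
  code=$(( $1 ))
  if [ "$code" -lt 128 ]; then        # 1 byte:  0xxxxxxx
    printf '%02x\n' "$code"
  elif [ "$code" -lt 2048 ]; then     # 2 bytes: 110xxxxx 10xxxxxx
    printf '%02x %02x\n' \
      $(( 0xC0 | (code >> 6) )) \
      $(( 0x80 | (code & 0x3F) ))
  elif [ "$code" -lt 65536 ]; then    # 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
    printf '%02x %02x %02x\n' \
      $(( 0xE0 | (code >> 12) )) \
      $(( 0x80 | ((code >> 6) & 0x3F) )) \
      $(( 0x80 | (code & 0x3F) ))
  else                                # 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    printf '%02x %02x %02x %02x\n' \
      $(( 0xF0 | (code >> 18) )) \
      $(( 0x80 | ((code >> 12) & 0x3F) )) \
      $(( 0x80 | ((code >> 6) & 0x3F) )) \
      $(( 0x80 | (code & 0x3F) ))
  fi
}

utf8_hex 0x2620    # prints: e2 98 a0
utf8_hex 0x1F304   # prints: f0 9f 8c 84
```

For U+2620 the three-byte branch applies: the top 4 bits (0x2) go into the lead byte 0xE2, and the remaining two groups of 6 bits (0x18 and 0x20) become 0x98 and 0xA0.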
                                                                        
小鲜肉

2020-11-29 15:21

Just put "☠" in your shell script. In the correct locale and on a Unicode-enabled console it will print just fine:

```
$ echo ☠
☠
$
```

An ugly "workaround" would be to output the UTF-8 byte sequence directly, but that also depends on the terminal actually using UTF-8:

```
$ echo -e '\xE2\x98\xA0'
☠
$
```
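
Since both variants depend on the locale, a script can check the character encoding before emitting the bytes. A minimal sketch, assuming a POSIX locale utility is available; the function name prompt_symbol and the ASCII fallback X are invented for illustration.

```shell
# prompt_symbol (hypothetical helper): emit the skull only when the locale is UTF-8.
prompt_symbol() {
  if [ "$(locale charmap 2>/dev/null)" = "UTF-8" ]; then
    printf '\342\230\240'   # UTF-8 bytes of U+2620, written as octal escapes
  else
    printf 'X'              # plain ASCII fallback for other encodings
  fi
}

printf '%s\n' "$(prompt_symbol)"
```

Octal escapes are used here because they are understood by printf in every POSIX shell, whereas \x and \u escapes are not.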

遥遥无期

2020-11-29 15:21

Sorry for reviving this old question. But when using Bash there is a very easy way to create Unicode code points from plain ASCII input, and it does not even fork:

```
unicode() { local -n a="$1"; local c; printf -vc '\\U%08x' "$2"; printf -va "$c"; }
unicodes() { local a c; for a; do printf -vc '\\U%08x' "$a"; printf "$c"; done; }
```

Use it as follows to define certain code points:

```
unicode crossbones 0x2620
echo "$crossbones"
```

Or to dump the first 65536 Unicode code points to stdout (this takes less than 2s on my machine; the additional space keeps certain characters from running into each other in the terminal's monospace font):

```
for a in {0..65535}; do unicodes "$a"; printf ' '; done
```

Or to tell a little, very typical parent's story (this needs the emoji added in Unicode 6.0, released in 2010):

```
unicodes 0x1F6BC 32 43 32 0x1F62D 32 32 43 32 0x1F37C 32 61 32 0x263A 32 32 43 32 0x1F4A9 10
```


Explanation:


- printf '\UXXXXXXXX' prints any Unicode character.
- printf '\\U%08x' number prints \UXXXXXXXX with the number converted to hex; this text is then fed to a second printf, which prints the actual Unicode character.
- printf recognizes octal (0oct), hex (0xHEX), and decimal (0, or numbers starting with 1 to 9) as numbers, so you can choose whichever representation fits best.
- printf -v var ... stores the output of printf in a variable without forking (which speeds things up tremendously).
- local variable is there to avoid polluting the global namespace.
- local -n var=other aliases var to other, so that assigning to var alters other. One interesting part here is that var is part of the local namespace, while other is part of the global namespace.
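
For readability, the one-line unicode function can be written out step by step. This is a sketch of the same idea rather than a replacement: the name codepoint_to_var and its internal variable names are made up here, and it assumes Bash 4.3 or later (for local -n) and a UTF-8 locale.

```shell
# Bash 4.3+ only. codepoint_to_var is a hypothetical name for illustration.
codepoint_to_var() {
  local -n _target="$1"              # nameref: assigning to _target sets the caller's variable
  local _escape
  printf -v _escape '\\U%08x' "$2"   # build the text \UXXXXXXXX without a subshell
  printf -v _target "$_escape"       # printf expands that escape into the character
}

codepoint_to_var crossbones 0x2620
echo "$crossbones"
```

The underscore-prefixed names are a precaution: a nameref must not share its name with the variable it points to, so the caller should avoid passing _target or _escape as the first argument.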


Please note that Bash has no separate local and global namespaces in the sense of other languages. Variables are dynamically scoped: local just sets aside the current value and restores it when the function returns, and other functions called from within the function that used local will still see the "local" value. This is fundamentally different from the normal scoping rules found in other languages (what Bash does is very powerful, but it can lead to errors if you are not aware of it).
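
The scoping behaviour described above can be seen in a few lines. A minimal sketch; the function and variable names are illustrative.

```shell
x="global"

show()  { echo "$x"; }              # reads x without declaring it
outer() { local x="local"; show; }  # show runs while outer's x is in effect

outer   # prints "local": show sees the value set by its caller
show    # prints "global": the old value is back once outer has returned
```

In a lexically scoped language, show would print "global" both times, which is why this behaviour tends to surprise programmers coming from elsewhere.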
