How can I re-add a Unicode byte order marker in Linux?

-上瘾入骨i 2021-02-13 15:22

I have a rather large SQL file which starts with the byte order marker FF FE. I have split this file into 100,000-line chunks using the Unicode-aware Linux split tool. But whe

7 answers
  •  日久生厌
    2021-02-13 15:48

    Matthew Flaschen's answer is a good one; however, it has a couple of flaws.

    • There's no check that the copy succeeded before the original file is truncated. It would be better to make everything contingent on a successful copy, or to test for the existence of the temporary file, or to operate on the copy. If you're a belt-and-suspenders kind of person, you'd do a combo, as I've illustrated below.
    • The ls is unnecessary.
    • I'd use a better variable name than "i" - perhaps "file".

    Of course, you could be very paranoid and check for the existence of the temporary file at the beginning so you don't accidentally overwrite it and/or use a UUID or a generated file name. One of mktemp, tempfile or uuidgen would do the trick.

    td=$TMPDIR                 # save the current TMPDIR so it can be restored later
    export TMPDIR=
    
    usertemp=~/temp            # set this to use a temp directory on the same filesystem
                               # you could use ./temp to ensure that it's on the same one
                               # you can use mktemp -d to create the dir instead of mkdir
    
    if [[ ! -d $usertemp ]]    # if this user temp directory doesn't exist
    then                       # then create it, unless you can't 
        mkdir "$usertemp" || export TMPDIR=$td  # if you can't create it and TMPDIR is/was
    fi                                          # empty then mktemp automatically falls
                                                # back to /tmp
    
    for file in *.sql
    do
        # in some mktemp implementations, TMPDIR (if set) overrides the argument to -p
        temp=$(mktemp -p "$usertemp") || { echo "$0: Unable to create temp file."; exit 1; }
    
        { printf '\xFF\xFE' > "$temp" &&
        cat "$file" >> "$temp"; } || { echo "$0: Write failed on $file"; exit 1; }
    
        { rm "$file" && 
        mv "$temp" "$file"; } || { echo "$0: Replacement failed for $file"; exit 1; }
    done
    export TMPDIR=$td
    

    Traps might be better than all the separate error handlers I've added.
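
    As a rough sketch of the trap idea, the function below uses a single EXIT trap to report the failure and remove any leftover temporary file, rather than a handler after each step. The names add_bom, temp and status are hypothetical. It also departs from the script above in two ways, both assumptions made to keep the sketch short: the temp file is created next to the target (so mv stays on one filesystem) instead of in ~/temp, and the BOM is written with the octal escapes \377\376, which are the same two bytes as \xFF\xFE.

```shell
# Prepend the bytes FF FE to each file named on the command line.
# The body runs in a subshell, so the EXIT trap fires when the function ends
# and "exit" inside it does not terminate the calling shell.
add_bom() (
    temp=
    # One handler for every failure path: clean up, report, keep the status.
    trap 'status=$?
          [ -n "$temp" ] && rm -f -- "$temp"
          [ "$status" -eq 0 ] || echo "add_bom: failed on $file" >&2
          exit "$status"' EXIT

    for file in "$@"
    do
        # Temp file beside the target, so the final mv is a same-filesystem rename.
        temp=$(mktemp "$file.XXXXXX") || exit 1
        printf '\377\376' > "$temp" || exit 1   # octal for 0xFF 0xFE
        cat -- "$file" >> "$temp" || exit 1
        mv -- "$temp" "$file" || exit 1
        temp=                                   # nothing left to clean up for this file
    done
)
```

    It could be called as add_bom *.sql. Because mv replaces the target in one step, the original file is never removed before its replacement is ready.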

    No doubt all this extra caution is overkill for a one-shot script, but these techniques can save you when push comes to shove, especially in a multi-file operation.
