I have a batch script that uses drag and drop and creates some html code based on the filenames of the dropped files/folders. With
chcp 65001
Include them inside your batch file.
@echo off
for /f "tokens=2 delims=:" %%f in ('findstr /b /c:"BOFM:" "%~dpnx0"') do echo %%f
exit /b
rem Here starts the special characters part
BOFM:ÿþ:
The characters after BOFM: on that line are typed as ALT+character code to get the desired bytes.
EDITED -
I give up. I'm not able to make it work consistently with multiple code pages across the batch file, data files and editors. There is no way to guarantee what will be generated. So, I took @foxidrive's answer (awesome!) to generate the file prefix and experimented with it.
What I've found is that if we use FF FE (the UTF-16 LE BOM) as a prefix for a file generated from a cmd that is not in unicode mode (no /u parameter) but with the UTF-8 code page (65001), we generate a file that is marked as unicode by its prefix but whose content is not: only one byte per character is written. So we get the "chinese" characters, which are just a stream of single-byte characters misread in pairs as two-byte characters.
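The mismatch can be seen at the byte level with a minimal sketch like the following. It assumes certutil is available and uses its -encodehex verb to dump the result; the variable name demo and the file name bomtest.txt are placeholders chosen for this illustration only.

```batch
@echo off
setlocal
rem Placeholder file name, used only for this demonstration
set "demo=%temp%\bomtest.txt"

rem Write the base64 text for FF FE and decode it in place
>"%demo%" echo //4=
certutil -f -decode "%demo%" "%demo%" >nul

rem Append a line from this cmd (no /u): one byte per character
>>"%demo%" echo abcd

rem Dump the file as hex to inspect what was really written
certutil -f -encodehex "%demo%" "%demo%.hex" >nul
type "%demo%.hex"

del "%demo%" "%demo%.hex"
endlocal
```

The dump should show the ff fe prefix followed by single bytes (61 62 63 64 for abcd). A reader that trusts the prefix pairs those bytes up, so 61 62 becomes the code unit U+6261, which lies in the CJK range. That is where the "chinese" characters come from.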
If we use the same prefix but from a unicode cmd (started with the /u parameter) and the same code page (65001), then a real unicode file is generated, and the content is shown correctly from the command line, notepad and browsers (tested in IE and Firefox). But this is a real unicode (UTF-16) file, so two bytes per character are generated.
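A rough way to try the /u case without restarting the console is to run only the writing step in a child cmd started with /u. This is a sketch under the same assumptions (certutil available, placeholder names demo and bomtest_u.txt), not the only way to do it.

```batch
@echo off
setlocal
rem Placeholder file name, used only for this demonstration
set "demo=%temp%\bomtest_u.txt"

rem Write the base64 text for FF FE and decode it in place
>"%demo%" echo //4=
certutil -f -decode "%demo%" "%demo%" >nul

rem A child cmd started with /u writes the output of internal
rem commands (such as echo) to the file as two-byte characters
cmd /u /c "echo abcd" >>"%demo%"

rem Dump the file as hex to compare with the non-/u case
certutil -f -encodehex "%demo%" "%demo%.hex" >nul
type "%demo%.hex"

del "%demo%" "%demo%.hex"
endlocal
```

Here the prefix and the content agree, which is why the file opens correctly in notepad and in browsers.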
Instead of FF FE, we can send the UTF-8 BOM, EF BB BF, from a non-unicode cmd but with code page 65001. This generates a UTF-8 file with a BOM prefix and one or more bytes per character (depending on the UTF-8 encoding of each character), which is shown correctly in editors and browsers but not on the command line.
The code I've been trying (adapted from the OP's attached files, and to be run from a non-unicode cmd) is:
@echo off
if ["%~1"]==[""] goto :EOF
setlocal enableextensions enabledelayedexpansion
rem File to generate
set "myFile=aText.txt"
rem save current pagecode
for /f "tokens=2 delims=:" %%f in ('chcp') do set "cp=%%f"
rem Generate BOM
call :generateBOM "%myFile%"
rem switch to the UTF-8 code page (65001)
chcp 65001 > nul
:loop
echo %1 >> "%myFile%"
for %%a in ("%~1") do (
echo %%~nxa
echo ^
^^
) >> "%myFile%"
shift
if ["%~1"]==[""] goto showData
goto loop
:showData
"%myFile%"
:endProcess
rem Cleanup and restore pagecode
endlocal & chcp %cp% > nul
exit /b
:generateBOM file
rem [ EF BB BF ] UTF-8 BOM, base64-encoded value = 77u/
rem [ FF FE ] UTF-16 LE (unicode) BOM, base64-encoded value = //4=
echo 77u/>"%~1"
rem certutil can decode in place, so no temporary file is needed
certutil -f -decode "%~1" "%~1" >nul
goto :EOF