SFK Commands



Section 10. Help


Command: help chars
characters and codepages with SFK for Windows:

   SFK uses 8-bit character codes with a possible
   range of 256 different code values (0-255). see: sfk ascii

   character codes 32-126, or hexadecimal 0x20-0x7E,
   are 7-bit ASCII characters. within SFK they are
   called "Low Codes", or LoCodes. as long as you
   use only a-z A-Z 0-9 !"#$%&_ etc. you use LoCodes,
   which will work the same on every computer in the
   world, and you can ignore code pages.
 
   but as soon as you want to use accent characters,
   umlauts, cyrillic, greek etc. you need HiCodes
   in the range 0x80-0xFF. these are dependent on the
   codepages of your Windows system, and you can only
   use chars of your own language, plus English.
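
   the difference between LoCodes and HiCodes can be checked
   outside SFK. the following is a minimal sketch in Python,
   which is not part of SFK and is used here only because it
   ships with the relevant codepage tables. the helper name
   encode_all and the list of codepages are assumptions chosen
   for illustration.

```python
# Codepages picked for illustration: two Ansi (1251, 1252), two OEM (866, 850).
CODEPAGES = ["cp1251", "cp1252", "cp866", "cp850"]

def encode_all(text):
    """Return {codepage: bytes or None} for the given text."""
    result = {}
    for cp in CODEPAGES:
        try:
            result[cp] = text.encode(cp)
        except UnicodeEncodeError:
            result[cp] = None  # text is not representable in this codepage
    return result

lo = encode_all("Hello 123 !#$%&_")  # LoCodes only
hi = encode_all("ä")                 # a German umlaut, needs a HiCode

for cp in CODEPAGES:
    print(cp, lo[cp], hi[cp])
```

   the LoCode text yields identical bytes in every codepage,
   while the umlaut gets a different byte per codepage, or
   cannot be encoded at all in the cyrillic ones.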

   your Windows CMD.EXE command line uses two codepages:

   1. ANSI codepage 1251 for data processing.
      every text within SFK is encoded in this codepage.
      Most text editor programs like Notepad will
      use this codepage by default.
 
   2. Dos/OEM codepage 866 for input and display.
      what you type on your keyboard is encoded in 866.
      the CMD.EXE terminal can only display HiCodes in
      this codepage correctly.
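
   to see that the two codepages really assign different
   HiCodes to the same letter, here is a small Python sketch
   (Python is an assumption for illustration, not something
   SFK uses; the helper name codes is hypothetical).

```python
ANSI = "cp1251"  # Ansi codepage named in the text above
OEM = "cp866"    # Dos/OEM codepage named in the text above

def codes(ch):
    """Return (Ansi code, OEM code) of a single character."""
    return ch.encode(ANSI)[0], ch.encode(OEM)[0]

for ch in "Яя":
    ansi_code, oem_code = codes(ch)
    print(f"{ch}: Ansi 0x{ansi_code:02X}, OEM 0x{oem_code:02X}")
```

   the same letter has one code in Ansi and another in OEM,
   which is why a conversion is needed between keyboard,
   data processing and display.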

   HiCode conversions step by step:

   -  when you run sfk, and pass parameters, these are
      converted from OEM to Ansi and then given to sfk.
      so sfk gets only Ansi encoded parameters.

   -  within SFK all data processing is done with Ansi,
      e.g. filter ... +xed ... will pass Ansi text.

   -  when printing text to terminal, SFK converts it
      from Ansi to OEM for output. otherwise HiCodes
      would all look wrong, as the terminal needs OEM.

   -  when writing text output to file, like
         filter ... >out.txt
         filter ... +tofile out.txt
      it is written as Ansi, without any conversion.
      you can then open out.txt with Notepad
      or Depeche View, which expect Ansi text,
      and HiCodes will display correctly.
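
   the conversion steps above can be modelled roughly as
   follows. this is a sketch under the assumption that Ansi
   is codepage 1251 and OEM is codepage 866, as stated above.
   the function names are hypothetical and do not correspond
   to anything inside sfk.exe.

```python
ANSI, OEM = "cp1251", "cp866"

def oem_to_ansi(data: bytes) -> bytes:
    """Model of the conversion applied to command line parameters."""
    return data.decode(OEM).encode(ANSI)

def ansi_to_oem(data: bytes) -> bytes:
    """Model of the conversion applied before printing to the terminal."""
    return data.decode(ANSI).encode(OEM)

typed = "слово".encode(OEM)   # what is typed on the keyboard (OEM)
param = oem_to_ansi(typed)    # what sfk receives and processes (Ansi)
shown = ansi_to_oem(param)    # what is sent to the terminal (OEM again)
file_bytes = param            # file output stays Ansi, no conversion

print(typed.hex(), param.hex(), shown.hex())
```

   the terminal bytes equal the typed bytes, while the file
   holds the Ansi form that editors like Notepad expect.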

   Beware of HiCodes within batch files.
 
   -  if you run SFK interactively like:
         sfk filter in.txt -+myword
      and myword contains HiCodes, you type them
      all as OEM chars, and it works.
 
   -  if you create a batch file with Windows Notepad,
      and therein type
         sfk filter in.txt -+myword
      and myword contains HiCodes, you will find that
      filter no longer finds the word, because Notepad
      created an Ansi encoded text file, so the
      "myword" chars are Ansi encoded.

      what happens?
      -  CMD.EXE still thinks "myword" is OEM,
         and incorrectly "converts" it to Ansi,
         which actually breaks all HiCode chars.
      -  sfk.exe then gets myword with completely
         wrong encoding, and the search fails.

      how to fix this:
      -  write your .bat files with OEM encoding.
         this can be done with Notepad++:
         -  create a new file mytest.bat
         -  select: Encoding / Character Set / your area,
            then select your OEM codepage.
         -  now type sfk commands into the batch file,
            and save it.
      -  side effect: if you create sfk scripts
         embedded in such a batch file, like:
            sfk batch mytest2.bat
         searches therein will fail again if the file
         is OEM encoded, because by default "sfk script"
         wants to load Ansi text. to fix this use
         option -dos like: sfk script -dos ...
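
   the batch file problem can be reproduced without CMD.EXE.
   the sketch below is a simplified Python model, assuming
   Ansi 1251 and OEM 866 as above. the function cmd_converts
   is hypothetical and only imitates the OEM to Ansi step
   described in this section.

```python
ANSI, OEM = "cp1251", "cp866"

def cmd_converts(batch_bytes: bytes) -> bytes:
    """Treat the bytes as OEM and convert them to Ansi, whatever they really are."""
    return batch_bytes.decode(OEM).encode(ANSI, errors="replace")

word = "мир"
expected = word.encode(ANSI)   # what sfk needs to receive to find the word

from_ansi_bat = cmd_converts(word.encode(ANSI))  # .bat saved by Notepad as Ansi
from_oem_bat = cmd_converts(word.encode(OEM))    # .bat saved with OEM encoding

print("Ansi .bat matches:", from_ansi_bat == expected)
print("OEM .bat matches: ", from_oem_bat == expected)
```

   only the OEM encoded batch file delivers the expected
   bytes. the Ansi encoded one is converted a second time
   and ends up as a different word.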

   What is not possible?
 
   SFK cannot process any text outside your Ansi codepage.

   for example, if a computer uses Western Europe
   codepage 1252, it is possible to search German umlauts
   and some French accent characters. but it is impossible
   to search and filter cyrillic text (encoded in 1251),
   and it will even be impossible to type cyrillic chars
   in the first place, as the keyboard has no such keys.
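
   whether a text fits into a given Ansi codepage can be
   tested up front. a minimal Python sketch, with Python
   and the helper name fits being assumptions made for
   illustration only:

```python
def fits(text: str, codepage: str) -> bool:
    """Return True if every character of text exists in the codepage."""
    try:
        text.encode(codepage)
        return True
    except UnicodeEncodeError:
        return False

print(fits("Grüße", "cp1252"))    # German text, Western Europe codepage
print(fits("привет", "cp1252"))   # cyrillic text, Western Europe codepage
print(fits("привет", "cp1251"))   # cyrillic text, cyrillic codepage
```

   the cyrillic word has no representation in codepage 1252,
   so on such a system it cannot be searched or filtered.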

   see also:
      sfk help nocase   about case insensitive search
      sfk help unicode  unicode to Ansi conversion