SFK Commands


Section 5. Search and Compare
deplist | dupfind | extract | find | hexfind | md5 | md5check | md5gento | ofind | pathfind | reflist | xfind | xfindbin | xhexfind

Command: ofind
sfk ofind singleDirName "/searchtext/"
sfk ofind singleFileName "/searchtext/" [options]
sfk ofind -dir mydir -file .docx .xlsx -text "/from/[totext/]"

   search in office files like .docx .xlsx .ods .odt
   and in plain text files using wildcards * and ?
   as well as SFK Simple Expressions in brackets [].

   the search text must be surrounded by a delimiter like / or _
   or any other character not part of the search text.
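
   The delimiter rule above can be modelled in a few lines. This is only
   an illustrative sketch, not SFK's own parser: the function name
   split_pattern is hypothetical, and escapes inside the text are ignored.

```python
def split_pattern(pattern):
    """Split '/from/' or '/from/to/' into (fromtext, totext or None).

    The first character of the pattern is taken as the delimiter,
    so '/foo/' and '_foo_' both yield the search text 'foo'.
    """
    if len(pattern) < 3:
        raise ValueError("pattern too short: %r" % pattern)
    sep = pattern[0]                      # first character is the delimiter
    if pattern[-1] != sep:
        raise ValueError("pattern must end with its delimiter %r" % sep)
    fields = pattern[1:-1].split(sep)
    if len(fields) == 1:
        return fields[0], None            # only a search text was given
    if len(fields) == 2:
        return fields[0], fields[1]       # search text plus totext
    raise ValueError("delimiter %r also occurs inside the text" % sep)


print(split_pattern("/foo/"))        # ('foo', None)
print(split_pattern("_foo_"))        # ('foo', None)
print(split_pattern("_a/b_[all]_"))  # ('a/b', '[all]')
```

   The last call shows why a different delimiter helps: with _ around
   the text, a forward slash can appear in the search text unescaped.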

   by default, full text lines containing hits are shown.
   use option -pure to show only the found text.

   search text can be followed by a totext to reformat output.

   subdirectories are included by default
      the sfk default for most commands is to process the given directories,
      as well as all subdirs within them. specify -nosub to disable this.

   options
      -nosub        do not include files in subdirectories.
      -verbose      always show which file is currently read.
      -justoffice   search only in office files, not in plain text etc.
      -case         case-sensitive text comparison. default is insensitive.
                    for details type: sfk help nocase
      -text         starts a list of search patterns of the form /src/ or
                    /src/totext/ where / is the separator char, src the text
                    to search for, and totext a mask to reformat output.
                    any separator char can be used which is not part of the
                    search text, e.g. /foo/ or _foo_ both search "foo".
                    -text is not required if a single filename is given.
      -pat          the same as -text, starting a pattern list.
      -bylist x.txt read search patterns from a file x.txt, supporting
                    multiple lines per pattern. (add -full for more.)
      -bylinelist x read /from/to/ or just /from/ patterns from a file x
                    with one pattern per line. (add -full for more.)
                    -by(line)list does not support sfk variables.
                    to use variables in patterns create an sfk script
                    with patterns as parameters. "sfk script" for more.
      -firsthit     show only first found pattern match per file.
      -utfin        with -utfout only: search text is already given
                    as UTF-8, do not convert internally for search.
      -tracesel     tell in detail which files are searched or ignored.
      -quiet        do not show progress information.
      -names        list only names of files containing at least one hit.
      -notnames     list only names of files not containing any hit.
      -justrc       print no search results, just set return code on hits.
      -full         print full help text telling about -bylist pattern files,
                    special character case sensitivity and nested or repeated
                    replace behaviour.

   output options
      -utfout       keep raw UTF-8 encoding on output, to use it
                    with further commands requiring UTF-8 data.
      -conlines=n1  show n1 lines of context around search hits. by default
                    only text lines containing one or more hits are shown.
                    all lines together cannot hold more than:
      -conchars=n2  max. number of characters of all context lines together.
                    default is 240 or n1*160. cannot be larger than 32000.
      -conresline   show full result line but no further context (default)
      -sep[arator]  show "---" separator between hits within a file.
      -septext s    use separator text s (supports slash patterns \n etc.)
      -nosep        do not show "---" separator between hits within a file.
      -indent=n     set n chars of indentation for result display.
      -pure         extract only searched data, same as -context=0.
                    you may also set an environment variable:
                    set SFK_CONFIG=xfind:pure,xfindbin:pure
                    use -pure -tofile x to extract binary content as is.
      -fill=c       replace binary null and other unprintable characters
                    with character c. default is a dot "."
      -hex          print output as hex dump instead of plain text.
      -showle       highlight CR/LF line endings in hex dump output
      -nofile       do not insert :file header lines in output.
      -crlf, -lf    for file headers and default totext: force crlf or lf
                    line endings instead of system default
      -filehead s   file header to insert on every matching file.
                    only [file.name] surrounded by text can be used.
                    default is -filehead ":file [file.name]" unless a
                    single file is searched. cannot be used with xhexfind.
                    to get result and name in the same line use [file.name]
                    in the expression, like: sfk xfind -pure -nofile mydir
                    "/foo*bar/[file.name]: [all]\n/"
      -sep s        define separator s between hits in a file
      -rawterm      on output to terminal do not strip codes below 32.
                    null bytes are always stripped.
      -to dir\$file write output files to given path. for details about
                    output file masks, type "sfk help opt" or "sfk run".
      -tofile x     write output data to a single output filename x
                    (which is not interpreted as a mask but taken as is).
      +tofile x     as last parameter (command chaining): write text as
                    displayed on terminal to a file x.
      -more[n]      pause output every 30 or n lines.

   return codes for batch files
      0 = no matches, 1 = matches found, >1 = major error occurred.
      see also "sfk help opt" on how to influence error processing.
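
   A script that calls sfk can branch on these return codes. The sketch
   below assumes an sfk executable is on the PATH. The helper names
   classify_return_code and ofind_has_hits are hypothetical, and any code
   other than 0 or 1 is treated as an error.

```python
import subprocess


def classify_return_code(rc):
    """Map an sfk return code to the meaning documented above."""
    if rc == 0:
        return "no matches"
    if rc == 1:
        return "matches found"
    return "error"          # >1 per the help text; other values assumed


def ofind_has_hits(directory, pattern):
    """Run 'sfk ofind <dir> <pattern> -justrc' and report whether it hit.

    -justrc prints no search results and only sets the return code.
    Assumes 'sfk' can be found on the PATH.
    """
    result = subprocess.run(["sfk", "ofind", directory, pattern, "-justrc"])
    status = classify_return_code(result.returncode)
    if status == "error":
        raise RuntimeError("sfk failed with return code %d" % result.returncode)
    return status == "matches found"
```

   For example, ofind_has_hits("mydir", "/myword/") would return True
   when at least one file in mydir contains the word.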

   quoted multi line parameters are supported in scripts
      using full trim. type "sfk script" for details.

   wildcards and SFK expressions
      SFK Expressions are simple patterns containing literal text,
      wildcards * and ? and character classes in square brackets [].
      basically, the syntax provides extended wildcards but no
      further logic and is not related to regular expressions.

      search patterns are surrounded by a separator character which
      can be anything not contained in the search text, like / or _

      within a pattern /fromtext/totext/ the fromtext may contain:

        *                       - 0 to 4000 characters in the same
                                  text line or paragraph, i.e. all
                                  bytes not being CR, LF or NULL.
                                  4000 is just a default maximum
                                  that can be changed by:
        [0.100000 chars]        - 0 to 100000 characters in the same
                                  text line or paragraph, i.e. the
                                  same as * but with a larger range.
        ?                       - one character.
        ?????                   - same as [5.5 chars] or [5 chars]
        [bytes]                 - 0 to 4000 bytes (with CR,LF,NULL)
                                  i.e. it collects stream text
                                  across lines, even in binary data
        **                      - the same as [bytes].
        [0.100 bytes]           - 0 to 100 bytes
        [.100000 bytes]         - up to 100000 bytes
        [1.* bytes]             - 1 to default maximum bytes
        [2 chars]               - exactly 2 chars
        [30 bytes]              - exactly 30 bytes
        [byte of aeiou]         - one vowel (a OR A OR e OR ...),
                                  case insensitive by default.
                                  "aeiou" is a character list.
        [byte of \\\x2f]        - a backslash \ or forw. slash /
        [bytes of \r\n \t]      - whitespace incl. line ends
        [bytes of (\r\n \t)]    - the same, () are optional
        [bytes not \r\n\0]      - up to 4000 bytes as long as no
                                  CR, LF or NULL byte appears
        [chars]                 - the same as [bytes not \r\n\0],
                                  i.e. collect text in a line
        [char not ( \t)]        - same as [byte not ( \r\n\0\t)],
                                  everything not blanks and tabs
        [char not )( \t]        - not brackets, blanks and tabs,
                                  same as not (\(\) \t)
        [chars of a-z0-9]       - means a-zA-Z0-9 as search is
                                  case insensitive by default
        [chars of \x61-\x7A]    - search a-z but not A-Z, or use
                                  option -case for case search
        [eol]                   - end of line by characters:
                                  CRLF or LF or CR

        [white]     = chars of (\t )     - 0 or more whitespaces
        [xwhite]    = bytes of (\t \r\n) - same but across lines
        [1 white]   = byte  of (\t )     - 1 whitespace
        [digit]     = byte  of (0-9)     - 1 digit
        [digits]    = bytes of (0-9)     - 0 or more digits
        [hexdigit]  = byte  of (0-9a-f)  - 1 hexadecimal digit
        [hexdigits] = bytes of (0-9a-f)  - 0 or more hex digits
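
        The count notation used in expressions such as [0.100 bytes],
        [.100000 bytes], [1.* bytes] and [2 chars] can be read as a
        (minimum, maximum) pair. The sketch below models that reading.
        It is an assumption-based illustration, not SFK's parser, and
        the name parse_range is hypothetical.

```python
DEFAULT_MAX = 4000   # documented default maximum, changeable via -xmaxlen


def parse_range(spec, default_max=DEFAULT_MAX):
    """Return (min, max) for the count part of an expression.

    spec is the text before 'bytes' or 'chars', e.g. '0.100', '.100000',
    '1.*', '2', or '' for a plain [bytes] or [chars].
    """
    if spec == "":
        return 0, default_max             # plain [bytes] / [chars]
    if "." not in spec:
        n = int(spec)                     # '[2 chars]' means exactly 2
        return n, n
    low, high = spec.split(".", 1)
    minimum = int(low) if low else 0      # '.100000' means up to 100000
    maximum = default_max if high == "*" else int(high)
    if minimum > maximum:
        raise ValueError("minimum exceeds maximum in %r" % spec)
    return minimum, maximum


print(parse_range("0.100"))    # (0, 100)
print(parse_range(".100000"))  # (0, 100000)
print(parse_range("1.*"))      # (1, 4000)
print(parse_range("5.5"))      # (5, 5)
```

        Passing default_max=10000 mimics the effect that -xmaxlen=10000
        is described as having on open-ended ranges.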

        special keywords that do not count as tokens:
        [skip]   - at the start of a pattern: skip such text
                   completely, do not count it as a search hit.
        [keep]   - search also the following text but keep it
                   in the input data, without consuming it.
        [ortext] - foo[ortext]bar searches word foo or bar.
                   [ortext] is allowed only between literals.

        anchors that have no length of their own:
        [start]  - start of file
        [end]    - end of file
        [lstart] - line start, i.e. start or CRLF or CR or LF
        [lend]   - logical line end, i.e. eol or end of file.
                   to replace line ends use [eol] instead.

        how to search or replace special characters:
        -  to search or replace text containing the literal characters
           * ? \ [ ] escape them like \* \? \\ \[ \]
        -  ( ) are escaped only within character lists, like \( \)
        -  to search or replace the forward slash '/' type \x2f or use
           another char around from/to text, e.g. _fromtext_totext_
        -  parameters with blanks and non trivial characters need double
           quotes "", see also "about Shell Command Characters" below.

        expansion priorities: (highest first)
        if two search parts are side by side, and the same input
        character matches both, then these priorities apply:

          5:  start, end, lstart, lend
          4:  literal text, eol
          3:  whitelist classes: byte of, bytes of
          2:  blacklist classes: chars not, bytes not
          1:  plain wildcards: ?, *, **, byte, bytes, chars

        this means in "/[bytes]foo/" the [bytes] will stop collecting
        characters as soon as "foo" is found, as "foo" is a literal.
        on same or higher priority the right side stops the left side.
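
        The priority table can be expressed as a simple lookup. The
        sketch below only restates the rule "on same or higher priority
        the right side stops the left side". The kind names are taken
        from the table, and the function name right_stops_left is
        hypothetical.

```python
# Priority per kind of search part, copied from the table above.
PRIORITY = {
    "start": 5, "end": 5, "lstart": 5, "lend": 5,
    "literal": 4, "eol": 4,
    "byte of": 3, "bytes of": 3,
    "chars not": 2, "bytes not": 2,
    "?": 1, "*": 1, "**": 1, "byte": 1, "bytes": 1, "chars": 1,
}


def right_stops_left(left_kind, right_kind):
    """True if the right part ends the left part once it matches."""
    return PRIORITY[right_kind] >= PRIORITY[left_kind]


# "/[bytes]foo/": the literal "foo" (4) stops [bytes] (1).
print(right_stops_left("bytes", "literal"))   # True
# A plain wildcard (1) does not stop a whitelist class (3).
print(right_stops_left("bytes of", "*"))      # False
```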

      the totext may contain:

        [part 1]            use first text part of the fromtext.
                            e.g. the fromtext /*foo[.100 chars]bar*/
                            contains parts :   1 2         3    4 5
        [part1]             the same (blank is optional).
        [parts 1,2,3]       use parts 1, 2 and 3.
        [parts 1-10]        use parts 1 to 10.
        [strip(part1,\0)]   use part 1 but remove zero bytes.
                            only zero bytes "\0" can be removed.
        [file.name]         full input filename with path
        [file.relname]      input filename without path
        [file.path]         input file's path
        [file.base]         relname without last .extension
        [file.ext]          input filename extension
        [all]               use all parts from fromtext.

        [setvar name]...[endvar]   set variable "name" with data
                                   between setvar and endvar.
        [getvar name]              fill in data from variable "name"

        although anchors like lstart, lend count as a separate part
        they need NOT be specified in the totext. this means that
        /[lstart]foo[lend]/bar/ just changes the word "foo".
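
        To see which number a part gets, the fromtext can be split into
        wildcards, bracket expressions and literal text. The following
        is a rough sketch under stated assumptions, not SFK's tokenizer:
        it handles only *, **, [...] and literals, and rejects ?,
        backslash escapes and the special keywords [skip], [keep] and
        [ortext]. The name number_parts is hypothetical. The option
        -showparts reports the actual numbering.

```python
import re

# One token is a bracket expression, "**", "*", or a run of literal text.
_TOKEN = re.compile(r"\[[^\]]*\]|\*\*|\*|[^\[*]+")

# Constructs this sketch deliberately does not try to model.
_UNSUPPORTED = ("?", "\\", "[skip]", "[keep]", "[ortext]")


def number_parts(fromtext):
    """Return the parts of a fromtext in order; part 1 is index 0."""
    for item in _UNSUPPORTED:
        if item in fromtext:
            raise ValueError("%r is not covered by this sketch" % item)
    return _TOKEN.findall(fromtext)


for number, text in enumerate(number_parts("*foo[.100 chars]bar*"), start=1):
    print(number, text)
# 1 *
# 2 foo
# 3 [.100 chars]
# 4 bar
# 5 *
```

        Under the same reading, "foo**bar" has three parts and [part2]
        refers to the text collected by **.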

   supported slash patterns
      \t    = TAB
      \r    = CR
      \n    = LF
      \x00  = one byte with code 00 hexadecimal
      \0    = short form for \x00
      \q    = a double quote "
      \\    = the backslash character \ itself
      \[    = the bracket open character [
      \]    = the bracket close character ]
      \*    = the literal star character *
      \?    = the literal question mark  ?
      \-    = to use literal "-" in a command
      Within multi line -bylist files:
      \     = slash+blank is changed to a single blank
      Only within "char of" or "byte not" lists:
      \(    = to use literal character "("
      \)    = to use literal character ")"
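
   The table above can be mirrored as a small decoder for plain text
   such as a -septext value. This is a sketch only: the name
   decode_slash_patterns is hypothetical, bytes are modelled as
   characters with the same code, and the context-specific forms
   (slash+blank, \( and \)) are left untouched. It is not suitable for
   search patterns, where \* and \? must stay escaped to remain literal.

```python
import re

# Single-character slash patterns from the table above.
_SIMPLE = {
    "t": "\t", "r": "\r", "n": "\n", "0": "\x00", "q": '"',
    "\\": "\\", "[": "[", "]": "]", "*": "*", "?": "?", "-": "-",
}

# A backslash followed by xHH, or by any single character.
_SLASH = re.compile(r"\\(x[0-9a-fA-F]{2}|.)", re.DOTALL)


def decode_slash_patterns(text):
    """Replace the slash patterns listed above with their characters."""
    def replace(match):
        body = match.group(1)
        if len(body) == 3:                    # xHH: one code, hexadecimal
            return chr(int(body[1:], 16))
        if body in _SIMPLE:
            return _SIMPLE[body]
        return match.group(0)                 # unknown: keep as written
    return _SLASH.sub(replace, text)


print(repr(decode_slash_patterns(r"a\tb")))   # 'a\tb'
print(repr(decode_slash_patterns(r"\x2f")))   # '/'
print(repr(decode_slash_patterns(r"\q")))     # '"'
```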

   SFK expression options
      -showpart(s)  print /from/ part numbers, range statistics
                    and expansion priority points per part.
                    done automatically if a required /to/ text
                    is not given with a command.
      -showbest     if a /from/ pattern finds nothing, use this to
                    see how many parts would match so far, and with
                    up to how many bytes per part. anchors like [lstart]
                    may show a non zero length when matching (CR)LF.
      -showlist     with -bylist, show the internal joined list if
                    commands are spread across multiple lines.
      -showall      show all of the above.
      -xmaxlen=n    set default maximum length for chars or bytes commands,
                    e.g. -xmaxlen=10000 means /foo*bar/ matches with up to
                    10000 characters between foo and bar. the default max
                    length without this option is 4000 characters.

   performance notes
    - always use a string literal, or single byte or char, at the start
      of your search expressions, like in /foo*bar/ starting with 'f'.
      Do not use a wildcard like * at the start like in /*foobar/
      when searching huge input data, as your search will slow down by
      a factor of 256. Use /[lstart]*foobar/ instead.
    - the system may cache output file(s), writing to disk in background
      after sfk has finished. subsequent batch commands may execute slower.

   chaining support
      sfk extract output can be sent only to +xed or +xex.
      other commands require an xed conversion step like
      sfk extract ... +xed +view

   aliases
      sfk xhexfind is the same as xfind -hex
      to extract unmodified binary data you may use either
      sfk xfind -pure ... -tofile or sfk extract ... -tofile

   office file support
      sfk ofind        search in .xml text file contents of
                       office files like .docx .xlsx .ods .odt.
      sfk help office  for more information and options

   see also
      sfk xfind        for more search pattern examples

   examples
      sfk ofind mydir "/myword/"
         search office and plain text files in mydir
         containing the word 'myword'.
      sfk ofind mydir "/myword/" -names +copy out
         same as above, but copy the found files
         to a folder 'out'.
      sfk ofind mydir "/foo*bar/"
         search foo followed by bar in the same line.
      sfk ofind -pure mydir "/foo**bar/[part2]\n/"
         search text starting with foo, then several
         text lines, then ending with bar. print
         only the found text between foo and bar.