XXCOPY

XXCOPY TECHNICAL BULLETIN #28


From:    Kan Yabumoto           tech@xxcopy.com
To:      XXCOPY user
Subject: The Wild-Wildcard Source: the source spec with wildcards
Date:    2001-01-28
===============================================================================

XXCOPY Command Parameter Syntax:

  XXCOPY   source  [ destination ]  [ switches... ]

  We have shown XXCOPY's basic command line syntax on numerous
  occasions.  This article focuses on the first item, the source
  specifier (any of the switch arguments can be placed anywhere,
  including to the left of the source).


Source Specifier (XCOPY-compatible standard):

  Another article, XXTB #25, discusses the standard source
  specifier that is compatible with Microsoft's XCOPY.
  The standard source specifier is made of the following three parts.

      [ volume_spec ] [ directory ] [ file_pattern ]

  That article covered the case where the directory specifier
  contains no wildcard character, because Microsoft's XCOPY treats
  the directory part literally (there, * and ? have no special
  power as wildcards).

  XXCOPY, on the other hand, handles wildcard characters in the
  directory part of the source specifier, which is the subject of
  this article.


The Wild-Wildcard Source Specifier (XXCOPY-extended feature):

  This is one of the features that most distinguishes XXCOPY from
  other file management utilities.  The source directory specifier
  can be further separated into two sub-parts (compare with the
  standard, three-part source specifier).

   [ volume_spec ] [ base_dir ] [ directory_pattern ] [ file_pattern ]

  The [ directory ] component in the standard specifier is now broken
  up into [ base_dir ] and [ directory_pattern ].  The "constant" part
  of the directory specifier, which has no wildcard, is classified
  as the base_dir.  The remaining part, which includes a wildcard, is
  classified as the directory_pattern.  Any of the four parts can
  be omitted, but of course at least one must be present as the
  source specifier.

  For example

      XCOPY   C:\Windows\sys*\*.dll   D:\dst\   /S

  According to the standard three-part scheme, it breaks up like

      volume_spec:          C:
      directory:            \Windows\sys*\
      file_pattern:         *.dll

  Of course, with Microsoft's XCOPY, you get nothing from this command.
  XCOPY looks for a directory, C:\Windows\sys*\, which does not exist
  when interpreted literally (XCOPY does just that), and finds no
  matching files (*.dll).

  With XXCOPY's wild-wildcard source (four-part scheme) feature, the
  same source specifier works as follows.

      XXCOPY  C:\Windows\sys*\*.dll   D:\dst\   /S

      volume_spec:          C:
      base_dir:             \Windows\
      directory_pattern:    sys*\
      file_pattern:         *.dll

      The command line effectively combines the actions that
      previously took multiple lines such as

      XXCOPY  C:\windows\system\*.dll    d:\dst\system\    /S
      XXCOPY  C:\windows\system32\*.dll  d:\dst\system32\  /S
      ...
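
  The four-part split described above can be sketched in a few lines
  of Python.  This is only an illustration of the scheme in this
  article, not XXCOPY's actual parser.  The function name
  split_source is hypothetical, and the sketch assumes that the
  volume is a plain drive letter (UNC names are not handled).

```python
import re


def split_source(spec):
    """Split a source specifier into (volume_spec, base_dir,
    directory_pattern, file_pattern), following the four-part scheme."""
    # Assumption: the volume is a plain drive letter such as "C:".
    m = re.match(r"[A-Za-z]:", spec)
    volume = m.group(0) if m else ""
    rest = spec[len(volume):]

    # Everything after the last backslash is the file pattern.
    head, sep, file_pattern = rest.rpartition("\\")
    components = (head + sep).split("\\")[:-1]

    # The base_dir ends just before the first component with a wildcard.
    first_wild = next(
        (i for i, c in enumerate(components) if "*" in c or "?" in c),
        len(components),
    )
    base_dir = "".join(c + "\\" for c in components[:first_wild])
    directory_pattern = "".join(c + "\\" for c in components[first_wild:])
    return volume, base_dir, directory_pattern, file_pattern


if __name__ == "__main__":
    print(split_source(r"C:\Windows\sys*\*.dll"))
```

  When the directory contains no wildcard, the directory_pattern
  comes back empty and the whole directory is the base_dir, which
  corresponds to the standard three-part scheme.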


The Multi-level Subdirectory Specifier:

  In various examples, you may have seen a source specifier like

      XXCOPY  C:\Windows\*\?cache*\*\*.jpg  \dst\

  Yes, XXCOPY's unique Wild-Wildcard Source feature allows you to use
  wildcards liberally, pretty much anywhere in the source specifier.
  That includes the new \*\ notation, where a single asterisk forms
  an entire directory level by itself.  You can go really wild with
  this feature, with as many wildcards as you like, anywhere, at any
  level...  It makes XXCOPY a very wild beast indeed.

  The \*\ sequence is a new notation which we devised for XXCOPY
  in order to encode multi-level directory name matching.
  Actually, the same concept has been present in Microsoft's XCOPY
  in the form of the /S switch which specifies that a filename
  pattern be applied to multiple-level subdirectories.  For example,

     XCOPY   C:\Windows\*.jpg  \dst\  /S
     XXCOPY  C:\Windows\*.jpg  \dst\  /S

  The /S switch is a very basic switch and most XCOPY/XXCOPY users
  are familiar with this concept.  It includes not only the
  first-level directory, but also all of its subdirectories.

       C:\Windows\mywife.jpg             // first-level directory
       C:\Windows\cache\mother1.jpg      // second-level
       C:\Windows\cache\deep\son.jpg     // third-level
      ...

  * * * *  OK, Microsoft's XCOPY runs out of gas here. * * * *

  The rest of the discussion applies only to the XXCOPY utility.
  Using the new \*\ notation, the /S switch can be substituted as

     XXCOPY  C:\Windows\*\*.jpg  \dst\

     In this command line, the \*\ sequence immediately before the
     filename pattern (*.jpg) applies the pattern to the base
     directory (C:\Windows\) and to all subdirectories below it,
     which is how the /S switch works.

  Next, I will show you an even better example of the \*\ sequence,
  a case that cannot be specified with the traditional /S switch.

     XXCOPY  C:\Windows\*\cache\*.jpg  \dst\

     In this case, the subdirectory cache may appear at any level
     of subdirectory (including the first level).  This is similar
     in spirit to the /S switch, but it does NOT allow the
     last name part (*.jpg) to be matched at any directory level
     other than the one immediately inside the cache\ directory.
     Note the difference carefully:  the \*\ sequence does not
     appear between \cache\ and *.jpg.

  Therefore, the following three cases are all different from one
  another.

     XXCOPY  C:\Windows\*\*.jpg           \dst\
     XXCOPY  C:\Windows\*\cache\*.jpg     \dst\
     XXCOPY  C:\Windows\*\cache\*\*.jpg   \dst\

     The first line is equivalent to the familiar /S switch, where
     the file pattern *.jpg applies to any level below C:\Windows\.

     In the second line, \*\ provides multi-level matching for the
     directory pattern \cache\ only (it just happens to contain no
     wildcard character, although one would be allowed).  The
     filename pattern *.jpg applies only to files immediately
     inside whichever \cache\ directory is matched.

     The third case is the most general of all: the \*\ sequence
     appears both before the directory pattern, \cache\, and
     before the filename pattern, *.jpg.
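
     To see how the three cases differ, the matching rules described
     above can be modeled with regular expressions.  The sketch below
     is an approximation written for this article, not XXCOPY's
     matching code.  The names to_regex and matches and the sample
     file names are hypothetical.  It assumes that \*\ stands for
     zero or more directory levels, that * and ? inside a name never
     cross a backslash, and that matching ignores case.

```python
import re


def _name_regex(name):
    """Translate one path component; * and ? never cross a backslash."""
    out = []
    for ch in name:
        if ch == "*":
            out.append(r"[^\\]*")
        elif ch == "?":
            out.append(r"[^\\]")
        else:
            out.append(re.escape(ch))
    return "".join(out)


def to_regex(spec):
    """Build a regex for a full source specifier such as C:\\Windows\\*\\*.jpg."""
    *dirs, file_pattern = spec.split("\\")
    parts = []
    for d in dirs:
        if d == "*":
            # A lone asterisk level: zero or more directory levels.
            parts.append(r"(?:[^\\]+\\)*")
        else:
            parts.append(_name_regex(d) + r"\\")
    parts.append(_name_regex(file_pattern))
    return re.compile("".join(parts), re.IGNORECASE)


def matches(spec, path):
    return to_regex(spec).fullmatch(path) is not None


if __name__ == "__main__":
    files = [
        r"C:\Windows\a.jpg",
        r"C:\Windows\cache\b.jpg",
        r"C:\Windows\x\cache\c.jpg",
        r"C:\Windows\cache\deep\d.jpg",
    ]
    for spec in (
        r"C:\Windows\*\*.jpg",
        r"C:\Windows\*\cache\*.jpg",
        r"C:\Windows\*\cache\*\*.jpg",
    ):
        print(spec, [f for f in files if matches(spec, f)])
```

     Under these assumptions, the first pattern selects every sample
     file, the second selects only files sitting directly inside a
     cache directory, and the third also reaches files deeper below
     a cache directory.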

     Here are some variations of the multi-level directory specifier:

        \*\         // zero or more levels of subdirectory
        \?*\        // exactly one level of subdirectory of any name
        \*\?*\      // one or more levels of subdirectory

     XXCOPY sets no particular limit.  You may use as many
     wildcards as you want in the source specifier.  Of course,
     there is a practical limit on the total length of the source
     specifier (260 characters in all for a full pathname in
     Windows).
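
     The three variations listed above differ only in how many
     directory levels they require.  As a rough illustration, the
     hypothetical helper below counts that for a directory pattern.
     It assumes that a lone * level contributes zero or more levels
     and that every other component contributes exactly one.

```python
def level_range(dir_pattern):
    """Return (minimum_levels, unbounded) for a directory pattern.

    A lone "*" component adds zero or more levels; any other
    component (such as "?*" or "cache") adds exactly one level.
    """
    components = [c for c in dir_pattern.split("\\") if c]
    minimum_levels = sum(1 for c in components if c != "*")
    unbounded = any(c == "*" for c in components)
    return minimum_levels, unbounded


if __name__ == "__main__":
    for pattern in ("\\*\\", "\\?*\\", "\\*\\?*\\"):
        print(pattern, level_range(pattern))
```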


Just for old-timers' finger habit:

  For backward compatibility, mostly to accommodate old-timers' finger
  habit, Microsoft allows *.* to denote any file (or directory) name,
  which does not necessarily have a dot character in it.  To honor
  the same tradition (and to remain fully XCOPY-compatible), XXCOPY
  accepts *.* as equivalent to the simpler (and preferred) single
  asterisk, *.  For symmetry, the multi-level subdirectory
  matching sequence \*\ may be substituted by \*.*\.  Similarly,
  \*\*\ (or even \*\*\*\*) is a redundant (but permissible) expression
  which will be treated as equivalent to \*\.
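
  These equivalences amount to a small normalization step.  The
  sketch below shows one way to express it.  The function name
  normalize is hypothetical, and nothing here is taken from
  XXCOPY's source.  It rewrites every *.* component as * and
  collapses consecutive \*\ directory levels into one.

```python
def normalize(spec):
    """Rewrite *.* as * and collapse repeated \\*\\ directory levels."""
    parts = spec.split("\\")
    out = []
    for i, part in enumerate(parts):
        if part == "*.*":
            part = "*"
        is_directory_level = i < len(parts) - 1
        # Drop a lone "*" level that directly follows another one.
        if is_directory_level and part == "*" and out and out[-1] == "*":
            continue
        out.append(part)
    return "\\".join(out)


if __name__ == "__main__":
    print(normalize(r"C:\Windows\*.*\*\*.jpg"))
```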


What is the "Base Directory":

  We call the "constant" part of the source directory in an XXCOPY
  operation the Base Directory.  There is always only one Base
  Directory in an XXCOPY command.  In the traditional XCOPY-compatible
  (without wildcard) source directory specifier, the pathname up to
  the last name (the file_pattern) was the Base Directory.  With
  wildcards in the source specifier, the Base Directory
  refers to the first part of the source specifier which does not
  contain any wildcard character.  This is why there is always only
  one Base Directory.

  The distinction between the Base Directory and the directory_pattern
  is not just a matter of naming.  The Base Directory is the directory
  level against which relative paths are resolved.  It is used both
  in forming the destination directory and as the reference point
  for an exclusion (/X) directory.

  For example, using a command line similar to the ones shown earlier:

     XXCOPY   C:\Windows\*\*cache*\*.jpg   D:\dst\  /I

  In the destination directory, you will find files like...

     C:\Windows\abc\mycache\xrated.jpg --> D:\dst\abc\mycache\xrated.jpg
     C:\Windows\a\b\cachex\xxx_pic.jpg --> D:\dst\a\b\cachex\xxx_pic.jpg
     C:\Windows\cache\pta_oked.jpg     --> D:\dst\cache\pta_oked.jpg

     (The /I switch lets a new directory be created if it is missing.)
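
  The mapping shown above follows one rule: the part of the source
  path below the Base Directory is appended to the destination
  directory.  A minimal sketch of that rule, using Python's
  pathlib, is given here.  The function name destination_for is
  hypothetical, and the sketch only reflects the examples in this
  article.

```python
from pathlib import PureWindowsPath


def destination_for(source_file, base_dir, dest_dir):
    """Re-root a source file from the Base Directory to the destination."""
    # relative_to() raises ValueError if the file is not under base_dir.
    relative = PureWindowsPath(source_file).relative_to(PureWindowsPath(base_dir))
    return str(PureWindowsPath(dest_dir) / relative)


if __name__ == "__main__":
    print(destination_for(
        r"C:\Windows\abc\mycache\xrated.jpg", "C:\\Windows\\", "D:\\dst\\"
    ))
```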

  The Base Directory in this case is the

      C:\Windows\

  which is the longest source directory path which does not contain
  a wildcard.  So, if you have a relative reference in an exclusion
  switch, the path will be relative to the Base Directory.

  For instance,

     XXCOPY  C:\Windows\*\*cache*\*.jpg   D:\dst\  /Xcache*\

     Here, the exclusion specifier (/Xcache*\) gives the pattern for
     the directories to be excluded as "cache*\", which is relative to
     the Base Directory, that is, C:\Windows\cache*\.  The same
     exclusion can be written with the full path:

     XXCOPY  C:\Windows\*\*cache*\*.jpg  D:\dst\  /XC:\Windows\cache*\

     In either form, the following file would be caught by the
     exclusion specifier.

     C:\Windows\cache\pta_oked.jpg
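
  The way a relative exclusion is anchored to the Base Directory can
  be modeled as follows.  This is a simplified sketch for directory
  exclusions only, with hypothetical function names.  It assumes
  that the wildcard appears only in the last name of the exclusion
  and that name matching ignores case.  It is not XXCOPY's own
  implementation.

```python
from fnmatch import fnmatchcase
from pathlib import PureWindowsPath


def resolve_exclusion(exclusion, base_dir):
    """Anchor a relative exclusion pattern at the Base Directory."""
    if PureWindowsPath(exclusion).is_absolute():
        return exclusion
    return base_dir + exclusion


def is_excluded(file_path, exclusion, base_dir):
    """True if some ancestor directory of file_path matches the exclusion."""
    pattern = PureWindowsPath(resolve_exclusion(exclusion, base_dir))
    for ancestor in PureWindowsPath(file_path).parents:
        same_parent = ancestor.parent == pattern.parent
        if same_parent and fnmatchcase(ancestor.name.lower(), pattern.name.lower()):
            return True
    return False


if __name__ == "__main__":
    base = "C:\\Windows\\"
    print(is_excluded(r"C:\Windows\cache\pta_oked.jpg", "cache*\\", base))
```

  In this model the relative form cache*\ and the full form
  C:\Windows\cache*\ give the same answer for every file, because
  the relative form is resolved against the Base Directory first.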


Does the Wild-Wildcard Source scheme apply to the exclude switch?

  Unfortunately, the answer is NO.  The exclusion specifier is
  not implemented as flexibly as the source directory
  specifier.  This is mostly for the sake of reasonable performance.
  If exclusion specifiers were given total freedom in the
  placement of wildcard characters, just like the source
  specifier, then unless we came up with a very clever algorithm,
  the combinatorial explosion would be so severe and the operation
  so intolerably slow that it would not be useful --- that is
  our official excuse at least.  On the other hand, the current
  set of exclusion features is chosen in such a way that the
  overall XXCOPY performance is not severely compromised even
  by a very large number of exclusion specifiers.   Currently,
  the use of wildcards in an excluded item is limited to the
  last name (either file or directory) portion of the specifier.
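
  The restriction stated above is easy to check mechanically.  The
  hypothetical helper below reports whether an exclusion specifier
  keeps its wildcards in the last name only.  It reflects the rule
  as described in this article and is not a function of XXCOPY.

```python
def exclusion_is_supported(exclusion):
    """True if wildcards appear only in the last name of the specifier."""
    names = [n for n in exclusion.split("\\") if n]
    leading = names[:-1]
    return not any("*" in n or "?" in n for n in leading)


if __name__ == "__main__":
    print(exclusion_is_supported("C:\\Windows\\cache*\\"))
    print(exclusion_is_supported("C:\\Windows\\*\\cache\\"))
```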



© Copyright 2016 Pixelab All rights reserved.
