Wildcards

1.   Introduction

Usually wildcards are handled by the calling shell. If you use, for example, ./path/*.szs, the shell searches for matching files and expands the command line, so that the SZS tools receive a list of files such as ./path/a.szs ./path/b.szs ./path/c.szs. This list can become so long that it exceeds the maximum length permitted for the command line.

For example, let's take the command wlect dis ./path/to/*.szs ... and assume that there are 1000 tracks in the directory ./path/to/. Since the outer shell resolves the pattern *.szs, all file names are written to the command line one after the other. A calculation based on ct.wiimm.de shows that the file names have an average length of 51 characters. Add 10 characters for the path and 1 character for the separator, which gives 62 characters per file. This means that the file names of the 1000 tracks alone require 62 000 characters on the command line. In principle, it should even be possible to process up to 4028 tracks. We ignore the additional _d variants for this calculation. For these 4028 tracks, the command line would then have to hold about 250 000 characters.
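The arithmetic above can be reproduced with a few lines of Python. This is only a worked calculation using the figures from the text; the constant and function names are made up for illustration.

```python
# Figures taken from the paragraph above.
AVG_NAME_LEN = 51    # average file name length
PATH_LEN = 10        # length of the path prefix
SEPARATOR_LEN = 1    # one separator per file name


def command_line_chars(track_count: int) -> int:
    """Characters needed on the command line for the file names alone."""
    per_file = AVG_NAME_LEN + PATH_LEN + SEPARATOR_LEN  # 62
    return track_count * per_file


print(command_line_chars(1000))   # 62000
print(command_line_chars(4028))   # 249736, roughly 250 000
```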

That is quite a lot, but it is not a problem for Linux, which reserves at least 2 MB for the command line. In Cygwin, which is used for Windows, the limit is 32 000 characters, and in Windows 10 it is only 8192 characters. Neither is enough for our scenario.
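To put these limits in perspective, the following sketch estimates how many tracks would fit at 62 characters per file name. It assumes that "2 MB" means 2 × 1024 × 1024 characters and ignores the command name and any other arguments, so the results are rough upper bounds.

```python
CHARS_PER_TRACK = 51 + 10 + 1  # name + path + separator, as in the example

# Limits quoted in the text; the byte value for "2 MB" is an assumption.
LIMITS = {
    "Linux (at least 2 MB)": 2 * 1024 * 1024,
    "Cygwin": 32_000,
    "Windows 10": 8192,
}


def max_tracks(limit_chars: int) -> int:
    """Rough number of file names that fit into the given limit."""
    return limit_chars // CHARS_PER_TRACK


for name, limit in LIMITS.items():
    print(f"{name}: about {max_tracks(limit)} tracks")
```

With these assumptions, the Cygwin limit allows about 516 tracks and the Windows 10 limit about 132, both far below the 1000 tracks of the example.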

Another problem is the use of wildcards in options. For example, --add-section *.gct expands to --add-section a.gct b.gct. Only the first file, a.gct, is treated as the parameter of the option.
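The effect can be illustrated with any option parser that takes exactly one argument per option. The sketch below uses Python's argparse purely as a stand-in; it does not show how the SZS tools parse their command line.

```python
import argparse

# Stand-in parser: one option that takes exactly one argument.
parser = argparse.ArgumentParser()
parser.add_argument("--add-section")

# What a program receives after the shell has expanded the unquoted *.gct.
expanded_argv = ["--add-section", "a.gct", "b.gct"]
options, leftover = parser.parse_known_args(expanded_argv)
print(options.add_section)  # a.gct
print(leftover)             # ['b.gct'] is no longer part of the option

# With quoting, the pattern reaches the program unchanged.
quoted_argv = ["--add-section", "*.gct"]
quoted_options, quoted_leftover = parser.parse_known_args(quoted_argv)
print(quoted_options.add_section)  # *.gct
```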

For these cases, the SZS tools evaluate wildcards themselves for most commands and some options. In the examples above, you then only have to prevent the outer shell from evaluating the wildcards. This is usually done by quoting the parameters: use "./path/*.szs" or --add-section "*.gct".

You can test the expansion with the commands wszst WILDCARDS and wszst EXPAND. For example, try wszst WILDCARDS "./path/to/*.szs".

The following sections describe how wildcards are resolved.

1.1   Behavior

Some commands and options accept wildcards and expand them internally. Wildcards were implemented in 3 different variants. The built-in help and this website describe the details of the implementation.

Usually wildcards of all parameters are evaluated and the matching file names are stored in a list with their real file names. This list is sorted by file paths. Duplicates are removed from the list. After adding all file names, the individual files are processed. The pipe character (|) is used to control the wildcard scanning.

For some commands and all options, the parameters are evaluated step by step. If available, the wildcards are evaluated immediately and all matching files are processed directly.

1.2   The pipe character

Since v2.36 the pipe character (|) is used to control the wildcard scanning. The pipe character was chosen because it is very rarely used in filenames. Quote all parameters to avoid interpretation by the command shell.

You can use the switches |+ and |- to allow or forbid (default) the search for hidden files. These switches apply to the subsequent parameters with wildcards.

If a filename begins with a pipe character (like |filename), then wildcard and pipe character parsing is disabled. Instead, filename is used exactly as specified.

For commands with sub-file support, a pipe character in the middle is used as the separator between file name and sub-file name. If your file name really contains a pipe character, then use 2 pipe characters in a row.

Examples:

"|+"
Enable search for hidden files.
"./|+"
Do a wildcard search for pattern ./|+.
"abc.szs"
Use filename abc.szs as specified, because it contains no wildcard and no pipe character.
"abc*.szs"
Do a wildcard search for pattern abc*.szs.
"|abc*.szs"
Use filename abc*.szs as specified, because of the pipe character at the beginning.
"||abc*.szs"
Use filename |abc*.szs as specified including the second pipe character, because of the pipe character at the beginning.
"abc.szs|course.kmp"
Use filename abc.szs as specified. And then select sub-file course.kmp.
"abc*.szs|course.kmp"
Do a wildcard search for pattern abc*.szs. Select sub-file course.kmp for each file found.
"abc*.szs||course.kmp"
|| is not a sub-file separator and will be replaced by a single pipe character. So do a wildcard search for pattern abc*.szs|course.kmp.
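The rules behind these examples can be summarized in a small Python sketch. It is not the toolset's implementation: the function name and the returned tuples are hypothetical, it assumes a command with sub-file support, and it does not model the "./|+" example, whose exact handling is not described here.

```python
# Wildcard characters as listed in the section "Simple Wildcards".
WILDCARD_CHARS = frozenset("*?#[{\t")


def interpret(param: str):
    """Return (kind, file_part, subfile) for one quoted parameter (sketch)."""
    if param == "|+":
        return ("hidden-on", None, None)
    if param == "|-":
        return ("hidden-off", None, None)
    if param.startswith("|"):
        # A leading pipe disables all parsing; the rest is the file name.
        return ("literal", param[1:], None)

    name_chars = []
    subfile = None
    i = 0
    while i < len(param):
        if param[i] == "|":
            if param[i + 1:i + 2] == "|":
                name_chars.append("|")  # doubled pipe means a literal pipe
                i += 2
                continue
            subfile = param[i + 1:]     # single pipe separates the sub-file
            break
        name_chars.append(param[i])
        i += 1

    name = "".join(name_chars)
    kind = "wildcard" if WILDCARD_CHARS & set(name) else "literal"
    return (kind, name, subfile)


print(interpret("abc*.szs|course.kmp"))
print(interpret("abc*.szs||course.kmp"))
```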

1.3   --no-wildcards

Disable wildcard parsing and use each filename exactly as specified.
Option --no-wildcards is only available for commands and options with wildcard support. It disables wildcard and pipe character parsing entirely, so that each filename is used exactly as specified.

1.4   --in-order

Process the input files in order of the command line and don't delete duplicates.
Option --in-order is only available for commands and options with wildcard support. It disables sorting, so that all file names are appended to the list in the order of the command line. This means that no duplicates are recognized or removed from the list. The result is an evaluation similar to that of old versions of the tools.
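The difference between the default behavior and --in-order can be sketched as follows. The function is hypothetical, it assumes that sorting is a plain string comparison of the paths, and it leaves out the resolution of real file names.

```python
def build_file_list(expanded, in_order=False):
    """Sketch of how the list of input files is assembled.

    expanded: file names after wildcard expansion, in command line order.
    """
    if in_order:
        # --in-order: keep the command line order and keep duplicates.
        return list(expanded)
    # Default: remove duplicates and sort by path.
    return sorted(set(expanded))


names = ["b.szs", "a.szs", "b.szs"]
print(build_file_list(names))                 # ['a.szs', 'b.szs']
print(build_file_list(names, in_order=True))  # ['b.szs', 'a.szs', 'b.szs']
```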

2.   Simple Wildcards

The SZS tools choose between doing a simple string match and wildcard matching by checking if the pattern contains one of these wildcard characters: *, ?, #, [, { and TAB. This check is done for every directory and for the base name of the path.
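As a rough illustration of this per-component check, the sketch below splits a path at "/" and reports for each part whether it would be treated as a wildcard pattern or as a plain string. The function name is hypothetical, and what each wildcard character matches is not modeled.

```python
# Characters that trigger wildcard matching, as listed above.
WILDCARD_CHARS = frozenset("*?#[{\t")


def classify_components(path: str):
    """Return (component, 'wildcard' | 'string') for each path component."""
    return [
        (part, "wildcard" if WILDCARD_CHARS & set(part) else "string")
        for part in path.split("/")
        if part  # skip empty parts caused by leading or doubled slashes
    ]


for part, kind in classify_components("./tracks/abc*.szs"):
    print(f"{part}: {kind}")
```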

3.   Wildcard **

A special case is the wildcard **. It is recognized only at the beginning of a path component. Optionally it can be followed by a range specification to declare the minimum and/or maximum recursion (sub-directory) depth. The following list explains all allowed variants:
**
This matches any number (≥0) of sub-directories.
**A
If A is an unsigned integer, then it matches exactly A sub-directories.
**A-
If A is an unsigned integer, then it matches from A to unlimited sub-directories.
**A-B
If A and B are unsigned integers, then it matches from A to B sub-directories.
**-B
If B is an unsigned integer, then it matches from 0 to B sub-directories.
Examples:  ./tracks/**/*.szs  **-2/*.szs
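The variants listed above can be read as a minimum and maximum depth. The following sketch parses the start of a path component into such a pair; the function name and the regular expression are assumptions for illustration, not the toolset's parser, and None stands for "unlimited".

```python
import re

# "**", optionally followed by A, A-, A-B or -B (A, B unsigned integers).
_DEPTH_RE = re.compile(r"\*\*(\d*)(-?)(\d*)")


def parse_depth_range(component: str):
    """Return (min_depth, max_depth) or None if component lacks a leading **."""
    m = _DEPTH_RE.match(component)  # match() anchors at the beginning
    if m is None:
        return None
    low, dash, high = m.groups()
    if not dash:
        if low:
            return (int(low), int(low))  # **A: exactly A sub-directories
        return (0, None)                 # **: any number of sub-directories
    return (int(low) if low else 0, int(high) if high else None)


for spec in ("**", "**3", "**2-", "**2-5", "**-2"):
    print(spec, parse_depth_range(spec))
```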