Globbing in Linux

Globbing is the process by which bash interprets glob (wildcard) characters to perform filename expansion, also called pattern matching. In this article, we will learn about globbing through various examples.

Introduction.

Globbing is the process whereby bash interprets glob characters and performs filename expansion, that is, matches a pattern against existing file names.

When the shell finds an unquoted glob in a command, it replaces the pattern with the sorted list of matching file names before the command runs.
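
To make the expansion step concrete, here is a minimal sketch. It assumes a scratch directory created with mktemp and a few made-up file names, and it relies on bash's default settings (with the nullglob or failglob options set, the last case behaves differently).

```shell
# Work in a throwaway directory so no real files are touched.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch notes.txt todo.txt image.png   # hypothetical file names

# The shell replaces *.txt with the matching names before echo runs,
# so echo only ever sees the expanded list.
expanded=$(echo *.txt)
echo "unquoted: $expanded"

# Quoting the pattern disables globbing; echo receives the literal text.
literal=$(echo '*.txt')
echo "quoted:   $literal"

# With no match, bash leaves the pattern unchanged by default.
unmatched=$(echo *.csv)
echo "no match: $unmatched"
```

The command itself never sees the wildcard; it only receives whatever the shell produced from it.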

Globbing patterns.

  1. Asterisk (*). Globbing is used with many commands to act on a group of files at once; for example, we can concatenate, list, remove, or archive files with a single pattern.

For example, to list all files with .txt extension, we write:

$ ls *.txt

To remove them using the rm command we write:

$ rm *.txt

In the above examples, we use the * (asterisk) wildcard to match any string, including an empty one, before .txt. It should not be confused with globstar, which is the separate ** pattern.

If we know the first character of a file name, we can find the file by writing:

$ find f*.txt

Here the shell expands the pattern to every name in the current directory that starts with f, continues with any characters, and ends with .txt. The find command then receives those names as its starting paths.
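
As a small sketch of how the asterisk behaves, the following uses a scratch directory and made-up file names. The locale is pinned to C only so that the sort order of the results is predictable.

```shell
export LC_ALL=C            # fixed locale so the sort order is predictable
demo_dir=$(mktemp -d)      # scratch directory; file names are made up
cd "$demo_dir" || exit 1
touch f.txt file1.txt foo.txt bar.txt f.log

all_txt=$(echo *.txt)      # every name ending in .txt
f_txt=$(echo f*.txt)       # names starting with f and ending in .txt

echo "*.txt  -> $all_txt"
echo "f*.txt -> $f_txt"
```

Note that f.txt is matched by f*.txt, because * may match zero characters.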

  1. Question mark (?). We can match exactly one character with the ? (question mark) glob as follows:
$ ls file?.txt

The above example assumes a naming convention in which files are called file1.txt, file2.txt, file3.txt, and so on. In this case, ? matches exactly one character after file. It is not limited to digits; any single character matches, but the name must still end in .txt.

We can also repeat this glob to match names with a specific number of characters, for example:

$ rm ????.txt

Here we remove all files whose names have exactly four characters before the .txt file extension.
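
The sketch below previews what these question mark patterns would match before anything is removed. The scratch directory and file names are assumptions made for illustration.

```shell
export LC_ALL=C            # fixed locale so the sort order is predictable
demo_dir=$(mktemp -d)      # scratch directory; file names are made up
cd "$demo_dir" || exit 1
touch file1.txt file2.txt fileA.txt file10.txt abcd.txt abc.txt

# ? stands for exactly one character, so file10.txt is not matched.
one_char=$(echo file?.txt)

# Four question marks match names with exactly four characters before .txt.
four_chars=$(echo ????.txt)

echo "file?.txt -> $one_char"
echo "????.txt  -> $four_chars"
```

Running echo with the pattern first is a cheap way to check what rm would receive.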

We can also use the question mark glob to match a file extension. For example, to match every file named test with a four-character extension, such as .docx or .java, we write:

$ cat test.????

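In Linux, the names of hidden files begin with a . character. A plain * does not match them by default, so the leading dot must be written explicitly. To list all hidden files we write:

$ ls -d .*

Here we match any string after the leading dot. The -d option makes ls print the matching names themselves rather than the contents of matching directories, which matters because, depending on the shell and its version, the pattern may also match the special entries . and ..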
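
The following sketch shows the difference between * and a pattern with an explicit leading dot. It uses the pattern .[!.]*, a common idiom chosen here as an assumption so that . and .. are never included; the file names are made up.

```shell
export LC_ALL=C            # fixed locale so the sort order is predictable
demo_dir=$(mktemp -d)      # scratch directory; file names are made up
cd "$demo_dir" || exit 1
touch .config .hidden visible.txt

# By default, * skips names that start with a dot.
plain=$(echo *)

# A dot, then one character that is not a dot, then anything:
# this matches hidden names but never . or ..
dotfiles=$(echo .[!.]*)

echo "*       -> $plain"
echo ".[!.]*  -> $dotfiles"
```

The trade-off of this idiom is that it also skips names beginning with two dots, which are rare in practice.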

  1. Square brackets ([ ]). Square brackets match any single character from a set or range. For example, to copy all files that have a digit in their file names, we write:
$ cp *[0-9]* destination/

Here every file with a digit between 0 and 9 in its name is copied to destination/, which stands for whatever target directory we choose. We can also narrow the range; for example, to copy only files whose names contain a digit between 1 and 4 we use the range [1-4].

Ranges are not limited to digits; we can also use letters, in upper or lower case, and we can combine letters and digits in one set.

For example, to remove all files whose names contain a digit between 1 and 4, a lowercase letter between f and j, or an uppercase letter between A and C, we write:

$ rm *[A-Cf-j1-4]*
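
To check which names such bracket patterns select before removing anything, we can expand them with echo. This is a sketch with made-up file names; the C locale is set as an assumption so that ranges such as A-C cover only those ASCII letters.

```shell
export LC_ALL=C            # ranges and sort order follow plain ASCII
demo_dir=$(mktemp -d)      # scratch directory; file names are made up
cd "$demo_dir" || exit 1
touch report1.txt report5.txt notes.txt Beta.log high.log

any_digit=$(echo *[0-9]*)        # a digit anywhere in the name
low_digit=$(echo *[1-4]*)        # a digit from 1 to 4 anywhere in the name
mixed=$(echo *[A-Cf-j1-4]*)      # one character from any of the three ranges

echo "*[0-9]*       -> $any_digit"
echo "*[1-4]*       -> $low_digit"
echo "*[A-Cf-j1-4]* -> $mixed"
```

A single character from any one of the ranges is enough for a name to match, which is why Beta.log (B) and high.log (h) are selected alongside report1.txt (1).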

We can also use POSIX character classes, which are interpreted according to the LC_CTYPE locale setting. For example, to list all files with a digit in their names, we write:

$ ls *[[:digit:]]*

To list all files whose names start with an uppercase letter, we write:

$ ls [[:upper:]]*

Other character classes include:

  • [[:alnum:]] - letters and digits, both lowercase and uppercase.
  • [[:alpha:]] - letters only, both lowercase and uppercase.
  • [[:punct:]] - punctuation characters, e.g. #, $, %, *, (, ), ^.
  • [[:lower:]] - lowercase letters only.
  • [[:space:]] - whitespace characters such as space, tab, and newline.
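
A short sketch of a few of these classes in use follows. The scratch directory and file names, including one with a space in it, are assumptions made for illustration.

```shell
export LC_ALL=C            # fixed locale so class membership is plain ASCII
demo_dir=$(mktemp -d)      # scratch directory; file names are made up
cd "$demo_dir" || exit 1
touch Report.txt data2.csv notes.md "my file.txt"

has_digit=$(echo *[[:digit:]]*)              # a digit anywhere in the name
starts_upper=$(echo [[:upper:]]*)            # first character is uppercase
has_space=$(printf '%s\n' *[[:space:]]*)     # whitespace anywhere in the name

echo "digit: $has_digit"
echo "upper: $starts_upper"
echo "space: $has_space"
```

The class name, including its colons, always sits inside an outer pair of square brackets, which is why the brackets appear doubled.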
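
  1. Caret (^). Unlike the patterns above, the caret as used here belongs to regular expressions, which grep applies to lines of text; it is not expanded by the shell. It anchors a match to the start of a line. For example, to list files whose names start with an uppercase letter between A and F we write:
$ ls | grep '^[A-F]'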

The position of the caret relative to the square brackets matters. Outside the brackets, it anchors the match to the start of the line. For example, to display all lines in a file that start with a letter between M and V we write:

$ grep '^[M-V]' File.txt

When used inside square brackets, as the first character, the caret negates the set: the expression matches any character that is not in the specified range. For example, to list all lines containing at least one character outside the range A to C we write:

$ grep '[^A-C]' File.txt
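
The two positions of the caret can be compared side by side. This is a sketch that assumes a temporary sample file with a few made-up lines.

```shell
export LC_ALL=C                     # ranges follow plain ASCII
sample=$(mktemp)                    # temporary file with made-up lines
printf '%s\n' Apple Mango banana ABC Violet > "$sample"

# Caret outside the brackets: the line must START with a letter from M to V.
starts_m_to_v=$(grep -c '^[M-V]' "$sample")

# Caret inside the brackets: the line must CONTAIN a character outside A to C.
has_outside=$(grep -c '[^A-C]' "$sample")

# Inverting that match with -v leaves lines made only of A, B, and C.
only_a_to_c=$(grep -v '[^A-C]' "$sample")

echo "start with M-V:       $starts_m_to_v lines"
echo "contain non-A-C char: $has_outside lines"
echo "only A-C characters:  $only_a_to_c"
```

Only Mango and Violet start with a letter from M to V, while every line except ABC contains at least one character outside A to C.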
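  1. Exclamation mark (!). In a shell glob, an exclamation mark at the start of a bracket expression negates the set, much as the caret does inside brackets in a regular expression. For example, to list all files whose names do not start with a letter between A and C we write:
$ ls [!A-C]*

Note that grep does not treat ! this way: in a regular expression, [!A-C] matches a literal ! or a letter between A and C.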
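
The sketch below contrasts the exclamation mark in a shell glob with the same bracket expression passed to grep. File names and sample lines are made up for illustration.

```shell
export LC_ALL=C            # ranges and sort order follow plain ASCII
demo_dir=$(mktemp -d)      # scratch directory; file names are made up
cd "$demo_dir" || exit 1
touch Alpha.txt Beta.txt Echo.txt delta.txt

# In a glob, [!A-C] means "one character that is NOT in A to C".
not_a_to_c=$(echo [!A-C]*)

# In a grep regular expression, ! is an ordinary character, so [!A-C]
# matches lines containing a literal ! or one of A, B, C.
grep_literal=$(printf '%s\n' 'ABC' 'xyz' 'wow!' | grep -c '[!A-C]')

echo "glob [!A-C]* -> $not_a_to_c"
echo "grep [!A-C]  -> $grep_literal matching lines"
```

In the grep call, the lines ABC and wow! match and xyz does not, which shows that no negation took place.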
  1. Dollar sign ($). In a regular expression, the dollar sign is the counterpart of the caret anchor: it anchors the match to the end of a line. For example, to display all lines ending with a specific character we use the grep command as follows:
$ grep 'E$' File.txt

In the example above we display all lines in the file File.txt that end with the letter E.

Another example:

$ grep 'ONG TEXT$' File.txt

In the above example, we print every line that ends with the text between the single quotation marks.
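
Both end-of-line matches can be tried on a small sample. This sketch assumes a temporary file whose lines are invented for the purpose.

```shell
sample=$(mktemp)           # temporary file with made-up lines
printf '%s\n' 'APPLE' 'Eagle' 'A LONG TEXT' 'LONG TEXT here' > "$sample"

# $ anchors the match to the end of the line; the match is case-sensitive,
# so Eagle (ending in lowercase e) is not selected.
ends_with_e=$(grep 'E$' "$sample")

# The whole phrase must sit at the end of the line.
ends_with_text=$(grep 'ONG TEXT$' "$sample")

echo "E\$         -> $ends_with_e"
echo "ONG TEXT\$  -> $ends_with_text"
```

Keeping the expression in single quotes ensures the shell passes the dollar sign to grep untouched.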

  1. Curly brackets ({ }). We can combine the globs we have learned about to get a more refined result by placing them, separated by commas, inside curly brackets. The shell first expands the braces into separate patterns and then expands each pattern as a glob. For example, let's list all files whose names consist of four characters followed by a digit between 1 and 3 and a three-character extension, together with all files whose names end with ile.txt:
$ ls {????[1-3].???,*ile.txt}

In the above example, text1.txt and text3.txt are matched by the first glob, while file.txt and File.txt are matched by the second glob.
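
The order of the two steps, braces first and globs second, can be seen in a short sketch. It assumes bash (brace expansion is not available in every shell) and a scratch directory with made-up file names.

```shell
export LC_ALL=C            # fixed locale so the sort order is predictable
demo_dir=$(mktemp -d)      # scratch directory; file names are made up
cd "$demo_dir" || exit 1
touch text1.txt text3.txt text9.txt file.txt File.txt

# Each alternative inside the braces becomes its own glob, expanded in turn.
combined=$(echo {????[1-3].???,*ile.txt})

# Brace expansion produces strings even when no such files exist.
no_files=$(echo {x,y}.none)

echo "combined -> $combined"
echo "no files -> $no_files"
```

The results of the first alternative are listed before those of the second, and text9.txt is skipped because 9 lies outside the range 1 to 3.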

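  1. Pipes (|). Inside an extended glob, the pipe separates alternatives, much like a logical OR in programming languages: the pattern matches the first alternative or the second. Extended globs such as +(...) are available in bash when the extglob option is enabled with shopt -s extglob. For example, to list files that start with f and end with a .txt or .pdf extension we write:
$ ls f*+(.txt|.pdf)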
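
Here is a minimal sketch of the extended glob in action, assuming bash and a scratch directory with made-up file names. The extglob option is switched on explicitly because it may be off by default.

```shell
shopt -s extglob           # enable extended globs such as +(...) in bash
export LC_ALL=C            # fixed locale so the sort order is predictable
demo_dir=$(mktemp -d)      # scratch directory; file names are made up
cd "$demo_dir" || exit 1
touch foo.txt foo.pdf foo.png bar.txt f.txt.pdf

# +(a|b) matches one or more occurrences of a or b.
set -- f*+(.txt|.pdf)      # store the expansion in the positional parameters
matched="$*"

echo "f*+(.txt|.pdf) -> $matched"
```

Where exactly one of the alternatives should match, the related form @(.txt|.pdf) can be used instead of +(.txt|.pdf).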

Summary.

Globbing involves expanding a wildcard pattern such as * or ? into a list of pathnames that match a pattern.

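We use globbing with commands such as rm, find, ls, cat, and cp to act on many files with a single pattern. The grep command uses the related but distinct syntax of regular expressions, in which ^ and $ anchor a match to the start and end of a line.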

Globbing is also useful in scripts that interact with user input. A user's input might be vague, so we need measures that handle such cases. For example, if we expect the input to be Yes, YES, or Y, we can match all of them with a pattern such as [Yy]* in a case statement.
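
A minimal sketch of this idea follows. The function name is_yes is hypothetical and chosen for illustration; the case statement matches its argument against glob patterns rather than file names.

```shell
# is_yes is a hypothetical helper: it succeeds when the answer
# starts with an upper- or lowercase y.
is_yes() {
    case "$1" in
        [Yy]*) return 0 ;;   # Yes, YES, y, yep, ...
        *)     return 1 ;;   # anything else, including empty input
    esac
}

for answer in Yes YES y no ""; do
    if is_yes "$answer"; then
        echo "'$answer' -> accepted"
    else
        echo "'$answer' -> rejected"
    fi
done
```

The pattern is deliberately loose and would also accept input such as yellow; a stricter script could list the accepted answers explicitly, for example with y|Y|yes|Yes|YES.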
