RegEx One Atom Twelve

Basic vs Extended Regular Expressions

The following two command-line forms (sans quotes) are interchangeable:

  • “egrep”
  • “grep -E”, i.e. grep executed with the command-line option [ -E ]
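
As a quick illustration of that equivalence, here is a minimal sketch, assuming a POSIX shell with GNU grep on the PATH. The sample words are made up for the demonstration. Recent GNU grep releases (3.8 and later) print an obsolescence warning when invoked as egrep, so the sketch runs grep -E and shows the egrep form only as a comment.

```shell
# Sample input (hypothetical): three words, one per line.
sample='color
colour
colr'

# Extended syntax: a bare `?` makes the preceding "u" optional.
matches=$(printf '%s\n' "$sample" | grep -E 'colou?r')
printf '%s\n' "$matches"
# prints:
#   color
#   colour

# On systems that still ship egrep, the same search is written:
#   printf '%s\n' "$sample" | egrep 'colou?r'
```

The third word, colr, is not printed because the pattern still requires the second "o".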

Grep (and Grep for Windows)

Depending on the nature of the search I want to perform, I often use grep, a GNU utility common on Linux systems. grep is also available for Windows, and I use it there when I can for its ability to scan file contents. Besides locating a desired string of text in a collection of files on the filesystem, using grep is also an exercise in regular expressions and command-line syntax.

When executing grep, one has the option to use basic, extended, or Perl-compatible regular expressions. As there are several options for executing a search, I often consult the --help option on the command line, or the included HTML help files. The following excerpt from section 5.5 of the Grep [for Windows] HTML help manual explains the difference between the default grep behaviour (basic regex) and that of egrep:

egrep | grep -E: grep manual section 5.5

In basic regular expressions the metacharacters `?', `+', `{', `|', `(', and `)' lose their special meaning;
instead use the backslashed versions `\?', `\+', `\{', `\|', `\(', and `\)'.

Traditional egrep did not support the `{' metacharacter, and some egrep implementations support `\{' instead, so portable scripts [ where “portable” refers to that which may be used on any of a variety of diverse computer systems ] should avoid `{' in `egrep' patterns and should use `[{]' to match a literal `{'.

GNU egrep attempts to support traditional usage by assuming that `{' is not special if it would be the start of an invalid interval specification. For example, the shell command `egrep '{1'' searches for the two-character string `{1' instead of reporting a syntax error in the regular expression. POSIX.2 allows this behavior as an extension, but portable scripts should avoid it.
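
The rules in the excerpt can be tried out with a few short commands. This is a minimal sketch, assuming GNU grep in a POSIX shell. The sample lines and variable names are hypothetical, chosen only so that each pattern selects something different.

```shell
# Hypothetical sample lines for the demonstration.
sample='ab
aab
a+b
{1}'

# Basic syntax (the default): a bare `+` is an ordinary character,
# so only the line containing a literal "a+b" matches.
basic_plain=$(printf '%s\n' "$sample" | grep 'a+b')

# Basic syntax: the backslashed `\+` means "one or more" (GNU grep).
basic_escaped=$(printf '%s\n' "$sample" | grep 'a\+b')

# Extended syntax: a bare `+` already means "one or more".
extended=$(printf '%s\n' "$sample" | grep -E 'a+b')

# Literal brace, written the portable way as a bracket expression.
brace=$(printf '%s\n' "$sample" | grep -E '[{]1')

printf 'basic, bare +:\n%s\n' "$basic_plain"
printf 'basic, backslashed +:\n%s\n' "$basic_escaped"
printf 'extended, bare +:\n%s\n' "$extended"
printf 'literal brace:\n%s\n' "$brace"
```

Under those assumptions, the basic pattern with a bare `+` selects only a+b, the backslashed basic pattern and the extended pattern both select ab and aab, and the bracket expression selects {1}.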

Note: I would like to give the reader a proper reference, but I cannot recall precisely where I obtained my DOS / Windows cmd.exe compatible version of grep (grep.exe), as there are multiple versions available. I believe it is the same one that is bundled with several other useful applications in the GNU utilities for Win32 distribution. For the most authoritative resource, I recommend following the link above (near the top of the article body), which points to the Free Software Foundation web page for grep.

Update, 2010-04-02: following the URL cited above for GNU Utilities for Win32, I encountered an HTTP authorization prompt (i.e. a dialogue window appears, asking for a username and password). Perhaps it is a temporary error, so try it if you wish. However, the URL I have stored in the primary links points instead to a SourceForge project titled Native Win32 Ports of Some GNU Utilities. As I recall, it is the same content. Weird. I could be wrong. Try the old link if you want, but I have more confidence in the SourceForge URL.
