The =~ operator is a binary regular expression matching operator available only within Bash’s extended conditional expression construct ([[ ]]). It tests whether the string on the left side matches the Extended Regular Expression (ERE) on the right side. The match succeeds if the pattern matches any part of the string, so use the ^ and $ anchors when the whole string must match.

[[ $string =~ $regex ]]

Execution Mechanics

  • Engine: Bash utilizes the system’s C library regex functions (regcomp and regexec) to evaluate the pattern as a POSIX ERE. Because regex parsing is delegated to the underlying system library, the availability of non-standard extensions varies. On systems using glibc (most Linux distributions), shorthand character classes like \s and \w are supported as ERE extensions. However, for strict cross-platform portability, standard POSIX character classes like [[:space:]] or [[:digit:]] are preferred.
  • Exit Status: The operator yields an exit status of 0 (true) if the string matches the pattern, 1 (false) if it does not match, and 2 if the regular expression is syntactically invalid.
  • Case Sensitivity: By default, the regex evaluation is case-sensitive. This behavior can be altered globally by enabling the nocasematch shell option (shopt -s nocasematch), which instructs the =~ operator to evaluate patterns case-insensitively.
  • Compatibility: This operator is a Bash extension and is not specified by POSIX. The standard [ ] (test) command does not recognize it: instead of performing a match, Bash reports an error such as "binary operator expected" and returns a non-zero status.
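
The exit statuses and the effect of nocasematch listed above can be observed with a short sketch. The sample string, the pattern, and the deliberately invalid pattern ( are illustrative assumptions, not part of the original text.

```bash
#!/usr/bin/env bash
# Sketch: observe the exit status of =~ for a non-match, a match,
# and an invalid pattern. All values here are illustrative.

input='Release-42'
pattern='^release-[[:digit:]]+$'

[[ $input =~ $pattern ]]
status_case_sensitive=$?      # 1: "R" does not match "r" by default

shopt -s nocasematch          # make =~ ignore letter case
[[ $input =~ $pattern ]]
status_case_insensitive=$?    # 0: the same pattern now matches
shopt -u nocasematch          # restore the default behavior

bad_pattern='('               # unbalanced parenthesis: not a valid ERE
[[ $input =~ $bad_pattern ]] 2>/dev/null
status_invalid=$?             # 2: the pattern could not be compiled

printf 'case-sensitive=%d case-insensitive=%d invalid=%d\n' \
  "$status_case_sensitive" "$status_case_insensitive" "$status_invalid"
```

Since nocasematch is a shell-wide option, turning it off again right after the test keeps later comparisons in the same script unaffected.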

Quoting and Evaluation Rules

The parsing of the right-hand side of the =~ operator is highly sensitive to quoting. If any part of the regular expression is enclosed in single (') or double (") quotes, Bash matches that quoted portion as a literal string, so any regex metacharacters inside it lose their special meaning.

# Evaluates as a regular expression (matches any string containing "a", "b", or "c")
[[ $string =~ [abc] ]]

# Evaluates as a literal string (matches any string containing the exact sequence "[abc]")
[[ $string =~ "[abc]" ]]

To avoid parsing errors with complex patterns containing spaces or shell metacharacters (like |, <, >), the usual practice is to assign the regular expression to a variable and expand that variable unquoted on the right-hand side.

# Recommended syntax using POSIX ERE character classes
pattern='^prefix-([0-9]{3})[[:space:]]+(.*)$'
[[ $string =~ $pattern ]]
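
One consequence of the quoting rule is that the pattern variable itself must be expanded unquoted: writing "$pattern" makes Bash match the variable's contents literally. A minimal sketch, using a hypothetical version-string pattern chosen only for illustration:

```bash
#!/usr/bin/env bash
# Sketch: the same pattern variable, expanded unquoted and quoted.
# The pattern and sample strings are illustrative assumptions.

pattern='^v[0-9]+[.][0-9]+$'
version='v1.42'

# Unquoted expansion: the contents are interpreted as an ERE.
if [[ $version =~ $pattern ]]; then unquoted='match'; else unquoted='no match'; fi

# Quoted expansion: every character is matched literally, so the string
# would have to contain the text "^v[0-9]+[.][0-9]+$" itself.
if [[ $version =~ "$pattern" ]]; then quoted='match'; else quoted='no match'; fi

# A quoted expansion does match when the pattern text appears verbatim.
if [[ "regex: $pattern" =~ "$pattern" ]]; then verbatim='match'; else verbatim='no match'; fi

printf 'unquoted: %s\nquoted: %s\nverbatim: %s\n' "$unquoted" "$quoted" "$verbatim"
```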

Capture Groups and BASH_REMATCH

When a match succeeds, Bash automatically populates the array variable BASH_REMATCH with the overall match and any parenthesized subexpressions (capture groups). The shell manages this array itself. In Bash versions before 5.1 it is read-only, so attempting to modify, reassign, or unset it results in a runtime error (e.g., bash: BASH_REMATCH: readonly variable); from Bash 5.1 onward the variable is no longer read-only.
  • ${BASH_REMATCH[0]}: Contains the contiguous substring of the left-hand string that matched the entire regular expression.
  • ${BASH_REMATCH[n]}: Contains the substring that matched the nth parenthesized subexpression in the regex.

string="data-2048-xyz"
pattern='^([a-z]+)-([0-9]+)'

[[ $string =~ $pattern ]]

# Array state post-evaluation:
# ${BASH_REMATCH[0]} == "data-2048"
# ${BASH_REMATCH[1]} == "data"
# ${BASH_REMATCH[2]} == "2048"

The BASH_REMATCH array is overwritten by the shell every time the =~ operator is executed. If the match fails, the array is cleared.
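
Because every =~ evaluation replaces the array, captures that are needed later should be copied out right after the successful match. A minimal sketch, reusing the sample string and pattern from the example above; the variable names name, number, groups_after_match, and groups_after_failure are hypothetical:

```bash
#!/usr/bin/env bash
# Sketch: copy capture groups before another =~ overwrites BASH_REMATCH.

string='data-2048-xyz'
pattern='^([a-z]+)-([0-9]+)'

if [[ $string =~ $pattern ]]; then
  name=${BASH_REMATCH[1]}                  # "data"
  number=${BASH_REMATCH[2]}                # "2048"
  groups_after_match=${#BASH_REMATCH[@]}   # 3: whole match plus two groups
fi

# A later, failing match leaves BASH_REMATCH empty...
[[ $string =~ ^nomatch ]]
groups_after_failure=${#BASH_REMATCH[@]}

# ...but the copies taken earlier are unaffected.
printf 'name=%s number=%s groups=%d after_failure=%d\n' \
  "$name" "$number" "$groups_after_match" "$groups_after_failure"
```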