deepdream of
          a sidewalk
Show Answer
Math and science::Theory of Computation

Regular expressions

Regular expressions can be seen as a parallel to arithmetic expressions.

  • (5+3)×4 is an expression whose symbols represent numbers (e.g. natural numbers) and as a whole represents another number, 32.
  • (01)0 is an expression involving [...] and as a whole represents [...].

A regular expression R is a sequence of symbols that is one of:

  1. a symbol a for some a in an alphabet Σ
  2. ε
  3. (R1R2), where R1 and R2 are regular expressions
  4. (R1R2), where R1 and R2 are regular expressions
  5. (R1), where R1 is a regular expression 

A regular expression is said to represent a language according to the rules below. We write L(R) to mean the language of the regular expression R.

  1. From item 1, the regular expression a represents the language {a}.
  2. From item 2, the regular expression ε represents the language {ε}.
  3. From item 3, the regular expression represents [...].
  4. From item 4 & 5, the regular expression (R1R2) and (R1R2) represent the languages obtained by taking the union and concatenation of the languages represented by R1 and R2.
  5. From item 6, the regular expression (R1) represents the star of the language represented by R1.

We say that two regular expressions R1 and R2 are equal if they represent the same language, and we write R1=R2.

This definition is lacking in how it describes 'represents'. Maybe it is a function. I think it's coverage of equality is just sufficient, as it delegates the well defined formulation of set equality.

ε vs

The regular expression ε represents the language containing a single string, the empty string, {ε}. The regular expression represents the empty language, {}.

Precedence order

Star has highest precedence, followed by concatenation, with union last.

R+Shorthand

R+ is used as convenient shorthand for [...].

Two identities

If R is any regular expression then:

R[...]=R
R[...]=R