 Math and science::Theory of Computation

# Regular expressions

Regular expressions can be seen as a parallel to arithmetic expressions.

• $$(5 + 3) \times 4$$ is an expression whose symbols represent numbers (e.g. natural numbers) and as a whole represents another number, 32.
• $$(0 \cup 1)0^\star$$ is an expression involving [...] and as a whole represents [...].

A regular expression $$R$$ is a sequence of symbols that is one of:

1. a symbol $$a$$ for some $$a$$ in an alphabet $$\Sigma$$
2. $$\varepsilon$$
3. $$\emptyset$$
4. $$(R_1 \cup R_2)$$, where $$R_1$$ and $$R_2$$ are regular expressions
5. $$(R_1 \circ R_2)$$, where $$R_1$$ and $$R_2$$ are regular expressions
6. $$(R_1^*)$$, where $$R_1$$ is a regular expression

A regular expression is said to represent a language according to the rules below. We write $$L(R)$$ to mean the language of the regular expression $$R$$.

1. From item 1, the regular expression $$a$$ represents the language $$\{a\}$$.
2. From item 2, the regular expression $$\varepsilon$$ represents the language $$\{ \varepsilon \}$$.
3. From item 3, the regular expression $$\emptyset$$ represents [...].
4. From item 4 & 5, the regular expression $$( R_1 \cup R_2)$$ and $$(R_1 \circ R_2)$$ represent the languages obtained by taking the union and concatenation of the languages represented by $$R_1$$ and $$R_2$$.
5. From item 6, the regular expression $$(R_1^\star)$$ represents the star of the language represented by $$R_1$$.

We say that two regular expressions $$R_1$$ and $$R_2$$ are equal if they represent the same language, and we write $$R_1 = R_2$$.

This definition is lacking in how it describes 'represents'. Maybe it is a function. I think it's coverage of equality is just sufficient, as it delegates the well defined formulation of set equality.

#### $$\varepsilon$$ vs $$\emptyset$$

The regular expression $$\varepsilon$$ represents the language containing a single string, the empty string, $$\{\varepsilon\}$$. The regular expression $$\emptyset$$ represents the empty language, $$\{\}$$.

#### Precedence order

Star has highest precedence, followed by concatenation, with union last.

#### $$R^+$$Shorthand

$$R^+$$ is used as convenient shorthand for [...].

#### Two identities

If $$R$$ is any regular expression then:

$R \cup [...] = R$
$R \circ [...] = R$