Every regular expression denotes a regular language. The correspondence between the two is defined by the following rules.
ϕ is a regular expression, and its regular language is { } (the empty set).
ϵ is a regular expression, and its regular language is {ϵ}.
If r is a single symbol of the alphabet, then r is a regular expression and its regular language is {r}.
Let R and S be regular expressions whose regular languages are $L_R$ and $L_S$.
R + S (also written R|S) is a regular expression, and its regular language is $L_R \cup L_S$.
R . S is a regular expression, and its regular language is $L_R \cdot L_S$.
R* is a regular expression, and its regular language is $L_R^*$.
$R^+$ is a regular expression, and its regular language is $L_R^+$.
where * means zero or more occurrences,
and a superscript + means one or more occurrences.
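
The rules above can be tried out with a small Python sketch. It is only an illustration under stated assumptions: languages are modeled as Python sets of strings, and because $L_R^*$ is usually infinite, every operation is cut off at a chosen `max_len`. The function names (`union`, `concat`, `star`, `plus`) are made up for this example.

```python
def union(l_r, l_s):
    """Language of R + S: every string in L_R or in L_S."""
    return l_r | l_s


def concat(l_r, l_s, max_len):
    """Language of R . S: a string of L_R followed by a string of L_S.

    Strings longer than max_len are dropped (an assumption of this sketch,
    needed only to keep the sets finite).
    """
    return {x + y for x in l_r for y in l_s if len(x + y) <= max_len}


def star(l_r, max_len):
    """Language of R*: zero or more strings of L_R joined together."""
    result = {""}      # zero occurrences gives the empty string
    frontier = {""}
    while frontier:
        # Append one more string of L_R and keep only the new results.
        frontier = concat(frontier, l_r, max_len) - result
        result |= frontier
    return result


def plus(l_r, max_len):
    """Language of R^+: one or more strings of L_R joined together."""
    return concat(l_r, star(l_r, max_len), max_len)


if __name__ == "__main__":
    print(sorted(star({"a"}, 3)))   # strings of a* up to length 3
    print(sorted(plus({"b"}, 3)))   # strings of b^+ up to length 3
```

Note that `star` always contains the empty string, while `plus` contains it only when $L_R$ itself does.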
Example:
Regular expression | Regular language |
---|---|
a | {a} |
b | {b} |
a+b | {a,b} |
a∙b | {ab} |
$a^*$ | {ϵ, a, aa, aaa, aaaa, ...} |
$b^+$ | {b, bb, bbb, bbbb, ...} |
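
The rows of the table can be checked with Python's standard `re` module. This is a rough sketch, not part of the formal definition. Python's pattern syntax differs from the notation used here: union is written `|` (so the table's a+b becomes `a|b`), concatenation is written by placing symbols side by side, and a `+` after a symbol means one or more occurrences. The helper `in_language` is a name chosen for this example.

```python
import re


def in_language(pattern, text):
    """Return True if the whole string `text` matches `pattern`.

    re.fullmatch is used so that the entire string must belong to the
    language, not just a part of it.
    """
    return re.fullmatch(pattern, text) is not None


if __name__ == "__main__":
    # Table row -> (Python pattern, sample strings to test)
    rows = {
        "a": ("a", ["a", "b"]),
        "a+b": ("a|b", ["a", "b", "ab"]),
        "a.b": ("ab", ["ab", "a"]),
        "a*": ("a*", ["", "a", "aaa", "b"]),
        "b^+": ("b+", ["", "b", "bbb"]),
    }
    for name, (pattern, samples) in rows.items():
        for s in samples:
            print(f"{name!r:8} {s!r:8} {in_language(pattern, s)}")
```

The empty string is accepted by `a*` but rejected by `b+`, which matches the difference between zero or more and one or more occurrences.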
Applications:
Regular expressions are useful in a wide variety of text processing tasks, and more generally string processing, where the data need not be textual. Common applications include data validation, data scraping (especially web scraping), data wrangling, simple parsing, the production of syntax highlighting systems, and many other tasks.
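
As a small illustration of data validation and data scraping, the sketch below uses Python's `re` module. The date-like pattern and the function names are assumptions made for this example. The pattern only checks the shape of the text (four digits, two digits, two digits), not whether the date actually exists.

```python
import re

# Hypothetical pattern for text shaped like YYYY-MM-DD.
DATE_PATTERN = re.compile(r"\d{4}-\d{2}-\d{2}")


def is_date_shaped(text):
    """Validation: the whole input must have the YYYY-MM-DD shape."""
    return DATE_PATTERN.fullmatch(text) is not None


def extract_dates(text):
    """Scraping: collect every date-shaped substring found in the text."""
    return DATE_PATTERN.findall(text)


if __name__ == "__main__":
    print(is_date_shaped("2024-01-31"))
    print(is_date_shaped("31/01/2024"))
    print(extract_dates("Open from 2023-05-01 to 2023-06-15."))
```

Validation uses `fullmatch` because the entire input must match, while scraping uses `findall` because the matches are embedded in longer text.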
Although regular expressions would also be useful in Internet search engines, evaluating them across the entire database could consume excessive computing resources, depending on the complexity and design of the expression.