# Regular Expressions

> *Regular expressions (regex)* are powerful tools for pattern matching and text processing. A regular expression is a pattern, written with a special set of characters, that is searched for in a string `str`.

This page summarizes the syntax for regular expressions in Julia. Each section includes an example to demonstrate the described methods.

### Functions <a href="#functions" id="functions"></a>

| Action                          | Function                    |
| ------------------------------- | --------------------------- |
| Check if regex matches a string | `occursin(r"pattern", str)` |
| Capture regex matches           | `match(r"pattern", str)`    |
| Specify alternative regex       | `pattern1\|pattern2`        |
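
The three entries in the table can be sketched as follows. The sample string and variable names are hypothetical and chosen only for illustration; `occursin`, `match`, and the `.match` field of the returned `RegexMatch` are part of Julia's standard library.

```julia
# hypothetical sample string, used only for illustration
str = "The quick brown fox"

# occursin returns true or false
has_fox = occursin(r"fox", str)       # true
has_dog = occursin(r"dog", str)       # false

# match returns a RegexMatch object, or nothing when there is no match
m = match(r"qu\w+", str)
if m !== nothing
    println(m.match)                  # prints: quick
end

# alternation: succeeds if either "cat" or "fox" is found
either = occursin(r"cat|fox", str)    # true
```

Because `match` can return `nothing`, it is usually safest to check the result before indexing into it.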

### Character Class <a href="#character_class" id="character_class"></a>

A *character class* specifies a list of characters to match (`[...]`, where `...` represents the list) or not to match (`[^...]`).

| Character Class                                            | Pattern       |
| ---------------------------------------------------------- | ------------- |
| Any lowercase vowel                                        | `[aeiou]`     |
| Any digit                                                  | `[0-9]`       |
| Any lowercase letter                                       | `[a-z]`       |
| Any uppercase letter                                       | `[A-Z]`       |
| Any digit, lowercase letter, or uppercase letter           | `[a-zA-Z0-9]` |
| Anything except a lowercase vowel                          | `[^aeiou]`    |
| Anything except a digit                                    | `[^0-9]`      |
| Anything except a space                                    | `[^ ]`        |
| Any character except a newline                             | `.`           |
| Any word character (equivalent to `[a-zA-Z0-9_]`)          | `\w`          |
| Any non-word character (equivalent to `[^a-zA-Z0-9_]`)     | `\W`          |
| A digit character (equivalent to `[0-9]`)                  | `\d`          |
| Any non-digit character (equivalent to `[^0-9]`)           | `\D`          |
| Any whitespace character (equivalent to `[ \t\r\n\f]`)     | `\s`          |
| Any non-whitespace character (equivalent to `[^ \t\r\n\f]`) | `\S`          |
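
A few of the classes above can be tried out as in the sketch below. The sample strings and variable names are hypothetical; the expected values in the comments assume plain ASCII input.

```julia
# hypothetical sample string, used only for illustration
word = "Julia2024"

has_vowel   = occursin(r"[aeiou]", word)          # true: contains "u", "i", "a"
only_alnum  = !occursin(r"[^a-zA-Z0-9]", word)    # true: nothing but letters and digits
digit_run   = match(r"\d+", word).match           # "2024"
first_upper = match(r"[A-Z]", word).match         # "J"
non_word    = match(r"\W", "a b").match           # " " (the space is a non-word character)

println(digit_run)
println(first_upper)
```

Here `match(...).match` is accessed directly only because each pattern is known to match its sample string; for arbitrary input, check for `nothing` first.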

### Anchors <a href="#anchors" id="anchors"></a>

*Anchors* are special characters that match a position in the string rather than a character, so a pattern can be required to appear at a specified position.

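| Anchor                                                        | Special Character |
| ------------------------------------------------------------- | ----------------- |
| Beginning of line (beginning of string without the `m` flag)  | `^`               |
| End of line (end of string without the `m` flag)              | `$`               |
| Beginning of string                                           | `\A`              |
| End of string (or just before a final newline)                | `\Z`              |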
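
The anchors can be compared on a two-line string, as in the sketch below. The sample string and variable names are hypothetical. Note that `^` and `$` match at line breaks only when the `m` (multiline) flag is appended to the regex literal; without it they match at the ends of the whole string.

```julia
# hypothetical two-line sample string, used only for illustration
text = "first line\nsecond line"

# \A anchors at the very beginning of the string
starts_first  = occursin(r"\Afirst", text)       # true
starts_second = occursin(r"\Asecond", text)      # false

# without the m flag, ^ matches only at the start of the string
plain_caret = occursin(r"^second", text)         # false

# with the m flag, ^ and $ also match at each line break
multi_caret  = occursin(r"^second"m, text)       # true
multi_dollar = occursin(r"first line$"m, text)   # true

# \Z anchors at the end of the string
ends_with_line = occursin(r"line\Z", text)       # true
```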

### Repetition and Quantifier Characters <a href="#repetition_and_quantifier_characters" id="repetition_and_quantifier_characters"></a>

*Repetition or quantifier characters* specify the number of times to match a particular character or set of characters.

| Repetition                     | Character |
| ------------------------------ | --------- |
| Zero or more times             | `*`       |
| One or more times              | `+`       |
| Zero or one time               | `?`       |
| Exactly n times                | `{n}`     |
| n or more times                | `{n,}`    |
| m or fewer times               | `{0,m}`   |
| At least n and at most m times | `{n,m}`   |
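
As a quick illustration of how the quantifiers differ, the sketch below anchors each pattern with `^` and `$` so that the whole string must match. The sample words and variable names are hypothetical.

```julia
# ? : the "u" may appear zero or one time
color_us = occursin(r"^colou?r$", "color")      # true
color_uk = occursin(r"^colou?r$", "colour")     # true

# + needs at least one "a", while * also accepts none
plus_empty = occursin(r"^ba+$", "b")            # false
star_empty = occursin(r"^ba*$", "b")            # true
plus_many  = occursin(r"^ba+$", "baaa")         # true

# {2,4} : between two and four digits
three_digits = occursin(r"^[0-9]{2,4}$", "123")    # true
five_digits  = occursin(r"^[0-9]{2,4}$", "12345")  # false
```

A longer example that combines quantifiers with capture groups follows.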

Input:

```julia
# regex.jl
number1 = "(555)123-4567"
number2 = "123-45-6789"

# check if matches
if occursin(r"\([0-9]{3}\)[0-9]{3}-[0-9]{4}", number1)
    println("match!")
end

if occursin(r"\([0-9]{3}\)[0-9]{3}-[0-9]{4}", number2)
    println("match!")
else
    println("no match!")
end

# capture matches
# use parentheses to "capture" different parts of a regular
# expression for later use; the first set of parentheses corresponds
# to index 1, the second to index 2, etc.

number_details = match(r"\(([0-9]{3})\)([0-9]{3}-[0-9]{4})", number1)

# match returns nothing when the pattern is not found
if number_details !== nothing
    area_code = number_details[1]
    phone_number = number_details[2]

    println("area code: $area_code")
    println("phone number: $phone_number")
end
```

Output:

```julia
match!
no match!
area code: 555
phone number: 123-4567
```

## Resources

* Julia Documentation: [Manual - Strings](https://docs.julialang.org/en/v1/manual/strings/) (see Regular Expressions)
* Think Julia: [Chapter 8 - Strings](https://benlauwens.github.io/ThinkJulia.jl/latest/book.html#chap08)
* [Regular Expressions 101](https://regex101.com/)
* [Regular Expressions Library](http://www.regexlib.com/)
* [Regular Expressions Cheat Sheet](http://www.regexlib.com/CheatSheet.aspx)
