Regex 101

Understanding Regex 101

Learn Regex and its basic methods while implementing brackets, flags, and quantifiers

Oct 16, 2019 · 10 min read

Intro to Regular Expressions

As a software developer, you’ve probably encountered regular expressions several times and were confused when seeing this daunting set of characters grouped together like this:

[code class=”php”]/^\w+([.-]?\w)+@\w+([.]?\w)+(\.[a-zA-Z]{2,3})+$/[/code]

And you may have wondered what this means…

Regular expressions (Regex or RegExp) are extremely useful in stepping up your algorithm game and will make you a better problem solver. The structure of regular expressions can be intimidating at first, but it is very rewarding once you grasp the patterns and implement them in your work properly.

What is Regex and why is it important?

A Regex, or regular expression, is a type of object that is used to help you extract information from any string data by searching through text to find what you need. Whether it’s numbers, letters, punctuation, or even white space, Regex allows you to check and match any character combination in strings.

For example, let’s say you needed to match the format of a social security number or email address. You can utilize Regex to check for patterns in the text strings and use it to replace or validate another substring. Think of Regex as your own search bar — it gives you the freedom to define your own search criteria for a pattern that fits your needs and assists you in finding what you were looking for.

Two Ways To Create A Regular Expression

1. Regular Expression Literal: The pattern is written directly between two forward slashes.

[code class=”php”]const regexLiteral = /helloworld/;[/code]

Syntax: /pattern/flags

2. Regular Expression Constructor: The RegExp constructor builds the expression for you at runtime from a string (or another Regex).

[code class=”php”]
const greeting = 'hello';
const regexConstr = new RegExp(greeting);
[/code]

Syntax: new RegExp(pattern[, flags])

Rule of thumb: If your regular expression is constant, use the Regex literal; it is compiled when the script loads, which gives better performance. If the pattern is dynamic, for example built from a variable, use the Regex constructor (see examples above).
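To make the rule of thumb concrete, here is a minimal sketch of both styles side by side. The variable `searchTerm` is a hypothetical stand-in for any value that is only known at runtime.

```javascript
// Literal: the pattern is fixed when the script is written.
const fixedPattern = /hello/i;

// Constructor: the pattern is built at runtime from a variable.
// `searchTerm` is a made-up value standing in for dynamic data.
const searchTerm = 'world';
const dynamicPattern = new RegExp(searchTerm, 'i'); // flags go in the second argument

console.log(fixedPattern.test('Hello there'));   // true
console.log(dynamicPattern.test('Hello World')); // true
```

Keep in mind that a dynamic value is interpreted as a pattern, so any special characters in it (such as . or +) keep their Regex meaning unless you escape them first.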

Regular Expression Methods

There are three common methods for working with Regex that you should be familiar with: test, match, and replace.

RegExp.prototype.test()

The .test method returns a boolean: true if the string contains a match for the search pattern, and false if it does not.

[code class=”php”]
const str1 = "i love regex";
const str2 = "it is cool";
const hasRegex = /regex/;

hasRegex.test(str1);
// expected output: true

hasRegex.test(str2);
// expected output: false
[/code]

String.prototype.match()

Instead of using RegExp.prototype.test(), which only returns a boolean telling you whether the pattern matched, you can use the match method. This method returns an array containing the matched text, or null when nothing matches. Though it’s great to have the test method check whether the pattern matches, there will be times when we want the matched text itself.

That’s where the .match method comes in handy! The array it returns can be helpful information depending on your use case. Here is a very basic example below. Later on you will see that when combined with flags, match becomes a powerful tool.

[code class=”php”]
const str = "I love JavaScript";
const result = str.match(/JavaScript/);
console.log(result)
// expected output: ['JavaScript'] (plus index, input, and groups properties)
[/code]
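As a small sketch of what match hands back, the example below looks at the result for a hit and for a miss. The variable names are made up for illustration.

```javascript
const matchText = 'I love JavaScript';

// A successful match returns an array with extra properties.
const matchHit = matchText.match(/JavaScript/);
console.log(matchHit[0]);    // 'JavaScript' (the matched text)
console.log(matchHit.index); // 7 (where the match starts)

// No match returns null, so check before reading from the result.
const matchMiss = matchText.match(/Python/);
console.log(matchMiss);      // null
```

Because a miss returns null rather than an empty array, it is worth guarding with a check such as `if (matchMiss)` before reading from the result.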

String.prototype.replace()

The .replace method searches a string for a specified value (or regular expression) and returns a new string where the specified value is replaced.

[code class=”php”]
const sentence = 'I love dogs more than cats.';
const regex = /dogs/;
console.log(sentence.replace(regex, 'bunnies'));
// expected output: "I love bunnies more than cats."
[/code]

NOTE: With a plain string as the search value, replace changes only the first occurrence of a word. To replace every occurrence, you can use a Regex with the g (global) flag.

[code class=”php”]
const str = "Hello World World!";
const replacement = str.replace("World", "Planet");
console.log(replacement)
// expected output: "Hello Planet World!"
[/code]
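For comparison, here is a sketch of the same replacement done with a Regex and the g (global) flag, which replaces every occurrence rather than only the first. The variable names are chosen just for this example.

```javascript
const greetingText = 'Hello World World!';

// The g flag tells replace to keep going after the first match.
const replacedAll = greetingText.replace(/World/g, 'Planet');

console.log(replacedAll); // 'Hello Planet Planet!'
```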

Bracket Expressions

Inside a bracket expression, you list the characters (or ranges of characters) that are allowed at that position. This list is called a character set.

For example, [code class=”php”]const regex = /[A-Z]/[/code]. Notice that A-Z is inside the square brackets, so this matches any single uppercase letter from A to Z.

[code class=”php”]
[a-z] matches any one lowercase letter from a to z
[A-Z] matches any one uppercase letter from A to Z
[abcd] matches any one of a, b, c, or d
[a-d] exactly the same as the previous example, so you can either list each character or use a range
[a-gA-C0-7] matches any one lowercase letter a-g, uppercase letter A-C, or digit 0-7
[^a-zA-Z] matches any one character that is NOT a lowercase or uppercase letter. (At the start of a character set, the ^ character negates the set.)
[/code]
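To see these character sets in action, here is a short sketch that tries a few of them against sample strings. The sample strings and variable names are invented for illustration.

```javascript
// [a-d] matches if the string contains at least one of a, b, c, or d.
const hasAtoD = /[a-d]/.test('cab');   // true
const noAtoD = /[a-d]/.test('xyz');    // false

// [^a-zA-Z] matches any character that is not a letter.
const onlyLetters = /[^a-zA-Z]/.test('abc');   // false: nothing but letters
const hasNonLetter = /[^a-zA-Z]/.test('abc1'); // true: '1' is not a letter

// Ranges can be combined inside one set.
const combined = 'a1B2'.match(/[a-gA-C0-7]/g); // ['a', '1', 'B', '2']

console.log(hasAtoD, noAtoD, onlyLetters, hasNonLetter, combined);
```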

Flags

After the closing slash, we can add a single flag or combine several. Flags change how the search is carried out, for example whether it stops at the first match or ignores letter case.

[code class=”php”]
const sentence = 'The Cat in the Hat is not a cat.';
const regex = /[A-Z]/;
const found = sentence.match(regex);
console.log(found);
// expected output: ['T'] (first match only)
[/code]

Before we go into the specific flags, you should keep in mind that flags are optional. Without flags, match stops at the first match of the pattern between the slashes. So in this case, our code will return [code class=”php”]['T'][/code] because T is the first uppercase letter in the sentence.
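As a sketch of how flags change the result, the example below reuses the same sentence with two standard JavaScript flags: g (global, find every match) and i (ignore case). The variable names are made up for this example.

```javascript
const catSentence = 'The Cat in the Hat is not a cat.';

// g: return every match instead of stopping at the first one.
const upperAll = catSentence.match(/[A-Z]/g);
console.log(upperAll); // ['T', 'C', 'H']

// g and i combined: every match, ignoring upper/lower case.
const catsAnyCase = catSentence.match(/cat/gi);
console.log(catsAnyCase); // ['Cat', 'cat']
```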

As an additional side note, there are three shorthand character classes that can help when matching common character sets: \d (any digit), \w (any word character: a letter, digit, or underscore), and \s (any whitespace character).

[code class=”php”]
const sentence = 'There are 350 dogs and 17 cats in the house.';
const regex = /\d/;
const found = sentence.match(regex);
console.log(found);
// expected output: ['3'] (first match only)
[/code]

The negations of \d, \w, and \s are \D, \W, and \S. They match any character that is not a digit, not a word character, and not whitespace, respectively.
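Here is a brief sketch of the shorthand classes and their negations, written with the backslash each one needs (\d, \w, \s, \D, \W, \S). The g flag is used to collect every match; the variable names are invented for illustration.

```javascript
const houseSentence = 'There are 350 dogs and 17 cats in the house.';

// \d with the g flag: every digit in the sentence.
const allDigits = houseSentence.match(/\d/g);
console.log(allDigits); // ['3', '5', '0', '1', '7']

// \D: the first character that is NOT a digit.
const firstNonDigit = houseSentence.match(/\D/)[0];
console.log(firstNonDigit); // 'T'

// \W: the first character that is NOT a word character (here, a space).
const firstNonWord = houseSentence.match(/\W/)[0];
console.log(firstNonWord === ' '); // true

// \S with the g flag: every character that is NOT whitespace.
const nonSpaces = 'a b\tc'.match(/\S/g);
console.log(nonSpaces); // ['a', 'b', 'c']
```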

Quantifiers

Quantifiers are symbols that specify how many times the preceding item must occur for the pattern to match.

Let’s go through this example to demonstrate our understanding of Regex.

[code class=”php”]
const str = 'for__if__rof__fi';
const regex = /[a-z]+/g;
const found = str.match(regex);
console.log(found);
[/code]

Here the regular expression matches any lowercase letter from a-z, and the + symbol means "one or more of the preceding item", so each run of consecutive letters is matched as a single unit. When you console log found, it will return [code class=”php”]['for', 'if', 'rof', 'fi'][/code].

Let’s say that + symbol was not there and the Regex was only:

[code class=”php”]const regex = /[a-z]/g;[/code]

Then it will return [code class=”php”]['f', 'o', 'r', 'i', 'f', 'r', 'o', 'f', 'f', 'i'][/code].
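Besides +, JavaScript Regex has a few other standard quantifiers. The sketch below shows ?, *, and {n} on made-up sample strings; the variable names are only for illustration.

```javascript
// ? : zero or one of the preceding item
const optionalU = /colou?r/;
console.log(optionalU.test('color'));  // true
console.log(optionalU.test('colour')); // true

// * : zero or more of the preceding item
const zeroRuns = '1 10 100'.match(/10*/g);
console.log(zeroRuns); // ['1', '10', '100']

// {n} : exactly n of the preceding item (here, the shape ddd-dd-dddd)
const ssnShape = /^\d{3}-\d{2}-\d{4}$/;
console.log(ssnShape.test('123-45-6789')); // true
console.log(ssnShape.test('123-456-789')); // false
```

The last pattern only checks the shape of a social security number, not whether the number is real.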

Putting it all together

Remember this long string of characters we saw at the beginning of this article?

That Regex is actually a very common use case: checking the format of an email address. Now that we have learned the basic methods and terminology used in Regex, let’s break this down one step at a time.

[code class=”php”]
const email = 'student-id@alumni.school.edu';
const regex = /^\w+([.-]?\w)+@\w+([.]?\w)+(\.[a-zA-Z]{2,3})+$/;
const found = regex.test(email);
console.log(found);
// expected output: true
[/code]
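One way to read the email pattern is piece by piece. The sketch below rebuilds it from commented parts, assuming the pattern is written with its escapes (\w for word characters and \. for a literal dot). The names `emailParts` and `emailPattern` are invented for this example.

```javascript
// Each piece of the email pattern, in order. Backslashes are doubled
// because the pieces are ordinary strings passed to the RegExp constructor.
const emailParts = [
  '^\\w+',               // starts with one or more word characters
  '([.-]?\\w)+',         // more word characters, each optionally preceded by . or -
  '@',                   // a literal @ sign
  '\\w+',                // the domain starts with one or more word characters
  '([.]?\\w)+',          // more word characters, each optionally preceded by a dot
  '(\\.[a-zA-Z]{2,3})+', // one or more endings such as .edu or .com (2 to 3 letters)
  '$',                   // end of the string
];
const emailPattern = new RegExp(emailParts.join(''));

console.log(emailPattern.test('student-id@alumni.school.edu')); // true
console.log(emailPattern.test('not-an-email'));                 // false
```

Splitting the pattern like this is only a reading aid; in real code the single literal is shorter and follows the earlier rule of thumb about constant patterns.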

And that’s it! Now we know how to use Regex for a basic email validation. Additionally, you can implement brackets, flags, and/or quantifiers in your Regex to accommodate other edge cases not considered in our Regex string.

Conclusion

It can be very beneficial for developers to gain knowledge in Regex. As seen above, Regex is commonly used to validate input such as email addresses. It can also be used when developers need to match URLs, parse through some text, or extract certain information such as a date in the format yyyy-mm-dd. Regex is everywhere!
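As a quick sketch of the date use case, the example below pulls every yyyy-mm-dd date out of a line of text. The sample text and variable names are made up for illustration.

```javascript
const logLine = 'Deployed on 2019-10-16, rolled back on 2019-10-17.';

// Four digits, a dash, two digits, a dash, two digits; g finds every date.
const datePattern = /\d{4}-\d{2}-\d{2}/g;
const dates = logLine.match(datePattern);

console.log(dates); // ['2019-10-16', '2019-10-17']
```

This checks only the shape of the date, so a string like 2019-99-99 would also match.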

People can easily excuse themselves from knowing Regex because it seems difficult to understand. But it doesn’t have to be. You can see it as a gradual curve and start from the basics today.

Thanks for reading and I hope you all feel more comfortable using Regex in your algorithms!
