Module 25: Regular Expressions

Regular Expressions (RegEx) are powerful patterns used for matching, searching, and manipulating text in JavaScript.


1. Creating Regular Expressions

1.1 Literal Notation

// Basic pattern
const pattern1 = /hello/;

// With flags
const pattern2 = /hello/i; // Case-insensitive
const pattern3 = /hello/g; // Global search
const pattern4 = /hello/gi; // Both flags

1.2 Constructor Notation

// String pattern
const pattern1 = new RegExp('hello');

// With flags
const pattern2 = new RegExp('hello', 'i');
const pattern3 = new RegExp('hello', 'gi');

// Dynamic patterns
const searchTerm = 'world';
const pattern4 = new RegExp(searchTerm, 'i');
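One detail of the constructor form is worth a short illustration: the pattern is an ordinary string, so every regex backslash has to be escaped once more for the string literal. The sketch below compares both notations; the variable names are illustrative only.

```javascript
// The same pattern written as a literal and as a constructor string.
const literalDigits = /\d+/;
const constructedDigits = new RegExp('\\d+'); // '\\d' in a string is the two characters \d

// With a single backslash, the string '\d+' is just 'd+' before RegExp ever sees it.
const accidental = new RegExp('\d+');

console.log(literalDigits.source === constructedDigits.source); // true
console.log(constructedDigits.test('42')); // true
console.log(accidental.test('42'));        // false
console.log(accidental.test('add'));       // true (it matches the letter d)
```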

2. RegEx Flags

2.1 Common Flags

// i - Case-insensitive
const pattern1 = /hello/i;
console.log(pattern1.test('HELLO')); // true
console.log(pattern1.test('Hello')); // true

// g - Global (find all matches)
const text = 'cat dog cat bird cat';
const pattern2 = /cat/g;
console.log(text.match(pattern2)); // ['cat', 'cat', 'cat']

// m - Multiline (^ and $ match line boundaries)
const multiline = 'line1\nline2\nline3';
const pattern3 = /^line/gm;
console.log(multiline.match(pattern3)); // ['line', 'line', 'line']

// s - Dotall (. matches newlines)
const pattern4 = /a.b/s;
console.log(pattern4.test('a\nb')); // true

// u - Unicode
const pattern5 = /\u{1F600}/u; // 😀
console.log(pattern5.test('😀')); // true

// y - Sticky (matches from lastIndex)
const pattern6 = /hello/y;
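The sticky flag is easier to follow with lastIndex made visible. This is a minimal sketch; the input string and variable names are made up for the example.

```javascript
const sticky = /hello/y;
const input = 'hello hello';

const first = sticky.test(input);      // matches at index 0
const afterFirst = sticky.lastIndex;   // 5, just past the first match

const second = sticky.test(input);     // index 5 is a space, so no match there
const afterSecond = sticky.lastIndex;  // a failed match resets lastIndex to 0

sticky.lastIndex = 6;
const third = sticky.test(input);      // matches because 'hello' starts exactly at index 6

console.log(first, afterFirst);   // true 5
console.log(second, afterSecond); // false 0
console.log(third);               // true
```

Unlike the g flag, a sticky pattern never scans ahead: it either matches at lastIndex or fails.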

3. Character Classes

3.1 Basic Classes

// . - Any character (except newline)
/a.c/.test('abc'); // true
/a.c/.test('a c'); // true
/a.c/.test('ac'); // false

// \d - Digit [0-9]
/\d/.test('5'); // true
/\d/.test('a'); // false

// \D - Non-digit [^0-9]
/\D/.test('a'); // true
/\D/.test('5'); // false

// \w - Word character [a-zA-Z0-9_]
/\w/.test('a'); // true
/\w/.test('5'); // true
/\w/.test('_'); // true
/\w/.test(' '); // false

// \W - Non-word character
/\W/.test(' '); // true
/\W/.test('a'); // false

// \s - Whitespace (space, \t, \n, \r, \f, \v, and other Unicode whitespace)
/\s/.test(' '); // true
/\s/.test('\n'); // true
/\s/.test('a'); // false

// \S - Non-whitespace
/\S/.test('a'); // true
/\S/.test(' '); // false

3.2 Custom Character Classes

// [abc] - Match a, b, or c
/[abc]/.test('a'); // true
/[abc]/.test('b'); // true
/[abc]/.test('d'); // false

// [^abc] - Not a, b, or c
/[^abc]/.test('d'); // true
/[^abc]/.test('a'); // false

// [a-z] - Range
/[a-z]/.test('m'); // true
/[A-Z]/.test('M'); // true
/[0-9]/.test('5'); // true

// [a-zA-Z0-9] - Multiple ranges
/[a-zA-Z0-9]/.test('5'); // true
/[a-zA-Z0-9]/.test('M'); // true

4. Quantifiers

4.1 Basic Quantifiers

// * - 0 or more
/ab*c/.test('ac'); // true (0 b's)
/ab*c/.test('abc'); // true (1 b)
/ab*c/.test('abbc'); // true (2 b's)

// + - 1 or more
/ab+c/.test('ac'); // false (0 b's)
/ab+c/.test('abc'); // true (1 b)
/ab+c/.test('abbc'); // true (2 b's)

// ? - 0 or 1
/ab?c/.test('ac'); // true (0 b's)
/ab?c/.test('abc'); // true (1 b)
/ab?c/.test('abbc'); // false (2 b's)

// {n} - Exactly n
/a{3}/.test('aa'); // false
/a{3}/.test('aaa'); // true
/a{3}/.test('aaaa'); // true (contains aaa)

// {n,} - n or more
/a{2,}/.test('a'); // false
/a{2,}/.test('aa'); // true
/a{2,}/.test('aaa'); // true

// {n,m} - Between n and m
/a{2,4}/.test('a'); // false
/a{2,4}/.test('aa'); // true
/a{2,4}/.test('aaa'); // true
/a{2,4}/.test('aaaaa'); // true (contains 2-4 a's)

4.2 Greedy vs Non-Greedy

const text = '<div>Hello</div>';

// Greedy (default) - matches as much as possible
const greedy = /<.*>/;
console.log(text.match(greedy));
// ['<div>Hello</div>']

// Non-greedy - matches as little as possible
const nonGreedy = /<.*?>/;
console.log(text.match(nonGreedy));
// ['<div>']
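A negated character class is a common alternative to the non-greedy quantifier for this kind of delimiter matching. The sketch below compares the two on the same input; the second sample string is an assumption added to show where they differ.

```javascript
const html = '<div>Hello</div>';

// Both approaches stop at the first closing '>'.
const lazyTags = html.match(/<.*?>/g);
const negatedTags = html.match(/<[^>]*>/g);

console.log(lazyTags);    // ['<div>', '</div>']
console.log(negatedTags); // ['<div>', '</div>']

// They differ when a tag spans lines: '.' skips newlines unless the s flag is set,
// while [^>] matches any character other than '>'.
const multilineTag = '<a\nhref="#">';
const lazyHit = /<.*?>/.test(multilineTag);
const negatedHit = /<[^>]*>/.test(multilineTag);

console.log(lazyHit);    // false
console.log(negatedHit); // true
```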

5. Anchors

5.1 Position Anchors

// ^ - Start of string
/^hello/.test('hello world'); // true
/^hello/.test('say hello'); // false

// $ - End of string
/world$/.test('hello world'); // true
/world$/.test('world hello'); // false

// Combined
/^hello$/.test('hello'); // true (exact match)
/^hello$/.test('hello world'); // false

// \b - Word boundary
/\bhello\b/.test('hello world'); // true
/\bhello\b/.test('helloworld'); // false

// \B - Non-word boundary
/\Bhello\B/.test('helloworld'); // false
/\Bhello\B/.test('shelloworld'); // true

6. Groups and Capturing

6.1 Capturing Groups

// Basic capturing group
const pattern = /(\d{3})-(\d{3})-(\d{4})/;
const phone = '123-456-7890';
const match = phone.match(pattern);

console.log(match[0]); // '123-456-7890' (full match)
console.log(match[1]); // '123' (first group)
console.log(match[2]); // '456' (second group)
console.log(match[3]); // '7890' (third group)

// Named capturing groups
const pattern2 = /(?<area>\d{3})-(?<exchange>\d{3})-(?<number>\d{4})/;
const match2 = phone.match(pattern2);

console.log(match2.groups.area); // '123'
console.log(match2.groups.exchange); // '456'
console.log(match2.groups.number); // '7890'

6.2 Non-Capturing Groups

// (?:...) - Non-capturing group
const pattern = /(?:https?):\/\/(\w+\.\w+)/;
const url = 'https://example.com';
const match = url.match(pattern);

console.log(match[0]); // 'https://example.com'
console.log(match[1]); // 'example.com'
// No match[2] because (?:https?) doesn't capture

6.3 Backreferences

// \1, \2, etc. - Reference captured groups
const pattern = /(\w+)\s\1/; // Match repeated word
console.log(pattern.test('hello hello')); // true
console.log(pattern.test('hello world')); // false

// Named backreferences
const pattern2 = /(?<word>\w+)\s\k<word>/;
console.log(pattern2.test('hello hello')); // true
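Without word boundaries, a repeated-word pattern like the one above can also match inside longer words. The following sketch shows that case and one possible way to tighten the pattern; the sample sentences are invented for illustration.

```javascript
const loose = /(\w+)\s\1/;
const strict = /\b(\w+)\s+\1\b/;

// 'this is' contains 'is is' once you ignore the 'th' prefix.
const looseHit = loose.test('this is fine');
const strictHit = strict.test('this is fine');
const strictRepeat = strict.test('hello hello');

console.log(looseHit);     // true (a false positive)
console.log(strictHit);    // false
console.log(strictRepeat); // true

// Collapse repeated words by keeping only the first captured copy.
const deduped = 'it was was the the best'.replace(/\b(\w+)(\s+\1\b)+/g, '$1');
console.log(deduped); // 'it was the best'
```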

7. Alternation and Lookahead

7.1 Alternation

// | - OR operator
const pattern = /cat|dog|bird/;
console.log(pattern.test('cat')); // true
console.log(pattern.test('dog')); // true
console.log(pattern.test('fish')); // false

// With groups
const pattern2 = /gr(a|e)y/;
console.log(pattern2.test('gray')); // true
console.log(pattern2.test('grey')); // true

7.2 Lookahead

// (?=...) - Positive lookahead
const pattern1 = /\d(?=px)/;
console.log('12px'.match(pattern1)); // ['2'] (the digit directly before 'px')

// (?!...) - Negative lookahead
const pattern2 = /\d(?!px)/g; // g flag needed to collect every match
console.log('12px 34em'.match(pattern2)); // ['1', '3', '4'] ('2' is skipped: it is followed by 'px')

// Password validation with lookahead
const passwordPattern = /^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[@$!%*?&])[A-Za-z\d@$!%*?&]{8,}$/;
console.log(passwordPattern.test('Pass123!')); // true
console.log(passwordPattern.test('password')); // false

7.3 Lookbehind

// (?<=...) - Positive lookbehind
const pattern1 = /(?<=\$)\d+/;
console.log('$100'.match(pattern1)); // ['100']

// (?<!...) - Negative lookbehind
const pattern2 = /(?<!\$)\d+/g; // g flag needed to collect every match
console.log('$100 200'.match(pattern2)); // ['00', '200'] (only the '1' is directly preceded by '$')

8. RegExp and String Methods

8.1 test()

const pattern = /hello/i;

console.log(pattern.test('Hello World')); // true
console.log(pattern.test('Goodbye')); // false

8.2 exec()

const pattern = /(\d{3})-(\d{3})-(\d{4})/;
const text = 'Call me at 123-456-7890';
const match = pattern.exec(text);

console.log(match[0]); // '123-456-7890'
console.log(match[1]); // '123'
console.log(match.index); // 11 (start position)

// Global flag with exec
const pattern2 = /\d+/g;
const text2 = 'a1b2c3';

console.log(pattern2.exec(text2)); // ['1']
console.log(pattern2.exec(text2)); // ['2']
console.log(pattern2.exec(text2)); // ['3']
console.log(pattern2.exec(text2)); // null (no more matches; lastIndex resets to 0)
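Because a global pattern remembers its position in lastIndex, exec() is usually called in a loop, and the same state can surprise you when one pattern object is reused with test(). A small sketch of both, with made-up sample strings:

```javascript
const digits = /\d+/g;
const source = 'a1b22c333';

// Collect every match together with its position.
const found = [];
let m;
while ((m = digits.exec(source)) !== null) {
  found.push({ value: m[0], index: m.index });
}
console.log(found);
// [ { value: '1', index: 1 }, { value: '22', index: 3 }, { value: '333', index: 6 } ]

// Pitfall: a global pattern reused across different strings keeps its lastIndex.
const hasDigit = /\d/g;
const firstCheck = hasDigit.test('a1');  // true, lastIndex is now 2
const secondCheck = hasDigit.test('1a'); // false, the search started at index 2
console.log(firstCheck, secondCheck);    // true false
```

Dropping the g flag, or setting lastIndex to 0 before each call, avoids the second result.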

8.3 match()

const text = 'The rain in Spain';

// Without global flag
const match1 = text.match(/ain/);
console.log(match1); // ['ain', index: 5, ...]

// With global flag
const match2 = text.match(/ain/g);
console.log(match2); // ['ain', 'ain']

// Named groups
const pattern = /(?<word>\w+)/;
const match3 = 'hello'.match(pattern);
console.log(match3.groups.word); // 'hello'

8.4 matchAll()

const text = 'test1 test2 test3';
const pattern = /test(\d)/g;

// Returns iterator
const matches = text.matchAll(pattern);

for (const match of matches) {
  console.log(match[0], match[1]);
}
// test1 1
// test2 2
// test3 3

// Convert to array
const matchesArray = [...text.matchAll(pattern)];
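matchAll() pairs well with named groups when each match should become a small record. The log line and field names below are hypothetical, chosen only for the example.

```javascript
// Hypothetical log line, made up for this example.
const logLine = 'GET /home 200, POST /login 401';
const entryPattern = /(?<method>[A-Z]+) (?<path>\/\S*) (?<status>\d{3})/g;

// Array.from accepts the iterator plus a mapping function.
const entries = Array.from(logLine.matchAll(entryPattern), (m) => ({
  method: m.groups.method,
  path: m.groups.path,
  status: Number(m.groups.status),
  index: m.index,
}));

console.log(entries);
// [ { method: 'GET', path: '/home', status: 200, index: 0 },
//   { method: 'POST', path: '/login', status: 401, index: 15 } ]
```

Note that matchAll() throws a TypeError if the pattern lacks the g flag.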

8.5 replace()

const text = 'Hello World';

// Simple replacement
console.log(text.replace(/World/, 'JavaScript'));
// 'Hello JavaScript'

// With captured groups
const phone = '123-456-7890';
const formatted = phone.replace(/(\d{3})-(\d{3})-(\d{4})/, '($1) $2-$3');
console.log(formatted); // '(123) 456-7890'

// With function
const result = text.replace(/\w+/g, match => match.toUpperCase());
console.log(result); // 'HELLO WORLD'

// Named groups
const text2 = 'John Doe';
const reversed = text2.replace(/(?<first>\w+) (?<last>\w+)/, '$<last>, $<first>');
console.log(reversed); // 'Doe, John'
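The function form of replace() also receives the captured groups as arguments, which is enough for a small template filler. This is a sketch only: the {{name}} placeholder syntax and the fillTemplate helper are assumptions for the example, not part of any library.

```javascript
// Replace {{key}} placeholders with values; unknown keys are left untouched.
function fillTemplate(template, values) {
  return template.replace(/\{\{\s*(\w+)\s*\}\}/g, (whole, key) =>
    Object.prototype.hasOwnProperty.call(values, key) ? String(values[key]) : whole
  );
}

const filled = fillTemplate('Hi {{ name }}, you have {{count}} new {{thing}}', {
  name: 'Ada',
  count: 3,
});
console.log(filled); // 'Hi Ada, you have 3 new {{thing}}'
```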

8.6 replaceAll()

const text = 'cat dog cat bird cat';

// Replace all occurrences
const result = text.replaceAll('cat', 'animal');
console.log(result); // 'animal dog animal bird animal'

// With regex (must have g flag)
const result2 = text.replaceAll(/cat/g, 'animal');
console.log(result2); // 'animal dog animal bird animal'

8.7 search()

const text = 'Hello World';

// Returns index of first match
console.log(text.search(/World/)); // 6
console.log(text.search(/xyz/)); // -1 (not found)

// Case-insensitive
console.log(text.search(/world/i)); // 6

8.8 split()

const text = 'apple,banana;orange|grape';

// Split by comma
console.log(text.split(','));
// ['apple', 'banana;orange|grape']

// Split by multiple delimiters
console.log(text.split(/[,;|]/));
// ['apple', 'banana', 'orange', 'grape']

// With limit
console.log(text.split(/[,;|]/, 2));
// ['apple', 'banana']

// Capture groups included in result
const text2 = 'a1b2c3';
console.log(text2.split(/(\d)/));
// ['a', '1', 'b', '2', 'c', '3', '']

9. Common Patterns

9.1 Email Validation

const emailPattern = /^[^\s@]+@[^\s@]+\.[^\s@]+$/;

console.log(emailPattern.test('user@example.com')); // true
console.log(emailPattern.test('invalid.email')); // false
console.log(emailPattern.test('user@domain')); // false

// More comprehensive
const emailPattern2 = /^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$/;

9.2 URL Validation

const urlPattern = /^https?:\/\/(www\.)?[-a-zA-Z0-9@:%._\+~#=]{1,256}\.[a-zA-Z0-9()]{1,6}\b([-a-zA-Z0-9()@:%_\+.~#?&//=]*)$/;

console.log(urlPattern.test('https://example.com')); // true
console.log(urlPattern.test('http://www.example.com/path')); // true
console.log(urlPattern.test('not a url')); // false

9.3 Phone Number

// US phone format
const phonePattern = /^\(?([0-9]{3})\)?[-. ]?([0-9]{3})[-. ]?([0-9]{4})$/;

console.log(phonePattern.test('123-456-7890')); // true
console.log(phonePattern.test('(123) 456-7890')); // true
console.log(phonePattern.test('1234567890')); // true

9.4 Password Strength

// At least 8 chars, 1 uppercase, 1 lowercase, 1 number, 1 special char
const passwordPattern = /^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[@$!%*?&])[A-Za-z\d@$!%*?&]{8,}$/;

console.log(passwordPattern.test('Pass123!')); // true
console.log(passwordPattern.test('password')); // false
console.log(passwordPattern.test('Pass123')); // false (no special)

9.5 Date Formats

// MM/DD/YYYY
const datePattern1 = /^(0[1-9]|1[0-2])\/(0[1-9]|[12][0-9]|3[01])\/\d{4}$/;
console.log(datePattern1.test('12/31/2023')); // true

// YYYY-MM-DD
const datePattern2 = /^\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12][0-9]|3[01])$/;
console.log(datePattern2.test('2023-12-31')); // true
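The date patterns above check the shape of the string only, so they accept impossible dates such as 2023-02-31. One possible follow-up step, sketched here with a hypothetical isRealDate helper, is to pass the captured parts to Date and check that nothing rolled over.

```javascript
const isoDate = /^(\d{4})-(0[1-9]|1[0-2])-(0[1-9]|[12][0-9]|3[01])$/;

// Hypothetical helper: true only if the string has the right shape AND names a real day.
function isRealDate(text) {
  const m = isoDate.exec(text);
  if (m === null) return false;
  const [year, month, day] = m.slice(1).map(Number);
  const d = new Date(0);
  d.setUTCFullYear(year, month - 1, day); // out-of-range days roll into the next month
  return (
    d.getUTCFullYear() === year &&
    d.getUTCMonth() === month - 1 &&
    d.getUTCDate() === day
  );
}

console.log(isoDate.test('2023-02-31'));  // true (shape only)
console.log(isRealDate('2023-02-31'));    // false
console.log(isRealDate('2024-02-29'));    // true (leap year)
console.log(isRealDate('2023-02-29'));    // false
```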

9.6 Credit Card

// Basic credit card (13-19 digits)
const ccPattern = /^\d{13,19}$/;

// Visa
const visaPattern = /^4[0-9]{12}(?:[0-9]{3})?$/;

// Mastercard
const mastercardPattern = /^5[1-5][0-9]{14}$/;

// With spaces/dashes
const ccWithSeparators = /^[\d\s-]{13,19}$/;
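These patterns only confirm that the input looks like a card number. A common companion check, not covered by the patterns themselves, is the Luhn checksum. The sketch below strips separators with a regex first; passesLuhn is a hypothetical helper name, and 4111 1111 1111 1111 is a widely used test number.

```javascript
// Hypothetical helper: strip spaces/dashes, check the length, then apply the Luhn checksum.
function passesLuhn(input) {
  const digits = input.replace(/[\s-]/g, '');
  if (!/^\d{13,19}$/.test(digits)) return false;

  let sum = 0;
  // Walk right to left, doubling every second digit.
  for (let i = 0; i < digits.length; i++) {
    let d = Number(digits[digits.length - 1 - i]);
    if (i % 2 === 1) {
      d *= 2;
      if (d > 9) d -= 9;
    }
    sum += d;
  }
  return sum % 10 === 0;
}

console.log(passesLuhn('4111 1111 1111 1111')); // true
console.log(passesLuhn('4111-1111-1111-1112')); // false
console.log(passesLuhn('4111'));                // false (too short)
```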

9.7 Username

// 3-16 chars, alphanumeric and underscore
const usernamePattern = /^[a-zA-Z0-9_]{3,16}$/;

console.log(usernamePattern.test('john_doe')); // true
console.log(usernamePattern.test('ab')); // false (too short)
console.log(usernamePattern.test('user@name')); // false (invalid char)

9.8 Hex Color

const hexPattern = /^#?([a-f0-9]{6}|[a-f0-9]{3})$/i;

console.log(hexPattern.test('#ff0000')); // true
console.log(hexPattern.test('#f00')); // true
console.log(hexPattern.test('ff0000')); // true
console.log(hexPattern.test('#gg0000')); // false

10. Advanced Techniques

10.1 Escaping Special Characters

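// Special chars: . * + ? ^ $ { } ( ) | [ ] \
// (a / only needs escaping inside a regex literal, not in a RegExp constructor string)

function escapeRegExp(string) {
  // $& in the replacement inserts the matched character, so each one gains a leading backslash
  return string.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}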

const userInput = 'How much? $5.00';
const escaped = escapeRegExp(userInput);
const pattern = new RegExp(escaped);

console.log(pattern.test('How much? $5.00')); // true
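To see why the escaping step matters, compare an unescaped pattern with an escaped one, then reuse the same helper in a small search-and-highlight function. The highlight helper and its square-bracket markers are assumptions made for this sketch.

```javascript
function escapeRegExp(string) {
  return string.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Unescaped, '$' is an end-of-input anchor, so this pattern can never match.
const unescapedHit = new RegExp('$5.00').test('Price: $5.00');
const escapedHit = new RegExp(escapeRegExp('$5.00')).test('Price: $5.00');
console.log(unescapedHit, escapedHit); // false true

// Hypothetical helper: wrap every case-insensitive occurrence of term in brackets.
function highlight(text, term) {
  if (term === '') return text;
  const pattern = new RegExp(escapeRegExp(term), 'gi');
  // A function replacement avoids '$' being treated specially in the replacement string.
  return text.replace(pattern, (match) => `[${match}]`);
}

const marked = highlight('Price: $5.00 (was $5.00?)', '$5.00');
console.log(marked); // 'Price: [$5.00] (was [$5.00]?)'
```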

10.2 Unicode Property Escapes

// Match specific Unicode categories
const emojiPattern = /\p{Emoji}/u;
console.log(emojiPattern.test('😀')); // true

// Match specific scripts
const greekPattern = /\p{Script=Greek}/u;
console.log(greekPattern.test('α')); // true
console.log(greekPattern.test('a')); // false

10.3 Conditional Patterns

// Different patterns based on condition
function validateInput(input, type) {
  const patterns = {
    email: /^[^\s@]+@[^\s@]+\.[^\s@]+$/,
    phone: /^\d{3}-\d{3}-\d{4}$/,
    zip: /^\d{5}(-\d{4})?$/
  };

  return patterns[type]?.test(input) ?? false;
}

console.log(validateInput('user@example.com', 'email')); // true
console.log(validateInput('123-456-7890', 'phone')); // true

11. Best Practices

11.1 Performance

// ❌ Avoid catastrophic backtracking
const bad = /^(a+)+$/;
// On 'aaaaaaaaaaaaaaaaaaX' the match fails only after exponentially many backtracking steps

// ✅ Remove the nested quantifier (JavaScript has no possessive quantifiers or atomic groups)
const good = /^a+$/;

11.2 Readability

// ❌ Hard to read
const pattern1 = /^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[@$!%*?&])[A-Za-z\d@$!%*?&]{8,}$/;

// ✅ Break the pattern into named parts
// (JavaScript has no x flag for comments inside a pattern)
const hasLowercase = /(?=.*[a-z])/;
const hasUppercase = /(?=.*[A-Z])/;
const hasDigit = /(?=.*\d)/;
const hasSpecial = /(?=.*[@$!%*?&])/;
const minLength = /^.{8,}$/;
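One way to put separate checks like these to use is to run each rule on its own and report which ones fail. The sketch below is an assumption about how that might look: the passwordProblems helper and the rule labels are invented, and because each rule runs separately the lookahead wrappers are dropped. Unlike the combined pattern, it does not restrict which other characters are allowed.

```javascript
// Each rule pairs a human-readable label with a simple pattern.
const rules = [
  ['a lowercase letter', /[a-z]/],
  ['an uppercase letter', /[A-Z]/],
  ['a digit', /\d/],
  ['a special character', /[@$!%*?&]/],
  ['at least 8 characters', /^.{8,}$/],
];

// Hypothetical helper: returns the labels of every rule the password fails.
function passwordProblems(password) {
  return rules
    .filter(([, pattern]) => !pattern.test(password))
    .map(([label]) => label);
}

console.log(passwordProblems('Pass123!')); // []
console.log(passwordProblems('password'));
// ['an uppercase letter', 'a digit', 'a special character']
console.log(passwordProblems('Ab1!')); // ['at least 8 characters']
```

Returning the failed labels makes it possible to tell the user exactly what is missing, which a single combined pattern cannot do.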

11.3 Reusability

// Create reusable patterns
const patterns = {
  email: /^[^\s@]+@[^\s@]+\.[^\s@]+$/,
  phone: /^\d{3}-\d{3}-\d{4}$/,
  url: /^https?:\/\/.+/,

  validate(input, type) {
    return this[type]?.test(input) ?? false;
  }
};

console.log(patterns.validate('user@example.com', 'email')); // true

Common Pitfalls

  • Forgetting to escape special characters
  • Using greedy quantifiers when non-greedy is needed
  • Not anchoring patterns (^ and $)
  • Catastrophic backtracking with nested quantifiers
  • Forgetting the g flag for global replacements
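Three of these pitfalls can be reproduced in a few lines. The sample strings below are made up for the demonstration.

```javascript
// Forgetting the g flag: only the first occurrence is replaced.
const sentence = 'cat dog cat';
const firstOnly = sentence.replace(/cat/, 'bird');
const everyOne = sentence.replace(/cat/g, 'bird');
console.log(firstOnly); // 'bird dog cat'
console.log(everyOne);  // 'bird dog bird'

// Not anchoring: the pattern matches anywhere inside the input.
const unanchored = /\d{5}/.test('abc123456');
const anchored = /^\d{5}$/.test('abc123456');
console.log(unanchored, anchored); // true false

// Forgetting to escape: an unescaped dot matches any character.
const dotUnescaped = /1.5/.test('125');
const dotEscaped = /1\.5/.test('125');
console.log(dotUnescaped, dotEscaped); // true false
```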

Summary

In this module, you learned:

  • ✅ Creating regular expressions with literals and constructors
  • ✅ RegEx flags: i, g, m, s, u, y
  • ✅ Character classes and quantifiers
  • ✅ Anchors and boundaries
  • ✅ Groups, capturing, and backreferences
  • ✅ Lookahead and lookbehind assertions
  • ✅ RegExp and string methods: test, exec, match, matchAll, replace, replaceAll, search, split
  • ✅ Common patterns: email, URL, phone, password
  • ✅ Advanced techniques and best practices

Congratulations!

You've completed the JavaScript tutorial series! You now have a solid foundation in JavaScript, from basic syntax to advanced topics like functional programming, async patterns, and regular expressions.


Practice Exercises

  1. Create email and password validators
  2. Build a URL parser using regex
  3. Implement a syntax highlighter for code
  4. Create a template string replacer
  5. Build a phone number formatter
  6. Implement credit card validation
  7. Create a search and highlight function
  8. Build a markdown parser (basic)

Additional Resources