LwamB/regular-expressions-regex

 
 


Regular Expressions in R and Python

What is a regular expression?

A regular expression (regex) is neither a library nor a programming language. It is a sequence of characters that specifies a search pattern to match against text (a string).

A string can contain almost anything: letters, digits, whitespace and special characters. As long as the part you are after follows a consistent pattern, a regex can describe that pattern and return the matching part of the string.
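As a minimal sketch of that idea in Python, the snippet below uses the standard `re` module to pull a date out of a sentence. The sample text and the date pattern are assumptions made up for illustration.

```python
import re

# Hypothetical sample text, invented for this illustration.
text = "Order #4821 shipped on 2023-05-17 to Berlin."

# The pattern describes a shape: four digits, a dash, two digits, a dash, two digits.
match = re.search(r"\d{4}-\d{2}-\d{2}", text)

# re.search returns None when nothing matches, so guard before calling .group().
date = match.group() if match else None
print(date)  # 2023-05-17
```

The order number also contains four digits, but it is not followed by a dash, so the search skips it and returns only the date.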

Basic regex characters

Characters

  • Escape character: \
  • Any character (except a newline, by default): .
  • Digit: \d
  • Not a digit: \D
  • Word character: \w
  • Not a word character: \W
  • Whitespace: \s
  • Not whitespace: \S
  • Word boundary: \b
  • Not a word boundary: \B
  • Beginning of a string: ^
  • End of a string: $
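The character classes above can be tried out with Python's `re` module. This is a small sketch on an invented sample string, not code from the notebooks.

```python
import re

# Hypothetical sample string, invented for this illustration.
text = "Room 42, floor 3."

# \d matches one digit at a time.
digits = re.findall(r"\d", text)             # ['4', '2', '3']

# \D matches anything that is not a digit; removing those leaves only digits.
digits_only = re.sub(r"\D", "", text)        # '423'

# \S+ matches runs of non-whitespace, so punctuation stays attached.
chunks = re.findall(r"\S+", text)            # ['Room', '42,', 'floor', '3.']

# \b\w+\b matches whole words made of word characters only.
words = re.findall(r"\b\w+\b", text)         # ['Room', '42', 'floor', '3']

# ^ anchors to the start of the string, $ to the end.
starts_with_room = bool(re.search(r"^Room", text))   # True

# The backslash escapes the dot so it matches a literal '.'.
ends_with_dot = bool(re.search(r"\.$", text))        # True

# An unescaped dot matches any single character (here, the '2' after '4').
four_plus_any = re.search(r"4.", text).group()       # '42'
```

Raw strings (`r"..."`) keep Python from interpreting the backslashes before the regex engine sees them.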

Groupings

  • Matches any one character listed in the brackets: [ ]
  • Matches any one character not listed in the brackets: [^ ]
  • Either/or (alternation): |
  • Capturing group: ( )
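A short Python sketch of the grouping constructs, again on an invented sample string:

```python
import re

# Hypothetical sample string, invented for this illustration.
text = "gray cat, grey cat, grOy hat"

# [ae] matches exactly one character that is 'a' or 'e'.
in_brackets = re.findall(r"gr[ae]y", text)       # ['gray', 'grey']

# [^ae] matches exactly one character that is neither 'a' nor 'e'.
not_in_brackets = re.findall(r"gr[^ae]y", text)  # ['grOy']

# | matches either the alternative on its left or the one on its right.
either_or = re.findall(r"cat|hat", text)         # ['cat', 'cat', 'hat']

# With a capturing group, findall returns only the captured part,
# so ' cat' is required for a match but is not included in the result.
captured = re.findall(r"(gr.y) cat", text)       # ['gray', 'grey']
```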

Quantifiers

  • 0 or more: *
  • 1 or more: +
  • 0 or 1: ?
  • An exact number of repetitions: {n}
  • A range of repetitions: {minimum,maximum} (written without a space after the comma)
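To see how the quantifiers differ, the sketch below applies each one to the letter `b` and checks which of a few made-up sample strings match in full. The helper `full_matches` is a hypothetical name introduced only for this example.

```python
import re

# Hypothetical sample strings with zero to three 'b' characters.
samples = ["ac", "abc", "abbc", "abbbc"]

def full_matches(pattern):
    """Return the samples that the pattern matches from start to end."""
    return [s for s in samples if re.fullmatch(pattern, s)]

zero_or_more = full_matches(r"ab*c")      # ['ac', 'abc', 'abbc', 'abbbc']
one_or_more = full_matches(r"ab+c")       # ['abc', 'abbc', 'abbbc']
zero_or_one = full_matches(r"ab?c")       # ['ac', 'abc']
exactly_two = full_matches(r"ab{2}c")     # ['abbc']
two_to_three = full_matches(r"ab{2,3}c")  # ['abbc', 'abbbc']
```

A quantifier applies to the single element just before it (here the `b`), not to the whole pattern.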

Regex examples

  • Phone numbers
  • Dates
  • Names
  • URLs
  • Email addresses
  • Addresses
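As a rough sketch of what some of these examples might look like in Python, the patterns below are deliberately simplified assumptions and are not taken from the notebooks. They assume US-style dashed phone numbers and ISO-style dates, and the sample text is invented. Names and postal addresses are left out because their formats vary too much for a short pattern.

```python
import re

# Hypothetical sample text, invented for this illustration.
text = (
    "Call Jane Doe at 555-123-4567 before 2021-03-15, "
    "or email jane.doe@example.com. Details: https://example.com/info"
)

# Simplified patterns (assumptions); real-world data needs looser or stricter rules.
PHONE = r"\b\d{3}-\d{3}-\d{4}\b"            # 3 digits, 3 digits, 4 digits
DATE = r"\b\d{4}-\d{2}-\d{2}\b"             # YYYY-MM-DD shape, no range checks
EMAIL = r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"     # local part, '@', dotted domain
URL = r"https?://[^\s,]+"                   # scheme, then run up to space or comma

phones = re.findall(PHONE, text)   # ['555-123-4567']
dates = re.findall(DATE, text)     # ['2021-03-15']
emails = re.findall(EMAIL, text)   # ['jane.doe@example.com']
urls = re.findall(URL, text)       # ['https://example.com/info']
```

The `(?: )` in the email pattern is a non-capturing group, so `findall` returns the whole match rather than only the grouped part. These patterns check shape only: `2021-99-99` would still count as a date.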

Medium article

Link to full write-up on Towards Data Science here.

Additional resources
