A RegEx, or Regular Expression, is a sequence of characters that forms a search pattern.
RegEx can be used to check if a string contains the specified search pattern.
Python has a built-in module called re, which can be used to work with Regular Expressions.
Import the re module.
# Search the string to see if it starts with "The" and ends with "India":
import re
txt = "The rain in India"
x = bool(re.search("^The.*India$", txt))
print(x)
# prints True
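To see what the ^ and $ anchors contribute, the same pattern can be tried against a few other strings. This is only a minimal sketch: the sample strings and the names samples, pattern and results are made up for illustration.

```python
import re

# Hypothetical sample strings, chosen only to illustrate the anchors.
samples = [
    "The rain in India",        # starts with "The", ends with "India"
    "Then we flew to India",    # "Then" also begins with "The"
    "The rain in India today",  # does not end with "India"
    "See The rain in India",    # does not start with "The"
]

# A raw string (r"...") keeps backslashes and other special characters literal.
pattern = r"^The.*India$"

results = [bool(re.search(pattern, s)) for s in samples]
print(results)
# prints [True, True, False, False]
```

Note that ^The only requires the string to begin with those three characters, which is why "Then we flew to India" also matches.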
RegEx Functions
The re module offers a set of functions that allow us to search a string for a match.
findall() function
The findall() function returns a list containing all matches.
Note: If no matches are found, an empty list is returned.
import re
txt = "The rain in India"
x = re.findall("in", txt)
print(x)
# prints ['in', 'in']
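The note about the empty list can be checked directly. The sketch below assumes the same txt string as above; the search words and variable names are chosen only for illustration.

```python
import re

txt = "The rain in India"

# A pattern that does not occur in txt gives an empty list, not None.
no_match = re.findall(r"Spain", txt)
print(no_match)
# prints []

# An empty list is falsy, so the result can be tested directly.
if not no_match:
    print("No match")
# prints No match

# findall() also works with richer patterns:
# every word that starts with an uppercase letter.
capitalized = re.findall(r"[A-Z]\w*", txt)
print(capitalized)
# prints ['The', 'India']
```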
search() function
The search() function searches the string for a match, and returns a Match object if there is a match.
If there is more than one match, only the first occurrence of the match will be returned.
Note: If no matches are found, the value None is returned.
# Search for the first white-space character in the string
import re
txt = "The rain in India"
x = re.search(r"\s", txt)
print("The first white-space character is located in position:", x.start())
# prints The first white-space character is located in position: 3
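Because search() returns None when nothing matches, calling a method such as start() on the result without checking it first raises an error. A minimal sketch of a safer pattern, assuming the same txt string (the search words are arbitrary):

```python
import re

txt = "The rain in India"

# Only the first occurrence is returned as a Match object.
match = re.search(r"ai", txt)
if match is not None:
    print(match.group())  # the text that matched
    print(match.span())   # (start, end) positions of the match
# prints ai
# prints (5, 7)

# No match: search() returns None, so test before using the result.
missing = re.search(r"Portugal", txt)
print(missing)
# prints None
```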
split() function
The split() function returns a list where the string has been split at each match.
# Split the string at every white-space character:
import re
txt = "The rain in India"
x = re.split(r"\s", txt)
print(x)
# prints ['The', 'rain', 'in', 'India']
You can limit the number of splits by specifying the maxsplit parameter.
# Split the string only at the first occurrence
import re
txt = "The rain in India"
x = re.split(r"\s", txt, maxsplit=1)
print(x)
# prints ['The', 'rain in India']
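When the input contains runs of mixed white-space, splitting at every single white-space character produces empty strings. The sketch below uses a hypothetical string, messy, to show the difference between \s and \s+ and how maxsplit combines with either.

```python
import re

# Hypothetical input: a space followed by a tab, then a space, then a newline.
messy = "The \train in\nIndia"

# Splitting at each single white-space character leaves an empty string
# between the adjacent space and tab.
print(re.split(r"\s", messy))
# prints ['The', '', 'rain', 'in', 'India']

# \s+ treats a run of white-space as one separator.
words = re.split(r"\s+", messy)
print(words)
# prints ['The', 'rain', 'in', 'India']

# maxsplit stops after the given number of splits; the rest stays intact.
limited = re.split(r"\s+", messy, maxsplit=2)
print(limited)
# prints ['The', 'rain', 'in\nIndia']
```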
sub() function
The sub() function replaces the matches with the text of your choice.
# Replace every white-space character with the number 9:
import re
txt = "The rain in India"
x = re.sub(r"\s", "9", txt)
print(x)
# prints The9rain9in9India
You can control the number of replacements by specifying the count parameter.
# Replace only the first 2 white-space characters:
import re
txt = "The rain in India"
x = re.sub(r"\s", "9", txt, count=2)
print(x)
# prints The9rain9in India
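If you also need to know how many replacements were made, the re module provides subn(), which works like sub() but returns the new string together with a count. A minimal sketch, assuming the same txt string (the variable names result and n are illustrative):

```python
import re

txt = "The rain in India"

# subn() returns a tuple: (new string, number of replacements made).
result, n = re.subn(r"\s", "9", txt)
print(result, n)
# prints The9rain9in9India 3

# A count larger than the number of matches simply replaces all of them.
capped = re.sub(r"\s", "9", txt, count=10)
print(capped)
# prints The9rain9in9India
```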