Python

Introduction to regular expressions


Regular expressions are a powerful tool for many kinds of string manipulation. They are a domain-specific language (DSL) that is available as a library in most modern programming languages, not just Python. A DSL is a highly specialized language built for one kind of task: regular expressions are one example, and SQL, which is mainly used for querying and manipulating data, is another.

They are useful for two main tasks:

1. Verifying that a string matches a pattern (for example, that a string has the format of an email address).

2. Performing substitutions in a string (such as changing all American spellings to British ones).

Regular expressions in Python are accessed through the re module, which is part of the standard library.

The re.match function determines whether a pattern matches at the beginning of a string. If it does, match returns a match object; if not, it returns None. To avoid confusion when working with regular expressions, we write patterns as raw strings, such as r"expression". In a raw string, Python does not treat the backslash as an escape character, so sequences like \d or \b reach the re module unchanged, which makes patterns easier to write.
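
As a minimal sketch of the two points above, the snippet below compares a normal string with a raw string and then shows that re.match only succeeds when the pattern is found at the start of the string. The sample strings and variable names are made up for illustration.

```python
import re

# In a normal string literal, "\b" is the backspace character.
# In a raw string, r"\b" stays as a backslash followed by "b",
# which the re module reads as a word boundary.
normal = "\bspam"
raw = r"\bspam"

print(len(normal))  # 5: backspace + "spam"
print(len(raw))     # 6: backslash + "b" + "spam"

# re.match only tries the pattern at position 0 of the string.
at_start = re.match(raw, "spam and eggs")
not_at_start = re.match(raw, "eggs and spam")

print(at_start is not None)  # True
print(not_at_start is None)  # True
```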

Other functions

re.search() -> finds the first match of a pattern anywhere in the string

re.findall() -> returns a list of all substrings that match a pattern

re.finditer() -> does the same thing as re.findall(), except that it returns an iterator of match objects rather than a list of strings

In the example below, re.match() does not find the pattern because it only looks at the beginning of the string, while re.search() finds it.

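import re

pattern = r"spam"
text = "eggspam sausagespam"

# match() only tries the pattern at the start of the string
if re.match(pattern, text):
    print("match")
else:
    print("no match")

# search() scans the whole string for the first match
if re.search(pattern, text):
    print("match")
else:
    print("no match")

print(re.findall(pattern, text))
# finditer() returns an iterator of match objects, so collect the matched text
print([m.group() for m in re.finditer(pattern, text)])

"""
no match
match
['spam', 'spam']
['spam', 'spam']
"""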

 

re.search returns a match object with several methods that give details about the match.

These methods include group, which returns the matched string; start and end, which return the start and end positions of the match; and span, which returns the start and end positions as a tuple.

import re

pattern = r"spam"
match = re.search(pattern, "eggspam sausagespam")
if match:
    print(match.group())
    print(match.start())
    print(match.end())
    print(match.span())
else:
    print("no match")

"""
spam
3
7
(3, 7)
"""

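re.search only reports the first match. As a small sketch, assuming the same sample text, re.finditer can be used to get a match object for every occurrence, each with its own group and span. The variable names here are illustrative.

```python
import re

text = "eggspam sausagespam"

# Each item produced by finditer() is a match object,
# so group() and span() are available for every occurrence.
for m in re.finditer(r"spam", text):
    print(m.group(), m.span())

# Collect only the positions of all matches.
spans = [m.span() for m in re.finditer(r"spam", text)]
print(spans)  # [(3, 7), (15, 19)]
```
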
Search and replace

Syntax: re.sub(pattern, repl, string, count=0)

This method replaces occurrences of pattern in string with repl. All occurrences are replaced unless count is provided, in which case at most count replacements are made. The method returns the modified string.

import re

string = "my name is amrit. i am studying computer engineering"
pattern = r"am studying"
new_str = re.sub(pattern, "had completed", string)
print(new_str)

"""
my name is amrit. i had completed computer engineering
"""

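The effect of the count argument can be sketched with the American-to-British spelling task mentioned at the start. The sample sentence is made up for illustration; count is passed as a keyword argument.

```python
import re

text = "The color of the harbor changed color at dusk."

# With the default count=0 there is no limit: every occurrence is replaced.
all_replaced = re.sub(r"color", "colour", text)

# With count=1 only the first occurrence is replaced.
first_only = re.sub(r"color", "colour", text, count=1)

print(all_replaced)  # The colour of the harbor changed colour at dusk.
print(first_only)    # The colour of the harbor changed color at dusk.
```
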
Regular expression for email validation

pattern = r"^[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+@[a-zA-Z0-9-]+(?:\.[a-zA-Z0-9-]+)*$"
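
The email pattern above can be tried out roughly as follows. This is a sketch, not a complete validator: it assumes the pattern is written as a Python raw string with no surrounding / delimiters, and the helper name looks_like_email is hypothetical. The pattern only checks the general shape of an address, so a string such as a@b (no dot in the domain) still passes.

```python
import re

# The pattern from the article, compiled once for reuse.
EMAIL_PATTERN = re.compile(
    r"^[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+@[a-zA-Z0-9-]+(?:\.[a-zA-Z0-9-]+)*$"
)


def looks_like_email(address):
    """Return True if the whole string has the general shape of an email address."""
    # fullmatch() requires the entire string to match, so a trailing
    # newline after an otherwise valid address is rejected as well.
    return EMAIL_PATTERN.fullmatch(address) is not None


print(looks_like_email("user@example.com"))        # True
print(looks_like_email("no-at-sign.example.com"))  # False
print(looks_like_email("user@"))                   # False
```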


About author


Amrit Panta

Python developer, content writer


