Python RegEx


A RegEx, or Regular Expression, is a sequence of characters that forms a search pattern.

RegEx can be used to check whether a string contains the specified search pattern.

RegEx Module

Python has a built-in package called re, which can be used to work with regular expressions.

Import the re module:

import re

RegEx in Python

Once you have imported the re module, you can start using regular expressions:

Example

Search the string to see if it starts with "The" and ends with "Spain":

import re

txt = "The rain in Spain"
# ^ anchors the pattern to the start of the string, $ to the end
x = re.search(r"^The.*Spain$", txt)
print(x is not None)

RegEx Functions

The re module offers a set of functions that allow us to search a string for a match:

Function	Description
findall	Returns a list containing all matches
search	Returns a Match object if there is a match anywhere in the string
split	Returns a list where the string has been split at each match
sub	Replaces one or many matches with a string
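
As a rough side-by-side sketch, the four functions can be run on the same sample sentence. The variable names below are illustrative only and are not part of the re module:

```python
import re

txt = "The rain in Spain"

# findall: every non-overlapping match, returned as a list of strings
all_matches = re.findall("ai", txt)

# search: a Match object for the first match, or None if there is none
first = re.search("ai", txt)

# split: the pieces of the string between the matches
parts = re.split(" ", txt)

# sub: a new string with each match replaced
replaced = re.sub(" ", "_", txt)

print(all_matches)    # ['ai', 'ai']
print(first.start())  # 5
print(parts)          # ['The', 'rain', 'in', 'Spain']
print(replaced)       # The_rain_in_Spain
```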

Metacharacters

Metacharacters are characters with a special meaning:

Character	Description	Example
[]	A set of characters	"[a-m]"
\	Signals a special sequence (can also be used to escape special characters)	"\d"
.	Any character (except newline character)	"he..o"
^	Starts with	"^hello"
$	Ends with	"planet$"
*	Zero or more occurrences	"he.*o"
+	One or more occurrences	"he.+o"
?	Zero or one occurrences	"he.?o"
{}	Exactly the specified number of occurrences	"he{2}o"
|	Either or	"falls|stays"
()	Capture and group
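
To see how a few of these metacharacters behave, the sketch below tests some of the example patterns from the table against short strings. The test strings and the names samples and results are made up for illustration:

```python
import re

# pattern -> sample text to test it against (sample texts are hypothetical)
samples = {
    "he..o": "hello",                 # . matches any single character
    "^hello": "hello world",          # ^ requires the match at the start
    "planet$": "hello planet",        # $ requires the match at the end
    "he.*o": "heo",                   # * allows zero characters in between
    "he.+o": "heo",                   # + needs at least one character in between
    "he.?o": "helo",                  # ? allows zero or one character
    "he{2}o": "heeo",                 # {2} means exactly two "e"
    "falls|stays": "The rain stays",  # | matches either alternative
}

# True if the pattern is found somewhere in the sample text
results = {p: re.search(p, text) is not None for p, text in samples.items()}

for pattern, found in results.items():
    print(pattern, found)
```

Only "he.+o" prints False here, because "heo" has nothing between "he" and "o".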

Special Sequences

A special sequence is a \ followed by one of the characters in the list below, and has a special meaning:

Character	Description	Example
\A	Returns a match if the specified characters are at the beginning of the string	"\AThe"
\b	Returns a match where the specified characters are at the beginning or at the end of a word (the "r" in the beginning is making sure that the string is being treated as a "raw string")	r"\bain" r"ain\b"
\B	Returns a match where the specified characters are present, but NOT at the beginning (or at the end) of a word (the "r" in the beginning is making sure that the string is being treated as a "raw string")	r"\Bain" r"ain\B"
\d	Returns a match where the string contains digits (numbers from 0-9)	"\d"
\D	Returns a match where the string DOES NOT contain digits	"\D"
\s	Returns a match where the string contains a white space character	"\s"
\S	Returns a match where the string DOES NOT contain a white space character	"\S"
\w	Returns a match where the string contains any word characters (characters from a to z and A to Z, digits from 0-9, and the underscore _ character)	"\w"
\W	Returns a match where the string DOES NOT contain any word characters	"\W"
\Z	Returns a match if the specified characters are at the end of the string	"Spain\Z"
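
The following sketch tries several of these sequences on one sentence. The sentence and the variable names are assumptions chosen for the demonstration. Raw strings (the r prefix) are used so that Python passes the backslashes to re unchanged:

```python
import re

txt = "The rain in Spain, 1999"

digits = re.findall(r"\d", txt)           # every digit
word_start = re.findall(r"\bain", txt)    # "ain" at the start of a word
word_end = re.findall(r"ain\b", txt)      # "ain" at the end of a word
inside = re.findall(r"\Bain", txt)        # "ain" NOT at the start of a word
non_word = re.findall(r"\W", txt)         # spaces and punctuation
at_start = re.search(r"\AThe", txt) is not None
at_end = re.search(r"1999\Z", txt) is not None

print(digits)       # ['1', '9', '9', '9']
print(word_start)   # []
print(word_end)     # ['ain', 'ain']
print(inside)       # ['ain', 'ain']
print(non_word)     # [' ', ' ', ' ', ',', ' ']
print(at_start, at_end)  # True True
```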

Sets

A set is a set of characters inside a pair of square brackets [] with a special meaning:

Set	Description
[arn]	Returns a match where one of the specified characters (`a`, `r`, or `n`) are present
[a-n]	Returns a match for any lower case character, alphabetically between `a` and `n`
[^arn]	Returns a match for any character EXCEPT `a`, `r`, and `n`
[0123]	Returns a match where any of the specified digits (`0`, `1`, `2`, or `3`) are present
[0-9]	Returns a match for any digit between `0` and `9`
[0-5][0-9]	Returns a match for any two-digit number from `00` to `59`
[a-zA-Z]	Returns a match for any character alphabetically between `a` and `z`, lower case OR upper case
[+]	In sets, `+`, `*`, `.`, `|`, `()`, `$`, `{}` have no special meaning, so `[+]` means: return a match for any `+` character in the string
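
A small sketch of some of the sets above, using findall() to list what each one matches. The sample strings and variable names are invented for this illustration:

```python
import re

txt = "The rain in Spain at 08:45"

listed = re.findall("[arn]", txt)            # any of a, r, or n
two_digits = re.findall("[0-5][0-9]", txt)   # two-digit numbers 00 to 59
not_letters = re.findall("[^a-zA-Z ]", txt)  # anything except letters and spaces
plus_signs = re.findall("[+]", "1+1=2")      # a literal + inside a set

print(listed)       # ['r', 'a', 'n', 'n', 'a', 'n', 'a']
print(two_digits)   # ['08', '45']
print(not_letters)  # ['0', '8', ':', '4', '5']
print(plus_signs)   # ['+']
```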

The findall() Function

The findall() function returns a list containing all matches.

Example

Print a list of all matches:

import re

txt = "The rain in Spain"
x = re.findall("ai", txt)
print(x)

The list contains the matches in the order they are found.

If no matches are found, an empty list is returned:

Example

Return an empty list if no match was found:

import re

txt = "The rain in Spain"
x = re.findall("Portugal", txt)
print(x)

The search() Function

The search() function searches the string for a match, and returns a Match object if there is a match.

If there is more than one match, only the first occurrence of the match will be returned:

Example

Search for the first white-space character in the string:

import re

txt = "The rain in Spain"
x = re.search(r"\s", txt)

print("The first white-space character is located in position:", x.start())

If no matches are found, the value None is returned:

Example

Make a search that returns no match:

import re

txt = "The rain in Spain"
x = re.search("Portugal", txt)
print(x)
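
Because search() can return None, calling a method such as .start() directly on the result raises an AttributeError when nothing matches. One possible way to guard against this is sketched below. The helper name first_position and the -1 fallback are assumptions made for this example, not part of the re module:

```python
import re

def first_position(pattern, text):
    """Return the start index of the first match, or -1 if there is none."""
    match = re.search(pattern, text)
    if match is None:
        # No match: avoid calling .start() on None
        return -1
    return match.start()

txt = "The rain in Spain"
print(first_position(r"\s", txt))       # 3
print(first_position("Portugal", txt))  # -1
```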

The split() Function

The split() function returns a list where the string has been split at each match:

Example

Split at each white-space character:

import re

txt = "The rain in Spain"
x = re.split(r"\s", txt)
print(x)

You can control the number of splits by specifying the maxsplit parameter:

Example

Split the string only at the first occurrence:

import re

txt = "The rain in Spain"
x = re.split(r"\s", txt, maxsplit=1)
print(x)
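
Since split() cuts at each match, two white-space characters in a row produce an empty string between them. The sketch below compares the pattern \s with \s+, which treats a run of white space as one separator. The sample sentence with repeated spaces is made up for this example:

```python
import re

txt = "The  rain in   Spain"  # note the repeated spaces

single = re.split(r"\s", txt)   # splits at every single white-space character
runs = re.split(r"\s+", txt)    # splits once per run of white space

print(single)  # ['The', '', 'rain', 'in', '', '', 'Spain']
print(runs)    # ['The', 'rain', 'in', 'Spain']
```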

The sub() Function

The sub() function replaces the matches with the text of your choice:

Example

Replace every white-space character with the number 9:

import re

txt = "The rain in Spain"
x = re.sub(r"\s", "9", txt)
print(x)

You can control the number of replacements by specifying the count parameter:

Example

Replace the first 2 occurrences:

import re

txt = "The rain in Spain"
x = re.sub(r"\s", "9", txt, count=2)
print(x)

Match Object

A Match object is an object containing information about the search and the result.

Note: If there is no match, the value None is returned instead of a Match object.

Example

Do a search that will return a Match object:

import re

txt = "The rain in Spain"
x = re.search("ai", txt)
print(x) #this will print an object

The Match object has properties and methods used to retrieve information about the search and the result:

.span() returns a tuple containing the start and end positions of the match.
.string returns the string passed into the function.
.group() returns the part of the string where there was a match.

Example

Print the position (start and end position) of the first match occurrence.

The regular expression looks for any word that starts with an upper case "S":

import re

txt = "The rain in Spain"
x = re.search(r"\bS\w+", txt)
print(x.span())

Example

Print the string passed into the function:

import re

txt = "The rain in Spain"
x = re.search(r"\bS\w+", txt)
print(x.string)

Example

Print the part of the string where there was a match.

The regular expression looks for any word that starts with an upper case "S":

import re

txt = "The rain in Spain"
x = re.search(r"\bS\w+", txt)
print(x.group())

Note: If there is no match, the value None is returned instead of a Match object.
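
The three members are related: slicing .string with the positions from .span() gives the same text as .group(). The sketch below checks this and tests for None first. The variable names start, end, and same_text are illustrative:

```python
import re

txt = "The rain in Spain"
x = re.search(r"\bS\w+", txt)

if x is not None:
    start, end = x.span()
    # Slicing the original string with the span gives the matched text
    same_text = x.string[start:end] == x.group()
    print(start, end, x.group(), same_text)  # 12 17 Spain True
```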
