
[Python] Learn to Use Regular Expressions in One Article

ccvgpt 2025-05-14 12:10:33 Basic Tutorials

1. Basic Pattern Matching

To find the first match of a pattern in a string:

import re
text = "Search this string for patterns."
match = re.search(r"patterns", text)
if match:
    print("Pattern found!")
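
A match object carries more than a yes/no answer. The following is a minimal sketch, reusing the sample sentence from above, of reading the matched text and its position. The pattern "absent" is only an illustrative example of a pattern that does not occur.

```python
import re

text = "Search this string for patterns."
match = re.search(r"patterns", text)

if match:
    matched_text = match.group()   # the substring that matched
    start, end = match.span()      # half-open index range within text
    print(matched_text, start, end)

# re.search returns None when nothing matches, so test before using the result.
missing = re.search(r"absent", text)
print(missing is None)
```

Slicing text[start:end] gives back the same substring as match.group().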

2. Compiling Regular Expressions

To compile a regular expression for repeated use:


pattern = re.compile(r"patterns")
match = pattern.search(text)
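
Compiling pays off mainly when the same pattern is applied to many strings. A small sketch of that, where the sample lines are made up for illustration:

```python
import re

# Hypothetical sample lines; only the pattern comes from the text above.
lines = ["no match here", "two patterns, more patterns", "patterns first"]

pattern = re.compile(r"patterns")

# The compiled object exposes the same methods as the module-level functions.
hits = [line for line in lines if pattern.search(line)]
counts = [len(pattern.findall(line)) for line in lines]
print(hits)
print(counts)
```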

3. Matching at the Start or End

To check whether a string starts or ends with a pattern:

if re.match(r"^Search", text):
    print("Starts with 'Search'")
if re.search(r"patterns\.$", text):
    print("Ends with 'patterns.'")
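
The re module has three entry points that differ in where they are willing to match. A brief sketch comparing them on the same sample sentence:

```python
import re

text = "Search this string for patterns."

# re.match only tries at position 0 of the string.
starts = re.match(r"Search", text) is not None
# re.search scans the whole string; "$" pins the match to the end.
ends = re.search(r"patterns\.$", text) is not None
# re.fullmatch requires the pattern to cover the entire string.
whole = re.fullmatch(r"Search.*patterns\.", text) is not None
# re.match does not find a pattern that only occurs later in the string.
mid = re.match(r"this", text) is None
print(starts, ends, whole, mid)
```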

4. Finding All Matches

To find every occurrence of a pattern in a string:

all_matches = re.findall(r"\bt\w+", text)  # Finds words starting with 't'; \b stops matches inside a word
print(all_matches)
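
When the position of each match matters, re.finditer is one option: it yields match objects rather than plain strings. A minimal sketch on the same sample sentence:

```python
import re

text = "Search this string for patterns."

# finditer yields match objects lazily instead of building a list of strings.
positions = [(m.group(), m.start()) for m in re.finditer(r"\b\w+\b", text)]
print(positions)
```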

5. Search and Replace

To replace occurrences of a pattern in a string:

replaced_text = re.sub(r"string", "sentence", text)
print(replaced_text)
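
re.sub also accepts a replacement limit and a function in place of a replacement string. The sketch below illustrates both, plus re.subn; the specific patterns are chosen only for demonstration.

```python
import re

text = "Search this string for patterns."

# count limits how many occurrences are replaced (here: only the first "s").
first_only = re.sub(r"s", "S", text, count=1)

# A function can compute each replacement from the match object.
shouted = re.sub(r"\b\w{7,}\b", lambda m: m.group().upper(), text)

# subn also reports how many replacements were made.
result, n = re.subn(r"\s+", "_", text)
print(first_only)
print(shouted)
print(result, n)
```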

6. Splitting a String

To split a string at each occurrence of a pattern:

words = re.split(r"\s+", text)  # Split on one or more whitespace characters
print(words)
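
Two further behaviors of re.split are worth knowing: maxsplit caps the number of splits, and a capturing group in the pattern keeps the separators. A short sketch, where the string "a,b;c" is an invented example:

```python
import re

text = "Search this string for patterns."

# maxsplit stops after the given number of splits.
head, rest = re.split(r"\s+", text, maxsplit=1)

# A capturing group in the pattern keeps the separators in the result.
parts = re.split(r"(,|;)", "a,b;c")
print(head)
print(rest)
print(parts)
```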

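7. Escaping Special Characters

To match a special character literally, escape it with a backslash:

escaped = re.search(r"patterns\.", text)  # \. matches a literal dot
boundary = re.search(r"\bfor\b", text)  # \b is a word boundary, not an escaped character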
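
When the text to match comes from elsewhere (user input, for instance), escaping by hand is error-prone. The sketch below uses re.escape for that; the strings are hypothetical examples.

```python
import re

# Hypothetical input: a literal string that contains regex metacharacters.
literal = "1+1=2 (really?)"
haystack = "Everyone agrees that 1+1=2 (really?) is true."

# re.escape backslash-escapes the metacharacters so the text matches literally.
safe_pattern = re.escape(literal)
found = re.search(safe_pattern, haystack)
print(found.group() if found else None)

# Unescaped, "+" acts as a quantifier, so this pattern does not match "1+1=2".
unescaped = re.search(r"1+1=2", haystack)
print(unescaped)
```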

8. Grouping and Capturing

To group parts of a pattern and extract their values:

match = re.search(r"(\w+) (\w+)", text)
if match:
    print(match.group())  # The whole match
    print(match.group(1)) # The first group
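
Beyond group(n), a match object offers a few other ways to read captured groups. A small sketch using the same pattern:

```python
import re

text = "Search this string for patterns."
match = re.search(r"(\w+) (\w+)", text)

if match:
    first, second = match.groups()   # all captured groups as a tuple
    pair = match.group(1, 2)         # several groups at once
    second_span = match.span(2)      # index range of the second group
    print(first, second, pair, second_span)
```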

9. Non-Capturing Groups

To group part of a pattern without capturing it:

match = re.search(r"(?:\w+) (\w+)", text)
if match:
    print(match.group(1))  # The first (and only) group
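
The difference is most visible with re.findall, which returns group contents whenever the pattern has a capturing group. A sketch with made-up log text:

```python
import re

# Hypothetical sample data for illustration.
log = "error: disk full; warning: fan slow; error: no route"

# With a capturing group, findall returns only the group contents.
captured = re.findall(r"(error|warning): \w+", log)
# With a non-capturing group, findall returns the whole match.
whole = re.findall(r"(?:error|warning): \w+", log)
print(captured)
print(whole)
```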

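10. Lookahead and Lookbehind Assertions

To match a pattern based on what precedes or follows it, without including that context in the match: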

lookahead = re.search(r"\b\w+(?= string)", text)  # Word before ' string'
lookbehind = re.search(r"(?<=Search )\w+", text)  # Word after 'Search '
if lookahead:
    print(lookahead.group())
if lookbehind:
    print(lookbehind.group())
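
Both assertions also have negative forms, (?!...) and (?<!...), which succeed when the context is absent. A minimal sketch on an invented price list:

```python
import re

# Hypothetical price list, used only for illustration.
prices = "USD100 EUR200 USD300"

# Positive lookbehind: digits that directly follow "USD".
usd = re.findall(r"(?<=USD)\d+", prices)

# Negative lookbehind: digit runs that do not follow "USD".
# The second assertion stops a match from starting in the middle of a number.
non_usd = re.findall(r"(?<!USD)(?<!\d)\d+", prices)

# Negative lookahead: three-letter codes not immediately followed by "1".
not_100 = re.findall(r"\b[A-Z]{3}(?!1)", prices)
print(usd, non_usd, not_100)
```

Note that Python's re module requires lookbehind patterns to have a fixed width, which both lookbehinds here satisfy.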

11. Flags That Modify Matching Behavior

To change how a pattern matches with a flag such as re.IGNORECASE:

case_insensitive = re.findall(r"search", text, re.IGNORECASE)
print(case_insensitive)
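
Flags can be combined, and some can be written inside the pattern itself. The following sketch shows both forms along with re.DOTALL; the two-line sample text is an assumption made for the demonstration.

```python
import re

# Hypothetical two-line sample text.
text = "Search this string for patterns.\nsearch again"

# Flags combine with the bitwise OR operator.
both = re.findall(r"^search", text, re.IGNORECASE | re.MULTILINE)
# The same option can be written inline at the start of the pattern.
inline = re.findall(r"(?i)search", text)
# By default "." does not match a newline; re.DOTALL changes that.
no_dotall = re.search(r"patterns\..search", text)
dotall = re.search(r"patterns\..search", text, re.DOTALL)
print(both, inline, no_dotall, dotall is not None)
```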

12. Using Named Groups

To assign names to groups and refer to them by name:

match = re.search(r"(?P<first>\w+) (?P<second>\w+)", text)
if match:
    print(match.group('first'))
    print(match.group('second'))
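
Named groups are also available as a dictionary and can be referenced in replacement strings. A brief sketch with the same pattern:

```python
import re

text = "Search this string for patterns."
name_pattern = r"(?P<first>\w+) (?P<second>\w+)"

match = re.search(name_pattern, text)
groups = match.groupdict() if match else {}
print(groups)

# In a replacement string, \g<name> refers to a named group.
swapped = re.sub(name_pattern, r"\g<second> \g<first>", text, count=1)
print(swapped)
```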

13. Multiline Matching

To make ^ and $ match at the start and end of every line, use the re.MULTILINE flag:

multi_line_text = "Start\nmiddle end"
matches = re.findall(r"^m\w+", multi_line_text, re.MULTILINE)
print(matches)
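
To see what the flag changes, it helps to run the same pattern with and without it. A small sketch on the same two-line string:

```python
import re

multi_line_text = "Start\nmiddle end"

# Without the flag, "^" matches only at the very start of the string.
without_flag = re.findall(r"^\w+", multi_line_text)
# With the flag, "^" also matches right after each newline.
with_flag = re.findall(r"^\w+", multi_line_text, re.MULTILINE)
# Likewise, "$" then matches before each newline as well as at the end.
line_ends = re.findall(r"\w+$", multi_line_text, re.MULTILINE)
print(without_flag, with_flag, line_ends)
```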

14. Lazy Quantifiers

To match as few characters as possible, use a lazy quantifier (*?, +?, ??):

html = "<body><h1>Title</h1></body>"
match = re.search(r"<.*?>", html)
if match:
    print(match.group())  # Matches '<body>'
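
Comparing the greedy and lazy forms side by side makes the difference concrete. A minimal sketch on the same HTML string:

```python
import re

html = "<body><h1>Title</h1></body>"

# Greedy: ".*" grabs as much as possible, up to the last ">".
greedy = re.search(r"<.*>", html).group()
# Lazy: ".*?" stops at the first ">" that lets the pattern succeed.
lazy = re.search(r"<.*?>", html).group()
# With findall, the lazy form picks out each tag separately.
tags = re.findall(r"<.*?>", html)
print(greedy)
print(lazy)
print(tags)
```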

15. Verbose Regular Expressions

To write a more readable regular expression, use re.VERBOSE:

pattern = re.compile(r"""
    \b      # Word boundary
    \w+     # One or more word characters
    \s      # Space
    """, re.VERBOSE)
match = pattern.search(text)
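
Verbose mode is most useful for longer patterns. The sketch below uses a hypothetical date pattern and sample sentence, neither of which comes from the original article, and also shows how to match a literal space, since verbose mode ignores plain whitespace in the pattern.

```python
import re

# Hypothetical pattern for ISO-style dates, for illustration only.
date_pattern = re.compile(r"""
    (?P<year>\d{4})    # four-digit year
    -                  # literal separator
    (?P<month>\d{2})   # two-digit month
    -                  # literal separator
    (?P<day>\d{2})     # two-digit day
    """, re.VERBOSE)

date_match = date_pattern.search("Published on 2025-05-14 at noon")
date_parts = date_match.groupdict() if date_match else {}
print(date_parts)

# Whitespace in a verbose pattern is ignored, so a literal space
# has to be written as a character class or escaped.
two_words = re.compile(r"""
    \w+   # a word
    [ ]   # a literal space
    \w+   # another word
    """, re.VERBOSE)
print(two_words.search("Search this string").group())
```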
