
[Q&A] Regex re.sub does not match the content of all the groups

Category: Regex | Author: crifan
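Q:
On my own post
[Summary] A detailed guide to re.sub in Python
a reader left this comment:
Hello, I have a question. Take a string strs = "it is a good job. he is very happy". With re.sub("(it) is", "\1’s", strs) I get "it’s a good job. he is very happy". But when I use re.sub("(it|he) is", "\1’s", strs) in the same way, it does not replace both occurrences (it is, he is) -> (it’s, he’s). If this is possible, how should it be written?
A:
I tried it: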
import re


inputStr = "it is a good job. he is very happy"
replacedStr = re.sub("(it) is", "\1’s", inputStr)
print("replacedStr=%s" % replacedStr)
-> This did not produce: it’s a good job. he is very happy
Instead, the output was only:
replacedStr=’s a good job. he is very happy
Compare the example from my own tutorial:
inputStr = "hello crifan, nihao crifan"

replacedStr = re.sub(r"hello (\w+), nihao \1", r"\g<1>", inputStr)

print("replacedStr=%s" % replacedStr)  # crifan
Switching to the
\g<1>
syntax:
replacedStrWithG = re.sub("(it) is", r"\g<1>’s", inputStr)
print("replacedStrWithG=%s" % replacedStrWithG)
finally gives:
replacedStrWithG=it’s a good job. he is very happy
The same logic applies to the second case:
replaceAllStr = re.sub("(it|he) is", "\1’s", inputStr)
print("replaceAllStr=%s" % replaceAllStr)
only yields:
replaceAllStr=’s a good job. ’s very happy
replaceAllStrWithG = re.sub("(it|he) is", r"\g<1>’s", inputStr)
print("replaceAllStrWithG=%s" % replaceAllStrWithG)
whereas this does yield:
replaceAllStrWithG=it’s a good job. he’s very happy
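Why the commenter's version failed:
The replacement was written as the ordinary string literal "\1’s". In a non-raw Python string, \1 is an octal escape for the control character U+0001, so re.sub never receives a group reference and inserts that invisible character instead of the captured text. \g is not a string escape, so "\g<1>" reaches re.sub with its backslash intact, which is why that form works. Writing the replacement as a raw string, r"\1’s", also works.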
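To make the difference visible, here is a minimal sketch that compares a plain and a raw replacement string. The variable names (input_str, plain, raw, broken, fixed) are only illustrative and not part of the original post.

```python
import re

input_str = "it is a good job. he is very happy"

# In a normal string literal, "\1" is an octal escape: one character, U+0001.
plain = "\1’s"
# In a raw string the backslash is kept, so re.sub sees a reference to group 1.
raw = r"\1’s"
print(len(plain), len(raw))  # 3 4

broken = re.sub("(it|he) is", plain, input_str)
fixed = re.sub("(it|he) is", raw, input_str)

# repr() reveals the otherwise invisible control character.
print(repr(broken))
print(fixed)  # it’s a good job. he’s very happy
```

Printing with repr() is a quick way to check what a replacement string really contains before blaming the pattern.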
In my earlier example:
inputStr = "hello crifan, nihao crifan"
the goal is to match:
hello xxx, nihao xxx
where the second xxx must be identical to the first, which is why the pattern itself uses:
r"hello (\w+), nihao \1"
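In the replacement string of re.sub, a group captured by the pattern can be referenced with
\g<N>
N = 1, 2, 3, ...
(a raw-string \N works as well; \g<N> stays unambiguous when a digit follows the reference).
Concretely:
replacedStr = re.sub(r"hello (\w+), nihao \1", r"\g<1>", inputStr)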
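One practical reason to prefer \g<N> in a replacement is that it stays unambiguous when a digit follows the group reference. The sketch below uses a made-up input ("item 7") and made-up variable names purely for illustration.

```python
import re

text = "item 7"

# \g<1>0 means "group 1, then a literal 0".
appended = re.sub(r"item (\d)", r"item \g<1>0", text)
print(appended)  # item 70

# r"\10" is parsed as a reference to group 10, which this pattern does not have.
try:
    re.sub(r"item (\d)", r"item \10", text)
    ambiguous_failed = False
except re.error as exc:
    ambiguous_failed = True
    print("re.error:", exc)
```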
Or with a named group:
inputStr = "hello crifan, nihao crifan"

replacedStr = re.sub(r"hello (?P<name>\w+), nihao (?P=name)", r"\g<name>", inputStr)
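Inside the pattern, \1 and (?P=name) are backreferences: they only match when the second word repeats the first. A small sketch, where the second input string ("hello crifan, nihao python") is an assumed counterexample that does not appear in the original post:

```python
import re

numbered = r"hello (\w+), nihao \1"
named = r"hello (?P<name>\w+), nihao (?P=name)"

same = "hello crifan, nihao crifan"
different = "hello crifan, nihao python"

for pattern in (numbered, named):
    # The backreference forces the second word to equal the captured one.
    print(bool(re.search(pattern, same)), bool(re.search(pattern, different)))
# prints "True False" twice
```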
For the example in this question, that becomes:
replaceAllStrNamedGroup = re.sub("(?P<hostPart>it|he) is", r"\g<hostPart>’s", inputStr)
print("replaceAllStrNamedGroup=%s" % replaceAllStrNamedGroup)
which likewise gives the result we want:
replaceAllStrNamedGroup=it’s a good job. he’s very happy
Complete code:
# Function: re.sub not match all, comment on https://www.crifan.com/python_re_sub_detailed_introduction/
# Author: Crifan Li
# Update: 20210816


import re


inputStr = "it is a good job. he is very happy"
replacedStr = re.sub("(it) is", "\1’s", inputStr)
print("replacedStr=%s" % replacedStr)
# replacedStr=’s a good job. he is very happy
# -> did not produce: it’s a good job. he is very happy
# (in a non-raw string, "\1" is the character U+0001, not a group reference)


replacedStrWithG = re.sub("(it) is", r"\g<1>’s", inputStr)
print("replacedStrWithG=%s" % replacedStrWithG)
# replacedStrWithG=it’s a good job. he is very happy


replaceAllStr = re.sub("(it|he) is", "\1’s", inputStr)
print("replaceAllStr=%s" % replaceAllStr)
# replaceAllStr=’s a good job. ’s very happy


replaceAllStrWithG = re.sub("(it|he) is", r"\g<1>’s", inputStr)
print("replaceAllStrWithG=%s" % replaceAllStrWithG)
# replaceAllStrWithG=it’s a good job. he’s very happy

replaceAllStrNamedGroup = re.sub("(?P<hostPart>it|he) is", r"\g<hostPart>’s", inputStr)
print("replaceAllStrNamedGroup=%s" % replaceAllStrNamedGroup)
# replaceAllStrNamedGroup=it’s a good job. he’s very happy
