[Solved] Python regex re.match raises an exception: error multiple repeat at position

Category: Errors, author: crifan
Background:
[Solved] Using Python to merge titles and links in Evernote notes
While working on that, this code:
        # <div><a href="https://npm.taobao.org/mirrors/">https://npm.taobao.org/mirrors/</a></div>
        # <div><a href="https://npm.taobao.org/mirrors/python/">https://npm.taobao.org/mirrors/python/</a></div>
        # if hrefValue != aStr:
        hrefP = "(https?://)?%s/?" % aStr
        isSameUrl = re.match(hrefP, hrefValue, re.I)
raised an error:
Exception: error
multiple repeat at position 61
File "/Users/xxx/dev/crifan/EvernoteToWordpress/EvernoteToWordpress.py", line 200, in mergeNoteTitleAndUrl
with these values:
hrefP=
'(https?://)?Is there a way to get a call graph for certain c++ function in Visual Studio? - Stack Overflow/?'

hrefValue=
'https://stackoverflow.com/questions/16190058/is-there-a-way-to-get-a-call-graph-for-certain-c-function-in-visual-studio'
The match never runs; re.match raises instead:
multiple repeat at position 61
Searched for:
python re multiple repeat at position
and found:
python – sre_constants.error: multiple repeat at position 2 – Stack Overflow
So do the special characters in
'(https?://)?Is there a way to get a call graph for certain c++ function in Visual Studio? - Stack Overflow/?'
need to be escaped? That is, the two + in
c++
and the ? in
Studio?
Probably, so I added:
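        # "\\+" is a backslash followed by "+"; a bare "\+" is an invalid escape sequence in a normal string
        validAStr = aStr.replace("+", "\\+")
        validAStr = validAStr.replace("?", "\\?")
        validAStr = validAStr.replace("*", "\\*")
        validAStr = validAStr.replace(".", "\\.")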
Then refactored into:
        SpecialCharList = [
            ".",
            "+",
            "*",
            "?",
        ]
        for eachSpecialChar in SpecialCharList:
            escapedChar = "\\%s" % eachSpecialChar
            aStr = aStr.replace(eachSpecialChar, escapedChar)
Result: the string
'Is there a way to get a call graph for certain c++ function in Visual Studio? - Stack Overflow'
was replaced with (shown as its repr, so each backslash appears doubled):
'Is there a way to get a call graph for certain c\\+\\+ function in Visual Studio\\? - Stack Overflow'
Matching again no longer raises the error.
[Summary]
In the code
        hrefP = "(https?://)?%s/?" % aStr
        isSameUrl = re.match(hrefP, hrefValue, re.I)
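the value aStr that gets inserted into the pattern,
'Is there a way to get a call graph for certain c++ function in Visual Studio? - Stack Overflow'
contains regex metacharacters, including two consecutive ones: ++
The first + is a quantifier on c, and the second + then tries to repeat that quantifier, which the re module rejected here as an invalid rule. (Since Python 3.11, ++ is parsed as a possessive quantifier, so the pattern compiles but still does not match the literal text c++.)
Hence the error:
Exception: error
multiple repeat at position 61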
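To see which character the reported position refers to, a quick check like the one below can help. It is only a sketch: hrefP is the pattern quoted above, and errorPos is a name introduced here for illustration. The position in the message is a 0-based index into the pattern string.

```python
# The pattern from the error report above.
hrefP = "(https?://)?Is there a way to get a call graph for certain c++ function in Visual Studio? - Stack Overflow/?"

# Position copied from the error message (hypothetical variable name).
errorPos = 61

# The character at the reported position, and a little context before it.
offendingChar = hrefP[errorPos]
context = hrefP[errorPos - 2:errorPos + 1]

print(repr(offendingChar))  # '+'
print(repr(context))        # 'c++'
```

So the position points at the second + of c++, the quantifier that follows another quantifier.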
Solution:
Make sure the regex pattern contains no invalid rules.
Here that means escaping every special character by prefixing it with a backslash \:
        SpecialCharList = [
            ".",
            "+",
            "*",
            "?",
        ]
        for eachSpecialChar in SpecialCharList:
            escapedChar = "\\%s" % eachSpecialChar # e.g. '\\.' for "."
            aStr = aStr.replace(eachSpecialChar, escapedChar)
        # 'Is there a way to get a call graph for certain c++ function in Visual Studio? - Stack Overflow'
        # ->
        # 'Is there a way to get a call graph for certain c\\+\\+ function in Visual Studio\\? - Stack Overflow'

        # build the pattern only after aStr has been escaped
        hrefP = "(https?://)?%s/?" % aStr
        isSameUrl = re.match(hrefP, hrefValue, re.I)
This avoids the problem.
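The hand-written list above only covers . + * ? and leaves other metacharacters such as ( ) [ ] { } | ^ $ and \ unescaped. As an alternative, the standard library's re.escape escapes all of them in one call. The following is a minimal sketch, not the code used in the original project; buildHrefPattern is a hypothetical helper name, and the sample values are the ones quoted earlier in this post.

```python
import re

def buildHrefPattern(aStr):
    """Hypothetical helper: build the URL pattern with aStr matched literally."""
    # re.escape backslash-escapes every regex metacharacter in aStr
    return "(https?://)?%s/?" % re.escape(aStr)

# Values from the failing case: the link text is a page title, not a URL.
aStr = "Is there a way to get a call graph for certain c++ function in Visual Studio? - Stack Overflow"
hrefValue = "https://stackoverflow.com/questions/16190058/is-there-a-way-to-get-a-call-graph-for-certain-c-function-in-visual-studio"

# No exception is raised; the title simply does not match the URL.
isSameUrl = re.match(buildHrefPattern(aStr), hrefValue, re.I)
print(isSameUrl)  # None

# A case where the link text is the URL itself, as in the code comments earlier.
sameText = re.match(
    buildHrefPattern("https://npm.taobao.org/mirrors/"),
    "https://npm.taobao.org/mirrors/",
    re.I,
)
print(sameText is not None)  # True
```

Because the escaping is applied to the whole string before it is placed in the pattern, no character in the link text can be interpreted as a regex rule.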
