
【Solved】Error when using re.search in Python: raise error, v # invalid expression, sre_constants.error: syntax error

Python re crifan 5377 views 0 comments

【Problem】

In Python, while working with regular expressions, I used the following code:

#http://autoexplosion.com/cars/buy/150594.php
foundMainType = re.search("http://autoexplosion\.com/(?<mainType>\w+)/buy/(?<adId>\d+)\.php", itemLink);

It failed with this error:

Traceback (most recent call last):
  File "E:\Dev_Root\freelance\Elance\projects\40377988_data_mining\40377988_data_mining\40377988_data_mining.py", line 380, in <module>
    main();
  File "E:\Dev_Root\freelance\Elance\projects\40377988_data_mining\40377988_data_mining\40377988_data_mining.py", line 304, in main
    itemInfoDict = processEachItem(itemLink);
  File "E:\Dev_Root\freelance\Elance\projects\40377988_data_mining\40377988_data_mining\40377988_data_mining.py", line 183, in processEachItem
    foundMainType = re.search("http://autoexplosion\.com/(?<mainType>\w+)/buy/(?<adId>\d+)\.php", itemLink);
  File "E:\dev_install_root\Python27\lib\re.py", line 142, in search
    return _compile(pattern, flags).search(string)
  File "E:\dev_install_root\Python27\lib\re.py", line 244, in _compile
    raise error, v # invalid expression
sre_constants.error: syntax error

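【Solution process】

1. I debugged for quite a while and still could not find the cause of the error.

2. Later I went back to the re syntax documentation and found that the named group syntax is: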

(?P<name>...)

Similar to regular parentheses, but the substring matched by the group is accessible within the rest of the regular expression via the symbolic group name name. Group names must be valid Python identifiers, and each group name must be defined only once within a regular expression. A symbolic group is also a numbered group, just as if the group were not named. So the group named id in the example below can also be referenced as the numbered group 1.

For example, if the pattern is (?P<id>[a-zA-Z_]\w*), the group can be referenced by its name in arguments to methods of match objects, such as m.group('id') or m.end('id'), and also by name in the regular expression itself (using (?P=id)) and replacement text given to .sub() (using \g<id>).
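The three ways of referring to a named group described in the quoted documentation can be sketched as follows. This is only an illustration: the sample strings and variable names are made up for this example and are not from the original script.

```python
import re

# Pattern taken from the quoted documentation: a named group "id"
# that matches a Python-style identifier.
pattern = re.compile(r"(?P<id>[a-zA-Z_]\w*)")

# 1) Access the group through the match object, by name or by number.
m = pattern.search("42 foo_bar baz")   # sample input, assumed for illustration
by_name = m.group("id")                # "foo_bar"
by_number = m.group(1)                 # same group, numbered 1
end_pos = m.end("id")                  # index just past the match: 10

# 2) Refer back to the group inside the pattern itself with (?P=id).
#    Here it finds a word that is immediately repeated.
doubled = re.search(r"\b(?P<id>[a-zA-Z_]\w*) (?P=id)\b", "this is is a test")
repeated_word = doubled.group("id")    # "is"

# 3) Refer to the group in replacement text with \g<id>.
wrapped = pattern.sub(r"<\g<id>>", "a b")   # "<a> <b>"

print(by_name, by_number, end_pos, repeated_word, wrapped)
```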

That is, the syntax is:

(?P<xxx>…)

and not:

(?<xxx>…)

So I changed the code to:

#http://autoexplosion.com/cars/buy/150594.php

foundMainType = re.search(r"http://autoexplosion\.com/(?P<mainType>\w+)/buy/(?P<adId>\d+)\.php", itemLink)

and then it worked.
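As a quick check, here is a minimal self-contained sketch that runs the (?P<name>...) form of the pattern against the sample URL from the comment. The value assigned to itemLink is an assumption for illustration; in the real script it comes from the scraped page.

```python
import re

# Sample link taken from the comment in the post; in the real script
# itemLink is produced elsewhere.
itemLink = "http://autoexplosion.com/cars/buy/150594.php"

# Python named groups use (?P<name>...). A raw string keeps the
# backslashes in \. \w \d from being treated as string escapes.
foundMainType = re.search(
    r"http://autoexplosion\.com/(?P<mainType>\w+)/buy/(?P<adId>\d+)\.php",
    itemLink,
)

if foundMainType:
    mainType = foundMainType.group("mainType")   # "cars"
    adId = foundMainType.group("adId")           # "150594"
    groups = foundMainType.groupdict()           # both groups as a dict
    print(mainType, adId)
else:
    print("no match")
```

groupdict() is handy when a pattern has several named groups, since it returns all of them in one dictionary keyed by group name.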

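3. As for why I mistakenly wrote:

(?<xxx>…)

in the first place: I had been writing too much C# recently and too little Python.

Note: in C# regular expressions, a named group is written (?<xxx>…)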
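If a pattern has been copied over from C#, a small helper can rewrite the C#-style named groups into the Python form. The sketch below is a naive illustration: the function name dotnet_named_groups_to_python is hypothetical, and it only handles the plain (?<name>…) form. It does not check whether the parenthesis is escaped or sits inside a character class, and it ignores other .NET-only constructs.

```python
import re

# Matches "(?<name>" where name is an identifier. Lookbehinds such as
# (?<=...) and (?<!...) are left alone, because "=" and "!" cannot
# start an identifier.
_DOTNET_NAMED_GROUP = re.compile(r"\(\?<([A-Za-z_]\w*)>")


def dotnet_named_groups_to_python(pattern):
    """Rewrite (?<name>...) as (?P<name>...). Hypothetical helper, naive by design."""
    return _DOTNET_NAMED_GROUP.sub(r"(?P<\1>", pattern)


# The C#-style pattern from the post.
csharp_style = r"http://autoexplosion\.com/(?<mainType>\w+)/buy/(?<adId>\d+)\.php"
python_style = dotnet_named_groups_to_python(csharp_style)

# A lookbehind stays unchanged.
lookbehind = dotnet_named_groups_to_python(r"(?<=abc)def")

result = re.search(python_style, "http://autoexplosion.com/cars/buy/150594.php")
print(python_style)
print(result.groupdict())
```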

 

【Summary】

It seems that once you stop writing a kind of code regularly, even things you used to know well can be forgotten or mixed up.
