While tinkering,
I had text like this:

and wanted to match:
multiple scripts,
that is, a group of groups.
My first attempt was:
<code>scriptMatch = re.search("(?P<scriptList>(?P<singleScript>place.+?)\n{2,1000})+", allLine, flags=re.I | re.M | re.DOTALL)
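</code>but the scriptList I got back held only the first script, not all of the scripts as I had expected. A repeated capturing group stores the text of a single iteration only, so one re.search call cannot return the whole list.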
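To see why a repeated group cannot hand back a list, here is a minimal sketch on a made-up string (the text and the group name `item` are illustrative assumptions, not from my data). In Python's re, a group that matches several times keeps only its final iteration:

```python
import re

text = "a1 a2 a3"

# The group is repeated three times, but a group slot stores one value only.
m = re.search(r"(?P<item>a\d ?)+", text)

whole = m.group(0)       # the entire repeated run
last = m.group("item")   # only the final iteration of the group

print("whole=%r" % whole)
print("last=%r" % last)
```

So the pattern itself does match the whole run; it is only the per-iteration captures that are lost.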

python re multiple group of group
Python Re: Multiple Capturing Groups – Stack Overflow
regex – Python Regular Expression Multiple Groups – Stack Overflow
python – How to match multiple groups regex – Stack Overflow
searching for a collection of groups with re
They all seem to mention re.finditer,
so let's look up re.finditer:
https://docs.python.org/3/library/re.html#re.finditer
“re.finditer(pattern, string, flags=0)
Return an iterator yielding match objects over all non-overlapping matches for the RE pattern in string. The string is scanned left-to-right, and matches are returned in the order found. Empty matches are included in the result.
 
Changed in version 3.7: Non-empty matches can now start just after a previous empty match.
 ”
Then I gave it a try.
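As a quick illustration of what the documentation describes, a minimal sketch (the string `sample` and the simplified pattern are assumptions made up for this example):

```python
import re

# Two tiny "scripts" separated by a blank line.
sample = "place: A\n\nplace: B\n\n"

# re.finditer yields one match object per non-overlapping match,
# so the named group can be read from each match separately.
places = [m.group("place") for m in re.finditer(r"place: (?P<place>\w+)", sample)]

print(places)
```

Each script becomes its own match, which is exactly what a single repeated group could not give me.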

Repeating a Capturing Group vs. Capturing a Repeated Group
After debugging for quite a while and adding
the re.DOTALL flag, the search finally found the matches:
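The reason re.DOTALL matters here: without it, `.` does not match a newline, so `.+?` cannot span the several lines of one script. A minimal sketch on a shortened, made-up block (`block` and the group name `body` are assumptions for illustration):

```python
import re

block = "place: home\ntopic: house\n\n"
pattern = r"place:(?P<body>.+?)\n{2,}"

# Without DOTALL, "." stops at the first newline, so the blank line is never reached.
without_dotall = re.search(pattern, block, flags=re.I)

# With DOTALL, "." also matches "\n" and the lazy group can cross line breaks.
with_dotall = re.search(pattern, block, flags=re.I | re.DOTALL)

print("without_dotall=%s" % without_dotall)
print("body=%r" % with_dotall.group("body"))
```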

Then I hit another pitfall:
an iterator object is empty once it has been consumed.
So in this code:
<code>    matchIterator = re.finditer(singleScriptPattern, allLine, flags=re.I | re.M | re.DOTALL)
    print("matchIterator=%s" % matchIterator)
    if matchIterator:
      for scriptNum, eachScriptMatch in enumerate(matchIterator):
        print("[%d] eachScriptMatch=%s" % (scriptNum, eachScriptMatch))
        singleScript = eachScriptMatch.group("singleScript")
        print("singleScript=%s" % singleScript)
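</code>by the time execution reached:
if matchIterator:
matchIterator was already exhausted, so the for loop that followed produced nothing and I did not get the data I wanted. Note that the if test itself consumes nothing and is always true for an iterator object, so it cannot tell you whether there are any matches; the iterator was presumably drained by an earlier access (for example, expanding it in the debugger).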
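The iterator behaviour can be reproduced in isolation. A minimal sketch with made-up strings, assuming nothing beyond the standard re module; if you need to check for emptiness or loop more than once, materializing the matches with `list()` is one option:

```python
import re

# An iterator object is truthy even when it will yield zero matches.
empty_iter = re.finditer(r"\d+", "no digits here")
always_true = bool(empty_iter)

# A finditer result can be walked only once.
number_iter = re.finditer(r"\d+", "1 22 333")
first_pass = [m.group(0) for m in number_iter]
second_pass = [m.group(0) for m in number_iter]  # already exhausted

# Materialize the matches when they are needed more than once.
matches = list(re.finditer(r"\d+", "1 22 333"))
if matches:
    print("found %d matches" % len(matches))
```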

【Summary】
Here, simply use:
<code>matchIterator = re.finditer(yourPattern, toSearchStr, flags) </code>
to get an iterator, then use a for loop to fetch each match in turn.
If yourPattern contains named groups, the values in each match can be read directly with .group("xx"). For example, for this text:
<code>
Place: School canteen
Topic: food
Tittle:Have lunch
Age: 3-4
J: What did you have for lunch?
L: I ate rice, fish and bread.
J: Do you like rice?
L: Yes, I do.
J: Do you like fish?
L: Yes, I do.
J: Do you like bread?
L: No, I don’t.
J: What did you drink?
L: I drank milk.
J: Do you like milk?
L: Yes, I do.

Place: home
Topic: house
Tittle: Doing housework
Age: 4-5
J: Do you like cooking, mom?
M: Yes, I do a lot. What about you?
J: Mom, you know me. I can’t cook.
M: But can you help me wash dishes?
J: Yes, I can help you.
M: Let’s make a deal, ok?
J: What kind of deal?
M: I’m going to cook.
J: And then?
M: Then you wash the dishes after the meal.
J: That’s ok. I’ d like to help you mom.
M: You are a good boy.

...
</code>
The regex code:
<code>    singleScriptPattern = r"(?P<singleScript>place:(?P<place>.+?)\ntopic:(?P<topic>.+?)\ntittle:(?P<title>.+?)\nage:(?P<age>.+?)\n(?P<content>.+?))\n{2,1000}"
    matchIterator = re.finditer(singleScriptPattern, allLine, flags=re.I | re.M | re.DOTALL)
    print("matchIterator=%s" % matchIterator)
    # if matchIterator:
    for scriptNum, eachScriptMatch in enumerate(matchIterator):
      print("[%d] eachScriptMatch=%s" % (scriptNum, eachScriptMatch))
      singleScript = eachScriptMatch.group("singleScript")
      print("singleScript=%s" % singleScript)
      place = eachScriptMatch.group("place")
      print("place=%s" % place)
      topic = eachScriptMatch.group("topic")
      print("topic=%s" % topic)
      title = eachScriptMatch.group("title")
      print("title=%s" % title)
      age = eachScriptMatch.group("age")
      print("age=%s" % age)
      content = eachScriptMatch.group("content")
      print("content=%s" % content)
</code>Result:
<code>matchIterator=<callable_iterator object at 0x10e3f7b70>
[0] eachScriptMatch=<_sre.SRE_Match object; span=(1, 309), match='Place: School canteen\nTopic: food\nTittle:Have l>
singleScript=Place: School canteen
Topic: food
Tittle:Have lunch
Age: 3-4
J: What did you have for lunch?
L: I ate rice, fish and bread.
J: Do you like rice?
L: Yes, I do.
J: Do you like fish?
L: Yes, I do.
J: Do you like bread?
L: No, I don’t.
J: What did you drink?
L: I drank milk.
J: Do you like milk?
L: Yes, I do.
place= School canteen
topic= food
title=Have lunch
age= 3-4
content=J: What did you have for lunch?
L: I ate rice, fish and bread.
J: Do you like rice?
L: Yes, I do.
J: Do you like fish?
L: Yes, I do.
J: Do you like bread?
L: No, I don’t.
J: What did you drink?
L: I drank milk.
J: Do you like milk?
L: Yes, I do.
</code>
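For reference, a self-contained sketch of the same approach that can be run as-is. The text in `all_line` is a shortened stand-in for my real data, and collecting the fields with `groupdict()` plus `strip()` is a convenience I am adding here, not part of the code above:

```python
import re

# Shortened stand-in for the real text; each script ends with a blank line.
all_line = (
    "\nPlace: School canteen\nTopic: food\nTittle:Have lunch\nAge: 3-4\n"
    "J: Do you like rice?\nL: Yes, I do.\n\n"
    "Place: home\nTopic: house\nTittle: Doing housework\nAge: 4-5\n"
    "J: Do you like cooking, mom?\nM: Yes, I do a lot.\n\n"
)

# Same pattern as above ("tittle" is spelled the way it appears in the data).
single_script_pattern = (
    r"(?P<singleScript>place:(?P<place>.+?)\ntopic:(?P<topic>.+?)"
    r"\ntittle:(?P<title>.+?)\nage:(?P<age>.+?)\n(?P<content>.+?))\n{2,1000}"
)

# One dict per script; strip() drops the space left after each "key:".
scripts = [
    {name: value.strip() for name, value in m.groupdict().items()}
    for m in re.finditer(single_script_pattern, all_line, flags=re.I | re.DOTALL)
]

for script_num, script in enumerate(scripts):
    print("[%d] place=%s, topic=%s, title=%s, age=%s"
          % (script_num, script["place"], script["topic"], script["title"], script["age"]))
```

One thing to keep in mind: the pattern requires at least two newlines after every script, so the last script is only matched if the text ends with a blank line.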

When reposting, please credit: 在路上 » [Solved] Searching for a collection of groups with Python's re