Python 2 re module: confused about the MatchObject group family of methods
PHP中文网 2017-04-18 09:43:31
[Python discussion group]

Question

I'm not sure whether my understanding is wrong or my requirement is wrong.

My problem is described in the related code (below).

  • RegexObject.search (not re.search)

search(string[, pos[, endpos]])
Scan through string looking for a location where this regular expression produces a match, and return a corresponding MatchObject instance. Return None if no position in the string matches the pattern; note that this is different from finding a zero-length match at some point in the string.

The optional second parameter pos gives an index in the string where the search is to start; it defaults to 0. This is not completely equivalent to slicing the string; the '^' pattern character matches at the real beginning of the string and at positions just after a newline, but not necessarily at the index where the search is to start.

The optional parameter endpos limits how far the string will be searched; it will be as if the string is endpos characters long, so only the characters from pos to endpos - 1 will be searched for a match. If endpos is less than pos, no match will be found, otherwise, if rx is a compiled regular expression object, rx.search(string, 0, 50) is equivalent to rx.search(string[:50], 0).
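
A minimal sketch of how pos and endpos behave, in case it helps to see the quoted paragraphs in action. The sample string and the variable names here are made up for illustration; rx is the name the documentation above uses for a compiled pattern.

```python
import re

rx = re.compile(r'\d{3}')
text = "abc123def456"

# pos=6 starts the search at "def456", so the first match found is "456".
second = rx.search(text, 6).group(0)

# endpos=5 treats the string as if it were only "abc12",
# so there is no run of three digits to find.
truncated = rx.search(text, 0, 5)

# '^' anchors at the real start of the string, not at pos.
anchored = re.compile(r'^def')
at_pos = anchored.search(text, 6)    # no match: index 6 is not the real start
sliced = anchored.search(text[6:])   # match: the slice is a new string

print(second)
print(truncated)
print(at_pos)
print(sliced is not None)
```

This is why passing pos is "not completely equivalent to slicing the string".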

  • MatchObject group family of methods

group([group1, ...])
Returns one or more subgroups of the match. If there is a single argument, the result is a single string; if there are multiple arguments, the result is a tuple with one item per argument. Without arguments, group1 defaults to zero (the whole match is returned). If a groupN argument is zero, the corresponding return value is the entire matching string; if it is in the inclusive range [1..99], it is the string matching the corresponding parenthesized group. If a group number is negative or larger than the number of groups defined in the pattern, an IndexError exception is raised. If a group is contained in a part of the pattern that did not match, the corresponding result is None. If a group is contained in a part of the pattern that matched multiple times, the last match is returned.

If the regular expression uses the (?P<name>...) syntax, the groupN arguments may also be strings identifying groups by their group name. If a string argument is not used as a group name in the pattern, an IndexError exception is raised.
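
A short sketch of the group() behaviours described above. The group names first and second are taken from the output later in this question; the input strings are illustrative only.

```python
import re

m = re.match(r'(?P<first>\d{3})(?P<second>\d{3})', '111999')

whole = m.group()             # same as m.group(0): the entire match
one = m.group(1)              # the first parenthesized group
pair = m.group(1, 2)          # several arguments give a tuple
by_name = m.group('second')   # named groups can be looked up by name

# A group that matched several times reports only its last match.
repeated = re.match(r'(\d)+', '123').group(1)

# A group number larger than the number of groups raises IndexError.
try:
    m.group(3)
    out_of_range = False
except IndexError:
    out_of_range = True

print(whole, one, pair, by_name, repeated, out_of_range)
```

Note that every call here describes one single match; none of them looks at later matches in the string.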

groups([default])
Return a tuple containing all the subgroups of the match, from 1 up to however many groups are in the pattern. The default argument is used for groups that did not participate in the match; it defaults to None. (Incompatibility note: in the original Python 1.5 release, if the tuple was one element long, a string would be returned instead. In later versions (from 1.5.1 on), a singleton tuple is returned in such cases.)
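
As a small illustration of the default argument, here is a sketch with an optional second group that does not participate in the match. The pattern and input are chosen only for demonstration.

```python
import re

# The second group is optional, so for the input "24" it does not take part.
m = re.match(r'(\d+)\.?(\d+)?', '24')

plain = m.groups()       # non-participating groups come back as None
filled = m.groups('0')   # or as the supplied default

print(plain)
print(filled)
```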

groupdict([default])
Return a dictionary containing all the named subgroups of the match, keyed by the subgroup name. The default argument is used for groups that did not participate in the match; it defaults to None.
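
A minimal sketch of groupdict(), assuming the group names first and second that appear in the output further down. Making the second group optional is an assumption added here purely to show the default argument.

```python
import re

# The second named group is optional and does not participate for "111".
m = re.match(r'(?P<first>\d{3})(?P<second>\d{3})?', '111')

named = m.groupdict()      # missing groups map to None
filled = m.groupdict('')   # or to the supplied default

print(named)
print(filled)
```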

Related code

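    import re

    s = """111999
    222888
    333777
    444666"""

    # Two named groups of three digits each: "first" and "second".
    regex = re.compile(r'(?P<first>\d{3})(?P<second>\d{3})', re.MULTILINE)

    m = regex.search(s)

    print(regex.findall(s))
    print(m.groups())        # Shouldn't this be all of them?
    print(m.group(0))        # Shouldn't this be all of them? Why only part?
    print(m.group('first'))  # Shouldn't this be all of them?
    print(m.groupdict())     # Shouldn't this be all of them?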

    
output
[('111', '999'), ('222', '888'), ('333', '777'), ('444', '666')]
('111', '999')
111999
111
{'second': '999', 'first': '111'}

It looks like the group family of methods only covers the first match?

Steps to reproduce

  1. Copy the code and run it

  2. Check the output (and compare it with the corresponding documentation)


All replies (3)
大家讲道理

PHP中文网

The search function only returns the MatchObject for the first place where the pattern matches; it does not return all of them.

迷茫

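The search method only matches once.

To match multiple times, use finditer. It returns an iterator of match objects that you can walk through with a for loop.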
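
A rough sketch of the finditer approach applied to the data from the question. The group names first and second are inferred from the posted output, and the sample string is written without leading indentation, which is an assumption about the original input.

```python
import re

s = """111999
222888
333777
444666"""

regex = re.compile(r'(?P<first>\d{3})(?P<second>\d{3})')

# search() stops at the first match, so its match object only knows "111999".
first_match = regex.search(s)

# finditer() yields one match object per match; each has its own group data.
all_groups = [m.groups() for m in regex.finditer(s)]
all_dicts = [m.groupdict() for m in regex.finditer(s)]

for m in regex.finditer(s):
    print(m.group(0), m.group('first'), m.group('second'))
```

Here all_groups holds the same tuples that findall returns, while iterating over finditer additionally gives access to group(0), groupdict(), and the other per-match methods.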
