Python regex, matching pattern over multiple lines.. why isn't this working?
Solution 1
Try re.findall(r"####(.*?)\s(.*?)\s####", string, re.DOTALL) (works with re.compile too, of course).
This regexp returns a list of tuples, each containing the section number and the section content.
For your example, this will return [('1', 'ttteest'), ('2', ' \n\nttest')].
(BTW: your example won't run as written; for multiline strings, use ''' or """.)
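As a minimal sketch of the suggestion above, the snippet below runs the same pattern against a triple-quoted string. The sample text is a hypothetical stand-in modelled on the question's "####N" wrappers, not the asker's exact input, so the whitespace in the result may differ from the output quoted above.

```python
import re

# Hypothetical sample text (an assumption, not the asker's exact string).
# Triple quotes allow the literal to span several lines.
text = """
####1
ttteest
####1
ttttteeeestt
####2
ttest
####2
"""

# re.DOTALL lets "." cross newlines. The first lazy group stops at the first
# whitespace character (capturing the section number); the second stops at
# the first whitespace followed by "####" (capturing the section content).
pattern = re.compile(r"####(.*?)\s(.*?)\s####", re.DOTALL)
sections = pattern.findall(text)

for number, content in sections:
    print(number, repr(content))
```

Note that the pattern does not check that the closing marker carries the same number as the opening one; it simply pairs each "####" with the next one it finds.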
Solution 2
re.MULTILINE doesn't mean that . will match a newline; it means that ^ and $ match at the start and end of each line, rather than only at the start and end of the whole string.
re.M, re.MULTILINE
When specified, the pattern character '^' matches at the beginning of the string and at the beginning of each line (immediately following each newline); and the pattern character '$' matches at the end of the string and at the end of each line (immediately preceding each newline). By default, '^' matches only at the beginning of the string, and '$' only at the end of the string and immediately before the newline (if any) at the end of the string.
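To illustrate the behaviour described in the quoted documentation, here is a small sketch. The sample text and variable names are made up for the illustration.

```python
import re

# Hypothetical three-line sample string.
text = "first line\nsecond line\nthird line"

# Default: "^" matches only at the very start of the string.
default_starts = re.findall(r"^\w+", text)

# re.MULTILINE: "^" also matches immediately after each newline.
multiline_starts = re.findall(r"^\w+", text, re.MULTILINE)

# re.MULTILINE: "$" also matches immediately before each newline.
multiline_ends = re.findall(r"\w+$", text, re.MULTILINE)

print(default_starts)    # ['first']
print(multiline_starts)  # ['first', 'second', 'third']
print(multiline_ends)    # ['line', 'line', 'line']
```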
re.S (or re.DOTALL) makes . match newlines as well.
Source
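The difference between the two flags can be seen by running the pattern from the question under each one. This is only a sketch: the input is a shortened, hypothetical version of the asker's text.

```python
import re

# Shortened, hypothetical version of the question's input.
text = "####1\nttteest\n####1"
pattern = r".*?####(.*?)####"

# re.MULTILINE only changes "^" and "$". "." still stops at "\n", so the
# group cannot reach from one "####" to the next one on a later line.
with_multiline = re.findall(pattern, text, re.MULTILINE)

# re.DOTALL lets "." match "\n", so the group can span lines.
with_dotall = re.findall(pattern, text, re.DOTALL)

print(with_multiline)  # []
print(with_dotall)     # ['1\nttteest\n']
```

The two flags are independent and can be combined with re.MULTILINE | re.DOTALL if both behaviours are needed.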
Rick
Updated on August 21, 2020
Rick over 3 years
I know that for parsing I should ideally remove all spaces and line breaks, but I was just doing this as a quick fix for something I was trying, and I can't figure out why it's not working. I have wrapped different areas of text in my document with wrappers like "####1" and am trying to parse based on them, but it's just not working no matter what I try. I think I am using multiline correctly. Any advice is appreciated.
This returns no results at all:
string='
####1
ttteest
####1
ttttteeeestt
####2
ttest
####2'
import re
pattern = '.*?####(.*?)####'
returnmatch = re.compile(pattern, re.MULTILINE).findall(string)
return returnmatch