regexp & named groups

Post by robinm » Wed Jun 04, 2014 8:42 pm

I'm getting file names in various formats, say tran_20140401_01.dat, and I want to match them against a template string like "tran_{DATE}_{SEQ}.dat" and also grab the date and sequence parts. I had a go at the following, which I thought might work, but re.search() returned None, so m.groupdict() failed. Is there a better way of doing this?

Code:
import re

my_pat = "tran_{DATE}_{SEQ}.dat"
re_pat = re.sub( "{", "(?P<", my_pat )
re_pat = re.sub( "}",">)", re_pat )
print( re_pat )

filename = "tran_20140501_01.dat"
m = re.search( re_pat, filename )
print( m.groupdict() )

Output:
tran_(?P<DATE>)_(?P<SEQ>).dat
Traceback (most recent call last):
  File "E:\Python34\MyTest\reader.py", line 10, in <module>
    print( m.groupdict() )
AttributeError: 'NoneType' object has no attribute 'groupdict'
>>>

I'm a bit confused. Any suggestions?

Re: regexp & named groups

Post by micseydel » Wed Jun 04, 2014 9:09 pm

How does this look?
Code:
>>> import re
>>> pat = re.compile(r'tran_(\d+)_(\d+)\.dat')
>>> pat.findall('tran_20140401_01.dat')
[('20140401', '01')]

You can also use the pattern's search() method if findall() isn't what you want:
Code:
>>> s = pat.search('tran_20140401_01.dat')
>>> s.group(0)
'tran_20140401_01.dat'
>>> s.group(1)
'20140401'
>>> s.group(2)
'01'
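
Keep in mind that search() returns None when nothing matches, which is what caused the AttributeError in the first post. A minimal sketch of guarding against that; the helper name parse_name is made up for illustration, and fullmatch() (available from Python 3.4) is used on the assumption that the whole file name should match:

```python
import re

# Raw string so the backslashes reach the regex engine untouched.
PAT = re.compile(r'tran_(\d+)_(\d+)\.dat')

def parse_name(filename):
    """Return (date, seq) for a matching file name, or None otherwise."""
    m = PAT.fullmatch(filename)  # the entire name must match the pattern
    if m is None:
        return None
    return m.group(1), m.group(2)

print(parse_name('tran_20140401_01.dat'))  # ('20140401', '01')
print(parse_name('other_file.txt'))        # None
```

Checking for None before calling group() or groupdict() keeps a non-matching name from raising an exception.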

Re: regexp & named groups

Post by stranac » Thu Jun 05, 2014 12:03 pm

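(?P<DATE>) and (?P<SEQ>) are empty groups, so each one can only match an empty string. Your pattern therefore only matches names like tran__.dat, which is why re.search() returned None for your file name:
Code: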
>>> re.search('tran_(?P<DATE>)_(?P<SEQ>).dat', 'tran__.dat').groupdict()
{'DATE': '', 'SEQ': ''}

You probably wanted r'tran_(?P<DATE>\d+)_(?P<SEQ>\d+)\.dat' (a raw string, with the dot escaped so it only matches a literal dot):
Code:
>>> re.search(r'tran_(?P<DATE>\d+)_(?P<SEQ>\d+)\.dat', 'tran_20140501_01.dat').groupdict()
{'DATE': '20140501', 'SEQ': '01'}
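
If you want to keep writing templates like "tran_{DATE}_{SEQ}.dat", here is one possible sketch of the conversion. The function name template_to_regex and the field_pattern default of \d+ are assumptions for illustration (the original post doesn't say what each field may contain), and placeholder names must be valid Python identifiers to work as group names:

```python
import re

def template_to_regex(template, field_pattern=r'\d+'):
    """Turn 'tran_{DATE}_{SEQ}.dat' into a compiled regex with named groups."""
    # Split on {NAME}; the capturing group keeps the names in the result,
    # so parts alternates: literal, name, literal, name, ..., literal.
    parts = re.split(r'\{(\w+)\}', template)
    pieces = []
    for i, part in enumerate(parts):
        if i % 2 == 0:
            # Literal text: escape it so '.' only matches a real dot.
            pieces.append(re.escape(part))
        else:
            # Placeholder name: becomes a named group with a real body.
            pieces.append('(?P<{}>{})'.format(part, field_pattern))
    return re.compile(''.join(pieces))

pat = template_to_regex('tran_{DATE}_{SEQ}.dat')
m = pat.fullmatch('tran_20140501_01.dat')
print(m.groupdict() if m else None)  # {'DATE': '20140501', 'SEQ': '01'}
```

Unlike the two re.sub() calls in the first post, this gives every group something to match and escapes the literal parts of the template.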