regexp & named groups

Postby robinm » Wed Jun 04, 2014 8:42 pm

I'm getting file names in various formats, for example tran_20140401_01.dat, and I want to match them against a template string like "tran_{DATE}_{SEQ}.dat" while also capturing the date and sequence parts. I tried the following, which I thought might work, but m ends up being None. Is there a better way of doing this?

Code:
import re

my_pat = "tran_{DATE}_{SEQ}.dat"
re_pat = re.sub( "{", "(?P<", my_pat )
re_pat = re.sub( "}",">)", re_pat )
print( re_pat )

filename = "tran_20140501_01.dat"
m = re.match( re_pat, filename )
print( m.groupdict() )

Code:
Traceback (most recent call last):
  File "E:\Python34\MyTest\", line 10, in <module>
    print( m.groupdict() )
AttributeError: 'NoneType' object has no attribute 'groupdict'

I'm a bit confused. Any suggestions?
Last edited by stranac on Thu Jun 05, 2014 11:59 am, edited 1 time in total.
Reason: Replaced annoying color with code tags

Re: regexp & named groups

Postby micseydel » Wed Jun 04, 2014 9:09 pm

How does this look?
Code:
>>> import re
>>> pat = re.compile(r'tran_(\d+)_(\d+)\.dat')
>>> pat.findall('tran_20140401_01.dat')
[('20140401', '01')]

You can also simply use pat.match() if re.findall() isn't what you want
Code:
>>> s = pat.match('tran_20140401_01.dat')
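
For what it's worth, here is a minimal sketch of the match-based approach. It assumes the same pattern as above; the helper name split_name is made up for illustration. The point is that a match call returns None when the file name does not fit, so it is worth checking before calling groups().

```python
import re

# Same pattern as above, written as a raw string so the backslashes reach re intact.
pat = re.compile(r'tran_(\d+)_(\d+)\.dat')

def split_name(filename):
    """Return (date, seq) for a matching file name, or None if it does not match.

    split_name is a hypothetical helper, not part of the re module.
    """
    m = pat.match(filename)
    if m is None:
        # No match: avoid the AttributeError on NoneType.
        return None
    return m.groups()

print(split_name('tran_20140401_01.dat'))  # ('20140401', '01')
print(split_name('other_file.txt'))        # None
```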

Re: regexp & named groups

Postby stranac » Thu Jun 05, 2014 12:03 pm

(?P<DATE>) and (?P<SEQ>) both match an empty string.
Code:
>>> re.match(r'tran_(?P<DATE>)_(?P<SEQ>).dat', 'tran__.dat').groupdict()
{'DATE': '', 'SEQ': ''}

You probably wanted 'tran_(?P<DATE>\d+)_(?P<SEQ>\d+).dat':
Code:
>>> re.match(r'tran_(?P<DATE>\d+)_(?P<SEQ>\d+).dat', 'tran_20140501_01.dat').groupdict()
{'DATE': '20140501', 'SEQ': '01'}
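
If you want to keep the "tran_{DATE}_{SEQ}.dat" template idea from the first post, something along these lines might work. This is only a sketch: template_to_regex is a made-up helper, and it assumes every placeholder stands for one or more digits (the group_pattern default). It also escapes the literal parts, so the dot in ".dat" only matches a real dot.

```python
import re

def template_to_regex(template, group_pattern=r'\d+'):
    """Turn a template like 'tran_{DATE}_{SEQ}.dat' into a compiled regex.

    Hypothetical helper: each {NAME} becomes a named group (?P<NAME>...),
    and group_pattern (assumed to be digits here) is what each group matches.
    """
    # Splitting on a pattern with one capturing group gives alternating
    # pieces: literal, name, literal, name, ..., literal.
    parts = re.split(r'\{([A-Za-z_]\w*)\}', template)
    pieces = []
    for i, part in enumerate(parts):
        if i % 2 == 0:
            # Literal text: escape it so characters like '.' are taken literally.
            pieces.append(re.escape(part))
        else:
            # Placeholder name: build a named group with real content to match.
            pieces.append('(?P<%s>%s)' % (part, group_pattern))
    return re.compile(''.join(pieces))

rx = template_to_regex('tran_{DATE}_{SEQ}.dat')
m = rx.fullmatch('tran_20140501_01.dat')
if m is not None:
    print(m.groupdict())
```

fullmatch() is used so that trailing junk after ".dat" is rejected; if the placeholders are not all digits, pass a different group_pattern.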