regexp & named groups

Postby robinm » Wed Jun 04, 2014 8:42 pm

I'm getting file names in various formats, say tran_20140401_01.dat, and I want to match them against a template string like "tran_{DATE}_{SEQ}.dat" and also grab the date and sequence parts. I had a go at the following, which I thought might work, but m ended up being None. Is there a better way of doing this?

Code:
import re

my_pat = "tran_{DATE}_{SEQ}.dat"
re_pat = re.sub( "{", "(?P<", my_pat )
re_pat = re.sub( "}",">)", re_pat )
print( re_pat )

filename = "tran_20140501_01.dat"
m = re.search( re_pat, filename )
print( m.groupdict() )

Output:
tran_(?P<DATE>)_(?P<SEQ>).dat
Traceback (most recent call last):
  File "E:\Python34\MyTest\reader.py", line 10, in <module>
    print( m.groupdict() )
AttributeError: 'NoneType' object has no attribute 'groupdict'
>>>

I'm a bit confused. Any suggestions?
Last edited by stranac on Thu Jun 05, 2014 11:59 am, edited 1 time in total.
Reason: Replaced annoying color with code tags
robinm
 
Posts: 5
Joined: Tue Jun 03, 2014 8:27 pm

Re: regexp & named groups

Postby micseydel » Wed Jun 04, 2014 9:09 pm

How does this look?
Code:
>>> import re
>>> pat = re.compile(r'tran_(\d+)_(\d+)\.dat')
>>> pat.findall('tran_20140401_01.dat')
[('20140401', '01')]

You can also simply use re.search() if re.findall() isn't what you want:
Code:
>>> s = pat.search('tran_20140401_01.dat')
>>> s.group(0)
'tran_20140401_01.dat'
>>> s.group(1)
'20140401'
>>> s.group(2)
'01'
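
Worth noting that search() returns None when nothing matches, which is where the AttributeError in the first post came from. A minimal sketch of guarding against that, assuming a helper called split_name (the name is made up for illustration):

```python
import re

# Same pattern as above, written as a raw string so \d and \. are
# passed to the regex engine unchanged.
pat = re.compile(r'tran_(\d+)_(\d+)\.dat')

def split_name(filename):
    """Hypothetical helper: return (date, seq) or None if there is no match."""
    m = pat.search(filename)
    if m is None:
        # No match: avoid calling .group() on None.
        return None
    return m.group(1), m.group(2)

print(split_name('tran_20140401_01.dat'))
print(split_name('something_else.txt'))
```

Checking for None before touching the match object is the usual pattern; whether you return None or raise an error for a bad name is up to you.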
micseydel
 
Posts: 1358
Joined: Tue Feb 12, 2013 2:18 am
Location: Mountain View, CA

Re: regexp & named groups

Postby stranac » Thu Jun 05, 2014 12:03 pm

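(?P<DATE>) and (?P<SEQ>) both match an empty string, so your pattern only matches a name with nothing between the underscores. That's why re.search() returned None for your filename.
Code: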
>>> re.search('tran_(?P<DATE>)_(?P<SEQ>).dat', 'tran__.dat').groupdict()
{'DATE': '', 'SEQ': ''}

You probably wanted r'tran_(?P<DATE>\d+)_(?P<SEQ>\d+)\.dat' (a raw string, with the dot escaped so it only matches a literal dot):
Code:
>>> re.search(r'tran_(?P<DATE>\d+)_(?P<SEQ>\d+)\.dat', 'tran_20140501_01.dat').groupdict()
{'DATE': '20140501', 'SEQ': '01'}
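
If you want to keep the "tran_{DATE}_{SEQ}.dat" style template from the first post, something along these lines might work. It's only a sketch: template_to_regex is a made-up name, and it assumes every {NAME} placeholder stands for a run of digits, which may not hold for all your formats.

```python
import re

def template_to_regex(template):
    """Hypothetical helper: turn 'tran_{DATE}_{SEQ}.dat' into a compiled regex.

    Assumption: every {NAME} placeholder stands for one or more digits.
    """
    # Splitting on a pattern with one capture group yields alternating
    # literal text (even indices) and placeholder names (odd indices).
    parts = re.split(r'\{(\w+)\}', template)
    pieces = []
    for i, part in enumerate(parts):
        if i % 2 == 0:
            # Literal text: escape it so '.' matches only a real dot.
            pieces.append(re.escape(part))
        else:
            # Placeholder: a named group that actually matches something.
            pieces.append(r'(?P<%s>\d+)' % part)
    return re.compile(''.join(pieces))

pattern = template_to_regex('tran_{DATE}_{SEQ}.dat')
m = pattern.fullmatch('tran_20140501_01.dat')
if m is not None:
    print(m.groupdict())
```

fullmatch() (available from Python 3.4) is used so that names with extra text before or after the template are rejected. If some placeholders are not numeric, you would need a different sub-pattern for each name.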
stranac
 
Posts: 1144
Joined: Thu Feb 07, 2013 3:42 pm

