Line splitting based on delimiter in to multiple lists

Post by colpwd » Sun Jul 06, 2014 3:34 am

Hi All,

I appreciate your time in looking at this topic.
I have a file populated with lines like the following:

a,b,c
d,e,f
g,h,i
j,k,l

I would like to split each line on the ',' delimiter and place the three strings in three different lists. The idea is that after processing every line in the file, a, d, g and j would all be in one list, b, e, h and k in a second, and c, f, i and l in a third.

I have flexibility over how the file can be arranged, and the delimiter is not locked to the ',' character.

The split function works, but in the case above it produces one list per line.
I thought of using indexing and slicing to do this, but I am curious to see what else is out there and how others would tackle this problem.
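
For reference, the indexing approach mentioned above could be sketched roughly as follows. This is only an illustration: the `text` sample stands in for the file contents, the list names are made up, and it assumes every line has exactly three fields.

```python
# Sample data standing in for the file contents (illustrative only).
text = "a,b,c\nd,e,f\ng,h,i\nj,k,l\n"

first, second, third = [], [], []
for line in text.splitlines():
    parts = line.strip().split(',')  # e.g. ['a', 'b', 'c']
    # Assumes three fields per line; a shorter line would raise IndexError.
    first.append(parts[0])
    second.append(parts[1])
    third.append(parts[2])

print(first)
print(second)
print(third)
```

This works, but the number of lists is hard-coded, which is the part that feels clumsy.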

Thanks,

Colin.

Re: Line splitting based on delimiter in to multiple lists

Post by metulburr » Sun Jul 06, 2014 3:56 am

I'm not on a computer right now so I can't give an example, but what you're looking for is the built-in function zip(), used in a for loop.

As for splitting the lines, you can use file.readlines(), which returns a list of the file's lines (each still ending with its newline). Then loop over that list, splitting each line on your delimiter, the comma. At that point you will have lists similar to x, y and z.

Edit: something similar to this example:
>>> x = ['this', 'is', 'the', 'first', 'list']
>>> y = [1, 2, 3, 4, 5]
>>> z = [0.01, 0.2, 0.3, 0.04, 0.05]
>>> list(zip(x, y, z))
[('this', 1, 0.01), ('is', 2, 0.2), ('the', 3, 0.3), ('first', 4, 0.04), ('list', 5, 0.05)]
>>> for (a, b, c) in zip(x, y, z):
...     print(a, b, c)
...
this 1 0.01
is 2 0.2
the 3 0.3
first 4 0.04
list 5 0.05

Re: Line splitting based on delimiter in to multiple lists

Post by colpwd » Sun Jul 06, 2014 4:27 am

Thanks metulburr,

I was not aware of this built-in function.

Just to go back a step, I have been looking at the best way to get the x,y,z lists that appear in your example.
If you are looping through a file, and splitting each line based on a particular delimiter to produce those lists, how would you go about naming those lists so that you could then run the zip function on them?

I have read that dynamically named variables (variable variable names) are not good practice.

Thanks,

Colin.

Re: Line splitting based on delimiter in to multiple lists

Post by metulburr » Sun Jul 06, 2014 6:11 am

You don't need to name each list. You just need to unpack the nested list into the zip function, like so:

Assuming file.txt has the content you described:
with open('file.txt') as f:
    lines = f.readlines()
   
lines = [line.strip().split(',') for line in lines]
   
for items in zip(*lines):
    print(items)


Output:
('a', 'd', 'g', 'j')
('b', 'e', 'h', 'k')
('c', 'f', 'i', 'l')
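
If three separate lists are wanted rather than tuples, the result of zip(*lines) can be unpacked directly. The sketch below is one possible way, not the only one: it uses the standard csv module so the delimiter is easy to change (a ';' here, purely as an example), and an in-memory `sample` stands in for the file. The names `first`, `second` and `third` are illustrative.

```python
import csv
import io

# In-memory stand-in for the file (illustrative); a real file opened with
# open('file.txt', newline='') could be passed to csv.reader the same way.
sample = io.StringIO("a;b;c\nd;e;f\ng;h;i\nj;k;l\n")

# csv.reader splits each line on the given delimiter and drops the newline.
rows = list(csv.reader(sample, delimiter=';'))

# zip(*rows) groups the first fields together, then the second, and so on.
# Each group is a tuple, so convert it to a list before unpacking.
first, second, third = (list(column) for column in zip(*rows))

print(first)
print(second)
print(third)
```

Note that zip() stops at the shortest row, so a line with fewer fields would silently shorten the result, and the three-name unpacking raises ValueError if there are not exactly three columns.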

Re: Line splitting based on delimiter in to multiple lists

Post by colpwd » Sun Jul 06, 2014 9:09 am

Thank you very much. Much appreciated.