Counting keywords in CSV files

Post by BoeingGuy » Tue Jul 16, 2013 12:59 pm

Hello,

I have multiple CSV files that were generated from a keyword-based SQL query. I now have to write code that reads in a CSV file, takes a keyword as input, and reports how many times the keyword appears in that file and where it appears. I have uploaded a sample file, and below is the code I have so far.

Code:
import csv
import collections
 
mRNA = collections.Counter()
with open('mrna.csv') as input_file:
    for row in csv.reader(input_file, delimiter=';'):
        mRNA[row[1]] = 1
 
print 'Number of times mRNA is Repeated: %s' % mRNA['mrna']
print mRNA.most_common()

Attachment: miRNA.csv (10.65 KiB)

Re: Counting keywords in CSV files

Post by tnknepp » Tue Jul 16, 2013 2:39 pm

I would expect a CSV file to use commas as the delimiter, not semicolons. Also, your file contains only a few commas, and where they do appear they are grammatical punctuation rather than field delimiters.
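
To make the delimiter point concrete, here is a minimal sketch (written for Python 3) using made-up sample text, not the attached file. It only illustrates that parsing with a delimiter that never occurs in the data yields a single field per row, so an index like `row[1]` would not exist. The `parse` helper and the sample contents are assumptions for illustration.

```python
import csv
import io

# Hypothetical sample text; the real attachment may look different.
sample = "id,keyword,note\n1,mrna,first hit\n2,dna,second hit\n"

def parse(text, delimiter):
    """Return all rows of `text` parsed with the given delimiter."""
    return list(csv.reader(io.StringIO(text), delimiter=delimiter))

rows_semicolon = parse(sample, ";")  # ';' never occurs in the sample
rows_comma = parse(sample, ",")

print(len(rows_semicolon[0]))  # 1: the whole line is a single field
print(len(rows_comma[0]))      # 3: the line splits into three fields
```

Printing the first row or two of a file this way is a quick check of whether the chosen delimiter matches the data.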

Also, I don't understand what is going on here:
Code:
for row in csv.reader(input_file, delimiter=';'):
    mRNA[row[1]] = 1

Can you clarify?
Python: 2.7 via Anaconda
Numpy: 1.7
Pandas: 0.11
OS: Windows 7
IDE: Spyder/IPython

Re: Counting keywords in CSV files

Post by BoeingGuy » Tue Jul 16, 2013 2:51 pm

@tnknepp

With the loop over the rows, I was trying to increment the keyword's count every time it appeared, but it looks like I did not do it correctly.
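
For anyone reading along, a small sketch of the difference between assigning and incrementing, using made-up keyword values rather than the real file. Plain assignment resets the entry to 1 on every occurrence, while `+= 1` accumulates (a `Counter` treats missing keys as 0, so no initialization is needed).

```python
import collections

keywords = ["mrna", "dna", "mrna", "mrna"]  # made-up values for illustration

assigned = collections.Counter()
incremented = collections.Counter()
for word in keywords:
    assigned[word] = 1      # overwrites: stays at 1 however often the word is seen
    incremented[word] += 1  # adds one per occurrence

print(assigned["mrna"])     # 1
print(incremented["mrna"])  # 3
```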
BoeingGuy
 
Posts: 3
Joined: Wed Jul 10, 2013 2:22 pm

Re: Counting keywords in CSV files

Post by ochichinyezaboombwa » Tue Jul 16, 2013 4:06 pm

BoeingGuy wrote: I was trying to increment the count of the keyword every time it appeared but it looks like I did not do it correctly.
Exactly. The whole point of Counter is that it Counts things for you; that's why it's called Counter. All you need to do is give it things to Count. For example:
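Code:
import collections
mRNA = collections.Counter()  # same counter name as in the first post
line = "Python, C++, Python, Perl, PHP, Java, Java, Java, C++, Scheme"
# Split on commas and trim the whitespace around each word
words = [w.strip() for w in line.split(",")]
mRNA.update(words)  # update() adds one count for each item it is given
print(mRNA)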
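
Applied to the original question (how often a keyword appears and where), the same idea might look roughly like the sketch below, written for Python 3. The function name `count_keyword`, the comma delimiter, the sample data, and the choice to match whole fields case-insensitively are all assumptions; the layout of the real file may call for something different.

```python
import collections
import csv
import io

def count_keyword(lines, keyword, delimiter=","):
    """Count whole-field, case-insensitive matches of `keyword`.

    Returns (count, positions), where positions is a list of
    1-based (row, column) pairs.
    """
    counts = collections.Counter()
    positions = []
    target = keyword.strip().lower()
    reader = csv.reader(lines, delimiter=delimiter)
    for row_number, row in enumerate(reader, start=1):
        fields = [field.strip().lower() for field in row]
        counts.update(fields)  # let Counter do the counting
        for column_number, field in enumerate(fields, start=1):
            if field == target:
                positions.append((row_number, column_number))
    return counts[target], positions

# Made-up sample data standing in for a real file.
sample = io.StringIO("id,keyword,note\n1,mRNA,first\n2,DNA,second\n3,mrna,mRNA\n")
total, where = count_keyword(sample, "mRNA")
print(total)  # 3
print(where)  # [(2, 2), (4, 2), (4, 3)]
```

With a real file, the same function could be given an open file object instead of the `StringIO` sample, with `delimiter` set to whatever the file actually uses. Note that this matches whole fields only; finding a keyword inside longer free text would need a substring or word-level search instead.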