search and replace large amount of info

Post by lmr1405 » Thu Nov 07, 2013 11:01 pm

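Hi all,

I need to take the following two files as input and replace each word in the first column of input1 with its corresponding tag from input2. The words in input1 carry a part-of-speech suffix ("-n" or "-j") that does not appear in input2. Some words have multiple tags, which need to be accounted for (see "ani.api").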

input1
Code: Select all
last-n  nmod+j+n    year-n 9492
last-n  nmod+j+n    night-n 8075
first-n nmod+j+n-the    time-n 7749
same-n   nmod+j+n-the    time-n 7530
other-j nmod+j+n-the    hand-n 5319
ast-j   nmod+j+n   year-n 1000
last-j   nmod+j+n   night-n 5000
first-j   nmod+j+n-the   time-n 1000
same-j   nmod+j+n-the   time-n 3000
other-j   nmod+j+n-the   hand-n 200


input2

Code: Select all
same   ani.api
first   ani
abaya   art
abbacy   log
abbe   hum
abbess   hum
abbey   art


for the following output:
Code: Select all
last-n  nmod+j+n    year-n 9492
last-n  nmod+j+n    night-n 8075
ani nmod+j+n-the    time-n 7749
ani.api   nmod+j+n-the    time-n 7530
other-j nmod+j+n-the    hand-n 5319
ast-j   nmod+j+n   year-n 1000
last-j   nmod+j+n   night-n 5000
ani   nmod+j+n-the   time-n 1000
ani.api   nmod+j+n-the   time-n 3000
other-j   nmod+j+n-the   hand-n 200
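The rule implied by this example can be sketched on in-memory data as follows. This is only a minimal sketch: the names `TAGS`, `FIRST_TOKEN` and `retag_line` are hypothetical, and it assumes every first-column token ends in a one-letter suffix such as "-n" or "-j". Only the first token is touched, so the original column spacing is kept. With the mapping as given in input2, `first` becomes `ani` and `same` becomes `ani.api`.

```python
import re

# Word -> tag lookup, copied from the input2 sample.
TAGS = {"same": "ani.api", "first": "ani", "abbey": "art"}

# Leading token of a line: a word followed by a "-n" or "-j" style suffix.
FIRST_TOKEN = re.compile(r"^(\S+)-[a-z](?=\s)")


def retag_line(line, tags):
    """Replace the leading word-pos token with its tag, if the word is known."""
    match = FIRST_TOKEN.match(line)
    if match is None or match.group(1) not in tags:
        return line  # unknown words are left untouched
    # Keep everything after the token, including the original spacing.
    return tags[match.group(1)] + line[match.end():]


sample = [
    "last-n  nmod+j+n    year-n 9492",
    "first-n nmod+j+n-the    time-n 7749",
    "same-n   nmod+j+n-the    time-n 7530",
]
result = [retag_line(line, TAGS) for line in sample]
for line in result:
    print(line)
```

Looking the word up in a dictionary keeps the work per line constant, which matters when input1 is large.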


This is the code I am trying to work with, but I need some help to solve the problem.

Code: Select all
from __future__ import print_function
import codecs

# Build the word -> tag lookup from input2 (e.g. "same" -> "ani.api").
classDict = {}
with codecs.open("final_FILLERS2CATS.map", "r", encoding="utf-8") as f:
   for line in f:
      if not line.strip():
         continue  # skip blank lines
      (classConc, classId) = line.split()
      classDict[classConc] = classId

# Rewrite input1: swap the first column for its tag when the word is known.
with codecs.open("test_awking", "r", encoding="utf-8") as f:
   for line in f:
      if not line.strip():
         continue  # skip blank lines
      (concept, link, slot, freq) = line.split()
      # "same-n" -> "same"; the part-of-speech suffix is not in the map.
      lemma = concept.rsplit("-", 1)[0]
      # A multi-tag entry such as "ani.api" is kept as one string.
      # Words without a tag keep their original form, suffix included.
      print(classDict.get(lemma, concept), link, slot, freq, sep="\t")


I think I am getting a bit confused about the organization of the arguments, and I am too far in to see it clearly. Can anyone offer some fresh insight?

Thank you
lmr1405
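
If multi-tag entries such as `ani.api` also need to be counted per tag, one option is to split a word's frequency evenly across its tags. That goal is an assumption; the expected output in the post does not require it. The names `ROWS` and `tag_weights` are hypothetical, and the sample values are taken from input1 and input2.

```python
from collections import defaultdict

# Word -> tag lookup, copied from the input2 sample.
TAGS = {"same": "ani.api", "first": "ani"}

# (word, frequency) pairs taken from the input1 sample.
ROWS = [("first", 7749), ("same", 7530), ("first", 1000), ("same", 3000)]


def tag_weights(rows, tags):
    """Sum frequencies per tag, splitting evenly across a word's tags."""
    totals = defaultdict(float)
    for word, freq in rows:
        if word not in tags:
            continue  # untagged words contribute nothing
        classes = tags[word].split(".")
        for cls in classes:
            totals[cls] += freq / len(classes)
    return dict(totals)


weights = tag_weights(ROWS, TAGS)
print(weights)
```

Here `same` contributes half of each frequency to `ani` and half to `api`, while `first` contributes its full frequency to `ani`.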
 