STEMMING FOR PYTHON


Postby el_manu » Wed Aug 21, 2013 7:51 am

Stemming

If you search for something in Google and use a word like "running", Google is smart enough to match "run" or "runs" as well. That's because search engines do what's called stemming before matching words.

In English, stemming involves removing common endings from words to produce a base word. It's hard to come up with a complete set of rules that work for all words, but this simplified set does a pretty good job:

If the word starts with a capital letter, output it without changes.
If the word ends in 's', 'ed', or 'ing', remove those letters. If the resulting stemmed word is only 1 or 2 letters long (e.g. chopping the 'ing' from 'sing'), use the original word instead.

Your program should read one word of input and print out the corresponding stemmed word. For example:

Enter the word: states
state


Another example interaction with your program is:
Enter the word: rowed
row




Remember that capitalised words should not be stemmed:

Enter the word: James
James



and nor should words that become too short after stemming:

Enter the word: sing
sing



Google actually does quite sophisticated stemming. They give an example on their search help page.

You should only implement the rules we've listed above, even though they get some words wrong (for example, 'buses' becomes 'buse'). Stemmers make these kinds of mistakes all the time!

I am sorry, but I really need to be pointed in the right direction. Do I need to slice up the string?
Last edited by micseydel on Wed Aug 21, 2013 8:57 am, edited 1 time in total.
Reason: Locked.
el_manu
 
Posts: 87
Joined: Mon Aug 19, 2013 8:30 am

Re: STEMMING FOR PYTHON

Postby micseydel » Wed Aug 21, 2013 8:57 am

What attempt have you made? Where is this assignment from?
Join the #python-forum IRC channel on irc.freenode.net!
micseydel
 
Posts: 1131
Joined: Tue Feb 12, 2013 2:18 am
Location: Mountain View, CA

Re: STEMMING FOR PYTHON

Postby el_manu » Wed Aug 21, 2013 12:27 pm

Code: Select all
a = input("Enter the word: ")
b = int(len(a))
c = (b - 3)
d = (b - 2)
e = (b - 1)
if a.endswith('ing'):
   if c <= 2:
      print(a)
   else:
      print(a[0:c])
if a.endswith('ed'):
   if d <= 2:
      print(a)
   else:
      print(a[0:d])
if a.endswith('s'):
   if e <= 2:
      print(a)
   else:
      print(a[0:e])



This is my code, but how can I stop the capitalized words from being cut off?
Last edited by Mekire on Fri Aug 23, 2013 2:12 am, edited 1 time in total.
Reason: Locked
el_manu
 
Posts: 87
Joined: Mon Aug 19, 2013 8:30 am

Re: STEMMING FOR PYTHON

Postby hansn » Wed Aug 21, 2013 1:28 pm

There's actually a very convenient built-in string method for checking whether a string is titlecased or not:

Code: Select all
istitle(...)
    S.istitle() -> bool
   
    Return True if S is a titlecased string and there is at least one
    character in S, i.e. uppercase characters may only follow uncased
    characters and lowercase characters only cased ones. Return False
    otherwise.

Code: Select all
>>> "All Words Capitalized".istitle()
True
>>> "Not all Words Capitalized".istitle()
False
>>> "Word".istitle()
True
>>> "word".istitle()
False
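
One caveat, shown as a small sketch: istitle() looks at the whole string, so it is not quite the same test as "starts with a capital letter". Checking only the first character with isupper() may be closer to the rule in the assignment. The sample words below are made up for illustration.

```python
# Compare the two checks on a few illustrative words.
# word[:1] is used instead of word[0] so an empty string does not raise IndexError.
for word in ["James", "McDonald", "NASA", "word"]:
    print(word, word.istitle(), word[:1].isupper())

# James True True
# McDonald False True
# NASA False True
# word False False
```

Which check is wanted depends on how the assignment treats words like "McDonald" or "NASA"; the rules as quoted only mention the first letter.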
hansn
 
Posts: 87
Joined: Thu Feb 21, 2013 8:46 pm

Re: STEMMING FOR PYTHON

Postby micseydel » Wed Aug 21, 2013 5:34 pm

Tips:
Variable names matter.
Unnecessary parentheses are unnecessary (aka "noise").
len() already returns an int; there is no need to turn the return value into an int.
Perhaps a-e having semantic names would change my mind about this, but at this point I'd forgo the variable entirely and just subtract by a different value in the code where you use those variables, since you're doing math anyway and you're not really saving any work.
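
As a rough sketch of where these tips could lead, here is one way the rules from the first post might be written with descriptive names and without the helper variables. The names stem, ENDINGS and MIN_STEM_LENGTH are my own, not from the assignment, and the first-character check is an assumption about what "starts with a capital letter" means.

```python
ENDINGS = ("ing", "ed", "s")   # endings the simplified rules strip
MIN_STEM_LENGTH = 3            # stems of 1 or 2 letters are rejected


def stem(word):
    """Apply the simplified stemming rules to a single word."""
    # Rule 1 (assumed reading): a capitalised first letter means no stemming.
    if word[:1].isupper():
        return word
    # Rule 2: strip a known ending unless the result would be too short.
    for ending in ENDINGS:
        if word.endswith(ending):
            candidate = word[:-len(ending)]
            if len(candidate) >= MIN_STEM_LENGTH:
                return candidate
            return word
    # No ending matched, so the word is returned unchanged.
    return word


for example in ["states", "rowed", "James", "sing", "buses"]:
    print(example, "->", stem(example))
```

To match the assignment's interaction, the last loop could be swapped for print(stem(input("Enter the word: "))). Returning a value instead of printing inside each branch also makes the function easy to test.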
micseydel
 
Posts: 1131
Joined: Tue Feb 12, 2013 2:18 am
Location: Mountain View, CA

Re: STEMMING FOR PYTHON

Postby Somelauw » Wed Aug 21, 2013 6:44 pm

Program doesn't work.

Code: Select all
% python3 stemming.py
Enter the word: mess
mes
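
To see how the posted attempt behaves on more inputs, its logic can be wrapped in a function that returns what it would print. This is only a sketch: attempt_output is a hypothetical name, and the function is a condensed restatement of the code posted above, not the original script.

```python
def attempt_output(word):
    """Return what the posted attempt would print, or None if it prints nothing."""
    # The three endings cannot overlap, so at most one branch ever fires.
    for ending in ("ing", "ed", "s"):
        if word.endswith(ending):
            stem_length = len(word) - len(ending)
            # Same test as the attempt: keep the word if the stem is 2 letters or fewer.
            return word if stem_length <= 2 else word[:stem_length]
    return None  # the attempt has no branch for words without a listed ending


for word in ["states", "rowed", "sing", "mess", "James", "run"]:
    print(word, "->", attempt_output(word))

# states -> state
# rowed -> row
# sing -> sing
# mess -> mes
# James -> Jame
# run -> None
```

Read literally, the simplified rules also turn "mess" into "mes". The gaps that stand out in this run are capitalised words ("James" loses its "s") and words with none of the endings ("run" produces no output at all).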
Somelauw
 
Posts: 68
Joined: Tue Feb 12, 2013 8:30 pm

Re: STEMMING FOR PYTHON

Postby RonJeremy96 » Thu Aug 22, 2013 10:35 am

You really shouldn't cheat on the Grok Learning Challenge!
Last edited by stranac on Thu Aug 22, 2013 1:01 pm, edited 1 time in total.
Reason: Removed the annoying formatting
RonJeremy96
 
Posts: 2
Joined: Thu Aug 22, 2013 10:33 am

Re: STEMMING FOR PYTHON

Postby stranac » Thu Aug 22, 2013 1:06 pm

RonJeremy96 wrote: You really shouldn't cheat on the Grok Learning Challenge!

Why do you consider his post cheating, but yours not?
Friendship is magic!

R.I.P. Tracy M. You will be missed.
stranac
 
Posts: 1093
Joined: Thu Feb 07, 2013 3:42 pm

Re: STEMMING FOR PYTHON

Postby el_manu » Fri Aug 23, 2013 12:43 am

I WANNA GET RID OF THIS U GUYS WERE NO HELP I DID IT ON MY OWN
Last edited by Yoriz on Fri Aug 23, 2013 1:05 am, edited 1 time in total.
Reason: Locked
el_manu
 
Posts: 87
Joined: Mon Aug 19, 2013 8:30 am

Re: STEMMING FOR PYTHON

Postby Mekire » Fri Aug 23, 2013 2:07 am

el_manu wrote: I WANNA GET RID OF THIS U GUYS WERE NO HELP I DID IT ON MY OWN

Whether you did it on your own is not in dispute. At least from this thread, it does seem like you came to your own conclusions, so you should have nothing to worry about. We gladly help people out with homework here (generally by pointing them in the right direction) if they show an effort to solve their problems on their own. We don't, however, allow people to post their homework, get their answers, and then delete their thread (this is actually a common problem and extremely frustrating for those who put in time to help). If you choose to seek assistance for homework on a public forum, it is your obligation to do so ethically.

-Mek
Mekire
 
Posts: 984
Joined: Thu Feb 07, 2013 11:33 pm
Location: Amakusa, Japan

