Hash md5 comparison


Post by greg2 » Mon Apr 14, 2014 9:20 am

My professor gave me an assignment to modify a Python program, but I'm not very good at programming and I'm looking for help. The program currently compares the MD5 hashes of different files, and if it finds two that are the same, it moves the duplicate to another folder. I must modify the program so that, instead of moving the duplicate to another folder, it deletes it.
Code:
'''
Scan, find and move to output dir the duplicated files
'''
import os

# md5_for_file, dirName and outputdir are not shown in this excerpt.
hashes = set()  # MD5 digests of the files seen so far
for dirname, dirnames, filenames in os.walk(dirName):
    for filename in filenames:
        hash_digest = md5_for_file(os.path.join(dirname, filename))
        if hash_digest in hashes:
            # Same digest as an earlier file: move this copy to the output dir.
            os.rename(os.path.join(dirname, filename), os.path.join(outputdir, filename))
            #os.remove(os.path.join(dirname, filename))
        else:
            # First file with this digest: remember it and leave the file alone.
            hashes.add(hash_digest)


I think I should replace the os.rename(...) line, the one that ends with os.path.join(outputdir, filename)), with os.remove(os.path.join(dirname, filename)), but it seems too easy and I'm not very sure about it. Any help would be appreciated. Maybe it's a stupid question, but I don't really know much about Python. Thanks.

PS: Sorry if my English is not perfect, I'm not a native speaker.
Last edited by stranac on Mon Apr 14, 2014 9:27 am, edited 1 time in total.
Reason: First post lock.
greg2
 
Posts: 1
Joined: Mon Apr 14, 2014 9:06 am

Re: Hash md5 comparison

Post by stranac » Mon Apr 14, 2014 9:28 am

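Yes, that should work. Replace the whole os.rename(...) call with os.remove(os.path.join(dirname, filename)); the first file found with a given hash is kept, and any later file with the same hash is deleted.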
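For reference, here is a minimal sketch of the change, written so that it runs on its own. The original post does not show md5_for_file, so the version below is only an assumption about what it might look like (it reads the file in chunks with hashlib). The function name remove_duplicates and its dir_name parameter are also made up for this example.

```python
import hashlib
import os


def md5_for_file(path, block_size=65536):
    """Return the hex MD5 digest of a file, reading it in chunks.

    Assumed implementation: the original helper is not shown in the thread.
    """
    md5 = hashlib.md5()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps calling f.read until it returns b"".
        for block in iter(lambda: f.read(block_size), b""):
            md5.update(block)
    return md5.hexdigest()


def remove_duplicates(dir_name):
    """Delete files whose content was already seen; return the deleted paths."""
    hashes = set()
    removed = []
    for dirname, dirnames, filenames in os.walk(dir_name):
        for filename in filenames:
            path = os.path.join(dirname, filename)
            hash_digest = md5_for_file(path)
            if hash_digest in hashes:
                # Same digest as an earlier file: delete instead of moving.
                os.remove(path)
                removed.append(path)
            else:
                hashes.add(hash_digest)
    return removed
```

Keep in mind that os.remove deletes permanently, so it is probably worth trying this on a copy of the folder first. Which of two identical files survives depends on the order in which os.walk visits them.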
stranac
 
Posts: 1144
Joined: Thu Feb 07, 2013 3:42 pm

