How to speed this up?

This is the place for queries that don't fit in any of the other categories.

Re: How to speed this up?

Postby PyMD5 » Tue Aug 23, 2016 4:18 am

wavic wrote:I don't know if this will go faster, but you can try. I was looking at numpy also, but I've got a headache.
I looked at itertools for the first time. Yummy :twisted:
Code: Select all
import itertools

for ip in itertools.product(range(3), repeat=2): print(ip) # it will be range(256), repeat=4
(0, 0)
(0, 1)
(0, 2)
(1, 0)
(1, 1)
(1, 2)
(2, 0)
(2, 1)
(2, 2)


In NumPy there is something called broadcasting, but I didn't get it. It looks like it could do the job (generating tuples of all possible octets).
Maybe there is a better solution here. I think broadcasting is not what's needed.
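For reference, a minimal sketch (assuming NumPy; not code from the thread) of generating all octet tuples without broadcasting: np.indices builds the grid, and reshape/transpose turns it into rows. The full IPv4 space would be 256**4 rows (~4.3 billion), far too large to materialize, so a toy shape is used here.
Code: Select all
import numpy as np

octets = np.indices((3, 3)).reshape(2, -1).T # stand-in for (256,) * 4
print(octets) # rows: (0, 0), (0, 1), ..., (2, 2)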


Nope, itertools is actually slower than a for loop. I used the following code to loop through what amounts to the entire IPv4 address space on a single CPU core, without doing anything else in the code:
Code: Select all
from datetime import datetime
import itertools
import time

startold = datetime.now()
for a in range(256):
  for b in range(256):
    for c in range(256):
      for d in range(256):
        pass
endold = datetime.now()

time.sleep(5)

startnew = datetime.now()
for ip in itertools.product(range(256), repeat=4):
  pass
endnew = datetime.now()

print('{}:\t{}\n{}:\t{}'.format("OLD",str(endold - startold),"NEW",str(endnew - startnew)))


for loop: 0:02:40.302430
itertools: 0:05:23.216252
PyMD5
 
Posts: 35
Joined: Thu Jun 30, 2016 5:19 am

Re: How to speed this up?

Postby PyMD5 » Tue Aug 23, 2016 4:28 am

I also went through the hashlib.py file and copied it, then stripped out any code that didn't relate to MD5. I also stripped out any error handling, so if you use it, you'd better be sure you've got the code working with the original hashlib.py file first.

Here's the difference in how it's called (a here is the integer offset being added to the base address):
import fastmd5,ipaddress
ip6 = ipaddress.IPv6Address('0000:0000:0000:0000:0000:0000:0000:0000')
hashlibmd5 = fastmd5.md5
h = hashlibmd5(bytes(str(ip6 + a),'utf-8')).hexdigest()

import hashlib,ipaddress
ip6 = ipaddress.IPv6Address('0000:0000:0000:0000:0000:0000:0000:0000')
hashlibmd5 = hashlib.md5
h = hashlibmd5(bytes(str(ip6 + a),'utf-8')).hexdigest()

It's not a big difference in speed, but when you're slogging through a few trillion MD5 hashes, every bit counts.
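A quick way to see the gap for yourself (a sketch using timeit; exact numbers will vary by machine and OpenSSL build):
Code: Select all
from timeit import timeit

print(timeit("md5(b'255.255.255.255').hexdigest()",
             setup='from hashlib import md5', number=1000000))
print(timeit("md5(b'255.255.255.255').hexdigest()",
             setup='from _hashlib import openssl_md5 as md5', number=1000000))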

Here's all that's needed to generate MD5 hashes, saved as fastmd5.py
Code: Select all
import _hashlib
globals()['md5'] = getattr(_hashlib, 'openssl_md5')
Last edited by PyMD5 on Sun Aug 28, 2016 5:18 am, edited 1 time in total.
PyMD5
 
Posts: 35
Joined: Thu Jun 30, 2016 5:19 am

Re: How to speed this up?

Postby PyMD5 » Fri Aug 26, 2016 12:00 am

The fastest version yet. This is for IPv4, run from a CMD prompt like this "python.exe md5crack.py". It removes any MD5 hashes that are found so you don't have to waste CPU cycles checking for them, immediately prints to screen any MD5 hash:IP address matches found, and if the entire list of MD5 hashes has been found, it exits all threads.

Checking the global hashinput is really slow, so the code takes the global hashinput and chucks it into a local hashin for each process, which is much faster. Then, when an MD5:IP address match is found, it removes that MD5 hash from hashinput, and updates hashin. Thus, each local list has any found MD5 hashes culled, and it still processes at maximum speed.

Code: Select all
import multiprocessing
from datetime import datetime

def worker(hashinput,range_start,range_end):
  import fastmd5,sys # Local import is faster
  sys.setswitchinterval(.05) # Raise the switch interval from the 0.005 default to reduce context switching
  hashMD5 = fastmd5.md5
  aarange = range(range_start,range_end) # individual iterables for each loop is faster
  bbrange = range(0,256)
  ccrange = range(0,256)
  ddrange = range(0,256)

  hashin = list(hashinput) # local copy is much faster to scan than the managed proxy

  for aa in aarange:
    if hashin != list(hashinput): # snapshot the proxy first; comparing a plain list to the proxy directly always reports unequal
      hashin = list(hashinput)
    if not hashin:
      sys.exit(0)
    print('{!s}\rProgress: {:.2%} Hashes: {!s}'.format((" " * 75),(aa-range_start)/(range_end-range_start),len(hashin)), end='\r')
    for bb in bbrange:
      for cc in ccrange:
        for dd in ddrange:
          h = hashMD5(bytes('%s.%s.%s.%s' % (aa,bb,cc,dd),'utf-8')).hexdigest()
          for entry in hashin:
            if h == entry:
              try:
                hashinput.remove(entry) # another worker may have removed it already
              except ValueError:
                pass
              print('Match Found: %s.%s.%s.%s\t%s' % (aa,bb,cc,dd,entry))

if __name__ == '__main__':
  mgr = multiprocessing.Manager()

  hashlist = [
    '11111111111111111111111111111111', # TEST HASH - replace with your own hashes
    '22222222222222222222222222222222', # TEST HASH - replace with your own hashes
    '33333333333333333333333333333333', # TEST HASH - replace with your own hashes
  ]

  hashinput = mgr.list()
  for item in hashlist:
    hashinput.append(item)

  jobs=[
    multiprocessing.Process(target=worker, args=(hashinput, 0, 16)),
    multiprocessing.Process(target=worker, args=(hashinput, 16, 32)),
    multiprocessing.Process(target=worker, args=(hashinput, 32, 48)),
    multiprocessing.Process(target=worker, args=(hashinput, 48, 64)),
    multiprocessing.Process(target=worker, args=(hashinput, 64, 80)),
    multiprocessing.Process(target=worker, args=(hashinput, 80, 96)),
    multiprocessing.Process(target=worker, args=(hashinput, 96, 112)),
    multiprocessing.Process(target=worker, args=(hashinput, 112, 128)),
    multiprocessing.Process(target=worker, args=(hashinput, 128, 144)),
    multiprocessing.Process(target=worker, args=(hashinput, 144, 160)),
    multiprocessing.Process(target=worker, args=(hashinput, 160, 176)),
    multiprocessing.Process(target=worker, args=(hashinput, 176, 192)),
    multiprocessing.Process(target=worker, args=(hashinput, 192, 208)),
    multiprocessing.Process(target=worker, args=(hashinput, 208, 224)),
    multiprocessing.Process(target=worker, args=(hashinput, 224, 240)),
    multiprocessing.Process(target=worker, args=(hashinput, 240, 256)),
  ]

  print('{!s} {!s} {!s} {!s}\n\n{!s} {!s} {!s}'.format("Starting",len(jobs),"processes",datetime.now(),"Checking for",len(hashinput),"hashes:"))
  for entry in hashinput:
    print(entry)
  print('\nResults:')

  for j in jobs:
    j.daemon = False
    j.start()
  for j in jobs:
    j.join()

  print('\n{!s} {!s}'.format("SEARCH COMPLETE",datetime.now()))


If you're using the hashlib file instead of fastmd5.py, you'd change the following lines:

FROM: import fastmd5,sys # Local import is faster
TO: import hashlib,sys # Local import is faster

FROM: hashMD5 = fastmd5.md5
TO: hashMD5 = hashlib.md5
Last edited by PyMD5 on Tue Sep 06, 2016 5:04 am, edited 7 times in total.
PyMD5
 
Posts: 35
Joined: Thu Jun 30, 2016 5:19 am

Re: How to speed this up?

Postby PyMD5 » Sat Aug 27, 2016 10:05 pm

And the IPv6 version of the code... this runs slower because it has to use ipaddress.IPv6Address to construct the IP address (whereas the IPv4 version can just concatenate 4 sets of numbers punctuated by periods), since IPv6 is more complicated (it can be long, or short, or have an IPv4 mapping, etc.). But ipaddress.IPv6Address is dog slow. If anyone knows of a faster means of constructing valid IPv6 addresses, let me know.
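One possible faster approach (a sketch, untested against this workload): build the uncompressed IPv6 string directly from an integer and skip the ipaddress module entirely. Be aware that str(ipaddress.IPv6Address(...)) yields the compressed form (e.g. '::1'), which hashes differently from the uncompressed form, so this only helps if the target hashes were made from uncompressed addresses.
Code: Select all
def int_to_ipv6(n):
  # eight 16-bit hextets, lowercase hex, no :: compression
  return ':'.join('%x' % ((n >> shift) & 0xffff)
                  for shift in range(112, -1, -16))

print(int_to_ipv6(1)) # 0:0:0:0:0:0:0:1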

This is meant to be run from a CMD prompt like this: python.exe MD5crack_IPv6.py
Unless you've set the PATH statement to include Python, you'll likely have to change directory (cd) to the directory where MD5crack_IPv6.py is located.

This code also takes the hashinput global and chucks it into a hashin local for each process just as the IPv4 code does, because processing on the global is really slow. Once an MD5:IP match is found, that MD5 hash is removed from the hashinput global, then hashin is updated for each process to reflect the changes. Thus the code doesn't have to waste CPU cycles on hashes that have already been found.

You'll note the starting_ip='::' line below. If you know the starting IP address of the range of IP addresses you're trying to find the corresponding MD5 hashes for, enter it here in place of '::'. The code will start at that starting IP address and count up from there.

You'll also have to alter the following:

multiprocessing.Process(target=worker,args=(hashinput,starting_ip,0,2500000000)),
multiprocessing.Process(target=worker,args=(hashinput,starting_ip,2500000000,5000000000)),
multiprocessing.Process(target=worker,args=(hashinput,starting_ip,5000000000,7500000000)),
multiprocessing.Process(target=worker,args=(hashinput,starting_ip,7500000000,10000000000)),
multiprocessing.Process(target=worker,args=(hashinput,starting_ip,10000000000,12500000000)),


Specifically, you can add or remove lines to match how many CPU cores you want to utilize. I'm using 5 CPU cores in the example above. Additionally, you should alter the last two numbers of each line to give you a reasonable runtime. I have each CPU core checking 2.5 billion MD5:IP address combinations, which means they'll each run for ~24 hours.

(Contrast that to the IPv4 code, which can check all ~4 billion IPv4 addresses in ~1 hour... as I said, ipaddress.IPv6Address is dog slow. The IPv4 code is ~8 times faster than the IPv6 code.)

Code: Select all
import multiprocessing
from datetime import datetime

def worker(hashinput,starting_ip,range_start,range_end):
  import ipaddress,fastmd5,sys # Local import is faster
  sys.setswitchinterval(.05) # Raise the switch interval from the 0.005 default to reduce context switching
  hashMD5 = fastmd5.md5
  ip = ipaddress.IPv6Address(starting_ip)
  checkrange = range(range_start,range_end)

  hashin = list(hashinput) # local copy is much faster to scan than the managed proxy

  for count in checkrange:
    ipcount = ip + count
    if (count%3333333) == 0:
      if hashin != list(hashinput): # snapshot the proxy first; comparing a plain list to the proxy directly always reports unequal
        hashin = list(hashinput)
      if not hashin:
        sys.exit(0)
      print('{!s}\rProgress: {:.2%}, {!s} hashes, Count: {!s}\t{!s}'.format((" " * 95),((count-range_start)/(range_end-range_start)),len(hashin),(count-range_start),ipcount),end='\r')
    h = hashMD5(bytes('%s' % ipcount,'utf-8')).hexdigest()
    for entry in hashin:
      if h == entry:
        print('%s\t%s' % (entry,ipcount))
        try:
          hashinput.remove(entry) # another worker may have removed it already
        except ValueError:
          pass

if __name__=='__main__':
  mgr = multiprocessing.Manager()
  starting_ip='::'
  hashlist=[
    '11111111111111111111111111111111', # TEST HASH - replace with your own hashes
    '22222222222222222222222222222222', # TEST HASH - replace with your own hashes
    '33333333333333333333333333333333', # TEST HASH - replace with your own hashes
  ]

  hashinput = mgr.list()
  for item in hashlist:
    hashinput.append(item)

  jobs=[
    multiprocessing.Process(target=worker,args=(hashinput,starting_ip,0,2500000000)),
    multiprocessing.Process(target=worker,args=(hashinput,starting_ip,2500000000,5000000000)),
    multiprocessing.Process(target=worker,args=(hashinput,starting_ip,5000000000,7500000000)),
    multiprocessing.Process(target=worker,args=(hashinput,starting_ip,7500000000,10000000000)),
    multiprocessing.Process(target=worker,args=(hashinput,starting_ip,10000000000,12500000000)),
  ]

  print('Starting {!s} processes {!s}\n\nChecking for {!s} hashes:'.format(len(jobs),datetime.now(),len(hashinput)))
  for entry in hashinput:
    print(entry)
  print('\nResults:')

  for j in jobs:
    j.daemon=False
    j.start()
  for j in jobs:
    j.join()

  print('\nSEARCH COMPLETE {!s}'.format(datetime.now()))


If you're using the hashlib file instead of fastmd5.py, you'd change the following lines:

FROM: import ipaddress,fastmd5,sys # Local import is faster
TO: import ipaddress,hashlib,sys # Local import is faster

FROM: hashMD5 = fastmd5.md5
TO: hashMD5 = hashlib.md5
Last edited by PyMD5 on Tue Sep 06, 2016 5:01 am, edited 7 times in total.
PyMD5
 
Posts: 35
Joined: Thu Jun 30, 2016 5:19 am

Re: How to speed this up?

Postby PyMD5 » Sun Aug 28, 2016 4:49 am

I've found yet another optimization...

instead of doing this:
Code: Select all
for aa in range(range_start, range_end):
  for bb in range(0,256):
    for cc in range(0,256):
      for dd in range(0,256):
        pass # loop body here


do this:
Code: Select all
aarange = range(range_start,range_end)
bbrange = range(0,256)
ccrange = range(0,256)
ddrange = range(0,256)

for aa in aarange:
  for bb in bbrange:
    for cc in ccrange:
      for dd in ddrange:
        pass # loop body here


I've run this more than a dozen times to time them, just iterating without any other code. When iterating over the entire IPv4 space, the second bit of code consistently takes ~16 seconds less.

[EDIT: rather than one iterable for all the loops (i.e. bcd = range(0,256); for bb in bcd; for cc in bcd; etc.), it's even faster to denote one iterable for each loop. Consistently 9 seconds faster over the entire IPv4 address space, doing nothing else in the code but iterating. The code above has been changed to reflect this.]

Additionally, rather than doing this:
bcd = range(256)

do this:
bcd = range(0,256)

The second bit of code saves ~19 seconds when iterating over the entire IPv4 space.

It's not a huge amount of time, but every little bit helps, especially when applying optimization to code that's working on trillions of IPv6 addresses.
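Differences this small are easy to mismeasure with wall-clock timing, so here is a sketch of how to cross-check them with timeit, which repeats the statement many times and minimizes timer overhead:
Code: Select all
from timeit import timeit

# hoisted range object vs. constructing it inside the loop statement
print(timeit('for i in r: pass', setup='r = range(256)', number=1000000))
print(timeit('for i in range(256): pass', number=1000000))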
Last edited by PyMD5 on Sun Sep 04, 2016 2:04 am, edited 1 time in total.
PyMD5
 
Posts: 35
Joined: Thu Jun 30, 2016 5:19 am

Re: How to speed this up?

Postby Ofnuts » Sun Aug 28, 2016 8:13 am

Have you tried to use a generator instead of your three nested loops?
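A minimal sketch of one way to read the suggestion (not code from the thread), with the three inner octet loops folded into a single generator:
Code: Select all
def last_three_octets():
  for bb in range(256):
    for cc in range(256):
      for dd in range(256):
        yield '%s.%s.%s' % (bb, cc, dd)

for aa in range(0, 16): # per-worker slice of the first octet
  for tail in last_three_octets():
    ip = '%s.%s' % (aa, tail) # hash ip here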
This forum has been moved to http://python-forum.io/. See you there.
User avatar
Ofnuts
 
Posts: 2659
Joined: Thu May 14, 2015 9:46 am
Location: Paris, France, EU, Earth, Solar system, Milky Way, Local Cluster, Universe #32987440940987

Re: How to speed this up?

Postby Larz60+ » Sun Aug 28, 2016 10:44 am

Hello,

Have you looked at https://github.com/fluentpython/example-code/blob/master/02-array-seq/listcomp_speed.py?

A list comprehension generates and allocates a full list; given the size of your lists, this becomes non-trivial.


You can try Cartesian Products

from fluent python pg 26:

Listcomps can generate lists from the Cartesian product of two or more iterables. The
items that make up the cartesian product are tuples made from items from every
input iterable. The resulting list has a length equal to the lengths of the input iterables
multiplied. See Figure 2-2.


Also, a genexp (as mentioned by Ofnuts previously)

from fluent python pg 27:

To initialize tuples, arrays, and other types of sequences, you could also start from a
listcomp, but a genexp saves memory because it yields items one by one using the
iterator protocol instead of building a whole list just to feed another constructor.
Genexps use the same syntax as listcomps, but are enclosed in parentheses rather
than brackets.
...
uses a genexp with a Cartesian product to print out a roster of T-shirts
of two colors in three sizes. In contrast with Example 2-4, here the six-item list of T-shirts is never built in memory: the generator expression feeds the for loop, producing one item at a time. If the two lists used in the Cartesian product had 1,000 items
ing one item at a time. If the two lists used in the Cartesian product had 1,000 items
each, using a generator expression would save the expense of building a list with a
million items just to feed the for loop.


So you greatly reduce the amount of memory used.
That in and of itself should save time.

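A minimal sketch of the genexp Cartesian product the book describes (tiny ranges for clarity; the real search would use range(256) four times):
Code: Select all
colors = ['black', 'white']
sizes = ['S', 'M', 'L']
# items are produced one at a time; the six-item list is never built
for shirt in ((color, size) for color in colors for size in sizes):
  print(shirt)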
I highly recommend owning a copy of Fluent Python by Luciano Ramalho, $54.99 (Book & Ebook), O'Reilly.

It's a year old now, so there may be used copies available. Also, if you contact O'Reilly, they can let you know next time it goes on sale.
They have 1/2 price sales all the time, sometimes much better.
It will take a spot in my library next to The Python Standard Library, The Python Cookbook, and Python Essential Reference.

Larz60+
Larz60+
 
Posts: 1307
Joined: Thu Apr 03, 2014 4:06 pm

Re: How to speed this up?

Postby snippsat » Sun Aug 28, 2016 11:06 am

If possible, try running on the GPU with CUDA Python, using NumbaPro.
A Mandelbrot example accelerated with CUDA Python:
19x speed-up over the CPU-only accelerated version using GPUs, and a 2000x speed-up over pure interpreted Python code.

I would also try PyPy.
It's easy to run even though it's an incredibly complex system:
python my_code.py changes to pypy my_code.py.
Here is a run of your test code with itertools, using a slightly older version, PyPy 2.1.
Code: Select all
#Python 3.4
λ python it.py
OLD:    0:02:17.431449
NEW:    0:04:27.603164

#PyPy 2.1
λ pypy it.py
OLD:    0:00:26.202000
NEW:    0:02:47.154000

A fun video to watch is David Beazley taking a look into PyPy.
We will be moving to python-forum.io on October 1 2016
User avatar
snippsat
 
Posts: 1251
Joined: Thu Feb 21, 2013 12:04 am

Re: How to speed this up?

Postby PyMD5 » Sun Aug 28, 2016 8:32 pm

Ofnuts wrote:Have you tried to use a generator instead of your three nested loops?


Yeah, I tried every possible combination of generator (with the MD5 hash generation inside and outside the generator, with the iteration through the known hashes inside and outside the generator, etc.), and it was slower than the simple for loops... at least for the IPv4 code.
PyMD5
 
Posts: 35
Joined: Thu Jun 30, 2016 5:19 am

Re: How to speed this up?

Postby PyMD5 » Mon Sep 05, 2016 10:42 pm

I found another tweak to speed things up... localization of imports.

Rather than importing everything at the top of the code, you only import what you need, then import the rest where it's used. In my case, where it's used is in the sub-processes, so it makes those imports local rather than global, which speeds things up. The IPv4 code ran in 50 minutes, 24 seconds for 1 MD5 hash, rather than the previous 1 hour, 3 minutes.

So, for the first few lines of code, rather than doing this:
Code: Select all
import multiprocessing,ipaddress,fastmd5,sys
from datetime import datetime

def worker(hashinput,range_start,range_end):
  hashMD5 = fastmd5.md5


I'm now doing this:
Code: Select all
import multiprocessing,sys
from datetime import datetime

def worker(hashinput,range_start,range_end):
  import ipaddress,fastmd5
  hashMD5 = fastmd5.md5
Last edited by PyMD5 on Sat Sep 10, 2016 3:06 am, edited 1 time in total.
PyMD5
 
Posts: 35
Joined: Thu Jun 30, 2016 5:19 am

Re: How to speed this up?

Postby PyMD5 » Mon Sep 05, 2016 10:43 pm

Following on from the previous post, the same tweak applies to sys: it's only used inside the worker processes, so it too can be imported locally, along with the setswitchinterval call.

So, for the first few lines of code, rather than doing this:
Code: Select all
import multiprocessing,fastmd5,sys
from datetime import datetime

def worker(hashinput,range_start,range_end):
  hashMD5 = fastmd5.md5

and this:
Code: Select all
if __name__ == '__main__':
  sys.setswitchinterval(.05) # Raise the switch interval from the 0.005 default to reduce context switching


I'm now doing this:
Code: Select all
import multiprocessing
from datetime import datetime

def worker(hashinput,range_start,range_end):
  import fastmd5,sys
  sys.setswitchinterval(.05) # Raise the switch interval from the 0.005 default to reduce context switching
  hashMD5 = fastmd5.md5
PyMD5
 
Posts: 35
Joined: Thu Jun 30, 2016 5:19 am

Re: How to speed this up?

Postby Ofnuts » Tue Sep 06, 2016 12:16 pm

PyMD5 wrote:I found another tweak to speed things up... localization of imports.

Rather than importing everything at the top of the code, you only import what you need, then import the rest where it's used. In my case, where it's used is in the sub-processes, so it makes those imports local rather than global, which speeds things up. The IPv4 code ran in 50 minutes, 24 seconds for 1 MD5 hash, rather than the previous 1 hour, 3 minutes.


Uh? Since you use them anyway, where is the gain? See http://stackoverflow.com/questions/9614 ... tion-level
This forum has been moved to http://python-forum.io/. See you there.
User avatar
Ofnuts
 
Posts: 2659
Joined: Thu May 14, 2015 9:46 am
Location: Paris, France, EU, Earth, Solar system, Milky Way, Local Cluster, Universe #32987440940987

Re: How to speed this up?

Postby PyMD5 » Tue Sep 06, 2016 11:36 pm

Ofnuts wrote:
PyMD5 wrote:I found another tweak to speed things up... localization of imports.

Rather than importing everything at the top of the code, you only import what you need, then import the rest where it's used. In my case, where it's used is in the sub-processes, so it makes those imports local rather than global, which speeds things up. The IPv4 code ran in 50 minutes, 24 seconds for 1 MD5 hash, rather than the previous 1 hour, 3 minutes.


Uh? Since you use them anyway, where is the gain? See http://stackoverflow.com/questions/9614 ... tion-level


All I know is the IPv4 code saved ~13 minutes runtime over the whole IPv4 address space, and the IPv6 code saved ~5 hours running 2 billion IP addresses per process.

I think it's something along the lines of global vs. local variables. I've already figured out that accessing global-scope variables is slow.

Importing locally doesn't modify module-level state, but it does create a name in the local scope. Importing at the top of the code binds the name in the main module before the fork to the worker processes, so each worker resolves it through the global scope. Importing inside the function (the code each process runs) keeps the name local to that function, and thus to each process, which is faster. I think.

All names, including those bound to modules, are searched for in the local, non-local, global, and built-in scopes, in that order.
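A rough illustration of why that ordering matters for speed (a sketch; exact numbers vary): local names compile to array-indexed lookups, while globals go through a dictionary lookup on every access.
Code: Select all
from timeit import timeit

x = 1

def use_global():
  for _ in range(1000):
    y = x # global dict lookup on every iteration

def use_local(x=x): # default argument binds x as a local
  for _ in range(1000):
    y = x # fast local lookup on every iteration

print(timeit(use_global, number=10000))
print(timeit(use_local, number=10000))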
PyMD5
 
Posts: 35
Joined: Thu Jun 30, 2016 5:19 am

Re: How to speed this up?

Postby ichabod801 » Wed Sep 07, 2016 12:54 am

Then couldn't you save more time by making the function a class with a __call__ method, and importing the module once during __init__?
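A minimal sketch of that idea (an assumption about what's meant; fastmd5 is the module from earlier in the thread): the import cost is paid once in __init__, and the hot path calls an attribute bound on the instance.
Code: Select all
class Md5Worker:
  def __init__(self):
    import fastmd5 # imported once, when the worker object is created
    self._md5 = fastmd5.md5

  def __call__(self, data):
    return self._md5(data).hexdigest()

hash_ip = Md5Worker()
print(hash_ip(b'1.2.3.4'))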
Due to the reasons discussed here we will be moving to python-forum.io on October 1st, 2016.
This forum will be locked down and no one will be able to post/edit/create threads, etc. here from thereafter. Please create an account at the new site to continue discussion.
ichabod801
 
Posts: 688
Joined: Sat Feb 09, 2013 12:54 pm
Location: Outside Washington DC

Re: How to speed this up?

Postby PyMD5 » Wed Sep 07, 2016 1:33 am

Hmmm, I don't know how to do that yet. I'll do more testing.
PyMD5
 
Posts: 35
Joined: Thu Jun 30, 2016 5:19 am

Re: How to speed this up?

Postby PyMD5 » Wed Sep 07, 2016 4:20 am

Another thing I'm looking at doing is directly accessing _hashlib, rather than having to go through hashlib.py (or fastmd5.py, my stripped-down version of hashlib.py).

Here's fastmd5.py:
Code: Select all
import _hashlib
globals()['md5'] = getattr(_hashlib, 'openssl_md5')


That's all one needs to generate MD5 hashes.

You'll note it uses globals(). In studying up on globals(), it apparently allows dictionary-based access to global variables... which means the generated IP address that we want to MD5-hash must be a global variable, even if it's created in a function (which I thought gave it local scope). Unless I'm not understanding how the generated IP address is passed to fastmd5.py and on to _hashlib.

So, how would I go about directly accessing _hashlib from within the function without globals() and without hashlib.py or fastmd5.py?
PyMD5
 
Posts: 35
Joined: Thu Jun 30, 2016 5:19 am

Re: How to speed this up?

Postby wavic » Wed Sep 07, 2016 10:39 pm

Did you try some kind of caching? I think it may help.
wavic
 
Posts: 165
Joined: Wed May 25, 2016 8:51 pm

Re: How to speed this up?

Postby PyMD5 » Thu Sep 08, 2016 4:37 am

Actually, I'm working on that. I found that a list comprehension greatly sped up the process over a small range of IP addresses, but as the list grew, it would consume all memory and slow to a crawl... so I'm combining my current code with a list comprehension that only runs on a million IP addresses at a time, wrapped in a for loop that increments by one million. If it works, that should just about double the speed of this thing. A rough sketch of the idea follows.
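A rough sketch of that idea (assumptions: fastmd5.md5 as the hash, a caller-supplied make_addr that turns an integer offset into an address string, and wanted as a set of target hashes; none of this is code from the thread):
Code: Select all
import fastmd5

def crack_in_chunks(wanted, total, make_addr, chunk=1000000):
  md5 = fastmd5.md5
  for start in range(0, total, chunk):
    # hash -> address for one bounded chunk; memory stays at ~chunk entries
    block = {md5(addr.encode()).hexdigest(): addr
             for addr in map(make_addr, range(start, min(start + chunk, total)))}
    for h in wanted & block.keys(): # set intersection finds the matches
      print('Match Found: %s\t%s' % (block[h], h))
    wanted.difference_update(block.keys())
    if not wanted:
      return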
PyMD5
 
Posts: 35
Joined: Thu Jun 30, 2016 5:19 am

Re: How to speed this up?

Postby casevh » Thu Sep 08, 2016 5:07 am

PyMD5 wrote:Another thing I'm looking at doing is directly accessing _hashlib, rather than having to go through hashlib.py (or fastmd5.py, my stripped-down version of hashlib.py).
.... snipped ....
So, how would I go about directly accessing _hashlib from within the function without globals() and without hashlib.py or fastmd5.py?


Code: Select all
from _hashlib import openssl_md5 as md5
casevh
 
Posts: 114
Joined: Sat Feb 09, 2013 7:35 am

Re: How to speed this up?

Postby casevh » Thu Sep 08, 2016 5:59 am

Here is my attempt at speeding up your worker function. I'm not sure how it compares to your current version.

Code: Select all
from _hashlib import openssl_md5 as md5
import time

def worker(hashinput,range_start,range_end, _md5 = md5): # default-argument trick binds md5 as a fast local name

  aarange = [bytes('%s' % i, 'ascii') for i in range(range_start,range_end)]
  bbrange = [bytes('.%s' % i, 'ascii') for i in range(0, 256)] # 256, not 255, so octet value 255 is covered
  ccrange = [bytes('.%s' % i, 'ascii') for i in range(0, 256)]
  ddrange = [bytes('.%s' % i, 'ascii') for i in range(0, 256)]

  hashin = None

  for aa in aarange:
    temphash = set(hashinput)
    if hashin != temphash:
      hashin = temphash
    if not hashin:
      return
    for bb in bbrange:
      for cc in ccrange:
        temp = aa + bb + cc
        for dd in ddrange:
          h = _md5(temp + dd).hexdigest()
          # h = temp + dd # uncomment (and comment out the line above) to time the loops without hashing
          if h in hashin:
            hashin.discard(h) # keep the local snapshot in sync
            hashinput.remove(h)
            print('Match Found: %s\t%s' % ((temp + dd).decode('ascii'), h))

if __name__ == '__main__':

  hashlist = [
    '11111111111111111111111111111111', # TEST HASH - replace with your own hashes
    '22222222222222222222222222222222', # TEST HASH - replace with your own hashes
    '33333333333333333333333333333333', # TEST HASH - replace with your own hashes
  ]

  start = time.time()
  worker(hashlist, 0, 3)
  print(time.time() - start)
casevh
 
Posts: 114
Joined: Sat Feb 09, 2013 7:35 am
