Lol, I understand what you are doing now... the argument you are passing is NOT the size of the file to generate. It is the upper limit of the integers to generate.
If you pass 50*1024*1024 to my function, you are asking for a file containing every integer below 52428800, which comes to roughly 460 megabytes with one-byte newlines.
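To make the size arithmetic concrete, here is a rough sketch that counts the bytes needed for every integer below a limit, one per line. The name `estimated_size` and the `newline_bytes` parameter are my own, made up for illustration. It uses only plain arithmetic, so it runs unchanged on Python 2 and 3.

```python
def estimated_size(lim, newline_bytes=1):
    """Bytes needed to write every integer in [0, lim), one per line."""
    total = 0
    digits = 1
    low = 0  # smallest number with `digits` digits (0 counts as one digit)
    while low < lim:
        high = min(10 ** digits, lim)  # one past the largest such number
        total += (high - low) * (digits + newline_bytes)
        low = high
        digits += 1
    return total


print(estimated_size(50 * 1024 * 1024))  # 460748090
print(estimated_size(6388889))           # 50000002
```

So the big limit gives about 460 MB, while 6388889 lands almost exactly on 50 MB (decimal megabytes, one byte per newline).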
@Metul... the run time only starts to blow up once I use large numbers, and how large depends on my processing speed. Try increasing the number drastically and see if the timings stay about even. They shouldn't stay even if you make the number large enough, unless something very strange is happening. You may just be using a much more powerful computer than mine.
Anyway... final silly answer:
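Code:
    import time

    # NOTE: the function name and the output filename below are placeholders;
    # the original names did not survive in this post.
    def create_file(filename, lim, n=None):
        """lim is the max number to print up to (non-inclusive).
        n should be chosen to be as large as possible while still fitting in memory.
        If lim can already fit in memory do not pass a third argument."""
        if n:
            # Split [0, lim) into ranges of at most n numbers each.
            gener = (xrange(i, i + n) if i + n < lim else xrange(i, lim)
                     for i in xrange(0, lim, n))
        else:
            gener = (xrange(lim),)
        # 'a' appends, so delete any old file between runs.
        with open(filename, 'a') as myfile:
            for chunk_range in gener:
                data = "\n".join(str(num) for num in chunk_range) + "\n"
                myfile.write(data)

    if __name__ == "__main__":
        NUMBER = 50*1024*1024
        PER_CHUNK = 4000000
        start = time.time()
        create_file("numbers.txt", NUMBER, PER_CHUNK)
        print(time.time() - start)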
Note that the above code WILL create a file with 52428800 numbers. It runs in 20 seconds on my machine. If you only want a 50 megabyte file, lower the number to the 6388889 that Metul originally suggested.
Edit: @Metul: When I booted into Linux Mint I got speeds similar to yours. I have no clue what Windows is doing to add 3 seconds to the time.
@OP: if you are running under Windows, it seems that "\n" adds two bytes, not one, because a file opened in text mode writes it as "\r\n". Try taking this into account.
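One way to take the newline issue into account is to open the file in binary mode, so "\n" stays a single byte on every platform. This is only a sketch: `write_numbers` and the temporary file name are made up for the example, and it uses `range` so it runs on Python 3 (on Python 2 you would swap in `xrange` for large limits).

```python
import os
import tempfile


def write_numbers(path, lim):
    """Write every integer in [0, lim), one per line, with 1-byte newlines."""
    # Binary mode turns off newline translation, so "\n" is not
    # expanded to "\r\n" on Windows.
    with open(path, "wb") as f:
        data = "\n".join(str(n) for n in range(lim)) + "\n"
        f.write(data.encode("ascii"))


# Small demo: 10 one-digit, 90 two-digit and 900 three-digit numbers.
demo_path = os.path.join(tempfile.mkdtemp(), "numbers_demo.txt")
write_numbers(demo_path, 1000)
print(os.path.getsize(demo_path))  # 3890 = 10*2 + 90*3 + 900*4
```

If you would rather keep text mode on Windows, count two bytes per line instead of one when picking the limit.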