1. The most basic way to read files:
# File: readline-example-1.py
file = open("sample.txt")
while 1:
    line = file.readline()
    if not line:
        break
    pass # do something
Reading the file line by line this way is slow, but it uses very little memory. On my machine it reads a 10 MB sample.txt at about 32,000 lines per second.
2. Using the fileinput module
# File: readline-example-2.py
import fileinput
for line in fileinput.input("sample.txt"):
    pass # do something
This version is simpler to write, but testing shows it reads only about 13,000 lines per second, less than half the speed of the previous method...
3. Buffered file reading
# File: readline-example-3.py
file = open("sample.txt")
while 1:
    lines = file.readlines(100000)
    if not lines:
        break
    for line in lines:
        pass # do something
Is this method really better? On the same test data it reads about 96,900 lines per second: roughly 3 times the speed of the first method and 7 times that of the second!
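If you want to check these numbers yourself, the comparison can be sketched with a small timing harness. This is my own throwaway benchmark, not code from the article: the file name, line count, and `sizehint` value are arbitrary, and absolute numbers will differ by machine.

```python
import os
import tempfile
import time

def make_sample(path, n_lines=50000):
    # Write a small synthetic file to time against.
    with open(path, "w") as f:
        for i in range(n_lines):
            f.write("line %d\n" % i)

def time_readline(path):
    # Method 1: one readline() call per line.
    f = open(path)
    n = 0
    start = time.perf_counter()
    while 1:
        line = f.readline()
        if not line:
            break
        n += 1
    elapsed = time.perf_counter() - start
    f.close()
    return n, elapsed

def time_readlines_hint(path, hint=100000):
    # Method 3: readlines() with a sizehint reads a buffered
    # chunk of lines per call instead of one line at a time.
    f = open(path)
    n = 0
    start = time.perf_counter()
    while 1:
        lines = f.readlines(hint)
        if not lines:
            break
        n += len(lines)
    elapsed = time.perf_counter() - start
    f.close()
    return n, elapsed

path = os.path.join(tempfile.gettempdir(), "sample_bench.txt")
make_sample(path)
n1, t1 = time_readline(path)
n2, t2 = time_readlines_hint(path)
print("readline():           %d lines in %.3fs" % (n1, t1))
print("readlines(sizehint):  %d lines in %.3fs" % (n2, t2))
os.remove(path)
```

Both loops must of course count the same number of lines; only the elapsed times differ.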
————————————————————————————————————————————————————————————
Since Python 2.2, we can iterate over a file object directly with a for loop to read each line:
# File: readline-example-5.py
file = open("sample.txt")
for line in file:
    pass # do something
In Python 2.1, you have to use the xreadlines iterator instead:
# File: readline-example-4.py
file = open("sample.txt")
for line in file.xreadlines():
    pass # do something