1. The most basic way to read files:
# File: readline-example-1.py
file = open("sample.txt")
while 1:
    line = file.readline()
    if not line:
        break
    pass # do something
Reading the file line by line this way is slow, but it uses very little memory. On my machine it reads a 10 MB sample.txt at about 32,000 lines per second.
2. Using the fileinput module
# File: readline-example-2.py
import fileinput
for line in fileinput.input("sample.txt"):
    pass # do something
This version is simpler to write, but testing shows it reads only about 13,000 lines per second, less than half the speed of the previous method...
3. Buffered file reading
# File: readline-example-3.py
file = open("sample.txt")
while 1:
    lines = file.readlines(100000)
    if not lines:
        break
    for line in lines:
        pass # do something
Is this method really better? On the same test data it reads about 96,900 lines per second: roughly 3 times the speed of the first method and 7 times that of the second!
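If you want to check these numbers yourself, the comparison can be sketched with a small timing harness. This is my own throwaway benchmark, not code from the article: the file name, line count, and `sizehint` value are arbitrary, and absolute numbers will differ by machine.

```python
import os
import tempfile
import time

def make_sample(path, n_lines=50000):
    # Write a small synthetic file to time against.
    with open(path, "w") as f:
        for i in range(n_lines):
            f.write("line %d\n" % i)

def time_readline(path):
    # Method 1: one readline() call per line.
    f = open(path)
    n = 0
    start = time.perf_counter()
    while 1:
        line = f.readline()
        if not line:
            break
        n += 1
    elapsed = time.perf_counter() - start
    f.close()
    return n, elapsed

def time_readlines_hint(path, hint=100000):
    # Method 3: readlines() with a sizehint reads a buffered
    # chunk of lines per call instead of one line at a time.
    f = open(path)
    n = 0
    start = time.perf_counter()
    while 1:
        lines = f.readlines(hint)
        if not lines:
            break
        n += len(lines)
    elapsed = time.perf_counter() - start
    f.close()
    return n, elapsed

path = os.path.join(tempfile.gettempdir(), "sample_bench.txt")
make_sample(path)
n1, t1 = time_readline(path)
n2, t2 = time_readlines_hint(path)
print("readline():           %d lines in %.3fs" % (n1, t1))
print("readlines(sizehint):  %d lines in %.3fs" % (n2, t2))
os.remove(path)
```

Both loops must of course count the same number of lines; only the elapsed times differ.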
————————————————————————————————————————————————————————————
Since Python 2.2, we can iterate over a file object directly with a for loop to read each line:
# File: readline-example-5.py
file = open("sample.txt")
for line in file:
    pass # do something
In Python 2.1, you have to use the xreadlines iterator instead:
# File: readline-example-4.py
file = open("sample.txt")
for line in file.xreadlines():
    pass # do something