Python file read and write operations

Read file

Open a file with the built-in open() function (open() returns a file object, which is iterable):

>>> f = open('test.txt', 'r')

'r' opens the file for reading in text mode; 'rb' opens it for reading in binary mode. (The default value of the mode parameter is 'r'.)

If the file does not exist, open() raises FileNotFoundError (a subclass of OSError; in Python 3, IOError is an alias of OSError), with an error code and a message telling you that the file does not exist:

>>> f = open('test.txt', 'r')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
FileNotFoundError: [Errno 2] No such file or directory: 'test.txt'

The file must be closed after use, because a file object occupies operating system resources, and the number of files the operating system can have open at the same time is limited:

>>> f.close()

Since an OSError (IOError) may be raised while reading or writing a file, any f.close() call placed after the failing statement would be skipped. To make sure the file is closed whether or not an error occurs, we can use try ... finally:

f = None  # so that "if f" in finally is safe even when open() itself fails
try:
    f = open('/path/to/file', 'r')
    print(f.read())
finally:
    if f:
        f.close()

But writing this every time is cumbersome, so Python provides the with statement, which calls close() for us automatically:

with open('/path/to/file', 'r') as f:
    print(f.read())
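
As a minimal sketch of the point above, the following checks that the with statement closes the file even when an exception is raised inside the block, and shows one way to handle a missing file. The helper name read_or_none and the temporary file are illustrative assumptions, not part of the original example.

```python
import os
import tempfile


def read_or_none(path):
    """Return the file's text, or None if the file does not exist (hypothetical helper)."""
    try:
        with open(path, 'r') as f:
            return f.read()
    except FileNotFoundError:
        return None


# Create a throwaway file so the example is self-contained.
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, 'test.txt')
with open(path, 'w') as f:
    f.write('hello\n')

try:
    with open(path, 'r') as f:
        handle = f  # keep a reference so we can inspect it afterwards
        raise ValueError('simulated failure while reading')
except ValueError:
    pass

closed_after_error = handle.closed  # the with block closed the file on the way out
missing = read_or_none(os.path.join(tmp_dir, 'no_such_file.txt'))
print(closed_after_error, missing)
```

The closed attribute of the file object reports whether close() has already been called.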

The Python file object provides three "read" methods: read(), readline() and readlines(). Each accepts an optional argument that limits how much data is read per call.

Note: All three methods keep the '\n' at the end of each line. It is not removed by default, so we need to strip it ourselves.

In [2]: with open('test1.txt', 'r') as f1:
   ...:     list1 = f1.readlines()
In [3]: list1
Out[3]: ['111\n', '222\n', '333\n', '444\n', '555\n', '666\n']

Remove the '\n':

In [4]: with open('test1.txt', 'r') as f1:
   ...:     list1 = f1.readlines()
   ...: for i in range(0, len(list1)):
   ...:     list1[i] = list1[i].rstrip('\n')
In [5]: list1
Out[5]: ['111', '222', '333', '444', '555', '666']
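
The index-based loop above can be written more compactly. The sketch below shows two common alternatives, a list comprehension over the file object and str.splitlines(). The sample data mirrors test1.txt but is written to a temporary file, which is an assumption made so that the example runs anywhere.

```python
import os
import tempfile

# Self-contained sample file (stands in for test1.txt).
path = os.path.join(tempfile.mkdtemp(), 'test1.txt')
with open(path, 'w') as f:
    f.write('111\n222\n333\n')

# Option 1: iterate over the file object and strip each line.
with open(path, 'r') as f1:
    stripped = [line.rstrip('\n') for line in f1]

# Option 2: read everything and let splitlines() drop the line endings.
with open(path, 'r') as f1:
    split = f1.read().splitlines()

print(stripped)
print(split)
```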

For read() and readline(), the '\n' is also read in, but the text displays normally when printed (because print() renders '\n' as a line break):

In [7]: with open('test1.txt', 'r') as f1:
   ...:     list1 = f1.read()
In [8]: list1
Out[8]: '111\n222\n333\n444\n555\n666\n'
In [9]: print(list1)
111
222
333
444
555
666

In [10]: with open('test1.txt', 'r') as f1:
   ...:     list1 = f1.readline()
In [11]: list1
Out[11]: '111\n'
In [12]: print(list1)
111
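
The optional size argument mentioned earlier, and the fact that a file object is iterable, can be sketched as follows. The file contents and variable names are illustrative assumptions. In text mode the size is counted in characters.

```python
import os
import tempfile

# Self-contained sample file (stands in for test1.txt).
path = os.path.join(tempfile.mkdtemp(), 'test1.txt')
with open(path, 'w') as f:
    f.write('111\n222\n333\n')

with open(path, 'r') as f1:
    first_chunk = f1.read(5)      # at most 5 characters
    rest_of_line = f1.readline()  # continues to the end of the current line
    partial = f1.readline(2)      # at most 2 characters of the next line
    remainder = f1.read()         # everything that is left

# Iterating over the file object yields one line at a time, so a large
# file does not have to be loaded into memory all at once.
with open(path, 'r') as f1:
    line_lengths = [len(line) for line in f1]

print(repr(first_chunk), repr(rest_of_line), repr(partial), repr(remainder))
print(line_lengths)
```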

An example Python interview question:

There are two files, each containing many lines of IP addresses. Find the IP addresses that appear in both files:

# coding: utf-8
import bisect

with open('test1.txt', 'r') as f1:
    list1 = f1.readlines()
for i in range(0, len(list1)):
    list1[i] = list1[i].strip('\n')
with open('test2.txt', 'r') as f2:
    list2 = f2.readlines()
for i in range(0, len(list2)):
    list2[i] = list2[i].strip('\n')

# Sort list2 so that it can be binary-searched
list2.sort()
same_data = []
for i in list1:
    pos = bisect.bisect_left(list2, i)
    if pos < len(list2) and list2[pos] == i:
        same_data.append(i)
# Remove duplicates
same_data = list(set(same_data))
print(same_data)

The main points are: (1) use with; (2) strip the '\n' at the end of each line; (3) use binary search to improve the efficiency of the algorithm; (4) use set to remove duplicates quickly.
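
For comparison, here is a hedged sketch that puts the binary-search approach next to a hash-based alternative using set intersection. Set intersection is a swapped-in technique, not the method used in the answer above. The function names and the sample addresses (taken from the documentation-reserved ranges) are illustrative assumptions.

```python
import bisect


def common_by_bisect(items1, items2):
    """Binary-search each item of items1 in a sorted copy of items2."""
    sorted2 = sorted(items2)
    found = set()
    for item in items1:
        pos = bisect.bisect_left(sorted2, item)
        if pos < len(sorted2) and sorted2[pos] == item:
            found.add(item)
    return found


def common_by_set(items1, items2):
    """Hash-based alternative: intersect two sets."""
    return set(items1) & set(items2)


# Illustrative data; in the interview question these come from two files.
ips1 = ['192.0.2.1', '192.0.2.7', '198.51.100.4', '192.0.2.7']
ips2 = ['203.0.113.9', '192.0.2.7', '198.51.100.4']

print(sorted(common_by_bisect(ips1, ips2)))
print(sorted(common_by_set(ips1, ips2)))
```

Both functions return the same result. The set version avoids the explicit sort and is usually the shorter way to write it, at the cost of holding both sets in memory.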

Write file

Writing a file works the same way as reading a file. The only difference is that when open() is called, the mode 'w' or 'wb' is passed in to indicate writing a text file or a binary file:

>>> f = open('test.txt', 'w')  # 'wb' would mean writing a binary file
>>> f.write('Hello, world!')
13
>>> f.close()

Note: The 'w' mode works like this: if the file does not exist, it is created; if it does exist, its original contents are cleared before the new data is written. So if you want to keep the original content and add new content after it, use the 'a' mode.
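
A small sketch of the difference between 'w' and 'a' described in the note. The temporary path and the strings written are assumptions for illustration.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'test.txt')

with open(path, 'w') as f:  # the file does not exist yet, so it is created
    f.write('first\n')
with open(path, 'w') as f:  # the existing content is cleared before writing
    f.write('second\n')
with open(path, 'a') as f:  # the new content is added after the existing content
    f.write('third\n')

with open(path, 'r') as f:
    content = f.read()
print(repr(content))
```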

We can call write() repeatedly to write to the file, but we must call f.close() when we are done. When we write to a file, the data is often not written to disk immediately; it is held in a memory buffer and written out later. Calling close() flushes any data that has not yet been written. If you forget to call close(), only part of the data may reach the disk and the rest may be lost. So, to be safe, use the with statement:

with open('test.txt', 'w') as f:
    f.write('Hello, world!')

The Python file object provides two "write" methods: write() and writelines(). Note that writelines() does not add line separators:

f1 = open('test1.txt', 'w')
f1.writelines(["1", "2", "3"])
f1.close()
# test1.txt now contains: 123

f1 = open('test1.txt', 'w')
f1.writelines(["1\n", "2\n", "3\n"])
f1.close()
# test1.txt now contains:
#    1
#    2
#    3
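
Because writelines() writes the strings exactly as given, the line endings have to be supplied by the caller. A minimal sketch, assuming a list of strings without trailing newlines (the names items and path are illustrative):

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'test1.txt')
items = ['1', '2', '3']

# Append '\n' to each item while writing; writelines() accepts any iterable of str.
with open(path, 'w') as f1:
    f1.writelines(item + '\n' for item in items)

with open(path, 'r') as f1:
    content = f1.read()
print(repr(content))
```

An equivalent single call would be f1.write('\n'.join(items) + '\n').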

Regarding the mode parameter of open():

'r': read

'w': write

'a': append

'r+' == r+w (read and write; if the file does not exist, FileNotFoundError is raised)

'w+' == w+r (read and write; the file is created if it does not exist and cleared if it does)

'a+' == a+r (append and read; the file is created if it does not exist)

Correspondingly, for a binary file, just add a b:

'rb'  'wb'  'ab'  'rb+'  'wb+'  'ab+'
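
The behaviour of the combined modes can be checked with a short experiment. This is a sketch under the assumption of a fresh temporary file; the variable names are illustrative.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'modes.txt')

# 'r+' requires the file to exist.
try:
    open(path, 'r+')
    r_plus_error = None
except FileNotFoundError as exc:
    r_plus_error = type(exc).__name__

# 'w+' creates (or clears) the file and also allows reading it back.
with open(path, 'w+') as f:
    f.write('abc')
    f.seek(0)
    after_w_plus = f.read()

# 'r+' keeps the content; writing starts at position 0 and overwrites in place.
with open(path, 'r+') as f:
    f.write('X')
    f.seek(0)
    after_r_plus = f.read()

# 'a+' always writes at the end; seek(0) is needed before reading from the start.
with open(path, 'a+') as f:
    f.write('!')
    f.seek(0)
    after_a_plus = f.read()

print(r_plus_error, after_w_plus, after_r_plus, after_a_plus)
```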

JSON

JSON (JavaScript Object Notation) is a lightweight data-interchange format. A JSON object looks much like a Python dictionary, and it can contain arrays enclosed in square brackets, which correspond to Python lists.

Python has dedicated modules for this kind of serialization: the json and pickle modules.

The json module provides four functions: dumps, dump, loads, load

The pickle module also provides four functions: dumps, dump, loads, load

  1. dumps and dump

dumps and dump are the serialization functions.

dumps only serializes the object to a str.

dump must be given a file object, and writes the serialized str to that file.

View source code:

def dumps(obj, skipkeys=False, ensure_ascii=True, check_circular=True,
          allow_nan=True, cls=None, indent=None, separators=None,
          default=None, sort_keys=False, **kw):
    # Serialize ``obj`` to a JSON formatted ``str``.
    # That is, convert the data in "obj" to a string in JSON format.

def dump(obj, fp, skipkeys=False, ensure_ascii=True, check_circular=True,
         allow_nan=True, cls=None, indent=None, separators=None,
         default=None, sort_keys=False, **kw):
    """Serialize ``obj`` as a JSON formatted stream to ``fp`` (a
    ``.write()``-supporting file-like object).
    I understand it as two actions: one converts "obj" into a string in JSON format, and the other writes that string to a file, which means that the file object fp is a required parameter.
    """
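
Several of the keyword parameters in the signatures above are easy to try out. The sketch below exercises sort_keys, separators, indent and ensure_ascii; the sample values are assumptions chosen for illustration.

```python
import json

person = {"name": "Tom", "age": 23}

# sort_keys orders the keys; separators removes the default spaces.
compact = json.dumps(person, separators=(',', ':'), sort_keys=True)

# indent pretty-prints with the given number of spaces per level.
pretty = json.dumps(person, indent=4)

# ensure_ascii=True (the default) escapes non-ASCII characters;
# ensure_ascii=False writes them as they are.
escaped = json.dumps({"name": "José"})
readable = json.dumps({"name": "José"}, ensure_ascii=False)

print(compact)
print(pretty)
print(escaped)
print(readable)
```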

Sample code:

>>> import json
>>> json.dumps([])    # dumps can serialize all the basic data types to strings
'[]'
>>> json.dumps(1)     # number
'1'
>>> json.dumps('1')   # string
'"1"'
>>> d = {"name": "Tom", "age": 23}
>>> json.dumps(d)     # dictionary
'{"name": "Tom", "age": 23}'

a = {"name": "Tom", "age": 23}
with open("test.json", "w", encoding='utf-8') as f:
    # indent is very handy: it pretty-prints the saved dictionary. The default is None
    # (most compact form); 0 or a negative value inserts newlines only.
    f.write(json.dumps(a, indent=4))
    # json.dump(a, f, indent=4)   # same effect as the line above

Contents of the saved file:

{
    "name": "Tom",
    "age": 23
}

  2. loads and load

loads and load are the deserialization functions.

loads only deserializes a str.

load accepts only a file object; it reads the file and deserializes its contents.

View source code:

# Note: the encoding parameter of loads appears in older Python 3 releases; it was removed in Python 3.9.
def loads(s, encoding=None, cls=None, object_hook=None, parse_float=None,
          parse_int=None, parse_constant=None, object_pairs_hook=None, **kw):
    """Deserialize ``s`` (a ``str`` instance containing a JSON document) to a Python object.
    That is, deserialize a str containing a JSON document into a Python object.
    """

def load(fp, cls=None, object_hook=None, parse_float=None, parse_int=None,
         parse_constant=None, object_pairs_hook=None, **kw):
    """Deserialize ``fp`` (a ``.read()``-supporting file-like object containing a JSON document) to a Python object.
    That is, deserialize a readable file containing JSON data into a Python object.
    """

Examples:

>>> json.loads('{"name":"Tom", "age":23}')
{'name': 'Tom', 'age': 23}

import json
with open("test.json", "r", encoding='utf-8') as f:
    aa = json.loads(f.read())
    f.seek(0)
    bb = json.load(f)    # same as json.loads(f.read())
print(aa)
print(bb)

# Output:
{'name': 'Tom', 'age': 23}
{'name': 'Tom', 'age': 23}
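
Putting the two halves together, a full round trip might look like the sketch below: dump to a file, load it back, and compare with the string-based dumps/loads pair. The record and the temporary file path are illustrative assumptions.

```python
import json
import os
import tempfile

record = {"name": "Tom", "age": 23, "tags": ["a", "b"]}
path = os.path.join(tempfile.mkdtemp(), 'test.json')

# dump: serialize and write to the file in one step.
with open(path, 'w', encoding='utf-8') as f:
    json.dump(record, f, indent=4)

# load: read the file and deserialize in one step.
with open(path, 'r', encoding='utf-8') as f:
    from_file = json.load(f)

# dumps/loads: the same conversion without any file involved.
from_string = json.loads(json.dumps(record))

print(from_file == record, from_string == record)
```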
  3. json and pickle modules

Both the json module and the pickle module have four functions: dumps, dump, loads, and load, and they are used in the same way.

The difference is that the json module serializes to a common format that other programming languages also recognize: an ordinary string.

Data serialized by the pickle module can only be read by Python; to other programming languages it is just unreadable bytes.

However, pickle can serialize functions. It stores a reference to the function rather than its code, so a program that loads the data must itself define a function with the same name (the name must match and the parameters should be the same; the body can differ).
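
The difference in what the two modules can serialize is easy to demonstrate with Python-specific types. This is a sketch with an assumed sample value containing a set, a tuple and bytes.

```python
import json
import pickle

value = {"ids": {1, 2, 3}, "point": (1.5, 2.5), "raw": b"\x00\x01"}

# pickle round-trips Python-specific types exactly; the result is bytes, not str.
blob = pickle.dumps(value)
restored = pickle.loads(blob)

# JSON has no set or bytes type, so json.dumps rejects the same value.
try:
    json.dumps(value)
    json_error = None
except TypeError as exc:
    json_error = type(exc).__name__

print(type(blob).__name__, restored == value, json_error)
```

One caution from the pickle documentation: only unpickle data you trust, because loading a pickle can execute arbitrary code.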

  4. The correspondence between Python objects (obj) and JSON objects

+-------------------+---------------+
| Python            | JSON          |
+===================+===============+
| dict              | object        |
+-------------------+---------------+
| list, tuple       | array         |
+-------------------+---------------+
| str               | string        |
+-------------------+---------------+
| int, float        | number        |
+-------------------+---------------+
| True              | true          |
+-------------------+---------------+
| False             | false         |
+-------------------+---------------+
| None              | null          |
+-------------------+---------------+
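
One consequence of this correspondence is that a round trip through JSON does not always give back the original Python types. A minimal sketch with an assumed sample dictionary:

```python
import json

original = {"t": (1, 2), "n": None, "ok": True, 1: "int key"}

text = json.dumps(original)
back = json.loads(text)

print(text)
print(back)
```

The tuple comes back as a list, and the integer key comes back as the string "1", because JSON arrays map to list and JSON object keys are always strings.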

Summary

  1. json serialization functions:

dumps: no file operation
dump: serialization + write to file

  2. json deserialization functions:

loads: no file operation
load: read file + deserialization

  3. Data serialized by the json module is more portable

Data serialized by the pickle module can only be used in Python, but pickle is more powerful and can serialize functions

  4. For the data types that the json module can serialize and deserialize, see the correspondence table of Python objects (obj) and JSON objects

  5. Use indent=4 to format the output when writing files

os.path

split

Splits a path and returns a tuple of its directory name and base name. From the docstring:
Split a pathname.  Returns tuple "(head, tail)" where "tail" is
everything after the final slash.  Either part may be empty.

>>> import os
>>> os.path.split("/tmp/f1.txt")
('/tmp', 'f1.txt')
>>> os.path.split("/home/test.sh")
('/home', 'test.sh')

splitext

Splits a file name and returns a tuple of the name without the extension and the extension. From the docstring:
Split the extension from a pathname.
Extension is everything from the last dot to the end, ignoring
leading dots.  Returns "(root, ext)"; ext may be empty.

>>> os.path.splitext("/home/test.sh")
('/home/test', '.sh')
>>> os.path.splitext("/tmp/f1.txt")
('/tmp/f1', '.txt')

Other common file and directory operations:

# Rename a file:
>>> os.rename('test.txt', 'test.py')
# Delete a file:
>>> os.remove('test.py')
# View the absolute path of the current directory:
>>> os.path.abspath('.')
'/Users/michael'
# To create a new directory inside a directory, first build the full path of the new directory:
>>> os.path.join('/Users/michael', 'testdir')
'/Users/michael/testdir'
# Then create the directory:
>>> os.mkdir('/Users/michael/testdir')
# Delete a directory:
>>> os.rmdir('/Users/michael/testdir')
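
The operations above can be combined into one runnable sequence. The sketch below works inside a temporary directory so that it does not touch real files; the directory and file names are illustrative assumptions.

```python
import os
import tempfile

base = tempfile.mkdtemp()

# Build the path of the new directory, then create it.
new_dir = os.path.join(base, 'testdir')
os.mkdir(new_dir)

# Create a file, then rename it by swapping its extension.
old_path = os.path.join(new_dir, 'test.txt')
with open(old_path, 'w') as f:
    f.write('Hello, world!')

root, ext = os.path.splitext(old_path)
new_path = root + '.py'
os.rename(old_path, new_path)
renamed_name = os.path.split(new_path)[1]

# Clean up: rmdir only removes empty directories, so delete the file first.
os.remove(new_path)
os.rmdir(new_dir)
os.rmdir(base)
still_exists = os.path.exists(new_dir)

print(ext, renamed_name, still_exists)
```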
