Python IO

File opening and closing##

Files are opened and closed with two functions: the built-in open function and the close method of the returned file object.

Prototype of open function

open(file, mode='r', buffering=-1, encoding=None, errors=None, newline=None, closefd=True, opener=None)

The open function returns a file-like object, but the type of that object is not fixed; it depends on the open mode.

  1. When the file is opened in text mode ('w', 'r', 'wt', 'rt', etc.), a TextIOWrapper is returned.
  2. When the file is opened in binary mode, the returned object depends on the access mode.
  3. In binary read mode, a BufferedReader is returned.
  4. In binary write mode and binary append mode, a BufferedWriter is returned.
  5. In binary read/write mode, a BufferedRandom is returned.

In [1]: f = open('./')	#Open directly with the open function; if the file does not exist, FileNotFoundError is raised
FileNotFoundError                         Traceback (most recent call last)
<ipython-input-1-b6df97277b77> in <module>()
----> 1 f = open('./')

FileNotFoundError: [Errno 2] No such file or directory: './'

In [2]: f = open('./')	#After the file has been created, open succeeds and returns a file-like object

In [3]: f.read()	#Read the entire contents of the file
Out[3]: "#!/usr/bin/env python\n# coding=utf-8\nprint('hello world')\n"

In [4]: f.close()	#Close the file
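
The mapping from open mode to returned type described above can be checked with a short script. This is a minimal sketch, assuming a scratch file named demo.txt in a temporary directory (a hypothetical name, not the file used in the session above); the helper opened_type_name is also introduced only for illustration.

```python
import os
import tempfile


def opened_type_name(path, mode):
    """Open path with the given mode and return the class name of the file object."""
    with open(path, mode) as f:
        return type(f).__name__


with tempfile.TemporaryDirectory() as tmp:
    # demo.txt is a hypothetical scratch file used only for this sketch.
    path = os.path.join(tmp, "demo.txt")
    with open(path, "w") as f:
        f.write("hello\n")

    # Text read, binary read, binary append, binary read/write.
    type_by_mode = {
        mode: opened_type_name(path, mode) for mode in ("r", "rb", "ab", "r+b")
    }

for mode, name in type_by_mode.items():
    print(mode, name)
```

The modes are chosen so that none of them truncates the scratch file before the next one opens it.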

File read and write##

File reading and writing are done mainly with read, write and their variants. Which operations are allowed depends on the mode parameter of the open function.

The mode parameter of the open function###

The specific meaning of mode is as follows

  1. When mode='x', the file is created for writing; if the file already exists, FileExistsError is raised.
  2. When mode='w', the file is truncated as soon as it is opened, even if nothing is written.
  3. When the mode contains +, the missing operation is added: a read-only mode also becomes writable, and a write-only mode also becomes readable. + does not change any other behavior.
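
The three behaviors listed above can be observed directly. The following is a minimal sketch, assuming a hypothetical scratch file demo.txt inside a temporary directory; the variable names are illustrative only.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.txt")  # hypothetical scratch file

    # 'x' creates the file, and fails if it already exists.
    with open(path, "x") as f:
        f.write("first")
    try:
        open(path, "x")
        x_raised = False
    except FileExistsError:
        x_raised = True

    # 'w' truncates on open, even when nothing is written.
    open(path, "w").close()
    size_after_w = os.path.getsize(path)

    # '+' adds the missing operation: 'w+' can also read.
    with open(path, "w+") as f:
        f.write("abc")
        f.seek(0)  # move the pointer back to the start before reading
        read_back = f.read()

print(x_raised, size_after_w, read_back)
```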


In [1]: f = open('./', mode='rt')	#mode=t: the content read is a string

In [2]: s = f.read()

In [3]: s
Out[3]: "#!/usr/bin/env python\n# coding=utf-8\nprint('hello world')\n"

In [4]: type(s)	#s is of type str
Out[4]: str

In [5]: f.close()

In [6]: f = open('./', mode='rb')	#mode=b: the content read is bytes

In [7]: s = f.read()

In [8]: s
Out[8]: b"#!/usr/bin/env python\n# coding=utf-8\nprint('hello world')\n"

In [9]: type(s)
Out[9]: bytes
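
The difference between text mode and binary mode comes down to decoding: text mode decodes the bytes on disk into str, while binary mode returns them unchanged. A minimal sketch, assuming a hypothetical scratch file and UTF-8 encoding (newline='' is passed on write so that no newline translation happens on any platform):

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.txt")  # hypothetical scratch file

    # newline='' disables newline translation so the bytes on disk are predictable.
    with open(path, "w", encoding="utf-8", newline="") as f:
        f.write("héllo\n")

    with open(path, "rt", encoding="utf-8") as f:
        text = f.read()  # str

    with open(path, "rb") as f:
        raw = f.read()  # bytes

# 'é' takes two bytes in UTF-8, so the lengths differ.
print(type(text).__name__, len(text))
print(type(raw).__name__, len(raw))
print(raw.decode("utf-8") == text)
```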

File pointer###

When a file is opened, the interpreter holds a pointer to a position in the file. Reads and writes always start at the pointer and move it forward, toward the end of the file. When mode='r', the pointer starts at 0 (start of file); when mode='a', the pointer starts at EOF (end of file).

The two functions related to the file pointer are the tell function and the seek function.

tell function

Returns the current position of the stream. For a file, this is the position of the file pointer.

seek function

Change the position of the file stream and return the new absolute position.

seek(cookie, whence=0,/) method of _io.TextIOWrapper instance

Summary of file pointers

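Seeking past the end of the file raises no exception, and tell then reports a position beyond the end of the file. In append mode, written data still goes to the end of the file, so the write effectively starts at min(EOF, tell()).

In other writable modes the write starts at the pointer, and on POSIX systems the gap between the old end of the file and the pointer reads back as null bytes.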
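
A minimal sketch of tell and seek on a hypothetical scratch file, including a seek past the end of the file and a write in append mode after seeking back to the start:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.bin")  # hypothetical scratch file
    with open(path, "wb") as f:
        f.write(b"abcdef")

    with open(path, "rb") as f:
        start = f.tell()         # pointer starts at 0 in read mode
        f.read(2)
        after_read = f.tell()    # reading moves the pointer forward
        new_pos = f.seek(4)      # seek returns the new absolute position
        rest = f.read()
        past_end = f.seek(100)   # no exception when seeking past EOF
        nothing = f.read()       # nothing left to read there

    with open(path, "ab") as f:
        f.seek(0)                # move the pointer to the start ...
        f.write(b"XY")           # ... the data is still appended at the end

    with open(path, "rb") as f:
        appended = f.read()

print(start, after_read, new_pos, rest, past_end, nothing, appended)
```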

File buffer###

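The file buffer is controlled by the buffering parameter of the open function, which selects the buffering policy. The default value is -1, which means that both text mode and binary mode use the default buffering policy. A value of 0 turns buffering off (binary mode only), 1 selects line buffering (text mode only), and an integer greater than 1 sets the buffer size in bytes.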
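
The effect of the buffer can be seen by checking the file size on disk before and after a flush. This is a minimal sketch with hypothetical scratch files; it assumes the default buffer is larger than the five bytes written, which holds for the usual default buffer sizes.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    buffered_path = os.path.join(tmp, "buffered.bin")  # hypothetical scratch file
    raw_path = os.path.join(tmp, "raw.bin")            # hypothetical scratch file

    # Default buffering: a small write stays in the buffer until flush or close.
    with open(buffered_path, "wb") as f:
        f.write(b"hello")
        size_before_flush = os.path.getsize(buffered_path)
        f.flush()
        size_after_flush = os.path.getsize(buffered_path)

    # buffering=0 (binary mode only): open returns a raw FileIO, writes go straight through.
    with open(raw_path, "wb", buffering=0) as f:
        raw_type = type(f).__name__
        f.write(b"hello")
        raw_size = os.path.getsize(raw_path)

print(size_before_flush, size_after_flush, raw_type, raw_size)
```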






Context management##

A context manager automatically closes the file when the with block exits, but the with statement does not open a new scope, so the variable remains visible afterwards.

In [1]: with open('./') as f:
   ...:     pass
   ...:

In [2]: f.readable()	#After leaving the with block the file is closed, and no further I/O is possible
ValueError                                Traceback (most recent call last)
<ipython-input-18-97a5eee249a2> in <module>()
----> 1 f.readable()

ValueError: I/O operation on closed file

In [3]: f
Out[3]: <_io.TextIOWrapper name='./' mode='r' encoding='UTF-8'>

In [4]: f.closed	#f is closed
Out[4]: True

In addition to with open('./') as f:, context management can also be written by opening the file first and then using the file object in the with statement:

In [21]: f = open('./')

In [22]: with f:
    ...:     pass
    ...:
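
For comparison, the with statement behaves roughly like a try/finally that calls close. A minimal sketch with a hypothetical scratch file, showing both forms and that the variable is still visible after the with block:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.txt")  # hypothetical scratch file

    # Form 1: the with statement closes the file on exit.
    with open(path, "w") as f:
        f.write("hello")
    closed_after_with = f.closed  # f is still in scope, but closed

    # Form 2: roughly equivalent manual cleanup with try/finally.
    g = open(path)
    try:
        content = g.read()
    finally:
        g.close()
    closed_after_finally = g.closed

print(closed_after_with, content, closed_after_finally)
```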

File-like object##

Objects like the ones returned by open(), which provide a read() method, are collectively called file-like objects in Python. Besides files on disk, a file-like object can be an in-memory byte stream, a network stream, a custom stream, and so on. Common ones are StringIO and BytesIO.


StringIO, as its name implies, reads and writes str in memory.

To write str to StringIO, first create a StringIO object, then write to it and read from it like a file. StringIO supports most of the operations that a file supports.

In [1]: from io import StringIO

In [2]: help(StringIO)

In [3]: sio = StringIO()	#Create a StringIO object; a str can also be used to initialize it

In [4]: sio.write('hello world')

In [5]: sio.write(' !')

In [6]: sio.getvalue()	#The getvalue() method returns the str written so far
Out[6]: 'hello world !'

In [7]: sio.closed
Out[7]: False

In [8]: sio.readline()	#The pointer is at the end after writing, so nothing is read

In [9]: sio.seekable()
Out[9]: True

In [10]: sio.seek(0,0)	#Seek is supported

In [11]: sio.readline()
Out[11]: 'hello world !'

To read StringIO, you can initialize StringIO with a str, and then read it like a file:

In [1]: from io import StringIO

In [2]: sio = StringIO('I\nlove\npython!')

In [3]: for line in sio.readlines():
   ...:     print(line.strip())
   ...:


StringIO can only operate on str. To manipulate binary data, use BytesIO.

BytesIO reads and writes bytes in memory. Create a BytesIO and then write some bytes:

In [1]: from io import BytesIO

In [2]: bio = BytesIO()

In [3]: bio.write(b'abcd')

In [4]: bio.seek(0)

In [5]: bio.read()
Out[5]: b'abcd'

In [6]: bio.getvalue()	#getvalue returns the entire contents at once, no matter where the file pointer is
Out[6]: b'abcd'

Similar to StringIO, BytesIO can be initialized with a bytes object and then read like a file:

In [1]: from io import BytesIO

In [2]: bio = BytesIO(b'abcd')

In [3]: bio.read()
Out[3]: b'abcd'
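
The practical benefit of file-like objects is that code written against the file interface works with any of them. A minimal sketch, assuming a hypothetical helper count_lines that only relies on the stream being iterable line by line:

```python
import io


def count_lines(stream):
    """Count lines in any file-like object that can be iterated line by line."""
    return sum(1 for _ in stream)


# The same function accepts an in-memory text stream and an in-memory byte stream.
text_count = count_lines(io.StringIO("I\nlove\npython!"))
bytes_count = count_lines(io.BytesIO(b"a\nb\n"))

print(text_count, bytes_count)
```

The same function could be passed an object returned by open, since it does not depend on where the data comes from.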

Path manipulation: pathlib##

There are two ways to manipulate paths: os.path and pathlib.

pathlib has been part of the standard library since Python 3.4. To use pathlib in Python 2.7, it has to be installed first:

pip install pathlib

For the source code of the pathlib module, see: Lib/

Directory operations###

The basic entry point for directory operations in pathlib is the Path class.

In [1]: import pathlib	#Import the pathlib module

In [2]: cwd = pathlib.Path('.')	#Initialize the current path with the Path class; the argument may be a str or a PurePath

In [3]: cwd	#The value is a PosixPath; in a Windows environment it is a WindowsPath

With help(pathlib.Path), you can view the methods of the Path class.

Help on class Path in module pathlib:

class Path(PurePath)
 |  PurePath represents a filesystem path and offers operations which
 |  don't imply any actual filesystem I/O.  Depending on your system,
 |  instantiating a PurePath will return either a PurePosixPath or a
 |  PureWindowsPath object.  You can also instantiate either of these classes
 |  directly, regardless of your system.
 |
 |  Method resolution order:
 |      Path
 |      PurePath
 |      builtins.object
 |
 |  Methods defined here:
 |
 |  __enter__(self)
 |
 |  __exit__(self, t, v, tb)
 |  ...

Several functions for directory operations:

Examples of use are as follows

In [4]: cwd.is_dir()
Out[4]: True

In [5]: cwd.iterdir()	#The iterdir function returns a generator
Out[5]:<generator object Path.iterdir at 0x7f6727d926d0>

In [6]: for f in cwd.iterdir():	#Does not yield '.' and '..'
   ...:     print(type(f))
   ...:     print(f)
   ...:
<class 'pathlib.PosixPath'>
<class 'pathlib.PosixPath'>

In [7]: cwd.mkdir('abc')	#mkdir in pathlib is a method of the path object; its first parameter is mode, not a name
TypeError                                 Traceback (most recent call last)
<ipython-input-7-3b48dd61eb0f> in <module>()
----> 1 cwd.mkdir('abc')

/home/clg/.pyenv/versions/3.5.2/lib/python3.5/ in mkdir(self, mode, parents, exist_ok)
   1212         if not parents:
   1213             try:
-> 1214                 self._accessor.mkdir(self, mode)
   1215             except FileExistsError:
   1216                 if not exist_ok or not self.is_dir():

/home/clg/.pyenv/versions/3.5.2/lib/python3.5/ in wrapped(pathobj, *args)
    369         @functools.wraps(strfunc)
    370         def wrapped(pathobj, *args):
--> 371             return strfunc(str(pathobj), *args)
    372         return staticmethod(wrapped)
    373 

TypeError: an integer is required (got type str)

In [8]: d = pathlib.Path('./abc')

In [9]: d.exists()
Out[9]: False

In [10]: d.mkdir(755)	#Creates the directory, but decimal 755 is not equal to octal 0o755

In [11]: %ls abc/

In [12]: %ls -ld ./abc
d-wxrw---t. 2 clg clg 6 Feb 13 21:01 ./abc/	#The mode was specified incorrectly, so the permissions are not the intended ones

In [13]: d.rmdir()

In [14]: d.exists()
Out[14]: False

In [15]: d.mkdir(0o755)	#Specify the mode in octal

In [16]: %ls -ld ./abc
drwxr-xr-x. 2 clg clg 6 Feb 13 21:03 ./abc/
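
The directory operations above can be combined into a short script. This is a minimal sketch that works inside a temporary directory; the directory names abc and def are hypothetical, and the parents and exist_ok arguments are the ones visible in the mkdir signature in the traceback above.

```python
import pathlib
import tempfile

# Decimal 755 and octal 0o755 are different numbers.
decimal_as_octal = oct(755)
octal_as_decimal = 0o755

with tempfile.TemporaryDirectory() as tmp:
    base = pathlib.Path(tmp)
    nested = base / "abc" / "def"  # hypothetical directory names

    # parents=True creates missing parent directories as well.
    nested.mkdir(mode=0o755, parents=True, exist_ok=True)
    # exist_ok=True makes a second call a no-op instead of an error.
    nested.mkdir(parents=True, exist_ok=True)

    created = nested.is_dir()
    children = sorted(p.name for p in base.iterdir())

    nested.rmdir()  # rmdir only removes an empty directory
    removed = not nested.exists()

print(decimal_as_octal, octal_as_decimal, created, children, removed)
```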

General operations###

Mainly general operations of some paths

In [17]: f = pathlib.Path('./ab/cd/a.txt')

In [18]: f.exists()
Out[18]: False

In [19]: f.is_file()
Out[19]: False

In [20]: f.is_absolute()
Out[20]: False

In [21]: f = pathlib.Path('./')

In [22]: f.is_file()
Out[22]: True

In [23]: f.is_absolute()
Out[23]: False

In [24]: f.absolute()	#Get the absolute path of the path

In [25]: f.chmod(0o755)	#Change the permissions of the path

In [26]: %ls -ld ./
-rwxr-xr-x. 1 clg clg 58 Feb  8 13:32 ./*

In [27]: f.cwd()	#Return a new path pointing to the current working directory

In [28]: f.home()	#Return a new path pointing to the user's home directory

In [29]: pathlib.Path('~').expanduser()	#Expand ~ into the absolute path of the home directory

In [30]: f.name()	#name is an attribute, not a method
TypeError                                 Traceback (most recent call last)
<ipython-input-30-f0ea48ccc8ff> in <module>()
----> 1 f.name()

TypeError: 'str' object is not callable

In [31]: f.name	#Get the base name (basename)

In [32]: f.home().name

In [33]: f.owner()	#Get owner

In [34]: f.home().parent

In [35]:

In [36]: f.absolute().parts	#Get the components of the path

In [37]: f.root	#Get the root; for a relative path such as './' the result is an empty string

In [38]: f.home().root	#Get the root directory

In [39]: f.suffix	#Get the suffix

In [40]: f.stat()	#Similar to os.stat(); returns information about the path
Out[40]: os.stat_result(st_mode=33261, st_ino=34951327, st_dev=64768, st_nlink=1, st_uid=1000, st_gid=1000, st_size=58, st_atime=1486531928, st_mtime=1486531926, st_ctime=1486995977)

In [41]: f.stat().st_mode	#Access each field of the stat() result with '.'

In [42]: d = pathlib.Path('..')

In [43]: for x in d.glob(*.py):	#The argument of glob and rglob is a pattern string
  File "<ipython-input-43-3fdfb8e408ac>", line 1
    for x in d.glob(*.py):
                     ^
SyntaxError: invalid syntax

In [44]: for x in d.glob('*.py'):	#Return the files in the current path that match the pattern
    ...:     print(x)
    ...:
../
../
../
../

In [45]: for x in d.rglob('*.py'):	#Return the matching files under the current path and its subdirectories (recursively)
    ...:     print(x)
    ...:
../
../
../
../
../subworkspace/
../subworkspace/
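
The attributes and the two glob variants shown above can be tried without touching real files. This is a minimal sketch: the path /home/user/project/hello.py and the file names a.py and b.py are hypothetical, and PurePosixPath is used for the attribute part so that the results do not depend on the operating system.

```python
import tempfile
from pathlib import Path, PurePosixPath

# Pure paths only manipulate the string; no filesystem access is involved.
p = PurePosixPath("/home/user/project/hello.py")  # hypothetical path
name, stem, suffix = p.name, p.stem, p.suffix
parent = str(p.parent)
parts = p.parts
absolute_root = p.root

# A relative path has no root, so root is an empty string.
relative_root = PurePosixPath("./hello.py").root

with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    (base / "sub").mkdir()
    (base / "a.py").touch()          # hypothetical file names
    (base / "sub" / "b.py").touch()

    # glob matches only in the directory itself; rglob also descends into subdirectories.
    top_level = sorted(x.name for x in base.glob("*.py"))
    recursive = sorted(x.name for x in base.rglob("*.py"))

print(name, stem, suffix, parent, parts, repr(absolute_root), repr(relative_root))
print(top_level, recursive)
```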

File Copy, Move and Delete###

Use the shutil module

import shutil
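
A minimal sketch of the three operations with shutil, run inside a temporary directory. The file and directory names are hypothetical; shutil.copy, shutil.move and shutil.rmtree are the standard-library functions for copying a file, moving a file or directory, and deleting a directory tree.

```python
import shutil
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    src = base / "src.txt"  # hypothetical file names throughout
    src.write_text("hello")

    # Copy: the source stays in place, the destination path is returned.
    copied = Path(shutil.copy(str(src), str(base / "copy.txt")))
    copy_text = copied.read_text()

    # Move: the source disappears, the destination path is returned.
    moved = Path(shutil.move(str(copied), str(base / "moved.txt")))
    copy_gone = not copied.exists()
    moved_text = moved.read_text()

    # Delete: rmtree removes a directory together with its contents.
    (base / "tree" / "leaf").mkdir(parents=True)
    shutil.rmtree(str(base / "tree"))
    tree_gone = not (base / "tree").exists()

print(copy_text, copy_gone, moved_text, tree_gone)
```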

Serialization and Deserialization##

Python private protocol pickle###

pickle is a Python-specific serialization protocol.

See the pickle source code: lib/python3.5/

Main function

In [1]: import pickle

In [2]: class A:	#Declare a class A
   ...:     def print(self):
   ...:         print('aaaa')
   ...:

In [3]: a = A()	#Create an object a of class A

In [4]: pickle.dumps(a)	#Serialize the object to bytes
Out[4]: b'\x80\x03c__main__\nA\nq\x00)\x81q\x01.'

In [5]: b = pickle.dumps(a)

In [6]: pickle.loads(b)	#Deserialize the bytes back into an object
Out[6]: <__main__.A at 0x7f5dcdc71dd8>

In [7]: a
Out[7]: <__main__.A at 0x7f5dcdd28be0>	#The two objects have different addresses, but their contents are the same

In [8]: aa = pickle.loads(b)

In [9]: a.print()	#The print method of the original object

In [10]: aa.print()	#The print method of the deserialized object
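
Besides dumps and loads, which work on bytes, pickle also provides dump and load, which work on a file-like object. A minimal sketch using built-in types and an in-memory BytesIO stream; the sample data is made up for illustration.

```python
import io
import pickle

# Sample data made up for this sketch; tuple and set are not representable in JSON.
data = {"name": "a", "values": (1, 2, 3), "tags": {"x", "y"}}

# dumps/loads: object <-> bytes
blob = pickle.dumps(data)
restored = pickle.loads(blob)
same_content = restored == data   # equal contents
same_object = restored is data    # but a distinct object

# dump/load: object <-> binary file-like object
buf = io.BytesIO()
pickle.dump(data, buf)
buf.seek(0)  # rewind before reading back
from_stream = pickle.load(buf)

print(type(blob).__name__, same_content, same_object, from_stream == data)
```

Unpickling can execute arbitrary code, so pickle.load and pickle.loads should only be used on data from a trusted source.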

General json protocol###

The data types supported by the JSON format are as follows

Type        Description
Number      Double-precision floating-point format, as in JavaScript; corresponds to int or float
String      Double-quoted Unicode text with backslash escaping; corresponds to str
Boolean     true or false; corresponds to True or False
Array       An ordered sequence of values; corresponds to list
Value       A string, a number, true or false, null, an array or an object
Object      An unordered collection of key-value pairs; corresponds to dict in Python
Whitespace  May be used between any pair of tokens
null        Empty; corresponds to None

Examples of use are as follows

In [1]: import json

In [2]: d = {'a':1,'b':[1,2,3]}

In [3]: json.dumps(d)
Out[3]: '{"a": 1, "b": [1, 2, 3]}'

In [4]: json.loads('{"a": 1, "b": [1, 2, 3]}')
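
The type mapping has consequences for round trips: a value that has no JSON equivalent is converted to the nearest JSON type and does not come back as the original Python type. A minimal sketch with made-up sample data, also showing json.dump and json.load on a file-like object:

```python
import io
import json

# Sample data made up for this sketch.
d = {"a": 1, "b": [1, 2, 3], "c": (4, 5), "d": None, "e": True}

text = json.dumps(d, sort_keys=True)
back = json.loads(text)

# The tuple was written as a JSON array, so it comes back as a list.
tuple_became_list = back["c"] == [4, 5] and isinstance(back["c"], list)

# dump/load work on text file-like objects.
buf = io.StringIO()
json.dump(d, buf)
buf.seek(0)  # rewind before reading back
from_stream = json.load(buf)

print(text)
print(tuple_became_list, from_stream == back)
```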

json reference: JSON data format

