Recently, when I was processing data, I encountered a .mat file, so I recorded my understanding and processing steps.
A .mat file is a data format commonly used in Matlab. Once loaded in Python, its contents resemble JSON-style key-value pairs:
{'__header__': b'MATLAB 5.0 MAT-file Platform: nt, Created on: Wed Sep 9 16:13:43 2020', '__version__': '1.0', '__globals__': [], 'key1': array([[0, 1]]), 'key2': array([[3]])}
In Python, scipy is needed to manipulate .mat files. First, if it is not installed, please execute:
pip install scipy
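Operation 1: Save a mat file
import scipy.io as sio
# The dict keys become the variable names stored in the mat file
data1 = {"key1": [0, 1], "key2": 3}
sio.savemat("save.mat", data1)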
After running this, a new "save.mat" file appears in the current folder.
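Operation 2: Read the mat file
import scipy.io as sio
data1 = {"key1": [0, 1], "key2": 3}
# sio.savemat("save.mat", data1)
# Load the file saved earlier and compare it with the original dict
data2 = sio.loadmat("save.mat")
print("data1: ", type(data1), data1)
print("data2: ", type(data2), data2)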
Output:
(ml) Y:\song\Codes\face_recall>python deal_data.py
data1:  <class 'dict'> {'key1': [0, 1], 'key2': 3}
data2:  <class 'dict'> {'__header__': b'MATLAB 5.0 MAT-file Platform: nt, Created on: Wed Sep 9 16:13:43 2020', '__version__': '1.0', '__globals__': [], 'key1': array([[0, 1]]), 'key2': array([[3]])}
As the output shows, the data read back from the mat file contains some automatically added information: header, version, and globals.
'__header__': b'MATLAB 5.0 MAT-file Platform: nt, Created on: Wed Sep 9 16:13:43 2020', '__version__': '1.0', '__globals__': []
The loaded data is still a dict. However, the original list and scalar have been converted to 2-D arrays of type numpy.ndarray: [0, 1] became array([[0, 1]]) and 3 became array([[3]]).
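If the extra metadata keys and the added dimensions get in the way, the following minimal sketch shows one way to get closer to the original values. It assumes the same two keys as above and writes to a temporary folder (an assumption made so the example is self-contained). It uses the squeeze_me option of loadmat, which drops dimensions of length 1, and filters out keys that start with a double underscore.

```python
import os
import tempfile

import scipy.io as sio

# Write the example data to a temporary folder (illustrative location)
path = os.path.join(tempfile.mkdtemp(), "save.mat")
sio.savemat(path, {"key1": [0, 1], "key2": 3})

# squeeze_me=True asks loadmat to drop dimensions of length 1
raw = sio.loadmat(path, squeeze_me=True)

# Keep only user variables, skipping "__header__", "__version__", "__globals__"
data = {k: v for k, v in raw.items() if not k.startswith("__")}

print(sorted(data))
print(data["key1"].tolist(), int(data["key2"]))
```

Whether squeezing is desirable depends on the data: it also collapses a genuine 1xN matrix into a 1-D array, so it is probably best used only when the shapes are known in advance.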
Operation 3: Modify the mat file
import scipy.io as sio
data1 = {"key1": [0, 1], "key2": 3}
# sio.savemat("save.mat", data1)
data2 = sio.loadmat("save.mat")
print("data1: ", type(data1), data1)
print("data2: ", type(data2), data2)
# Try to change one metadata entry and one ordinary variable
data2["__version__"] = "2.0"
data2["key2"] = 4
sio.savemat("save.mat", data2)
# Reload the file to see which changes were actually written
data3 = sio.loadmat("save.mat")
print("data3: ", type(data3), data3)
Output:
(ml) Y:\song\Codes\face_recall>python deal_data.py
data1:  <class 'dict'> {'key1': [0, 1], 'key2': 3}
data2:  <class 'dict'> {'__header__': b'MATLAB 5.0 MAT-file Platform: nt, Created on: Wed Sep 9 16:13:43 2020', '__version__': '1.0', '__globals__': [], 'key1': array([[0, 1]]), 'key2': array([[3]])}
data3:  <class 'dict'> {'__header__': b'MATLAB 5.0 MAT-file Platform: nt, Created on: Wed Sep 9 16:47:59 2020', '__version__': '1.0', '__globals__': [], 'key1': array([[0, 1]]), 'key2': array([[4]])}
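The output shows that data2["key2"] has been changed to 4, while the "__version__" entry is still '1.0'. My understanding is that keys of the form "__*__" are built-in information, similar to private variables: savemat skips names that begin with an underscore and writes a fresh header on each save, which is also why the creation time in "__header__" changed.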
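The behaviour described above can be checked with a small sketch. It saves a dict containing a "__version__" key, then lists the variables actually stored in the file with sio.whosmat. The temporary path and the warnings filter are assumptions added for illustration; some SciPy versions may emit a warning about the skipped key, so it is silenced here.

```python
import os
import tempfile
import warnings

import scipy.io as sio

# Temporary location so the example does not touch an existing save.mat
path = os.path.join(tempfile.mkdtemp(), "save.mat")

with warnings.catch_warnings():
    # Some SciPy versions may warn about names starting with an underscore
    warnings.simplefilter("ignore")
    sio.savemat(path, {"__version__": "2.0", "key2": 4})

# whosmat lists (name, shape, class) for each variable stored in the file
variables = [name for name, shape, dtype in sio.whosmat(path)]
loaded = sio.loadmat(path)

print(variables)              # only the ordinary variable is stored
print(loaded["__version__"])  # metadata comes from the file header
```

If a version label of your own needs to be stored, using an ordinary key name without a leading underscore is probably the simplest option.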
As the header in the output above shows, scipy saves data in the 'MATLAB 5.0' format. The v7.3 .mat format, which Matlab uses for saving large files, is based on HDF5 and cannot be read with the method above. In that case you need h5py, which can be installed as follows:
pip install h5py
Usage:
import h5py
# Open the v7.3 file in read-only mode
data = h5py.File('data.mat', 'r')
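Once the file is open, it behaves much like a dict of datasets. The sketch below shows how variable names can be listed and one variable read into a numpy array. Since no real v7.3 file is available here, it first writes a plain HDF5 file with h5py as a stand-in; the file contents and the "key1" name are assumptions for illustration, and a real file would come from Matlab's save with the '-v7.3' option.

```python
import os
import tempfile

import h5py
import numpy as np

path = os.path.join(tempfile.mkdtemp(), "data.mat")

# Stand-in for a v7.3 file: a plain HDF5 file written by h5py (illustration only)
with h5py.File(path, "w") as f:
    f.create_dataset("key1", data=np.array([[0.0], [1.0]]))

# Open read-only; the context manager closes the file afterwards
with h5py.File(path, "r") as f:
    names = list(f.keys())   # variable names stored in the file
    key1 = f["key1"][()]     # read the whole dataset into a numpy array

print(names)
print(key1.shape)
```

With files written by Matlab itself, arrays read this way generally appear with their dimensions reversed, because Matlab stores data in column-major order, so a transpose may be needed.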