I recently needed to use Python to parse a data packet sent from a C program. The packet format is defined by the following structure:
typedef struct msg_t
{
    int oid;
    int msg_len;
    char msg_data[0];
} MSG_T;
The length of the msg_data string is given by msg_len, so you need to parse the value of msg_len first and then read msg_len bytes of msg_data.
This can be done with Python's struct module. The Python parsing code for the structure above is as follows:
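import struct

# syncMsg holds the raw bytes of one MSG_T packet received from the C side.
# Read the fixed 8-byte header first to learn the payload length.
# Note: 'I' is unsigned int; 'i' would match a signed C int exactly.
OID, msgLen = struct.unpack('II', syncMsg[0:8])
# Build the full format string now that the payload length is known.
sFormat = 'II' + str(msgLen) + 's'
OID, msgLen, msgData = struct.unpack(sFormat, syncMsg)
msgData = msgData.decode()
# print("OID: ", OID, "\nMsgLen: ", msgLen, "\nMsgData: ", msgData)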
The core of the code is the format string passed to unpack, where I represents an unsigned int (use i for a signed int) and, for example, 128s represents a byte string of length 128. The code parses the length first, then builds the full format string from it, and finally unpacks the whole packet with that format.
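The two-step approach described above can be wrapped in a small helper. The following is a minimal sketch rather than the original code: the function name `parse_msg`, the use of `struct.unpack_from`, and the explicit little-endian `<` prefix are assumptions made for illustration. The actual byte order depends on the machine that produced the packet.

```python
import struct

# Assumption: oid and msg_len are signed 32-bit ints in little-endian order.
HEADER_FMT = '<ii'
HEADER_SIZE = struct.calcsize(HEADER_FMT)  # 8 bytes


def parse_msg(packet: bytes):
    """Parse one MSG_T packet: a fixed header followed by msg_len payload bytes."""
    if len(packet) < HEADER_SIZE:
        raise ValueError("packet shorter than header")
    # Step 1: read the fixed-size header to learn the payload length.
    oid, msg_len = struct.unpack_from(HEADER_FMT, packet, 0)
    if msg_len < 0 or HEADER_SIZE + msg_len > len(packet):
        raise ValueError("msg_len does not fit in packet")
    # Step 2: read exactly msg_len bytes that follow the header.
    (msg_data,) = struct.unpack_from(f'{msg_len}s', packet, HEADER_SIZE)
    return oid, msg_len, msg_data


# Build a sample packet with the same layout, then parse it back.
payload = b'hello'
sample = struct.pack(f'<ii{len(payload)}s', 7, len(payload), payload)
print(parse_msg(sample))  # (7, 5, b'hello')
```

Checking msg_len against the buffer size before the second unpack avoids a `struct.error` on truncated or corrupted packets.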
The format characters supported by struct are as follows (Python 3 types and standard sizes):
Format | C Type | Python Type | Standard Size (bytes) |
---|---|---|---|
x | pad byte | no value | 1 |
c | char | bytes of length 1 | 1 |
b | signed char | integer | 1 |
B | unsigned char | integer | 1 |
? | _Bool | bool | 1 |
h | short | integer | 2 |
H | unsigned short | integer | 2 |
i | int | integer | 4 |
I (uppercase i) | unsigned int | integer | 4 |
l (lowercase L) | long | integer | 4 |
L | unsigned long | integer | 4 |
q | long long | integer | 8 |
Q | unsigned long long | integer | 8 |
f | float | float | 4 |
d | double | float | 8 |
s | char[] | bytes | 1 × count |
p | char[] | bytes | 1 × count |
P | void * | integer | platform-dependent (native mode only) |
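The sizes in the table can be checked with `struct.calcsize`. This is a small sketch, assuming standard sizes are wanted: the `<` prefix selects little-endian byte order with standard sizes and no padding. Without a prefix, native sizes and alignment apply, so `l`, `L`, and `P` may differ between platforms.

```python
import struct

# With the '<' prefix, struct uses standard sizes, independent of the platform.
for code in 'cbB?hHiIlLqQfd':
    print(code, struct.calcsize('<' + code))

# A count before 's' gives the length of the byte string in bytes.
print(struct.calcsize('<128s'))  # 128

# 'P' is only available in native mode; its size follows the platform's pointer width.
print(struct.calcsize('P'))
```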
Author: Frytea
Title: Python parsing variable length structure
Link: https://blog.frytea.com/archives/453/
Copyright: This work by TL-Song is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.