How to process excel table with python

openpyxl is a third-party library that can handle Excel files in xlsx format. pip install openpyxl installation.

Read Excel file

Need to import related functions

from openpyxl import load_workbook
# Read and write by default, you can specify write if necessary_only and read_only is True
wb =load_workbook('pythontab.xlsx')
The file opened by default is readable and writable. If necessary, you can specify the parameter read_only is True.

Get the worksheet--Sheet

# Get the names of all sheets
print(wb.get_sheet_names())
# Obtain sheet according to sheet name
a_sheet = wb.get_sheet_by_name('Sheet1')
# Get the sheet name
print(a_sheet.title)
# Get the currently displayed sheet,You can also use wb.get_active_sheet()
sheet = wb.active

Get cell

# Get the value of a cell, observe excel and find that it is also in the order of letters and then numbers, that is, columns first, then rows
b4 = sheet['B4']
# Return separately
print(f'({b4.column}, {b4.row}) is {b4.value}') #The returned number is int type
# In addition to obtaining subscripts, you can also use the cell function,Change to a number, this means B2
b4_too = sheet.cell(row=4, column=2)print(b4_too.value)

b4.column returns B, b4.row returns 4, and value is the value of that cell. In addition, the cell has an attribute coordinate, and the cell like b4 returns the coordinate B4.

Get the largest row and largest column

Get rows and columns

sheet.rows is the generator, which contains the data of each row, and each row is wrapped by a tuple.

sheet.columns is similar, but inside is each tuple is the cell of each column.

# Because of the line, so return to A1, B1,C1 this order
for row in sheet.rows:for cell in row:print(cell.value)
# A1, A2,A3 this order
for column in sheet.columns:for cell in column:print(cell.value)

The above code can get the data of all cells. What if you want to get the data of a certain row? Just give it an index, because sheet.rows is a generator type, and you cannot use an index. After converting to a list, you can use the index. List(sheet.rows)[2] will get the tuple object of the second row.

for cell inlist(sheet.rows)[2]:print(cell.value)

How to get cells in any range?

You can use the range function, the following writing method, to obtain all the cells in the rectangular area with A1 as the upper left corner and B3 as the lower right corner. Note that the range starts from 1, because in openpyxl, in order to be consistent with the expression in Excel, it is not consistent with the programming language that uses 0 to represent the first value.

for i inrange(1,4):for j inrange(1,3):print(sheet.cell(row=i, column=j))
# out
< Cell pythontab.A1 
< Cell pythontab.B1 
< Cell pythontab.A2 
< Cell pythontab.B2 
< Cell pythontab.A3 
< Cell pythontab.B3

It can also be used like slices. sheet['A1':'B3'] returns a tuple. The tuple is still a tuple. The cells in each row form a tuple.

for row_cell in sheet['A1':'B3']:for cell in row_cell:print(cell)for cell in sheet['A1':'B3']:print(cell)
# out(<Cell pythontab.A1 ,<Cell pythontab.B1 )(<Cell pythontab.A2 ,<Cell pythontab.B2 )(<Cell pythontab.A3 ,<Cell pythontab.B3 )

Get the column number according to the letter, return the letter according to the column number

Need to be imported, these two functions exist in openpyxl.utils

from openpyxl.utils import get_column_letter, column_index_from_string
# Return letters based on column numbers
print(get_column_letter(2)) # B
# Return the number of the column based on the letter
print(column_index_from_string('D')) # 4

Write data to Excel

Worksheet related

Need to import WorkBook

from openpyxl import Workbook
wb =Workbook()

This creates a new worksheet (but it has not been saved yet).

To specify the write-only mode, you can specify the parameter write_only=True. Generally, the default writable and readable mode is sufficient.

print(wb.get_sheet_names()) #Provide a table named Sheet by default, and create a new default Sheet1 under office2016
# Direct assignment can change the name of the worksheet
sheet.title ='Sheet1'
# Create a new worksheet, you can specify the index, arrange its position in the workbook appropriately
wb.create_sheet('Data', index=1) #Is arranged to the second worksheet, index=0 is the first position
# Delete a worksheet
wb.remove(sheet)
del wb[sheet]

Write to cell

You can also use the formula

# Just assign a value to the cell directly
sheet['A1']='good'
# Write average value at B9
sheet['B9']='=AVERAGE(B2:B8)'

But if you need to add data_only=True when reading, so that B9 will return a number, if you don’t add this parameter, the return will be the formula itself'=AVERAGE(B2:B8)'

append function

You can add multiple rows of data at a time, starting from the first blank row (the following are blank rows).

# Add a row
row =[1,2,3,4,5]
sheet.append(row)
# Add multiple lines
rows =[['Number','data1','data2'],[2,40,30],[3,40,25],[4,50,30],[5,30,10],[6,25,5],[7,50,10],]

Because the append function can only be written by line. What if we want to write in columns. Can append meet the demand? If the above list is nested as a matrix. Just transpose the matrix. It can be achieved using the zip() function, but the internal list becomes a tuple. They are all iterable objects and do not affect them.

list(zip(*rows))
# out
[(' Number',2,3,4,5,6,7),('data1',40,40,50,30,25,50),('data2',30,25,30,10,5,10)]

Explain the above list(zip(*rows)). First *rows breaks up the list, which is equivalent to filling in several parameters. zip extracts the first value from a list and combines it into a tuple, and then extracts each list from The second value is extracted from the list and combined into a tuple until the last value of the shortest list is extracted. The value after the longer list is discarded. In other words, the number of the last tuple is determined by each parameter. (Iterable object) determined by the shortest length. For example, if a value is deleted at random, the shortest list length is 2, and the values in the data2 column (looked vertically) are all discarded.

rows =[['Number','data1','data2'],[2,40],[3,40,25],[4,50,30],[5,30,10],[6,25,5],[7,50,10],]
# out
[(' Number',2,3,4,5,6,7),('data1',40,40,50,30,25,50)]

Finally, zip returns the zip object, and the data cannot be seen. Just use the list conversion. Using zip can facilitate writing data in columns.

Font

bold_itatic_24_font =Font(name='Contour', size=24, italic=True, color=colors.RED, bold=True)
sheet['A1'].font = bold_itatic_24_font

The above code specifies that the line 24 is bold and italic, and the font color is red. Use the font property of the cell directly to assign the Font object to it.

Alignment

Also directly use the cell attribute aligment, where vertical centering and horizontal centering are specified. In addition to center, parameters such as right and left can also be used.

# Set the data in B1 to be centered vertically and horizontally
sheet['B1'].alignment =Alignment(horizontal='center', vertical='center')

Set row height and column width

Sometimes the data is too long to display, you need to stretch the cell

# Row 2 height
sheet.row_dimensions[2].height =40
# C column width
sheet.column_dimensions['C'].width =30

Combine and split cells

The so-called merged cell means that the cell in the upper left corner of the merged area is used as the reference, and other cells are covered to make it a large cell.

On the contrary, after splitting the cell, the value of this large cell is returned to the original upper left corner position.

# Merge cells and write data to the upper left corner
sheet.merge_cells('B1:G1') #Combine several cells in a row
sheet.merge_cells('A1:C3') #Combine cells in a rectangular area

After merging, you can only write data to the upper left corner, which is the left coordinate in the interval.

If these cells to be merged have data, only the data in the upper left corner will be kept, and the others will be discarded. In other words, if the data is not written in the upper left corner before the merge, there will be no data in the cells after the merge.

Below is the code to split the cell. After splitting, the value returns to the A1 position.

sheet.unmerge_cells('A1:C3')

Example code extension:

# coding=utf-8import xlrd

# open a file
data = xlrd.open_workbook('file/demo.xlsx')

# View worksheet
data.sheet_names()print("sheets："+str(data.sheet_names()))

# Get worksheet by file name,Get worksheet 1
table = data.sheet_by_name('Work schedule 1')

# Print data.sheet_names()It can be found that the returned value is a list, and worksheet 1 is obtained by indexing the list.
# table = data.sheet_by_index(0)

# Get the number of rows and columns
# Number of rows: table.nrows
# Number of columns: table.ncols
print("Total number of rows:"+str(table.nrows))print("Total number of columns:"+str(table.ncols))

# Get the value of the entire row and the value of the entire column, the returned result is an array
# Whole row value: table.row_values(start,end)
# Entire column value: table.col_values(start,end)
# The parameter start is the number from which to start printing,
# end is the end of printing to that position, the default is none
print("Whole row value:"+str(table.row_values(0)))print("Entire column value:"+str(table.col_values(1)))

# Get the value of a cell, for example, get the value of cell B3
cel_B3 = table.cell(3,2).value
print("The value of the third row and second column:"+ cel_B3)

So far, this article on how to use python to process excel forms is introduced. For more detailed explanations of python processing excel forms, please search for ZaLou.Cn's previous articles or continue to browse related articles below. Hope you will support ZaLou more in the future. Cn!