Python data analysis-data update

When analyzing large data sets, you often need to add rows and columns to a data frame, or delete some of them.

This is the fifth lesson in the data analysis series. It shows how to update a data frame in Python with pandas.

Contents of this article

  1. Append a row to the end of the data frame
  2. Insert a column in the data frame
  3. Delete rows in the data frame
  4. Delete columns in the data frame
  5. Delete rows that meet certain conditions

Note: This article uses the data frame date_frame created in the first lesson of the series [Python data analysis-data creation]. Its columns are ID, name, gender, age, and height.

**1 Append a row to the end of the data frame**

To add a row to the original data frame, first define a dictionary that describes the row:

new_row = {'ID': ['1000009'], 'name': ['Tang Poems'], 'gender': ['Female'], 'age': [21], 'height': [1.68]}

Note: The keys and value types should match the columns of the original data frame.

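Older versions of pandas provided an append method for this step (date_frame.append(new_row1)), but it was deprecated in pandas 1.4 and removed in pandas 2.0. Convert the dictionary to a data frame and use pd.concat instead:

new_row1 = pd.DataFrame(new_row)
pd.concat([date_frame, new_row1], ignore_index=True)  # ignore_index=True renumbers the rows

The result is a new data frame with the new row at the end. date_frame itself is not modified; assign the result to a variable if you want to keep it.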
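As a minimal sketch of the append step, the example below builds a small stand-in for date_frame and appends the new row with pd.concat. The three sample rows and the variable name updated are assumptions made for illustration; the real date_frame comes from the first lesson.

```python
import pandas as pd

# Hypothetical stand-in for date_frame; the real one is built in lesson 1.
date_frame = pd.DataFrame({
    'ID': ['1000001', '1000002', '1000003'],
    'name': ['Zhang San', 'Li Si', 'Wang Wu'],
    'gender': ['Male', 'Female', 'Male'],
    'age': [18, 19, 20],
    'height': [1.75, 1.62, 1.80],
})

# The new row, written as a one-row data frame with matching columns.
new_row = {'ID': ['1000009'], 'name': ['Tang Poems'], 'gender': ['Female'],
           'age': [21], 'height': [1.68]}
new_row1 = pd.DataFrame(new_row)

# pd.concat returns a new data frame; ignore_index=True renumbers the rows
# so the appended row does not reuse index label 0.
updated = pd.concat([date_frame, new_row1], ignore_index=True)

print(updated.tail(2))
print(len(date_frame), len(updated))  # the original frame keeps its length
```

Because pd.concat does not change its inputs, write date_frame = pd.concat(...) if the appended row should be kept under the original name.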

**2 Insert a column in the data frame**

Just as you can add rows to a data frame, you can also add columns. The insert method adds a column at any position in the data frame.

For example, to insert a new column as the first column of the data frame, run the following statement:

date_frame.insert(0, 'class', ['class1', 'class1', 'class1', 'class1', 'class2', 'class2', 'class2', 'class2', 'class2'])

After this statement, class is the first column of date_frame. Unlike the other operations in this article, insert modifies the data frame in place.

Here, the 0 passed to insert is the position of the new column, 'class' is the name of the new column, and ['class1', ..., 'class2'] is its content. Note that the new column must have the same length as the original data frame.
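The following sketch shows the in-place behavior of insert and what happens when the new column has the wrong length. The three-row stand-in frame, the column name 'class' with its values, and the second column name 'group' are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical stand-in for date_frame; the real one is built in lesson 1.
date_frame = pd.DataFrame({
    'ID': ['1000001', '1000002', '1000003'],
    'name': ['Zhang San', 'Li Si', 'Wang Wu'],
    'gender': ['Male', 'Female', 'Male'],
    'age': [18, 19, 20],
    'height': [1.75, 1.62, 1.80],
})

# insert(loc, column, value) modifies date_frame in place and returns None.
result = date_frame.insert(0, 'class', ['class1', 'class1', 'class2'])

print(result)                      # None, because the change is in place
print(date_frame.columns.tolist())

# A list whose length differs from the number of rows raises a ValueError.
length_error_raised = False
try:
    date_frame.insert(1, 'group', ['a', 'b'])  # only 2 values for 3 rows
except ValueError as err:
    length_error_raised = True
    print('insert failed:', err)
```

Since insert returns None, avoid writing date_frame = date_frame.insert(...), which would replace the data frame with None.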

Just as you can add rows and columns to a data frame, you can also delete them. Start with deleting rows.

**3 Delete rows in the data frame**

The drop method deletes one or more rows.

First, here is the code to delete the first row:

date_frame.drop([0])

drop([0]) returns a new data frame without the row whose index label is 0 (the first row here). To delete a different row, change 0 to the index label of that row. date_frame itself is unchanged unless you assign the result back to it.

Next, here is the code to delete the first and fifth rows:

date_frame.drop([0,4])

drop([0, 4]) returns a new data frame without the rows whose index labels are 0 and 4, which here are the first and fifth rows.

To delete more rows, add their index labels to the list in the same way.
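The row deletions above can be sketched as follows. The five-row stand-in frame and the result variable names are assumptions made for illustration. The sketch also shows that drop leaves the remaining index labels as they were, and that reset_index can renumber them if needed.

```python
import pandas as pd

# Hypothetical stand-in for date_frame; the real one is built in lesson 1.
date_frame = pd.DataFrame({
    'ID': ['1000001', '1000002', '1000003', '1000004', '1000005'],
    'name': ['Zhang San', 'Li Si', 'Wang Wu', 'Zhao Liu', 'Sun Qi'],
    'gender': ['Male', 'Female', 'Male', 'Female', 'Male'],
    'age': [17, 18, 19, 21, 16],
    'height': [1.75, 1.62, 1.80, 1.66, 1.70],
})

# drop works on index labels and returns a new data frame.
without_first = date_frame.drop([0])
without_two = date_frame.drop([0, 4])

print(without_first.index.tolist())  # remaining labels are not renumbered
print(without_two.index.tolist())
print(len(date_frame))               # the original frame is unchanged

# Optional: renumber the remaining rows from 0.
renumbered = without_two.reset_index(drop=True)
print(renumbered.index.tolist())
```

Because drop matches labels rather than positions, date_frame.drop([0]) removes the first row only while the first row still carries the label 0.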

**4 Delete columns in the data frame**

The drop method can also delete columns. Either of the following two equivalent statements deletes the name column:

date_frame.drop(columns='name')
date_frame.drop('name', axis=1)  # axis=1 means operate on columns

Both statements return a new data frame without the name column.

To delete both the name column and the gender column, use either of the following statements:

date_frame.drop(columns=['name', 'gender'])
date_frame.drop(['name', 'gender'], axis=1)

Both statements return a new data frame without the name and gender columns.

To delete more columns, add their names to the list in the same way.
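As a small check that the two spellings behave the same, the sketch below compares them on a stand-in frame. The sample rows, the result variable names, and the nonexistent column name 'no_such_column' are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical stand-in for date_frame; the real one is built in lesson 1.
date_frame = pd.DataFrame({
    'ID': ['1000001', '1000002', '1000003'],
    'name': ['Zhang San', 'Li Si', 'Wang Wu'],
    'gender': ['Male', 'Female', 'Male'],
    'age': [18, 19, 20],
    'height': [1.75, 1.62, 1.80],
})

# Two equivalent ways to drop one column.
by_keyword = date_frame.drop(columns='name')
by_axis = date_frame.drop('name', axis=1)
print(by_keyword.equals(by_axis))

# Dropping several columns at once.
without_two = date_frame.drop(columns=['name', 'gender'])
print(without_two.columns.tolist())

# errors='ignore' skips labels that do not exist instead of raising KeyError.
tolerant = date_frame.drop(columns=['name', 'no_such_column'], errors='ignore')
print(tolerant.columns.tolist())
```

The columns= keyword is usually easier to read than axis=1, since it states the intent directly.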

**5 Delete rows that meet certain conditions**

Suppose you want to delete all records whose age is greater than 18. You can run the following statement:

date_frame.drop(index=date_frame.loc[date_frame.age > 18].index)

Here, date_frame.loc[date_frame.age > 18].index is the index of the rows whose age is greater than 18. drop returns a new data frame without those rows.
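The conditional delete can also be written as a boolean filter that keeps the rows that do not match the condition. The sketch below compares the two forms on a stand-in frame; the sample rows and the names dropped and kept are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical stand-in for date_frame; the real one is built in lesson 1.
date_frame = pd.DataFrame({
    'ID': ['1000001', '1000002', '1000003', '1000004', '1000005'],
    'name': ['Zhang San', 'Li Si', 'Wang Wu', 'Zhao Liu', 'Sun Qi'],
    'gender': ['Male', 'Female', 'Male', 'Female', 'Male'],
    'age': [17, 18, 19, 21, 16],
    'height': [1.75, 1.62, 1.80, 1.66, 1.70],
})

# Form used in the article: look up the matching labels, then drop them.
dropped = date_frame.drop(index=date_frame.loc[date_frame.age > 18].index)

# Alternative: keep every row where the condition is not true.
kept = date_frame[~(date_frame['age'] > 18)]

print(dropped.index.tolist())
print(dropped.equals(kept))
```

The two forms give the same result when the index labels are unique. Since drop removes rows by label, a frame with duplicate index labels would lose every row that shares a matched label, so the boolean filter is the safer choice in that case.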

This completes the introduction to updating a data frame in Python. Practice these operations and think about what other operations you could perform on a data frame.
