As the Internet has grown, more and more data is stored online. By analyzing this data, companies can extract information that supports decision-making.
For example, analyzing users' Taobao browsing records can reveal what those customers are likely to buy, so that advertisements can be targeted by customer segment to increase sales.
Another example is lending. By analyzing an applicant's credit data and modeling the likelihood that the applicant will fall overdue, a company can decide whether to lend, and so make better use of its funds.
As data analysis becomes more and more popular, knowing how to analyze data is a real asset for promotions and salary increases.
Starting today, this official account will publish a series of free tutorials on data analysis and modeling, to help everyone get started quickly with data analysis and appreciate the appeal of Python.
This article is the first lesson in the series. It shows how to build a data frame by hand in Python, which is a basic data analysis skill and a common way to create test data.
Contents of this article
Import package
The data frame to be created
The python code to build the above data frame
Output print result
**1 Import package**
If you have not installed Python yet, install it by following an online tutorial. Installing Anaconda is recommended, because it ships with many commonly used libraries.
# coding: utf-8  # Declare the source encoding so that Chinese text can be used
import pandas as pd  # Import pandas under the alias pd
from pandas import DataFrame  # Import the DataFrame class from pandas
First, import the pandas package in Jupyter. Because the data frame to be created contains Chinese, a declaration of the utf-8 encoding is added at the top of the code.
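Before going further, it can help to confirm that the import actually works in your environment. The following is only a minimal sketch: the variable name `empty` is made up for illustration, and the version number printed will depend on your installation.

```python
import pandas as pd

# Print the installed pandas version to confirm that the import succeeded.
print(pd.__version__)

# Creating an empty DataFrame confirms that the class is usable.
empty = pd.DataFrame()
print(empty.empty)  # True, because the frame has no rows or columns
```

If the import fails with ModuleNotFoundError, pandas is probably not installed in the Python environment that Jupyter is using.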
**2 The data frame to be built**
The table we want to build by hand in Python is as follows:
Each row represents one student. The columns are: ID (student number), name, gender, age, and height.
**3 The Python code to build the above data frame**
Express the table above as a Python dictionary, in which each key is a column name and each value is the list of entries in that column. Then pass the dictionary to the pd.DataFrame constructor to convert it into a data frame.
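# Each key is a column name; each list holds that column's values, one per student
data = {
    'ID': ['1000001', '1000002', '1000003', '1000004', '1000005', '1000006', '1000007', '1000008'],
    'name': ['Lu Yifan', 'Donnina', 'Zhou Yinghui', 'Zou Caiqi', 'Zhang Zekun', 'Li Jiamin', 'Ling Zizhou', 'Zhao Junqi'],
    'gender': ['male', 'female', 'female', 'female', 'male', 'female', 'male', 'male'],
    'age': [17, 18, 20, 19, 18, 16, 17, 18],
    'height': [1.78, 1.68, 1.62, 1.73, 1.75, 1.60, 1.82, 1.80],  # all heights in meters
}
data_frame = pd.DataFrame(data)  # Convert the dictionary into a data frame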
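One thing to watch when typing a dictionary by hand is that every column list must have the same number of entries, otherwise pandas raises a ValueError. The sketch below illustrates this with a small made-up sample. The names `columns`, `check_lengths`, and `bad` are hypothetical helpers for this example, not part of pandas or of the tutorial's code.

```python
import pandas as pd

# Hypothetical sample columns, not the tutorial's full table.
columns = {
    "ID": ["1000001", "1000002", "1000003"],
    "age": [17, 18, 20],
}

def check_lengths(cols):
    """Return the set of distinct list lengths in a dict of columns."""
    return {len(values) for values in cols.values()}

lengths = check_lengths(columns)
if len(lengths) == 1:
    frame = pd.DataFrame(columns)
    print(frame.shape)  # (3, 2): three rows, two columns
else:
    print("Columns have different lengths:", lengths)

# pandas itself rejects lists of different lengths with a ValueError.
bad = {"ID": ["1000001", "1000002"], "age": [17]}
try:
    pd.DataFrame(bad)
    raised = False
except ValueError:
    raised = True
print(raised)  # True
```

Checking the lengths first is optional, but it shows which column lengths disagree before pandas reports the error.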
**4 Print the created data frame**
Enter data_frame in a Jupyter cell and run it to get the following result:
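Outside Jupyter, typing a variable name on its own displays nothing, so an explicit print call is needed. The sketch below uses a hypothetical three-row frame called `sample` in the same layout as the table above, and shows a few common ways to inspect a frame. The exact text layout of the printed table may vary between pandas versions.

```python
import pandas as pd

# Hypothetical three-row sample in the same layout as the tutorial's table.
sample = pd.DataFrame({
    "ID": ["1000001", "1000002", "1000003"],
    "name": ["Lu Yifan", "Donnina", "Zhou Yinghui"],
    "age": [17, 18, 20],
    "height": [1.78, 1.68, 1.62],
})

print(sample)          # works in a plain script as well as in Jupyter
print(sample.shape)    # (rows, columns), here (3, 4)
print(sample.dtypes)   # one data type per column
print(sample.head(2))  # only the first two rows
```

Looking at dtypes is a quick way to catch typing mistakes, for example a number that was accidentally entered as a string.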
At this point, the data frame has been built by hand in Python. Follow the same steps to create a data frame of your own.
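When building your own data frame, you may find it more natural to write the data one row at a time. As an alternative to the column dictionary used in this tutorial, pd.DataFrame also accepts a list of dictionaries, one per row. This is a minimal sketch with made-up rows, and the names `rows` and `own_frame` are hypothetical.

```python
import pandas as pd

# Hypothetical rows; replace them with your own data.
rows = [
    {"ID": "1000001", "name": "Lu Yifan", "age": 17},
    {"ID": "1000002", "name": "Donnina", "age": 18},
]

# Each dictionary becomes one row; the keys become the column names.
own_frame = pd.DataFrame(rows)
print(list(own_frame.columns))  # ['ID', 'name', 'age']
print(len(own_frame))           # 2
```

The column form is compact when you think in columns, while the row form keeps each record's values together, which can make typing errors easier to spot.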