Python data analysis-data establishment

With the rapid development of the Internet, more and more data is stored online. By analyzing this data, companies can extract information that helps them make decisions.

For example, by analyzing the Taobao browsing records of a group of users, a company can discover what those customers are likely to buy and target advertisements by customer segment to increase sales.

Another example is the credit field. By analyzing an applicant's credit data and modeling the likelihood that the applicant will fall overdue, a lender can decide whether to grant the loan, and so make better use of company funds.

Now that data analysis is becoming more and more popular, knowing how to analyze data is a valuable asset when you are aiming for a promotion or a raise.

Starting today, this official account will publish a series of free tutorials on data analysis and modeling, to help everyone get started with data analysis quickly and appreciate the charm of Python.

This article is the first lesson of the series. It shows how to build a data frame by hand in Python, which is a foundation of data analysis and a common way to create test data.

Contents of this article

  1. Import package

  2. The data frame to be built

  3. The Python code to build the above data frame

  4. Print the created data frame

**1 Import package**

If you have not installed Python yet, please install it by following an online tutorial. Installing Anaconda is recommended, because it bundles many commonly used libraries, including pandas.

# coding: utf-8               # Declare the source encoding so that Chinese text can be used
import pandas as pd           # Import pandas under the alias pd
from pandas import DataFrame  # Import the DataFrame class from pandas

First, import the pandas package in Jupyter. Because the data frame to be created may contain Chinese text, the code begins by declaring the source encoding as UTF-8. In Python 3, UTF-8 is already the default source encoding, so this line is optional there.
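To confirm that the import worked, a quick check like the following can be run in a notebook cell. It is only a sketch: it prints whichever pandas version happens to be installed and shows that the two import styles refer to the same class. The variable name `same_class` is introduced here purely for illustration.

```python
import pandas as pd
from pandas import DataFrame

# Print the installed pandas version (the value depends on your environment).
print(pd.__version__)

# `DataFrame` imported directly and `pd.DataFrame` are the same class,
# so either spelling can be used to build a data frame.
same_class = DataFrame is pd.DataFrame
print(same_class)
```

Because both names point to the same class, the rest of this article can use `pd.DataFrame` without relying on the second import.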

**2 The data frame to be built**

The data frame we want to build by hand in Python is a table of students. Each row represents one student, and the columns are: ID (student number), name, gender, age, and height.

**3 The Python code to build the above data frame**

Represent the table above as a Python dictionary whose keys are the column names and whose values are lists holding each column's entries, then pass the dictionary to pd.DataFrame to convert it into a data frame.

data = {
    'ID': ['1000001', '1000002', '1000003', '1000004', '1000005', '1000006', '1000007', '1000008'],
    'name': ['Lu Yifan', 'Donnina', 'Zhou Yinghui', 'Zou Caiqi', 'Zhang Zekun', 'Li Jiamin', 'Ling Zizhou', 'Zhao Junqi'],
    'gender': ['male', 'female', 'female', 'female', 'male', 'female', 'male', 'male'],
    'age': [17, 18, 20, 19, 18, 16, 17, 18],
    'height': [1.78, 1.68, 1.62, 1.73, 1.75, 1.60, 1.82, 1.80],  # height in metres
}
data_frame = pd.DataFrame(data)  # Convert the dictionary into a data frame
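The dictionary above is column-oriented: one list per column. As a hedged alternative, the same kind of table can be built row by row from a list of dictionaries, one per student. The sketch below uses only the first two students, and the names `records`, `students`, and `students_by_id` are illustrative, not part of the original tutorial.

```python
import pandas as pd

# Each dictionary describes one student (one row of the table).
records = [
    {"ID": "1000001", "name": "Lu Yifan", "gender": "male", "age": 17, "height": 1.78},
    {"ID": "1000002", "name": "Donnina", "gender": "female", "age": 18, "height": 1.68},
]

# The `columns` argument fixes the column order explicitly.
students = pd.DataFrame(records, columns=["ID", "name", "gender", "age", "height"])

# Optionally use the student number as the row label instead of 0, 1, ...
students_by_id = students.set_index("ID")

print(students)
print(students_by_id.loc["1000001", "name"])
```

The row-oriented form tends to be convenient when records arrive one at a time, while the column-oriented dictionary is more compact when a whole table is typed in by hand.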

**4 Print the created data frame**

Enter the variable name data_frame in a Jupyter cell and run it to display the data frame as a table.
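Outside Jupyter, a bare variable name displays nothing, so an explicit `print` is needed. The sketch below assumes a shortened three-row version of the table, chosen only to keep the example small, and also shows a few common ways to inspect the result.

```python
import pandas as pd

# A shortened version of the student table, for illustration only.
data = {
    "ID": ["1000001", "1000002", "1000003"],
    "name": ["Lu Yifan", "Donnina", "Zhou Yinghui"],
    "age": [17, 18, 20],
    "height": [1.78, 1.68, 1.62],
}
data_frame = pd.DataFrame(data)

print(data_frame)          # the whole table
print(data_frame.shape)    # (number of rows, number of columns)
print(data_frame.dtypes)   # the data type pandas inferred for each column
print(data_frame.head(2))  # only the first two rows
```

Checking `shape` and `dtypes` right after building a frame is a cheap way to catch typing mistakes, such as a number that was accidentally entered as text.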

At this point, the data frame has been built by hand in Python. Follow the steps in this tutorial to create a data frame of your own.
