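ML FE: Feature Concatenation in Feature Engineering (commonly used to horizontally join the independent-variable features with the dependent-variable feature)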

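Feature concatenation in feature engineering (commonly used to horizontally join the independent-variable features with the dependent-variable feature)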

Output

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 768 entries, 0 to 767
Data columns (total 9 columns):
 #   Column                    Non-Null Count  Dtype
---  ------                    --------------  -----
 0   Pregnancies               768 non-null    int64
 1   Glucose                   768 non-null    int64
 2   BloodPressure             768 non-null    int64
 3   SkinThickness             768 non-null    int64
 4   Insulin                   768 non-null    int64
 5   BMI                       768 non-null    float64
 6   DiabetesPedigreeFunction  768 non-null    float64
 7   Age                       768 non-null    int64
 8   Outcome                   768 non-null    int64
dtypes: float64(2), int64(7)
memory usage: 54.1 KB
None
   Pregnancies  Glucose  BloodPressure  SkinThickness   BMI  Outcome
0            6      148             72             35  33.6        1
1            1       85             66             29  26.6        0
2            8      183             64              0  23.3        1
3            1       89             66             23  28.1        0
4            0      137             40             35  43.1        1

Implementation code

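# ML FE: feature concatenation in feature engineering
# (commonly used to horizontally join the feature columns with the target column)
import pandas as pd

# Forward slashes work on every OS and avoid the invalid "\d" escape a backslash path would contain.
data_frame=pd.read_csv('data_csv_xls/diabetes/diabetes.csv')
# info() prints its summary itself and returns None, so print() also emits a trailing "None".
print(data_frame.info())

col_label='Outcome'  # dependent variable (target)
cols_other=['Pregnancies','Glucose','BloodPressure','SkinThickness','BMI']  # selected feature columns
data_X=data_frame[cols_other]
data_y_label_μ=data_frame[col_label]
# axis=1 joins column-wise; rows are matched by index label.
data_dall = pd.concat([data_X, data_y_label_μ], axis=1)
print(data_dall.head())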
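
One detail of the concatenation above may be worth a small sketch: `pd.concat(..., axis=1)` matches rows by index label, not by position. In the code above both pieces come from the same `data_frame`, so they share an index and line up one-to-one. The example below uses a tiny made-up frame (the values and the names `features`, `target`, `subset`, `subset_target` are assumptions for illustration, not part of the original code) to show what can happen when the indexes differ, and one possible way to restore positional alignment.

```python
import pandas as pd

# Toy data, invented for illustration only (not read from diabetes.csv).
features = pd.DataFrame({"Glucose": [148, 85, 183], "BMI": [33.6, 26.6, 23.3]})
target = pd.Series([1, 0, 1], name="Outcome")

# Both sides carry the same default RangeIndex, so rows line up one-to-one.
aligned = pd.concat([features, target], axis=1)

# After filtering, the features keep their original labels (0 and 2) ...
subset = features[features["Glucose"] > 100]
# ... while a freshly built target gets labels 0 and 1, which no longer match.
subset_target = pd.Series([1, 1], name="Outcome")
misaligned = pd.concat([subset, subset_target], axis=1)

# Resetting the index on both sides restores positional alignment.
fixed = pd.concat(
    [subset.reset_index(drop=True), subset_target.reset_index(drop=True)],
    axis=1,
)

print(aligned.shape)
print(misaligned)
print(fixed)
```

In the misaligned case pandas takes the union of the index labels and fills the gaps with NaN, so the result has more rows than either input. Checking `isna()` or the shape after a horizontal concat is a cheap way to catch this.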