Python sklearn-pandas: A Detailed Guide to the Introduction, Installation, and Usage of the sklearn-pandas Library


Introduction to the sklearn-pandas library

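The sklearn-pandas module provides a bridge between Scikit-Learn's machine learning methods and pandas-style data frames. In particular, it provides a way to map DataFrame columns to transformations, whose outputs are later recombined into features.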

Installing the sklearn-pandas library

pip install sklearn-pandas
pip install --user -i https://pypi.tuna.tsinghua.edu.cn/simple sklearn-pandas

Using the sklearn-pandas library

1. Basic usage

>>> from sklearn_pandas import DataFrameMapper
>>> import pandas as pd
>>> import numpy as np
>>> import sklearn.preprocessing, sklearn.decomposition, \
...     sklearn.linear_model, sklearn.pipeline, sklearn.metrics
>>> from sklearn.feature_extraction.text import CountVectorizer

>>> mapper = DataFrameMapper([
...     ('pet', sklearn.preprocessing.LabelBinarizer()),
...     (['children'], sklearn.preprocessing.StandardScaler())
... ])

2. Example application

>>> from sklearn.base import TransformerMixin
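>>> class DateEncoder(TransformerMixin):
...    # Splits a datetime column into year, month and day columns.
...    def fit(self, X, y=None):
...        return self  # stateless: nothing to learn from the data
...
...    def transform(self, X):
...        dt = X.dt  # needs a pandas Series, hence input_df=True below
...        return pd.concat([dt.year, dt.month, dt.day], axis=1)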
>>> dates_df = pd.DataFrame(
...     {'dates': pd.date_range('2015-10-30', '2015-11-02')})
>>> mapper_dates = DataFrameMapper([
...     ('dates', DateEncoder())
... ], input_df=True)
>>> mapper_dates.fit_transform(dates_df)
array([[2015,   10,   30],
       [2015,   10,   31],
       [2015,   11,    1],
       [2015,   11,    2]])
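
To see why `input_df=True` is needed, it can help to run the body of `DateEncoder.transform` on its own. The sketch below uses only pandas and repeats the same date range; the `year`/`month`/`day` column labels are added here purely for readability and are an assumption, not something the mapper produces.

```python
import pandas as pd

# The same four dates as in the example above, as a named Series.
dates = pd.Series(pd.date_range('2015-10-30', '2015-11-02'), name='dates')

# The .dt accessor exists only on a pandas Series with datetime values,
# which is why the mapper must pass the column through as a Series.
dt = dates.dt
parts = pd.concat([dt.year, dt.month, dt.day], axis=1)

# Hypothetical labels for readability; each column is otherwise named 'dates'.
parts.columns = ['year', 'month', 'day']
print(parts)
```

Without `input_df=True` the transformer would receive a NumPy array, which has no `.dt` accessor, so the `transform` method as written would fail.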