ML with sklearn: a detailed guide to the LogisticRegression class in sklearn.linear_model (introduction and usage)

Introduction and usage of the LogisticRegression class in sklearn.linear_model

class LogisticRegression Found at: sklearn.linear_model._logistic

class LogisticRegression(BaseEstimator, LinearClassifierMixin,
                         SparseCoefMixin):
    """
    Logistic Regression (aka logit, MaxEnt) classifier.

    In the multiclass case, the training algorithm uses the one-vs-rest (OvR)
    scheme if the 'multi_class' option is set to 'ovr', and uses the
    cross-entropy loss if the 'multi_class' option is set to 'multinomial'.
    (Currently the 'multinomial' option is supported only by the 'lbfgs',
    'sag', 'saga' and 'newton-cg' solvers.)

    This class implements regularized logistic regression using the
    'liblinear' library, 'newton-cg', 'sag', 'saga' and 'lbfgs' solvers. **Note
    that regularization is applied by default**. It can handle both dense
    and sparse input. Use C-ordered arrays or CSR matrices containing 64-bit
    floats for optimal performance; any other input format will be converted
    (and copied).

    The 'newton-cg', 'sag', and 'lbfgs' solvers support only L2 regularization
    with primal formulation, or no regularization. The 'liblinear' solver
    supports both L1 and L2 regularization, with a dual formulation only for
    the L2 penalty. The Elastic-Net regularization is only supported by the
    'saga' solver.

    Read more in the :ref:`User Guide <logistic_regression>`.

    Parameters
    ----------
    penalty : {'l1', 'l2', 'elasticnet', 'none'}, default='l2'
        Used to specify the norm used in the penalization. The 'newton-cg',
        'sag' and 'lbfgs' solvers support only l2 penalties. 'elasticnet' is
        only supported by the 'saga' solver. If 'none' (not supported by the
        liblinear solver), no regularization is applied.

        .. versionadded:: 0.19
           l1 penalty with SAGA solver (allowing 'multinomial' + L1)

    dual : bool, default=False
        Dual or primal formulation. Dual formulation is only implemented for
        l2 penalty with liblinear solver. Prefer dual=False when
        n_samples > n_features.

    tol : float, default=1e-4
        Tolerance for stopping criteria.

    C : float, default=1.0
        Inverse of regularization strength; must be a positive float.
        Like in support vector machines, smaller values specify stronger
        regularization.

    fit_intercept : bool, default=True
        Specifies if a constant (a.k.a. bias or intercept) should be
        added to the decision function.

    intercept_scaling : float, default=1
        Useful only when the solver 'liblinear' is used
        and self.fit_intercept is set to True. In this case, x becomes
        [x, self.intercept_scaling],
        i.e. a "synthetic" feature with constant value equal to
        intercept_scaling is appended to the instance vector.
        The intercept becomes ``intercept_scaling * synthetic_feature_weight``.

        Note! the synthetic feature weight is subject to l1/l2 regularization
        as all other features.
        To lessen the effect of regularization on synthetic feature weight
        (and therefore on the intercept) intercept_scaling has to be increased.

    class_weight : dict or 'balanced', default=None
        Weights associated with classes in the form ``{class_label: weight}``.
        If not given, all classes are supposed to have weight one.

        The "balanced" mode uses the values of y to automatically adjust
        weights inversely proportional to class frequencies in the input data
        as ``n_samples / (n_classes * np.bincount(y))``.

        Note that these weights will be multiplied with sample_weight (passed
        through the fit method) if sample_weight is specified.

        .. versionadded:: 0.17
           *class_weight='balanced'*

    random_state : int, RandomState instance, default=None
        Used when ``solver`` == 'sag', 'saga' or 'liblinear' to shuffle the
        data. See :term:`Glossary <random_state>` for details.

    solver : {'newton-cg', 'lbfgs', 'liblinear', 'sag', 'saga'}, \
            default='lbfgs'

        Algorithm to use in the optimization problem.

        - For small datasets, 'liblinear' is a good choice, whereas 'sag' and
          'saga' are faster for large ones.
        - For multiclass problems, only 'newton-cg', 'sag', 'saga' and 'lbfgs'
          handle multinomial loss; 'liblinear' is limited to one-versus-rest
          schemes.
        - 'newton-cg', 'lbfgs', 'sag' and 'saga' handle L2 or no penalty
        - 'liblinear' and 'saga' also handle L1 penalty
        - 'saga' also supports 'elasticnet' penalty
        - 'liblinear' does not support setting ``penalty='none'``

        Note that 'sag' and 'saga' fast convergence is only guaranteed on
        features with approximately the same scale. You can
        preprocess the data with a scaler from sklearn.preprocessing.

        .. versionadded:: 0.17
           Stochastic Average Gradient descent solver.
        .. versionadded:: 0.19
           SAGA solver.
        .. versionchanged:: 0.22
            The default solver changed from 'liblinear' to 'lbfgs' in 0.22.

    max_iter : int, default=100
        Maximum number of iterations taken for the solvers to converge.

    multi_class : {'auto', 'ovr', 'multinomial'}, default='auto'
        If the option chosen is 'ovr', then a binary problem is fit for each
        label. For 'multinomial' the loss minimised is the multinomial loss fit
        across the entire probability distribution, *even when the data is
        binary*. 'multinomial' is unavailable when solver='liblinear'.
        'auto' selects 'ovr' if the data is binary, or if solver='liblinear',
        and otherwise selects 'multinomial'.

        .. versionadded:: 0.18
           Stochastic Average Gradient descent solver for 'multinomial' case.
        .. versionchanged:: 0.22
            Default changed from 'ovr' to 'auto' in 0.22.

    verbose : int, default=0
        For the liblinear and lbfgs solvers set verbose to any positive
        number for verbosity.

    warm_start : bool, default=False
        When set to True, reuse the solution of the previous call to fit as
        initialization, otherwise, just erase the previous solution.
        Useless for liblinear solver. See :term:`the Glossary <warm_start>`.

        .. versionadded:: 0.17
           *warm_start* to support *lbfgs*, *newton-cg*, *sag*, *saga* solvers.

    n_jobs : int, default=None
        Number of CPU cores used when parallelizing over classes if
        multi_class='ovr'. This parameter is ignored when the ``solver`` is
        set to 'liblinear' regardless of whether 'multi_class' is specified or
        not. ``None`` means 1 unless in a :obj:`joblib.parallel_backend`
        context. ``-1`` means using all processors.
        See :term:`Glossary <n_jobs>` for more details.

    l1_ratio : float, default=None
        The Elastic-Net mixing parameter, with ``0 <= l1_ratio <= 1``. Only
        used if ``penalty='elasticnet'``. Setting ``l1_ratio=0`` is equivalent
        to using ``penalty='l2'``, while setting ``l1_ratio=1`` is equivalent
        to using ``penalty='l1'``. For ``0 < l1_ratio <1``, the penalty is a
        combination of L1 and L2.

    Attributes
    ----------
    classes_ : ndarray of shape (n_classes, )
        A list of class labels known to the classifier.

    coef_ : ndarray of shape (1, n_features) or (n_classes, n_features)
        Coefficient of the features in the decision function.

        `coef_` is of shape (1, n_features) when the given problem is binary.
        In particular, when `multi_class='multinomial'`, `coef_` corresponds
        to outcome 1 (True) and `-coef_` corresponds to outcome 0 (False).

    intercept_ : ndarray of shape (1,) or (n_classes,)
        Intercept (a.k.a. bias) added to the decision function.

        If `fit_intercept` is set to False, the intercept is set to zero.
        `intercept_` is of shape (1,) when the given problem is binary.
        In particular, when `multi_class='multinomial'`, `intercept_`
        corresponds to outcome 1 (True) and `-intercept_` corresponds to
        outcome 0 (False).

    n_iter_ : ndarray of shape (n_classes,) or (1, )
        Actual number of iterations for all classes. If binary or multinomial,
        it returns only 1 element. For liblinear solver, only the maximum
        number of iteration across all classes is given.

        .. versionchanged:: 0.20

            In SciPy <= 1.0.0 the number of lbfgs iterations may exceed
            ``max_iter``. ``n_iter_`` will now report at most ``max_iter``.

    See Also
    --------
    SGDClassifier : Incrementally trained logistic regression (when given
        the parameter ``loss="log"``).
    LogisticRegressionCV : Logistic regression with built-in cross validation.

    Notes
    -----
    The underlying C implementation uses a random number generator to
    select features when fitting the model. It is thus not uncommon
    to have slightly different results for the same input data. If
    that happens, try with a smaller tol parameter.

    Predict output may not match that of standalone liblinear in certain
    cases. See :ref:`differences from liblinear <liblinear_differences>`
    in the narrative documentation.

    References
    ----------

    L-BFGS-B -- Software for Large-scale Bound-constrained Optimization
        Ciyou Zhu, Richard Byrd, Jorge Nocedal and Jose Luis Morales.
        http://users.iems.northwestern.edu/~nocedal/lbfgsb.html

    LIBLINEAR -- A Library for Large Linear Classification
        https://www.csie.ntu.edu.tw/~cjlin/liblinear/

    SAG -- Mark Schmidt, Nicolas Le Roux, and Francis Bach
        Minimizing Finite Sums with the Stochastic Average Gradient
        https://hal.inria.fr/hal-00860051/document

    SAGA -- Defazio, A., Bach F. & Lacoste-Julien S. (2014).
        SAGA: A Fast Incremental Gradient Method With Support
        for Non-Strongly Convex Composite Objectives
        https://arxiv.org/abs/1407.0202

    Hsiang-Fu Yu, Fang-Lan Huang, Chih-Jen Lin (2011). Dual coordinate descent
        methods for logistic regression and maximum entropy models.
        Machine Learning 85(1-2):41-75.
        https://www.csie.ntu.edu.tw/~cjlin/papers/maxent_dual.pdf

    Examples
    --------
    >>> from sklearn.datasets import load_iris
    >>> from sklearn.linear_model import LogisticRegression
    >>> X, y = load_iris(return_X_y=True)
    >>> clf = LogisticRegression(random_state=0).fit(X, y)
    >>> clf.predict(X[:2, :])
    array([0, 0])
    >>> clf.predict_proba(X[:2, :])
    array([[9.8...e-01, 1.8...e-02, 1.4...e-08],
           [9.7...e-01, 2.8...e-02, ...e-08]])
    >>> clf.score(X, y)
    0.97...
    """

    @_deprecate_positional_args
    def __init__(self, penalty='l2', *, dual=False, tol=1e-4, C=1.0,
                 fit_intercept=True, intercept_scaling=1, class_weight=None,
                 random_state=None, solver='lbfgs', max_iter=100,
                 multi_class='auto', verbose=0, warm_start=False, n_jobs=None,
                 l1_ratio=None):
        self.penalty = penalty
        self.dual = dual
        self.tol = tol
        self.C = C
        self.fit_intercept = fit_intercept
        self.intercept_scaling = intercept_scaling
        self.class_weight = class_weight
        self.random_state = random_state
        self.solver = solver
        self.max_iter = max_iter
        self.multi_class = multi_class
        self.verbose = verbose
        self.warm_start = warm_start
        self.n_jobs = n_jobs
        self.l1_ratio = l1_ratio

    def fit(self, X, y, sample_weight=None):
        """
        Fit the model according to the given training data.

        Parameters
        ----------
        X : {array-like, sparse matrix} of shape (n_samples, n_features)
            Training vector, where n_samples is the number of samples and
            n_features is the number of features.

        y : array-like of shape (n_samples,)
            Target vector relative to X.

        sample_weight : array-like of shape (n_samples,) default=None
            Array of weights that are assigned to individual samples.
            If not provided, then each sample is given unit weight.

            .. versionadded:: 0.17
               *sample_weight* support to LogisticRegression.

        Returns
        -------
        self
            Fitted estimator.

        Notes
        -----
        The SAGA solver supports both float64 and float32 bit arrays.
        """
        solver = _check_solver(self.solver, self.penalty, self.dual)

        if not isinstance(self.C, numbers.Number) or self.C < 0:
            raise ValueError("Penalty term must be positive; got (C=%r)"
                             % self.C)
        if self.penalty == 'elasticnet':
            if (not isinstance(self.l1_ratio, numbers.Number) or
                    self.l1_ratio < 0 or self.l1_ratio > 1):
                raise ValueError("l1_ratio must be between 0 and 1;"
                                 " got (l1_ratio=%r)" % self.l1_ratio)
        elif self.l1_ratio is not None:
            warnings.warn("l1_ratio parameter is only used when penalty is "
                          "'elasticnet'. Got "
                          "(penalty={})".format(self.penalty))
        if self.penalty == 'none':
            if self.C != 1.0:  # default values
                warnings.warn(
                    "Setting penalty='none' will ignore the C and l1_ratio "
                    "parameters"
                )
                # Note that check for l1_ratio is done right above
            C_ = np.inf
            penalty = 'l2'
        else:
            C_ = self.C
            penalty = self.penalty
        if not isinstance(self.max_iter, numbers.Number) or self.max_iter < 0:
            raise ValueError("Maximum number of iteration must be positive;"
                             " got (max_iter=%r)" % self.max_iter)
        if not isinstance(self.tol, numbers.Number) or self.tol < 0:
            raise ValueError("Tolerance for stopping criteria must be "
                             "positive; got (tol=%r)" % self.tol)

        if solver == 'lbfgs':
            _dtype = np.float64
        else:
            _dtype = [np.float64, np.float32]

        X, y = self._validate_data(X, y, accept_sparse='csr', dtype=_dtype,
                                   order="C",
                                   accept_large_sparse=solver != 'liblinear')
        check_classification_targets(y)
        self.classes_ = np.unique(y)

        multi_class = _check_multi_class(self.multi_class, solver,
                                         len(self.classes_))

        if solver == 'liblinear':
            if effective_n_jobs(self.n_jobs) != 1:
                warnings.warn("'n_jobs' > 1 does not have any effect when"
                              " 'solver' is set to 'liblinear'. Got 'n_jobs'"
                              " = {}.".format(effective_n_jobs(self.n_jobs)))
            self.coef_, self.intercept_, n_iter_ = _fit_liblinear(
                X, y, self.C, self.fit_intercept, self.intercept_scaling,
                self.class_weight, self.penalty, self.dual, self.verbose,
                self.max_iter, self.tol, self.random_state,
                sample_weight=sample_weight)
            self.n_iter_ = np.array([n_iter_])
            return self

        if solver in ['sag', 'saga']:
            max_squared_sum = row_norms(X, squared=True).max()
        else:
            max_squared_sum = None

        n_classes = len(self.classes_)
        classes_ = self.classes_
        if n_classes < 2:
            raise ValueError("This solver needs samples of at least 2 classes"
                             " in the data, but the data contains only one"
                             " class: %r" % classes_[0])

        if len(self.classes_) == 2:
            n_classes = 1
            classes_ = classes_[1:]

        if self.warm_start:
            warm_start_coef = getattr(self, 'coef_', None)
        else:
            warm_start_coef = None
        if warm_start_coef is not None and self.fit_intercept:
            warm_start_coef = np.append(warm_start_coef,
                                        self.intercept_[:, np.newaxis],
                                        axis=1)

        self.coef_ = list()
        self.intercept_ = np.zeros(n_classes)

        # Hack so that we iterate only once for the multinomial case.
        if multi_class == 'multinomial':
            classes_ = [None]
            warm_start_coef = [warm_start_coef]
        if warm_start_coef is None:
            warm_start_coef = [None] * n_classes

        path_func = delayed(_logistic_regression_path)

        # The SAG solver releases the GIL so it's more efficient to use
        # threads for this solver.
        if solver in ['sag', 'saga']:
            prefer = 'threads'
        else:
            prefer = 'processes'
        fold_coefs_ = Parallel(n_jobs=self.n_jobs, verbose=self.verbose,
                               **_joblib_parallel_args(prefer=prefer))(
            path_func(X, y, pos_class=class_, Cs=[C_],
                      l1_ratio=self.l1_ratio, fit_intercept=self.fit_intercept,
                      tol=self.tol, verbose=self.verbose, solver=solver,
                      multi_class=multi_class, max_iter=self.max_iter,
                      class_weight=self.class_weight, check_input=False,
                      random_state=self.random_state, coef=warm_start_coef_,
                      penalty=penalty, max_squared_sum=max_squared_sum,
                      sample_weight=sample_weight)
            for class_, warm_start_coef_ in zip(classes_, warm_start_coef))

        fold_coefs_, _, n_iter_ = zip(*fold_coefs_)
        self.n_iter_ = np.asarray(n_iter_, dtype=np.int32)[:, 0]

        n_features = X.shape[1]
        if multi_class == 'multinomial':
            self.coef_ = fold_coefs_[0][0]
        else:
            self.coef_ = np.asarray(fold_coefs_)
            self.coef_ = self.coef_.reshape(n_classes, n_features +
                                            int(self.fit_intercept))

        if self.fit_intercept:
            self.intercept_ = self.coef_[:, -1]
            self.coef_ = self.coef_[:, :-1]

        return self

    def predict_proba(self, X):
        """
        Probability estimates.

        The returned estimates for all classes are ordered by the
        label of classes.

        For a multi_class problem, if multi_class is set to be "multinomial"
        the softmax function is used to find the predicted probability of
        each class.
        Else use a one-vs-rest approach, i.e. calculate the probability
        of each class assuming it to be positive using the logistic function,
        and normalize these values across all the classes.

        Parameters
        ----------
        X : array-like of shape (n_samples, n_features)
            Vector to be scored, where `n_samples` is the number of samples and
            `n_features` is the number of features.

        Returns
        -------
        T : array-like of shape (n_samples, n_classes)
            Returns the probability of the sample for each class in the model,
            where classes are ordered as they are in ``self.classes_``.
        """
        check_is_fitted(self)

        ovr = (self.multi_class in ["ovr", "warn"] or
               (self.multi_class == 'auto' and (self.classes_.size <= 2 or
                                                self.solver == 'liblinear')))
        if ovr:
            return super()._predict_proba_lr(X)
        else:
            decision = self.decision_function(X)
            if decision.ndim == 1:
                # Workaround for multi_class="multinomial" and binary outcomes
                # which requires softmax prediction with only a 1D decision.
                decision_2d = np.c_[-decision, decision]
            else:
                decision_2d = decision
            return softmax(decision_2d, copy=False)

    def predict_log_proba(self, X):
        """
        Predict logarithm of probability estimates.

        The returned estimates for all classes are ordered by the
        label of classes.

        Parameters
        ----------
        X : array-like of shape (n_samples, n_features)
            Vector to be scored, where `n_samples` is the number of samples and
            `n_features` is the number of features.

        Returns
        -------
        T : array-like of shape (n_samples, n_classes)
            Returns the log-probability of the sample for each class in the
            model, where classes are ordered as they are in ``self.classes_``.
        """
        return np.log(self.predict_proba(X))
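The two branches of predict_proba can be illustrated without sklearn. The following is a minimal pure-Python sketch, not the library's implementation: the helper names sigmoid, softmax and ovr_proba are hypothetical and exist only for this example, and the decision values are made-up numbers. It shows that the multinomial branch expands a 1-D decision value d to [-d, d] before the softmax, so the positive-class probability becomes sigmoid(2d). The one-vs-rest branch, shown here for the multiclass case, applies the logistic function to each class score and then renormalizes.

```python
import math


def sigmoid(z):
    """Logistic function 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def softmax(scores):
    """Softmax over a list of scores, shifted by the max for stability."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def ovr_proba(scores):
    """One-vs-rest: logistic function per class, then normalize to sum to 1."""
    probs = [sigmoid(s) for s in scores]
    total = sum(probs)
    return [p / total for p in probs]


# Multinomial branch, binary problem: the 1-D decision d becomes [-d, d].
d = 0.8  # made-up decision value
binary_multinomial = softmax([-d, d])

# Multiclass problem with three made-up per-class decision values.
scores = [2.0, 0.5, -1.0]
multinomial = softmax(scores)
one_vs_rest = ovr_proba(scores)

print(binary_multinomial)
print(multinomial)
print(one_vs_rest)
```

Both schemes rank the classes in the same order here, but the probabilities differ, so values from predict_proba are only comparable between models that use the same multi_class setting.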

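The class_weight='balanced' rule quoted in the parameter list, n_samples / (n_classes * np.bincount(y)), can be worked through by hand. The sketch below recomputes it with the standard library only. balanced_class_weights is a hypothetical helper name used for illustration, not an sklearn function, and the labels are made-up data.

```python
from collections import Counter


def balanced_class_weights(y):
    """Return {label: n_samples / (n_classes * count(label))}."""
    counts = Counter(y)
    n_samples = len(y)
    n_classes = len(counts)
    return {label: n_samples / (n_classes * count)
            for label, count in sorted(counts.items())}


# Made-up imbalanced labels: 8 samples of class 0, 2 samples of class 1.
y = [0] * 8 + [1] * 2
weights = balanced_class_weights(y)
print(weights)  # {0: 0.625, 1: 2.5}
```

With these weights each class contributes the same total weight (8 * 0.625 = 2 * 2.5 = 5), which is what "inversely proportional to class frequencies" means in practice.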
