NominalRegressor
- class sklearn_nominal.sklearn.nominal_model.NominalRegressor(backend='pandas', *args, **kwargs)
Base class for nominal regressors.
This class coordinates the regression workflow for models that handle nominal features natively.
Examples
>>> from sklearn_nominal.sklearn.nominal_model import NominalRegressor
>>> class MyRegressor(NominalRegressor):
...     def make_model(self, d):
...         # Return a backend-specific trainer
...         pass
Methods
build_error(criterion): Builds the regression error function for the given criterion.
check_is_fitted(): Checks if the model has been fitted.
complexity(): Returns the complexity of the fitted model.
fit(x, y): Fits the nominal regressor.
get_dtypes(x): Extracts and maps data types from the input.
get_feature_names(): Returns the names of the features seen during fit.
make_model(d): Abstract method to create the model trainer.
predict(x): Predicts target values for input samples.
pretty_print([class_names]): Returns a string representation of the fitted model.
score(X, y[, sample_weight]): Returns the coefficient of determination of the prediction.
set_dtypes(x): Sets and persists the data types based on the input.
set_model(model): Sets the underlying backend model and marks it as fitted.
set_sklearn_tags(tags): Sets scikit-learn tags for the supervised nominal model.
validate_data_fit_regression(x, y): Validates and prepares data for regression fitting.
validate_data_predict(x): Validates and prepares input data for prediction.
- build_error(criterion)
Builds the regression error function for the given criterion.
- Parameters:
- criterion : str
The error criterion to use (e.g., “std” for standard deviation).
- Returns:
- TargetError
An instance of the requested error function.
- Raises:
- ValueError
If the criterion is not recognized.
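The lookup-and-raise behaviour described above can be sketched in isolation. This is a minimal illustration, not the library's implementation: the registry ERRORS, the helper build_error_sketch, and the use of a plain callable in place of TargetError are all assumptions. Only the "std" criterion and the ValueError come from the description above.

```python
import statistics

# Hypothetical registry mapping a criterion name to a function that scores
# a list of target values. Only "std" is named in the documentation; the
# registry structure itself is an assumption made for this sketch.
ERRORS = {
    "std": statistics.pstdev,  # population standard deviation
}


def build_error_sketch(criterion):
    """Return the error function for `criterion`, or raise ValueError."""
    try:
        return ERRORS[criterion]
    except KeyError:
        raise ValueError(f"Unknown error criterion: {criterion!r}") from None


error = build_error_sketch("std")
spread = error([1.0, 2.0, 3.0, 4.0])  # spread of the targets in one node
```

Keeping the mapping in one place means an unrecognized name fails early, before any training starts.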
- check_is_fitted()
Checks if the model has been fitted.
- Raises:
- NotFittedError
If the is_fitted_ attribute is not set or is False.
- complexity()
Returns the complexity of the fitted model.
The definition of complexity is backend and model dependent. For trees, it typically represents the number of nodes.
- Returns:
- int or float
The complexity metric of the model.
- Raises:
- NotFittedError
If the model has not been fitted yet.
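For intuition on the tree case, the node count can be computed recursively. The nested-dict tree below is a hypothetical stand-in chosen for illustration; it is not the backend's actual tree structure, and count_nodes is not part of the library.

```python
def count_nodes(node):
    """Count a node plus all of its descendants."""
    return 1 + sum(count_nodes(child) for child in node.get("children", []))


# Hypothetical tree: a root with two children, one of which has a child.
tree = {
    "children": [
        {"children": []},
        {"children": [{"children": []}]},
    ]
}

n_nodes = count_nodes(tree)  # 4 nodes in total
```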
- fit(x, y)
Fits the nominal regressor.
- Parameters:
- x : {array-like, sparse matrix} of shape (n_samples, n_features)
The training input samples.
- y : array-like of shape (n_samples,) or (n_samples, n_outputs)
The target values.
- Returns:
- self : object
Returns the instance itself.
- get_dtypes(x)
Extracts and maps data types from the input.
This method identifies the data types of the input features to ensure they are correctly handled by the backend.
- Parameters:
- x : {array-like, sparse matrix} of shape (n_samples, n_features)
The input data.
- Returns:
- dict or None
A dictionary mapping column names to data types if x is a DataFrame, otherwise None.
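The return contract above (a dict for DataFrames, None otherwise) might look roughly like the following sketch. The helper get_dtypes_sketch is hypothetical and does not reproduce the library's code; using a pandas categorical column to stand in for a nominal feature is also an assumption.

```python
import numpy as np
import pandas as pd


def get_dtypes_sketch(x):
    """Map column names to dtypes for a DataFrame; return None otherwise."""
    if isinstance(x, pd.DataFrame):
        return x.dtypes.to_dict()
    return None


df = pd.DataFrame(
    {
        "color": pd.Categorical(["red", "blue"]),  # assumed nominal column
        "size": [1.5, 2.0],                        # numeric column
    }
)

dtypes = get_dtypes_sketch(df)                         # {"color": ..., "size": ...}
no_dtypes = get_dtypes_sketch(np.array([[1.0, 2.0]]))  # arrays carry no column dtypes
```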
- get_feature_names()
Returns the names of the features seen during fit.
- Returns:
- ndarray of str or None
The feature names, or None if they were not available during fit (e.g., if input was a numpy array).
- abstractmethod make_model(d)
Abstract method to create the model trainer.
- Parameters:
- d : Dataset
The training dataset prepared by validate_data_fit_regression.
- Returns:
- trainer : object
A trainer instance capable of fitting the provided dataset.
- predict(x)
Predicts target values for input samples.
- Parameters:
- x : {array-like, sparse matrix} of shape (n_samples, n_features)
The input samples.
- Returns:
- ndarray of shape (n_samples,) or (n_samples, n_outputs)
The predicted target values.
- pretty_print(class_names=None)
Returns a string representation of the fitted model.
Delegates the visualization logic to the underlying backend model.
- Parameters:
- class_names : list of str, optional
Names of the classes to use in the output. If None, default identifiers are used.
- Returns:
- str
A human-readable representation of the model.
- Raises:
- NotFittedError
If the model has not been fitted yet.
- score(X, y, sample_weight=None)
Return the coefficient of determination of the prediction.
The coefficient of determination \(R^2\) is defined as \((1 - \frac{u}{v})\), where \(u\) is the residual sum of squares ((y_true - y_pred) ** 2).sum() and \(v\) is the total sum of squares ((y_true - y_true.mean()) ** 2).sum(). The best possible score is 1.0, and the score can be negative (because the model can be arbitrarily worse). A constant model that always predicts the expected value of y, disregarding the input features, would get an \(R^2\) score of 0.0.
- Parameters:
- X : array-like of shape (n_samples, n_features)
Test samples. For some estimators this may be a precomputed kernel matrix or a list of generic objects instead with shape (n_samples, n_samples_fitted), where n_samples_fitted is the number of samples used in the fitting for the estimator.
- y : array-like of shape (n_samples,) or (n_samples, n_outputs)
True values for X.
- sample_weight : array-like of shape (n_samples,), default=None
Sample weights.
- Returns:
- score : float
\(R^2\) of self.predict(X) w.r.t. y.
Notes
The \(R^2\) score used when calling score on a regressor uses multioutput='uniform_average' from version 0.23 to keep consistent with the default value of r2_score(). This influences the score method of all the multioutput regressors (except for MultiOutputRegressor).
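The definition of \(R^2\) given for score can be checked by hand. The sketch below is a plain-Python, unweighted, single-output version of the formula; r2_score_sketch is a hypothetical helper for illustration, it ignores sample_weight and multioutput averaging, and it is not the scikit-learn implementation.

```python
def r2_score_sketch(y_true, y_pred):
    """Unweighted, single-output R^2 = 1 - u / v."""
    mean = sum(y_true) / len(y_true)
    u = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    v = sum((t - mean) ** 2 for t in y_true)               # total sum of squares
    return 1 - u / v


y_true = [3.0, 5.0, 7.0, 9.0]

perfect = r2_score_sketch(y_true, y_true)                # 1.0: no residual error
constant = r2_score_sketch(y_true, [6.0] * 4)            # 0.0: always predicts the mean
worse = r2_score_sketch(y_true, [9.0, 7.0, 5.0, 3.0])    # -3.0: worse than the mean
```

The three cases correspond to the best possible score, the constant-model baseline, and a model that does worse than the baseline.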
- set_dtypes(x)
Sets and persists the data types based on the input.
This is called during fit to ensure that subsequent calls to predict can cast the input data to the same types, preserving nominal/numeric distinctions.
- Parameters:
- x : {pd.DataFrame, np.ndarray, sparse matrix}
The input data to extract types from.
- Raises:
- ValueError
If the input type is not supported or if the input is not 2D.
- set_model(model)
Sets the underlying backend model and marks it as fitted.
- Parameters:
- model : sklearn_nominal.backend.core.Model
The trained model instance from the backend.
- set_sklearn_tags(tags)
Sets scikit-learn tags for the supervised nominal model.
- Parameters:
- tags : Tags
The scikit-learn tags object to be modified.
- validate_data_fit_regression(x, y)
Validates and prepares data for regression fitting.
This method ensures x and y are compatible, extracts data types, and packages them into a backend Dataset. It also ensures the target y is at least 2D for backend consistency.
- Parameters:
- x : array-like of shape (n_samples, n_features)
The input features.
- y : array-like of shape (n_samples,) or (n_samples, n_outputs)
The target values.
- Returns:
- Dataset
The backend-specific dataset object.
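The "at least 2D" treatment of y mentioned for this method can be sketched with NumPy. The helper as_2d_target is hypothetical and shows only the reshaping step; the actual method also checks compatibility with x and builds the backend Dataset, which is not reproduced here.

```python
import numpy as np


def as_2d_target(y):
    """Return y as an array of shape (n_samples, n_outputs)."""
    y = np.asarray(y)
    if y.ndim == 1:
        # A single-output target becomes one column.
        y = y.reshape(-1, 1)
    return y


single = as_2d_target([1.0, 2.0, 3.0])                      # shape (3, 1)
multi = as_2d_target([[1.0, 0.5], [2.0, 0.1], [3.0, 0.9]])  # shape (3, 2), unchanged
```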
- validate_data_predict(x)
Validates and prepares input data for prediction.
This method ensures the input features match the structure seen during training, handles feature name alignment, and restores data types.
- Parameters:
- x : array-like of shape (n_samples, n_features)
The input data to validate.
- Returns:
- pd.DataFrame
The validated data as a pandas DataFrame, with dtypes restored to match those observed during training.
- Raises:
- NotFittedError
If the model has not been fitted yet.
- ValueError
If the input contains no samples or has inconsistent features.
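The dtype restoration described for validate_data_predict can be illustrated with pandas alone. This is a sketch under stated assumptions: restore_frame is a hypothetical helper, a pandas categorical column stands in for a nominal feature, and the checks are simplified compared with whatever the library actually performs.

```python
import numpy as np
import pandas as pd

# State that would be recorded at fit time (hypothetical example data).
train = pd.DataFrame(
    {
        "color": pd.Categorical(["red", "blue", "red"]),
        "size": [1.0, 2.0, 3.0],
    }
)
feature_names = list(train.columns)
dtypes = train.dtypes.to_dict()


def restore_frame(x, feature_names, dtypes):
    """Rebuild a DataFrame with the training column order and dtypes."""
    if isinstance(x, pd.DataFrame):
        frame = x[feature_names]  # align columns to the training order
    else:
        frame = pd.DataFrame(np.asarray(x, dtype=object), columns=feature_names)
    if len(frame) == 0:
        raise ValueError("Input contains no samples.")
    return frame.astype(dtypes)


# At predict time the input may arrive as an untyped array.
raw = np.array([["blue", 4.0], ["red", 5.0]], dtype=object)
restored = restore_frame(raw, feature_names, dtypes)
```

Casting back to the recorded dtypes is what keeps the nominal and numeric columns distinguishable after a round trip through a plain array.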