NominalModel

class sklearn_nominal.sklearn.nominal_model.NominalModel(backend='pandas', *args, **kwargs)

Mixin class for all nominal models in sklearn_nominal.

This mixin provides the shared infrastructure for models that natively handle nominal (categorical) attributes. It hides the details of the selected computation backend and connects the scikit-learn API to the library’s internal core logic.

Attributes:
backend : str

The backend to use for computations (e.g., “pandas”).

model_ : sklearn_nominal.backend.core.Model

The underlying fitted model object from the backend.

is_fitted_ : bool

Indicates whether the model has been successfully fitted.

dtypes_ : pd.Series or list

The data types of the features as observed during fit.

Examples

>>> from sklearn_nominal.sklearn.nominal_model import NominalModel
>>> from sklearn.base import BaseEstimator
>>> class MyNominalModel(NominalModel, BaseEstimator):
...     def __init__(self, backend='pandas'):
...         super().__init__(backend=backend)
...     def fit(self, X, y):
...         # implementation here
...         return self

Methods

check_is_fitted()

Checks if the model has been fitted.

complexity()

Returns the complexity of the fitted model.

get_dtypes(x)

Extracts and maps data types from the input.

get_feature_names()

Returns the names of the features seen during fit.

pretty_print([class_names])

Returns a string representation of the fitted model.

set_dtypes(x)

Sets and persists the data types based on the input.

set_model(model)

Sets the underlying backend model and marks it as fitted.

set_sklearn_tags(tags)

Sets scikit-learn tags for the nominal model.

check_is_fitted()

Checks if the model has been fitted.

Raises:
NotFittedError

If the is_fitted_ attribute is not set or is False.
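The guard described above follows a common pattern, sketched here with standard-library code only. `FittedFlagSketch` and the local `NotFittedError` are hypothetical stand-ins introduced for illustration; the mixin's actual implementation is not shown in this reference and may differ.

```python
class NotFittedError(ValueError, AttributeError):
    """Stand-in for sklearn.exceptions.NotFittedError (same base classes)."""


class FittedFlagSketch:
    """Hypothetical illustration of an is_fitted_ guard; not the library's code."""

    def check_is_fitted(self):
        # Covers both documented cases: attribute missing, or set to False.
        if not getattr(self, "is_fitted_", False):
            raise NotFittedError(
                f"This {type(self).__name__} instance is not fitted yet."
            )


est = FittedFlagSketch()
try:
    est.check_is_fitted()
    raised = False
except NotFittedError:
    raised = True  # unfitted instance: the guard raises

est.is_fitted_ = True
est.check_is_fitted()  # no exception once the flag is set
```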

complexity()

Returns the complexity of the fitted model.

The definition of complexity is backend- and model-dependent. For trees, it typically represents the number of nodes.

Returns:
int or float

The complexity metric of the model.

Raises:
NotFittedError

If the model has not been fitted yet.
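To make the node-count notion of complexity concrete, here is a minimal sketch on a toy tree. The `Node` class and `count_nodes` function are hypothetical and do not reflect the backend's real tree type or how `complexity()` is computed internally.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical tree node; the backend's real tree type may differ."""
    label: str
    children: list = field(default_factory=list)


def count_nodes(node):
    # One for this node plus the sizes of all of its subtrees.
    return 1 + sum(count_nodes(child) for child in node.children)


# A root that splits on a three-valued nominal attribute has three leaves.
tree = Node("color", [Node("red"), Node("green"), Node("blue")])
n_nodes = count_nodes(tree)  # one root plus three leaves
```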

get_dtypes(x)

Extracts and maps data types from the input.

This method identifies the data types of the input features to ensure they are correctly handled by the backend.

Parameters:
x : {array-like, sparse matrix} of shape (n_samples, n_features)

The input data.

Returns:
dict or None

A dictionary mapping column names to data types if x is a DataFrame, otherwise None.
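The return contract above (a dictionary for a DataFrame, `None` otherwise) can be sketched with plain pandas. The helper name `dtypes_or_none` is hypothetical; it mirrors the documented behaviour only and is not the library's implementation.

```python
import numpy as np
import pandas as pd


def dtypes_or_none(x):
    """Hypothetical helper mirroring the documented return contract."""
    if isinstance(x, pd.DataFrame):
        # Map each column name to its pandas dtype.
        return x.dtypes.to_dict()
    # Other inputs carry no per-column type information to report.
    return None


df = pd.DataFrame({"color": ["red", "blue"], "size": [1.5, 2.0]})
frame_dtypes = dtypes_or_none(df)
array_dtypes = dtypes_or_none(np.array([[1.0, 2.0], [3.0, 4.0]]))
```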

get_feature_names()

Returns the names of the features seen during fit.

Returns:
ndarray of str or None

The feature names, or None if they were not available during fit (e.g., if the input was a NumPy array).

pretty_print(class_names=None)

Returns a string representation of the fitted model.

Delegates the visualization logic to the underlying backend model.

Parameters:
class_names : list of str, optional

Names of the classes to use in the output. If None, default identifiers are used.

Returns:
str

A human-readable representation of the model.

Raises:
NotFittedError

If the model has not been fitted yet.

set_dtypes(x)

Sets and persists the data types based on the input.

This is called during fit to ensure that subsequent calls to predict can cast the input data to the same types, preserving nominal/numeric distinctions.

Parameters:
x : {pd.DataFrame, np.ndarray, sparse matrix}

The input data to extract types from.

Raises:
ValueError

If the input type is not supported or if the input is not 2D.
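The reason for persisting dtypes at fit time can be illustrated with plain pandas: types recorded from the training frame are reapplied to data seen later, so a nominal column stays nominal and a numeric column stays numeric. This is a sketch under the assumption that nominal columns are represented as pandas categoricals; the variable names are illustrative, and the mixin's own casting logic is not shown in this reference.

```python
import pandas as pd

# Training data: one nominal column and one numeric column.
X_train = pd.DataFrame({
    "color": pd.Categorical(["red", "blue", "red"]),
    "size": [1.0, 2.5, 3.0],
})
stored_dtypes = X_train.dtypes  # what a fit step could persist

# At prediction time the same columns may arrive with different types.
X_new = pd.DataFrame({"color": ["blue"], "size": ["2"]})

# Cast each column back to the type observed during fit.
X_cast = X_new.astype(stored_dtypes.to_dict())
```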

set_model(model)

Sets the underlying backend model and marks it as fitted.

Parameters:
model : sklearn_nominal.backend.core.Model

The trained model instance from the backend.
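The relationship between `set_model` and the methods that delegate to the backend model, such as `pretty_print`, can be sketched as follows. Both classes are hypothetical stand-ins, and the output format and default class identifiers are invented for illustration; only the attribute names `model_` and `is_fitted_` come from this reference.

```python
class BackendModelSketch:
    """Hypothetical stand-in for a backend model object."""

    def pretty_print(self, class_names=None):
        # Fall back to made-up default identifiers when no names are given.
        names = class_names or ["class_0", "class_1"]
        return f"if color == red: {names[0]} else: {names[1]}"


class WrapperSketch:
    """Hypothetical wrapper; illustrates the documented attributes only."""

    def set_model(self, model):
        # Store the backend model and record that fitting succeeded.
        self.model_ = model
        self.is_fitted_ = True

    def pretty_print(self, class_names=None):
        # Delegate the rendering to the stored backend model.
        return self.model_.pretty_print(class_names)


wrapper = WrapperSketch()
wrapper.set_model(BackendModelSketch())
text = wrapper.pretty_print(class_names=["no", "yes"])
```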

set_sklearn_tags(tags)

Sets scikit-learn tags for the nominal model.

Configures the estimator tags to accurately reflect its capabilities, specifically its ability to handle string inputs and missing values natively.

Parameters:
tags : Tags

The scikit-learn tags object to be modified in-place.