feature_encoders.generate package

Module contents

class feature_encoders.generate.CyclicalFeatures(*, seasonality, ds=None, period=None, fourier_order=None, remainder='passthrough', replace=False)

Bases: sklearn.base.TransformerMixin, sklearn.base.BaseEstimator

Create cyclical (seasonal) features as Fourier terms.

Parameters
  • seasonality (str) – The name of the seasonality. The feature generator can provide default values for period and fourier_order if seasonality is one of ‘daily’, ‘weekly’ or ‘yearly’.

  • ds (str, optional) – The name of the input dataframe’s column that contains datetime information. If None, it is assumed that the datetime information is provided by the input dataframe’s index. Defaults to None.

  • period (float, optional) – Number of days in one period. Defaults to None.

  • fourier_order (int, optional) – Number of Fourier components to use. Defaults to None.

  • remainder ({'drop', 'passthrough'}, optional) – If remainder='passthrough', all remaining columns of the input dataframe are passed through (concatenated with the output of the transformer). If remainder='drop', they are dropped. Defaults to “passthrough”.

  • replace (bool, optional) – Whether an existing column with the same name may be replaced (applies only when remainder='passthrough'). Defaults to False.

Raises

ValueError – If remainder is neither ‘drop’ nor ‘passthrough’.
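As an illustration of what "cyclical features as Fourier terms" can look like, here is a minimal standard-library sketch. It assumes the common construction in which time t is measured in days and, for each order k from 1 to fourier_order, the values sin(2πkt/period) and cos(2πkt/period) are produced. The helper name fourier_terms, the 1970-01-01 reference epoch and the sin-then-cos ordering are assumptions made for this example; they are not taken from the package's implementation or its column naming.

```python
import math
from datetime import datetime, timedelta


def fourier_terms(timestamps, period, fourier_order):
    """Return one row of 2 * fourier_order values per timestamp.

    Hypothetical helper: time is measured in days since 1970-01-01,
    and `period` is the number of days in one cycle.
    """
    epoch = datetime(1970, 1, 1)
    rows = []
    for ts in timestamps:
        t = (ts - epoch).total_seconds() / 86400.0  # elapsed days
        row = []
        for k in range(1, fourier_order + 1):
            angle = 2.0 * math.pi * k * t / period
            row.extend([math.sin(angle), math.cos(angle)])
        rows.append(row)
    return rows


# Two days of hourly timestamps, encoded with a one-day period.
start = datetime(2021, 1, 1)
hourly = [start + timedelta(hours=h) for h in range(48)]
daily_terms = fourier_terms(hourly, period=1.0, fourier_order=2)
```

Because the period is one day, the row for hour 24 repeats the row for hour 0 (up to floating-point error), which is the property that makes these terms useful for encoding seasonality.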

fit(X: pandas.core.frame.DataFrame, y=None)

Fit the feature generator on the available data.

Parameters
  • X (pandas.DataFrame of shape (n_samples, n_features)) – The input dataframe.

  • y (None, optional) – Ignored. Defaults to None.

Returns

Fitted encoder.

Return type

CyclicalFeatures

Raises

ValueError – If period or fourier_order is not provided and seasonality is not one of ‘daily’, ‘weekly’ or ‘yearly’.
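The rule behind this error can be sketched as follows. This is a rough illustration under stated assumptions, not the package's code: the function name resolve_seasonality is hypothetical, the periods simply follow from "number of days in one period", and the fourier_order values in the table are placeholders rather than the library's actual defaults. The sketch also assumes that an explicitly passed value takes precedence over a default.

```python
# Placeholder defaults. The fourier_order values are illustrative
# assumptions only and are not taken from the package.
HYPOTHETICAL_DEFAULTS = {
    "daily": {"period": 1.0, "fourier_order": 4},
    "weekly": {"period": 7.0, "fourier_order": 3},
    "yearly": {"period": 365.25, "fourier_order": 6},
}


def resolve_seasonality(seasonality, period=None, fourier_order=None):
    """Return (period, fourier_order), filling gaps from the defaults."""
    if period is not None and fourier_order is not None:
        # Both values were given, so any seasonality name is acceptable.
        return period, fourier_order
    if seasonality not in HYPOTHETICAL_DEFAULTS:
        raise ValueError(
            "period and fourier_order must both be provided when "
            "seasonality is not one of 'daily', 'weekly' or 'yearly'"
        )
    defaults = HYPOTHETICAL_DEFAULTS[seasonality]
    return (
        defaults["period"] if period is None else period,
        defaults["fourier_order"] if fourier_order is None else fourier_order,
    )
```

With this sketch, resolve_seasonality("weekly") falls back on the table, while a custom name such as "monthly" works only when both period and fourier_order are supplied.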

transform(X: pandas.core.frame.DataFrame)

Apply the feature generator.

Parameters

X (pandas.DataFrame of shape (n_samples, n_features)) – The input dataframe.

Raises
  • ValueError – If the input data does not pass the checks of utils.check_X.

  • ValueError – If common columns are found and replace=False.

Returns

The transformed dataframe.

Return type

pandas.DataFrame
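The interplay of remainder and replace described above can be sketched with plain dictionaries standing in for dataframes (column name mapped to a list of values). The function name combine_columns and the resulting column order are assumptions made for this example; the package's own logic operates on pandas objects and may differ in detail.

```python
def combine_columns(input_columns, generated_columns,
                    remainder="passthrough", replace=False):
    """Merge generated feature columns with the input columns.

    Hypothetical stand-in: both arguments are dicts mapping a column
    name to a list of values.
    """
    if remainder not in ("drop", "passthrough"):
        raise ValueError("remainder must be 'drop' or 'passthrough'")
    if remainder == "drop":
        # Only the generated features are returned.
        return dict(generated_columns)
    common = sorted(set(input_columns) & set(generated_columns))
    if common and not replace:
        raise ValueError(f"Found common column names: {common}")
    # Keep the input columns that are not overwritten, then add the features.
    combined = {
        name: values
        for name, values in input_columns.items()
        if name not in generated_columns
    }
    combined.update(generated_columns)
    return combined


inputs = {"temperature": [10.0, 12.0], "growth": [0.0, 0.0]}
features = {"growth": [0.0, 1.0]}
replaced = combine_columns(inputs, features, replace=True)
dropped = combine_columns(inputs, features, remainder="drop")
```

Calling combine_columns(inputs, features) with the default replace=False raises ValueError, because "growth" appears on both sides.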

class feature_encoders.generate.DatetimeFeatures(ds=None, remainder='passthrough', replace=False, subset=None)

Bases: sklearn.base.TransformerMixin, sklearn.base.BaseEstimator

Generate date and time features.

Parameters
  • ds (str, optional) – The name of the input dataframe’s column that contains datetime information. If None, it is assumed that the datetime information is provided by the input dataframe’s index. Defaults to None.

  • remainder ({'drop', 'passthrough'}, optional) – If remainder='passthrough', all remaining columns of the input dataframe are passed through (concatenated with the output of the transformer). If remainder='drop', they are dropped. Defaults to “passthrough”.

  • replace (bool, optional) – Whether an existing column with the same name may be replaced (applies only when remainder='passthrough'). Defaults to False.

  • subset (str or list of str, optional) – The names of the features to generate. If None, all features are produced: ‘month’, ‘week’, ‘dayofyear’, ‘dayofweek’, ‘hour’, ‘hourofweek’. The last two features (‘hour’ and ‘hourofweek’) are generated only if the time step of the input’s ds (or of its index, if ds is None) is smaller than pandas.Timedelta(days=1). Defaults to None.

Raises

ValueError – If remainder is neither ‘drop’ nor ‘passthrough’.
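To make the feature names concrete, the following standard-library sketch derives the six features from a list of timestamps. Several details are assumptions made for this example and are not confirmed by the package: the ISO week number is used for ‘week’, Monday counts as day 0, ‘hourofweek’ is computed as 24 * dayofweek + hour, the time step is taken to be the smallest gap between consecutive timestamps, and the sub-daily features are skipped for daily data even when they are requested through subset. The function name datetime_features is hypothetical.

```python
from datetime import datetime, timedelta

# Feature name -> function of a single timestamp (assumed definitions).
ALL_FEATURES = {
    "month": lambda ts: ts.month,
    "week": lambda ts: ts.isocalendar()[1],          # ISO week number
    "dayofyear": lambda ts: ts.timetuple().tm_yday,
    "dayofweek": lambda ts: ts.weekday(),            # Monday == 0
    "hour": lambda ts: ts.hour,
    "hourofweek": lambda ts: 24 * ts.weekday() + ts.hour,
}
SUB_DAILY_ONLY = ("hour", "hourofweek")


def datetime_features(timestamps, subset=None):
    """Return a dict of feature name -> list of values.

    Expects at least two timestamps in ascending order.
    """
    if subset is None:
        names = list(ALL_FEATURES)
    elif isinstance(subset, str):
        names = [subset]
    else:
        names = list(subset)
    # Smallest gap between consecutive timestamps.
    step = min(b - a for a, b in zip(timestamps, timestamps[1:]))
    if step >= timedelta(days=1):
        names = [n for n in names if n not in SUB_DAILY_ONLY]
    return {n: [ALL_FEATURES[n](ts) for ts in timestamps] for n in names}


start = datetime(2021, 1, 4)  # a Monday
hourly = [start + timedelta(hours=h) for h in range(3)]
daily = [start + timedelta(days=d) for d in range(3)]
hourly_features = datetime_features(hourly)
daily_features = datetime_features(daily)
```

In this sketch hourly_features contains all six features, whereas daily_features omits ‘hour’ and ‘hourofweek’ because the time step is a full day.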

fit(X: pandas.core.frame.DataFrame, y=None)

Fit the feature generator on the available data.

Parameters
  • X (pandas.DataFrame of shape (n_samples, n_features)) – The input dataframe.

  • y (None, optional) – Ignored. Defaults to None.

Returns

Fitted encoder.

Return type

DatetimeFeatures

Raises

ValueError – If the input data does not pass the checks of utils.check_X.

transform(X: pandas.core.frame.DataFrame)

Apply the feature generator.

Parameters

X (pandas.DataFrame of shape (n_samples, n_features)) – The input dataframe.

Raises
  • ValueError – If the input data does not pass the checks of utils.check_X.

  • ValueError – If common columns are found and replace=False.

Returns

The transformed dataframe.

Return type

pandas.DataFrame

class feature_encoders.generate.TrendFeatures(ds=None, name='growth', remainder='passthrough', replace=False)

Bases: sklearn.base.TransformerMixin, sklearn.base.BaseEstimator

Generate linear time trend features.

Parameters
  • ds (str, optional) – The name of the input dataframe’s column that contains datetime information. If None, it is assumed that the datetime information is provided by the input dataframe’s index. Defaults to None.

  • name (str, optional) – The name of the generated column in the output dataframe. Defaults to ‘growth’.

  • remainder ({'drop', 'passthrough'}, optional) – If remainder='passthrough', all remaining columns of the input dataframe are passed through (concatenated with the output of the transformer). If remainder='drop', they are dropped. Defaults to “passthrough”.

  • replace (bool, optional) – Whether an existing column with the same name may be replaced (applies only when remainder='passthrough'). Defaults to False.

Raises

ValueError – If remainder is neither ‘drop’ nor ‘passthrough’.
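One plausible reading of a "linear time trend" is a column that grows linearly with elapsed time, scaled by the time span seen during fitting. The sketch below follows that reading. It is an assumption for illustration, not the package's confirmed formula: the class name LinearTrendSketch and the attributes t_start_ and t_scale_ are hypothetical, and the fitted data must contain at least two distinct timestamps.

```python
from datetime import datetime, timedelta


class LinearTrendSketch:
    """Hypothetical fit/transform pair producing a linear trend value."""

    def fit(self, timestamps):
        # Remember where the training period starts and how long it is.
        self.t_start_ = min(timestamps)
        self.t_scale_ = max(timestamps) - self.t_start_
        return self

    def transform(self, timestamps):
        # Dividing two timedeltas yields a float: 0.0 at the start of the
        # training period, 1.0 at its end, and more than 1.0 afterwards.
        return [(ts - self.t_start_) / self.t_scale_ for ts in timestamps]


start = datetime(2021, 1, 1)
train = [start + timedelta(days=d) for d in range(5)]
trend = LinearTrendSketch().fit(train)
train_growth = trend.transform(train)
future_growth = trend.transform([start + timedelta(days=6)])
```

Storing the start and scale in fit and reusing them in transform keeps the trend consistent between training data and later data, so timestamps beyond the training period simply continue the same line.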

fit(X: pandas.core.frame.DataFrame, y=None)

Fit the feature generator on the available data.

Parameters
  • X (pandas.DataFrame of shape (n_samples, n_features)) – The input dataframe.

  • y (None, optional) – Ignored. Defaults to None.

Returns

Fitted encoder.

Return type

TrendFeatures

Raises

ValueError – If the input data does not pass the checks of utils.check_X.

transform(X: pandas.core.frame.DataFrame)

Apply the feature generator.

Parameters

X (pandas.DataFrame of shape (n_samples, n_features)) – The input dataframe.

Raises
  • ValueError – If the input data does not pass the checks of utils.check_X.

  • ValueError – If common columns are found and replace=False.

Returns

The transformed dataframe.

Return type

pandas.DataFrame