FPS_plus

class PAsampling.wrappers.FPS_plus(method='kmedoids', mu=3)

Implements a modified version of the Farthest Point Sampling (FPS) algorithm.

This class wraps the fps_np function and combines it with a second sampling strategy. It first selects a subset of samples from a dataset with FPS, then selects further samples with one of the following methods: k-medoids, facility location, random sampling, or twinning.
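For readers unfamiliar with the FPS step, the following is a minimal NumPy sketch of greedy farthest point sampling. It is an illustration only and not the library's fps_np implementation; the function name `farthest_point_sampling` and its parameters are hypothetical, and it assumes Euclidean distance and `n_select` no larger than the number of points.

```python
import numpy as np

def farthest_point_sampling(X, n_select, start_idx=0):
    """Greedy FPS sketch: repeatedly pick the point farthest from the selected set."""
    X = np.asarray(X, dtype=float)
    selected = [start_idx]
    # Distance from every point to its nearest already-selected point.
    min_dist = np.linalg.norm(X - X[start_idx], axis=1)
    while len(selected) < n_select:
        next_idx = int(np.argmax(min_dist))
        selected.append(next_idx)
        # Update nearest-selected distances with the newly added point.
        new_dist = np.linalg.norm(X - X[next_idx], axis=1)
        min_dist = np.minimum(min_dist, new_dist)
    return selected

points = np.array([[0.0, 0.0], [1.0, 0.0], [10.0, 0.0], [10.0, 1.0], [5.0, 5.0]])
picked = farthest_point_sampling(points, 3)
print(picked)
```

Each iteration costs one pass over the data, so selecting k points from n takes O(n k) distance evaluations in this sketch.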

Attributes:

method : str, optional (default='kmedoids')

The sampling method to use after the initial FPS selection. Options are ‘kmedoids’, ‘facility_location’, ‘random’, and ‘twin’.

mu : int, optional (default=3)

The number of initial points to select using FPS before applying the chosen strategy, expressed as a percentage of the total number of samples in the dataset. The default value of 3 corresponds to 3%.
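As a rough illustration of how a percentage such as mu translates into a point count, the sketch below converts it for a given dataset size. The helper `fps_budget` is hypothetical, and the rounding rule (round up, at least one point) is an assumption; the library may round differently.

```python
import math

def fps_budget(n_samples, mu=3):
    """Number of FPS points implied by mu percent of n_samples (hypothetical helper)."""
    # Assumption: round up so that small datasets still get at least one point.
    return max(1, math.ceil(n_samples * mu / 100))

n_initial = fps_budget(1000, mu=3)  # 3% of 1000 samples
print(n_initial)
```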

fit(X, initial_subset, b_samples, metric='euclidean', ratio=5, idx_initial_point=0, init_kmedoids='k-medoids++', random_state=None)

Fits the model to the data X and returns the indices of the selected samples.

Parameters:

X : numpy.ndarray, shape (n_samples, n_features)

Input points, representing a set of data points.

initial_subset : list

List of indices (rows of the input points matrix) representing the initial set of selected elements.

b_samples : int

The desired number of points to select.

metric : str, optional (default='euclidean')

The metric to use for computing distances. Options are ‘euclidean’, ‘manhattan’, etc.

ratio : int, optional (default=5)

The ratio parameter for the twinning method.

idx_initial_point : int, optional (default=0)

The initial point index for the twinning method.

init_kmedoids : str, optional (default='k-medoids++')

The method for initialization in k-medoids. Options are ‘random’, ‘heuristic’, ‘k-medoids++’, and ‘build’.

random_state : int, optional (default=None)

The seed used by the random number generator.

Returns:

samples : list

List of indices representing the selected points using the modified FPS algorithm.
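To make the two-stage idea concrete, the sketch below shows how a second stage with the 'random' strategy could top up an FPS selection to b_samples indices. This is not the library's code: the helper `fill_randomly` is hypothetical, and it assumes the final selection consists of the FPS indices followed by points drawn without replacement from the remaining rows.

```python
import numpy as np

def fill_randomly(n_samples, fps_indices, b_samples, random_state=None):
    """Extend an FPS selection to b_samples indices by random sampling (hypothetical helper)."""
    fps_indices = list(fps_indices)
    n_extra = b_samples - len(fps_indices)
    if n_extra < 0:
        raise ValueError("b_samples is smaller than the number of FPS points")
    rng = np.random.default_rng(random_state)
    # Candidates are all rows not already chosen by the FPS stage.
    remaining = np.setdiff1d(np.arange(n_samples), fps_indices)
    extra = rng.choice(remaining, size=n_extra, replace=False)
    return fps_indices + extra.tolist()

samples = fill_randomly(n_samples=100, fps_indices=[0, 57, 99], b_samples=10, random_state=0)
print(samples)
```

Passing a fixed random_state makes the random stage reproducible, which mirrors the role of the random_state parameter documented above.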