grakel.GraphletSampling

class grakel.GraphletSampling(n_jobs=None, normalize=False, verbose=False, random_state=None, k=5, sampling=None)

The graphlet sampling kernel.

See [SPM+09].

If any of “delta”, “epsilon”, “a” or “n_samples” is given, the kernel value is calculated from n_samples randomly picked graphlets (n_samples is either given or derived from the other parameters), with graphlet sizes drawn at random from 3 to k (5 by default). Otherwise, the kernel value is calculated by drawing all possible connected samples of size k.
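The exhaustive mode described above can be pictured with a brute-force sketch. It is not the library's implementation: the helper names `is_connected` and `connected_graphlets` are hypothetical, and the graph is assumed to be a binary adjacency matrix given as nested lists.

```python
from itertools import combinations


def is_connected(adj, nodes):
    """Return True if the subgraph of `adj` induced by `nodes` is connected."""
    nodes = list(nodes)
    allowed = set(nodes)
    seen = {nodes[0]}
    stack = [nodes[0]]
    while stack:
        u = stack.pop()
        for v in allowed:
            if v not in seen and adj[u][v]:
                seen.add(v)
                stack.append(v)
    return len(seen) == len(nodes)


def connected_graphlets(adj, k):
    """List every k-vertex subset whose induced subgraph is connected."""
    return [c for c in combinations(range(len(adj)), k) if is_connected(adj, c)]


# Path graph 0-1-2-3: only consecutive triples induce a connected subgraph.
path = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
print(connected_graphlets(path, 3))  # [(0, 1, 2), (1, 2, 3)]
```

The number of candidate subsets grows combinatorially with the graph size, which is the practical motivation for the sampling mode.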

Parameters
random_state : RandomState or int, default=None

A random number generator instance or an int to initialize a RandomState as a seed.

k : int, default=5

The size (number of vertices) of the graphlets.

sampling : None or dict

Defines whether random sampling of graphlets is used. If not None, the dictionary can contain either:

  • n_samples : int

    Sets the number of randomly drawn samples, with sizes between 3 and k. Overrides the parameters a, epsilon and delta.

or

  • delta : float, default=0.05

    Confidence level (typically 0.05 or 0.1), used to calculate the number of samples that achieves a given bound. The n_samples argument must not be provided, and for delta to take its default value either “epsilon” or “a” must be set.

  • epsilon : float, default=0.05

    Precision level (typically 0.05 or 0.1), used to calculate the number of samples that achieves a given bound. The n_samples argument must not be provided, and for epsilon to take its default value either “delta” or “a” must be set.

  • a : int

    Number of isomorphism classes of graphlets. If -1, the number is the maximum possible, looked up from a database for sizes 1 to 9 or else predicted through interpolation. Used to calculate the number of samples that achieves a given bound. The n_samples argument must not be provided, and for a to take its default value either “delta” or “epsilon” must be set.
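As a rough illustration of how a sample count can be derived from delta, epsilon and a, the sketch below evaluates the bound stated in [SPM+09], m = ceil(2 (a ln 2 + ln(1/delta)) / epsilon^2). The function name `n_samples_bound` is hypothetical, and the library's own computation may differ in its constants or logarithm base.

```python
import math


def n_samples_bound(a, delta=0.05, epsilon=0.05):
    """Sample size from the [SPM+09] bound (assumed form, see lead-in)."""
    if not (0 < delta < 1 and 0 < epsilon < 1):
        raise ValueError("delta and epsilon must lie in (0, 1)")
    return math.ceil(2 * (a * math.log(2) + math.log(1 / delta)) / epsilon ** 2)


# There are 34 non-isomorphic graphs on 5 vertices, so a=34 for k=5.
print(n_samples_bound(a=34))
print(n_samples_bound(a=34, epsilon=0.1))
```

Loosening epsilon from 0.05 to 0.1 cuts the required sample count by about a factor of four, since the bound scales with 1/epsilon^2.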

Attributes
X : dict

A dictionary of pairs between each input graph and the bins where the sampled graphlets have fallen.

sample_graphlets_ : function

A function taking as input a binary adjacency matrix, parametrised to work for the given number of samples, k and deterministic/probabilistic mode.

random_state_ : RandomState

A RandomState object handling all randomness of the class.

_graph_bins : dict

A dictionary of graph bins holding pynauty objects.

_nx : int

Holds the number of sampled X graphs.

_ny : int

Holds the number of sampled Y graphs.

_X_diag : np.array, shape=(_nx, 1)

Holds the diagonal of the X kernel matrix in a numpy array, if calculated (fit_transform).

_phi_X : np.array, shape=(_nx, len(_graph_bins))

Holds the features of X in a numpy array, if calculated (fit_transform).
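The probabilistic mode of the sampling function can be sketched as follows. This is an assumption-laden illustration, not the library's code: `sample_graphlets` is a hypothetical helper, it uses the standard library's `random.Random` in place of a numpy RandomState, and it does not filter for connected subsets because the text above does not say whether the library does.

```python
import random


def sample_graphlets(adj, k, n_samples, seed=None):
    """Draw `n_samples` induced subgraphs on random vertex subsets of size 3..k."""
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    n = len(adj)
    samples = []
    for _ in range(n_samples):
        size = rng.randint(3, min(k, n))
        nodes = rng.sample(range(n), size)
        # Induced adjacency matrix on the chosen vertices.
        samples.append([[adj[u][v] for v in nodes] for u in nodes])
    return samples


graph = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]
first = sample_graphlets(graph, k=4, n_samples=10, seed=1)
second = sample_graphlets(graph, k=4, n_samples=10, seed=1)
print(first == second)  # True: same seed, same samples
```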

Methods

diagonal(self)

Calculate the kernel matrix diagonal for fitted data.

fit(self, X[, y])

Fit a dataset, for a transformer.

fit_transform(self, X)

Fit and transform, on the same dataset.

get_params(self[, deep])

Get parameters for this estimator.

initialize(self)

Initialize all transformer arguments, needing initialization.

pairwise_operation(self, x, y)

Calculate a pairwise kernel between two elements.

parse_input(self, X)

Parse and create features for graphlet_sampling kernel.

set_params(self, **params)

Call the parent method.

transform(self, X)

Calculate the kernel matrix, between given and fitted dataset.

__init__(self, n_jobs=None, normalize=False, verbose=False, random_state=None, k=5, sampling=None)

Initialise a graphlet sampling kernel.

diagonal(self)

Calculate the kernel matrix diagonal for fitted data.

A function called by transform on a separate dataset, so that normalization can be applied externally.

Parameters
None.
Returns
X_diag : np.array

The diagonal of the kernel matrix, of the fitted data. This consists of kernel calculation for each element with itself.

Y_diag : np.array

The diagonal of the kernel matrix, of the transformed data. This consists of kernel calculation for each element with itself.
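One common use of the two diagonals is cosine-style normalization of a kernel matrix computed between transformed and fitted graphs. The sketch below assumes that each entry K[i][j] is divided by sqrt(Y_diag[i] * X_diag[j]); `normalize_kernel` is a hypothetical helper written with plain lists rather than numpy arrays.

```python
import math


def normalize_kernel(K, X_diag, Y_diag):
    """Divide K[i][j] by sqrt(Y_diag[i] * X_diag[j]).

    K has one row per transformed graph and one column per fitted graph.
    """
    return [
        [K[i][j] / math.sqrt(Y_diag[i] * X_diag[j]) for j in range(len(X_diag))]
        for i in range(len(Y_diag))
    ]


K = [[2.0, 4.0]]      # one transformed graph against two fitted graphs
X_diag = [4.0, 16.0]  # k(x_j, x_j) for the fitted graphs
Y_diag = [1.0]        # k(y_i, y_i) for the transformed graph
print(normalize_kernel(K, X_diag, Y_diag))  # [[1.0, 1.0]]
```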

fit(self, X, y=None)

Fit a dataset, for a transformer.

Parameters
X : iterable

Each element must be an iterable with at least one and at most three features. The first, which is obligatory, is a valid graph structure (adjacency matrix or edge_dictionary), the second is node_labels and the third is edge_labels (both fitting the given graph format). These are the train samples.
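To make the expected input shape concrete, here is a small sketch of a two-graph dataset. The tuple-keyed edge dictionary and the label dictionary are assumptions about what counts as a valid graph structure; `n_features_ok` is a hypothetical helper that only checks the "one to three features" rule stated above.

```python
# Graph 1: adjacency matrix only (a triangle).
triangle = [
    [0, 1, 1],
    [1, 0, 1],
    [1, 1, 0],
]
g1 = [triangle]

# Graph 2: edge dictionary plus node labels (a path a-b-c).
edges = {("a", "b"): 1.0, ("b", "a"): 1.0, ("b", "c"): 1.0, ("c", "b"): 1.0}
node_labels = {"a": "C", "b": "O", "c": "C"}
g2 = [edges, node_labels]

X = [g1, g2]


def n_features_ok(element):
    """Check that an element carries between one and three features."""
    return 1 <= len(element) <= 3


print(all(n_features_ok(g) for g in X))  # True
```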

y : None

There is no need of a target in a transformer, yet the pipeline API requires this parameter.

Returns
self : object

Returns self.

fit_transform(self, X)

Fit and transform, on the same dataset.

Parameters
X : iterable

Each element must be an iterable with at least one and at most three features. The first, which is obligatory, is a valid graph structure (adjacency matrix or edge_dictionary), the second is node_labels and the third is edge_labels (both fitting the given graph format). If None, the kernel matrix is calculated on the fitted data. These are the test samples.

y : None

There is no need of a target in a transformer, yet the pipeline API requires this parameter.

Returns
K : numpy array, shape = [n_input_graphs, n_input_graphs]

The kernel matrix: a calculation between all pairs of graphs between targets and features.

get_params(self, deep=True)

Get parameters for this estimator.

Parameters
deep : bool, default=True

If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns
params : mapping of string to any

Parameter names mapped to their values.

initialize(self)

Initialize all transformer arguments, needing initialization.

pairwise_operation(self, x, y)

Calculate a pairwise kernel between two elements.

Parameters
x, y : Object

Objects as produced by parse_input.

Returns
kernel : number

The kernel value.
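A pairwise graphlet kernel value can be viewed as a dot product between two graphlet histograms. The sketch below assumes each object is a dictionary mapping a bin index to the number of sampled graphlets in that bin; `graphlet_dot` is a hypothetical helper, not the method itself.

```python
def graphlet_dot(x, y):
    """Dot product of two sparse histograms mapping bin index -> count."""
    if len(x) > len(y):
        x, y = y, x  # iterate over the smaller histogram
    return sum(count * y[b] for b, count in x.items() if b in y)


x = {0: 3, 2: 1}
y = {0: 2, 1: 5, 2: 4}
print(graphlet_dot(x, y))  # 3*2 + 1*4 = 10
```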

parse_input(self, X)

Parse and create features for graphlet_sampling kernel.

Parameters
X : iterable

For the input to be valid, each element must be an iterable with at least one and at most three features. The first, which is obligatory, is a valid graph structure (adjacency matrix or edge_dictionary), the second is node_labels and the third is edge_labels (both corresponding to the given graph format). Graph type objects are also valid input.

Returns
local_values : dict

A dictionary of pairs between each input graph and the bins where the sampled graphlets have fallen.
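The idea of graphlets "falling into bins" can be sketched by grouping small graphs by isomorphism class. The attribute list above mentions pynauty objects for the bins; the sketch below swaps that for a brute-force canonical form over all vertex permutations, which is only practical for very small graphlets. Both helper names are hypothetical.

```python
from itertools import permutations


def canonical_form(adj):
    """Smallest upper-triangle bit tuple over all vertex relabelings."""
    k = len(adj)
    best = None
    for perm in permutations(range(k)):
        bits = tuple(
            1 if adj[perm[i]][perm[j]] else 0
            for i in range(k)
            for j in range(i + 1, k)
        )
        if best is None or bits < best:
            best = bits
    return best


def bin_graphlets(graphlets, bins):
    """Count graphlets per isomorphism class; `bins` maps canonical form -> index."""
    counts = {}
    for g in graphlets:
        idx = bins.setdefault(canonical_form(g), len(bins))
        counts[idx] = counts.get(idx, 0) + 1
    return counts


path_a = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]    # path with centre vertex 1
path_b = [[0, 1, 1], [1, 0, 0], [1, 0, 0]]    # same path with centre vertex 0
triangle = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]

bins = {}
print(bin_graphlets([path_a, path_b, triangle], bins))  # {0: 2, 1: 1}
```

Sharing one `bins` dictionary across all input graphs keeps bin indices consistent, so the resulting histograms can be compared directly.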

set_params(self, **params)

Call the parent method.

transform(self, X)

Calculate the kernel matrix, between given and fitted dataset.

Parameters
X : iterable

Each element must be an iterable with at least one and at most three features. The first, which is obligatory, is a valid graph structure (adjacency matrix or edge_dictionary), the second is node_labels and the third is edge_labels (both fitting the given graph format).

Returns
K : numpy array, shape = [n_targets, n_input_graphs]

The kernel matrix: a calculation between all pairs of graphs between targets and features.
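A minimal usage sketch, assuming grakel is installed and that adjacency matrices given as nested lists are accepted as graph structures. The constructor arguments and method names come from the signatures above; the helpers `toy_graphs` and `run_example` and the toy data are illustrative only.

```python
def toy_graphs():
    """Three small unlabeled graphs, each given as [adjacency matrix]."""
    triangle = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
    path = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]
    star = [[0, 1, 1, 1], [1, 0, 0, 0], [1, 0, 0, 0], [1, 0, 0, 0]]
    return [[triangle], [path], [star]]


def run_example():
    # Imported lazily so the data helper above works without grakel installed.
    from grakel import GraphletSampling

    graphs = toy_graphs()
    train, test = graphs[:2], graphs[2:]
    gk = GraphletSampling(k=3, sampling={"n_samples": 50}, random_state=0)
    K_train = gk.fit_transform(train)  # documented shape: [n_input_graphs, n_input_graphs]
    K_test = gk.transform(test)        # documented shape: [n_targets, n_input_graphs]
    return K_train, K_test
```

Calling `run_example()` fits on the first two graphs and then evaluates the third against them, which mirrors the train/test split used when the kernel matrix feeds a downstream classifier.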

Bibliography

SPM+09

N. Shervashidze, T. Petri, K. Mehlhorn, K. M. Borgwardt, and S. Vishwanathan. Efficient Graphlet Kernels for Large Graph Comparison. In Proceedings of the International Conference on Artificial Intelligence and Statistics, 488–495. 2009.