grakel.NeighborhoodHash

class grakel.NeighborhoodHash(n_jobs=None, normalize=False, verbose=False, random_state=None, R=3, nh_type='simple', bits=8)[source][source]

Neighborhood hashing kernel as proposed in [HK09].

Parameters
Rint, default=3

The maximum number of neighborhood hash.

nh_typestr, valid_types={“simple”, “count_sensitive”}, default=”simple”

The existing neighborhood hash type as defined in [HK09].

bytesint, default=2

Byte size of hashes.

random_stateRandomState or int, default=None

A random number generator instance or an int to initialize a RandomState as a seed.

Attributes
NH_function

The neighborhood hashing function.

random_state_RandomState

A RandomState object handling all randomness of the class.

Methods

ROT(self, n, d)

rot operation for binary numbers.

diagonal(self)

Calculate the kernel matrix diagonal of the fit/transfromed data.

fit(self, X[, y])

Fit a dataset, for a transformer.

fit_transform(self, X)

Fit and transform, on the same dataset.

get_params(self[, deep])

Get parameters for this estimator.

initialize(self)

Initialize all transformer arguments, needing initialization.

neighborhood_hash_count_sensitive(self, G)

Count sensitive neighborhood hash as defined in [HK09].

neighborhood_hash_simple(self, G)

(simple) neighborhood hashing as defined in [HK09].

pairwise_operation(self, x, y)

Calculate a pairwise kernel between two elements.

parse_input(self, X)

Parse the given input and raise errors if it is invalid.

radix_sort_rot(self, labels)

Sorts vertices based on labels.

set_params(self, \*\*params)

Call the parent method.

transform(self, X)

Calculate the kernel matrix, between given and fitted dataset.

Initialize a neighborhood_hash kernel.

Attributes
X

Methods

ROT(self, n, d)

rot operation for binary numbers.

diagonal(self)

Calculate the kernel matrix diagonal of the fit/transfromed data.

fit(self, X[, y])

Fit a dataset, for a transformer.

fit_transform(self, X)

Fit and transform, on the same dataset.

get_params(self[, deep])

Get parameters for this estimator.

initialize(self)

Initialize all transformer arguments, needing initialization.

neighborhood_hash_count_sensitive(self, G)

Count sensitive neighborhood hash as defined in [HK09].

neighborhood_hash_simple(self, G)

(simple) neighborhood hashing as defined in [HK09].

pairwise_operation(self, x, y)

Calculate a pairwise kernel between two elements.

parse_input(self, X)

Parse the given input and raise errors if it is invalid.

radix_sort_rot(self, labels)

Sorts vertices based on labels.

set_params(self, \*\*params)

Call the parent method.

transform(self, X)

Calculate the kernel matrix, between given and fitted dataset.

__init__(self, n_jobs=None, normalize=False, verbose=False, random_state=None, R=3, nh_type='simple', bits=8)[source][source]

Initialize a neighborhood_hash kernel.

ROT(self, n, d)[source][source]

rot operation for binary numbers.

Parameters
nint

The value which will be rotated.

dint

The number of rotations.

Returns
rotint

The result of a rot operation.

diagonal(self)[source][source]

Calculate the kernel matrix diagonal of the fit/transfromed data.

Parameters
None.
Returns
X_diagnp.array

The diagonal of the kernel matrix between the fitted data. This consists of each element calculated with itself.

Y_diagnp.array

The diagonal of the kernel matrix, of the transform. This consists of each element calculated with itself.

fit(self, X, y=None)[source][source]

Fit a dataset, for a transformer.

Parameters
Xiterable

Each element must be an iterable with at most three features and at least one. The first that is obligatory is a valid graph structure (adjacency matrix or edge_dictionary) while the second is node_labels and the third edge_labels (that fitting the given graph format). The train samples.

yNone

There is no need of a target in a transformer, yet the pipeline API requires this parameter.

Returns
selfobject
Returns self.
fit_transform(self, X)[source][source]

Fit and transform, on the same dataset.

Parameters
Xiterable

Each element must be an iterable with at most three features and at least one. The first that is obligatory is a valid graph structure (adjacency matrix or edge_dictionary) while the second is node_labels and the third edge_labels (that fitting the given graph format). If None the kernel matrix is calculated upon fit data. The test samples.

yNone

There is no need of a target in a transformer, yet the pipeline API requires this parameter.

Returns
Knumpy array, shape = [n_targets, n_input_graphs]

corresponding to the kernel matrix, a calculation between all pairs of graphs between target an features

get_params(self, deep=True)[source]

Get parameters for this estimator.

Parameters
deepbool, default=True

If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns
paramsmapping of string to any

Parameter names mapped to their values.

initialize(self)[source][source]

Initialize all transformer arguments, needing initialization.

neighborhood_hash_count_sensitive(self, G)[source][source]

Count sensitive neighborhood hash as defined in [HK09].

Parameters
Gtuple, len=3

A tuple three elements consisting of vertices sorted by labels, vertex label dict, edge dict and number of occurencies dict for labels.

Returns
vertices_labels_edges_noctuple

A tuple of vertices, new_labels-dictionary and edges.

neighborhood_hash_simple(self, G)[source][source]

(simple) neighborhood hashing as defined in [HK09].

Parameters
Gtuple

A tuple of three elements consisting of vertices sorted by labels, vertex label dictionary and edge dictionary.

Returns
vertices_labels_edgestuple

A tuple of vertices, new_labels-dictionary and edges.

pairwise_operation(self, x, y)[source][source]

Calculate a pairwise kernel between two elements.

Parameters
x, ylist

Dict of len=2, tuples, consisting of vertices sorted by (labels, vertices) and edge-labels dict, for all levels from 0 up to self.R-1.

Returns
kernelnumber

The kernel value.

parse_input(self, X)[source]

Parse the given input and raise errors if it is invalid.

Parameters
Xiterable

For the input to pass the test, we must have: Each element must be an iterable with at most three features and at least one. The first that is obligatory is a valid graph structure (adjacency matrix or edge_dictionary) while the second is node_labels and the third edge_labels (that correspond to the given graph format). A valid input also consists of graph type objects.

Returns
Xplist

List of graph type objects.

radix_sort_rot(self, labels)[source][source]

Sorts vertices based on labels.

Parameters
labelsdict

Dictionary of labels for vertices.

Returns
labels_countslist

A list of labels with their counts (sorted).

set_params(self, **params)[source]

Call the parent method.

transform(self, X)[source][source]

Calculate the kernel matrix, between given and fitted dataset.

Parameters
Xiterable

Each element must be an iterable with at most three features and at least one. The first that is obligatory is a valid graph structure (adjacency matrix or edge_dictionary) while the second is node_labels and the third edge_labels (that fitting the given graph format). If None the kernel matrix is calculated upon fit data. The test samples.

Returns
Knumpy array, shape = [n_targets, n_input_graphs]

corresponding to the kernel matrix, a calculation between all pairs of graphs between target an features

Bibliography

HK09(1,2,3,4,5,6,7,8)

Shohei Hido and Hisashi Kashima. A Linear-time Graph Kernel. In Proceedings of the 9th International Conference on Data Mining, 179–188. 2009.