Kernel (general class)

In the literature, a graph kernel appears as a function: \(k \; : \; \mathcal{G} \times \mathcal{G} \rightarrow \mathbb{R}\) for which there exists a map:

\[\phi \; :\; \mathcal{G} \rightarrow \mathbb{H}, \text{for a Hilbert space } \mathbb{H}\]

where each kernel value can be computed as \(k(G_{i}, G_{j}) = \langle \phi(G_{i}), \phi(G_{j}) \rangle\) where \(\langle \cdot , \cdot \rangle\) denotes the inner product inside this space. The emerging matrix \(\mathbf{K}_{ij} = k(G_{i}, G_{j})\) is known as the kernel matrix. For a kernel matrix to be valid it is required to be positive semidefinite (i.e. \(\lambda_{min} \ge 0\)).

Note

The kernels implemented inside this package have all a Polynomial time complexity.

In many cases, if instead of computing the kernel for each pair of graphs, we calculate it for the whole collection of graphs \([G]_{i=1}^{N}\), we have a significant computational advantage. Given two collections of graphs: \(G^{n}, G^{m}\), the full kernel matrix \(\mathcal{K}\) looks like:

\[\begin{split}\mathcal{K} = \left[ \begin{array}{c||c} \mathcal{K}^{n\times n} & \mathcal{K}^{n\times m} \\ \hline \hline \mathcal{K}^{m\times n} & \mathcal{K}^{m\times m} \end{array} \right]\end{split}\]

Any Kernel object inherited class should match the following behavior:

  • \(\mathcal{K}^{n\times n}=\texttt{<KernelName>.fit_transform}(\mathcal{G}^{n})\)

  • \(\mathcal{K}^{m\times n}=\texttt{<KernelName>.fit}(\mathcal{G}^{\text{n}}).\texttt{transform}(\mathcal{G}^{\text{m}})\)

  • \(\mathcal{K}=\texttt{<KernelName>.fit_transform}([\mathcal{G}^{n}\; \mathcal{G}^{m}])\)

In graph classification, a problem tackled mainly by graph kernels, we are usually interested in calculating the matrices \(\mathcal{K}^{n\times n}\) and \(\mathcal{K}^{m\times n}\).

Parametrization

Any Kernel inherited object comes with three parameters:

  • verbose is a bool parameter in order for the kernel to print messages that are related to the progress of the execution.

  • normalize parameter is a bool parameter which ensures that the kernel output will be normalized, that is \([\mathcal{\hat{K}}]_{ij} = \frac{[\mathcal{K}]_{ij}}{\sqrt[]{[\mathcal{K}]_{ii}*[\mathcal{K}]_{jj}}}\).

  • n_jobs is an int parameter that defines the amount of parallel jobs on which parts of the kernel calculation will be executed (if a parallelization has been implemented).

Note

In order for normalization to happen even in a framework scheme, its Kernel should have an implemented diagonal method.

The Kernel class as discussed above can be found below:

Kernel([n_jobs, normalize, verbose])

A general class for graph kernels.