Nomination

Spectral Vertex Nomination

class graspologic.nominate.SpectralVertexNomination[source]

Class for spectral vertex nomination on a single graph.

Given a graph \(G=(V,E)\) and a subset of \(V\) called \(S\) (the "seed"), Single Graph Vertex Nomination is the problem of ranking all \(V\) in order of relation to members of \(S\). Spectral Vertex Nomination solves this problem by embedding \(G\) into a low dimensional euclidean space (see: Adjacency Spectral Embed Tutorial ), and then generating a nomination list by some distance based algorithm. In the simple unattributed case, for each seed vertex \(u\), the other vertices are ranked in order of euclidean distance from \(u\).

Parameters:
input_graph: bool, default = True

Flag whether to expect two full graphs, or the embeddings.

  • True

    .fit and .fit_predict() expect graphs as adjacency matrix, provided as ndarray of shape (n, n). They will be embedded using the specified embedder.

  • False

    .fit() and .fit_predict() expect an embedding of the graph, i.e. a ndarray of size (n, d).

embedder: str or BaseSpectralEmbed, default = 'ASE'

May provide either a embed object or a string indicating which embedding method to use, which may be either: "ASE" for AdjacencySpectralEmbed or "LSE" for LaplacianSpectralEmbed.

n_neighbors: int, default=None

The number of vertices to nominate for each seed.

metricstr, default = 'euclidean'

Distance metric to use when finding the nearest neighbors, all sklearn metrics available.

metric_paramsdict, default = None

Arguments for the sklearn DistanceMetric specified via metric parameter.

Attributes:
nearest_neighbors_sklearn.neighbors.NearestNeighbors

A fit sklearn NearestNeighbors classifier used to find closest vertices to each seed.

References

[1]

Fishkind, D. E.; Lyzinski, V.; Pao, H.; Chen, L.; Priebe, C. E. Vertex nomination schemes for membership prediction. Ann. Appl. Stat. 9 2015. https://projecteuclid.org/euclid.aoas/1446488749

[2]

Jordan Yoder, Li Chen, Henry Pao, Eric Bridgeford, Keith Levin, Donniell E. Fishkind, Carey Priebe, Vince Lyzinski, Vertex nomination: The canonical sampling and the extended spectral nomination schemes, Computational Statistics & Data Analysis, Volume 145, 2020. http://www.sciencedirect.com/science/article/pii/S0167947320300074

__init__(input_graph=True, embedder='ase', n_neighbors=None, metric='euclidean', metric_params=None)[source]
Parameters:
  • input_graph (bool) --

  • embedder (typing_extensions.Literal[ase, ASE, lse, LSE] | BaseSpectralEmbed) --

  • n_neighbors (int | None) --

  • metric (str) --

  • metric_params (Dict[str, Any] | None) --

fit(X, y=None)[source]

Constructs the embedding if not provided, then calculates the pairwise distance from each seed to each vertex in graph.

Parameters:
Xnp.ndarray
  • If input_graph is True

    Expects a graph as an adjacency matrix, i.e. an ndarray of shape (n, n). Will be embedded using the specified embedder.

  • If input_graph is False

    Expects an embedding of the graph, i.e. a ndarray of size (n, d).

yNoneType

Included by sklearn convention.

Returns:
selfobject

Returns an instance of self.

Parameters:
  • X (ndarray) --

  • y (Any | None) --

Return type:

SpectralVertexNomination

predict(y)[source]

Nominates vertices for each seed vertex. Methodology is distance based ranking.

Parameters:
ynp.ndarray

The indices of the seed vertices. Should be a dim 1 array with length less than \(|V|\).

Returns:
Nomination Listnp.ndarray

Shape is (number_vertices, number_vertices_in_seed) . Each column is a seed vertex, and the rows of each column are a list of vertex indexes from the original adjacency matrix in order degree of match.

Distance Matrixnp.ndarray

The matrix of distances associated with each element of the nomination list.

Parameters:

y (List | ndarray) --

Return type:

Tuple[ndarray, ndarray]

fit_predict(X, y)[source]

Calls this class' fit and then predict methods.

Parameters:
Xnp.ndarray
  • If input_graph is True

    Expects a graph as an adjacency matrix, i.e. an ndarray of shape (n, n). Will be embedded using the specified embedder.

  • If input_graph is False

    Expects an embedding of the graph, i.e. a ndarray of size (n, d).

ynp.ndarray.

List of unattributed seed vertex indices.

Returns:
Nomination Listnp.ndarray

Shape is (number_vertices, number_vertices_in_seed) . Each column is a seed vertex, and the rows of each column are a list of vertex indexes from the original adjacency matrix in order degree of match.

Distance Matrixnp.ndarray

The matrix of distances associated with each element of the nomination list.

Parameters:
  • X (ndarray) --

  • y (ndarray) --

Return type:

Tuple[ndarray, ndarray]

get_metadata_routing()

Get metadata routing of this object.

Please check User Guide on how the routing mechanism works.

Returns:
routingMetadataRequest

A MetadataRequest encapsulating routing information.

get_params(deep=True)

Get parameters for this estimator.

Parameters:
deepbool, default=True

If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:
paramsdict

Parameter names mapped to their values.

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it's possible to update each component of a nested object.

Parameters:
**paramsdict

Estimator parameters.

Returns:
selfestimator instance

Estimator instance.

Vertex Nomination via SGM

class graspologic.nominate.VNviaSGM[source]

This class implements Vertex Nomination via Seeded Graph Matching (VNviaSGM) with the algorithm described in [1].

Rather than providing a 1-1 matching for the vertices of two graphs, as in GraphMatch, VNviaSGM ranks the potential matches for a vertex of interst (VOI) in one to graph to the vertices in another graph, based on probability of matching.

Parameters:
order_voi_subgraph: int, positive (default = 1)

Order used to create induced subgraph on A about VOI where the max distance between VOI and other nodes is order_voi_subgraph. This induced subgraph will be used to determine what seeds are used when the SGM algorithm is called. If no seeds are in this subgraph about VOI, then a UserWarning is thrown, and nomination_list_ is None.

order_seeds_subgraph: int, positive (default = 1)

Order used to create induced subgraphs on A and B. These subgraphs are centered about the seeds that were determined by the subgraph generated by order_voi_subgraph. These two subgraphs will be passed into the SGM algorithm.

n_init: int, positive (default = 100)

Number of random initializations of the seeded graph matching algorithm (SGM). Increasing the number of restarts will make the probabilities returned more precise.

max_nominations: int (default = None)

Max number of nominations to include in the nomination list. If None is passed, then all nominations computed will be returned.

graph_match_kwsdict (default = {})

Gives users the option to pass custom arguments to the graph matching algorithm. Format should be {'arg_name': arg_value, ...}. See GraphMatch

Attributes:
n_seeds_: int

Number of seeds passed in seedsA that occured in the induced subgraph about VOI

nomination_list_: 2d-array

An array containing vertex nominations in the form nomination list = [[j, p_val],...] where p_val is the probability that the VOI matches to node j in graph B (sorted by descending probability)

Notes

VNviaSGM generates an initial induced subgraph about the VOI to determine which seeds are close enough to be used. If no seeds are close enough, then a warning is thrown and nomination_list_ is set to None.

All the seeds that are close enough are then used to generate subgraphs in both A and B. These subgraphs are matched using several random initializations of the seeded graph matching algorithm (SGM), and a nomination list is returned. See GraphMatch for SGM docs

References

[1]

Patsolic, HG, Park, Y, Lyzinski, V, Priebe, CE. Vertex nomination via seeded graph matching. Stat Anal Data Min: The ASA Data Sci Journal. 2020; 13: 229– 244. https://doi.org/10.1002/sam.11454

nomination_list_: ndarray | None
__init__(order_voi_subgraph=1, order_seeds_subgraph=1, n_init=100, max_nominations=None, graph_match_kws={})[source]
Parameters:
  • order_voi_subgraph (int) --

  • order_seeds_subgraph (int) --

  • n_init (int) --

  • max_nominations (int | None) --

  • graph_match_kws (Dict[str, Any]) --

fit(A, B, voi, seeds)[source]

Fits the model to two graphs.

Parameters:
A: 2d-array, square

Adjacency matrix, the graph where voi is known

B: 2d-array, square

Adjacency matrix, the graph where voi is not known

voi: int

Vertex of interest

seeds: list, 2d-array

List of length two, of form [seedsA, seedsB]. The elements of seedsA and seedsB are vertex indices from A and B, respectively, which are known to be matched; that is, vertex seedsA[i] is matched to vertex seedsB[i]. Note: len(seedsA)==len(seedsB).

Returns:
self: An instance of self
Parameters:
  • A (ndarray) --

  • B (ndarray) --

  • voi (int) --

  • seeds (ndarray | List[List[int]]) --

Return type:

VNviaSGM

fit_predict(A, B, voi, seeds)[source]

Fits model to two adjacency matrices and returns nomination list

Parameters:
A: 2d-array, square

Adjacency matrix, the graph where voi is known

B: 2d-array, square

Adjacency matrix, the graph where voi is not known

voi: int

Vertex of interest

seeds: list, 2d-array

List of length two, of form [seedsA, seedsB]. The elements of seedsA and seedsB are vertex indices from A and B, respectively, which are known to be matched; that is, vertex seedsA[i] is matched to vertex seedsB[i]. Note: len(seedsA)==len(seedsB).

Returns:
nomination_list_2d-array

The nomination list.

Parameters:
  • A (ndarray) --

  • B (ndarray) --

  • voi (int) --

  • seeds (ndarray | List[List[int]]) --

Return type:

ndarray | None

get_metadata_routing()

Get metadata routing of this object.

Please check User Guide on how the routing mechanism works.

Returns:
routingMetadataRequest

A MetadataRequest encapsulating routing information.

get_params(deep=True)

Get parameters for this estimator.

Parameters:
deepbool, default=True

If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:
paramsdict

Parameter names mapped to their values.

set_fit_request(*, A='$UNCHANGED$', B='$UNCHANGED$', seeds='$UNCHANGED$', voi='$UNCHANGED$')

Request metadata passed to the fit method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to fit.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

New in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:
Astr, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED

Metadata routing for A parameter in fit.

Bstr, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED

Metadata routing for B parameter in fit.

seedsstr, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED

Metadata routing for seeds parameter in fit.

voistr, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED

Metadata routing for voi parameter in fit.

Returns:
selfobject

The updated object.

Parameters:
Return type:

VNviaSGM

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it's possible to update each component of a nested object.

Parameters:
**paramsdict

Estimator parameters.

Returns:
selfestimator instance

Estimator instance.