{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Nomination via SGM\n", "\n", "This class implements Vertex Nomination via Seeded Graph Matching (VNviaSGM) with the algorithm described in [1].\n", "\n", "Given two graphs $G_1$ and $G_2$ with associated adjacency matrices $A$ and $B$, VNviaSGM proposes a nomination list of potential matches in graph $G_2$ to a vertex of interest $voi \\in G_1$ with associated probabilities. \n", "\n", "Let $A_L(a)$ be the induced subgraph derived from $A$, and centered about vertex $a \\in A$ with a maximum distance from $a$ of $L$. VNviaSGM first finds $A_L(voi)$, and if no seeds are in this subgraph, the algorithm stops early and returns a nomination list of None.\n", "\n", "Define $S_A \\subset A_L(voi)$ to be the seed vertices in the subgraph centered around the voi, with associated seeds from graph $B$ ($S_B$). \n", "\n", "Two subgraphs are then generated around $S_A$ for graph $A$, as well as around the associated seeds $S_B$ for graph $B$.\n", "\n", "Specifically, define $SG_1 = \\underset{s_A \\in S_A}{\\bigcup} A_L(s_A)$ and $SG_2 = \\underset{s_B \\in S_B}{\\bigcup} B_L(s_B)$\n", "\n", "These subgraphs ($SG_1$ and $SG_2$) are matched using SGM over several random initializations, resulting in probabilities corresponding to the proportion in which a node in $B$ is matching to the voi. See [Graph Matching Algorithm Reference](https://graspologic.readthedocs.io/en/latest/reference/match.html#graph-matching) for more details. \n", "\n", "\n", "\n", "[1] Patsolic, HG, Park, Y, Lyzinski, V, Priebe, CE. Vertex nomination via seeded graph matching. Stat Anal Data Min: The ASA Data Sci Journal. 2020; 13: 229– 244. https://doi.org/10.1002/sam.11454\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from graspologic.nominate import VNviaSGM\n", "from graspologic.simulations import er_np\n", "from graspologic.plot import heatmap\n", "import numpy as np\n", "import matplotlib.pyplot as plt\n", "\n", "np.set_printoptions(suppress=True)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Define parameters\n", "n = 50\n", "p = 0.3\n", "num_seeds = 4\n", "\n", "voi = 5 # choose a vertex of interest" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "np.random.seed(2)\n", "G1 = er_np(n=n, p=p)\n", "node_shuffle_input = np.random.permutation(n)\n", "\n", "G2 = G1[np.ix_(node_shuffle_input, node_shuffle_input)]\n", "\n", "heatmap(G1, title = \"Origional ER Graph (unshuffled)\")\n", "heatmap(G2, title = \"Shuffled ER graph\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "kklst= [(xx, yy) for xx, yy in zip(node_shuffle_input, np.arange(len(node_shuffle_input)))]\n", "kklst.sort(key=lambda x:x[0])\n", "print(\"Association voi in G1 to vertex in G2 =\", kklst[voi])\n", "kklst = np.array(kklst)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The algorithm produces an $n \\times 2$ nomination list, where n is the number of nominees. Each row has the following format (vertex $j \\in G_2$, probability that j matches voi). Note: the output is sorted with the largest probability coming first in the output list. " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "VNalg = VNviaSGM()\n", "print(VNalg.fit_predict(G1, G2, voi, [kklst[0:num_seeds, 0], kklst[0:num_seeds, 1]]))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As seen, the actual correspondence is 5--37 and the model predicts that 5 (in graph $G_1$) matches with 37 (in graph $G_2$) with >90% confidence." ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.12" } }, "nbformat": 4, "nbformat_minor": 2 }