Stanford InfoLab Publication Server

Identifying Users in Social Networks with Limited Information

Vesdapunt, Norases and Garcia-Molina, Hector (2014) Identifying Users in Social Networks with Limited Information. Technical Report. Stanford InfoLab.

Warning: There is a more recent version of this item available.

PDF - Draft Version (4 MB)

Abstract

We study the problem of Entity Resolution (ER) with limited information. ER is the problem of identifying and merging records that represent the same real-world entity. In this paper, we focus on resolving a single node g from one social graph (Google+ in our case) against a second social graph (Twitter in our case). We want to find the best match for g in Twitter by dynamically probing the Twitter graph through a public API, subject to the limit on the number of API calls that social systems allow. We propose a strategy that is designed for limited information and is robust to rule changes. We evaluate our strategy against a random strategy on a real dataset and show that it is 46% faster and achieves 88.3% accuracy. We also propose a greedy strategy, a natural extension of bipartite matching subject to limited API calls, which achieves 85% accuracy.
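
To make the idea of probing a graph under a call limit concrete, the following is a minimal sketch and not the strategy proposed in the report. It assumes a toy in-memory graph in place of the Twitter API, counts one unit of budget per neighbor lookup, and scores candidates by plain string similarity of account names. The names BudgetedGraph, name_similarity, and best_match are hypothetical and chosen only for illustration.

```python
import heapq
from difflib import SequenceMatcher


def name_similarity(a, b):
    """Similarity in [0, 1] between two account names (illustrative scoring only)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


class BudgetedGraph:
    """Toy stand-in for a rate-limited API: each neighbors() call costs one unit."""

    def __init__(self, adjacency, budget):
        self.adjacency = adjacency
        self.budget = budget
        self.calls = 0

    def neighbors(self, node):
        if self.calls >= self.budget:
            raise RuntimeError("API budget exhausted")
        self.calls += 1
        return self.adjacency.get(node, [])


def best_match(target_name, seeds, graph):
    """Best-first probing: expand the most name-similar unexplored node first.

    Assumes seeds is a non-empty list of starting accounts in the second graph.
    """
    seen = set(seeds)
    # heapq is a min-heap, so scores are negated to pop the highest score first.
    frontier = [(-name_similarity(target_name, s), s) for s in seeds]
    heapq.heapify(frontier)
    best = max(seeds, key=lambda s: name_similarity(target_name, s))
    best_score = name_similarity(target_name, best)
    while frontier and graph.calls < graph.budget:
        _, node = heapq.heappop(frontier)
        for nb in graph.neighbors(node):
            if nb in seen:
                continue
            seen.add(nb)
            score = name_similarity(target_name, nb)
            if score > best_score:
                best, best_score = nb, score
            heapq.heappush(frontier, (-score, nb))
    return best, best_score


# Hypothetical data: which accounts each account links to.
adjacency = {
    "alice": ["bob", "jon_smith"],
    "bob": ["john.smith"],
    "jon_smith": ["johnsmith"],
}
graph = BudgetedGraph(adjacency, budget=2)
match, score = best_match("johnsmith", ["alice"], graph)
```

With a budget of two calls, the sketch spends the second call on the neighbor whose name looks most like the target, which is the kind of trade-off a limited-call setting forces. A real matcher would likely combine more signals than the account name.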

Item Type: Techreport (Technical Report)
ID Code: 1086
Deposited By: Norases Vesdapunt
Deposited On: 21 Feb 2014 14:34
Last Modified: 10 Oct 2014 12:58
