Contributors: Department of Computer Science - K.U.Leuven; Catholic University of Leuven = Katholieke Universiteit Leuven (KU Leuven); Jozef Stefan Institute Ljubljana (IJS); Machine Learning in Information Networks (MAGNET); Inria Lille - Nord Europe; Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria)-Centre de Recherche en Informatique, Signal et Automatique de Lille - UMR 9189 (CRIStAL); Centrale Lille-Université de Lille-Centre National de la Recherche Scientifique (CNRS)-Centrale Lille-Université de Lille-Centre National de la Recherche Scientifique (CNRS); Irma Ravkic was partially supported by the KU Leuven Research Fund (OT/11/051). Martin Žnidaršič was partially supported by the KU Leuven Research Fund (OT/11/051) and the Slovenian Research Agency (P2-0103). Jesse Davis is partially supported by the KU Leuven Research Fund (OT/11/051, C14/17/070, C22/15/015, C32/17/036) and FWO-Vlaanderen (G.0356.12, SBO-150033).; European Project: 240186,EC:FP7:ERC,ERC-2009-StG,MIGRANT(2009)
نبذة مختصرة : International audience ; Counting the number of times a pattern occurs in a database is a fundamental data mining problem. It is a subroutine in a diverse set of tasks ranging from pattern mining to supervised learning and probabilistic model learning. While a pattern and a database can take many forms, this paper focuses on the case where both the pattern and the database are graphs (networks). Unfortunately , in general, the problem of counting graph occurrences is #P-complete. In contrast to earlier work, which focused on exact counting for simple (i.e., very short) patterns, we present a sampling approach for estimating the statistics of larger graph pattern occurrences. We perform an empirical evaluation on synthetic and real-world data that validates the proposed algorithm, illustrates its practical behavior and provides insight into the trade-off between its accuracy of estimation and computational efficiency.
No Comments.