Hasso-Plattner-Institut
  
    • de
Hasso-Plattner-Institut
Prof. Dr. Emmanuel Müller
  
 

EvoMine

Complex networks like social networks are ever-changing. New links are formed, existing ties are broken, individuals change their attitudes. In this project, we aim at microscopic descriptions of the processes that govern network evolution by mining frequently occurring graph evolution rules. These rules formally characterize the evolution of a dynamic network, help domain experts analyze the underlying processes, and allow to build data-driven models for friendship and opinion dynamics. In collaboration with COPAN@Potsdam Institute for Climate Impact Research.

Erik Scharwächter, Emmanuel Müller, Jonathan Donges, Marwan Hassani, Thomas Seidl. Detecting Change Processes in Dynamic Networks by Frequent Graph Evolution Rule Mining. In: Proceedings of the IEEE International Conference on Data Mining (ICDM), 2016. (in press)

Corresponding author: Erik Scharwächter <erik.scharwaechter(at)hpi.de>

Datasets

Datasets can be obtained from the following links:

The DBLP dataset can be used as provided online, whereas the Epinions and Flickr dataset need preprocessing (remove self-loops, use natural numbers as timestamps) to be in the right format. We provide all preprocessed datasets used in our experiments at [TBA, please send an e-mail to get the data].

Supplementary results

The publication contains illustrations of the top most frequent patterns extracted by EvoMine (Fig. 8 and Fig. 9). Here we provide access to all extracted patterns for all algorithms. The table follows the structure of Table II of the publication.




max. edges = 3

max. edges = 4
max. edges = 5

EM-embEM-evGERMLFREM-embEM-evGERMLFREM-embEM-evGERMLFR
DBLP 92-02pdfpdfpdfpdfpdfpdfpdfpdfpdfpdfpdfpdf
DBLP 03-05pdfpdfpdfpdfpdfpdfpdfpdfpdfpdfpdfpdf
DBLP 05-07pdfpdfpdfpdfpdfpdfpdfpdfpdfN/Apdfpdf
Epinions 1-30pdfpdfpdfpdf

 

The parameters used for the results are:

  • DBLP: tau = 5000, max. edges = 3, 4, 5
  • Epinions: tau = 100, max. edges = 3

Binaries

Statically linked executables for our implementations of EvoMine, GERM and LFR-Miner for x86_64 Linux machines can be found here. The executables require OpenMP. [Download]

  • evomine: … for networks with edge insertions only (stored as timestamped graphs)
  • evomineDyn: … for fully dynamic networks (transaction list of labeled graphs)
  • germ: … for networks with edge insertions only (stored as timestamped graphs)
  • lfrminer: … for networks with edge insertions only (stored as timestamped graphs)
  • projectslim: … to compute the database projections without mining

Usage

Instructions can be obtained by running the executables without any arguments. The support measure used by EvoMine must be selected with the -T parameter, where full refers to the embedding-based support on the full union graph projection, neigh uses the compressed union graph projection, and event is the event-based support on the event graph projection.

./evomine [OPTIONS] -f FILE
-c has edge colors
-C has node colors
-d directed graph
-e MAX_EDGES maximum number of edges in patterns
-s MIN_SUPP minimum support of patterns
-j NUM_JOBS number of OpenMP jobs/threads
-L TIMESTEPS limit for the maximum number of timesteps to read
-t has edge timestamps
-r has edge removal timestamps
-T PROJECTION projection type (full, neigh, event)
-u undirected graph

To obtain rules up to a size of 3 edges with a minimum support threshold of 5000 from the DBLP 92–02 network using the compressed union graph database, run:

./evomine -e 3 -s 5000 -T neigh -u -r -t -f dblp9202

Results will be stored in the file dblp9202.out.evomine.NEIGH.5000.3.

File format

For networks with edge insertions only: single graph with edge timestamps.

v 0 [node color]
...
v N [node color]
e v1 v2 [edge color] [edge timestamp [edge removal timestamp]]
...

For networks with edge insertions/deletions and node/edge relabelings: transaction list of graphs.

t # 0
v 0 [node color]
...
v N [node color]
e v1 v2 [edge color]
...
t # 1
...