communities_generator

This module generates communities and their evolution in a graph.

class CommunitiesGenerator(community_count=30, snapshot_count=12, community_size_min=3, core_nodes_ratio=0.5, matching_metric_type=<class 'dyn.benchmark.generator.communities_generator.RelativeOverlap'>, seed=None)

Bases: IGenerator

This class generates a graph for the evolution of communities.

This generator works by running the following steps:

Parameters:
  • community_count (int) – target number of evolving communities

  • snapshot_count (int) – target number of snapshots

  • community_size_min (int) – minimum size of a static community

  • core_nodes_ratio (float) – ratio of members staying in community

  • matching_metric_type (Type[MatchingMetric]) –

  • seed (Optional[Any]) –

Note

All methods prefixed with draw_ are probability distribution that can be overriden when subclassing.

change_size(size)

Return a new size greater than the minimum size of the communities.

The new size if chosen using a truncated normal function centered on the current size. The minimum value is set to the minimum size of the communities.

\[max(round(size * community\_change_{ratio}), community\_size_{min}\]
Parameters:

size (int) –

Return type:

int

community_change_ratio()
community_lifetime()
community_size()
community_start(community_length=0)
configure_seed(seed)

Set :attr`seed` and :attr`rng`.

Parameters:

seed (Any) –

Returns:

instance generator

Note

necessary to easily override it while being able to call it

copy()

Return a copy of generator.

Return type:

Self

Returns:

create_communities()

Create the communities across the time steps.

The lifetime of the communities is chosen randomly using a probability distribution. Their starting time step is chosen randomly. Their size changes over the time steps as the communities grow and shrink.

After this step, all communities are created as unconnected nodes with an evolving community, a size and a snapshot.

create_communities_generator()

Return the communities generator across the time steps.

The lifetime of the communities is chosen randomly using a probability distribution. Their starting time step is chosen randomly. Their size changes over the time steps as the communities grow and shrink.

After this step, all communities are created as unconnected nodes with an evolving community, a size and a snapshot.

create_community(community)

Generate one community.

Parameters:

community (Any) –

create_inter_community_migrations()

Create inter-community flows.

For each snapshot, the remaining output flow of each node is distributed towards each successive node (other than the community successor) while keeping the same predecessors/successors according to the matching metric, and node sizes.

The distribution route the highest available output flow towards the highest available input flow (node with biggest gap between input flow and size), iteratively.

create_inter_community_migrations_generator()

Create inter-community flows generator.

For each snapshot, the remaining output flow of each node is distributed towards each successive node (other than the community successor) while keeping the same predecessors/successors according to the matching metric, and node sizes.

The distribution route the highest available output flow towards the highest available input flow (node with biggest gap between input flow and size), iteratively.

create_intra_community_migration(node)

Create intra-community output flow for given node.

If node community is continuing, core_nodes_ratio ratio of node size is the intra-community flow. This flow is clamped to not exceed the size of the successor.

Parameters:

node

create_intra_community_migrations()

Create intra-community flows.

For each continuing community node, core_nodes_ratio ratio of node size is the intra-community flow. This flow is clamped to not exceed the size of the successor.

create_intra_community_migrations_generator()

Create intra-community flows generator.

For each continuing community node, core_nodes_ratio ratio of node size is the intra-community flow. This flow is clamped to not exceed the size of the successor.

create_migrations()

Create edges between the communities to migrate nodes.

First intra-community edges are created, then inter-community ones.

create_snapshot_inter_community_migrations(t)

Create inter community migrations on given snapshot.

The remaining output flow of each node is distributed towards each successive node (other than the community successor) while keeping the same predecessors/successors according to the matching metric, and node sizes.

The distribution route the highest available output flow towards the highest available input flow (node with biggest gap between input flow and size), iteratively.

Parameters:

t (int) –

draw_change_ratio(*args, **kwargs)

Draw community change ratio.

This value is used to determine how much a community changes at each snapshot according to its current size. A ratio of 0 is interpreted as a stable community, -1 corresponds to all member leaving the community and 1 corresponds to as many new members as current size.

Returns:

drawn community change ratio

Warning

This value should be greater of equal to -1

draw_community_lifetime(*args, **kwargs)

Draw community lifetime.

Returns:

drawn lifetime

Warning

This should return a value between 1 and snapshot_count

draw_community_size(*args, **kwargs)

Draw size of a community.

Returns:

drawn size

Warning

This should return a value greater or equal to community_size_min

draw_community_start(*args, **kwargs)

Draw community birth time.

This value is meant to be linearly transformed towards the interval \([0, snapshot\_count + 1 - lifetime]\). This way a value of 0 is be interpreted as a birth at snapshot 0 while a value of 1 is the last snapshot possible for the community to be born while living its full lifetime without going over snapshot_count.

Returns:

drawn community birth time

Warning

This should return a value between 0 and 1

generate()

Generate evolving community flow graph.

Return type:

EvolvingCommunitiesGraph

Returns:

evolving community flow graph

property matching_metric: MatchingMetric

Return matching metric

Return type:

MatchingMetric

property rng: Generator

Random number generator of the generator

Return type:

Generator

property seed: SeedSequence

Seed of the generator

Return type:

SeedSequence

spawn(n_children=1)

Spawn a copy of the generated with a child seed.

Return type:

Union[Sequence[Self], Self]

Returns:

new generator

validate()

Validate the generated community flow graph.

Raises:

ValueError

  • if any community node has a bigger in flow than its size

  • if any community node has a bigger out flow than its size

  • if communities don’t match those built by matching metric

class Match(evolving_communities)

Bases: MatchingMetric

This class computes the Match Matching metric.

\[min(| C_0 \cap C_1 | / | C_0 |, | C_0 \cap C_1 | / | C_1 |)\]
Parameters:

evolving_communities (EvolvingCommunitiesGraph) –

compute(ct0, ct1)

Compute matching metric.

Parameters:
  • ct0 – source community node

  • ct1 – target community node

Return type:

float

Returns:

matching metric result

compute_max_flow(ct0, ct1, max_mm)

Compute maximum flow below target matching metric result.

Parameters:
  • ct0 – source community node

  • ct1 – target community node

  • max_mm – matching metric result strict upper bound

Return type:

int

Returns:

flow

match()

Compute evolving communities as node clusters using matching metric.

Return type:

Dict[Any, Set[Any]]

Returns:

evolving communities as clusters

class MatchingMetric(evolving_communities)

Bases: ABC

This abstract class defines the interface of matching metrics in the context of this module.

Parameters:

evolving_communities (EvolvingCommunitiesGraph) –

abstract compute(ct0, ct1)

Compute matching metric.

Parameters:
  • ct0 – source community node

  • ct1 – target community node

Return type:

float

Returns:

matching metric result

compute_max_flow(ct0, ct1, max_mm)

Compute maximum flow below target matching metric result.

Parameters:
  • ct0 – source community node

  • ct1 – target community node

  • max_mm – matching metric result strict upper bound

Return type:

int

Returns:

flow

match()

Compute evolving communities as node clusters using matching metric.

Return type:

Dict[Any, Set[Any]]

Returns:

evolving communities as clusters

class Overlap(evolving_communities)

Bases: MatchingMetric

This class computes the Overlap Matching metric.

\[| C_0 \cap C_1 |\]
Parameters:

evolving_communities (EvolvingCommunitiesGraph) –

compute(ct0, ct1)

Compute matching metric.

Parameters:
  • ct0 – source community node

  • ct1 – target community node

Return type:

float

Returns:

matching metric result

compute_max_flow(ct0, ct1, max_mm)

Compute maximum flow below target matching metric result.

Parameters:
  • ct0 – source community node

  • ct1 – target community node

  • max_mm – matching metric result strict upper bound

Return type:

int

Returns:

flow

match()

Compute evolving communities as node clusters using matching metric.

Return type:

Dict[Any, Set[Any]]

Returns:

evolving communities as clusters

class RelativeOverlap(evolving_communities)

Bases: MatchingMetric

This class computes the Relative Overlap Matching metric.

\[| C_0 \cap C_1 | / | C_0 \cup C_1 |\]
Parameters:

evolving_communities (EvolvingCommunitiesGraph) –

compute(ct0, ct1)

Compute matching metric.

Parameters:
  • ct0 – source community node

  • ct1 – target community node

Return type:

float

Returns:

matching metric result

compute_max_flow(ct0, ct1, max_mm)

Compute maximum flow below target matching metric result.

Parameters:
  • ct0 – source community node

  • ct1 – target community node

  • max_mm – matching metric result strict upper bound

Return type:

int

Returns:

flow

match()

Compute evolving communities as node clusters using matching metric.

Return type:

Dict[Any, Set[Any]]

Returns:

evolving communities as clusters

main(output)

Generate and save communities and their evolution based on configuration.

Parameters:

output (str) – output filename