==================================
Ground truth Generation Parameters
==================================

This reference page gives an overview of the parameters used to generate evolving communities with the ``dyn-benchmark`` package.

As described in our paper, the benchmark generation process consists of three main components:

1. **Evolving Communities Generation**: Creates evolving communities and their interactions
2. **Node Generation**: Assigns members to communities at each snapshot
3. **Graph Generation**: Creates the underlying network structure

Each component is configurable through parameters that control the properties of the generated benchmark.

1. Evolving Communities Generator Parameters
--------------------------------------------

Community Generator
~~~~~~~~~~~~~~~~~~~

The :class:`CommunitiesGenerator` class creates the structure of evolving communities:

.. list-table::
   :widths: 30 25 45
   :header-rows: 1

   * - Parameter
     - Default Value
     - Description
   * - ``community_count``
     - ``30``
     - Number of evolving communities to generate
   * - ``snapshot_count``
     - ``12``
     - Number of time snapshots in the temporal network
   * - ``community_size_min``
     - ``3``
     - Minimum size of any static community
   * - ``core_nodes_ratio``
     - ``0.5``
     - Fraction of members that stay in the same community between snapshots
   * - ``matching_metric_type``
     - ``RelativeOverlap``
     - Metric used to match communities across snapshots
   * - ``seed``
     - ``None``
     - Random seed for reproducibility

Example of creating a custom community generator:

.. code-block:: python

   generator = CommunitiesGenerator(
       community_count=15,
       snapshot_count=10,
       community_size_min=5,
       core_nodes_ratio=0.7
   )

Probability Distribution Methods
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Community evolution is driven by probability distributions. Each one is drawn by a method that can be overridden in a custom subclass:

.. list-table::
   :widths: 30 25 45
   :header-rows: 1

   * - Method
     - Default Distribution
     - Description
   * - ``draw_community_size()``
     - Normal(μ=10, σ=3)
     - Initial size of communities
   * - ``draw_community_lifetime()``
     - TruncNormal(μ=3, σ=2, min=1, max=5)
     - How long communities exist
   * - ``draw_community_start()``
     - Uniform(0, 1)
     - When communities appear (scaled to valid snapshot range)
   * - ``draw_change_ratio()``
     - Normal(μ=0, σ=0.2)
     - How communities change in size over time

Example of custom probability distributions:

.. code-block:: python

   class CustomGenerator(CommunitiesGenerator):
       def draw_community_size(self, *args, **kwargs):
           # Larger communities: Normal(μ=50, σ=10)
           return self.rng.normal(loc=50, scale=10)

       def draw_community_lifetime(self, *args, **kwargs):
           # Normal(μ=6, σ=2), clamped to [2, snapshot_count]
           return max(2, min(self.snapshot_count, self.rng.normal(loc=6, scale=2)))

Matching Metrics
~~~~~~~~~~~~~~~~

Three metrics are available for determining how communities are linked across snapshots:

.. list-table::
   :widths: 30 25 45
   :header-rows: 1

   * - Metric
     - Formula
     - Use Case
   * - ``Match``
     - :math:`\min\left(\frac{|C_0 \cap C_1|}{|C_0|}, \frac{|C_0 \cap C_1|}{|C_1|}\right)`
     - When community sizes vary significantly
   * - ``RelativeOverlap``
     - :math:`\frac{|C_0 \cap C_1|}{|C_0 \cup C_1|}`
     - Default, balanced approach
   * - ``Overlap``
     - :math:`|C_0 \cap C_1|`
     - When community sizes are stable

2. Node Generator Parameters
----------------------------

The :class:`RandomMemberGenerator` class assigns members to communities:

.. list-table::
   :widths: 20 15 65
   :header-rows: 1

   * - Parameter
     - Default Value
     - Description
   * - ``seed``
     - ``None``
     - Random seed for reproducibility

Member attributes (generated internally):

.. list-table::
   :widths: 20 80
   :header-rows: 1

   * - Attribute
     - Description
   * - ``coreness``
     - Tendency to join previously visited communities
   * - ``intermittence``
     - Tendency to stay out of the network temporarily

3. Graph Generator Parameters
-----------------------------

Multiple graph generation models are available:

Stochastic Block Model (SBM)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The Stochastic Block Model (SBM) is a classic community-based random graph model in which edge probabilities depend on community membership. It is available through the :class:`SBM` class.

.. list-table::
   :widths: 20 15 65
   :header-rows: 1

   * - Parameter
     - Default Value
     - Description
   * - ``p_in``
     - ``0.8``
     - Probability of an edge between nodes in the same community
   * - ``p_out``
     - ``0.01``
     - Probability of an edge between nodes in different communities
   * - ``max_iter``
     - ``10``
     - Maximum attempts to generate a connected graph
   * - ``seed``
     - ``None``
     - Random seed for reproducibility

Preferential Attachment Model (PAM)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The Preferential Attachment Model (PAM) creates scale-free networks in which new nodes preferentially connect to existing high-degree nodes. It is implemented in the :class:`PAM` class.

.. list-table::
   :widths: 20 15 65
   :header-rows: 1

   * - Parameter
     - Default Value
     - Description
   * - ``m``
     - ``5``
     - Number of edges to add for each new node
   * - ``self_loop``
     - ``False``
     - Whether self-loops are allowed
   * - ``seed``
     - ``None``
     - Random seed for reproducibility

Block Preferential Attachment Model (BPAM) and FastBPAM
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The Block Preferential Attachment Model (BPAM) and its optimized variant FastBPAM combine community structure with preferential attachment, producing networks that have both community organization and scale-free properties. They are available through the :class:`BPAM` and :class:`FastBPAM` classes.

.. list-table::
   :widths: 20 15 65
   :header-rows: 1

   * - Parameter
     - Default Value
     - Description
   * - ``gamma_in``
     - ``0.8``
     - Intra-community interaction strength
   * - ``gamma_out``
     - ``0.01``
     - Inter-community interaction strength
   * - ``m``
     - ``5``
     - Number of edges to add for each new node
   * - ``self_loop``
     - ``False``
     - Whether self-loops are allowed
   * - ``seed``
     - ``None``
     - Random seed for reproducibility

Main Generator Parameters
-------------------------

The :class:`GroundtruthGenerator` combines all components:

.. list-table::
   :widths: 30 30 40
   :header-rows: 1

   * - Parameter
     - Default Value
     - Description
   * - ``community_generator``
     - ``CommunitiesGenerator()``
     - Generator for evolving communities
   * - ``node_generator``
     - ``RandomMemberGenerator()``
     - Generator for members
   * - ``edge_generator``
     - ``FastBPAM()``
     - Generator for network structure
   * - ``seed``
     - ``None``
     - Master seed for reproducibility

For monitoring progress during generation:

.. code-block:: python

   # Use ProgressiveGroundtruthGenerator instead of GroundtruthGenerator
   # for visual feedback during generation
   generator = ProgressiveGroundtruthGenerator(
       community_generator=CommunitiesGenerator(),
       seed=42
   )

4. Some examples
----------------

For realistic social network-like benchmarks:

.. code-block:: python

   # Social network-like settings
   generator = GroundtruthGenerator(
       community_generator=CommunitiesGenerator(
           community_count=20,
           snapshot_count=10,
           community_size_min=5,
           core_nodes_ratio=0.7
       ),
       edge_generator=FastBPAM(
           gamma_in=0.7,
           gamma_out=0.05,
           m=5
       ),
       seed=42
   )

For stable communities with clear boundaries:

.. code-block:: python

   # Stable, clearly defined communities
   generator = GroundtruthGenerator(
       community_generator=CommunitiesGenerator(
           core_nodes_ratio=0.9
       ),
       edge_generator=SBM(
           p_in=0.8,
           p_out=0.01
       ),
       seed=42
   )

For highly dynamic communities:

.. code-block:: python

   # Highly dynamic communities
   generator = GroundtruthGenerator(
       community_generator=CommunitiesGenerator(
           core_nodes_ratio=0.3
       ),
       edge_generator=SBM(
           p_in=0.6,
           p_out=0.1
       ),
       seed=42
   )
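
For a closer look at how the three matching metrics differ, the formulas from the Matching Metrics table can be evaluated on plain Python sets. This is a minimal sketch that does not use ``dyn-benchmark``: the function names ``overlap``, ``relative_overlap`` and ``match`` are hypothetical helpers written for illustration, not the package's API, and returning ``0.0`` for empty communities is an assumption made here.

.. code-block:: python

   def overlap(c0, c1):
       """Overlap: raw number of shared members, |C0 ∩ C1|."""
       return len(c0 & c1)

   def relative_overlap(c0, c1):
       """RelativeOverlap: shared members over the union, |C0 ∩ C1| / |C0 ∪ C1|."""
       union = c0 | c1
       # Assumption: two empty communities score 0.0 rather than raising.
       return len(c0 & c1) / len(union) if union else 0.0

   def match(c0, c1):
       """Match: the smaller of the two shared-member fractions."""
       if not c0 or not c1:
           # Assumption: an empty community matches nothing.
           return 0.0
       shared = len(c0 & c1)
       return min(shared / len(c0), shared / len(c1))

   # A community of 4 members that grows to 6 while keeping 2 of them.
   before = {"a", "b", "c", "d"}
   after = {"c", "d", "e", "f", "g", "h"}

   print(overlap(before, after))           # 2
   print(relative_overlap(before, after))  # 2 / 8 = 0.25
   print(match(before, after))             # min(2/4, 2/6), about 0.333

``Overlap`` only counts shared members, so it ignores how much each community grew or shrank, while the two normalized metrics penalize the size mismatch in different ways.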