Complex networks | Non-hierarchical

The non-hierarchical methods approach the problem from a different perspective. In principle, they intend to calculate a full distance matrix for the nodes of the network. This can then be treated by conventional techniques.

One of the earliest approaches to community detection is due to Eriksen et al. [41, 42]. They study a diffusion process on a network and analyze how the modes of the following discrete-time diffusive system decay:

$$
\rho_{i}(t+1)-\rho_{i}(t)=\sum_{j}\left(T_{i j}-\delta_{i j}\right) \rho_{j}(t)
$$
Here $T_{i j}$ is the degree-normalized adjacency matrix of the network, with $T_{i j}=1 / k_{j}$ for $A_{i j}=1$ and zero otherwise, where $k_{j}$ is the degree of node $j$. Hence $T_{i j}$ is the probability that a random walker moves from $j$ to $i$. The decay of a random initial configuration $\rho(t=0)$ toward the steady state is characterized by the eigenmodes of the transition matrix $T_{i j}$. The eigenvectors corresponding to the largest eigenvalues can then be used to define a distance between nodes, which helps in identifying communities. To do this, the eigenvectors belonging to the largest non-trivial positive eigenvalues are plotted against each other, so that nodes of the same community appear close together. This diffusion approach is very similar in spirit to other algorithms that use flow simulations for community detection, as suggested by van Dongen [43] under the name of "Markov clustering" (MCL).
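
The diffusion equation above can be iterated directly. The following is a minimal sketch and not the procedure of Refs. [41, 42]: the toy graph (two triangles joined by one edge), the number of steps, and all names are assumptions made for illustration. It uses the fact that the steady state of this walk is proportional to the node degrees and reads the slowest-decaying mode off the remaining deviation. On this particular graph that mode belongs to the largest non-trivial positive eigenvalue; in general a negative eigenvalue of larger magnitude could dominate, and a proper eigendecomposition would be needed.

```python
# Toy graph (assumption): triangles {0,1,2} and {3,4,5} joined by the edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
n = 6

neighbors = [[] for _ in range(n)]
for a, b in edges:
    neighbors[a].append(b)
    neighbors[b].append(a)
degree = [len(nb) for nb in neighbors]


def diffusion_step(rho):
    """One time step: rho_i(t+1) = sum_j T_ij rho_j(t), T_ij = 1/k_j for neighbors."""
    return [sum(rho[j] / degree[j] for j in neighbors[i]) for i in range(n)]


# Start with all density on node 0 and let it diffuse.
rho = [1.0] + [0.0] * (n - 1)
for _ in range(60):
    rho = diffusion_step(rho)

# The steady state of this walk is proportional to the degrees.
total_degree = sum(degree)
steady = [k / total_degree for k in degree]

# What is left after many steps is dominated by the slowest-decaying mode.
residual = [r - s for r, s in zip(rho, steady)]
left = [i for i in range(n) if residual[i] > 0]
right = [i for i in range(n) if residual[i] <= 0]
print(left, right)
```

On this graph the sign of the residual separates the two triangles: nodes 0 to 2 end up in `left` and nodes 3 to 5 in `right`. With more than two communities, several eigenvectors would have to be compared, as the text describes.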

The method presented by Zhou [44-46] first converts the sparse adjacency matrix of the graph into a full distance matrix by calculating the average time a Brownian particle needs to move from node $i$ to node $j$. This distance matrix is then clustered using ordinary hierarchical clustering algorithms. The approach is based on the observation that a random walker has a shorter travel time between two nodes if many (short) alternative paths exist.
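
As an illustration of such a travel-time matrix, the sketch below computes the mean first-passage time of a simple random walk between all node pairs. It is only a sketch under assumptions: the exact distance used in Refs. [44-46] may differ in detail, and the toy graph, the fixed-point iteration, the sweep count, and all names are chosen here for illustration.

```python
# Toy graph (assumption): triangles {0,1,2} and {3,4,5} joined by the edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
n = 6

neighbors = [[] for _ in range(n)]
for a, b in edges:
    neighbors[a].append(b)
    neighbors[b].append(a)


def mean_first_passage_times(target, sweeps=2000):
    """Expected number of steps to first reach `target` from every start node.

    Solves h_i = 1 + (1/k_i) * sum of h_l over neighbors l of i, with
    h_target = 0, by repeated substitution (hypothetical helper).
    """
    h = [0.0] * n
    for _ in range(sweeps):
        h = [
            0.0 if i == target
            else 1.0 + sum(h[l] for l in neighbors[i]) / len(neighbors[i])
            for i in range(n)
        ]
    return h


# distance[i][j]: average number of steps to travel from node i to node j.
columns = [mean_first_passage_times(j) for j in range(n)]
distance = [[columns[j][i] for j in range(n)] for i in range(n)]

for row in distance:
    print(" ".join(f"{x:6.2f}" for x in row))
```

Pairs inside the same triangle have shorter travel times than pairs separated by the bridge, which is the property the clustering step relies on. Note that this matrix is not symmetric in general; how it is symmetrized or compared before clustering is a design choice not fixed by the text.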

Another spectral approach has been taken by Muñoz and Donetti [47]. They work with the Laplacian matrix of the network, which is defined as
$$
L_{i j}=k_{i} \delta_{i j}-A_{i j} .
$$
Otherwise, the method proposed is similar to Ref. [41]. Plotting the nontrivial eigenvectors against each other gives a low-dimensional representation of a distance measure of the network on top of which a conventional clustering procedure then needs to operate.
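
A small sketch of the Laplacian construction and of one non-trivial eigenvector follows. It is not the algorithm of Ref. [47]: the choice of the eigenvector belonging to the smallest non-zero eigenvalue, the shifted power iteration used to obtain it without a linear-algebra library, the toy graph, and all names are assumptions made here.

```python
# Toy graph (assumption): triangles {0,1,2} and {3,4,5} joined by the edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
n = 6

# Laplacian L_ij = k_i * delta_ij - A_ij
L = [[0.0] * n for _ in range(n)]
for a, b in edges:
    L[a][b] -= 1.0
    L[b][a] -= 1.0
    L[a][a] += 1.0
    L[b][b] += 1.0

# Twice the largest degree is an upper bound on the eigenvalues of L.
shift = 2.0 * max(L[i][i] for i in range(n))


def remove_mean(x):
    """Project out the trivial constant eigenvector (eigenvalue 0)."""
    mean = sum(x) / len(x)
    return [v - mean for v in x]


def normalize(x):
    norm = sum(v * v for v in x) ** 0.5
    return [v / norm for v in x]


# Power iteration on (shift * I - L), kept orthogonal to the constant vector.
# It converges to the eigenvector of the smallest non-trivial eigenvalue of L.
x = normalize(remove_mean([float(i) for i in range(n)]))
for _ in range(200):
    y = [shift * x[i] - sum(L[i][j] * x[j] for j in range(n)) for i in range(n)]
    x = normalize(remove_mean(y))

group_a = [i for i in range(n) if x[i] < 0]
group_b = [i for i in range(n) if x[i] >= 0]
print(group_a, group_b)
```

For this graph the signs of the eigenvector components already separate the two triangles. In the general case, several eigenvectors would be used as coordinates and a conventional clustering procedure would operate on them, as described above.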

Though these methods are able to recover known community structures with good accuracy, they suffer from being less intuitive. The communities found can only be interpreted with respect to the particular system under study, be it a diffusive system or the eigenvectors of the Laplacian matrix. A further problem is that there is no local variant of these methods, i.e., there is no way to find the community around a given node using spectral methods.

Complex networks | Optimization Based

A different approach, reminiscent of the parametric clustering procedures known in computer science, is to search for partitions with maximum modularity $Q$ using combinatorial optimization techniques [48]. This approach has been adopted by Guimerà et al. [2, 49] and Massen et al. [50] using simulated annealing [51], and by Duch and Arenas using extremal optimization [52].
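
To make the idea concrete, the sketch below evaluates the modularity of a partition and searches for a good partition with a basic simulated annealing loop. The formula used is the standard Newman-Girvan form $Q=\sum_{c}\left[e_{c} / M-\left(d_{c} / 2 M\right)^{2}\right]$, where $e_{c}$ is the number of edges inside group $c$, $d_{c}$ the sum of degrees in $c$, and $M$ the total number of edges; the section cites this quantity without spelling it out, so its use here is an assumption. The move set, cooling schedule, parameter values, and all names are likewise illustrative and are not those of Refs. [2, 49, 50].

```python
import math
import random


def modularity(edges, labels):
    """Q = sum over groups c of e_c / M - (d_c / (2 M))**2."""
    m = len(edges)
    inside = {}      # number of edges with both ends in the same group
    degree_sum = {}  # sum of node degrees per group
    for a, b in edges:
        degree_sum[labels[a]] = degree_sum.get(labels[a], 0) + 1
        degree_sum[labels[b]] = degree_sum.get(labels[b], 0) + 1
        if labels[a] == labels[b]:
            inside[labels[a]] = inside.get(labels[a], 0) + 1
    return sum(inside.get(c, 0) / m - (d / (2 * m)) ** 2
               for c, d in degree_sum.items())


def anneal(edges, n, steps=5000, t_start=0.1, t_end=1e-4, seed=0):
    """Single-node moves with Metropolis acceptance and geometric cooling."""
    rng = random.Random(seed)
    labels = list(range(n))  # start with every node in its own group
    q = modularity(edges, labels)
    best_q, best = q, labels[:]
    for step in range(steps):
        temperature = t_start * (t_end / t_start) ** (step / steps)
        node, new = rng.randrange(n), rng.randrange(n)
        old = labels[node]
        if new == old:
            continue
        labels[node] = new
        q_new = modularity(edges, labels)
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if q_new >= q or rng.random() < math.exp((q_new - q) / temperature):
            q = q_new
            if q > best_q:
                best_q, best = q, labels[:]
        else:
            labels[node] = old  # reject the move
    return best_q, best


# Toy graph (assumption): triangles {0,1,2} and {3,4,5} joined by the edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
best_q, best = anneal(edges, 6)
print(best_q, best)
```

Recomputing $Q$ from scratch after every move keeps the sketch short; a practical implementation would update $Q$ incrementally from the change caused by moving a single node.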

Though this approach will be the preferred one for the remainder of this book, a number of issues remain. For the hierarchical algorithms, a community had to be understood as whatever the algorithm outputs. Now it is not the algorithm that defines what a community is, but the quality function, i.e., the modularity $Q$ in this case. Also, the modularity $Q$ as defined by Newman [23] is parameter-free, and an understanding of hierarchical and overlapping structures still needs to be developed.

Complex networks | Conclusion

Block structure in networks is a very common and well-studied phenomenon. The concepts of structural and regular equivalence, as well as the types of blocks defined for generalized block modeling, are well defined but appear too rigid to be of practical use for large and noisy data sets. Diagonal block models, or modular structures, have received particular attention in the literature and have developed into an almost independent concept of cohesive subgroups or communities. The comparison of many different community definitions from various fields has shown that the concept of a module or community in a network is only vaguely defined. The diversity of published algorithms is a direct consequence of this vague definition. None of the algorithms could be called "ideal" in the sense that it combines computational efficiency, accuracy, flexibility and adaptability with regard to the network, and easy interpretation of the results. More importantly, none of the publications cited above allows one to estimate to what degree the community structure found is a reality of the network or a product of the clustering process itself. The following chapters address these issues and present a framework in which community detection is again viewed as a special case of a general procedure for detecting block structure in networks.
