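  • [Basic Sorting and Searching Algorithms in Python] Depth-First Search, Breadth-First Search, Topological Sort, Strongly Connected Components & Kosaraju's Algorithm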

    Graph Search and Connectivity


        Generic Graph Search

        Goals 1. find everything findable

                  2. don't explore anything twice

        Generic Algorithm (given graph G, vertex S)

                   --- initialize S explored (all others unexplored)

                   --- while possible:

                          --- choose an edge(u, v) with u explored and v unexplored

                          --- mark v explored


         1. Breadth-First Search (BFS)     O(m+n) time using a queue

            --- explore nodes in 'layers'

            --- can compute shortest paths

            --- can compute connected components of an undirected graph

           

            The basics:pseudocode

            BFS(Graph G, start vertex s)

                (all nodes initially unexplored)

                mark s as explored

                let Q = queue data structure(FIFO), initialized with s

                while Q is not empty:

                    remove the first node of Q, call it v

                    for each edge(v, w):

                        if w unexplored

                            mark w as explored

                            add w to Q (at the end)
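
A minimal Python sketch of the BFS pseudocode above. It assumes the graph is stored as a dict mapping each node to a list of its neighbours; the name `bfs` and the sample graph are made up for illustration.

```python
from collections import deque


def bfs(graph, s):
    """Return the nodes reachable from s in the order BFS explores them."""
    explored = {s}              # mark s as explored
    order = [s]
    queue = deque([s])          # FIFO queue, initialized with s
    while queue:                # while Q is not empty
        v = queue.popleft()     # remove the first node of Q
        for w in graph.get(v, []):      # each edge (v, w)
            if w not in explored:
                explored.add(w)
                order.append(w)
                queue.append(w)         # add w at the end of Q
    return order


graph = {1: [2, 3], 2: [4], 3: [4], 4: [5], 5: []}
print(bfs(graph, 1))  # [1, 2, 3, 4, 5]
```

Each node enters the queue at most once and each adjacency list is scanned once, which is where the O(m+n) bound comes from.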


              Shortest Paths:

              Goal: compute dist(v), the fewest # of edges on a path from s to v

              Extra code: 

                  initialize dist(v) = 0 if v == s, +infinity otherwise

                  when considering edge(v, w):

                      if w unexplored then set dist(w) = dist(v) + 1

              claim: at termination, dist(v) = i  <=>  v in ith layer
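
The extra code for shortest paths can be sketched as follows, under the same assumed dict-of-lists representation. Here `bfs_dist` is a hypothetical name, and a node that is missing from the returned dict corresponds to dist = +infinity (unreachable from s).

```python
from collections import deque


def bfs_dist(graph, s):
    """Return {v: fewest number of edges on a path from s to v}."""
    dist = {s: 0}               # being a key of dist doubles as "explored"
    queue = deque([s])
    while queue:
        v = queue.popleft()
        for w in graph.get(v, []):
            if w not in dist:           # w unexplored
                dist[w] = dist[v] + 1   # w lies one layer beyond v
                queue.append(w)
    return dist


# Undirected graph: every edge is listed in both directions.
graph = {
    's': ['a', 'b'],
    'a': ['s', 'c'],
    'b': ['s', 'c'],
    'c': ['a', 'b', 'd'],
    'd': ['c'],
}
print(bfs_dist(graph, 's'))  # {'s': 0, 'a': 1, 'b': 1, 'c': 2, 'd': 3}
```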


              Undirected Connectivity

                  let G = (V, E) be an undirected graph

                  Connected components == the 'pieces' of G

                  Goal: compute all connected components(why? check if network is disconnected, graph visualization, clustering, similarity)

                      all nodes unexplored

                      (assume labelled 1 to n)

                      for i = 1 to n

                          if i not yet explored

                              BFS(G, i)      // discovers precisely i's connected component
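
A sketch of the outer loop above in Python. It assumes an undirected graph given as a dict whose keys are all the nodes and whose values list the neighbours in both directions; `connected_components` is an illustrative name.

```python
from collections import deque


def connected_components(graph):
    """Return a list of components, each a list of nodes."""
    explored = set()
    components = []
    for i in graph:                 # for i = 1 to n
        if i in explored:
            continue
        # BFS from i discovers exactly the component containing i.
        explored.add(i)
        component = [i]
        queue = deque([i])
        while queue:
            v = queue.popleft()
            for w in graph.get(v, []):
                if w not in explored:
                    explored.add(w)
                    component.append(w)
                    queue.append(w)
        components.append(component)
    return components


graph = {1: [3, 5], 3: [1, 5], 5: [1, 3], 2: [4], 4: [2], 6: []}
print(connected_components(graph))  # [[1, 3, 5], [2, 4], [6]]
```

The `explored` set is shared across all BFS calls, so the total work stays O(m+n) no matter how many components there are.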


        2. Depth-First Search (DFS)       O(m+n) time using a stack

            --- explore aggressively like a maze, backtrack only when necessary

            --- compute topological ordering of directed acyclic graph (DAG)

            --- compute strongly connected components in directed graphs

            

            pseudocode:

            use a stack instead of a queue

            recursive version: 

                DFS(Graph G, start vertex s)

                    mark s as explored

                    for every edge(s,v)

                        if v unexplored

                           DFS(G,v)
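
Both variants mentioned above, sketched in Python for a graph stored as a dict of adjacency lists (an assumption, as are the function names). The recursive version mirrors the pseudocode; the iterative one swaps BFS's queue for an explicit stack.

```python
def dfs_recursive(graph, s, explored=None):
    """Return the set of nodes reachable from s (recursive DFS)."""
    if explored is None:
        explored = set()
    explored.add(s)                     # mark s as explored
    for v in graph.get(s, []):          # every edge (s, v)
        if v not in explored:
            dfs_recursive(graph, v, explored)
    return explored


def dfs_iterative(graph, s):
    """Same result, using an explicit LIFO stack instead of recursion."""
    explored = set()
    stack = [s]
    while stack:
        v = stack.pop()                 # take the most recently added node
        if v in explored:
            continue                    # a node may be pushed more than once
        explored.add(v)
        for w in graph.get(v, []):
            if w not in explored:
                stack.append(w)
    return explored


graph = {1: [2, 3], 2: [4], 3: [4], 4: [], 5: [1]}
print(dfs_recursive(graph, 1) == dfs_iterative(graph, 1) == {1, 2, 3, 4})  # True
```

The iterative form is worth having in Python because deep recursion on large graphs runs into the interpreter's recursion limit.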

            

            Application: Topological Sort   (DAG)

            Definition: A topological ordering of a directed graph G is a labelling f of G's nodes such that:

                           1. the f(v)'s are the set {1,2,...,n}

                           2. (u,v) in G  =>  f(u) < f(v)

            note that if G has directed cycle => no topological ordering


            Straightforward solution to Topological Sort

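            note: every directed acyclic graph has a sink vertex (a node with out-degree 0, i.e. no outgoing arcs)

            To compute topological ordering:

                let v be a sink vertex of G

                set f(v) = n

                recurse on G - {v}

            Equivalently, working from the front with source vertices (in-degree 0, i.e. no predecessors):

                (1) pick a vertex of the directed graph that has no predecessors and give it the next label (1, 2, ..., n)

                (2) delete that vertex from the graph, together with all arcs leaving it

                (3) repeat the two steps above until no vertex without predecessors remains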
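
The source-removal steps can be sketched with in-degree counters instead of physically deleting vertices; this bookkeeping is commonly known as Kahn's algorithm. The function name `topological_order` and the dict-of-successors representation are assumptions made for this example.

```python
from collections import deque


def topological_order(graph):
    """Return {node: label in 1..n} such that every arc (u, v) has f[u] < f[v].

    Raises ValueError if the graph contains a directed cycle.
    """
    nodes = set(graph)
    for successors in graph.values():
        nodes.update(successors)

    indegree = {v: 0 for v in nodes}
    for successors in graph.values():
        for w in successors:
            indegree[w] += 1

    ready = deque(v for v in nodes if indegree[v] == 0)  # no predecessors
    f = {}
    label = 1
    while ready:
        v = ready.popleft()
        f[v] = label
        label += 1
        for w in graph.get(v, []):
            indegree[w] -= 1            # "delete" the arc (v, w)
            if indegree[w] == 0:
                ready.append(w)

    if len(f) != len(nodes):
        raise ValueError("graph has a directed cycle: no topological ordering")
    return f


graph = {'s': ['v', 'w'], 'v': ['t'], 'w': ['t'], 't': []}
f = topological_order(graph)
print(all(f[u] < f[v] for u in graph for v in graph[u]))  # True
```

If some vertices never reach in-degree 0, they lie on or behind a directed cycle, which is how the sketch detects that no ordering exists.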


            Topological Sort via DFS

            DFS(G, s)

                mark s explored

                for every edge(s, v)

                    if v not yet explored

                        DFS(G, v)

                set f(s) = current_label

                current_label = current_label - 1

            DFS-loop(Graph G)

                mark all nodes unexplored

                current_label = n

                for each vertex v:

                    if v unexplored

                        DFS(G, v)
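
A Python sketch of DFS and DFS-loop for topological sorting. It assumes the input is a DAG stored as a dict in which every node appears as a key; `topological_sort_dfs` is an illustrative name, and `current_label` is kept as a closure variable rather than a global.

```python
def topological_sort_dfs(graph):
    """Return {node: f(node)} for a DAG given as {node: [successors]}."""
    explored = set()
    f = {}
    current_label = len(graph)      # labels are handed out from n down to 1

    def dfs(s):
        nonlocal current_label
        explored.add(s)
        for v in graph[s]:
            if v not in explored:
                dfs(v)
        # Everything reachable from s is already labelled with larger values.
        f[s] = current_label
        current_label -= 1

    for v in graph:                 # DFS-loop
        if v not in explored:
            dfs(v)
    return f


graph = {'s': ['v', 'w'], 'v': ['t'], 'w': ['t'], 't': []}
print(topological_sort_dfs(graph))  # {'t': 4, 'v': 3, 'w': 2, 's': 1}
```

A node receives its label only after all of its successors have theirs, so every arc (u, v) ends up with f(u) < f(v).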

             



        3. Computing Strong Components: The Algorithm

                    Strongly Connected Components

                    Formal Definition: the strongly connected components (SCCs) of a directed graph G are the equivalence classes of the relation:

                               u~v <=> there is a path u -> v and a path v -> u in G

                   

                                   

                   Kosaraju's Two-Pass Algorithm   2*DFS = O(m+n) 

                   1. let Gr = G with all arcs reversed

                   2. run DFS-loop on Gr       <---------- Goal: compute 'magical ordering' of nodes

                            let f(v) = 'finishing time' of each v 

                   3. run DFS-loop on G        <---------- Goal: discover the SCCs one-by-one 

                       processing nodes in decreasing order of finishing times

                       SCCs = nodes with the same 'leader'

                

                   pseudocode:

                   DFS(G, i)

                       mark i as explored

                       set leader(i) = node s

                       for each arc(i, j):

                           if j not yet explored:

                               DFS(G, j)

                       t++

                       set f(i) = t      // i's finishing time


                  DFS-loop(Graph G)

                      global variable t = 0    // # of nodes processed so far (for finishing times in 1st pass)

                      global variable s = Null  // current source vertex  (for leaders in 2nd pass)

                      Assume nodes labelled 1 to n

                      for i = n down to 1

                          if i not yet explored

                              s = i

                              DFS(G, i)






    Python Code:


    import sys
    import threading
    import copy
    
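    threading.stack_size(67108864)   # 64 MB stack for threads created afterwards; the recursive DFS goes very deep
    sys.setrecursionlimit(300000)    # raise Python's recursion limit for the same reason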
    
    def DFS(edges, i, index):
        global t, vertices, new_vertices, s, compare
        if index == 1:   # 1st pass
            vertices[i-1][1] = True   # mark it explored
        if index == 2:    # 2nd pass
            vertices[compare[i]-1][1] = True
            vertices[compare[i]-1].append(s)   # set leader(i) = node s
        if i in edges:
            for v in edges[i]:
                if index == 1:
                    if vertices[v-1][1] == False:
                        DFS(edges, vertices[v-1][0], index)
                if index == 2:
                    if vertices[compare[v]-1][1] == False:
                        DFS(edges, vertices[compare[v]-1][0], index)
                        
        if index == 1:
            t = t + 1    # i's finishing time
            vertices[i-1].append(t)
            temp = vertices[i-1].copy()
            temp[1] = False
            new_vertices.append(temp)
            compare[vertices[i-1][0]] = t
            
    def DFS_loop(edges, index):
        global t, vertices, new_vertices, s
        t = 0  #for finishing times in 1st pass
        n = len(vertices)
        for i in range(1, n+1):
            v = vertices[n-i]
            if v[1] == False:
                s = v[0]
                DFS(edges, v[0], index)
    
    def main():       
        global vertices, new_vertices, compare
        with open('SCC.txt') as f:   # one arc per line: "tail head"; the file is closed on exit
            _f = list(f)
        vertices = list()    #[number, False]  false indicates unexplored
        new_vertices = list() #[number, False, t, s]
        edges = dict()       # {1:[2,5,6...]...}
        edges_rev = dict()   # {2:[8,9,5...]...}
        compare = dict()
        for i in range(0, 875714):  # initialize V; the node count is hard-coded (nodes are labelled 1 to 875714)
            vertices.append([i+1, False])
        for edge in _f:   # initialize E
            temp = edge.split()
            edge_temp = [int(temp[0]), int(temp[1])]
            edge_rev_temp = [edge_temp[1], edge_temp[0]]
            if edge_temp[0] not in edges: 
                edges[edge_temp[0]] = [edge_temp[1]]
            else: 
                edges[edge_temp[0]].append(edge_temp[1])
            if edge_rev_temp[0] not in edges_rev: 
                edges_rev[edge_rev_temp[0]] = [edge_rev_temp[1]]
            else:
                edges_rev[edge_rev_temp[0]].append(edge_rev_temp[1])
    
        DFS_loop(edges_rev, 1)   
        vertices = copy.deepcopy(new_vertices)
        DFS_loop(edges, 2)
    
        result = dict()
        for item in vertices:  # nodes with the same 'leader'
            if item[3] not in result:
                result[item[3]] = 1
            else:
                result[item[3]] = result[item[3]] + 1
    
        r = list()   # output the sizes of the 10 largest SCCs
        for key in result:
            r.append(result[key])
        r = sorted(r, reverse = True)
        print(r[0:10])   # the slice end is exclusive, so 0:10 yields ten entries
    
    
    
    if __name__ == '__main__':
        thread = threading.Thread(target=main)
        thread.start()
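
For experimenting without the SCC.txt input, the two passes can also be sketched on a small in-memory arc list. This is a hedged variant and not the listing above: `kosaraju_scc` and its helper are hypothetical names, and the DFS is written iteratively so that no stack-size or recursion-limit tuning is needed.

```python
from collections import Counter, defaultdict


def kosaraju_scc(arcs):
    """arcs: iterable of (u, v) pairs. Returns {node: leader of its SCC}."""
    graph, graph_rev = defaultdict(list), defaultdict(list)
    nodes = set()
    for u, v in arcs:
        graph[u].append(v)
        graph_rev[v].append(u)          # Gr = G with all arcs reversed
        nodes.update((u, v))

    def dfs_finish_order(adj, start, explored, out):
        # Iterative DFS; appends each node to `out` when it finishes.
        explored.add(start)
        stack = [(start, iter(adj.get(start, ())))]
        while stack:
            node, neighbours = stack[-1]
            for w in neighbours:
                if w not in explored:
                    explored.add(w)
                    stack.append((w, iter(adj.get(w, ()))))
                    break
            else:                       # all arcs out of node were examined
                stack.pop()
                out.append(node)

    # Pass 1: finishing times on the reversed graph.
    explored, finish_order = set(), []
    for v in nodes:
        if v not in explored:
            dfs_finish_order(graph_rev, v, explored, finish_order)

    # Pass 2: original graph, in decreasing order of finishing time.
    explored, leader = set(), {}
    for s in reversed(finish_order):
        if s not in explored:
            members = []
            dfs_finish_order(graph, s, explored, members)
            for v in members:
                leader[v] = s           # every node found from s shares leader s
    return leader


arcs = [(1, 2), (2, 3), (3, 1), (3, 4), (4, 5), (5, 6), (6, 4), (6, 7)]
leader = kosaraju_scc(arcs)
print(sorted(Counter(leader.values()).values(), reverse=True))  # [3, 3, 1]
```

The sample graph has the SCCs {1, 2, 3}, {4, 5, 6} and {7}, so the printed sizes are 3, 3 and 1.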
       
    







  • Original article: https://www.cnblogs.com/javawebsoa/p/3249401.html