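[Basic Sorting and Searching Algorithms in Python] Depth-First Search, Breadth-First Search, Topological Sort, Strong Connectivity & Kosaraju's Algorithm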

Graph Search and Connectivity

  Generic Graph Search

Goals 1. find everything findable

2. don't explore anything twice

Generic Algorithm (given graph G, vertex s)

--- initially s is explored (all other vertices unexplored)

--- while possible:

      --- choose an edge (u, v) with u explored and v unexplored (if none exists, halt)

      --- mark v explored

1. Breadth-First Search (BFS) O(m+n) time using a queue

--- explore nodes in 'layers'

--- can compute shortest paths

--- can compute connected components of an undirected graph

The basics: pseudocode

BFS(Graph G, start vertex s)
    (all nodes initially unexplored)
    mark s as explored
    let Q = queue data structure (FIFO), initialized with s
    while Q is not empty:
        remove the first node of Q, call it v
        for each edge (v, w):
            if w unexplored
                mark w as explored
                add w to Q (at the end)
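The BFS pseudocode above can be sketched in Python roughly as follows. This is a minimal illustration, not the author's code: the function name `bfs`, the dictionary-of-lists graph representation, and the small example graph are all assumptions made for the example.

```python
from collections import deque

def bfs(graph, s):
    """Return the vertices reachable from s, in the order BFS explores them.

    graph is assumed to be a dict mapping a vertex to a list of neighbours.
    """
    explored = {s}            # mark s as explored
    order = []
    queue = deque([s])        # FIFO queue initialized with s
    while queue:
        v = queue.popleft()   # remove the first node of the queue
        order.append(v)
        for w in graph.get(v, []):
            if w not in explored:
                explored.add(w)
                queue.append(w)   # add at the end
    return order

# Hypothetical undirected example graph (each edge listed in both directions).
example_graph = {
    's': ['a', 'b'],
    'a': ['s', 'c'],
    'b': ['s', 'c', 'd'],
    'c': ['a', 'b', 'd', 'e'],
    'd': ['b', 'c', 'e'],
    'e': ['c', 'd'],
}
print(bfs(example_graph, 's'))
```

Vertices come out layer by layer: first s, then its neighbours, then their unexplored neighbours, and so on.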

Shortest Paths:

Goal: compute dist(v), the fewest # of edges on a path from s to v

Extra code:

initialize dist(v) = 0 if v == s, +infinity otherwise

when considering edge (v, w):

if w unexplored then set dist(w) = dist(v) + 1

claim: at termination, dist(v) = i <=> v is in the ith layer
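A small sketch of the extra bookkeeping for shortest paths, under the same assumed dictionary-of-lists representation. The name `bfs_distances` is hypothetical; here the `dist` dictionary doubles as the explored set, so a vertex missing from it corresponds to distance +infinity.

```python
from collections import deque

def bfs_distances(graph, s):
    """Return a dict of the fewest number of edges from s to each reachable vertex."""
    dist = {s: 0}                 # dist(s) = 0; absent vertices are "infinitely far"
    queue = deque([s])
    while queue:
        v = queue.popleft()
        for w in graph.get(v, []):
            if w not in dist:     # w unexplored
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist

# Hypothetical undirected example graph.
layers_graph = {
    's': ['a', 'b'],
    'a': ['s', 'c'],
    'b': ['s', 'c', 'd'],
    'c': ['a', 'b', 'd', 'e'],
    'd': ['b', 'c', 'e'],
    'e': ['c', 'd'],
    'z': [],                      # unreachable from s
}
print(bfs_distances(layers_graph, 's'))
```

The unreachable vertex `z` never receives an entry, which matches the convention that its distance stays at +infinity.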

Undirected Connectivity

let G = (V, E) be an undirected graph

Connected components == the 'pieces' of G

Goal: compute all connected components (why? to check whether a network is disconnected, for graph visualization, clustering, similarity)

all nodes unexplored
(assume labelled 1 to n)
for i = 1 to n
    if i not yet explored
        BFS(G, i)    // discovers precisely i's connected component
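The loop above might look like this in Python. It is only a sketch: `connected_components` is a hypothetical name, and the graph is assumed to be an undirected adjacency dictionary in which every vertex appears as a key.

```python
from collections import deque

def connected_components(graph):
    """Return a list of components, each a list of vertices, using repeated BFS."""
    explored = set()
    components = []
    for start in graph:                 # "for i = 1 to n"
        if start in explored:
            continue
        explored.add(start)
        component = []
        queue = deque([start])
        while queue:                    # BFS from start finds exactly its component
            v = queue.popleft()
            component.append(v)
            for w in graph[v]:
                if w not in explored:
                    explored.add(w)
                    queue.append(w)
        components.append(component)
    return components

# Hypothetical example with three pieces.
pieces_graph = {1: [3, 5], 3: [1, 5], 5: [1, 3], 2: [4], 4: [2], 6: []}
print(connected_components(pieces_graph))
```

Each vertex is enqueued once and each edge is examined a constant number of times, which is where the O(m+n) bound comes from.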

2. Depth-First Search (DFS) O(m+n) time using a stack

--- explore aggressively like a maze, backtrack only when necessary

--- compute topological ordering of a directed acyclic graph (DAG)

--- compute strongly connected components in directed graphs



pseudocode:

use a stack instead of a queue

recursive version:

DFS(Graph G, start vertex s)
    mark s as explored
    for every edge (s, v)
        if v unexplored
            DFS(G, v)
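Both variants mentioned above, the recursive one and the one with an explicit stack, can be sketched as follows. The function names and the example graph are assumptions for illustration; the graph is again a dictionary mapping a vertex to its list of successors.

```python
def dfs_recursive(graph, s, explored=None):
    """Recursive DFS; returns the set of vertices reachable from s."""
    if explored is None:
        explored = set()
    explored.add(s)                      # mark s as explored
    for v in graph.get(s, []):
        if v not in explored:
            dfs_recursive(graph, v, explored)
    return explored

def dfs_iterative(graph, s):
    """Same search with an explicit LIFO stack instead of the call stack."""
    explored = set()
    stack = [s]
    while stack:
        v = stack.pop()
        if v in explored:                # a vertex may be pushed more than once
            continue
        explored.add(v)
        for w in graph.get(v, []):
            if w not in explored:
                stack.append(w)
    return explored

# Hypothetical directed example: 'x' points into the graph but is not reachable from 's'.
maze_graph = {'s': ['a', 'b'], 'a': ['c'], 'b': ['c'], 'c': [], 'x': ['s']}
print(sorted(dfs_recursive(maze_graph, 's')))
print(sorted(dfs_iterative(maze_graph, 's')))
```

The two functions find the same set of vertices, though they may visit them in a different order. The explicit stack avoids Python's recursion limit on deep graphs.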



Application: Topological Sort (DAG)

Definition: A topological ordering of a directed graph G is a labelling f of G's nodes such that:

1. the f(v)'s are the set {1, 2, ..., n}

2. (u, v) is an edge of G => f(u) < f(v)

note that if G has a directed cycle => no topological ordering

Straightforward solution to Topological Sort

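note: every directed acyclic graph has a sink vertex (a node with out-degree 0, i.e. no successors)

To compute topological ordering:

let v be a sink vertex of G

set f(v) = n

recurse on G - {v}

Equivalently, working from the front of the ordering with source vertices (in-degree 0, i.e. no predecessors):

(1) pick a vertex of the directed graph that has no predecessors; it receives the next smallest label

(2) delete that vertex from the graph, together with all edges leaving it

(3) repeat the two steps above until the graph has no vertex without predecessors left (in a DAG, this means the graph is empty)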
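The repeated-removal idea can be sketched without actually deleting anything, by tracking in-degrees and handing out labels 1, 2, ... to vertices as they lose their last predecessor. This is a sketch under assumptions: the name `topological_order_by_sources` is hypothetical, and every vertex is assumed to appear as a key of the adjacency dictionary.

```python
from collections import deque

def topological_order_by_sources(graph):
    """Return a dict f with f[u] < f[v] for every edge (u, v) of a DAG."""
    in_degree = {v: 0 for v in graph}
    for u in graph:
        for v in graph[u]:
            in_degree[v] += 1
    # Vertices with no predecessors can be labelled right away.
    sources = deque(v for v in graph if in_degree[v] == 0)
    f = {}
    label = 1
    while sources:
        u = sources.popleft()
        f[u] = label
        label += 1
        for v in graph[u]:               # "delete" the edges leaving u
            in_degree[v] -= 1
            if in_degree[v] == 0:
                sources.append(v)
    if len(f) != len(graph):             # leftover vertices lie on a directed cycle
        raise ValueError("graph has a directed cycle, so no topological ordering exists")
    return f

# Hypothetical four-node DAG.
dag = {'s': ['v', 'w'], 'v': ['t'], 'w': ['t'], 't': []}
print(topological_order_by_sources(dag))
```

If some vertices never reach in-degree 0, the graph contains a directed cycle, which is why the function raises instead of returning a partial labelling.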

Topological Sort via DFS

DFS(G, s)
    mark s explored
    for every edge (s, v)
        if v not yet explored
            DFS(G, v)
    set f(s) = current_label
    current_label--

DFS-loop(Graph G)
    mark all nodes unexplored
    current_label = n
    for each vertex v:
        if v unexplored
            DFS(G, v)
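A direct Python rendering of DFS and DFS-loop for topological sorting might look like the sketch below. The name `topological_sort_dfs` is hypothetical, the input is assumed to be a DAG, and every vertex is assumed to appear as a key of the adjacency dictionary.

```python
def topological_sort_dfs(graph):
    """Return a dict f of labels in 1..n with f[u] < f[v] for every edge (u, v)."""
    explored = set()
    f = {}
    current_label = len(graph)           # labels are handed out from n down to 1

    def dfs(s):
        nonlocal current_label
        explored.add(s)
        for v in graph[s]:
            if v not in explored:
                dfs(v)
        f[s] = current_label             # s is finished: all its successors are labelled
        current_label -= 1

    for v in graph:                      # DFS-loop
        if v not in explored:
            dfs(v)
    return f

# Hypothetical four-node DAG.
dag = {'s': ['v', 'w'], 'v': ['t'], 'w': ['t'], 't': []}
print(topological_sort_dfs(dag))
```

A vertex gets its label only after every vertex reachable from it has been labelled, so its label is smaller than those of all its successors.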

3. Computing Strong Components: The Algorithm

Strongly connected Components

Formal Definition: the strongly connected components (SCCs) of a directed graph G are the equivalence classes of the relation:

u ~ v <=> there is a path u -> v and a path v -> u in G

Kosaraju's Two-Pass Algorithm: 2 * DFS = O(m+n)

1. let Gr = G with all arcs reversed

2. run DFS-loop on Gr (goal: compute a 'magical ordering' of the nodes)

      let f(v) = 'finishing time' of each v

3. run DFS-loop on G (goal: discover the SCCs one by one)

      processing nodes in decreasing order of finishing times

      SCCs = nodes with the same 'leader'



pseudocode:

DFS(G, i)
    mark i as explored
    set leader(i) = node s
    for each arc (i, j):
        if j not yet explored:
            DFS(G, j)
    t++
    set f(i) = t    // i's finishing time

DFS-loop(Graph G)
    global variable t = 0       // # of nodes processed so far (for finishing times in 1st pass)
    global variable s = Null    // current source vertex (for leaders in 2nd pass)
    assume nodes labelled 1 to n
    for i = n down to 1
        if i not yet explored
            s = i
            DFS(G, i)

Python Code:
```
import sys
import threading
import copy

threading.stack_size(67108864)
sys.setrecursionlimit(300000)

def DFS(edges, i, index):
    global t, vertices, new_vertices, s, compare
    if index == 1:   # 1st pass
        vertices[i-1][1] = True   # mark it explored
    if index == 2:    # 2nd pass
        vertices[compare[i]-1][1] = True
        vertices[compare[i]-1].append(s)   # set leader(i) = node s
    if i in edges:
        for v in edges[i]:
            if index == 1:
                if vertices[v-1][1] == False:
                    DFS(edges, vertices[v-1][0], index)
            if index == 2:
                if vertices[compare[v]-1][1] == False:
                    DFS(edges, vertices[compare[v]-1][0], index)
                    
    if index == 1:
        t = t + 1    # i's finishing time
        vertices[i-1].append(t)
        temp = vertices[i-1].copy()
        temp[1] = False
        new_vertices.append(temp)
        compare[vertices[i-1][0]] = t
        
def DFS_loop(edges, index):
    global t, vertices, new_vertices, s
    t = 0  #for finishing times in 1st pass
    n = len(vertices)
    for i in range(1, n+1):
        v = vertices[n-i]
        if v[1] == False:
            s = v[0]
            DFS(edges, v[0], index)

def main():       
    global vertices, new_vertices, compare
    with open('SCC.txt') as f:   # one "tail head" pair of node numbers per line
        _f = list(f)
    vertices = list()    #[number, False]  false indicates unexplored
    new_vertices = list() #[number, False, t, s]
    edges = dict()       # {1:[2,5,6...]...}
    edges_rev = dict()   # {2:[8,9,5...]...}
    compare = dict()
    for i in range(0, 875714):  #875714  initialize V
        vertices.append([i+1, False])
    for edge in _f:   # initialize E
        temp = edge.split()
        edge_temp = [int(temp[0]), int(temp[1])]
        edge_rev_temp = [edge_temp[1], edge_temp[0]]
        if edge_temp[0] not in edges: 
            edges[edge_temp[0]] = [edge_temp[1]]
        else: 
            edges[edge_temp[0]].append(edge_temp[1])
        if edge_rev_temp[0] not in edges_rev: 
            edges_rev[edge_rev_temp[0]] = [edge_rev_temp[1]]
        else:
            edges_rev[edge_rev_temp[0]].append(edge_rev_temp[1])

    DFS_loop(edges_rev, 1)   
    vertices = copy.deepcopy(new_vertices)
    DFS_loop(edges, 2)

    result = dict()
    for item in vertices:  # nodes with the same 'leader'
        if item[3] not in result:
            result[item[3]] = 1
        else:
            result[item[3]] = result[item[3]] + 1

    r = list()   # output the sizes of the 10 largest SCCs
    for key in result:
        r.append(result[key])
    r = sorted(r, reverse = True)
    print(r[0:10])   # a slice of [0:9] would print only 9 values



if __name__ == '__main__':
    thread = threading.Thread(target = main)
    thread.start()
```
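For checking results on small inputs, the two passes can also be sketched without recursion, which avoids the enlarged thread stack and raised recursion limit used above. This is an alternative sketch, not the listing's method: the name `kosaraju_scc`, the helper `finishing_order`, and the example graph are hypothetical, and every vertex is assumed to appear as a key of the adjacency dictionary.

```python
def kosaraju_scc(graph):
    """Return a dict mapping each vertex to the 'leader' of its SCC."""
    # Step 1: build Gr, the graph with all arcs reversed.
    reversed_graph = {v: [] for v in graph}
    for u in graph:
        for v in graph[u]:
            reversed_graph[v].append(u)

    def finishing_order(g):
        """Iterative DFS-loop; returns vertices in increasing finishing time."""
        explored = set()
        order = []
        for start in g:
            if start in explored:
                continue
            explored.add(start)
            stack = [(start, iter(g[start]))]
            while stack:
                v, neighbours = stack[-1]
                for w in neighbours:          # resumes where this iterator left off
                    if w not in explored:
                        explored.add(w)
                        stack.append((w, iter(g[w])))
                        break
                else:                         # no unexplored neighbour left: v is finished
                    stack.pop()
                    order.append(v)
        return order

    # Step 2: first pass on the reversed graph computes finishing times.
    order = finishing_order(reversed_graph)

    # Step 3: second pass on G, in decreasing order of finishing time.
    leader = {}
    for s in reversed(order):
        if s in leader:
            continue
        leader[s] = s
        stack = [s]
        while stack:
            v = stack.pop()
            for w in graph[v]:
                if w not in leader:
                    leader[w] = s
                    stack.append(w)
    return leader

# Hypothetical example with SCCs {1, 2, 3}, {4, 5, 6} and {7, 8}.
scc_graph = {1: [2], 2: [3], 3: [1, 4], 4: [5], 5: [6], 6: [4], 7: [6, 8], 8: [7]}
leaders = kosaraju_scc(scc_graph)
groups = {}
for vertex, lead in leaders.items():
    groups.setdefault(lead, []).append(vertex)
print(sorted(sorted(g) for g in groups.values()))
```

Grouping vertices by leader, as in the last few lines, gives the SCCs themselves; taking the lengths of those groups gives the component sizes that the listing above prints.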
Original article: https://www.cnblogs.com/javawebsoa/p/3249401.html