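I. Background
Recently I had a WPF project that needed to load many different web pages and extract data from them, so I used the WebBrowser control. The usual way to fetch a page with the WPF WebBrowser control is: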
1. Call the Navigate method, or set the WebBrowser's Source property.
2. Handle the LoadCompleted event; once it fires, the DOM is available.
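The two steps can be sketched in a window's code-behind roughly as follows. This is only a minimal illustration, not the project's actual code: the control name `Browser`, the `MainWindow` class, and the example URL are assumptions, and the project is assumed to reference the mshtml interop assembly.

```vb
Imports System.Diagnostics
Imports System.Windows
Imports System.Windows.Navigation
Imports mshtml

' Assumes the window's XAML declares <WebBrowser x:Name="Browser" /> (hypothetical name).
Class MainWindow
    Private Sub Window_Loaded(sender As Object, e As RoutedEventArgs) Handles Me.Loaded
        ' Step 2 is wired up first so the event cannot be missed.
        AddHandler Browser.LoadCompleted, AddressOf Browser_LoadCompleted
        ' Step 1: start the navigation (placeholder URL).
        Browser.Navigate(New Uri("https://example.com/"))
    End Sub

    Private Sub Browser_LoadCompleted(sender As Object, e As NavigationEventArgs)
        ' The DOM is available once LoadCompleted has fired.
        Dim doc = TryCast(Browser.Document, HTMLDocument)
        If doc IsNot Nothing Then
            Debug.WriteLine(doc.title)
        End If
    End Sub
End Class
```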
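Because I had many pages to inspect, I needed to load a series of URLs into the WebBrowser control one after another and read the DOM of each. At first I declared the WebBrowser control with WithEvents, but I then found that LoadCompleted fired multiple times: the more Navigate calls had been made, the more times the event fired.
For example: when the window loads and Navigate is called for the first time to go to URL1, LoadCompleted fires once. When Navigate is called a second time to go to URL2, LoadCompleted fires twice, first for URL1 and then for URL2. And so on: after many loads, the event fires many times for each navigation.
This can be avoided by attaching the LoadCompleted handler manually and detaching it once it has fired. I wrote a separate HtmlLoader class to do this; its full code follows: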
II. The HtmlLoader class
    Imports System.Threading
    Imports System.Windows
    Imports System.Windows.Controls
    Imports System.Windows.Navigation
    Imports mshtml

    Public Class HtmlLoader
        Private mWindow As Window
        Private wb As WebBrowser
        Private mLoadCompleted As Boolean

        ' URL of the page to load.
        Public Property URL As String

        Public Sub New(CurWindow As Window, webBrowser As WebBrowser)
            mWindow = CurWindow
            wb = webBrowser
        End Sub

        Public Sub New(CurWindow As Window, webBrowser As WebBrowser, URL As String)
            Me.New(CurWindow, webBrowser)
            Me.Reload(URL)
        End Sub

        ' Navigates to Me.URL, attaching the handler for this navigation only.
        Public Sub Reload()
            ' Nothing to load: leave the completion flag untouched so that
            ' GetDocument does not wait for an event that will never come.
            If String.IsNullOrWhiteSpace(Me.URL) Then Exit Sub
            Volatile.Write(mLoadCompleted, False)
            mWindow.Dispatcher.Invoke(Sub()
                                          LogNote.WriteLine(wb.Name & "-StartLoad", Me.URL)
                                          ' Detach first so repeated Reload calls never stack handlers.
                                          RemoveHandler wb.LoadCompleted, AddressOf wb_LoadCompleted
                                          AddHandler wb.LoadCompleted, AddressOf wb_LoadCompleted
                                          'wb.Source = New Uri(Me.URL)
                                          wb.Navigate(Me.URL)
                                      End Sub)
        End Sub

        Public Sub Reload(URL As String)
            Me.URL = URL
            Reload()
        End Sub

        Private Sub wb_LoadCompleted(sender As Object, e As NavigationEventArgs)
            mWindow.Dispatcher.Invoke(Sub()
                                          LogNote.WriteLine(wb.Name & "-LoadCompleted", Me.URL)
                                          ' Detach so this navigation is reported exactly once.
                                          RemoveHandler wb.LoadCompleted, AddressOf wb_LoadCompleted
                                      End Sub)
            Volatile.Write(mLoadCompleted, True)
        End Sub

        ' Blocks until the page has loaded, then returns its DOM.
        ' Call this from a background thread: waiting on the UI thread
        ' would prevent LoadCompleted from ever being raised.
        Public Function GetDocument() As HTMLDocument
            Do Until Volatile.Read(mLoadCompleted)
                Thread.Sleep(50) ' yield instead of spinning at full CPU
            Loop
            ' The WebBrowser may only be accessed on the UI thread.
            Return mWindow.Dispatcher.Invoke(Of HTMLDocument)(
                Function() DirectCast(wb.Document, HTMLDocument))
        End Function
    End Class
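III. Notes
The class takes the window that hosts the WebBrowser control, the WebBrowser control itself, and the URL to navigate to as parameters. Because the WebBrowser must be accessed on the UI thread, the calls go through the Window object's Dispatcher, which is why the Window has to be passed in.
LogNote is a static logging class I wrote myself; its code is not included here, so you can simply comment those lines out.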
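For reference, one possible way to drive HtmlLoader over several URLs is sketched below. It is an assumption about usage rather than code from the original project: the `MainWindow` class, the `Browser` control name, and the URLs are placeholders. Since GetDocument blocks until the page has loaded, the sketch calls it through Task.Run so the UI thread stays free to raise LoadCompleted.

```vb
Imports System.Diagnostics
Imports System.Threading.Tasks
Imports System.Windows
Imports mshtml

' Assumes the window's XAML declares <WebBrowser x:Name="Browser" /> (hypothetical name).
Class MainWindow
    Private Async Sub Window_Loaded(sender As Object, e As RoutedEventArgs) Handles Me.Loaded
        Dim loader As New HtmlLoader(Me, Browser)
        ' Placeholder URLs for illustration only.
        Dim urls = {"https://example.com/", "https://example.org/"}

        For Each pageUrl In urls
            ' Start the navigation; the handler is attached for this load only.
            loader.Reload(pageUrl)
            ' Wait off the UI thread until LoadCompleted has fired.
            Dim doc As HTMLDocument = Await Task.Run(Function() loader.GetDocument())
            ' Execution resumes on the UI thread here, where the DOM can be read.
            Debug.WriteLine(doc.title)
        Next
    End Sub
End Class
```

Loading the pages strictly one after another keeps a single handler attached at any time, which is the property the class relies on.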