  • Getting the content of the current page from the IE browser

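          There are several ways to read the content of the page currently open in Internet Explorer; this post covers one of them. The basic idea: when the user clicks on an IE page, read the mouse coordinates, use them to find the window handle of the page under the cursor, and then pass that handle to Win32 APIs to retrieve the page content. The code: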

        private void timer1_Tick(object sender, EventArgs e)
        {
            lock (currentLock)
            {
                System.Drawing.Point MousePoint = System.Windows.Forms.Form.MousePosition;
                if (_leftClick)
                {
                    // A click was detected: stop polling and resolve the page under the cursor.
                    timer1.Stop();
                    _leftClick = false;

                    _lastDocument = GetHTMLDocumentFormHwnd(GetPointControl(MousePoint, false));
                    if (_lastDocument != null)
                    {
                        // _getDocument is presumably set to true elsewhere before the click.
                        // The original re-assigned it to true inside this branch, which had no effect.
                        if (_getDocument)
                        {
                            try
                            {
                                // Read everything we need from the live document.
                                string url = _lastDocument.url;
                                string html = _lastDocument.documentElement.outerHTML;
                                string cookie = _lastDocument.cookie;
                                string domain = _lastDocument.domain;

                                var resolveParams = new ResolveParam
                                    {
                                        Url = new Uri(url),
                                        Html = html,
                                        PageCookie = cookie,
                                        Domain = domain
                                    };

                                RequetResove(resolveParams);
                            }
                            catch (Exception ex)
                            {
                                System.Windows.MessageBox.Show(ex.Message);
                                Console.WriteLine(ex.Message);
                                Console.WriteLine(ex.StackTrace);
                            }
                        }
                    }
                    else
                    {
                        new MessageTip().Show("xx", "The current page is not an Internet Explorer page, or the browser does not use the IE engine (for example Firefox or Sogou). Please open the page in Internet Explorer.");
                    }

                    _getDocument = false;
                }
                else
                {
                    // No click yet: keep the small indicator form next to the cursor.
                    _pointFrm.Left = MousePoint.X + 10;
                    _pointFrm.Top = MousePoint.Y + 10;
                }
            }
        }
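
    The handler reads a _leftClick flag, but the post does not show where it gets set. One possible way, sketched here purely as an assumption, is to poll the left mouse button with the Win32 function GetAsyncKeyState. The MouseState class and its method names are hypothetical and are not part of the original code.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical helper, not from the original post.
public static class MouseState
{
    private const int VK_LBUTTON = 0x01; // virtual-key code of the left mouse button

    [DllImport("user32.dll")]
    private static extern short GetAsyncKeyState(int vKey);

    // Pure bit test: the most significant bit of the state is set while the key is down.
    public static bool IsPressed(short state)
    {
        return (state & 0x8000) != 0;
    }

    // True while the left mouse button is held down (Windows only).
    public static bool IsLeftButtonDown()
    {
        return IsPressed(GetAsyncKeyState(VK_LBUTTON));
    }
}
```

    Under this assumption, timer1_Tick could start with something like: if (MouseState.IsLeftButtonDown()) _leftClick = true;. A global mouse hook would be another option; the post does not say which one the author used.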

    Breaking down the call GetHTMLDocumentFormHwnd(GetPointControl(MousePoint, false)) in timer1_Tick: first, get the handle of the page window from the mouse coordinates:

        public static IntPtr GetPointControl(System.Drawing.Point p, bool allControl)
        {
            // Window that contains the given screen point.
            IntPtr handle = Win32APIsFull.WindowFromPoint(p);
            if (handle != IntPtr.Zero)
            {
                // Only rect.X and rect.Y (the window's top-left corner on screen) are used below.
                System.Drawing.Rectangle rect = default(System.Drawing.Rectangle);
                if (Win32APIsFull.GetWindowRect(handle, out rect))
                {
                    // Make the point relative to that window, then look for a child window at it.
                    return Win32APIsFull.ChildWindowFromPointEx(handle, new System.Drawing.Point(p.X - rect.X, p.Y - rect.Y), allControl ? Win32APIsFull.CWP.ALL : Win32APIsFull.CWP.SKIPINVISIBLE);
                }
            }
            return IntPtr.Zero;
        }
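
    GetPointControl depends on a Win32APIsFull wrapper class that the post does not include. The following is a minimal sketch of what the three declarations and the CWP flags might look like. The class layout is an assumption; the function names and flag values come from user32.dll and winuser.h.

```csharp
using System;
using System.Runtime.InteropServices;

// Assumed shape of the wrapper class; the original declarations are not shown in the post.
public static class Win32APIsFull
{
    // Flags for ChildWindowFromPointEx (values from winuser.h).
    [Flags]
    public enum CWP : uint
    {
        ALL = 0x0000,
        SKIPINVISIBLE = 0x0001,
        SKIPDISABLED = 0x0002,
        SKIPTRANSPARENT = 0x0004
    }

    // Takes screen coordinates and returns the window that contains them.
    [DllImport("user32.dll")]
    public static extern IntPtr WindowFromPoint(System.Drawing.Point point);

    // The native RECT is (left, top, right, bottom). Receiving it in a Rectangle means
    // X and Y hold left and top, while Width and Height actually hold right and bottom.
    [DllImport("user32.dll", SetLastError = true)]
    [return: MarshalAs(UnmanagedType.Bool)]
    public static extern bool GetWindowRect(IntPtr hWnd, out System.Drawing.Rectangle rect);

    // The point is expected in coordinates relative to hWndParent.
    [DllImport("user32.dll")]
    public static extern IntPtr ChildWindowFromPointEx(IntPtr hWndParent, System.Drawing.Point point, CWP flags);
}
```

    Declaring a dedicated RECT struct with Left, Top, Right and Bottom fields would be the cleaner choice; Rectangle is kept here only because the call in GetPointControl passes one.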

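    Next, get the page content from the handle:

        public static HTMLDocument GetHTMLDocumentFormHwnd(IntPtr hwnd)
        {
            // Pointer-sized buffer for the LRESULT, so it also works in a 64-bit process.
            IntPtr result = Marshal.AllocHGlobal(IntPtr.Size);
            Object obj = null;
            try
            {
                // AllocHGlobal does not zero memory; clear it so a failed send reads back as 0.
                Marshal.WriteIntPtr(result, IntPtr.Zero);

                // fuFlags 2 = SMTO_ABORTIFHUNG; wait at most 1000 ms for the reply.
                Console.WriteLine(Win32APIsFull.SendMessageTimeoutA(hwnd, HTML_GETOBJECT_mid, UIntPtr.Zero, IntPtr.Zero, 2, 1000, result));
                IntPtr lResult = Marshal.ReadIntPtr(result);
                if (lResult != IntPtr.Zero)
                {
                    // Prints the HRESULT; 0 (S_OK) means obj now holds the document.
                    Console.WriteLine(Win32APIsFull.ObjectFromLresult(lResult, ref IID_IHTMLDocument, UIntPtr.Zero, out obj));
                }
            }
            finally
            {
                Marshal.FreeHGlobal(result);
            }
            return obj as HTMLDocument;
        }

    Roughly how it works:

    A message is sent to the IE window, and the reply is an LRESULT value that refers to an object inside the (unmanaged) IE browser. ObjectFromLresult then turns that value into an HTMLDocument object.

    The method relies on two Win32 functions:

          [System.Runtime.InteropServices.DllImportAttribute("user32.dll", EntryPoint = "SendMessageTimeoutA")]
            public static extern IntPtr SendMessageTimeoutA(
                [InAttribute()] System.IntPtr hWnd,
                uint Msg, UIntPtr wParam, IntPtr lParam,
                uint fuFlags,
                uint uTimeout,
                System.IntPtr lpdwResult);
          [System.Runtime.InteropServices.DllImportAttribute("oleacc.dll", EntryPoint = "ObjectFromLresult")]
            public static extern int ObjectFromLresult(
             IntPtr lResult,
             ref Guid riid,
             UIntPtr wParam,
             [MarshalAs(UnmanagedType.IDispatch), Out]
            out Object pObject
            );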
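
    The listing for GetHTMLDocumentFormHwnd uses HTML_GETOBJECT_mid and IID_IHTMLDocument without defining them. A minimal sketch of how they are commonly obtained, assuming the usual WM_HTML_GETOBJECT registered message; the HtmlObjectConstants class name is hypothetical:

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical holder class; the original post does not show where these values live.
public static class HtmlObjectConstants
{
    [DllImport("user32.dll", CharSet = CharSet.Unicode, SetLastError = true)]
    private static extern uint RegisterWindowMessage(string lpString);

    // Registered message ID that asks an IE page window for its document object.
    // RegisterWindowMessage returns the same ID every time for the same string.
    public static uint HTML_GETOBJECT_mid
    {
        get { return RegisterWindowMessage("WM_HTML_GETOBJECT"); }
    }

    // Interface ID of IHTMLDocument. Not readonly, because it is passed by ref.
    public static Guid IID_IHTMLDocument = new Guid("626FC520-A41E-11CF-A731-00A0C9082637");
}
```

    The message is only answered by the window that hosts the page (window class "Internet Explorer_Server"). For any other window the LRESULT stays 0 and GetHTMLDocumentFormHwnd returns null, which is the case the "not an IE page" message in timer1_Tick handles.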
  • Original article: https://www.cnblogs.com/wangqiang3311/p/8028164.html