  • IPv4 and IPv6 Address Databases

    Common address databases

    I looked into IP address databases; the ones in common use today are:

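    • 纯真 (CZ88) database: completely free but not very accurate; the installer can be downloaded from www.cz88.net/soft/setup.zip.
    • IPIP database: arguably the best IP address database produced in China, though the free edition is only barely adequate.
    • GeoIP: the free edition has mediocre city-level accuracy inside China, while the paid edition is more accurate; its data is distinctive in that it also provides latitude and longitude.
    • Ip2Location: I tried it and it works well, but place names are in pinyin or English, so you have to post-process the data yourself if you want Chinese characters.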

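    About IPv6

    For why IPv6 is needed at all, see 协议森林04 地址耗尽危机 (IPv4与IPv6地址), an article on IPv4 address exhaustion.

    An IPv6 address is 128 bits long, four times the length of an IPv4 address, so IPv4's dotted-decimal format no longer fits and hexadecimal is used instead. IPv6 addresses can be written in three ways:

    1. Colon-separated hexadecimal: the format is X:X:X:X:X:X:X:X, where each X is 16 bits of the address written in hexadecimal, for example ABCD:EF01:2345:6789:ABCD:EF01:2345:6789. Leading zeros in each group may be omitted, for example: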

      2001:0DB8:0000:0023:0008:0800:200C:417A→ 2001:DB8:0:23:8:800:200C:417A

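    2. Zero compression: an IPv6 address may contain a long run of zero groups, and one contiguous run can be compressed to "::". To keep the address unambiguous, "::" may appear only once, for example: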

      FF01:0:0:0:0:0:0:1101 → FF01::1101

      0:0:0:0:0:0:0:1 → ::1

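    3. Embedded IPv4: to let IPv4 and IPv6 interoperate, an IPv4 address can be embedded in an IPv6 address. The address is then usually written as X:X:X:X:X:X:d.d.d.d, where the first 96 bits use colon-separated hexadecimal and the last 32 bits use IPv4 dotted decimal; for example, 192.168.0.1 becomes ::FFFF:192.168.0.1. Zero compression still applies to the first 96 bits.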
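
The three notations can be checked with Python's standard `ipaddress` module. The snippet below is a small sketch that is not part of the original post; it only reuses the example addresses from the list above.

```python
import ipaddress

# 1. Leading zeros are optional; a single zero group is not collapsed to "::".
a = ipaddress.IPv6Address("2001:0DB8:0000:0023:0008:0800:200C:417A")
print(a.compressed)         # 2001:db8:0:23:8:800:200c:417a

# 2. A run of zero groups collapses to "::"; exploded restores every group.
b = ipaddress.IPv6Address("FF01:0:0:0:0:0:0:1101")
loopback = ipaddress.IPv6Address("0:0:0:0:0:0:0:1")
print(b.compressed)         # ff01::1101
print(loopback.compressed)  # ::1
print(loopback.exploded)    # 0000:0000:0000:0000:0000:0000:0000:0001

# "::" may appear only once; a second one is rejected as ambiguous.
try:
    ipaddress.IPv6Address("FF01::1::1101")
    double_colon_ok = True
except ipaddress.AddressValueError:
    double_colon_ok = False

# 3. The embedded-IPv4 form parses directly; ipv4_mapped recovers the IPv4 part.
c = ipaddress.IPv6Address("::FFFF:192.168.0.1")
print(c.ipv4_mapped)        # 192.168.0.1
```

Python prints hexadecimal digits in lowercase, so the output differs in case from the examples above but denotes the same addresses.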

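    Further reading:

    For an IPv6 address database, I first tried the free data from Ip2Location, whose accuracy is mediocre, and eventually found IPDB from ZX, whose coverage is largely sufficient for everyday study and research.

    Data analysis

    纯真 database (IPv4)

    The 纯真 installer ships with an extraction tool that converts qqwry.dat to a txt file. The converted data looks like this (location strings are in Chinese, as stored in the database):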

    0.0.0.0         0.255.255.255   IANA 保留地址
    1.0.0.0         1.0.0.0         美国 亚太互联网络信息中心(CloudFlare节点)
    1.0.0.1         1.0.0.1         美国 APNIC&CloudFlare公共DNS服务器
    1.0.0.2         1.0.0.255       美国 亚太互联网络信息中心(CloudFlare节点)
    1.0.1.0         1.0.3.255       福建省 电信
    1.0.4.0         1.0.7.255       澳大利亚 墨尔本Goldenit有限公司
    

    The first column is the start IP, the second is the end IP, the third is the region, and the fourth is the carrier (ISP) information.
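
As an illustration of that column layout, the sketch below splits one exported line into its four fields. The `Record` type and `parse_line` function are hypothetical names introduced here, and the sketch assumes the columns are separated by runs of whitespace, as in the sample above.

```python
from typing import NamedTuple


class Record(NamedTuple):
    start: str    # start IP (column 1)
    end: str      # end IP (column 2)
    region: str   # region (column 3)
    carrier: str  # carrier / ISP info (column 4, may be empty)


def parse_line(line: str) -> Record:
    # Split on whitespace at most three times so the fourth column keeps
    # any spaces it contains.
    parts = line.split(None, 3)
    if len(parts) < 3:
        raise ValueError("expected at least start, end and region: %r" % line)
    carrier = parts[3].strip() if len(parts) > 3 else ""
    return Record(parts[0], parts[1], parts[2], carrier)


rec = parse_line("1.0.1.0         1.0.3.255       福建省 电信")
print(rec.region, rec.carrier)
```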

    IPDB (IPv6)

    ZX's IPDB takes more work: no extraction tool is provided, so the data format has to be worked out by hand. I found Rhilip's project on GitHub and modified it:

    #!/usr/bin/python3
    # -*- coding: utf-8 -*-
    # Copyright (c) 2017-2020 Rhilip <rhilipruan@gmail.com>
    import re
    import os
    
    base_dir = os.path.dirname(__file__)  # named base_dir to avoid shadowing the built-in dir()
    
    v4db_path = os.path.join(os.path.dirname(__file__), 'db/qqwry.dat')
    v6db_path = os.path.join(os.path.dirname(__file__), 'db/ipv6wry.db')
    
    v6ptn = re.compile(r'^[0-9a-f:.]{3,51}$')
    v4ptn = re.compile(r'.*((25[0-5]|2[0-4]\d|[0-1]?\d\d?)\.){3}(25[0-5]|2[0-4]\d|[0-1]?\d\d?)$')
    
    
    def parseIpv4(ip):
        sep = ip.rfind(':')
        if sep >= 0:
            ip = ip[sep + 1:]
        if v4ptn.match(ip) is None:
            return -1
        v4 = 0
        for sub in ip.split('.'):
            v4 = v4 * 0x100 + int(sub)
        return v4
    
    
    def parseIpv6(ip):
        if v6ptn.match(ip) is None:
            return -1
        count = ip.count(':')
        if count >= 8 or count < 2:
            return -1
        ip = ip.replace('::', '::::::::'[0:8 - count + 1], 1)
        if ip.count(':') < 6:
            return -1
        v6 = 0
        for sub in ip.split(':')[0:4]:
            if len(sub) > 4:
                return -1
            if len(sub) == 0:
                v6 = v6 * 0x10000
            else:
                v6 = v6 * 0x10000 + int(sub, 16)
        return v6
    
    
    def parseIp(ip):
        ip = ip.strip()
        ip = ip.replace('*', '0')
        v4 = parseIpv4(ip)
        v6 = parseIpv6(ip)
        v2002 = v6 >> (3 * 16)
        if v2002 == 0x2002:
            v4 = (v6 >> 16) & 0xffffffff
        v2001 = v6 >> (2 * 16)
        if v2001 == 0x20010000:
            v4 = ~int(''.join(ip.split(':')[-2:]), 16)
            v4 = int(bin(((1 << 32) - 1) & v4)[2:], 2)
        return v4, v6
    
    
    class IpDb(object):
        except_raw = 0x19
        osLen = ipLen = dLen = dbAddr = size = None
    
        def __init__(self, db_path):
            with open(db_path, 'rb') as f:
                db = f.read()
                self.db = db
    
            if db[0:4] != 'IPDB'.encode():
                self.type = 4
                self._init_v4db()
            else:
                self.type = 6
                self._init_v6db()
    
        def _init_v4db(self):
            self.osLen = 3
            self.ipLen = 4
            self.dLen = self.osLen + self.ipLen
            self.dbAddr = int.from_bytes(self.db[0:4], byteorder='little')
            endAddr = int.from_bytes(self.db[4:8], byteorder='little')
            self.size = (endAddr - self.dbAddr) // self.dLen
    
        def _init_v6db(self):
            self.osLen = self.db[6]  # 3
            self.ipLen = self.db[7]  # 8
            self.dLen = self.osLen + self.ipLen
            self.dbAddr = int.from_bytes(self.db[0x10: 0x18], byteorder='little')  # 50434
            self.size = int.from_bytes(self.db[8:0x10], byteorder='little')  # 140045
            
    
        def getSize(self):
            return self.size
    
        def getData(self, index):
            self.checkIndex(index)
            addr = self.dbAddr + index * self.dLen
            ip = int.from_bytes(self.db[addr: addr + self.ipLen], byteorder='little')
            return ip
    
        def checkIndex(self, index):
            if index < 0 or index >= self.getSize():
                raise IndexError('record index out of range: %d' % index)
    
        def getLoc(self, index):
            self.checkIndex(index)
            addr = self.dbAddr + index * self.dLen
            # ip = int.from_bytes(self.db[addr: addr + self.ipLen],
            # byteorder='little')
            lAddr = int.from_bytes(self.db[addr + self.ipLen: addr + self.dLen], byteorder='little')
            # print('ip_addr: %d ip: %d lAddr:%d' % (addr, ip, lAddr))
            if self.type == 4:
                lAddr += 4
            loc = self.readLoc(lAddr, True)
            if self.type == 4:
                loc = loc.decode('cp936')
                loc = loc.replace('CZ88.NET', '')
            if self.type == 6:
                loc = loc.decode('utf-8')
            return loc
    
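        def readRawText(self, start):
            # Read the NUL-terminated byte string that begins at `start`.
            # Always return bytes so callers can safely concatenate and decode.
            if self.type == 4 and start == self.except_raw:
                return b''
            end = self.db.index(b'\x00', start)
            return self.db[start:end]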
    
        def readLoc(self, start, isTwoPart=False):
            jType = self.db[start]
            if jType == 1 or jType == 2:
                start += 1
                offAddr = int.from_bytes(self.db[start:start + self.osLen], byteorder='little')
                if offAddr == 0:
                    return b'Unknown address'  # bytes, because getLoc() decodes the result
                loc = self.readLoc(offAddr, True if jType == 1 else False)
                nAddr = start + self.osLen
            else:
                loc = self.readRawText(start)
                nAddr = start + len(loc) + 1
            if isTwoPart and jType != 1:
                partTwo = self.readLoc(nAddr)
                if loc and partTwo:
                    loc += b' ' + partTwo
            return loc
    
        def searchIp(self, val):
            index = self.binarySearch(val)
            if index < 0:
                return "Unknown address"
            if index > self.getSize() - 2:
                index = self.getSize() - 2
            return self.getLoc(index)
    
        def binarySearch(self, key, lo=0, hi=None):
            if hi is None:
                hi = self.getSize() - 1
            while lo <= hi:
                if hi - lo <= 1:
                    if self.getData(lo) > key:
                        return -1
                    elif self.getData(hi) <= key:
                        return hi
                    else:
                        return lo
                mid = (lo + hi) // 2
                data = self.getData(mid)
                if data is not None and data > key:
                    hi = mid - 1
                elif data is not None and data < key:
                    lo = mid
                else:
                    return mid
            return -1
    
    
    class IpQuery(object):
        def __init__(self):
            self.v6db = IpDb(v6db_path)
            self.v4db = IpDb(v4db_path)
    
        def searchIp(self, ip):
            ret = ''
            err = None
            try:
                v4, v6 = parseIp(ip)
                # print('v4: %d v6: %d' % (v4, v6))
                if v6 >= 0:
                    ret += self.v6db.searchIp(v6)
                if v4 >= 0:
                    if ret != '':
                        ret += ' > '
                    ret += self.v4db.searchIp(v4)
            except Exception as e:
                err = "Internal server error"
            return {
                "ip": ip,
                "loc": ret if ret else None,
                "stats": err or ("Can't Format IP address." if ret == '' else "Success")
            }
    
    
    if __name__ == '__main__':
        # ipquery = IpQuery()
        # ip = '2001:250:230::'
        # ip = '42.156.139.1'
        # ip = '182.117.109.0'
        # ip = '114.242.248.*'
        # ip = None
        # result = ipquery.searchIp(ip)
        v6db = IpDb(v6db_path)
        i = 0
        fs = open('ipv6.csv','w',encoding="utf-8")
        while(i < v6db.size - 1):
            fs.write(str(v6db.getData(i)) + "," + str(v6db.getData(i + 1) - 1) +
            "," + v6db.getLoc(i) + "\n")
            i+=1
        fs.close()
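
The header fields that `_init_v6db` reads can also be inspected on their own. The sketch below uses only the offsets from the script above; `parse_ipdb_header` is a hypothetical helper, the sample header is synthetic (built from the values noted in the script's comments), and bytes 4 and 5 are zero-filled placeholders because the script never reads them.

```python
def parse_ipdb_header(db: bytes) -> dict:
    """Extract the header fields used by _init_v6db from raw IPDB bytes."""
    if db[0:4] != b"IPDB":
        raise ValueError("missing IPDB magic")
    return {
        "offset_len": db[6],                                     # bytes per location offset
        "ip_len": db[7],                                         # bytes per stored IP prefix
        "record_count": int.from_bytes(db[8:0x10], "little"),    # number of index records
        "index_addr": int.from_bytes(db[0x10:0x18], "little"),   # start of the index area
    }


# Synthetic header for illustration only; bytes 4-5 are placeholders.
sample = (
    b"IPDB"
    + bytes([0, 0, 3, 8])
    + (140045).to_bytes(8, "little")
    + (50434).to_bytes(8, "little")
)
header = parse_ipdb_header(sample)
print(header)
```

Each index record is then `ip_len + offset_len` bytes long, which is what the script stores in `dLen`.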
    

    The exported data looks like this:

    0,28428538856079359,IANA保留地址
    28428538856079360,28428538856079360,IANA特殊地址 包含v4地址的v6地址
    28428538856079361,28428538856144895,IANA保留地址
    28428538856144896,28428538856210431,IANA特殊地址 包含v4地址的v6地址
    28428538856210432,72057594037927935,IANA保留地址
    72057594037927936,72057594037927936,IANA特殊地址 仅用于丢弃的地址
    72057594037927937,2306124484190404607,IANA保留地址
    

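    The first and second fields are the start and end of the range, each computed from the first four groups (the high 64 bits) of the IPv6 address; the third field is the location.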
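
To make the numeric fields easier to read, the sketch below converts between an IPv6 address and that 64-bit index value using the standard `ipaddress` module. The function names are hypothetical and are not part of the original script.

```python
import ipaddress


def ipv6_to_index(ip: str) -> int:
    """Return the high 64 bits (the first four groups) of an IPv6 address."""
    return int(ipaddress.IPv6Address(ip)) >> 64


def index_to_prefix(index: int) -> str:
    """Turn an exported index value back into a readable /64 prefix."""
    return str(ipaddress.IPv6Address(index << 64)) + "/64"


print(ipv6_to_index("64:ff9b::"))          # 28428538856079360
print(index_to_prefix(72057594037927936))  # 100::/64
```

The two printed values correspond to the start fields of the second and sixth rows in the export above.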

    Lookup code

    To speed up loading and keep the code consistent, the IPv4 database is converted to the same format as the IPv6 one. The conversion code is:

    #!/usr/bin/env python
    # -*- coding: utf-8 -*-
    import ipaddress
    
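    # ip.txt is the txt export of qqwry.dat; it is opened with the platform
    # default encoding, as in the original script.
    with open('ip.txt', 'r') as fr, open('ipv4.txt', 'w', encoding='utf-8') as fw:
        for line in fr:
            # Drop the CZ88.NET placeholder, then split on spaces and discard empty fields.
            larr = [s for s in line.replace('CZ88.NET', '').strip('\n').split(' ') if s != '']
            if len(larr) < 3:
                continue  # skip blank or incomplete lines
            start = int(ipaddress.IPv4Address(larr[0]))
            end = int(ipaddress.IPv4Address(larr[1]))
            # Region plus any remaining fields, concatenated without separators.
            address = ''.join(larr[2:])
            print(start, end, address)
            fw.write(str(start) + ',' + str(end) + ',' + address + '\n')
    print('over')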
    

    Lookups use binary search:

    using System;
    using System.Collections.Generic;
    using System.IO;
    using System.Linq;
    using System.Net;
    using System.Numerics;
    using System.Text;
    
    namespace Trail.Common
    {
        /// <summary>
        /// IP address database utility.
        /// </summary>
        public class IPLocationTool
        {
            private const string IPv4Path = "ipv4.txt";
            // Note: the export script above writes ipv6.csv; rename that file or adjust this path.
            private const string IPv6Path = "ipv6.txt";
            private const string UnKnowIP = "Unknown address";
            private static IPv4LocInfo[] _IPv4Infos = null;
            private static IPv6LocInfo[] _IPv6Infos = null;
    
            /// <summary>
            /// Loads the address database data.
            /// </summary>
            public static void Load()
            {
                //IPv4
                using (var sr = new StreamReader(IPv4Path, Encoding.UTF8))
                {
                    string line;
                    var ipv4LocInfos = new List<IPv4LocInfo>();
                    while (!string.IsNullOrEmpty(line = sr.ReadLine()))
                    {
                        var lineArr = line.Split(new string[] { "," }, StringSplitOptions.RemoveEmptyEntries);
                        IPv4LocInfo ipv4LocInfo = new IPv4LocInfo()
                        {
                            Start = Convert.ToUInt32(lineArr[0]),
                            End = Convert.ToUInt32(lineArr[1]),
                            Address = lineArr[2]
                        };
                        ipv4LocInfos.Add(ipv4LocInfo);
                    }
                    _IPv4Infos = ipv4LocInfos.ToArray();
                }
    
                using (var sr = new StreamReader(IPv6Path, Encoding.UTF8))
                {
                    string line;
                    var ipv6LocInfos = new List<IPv6LocInfo>();
                    while (!string.IsNullOrEmpty(line = sr.ReadLine()))
                    {
                        var lineArr = line.Split(new string[] { "," }, StringSplitOptions.RemoveEmptyEntries);
                        IPv6LocInfo ipv6LocInfo = new IPv6LocInfo()
                        {
                            Start = BigInteger.Parse(lineArr[0]),
                            End = BigInteger.Parse(lineArr[1]),
                            Address = lineArr[2]
                        };
                        ipv6LocInfos.Add(ipv6LocInfo);
                    }
                    _IPv6Infos = ipv6LocInfos.ToArray();
                }
            }
    
            /// <summary>
            /// Looks up an IP address by binary search.
            /// </summary>
            /// <param name="ip">The IP address.</param>
            /// <returns>The location.</returns>
            public static string BinSearch(string ip)
            {
                //IPv6
                if (ip.Contains(":"))
                {
                    var ipNum = IPv6ToIndex(ip);
                    int high = _IPv6Infos.Length - 1; // last valid index; Length itself would overrun the array
                    for (int low = 0; low <= high;)
                    {
                        var point_index = (high + low) / 2;
                        if (ipNum < _IPv6Infos[point_index].Start)
                        {
                            high = point_index - 1;
                            continue;
                        }
                        else if (ipNum > _IPv6Infos[point_index].End)
                        {
                            low = point_index + 1;
                            continue;
                        }
                        return _IPv6Infos[point_index].Address;
                    }
                }
                //IPv4
                else
                {
                    // Convert to a number
                    var ipNum = IPv4ToNumber(ip);
                    int high = _IPv4Infos.Length - 1; // last valid index; Length itself would overrun the array
                    for (int low = 0; low <= high;)
                    {
                        var point_index = (high + low) / 2;
                        if (ipNum < _IPv4Infos[point_index].Start)
                        {
                            high = point_index - 1;
                            continue;
                        }
                        else if (ipNum > _IPv4Infos[point_index].End)
                        {
                            low = point_index + 1;
                            continue;
                        }
                        return _IPv4Infos[point_index].Address;
                    }
                }
                return UnKnowIP;
            }
    
            /// <summary>
            /// Converts an IPv4 address to a number.
            /// </summary>
            /// <param name="ip">The IPv4 address.</param>
            /// <returns>The numeric value.</returns>
            public static long IPv4ToNumber(string ip)
            {
                var ipArr = ip.Split(new char[] { '.' });
                return long.Parse(ipArr[0]) * 16777216 + long.Parse(ipArr[1]) * 65536 + long.Parse(ipArr[2]) * 256 + long.Parse(ipArr[3]);
            }
    
            /// <summary>
            /// Converts an IPv6 address to a number.
            /// </summary>
            /// <param name="ip">The IPv6 address.</param>
            /// <returns>The numeric value.</returns>
            private static BigInteger IPv6ToNumber(string ip)
            {
                IPAddress address;
                BigInteger ipnum;
                if (IPAddress.TryParse(ip, out address))
                {
                    byte[] addrBytes = address.GetAddressBytes();
    
                    if (BitConverter.IsLittleEndian)
                    {
                        List<byte> byteList = new List<byte>(addrBytes);
                        byteList.Reverse();
                        addrBytes = byteList.ToArray();
                    }
    
                    if (addrBytes.Length > 8)
                    {
                        //IPv6
                        ipnum = BitConverter.ToUInt64(addrBytes, 8);
                        ipnum <<= 64;
                        ipnum += BitConverter.ToUInt64(addrBytes, 0);
                    }
                    else
                    {
                        //IPv4
                        ipnum = BitConverter.ToUInt32(addrBytes, 0);
                    }
                    return ipnum;
                }
                return 0;
            }
    
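            /// <summary>
            /// Converts an IPv6 address to its index value (the database indexes IPv6 ranges by the first four groups).
            /// </summary>
            /// <param name="ip">The IPv6 address.</param>
            /// <returns>The numeric value.</returns>
            private static BigInteger IPv6ToIndex(string ip)
            {
                // Expand "::" into the colons it stands for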
                int count = ip.ToCharArray().Count(p => p.Equals(':'));
                ip = ip.Replace("::", ":::::::".Substring(0, 8 - count + 1));
                if (ip.ToCharArray().Count(p => p.Equals(':')) < 6)
                    return -1;
                BigInteger v6 = 0;
                var ipArr = ip.Split(new string[] { ":" }, StringSplitOptions.None);
    
                for (int i = 0; i < 4; i++)
                {
                    if (string.IsNullOrEmpty(ipArr[i]))
                        v6 = v6 * 0x10000;
                    else
                    {
                        v6 = v6 * 0x10000 + Int64.Parse(ipArr[i], System.Globalization.NumberStyles.HexNumber);
                    }
                }
                return v6;
            }
        }
    
        /// <summary>
        /// IPv4 address information.
        /// </summary>
        public class IPv4LocInfo
        {
            /// <summary>
            /// Range start.
            /// </summary>
            public uint Start { get; set; }
    
            /// <summary>
            /// Range end.
            /// </summary>
            public uint End { get; set; }
    
            /// <summary>
            /// Location.
            /// </summary>
            public string Address { get; set; }
        }
    
        /// <summary>
        /// IPv6 address information.
        /// </summary>
        public class IPv6LocInfo
        {
            /// <summary>
            /// Range start.
            /// </summary>
            public BigInteger Start { get; set; }
    
            /// <summary>
            /// Range end.
            /// </summary>
            public BigInteger End { get; set; }
    
            /// <summary>
            /// Location.
            /// </summary>
            public string Address { get; set; }
        }
    }
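
For comparison, the same range lookup can be sketched in Python with the standard `bisect` module. This is an illustrative sketch, not the author's implementation: `ROWS`, `STARTS` and `lookup` are hypothetical names, and the rows are copied from the first lines of the IPv6 export shown earlier.

```python
import bisect

# (start, end, location) rows, sorted by start, as in the exported file.
ROWS = [
    (0, 28428538856079359, "IANA保留地址"),
    (28428538856079360, 28428538856079360, "IANA特殊地址 包含v4地址的v6地址"),
    (28428538856079361, 28428538856144895, "IANA保留地址"),
]
STARTS = [row[0] for row in ROWS]


def lookup(value: int, default: str = "unknown") -> str:
    # Find the last row whose start is <= value, then confirm that the
    # value also falls at or below that row's end.
    i = bisect.bisect_right(STARTS, value) - 1
    if i < 0:
        return default
    start, end, location = ROWS[i]
    return location if value <= end else default


print(lookup(28428538856079360))
```

Keeping the start values in a separate sorted list lets `bisect_right` do the search, and the explicit end check returns the default for values that fall in a gap or past the last range.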
    

    Test code:

        /// <summary>
        /// Test entry point.
        /// </summary>
        /// <param name="args">Arguments.</param>
        static void Main(string[] args)
        {
            try
            {
                IPLocationTool.Load();
                var beginTime = DateTime.Now;
                // IPv4 tests (expected results are the Chinese location strings stored in the database)
                Console.WriteLine(IPLocationTool.BinSearch("61.152.197.155"));  //上海市网友
                Console.WriteLine(IPLocationTool.BinSearch("211.143.205.140")); //福建省漳州市
                Console.WriteLine(IPLocationTool.BinSearch("218.57.116.146")); //山东省青岛市
                Console.WriteLine(IPLocationTool.BinSearch("121.35.180.254")); //广东省深圳市
                Console.WriteLine(IPLocationTool.BinSearch("112.13.166.125")); //浙江省丽水市
                Console.WriteLine(IPLocationTool.BinSearch("61.181.236.137")); //天津市宝坻区
                Console.WriteLine(IPLocationTool.BinSearch("1.65.212.143")); //香港
                // IPv6 tests
                Console.WriteLine(IPLocationTool.BinSearch("2409:8a00::"));  //中国北京市东城区
                Console.WriteLine(IPLocationTool.BinSearch("2408:8410:47ff:ffff:1155:658:1254:632"));  //中国天津市红桥区
                Console.WriteLine(IPLocationTool.BinSearch("2409:8a0c:1200::"));  //中国山西省太原市娄烦县
                Console.WriteLine(IPLocationTool.BinSearch("2409:8a15:9400::"));  //中国辽宁省辽阳市灯塔市
                Console.WriteLine("Done, elapsed {0} ms", (DateTime.Now - beginTime).TotalMilliseconds);
            }
            catch (Exception ex)
            {
                Console.WriteLine(ex.Message + "\n" + ex.StackTrace);
            }
    
            Console.WriteLine("Over");
            Console.Read();
        }
    
  • Original post: https://www.cnblogs.com/krockey/p/10983437.html