Apache Ignite series (8): problem roundup


    1,java.lang.ClassNotFoundException Unknown pair

    1. Please try to turn on isStoreKeepBinary in cache settings - like this; please note the last line:

    if (persistence) {
        // Configuring Cassandra's persistence
        DataSource dataSource = new DataSource();
        // ...here go the rest of your settings as they appear now...
        configuration.setWriteBehindEnabled(true);

        configuration.setStoreKeepBinary(true);
    }
    

    This setting forces Ignite to avoid binary deserialization when working with the underlying cache store.

    2. I can reproduce it when, in loadCaches(), I put something that isn't exactly the expected Item in the cache:
    private void loadCache(IgniteCache<Integer, Item> cache, /* Ignite.binary() */ IgniteBinary binary) {
        // Note the absence of the package name here:
        BinaryObjectBuilder builder = binary.builder("Item");
        builder.setField("name", "a");
        builder.setField("brand", "B");
        builder.setField("type", "c");
        builder.setField("manufacturer", "D");
        builder.setField("description", "e");
        builder.setField("itemId", 1);
        // ... (the original snippet is truncated here; the built binary object was then put into the cache)
    }
    Reference links:

    http://apache-ignite-users.70518.x6.nabble.com/ClassNotFoundException-with-affinity-run-td5359.html

    https://stackoverflow.com/questions/44781672/apache-ignite-java-lang-classnotfoundexception-unknown-pair#

    https://stackoverflow.com/questions/47502111/apache-ignite-ignitecheckedexception-unknown-pair#

    2,java.lang.IndexOutOfBoundsException + Failed to wait for completion of partition map exchange

    Exception details:

    2018-06-06 14:24:02.932 ERROR 17364 --- [ange-worker-#42] .c.d.d.p.GridDhtPartitionsExchangeFuture : Failed to reinitialize local partitions (preloading will be stopped): 
    	...
    java.lang.IndexOutOfBoundsException: index 678
    	...	org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body(GridCachePartitionExchangeManager.java:2279) [ignite-core-2.3.0.jar:2.3.0]
    	at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110) [ignite-core-2.3.0.jar:2.3.0]
    	at java.lang.Thread.run(Thread.java:745) [na:1.8.0_73]
    
    2018-06-06 14:24:02.932  INFO 17364 --- [ange-worker-#42] .c.d.d.p.GridDhtPartitionsExchangeFuture : Finish exchange future [startVer=AffinityTopologyVersion [topVer=1, minorTopVer=1], resVer=null, err=java.lang.IndexOutOfBoundsException: index 678]
    2018-06-06 14:24:02.941 ERROR 17364 --- [ange-worker-#42] .i.p.c.GridCachePartitionExchangeManager : Failed to wait for completion of partition map exchange (preloading will not start): GridDhtPartitionsExchangeFuture 
    ...
    org.apache.ignite.IgniteCheckedException: index 678
    	at org.apache.ignite.internal.util.IgniteUtils.cast(IgniteUtils.java:7252) ~[ignite-core-2.3.0.jar:2.3.0]
    	....
    org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body(GridCachePartitionExchangeManager.java:2279) ~[ignite-core-2.3.0.jar:2.3.0]
    	... 2 common frames omitted
    

    This happens for the following reason:

    If a cache was defined in REPLICATED mode with persistence enabled, and it is later changed to PARTITIONED mode and data is loaded into it, this error is thrown on subsequent restarts.

    For example, in the following situation:

    default-config.xml

            <property name="cacheConfiguration">
                <list>
                    <bean class="org.apache.ignite.configuration.CacheConfiguration">
                      ...
                        <property name="name" value="Test"/>
                        <property name="atomicityMode" value="ATOMIC"/>
                        <property name="cacheMode" value="REPLICATED"/>
                      ...
                    </bean>
                </list>
            </property>
    
            ignite.destroyCache("Test");
            IgniteCache<Long, CommRate> cache = ignite.getOrCreateCache("Test");
    

    When the node is restarted, the configuration in default-config.xml takes effect first, which is why the problem appears.

    The fix is to avoid changing the cache mode while persistence is enabled, or to avoid pre-defining the cache in the configuration file.

    I can't reproduce your case. But the issue could occur if you had a REPLICATED cache and after some time changed it to PARTITIONED and for example call to getOrCreateCache keeping old cache name.

    Reference link:

    http://apache-ignite-users.70518.x6.nabble.com/Weird-index-out-bound-Exception-td14905.html

    3,Failed to find SQL table for type xxxx

    The imported data is wrong; destroy that cache and re-import the data, as sketched below.
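
    A minimal sketch of that recovery step (the cache name "Person" and the reloadPersons(...) helper are placeholders for whatever produced the data originally):

        // Drop the cache whose data/type mapping is broken, then recreate it and reload the data.
        ignite.destroyCache("Person");
        IgniteCache<Long, Person> cache = ignite.getOrCreateCache("Person");
        reloadPersons(cache);   // hypothetical helper: re-run the original import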

    4, Ignite messaging receives duplicate messages, multiplying with each call

    Duplicate messages that multiply with the number of calls are caused by registering the same listener more than once.

    remoteListen and localListen for a given topic should each be invoked only once; every additional invocation registers one more listener, and the resulting behaviour looks as if each message were re-delivered as many times as the method has been called.

        private AtomicBoolean rmtMsgInit = new AtomicBoolean(false);
        private AtomicBoolean localMsgInit = new AtomicBoolean(false);
        @RequestMapping("/msgTest")
        public @ResponseBody
        String orderedMsg(HttpServletRequest request, HttpServletResponse response) {
            /***************************remote message****************************/
            IgniteMessaging rmtMsg = ignite.message(ignite.cluster().forRemotes());
    
            /** Register the listener for a given topic only once; otherwise each call adds another listener and messages are received multiple times, multiplying with the number of calls. */
            if(!rmtMsgInit.get()) {
                rmtMsg.remoteListen("MyOrderdTopic", (nodeId, msg) -> {
                    System.out.println("Received ordered message [msg=" + msg +", from=" + nodeId + "]");
                    return true;
                });
                rmtMsgInit.set(true);
            }
    
            rmtMsg.send("MyOrderdTopic", UUID.randomUUID().toString());
    //        for (int i=0; i < 10; i++) {
    //            rmtMsg.sendOrdered("MyOrderdTopic", Integer.toString(i), 0);
    //            rmtMsg.send("MyOrderdTopic", Integer.toString(i));
    //        }
    
    
            /***************************local message****************************/
            IgniteMessaging localMsg = ignite.message(ignite.cluster().forLocal());
    
            /** Register the listener for a given topic only once; otherwise each call adds another listener and messages are received multiple times, multiplying with the number of calls. */
            if(!localMsgInit.get()){
                localMsg.localListen("localTopic", (nodeId, msg) -> {
                    System.out.println(String.format("Received local message [msg=%s, from=%s]", msg, nodeId));
                    return true;
                });
                localMsgInit.set(true);
            }
    
            localMsg.send("localTopic", UUID.randomUUID().toString());
    
            return "executed!";
        }
    

    5, Remote Ignite operations (remote*) print nothing to the local console

    When running operations against ignite.cluster().forRemotes(), the code may execute on other nodes, so log statements and printed output appear on those nodes; the local program's console will not necessarily show anything.
    For example:

        IgniteMessaging rmtMsg = ignite.message(ignite.cluster().forRemotes());
    
        rmtMsg.remoteListen("MyOrderdTopic", (nodeId, msg) -> {
            System.out.println("Received ordered message [msg=" + msg +", from=" + nodeId + "]");
            return true;
        });
    

    To see the effect in the local program, use the local variants:
    IgniteMessaging.localListen
    ignite.events().localListen
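
    A minimal sketch of the local variants (assumes a started Ignite instance; cache events additionally require them to be enabled via IgniteConfiguration.setIncludeEventTypes(...)):

        // Local message listener: fires in this JVM, so the output shows up in this console.
        ignite.message(ignite.cluster().forLocal()).localListen("localTopic", (nodeId, msg) -> {
            System.out.println("Received local message: " + msg);
            return true; // keep listening
        });

        // Local event listener works the same way.
        ignite.events().localListen(evt -> {
            System.out.println("Local event: " + evt.name());
            return true; // keep listening
        }, EventType.EVT_CACHE_OBJECT_PUT);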

    6, Ignite persistence takes up too much disk space

    This is governed by the WAL (write-ahead log) mechanism.

    Add the following configuration to tune the checkpoint frequency and how much WAL history is retained:

            <!-- Redefining maximum memory size for the cluster node usage. -->
            <property name="dataStorageConfiguration">
                <bean class="org.apache.ignite.configuration.DataStorageConfiguration">
    
    		...
    
                    <!--Checkpointing frequency which is a minimal interval when the dirty pages will be written to the Persistent Store.-->
                    <property name="checkpointFrequency" value="180000"/>
    
                    <!-- Number of threads for checkpointing.-->
                    <property name="checkpointThreads" value="4"/>
    
                    <!-- Number of checkpoints to be kept in WAL after checkpoint is finished.-->
                    <property name="walHistorySize" value="20"/>
    
    		...
                </bean>
            </property>
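
    The same settings can also be applied programmatically; a minimal sketch (values match the XML above, tune them for your workload):

        // Classes are in org.apache.ignite.configuration; this mirrors the XML configuration above.
        DataStorageConfiguration storageCfg = new DataStorageConfiguration();
        storageCfg.setCheckpointFrequency(180_000);  // minimal interval (ms) between writing dirty pages to disk
        storageCfg.setCheckpointThreads(4);          // threads used for checkpointing
        storageCfg.setWalHistorySize(20);            // checkpoints kept in WAL after a checkpoint completes

        IgniteConfiguration cfg = new IgniteConfiguration();
        cfg.setDataStorageConfiguration(storageCfg);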
    

    7,java.lang.ClassCastException org.cord.xxx cannot be cast to org.cord.xxx

    java.lang.ClassCastException org.cord.ignite.data.domain.Student cannot be cast to org.cord.ignite.data.domain.Student

    The exception occurs when reading a cached object from Ignite: the class is obviously the same, yet the retrieved object cannot be assigned to it:

            IgniteCache<Long, Student> cache = ignite.cache(CacheKeyConstant.STUDENT);
            Student student = cache.get(1L);
    

    Analyzing with instanceof:

    cache.get(1L) instanceof Student returns false

    So the object returned from Ignite is not an instance of Student, even though its fields look identical in the debugger. That leaves only one possibility: the Student class used by the object returned from Ignite and the Student class receiving the result were loaded by different class loaders.

    Checking both class loaders:

    cache.get(1L).getClass().getClassLoader()
    => AppClassLoader
    
    Student.class.getClassLoader()
    => RestartClassLoader
    

    Sure enough, the class loaders differ. A quick search shows that RestartClassLoader is the class loader used by the spring-boot-devtools hot-restart plugin. With the cause identified the fix is simple: remove the spring-boot-devtools dependency.

    8, Garbled Chinese characters in SqlFieldsQuery results when Ignite persistence is enabled

    In plain in-memory mode everything is fine, but with persistence enabled, SqlQuery results (deserialized into objects) are not garbled while SqlFieldsQuery results are. Since persistence writes in-memory data to disk, the file encoding is a likely suspect, so print System.getProperty("file.encoding") on every node. Doing so shows that the persistent node's encoding was gb18030; after setting file.encoding=UTF-8 and re-importing the data, the query results are no longer garbled.
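
    A minimal sketch of that per-node check (assumes a started or injected Ignite instance):

        // Print the JVM file encoding on every server node; run once and compare the outputs.
        ignite.compute(ignite.cluster().forServers()).broadcast(
                () -> System.out.println("file.encoding = " + System.getProperty("file.encoding")));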

    Setting the environment variable JAVA_TOOL_OPTIONS=-Dfile.encoding=UTF-8 on each node is enough.

    9,[Ignite2.7] java.lang.IllegalAccessError: tried to access field org.h2.util.LocalDateTimeUtils.LOCAL_DATE from class org.apache.ignite.internal.processors.query.h2.H2DatabaseType

    This is caused by an H2 compatibility issue; use a more recent H2 version (and exclude the one pulled in transitively):

          <dependency>
              <groupId>org.apache.ignite</groupId>
              <artifactId>ignite-indexing</artifactId>
              <version>${ignite.version}</version>
              <exclusions>
                  <exclusion>
                      <groupId>com.h2database</groupId>
                      <artifactId>h2</artifactId>
                  </exclusion>
              </exclusions>
          </dependency>
          <dependency>
              <groupId>com.h2database</groupId>
              <artifactId>h2</artifactId>
              <version>1.4.197</version>
          </dependency>
    

    10, Failed to serialize object...Failed to write field...Failed to marshal object with optimized marshaller - the distributed computation cannot be propagated to other nodes

    The full error message:

    o.a.i.i.m.d.GridDeploymentLocalStore     : Class locally deployed: class org.cord.ignite.controller.ComputeTestController
    2018-12-20 21:13:05.398 ERROR 16668 --- [nio-8080-exec-1] o.a.i.internal.binary.BinaryContext      : Failed to serialize object [typeName=o.a.i.i.worker.WorkersRegistry]
    org.apache.ignite.binary.BinaryObjectException: Failed to write field [name=registeredWorkers]  at org.apache.ignite.internal.binary.BinaryFieldAccessor.write(BinaryFieldAccessor.java:164) [ignite-core-2.7.0.jar:2.7.0]
        ...
    Caused by: org.apache.ignite.binary.BinaryObjectException: Failed to marshal object with optimized marshaller: {...}
    Caused by: org.apache.ignite.IgniteCheckedException: Failed to serialize object: {...}
    Caused by: java.io.IOException: Failed to serialize object [typeName=java.util.concurrent.ConcurrentHashMap]
    Caused by: java.io.IOException: java.io.IOException: Failed to serialize object 
    ...
    Caused by: java.io.IOException: Failed to serialize object [typeName=java.util.ArrayDeque]
    Caused by: java.io.IOException: java.lang.NullPointerException
    ...
    

    If the class that submits a distributed computation contains injected beans, propagation of the computation fails, for example:

    ...
    @Autowired
    private IgniteConfiguration igniteCfg;   // injected bean that cannot be serialized

    String broadcastTest() {
        IgniteCompute compute = ignite.compute();
        // The lambda references a field, so it captures `this` and drags every injected bean along with it.
        compute.broadcast(() -> System.out.println("Hello Node: " + ignite.cluster().localNode().id()));
        return "all executed.";
    }

    Such beans cannot be propagated, so apart from the injected Ignite instance it is best not to inject anything else into classes used for distributed computing; for more complex scenarios consider the service grid instead. A sketch of a safer closure follows.
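
    A minimal sketch of a closure that avoids the problem by not capturing any injected bean: inside the closure the node-local Ignite instance is resolved with Ignition.localIgnite() instead of referencing a field of the Spring bean:

    String broadcastTest() {
        ignite.compute().broadcast(() -> {
            // Nothing from the enclosing Spring bean is referenced here, so only the
            // (stateless) lambda itself has to be serialized and shipped to other nodes.
            Ignite local = Ignition.localIgnite();
            System.out.println("Hello Node: " + local.cluster().localNode().id());
        });
        return "all executed.";
    }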

    11,WARNING: Exception during batch send on streamed connection close; java.sql.BatchUpdateException: class org.apache.ignite.IgniteCheckedException: Data streamer has been closed

    This error tends to occur during batched inserts over Ignite JDBC when the stream is opened repeatedly or is not in ordered mode. Fix: open the stream once, right after the JDBC connection is created, and put it in ordered mode: SET STREAMING ON ORDERED.

    String url = "jdbc:ignite:thin://127.0.0.1/";
    String[] sqls = new String[]{ /* INSERT statements */ };
    try (Connection conn = DriverManager.getConnection(url);
         Statement statement = conn.createStatement()) {
        // Open the stream once, in ordered mode, right after the connection is created.
        statement.execute("SET STREAMING ON ORDERED");
        for (String sql : sqls) {
            statement.addBatch(sql);
        }
        statement.executeBatch();
        // Turning streaming off (or closing the connection) flushes the underlying data streamer.
        statement.execute("SET STREAMING OFF");
    }
    

    Reference links: https://issues.apache.org/jira/browse/IGNITE-10991

    http://apache-ignite-users.70518.x6.nabble.com/Data-streamer-has-been-closed-td26521.html


    12,java.lang.IllegalArgumentException: Ouch! Argument is invalid: timeout cannot be negative: -2

    If a timeout parameter is set so large that internal arithmetic overflows, this exception is thrown at startup. For example, with settings like these:

                igniteCfg.setFailureDetectionTimeout(Integer.MAX_VALUE);
                igniteCfg.setNetworkTimeout(Long.MAX_VALUE);
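
    A sketch of safer, finite values (anything well below the overflow range works):

                igniteCfg.setFailureDetectionTimeout(60_000);  // 1 minute
                igniteCfg.setNetworkTimeout(30_000);           // 30 seconds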
    

    13, How to place DDL-created tables into a cluster group

    The WITH clause has a TEMPLATE parameter. It can simply be REPLICATED or PARTITIONED, but it can also reference a CacheConfiguration instance, so a table created via DDL can be associated with a cache configuration defined in XML (for example one carrying a node filter) and thereby be placed in a cluster group. A CacheConfiguration listed in the configuration normally creates a cache of its own; appending an asterisk to the cache name turns it into a template only, so no cache is created and the DDL statement can still reference it. Example:

    	    <property name="cacheConfiguration">
                <list>
                    <bean class="org.apache.ignite.configuration.CacheConfiguration">
                        <property name="name" value="student*"/>
                        <property name="cacheMode" value="REPLICATED"/>
                        <property name="nodeFilter"> <!--配置节点过滤器-->
                            <bean class="org.cord.ignite.initial.DataNodeFilter"/>
                        </property>
                    </bean>
                </list>
            </property>  
    
    CREATE TABLE IF NOT EXISTS PUBLIC.STUDENT (
     STUDID INTEGER,
     NAME VARCHAR,
     EMAIL VARCHAR,
     dob Date,
     PRIMARY KEY (STUDID, NAME))
    WITH "template=student,atomicity=ATOMIC,cache_name=student";
    

    14, Failed to communicate with Ignite cluster

    The thin JDBC driver (IgniteJdbcThinDriver) is not thread-safe. To execute SQL queries concurrently through the thin driver, create a separate Connection for each thread, as sketched below.
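
    A minimal sketch of the per-thread pattern (URL and query are placeholders; the STUDENT table is the one created in item 13):

        // Each thread creates and uses its own Connection; thin-driver connections must not be shared.
        Runnable task = () -> {
            try (Connection conn = DriverManager.getConnection("jdbc:ignite:thin://127.0.0.1/");
                 Statement st = conn.createStatement();
                 ResultSet rs = st.executeQuery("SELECT COUNT(*) FROM STUDENT")) {
                while (rs.next()) {
                    System.out.println(Thread.currentThread().getName() + ": " + rs.getLong(1));
                }
            } catch (SQLException e) {
                e.printStackTrace();
            }
        };
        new Thread(task, "query-1").start();
        new Thread(task, "query-2").start();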

    Reference link: https://stackoverflow.com/questions/49792329/failed-to-communicate-with-ignite-cluster-while-trying-to-execute-multiple-queri


    15, Join queries in DBeaver miss some matching rows

    DBeaver connects through the thin client. If any of the joined caches is in PARTITIONED mode, the join needs distributed joins enabled, which is done by adding distributedJoins=true to the connection URL. Example:

    jdbc:ignite:thin://127.0.0.1:10800;distributedJoins=true

    16,WARN [H2TreeIndex] Indexed columns of a row cannot be fully inlined into index what may lead to slowdown due to additional data page reads, increase index inline size if needed

    How is the inlineSize of an index (e.g. the primary-key index) determined?

    The configured value is resolved roughly as follows:

    H2TreeIndex.computeInlineSize(List<InlineIndexHelper> inlineIdxs, int cfgInlineSize)
        -> int confSize = cctx.config().getSqlIndexMaxInlineSize()
        -> private int sqlIdxMaxInlineSize = DFLT_SQL_INDEX_MAX_INLINE_SIZE (= -1)
        -> falls back to IGNITE_MAX_INDEX_PAYLOAD_SIZE_DEFAULT = 10

    In other words, if no inline size is specified when the index is created, the default is 10.
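
    A minimal sketch of where an inline size can be set explicitly (values are illustrative; for DDL-created indexes the INLINE_SIZE clause of CREATE INDEX serves the same purpose):

        // Per cache: upper bound used for index columns when the index itself does not specify one.
        CacheConfiguration<Long, Student> ccfg = new CacheConfiguration<>("student");
        ccfg.setSqlIndexMaxInlineSize(64);

        // Per secondary index, via QueryEntity/QueryIndex:
        QueryIndex nameIdx = new QueryIndex("name");
        nameIdx.setInlineSize(32);   // overrides the 10-byte default for this index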

    How recommendedInlineSize is computed:
    H2Tree.inlineSizeRecomendation(SearchRow row)
    InlineIndexHelper.inlineSizeOf(Value val)
    InlineIndexHelper.InlineIndexHelper(String colName, int type, int colIdx, int sortType, CompareMode compareMode)

    Computing the inlineSize with Python (reading index metadata from an Oracle schema):

    import os
    import cx_Oracle as oracle
    os.environ["NLS_LANG"] = ".UTF8"
    db = oracle.connect('cord/123456@(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=127.0.0.1)(PORT=1520)))(CONNECT_DATA=(SID=orcl)))')
    cursor = db.cursor()
    
    query_index_name="select index_name from ALL_INDEXES where table_name='%s' and index_type='NORMAL' and uniqueness='NONUNIQUE'"
    query_index_column="select column_name from all_ind_columns where table_name='%s' and index_name='%s'"
    query_index_column_type="select data_type,data_length from all_tab_columns where table_name='%s' and column_name='%s'"
    
    def inlineSizeOf(data_type, data_length):
        if data_type == 'VARCHAR2':
            return data_length + 3
        if data_type == 'DATE':
            return 16+1
        if data_type == 'NUMBER':
            return 8+1
        return -1
    
    def computeInlineSize(tableName):
        table=tableName.upper()
        retmap = {}
        ### look up the table's normal (non-unique) index names
        ret = cursor.execute(query_index_name % table).fetchall()
        if len(ret) == 0:
            print("table[%s] not find any normal index" % table)
            return
        ### for each index, get its column names
        for indexNames in ret:
            # print(indexNames[0])
            indexName = indexNames[0]
            result = cursor.execute(query_index_column % (table, indexName)).fetchall()
            if len(result) == 0:
                print("table[%s] index[%s] not find any column" % (table, indexName))
                continue
            inlineSize=0
            ### look up each column's type and accumulate the inlineSize
            for columns in result:
                column=columns[0]
                type_ret = cursor.execute(query_index_column_type % (table, column)).fetchall()
                if len(type_ret) == 0:
                    print("table[%s] index[%s] column[%s] not find any info" % (table, indexName, column))
                    continue
                data_type = type_ret[0][0]
                data_length = type_ret[0][1]
                temp = inlineSizeOf(data_type, data_length)
                if temp == -1:
                    print("table[%s] index[%s] column[%s] type[%s] unknown" % (table, indexName, column, data_type))
                    continue
                inlineSize += temp
            retmap[indexName] = inlineSize
        print(retmap)
    
    if __name__ == '__main__':
        computeInlineSize('PERSON')
    

    17,class org.apache.ignite.spi.IgniteSpiException: Node with the same ID was found in node IDs history or existing node in topology has the same ID (fix configuration and restart local node)

    If this exception occurs with the community edition (8.7.x, pure in-memory mode), then apart from a misconfigured discovery setup, the most likely cause is that a slow startup makes the node exceed the heartbeat/failure-detection timeout; increasing the failure detection timeout resolves it:

            <!-- Failure detection timeout used by discovery and communication subsystems -->
            <property name="failureDetectionTimeout" value="60000"/>
    