The service went down twice, so I checked the logs and found the following error:
Cause:
datab 15:46:59.302 [SimpleAsyncTaskExecutor-7] ERROR o.s.a.i.SimpleAsyncUncaughtExceptionHandler - Unexpected error occurred invoking async method 'public void com.chinadatab.job.ScheduledTasks.lucene_enterprise()'.
java.lang.StackOverflowError: null
    at sun.nio.fs.WindowsNativeDispatcher.CreateFile0(Native Method)
    at sun.nio.fs.WindowsNativeDispatcher.CreateFile(Unknown Source)
    at sun.nio.fs.WindowsChannelFactory.open(Unknown Source)
    at sun.nio.fs.WindowsChannelFactory.newFileChannel(Unknown Source)
    at sun.nio.fs.WindowsFileSystemProvider.newByteChannel(Unknown Source)
    at java.nio.file.Files.newByteChannel(Unknown Source)
    at java.nio.file.Files.createFile(Unknown Source)
    at org.apache.lucene.store.NativeFSLockFactory.obtainFSLock(NativeFSLockFactory.java:98)
    at org.apache.lucene.store.FSLockFactory.obtainLock(FSLockFactory.java:41)
    at org.apache.lucene.store.BaseDirectory.obtainLock(BaseDirectory.java:45)
    at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:727)
    at com.chinadatab.lucene.LuceneManage._IndexWriter(LuceneManage.java:934)
    at com.chinadatab.lucene.LuceneManage._IndexWriter(LuceneManage.java:940)
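The log shows which method failed. Note that the error is a stack overflow (StackOverflowError), not a heap OutOfMemoryError. I traced it to the corresponding code: a scheduled task that runs every 30 minutes, fetches 10000 rows per query, and, after processing those 10000 rows, builds the index in bulk through IndexWriter, as follows: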
/**
 * Fills in the base enterprise library.
 */
@Async
@Scheduled(initialDelay = 1000 * 60 * 30, fixedDelay = 1000 * 60 * 30)
public void lucene_enterprise() {
    String tableName = "company";
    LuceneManage luceneManage = LuceneManage.getInstance();
    Page page = luceneManage.get(0, 1, true, tableName);
    List<Map<String, Object>> lists = page.getList();
    // Read the id of the record returned by the index query
    long id = 0;
    for (Map<String, Object> map : lists) {
        id = Long.parseLong(map.get("id").toString());
    }
    if (id > 0) {
        TngouDBHelp tngouDBHelp = new TngouDBHelp();
        List<Enterprise> list;
        try {
            // 10000 rows are fetched here, which triggered the error
            list = mapper.serach(id);
            if (list == null || list.isEmpty()) {
                return;
            }
            List<Fields> ls = new ArrayList<>();
            // A large number of objects is created here
            for (Enterprise e : list) {
                Fields fields = new Fields();
                fields.add(new Field("id", e.getId() + "", Type.Key));
                fields.add(new Field("person", e.getOper_name(), Type.Text));
                fields.add(new Field("name", e.getName(), Type.Text));
                fields.add(new Field("address", e.getAddress(), Type.Text));
                ls.add(fields);
            }
            tngouDBHelp.insert(tableName, ls);
        } catch (Exception e1) {
            e1.printStackTrace();
        }
    }
}
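Analysis:
The stack is thread-private memory: every thread gets its own stack when it is created. A StackOverflowError is thrown when a thread's chain of method calls grows deeper than its stack can hold. (Setting the stack size too large while creating many threads is a different problem: the JVM may then fail to allocate memory for new threads.)
The stack size therefore also determines how deep method calls can go. If the stack is too small, the reachable call depth is small; for example, fewer levels of recursion are possible.
-Xss sets the per-thread stack size, e.g. -Xss128k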
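The relationship between stack size and call depth can be observed directly. The following is a minimal sketch, not taken from the original service: the class name StackDepthDemo and the two stack sizes are assumptions chosen for illustration. It uses the Thread constructor that accepts a stack size, which the JVM treats as a hint and may round up or ignore on some platforms, so exact depths will vary.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical demo class; not part of the original project.
public class StackDepthDemo {

    // Recurses without a base case, counting frames until the stack is exhausted.
    private static void recurse(AtomicLong depth) {
        depth.incrementAndGet();
        recurse(depth);
    }

    // Runs the recursion on a new thread created with the requested stack size
    // and returns the depth reached before StackOverflowError was thrown.
    public static long maxDepth(long stackSizeBytes) throws InterruptedException {
        AtomicLong depth = new AtomicLong();
        Thread t = new Thread(null, () -> {
            try {
                recurse(depth);
            } catch (StackOverflowError expected) {
                // The stack has unwound to this frame; depth holds the count.
            }
        }, "stack-depth-probe", stackSizeBytes);
        t.start();
        t.join();
        return depth.get();
    }

    public static void main(String[] args) throws InterruptedException {
        // Assumed sizes for illustration: 512 KB versus 16 MB.
        System.out.println("512 KB stack: depth " + maxDepth(512L * 1024));
        System.out.println("16 MB stack:  depth " + maxDepth(16L * 1024 * 1024));
    }
}
```

On a JVM that honors the requested size, the thread with the larger stack should reach a noticeably greater depth, which is the same effect that raising -Xss has on every thread in the process.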
Solution:
Adjust the JVM parameters to increase the stack size, and at the same time reduce the amount of data processed per run to 5000 rows.
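The "smaller batches" half of the fix could be sketched as follows. This is only an illustration under assumptions: BatchDemo and partition are hypothetical names that do not exist in the original project, and the 12000 integer rows stand in for the Enterprise records returned by the query.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; not part of the original project.
public class BatchDemo {

    // Splits items into consecutive lists of at most batchSize elements.
    public static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int from = 0; from < items.size(); from += batchSize) {
            int to = Math.min(from + batchSize, items.size());
            // Copy the view so each batch is independent of the source list.
            batches.add(new ArrayList<>(items.subList(from, to)));
        }
        return batches;
    }

    public static void main(String[] args) {
        // Stand-in data: 12000 rows instead of real Enterprise records.
        List<Integer> rows = new ArrayList<>();
        for (int i = 0; i < 12000; i++) {
            rows.add(i);
        }
        for (List<Integer> batch : partition(rows, 5000)) {
            // In the scheduled task, one batch would be converted to Fields and inserted here.
            System.out.println("processing batch of " + batch.size());
        }
    }
}
```

Processing one batch at a time keeps the number of live Fields objects bounded by the batch size instead of by the full query result.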