公司项目采用SpringBoot+dubbo技术栈,虽然上了elk,但是还是不能把所有的日志串联起来。因此自己写了一套traceId和spanId的生成方案。在此记录下该方案的具体实现,方便后续查阅
一、 名词解释
traceId:链路Id
spanId:可以理解为服务Id,一各请求可能会调用多个服务,调用一个服务产生一个新的spanId
二、 方案介绍
2.1 traceId实现
- 请求都是从web打过来的,所web的filter拦截,生成一个traceId并put到MDC中
- dubbo消费者filter负责把调用链的MDC内容传递给生产者,即放到attachMent里面
- dubbo生产者filter负责把attachMent里面的内容取出并放到MDC中
2.2 spanId的实现
- 引入一个logicId
- logicId放在dubbo的消费者拦截器里面累加
- 当同一个业务调用多次生产者时,则消费者里面的logicId会自增并传递到生产者
三、 实现思路
3.1 web过滤器实现功能
doFilter前:
生成traceId;
生成 spanId;
生成一个logicId,默认值为0;
存到MDC中
doFilter后:
清除MDC相关内容
3.2 dubbo消费者
invoke前:
(1)累加logicId
(2)将traceId和spanId及logicId传递到 生产者
3.3 dubbo生产者
invoke前:
(1)将traceId放到MDC中
(2)取出spanId和logicId,用.拼接在一起生成一个新的spanId放到MDC中
(3)生成一个logicId,默认值为0 存到mdc中
invoke后:
清除MDC相关内容
说明:
其实dubbo生产者这过滤器和web过滤器是差不多的东西。他们都需要生成当前应用的logicId
四、 实现代码和效果
(1)TraceUtil
package com.cybermax.vaccine.logtrace;
import org.apache.commons.lang.StringUtils;
import org.slf4j.MDC;
import java.util.UUID;
import static com.cybermax.vaccine.logtrace.TraceConstants.*;
public class TraceUtil {
/**
* description: httpFilter的方式初始化MDC,会返回traceId
* @param traceId
* @return String
* @author ZENG.XIAO.YAN
* @time 2021-01-30 20:01
*/
public static String initMDCWithHttpFilter(String traceId) {
String val = StringUtils.defaultIfBlank(traceId, generateTraceId());
MDC.put(TRACE_ID, val);
MDC.put(SPAN_ID, "0");
MDC.put(LOGIC_ID, "0");
return val;
}
/**
* logicId +1
* @param logicId
* @return
*/
public static String incrLogicId(String logicId) {
return String.valueOf(Integer.parseInt(logicId) + 1);
}
/**
* 从 MDC 中清除当前线程的 Trace信息
*/
public static void clearMDC() {
MDC.clear();
}
/**
* 生成traceId
* @return 链路ID
*/
public static String generateTraceId() {
return UUID.randomUUID().toString();
}
/**
* 生成新的spanId
* @param spanId
* @param logicId
* @return
*/
public static String getNewSpanId(String spanId, String logicId) {
return new StringBuilder(spanId).append(".").append(logicId).toString();
}
}
(2)TraceConstants
package com.cybermax.vaccine.logtrace;
/**
* TraceConstants
*
* @author ZENG.XIAO.YAN
* @version 1.0
* @Date 2021-01-30
*/
public class TraceConstants {
/**
* 追踪id
*/
public static final String TRACE_ID = "traceId";
/**
* 分段ID
* 同一个调用链下的分段调用ID
* 对于前端收到请求,生成的spanId固定都是0
* 签名方式生成:0, 0.1, 0.1.1, 0.2
*/
public static final String SPAN_ID = "spanId";
/**
* 逻辑ID
* 同一个分段调用下的业务逻辑ID
* 标识服务内部的调用顺序
*/
public static final String LOGIC_ID = "logicId";
}
(3) LogTraceHttpFilter
package com.cybermax.vaccine.logtrace;
import lombok.extern.slf4j.Slf4j;
import javax.servlet.*;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import java.io.IOException;
@Slf4j
public class LogTraceHttpFilter implements Filter {
private static final String TRACE_ID_HEAD = "ZTraceId";
@Override
public void init(FilterConfig filterConfig) {
// TODO Auto-generated method stub
}
@Override
public final void doFilter(ServletRequest servletRequest, ServletResponse servletResponse, FilterChain filterChain) throws IOException, ServletException {
try {
// 这个traceId也可以考虑从Http请求里面取,如果是null则默认用uuid
HttpServletRequest request = (HttpServletRequest) servletRequest;
HttpServletResponse response = (HttpServletResponse) servletResponse;
String traceId = TraceUtil.initMDCWithHttpFilter(getTraceId(request));
// 响应头里面添加一个traceId,方便f12debug调试;本来是每个ajax都由前端生成traceId,由于改动成本太高,就变成响应输出traceId
response.setHeader(LogTraceHttpFilter.TRACE_ID_HEAD, traceId);
if (log.isInfoEnabled()) {
log.info("web端接收http请求[{}]----start", request.getRequestURI());
}
filterChain.doFilter(request, response);
} finally {
// 最后清除掉MDC内容
TraceUtil.clearMDC();
}
}
@Override
public void destroy() {
}
/**
* description: 由于考虑到和其他系统对接时,可能会用其他系统传递进来参数作为traceId
* 因此预留此方法方便后续Override
* @return java.lang.String
* @author ZENG.XIAO.YAN
* @time 2021-01-30 20:04
*/
public String getTraceId(ServletRequest servletRequest) {
return null;
}
}
(4)LogTraceConsumerFilter
package com.cybermax.vaccine.logtrace;
import lombok.extern.slf4j.Slf4j;
import org.apache.commons.lang3.StringUtils;
import org.apache.dubbo.common.constants.CommonConstants;
import org.apache.dubbo.common.extension.Activate;
import org.apache.dubbo.rpc.*;
import org.slf4j.MDC;
import java.util.Map;
@Slf4j
@Activate(group = CommonConstants.CONSUMER)
public class LogTraceConsumerFilter implements Filter {
@Override
public Result invoke(Invoker<?> invoker, Invocation invocation) throws RpcException {
// 消费者负责把调用链端的MDC信息传递到生产者
String traceId = MDC.get(TraceConstants.TRACE_ID);
String spanId = MDC.get(TraceConstants.SPAN_ID);
String logicId = MDC.get(TraceConstants.LOGIC_ID);
if (StringUtils.isNotBlank(traceId) && StringUtils.isNotBlank(spanId) && StringUtils.isNotBlank(logicId)) {
// 逻辑Id+1
String newLogicId = TraceUtil.incrLogicId(logicId);
MDC.put(TraceConstants.LOGIC_ID, newLogicId);
// 传递traceId spanId logicId
Map<String, String> attachments = invocation.getAttachments();
attachments.put(TraceConstants.TRACE_ID, traceId);
attachments.put(TraceConstants.SPAN_ID, spanId);
attachments.put(TraceConstants.LOGIC_ID, newLogicId);
}
return invoker.invoke(invocation);
}
}
(5)LogTraceProviderFilter
package com.cybermax.vaccine.logtrace;
import lombok.extern.slf4j.Slf4j;
import org.apache.commons.collections4.MapUtils;
import org.apache.commons.lang3.StringUtils;
import org.apache.dubbo.common.constants.CommonConstants;
import org.apache.dubbo.common.extension.Activate;
import org.apache.dubbo.rpc.*;
import org.slf4j.MDC;
import java.util.Map;
@Slf4j
@Activate(group = CommonConstants.PROVIDER)
public class LogTraceProviderFilter implements Filter {
@Override
public Result invoke(Invoker<?> invoker, Invocation invocation) throws RpcException {
try {
// 生产者负责把信息放进MDC里面
Map<String, String> attachments = invocation.getAttachments();
String traceId = MapUtils.getString(attachments, TraceConstants.TRACE_ID);
String spanId = MapUtils.getString(attachments,TraceConstants.SPAN_ID);
String logicId = MapUtils.getString(attachments,TraceConstants.LOGIC_ID);
if (StringUtils.isNotBlank(traceId) && StringUtils.isNotBlank(spanId) && StringUtils.isNotBlank(logicId)) {
// 生成新的spanId
String newSpanId = TraceUtil.getNewSpanId(spanId, logicId);
MDC.put(TraceConstants.TRACE_ID, traceId);
MDC.put(TraceConstants.SPAN_ID, newSpanId);
MDC.put(TraceConstants.LOGIC_ID, "0");
}
if (log.isInfoEnabled()) {
log.info("dubbo服务执行dubbo方法{}.{}", invoker.getInterface().getName(), invocation.getMethodName());
}
Result result = invoker.invoke(invocation);
return result;
} finally {
TraceUtil.clearMDC();
}
}
}
(6)配置dubbo的spi扩展
(7)日志中输出TraceId和SpanId
下面只要关注traceId和SpanId的取值,具体的格式自己搞定
<?xml version="1.0" encoding="UTF-8"?>
<!--
默认file appender的encoder输出json时总是有点问题,因此引入logstash-logback-encoder来改变这一现状
-->
<included>
<appender name="FILE"
class="ch.qos.logback.core.rolling.RollingFileAppender">
<encoder
class="net.logstash.logback.encoder.LoggingEventCompositeJsonEncoder">
<providers>
<!--<timestamp>-->
<!--<timeZone>UTC</timeZone>-->
<!--</timestamp>-->
<pattern>
<!--可以直接从apollo中取值-->
<pattern>
{"time":"%d{yyyy-MM-dd'T'HH:mm:ss.SSS+08:00}",
"systemName":"%property{systemName}" ,
"level": "%level",
"traceId": "%X{traceId}",
"spanId": "%X{spanId}",
"context": "%t [%c] - %m ",
"exceptionTrace":"%exception{50}"
}%n
</pattern>
</pattern>
</providers>
</encoder>
<file>${LOG_FILE}</file>
<rollingPolicy class="ch.qos.logback.core.rolling.SizeAndTimeBasedRollingPolicy">
<fileNamePattern>${LOG_FILE}.%d{yyyy-MM-dd}.%i.gz</fileNamePattern>
<maxFileSize>${LOG_FILE_MAX_SIZE:-100MB}</maxFileSize>
<maxHistory>${LOG_FILE_MAX_HISTORY:-0}</maxHistory>
</rollingPolicy>
</appender>
</included>
(8)输出的日志文件效果举例
下面为程序里面输出的一条日志,为了filebeat收集后不用自己写语法去解析日志,我配置成了json格式;这不是重点,重点是下面这个json
{
"time":"2021-09-10T17:15:53.285+08:00",
"systemName":"vaccine-service-app",
"level":"INFO",
"traceId":"c2348f26-0ca6-4772-83a3-de7b7cb1aa9d",
"spanId":"0.2",
"context":"DubboServerHandler-192.168.199.42:20880-thread-199 [com.cybermax.service.realize.hession.bacterin.BacterinFCServiceImpl] - 已命中[REDIS_DEL_allBacterinMap_bactCode]缓存... ",
"exceptionTrace":""
}
效果案例二
下面再贴一张图,这个的日志格式就不是json了,只在原来的springboot的日志格式中添加了traceId和spanId
配置的日志格式为:
${CONSOLE_LOG_PATTERN:%clr(%d{${LOG_DATEFORMAT_PATTERN:yyyy-MM-dd'T'HH:mm:ss.SSS+08:00}}){faint} %clr(${LOG_LEVEL_PATTERN:%5p}) %clr(${PID:- }){magenta} %clr([%X{traceId}] [%X{spanId}]){yellow} %clr(---){faint} %clr([%15.15t]){faint} %clr(%-40.40logger{39}){cyan} %clr(:){faint} %m%n${LOG_EXCEPTION_CONVERSION_WORD:%wEx}}
效果为:
五、总结
- 重点关注spanId的实现,traceId实现比较简单
- 这个spanId生成后就和阿里
- 这个traceId和spanId如果输出到日志时,可以直接发送到logstash,也可以生成日志文件再用其他filebeat去收集
六、参考链接
《TraceId 和 SpanId 生成规则》 https://help.aliyun.com/document_detail/151840.html?spm=a2c4g.11186623.6.1162.45997711nA76Yy
《Dubbo日志链路追踪TraceId选型》 https://blog.csdn.net/zlt2000/article/details/108820788