  • [nginx] Analysis of an async_mode_nginx 100% CPU deadlock

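    Unfortunately, I only narrowed the problem down to a fairly small scope and worked out the root cause. I did not find the boundary conditions needed to reproduce it, nor a solution.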

    Hi all, I have much the same problem with the latest software versions:
    async_nginx: 0.4.5
    openssl: 1.1.1k
    qatengine: 0.6.4
    qatdriver: 1.7.l.4.13.0.9
    
    To reproduce, use these config values in nginx.conf:
    default_algorithms CIPHERS
    qat_poll_mode heuristic
    
    I have debugged async_nginx and found an infinite loop. I think this is the cause.
    
    1. In function ngx_http_do_read_client_request_body(), nginx enters the for(;;) loop [line:288] and never breaks out,
    because recv() [line:343] always returns NGX_AGAIN and c->read->ready is always 1.
    Digging into recv(), the NGX_AGAIN is returned by ngx_ssl_handle_recv() [line:2546] because the async job is paused.
    2. When the async context is swapped, another infinite loop occurs, in function qat_chained_ciphers_do_cipher() [line:1554],
    because read() [qat_pause_job(): line:279] always returns EAGAIN.
    3. As far as I know, qat_crypto_callbackFn() is called from qat_engine_poll(). I think the loops happen because the callback qat_crypto_callbackFn() never gets any CPU time to run, so the paused async job is never woken up.
    I then checked the polling logic in async_nginx and found point 4, described below.
    4. In function ngx_ssl_engine_qat_heuristic_poll(), none of the six num_* variables ever increases, so qat_engine_poll() never gets a chance to execute.
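
The spin described in points 1 to 3 can be illustrated with a small stand-alone model. This is only a sketch under assumptions: the `sim_*` names, the `callback_gets_cpu` flag and the iteration cap are hypothetical and do not appear in nginx or QAT_Engine. The model assumes that a paused job makes the receive call report "again" without clearing the ready flag, as observed above, and that only a poll can resume the job.

```c
#include <stdbool.h>

#define SIM_AGAIN (-2)  /* stands in for NGX_AGAIN (hypothetical constant) */

typedef struct {
    bool ready;       /* models c->read->ready */
    bool job_paused;  /* models an async job waiting for the QAT callback */
} sim_conn_t;

/* Models the poll path: in this sketch, only a poll resumes the paused job. */
static void sim_poll(sim_conn_t *c)
{
    c->job_paused = false;
}

/* Models recv(): while the job is paused it reports SIM_AGAIN and leaves
 * the ready flag untouched, which is the behaviour observed in the post. */
static int sim_recv(sim_conn_t *c)
{
    if (c->job_paused) {
        return SIM_AGAIN;
    }
    c->ready = false;  /* data consumed, nothing more to read */
    return 1;          /* pretend one byte was read */
}

/* Models the read loop. Returns the number of iterations needed to leave
 * the loop, or -1 if the cap is reached, which means the loop is spinning. */
static long sim_read_loop(sim_conn_t *c, long max_iters, bool callback_gets_cpu)
{
    for (long i = 1; i <= max_iters; i++) {
        int n = sim_recv(c);
        if (n != SIM_AGAIN) {
            return i;  /* progress was made */
        }
        if (!c->ready) {
            return i;  /* normal exit: go back and wait for a read event */
        }
        if (callback_gets_cpu) {
            sim_poll(c);  /* contrast case only, not a proposed fix */
        }
    }
    return -1;
}
```

With `ready` set, the job paused and no poll, `sim_read_loop` hits the cap and returns -1. If the flag lets the poll run, the loop finishes on the second iteration. The flag exists only to show the contrast; it is not meant as a patch for nginx.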
    
    When I change the engine config in nginx.conf as below, the issue disappears, so this works as a workaround:
    qat_heuristic_poll_asym_threshold = 0
    qat_heuristic_poll_sym_threshold = 0
    
    This looks like a logical deadlock: nginx waits for QAT to update the counters, but the counters can only be updated once nginx releases some CPU time.
    Or maybe the following code does not account for SSL connections that have been idle for a long time?
    if (*num_asym_requests_in_flight + *num_kdf_requests_in_flight
    + *num_cipher_requests_in_flight + *num_asym_mb_items_in_queue
    + *num_kdf_mb_items_in_queue + *num_sym_mb_items_in_queue
    >= (int) *ngx_ssl_active) {
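
A simplified model of this poll decision shows why zero thresholds work around the problem. It is a sketch, not the module's source: the `sim_*` names are hypothetical, and the two per-threshold checks are an assumption inferred from the workaround above. Only the sum-versus-active-connections test is taken from the quoted code.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the six num_* counters named in the post. */
typedef struct {
    int asym_in_flight;
    int kdf_in_flight;
    int cipher_in_flight;
    int asym_mb_queued;
    int kdf_mb_queued;
    int sym_mb_queued;
} sim_counters_t;

/* Returns true if this model would call the engine poll function. */
static bool sim_should_poll(const sim_counters_t *n, int ssl_active,
                            int asym_threshold, int sym_threshold)
{
    int total = n->asym_in_flight + n->kdf_in_flight + n->cipher_in_flight
              + n->asym_mb_queued + n->kdf_mb_queued + n->sym_mb_queued;

    /* Condition quoted in the post: poll once the outstanding work
     * reaches the number of active SSL connections. */
    if (total >= ssl_active) {
        return true;
    }

    /* Assumed threshold checks: with a threshold of 0 they always pass,
     * which matches the workaround described in the post. */
    if (n->asym_in_flight >= asym_threshold) {
        return true;
    }
    if (n->cipher_in_flight >= sym_threshold) {
        return true;
    }
    return false;
}
```

With all counters stuck at 0, one active SSL connection and non-zero thresholds (48 and 24 are illustrative values here), the model never polls. With both thresholds set to 0 it polls on every call, so the callback gets a chance to run.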
    
    Does anyone have any idea about this?

    For details, see: https://github.com/intel/QAT_Engine/issues/181

  • Original article: https://www.cnblogs.com/hugetong/p/14922073.html