  • Simultaneous Multithreading: Maximizing On-Chip Parallelism(2)

    Time

    2020.10.28

    Summary

    Section 2 defines in detail our basic machine model, the workloads that we measure, and the simulation environment that we constructed.

    Section 3 evaluates the performance of a single-threaded superscalar architecture.

    Section 4 presents the performance of a range of SM architectures and compares them to the superscalar architecture, as well as a fine-grain multithreaded processor.

    Section 5 explores the effect of cache design alternatives on the performance of simultaneous multithreading.

    Section 6 compares the SM approach with conventional multiprocessor architectures.

    Section 7 discusses related work.

    Section 8 summarizes our results.

    Research Objective

    Problem Statement

    Method(s)

    Our goal is to evaluate several architectural alternatives as defined in the previous section: wide superscalars, traditional multithreaded processors, simultaneous multithreaded processors, and small-scale multiple-issue multiprocessors. To do this, we have developed a simulation environment that defines an implementation of a simultaneous multithreaded architecture.

    Evaluation

    Conclusion

    Our results show the limits of superscalar execution and traditional multithreading to increase instruction throughput in future processors.
    For example:

    We compare these two approaches and show that simultaneous multithreading is potentially superior to multiprocessing in its ability to utilize processor resources.

    Notes

    A more traditional means of achieving parallelism is the conventional multiprocessor.

    The Standard Performance Evaluation Corporation (SPEC) is an American non-profit corporation that aims to "produce, establish, maintain and endorse a standardized set" of performance benchmarks for computers. SPEC benchmarks are widely used to evaluate the performance of computer systems.

    Words

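    throughput
    the amount of work completed per unit time
    viable
    feasible
    outperforms
    performs better than
    tradeoffs
    compromises between competing goals
    Methodology
    the system of methods used in a study
    model
    to build a model of (something)
    hit rates
    the fraction of accesses that hit (for example, in a cache)
    deviates
    departs from
    pipeline
    to pipeline (organize as a pipeline)
    scheduling window
    the window of instructions considered for scheduling
    complement
    to supplement or complete
    direct-mapped
    (of a cache) each address maps to exactly one location
    hint
    a suggestion or clue
    particular address
    a specific address
    accommodate
    to adapt to or make room for
    the raw instruction
    the original, unprocessed instruction
    uniprocessor applications
    applications written for a single processor
    benchmark
    a standard test used to compare performance
    permutations
    orderings of a set of items
    compilation
    translating source code into machine code
    Bottlenecks
    the points that limit overall performance
    specific
    particular
    bounding
    limiting
    idle cycle
    a cycle in which no useful work is done
    appropriately
    suitably
    quantified
    expressed as a number
    composite
    combined from several parts
    dominant
    prevailing; main
    parallelism
    the ability to do work in parallel
    coarse-grain or fine-grain
    coarse-grained or fine-grained
    assumptions
    things taken to be true
    serially
    one after another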
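    issue
    to dispatch (an instruction) for execution
    partitioning
    dividing into parts
    vary considerably
    differ a great deal
    IPC (Instructions Per Clock)
    the number of instructions completed per clock cycle
    bound
    limit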
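
    The IPC entry in the word list can be made concrete with a little arithmetic. The following is a minimal sketch, not taken from the paper: the function names `ipc` and `issue_utilization` and the sample numbers are hypothetical, and it assumes every completed instruction occupied exactly one issue slot.

```python
def ipc(instructions_completed: int, cycles: int) -> float:
    """Average number of instructions completed per clock cycle."""
    if cycles <= 0:
        raise ValueError("cycles must be positive")
    return instructions_completed / cycles


def issue_utilization(instructions_completed: int, cycles: int, issue_width: int) -> float:
    """Fraction of issue slots that did useful work.

    Assumption: each cycle offers `issue_width` slots and each completed
    instruction used exactly one of them.
    """
    if issue_width <= 0:
        raise ValueError("issue_width must be positive")
    return ipc(instructions_completed, cycles) / issue_width


# Hypothetical numbers, for illustration only.
sample_ipc = ipc(1_500, 1_000)
sample_utilization = issue_utilization(1_500, 1_000, issue_width=8)
print(sample_ipc, sample_utilization)
```

    Under these assumptions, an IPC well below the issue width means most issue slots go unused, which is the kind of waste the notes refer to as idle cycles.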

    Sentence

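    two close organizational alternatives
    two closely related design alternatives
    in the number of pipeline stages required for instruction issue
    in how many pipeline stages it takes to issue an instruction
    Our simulator uses emulation-based instruction-level simulation
    The simulator is built on emulation and works at the level of individual instructions
    Each of the B runs uses a different ordering of the benchmarks
    In each of the B runs, the benchmarks are arranged in a different order
    the Multiflow trace scheduling compiler
    the trace-scheduling compiler from Multiflow (Multiflow is a proper name)
    It is thus unlikely that
    Therefore, it is not likely that
    Not only is there no dominant cause of wasted cycles; there appears to be no dominant solution.
    No single cause accounts for most wasted cycles, and no single fix appears to address them.
    Figure 3 shows the performance of the various models as a function of the number of threads.
    Figure 3 plots how the performance of each model changes with the number of threads.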
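
    The sentence about each of the B runs using a different ordering of the benchmarks can be illustrated with a small sketch. The notes do not say how the orderings were chosen, so the rotation scheme below is an assumption, and the function name `rotated_orderings` and the benchmark names are hypothetical.

```python
from collections import deque


def rotated_orderings(benchmarks):
    """Return len(benchmarks) orderings, each a left rotation of the input.

    Rotation is an assumed scheme, used here only because it guarantees
    that every run sees the benchmarks in a different order.
    """
    items = deque(benchmarks)
    orderings = []
    for _ in range(len(benchmarks)):
        orderings.append(list(items))
        items.rotate(-1)  # move the first benchmark to the end
    return orderings


# Hypothetical benchmark names, for illustration only.
for run, ordering in enumerate(rotated_orderings(["bench_a", "bench_b", "bench_c"])):
    print(run, ordering)
```

    With B benchmarks this yields B runs, and each benchmark appears in each position exactly once across the runs.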

  • Original post: https://www.cnblogs.com/call-me-dasheng/p/13893799.html