  • Cache ping-pong

    A simple description: Cache ping-pong is a subtler form of thrashing that can occur when more than one CPU uses data that is cached on the same line. For example, say there are two CPUs using an array of structures:
       struct flags {
          bool pushed;
          bool popped;
       } states[N];
    There is no conflict: one CPU touches only the pushed flags, and the other only the popped flags. But if the two members of an element are cached together, then whenever the first CPU modifies a pushed flag, the cache-coherence machinery must move that cache line over to the second CPU's cache. Ping. And when the second CPU modifies a popped flag, its cache line must be moved back to the first CPU's cache. Pong. The answer is to lay out your data in memory so that the elements needed by each CPU land on separate cache lines. This may mean breaking up logical structures and spreading their elements across multiple arrays, in a pattern reminiscent of old-school FORTRAN:
       struct flags {
          bool pushed[N];
          char pad[LINE_SIZE];
          bool popped[N];
       } states;
    A more general description:
    One way of maintaining cache coherence in multiprocessing designs with CPUs that have local caches is to ensure that a single cache line is never held by more than one CPU at a time. With write-through caches, this is easily implemented by having the CPUs invalidate cache lines on snoop hits. However, if multiple CPUs are working on the same set of data from main memory, this can lead to the following scenario:

    1. CPU #1 reads a cache line from memory.
    2. CPU #2 reads the same line; CPU #1 snoops the access and invalidates its local copy.
    3. CPU #1 needs the data again and has to re-read the entire cache line, invalidating the copy in CPU #2 in the process.
    4. CPU #2 now also re-reads the entire line, invalidating the copy in CPU #1. Lather, rinse, repeat.

    The result is a dramatic performance loss, because the CPUs keep fetching the same data over and over again from slow main memory. Possible solutions include:

    - Use a smarter cache-coherence protocol, such as MESI.
    - Mark the address space in question as cache-inhibited. Most CPUs will then resort to single-word accesses, which should be faster than reloading entire cache lines (usually 32 or 64 bytes).
    - If the data set is small, make one copy in memory for each CPU.
    - If the data set is large and processed sequentially, have each CPU work on a different part of it (one starting at the beginning, one at the middle, etc.).
  • Original source: https://www.cnblogs.com/ohscar/p/3109622.html