    Design of a Proxy-Sharing Proxy Server

    Clinton L. Jeffery and Samir R. Das

    Division of Computer Science

    The University of Texas at San Antonio

    {jeffery,samir}@cs.utsa.edu

    Abstract

    Proxy servers that cache internet resources have gained wide
    acceptance at the enterprise level. Attempts are being made to extend proxy
    servers to support multi-level caching in order to reduce internetwork load
    and improve performance. This paper argues against
    hierarchical models and presents an alternative. Good performance and
    scalability require an implementation model that is
    non-hierarchical and utilizes the natural topology of the internet.
    The ideas in this position paper appeared in the proceedings of the IEEE
    ETACOM '96 Conference in Portland, Oregon, May 1996.

    Introduction

    The rapid expansion of the internet has resulted in extremely high network
    traffic and degraded response times. Web caching at the individual client
    level has always been ubiquitous but is insufficient. Performance analysis
    of current Web servers indicates a strong need for aggressive, distributed
    caching [Kwan95]. Ideally, such aggressive caching techniques provide ways
    to share caches without introducing overly complex software layers or
    resource requirements to the existing model.

    Extensions to the single client-level cache have so far followed the loose
    analogy to multi-level caches found in the memory hierarchies used in
    computer designs. Caching at the local network level is provided by proxy
    servers. If proxy servers are configured to request their resources through
    other proxy servers, a multi-level cache results; a similar idea was
    used in Harvest, and has been applied to proxy caching in the Squid
    proxy server [squid@nlanr.net].

    There are two problems with the multi-level cache analogy. First, it
    ignores the topology of the internet; multi-level caching only makes sense
    through those levels of the network where all the information is flowing
    from the same location. Second, hierarchical models scale poorly and are
    economically unattractive, because each succeeding level in a caching
    hierarchy needs a larger cache. It is better to make each entity in
    the network pay for the size of cache that its own use requires.

    We propose to do away with the hierarchical model and extend proxy servers
    to share resources with each other non-hierarchically. Distributed file
    systems such as AFS provide such non-hierarchical caching and are used to
    cache internet resources without imposing additional complexity on the
    server [Kwan95]. The distributed file system approach is a heavyweight
    solution, however; extending it across the entire internet any time soon is
    impractical, and arguably undesirable, for both technical and political
    reasons.

    What is needed is a light-weight alternative for caching and sharing
    internet resources. Proxy servers are a good starting point; we are
    providing a simple and efficient means of sharing them. Proxy sharing
    requires more than just configuring servers to allow outside requests;
    information about which nearby server(s) have cached a remotely requested
    resource must also be available. For each resource, our servers maintain a
    log that tracks proxy
    servers that cached the resource on a recent access and are willing to share
    it. The share log is cross-referenced with knowledge of the internet
    topology to determine the closest cache for any given request of a resource.
    We outline implementation techniques below.
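
    As a concrete illustration of the share log, the following Python sketch
    keeps, for each resource path, a short list of proxies that recently fetched
    the resource and offered to share it. The class and method names (ShareLog,
    record_offer, candidates) and the ten-entry limit are assumptions made for
    illustration, not details taken from the prototype.

```python
# Hypothetical sketch of the per-resource "share log" described above; names
# and the entry limit are illustrative, not taken from the actual prototype.
import time
from collections import defaultdict

MAX_ENTRIES = 10   # keep only a handful of recent sharers per resource

class ShareLog:
    def __init__(self):
        # resource path -> list of (proxy_host, last_offer_timestamp)
        self._log = defaultdict(list)

    def record_offer(self, resource, proxy_host):
        """Remember that proxy_host fetched `resource` and offered to share it."""
        entries = [(h, t) for (h, t) in self._log[resource] if h != proxy_host]
        entries.append((proxy_host, time.time()))
        # keep only the most recent offers
        self._log[resource] = sorted(entries, key=lambda e: e[1])[-MAX_ENTRIES:]

    def candidates(self, resource):
        """Return recent sharers of `resource`, newest first."""
        return [h for (h, _) in reversed(self._log[resource])]
```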

    Example

    Proxy sharing may be illustrated by an example. Without sharing, when a
    user in San Antonio accesses a Web page in Boulder, a short request message
    is sent to Boulder and the Web page, which is usually large, is sent back.
    It is reasonably likely that someone else located nearer on the Internet
    than Boulder has accessed this Web page recently. Perhaps, someone in
    Austin accessed it an hour ago. If a proxy server in Austin cached the page
    and is configured to share resources with others who request it, the San
    Antonio user can get the Web page from Austin if they can find it. The user
    doesn't have any way of knowing who might have a recent copy of the Web page
    -- but the server in Boulder does. It can supply a list of sites that
    recently obtained the page.

    The mechanism for maintaining and providing such a list is the first half of
    proxy sharing. The second half is determining which site is nearest to the
    user and obtaining pages from such third-party locations. The nearest site
    might be inferred from a knowledge-base of the internet's network topology,
    but different policies with different definitions of "nearest" are worth
    exploring. For example, if Austin's site is heavily loaded, some other
    location (say, Dallas) might provide the information more quickly even if
    the static network topology places it farther away. As long as Dallas is
    nearer and faster than Boulder, the user, the server in Boulder, and the
    rest of the Internet community all benefit from improved performance and
    reduced network traffic, compared with obtaining the page from its original
    source in Boulder. The proxy-sharing server in Austin or Dallas has
    suffered some increase in CPU load, but only because it obtained that
    resource itself recently, and agreed to share it; besides, that server
    benefits from accessing other proxies' resources.

    Approach

    Our prototype implementation of proxy sharing is still in progress. Initial
    efforts were based on the CERN server, and were discontinued in favor of the
    Apache server. We have only recently become aware of the Squid server, which
    may be more appropriate for this effort.

    In the first prototype, we extended the web server to maintain a list of
    candidate proxies for each of its resources; these are proxies that accessed
    the resource recently and offered to share it from their cache. The HTTP
    protocol was extended slightly so that resource requests include such an
    offer in an optional header line. Requests that use the extended protocol
    are called proxy requests.
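
    The paper does not give the header syntax; the sketch below shows one
    plausible shape for a proxy request, an ordinary HTTP GET carrying one
    extra, optional header with the sharing offer. The header name
    Proxy-Share-Offer and the host names are invented for illustration.

```python
# Hypothetical form of a "proxy request": a normal GET plus one optional header
# carrying the sharing offer. The header name and hosts are invented here; the
# paper only says the offer travels in an optional header line.
import http.client

conn = http.client.HTTPConnection("www.example.edu", 80)
conn.request("GET", "/reports/annual.html", headers={
    # "I am a proxy at this address, and I will cache and share this resource."
    "Proxy-Share-Offer": "proxy.cs.utsa.edu:8080",
})
response = conn.getresponse()
print(response.status, response.reason)
```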

    When a server receives a resource request, it responds with a short list of
    candidate proxies (say, 5-10) that are closer to the requesting client than
    the owning server is. The requestor then makes a separate request to a
    suitable server to get a copy of the document. Proxy-sharing servers
    distinguish between an ordinary HTTP request and a proxy request. Ordinary
    HTTP requests are served in the usual way. Proxy requests for resources
    whose size is smaller than a threshold value are also served directly rather
    than deferred to a proxy site.
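
    A minimal sketch of this dispatch rule follows. The size threshold, list
    length, and toy in-memory stand-ins for the server's state are all assumed
    values for illustration; the actual prototype was built as an extension of
    the Apache server.

```python
# Minimal sketch of the server-side dispatch described above; the threshold,
# list length, and in-memory state are assumptions, not prototype values.
SIZE_THRESHOLD = 10 * 1024   # serve small resources directly (assumed cutoff)
SHORT_LIST_LEN = 8           # "say, 5-10" candidate proxies

resources = {"/reports/annual.html": b"x" * 50_000}            # path -> content
share_log = {"/reports/annual.html": ["proxy.austin.example",  # path -> recent
                                      "proxy.dallas.example"]} # sharers

def handle_request(path, is_proxy_request, client_host):
    # (topology-based filtering by client_host is omitted in this sketch)
    body = resources[path]
    if not is_proxy_request or len(body) < SIZE_THRESHOLD:
        # ordinary HTTP request, or a resource too small to be worth redirecting
        return ("RESOURCE", body)
    candidates = share_log.get(path, [])[:SHORT_LIST_LEN]
    if not candidates:
        return ("RESOURCE", body)   # nobody to refer to; serve it ourselves
    # proxy request for a large resource: reply with nearby candidate caches
    return ("CANDIDATES", candidates)

print(handle_request("/reports/annual.html", True, "proxy.cs.utsa.edu"))
```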

    There are technical and policy issues that must be addressed in a full
    implementation of the above protocol. The major issue is discovering the
    internet topology, so that owner servers can send the client a list of
    suitable caching servers and the client can pick one from that list to
    access the document. Estimating internet topology is a daunting
    task, especially in the face of changing physical connections, machine
    faults and load variations.

    The first proxy sharing prototype utilized crude static topology information
    (country and continent, and user-specified latitude and longitude) to filter
    candidates during the construction of the "short list" of candidate proxies
    by the original resource server. The requesting machine then asks each member
    of the short list for the resource's HTTP header, in order to select a proxy
    that still has the resource in its cache and can deliver it most quickly.
    Research on obtaining network topology in other
    internetworking contexts (e.g., for multicast protocols [Deering-92]) may
    allow improvements.
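
    The selection step on the client side might look like the following sketch:
    probe each candidate with a HEAD request and keep the fastest one that still
    reports the resource. The host names, the port, and the sequential (rather
    than concurrent) probing are simplifications for illustration.

```python
# Sketch of the client-side selection step: probe each candidate with a HEAD
# request and keep the fastest one that still holds the resource. Hosts, port,
# and sequential probing are simplifications; real code would probe in parallel.
import http.client
import time

def pick_fastest(candidates, path, port=8080, timeout=2.0):
    best_host, best_time = None, None
    for host in candidates:
        try:
            start = time.monotonic()
            conn = http.client.HTTPConnection(host, port, timeout=timeout)
            conn.request("HEAD", path)
            ok = conn.getresponse().status == 200
            conn.close()
            elapsed = time.monotonic() - start
        except OSError:
            continue                      # unreachable or timed out; skip it
        if ok and (best_time is None or elapsed < best_time):
            best_host, best_time = host, elapsed
    return best_host                      # None means fall back to the origin

# e.g. pick_fastest(["proxy.austin.example", "proxy.dallas.example"],
#                   "/reports/annual.html")
```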

    Resource Requirements and Performance Implications

    The disk space required to support a proxy server's cache is cost effective
    even when it is used only to support that enterprise's needs. Proxy sharing
    allows neighbors to piggyback resource requests; the CPU overhead imposed by
    the courtesy of servicing outside requests is offset by savings an
    institution reaps in improved internet performance and access to the caches
    on neighbors' proxy servers. Since outside requests only occur for
    resources a site is known to have accessed recently, no overhead is imposed
    on sites that are not themselves requesting resources. In
    security-conscious environments, proxy sharing can only be utilized on
    intranets or by placing a proxy-sharing server outside the firewall.

    The main new resource required by proxy sharing in our model is the
    memory required by each server to maintain its list of candidate proxies.
    The actual memory consumed may be tuned to fit an individual server's
    situation, but if the lists are maintained for each resource of a size
    greater than 10 kbytes, and a typical server has, say, 1000 such resources,
    and each candidate proxy on the list consumes 100 bytes (machine name,
    time stamp, etc.) and the lists are 10 entries long, the lists could
    easily consume on the order of a megabyte on the server. In practice,
    this amount of virtual memory is not a problem, and most sites would need
    less.
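
    Spelled out, the back-of-the-envelope estimate above is simply:

```python
# The memory estimate above, using the paper's illustrative numbers.
resources_with_lists = 1000   # resources larger than 10 kbytes
entries_per_list     = 10     # candidate proxies remembered per resource
bytes_per_entry      = 100    # machine name, time stamp, etc.

total_bytes = resources_with_lists * entries_per_list * bytes_per_entry
print(total_bytes)            # 1,000,000 bytes -- on the order of a megabyte
```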

    Proxy sharing has multiple effects on performance; changes in average
    access times on resources may be more important to end-users, but changes
    in total network traffic may be equally important. In the worst case,
    a server sends back a list of candidate proxies that are all unavailable
    or have flushed the resource from their caches, leaving the client with
    the task of querying the server for the resource again. For large resources
    fetched across long distances, the cost of checking for a local copy first
    will be insignificant, while for smaller files and shorter distances it will be
    better to respond with the resource than a list of proxy candidates.
    These sorts of tunable parameters provide a rich area for further work.

    Our initial tuning of such parameters starts with the common case in which
    relatively small HTML pages contain references to relatively large
    embedded images. The demand for low latency suggests that the HTML content
    may be sent directly from the remote server. In this case the list of
    candidate proxies can be piggybacked on the message containing the HTML
    page, and the client proxy server can request subsequent images and other
    resources from a nearby sharing server with low latency, consistent with
    the improvements in the upcoming HTTP 1.1 protocol.
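
    The sketch below shows one plausible shape for this piggybacking: the HTML
    comes back directly, and a made-up response header carries the candidate
    list that the client proxy consults when it fetches the embedded images.
    The header name and host names are invented for illustration.

```python
# Hypothetical shape of the piggybacked reply described above; the header name
# and hosts are invented, and a real client would issue actual HTTP requests.
response_headers = {
    "Content-Type": "text/html",
    # candidate caches for the resources referenced by this page, nearest first
    "Proxy-Share-Candidates": "proxy.austin.example:8080, proxy.dallas.example:8080",
}
html_body = b"<html> ... <img src='/figs/map.gif'> ... </html>"

candidates = [h.strip() for h in
              response_headers.get("Proxy-Share-Candidates", "").split(",")
              if h.strip()]
# The client proxy would now try candidates[0] for /figs/map.gif before
# falling back to the origin server.
print(candidates)
```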

    Conclusion

    Proxy sharing promises to reduce the network traffic generated by access
    to large World-Wide Web resources. An implementation is a modest extension
    to the existing proxy server technology. The benefits obtainable by means
    of proxy sharing will be fully realized only if the technology is widely
    adopted, but the cost of adopting the technology is low.

    Acknowledgements

    Shannon Williams contributed to discussions and assisted with various server
    configuration and administration issues. Garry S. Bernal participated in
    the design and implementation of the proxy-sharing proxy server prototype.
    This work has been supported in part by the National Science Foundation
    under grant CCR-9409082 and by a UTSA Faculty Research Award.

    References

    [Deering-92] S. Deering and D. Cheriton.
    "Multicast routing in datagram internetworks and extended LANs."
    ACM Transactions on Computer Systems, 8(2):85--110, May 1992.

    [Kwan95] Thomas T. Kwan, Robert E. McGrath, and Daniel A. Reed.
    "NCSA's World Wide Web Server: Design and Performance."
    IEEE Computer, 28(11):68--74, November 1995.

    [Luotonen94] Ari Luotonen and Kevin Altis.
    "World-Wide Web Proxies."
    In Proceedings of the First International World-Wide Web Conference, 1994.
