  • ZooKeeper Study Notes

    Znodes maintain a stat structure that includes version numbers for data changes and ACL changes. The stat structure also has timestamps. The version number, together with the timestamp, allows ZooKeeper to validate the cache and to coordinate updates. Each time a znode's data changes, the version number increases.

    Ephemeral Nodes

    ZooKeeper also has the notion of ephemeral nodes. These znodes exist as long as the session that created them is active. When the session ends, the znode is deleted. Because of this behavior, ephemeral znodes are not allowed to have children.
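
    The two rules above (deletion when the session ends, and no children under an ephemeral znode) can be sketched with a toy in-memory model. This is only an illustration: ZnodeTree, create and end_session are hypothetical names, not part of any ZooKeeper client, and the convention that owner 0 means "persistent" is borrowed from the ephemeralOwner field of the Stat structure.

```python
# Toy in-memory model of ephemeral znodes; not the ZooKeeper implementation.

class ZnodeTree:
    def __init__(self):
        # path -> owning session id; 0 marks a persistent znode
        self.owner = {"/": 0}

    def create(self, path, session_id=0):
        parent = path.rsplit("/", 1)[0] or "/"
        if parent not in self.owner:
            raise KeyError("parent does not exist: " + parent)
        if self.owner[parent] != 0:
            raise ValueError("ephemeral znodes cannot have children")
        self.owner[path] = session_id

    def end_session(self, session_id):
        """Delete every ephemeral znode created by the given session."""
        if session_id == 0:
            raise ValueError("0 is reserved for persistent znodes")
        for path in [p for p, o in self.owner.items() if o == session_id]:
            del self.owner[path]


tree = ZnodeTree()
tree.create("/workers")                      # persistent parent
tree.create("/workers/w1", session_id=42)    # ephemeral, owned by session 42
tree.end_session(42)
print(sorted(tree.owner))                    # the ephemeral znode is gone
```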

    Sequence Nodes -- Unique Naming

    When creating a znode you can also request that ZooKeeper append a monotonically increasing counter to the end of the path. This counter is unique to the parent znode. The counter has a format of %010d, that is, 10 digits with 0 (zero) padding, e.g. "<path>0000000001". Note: the counter used to store the next sequence number is a signed int (4 bytes) maintained by the parent node.
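
    As a small illustration of the %010d format, the sketch below builds sequential names by hand. The helper sequential_name and the path "/app/lock-" are made up for the example; in a real deployment the server appends the counter, not the client.

```python
# Sketch of how a sequential znode name is formed: the requested path
# followed by a 10-digit, zero-padded counter (format %010d).
# `sequential_name` is a hypothetical helper, not a ZooKeeper API.

def sequential_name(path: str, counter: int) -> str:
    """Append the zero-padded counter to the requested path."""
    return "%s%010d" % (path, counter)


names = [sequential_name("/app/lock-", n) for n in (1, 2, 10)]
print(names)

# Because of the fixed-width zero padding, the names sort as strings
# in the same order as their counters.
assert names == sorted(names)
```

    The fixed width is what lets recipes such as locks and queues order children by name alone.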

    Time in ZooKeeper

    ZooKeeper tracks time multiple ways:

    • Zxid (ZooKeeper's transaction stamp)

      Every change to the ZooKeeper state receives a stamp in the form of a zxid (ZooKeeper Transaction Id). This exposes the total ordering of all changes to ZooKeeper. Each change will have a unique zxid and if zxid1 is smaller than zxid2 then zxid1 happened before zxid2.

    • Version numbers

      Every change to a node will cause an increase to one of the version numbers of that node. The three version numbers are version (number of changes to the data of a znode), cversion (number of changes to the children of a znode), and aversion (number of changes to the ACL of a znode).

    • Ticks (used to maintain server heartbeats)

      When using multi-server ZooKeeper, servers use ticks to define timing of events such as status uploads, session timeouts, connection timeouts between peers, etc.

    • Real time

      ZooKeeper doesn't use real time, or clock time, at all except to put timestamps into the stat structure on znode creation and znode modification.

    ZooKeeper Stat Structure

    The Stat structure for each znode in ZooKeeper is made up of the following fields:

    • czxid (creation zxid)

      The zxid of the change that caused this znode to be created.

    • mzxid (last-modification zxid)

      The zxid of the change that last modified this znode.

    • ctime (creation time)

      The time in milliseconds from epoch when this znode was created.

    • mtime (last-modification time)

      The time in milliseconds from epoch when this znode was last modified.

    • version

      The number of changes to the data of this znode.

    • cversion

      The number of changes to the children of this znode.

    • aversion

      The number of changes to the ACL of this znode.

    • ephemeralOwner

      The session id of the owner of this znode if the znode is an ephemeral node. If it is not an ephemeral node, it will be zero.

    • dataLength

      The length of the data field of this znode.

    • numChildren

      The number of children of this znode.
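
    To make the fields concrete, here is a minimal sketch that mirrors them in a Python dataclass and shows how a version-checked write can use the data version to coordinate updates. Znode, set_data and BadVersion are hypothetical names for this illustration. Treating an expected version of -1 as "match any version" follows the convention of the real client API, but the rest is a simplified model rather than the server's actual logic.

```python
import time
from dataclasses import dataclass


@dataclass
class Stat:
    czxid: int               # zxid of the change that created the znode
    mzxid: int               # zxid of the change that last modified it
    ctime: int               # creation time, ms since epoch
    mtime: int               # last modification time, ms since epoch
    version: int = 0         # number of data changes
    cversion: int = 0        # number of changes to the children
    aversion: int = 0        # number of ACL changes
    ephemeralOwner: int = 0  # owning session id, 0 if not ephemeral
    dataLength: int = 0
    numChildren: int = 0


class BadVersion(Exception):
    pass


class Znode:
    def __init__(self, data: bytes, zxid: int, owner_session: int = 0):
        now = int(time.time() * 1000)
        self.data = data
        self.stat = Stat(czxid=zxid, mzxid=zxid, ctime=now, mtime=now,
                         ephemeralOwner=owner_session, dataLength=len(data))

    def set_data(self, data: bytes, expected_version: int, zxid: int) -> Stat:
        # Reject the write if someone else changed the data in the meantime.
        if expected_version != -1 and expected_version != self.stat.version:
            raise BadVersion("expected %d, found %d"
                             % (expected_version, self.stat.version))
        self.data = data
        self.stat.version += 1
        self.stat.mzxid = zxid
        self.stat.mtime = int(time.time() * 1000)
        self.stat.dataLength = len(data)
        return self.stat


node = Znode(b"v0", zxid=1)
node.set_data(b"v1", expected_version=0, zxid=2)   # succeeds, version becomes 1
try:
    node.set_data(b"stale", expected_version=0, zxid=3)
except BadVersion as err:
    print("rejected stale write:", err)
```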

      ZooKeeper Sessions

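      A ZooKeeper client establishes a session with the ZooKeeper service by creating a handle to the service using a language binding. The handle then moves from the CONNECTING state to the CONNECTED state. If an unrecoverable error occurs, such as session expiration or authentication failure, or if the application closes the handle, the handle moves to the CLOSED state.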

      To create a client session, the application code must provide a connection string containing a comma-separated list of host:port pairs, each corresponding to a ZooKeeper server (e.g. "127.0.0.1:4545" or "127.0.0.1:3000,127.0.0.1:3001,127.0.0.1:3002"). The ZooKeeper client library will pick an arbitrary server and try to connect to it. If this connection fails, or if the client becomes disconnected from the server for any reason, the client will automatically try the next server in the list until a connection is (re-)established.
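
      The connection-string handling described above can be sketched as follows. This is a simplified model, not the client library's code: parse_connect_string and server_rotation are hypothetical helpers, and shuffling once and then cycling through the list is just one way to get "pick an arbitrary server, then try the next one". The sketch handles only plain host:port entries.

```python
import itertools
import random


def parse_connect_string(connect_string):
    """Split "host:port,host:port" into a list of (host, port) tuples."""
    servers = []
    for entry in connect_string.split(","):
        host, _, port = entry.strip().rpartition(":")
        servers.append((host, int(port)))
    return servers


def server_rotation(servers, rng=random):
    """Yield servers forever: arbitrary starting order, then round-robin."""
    order = list(servers)
    rng.shuffle(order)
    return itertools.cycle(order)


servers = parse_connect_string("127.0.0.1:3000,127.0.0.1:3001,127.0.0.1:3002")
rotation = server_rotation(servers)

# Each failed or dropped connection would simply advance the rotation.
first_three = [next(rotation) for _ in range(3)]
print(first_three)
```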

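      Session expiration is managed by the ZooKeeper cluster itself, not by the client. When the ZooKeeper client establishes a session with the cluster, it provides a "timeout" value. When the timeout elapses without the cluster hearing from the client, the cluster deletes the ephemeral nodes owned by that session and immediately notifies the affected clients of the change (anyone watching those znodes).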

      Example state transitions for an expired session as seen by the expired session's watcher:

      1. 'connected' : session is established and client is communicating with cluster (client/server communication is operating properly)

      2. .... client is partitioned from the cluster

      3. 'disconnected' : client has lost connectivity with the cluster

      4. .... time elapses, after 'timeout' period the cluster expires the session, nothing is seen by client as it is disconnected from cluster

      5. .... time elapses, the client regains network level connectivity with the cluster

      6. 'expired' : eventually the client reconnects to the cluster, it is then notified of the expiration
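
      The six steps above can be replayed with a fake clock to show why the client only learns about the expiration after it reconnects. ClusterSession and replay are hypothetical names for this sketch, and the 10-second timeout and the timestamps are arbitrary values. The real server-side bookkeeping is more involved.

```python
class ClusterSession:
    """Cluster-side view of one client session (toy model)."""

    def __init__(self, timeout_ms):
        self.timeout_ms = timeout_ms
        self.last_heard_ms = 0
        self.expired = False

    def tick(self, now_ms):
        # The cluster, not the client, decides when the session has expired.
        if now_ms - self.last_heard_ms > self.timeout_ms:
            self.expired = True

    def heartbeat(self, now_ms):
        """Client contacts the cluster; returns False if the session is gone."""
        self.tick(now_ms)
        if not self.expired:
            self.last_heard_ms = now_ms
        return not self.expired


def replay():
    events = []                        # what the client's watcher sees
    session = ClusterSession(timeout_ms=10_000)

    session.heartbeat(0)
    events.append("connected")         # step 1
    events.append("disconnected")      # steps 2-3: partition, no heartbeats
    session.tick(15_000)               # step 4: cluster expires the session,
                                       #         the client sees nothing
    if not session.heartbeat(20_000):  # steps 5-6: reconnect, then notified
        events.append("expired")
    return events


print(replay())
```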

      Once a connection to the server is successfully established (connected), there are basically two cases where the client library generates a connection-loss error (connectionloss) when either a synchronous or asynchronous operation is performed and one of the following holds:

      1. The application calls an operation on a session that is no longer alive/valid

      2. The ZooKeeper client disconnects from a server when there are pending operations to that server, i.e., there is a pending asynchronous call.

      ZooKeeper Watches

      Here is ZooKeeper's definition of a watch: a watch event is a one-time trigger, sent to the client that set the watch, which occurs when the data for which the watch was set changes. There are three key points to consider in this definition of a watch:

      • One-time trigger

        One watch event will be sent to the client when the data has changed. For example, if a client does a getData("/znode1", true) and later the data for /znode1 is changed or deleted, the client will get a watch event for /znode1. If /znode1 changes again, no watch event will be sent unless the client has done another read that sets a new watch.

      • Sent to the client

        This implies that an event is on the way to the client, but may not reach the client before the successful return code to the change operation reaches the client that initiated the change. Watches are sent asynchronously to watchers. ZooKeeper provides an ordering guarantee: a client will never see a change for which it has set a watch until it first sees the watch event. 

      • The data for which the watch was set

        This refers to the different ways a node can change. It helps to think of ZooKeeper as maintaining two lists of watches: data watches and child watches. getData() and exists() set data watches. getChildren() sets child watches. Alternatively, it may help to think of watches being set according to the kind of data returned. getData() and exists() return information about the data of the node, whereas getChildren() returns a list of children. Thus, setData() will trigger data watches for the znode being set (assuming the set is successful). A successful create() will trigger a data watch for the znode being created and a child watch for the parent znode. A successful delete() will trigger both a data watch and a child watch (since there can be no more children) for a znode being deleted as well as a child watch for the parent znode.
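
      The "two lists of watches" picture and the one-time trigger rule can be sketched in a few lines. WatchTable and its methods are hypothetical names for this illustration. The event names match ZooKeeper's event types, but the dispatch logic is a simplified model of the behavior described above, not the server's code.

```python
class WatchTable:
    """Toy model: one table of data watches, one table of child watches."""

    def __init__(self):
        self.data_watches = {}    # path -> callbacks set by getData()/exists()
        self.child_watches = {}   # path -> callbacks set by getChildren()

    def watch_data(self, path, callback):
        self.data_watches.setdefault(path, []).append(callback)

    def watch_children(self, path, callback):
        self.child_watches.setdefault(path, []).append(callback)

    @staticmethod
    def _parent(path):
        return path.rsplit("/", 1)[0] or "/"

    @staticmethod
    def _fire(table, path, event):
        # pop() removes the watches, so each one fires at most once.
        for callback in table.pop(path, []):
            callback(event, path)

    def on_set_data(self, path):
        self._fire(self.data_watches, path, "NodeDataChanged")

    def on_create(self, path):
        self._fire(self.data_watches, path, "NodeCreated")
        self._fire(self.child_watches, self._parent(path), "NodeChildrenChanged")

    def on_delete(self, path):
        self._fire(self.data_watches, path, "NodeDeleted")
        self._fire(self.child_watches, path, "NodeDeleted")
        self._fire(self.child_watches, self._parent(path), "NodeChildrenChanged")


seen = []
watches = WatchTable()
watches.watch_data("/znode1", lambda event, path: seen.append((event, path)))
watches.on_set_data("/znode1")   # fires the watch
watches.on_set_data("/znode1")   # no event: the watch was not set again
print(seen)
```

      To keep observing a znode, a client has to set a new watch each time it handles an event, typically by issuing another read.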

      Watches are maintained locally at the ZooKeeper server to which the client is connected. This allows watches to be lightweight to set, maintain, and dispatch. When a client connects to a new server, the watch will be triggered for any session events. Watches will not be received while disconnected from a server. When a client reconnects, any previously registered watches will be reregistered and triggered if needed. In general this all occurs transparently. There is one case where a watch may be missed: a watch for the existence of a znode not yet created will be missed if the znode is created and deleted while disconnected.

      Consistency Guarantees

      ZooKeeper is a high-performance, scalable service. Both read and write operations are designed to be fast, though reads are faster than writes. The reason for this is that in the case of reads, ZooKeeper can serve older data, which in turn is due to ZooKeeper's consistency guarantees:

      Sequential Consistency

      Updates from a client will be applied in the order that they were sent.

      Atomicity

      Updates either succeed or fail -- there are no partial results.

      Single System Image

      A client will see the same view of the service regardless of the server that it connects to.

      Reliability

      Once an update has been applied, it will persist from that time forward until a client overwrites the update. This guarantee has two corollaries:

      1. If a client gets a successful return code, the update will have been applied. On some failures (communication errors, timeouts, etc) the client will not know if the update has applied or not. We take steps to minimize the failures, but the guarantee is only present with successful return codes. (This is called the monotonicity condition in Paxos.)

      2. Any updates that are seen by the client, through a read request or successful update, will never be rolled back when recovering from server failures.

      Timeliness

      The client's view of the system is guaranteed to be up-to-date within a certain time bound (on the order of tens of seconds). Either system changes will be seen by a client within this bound, or the client will detect a service outage.

      Using these consistency guarantees it is easy to build higher level functions such as leader election, barriers, queues, and read/write revocable locks solely at the ZooKeeper client (no additions needed to ZooKeeper). See Recipes and Solutions for more details.

       


  • Original article: https://www.cnblogs.com/clara/p/3130922.html