Content
1. Discusses the ways people deploy servers in modern web hosting environments,
HTTP support for virtual web hosting, and how to replicate content across geographically distant servers.
1.1 The collective duties of storing, brokering, and administering content resources are called web hosting.
Hoster
Virtually hosted: a request to a virtual server lacks host information (an HTTP/1.0 request line carries only the path) ->
fixes: virtual hosting by port number / by IP address / by Host header.
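A minimal sketch of virtual hosting by Host header, to make the idea concrete. The hostnames, document roots, and the `pick_docroot` helper are all made up for illustration; a real server does this in its own configuration layer. The sketch also ignores IPv6 literals in the Host value.

```python
# Hypothetical mapping from site hostname to document root (illustrative only).
DOCROOTS = {
    "site-a.example": "/srv/site-a",
    "site-b.example": "/srv/site-b",
}


def pick_docroot(headers, docroots=DOCROOTS):
    """Return the document root for the site named in the Host header, or None."""
    host = None
    for name, value in headers.items():
        # Header names are case-insensitive.
        if name.lower() == "host":
            host = value.strip().lower()
            break
    if not host:
        # No Host header (HTTP/1.0 style): the server cannot tell the sites apart.
        return None
    # Drop an optional port, e.g. "site-a.example:8080" -> "site-a.example".
    hostname = host.split(":", 1)[0]
    return docroots.get(hostname)
```

With this mapping, a request carrying `Host: site-b.example` is served from `/srv/site-b`, while a request without a Host header gets `None`, which is exactly the ambiguity the Host header was introduced to remove.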
1.2 Making Web Sites Reliable
a. Mirrored Server Farms: a server farm is a set of servers that are clones of each other and act as backups for each other.
b. Content Distribution Networks
Surrogate Caches in CDNs: cache content only for a specific set of origin web servers.
Proxy Caches in CDNs: cache content for any request that passes through them, whatever the origin server.
2. Discusses the technologies for creating web content and installing it onto web servers:
How to create content and publish it, much like how I now write a blog post and publish it to cnblogs. This part seems out of date.
3. Distributing incoming web traffic among a collection of servers.
Why? Perform HTTP transactions reliably / minimize delay / conserve network bandwidth.
Where? At any node the request passes through on its way to the responding server: DNS, proxy, cache, load balancer, HTTP redirection,
and also IP redirection / MAC redirection.
3.1 DNS redirection
DNS can also do load balancing. If a site is hosted on more than one server with different IPs, the DNS server can use some strategy to choose one web server (IP) from the server list. One simple way: DNS round robin.
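DNS round robin can be sketched roughly as follows. This is only a toy in-memory model, not a real DNS server: the `RoundRobinZone` class and the addresses (taken from the documentation range 192.0.2.0/24) are assumptions for illustration. Each lookup returns the full address list and then rotates it, so clients that take the first address get spread across the servers.

```python
from collections import deque


class RoundRobinZone:
    """Toy model of a DNS zone that rotates its address list on every lookup."""

    def __init__(self, records):
        # records: hostname -> list of IP addresses (hypothetical data).
        self._records = {name: deque(ips) for name, ips in records.items()}

    def resolve(self, name):
        addrs = self._records[name]
        answer = list(addrs)  # reply with the current ordering
        addrs.rotate(-1)      # move the first address to the end for the next lookup
        return answer


zone = RoundRobinZone({"www.example.com": ["192.0.2.1", "192.0.2.2", "192.0.2.3"]})
first = zone.resolve("www.example.com")
second = zone.resolve("www.example.com")
```

Here `first` starts with 192.0.2.1 and `second` starts with 192.0.2.2. Note that round robin knows nothing about server load; it only evens out which address is listed first.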
3.2 Proxy
There are three ways for a client to learn which proxy to send its HTTP requests to.
Explicit Browser Configuration: the user sets the proxy in the web browser or other client. The proxy setting is static.
Proxy Auto-configuration: the user configures only the URL of a PAC file; the client downloads it and runs its JavaScript function to figure out which proxy to use for each URL.
Web Proxy Autodiscovery: with nothing for the user to do, the client/browser automatically discovers a configuration server, fetches the proxy configuration file, and applies it. It is much smarter, but it is not widely used.
3.3 Cache: CDN.
3.4 HTTP redirection
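With HTTP redirection, the server that receives the request does not answer it itself but replies with a 302 status and a Location header that points the client at another server. A minimal sketch, assuming a hypothetical list of replica hostnames and a random choice as the selection strategy (a real redirector might pick by load or by client location instead):

```python
import random


def redirect_response(path, servers, choose=random.choice):
    """Build a raw 302 response that sends the client to one of the servers."""
    target = choose(servers)  # selection strategy is pluggable; random by default
    location = f"http://{target}{path}"
    return (
        "HTTP/1.1 302 Found\r\n"
        f"Location: {location}\r\n"
        "Content-Length: 0\r\n"
        "\r\n"
    )


# Hypothetical replica servers.
response = redirect_response("/index.html", ["s1.example.com", "s2.example.com"])
```

The cost of this approach is an extra round trip: the client first talks to the redirecting server and only then to the server that actually holds the content.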
4. Log formats
Why? One reason is debugging and the other is statistics.
A standard log format makes logs easy to manage, because many tools can collect and analyze them. Standards make life much better.
The standard log formats: Common Log Format, Combined Log Format.
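To show why a standard format helps tooling, here is a small sketch of a Common Log Format parser. The field layout (host, ident, user, timestamp, request line, status, size) follows the Common Log Format; the `parse_clf` helper, the group names, and the sample line are my own assumptions for illustration. Because it matches only the prefix of the line, it also accepts Combined Log Format lines and simply ignores the extra referer and user-agent fields.

```python
import re

# host ident user [time] "request" status size
CLF_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)


def parse_clf(line):
    """Parse one log line into a dict, or return None if it does not match."""
    match = CLF_PATTERN.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    entry["status"] = int(entry["status"])
    # A size of "-" means no body was sent; treat it as 0 bytes here.
    entry["size"] = 0 if entry["size"] == "-" else int(entry["size"])
    return entry


# Made-up sample line in Common Log Format.
sample = '192.0.2.7 - alice [10/Oct/2000:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326'
entry = parse_clf(sample)
```

After parsing, statistics become simple dictionary work, for example counting entries per `status` or summing `size` per `host`.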
One problem: the web server wants statistics from its logs, but cache nodes on the web answer some requests themselves, so it is hard to get precise data.
There are approaches to address this problem.
One approach is to define a new protocol: RFC 2227 defines Hit Metering, which describes the communication between cache nodes and the web server.