1. Application Layer Overview
The protocol number is the glue that binds the network and transport layers together, whereas the port number is the glue that binds the transport and application layers together.
2. Domain Name System
As a replacement for the hosts.txt file dating from ARPANET, DNS resembles a huge hierarchical page table for the Internet, except that it provides name resolution rather than address translation. DNS is in fact a distributed hierarchical database with three classes of servers: root DNS servers, top-level domain (TLD) servers and authoritative DNS servers. DNS also relies heavily on caching to reduce query delay, much as a TLB does in the memory hierarchy.
A resource record (RR) in a DNS database is a four-tuple with the following fields: Name, Value, Type and TTL. The meanings of Name and Value depend on the Type of the RR: A maps a hostname to its IPv4 address, AAAA maps a hostname to its IPv6 address, NS maps a domain to the hostname of its authoritative DNS server, and CNAME maps an alias hostname to its canonical name.
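As a rough illustration of the four-tuple, the sketch below models RRs in Java and follows CNAME records until it reaches an A record. The class and method names (ResourceRecordDemo, resolveIPv4) and the example.com records are assumptions made up for this example, not part of any DNS library.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative model of DNS resource records; all names here are hypothetical.
final class ResourceRecordDemo {

    enum Type { A, AAAA, NS, CNAME }

    // The four-tuple (Name, Value, Type, TTL).
    static final class ResourceRecord {
        final String name;
        final String value;
        final Type type;
        final int ttlSeconds;

        ResourceRecord(String name, String value, Type type, int ttlSeconds) {
            this.name = name;
            this.value = value;
            this.type = type;
            this.ttlSeconds = ttlSeconds;
        }
    }

    // Follow CNAME records until an A record is found; returns null if none exists.
    static String resolveIPv4(List<ResourceRecord> db, String hostname) {
        String current = hostname;
        for (int hops = 0; hops < 8; hops++) { // small bound guards against CNAME loops
            String next = null;
            for (ResourceRecord rr : db) {
                if (!rr.name.equalsIgnoreCase(current)) continue;
                if (rr.type == Type.A) return rr.value;      // Value is an IPv4 address
                if (rr.type == Type.CNAME) next = rr.value;  // Value is the canonical name
            }
            if (next == null) return null;
            current = next;
        }
        return null;
    }

    public static void main(String[] args) {
        List<ResourceRecord> db = new ArrayList<>();
        db.add(new ResourceRecord("example.com", "ns1.example.com", Type.NS, 86400));
        db.add(new ResourceRecord("www.example.com", "web.example.com", Type.CNAME, 3600));
        db.add(new ResourceRecord("web.example.com", "192.0.2.10", Type.A, 300));
        System.out.println(resolveIPv4(db, "www.example.com")); // prints 192.0.2.10
    }
}
```

The same Name can therefore mean a hostname, an alias or a whole domain, and only the Type field tells a resolver how to read the Value.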
A client that has been configured with its local DNS server can initiate a DNS query to resolve a name. Such a query can be either recursive or iterative. DNS queries and replies share the same format: the first 12 bytes are the header section, followed by the Question, Answer, Authority and Additional sections. On a Linux system, you can run the 'nslookup' command to send queries to a DNS server. There are also Java APIs that resolve a hostname by initiating DNS queries, such as the following statement using java.net.InetAddress:
String IPAddr = InetAddress.getByName(name).getHostAddress();
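To make the message layout more concrete, here is a minimal sketch that assembles the bytes of a query for an A record: the 12-byte header followed by a single question. The class name DnsQuerySketch and the method buildQuery are assumptions for illustration; actually sending the bytes to a server (normally over UDP port 53) and parsing the reply are left out.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper that builds the raw bytes of a DNS query message.
final class DnsQuerySketch {
    static final int HEADER_LENGTH = 12;
    static final int FLAG_RECURSION_DESIRED = 0x0100;

    static byte[] buildQuery(int id, String hostname, boolean recursive) {
        ByteBuffer buf = ByteBuffer.allocate(512); // ByteBuffer is big-endian by default

        // Header section: six 16-bit fields, 12 bytes in total.
        buf.putShort((short) id);                                        // identification
        buf.putShort((short) (recursive ? FLAG_RECURSION_DESIRED : 0));  // flags
        buf.putShort((short) 1);                                         // number of questions
        buf.putShort((short) 0);                                         // number of answer RRs
        buf.putShort((short) 0);                                         // number of authority RRs
        buf.putShort((short) 0);                                         // number of additional RRs

        // Question section: the name as length-prefixed labels, ended by a zero byte.
        for (String label : hostname.split("\\.")) {
            byte[] raw = label.getBytes(StandardCharsets.US_ASCII);
            buf.put((byte) raw.length);
            buf.put(raw);
        }
        buf.put((byte) 0);
        buf.putShort((short) 1); // query type 1 = A
        buf.putShort((short) 1); // query class 1 = IN (Internet)

        return Arrays.copyOf(buf.array(), buf.position());
    }

    public static void main(String[] args) {
        byte[] query = buildQuery(0x1234, "www.example.com", true);
        System.out.println("header bytes: " + HEADER_LENGTH);
        System.out.println("total bytes:  " + query.length);
    }
}
```

Setting the recursion-desired flag asks the local DNS server to resolve the name on the client's behalf; clearing it corresponds to an iterative query.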
3. HyperText Transfer Protocol
A Web page consists of objects, such as an HTML file, a JPEG image, a Java applet or a video clip, each of which is addressable by a single URL (protocol + server + path). Web pages can be either static (e.g. a plain HTML file) or dynamic (e.g. generated by PHP on the server side or by an applet on the client side).
The client side of the Web is a browser, which can interpret only a few built-in MIME types by itself and relies on viewers such as plug-ins or helper applications for the rest. A Web server often uses caching to avoid disk accesses and spreads requests over multiple CPUs (server farms), multiple threads and multiple disks.
HTTP is a stateless protocol that runs over TCP connections (either persistent or non-persistent). A client must first open a TCP connection (create a socket) to the server before it can fetch a Web page. An HTTP request consists of a request line (method, URL and HTTP version), header lines, a blank line and an optional entity body.
As for an HTTP response, its general form is a status line (protocol version, status code and status phrase), header lines, a blank line and the entity body. The status code is a three-digit number that tells whether the request was satisfied, and if not, why not.
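The two message forms can be sketched in a few lines of Java. The example below builds the text of a minimal GET request (request line, header lines, blank line) and extracts the status code from a response status line. The class name HttpMessageSketch, its methods and the host name are assumptions for illustration; no connection is opened.

```java
// Hypothetical helpers showing the textual form of HTTP/1.1 messages.
final class HttpMessageSketch {

    // Request line, two header lines, then the blank line that ends the headers.
    static String buildGetRequest(String host, String path) {
        return "GET " + path + " HTTP/1.1\r\n"
             + "Host: " + host + "\r\n"
             + "Connection: close\r\n"
             + "\r\n";
    }

    // A status line looks like "HTTP/1.1 200 OK": version, status code, status phrase.
    static int parseStatusCode(String statusLine) {
        String[] parts = statusLine.trim().split(" ", 3);
        if (parts.length < 2 || !parts[0].startsWith("HTTP/")) {
            throw new IllegalArgumentException("not an HTTP status line: " + statusLine);
        }
        return Integer.parseInt(parts[1]);
    }

    public static void main(String[] args) {
        System.out.print(buildGetRequest("www.example.com", "/index.html"));
        System.out.println(parseStatusCode("HTTP/1.1 404 Not Found")); // prints 404
    }
}
```

In a real client the request string would be written to the socket's output stream, and the status line would be the first line read back from its input stream.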
Although HTTP is stateless, a server can place a cookie on the client side to keep track of user state and take cookie-specific actions. Cookie technology has four components: (1) a cookie header line in the HTTP response message, (2) a cookie header line in the HTTP request message, (3) a cookie file kept on the user's host and managed by the browser, and (4) a back-end database at the Web site.
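The Web can use caching to decrease page load time (PLT): a proxy server is installed that acts as the server for nearby clients and as a client of the remote origin server. To keep its cache fresh, the proxy sends a conditional GET (a GET with an If-Modified-Since header), which asks the origin server to send the object only if it has been modified since the cached copy was stored; otherwise the server replies 304 Not Modified and the proxy serves its cached copy.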
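The conditional GET logic can be simulated without any networking. In the sketch below, the Origin and Proxy classes are hypothetical stand-ins for a Web server and a proxy cache: the origin returns the object only if it changed after the If-Modified-Since time, and a null return stands for a 304 Not Modified reply.

```java
import java.time.Instant;

// In-memory simulation of a proxy cache using conditional GET; all names are hypothetical.
final class ConditionalGetSketch {

    static final class Origin {
        String body;
        Instant lastModified;

        Origin(String body, Instant lastModified) {
            this.body = body;
            this.lastModified = lastModified;
        }

        // Returns the object, or null to represent "304 Not Modified".
        String get(Instant ifModifiedSince) {
            if (ifModifiedSince != null && !lastModified.isAfter(ifModifiedSince)) {
                return null;
            }
            return body;
        }
    }

    static final class Proxy {
        private final Origin origin;
        private String cachedBody;
        private Instant cachedLastModified; // null until something is cached
        int originTransfers = 0;            // how often the full object was transferred

        Proxy(Origin origin) {
            this.origin = origin;
        }

        String fetch() {
            String fresh = origin.get(cachedLastModified); // the conditional GET
            if (fresh != null) {                           // 200: replace the cached copy
                cachedBody = fresh;
                cachedLastModified = origin.lastModified;
                originTransfers++;
            }
            return cachedBody;                             // 304: cached copy is still valid
        }
    }

    public static void main(String[] args) {
        Instant t0 = Instant.parse("2024-01-01T00:00:00Z");
        Origin origin = new Origin("version 1", t0);
        Proxy proxy = new Proxy(origin);

        System.out.println(proxy.fetch() + " / transfers: " + proxy.originTransfers);
        System.out.println(proxy.fetch() + " / transfers: " + proxy.originTransfers);

        origin.body = "version 2";
        origin.lastModified = t0.plusSeconds(60);
        System.out.println(proxy.fetch() + " / transfers: " + proxy.originTransfers);
    }
}
```

The second fetch costs only a small request and reply, since the object itself is not sent again; that saving is what reduces page load time.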
4. File Transfer Protocol
FTP uses two parallel TCP connections: the control connection is persistent and uses server port 21, while a data connection is non-persistent and is opened per transfer, from server port 20 in active mode or to a server-chosen port in passive mode. An FTP server maintains client state throughout the session, so it always knows, for example, what "the current directory" is.
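In passive mode, the address and port for the data connection are carried in the server's 227 reply to the PASV command, as six comma-separated numbers: four for the IP address and two for the port. The sketch below decodes such a reply; the class name FtpPassiveReply and the sample reply text are assumptions for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical parser for the "227 Entering Passive Mode (h1,h2,h3,h4,p1,p2)" reply.
final class FtpPassiveReply {
    private static final Pattern PASV =
        Pattern.compile("\\((\\d+),(\\d+),(\\d+),(\\d+),(\\d+),(\\d+)\\)");

    private static Matcher match(String reply) {
        Matcher m = PASV.matcher(reply);
        if (!reply.startsWith("227") || !m.find()) {
            throw new IllegalArgumentException("not a PASV reply: " + reply);
        }
        return m;
    }

    // The first four numbers are the bytes of the IPv4 address.
    static String host(String reply) {
        Matcher m = match(reply);
        return m.group(1) + "." + m.group(2) + "." + m.group(3) + "." + m.group(4);
    }

    // The last two numbers are the high and low bytes of the port.
    static int port(String reply) {
        Matcher m = match(reply);
        return Integer.parseInt(m.group(5)) * 256 + Integer.parseInt(m.group(6));
    }

    public static void main(String[] args) {
        String reply = "227 Entering Passive Mode (192,0,2,10,195,80)";
        System.out.println(host(reply) + ":" + port(reply)); // prints 192.0.2.10:50000
    }
}
```

The client then opens the data connection to that host and port, which is why passive mode works for clients that cannot accept incoming connections.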
For FTP commands and return codes, please see my demo program.
5. Electronic Mail
An email system has four components: (1) user agents, (2) mail servers (message transfer agents), (3) a transfer protocol, and (4) access protocols.
The message format used with the Simple Mail Transfer Protocol (SMTP), defined in RFC 822, stipulates that a message consists of three parts: header lines (To, From, Subject, etc.), a blank line, and the message body, which can contain only ASCII characters. Multipurpose Internet Mail Extensions (MIME) adds support for text in other languages, audio and images by adding structure to the message body and defining encoding rules for non-ASCII content (RFC 1341).
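As a small sketch of how MIME fits non-ASCII text into an ASCII-only message, the example below builds a message with header lines, a blank line and a Base64-encoded UTF-8 body. The class name MailMessageSketch and the addresses are assumptions for illustration, and the subject is assumed to be plain ASCII.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical builder for a simple MIME text message.
final class MailMessageSketch {

    static String buildMessage(String from, String to, String asciiSubject, String body) {
        // Base64 turns arbitrary bytes into ASCII characters; the MIME encoder
        // also wraps long output into lines of at most 76 characters.
        String encoded = Base64.getMimeEncoder()
                               .encodeToString(body.getBytes(StandardCharsets.UTF_8));
        return "From: " + from + "\r\n"
             + "To: " + to + "\r\n"
             + "Subject: " + asciiSubject + "\r\n"
             + "MIME-Version: 1.0\r\n"
             + "Content-Type: text/plain; charset=UTF-8\r\n"
             + "Content-Transfer-Encoding: base64\r\n"
             + "\r\n"                       // blank line separates headers from body
             + encoded + "\r\n";
    }

    public static void main(String[] args) {
        // "\u4f60\u597d" is a two-character Chinese greeting, written with escapes
        // so that this source file itself stays ASCII.
        System.out.print(buildMessage("alice@example.com", "bob@example.org",
                                      "Greeting", "\u4f60\u597d"));
    }
}
```

The MIME header lines tell the receiving user agent how to decode the body and which character set to display it in.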
The access protocol can be either the Post Office Protocol (POP), which is stateless across sessions, or the Internet Message Access Protocol (IMAP), which keeps user state across sessions.
References:
1. Kurose, James F., and Keith W. Ross. Computer Networking: A Top-Down Approach. Beijing: Higher Education Press, 2009.