HTTP (Hypertext Transfer Protocol)
The Hypertext Transfer Protocol (HTTP) is a protocol used mainly to access data on the World Wide Web. The protocol transfers data in the form of plain text, hypertext, audio, video, and so on. However, it is called the hypertext transfer protocol because its efficiency allows its use in a hypertext environment, where there are rapid jumps from one document to another.
HTTP functions like a combination of FTP and SMTP. It is similar to FTP because it transfers files and uses the services of TCP. However, it is much simpler than FTP because it uses only one TCP connection. There is no separate control connection; only data are transferred between the client and the server.
HTTP is like SMTP because the data transferred between the client and server look like SMTP messages. In addition, the format of the messages is controlled by MIME-like headers.
However, HTTP differs from SMTP in the way the messages are sent from the client to the server and from the server to the client. Unlike SMTP, the HTTP messages are not destined to be read by humans; they are read and interpreted by the HTTP server and HTTP client (browser). SMTP messages are stored and forwarded, but HTTP messages are delivered immediately.
The idea of HTTP is very simple. A client sends a request, which looks like mail, to the server. The server sends the response, which looks like a mail reply, to the client. The request and response messages carry data in the form of a letter with MIME-like format.
The commands from the client to the server are embedded in a letter-like request message. The contents of the requested file or other information are embedded in a letter-like response message.
The figure illustrates the HTTP transaction between the client and server. The client initiates the transaction by sending a request message. The server replies by sending a response.
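The transaction described above can be tried out on a single machine. The following is a minimal sketch using Python's standard `http.server` and `http.client` modules. The handler class `HelloHandler`, the helper `run_transaction`, the path `/index.html`, and the HTML payload are illustrative assumptions, not part of the original text.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Hypothetical server side: answers every GET with a tiny HTML page."""

    def do_GET(self):
        payload = b"<html><body>Hello</body></html>"
        self.send_response(200)                      # status line
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()                           # blank line after headers
        self.wfile.write(payload)                    # body

    def log_message(self, format, *args):
        pass  # keep the example quiet


def run_transaction():
    """Start a local server, perform one request/response exchange, stop it."""
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)  # port 0: pick a free port
    port = server.server_address[1]
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        # Client side: the client starts the transaction with a request.
        conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
        conn.request("GET", "/index.html")
        resp = conn.getresponse()                    # the server's response
        status, body = resp.status, resp.read()
        conn.close()
    finally:
        server.shutdown()
        server.server_close()
    return status, body


status, body = run_transaction()
print(status, body)
```

The server runs in a background thread only so that both sides fit in one script; in practice the client and server are usually separate programs on separate hosts.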
There are two general types of HTTP messages, shown in the figure: request and response. Both message types follow almost the same format.
A request message consists of a request line, headers, and sometimes a body.
A response message consists of a status line, headers, and sometimes a body.
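To make the two message layouts concrete, here is a small sketch that assembles a request message and splits a response message into its parts. The helper names `build_request` and `parse_response`, the host `www.example.com`, and the sample response bytes are assumptions made for illustration. The sketch also assumes HTTP/1.1-style text messages, in which lines end with CRLF and a blank line separates the headers from the body.

```python
def build_request(method, path, headers, body=b""):
    """Assemble a request: request line, headers, blank line, optional body."""
    lines = [f"{method} {path} HTTP/1.1"]            # request line
    lines += [f"{name}: {value}" for name, value in headers.items()]
    head = "\r\n".join(lines) + "\r\n\r\n"           # blank line ends the headers
    return head.encode("ascii") + body


def parse_response(raw):
    """Split a response into status line fields, headers, and body."""
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    version, code, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return version, int(code), reason, headers, body


request = build_request("GET", "/index.html", {"Host": "www.example.com"})

# A hand-written sample response, not captured from a real server.
response = (b"HTTP/1.1 200 OK\r\n"
            b"Content-Type: text/html\r\n"
            b"Content-Length: 5\r\n"
            b"\r\n"
            b"hello")
version, code, reason, headers, body = parse_response(response)
print(request)
print(version, code, reason, headers, body)
```

The two functions mirror each other: both message types share the "first line, headers, blank line, body" shape, and only the first line differs.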
Uniform Resource Locator (URL)
A client that wants to access a document needs an address. To facilitate access to documents distributed throughout the world, HTTP uses the concept of locators. The uniform resource locator (URL) is a standard for specifying any kind of information on the Internet.
The URL defines four things: method, host computer, port, and path.
The method is the protocol used to retrieve the document, for example, HTTP. The host is the computer where the information is located, although the name can be an alias.
Web pages are usually stored on computers, and computers are given alias names that usually begin with the characters “www”. This is not mandatory, however, as the host can be any name given to the computer that hosts the web page.
The URL can optionally contain the port number of the server. If the port is included, it is inserted between the host and the path, and it is separated from the host by a colon.
The path is the pathname of the file where the information is located. Note that the path can itself contain slashes that, in the UNIX operating system, separate the directories from subdirectories and files.
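The four parts of a URL can be pulled apart with Python's standard `urllib.parse.urlsplit`, as in the sketch below. The URLs are made-up examples. Note that the standard library calls the method the "scheme".

```python
from urllib.parse import urlsplit

# Hypothetical URL with an explicit port between the host and the path.
parts = urlsplit("http://www.example.com:8080/docs/networking/http.html")
method = parts.scheme    # protocol used to retrieve the document
host = parts.hostname    # computer where the information is located
port = parts.port        # optional port number, given after the colon
path = parts.path        # pathname of the file, with slashes between directories

# When the port is omitted, urlsplit reports it as None.
no_port = urlsplit("http://www.example.com/index.html").port

print(method, host, port, path, no_port)
```

A missing port is reported as `None` rather than a default number, so a client that needs one has to supply the usual default for the method itself (80 for HTTP).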