URL is the key user perspective on a web application
S V Ramu (2003-02-16)

Prelude
The Internet is the culmination of years of passion and commitment to collaboration, in spite of differences. The three broad conceptual pillars of this Internet revolution are HTTP, HTML and the URL. HTTP, for the common Internet user, is well behind the scenes; you can in fact build and run a site without knowing even a single command of HTTP (Hyper Text Transfer Protocol - RFC 1945, RFC 2068). HTTP's job is just to make two different computers talk to each other. Simply put, a URL (Uniform Resource Locator - RFC 1738) is a standard naming convention for addressing a website and its pages. And HTML is just a convenient page presentation format. In itself each of these concepts/standards is fairly simple, but together they have created this amazing world of the Internet, of which we are all the fortunate beneficiaries. By the way, the RFC (Request For Comments) scheme is itself a remarkable example of decentralized interests yielding very coherent standards. Every minute detail of Internet usage is documented and standardized.

HTTP and Distributed Computing
...The HTTP protocol is based on a request/response paradigm. A client establishes
a connection with a server and sends a request to the server in the form of a
request method, URI, and protocol version, followed by a MIME-like message
containing request modifiers, client information, and possible body content.
The server responds with a status line, including the message's protocol
version and a success or error code, followed by a MIME-like message containing
server information, entity meta information, and possible body content...
RFC 1945

From even the early stages of computing, application designers have dreamed of distributed programming. Though much has been achieved, it remains a dream in many respects. Though websites are not, in general, examples of distributed computing, the upcoming Web Services model and its related efforts show that the humble old website is the forerunner of the current happenings. The basic underpinning of this web model is HTTP, which is still relevant in the modern web services world. HTTP is a typical wire protocol, which means it standardizes just the pure text messages that are exchanged between the client and the server. The power of this model is that the protocol does not depend on the CPU, the operating system, or anything else about the computers at the two ends of the system. As long as the applications at both ends create these HTTP messages and know how to handle them, things will work. The beauty is that it does not depend on the lower network protocol either.
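To make the wire protocol concrete, here is a minimal sketch in Java that sends a raw HTTP/1.0 GET request over a plain TCP socket and prints whatever text the server sends back (the host name www.example.com is only an illustrative placeholder):

    import java.io.*;
    import java.net.Socket;

    public class RawHttpGet {
        public static void main(String[] args) throws IOException {
            // Any HTTP server will do; www.example.com is only a placeholder.
            Socket socket = new Socket("www.example.com", 80);
            Writer out = new OutputStreamWriter(socket.getOutputStream(), "US-ASCII");

            // The request is nothing but plain text, exactly as RFC 1945 describes.
            out.write("GET / HTTP/1.0\r\n");
            out.write("Host: www.example.com\r\n");
            out.write("\r\n");                      // a blank line ends the request headers
            out.flush();

            // The response too is plain text: a status line, headers, then the body.
            BufferedReader in = new BufferedReader(
                    new InputStreamReader(socket.getInputStream(), "US-ASCII"));
            String line;
            while ((line = in.readLine()) != null) {
                System.out.println(line);
            }
            socket.close();
        }
    }

Nothing here knows or cares what hardware or operating system sits at the other end; only the text format matters.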
...On the Internet, HTTP communication generally takes place over TCP/IP connections.
The default port is TCP 80, but other ports can be used. This does not
preclude HTTP from being implemented on top of any other protocol on the Internet,
or on other networks. HTTP only presumes a reliable transport; any protocol that
provides such guarantees can be used, and the mapping of the HTTP/1.0 request
and response structures onto the transport data units of the protocol in
question is outside the scope of this specification...
RFC 1945

Whenever you browse to a page, remember that the machine serving the page to your cool Windows XP machine could be an old machine powered by some non-Intel CPU (say a Sun SPARC), perhaps running Linux or Solaris, with the pages generated by a handful of Perl scripts, or ASP if you like. The combination can be anything. As long as there is a web server on one side and a browser on the other, the communication can proceed happily. This power of the Internet to abstract away and bridge any two machines is what has inspired modern technologists to use this very same protocol, HTTP, for their new offerings in the form of web services.

HTML and XML - The genesis of universal Markup Languages
HTML is a meta language. For a long time ASCII was the only predominant standard that was widely accepted and unquestioningly used. Two machines that can send and receive bit streams are no good at communication unless something is agreed regarding the bit sequence format; ASCII was this format. But human speech is highly expressive. Plain ASCII conveys just the content of what needs to be told. The nuances and gestures that we use while we speak, to emphasize or elide something, need more expressiveness than ASCII can provide. Showing a specific text in bold or italics is the usual way of coding these nuances into plain ASCII text, e.g. <b>this</b> to emphasize a word. HTML is just one such simple tagging meta language to encode such nuances of speech into plain ASCII, using ASCII only.
Is HTML's tagging format the most optimal one? Mostly yes. One superfluousness that I find is regarding the closing tag. If, say, the font tag starts with <font>, the closing tag must spell out the same name again as </font>; since tags nest strictly, a bare closing marker like </> would in principle have sufficed.
If you had noted, the whole HTML meta encoding, and hence the XML encoding, uses just 5 meta characters, namely <, >, &, " and ' (written in content as &lt;, &gt;, &amp;, &quot; and &apos;). Everything else is plain text.
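As a small illustration of my own (not from the original text), escaping these five characters is all it takes to embed arbitrary text safely inside a tagged document:

    public class XmlEscape {
        // Replace the five meta characters with their predefined entities.
        public static String escape(String text) {
            StringBuffer sb = new StringBuffer();
            for (int i = 0; i < text.length(); i++) {
                char c = text.charAt(i);
                switch (c) {
                    case '<':  sb.append("&lt;");   break;
                    case '>':  sb.append("&gt;");   break;
                    case '&':  sb.append("&amp;");  break;
                    case '"':  sb.append("&quot;"); break;
                    case '\'': sb.append("&apos;"); break;
                    default:   sb.append(c);
                }
            }
            return sb.toString();
        }

        public static void main(String[] args) {
            System.out.println(escape("if a < b & b > c, say \"done\""));
            // prints: if a &lt; b &amp; b &gt; c, say &quot;done&quot;
        }
    }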
If you have seen efforts like NanoXML or TinyXML, which want to simplify XML further, you notice that they envision their new format mainly as a way to encode data structures. The main target of these simplification efforts is the 'attributes'. But the fame and usefulness of XML-like tagging came to the fore only in HTML, which is not a data structure per se, but a document. When you want to say that a given word should be 'red' in color, it is messy to think that we would need a separate nested tag for the color, when a simple attribute, as in <font color="red">, says it concisely.

URL - The old fame and new meaning
In the early days, or in a simple case even now, when a website is just a
collection of HTML files, standardizing a convention to call these files
directly through the URL was important, as the whole process can be automated
with a compliant web server, which will appropriately pick the files from the local
directories and serve them across the net. For example, if you request a URL like http://www.example.com/docs/page.html (an illustrative address), the server simply locates docs/page.html under its document root and sends it back.
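A minimal sketch of that mapping (my own illustration, with an assumed document root of /var/www) shows how mechanical the job is:

    import java.io.File;

    public class StaticMapper {
        private final File documentRoot;

        public StaticMapper(File documentRoot) {
            this.documentRoot = documentRoot;
        }

        // Translate the path part of a URL, e.g. "/docs/page.html",
        // into a file under the document root.
        public File resolve(String urlPath) {
            if (urlPath.indexOf("..") >= 0) {        // refuse directory escapes
                throw new IllegalArgumentException("invalid path: " + urlPath);
            }
            return new File(documentRoot, urlPath);
        }

        public static void main(String[] args) {
            StaticMapper mapper = new StaticMapper(new File("/var/www"));
            System.out.println(mapper.resolve("/docs/page.html"));
            // prints: /var/www/docs/page.html
        }
    }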
In many modern websites, even a call to a seemingly static page may in fact be redirected to a program that generates it on the fly. This redirecting of URLs has been practiced for a long time now. But thinking of our web application itself as a layer, separate from the user's URL model, is becoming more and more important, especially for web applications that can be packaged onto a CD and given to different clients. While developing a web application, we do need to fill the response pages with links and real URLs. But at that time we do not know what our server customers would like as the URL model. For all we know, they might want to use our application in tandem with their existing dynamic pages. If so, having configurable URLs would be an elegant extensibility factor for our web application. While HTTP has abstracted the client and the server completely from each other, what remains as the client's interface to the server is only the URL. The URL thus assumes an almost API-like importance: it is our client's only interface to our website or service. For many ISPs it might be mandatory to abstract this URL model completely from both the client and the application. The URL given to the client might be strategically important, and might need to be fairly unchanging, irrespective of the web application's status or upgrades. Maybe we can start to think about a website as a collection of URLs that need to be serviced in a certain way, and then work towards satisfying those URL requests. Maybe this way, the testing of a site might become the testing of a series of URLs.
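One simple way to keep the URL model configurable, sketched below on my own assumptions (a url-mapping.properties file whose keys are internal logical names chosen by us, and whose values are whatever the deploying customer wants), is to generate every link in a response page through a lookup rather than hard-coding it:

    import java.io.FileInputStream;
    import java.io.IOException;
    import java.util.Properties;

    public class UrlModel {
        private final Properties mapping = new Properties();

        // The properties file belongs to the deployer, not to us; e.g.
        //   search=/cgi/find.pl
        //   report.monthly=/reports/monthly.html
        public UrlModel(String mappingFile) throws IOException {
            mapping.load(new FileInputStream(mappingFile));
        }

        // Pages ask for a logical name; the deployer decides the real URL.
        public String href(String logicalName) {
            String url = mapping.getProperty(logicalName);
            if (url == null) {
                throw new IllegalArgumentException("unmapped link: " + logicalName);
            }
            return url;
        }

        public static void main(String[] args) throws IOException {
            UrlModel model = new UrlModel("url-mapping.properties");
            System.out.println("<a href=\"" + model.href("search") + "\">Search</a>");
        }
    }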
Implementing the URL abstraction

Appreciate the similarities between a URL with a query string, say http://www.site.com/search?query=xml&max=10, and a function call, say search(query, max): both name a piece of processing and pass it named arguments. The key thing that differentiates the two is the easy possibility of distributedness in the case of the URL. Thus the URL by itself is a reasonably endowed RPC (Remote Procedure Call) format. So, to accomplish this complete separation of URLs from our application, we must allow the URL coming to the web server to be first transformed to the appropriate service processor. And the response that our application generates should be built from internal logical URLs, which are then transformed into client-facing URLs by the same mapping module that in the first place directed the client request to the web service. Practically, I'm considering Apache-like URL rewriting capabilities, which can be dynamically configured with regex-like tools. Once the incoming URLs are transformed suitably with the powerful server-based regex capabilities, the request comes to my universal front controlling servlet/jsp, which then dispatches it to the appropriate request-processing Java interfaces, as sketched below.
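Here is a minimal sketch of such a front controller, on my own assumptions (a hypothetical RequestProcessor interface and a hard-wired handler map; a real deployment would read the map from configuration, and the external-to-internal rewriting would already have happened in, say, Apache's mod_rewrite with a rule such as RewriteRule ^/products/(.*)$ /app/search?query=$1 [PT]):

    import java.io.IOException;
    import java.util.HashMap;
    import java.util.Map;
    import javax.servlet.ServletException;
    import javax.servlet.http.*;

    // Every piece of application logic implements this one interface.
    interface RequestProcessor {
        void process(HttpServletRequest request, HttpServletResponse response)
                throws ServletException, IOException;
    }

    // The single entry point: all rewritten URLs land here.
    public class FrontController extends HttpServlet {
        private final Map handlers = new HashMap();

        public void init() {
            // Logical paths only; the outside world never sees these directly.
            handlers.put("/search", new RequestProcessor() {
                public void process(HttpServletRequest req, HttpServletResponse res)
                        throws IOException {
                    res.setContentType("text/html");
                    res.getWriter().println("<p>results for " +
                            req.getParameter("query") + "</p>");
                }
            });
        }

        protected void doGet(HttpServletRequest req, HttpServletResponse res)
                throws ServletException, IOException {
            RequestProcessor p = (RequestProcessor) handlers.get(req.getPathInfo());
            if (p == null) {
                res.sendError(HttpServletResponse.SC_NOT_FOUND);
            } else {
                p.process(req, res);
            }
        }
    }

The application sees only its own logical paths; what the customer exposes to the world is decided entirely in the rewriting layer.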
Epilogue

I must say that these thoughts started from my first attempt at creating a redistributable web application. For all I know, these ideas might already be practiced by the veterans in this field. All the same, I have tried to explain the details of the route I took. Moreover, I wanted to share this exciting realization that the web model is trying to offer all the capabilities of an API (Application Programming Interface) through the URL.