5 : Browsers

JafSoft's Introduction to the Internet


5 Browsers

5.1 Overview

Browsers were developed to use the HyperText Transfer Protocol (HTTP). A browser views a page written in HTML. HTML is a language that describes in abstract terms how a page should be laid out. It also allows hyperlinks to be defined. It is this feature that allows browsers to go from page to page, and essentially defines the web.

The standards body for the HTML language is the W3 consortium. Their web site (http://www.w3.org/) remains one of the major definitive sites for information on HTML.

The first browser was developed at CERN. The first to become widely popular was Mosaic, developed at NCSA and given away free. Since then more commercial browsers have been developed (some still free) whilst Mosaic itself has fallen behind and is no longer under development.

The exponential growth of the Internet has allowed companies like Netscape to come from nowhere to having a turnover measured in hundreds of millions of dollars.

At present the main browsers in use are

Netscape. Originally given away free, but now a commercial product. Netscape's browser gained in popularity because it added a lot of new features to HTML. In doing so it made HTML non-standard though a lot of the Netscape extensions have since been accepted into standard HTML. One of the major Netscape extensions was JavaScript.
Internet Explorer. Microsoft's (rather late) response to Netscape. Not yet as popular, but being free and ubiquitous may fix that in a few years. As with Netscape, IE has also extended HTML in its own way.
HotJava. Not that popular at the moment, but supplied by the people (Sun) who supply the Java programming language and environment that is likely to play a large part in making web pages more interactive.
Mosaic. Still in use, but showing its age, and no longer being developed.
Lynx. This is a text-based browser that is remarkably popular despite (or because of) this fact. Lynx can be a very fast and effective way of browsing the net, if only because it dispenses with all the time-consuming graphics.
Opera. New kid on the block. Given good reviews primarily for its reputation for being fast (especially since the big 2 went to version 4 and became unwieldy). Probably not as "fully featured" as some others.

5.2 What all browsers do

Actually, people will be amazed at how little all browsers do. The basic browser displays text, shows text hyperlinks, and allows those links to be selected.

You only need to bear this in mind if you're putting your own pages on the web.

5.3 What all browsers don't do

Over time more and more features have been added to browsers. These changes have arisen through

Official and unofficial changes to the HTML standard. Later versions of a browser usually support newer features of HTML. Changes in HTML over the years have included
- Use of TABLES
- Use of Frames
- Use of Cascading Style Sheets (CSS).
Introduction of JavaScript and other scripting languages
Introduction of Java, ActiveX and other interactive software
Introduction of plug-ins and add-ons. Plug-ins are commonly required to handle special file types (e.g. audio and video files)

Whether or not your browser can do any or all of these things depends on

Which browser you are using. For example there is great rivalry between Netscape and Microsoft, with the former favoring Java and JavaScript and the latter favoring ActiveX and trying to introduce Visual Basic Script.

This is one reason that you sometimes see a "best viewed in XXXX" logo

Which version of the browser you are using. Older versions do less.

This is another reason that you sometimes see a "best viewed in XXXX" logo. In these cases it usually states a version number of both main browsers and may offer an alternative version (e.g. a non-frames version)

Whether or not you have the requisite features "switched on". For example use of Java and JavaScript can be switched off in the options of those browsers that support these features. This is because of possible security worries related to these features.

This is why you may see a "use a java-enabled browser" type message on a page.

Whether or not you have the requisite add-ons and plug-ins.

Normally a page that requires such an extension will point you to where it can be got from.

Incidentally, one of the major reasons you'll see a "best viewed in XXXX" logo is that XXXX will have offered free software to the web page author.

5.4 Understanding Web addresses

Web addresses are a special type of URL. They take the form

http:// [Internet node] / [resource name] ? [extra data]

The "Internet node" can either be an IP address or a Domain name, unless you are browsing your organisation's Intranet, in which case it will be some local machine name.
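To make the three parts of a web address concrete, here is a minimal sketch using Python's standard urllib.parse module. The function name split_web_address and the host www.example.com are made up for illustration and are not part of this guide.

```python
from urllib.parse import urlsplit

def split_web_address(url):
    """Return (Internet node, resource name, extra data) for a web address."""
    parts = urlsplit(url)
    return parts.netloc, parts.path, parts.query

# www.example.com is a placeholder host chosen for this sketch.
node, resource, extra = split_web_address(
    "http://www.example.com/pub/download/file.txt?mode=text"
)
print(node)      # www.example.com
print(resource)  # /pub/download/file.txt
print(extra)     # mode=text
```

The same function works when the node is an IP address rather than a domain name.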

The "resource name" will normally look like a Unix file or directory name. These look like Windows 95 filenames, with the slash the other way round. For example:

/index.html
/pub/
/pub/download/file.txt

Directory names should end in a "/". If they don't, the remote server will normally have to add this for you, incurring an extra delay.

Resource names are often case sensitive (depending on the host machine), so you should usually match the case of the URL as you're given it.

Sometimes you'll see a tilde (~) at the start of the resource name. This often points to files belonging to a user of the machine you're visiting, e.g.

/~jaf/jafs_file.html

Knowing that the resource name often corresponds to real files and directories on the target machine can sometimes be useful, as it allows you to work out which directory the file is in, and to attempt to look at that directory, or the one above it. In the above case if you want to see what other files user Jaf has, you could try

/~jaf/

However, if the user doesn't want you to see the directory contents, they can create a file (usually called index.html) which the server will search for first. If such a file exists then this is what you'll be shown.
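The "look at the directory, or the one above it" trick can be sketched with urljoin from Python's standard library, which resolves "." and ".." against a URL in the same way a browser resolves relative links. The helper names and the host www.example.com are assumptions made for this example.

```python
from urllib.parse import urljoin

def containing_directory(url):
    """URL of the directory that holds the given resource."""
    return urljoin(url, ".")

def parent_directory(url):
    """URL of the directory one level above the resource's directory."""
    return urljoin(url, "..")

# Placeholder host; the path is the example used in the text.
page = "http://www.example.com/~jaf/jafs_file.html"
print(containing_directory(page))  # http://www.example.com/~jaf/
print(parent_directory(page))      # http://www.example.com/
```

Whether the server then shows you a directory listing or an index.html file is, as described above, up to the owner of the files.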

"Extra data" is usually only required when passing information to software at the other end such as a search engine. The format will depend on the resource being accessed. You almost never type this part in manually, rather it is added automatically by your browser in response to data you have typed in.
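As a rough illustration of what the browser does on your behalf, the sketch below builds the "extra data" from some form fields and then decodes it again. The search address and the field names q and hits are invented for the example; a real search engine defines its own.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

def build_search_url(base, **fields):
    """Append form fields to a base URL as encoded extra data."""
    return base + "?" + urlencode(fields)

# Hypothetical search engine address and field names.
url = build_search_url("http://search.example.com/find", q="text browsers", hits=10)
print(url)  # http://search.example.com/find?q=text+browsers&hits=10

# The software at the other end decodes the same data back into fields.
print(parse_qs(urlsplit(url).query))  # {'q': ['text browsers'], 'hits': ['10']}
```

Note how the space in the typed text is encoded (here as "+"), which is one reason this part of a URL is rarely typed by hand.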

A fuller description of URLs can be found in RFC 1738, e.g. at http://www.cis.ohio-state.edu/htbin/rfc/rfc1738.html

5.5 Using web browsers

5.5.1 Surfing

Browsers are easy to use... that's their attraction. Normally you simply enter the URL of the page you want to visit, and away you go. There are several ways of going to new pages :-

Type in another URL
Click on a text or picture hyperlink
Select a URL that you've added to your bookmark list.

The hardest part is deciding what URLs you want to visit in the first place. For this you'll need to use search engines, and to bookmark useful starting points. Increasingly people are advertising URLs in non-Internet locations such as newspapers, magazines and on television.

Another possibility is to find a site that regularly compiles lists of interesting sites to visit. Taking this approach one stage further leads to a site such as Web soup that compiles lists of lists of people's recommendations. If you really want to surf at random, start here.

5.5.2 Search engines

Search engines are an invaluable aid in locating pages that are of interest.

The Internet is so large that locating good quality information is both possible and hard work. Search engines make this task much easier.

The basic idea is that the search engine will have visited and documented a large number of web pages whose details it will store in a database.

You simply visit the search engine, enter a request, and all the URLs that match your request are shown, often ranked in some order of suitability.
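The "visit, document, then look up" idea can be sketched as a tiny word index. This is only a toy under simplifying assumptions (whole-word matching, ranking by the number of query words found); it does not describe how any real search engine works, and the page addresses are placeholders.

```python
from collections import defaultdict

def build_index(pages):
    """Map each word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index

def search(index, query):
    """Return matching URLs, those matching the most query words first."""
    scores = defaultdict(int)
    for word in query.lower().split():
        for url in index.get(word, ()):
            scores[url] += 1
    return sorted(scores, key=lambda url: (-scores[url], url))

# Placeholder pages standing in for the documents a search engine has visited.
pages = {
    "http://a.example/lynx.html": "Lynx is a fast text browser",
    "http://a.example/opera.html": "Opera is a fast graphical browser",
    "http://a.example/cookies.html": "Cookies store data",
}
index = build_index(pages)
print(search(index, "text browser"))
# ['http://a.example/lynx.html', 'http://a.example/opera.html']
```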

There are universal search engines such as AltaVista and Dejanews that allow you to search the entire Internet or Usenet, and subject-specific search engines.

So popular (and necessary) have search engines become, that many sites offer search engines for just the web pages on their site. Search engine technology is forever improving, to the extent that Digital now license their AltaVista technology to other companies, have developed the LiveTopics feature of AltaVista to help you search for data more intelligently, and even offer a version for use on PCs to search all your own documents.

A site dedicated to monitoring search engine development can be found at

http://www.searchenginewatch.com/reports/index.html

This is more dedicated to discussing and monitoring the performance of various search engines. At the same site is a page dedicated to listing specialist search engines

http://www.searchenginewatch.com/facts/specialty.html

5.5.3 Sending email

Most modern browsers allow you to send email. Usually this is invoked whenever you click on a "mailto" hyperlink, or whenever you select a mail option from a menu. In essence this is no different to sending email normally, but you should be aware of the following:

Often the mail software inside your browser is completely independent of any other mail software you have. Because of this it will need to be configured independently to make sure that it behaves the same as normal mail. In particular you won't get a copy of any mail you send unless you set it up correctly.
If you share a PC with other users, it may be inconvenient to keep changing the return address etc.
You usually have a "mail document" option. This usually sends a copy of the HTML file you're looking at. If you merely wanted to send the URL this can be a bit wasteful, so check for attachments before you send.

5.5.4 Downloading files

It's quite common now to download files using browsers. In many ways this has replaced the older FTP software which required you to supply a username and password in order to access files, although there are still many resources that are only accessible this way.

Downloading a file is usually done through the FTP protocol by clicking on a URL that starts ftp:, or by selecting a http: link with a known download filetype such as .zip.

When file download is selected, you will be prompted for a location on your computer to save it to. The file will then download. In older browsers you can't continue browsing whilst the download is occurring; in newer browsers you can. In most cases a status bar may give some indication as to how long the transfer still has to go.

When downloading at peak times, or from busy sites, this process can become quite slow, so don't be surprised if the time remaining increases occasionally or states "2 seconds" for over 10 minutes.
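Why the estimate jumps about is easier to see with the arithmetic written out. The sketch below assumes the browser simply divides the bytes still to come by the current transfer rate; real browsers may well smooth or calculate the figure differently.

```python
def seconds_remaining(total_bytes, received_bytes, bytes_per_second):
    """Estimate the time left, or None if nothing is currently arriving."""
    if bytes_per_second <= 0:
        return None
    return (total_bytes - received_bytes) / bytes_per_second

# A 1,000,000 byte file with 400,000 bytes already received.
print(seconds_remaining(1_000_000, 400_000, 3000))  # 200.0
# The line gets busier and the rate drops, so the estimate goes up.
print(seconds_remaining(1_000_000, 400_000, 1000))  # 600.0
```

If the rate falls to nothing the estimate cannot be updated at all, which is one plausible reason a status bar can appear stuck on the same figure.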

What happens once the file download is complete depends on what you've downloaded, and how your machine and browser are set up.

In some cases nothing happens, and it's up to you to make use of the file in whatever way suits.

In other cases a helper application is launched to "play" the newly downloaded file, be it a piece of music, some video or just a special document type.

5.5.5 Bookmarking popular sites

All browsers allow you to bookmark sites that you want to go back to time and time again. However, depending on the browser used this list could be called the Hotlist, Favourites or Bookmarks.

In most cases you can group these URLs together into folders forming a hierarchy of links like files and directories on your hard disk.

5.5.6 Taking a local copy

Most browsers will allow you to view HTML files stored on your own hard disk. This being so, they also offer the ability to save the page currently on display to your hard disk for later viewing, e.g. once you are no longer connected to the Internet.

This is usually an option on the File menu, and is sometimes an option on a pop-up menu if you right click on the main body of the page.

You can usually copy images by right clicking on them and selecting save.

Taking a local copy can give you faster access and off-line access to a page, but there are a number of issues to be aware of

The saved page has all its links, but not the pages referenced by those links. Consequently, if viewed off-line you may find all the images missing, and all the hyperlinks to pages other than this one won't work.
By viewing the copy, rather than the original, you lose the ability to view any changes or other information the original author would want you to see.
If you reuse or redistribute the page in any way, you may be breaching any copyright the original author has on the page.

Generally it's fine to take a copy for personal use and convenience.
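To see in advance which parts of a saved page will be missing off-line, you can list what it refers to. The sketch below uses Python's standard html.parser module; the class and function names are made up for the example, and it only looks at images and ordinary hyperlinks.

```python
from html.parser import HTMLParser

class ReferenceLister(HTMLParser):
    """Collect the images and hyperlinks that a page refers to."""

    def __init__(self):
        super().__init__()
        self.images = []
        self.links = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "img" and attrs.get("src"):
            self.images.append(attrs["src"])
        elif tag == "a" and attrs.get("href"):
            self.links.append(attrs["href"])

def list_references(html_text):
    """Return (image sources, link targets) found in the HTML text."""
    lister = ReferenceLister()
    lister.feed(html_text)
    lister.close()
    return lister.images, lister.links

# A small made-up page standing in for one saved from the web.
page = '<html><body><img src="logo.gif"><a href="next.html">Next</a></body></html>'
images, links = list_references(page)
print(images)  # ['logo.gif']
print(links)   # ['next.html']
```

Everything listed lives in a separate file on the original server, so none of it comes along when only the page itself is saved.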

5.5.7 Viewing HTML source

Most browsers will allow you to view the HTML source of the page (or frame) being viewed.

This is an invaluable aid when debugging your own pages, and for learning how other people have put theirs together.

It can sometimes give you additional information, though most of the information the author wanted you to see is already on screen.

5.5.8 Tricks for using browsers

Here are a few miscellaneous tricks

If you want to view a page that has changed, press reload or refresh. Sometimes, to force this, you need to hold down the shift key at the same time.
Most browsers will store the data they download in a local cache so that next time you ask for the same data it can retrieve the local copy, giving a faster draw.

You should check your set-up options to understand what caching you have, and be aware that by being served a local copy you may not see any changes immediately.

Browsers have lots and lots of options. You should look through the menus to ascertain what there is, but only play with the ones you understand.
If you want faster downloads, try switching images off. In this mode you only get the text, which naturally comes much faster. You can always click on the missing picture, or switch the option back on and reload should you decide you want to see the pictures again.

However this can make some pages unnavigable, as people over-rely on graphic image maps (pictures where you click on the bit you want).
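The caching and reload tricks above can be pictured with a very small model. This is a sketch only: the PageCache class is invented for the example, a dictionary stands in for the remote server, and real browser caches also deal with expiry times and disk space.

```python
class PageCache:
    """Keep a local copy of each page and reuse it unless told to reload."""

    def __init__(self, fetch):
        self._fetch = fetch   # function that really goes and gets a page
        self._store = {}      # url -> local copy

    def get(self, url, reload=False):
        if reload or url not in self._store:
            self._store[url] = self._fetch(url)
        return self._store[url]

# A dictionary pretending to be the remote server.
url = "http://www.example.com/news.html"
server = {url: "Monday's news"}
cache = PageCache(fetch=lambda address: server[address])

first = cache.get(url)               # fetched from the "server"
server[url] = "Tuesday's news"       # the page changes
stale = cache.get(url)               # local copy, so still Monday's news
fresh = cache.get(url, reload=True)  # forced reload picks up the change
print(first, stale, fresh, sep=" | ")
```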

5.6 Extending your browser's capabilities

Over time more and more functionality has become available over the Internet. Inevitably this means that the time will come when you are missing out because your browser is not up to it. There are a number of ways of enhancing your browser.

5.6.1 Change browser or get a newer version

This is the "throw it away and get a new one" approach. Depending what you change to this may cost you money. Make sure your computer is powerful enough for any new version you decide to get.

When you install the new version, it might be useful (if you can afford the disk space) to keep the old one just in case.

You may be able to try out the new version free for a while to see if it's what you want.

Finally, be aware that adding any new software to your system can cause unexpected changes to your system's configuration. Internet Explorer is particularly keen to set itself up as your default browser should you install it.

5.6.2 Install some Plug-ins

Netscape were the first to develop the idea of helper applications and plug-ins. The idea behind plug-ins is that rather than produce an enormous, resource-hungry piece of software that does everything, why not instead make a slimline browser to which you can add only those extras you need.

You'll know you need a plug-in when you keep coming to a site that tells you what you need. Normally you can simply download the plug-in by following the link and downloading and installing the software as instructed.

Plug-ins commonly handle a particular file type, often a new type invented for use on the Internet by the manufacturers of the plug-in itself.

Some plug-ins are free, others are not. Often the plug-in required to read or play back a given file type is free, whilst the software required to author such files is not. This is a payment model frequently used, as it ensures maximum take-up of a new file structure.

5.6.3 Enable active content

HTML as originally devised was a fairly "passive" language. That is, it could define a page that one could view, but not interact with.

Nowadays there are several ways in which web pages are being made more and more interactive. In most cases you need a browser capable of interfacing with the new content types, and you then need to choose to have these features enabled, usually by searching through network or security options.

5.6.3.1 Animated .GIFs

Animated .GIFs are basically animated pictures. Their most common use is in advertising. In content terms these are passive in that you can't interact with them, but they can make a page more lively.

The problem with animated .GIFs is that they are, of necessity, many times larger than a static picture the same size. Consequently they can greatly increase the time a page takes to draw.

5.6.3.2 Java and ActiveX

Java and ActiveX are both methods of allowing programs to be downloaded automatically and run "inside" your web page. In this way they can effectively give you software you can interact with on a web page.

These programs are temporary in the sense that once you exit your browser the program no longer exists on your machine (in fact, once you back out of the web page it's gone).

In the case of Java an area of screen is reserved for an "applet" to run in, and this applet is downloaded and run locally on your computer. The Applet isn't stored on your computer, and is designed to run in a way that cannot contaminate your hard disk or computer memory.

ActiveX is Microsoft's response to Java, but unlike Java it only runs on Windows machines, and is allegedly less secure than Java.

5.6.3.3 JavaScript and other scripting languages

JavaScript was developed by Netscape as an attempt to make web pages more interactive. Despite the name it is a separate language from Java with which, confusingly, it has little in common beyond some syntax.

JavaScript code is written into the HTML source files themselves. This code is interpreted by your browser as it reads the HTML page. The code can tell the browser what to do when you click on certain buttons, or move your mouse to a certain location.

The language allows messages to be displayed, and can manipulate the contents of the web page (something Java cannot do... it is restricted to the reserved applet box).

Microsoft are hoping to make their popular Visual Basic the basis of their own scripting language. As ever, the two browser companies continue to battle it out.

5.7 Common Errors

5.7.1 The dreaded 404

This is the commonest error. It simply means "file not found". This is usually because the file has been moved, and you have followed an old link.
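The number 404 is one of the standard status codes that a web server sends back with every reply. As a small illustration, Python's standard library knows the official wording for each code; the helper name describe_status is made up for this sketch.

```python
from http import HTTPStatus

def describe_status(code):
    """Return the code with its standard phrase, e.g. '404 Not Found'."""
    try:
        status = HTTPStatus(code)
    except ValueError:
        return f"{code} (not a standard status code)"
    return f"{status.value} {status.phrase}"

print(describe_status(404))  # 404 Not Found
print(describe_status(200))  # 200 OK
```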

5.7.2 DNS lookup error

A Domain Name System (DNS) lookup error has occurred. What this means is that the Internet domain name you have specified cannot currently be translated into a valid IP address.

This could mean the machine doesn't exist anymore, but sometimes trying a second time, or a day later solves the problem.
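The translation step that fails here can be tried directly. The sketch below asks the operating system to translate a name using Python's standard socket module and reports failure instead of crashing; the function name lookup and the host name used are assumptions for the example.

```python
import socket

def lookup(domain_name):
    """Return the IP address for a domain name, or None if lookup fails."""
    try:
        return socket.gethostbyname(domain_name)
    except OSError:
        # socket.gaierror, raised on a failed lookup, is a kind of OSError.
        return None

# Names ending in .invalid are reserved so that they should never resolve.
print(lookup("no-such-host.invalid"))
```

A result of None is the programmatic equivalent of the browser's DNS lookup error. As the text says, the same name may translate successfully if you try again later.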

5.7.3 Access denied

The machine you have accessed is choosing to deny access to the particular resource requested. This is either because you've asked for something you shouldn't have, or the remote machine has undergone a change of configuration or is undergoing some maintenance.

5.7.4 URL case sensitivity

Be aware that URLs are - in theory at least - case sensitive. This means you should always type in a URL exactly as you see it.

Whether or not a particular URL is case sensitive will usually depend on the type of computer the server is, and the sort of server software that it runs.

5.8 Cookies

Cookies are tiny nuggets of data that a web server can get a browser to write to your computer. This nugget of data can only be passed back by your browser to the same web server.

This device allows servers to keep some context information on each visitor to their site. For example a search engine could use this to remember what topics you were interested in last time you visited a few weeks ago.
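The exchange can be sketched with Python's standard http.cookies module: the server sends a header asking the browser to store a nugget of data, and on a later visit the browser sends it back. The cookie name last_topic and its value are invented for this example.

```python
from http.cookies import SimpleCookie

# Server side: ask the browser to remember the visitor's last topic.
outgoing = SimpleCookie()
outgoing["last_topic"] = "browsers"
outgoing["last_topic"]["path"] = "/"
header = outgoing.output(header="Set-Cookie:")
print(header)

# On a later visit the browser returns the data in a Cookie header,
# which the server reads back to recover the context.
incoming = SimpleCookie()
incoming.load("last_topic=browsers")
remembered = incoming["last_topic"].value
print(remembered)  # browsers
```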

This technique is becoming increasingly popular as a means to customize the way a particular site works for you. Because of this, access to some sites is dependent on being able to accept cookies.

Not all browsers accept cookies, and those that do can usually be configured to show alerts each time someone tries to set one.

Whether you allow cookies to be set is a matter of personal preference.
