$_$_TITLE JAF's Introduction to the Internet
$_$_DESCRIPTION Yezerski Roper's Introduction to the Internet. Available freely in .zip format
1 Introduction
==============
This document is a guide to getting the most out of your access to the
Internet. It gives a brief history of the Net, an overview of its major
features, and hints and instructions on how to use the software available
for the Net.
This document was originally written in support of a seminar given by the
author at Hong Kong University.
The text version of this document will be converted into HTML using the
author's own [h:AscToHTM] conversion tool. The HTML version of this document
will contain hyperlinks to additional sources of information.
The last major revision of this document was in May '97. The last
minor revision date is shown in the footer.
1.1 Other guides
There are many, many detailed guides available on all aspects of the Internet.
For example
http://www.library.unt.edu/webintro/howto.html
The Big Dummy's guide to the Internet is a rich source of information and
can be found at
http://www.eff.org/papers/eegtti/eeg_toc.html
Of course you can search for lists of search engines. Here are two that I've
found
http://www.rcch.com/hotlist/search.htm
http://www.med.harvard.edu/countway/webref/inet.html
2 The Internet
==============
2.1 Some history
The Internet is actually a child of the Cold War. With the prospect of a
nuclear war looming, a resilient network was required such that, should one
communications centre be destroyed, messages would automatically re-route
themselves to ensure they could still get from A to B.
This led to ARPANET, which became an American universities network, and
ultimately the Internet of today.
In the early days, sending information via this network was neither
particularly rapid nor easy. At that time you had to be fairly technically
minded and working at a university or major research establishment to use
the Net.
But over time, more and more useful utilities and tools were developed,
and as computer hardware dropped in price and commerce became interested, the
Net began to spread, and really hasn't stopped since.
The "half-life" of the Internet is around 10 months; that is, 10 months ago
it was half the size it is now. This growth has been going on
for years now, and still looks to continue until, in a few years from now,
almost every computer in the world will access the Net. This process itself
is encouraging computer ownership, to the extent that "getting on the Internet"
is now a major reason for purchasing a computer for home use.
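As a rough worked example of what a 10-month doubling time implies, here is
a minimal sketch. It assumes perfectly steady exponential growth, which is an
idealisation rather than a measured result, and the function name is made up
for illustration.

```python
def relative_size(months_from_now, doubling_months=10):
    """Size of the Net relative to today, assuming it doubles every
    `doubling_months` months. Negative values look back in time."""
    return 2 ** (months_from_now / doubling_months)


if __name__ == "__main__":
    for months in (-20, -10, 0, 10, 20, 30):
        size = relative_size(months)
        print(f"{months:+4d} months: {size:.2f} x today's size")
```

On these assumptions the Net was a quarter of its present size 20 months ago,
and would be eight times its present size 30 months from now.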
For a brief summary of the Internet's growth, visit
http://www.nw.com/zone/host-count-history or
http://www.weyrich.com/web_business/www_history.html
For surveys of who is using the web and how, visit
http://www.gvu.gatech.edu/user_surveys/
The real breakthrough for the Internet was the invention of the World Wide Web
at CERN, followed by the introduction of the Mosaic graphical browser developed
at NCSA. This coincided with affordable graphics-capable home computers and
almost overnight the Net went from a text-based [[GOTO Nerd]]'s paradise to the
user-friendly point-and-click world we know today.
Did I say user friendly? Well almost... read on...
2.2 How it works, and why it sometimes doesn't
The Internet works by breaking up information into packets of data. Each packet
of data is given an address and sent off on its merry way. When the packets
are received at the other end, they are reassembled to give a faithful copy
of the original data.
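The idea of splitting and reassembling can be sketched in a few lines. This
is a toy illustration only, not how real Internet protocols are implemented:
the packet layout, the packet size and the function names are all assumptions
made for the example.

```python
import random


def split_into_packets(message, address, size=8):
    """Break a message into numbered packets. Each packet carries the
    destination address, its position, and the total packet count."""
    chunks = [message[i:i + size] for i in range(0, len(message), size)]
    return [
        {"to": address, "seq": seq, "total": len(chunks), "data": chunk}
        for seq, chunk in enumerate(chunks)
    ]


def reassemble(packets):
    """Rebuild the message from packets received in any order.
    Raises ValueError if any packet is missing."""
    if not packets:
        raise ValueError("no packets received")
    total = packets[0]["total"]
    by_seq = {p["seq"]: p["data"] for p in packets}
    missing = [seq for seq in range(total) if seq not in by_seq]
    if missing:
        raise ValueError(f"missing packets: {missing}")
    return "".join(by_seq[seq] for seq in range(total))


if __name__ == "__main__":
    message = "Each packet is sent off on its merry way."
    packets = split_into_packets(message, "176.5.120.34")
    random.shuffle(packets)  # packets may arrive in any order
    print(reassemble(packets) == message)
```

Numbering the packets is what allows the receiver to put them back in order,
and to notice when one has gone astray.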
The Internet consists of a network of computers all passing messages to and fro.
Each packet gets passed by a machine to its neighbours which then decide to pass
it on, or pass it back. This trial and error approach is both the Internet's
greatest strength and its greatest weakness.
It's a strength because if an Internet node goes down (and this happens even
without nuclear strikes), the messages simply divert round the missing node.
This may mean taking a detour via a satellite link over the Indian Ocean, or
travelling by optical fibre via America. It really doesn't matter to you,
the Internet sorts it all out.
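Diverting round a missing node can be illustrated with a toy route finder.
This is a sketch under loud assumptions: the network map and node names are
invented, and a simple breadth-first search stands in for the far more
elaborate routing that real Internet nodes perform.

```python
from collections import deque


def find_route(links, start, goal, down=frozenset()):
    """Breadth-first search for a path from start to goal that avoids
    every node listed in `down`. Returns a list of nodes, or None."""
    if start in down or goal in down:
        return None
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbour in links.get(node, ()):
            if neighbour not in seen and neighbour not in down:
                seen.add(neighbour)
                queue.append(path + [neighbour])
    return None


# Hypothetical network map, invented purely for illustration.
LINKS = {
    "London": ["Paris", "New York"],
    "Paris": ["London", "Hong Kong"],
    "New York": ["London", "Hong Kong"],
    "Hong Kong": ["Paris", "New York"],
}

if __name__ == "__main__":
    print(find_route(LINKS, "London", "Hong Kong"))
    print(find_route(LINKS, "London", "Hong Kong", down={"Paris"}))
```

With every node up, the route goes via Paris. With Paris down, the same call
finds the detour via New York without the sender having to do anything.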
It's a weakness because you usually need all your packets to reassemble the
original message, and if one takes a detour this may delay your whole message.
If one gets lost, it will usually prevent you getting the rest of the message.
You can see this in a browser sometimes when a link stalls for no good reason
and then after a while carries on. Most likely one of your packets just took
the tourist route.
In summary, the Internet is flexible, but may as a result not be all that fast.
2.3 Domain names
Each internet node is given its own [IP] or IP address. This
is a series of four numbers such as
176.5.120.34
and is unique. Most internet nodes also choose to make this number correspond
to a Domain name, that is a text name that is more comprehensible to users.
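The translation between names and IP addresses is performed by Domain Name
Servers, and failure to convert a name to an IP address gives the dreaded
[[GOTO DNS lookup error]]. (This is not the same as error 404, which means
the server was found but the page you asked for was not.)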
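To make the "four numbers" form concrete, here is a minimal sketch using
Python's standard ipaddress module to pull an address apart. The helper name
is hypothetical. Each of the four numbers must lie between 0 and 255, so
anything outside that range is rejected.

```python
import ipaddress


def octets(address):
    """Return the four numbers of a dotted IPv4 address as a list.
    Raises ValueError if the text is not a valid address."""
    return list(ipaddress.IPv4Address(address).packed)


if __name__ == "__main__":
    print(octets("176.5.120.34"))
    try:
        octets("176.5.120.300")
    except ValueError as error:
        print("rejected:", error)
```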
2.3.1 What's in a domain name?
Domain names form part of your email addresses, web addresses, ftp addresses
etc.
In the case of email, the domain name is that part of the email address *after*
the "@".
The part of your email address before the "@" depends on the administration
of email on your Internet node.
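Splitting an address at the "@" is easy to sketch. The function name below is
made up for illustration, and the address is only an example of the general
form.

```python
def split_email(address):
    """Split an email address at its last "@" into
    (local part, domain name). Raises ValueError if either is missing."""
    local, at, domain = address.rpartition("@")
    if not at or not local or not domain:
        raise ValueError(f"not an email address: {address!r}")
    return local, domain


if __name__ == "__main__":
    local, domain = split_email("my.name@wherever.somewhere.uk")
    print("local part: ", local)
    print("domain name:", domain)
```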
2.3.2 How domain names are allocated
The way it works is broadly speaking as follows. The IP address space
(in the form nnn.nnn.nnn.nnn, all numbers) is carved up into smaller chunks,
which are then administered by different organisations. This carving up is
all done behind the scenes by some IP committee somewhere.
In the early days reserving names of well known companies used to be quite
lucrative, as the named company would eventually have to buy the name off you.
There is still some competition for "good names".
In order for 194.159.108.2 to be understood as www.yrl.co.uk, someone
on the Internet backbone has to act as a domain name server, i.e. they know
how to match a name with an IP address.
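In spirit, a name server is a lookup table from names to numbers. The sketch
below is a deliberately tiny stand-in, not a real DNS implementation: the
table holds only the example from the text, and the function and exception
names are invented.

```python
# A toy name table. The single entry is the example from the text; a
# real name server holds far more, and asks other servers about names
# it does not know itself.
NAME_TABLE = {"www.yrl.co.uk": "194.159.108.2"}


class LookupFailed(Exception):
    """Raised when a name cannot be matched to an address."""


def resolve(name, table=NAME_TABLE):
    """Return the IP address for a domain name.
    Domain names are not case-sensitive, so the name is lower-cased."""
    try:
        return table[name.lower()]
    except KeyError:
        raise LookupFailed(f"DNS lookup error: {name}") from None


if __name__ == "__main__":
    print(resolve("www.yrl.co.uk"))
    try:
        resolve("no.such.name")
    except LookupFailed as error:
        print(error)
```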
These organisations, often ISPs, can sub-let parts of domain space, and arrange
the allocation within their own space.
Thus extensions like .ac.uk (uk academic establishments) are run by
appropriate bodies. Each body is free to charge for this service, and in
the commercial areas there is usually a fee in the £100/year region.
There may be further rules within each domain. For example an .ac.uk
domain name won't be allowed unless you're a recognised university.
2.3.3 Understanding domain names
Domain names are similar to postal addresses, in that the last part of the
address is most likely to be familiar to you. Each domain name reads
roughly as follows
[machine name].[organisation name].[domain type]
Reading this backwards we have the domain type which is always present.
The main domain types include
.com - commercial
.org - organisation
.edu - education
.net - network provider
.mil - military
Since the Internet evolved in America, these apply mostly to American
(or multi-national) organisations. Other countries append a (usually) 2-letter
country code, and most reproduce the above structure in some way. Thus in the
UK we have
.co.uk - UK commercial
.ac.uk - UK academic
.org.uk - UK organisations
Other countries have slightly different organisations.
In front of the domain type is the organisation name. This must be present
and unique within the domain type. Thus "microsoft.co.uk" and "microsoft.com"
are both allowed (and both exist).
In front of the organisation name is the machine name. For Web access this
is commonly just "www". Thus the web address of Microsoft is
www.microsoft.com
indicating that this is the web access for the American company Microsoft.
Knowing this naming structure allows you to frequently guess correctly
what the [[GOTO URL]] of a desired site might be.
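Reading a name backwards can be sketched as follows. This is a minimal
illustration that only knows the domain types listed in this section; real
lists are far longer, and the function name is hypothetical.

```python
# Domain types taken from the lists in this section. Two-part types
# such as "co.uk" come first so they are tried before anything shorter.
DOMAIN_TYPES = ("co.uk", "ac.uk", "org.uk", "com", "org", "edu", "net", "mil")


def parse_domain(name):
    """Split a domain name into (machine, organisation, domain type).
    The machine part is empty for names such as "microsoft.com"."""
    name = name.lower()
    for domain_type in DOMAIN_TYPES:
        suffix = "." + domain_type
        if name.endswith(suffix):
            rest = name[: -len(suffix)]
            machine, _, organisation = rest.rpartition(".")
            return machine, organisation, domain_type
    raise ValueError(f"unrecognised domain type in {name!r}")


if __name__ == "__main__":
    for name in ("www.microsoft.com", "microsoft.co.uk", "www.yrl.co.uk"):
        print(name, "->", parse_domain(name))
```

Going the other way, joining "www", an organisation name and a likely domain
type is exactly the guessing game described above.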
2.3.4 More information
- The ruling body for domain names is, or rather was, the [h:IAHC].
(This body was dissolved on May 1, 1997 and now refers you to
http://www.gtld-mou.org)
- A full list of existing domains can be found at
http://www.itu.int/intreg/dns.html
- A survey of domains and how many hosts they have can be
found at http://www.nw.com/zone/WWW/dist-byname.html where this
analysis is repeated every 6 months.
- Plans are under way to introduce new domains e.g. .firm, .nom
etc. See the discussion at http://www.netfact.com/iahc/
- There are some sites that will translate domain names into
real company names, often as an aid to choosing your own
domain name and making sure it doesn't conflict with an
existing name. See http://www.namesnet.co.uk/ for example
2.4 Software and services currently supported by the Internet
Using the Internet as a communications network an increasing number of
protocols have evolved to allow different types of information to be
distributed. Many of these services were originally text-based and
unfriendly to use. As such, they have been superseded by the Web, but we
mention them here for completeness.
2.4.1 Email
Email is one of the earliest, and still one of the most popular and enduring,
uses made of the Net. Reflecting this, there are a large number of free and
commercial email packages of increasing sophistication.
Email is almost enough justification in itself for seeking Internet access,
especially since these days all sorts of information can be sent this way.
See the chapter on [Section:email] for a fuller discussion.
2.4.2 File Transfer Protocol (FTP)
FTP is a means by which users can transfer files between computers. A typical
use of FTP is to set up a file server onto which users can login and download
software and other files.
Although increasingly integrated with Web browser software, there are still
a number of very useful FTP sites that cannot be accessed by "anonymous login".
See http://www.eff.org/papers/eegtti/eeg_137.html for details on
how to use FTP outside a browser.
Used inside a browser, FTP appears just like a directory listing, and you
can simply navigate up and down the directory tree.
Useful sites include [Site:RTFM] which lists all the FAQs by Usenet group,
for example
ftp://rtfm.mit.edu/ftp/pub/usenet-by-group/alt.games.tiddlywinks/
contains the tiddlywinks FAQ (posted monthly).
Also sites like
http://ftpsearch.ntnu.no/ftpsearch/
can help you find the FTP site that holds the resource you're looking for.
2.4.3 Telnet
Telnet allows you to log-in to another machine over the Internet.
For more details see http://www.eff.org/papers/eegtti/eeg_93.html
2.4.4 News, or UseNet
Usenet or News is a set (21,000 and rising) of public discussion groups.
Messages are [posted] to the [newsgroup], and distributed to all who read
the group.
These are an unrivalled means of communication with your peers and a rich
source of information. They are discussed fully in the chapter on
[Section:News]
2.4.5 Internet Relay Chat (IRC)
Internet Relay Chat allows you to "chat" interactively with one or more people.
To do this you need IRC software installed, and you need to join a
"discussion room". These rooms are arranged by subject, so in principle
you should be able to find someone to talk to about something.
For more details see http://www.eff.org/papers/eegtti/eeg_230.html
2.4.6 The http protocol (the Web!)
The HyperText Transfer Protocol (HTTP) marks the arrival of the Web proper.
It is discussed fully in the chapters on [Section:Browsers] and
[Section:WebPages].
2.4.7 Other services
Other services of less interest here include
- Gopher
- Archie
- Veronica
- finger
2.5 Software and services "coming soon" to the Internet
2.5.1 Smart agents
Software is already starting to appear that will search the Net intelligently
sniffing out information that you want. It's difficult to see where this
will lead, but an often quoted example is that software could search all the
news web sites to construct a daily "newspaper" containing only articles
known to be of interest to you.
2.5.2 "Search" technologies
Search technologies are already here, but expect them to get smarter and
smarter. For example, [Site:Altavista] have recently added something called
Live Topics, which analyses and categorizes your search results to help
you fine tune your results further.
[h:Oil_Change] is software that will analyse your PC, and search the Web for
any software upgrades available, offering to [download] and apply any updates
it finds.
2.5.3 "push" technologies
The way software is purchased is changing, with more and more of it bought
through the Internet. New technologies such as [h:Castanet] and [[GOTO Java]] allow you to
have software that will automatically update itself each time you connect to
the Internet.
Windows '98 is likely to feature this technology in the guise of "channels"
that you can "tune into" to receive regular updates of news and software etc.
2.5.4 Internet Telephones
Again, already here. You can now make International telephone calls from
computer to computer via the Internet at a fraction of existing prices. Only
bandwidth and protected interests are stopping this taking off now.
2.6 Security and privacy
Security and privacy were never high on the list of design objectives when
the Internet was first designed. Although probably not as big a problem
as newspaper headlines might suggest, you should be aware of the following
points.
- Messages can get lost. Thus you cannot rely on a message always
getting through, or getting through on time.
If a message is vital, make sure you get a reply.
- Messages pass through many other nodes, thus they can in principle
be read in transit. This means that you should be very wary about
offering sensitive information on the Net or in Emails.
If in doubt, don't do it. Use the Net to get a voice phone or fax
number and use that instead.
- Because Email is sent as plain text, there is some risk of it being
intercepted. You can get round this by using various encryption
techniques, but these are illegal in certain countries.
- Email is easily faked. Particularly if you don't read the headers
(which many packages hide). This means that you cannot always be
sure that the person who apparently sent a message did.
If you want to check, try sending a mail to the person. If that fails,
or if you get a "not me" reply, then it may have been faked.
- If you [download] software onto your machine you run the risk of
introducing a virus into your system.
This is a tricky one. The simplest solution is don't do it. If
you must do it, only download from trusted sites, and use a virus
checker. However, even this is no guarantee as software downloaded
from Microsoft has been infected in the past.
- If your browser is ActiveX or Java enabled you may download software
without realising it. Although both these systems are supposed to
be secure, there has been much discussion over the extent to which
this is true.
If in doubt, disable these features in your browser.
- If you post to a newsgroup, your email address will get captured
and used for [[GOTO Spam]].
This is the Net equivalent of junk mail, and is simply a fact of life.
Note, most of these risks are no different to those you run using paper mail,
cordless phones, mail order catalogues, or software off a bulletin board.
The difference is that you will be doing this electronically, more frequently
and more publicly than before.
Common sense will get you past most problems.
2.7 Getting access to the Internet
Access to the Internet is usually through either the academic or commercial
organisation you work for, or from an Internet Service Provider (ISP) contacted
from home.
At work you are likely to have permanent access through a high-[[GOTO bandwidth]] link.
At home you are likely to have sporadic dial-up access through a modem.
Modem access is usually at local telephone rates, which is free in some
countries, but by no means all. Additionally modem access can be relatively
slow, making it proportionally more expensive.
This usually means that private users are less enamoured of high graphics
content on web pages, are less inclined to download large software programs,
and won't / can't afford to touch video content with a bargepole.
Many web page designers forget this simple fact.
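To put rough numbers on the modem point, here is a back-of-envelope sketch.
The line speeds and file size are assumptions chosen for illustration, not
figures from this guide, and the sum ignores protocol overhead, compression
and congestion.

```python
def download_minutes(size_bytes, bits_per_second):
    """Ideal transfer time in minutes: 8 bits per byte, no overheads."""
    return size_bytes * 8 / bits_per_second / 60


if __name__ == "__main__":
    one_megabyte = 1_000_000
    # Assumed speeds: a 28.8 kbit/s modem and a 2 Mbit/s office link.
    for label, speed in (("modem", 28_800), ("office link", 2_000_000)):
        minutes = download_minutes(one_megabyte, speed)
        print(f"{label}: {minutes:.2f} minutes per megabyte")
```

If you are paying for the call by the minute, every extra megabyte of
graphics on a page has a very direct cost.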
2.8 Response times/loading
The response time on the Internet varies according to a large number of
factors, most of which are simply the consequences of its success and phenomenal
growth rate.
Factors include:
- What type of access you have. Are you on a modem, or a high bandwidth
network link?
- What time of day it is. The demographics of the Net are largely
American, thus when America sleeps access to American resources
may be easier... unless the rest of the world has the same idea.
- What day of the week it is. Most commercial access occurs during
working hours. So does most recreational. Since most people have
access "at work", things are often quieter at the weekend.
- Where you are trying to get to. At one time access from England to
France went via America because there was no direct link. At that
time accessing American sites was easier and faster than accessing
geographically closer sites in France.
- How many other people your service provider is serving. In extreme
cases this can translate into "can I even connect to my ISP". If
your local server is carrying too much traffic it will slow things
down.
This behaviour usually alternates as service gets so bad that people
leave and new resources are commissioned. This used to happen for
the whole UK's access to the US, but is more consistent now.
- How many other people are accessing the same site. Microsoft has
a very powerful site, and it's *always* slow. I have a very weak
one, and it's seldom slow :)
- What technologies have been released. Each new technology requires
more bandwidth than the last: images, animated images, Internet
telephones, video on demand. Each has threatened to grind the Internet
to a halt... but it hasn't happened yet.
2.9 Commerce on the Net
Commerce on the Internet has not fully developed yet. Whilst few companies
or organisations these days do not have a presence on the Internet, it's
still early days for commerce via the Net, though the predictions are
quite astounding.
What is true is that searching for items via the net and advertising via
the net are certainly here to stay. What is missing is a reliable, secure
and international method of paying for services.
However, fear not, for it's on its way. For example see Digital's plan for
[h:Millicent], a proposal to allow microcash to be charged each time you click
on a particular service on a web page.
The days of free Internet services such as electronic newspapers
may be limited.
2.10 Intranets
So successful has the Internet been, that it's spawned a child - the Intranet.
Intranet is the term used to describe the adoption of Internet technologies
such as IP networks, email and browsers for the internal networking needs
of an organisation.
One sign of this process is the increased use of HTML to produce Web-like
documentation, and the distinction between [on-line] and [off-line] software and
services begins to blur when documents reference on-line source material
and are even capable of being updated automatically via the Internet when
accessible.
This adoption of Internet technology is attractive because the software
is cheap or free (due to its mass appeal), and familiar (also due to its
mass appeal).
Equally the browser software that has made the Internet so popular with
users is felt by many to be more intuitive than many traditional software
interfaces.
This last point has been particularly taken on board by Microsoft, who are
increasingly making their browser, Internet Explorer, into a larger and more
central part of their future operating systems.
3 Net Culture
=============
As the Internet has grown it has developed its own jargon, slang and rules
of accepted behaviour. If you don't know who your [[GOTO ISP]] is, have never read
any [FAQ] and have no idea if you're being [Trolled], then chances are you're
a [[GOTO newbie]] and it shows.
There's nothing wrong with this, but whilst you're soaking up the culture
you'll be stepping on people's toes, and you're going to get [flamed].
3.1 Jargon
----------
3.1.1 Computer Jargon
Being based on computers, the Internet is full of Jargon. What is worse is that
the pace of change means that newer, fresher Jargon is always being added
meaning that there's always new scope for impressing/confusing people.
As the Internet has become more popular, the language has become more flowery
so that where once terse acronyms such as [IP], [[GOTO FTP]] and [[GOTO IRC]] were common,
these days semi-meaningless phrases and place names such as Cool Talk,
Live Topics, Marimba and [[GOTO Java]] are all the rage.
It's all still jargon unfortunately.
We've supplied a [[GOTO Glossary]] which you should browse, but it will soon
become out of date.
3.1.2 Internet slang
As the Internet has offered people a new medium by which to communicate, so
too it has developed its own slang and terminology. This is particularly
true in email and newsgroups which are conversational by nature, and is less
true on Web pages which are of comparatively fixed content.
Most of the slang has evolved to describe activities that just don't happen
outside the Internet such as [[GOTO Trolling]], [flaming], [[GOTO Spam]] and of course
the ever-present [smileys] that have actually crossed over into non-Internet
culture.
Again, we've tried to give you a flavour in the [[GOTO Glossary]].
3.2 Netiquette
---------------
Netiquette is, unsurprisingly, short for Net-etiquette. As with etiquette
there are no hard and fast rules, but there is a broad consensus of what the
dos and don'ts should be.
3.2.1 What to do
Here are some guidelines :-
- Be courteous. Unless you are convinced that someone has deliberately
said something to annoy you, you should give people the benefit of the
doubt.
- Remember that many people are not writing in their first language. So
try to be a little tolerant of spelling and grammar. The exception to
this is in the newsgroup alt.usage.english.
- Bear in mind that each part of the Internet has its own local customs,
just like countries do. Try to observe and discover what these are
before diving in. What this often means is [lurk] for a while, and
if possible read the [FAQ].
You can read a set of guidelines on how to use News at
http://sasun4.epfl.ch/News/Document
3.2.2 What not to do
Some don'ts :-
- A good example of what not to do is given by the famous Emily Postnews
(for example at http://scwww.ucs.indiana.edu/FAQ/Emily/). Read this
advice and ignore it :)
- Don't use up [[GOTO bandwidth]] by posting binary files where they are not
wanted. Usually there are accepted places to post such files, and the
usual practice is to post the binary in the acceptable location, and
to post a pointer to it in other relevant locations.
- Don't steal other people's art work/web pages.
- Don't cross-post news articles to too many newsgroups (see [[GOTO Spam]])
- DO NOT PASS ON ANY "MAKE MONEY FAST" SCHEMES. This is serious, as
people complain about this and it is common to lose Internet access
for doing this. More seriously this can sometimes be construed as
mail fraud in the States.
- When responding to an email or news article try not to "quote" the
whole original. [[GOTO Quoting]] is accepted good practice, but you should
limit your quotes to only the relevant parts.
3.2.3 What to beware of
There are a number of well-known scams on the Net. Because of the high
[[GOTO newbie]] quotient of the Net, there are always people ready to try them out
or fall for them. You'll soon see a series of sheepish "I'm sorry, I didn't
realise" posts.
The ones to avoid are
- "Make money fast". There are no end of pyramid letters on the
net. They don't work. If you do the sums you soon realise that
if they did work you'd never see anything but these messages on the
Net. Although it may seem like it at times, this is not the case.
If you're not convinced, visit http://ga.to/mmf/ and
see people who've tried it ridiculed and their "success" examined in
detail.
People react angrily to these irritating posts, complaining to
the [postmaster] involved, and frequently succeeding in getting the
offending poster dropped from their Internet account.
In many parts of the world such posts are illegal.
Knowing all this, some people use bulk email software and a fake
email address to send these messages out. Needless to say such
people are best avoided.
Unless you know the correct complaints procedure, just ignore these
mails and posts.
- The "Good Times Virus". This is a scare that does the rounds every
few months stating that a virus is going round that is spread via
email with the subject line "Good times".
Needless to say this was impossible, and also needless to say some
jokers sent out mails with the subject line "Good times".
It's impossible to say that no such warning will ever turn out to be
true. Whilst it's unlikely that an email message would
contain a virus itself because emails are intrinsically passive,
emails are increasingly used to send files around, and it is possible
to embed a macro virus in a Word document.
The moral is take everything you hear with a pinch of salt, but
don't dismiss it out of hand. If in doubt, seek a more authoritative
opinion for confirmation, for example http://www.av.ibm.com/.
A site discussing "hoax" viruses can be found at http://kumite.com/myths/
3.3 Expressing yourself in text
-------------------------------
Although the Internet is increasingly lending itself to multimedia
communication such as graphics, audio and video, the written word is still
far and away the dominant form of communication used.
Furthermore, a number of factors combine to make the written word on the
Internet different from its paper-based equivalent. These include
- the speed of communication that email and newsgroups have. Often only
minutes from one side of the globe to the other.
- the many different languages that people speak. English is the dominant
language on the Net, but a large proportion of people do not speak it
as a first language.
- the greater variety of backgrounds people on the net have.
- the demographics of the Net.
These factors combine to make the Internet far more conversational and less
formal than traditional forms of written communication.
3.3.1 Emphasis
3.3.1.1 Adding emphasis to words
Emphasis can be added to words or phrases by adding an asterisk (*) either
side. Sometimes other characters are used such as "/". In addition to this
you occasionally see an underscore either side when quoting the title of
something. This is because not all computers handle inverted commas the
same way.
Examples :
This *used* to work okay.
I'm /really/ happy about that.
I enjoyed reading _Alice in Wonderland_
Writing words in capitals is taken as shouting (see below) and shouldn't
be used unless that's the effect you wish to convey.
3.3.1.2 Smiling, grinning, frowning
You can add "mood" to your words via [smileys] or [emoticons]. These
can be used to defuse apparently critical sentences, or to reinforce the
mood of the written word.
Example
You shouldn't have done that :-)
3.3.1.3 SHOUTING
Writing in CAPITALS is read as SHOUTING on the Net. If you post an entire
article with your caps lock on don't be surprised if people complain about
being deafened by you.
3.3.2 Quoting
The conversational nature of email is highlighted by the practice of [[GOTO Quoting]]
from the item to which you are replying.
3.4 Demographics and language
-----------------------------
Be aware that the Internet community does not share the same demographics
as the "real world". This is inevitable because of its history and makeup.
As an illustration, look at the number of sites in each country listed
in http://www.nw.com/zone/WWW/dist-byname.html. The list of countries without
Internet access reads like a who's who of the third world and repressive
regimes around the world.
Currently the Internet Community is
- Young. Almost all students in the Western world have access, whilst
very few of their parents do.
- Largely American. America dominates the Net, with the rest of the
first world catching up fast.
- Electronic. This means that you can find information on
almost anything that happened in the last 5-10 years, and you can
find - through [h:Project_Gutenburg] - electronic copies of classic
books such as Alice in Wonderland, the complete works of Shakespeare
and the Bible that are out of copyright, but you won't find much 20-30
year old stuff that is still in copyright.
- Computer literate. Historically only those who understood computers
could access the Net.
It's worth stating that all of these characteristics are being normalised as
the Net gains in popularity. That is the Internet community is getting
older, less American, less computer literate (as computers get easier to use),
and more and more "traditional" content is finding its way onto the web.
A useful site that produces surveys of these trends can be found at
http://www.gvu.gatech.edu/user_surveys/
4 Email
=======
4.1 Email addresses
-------------------
4.1.1 What does my email address mean?
Email addresses come in the form
[local user name]@[domain name]
The domain name describes the Internet node your mail should be addressed to
(see [[GOTO Domain Names]]). The local user name determines how the mail
is addressed within your Internet node, and will depend on either
a) The machine you run mail on, and more particularly the mail protocol
used.
or
b) The Internet service provider you use to gain access to the Internet.
Some providers, such as AOL, basically give you an account and web space
on their machine. You therefore get a specified number of usernames and
it's a safe bet that your name (e.g. "paul") will have been used. This
explains the many odd AOL user names you see.
If instead you run mail on your own machine, mail addressed to your machine
is delivered to you, and it's down to your machine's mail software to
determine what addresses before the "@" it supports.
However, this is usually more work than most people wish to get involved
in, and careful choice of ISP will usually allow you to find a suitable
email address such as
my.name@wherever.somewhere.uk
4.1.2 How do I find people's email addresses?
Apart from being told someone's address and writing it on a Post-it sticker
(don't knock it... this works!), there are a number of ways of attempting
to find the email address of someone you want to contact.
For the most part these rely either on the person having registered their
address, or it being "captured" by one of the Internet search engines
that exist. This last technique will only work if the person has taken
an "active" role on the Internet such as posting to a newsgroup.
People who have dormant accounts (i.e. never used publicly) are virtually
impossible to find. In such cases you're as well to ring them up and ask.
Other techniques include
- Use an email address finding service. There are several of these
e.g. http://www.four11.com and in fact Netscape has access to the
address finders such as [h:InfoseekPeople] built into its menu
structure (Directory ... Internet Search)
- Search the [Site:RTFM] email address database.
- Use a search engine. For example [[GOTO Altavista]] and [Site:Dejanews]
will both allow you to search Usenet for a name. However this will
only find "active" users, who will probably already be captured
in the above databases.
- Read the "How to find people's Email" FAQ at
http://www.qucis.queensu.ca/FAQs/email/finding.html
4.2 How to use email
--------------------
The details of using email will depend largely on the software package and
computer that you use. The following discussions are just about general
usage.
4.2.1 Composing email
Normally you compose a new mail either by selecting "new message" in your
mail package, or by selecting a "mailto" link in your browser or in an email.
The latter is discussed in [Section:Browsemail].
When creating an email you will have to supply an email address to send to,
and can then optionally supply
- Subject. What's it all about (Alfie) ?
- The message itself. Depending on the package you're using and the
editor it provides, the text you type in may automatically "wrap"
(start on a new line). However, not all packages send messages
as they appear on screen, and many send entire paragraphs as a single
line.
This is bad news as some computers have a limit on how large a line
can be, and in such cases the last part of each paragraph is lost.
To avoid this either
a) Manually add line breaks every so often.
b) Configure your mailer to do this for you.
If in doubt, stick to (a).
- CC address. An email address the message is to be copied to. You
can enter your own email address, though it is normal to use the
options of the email package to state if you always want a copy
of outgoing messages.
- Attachments. Increasingly these days it is possible to "attach" files
to your message. This is often denoted by a paper clip icon (signifying
attachment).
When adding attachments you will normally be asked for the name of
the file to attach. Depending on the file type you may also be asked
whether you want the file sent as plain text or encoded in some way
(e.g. MIME encoded). Encoding is a way of converting files into text
that is safe to pass through the email system. Binary files have to
be encoded.
Some mail readers cannot process attachments. If you're sending to
such a person (or don't know), you may be wise to select "attach as
text" if the option arises. This won't be an option for binary files.
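The wrapping and attachment advice above can be sketched in Python using
the standard library's email package. This is only an illustration, not
a description of how any particular mail package works: the compose()
helper, the 72-column width and the demo.zip attachment are all
assumptions made for the example.

```python
import textwrap
from email.message import EmailMessage


def compose(to_addr, subject, body, cc=None, attachment=None):
    """Build a message with a hard-wrapped body and an optional attachment.

    compose() is a hypothetical helper written for this guide.
    """
    msg = EmailMessage()
    msg["To"] = to_addr
    msg["Subject"] = subject
    if cc:
        msg["Cc"] = cc
    # Wrap each paragraph separately so blank lines between paragraphs survive.
    paragraphs = body.split("\n\n")
    wrapped = "\n\n".join(textwrap.fill(p, width=72) for p in paragraphs)
    msg.set_content(wrapped)
    if attachment is not None:
        filename, data = attachment
        # Binary data is encoded (base64 here) into mail-safe text.
        msg.add_attachment(data, maintype="application",
                           subtype="octet-stream", filename=filename)
    return msg


msg = compose("my.name@wherever.somewhere.uk", "Test",
              "word " * 40, attachment=("demo.zip", b"\x00\x01\x02"))
print(msg.as_string())
```

The printed message shows the body broken into short lines and the
attachment turned into a block of plain text, which is what allows a
binary file to travel safely through the email system.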
4.2.2 Getting your mail
How mail gets to you depends on what type of internet access you have. If
you work for a large organisation you probably have a permanent connection
to the internet, and mail will simply arrive in your inbox.
If you have a home account, you may need to get your modem to dial up your
[[GOTO ISP]] and check for new mail. This can be done manually or semi-automatically.
Either way, when you next start your mail program it will tell you you have
new mail, and will list the messages available.
You simply select the message(s) you want to read, and view them.
4.2.3 Reading your mail
Reading your mail is normally done in an edit-style window. Depending on the
mail software you're using you will normally be able to read the message, and
scroll up and down through it. You probably won't be able to edit it.
Usually once you've read the message it remains in one of your [[GOTO Mail folders]].
If you wish you can re-read the message at any time by selecting it from the
folder.
4.2.4 Replying to your mail
It's common to reply to incoming mail. Usually your software will offer a
reply option (e.g. an icon or button) from inside your read window. This
will normally launch a window like the one used for composing mail, except that
- The "To" address will already be set to the name of the person the
mail is from.
- The "Subject" will be set to "RE:" plus the original subject. The
"RE:" signals that this is a reply on the same subject.
- The body of the original message may be placed in the edit buffer.
This is to facilitate [[GOTO Quoting]] of the original message, and many
packages will insert the quote character (usually, but not always, ">")
in front of each line.
Note that [[GOTO Netiquette]] decrees that it is bad practice to keep all this
text in the reply. Instead you should delete all bar those parts you
genuinely wish to quote and respond to in your reply.
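What the reply option does for you can be sketched in a few lines of
Python. The make_reply() function is a hypothetical helper, and the
check that avoids adding a second "RE:" is an assumption about sensible
behaviour rather than something every mail package does.

```python
def make_reply(sender, subject, body, quote_char=">"):
    """Return (to, subject, body) for a reply to the given message.

    make_reply() is an illustrative helper, not part of any mail package.
    """
    # Assumption: don't stack prefixes ("RE: RE: RE:") on long exchanges.
    if not subject.upper().startswith("RE:"):
        subject = "RE: " + subject
    # Put the quote character in front of every line of the original.
    quoted = "\n".join(f"{quote_char} {line}" if line else quote_char
                       for line in body.splitlines())
    return sender, subject, quoted


to, subject, body = make_reply("john@wherever.somewhere.uk", "Lunch",
                               "Hi\n\nAre you free on Friday?")
print(to)
print(subject)
print(body)
```

You would then trim the quoted body by hand, keeping only the lines you
are actually responding to.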
4.2.5 Forwarding your mail
Forwarding takes two forms:
a) You wish to forward your mail to another address (e.g. whilst
you're on holiday). This can usually be done by playing with the
configuration options available in your software package.
b) You wish to forward a particular mail to a third party who might
be interested. This can usually be done via a "Forward" option
(or icon or button) available whilst you have the message selected.
Depending on the software, you may or may not be allowed to edit
or add to the original message, and it may add "FWD:" to the subject
line to signal that the message has been forwarded.
4.2.6 Organising your email
After using email for a while you'll start to accumulate a large amount of
mail and contacts which you'll need to start organising.
4.2.6.1 Mail Folders
Most email software allows you to organise your mail into folders, and to
move mail between folders.
Some packages allow you to have subfolders as well.
Another common feature is for "deleted" mail to be placed in a wastebasket
folder, from where it can be retrieved. In such cases you should make sure you
understand how and when (or if!) this folder gets emptied before relying on
being able to retrieve such mail. For example the wastebasket will often be
emptied once you exit the email software.
4.2.6.2 Address books
Some email packages will allow you to store email addresses in an "address
book". This will usually allow you to give commonly used addresses a
nickname, shortcut or alias that will allow you to type in "John" when
sending mail, rather than a somewhat less memorable email address.
4.2.6.3 Deleting old mails
Every so often you should delete any old, unwanted mail. Depending on your
software you may be able to automate this, e.g. for mails over a certain age
to be deleted.
4.2.6.4 Compressing your mail files
Depending on the computer system and software package you may need to
"compress" your mail files every so often. Read the help for your software
for details.
4.3 Email services
There are a number of services that are available via email.
4.3.1 mailing lists
The commonest use of email. Mailing lists are organised for discussion round
a single topic. Topics range from rock band fan clubs, to self-help groups
for all sorts of medical conditions.
Mailing lists usually have two mail addresses, one you use to join and leave
the list, and one you send actual posts for the list to.
When you join a list you will normally be sent a lengthy message describing
- How to leave the list. This is IMPORTANT as many people join lists
for a short time, and leaving the list can be automated if you
keep the instructions.
- How to post to the list
- What the do's and don'ts of the list are. Often a list will have
a charter and you should stick to this.
When you post mail to the list a copy is forwarded to all recipients of the
list. Equally any replies to the list are sent to you. In some cases
all posts have to be accepted by a list [moderator], although more commonly
the list will simply have an administrator responsible for list maintenance,
rather than content.
To help you distinguish email from a list from other mail, some lists
put some common text on the subject line.
Where a mailing list has higher volume, it may offer the list in [digest] form.
These are single mails containing several posts to the list. Digests are
sometimes easier to manage.
Depending on how the list is organised, old mails may be archived, allowing
you to read back through discussions previously held on the list.
Contact your list administrator for details.
For a searchable directory of mailing lists, visit http://www.liszt.com/
4.3.2 Netmind
Netmind is a free email service that allows you to monitor when web pages
are updated. Visit http://www.netmind.com/ for more details.
4.3.3 FTP
It is possible to FTP files via email. This is mainly of interest when
accessing files on slow sites, and over an expensive modem link.
4.4 Subject lines
In order to help people better anticipate the contents of an email some
conventions have evolved on use of subject lines. These conventions are
basically a subset of the conventions used in newsgroups.
See [Section:Subject lines] for details.
4.5 How to deal with spam and junk mail
In the context of Email, [[GOTO Spam]] means any unsolicited emails, usually of
a commercial nature. These are the "junk mail" of the Net, and can become
quite irritating.
You'll start getting these as soon as your email address becomes known. This
can happen without you even realising it, most commonly once you [delurk]
in USENET.
Spammers are the scum of the earth, often using fake email addresses so that
you can't reply to them (they'll give a telephone number or [[GOTO Snail mail]]
address for that).
There are a number of counter-measures you can use.
- Don't let your true email address get out. People sometimes set
up their return address to have extra, obviously spurious characters
like "nospam" in it. They then put details in their [signature]
on how to undo this. This will deter some, but I suspect the
commercial spammers will soon get round this.
- Complain to the site that mailed you, by mailing a user called
"abuse" (e.g. abuse@cyberpromo.com) at that site. Not all sites
have such a user. Your message should include all the junk mail's
headers to aid proper diagnosis. Often spammers have
accounts with ISPs which prohibit such behaviour, and it is fairly
common for these people to lose their accounts if enough people
complain. Before you send the complaint, make sure the junk mail's
headers confirm the origin of the mail first.
Spammers are notorious for using fake email addresses.
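Checking the headers before complaining can be sketched as follows. The
message, host names and addresses below are all invented for the
example (they use reserved documentation names and address ranges), and
received_chain() is a hypothetical helper. Each machine that handles a
mail adds a "Received" line at the top, so the list reads from the most
recent hop backwards.

```python
from email import message_from_string

# An invented junk mail; every host and address here is made up.
RAW = """\
Received: from relay.example.net (relay.example.net [192.0.2.10])
\tby mail.example.org; Mon, 5 May 1997 10:00:00 +0000
Received: from unknown (dialup.example.com [192.0.2.99])
\tby relay.example.net; Mon, 5 May 1997 09:59:50 +0000
From: bargains@not-a-real-sender.example
Subject: MAKE MONEY FAST

Buy now!
"""


def received_chain(raw_message):
    """Return the Received headers, most recent hop first."""
    msg = message_from_string(raw_message)
    # Collapse folded (multi-line) headers into single lines for reading.
    return [" ".join(h.split()) for h in msg.get_all("Received", [])]


for hop in received_chain(RAW):
    print(hop)
```

Note that the "From" line plays no part in this. Bear in mind too that
the lines added by machines outside your own provider's control may
themselves be forged, so treat the earliest hops with some suspicion.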
A useful site that discusses a lot of the issues and techniques involved can
be found at
http://www.junkbusters.com/
There is also an [FAQ] on email abuse which can be found at
http://members.aol.com/emailfaq/emailfaq.html
5 Browsers
==========
5.1 Overview
------------
Browsers were developed to use the HyperText Transfer Protocol (HTTP).
A browser views a page written in [[GOTO HTML]]. HTML is a language that describes
in abstract terms how a page should be laid out. It also allows [[GOTO hyperlinks]]
to be defined. It is this feature that allows browsers to go from page to
page, and essentially defines the web.
The standards body for the HTML language is the W3 consortium. Their web
site (http://www.w3.org/) remains one of the major definitive sites for
information on HTML.
The first browser was developed at CERN. The first to become widely popular,
Mosaic, was developed at NCSA and given away free. Since then more commercial
browsers have been invented (some still free) whilst Mosaic itself has fallen
behind and is no longer under development.
The exponential growth of the Internet has allowed companies like Netscape
to come from nowhere to having a turnover measured in hundreds of millions of
dollars.
At present the main browsers in use are
- Netscape. Originally given away free, but now a commercial product.
Netscape's browser gained in popularity because it added a lot of
new features to HTML. In doing so it made HTML non-standard though
a lot of the Netscape extensions have since been accepted into standard
HTML. One of the major Netscape extensions was [[GOTO JavaScript]].
- Internet Explorer. Microsoft's (rather late) response to Netscape.
Not yet as popular, but being free and ubiquitous may fix that in
a few years. As with Netscape, IE has also extended HTML in its
own way.
- Hotjava. Not that popular at the moment, but supplied by the people
(Sun) who supply the Java programming language and environment that
is likely to play a large part in making web pages more interactive.
- Mosaic. Still in use, but showing its age, and no longer being developed.
- Lynx. This is a text-based browser that is remarkably popular
despite (or because of) this fact. Lynx can be a very fast and
effective way of browsing the net, if only because it dispenses
with all the time-consuming graphics.
- Opera. New kid on the block. Given good reviews primarily for its
reputation for being fast (especially since the big 2 went to version 4
and became unwieldy). Probably not as "fully featured" as some others.
5.2 What all browsers do
Actually, people will be amazed at how little *all* browsers do. The
basic browser displays text, shows text hyperlinks, and allows those links
to be selected.
You only need to bear this in mind if you're putting your own pages
on the web.
5.3 What all browsers don't do
Over time more and more features have been added to browsers. These changes
have arisen through
- Official and unofficial changes to the HTML standard. Later versions
of a browser usually support newer features of the HTML.
Changes in HTML over the years have included
- Use of TABLES
- Use of Frames
- Use of Cascading Style Sheets ([[GOTO CSS]]).
- Introduction of [[GOTO JavaScript]] and other scripting languages
- Introduction of [[GOTO Java]], [[GOTO ActiveX]] and other interactive software
- Introduction of plug-ins and add-ons. Plug-ins are commonly required
to handle special file types (e.g. audio and video files)
Whether or not your browser can do any or all of these things depends on
- Which browser you are using. For example there is great rivalry between
Netscape and Microsoft, with the former favoring Java and Javascript
and the latter favoring ActiveX and trying to introduce Visual Basic
Script.
This is one reason that you sometimes see a "best viewed in XXXX" logo
- Which version of the browser you are using. Older versions do less.
This is another reason that you sometimes see a "best viewed in XXXX"
logo. In these cases it usually states a version number of both main
browsers and may offer an alternative version (e.g. a non-frames version)
- Whether or not you have the requisite features "switched on". For
example use of Java and Javascript can be switched off in the options
of those browsers that support these features. This is because of
possible security worries related to these features.
This is why you may see a "use a java-enabled browser" type message
on a page.
- Whether or not you have the requisite add-ons and plug-ins.
Normally a page that requires such an extension will point you to where
it can be got from.
Incidentally, one of the _major_ reasons you'll see a "best viewed in XXXX" logo,
is that XXXX will have offered free software to the web page author.
5.4 Understanding Web addresses
Web addresses are a special type of [[GOTO URL]]. They take the form
http:// [Internet node] / [resource name] ? [extra data]
The "Internet node" can either be an IP address or a [[GOTO Domain name]], unless
you are browsing your organisation's [[GOTO Intranet]], in which case it will be
some local machine name.
The "resource name" will normally look like a Unix file or directory name.
These look like Windows 95 filenames, with the slash the other way round. For
example
/index.html
/pub/
/pub/download/file.txt
Directory names should end in a "/". If they don't the remote server will
normally have to add this for you, incurring an extra delay.
Resource names are often case sensitive (depending on the host machine), so
you should usually match the case of the URL as you're given it.
Sometimes you'll see a tilde (~) at the start of the resource name. This
often points to files belonging to a user of the machine you're visiting,
e.g.
/~jaf/jafs_file.html
Knowing that the resource name often corresponds to real files and directories
on the target machine can sometimes be useful, as it allows you to work out
which directory the file is in, and to attempt to look at that directory, or
the one above it. In the above case if you want to see what other files
user jaf has, you could try
/~jaf/
However, if the user doesn't want you to see the directory contents, they
can create a file (usually called index.html) which the server will search
for first. If such a file exists then this is what you'll be shown.
"Extra data" is usually only required when passing information to software
at the other end such as a search engine. The format will depend on the
resource being accessed. You almost never type this part in manually, rather
it is added automatically by your browser in response to data you have typed
in.
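The three parts of a web address can be pulled apart with Python's
standard urllib.parse module, as a rough sketch. The host name
www.example.com and the search URL are made up for the example, and
describe() and parent_directory() are hypothetical helpers. The second
one shows the "look at the directory above" trick described earlier.

```python
import posixpath
from urllib.parse import urlsplit


def describe(url):
    """Split a web address into the three parts described in this section."""
    parts = urlsplit(url)
    return {
        "node": parts.netloc,      # the Internet node
        "resource": parts.path,    # the resource name
        "extra": parts.query,      # the extra data after the "?"
    }


def parent_directory(url):
    """Return the URL of the directory holding (or above) the resource."""
    parts = urlsplit(url)
    directory = posixpath.dirname(parts.path.rstrip("/"))
    if not directory.endswith("/"):
        directory += "/"           # directory names should end in "/"
    return f"{parts.scheme}://{parts.netloc}{directory}"


print(describe("http://www.example.com/cgi-bin/search?q=mosaic"))
print(parent_directory("http://www.example.com/~jaf/jafs_file.html"))
```

Whether the server actually lets you look at the resulting directory is
up to the server, as noted above.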
A fuller description of URLs can be found in [[GOTO RFC]] 1738, e.g. at
http://www.cis.ohio-state.edu/htbin/rfc/rfc1738.html
5.5 Using web browsers
5.5.1 surfing
Browsers are easy to use... that's their attraction. Normally you simply
enter the [[GOTO URL]] of the page you want to visit, and away you go. There are
several ways of going to new pages :-
- Type in another URL
- Click on a text or picture [hyperlink]
- Select a URL that you've added to your [bookmark] list.
The hardest part is deciding what URLs you want to visit in the first place.
For this you'll need to use search engines, and to bookmark useful starting
points. Increasingly people are advertising URLs in non-Internet locations
such as newspapers, magazines and on television.
Another possibility is to find a site that regularly compiles lists of
interesting sites to visit. Taking this approach one stage further leads
to a site such as [h:Web_soup] that compiles lists of lists of people's
recommendations. If you really want to surf at random, start here.
5.5.2 Search engines
Search engines are an invaluable aid in locating pages that are of interest.
The Internet is so large that locating good quality information is both
possible and hard work. Search engines make this task much easier.
The basic idea is that the search engine will have visited and documented
a large number of web pages whose details it will store in a database.
You simply visit the search engine, enter a request, and all the URLs that
match your request are shown, often ranked in some order of suitability.
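The basic idea can be sketched as a toy index in Python. This is a
deliberately naive illustration and not how any real search engine is
built: the pages and URLs are invented, build_index() and search() are
hypothetical names, and the results are merely sorted alphabetically
rather than ranked by suitability.

```python
from collections import defaultdict


def build_index(pages):
    """Map each word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index


def search(index, query):
    """Return the URLs containing every word of the query."""
    words = query.lower().split()
    if not words:
        return []
    matches = set.intersection(*(index.get(w, set()) for w in words))
    return sorted(matches)


# Invented pages standing in for the ones an engine has visited.
pages = {
    "http://www.example.com/lynx.html": "lynx is a fast text browser",
    "http://www.example.com/opera.html": "opera is a fast new browser",
    "http://www.example.com/mail.html": "how to send email",
}
index = build_index(pages)
print(search(index, "fast browser"))
print(search(index, "text browser"))
```

The expensive work of visiting pages is done once when the index is
built, so each request only needs a quick lookup.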
There are universal search engines such as [[GOTO Altavista]] and [Site:Dejanews] that
allow you to search the entire Internet or [[GOTO Usenet]], and subject-specific
search engines.
So popular (and necessary) have search engines become, that many sites offer
search engines for just the web pages on their site. Search engine technology
is forever improving, to the extent that Digital now license their AltaVista
technology to other companies, have developed the LiveTopics feature of
AltaVista to help you more intelligently search for data, and even offer
a version for use on PC's to search all your own documents.
A site dedicated to monitoring search engine development can be found at
http://www.searchenginewatch.com/reports/index.html
This is more dedicated to discussing and monitoring the performance of various
search engines. At the same site is a page dedicated to listing specialist
search engines
http://www.searchenginewatch.com/facts/specialty.html
5.5.3 Sending email
Most modern browsers allow you to send email. Usually this is invoked whenever
you click on a "mailto" hyperlink, or whenever you select a mail option from
a menu. In essence this is no different to sending email normally, but you
should be aware of the following:
- Often the mail software inside your browser is completely independent
of any other mail software you have. Because of this it will need
to be configured independently to make sure that it behaves the same
as normal mail. In particular you won't get a copy of any mail you send
unless you set it up correctly.
- If you share a PC with other users, it may be inconvenient to keep
changing the return address etc.
- You usually have a "mail document" option. This usually sends a copy
of the HTML file you're looking at. If you merely wanted to send the
URL this can be a bit wasteful, so check for attachments before you
send.
5.5.4 Downloading files
It's quite common now to download files using browsers. In many ways this
has replaced the older [[GOTO FTP]] software which required you to supply a
username and password in order to access files, although there are still
many resources that are only accessible this way.
Downloading a file is usually done either through the FTP protocol, by clicking
on a URL that starts ftp:, or by selecting an http: link with a known download
filetype such as .zip.
When file download is selected, you will be prompted for a location on your
computer to save it to. The file will then download. In older browsers you
can't continue browsing whilst the download is occurring, in newer browsers
you can. In most cases a status bar may give some indication as to how long
the transfer still has to go.
When downloading at peak times, or from busy sites, this process can become
quite slow, so don't be surprised if the time remaining increases occasionally
or states "2 seconds" for over 10 minutes.
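Why the time remaining can go up as well as down is easier to see with
a little arithmetic. The sketch below assumes the estimate is simply
the amount left divided by the average speed so far. Real browsers may
well calculate it differently, and time_remaining() is a hypothetical
function written for this guide.

```python
def time_remaining(total_bytes, bytes_done, elapsed_seconds):
    """Estimate the seconds left from the average transfer rate so far."""
    if bytes_done <= 0 or elapsed_seconds <= 0:
        return None  # nothing received yet, so no meaningful estimate
    rate = bytes_done / elapsed_seconds        # bytes per second so far
    return (total_bytes - bytes_done) / rate


# Half of a 1,000,000 byte file has arrived after 100 seconds.
print(time_remaining(1_000_000, 500_000, 100))
# The link then stalls: 100 seconds later only a little more has arrived.
print(time_remaining(1_000_000, 510_000, 200))
```

In the second call the average speed has nearly halved, so the estimate
grows even though slightly more of the file has arrived.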
What happens once the file download is complete depends on what you've
downloaded, and how your machine and browser are set up.
In some cases nothing happens, and it's up to you to make use of the file
in whatever way suits.
In other cases a [helper application] is launched to "play" the newly downloaded
file, be it a piece of music, some video or just a special document type.
5.5.5 Bookmarking popular sites
All browsers allow you to bookmark sites that you want to go back to time
and time again. However, depending on the browser used this list could be
called the Hotlist, Favourites or Bookmarks.
In most cases you can group these URLs together into folders forming a hierarchy
of links like files and directories on your hard disk.
5.5.6 Taking a local copy
Most browsers will allow you to view HTML files stored on your own hard disk.
This being so, they also offer the ability to save the page currently on display
to your hard disk for later viewing, e.g. once you are no longer connected
to the Internet.
This is usually an option on the File menu, and is sometimes an option on
a pop-up menu if you right click on the main body of the page.
You can usually copy images by right clicking on them and selecting save.
Taking a local copy can give you faster access and [off-line] access to a page,
but there are a number of issues to be aware of
- The saved page has all its links, but not the pages referenced by those
links. Consequently, if viewed [off-line] you may find all the images
missing, and all the hyperlinks to pages other than this one won't
work.
- By viewing the copy, rather than the original, you lose the ability
to view any changes or other information the original author would
want you to see.
- If you reuse or redistribute the page in any way, you are breaching
any copyright the original author has on the page.
Generally it's fine to take a copy for personal use and convenience.
5.5.7 Viewing HTML source
Most browsers will allow you to view the HTML source of the page (or frame)
being viewed.
This is an invaluable aid when debugging your own pages, and for learning
how other people have put theirs together.
It can sometimes give you additional information, though most of the information
the author wanted you to see is already on screen.
5.5.8 Tricks for using browsers
Here are a few miscellaneous tricks
- If you want to view a page that has changed, press reload or refresh.
Sometimes to force this, hold down the shift key at the same time.
- Most browsers will store the data they download in a local cache so
that next time you ask for the same data it can retrieve the local
copy, giving a faster draw.
You should check your set-up options to understand what caching you
have, and be aware that by being served a local copy you may not see
any changes immediately.
- Browsers have lots and lots of options. You should look through the menus
to ascertain what there is, but only play with the ones you understand.
- If you want faster downloads, try switching images off. In this mode
you only get the text, which naturally comes much faster. You can
always click on the missing picture, or switch the option back on
and reload should you decide you want to see the pictures again.
However this can make some pages unnavigable, as people over-rely
on graphic image maps (pictures where you click on the bit you want).
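The caching and reload tricks above can be pictured with a short Python
sketch. It is only a model of the idea: PageCache and fake_fetch() are
invented for the example, nothing is really downloaded, and real
browser caches are considerably more elaborate.

```python
class PageCache:
    """A toy model of a browser cache (illustrative only)."""

    def __init__(self, fetch):
        self._fetch = fetch   # function that really downloads a URL
        self._store = {}      # local copies, keyed by URL

    def get(self, url, reload=False):
        # Only go back to the network for a new page or a forced reload.
        if reload or url not in self._store:
            self._store[url] = self._fetch(url)
        return self._store[url]


calls = []


def fake_fetch(url):
    """Stand-in for a real download; just records that it was asked."""
    calls.append(url)
    return f"<html>copy {len(calls)} of {url}</html>"


cache = PageCache(fake_fetch)
url = "http://www.example.com/index.html"
first = cache.get(url)                 # downloaded
second = cache.get(url)                # served from the local copy
third = cache.get(url, reload=True)    # reload forces a fresh download
print(len(calls))
```

The second request is faster because no download happens, which is
also exactly why it cannot show any changes made to the page since the
first one.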
5.6 Extending your browser's capabilities
Over time more and more functionality has become available over the Internet.
Inevitably this means that the time will come when you are missing out because
your browser is not up to it. There are a number of ways of enhancing your
browser.
5.6.1 Change browser or get a newer version
This is the "throw it away and get a new one" approach. Depending what you
change to this may cost you money. Make sure your computer is powerful
enough for any new version you decide to get.
When you install the new version, it might be useful (if you can afford the
disk space) to keep the old one just in case.
You may be able to try out the new version free for a while to see if it's
what you want.
Finally, be aware that adding any new software to your system can cause
unexpected changes to your system's configuration. Internet Explorer is
particularly keen to set itself up as your default browser should you
install it.
5.6.2 Install some Plug-ins
Netscape were the first to develop the idea of helper applications and plug-ins.
The idea behind plug-ins is that rather than produce an enormous,
resource-hungry piece of software that does everything, why not instead
make a slimline browser to which you can add only those extras you need.
You'll know you need a plug-in when you keep coming to a site that tells
you what you need. Normally you can simply download the plug-in by following
the link and downloading and installing the software as instructed.
Plug-ins commonly handle a particular file type, often a new type invented
for use on the Internet by the manufacturers of the plug-in itself.
Some plug-ins are free, others are not. Often the plug-in required to read
or play back a given file type is free, whilst the software required to author
such files is not. This is a payment model frequently used, as it ensures
maximum take-up of a new file structure.
5.6.3 Enable active content
HTML as originally devised was a fairly "passive" language. That is, it
could define a page that one could view, but not interact with.
Nowadays there are several ways in which web pages are being made more and
more interactive. In most cases you need a browser capable of interfacing
with the new content types, and you then need to choose to have these features
enabled, usually by searching through network or security options.
5.6.3.1 Animated .GIFs
Animated .GIFs are basically animated pictures. Their most common use is
in advertising. In content terms these are passive in that you can't interact
with them, but they can make a page more lively.
The problem with animated .GIFs is that they are, of necessity, many times
larger than a static picture the same size. Consequently they can greatly
increase the time a page takes to draw.
5.6.3.2 Java and ActiveX
Java and ActiveX are both methods of allowing programs to be downloaded
automatically and run "inside" your web page. In this way they can effectively
give you software you can interact with on a web page.
These programs are temporary in the sense that once you exit your browser
the program no longer exists on your machine (in fact, once you back out of
the web page it's gone).
In the case of Java an area of screen is reserved for an "applet" to run in,
and this applet is downloaded and run locally on your computer. The Applet
isn't stored on your computer, and is designed to run in a way that cannot
contaminate your hard disk or computer memory.
ActiveX is Microsoft's response to Java, but unlike Java it only runs on
Windows machines, and is allegedly less secure than Java.
5.6.3.3 Javascript and other scripting languages
Javascript was developed by Netscape as an attempt to make web pages more
interactive. Despite the name it is a separate language from Java, with which,
confusingly, it has little in common.
Javascript code is written into the HTML source files themselves. This code
is interpreted by your browser as it reads the HTML page. The code can tell
the browser what to do when you click on certain buttons, or move your
mouse to a certain location.
The language allows messages to be displayed, and can manipulate the contents
of the web page (something Java cannot do... it is restricted to the reserved
applet box).
Microsoft are hoping to make their popular Visual Basic the basis of their
own scripting language. As ever, the two browser companies continue to
battle it out.
5.7 Common Errors
5.7.1 The dreaded 404
This is the commonest error. It simply means "file not found". This is
usually because the file has been moved, and you have followed an old link.
5.7.2 DNS lookup error
A Domain Name Server lookup error has occurred. What this means is that
the Internet [[GOTO Domain name]] you have specified cannot currently be translated
into a valid IP node number.
This could mean the machine doesn't exist anymore, but sometimes trying a
second time, or a day later solves the problem.
5.7.3 Access denied
The machine you have accessed is choosing to deny access to the particular
resource requested. This is either because you've asked for something you
shouldn't have, or the remote machine has undergone a change of configuration
or is undergoing some maintenance.
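The "file not found" and "access denied" errors are reported by the web
server as numeric status codes, and Python's standard http module knows
the usual names for them. The sketch below is just a lookup and the
explain() helper is hypothetical. Which code a server sends for "access
denied" varies; 401 and 403 are the usual candidates. A DNS lookup
error is different in kind, since it happens before any web server has
been reached, so it has no status code.

```python
from http import HTTPStatus


def explain(code):
    """Return a status code together with its standard short description."""
    status = HTTPStatus(code)
    return f"{code} {status.phrase}"


for code in (404, 403, 401):
    print(explain(code))
```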
5.7.4 URL case sensitivity
Be aware that URLs are - in theory at least - case sensitive. This means you
should always type in a URL exactly as you see it.
Whether or not a particular URL is case sensitive will usually depend on the
type of computer the server is, and the sort of server software that it runs.
5.8 Cookies
Cookies are tiny nuggets of data that a web server can get a browser to
write to *your* computer. This nugget of data can only be passed back by your
browser to the same web server.
This device allows servers to keep some context information on each visitor
to their site. For example a search engine could use this to remember
what topics you were interested in last time you visited a few weeks ago.
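The exchange can be sketched with Python's standard http.cookies
module. The cookie name "last_topic" and its value are invented for the
example, and the browser's side of the conversation is simply written
out by hand rather than simulated.

```python
from http.cookies import SimpleCookie

# Server side: build the header that asks the browser to store a cookie.
outgoing = SimpleCookie()
outgoing["last_topic"] = "browsers"
outgoing["last_topic"]["path"] = "/"
set_cookie_header = outgoing.output(header="Set-Cookie:")

# Browser side, on a later visit: the stored name=value pair is sent back
# to the same server (written by hand here for illustration).
cookie_header = "last_topic=browsers"

# Server side again: read back what the browser returned.
incoming = SimpleCookie()
incoming.load(cookie_header)

print(set_cookie_header)
print(incoming["last_topic"].value)
```

The server itself stores nothing between the two visits in this sketch.
The context travels in the cookie held on your machine.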
This technique is becoming increasingly popular as a means to customize the
way a particular site works for you. Because of this access to some sites
is dependent on being able to accept cookies.
Not all browsers accept cookies, and those that do can usually be configured
to show alerts each time someone tries to set one.
Whether you allow cookies to be set is a matter of personal preference.
6 USENET and Newsgroups
=======================
6.1 What they are
Usenet is a set of bulletin boards or newsgroups made available via the
Internet. Each newsgroup is dedicated to an area of interest and people
[post] articles or "posts" on different subjects within the area of interest.
People can choose to write on a new topic, or to post a [followup] on an
existing topic. Followups traditionally have the same subject line with
the letters "RE:" inserted at the start.
The combination of the original article and all its subsequent followups is
known as a [[GOTO thread]]. All news-reading software allows you to read news by
thread, and to choose to follow or ignore particular threads.
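Grouping by subject line, as described above, can be sketched in
Python. This is a simplification for illustration: real news software
generally relies on article headers rather than subject text to build
threads, and thread_key() and group_into_threads() are hypothetical
helpers with invented sample subjects.

```python
from collections import defaultdict


def thread_key(subject):
    """Strip leading "RE:" markers so followups match the original."""
    s = subject.strip()
    while s.upper().startswith("RE:"):
        s = s[3:].strip()
    return s.lower()


def group_into_threads(subjects):
    """Collect subject lines into threads keyed by the original subject."""
    threads = defaultdict(list)
    for subject in subjects:
        threads[thread_key(subject)].append(subject)
    return dict(threads)


posts = [
    "Best text browser?",
    "RE: Best text browser?",
    "Modem keeps dropping the line",
    "Re: RE: Best text browser?",
]
for key, members in group_into_threads(posts).items():
    print(key, len(members))
```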
A good set of (text) documents describing NEWS can be found at
http://sasun4.epfl.ch/News/Document
6.2 How they are organised
6.2.1 Newsgroup hierarchies
Newsgroups are organised into a fairly loose set of hierarchies. There
are a number of standard hierarchies, and any number of local hierarchies, for
example most of the main ISPs have their own groups, some of which are made
publicly available.
The main hierarchies are
alt... Alternative. Anything goes.
biz... Business
comp... Computers
news... Administrative for News generally
rec... Recreational.
sci... Science
soc... Society
Of these the alt... hierarchy is largely unregulated, whilst the other
hierarchies are more controlled. There are more alt... groups and they
are easier to create, but correspondingly the [[GOTO Signal to Noise ratio]] is
much lower in these groups, and the language is sometimes less formal.
In addition to the above, many countries and regions have their own
hierarchies, as do some collaborations, e.g.
uk...
fr...
ruhr...
bionet...
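Since a newsgroup name is just a list of words separated by dots, the
hierarchy a group belongs to can be read off its first word. A small
Python sketch follows. The table simply repeats the main hierarchies
listed above, and top_level() and describe_group() are hypothetical
helpers.

```python
# The main hierarchies, as listed in this section.
MAIN_HIERARCHIES = {
    "alt": "Alternative",
    "biz": "Business",
    "comp": "Computers",
    "news": "Administrative for News generally",
    "rec": "Recreational",
    "sci": "Science",
    "soc": "Society",
}


def top_level(newsgroup):
    """Return the first component of a dotted newsgroup name."""
    return newsgroup.split(".", 1)[0]


def describe_group(newsgroup):
    """Name the hierarchy a group sits in, if it is one of the main ones."""
    return MAIN_HIERARCHIES.get(top_level(newsgroup),
                                "regional, local or other hierarchy")


for group in ("comp.answers", "rec.answers", "uk.answers"):
    print(group, "->", describe_group(group))
```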
The current list of newsgroups stands at around 17,000. You should contact
your service provider to see what groups they make available.
6.2.2 NEWS distribution
NEWS is distributed round the globe by being passed from Internet node to
Internet node. Each internet node is free to decide which groups they will
or won't take. This is one place where censorship starts to enter the Internet.
Some ISPs only get a core set of groups to which they will happily add
any newsgroup a customer requests. This is usually an attempt to save
bandwidth.
Some ISPs choose to refuse certain groups on the grounds that they give
offence, or contravene the prevailing laws in their territory. This is
especially true of providers looking to meet "family" needs.
Other ISPs choose to refuse binary groups on the grounds that these take
up far too much bandwidth, and besides, they contain the bulk of the
pornography floating round the Net.
Yet more ISPs declare that in the interests of free speech they will take a
"full feed" and make this available to their customers. These providers
are the ones most frequently being taken to court.
You should be aware that depending on who you get your news from you may
well not be getting a full feed.
You should also be aware that given the genuinely unsavory content in some
parts of Usenet, you may not *want* a full feed.
6.2.3 .Answers groups
Many newsgroups have [[GOTO FAQS]] associated with them. These FAQs are posted
regularly to the newsgroup concerned, [Site:RTFM], and often to the .answers
newsgroup in the same hierarchy.
Thus comp.answers contains many useful posts of computer related FAQs, whilst
rec.answers contains FAQs on every type of hobby.
Many of these are additionally posted to news.answers.
6.2.4 Moderated newsgroups
Some newsgroups and mailing lists are "moderated". In these cases
all articles posted to the group are checked by a Moderator.
The moderator is free to
- reject the article. This is usually only done if the article
violates the group's charter (e.g. is off-topic like [[GOTO Spam]],
or is too long)
- edit the article. This is relatively unusual, but the moderator
may correct spelling, factual errors, or remove information
supplied in a previous post.
- accept the article. In this case the article is forwarded to the
newsgroup or mailing list, and enters the public domain.
Moderation is a good way of improving the [[GOTO Signal to Noise ratio]] in
a group, but is hard work for the moderators who frequently do the
job voluntarily and are unpaid.
6.2.5 Binary groups
It is possible to post binary files (such as pictures and software) to
newsgroups. However, such posts tend to be much larger than normal, and
as such are unwelcome in most newsgroups.
To get round this problem, there are a number of newsgroups dedicated to
accepting binary posts. These are mostly the alt.binaries... groups.
If you have a picture you'd like to share with people in a newsgroup, consider
offering it via email, and first asking the group if they're interested.
Next consider placing it on a web page.
If enough people express an interest, find out which binary group is
most suitable and post it there. Once you've done that, post an article
to your normal newsgroup telling people what you've done, so that they can
go and fetch it.
Posting binaries usually entails converting them into ASCII files using
some form of encoding. It's increasingly common these days for email
packages to offer this.
Often large binaries are split into a number of posts. This is because
some parts of the Internet reject messages over a certain size.
To reassemble the binary, you need to locate all parts of the original,
reassemble the parts, and convert back to binary.
Again, it's increasingly common for your newsreading software to be capable of
doing this for you.
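To make the encode, split and reassemble steps concrete, here is a small Python sketch. It uses base64 as the ASCII encoding simply because it ships with Python; that choice, the function names and the size limit are all assumptions made for illustration, and your own software may well use a different encoding.

```python
import base64


def encode_and_split(data, max_chars):
    """Convert binary data to ASCII text and cut it into posts of limited size."""
    text = base64.b64encode(data).decode("ascii")
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]


def reassemble(parts):
    """Join the parts back together, in order, and convert back to binary."""
    return base64.b64decode("".join(parts))


# Stand-in for the contents of a picture or program file.
original = bytes(range(256)) * 10
parts = encode_and_split(original, max_chars=1000)
restored = reassemble(parts)
```

Note that the parts must be joined in the original order, which is why multi-part posts are normally numbered in their subject lines.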
6.3 How to use newsgroups
6.3.1 Using a newsreader
You normally access news using newsreader software. There are a variety
of commercial and free packages available.
Normally you "subscribe" to groups that are of interest to you. Then each
time you start the software, it will get all the new articles in those groups.
Some packages only get the message headers and subject lines. This allows you
to pick which actual posts you're interested in, and which the newsreader
should fetch the article bodies for. This is more efficient and saves
[[GOTO bandwidth]] and time.
Similarly some newsreaders set a limit on the maximum article size they will
automatically fetch.
Sometimes you can set up a "kill" file of threads and authors you don't
want to see. The newsreader will then ignore any such posts. This can
be useful for avoiding the "village idiot" posters that each group seems
to have.
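The idea behind a kill file can be sketched in a few lines of Python. The layout used here (a list of unwanted authors and a list of unwanted thread subjects, with each article held as a small dictionary) is invented for this example; every newsreader has its own kill file format.

```python
def apply_kill_file(articles, killed_authors=(), killed_threads=()):
    """Return only the articles that the kill file does not match.

    Each article is a dict with "author" and "subject" keys
    (a made-up layout for this sketch).
    """
    authors = {author.lower() for author in killed_authors}
    subjects = [subject.lower() for subject in killed_threads]
    kept = []
    for article in articles:
        if article["author"].lower() in authors:
            continue  # posted by someone we don't want to read
        subject = article["subject"].lower()
        if any(killed in subject for killed in subjects):
            continue  # belongs to a thread we are ignoring
        kept.append(article)
    return kept


# Hypothetical articles and addresses, for demonstration only.
articles = [
    {"author": "jane@example.com", "subject": "Modem advice"},
    {"author": "troll@example.com", "subject": "RE: Modem advice"},
    {"author": "bob@example.com", "subject": "RE: Make money fast"},
]
kept = apply_kill_file(
    articles,
    killed_authors=["troll@example.com"],
    killed_threads=["make money fast"],
)
```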
If you are accessing the Internet from home or over a modem line, make
sure you use a package that allows [off-line] reading. Such packages will
fetch the articles you're interested in, and save them to your hard disk,
thereby allowing the connection time to the Internet (and hence phone bills)
to be minimized.
6.3.2 In a browser
Increasingly it is possible to read news via your browser. At the time of
writing browser-based newsreaders are not generally as good as dedicated
newsreader software, particularly since they don't support [off-line] reading.
However this is a good way of quickly dipping into a group you've just
discovered, or have only a fleeting interest in.
6.3.3 Using DejaNews
The [Site:Dejanews] site offers a search engine that allows all current and
past news articles to be searched.
6.4 Usenet Conventions
6.4.1 Subject lines
In order to help people better anticipate the contents of articles some
conventions have evolved on use of subject lines. These conventions are
mostly used in newsgroups, but you will see them occasionally in email.
RE: - Signifies a reply. Most mail software does this for you
FWD: - Signifies a forwarded message (email usually)
FS: - For sale
WTB: - Want to buy
In addition, the charter of some groups may define additional shorthand
local to that group.
6.4.2 Signatures
Many software packages will allow you to place a "signature" at the end of
all your email and Usenet posts. This allows you to add contact information
and witty comments.
By convention the first line should be two dashes followed by a single
space, and the whole signature should only be 4-5 lines long. Large
signatures are considered bad [[GOTO Netiquette]] and will attract criticism.
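If you want to check a message against this convention, something like the following Python sketch would do. It assumes the widely used delimiter line of two dashes followed by a single space, and takes 5 lines as the upper limit; the function names are made up for this example.

```python
SIG_DELIMITER = "-- "  # two dashes and a space (assumed delimiter)


def split_signature(message):
    """Return (body, signature_lines), splitting at the last delimiter line."""
    lines = message.split("\n")
    for index in range(len(lines) - 1, -1, -1):
        if lines[index] == SIG_DELIMITER:
            return "\n".join(lines[:index]), lines[index + 1:]
    return message, []


def signature_too_long(message, max_lines=5):
    """True if the signature has more lines than the convention allows."""
    _, signature = split_signature(message)
    return len(signature) > max_lines


# Hypothetical message, for demonstration only.
example = "Thanks for the help.\n-- \nJane Doe\njane@example.com"
body, signature = split_signature(example)
```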
6.5 Usenet related sites
6.5.1 Locating useful resources by newsgroup
One site that collects resources referenced in newsgroups, and lists them
by the newsgroup name is http://www.phoaks.com/. This can be a useful way
of locating FAQs or useful web pages relating to a USENET group.
6.5.2 Getting statistics on Usenet group usage
The site http://sunsite.unc.edu/usenet-i/ offers various Usenet related services.
This includes a list of newsgroups at
http://sunsite.unc.edu/usenet-i/ (this is a *large* file), and statistics for each newsgroup.
Statistics are held for each newsgroup in a page below the "/groups-html"
directory at this location. For example the statistics on comp.risks
newsgroup are contained in a page at
http://sunsite.unc.edu/usenet-i/groups-html/comp.risks.html.
How accurate or recent these pages are I couldn't say (they seem a little
out of date to me), but they might give you a feel for the *relative*
popularity, availability and throughput of a newsgroup if nothing else. This
could help you find the right forum for your announcements, or a quiet backwater
in which to have a chat with like-minded souls.
6.5.3 DejaNews
DejaNews is discussed more fully in section 7.2.
6.6 Other on-line news sources
6.6.1 Service provider forums
Some of the larger service providers provide their own equivalent
services for the benefit of their own customers.
AOL and Compuserve in particular come to mind.
6.6.2 More conventional NEWS sites
Most major news-gathering organisations (newspapers and TV) now have a
presence on the Net. Simply seek out your favourite newspaper and search
their small print for a web address.
Particularly noteworthy are [h:CNN] and the [h:BBC].
Many of these news services are free, though how long that will continue
is doubtful, given the way that most newspaper sites request subscription
information and require a password to enter.
A list of news services can be found at http://www.discover.co.uk/NET/NEWS/news.html
7 Sites you should know about
=============================
7.1 Altavista
Visit [h:Altavista] for full details.
Altavista is one of the Net's best search engines, with over 40 million web
pages indexed.
7.1.1 Using Altavista
Using AltaVista is easy: simply type in some words that you want to search for.
You should read the Help page for extra tips. There is also an Advanced Search
function that allows you to look for one word near another, and LiveTopics, which
will categorize the matches you've found to help you further select what you
are interested in.
Tricks to be aware of include :-
- Altavista will only match case if your word is in mixed or upper case.
If in doubt, type it all in lower case.
- If you want to locate a phrase, place it in double quotes.
- A "+" in front of a word means that a web page must have that word
in it. A "-" means that it must not. The latter is useful when refining
a search by removing items you're not interested in.
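To show how the "+", "-" and double-quote rules combine, here is a toy Python matcher. It is only a sketch of the rules as described above, not how AltaVista itself works: it ignores case entirely, does simple substring matching, and the function name is invented.

```python
import shlex


def matches(query, page_text):
    """Apply "+word", "-word" and "quoted phrase" rules to one page of text."""
    text = page_text.lower()
    required, excluded, optional = [], [], []
    # shlex.split keeps a double-quoted phrase together as a single term.
    for term in shlex.split(query):
        if term.startswith("+"):
            required.append(term[1:].lower())
        elif term.startswith("-"):
            excluded.append(term[1:].lower())
        else:
            optional.append(term.lower())
    if any(term in text for term in excluded):
        return False  # the page contains something we asked to avoid
    if not all(term in text for term in required):
        return False  # the page lacks something we insisted on
    if required:
        return True
    return any(term in text for term in optional)
```

For example, the query `+flamingo -plastic` matches a page about flamingos on a lake, but not one selling plastic flamingos.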
7.1.2 Special searches
Altavista doesn't just index the text on a page, it also indexes the hyperlinks,
titles etc. Thus typing the search string
url:microsoft
will find all web pages with the string microsoft in the URL.
link:www.jafsoft.com
finds all pages that have hyperlinks pointing at the www.jafsoft.com site, and
title:Flamingo
finds all pages with flamingo in the title.
7.1.3 Translating web pages
Recently AltaVista has started offering to translate web pages on-line. This
service is offered in the results of a search, but can also be used
directly by going to
http://babelfish.altavista.digital.com/
7.2 Dejanews
Visit [h:DejaNews] for full details.
Dejanews has all the postings ever made in newsgroups in a searchable database.
The database is divided into current and old.
7.2.1 Finding postings
To find articles of interest, simply type in a few keywords. For each post
found you can
- read the post itself. From here you can follow the [[GOTO thread]] this
post belonged to.
- get an author profile of the person who posted.
- post a followup to the article in the original newsgroups.
7.2.2 Finding newsgroups
Dejanews now has a search for newsgroups feature. Simply type in a few
keywords and see a ranked list of newsgroups where Dejanews believes
articles containing these words are posted.
This is a perfect way of finding newsgroups that may be of interest to you.
7.2.3 Getting author profiles
Dejanews offers you author profiles, that is a list of the posts a person
has made to various newsgroups. This can give you a feel for the interests
and character of a person, and is increasingly being used (as is the Internet
generally) to discover what job applicants are like.
This "Big brother" aspect of the Net is something you should be aware of.
If you start posting to newsgroups and publishing web pages, you are placing
that information in the public domain in a way that is easily archived and
searched using modern technology.
7.3 RTFM
RTFM stands for [Read the FM] which was basically an oft-repeated plea
to [newbies] to read the [FAQ] before posting the same old questions and
making the same old mistakes.
Eventually one man decided to collect all the FAQs together so people could
easily find them. This now vast repository can be found at [h:RTFM].
7.3.1 Reading the FAQ's
The RTFM archive is an FTP site. In a browser an FTP site appears as folders
and files, much like those on a PC. You can navigate round these
folders to find the FAQ for the newsgroup you are interested in.
FAQ maintainers will normally update the copy on RTFM regularly, so this
is a good place to start searching for such information.
Start here [h:RTFM-usenet].
7.3.2 Finding people's email addresses.
RTFM has a complete directory listing people's email addresses. These can
be searched, although there are now more user-friendly ways of doing this
(see [[GOTO Email addresses]]).
This database can be searched via email. For details send a message
with the subject blank and the message set to
send usenet-addresses/help
to
mail-server@rtfm.mit.edu
7.4 Other search engines
7.4.1 Yahoo
Visit [h:Yahoo].
Yahoo is one of the oldest and largest Internet directories around. Yahoo
attempts to place each site into a suitable category. By selecting the
categories you are interested in you can get a concentrated list of
suitable sites.
All links in Yahoo are added by hand, which means it is difficult to get listed,
and that the index is selective rather than comprehensive.
7.4.2 Infoseek
Visit [h:Infoseek].
Another major search site. This site takes the same approach as Yahoo, namely
putting sites into categories, but seems to offer more additional services
such as finding people's email addresses.
7.4.3 Excite
Visit [h:Excite]
Excite is an up and coming search engine, which offers news services in addition
to straightforward searching.
7.4.4 WebCrawler
Visit [h:WebCrawler]
7.4.5 HotBot
Visit [h:HotBot]
This search engine puts fairly advanced search options on its front page.
Most search engines have these options, usually under "advanced search".
7.4.6 MetaCrawler
Visit [h:MetaCrawler]
MetaCrawler is an interesting search engine in that it uses the other search
engines to match your request and then collates the results.
7.5 The Internet movie database
If you're a film fan, visit [h:IMDB].
7.6 Other sites
Here's a list of interesting sites. I've tried to only list sites that will
act as good starting points for finding various types of information, or that
you would visit regularly.
$_$_BEGIN_TABLE
http://www.digital.com/info/rcfoc/    The rapidly changing face of
                                      technology. Discusses up and coming
                                      developments on the Net.
http://www.theonion.com/              Satirical newsletter
http://www.thepubliceye.com/          Allows you to check up on companies
                                      selling via the Net
http://www.virtualpromote.com/        Discusses promotion of web pages
$_$_END_TABLE
8 Creating your own web pages
=============================
The braver amongst you may decide you want to create your own web pages.
You can start this very easily, and these notes will help get you started
and give you some pointers. However the whole subject is vast, and if
you intend becoming expert at authoring Web pages I suggest you familiarise
yourself with the subject and then buy a suitably expert book on the subject.
8.1 What exactly is HTML?
Web pages are HTML documents. HTML stands for "HyperText Markup Language".
That is, it is a language used to "mark-up" documents for display by a
browser. One of the most important points to understand here is that each
browser is free to implement or ignore a given mark-up as it sees best.
Thus a "header" may be shown as larger and bolder on a PC, but on a text
terminal it might be shown in reverse video.
As an author of web pages it is crucial to remember that everyone is going
to see your page slightly differently.
8.1.1 The overall structure of an HTML page
HTML pages consist of normal text with "tags" added for the markup. Tags
are key words contained between angle brackets. Often tags come in pairs
with some text between the two tags. In such cases the closing tag has
a slash character (/) after the opening angle bracket. The text between the
two tags is thus effectively "marked-up" by the two tags.
For example
    Some of this is <B>in bold</B>
The end tag doesn't have to be on the same line, and generally browsers ignore
the use of white space in the source document. Care should be taken to make
sure that each tag has a matching end tag. Tag pairs can be nested, and it
is good practice to close the tags in the reverse of the order in which they
were opened, i.e.
    <B>Bold and <I>in italics</I></B>
instead of
    <B>Bold and <I>in italics</B></I>
You might get away with the second usage, but it's bad practice, and depends
on the browser as to how it reacts.
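If you'd like to check your own pages for badly nested tags, the following Python sketch shows one way of doing it with the standard html.parser module. It is deliberately simple: the list of tags that never take an end tag is an assumption covering only a few common cases, and pages that legitimately omit end tags (such as the end of a paragraph) will be reported as "unclosed".

```python
from html.parser import HTMLParser

# Tags assumed never to have an end tag (a partial list for this sketch).
VOID_TAGS = {"br", "hr", "img"}


class NestingChecker(HTMLParser):
    """Track open tags on a stack and note end tags that arrive out of order."""

    def __init__(self):
        super().__init__()
        self.open_tags = []
        self.problems = []

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.open_tags and self.open_tags[-1] == tag:
            self.open_tags.pop()
        else:
            self.problems.append("unexpected </%s>" % tag)


def check_nesting(html_text):
    """Return a list of nesting problems (an empty list means none were found)."""
    checker = NestingChecker()
    checker.feed(html_text)
    checker.close()
    return checker.problems + ["unclosed <%s>" % tag for tag in checker.open_tags]
```

Calling `check_nesting` on correctly nested bold and italic tags returns an empty list, whereas closing the bold tag before the italic one produces complaints. The parser converts tag names to lower case, so the messages mention `b` and `i` rather than `B` and `I`.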
The overall structure of an HTML document should be
    <HTML>
    <HEAD>
    .
    . (other tags that belong in the header)
    .
    </HEAD>
    <BODY>
    .
    . (other tags and text that belongs in the main body)
    .
    </BODY>
    </HTML>
8.1.2 The HTML standard
The HTML standard is maintained by the W3 consortium. Visit [h:W3] for
up to date chapter and verse on what HTML is. You'll also find definitive
lists of standard HTML tags there.
8.1.3 Some common and useful tags
Here is a brief list of the most commonly used markups. A fuller list can
be found at (amongst others) http://www.htmlgoodies.com/html_ref.html
8.1.3.1 <TITLE>...</TITLE> tags
These tags go in the <HEAD>...</HEAD> section of the document. The text
marked up in this way becomes the document's title shown at the top of the
window.
8.1.3.2 Bold, Italics and underline tags
The <B>..</B>, <I>..</I> and <U>..</U> markups produce bold, italics and
underlining effects. Note, hyperlinks are underlined automatically.
8.1.3.3 Strong and emphasis tags
Recently there has been a move away from the bold and italic tags to
<STRONG>..</STRONG> and <EM>..</EM> markups. The former are known as
"physical" markups since they describe physical characteristics. If a browser
cannot do italics or bold, then those markups will be ignored.
These newer markups are called "logical" markups as they tell the browser
the degree of emphasis wanted. This leaves the browser free to choose how
to achieve this effect.
8.1.3.4 <BR> line break tags
Browsers ignore the use of white space in a source document, and this includes
line breaks. This is to allow paragraphs of text to adjust as the browser
window is resized.
If you want a line break, the <BR> tag tells the browser to do just that.
8.1.3.5 <P>...</P> Paragraph markers
The <P>..</P> markup is used to mark up paragraphs. It is quite common for
the </P> to be omitted, however as more arguments are added to the <P> tag
it may become important to supply a </P> tag to mark the end of the
specified effect.
8.1.3.6 <HR> Horizontal rule tags
The <HR> tag puts a horizontal line across the page.
8.1.3.7 Anchor (hyperlink) tags
The <A>..</A> tags can be used to define anchor points and hypertext
links. These tags always require extra arguments in the opening
tag to define the link.
This is discussed more fully in [[GOTO Adding hyperlinks to web pages]]
8.1.3.8 image tags
The <IMG> tag can be used to add pictures to your page. The basic definition
is something like
    <IMG SRC="URL" ALT="text description" HEIGHT="n" WIDTH="n">
The ALT attribute is a text description displayed whilst the image is
loading. This helps to give the viewer an idea of what's coming before
it arrives. It's a good idea to *always* include an ALT attribute for the
following reasons
- It will be shown whilst the image is still loading
- It will be shown even when the user has switched images off
- It can be understood by browsers used by the partially sighted
- In recent browsers it is shown as a "tooltip" whenever the mouse
is moved over the image.
The HEIGHT and WIDTH attributes specify the display size of the image in either
pixels or percentage of screen size. This allows the browser to reserve
space whilst the image downloads, allowing the text to be displayed faster.
It won't make the download faster, but it will *seem* faster. It will
also preserve the page layout if you switch images off. You don't need to
supply both.
If you don't do this, the browser either has to wait until it's got the
image, or it has to completely redraw the page once the image arrives.
Neither is particularly nice.
Note, the HEIGHT and WIDTH need not be the original size of the image, and
people sometimes think, wrongly, that they can speed up the download by
making the HEIGHT and WIDTH smaller. In fact the download is just as slow,
but it gets *drawn* smaller. If you want a page with small pictures it's
usual to make smaller copies and link those to the larger originals. These
small pictures are often called "thumbnails".
You can get away with specifying just one of HEIGHT or WIDTH. The browser
will set the other to scale.
The SRC attribute gives the URL where the image file can be found.
Only the SRC attribute is really needed, but supplying ALT, HEIGHT and WIDTH
are good habits to get into.
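As a way of checking these good habits, here is a short Python sketch that scans a page and reports any image tags lacking an ALT attribute, or lacking both HEIGHT and WIDTH. The function name and report format are invented for this example; it relies only on the standard html.parser module.

```python
from html.parser import HTMLParser


class ImageAudit(HTMLParser):
    """Collect (SRC, missing attributes) for each IMG tag with something missing."""

    def __init__(self):
        super().__init__()
        self.report = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        # The parser lower-cases attribute names for us.
        present = {name for name, _ in attrs}
        missing = []
        if "alt" not in present:
            missing.append("ALT")
        if not ({"height", "width"} & present):
            missing.append("HEIGHT or WIDTH")
        if missing:
            self.report.append((dict(attrs).get("src"), missing))


def audit_images(html_text):
    """Return a list of (SRC, [missing attributes]) for the given page text."""
    audit = ImageAudit()
    audit.feed(html_text)
    audit.close()
    return audit.report
```

Since one of HEIGHT or WIDTH is enough for the browser to scale the image, the sketch only complains when both are absent.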
8.1.4 HTML extensions
Both Netscape and Microsoft have invented non-standard HTML tags to give
their browsers added features. Some of these have subsequently been adopted
into standard HTML. Generally it is a very bad idea to use these extensions
as it means you are forcing your audience to use one browser over another.
In the early days all the Netscape extensions became the de facto standard.
This situation is very unlikely to occur again.
8.1.5 Adding hyperlinks to web pages
Hyperlinks are added to web pages using the Anchor tags <A>..</A>. There
are two basic methods for using the anchor tag. One creates an "anchor point",
that is a point that a hyperlink can jump to, and the other creates the
hyperlink itself.
For example
    <A NAME="AnchorPoint">This is an anchor point</A>
creates an anchor point called "AnchorPoint" in the current document. The
text between the <A>...</A> tags will appear as normal.
By contrast the markup
    <A HREF="#AnchorPoint">Goto the Anchor point</A>
creates the hyperlink that will take you to the first point. In this case
anything between the <A> and </A> tags is highlighted, and may be selected
to activate the link. This can include images.
The HREF part of this tag is in fact a [[GOTO URL]]. In this context a URL is
fully specified as
    <protocol>://<machine>/<directory>/<file>#<anchor>
where
    <protocol>   normally "http"
    <machine>    The Internet node the resource is on. If
                 omitted the current machine is assumed.
    <directory>  The directory path on the machine that the
                 resource file lives in. If omitted the
                 current directory is assumed.
    <file>       The file that is to be viewed. If omitted
                 the current file is assumed.
    <anchor>     The location within the file that the
                 browser is to go to. If omitted (or
                 if invalid) the top of the file is assumed.
Note:
- If the machine and directory paths are omitted this is called a
"relative link". HTML that uses relative links is a lot easier
to move from machine to machine, as the relationships remain intact,
even though the complete address has changed.
- Not all browsers support the use of anchor points. In such cases
they are ignored and the browser goes to the top of the file.
- If an invalid anchor point is given, the browser goes to the top of
the file. Anchor names can be case sensitive.
- Internet Explorer goes to the top of the file briefly before going
to the anchor point.
- If a URL pointing to an anchor point in the same file is used, then
some browsers will not bother to reload the page. This gives a
faster re-draw.
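The way a URL breaks into its parts, and the way a relative link is filled in from the current page's address, can be explored with Python's standard urllib.parse module. In this sketch the www.example.com addresses and the names used for the parts are made up for illustration.

```python
from urllib.parse import urljoin, urlsplit


def describe(url):
    """Split a URL into the parts discussed above."""
    parts = urlsplit(url)
    directory, _, filename = parts.path.rpartition("/")
    return {
        "protocol": parts.scheme,
        "machine": parts.netloc,
        "directory": directory,
        "file": filename,
        "anchor": parts.fragment,
    }


def resolve(current_page, link):
    """Work out the full address a (possibly relative) link points at."""
    return urljoin(current_page, link)


# A hypothetical page address, for demonstration only.
BASE = "http://www.example.com/docs/guide/intro.html"
info = describe(BASE + "#AnchorPoint")
```

For instance, `resolve(BASE, "images/logo.gif")` fills in the machine and directory from the current page, and `resolve(BASE, "#AnchorPoint")` keeps the current file and just adds the anchor. This is why relative links carry on working when a set of pages is moved as a whole.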
8.1.6 Adding colour to web pages
In addition to adding images to your web pages, you can change the colours
used on your web page by adding attributes to your <BODY> tag as follows:
    <BODY BGCOLOR="#120000" TEXT="#000000" LINK="#009A00" VLINK="#0000CD" ALINK="#FFFFFF">
where
    BGCOLOR   is the background colour of the page
    TEXT      is the colour of the text
    LINK      is the colour of an unused link
    VLINK     is the colour of a visited link
    ALINK     is the colour of a link as you visit it
In each case the colour is specified as a set of three hexadecimal numbers
that express the red, green and blue component of the colour.
In hex, digits can be 0..9,A..F, so for each colour you have a theoretical
range of "00" to "FF" or 0-255 in real money.
If you're not familiar with hexadecimal, think of "F0" as a two digit number,
in which case you'll see that the sequence of numbers goes
    00, 01, 02... 09, 0A... 0F, 10, 11, ...1A...1F, 20.... ..... FF
On this scale "F0" is pretty high, whilst "0F" is pretty low, that is it's
the first digit that is most significant, just as it is in base 10.
In the above case we have
    BGCOLOR   12 00 00 = (18,0,0)        i.e. a dark red
    LINK      00 9A 00 = (0,154,0)       i.e. a medium green
    VLINK     00 00 CD = (0,0,205)       i.e. a bright blue
    ALINK     FF FF FF = (255,255,255)   i.e. brilliant white
    TEXT      00 00 00 = (0,0,0)         i.e. black
Be careful not to have two colours the same, as this will make something
go invisible.
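If the hexadecimal arithmetic seems fiddly, a couple of small Python functions will do the conversion for you. This is just a sketch, and the function names are my own; it converts a colour such as 009A00 into its red, green and blue components and back again.

```python
def hex_to_rgb(colour):
    """Turn a colour like "#009A00" into a (red, green, blue) tuple."""
    digits = colour.lstrip("#")
    if len(digits) != 6:
        raise ValueError("expected six hexadecimal digits, e.g. #009A00")
    # Each pair of hex digits is one component in the range 0-255.
    return tuple(int(digits[i:i + 2], 16) for i in (0, 2, 4))


def rgb_to_hex(red, green, blue):
    """Turn three components in the range 0-255 into a colour like "#009A00"."""
    for value in (red, green, blue):
        if not 0 <= value <= 255:
            raise ValueError("each component must be in the range 0-255")
    return "#{:02X}{:02X}{:02X}".format(red, green, blue)
```

So `hex_to_rgb("#009A00")` gives (0, 154, 0), the medium green used for links above, and `rgb_to_hex(0, 0, 205)` gives "#0000CD".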
There are plenty of colour palettes for you to use on the Net. For example
visit http://www.concentric.net/~noree643/colors/contents.html
8.2 Composing web pages
8.2.1 HTML editors
There are an increasing number of web editing tools around. These usually
offer ease of use, better graphics handling and [[GOTO WYSIWYG]] functions.
However really simple web pages can be created with just a text editor and
a small HTML reference book.
You pays your money and takes your choice.
8.2.2 Testing your pages
The quickest way to test the layout of your pages is to view them straight
from your own hard disk. This will save you upload time to your server
machine, and can be done to a large extent [off-line], disconnected from
the Net.
To do this save your file to disk, and open a browser window. Instead of
entering a web URL, select an "open file in browser" option. The
location of this option will vary according to the browser you are running.
When you do this the location will be something like
file:///c|/directory/ (whatever)
instead of the usual http: address. If you are going to edit this frequently
it will be a good idea to [bookmark] this location for future use.
View the page as normal in your browser. You should be able to check the
layout and appearance of the page, but you may not be able to test some of
the links unless you are connected to the Internet.
If necessary, go back to your editor and make any changes and save the file
again.
You can now view the changed version of your file by selecting the reload
or refresh option in your browser.
Note, you probably don't need to exit either your browser or your editor
in doing this. This makes development a lot faster. In some cases the
browser and editor are even part of the same software package.
Once you are happy you should [upload] your new file(s) to the server.
Note, when you upload your file you may find that some of your links that
work on your own machine may not work when you've loaded the page onto the
Internet. *This is a really common fault*, so make sure you always view your
pages after uploading to the Internet, preferably from a different machine.
The usual cause is that you've forgotten to also copy the files referenced
(e.g. image files), or that the relative links used are invalid on the Web
(e.g. files in sub-directories on your machine are in the same directory
when loaded to the Web). It is always a good idea to organize your local
directories to exactly match the target configuration on the web.
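One way of catching forgotten files before you upload is to list everything a page refers to by relative link, and check that each file really exists alongside it. The Python sketch below does this using only the standard library. The function names are invented, and it only understands SRC and HREF attributes, so treat it as a starting point rather than a complete link checker.

```python
import os
from html.parser import HTMLParser
from urllib.parse import urlsplit


class ReferenceCollector(HTMLParser):
    """Collect the values of all SRC and HREF attributes in a page."""

    def __init__(self):
        super().__init__()
        self.references = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("src", "href") and value:
                self.references.append(value)


def local_references(html_text):
    """Return the relative file paths a page refers to, in page order."""
    collector = ReferenceCollector()
    collector.feed(html_text)
    collector.close()
    local = []
    for reference in collector.references:
        parts = urlsplit(reference)
        # Skip full URLs (http:, mailto: etc.) and links that are only an anchor.
        if parts.scheme or parts.netloc or not parts.path:
            continue
        # Skip paths starting at the server root; this sketch can't check those.
        if parts.path.startswith("/"):
            continue
        local.append(parts.path)
    return local


def missing_files(html_text, directory):
    """Return the relative references that don't exist below the given directory."""
    return [path for path in local_references(html_text)
            if not os.path.exists(os.path.join(directory, path))]
```

You would read your page into a string and call `missing_files` with the directory the page lives in; anything listed is a file you need to create, or to remember to upload.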
8.3 How to teach yourself HTML
8.3.1 Find an HTML reference site
There are lots and lots and lots of on-line web pages dedicated to teaching
people HTML. I'm not even going to start to suggest one.
Go to [h:Altavista] and type something like
+learn +HTML +beginners
and pick one of the 5000+ sites you find. You can refine your search by adding
more keywords.
8.3.2 Buy an HTML book
Similarly there are lots and lots and lots of computer books. However this
is more problematical as computer related books tend to be large and expensive.
HTML books come in several forms:
- Idiots guides. These try to talk in layman's terms about HTML, a fairly
technical subject. Depending on what level you anticipate getting to
it's possible that you will work through this book once, and then never
read it again. On the other hand, that may be all you want.
- Brief introductions. These books cover the basics in a relatively short
time. They serve the same purpose as Idiots guides, but get there
faster, and probably have a longer shelf life as reference manuals.
- Reference manuals. These contain complete specifications for a given
version of HTML. These books can be very dry and of little use to
someone learning, but are a great aid to the more advanced HTML author,
and usually highlight the differences between the various HTML
extensions made by Netscape and Microsoft.
One problem with such books is that they lose their edge when the next
version of HTML comes out.
- Programming manuals. These are aimed more at the web professional, and
will deal with topics like CGI scripts and other server-related subjects.
My advice would be to attempt to struggle through an on-line course, and learn by
example. Depending on how easy or hard you find that, choose an appropriate
book.
8.3.3 Learn to view local files
The best way to pick up tricks and learn a new (computer) language is to see
how it's currently being used. Fortunately this is very easy in HTML as
most browsers will have a View... Source... option, and some will have a
save to disk option, allowing you to study the file at your leisure.
If you see a web page with a feature you want to understand, try just looking
at the source.
Unfortunately as the language gets more and more sophisticated
and more is done via HTML extensions this can be harder to do.
Another factor working against you is that more and more pages are
written using HTML editing software, rather than "by hand". Such pages are
harder to view sensibly because they use far too many features (special
fonts etc), and create very long source lines.
You should be aware that if you see